Fuzzy Private Matching (Extended Abstract)


📝 Original Info

  • Title: Fuzzy Private Matching (Extended Abstract)
  • ArXiv ID: 0710.5425
  • Date: 2007-10-30
  • Authors: ** No author information is given for this paper. (The source text does not include an author list.) **

📝 Abstract

In the private matching problem, a client and a server each hold a set of $n$ input elements. The client wants to privately compute the intersection of these two sets: he learns which elements he has in common with the server (and nothing more), while the server gains no information at all. In certain applications it would be useful to have a private matching protocol that reports a match even if two elements are only similar instead of equal. Such a private matching protocol is called fuzzy, and is useful, for instance, when elements may be inaccurate or corrupted by errors. We consider the fuzzy private matching problem in a semi-honest environment. Elements are similar if they match on $t$ out of $T$ attributes. First we show that the original solution proposed by Freedman et al. is incorrect. Subsequently we present two fuzzy private matching protocols. The first, simple, protocol has bit message complexity $O(n \binom{T}{t} (T \log{|D|}+k))$. The second, improved, protocol has a much better bit message complexity of $O(n T (\log{|D|}+k))$, but here the client incurs an $O(n)$ factor in time complexity. Additionally, we present protocols based on the computation of the Hamming distance and on oblivious transfer that have different, sometimes more efficient, performance characteristics.

📄 Full Content

In the private matching problem [1], a client and a server each hold a set of elements as their input. The size of the set is n and the type of elements is publicly known. The client wants to privately compute the intersection of these two sets: the client learns the elements it has in common with the server (and nothing more), while the server obtains no information at all.

In certain applications, the elements (think of them as words consisting of letters, or tuples of attributes) may not always be accurate or completely known. For example, due to errors, omissions, or inconsistent spelling, entries in a database may not be identical. In these cases, it would be useful to have a private matching algorithm that reports a match even if two entries are similar, but not necessarily equal. Such a private matching is called fuzzy, and was introduced by Freedman et al. [1]. Elements are called similar (or matching) in this context if they match on t out of T letters at the right locations.
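The similarity notion above can be made concrete with a small, non-private sketch. It only illustrates which pairs count as matches, not how a protocol hides the rest. The function names (`matching_positions`, `similar`, `hamming_distance`, `fuzzy_matches`) and the choice to return matching pairs are assumptions made for illustration and do not come from the paper.

```python
from typing import List, Sequence, Tuple


def matching_positions(x: Sequence, y: Sequence) -> int:
    """Count positions at which two words of equal length T carry the same letter."""
    if len(x) != len(y):
        raise ValueError("words must have the same length T")
    return sum(a == b for a, b in zip(x, y))


def similar(x: Sequence, y: Sequence, t: int) -> bool:
    """t-out-of-T similarity: at least t letters agree at the same locations."""
    return matching_positions(x, y) >= t


def hamming_distance(x: Sequence, y: Sequence) -> int:
    """Number of positions that differ; similar(x, y, t) holds iff this is <= T - t."""
    return len(x) - matching_positions(x, y)


def fuzzy_matches(client_words: Sequence[str],
                  server_words: Sequence[str],
                  t: int) -> List[Tuple[str, str]]:
    """Plain (non-private) reference: all client/server pairs that are similar.

    Hypothetical helper: a real FPM protocol must compute this kind of result
    without revealing non-matching elements to either party.
    """
    return sorted((x, y) for x in client_words for y in server_words
                  if similar(x, y, t))


if __name__ == "__main__":
    # "smith" and "smyth" agree on 4 of 5 letters, so they match for t = 4.
    print(fuzzy_matches(["smith", "jones"], ["smyth", "brown"], t=4))
```

With $t = T$ the predicate reduces to exact equality, which recovers ordinary private matching as a special case.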

Fuzzy private matching (FPM) protocols could also be used to implement a more secure and private algorithm for biometric pattern matching. Instead of sending the complete template corresponding to, say, a scanned fingerprint, a fuzzy private matching protocol could be used to determine the similarity of the scanned fingerprint to the templates stored in the database, without revealing any information about this template if no match is found.

All known solutions for fuzzy private matching, as well as our own protocols, work in a semi-honest environment. In this environment participants do not deviate from their protocol, but may use any (additional) information they obtain to their own advantage.

Freedman et al. [1] introduce the fuzzy private matching problem and present a protocol for 2-out-of-3 fuzzy private matching. We show that, unfortunately, this protocol is incorrect (see Section III): the client can “steal” elements even if the sets have no similar elements in common.

Building and improving on their ideas, we present two protocols for t-out-of-T fuzzy private matching (henceforth simply called fuzzy private matching, or FPM for short). The first, simple, protocol has time complexity $O(n \binom{T}{t})$ and bit message complexity $O(n \binom{T}{t} (T \log |D| + k))$ (protocol 3). The second protocol is based on linear secret sharing and has a much better bit message complexity of $O(n T (\log |D| + k))$ (protocol 5). Here the client incurs an $O(n^2 \binom{T}{t})$ time complexity penalty. Note that this is only a factor $n$ worse than the previous protocol. We also present a simpler version of protocol 5 (protocol 4) to explain the techniques used incrementally. This protocol has a slightly worse bit message complexity.
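To get a feel for the trade-off between the two protocols, the following sketch evaluates the leading terms of the stated bounds for concrete parameters. It is a back-of-the-envelope calculation only: constants hidden by the big-O notation are ignored, $\log |D|$ is taken as a base-2 logarithm rounded up, and the function names and the sample parameter values are assumptions chosen for illustration.

```python
from math import ceil, comb, log2


def simple_bits(n: int, T: int, t: int, domain_size: int, k: int) -> int:
    """Leading term of O(n * C(T,t) * (T log|D| + k)) for the simple protocol."""
    return n * comb(T, t) * (T * ceil(log2(domain_size)) + k)


def improved_bits(n: int, T: int, domain_size: int, k: int) -> int:
    """Leading term of O(n * T * (log|D| + k)) for the secret-sharing protocol."""
    return n * T * (ceil(log2(domain_size)) + k)


def simple_time(n: int, T: int, t: int) -> int:
    """Leading term of O(n * C(T,t)) operations."""
    return n * comb(T, t)


def improved_client_time(n: int, T: int, t: int) -> int:
    """Leading term of O(n^2 * C(T,t)) client operations."""
    return n * n * comb(T, t)


if __name__ == "__main__":
    # Illustrative parameters (not taken from the paper).
    n, T, t, domain_size, k = 100, 10, 5, 26, 1024
    print("simple bits:   ", simple_bits(n, T, t, domain_size, k))
    print("improved bits: ", improved_bits(n, T, domain_size, k))
    print("simple time:   ", simple_time(n, T, t))
    print("improved time: ", improved_client_time(n, T, t))
```

For these sample values the message size drops by roughly a factor of 26, while the client's operation count grows by exactly the factor $n = 100$, which is the trade-off the text describes.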

Note that, contrary to intuition, fuzzy extractors and secure sketches [2] cannot be used to solve the fuzzy private matching problem.

Indyk and Woodruff [3] present another approach to fuzzy private matching, which combines the computation of the Hamming distance with generic techniques such as secure 2-party computation and oblivious transfer. Generic multi-party computation and oblivious transfer are not considered efficient techniques. Therefore, building on the protocol from [3], we design protocols based on the computation of the Hamming distance that do not use secure 2-party computation. One protocol is efficient for small domains of letters (protocol 6 version 1), and the second protocol uses oblivious transfer (protocol 6 version 2). The major drawback of the first protocol is its strong dependence on the size of the domain of letters. The main weakness of the second protocol is its high complexity: the protocol makes $n^2 \cdot T$ oblivious transfer calls. We present these protocols mainly to show that other approaches to the fuzzy private matching problem exist as well.

Fig. 1. Results overview. Only one row of the table is recoverable in this copy:

  • Protocol: [1] (corrected), Fig. 3 protocol
  • Time complexity (note 1): $n \binom{T}{t}$
  • Bit complexity ($\tilde{O}$): $\tilde{O}(n \binom{T}{t})$
  • Bit complexity ($O$): $n \binom{T}{t} (T \log |D| + k)$

Notes to Fig. 1:

  • 1: For the sake of simplicity, time complexities are given roughly in numbers of efficient operations (e.g., secret-sharing reconstructions, encryptions, polynomial evaluations); only the complexity of the slowest participant is reported.
  • 2: The authors of the paper do not give the exact complexity in the $O$ notation.
  • 3: Protocol with the subroutine from the first paragraph of Section VI-A.
  • 4: Protocol with the subroutine equality-matrix from Figure 7.

We compare our protocols to existing solutions using several complexity measures in Table 1. One of these complexity measures is the $\tilde{O}$ notation used for the bit message complexity in [3]. This notation is defined as follows. For functions $f$ and $g$, we write

$$f = \tilde{O}(g) \quad \text{if} \quad f(n, k) = O\bigl(g(n, k) \cdot \log^{O(1)}(n) \cdot \mathrm{poly}(k)\bigr),$$

where $k$ is the security parameter. This notation hides certain factors, such as a strong dependence on the security parameter $k$ (e.g. $k^3$), and is therefore less accurate than the standard big-O notation. We prefer this measure for the plain message complexity, where we restrict the bit size of the messages to be linear in $k$.
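A small numeric illustration shows why such a notation is coarse. It assumes the usual reading of soft-O with a security parameter, namely that polylogarithmic factors in $n$ and polynomial factors in $k$ are suppressed. The two cost functions below are hypothetical and are not the costs of any protocol in the paper; under that assumption both would be reported as $\tilde{O}(n)$.

```python
from math import log2


def cost_linear_in_k(n: int, k: int) -> float:
    """Hypothetical cost n * k: linear in the security parameter."""
    return float(n * k)


def cost_cubic_in_k(n: int, k: int) -> float:
    """Hypothetical cost n * k^3 * log^2(n): cubic in k, with polylog overhead."""
    return n * k**3 * log2(n) ** 2


def hidden_factor(n: int, k: int) -> float:
    """Ratio of the two costs, i.e. the factor that soft-O would hide."""
    return cost_cubic_in_k(n, k) / cost_linear_in_k(n, k)


if __name__ == "__main__":
    # Illustrative parameters: about a thousand elements, 128-bit security.
    n, k = 2**10, 128
    print("linear in k:", cost_linear_in_k(n, k))
    print("cubic in k: ", cost_cubic_in_k(n, k))
    print("hidden gap: ", hidden_factor(n, k))
```

For these values the gap equals $k^2 \log^2 n$, more than six orders of magnitude, even though both costs would carry the same soft-O label.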


Reference

This content is AI-processed based on open access ArXiv data.
