Secure Mining of Association Rules in Horizontally Distributed Databases


📝 Abstract

We propose a protocol for secure mining of association rules in horizontally distributed databases. The current leading protocol is that of Kantarcioglu and Clifton (TKDE 2004). Our protocol, like theirs, is based on the Fast Distributed Mining (FDM) algorithm of Cheung et al. (PDIS 1996), which is an unsecured distributed version of the Apriori algorithm. The main ingredients in our protocol are two novel secure multi-party algorithms — one that computes the union of private subsets that each of the interacting players hold, and another that tests the inclusion of an element held by one player in a subset held by another. Our protocol offers enhanced privacy with respect to the protocol of Kantarcioglu and Clifton. In addition, it is simpler and is significantly more efficient in terms of communication rounds, communication cost and computational cost.

📄 Content

arXiv:1106.5113v1 [cs.DB] 25 Jun 2011

Secure Mining of Association Rules in Horizontally Distributed Databases

Tamir Tassa
Department of Mathematics and Computer Science, The Open University, Israel

Abstract. We propose a protocol for secure mining of association rules in horizontally distributed databases. The current leading protocol is that of Kantarcioglu and Clifton [12]. Our protocol, like theirs, is based on the Fast Distributed Mining (FDM) algorithm of Cheung et al. [6], which is an unsecured distributed version of the Apriori algorithm. The main ingredients in our protocol are two novel secure multi-party algorithms: one that computes the union of private subsets that each of the interacting players hold, and another that tests the inclusion of an element held by one player in a subset held by another. Our protocol offers enhanced privacy with respect to the protocol in [12]. In addition, it is simpler and is significantly more efficient in terms of communication rounds, communication cost and computational cost.

Key words: Privacy Preserving Data Mining, Distributed Computation, Frequent Itemsets, Association Rules

1 Introduction

We study here the problem of secure mining of association rules in horizontally partitioned databases. In that setting, there are several sites (or players) that hold homogeneous databases, i.e., databases that share the same schema but hold information on different entities. The goal is to find all association rules with given minimal support and confidence levels that hold in the unified database, while minimizing the information disclosed about the private databases held by those players.

That goal defines a problem of secure multi-party computation. In such problems, there are M players that hold private inputs, x_1, ..., x_M, and they wish to securely compute y = f(x_1, ..., x_M) for some public function f.
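To make the function f concrete for this setting, the following is a minimal sketch of the frequent-itemset part of the computation, done entirely in the clear, as an idealized trusted party would do it. It offers no privacy and is not the protocol proposed here or that of [12]. The function names and the toy transactions are illustrative assumptions, not taken from the paper. The point it shows is that, because the databases are horizontally partitioned, the support count of an itemset in the unified database is the sum of its local support counts.

```python
from collections import Counter
from itertools import combinations


def local_support_counts(transactions, k):
    """Count, within one site's database, the transactions containing each k-itemset."""
    counts = Counter()
    for t in transactions:
        for itemset in combinations(sorted(set(t)), k):
            counts[itemset] += 1
    return counts


def globally_frequent(sites, k, min_support):
    """Return the k-itemsets whose support in the unified database
    is at least min_support (given as a fraction of all transactions)."""
    total = sum(len(db) for db in sites)
    global_counts = Counter()
    for db in sites:
        # Horizontal partitioning: global count = sum of local counts.
        global_counts.update(local_support_counts(db, k))
    return {s: c for s, c in global_counts.items() if c >= min_support * total}


# Two hypothetical sites sharing one schema (a set of purchased items).
sites = [
    [{"bread", "milk"}, {"bread", "butter"}, {"milk"}],
    [{"bread", "milk", "butter"}, {"bread", "milk"}],
]
frequent_pairs = globally_frequent(sites, 2, 0.5)
print(frequent_pairs)
```

With five transactions overall and a 50% threshold, only the pair (bread, milk), which appears in three transactions, qualifies. A secure protocol aims to produce the same output without any site revealing its local counts or transactions.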
If there existed a trusted third party, the players could surrender their inputs to him, and he would evaluate the function and send them the resulting output. In the absence of such a trusted third party, the players need a protocol that they can run on their own in order to arrive at the required output y. Such a protocol is considered perfectly secure if no player can learn from his view of the protocol more than what he would have learnt in the idealized setting where the computation is carried out by a trusted third party. Yao [21] was the first to propose a generic solution for this problem in the case of two players. Other generic solutions, for the multi-party case, were later proposed in [2,4,10].

In our problem, the inputs are the partial databases, and the required output is the list of association rules with given support and confidence. As the above-mentioned generic solutions rely upon a description of the function f as a Boolean circuit, they can be applied only to small inputs and to functions that are realizable by simple circuits. In more complex settings, such as ours, other methods are required for carrying out this computation. In such cases, some relaxations of the notion of perfect security might be inevitable when looking for practical protocols, provided that the excess information is deemed benign (see examples of such protocols in e.g. [12,20,23]).

Kantarcioglu and Clifton studied that problem in [12] and devised a protocol for its solution. The main part of the protocol is a sub-protocol for the secure computation of the union of private subsets that are held by the different players. (Those subsets include candidate itemsets, as we explain below.) That is the most costly part of the protocol, and its implementation relies upon cryptographic primitives such as commutative encryption, oblivious transfer, and hash functions.
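For readers unfamiliar with commutative encryption, the sketch below illustrates only the defining property: encrypting under two keys gives the same ciphertext regardless of the order in which the keys are applied. It uses modular exponentiation over a prime field, a well-known construction with this property. Whether [12] uses this particular cipher is not stated in this excerpt, so treat the choice as an assumption. The modulus, keys, and helper names are toy values chosen for illustration and are not secure parameters.

```python
import math

P = 2**61 - 1  # a Mersenne prime, used here purely for illustration


def pick_key(start):
    """Return the first k >= start that is invertible modulo P - 1."""
    k = start
    while math.gcd(k, P - 1) != 1:
        k += 1
    return k


def encrypt(x, key):
    """E_key(x) = x^key mod P, for 0 < x < P."""
    return pow(x, key, P)


def decrypt(c, key):
    """Invert encrypt() by raising to key^{-1} mod (P - 1) (Python 3.8+)."""
    return pow(c, pow(key, -1, P - 1), P)


key_a = pick_key(1_000_003)  # hypothetical key of player A
key_b = pick_key(2_000_003)  # hypothetical key of player B
message = 123456789          # stands in for an encoded itemset

ab = encrypt(encrypt(message, key_a), key_b)  # A encrypts first, then B
ba = encrypt(encrypt(message, key_b), key_a)  # B encrypts first, then A

print(ab == ba)                                          # order does not matter
print(decrypt(decrypt(ab, key_a), key_b) == message)     # layers peel off in any order
```

Because (x^a)^b and (x^b)^a are both x^(ab) mod P, equal plaintexts encrypted by all players collide to the same ciphertext, which is what lets duplicates be detected without knowing who contributed them.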
This is also the only part of the protocol in which the players may extract from their view of the protocol information on other databases, beyond what is implied by the final output and their own input. While such leakage of information renders the protocol not perfectly secure, the perimeter of the excess information is explicitly bounded in [12], and it is argued there that such information leakage is innocuous and hence acceptable from a practical point of view.

Herein we propose an alternative protocol for the secure computation of the union of private subsets. The proposed protocol improves upon that in [12] in terms of simplicity and efficiency as well as privacy. In particular, our protocol does not depend on commutative encryption and oblivious transfer (which simplifies it significantly and contributes towards reduced communication and computational costs). While our solution is still not perfectly secure, it leaks excess information only to a small number of coalitions (three), unlike the protocol of [12], which discloses information also to some single players. In addition, we claim that the excess information that our protocol may leak is less sensitive than the excess informati

This content is AI-processed based on ArXiv data.
