Secure Friend Discovery via Privacy-Preserving and Decentralized Community Detection

Pili Hu (HUPILI@IE.CUHK.EDU.HK)
Sherman S.M. Chow (SMCHOW@IE.CUHK.EDU.HK)
Wing Cheong Lau (WCLAU@IE.CUHK.EDU.HK)
Department of Information Engineering, The Chinese University of Hong Kong

Abstract

The problem of secure friend discovery on a social network has long been proposed and studied. The requirement is that a pair of nodes can make befriending decisions with minimum information exposed to the other party. In this paper, we propose to use community detection to tackle the problem of secure friend discovery. We formulate the first privacy-preserving and decentralized community detection problem as a multi-objective optimization. We design the first protocol to solve this problem, which transforms community detection into a series of Private Set Intersection (PSI) instances using Truncated Random Walk (TRW). Preliminary theoretical results show that our protocol can uncover communities with overwhelming probability and preserve privacy. We also discuss future work, potential extensions and variations.

1 Introduction

One important function provided by social networks is friend discovery. The problem of finding people of the same attribute/interest/community has long been studied in the context of social networks. For example, profile-based friend discovery can recommend people who have similar attributes/interests; topology-based friend discovery can recommend people from the same community. One special requirement of algorithms operating on social networks is that they must be privacy-preserving.
For example, social network nodes may be willing to share their attributes/interests with people having a similar profile, or they may be willing to share their raw connections with people in the same community. However, it is unfavourable to leak those private data to arbitrary strangers. Towards this end, the friend discovery routine should only expose minimal necessary information to involved parties.

ICML 2014 Workshop on Learning, Security and Privacy

In the current model of large-scale online social networks (OSNs), service providers like Facebook play the role of a Trusted Third Party (TTP). Friend discovery is accomplished as follows: 1) every node (user) gives his/her profile and friend list to the TTP; 2) the TTP runs any sophisticated social network mining algorithm (e.g. link prediction, community detection) and returns the friend recommendations only to related users. The mining algorithm can be a complex one involving node-level attributes, network topology, or both. Since the TTP has all the data, the result can be very accurate. This model is commercially viable and successfully deployed at large scale. However, the recent rise of privacy concerns motivates both researchers and developers to pursue other solutions.

Decentralized Social Networks (DSNs) like Diaspora (footnote 1) have recently been proposed and implemented. Since it is very difficult to design, implement and deploy a DSN (Datta et al., 2010), much research attention has focused on system issues. We envision that the DSN movement will gradually grow with users' increasing awareness of privacy. In fact, Diaspora, the largest DSN to date, has already accumulated 1 million users. With the decentralized infrastructure established, the next question is: can we support accurate friend discovery under the constraint that each node only observes partial information of the whole social network?
Note that the whole motivation of DSNs is that a single service provider cannot be fully trusted, so the TTP approach cannot be reused. Towards this end, the computation procedure must be decentralized.

Footnote 1: https://joindiaspora.com

One common approach in the literature to achieve decentralized and privacy-preserving friend discovery is to transform it into a set matching problem. For the first type (profile-based discovery), it is natural to represent one's attributes/interests/social activities in the form of a set (Zhang et al., 2012). For the second type (topology-based discovery), one straightforward way is to represent one's friend (neighbour) list in the form of a set (Nagy et al., 2013). In this way, both profile matching and common friend detection become a set intersection problem. There exists one useful crypto primitive called Private Set Intersection (PSI). Briefly and roughly speaking, given two sets $\mathcal{W}_1$ and $\mathcal{W}_2$ held by two nodes $v_1$ and $v_2$, a PSI protocol can compute $|\mathcal{W}_1 \cap \mathcal{W}_2|$ without letting either $v_1$ or $v_2$ know the other party's raw input. Researchers have proposed PSI schemes based on commutative encryption (Agrawal et al., 2003), oblivious polynomial evaluation (Freedman et al., 2004), oblivious pseudorandom functions (Freedman et al., 2005), index-hiding message encoding (Manulis et al., 2010), hardware (Hazay & Lindell, 2008), or generic construction (Huang et al., 2012) using garbled circuits (Yao, 1982). The aforementioned privacy-preserving profile matching/common friend detection protocols are variants of PSI protocols in terms of output, adversary model, security requirement and efficiency.

One major drawback of all the above works is that they cannot fully utilize the topology of a social network.
Firstly, a profile is just node-level information and is not always available on every social network. On the contrary, topology (connections/friendship relations) is the fundamental data available on social networks. Secondly, common friend is just one topology-based approach and it only works for nodes within 2 hops. In fact, our previous investigation showed that the common friend heuristic has moderate precision and low recall for discovering community-based friendship (Hu & Lau, 2013). This result is unsurprising because a community can easily span multiple hops. Towards this end, we focus on extending traditional secure friend discovery beyond 2 hops via community detection. Note that topology-only community detection (Clauset et al., 2004; Blondel et al., 2008; Raghavan et al., 2007; Leung et al., 2009; Agarwal & Kempe, 2008; Coscia et al., 2012; Soundarajan & Hopcroft, 2013) is a classical problem under the centralized and non-privacy-preserving setting, i.e. a single party possesses the complete social graph and does arbitrary computation. Although one can translate those algorithms into a privacy-preserving and decentralized protocol using generic garbled circuit construction (Yao, 1982), the computation and communication cost renders it impractical in the real world. To design an efficient scheme, we need to consider community detection accuracy and privacy preservation as a whole. A tradeoff among accuracy, privacy and efficiency can also be made when necessary.

To summarize, this paper makes the following contributions:

• We propose and formulate the first privacy-preserving and decentralized community detection problem, which largely improves the recall of topology-based friend discovery on Decentralized Social Networks.
• We design the first protocol to solve this problem. The protocol transforms the community detection problem into a series of Private Set Intersection (PSI) instances via Truncated Random Walk (TRW). Preliminary results show that the protocol can uncover communities with overwhelming probability and preserve privacy.
• We propose open problems and discuss future work, extensions and variations at the end.

2 Related Work

The first type of related work is Private Set Intersection (PSI), as PSI schemes are already widely used for secure friend discovery. The second type of related work is topology-based graph mining. Although our problem is termed "community detection", the most closely related works are actually topology-based Sybil defenses. This is because previous community detection problems are mainly considered under the centralized scenario. On the contrary, Sybil defense schemes see wide application in P2P systems, so one of the root concerns is decentralized execution. Note that there exist some distributed community detection works, but they cannot be directly used because nodes exchange too much information. For example, (Hui et al., 2007) allow nodes to exchange adjacency lists and intermediate community detection results, which directly breaks the privacy constraint that we will formulate in the following sections. Due to the space limit, a detailed survey of related work is omitted. Interested readers can see community detection surveys (Fortunato, 2010; Xie et al., 2013) and Sybil detection surveys (Yu, 2011; Alvisi et al., 2013).

3 Problem Formulation

The notion of community is that intra-community linkage is dense and inter-community linkage is sparse. In this section, we first review classical community detection formulations under the centralized scenario and our previous formulation under the decentralized scenario. Then we formulate the privacy-preserving version.
To make the problem amenable to theoretical analysis, we consider a Community-Based Random Graph (CBRG) model in the last part.

3.1 Previous Community Detection Formulations

Classical community detection is formulated as a clustering problem. That is, given the full graph $G = (V, E)$, partition the vertex set into $K$ subsets $S_1, S_2, \ldots, S_K$ (a partitioning), such that $S_i \cap S_j = \emptyset$ for all $i \neq j$ and $\cup_{i=1}^{K} S_i = V$. A quality metric $Q(\{S_1, \ldots, S_K\})$ is defined over the partitions and a community detection algorithm will try to find a partitioning that maximizes or minimizes $Q$ depending on its nature. This is for non-overlapping community detection, and one can simply remove the constraint $S_i \cap S_j = \emptyset$ to get the overlapping version. Note that $Q$ is only an artificial surrogate for the axiomatic notion of community. The maximum $Q$ does not necessarily correspond to the best community. However, the community detection problem becomes tractable via well-studied optimization frameworks by assuming a form of $Q$, e.g. Modularity or Conductance. Most classical works are along this line, mainly due to the lack of ground-truth data in early years.

Now consider the decentralized scenario. One node (the observer) is limited to its local view of the whole graph. It is unreasonable to ask for a global partitioning in terms of sets of nodes. The tractable question to ask is whether one node is in the same community as the observer or not. This gives a binary classification formulation of community detection (Hu & Lau, 2013). The result of community detection with respect to a single observer can be represented as a length-$|V|$ vector. Stacking all those vectors together, we get a community encoding matrix (Zhong et al., 2014):

$$M_{i,j} = \begin{cases} 1 & \exists S_k \text{ s.t. } v_i \in S_k,\ v_j \in S_k \\ 0 & \text{otherwise} \end{cases}$$

This matrix representation is subsumed by the partitioning representation in the general case. If restricted to the non-overlapping case, the two representations are equivalent. Since $M$ encodes all pairwise outcomes, it is immediately useful for the friend discovery application. In what follows, we will define accuracy and privacy in terms of how well $M$ can be learned by nodes or the adversary.

3.2 Privacy-Preserving Community Detection

In this initial study, we focus on a non-collusive passive adversary. That is, DSN nodes all execute our protocol faithfully but they are curious to infer further information from the observed protocol sequence. We use a single non-collusive sniff-only adversary to capture this notion. The system components are as follows:

• Graph: $G = (V, E)$. The connection matrix is denoted as $C$, where $C_{i,j} = 1$ if $(v_i, v_j) \in E$; otherwise, $C_{i,j} = 0$. The ground-truth community encoding matrix is denoted as $M^g$, which is unknown to all parties at the beginning. For simplicity of discussion, we assume the node identifiers, i.e. $V$, are public information.
• Nodes: $v_1, \ldots, v_{|V|} \in V$. A node's initial knowledge is its own direct connections, i.e. $N(v_i) = \{v_j \mid (v_j, v_i) \in E\}$. Nodes are fully honest. Their objective is to maximize the accuracy of detecting $M$. Eventually, a node $v_i$ can get the full row (column) in $M$, denoted by $M_{i,:}$ ($M_{:,i}$). Depending on the protocol choice, relevant cells in $M$ can be made available immediately or on demand.
• Adversary: $A$. It can passively sniff on one node $v_a \in V$. $A$ will observe all protocol sequences related to $v_a$, including the initial knowledge $N(v_a)$ and the community detection result $M_{a,:}$. $A$'s objective is to maximize the success rate in guessing $M^g$ and $C$, using any Probabilistic Polynomial Algorithm (PPA).
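As a concrete illustration of the objects defined above, the following Python sketch builds the community encoding matrix $M$ from a list of communities and reads a node's initial knowledge $N(v_i)$ off a connection matrix $C$. The function names, the 0-based node indices and the list-of-lists representation are assumptions made for illustration; they are not part of the formulation.

```python
def encoding_matrix(num_nodes, communities):
    """Return M with M[i][j] = 1 iff some community contains both v_i and v_j."""
    M = [[0] * num_nodes for _ in range(num_nodes)]
    for members in communities:
        for i in members:
            for j in members:
                M[i][j] = 1
    return M


def neighbours(C, i):
    """Return N(v_i) = {j : C[j][i] = 1}, the initial knowledge of node v_i."""
    return {j for j in range(len(C)) if C[j][i] == 1}


# Two non-overlapping communities {v_0, v_1} and {v_2, v_3}.
M = encoding_matrix(4, [{0, 1}, {2, 3}])

# Overlapping communities are handled the same way: v_1 belongs to both.
M_overlap = encoding_matrix(3, [{0, 1}, {1, 2}])

# A path graph v_0 - v_1 - v_2 given as a connection matrix.
C = [[0, 1, 0],
     [1, 0, 1],
     [0, 1, 0]]
```

In the overlapping example, $v_0$ and $v_2$ share no community, so their cell stays 0 even though both are grouped with $v_1$.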
Note that the full separation of Nodes and Adversary is for ease of discussion. In a real DSN, this passive attacker can be a curious user who wants to infer more information about the network.

As protocol designers, our objectives are:

• Accurately detect communities after execution of the protocol, i.e. make $M$ and $M^g$ as close as possible.
• Limit the success rate of the adversary's guessing of $M^g$ and $C$, under the condition that $A$ gets the protocol sequence on node $v_a$ and makes the best guess via a PPA.

One can see that our problem is multi-objective in nature. The accuracy part is a maximization problem and the privacy part is a min-max problem. The formal definition is given in Eq. 1.

In this formulation, "Protocol" is an abstract notation for the protocol specification, not the protocol execution sequence. $I_a$ is the information observed by the adversary, which is dependent on Protocol. $\mathrm{Succ}(B^1, B^2, R)$ is the measure of success rate, with symbols defined as follows:

• $B^1, B^2 \in \{0, 1\}^{|V| \times |V|}$ are two $\{0, 1\}$ matrices of the same size as $M$ and $C$.
• $R \subseteq V \times V$ is the challenge relation.
• To measure how close the two matrices are over the challenge set, we use the success rate:

$$\mathrm{Succ}(B^1, B^2, R) = \Pr\left\{ B^1_{i,j} = B^2_{i,j} \;\middle|\; (v_i, v_j) \xleftarrow{\$} R \right\}$$

That is, how likely a randomly selected pair of nodes from $R$ will have the same value in $B^1$ and $B^2$.

For the accuracy part, we define the challenge relation as $V \times V$ because we want the result to be accurate for all nodes. For the privacy part, we define the challenge relation as $R^C_a = R^M_a = (V - U(a)) \times (V - U(a))$, where $U(a)$ denotes the set of nodes in the same community as $a$. The reason to exclude nodes from the same community is obvious. Since the adversary will get $M_{a,:}$ after protocol execution, it already knows the community membership of $U(a)$.
Given the knowledge of community, one can make a more intelligent guess of the connections. This is made clear in later discussions.

$$
\max_{\substack{\text{Find Protocol},\\ M = \text{Protocol}(G)}}
\begin{bmatrix}
\mathrm{Succ}(M, M^g, V \times V), \\[4pt]
-\left\{ \max_{\substack{\text{Algo} \in \text{PPA},\ a \xleftarrow{\$} V,\\ C^A, M^A \leftarrow \text{Algo}(\text{Protocol}, I_a)}}
\begin{bmatrix} \mathrm{Succ}(C^A, C, R^C_a), \\ \mathrm{Succ}(M^A, M^g, R^M_a) \end{bmatrix} \right\}
\end{bmatrix}
\tag{1}
$$

3.3 Community-Based Random Graph (CBRG) Generation Model

Before proceeding, we remark that the problem defined in Eq. 1 is hard even without the privacy-preserving objective. In other words, the community detection problem (accuracy) has not been fully solved even under the TTP scenario. To improve the accuracy, researchers have already used heavy mathematical programming tools, tried to incorporate more side information, developed problem-specific heuristics, or performed heavy-duty parameter tuning. To make our problem amenable to theoretical analysis, we consider a Community-Based Random Graph (CBRG) model in this paper. Let $M^g$ be the ground-truth community encoding matrix. We generate the random connection matrix as follows: $\Pr\{C_{i,j} = 1\} = p$ if $M^g_{i,j} = 1$ ($v_i$ and $v_j$ are in the same community); $\Pr\{C_{i,j} = 1\} = q$ otherwise. There are $K$ communities, each of size $c$, so the total number of vertices is $|V| = Kc$. We denote such a random graph as $\mathrm{CBRG}(K, c, p, q)$. One example ground-truth community encoding matrix and the expected connection matrix are illustrated in Fig. 1.

$$
M^g = \begin{bmatrix} 1 & & & \\ 1 & 1 & & \\ 0 & 0 & 1 & \\ 0 & 0 & 1 & 1 \end{bmatrix}, \qquad
E[C] = \begin{bmatrix} p & & & \\ p & p & & \\ q & q & p & \\ q & q & p & p \end{bmatrix}
$$

Figure 1. Illustration of community-based random graph generation. $K = 2$, $c = 2$.

4 Proposed Scheme

In this section, we present our protocol and main results.
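Before turning to the protocol, the $\mathrm{CBRG}(K, c, p, q)$ model of Section 3.3 can be sketched in a few lines of Python. The sketch assumes an undirected graph, places community $k$ on the consecutive node indices $kc, \ldots, kc + c - 1$, and draws no self-loops; these choices, as well as the function name `cbrg` and the `seed` parameter, are assumptions made for illustration.

```python
import random


def cbrg(K, c, p, q, seed=None):
    """Sample a connection matrix C from CBRG(K, c, p, q).

    Returns (C, community), where community[i] is the community index of v_i.
    """
    rng = random.Random(seed)
    n = K * c
    community = [i // c for i in range(n)]  # consecutive blocks of size c
    C = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):  # each unordered pair is drawn once
            prob = p if community[i] == community[j] else q
            if rng.random() < prob:
                C[i][j] = C[j][i] = 1  # keep C symmetric
    return C, community


# Degenerate case p = 1, q = 0: every intra-community edge, no inter-community edge.
C, community = cbrg(K=2, c=2, p=1.0, q=0.0, seed=1)
```

With $p = 1$ and $q = 0$ the sample is deterministic, which makes the block structure of the model easy to inspect before trying realistic parameters.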
4.1 Protocol Design

Our protocol involves two stages:

• Pre-processing is done via Truncated Random Walk. Every node $v_i$ sends out $W$ random walkers, $w^{v_i}_1, \ldots, w^{v_i}_W$, with time-to-live (TTL) values $l^{v_i}_1, \ldots, l^{v_i}_W$ initially set to $L$. Upon receiving a Random Walker (RW) $w$, the node records the ID of $w$, decrements its TTL $l$, and sends it to a random neighbour if $l > 0$. At the end of this stage, each node $v_i$ has accumulated a set of random walker IDs $\mathcal{W}_i$. With proper parameters $W$ and $L$, the truncated random walkers issued by $v_i$ will more likely reach other nodes in the same community as $v_i$. So by inspecting the intersection size of $\mathcal{W}_i$ and $\mathcal{W}_j$, we can answer whether $v_i$ and $v_j$ are in the same community. This essentially transforms the community detection problem into a set intersection problem.
• To uncover the relevant cells in the pairwise community encoding matrix $M$, we only need to perform Private Set Intersection (PSI) on two sets. PSI schemes differ in their flavours: 1) reveal the intersection set (PSI-Set); 2) reveal the intersection size (PSI-Cardinality); 3) reveal whether the intersection size is greater than a threshold (PSI-Threshold). We use the third type of PSI in our construction, which can be implemented by adapting (Zhang et al., 2012). In what follows, we just assume the existence of such a crypto primitive: it computes $\mathbb{I}[|\mathcal{W}_i \cap \mathcal{W}_j| > T]$ without leaking extra information.

One can see that the scheme is decentralized by design. We only need to argue its community detection accuracy and the privacy-preserving property.

4.2 Summary of Theoretical Guarantees

The intuition of our proof is as follows:

• Truncated Random Walk will be mostly limited to one community, if the axiomatic notion of "community" holds. More precisely, as long as $p$ is sufficiently larger than $\beta_1 = (K - 1)q$, there will be enough difference in intersection size between nodes coming from the same community and nodes coming from different communities. In this case, we can set a proper threshold to ensure a low error rate.
• Observe two facts about the privacy objective: 1) most of the protocol sequence the adversary observes comes from its own community; 2) we exclude $A$'s community from the challenge relations. In order to make better-than-prior guesses, $A$ at least needs to observe some other nodes from the protocol sequence. The number of nodes from $V - U(a)$ that can be observed is limited. Even if we assume the adversary can make good use of the information (captured by coefficients $\gamma_M, \gamma_C \in [0, 1]$), this small advantage is averaged out over a large challenge relation set.

The detailed proof is omitted and the main results are summarized in the following theorem.

Theorem 1. Our protocol guarantees:

• False Positive Rate:
$$\Pr\{|\mathcal{W}_i \cap \mathcal{W}_j| > T_1 \mid M^g_{i,j} = 0\} \le \frac{\phi W L (L + 1)^2}{2 (K - 1) T_1}$$
• False Negative Rate ($\mu = cWP$):
$$\Pr\{|\mathcal{W}_i \cap \mathcal{W}_j| \le T_2 \mid M^g_{i,j} = 1\} \le e^{-\mu (1 - T_2/\mu)^2 / 2}$$
• Adversary's advantage:
$$\mathrm{Adv}(M^A, M^g, R^M_a) \le \gamma_M \frac{4 W (L + 1)}{(K - 1) c}$$
$$\mathrm{Adv}(C^A, C, R^C_a) \le \gamma_C \frac{4 W (L + 1)}{(K - 1) c}$$

In the theorem, $\mathrm{Adv}(B^1, B^2, R) = \mathrm{Succ}(B^1, B^2, R) - \mathrm{Prior}(B^1, B^2, R)$. $\mathrm{Prior}(B^1, B^2, R)$ denotes the probability of making a successful guess based on mere prior information of $B^2$. For example, suppose $B^2$ contains 1 as the majority, i.e. $\Pr\{B^2_{i,j} = 1 \mid (i, j) \xleftarrow{\$} R\} = P > 0.5$. The best guess is to let $B^1_{i,j} = 1, \forall (i, j) \in R$. One can show that the success probability is $P$ and this strategy is optimal if no other information is available.
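The majority-guess example above can be made concrete with a small Python sketch. It assumes the simplest reading of Prior, namely the success rate of the best constant guess (all 0s or all 1s) over the challenge relation; the function names and the toy matrices are hypothetical and serve only to illustrate how Succ, Prior and Adv relate.

```python
def succ(B1, B2, R):
    """Fraction of challenge pairs (i, j) in R on which B1 and B2 agree."""
    R = list(R)
    return sum(B1[i][j] == B2[i][j] for i, j in R) / len(R)


def prior(B2, R):
    """Success rate of the best constant guess over R (assumed form of Prior)."""
    R = list(R)
    ones = sum(B2[i][j] for i, j in R) / len(R)
    return max(ones, 1 - ones)


def adv(B1, B2, R):
    """Advantage of guess B1 over the prior-only strategy."""
    return succ(B1, B2, R) - prior(B2, R)


# Toy target with three 1s out of four cells, so P = 0.75.
B2 = [[1, 1],
      [1, 0]]
R = [(i, j) for i in range(2) for j in range(2)]
all_ones = [[1, 1],
            [1, 1]]
```

Here guessing all 1s reaches a success rate of $P = 0.75$ and therefore zero advantage, while a perfect guess has advantage $1 - P = 0.25$.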
Due to the specifics of our problem, the adversary can make more intelligent guesses than a random $\{0, 1\}$ bit. Towards this end, the advantage is defined with respect to the success rate of this prior-based strategy.

4.3 One Instantiation

Due to the specifics of the problem, both accuracy and privacy guarantees are parameterized. To give an intuitive view of what can be achieved, consider one instantiation of CBRG: $K = 100$ (number of communities), $c = 500$ (number of nodes in one community), $p = 0.5$ (intra-community edge generation probability), $q = 0.0005$ (inter-community edge generation probability), so that $\beta_1 = q(K - 1) \approx 0.05$. We can set protocol parameters as follows: $W = 100$ (number of RWs issued by one node), $L = 3$ (length of RW) and $T = 61$ (threshold of intersection size). This gives us the following accuracy and privacy guarantees:

• False Negative Rate: $\le 1.9 \times 10^{-22}$
• False Positive Rate: $\le 0.066$
• Advantage for guessing $M$: $\le 0.032 \times \gamma_M$
• Advantage for guessing $C$: $\le 0.032 \times \gamma_C$

One can see that our proposed protocol can accurately detect communities and preserve privacy given proper parameters. Note first that the above $W$ and $L$ are casually selected by heuristics and have not been jointly optimized. Note second that the FPR and FNR can be exponentially reduced by repeated experiments, which only maps to a linear increase in $W$. The example in this section only demonstrates the effectiveness of our protocol; a full exploration of the design space is left for future work.

5 Conclusion, Discussion and Future Work

We formulated the privacy-preserving community detection problem in this paper as a multi-objective optimization. We proposed a protocol based on Truncated Random Walk (TRW) and Private Set Intersection (PSI). We have proven that our protocol detects communities with overwhelming probability and preserves privacy.
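As a rough illustration of the two protocol stages, the following Python sketch simulates the truncated random walks on an adjacency map and then evaluates the threshold indicator directly on the plaintext sets. This is only an insecure stand-in: a real deployment would replace `same_community` with a PSI-Threshold primitive so that neither party sees the other's set. The function names, the walker ID format `(origin, k)`, and the choice that the issuing node does not record its own walker are assumptions made for illustration.

```python
import random


def truncated_random_walks(adj, W, L, seed=None):
    """Simulate the pre-processing stage.

    adj maps each node to the set of its neighbours. Every node issues W
    walkers with TTL L. Returns a dict mapping each node to the set of
    walker IDs it has received.
    """
    rng = random.Random(seed)
    seen = {v: set() for v in adj}
    for origin in adj:
        for k in range(W):
            walker_id = (origin, k)
            node, ttl = origin, L
            while ttl > 0 and adj[node]:
                node = rng.choice(sorted(adj[node]))  # forward to a random neighbour
                seen[node].add(walker_id)             # receiver records the walker ID
                ttl -= 1                              # receiver decrements the TTL
    return seen


def same_community(seen, i, j, T):
    """Plaintext stand-in for PSI-Threshold: the indicator I[|W_i ∩ W_j| > T]."""
    return len(seen[i] & seen[j]) > T


# Toy graph: two disconnected edges, i.e. two communities of size 2.
adj = {0: {1}, 1: {0}, 2: {3}, 3: {2}}
seen = truncated_random_walks(adj, W=5, L=2, seed=0)
```

On this toy graph every walker bounces inside its own component, so nodes 0 and 1 end up with identical sets of 10 walker IDs while nodes 0 and 2 share none.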
Exploration of the design space and thorough experimentation on synthesized/real graphs are left for future work. In the following parts of this early report, we discuss several simpler candidate protocols and how they fail to meet our objective. This helps to demonstrate the rationale of our formulation and protocol design.

5.1 Simpler But Weaker Protocols

Suppose we change the protocol such that $v_i$ and $v_j$ first exchange $\mathcal{W}_i$ and $\mathcal{W}_j$ and then run any intersection algorithm separately. After uncovering all related cells in $M$, the adversary knows $\mathcal{W}_i, \forall i = 1, \ldots, |V|$. $A$ can directly calculate $|\mathcal{W}_i \cap \mathcal{W}_j|, \forall i, j$. This allows the adversary to guess $M$ perfectly. From the community membership, $A$ can further infer links because the intra-community edge generation probability and the inter-community edge generation probability are different. This already allows a better guess than using the global prior of $C$. Furthermore, inferring links from measurements is a classical well-studied topic called Network Tomography. $A$ can actually re-organize the $\mathcal{W}_i$'s into a list of size-$L$ sets, each representing the nodes traversed by a RW. Researchers have shown that links can be inferred from this co-occurrence data with good accuracy, e.g. NICO (Rabbat et al., 2008).

Another natural thought to protect non-common set elements is via hashing. Suppose there exists a cryptographic hash $h(\cdot)$. We define $H_i = \{h(w) \mid w \in \mathcal{W}_i\}$. Now, two nodes just compare $H_i$ and $H_j$ in the community uncovering stage. This can protect the true identities of the RWs if their ID space is large enough. However, it does not prevent the adversary from making intelligent guesses of $M$ and $C$. Methods noted in the previous paragraph can also be used in this case.

In our protocol, we used the PSI-Threshold version. That is, given $\mathcal{W}_i$ and $\mathcal{W}_j$, the two parties know nothing except for the indicator $\mathbb{I}[|\mathcal{W}_i \cap \mathcal{W}_j| > T]$.
Two weaker and widely studied variations are PSI-Cardinality and PSI-Set. Consider PSI-Set. The adversary now only knows elements in the intersection. Based on his own $\mathcal{W}_a$ and the PSI-Set protocol sequence, he can get $\mathcal{W}_i \cap \mathcal{W}_j \cap \mathcal{W}_a, \forall i, j$. $A$ can calculate the probability that a RW $w$ traverses both $v_i$ and $v_j$ conditioned on $w$ traversing $v_a$. Based on this information, $A$ can adjust thresholds $T_1$ and $T_2$ to accurately detect communities. The derivation is similar to that of our protocol in this paper but more technically involved, and is also left as future work. The bottom line is that PSI-Set leaks enough information for more intelligent guesses. As for PSI-Cardinality, we are not sure at present what an adversary can do with $|\mathcal{W}_i \cap \mathcal{W}_a|, \forall i$. Since the two variants leak more information and might potentially be exploited, we use PSI-Threshold in our protocol.

5.2 Open Problems

Following are some open problems of privacy-preserving community detection:

• If we allow a small fraction of nodes to collude, how to define a reasonable security game? What privacy-preserving result can we achieve?
• The current scheme requires all nodes to re-run the protocol if there is any change in the topology, e.g. a new node joins or a new friendship (connection) is formed. Is it possible to find a privacy-preserving community detection scheme that can be incrementally updated?
• The privacy preservation of our proposed protocol is dependent on graph size. One root cause is that we only leveraged crypto primitives in the Private Set Intersection (PSI) part. The simulation of Truncated Random Walk (TRW) is done in a normal way. Since random walk is a basic construct in many graph algorithms, it is of interest to know how (or whether) nodes can simulate Random Walk in a decentralized and privacy-preserving fashion.

REFERENCES

Agarwal, G. and Kempe, D. Modularity-maximizing graph communities via mathematical programming. The European Physical Journal B-Condensed Matter and Complex Systems, 66(3):409–418, 2008.

Agrawal, R., Evfimievski, A., and Srikant, R. Information sharing across private databases. In Proceedings of the 2003 ACM SIGMOD international conference on Management of data, pp. 86–97. ACM, 2003.

Alvisi, Lorenzo, Clement, Allen, Epasto, Alessandro, Lattanzi, Silvio, and Panconesi, Alessandro. SoK: The evolution of sybil defense via social networks. In Security and Privacy (SP), 2013 IEEE Symposium on, pp. 382–396. IEEE, 2013.

Blondel, V.D., Guillaume, J.L., Lambiotte, R., and Lefebvre, E. Fast unfolding of communities in large networks. Journal of Statistical Mechanics: Theory and Experiment, 2008(10):P10008, 2008.

Clauset, A., Newman, M.E.J., and Moore, C. Finding community structure in very large networks. Physical Review E, 70(6):066111, 2004.

Coscia, M., Rossetti, G., Giannotti, F., and Pedreschi, D. DEMON: a local-first discovery method for overlapping communities. In ACM SIGKDD, 2012.

Datta, A., Buchegger, S., Vu, L.H., Strufe, T., and Rzadca, K. Decentralized online social networks. Handbook of Social Network Technologies and Applications, pp. 349–378, 2010.

Fortunato, S. Community detection in graphs. Physics Reports, 486(3-5):75–174, 2010.

Freedman, M., Nissim, K., and Pinkas, B. Efficient private matching and set intersection. In Advances in Cryptology-EUROCRYPT 2004, pp. 1–19. Springer, 2004.

Freedman, Michael J, Ishai, Yuval, Pinkas, Benny, and Reingold, Omer. Keyword search and oblivious pseudorandom functions. In Theory of Cryptography, pp. 303–324. Springer, 2005.

Hazay, Carmit and Lindell, Yehuda. Constructions of truly practical secure protocols using standard smartcards. In Proceedings of the 15th ACM conference on Computer and communications security, pp. 491–500. ACM, 2008.

Hu, Pili and Lau, Wing Cheong. Community classification on decentralized social networks based on 2-hop neighbourhood information. In IEEE ICNP, 2013.

Huang, Yan, Evans, David, and Katz, Jonathan. Private set intersection: Are garbled circuits better than custom protocols. In Network and Distributed System Security Symposium (NDSS). The Internet Society, 2012.

Hui, P., Yoneki, E., Chan, S.Y., and Crowcroft, J. Distributed community detection in delay tolerant networks. In Proceedings of 2nd ACM/IEEE international workshop on Mobility in the evolving internet architecture, pp. 7. ACM, 2007.

Leung, I.X.Y., Hui, P., Lio, P., and Crowcroft, J. Towards real-time community detection in large networks. Physical Review E, 79(6):066107, 2009.

Manulis, Mark, Pinkas, Benny, and Poettering, Bertram. Privacy-preserving group discovery with linear complexity. In Applied Cryptography and Network Security, pp. 420–437. Springer, 2010.

Nagy, Marcin, De Cristofaro, Emiliano, Dmitrienko, Alexandra, Asokan, N, and Sadeghi, Ahmad-Reza. Do I know you?: efficient and privacy-preserving common friend-finder protocols and applications. In Proceedings of the 29th Annual Computer Security Applications Conference, pp. 159–168. ACM, 2013.

Rabbat, M.G., Figueiredo, M.A.T., and Nowak, R.D. Network inference from co-occurrences. IEEE Transactions on Information Theory, 54(9):4053–4068, 2008.

Raghavan, U.N., Albert, R., and Kumara, S. Near linear time algorithm to detect community structures in large-scale networks. Physical Review E, 76(3):036106, 2007.

Soundarajan, Sucheta and Hopcroft, John E. Use of local group information to identify communities in networks. TKDD, 2013.

Xie, Jierui, Kelley, Stephen, and Szymanski, Boleslaw K. Overlapping community detection in networks: the state of the art and comparative study. ACM Computing Surveys, 45(4):1–37, 2013.

Yao, A.C. Protocols for secure computations. In Proceedings of the 23rd Annual Symposium on Foundations of Computer Science, pp. 160–164, 1982.

Yu, H. Sybil defenses via social networks: a tutorial and survey. ACM SIGACT News, 42(3):80–101, 2011.

Zhang, R., Zhang, Y., Sun, J.S., and Yan, G. Fine-grained private matching for proximity-based mobile social networking. In Infocom, 2012.

Zhong, Xiang, Hu, Pili, and Lau, Wing Cheong. Scalable and robust community detection via proximity-based cut and merge. In IEEE ICC, 2014.
