Private Data Release via Learning Thresholds


Authors: Moritz Hardt, Guy N. Rothblum, Rocco A. Servedio

November 27, 2024

Abstract

This work considers computationally efficient privacy-preserving data release. We study the task of analyzing a database containing sensitive information about individual participants. Given a set of statistical queries on the data, we want to release approximate answers to the queries while also guaranteeing differential privacy, protecting each participant's sensitive data. Our focus is on computationally efficient data release algorithms; we seek algorithms whose running time is polynomial, or at least sub-exponential, in the data dimensionality. Our primary contribution is a computationally efficient reduction from differentially private data release for a class of counting queries to learning thresholded sums of predicates from a related class. We instantiate this general reduction with a variety of algorithms for learning thresholds. These instantiations yield several new results for differentially private data release. As two examples, taking $\{0,1\}^d$ to be the data domain (of dimension $d$), we obtain differentially private algorithms for:

1. Releasing all $k$-way conjunction counting queries (or $k$-way contingency tables). For any given $k$, the resulting data release algorithm has bounded error as long as the database is of size at least $d^{O(\sqrt{k \log(k \log d)})}$ (ignoring the dependence on other parameters). The running time is polynomial in the database size. The best sub-exponential time algorithms known prior to our work required a database of size $\tilde{O}(d^{k/2})$ [Dwork, McSherry, Nissim and Smith 2006].

2. Releasing a $(1-\gamma)$-fraction of all $2^d$ parity counting queries. For any $\gamma > \mathrm{poly}(1/d)$, the algorithm has bounded error as long as the database is of size at least $\mathrm{poly}(d)$ (again ignoring the dependence on other parameters).
The running time is polynomial in the database size.

Several other instantiations yield further results for privacy-preserving data release. Of the two results highlighted above, the first learning algorithm uses techniques for representing thresholded sums of predicates as low-degree polynomial threshold functions. The second learning algorithm is based on Jackson's Harmonic Sieve algorithm [Jackson 1997]. It utilizes Fourier analysis of the database viewed as a function mapping queries to answers.

∗ Center for Computational Intractability, Department of Computer Science, Princeton University. Supported by NSF grants CCF-0426582 and CCF-0832797. Email: mhardt@cs.princeton.edu.
† Microsoft Research, Silicon Valley Campus. Most of this work was done while the author was at the Department of Computer Science at Princeton University, supported by NSF Grant CCF-0832797 and by a Computing Innovation Fellowship. Email: rothblum@alum.mit.edu.
‡ Columbia University Department of Computer Science and Center for Computational Intractability, Department of Computer Science, Princeton University. Supported by NSF grants CCF-0832797, CNS-07-16245 and CCF-0915929. Email: rocco@cs.columbia.edu.

1 Introduction

This work considers privacy-preserving statistical analysis of sensitive data. In this setting, we wish to extract statistics from a database $D$ that contains information about $n$ individual participants. Each individual's data is a record in the data domain $U$. We focus here on the offline (or non-interactive) setting, in which the information to be extracted is specified by a set $Q$ of statistical queries. Each query $q \in Q$ is a function mapping the database to a query answer, where in this work we focus on real-valued queries with range $[0,1]$. Our goal is data release: extracting approximate answers to all the queries in the query set $Q$.
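To make this setting concrete, here is a minimal sketch (the toy records and the `mean_query` function are illustrative, not from the paper): a database is a list of $n$ records from $U = \{0,1\}^d$, and a statistical query maps the database to an answer in $[0,1]$.

```python
# A minimal sketch of the offline data-release setting: the database is a
# list of n records from the data domain U = {0,1}^d, and a statistical
# query maps the database to an answer in [0, 1]. Illustrative names only.

def mean_query(database, attribute):
    """Fraction of records with the given attribute set to 1 -- a [0,1]-valued query."""
    return sum(r[attribute] for r in database) / len(database)

# A toy database over U = {0,1}^3 with n = 4 participants.
database = [(1, 0, 1), (1, 1, 0), (0, 0, 1), (1, 0, 0)]
print(mean_query(database, 0))  # 0.75: three of four records have attribute 0 set
```

The data-release task is to answer many such queries approximately, while guaranteeing that no single record has much influence on the output.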
An important concern in this setting is protecting the privacy of individuals whose sensitive data (e.g. medical or financial records) are being analyzed. Differential privacy [DMNS06] provides a rigorous notion of privacy protection, guaranteeing that each individual only has a small effect on the data release algorithm's output. A growing body of work explores the possibility of extracting rich statistics in a differentially private manner. One line of research [BLR08, DNR+09, DRV10, RR10, HR10] has shown that differential privacy often permits surprisingly accurate statistics. These works put forward general algorithms and techniques for differentially private data analysis, but the algorithms have running time that is (at least) exponential in the dimensionality of the data domain. Thus, a central question in differentially private data analysis is to develop general techniques and algorithms that are efficient, i.e. with running time that is polynomial (or at least sub-exponential) in the data dimensionality. While some computational hardness results are known [DNR+09, UV11, GHRU11], they apply only to restricted classes of data release algorithms.

This Work. Our primary contribution is a computationally efficient new tool for privacy-preserving data release: a general reduction to the task of learning thresholds of sums of predicates. The class of predicates (for learning) in our reduction is derived directly from the class of queries (for data release). At a high level, we draw a connection between data release and learning as follows. In the data release setting, one can view the database as a function: it maps queries in $Q$ to answers in $[0,1]$. The data release goal is approximating this function on queries/examples in $Q$.
The challenge is doing so with only bounded access to the database/function; in particular, we only allow access that preserves differential privacy. For example, this often means that we only get a bounded number of oracle queries to the database function, with noisy answers. At this high level there is a striking similarity to learning theory, where a standard goal is to efficiently learn/approximate a function given limited access to it, e.g. a bounded number of labeled examples or oracle queries. Thus a natural approach to data release is learning the database function using a computational learning algorithm. While the approach is intuitively appealing at this high level, it faces immediate obstacles because of apparent incompatibilities between the requirements of learning algorithms and the type of "limited" access to data that is imposed by private data release. For example, in the data release setting a standard technique for ensuring differential privacy is adding noise, but many efficient learning algorithms fail badly when run on noisy data. As another example, for private data release the number of (noisy) database accesses is often very restricted: e.g. sub-linear, or at most quadratic, in the database size. In the learning setting, on the other hand, it is almost always the case that the number of examples or oracle queries required to learn a function is lower bounded by its description length (and is often a large polynomial in the description length). Our work explores the connection between learning and private data release.
We (i) give an efficient reduction that shows that, in fact, a general class of data release tasks can be reduced to related and natural computational learning tasks; and (ii) instantiate this general reduction using new and known learning algorithms to obtain new computationally efficient differentially private data release algorithms.

Before giving more details on our reduction in Section 1.1, we briefly discuss its context and some of the ways that we apply/instantiate it. While the search for efficient differentially private data release algorithms is relatively new, there are decades of work in learning theory aimed at developing techniques and algorithms for computationally efficient learning, going back to the early work of Valiant [Val84]. Given the high-level similarity between the two fields, leveraging the existing body of work and insights from learning theory for data release is a promising direction for future research; we view our reduction as a step in this direction. We note that our work is by no means the first to draw a connection between privacy-preserving data release and learning theory; as discussed in the "Related Work" section below, several prior works used learning techniques in the data release setting. A novelty in our work is that it gives an explicit and modular reduction from data release to natural learning problems. Conceptually, our reduction overcomes two main hurdles:

– bridging the gap between the noisy oracle access arising in private data release and the noise-free oracle access required by many learning algorithms (including the ones we use).

– avoiding any dependence on the database size in the complexity of the learning algorithm being used.

We use this reduction to construct new data release algorithms. In this work we explore two main applications of our reduction.
The first aims to answer boolean conjunction queries (also known as contingency tables or marginal queries), one of the most well-motivated and widely-studied classes of statistical queries in the differential privacy literature. Taking the data universe $U$ to be $\{0,1\}^d$, the $k$-way boolean conjunction corresponding to a subset $S$ of $k$ attributes in $[d]$ counts what fraction of items in the database have all the attributes in $S$ set to 1. Approximating the answers for $k$-way conjunctions (or all conjunctions) has been the focus of several past works (see, e.g. [BCD+07, KRSU10, UV11, GHRU11]). Applying our reduction with a new learning algorithm tailored for this class, we obtain a data release algorithm that, for databases of size $d^{O(\sqrt{k \log(k \log d)})}$, releases accurate answers to all $k$-way conjunctions simultaneously (we ignore for now the dependence of the database size on other parameters such as the error). The running time is $\mathrm{poly}(d^k)$. Previous algorithms either had running time $2^{\Omega(d)}$ (e.g. [DNR+09]) or required a database of size $d^{k/2}$ (adding independent noise [DMNS06]). We also obtain better bounds for the task of approximating the answers to a large fraction of all (i.e. $d$-way) conjunctions under arbitrary distributions. These results follow from algorithms for learning thresholds of sums of the relevant predicates; we base these algorithms on learning theory techniques for representing such functions as low-degree polynomial threshold functions, following works such as [KS04, KOS04]. We give an overview of these results in Section 1.2 below.

Our second application uses Fourier analysis of the database (viewed, again, as a real-valued function on the queries in $Q$).
We obtain new polynomial and quasi-polynomial data release algorithms for parity counting queries and low-depth (AC$^0$) counting queries, respectively. The learning algorithms we use for this are (respectively) Jackson's Harmonic Sieve algorithm [Jac97], and an algorithm for learning Majority-of-AC$^0$ circuits due to Jackson et al. [JKS02]. We elaborate on these results in Section 1.3 below.

1.1 Private Data Release Reduces to Learning Thresholds

In this section we give more details on the reduction from privacy-preserving data release to learning thresholds. The full details are in Sections 3 and 4. We begin with loose definitions of the data release and learning tasks we consider, and then proceed with (a simple case of) our reduction.

Counting Queries, Data Release and Learning Thresholds. We begin with preliminaries and an informal specification of the data release and learning tasks we consider in our reduction (see Sections 2 and 3.1 for full definitions). We refer to an element $u$ in data domain $U$ as an item. A database is a collection of $n$ items from $U$. A counting query is specified by a predicate $p : U \to \{0,1\}$, and the query $q_p$ on database $D$ outputs the fraction of items in $D$ that satisfy $p$, i.e. $\frac{1}{n}\sum_{i=1}^{n} p(D_i)$. A class of counting queries is specified by a set $Q$ of query descriptions and a predicate $P : Q \times U \to \{0,1\}$. For a query $q \in Q$, its corresponding predicate is $P(q, \cdot) : U \to \{0,1\}$. We will sometimes fix a data item $u \in U$ and consider the predicate $p_u(\cdot) \triangleq P(\cdot, u) : Q \to \{0,1\}$.

Fix a data domain $U$ and query class $Q$ (specified by a predicate $P$). A data release algorithm $A$ gets as input a database $D$, and outputs a synopsis $S : Q \to [0,1]$ that provides approximate answers to queries in $Q$.
We say that $A$ is an $(\alpha, \beta, \gamma)$ distribution-free data release algorithm for $(U, Q, P)$ if, for any distribution $G$ over the query set $Q$, with probability $1-\beta$ over the algorithm's coins, the synopsis $S$ satisfies that with probability $1-\gamma$ over $q \sim G$, the (additive) error of $S$ on $q$ is bounded by $\alpha$. Later we will also consider data release algorithms that only work for a specific distribution or class of distributions (in this case we will not call the algorithm distribution-free). Finally, we assume for now that the data release algorithm only accesses the distribution $G$ by sampling queries from it, but later we will also consider more general types of access (see below). A differentially private data release algorithm is one whose output distribution (on synopses) is differentially private as per Definition 2.1. See Definition 3.3 for full and formal details.

Fix a class $Q$ of examples and a set $F$ of predicates on $Q$. Let $F_{n,t}$ be the set of thresholded sums from $F$, i.e., the set of functions of the form $f = \mathbb{1}\{\frac{1}{n}\sum_{i=1}^{n} f_i > t\}$, where $f_i \in F$ for all $1 \le i \le n$. We refer to functions in $F_{n,t}$ as $n$-thresholds. An algorithm for learning thresholds gets access to a function in $F_{n,t}$ and outputs a hypothesis $h : Q \to \{0,1\}$ that labels examples in $Q$. We say that it is a $(\gamma, \beta)$ distribution-free learning algorithm for learning thresholds over $(Q, F)$ if, for any distribution $G$ over the set $Q$, with probability $1-\beta$ over the algorithm's coins the output hypothesis $h$ satisfies that with probability $1-\gamma$ over $q \sim G$, $h$ labels $q$ correctly. As above, later we will also consider learning algorithms that are not distribution-free, and only work for a specific distribution or class of distributions. For now, we assume that the learning algorithm only accesses the distribution $G$ by drawing examples from it.
These examples are labeled using the target function that the algorithm is trying to learn. See Definition 3.5 for full and formal details.

The Reduction. We can now describe (a simple case of) our reduction from differentially private data release to learning thresholds. For any data domain $U$, set $Q$ of query descriptions, and predicate $P : Q \times U \to \{0,1\}$, the reduction shows how to construct a (distribution-free) data release algorithm given a (distribution-free) algorithm for learning thresholds over $(Q, \{p_u : u \in U\})$, i.e., any algorithm for learning thresholds where $Q$ is the example set and the set $F$ of predicates (over $Q$) is obtained by the possible ways of fixing the $u$-input to $P$. The resulting data release algorithm is $(\alpha, \beta, \gamma)$-accurate as long as the database is not too small; the size bound depends on the desired accuracy parameters and on the learning algorithm's sample complexity. The efficiency of the learning algorithm is preserved (up to mild polynomial factors).

Theorem 1.1 (Reduction from Data Release to Learning Thresholds, Simplified). Let $U$ be a data universe, $Q$ a set of query descriptions, and $P : Q \times U \to \{0,1\}$ a predicate. There is an $\varepsilon$-differentially private $(\alpha, \beta, \gamma)$-accurate distribution-free data-release algorithm for $(U, Q, P)$, provided that:

1. there is a distribution-free learning algorithm $L$ that $(\gamma, \beta)$-learns thresholds over $(Q, \{p_u : u \in U\})$ using $b(n, \gamma, \beta)$ labeled examples and running time $t(n, \gamma, \beta)$ for learning $n$-thresholds.

2. $n \ge C \cdot \frac{b(n', \gamma', \beta') \cdot \log(1/\beta)}{\varepsilon \cdot \alpha \cdot \gamma}$, where $n' = \Theta(\log|Q|/\alpha^2)$, $\beta' = \Theta(\beta \cdot \alpha)$, $\gamma' = \Theta(\gamma \cdot \alpha)$, $C = \Theta(1)$.

Moreover, the data release algorithm only accesses the query distribution by sampling.
The number of samples taken is $O(b(n', \gamma', \beta') \cdot \log(1/\beta)/\gamma)$ and the running time is $\mathrm{poly}(t(n', \gamma', \beta'), n, 1/\alpha, \log(1/\beta), 1/\gamma)$.

Section 3.2 gives a formal (and more general) statement in Theorem 3.9. Section 3.3 gives a proof overview, and Section 4 gives the full proof. Note that, since the data release algorithm we obtain from this reduction is distribution-free (i.e. works for any distribution on the query set) and only accesses the query distribution by sampling, it can be boosted to yield accurate answers on all the queries [DRV10].

A More General Reduction. For clarity of exposition, we gave above a simplified form of the reduction. This assumed that the learning algorithm is distribution-free (i.e. works for any distribution over examples) and only requires sampling access to labeled examples. These strong assumptions enable us to get a distribution-free data release algorithm that only accesses the query distribution by sampling. We also give a reduction that applies even to distribution-specific learning algorithms that require (a certain kind of) oracle access to the function being learned. In addition to sampling labeled examples, the learning algorithm can: (i) estimate the distribution $G$ on any example $q$ by querying $q$ and receiving a (multiplicative) approximation to the probability $G[q]$; and (ii) query an oracle for the function $f$ being learned on any $q$ such that $G[q] \neq 0$. We refer to this as approximate distribution restricted oracle access, see Definition 3.6. Note that several natural learning algorithms in the literature use oracle queries in this way; in particular, we show that this is true for Jackson's Harmonic Sieve Algorithm [Jac97], see Section 6.
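Before moving on, the $n$-threshold functions $F_{n,t}$ at the heart of the reduction can be made concrete with a short sketch (all names here are illustrative, not the paper's notation or algorithm): with the predicates $p_u$ for the items $u$ of a database, the $n$-threshold at $t$ indicates whether more than a $t$-fraction of the database satisfies a given query.

```python
# A sketch of the n-threshold functions F_{n,t}: f(q) = 1 iff
# (1/n) * sum_i f_i(q) > t, where each f_i is a predicate over query
# descriptions. Illustrative names only.

def make_n_threshold(predicates, t):
    """Return the thresholded sum of the given predicates at threshold t."""
    n = len(predicates)
    def f(q):
        return int(sum(p(q) for p in predicates) / n > t)
    return f

# Taking f_i = p_u for the items u of a database, f indicates whether more
# than a t-fraction of the items satisfies query q. Toy instance with
# U = Q = {0,1}^2 and P(q, u) a monotone conjunction:
def p_u(u):
    return lambda q: int(all(u[i] == 1 for i in range(len(q)) if q[i] == 1))

db = [(1, 1), (1, 0), (0, 0)]
f = make_n_threshold([p_u(u) for u in db], t=0.5)
print(f((1, 0)))  # 1: two of three items (> 1/2) have the first attribute set
print(f((0, 1)))  # 0: only one of three items has the second attribute set
```

The reduction hands such a function, suitably perturbed for privacy, to the threshold-learning algorithm.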
Our general reduction gives a data release algorithm for a class $G_Q$ of distributions on the query set, provided we have a learning algorithm which can also use approximate distribution restricted oracle access, and which works for a slightly richer class of distributions $G_Q'$ (a smooth extension, see Definition 3.8). Again, several such algorithms (based on Fourier analysis) are known in the literature; our general reduction allows us to use them and obtain the new data release results outlined in Section 1.3.

Related Work: Privacy and Learning. Our new reduction adds to the fruitful and growing interaction between the fields of differentially private data release and learning theory. Prior works also explored this connection. In our work, we "import" learning theory techniques by drawing a correspondence between the database (in the data release setting), for which we want to approximate query answers, and the target function (in the learning setting) which labels examples. Several other works have used this correspondence (implicitly or explicitly), e.g. [DNR+09, DRV10, GHRU11]. A different view, in which queries in the data release setting correspond to concepts in learning theory, was used in [BLR08] and also in [GHRU11]. There is also work on differentially private learning algorithms, in which the goal is to give differentially private variants of various learning algorithms [BDMN05, KLN+08].

1.2 Applications (Part I): Releasing Conjunctions

We use the reduction of Theorem 1.1 to obtain new data release algorithms "automatically" from learning algorithms that satisfy the theorem's requirements. Here we describe the distribution-free data release algorithms we obtain for approximating conjunction counting queries.
These use learning algorithms (which are themselves distribution-free and require only random examples) based on polynomial threshold functions. Throughout this section we fix the query class under consideration to be conjunctions. We take $U = \{0,1\}^d$, and a (monotone) conjunction $q \in Q = \{0,1\}^d$ is satisfied by $u$ iff for all $i$ s.t. $q_i = 1$ it is also the case that $u_i = 1$. (Our monotone conjunction results extend easily to general non-monotone conjunctions with parameters unchanged.¹) Our first result is an algorithm for releasing $k$-way conjunctions:

Theorem 1.2 (Distribution-Free Data Release for $k$-way Conjunctions). There is an $\varepsilon$-differentially private $(\alpha, \beta, \gamma)$-accurate distribution-free data release algorithm, which accesses the query distribution only by sampling, for the class of $k$-way monotone Boolean conjunction queries. The algorithm has runtime $\mathrm{poly}(n)$ on databases of size $n$ provided that
$$n \ge d^{O\left(\sqrt{k \log\left(\frac{k \log d}{\alpha}\right)}\right)} \cdot \tilde{O}\left(\frac{\log^3(1/\beta)}{\varepsilon \alpha \gamma^2}\right).$$

¹ To see this, extend the data domain to be $\{0,1\}^{2d}$, and for each item in the original domain include also its negation. General conjunctions in the original data domain can now be treated as monotone conjunctions in the new data domain. Note that the locality of a conjunction is unchanged. Our results in this section are for arbitrary distributions over the set of monotone conjunctions (over the new domain), and so they will continue to apply to arbitrary distributions on general conjunctions over the original data domain.

Since this is a distribution-free data release algorithm that only accesses the query distribution by sampling, we can use the boosting results of [DRV10] and obtain a data release algorithm that generates (w.h.p.) a synopsis that is accurate for all queries.
This increases the running time to $d^k \cdot \mathrm{poly}(n)$ (because the boosting algorithm needs to enumerate over all the $k$-way conjunctions). The required bound on the database size increases slightly, but our big-Oh notation hides this small increase. The corollary is stated formally below:

Corollary 1.3 (Boosted Data Release for $k$-way Conjunctions). There is an $\varepsilon$-differentially private $(\alpha, \beta, \gamma = 0)$-accurate distribution-free data release algorithm for the class of $k$-way monotone Boolean conjunction queries with runtime $d^k \cdot \mathrm{poly}(n)$ on databases of size $n$, provided that
$$n \ge d^{O\left(\sqrt{k \log\left(\frac{k \log d}{\alpha}\right)}\right)} \cdot \tilde{O}\left(\frac{\log^3(1/\beta)}{\varepsilon \alpha}\right).$$

We also obtain a new data release algorithm for releasing the answers to all conjunctions:

Theorem 1.4 (Distribution-Free Data Release for All Conjunctions). There is an $\varepsilon$-differentially private $(\alpha, \beta, \gamma)$-accurate distribution-free data release algorithm, which accesses the query distribution only by sampling, for the class of all monotone Boolean conjunction queries. The algorithm has runtime $\mathrm{poly}(n)$ on databases of size $n$, provided that
$$n \ge d^{O\left(d^{1/3} \cdot \log^{2/3}\left(\frac{d}{\alpha}\right)\right)} \cdot \tilde{O}\left(\frac{\log^3(1/\beta)}{\varepsilon \alpha \gamma^2}\right).$$

Again, we can apply boosting to this result; this gives improvements over previous work for a certain range of parameters (roughly $k \in [d^{1/3}, d^{2/3}]$). We omit the details.

Related Work on Releasing Conjunctions. Several past works have considered differentially private data release for conjunctions and $k$-way conjunctions (also known as marginals and contingency tables). As a corollary of their more general Laplace and Gaussian mechanisms, the work of Dwork et al. [DMNS06] showed how to release all $k$-way conjunctions in running time $d^{O(k)}$ provided that the database size is at least $d^{O(k)}$. Barak et al.
[BCD+07] showed how to release consistent contingency tables with similar database size bounds. The running time, however, was increased to $\exp(d)$. We note that our data-release algorithms do not guarantee consistency. Gupta et al. gave distribution-specific data release algorithms for $k$-way and for all conjunctions. These algorithms work for the uniform distribution over ($k$-way or general) conjunctions. The database size bound and running time were (roughly) $d^{\tilde{O}(1/\alpha^2)}$. For distribution-specific data release on the uniform distribution, the dependence on $d$ in their work is better than our algorithms but the dependence on $\alpha$ is worse. Finally, we note that the general information-theoretic algorithms for differentially private data release also yield algorithms for the specific case of conjunctions. These algorithms are (significantly) more computationally expensive, but they have better database size bounds. For example, the algorithm of [HR10] has running time $\exp(d)$ but its database size bound is (roughly) $\tilde{O}(d/\alpha^2)$ (for the relaxed notion of $(\varepsilon, \delta)$-differential privacy).

In terms of negative results, Ullman and Vadhan [UV11] showed that, under mild cryptographic assumptions, no data release algorithm for conjunctions (even 2-way) can output a synthetic database in running time less than $\exp(d)$ (this holds even for distribution-specific data release on the uniform distribution). Our results side-step this negative result because the algorithms do not release a synthetic database. Kasiviswanathan et al. [KRSU10] showed a lower bound of $\tilde{\Omega}\left(\min\left\{d^{k/2}/\alpha,\, 1/\alpha^2\right\}\right)$ on the database size needed for releasing $k$-way conjunctions. To see that this is consistent with our bounds, note that our bound on $n$ is always larger than $f(\alpha) = 2^{\sqrt{k \log(1/\alpha)}}/\alpha$. We have $f(\alpha) < 1/\alpha^2$ only if $k < \log(1/\alpha)$.
But in the range where $k < \log(1/\alpha)$ our theorem needs $n$ to be larger than $d^k/\alpha$, which is consistent with the lower bound.

1.3 Applications (Part II): Fourier-Based Approach

We also use Theorem 1.1 (in its more general formulation given in Section 3.2) to obtain new data release algorithms for answering parity counting queries (in polynomial time) and general AC$^0$ counting queries (in quasi-polynomial time). For both of these we fix the data universe to be $U = \{0,1\}^d$, and take the set of query descriptions to also be $Q = \{0,1\}^d$ (with different semantics for queries in the two cases). Both algorithms are distribution-specific, working for the uniform distribution over query descriptions,² and both instantiate the reduction with learning algorithms that use Fourier analysis of the target function. Thus the full data release algorithms use Fourier analysis of the database (viewed as a function on queries).

Parity Counting Queries. Here we consider counting queries that, for a fixed $q \in \{0,1\}^d$, output how many items in the database have inner product 1 with $q$ (inner products are taken over GF[2]). I.e., we use the parity predicate $P(q, u) = \sum_i q_i \cdot u_i \pmod 2$. We obtain a polynomial-time data release algorithm for this class (w.r.t. the uniform distribution over queries). This uses our reduction, instantiated with Jackson's Harmonic Sieve learning algorithm [Jac97]. In Section 6 we prove:

Theorem 1.5 (Uniform Distribution Data Release for Parity Counting Queries). There is an $\varepsilon$-differentially private algorithm for releasing the class of parity queries over the uniform distribution on $Q$. For databases of size $n$, the algorithm has runtime $\mathrm{poly}(n)$ and is $(\alpha, \beta, \gamma)$-accurate, provided that
$$n \ge \frac{\mathrm{poly}(d, 1/\alpha, 1/\gamma, \log(1/\beta))}{\varepsilon}.$$

AC$^0$ Counting Queries.
We also consider a quite general class of counting queries, namely, any query family whose predicate is computed by a constant-depth (AC$^0$) circuit. For any family of this type, in Section 6 we obtain a data release algorithm over the uniform distribution that requires a database of quasi-polynomial (in $d$) size (and has running time polynomial in the database size, or quasi-polynomial in $d$).

Theorem 1.6 (Uniform Distribution Data Release for AC$^0$ Counting Queries). Take $U = Q = \{0,1\}^d$, and $P(q, u) : Q \times U \to \{0,1\}$ a predicate computed by a Boolean circuit of depth $\ell = O(1)$ and size $\mathrm{poly}(d)$. There is an $\varepsilon$-differentially private data release algorithm for this query class over the uniform distribution on $Q$. For databases of size $n$, the algorithm has runtime $\mathrm{poly}(n)$ and is $(\alpha, \beta, \gamma)$-accurate, provided that:
$$n \ge d^{O\left(\log^\ell\left(\frac{d}{\alpha\gamma}\right)\right)} \cdot \tilde{O}\left(\frac{\log^3(1/\beta)}{\varepsilon \alpha^2 \gamma}\right).$$

This result uses our reduction instantiated with an algorithm of Jackson et al. [JKS02] for learning Majority-of-AC$^0$ circuits. To the best of our knowledge, this is the first positive result for private data release that uses the (circuit) structure of the query class in a "non black-box" way to approximate the query answer. We note that the class of AC$^0$ predicates is quite rich. For example, it includes conjunctions, approximate counting [Ajt83], and GF[2] polynomials with polylog($d$) many terms. While our result is specific to the uniform distribution over $Q$, we note that some query sets (and query descriptions) may be amenable to random self-reducibility, where an algorithm providing accurate answers to uniformly random $q \in Q$ can be used to get (w.h.p.) accurate answers to any $q \in Q$. We also note that Theorem 1.6 leaves a large degree of freedom in how a class of counting queries is to be represented.
Many different sets of query descriptions $Q$ and predicates $P(q, u)$ can correspond to the same set of counting queries over the same $U$, and it may well be the case that some representations are more amenable to computations in AC$^0$ and/or random self-reducibility. Finally, we note that the hardness results of Dwork et al. [DNR+09] actually considered (and ruled out) efficient data-release algorithms for AC$^0$ counting queries (even for the uniform distribution case), but only when the algorithm's output is a synthetic database. Theorem 1.6 side-steps these negative results because the output is not a synthetic database.

² More generally, we can get results for smooth distributions; we defer these to the full version.

2 Preliminaries

Data sets and differential privacy. We consider a data universe $U$, where throughout this work we take $U = \{0,1\}^d$. We typically refer to an element $u \in U$ as an item. A data set (or database) $D$ of size $n$ over the universe $U$ is an ordered multiset consisting of $n$ items from $U$. We will sometimes think of $D$ as a tuple in $U^n$. We use the notation $|D|$ to denote the size of $D$ (here, $n$). Two data sets $D, D'$ are called adjacent if they are both of size $n$ and they agree in at least $n-1$ items (i.e., their edit distance is at most 1). We will be interested in randomized algorithms that map data sets into some abstract range $R$ and satisfy the notion of differential privacy.

Definition 2.1 (Differential Privacy [DMNS06]). A randomized algorithm $M$ mapping data sets over $U$ to outcomes in $R$ satisfies $(\varepsilon, \delta)$-differential privacy if for all $S \subset R$ and every pair of two adjacent databases $D, D'$, we have $\Pr(M(D) \in S) \le e^\varepsilon \Pr(M(D') \in S) + \delta$. If $\delta = 0$, we say the algorithm satisfies $\varepsilon$-differential privacy.

Counting queries.
A class of counting queries is specified by a predicate $P : Q \times U \to \{0,1\}$, where $Q$ is a set of query descriptions. Each $q \in Q$ specifies a query, and the answer of a query $q \in Q$ on a single data item $u \in U$ is given by $P(q,u)$. The answer of a counting query $q \in Q$ on a data set $D$ is defined as $\frac{1}{n}\sum_{u \in D} P(q,u)$. We will often fix a data item $u$ and database $D \in U^n$ of $n$ data items, and use the following notation:

– $p_u : Q \to \{0,1\}$, $p_u(q) \stackrel{\mathrm{def}}{=} P(q,u)$. The predicate on a fixed data item $u$.
– $f^D : Q \to [0,1]$, $f^D(q) \stackrel{\mathrm{def}}{=} \frac{1}{n}\sum_{u \in D} P(q,u)$. For an input query description and fixed database, counts the fraction of database items that satisfy that query.
– $f^D_t : Q \to \{0,1\}$, $f^D_t(q) \stackrel{\mathrm{def}}{=} \mathbb{1}\{f^D(q) \ge t\}$. For an input query description, fixed database, and threshold $t \in [0,1]$, indicates whether the fraction of database items that satisfy that query is at least $t$. Here and in the following, $\mathbb{1}$ denotes the 0/1-indicator function.

We close this section with some concrete examples of query classes that we will consider. Fix $U = \{0,1\}^d$ and $Q = \{0,1\}^d$. The query class of monotone boolean conjunctions is defined by the predicate $P(q,u) = \bigwedge_{i : q_i = 1} u_i$. Note that we may equivalently write $P(q,u) = 1 - \bigvee_{i : u_i = 0} q_i$. The query class of parities over $\{0,1\}^d$ is defined by the predicate $P(q,u) = \sum_{i : u_i = 1} q_i \pmod 2$.

3 Private Data Release via Learning Thresholds

In this section we describe our reduction from private data release to a related computational learning task of learning thresholded sums. Section 3.1 sets the stage, first introducing definitions for handling distributions and access to an oracle, and then proceeds with notation and formal definitions of (non-interactive) data release and of learning threshold functions.
Section 3.2 formally states our main theorem giving the reduction, and Section 3.3 gives an intuitive overview of the proof. The formal proof is then given in Section 4.

3.1 Distribution access, data release, learning thresholds

Definition 3.1 (Sampling or Evaluation Access to a Distribution). Let $G$ be a distribution over a set $Q$. When we give an algorithm $A$ sampling access to $G$, we mean that $A$ is allowed to sample items distributed by $G$. When we give an algorithm $A$ evaluation access to $G$, we mean that $A$ is both allowed to sample items distributed by $G$ and also to make oracle queries: in such a query $A$ specifies any $q \in Q$ and receives back the probability $G[q] \in [0,1]$ of $q$ under $G$. For both types of access we will often measure $A$'s sample complexity or number of queries (for the case of evaluation access).³

Definition 3.2 (Sampling Access to Labeled Examples). Let $G$ be a distribution over a set $Q$ of potential examples, and let $f$ be a function whose domain is $Q$. When we give an algorithm $A$ sampling access to labeled examples by $(G, f)$, we mean that $A$ has sampling access to the distribution $(q, f(q))_{q \sim G}$.

Definition 3.3 (Data Release Algorithm). Fix $U$ to be a data universe, $Q$ to be a set of query descriptions, $\mathcal{G}_Q$ to be a set of distributions on $Q$, and $P(q,u) : Q \times U \to \{0,1\}$ to be a predicate. A $(U, Q, \mathcal{G}_Q, P)$ data release algorithm $A$ is a (probabilistic) algorithm that gets sampling access to a distribution $G \in \mathcal{G}_Q$ and takes as input accuracy parameters $\alpha, \beta, \gamma > 0$, a database size $n$, and a database $D \in U^n$. $A$ outputs a synopsis $S : Q \to [0,1]$. We say that $A$ is $(\alpha, \beta, \gamma)$-accurate for databases of size $n$ if for every database $D \in U^n$ and query distribution $G \in \mathcal{G}_Q$:
$$ \Pr_{S \leftarrow A(n, D, \alpha, \beta, \gamma)}\left[ \Pr_{q \sim G}\left[ \left| S(q) - f^D(q) \right| > \alpha \right] > \gamma \right] < \beta \quad (1) $$
We also consider data release algorithms that get evaluation access to $G$.
In this case, we say that $A$ is a data release algorithm using evaluation access. The definition is unchanged, except that $A$ gets this additional form of access to $G$. When $P$ and $U$ are understood from the context, we sometimes refer to a $(U, Q, \mathcal{G}_Q, P)$ data release algorithm as an algorithm for releasing the class of queries $Q$ over $\mathcal{G}_Q$. This work focuses on differentially private data release algorithms, i.e., data release algorithms which are $\varepsilon$-differentially private as per Definition 2.1 (note that such algorithms must be randomized). In such data release algorithms, the probability of any output synopsis $S$ differs by at most an $e^{\varepsilon}$ multiplicative factor between any two adjacent databases.

We note two cases of particular interest. The first is when $\mathcal{G}_Q$ is the set of all distributions over $Q$. In this case, we say that $A$ is a distribution-free data release algorithm. For such algorithms it is possible to apply the "boosting for queries" results of [DRV10] and obtain a data release algorithm whose synopsis is (w.h.p.) accurate on all queries (i.e., with $\gamma = 0$). We note that those boosting results apply only to data release algorithms that access their distribution by sampling (i.e., they need not hold for data release algorithms that use evaluation access). A second case of interest is when $\mathcal{G}_Q$ contains only a single distribution, the uniform distribution over all queries $Q$. In this case both sampling and evaluation access are easy to simulate.

Remark 3.4. Throughout this work, we fix the accuracy parameter $\alpha$ and lower-bound the required database size $n$ needed to ensure the (additive) approximation error is at most $\alpha$. An alternative approach, taken in some of the differential privacy literature, is fixing the database size $n$ and upper-bounding the approximation error $\alpha$ as a function of $n$ (and of the other parameters).
Our database size bounds can be converted to error bounds in the natural way.

³ Note that, generally speaking, sampling and evaluation access are incomparably powerful (see [KMR+94, Nao96]). In this work, however, whenever we give an algorithm evaluation access we will also give it sampling access.

Definition 3.5 (Learning Thresholds). Let $Q$ be a set (which we now view as a domain of potential unlabeled examples) and let $\mathcal{G}_Q$ be a set of distributions on $Q$. Let $F$ be a set of predicates on $Q$, i.e., functions $Q \to \{0,1\}$. Given $t \in [0,1]$, let $F_{n,t}$ be the set of all threshold functions of the form $f = \mathbb{1}\left\{ \frac{1}{n}\sum_{i=1}^{n} f_i \ge t \right\}$ where $f_i \in F$ for all $1 \le i \le n$. We refer to functions in $F_{n,t}$ as $n$-thresholds over $F$. Let $L$ be a (probabilistic) algorithm that gets sampling access to labeled examples by a distribution $G \in \mathcal{G}_Q$ and a target function $f \in F_{n,t}$. $L$ takes as input accuracy parameters $\gamma, \beta > 0$, an integer $n > 0$, and a threshold $t \in [0,1]$. $L$ outputs a boolean hypothesis $h : Q \to \{0,1\}$. We say that $L$ is a $(\gamma, \beta)$-learning algorithm for thresholds over $(Q, \mathcal{G}_Q, F)$ if for every $\gamma, \beta > 0$, every $n$, every $t \in [0,1]$, every $f \in F_{n,t}$ and every $G \in \mathcal{G}_Q$, we have
$$ \Pr_{h \leftarrow L(n, t, \gamma, \beta)}\left[ \Pr_{q \sim G}\left[ h(q) \ne f(q) \right] > \gamma \right] < \beta. \quad (2) $$
The definition is analogous for all other notions of oracle access (see, e.g., Definition 3.6 below).

3.2 Statement of the main theorem

In this section we formally state our main theorem, which establishes a general reduction from private data release to learning certain threshold functions. The next definition captures a notion of oracle access for learning algorithms which arises in the reduction. The definition combines sampling access to labeled examples with a limited kind of evaluation access to the underlying distribution and black-box oracle access to the target function $f$.
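Before turning to that definition, the $n$-thresholds of Definition 3.5 can be illustrated with a toy Python sketch (ours; the domain and predicate family below are hypothetical stand-ins, chosen only to make the indicator-of-an-average structure visible):

```python
def make_n_threshold(preds, t):
    """An n-threshold over F (Definition 3.5): f(q) = 1 iff (1/n) * sum_i f_i(q) >= t."""
    n = len(preds)
    return lambda q: int(sum(f_i(q) for f_i in preds) / n >= t)

# Toy domain Q = {0,...,9}; F contains the indicator predicates "q >= a".
F = [lambda q, a=a: int(q >= a) for a in (2, 4, 6)]
f = make_n_threshold(F, t=0.5)  # fires iff at least half the predicates fire

assert f(9) == 1  # all three predicates fire: 3/3 >= 0.5
assert f(5) == 1  # "q >= 2" and "q >= 4" fire: 2/3 >= 0.5
assert f(0) == 0  # no predicate fires: 0/3 < 0.5
```

In the data release setting the predicates $f_i$ will be the item predicates $p_u$ for $u \in D$, so the average above is exactly $f^D(q)$.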
Definition 3.6 (Approximate Distribution-Restricted Oracle Access). Let $G$ be a distribution over a domain $Q$, and let $f$ be a function whose domain is $Q$. When we say that an algorithm $A$ has approximate $G$-restricted evaluation access to $f$, we mean that:

1. $A$ has sampling access to labeled examples by $(G, f)$; and
2. $A$ can make oracle queries on any $q \in Q$, which are answered as follows: there is a fixed constant $c \in [1/3, 3]$ such that (i) if $G[q] = 0$ the answer is $(0, \bot)$; and (ii) if $G[q] > 0$ the answer is a pair $(c \cdot G[q], f(q))$.

Remark 3.7. We remark that this is the type of oracle access provided to the learning algorithm in our reduction. This is different from the oracle access that the data release algorithm has. We could extend Definition 3.3 to refer to approximate evaluation access to $G$; all our results on data release using evaluation access would extend to this weaker access (under appropriate approximation guarantees). For simplicity, we focus on the case where the data release algorithm has perfectly accurate evaluation access, since this is sufficient throughout for our purpose.

One might initially hope that privately releasing a class of queries $Q$ over some set of distributions $\mathcal{G}_Q$ reduces to learning corresponding threshold functions over the same set of distributions. However, our reduction will need a learning algorithm that works for a potentially larger set of distributions $\mathcal{G}_Q' \supseteq \mathcal{G}_Q$. (We will see in Theorem 3.9 that this poses a stronger requirement on the learning algorithm.) Specifically, $\mathcal{G}_Q'$ will be a smooth extension of $\mathcal{G}_Q$, as defined next.

Definition 3.8 (Smooth Extensions). Given a distribution $G$ over a set $Q$ and a value $\mu \ge 1$, the $\mu$-smooth extension of $G$ is the set of all distributions $G'$ such that $G'[q] \le \mu \cdot G[q]$ for all $q \in Q$.
Given a set of distributions $\mathcal{G}_Q$ and $\mu \ge 1$, the $\mu$-smooth extension of $\mathcal{G}_Q$, denoted $\mathcal{G}_Q'$, is defined as the set of all distributions that are in the $\mu$-smooth extension of some $G \in \mathcal{G}_Q$.

With these two definitions at hand, we can state our reduction in its most general form. We will combine this general reduction with specific learning results to obtain concrete new data release algorithms in Sections 5 and 6.

Theorem 3.9 (Main Result: Private Data Release via Learning Thresholds). Let $U$ be a data universe, $Q$ a set of query descriptions, $\mathcal{G}_Q$ a set of distributions over $Q$, and $P : Q \times U \to \{0,1\}$ a predicate. Then there is an $\varepsilon$-differentially private $(\alpha, \beta, \gamma)$-accurate data-release algorithm for databases of size $n$ provided that:

– there is an algorithm $L$ that $(\gamma, \beta)$-learns thresholds over $(Q, \mathcal{G}_Q', \{p_u : u \in U\})$, running in time $t(n, \gamma, \beta)$ and using $b(n, \gamma, \beta)$ queries to an approximate distribution-restricted evaluation oracle for the target $n$-threshold function, where $\mathcal{G}_Q'$ is the $(2/\gamma)$-smooth extension of $\mathcal{G}_Q$; and
– we have
$$ n \ge C \cdot b(n', \gamma', \beta') \cdot \log\!\left(\frac{b(n', \gamma', \beta')}{\alpha\gamma\beta}\right) \cdot \frac{\log(1/\beta')}{\varepsilon \alpha^2 \gamma}, \quad (3) $$
where $n' = \Theta(\log|Q|/\alpha^2)$, $\beta' = \Theta(\beta\alpha)$, $\gamma' = \Theta(\gamma\alpha)$, and $C > 0$ is a sufficiently large constant.

The running time of the data release algorithm is $\mathrm{poly}(t(n', \gamma', \beta'), n, 1/\alpha, \log(1/\beta), 1/\gamma)$.

The next remark points out two simple modifications of this theorem.

Remark 3.10. 1. We can improve the dependence on $n$ in (3) by a factor of $\Theta(1/\alpha)$ in the case where the learning algorithm $L$ only uses sampling access to labeled examples. In this case the data release algorithm also uses only sampling access to the query distribution $G$. The precise statement is given in Theorem 4.10, which we present after the proof of Theorem 3.9.

2.
A similar theorem holds for $(\varepsilon, \delta)$-differential privacy, where the requirement on $n$ in (3) is improved to a requirement on $\sqrt{n}$, up to a $\log(1/\delta)$ factor. The proof is the same, except for a different (but standard) privacy argument, e.g., using the Composition Theorem in [DRV10].

3.3 Informal proof overview

Our goal in the data release setting is approximating the query answers $\{f^D(q)\}_{q \in Q}$. This is exactly the task of approximating or learning a sum of $n$ predicates from the set $F = \{p_u : u \in U\}$. Indeed, each item $u$ in the database specifies a predicate $p_u$, and for a fixed query $q \in Q$ we are trying to approximate the sum of the predicates $f^D(q) = \frac{1}{|D|}\sum_{u \in D} p_u(q)$. We want to approximate such a sum in a privacy-preserving manner, and so we will only permit limited access to the function $f^D$ that we try to approximate. In particular, we will only allow a bounded number of noisy oracle queries to this function. Using standard techniques (i.e., adding appropriately scaled Laplace noise [DMNS06]), an approximation obtained from a bounded number of noisy oracle queries will be differentially private. It remains, then, to tackle the task of (i) learning a sum of $n$ predicates from $F$ using an oracle to the sum, and (ii) doing so using only a bounded (smaller than $n$) number of oracle queries when we are provided noisy answers.

From Sums to Thresholds. Ignoring privacy concerns, it is straightforward to reduce the task of learning a sum $f^D$ of predicates (given an oracle for $f^D$) to the task of learning thresholded sums of predicates (again given an oracle for $f^D$). Indeed, set $k = \lceil 3/\alpha \rceil$ and consider the thresholds $t_1, \ldots, t_k$ given by $t_i = i/(k+1)$. Now, given an oracle for $f^D$, it is easy to simulate an oracle for $f^D_{t_i}$ for any $t_i$.
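This simulation is mechanical: an oracle for the sum immediately yields an oracle for each thresholded sum. A hedged Python sketch (ours; the toy sum oracle stands in for $f^D$):

```python
import math

def threshold_oracles_from_sum(sum_oracle, alpha):
    """Given an oracle q -> f^D(q) in [0,1], simulate oracles for the thresholded
    sums f^D_{t_i}, where t_i = i/(k+1) and k = ceil(3/alpha)."""
    k = math.ceil(3 / alpha)
    ts = [i / (k + 1) for i in range(1, k + 1)]
    oracles = [lambda q, t=t: int(sum_oracle(q) >= t) for t in ts]
    return ts, oracles

# Toy sum oracle on Q = {0,...,10}: f^D(q) = q/10.
ts, oracles = threshold_oracles_from_sum(lambda q: q / 10, alpha=0.5)
assert len(ts) == 6          # k = ceil(3/0.5) = 6 thresholds t_1,...,t_6
assert oracles[0](10) == 1   # f^D(10) = 1.0 >= t_1 = 1/7
assert oracles[-1](0) == 0   # f^D(0) = 0.0 <  t_6 = 6/7
```

The privacy-preserving version of this idea, developed below, replaces the exact sum oracle with a noisy threshold oracle.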
Thus, we can learn each of the threshold functions $f^D_{t_i}$ to accuracy $1 - \gamma/k$ with respect to $G$. Call the resulting hypotheses $h_1, \ldots, h_k$. Each $h_i$ labels a $(1 - \gamma/k)$-fraction of the queries/examples in $Q$ correctly w.r.t. the threshold function $f^D_{t_i}$. We can produce an aggregated hypothesis $h$ for approximating $f^D$ as follows: given a query/example $q$, let $h(q)$ equal $t_i$, where $i$ is the smallest index such that $h_i(q) = 0$ and $h_{i+1}(q) = 1$. For random $q \sim G$, we will then have $|h(q) - f^D(q)| \le \alpha/3$ with probability $1 - \gamma$ (over the choice of $q$). Thus, we have reduced learning a sum to learning thresholded sums (where in both cases the learning is done with an oracle for the sum). But because of privacy considerations, we must address the challenges mentioned above: (i) learning a thresholded sum of $n$ predicates using few (less than $n$) oracle queries to the sum, and (ii) learning when the oracle for the sum can return noisy answers. In particular, the noisy sum answers can induce errors on threshold oracle queries (when the sum is close to the threshold).

Restricting to Large Margins. Let us say that a query/example $q \in Q$ has low margin with respect to $f^D$ and $t_i$ if $|f^D(q) - t_i| \le \alpha/7$. A useful observation is that in the argument sketched above, we do not need to approximate each threshold function $f^D_{t_i}$ well on low-margin elements $q$. Indeed, suppose that each hypothesis $h_i$ errs arbitrarily on a set $E_i \subseteq Q$ that contains only inputs that have low margin w.r.t. $f^D$ and $t_i$, but achieves high accuracy $1 - \gamma/k$ with respect to $G$ conditioned on the event $Q \setminus E_i$. Then the above aggregated hypothesis $h$ would still have high accuracy with high probability over $q \sim G$; more precisely, $h$ would satisfy $|h(q) - f^D(q)| \le 2\alpha/3$ with probability $1 - \gamma$ for $q \sim G$.
The reason is that for every $q \in Q$, there can be only one threshold $i^* \in \{1, \ldots, k\}$ such that $|f^D(q) - t_{i^*}| \le \alpha/7$ (since any two thresholds are $\alpha/3$-apart from each other). While the threshold hypothesis $h_{i^*}$ might err on $q$ (because $q$ has low margin w.r.t. $t_{i^*}$), the hypotheses $h_{i^*-1}$ and $h_{i^*+1}$ should still be accurate (w.h.p. over $q \sim G$), and thus the aggregated hypothesis $h$ will still output a value between $t_{i^*-1}$ and $t_{i^*+1}$.

Threshold Access to the Data Set. We will use the above observation to our advantage. Specifically, we restrict all access to the function $f^D$ to what we call a threshold oracle. Roughly speaking, the threshold oracle (which we denote $TO$ and define formally in Section 4.1) works as follows: when given a query $q$ and a threshold $t$, it draws a suitably scaled Laplacian variable $N$ (used to ensure differential privacy) and returns $1$ if $f^D(q) + N \ge t + \alpha/20$; returns $0$ if $f^D(q) + N \le t - \alpha/20$; and returns "$\bot$" if $t - \alpha/20 < f^D(q) + N < t + \alpha/20$. If $D$ is large enough then we can ensure that $|N| \le \alpha/40$ with high probability, and thus whenever the oracle outputs $\bot$ on a query $q$ we know that $q$ has low margin with respect to $f^D$ and $t$ (since $\alpha/20 + |N| < \alpha/7$). We will run the learning algorithm $L$ on examples generated using the oracle $TO$, after removing all examples for which the oracle returned $\bot$. Since we are conditioning on the $TO$ oracle not returning $\bot$, this transforms the distribution $G$ into a conditional distribution which we denote $G'$. Since we have only conditioned on removing low-margin $q$'s, the argument sketched above applies. That is, a hypothesis that has high accuracy with respect to this conditional distribution $G'$ is still useful for us.
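The three-way behavior just described can be mocked up in a few lines of Python. This is our simplified illustration only: the exact noise scale $b/(\varepsilon n)$ and the query bookkeeping of Figure 1 in Section 4.1 are omitted, and the Laplace draw is generated by the standard inverse-CDF method.

```python
import math
import random

def make_threshold_oracle(f_D, alpha, scale, seed=0):
    """Simplified threshold oracle TO: on (q, t), return 1 if f_D(q) + N >= t + alpha/20,
    0 if f_D(q) + N <= t - alpha/20, and None (standing in for "bot") otherwise.
    One Laplace draw N_q is made per query description q and then reused."""
    rng = random.Random(seed)
    noise = {}

    def TO(q, t):
        if q not in noise:
            u = rng.random() - 0.5  # Laplace(scale) sample via inverse CDF
            noise[q] = -scale * math.copysign(1.0, u) * math.log(1 - 2 * abs(u))
        a = f_D(q) + noise[q]
        if a >= t + alpha / 20:
            return 1
        if a <= t - alpha / 20:
            return 0
        return None

    return TO

# With the noise turned off, the three-way behavior is easy to see.
TO = make_threshold_oracle(lambda q: 0.5, alpha=0.2, scale=0.0)
assert TO("q1", 0.1) == 1      # well above the threshold
assert TO("q1", 0.9) == 0      # well below the threshold
assert TO("q1", 0.5) is None   # within alpha/20 of the threshold: rejected
```

With a positive scale, repeated queries on the same $q$ reuse a single noise draw, mirroring the $A_q$ bookkeeping of Figure 1.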
So the threshold oracle lets us use noisy sum answers (allowing the addition of noise, and differential privacy), but in fact it also addresses the second challenge of reducing the query complexity of the learning algorithm. This is described next.

Savings in Query Complexity via Subsampling. The remaining challenge is that the threshold oracle can be invoked only (at most) $n$ times before we exceed our "privacy budget". This is problematic, because the query complexity of the underlying learning algorithm may well depend on $n$, since $f^D$ is a sum of $n$ predicates. To reduce the number of oracle queries that need to be made, we observe that the sum of $n$ predicates that we are trying to learn can actually be approximated by a sum of fewer predicates. In fact, there exists a sum $f^{D'}$ of $n' = O(\log|Q|/\alpha^2)$ predicates from $F$ that is $\alpha/100$-close to $f^D$ on all inputs in $Q$, i.e., $|f^D(q) - f^{D'}(q)| \le \alpha/100$ for all $q \in Q$. (The proof is by a subsampling argument, as in [BLR08]; see Section 4.1.) We will aim to learn this "smaller" sum. The hope is that the query complexity for learning $f^{D'}$ may be considerably smaller, namely scaling with $n'$ rather than $n$. Notice, however, that learning a threshold of $f^{D'}$ requires a threshold oracle to $f^{D'}$, rather than the threshold oracle we have, which is to $f^D$.

Our goal, then, is to use the threshold oracle to $f^D$ to simulate a threshold oracle to $f^{D'}$. This will give us "the best of both worlds": we can make (roughly) $O(n)$ oracle queries, thus preserving differential privacy, while using a learning algorithm that is allowed to have query complexity superlinear in $n'$. The key observation showing that this is indeed possible is that the threshold oracle $TO$ already "avoids" low-margin queries, where $f^D_t$ and $f^{D'}_t$ might disagree! Whenever the threshold oracle $TO$ (w.r.t.
$D$) answers $l \ne \bot$ on a query $q$, we must have $|f^D(q) - t| \ge \alpha/20 - N \ge \alpha/100$, and thus $f^D_t(q) = f^{D'}_t(q)$. Moreover, it is still the case that $TO$ only answers $\bot$ on queries $q$ that have low margins w.r.t. $f^{D'}_t$. This means that, as above, we can run $L$ using $TO$ (w.r.t. $D$) in order to learn $f^{D'}$. The query complexity depends on $n'$ and is therefore independent of $n$. At the same time, we continue to answer all queries using the threshold oracle with respect to $f^D$, so that our privacy budget remains on the order of $|D| = n$. Denoting the query complexity of the learning algorithm by $b(n')$, we only need that $n \gg b(n')$. This allows us to use learning algorithms that have $b(n') \gg n'$, as is usually the case.

Sampling from the Conditional Distribution. In the exposition above we glossed over one technical detail, which is that the learning algorithm requires sampling (or distribution-restricted) access to the distribution $G'$ over queries $q$ on which $TO$ does not return $\bot$, whereas the data release algorithm we are trying to build only has access to the original distribution $G$. We reconcile this disparity as follows. For a threshold $t$, let $\zeta_t$ denote the probability that the oracle $TO$ does not return $\bot$ when given a random $q \sim G$ and the threshold $t$. There are two cases, depending on $\zeta_t$:

$\zeta_t < \gamma$: This means that the threshold $t$ is such that with probability at least $1 - \gamma$ a random sample $q \sim G$ has low margin with respect to $f^D$ and $t$. In this case, by simply outputting the constant-$t$ function as our approximation for $f^D$, we get a hypothesis that has accuracy $\alpha/3$ with probability $1 - \gamma$ over random $q \sim G$.

$\zeta_t \ge \gamma$: In this case, the conditional distribution $G'$ induced by the threshold oracle is $(1/\gamma)$-smooth w.r.t. $G$.
In particular, $G'$ is contained in the smooth extension $\mathcal{G}_Q'$ for which the learning algorithm is guaranteed to work (by the conditions of Theorem 3.9). This means that we can sample from $G'$ using rejection sampling from $G$. It suffices to oversample by a factor of $O(1/\gamma)$ to make sure that we receive enough examples that are not rejected by the threshold oracle. Finally, using a reasonably accurate estimate of $\zeta_t$, we can also implement the distribution-restricted approximate oracle access that may be required by the learning algorithm. We omit the details from this informal overview.

4 Proof of Theorem 3.9

In this section, we give a formal proof of Theorem 3.9. We formalize and analyze the threshold oracle first. Then we proceed to our main reduction.

4.1 Threshold access and subsampling

We begin by describing the threshold oracle that we use to access the function $f^D$ throughout our reduction; it is presented in Figure 1. The oracle has two purposes. One is to ensure differential privacy by adding noise every time we access $f^D$. The other purpose is to "filter out" queries that are too close to the given threshold. This will enable us to argue that the threshold oracle for $f^D_t$ agrees with the function $f^{D'}_t$, where $D'$ is a small subsample of $D$. Throughout the remainder of this section we fix all input parameters to our oracle, i.e., the data set $D$ and the values $b, \alpha > 0$. We let $\beta > 0$ denote the desired error probability of our algorithm.

Input: data set $D$ of size $n$, tolerance $\alpha > 0$, query bound $b \in \mathbb{N}$.

Threshold Oracle $TO(D, \alpha, b)$:
– When invoked on the $j$-th query $(q, t) \in Q \times [0,1)$, do the following:
– If $j > b$, output $\bot$ and terminate.
– If $(q, t')$ has not been asked before for any threshold $t'$, sample a fresh Laplacian variable $N_q \sim \mathrm{Lap}(b/(\varepsilon n))$ and put $A_q = f^D(q) + N_q$. Otherwise reuse the previously created value $A_q$.
– Output $0$ if $A_q \le t - 2\alpha/3$; output $1$ if $A_q \ge t + 2\alpha/3$; output $\bot$ otherwise.

Figure 1: Threshold oracle for $f^D$. This threshold oracle is the only way in which the data release algorithm ever interacts with the data set $D$. Its purpose is to ensure privacy and to reject queries that are too close to a given threshold.

Lemma 4.1. Call two queries $(q, t)$, $(q', t')$ distinct if $q \ne q'$. Then the threshold oracle $TO(D, \alpha, b)$ answers any sequence of $b$ distinct adaptive queries to $f^D$ with $\varepsilon$-differential privacy.

Proof. This follows directly from the guarantees of the Laplacian mechanism as shown in [DMNS06]. □

Our goal is to use the threshold oracle for $f^D_t$ to correctly answer queries to the function $f^{D'}_t$, where $D'$ is a smaller (sub-sampled) database that gives "close" answers to $D$ on all queries $q \in Q$. The next lemma shows that there always exists such a smaller database.

Lemma 4.2. For any $\alpha > 0$, there is a database $D'$ of size
$$ |D'| \le \frac{10 \log|Q|}{\alpha^2} \quad (4) $$
such that $\max_{q \in Q} \left| f^D(q) - f^{D'}(q) \right| < \alpha$.

Proof. The existence of $D'$ follows from a subsampling argument as shown in [BLR08]. □

The next lemma states the two main properties of the threshold oracle that we need. To state them more succinctly, let us denote by $Q(t, \alpha) = \{q \in Q : |f^D(q) - t| \ge \alpha\}$ the set of elements in $Q$ that are $\alpha$-far from the threshold $t$.

Lemma 4.3 (Agreement). Suppose $D$ satisfies
$$ |D| \ge \frac{30 b \cdot \log(b/\beta)}{\varepsilon \alpha}. \quad (5) $$
Then there is a data set $D'$ of size $|D'| \le 90 \cdot \alpha^{-2} \log|Q|$ and an event $\Gamma$ (only depending on the choice of the Laplacian variables) such that $\Gamma$ has probability $1 - \beta$, and if $\Gamma$ occurs, then $TO(D, \alpha, b)$ has the following guarantee: whenever $TO(D, \alpha, b)$ outputs $l$ on one of the queries $(q, t)$ in the sequence, then

1. if $l \ne \bot$ then $l = f^{D'}_t(q) = f^D_t(q)$, and
2. if $l = \bot$ then $q \notin Q(t, \alpha)$.

Proof.
Let $D'$ be the data set given by Lemma 4.2 with its "$\alpha$" value set to $\alpha/3$, so that $\left| f^D(q) - f^{D'}(q) \right| < \alpha/3$ for every input $q \in Q$. The event $\Gamma$ is defined as the event that every Laplacian variable $N_q$ sampled by the oracle has magnitude $|N_q| < \alpha/3$. Under the given assumption on $|D|$ in (5), and using basic tail bounds for the Laplacian distribution, this happens with probability $1 - \beta$. Assuming $\Gamma$ occurs, the following two statements hold:

1. Whenever the oracle outputs $l \ne \bot$ on a query $(q, t)$, we must have either $f^D(q) + N_q - t \ge 2\alpha/3$ (and thus both $f^D(q) \ge t + \alpha/3$ and $f^{D'}(q) \ge t$) or else $f^D(q) + N_q - t \le -2\alpha/3$ (and thus both $f^D(q) < t - \alpha/3$ and $f^{D'}(q) < t$). This proves the first claim of the lemma.
2. Whenever $q \in Q(t, \alpha)$, then $|f^D(q) + N_q - t| \ge 2\alpha/3$, and therefore the oracle does not output $\bot$. This proves the second claim of the lemma. □

4.2 Privacy-preserving reduction

In this section we describe how to convert a non-private learning algorithm for threshold functions of the form $f^D_t$ into a privacy-preserving learning algorithm for functions of the form $f^D$. The reduction is presented in Figure 2. We call the algorithm PrivLearn.

Setting of parameters. In the description of PrivLearn we use the following setting of parameters:
$$ n' = \frac{4410 \cdot \log|Q|}{\alpha^2}, \qquad k = \left\lceil \frac{3}{\alpha} \right\rceil, \qquad \gamma' = \frac{\gamma}{k}, \qquad \beta' = \frac{\beta}{6k}, \quad (6) $$
$$ b_{\mathrm{base}} = b(n', \gamma', \beta'), \qquad b_{\mathrm{iter}} = \frac{100\, b_{\mathrm{base}} \cdot \log(1/\beta')}{\gamma}, \qquad b_{\mathrm{total}} = 2k \cdot b_{\mathrm{iter}}. \quad (7) $$

Analysis of the reduction. Throughout the analysis of the algorithm we keep all input parameters fixed so as to satisfy the assumptions of Theorem 3.9. Specifically, we will need
$$ |D| \ge \frac{210 \cdot b_{\mathrm{total}} \cdot \log(10\, b_{\mathrm{total}}/\beta)}{\varepsilon \alpha}. \quad (8) $$
We have made no attempt to optimize various constants throughout.

Lemma 4.4 (Privacy). Algorithm PrivLearn satisfies $\varepsilon$-differential privacy.

Proof.
In each iteration of the loop in Step 3, the algorithm makes at most $2 b_{\mathrm{iter}}$ queries to $TO$ (there are $b_{\mathrm{iter}}$ calls made on the samples and at most $b_{\mathrm{base}} \le b_{\mathrm{iter}}$ evaluation queries). But note that $TO$ is instantiated with a query bound of $b_{\mathrm{total}} = 2k\, b_{\mathrm{iter}}$. Hence, it follows from Lemma 4.1 that $TO$ satisfies $\varepsilon$-differential privacy. Since $TO$ is the only way in which PrivLearn ever interacts with the data set, PrivLearn satisfies $\varepsilon$-differential privacy. □

We now prove that the hypothesis produced by the algorithm is indeed accurate, as formalized by the following lemma.

Input: Distribution $G \in \mathcal{G}_Q$, data set $D$ of size $n$, accuracy parameters $\alpha, \beta, \gamma > 0$; learning algorithm $L$ for thresholds over $(Q, \mathcal{G}_Q, F)$ as in Theorem 3.9, requiring $b(n', \gamma', \beta')$ labeled examples and approximate restricted evaluation access to the target function.
Parameters: See (6) and (7).

Algorithm PrivLearn for privately learning $f^D$:
1. Let $TO$ denote an instantiation of $TO(D, \alpha/7, b_{\mathrm{total}})$.
2. Sample $b_{\mathrm{iter}}$ points $\{q_j\}_{1 \le j \le b_{\mathrm{iter}}}$ from $G$.
3. For each iteration $i \in \{1, \ldots, k\}$:
(a) Let $t_i = i/(k+1)$.
(b) For each $q_j$, $j \in [b_{\mathrm{iter}}]$, send the query $(q_j, t_i)$ to $TO$ and let $l_j$ denote the answer. Let $B_i = \{j : l_j \ne \bot\}$.
(c) If $\frac{|B_i|}{b_{\mathrm{iter}}} < \frac{\gamma}{2}$, output the constant-$t_i$ function as hypothesis $h$ and terminate the algorithm.
(d) Run the learning algorithm $L(n', t_i, \gamma', \beta')$ on the labeled examples $\{(q_j, l_j)\}_{j \in B_i}$, answering evaluation queries from $L$ as follows:
– Given a query $q$ posed by $L$, let $l$ be the answer of $TO$ on $(q, t_i)$.
– If $l = \bot$, then output $(0, \bot)$. Otherwise, output $\left( G[q] \cdot \frac{b_{\mathrm{iter}}}{|B_i|},\, l \right)$.
(e) Let $h_i$ denote the resulting hypothesis.
4. Having obtained hypotheses $h_1, \ldots$
, $h_k$, the final hypothesis $h$ is defined as follows: $h(q)$ equals the smallest $i \in [k]$ such that $h_i(q) = 1$ and $h_{i-1}(q) = 0$ (we take $h_0(q) = 0$ and $h_{k+1}(q) = 1$).

Figure 2: Reduction from private data release to learning thresholds (non-privately).

Lemma 4.5 (Accuracy). With overall probability $1 - \beta$, the hypothesis $h$ returned by PrivLearn satisfies
$$ \Pr_{q \sim G}\left[ \left| h(q) - f^D(q) \right| \le \alpha \right] \ge 1 - \gamma. \quad (9) $$

Proof. We consider three possible cases:

1. The first case is that there exists a $t \in \{t_1, \ldots, t_k\}$ such that the distribution $G$ has at least $1 - \gamma/10$ of its mass on points that are $\alpha$-close to $t$. In this case a Chernoff bound and the choice of $b_{\mathrm{iter}} \gg b_{\mathrm{base}}$ imply that with probability $1 - \beta$ the algorithm terminates prematurely and the resulting hypothesis satisfies (9).

2. In the second case, there exists a $t \in \{t_1, \ldots, t_k\}$ such that the probability mass $G$ puts on points that are $\alpha$-close to $t$ is between $1 - \gamma$ and $1 - \gamma/10$. In this case, if the algorithm terminates prematurely then (9) is satisfied; below we analyze what happens assuming the algorithm does not terminate prematurely.

3. In the third case, every $t \in \{t_1, \ldots, t_k\}$ is such that $G$ puts less than $1 - \gamma$ of its mass on points $\alpha$-close to $t$. In this third case, if the algorithm terminates prematurely then (9) will not hold; however, our choice of $b_{\mathrm{iter}}$ implies that in this third case the algorithm terminates prematurely with probability at most $\beta$. As in the second case, below we will analyze what happens assuming the algorithm does not terminate prematurely.

Thus in the remainder of the argument we may assume without loss of generality that the algorithm does not terminate prematurely, i.e., it produces a full sequence of hypotheses $h_1, \ldots, h_k$.
Furthermore, we can assume that the distribution $G$ places at most a $1 - \gamma/10$ fraction of its weight near any particular threshold $t_i$. This leads to the following claim, showing that in all iterations the number of labeled examples in $B_i$ is large enough to run the learning algorithm.

Claim 4.6. $\Pr\left[ \forall i : |B_i| \ge b_{\mathrm{base}} \right] \ge 1 - \frac{\beta}{3}$.

Proof. By our assumption, the probability that a sample $q \sim G$ is rejected at step $i$ of PrivLearn is at most $1 - \gamma/10$. By the choice of $b_{\mathrm{iter}}$ it follows that $|B_i| \ge b_{\mathrm{base}}$ with probability $1 - \beta/k$. Taking a union bound over all thresholds completes the proof. □

The proof strategy from here on is to first analyze the algorithm on the conditional distribution that is induced by the threshold oracle. We will then pass from this conditional distribution to the actual distribution that we are interested in, namely $G$. We chose $|D|$ large enough so that we can apply Lemma 4.3 to $TO$ with the "$\alpha$"-setting of Lemma 4.3 set to $\alpha/7$. Let $D'$ be the data set and $\Gamma$ the event given in the conclusion of Lemma 4.3 applied to $TO$. (Note that $n' = |D'| \le 7^2 \cdot 90\, \alpha^{-2} \log|Q|$, as stated above.) By the choice of our parameters, we have
$$ \Pr\{\Gamma\} \ge 1 - \frac{\beta}{3}. \quad (10) $$
Here the probability is computed only over the internal randomness of the threshold oracle $TO$, which we denote by $R$. Fix the randomness $R$ of $TO$ such that $R \in \Gamma$. For the sake of analysis, we can think of the randomness of the oracle as a collection of independent random variables $(N_q)_{q \in Q}$ (where $N_q$ is used to answer all queries of the form $(q, t')$). In particular, the behavior of the oracle would not change if we were to sample all variables $(N_q)_{q \in Q}$ up front. When we fix $R$ we thus mean that we fix $N_q$ for all $q \in Q$. We may therefore assume for the remainder of the analysis that $TO$ satisfies properties (1) and (2) of Lemma 4.3.
Let us denote by Q_i ⊆ Q the set of examples for which TO would not answer ⊥ in Step 3 at the i-th iteration of the algorithm. Note that this is a well-defined set since we fixed the randomness of the oracle. Denote by G_i the distribution G conditioned on Q_i. Further, let Z_i = Pr_{q∼G}{q ∈ Q_i}. Observe that

$$G_i[q] = \begin{cases} G[q]/Z_i & q \in Q_i \\ 0 & \text{otherwise.} \end{cases} \qquad (11)$$

The next lemma shows that PrivLearn answers evaluation queries with the desired multiplicative precision.

Lemma 4.7. With probability 1 − β/6k (over the randomness of PrivLearn), we have

$$\frac{Z_i}{3} \;\le\; \frac{|B_i|}{b_{\mathrm{iter}}} \;\le\; 3 Z_i. \qquad (12)$$

Proof. The lemma follows from a Chernoff bound together with the fact that we chose b_iter ≫ b_base. □

Assuming that (12) holds, we can argue that the learning algorithm in step i produces a "good" hypothesis, as expressed in the next lemma.

Lemma 4.8. Let t_i ∈ {t_1, …, t_k}. Conditioned on (12), we have that with probability 1 − β/6k (over the internal randomness of the learning algorithm invoked at step i) the hypothesis h_i satisfies

$$\Pr_{q\sim G_i}\bigl\{\, h_i(q) = f^D_{t_i}(q) \,\bigr\} \;\ge\; 1 - \frac{\gamma}{k}.$$

Proof. This follows directly from the guarantee of the learning algorithm L once we argue that (with the claimed probability):

1. Each example q is sampled from G_i and labeled correctly by f^{D′}_{t_i}(q), and f^{D′}_{t_i}(q) = f^D_{t_i}(q).
2. All evaluation queries asked by the learning algorithm are answered with the multiplicative error allowed in Definition 3.6.
3. The algorithm received sufficiently many, i.e., b_base, labeled examples.

The first claim follows from the definition of G_i, since we can sample from G_i by sampling from G and rejecting if the oracle TO returns ⊥.
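The rejection-sampling view underlying the first claim can be sketched as follows (a sketch of ours; `sample_G` and `oracle_i` are hypothetical stand-ins for sampling access to G and for the fixed-randomness threshold oracle at iteration i):

```python
import random

def sample_from_Gi(sample_G, oracle_i, max_tries=10_000):
    """Rejection sampling from the conditional distribution G_i:
    draw q ~ G and keep it only if the threshold oracle does not answer
    'bottom' (returned here as None) at iteration i.  The accepted sample
    is distributed as G_i, and comes with its label for free.
    """
    for _ in range(max_tries):
        q = sample_G()
        label = oracle_i(q)
        if label is not None:
            return q, label
    # if Z_i (the acceptance probability) is very small, sampling can fail
    raise RuntimeError("rejection sampling failed; Z_i may be too small")
```

The expected number of draws per accepted sample is 1/Z_i, which is why the analysis needs Z_i to be bounded away from zero.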
Since Γ is assumed to hold, we can invoke property (1) of Lemma 4.3 to conclude that whenever the oracle does not return ⊥, its answer agrees with f^{D′}_{t_i}(q), and moreover f^{D′}_{t_i}(q) = f^D_{t_i}(q). To see the second claim, consider an evaluation query q. We consider two cases. The first case is where the threshold oracle returns ⊥ and PrivLearn outputs (0, ⊥). Note that in this case G_i indeed puts 0 weight on the query q. In the second case PrivLearn outputs (G[q] · b_iter/|B_i|, l). By (11) and since we assumed Γ holds, the output satisfies the desired multiplicative bound. The third claim is a direct consequence of Claim 4.6. □

We conclude from the above that with probability 1 − β/3 (over the combined randomness of PrivLearn and of the learning algorithm), simultaneously for all i ∈ [k] we have

$$\Pr_{q\sim G}\Bigl[\, h_i(q) \ne f^D_{t_i}(q) \,\Bigm|\, Q_i \,\Bigr] \;=\; \Pr_{q\sim G_i}\Bigl\{\, h_i(q) \ne f^D_{t_i}(q) \,\Bigr\} \;\le\; \frac{\gamma}{k}. \qquad (13)$$

This follows from a union bound over all k applications of Lemma 4.7 and Lemma 4.8.

We can now complete the proof of Lemma 4.5. That is, we will show that, assuming (13), the hypothesis h satisfies Pr_{q∼G}{|h(q) − f_D(q)| ≤ α} ≥ 1 − γ. Note that

1. (13) occurs with probability 1 − β/3;
2. our assumption on the threshold oracle, i.e., R ∈ Γ, also occurs with probability 1 − β/3 (over the randomness of the oracle);
3. the event in Claim 4.6 holds with probability 1 − β/3.

Hence all three events occur simultaneously with probability 1 − β, which is what we claimed. We proceed by assuming that all three events occurred.

In the following, let Err_i = {q ∈ Q : h_i(q) ≠ f^D_{t_i}(q)} denote the set of points on which h_i errs. We will need the following claim.

Claim 4.9. Let q ∈ Q. Then

$$|h(q) - f_D(q)| > \alpha \;\Longrightarrow\; q \in \bigcup_{i\in[k]} \mathrm{Err}_i \cap Q_i.$$

Proof.
Arguing in the contrapositive, suppose q ∉ ⋃_{i∈[k]} Err_i ∩ Q_i. This means that for all i ∈ [k] we have that either q ∉ Err_i or q ∉ Q_i. However, we claim that there can be at most one i ∈ [k] such that q ∉ Q_i, meaning that q is rejected at step i. This follows from property (2) of Lemma 4.3, which asserts that if q ∉ Q_i then we must have |f_D(q) − t_i| < α/7, together with the fact that any two thresholds differ by at least α/3. Hence, under the assumption above, it must be the case that q ∉ Err_i for all but at most one i ∈ [k]. This means that all but one hypothesis h_i correctly classify q. Since the thresholds are spaced α/3 apart, this means the hypothesis h has error at most 2α/3 ≤ α on q. □

With the previous claim, we can finish the proof. Indeed,

$$\begin{aligned}
\Pr_{q\sim G}\bigl\{\,|h(q)-f_D(q)| > \alpha\,\bigr\}
&\le \Pr_{q\sim G}\Bigl\{\,\bigcup_{i\in[k]} \mathrm{Err}_i \cap Q_i\,\Bigr\} && \text{(using Claim 4.9)}\\
&\le \sum_{i=1}^{k} \Pr_{q\sim G}\{\,\mathrm{Err}_i \cap Q_i\,\} && \text{(union bound)}\\
&= \sum_{i=1}^{k} \Pr_{q\sim G}\{\,q \in \mathrm{Err}_i \mid Q_i\,\}\,\Pr_{q\sim G}\{\,Q_i\,\}\\
&\le \sum_{i=1}^{k} \Pr_{q\sim G}\{\,q \in \mathrm{Err}_i \mid Q_i\,\}\\
&\le k \cdot \frac{\gamma}{k} && \text{(using (13))}\\
&= \gamma.
\end{aligned}$$

This concludes the proof of Lemma 4.5. □

Lemma 4.4 (Privacy) together with Lemma 4.5 (Accuracy) conclude the proof of our main theorem, Theorem 3.9.

4.3 Quantitative Improvements without Membership Queries

Here we show how to shave off a factor of 1/α in the requirement on the data set size n in Theorem 3.9. This is possible if the learning algorithm uses only sampling access to labeled examples.

Theorem 4.10. Let U be a data universe, Q a set of query descriptions, GQ a set of distributions over Q, and P : Q × U → {0, 1} a predicate.
Then there is an ε-differentially private (α, β, γ)-accurate data-release algorithm, provided that there is an algorithm L that (γ, β)-learns thresholds over (Q, GQ′, {p_u : u ∈ U}) using b(n, γ, β) random examples, and we have

$$n \;\ge\; C \cdot b(n',\gamma',\beta') \cdot \log\!\Bigl(\frac{b(n',\gamma',\beta')}{\alpha\gamma\beta}\Bigr) \cdot \frac{\log(1/\beta')}{\varepsilon\alpha\gamma},$$

where n′ = Θ(log|Q|/α²), β′ = Θ(βα), γ′ = Θ(γα), and C > 0 is a sufficiently large constant. If L runs in time t(n, γ, β), then the data release algorithm runs in time poly(t(n′, γ′, β′), n, 1/α, log(1/β), 1/γ).

Proof. The proof of this theorem is identical to that of Theorem 3.9, except that we set b_total = 2b_iter rather than 2k·b_iter. It is easy to check that the algorithm indeed makes only b_total distinct queries (in the sense of Lemma 4.1) to the threshold oracle, so that privacy remains ensured. The correctness argument is identical. □

5 First Application: Data Release for Conjunctions

With Theorems 3.9 and 4.10 in hand, we can obtain new data release algorithms "automatically" from learning algorithms that satisfy the properties required by the theorems. In this section we present such data release algorithms for conjunction counting queries, using learning algorithms (which require only random examples and work under any distribution) based on polynomial threshold functions. Throughout this section we fix the query class under consideration to be monotone conjunctions, i.e., we take U = Q = {0, 1}^d and P(q, u) = 1 − ⋁_{i : u_i = 0} q_i. The learning results given later in this section, together with Theorem 4.10, immediately yield:

Theorem 5.1 (Releasing conjunction counting queries).

1.
There is an ε-differentially private algorithm for releasing the class of monotone Boolean conjunction queries over GQ = {all probability distributions over Q} which is (α, β, γ)-accurate and has runtime poly(n) for databases of size n, provided that

$$n \;\ge\; d^{O\left(d^{1/3}\log^{2/3}(d/\alpha)\right)} \cdot \tilde{O}\!\left(\frac{\log^3(1/\beta)}{\varepsilon\alpha\gamma^2}\right).$$

2. There is an ε-differentially private algorithm for releasing the class of monotone Boolean conjunction queries over GQ_k = {all probability distributions over Q supported on B_k = {q ∈ Q : q_1 + ⋯ + q_d ≤ k}} which is (α, β, γ)-accurate and has runtime poly(n) for databases of size n, provided that

$$n \;\ge\; d^{O\left(\sqrt{k \log\left(\frac{k \log d}{\alpha}\right)}\right)} \cdot \tilde{O}\!\left(\frac{\log^3(1/\beta)}{\varepsilon\alpha\gamma^2}\right).$$

These algorithms are distribution-free, and so we can apply the boosting machinery of [DRV10] to get accurate answers to all of the k-way conjunctions with similar database size bounds. See the discussion and Corollary 1.3 in the introduction.

In Section 5.1 we establish structural results showing that certain types of thresholded real-valued functions can be expressed as low-degree polynomial threshold functions. In Section 5.2 we state some learning results (for learning under arbitrary distributions) that follow from these representational results. Theorem 5.1 above follows immediately from combining the learning results of Section 5.2 with Theorem 4.10.

5.1 Polynomial threshold function representations

Definition 5.2. Let X ⊆ Q = {0, 1}^d and let f be a Boolean function f : X → {0, 1}. We say that f has a polynomial threshold function (PTF) of degree a over X if there is a real polynomial A(q_1, …, q_d) of degree a such that f(q) = sign(A(q)) for all q ∈ X, where the sign function is sign(z) = 1 if z ≥ 0 and sign(z) = 0 if z < 0. Note that the polynomial A may be assumed without loss of generality to be multilinear, since X is a subset of {0, 1}^d.
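Before the structural results, it may help to see the objects of this section concretely. Below is a sketch (ours; the function names are not from the paper) of the monotone conjunction predicate p_u and the counting query f_D it induces:

```python
def p_u(q, u):
    """Monotone conjunction predicate P(q, u) = 1 - OR_{i: u_i = 0} q_i.

    Equivalently: the query q selects a set of coordinates, and the
    predicate is 1 iff the data item u has a 1 in every selected
    coordinate (the conjunction AND_{i: q_i = 1} u_i).
    """
    return 1 - max((q[i] for i in range(len(q)) if u[i] == 0), default=0)

def counting_query(q, D):
    """Counting query f_D(q): the fraction of data items in D satisfying p_u."""
    return sum(p_u(q, u) for u in D) / len(D)
```

The thresholded functions f_t^D studied below are simply the indicators of whether this fraction reaches a threshold t.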
5.1.1 Low-degree PTFs over sparse inputs

Let B_k ⊂ {0, 1}^d denote the collection of all points with Hamming weight at most k, i.e., B_k = {q ∈ {0, 1}^d : q_1 + ⋯ + q_d ≤ k}. The main result of this subsection is a proof that for any t ∈ [0, 1] the function f_t^D has a low-degree polynomial threshold function over B_k.

Lemma 5.3. Fix t ∈ [0, 1]. For any database D of size n, the function f_t^D has a polynomial threshold function of degree O(√(k log n)) over the domain B_k.

To prove Lemma 5.3 we will use the following claim:

Claim 5.4. Fix a positive integer k > 0 and ε > 0. There is a univariate polynomial s of degree O(√(k log(1/ε))) such that

1. s(k) = 1; and
2. |s(j)| ≤ ε for all integers 0 ≤ j ≤ k − 1.

Proof. This claim was proved by Buhrman et al. [BCdWZ99], who gave a quantum algorithm that implies the existence of the claimed polynomial (see also Section 1.2 of [She09]). Here we give a self-contained construction of a polynomial s with the claimed properties that satisfies the slightly weaker degree bound deg(s) = O(√k · log(1/ε)). We use the univariate Chebyshev polynomial C_r of degree r = ⌈√k⌉. Consider the polynomial

$$s(j) = \left( \frac{C_r\!\left(\frac{j}{k}\left(1 + \frac{1}{k}\right)\right)}{C_r\!\left(1 + \frac{1}{k}\right)} \right)^{\lceil \log(1/\varepsilon) \rceil}. \qquad (14)$$

It is clear that if j = k then s(j) = 1, as desired, so suppose that j is an integer with 0 ≤ j ≤ k − 1. This implies that (j/k)(1 + 1/k) < 1. Now well-known properties of the Chebyshev polynomial (see e.g. [Che66]) imply that |C_r((j/k)(1 + 1/k))| ≤ 1 and C_r(1 + 1/k) ≥ 2. This gives the O(√k · log(1/ε)) degree bound. □

Recall that the predicate function for a data item u ∈ {0, 1}^d is denoted by p_u(q) = 1 − ⋁_{i : u_i = 0} q_i. As an easy corollary of Claim 5.4 we get:

Corollary 5.5. Fix ε > 0.
For every u ∈ {0, 1}^d, there is a d-variable polynomial A_u of degree O(√(k log(1/ε))) such that for every q ∈ B_k,

1. If p_u(q) = 1 then A_u(q) = 1;
2. If p_u(q) = 0 then |A_u(q)| ≤ ε.

Proof. Consider the linear function L(q) = k − Σ_{i : u_i = 0} q_i. For q ∈ B_k we have that L(q) is an integer in {0, …, k}, and L(q) = k if and only if p_u(q) = 1. The desired polynomial is A_u(q) = s(L(q)). □

Proof of Lemma 5.3. Consider the polynomial A(q) = Σ_{u ∈ D} A_u(q), where for each data item u, A_u is the polynomial from Corollary 5.5 with its "ε" parameter set to ε = 1/(3n). We will show that A(q) − (⌈tn⌉ − 1/2) is the desired polynomial giving a PTF for f_t^D over B_k. First, consider any fixed q ∈ B_k for which f_t^D(q) = 1. Such a q must satisfy f_D(q) = j/n ≥ t for some integer j, and hence j ≥ ⌈tn⌉. Corollary 5.5 now gives that A(q) ≥ ⌈tn⌉ − 1/3. Next, consider any fixed q ∈ B_k for which f_t^D(q) = 0. Such a q must satisfy f_D(q) = j/n < t for some integer j, and hence j ≤ ⌈tn⌉ − 1. Corollary 5.5 now gives that A(q) ≤ ⌈tn⌉ − 2/3. This proves the lemma. □

5.1.2 Low-degree PTFs over the entire hypercube

Taking k = d in the previous subsection, the results there imply that f_t^D can be represented by a polynomial threshold function of degree O(√(d log n)) over the entire Boolean hypercube {0, 1}^d. In this section we improve the degree to O(d^{1/3}(log n)^{2/3}). This result is very similar to Theorem 8 of [KOS04] (which is closely based on the main construction and result of [KS04]), but with a few differences: first, we use Claim 5.4 to obtain slightly improved bounds. Second, we need to use the following notion in place of the notion of the "size of a conjunction" that was used in the earlier results:

Definition 5.6.
The width of a data item u ∈ D is defined as the number of coordinates i such that u_i = 0. The width of D is defined as the maximum width of any data item u ∈ D.

We use the following lemma:

Lemma 5.7. Fix any t ∈ [0, 1] and suppose that the n-element database D has width w. Then f_t^D has a polynomial threshold function of degree O(√(w log n)) over the domain {0, 1}^d.

Proof. The proof follows the constructions and arguments of the previous subsection, but with "w" in place of "k" throughout (in particular, the linear function L(q) is now defined to be L(q) = w − Σ_{i : u_i = 0} q_i). □

Lemma 5.8. Fix any value r ∈ {1, …, d}. The function f_t^D(q_1, …, q_d) can be expressed as a decision tree T in which

1. each internal node of the tree contains a variable q_i;
2. each leaf of T contains a function of the form f_t^{D′} where D′ ⊆ D has width at most r;
3. the tree T has rank at most (2d/r) ln n + 1.

Proof sketch. The result follows directly from the proof of Lemma 10 in [KS04], except that we use the notion of width from Definition 5.6 in place of the notion of the size of a conjunction that is used in [KS04]. To see that this works, observe that since p_u(q) = 1 − ⋁_{i : u_i = 0} q_i, fixing q_i = 1 will fix all predicates p_u with u_i = 0 to be zero. Thus the analysis of [KS04] goes through unchanged, replacing "terms of f that have size at least r" with "data items in D that have width at least r" throughout. □

Lemma 5.9. The function f_t^D can be represented as a polynomial threshold function of degree O(d^{1/3}(log n)^{2/3}).

Proof. The proof is nearly identical to the proof of Theorem 2 in [KS04], but with a few small changes. We take r in Lemma 5.8 to be d^{2/3}(log n)^{1/3} and now apply Lemma 5.7 to each width-r database D′ at a leaf of the resulting decision tree.
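The constructions above (the polynomial s of Claim 5.4 and the PTF of Lemma 5.3) are easy to check numerically on a toy database. A sketch of ours follows; note that we read the exponent in (14) as a base-2 logarithm, which is what the bound C_r(1 + 1/k) ≥ 2 calls for:

```python
import math
from itertools import combinations

def cheb(r, x):
    # Chebyshev polynomial C_r(x) via C_0 = 1, C_1 = x, C_{n+1} = 2x*C_n - C_{n-1}
    a, b = 1.0, x
    for _ in range(r - 1):
        a, b = b, 2 * x * b - a
    return a if r == 0 else b

def s(j, k, eps):
    # the polynomial of (14); with log base 2, C_r(1 + 1/k) >= 2 gives
    # |s(j)| <= (1/2)^{ceil(log2(1/eps))} <= eps for integers 0 <= j <= k-1
    r = math.ceil(math.sqrt(k))
    base = cheb(r, (j / k) * (1 + 1 / k)) / cheb(r, 1 + 1 / k)
    return base ** math.ceil(math.log2(1 / eps))

def ptf_value(q, D, t, k):
    # A(q) - (ceil(t*n) - 1/2), where A(q) = sum_u s(L_u(q)) with eps = 1/(3n)
    # and L_u(q) = k - sum_{i: u_i = 0} q_i, as in the proof of Lemma 5.3
    n = len(D)
    A = sum(s(k - sum(q[i] for i in range(len(q)) if u[i] == 0), k, 1 / (3 * n))
            for u in D)
    return A - (math.ceil(t * n) - 0.5)

def f_t(q, D, t):
    # the thresholded counting query: f_t^D(q) = 1 iff f_D(q) >= t
    fD = sum(all(u[i] for i in range(len(q)) if q[i]) for u in D) / len(D)
    return 1 if fD >= t else 0

# check sign agreement on every query of Hamming weight <= k
d, k, t = 6, 2, 0.5
D = [(1, 1, 1, 0, 0, 0), (1, 1, 0, 1, 0, 0), (1, 0, 1, 1, 1, 1)]
for w in range(k + 1):
    for S in combinations(range(d), w):
        q = tuple(1 if i in S else 0 for i in range(d))
        assert (ptf_value(q, D, t, k) >= 0) == (f_t(q, D, t) == 1)
```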
Arguing precisely as in Theorem 2 of [KS04], we get that f_t^D has a polynomial threshold function of degree

$$\max\Bigl\{ \frac{2d}{r}\ln n + 1,\; O\bigl(\sqrt{r \log n}\bigr) \Bigr\} \;=\; O\bigl(\sqrt{r \log n}\bigr) \;=\; O\bigl(d^{1/3}(\log n)^{2/3}\bigr). \qquad \Box$$

5.2 Learning thresholds of conjunction queries under arbitrary distributions

It is well known that, using learning algorithms based on polynomial-time linear programming, having low-degree PTFs for a class of functions implies efficient PAC learning algorithms for that class under any distribution using random examples only (see e.g. [KS04, HS07]). Thus the representational results of Section 5.1 immediately give learning results for the class of threshold functions over sums of data items. We state these learning results using the terminology of our reduction below.

Theorem 5.10. Let

– U denote the data universe {0, 1}^d;
– Q denote the set of query descriptions {0, 1}^d;
– P(q, u) = 1 − ⋁_{i : u_i = 0} q_i denote the monotone conjunction predicate;
– GQ denote the set of all probability distributions over Q; and
– GQ_k denote the set of all probability distributions over Q that are supported on B_k = {q ∈ {0, 1}^d : q_1 + ⋯ + q_d ≤ k}.

Then

1. (Learning thresholds of conjunction queries over all inputs) There is an algorithm L that (γ, β)-learns thresholds over (Q, GQ, {p_u : u ∈ U}) using b(n, γ, β) = d^{O(d^{1/3}(log n)^{2/3})} · Õ(1/γ) · log(1/β) queries to an approximate distribution-restricted evaluation oracle for the target n-threshold function (in fact, L only uses sampling access to labeled examples). The running time of L is poly(b(n, γ, β)).

2.
(Learning thresholds of conjunction queries over sparse inputs) There is an algorithm L that (γ, β)-learns thresholds over (Q, GQ_k, {p_u : u ∈ U}) using b(n, γ, β) = d^{O((k log n)^{1/2})} · Õ(1/γ) · log(1/β) queries to an approximate distribution-restricted evaluation oracle for the target n-threshold function (in fact, L only uses sampling access to labeled examples). The running time of L is poly(b(n, γ, β)).

Recall from the discussion at the beginning of Section 5 that these learning results, together with our reduction, give the private data release results stated at the beginning of the section.

6 Second Application: Data Release via Fourier-Based Learning

In this section we present data release algorithms for parity counting queries and AC⁰ counting queries that instantiate our reduction (Theorem 3.9) with Fourier-based algorithms from the computational learning theory literature. We stress that these algorithms require the more general reduction of Theorem 3.9, rather than the simpler version of Theorem 1.1, because the underlying learning algorithms are not distribution-free. We first give our results for parity counting queries in Section 6.1 and then our results for AC⁰ counting queries in Section 6.2.

6.1 Parity counting queries using the Harmonic Sieve [Jac97]

In this subsection we fix the query class under consideration to be the class of parity queries, i.e., we take U = {0, 1}^d, Q = {0, 1}^d, and P(q, u) = Σ_{i : u_i = 1} q_i (mod 2) to be the parity predicate. Our main result for releasing parity counting queries is:

Theorem 6.1 (Releasing parity counting queries).
There is an ε-differentially private algorithm for releasing the class of parity queries over the uniform distribution on Q which is (α, β, γ)-accurate and has runtime poly(n) for databases of size n, provided that

$$n \;\ge\; \frac{\mathrm{poly}(d, 1/\alpha, 1/\gamma, \log(1/\beta))}{\varepsilon}.$$

This theorem is an immediate consequence of our main reduction, Theorem 3.9, and the following learning result:

Theorem 6.2. Let

– U denote the data universe {0, 1}^d;
– Q denote the set of query descriptions {0, 1}^d;
– P(q, u) = Σ_{i : u_i = 1} q_i (mod 2) denote the parity predicate; and
– GQ contain only the uniform distribution over Q.

Then there is an algorithm L that (γ, β)-learns thresholds over (Q, GQ′, {p_u : u ∈ U}), where GQ′ is the (2/γ)-smooth extension of GQ. Algorithm L uses b(n, γ, β) = poly(d, n, 1/γ) · log(1/β) queries to an approximate G-restricted evaluation oracle for the target n-threshold function when it is learning with respect to a distribution G ∈ GQ′. The running time of L is poly(b(n, γ, β)).

Proof. The claimed algorithm L is essentially Jackson's Harmonic Sieve algorithm [Jac97] for learning Majority of Parities; however, a bit of additional analysis of the algorithm is needed, as we now explain. When Jackson's results on the Harmonic Sieve are expressed in our terminology, they give Theorem 6.2 exactly as stated above, except for one issue which we now describe. Let G′ be any distribution in the (2/γ)-smooth extension GQ′ of the uniform distribution.
In Jackson's analysis, when it is learning a target function f under distribution G′, the Harmonic Sieve is given black-box oracle access to f, sampling access to the distribution G′, and access to a c-approximation to an evaluation oracle for G′, in the following sense: there is some fixed constant c ∈ [1/3, 3] such that when the oracle is queried on q ∈ Q, it outputs c · G′[q]. This is a formally more powerful type of access to the underlying distribution G′ than is allowed in Theorem 6.2, since Theorem 6.2 only gives L access to an approximate G′-restricted evaluation oracle for the target function (recall Definition 3.6). To be more precise, the only difference is that with the Sieve's black-box oracle access to the target function f, it is a priori possible for a learning algorithm to query f even on points where the distribution G′ puts zero probability mass, whereas such queries are not allowed for L.

Thus to prove Theorem 6.2 it suffices to argue that the Harmonic Sieve algorithm, when run under distribution G′, never needs to make queries on points q ∈ Q that have G′[q] = 0. Fortunately, this is an easy consequence of the way the Harmonic Sieve algorithm works. Instead of actually using black-box oracle queries for f, the algorithm only ever makes oracle queries to the function g(q) = 2^d · f(q) · D′[q], where D′ is a c-approximation to an evaluation oracle for a distribution G′′ which is a smooth extension of G′. (See the discussion in Sections 4.1 and 4.2 of [Jac97], in particular Steps 16-18 of the HS algorithm of Figure 4 and Steps 3 and 5 of the WDNF algorithm of Figure 3.) By the definition of a smooth extension, if q is such that G′[q] = 0, then G′′[q] also equals 0, and consequently g(q) = 0 as well.
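This zero-mass handling is the whole adaptation. A sketch of ours follows; the oracle interface shown is a simplification of Definition 3.6, returning a density estimate together with a label, and the names are hypothetical:

```python
def g_oracle(q, restricted_eval, d):
    """Simulate the Sieve's query g(q) = 2^d * f(q) * D'(q) using only a
    restricted evaluation oracle.

    `restricted_eval(q)` returns (approx_density, label); on zero-mass
    points it returns (0, None).  In that case g(q) = 0 is the correct
    answer, so the target f never needs to be queried there.
    """
    density, label = restricted_eval(q)
    if density == 0:
        return 0.0  # zero mass: g(q) = 0 by the smooth-extension argument
    return (2 ** d) * label * density
```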
Thus it is straightforward to run the Harmonic Sieve using access to an approximate G′-restricted evaluation oracle: if the oracle reports G′[q] = 0, then "0" is the correct value of g(q), and otherwise the oracle provides precisely the information that would be available to the Sieve in Jackson's original formulation. □

6.2 AC⁰ queries using [JKS02]

Fix U = {0, 1}^d and Q = {0, 1}^d. In this subsection we show that our reduction enables us to do efficient private data release for quite a broad class of queries, namely any query computed by a constant-depth circuit. In more detail, let P(q, u) : {0, 1}^d × {0, 1}^d → {0, 1} be any predicate that is computed by a circuit of depth ℓ = O(1) and size poly(d). Our data release result for such queries is the following:

Theorem 6.3 (Releasing AC⁰ queries). Let GQ be the set containing the uniform distribution and let U, Q, P be as described above. There is an ε-differentially private (U, Q, GQ, P) data release algorithm that is (α, β, γ)-accurate and has runtime poly(n) for databases of size n, provided that

$$n \;\ge\; d^{O\left(\log^{\ell}\left(\frac{d}{\alpha\gamma}\right)\right)} \cdot \tilde{O}\!\left(\frac{\log^3(1/\beta)}{\varepsilon\alpha^2\gamma}\right).$$

See the introduction for a discussion of this result. We observe that given any fixed P as described above, for any given u ∈ U = {0, 1}^d the function p_u(q) is computed by a circuit of depth ℓ and size poly(d) over the input bits q_1, …, q_d. Hence Theorem 6.3 is an immediate consequence of Theorem 3.9 and the following learning result, which describes the performance guarantee of the quasipolynomial-time algorithm of Jackson et al. [JKS02] for learning Majority-of-Parity in our language:

Theorem 6.4 (Theorem 9 of [JKS02]).
Let

– U denote the data universe {0, 1}^d;
– Q denote the set of query descriptions {0, 1}^d;
– P(q, u) be any fixed predicate computed by an AND/OR/NOT circuit of depth ℓ = O(1) and size poly(d);
– GQ contain only the uniform distribution over Q; and
– F be the set of all AND/OR/NOT circuits of depth ℓ and size poly(d).

Then there is an algorithm L that (γ, β)-learns n-thresholds over (Q, GQ′, F), where GQ′ is the (2/γ)-smooth extension of GQ. Algorithm L uses approximate distribution-restricted oracle access to the function, uses b(n, γ, β) = d^{O(log^ℓ(nd/γ))} · log(1/β) samples and calls to the evaluation oracle, and runs in time t(n, γ, β) = d^{O(log^ℓ(nd/γ))} · log(1/β).

We note that Theorem 9 of [JKS02], as stated in that paper, only deals with learning majority-of-AC⁰ circuits under the uniform distribution: it says that an n-way Majority of depth-ℓ, size-poly(d) circuits over {0, 1}^d can be learned to accuracy γ and confidence β under the uniform distribution, using random examples only, in time d^{O(log^ℓ(nd/γ))} · log(1/β). However, the boosting-based algorithm of [JKS02] is identical in its high-level structure to Jackson's Harmonic Sieve; the only difference is that the [JKS02] weak learner simply performs an exhaustive search over all low-weight parity functions to find a weak hypothesis that has non-negligible correlation with the target, whereas the Harmonic Sieve uses a more sophisticated membership-query algorithm (that is an extension of the algorithm of Kushilevitz and Mansour [KM93]).
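The [JKS02] weak learner described above can be sketched directly (a sketch of ours, not the paper's implementation; it uses plain empirical correlations on ±1-labeled samples):

```python
import itertools

def chi(S, q):
    """Parity function chi_S(q) = (-1)^{sum_{i in S} q_i} over {0,1}^d."""
    return -1 if sum(q[i] for i in S) % 2 else 1

def best_low_weight_parity(samples, d, max_weight):
    """Exhaustive-search weak learner in the style of [JKS02]: scan all
    parities on at most `max_weight` coordinates and return the one whose
    empirical correlation with the +/-1 labels is largest in magnitude.
    `samples` is a list of (q, y) pairs with y in {-1, +1}.
    """
    best, best_corr = None, 0.0
    for w in range(max_weight + 1):
        for S in itertools.combinations(range(d), w):
            corr = sum(y * chi(S, q) for q, y in samples) / len(samples)
            if abs(corr) > abs(best_corr):
                best, best_corr = S, corr
    return best, best_corr
```

The exhaustive scan over the d^{O(max_weight)} candidate parities is what makes the algorithm quasipolynomial-time but membership-query-free.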
Arguments identical to the ones Jackson gives for the Harmonic Sieve (in Section 7.1 of [Jac97]) can be applied unchanged to the [JKS02] algorithm, to show that it extends, just like the Harmonic Sieve, to learning under smooth distributions if it is provided with an approximate evaluation oracle for the smooth distribution. In more detail, these arguments show that for any C-smooth distribution G′, given sampling access to labeled examples from (G′, f) (where f is the target n-way Majority of depth-ℓ, size-poly(d) circuits) and approximate evaluation access to G′, the [JKS02] algorithm learns f to accuracy γ and confidence β under G′ in time d^{O(log^ℓ(Cnd/γ))} · log(1/β). This is the result that is restated in our data privacy language above (note that the smoothness parameter there is C = 2/γ).

7 Conclusion and open problems

This work put forward a new reduction from privacy-preserving data analysis to learning thresholds. Instantiating this reduction with various different learning algorithms, we obtained new data release algorithms for a variety of query classes. One notable improvement was in the database size (or error) for distribution-free release of conjunctions and k-way conjunctions. Given these new results, we see no known obstacles to even more dramatic improvements on this central question. In particular, we conclude with the following open question.

Open Question 7.1. Is there a differentially private distribution-free data release algorithm (with constant error, e.g., α = 1/100) for conjunctions or k-way conjunctions that works for databases of size poly(d) and runs in time poly(n) (or poly(n, d^k) for the case of k-way conjunctions)?
Note that such an algorithm for k-way conjunctions would also imply, via boosting [DRV10], that we can privately release all k-way conjunctions in time poly(n, d^k), provided that |D| ≥ poly(d).

References

[Ajt83] Miklós Ajtai. Σ¹₁-formulae on finite structures. Annals of Pure and Applied Logic, 24(1):1–43, 1983.

[BCD+07] Boaz Barak, Kamalika Chaudhuri, Cynthia Dwork, Satyen Kale, Frank McSherry, and Kunal Talwar. Privacy, accuracy, and consistency too: a holistic solution to contingency table release. In Proc. 26th Symposium on Principles of Database Systems (PODS), pages 273–282. ACM, 2007.

[BCdWZ99] Harry Buhrman, Richard Cleve, Ronald de Wolf, and Christof Zalka. Bounds for small-error and zero-error quantum algorithms. In Proc. 40th Foundations of Computer Science (FOCS), pages 358–368. IEEE, 1999.

[BDMN05] Avrim Blum, Cynthia Dwork, Frank McSherry, and Kobbi Nissim. Practical privacy: the SuLQ framework. In Proc. 24th Symposium on Principles of Database Systems (PODS), pages 128–138. ACM, 2005.

[BLR08] Avrim Blum, Katrina Ligett, and Aaron Roth. A learning theory approach to non-interactive database privacy. In Proc. 40th STOC, pages 609–618. ACM, 2008.

[Che66] Elliott W. Cheney. Introduction to Approximation Theory. McGraw-Hill, New York, New York, 1966.

[DMNS06] Cynthia Dwork, Frank McSherry, Kobbi Nissim, and Adam Smith. Calibrating noise to sensitivity in private data analysis. In Proc. 3rd TCC, pages 265–284. Springer, 2006.

[DNR+09] Cynthia Dwork, Moni Naor, Omer Reingold, Guy N. Rothblum, and Salil P. Vadhan. On the complexity of differentially private data release: efficient algorithms and hardness results. In Proc. 41st STOC, pages 381–390. ACM, 2009.

[DRV10] Cynthia Dwork, Guy N. Rothblum, and Salil Vadhan. Boosting and differential privacy. In Proc. 51st Foundations of Computer Science (FOCS).
IEEE, 2010.

[GHRU11] Anupam Gupta, Moritz Hardt, Aaron Roth, and Jon Ullman. Privately releasing conjunctions and the statistical query barrier. In Proc. 43rd STOC, pages 803–812. ACM, 2011.

[HR10] Moritz Hardt and Guy Rothblum. A multiplicative weights mechanism for privacy-preserving data analysis. In Proc. 51st Foundations of Computer Science (FOCS), pages 61–70. IEEE, 2010.

[HS07] Lisa Hellerstein and Rocco A. Servedio. On PAC learning algorithms for rich Boolean function classes. Theoretical Computer Science, 384(1):66–76, 2007.

[Jac97] Jeffrey C. Jackson. An efficient membership-query algorithm for learning DNF with respect to the uniform distribution. Journal of Computer and System Sciences, 55(3):414–440, 1997.

[JKS02] Jeffrey Jackson, Adam Klivans, and Rocco A. Servedio. Learnability beyond AC⁰. In Proc. 34th STOC, pages 776–784. ACM, 2002.

[KLN+08] Shiva Prasad Kasiviswanathan, Homin K. Lee, Kobbi Nissim, Sofya Raskhodnikova, and Adam Smith. What can we learn privately? In Proc. 49th Foundations of Computer Science (FOCS), pages 531–540. IEEE, 2008.

[KM93] Eyal Kushilevitz and Yishay Mansour. Learning decision trees using the Fourier spectrum. SIAM Journal on Computing, 22(6):1331–1348, 1993.

[KMR+94] Michael J. Kearns, Yishay Mansour, Dana Ron, Ronitt Rubinfeld, Robert E. Schapire, and Linda Sellie. On the learnability of discrete distributions. In Proc. 26th STOC, pages 273–282. ACM, 1994.

[KOS04] Adam Klivans, Ryan O'Donnell, and Rocco A. Servedio. Learning intersections and thresholds of halfspaces. Journal of Computer and System Sciences, 68(4):808–840, 2004.

[KRSU10] Shiva Kasiviswanathan, Mark Rudelson, Adam Smith, and Jonathan Ullman. The price of privately releasing contingency tables and the spectra of random matrices with correlated rows. In Proc. 42nd STOC, pages 775–784. ACM, 2010.
[KS04] Adam Klivans and Rocco A. Servedio. Learning DNF in time 2^{Õ(n^{1/3})}. Journal of Computer and System Sciences, 68(2):303–318, 2004.

[Nao96] Moni Naor. Evaluation may be easier than generation. In Proc. 28th STOC, pages 74–83. ACM, 1996.

[RR10] Aaron Roth and Tim Roughgarden. Interactive privacy via the median mechanism. In Proc. 42nd STOC, pages 765–774. ACM, 2010.

[She09] Alexander A. Sherstov. The intersection of two halfspaces has high threshold degree. In Proc. 50th Foundations of Computer Science (FOCS). IEEE, 2009.

[UV11] Jonathan Ullman and Salil P. Vadhan. PCPs and the hardness of generating private synthetic data. In TCC, pages 400–416. Springer, 2011.

[Val84] Leslie Valiant. A theory of the learnable. Communications of the ACM, 27(11):1134–1142, 1984.
