Functional Forms of Optimum Spoofing Attacks for Vector Parameter Estimation in Quantized Sensor Networks

Jiangfan Zhang, Student Member, IEEE, Rick S. Blum, Fellow, IEEE, Lance Kaplan, Fellow, IEEE, and Xuanxuan Lu

Abstract: Estimation of an unknown deterministic vector from quantized sensor data is considered in the presence of spoofing attacks which alter the data presented to several sensors. Contrary to previous work, a generalized attack model is employed which manipulates the data using transformations with arbitrary functional forms determined by some attack parameters whose values are unknown to the attacked system. For the first time, necessary and sufficient conditions are provided under which the transformations provide a guaranteed attack performance in terms of the Cramer-Rao Bound (CRB) regardless of the processing the estimation system employs, thus defining a highly desirable attack. Interestingly, these conditions imply that, for any such attack when the attacked sensors can be perfectly identified by the estimation system, either the Fisher Information Matrix (FIM) for jointly estimating the desired and attack parameters is singular or the attacked system is unable to improve the CRB for the desired vector parameter through this joint estimation even though the joint FIM is nonsingular. It is shown that it is always possible to construct such a highly desirable attack by properly employing an attack vector parameter of sufficiently large dimension relative to the number of quantization levels employed, which was not observed previously. To illustrate the theory in a concrete way, we also provide some numerical results which corroborate that, under the highly desirable attack, attacked data is not useful in reducing the CRB.

Index Terms: Spoofing attack, distributed vector parameter estimation, Cramer-Rao Bound, the Expectation-Maximization algorithm, sensor network.

I. INTRODUCTION

Recent developments in sensor technology have encouraged a large number of applications of sensor networks for parameter estimation, ranging from inexpensive commercial systems to complex military and homeland defense surveillance systems [1]. Typically, large-scale sensor networks are comprised of low-cost and spatially distributed sensor nodes with limited battery power and low computing capacity, which makes the system vulnerable to cyberattacks by adversaries. This has led to great interest in studying the vulnerability of sensor networks in various applications and from different perspectives, see [2]–[10] and the references therein. Due to the dominance of digital technology, a great deal of attention has focused on parameter estimation using quantized data [11]–[15]. The sequel considers the problem of estimating a vector parameter by using quantized data collected from a distributed sensor network under the assumption that the measurements from several subsets of sensors have been falsified by spoofing attacks, a topic that has received virtually no attention to date. To be specific, the spoofing attacks maliciously modify the temporal analog measurements of the phenomenon acquired at the subset of attacked sensors.

This work was supported by the U.S. Army Research Laboratory and the U.S. Army Research Office and was accomplished under Agreement Numbers W911NF-14-1-0245 and W911NF-14-1-0261. The views and conclusions contained in this document are those of the authors and should not be interpreted as representing the official policies, either expressed or implied, of the Army Research Laboratory, Army Research Office, or the U.S. Government. The U.S. Government is authorized to reproduce and distribute reprints for Government purposes notwithstanding any copyright notation hereon.

A. System and Adversary Models

Consider a distributed sensor network $\mathcal{S}_N = \{1, 2, \ldots, N\}$ consisting of $N$ spatially distributed sensors, each making some measurements of a particular phenomenon. We assume that the $j$-th sensor acquires $K_j$ measurements, and we denote the before-attack measurement of the $j$-th sensor at time instant $k$ by $x_{jk}$, which follows a probability density function (pdf) $f_{jk}(x_{jk} \mid \boldsymbol{\theta})$ depending on an unknown deterministic vector parameter $\boldsymbol{\theta}$ with dimension $D_\theta$ that is to be estimated from the measurements. For simplicity, we assume that the measurements $\{x_{jk}\}$ are statistically independent but not necessarily identically distributed.

[Fig. 1: Distributed Estimation System in the Presence of Spoofing Attacks. Diagram labels: Physical phenomenon; Spoofing Attack; Sensor 1, Sensor 2, Sensor 3, ..., Sensor N, each followed by a Quantizer; Fusion Center.]

The adversaries alter the physical phenomenon as in Fig. 1, thus tampering with the measurements at a subset of sensors in the sensor network, hoping to undermine the estimation performance of the system. Let $\mathcal{V} \subset \mathcal{S}_N$ denote the set of sensors undergoing spoofing attacks, while the set $\mathcal{U} \triangleq \mathcal{S}_N \setminus \mathcal{V}$ represents the set of unattacked sensors. A generalized mathematical model of spoofing attacks which maliciously modify the distribution of the analog observations of the physical phenomenon at the attacked sensors is considered, employing general probability density functions $\{f_{jk}\}$ and $\{g_{jk}\}$ which depend on the desired and attack vector parameters. To conform to previous work, the functional forms of the attacks, thus $\{f_{jk}\}$ and $\{g_{jk}\}$, are assumed known to the attacked system but the desired and attack vector parameters are not.
Thus, the after-attack version $\tilde{x}_{jk}$ of $x_{jk}$ obeys the following statistical model (see footnote 1): $\{\tilde{x}_{jk}\}$ is independent and

\[
\tilde{x}_{jk} \sim
\begin{cases}
f_{jk}\left(\tilde{x}_{jk} \mid \boldsymbol{\theta}\right), & \text{if } j \in \mathcal{U} \\
g_{jk}\left(\tilde{x}_{jk} \mid \boldsymbol{\theta}, \boldsymbol{\xi}^{(j)}\right), & \text{if } j \in \mathcal{V},
\end{cases}
\tag{1}
\]

where, if $j \in \mathcal{V}$, the after-attack pdf $g_{jk}(x_{jk} \mid \boldsymbol{\theta}, \boldsymbol{\xi}^{(j)})$ is parameterized by the desired vector parameter $\boldsymbol{\theta}$ and the attack vector parameter $\boldsymbol{\xi}^{(j)}$. It is worth mentioning that the notation $g_{jk}(x_{jk} \mid \boldsymbol{\theta}, \boldsymbol{\xi}^{(j)})$ does not imply that the after-attack pdf of the measurements at the $j$-th sensor has to depend on $\boldsymbol{\theta}$. For example, the adversaries can intercept the signal from the physical phenomenon and generate a new signal using some different pdf solely based on its attack vector parameter. A detailed example of a practical attack of the type described in (1) is provided in Section II.

The set $\mathcal{V}$ of attacked sensors can be divided into disjoint subsets $\{\mathcal{A}_p\}_{p=1}^{P}$ in terms of distinct attack vector parameters $\{\boldsymbol{\xi}^{(j)}\}$ such that

\[
\mathcal{V} = \bigcup_{p=1}^{P} \mathcal{A}_p, \quad \text{and} \quad \mathcal{A}_l \cap \mathcal{A}_m = \emptyset, \ \forall l \neq m,
\tag{2}
\]

where the attacked sensors in the subset $\mathcal{A}_p$ are known by the system under attack to employ an identical attack vector parameter $\boldsymbol{\tau}^{(p)}$ with dimension $D_p$, so that $\boldsymbol{\xi}^{(j)} = \boldsymbol{\tau}^{(p)}$, $\forall j \in \mathcal{A}_p$. The identical attack vectors are possibly due to the sensors in $\mathcal{A}_p$ being attacked by the same attacker. For the sake of notational simplicity, we use $\mathcal{A}_0$ to denote the set $\mathcal{U}$ of unattacked sensors.

Due to the communications employed, each sensor is restricted to convert analog measurements to digital data before transmitting this data to the fusion center (FC) as shown in Fig. 1. At the $j$-th sensor, each after-attack measurement $\tilde{x}_{jk}$ is quantized to $\tilde{u}_{jk}$ by using an $R_j$-level quantizer with quantization regions $\{I_j^{(r)}\}_{r=1}^{R_j}$, that is,

\[
\tilde{u}_{jk} = \sum_{r=1}^{R_j} r \, \mathbb{1}\left\{\tilde{x}_{jk} \in I_j^{(r)}\right\},
\tag{3}
\]

where $\mathbb{1}\{\cdot\}$ is the indicator function.
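As a concrete illustration of the quantizer in (3), the following minimal Python sketch maps analog samples to labels $1, \ldots, R_j$. It assumes, purely for illustration, that the regions $I_j^{(r)}$ are contiguous intervals separated by sorted thresholds; the paper allows arbitrary regions, and the function name `quantize` and the threshold values are hypothetical.

```python
import numpy as np

def quantize(x, thresholds):
    """Return labels in {1, ..., R} for samples x, following (3).

    Assumption (not from the paper): region I^(r) is the interval
    (t_{r-1}, t_r], with t_0 = -inf and t_R = +inf, where
    thresholds = [t_1, ..., t_{R-1}] is sorted in increasing order.
    """
    x = np.asarray(x, dtype=float)
    # side="left" gives i with t_i < x <= t_{i+1}, so the label is i + 1.
    return np.searchsorted(thresholds, x, side="left") + 1

# Hypothetical 4-level quantizer with thresholds -1, 0, 1.
labels = quantize([-2.0, -1.0, -0.5, 0.5, 3.0], [-1.0, 0.0, 1.0])
```

Each sample falls in exactly one region, so exactly one indicator in (3) is nonzero and the sum reduces to the index of that region.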
We adopt this general quantization model due to the fact that optimized quantization regions $\{I_j^{(r)}\}_{r=1}^{R_j}$ for different sensors can be very different, since the measurements from different sensors do not necessarily obey an identical pdf [13], [16]. We assume that the quantizer design $\{I_j^{(r)}\}_{r=1}^{R_j}$ for each sensor is predefined and known to the FC, but not the attacker.

Footnote 1: The notations $\tilde{x}_{jk}$ and $\tilde{u}_{jk}$ denote the after-attack analog measurements and the corresponding quantized measurements.

Let $\boldsymbol{\Theta}$ denote a vector containing the unknown parameter $\boldsymbol{\theta}$ along with all the unknown attack vector parameters which parameterize the spoofing attacks in the sensor network:

\[
\boldsymbol{\Theta} \triangleq \left[ \boldsymbol{\theta}^T, \left(\boldsymbol{\tau}^{(1)}\right)^T, \ldots, \left(\boldsymbol{\tau}^{(P)}\right)^T \right]^T.
\tag{4}
\]

For the sake of notational simplicity in the following parts, we use $p_{jr}^{(k)}$ to denote the after-attack probability mass function (pmf) of the quantized measurement $\tilde{u}_{jk}$ evaluated at $\tilde{u}_{jk} = r$, that is,

\[
p_{jr}^{(k)} \triangleq \Pr\left(\tilde{u}_{jk} = r \mid \boldsymbol{\Theta}\right) =
\begin{cases}
\int_{I_j^{(r)}} f_{jk}\left(\tilde{x}_{jk} \mid \boldsymbol{\theta}\right) d\tilde{x}_{jk}, & \forall j \in \mathcal{A}_0 \\
\int_{I_j^{(r)}} g_{jk}\left(\tilde{x}_{jk} \mid \boldsymbol{\theta}, \boldsymbol{\tau}^{(p)}\right) d\tilde{x}_{jk}, & \forall j \in \mathcal{A}_p, \ \forall p \geq 1.
\end{cases}
\tag{5}
\]

For simplicity, the communication channel between the FC and each sensor is assumed ideal, and hence the FC is able to accurately receive what was transmitted from both the unattacked and attacked sensors. After receiving the quantized data from all sensors, the FC attempts to make an estimate of the desired vector parameter without knowledge of which sensors have been attacked nor the attack parameters used by the attackers.

B. Performance Metric

It is of considerable interest to investigate the performance of spoofing attacks, and mathematically characterize the class of the most devastating spoofing attacks under the assumption that the adversaries have no information about what computations the FC is using.
This paper develops guarantees for the attacker's performance that are independent of the computations performed at the FC. It is clear that if the FC has the information about the groupings of similarly attacked sensors, i.e., $\{\mathcal{A}_p\}$, it can use this information to improve estimation performance over the case where this information is not employed. The FC can always do better in estimating the desired vector parameter with extra knowledge. Therefore, for spoofing attacks employing some specific $\{f_{jk}(x_{jk} \mid \boldsymbol{\theta})\}$ and $\{g_{jk}(\tilde{x}_{jk} \mid \boldsymbol{\theta}, \boldsymbol{\tau}^{(p)})\}$, the case where the compromised sensors are correctly categorized into $P$ different groups according to distinct types of attacks corresponds to the case where the FC has the best chance to combat the spoofing attacks. In other words, the best possible estimation performance (smallest error) under this case provides a lower bound on the estimation performance for any other cases, which implies that the corresponding spoofing attack performance under this case provides a guaranteed attack performance in degrading the estimation performance no matter what computations the FC is using. The recent work in [8] has shown that for some classes of spoofing attacks, with a sufficient number of observations, the FC is able to perfectly identify the set of unattacked sensors and categorize the attacked sensors into different groups according to distinct types of spoofing attacks. For these reasons, we adopt the following definition of the optimal guaranteed degradation spoofing attacks in this paper.

Definition 1: Consider attacks employing $\{f_{jk}(x_{jk} \mid \boldsymbol{\theta})\}$ and $\{g_{jk}(\tilde{x}_{jk} \mid \boldsymbol{\theta}, \boldsymbol{\tau}^{(p)})\}$.
The optimal guaranteed degradation spoofing attack (OGDSA) maximizes the degradation (see footnote 2) of the Cramer-Rao Bound (CRB) for the vector parameter of interest at the FC when the attacked sensors are well identified and categorized according to distinct types of spoofing attacks by the FC.

The estimation performance for a vector parameter in a distributed sensor network can be expressed using an error correlation matrix. However, in most cases, a closed form expression for the error correlation matrix is intractable. Thus the CRB, an asymptotically achievable lower bound on the error correlation matrix, is employed in Definition 1. It is worth mentioning that the optimal guaranteed degradation spoofing attack defined in Definition 1 achieves the classical definition of attack optimality (largest CRB) for the scenario where the FC has the best chance to combat the spoofing attacks. It might not be the classically optimal spoofing attack for the scenario where the FC is unable to determine which sensors are attacked, or to classify sensors into groups of distinct types of spoofing attacks. However, the OGDSAs defined in Definition 1 can provide a guarantee that the actual degradation in the CRB must exceed some critical value no matter what computations the estimation system employs. This guarantee makes OGDSA an excellent spoofing attack from the adversary's point of view.

C. Summary of Results and Main Contributions

Unlike previous work, a generalized attack model is employed which manipulates the data using transformations with arbitrary functional forms determined by some attack parameters whose values are unknown to the attacked system. For the first time, necessary and sufficient conditions are provided under which these transformations provide an OGDSA.
These conditions imply that, for an OGDSA, either the Fisher Information Matrix (FIM) under the conditions of Definition 1 for jointly estimating the desired and attack parameters is singular or the attacked system is unable to improve the CRB under the conditions of Definition 1 for the desired vector parameter through this joint estimation even though the joint FIM is nonsingular. It is shown that when the number of temporal measurements at each sensor is given, it is always possible to construct an OGDSA by properly employing an attack vector parameter of sufficiently large dimension relative to the number of quantization levels employed, which was not observed previously. It is shown that a spoofing attack can render the attacked measurements useless in terms of reducing the CRB under the conditions of Definition 1 for estimating the desired vector parameter if and only if it is an OGDSA. None of these contributions are provided in the previous work.

In order to illustrate the theory just described in a concrete way, we also provide some numerical results. For a specific class of OGDSAs, an enhanced Expectation-Maximization-based algorithm that attempts to use all the attacked and unattacked data to jointly estimate the desired and attack parameters is shown, for a sufficient number of observations, to essentially achieve the CRB of an estimator which knows which sensors are attacked and only uses data from unattacked sensors. For completeness, we specify the Expectation-Maximization-based algorithm for general attacks and enhance it with a heuristic rounding approach previously suggested by others in a different application, which seems to significantly improve the Expectation-Maximization-based algorithm. The purpose of the algorithm and numerical results is to illustrate the properties of OGDSAs.

Footnote 2: See (33) for example.
The numerical results demonstrate that a representative algorithm which tries to use the attacked data is not able to obtain performance that is better than the best achievable performance of an approach that ignores the attacked data.

D. Related Work

In recent years, estimation problems under different attacks have seen great interest in various engineering applications, see [2]–[10], [17]–[20] and the references therein. Rather than the man-in-the-middle attacks which falsify the data transmitted from the sensors to the FC [5]–[7], we are primarily interested in spoofing attacks in this paper, which maliciously modify the measurements of the physical phenomenon at a subset of sensors, see Fig. 1. Spoofing attacks have been widely considered in wireless sensor networks, smart grids, radar systems and sonar systems [2]–[4], [8], [10], [17]–[21]. Each of these recent works takes one specific type of spoofing attack into account, and investigates the attack performance or the estimation performance. In this paper, we do not focus on one specific type of spoofing attack. Instead, we consider a generalized attack model which can describe the different kinds of spoofing attacks employed in all recent work, and moreover, we make use of this generalized model to provide uniform tools to test if a spoofing attack is optimal in our defined sense. In [19] and [20], the authors only considered one specific functional form of the spoofing attacks, the so-called data-injection attacks, so they do not address which functional forms are optimum. Further, the work in [19] and [20] is only for smart grid systems, while our work is very general. Another difference between our work and the other recent work on spoofing attacks in [2]–[4], [10], [17]–[21] is that we consider estimation based on quantized data, which is typically the case in practice.
Interestingly, we show that the quantization limits the capability of the estimation system to combat the spoofing attacks. In particular, it is shown that the adversaries can launch a class of quantization-induced OGDSAs which are easily constructed in practice.

E. Notation and Organization

Throughout this paper, bold upper case letters and bold lower case letters are used to denote matrices and column vectors respectively. The symbol $\mathbb{1}\{\cdot\}$ stands for the indicator function. Let $[\mathbf{A}]_{i,j}$ denote the element in the $i$-th row and $j$-th column of the matrix $\mathbf{A}$, and let $\mathcal{R}(\mathbf{A})$ represent the range space of $\mathbf{A}$. $\mathbf{A} \succ 0$ and $\mathbf{A} \succeq 0$ imply that the matrix $\mathbf{A}$ is positive definite and positive semidefinite respectively. To avoid cumbersome sub-matrix and sub-vector expressions in this paper, we introduce the following notation. The notation $[\mathbf{A}]_{\mathcal{S},:}$ stands for the sub-matrix of $\mathbf{A}$ which consists of the elements with row indices in the set $\mathcal{S}$, and $[\mathbf{A}]_{1:N}$ represents the $N$-by-$N$ leading principal submatrix of $\mathbf{A}$. The $i$-th element of the vector $\mathbf{v}$ is denoted by $v_i$, and $[\mathbf{v}]_{\mathcal{S}}$ represents the sub-vector of $\mathbf{v}$ which only contains the elements with indices in the set $\mathcal{S}$. The symbols $\nabla_{\mathbf{v}} f$ and $\nabla_{\mathbf{v}}^2 f$ respectively signify the gradient and Hessian of $f$ with respect to $\mathbf{v}$. Finally, the expectation and rank operators are denoted by $\mathbb{E}(\cdot)$ and $\mathrm{rank}(\cdot)$ respectively.

The remainder of the paper is organized as follows. An illustrative example of a practical spoofing attack is introduced in Section II. Section III provides the necessary and sufficient conditions for the OGDSAs. A joint attack identification and parameter estimation approach is developed in Section IV, which is used in Section V to corroborate our theoretical results. Finally, Section VI provides our conclusions.

II. ILLUSTRATIVE EXAMPLE OF A PRACTICAL SPOOFING ATTACK

Spoofing attacks on sensor networks can occur in various engineering applications. For instance, spoofing attacks have been described for the localization problem in wireless sensor networks, see [2], [3] and the references therein. Table I in [2] provides a summary of different types of spoofing attack threats for the localization problem. The dangers of spoofing attacks in the Global Positioning System (GPS) that controls everything from car navigation to national power grids have drawn serious public concern [22], [23]. Radar and sonar systems also suffer from spoofing attack threats in practice. As one example of a spoofing attack technique, the application of an electronic countermeasure (ECM), which is designed to jam or deceive the radar or sonar system, can critically degrade the detection and estimation performance of the system [24]. One popular technique for the implementation of ECM employs digital radio frequency memory (DRFM) in radar systems to manipulate the received signal and retransmit it back to confuse the victim radar system. DRFM can mislead the estimation of the range of the target by altering the delay in transmission of pulses, and fool the system into incorrectly estimating the velocity of the target by introducing a Doppler shift in the retransmitted signal [17]. An example of a spoofing attack created by nature is environmental variation in shallow water sonar systems. According to waveguide-invariant theory [25], the environmental variation, such as sound-speed or water-depth perturbations, essentially introduces an apparent shift in the position of the target of interest when the data is processed by matched field processing [21], [26]. Hence these environmental variations can be treated as spoofing attacks which falsify the physical model of the received signal in sonar systems.
More recently, the data-injection attack considered in smart grids is another typical example of a spoofing attack, see [4], [18]–[20] and the references therein.

In order to motivate the mathematical description of spoofing attacks, we consider a spoofing attack utilizing a DRFM in a radar system as an example, which stores the received signal and strategically retransmits it back by introducing an additional delay to mislead the estimation of the range of the target. In the absence of spoofing attacks, the simplified signal model of the $k$-th measurement $x_{jk}$ at the $j$-th receiver at time instant $t_{jk}$, which ignores the Doppler shift, can be expressed as

\[
x_{jk} = \sqrt{E_j}\, a_j\, s\left(t_{jk} - \theta_j\right) + n_{jk},
\tag{6}
\]

where $s(\cdot)$, $E_j$ and $a_j$ respectively represent the transmitted signal, the transmitted energy, and the reflection coefficient. The time delay is denoted by $\theta_j$, which is the parameter to be estimated. We assume that the clutter-plus-noise $n_{jk}$ is a zero-mean Gaussian random variable with known variance $\sigma_j^2$ while all other quantities in (6) are deterministic. As a result, the pdf $f_{jk}(x_{jk} \mid \theta_j)$ of $x_{jk}$ in the absence of spoofing attacks is given by (see footnote 3)

\[
f_{jk}\left(x_{jk} \mid \theta_j\right) = \mathcal{N}\left(\sqrt{E_j}\, a_j\, s\left(t_{jk} - \theta_j\right), \sigma_j^2\right).
\tag{7}
\]

In the presence of a spoofing attack, the after-attack measurement $\tilde{x}_{jk}$ can be described as

\[
\tilde{x}_{jk} = \sqrt{E_j}\, a_j\, s\left(t_{jk} - \theta_j - \xi_j\right) + n_{jk},
\tag{8}
\]

where $\xi_j$ is the delay introduced by the DRFM. Therefore, we can obtain the corresponding after-attack pdf of $\tilde{x}_{jk}$:

\[
g_{jk}\left(\tilde{x}_{jk} \mid \theta_j, \xi_j\right) = \mathcal{N}\left(\sqrt{E_j}\, a_j\, s\left(t_{jk} - \theta_j - \xi_j\right), \sigma_j^2\right) = f_{jk}\left(\tilde{x}_{jk} \mid \theta_j + \xi_j\right).
\tag{9}
\]

In this example, the after-attack pdf $g_{jk}(\tilde{x}_{jk} \mid \theta_j, \xi_j)$ and the before-attack pdf $f_{jk}(x_{jk} \mid \theta_j)$ are in the same family as shown in (9), i.e., the family of Gaussian distributions with the same variance $\sigma_j^2$.
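The identity in (9) can be checked numerically on the quantized data. The sketch below computes the pmf in (5) for the Gaussian model (7)–(8) and confirms that the pair $(\theta_j, \xi_j)$ yields the same pmf as $(\theta_j + \xi_j, 0)$. The Gaussian pulse used for $s(\cdot)$, the interval-shaped quantization regions, and all numeric values are assumptions made only for this illustration.

```python
import math
import numpy as np

def normal_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def pulse(t):
    """Hypothetical transmitted waveform s(t); the paper leaves s unspecified."""
    return math.exp(-t ** 2)

def after_attack_pmf(theta, xi, t, thresholds, energy=1.0, a=1.0, sigma=1.0):
    """pmf of the quantized measurement under (8), per (5).

    Assumes interval regions split by sorted thresholds (illustrative only).
    """
    mean = math.sqrt(energy) * a * pulse(t - theta - xi)
    edges = [-math.inf, *thresholds, math.inf]
    cdf = [normal_cdf((e - mean) / sigma) for e in edges]
    return np.diff(cdf)  # probability mass of each region

thresholds = [-0.5, 0.5]  # hypothetical 3-level quantizer
pmf_attacked = after_attack_pmf(theta=0.3, xi=0.2, t=1.0, thresholds=thresholds)
pmf_shifted = after_attack_pmf(theta=0.5, xi=0.0, t=1.0, thresholds=thresholds)
pmf_clean = after_attack_pmf(theta=0.3, xi=0.0, t=1.0, thresholds=thresholds)
```

Here `pmf_attacked` and `pmf_shifted` agree up to rounding, while `pmf_clean` differs, which is the sense in which the DRFM delay acts like a shift of the desired parameter in this particular model.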
While this may not always be true, the after-attack pdf is generally not only parameterized by the desired parameter $\theta_j$ but also by an unknown attack parameter $\xi_j$. Motivated by this example and other popular spoofing attack examples, such as those in [2]–[4], [8], [17]–[21], the essential impact of a spoofing attack at the $j$-th sensor is to maliciously modify the measurements at the $j$-th sensor in a manner similar to (8). Hence, any given spoofing attack at the $j$-th sensor can be described as a mapping which maps the before-attack pdfs $\{f_{jk}(x_{jk} \mid \boldsymbol{\theta})\}$ of the measurements at the $j$-th sensor to the after-attack pdfs $\{g_{jk}(x_{jk} \mid \boldsymbol{\theta}, \boldsymbol{\xi}^{(j)})\}$, where $\boldsymbol{\theta}$ and $\boldsymbol{\xi}^{(j)}$ account for the desired vector parameter and the attack vector parameter at the $j$-th sensor, the latter representing those deterministic unknowns which can determine the after-attack pdfs.

Footnote 3: $\mathcal{N}(a, b)$ denotes a Gaussian pdf with mean $a$ and variance $b$.

III. THE OPTIMALITY OF SPOOFING ATTACKS

In this section, we pursue the explicit characterization of the optimal spoofing attack as per Definition 1. The adversaries can attempt to maximize the CRB for $\boldsymbol{\theta}$ to achieve an optimal spoofing attack as per Definition 1. Hence, we first formulate the FIM for estimating $\boldsymbol{\Theta}$ in the following, and then, based on the expression of the FIM, we provide the necessary and sufficient conditions for the optimal spoofing attack as per Definition 1.

The FIM $\mathbf{J}_{\boldsymbol{\Theta}}$ for estimating $\boldsymbol{\Theta}$ is defined as [27]

\[
\left[\mathbf{J}_{\boldsymbol{\Theta}}\right]_{l,m} \triangleq -\mathbb{E}\left\{ \frac{\partial^2 L(\boldsymbol{\Theta})}{\partial \Theta_l \, \partial \Theta_m} \right\},
\tag{10}
\]

where $L(\boldsymbol{\Theta})$ denotes the log-likelihood function.
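When the attacked sensors are well identified and categorized into different groups according to distinct types of spoofing attacks, the log-likelihood function $L(\boldsymbol{\Theta})$ in (10) evaluated at $\tilde{\mathbf{u}} \triangleq [\tilde{u}_{11}, \tilde{u}_{12}, \ldots, \tilde{u}_{1K_1}, \tilde{u}_{21}, \ldots, \tilde{u}_{NK_N}]^T = \mathbf{r}$ can be expressed as (see footnote 4)

\[
L(\boldsymbol{\Theta}) = \ln \Pr\left(\tilde{\mathbf{u}} = \mathbf{r} \mid \boldsymbol{\Theta}\right) = \sum_{p=0}^{P} \sum_{j \in \mathcal{A}_p} \sum_{k=1}^{K_j} \sum_{r=1}^{R_j} \mathbb{1}\{r_{jk} = r\} \ln p_{jr}^{(k)}
\tag{11}
\]

by employing (5).

Footnote 4: Note that if $p_{jr}^{(k)} = 0$ for some $j$, $k$ and $r$, then we just need to eliminate the corresponding summand in (11). Hence, without loss of generality, we assume $p_{jr}^{(k)} > 0$.

By substituting the expression of the log-likelihood function $L(\boldsymbol{\Theta})$ in (11) into the definition of the FIM in (10), it can be shown that the FIM $\mathbf{J}_{\boldsymbol{\Theta}}$ for $\boldsymbol{\Theta}$ takes the form

\[
\mathbf{J}_{\boldsymbol{\Theta}} \triangleq
\begin{bmatrix}
\mathbf{J}_{\boldsymbol{\theta}} & \mathbf{B}_1 & \mathbf{B}_2 & \cdots & \mathbf{B}_P \\
\mathbf{B}_1^T & \mathbf{J}_{\boldsymbol{\tau}^{(1)}} & \mathbf{0} & \cdots & \mathbf{0} \\
\mathbf{B}_2^T & \mathbf{0} & \mathbf{J}_{\boldsymbol{\tau}^{(2)}} & \ddots & \vdots \\
\vdots & \vdots & \ddots & \ddots & \mathbf{0} \\
\mathbf{B}_P^T & \mathbf{0} & \cdots & \mathbf{0} & \mathbf{J}_{\boldsymbol{\tau}^{(P)}}
\end{bmatrix}
\tag{12}
\]

where $\mathbf{J}_{\boldsymbol{\theta}} \in \mathbb{R}^{D_\theta \times D_\theta}$, $\mathbf{J}_{\boldsymbol{\tau}^{(p)}} \in \mathbb{R}^{D_p \times D_p}$, and $\mathbf{B}_p \in \mathbb{R}^{D_\theta \times D_p}$ for all $p = 1, 2, \ldots, P$. Moreover, following from (4), (10) and (11), we can obtain that, $\forall p$,

\[
\mathbf{J}_{\boldsymbol{\tau}^{(p)}} = \sum_{j \in \mathcal{A}_p} \sum_{k=1}^{K_j} \sum_{r=1}^{R_j} \frac{1}{p_{jr}^{(k)}} \frac{\partial p_{jr}^{(k)}}{\partial \boldsymbol{\tau}^{(p)}} \left[ \frac{\partial p_{jr}^{(k)}}{\partial \boldsymbol{\tau}^{(p)}} \right]^T,
\tag{13}
\]

\[
\mathbf{B}_p = \sum_{j \in \mathcal{A}_p} \sum_{k=1}^{K_j} \sum_{r=1}^{R_j} \frac{1}{p_{jr}^{(k)}} \frac{\partial p_{jr}^{(k)}}{\partial \boldsymbol{\theta}} \left[ \frac{\partial p_{jr}^{(k)}}{\partial \boldsymbol{\tau}^{(p)}} \right]^T,
\tag{14}
\]

and

\[
\mathbf{J}_{\boldsymbol{\theta}} = \sum_{p=0}^{P} \mathbf{J}_{\mathcal{A}_p},
\tag{15}
\]

where $\mathbf{J}_{\mathcal{A}_p}$, which is contributed from the measurements observed at the sensors in $\mathcal{A}_p$, is defined as

\[
\mathbf{J}_{\mathcal{A}_p} = \sum_{j \in \mathcal{A}_p} \sum_{k=1}^{K_j} \sum_{r=1}^{R_j} \frac{1}{p_{jr}^{(k)}} \frac{\partial p_{jr}^{(k)}}{\partial \boldsymbol{\theta}} \left[ \frac{\partial p_{jr}^{(k)}}{\partial \boldsymbol{\theta}} \right]^T.
\tag{16}
\]

Let $N_p$ denote the number of sensors in $\mathcal{A}_p$, and let $\{j_i^p\}_{i=1}^{N_p}$ stand for the indices of the sensors in $\mathcal{A}_p$. For each $p$, we define two matrices

\[
\boldsymbol{\Phi}_{\theta(p)} \triangleq \left[ \boldsymbol{\phi}^{\theta(p)}_{j_1^p 11}, \boldsymbol{\phi}^{\theta(p)}_{j_1^p 12}, \ldots, \boldsymbol{\phi}^{\theta(p)}_{j_1^p 1 R_{j_1^p}}, \boldsymbol{\phi}^{\theta(p)}_{j_1^p 21}, \ldots, \boldsymbol{\phi}^{\theta(p)}_{j_1^p 2 R_{j_1^p}}, \boldsymbol{\phi}^{\theta(p)}_{j_1^p 31}, \ldots, \boldsymbol{\phi}^{\theta(p)}_{j_1^p K_{j_1^p} R_{j_1^p}}, \boldsymbol{\phi}^{\theta(p)}_{j_2^p 11}, \ldots, \boldsymbol{\phi}^{\theta(p)}_{j_{N_p}^p K_{j_{N_p}^p} R_{j_{N_p}^p}} \right],
\tag{17}
\]

and

\[
\boldsymbol{\Phi}_{\tau(p)} \triangleq \left[ \boldsymbol{\phi}^{\tau(p)}_{j_1^p 11}, \boldsymbol{\phi}^{\tau(p)}_{j_1^p 12}, \ldots, \boldsymbol{\phi}^{\tau(p)}_{j_1^p 1 R_{j_1^p}}, \boldsymbol{\phi}^{\tau(p)}_{j_1^p 21}, \ldots, \boldsymbol{\phi}^{\tau(p)}_{j_1^p 2 R_{j_1^p}}, \boldsymbol{\phi}^{\tau(p)}_{j_1^p 31}, \ldots, \boldsymbol{\phi}^{\tau(p)}_{j_1^p K_{j_1^p} R_{j_1^p}}, \boldsymbol{\phi}^{\tau(p)}_{j_2^p 11}, \ldots, \boldsymbol{\phi}^{\tau(p)}_{j_{N_p}^p K_{j_{N_p}^p} R_{j_{N_p}^p}} \right],
\tag{18}
\]

where the vectors $\boldsymbol{\phi}^{\theta(p)}_{jkr}$ and $\boldsymbol{\phi}^{\tau(p)}_{jkr}$ in (17) and (18) are given by

\[
\boldsymbol{\phi}^{\theta(p)}_{jkr} \triangleq \sqrt{\frac{1}{p_{jr}^{(k)}}} \, \frac{\partial p_{jr}^{(k)}}{\partial \boldsymbol{\theta}}
\quad \text{and} \quad
\boldsymbol{\phi}^{\tau(p)}_{jkr} \triangleq \sqrt{\frac{1}{p_{jr}^{(k)}}} \, \frac{\partial p_{jr}^{(k)}}{\partial \boldsymbol{\tau}^{(p)}}.
\tag{19}
\]

By employing the singular value decomposition of $\boldsymbol{\Phi}_{\theta(p)}$ and $\boldsymbol{\Phi}_{\tau(p)}$ for all $p$,

\[
\boldsymbol{\Phi}_{\tau(p)} = \mathbf{U}_{\tau(p)} \boldsymbol{\Lambda}_{\tau(p)} \mathbf{V}_{\tau(p)}^T
\quad \text{and} \quad
\boldsymbol{\Phi}_{\theta(p)} = \mathbf{U}_{\theta(p)} \boldsymbol{\Lambda}_{\theta(p)} \mathbf{V}_{\theta(p)}^T,
\tag{20}
\]

the expressions of $\mathbf{J}_{\boldsymbol{\tau}^{(p)}}$, $\mathbf{B}_p$, and $\mathbf{J}_{\boldsymbol{\theta}}$ in (13)–(15) can be written in the compact forms

\[
\mathbf{J}_{\boldsymbol{\tau}^{(p)}} = \boldsymbol{\Phi}_{\tau(p)} \boldsymbol{\Phi}_{\tau(p)}^T = \mathbf{U}_{\tau(p)} \boldsymbol{\Lambda}_{\tau(p)} \boldsymbol{\Lambda}_{\tau(p)}^T \mathbf{U}_{\tau(p)}^T,
\tag{21}
\]

\[
\mathbf{B}_p = \boldsymbol{\Phi}_{\theta(p)} \boldsymbol{\Phi}_{\tau(p)}^T,
\tag{22}
\]

and

\[
\mathbf{J}_{\boldsymbol{\theta}} = \sum_{p=0}^{P} \mathbf{J}_{\mathcal{A}_p} = \sum_{p=0}^{P} \boldsymbol{\Phi}_{\theta(p)} \boldsymbol{\Phi}_{\theta(p)}^T = \sum_{p=0}^{P} \mathbf{U}_{\theta(p)} \boldsymbol{\Lambda}_{\theta(p)} \boldsymbol{\Lambda}_{\theta(p)}^T \mathbf{U}_{\theta(p)}^T.
\tag{23}
\]

A. Inestimable Spoofing Attacks

Next we show that, just due to the sensor system employing quantization with a limited alphabet, the adversaries can launch a class of spoofing attacks which bring about a singular FIM $\mathbf{J}_{\boldsymbol{\Theta}}$ due to the singularity of $\mathbf{J}_{\boldsymbol{\tau}^{(p)}}$ for some $p \in \{1, 2, \ldots, P\}$. We formally define these inestimable spoofing attacks as follows.

Definition 2 (Inestimable spoofing attack): The $p$-th spoofing attack is referred to as an inestimable spoofing attack (ISA) if the corresponding $\mathbf{J}_{\boldsymbol{\tau}^{(p)}}$ defined in (13) is singular.

From (13), we have the following result with regard to the singularity of $\mathbf{J}_{\boldsymbol{\tau}^{(p)}}$.

Theorem 1: For the $p$-th spoofing attack, if the dimension $D_p$ of the attack parameter $\boldsymbol{\tau}^{(p)}$ satisfies

\[
D_p > \sum_{j \in \mathcal{A}_p} K_j \left(R_j - 1\right),
\tag{24}
\]

then $\mathbf{J}_{\boldsymbol{\tau}^{(p)}}$ is singular, and furthermore, the FIM $\mathbf{J}_{\boldsymbol{\Theta}}$ is also singular.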
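Proof: It is clear that

\[
\sum_{r=1}^{R_j} p_{jr}^{(k)} = 1,
\tag{25}
\]

for all $j$ and $k$. Hence, we can obtain that

\[
\sum_{r=1}^{R_j} \frac{\partial p_{jr}^{(k)}}{\partial \boldsymbol{\tau}^{(p)}} = \mathbf{0}, \quad \forall j \text{ and } k,
\tag{26}
\]

which yields

\[
\mathrm{rank}\left( \sum_{r=1}^{R_j} \frac{\partial p_{jr}^{(k)}}{\partial \boldsymbol{\tau}^{(p)}} \left[ \frac{\partial p_{jr}^{(k)}}{\partial \boldsymbol{\tau}^{(p)}} \right]^T \right) \leq R_j - 1, \quad \forall j \text{ and } k.
\tag{27}
\]

Thus, the rank of $\mathbf{J}_{\boldsymbol{\tau}^{(p)}}$ is bounded above as per

\[
\begin{aligned}
\mathrm{rank}\left(\mathbf{J}_{\boldsymbol{\tau}^{(p)}}\right)
&= \mathrm{rank}\left( \sum_{j \in \mathcal{A}_p} \sum_{k=1}^{K_j} \sum_{r=1}^{R_j} \frac{1}{p_{jr}^{(k)}} \frac{\partial p_{jr}^{(k)}}{\partial \boldsymbol{\tau}^{(p)}} \left[ \frac{\partial p_{jr}^{(k)}}{\partial \boldsymbol{\tau}^{(p)}} \right]^T \right) \\
&\leq \sum_{j \in \mathcal{A}_p} \sum_{k=1}^{K_j} \mathrm{rank}\left( \sum_{r=1}^{R_j} \frac{\partial p_{jr}^{(k)}}{\partial \boldsymbol{\tau}^{(p)}} \left[ \frac{\partial p_{jr}^{(k)}}{\partial \boldsymbol{\tau}^{(p)}} \right]^T \right) \\
&\leq \sum_{j \in \mathcal{A}_p} K_j \left(R_j - 1\right),
\end{aligned}
\tag{28}
\]

where the first inequality uses the subadditivity of the rank together with the fact that the positive weights $1/p_{jr}^{(k)}$ do not change the span of the summed outer products. Since $\mathbf{J}_{\boldsymbol{\tau}^{(p)}}$ is a $D_p$-by-$D_p$ positive semidefinite matrix, we know that $\mathbf{J}_{\boldsymbol{\tau}^{(p)}}$ is singular if $D_p > \sum_{j \in \mathcal{A}_p} K_j (R_j - 1)$. Finally, the proof concludes by noting that $\mathbf{J}_{\boldsymbol{\Theta}}$ is singular as long as $\mathbf{J}_{\boldsymbol{\tau}^{(p)}}$ is singular.

The proof of Theorem 1 demonstrates that the rank of $\mathbf{J}_{\boldsymbol{\tau}^{(p)}}$ is upper bounded by the sum in (28), which is determined by the number of temporal measurements and the size of the alphabet set employed at each sensor under the $p$-th spoofing attack. This implies that when the number of temporal measurements at each attacked sensor is given, the numbers of quantization levels employed at the compromised sensors will limit the size of the attack vector parameter the quantized estimation system can estimate with an accuracy that increases with more observations.

Theorem 1 provides a sufficient condition under which inestimable spoofing attacks can be launched. Thus, these inestimable spoofing attacks, which are quantization induced, can be easily constructed in practice, even without any information about the value of $\boldsymbol{\theta}$ and the quantization regions $\{I_j^{(r)}\}$ at each sensor.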
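The rank argument in (26)–(28) can be illustrated numerically. The sketch below does not model any particular attack: it draws random derivative vectors, forces them to satisfy the zero-sum constraint (26), assembles a matrix with the structure of (13) for a single hypothetical attacked sensor, and counts its numerically nonzero eigenvalues. The function name `toy_attack_fim`, the random construction, and the sizes are all assumptions made for this example.

```python
import numpy as np

def toy_attack_fim(dim, num_meas, levels, rng):
    """Assemble a matrix with the structure of (13) for one attacked sensor.

    dim      -- D_p, dimension of the attack parameter
    num_meas -- K_j, number of temporal measurements
    levels   -- R_j, number of quantization levels
    The pmfs and derivatives are random stand-ins, not a real attack model.
    """
    J = np.zeros((dim, dim))
    for _ in range(num_meas):
        p = rng.dirichlet(np.full(levels, 5.0))   # a strictly positive pmf
        G = rng.standard_normal((dim, levels))    # column r plays dp_r/dtau
        G -= G.mean(axis=1, keepdims=True)        # enforce (26): sum over r is 0
        J += (G / p) @ G.T                        # sum_r (1/p_r) g_r g_r^T
    return J

rng = np.random.default_rng(0)
dim, num_meas, levels = 6, 2, 3
J = toy_attack_fim(dim, num_meas, levels, rng)

eig = np.linalg.eigvalsh(J)
rank = int(np.sum(eig > 1e-9 * eig.max()))       # numerical rank
bound = num_meas * (levels - 1)                  # right-hand side of (24)
```

With `dim = 6` exceeding `bound = 4`, the numerical rank cannot reach `dim`, so the matrix is singular, matching the conclusion of Theorem 1 for this toy construction.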
Further, if the adversaries have knowledge of the number of quantization levels of each attacked sensor and the number of temporal measurements at each attacked sensor, they know the minimum size of the attack vector parameter they can employ to ensure an inestimable spoofing attack. One simple example of an inestimable spoofing attack employs $D_p > \sum_{j \in \mathcal{A}_p} K_j (R_j - 1)$ and

\[
\tilde{x}_{jk} = \sum_{i=1}^{D_p} \tau_i^{(p)} \left(x_{jk}\right)^i.
\tag{29}
\]

If (24) is not satisfied, the inestimability is determined by the $\{I_j^{(r)}\}$ employed at the attacked sensors and the set of after-attack pdfs $\{g_{jk}(x_{jk} \mid \boldsymbol{\theta}, \boldsymbol{\tau}^{(p)})\}$. From (21), it is seen that the inestimability of the $p$-th spoofing attack is equivalent to

\[
\mathrm{rank}\left(\boldsymbol{\Lambda}_{\tau(p)}\right) < D_p.
\tag{30}
\]

In the presence of inestimable spoofing attacks, the FIM $\mathbf{J}_{\boldsymbol{\Theta}}$ for joint estimation of the desired vector parameter and the attack vector parameters is singular, which implies that the FC is unable to improve the estimation of $\boldsymbol{\theta}$ via jointly estimating $\boldsymbol{\theta}$ and the attack vector parameters in the CRB sense. If (30) is true for all $p = 1, 2, \ldots, P$, this means the best the FC can do in this sense is to estimate $\boldsymbol{\theta}$ using only unattacked data, and hence the CRB for $\boldsymbol{\theta}$ in such a case can be obtained as

\[
\mathrm{CRB}_{\mathrm{ISA}}(\boldsymbol{\theta}) = \mathbf{J}_{\mathcal{A}_0}^{-1} = \mathbf{U}_{\theta(0)} \left( \boldsymbol{\Lambda}_{\theta(0)} \boldsymbol{\Lambda}_{\theta(0)}^T \right)^{-1} \mathbf{U}_{\theta(0)}^T
\tag{31}
\]

by employing (23).

B. Optimal Estimable Spoofing Attacks

In this subsection, we focus on estimable spoofing attacks (defined next), and obtain the necessary and sufficient conditions for the optimal estimable spoofing attacks via FIM analysis.

Definition 3 (Estimable spoofing attack): The $p$-th spoofing attack is said to be estimable if the corresponding $\mathbf{J}_{\boldsymbol{\tau}^{(p)}}$ defined in (13) is nonsingular.

Without loss of generality, we assume all spoofing attacks are estimable in this subsection.
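Otherwise, we can eliminate the observations at ISA sensors, and just consider the joint estimation of the desired vector parameter $\boldsymbol{\theta}$ and the estimable attack vector parameters. From (12) and (15), we can obtain the CRB for $\boldsymbol{\theta}$ in the presence of estimable spoofing attacks as

$$\left[\mathbf{J}_{\boldsymbol{\Theta}}^{-1}\right]_{1:D_{\boldsymbol{\theta}}} = \left( \mathbf{J}_{\boldsymbol{\theta}} - \sum_{p=1}^{P} \mathbf{B}_p \mathbf{J}_{\boldsymbol{\tau}^{(p)}}^{-1} \mathbf{B}_p^T \right)^{-1} = \left[ \mathbf{J}_{\mathcal{A}_0} + \sum_{p=1}^{P} \left( \mathbf{J}_{\mathcal{A}_p} - \mathbf{B}_p \mathbf{J}_{\boldsymbol{\tau}^{(p)}}^{-1} \mathbf{B}_p^T \right) \right]^{-1}. \tag{32}$$

In the following theorem, we provide an upper bound on the CRB for $\boldsymbol{\theta}$ in (32) in the positive semidefinite sense.

Theorem 2: In the presence of estimable spoofing attacks, the CRB for $\boldsymbol{\theta}$ is bounded above as per

$$\mathrm{CRB}_{\mathrm{ESA}}(\boldsymbol{\theta}) \triangleq \left[\mathbf{J}_{\boldsymbol{\Theta}}^{-1}\right]_{1:D_{\boldsymbol{\theta}}} \preceq \mathbf{J}_{\mathcal{A}_0}^{-1}. \tag{33}$$

Equality in (33) holds if and only if, $\forall p = 1, 2, ..., P$,

$$\mathcal{R}\left( \mathbf{V}_{\boldsymbol{\theta}^{(p)}} \boldsymbol{\Lambda}_{\boldsymbol{\theta}^{(p)}}^T \right) \subseteq \mathcal{R}\left( \mathbf{V}_{\boldsymbol{\tau}^{(p)}} \boldsymbol{\Lambda}_{\boldsymbol{\tau}^{(p)}}^T \right). \tag{34}$$

Proof: Let us first examine the term in the sum in (32). By (21), (22) and (23), we can obtain

$$\mathbf{J}_{\mathcal{A}_p} - \mathbf{B}_p \mathbf{J}_{\boldsymbol{\tau}^{(p)}}^{-1} \mathbf{B}_p^T = \boldsymbol{\Phi}_{\boldsymbol{\theta}^{(p)}} \boldsymbol{\Phi}_{\boldsymbol{\theta}^{(p)}}^T - \boldsymbol{\Phi}_{\boldsymbol{\theta}^{(p)}} \boldsymbol{\Phi}_{\boldsymbol{\tau}^{(p)}}^T \left( \boldsymbol{\Phi}_{\boldsymbol{\tau}^{(p)}} \boldsymbol{\Phi}_{\boldsymbol{\tau}^{(p)}}^T \right)^{-1} \boldsymbol{\Phi}_{\boldsymbol{\tau}^{(p)}} \boldsymbol{\Phi}_{\boldsymbol{\theta}^{(p)}}^T. \tag{35}$$

Denote

$$\mathbf{D} \triangleq \left( \boldsymbol{\Phi}_{\boldsymbol{\tau}^{(p)}} \boldsymbol{\Phi}_{\boldsymbol{\tau}^{(p)}}^T \right)^{-1} \boldsymbol{\Phi}_{\boldsymbol{\tau}^{(p)}} \boldsymbol{\Phi}_{\boldsymbol{\theta}^{(p)}}^T, \tag{36}$$

then by employing (35), we can obtain that

$$\mathbf{J}_{\mathcal{A}_p} - \mathbf{B}_p \mathbf{J}_{\boldsymbol{\tau}^{(p)}}^{-1} \mathbf{B}_p^T = \left( \boldsymbol{\Phi}_{\boldsymbol{\theta}^{(p)}}^T - \boldsymbol{\Phi}_{\boldsymbol{\tau}^{(p)}}^T \mathbf{D} \right)^T \left( \boldsymbol{\Phi}_{\boldsymbol{\theta}^{(p)}}^T - \boldsymbol{\Phi}_{\boldsymbol{\tau}^{(p)}}^T \mathbf{D} \right) \succeq \mathbf{0}. \tag{37}$$

Moreover, the equality in (37) is attained if and only if

$$\boldsymbol{\Phi}_{\boldsymbol{\theta}^{(p)}}^T - \boldsymbol{\Phi}_{\boldsymbol{\tau}^{(p)}}^T \mathbf{D} = \mathbf{0}, \quad \forall p \ge 1. \tag{38}$$

By employing (20) and (36), (38) is equivalent to, $\forall p \ge 1$,

$$\mathbf{V}_{\boldsymbol{\tau}^{(p)}} \left[ \mathbf{I} - \boldsymbol{\Lambda}_{\boldsymbol{\tau}^{(p)}}^T \left( \boldsymbol{\Lambda}_{\boldsymbol{\tau}^{(p)}} \boldsymbol{\Lambda}_{\boldsymbol{\tau}^{(p)}}^T \right)^{-1} \boldsymbol{\Lambda}_{\boldsymbol{\tau}^{(p)}} \right] \mathbf{V}_{\boldsymbol{\tau}^{(p)}}^T \mathbf{V}_{\boldsymbol{\theta}^{(p)}} \boldsymbol{\Lambda}_{\boldsymbol{\theta}^{(p)}}^T = \mathbf{0},$$

which implies

$$\mathcal{R}\left( \mathbf{V}_{\boldsymbol{\theta}^{(p)}} \boldsymbol{\Lambda}_{\boldsymbol{\theta}^{(p)}}^T \right) \subseteq \mathcal{R}\left( \mathbf{V}_{\boldsymbol{\tau}^{(p)}} \boldsymbol{\Lambda}_{\boldsymbol{\tau}^{(p)}}^T \right), \quad \forall p \ge 1. \tag{39}$$

Consequently, from (32), (37), and (39), we can conclude that

$$\left[\mathbf{J}_{\boldsymbol{\Theta}}^{-1}\right]_{1:D_{\boldsymbol{\theta}}} \preceq \mathbf{J}_{\mathcal{A}_0}^{-1}, \tag{40}$$

with equality if and only if, $\forall p = 1, 2, ..., P$,

$$\mathcal{R}\left( \mathbf{V}_{\boldsymbol{\theta}^{(p)}} \boldsymbol{\Lambda}_{\boldsymbol{\theta}^{(p)}}^T \right) \subseteq \mathcal{R}\left( \mathbf{V}_{\boldsymbol{\tau}^{(p)}} \boldsymbol{\Lambda}_{\boldsymbol{\tau}^{(p)}}^T \right).$$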
(41)

In Theorem 2, we provide the necessary and sufficient conditions under which the estimable spoofing attacks can deteriorate the CRB for estimating $\boldsymbol{\theta}$ to its upper bound as shown in (33). We formally define this class of optimal estimable spoofing attacks next.

Definition 4 (Optimal Estimable Spoofing Attack): An estimable spoofing attack which satisfies the necessary and sufficient condition in (34) is called an optimal estimable spoofing attack (OESA).

The physical meanings of the terms in (32) and the insight into Theorem 2 deserve some discussion. The term $\mathbf{J}_{\mathcal{A}_0}$ represents the information on $\boldsymbol{\theta}$ embedded in the data from $\mathcal{A}_0$, while $\mathbf{J}_{\mathcal{A}_p}$ indicates the information on $\boldsymbol{\theta}$ that can be provided by the data from $\mathcal{A}_p$ if $\boldsymbol{\tau}^{(p)}$ is known to the FC. The term $\mathbf{B}_p \mathbf{J}_{\boldsymbol{\tau}^{(p)}}^{-1} \mathbf{B}_p^T$ specifies the degradation of the information on $\boldsymbol{\theta}$ from $\mathcal{A}_p$, which is induced by the uncertainty of $\boldsymbol{\tau}^{(p)}$. Given these interpretations, the insight into Theorem 2 is that if and only if (34) holds, the uncertainty of $\boldsymbol{\tau}^{(p)}$ can reduce the information on $\boldsymbol{\theta}$ conveyed by the data from $\mathcal{A}_p$ to $\mathbf{0}$, in which case the sum in the inverse does not contribute to (32). Moreover, Theorem 2 points out that the degradation $\mathbf{B}_p \mathbf{J}_{\boldsymbol{\tau}^{(p)}}^{-1} \mathbf{B}_p^T$ cannot be strictly larger than $\mathbf{J}_{\mathcal{A}_p}$.

There is another interesting interpretation of (34). We define the pmf vector $\boldsymbol{\psi}_p^{(j,k)}$ of the $k$-th measurement at the $j$-th sensor which is under the $p$-th spoofing attack as

$$\boldsymbol{\psi}_p^{(j,k)} \triangleq \left[ p_{j1}^{(k)}, p_{j2}^{(k)}, ..., p_{jR_j}^{(k)} \right]^T, \tag{42}$$

where the after-attack pmf $p_{jr}^{(k)}$ is defined in (5). It can be shown that (34) is equivalent to the existence of a vector $\boldsymbol{\alpha}^{(p,i)} = [\alpha_1^{(p,i)}, \alpha_2^{(p,i)}, ..., \alpha_{D_p}^{(p,i)}]^T$ such that for all $j \in \mathcal{A}_p$ and all $k$,

$$\frac{\partial \boldsymbol{\psi}_p^{(j,k)}}{\partial \theta_i} = \sum_{l=1}^{D_p} \alpha_l^{(p,i)} \frac{\partial \boldsymbol{\psi}_p^{(j,k)}}{\partial \tau_l^{(p)}}.$$
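(43)

The relationship in (43) demonstrates that for all $j$ and all $k$, the change of the pmf vector $\boldsymbol{\psi}_p^{(j,k)}$ induced by changing each $\theta_i$ can be reproduced by a linear combination of the changes of the pmf vector $\boldsymbol{\psi}_p^{(j,k)}$ induced by changing the elements of $\boldsymbol{\tau}^{(p)}$. This implies the FC will be unable to distinguish changes in the attack vector parameter $\boldsymbol{\tau}^{(p)}$ from changes in the desired vector parameter $\boldsymbol{\theta}$, based on the observations, which severely hinders estimation.

Theorem 2 also describes how to design optimal estimable spoofing attacks. The adversaries choose $\{g_{jk}(x_{jk} | \boldsymbol{\theta}, \boldsymbol{\tau}^{(p)})\}$ to meet the necessary and sufficient condition in (34). One trivial example of OESA, which may be relatively easy to detect, is to replace the original measurements at the attacked sensors by some regenerated data obeying a distribution not parameterized by $\boldsymbol{\theta}$, which leads to $\boldsymbol{\Phi}_{\boldsymbol{\theta}^{(p)}} = \mathbf{0}$ for all $p \ge 1$, and therefore, (34) is satisfied. In the following, some typical OESA examples of practical interest are investigated.

Corollary 1: If the spoofing attacks are such that for any $p \ge 1$, $\exists \lambda_p$ satisfying

$$\boldsymbol{\Phi}_{\boldsymbol{\theta}^{(p)}} = \lambda_p \boldsymbol{\Phi}_{\boldsymbol{\tau}^{(p)}}, \tag{44}$$

then the CRB $[\mathbf{J}_{\boldsymbol{\Theta}}^{-1}]_{1:D_{\boldsymbol{\theta}}}$ for $\boldsymbol{\theta}$ will be maximized in the positive semidefinite sense, more specifically

$$\left[\mathbf{J}_{\boldsymbol{\Theta}}^{-1}\right]_{1:D_{\boldsymbol{\theta}}} = \mathbf{J}_{\mathcal{A}_0}^{-1}. \tag{45}$$

Furthermore, the necessary and sufficient condition under which (44) is satisfied for any $\boldsymbol{\theta}$, $\boldsymbol{\tau}^{(p)}$ and $\{I_j^{(r)}\}$ is that $\forall j \in \mathcal{A}_p$ and for all $k$, the after-attack pdf $g_{jk}(x_{jk} | \boldsymbol{\theta}, \boldsymbol{\tau}^{(p)})$ can be expressed as

$$g_{jk}\left( x_{jk} \,\middle|\, \boldsymbol{\theta}, \boldsymbol{\tau}^{(p)} \right) = \tilde{g}_{jk}\left( x_{jk} \,\middle|\, \lambda_p \boldsymbol{\theta} + \boldsymbol{\tau}^{(p)} \right), \tag{46}$$

for some $\tilde{g}_{jk}$.

Proof: Note that if for any $p \ge 1$, $\exists \lambda_p$ such that

$$\boldsymbol{\Phi}_{\boldsymbol{\theta}^{(p)}} = \lambda_p \boldsymbol{\Phi}_{\boldsymbol{\tau}^{(p)}}, \tag{47}$$

then $\forall p \ge 1$, $D_{\boldsymbol{\theta}} = D_p$ and $\mathcal{R}( \mathbf{V}_{\boldsymbol{\theta}^{(p)}} \boldsymbol{\Lambda}_{\boldsymbol{\theta}^{(p)}}^T ) \subseteq \mathcal{R}( \mathbf{V}_{\boldsymbol{\tau}^{(p)}} \boldsymbol{\Lambda}_{\boldsymbol{\tau}^{(p)}}^T )$. Thus, by Theorem 2, we can obtain that

$$\left[\mathbf{J}_{\boldsymbol{\Theta}}^{-1}\right]_{1:D_{\boldsymbol{\theta}}} = \mathbf{J}_{\mathcal{A}_0}^{-1}.$$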
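(48)

In addition, by employing (17), (18) and (19), (47) is equivalent to the condition that for all $j \in \mathcal{A}_p$, all $k$, and all $r$,

$$\frac{\partial p_{jr}^{(k)}}{\partial \boldsymbol{\theta}} = \lambda_p \frac{\partial p_{jr}^{(k)}}{\partial \boldsymbol{\tau}^{(p)}}. \tag{49}$$

By (5), in order to ensure that (49) holds for any $\boldsymbol{\theta}$, $\boldsymbol{\tau}^{(p)}$ and $\{I_j^{(r)}\}$, the adversaries need to ensure that for any $\boldsymbol{\theta}$ and $\boldsymbol{\tau}^{(p)}$,

$$\frac{\partial}{\partial \boldsymbol{\theta}} g_{jk}\left( x_{jk} \,\middle|\, \boldsymbol{\theta}, \boldsymbol{\tau}^{(p)} \right) = \lambda_p \frac{\partial}{\partial \boldsymbol{\tau}^{(p)}} g_{jk}\left( x_{jk} \,\middle|\, \boldsymbol{\theta}, \boldsymbol{\tau}^{(p)} \right) \tag{50}$$

for all $j \in \mathcal{A}_p$ and all $k$. It is clear that if

$$g_{jk}\left( x_{jk} \,\middle|\, \boldsymbol{\theta}, \boldsymbol{\tau}^{(p)} \right) = \tilde{g}_{jk}\left( x_{jk} \,\middle|\, \lambda_p \boldsymbol{\theta} + \boldsymbol{\tau}^{(p)} \right), \tag{51}$$

for some $\tilde{g}_{jk}$, then (50) holds. On the other hand, if (50) is true for any $\boldsymbol{\theta}$ and $\boldsymbol{\tau}^{(p)}$, then $\forall l = 1, 2, ..., D_{\boldsymbol{\theta}}$,

$$(1, -\lambda_p) \begin{bmatrix} \dfrac{\partial}{\partial \theta_l} g_{jk}\left( x_{jk} \,\middle|\, \{\theta_m\}_{m \ne l}, \{\tau_m^{(p)}\}_{m \ne l}, \theta_l, \tau_l^{(p)} \right) \\[2ex] \dfrac{\partial}{\partial \tau_l^{(p)}} g_{jk}\left( x_{jk} \,\middle|\, \{\theta_m\}_{m \ne l}, \{\tau_m^{(p)}\}_{m \ne l}, \theta_l, \tau_l^{(p)} \right) \end{bmatrix} = 0 \tag{52}$$

for any $\boldsymbol{\theta}$ and $\boldsymbol{\tau}^{(p)}$. This implies that for any $\theta_l$ and $\tau_l^{(p)}$, the gradient of $g_{jk}(x_{jk} | \{\theta_m\}_{m \ne l}, \{\tau_m^{(p)}\}_{m \ne l}, \theta_l, \tau_l^{(p)})$ with respect to $[\theta_l, \tau_l^{(p)}]^T$ is perpendicular to the vector $[1, -\lambda_p]^T$. Thus, if we change $[\theta_l, \tau_l^{(p)}]^T$ in the direction $[1, -\lambda_p]^T$, then $g_{jk}(x_{jk} | \{\theta_m\}_{m \ne l}, \{\tau_m^{(p)}\}_{m \ne l}, \theta_l, \tau_l^{(p)})$ does not change. Thus, any equivalent change in the perpendicular direction to $[1, -\lambda_p]^T$ will produce the same change in $g_{jk}(x_{jk} | \{\theta_m\}_{m \ne l}, \{\tau_m^{(p)}\}_{m \ne l}, \theta_l, \tau_l^{(p)})$. Therefore, for any $l$, if

$$(\lambda_p, 1) \begin{bmatrix} 0 \\ t \end{bmatrix} = (\lambda_p, 1) \begin{bmatrix} \theta_l \\ \tau_l^{(p)} \end{bmatrix}, \tag{53}$$

that is, $t = \lambda_p \theta_l + \tau_l^{(p)}$, then we can obtain that

$$g_{jk}\left( x_{jk} \,\middle|\, \{\theta_m\}_{m \ne l}, \{\tau_m^{(p)}\}_{m \ne l}, 0, t \right) = g_{jk}\left( x_{jk} \,\middle|\, \{\theta_m\}_{m \ne l}, \{\tau_m^{(p)}\}_{m \ne l}, \theta_l, \tau_l^{(p)} \right).$$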
(54)

As a result, for any $l$, by employing (54) and defining

$$\bar{g}_{jk,l}\left( x_{jk} \,\middle|\, \{\theta_m\}_{m \ne l}, \{\tau_m^{(p)}\}_{m \ne l}, t \right) \triangleq g_{jk}\left( x_{jk} \,\middle|\, \{\theta_m\}_{m \ne l}, \{\tau_m^{(p)}\}_{m \ne l}, 0, t \right), \tag{55}$$

we can express $g_{jk}(x_{jk} | \{\theta_m\}_{m \ne l}, \{\tau_m^{(p)}\}_{m \ne l}, \theta_l, \tau_l^{(p)})$ as

$$g_{jk}\left( x_{jk} \,\middle|\, \{\theta_m\}_{m \ne l}, \{\tau_m^{(p)}\}_{m \ne l}, \theta_l, \tau_l^{(p)} \right) = \bar{g}_{jk,l}\left( x_{jk} \,\middle|\, \{\theta_m\}_{m \ne l}, \{\tau_m^{(p)}\}_{m \ne l}, \lambda_p \theta_l + \tau_l^{(p)} \right) \tag{56}$$

for some $\bar{g}_{jk,l}$, which implies that

$$g_{jk}\left( x_{jk} \,\middle|\, \boldsymbol{\theta}, \boldsymbol{\tau}^{(p)} \right) = \tilde{g}_{jk}\left( x_{jk} \,\middle|\, \lambda_p \boldsymbol{\theta} + \boldsymbol{\tau}^{(p)} \right) \tag{57}$$

for some $\tilde{g}_{jk}$.

As demonstrated by Corollary 1, if the spoofing attack gives rise to an after-attack pdf $g_{jk}(x_{jk} | \boldsymbol{\theta}, \boldsymbol{\tau}^{(p)})$ which is only parameterized by the sum of $\lambda_p \boldsymbol{\theta}$ and $\boldsymbol{\tau}^{(p)}$ for any $\lambda_p$, then the spoofing attack is optimal in the sense of Definition 4. This class of OESAs is interesting and powerful in practice, since their optimality is independent of the values of the desired vector parameter and the attack vector parameter. The DRFM example discussed in the introduction, which introduces a time delay, is one example of this class of OESAs (with $\lambda_p = 1$). For the scenario where the desired parameter is the mean of the observations, which is a popular signal model for sensor network estimation systems with quantized data [7], [13]–[15], this class of OESAs can be easily launched by just adding an offset to the measurements at each attacked sensor.

Another representative example of the class of OESAs described by (46) is extensively considered in smart grid systems under the name data-injection attacks, see [4], [18]–[20] and the references therein. At time instant $k$, the direct current power flow model in the absence of spoofing attacks can be expressed as

$$\mathbf{x}_k = \mathbf{H} \boldsymbol{\theta} + \mathbf{n}_k.$$
(58)

Considering the $p$-th data-injection attack, the after-attack measurements from the sensors in $\mathcal{A}_p$ at time instant $k$ are given by

$$[\tilde{\mathbf{x}}_k]_{\mathcal{A}_p} = [\mathbf{x}_k]_{\mathcal{A}_p} + \mathbf{a}^{(p)} = [\mathbf{H}]_{\mathcal{A}_p,:} \boldsymbol{\theta} + \mathbf{a}^{(p)} + [\mathbf{n}_k]_{\mathcal{A}_p}, \tag{59}$$

where $\mathbf{a}^{(p)}$ represents the data injected by the $p$-th spoofing attack. If the adversaries choose $\mathbf{a}^{(p)}$ such that

$$\mathbf{a}^{(p)} = [\mathbf{H}]_{\mathcal{A}_p,:} \boldsymbol{\tau}^{(p)} \tag{60}$$

for some $\boldsymbol{\tau}^{(p)}$, then the after-attack measurements from the sensors in $\mathcal{A}_p$ can be equivalently written as

$$[\tilde{\mathbf{x}}_k]_{\mathcal{A}_p} = [\mathbf{H}]_{\mathcal{A}_p,:} \left( \boldsymbol{\theta} + \boldsymbol{\tau}^{(p)} \right) + [\mathbf{n}_k]_{\mathcal{A}_p}, \tag{61}$$

and therefore, (46) is satisfied by the data-injection attack. Further, by Corollary 1, the CRB for $\boldsymbol{\theta}$ is maximized in the positive semidefinite sense if all the attacks are of this type. Moreover, it can be shown that the stealth attack or undetectable attack in [4], [18], [20], which has attracted extensive attention in the recent literature on smart grids, is just such an attack with $P = 1$.

In addition to the class of OESAs described in (46), there are many other OESAs. For example, if the $p$-th spoofing attack satisfies that $\forall j \in \mathcal{A}_p$ and for all $k$, $g_{jk}(x_{jk} | \boldsymbol{\theta}, \boldsymbol{\tau}^{(p)}) = \tilde{g}_{jk}(x_{jk} | h_{jk}(\boldsymbol{\theta}, \boldsymbol{\tau}^{(p)}))$ for some $\tilde{g}_{jk}$ and some symmetric function $h_{jk}$ of $\boldsymbol{\theta}$ and $\boldsymbol{\tau}^{(p)}$, then it can be shown that the $p$-th spoofing attack is an OESA provided that the values of $\boldsymbol{\tau}^{(p)}$ and $\boldsymbol{\theta}$ are equal.

C. Discussion

Under the conditions of Definition 1, it is clear that $\mathbf{J}_{\mathcal{A}_0}^{-1}$ is an upper bound on the CRB for $\boldsymbol{\theta}$, no matter what kind of attacks have been launched. From (31) and Theorem 2, the CRB for $\boldsymbol{\theta}$ under ISA or OESA equals its upper bound $\mathbf{J}_{\mathcal{A}_0}^{-1}$. Therefore, according to Definition 1, both ISA and OESA are OGDSAs. Furthermore, note that $\boldsymbol{\Lambda}_{\boldsymbol{\tau}^{(p)}}$ is a $D_p \times (\sum_{j \in \mathcal{A}_p} R_j)$ matrix, and hence, $\operatorname{rank}(\boldsymbol{\Lambda}_{\boldsymbol{\tau}^{(p)}}) \le D_p$. Thus, any OGDSA is either an ISA when $\operatorname{rank}(\boldsymbol{\Lambda}_{\boldsymbol{\tau}^{(p)}}) < D_p$, or an OESA when $\operatorname{rank}(\boldsymbol{\Lambda}_{\boldsymbol{\tau}^{(p)}}) = D_p$.
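The claim that an OESA leaves no usable information in the attacked data can be illustrated directly from (35). The sketch below is only an illustration under stated assumptions: the matrices standing in for $\boldsymbol{\Phi}_{\boldsymbol{\theta}^{(p)}}$ and $\boldsymbol{\Phi}_{\boldsymbol{\tau}^{(p)}}$ are random placeholders rather than quantities derived from a sensor model, and the function name `residual_information` is hypothetical. It evaluates $\mathbf{J}_{\mathcal{A}_p} - \mathbf{B}_p \mathbf{J}_{\boldsymbol{\tau}^{(p)}}^{-1} \mathbf{B}_p^T$ for a generic pair and for a pair satisfying (44).

```python
import numpy as np

def residual_information(phi_theta, phi_tau):
    """Evaluate J_Ap - B_p J_tau^{-1} B_p^T using the factorization in (35)."""
    J_A = phi_theta @ phi_theta.T
    B = phi_theta @ phi_tau.T
    J_tau = phi_tau @ phi_tau.T          # assumed nonsingular (estimable attack)
    return J_A - B @ np.linalg.solve(J_tau, B.T)

rng = np.random.default_rng(1)
phi_tau = rng.normal(size=(2, 8))        # placeholder for Phi_tau (D_p = 2)

# Generic attack: some information on theta survives (PSD, nonzero).
generic = residual_information(rng.normal(size=(2, 8)), phi_tau)

# Attack satisfying (44) with lambda_p = 0.7: the residual vanishes.
matched = residual_information(0.7 * phi_tau, phi_tau)

print(np.linalg.eigvalsh((generic + generic.T) / 2))
print(np.abs(matched).max())
```

In the matched case the residual is zero up to rounding, so the corresponding term drops out of the sum in (32) and only $\mathbf{J}_{\mathcal{A}_0}$ remains.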
A particular note of interest is that the results in Sections III-A and III-B can be used to judge whether or not the attacked measurements are useful in terms of reducing the CRB under the conditions of Definition 1. In particular, it is seen from (31) and Theorem 2 that the CRB for $\boldsymbol{\theta}$ in the presence of ISA or OESA is the same as the CRB for $\boldsymbol{\theta}$ when only unattacked data is used. Thus, we obtain the following corollary.

Corollary 2: Under the conditions of Definition 1, the necessary and sufficient condition under which the attacked measurements are useless in terms of reducing the CRB is that the spoofing attacks belong to either ISA or OESA, which are defined in Definitions 2 and 4, respectively.

However, the fundamental mechanisms by which ISA and OESA make the attacked measurements useless in terms of reducing the CRB are very different. To be specific, ISA renders the task of estimating the attack vector parameters beyond the capabilities of the quantized estimation system by causing the FIM for jointly estimating the desired and attack vector parameters to be singular. Thus, ISA prevents the FC from potentially improving the CRB for $\boldsymbol{\theta}$ by jointly estimating $\boldsymbol{\theta}$ and the attack vector parameters. In contrast, under OESA, even though the joint FIM is nonsingular, the FC is not able to obtain any improvement in the CRB performance for $\boldsymbol{\theta}$ from using the attacked data.

It is worth mentioning that (31) and Theorem 2 demonstrate that under the conditions of Definition 1, the CRB for $\boldsymbol{\theta}$ reaches its upper bound in the presence of ISA or OESA. In practice, however, the FC may not be able to reliably identify the set of unattacked sensors and categorize the attacked sensors into different groups according to distinct types of spoofing attacks. Thus, the actual estimation performance under ISA and OESA can be expected to be inferior to $\mathbf{J}_{\mathcal{A}_0}^{-1}$.
In the case where for any $j$, $\{\tilde{x}_{jk}\}$ is a statistically independent and identically distributed sequence over $k$, the sufficient condition (24) in Theorem 1 becomes $D_p > \sum_{j \in \mathcal{A}_p} R_j - |\mathcal{A}_p|$. The right-hand side of this inequality is just the sum of the sizes of the alphabet sets employed at the sensors under the $p$-th spoofing attack minus the size of $\mathcal{A}_p$. This quantity does not even depend on the number of temporal measurements at each attacked sensor.

In addition, it can be shown that the results in this section can be easily extended to the cases where the attack vector parameters employed at the attacked sensors change over time, that is, the after-attack observation $\tilde{x}_{jk}$ obeys the statistical model

$$\tilde{x}_{jk} \sim \begin{cases} f_{jk}\left( \tilde{x}_{jk} \,\middle|\, \boldsymbol{\theta} \right), & \text{if } j \in \mathcal{U} \\ g_{jk}\left( \tilde{x}_{jk} \,\middle|\, \boldsymbol{\theta}, \boldsymbol{\xi}^{(j,k)} \right), & \text{if } j \in \mathcal{V}, \end{cases} \tag{62}$$

where $\boldsymbol{\xi}^{(j,k)}$ is the attack vector parameter employed at time instant $k$ at the $j$-th sensor. For the sake of brevity, we skip the extension.

IV. JOINT ATTACK IDENTIFICATION AND PARAMETER ESTIMATION UNDER ESTIMABLE SPOOFING ATTACK

In order to corroborate the theory just described, we develop a representative approach for the joint identification of attacked sensors along with the estimation of the desired vector parameter and the attack vector parameters for the estimation system facing attacks. In this section, we focus on a class of estimable spoofing attacks in which for any $p$, $\forall j \in \mathcal{A}_p$, the FIM for $\boldsymbol{\tau}^{(p)}$ based on the data from the $j$-th sensor is nonsingular. Further, we assume that $\mathbf{J}_{\mathcal{A}_0}$ defined in (16) is nonsingular in the presence of spoofing attacks. This could occur, for example, if only a small subset of sensors can be attacked in a distributed sensor setting or if a subset of sensors can be well protected in advance to give rise to a nonsingular $\mathbf{J}_{\mathcal{A}_0}$.
In this section, we use $\{\boldsymbol{\xi}^{(j)}\}_{j=1}^{N}$ instead of $\{\boldsymbol{\tau}^{(p)}\}_{p=1}^{P}$ to denote the attack vector parameters employed by the adversaries at the $j$-th sensor. For the sake of notational simplicity, we let $q_{jr}^{(k)}$ and $\tilde{q}_{jr}^{(k)}$ denote the $r$-th value of the pmf of the $k$-th time sample at the $j$-th sensor when it is unattacked and attacked, respectively:

$$q_{jr}^{(k)} \triangleq \int_{I_j^{(r)}} f_{jk}\left( x_{jk} \,\middle|\, \boldsymbol{\theta} \right) dx_{jk} \tag{63}$$

and

$$\tilde{q}_{jr}^{(k)} \triangleq \int_{I_j^{(r)}} g_{jk}\left( x_{jk} \,\middle|\, \boldsymbol{\theta}, \boldsymbol{\xi}^{(j)} \right) dx_{jk}, \tag{64}$$

where $f_{jk}$ and $g_{jk}$ represent the pdf of the $k$-th time sample at the $j$-th sensor when it is unattacked and attacked, respectively. Before proceeding, the following assumptions are made from a practical viewpoint in this section.

Assumption 1: As the sensors are assumed to be spread over a wide area and adversaries typically have limited resources, we assume that no more than half of the sensors are attacked.

Assumption 2 (Significant Attack): In order to give rise to sufficient impact on the statistical characterization of the measurements at each attacked sensor, every attacker is required to guarantee a minimum average distortion of the pmf at each attacked sensor, that is,

$$\frac{1}{K_j} \sum_{k=1}^{K_j} \left\| \tilde{\mathbf{q}}_j^{(k)} - \mathbf{q}_j^{(k)} \right\|^2 \ge d_q, \quad \forall j \notin \mathcal{A}_0, \tag{65}$$

where $\mathbf{q}_j^{(k)}$ and $\tilde{\mathbf{q}}_j^{(k)}$ are defined as

$$\mathbf{q}_j^{(k)} \triangleq \left[ q_{j1}^{(k)}, q_{j2}^{(k)}, ..., q_{jR_j}^{(k)} \right]^T \tag{66}$$

and

$$\tilde{\mathbf{q}}_j^{(k)} \triangleq \left[ \tilde{q}_{j1}^{(k)}, \tilde{q}_{j2}^{(k)}, ..., \tilde{q}_{jR_j}^{(k)} \right]^T. \tag{67}$$

We do not consider modifications smaller than (65) as attacks and assume they have little impact on performance. Assumption 1 is made to avoid ambiguity between the attacked and unattacked sensors.
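The quantities in (63)–(65) can be made concrete with a small sketch. It assumes, purely for illustration, a scalar Gaussian mean model in which an attack adds an offset `xi` to the mean, a uniform set of quantizer thresholds, and that the norm in (65) is the squared Euclidean norm; the function names and numerical values below are hypothetical and are not taken from the paper's experiments.

```python
import math

def gaussian_cdf(x):
    """Standard normal cdf; math.erf handles +/- infinity."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def quantized_pmf(mean, sigma, thresholds):
    """Bin probabilities of N(mean, sigma^2) over the regions cut by `thresholds`."""
    edges = [-math.inf] + list(thresholds) + [math.inf]
    cdf = [gaussian_cdf((e - mean) / sigma) for e in edges]
    return [cdf[i + 1] - cdf[i] for i in range(len(edges) - 1)]

def average_distortion(q_list, q_tilde_list):
    """Left-hand side of (65): mean over k of the squared distance between pmfs."""
    total = 0.0
    for q, qt in zip(q_list, q_tilde_list):
        total += sum((a - b) ** 2 for a, b in zip(qt, q))
    return total / len(q_list)

thresholds = list(range(-7, 8))          # 15 finite thresholds, 16 bins (4 bits)
theta, sigma, xi, K, d_q = 1.0, 1.0, 2.0, 5, 0.15

q_list = [quantized_pmf(theta, sigma, thresholds)] * K
q_tilde_list = [quantized_pmf(theta + xi, sigma, thresholds)] * K

d = average_distortion(q_list, q_tilde_list)
print(d, d >= d_q)                       # whether the offset counts as a significant attack
```

With an offset of zero the two pmfs coincide and the distortion is exactly zero, which is the kind of modification Assumption 2 excludes from being called an attack.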
Let $\boldsymbol{\Omega}$ denote a vector containing the desired vector parameter $\boldsymbol{\theta}$, the set of unknown attack vector parameters $\{\boldsymbol{\xi}^{(j)}\}$, as well as a set of unknown binary state variables $\{\eta_j\}$, such that

$$\boldsymbol{\Omega} \triangleq \left[ \boldsymbol{\Xi}^T, \boldsymbol{\eta}^T \right]^T, \tag{68}$$

where

$$\boldsymbol{\Xi} \triangleq \left[ \boldsymbol{\theta}^T, \left( \boldsymbol{\xi}^{(1)} \right)^T, \left( \boldsymbol{\xi}^{(2)} \right)^T, ..., \left( \boldsymbol{\xi}^{(N)} \right)^T \right]^T \tag{69}$$

and

$$\boldsymbol{\eta} \triangleq [\eta_1, \eta_2, ..., \eta_N]^T. \tag{70}$$

The $j$-th element of $\boldsymbol{\eta}$ is zero, i.e., $\eta_j = 0$, if the $j$-th sensor is unattacked, while $\eta_j = 1$ implies the $j$-th sensor is attacked. The log-likelihood function evaluated at $\tilde{\mathbf{u}} = \mathbf{r}$ is

$$L(\boldsymbol{\Omega}) \triangleq \ln \Pr\left( \tilde{\mathbf{u}} = \mathbf{r} \,\middle|\, \boldsymbol{\Omega} \right) = \sum_{j=1}^{N} \sum_{k=1}^{K_j} \left[ \eta_j \ln \tilde{q}_{j r_{jk}}^{(k)} + (1 - \eta_j) \ln q_{j r_{jk}}^{(k)} \right]. \tag{71}$$

Based on this setting, the FC can jointly identify the state of each sensor and estimate the desired vector parameter $\boldsymbol{\theta}$ by solving the following constrained optimization problem

$$\hat{\boldsymbol{\Omega}} = \arg\max_{\boldsymbol{\Omega}} \sum_{j=1}^{N} \sum_{k=1}^{K_j} \left[ \eta_j \ln \tilde{q}_{j r_{jk}}^{(k)} + (1 - \eta_j) \ln q_{j r_{jk}}^{(k)} \right] \tag{72a}$$

$$\text{s.t.} \quad \eta_j \in \{0, 1\}, \quad \forall j, \tag{72b}$$

$$\sum_{j=1}^{N} \eta_j < \frac{N}{2}, \tag{72c}$$

$$\frac{1}{K_j} \sum_{k=1}^{K_j} \left\| \tilde{\mathbf{q}}_j^{(k)} - \mathbf{q}_j^{(k)} \right\|^2 \ge d_q, \quad \forall \eta_j = 1, \tag{72d}$$

where the constraints in (72c) and (72d) are due to Assumption 1 and Assumption 2. The integer constraint in (72b) makes the optimization problem difficult to solve. For small $N$, it may be solved exactly by exhaustively searching through all possible combinations of $\{\eta_j\}$, while for large $N$, this is not feasible in practice, since the number of all possible combinations of $\{\eta_j\}$ is on the order of $2^N$. It is therefore of considerable practical interest to develop an efficient algorithm to solve the optimization problem in (72). In this section, we propose a heuristic for solving (72).

A. Random Relaxation with the EM Algorithm

According to the constraint in (72b), $\eta_j$ is an unknown deterministic binary variable, and hence, (72b) is equivalent to

$$\pi_j \triangleq \Pr(\eta_j = 1) \in \{0, 1\} \quad \text{and} \quad \Pr(\eta_j = 0) = 1 - \pi_j, \quad \forall j.$$
(73)

Further, by dropping the constraints (72c) and (72d), and then relaxing the deterministic $\{\eta_j\}$ to be random, that is, allowing $\pi_j = \Pr(\eta_j = 1) \in [0, 1]$ for all $j = 1, 2, ..., N$, the problem in (72) reduces to

$$\hat{\boldsymbol{\Omega}}_{\pi} = \arg\max_{\boldsymbol{\Omega}_{\pi}} \sum_{j=1}^{N} \sum_{k=1}^{K_j} \ln \left[ \pi_j \tilde{q}_{j r_{jk}}^{(k)} + (1 - \pi_j) q_{j r_{jk}}^{(k)} \right] \tag{74a}$$

$$\text{s.t.} \quad \pi_j \in [0, 1], \quad \forall j = 1, 2, ..., N, \tag{74b}$$

where $\boldsymbol{\Omega}_{\pi} \triangleq [\boldsymbol{\Xi}^T, \boldsymbol{\pi}^T]^T$ and $\boldsymbol{\pi} \triangleq [\pi_1, \pi_2, ..., \pi_N]^T$.

The physical interpretation behind (74) is that via random relaxation of the deterministic binary vector state variable $\boldsymbol{\eta}$, the set $\mathcal{A}_0$ of unattacked sensors is no longer deterministic, and moreover, each sensor in the sensor network is attacked with a certain probability $\pi_j$ at every time instant. By introducing a latent vector variable

$$\mathbf{z} = [z_{11}, z_{12}, ..., z_{1K_1}, z_{21}, ..., z_{NK_N}]^T, \tag{75}$$

where $z_{jk} = 1$ indicates that the $k$-th measurement at the $j$-th sensor was attacked, and $z_{jk} = 0$ implies that the $k$-th measurement at the $j$-th sensor was unattacked, we can employ the Expectation-Maximization (EM) algorithm [28], [29], which is an iterative method that alternates between performing an expectation (E) step and a maximization (M) step, to solve the relaxed problem in (74).

1) E-step: The E-step computes the expected log-likelihood function $Q(\boldsymbol{\Omega}_{\pi} | \hat{\boldsymbol{\Omega}}'_{\pi})$, with respect to $\mathbf{z}$ given the quantized data $\tilde{\mathbf{u}} = \mathbf{r}$ and the current estimate of the vector parameter $\hat{\boldsymbol{\Omega}}'_{\pi} = [(\hat{\boldsymbol{\Xi}}')^T, (\hat{\boldsymbol{\pi}}')^T]^T$, as follows:

$$Q\left( \boldsymbol{\Omega}_{\pi} \,\middle|\, \hat{\boldsymbol{\Omega}}'_{\pi} \right) \triangleq E_{\mathbf{z} | \hat{\boldsymbol{\Omega}}'_{\pi}, \tilde{\mathbf{u}} = \mathbf{r}} \left\{ L(\boldsymbol{\Omega}_{\pi}) \right\}, \tag{76}$$

where the log-likelihood function $L(\boldsymbol{\Omega}_{\pi})$ is given by

$$\begin{aligned}
L(\boldsymbol{\Omega}_{\pi}) &= \ln \Pr\left( \mathbf{z}, \tilde{\mathbf{u}} = \mathbf{r} \,\middle|\, \boldsymbol{\Omega}_{\pi} \right) \\
&= \ln \Pr\left( \tilde{\mathbf{u}} = \mathbf{r} \,\middle|\, \boldsymbol{\Omega}_{\pi}, \mathbf{z} \right) + \ln \Pr\left( \mathbf{z} \,\middle|\, \boldsymbol{\Omega}_{\pi} \right) \\
&= \sum_{j=1}^{N} \sum_{k=1}^{K_j} \left\{ \mathbb{1}_{\{z_{jk} = 1\}} \left[ \ln \tilde{q}_{j r_{jk}}^{(k)} + \ln \pi_j \right] + \mathbb{1}_{\{z_{jk} = 0\}} \left[ \ln q_{j r_{jk}}^{(k)} + \ln (1 - \pi_j) \right] \right\}.
\end{aligned}$$
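(77)

Define

$$\upsilon_{jk}^{(1)} \triangleq E_{\mathbf{z} | \hat{\boldsymbol{\Omega}}'_{\pi}, \tilde{\mathbf{u}} = \mathbf{r}} \left\{ \mathbb{1}_{\{z_{jk} = 1\}} \right\} = \frac{\hat{\pi}'_j \tilde{q}_{j r_{jk}}^{(k)}}{\hat{\pi}'_j \tilde{q}_{j r_{jk}}^{(k)} + \left( 1 - \hat{\pi}'_j \right) q_{j r_{jk}}^{(k)}} \tag{78}$$

and

$$\upsilon_{jk}^{(0)} \triangleq E_{\mathbf{z} | \hat{\boldsymbol{\Omega}}'_{\pi}, \tilde{\mathbf{u}} = \mathbf{r}} \left\{ \mathbb{1}_{\{z_{jk} = 0\}} \right\} = 1 - \upsilon_{jk}^{(1)}, \tag{79}$$

then by employing (76) and (77), we can obtain the expected log-likelihood function

$$Q\left( \boldsymbol{\Omega}_{\pi} \,\middle|\, \hat{\boldsymbol{\Omega}}'_{\pi} \right) = \sum_{j=1}^{N} \sum_{k=1}^{K_j} \left\{ \upsilon_{jk}^{(1)} \left[ \ln \tilde{q}_{j r_{jk}}^{(k)} + \ln \pi_j \right] + \upsilon_{jk}^{(0)} \left[ \ln q_{j r_{jk}}^{(k)} + \ln (1 - \pi_j) \right] \right\}. \tag{80}$$

2) M-step: The M-step seeks a new estimate $\hat{\boldsymbol{\Omega}}_{\pi}$ of the vector parameter to update the current estimate $\hat{\boldsymbol{\Omega}}'_{\pi}$ by maximizing the expected log-likelihood function $Q(\boldsymbol{\Omega}_{\pi} | \hat{\boldsymbol{\Omega}}'_{\pi})$, that is,

$$\hat{\boldsymbol{\Omega}}_{\pi} = \left[ \hat{\boldsymbol{\Xi}}^T, \hat{\boldsymbol{\pi}}^T \right]^T = \arg\max_{\boldsymbol{\Omega}_{\pi}} Q\left( \boldsymbol{\Omega}_{\pi} \,\middle|\, \hat{\boldsymbol{\Omega}}'_{\pi} \right). \tag{81}$$

a) Updated estimate of $\boldsymbol{\pi}$: According to (81), the updated estimate $\hat{\pi}_j$ should satisfy

$$\frac{\partial Q\left( \boldsymbol{\Omega}_{\pi} \,\middle|\, \hat{\boldsymbol{\Omega}}'_{\pi} \right)}{\partial \pi_j} = \frac{1}{\pi_j} \sum_{k=1}^{K_j} \upsilon_{jk}^{(1)} - \frac{1}{1 - \pi_j} \sum_{k=1}^{K_j} \upsilon_{jk}^{(0)} = 0, \tag{82}$$

which yields, by employing (79),

$$\hat{\pi}_j = \frac{1}{K_j} \sum_{k=1}^{K_j} \upsilon_{jk}^{(1)}. \tag{83}$$

b) Updated estimate of $\boldsymbol{\Xi}$: Similarly, the updated estimate $\hat{\boldsymbol{\Xi}}$ is the solution of the following equation

$$\nabla_{\boldsymbol{\Xi}} Q\left( \boldsymbol{\Omega}_{\pi} \,\middle|\, \hat{\boldsymbol{\Omega}}'_{\pi} \right) = \mathbf{0}. \tag{84}$$

Generally, a closed-form solution for the above equation may not exist. To solve (84) in such cases, Newton's method can be employed with an initial point $\hat{\boldsymbol{\Xi}}^{(0)} = \hat{\boldsymbol{\Xi}}'$. At the $(t+1)$-th iteration of Newton's method, the updated point $\hat{\boldsymbol{\Xi}}^{(t+1)}$ can be expressed as

$$\hat{\boldsymbol{\Xi}}^{(t+1)} = \hat{\boldsymbol{\Xi}}^{(t)} - \kappa_t \left[ \nabla_{\boldsymbol{\Xi}}^2 Q\left( \boldsymbol{\Omega}_{\pi}^{(t)} \,\middle|\, \hat{\boldsymbol{\Omega}}'_{\pi} \right) \right]^{-1} \nabla_{\boldsymbol{\Xi}} Q\left( \boldsymbol{\Omega}_{\pi}^{(t)} \,\middle|\, \hat{\boldsymbol{\Omega}}'_{\pi} \right) \tag{85}$$

where $\boldsymbol{\Omega}_{\pi}^{(t)} = [(\hat{\boldsymbol{\Xi}}^{(t)})^T, (\hat{\boldsymbol{\pi}}')^T]^T$, and $\kappa_t \in (0, 1)$ is the $t$-th step size computed by using a backtracking line search [30].

For completeness, the explicit expressions for the gradient and Hessian of the expected log-likelihood function with respect to $\boldsymbol{\Xi}$ are provided.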
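The gradient $\nabla_{\boldsymbol{\Xi}} Q(\boldsymbol{\Omega}_{\pi}^{(t)} | \hat{\boldsymbol{\Omega}}'_{\pi})$ consists of the quantities $\frac{\partial}{\partial \theta_l} Q(\boldsymbol{\Omega}_{\pi}^{(t)} | \hat{\boldsymbol{\Omega}}'_{\pi})$ and $\frac{\partial}{\partial \xi_l^{(j)}} Q(\boldsymbol{\Omega}_{\pi}^{(t)} | \hat{\boldsymbol{\Omega}}'_{\pi})$ for different $j$ and $l$, which can be computed by

$$\frac{\partial}{\partial \theta_l} Q\left( \boldsymbol{\Omega}_{\pi}^{(t)} \,\middle|\, \hat{\boldsymbol{\Omega}}'_{\pi} \right) = \sum_{j=1}^{N} \sum_{k=1}^{K_j} \left\{ \upsilon_{jk}^{(1)} \frac{1}{\tilde{q}_{j r_{jk}}^{(k)}} \frac{\partial}{\partial \theta_l} \tilde{q}_{j r_{jk}}^{(k)} + \upsilon_{jk}^{(0)} \frac{1}{q_{j r_{jk}}^{(k)}} \frac{\partial}{\partial \theta_l} q_{j r_{jk}}^{(k)} \right\} \tag{86}$$

and

$$\frac{\partial}{\partial \xi_l^{(j)}} Q\left( \boldsymbol{\Omega}_{\pi}^{(t)} \,\middle|\, \hat{\boldsymbol{\Omega}}'_{\pi} \right) = \sum_{k=1}^{K_j} \upsilon_{jk}^{(1)} \frac{1}{\tilde{q}_{j r_{jk}}^{(k)}} \frac{\partial}{\partial \xi_l^{(j)}} \tilde{q}_{j r_{jk}}^{(k)}. \tag{87}$$

The elements of the Hessian $\nabla_{\boldsymbol{\Xi}}^2 Q(\boldsymbol{\Omega}_{\pi}^{(t)} | \hat{\boldsymbol{\Omega}}'_{\pi})$ can be calculated by the following expressions:

$$\begin{aligned}
\frac{\partial^2}{\partial \theta_l \partial \theta_m} Q\left( \boldsymbol{\Omega}_{\pi}^{(t)} \,\middle|\, \hat{\boldsymbol{\Omega}}'_{\pi} \right) = \sum_{j=1}^{N} \sum_{k=1}^{K_j} \Bigg\{ & \upsilon_{jk}^{(1)} \left[ \frac{1}{\tilde{q}_{j r_{jk}}^{(k)}} \frac{\partial^2 \tilde{q}_{j r_{jk}}^{(k)}}{\partial \theta_l \partial \theta_m} - \frac{1}{\left( \tilde{q}_{j r_{jk}}^{(k)} \right)^2} \frac{\partial \tilde{q}_{j r_{jk}}^{(k)}}{\partial \theta_l} \frac{\partial \tilde{q}_{j r_{jk}}^{(k)}}{\partial \theta_m} \right] \\
& + \upsilon_{jk}^{(0)} \left[ \frac{1}{q_{j r_{jk}}^{(k)}} \frac{\partial^2 q_{j r_{jk}}^{(k)}}{\partial \theta_l \partial \theta_m} - \frac{1}{\left( q_{j r_{jk}}^{(k)} \right)^2} \frac{\partial q_{j r_{jk}}^{(k)}}{\partial \theta_l} \frac{\partial q_{j r_{jk}}^{(k)}}{\partial \theta_m} \right] \Bigg\},
\end{aligned} \tag{88}$$

$$\frac{\partial^2}{\partial \theta_l \partial \xi_m^{(j)}} Q\left( \boldsymbol{\Omega}_{\pi}^{(t)} \,\middle|\, \hat{\boldsymbol{\Omega}}'_{\pi} \right) = \sum_{k=1}^{K_j} \upsilon_{jk}^{(1)} \left[ \frac{1}{\tilde{q}_{j r_{jk}}^{(k)}} \frac{\partial^2 \tilde{q}_{j r_{jk}}^{(k)}}{\partial \theta_l \partial \xi_m^{(j)}} - \frac{1}{\left( \tilde{q}_{j r_{jk}}^{(k)} \right)^2} \frac{\partial \tilde{q}_{j r_{jk}}^{(k)}}{\partial \theta_l} \frac{\partial \tilde{q}_{j r_{jk}}^{(k)}}{\partial \xi_m^{(j)}} \right], \tag{89}$$

$$\frac{\partial^2}{\partial \xi_l^{(j)} \partial \xi_m^{(j)}} Q\left( \boldsymbol{\Omega}_{\pi}^{(t)} \,\middle|\, \hat{\boldsymbol{\Omega}}'_{\pi} \right) = \sum_{k=1}^{K_j} \upsilon_{jk}^{(1)} \left[ \frac{1}{\tilde{q}_{j r_{jk}}^{(k)}} \frac{\partial^2 \tilde{q}_{j r_{jk}}^{(k)}}{\partial \xi_l^{(j)} \partial \xi_m^{(j)}} - \frac{1}{\left( \tilde{q}_{j r_{jk}}^{(k)} \right)^2} \frac{\partial \tilde{q}_{j r_{jk}}^{(k)}}{\partial \xi_l^{(j)}} \frac{\partial \tilde{q}_{j r_{jk}}^{(k)}}{\partial \xi_m^{(j)}} \right], \tag{90}$$

and

$$\frac{\partial^2}{\partial \xi_l^{(i)} \partial \xi_m^{(j)}} Q\left( \boldsymbol{\Omega}_{\pi}^{(t)} \,\middle|\, \hat{\boldsymbol{\Omega}}'_{\pi} \right) = 0, \quad \text{if } i \ne j. \tag{91}$$

The quantities in (86)–(91) are all evaluated at $\boldsymbol{\Omega}_{\pi}^{(t)}$. Repeating the calculation of (85) until $\{\hat{\boldsymbol{\Xi}}^{(t)}\}$ converges, the limit point $\hat{\boldsymbol{\Xi}}$ of $\{\hat{\boldsymbol{\Xi}}^{(t)}\}$ is the solution of (84), and also the updated estimate of $\boldsymbol{\Xi}$.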
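The E-step weights in (78) and the closed-form update in (83) can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it assumes the pmf values at the observed bins, $q_{j r_{jk}}^{(k)}$ and $\tilde{q}_{j r_{jk}}^{(k)}$, have already been evaluated at the current estimate $\hat{\boldsymbol{\Xi}}'$ and are passed in as nested lists (hypothetical inputs `q_obs` and `q_tilde_obs`), and it omits the Newton update of $\boldsymbol{\Xi}$ in (85).

```python
def e_step(q_obs, q_tilde_obs, pi_prev):
    """Posterior attack weights upsilon^(1)_{jk} from (78).

    q_obs[j][k]       : unattacked pmf value at the observed bin of sample k, sensor j
    q_tilde_obs[j][k] : attacked pmf value at the same bin
    pi_prev[j]        : current estimate of the attack probability of sensor j
    """
    upsilon = []
    for q_j, qt_j, pi_j in zip(q_obs, q_tilde_obs, pi_prev):
        row = []
        for q, qt in zip(q_j, qt_j):
            attacked = pi_j * qt
            row.append(attacked / (attacked + (1.0 - pi_j) * q))
        upsilon.append(row)
    return upsilon

def update_pi(upsilon):
    """Closed-form M-step for pi in (83): average weight per sensor."""
    return [sum(row) / len(row) for row in upsilon]

# Toy inputs: sensor 0 looks unattacked, sensor 1 looks attacked.
q_obs = [[0.3, 0.3], [0.1, 0.1]]
q_tilde_obs = [[0.1, 0.1], [0.3, 0.3]]
pi_prev = [0.5, 0.5]

upsilon = e_step(q_obs, q_tilde_obs, pi_prev)
pi_new = update_pi(upsilon)
print(pi_new)
```

In a full iteration these two functions would be followed by the Newton step for $\boldsymbol{\Xi}$ and then repeated until the estimates stop changing appreciably.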
The convergence of the EM algorithm is guaranteed and the detailed analysis can be found in [28], [31]; that is to say, by iteratively alternating between the E-step and the M-step, a locally optimal solution for (74) can be obtained. It is worth mentioning that since we do not require a very accurate solution for the relaxed optimization problem in (74), once the difference between the updated and current estimates is sufficiently small, we can terminate the iterations in the EM algorithm and utilize the current estimate of $\boldsymbol{\Omega}_{\pi}$ in the following rounding step.

B. Constrained Variable Threshold Rounding and Barrier Method

By utilizing the EM algorithm as illustrated in Section IV-A, we can obtain the solution $\hat{\boldsymbol{\Omega}}_{\pi}$ for the relaxed optimization problem in (74). The element $\hat{\pi}_j$ of $\hat{\boldsymbol{\Omega}}_{\pi}$ can be interpreted as the probability of the $j$-th sensor being attacked over time. However, according to (72c) and (73), we know that before relaxation, $\hat{\pi}_j \in \{0, 1\}$ and $\mathbf{1}^T \hat{\boldsymbol{\pi}} < N/2$. We therefore consider the task of rounding $\hat{\boldsymbol{\pi}}$ to a valid binary vector. To accomplish this task, we propose a constrained variable threshold rounding (CVTR) approach which is based on the heuristic developed by Zymnis et al. [32]. The basic idea of the CVTR is that we first round $\hat{\boldsymbol{\pi}}$ to generate a set of most likely probability vectors $\{\tilde{\boldsymbol{\pi}}^{(l)}\}$ with binary elements which satisfy the constraint in (72c). Then, under constraint (72d), the joint maximum likelihood estimate of the desired vector parameter and attack vector parameters is pursued over the generated set of valid binary probability vectors $\{\tilde{\boldsymbol{\pi}}^{(l)}\}$.

We first generate the set of the most likely valid binary probability vectors $\{\tilde{\boldsymbol{\pi}}^{(l)}\}$ by employing the CVTR, which can be described as

$$\left\{ \tilde{\boldsymbol{\pi}}^{(l)} \right\} \triangleq \left\{ \operatorname{sgn}\left( \hat{\boldsymbol{\pi}} - \lambda \mathbf{1} \right) : 0 \le \lambda \le 1, \ \left\| \operatorname{sgn}\left( \hat{\boldsymbol{\pi}} - \lambda \mathbf{1} \right) \right\|_1 < \frac{N}{2} \right\}.$$
(92)

Since the $j$-th element $\tilde{\pi}_j^{(l)}$ of $\tilde{\boldsymbol{\pi}}^{(l)}$ denotes the probability the $j$-th sensor is attacked, each probability vector $\tilde{\boldsymbol{\pi}}^{(l)}$ with binary values corresponds to a deterministic state variable vector $\tilde{\boldsymbol{\eta}}^{(l)}$ as follows:

$$\tilde{\boldsymbol{\eta}}^{(l)} = \tilde{\boldsymbol{\pi}}^{(l)}, \quad \forall l. \tag{93}$$

We refer to $\{\tilde{\boldsymbol{\eta}}^{(l)}\}$ as the set of the most likely state variable vectors, and we only consider the combinations in this set. Further, it is seen from (92) that as $\lambda$ increases from 0 to 1, this approach only generates up to $\lfloor N/2 \rfloor$ distinct valid binary probability vectors. Thus, it is feasible to exhaustively evaluate the maximum likelihood function, which is maximized with respect to $\boldsymbol{\Xi}$, for each given $\tilde{\boldsymbol{\eta}}^{(l)}$. As a result, the optimization problem in (72) can be reduced to

$$\hat{\boldsymbol{\Omega}}_R = \left[ \hat{\boldsymbol{\Xi}}_R^T, \hat{\boldsymbol{\eta}}_R^T \right]^T = \arg\max_{\boldsymbol{\eta} \in \{\tilde{\boldsymbol{\eta}}^{(l)}\}} \max_{\boldsymbol{\Xi}} L(\boldsymbol{\Omega}) \tag{94a}$$

$$\text{s.t.} \quad \frac{1}{K_j} \sum_{k=1}^{K_j} \left\| \tilde{\mathbf{q}}_j^{(k)} - \mathbf{q}_j^{(k)} \right\|^2 \ge d_q, \quad \forall \eta_j = 1. \tag{94b}$$

As (94) demonstrates, we need to solve the inner maximization for each candidate state variable vector $\tilde{\boldsymbol{\eta}}^{(l)}$, and then keep the solution which gives rise to the maximal objective function in (94). Noticing that the constraint in (94b) only affects the inner maximization, the inner constrained maximization for each $\tilde{\boldsymbol{\eta}}^{(l)}$ in (94) can be converted to an unconstrained problem by employing a logarithmic barrier function as

$$\max_{\boldsymbol{\Xi}} \left\{ \sum_{j=1}^{N} \sum_{k=1}^{K_j} \left[ \tilde{\eta}_j^{(l)} \ln \tilde{q}_{j r_{jk}}^{(k)} + \left( 1 - \tilde{\eta}_j^{(l)} \right) \ln q_{j r_{jk}}^{(k)} \right] + \mu \sum_{j=1}^{N} \tilde{\eta}_j^{(l)} \ln \left( \frac{1}{K_j} \sum_{k=1}^{K_j} \left\| \tilde{\mathbf{q}}_j^{(k)} - \mathbf{q}_j^{(k)} \right\|^2 - d_q \right) \right\}, \tag{95}$$

where the positive barrier parameter $\mu$ determines the accuracy with which (95) approximates the inner constrained maximization in (94). Since the objective function in (95) is differentiable, the unconstrained problem in (95) can be solved by Newton's method, as in Section IV-A2b, for any given $\mu$.
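The candidate generation in (92) amounts to sweeping a single threshold over the relaxed probabilities. The sketch below is one possible reading, under the assumption that $\operatorname{sgn}(\cdot)$ in (92) acts as a unit step (entries strictly above the threshold map to 1, all others to 0); the function name `cvtr_candidates` is hypothetical.

```python
def cvtr_candidates(pi_hat):
    """Distinct binary vectors 1{pi_hat > lam}, lam in [0, 1], with fewer than N/2 ones."""
    n = len(pi_hat)
    candidates = []
    # The thresholded vector only changes when lam crosses an entry of pi_hat,
    # so it suffices to try lam = 0 and each distinct entry.
    for lam in sorted(set(pi_hat) | {0.0}):
        vec = tuple(1 if p > lam else 0 for p in pi_hat)
        if sum(vec) < n / 2 and vec not in candidates:
            candidates.append(vec)
    return candidates

pi_hat = [0.9, 0.1, 0.8, 0.05, 0.2, 0.0]
for eta in cvtr_candidates(pi_hat):
    print(eta)
```

Each returned tuple plays the role of one $\tilde{\boldsymbol{\eta}}^{(l)}$; the inner maximization over $\boldsymbol{\Xi}$ would then be run once per candidate.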
Let $\hat{\boldsymbol{\Xi}}_{\mu}^{(l)}$ denote the solution of (95) for any given $\tilde{\boldsymbol{\eta}}^{(l)}$ and $\mu$, and let $L_*^{(l)}$ represent the optimal objective value of the inner constrained maximization in (94a) for any given $\tilde{\boldsymbol{\eta}}^{(l)}$. It can be shown that as $\mu \to 0$, any limit point $\hat{\boldsymbol{\Xi}}_*^{(l)}$ of the sequence $\{\hat{\boldsymbol{\Xi}}_{\mu}^{(l)}\}_{\mu}$ is a solution of the inner constrained maximization in (94) [33]. Thus, we can obtain an accurate solution of the inner constrained maximization in (94) by iteratively solving (95) for a sequence $\{\mu_m\}$ of positive barrier parameters, which decrease monotonically to zero, such that the solution $\hat{\boldsymbol{\Xi}}_{\mu_m}^{(l)}$ for $\mu_m$ is chosen as the starting point for the next iteration with barrier parameter $\mu_{m+1}$. By defining $l^* \triangleq \arg\max_l L_*^{(l)}$, the solution of the constrained optimization problem in (94) can be obtained as

$$\hat{\boldsymbol{\Omega}}_R = \left[ \hat{\boldsymbol{\Xi}}_R^T, \hat{\boldsymbol{\eta}}_R^T \right]^T = \left[ \left( \hat{\boldsymbol{\Xi}}_*^{(l^*)} \right)^T, \left( \tilde{\boldsymbol{\eta}}^{(l^*)} \right)^T \right]^T. \tag{96}$$

C. Discussion

It is well known that the condition number of the Hessian matrix of the logarithmic barrier function in (95) might become increasingly large as the barrier parameter decreases to 0. In order to overcome this ill-conditioning issue in practical computation, a numerically stable approximation of the Newton direction can be utilized in Newton's method for solving (95) with a small barrier parameter, see [33] and the references therein.

It is worth mentioning that to preserve generality, we do not make additional assumptions to ensure the convexity of the objective functions in this section. Hence, the EM algorithm and Newton's method involved in our approach might converge to a locally optimal point if the starting point is not close to the globally optimal point. To mitigate this possibility, multiple starting points can be employed, and we choose the one that yields the maximal objective function at convergence [27].
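The warm-started continuation over decreasing barrier parameters can be illustrated on a toy one-dimensional problem. This sketch is not the problem in (95): it assumes a hypothetical objective $-(x-1)^2$ with the single constraint $x \ge 2$, so the barrier objective is $-(x-1)^2 + \mu \ln(x-2)$, and it uses a simple Newton iteration whose step is halved until the iterate stays strictly feasible, rather than the backtracking line search or the stabilized Newton direction referenced in the text.

```python
import math

def barrier_newton(x, mu, max_iter=100, tol=1e-12):
    """Maximize -(x - 1)^2 + mu * ln(x - 2) over x > 2, starting from a feasible x."""
    for _ in range(max_iter):
        grad = -2.0 * (x - 1.0) + mu / (x - 2.0)
        hess = -2.0 - mu / (x - 2.0) ** 2
        step = -grad / hess
        # Halve the Newton step until the new point is strictly feasible.
        while x + step <= 2.0:
            step *= 0.5
        x += step
        if abs(step) < tol:
            break
    return x

x = 3.0                                   # strictly feasible starting point
for m in range(7):
    mu = 10.0 ** (-m)                     # mu = 1, 0.1, ..., 1e-6
    x = barrier_newton(x, mu)             # warm start from the previous solution
    print(mu, x)
```

As $\mu$ shrinks, the iterates approach the constrained maximizer $x = 2$ from the interior, while the Hessian term $\mu/(x-2)^2$ grows, which is the ill-conditioning effect mentioned above.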
The proposed approach in (94)–(96) uses all sensor data, whether attacked or not, in an attempt to optimize the described objective function. Thus, for some scenarios where the spoofing attacks are not OGDSAs, similar to [7], one expects the proposed approach will outperform the estimation approach which only utilizes the unattacked data to estimate the desired vector parameter. For an example, please refer to the numerical results in Section V-C.

V. NUMERICAL RESULTS

In this section, we investigate the performance of the approaches proposed in Section IV for some practical cases. The numerical results show that under OGDSAs, the approaches proposed in Section IV are not able to obtain performance that is better than the optimal performance which ignores the attacked sensors, which is consistent with the theory described in Section III.

A. DRFM Attacks in MIMO Radars

First, we consider a MIMO radar with 1 transmit station and $N = 10$ moderately spaced receive stations under the spoofing attack using a DRFM in a generalization of (6)–(9). The first 3 receive stations are under attack. Each station makes $M$ measurements of each pulse in the pulse train, and employs an identical 4-bit quantizer with a set of thresholds $\{-\infty, -5, -4, ..., 9, \infty\}$ to convert analog measurements to quantized data before transmitting them to the FC. Without any attack, the $m$-th measurement of the $k$-th pulse in the pulse train at the $j$-th station can be expressed as

$$x_{jm}^{(k)} = \sqrt{E_j}\, a_j\, s\left( t_{jm}^{(k)} - \theta_j \right) + n_{jm}^{(k)}, \tag{97}$$

where $\theta_j$ is the desired parameter, $m = 1, 2, ..., M$, $k = 1, 2, ..., K$, and $K$ is the total number of pulses in the pulse train.
Similar to (8), if the $j$-th station is under attack, the $m$-th measurement of the $k$-th pulse in the pulse train is

$$\tilde{x}^{(k)}_{jm} = \sqrt{E_j}\, a_j\, s\!\left(t^{(k)}_{jm} - \theta_j - \xi_j\right) + n^{(k)}_{jm}, \tag{98}$$

where $\xi_j$ is the delay introduced by the DRFM. Assume $\{n^{(k)}_{jm}\}$ is an independent and identically distributed zero-mean Gaussian noise sequence with variance $\sigma^2$. The signal $s(t)$ is a Gaussian pulse signal [34], that is,

$$s(t) = \left(\frac{2}{T^2}\right)^{\frac{1}{4}} \exp\!\left(-\frac{\pi t^2}{T^2}\right), \tag{99}$$

and the sampling times are $t^{(k)}_{jm} = (m-1)\Delta t$, $\forall m = 1, 2, \ldots, M$. Moreover, we assume the distance between the target and any receiving station is much larger than the distance between every pair of stations, and hence, we can assume $\theta_j = \theta$ for all $j$. In the simulations, let $T = 0.1$, $\Delta t = 0.001$, $\theta = 0.02$, $M = 3$, $\sigma^2 = 5$, and $E_j = 1$, $a_j = 1$ for all $j$. In addition, the values of the attack parameters are $\xi_1 = 0.04$, $\xi_2 = 0.05$, $\xi_3 = 0.06$, and the threshold defined in Assumption 2 in (65) is $d_q = 0.15$.

We first test the performance of the approach which employs the random relaxation (RR) with the EM and CVTR in identifying the attacked and unattacked sensors. Fig. 2 illustrates the Monte Carlo approximation (1000 times) of the ensemble average of the percentage of all misclassified sensors as a function of the number $K$ of pulses in the pulse train. As Fig. 2 shows, the average percentage of misclassified sensors decreases towards 0 as $K$ increases.

Fig. 2: Performance of identifying the DRFM attacks. (Plot: average percentage of misclassified sensors (%) versus the number $K$ of pulses.)

Next, we examine the estimation performance of the proposed approaches in Section IV, that is, the approach which employs the RR with the EM, and the approach that employs the RR with the EM and CVTR. Fig. 3 depicts the mean squared error (MSE) performance of the two approaches for estimating $\theta$ on a log scale. For comparison, in Fig. 3, the CRB for $\theta$ which knows which sensors are attacked and only uses data from unattacked sensors is also provided (see footnote 5). It is seen that as $K$ increases, the MSE performance of the approach with CVTR for estimating $\theta$ converges to the CRB for $\theta$ which knows which sensors are attacked and only uses data from unattacked sensors. The large $K$ results in Fig. 3 also corroborate the previous theoretical results in Section III that under OESAs, the attacked data are not useful to reduce the CRB. In addition, the MSE performance of the approach with CVTR is shown to be better than that of the approach which only employs the RR with the EM algorithm, which implies that the proposed constrained variable threshold rounding can further improve the performance for estimating the desired parameter.

Footnote 5: It is worth mentioning that the CRB for $\theta$ which knows which sensors are attacked and only uses data from unattacked sensors is equal to the CRB for estimating $\theta$ under ISA. Hence, the blue curve marked with squares in Fig. 3 also indicates the CRB performance under ISA.

Fig. 3: Estimation performance of the proposed approaches under DRFM attacks. (Plot: MSE on a log scale versus the number $K$ of pulses. Curves: RR with the EM; RR with the EM and CVTR; unattacked data only CRB for $\theta$.)

B. Data-injection Attacks in Sensor Networks

Next consider the specific attacks on the range-based localization system described in [3]. The attackers modify the receivers to alter the received signal strength to confuse the localization. Consider a case with $N = 10$ closely spaced sensors. Each sensor makes $K$ measurements of the physical phenomenon, and employs an identical 4-bit quantizer with a set of thresholds $\{0, \pm 1, \pm 2, \ldots, \pm 7, \pm\infty\}$ to convert analog measurements to quantized data before transmitting them to the FC. From [3], the received signal strength before attack is

$$x_{jk} = \theta + n_{jk}, \quad \forall k \text{ and } \forall j, \tag{100}$$

where $\theta$ is a deterministic unknown parameter, and $\{n_{jk}\}$ is an i.i.d. zero-mean Gaussian noise sequence with distribution $\mathcal{N}(0, \sigma^2)$. Here, we estimate $\theta$, which allows us to directly calculate the common distance to the emitter, due to a one-to-one relationship. Further, we assume that the first 3 sensors in the sensor network are under data-injection spoofing attacks. The after-attack measurements are described as

$$\tilde{x}_{jk} = \theta + a_{jk} + n_{jk}, \quad \forall k \text{ and } \forall j = 1, 2, 3, \tag{101}$$

where $a_{jk}$ is the unknown attack injected at the $j$-th sensor at time $k$.

We consider the scenario that $\theta$ and $\sigma^2$ are both the parameters of interest. Moreover, the unknown injected attacks $\{a_{jk}\}$ are independent random variables, where $a_{jk}$ obeys the Gaussian distribution $\mathcal{N}(\alpha_j, \beta_j)$ for all $k$. In the simulations, the desired vector parameter $\boldsymbol{\theta} \triangleq [\theta, \sigma^2]^T$ and the attack vector parameters $\{\boldsymbol{\xi}^{(j)} \triangleq [\alpha_j, \beta_j]^T\}_{j=1,2,3}$ are $\boldsymbol{\theta} = [1, 3]^T$, $\boldsymbol{\xi}^{(1)} = [-1.5, 1]^T$, $\boldsymbol{\xi}^{(2)} = [-2, 2]^T$, and $\boldsymbol{\xi}^{(3)} = [1.5, 1]^T$. The threshold defined in Assumption 2 in (65) is $d_q = 0.04$.

The performance of the approach which employs the RR with the EM and CVTR in identifying the attacked and unattacked sensors is illustrated in Fig. 4. Fig. 4 depicts the Monte Carlo approximation (1000 times) of the ensemble average of the percentage of all misclassified sensors versus the number $K$ of time samples at each sensor. It is seen from Fig. 4 that the average percentage of misclassified sensors reduces towards 0 as $K$ increases.

Fig. 4: Performance of identifying the data-injection attacks. (Plot: average percentage of misclassified sensors (%) versus the number $K$ of time samples at each sensor.)

In Fig. 5 and Fig. 6, we plot the MSE performance of our proposed approaches for jointly estimating $\theta$ and $\sigma^2$ on a log scale. The CRBs for estimating $\theta$ and $\sigma^2$ which know which sensors are attacked and only use data from unattacked sensors are also respectively plotted in Fig. 5 and Fig. 6 for comparison. As Fig. 5 and Fig. 6 show, the MSEs of the approach which employs the RR with the EM and CVTR for jointly estimating $\theta$ and $\sigma^2$ respectively converge to the corresponding CRBs which know which sensors are attacked and only use data from unattacked sensors, and outperform the approach which only employs the RR with the EM algorithm. The large $K$ results in Fig. 5 and Fig. 6 again support the previous theoretical results in Section III that under OESAs, attacked data is not useful to reduce the CRB.

Fig. 5: Estimation performance of the proposed approaches for estimating $\theta$. (Plot: MSE on a log scale versus the number $K$ of time samples at each sensor. Curves: RR with the EM for $\theta$; RR with the EM and CVTR for $\theta$; unattacked data only CRB for $\theta$.)

Fig. 6: Estimation performance of the proposed approaches for estimating $\sigma^2$. (Plot: MSE on a log scale versus the number $K$ of time samples at each sensor. Curves: RR with the EM for $\sigma^2$; RR with the EM and CVTR for $\sigma^2$; unattacked data only CRB for $\sigma^2$.)

Note that the model in (100) is the most studied model in the sensor network estimation literature, typically employed for nonlocalization applications. Thus, the analysis is useful in these other applications also.

C. Data-injection Attacks in MIMO Radars

It is worth mentioning that the data-injection attacks are OGDSAs for the specific estimation problem described in Section V-B, but they may not satisfy the necessary and sufficient conditions for an OGDSA for other estimation problems.
To demonstrate this, we consider the data-injection attacks in the time delay estimation problem described in Section V-A. To be specific, if the $j$-th station is under a data-injection attack, rather than (98), the $m$-th after-attack measurement of the $k$-th pulse in the pulse train is

$$\tilde{x}^{(k)}_{jm} = \sqrt{E_j}\, a_j\, s\!\left(t^{(k)}_{jm} - \theta_j\right) + \xi_j + n^{(k)}_{jm}, \tag{102}$$

where the signal $s(t)$ is defined in (99), $\{n^{(k)}_{jm}\}$ is an independent and identically distributed zero-mean Gaussian noise sequence with variance $\sigma^2$, and the sampling times are $t^{(k)}_{jm} = (m-1)\Delta t$, $\forall m = 1, 2, \ldots, M$. In the following simulations, the values of most of the system parameters are set to be the same as those in Section V-A, except that $M = 40$ and the values of the attack parameters are $\xi_1 = 1$, $\xi_2 = -2$ and $\xi_3 = -1$. In addition, the threshold defined in Assumption 2 in (65) is chosen to be $d_q = 0.075$.

We first examine the performance of the proposed approach in identifying the attacked and unattacked sensors in the presence of data-injection attacks in the time delay estimation problem. Fig. 7 depicts the Monte Carlo approximation (500 times) of the ensemble average of the percentage of all misclassified sensors versus the number $K$ of pulses in the pulse train. It is seen that the average percentage of misclassified sensors reduces to 0 very rapidly as $K$ increases. Fig. 8 presents the MSE performance of the proposed estimation approaches for estimating $\theta$ plotted on a log scale. The CRB for estimating $\theta$ which knows which sensors are attacked and uses data from all sensors is also plotted in Fig. 8 along with the CRB for estimating $\theta$ which knows which sensors are attacked but only uses data from unattacked sensors. Fig. 8 shows that the CRB which uses only the unattacked sensor data is strictly larger than the CRB which uses all the data.
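An informal numerical check, not taken from the paper, contrasts how the parameters enter the noise-free measurements in (98) and (102). Under (98) the samples depend on $\theta_j$ and $\xi_j$ only through the sum $\theta_j + \xi_j$, whereas under (102) a change in $\theta_j$ alters the samples by an amount that varies with $m$, so a constant offset $\xi_j$ cannot absorb it. The sketch below assumes $E_j = a_j = 1$, uses $T$, $\Delta t$ and $M = 40$ from the simulation setup, and uses hypothetical function names and test values; it is an intuition about identifiability only and does not replace the necessary and sufficient conditions of Section III.

```python
import math

T = 0.1      # pulse width parameter from Section V-A
DT = 0.001   # sampling interval Delta t from Section V-A
M = 40       # samples per pulse from Section V-C

def s(t):
    """Gaussian pulse of (99)."""
    return (2.0 / T**2) ** 0.25 * math.exp(-math.pi * t**2 / T**2)

def mean_delay_attack(theta, xi):
    """Noise-free samples under (98), assuming E_j = a_j = 1."""
    return [s(m * DT - theta - xi) for m in range(M)]

def mean_injection_attack(theta, xi):
    """Noise-free samples under (102), assuming E_j = a_j = 1."""
    return [s(m * DT - theta) + xi for m in range(M)]

# (98): two different (theta, xi) pairs with the same sum theta + xi
# (hypothetical values chosen only for this check).
a = mean_delay_attack(0.02, 0.04)
b = mean_delay_attack(0.03, 0.03)
max_gap_delay = max(abs(u - v) for u, v in zip(a, b))

# (102): changing theta changes the samples by an amount that varies with m,
# so no single constant offset xi can compensate for it.
c = mean_injection_attack(0.02, 0.0)
d = mean_injection_attack(0.03, 0.0)
diffs = [u - v for u, v in zip(c, d)]
spread_injection = max(diffs) - min(diffs)

print(max_gap_delay, spread_injection)
```

The first quantity is zero up to floating-point rounding, while the second is clearly nonzero, which is loosely in line with the observation from Fig. 8 that the attacked data carry usable information about $\theta$ under (102).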
In an OGDSA, the data at the attacked sensor cannot be used to improve performance, as stated in Section III. Thus the results in Fig. 8 illustrate that the data-injection attack is not an OGDSA for the time delay estimation problem in (102). This can also be verified using our theory. Interestingly, each of the EM-based approaches we described provides an MSE very close to the CRB from using all data, as shown in Fig. 8.

Fig. 7: Performance of identifying the data-injection attacks in the time delay estimation problem. (Plot: average percentage of misclassified sensors (%) versus the number $K$ of time samples at each sensor.)

Fig. 8: Estimation performance of the proposed approaches for estimating $\theta$ in the presence of the data-injection attacks in the time delay estimation problem. (Plot: MSE on a log scale versus the number $K$ of pulses. Curves: RR with the EM; RR with the EM and CVTR; unattacked data only CRB for $\theta$; all data CRB for $\theta$.)

VI. CONCLUSION

In this paper, we study the distributed estimation of a deterministic vector parameter by using quantized data in the presence of spoofing attacks. A generalized attack model is employed which manipulates the data using transformations with arbitrary functional forms determined by some attack parameters whose values are unknown to the attacked system. Novel necessary and sufficient conditions are provided under which these transformations provide an OGDSA. It is shown that an OGDSA implies that either the FIM under the conditions of Definition 1 for jointly estimating the desired and attack parameters is singular or that the attacked system is unable to improve the CRB under the conditions of Definition 1 for the desired vector parameter even though the joint FIM is nonsingular.
It is demonstrated that it is always possible to construct an OGDSA by properly employing a sufficiently large dimension attack vector parameter relative to the number of quantization levels employed, which was not observed previously. In addition, we demonstrate that under the conditions of Definition 1, a spoofing attack can corrupt the original measurements to make them useless in terms of reducing the CRB for estimating the desired vector parameter if and only if it is an OGDSA. In order to illustrate the theory in a concrete way, we also provide some numerical results considering some OGDSAs. For a specific class of OGDSAs, an enhanced EM-based algorithm that attempts to use all the attacked and unattacked data to jointly estimate the desired and attack parameters is shown, for a sufficient number of observations, to essentially achieve the CRB which knows which sensors are attacked and only uses data from unattacked sensors. This tallies with the theoretical results that the attacked data is not useful under an OGDSA. For completeness, we specify the EM-based algorithm for general attacks and enhance it with a heuristic rounding approach previously suggested by others in a different application, which seems to significantly improve the EM-based algorithm.

REFERENCES

[1] I. Akyildiz, W. Su, Y. Sankarasubramaniam, and E. Cayirci, "A survey on sensor networks," Communications Magazine, IEEE, vol. 40, no. 8, pp. 102–114, 2002.
[2] Z. Li, W. Trappe, Y. Zhang, and B. Nath, "Robust statistical methods for securing wireless localization in sensor networks," in Information Processing in Sensor Networks, 2005. IPSN 2005. Fourth International Symposium on, April 2005, pp. 91–98.
[3] J. H. Lee and R. Buehrer, "Characterization and detection of location spoofing attacks," Communications and Networks, Journal of, vol. 14, no. 4, pp. 396–409, Aug 2012.
[4] S. Cui, Z. Han, S. Kar, T. T. Kim, H. V. Poor, and A. Tajer, "Coordinated data-injection attack and detection in the smart grid: A detailed look at enriching detection solutions," Signal Processing Magazine, IEEE, vol. 29, no. 5, pp. 106–115, 2012.
[5] A. Vempaty, L. Tong, and P. Varshney, "Distributed inference with Byzantine data: State-of-the-art review on data falsification attacks," Signal Processing Magazine, IEEE, vol. 30, no. 5, pp. 65–75, 2013.
[6] V. Nadendla, Y. S. Han, and P. K. Varshney, "Distributed inference with m-ary quantized data in the presence of byzantine attacks," Signal Processing, IEEE Transactions on, vol. 62, no. 10, pp. 2681–2695, 2014.
[7] J. Zhang, R. S. Blum, X. Lu, and D. Conus, "Asymptotically optimum distributed estimation in the presence of attacks," Signal Processing, IEEE Transactions on, vol. 63, no. 5, pp. 1086–1101, March 2015.
[8] B. Alnajjab, J. Zhang, and R. S. Blum, "Attacks on sensor network parameter estimation with quantization: Performance and asymptotically optimum processing," Signal Processing, IEEE Transactions on, vol. 63, no. 24, pp. 6659–6672, Dec 2015.
[9] J. Zhang and R. S. Blum, "Distributed joint spoofing attack identification and estimation in sensor networks," in Signal and Information Processing (ChinaSIP), 2015 IEEE China Summit and International Conference on. IEEE, 2015, pp. 701–705.
[10] R. Niu and J. Lu, "False information detection with minimum mean squared errors for bayesian estimation," in Information Sciences and Systems (CISS), 2015 49th Annual Conference on. IEEE, 2015, pp. 1–6.
[11] H. C. Papadopoulos, G. W. Wornell, and A. V. Oppenheim, "Sequential signal encoding from noisy measurements using quantizers with dynamic bias control," Information Theory, IEEE Transactions on, vol. 47, no. 3, pp. 978–1002, 2001.
[12] J.-J. Xiao, A. Ribeiro, Z.-Q. Luo, and G. B. Giannakis, "Distributed compression-estimation using wireless sensor networks," Signal Processing Magazine, IEEE, vol. 23, no. 4, pp. 27–41, 2006.
[13] A. Ribeiro and G. B. Giannakis, "Bandwidth-constrained distributed estimation for wireless sensor networks-part I: Gaussian case," Signal Processing, IEEE Transactions on, vol. 54, no. 3, pp. 1131–1143, 2006.
[14] R. Niu and P. K. Varshney, "Target location estimation in sensor networks with quantized data," Signal Processing, IEEE Transactions on, vol. 54, no. 12, pp. 4519–4528, 2006.
[15] J. Fang and H. Li, "Hyperplane-based vector quantization for distributed estimation in wireless sensor networks," Information Theory, IEEE Transactions on, vol. 55, no. 12, pp. 5682–5699, 2009.
[16] P. Venkitasubramaniam, L. Tong, and A. Swami, "Quantization for maximin ARE in distributed estimation," Signal Processing, IEEE Transactions on, vol. 55, no. 7, pp. 3596–3605, 2007.
[17] S. Roome, "Digital radio frequency memory," Electronics Communication Engineering Journal, vol. 2, no. 4, pp. 147–153, Aug 1990.
[18] D. Liu, P. Ning, A. Liu, C. Wang, and W. K. Du, "Attack-resistant location estimation in wireless sensor networks," ACM Transactions on Information and System Security (TISSEC), vol. 11, no. 4, p. 22, 2008.
[19] T. T. Kim and H. V. Poor, "Strategic protection against data injection attacks on power grids," Smart Grid, IEEE Transactions on, vol. 2, no. 2, pp. 326–333, 2011.
[20] O. Kosut, L. Jia, R. J. Thomas, and L. Tong, "Malicious data attacks on the smart grid," Smart Grid, IEEE Transactions on, vol. 2, no. 4, pp. 645–658, 2011.
[21] S. Kim, W. Kuperman, W. Hodgkiss, H. Song, G. Edelmann, and T. Akal, "Robust time reversal focusing in the ocean," The Journal of the Acoustical Society of America, vol. 114, p. 145, 2003.
[22] E. Bland, "GPS 'spoofing' could threaten national security," http://www.nbcnews.com/id/26992456, 2008, [Online].
[23] A. Couts, "Want to see this $80 million super yacht sink? with GPS spoofing, now you can!" http://www.digitaltrends.com/mobile/gps-spoofing/, 2013, [Online].
[24] M. I. Skolnik, Introduction to Radar Systems, 2nd ed. New York: McGraw Hill Book Co., 1980.
[25] G. Grachev, "Theory of acoustic field invariants in layered waveguides," Acoustical physics, vol. 39, no. 1, pp. 33–35, 1993.
[26] G. D'spain, J. Murray, W. Hodgkiss, N. Booth, and P. Schey, "Mirages in shallow water matched field processing," The Journal of the Acoustical Society of America, vol. 105, no. 6, pp. 3245–3265, 1999.
[27] S. M. Kay, Fundamentals of Statistical Signal Processing, Volume I: Estimation theory. Upper Saddle River, NJ: Prentice Hall, 1993.
[28] A. P. Dempster, N. M. Laird, and D. B. Rubin, "Maximum likelihood from incomplete data via the EM algorithm," Journal of the Royal Statistical Society. Series B (Methodological), vol. 39, no. 1, pp. 1–38, 1977.
[29] G. J. McLachlan and T. Krishnan, The EM algorithm and extensions. Wiley, 1997.
[30] S. P. Boyd and L. Vandenberghe, Convex optimization. Cambridge university press, 2004.
[31] C. F. J. Wu, "On the convergence properties of the EM algorithm," The Annals of Statistics, vol. 11, no. 1, pp. 95–103, 1983.
[32] A. Zymnis, S. Boyd, and D. Gorinevsky, "Relaxed maximum a posteriori fault identification," Signal Process., vol. 89, no. 6, pp. 989–999, Jun. 2009.
[33] S. Nash, R. Polyak, and A. Sofer, "A numerical comparison of barrier and modified barrier methods for large-scale bound-constrained optimization," in Large Scale Optimization, W. Hager, D. Hearn, and P. Pardalos, Eds. Springer US, 1994, pp. 319–338.
[34] Q. He, R. S. Blum, and A. M. Haimovich, "Noncoherent MIMO radar for location and velocity estimation: More antennas means better performance," Signal Processing, IEEE Transactions on, vol. 58, no. 7, pp. 3661–3680, 2010.
