An optimization framework for resilient batch estimation in Cyber-Physical Systems


Authors: Alexandre Kircher, Laurent Bako, Eric Blanco, Mohamed Benallouch

An optimization framework for resilient batch estimation in Cyber-Physical Systems

Alexandre Kircher¹, Laurent Bako¹, Eric Blanco¹, Mohamed Benallouch²

Abstract—This paper proposes a class of resilient state estimators for LTV discrete-time systems. The dynamic equation of the system is assumed to be affected by a bounded process noise. As to the available measurements, they are potentially corrupted by a noise of both dense and impulsive natures. The latter, in addition to being arbitrary in its form, need not be strictly bounded. In this setting, we construct the estimator as the set-valued map which associates to the measurements the minimizing set of some appropriate performance functions. We consider a family of such performance functions, each of which yields a specific instance of the general estimator. It is then shown that the proposed class of estimators enjoys the property of resilience, that is, it induces an estimation error which, under certain conditions, is independent of the extreme values of the (impulsive) measurement noise. Hence, the estimation error may be bounded while the measurement noise is virtually unbounded. Moreover, we provide several error bounds (in different configurations) whose expressions depend explicitly on the degree of observability of the system being observed and on the considered performance function. Finally, a few simulation results are provided to illustrate the resilience property.

Index Terms—Secure state estimation, resilient estimators, optimal estimation, cyber-physical systems.

I. INTRODUCTION

Context. We consider in this work the problem of designing state estimators which would be resilient against an (unknown) sparse noise sequence affecting the measurements.
By sparse noise we refer here to a signal sequence which is of impulsive nature, that is, a sequence which is most of the time equal to zero, except at a few instants where it can take on arbitrarily large values. The problem is relevant, for example, in the supervision of Cyber-Physical Systems (CPS) [8]. In this application, the supervisory data may be collected by spatially distributed sensors and then sent to a distant processing unit through some communication network. During the transmission, the data may incur intermittent packet losses or adversarial attacks consisting in, e.g., the injection of arbitrary signals. Beyond CPS, there are many other applications where the sparse noise model of uncertainty is relevant: robust statistics [14], hybrid system identification [1], intermittent sensor fault detection, etc.

Related works. The problem of estimating the state of CPS under attacks has been investigated recently through many different approaches. Since the measurements are assumed to be affected by a sequence of outliers which is sparse in time, a natural scheme of solution to the state estimation problem may be to first process the data so as to detect the occurrences of the nonzero instances of that sparse noise, remove the corrupted data and then proceed with classical estimation methods such as the Kalman filter or the Luenberger type of observer [17], [20]. While this approach sounds a priori reasonable, the main challenge remains to achieve an efficient detection and isolation of the outliers.

¹ A. Kircher, L. Bako and E. Blanco are with Université de Lyon, Laboratoire Ampère (Ecole Centrale Lyon, CNRS UMR 5005), F-69134. E-mails: alexandre.kircher, laurent.bako, eric.blanco@ec-lyon.fr
² M. Benallouch is with Université de Lyon, ECAM Lyon, LabECAM, F-69321 Lyon, France. E-mail: mohamed.benallouch@ecam.fr
Regarding the scenarios where the sporadic noise is modeled in a probabilistic setting, there exists a body of interesting results providing performance limits of estimation schemes [25], [18], [21]. Another category of approaches, inspired by some recent results in compressive sampling [7], [11], relies on sparsity-inducing optimization techniques. A striking feature of these methods is that they do not treat separately the tasks of detection, data cleaning and estimation. Instead, an implicit discrimination of the wrong data is induced by some specific properties of the to-be-minimized cost function. One of the first works that puts forward this approach for the resilient state estimation problem is the one reported in [10]. There, it is assumed that only a fixed number of sensors are subject to attacks (sparse over time but otherwise arbitrary disturbances). The challenge then resides in the fact that at each time instant, one does not know which sensor is compromised. Note however that the assumptions in [10] were quite restrictive, as no dense process noise or measurement noise (other than the sparse attack signal) was considered. These limitations opened the way for later extensions in many directions. For example, [24] suggests a reformulation which is argued to reduce computational cost by using the concept of event-triggered update; [19] considers an observation model which includes dense noise along with the sparse attack signal. In [9], the assumption of a fixed number of attacked sensors is relaxed. Finally, the recent paper [13] proposes a unified framework for analyzing the resilience capabilities of most of these (convex) optimization-based estimators. Although a bound on the estimation error was derived in that paper, it is not quantitatively related to the properties (e.g., observability) of the dynamic system being observed.
The state estimation problem treated there is rather viewed as a linear regression problem, similarly to [2], [5].

Contributions. The contributions of the current paper consist in the design and the analysis of a class of optimization-based resilient estimators for Linear Time-Varying (LTV) discrete-time systems. The available model of the system assumes bounded noise in both the dynamics and the observation equation, with the latter being possibly affected, additionally, by an unknown but sparse attack signal. Contrary to the settings considered in some existing works, we do not impose here any restriction on the number of sensors which are subject to attacks, that is, any sensor can be compromised at any time. Note also that no statistical significance is attached to the uncertainties modeled by noise. In this setting, by generalizing our previous work reported in [16], the current paper proposes a general (robust) estimation framework for the state of LTV systems. We propose a class of state estimators constructed as the set-valued maps which associate to the output measurements the minimizing set of some appropriate performance functions. A variety of performance functions are considered for the design of the estimator and handled in a unified analysis framework: convex nonsmooth/smooth loss functions and nonconvex saturated ones. Our main theoretical results concern the resilience analysis of the proposed class of estimators. We show that the estimation error associated with the new class of estimators can be made, under certain conditions, insensitive to the (possibly very large) amplitude of the sparse attack signal. The proposed analysis relies on new quantitative characterizations of the observability property of the system whose state is being observed.
Although the derived error bounds may be conservative, they have the important advantage of being explicitly expressible as functions of the properties of the considered dynamic system and those of the optimized loss functions. This makes them valuable qualitative tools for assessing the impact of the estimator's design parameters and of the system matrices on the quality of the estimation. For example, the proposed error bounds reflect the intuition that the more observable the system is with respect to the new criteria, the larger the number of instances of gross values (of the output noise) it can handle and the smaller the values of the bounds. Finally, the paper shows that for some choices of the design (loss) functions, some instances of the proposed family of estimators enjoy the exact recoverability property in the particular situation where the measurements are corrupted only by sparse noise. We present a condition for this property that can be numerically verified by means of convex optimization.

Overall, in comparison with [13], which also considers resilient estimation though in a linear regression setting, we (i) introduce here an alternative definition of resilience, (ii) characterize quantitatively the impact of intrinsic properties (observability) of the system being observed on the quality of the estimation, and (iii) derive an explicit expression of a bound on the estimation error.

Outline. The rest of the paper is structured as follows. We start by introducing in Section II the settings for the resilient state estimation problem. We then define in Section III the new class of optimization-based estimators proposed here to address this problem. The analysis of this new framework is presented in Section IV. In Section V, we further discuss the properties of a special constrained version of the initial class of estimators. In Section VI, we comment on the numerical verification of the conditions derived in the analysis part.
Some numerical results are described in Section VII and, finally, concluding remarks are given in Section VIII.

Notation. R_{≥0} (respectively R_{>0}) is the set of nonnegative (respectively positive) reals. R* designates the set of real numbers excluding zero. We denote by R^a the set of (column) vectors with a real elements and by R^{a×b} the set of real matrices with a rows and b columns. If M ∈ R^{a×b}, then M^⊤ will designate the transpose of M. I will refer to the (square) identity matrix of appropriate dimension. The notation ‖·‖ will denote a norm over a given set (to be specified when necessary). ‖·‖_p denotes the ℓ_p norm (for p ≥ 1) or the ℓ_p quasi-norm (for 0 < p < 1) defined for z = [z_1 ··· z_a]^⊤ in R^a by ‖z‖_p = (|z_1|^p + ··· + |z_a|^p)^{1/p}. The limit of this as p ↓ 0 gives the so-called ℓ_0-norm ‖·‖_0 of z, i.e., the number of nonzero entries of z. Its limit as p ↑ +∞ gives the infinity norm, denoted ‖z‖_∞, which returns the maximum value of the |z_i|. For x ∈ R, e^x refers to the exponential function applied to x. If S is a set, then P(S) is the power set of S. If S is a finite set, the notation |S| refers to the cardinality of S.

K_∞ functions [15]. We denote by K_∞ the set of functions f : R_{≥0} → R_{≥0} which are continuous, zero at zero, strictly increasing and satisfy lim_{λ→+∞} f(λ) = +∞. If f ∈ K_∞, then it admits an inverse, denoted here f^{−1}, which also lies in K_∞. Similarly, we use the notation K_{sat,a} to denote the set of saturated functions f : R_{≥0} → [0, a] which are continuous, zero at zero, strictly increasing on [0, a] and such that f(λ) = f(a) for all λ ≥ a.

Supremum. Given a function f over R^a and a subset S of R^a, the notation sup_{z∈S} f(z) < b, with b ∈ R, will mean that for all z in S, f(z) < b. This notation includes the case where the supremum is b but is not attained by any element of S.

II.
THE RESILIENT ESTIMATION PROBLEM

Consider a discrete-time Linear Time-Varying (LTV) system described by

Σ:  x_{t+1} = A_t x_t + w_t,   y_t = C_t x_t + f_t    (1)

where x_t ∈ R^n is the state vector of the system at time t and y_t ∈ R^{n_y} is the output vector at time t; {A_t} and {C_t} are families of matrices with appropriate dimensions; {w_t} is an unobserved bounded noise sequence. As to {f_t}, it is regarded here as an (unobserved) arbitrary noise sequence affecting the measurements. For clarity of the exposition, it may be convenient to view f_t as a combination of two types of sequences: a bounded sequence {v_t} and a sparse sequence {s_t} (this decomposition is indeed always possible for an arbitrary noise signal). Hence, we may write

f_t = v_t + s_t,    (2)

where v_t is a sensor noise of moderate amplitude and s_t is a sparse noise sequence in the sense that its (entrywise and/or timewise) components are mostly equal to zero but its nonzero elements can take on (possibly) arbitrarily large values. Such a sparse sequence {s_t} may account for adversarial attacks in the same spirit as in [10], [13], intermittent sensor faults, or data losses, in particular when a communication network is involved in the data acquisition–transmission chain. In the sequel, we may also refer to {w_t} and {v_t} in (1) and (2) as dense noises and to the largest elements of {s_t} as outliers.

For the sake of simplicity, the sparse (and potentially arbitrarily large) noise is assumed here to affect only the measurement equation. Note however that the proposed analysis method can be extended to the more general scenario where the sparse noises may affect both the dynamics and the measurements.

Problem. The problem considered in this paper is that of estimating the states x_0, ..., x_{T−1} of the system (1) on a time period T = {0, ..., T−1}, given T measurements y_0, ..., y_{T−1} of the system output.
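As a toy illustration of the noise model (1)–(2), the following pure-Python sketch simulates an LTV system whose output is corrupted by a bounded dense noise and a timewise-sparse attack. All dimensions, matrices, noise levels and attack instants below are illustrative assumptions, not values taken from the paper.

```python
import random

def simulate_ltv(A_seq, C_seq, x0, T, w_bound=0.01, v_bound=0.05,
                 attack_times=(), attack_size=50.0, seed=0):
    """Simulate (1)-(2): x_{t+1} = A_t x_t + w_t, y_t = C_t x_t + v_t + s_t,
    where {w_t}, {v_t} are bounded dense noises and {s_t} is sparse in time.
    Vectors are plain lists; each A_t, C_t is a list of rows."""
    rng = random.Random(seed)
    x, X, Y = list(x0), [], []
    mat = lambda M, v: [sum(m * vi for m, vi in zip(row, v)) for row in M]
    for t in range(T):
        X.append(list(x))
        # measurement: dense noise v_t plus a sparse, possibly huge s_t
        s = attack_size if t in attack_times else 0.0
        Y.append([ci + rng.uniform(-v_bound, v_bound) + s
                  for ci in mat(C_seq[t], x)])
        # state update with bounded process noise w_t
        x = [ai + rng.uniform(-w_bound, w_bound) for ai in mat(A_seq[t], x)]
    return X, Y

# Example: a 2-state system observed by one sensor, attacked at t = 3 and t = 7.
T = 10
A = [[[0.9, 0.2], [-0.2, 0.9]]] * T
C = [[[1.0, 0.0]]] * T
X, Y = simulate_ltv(A, C, [1.0, 0.0], T, attack_times={3, 7})
```

At the attacked instants the measurement is dominated by the gross term s_t, while elsewhere y_t stays close to the noise-free output, which is exactly the regime the resilient estimator is meant to handle.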
We shall seek an accurate estimate of the state despite the uncertainties in the system equations (1) modeled by w_t and f_t, the characteristics of which are informally described above. In particular, we would like the to-be-designed estimator to produce an estimate such that the estimation error is, when possible, independent of the maximum amplitude of {f_t}. Such an estimator will be called resilient; see Definition 2 for a formal characterization of this property.

Denote with

X_{T−1} = [x_0  x_1  ···  x_{T−1}]    (3)

the actual state trajectory of the system Σ on a finite time horizon of length T. Similarly, we use the notation

Y_{T−1} = [y_0  y_1  ···  y_{T−1}]    (4)

to refer to the collection of measurements on the same time horizon. The state estimation problem is approached here from an offline perspective, therefore T is fixed. For the sake of simplicity, the T index will be dropped from the variable names and it will be assumed that signal matrices without an index represent values on the period T = {0, ..., T−1}. To simplify the formulas further, we also pose T_0 = {0, ..., T−2}, while S = {1, ..., n_y} will be a set indexing the sensors (or the rows of the matrices C_t in (1)).

III. OPTIMIZATION-BASED APPROACH TO RESILIENT STATE ESTIMATION

A. The state estimator

In this section we present an optimization-based framework for solving the state estimation problem defined above. To define formally the proposed state estimator, let us first introduce the to-be-minimized objective function.
Given the matrices {(A_t, C_t)} of the system (1) and T output measurements Y = [y_0 ··· y_{T−1}], we consider a performance function V_Σ : R^{n_y×T} × R^{n×T} → R_{≥0} defined by

V_Σ(Y, Z) = λ Σ_{t∈T_0} φ_t(z_{t+1} − A_t z_t) + Σ_{t∈T} ψ_t(y_t − C_t z_t)    (5)

where Z = [z_0 ··· z_{T−1}] ∈ R^{n×T} is a hypothetical trajectory matrix with z_i denoting the i-th column of Z; λ > 0 is a user-defined parameter which aims at balancing the contributions of the two terms involved in the expression of the performance index V_Σ(Y, Z); {φ_t} and {ψ_t} are two families of positive functions (called here loss functions) defined on R^n and R^{n_y} respectively. For the sake of simplicity, we will assume throughout the paper that for all t in T, φ_t and ψ_t can be expressed as

φ_t(z) = φ(W_t z)  ∀ z ∈ R^n,    (6)
ψ_t(e) = ψ(V_t e)  ∀ e ∈ R^{n_y},    (7)

where φ : R^n → R_{≥0} and ψ : R^{n_y} → R_{≥0} are two fixed loss functions and {W_t} and {V_t} are two families of nonsingular weighting matrices with appropriate dimensions.

Definition 1. Given a system Σ such as the one in (1) and given an output measurement matrix Y ∈ R^{n_y×T}, we define a state estimator to be a set-valued map E : R^{n_y×T} → P(R^{n×T}) which maps Y to a subset of the space of possible trajectories of the system.

We consider a class of state estimators defined by

E(Y) = arg min_{Z∈R^{n×T}} V_Σ(Y, Z).    (8)

As such, the estimator E is well-defined if for any fixed Y, V_Σ(Y, ·) admits a nonempty minimizing set, that is, if there exists at least one Z* such that V_Σ(Y, Z) ≥ V_Σ(Y, Z*) for all Z ∈ R^{n×T}. To ensure this property we will need to put an observability assumption on the system whose state is being estimated and require some further properties of the loss functions φ and ψ entering in the definition of the objective function V_Σ.

B.
Well-definedness of the estimator

Let us start by stating the properties required of the loss functions involved in the definition of V_Σ. Due to the multiple usages that will be made of these properties, it is convenient to state them for a generic loss function defined on a set of matrices (of which vectors constitute a special case). Throughout this paper, a loss function is a positive function ξ : R^{a×b} → R_{≥0} which will be required to satisfy a subset (depending on the specific usage) of the following properties:

(P1) Positive definiteness: ξ(0) = 0 and ξ(Z) > 0 for all Z ≠ 0.
(P2) Continuity: ξ is continuous.
(P3) Symmetry: ξ(−Z) = ξ(Z) for all Z ∈ R^{a×b}.
(P4) Generalized Homogeneity (GH): There exists a K_∞ function q : R_{≥0} → R_{≥0} such that for all λ ∈ R* and for all Z ∈ R^{a×b},

ξ(Z) ≥ q(1/|λ|) ξ(λZ).    (9)

(P5) Generalized Triangle Inequality (GTI): There exists a positive real number γ_ξ such that for all Z_1, Z_2 in R^{a×b},

ξ(Z_1 − Z_2) ≥ γ_ξ ξ(Z_1) − ξ(Z_2).    (10)

It can be usefully observed, for the future developments, that (10) can be equivalently written as ξ(Z_1 + Z_2) ≤ γ_ξ^{−1} ξ(Z_1) + γ_ξ^{−1} ξ(Z_2).

Examples of loss functions. Note that norms on R^{a×b} naturally satisfy the properties (P1)–(P5) with q : λ ↦ λ and γ_ξ = 1, hence yielding the classic homogeneity property and triangle inequality. It can also be checked that functions ξ of the form ξ(Z) = ‖Z‖_p with p > 0 fully qualify as loss functions in the sense that they fulfill all the properties (P1)–(P5). In this case, γ_ξ in (10) can be taken equal to 2^{1−1/p} if 0 < p ≤ 1 and 2^{1−p} otherwise. Lastly, we note that if ℓ : R^{a×b} → R_{≥0} satisfies (P1)–(P3) and (P5), then so does the function ξ defined by ξ(Z) = 1 − e^{−ℓ(Z)} (see Lemma 9 in the appendix). Similarly, saturated functions of the form ξ(Z) = min(ℓ(Z), R_0) for some R_0 > 0 satisfy (P1)–(P3) and (P5).
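The GTI (10) with the constants given above can be sanity-checked numerically. The following sketch (an illustration, not part of the paper) tests ξ = ‖·‖_p on random vector pairs: γ_ξ = 1 for a true norm (p ≥ 1, the classical reverse triangle inequality) and γ_ξ = 2^{1−1/p} for the quasi-norm case 0 < p < 1.

```python
import random

def lp(z, p):
    """l_p norm (p >= 1) or quasi-norm (0 < p < 1) of a vector z."""
    return sum(abs(zi) ** p for zi in z) ** (1.0 / p)

def gti_holds(p, gamma, trials=1000, dim=4, seed=1):
    """Check xi(Z1 - Z2) >= gamma*xi(Z1) - xi(Z2) for xi = lp on random pairs."""
    rng = random.Random(seed)
    for _ in range(trials):
        z1 = [rng.uniform(-5, 5) for _ in range(dim)]
        z2 = [rng.uniform(-5, 5) for _ in range(dim)]
        lhs = lp([a - b for a, b in zip(z1, z2)], p)
        if lhs < gamma * lp(z1, p) - lp(z2, p) - 1e-9:
            return False
    return True

print(gti_holds(2.0, 1.0))               # True: triangle inequality for l_2
print(gti_holds(0.5, 2 ** (1 - 1/0.5)))  # True: p = 0.5 with gamma = 1/2
```

The p = 0.5 case follows from the quasi-norm inequality ‖a + b‖_p ≤ 2^{1/p−1}(‖a‖_p + ‖b‖_p), i.e., the equivalent form of (10) noted above.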
In the case of convex functions, a link can be established between (P4) and (P5).

Lemma 1 ([16]). If ξ : R^{a×b} → R_{≥0} is convex and satisfies property (P4) with a K_∞ function q, then it also satisfies (P5) with γ_ξ = 2q(1/2).

Observe that quadratic functions ξ : R^{a×b} → R_{≥0} of the form ξ(Z) = Tr(Z^⊤ Q Z), with Q ∈ R^{a×a} a positive definite matrix and Tr referring to the trace of a matrix, satisfy properties (P1)–(P4) with the K_∞ function q : λ ↦ λ². Since such functions are convex, it follows from Lemma 1 above that they also verify (P5) with γ_ξ = 2q(1/2) = 1/2.

Remark 1. In virtue of (6)–(7), the families of functions {φ_t} and {ψ_t} satisfy (P1)–(P5) whenever φ and ψ satisfy (P1)–(P5).

We now recall from [16] a technical lemma which will play a fundamental role in analyzing the properties of the estimator (8). In particular, our proof of well-definedness relies on this lemma.

Lemma 2 (Lower bound on a loss function). Let ξ : R^{a×b} → R_{≥0} be a function which has properties (P1)–(P2) and (P4) with a K_∞ function q. Then, for any norm ‖·‖ on R^{a×b},

ξ(Z) ≥ D q(‖Z‖)  ∀ Z ∈ R^{a×b}    (11)

where

D = min_{‖Z‖=1} ξ(Z) > 0.    (12)

Proof: We start by observing that the unit hypersphere S = {Z ∈ R^{a×b} : ‖Z‖ = 1} is a compact set in the topology induced by the norm ‖·‖. By the extreme value theorem, ξ, being continuous, necessarily admits a minimum value on S, i.e., there is Z* ∈ S such that ξ(Z) ≥ D ≜ ξ(Z*) > 0 for all Z ∈ S. For any nonzero Z ∈ R^{a×b}, Z/‖Z‖ ∈ S, so that ξ(Z/‖Z‖) ≥ D. On the other hand, by the relaxed homogeneity of ξ, ξ(Z) ≥ q(‖Z‖) ξ(Z/‖Z‖) ≥ D q(‖Z‖). Moreover, this inequality holds for Z = 0. It therefore holds true for any Z ∈ R^{a×b}.

Proposition 1 (Well-definedness of the estimator).
Let the loss functions φ and ψ in (6)–(7) satisfy properties (P1)–(P5) and assume that the LTV system (1) is observable on [0, T−1] in the sense that the observability matrix

O_{0,T−1} ≜ [(C_0)^⊤  (C_1 A_0)^⊤  ···  (C_{T−1} A_{T−2} ··· A_1 A_0)^⊤]^⊤    (13)

has full column rank. Then the estimator (8) is well-defined, i.e., the objective function V_Σ(Y, ·) attains its minimum for any fixed Y.

Hence, the condition of the proposition guarantees that E(Y) is nonempty for all Y ∈ R^{n_y×T}. Before proving this result, we first make the following observation.

Lemma 3 (Equivalent condition of observability). Consider the objective function V_Σ defined in (5), where {(φ_t, ψ_t)} are defined as in (6)–(7) with φ and ψ satisfying (P1)–(P4). Then the following two statements are equivalent:
(i) The system is observable on the time interval [0, T−1].
(ii) There exists a K_∞ function q such that for all Z = [z_0  z_1  ···  z_{T−1}] in R^{n×T},

V_Σ(0, Z) ≥ q(‖z_0‖).    (14)

A proof of this lemma is reported in Appendix B. The function q can be interpreted here as a gain function which measures how much the system is observable with regard to the two families {φ_t} and {ψ_t}: the more observable the system, the more q amplifies the magnitude of its argument, making different trajectories more discernible.

Proof of Proposition 1: The idea of the proof is to show that V_Σ(Y, ·) is coercive (i.e., continuous and radially unbounded) for any given Y and then apply a result¹ in [22, Thm 1.9] to conclude on the attainability of the infimum (which certainly exists since V_Σ(Y, ·) is a positive function). Clearly, V_Σ(Y, ·) is continuous as a consequence of φ and ψ being continuous by assumption (see property (P2)). We then just need to prove the radial unboundedness of V_Σ(Y, ·), i.e., lim_{‖Z‖→+∞} V_Σ(Y, Z) = +∞ for an arbitrary norm ‖·‖ on the Z-space and for all fixed Y.
Since ψ satisfies property (P5), there exists a constant γ_ψ > 0 such that ψ_t(y_t − C_t z_t) ≥ γ_ψ ψ_t(C_t z_t) − ψ_t(y_t). Applying this property leads naturally to

V_Σ(Y, Z) ≥ F(Z) − Σ_{t∈T} ψ_t(y_t),

where

F(Z) = λ Σ_{t∈T_0} φ_t(z_{t+1} − A_t z_t) + γ_ψ Σ_{t∈T} ψ_t(C_t z_t).    (15)

It can then be shown (following a similar reasoning as in Appendix B), under the observability assumption, that F satisfies the conditions of Lemma 2. It follows that for any norm ‖·‖ on R^{n×T}, there exists a K_∞ function q such that F(Z) ≥ q(‖Z‖). Combining this with the inequality above, we obtain that

V_Σ(Y, Z) ≥ q(‖Z‖) − Σ_{t∈T} ψ_t(y_t),

which implies the radial unboundedness of V_Σ(Y, ·) for any fixed Y. Hence the estimator (8) is well-defined as stated.

¹ Note that radial unboundedness is equivalent to level-boundedness in the terminology of [22].

As it turns out from Proposition 1, observability of system (1) and properties (P1)–(P4) imposed on φ and ψ ensure that E(Y) is a nonempty set for any given Y. We then call any member X̂ = [x̂_0  x̂_1  ···  x̂_{T−1}] of E(Y) an estimate of the state trajectory X of system (1) on the time interval T. In particular, x̂_t is called an estimate of the state x_t at time t ∈ T.

To conclude this section, note that the definition of the estimator in (8) does not require any convexity assumption on the objective function V_Σ. Hence the theoretical analysis to be presented in the next sections does not make use of convexity either. However, we may prefer in practice to select convex loss functions φ and ψ. In effect, the elements of E(Y) are not necessarily expressible through an explicit formula, so in practice one would instead resort to numerical solvers to approach the solution of the underlying optimization problem. And the numerical search process is known to be more efficient when V_Σ(Y, Z) is a convex function of Z [4], [12].
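To make the objective (5) concrete, the sketch below evaluates V_Σ for the particular (convex) choice φ = ψ = ‖·‖_1 with identity weighting matrices, an assumption made here for illustration only. On noise-free data the true trajectory attains the minimum value V_Σ = 0, illustrating the nominal-correctness property discussed later.

```python
def V_sigma(Y, Z, A_seq, C_seq, lam=1.0):
    """Performance index (5) with phi = psi = l1 losses (a valid choice
    satisfying (P1)-(P5)) and identity weights W_t = V_t = I."""
    l1 = lambda v: sum(abs(vi) for vi in v)
    mat = lambda M, v: [sum(m * vi for m, vi in zip(row, v)) for row in M]
    T = len(Y)
    cost = 0.0
    for t in range(T - 1):  # dynamics residuals over T_0 = {0,...,T-2}
        Az = mat(A_seq[t], Z[t])
        cost += lam * l1([zi - ai for zi, ai in zip(Z[t + 1], Az)])
    for t in range(T):      # measurement residuals over T
        Cz = mat(C_seq[t], Z[t])
        cost += l1([yi - ci for yi, ci in zip(Y[t], Cz)])
    return cost

# Noise-free scalar example: x_{t+1} = 0.5 x_t, y_t = x_t, x_0 = 2.
T = 5
A = [[[0.5]]] * T
C = [[[1.0]]] * T
X_true = [[2.0 * 0.5 ** t] for t in range(T)]
Y_star = [[x[0]] for x in X_true]
print(V_sigma(Y_star, X_true, A, C))  # 0.0
```

In practice one would hand this objective to a convex solver over the trajectory variable Z rather than evaluate it pointwise; the ℓ_1 measurement loss is precisely the kind of sparsity-inducing choice discussed above.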
Nevertheless, it is fair to recognize that nonconvex optimization methods can be successfully implemented as well, though with fewer theoretical guarantees of reaching global optimality with general-purpose solvers.

IV. THE RESILIENCE PROPERTY OF THE PROPOSED CLASS OF ESTIMATORS

In this section, we prove that the state estimator proposed in (8) possesses the resilience property under some conditions. More specifically, our main result states that the estimation error X̂ − X, i.e., the difference between the real state trajectory and the estimated one, admits an upper bound which does not depend on the amplitude of the outliers contained in {f_t}, provided that the number of such outliers is below some threshold.

A. Definition of the resilience of an estimator

Let us start with a formal definition of the resilience property for a state estimator of the form (8). For this purpose, let Y* = [y*_0 ··· y*_{T−1}] denote the noise-free output matrix of (1), i.e., the output defined by y*_t = C_t x*_t, t = 0, ..., T−1, with x*_{t+1} = A_t x*_t and x*_0 = x_0 (x_0 being the true initial state of (1)). Let F_r, a subset of R^{n_y×T} containing 0, denote a set of measurement noise matrices.

Definition 2 (Resilience of an estimator). The set-valued estimator E defined in (8) is called resilient against the set F_r of measurement noise if there exists a K_∞ function g such that, when the process noise {w_t} is zero, it holds for any measurement noise matrix F ∈ R^{n_y×T} that

‖X̂ − X‖ ≤ g( inf_{Ω∈F_r} d(F − Ω) )  ∀ X̂ ∈ E(Y* + F)    (16)

with ‖·‖ denoting some norm and d : R^{n_y×T} → R_{≥0} being a function subject to (P1)–(P5). Hence inf_{Ω∈F_r} d(F − Ω) denotes some pseudo-distance from F to the set F_r.

Since 0 ∈ F_r, a consequence of property (16) is that E(Y*) = {X}, which follows from (16) with F = 0.
This fact expresses correctness of the estimator in a nominal situation, i.e., its ability to recover the true state matrix X in the absence of any uncertainty in the a priori known model. Indeed, this condition is guaranteed to hold if the system Σ is observable over the considered observation time horizon T. Another key implication of condition (16) is that the estimation error associated with a resilient estimator is totally insensitive to any measurement noise matrix F which lies in F_r, that is, E(Y* + F) = {X} for all measurement noise F ∈ F_r.

Throughout this paper, we consider a set F_r defined as follows. For F = [f_0 ··· f_{T−1}] ∈ R^{n_y×T}, let T_0^c(F) = {t ∈ T : ψ_t(f_t) > 0} and T_0(F) = {t ∈ T : ψ_t(f_t) = 0}. For r a positive integer, define F_r to be the set of matrices in R^{n_y×T} having at most r nonzero columns, i.e.,

F_r = {F : |T_0^c(F)| ≤ r}.    (17)

In order to make explicit the resilience property (in the results to be presented) with respect to the set F_r, we will need the following lemma.

Lemma 4. Consider the set F_r of measurement noise matrices defined in (17) and select a (pseudo-distance) function d : R^{n_y×T} → R_{≥0} defined by d(F) = Σ_{t∈T} ψ_t(f_t), with ψ_t a function defined as in (7) and having the properties (P1)–(P5). Then inf_{Ω∈F_r} d(F − Ω) is equal to the sum of the T − r smallest terms in {ψ_t(f_t) : t ∈ T}.

Proof: Let I_r^c(F) denote the index set of the r largest entries of the vector [ψ_0(f_0) ··· ψ_{T−1}(f_{T−1})] and I_r(F) denote the index set of its T − r smallest entries.
Then, with the notation Ω = [ω_0 ··· ω_{T−1}],

inf_{Ω∈F_r} d(F − Ω)
= inf_{Ω∈F_r} Σ_{t∈T} ψ_t(ω_t − f_t)
= inf_{Ω∈F_r} [ Σ_{t∈I_r(F)} ψ_t(ω_t − f_t) + Σ_{t∈I_r^c(F)} ψ_t(ω_t − f_t) ]
= inf_{Ω∈F_r, T_0(Ω)=I_r(F)} [ Σ_{t∈I_r(F)} ψ_t(f_t) + Σ_{t∈I_r^c(F)} ψ_t(ω_t − f_t) ]
= Σ_{t∈I_r(F)} ψ_t(f_t),

where the notation T_0(Ω) is defined in the lines preceding Eq. (17). The infimum is reached here for Ω ∈ F_r such that ω_t = 0 for all t ∈ I_r(F) and ω_t = f_t for all t ∈ I_r^c(F). Hence inf_{Ω∈F_r} d(F − Ω) is, as claimed, the sum of the T − r smallest values among {ψ_t(f_t) : t ∈ T}.

We will also introduce in the sequel a notion of approximate resilience of E. This terminology refers to Definition 2 when the right-hand side of (16) is modified to g( inf_{Ω∈F_r} d(F − Ω) + δ ), with δ some nonnegative real number.

B. Some notational conventions for the analysis

For convenience, let us introduce a few more notations. Let Φ : R^{n×T} → R_{≥0} and Ψ_T : R^{n×T} → R_{≥0} be defined by

Φ(Z) = Σ_{t∈T_0} φ_t(z_{t+1} − A_t z_t),    (18)
Ψ_T(Z) = Σ_{t∈T} ψ_t(C_t z_t).    (19)

We also introduce the partial cost function Ψ_I defined for any I ⊂ T by Ψ_I(Z) = Σ_{t∈I} ψ_t(C_t z_t). We will assume throughout the paper that the loss functions φ and ψ satisfy a subset of the properties (P1)–(P5); in particular, when they are required to satisfy the GTI (P5), we will denote the associated positive constants by γ_φ and γ_ψ respectively. Finally, let us pose

H_Σ(Z) = λ γ_φ Φ(Z) + γ_ψ Ψ_T(Z).    (20)

We will organize the resilience analysis for the estimator (8) along two cases: first, the scenario where the gross error vector sequence {s_t} in (2) is block-sparse in time, and then the situation where it is both componentwise and temporally sparse.
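Lemma 4 reduces the pseudo-distance to F_r to a simple order statistic, which the following sketch computes directly (the numeric residual values are made-up for illustration).

```python
def resid_distance(psi_values, r):
    """Lemma 4: with d(F) = sum_t psi_t(f_t), the pseudo-distance from F to
    F_r (matrices with at most r nonzero columns) equals the sum of the
    T - r smallest values among {psi_t(f_t)}: the r largest residuals can
    be cancelled by an admissible Omega, the remaining ones cannot."""
    return sum(sorted(psi_values)[:len(psi_values) - r])

# psi_t(f_t) over T = 6 instants; two gross outliers (40.0 and 55.0).
psi = [0.5, 40.0, 0.0, 0.25, 55.0, 0.75]
print(resid_distance(psi, 2))  # 1.5: outliers cancelled, dense part remains
print(resid_distance(psi, 0))  # 96.5: F_0 = {0}, so this is just d(F)
```

This is exactly the quantity that appears inside g(·) in (16): with r large enough to cover all outliers, the bound depends only on the moderate residuals.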
To be more precise, if we denote with S ∈ R^{n_y×T} the matrix formed from the sequence {s_t : t ∈ T}, then the first case refers to columnwise block-sparsity of S while the second is related to entrywise sparsity. Note that the two cases coincide when the system of interest is single-input single-output (SISO).

C. Resilience to intermittent timewise block-sparse errors

We start by introducing the concept of r-resilience index of an estimator such as the one in (8), a measure which depends on the system matrices, the structure of the performance function V_Σ and the loss functions φ and ψ.

Definition 3. Let r be a nonnegative integer. Assume that the system Σ in (1) is observable on [0, T−1]. We then define the r-resilience index of the estimator E in (8) (when applied to Σ) to be the real number p_r given by

p_r = sup_{Z∈R^{n×T}, Z≠0}  sup_{I⊂T, |I|=r}  Ψ_I(Z) / H_Σ(Z)    (21)

where H_Σ is as defined in (20). The supremum is taken here over all nonzero Z in R^{n×T} and over all subsets I of T with cardinality equal to r.

The index p_r can be interpreted as a quantitative measure of the observability of the system Σ. The observability is needed here to ensure that the denominator H_Σ(Z) of (21) is different from zero whenever Z ≠ 0. Furthermore, it should be remarked that Ψ_I(Z) ≤ H_Σ(Z) for any I ⊂ T, which implies that the defining suprema of p_r are well-defined. Note that p_r is an increasing function of r and satisfies p_0 = 0 and p_T = 1. More discussion on the numerical evaluation of p_r is deferred to Section VI.

In order to state the resilience result for the estimator (8) when applied to system Σ, let us introduce a last notation to be used in the analysis. Let ε ≥ 0 be a given number.
For any admissible sequence $\{f_t\}_{t\in\mathbb{T}}$ in (1), we can split the time index set $\mathbb{T}$ into two disjoint label sets,
$$\mathbb{T}_\varepsilon = \{t\in\mathbb{T} : \psi_t(f_t) \le \varepsilon\}, \qquad (22)$$
indexing those $f_t$ which are upper bounded by $\varepsilon$, and $\mathbb{T}_\varepsilon^c = \{t\in\mathbb{T} : \psi_t(f_t) > \varepsilon\}$ indexing those $f_t$ which are possibly unbounded. It is important to keep in mind that $\varepsilon$ is just a parameter for decomposing the noise sequence into two parts in view of the analysis (and not necessarily a bound on elements of the sequence $\{\psi_t(f_t)\}$). For example, taking $\varepsilon = 0$ would be appropriate for analyzing the properties of the estimator when $f_t$ is strictly sparse and each of its nonzero elements is treated as an outlier.

Theorem 1 (Upper bound on the estimation error). Consider the system $\Sigma$ defined by (1) with output $Y$ together with the state estimator (8) in which the loss functions $\phi$ and $\psi$ are assumed to obey (P1)–(P5). Denote by $\gamma_\phi$ and $\gamma_\psi$ the constants associated with the GTI (P5) and by $q_\phi$ and $q_\psi$ the $\mathcal{K}_\infty$ functions associated with the GH (P4) for $\phi$ and $\psi$ respectively. Let $\varepsilon \ge 0$ and set $r = |\mathbb{T}_\varepsilon^c|$. If the system is observable on $[0, T-1]$ and $p_r < 1/(1+\gamma_\psi)$, then for any norm $\|\cdot\|$ on $\mathbb{R}^{n\times T}$,
$$\|\hat{X} - X\| \le h\Big( \frac{2\beta_\Sigma(\varepsilon)}{D\,[1-(1+\gamma_\psi)p_r]} + \delta(\varepsilon) \Big) \quad \forall \hat{X}\in E(Y) \qquad (23)$$
with $X$ denoting the true state matrix from (1), $D = \min_{\|Z\|=1} H_\Sigma(Z) > 0$, and $\beta_\Sigma(\varepsilon)$, $\delta(\varepsilon)$ and $h$ defined by
$$\beta_\Sigma(\varepsilon) = \lambda \sum_{t\in\mathbb{T}_0} \phi_t(w_t) + \sum_{t\in\mathbb{T}_\varepsilon} \psi_t(f_t), \qquad (24)$$
$$\delta(\varepsilon) = \frac{1-\gamma_\psi}{D\,[1-(1+\gamma_\psi)p_r]} \sum_{t\in\mathbb{T}_\varepsilon^c} \psi_t(f_t), \qquad (25)$$
$$h(\alpha) = \max\big\{ q_\phi^{-1}(\alpha),\ q_\psi^{-1}(\alpha) \big\}, \quad \alpha\in\mathbb{R}_{\ge 0}. \qquad (26)$$

Proof: Let $\hat{X}\in E(Y)$.
By definition of $E$ in (8), we have $V_\Sigma(Y, \hat{X}) \le V_\Sigma(Y, X)$, which gives explicitly
$$\lambda \sum_{t\in\mathbb{T}_0} \phi_t(\hat{x}_{t+1} - A_t \hat{x}_t) + \sum_{t\in\mathbb{T}} \psi_t(y_t - C_t \hat{x}_t) \le \lambda \sum_{t\in\mathbb{T}_0} \phi_t(w_t) + \sum_{t\in\mathbb{T}} \psi_t(f_t). \qquad (27)$$
Using the fact that $x_{t+1} = A_t x_t + w_t$ from (1) and applying the GTI and the symmetry properties of $\phi_t$, we can write
$$\phi_t(\hat{x}_{t+1} - A_t \hat{x}_t) = \phi_t\big(\hat{x}_{t+1} - x_{t+1} - A_t(\hat{x}_t - x_t) + w_t\big) \ge \gamma_\phi\, \phi_t(e_{t+1} - A_t e_t) - \phi_t(w_t)$$
with $e_t = \hat{x}_t - x_t$. It follows that the first term on the left-hand side of (27) is lower bounded as follows:
$$\lambda \sum_{t\in\mathbb{T}_0} \big[\gamma_\phi\, \phi_t(e_{t+1} - A_t e_t) - \phi_t(w_t)\big] \le \lambda \sum_{t\in\mathbb{T}_0} \phi_t(\hat{x}_{t+1} - A_t \hat{x}_t). \qquad (28)$$
Similarly, by making use of (1), observe that $\psi_t(y_t - C_t \hat{x}_t) = \psi_t(f_t - C_t e_t)$. We now apply the GTI and symmetry of $\psi_t$ in two different ways depending on whether $t$ belongs to $\mathbb{T}_\varepsilon$ or $\mathbb{T}_\varepsilon^c$:
$$\forall t\in\mathbb{T}_\varepsilon,\quad \psi_t(y_t - C_t \hat{x}_t) \ge \gamma_\psi\, \psi_t(C_t e_t) - \psi_t(f_t)$$
$$\forall t\in\mathbb{T}_\varepsilon^c,\quad \psi_t(y_t - C_t \hat{x}_t) \ge \gamma_\psi\, \psi_t(f_t) - \psi_t(C_t e_t)$$
These inequalities imply that the second term on the left-hand side of (27) is lower bounded as follows:
$$\sum_{t\in\mathbb{T}_\varepsilon} \big[\gamma_\psi\, \psi_t(C_t e_t) - \psi_t(f_t)\big] + \sum_{t\in\mathbb{T}_\varepsilon^c} \big[\gamma_\psi\, \psi_t(f_t) - \psi_t(C_t e_t)\big] \le \sum_{t\in\mathbb{T}} \psi_t(y_t - C_t \hat{x}_t). \qquad (29)$$
Combining (27), (28) and (29) gives
$$\lambda\gamma_\phi \sum_{t\in\mathbb{T}_0} \phi_t(e_{t+1} - A_t e_t) + \gamma_\psi \sum_{t\in\mathbb{T}} \psi_t(C_t e_t) - (1+\gamma_\psi) \sum_{t\in\mathbb{T}_\varepsilon^c} \psi_t(C_t e_t) \le 2\Big[\lambda \sum_{t\in\mathbb{T}_0} \phi_t(w_t) + \sum_{t\in\mathbb{T}_\varepsilon} \psi_t(f_t)\Big] + (1-\gamma_\psi)\sum_{t\in\mathbb{T}_\varepsilon^c} \psi_t(f_t)$$
which, by using (19), (20), (24), can be written as
$$H_\Sigma(E) - (1+\gamma_\psi)\,\Psi_{\mathbb{T}_\varepsilon^c}(E) \le 2\beta_\Sigma(\varepsilon) + (1-\gamma_\psi)\sum_{t\in\mathbb{T}_\varepsilon^c} \psi_t(f_t)$$
with $E = [\,e_0\ e_1\ \cdots\ e_{T-1}\,]$.
As $\mathbb{T}_\varepsilon^c$ has $r$ elements, applying the definition of $p_r$ gives
$$\Psi_{\mathbb{T}_\varepsilon^c}(E) \le p_r\, H_\Sigma(E). \qquad (30)$$
By the assumption that $p_r < 1/(1+\gamma_\psi)$ we have $1-(1+\gamma_\psi)p_r > 0$ and, consequently,
$$H_\Sigma(E) \le \frac{1}{1-(1+\gamma_\psi)p_r} \Big[ 2\beta_\Sigma(\varepsilon) + (1-\gamma_\psi)\sum_{t\in\mathbb{T}_\varepsilon^c} \psi_t(f_t) \Big]. \qquad (31)$$
Given that the system is observable on $[0, T-1]$, it can be shown, thanks to Lemma 7 in the Appendix, that $H_\Sigma$ satisfies properties (P1)–(P4) (the proof of this is quite similar to that of Lemma 3 in Appendix B). We can therefore apply Lemma 2 to conclude that for any norm $\|\cdot\|$,
$$H_\Sigma(E) \ge D\, q_0(\|E\|) \qquad (32)$$
with $D = \min_{\|Z\|=1} H_\Sigma(Z)$ and $q_0(\alpha) = \min\{q_\phi(\alpha), q_\psi(\alpha)\}$. Finally, the result follows by selecting $h = q_0^{-1}$, with $q_0^{-1}$ denoting the inverse of $q_0$, which can be simplified to match its definition in (26).

Strict resilience. Now we state our (strict) resilience result as a consequence of Theorem 1 when the output error-measuring loss function $\psi$ satisfies the triangle inequality.

Corollary 1 (Resilience property). Let the conditions of Theorem 1 hold with the additional requirement that $\gamma_\psi = 1$. Then
$$\|\hat{X} - X\| \le h\Big( \frac{2\beta_\Sigma(\varepsilon)}{D\,(1 - 2p_r)} \Big) \quad \forall \hat{X}\in E(Y). \qquad (33)$$

Proof: The proof is immediate by considering the bound in (23) and observing that $\delta(\varepsilon)$ expressed in (25) vanishes when $\gamma_\psi = 1$, hence eliminating completely the contribution of the extreme values of $\{f_t\}$ to the error bound. This gives immediately (33). It remains now to make clear that (33) is consistent with the requirement (16) of Definition 2. For this purpose, note that the bound in (33) can be written as $g(\beta_\Sigma(\varepsilon))$ with $g\in\mathcal{K}_\infty$ defined by $g(\alpha) = h\big(2\alpha/(D(1-2p_r))\big)$. Moreover, $\beta_\Sigma(\varepsilon)$ reduces to $\sum_{t\in\mathbb{T}_\varepsilon} \psi_t(f_t) = \inf_{\Omega\in\mathcal{F}_r} d(F-\Omega)$ when the process noise is zero (see (24) and Lemma 4).
Hence $E$ qualifies, in the sense of Definition 2, as an estimator which is resilient to the set $\mathcal{F}_r$ of measurement noise defined in (17).

The resilience property of the estimator (8) lies here in the fact that, under the conditions of Theorem 1 and Corollary 1, the bound (33) on the estimation error does not depend on the magnitudes of the extreme values of the noise sequence $\{f_t\}_{t\in\mathbb{T}}$. Considering in particular the function $\beta_\Sigma(\varepsilon)$, we remark that it can be overestimated as follows:
$$\beta_\Sigma(\varepsilon) \le \lambda \sum_{t\in\mathbb{T}_0} \phi_t(w_t) + |\mathbb{T}_\varepsilon|\,\varepsilon. \qquad (34)$$
The first term on the right-hand side of (34) represents the uncertainty brought by the dense noise $\{w_t\}$ over the whole state trajectory. It is bounded since $\{w_t\}$ is bounded by assumption (see the description of the system in Section II). The second term is a bound on the sum of those instances of $f_t$ whose magnitude is smaller than $\varepsilon$.

Because $\beta_\Sigma$ is a function of $\varepsilon$, the bound (33) in fact represents a family of bounds parameterized by $\varepsilon$. Since $\varepsilon$ is a mere analysis device, a natural question is how to select it so that the analysis achieves the smallest bound. Such favorable values, say $\varepsilon^\star$, satisfy
$$\varepsilon^\star \in \arg\min_{\varepsilon\ge 0} \Big\{ h\Big( \frac{2\beta_\Sigma(\varepsilon)}{D(1-2p_r)} \Big) : r = |\mathbb{T}_\varepsilon^c|,\ p_r < 1/2 \Big\}.$$
Another interesting point is that the inequality stated by Theorem 1 holds for any norm $\|\cdot\|$ on $\mathbb{R}^{n\times T}$. Note though that the value of the bound depends (through the parameter $D$) on the specific norm used to measure the estimation error. Moreover, different choices of the performance-measuring norm will result in different geometric forms for the uncertain set, that is, the ball (in the chosen norm) centered at the true state with radius equal to the upper bound displayed in (33).

We also observe that the smaller the parameter $p_r$ is, the tighter the error bound will be, which suggests that the estimator is more resilient when $p_r$ is lower. A similar reasoning applies to the number $D$, which is desired to be large here.
These two parameters (i.e., $p_r$ and $D$) reflect properties of the system whose state is being estimated. They can be interpreted, to some extent, as measures of the degree of observability of the system. In conclusion, the estimator partially inherits its resilience property from characteristics of the system being observed. This is consistent with the well-known fact that the more observable a system is, the more robustly its state can be estimated from output measurements.

Approximate resilience. As discussed above, the triangle inequality property of the loss function $\psi$ is fundamental for achieving strict resilience. When $\psi$ does not satisfy this property (i.e., when $\gamma_\psi \ne 1$), the term $\delta(\varepsilon)$ in (23) is unlikely to vanish completely. However, we can prevent it from growing excessively by an appropriate choice of $\psi$ in (7). To see this, assume for example that $\psi$ is defined by $\psi(y) = 1 - e^{-\ell(y)}$. Then, since $\psi(y) \le 1$ for all $y$, $\delta(\varepsilon)$ saturates to a constant value regardless of how large the $f_t$ are for $t\in\mathbb{T}_\varepsilon^c$. On the other hand, this choice introduces a new technical challenge related to the fact that the function $q_0$ in (32) is no longer a $\mathcal{K}_\infty$ function but a bounded (saturated) function. Handling this situation will require an additional condition on the upper bound in (31). To sum up, by selecting a saturated loss function for $\psi$, we obtain the following approximate resilience result.

Corollary 2 (Case where $\gamma_\psi \ne 1$). Let the conditions of Theorem 1 hold. Assume further that the loss function $\psi$ in (7) is defined by $\psi(y) = 1 - e^{-\ell(y)}$ where $\ell : \mathbb{R}^{n_y} \to \mathbb{R}_{\ge 0}$ satisfies (P1)–(P5). In particular, assume that property (P4) is satisfied by $\ell$ with a $\mathcal{K}_\infty$ function $q$ such that (9) is an equality relation. Also, let $\varepsilon \ge 0$ be such that
$$b(\varepsilon) \triangleq \frac{2\beta_\Sigma(\varepsilon) + r(1-\gamma_\psi)\, r_o(\varepsilon)}{D\,[1-(1+\gamma_\psi)p_r]} < 1, \qquad (35)$$
where $r = r(\varepsilon) = |\mathbb{T}_\varepsilon^c|$ and $r_o(\varepsilon) = \max_{t\in\mathbb{T}_\varepsilon^c} \psi_t(f_t) \le 1$.
Then there exists a continuous and strictly increasing function $h_{\mathrm{sat}} : [0,1] \to [0,1]$ (obeying $h_{\mathrm{sat}}(0) = 0$ and $h_{\mathrm{sat}}(1) = 1$) such that for any norm $\|\cdot\|$ on $\mathbb{R}^{n\times T}$,
$$\|\hat{X} - X\| \le h_{\mathrm{sat}}^{-1}\big(b(\varepsilon)\big) \quad \forall \hat{X}\in E(Y), \qquad (36)$$
with $D$ in (35) defined as in the proof of Theorem 1 using the norm $\|\cdot\|$.

Proof: That the particular function $\psi$ specified in the statement of the corollary satisfies the properties (P1)–(P3) and (P5) is a question which is fully answered by Lemma 9 in Section C of the appendix. Consequently, let us observe that the inequality (31) arising in the proof of Theorem 1 still holds true here. As to (32), it holds as well, but with the slight difference that $q_0$ is now a saturated function in $\mathcal{K}_{\mathrm{sat},1}$ (as defined in the notation section) with bounded range $[0,1]$. This results in fact from Lemma 9 and the proof of Lemma 2. We can therefore write
$$q_0(\|E\|) \le \frac{1}{D\,[1-(1+\gamma_\psi)p_r]} \Big[ 2\beta_\Sigma(\varepsilon) + (1-\gamma_\psi)\sum_{t\in\mathbb{T}_\varepsilon^c} \psi_t(f_t) \Big] \le b(\varepsilon) < 1$$
with $q_0 \in \mathcal{K}_{\mathrm{sat},1}$. Note, from the definition of the class $\mathcal{K}_{\mathrm{sat},1}$, that $q_0(\|E\|) < 1$ implies $\|E\| < 1$ (since otherwise we would have $q_0(\|E\|) = 1$). Letting $h_{\mathrm{sat}}$ be the restriction of such a function $q_0$ to $[0,1]$, we have $q_0(\|E\|) = h_{\mathrm{sat}}(\|E\|) \le b(\varepsilon)$ with $h_{\mathrm{sat}}$ being invertible. We can now apply $h_{\mathrm{sat}}^{-1}$ to each member of this inequality to reach the desired result, since $b(\varepsilon)$ lies in the range of $h_{\mathrm{sat}}$.

D. Resilience to attacks on the individual sensors

We now consider the situation where the matrix $S\in\mathbb{R}^{n_y\times T}$ formed from $\{s_t\}$ in (2) may be sparse entrywise, i.e., a relatively important fraction of the entries of $S$ are equal to zero. This case is relevant when any individual sensor may be faulty (or compromised by an attacker) at any time.
(In this case, the sparsity is expressed in terms of the fraction of nonzero entries of the matrix $S$, whereas in the timewise block-sparsity case, the sparsity level is measured in terms of the fraction of nonzero columns of $S$.) To address the resilient state estimation problem in this scenario, we select the loss functions $\psi_t$ to have a separable structure. To be more specific, let $\psi_t$ be such that for any $e = [e_1 \cdots e_{n_y}]^\top \in \mathbb{R}^{n_y}$,
$$\psi_t(e) = \sum_{i=1}^{n_y} \psi_{ti}(e_i) \qquad (37)$$
where, consistently with (7), $\psi_{ti}(e_i) = \psi_i^\circ(V_{ti} e_i)$ with $V_{ti} \in \mathbb{R}_{>0}$ and $\psi_i^\circ : \mathbb{R} \to \mathbb{R}_{\ge 0}$, $i = 1, \ldots, n_y$, being loss functions on $\mathbb{R}$ enjoying the properties (P1)–(P5). As in the statement of Corollary 1, we shall require that $\gamma_{\psi_i^\circ} = 1$. It follows that one can set $\psi_i^\circ$ to be the absolute value without loss of generality. Let us therefore set $\psi_i^\circ(e_i) = |e_i|$ so that
$$\psi_{ti}(e_i) = |V_{ti} e_i| \quad\text{and}\quad \psi_t(e) = \|V_t e\|_1 \qquad (38)$$
with $V_t$ being a diagonal matrix having the $V_{ti}$, $i = 1, \ldots, n_y$, on its diagonal.

To state the resilience property in this particular setting, we partition the index set $\mathbb{T}\times\mathbb{S}$ of the entries of $S$ as
$$\Lambda_\varepsilon = \{(t,i)\in\mathbb{T}\times\mathbb{S} : \psi_{ti}(f_{ti}) \le \varepsilon\}, \qquad \Lambda_\varepsilon^c = \{(t,i)\in\mathbb{T}\times\mathbb{S} : \psi_{ti}(f_{ti}) > \varepsilon\} \qquad (39)$$
with $f_{ti}$ denoting the $i$-th entry of the vector $f_t\in\mathbb{R}^{n_y}$. Also, in order to account for the specificity of the new scenario, let us refine slightly the $r$-resilience index (21) to be
$$\tilde{p}_r = \sup_{\substack{Z\in\mathbb{R}^{n\times T}\\ Z\ne 0}}\ \sup_{\substack{I\subset\mathbb{T}\times\mathbb{S}\\ |I|=r}} \frac{\sum_{(t,i)\in I} \psi_{ti}(c_{ti}^\top z_t)}{H_\Sigma(Z)} \qquad (40)$$
where $H_\Sigma$ is defined as in (20) from $\psi_t$ in (38) and $c_{ti}^\top$ is the $i$-th row of the observation matrix $C_t$. The difference between $p_r$ in (21) and $\tilde{p}_r$ in (40) resides in the index sets for counting possible error occurrences, which are $\mathbb{T}$ and $\mathbb{T}\times\mathbb{S}$, respectively. With these notations, we can provide the following theorem, which is the analog of Corollary 1 in the case where the disturbance matrix $S$ (see Eq. (2)) is entrywise sparse.
Theorem 2 (Upper bound on the estimation error: separable case). Consider the system $\Sigma$ defined by (1) with output $Y$ together with the state estimator (8) in which $\phi$ is assumed to obey (P1)–(P5) and $\psi$ is defined as in (38). Let $\varepsilon \ge 0$ and set $r = |\Lambda_\varepsilon^c|$ with $\Lambda_\varepsilon^c$ defined in (39). If the system is observable on $[0, T-1]$ and if $\tilde{p}_r < 1/2$, then there exists a $\mathcal{K}_\infty$ function $\tilde{h}$ such that for all norms $\|\cdot\|$ on $\mathbb{R}^{n\times T}$,
$$\|\hat{X} - X\| \le \tilde{h}\Big( \frac{2\tilde{\beta}_\Sigma(\varepsilon)}{\tilde{D}\,(1-2\tilde{p}_r)} \Big) \quad \forall \hat{X}\in E(Y) \qquad (41)$$
with $X$ denoting the true state matrix from (1) and $\tilde{\beta}_\Sigma(\varepsilon)$ defined by
$$\tilde{\beta}_\Sigma(\varepsilon) = \lambda \sum_{t\in\mathbb{T}_0} \phi_t(w_t) + \sum_{(t,i)\in\Lambda_\varepsilon} \psi_{ti}(f_{ti});$$
$\tilde{D}$ and $\tilde{h}$ are defined as in the statement of Theorem 1 with $H_\Sigma$ being constructed from $\psi$ in (38).

To some extent, Theorem 2 can be viewed as a special case of Theorem 1 in which the function $\psi$ is taken to be the $\ell_1$ norm and the index set is modified to be $\mathbb{T}\times\mathbb{S}$. Hence the proof follows a similar line of argument as that of Theorem 1. Again, it is not hard to see that the result of Theorem 2 implies the property of resilience with respect to the set $\mathcal{F}_r$ in (17) of measurement noise in the sense of Definition 2 (see the proof of Corollary 1). An interesting property of the estimator can be stated in the absence of dense noise, i.e., when only the sparse noise is active:

Corollary 3. Consider the system $\Sigma$ defined by (1) and let $r = |\Lambda_0^c|$ (which means that we consider every nonzero occurrence of $f_{ti}$ as an outlier by taking $\varepsilon = 0$ in (39)). If the conditions of Theorem 2 hold, $\tilde{p}_r < 1/2$, and if $w_t = 0$ in (1) for all $t$, then the estimator defined by (8) retrieves exactly the state trajectory of the system, i.e., $E(Y) = \{X\}$.

Proof: This follows directly from the fact that $\tilde{\beta}_\Sigma(0) = 0$ in the case where there is no dense noise $w_t$ and $\varepsilon = 0$.
Therefore, the estimator (8) has the exact recoverability property, that is, it is able to recover exactly the true state of the system (1) when only the sparse noise is active in the measurement equation, provided that the number $r = |\Lambda_0^c|$ of nonzero entries in the sequence $\{f_{ti}\}_{(t,i)\in\mathbb{T}\times\mathbb{S}}$ is small enough for $\tilde{p}_r$ to be less than $1/2$. According to our analysis, the number of outliers that can be handled by the estimator in this case can be underestimated by
$$\max\{r : \tilde{p}_r < 1/2\}. \qquad (42)$$

V. A SPECIAL VARIANT OF THE ESTIMATOR E

In this section, we consider a constrained reformulation of the estimator $E$ defined in (8). As will be shown shortly, this reformulation also enjoys the resilience property, but under a condition which is more easily verifiable from a numerical perspective. We start by considering the simple scenario where the process noise $w_t$ in (1) is identically equal to zero and the sequence $\{f_t\}$ is sparse in the sense that its dense component $v_t$ displayed in (2) does not exist. In this setting, we can obtain an estimator that is more resilient (to sparse noise in the measurements) than (8) by making it aware of the absence of dense process noise. This can be achieved by constraining the searched state matrix to lie in the set $\mathcal{Z}_\Sigma \subset \mathbb{R}^{n\times T}$ defined by
$$\mathcal{Z}_\Sigma = \big\{ Z = [\,z_0\ \ A_0 z_0\ \ \cdots\ \ A_{T-2}\cdots A_1 A_0 z_0\,] : z_0\in\mathbb{R}^n \big\}$$
of possible trajectories starting from any initial state $z_0\in\mathbb{R}^n$. Following this idea, we consider the estimator $E^\circ$ defined by $E^\circ(Y) = \arg\min_{Z\in\mathcal{Z}_\Sigma} V_\Sigma(Y, Z)$. Then $E^\circ(Y)$ can be rewritten more simply in the form
$$E^\circ(Y) = \Big\{ Z = [\,z_0\ \ A_0 z_0\ \ \cdots\ \ A_{T-2}\cdots A_1 A_0 z_0\,] : z_0 \in \arg\min_{z\in\mathbb{R}^n} V_\Sigma^\circ(Y, z) \Big\} \qquad (43)$$
where
$$V_\Sigma^\circ(Y, z) = \sum_{t\in\mathbb{T}} \psi_t(y_t - M_t z) \qquad (44)$$
with
$$M_t = C_t A_{t-1}\cdots A_1 A_0 \qquad (45)$$
for all $t\ge 1$ and $M_0 = C_0$. Hence the estimation of the state trajectory reduces to estimating the initial state $x_0$.
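The matrices (45) can be accumulated in a single forward pass; the following sketch (with our hypothetical helper `build_M`) does this for given $A_t$, $C_t$, and reduces to $M_t = C A^t$ in the LTI case.

```python
import numpy as np

def build_M(A_list, C_list):
    """Return [M_0, M_1, ...] with M_0 = C_0 and M_t = C_t A_{t-1} ... A_0, Eq. (45)."""
    M, P = [], np.eye(A_list[0].shape[0])  # P accumulates the product A_{t-1}...A_0
    for t, C in enumerate(C_list):
        M.append(C @ P)
        if t < len(A_list):
            P = A_list[t] @ P
    return M
```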
This can be viewed as a robust regression problem, like the ones discussed in [13], [2]. Generalizing a result in [2], we next derive a necessary and sufficient condition for exact recovery of the true state, which holds if and only if $\arg\min_{z\in\mathbb{R}^n} V_\Sigma^\circ(Y, z) = \{x_0\}$ with $x_0$ being the exact initial state of the system $\Sigma$. To this end, we first introduce the concept of concentration ratio of a collection of matrices with respect to a loss function. A notational convention will be necessary for the statement of this property: for any subset $I$ of $\mathbb{T}$, let
$$\Psi_I^\circ(z) = \sum_{t\in I} \psi_t(M_t z). \qquad (46)$$

Definition 4 ($r$-th concentration ratio). Let $\{\psi_t\}$ be a family of loss functions defined by (7) in which $\psi$ is assumed to satisfy (P1), (P3) and (P5) with constant $\gamma_\psi = 1$. Let $M = \{M_t\}_{t\in\mathbb{T}}$ be a sequence of matrices such that the function $\Psi_{\mathbb{T}}^\circ$ defined in (46) is positive definite. We call the $r$-th concentration ratio of $M$ the number defined by
$$\nu_r(M) = \sup_{\substack{z\in\mathbb{R}^n\\ z\ne 0}}\ \sup_{\substack{I\subset\mathbb{T}\\ |I|=r}} \frac{\Psi_I^\circ(z)}{\Psi_{\mathbb{T}}^\circ(z)}. \qquad (47)$$

At fixed $r$, $\nu_r(M)$ quantifies a genericity property of the sequence $M = \{M_t\}_{t\in\mathbb{T}}$. In view of the particular structure of the collection $M$ in (45), note that $\Psi_{\mathbb{T}}^\circ$ is positive definite whenever the system $\Sigma$ is observable on $\mathbb{T}$. Furthermore, $\nu_r(M)$ can be interpreted, to some extent, as a quantitative measure of observability: it is all the smaller as the system is more strongly observable. To see this, recall from Lemma 3 that if the system is observable on $[0, T-1]$, then for all $Z\in\mathcal{Z}_\Sigma$ initiated from $z$ in $\mathbb{R}^n$, we have $V_\Sigma(0, Z) = \Psi_{\mathbb{T}}^\circ(z) \ge q(\|z\|)$ for some $\mathcal{K}_\infty$ function $q$. It follows that
$$\nu_r(M) \le \sup_{\substack{z\in\mathbb{R}^n\\ z\ne 0}}\ \sup_{\substack{I\subset\mathbb{T}\\ |I|=r}} \frac{\Psi_I^\circ(z)}{q(\|z\|)}. \qquad (48)$$
Hence the more observable the system (i.e., the larger the gain function $q$), the smaller $\nu_r(M)$.
For all $(Y, z_0)\in\mathbb{R}^{n_y\times T}\times\mathbb{R}^n$ with $Y$ expressed columnwise in the form $Y = [\,y_0\ \cdots\ y_{T-1}\,]$, consider now the following notations:
$$\mathcal{I}_0(Y, z_0) = \{t\in\mathbb{T} : y_t - M_t z_0 = 0\}, \qquad \mathcal{I}^c(Y, z_0) = \{t\in\mathbb{T} : y_t - M_t z_0 \ne 0\}.$$

Theorem 3 (Exact recoverability condition). Consider the cost function (44) where $M = \{M_t\}$ is assumed to have been constructed as in (45) from the matrices of system (1). Assume that the loss functions $\{\psi_t\}$ involved in (44) are defined by (7) in which $\psi$ is assumed to satisfy (P1), (P3) and (P5) with constant $\gamma_\psi = 1$. Let $r$ be a positive integer. If the system (1) is observable on $[0, T-1]$, then the two following propositions are equivalent:
(i) For all $Y$ in $\mathbb{R}^{n_y\times T}$ and all $z_0$ in $\mathbb{R}^n$,
$$|\mathcal{I}^c(Y, z_0)| \le r \ \Rightarrow\ \arg\min_{z\in\mathbb{R}^n} V_\Sigma^\circ(Y, z) = \{z_0\}. \qquad (49)$$
(ii) The index $\nu_r(M)$ satisfies
$$\nu_r(M) < 1/2. \qquad (50)$$

Proof: (i) ⇒ (ii): Assume that (i) holds. Consider an arbitrary subset $I$ of $\mathbb{T}$ such that $|I| \le r$. Let $z_0 \ne 0$ be a vector in $\mathbb{R}^n$. Construct a sequence $Y$ in $\mathbb{R}^{n_y\times T}$ such that $y_t = 0$ if $t\in I$ and $y_t = M_t z_0$ otherwise. Then $\mathcal{I}^c(Y, z_0) \subset I$, so that $|\mathcal{I}^c(Y, z_0)| \le r$. It then follows from (i) that $\arg\min_{z\in\mathbb{R}^n} V_\Sigma^\circ(Y, z) = \{z_0\}$, which means that $V_\Sigma^\circ(Y, z_0) < V_\Sigma^\circ(Y, z)$ for all $z\in\mathbb{R}^n$, $z\ne z_0$. In particular, $V_\Sigma^\circ(Y, z_0) < V_\Sigma^\circ(Y, 0)$ which, by taking into account the definition of $Y$, gives $\Psi_I^\circ(z_0) < \Psi_{I^c}^\circ(z_0)$, where $I^c = \mathbb{T}\setminus I$. Since $\Psi_{\mathbb{T}}^\circ(z_0) = \Psi_I^\circ(z_0) + \Psi_{I^c}^\circ(z_0)$, we see that
$$\frac{\Psi_I^\circ(z_0)}{\Psi_{\mathbb{T}}^\circ(z_0)} < \frac{1}{2}.$$
This reasoning works for every nonzero $z_0$ and for every subset $I$ of $\mathbb{T}$. We can hence conclude that $\nu_r(M) < 1/2$.
(ii) ⇒ (i): Assume that (ii) holds. Let $(Y, z_0)\in\mathbb{R}^{n_y\times T}\times\mathbb{R}^n$ be such that $|\mathcal{I}^c(Y, z_0)| \le r$. We then need to prove that $\arg\min_{z\in\mathbb{R}^n} V_\Sigma^\circ(Y, z) = \{z_0\}$.
Since assertion (ii) is assumed true, it follows from (47) and (50) that
$$2\,\Psi_{\mathcal{I}^c}^\circ(z_0') < \Psi_{\mathbb{T}}^\circ(z_0') \quad \forall z_0'\in\mathbb{R}^n,\ z_0'\ne 0, \qquad (51)$$
where, for simplicity, we have posed $\mathcal{I}^c = \mathcal{I}^c(Y, z_0)$. In the derivation of (51), we have used the obvious fact that $r_1 \le r_2 \Rightarrow \nu_{r_1}(M) \le \nu_{r_2}(M)$. If we pose $\mathcal{I} = \mathcal{I}_0(Y, z_0) = \mathbb{T}\setminus\mathcal{I}^c$, then the inequality (51) is equivalent to
$$\sum_{t\in\mathcal{I}^c} \psi_t(M_t z_0') < \sum_{t\in\mathcal{I}} \psi_t(M_t z_0'). \qquad (52)$$
Now we observe that for all $t$ in $\mathcal{I} = \mathcal{I}_0(Y, z_0)$, $y_t = M_t z_0$, so that $\psi_t(M_t z_0') = \psi_t\big(y_t - M_t(z_0 + z_0')\big)$. On the other hand, for $t\in\mathcal{I}^c = \mathcal{I}^c(Y, z_0)$, if we apply the GTI (10) with $\gamma_\psi = 1$, we obtain
$$\psi_t(M_t z_0') = \psi_t\big(y_t - M_t z_0 - (y_t - M_t(z_0 + z_0'))\big) \ge \psi_t(y_t - M_t z_0) - \psi_t\big(y_t - M_t(z_0 + z_0')\big).$$
Combining all these remarks with (52) yields
$$\sum_{t\in\mathcal{I}^c} \big[\psi_t(y_t - M_t z_0) - \psi_t\big(y_t - M_t(z_0 + z_0')\big)\big] < \sum_{t\in\mathcal{I}} \psi_t\big(y_t - M_t(z_0 + z_0')\big).$$
Rearranging this gives $V_\Sigma^\circ(Y, z_0) < V_\Sigma^\circ(Y, z_0 + z_0')$ for all $z_0'\in\mathbb{R}^n$ with $z_0'\ne 0$, which is equivalent to $\arg\min_{z\in\mathbb{R}^n} V_\Sigma^\circ(Y, z) = \{z_0\}$. Hence (i) holds as claimed.

From the statement of Theorem 3, we infer that, under the assumption that only the sparse noise $\{s_t\}$ is active (i.e., there is no dense noise $(w_t, v_t)$) in the system equations (1), $E^\circ(Y) = \{X\}$ whenever $\nu_r(M) < 1/2$; i.e., the estimator $E^\circ$ returns exactly the true state. For a given system, if one can evaluate numerically the index $\nu_r(M)$, then it becomes possible to assess the number $r_{\max} \triangleq \max\{r : \nu_r(M) < 1/2\}$ of gross errors that can be corrected by the estimator $E^\circ$ when applied to that system. We will get back to the computational matter in Section VI.
A. Special case of the $\ell_0$-norm-based estimator

Consider the special case where the loss functions $\{\psi_t\}$ are defined, for all $t\in\mathbb{T}$, by
$$\forall e\in\mathbb{R}^{n_y},\quad \psi_t(e) = \begin{cases} 1 & \text{if } e\ne 0\\ 0 & \text{otherwise.} \end{cases} \qquad (53)$$
This corresponds to the block $\ell_0$-norm. Note that such functions satisfy the assumptions (P1), (P3) and (P5) requested in Definition 4 of $\nu_r(M)$ and in the statement of Theorem 3. Hence $\nu_r(M)$ is well-defined in this case.

Corollary 4. Consider system (1) under the assumption that $w_t = 0$ for all $t$. Assume observability of the system on $[0, T-1]$. Consider the estimator (43) in which the cost $V_\Sigma^\circ$ is defined from the family of loss functions $\{\psi_t\}$ expressed in (53). Then for all $(Y, z_0)\in\mathbb{R}^{n_y\times T}\times\mathbb{R}^n$,
$$|\mathcal{I}^c(Y, z_0)| < \frac{T - \mu(M) + 1}{2} \ \Rightarrow\ E^\circ(Y) = \{X\},$$
where $\mu(M)$, defined by
$$\mu(M) = \min\big\{ k : \forall I\subset\mathbb{T},\ \big(|I| = k \Rightarrow \operatorname{rank}(M_I) = n\big) \big\}, \qquad (54)$$
is the minimum number $k$ such that any matrix $M_I\in\mathbb{R}^{|I| n_y\times n}$, formed by stacking vertically the matrices of the collection $\{M_t : t\in I\}$ indexed by $I\subset\mathbb{T}$ with $|I| = k$, has full column rank.

Proof: Let us start by observing that, with the particular loss functions invoked in the statement of the corollary, $\Psi_{\mathbb{T}}^\circ(z)$ counts the number of $t\in\mathbb{T}$ for which $\psi_t(M_t z) \ne 0$. It follows from the definition of $\mu(M)$ that $\Psi_{\mathbb{T}}^\circ(z) \ge T - \mu(M) + 1$ for all $z\ne 0$. The reason for this is that if $\psi_t(M_t z)$ were equal to zero more than $\mu(M) - 1$ times, then $z$ would necessarily be equal to zero. As a result, we get
$$\nu_r(M) \le \frac{r}{T - \mu(M) + 1}.$$
Hence, by applying Theorem 3, $|\mathcal{I}^c(Y, z_0)| / (T - \mu(M) + 1) < 1/2$ is a sufficient condition for exact recovery by the $\ell_0$-norm-based estimator.

Remark 2. Assume that $\psi_t$ is defined to be the counting norm, i.e.,
$$\psi_t(e) = \|e\|_0. \qquad (55)$$
Then $\psi_t$ has a separable structure as illustrated in (37).
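The index $\mu(M)$ in (54) can be checked by brute force for small horizons (the search is combinatorial in $T$, so this is only an illustrative sketch; the helper name `mu` is ours):

```python
import numpy as np
from itertools import combinations

def mu(M_list, n):
    """Smallest k such that every k-subset of {M_t} stacks to full column rank n, Eq. (54)."""
    T = len(M_list)
    for k in range(1, T + 1):
        if all(np.linalg.matrix_rank(np.vstack([M_list[t] for t in I])) == n
               for I in combinations(range(T), k)):
            return k
    return None  # the system is not observable on the horizon
```

Per Corollary 4, any pattern of fewer than $(T - \mu(M) + 1)/2$ gross errors is then correctable by the block $\ell_0$-norm estimator.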
Consider then defining, still under the observability assumption, an entrywise version of the concentration ratio by
$$\tilde{\nu}_r(M) = \sup_{\substack{z\in\mathbb{R}^n\\ z\ne 0}}\ \sup_{\substack{I\subset\mathbb{T}\times\mathbb{S}\\ |I|=r}} \frac{\sum_{(t,i)\in I} \|M_{ti} z\|_0}{\Psi_{\mathbb{T}}^\circ(z)} \qquad (56)$$
where $M_{ti}$ refers to the $i$-th row of $M_t$. Further, let
$$\tilde{\mu}(M) = \min\big\{ k : \forall I\subset\mathbb{T}\times\mathbb{S},\ \big(|I| = k \Rightarrow \operatorname{rank}(\tilde{M}_I) = n\big) \big\}$$
with $\tilde{M}_I\in\mathbb{R}^{|I|\times n}$ denoting the matrix obtained by stacking the row vectors $\{M_{ti} : (t,i)\in I\}$. Then a result similar to Corollary 4 is obtainable: if the number of measurements corrupted by a nonzero error (among the $n_y T$ available) is strictly less than $(n_y T - \tilde{\mu}(M) + 1)/2$, then the estimator $E^\circ$ expressed in (43) (with $\psi_t$ being the $\ell_0$ norm as in (55)) recovers exactly the true state.

Remark 3. Under the conditions of Remark 2, if we consider the scenario where only a set of $k < n_y$ sensors may be compromised by attackers, then exact recovery is achieved if
$$k < \frac{n_y}{2} - \frac{\tilde{\mu}(M) - 1}{2T}. \qquad (57)$$
Taking into consideration the fact that $\tilde{\mu}(M) - 1 < T$, it can then be seen that (57) is equivalent to $k \le \lceil n_y/2 - 1 \rceil$, where the notation $\lceil r \rceil$, for $r\in\mathbb{R}$, refers to the smallest integer larger than or equal to $r$. To sum up, when the $\psi_t$ are defined as in (55), the estimator (43) is able to return the true state matrix even when $\lceil n_y/2 - 1 \rceil$ sensors get faulty over the entire observation horizon. This is reminiscent of a result stated in [10], which therefore appears to be a consequence of Theorem 3.

B. Stability of the class of estimators $E^\circ$ with respect to dense noise

We have argued that the class of estimators $E^\circ$ in (43) is able to recover exactly the true state matrix when there are no dense noises $(w_t, v_t)$ in the system equations and only the sparse noise $\{s_t\}$ is active. The question we ask now is whether this set of estimators can, in addition to sparse noise, handle dense process and output noises, and to what extent this is possible.
The starting point of our reflection is the observation that the dynamical system defined by
$$\tilde{x}_{t+1} = A_t \tilde{x}_t, \quad \tilde{x}_0 = x_0, \qquad y_t = C_t \tilde{x}_t + s_t + (v_t + \tilde{v}_t), \qquad (58)$$
produces the same output as system (1). Here, $\tilde{v}_t = C_t \tilde{w}_t$, with $\tilde{w}_t = \sum_{k=0}^{t-1} A_{t-1}\cdots A_{k+1} w_k$, a definition which uses the convention that the product $A_{t-1}\cdots A_{k+1} = I$ if $k = t-1$. The idea is then to apply the estimator $E^\circ$ to (58) by neglecting the dense component $(v_t + \tilde{v}_t)$ of the output equation. To state the resilience result for $E^\circ$, consider, for a given $\varepsilon \ge 0$, a partition $(\tilde{\mathbb{T}}_\varepsilon, \tilde{\mathbb{T}}_\varepsilon^c)$ of $\mathbb{T}$ defined as in (22) with $f_t$ replaced by $\tilde{f}_t \triangleq s_t + (v_t + \tilde{v}_t) = f_t + \tilde{v}_t$.

Theorem 4. Consider the estimator (43) for the system (1). Assume that the loss functions $\{\psi_t\}$ involved in (44) are defined by (7) in which $\psi$ is assumed to satisfy (P1)–(P5) with constant $\gamma_\psi = 1$. Let $\varepsilon \ge 0$ and set $r = |\tilde{\mathbb{T}}_\varepsilon^c|$. Denote by $N$ the norm on $\mathbb{R}^{n\times T}$ defined by $N(Z) = \max_{t\in\mathbb{T}} \|z_t\|$ with $z_t$ being the $t$-th column of $Z$ and $\|\cdot\|$ being a norm on $\mathbb{R}^n$. If the system (1) is observable on $[0, T-1]$ and $\nu_r(M) < 1/2$, then there exists a $\mathcal{K}_\infty$ function $\alpha$ such that
$$N(\hat{X} - X) \le R_\Sigma\, \alpha^{-1}(\rho) + \max_{t\in\mathbb{T}} \|\tilde{w}_t\| \quad \forall \hat{X}\in E^\circ(Y), \qquad (59)$$
where $R_\Sigma$ is some constant depending on the system $\Sigma$ and
$$\rho = \frac{2}{D_1\big(1 - 2\nu_r(M)\big)} \sum_{t\in\tilde{\mathbb{T}}_\varepsilon} \psi_t(\tilde{f}_t)$$
with $\tilde{f}_t = f_t + \tilde{v}_t$ and $D_1 = \min_{\|z\|=1} \Psi_{\mathbb{T}}^\circ(z)$.

Proof: Let $\hat{x}_0 \in \arg\min_{z\in\mathbb{R}^n} V_\Sigma^\circ(Y, z)$. We first provide a bound on the error $e_0 = \hat{x}_0 - x_0$ with $x_0$ denoting the true initial state of system (1). By exploiting the fact that $V_\Sigma^\circ(Y, \hat{x}_0) \le V_\Sigma^\circ(Y, x_0)$ and noting that $y_t = M_t x_0 + \tilde{f}_t$, we reach the inequality
$$\sum_{t\in\mathbb{T}} \psi_t(\tilde{f}_t - M_t e_0) \le \sum_{t\in\mathbb{T}} \psi_t(\tilde{f}_t).$$
By then reasoning quite similarly as in the proof of Theorem 1, we get
$$\Psi_{\mathbb{T}}^\circ(e_0) - 2\,\Psi_{\tilde{\mathbb{T}}_\varepsilon^c}^\circ(e_0) \le 2\sum_{t\in\tilde{\mathbb{T}}_\varepsilon} \psi_t(\tilde{f}_t)$$
which, by exploiting (47) and the assumption that $\nu_r(M) < 1/2$, leads to
$$\Psi_{\mathbb{T}}^\circ(e_0) \le \frac{2}{1 - 2\nu_r(M)} \sum_{t\in\tilde{\mathbb{T}}_\varepsilon} \psi_t(\tilde{f}_t).$$
Applying now Lemma 2, we conclude that for any norm $\|\cdot\|$ on $\mathbb{R}^n$ there exists a $\mathcal{K}_\infty$ function $\alpha$ such that $\|e_0\| \le \alpha^{-1}(\rho)$. Now, by observing that for any $\hat{X}\in E^\circ(Y)$,
$$\hat{X} - X = [\,e_0\ \ A_0 e_0\ \ \cdots\ \ A_{T-2}\cdots A_0 e_0\,] - [\,0\ \ \tilde{w}_1\ \ \cdots\ \ \tilde{w}_{T-1}\,],$$
the result follows by posing $R_\Sigma = \max_{t\in\mathbb{T}} \|A_{t-1}\cdots A_0\|_{\mathrm{ind}}$, with $\|\cdot\|_{\mathrm{ind}}$ being the matrix norm induced by the vector norm $\|\cdot\|$ on $\mathbb{R}^n$ (we use here the convention that $A_{t-1}\cdots A_0 = I$ if $t = 0$).

The interest of Theorem 4 is that it provides a condition of resilience for the estimator $E^\circ$ which can be checked numerically, as will be discussed in the next section.

VI. ON THE NUMERICAL EVALUATION OF THE RESILIENCE CONDITIONS

The analysis results presented in Sections IV and V rely on some functions (resilience index, concentration ratio, etc.) which characterize quantitatively some properties of the system being observed. A question we ask now is whether it is possible to evaluate these measures numerically. In effect, computing the $r$-resilience index in (21) would help testing, for example, the resilience condition in Theorem 1. Similarly, evaluating the concentration ratio $\nu_r(M)$ introduced in (47) is the way to assess whether a given estimator is able to return the true state of a given system under a hypothesis on the number of potential nonzero errors in the measurements. Unfortunately, obtaining numerically the numbers $p_r$, $\tilde{p}_r$ or $\nu_r$ requires solving hard nonconvex and combinatorial optimization problems.
This is indeed a common characteristic of the concepts which are usually used to assess resilience; for example, the popular Restricted Isometry Property (RIP) constant [6] is comparably hard to evaluate. We note, however, that when the dimension of the state is small enough, $\nu_r(M)$ can be computed exactly by taking inspiration from a method presented in [23], even though at the price of a huge (but affordable) computational cost. Alternatively, a cheaper overestimate can be obtained by means of convex optimization, as suggested in [2]. The next lemma provides such an overestimate for $\nu_r(M)$.

Lemma 5 (An estimate of $\nu_r$). Assuming all quantities are well-defined (see the conditions in Definitions 3 and 4), the following statements hold:
(a) $\nu_r \le p_r$.
(b) If $\mu(M) \le T - 1$, then
$$\nu_r(M) \le \frac{r\,\nu_o}{1 + \nu_o}, \qquad (60)$$
where
$$\nu_o = \max_{t\in\mathbb{T}}\ \min_{\lambda_t\in\mathbb{R}^T} \Big\{ \|\lambda_t\|_\infty : V_t M_t = \sum_{k\in\mathbb{T}} \lambda_{tk} V_k M_k,\ \lambda_{tt} = 0 \Big\}. \qquad (61)$$
In (61), the $\lambda_{tk}$ denote the entries of the vector $\lambda_t\in\mathbb{R}^T$ and $\{V_t\}$ refers to the sequence of nonsingular weighting matrices involved in (7).

The proof of statement (a) is straightforward by noticing that (47) follows from (21) by constraining the variable $Z$ to be in $\mathcal{Z}_\Sigma$. As to the proof of statement (b), it follows a similar reasoning as the proof of Theorem 2 in [2]. The interest of this lemma is twofold. First, it suggests that the resilience condition of $E^\circ$ is weaker (in the sense that it is easier to achieve) than that of $E$. Second, it provides an upper bound on $\nu_r(M)$ which can be computed by solving a convex optimization problem (see Eqs. (60)–(61)). More specifically, given $\nu_o$ in (61), checking numerically whether $|\mathbb{T}_\varepsilon^c| < \frac{1}{2}(1 + 1/\nu_o)$ provides a sufficient condition for $\nu_r(M) < 1/2$ and hence for the resilience of the estimator (43).
In a similar spirit as Lemma 5, we now show that the parameter $\tilde{p}_r$ defined in (40) can also be overestimated via convex optimization if the loss functions $\phi_t$ and $\psi_t$ in (6)–(7) are both taken to be norms.

Lemma 6 (An estimate of $\tilde{p}_r$). Consider the resilience parameter $\tilde{p}_r$ defined in (40), where we assume that $\psi_{ti}(e) = |e|$ for all $(t, i, e)\in\mathbb{T}\times\mathbb{S}\times\mathbb{R}$ and $\phi_t$ is an arbitrary norm. Then
$$\tilde{p}_r \le \frac{r}{b_1} \qquad (62)$$
where
$$b_1 = \inf_{(t,i)\in\mathbb{T}\times\mathbb{S}}\ \inf_{Z\in\mathbb{R}^{n\times T}} \big\{ H_\Sigma(Z) : c_{ti}^\top z_t = 1 \big\} = \frac{1}{\tilde{p}_1}. \qquad (63)$$

Proof: (See Appendix D.)

Note, under the assumptions of Lemma 6, that $\inf_{Z\in\mathbb{R}^{n\times T}} \{H_\Sigma(Z) : c_{ti}^\top z_t = 1\}$ is a convex optimization problem for any given $(t, i)$. Hence, solving for $b_1$ in (63) requires solving $T n_y$ convex problems and picking the smallest value among them. The interest of the lemma is that it provides an overestimate of $\tilde{p}_r$ which is numerically computable. Based on the so-obtained overestimate of $\tilde{p}_r$, we see from Theorem 1 that the estimator (8) is resilient to $r$ outliers if $r < b_1/2$. Moreover, we can deduce an underestimate of the number of outliers that the estimator (8) is able to handle as $r_{\max} = \max\{r : r < b_1/2\}$. As a last remark in this paragraph, let us observe that Lemma 6 is also applicable to overestimate $p_r$ defined in (21) in the case of a single-output system, i.e., when $n_y = 1$.

VII. SIMULATION RESULTS

In this part, we illustrate numerically the resilience properties of the proposed class of estimators. For this purpose, we consider, for simplicity, an example of a linear time-invariant system in the form (1). We select a single-input single-output example where the pair $(A, C)$ is given by
$$A = \begin{bmatrix} 0.7 & 0.45 \\ -0.5 & 1 \end{bmatrix}, \qquad C = \begin{bmatrix} 1 & 2 \end{bmatrix}. \qquad (64)$$
We instantiate the loss functions in (5) as follows: for all t ∈ T and all (z, e) ∈ R^n × R^{n_y}, φ_t(z) = φ(W_t z) and ψ_t(e) = ψ(V_t e), where the weighting matrices W_t and V_t and the functions φ and ψ will be specified below for each experiment.

A. Numerical certificate of exact recoverability

Suppose in this section that the process noise w_t and the dense component v_t of f_t (see Eq. (2)) are both identically zero. We then focus on testing the exact recoverability property of the estimator (43) in the presence of the sparse noise {s_t} alone. The times of occurrence of the nonzero values in the sequence {s_t} are picked at random. As to the values themselves, they are also randomly generated, from a zero-mean normal distribution with variance 100². Given T = 100 output measurements and the system matrices in (64), the estimator E° is implemented by directly solving the optimization problem defined in (43) through the CVX interface [12]. Note that the implementation of the estimator (43)–(44) requires computing the matrices M_t expressed in (45), which take the form C A^t in the LTI case. A problem that may occur, however, is that if A is Schur stable, as is the case here (or unstable), taking successive powers of A produces matrices M_t of widely different orders of magnitude. To preserve the contribution of each term of (44), we introduce special weighting matrices {V_t} (in the loss function ψ, selected as the ℓ_1 norm) which normalize the rows of these matrices so that they all have unit 2-norm. V_t is therefore selected to be a diagonal matrix of the form V_t = diag(V_t1, ..., V_tn_y), where

    V_ti = 1 / ‖c_i^T A^t‖_2 if c_i^T A^t ≠ 0, and V_ti = 1 otherwise.   (65)

Here, c_i^T, i = 1, ..., n_y, denotes the i-th row of the matrix C.
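The row normalization (65) is straightforward to compute. A minimal numpy sketch, assuming the LTI pair (A, C) of (64); the helper name row_weights is ours:

```python
import numpy as np

A = np.array([[0.7, 0.45], [-0.5, 1.0]])   # system matrices of (64)
C = np.array([[1.0, 2.0]])                 # one row per output: here n_y = 1
T = 100

def row_weights(A, C, T):
    """Diagonal V_t normalizing each row of M_t = C A^t to unit 2-norm, as in (65)."""
    V, At = [], np.eye(A.shape[0])
    for _ in range(T):
        norms = np.linalg.norm(C @ At, axis=1)   # ||c_i^T A^t||_2 for each row i
        w = np.ones_like(norms)                  # V_ti = 1 when the row is zero
        nz = norms > 0
        w[nz] = 1.0 / norms[nz]
        V.append(np.diag(w))
        At = A @ At
    return V

V = row_weights(A, C, T)
```

Each V_t @ (C A^t) then has unit-norm rows, so late-time measurements are not drowned out by the decay of A^t.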
Indeed, the effect of the weighting in (44) is equivalent to changing y_t and M_t respectively into ỹ_t = V_t y_t and M̃_t = V_t C A^t. Posing M = [M̃_t], it can be checked using the methods discussed in Section VI (see Eq. (60)) that at least r_max = 30 erroneous data points (out of T = 100 measurements) can be accommodated by the estimator while it still returns exactly the true state. To investigate the empirical performance, we consider different ratios |Λ_0^c| / T of nonzero values in the sequence {s_t}. For each fixed proportion of nonzero values, we run the estimator over 100 different realizations of the output measurements. The results, depicted in Figure 1, tend to show that the estimator can still find the true state even for proportions of gross errors as large as 60%.

Figure 1: Probability of exact recovery (expressed in percentage) by the estimator (43) in the presence of only sparse measurement noise {s_t}. The sparsity level of the noise is expressed as the fraction of nonzero values in the sequence {s_t : t ∈ T}, with |T| = T = 100.

B. Performance in the presence of dense noise

We now consider the more realistic scenario where the process noise {w_t} and the measurement noise {v_t} are nonzero. We further assume them to be bounded, white and uniformly distributed. For the numerical experiments, these signals are sampled from an interval of the form [−a, a].
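In the LTI, known-dynamics case, the estimator (43) with ℓ_1 loss amounts to an ℓ_1 regression for the initial state. The sketch below is an assumption-laden miniature of this setting, not the paper's implementation: it replaces the CVX solve used in the paper by a plain iteratively reweighted least-squares (IRLS) approximation of the ℓ_1 estimate, uses the pair (A, C) of (64) with row-normalized regressors as in (65), and picks illustrative values (20% gross errors, dense-noise amplitude a = 0.1); the helper l1_irls and all numbers are ours.

```python
import numpy as np

rng = np.random.default_rng(0)
A = np.array([[0.7, 0.45], [-0.5, 1.0]])
C = np.array([1.0, 2.0])
T = 100

# weighted regressor rows V_t C A^t, normalized to unit 2-norm as in (65)
M = np.array([C @ np.linalg.matrix_power(A, t) for t in range(T)])
M /= np.linalg.norm(M, axis=1, keepdims=True)

x_true = np.array([1.0, -2.0])
a = 0.1
y = M @ x_true + rng.uniform(-a, a, T)       # bounded dense noise
bad = rng.choice(T, size=20, replace=False)  # 20% gross errors ...
y[bad] += 100.0 * rng.standard_normal(20)    # ... of essentially arbitrary magnitude

def l1_irls(M, y, n_iter=100, eps=1e-8):
    """Iteratively reweighted least squares approximating argmin_x sum_t |y_t - m_t^T x|."""
    x = np.linalg.lstsq(M, y, rcond=None)[0]
    for _ in range(n_iter):
        w = 1.0 / np.maximum(np.abs(y - M @ x), eps)
        Mw = M * w[:, None]
        x = np.linalg.solve(M.T @ Mw, Mw.T @ y)   # weighted normal equations
    return x

x_hat = l1_irls(M, y)
err = np.linalg.norm(x_hat - x_true)
```

Despite the huge corruptions, the estimation error stays at the dense-noise level, whereas ordinary least squares on the same data is thrown far off; this is the resilience behavior the section illustrates.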
For comparison purposes, we conduct the estimation with several estimators:
• an instance of the estimator (8), denoted E_{ℓ2²,ℓ1} in the sequel, in which the loss function φ_t is quadratic, ψ_t is the ℓ_1-norm and λ = 1000 (see (5) and (6)–(7));
• an instance of the estimator (8), denoted E_{ℓ1,ℓ1}, in which both loss functions φ_t and ψ_t are the ℓ_1-norm, with λ = 10;
• the estimator E° defined in (43).
In addition, we implement oracle versions of E_{ℓ2²,ℓ1} and of E_{ℓ2²,ℓ2²} (the latter corresponding to an instance of E in which both φ_t and ψ_t are instantiated as quadratic functions). By the oracle version of an estimator, we refer to an implementation of this estimator which is aware of the true values of the sparse noise sequence {s_t}.

Experiment 1: Resilience test. Keeping the level of both dense noises (i.e., w_t and v_t) fixed, with amplitude a = 0.03 for the entries of the former and a = 0.1 for the latter (yielding a signal-to-noise ratio (SNR) of about 30 dB in each case), we apply the estimators E_{ℓ2²,ℓ1}, E_{ℓ1,ℓ1} and E° (as defined above) to 100 different realizations of the output data and compute the average of the corresponding relative estimation errors. This process is repeated for different fractions of nonzeros in the sparse noise {s_t}, ranging from 0 to 0.8. The estimates obtained by these estimators are displayed in Figure 2 in log scale. For the sake of comparison, we also display the oracle estimates given by E_{ℓ2²,ℓ1} and those obtained by a standard least-squares estimator E_{ℓ2²,ℓ2²} (i.e., with φ_t and ψ_t both taken to be quadratic in (5)). The results tend to show that the estimator (8) remains stable until the (empirical) resilience condition is violated (an event that happens when the sparsity level of the sparse noise is around 60%).
This is consistent with the resilience property characterized in Theorem 1 and with the empirical observations made in Section VII-A, according to which the estimator is insensitive to the sparse noise sequence {s_t} as long as the number of its nonzero values (whose magnitudes are possibly arbitrarily large) remains below a certain threshold determined by the properties of the system. While Lemma 6 provides an underestimate of the number of correctable outliers equal to r_max = 8 (out of 100), we observe that the empirical breakpoint in the current example seems to be around 40%. The discrepancy between the two values is partly explained by the pessimism of the upper bound on p_r proposed in Lemma 6.

Figure 2: Average relative estimation error (in logarithmic scale) induced by different estimators versus the sparsity level of the sparse noise {s_t}. The relative error is expressed here as ‖X̂ − X‖_2 / ‖X‖_2, where X and X̂ denote the true and estimated state matrices respectively. Parameters of the estimator E in (8): λ = 1000, W_t = I_2 and V_t = 1 for all t.

Experiment 2: Stability with respect to dense noise. We now fix the sparsity level of the sequence {s_t} to 0.2 and let the powers of the dense noises {(w_t, v_t)} vary jointly from 5 dB to 100 dB in terms of SNR. The estimates obtained by the estimators (8) and (43), with the choices of φ_t and ψ_t agreed upon at the beginning of Section VII, are displayed in Figure 3 in terms of log₁₀ of the estimation errors. What this illustrates is that whenever the number of faulty data points is reasonable (here 20% of the available measurements), the estimator discussed in this section behaves almost in the same way as when there is no faulty data at all.
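The qualitative content of Experiment 2 (error scaling with the dense-noise level while a fixed fraction of gross errors is present) can be imitated in miniature. As a hedged, numpy-only sketch under our own assumptions: an IRLS approximation of the ℓ_1 estimate of the initial state stands in for the paper's CVX-based solve, the pair (A, C) of (64) and row normalization (65) are used, 20% of the samples carry gross errors, and three decreasing dense-noise amplitudes are tried.

```python
import numpy as np

rng = np.random.default_rng(2)
A = np.array([[0.7, 0.45], [-0.5, 1.0]])
C = np.array([1.0, 2.0])
T = 100

M = np.array([C @ np.linalg.matrix_power(A, t) for t in range(T)])
M /= np.linalg.norm(M, axis=1, keepdims=True)
x_true = np.array([1.0, 1.0])

bad = rng.choice(T, size=20, replace=False)   # sparsity level fixed at 0.2
s = np.zeros(T)
s[bad] = 100.0 * rng.standard_normal(20)      # fixed sparse gross errors

def l1_irls(M, y, n_iter=100, eps=1e-8):
    """Iteratively reweighted least squares approximating the l1 regression estimate."""
    x = np.linalg.lstsq(M, y, rcond=None)[0]
    for _ in range(n_iter):
        w = 1.0 / np.maximum(np.abs(y - M @ x), eps)
        Mw = M * w[:, None]
        x = np.linalg.solve(M.T @ Mw, Mw.T @ y)
    return x

errs = []
for a in (0.1, 0.01, 0.001):                  # decreasing dense-noise amplitude
    y = M @ x_true + rng.uniform(-a, a, T) + s
    errs.append(np.linalg.norm(l1_irls(M, y) - x_true))
```

With the gross errors held fixed, the error tracks the dense-noise amplitude, mirroring the behavior reported for Figure 3.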
Figure 3: Average relative estimation error (in log scale) induced by the different estimators for different levels of both dense noises w_t and v_t. Parameters of the estimator E in (8): λ = 1000, W_t = I_2 and V_t = 1 for all t.

Experiment 3: Impact of the regularization parameter λ in E. To assess the influence of the regularization parameter λ on the performance of the estimators E_{ℓ2²,ℓ1} and E_{ℓ1,ℓ1}, we fix the amplitudes of both dense noises {w_t} and {v_t} at the same level as in Experiment 1 (i.e., SNR equal to 30 dB and a ratio of nonzero entries in {s_t} equal to 30%). In this setting, we consider a set of values of λ ranging from 10⁻³ to 10⁶. For each of these values, we perform the estimation over a hundred realizations of the output data and compute the average of the corresponding relative estimation errors. The outcome of this test, depicted in Figure 4, tends to suggest that low values of λ yield quite poor results. Conversely, as λ tends towards infinity, both estimators' performance measures saturate at the same value, namely 0.1. It turns out that this limit value corresponds to the relative error obtained for E° in Experiment 1 in the same configuration, suggesting that E indeed tends to E° in behavior as λ becomes large. Finally, it is interesting to observe that both performance curves exhibit minima, located around λ = 750 for E_{ℓ2²,ℓ1} and around λ = 10 for E_{ℓ1,ℓ1}.

VIII. CONCLUSION

In this paper, we have considered the problem of estimating the state of linear time-varying systems in the face of uncertainties modeled as process and measurement noises in the system equations.
The measurement noise sequence assumes values of possibly arbitrarily large amplitude which occur intermittently in time and across the available sensors. For this problem we have proposed a class of estimators based on the resolution of a family of parameterizable optimization problems. The discussed family is rich enough to include optimization-based estimators relying on various loss functions, which may be convex (e.g., ℓ_p-norms) or nonconvex (e.g., ℓ_p quasi-norms or saturated functions), smooth or nonsmooth. In particular, we have proved a resilience property for the proposed class of state estimators, that is, the resulting estimation error admits a bound which is independent of the extreme values of the measurement noise, provided that the number of occurrences (over time and over the whole set of sensors) of such extreme values is limited. Note however that the estimators studied here operate in batch mode, that is, they apply to a finite collection of measurements. In future work we intend to investigate efficient and low-cost adaptive versions of the proposed optimization framework. Another interesting research avenue would be to study the level of performance achievable when the discussed framework is used to detect bad data prior to a refinement by standard least-squares estimation.

Figure 4: Average relative estimation error induced by E_{ℓ2²,ℓ1} and E_{ℓ1,ℓ1} for the system (64), with SNR = 30 dB and 30% of nonzero entries in {s_t}, for different values of the regularization parameter λ.

APPENDIX

In this appendix, we provide some technical results used in the paper and the associated proofs.

A. A useful technical lemma

Lemma 7.
Let ξ_1, ξ_2 : R^{a×b} → R_{≥0} be two functions which satisfy properties (P1)–(P3), and let ℓ : R^{c×d} → R^{a×b} be an injective linear mapping. Then ξ_1 + ξ_2 and ξ_1 ∘ ℓ verify (P1)–(P3). In addition, the following holds:
(j) If ξ_1, ξ_2 verify (P4), then ξ_1 + ξ_2 and ξ_1 ∘ ℓ verify (P4).
(jj) If ξ_1, ξ_2 verify (P5), then ξ_1 + ξ_2 and ξ_1 ∘ ℓ verify (P5).
The main point of interest of this lemma is that even if two functions satisfy properties (P4) and (P5) with different values of q and γ, their sum still verifies those properties. To prove Lemma 7, we will need the following result.

Lemma 8 (Minimum of two K_∞ functions). If q_1 and q_2 are two K_∞ functions, then so is the function q defined by

    q(λ) = min_{i ∈ {1,2}} q_i(λ) for all λ ∈ R_{≥0}.   (66)

Proof: We have to prove that q is continuous, strictly increasing and satisfies q(0) = 0 and lim_{λ→+∞} q(λ) = +∞. First of all, it is clear that q(0) = 0. Also, continuity of q is immediate from that of q_1 and q_2 by noting that q = (q_1 + q_2 − |q_1 − q_2|) / 2. To see the strict increasingness of q, consider λ_1 and λ_2 in R_{≥0} such that λ_1 < λ_2. Then q(λ_1) ≤ q_1(λ_1) < q_1(λ_2) and q(λ_1) ≤ q_2(λ_1) < q_2(λ_2). It follows that q(λ_1) < min_{i ∈ {1,2}} q_i(λ_2) = q(λ_2), hence q is strictly increasing. We now show that q(λ) tends to infinity as λ → +∞. Let M > 0 be an arbitrary positive number. Since q_1 and q_2 tend to infinity, there exist η_1 and η_2 such that λ ≥ η_1 ⇒ q_1(λ) ≥ M and λ ≥ η_2 ⇒ q_2(λ) ≥ M. Taking η = max_{i ∈ {1,2}} η_i, it holds that q(λ) ≥ M whenever λ ≥ η, or equivalently, lim_{λ→+∞} q(λ) = +∞.

Proof of Lemma 7: The sum ξ_1 + ξ_2 clearly has the properties (P1)–(P3) as a sum of continuous, even, positive definite functions.
Moreover, the composition of a continuous, even, positive definite function with an injective linear mapping yields a continuous, even, positive definite function, so ξ_1 ∘ ℓ satisfies properties (P1)–(P3) too.

Proof of (j): Assume that ξ_1 and ξ_2 satisfy (P4) with K_∞ functions q_1 and q_2 respectively. For all λ ≠ 0 and all Z ∈ R^{a×b}, (9) yields

    ξ_i(Z) ≥ min_{j ∈ {1,2}} q_j(1/|λ|) ξ_i(λZ).   (67)

If we define q by q(λ) = min_{i ∈ {1,2}} q_i(λ) for all λ ∈ R_{≥0}, then q is a K_∞ function (see Lemma 8 above) such that for all λ ≠ 0 and Z ∈ R^{a×b},

    ξ_1(Z) + ξ_2(Z) ≥ q(1/|λ|) (ξ_1(λZ) + ξ_2(λZ)),   (68)

therefore ξ_1 + ξ_2 verifies property (P4). Besides, for all λ ≠ 0 and Z in R^{c×d},

    ξ_1(ℓ(Z)) ≥ q_1(1/|λ|) ξ_1(λ ℓ(Z)) = q_1(1/|λ|) ξ_1(ℓ(λZ))   (69)

by linearity of ℓ. We can then conclude that ξ_1 ∘ ℓ also verifies property (P4).

Proof of (jj): Assume that ξ_1 and ξ_2 satisfy (P5) with constants γ_1 and γ_2 respectively. Let γ = min_{i ∈ {1,2}} γ_i. Similarly to the first case, for all Z_1, Z_2 in R^{a×b} and i in {1,2}, (10) yields

    ξ_i(Z_1 − Z_2) ≥ γ ξ_i(Z_1) − ξ_i(Z_2),   (70)

which gives

    ξ_1(Z_1 − Z_2) + ξ_2(Z_1 − Z_2) ≥ γ (ξ_1(Z_1) + ξ_2(Z_1)) − (ξ_1(Z_2) + ξ_2(Z_2)),   (71)

therefore ξ_1 + ξ_2 satisfies property (P5). Moreover, for all Z_1 and Z_2 in R^{c×d},

    ξ_1(ℓ(Z_1 − Z_2)) = ξ_1(ℓ(Z_1) − ℓ(Z_2)) ≥ γ ξ_1(ℓ(Z_1)) − ξ_1(ℓ(Z_2)),   (72)

so ξ_1 ∘ ℓ satisfies (P5) too.

B. Proof of Lemma 3

(i) ⇒ (ii): Assuming that the system is observable on the interval [0, T−1], we need to prove that there exists a K_∞ function q which verifies (14). The idea of the proof is to apply Lemma 7 to the function F on R^{n×T} defined by F(Z) = V_Σ(0, Z), with V_Σ defined as in (5).
To begin with, we note that F can be decomposed as F = ξ ∘ ℓ, where ξ : R^{n×(T−1)} × R^{n_y×T} → R_{≥0} is a loss function such that for Z = [z_0 ··· z_{T−2}] in R^{n×(T−1)} and Y = [y_0 ··· y_{T−1}] in R^{n_y×T},

    ξ(Z, Y) = Σ_{t=0}^{T−2} φ_t(z_t) + Σ_{t=0}^{T−1} ψ_t(y_t),

and ℓ : R^{n×T} → R^{n×(T−1)} × R^{n_y×T} is a linear mapping such that for all Z = [z_0 ··· z_{T−1}] in R^{n×T},

    ℓ(Z) = ( [z_1 − A_0 z_0 ··· z_{T−1} − A_{T−2} z_{T−2}], [C_0 z_0 ··· C_{T−1} z_{T−1}] ).

To apply Lemma 7 to F, we need to check that F fulfills the properties (P1)–(P3). In virtue of the assumptions on φ_t and ψ_t agreed upon in the statement of the lemma, the first two properties are obviously satisfied. The third will be satisfied if ℓ is injective, a property which we now check. Let Z be such that ℓ(Z) = 0. Then

    z_{t+1} − A_t z_t = 0 for all t ∈ {0, ..., T−2},   (73)
    C_t z_t = 0 for all t ∈ {0, ..., T−1}.   (74)

An immediate consequence of (73)–(74) is that O_{0,T−1} z_0 = 0, which yields z_0 = 0 because the system is observable on [0, T−1]. Therefore, thanks to the recursive relation (73), we can conclude that Z = 0, and so the linear mapping ℓ is injective. We can therefore apply Lemma 7 to conclude that F indeed satisfies (P1)–(P4).

Now, consider a matrix norm ‖·‖_ind on R^{n×T} induced by two vector norms ‖·‖_T and ‖·‖ defined respectively on R^T and R^n, in the sense that

    ‖Z‖_ind = sup_{η ∈ R^T, η ≠ 0} ‖Zη‖ / ‖η‖_T.

Applying Lemma 2 to F with the so-defined induced norm, we infer that there exist D > 0 defined as in (12) and a K_∞ function q_0 such that for all Z in R^{n×T},

    F(Z) ≥ D q_0(‖Z‖_ind).   (75)

If we denote by e_1 the canonical vector of R^T with all entries equal to zero except the first one, which is equal to 1, then Z e_1 = z_0. Moreover, by definition of the induced norm, we know that ‖Z e_1‖ / ‖e_1‖_T ≤ ‖Z‖_ind.
Therefore, since q_0 is an increasing function, we get q_0(‖z_0‖ / ‖e_1‖_T) ≤ q_0(‖Z‖_ind). Posing q : λ ↦ D q_0(λ / ‖e_1‖_T), it is easy to see that q is a K_∞ function such that for all Z in R^{n×T}, V(0, Z) = F(Z) ≥ q(‖z_0‖).

(ii) ⇒ (i): Assume that there exists q in K_∞ such that (14) holds for all Z = [z_0 z_1 ··· z_{T−1}] in R^{n×T}. We want to prove that the matrix O_{0,T−1} defined in (13) has full column rank, which is equivalent to showing that, for z in R^n, O_{0,T−1} z = 0 implies z = 0. For any z ∈ R^n, construct a sequence Z* = [z*_0 ··· z*_{T−1}] as follows: z*_0 = z and z*_{t+1} = A_t z*_t for all t ∈ {0, ..., T−2}. Since the inequality (14) holds for any sequence, it holds in particular for the sequence {z*_t} defined above. Applying it to Z* yields

    V(0, Z*) = Σ_{t=0}^{T−1} ψ_t(C_t z*_t) ≥ q(‖z*_0‖).   (76)

Now, observe that if O_{0,T−1} z = 0, then it follows from the recursive relation z*_{t+1} = A_t z*_t that C_t z*_t = 0 for all t in {0, ..., T−1}. Injecting this into (76) imposes q(‖z*_0‖) ≤ 0, which necessarily implies z = 0 since q is a K_∞ function. Therefore, the matrix O_{0,T−1} is injective and the system is observable on the interval [0, T−1].

C. Technical results for proving Corollary 2

This section contains some technical steps of the proof of Corollary 2.

Lemma 9. If ℓ : R^{n_y} → R_{≥0} satisfies (P1)–(P3) and (P5), then so does the function ψ defined by ψ(y) = 1 − e^{−ℓ(y)}. Moreover, if ℓ fulfills (P4), then ψ satisfies the same property but with a function q in K_{sat,a} for a = 1.

Proof: It is straightforward to check that ψ obeys (P1)–(P3). By assumption, ℓ obeys (P5); denote the associated constant by γ_ℓ (which, by (10), is necessarily less than or equal to 1).
To see that (P5) is also satisfied by ψ, we just need to check that

    ψ(a + b) − γ̄_ℓ ψ(a) − γ̄_ℓ ψ(b) ≤ 0 for all (a, b) ∈ R^{n_y} × R^{n_y},   (77)

with γ̄_ℓ = γ_ℓ^{−1} ≥ 1, which is equivalent to

    1 − 2γ̄_ℓ + γ̄_ℓ e^{−ℓ(a)} + γ̄_ℓ e^{−ℓ(b)} − e^{−ℓ(a+b)} ≤ 0.

Noting that ℓ(a + b) ≤ γ̄_ℓ ℓ(a) + γ̄_ℓ ℓ(b), we have −e^{−ℓ(a+b)} ≤ −e^{−γ̄_ℓ ℓ(a) − γ̄_ℓ ℓ(b)}. From this it follows that, for (77) to hold, it is enough that

    1 − 2γ̄_ℓ + γ̄_ℓ e^{−ℓ(a)} + γ̄_ℓ e^{−ℓ(b)} − e^{−γ̄_ℓ ℓ(a) − γ̄_ℓ ℓ(b)} ≤ 0.

Posing α = e^{−ℓ(a)} and β = e^{−ℓ(b)}, it suffices that

    1 − 2γ̄_ℓ + γ̄_ℓ α + γ̄_ℓ β − (αβ)^{γ̄_ℓ} ≤ 0 for all (α, β) ∈ ]0, 1],

which can indeed be checked to be true by applying the inequality 1 + γ̄_ℓ α − γ̄_ℓ ≤ α^{γ̄_ℓ}, see e.g. [3, Fact 1.9.2]. In effect, it follows from this inequality that

    1 − 2γ̄_ℓ + γ̄_ℓ α + γ̄_ℓ β − (αβ)^{γ̄_ℓ} = (1 + γ̄_ℓ α − γ̄_ℓ) + (1 + γ̄_ℓ β − γ̄_ℓ) − 1 − (αβ)^{γ̄_ℓ}
        ≤ α^{γ̄_ℓ} + β^{γ̄_ℓ} − 1 − (αβ)^{γ̄_ℓ} = −(1 − α^{γ̄_ℓ})(1 − β^{γ̄_ℓ}) ≤ 0.

In conclusion, (77) holds and therefore ψ satisfies (P5). It now remains to check (P4). This follows directly from Lemma 10 below, from which we know that ψ(y) ≥ q_⋆(1/λ) ψ(λy), where q_⋆ is a saturated function in K_{sat,1}.

Lemma 10. Let ℓ : R^{n_y} → R_{≥0} be a function satisfying properties (P1)–(P2) and (P4). In particular, assume that property (P4) is satisfied by ℓ with a K_∞ function q such that (9) holds with equality. Let g(y, λ) = (1 − e^{−ℓ(y)}) / (1 − e^{−ℓ(y/λ)}) for λ ≠ 0 and y ≠ 0. Then the function q_⋆ : R_{≥0} → [0, 1] defined by q_⋆(λ) = inf_{y ≠ 0} g(y, λ) for λ > 0 and q_⋆(0) = 0 is well-defined, continuous, and strictly increasing on [0, 1]. Moreover, we have

    1 − e^{−ℓ(y)} ≥ q_⋆(1/λ) (1 − e^{−ℓ(λy)}) for all (λ, y) ∈ R_{>0} × R^{n_y}.

Proof: Since g is positive on its domain (hence lower bounded), the infimum defining q_⋆ is well-defined. Pose a = e^{−ℓ(y)}.
Then, using the continuity of ℓ and its radial unboundedness (see Lemma 2), we see that the range of a as y runs over R^{n_y} \ {0} is ]0, 1[. From the assumptions of the lemma, ℓ(y/λ) = q(1/λ) ℓ(y) for all y and all λ > 0, so q(1) = 1 and e^{−ℓ(y/λ)} = a^{q(1/λ)}. For all λ > 0 we can therefore write

    q_⋆(λ) = inf_{y ≠ 0} g(y, λ) = inf_{a ∈ ]0,1[} (1 − a) / (1 − a^{q(1/λ)}),

with q(1/λ) ≥ 1 for 0 < λ ≤ 1 and q(1/λ) < 1 for λ > 1. We therefore obtain

    q_⋆(λ) = 1 / q(1/λ) if 0 < λ ≤ 1, and q_⋆(λ) = 1 otherwise.

The so-obtained q_⋆ is clearly continuous wherever it is well-defined. Moreover, since lim_{λ→0} q_⋆(λ) = q_⋆(0) = 0, we conclude that q_⋆ is continuous on its entire domain. From the properties of q, we deduce that q_⋆ is strictly increasing on [0, 1]. Lastly, we observe that the inequality in the statement of the lemma is a direct consequence of the definition of q_⋆.

D. Proof of Lemma 6

The starting point of the proof is the observation that, for every integer r in {1, ..., T}, p̃_r ≤ r p̃_1. Hence it suffices to show that p̃_1 = 1/b_1 with b_1 as expressed in (63). Recall that, by definition,

    p̃_1 = sup_{(t,i) ∈ T×S} sup_{Z ∈ R^{n×T}, Z ≠ 0} |c_ti^T z_t| / H_Σ(Z).   (78)

Without loss of generality, assume that c_ti ≠ 0 for all (t, i) ∈ T × S. Then for any (t, i),

    sup_{Z ≠ 0} |c_ti^T z_t| / H_Σ(Z) = sup_{Z ≠ 0} { |c_ti^T z_t| / H_Σ(Z) : c_ti^T z_t ≠ 0 } =: 1/β_ti,

where

    β_ti = inf_{Z ≠ 0} { H_Σ(Z) / |c_ti^T z_t| : c_ti^T z_t ≠ 0 }
         = inf_{Z ≠ 0} { H_Σ(Z) : |c_ti^T z_t| = 1 }
         = inf_{Z ≠ 0} { H_Σ(Z) : c_ti^T z_t = 1 }.

Recalling that H_Σ(Z) is a norm under the conditions of the lemma, the second equality in the expression of β_ti above follows from the homogeneity property of norms.
As to the last equality, it follows from the fact that c_ti^T z_t is a scalar, which makes it possible to replace the constraint |c_ti^T z_t| = 1 indifferently by either c_ti^T z_t = 1 or c_ti^T z_t = −1. Now, invoking the definition of p̃_1, it can be seen that

    p̃_1 = sup_{(t,i) ∈ T×S} 1/β_ti = 1 / inf_{(t,i) ∈ T×S} β_ti = 1/b_1.

REFERENCES

[1] L. Bako. Identification of switched linear systems via sparse optimization. Automatica, 47:668–677, 2011.
[2] L. Bako. On a class of optimization-based robust estimators. IEEE Transactions on Automatic Control, 62:5990–5997, 2017.
[3] D. S. Bernstein. Matrix Mathematics: Theory, Facts, and Formulas. Princeton University Press, 2009.
[4] S. Boyd and L. Vandenberghe. Convex Optimization. Cambridge University Press, 2004.
[5] E. Candès and P. A. Randall. Highly robust error correction by convex programming. IEEE Transactions on Information Theory, 54:2829–2840, 2008.
[6] E. J. Candès. The restricted isometry property and its implications for compressed sensing. Comptes Rendus Mathématique, 346(9-10):589–592, 2008.
[7] E. J. Candès and M. B. Wakin. An introduction to compressive sampling. IEEE Signal Processing Magazine, 25:21–30, 2008.
[8] A. Cardenas, S. Amin, and S. Sastry. Secure control: Towards survivable cyber-physical systems. In The 28th International Conference on Distributed Computing Systems Workshops, 2008.
[9] Y. H. Chang, Q. Hu, and C. J. Tomlin. Secure estimation based Kalman filter for cyber-physical systems against sensor attacks. Automatica, 95:399–412, 2018.
[10] H. Fawzi, P. Tabuada, and S. Diggavi. Secure estimation and control for cyber-physical systems under adversarial attacks. IEEE Transactions on Automatic Control, 59(6):1454–1467, 2014.
[11] S. Foucart and H. Rauhut. A Mathematical Introduction to Compressive Sensing. Birkhäuser, 2013.
[12] M. C. Grant and S. P. Boyd. CVX: Matlab software for disciplined convex programming, version 2.1. 2017.
[13] D. Han, Y. Mo, and L. Xie. Convex optimization based state estimation against sparse integrity attacks. IEEE Transactions on Automatic Control, 64:2383–2395, 2019.
[14] P. J. Huber and E. M. Ronchetti. Robust Statistics. John Wiley & Sons (2nd ed.), 2009.
[15] C. M. Kellett. A compendium of comparison function results. Mathematics of Control, Signals, and Systems, 26:339–374, 2014.
[16] A. Kircher, L. Bako, E. Blanco, M. Benallouch, and A. Korniienko. Analysis of resilience for a state estimator for time-discrete linear systems. In 2020 American Control Conference.
[17] S. Mishra, Y. Shoukry, N. Karamchandani, S. N. Diggavi, and P. Tabuada. Secure state estimation against sensor attacks in the presence of noise. IEEE Transactions on Control of Network Systems, 4:49–59, 2017.
[18] Y. Mo and B. Sinopoli. Secure estimation in the presence of integrity attacks. IEEE Transactions on Automatic Control, 60:1145–1151, 2015.
[19] M. Pajic, I. Lee, and G. J. Pappas. Attack-resilient state estimation for noisy dynamical systems. IEEE Transactions on Control of Network Systems, 4:82–92, 2016.
[20] F. Pasqualetti, F. Dörfler, and F. Bullo. Attack detection and identification in cyber-physical systems. IEEE Transactions on Automatic Control, 58:2715–2729, 2013.
[21] X. Ren, Y. Mo, J. Chen, and K. H. Johansson. Secure state estimation with byzantine sensors: A probabilistic approach. Manuscript, https://arxiv.org/abs/1903.05698, 2019.
[22] R. T. Rockafellar and R. J.-B. Wets. Variational Analysis. Springer Science & Business Media, 2009.
[23] Y. Sharon, J. Wright, and Y. Ma. Minimum sum of distances estimator: Robustness and stability. In American Control Conference, St. Louis, MO, USA, pages 524–530, 2009.
[24] Y. Shoukry and P. Tabuada. Event-triggered state observers for sparse sensor noise/attacks. IEEE Transactions on Automatic Control, 61:2079–2091, 2016.
[25] B. Sinopoli, L. Schenato, M.
Franceschetti, K. Poolla, M. Jordan, and S. Sastry. Kalman filtering with intermittent observations. IEEE Transactions on Automatic Control, 49:1453–1464, 2004.
