Multi-agent estimation and filtering for minimizing team mean-squared error

Motivated by estimation problems arising in autonomous vehicles and decentralized control of unmanned aerial vehicles, we consider multi-agent estimation and filtering problems in which multiple agents generate state estimates based on decentralized …

Authors: Mohammad Afshari, Aditya Mahajan

Multi-agent estimation and filtering for minimizing team mean-squared error

Mohammad Afshari, Student Member, IEEE, and Aditya Mahajan, Senior Member, IEEE

Abstract—Motivated by estimation problems arising in autonomous vehicles and decentralized control of unmanned aerial vehicles, we consider multi-agent estimation and filtering problems in which multiple agents generate state estimates based on decentralized information and the objective is to minimize a coupled mean-squared error which we call team mean-squared error. We call the resulting estimates minimum team mean-squared error (MTMSE) estimates. We show that MTMSE estimates are different from minimum mean-squared error (MMSE) estimates. We derive closed-form expressions for MTMSE estimates, which are linear functions of the observations where the corresponding gain depends on the weight matrix that couples the estimation error. We then consider a filtering problem where a linear stochastic process is monitored by multiple agents which can share their observations (with delay) over a communication graph. We derive expressions to recursively compute the MTMSE estimates. To illustrate the effectiveness of the proposed scheme, we consider an example of estimating the distances between vehicles in a platoon and show that MTMSE estimates significantly outperform MMSE estimates and consensus Kalman filtering estimates.

I. INTRODUCTION

Emerging applications in autonomous vehicles and decentralized control of UAVs (unmanned aerial vehicles) give rise to estimation problems where multiple agents use local measurements to estimate the state of the shared environment in which they are operating and then use these estimates to act in the environment. In the resulting decentralized estimation problems, the objective is to minimize the weighted mean-squared error between the true state and the decentralized estimates generated by all agents.
We call such a coupled mean-squared error the team mean-squared error and the resulting estimates minimum team mean-squared error (MTMSE) estimates. For example, consider a platoon of self-driving vehicles where the estimation objective is to ensure that the position estimates of each vehicle are close to the true position of the vehicle and, at the same time, the differences between the position estimates of adjacent vehicles are close to the true differences between the positions. Or consider a fleet of UAVs (unmanned aerial vehicles) where the estimation objective is to ensure that the position estimates of each UAV are close to the true position of the UAV and, at the same time, the centroid of the estimates of all UAVs is close to the true centroid of their positions. A salient feature of these examples is that there are multiple agents who generate state estimates based on different information and the objective is to minimize a weighted mean-squared error between the true state and the decentralized estimates generated by all agents.

(The authors are with the Department of Electrical and Computer Engineering, McGill University, Montreal, QC, H3A-0E9, Canada. Emails: mohammad.afshari2@mail.mcgill.ca, aditya.mahajan@mcgill.ca. This research was supported by the Natural Sciences and Engineering Research Council of Canada (NSERC). A preliminary version of this paper was presented at the 2018 IEEE Conference on Decision and Control (CDC) [1].)

We first start with a simple example to illustrate that MTMSE estimates are different from the standard MMSE (minimum mean-squared error) estimates. Consider a system with two agents, indexed by i ∈ {1, 2}, which observe the state of nature x ~ N(0, 1) with noise. In particular, the measurement y_i ∈ R of agent i is

    y_i = x + v_i,   v_i ~ N(0, σ²),

where x, v_1, and v_2 are independent.
Agent i ∈ {1, 2} generates an estimate ẑ_i = g_i(y_i) ∈ R based on its local measurements, where (g_1, g_2) is an arbitrary estimation strategy. The objective is to ensure that ẑ_i is close to x and, at the same time, that the average (ẑ_1 + ẑ_2)/2 of the estimates is close to x. Thus, the estimation error J(g_1, g_2) of the estimation strategy (g_1, g_2) is given by

    E[(x − ẑ_1)² + (x − ẑ_2)²] + λ E[(x − (ẑ_1 + ẑ_2)/2)²]
        = E[ e^⊤ [ 1 + λ/4, λ/4 ; λ/4, 1 + λ/4 ] e ],  where e = [ x − ẑ_1 ; x − ẑ_2 ],    (1)

where λ ∈ R_{>0}. Naively choosing ẑ_i as the MMSE estimate of x given y_i, i.e., choosing

    ẑ_i = g_i^mmse(y_i) := E[x | y_i] = y_i/(1 + σ²),

gives an estimation error of

    J^mmse = J(g_1^mmse, g_2^mmse) = 2 (σ²/(1 + σ²)) (1 + (λ/4)·(1 + 2σ²)/(1 + σ²)).

This naive strategy does not minimize the team mean-squared error given by (1), even within the class of linear estimation strategies. To see this, we identify the best linear estimation strategy. Let ẑ_i = g_i^lin(y_i) = F y_i, where F is the same for both agents due to symmetry. The estimation error for this linear strategy is

    J^lin = J(g_1^lin, g_2^lin) = (2 + λ)(1 − F)² + 2(1 + λ/4) F² σ²,

which is convex in F. The value of the gain F which minimizes this estimation error is

    F = 1/(1 + ασ²),  where α = (1 + λ/4)/(1 + λ/2).

The corresponding estimation error is

    J^lin = (2 + λ) ασ²/(1 + ασ²).

Note that for large λ, α ≈ 1/2 and the relative improvement

    Δ := (J^mmse − J^lin)/J^lin ≈ (1/2) · σ²/(1 + σ²)²

is significant for moderate values of σ. For example, for σ = 1, the relative percentage improvement is 12.5%. The relative percentage improvement Δ = (J^mmse − J^lin)/J^lin × 100 as a function of σ for different values of λ is shown in Fig. 1.
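The closed-form expressions above are easy to check numerically. The following sketch (our own illustration, not from the paper; the values of σ² and λ are arbitrary) evaluates the team error of any common gain F directly from the Gaussian moments and confirms that the best linear MTMSE gain outperforms the MMSE gain:

```python
import numpy as np

sigma2, lam = 1.0, 50.0  # sigma = 1 and lambda = 50, chosen for illustration

def team_error(F):
    # With z_i = F*y_i and y_i = x + v_i, x ~ N(0,1), v_i ~ N(0, sigma2):
    #   E[(x - F*y_i)^2]         = (1-F)^2 + F^2 * sigma2
    #   E[(x - F*(y_1+y_2)/2)^2] = (1-F)^2 + F^2 * sigma2 / 2
    return 2 * ((1 - F)**2 + F**2 * sigma2) + lam * ((1 - F)**2 + F**2 * sigma2 / 2)

F_mmse = 1 / (1 + sigma2)                  # MMSE gain: E[x | y_i] = y_i / (1 + sigma^2)
alpha = (1 + lam / 4) / (1 + lam / 2)
F_team = 1 / (1 + alpha * sigma2)          # best linear MTMSE gain

J_mmse, J_lin = team_error(F_mmse), team_error(F_team)
# matches the closed form J_lin = (2 + lam) * alpha * sigma^2 / (1 + alpha * sigma^2)
assert np.isclose(J_lin, (2 + lam) * alpha * sigma2 / (1 + alpha * sigma2))
print((J_mmse - J_lin) / J_lin)  # relative improvement; approaches 12.5% as lam grows
```

For σ = 1 and λ = 50 the relative improvement is a little over 11%, consistent with the large-λ limit of 12.5% stated above.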
The improvement is significant for higher values of λ.

[Fig. 1: Comparison of the relative improvement of the best linear MTMSE estimator over the MMSE estimator, plotted as percent improvement Δ(%) against σ, for λ = 10, 25, 50.]

This significant improvement over MMSE estimates for a simple example motivates the central question of this paper: what are the estimation and filtering strategies that minimize the team mean-squared error? We start by modeling and answering this question for estimation in Sec. II. Then, in Sec. III, we model and answer this question for filtering, where we assume that agents are connected over a communication graph and can share their measurements over it. We generalize the filtering results to the infinite horizon setup in Sec. III-F. Finally, we present examples to illustrate that MTMSE estimates significantly outperform MMSE and consensus Kalman filtering estimates.

A. Literature overview

Following the seminal work of Kalman [2] on recursive MMSE filtering, several variations of single- and multi-agent MMSE filtering have been investigated in the literature. However, as far as we are aware, there are only two references which have investigated estimation or filtering for the MTMSE objective [3], [4]. Both references investigated multi-agent filtering of a continuous-time linear stochastic process. In [3], each agent observes a noise-corrupted measurement of the state and the objective is to minimize a specific form of team mean-squared error. The key idea of [3] is to consider an augmented state and observation model and formulate the team mean-squared error as the squared norm of an appropriately defined inner product of these augmented variables. It is shown that the team mean-squared filtering problem can be formulated as a Hilbert-space mean-squared error filtering problem and, therefore, solved using an appropriate Kalman filter.
The model considered in [4] is similar except that each agent has multiple observation channels and, at each time, can select which observation channel to use. The solution approach is similar to [3].

Although [3], [4] are able to transform an MTMSE filtering problem into a Hilbert-space MMSE filtering problem, the approach has several limitations. First, and most importantly, the approach of [3], [4] is only applicable to a specific form of MTMSE cost. The formulation of the team mean-squared error as a squared norm of an appropriately defined inner product does not hold for the more general team mean-squared error considered in this paper. In particular, the form of the team mean-squared error considered in the practical examples in Sec. IV cannot be written as the squared norm of an appropriate inner product. Second, the size of the augmented state variables used in [3], [4] scales linearly with the number of agents. In particular, for an n-agent MTMSE filtering problem where the state is of dimension d_x, the augmented state (and therefore the augmented estimate) is of dimension n·d_x, so the resulting Kalman filter needs to keep track of an n·d_x × n·d_x dimensional covariance matrix. In contrast, the solution that we propose only requires a Kalman filter with a d_x × d_x dimensional error covariance. Finally, [3], [4] did not consider sharing of measurements among the agents. Such sharing of measurements is a key feature of the general filtering model that we consider in this paper.

Estimation problems with coupling between the estimates have been considered in the economics literature [5]–[7]. However, in such models, agents are strategic and want to minimize an individual estimation objective. The solution concept is identifying estimation strategies which are in Nash equilibrium, which is different from the solution concept of minimizing a common team estimation error considered here.
There is a rich literature on multi-agent filtering for distributed sensor fusion [8]–[12] as well as for distributed simultaneous localization and mapping (SLAM) in robotics [13]–[15]. There is also a rich literature on multi-agent estimation using consensus and gossip Kalman filters [16]–[21] (and references therein). However, all these methods only consider MMSE filtering. As illustrated by the motivating example presented at the beginning, MTMSE estimates can be significantly different from MMSE estimates. So, the vast literature on multi-agent MMSE filtering is not directly applicable to MTMSE filtering.

B. Contributions of the paper

The salient feature of the model is that agents are informationally decentralized and need to cooperate to minimize a common team estimation objective. Our focus is to identify the structure of estimation strategies that minimize the team mean-squared error when the graph topology, system dynamics, and the noise covariances are known to all agents.

We consider the problem of minimizing the team mean-squared error in an estimation problem where the measurements of the agents may be split into a common measurement and local measurements.¹ Using tools from team theory [22], we show that the optimal MTMSE estimate is a sum of two terms. The first term is the MMSE estimate of the state given the common measurement. The second term is a linear function of the innovation in the local measurement given the common measurement. Furthermore, the corresponding gains are computed by solving a system of matrix equations, which can be converted into a linear system of equations using vectorization.

We then consider the problem of minimizing the sum of team mean-squared errors over time in a filtering problem where the agents share their measurements with their neighbors over a strongly connected communication graph.
Since the graph is strongly connected, the information available at each agent can be split into common information and local information. We show that the structure of the optimal MTMSE estimates identified in the estimation setup continues to hold for filtering as well. We set up an appropriate linear system with delayed observations to derive recursive formulas for the MMSE estimate of the state based on the common information and for the innovation in the local measurements given the common measurements. We also derive recursive formulas for computing the various covariances needed to compute the gain which multiplies the innovation term in the optimal estimates. Finally, we show that under standard stabilizability and detectability conditions, a time-homogeneous estimation strategy is optimal for minimizing the long-term average team mean-squared error.

A preliminary version of this paper appeared in [1], where the main result for the filtering problem (Theorem 2) was stated. The proof of Theorem 2 relies heavily on the results for the estimation problem (Theorem 1), which were not included in [1]. Neither was the generalization to the infinite horizon (Theorem 3). The detailed numerical experiments and the comparison with MMSE estimates and consensus Kalman filtering (Section IV), the detailed comparison with [3], [4] (Section I), the relation between the MTMSE estimates and decentralized control (Section V-B), and the trade-off between MTMSE filter complexity and estimation accuracy (Section V-C) are new as well.

C. Notation

Let δ_ij denote the Kronecker delta function (which is one if i = j and zero otherwise). Given a matrix A, A_ij denotes its (i, j)-th element, A_i• denotes its i-th row, A_•j denotes its j-th column, A^⊤ denotes its transpose, and vec(A) denotes the column vector formed by vertically stacking the columns of A. Given a vector x, ‖x‖² denotes x^⊤x.
Given matrices A and B, diag(A, B) denotes the matrix obtained by putting A and B in diagonal blocks, and A ⊗ B denotes the Kronecker product of the two matrices. Given matrices A and B with the same number of columns, rows(A, B) denotes the matrix obtained by stacking A on top of B. Given a square matrix A, Tr(A) denotes the sum of its diagonal elements. Given a symmetric matrix A, the notation A > 0 and A ≥ 0 means that A is positive definite and positive semi-definite, respectively. 1_{n×m} is an n × m matrix with all elements equal to one. 0_n is a square n × n matrix with all elements equal to zero. I_n is the n × n identity matrix; we omit the subscript from I_n when the dimension is clear from context.

We sometimes consider random vectors X = (x_1, . . . , x_k) as a set with random elements {x_1, . . . , x_k}. In particular, given two random vectors X = (x_1, . . . , x_k) and Y = (y_1, . . . , y_m), we define X ∩ Y to mean vec({x_1, . . . , x_k} ∩ {y_1, . . . , y_m}). Similarly, we use X \ Y to mean vec({x_1, . . . , x_k} \ {y_1, . . . , y_m}). Given any vector-valued process {y(t)}_{t≥1} and any time instances t_1, t_2 such that t_1 ≤ t_2, y(t_1:t_2) is shorthand for vec(y(t_1), y(t_1+1), . . . , y(t_2)). Given matrices {A(i)}_{i=1}^n with the same number of columns and vectors {w(i)}_{i=1}^n, rows(⊔_{i=1}^n A(i)) and vec(⊔_{i=1}^n w(i)) denote rows(A(1), . . . , A(n)) and vec(w(1), . . . , w(n)), respectively. Given random vectors x and y, E[x] and var(x) denote the mean and variance of x, while cov(x, y) denotes the covariance between x and y.

¹ If no such split is possible, then the common measurement is simply empty.

II. MINIMUM TEAM MEAN-SQUARED ERROR (MTMSE) ESTIMATION

A. Model and problem formulation

Consider a system with n agents that are indexed by the set N = {1, . . . , n}. The agents are interested in estimating the state x ∈ R^{d_x} of nature. Agent i makes a local measurement y_i ∈ R^{d_y^i}, i ∈ N. In addition, all agents observe a common measurement, which we denote by y_0 ∈ R^{d_y^0}. We use N_0 to denote the set {0, 1, . . . , n}. The variables (x, y_0, y_1, . . . , y_n) are assumed to be jointly Gaussian zero-mean random variables. For any i, j ∈ N_0, let Θ_i = cov(x, y_i) and Σ_ij = cov(y_i, y_j).

Agent i ∈ N generates an estimate ẑ_i ∈ R^{d_z^i} according to an estimation rule g_i, i.e., ẑ_i = g_i(y_0, y_i). Given weight matrices {S_ij}_{i,j∈N} and {L_i}_{i∈N}, where S_ij ∈ R^{d_z^i × d_z^j} and L_i ∈ R^{d_z^i × d_x}, the performance is measured by the team estimation error given by

    c(x, ẑ_1, . . . , ẑ_n) = Σ_{i∈N} Σ_{j∈N} (L_i x − ẑ_i)^⊤ S_ij (L_j x − ẑ_j).    (2)

Let ẑ = vec(ẑ_1, . . . , ẑ_n) denote the estimate of all agents. The team estimation error c(x, ẑ) is a weighted quadratic function of (Lx − ẑ). In particular,

    c(x, ẑ) = (Lx − ẑ)^⊤ S (Lx − ẑ),    (3)

where S and L are given by

    S = [S_ij]_{i,j∈N}  (the block matrix with (i, j)-th block S_ij)  and  L = rows(L_1, . . . , L_n).    (4)

We assume that the matrix S is positive definite.

We now present a few examples of estimation error functions of the form (3):

1) Suppose x = vec(x_1, . . . , x_n), where x_i is the local state of agent i ∈ N. Suppose the agents want to estimate their own local state but, at the same time, want to make sure that the average z̄ := (1/n) Σ_{i∈N} ẑ_i of their estimates is close to the average x̄ := (1/n) Σ_{i∈N} x_i of their local states. In this case, the team mean-squared error function is

    c(x, ẑ) = Σ_{i∈N} ‖x_i − ẑ_i‖² + λ ‖x̄ − z̄‖²,    (5)

where λ ∈ R_{>0}. This can be written in the form (3) with L = I and S_ij = (δ_ij + λ/n²) I.
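For example 1) above, the weight matrix S can be assembled with a Kronecker product and checked against the direct form (5). The following sketch is our own illustration (the numbers of agents, state dimension, and λ are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, lam = 4, 2, 3.0   # number of agents, local state dimension, coupling weight (illustrative)

# S_ij = (delta_ij + lam/n^2) I_d, assembled blockwise via a Kronecker product; L = I
S = np.kron(np.eye(n) + (lam / n**2) * np.ones((n, n)), np.eye(d))

x = rng.standard_normal(n * d)   # stacked local states vec(x_1, ..., x_n)
z = rng.standard_normal(n * d)   # stacked estimates vec(z_1, ..., z_n)

# quadratic form (3)
quad = (x - z) @ S @ (x - z)

# direct form (5): sum of local errors plus coupling on the averages
xs, zs = x.reshape(n, d), z.reshape(n, d)
direct = sum(np.sum((xs[i] - zs[i])**2) for i in range(n)) \
         + lam * np.sum((xs.mean(axis=0) - zs.mean(axis=0))**2)

assert np.isclose(quad, direct)
```

The Kronecker assembly works because the (i, j)-th d × d block of kron(M, I_d) is M_ij · I_d, which is exactly the block structure of S in this example.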
2) Suppose the agents are moving in a line (e.g., a vehicular platoon) or in a closed shape (e.g., UAVs flying in a formation) and want to estimate their local state but, at the same time, want to ensure that the difference d̂_i := ẑ_i − ẑ_{i+1} between their estimates is close to the difference d_i := x_i − x_{i+1} of their local states. For example, when agents are moving in a line, the team mean-squared error function is

    c(x, ẑ) = Σ_{i∈N} ‖x_i − ẑ_i‖² + λ Σ_{i∈N\{n}} ‖d_i − d̂_i‖²,    (6)

where λ ∈ R_{>0}. This can be written in the form (3) with L = I and

    S_ij = (1 + 2λ) I,  if i = j ∈ {2, . . . , n−1},
           (1 + λ) I,   if i = j ∈ {1, n},
           −λ I,        if j ∈ {i+1, i−1},
           0,           otherwise.

A similar weight matrix can be obtained for the case when agents are moving in a closed shape.

3) Suppose each agent generates an estimate ẑ_i ∈ R^{d_x} of the state x of nature and the objective is to minimize

    c(x, ẑ_1, . . . , ẑ_n) = Σ_{i∈N} Σ_{j∈N} (x − ẑ_i)^⊤ S_ij (x − ẑ_j).

This can be written in the form (3) with L = 1_{n×1} ⊗ I_{d_x}. This cost function is equivalent to the team mean-squared error considered in [3], [4].

We are interested in the following optimization problem.

Problem 1: Given the covariance matrices {Θ_i}_{i∈N_0} and {Σ_ij}_{i,j∈N_0} and weight matrices L and S, choose the estimation strategy g = (g_1, . . . , g_n) to minimize the expected team estimation error J(g) given by

    J(g) := E[c(x, ẑ)].    (7)

Remark 1: In Problem 1, the system model is common knowledge among all agents. Thus, it may be viewed as a problem of "centralized planning and decentralized execution." The key conceptual difficulty in the problem is that the estimates are generated using different information (recall that the information available at agent i is (y_0, y_i)) with the objective of minimizing a common coupled team estimation error given by (3).
This feature makes Problem 1 conceptually different from the standard estimation problem of minimizing the MMSE error. □

B. Optimal team estimation strategy

We define three auxiliary variables:
• All agents' common estimate of the state x given the common measurement y_0 at all agents. We denote this estimate by x̂_0; it is equal to E[x | y_0].
• All agents' common estimate of agent i's measurement y_i given the common measurement y_0. We denote this estimate by ŷ_i; it is equal to E[y_i | y_0].
• The innovation in the local measurement of agent i with respect to the common measurement. We denote this innovation by ỹ_i; it is equal to y_i − ŷ_i.

Let Θ̂_i denote the covariance cov(x, ỹ_i) and Σ̂_ij denote the covariance cov(ỹ_i, ỹ_j). From elementary properties of Gaussian random variables, we have the following:

Lemma 1: The covariance matrices defined above are given by
1) Θ̂_i = Θ_i − Θ_0 Σ_00^{−1} Σ_0i.
2) Σ̂_ij = Σ_ij − Σ_i0 Σ_00^{−1} Σ_0j.
Therefore, the auxiliary variables defined above are given by
3) x̂_0 = Θ_0 Σ_00^{−1} y_0.
4) ŷ_i = Σ_i0 Σ_00^{−1} y_0.
Furthermore, we have
5) E[x | y_0, y_i] = x̂_0 + Θ̂_i Σ̂_ii^{−1} ỹ_i.
6) E[ỹ_j | y_0, y_i] = Σ̂_ji Σ̂_ii^{−1} ỹ_i. □

The result follows from elementary properties of Gaussian random variables. Then, we have the following.

Theorem 1: The estimation strategy that minimizes the team mean-squared error in Problem 1 is a linear function of the measurements. Specifically, the MTMSE estimate may be written as

    ẑ_i = L_i x̂_0 + F_i ỹ_i,   ∀ i ∈ N,    (8)

where the gains {F_i}_{i∈N} satisfy the following system of matrix equations:

    Σ_{j∈N} [ S_ij F_j Σ̂_ji − S_ij L_j Θ̂_i ] = 0,   ∀ i ∈ N.    (9)

If Σ̂_ii > 0 for all i ∈ N, then (9) has a unique solution, which can be written as

    F = Γ^{−1} η,    (10)

where F = vec(F_1, . . . , F_n), η = vec(S_1• L Θ̂_1, . . . , S_n• L Θ̂_n), and Γ = [Γ_ij]_{i,j∈N} with Γ_ij = Σ̂_ij ⊗ S_ij. Furthermore, the minimum team mean-squared error is given by

    J* = Tr(L^⊤ S L P_0) − η^⊤ Γ^{−1} η,    (11)

where S_i• = [S_i1, . . . , S_in] and P_0 = var(x − x̂_0). □

The proof of Theorem 1 is presented in Appendix A.

To illustrate this result, consider the two-agent example presented in the introduction. In that model, there is no common measurement, so x̂_0 = 0, ŷ_i = 0, and therefore ỹ_i = y_i. Moreover, Σ̂_ij = 1 + σ² δ_ij and Θ̂_i = 1. Therefore,

    Γ_ij = S_ij Σ̂_ij = (δ_ij + λ/4)(1 + δ_ij σ²),   η_i = S_i1 + S_i2 = 1 + λ/2.

Thus, the optimal gains are

    F = Γ^{−1} η = (1/(1 + ασ²)) [1 ; 1],

where α = (1 + λ/4)/(1 + λ/2), and the minimum team mean-squared error is

    J* = (Σ_{i,j} S_ij) − η^⊤ F = (2 + λ) ασ²/(1 + ασ²).

Thus, we recover the results obtained by brute-force calculations in the introduction.

Remark 2: In (8), the first term of the estimate is the MMSE estimate of the current state given the common measurements. The second term may be viewed as a "correction" which depends on the innovation in the local measurement. A salient feature of the result is that the gains {F_i}_{i∈N} depend on the weight matrix S. □

Remark 3: When S is block diagonal, there is no cost coupling among the agents and Problem 1 reduces to n separate problems. Thus, the MMSE estimates L_i E[x | y_0, y_i] are also the MTMSE estimates. □

III. MINIMUM TEAM MEAN-SQUARED ERROR (MTMSE) FILTERING

In this section, we consider the problem of filtering to minimize the team mean-squared error when agents share information over a communication graph. We start with a quick overview of graph-theoretic terminology.

A. Overview of graph-theoretic terminology

A directed weighted graph G is an ordered set (N, E, τ), where N is the set of nodes, E ⊂ N × N is the set of ordered edges, and τ : E → R is a weight function.
An edge (i, j) ∈ E is directed from i to j; i is an in-neighbor of j; j is an out-neighbor of i; and i and j are neighbors. The set of in-neighbors of i, called the in-neighborhood of i, is denoted by N_i^−; the set of out-neighbors of i, called the out-neighborhood, is denoted by N_i^+. In a directed graph, a directed path (v_1, v_2, . . . , v_k) is a sequence of distinct nodes such that (v_i, v_{i+1}) ∈ E. The weighted length of a path is the sum of the weights of its edges. The geodesic distance between two nodes i and j, denoted by ℓ_ij, is the shortest weighted length over all paths connecting the two nodes. The weighted diameter of the graph is the largest geodesic distance between any two nodes. A directed graph is called strongly connected if for every pair of nodes i, j ∈ N, there is a directed path from i to j and from j to i. A directed graph is called complete if for every pair of nodes i, j ∈ N, there is a directed edge from i to j and from j to i.

B. Model and problem formulation

1) Observation model: Consider a linear stochastic process {x(t)}_{t≥1}, x(t) ∈ R^{d_x}, where x(1) ~ N(0, Σ_x) and for t ≥ 1,

    x(t+1) = A x(t) + w(t),    (12)

where A is a d_x × d_x matrix and w(t) ∈ R^{d_x}, w(t) ~ N(0, Q), is the process noise. There are n agents, indexed by N = {1, . . . , n}, which observe the process with noise. At time t, the measurement y_i(t) ∈ R^{d_y^i} of agent i ∈ N is given by

    y_i(t) = C_i x(t) + v_i(t),    (13)

where C_i is a d_y^i × d_x matrix and v_i(t) ∈ R^{d_y^i}, v_i(t) ~ N(0, R_i), is the measurement noise. Eq. (13) may be written in vector form as y(t) = C x(t) + v(t), where C = rows(C_1, . . . , C_n), y(t) = vec(y_1(t), . . . , y_n(t)), and v(t) = vec(v_1(t), . . . , v_n(t)).
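As a concrete reference, the observation model (12)-(13) can be simulated in a few lines. All numerical values below are illustrative assumptions, not from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

# illustrative 2-dimensional state with n = 2 agents, one scalar measurement each
A = np.array([[1.0, 0.1],
              [0.0, 1.0]])
Q = 0.01 * np.eye(2)              # process-noise covariance
C = np.array([[1.0, 0.0],         # C = rows(C_1, C_2)
              [0.0, 1.0]])
R = np.diag([0.04, 0.09])         # R = diag(R_1, R_2)

x = rng.multivariate_normal(np.zeros(2), np.eye(2))  # x(1) ~ N(0, Sigma_x)
traj, meas = [], []
for t in range(50):
    y = C @ x + rng.multivariate_normal(np.zeros(2), R)   # measurement (13)
    traj.append(x)
    meas.append(y)
    x = A @ x + rng.multivariate_normal(np.zeros(2), Q)   # dynamics (12)
```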
The agents are connected over a communication graph G, which is a strongly connected weighted directed graph with vertex set N. For every edge (i, j), the associated weight τ_ij is a positive integer that denotes the communication delay from node i to node j. Let I_i(t) denote the information available to agent i at time t. We assume that agent i knows the history of all its measurements and the τ_ji-step delayed information of its in-neighbors j ∈ N_i^−, i.e.,

    I_i(t) = {y_i(1:t)} ∪ ⋃_{j∈N_i^−} I_j(t − τ_ji).    (14)

In (14), we implicitly assume that I_i(t) = ∅ for any t ≤ 0. Let ζ_i(t) = I_i(t) \ I_i(t−1) denote the new information that becomes available to agent i at time t. Then, ζ_i(1) = y_i(1) and for t > 1, ζ_i(t) = vec(y_i(t), {ζ_j(t − τ_ji)}_{j∈N_i^−}). It is assumed that at each time t, agent j ∈ N communicates ζ_j(t) to all its out-neighbors. This information reaches the out-neighbor i of agent j at time t + τ_ji. Some examples of the communication graph are as follows.

Example 1: Consider a complete graph with τ-step delay along each edge. The resulting information structure is

    I_i(t) = {y(1:t−τ), y_i(t−τ+1:t)},

which is the τ-step delayed sharing information structure [23]. □

Example 2: Consider a strongly connected graph with unit delay along each edge. Let τ* = max_{i,j∈N} ℓ_ij denote the weighted diameter of the graph and N_i^k = {j ∈ N : ℓ_ji = k} denote the k-hop in-neighbors of i, with N_i^0 = {i}. The resulting information structure is

    I_i(t) = ⋃_{k=0}^{τ*} ⋃_{j∈N_i^k} {y_j(1:t−k)},

which we call the neighborhood sharing information structure. □

At time t, agent i ∈ N generates an estimate ẑ_i(t) ∈ R^{d_z^i} of L_i x(t) (where L_i is a d_z^i × d_x matrix) according to ẑ_i(t) = g_{i,t}(I_i(t)), where g_{i,t} is a measurable function called the estimation rule at time t.
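The quantities ℓ_ij, τ*, and N_i^k used above are ordinary shortest-path quantities over the integer edge delays, so they can be computed with, e.g., Dijkstra's algorithm. The three-node graph below is a made-up example, not from the paper:

```python
import heapq

# directed delay graph: edges[u][v] = tau_uv (positive integer delays; example values are assumptions)
edges = {1: {2: 1}, 2: {1: 1, 3: 1}, 3: {1: 2}}
nodes = [1, 2, 3]

def geodesic_from(u):
    """Dijkstra: shortest weighted path lengths l_uv from node u to every node v."""
    dist = {u: 0}
    pq = [(0, u)]
    while pq:
        d, a = heapq.heappop(pq)
        if d > dist.get(a, float("inf")):
            continue
        for b, w in edges.get(a, {}).items():
            if d + w < dist.get(b, float("inf")):
                dist[b] = d + w
                heapq.heappush(pq, (d + w, b))
    return dist

l = {u: geodesic_from(u) for u in nodes}                 # l[i][j] = geodesic distance l_ij
tau_star = max(l[i][j] for i in nodes for j in nodes)    # weighted diameter
# k-hop in-neighborhoods N_i^k = {j : l_ji = k}
N = {i: {k: {j for j in nodes if l[j].get(i) == k} for k in range(tau_star + 1)}
     for i in nodes}
```

For this example the weighted diameter is τ* = 3 (the longest geodesic is from node 3 to node 2, via node 1).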
The collection g_i := (g_{i,1}, g_{i,2}, . . .) is called the estimation strategy of agent i and g := (g_1, . . . , g_n) is the team estimation strategy profile of all agents.

2) Estimation cost: Let ẑ(t) = vec(ẑ_1(t), . . . , ẑ_n(t)) denote the estimate of all agents. As in Sec. II, we assume that the estimation error c(x(t), ẑ(t)) is a weighted quadratic function of (Lx(t) − ẑ(t)) of the form

    c(x(t), ẑ(t)) = (Lx(t) − ẑ(t))^⊤ S (Lx(t) − ẑ(t)).    (15)

Examples of such estimation error functions were given in Sec. II-A.

3) Problem formulation: It is assumed that the system satisfies the following assumptions.
(A1) The cost matrix S is positive definite.
(A2) The noise covariance matrices {R_i}_{i∈N} are positive definite and Q and Σ_x are positive semi-definite.
(A3) The primitive random variables (x(1), {w(t)}_{t≥1}, {v_1(t)}_{t≥1}, . . . , {v_n(t)}_{t≥1}) are independent.
(A4) For any square root D of the matrix Q such that D D^⊤ = Q, (A, D) is stabilizable.
(A5) (A, C) is detectable.

We are interested in the following optimization problems.

Problem 2 (Finite horizon): Given matrices A, {C_i}_{i∈N}, Σ_x, Q, {R_i}_{i∈N}, L, S, a communication graph G (and the corresponding weights τ_ij), and a horizon T, choose a team estimation strategy profile g to minimize J_T(g) given by

    J_T(g) = E^g [ Σ_{t=1}^T c(x(t), ẑ(t)) ].    (16)

Problem 3 (Infinite horizon): Given matrices A, {C_i}_{i∈N}, Σ_x, Q, {R_i}_{i∈N}, and a communication graph G (and the corresponding weights τ_ij), choose a team estimation strategy profile g to minimize J̄(g) given by

    J̄(g) = limsup_{T→∞} (1/T) J_T(g).    (17)

As was the case for the estimation problem presented in Sec. II, a salient feature of the model is that the estimates are generated using different information while the objective is to minimize a common coupled estimation error given by (16) or (17). This feature makes Problems 2 and 3 conceptually different from the standard filtering problem of minimizing the MMSE error.

Remark 4: For Problem 2, the assumption that the dynamics, measurements, and cost are time-homogeneous is made simply for convenience of notation. As will be evident from the analysis, the results for Problem 2 generalize to the setting of time-varying dynamics, measurements, and cost in a natural manner. □

C. Roadmap of the results

The main idea behind identifying a solution for Problem 2 is as follows. We observe that the choice of the estimates only affects the instantaneous estimation error and does not affect the evolution of the system or the estimation error in the future. Therefore, the problem of choosing an estimation strategy profile g = (g_1, . . . , g_n) to minimize J_T(g) is equivalent to solving the following T separate optimization problems:

    min_{(g_{1,t}, . . . , g_{n,t})} E[c(x(t), ẑ(t))],   ∀ t ∈ {1, . . . , T}.    (18)

Since the communication graph is strongly connected, the information I_i(t) available at agent i can be written as I^com(t) ∪ I_i^loc(t), where

    I^com(t) = ⋂_{i∈N} I_i(t) = y(1:t−τ*)

is the common information among all agents (recall that τ* is the weighted diameter of the communication graph) and I_i^loc(t) = I_i(t) \ I^com(t) is the local information at agent i. Thus, we may view Problem (18) as an estimation problem with n agents where agents have local and common information and, therefore, use the results of Sec. II to derive the MTMSE filtering strategy. To do so, we define variables analogous to the auxiliary variables defined in Sec. II-B:
• All agents' common estimate of the state x(t) given the common information I^com(t) at all agents.
We denote this estimate by x̂^com(t); it is equal to E[x(t) | I^com(t)].
• All agents' common estimate of the local information at agent i given the common information. We denote this estimate by Î_i^loc(t); it is equal to E[I_i^loc(t) | I^com(t)].
• The innovation in the local information at agent i with respect to the common information. We denote this innovation by Ĩ_i^loc(t); it is equal to I_i^loc(t) − Î_i^loc(t).

Furthermore, we let Θ̂_i(t) denote the covariance cov(x(t), Ĩ_i^loc(t)) and Σ̂_ij(t) denote the covariance cov(Ĩ_i^loc(t), Ĩ_j^loc(t)). In order to use the results of Theorem 1, we need to derive expressions for recursively updating the above variables and covariances, which we do next.

D. Recursive expressions for auxiliary variables and covariances

The information structure of the problem is effectively equal to the τ*-step delayed sharing information structure [23]. To derive recursive expressions for the auxiliary variables and covariances, we follow the central idea of [23] and express the system variables in terms of the delayed state x(t − τ* + 1).

1) Delayed state estimates and common estimates: We define

    x̂(t − τ* + 1) = E[x(t − τ* + 1) | I^com(t)] = E[x(t − τ* + 1) | y(1:t−τ*)]    (19)

as the delayed estimate of the state, let

    x̃(t − τ* + 1) = x(t − τ* + 1) − x̂(t − τ* + 1)

denote the corresponding estimation error, and let P(t − τ* + 1) = var(x̃(t − τ* + 1)) denote the estimation error covariance. Note that x̂(t − τ* + 1) is the one-step prediction estimate in centralized Kalman filtering and can be updated as follows. Start with x̂(1) = 0 and for t ≥ 1, update

    x̂(t+1) = A x̂(t) + A K(t) [y(t) − C x̂(t)],    (20)

where

    K(t) = P(t) C^⊤ [C P(t) C^⊤ + R]^{−1}    (21)

is the Kalman gain.
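The update (20)-(21), together with the covariance recursion given next in (22), is the standard one-step-prediction Kalman filter. A minimal sketch (our own, using the Joseph-form covariance update that matches (22)):

```python
import numpy as np

def kf_prediction_step(xhat, P, y, A, C, Q, R):
    """One step of the one-step-prediction Kalman filter, eqs. (20)-(22)."""
    K = P @ C.T @ np.linalg.inv(C @ P @ C.T + R)                  # Kalman gain (21)
    xhat_next = A @ xhat + A @ K @ (y - C @ xhat)                 # state update (20)
    D = np.eye(P.shape[0]) - K @ C                                # Delta(t) = I - K(t) C
    P_next = A @ D @ P @ D.T @ A.T + A @ K @ R @ K.T @ A.T + Q    # Joseph-form Riccati (22)
    return xhat_next, P_next
```

The Joseph-form covariance update in the last line is algebraically identical to the more familiar recursion P(t+1) = A (P − P C^⊤ (C P C^⊤ + R)^{−1} C P) A^⊤ + Q.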
Furthermore, the error covariance P(t) can be pre-computed recursively using the forward Riccati equation: P(1) = Σ_x and for t ≥ 1,

  P(t+1) = A Δ(t) P(t) Δ(t)ᵀ Aᵀ + A K(t) R K(t)ᵀ Aᵀ + Q,   (22)

where Δ(t) = I − K(t) C. Now, observe that we can compute the common estimate x̂_com(t) using a (τ*−1)-step propagation of the delayed state estimate x̂(t−τ*+1) as follows:

  x̂_com(t) = A^{τ*−1} x̂(t−τ*+1).   (23)

2) Local estimates and local innovation: To find a convenient expression for the local innovation Ĩ_loc_i(t), we express I_loc_i(t) in terms of the delayed state x(t−τ*+1). For that matter, for any t, ℓ ∈ Z_{>0}, define the d_x × 1 random vector w^(k)(ℓ, t) as follows:

  w^(k)(ℓ, t) = Σ_{s = max{1, t−k}}^{t−ℓ−1} A^{t−ℓ−s−1} w(s),   (24)

where w^(k)(ℓ, t) is the weighted accumulated process noise from time max{1, t−k} to time t−ℓ−1. Note that w^(k)(ℓ, t) = 0 if t ≤ min{k, ℓ+1} or ℓ ≥ k. For any t ≥ k, we may write

  x(t) = A^k x(t−k) + w^(k)(0, t),   (25)
  y_i(t) = C_i A^k x(t−k) + C_i w^(k)(0, t) + v_i(t).   (26)

By definition, I_loc_i(t) ⊆ y(t−τ*+1:t). Thus, for any i ∈ N, we can identify a matrix C_loc_i and random vectors w_loc_i(t) and v_loc_i(t) (which are linear functions of w(t−τ*+1:t−1) and v_i(t−τ*+1:t)) such that

  I_loc_i(t) = C_loc_i x(t−τ*+1) + w_loc_i(t) + v_loc_i(t).   (27)

As an example, we write the expressions for (C_loc_i, w_loc_i(t), v_loc_i(t)) for the delayed sharing and neighborhood sharing information structures below. For any ℓ ≤ τ*, define

  W_i(ℓ, t) = vec(C_i w^(τ*−1)(τ*−1, t), ..., C_i w^(τ*−1)(ℓ, t)),
  C_i(ℓ) = rows(C_i, C_i A, ..., C_i A^{τ*−ℓ−1}),
  V_i(ℓ, t) = vec(v_i(t−τ*+1), ..., v_i(t−ℓ)).

Example 1 (cont.)
For the τ-step delayed sharing information structure, I_loc_i(t) = y_i(t−τ+1:t). Thus, C_loc_i = C_i(0), w_loc_i(t) = W_i(0, t), and v_loc_i(t) = V_i(0, t). □

Example 2 (cont.) For the neighborhood sharing information structure, I_i(t) = ∪_{k=0}^{τ*} ∪_{j∈N^k_i} {y_j(1:t−k)}. Thus,

  C_loc_i = rows(C_j(ℓ) : ℓ ∈ {0, ..., τ*−1}, j ∈ N^ℓ_i),
  w_loc_i(t) = vec(W_j(ℓ, t) : ℓ ∈ {0, ..., τ*−1}, j ∈ N^ℓ_i),
  v_loc_i(t) = vec(V_j(ℓ, t) : ℓ ∈ {0, ..., τ*−1}, j ∈ N^ℓ_i). □

Now, a key result is the following.

Lemma 2 w_loc_i(t), v_loc_i(t), x̃(t−τ*+1), and I_com(t) are independent. 2

PROOF Observe that I_com(t) = y(1:t−τ*) and x̃(t−τ*+1) are functions of the primitive random variables up to time t−τ*, while w_loc_i(t) and v_loc_i(t) are functions of the primitive random variables from time t−τ*+1 onwards. Thus, w_loc_i(t) and v_loc_i(t) are independent of x̃(t−τ*+1) and I_com(t). Furthermore, (A3) implies that w_loc_i(t) and v_loc_i(t) are independent of each other. Note that x̃(t−τ*+1) is the estimation error when estimating x(t−τ*+1) given I_com(t) and is, therefore, uncorrelated with I_com(t). Since all random variables are Gaussian, x̃(t−τ*+1) and I_com(t) being uncorrelated also means that they are independent. □

Combining Lemma 2 with (27), we get

  Î_loc_i(t) = E[I_loc_i(t) | I_com(t)] = C_loc_i x̂(t−τ*+1).   (28)

Combining this with (27), we get

  Ĩ_loc_i(t) = I_loc_i(t) − Î_loc_i(t) = C_loc_i x̃(t−τ*+1) + w_loc_i(t) + v_loc_i(t).   (29)

3) Covariances: Let P^w_ij(t) denote cov(w_loc_i(t), w_loc_j(t)) and P^v_ij(t) denote cov(v_loc_i(t), v_loc_j(t)). Note that these can be computed from the expressions for w_loc_i(t) and v_loc_i(t), which were derived earlier based on the communication graph. Eq.
(29) and Lemma 2 imply that

  Σ̂_ij(t) = cov(Ĩ_loc_i(t), Ĩ_loc_j(t)) = C_loc_i P(t−τ*+1) C_loc_jᵀ + P^w_ij(t) + P^v_ij(t),   (30)

where P(t) is computed using (22). Furthermore, Eqs. (25) and (29) and Lemma 2 imply that

  Θ̂_i(t) = cov(x(t), Ĩ_loc_i(t)) = A^{τ*−1} P(t−τ*+1) C_loc_iᵀ + P^σ_i(t),   (31)

where P^σ_i(t) = cov(w^(τ*−1)(0, t), w_loc_i(t)) and P(t) is computed using (22).

E. Main result for Problem 2

As mentioned in Sec. III-C, the problem of choosing the MTMSE estimation strategy g = (g_1, ..., g_n) to minimize J_T(g) is equivalent to solving T separate estimation sub-problems given by (18). Based on Theorem 1, the MTMSE estimate for each of these sub-problems is given as follows.

Theorem 2 Under assumptions (A1)–(A3), the filtering strategy which minimizes the team mean-squared error in Problem 2 is a linear function of the measurements. Specifically, the MTMSE estimates at time t may be written as

  ẑ_i(t) = L_i x̂_com(t) + F_i(t) Ĩ_loc_i(t),   (32)

where x̂_com(t) and Ĩ_loc_i(t) are computed using (23) and (29). The gains {F_i(t)}_{i∈N} satisfy the following system of matrix equations:

  Σ_{j∈N} [ S_ij F_j(t) Σ̂_ji(t) − S_ij L_j Θ̂_i(t) ] = 0,  ∀ i ∈ N,   (33)

where Σ̂_ij(t) and Θ̂_i(t) are computed using (30) and (31). Eq. (33) has a unique solution which can be written as

  F(t) = Γ(t)⁻¹ η(t),   (34)

where F(t) = vec(F_1(t), ..., F_n(t)), η(t) = vec(S_1 • L Θ̂_1(t), ..., S_n • L Θ̂_n(t)), and Γ(t) = [Γ_ij(t)]_{i,j∈N}, where Γ_ij(t) = Σ̂_ij(t) ⊗ S_ij.
Furthermore, the minimum team mean-squared error is given by

  J*_T = Σ_{t=1}^{T} [ Tr(Lᵀ S L P_0(t)) − η(t)ᵀ Γ(t)⁻¹ η(t) ],   (35)

where P_0(t) = var(x(t) − x̂_com(t)) and is given by

  P_0(t) = A^{τ*−1} P(t−τ*+1) (A^{τ*−1})ᵀ + Σ_w(t),   (36)

and Σ_w(t) = var(w^(τ*−1)(0, t)). 2

PROOF The expressions for the MTMSE estimates (32) and the corresponding gains (33) follow immediately from Theorem 1. Now, since R_ii is positive definite (which is part of (A2)), standard results from Kalman filtering [24, Section 3.4] imply that P(t) is positive definite. Using this fact in (30) implies that Σ̂_ii(t) is positive definite. Therefore, the vectorized formula (34) follows from Lemma 5. The expression for the minimum team mean-squared error follows from an argument similar to that in the proof of Theorem 1. The expression for P_0(t) follows from (23) and (25). □

Remark 5 Remark 2 about the structure of the MTMSE estimates continues to hold for the filtering setup as well. The first term in the MTMSE estimate (32) is the MMSE estimate of the current state based on the common information. The second term is a "correction" which depends on the innovation in the local measurements. 2

Remark 6 As in the estimation setup, the gains which multiply the innovation in (32) are coupled and depend on the weight matrix S. 2

Remark 7 Since we have assumed that the dynamics are time-homogeneous, the processes {w^(τ*−1)(0, t)}_{t≥τ*}, {w_loc_i(t)}_{t≥τ*}, and {v_loc_i(t)}_{t≥τ*} are stationary. Hence, for t ≥ τ*, the covariance matrices Σ_w(t), P^σ_i(t), P^w_ij(t), and P^v_ij(t) are constant. 2

Remark 8 Note that Σ̂_ij ⊗ S_ij = 0 when S_ij = 0. Therefore, when the weight matrix S is sparse, as is the case for the cost (6), Σ̂_ij (and, therefore, P^w_ij(t) and P^v_ij(t)) need to be computed only for those i, j ∈ N for which S_ij ≠ 0. 2

F.
Main result for Problem 3

Now, we consider the infinite horizon MTMSE filtering introduced in Problem 3, which can be thought of as a "steady-state" version of Sec. III-E. We first state a standard result from centralized Kalman filtering [24].

Lemma 3 Under (A2)–(A5), for any initial covariance Σ_x ≥ 0, the sequence {P(t)}_{t≥1} given by (22) is weakly increasing and bounded (in the sense of positive semi-definiteness). Thus it has a limit, which we denote by P̄. Furthermore,
1) P̄ does not depend on Σ_x.
2) P̄ is positive semi-definite.
3) P̄ is the unique solution to the following algebraic Riccati equation:

  P̄ = A Δ P̄ Δᵀ Aᵀ + A K̄ R K̄ᵀ Aᵀ + Q,   (37)

where K̄ = P̄ Cᵀ [C P̄ Cᵀ + R]⁻¹ and Δ = I − K̄ C.
4) The matrix (A − K̄ C) is asymptotically stable. 2

Recall from Remark 7 that Σ_w(t), P^σ_i(t), P^w_ij(t), and P^v_ij(t) are constant for t ≥ τ*. We denote the corresponding values for t ≥ τ* by Σ̄_w, P̄^σ_i, P̄^w_ij, and P̄^v_ij. Now define:

  P̄_0 = A^{τ*−1} P̄ (A^{τ*−1})ᵀ + Σ̄_w,   (38)
  Σ̄_ij = C_loc_i P̄ C_loc_jᵀ + P̄^w_ij + P̄^v_ij,   (39)
  Θ̄_i = A^{τ*−1} P̄ C_loc_iᵀ + P̄^σ_i.   (40)

Lemma 4 Under (A2)–(A5), we have the following:
1) lim_{t→∞} P_0(t) = P̄_0.
2) lim_{t→∞} Σ̂_ij(t) = Σ̄_ij.
3) lim_{t→∞} Θ̂_i(t) = Θ̄_i. 2

PROOF All relations follow immediately from Lemma 3 and Remark 7. □
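The limit P̄ of Lemma 3 can be computed by simply iterating the forward Riccati recursion (22) until convergence. The following is a minimal illustrative sketch, not a production Riccati solver; it assumes the stabilizability/detectability conditions of the lemma hold and uses the algebraically equivalent form A(P − K C P)Aᵀ + Q of (22):

```python
import numpy as np

def steady_state_covariance(A, C, Q, R, tol=1e-12, max_iter=100000):
    """Iterate the forward Riccati recursion (22) from P(1) = 0 until it
    converges; by Lemma 3 the limit P-bar is independent of the initial
    covariance and solves the algebraic Riccati equation (37)."""
    P = np.zeros_like(Q)
    for _ in range(max_iter):
        K = P @ C.T @ np.linalg.inv(C @ P @ C.T + R)   # Kalman gain (21)
        P_next = A @ (P - K @ C @ P) @ A.T + Q          # recursion (22)
        if np.max(np.abs(P_next - P)) < tol:
            break
        P = P_next
    return P_next
```

For production use, a dedicated solver (e.g., a discrete algebraic Riccati equation routine) would be preferable; the fixed-point iteration is shown only because it mirrors the lemma's statement.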
Theorem 3 Under (A1)–(A5), the following time-homogeneous filtering strategy minimizes the team mean-squared error for Problem 3:

  ẑ_i(t) = L_i x̂_com(t) + F̄_i Ĩ_loc_i(t),   (41)

where x̂_com(t) = A^{τ*−1} x̂(t−τ*+1) (which is the same as (23)), x̂(t) is updated using the steady-state version of (20) given by

  x̂(t+1) = A x̂(t) + A K̄ [y(t) − C x̂(t)],   (42)

and the gains {F̄_i}_{i∈N} satisfy the following system of matrix equations:

  Σ_{j∈N} [ S_ij F̄_j Σ̄_ji − S_ij L_j Θ̄_i ] = 0,  ∀ i ∈ N,   (43)

where Σ̄_ij and Θ̄_i are given by (39) and (40). Eq. (43) has a unique solution and can be written more compactly as

  F̄ = Γ̄⁻¹ η̄,   (44)

where F̄ = vec(F̄_1, ..., F̄_n), η̄ = vec(S_1 • L Θ̄_1, ..., S_n • L Θ̄_n), and Γ̄ = [Γ̄_ij]_{i,j∈N}, where Γ̄_ij = Σ̄_ij ⊗ S_ij. Furthermore, the optimal performance is given by

  J* = Tr(Lᵀ S L P̄_0) − η̄ᵀ Γ̄⁻¹ η̄,   (45)

where P̄_0 is given by (38). 2

The proof of Theorem 3 is presented in Appendix C.

IV. SOME ILLUSTRATIVE EXAMPLES

In this section, we present a few examples to illustrate the details of the main results.

A. Team mean-squared estimation in a UAV formation

Consider a UAV formation with n agents as shown in Fig. 2.

Fig. 2: A four-agent UAV formation. The arrows indicate communication links between the agents. Each link has delay 2.

Let N = {1, ..., n} and let x_i(t) denote the state of agent i ∈ N. For ease of exposition, we assume that x_i(t) ∈ R, which could correspond to, say, the altitude of the UAV. Let x(t) = vec(x_1(t), ..., x_n(t)) denote the state of the system, which evolves as x(t+1) = A x(t) + w(t), where A is a known n × n matrix and w(t) ∼ N(0, Q). Agent i observes the state with noise, i.e., y_i(t) = C_i x(t) + v_i(t), i ∈ N, where v_i(t) ∼ N(0, R_i). The communication graph is as shown in Fig.
2, where each link is assumed to have delay 2. Thus, the information structure is given by I_i(t) = {y(1:t−2), y_i(t−1:t)}. The objective is to determine the MTMSE filter for the per-step estimation error given by (5), i.e., the agents want to estimate their local state and ensure that the average of the local state estimates is close to the average of their actual states.

We first show the computation of the MTMSE estimates. Observe that I_com(t) = y(1:t−2) and I_loc_i(t) = {y_i(t−1), y_i(t)}. Thus, C_loc_i = rows(C_i, C_i A), and w_loc_i(t) = vec(0, C_i w(t−1)), v_loc_i(t) = vec(v_i(t−1), v_i(t)). As argued in Remark 7, the covariance matrices Σ_w(t), P^σ_i(t), P^w_ij(t), and P^v_ij(t) are constant for t ≥ τ*. Thus, we only need to compute these for t = 1 and t ≥ 2. Note that the weight matrix S is dense, so we do not get the computational savings described in Remark 8. We have the following:
• Σ_w(1) = 0 and for t ≥ 2, Σ_w(t) = Q.
• P^σ_i(1) = [0_{4×1}  0_{4×1}] and for t ≥ 2, P^σ_i(t) = [0_{4×1}  Q C_iᵀ].
• P^w_ij(1) = diag(0, 0) and for t ≥ 2, P^w_ij(t) = diag(0, C_i Q C_jᵀ).
• P^v_ii(1) = diag(0, R_i) and for t ≥ 2, P^v_ii(t) = diag(R_i, R_i).
• P^v_ij(t) = diag(0, 0) for j ≠ i and all t.

Substituting these, we get that Σ̂_ij(1) = δ_ij diag(0, R_i) and for t ≥ 2,

  Σ̂_ij(t) = rows(C_i, C_i A) P(t−1) rows(C_j, C_j A)ᵀ + [ δ_ij R_i, 0 ; 0, C_i Q C_jᵀ + δ_ij R_i ].

Substituting these in (33) or (34) gives us the optimal gains. The MTMSE estimates can then be computed using (32) as described in Sec. V-A.

We compare the performance of the MTMSE filtering strategy with two baselines. The first is the MMSE strategy where each agent ignores the cost coupling and simply generates the MMSE estimates using ẑ_mmse_i(t) = L_i E[x(t) | I_i(t)].
(46) It can be shown that the performance of the MMSE strategy is

  J_mmse_T = Σ_{t=1}^{T} ( Tr(Lᵀ S L P_0(t)) + Σ_{i∈N} Tr( K_i(t)ᵀ L_iᵀ Σ_{j∈N} S_ij L_j [ K_j(t) Σ̂_ji(t) − 2 Θ̂_i(t) ] ) ).   (47)

Recall that for this particular example we have L = I. The second baseline is a consensus-based Kalman filter as described in [16]. We do not have a closed-form expression for the weighted mean-squared error of the consensus Kalman filter, so we evaluate its performance J_CKF_T using Monte Carlo evaluation averaged over 1000 sample paths.

For the numerical experiments we pick A_ij = 0.65 for i = j and A_ij = 0.1 elsewhere, C_1 = 2 × 1_{1×n}, and for i ≠ 1, C_i = 0.1 e_i, where e_i is a vector with only the i-th element equal to one and the rest zero, Q = I, R = 0.1 I, and T = 100. The relative improvements Δ_mmse_T = (J_mmse_T − J*_T)/J*_T and Δ_CKF_T = (J_CKF_T − J*_T)/J*_T of the MTMSE strategy compared to the MMSE strategy and consensus Kalman filtering as a function of λ are shown in Fig. 3.

Fig. 3: Relative improvement of MTMSE filtering compared to (a) the MMSE strategy for 4 and 10 agents, and (b) consensus Kalman filtering (shown on a log scale), for the UAV formation.

These plots show that the MTMSE strategy outperforms the MMSE and consensus Kalman filtering strategies by up to a factor of 4 and 600, respectively, in relative improvement for n = 10 and λ/n² = 10. This improvement in performance will increase with the number of agents.

B. Team mean-squared estimation in a vehicular platoon

Now we consider a vehicular platoon with four agents, shown in Fig. 4. As before, let x_i(t) ∈ R denote the position of vehicle i in the platoon. We assume that the dynamics and the observation model are similar to those described in Sec. IV-A (but with different A and C matrices). The communication graph is as shown in Fig. 4.
Thus, the information structure is given by

  I_1(t) = {y_1(1:t), y_2(1:t−1), y_3(1:t−2), y_4(1:t−3)},
  I_2(t) = {y_1(1:t−1), y_2(1:t), y_3(1:t−1), y_4(1:t−2)},
  I_3(t) = {y_1(1:t−2), y_2(1:t−1), y_3(1:t), y_4(1:t−1)},
  I_4(t) = {y_1(1:t−3), y_2(1:t−2), y_3(1:t−1), y_4(1:t)}.

Fig. 4: A four-agent vehicular platoon. The arrows indicate communication links between the agents.

The objective is to determine the MTMSE filter for the per-step estimation error given by (6), i.e., the agents want to estimate their local states and ensure that the difference between the estimates of adjacent agents is close to the difference between their actual states.

We first show the computation of the MTMSE estimates. Observe that I_com(t) = y(1:t−3) and

  I_loc_1(t) = {y_1(t−2:t), y_2(t−2:t−1), y_3(t−2)},
  I_loc_2(t) = {y_1(t−2:t−1), y_2(t−2:t), y_3(t−2:t−1), y_4(t−2)},
  I_loc_3(t) = {y_1(t−2), y_2(t−2:t−1), y_3(t−2:t), y_4(t−2:t−1)},
  I_loc_4(t) = {y_2(t−2), y_3(t−2:t−1), y_4(t−2:t)}.

Similar to the previous example, the covariance matrices Σ_w(t), P^σ_i(t), P^w_ij(t), and P^v_ij(t) are constant for t ≥ τ*. Thus, we need to compute these for t = 1, t = 2, and t ≥ 3. In addition, since the cost matrix S is sparse, we only need to compute P^w_ij(t) and P^v_ij(t) for j ∈ {i−1, i, i+1} ∩ N (see Remark 8). The details for computing Σ̂_ij are similar to the previous section and are omitted due to space limitations. The MTMSE estimates can be computed using (32) as described in Sec. V-A.

We compare the performance of the MTMSE filtering strategy with the MMSE strategy and consensus Kalman filtering as before. For the numerical experiment in this part, we pick

  A = [ 0.9 0 0 0 ; 0.7 0.9 0 0 ; 0.7 0.7 0.9 0 ; 0.5 0.7 0.7 0.9 ],

C_i = I_n, Q = I, R = 0.1 I, and T = 100. The relative improvements as a function of λ are shown in Fig. 5.

Fig. 5: Relative improvement of MTMSE filtering compared to (a) the MMSE strategy and (b) consensus Kalman filtering (shown on a log scale), for the vehicular platoon.

These plots show that the MTMSE strategy outperforms the MMSE and consensus Kalman filtering strategies by up to a factor of 2 and 800, respectively. Again, this improvement in performance will increase with the number of agents.

V. DISCUSSION OF THE RESULTS

A. Implementation of the MTMSE filtering strategy

In this section, we provide the details of implementing the MTMSE filtering strategies for both the finite and infinite horizon setups.

1) Implementation of the finite horizon MTMSE filtering strategy: Based on Theorem 2, the MTMSE filtering strategy can be implemented as follows.

a) Computing the gains: The gains {F(t)}_{t=1}^{T} are computed offline as follows. First, the covariances {P(t)}_{t=1}^{T} are computed using the forward Riccati equation (22). Then, the covariances {Σ̂_ij(t)}_{t=1}^{T} and {Θ̂_i(t)}_{t=1}^{T} are computed for all i, j ∈ N. Thereafter, the gains {K(t)}_{t=1}^{T} are computed using (21) and the gains {F(t)}_{t=1}^{T} are computed using (34). Finally, the gains {K(t)}_{t=1}^{T} and {F_i(t)}_{t=1}^{T} are stored at agent i.

b) Computing the MTMSE estimates: Agent i ∈ N carries out the following computations to generate ẑ_i(t). First, it computes the delayed centralized estimate x̂(t−τ*+1) using (20). Then, it uses x̂(t−τ*+1) to compute x̂_com(t) and Î_loc_i(t) using (23) and (28), respectively. Then, it uses x̂_com(t) and I_loc_i(t) to generate the MTMSE estimate as follows:

  ẑ_i(t) = L_i x̂_com(t) + F_i(t)(I_loc_i(t) − Î_loc_i(t)).
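The offline gain computation in step a) reduces to solving the coupled equations (33) via the vectorized form (34). The following is a minimal NumPy sketch of that solve; the function name and the toy block dimensions are illustrative stand-ins. It assembles Γ from Kronecker blocks Γ_ij = Σ̂_ij ⊗ S_ij, builds η block-wise as vec((Σ_j S_ij L_j) Θ̂_i) with column-major vec, and solves Γ F = η:

```python
import numpy as np

def mtmse_gains(S, L, Sigma, Theta):
    """Solve the coupled gain equations (33),
        sum_j [ S_ij F_j Sigma_ji - S_ij L_j Theta_i ] = 0  for all i,
    via the vectorized form (34): F = Gamma^{-1} eta.
    S[i][j], Sigma[i][j] are the blocks of S and Sigma-hat; Sigma is
    assumed symmetric as a full matrix, so Sigma_ji = Sigma[i][j].T."""
    n = len(L)
    Gamma = np.block([[np.kron(Sigma[i][j], S[i][j]) for j in range(n)]
                      for i in range(n)])
    eta = np.concatenate([
        (sum(S[i][j] @ L[j] for j in range(n)) @ Theta[i]).flatten('F')
        for i in range(n)])
    Fvec = np.linalg.solve(Gamma, eta)
    # un-vectorize: F_i is (rows of S_ii) x (rows of Sigma_ii), column-major
    F, pos = [], 0
    for i in range(n):
        r, c = S[i][i].shape[0], Sigma[i][i].shape[0]
        F.append(Fvec[pos:pos + r * c].reshape((r, c), order='F'))
        pos += r * c
    return F
```

This relies on the identity vec(S_ij F_j Σ̂_ji) = (Σ̂_jiᵀ ⊗ S_ij) vec(F_j) = (Σ̂_ij ⊗ S_ij) vec(F_j), which is exactly how Γ_ij is defined in Theorem 2.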
2) Implementation of the infinite horizon MTMSE filtering strategy: Based on Theorem 3, the MTMSE filtering strategy can be implemented as follows.

a) Computing the gains: The gains {F̄_i} are computed offline as follows. First, the covariance P̄ is computed using the algebraic Riccati equation (37). Then, the covariances P̄_0, Σ̄_ij, and Θ̄_i are computed for all i, j ∈ N using (38)–(40). Thereafter, the gain K̄ is computed using Lemma 3 and the gain F̄ is computed using (44). Finally, the gains K̄ and F̄_i are stored at agent i.

b) Computing the MTMSE estimates: Agent i ∈ N carries out the following computations to generate ẑ_i(t). First, it computes the delayed centralized estimate x̂(t−τ*+1) using (42). Then, it uses x̂(t−τ*+1) to compute x̂_com(t) and Î_loc_i(t) using (23) and (28), respectively. Then, it uses x̂_com(t) and I_loc_i(t) to generate the MTMSE estimate as follows:

  ẑ_i(t) = L_i x̂_com(t) + F̄_i (I_loc_i(t) − Î_loc_i(t)).

B. Connection to decentralized stochastic control

One of the most celebrated results in centralized stochastic control of linear systems with quadratic cost and Gaussian disturbance (the so-called LQG setup) is the separation of estimation and control. In particular, the optimal control action is equal to a gain multiplied by the current state estimate. The computation of the gain matrix and the estimate are separated from each other. The gain matrix is computed based on the solution of a backward Riccati equation, while the state estimates are updated based on the Kalman filtering equations (which involve a forward Riccati equation). The forward and backward Riccati equations are decoupled and can be solved separately.

These simplifications do not hold for decentralized control of LQG systems. In general, non-linear strategies may outperform the best linear strategies. Linear strategies are known to be optimal only for specific models [25]–[30].
But even in these cases there is no separation of estimation and control. The results of this paper shed light on the lack of separation in decentralized control of LQG systems. We explain this in Appendix D using the example of decentralized stochastic control with a one-step delayed information structure [26], [31], [32]. For this model, we show that the decentralized control problem is equivalent to an MTMSE filtering problem, where the weight matrix depends on the solution of a backward Riccati equation. As shown in Theorem 2, the gains for MTMSE filtering depend on the weight matrix S in the cost function. That is why the computation of the state estimate is not separated from the computation of the controller gains.

C. Trade-off between filter complexity and estimation accuracy

For graphs with the neighborhood sharing information structure, the dimensions of Ĩ_loc_i(t) and F_i(t) are proportional to the diameter τ* of the graph. It is possible to trade off implementation complexity against filtering accuracy by "shedding" information at each agent. We explain this via the example of Sec. IV-B. We consider two approximate information structures for this example, which we denote by {I^(1)_i(t)}_{i∈N} and {I^(2)_i(t)}_{i∈N}. For both information structures, the common information is the same as before, i.e.,

  I_com^(m)(t) := ∩_{i∈N} I^(m)_i(t) = y(1:t−3),  m ∈ {1, 2}.

But the local information I_loc^(m)_i(t) := I^(m)_i(t) \ I_com^(m)(t) is a subset of the original I_loc_i(t). In particular, we assume the following.
1) IS 1: In the first approximation, each agent just uses the measurements from a time window of size two to "correct" the common-information-based estimate, i.e.,

  I_loc^(1)_1(t) = {y_1(t−1:t), y_2(t−1)},
  I_loc^(1)_2(t) = {y_1(t−1), y_2(t−1:t), y_3(t−1)},
  I_loc^(1)_3(t) = {y_2(t−1), y_3(t−1:t), y_4(t−1)},
  I_loc^(1)_4(t) = {y_3(t−1), y_4(t−1:t)}.

2) IS 2: In the second approximation, each agent just uses its local measurements to "correct" the common-information-based estimate, i.e., I_loc^(2)_i(t) = y_i(t−2:t).

For completeness, we refer to the original information structure as IS 0. Note that I_loc^(m)_i(t) ⊂ I_loc_i(t); therefore, any filtering strategy based on the approximate information structure {I^(m)_i(t)}_{i∈N} can be implemented in the original information structure {I_i(t)}_{i∈N}. The size of I_loc_i(t) (and therefore Ĩ_loc_i(t)) for the different information structures is shown in Table I.

To compare the performance of these three information structures, we note that the structure of the weight matrix S implies that lim_{λ→∞} J*_T/λ is a constant. So, we evaluate J*_T/λ for a large value of λ (λ = 100) and compare the performance of the three information structures. The results are also shown in Table I.

TABLE I: Comparison of the size and performance of the three information structures for the parameter values of Sec. IV-B and λ = 100.

  Info structure               Dimension of local info       Performance J*_T/λ
                               i ∈ {1, 4}    i ∈ {2, 3}
  IS 0: {I_i(t)}_{i∈N}             6             8                180.46
  IS 1: {I^(1)_i(t)}_{i∈N}         3             4                193.72
  IS 2: {I^(2)_i(t)}_{i∈N}         3             3                252.09

This example shows that it is possible to trade off the complexity of the MTMSE filter against the estimation accuracy. Note that although the two approximate information structures are almost of the same size, IS 1 has better performance than IS 2.
This is because IS 1 uses some local information from the neighboring nodes, while IS 2 does not. This suggests that it is better to have some information from many agents rather than a lot of information from a few agents, but a more detailed investigation is needed to quantify such a comparison.

VI. CONCLUSION

In this paper, we investigate multi-agent estimation and filtering to minimize the team mean-squared error. We show that the MTMSE estimates are given by

  ẑ_i(t) = L_i x̂_com(t) + F_i(t)(I_loc_i(t) − Î_loc_i(t)).

The first term of the estimate is the conditional mean of the current state given the common information. The second term may be viewed as a "correction" which depends on the "innovation" in the local measurements. A salient feature of this result is that the gains {F_i(t)}_{i∈N} depend on the weight matrix S. Using illustrative examples, we show that the MTMSE estimates achieve significantly smaller team mean-squared error than the MMSE strategy and consensus Kalman filtering.

The results were derived under the assumptions that the state process {x(t)}_{t≥1} is a linear stochastic process and the observation channels are linear with additive Gaussian noise. In the future, we plan to investigate team estimation of general stochastic processes over general measurement channels, which will give rise to non-linear filtering equations.

Finally, our focus in this paper was to establish the structure of MTMSE estimation and filtering strategies. Having identified this structure, it is possible to implement the policy efficiently in a distributed manner. For example, for the infinite horizon setup, it is possible to use a consensus Kalman filter [16]–[21] to keep track of the delayed state estimate x̂(t−τ*+1) and to use distributed algorithms [33]–[35] to solve the linear system of equations Γ̄ F̄ = η̄.

APPENDIX A
PROOF OF THEOREM 1

A.
A preliminary result

In order to compute the gains and the performance, we need to compute Θ̂_i = cov(x, ỹ_i) and Σ̂_ij = cov(ỹ_i, ỹ_j).

Lemma 5 For any {S_ij}_{i,j∈N}, {P_ij}_{i,j∈N}, and {L_i}_{i∈N} of compatible dimensions, the matrix equation

  Σ_{j∈N} [ S_ij F_j P_ji − S_ij L_j P_ii ] = 0,  ∀ i ∈ N,   (48)

in the unknowns {F_i}_{i∈N} of compatible dimensions can be written in vectorized form as

  Γ F = η,   (49)

where F, η, and Γ are as defined in Theorem 1. Furthermore, define S = [S_ij]_{i,j∈N} and P = [P_ij]_{i,j∈N}. If S > 0, P ≥ 0, and P_ii > 0, i ∈ N, then Γ > 0 and thus invertible. Then, Eq. (48) has a unique solution given by

  F = Γ⁻¹ η.   (50)

2

The proof of Lemma 5 is presented in Appendix B.

B. Proof of Theorem 1

The key observation behind the proof is that Problem 1 may be viewed as an MTMSE filtering problem [22], where agents observe different information and want to minimize a common estimation cost. For ease of notation, for a given agent i, we let (g_i, g_−i) and (ẑ_i, ẑ_−i) denote the strategies and estimates of all agents. Pick an agent i ∈ N, and fix the strategy g_−i of all the other agents. Then the expected cost from the point of view of agent i is given by E^{g_−i}[c(x, ẑ_i, ẑ_−i) | y_0, y_i], where the superscript g_−i in the expectation indicates that the cost depends on the strategies of the agents other than i.

A necessary condition for optimality is that agent i plays a best response to the strategies of all other agents, i.e.,

  (∂/∂ẑ_i) E^{g_−i}[c(x, ẑ_i, ẑ_−i) | y_0, y_i] = 0,  ∀ i ∈ N.   (51)

It is shown in [22, Theorem 4] that when c(x, ẑ) is convex, (51) is also a sufficient condition for optimality.
From the dominated convergence theorem, we can interchange the order of derivative and expectation to get

  LHS of (51) = E^{g_−i}[ (∂/∂ẑ_i) c(x, ẑ_i, ẑ_−i) | y_0, y_i ]
             = E^{g_−i}[ (∂/∂ẑ_i) Σ_{k∈N} Σ_{j∈N} (L_k x − ẑ_k)ᵀ S_kj (L_j x − ẑ_j) | y_0, y_i ]
             = −2 E^{g_−i}[ Σ_{j∈N} S_ij (L_j x − ẑ_j) | y_0, y_i ].

Substituting the above in (51), we get that a necessary and sufficient condition for a strategy (g_i, g_−i) to be team optimal is

  Σ_{j∈N} [ S_ij E^{g_j}[ẑ_j | y_0, y_i] − S_ij L_j E[x | y_0, y_i] ] = 0,  ∀ i ∈ N.   (52)

Note here that the superscript g_j in E^{g_j}[ẑ_j | y_0, y_i] highlights that the expectation depends on the choice of g_j. There is no such dependence in E[x | y_0, y_i]. Thus, the strategy g given by (8) is optimal if and only if

  Σ_{j∈N} [ S_ij E[F_j(y_j − ŷ_j) + L_j x̂_0 | y_0, y_i] − S_ij L_j E[x | y_0, y_i] ] = 0,  ∀ i ∈ N,   (53)

or, equivalently,

  Σ_{j∈N} [ S_ij F_j E[ỹ_j | y_0, y_i] − S_ij L_j E[x − x̂_0 | y_0, y_i] ] = 0,  ∀ i ∈ N.   (54)

Note that from Lemma 1, we have E[x − x̂_0 | y_0, y_i] = Θ̂_i Σ̂_ii⁻¹ ỹ_i. Substituting this and the expression for E[ỹ_j | y_0, y_i] from Lemma 1 in (54), we get that the strategy given by (8) is optimal if and only if, for all i ∈ N,

  Σ_{j∈N} [ S_ij F_j Σ̂_ji Σ̂_ii⁻¹ − S_ij L_j Θ̂_i Σ̂_ii⁻¹ ] ỹ_i = 0.

Since the above should hold for all ỹ_i ∈ R^{d^i_y}, the coefficient of ỹ_i must be identically zero. Thus, the strategy given by (8) is optimal if and only if

  Σ_{j∈N} [ S_ij F_j Σ̂_ji Σ̂_ii⁻¹ − S_ij L_j Θ̂_i Σ̂_ii⁻¹ ] = 0,  ∀ i ∈ N.   (55)

Furthermore, Lemma 5 implies that when Σ̂_ii > 0, (55) has a unique solution given by (10).
Now, for the minimum value of the estimation error, consider a single term of the estimation error:

  E[(L_i x − ẑ_i)ᵀ S_ij (L_j x − ẑ_j)]
  (a) = E[ (x − x̂_0)ᵀ L_iᵀ S_ij L_j (x − x̂_0) − 2 (y_i − ŷ_i)ᵀ F_iᵀ S_ij L_j (x − x̂_0) + (y_i − ŷ_i)ᵀ F_iᵀ S_ij F_j (y_j − ŷ_j) ]
  (b) = Tr(P_0 L_iᵀ S_ij L_j) − 2 Tr(Θ̂_i F_iᵀ S_ij L_j) + Tr(Σ̂_ijᵀ F_iᵀ S_ij F_j)
  (c) = Tr(P_0 L_iᵀ S_ij L_j) − 2 Tr(F_iᵀ S_ij L_j Θ̂_i) + Tr(F_iᵀ S_ij F_j Σ̂_ji),   (56)

where (a) follows from substituting (8), (b) uses Lemma 1, and (c) uses the fact that, for any matrices of compatible dimensions, Tr(ABCD) = Tr(BCDA). Thus, the expected team estimation error is

  J* = Σ_{i∈N} Σ_{j∈N} E[(L_i x − ẑ_i)ᵀ S_ij (L_j x − ẑ_j)]
  (d) = Σ_{i∈N} Σ_{j∈N} [ Tr(P_0 L_iᵀ S_ij L_j) − 2 Tr(F_iᵀ S_ij L_j Θ̂_i) + Tr(F_iᵀ S_ij F_j Σ̂_ji) ]
     = Tr(P_0 Lᵀ S L) − Σ_{i∈N} Tr( F_iᵀ Σ_{j∈N} [ 2 S_ij L_j Θ̂_i − S_ij F_j Σ̂_ji ] )
  (e) = Tr(P_0 Lᵀ S L) − Σ_{i∈N} Tr( F_iᵀ Σ_{j∈N} S_ij L_j Θ̂_i ),   (57)

where (d) follows from (56) and (e) follows from (55). The result now follows from observing that

  Σ_{i∈N} Tr( F_iᵀ Σ_{j∈N} S_ij L_j Θ̂_i ) = Σ_{i∈N} Tr(F_iᵀ S_i • L Θ̂_i) = Σ_{i∈N} vec(F_i)ᵀ vec(S_i • L Θ̂_i) = Fᵀ η = ηᵀ Γ⁻¹ η,

where the second equality follows from Tr(AᵀB) = vec(A)ᵀ vec(B).

APPENDIX B
PROOF OF LEMMA 5

By vectorizing both sides of (48) and using vec(ABC) = (Cᵀ ⊗ A) vec(B), we get

  Σ_{j∈N} (P_ij ⊗ S_ij) vec(F_j) − vec(S_i • L P_ii) = 0,  ∀ i ∈ N.

Substituting Γ_ij = P_ij ⊗ S_ij and η_i = vec(S_i • L P_ii), we get (49). If S > 0, P ≥ 0, and P_ii > 0, i ∈ N, then [32, Lemma 1] implies that Γ > 0 and thus invertible. Hence, Eq. (48) has a unique solution given by (50).
APPENDIX C
PROOF OF THEOREM 3

Σ̄_ii is the variance of the innovation in the standard Kalman filtering equations and, by positive definiteness of R_i, is positive definite. Lemma 5 then implies that (43) has a unique solution given by (44). To show that the strategy (41) is optimal, we proceed in two steps. We first identify a lower bound on the optimal performance and then show that the proposed strategy achieves that lower bound.

Step 1: From Theorem 2, for any strategy g, we have that

  (1/T) J_T(g) ≥ (1/T) Σ_{t=1}^{T} [ Tr(Lᵀ S L P_0(t)) − η(t)ᵀ Γ(t)⁻¹ η(t) ].

Taking limits on both sides and using Lemma 4 (which implies that lim_{t→∞} η(t) = η̄ and lim_{t→∞} Γ(t) = Γ̄), we get

  limsup_{T→∞} (1/T) J_T(g) ≥ Tr(Lᵀ S L P̄_0) − η̄ᵀ Γ̄⁻¹ η̄ = J*.   (58)

Step 2: Suppose ẑ(t) is chosen according to the strategy (41) and let J(t) denote E[c(x(t), ẑ(t))]. Following (56) and (57) in the proof of Theorem 1, we have that

  J(t) = Tr(Lᵀ S L P_0(t)) − Σ_{i∈N} Tr( F̄_iᵀ Σ_{j∈N} [ 2 S_ij L_j Θ̂_i(t) − S_ij F̄_j Σ̂_ji(t) ] ).

From Lemma 4, we have that

  lim_{t→∞} J(t) = Tr(Lᵀ S L P̄_0) − Σ_{i∈N} Tr( F̄_iᵀ Σ_{j∈N} [ 2 S_ij L_j Θ̄_i − S_ij F̄_j Σ̄_ji ] ) = Tr(Lᵀ S L P̄_0) − η̄ᵀ Γ̄⁻¹ η̄ = J*.

Thus, by Cesàro's mean theorem, we get lim_{T→∞} (1/T) Σ_{t=1}^{T} J(t) = J*. Hence, the strategy (41) achieves the lower bound (58) and is therefore optimal.

APPENDIX D
ONE-STEP DELAYED OBSERVATION SHARING

A. Problem statement

In this section, we use the result of Theorem 2 to show the relationship between MTMSE filtering and control in the delayed observation sharing model [26], [31], [32]. The notation used in this section is self-contained and consistent with the standard notation used in decentralized stochastic control. Consider a decentralized control system with n agents, indexed by the set N = {1, ..., n}.
The system has a state $x(t) \in \mathbb{R}^{d_x}$. The initial state $x(1) \sim \mathcal{N}(0, \Sigma_x)$ and the state evolves as
\[
x(t+1) = A(t) x(t) + B(t) u(t) + w(t), \tag{59}
\]
where $A$ and $B$ are matrices of appropriate dimensions, $u(t) = \operatorname{vec}(u_1(t), \ldots, u_n(t))$, where $u_i(t) \in \mathbb{R}^{d_u^i}$ is the control action chosen by agent $i$, and $\{w(t)\}_{t \ge 1}$, $w(t) \in \mathbb{R}^{d_x}$, is an i.i.d. process with $w(t) \sim \mathcal{N}(0, \Sigma_w)$. Each agent observes a noisy version $y_i(t) \in \mathbb{R}^{d_y^i}$ of the state given by
\[
y_i(t) = C_i(t) x(t) + v_i(t), \tag{60}
\]
where $\{v_i(t)\}_{t \ge 1}$, $v_i(t) \in \mathbb{R}^{d_y^i}$, is an i.i.d. process with $v_i(t) \sim \mathcal{N}(0, \Sigma_v^i)$. This may be written in vector form as
\[
y(t) = C(t) x(t) + v(t), \tag{61}
\]
where $C = \operatorname{rows}(C_1, \ldots, C_n)$, $v(t) = \operatorname{vec}(v_1(t), \ldots, v_n(t))$, and $y(t) = \operatorname{vec}(y_1(t), \ldots, y_n(t))$.

Assumption 1: The primitive random variables $(x(1), \{w(t)\}_{t\ge1}, \{v_1(t)\}_{t\ge1}, \ldots, \{v_n(t)\}_{t\ge1})$ are independent.

In addition to its local observation $y_i(t)$, each agent also receives the one-step delayed observations of all agents. Thus, the information available to agent $i$ is given by
\[
I_i(t) := \{ y_i(t), y(1{:}t-1), u(1{:}t-1) \}. \tag{62}
\]
Therefore, agent $i$ chooses the control action $u_i(t)$ as
\[
u_i(t) = g_{i,t}(I_i(t)), \tag{63}
\]
where $g_{i,t}$ is the control law of agent $i$ at time $t$. The collection $g = (g_1, \ldots, g_n)$, where $g_i = (g_{i,1}, \ldots, g_{i,T})$, is called the control strategy of the system. The performance of any control strategy $g$ is given by
\[
J(g) = E^g\Big[ \sum_{t=1}^{T-1} \big( x(t)^\top Q x(t) + u(t)^\top R u(t) \big) + x(T)^\top Q x(T) \Big], \tag{64}
\]
where $Q$ is a symmetric positive semi-definite matrix, $R$ is a symmetric positive definite matrix, and the expectation is with respect to the joint measure on the system variables induced by the choice of $g$.
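A minimal simulation of the model (59)–(61) may help fix the notation. The matrices below are illustrative assumptions, not taken from the paper: two agents, each observing one coordinate of a two-dimensional state, with time-invariant dynamics and open-loop (zero) control.

```python
import numpy as np

rng = np.random.default_rng(1)
n, dx = 2, 2                              # agents and state dimension (assumed)
A = np.array([[0.95, 0.10], [0.0, 0.90]])
B = np.eye(dx)                            # u(t) stacks both agents' scalar inputs
C = np.vstack([np.eye(1, dx, 0),          # C_1: agent 1 sees the first coordinate
               np.eye(1, dx, 1)])         # C_2: agent 2 sees the second coordinate
Sigma_x, Sigma_w, Sigma_v = np.eye(dx), 0.1 * np.eye(dx), 0.05 * np.eye(n)

T = 50
x = rng.multivariate_normal(np.zeros(dx), Sigma_x)   # x(1) ~ N(0, Sigma_x)
ys = []
for t in range(T):
    v = rng.multivariate_normal(np.zeros(n), Sigma_v)
    ys.append(C @ x + v)                             # y(t) = C x(t) + v(t)
    u = np.zeros(dx)                                 # open loop, for illustration
    w = rng.multivariate_normal(np.zeros(dx), Sigma_w)
    x = A @ x + B @ u + w                            # state update (59)
```

Under the information structure (62), at time $t$ agent $i$ would see its own entry of `ys[t]` plus all entries of `ys[:t]` and the past controls.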
Problem 4: Given the system dynamics and the noise statistics, choose a control strategy $g$ to minimize the total cost $J(g)$ given by (64).

Problem 4 is a decentralized stochastic control problem. In such problems there is no separation of estimation and control (see, for example, [32]). We show that this lack of separation is due to the fact that the MTMSE filtering strategy depends on the weight matrix of the estimation cost.

B. Equivalence to MTMSE filtering

We start with a basic property of linear quadratic models. Let $P(1{:}T)$ denote the solution to the following backward Riccati equation: $P(T) = Q$ and, for $t \in \{T-1, \ldots, 1\}$,
\[
P(t) = Q + A^\top P(t+1) A - A^\top P(t+1) B \big( R + B^\top P(t+1) B \big)^{-1} B^\top P(t+1) A.
\]
Define $S(t) = R + B^\top P(t+1) B$ and $L(t) = S(t)^{-1} B^\top P(t+1) A$. Then, we have the following.

Lemma 6: For any control strategy $g$, define
\[
J^\circ(g) = \sum_{t=1}^{T-1} E\big[ (u(t) + L(t) x(t))^\top S(t) (u(t) + L(t) x(t)) \big]. \tag{65}
\]
Then, a strategy $g$ that minimizes $J^\circ(g)$ also minimizes $J(g)$.

PROOF: Following [36, Chapter 8, Lemma 6.1], we can show that the total cost $J(g)$ can be written as
\[
J(g) = \sum_{t=1}^{T-1} E\big[ w(t)^\top P(t+1) w(t) \big] + E\big[ x(1)^\top P(1) x(1) \big]
+ \sum_{t=1}^{T-1} E\big[ (u(t) + L(t) x(t))^\top S(t) (u(t) + L(t) x(t)) \big]. \tag{66}
\]
The third term is equal to $J^\circ(g)$ and the first two terms do not depend on the control strategy $g$. Thus, $J(g)$ and $J^\circ(g)$ have the same argmin. $\blacksquare$

Now, we split the state $x(t)$ into a deterministic part $\bar x(t)$ and a stochastic part $\tilde x(t)$ as follows: $\bar x(1) = 0$, $\tilde x(1) = x(1)$, and
\[
\bar x(t+1) = A \bar x(t) + B u(t), \qquad \tilde x(t+1) = A \tilde x(t) + w(t),
\]
\[
\bar y(t) = C \bar x(t), \qquad \tilde y(t) = C \tilde x(t) + v(t).
\]
Since the system is linear, we have $x(t) = \bar x(t) + \tilde x(t)$ and $y(t) = \bar y(t) + \tilde y(t)$.
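The backward Riccati recursion defining $P(t)$, $S(t)$, and $L(t)$ can be sketched directly from the formulas above. The implementation assumes time-invariant $A$ and $B$ for simplicity; the double-integrator-like test matrices below are purely illustrative.

```python
import numpy as np

def riccati_backward(A, B, Q, R, T):
    """Backward Riccati recursion: P(T) = Q and, for t = T-1, ..., 1,
    P(t) = Q + A'P(t+1)A - A'P(t+1)B (R + B'P(t+1)B)^{-1} B'P(t+1)A,
    with S(t) = R + B'P(t+1)B and L(t) = S(t)^{-1} B'P(t+1)A."""
    P = [None] * (T + 1)   # 1-indexed to mirror the paper; entry 0 unused
    S = [None] * T
    L = [None] * T
    P[T] = Q
    for t in range(T - 1, 0, -1):
        S[t] = R + B.T @ P[t + 1] @ B
        L[t] = np.linalg.solve(S[t], B.T @ P[t + 1] @ A)
        P[t] = Q + A.T @ P[t + 1] @ A - A.T @ P[t + 1] @ B @ L[t]
    return P, S, L

# Illustrative instance (assumed matrices, not the paper's example).
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.0], [1.0]])
Q, R = np.eye(2), np.array([[1.0]])
P, S, L = riccati_backward(A, B, Q, R, T=10)
```

Each $P(t)$ is symmetric positive semi-definite and each $S(t)$ is positive definite since $R > 0$, which is what Lemma 6 relies on.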
Note that $\bar x(t)$ is a function of the past control actions, which are known to all agents. Now, for any control strategy $g$, define $\hat z_i(t) = u_i(t) + L_i(t) \bar x(t)$. Then, the cost $J^\circ(g)$ may be written as
\[
\sum_{t=1}^{T-1} E\big[ (\hat z(t) + L(t) \tilde x(t))^\top S(t) (\hat z(t) + L(t) \tilde x(t)) \big]. \tag{67}
\]
The process $\{\tilde x(t)\}_{t \ge 1}$ is an uncontrolled linear stochastic process and the cost (67) is of the same form as the weighted mean-squared cost that we have considered in this paper. Following [25], we define $\tilde I_i(t) = \{\tilde y_i(t), \tilde y(1{:}t-1)\}$, which may be considered as the control-free part of the information structure.

Lemma 7: For any strategy $g$ and any agent $i \in N$, $\tilde I_i(t)$ is equivalent to $I_i(t)$, i.e., they generate the same sigma-algebra.

PROOF: The result follows from an argument similar to that given in [37, Chapter 7, Section 3]. $\blacksquare$

Since $\tilde I_i(t)$ is equivalent to $I_i(t)$, we may assume that $\hat z_i(t)$ is chosen as a function of $\tilde I_i(t)$ instead of $I_i(t)$. Thus, Problem 4 is equivalent to the following MTMSE filtering problem.

Problem 5: Suppose $n$ agents observe the linear dynamical system $\{\tilde x(t)\}_{t\ge1}$ and share their observations over a one-step delayed sharing communication graph. Thus, the information available at agent $i$ is $\tilde I_i(t) = \{\tilde y_i(t), \tilde y(1{:}t-1)\}$. Agent $i$ chooses an estimate $\hat z_i(t)$ of $\tilde x(t)$ according to an estimation strategy $h_{i,t}$, i.e., $\hat z_i(t) = h_{i,t}(\tilde I_i(t))$, to minimize the estimation cost given by (67).

Problem 5 is an MTMSE filtering problem and can be solved using Theorem 2. One can then take the solution of Problem 5 and translate it back to Problem 4 as follows.
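The state-splitting identity $x(t) = \bar x(t) + \tilde x(t)$ underlying (67) is easy to verify numerically: run the controlled system and the two sub-systems side by side with the same noise realization. The matrices and the linear feedback below are arbitrary illustrations; any control input works because it cancels out of $\tilde x(t)$.

```python
import numpy as np

rng = np.random.default_rng(2)
dx, T = 2, 30
A = np.array([[0.9, 0.2], [0.0, 0.8]])
B = np.array([[0.0], [1.0]])
K = np.array([[0.1, 0.1]])            # arbitrary feedback gain for the demo

x = rng.standard_normal(dx)           # x(1)
x_bar = np.zeros(dx)                  # deterministic part: x_bar(1) = 0
x_til = x.copy()                      # stochastic part:    x_til(1) = x(1)
for t in range(T):
    u = -K @ x                        # any (information-feasible) control
    w = 0.1 * rng.standard_normal(dx)
    x = A @ x + B @ u + w             # controlled system
    x_bar = A @ x_bar + B @ u         # x_bar(t+1) = A x_bar(t) + B u(t)
    x_til = A @ x_til + w             # x_til(t+1) = A x_til(t) + w(t)
    assert np.allclose(x, x_bar + x_til)
```

The same split applies to the observations, $y(t) = C\bar x(t) + (C\tilde x(t) + v(t))$, which is what makes the control-free information $\tilde I_i(t)$ well defined.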
Theorem 4: Let $h^*$ be the optimal strategy for Problem 5, i.e.,
\[
h^*_{i,t}(\tilde I_i(t)) = -L_i(t) \hat{\tilde x}(t) - F_i(t)\big( \tilde y_i(t) - E[\tilde y_i(t) \mid \tilde y(1{:}t-1)] \big), \tag{68}
\]
where $\hat{\tilde x}(t) = E[\tilde x(t) \mid \tilde y(1{:}t-1)]$, $L(t) = \operatorname{rows}(L_1(t), \ldots, L_n(t))$, and the gains $\{F_i(t)\}$ are computed as per Theorem 2. Define the strategy $g^*$ as follows:
\[
g^*_{i,t}(I_i(t)) = h^*_{i,t}(\tilde I_i(t)) - L_i(t) \bar x(t), \tag{69}
\]
i.e.,
\[
g^*_{i,t}(I_i(t)) = -L_i(t) \hat x(t) - F_i(t)\big( y_i(t) - E[y_i(t) \mid y(1{:}t-1), u(1{:}t-1)] \big), \tag{70}
\]
where $\hat x(t) = E[x(t) \mid I_{\mathrm{com}}(t)] = \bar x(t) + E[\tilde x(t) \mid \tilde y(1{:}t-1)]$. Then $g^*$ is the optimal strategy for Problem 4.

PROOF: The change of variables $\hat z_i(t) = u_i(t) + L_i(t) \bar x(t)$ implies that if $h^*$ is an optimal strategy for Problem 5, then $g^*$ given by (69) is optimal for Problem 4. To establish (70), we need to show that $\hat x(t) = \bar x(t) + \hat{\tilde x}(t)$. Define $I_{\mathrm{com}}(t) = \{y(1{:}t-1), u(1{:}t-1)\}$ and $\tilde I_{\mathrm{com}}(t) = \{\tilde y(1{:}t-1)\}$. Then, by Lemma 7, $I_{\mathrm{com}}(t)$ is equivalent to $\tilde I_{\mathrm{com}}(t)$, i.e., they generate the same sigma-algebra. The rest of the proof follows from the definition of $\hat x(t)$. We have
\[
\hat x(t) = E[x(t) \mid \tilde I_{\mathrm{com}}(t)]
\stackrel{(a)}{=} E[\bar x(t) \mid I_{\mathrm{com}}(t)] + E[\tilde x(t) \mid \tilde I_{\mathrm{com}}(t)]
\stackrel{(b)}{=} \bar x(t) + \hat{\tilde x}(t),
\]
where (a) follows from the state splitting and the equivalence of $I_{\mathrm{com}}(t)$ and $\tilde I_{\mathrm{com}}(t)$, and (b) follows from the fact that $\bar x(t)$ is a deterministic function of $I_{\mathrm{com}}(t)$. $\blacksquare$

The main takeaway is as follows. By a simple change of variables, we showed that the one-step delayed observation sharing problem is equivalent to an MTMSE filtering problem, where the weight matrix $S(t)$ of the estimation cost depends on the backward Riccati equation for the cost function.
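The algebra connecting (68)–(70) reduces to the identity $\hat x(t) = \bar x(t) + \hat{\tilde x}(t)$. A toy numeric check (random illustrative matrices and an arbitrary innovation vector, not the paper's example) confirms that subtracting $L_i(t)\bar x(t)$ from $h^*$ as in (69) reproduces the form (70). Note that the $y_i$-innovation in (70) equals the $\tilde y_i$-innovation because $\bar y_i(t)$ is known given $I_{\mathrm{com}}(t)$.

```python
import numpy as np

rng = np.random.default_rng(3)
dx, dz, dy = 3, 2, 2                     # illustrative dimensions
L_i = rng.standard_normal((dz, dx))      # stand-in for L_i(t)
F_i = rng.standard_normal((dz, dy))      # stand-in for F_i(t)
x_bar = rng.standard_normal(dx)          # deterministic part, known to all agents
x_til_hat = rng.standard_normal(dx)      # hat{tilde x}(t) = E[tilde x(t) | tilde y(1:t-1)]
innov = rng.standard_normal(dy)          # tilde y_i(t) - E[tilde y_i(t) | tilde y(1:t-1)];
                                         # identical to the y_i-innovation in (70)

h_star = -L_i @ x_til_hat - F_i @ innov  # optimal estimate, as in (68)
u_star = h_star - L_i @ x_bar            # change of variables, as in (69)
x_hat = x_bar + x_til_hat                # hat x(t) = bar x(t) + hat{tilde x}(t)
assert np.allclose(u_star, -L_i @ x_hat - F_i @ innov)   # matches (70)
```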
The MTMSE filtering strategy depends on the weight matrix $S(t)$, and that is the reason why there is no separation between estimation and control. Nonetheless, the optimal gains can be computed as follows.
1) Solve a Riccati equation to compute the weight matrices $S(1{:}T)$ and gains $L(1{:}T)$.
2) Solve a Kalman filtering equation (which does not depend on $S(1{:}T)$) to compute the covariances $\hat\Sigma(t)$ and $\hat\Theta(t)$ defined in Theorem 2.
3) Use $S(t)$, $L(t)$, $\hat\Sigma(t)$, and $\hat\Theta(t)$ to obtain the optimal gains $F_i(t)$ by solving a system of matrix equations.
4) Using Theorem 4 above, write the optimal strategy $g^*_{i,t}$ in terms of $F_i(t)$ and $L_i(t)$.

ACKNOWLEDGMENT

The authors are grateful to Peter Caines, Roland Malhamé, and Demosthenis Teneketzis for useful discussions and feedback.

REFERENCES

[1] M. Afshari and A. Mahajan, "Team optimal decentralized state estimation," in IEEE Conference on Decision and Control (CDC), Dec. 2018.
[2] R. E. Kalman, "A new approach to linear filtering and prediction problems," Transactions of the ASME–Journal of Basic Engineering, 1960.
[3] S. M. Barta, "On linear control of decentralized stochastic systems," Ph.D. dissertation, Massachusetts Institute of Technology, 1978.
[4] M. S. Andersland and D. Teneketzis, "Measurement scheduling for recursive team estimation," Journal of Optimization Theory and Applications, vol. 89, no. 3, pp. 615–636, Jun. 1996.
[5] R. E. Lucas, "Expectations and the neutrality of money," Journal of Economic Theory, vol. 4, no. 2, pp. 103–124, Apr. 1972.
[6] S. Morris and H. S. Shin, "Social value of public information," The American Economic Review, vol. 92, no. 5, pp. 1521–1534, 2002.
[7] F. Allen, S. Morris, and H. S. Shin, "Beauty contests and iterated expectations in asset markets," Review of Financial Studies, vol. 19, no. 3, pp. 719–752, 2006.
[8] C. Sanders, E. Tacker, and T. Linton, "A new class of decentralized filters for interconnected systems," IEEE Trans. Autom. Control, vol. 19, no. 3, pp. 259–262, Jun. 1974.
[9] C. Sanders, E. Tacker, T. Linton, and R. Ling, "Specific structures for large-scale state estimation algorithms having information exchange," IEEE Trans. Autom. Control, vol. 23, no. 2, pp. 255–261, Apr. 1978.
[10] J. Speyer, "Computation and transmission requirements for a decentralized linear-quadratic-Gaussian control problem," IEEE Trans. Autom. Control, vol. 24, no. 2, pp. 266–269, Apr. 1979.
[11] C.-Y. Chong, "Hierarchical estimation," in Proc. MIT/ONR Workshop on C3, 1979.
[12] M. Hassan, G. Salut, M. Singh, and A. Titli, "A decentralized computational algorithm for the global Kalman filter," IEEE Trans. Autom. Control, vol. 23, no. 2, pp. 262–268, Apr. 1978.
[13] S. J. Julier and J. K. Uhlmann, "Using covariance intersection for SLAM," Robotics and Autonomous Systems, vol. 55, no. 1, pp. 3–20, Jan. 2007.
[14] H. Li and F. Nashashibi, "Cooperative multi-vehicle localization using split covariance intersection filter," IEEE Intell. Transp. Syst. Mag., vol. 5, no. 2, pp. 33–44, 2013.
[15] B. Noack, J. Sijs, M. Reinhardt, and U. D. Hanebeck, "Decentralized data fusion with inverse covariance intersection," Automatica, vol. 79, pp. 35–41, May 2017.
[16] R. Olfati-Saber, "Distributed Kalman filtering for sensor networks," in IEEE Conf. on Decision and Control, Dec. 2007, pp. 5492–5498.
[17] ——, "Kalman-consensus filter: Optimality, stability, and performance," in IEEE Conference on Decision and Control and Chinese Control Conference, Dec. 2009, pp. 7036–7042.
[18] S. Kar and J. Moura, "Distributed consensus algorithms in sensor networks with imperfect communication: Link failures and channel noise," IEEE Trans. Signal Process., vol. 57, no. 1, pp. 355–369, Jan. 2009.
[19] F. S. Cattivelli and A. H. Sayed, "Diffusion strategies for distributed Kalman filtering and smoothing," IEEE Trans. Autom. Control, vol. 55, no. 9, pp. 2069–2084, Sep. 2010.
[20] R. Olfati-Saber and P. Jalalkamali, "Coupled distributed estimation and control for mobile sensor networks," IEEE Trans. Autom. Control, vol. 57, no. 10, pp. 2609–2614, Oct. 2012.
[21] G. Battistelli, L. Chisci, G. Mugnai, A. Farina, and A. Graziano, "Consensus-based linear and nonlinear filtering," IEEE Trans. Autom. Control, vol. 60, no. 5, pp. 1410–1415, May 2015.
[22] R. Radner, "Team decision problems," The Annals of Mathematical Statistics, vol. 33, no. 3, pp. 857–881, 1962.
[23] H. S. Witsenhausen, "Separation of estimation and control for discrete time systems," Proc. IEEE, vol. 59, no. 11, pp. 1557–1566, 1971.
[24] P. E. Caines, Linear Stochastic Systems. New York, NY, USA: John Wiley & Sons, Inc., 1987.
[25] Y.-C. Ho and K.-C. Chu, "Team decision theory and information structures in optimal control problems–Part I," IEEE Trans. Autom. Control, vol. 17, no. 1, pp. 15–22, Feb. 1972.
[26] T. Yoshikawa, "Dynamic programming approach to decentralized stochastic control problems," IEEE Trans. Autom. Control, vol. 20, no. 6, pp. 796–797, Dec. 1975.
[27] S. M. Asghari, Y. Ouyang, and A. Nayyar, "Optimal local and remote controllers with unreliable uplink channels," IEEE Transactions on Automatic Control, 2019 (in print).
[28] M. Afshari and A. Mahajan, "Optimal local and remote controllers with unreliable uplink channels: An elementary proof," IEEE Transactions on Automatic Control, vol. 65, no. 8, pp. 3616–3622, 2020.
[29] J. Swigart and S. Lall, "An explicit state-space solution for a decentralized two-player optimal linear-quadratic regulator," in Proceedings of the 2010 American Control Conference, Jun. 2010, pp. 6385–6390.
[30] N. Nayyar, D. Kalathil, and R. Jain, "Optimal decentralized control with asymmetric one-step delayed information sharing," IEEE Trans. Control Netw. Syst., vol. 5, no. 1, pp. 653–663, Mar. 2018.
[31] B. Z. Kurtaran and R. Sivan, "Linear-quadratic-Gaussian control with one-step-delay sharing pattern," IEEE Trans. Autom. Control, vol. 19, no. 5, pp. 571–574, Oct. 1974.
[32] N. Sandell and M. Athans, "Solution of some nonclassical LQG stochastic decision problems," IEEE Trans. Autom. Control, vol. 19, no. 2, pp. 108–116, Apr. 1974.
[33] J. Liu, A. S. Morse, A. Nedić, and T. Başar, "Exponential convergence of a distributed algorithm for solving linear algebraic equations," Automatica, vol. 83, pp. 37–46, Sep. 2017.
[34] T. Yang, J. George, J. Qin, X. Yi, and J. Wu, "Distributed least squares solver for network linear equations," Automatica, vol. 113, p. 108798, Mar. 2020.
[35] P. Wang, S. Mou, J. Lian, and W. Ren, "Solving a system of linear equations: From centralized to distributed algorithms," Annual Reviews in Control, vol. 47, pp. 306–322, 2019.
[36] K. J. Åström, Introduction to Stochastic Control Theory. New York: Academic Press, 1970.
[37] P. R. Kumar and P. Varaiya, Stochastic Systems: Estimation, Identification and Adaptive Control. Upper Saddle River, NJ, USA: Prentice-Hall, Inc., 1986.

Mohammad Afshari (S'12) received the B.S. and M.S. degrees in Electrical Engineering from the Isfahan University of Technology, Isfahan, Iran, in 2010 and 2012, respectively. He is currently working towards the Ph.D. degree in Electrical and Computer Engineering at McGill University, Montreal, Canada. His current area of research is decentralized stochastic control, team theory, and reinforcement learning. Mr. Afshari is a member of the McGill Center for Intelligent Machines (CIM) and a member of the Research Group in Decision Analysis (GERAD).
Aditya Mahajan (S'06–M'09–SM'14) received the B.Tech degree from the Indian Institute of Technology, Kanpur, India, in 2003, and the M.S. and Ph.D. degrees from the University of Michigan, Ann Arbor, USA, in 2006 and 2008. From 2008 to 2010, he was a Postdoctoral Researcher at Yale University, New Haven, CT, USA. He has been with the Department of Electrical and Computer Engineering, McGill University, Montreal, Canada, since 2010, where he is currently Associate Professor. He serves as Associate Editor of Springer Mathematics of Control, Signals, and Systems. He was an Associate Editor of the IEEE Control Systems Society Conference Editorial Board from 2014 to 2017. He is the recipient of the 2015 George Axelby Outstanding Paper Award, the 2014 CDC Best Student Paper Award (as supervisor), and the 2016 NecSys Best Student Paper Award (as supervisor). His principal research interests include decentralized stochastic control, team theory, multi-armed bandits, real-time communication, information theory, and reinforcement learning.