Overlapping Covariance Intersection: Fusion with Partial Structural Knowledge of Correlation from Multiple Sources


Authors: Leonardo Pedroso, Pedro Batista, W. P. M. H. Heemels

IEEE TRANSACTIONS AND JOURNALS TEMPLATE

Abstract — Emerging large-scale engineering systems rely on distributed fusion for situational awareness, where agents combine noisy local sensor measurements with exchanged information to obtain fused estimates. However, at the sheer scale of these systems, tracking cross-correlations becomes infeasible, preventing the use of optimal filters. Covariance intersection (CI) methods address fusion problems with unknown correlations by minimizing worst-case uncertainty based on available information. Existing CI extensions exploit limited correlation knowledge but cannot incorporate structural knowledge of correlation from multiple sources, which naturally arises in distributed fusion problems. This paper introduces Overlapping Covariance Intersection (OCI), a generalized CI framework that accommodates this novel information structure. We formalize the OCI problem and establish necessary and sufficient conditions for feasibility. We show that a family-optimal solution can be computed efficiently via semidefinite programming, enabling real-time implementation. The proposed tools enable improved fusion performance for large-scale systems while retaining robustness to unknown correlations.

Index Terms — Covariance intersection, distributed estimation, multisensor data fusion, partial knowledge of correlation.

I. INTRODUCTION

Emerging large-scale engineering systems are composed of a large number of agents that interact in an environment to cooperatively achieve a goal.
In these settings, each agent has access to noisy data from multiple sensors and from communication with other agents, which must be fused to provide a good estimate of the required quantities, for instance, for situational awareness. Examples are mega-constellations of satellites, which rely on absolute position and relative measurements from GNSS receivers and communication to estimate their position [1], [2], and vehicle-to-everything networks, where autonomous vehicles obtain data from local sensors and from communication with infrastructure to estimate their position and the position of other vehicles and pedestrians [3].

Due to the sheer dimension of these systems, it is infeasible to keep track of the correlation between all measurements, which prevents the use of well-known optimal (centralized) filtering solutions [4]. As a result, these systems fall into the class of ultra large-scale systems, which, by definition, are control systems whose design cannot be feasibly carried out in a centralized manner [5]. Therefore, a distributed fusion approach is required. However, if unknown correlations are ignored, the estimation performance is degraded significantly and, worse, each agent computes deceivingly tighter estimation error bounds than the ground-truth bounds, which can lead to dire consequences. This effect is commonly known as double-counting and is showcased in [6], [7].

Footnote: This work was supported in part by LARSyS FCT funding (DOI: 10.54499/LA/P/0083/2020, 10.54499/UIDP/50009/2020, and 10.54499/UIDB/50009/2020). L. Pedroso and W.P.M.H. Heemels are with the Control Systems Technology section, Department of Mechanical Engineering, Eindhoven University of Technology, The Netherlands (e-mail: {l.pedroso, w.p.m.h.heemels}@tue.nl). L. Pedroso and P. Batista are with the Institute for Systems and Robotics, Instituto Superior Técnico, Universidade de Lisboa, Portugal (e-mail: pbatista@isr.tecnico.ulisboa.pt).
Covariance intersection (CI) tools have been devised over the past quarter century to avoid double-counting [8]. These techniques aim at fusing estimates from multiple sources whose correlation is (partially or totally) unknown. The fusion procedure is designed to minimize the worst-case uncertainty of the fused estimate under the available information. The basic CI setting assumes an information structure whereby the correlation between estimates is totally unknown. It was first developed in [9], [10], and the review paper [8] provides a recent survey of progress and applications of this method. In the basic setting, when only two estimates are fused, there is a tractable optimal CI method [11], but when dealing with multiple estimates [12], tractable CI tools do not typically produce an optimal fusion scheme [13].

Naturally, partial information about correlations between measurements can significantly improve fusion performance [14]. Such partial information can either be inferred from fundamental physical properties of the system or tracked by the agents. Several works in the literature deal with generalizing CI tools to different information structures with partial knowledge about correlation. These are briefly surveyed in Section I-B. In this paper, we introduce an information structure that has not been analyzed in the CI literature and that stems from the distributed fusion problem over emerging ultra large-scale systems [5]. In Section I-A, we introduce the information structure addressed in this paper resorting to a motivating example of a toy cooperative localization problem. Then, in Section I-B, the proposed information structure is compared with others previously studied in the literature.

A. Motivating Example

As a motivating example, consider a cooperative localization toy problem of a team of vehicles.
For simplicity, each vehicle $j$ is characterized by a scalar position, denoted by $x_j^k \in \mathbb{R}$ at discrete-time instant $k$. The position of vehicle $j$ evolves according to a discrete-time zero-mean random-walk model with variance $Q_j > 0$, which is uncorrelated between different vehicles. Formally, $x_j^{k+1} = x_j^k + d_j^k$, where $d_j^k$ is the drift of the random walk of vehicle $j$ at time $k$. The goal is to devise a dynamic distributed filtering solution whereby each vehicle $j$ computes an unbiased estimate of its absolute position at each time instant $k$, denoted by $\hat{x}_j^k \in \mathbb{R}$. For that, it relies on the unbiased estimate at the previous discrete-time instant, i.e., $\hat{x}_j^{k-1}$, and on relative position measurements w.r.t. other neighboring vehicles at time $k$, which are defined in what follows. We denote the estimation error of the position of each generic vehicle $j$ at time $k-1$ by $\tilde{x}_j^{k-1} := \hat{x}_j^{k-1} - x_j^{k-1}$, which is zero-mean because $\hat{x}_j^{k-1}$ is assumed to be unbiased.

In this example, we analyze for simplicity a single fusion instance at time $k$ of the vehicle $i$ that is depicted in Fig. 1. Vehicles $p$ and $q$, with respect to which vehicle $i$ gets relative position measurements, are also depicted in Fig. 1. First, with the information about the dynamical model, vehicle $i$ can make a prediction $z_i^k := \hat{x}_i^{k-1}$ of its position $x_i^k$ at time $k$. The prediction error $z_i^k - x_i^k$ is zero-mean because

$z_i^k = \hat{x}_i^{k-1} = x_i^{k-1} + \tilde{x}_i^{k-1} = x_i^k + (\tilde{x}_i^{k-1} - d_i^{k-1})$ (1)

and $\tilde{x}_i^{k-1}$ and $d_i^{k-1}$ are zero-mean. Second, vehicle $i$ communicates and has access to relative measurements w.r.t. two other neighboring vehicles $p$ and $q$, as depicted in Fig. 1. A relative measurement w.r.t. vehicle $p$ at time $k$ is given by $y_{i,p}^k := x_p^k - x_i^k + e_{i,p}^k$, where $e_{i,p}^k$ is zero-mean measurement noise.
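The zero-mean property of the prediction error in (1) can be checked with a quick Monte-Carlo sketch. All variance values below are illustrative choices, not taken from the paper:

```python
import numpy as np

# Monte-Carlo check that the prediction error in (1) is zero-mean.
# Q_i (drift variance) and P_i (previous error variance) are illustrative.
rng = np.random.default_rng(0)
Q_i, P_i, n = 0.4, 0.25, 200_000

x_prev = rng.normal(0.0, 1.0, n)             # true positions x_i^{k-1}
x_tilde = rng.normal(0.0, np.sqrt(P_i), n)   # zero-mean errors x~_i^{k-1}
d = rng.normal(0.0, np.sqrt(Q_i), n)         # zero-mean drifts d_i^{k-1}

x_hat_prev = x_prev + x_tilde                # unbiased estimates at k-1
x_k = x_prev + d                             # random-walk propagation
err = x_hat_prev - x_k                       # z_i^k - x_i^k = x~ - d, per (1)

print(err.mean())                            # close to 0: prediction unbiased
print(err.var())                             # close to P_i + Q_i
```

Since the error equals $\tilde{x}_i^{k-1} - d_i^{k-1}$ with the two terms independent, its sample variance concentrates around $P_i + Q_i$.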
Such a relative measurement provides an estimate $z_{i,p}^k := \hat{x}_p^{k-1} - y_{i,p}^k$ of $x_i^k$, where $\hat{x}_p^{k-1}$ is transmitted from vehicle $p$ to $i$. The error $z_{i,p}^k - x_i^k$ is zero-mean because

$z_{i,p}^k = \hat{x}_p^{k-1} - y_{i,p}^k = x_p^{k-1} + \tilde{x}_p^{k-1} - (x_p^k - x_i^k + e_{i,p}^k) = x_p^k - d_p^{k-1} + \tilde{x}_p^{k-1} - x_p^k + x_i^k - e_{i,p}^k = x_i^k + (\tilde{x}_p^{k-1} - d_p^{k-1} - e_{i,p}^k)$, (2)

and $\tilde{x}_p^{k-1}$, $d_p^{k-1}$, and $e_{i,p}^k$ are zero-mean. Similarly, a relative measurement $y_{i,q}^k$ w.r.t. vehicle $q$ provides an estimate $z_{i,q}^k := \hat{x}_q^{k-1} - y_{i,q}^k = x_i^k + (\tilde{x}_q^{k-1} - d_q^{k-1} - e_{i,q}^k)$.

To obtain a fused estimate of $x_i^k$, vehicle $i$ uses a linear filter relying on the information from the predicted estimate and the two relative sensor measurements, i.e., $\hat{x}_i^k = K_{i,i}^k z_i^k + K_{i,p}^k z_{i,p}^k + K_{i,q}^k z_{i,q}^k$. This filter is unbiased, i.e., $\mathrm{E}[\hat{x}_i^k] = x_i^k$, if and only if $K_{i,i}^k + K_{i,p}^k + K_{i,q}^k = 1$. Define $\mathbf{z}_i^k := [z_i^k \; z_{i,p}^k \; z_{i,q}^k]^\top$, $\mathbf{e}_i^k := [z_i^k - x_i^k \;\; z_{i,p}^k - x_i^k \;\; z_{i,q}^k - x_i^k]^\top$, $\tilde{\chi}_i^{k-1} := [\tilde{x}_p^{k-1} \; \tilde{x}_i^{k-1} \; \tilde{x}_q^{k-1}]^\top$, and $K_i^k := [K_{i,i}^k \; K_{i,p}^k \; K_{i,q}^k]$.¹ The variance of the estimation error of the fused estimate is given by $\mathrm{E}[(\hat{x}_i^k - x_i^k)^2] = K_i^k \mathrm{E}[\mathbf{e}_i^k \mathbf{e}_i^{k\top}] K_i^{k\top}$.

¹ Notice that a non-standard concatenation order is used in the definition of $\tilde{\chi}_i^{k-1}$ for the clarity of the graphical representation of the information structure in Fig. 2.

Fig. 1. Scheme of the cooperative localization toy problem.

From (1) and
(2), one can write

$\mathrm{E}[\mathbf{e}_i^k \mathbf{e}_i^{k\top}] = \begin{bmatrix} Q_i & 0 \\ 0 & R_i^k + \mathrm{diag}(Q_p, Q_q) \end{bmatrix} + \begin{bmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix} \mathrm{E}[\tilde{\chi}_i^{k-1} \tilde{\chi}_i^{k-1\top}] \begin{bmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix}^\top$, (3)

where $R_i^k = \mathrm{E}[[e_{i,p}^k \; e_{i,q}^k]^\top [e_{i,p}^k \; e_{i,q}^k]]$ is the covariance of the relative measurement noise. The goal for vehicle $i$ is to design the gain $K_i^k$ such that $\mathrm{E}[(\hat{x}_i^k - x_i^k)^2]$ is minimized. Crucially, $\mathrm{E}[\tilde{\chi}_i^{k-1} \tilde{\chi}_i^{k-1\top}]$ is not exactly known to vehicle $i$. Instead, partial information about $\mathrm{E}[\tilde{\chi}_i^{k-1} \tilde{\chi}_i^{k-1\top}]$ is known, and one desires to minimize the worst-case $\mathrm{E}[(\hat{x}_i^k - x_i^k)^2]$ under the available information.

CI tools have been devised for such filter design problems under different flavors of partial knowledge of correlation between the available estimates. In the context of this example, we assume that each vehicle $j$ keeps track of the estimate of its own position only and of an upper bound on the joint estimation error covariance of the position estimates of vehicle $j$ and its neighbors. In this running example: (i) vehicle $i$ keeps and updates $\hat{x}_i^{k-1}$ and an upper bound $\mathrm{E}[\tilde{\chi}_i^{k-1} \tilde{\chi}_i^{k-1\top}] \preceq X_i^{k-1}$; (ii) vehicle $p$ keeps and updates $\hat{x}_p^{k-1}$ and an upper bound $\mathrm{E}[\tilde{\chi}_p^{k-1} \tilde{\chi}_p^{k-1\top}] \preceq X_p^{k-1}$, where $\tilde{\chi}_p^{k-1} := [\tilde{x}_i^{k-1} \; \tilde{x}_p^{k-1}]^\top$; and so forth. If the bounds $X_p^{k-1}$ and $X_q^{k-1}$ are transmitted from $p$ and $q$, respectively, to $i$, then vehicle $i$ has access to three bounds on principal submatrices of $\mathrm{E}[\tilde{\chi}_i^{k-1} \tilde{\chi}_i^{k-1\top}]$. The structure of the bounds $X_p^{k-1}$ and $X_q^{k-1}$ on $\mathrm{E}[\tilde{\chi}_i^{k-1} \tilde{\chi}_i^{k-1\top}]$ is depicted in Fig. 2(v); the bound $X_i^{k-1}$, which would be represented as a bound on the whole matrix, is omitted for the sake of clarity of the figure.
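The construction of the error covariance in (3) can be sketched numerically. All variance values and the candidate joint error covariance below are illustrative, not taken from the paper:

```python
import numpy as np

# Numerical sketch of the error covariance (3); all values are illustrative.
Q_i, Q_p, Q_q = 0.4, 0.3, 0.5
R_meas = np.diag([0.1, 0.2])                   # R_i^k, relative-meas. noise
P_chi = np.array([[0.30, 0.05, 0.02],          # E[chi~ chi~^T], order (p,i,q)
                  [0.05, 0.25, 0.04],
                  [0.02, 0.04, 0.35]])

R = np.zeros((3, 3))
R[0, 0] = Q_i                                  # prediction drift variance
R[1:, 1:] = R_meas + np.diag([Q_p, Q_q])       # measurement + drift variances
C = np.array([[0., 1., 0.],                    # permutation picking (i, p, q)
              [1., 0., 0.],                    # out of the (p, i, q) ordering
              [0., 0., 1.]])

E_ee = R + C @ P_chi @ C.T                     # equation (3)
print(E_ee[0, 0])                              # equals Q_i + var(x~_i)
```

The (1,1) entry illustrates how the prediction error variance combines the drift variance $Q_i$ with the previous estimation error variance, consistently with (1).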
The goal for vehicle $i$ is to: (i) design $K_i^k$ minimizing the worst-case second moment of the estimation error of the fused estimate, under knowledge of the structural bounds $X_i^{k-1}$, $X_p^{k-1}$, and $X_q^{k-1}$; and (ii) compute a consistent bound $\mathrm{E}[\tilde{\chi}_i^k \tilde{\chi}_i^{k\top}] \preceq X_i^k$ at time $k$.

The aforementioned information structure for the fusion problem at vehicle $i$ includes partial structural information about cross-correlations of $\mathrm{E}[\tilde{\chi}_i^{k-1} \tilde{\chi}_i^{k-1\top}]$, as depicted in Fig. 2(v). In contrast, state-of-the-art approaches to distributed fusion in this toy problem involve receiving only bounds on $\mathrm{E}[\tilde{x}_p^{k-1} \tilde{x}_p^{k-1\top}]$ and $\mathrm{E}[\tilde{x}_q^{k-1} \tilde{x}_q^{k-1\top}]$ from vehicles $p$ and $q$, respectively [15]. These bounds only provide information about the autocorrelations of $\mathrm{E}[\tilde{\chi}_i^{k-1} \tilde{\chi}_i^{k-1\top}]$, as depicted in Fig. 2(ii). Therefore, the novel information structure proposed in this paper has the potential for better distributed fusion performance by making use of partial information about cross-correlations.

B. State-of-the-art

CI tools have been developed in the literature to address the following information structures, which are schematically represented in Fig. 2:

(i) Basic covariance intersection (CI): The correlation between estimates is fully unknown and only bounds on the autocorrelation of each estimate are known. After the seminal works [9] and [10], several different approaches have been proposed to address this basic setting (e.g., [16]).

(ii) Split CI (SCI): The estimates contain an independent component and a (possibly) correlated component. Bounds on the independent component are known (matrix on the left in Fig. 2(ii)). Correlation between the correlated components of different estimates is fully unknown, and bounds on the autocorrelation of the correlated components of the estimates are known (bounds on the matrix on the right in Fig.
2(ii)) [15], [17], [18]. SCI was also recently extended to capture known cross-correlations [19].

(iii) Correlation Coefficient CI (CCCI): A bound on the autocorrelation of each estimate is known and a bound on a scalar correlation coefficient is also known (e.g., Pearson's correlation coefficient) [14], [20], [21].

(iv) Partitioned CI (PCI): The autocorrelation of each estimate and the correlation between some pairs of estimates are exactly known. The correlation between other pairs is completely unknown [22]–[24].

Motivated by the toy cooperative localization problem in Section I-A, we introduce a novel information structure:

(v) Overlapping CI (OCI): Multiple bounds on components of the joint estimation error covariance matrix are known (two bounds are represented in Fig. 2(v)). The components that are affected by the multiple available bounds may overlap, which justifies the term coined for this structure. In Fig. 2(v), the two illustrative bounds are on principal submatrices, but that need not be the case, as will be discussed further in Section II. Notice that this is the information structure that we obtain in the toy cooperative localization problem in Section I-A. Specifically, bounds $X_i^{k-1}$, $X_p^{k-1}$, and $X_q^{k-1}$ are available on principal submatrices of $\mathrm{E}[\tilde{\chi}_i^{k-1} \tilde{\chi}_i^{k-1\top}]$.

Fig. 2. Comparison of the estimation error covariance matrix with three estimates for distinct partial information structures. Dashed contours represent knowledge of bounds and solid contours represent exact knowledge.

To the best of the authors' knowledge, the distributed fusion problem under the OCI information structure has not been previously addressed in the literature. It is encountered in the motivating example and it is exemplary of many cooperative localization problems for the next generation of ultra large-scale engineering systems [5, Section 5]. Standard CI approaches are not appropriate to address the OCI problem. Indeed, in Section III-B, we apply a state-of-the-art approach (described, e.g., in [13]) that is common for CI problems and we analyze its shortcomings.

The PCI problem is the closest in the literature to the OCI problem, since both have structural knowledge about the joint estimation error. However, their scopes are very different. First, PCI requires exact knowledge of element-wise correlation between pairs of estimates, whereas OCI only requires bounds. Second, the structural knowledge of the OCI problem is compatible with correlation information that is provided by multiple sources and may overlap as a result, whereas that of the PCI problem is not. Third, the PCI problem is limited to handling bounds on principal submatrices of the joint estimation error covariance matrix, whereas the OCI problem is not. Moreover, the techniques employed to approach the PCI problem are very distinct from those used in this paper to address the OCI problem.

C. Contributions

The main contributions of this paper are twofold:

(a) We introduce a distributed fusion problem with partial structural knowledge of correlation, which we call OCI, that has not been addressed previously in the literature.

(b) The problem is analyzed in depth. We establish necessary and sufficient conditions on the available information to ensure the feasibility of the problem. We express a family-optimal solution to the problem as a semidefinite program (SDP), which is computationally tractable and suitable for real-time implementation.

D. Notation

Throughout this paper, the $n \times n$ identity, $n \times m$ null, and $n \times m$ ones matrices are denoted by $I_n$, $0_{n \times m}$, and $1_{n \times m}$, respectively.
When clear from context, the subscripts are dropped to streamline notation. The sets of $n \times n$ real symmetric positive semidefinite and positive definite matrices are denoted by $\mathbb{S}_+^n$ and $\mathbb{S}_{++}^n$, respectively. Moreover, $P \succ 0$ ($P \succeq 0$) denotes that the symmetric matrix $P \in \mathbb{R}^{n \times n}$ is positive definite (semidefinite), and $P \succ Q$ ($P \succeq Q$) denotes that the symmetric matrix $P - Q \in \mathbb{R}^{n \times n}$ is positive definite (semidefinite). Given a matrix $A \in \mathbb{R}^{n \times m}$, $A^+$ denotes the Moore-Penrose inverse of $A$ [25, Chap. 1.6], and $\mathrm{col}\,A \subseteq \mathbb{R}^n$, $\mathrm{row}\,A \subseteq \mathbb{R}^m$, and $\ker A \subseteq \mathbb{R}^m$ denote the column space, row space, and kernel of $A$, respectively. Given a linear subspace $\mathcal{K}$ of $\mathbb{R}^n$, $\mathcal{K}^\perp$ denotes the orthogonal complement of $\mathcal{K}$ and $\dim \mathcal{K}$ its dimension.

II. PROBLEM FORMULATION

The objective is to estimate a state $x \in \mathbb{R}^n$ based on the availability of $N$ partial estimates $z_i = H_i x + e_i$ with $i = 1, 2, \ldots, N$ (which come from multiple sources). Here, each $H_i$ is a known matrix and $e_i$ is zero-mean random noise. Concatenating all estimates in a single vector $z := [z_1^\top \cdots z_N^\top]^\top$, one can write $z = Hx + e$, where $H := [H_1^\top \cdots H_N^\top]^\top \in \mathbb{R}^{o \times n}$ and $e := [e_1^\top \cdots e_N^\top]^\top$. The second moment of $e$ is denoted by $\mathrm{E}[ee^\top]$. Naturally, we assume that $z$ is rich enough to provide an unbiased estimate of $x$, i.e., that $H$ has full column rank. We are interested in designing a linear fusion law that provides an unbiased estimator $\hat{x} = Kz$, where $K \in \mathbb{R}^{n \times o}$ is the fusion gain. The estimator $\hat{x}$ is unbiased if $\mathrm{E}[\hat{x}] = x$, which is equivalent to enforcing $KH = I$ when designing $K$. Crucially, we consider the case where $\mathrm{E}[ee^\top]$ is not exactly known and has the form

$\mathrm{E}[ee^\top] = R + CPC^\top$, (4)

where $R \in \mathbb{S}_{++}^o$ and $C \in \mathbb{R}^{o \times m}$ are known and $P \in \mathbb{S}_{++}^m$ is not exactly known.
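The unbiasedness argument above can be sketched numerically: for any $K$ with $KH = I$, the estimation error is exactly $Ke$, so zero-mean noise yields an unbiased estimator. The sizes and the particular left inverse below are illustrative choices:

```python
import numpy as np

# Sketch of the unbiasedness argument: any K with KH = I gives x^ - x = Ke.
rng = np.random.default_rng(1)
n, o = 2, 4
H = np.vstack([np.eye(n), rng.normal(size=(o - n, n))])  # full column rank
K = np.linalg.pinv(H)                  # one particular left inverse of H
assert np.allclose(K @ H, np.eye(n))   # the constraint KH = I

x = rng.normal(size=n)
e = rng.normal(size=o)
x_hat = K @ (H @ x + e)                # fused estimate x^ = Kz
print(np.allclose(x_hat - x, K @ e))   # estimation error is exactly Ke
```

The pseudoinverse is only one feasible gain; the OCI problem below optimizes over all left inverses of $H$.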
Specifically, the information structure addressed in this paper assumes knowledge of $M$ bounds on components of $P$, which come from the sources of the partial estimates. Each bound $b \in \{1, 2, \ldots, M\}$ is written as $W_b P W_b^\top \preceq X_b$, where $W_b \in \mathbb{R}^{o_b \times m}$ and $X_b \in \mathbb{S}_{++}^{o_b}$. One can now define the set of admissible matrices $P$ given the known bounds as

$\mathcal{P} := \{P \in \mathbb{S}_{++}^m : W_b P W_b^\top \preceq X_b \;\; \forall b \in \{1, 2, \ldots, M\}\}$. (5)

Example II.1. Notice that this information structure is a generalization of the one that arises from the cooperative localization toy problem introduced in Section I-A. Indeed, there are three estimates available, $z_i^k$, $z_{i,p}^k$, and $z_{i,q}^k$, and $H_1 = H_2 = H_3 = 1$. Moreover, $\mathrm{E}[\mathbf{e}_i^k \mathbf{e}_i^{k\top}]$ in (3) has the form of (4), with

$R = \begin{bmatrix} Q_i & 0 \\ 0 & R_i^k + \mathrm{diag}(Q_p, Q_q) \end{bmatrix}$ and $C = \begin{bmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix}$.

Typically, $R$ represents process and/or sensor noise and $C$ represents how the uncertainty described by $P$ shapes the error of the output $z$. The three bounds on $P = \mathrm{E}[\tilde{\chi}_i^{k-1} \tilde{\chi}_i^{k-1\top}]$ have the form $W_b P W_b^\top \preceq X_b$, where

$W_1 = I_3$, $W_2 = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \end{bmatrix}$, $W_3 = \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}$. △

The estimation error is a random vector denoted by $\tilde{x} := \hat{x} - x$. Given the constraint $KH = I$ on the gain, one can write $\tilde{x} = Kz - x = K(Hx + e) - x = Ke$ and $\mathrm{E}[\tilde{x}\tilde{x}^\top] = K \mathrm{E}[ee^\top] K^\top$. The goal is to design a gain $K$ that optimizes an upper bound on the worst-case second moment of the estimation error over all admissible $P \in \mathcal{P}$. Formally, the goal is to solve the optimization problem

$\min_{K \in \mathbb{R}^{n \times o},\, B \in \mathbb{S}_+^n} \; J(B) \quad \mathrm{s.t.} \quad KH = I, \;\; B \succeq K(R + CPC^\top)K^\top \;\; \forall P \in \mathcal{P}$, (6)

where $J : \mathbb{S}_+^n \to \mathbb{R}_{\geq 0}$ is any optimality criterion that satisfies the following monotonicity condition.

Assumption 1. Given $X, Y \in \mathbb{S}_+^n$, the map $J : \mathbb{S}_+^n \to \mathbb{R}$ is such that $X \succ Y \implies J(X) > J(Y)$.

This monotonicity assumption on $J$ is very mild. Intuitively, let $B_1$ and $B_2$ be error covariance matrices.
Assumption 1 enforces that if $B_1 \prec B_2$, i.e., the covariance $B_2$ portrays a larger spread of the error distribution in every direction than $B_1$, then $J(B_2) > J(B_1)$. Common criteria in fusion applications, such as the trace or the determinant, satisfy it.

Remark II.1. A Kalman filtering problem can be cast in this framework. Suppose we have access to an unbiased a priori estimate of $x$, denoted by $\hat{x}_-$, i.e., $\hat{x}_- := x + e'_-$, where $e'_-$ is zero-mean noise. In the context of the Kalman filter, $\hat{x}_-$ would be the so-called predicted estimate. Suppose we also have access to a vector of sensor outputs $y := C'x + e'_y$, where $e'_y$ is zero-mean noise. Then, one can formulate the problem of finding an unbiased estimate of $x$ in the framework presented in this section with $z^\top = [\hat{x}_-^\top \; y^\top]$, $H^\top = [I_n \; C'^\top]$, and $e^\top = [e'^\top_- \; e'^\top_y]$. If one writes the linear gain $K$ as $K = [K_- \; K_y]$, where $K_- \in \mathbb{R}^{n \times n}$, the condition $KH = I$ can be equivalently written as $K_- = I - K_y C'$. The expression for the linear filter then becomes $\hat{x} = K_- \hat{x}_- + K_y y = \hat{x}_- + K_y(y - C'\hat{x}_-)$, which is the standard form of the update step of a Kalman filter. The design problem becomes finding the gain $K_y$. △

Remark II.2. The form of $\mathrm{E}[ee^\top]$ in (4) and the partial information structure in (5) generalize multiple CI problems. For the sake of illustration, consider the availability of only two partial estimates $z_1$ and $z_2$. First, a basic CI problem can be cast in the OCI framework with $R = 0$, $C = I$, $W_1 = [I \; 0]$, and $W_2 = [0 \; I]$. Second, an SCI problem can be cast with $R = \mathrm{diag}(X_1^{\mathrm{ind}}, X_2^{\mathrm{ind}})$, $C = I$, $W_1 = [I \; 0]$, and $W_2 = [0 \; I]$, where $X_1^{\mathrm{ind}}$ and $X_2^{\mathrm{ind}}$ are the bounds on the autocorrelation of the independent components. Third, overlapping-states fusion with CI [26] can be cast with $R = 0$
and $C = I$ (matrices $W_1$ and $W_2$ are omitted for the sake of brevity). However, in this work, we focus exclusively on the case $R \succ 0$, which arises from the motivating example in Section I-A. We envision that results analogous to the ones derived in this work can be obtained for $R = 0$ and $R \succeq 0$ and applied to these CI flavors. Nevertheless, those results do not follow immediately from the results for $R \succ 0$ herein and are therefore left for future work. △

III. OVERLAPPING COVARIANCE INTERSECTION

In this section, we propose an approach to solve the OCI problem (6). An analysis of the partial knowledge structure is carried out in Section III-A. In Section III-B, we apply a state-of-the-art approach that is common for CI problems to the OCI problem introduced in this paper and discuss why it is not appropriate. In Section III-C, we provide a computationally efficient solution approach to the OCI problem (6).

A. Analysis of Partial Knowledge

First, one can rewrite the set $\mathcal{P}$ of admissible matrices $P$ resorting to bounds on the inverse of $P$. The convenience of this reformulation is discussed further in Example III.1 and also in Section III-C.

Lemma 1. The set $\mathcal{P}$ of admissible matrices $P$ can be expressed as $\mathcal{P} = \{P \in \mathbb{S}_{++}^m : P^{-1} \succeq Y_b \;\; \forall b \in \{1, 2, \ldots, M\}\}$, where $Y_b := W_b^\top X_b^{-1} W_b$ for $b = 1, 2, \ldots, M$.

Proof. See Appendix I-A.

Second, we establish necessary and sufficient conditions for the boundedness of admissible covariance matrices. Define $W := [W_1^\top \cdots W_M^\top]^\top$. The row space of $W$ characterizes the components of $P$ for which bounds are known, which will be instrumental in the following results.

Lemma 2. There exists $X \in \mathbb{S}_{++}^m$ such that $X \succeq P$ for all $P \in \mathcal{P}$ if and only if $W$ is full column rank. There exists $Q \in \mathbb{S}_{++}^o$ such that $Q \succeq R + CPC^\top$ for all $P \in \mathcal{P}$ if and only if $\mathrm{rank}(W) = \mathrm{rank}([W^\top \; C^\top]^\top)$.

Proof. See Appendix I-B.
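The equivalence stated in Lemma 1 can be checked numerically on a small instance. The matrices below are illustrative data, with bounds chosen loosely enough that $P$ is admissible:

```python
import numpy as np

def is_psd(A, tol=1e-9):
    """Check A >= 0 (PSD) via the eigenvalues of its symmetric part."""
    return bool(np.all(np.linalg.eigvalsh((A + A.T) / 2) >= -tol))

# Lemma 1 on a small instance: W_b P W_b^T <= X_b holds exactly when
# P^{-1} >= Y_b := W_b^T X_b^{-1} W_b. All data is illustrative.
P = np.array([[0.30, 0.05],
              [0.05, 0.25]])
W1, W2 = np.eye(2), np.array([[1., 0.]])
X1 = W1 @ P @ W1.T + 0.1 * np.eye(2)      # loose bounds, so P is admissible
X2 = W2 @ P @ W2.T + 0.1 * np.eye(1)

P_inv = np.linalg.inv(P)
for Wb, Xb in [(W1, X1), (W2, X2)]:
    Yb = Wb.T @ np.linalg.inv(Xb) @ Wb
    # both sides of the Lemma 1 equivalence agree on this instance
    assert is_psd(Xb - Wb @ P @ Wb.T) == is_psd(P_inv - Yb)
print("Lemma 1 equivalence holds on this instance")
```

Note that $Y_2$ is rank-deficient: a bound on a single component of $P$ translates into a degenerate lower bound on $P^{-1}$, which is precisely the geometric picture of Example III.1 below.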
Note that the results of Lemma 2 are quite intuitive. If the bounds $W_b P W_b^\top \preceq X_b$, $b = 1, 2, \ldots, M$, are informative about all components of $P$, i.e., the row space of $[W_1^\top \cdots W_M^\top]^\top$ is $\mathbb{R}^m$ (or, equivalently, $W$ is full column rank), then the admissible matrices $P \in \mathcal{P}$ are bounded. Furthermore, if the bounds are informative about the components of $P$ extracted by $C$, i.e., the row space of $C$ is a linear subspace of the row space of $W$ (or, equivalently, $\mathrm{rank}(W) = \mathrm{rank}([W^\top \; C^\top]^\top)$), then $R + CPC^\top$ is bounded for all admissible $P \in \mathcal{P}$. Also note that the second condition in Lemma 2 is weaker than the first, in the sense that the first implies the second.

Third, we turn to the feasibility analysis of the OCI problem (6). To be clear, (6) is said to be feasible if there is a pair $(K, B)$ that satisfies its constraints. It is well known that $H$ admits a left inverse if and only if $H$ is full column rank [27, Chap. 1.3, Lemma 2], so that is a necessary condition for the existence of $K$ subject to $KH = I$. On top of that, given any $K$, the boundedness of $R + CPC^\top$ for all admissible $P \in \mathcal{P}$ is a sufficient condition for the existence of $B$ such that $B \succeq K(R + CPC^\top)K^\top$ for all $P \in \mathcal{P}$. As a result, a sufficient condition for feasibility follows as a corollary of Lemma 2.

Corollary 1. If $H$ is full column rank and $\mathrm{rank}(W) = \mathrm{rank}([W^\top \; C^\top]^\top)$, then the OCI problem (6) is feasible. △

Remark III.1. It is possible to establish a necessary and sufficient condition for the feasibility of the OCI problem. Intuitively, this result follows from the fact that the OCI problem is feasible if and only if there is a gain $K$ that is a left inverse of $H$ and such that the bounds are informative about the components of $P$ extracted by $KC$, which formally amounts to $\mathrm{rank}(W) = \mathrm{rank}([W^\top \; (KC)^\top]^\top)$. However, these conditions cannot be easily written in terms of the parameters $H$, $R$, $C$, and $W$.
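The rank condition of Lemma 2 and Corollary 1 is straightforward to evaluate numerically. The two instances below are illustrative, with the first mimicking the overlapping bounds of Example II.1:

```python
import numpy as np

def oci_boundedness_condition(W_list, C):
    """Lemma 2 condition rank(W) == rank([W; C]): the bounds are
    informative about the components of P extracted by C."""
    W = np.vstack(W_list)
    return np.linalg.matrix_rank(W) == np.linalg.matrix_rank(np.vstack([W, C]))

C = np.eye(3)
# Overlapping bounds on components (1,2) and (2,3): rows of W span R^3.
W_full = [np.array([[1., 0., 0.], [0., 1., 0.]]),
          np.array([[0., 1., 0.], [0., 0., 1.]])]
# A single bound on the first component only: the condition fails.
W_partial = [np.array([[1., 0., 0.]])]

print(oci_boundedness_condition(W_full, C))     # True
print(oci_boundedness_condition(W_partial, C))  # False
```

In the first case the overlapping bounds jointly cover every component of $P$, so $R + CPC^\top$ is bounded over $\mathcal{P}$; in the second, an unbounded component of $P$ is extracted by $C$.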
Nonetheless, after the reformulation of the OCI problem in Section III-C, such a condition follows easily and is stated in Theorem 2, which is presented later in the paper. △

Example III.1. The information bounds $P^{-1} \succeq Y_b$ $\forall b \in \{1, 2, \ldots, M\}$ have a very interesting geometrical interpretation. Indeed, a covariance matrix $P$ can be geometrically represented by the ellipsoid $\mathcal{E}_P = \{x \in \mathbb{R}^m : x^\top P^{-1} x \leq 1\}$. Furthermore, the relation $P^{-1} \succeq Y_b$ can be geometrically understood as $\mathcal{E}_P \subseteq \mathcal{E}_{Y_b^{-1}}$ [28]. Therefore, the admissible set $\mathcal{P}$ can be characterized by the intersection of the ellipsoids generated by each bound, i.e., $P \in \mathcal{P} \iff \mathcal{E}_P \subseteq \cap_{b=1}^M \mathcal{E}_{Y_b^{-1}}$. In Fig. 3 this aspect is illustrated for $m = 2$, with $W_1 = I_2$, $W_2 = [1 \; 0]$, and $W_3 = [2 \; -1]$, and randomly generated $X_b$. Notice that the first bound bounds two components of $P$, thus it defines a bounded region in $\mathbb{R}^2$. The second and third bounds only bound one component of $P$ each, thus they define unbounded regions in $\mathbb{R}^2$ that can be understood as degenerate ellipsoids. In Fig. 4 we illustrate the bounds for the toy cooperative localization problem presented in Section I-A, with $W_b$ defined as in Example II.1 and with randomly generated $X_b$. Notice that the first bound is characterized by a bounded ellipsoid and the second and third by degenerate ellipsoids in $\mathbb{R}^3$. Furthermore, since $H$ and $W$ are full column rank, both conditions for feasibility in Corollary 1 hold and, as expected from Lemma 2, $\mathcal{P}$ is bounded. △

B. A State-of-the-art Approach

Henceforth, we resume the analysis of the OCI problem (6) with the goal of finding solutions efficiently. The OCI problem (6) has a form that is similar to most flavors of the CI problem but suffers from additional complications. In this section, we apply a state-of-the-art approach that is common for CI problems to the OCI problem introduced in this paper and we discuss why it is not appropriate.
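A classical fact underlying the decoupling approach discussed next is that, for any fixed $Q \succ 0$, the gain $(H^\top Q^{-1} H)^{-1} H^\top Q^{-1}$ minimizes $J(KQK^\top)$ subject to $KH = I$ under Assumption 1. A quick numerical sketch with the trace criterion (randomly generated illustrative data):

```python
import numpy as np

# For fixed Q > 0, K_opt = (H^T Q^{-1} H)^{-1} H^T Q^{-1} minimizes
# trace(K Q K^T) over all K with KH = I. Data is illustrative.
rng = np.random.default_rng(3)
n, o = 2, 4
H = rng.normal(size=(o, n))
A = rng.normal(size=(o, o))
Q = A @ A.T + np.eye(o)                               # Q > 0

Qi = np.linalg.inv(Q)
K_opt = np.linalg.inv(H.T @ Qi @ H) @ H.T @ Qi
assert np.allclose(K_opt @ H, np.eye(n))              # feasibility: KH = I

# Any other feasible gain is K_opt + D with DH = 0 (rows of D in the
# left null space of H), and the perturbation can only increase the cost.
U = np.linalg.svd(H)[0]
D = rng.normal(size=(n, o - n)) @ U[:, n:].T          # DH = 0 by construction
K2 = K_opt + D
assert np.allclose(K2 @ H, np.eye(n))
print(np.trace(K_opt @ Q @ K_opt.T) <= np.trace(K2 @ Q @ K2.T))   # True
```

The cost splits as $K_2 Q K_2^\top = K_{\mathrm{opt}} Q K_{\mathrm{opt}}^\top + DQD^\top$ because the cross terms vanish when $DH = 0$, which is why the perturbed gain can never do better.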
The optimization problem (6) is nonlinear and challenging to solve numerically due to the "coupling" between the gain and the admissible matrices $P \in \mathcal{P}$ in the constraint $B \succeq K(R + CPC^\top)K^\top$, $\forall P \in \mathcal{P}$. Applying the common approach used to decouple a CI problem to the OCI problem (6) amounts to proposing a bound $X$ for $P$ and then optimizing the gain for that bound with $B = K(R + CXC^\top)K^\top$, i.e.,

$\min_{K \in \mathbb{R}^{n \times o},\, X \in \mathbb{S}_+^m} \; J(K(R + CXC^\top)K^\top) \quad \mathrm{s.t.} \quad KH = I, \;\; X \succeq P \;\; \forall P \in \mathcal{P}$. (7)

A well-known result under Assumption 1 is that $(H^\top Q^{-1} H)^{-1} H^\top Q^{-1} = \mathrm{argmin}_K \; J(KQK^\top)$ s.t. $KH = I$ for any fixed $Q \succ 0$. Therefore, (7) can be decoupled by optimizing $X$ when the gain $K$ is chosen to be the optimal gain, i.e.,

$\min_{X \in \mathbb{S}_+^m} \; J((H^\top(R + CXC^\top)^{-1}H)^{-1}) \quad \mathrm{s.t.} \quad X \succeq P \;\; \forall P \in \mathcal{P}$. (8)

For details and a more thorough analysis of this approach see, for instance, [13].

Fig. 3. Illustrative OCI bounds and admissible set $\mathcal{P} \subset \mathbb{S}_{++}^2$.

Fig. 4. Illustrative OCI bounds and admissible set $\mathcal{P} \subset \mathbb{S}_{++}^3$.

There are three main shortcomings of applying this approach to the OCI problem:

(i) Notice that if $\mathcal{P}$ is not bounded, (8) is not numerically feasible. One can mitigate this by making use of the change of variables $F = CXC^\top$, which allows rewriting (8) as $\min_F \; J((H^\top(R + F)^{-1}H)^{-1})$ s.t. $F \succeq CPC^\top$, $\forall P \in \mathcal{P}$. Notice that if $CPC^\top$, $\forall P \in \mathcal{P}$, is not bounded, (8) is not numerically feasible (even after the change of variables). Still, as analyzed in Remark III.1, boundedness of $CPC^\top$, $\forall P \in \mathcal{P}$, is not a necessary condition for the feasibility of the OCI problem. Furthermore, even if $CPC^\top$, $\forall P \in \mathcal{P}$, is bounded, the bound $F$ that leads to the minimum objective may need to be unbounded in some components, in which case numerical issues will arise.
Therefore, expressing the OCI problem as (8) reduces the generality of the problem and may lead to numerical issues. In the basic CI setting, there is generally an assumption on the boundedness of the family of admissible covariance matrices.

(ii) The objective function for the basic CI problem that is analogous to (8) is given by $J((H^\top X^{-1}H)^{-1})$. In that case, a variable change $Y = X^{-1}$ and a thoughtfully devised family of bounds for $Y$ enable a computationally or analytically tractable (suboptimal) solution [13], [16]. Due to the more convoluted objective function of (8), the same techniques do not work for the OCI problem.

(iii) Bounds on admissible covariance matrices $P \in \mathcal{P}$, such as the one in (8), are hard to characterize. In [29] and [13], this issue is analyzed in depth for the basic CI problem (where $P$ is block diagonal, the diagonal blocks are known, and no information about the off-diagonal components is available). Indeed, despite addressing a significantly simpler information structure, it is concluded in [13] that families of bounds of $P$ are either not simple (i.e., not possible to parameterize, thus requiring brute-force design) or do not characterize all tight bounds (thus they may not contain the optimal bound, which leads to the suboptimality of the CI optimization problem). For computational efficiency, parametrizations of bounds are often used (see, e.g., [11], [16]).

For the aforementioned reasons, this approach is not suitable for the OCI problem introduced in this paper, especially because computational efficiency is of paramount importance in applications such as cooperative localization.

C. Efficient OCI Solution

In this section, the shortcomings of the state-of-the-art approach outlined in Section III-B are addressed and an efficient procedure to obtain solutions to the OCI problem (6) is proposed. First, we reframe the OCI problem (6) using novel techniques.
Second, we introduce a family of bounds of the set of admissible matrices P in the reframed problem, which allows for very efficient numerical computation of a family-optimal solution resorting to semidefinite programming.

First, given Assumption 1, it is possible to decouple the optimization of the gain and the covariance bounds in problem (6). Indeed, the following result establishes an equivalence between the solutions of (6) and of

min_{Y ∈ S^m_+, U, B ∈ S^n_+} J(B) (9a)
s.t. [B I; I H⊤R⁻¹H − U] ⪰ 0 (9b)
[U H⊤R⁻¹C; (H⊤R⁻¹C)⊤ Y + C⊤R⁻¹C] ⪰ 0 (9c)
Y ⪯ P⁻¹, ∀P ∈ P. (9d)

To be fully clear, by equivalence we mean that given a solution to (9) one can compute a solution to (6) and vice versa.

Theorem 1. Assume that Assumption 1 holds. If (Y⋆, U⋆, B⋆) is a solution to (9), then the pair (K⋆, B⋆) is a solution to (6), with

K⋆ := (H⊤R⁻¹(R − C(Y⋆ + C⊤R⁻¹C)⁺C⊤)R⁻¹H)⁻¹ H⊤R⁻¹(R − C(Y⋆ + C⊤R⁻¹C)⁺C⊤)R⁻¹. (10)

If (K◦, B◦) is a solution to (6), the triple (Y•, U•, B•) is a solution to (9), with

Y• = (K◦C)⊤(B◦ − K◦RK◦⊤)⁺(K◦C) (11)
U• = H⊤R⁻¹C(Y• + C⊤R⁻¹C)⁺C⊤R⁻¹H (12)
B• = (H⊤R⁻¹H − U•)⁻¹. (13)

Furthermore, the OCI problem (6) is feasible if and only if (9) is feasible.

Proof. See Appendix I-C.

Note that the OCI problem (9) is formulated using inverses of bounds of admissible covariance matrices. This allows unbounded components to be accounted for numerically in an efficient manner, which addresses shortcoming (i) described in Section III-B. However, it comes at the expense of a more complex characterization of the bounds B in the OCI problem (6). It turns out that resorting to conditions on the positive definiteness of block matrices allows the bounds B to be expressed via the two linear matrix inequalities (9b) and (9c).
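As a quick numerical sanity check of the reconstruction in Theorem 1 (a sketch with assumed toy matrices, not the paper's implementation), one can verify that the gain (10), built from any fixed bound Y standing in for (9d), satisfies the unbiasedness constraint KH = I of the OCI problem (6):

```python
import numpy as np

rng = np.random.default_rng(1)
n, o, m = 2, 4, 3                      # assumed toy dimensions
H = rng.standard_normal((o, n))        # generically full column rank
C = rng.standard_normal((o, m))
R = np.diag(rng.uniform(0.5, 2.0, o))  # R ≻ 0
Y = np.eye(m)                          # a fixed bound Y ⪯ P⁻¹, standing in for (9d)

Ri = np.linalg.inv(R)
Yp = np.linalg.pinv(Y + C.T @ Ri @ C)      # Moore–Penrose pseudoinverse (Y + C⊤R⁻¹C)⁺
W = Ri @ (R - C @ Yp @ C.T) @ Ri           # weighting matrix appearing in (10)
K = np.linalg.solve(H.T @ W @ H, H.T @ W)  # K⋆ = (H⊤WH)⁻¹H⊤W

print(np.allclose(K @ H, np.eye(n)))       # KH = I holds by construction
```

Whatever bound Y is plugged in, KH = I follows from the structure (H⊤WH)⁻¹H⊤W alone, which is why the bound and the gain can be optimized separately.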
Expressing the bounds via (9b) and (9c) enables efficient optimization with off-the-shelf solvers, which addresses shortcoming (ii). (Within the CI field, the inverse covariance intersection approach studied in [30]–[32] also handles inverses of covariance matrices to address the basic CI information structure under a specific decomposition of the estimates and covariance bounds. The Kalman filter in information form [33, Chap. 6] also handles inverses of covariance matrices to prevent numerical issues.)

Remark III.2. Besides designing a gain K and obtaining a fused covariance bound B, one may also be interested in obtaining a bound for P or for some components of P, i.e., DPD⊤ with D ∈ R^{d×m}. In that case, one can add a regularization term to (9) to obtain a good bound with a negligible increase of the original objective. Clearly, for such a bound M ∈ S^d_++ to exist, DPD⊤ needs to be bounded for all P ∈ P, which, by making cosmetic changes to the proof of Lemma 2, is the case if and only if rank(W) = rank([W⊤ D⊤]⊤). From Lemma 1, it follows that DPD⊤ ⪯ M ⟺ P⁻¹ ⪰ D⊤M⁻¹D. So, from Y, one can obtain a bound M as Y ⪰ D⊤M⁻¹D, which by Proposition I.1(i) can also be written as an LMI, and one can add a regularization term to the objective to minimize G(M), where G is any performance criterion that also satisfies Assumption 1. Then, (9) becomes

min_{Y ∈ S^m_+, U, B ∈ S^n_+, M ∈ S^d_++} J(B) + γG(M)
s.t. [B I; I H⊤R⁻¹H − U] ⪰ 0
[U H⊤R⁻¹C; (H⊤R⁻¹C)⊤ Y + C⊤R⁻¹C] ⪰ 0
[M D; D⊤ Y] ⪰ 0
Y ⪯ P⁻¹, ∀P ∈ P,

where γ > 0 is a small regularization weight. For example, in the toy cooperative localization problem presented in Section I-A, it is advantageous to obtain a bound X^k_i on E[χ̃^k_i χ̃^{k⊤}_i] to be used in the fusion instance at time k + 1. Such a bound can be obtained using the procedure outlined in this remark. △

Remarkably, the formulation of the OCI problem as (9) allows us to obtain a necessary and sufficient condition for the feasibility of the original OCI problem (6), as shown in the following result. Indeed, given {W_b}_{b∈{1,2,...,M}}, H, R, and C, it is possible to evaluate a simple rank condition to conclude on the feasibility of the OCI problem (9).

Condition 1. The matrix H⊤R⁻¹H − H⊤R⁻¹C(W⊤W + C⊤R⁻¹C)⁺C⊤R⁻¹H is full rank.

Theorem 2. Under Assumption 1, the OCI problem (6) is feasible if and only if Condition 1 holds.

Proof. See Appendix I-D.

To address shortcoming (iii) described in Section III-B, similarly to the literature on other flavors of the CI problem, we parameterize a family of bounds for all P ∈ P, i.e., we find a parameterization of a family of matrices Y that satisfy Y ⪯ P⁻¹ for all P ∈ P. Recall from Example III.1 that each bound Y_b can be geometrically interpreted as a (possibly degenerate) ellipsoid and that P can be interpreted as the intersection of all ellipsoids characterized by {Y_b}_{b∈{1,2,...,M}}. Therefore, finding a family of bounds for all P ∈ P amounts to finding a family of bounds for the intersection of M ellipsoids, also known as a family of circumscribing ellipsoids. We opt to use the very simple family studied in [34], which is characterized by Y = Σ_{b=1}^{M} ω_b Y_b, where ω ∈ R^M_{≥0} is a vector that parameterizes the family and must sum to one, i.e., ω ∈ Δ_M := {ω ∈ R^M_{≥0} : 1⊤ω = 1}. We say that a bound Y is tight if there is no S ≠ Y with S ⪯ Y that is also a bound. In [34] it is shown that this family characterizes every tight ellipsoid when M = 2 and the intersection of the boundaries of both ellipsoids is nonempty. For M > 2 it is also shown that, when the intersection of the boundaries of all ellipsoids is nonempty (which is rarely the case), each element of the family is a tight bound.
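The convexity reasoning behind this family is elementary: if P⁻¹ dominates every Y_b, it dominates any convex combination Σ_b ω_b Y_b. A minimal numerical sketch (with assumed toy matrices, not the paper's example):

```python
import numpy as np

rng = np.random.default_rng(3)
m, M = 2, 3
# Assumed toy bound matrices Y_b ⪰ 0 (each encodes P⁻¹ ⪰ Y_b)
Ys = [np.diag(rng.uniform(0.2, 1.0, m)) for _ in range(M)]
# A P whose inverse dominates every Y_b individually
P_inv = sum(Ys) + 0.1 * np.eye(m)
# An arbitrary point of the Kahan family: ω on the simplex Δ_M
omega = rng.dirichlet(np.ones(M))
Y_bar = sum(w * Y for w, Y in zip(omega, Ys))
# P⁻¹ − Σ_b ω_b Y_b remains positive semidefinite
print(np.linalg.eigvalsh(P_inv - Y_bar).min() >= 0)
```

Each Y(ω) is therefore a valid, if possibly conservative, bound, which is what makes the restriction to this family safe.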
However, for M > 2 the family does not characterize, in general, all tight bounds; therefore an OCI-optimal bound may not be contained in this family. Henceforth, we call this the Kahan family of bounding ellipsoids. Remarkably, this family is a generalization of the most common family of bounding ellipsoids used for the basic CI problem, e.g., [11]. Restricting the OCI problem to this family of bounding ellipsoids amounts to replacing the information constraints W_b P W_b⊤ ⪯ X_b, with b = 1, 2, ..., M, with the more conservative information constraint P⁻¹ ⪰ Σ_{b=1}^{M} ω_b Y_b, where the choice of the parameters ω only needs to satisfy ω ∈ Δ_M. Incorporating the conservative information constraints into the OCI problem (6) leads to the Kahan-family OCI problem

min_{K ∈ R^{n×o}, B ∈ S^n_+, ω ∈ Δ_M} J(B)
s.t. KH = I
B ⪰ K(R + CPC⊤)K⊤, ∀P ∈ P_KF(ω), (14)

where P_KF(ω) := {P ∈ S^m_++ : P⁻¹ ⪰ Σ_{b=1}^{M} ω_b Y_b}. Notice that considering more conservative bounds on the available information via the Kahan family of bounding ellipsoids amounts to tightening the constraint B ⪰ K(R + CPC⊤)K⊤, ∀P ∈ P, in the OCI problem (6). Notice also that M − 1 degrees of freedom have been introduced via ω in (14) to parameterize the family. Let (K⋆, B⋆, ω⋆) be a solution to (14). We say that (K⋆, B⋆) is the Kahan-family-optimal solution to the OCI problem (6) associated with the optimal Kahan bounding ellipsoid Y⋆ = Σ_{b=1}^{M} ω⋆_b Y_b. A computationally efficient characterization of the Kahan-family-optimal solution to the OCI problem (6) follows as a corollary of Theorems 1 and 2.

Corollary 2. The pair (K⋆, B⋆) is a Kahan-family-optimal solution to the OCI problem (6), where

(U⋆, B⋆, ω⋆) ∈ argmin_{U, B ∈ S^n_+, ω ∈ Δ_M} J(B)
s.t. [B I; I H⊤R⁻¹H − U] ⪰ 0
[U H⊤R⁻¹C; (H⊤R⁻¹C)⊤ Σ_{b=1}^{M} ω_b Y_b + C⊤R⁻¹C] ⪰ 0 (15)

and

K⋆ := (H⊤R⁻¹(R − C(Σ_{b=1}^{M} ω⋆_b Y_b + C⊤R⁻¹C)⁺C⊤)R⁻¹H)⁻¹ H⊤R⁻¹(R − C(Σ_{b=1}^{M} ω⋆_b Y_b + C⊤R⁻¹C)⁺C⊤)R⁻¹.

Furthermore, (15) is feasible if and only if Condition 1 holds or, equivalently, the OCI problem (6) is feasible.

Proof. See Appendix I-E.

Crucially, the feasible set of (15) is convex, since it is characterized by two LMIs and a linear equality constraint. Therefore, if J is convex, then (15) is a convex optimization problem and enjoys a plethora of desirable properties, such as robustness to changes in input parameters and the existence of efficient numerical algorithms with global optimality guarantees [35]. Notice that this formulation addresses shortcomings (ii) and (iii), pointed out in Section III-B, as a result. Moreover, for common choices of J such as the trace or determinant, one can write (15) as an SDP, for which well-performing off-the-shelf solvers with polynomial worst-case complexity exist [36], [37, Section 6].

Remark III.3. If one desires to use the determinant as the metric J, then problem (15) needs to be slightly modified to be cast as an SDP. Indeed, since det(X⁻¹) = det(X)⁻¹ and the logarithm is strictly increasing, the objective of (15) should be chosen as −log det(H⊤R⁻¹H − U) so that it is convex and allows (15) to be cast as an SDP (see [37, Section 6.2.3] for details). △

Example III.2. Solving the Kahan-family-optimal OCI problem (15) with trace minimization for the two-dimensional example in Example III.1 yields K⋆ = [2.1535 −3.9684] and B⋆ = 0.9248. The corresponding optimal Kahan bounding ellipsoid Y⋆ is depicted in Fig. 5. One can compare the fusion performance of the proposed OCI with that of SCI, even though the comparison is unfair since SCI uses less information. Indeed, in Fig. 6, we depict information bounds that are analogous to the ones in Example III.1 and that are compatible with the SCI framework. The optimal SCI solution yields K⋆_SCI = [2.3894 −3.6607], B⋆_SCI = 4.829, and the optimal bounding ellipsoid Y⋆_SCI is depicted in Fig. 6. One concludes that the set of admissible covariance matrices P is significantly larger under the information structure of SCI, which also explains the significantly larger fused covariance B⋆_SCI. △

Remark III.4. A MATLAB implementation of the computation of the Kahan-family-optimal solution of the OCI problem, as well as the code of all numerical examples in this paper, is available in an open-access repository at github.com/decenter2021/OCI. △

IV. CONCLUSION

The distributed fusion problem addressed in this paper stems from the cooperative localization problem in emerging ultra-large-scale engineering systems. In these settings, each agent has access to noisy data from multiple sensors and from communication with other agents, which must be fused. It is infeasible to keep track of the covariance between all measurements, but it is feasible for partial structural knowledge about the joint estimation error covariance matrix to be tracked by the agents in a distributed fusion framework. The following conclusions were drawn in this paper. First, this problem can be expressed as a generalized covariance intersection (CI) problem and was named overlapping covariance intersection (OCI). Second, we establish necessary and sufficient conditions on the available information for the feasibility of the OCI problem, which turn out to be very mild.

Fig. 5. Illustrative trace-minimization OCI solution for the two-dimensional scenario in Example III.1.

Fig. 6. Illustrative trace-minimization SCI solution for the two-dimensional scenario in Example III.1.
Third, we restrict the problem to a parameterized family of bounds for the joint estimation error covariance matrix (which is a generalization of the family of bounds used in the basic CI setting). A solution to the restricted OCI problem is given by a semidefinite program (SDP), which is computationally tractable and suitable for real-time implementation. Future work should study whether it is possible to use a less conservative family of bounds while maintaining attractive computational properties. The analysis in [13] may prove fruitful in that regard. Furthermore, deriving analogous results for the cases R = 0 and R ⪰ 0 would enable the SDP characterization of a family-optimal solution to established CI problems that are a particularization of OCI.

APPENDIX I
PROOFS

The following results on the positive (semi)definiteness of block matrices will be instrumental in establishing the results in this paper.

Proposition I.1. Let X = [A B; B⊤ C] be a symmetric block matrix with A ∈ R^{p×p} and C ∈ R^{q×q}. Then:
(i) If A ≻ 0, then X ⪰ 0 ⟺ C − B⊤A⁻¹B ⪰ 0;
(ii) X ⪰ 0 ⟺ A ⪰ 0, col B ⊆ col A, C − B⊤A⁺B ⪰ 0;
(iii) If B = I_p, then X ⪰ 0 ⟹ A ≻ 0, C ≻ 0, A ⪰ C⁻¹, A⁻¹ ⪯ C;
(iv) If B = I_p, then A ≻ 0, C ≻ 0, and either A ⪰ C⁻¹ or A⁻¹ ⪯ C implies X ⪰ 0;
(v) If X ≻ 0, then X⁻¹ ⪰ [Q 0; 0 0] ⟹ A⁻¹ ⪰ Q, where Q ∈ S^p_+.

Proof. Statement (i) follows from [25, Theorem 1.12(b)]. Statement (ii) follows from a particularization of [25, Theorem 1.20], which holds for any choice of generalized inverse of A, to the Moore–Penrose inverse A⁺. Statements (iii) and (iv) follow from [38, Corollary 7.7.10].
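Statement (i) is the classical Schur-complement test and is easy to exercise numerically. The following sketch (random matrices of assumed toy sizes, not part of the paper's material) constructs X so that the Schur complement is positive semidefinite by design and confirms X ⪰ 0:

```python
import numpy as np

rng = np.random.default_rng(2)
p, q = 3, 4
A = rng.standard_normal((p, p))
A = A @ A.T + p * np.eye(p)           # A ≻ 0
B = rng.standard_normal((p, q))
S = rng.standard_normal((q, q))
S = S @ S.T                           # S ⪰ 0
C = B.T @ np.linalg.solve(A, B) + S   # Schur complement C − B⊤A⁻¹B = S ⪰ 0
X = np.block([[A, B], [B.T, C]])
# Statement (i): with A ≻ 0, C − B⊤A⁻¹B ⪰ 0 implies X ⪰ 0
print(np.linalg.eigvalsh(X).min() >= -1e-9)
```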
To prove statement (v), note that, by [25, Theorem 1.12(a)], since X ≻ 0, then A ≻ 0, C ≻ 0, and C − B⊤A⁻¹B ≻ 0, and by the Banachiewicz inversion formula [25, Equation (0.7.2)], it follows that

X⁻¹ = [A⁻¹ 0; 0 0] + [−A⁻¹B; I](C − B⊤A⁻¹B)⁻¹[−A⁻¹B; I]⊤, (16)

which establishes (v) immediately.

A. Proof of Lemma 1

To prove the result, one has to show that, for any P ∈ S^m_++ and any b ∈ {1, 2, ..., M}, W_b P W_b⊤ ⪯ X_b ⟺ P⁻¹ ⪰ Y_b = W_b⊤X_b⁻¹W_b. Consider any P ∈ S^m_++ and any b ∈ {1, 2, ..., M}, with W_b ∈ R^{o_b×m}. Since P ≻ 0, by Proposition I.1(i), it follows that W_b P W_b⊤ ⪯ X_b is equivalent to

[P⁻¹ W_b⊤; W_b X_b] ⪰ 0 ⟺ [X_b W_b; W_b⊤ P⁻¹] ⪰ 0.

Since X_b ≻ 0, again by Proposition I.1(i), it follows that W_b P W_b⊤ ⪯ X_b ⟺ P⁻¹ ⪰ W_b⊤X_b⁻¹W_b = Y_b.

B. Proof of Lemma 2

The following proposition is required for the proof of the lemma.

Proposition I.2. For all P ∈ P it holds that P⁻¹ ⪰ W⊤ diag(X_1, X_2, ..., X_M)⁻¹ W / M.

Proof. For any P ∈ P, it follows from Lemma 1 that P⁻¹ ⪰ W_b⊤X_b⁻¹W_b for all b ∈ {1, 2, ..., M}. Therefore, P⁻¹ = Σ_{b=1}^{M} P⁻¹/M ⪰ Σ_{b=1}^{M} W_b⊤X_b⁻¹W_b / M = W⊤ diag(X_1, X_2, ..., X_M)⁻¹ W / M.

Notice that the first statement of the lemma is a particular case of the second statement. Indeed, if the second statement holds, then setting C = I yields the first statement immediately. Therefore, in what follows, we present a proof only for the second statement of the lemma, which is more general. One direction of the equivalence states that if there exists Q ∈ S^o_++ such that Q ⪰ R + CPC⊤ for all P ∈ P, then rank(W) = rank([W⊤ C⊤]⊤). We prove the contrapositive of this implication, i.e., if rank(W) ≠ rank([W⊤ C⊤]⊤), then there is no Q ∈ S^o_++ such that Q ⪰ R + CPC⊤ for all P ∈ P.
If rank(W) ≠ rank([W⊤ C⊤]⊤), then rank(W) < rank([W⊤ C⊤]⊤), which means that there exists a nonnull vector v ∈ R^m such that Wv = 0 and Cv ≠ 0. Moreover, from the definition of W, it follows that W_b v = 0 for all b ∈ {1, 2, ..., M}. Take P_1 ∈ P and define P_2 = P_1 + αvv⊤ for some α ≥ 0. Notice that, for all b ∈ {1, 2, ..., M}, W_b P_2 W_b⊤ = W_b P_1 W_b⊤ ⪯ X_b, therefore P_2 ∈ P for any choice of α ≥ 0. Since Cv ≠ 0, there is no Q ∈ S^o_++ such that R + CP_2C⊤ = R + CP_1C⊤ + αCvv⊤C⊤ ⪯ Q holds for every choice of α ≥ 0.

We now prove the other direction of the equivalence, i.e., if rank(W) = rank([W⊤ C⊤]⊤), then there exists Q ∈ S^o_++ such that Q ⪰ R + CPC⊤ for all P ∈ P. By Proposition I.2, it follows that P⁻¹ ⪰ W⊤ diag(X_1, X_2, ..., X_M)⁻¹ W / M for all P ∈ P. Notice that if rank(W) = rank([W⊤ C⊤]⊤), then row C ⊆ row W, hence there exists a matrix S such that C = SW. Since diag(X_1, X_2, ..., X_M)⁻¹ ≻ 0, there exists ϵ > 0 such that diag(X_1, X_2, ..., X_M)⁻¹ ⪰ ϵS⊤S. Therefore, P⁻¹ ⪰ W⊤ diag(X_1, X_2, ..., X_M)⁻¹ W / M ⪰ (ϵ/M) W⊤S⊤SW = (ϵ/M) C⊤C for all P ∈ P. Equivalently, by Lemma 1, CPC⊤ ⪯ (M/ϵ)I for all P ∈ P. As a result, defining Q = R + (M/ϵ)I, it follows that R + CPC⊤ ⪯ Q for all P ∈ P, which concludes the proof.

C. Proof of Theorem 1

The proof of the result relies heavily on the following lemma, which also shows immediately that the OCI problem (6) is feasible if and only if (9) is feasible.

Lemma 3. Assume Assumption 1 holds. If (Y⋆, U⋆, B⋆) is in the feasible domain of (9), then the pair (K⋆, B⋆) defined in (10) is in the feasible domain of (6). If (K◦, B◦) is in the feasible domain of (6), the triple (Y•, U•, B•) defined in (11)–(13) is in the feasible domain of (9) and J(B•) ≤ J(B◦).

Proof. We start by proving the first statement.
First, we show that K⋆ is well defined. Applying Proposition I.1(iii) to (9b) yields H⊤R⁻¹H − U⋆ ≻ 0, and applying Proposition I.1(ii) to (9c) gives U⋆ ⪰ H⊤R⁻¹C(Y⋆ + C⊤R⁻¹C)⁺C⊤R⁻¹H. Thus,

H⊤R⁻¹(R − C(Y⋆ + C⊤R⁻¹C)⁺C⊤)R⁻¹H ⪰ H⊤R⁻¹H − U⋆ ≻ 0, (17)

showing that K⋆ is well defined. Furthermore, from (10), K⋆H = I. Second, we show that K⋆(R + CPC⊤)K⋆⊤ ⪯ B⋆ for all P ∈ P. By hypothesis, Y⋆ ⪯ P⁻¹ for all P ∈ P and, as a result,

Y⋆ + C⊤R⁻¹C ⪯ P⁻¹ + C⊤R⁻¹C. (18)

Matrix Y⋆ + C⊤R⁻¹C is real, symmetric, and positive semidefinite, so by the spectral theorem for real symmetric matrices it admits a factorization

Y⋆ + C⊤R⁻¹C = [V V⊥] [D 0; 0 0] [V V⊥]⊤, (19)

where r = rank(Y⋆ + C⊤R⁻¹C), the columns of [V V⊥] form an orthonormal basis for R^m, and D ∈ S^r_++ is a diagonal matrix. Furthermore, since Y⋆ ⪰ 0 and R⁻¹ ≻ 0, col(C⊤) ⊆ col(V). To see why this holds, take a nonzero vector v ∈ col(C⊤). Notice that v⊤Y⋆v ≥ 0 and, since Cv ≠ 0, v⊤C⊤R⁻¹Cv > 0. Thus, v⊤(Y⋆ + C⊤R⁻¹C)v = v⊤VDV⊤v > 0, which can only hold if v ∈ col(V). One concludes that v ∈ col(C⊤) ⟹ v ∈ col(V), i.e., col(C⊤) ⊆ col(V). Moreover, since col(C⊤) ⊆ col(V), CV⊥ = 0. Using the factorization (19) in (18) yields

([V V⊥]⊤(P⁻¹ + C⊤R⁻¹C)⁻¹[V V⊥])⁻¹ ⪰ [D 0; 0 0]

and, by Proposition I.1(v), it follows that V⊤(P⁻¹ + C⊤R⁻¹C)⁻¹V ⪯ D⁻¹ for all P ∈ P, which can equivalently be written as

[V V⊥][V 0]⊤(P⁻¹ + C⊤R⁻¹C)⁻¹[V 0][V V⊥]⊤ ⪯ [V V⊥] [D⁻¹ 0; 0 0] [V V⊥]⊤ = (Y⋆ + C⊤R⁻¹C)⁺. (20)

Since CV⊥ = 0, C[V V⊥][V 0]⊤ = C[V V⊥][V V⊥]⊤ = C; then, pre- and post-multiplying both sides of (20) by C and C⊤, respectively, yields C(P⁻¹ + C⊤R⁻¹C)⁻¹C⊤ ⪯ C(Y⋆ + C⊤R⁻¹C)⁺C⊤ for all P ∈ P.
This condition is equivalent to

(R + CPC⊤)⁻¹ = R⁻¹(R − C(P⁻¹ + C⊤R⁻¹C)⁻¹C⊤)R⁻¹ ⪰ R⁻¹(R − C(Y⋆ + C⊤R⁻¹C)⁺C⊤)R⁻¹ (21)

for all P ∈ P, where the equality follows from the Sherman–Morrison–Woodbury formula [39, Section 2.1.4] and the inequality follows from (9d). It follows that, for all P ∈ P,

K⋆(R + CPC⊤)K⋆⊤ ⪯ (H⊤R⁻¹(R − C(Y⋆ + C⊤R⁻¹C)⁺C⊤)R⁻¹H)⁻¹ ⪯ (H⊤R⁻¹H − U⋆)⁻¹ ⪯ B⋆,

where the first step follows from algebraic manipulation using (21) and (10), the second step follows from (17), and the third step from Proposition I.1(iii) applied to (9b). Finally, since K⋆H = I, B⋆ ∈ S^n_+, and K⋆(R + CPC⊤)K⋆⊤ ⪯ B⋆ for all P ∈ P, it follows that (K⋆, B⋆) is in the feasible domain of (6), thereby establishing the first statement of the lemma.

Now, we prove the second statement. First, we show that P⁻¹ ⪰ Y• for all P ∈ P. By hypothesis, K◦(R + CPC⊤)K◦⊤ ⪯ B◦ for all P ∈ P, which can equivalently be written as K◦CP(K◦C)⊤ ⪯ B◦ − K◦RK◦⊤ =: B̃◦. By Proposition I.1(i), this is equivalent to

[P⁻¹ (K◦C)⊤; K◦C B̃◦] ⪰ 0 ⟺ [B̃◦ K◦C; (K◦C)⊤ P⁻¹] ⪰ 0.

Since B̃◦ ⪰ 0, by Proposition I.1(ii), it follows that P⁻¹ ⪰ (K◦C)⊤B̃◦⁺K◦C = Y• for all P ∈ P and col K◦C ⊆ col B̃◦. Second, we show that B◦ ⪰ K◦(R + CY•⁺C⊤)K◦⊤. To do so, notice that since B̃◦ ⪰ 0, col K◦C ⊆ col B̃◦, and Y• − (K◦C)⊤B̃◦⁺K◦C = 0 ⪰ 0, by Proposition I.1(ii)

[B̃◦ K◦C; (K◦C)⊤ Y•] ⪰ 0 ⟺ [Y• (K◦C)⊤; K◦C B̃◦] ⪰ 0.

Using Proposition I.1(ii), it follows that B̃◦ ⪰ K◦CY•⁺(K◦C)⊤, which is equivalent to B◦ ⪰ K◦(R + CY•⁺C⊤)K◦⊤. Third, we show that K(R + CY•⁺C⊤)K⊤ ⪰ K(R + CPC⊤)K⊤ for all P ∈ P and all K that satisfy a condition on their kernel.
Matrix Y• is real, symmetric, and positive semidefinite so, by the spectral theorem for real symmetric matrices, it admits a factorization

Y• = [V V⊥] [D 0; 0 0] [V V⊥]⊤, (22)

where r = rank(Y•), the columns of [V V⊥] form an orthonormal basis for R^m, and D ∈ S^r_++ is a diagonal matrix. Furthermore, since B̃◦ ⪰ K◦CP(K◦C)⊤ by hypothesis, B̃◦⁺ can only possibly have null eigenvalues along the components in ker(K◦C)⊤ = (col K◦C)⊥ (otherwise x⊤(B̃◦ − K◦CP(K◦C)⊤)x = −x⊤K◦CP(K◦C)⊤x < 0 for x ∈ ker B̃◦ \ ker(K◦C)⊤, which is not possible). Moreover, since Y• = (K◦C)⊤B̃◦⁺K◦C and the null components of B̃◦⁺ can only possibly be in (col K◦C)⊥, the null components of Y• are along ker K◦C. As a result, K◦CV⊥ = 0. Employing the same procedure as in step two of the proof of the first statement of the lemma for the expression P⁻¹ ⪰ Y• yields

[V V⊥][V 0]⊤P[V 0][V V⊥]⊤ ⪯ Y•⁺ (23)

for all P ∈ P. For all K such that KCV⊥ = 0, KC[V V⊥][V 0]⊤ = KC[V V⊥][V V⊥]⊤ = KC. Therefore, pre- and post-multiplying both sides of (23) by KC and (KC)⊤, respectively, yields KCP(KC)⊤ ⪯ KCY•⁺(KC)⊤. This condition is equivalent to

K(R + CPC⊤)K⊤ ⪯ K(R + CY•⁺C⊤)K⊤ (24)

for all K such that KCV⊥ = 0 and all P ∈ P. Fourth, we show that B• ⪰ (H⊤R⁻¹(R − C(Y• + C⊤R⁻¹C)⁺C⊤)R⁻¹H)⁻¹. Consider the optimization problem

min_{K ∈ R^{n×o}, B ∈ S^n_+} J(B)
s.t. KH = I
KCV⊥ = 0
B ⪰ K(R + CY•⁺C⊤)K⊤. (25)

Notice that the feasible set of (25) is contained in the feasible set of the original OCI optimization problem (6). Indeed, if the pair (K, B) satisfies the constraints of (25), then KH = I and, since KCV⊥ = 0, by (24) it follows that K(R + CPC⊤)K⊤ ⪯ K(R + CY•⁺C⊤)K⊤ ⪯ B.
Define S⊥ as a matrix whose columns form an orthonormal basis for col CV⊥ and S such that the columns of [S S⊥] form an orthonormal basis for R^o. Any gain that satisfies KCV⊥ = 0 can be written as K = K[S S⊥][S S⊥]⊤ = [KS 0][S S⊥]⊤ = KSS⊤. One can then equivalently rewrite (25) for K̃ = KS as

min_{K̃, B ∈ S^n_+} J(B)
s.t. K̃S⊤H = I
B ⪰ K̃S⊤(R + CY•⁺C⊤)SK̃⊤ (26)

and then recover K with the relation K = K̃S⊤. Given Assumption 1, it is well known [13] that, for any Q ∈ S^o_++, (H⊤Q⁻¹H)⁻¹H⊤Q⁻¹ = argmin_K J(KQK⊤) s.t. KH = I. As a result, the pair (K•, B•) with

K• = (H⊤S(S⊤(R + CY•⁺C⊤)S)⁻¹S⊤H)⁻¹ H⊤S(S⊤(R + CY•⁺C⊤)S)⁻¹S⊤
B• = K•(R + CY•⁺C⊤)K•⊤ = (H⊤S(S⊤(R + CY•⁺C⊤)S)⁻¹S⊤H)⁻¹

is a solution to (25). Furthermore, (K◦, B◦) is in the feasible set of the stricter problem (25), since K◦H = I by hypothesis and, as shown before, K◦CV⊥ = 0 and B◦ ⪰ K◦(R + CY•⁺C⊤)K◦⊤. Therefore, J(B•) ≤ J(B◦). By Proposition I.3, at the end of this section, B• can be rewritten as B• = (H⊤R⁻¹H − U•)⁻¹, with U• = H⊤R⁻¹C(Y• + C⊤R⁻¹C)⁺C⊤R⁻¹H. Now notice that the triple (Y•, U•, B•) is in the feasible domain of (9). Specifically, from the previous analysis, Y• ⪯ P⁻¹ for all P ∈ P, so constraint (9d) is satisfied; Y• + C⊤R⁻¹C ⪰ 0, col C⊤R⁻¹H ⊆ col(Y• + C⊤R⁻¹C), and U• − H⊤R⁻¹C(Y• + C⊤R⁻¹C)⁺C⊤R⁻¹H = 0 ⪰ 0, so, by Proposition I.1(ii), constraint (9c) is satisfied; B• = (H⊤R⁻¹H − U•)⁻¹ ≻ 0, so, by Proposition I.1(iv), constraint (9b) is satisfied, thereby establishing the second statement.

First, we prove the first statement of the theorem. Let (Y⋆, U⋆, B⋆) be a solution to (9). Therefore, (Y⋆, U⋆, B⋆) is in the feasible domain of (9) and, by Lemma 3, (K⋆, B⋆) is in the feasible domain of (6).
We show that (K⋆, B⋆) is a solution to (6) by contradiction. Assume, by contradiction, that there is (K◦, B◦) in the feasible domain of (6) such that J(B◦) < J(B⋆). By Lemma 3, it follows that the triple (Y•, U•, B•) is in the feasible domain of (9) and J(B•) ≤ J(B◦). Since (Y⋆, U⋆, B⋆) is a solution to (9), J(B⋆) ≤ J(B•) ≤ J(B◦). Bringing everything together yields J(B◦) < J(B⋆) ≤ J(B•) ≤ J(B◦), which is a contradiction.

Second, an analogous approach is used to prove the second statement of the theorem. Let (K◦, B◦) be a solution to (6). Therefore, (K◦, B◦) is in the feasible domain of (6) and, by Lemma 3, the triple (Y•, U•, B•) is in the feasible domain of (9) and J(B•) ≤ J(B◦). We show that (Y•, U•, B•) is a solution to (9) by contradiction. Assume, by contradiction, that there is (Y⋆, U⋆, B⋆) in the feasible domain of (9) such that J(B⋆) < J(B•). By Lemma 3, it follows that (K⋆, B⋆) is in the feasible domain of (6). Since (K◦, B◦) is a solution to (6), J(B◦) ≤ J(B⋆). Bringing everything together yields J(B⋆) < J(B•) ≤ J(B◦) ≤ J(B⋆), which is a contradiction.

Finally, the last statement of the theorem, i.e., that the OCI problem (6) is feasible if and only if (9) is feasible, follows immediately from Lemma 3.

Proposition I.3. Let R ∈ S^o_++, C ∈ R^{o×m}, and Y ∈ S^m_+. By the spectral theorem for real symmetric matrices, Y admits a factorization

Y = [V V⊥] [D 0; 0 0] [V V⊥]⊤, (27)

where r = rank(Y). Define S⊥ as a matrix whose columns form an orthonormal basis for col CV⊥ and S such that the columns of [S S⊥] form an orthonormal basis for R^o. Then,

R⁻¹ − R⁻¹C(Y + C⊤R⁻¹C)⁺C⊤R⁻¹ = S(S⊤(R + CY⁺C⊤)S)⁻¹S⊤.

Proof.
Since R ≻ 0, a vector v ∈ R^m is an eigenvector of Y + C⊤R⁻¹C with null eigenvalue if and only if v ∈ col V⊥ and v ∈ ker C. Furthermore, this is equivalent to v = V⊥x for some x ∈ ker CV⊥. One concludes that the eigenspace of Y + C⊤R⁻¹C associated with the null eigenvalue is E_0 = {v ∈ R^m : v = V⊥x for some x ∈ ker CV⊥}. Therefore, by the spectral theorem for real symmetric matrices, Y + C⊤R⁻¹C admits a factorization

Y + C⊤R⁻¹C = [Ṽ Ṽ⊥] [D̃ 0; 0 0] [Ṽ Ṽ⊥]⊤, (28)

where r̃ = rank(Y + C⊤R⁻¹C), Ṽ⊥ ∈ R^{m×(m−r̃)} is such that col Ṽ⊥ = E_0, Ṽ ∈ R^{m×r̃} is such that the columns of [Ṽ Ṽ⊥] form an orthonormal basis for R^m, and D̃ ∈ R^{r̃×r̃} is a diagonal matrix. Furthermore, by the definition of Ṽ⊥, it follows that CṼ⊥ = 0. Then, one can write, for all ϵ > 0,

C(Y + C⊤R⁻¹C)⁺C⊤ = [CṼ 0] [D̃⁻¹ 0; 0 (1/ϵ)I] [CṼ 0]⊤ = C[Ṽ Ṽ⊥] [D̃⁻¹ 0; 0 (1/ϵ)I] [Ṽ Ṽ⊥]⊤C⊤ = C(Y + C⊤R⁻¹C + ϵṼ⊥Ṽ⊥⊤)⁻¹C⊤.

Therefore, since C(Y + C⊤R⁻¹C)⁺C⊤ = C(Y + C⊤R⁻¹C + ϵṼ⊥Ṽ⊥⊤)⁻¹C⊤ holds for any ϵ > 0 and col Ṽ⊥ = E_0 ⊆ col V⊥, one can conclude that lim_{ϵ→0} C(Y + C⊤R⁻¹C + ϵV⊥V⊥⊤)⁻¹C⊤ exists and is equal to C(Y + C⊤R⁻¹C)⁺C⊤. From the Sherman–Morrison–Woodbury formula [39, Section 2.1.4],

R⁻¹ − R⁻¹C(Y + C⊤R⁻¹C + ϵV⊥V⊥⊤)⁻¹C⊤R⁻¹ = (R + C(Y + ϵV⊥V⊥⊤)⁻¹C⊤)⁻¹.

Therefore,

lim_{ϵ→0} (R + C(Y + ϵV⊥V⊥⊤)⁻¹C⊤)⁻¹ = R⁻¹ − R⁻¹C(Y + C⊤R⁻¹C)⁺C⊤R⁻¹.
(29)

Since Y admits the factorization (27) and, by the definition of S⊥, S⊤CV⊥ = 0, it follows that

(R + C(Y + ϵV⊥V⊥⊤)⁻¹C⊤)⁻¹ = [S S⊥] ([S S⊥]⊤(R + C(Y + ϵV⊥V⊥⊤)⁻¹C⊤)[S S⊥])⁻¹ [S S⊥]⊤ = [S S⊥] [M_11 M_12; M_12⊤ M_22]⁻¹ [S S⊥]⊤,

where

M_11 = S⊤(R + CVD⁻¹(CV)⊤)S
M_12 = S⊤(R + CVD⁻¹(CV)⊤)S⊥
M_22 = S⊥⊤(R + CVD⁻¹(CV)⊤ + (1/ϵ)CV⊥(CV⊥)⊤)S⊥.

As a result, the limit as ϵ → 0 of (R + C(Y + ϵV⊥V⊥⊤)⁻¹C⊤)⁻¹ exists and is equal to

lim_{ϵ→0} (R + C(Y + ϵV⊥V⊥⊤)⁻¹C⊤)⁻¹ = [S S⊥] [M_11⁻¹ 0; 0 0] [S S⊥]⊤ = S(S⊤(R + CY⁺C⊤)S)⁻¹S⊤. (30)

Comparing (29) and (30) concludes the proof.

D. Proof of Theorem 2

A necessary condition for each of the two statements between which an equivalence is established in this result is that H be full column rank. Therefore, for the remainder of the proof, H is assumed to be full column rank. The proof of the theorem relies on the following proposition.

Proposition I.4. For a given Y, the LMI defined by (9b)-(9c) is feasible if and only if H⊤R⁻¹H − H⊤R⁻¹C(Y + C⊤R⁻¹C)⁺C⊤R⁻¹H is full rank. Furthermore, if for a given Y the expression above is full rank, then it is also full rank if one replaces Y by any Y′ that satisfies col Y ⊆ col Y′.

Proof. We start with the first statement. In one direction, if the LMI defined by (9b)-(9c) is feasible, then there exist U and B such that, by Proposition I.1(iii), H⊤R⁻¹H − U ≻ 0 and, by Proposition I.1(ii), U ⪰ H⊤R⁻¹C(Y + C⊤R⁻¹C)⁺C⊤R⁻¹H; thus H⊤R⁻¹H − H⊤R⁻¹C(Y + C⊤R⁻¹C)⁺C⊤R⁻¹H ≻ 0. In the other direction, if H⊤R⁻¹H − H⊤R⁻¹C(Y + C⊤R⁻¹C)⁺C⊤R⁻¹H ≻ 0, one can choose U = H⊤R⁻¹C(Y + C⊤R⁻¹C)⁺C⊤R⁻¹H, so H⊤R⁻¹H − U ≻ 0, and, choosing B = (H⊤R⁻¹H − U)⁻¹, (9b) is satisfied by Proposition I.1(iv).
Moreover, since Y + C⊤R⁻¹C ⪰ 0, col C⊤R⁻¹H ⊆ col(Y + C⊤R⁻¹C), and U − H⊤R⁻¹C(Y + C⊤R⁻¹C)⁺C⊤R⁻¹H = 0 ⪰ 0, by Proposition I.1(ii) it follows that (9c) is satisfied. To prove the second statement, recall from Proposition I.3, which is in the proof of Theorem 1 in Appendix I-C, that

H⊤R⁻¹H − H⊤R⁻¹C(Y + C⊤R⁻¹C)⁺C⊤R⁻¹H = H⊤S(S⊤(R + CY⁺C⊤)S)⁻¹S⊤H, (31)

where S is defined such that its columns form an orthonormal basis for (col(CV⊥))⊥ and V⊥ is defined such that its columns form an orthonormal basis for (col Y)⊥. Therefore, given a Y for which (31) is full rank, any Y′ such that col Y ⊆ col Y′ implies col S ⊆ col S′, so replacing Y with Y′ in (31) does not change its rank.

An equivalence between the feasibility of the OCI problem (6) and (9) was already established in Lemma 3. Therefore, to establish the theorem, one only needs to show the equivalence between Condition 1, i.e., H⊤R⁻¹H − H⊤R⁻¹C(W⊤W + C⊤R⁻¹C)⁺C⊤R⁻¹H being full rank, and the feasibility of (9) using Proposition I.4. Recall, from Proposition I.2 in the proof of Lemma 2, that P⁻¹ ⪰ W⊤ diag(X_1, X_2, ..., X_M)⁻¹ W / M for all P ∈ P. On the one hand, if Condition 1 holds, then Y = W⊤ diag(X_1, X_2, ..., X_M)⁻¹ W / M has the same column space as W⊤. Therefore, by Proposition I.4, H⊤R⁻¹H − H⊤R⁻¹C(Y + C⊤R⁻¹C)⁺C⊤R⁻¹H is full rank. Additionally, Y = W⊤ diag(X_1, X_2, ..., X_M)⁻¹ W / M ⪯ P⁻¹ for all P ∈ P so, by Proposition I.4, (9) is feasible. On the other hand, if (9) is feasible, then there exists Y such that Y ⪯ P⁻¹ for all P ∈ P and, by Proposition I.4, H⊤R⁻¹H − H⊤R⁻¹C(Y + C⊤R⁻¹C)⁺C⊤R⁻¹H is full rank.
Since the bounded components of $\mathcal{P}$ are characterized by the row space of $W$, it follows that $\operatorname{col} Y \subseteq \operatorname{col}\left(W^\top W\right)$; therefore, by Proposition I.4, $H^\top R^{-1} H - H^\top R^{-1} C \left(W^\top W + C^\top R^{-1} C\right)^{+} C^\top R^{-1} H$ is full rank, i.e., Condition 1 holds.

E. Proof of Corollary 2

Notice that it is possible to decouple the optimization of $\omega$ from the remaining decision variables in (14). Specifically, if one fixes the parameter $\omega$, one can approach (14) resorting to Theorem 1. Then, choosing a parameter $\omega$ that yields the minimum objective at the optimizer decouples the problem. Therefore, the first statement of the corollary follows immediately from Theorem 1.

To prove the second statement, on the one hand, notice that if a triple $(U, B, \omega)$ is feasible for (15), then the triple $(U, B, Y)$ with $Y = \sum_{b=1}^{M} \omega_b Y_b$ is immediately feasible for (9), and, by Theorems 1 and 2, it follows that Condition 1 holds. On the other hand, let Condition 1 hold, i.e., $H^\top R^{-1} H - H^\top R^{-1} C \left(W^\top W + C^\top R^{-1} C\right)^{+} C^\top R^{-1} H$ is full rank. Then, for $\omega_b = 1/M$ for all $b = 1, 2, \ldots, M$, $Y = \sum_{b=1}^{M} \omega_b Y_b$ has the same column space as $W^\top W$. Therefore, by Proposition I.4, $H^\top R^{-1} H - H^\top R^{-1} C \left(Y + C^\top R^{-1} C\right)^{+} C^\top R^{-1} H$ is full rank and there is a pair $(U, B)$ that satisfies the LMIs (9b)-(9c) for this choice of $Y$. It follows immediately that $(U, B, \omega)$ is feasible for (15).
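The column-space step above rests on an elementary fact: for positive semidefinite summands and strictly positive weights, $\operatorname{col}\left(\sum_{b} \omega_b Y_b\right) = \operatorname{col} Y_1 + \cdots + \operatorname{col} Y_M$, so the uniform choice $\omega_b = 1/M$ loses no column space. A minimal numpy illustration with hypothetical rank-deficient summands (stand-ins, not the $Y_b$ of (15)):

```python
import numpy as np

rng = np.random.default_rng(2)
m, M = 6, 4                              # illustrative dimensions

# Hypothetical PSD summands Y_b = G_b G_b^T of rank 2 each
Gs = [rng.standard_normal((m, 2)) for _ in range(M)]
Ys = [G @ G.T for G in Gs]

omega = np.full(M, 1.0 / M)              # uniform weights, as in the proof
Ymix = sum(w * Yb for w, Yb in zip(omega, Ys))

# col(sum_b omega_b Y_b) equals the sum of the individual column spaces
rank_mix = np.linalg.matrix_rank(Ymix)
rank_joint = np.linalg.matrix_rank(np.hstack(Gs))
print(rank_mix, rank_joint)
```

Here four generic rank-2 summands jointly span all of $\mathbb{R}^6$, so the mixture is full rank even though each individual $Y_b$ is singular.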
Leonardo Pedroso (Graduate Student Member, IEEE) received the M.Sc. degree in aerospace engineering from Instituto Superior Técnico (IST), University of Lisbon (ULisboa), Portugal, in 2022. From 2019 to 2022, he held a research scholarship with the Institute for Systems and Robotics (ISR), IST, ULisboa. Since 2023, he has been working toward the Ph.D. degree in mechanical engineering with the Control Systems Technology section, Eindhoven University of Technology, The Netherlands. Since 2024, he has been working toward a second Ph.D. degree in aerospace engineering with the ISR, IST, ULisboa, Portugal. He was the recipient of the 2024 Best M.Sc. Thesis Award by the Portuguese Automatic Control Association. His current research interests include mean-field games and distributed control and estimation of ultra large-scale systems.

Pedro Batista (Senior Member, IEEE) received the Licenciatura and Ph.D. degrees in Electrical and Computer Engineering from Instituto Superior Técnico (IST), Lisbon, Portugal, in 2005 and 2010, respectively. From 2004 to 2006, he was a Monitor with the Department of Mathematics, IST. Since 2012, he has been with the Department of Electrical and Computer Engineering, IST, where he is currently Associate Professor. His research interests include navigation and control of single and multiple autonomous vehicles. Dr. Batista was the recipient of the Diploma de Mérito twice during his graduation, and his Ph.D. dissertation was awarded the Best Robotics Ph.D. Thesis Award by the Portuguese Society of Robotics. He was also awarded a ULisboa/CGD Scientific Award in 2022 by the Universidade de Lisboa.
W.P.M.H. Heemels (Fellow, IEEE) received M.Sc. (mathematics) and Ph.D. (EE, control theory) degrees (summa cum laude) from the Eindhoven University of Technology (TU/e) in 1995 and 1999, respectively. From 2000 to 2004, he was with the Electrical Engineering Department, TU/e, as an assistant professor, and from 2004 to 2006 with the Embedded Systems Institute (ESI) as a Research Fellow. Since 2006, he has been with the Department of Mechanical Engineering, TU/e, where he is currently a Full Professor and Vice-Dean. He held visiting professor positions at ETH, Switzerland (2001), UCSB, USA (2008), and the University of Lorraine, France (2020). He is a Fellow of the IEEE and IFAC, and was the chair of the IFAC Technical Committee on Networked Systems (2017-2023). He has served on the editorial boards of Automatica, Nonlinear Analysis: Hybrid Systems (NAHS), Annual Reviews in Control, and IEEE Transactions on Automatic Control, and is the Editor-in-Chief of NAHS as of 2023. He was a recipient of a personal VICI grant awarded by NWO (Dutch Research Council) and recently obtained an ERC Advanced Grant. He was the recipient of the 2019 IEEE L-CSS Outstanding Paper Award and the Automatica Paper Prize 2020-2022. He was elected to the IEEE-CSS Board of Governors (2021-2023). His current research includes hybrid and cyber-physical systems, networked and event-triggered control systems, and model predictive control and their applications.
