Distributed and Consistent Multi-Robot Visual-Inertial-Ranging Odometry on Lie Groups


Authors: Ziwei Kang, Yizhi Zhou

Abstract — Reliable localization is a fundamental requirement for multi-robot systems operating in GPS-denied environments. Visual-inertial odometry (VIO) provides lightweight and accurate motion estimation but suffers from cumulative drift in the absence of global references. Ultra-wideband (UWB) ranging offers complementary global observations, yet most existing UWB-aided VIO methods are designed for single-robot scenarios and rely on pre-calibrated anchors, which limits their robustness in practice. This paper proposes a distributed collaborative visual-inertial-ranging odometry (DC-VIRO) framework that tightly fuses VIO and UWB measurements across multiple robots. Anchor positions are explicitly included in the system state to address calibration uncertainty, while shared anchor observations are exploited through inter-robot communication to provide additional geometric constraints. By leveraging a right-invariant error formulation on Lie groups, the proposed approach preserves the observability properties of standard VIO, ensuring estimator consistency. Simulation results with multiple robots demonstrate that DC-VIRO significantly improves localization accuracy and robustness, while simultaneously enabling anchor self-calibration in distributed settings.

I. INTRODUCTION

Accurate positioning and mutual perception are essential for multi-robot systems, especially in GPS-denied indoor environments. Unlike single robots that rely solely on local sensors, multi-robot systems can improve accuracy by sharing relative measurements and joint observations of the environment [1], [2], [3]. Many approaches build on visual-inertial odometry (VIO) [4]–[6], which fuses camera and IMU data and can be extended with inter-robot information to further enhance accuracy.
Although lightweight and precise, VIO suffers from cumulative drift due to the lack of global references, and this drift can propagate across robots through relative pose estimation. To address this issue, ultra-wideband (UWB) ranging provides a promising solution by introducing anchor-to-robot distance measurements as global observations [7], [8].

For UWB-aided VIO systems, the key to achieving accurate localization lies in how UWB information is fused with visual-inertial measurements [9], [10]. A well-designed UWB-aided VIO system should guarantee not only bounded errors but also consistency when fusing ranging measurements [11]. Inconsistent estimators tend to underestimate system uncertainty, which in turn degrades overall localization accuracy [12], [13]. Early methods were typically loosely coupled, assuming that anchor positions had been precisely calibrated offline [10], [14]. However, such approaches often require manual calibration and neglect the uncertainty in anchor placement, which can lead to inconsistency and reduced robustness in practical deployments [8]. Subsequent studies have explored tightly coupled frameworks to better exploit the complementary information of UWB and VIO [15], [16]. Nevertheless, most of these methods were developed for single-robot scenarios and cannot be directly extended to multi-robot systems, particularly in distributed settings. This paper therefore focuses on the problem of consistently fusing UWB measurements within a tightly coupled framework for distributed multi-robot localization.

(Z. Kang is affiliated with the School of Control and Computer Engineering, North China Electric Power University, Beijing 102206, China. Y. Zhou is affiliated with the Department of Electrical and Computer Engineering, George Mason University, Fairfax, VA 22030, USA (e-mail: yzhou26@gmu.edu). He is the corresponding author.)
Unlike single-robot UWB-aided VIO, the multi-robot case introduces additional challenges and opportunities. When multiple robots simultaneously observe the same UWB anchor, the ranging measurements naturally form shared observations across robots [17]. Such shared anchors not only provide extra geometric constraints for improving localization accuracy but also enable the joint calibration of anchor positions, which are often uncertain in practice [18]. To address these challenges, this paper develops a fully distributed collaborative visual-inertial-ranging odometry (DC-VIRO) framework that fuses VIO and UWB information in a tightly coupled manner. Anchor positions are explicitly included in the system state to account for calibration uncertainty, while the right-invariant error formulation on Lie groups guarantees that the proposed system preserves the same four unobservable directions as the original VIO, ensuring observability consistency. The main contributions of this paper are:

• We propose a fully distributed DC-VIRO system that leverages inter-robot communication and shared UWB anchors to construct additional geometric constraints, thereby improving both collaborative localization accuracy and anchor calibration.
• We define the system state on a Lie group and prove, using the right-invariant error properties, that DC-VIRO retains the same four unobservable directions as standard VIO, ensuring estimator consistency.
• We validate the feasibility and effectiveness of the proposed system through extensive simulation experiments.

Notations: Let $\mathbb{R}$ denote the set of real numbers. We use $I_n$ and $0_n$ to denote the $n \times n$ identity and zero matrices, respectively. For any matrix $A \in \mathbb{R}^{m \times n}$, its transpose is denoted by $A^\top \in \mathbb{R}^{n \times m}$. The notation $\mathrm{diag}(A_1, A_2, \ldots, A_k)$ represents a block-diagonal matrix with $A_1, A_2, \ldots, A_k$ along the diagonal and zeros elsewhere.

Fig. 1: Cooperative visual-range-inertial navigation setup.

II. SYSTEM DESIGN

In this section, we introduce the DC-VIRO system. It is a Multi-State Constraint Kalman Filter (MSCKF) framework formulated on matrix Lie groups for multi-robot cooperative localization. Specifically, we consider a system of $N$ robots as shown in Fig. 1. Each robot is equipped with an IMU, a UWB tag, and a camera. We first define the IMU and camera frames at time $t_k$ as $I_{i,k}$ and $C_{i,k}$, respectively, where $i$ denotes the robot index, and $G$ denotes the global frame. Moreover, robots can communicate and share inter-robot information with their neighbors to improve overall accuracy. As illustrated in Fig. 1, two types of inter-robot information are considered in this work: UWB anchor observations and visual feature observations. For UWB anchors, each robot obtains ranging measurements between its UWB tag and surrounding anchors. When these anchors are observed by only a single robot, they are treated as individual anchor observations; when multiple robots observe the same anchor simultaneously, the ranging information becomes a common anchor observation that can be exploited for inter-robot alignment and joint estimation. Similarly, for visual features, a robot may detect and track features visible only to itself, which are regarded as individual feature observations, while features observed by multiple robots are treated as common feature observations and serve as shared constraints that improve the consistency and accuracy of cooperative localization. The primary objective of the system is to simultaneously estimate the states of the robots (IMUs) and the positions of the UWB anchors using each robot's own sensor measurements and the local information obtained through communication with neighboring robots.
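To make the common/individual distinction concrete, the observation bookkeeping described above can be sketched as follows. This is an illustrative Python sketch, not from the paper; the `observations` mapping (robot id to the set of anchor ids it currently ranges to) is a hypothetical input format:

```python
from collections import Counter

def classify_anchors(observations):
    """Split anchor ids into common (seen by more than one robot at this
    time step) and individual (seen by exactly one robot).

    observations: dict mapping robot id -> set of anchor ids it ranges to.
    """
    counts = Counter(a for seen in observations.values() for a in seen)
    common = {a for a, c in counts.items() if c > 1}
    individual = {a for a, c in counts.items() if c == 1}
    return common, individual

# robots 1 and 2 both range to anchor "A"; anchor "B" is seen only by robot 2
common, individual = classify_anchors({1: {"A"}, 2: {"A", "B"}})
```

The same bookkeeping applies verbatim to visual feature ids when deciding between common and individual feature observations.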
It is worth noting that, unlike the standard MSCKF framework formulated in vector space, the proposed DC-VIRO system operates directly on matrix Lie group representations. By leveraging the inherent manifold structure, the system enhances the consistency and accuracy of the estimator.

A. System State

Each robot $i$ at time $t_k$ has a state that includes the current IMU state, the positions of the UWB anchors observed by itself, and a sliding window of $m$ historical IMU clones:

$$X_{i,k} = \big(X_{I_{i,k}}, X_{C_i}\big), \quad X_{C_i} = \big(C_{i,k-1}, \cdots, C_{i,k-m}\big) \tag{1}$$

Here, the IMU state and the UWB anchor state are jointly represented as $X_{I_{i,k}} = (T_{i,k}, B_{i,k}) \in SE_{2+L}(3) \times \mathbb{R}^6$, where $SE_{2+L}(3)$ denotes the augmented special Euclidean group in three dimensions and $L$ is the number of UWB anchors observed by the robot. For simplicity, we assume that each robot's observation contains only one UWB anchor, i.e., $L = 1$ in the state $X_{I_{i,k}}$; this generalizes directly to the multi-anchor case through the definition of the $SE_{2+L}(3)$ group. Specifically, the state representation $T_{i,k} \in SE_{2+L}(3)$ is defined as

$$T_{i,k} = \begin{bmatrix} {}^{G}_{I_{i,k}}R & {}^{G}v_{I_{i,k}} & {}^{G}p_{I_{i,k}} & {}^{G}p_u \\ 0_3 & & I_3 & \end{bmatrix} \in SE_3(3), \tag{2}$$

which contains the IMU's rotation ${}^{G}_{I_{i,k}}R \in SO(3)$ from the IMU frame $I_{i,k}$ to the global frame $G$ at time $t_k$, and the IMU's velocity ${}^{G}v_{I_{i,k}}$ and position ${}^{G}p_{I_{i,k}}$ in the global frame. It is important to highlight that, in addition to the IMU states, $T_{i,k}$ also contains the UWB anchor position ${}^{G}p_u$, since it is an unknown parameter that must be estimated. $B_{i,k} = \big[b_{\omega_{i,k}}^\top \; b_{a_{i,k}}^\top\big]^\top \in \mathbb{R}^6$ consists of the IMU's gyroscope bias $b_{\omega_{i,k}}$ and accelerometer bias $b_{a_{i,k}}$.
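To make the matrix structure of (2) concrete, the following is a minimal numpy sketch (ours, not from the paper) that builds the $6 \times 6$ element of $SE_{2+L}(3)$ for $L = 1$ and its closed-form inverse:

```python
import numpy as np

def se23_matrix(R, v, p, p_u):
    """Build the 6x6 SE_{2+1}(3) element of Eq. (2): the rotation R in the
    top-left block, then velocity, position, and anchor position as extra
    columns, with an identity block below."""
    T = np.eye(6)
    T[:3, :3] = R
    T[:3, 3] = v
    T[:3, 4] = p
    T[:3, 5] = p_u
    return T

def se23_inverse(T):
    """Closed-form inverse: [[R, V], [0, I]]^(-1) = [[R^T, -R^T V], [0, I]]."""
    R, V = T[:3, :3], T[:3, 3:]
    Tinv = np.eye(6)
    Tinv[:3, :3] = R.T
    Tinv[:3, 3:] = -R.T @ V
    return Tinv

# sanity check with a rotation about the z axis
th = 0.3
R = np.array([[np.cos(th), -np.sin(th), 0.0],
              [np.sin(th),  np.cos(th), 0.0],
              [0.0, 0.0, 1.0]])
T = se23_matrix(R, np.array([1.0, 0.0, 0.0]), np.array([0.0, 2.0, 0.0]),
                np.array([5.0, 5.0, 1.0]))
assert np.allclose(se23_inverse(T), np.linalg.inv(T))
```

The closed-form inverse is what makes the right-invariant error $\hat{T}_{i,k} T_{i,k}^{-1}$ used below cheap to evaluate.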
For $l \in \{k-1, \cdots, k-m\}$, $C_l \in SE(3)$ denotes the historical clone of the IMU's pose at time $t_l$, given by

$$C_l = \begin{bmatrix} {}^{G}_{I_l}R & {}^{G}p_{I_l} \\ 0_{1\times3} & 1 \end{bmatrix} \tag{3}$$

By defining $\hat{X}_{i,k} = \big(\hat{X}_{I_{i,k}}, \hat{X}_{C_i}\big)$ as the state estimate of $X_{i,k}$, we formulate the error state $\tilde{X}_{i,k}$ as

$$\tilde{X}_{i,k} = \big(\hat{T}_{i,k} T_{i,k}^{-1},\; \hat{B}_{i,k} - B_{i,k},\; \hat{C}_{i,l} C_{i,l}^{-1}\big), \quad \hat{C}_{i,l} C_{i,l}^{-1} = \big(\hat{C}_{i,k-1} C_{i,k-1}^{-1}, \cdots, \hat{C}_{i,k-m} C_{i,k-m}^{-1}\big) \tag{4}$$

where $\hat{T}_{i,k} T_{i,k}^{-1} \in SE_3(3)$ denotes the right-invariant error, given by

$$\hat{T}_{i,k} T_{i,k}^{-1} = \begin{bmatrix} {}^{G}_{I_{i,k}}\tilde{R} & \Gamma_{i,1} & \Gamma_{i,2} & \Gamma_{i,3} \\ 0_3 & & I_3 & \end{bmatrix}$$

with

$$ {}^{G}_{I_{i,k}}\tilde{R} = {}^{G}_{I_{i,k}}\hat{R}\,\big({}^{G}_{I_{i,k}}R\big)^\top,\quad \Gamma_{i,1} = {}^{G}\hat{v}_{I_{i,k}} - {}^{G}_{I_{i,k}}\tilde{R}\,{}^{G}v_{I_{i,k}},\quad \Gamma_{i,2} = {}^{G}\hat{p}_{I_{i,k}} - {}^{G}_{I_{i,k}}\tilde{R}\,{}^{G}p_{I_{i,k}},\quad \Gamma_{i,3} = {}^{G}\hat{p}_u - {}^{G}_{I_{i,k}}\tilde{R}\,{}^{G}p_u \tag{5}$$

$\tilde{B}_{i,k} \in \mathbb{R}^6$ is the IMU bias error, given as

$$\tilde{B}_{i,k} = \big[(\hat{b}_{\omega_{i,k}} - b_{\omega_{i,k}})^\top \;\; (\hat{b}_{a_{i,k}} - b_{a_{i,k}})^\top\big]^\top \tag{6}$$

and $\hat{C}_{i,l} C_{i,l}^{-1}$, for $l \in \{k-1, \cdots, k-m\}$, is defined in $SE(3)$ as

$$\hat{C}_{i,l} C_{i,l}^{-1} = \begin{bmatrix} {}^{G}_{I_{i,l}}\hat{R}\,({}^{G}_{I_{i,l}}R)^\top & {}^{G}\hat{p}_{I_{i,l}} - {}^{G}_{I_{i,l}}\hat{R}\,({}^{G}_{I_{i,l}}R)^\top\, {}^{G}p_{I_{i,l}} \\ 0_{1\times3} & 1 \end{bmatrix} \tag{7}$$

By applying the log-linear property of the invariant error [19], the errors $\hat{T}_{i,k} T_{i,k}^{-1}$ and $\hat{C}_{i,l} C_{i,l}^{-1}$ can be approximated to first order as

$$\hat{T}_{i,k} T_{i,k}^{-1} = \exp_G(\xi_{I_{i,k}}) \approx I_6 + \xi_{I_{i,k}}^{\wedge} \in \mathbb{R}^{6\times6}, \quad \hat{C}_{i,l} C_{i,l}^{-1} = \exp_G(\xi_{c_{i,l}}) \approx I_4 + \xi_{c_{i,l}}^{\wedge} \in \mathbb{R}^{4\times4} \tag{8}$$

where $(\cdot)^{\wedge} : \mathbb{R}^{\dim\mathfrak{g}} \to \mathfrak{g}$ is the linear map that transforms the error vectors $\xi_{I_{i,k}}$ and $\xi_{c_{i,l}}$, defined in the Lie algebra, to their corresponding matrix representations [19]:

$$\xi_{I_{i,k}} \triangleq \big[(\xi_{\theta_{i,k}})^\top \;\; (\xi_{v_{i,k}})^\top \;\; (\xi_{p_{i,k}})^\top \;\; (\xi_{u_{i,k}})^\top\big]^\top \in \mathbb{R}^{12}$$

with

$$\xi_{\theta_{i,k}} = \tilde{\theta}_{i,k} = \log\big({}^{G}_{I_{i,k}}\tilde{R}\big) \in \mathbb{R}^3,\quad \xi_{v_{i,k}} = {}^{G}\hat{v}_{I_{i,k}} - \big(I_3 + \lfloor \tilde{\theta}_{i,k} \times\rfloor\big)\,{}^{G}v_{I_{i,k}} \in \mathbb{R}^3,$$
$$\xi_{p_{i,k}} = {}^{G}\hat{p}_{I_{i,k}} - \big(I_3 + \lfloor \tilde{\theta}_{i,k} \times\rfloor\big)\,{}^{G}p_{I_{i,k}} \in \mathbb{R}^3,\quad \xi_{u_{i,k}} = {}^{G}\hat{p}_u - \big(I_3 + \lfloor \tilde{\theta}_{i,k} \times\rfloor\big)\,{}^{G}p_u \in \mathbb{R}^3, \tag{9}$$

and

$$\xi_{c_{i,l}} \triangleq \big[(\xi_{\theta_{i,l}})^\top \;\; (\xi_{p_{i,l}})^\top\big]^\top \in \mathbb{R}^6,\quad \xi_{\theta_{i,l}} = \log\big({}^{G}_{I_{i,l}}\tilde{R}\big) \in \mathbb{R}^3,\quad \xi_{p_{i,l}} = {}^{G}\hat{p}_{I_{i,l}} - \big(I_3 + \lfloor \tilde{\theta}_{i,l} \times\rfloor\big)\,{}^{G}p_{I_{i,l}} \in \mathbb{R}^3 \tag{10}$$

Given the error definitions in (6), (9), and (10), we define the overall error-state vector at timestep $k$ as

$$\tilde{x}_{i,k} \triangleq \big[\tilde{x}_{I_{i,k}}^\top \;\; \tilde{x}_{C_i}^\top\big]^\top \in \mathbb{R}^{18+6m}, \quad \tilde{x}_{I_{i,k}} = \big[\xi_{I_{i,k}}^\top \;\; \tilde{B}_{i,k}^\top\big]^\top, \quad \tilde{x}_{C_i} = \big[\xi_{c_{i,k-1}}^\top \; \cdots \; \xi_{c_{i,k-m}}^\top\big]^\top \tag{11}$$

B. System Model

System motion model: In the proposed DC-VIRO, we adopt the general 3-D rigid-body kinematics as the system motion model:

$${}^{G}_{I_{i,k}}\dot{R} = {}^{G}_{I_{i,k}}R\,\lfloor {}^{I_{i,k}}\omega \times\rfloor, \quad {}^{G}\dot{v}_{I_{i,k}} = {}^{G}_{I_{i,k}}R\,\big({}^{I_{i,k}}a\big) + {}^{G}g, \quad {}^{G}\dot{p}_{I_{i,k}} = {}^{G}v_{I_{i,k}}, \quad {}^{G}\dot{p}_u = 0, \quad \dot{b}_{\omega_{i,k}} = w_{\omega_i}, \quad \dot{b}_{a_{i,k}} = w_{a_i} \tag{12}$$

where ${}^{G}g = [0 \;\; 0 \;\; -9.8]^\top$ denotes the gravity vector. In particular, the IMU measures the robot's acceleration ${}^{I_{i,k}}a$ and angular velocity ${}^{I_{i,k}}\omega$ with respect to its own frame $I_{i,k}$ at each timestep $k$:

$${}^{I_{i,k}}a_m = {}^{I_{i,k}}a - {}^{G}_{I_{i,k}}R^\top\,{}^{G}g + n_{a_{i,k}} + b_{a_{i,k}}, \quad {}^{I_{i,k}}\omega_m = {}^{I_{i,k}}\omega + n_{\omega_{i,k}} + b_{\omega_{i,k}} \tag{13}$$

where ${}^{I_{i,k}}a_m$ and ${}^{I_{i,k}}\omega_m$ are the noisy measurements of the robot's acceleration and angular velocity, respectively; $n_{a_{i,k}}$ and $n_{\omega_{i,k}}$ are the IMU measurement noises, assumed to be zero-mean Gaussian; and $b_{\omega_{i,k}}$ and $b_{a_{i,k}}$ are the IMU biases, modeled as random-walk processes whose time derivatives follow a Gaussian distribution.

Camera measurement model: When a static landmark of the environment, denoted ${}^{G}p_f$, is tracked by the camera carried by robot $i$ at timestep $k$, the corresponding feature measurement is obtained through the model

$$z_{c_{i,k}} = \Pi\big({}^{C_{i,k}}p_f\big) + n_{c_{i,k}}, \quad {}^{C_{i,k}}p_f = {}^{C_{i,k}}_{I_{i,k}}R\; {}^{I_{i,k}}_{G}R\,\big({}^{G}p_f - {}^{G}p_{I_{i,k}}\big) + {}^{C_{i,k}}p_{I_{i,k}} \tag{14}$$

where $n_{c_{i,k}} \sim \mathcal{N}(0, Q_c)$ is the measurement noise, assumed to be white Gaussian with covariance $Q_c$.
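The chained frame transformation and projection in (14) can be sketched numerically as follows. This is an illustrative numpy sketch (ours, not from the paper); `R_IC` and `p_CI` stand for the camera-IMU extrinsic rotation and translation, which are assumed known:

```python
import numpy as np

def project(p_c):
    """Pinhole projection of a 3-D camera-frame point: [x, y, z] -> [x/z, y/z]."""
    return p_c[:2] / p_c[2]

def camera_measurement(R_GI, p_GI, R_IC, p_CI, p_Gf, noise_std=0.0):
    """Sketch of Eq. (14): move a global landmark p_Gf into the camera frame
    via the IMU pose (R_GI, p_GI) and the camera-IMU extrinsics, then project.

    R_GI : rotation from IMU frame to global frame ({}^G_I R)
    R_IC : rotation from IMU frame to camera frame
    p_CI : IMU origin expressed in the camera frame
    """
    p_I = R_GI.T @ (p_Gf - p_GI)   # landmark in the IMU frame ({}^I_G R = R_GI^T)
    p_C = R_IC @ p_I + p_CI        # landmark in the camera frame
    return project(p_C) + noise_std * np.random.randn(2)

# identity pose and extrinsics: landmark [1, 2, 5] projects to [0.2, 0.4]
z = camera_measurement(np.eye(3), np.zeros(3), np.eye(3), np.zeros(3),
                       np.array([1.0, 2.0, 5.0]))
```

Note that the measurement only constrains the bearing to the landmark; depth enters through the motion of the sliding-window clones.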
${}^{C_{i,k}}p_f$ is the landmark position in the camera frame, and the projection function $\Pi(\cdot)$ is defined as

$$\Pi\big([x \;\; y \;\; z]^\top\big) = [x/z \;\; y/z]^\top$$

Range measurement model: The UWB tag on the robot provides the ranging measurement between robot $i$ and the UWB anchor at time $t_k$, denoted $d_{i,k}$. Given the robot's pose $\big({}^{I_{i,k}}_{G}R, {}^{G}p_{I_{i,k}}\big)$ and the anchor position ${}^{G}p_u$, the ranging measurement is described by the model

$$z_{u_{i,k}} = \big\| {}^{G}p_{I_{i,k}} + {}^{I_{i,k}}_{G}R^\top\, {}^{I_{i,k}}p_T - {}^{G}p_u \big\| + n_{u_{i,k}} + b_{u_{i,k}} \tag{15}$$

where $n_{u_{i,k}} \sim \mathcal{N}(0, Q_u)$ is the measurement noise, and ${}^{I_{i,k}}p_T$ and $b_{u_{i,k}}$ represent the UWB tag's position in the IMU frame and the measurement bias, respectively. Note that ${}^{I_{i,k}}p_T$ and $b_{u_{i,k}}$ can be easily calibrated offline [10].

III. ALGORITHM DESIGN

In this section, we present the proposed DC-VIRO algorithm for 3-D multi-robot systems, which enables simultaneous self-localization and calibration of multiple unknown anchors by leveraging inter-robot information as additional constraints. The main algorithm is summarized in Algorithm 1.

A. IMU Propagation

Consider a posterior state estimate $\hat{X}_{i,k}$.¹ To propagate the state covariance, we first linearize the continuous-time kinematics (12) at $\hat{X}_{i,k}$ to obtain the linearized error-state model

$$\frac{d}{dt}\tilde{x}_{i,k} = \begin{bmatrix} F_{i,k} & 0_{18\times6m} \\ 0_{6m\times18} & 0_{6m} \end{bmatrix}\tilde{x}_{i,k} + \begin{bmatrix} G_{i,k} \\ 0_{6m\times18} \end{bmatrix} n_{i,k}, \tag{16}$$

where $n_{i,k} = \big[n_{\omega_{i,k}}^\top \;\; n_{a_{i,k}}^\top \;\; 0_{1\times6} \;\; w_{\omega_i}^\top \;\; w_{a_i}^\top\big]^\top$. The explicit forms of the Jacobians $F_{i,k}$ and $G_{i,k}$ are provided in the Appendix. Given the linearized system in (16), we propagate the covariance as

$$\bar{P}_{i,k+1} = \bar{\Phi}_{i,k+1|k}\, \hat{P}_{i,k}\, \bar{\Phi}_{i,k+1|k}^\top + Q_{i,k} \tag{17}$$

where $Q_{i,k}$ denotes the noise covariance for the $i$-th robot, and $\bar{\Phi}_{i,k+1|k}$ is the discrete-time state transition matrix from time $t_k$ to $t_{k+1}$, computed as

$$\bar{\Phi}_{i,k+1|k} = \exp\left(\begin{bmatrix} F_{i,k} & 0_{18\times6m} \\ 0_{6m\times18} & 0_{6m} \end{bmatrix}\delta t\right), \quad \delta t = t_{k+1} - t_k \tag{18}$$

¹ Throughout this paper, $(\bar{X}_{i,k}, \bar{P}_{i,k})$ and $(\hat{X}_{i,k}, \hat{P}_{i,k})$ denote the prior and posterior estimates of robot $i$'s state $X_{i,k}$ at timestep $t_k$, respectively.

B. Range Update

We first linearize the range measurement model (15) using a first-order Taylor expansion. Notably, unlike the standard Taylor expansion with the standard vector error, we must linearize (15) with respect to the right-invariant error $\tilde{x}_{I_{i,k}}$ in (11). Recalling that $\exp_G(\xi_{I_{i,k}}) \approx I_6 + \xi_{I_{i,k}}^{\wedge}$ in (8), the linearized model at the linearization point $\hat{X}_{i,k}$ is

$$\tilde{z}_{u_{i,k}} = H_{u_{i,k}}\tilde{x}_{I_{i,k}} + n_{u_{i,k}} \tag{19}$$

where the measurement Jacobian $H_{u_{i,k}}$ is given by

$$H_{u_{i,k}} = H_{p_u}\big[\Lambda_k \;\; 0_3 \;\; I_3 \;\; -I_3 \;\; 0_{3\times6}\big], \quad H_{p_u} = \frac{\big({}^{G}\hat{p}_{I_{i,k}} - {}^{G}\hat{p}_u + {}^{G}_{I_{i,k}}\hat{R}\,{}^{I_{i,k}}p_T\big)^\top}{\big\| {}^{G}\hat{p}_{I_{i,k}} - {}^{G}\hat{p}_u + {}^{G}_{I_{i,k}}\hat{R}\,{}^{I_{i,k}}p_T \big\|}, \quad \Lambda_k = \lfloor {}^{G}\hat{p}_u - {}^{G}\hat{p}_{I_{i,k}} - {}^{G}_{I_{i,k}}\hat{R}\,{}^{I_{i,k}}p_T \times\rfloor. \tag{20}$$

As previously mentioned, UWB anchors can be classified into two categories: individual anchors, which are observed by only a single robot, and common anchors, which can be simultaneously observed by multiple robots. Measurements associated with individual anchors can be directly incorporated into the update step following the standard invariant EKF framework [20]. This work therefore focuses on the update with common anchor measurements.

Specifically, if both robot $i$ and its neighbors receive distance measurements from the same anchor, the anchor is a common anchor at this time step. Let $\mathcal{N}_{i,k} = \{j_1, \ldots, j_L\}$ denote the neighbors of robot $i$ (excluding robot $i$ itself) that observe the common anchor. To update with this common anchor information, robot $i$ collects information from its neighbors to form the stacked linearized system

$$\bar{z}_{u_i} = \begin{bmatrix} \tilde{z}_{u_{i,k}} \\ \tilde{z}_{u_{j_1,k}} \\ \vdots \\ \tilde{z}_{u_{j_L,k}} \end{bmatrix} = \tilde{H}_{i,k}\begin{bmatrix} \tilde{x}_{I_{i,k}} \\ \tilde{x}_{I_{j_1,k}} \\ \vdots \\ \tilde{x}_{I_{j_L,k}} \end{bmatrix} + \begin{bmatrix} n_{u_{i,k}} \\ n_{u_{j_1,k}} \\ \vdots \\ n_{u_{j_L,k}} \end{bmatrix} \tag{21}$$

where $\tilde{H}_{i,k} = \mathrm{diag}\big(H_{u_{i,k}}, H_{u_{j_1,k}}, \ldots, H_{u_{j_L,k}}\big)$ denotes the measurement Jacobian of the stacked system. For simplicity, we reformulate model (21) by rewriting each column block of $\tilde{H}_{i,k}$ as a block matrix:

$$\begin{bmatrix} \tilde{z}_{u_{i,k}} \\ \tilde{z}_{u_{j_1,k}} \\ \vdots \\ \tilde{z}_{u_{j_L,k}} \end{bmatrix} \triangleq \big[\bar{H}_{u_{i,k}} \; \cdots \; \bar{H}_{u_{j_L,k}}\big]\begin{bmatrix} \tilde{x}_{I_{i,k}} \\ \tilde{x}_{I_{j_1,k}} \\ \vdots \\ \tilde{x}_{I_{j_L,k}} \end{bmatrix} + \begin{bmatrix} n_{u_{i,k}} \\ n_{u_{j_1,k}} \\ \vdots \\ n_{u_{j_L,k}} \end{bmatrix} \tag{22}$$

This stacked measurement model depends on the states of all robots in $\mathcal{N}_{i,k}$, thereby incorporating information from both the ego robot and its neighbors, and effectively improving estimation accuracy by leveraging shared inter-robot measurements. However, the stacked model cannot be used directly for the update, since the states $\tilde{x}_{I_{i,k}}$ and $\tilde{x}_{I_{j_l,k}}$ become correlated once robots utilize shared information [17]. Such cross-correlation is generally unknown in a fully distributed setting, and if ignored, it may lead to inconsistency and degraded estimation performance [17]. To properly handle the unknown correlation and ensure estimator consistency, we leverage the covariance intersection algorithm [17] to approximate the joint covariance:

$$\begin{bmatrix} \frac{1}{\omega_i}\bar{P}_i & & \\ & \ddots & \\ & & \frac{1}{\omega_{j_L}}\bar{P}_{j_L} \end{bmatrix} \succeq \begin{bmatrix} \bar{P}_i & \cdots & \bar{P}_{ij_L} \\ \vdots & \ddots & \vdots \\ \bar{P}_{ij_L}^\top & \cdots & \bar{P}_{j_L} \end{bmatrix} \tag{23}$$

where the right-hand side denotes the true joint covariance matrix including the cross-covariance terms, while the left-hand side represents the approximated covariance matrix expressed as a function of the individual covariances of each robot.
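The inflation on the left-hand side of (23) is straightforward to implement: each robot's covariance is scaled by the inverse of its fusion weight and placed on the block diagonal, in place of the unknown joint covariance. The sketch below is ours, not from the paper, and uses equal weights purely for illustration:

```python
import numpy as np

def ci_joint_covariance(P_blocks, omegas):
    """Covariance-intersection bound of Eq. (23): approximate the unknown
    joint covariance (with cross-terms) by diag(P_i / w_i), where the weights
    are positive and sum to one (cf. Eq. (24))."""
    assert all(w > 0 for w in omegas) and abs(sum(omegas) - 1.0) < 1e-9
    n = sum(P.shape[0] for P in P_blocks)
    J = np.zeros((n, n))
    r = 0
    for P, w in zip(P_blocks, omegas):
        d = P.shape[0]
        J[r:r + d, r:r + d] = P / w   # inflate each block by 1/w
        r += d
    return J

# two robots with 3x3 covariances, fused with equal weights
P_i, P_j = np.eye(3) * 0.1, np.eye(3) * 0.2
J = ci_joint_covariance([P_i, P_j], [0.5, 0.5])
```

Because the inflated block-diagonal matrix upper-bounds the true joint covariance for any admissible cross-correlation, an EKF update that uses it in place of the unknown joint covariance cannot become overconfident, which is exactly the consistency property the filter needs.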
The coefficients $\omega_i$ and $\omega_{j_l}$ are fusion weights that satisfy

$$\omega_i > 0, \quad \omega_{j_l} > 0, \quad \omega_i + \sum_{j_l \in \mathcal{N}_{i,k}} \omega_{j_l} = 1 \tag{24}$$

Using the linearized measurement residual in (22), the EKF update for each robot's state can be performed by substituting the approximated covariance in (23). In particular, the covariance is updated as

$$\hat{P}_{i,k} = \frac{1}{\omega_i}\bar{P}_{i,k} - \frac{1}{\omega_i^2}\bar{P}_{i,k}\bar{H}_{u_{i,k}}^\top S_{i,k}^{-1}\bar{H}_{u_{i,k}}\bar{P}_{i,k}, \quad S_{i,k} = \sum_{j \in \mathcal{N}_{i,k}} \frac{1}{\omega_j}\bar{H}_{u_{j,k}}\bar{P}_{j,k}\bar{H}_{u_{j,k}}^\top + Q_u \tag{25}$$

and the state is updated as

$$\hat{X}_{i,k} = \exp(\varepsilon_{i,k})\,\bar{X}_{i,k}, \quad \varepsilon_{i,k} = \frac{1}{\omega_i}\bar{P}_{i,k}\bar{H}_{u_{i,k}}^\top S_{i,k}^{-1}\bar{z}_{u_i} \tag{26}$$

C. Visual Update

Similar to the UWB measurement update, the visual update is divided into two categories: common observation features and individual observation features. The term common observation features refers to landmarks that are simultaneously observed by multiple robots; these shared observations can be exploited to establish inter-robot constraints and improve the consistency of the joint state estimation. In contrast, individual observation features are detected by only a single robot and thus contribute solely to that robot's local state update, without introducing cross-robot constraints. Specifically, we first linearize the visual measurement model (14) as

$$\tilde{z}_{c_{i,k}} = H_{x_{i,k}}\tilde{x}_{I_{i,k}} + H_{f_{i,k}}\tilde{p}_f + n_{c_{i,k}} \tag{27}$$

where the measurement Jacobians $H_{x_{i,k}}$ and $H_{f_{i,k}}$ are computed as

$$H_{x_{i,k}} = H_{p_c}\,{}^{C_{i,k}}_{I_{i,k}}\hat{R}\;{}^{I_{i,k}}_{G}\hat{R}\,\big[0_3 \;\; 0_3 \;\; -I_3 \;\; 0_{3\times9}\big], \quad H_{f_{i,k}} = H_{p_c}\,{}^{C_{i,k}}_{I_{i,k}}\hat{R}\;{}^{I_{i,k}}_{G}\hat{R}, \quad H_{p_c} = \begin{bmatrix} 1/\hat{z} & 0 & -\hat{x}/\hat{z}^2 \\ 0 & 1/\hat{z} & -\hat{y}/\hat{z}^2 \end{bmatrix} \tag{28}$$

For individual observation features, the linearized measurement residual in (27) can be directly employed for the update. In contrast, when a common observation feature ${}^{G}p_f$ is detected by multiple robots, each robot $i$ aggregates the information received from its neighbors and constructs a stacked linearized system as

$$\bar{z}_{c_i} = \begin{bmatrix} \tilde{z}_{c_{i,k}} \\ \tilde{z}_{c_{j_1,k}} \\ \vdots \\ \tilde{z}_{c_{j_L,k}} \end{bmatrix} = \bar{H}_{i,k}\begin{bmatrix} \tilde{x}_{I_{i,k}} \\ \tilde{x}_{I_{j_1,k}} \\ \vdots \\ \tilde{x}_{I_{j_L,k}} \end{bmatrix} + \bar{H}_{f_i}\,{}^{G}\bar{p}_f + \begin{bmatrix} n_{c_{i,k}} \\ n_{c_{j_1,k}} \\ \vdots \\ n_{c_{j_L,k}} \end{bmatrix} \tag{29}$$

where $\bar{H}_{i,k} = \mathrm{diag}\big(H_{x_{i,k}}, H_{x_{j_1,k}}, \ldots, H_{x_{j_L,k}}\big)$ denotes the measurement Jacobian with respect to all involved robots' states, and $\bar{H}_{f_i} = \big[H_{f_{i,k}}^\top \;\; H_{f_{j_1,k}}^\top \; \cdots \; H_{f_{j_L,k}}^\top\big]^\top$ is the Jacobian with respect to the feature point. Since this follows the same form as (22), the procedures in (22)–(26) can subsequently be employed to perform the update with common observation features.

D. UWB State Augmentation and Initialization

Initially, the system state estimate $\hat{T}_k$ includes only the IMU state and not the anchor position ${}^{G}\hat{p}_u$. A key task is therefore to initialize the UWB anchor state together with its covariance and to augment them into the system state, enabling joint estimation within the DC-VIRO framework. We assume that each robot can directly obtain IMU states and ranging measurements over a time window of length $n$, and can share its ranging information with neighboring robots within the UWB measurement range. Specifically, to initialize the anchor state, we construct the following optimization problem based on the ranging measurements of each robot $i$:

$$\min_{{}^{G}p_{u_i}} \sum_{k=1}^{n}\Big(z_{u_{i,k}} - b_{u_{i,k}} - \big\| {}^{G}p_{T_{i,k}} - {}^{G}p_{u_i} \big\|\Big)^2 \tag{30}$$

where ${}^{G}p_{T_{i,k}} = {}^{G}p_{I_{i,k}} + {}^{I_{i,k}}_{G}R^\top\,{}^{I_{i,k}}p_T$ denotes the global-frame position of the UWB tag carried by the $i$-th robot, obtained from the IMU pose and the tag offset ${}^{I_{i,k}}p_T$. To initialize the covariance of the UWB anchor states and their correlation with the existing states, we first define the state $X_{i_{pos}}$ containing the poses of the $i$-th robot from time step $k = 0$ to $k = n$.
$$X_{i_{pos}} = \big[{}^{G}_{I_{i,0}}R^\top \;\; {}^{G}p_{I_{i,0}}^\top \; \cdots \; {}^{G}_{I_{i,n}}R^\top \;\; {}^{G}p_{I_{i,n}}^\top\big]^\top \tag{31}$$

We then stack all available UWB measurements $z_{u,i} = \big[z_{u_{0,i}}, \cdots, z_{u_{n,i}}\big]$ to construct the stacked measurement model

$$z_{u,i} = h\big(X_{i_{pos}}, {}^{G}p_u\big) + n_{u,i} \tag{32}$$

where each $z_{u_i}$ satisfies model (15). Linearizing the measurement model above, we obtain

$$\tilde{z}_{u,i} = \big[H_{x,i} \;\; H_{u,i}\big]\begin{bmatrix} \tilde{x}_{i_{pos}} \\ \xi_{u_k} \end{bmatrix} + n_{u,i} \tag{33}$$

where $n_{u,i}$ represents the stacked measurement-noise vector and $\tilde{x}_{i_{pos}}$ represents the error state of $X_{i_{pos}}$, given by

$$\tilde{x}_{i_{pos}} = \big[\xi_{\theta_{i,0}}^\top \;\; \xi_{p_{i,0}}^\top \; \cdots \; \xi_{\theta_{i,n}}^\top \;\; \xi_{p_{i,n}}^\top\big]^\top \tag{34}$$

Through QR decomposition [21], we decompose the linear system into two subsystems:

$$\begin{bmatrix} \tilde{z}_{u_1,i} \\ \tilde{z}_{u_2,i} \end{bmatrix} = \begin{bmatrix} H_{x_1,i} & H_{u_1,i} \\ H_{x_2,i} & 0 \end{bmatrix}\begin{bmatrix} \tilde{x}_{i_{pos}} \\ \xi_{u_k} \end{bmatrix} + \begin{bmatrix} n_{u_1,i} \\ n_{u_2,i} \end{bmatrix} \tag{35}$$

The covariance of the UWB state and its correlations with the existing state can then be computed and incorporated into the current covariance through augmentation, following [21].

IV. CONSISTENCY ANALYSIS

A. Observability Analysis

Observability is a crucial metric for analyzing system consistency. In this section, we conduct an observability analysis of the DC-VIRO system constructed above. To simplify the derivation, we assume that each robot's state contains only one feature point ${}^{G}p_f$ and one UWB anchor point ${}^{G}p_u$:

$$T_{i,k} = \begin{bmatrix} {}^{G}_{I_{i,k}}R & {}^{G}v_{I_{i,k}} & {}^{G}p_{I_{i,k}} & {}^{G}p_u & {}^{G}p_f \\ 0_{4\times3} & & I_4 & & \end{bmatrix} \in SE_4(3) \tag{36}$$

Algorithm 1 DC-VIRO
1: Step 1: Initialization
2: Consider the nonlinear system motion model (12). Start with $\hat{X}_{i,0} = E(X_{i,0})$, $\hat{P}_{i,0} = P_{i,0}$, for all robots $i = 1, \ldots, n$.
3: Step 2: State propagation by each single robot
4: Each robot $i = 1, \ldots, n$ performs the propagation step to compute the prior estimate $(\bar{X}_{i,k}, \bar{P}_{i,k})$ at timestep $k$.
5: Step 3: State augmentation. Once robot $i$ detects a previously unseen UWB anchor, it initializes the corresponding anchor state and augments both the state and its covariance into the overall system state.
6: Step 4: Measurement update
7: Each robot communicates with its neighbors to determine whether commonly observed features or shared anchors exist. If no common observations are available, each robot performs the update step using only its own measurements. When common features or anchors are detected, the shared information is fused to compute the posterior estimate $(\hat{X}_{i,k}, \hat{P}_{i,k})$.

According to the proof in reference [12], the local observability matrix of the time-varying error states for the entire system is defined as

$$\mathcal{O} = \begin{bmatrix} \mathcal{O}_0 \\ \mathcal{O}_1 \\ \vdots \\ \mathcal{O}_k \end{bmatrix} = \begin{bmatrix} H_0 \\ H_1\bar{\Phi}_{1|0} \\ \vdots \\ H_k\bar{\Phi}_{k|0} \end{bmatrix}, \tag{37}$$

Here, $\bar{\Phi}_{k|0} = \mathrm{diag}\big(\Phi_{1,k|0}, \cdots, \Phi_{n,k|0}\big)$ is the joint state transition matrix for all robots. Since both visual and distance measurements are used to update the state, the joint measurement Jacobian can be written as

$$H_k = \big[H_{c_k}^\top \;\; H_{u_k}^\top\big]^\top \tag{38}$$

where $H_{c_k} = \mathrm{diag}\big(H_{c_{1,k}}, \cdots, H_{c_{n,k}}\big)$ represents the Jacobian matrix for visual measurements, and $H_{u_k} = \mathrm{diag}\big(H_{u_{1,k}}, \cdots, H_{u_{n,k}}\big)$ denotes the Jacobian matrix for UWB ranging measurements. The $k$-th sub-block of the observability matrix can then be expressed as

$$\mathcal{O}_k \triangleq \begin{bmatrix} \mathcal{O}_{c_k} \\ \mathcal{O}_{u_k} \end{bmatrix} = \begin{bmatrix} H_{c_k}\bar{\Phi}_{k|0} \\ H_{u_k}\bar{\Phi}_{k|0} \end{bmatrix} \tag{39}$$

where each sub-block can be represented as

$$H_{c_k}\bar{\Phi}_{k|0} = \begin{bmatrix} H_{c_{1,k}}\Phi_{1,k|0} \\ \vdots \\ H_{c_{n,k}}\Phi_{n,k|0} \end{bmatrix}, \quad H_{u_k}\bar{\Phi}_{k|0} = \begin{bmatrix} H_{u_{1,k}}\Phi_{1,k|0} \\ \vdots \\ H_{u_{n,k}}\Phi_{n,k|0} \end{bmatrix} \tag{40}$$

Lemma 1: If each robot in the team undergoes general motion in 3-D space, the right null space $N$ of the observability matrix $\mathcal{O}_k$ for the multi-robot system is spanned by four unobservable directions corresponding to global position and yaw rotation:

$$N = \begin{bmatrix} N_1 \\ \vdots \\ N_n \end{bmatrix}, \quad N_i = \begin{bmatrix} {}^{G}g & 0_3 \\ 0_{3\times1} & 0_3 \\ 0_{3\times1} & I_3 \\ 0_{3\times1} & I_3 \\ 0_{3\times1} & I_3 \\ 0_{3\times1} & 0_3 \\ 0_{3\times1} & 0_3 \end{bmatrix}, \quad \text{for } i \in \{1, \ldots, n\} \tag{41}$$

where the block rows correspond to the rotation, velocity, position, anchor-position, feature-position, and bias errors, respectively.

Remark 1: Owing to the invariant error property, the unobservable subspace of the original multi-robot system remains independent of the state estimate. Consequently, the proposed DC-VIRO algorithm inherits the same observability characteristics as the original system, thereby mitigating the inconsistency introduced by linearization in the standard EKF and ensuring estimator consistency.

V. SIMULATIONS

A. Simulation Settings

We used MATLAB to simulate a three-dimensional environment with multiple robots and 3 UWB anchors, where each robot follows a predefined trajectory, as illustrated in Fig. 2. Each robot is equipped with an IMU for motion sensing, a monocular camera for feature detection, and a UWB tag for ranging with the anchors. The IMU, monocular camera, and UWB measurements operate at frequencies of 100 Hz, 10 Hz, and 10 Hz, respectively. The IMU noise standard deviations for each robot are specified as

$$n_{a_i} = [0.003, \; 0.003, \; 0.004]^\top \; \big(\mathrm{m/(s^2\sqrt{Hz})}\big), \quad n_{\omega_i} = [0.0003, \; 0.0003, \; 0.0005]^\top \; \big(\mathrm{rad/(s\sqrt{Hz})}\big) \tag{42}$$

Fig. 2: Robots' ground-truth paths and estimation performance over 100 Monte Carlo simulations under different trajectories ((a) Trajectory a, (b) Trajectory b).

The camera has a limited observation range, and its measurement noise is modeled as 1 pixel. The initial positions of the UWB anchors are unknown and must be calibrated, with a noise level of 0.1 m. The anchor positions are listed in Table I. The initial covariance of each robot's state (position, velocity, and orientation) is set to $10^{-3} I_9$, and the initial state estimates are set to the true poses, as summarized in Table II.
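When reproducing a setup like this, one practical detail is converting the continuous-time noise densities in (42) into per-sample standard deviations at the IMU rate. A common convention (our sketch, assuming the values in (42) are continuous-time white-noise densities) is to multiply by the square root of the sampling rate:

```python
import numpy as np

def discrete_imu_noise_std(density, rate_hz):
    """Convert a continuous-time white-noise density (units per sqrt(Hz))
    into the per-sample standard deviation at a given sampling rate:
    sigma_d = sigma_c * sqrt(rate)."""
    return np.asarray(density) * np.sqrt(rate_hz)

# accelerometer densities from Eq. (42), sampled at the 100 Hz IMU rate
n_a = [0.003, 0.003, 0.004]
sigma_a = discrete_imu_noise_std(n_a, 100.0)  # per-sample std in m/s^2
```

Getting this conversion wrong by a factor of sqrt(rate) is a frequent source of over- or under-confident filters, which would show up directly in the consistency metrics discussed above.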
Each robot acquires joint information by communicating with neighboring robots in pairs, with the probability of establishing communication with a neighbor set to 70%.

TABLE I: Initial position of each anchor
Anchor 1: [0.00, 0.00, 0.00]^⊤ m
Anchor 2: [0.00, 15.00, 2.00]^⊤ m
Anchor 3: [5.00, 15.00, 2.00]^⊤ m

TABLE II: Initial pose of each robot for trajectory a
Robot 1: position [14.00, 4.00, 0.00]^⊤ m, orientation [0, 0, 0]^⊤ rad
Robot 2: position [14.00, 11.00, 0.00]^⊤ m, orientation [0, π, 0]^⊤ rad
Robot 3: position [6.00, 4.00, 0.00]^⊤ m, orientation [0, 0, 0]^⊤ rad
Robot 4: position [6.00, 11.00, 0.00]^⊤ m, orientation [0, π, 0]^⊤ rad

B. Simulation Results

To thoroughly evaluate the performance of the proposed algorithm, we conducted 100 Monte Carlo simulation runs. The estimated trajectories of the robots are shown in Fig. 2, demonstrating close agreement with the ground truth. Using the root mean square error (RMSE) as the evaluation metric, we assessed the accuracy of both position and heading estimation. The results, summarized in Fig. 3, present the average position RMSE (PRMSE) and orientation RMSE (ORMSE) for each robot, confirming that the proposed algorithm achieves highly accurate and consistent localization performance.

Fig. 3: Averaged position RMSE (PRMSE) and orientation RMSE (ORMSE) for each robot over 100 Monte Carlo simulations ((a) Trajectory a, (b) Trajectory b).

To further demonstrate its effectiveness, we compared the proposed algorithm with a single-robot localization approach that does not involve inter-robot information sharing. As reported in Table III, incorporating inter-robot information fusion leads to a clear improvement in the overall localization accuracy of the robot team, confirming the advantage of the proposed method.

TABLE III: Comparison of average RMSE (m/deg) with and without inter-robot information
With inter-robot info.: Trajectory a 0.147/1.295, Trajectory b 0.256/1.027
Without inter-robot info.: Trajectory a 0.205/1.788, Trajectory b 0.498/1.575

VI. CONCLUSION

This paper presents a consistent and tightly coupled distributed multi-robot visual-inertial-ranging odometry (DC-VIRO) system formulated on matrix Lie groups. By incorporating UWB ranging measurements into a collaborative VIO framework, the system introduces the concept of shared anchors, which, together with common visual features, provide additional geometric constraints through inter-robot communication. Anchor positions are explicitly included in the state to handle calibration uncertainty, and the use of a right-invariant error formulation ensures that DC-VIRO preserves the same four unobservable directions as standard VIO, guaranteeing observability consistency. Extensive Monte Carlo simulations demonstrate that the proposed method significantly improves multi-robot localization accuracy and robustness. In future work, we will focus on implementing DC-VIRO on real robotic platforms, addressing challenges such as sensor synchronization, communication delays, and real-time computation to validate its effectiveness in practical deployments.

APPENDIX

The Jacobians $F_{i,k}$ and $G_{i,k}$ are obtained by linearizing the continuous-time IMU kinematics in (12) under the right-invariant error definition. The Jacobian $F_{i,k}$ can be computed as

$$F_{i,k} = \begin{bmatrix} F_A & F_{B_{i,k}} \\ 0_{6\times12} & 0_6 \end{bmatrix}, \quad F_A = \begin{bmatrix} 0_3 & 0_3 & 0_3 & 0_3 \\ \lfloor {}^{G}g \times\rfloor & 0_3 & 0_3 & 0_3 \\ 0_3 & I_3 & 0_3 & 0_3 \\ 0_3 & 0_3 & 0_3 & 0_3 \end{bmatrix}, \quad F_{B_{i,k}} = \begin{bmatrix} -{}^{G}_{I_{i,k}}\hat{R} & 0_3 \\ -\lfloor {}^{G}\hat{v}_{I_{i,k}} \times\rfloor\,{}^{G}_{I_{i,k}}\hat{R} & -{}^{G}_{I_{i,k}}\hat{R} \\ -\lfloor {}^{G}\hat{p}_{I_{i,k}} \times\rfloor\,{}^{G}_{I_{i,k}}\hat{R} & 0_3 \\ -\lfloor {}^{G}\hat{p}_u \times\rfloor\,{}^{G}_{I_{i,k}}\hat{R} & 0_3 \end{bmatrix} \tag{43}$$

and the Jacobian $G_{i,k}$ can be computed as

$$G_{i,k} = \begin{bmatrix} \mathrm{Ad}_{\hat{X}_{i,k}} & 0_{12\times6} \\ 0_{6\times12} & I_6 \end{bmatrix}, \quad \mathrm{Ad}_{\hat{X}_{i,k}} = \begin{bmatrix} {}^{G}_{I_{i,k}}\hat{R} & 0_3 & 0_3 & 0_3 \\ \lfloor {}^{G}\hat{v}_{I_{i,k}} \times\rfloor\,{}^{G}_{I_{i,k}}\hat{R} & {}^{G}_{I_{i,k}}\hat{R} & 0_3 & 0_3 \\ \lfloor {}^{G}\hat{p}_{I_{i,k}} \times\rfloor\,{}^{G}_{I_{i,k}}\hat{R} & 0_3 & {}^{G}_{I_{i,k}}\hat{R} & 0_3 \\ \lfloor {}^{G}\hat{p}_u \times\rfloor\,{}^{G}_{I_{i,k}}\hat{R} & 0_3 & 0_3 & {}^{G}_{I_{i,k}}\hat{R} \end{bmatrix} \tag{44}$$

REFERENCES

[1] Y.-H. Lee, C. Zhu, T. Wiedemann, E. Staudinger, S. Zhang, and C.
Günther, "CoVOR-SLAM: Cooperative SLAM using visual odometry and ranges for multi-robot systems," 2023. [Online]. Available: https://arxiv.org/abs/2311.12580
[2] P. Zhu, P. Geneva, W. Ren, and G. Huang, "Distributed visual-inertial cooperative localization," 2021. [Online]. Available: https://arxiv.org/abs/2103.12770
[3] I. V. Melnyk, J. A. Hesch, and S. I. Roumeliotis, "Cooperative vision-aided inertial navigation using overlapping views," in 2012 IEEE International Conference on Robotics and Automation, 2012, pp. 936–943.
[4] B. Nisar, P. Foehn, D. Falanga, and D. Scaramuzza, "VIMO: Simultaneous visual inertial model-based odometry and force estimation," IEEE Robotics and Automation Letters, vol. 4, no. 3, pp. 2785–2792, 2019.
[5] J. Huai, Y. Lin, Y. Zhuang, C. K. Toth, and D. Chen, "Observability analysis and keyframe-based filtering for visual inertial odometry with full self-calibration," IEEE Transactions on Robotics, vol. 38, no. 5, pp. 3219–3237, 2022.
[6] C. Li, L. Yu, and S. Fei, "Real-time 3D motion tracking and reconstruction system using camera and IMU sensors," IEEE Sensors Journal, vol. 19, no. 15, pp. 6460–6466, 2019.
[7] Z. Yin, W. Li, and F. Shen, "UWB-visual-inertial fusion localization with fast and robust UWB extrinsic calibration," in 2024 IEEE SENSORS, 2024, pp. 1–4.
[8] C. Wang, H. Zhang, T.-M. Nguyen, and L. Xie, "Ultra-wideband aided fast localization and mapping system," in 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2017, pp. 1602–1609.
[9] W. Zhao, J. Panerati, and A. P. Schoellig, "Learning-based bias correction for time difference of arrival ultra-wideband localization of resource-constrained mobile robots," IEEE Robotics and Automation Letters, vol. 6, no. 2, pp. 3639–3646, 2021.
[10] T.-M. Nguyen, S. Yuan, M. Cao, Y. Lyu, T. H. Nguyen, and L.
Xie, "NTU VIRAL: A visual-inertial-ranging-lidar dataset, from an aerial vehicle viewpoint," The International Journal of Robotics Research, vol. 41, no. 3, pp. 270–280, 2022.
[11] D.-H. Kim, G.-R. Kwon, J.-Y. Pyun, and J.-W. Kim, "NLOS identification in UWB channel for indoor positioning," in 2018 15th IEEE Annual Consumer Communications & Networking Conference (CCNC), 2018, pp. 1–4.
[12] G. P. Huang, A. I. Mourikis, and S. I. Roumeliotis, "A first-estimates Jacobian EKF for improving SLAM consistency," in Experimental Robotics: The Eleventh International Symposium. Springer, 2009, pp. 373–382.
[13] J. A. Hesch, D. G. Kottas, S. L. Bowman, and S. I. Roumeliotis, "Consistency analysis and improvement of vision-aided inertial navigation," IEEE Transactions on Robotics, vol. 30, no. 1, pp. 158–176, 2014.
[14] T. H. Nguyen, T.-M. Nguyen, and L. Xie, "Flexible and resource-efficient multi-robot collaborative visual-inertial-range localization," IEEE Robotics and Automation Letters, vol. 7, no. 2, pp. 928–935, 2021.
[15] B. Yang, J. Li, and H. Zhang, "UVIP: Robust UWB aided visual-inertial positioning system for complex indoor environments," in 2021 IEEE International Conference on Robotics and Automation (ICRA), 2021, pp. 5454–5460.
[16] S. Jia, R. Xiong, and Y. Wang, "Distributed initialization for visual-inertial-ranging odometry with position-unknown UWB network," in 2023 IEEE International Conference on Robotics and Automation (ICRA), 2023, pp. 6246–6252.
[17] P. Zhu, Y. Yang, W. Ren, and G. Huang, "Cooperative visual-inertial odometry," in 2021 IEEE International Conference on Robotics and Automation (ICRA), 2021, pp. 13135–13141.
[18] Y. Cao and G. Beltrame, "VIR-SLAM: Visual, inertial, and ranging SLAM for single and multi-robot systems," Autonomous Robots, vol. 45, no. 6, pp. 905–917, 2021.
[19] A. Barrau and S.
Bonnabel, "The invariant extended Kalman filter as a stable observer," IEEE Transactions on Automatic Control, vol. 62, no. 4, pp. 1797–1812, 2017.
[20] Y. Yang, C. Chen, W. Lee, and G. Huang, "Decoupled right invariant error states for consistent visual-inertial navigation," IEEE Robotics and Automation Letters, vol. 7, no. 2, pp. 1627–1634, 2022.
[21] P. Geneva, K. Eckenhoff, W. Lee, Y. Yang, and G. Huang, "OpenVINS: A research platform for visual-inertial estimation," in 2020 IEEE International Conference on Robotics and Automation (ICRA), 2020, pp. 4666–4672.
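As a numerical illustration of the block structure of the Jacobian F_{i,k} in (43) of the Appendix, the matrix can be assembled as follows. This is a minimal NumPy sketch under our own assumptions, not the paper's implementation: we take an 18-dimensional error state ordered as rotation, velocity, position, and antenna position (3 each), followed by the 6 bias states, and the function and variable names are ours.

```python
import numpy as np

def skew(v):
    """Skew-symmetric matrix, so that skew(v) @ w == cross(v, w)."""
    return np.array([[0.0, -v[2], v[1]],
                     [v[2], 0.0, -v[0]],
                     [-v[1], v[0], 0.0]])

def build_F(R, v, p, p_u, g):
    """Assemble the 18x18 right-invariant Jacobian with the block layout of (43).

    R: 3x3 rotation estimate; v, p: velocity/position estimates in the global
    frame {G}; p_u: estimated UWB antenna position; g: gravity vector in {G}.
    """
    Z = np.zeros((3, 3))
    I = np.eye(3)
    # F_A: coupling among rotation, velocity, position, antenna-position errors.
    FA = np.block([[Z, Z, Z, Z],
                   [skew(g), Z, Z, Z],   # velocity error driven by rotation error
                   [Z, I, Z, Z],         # position error driven by velocity error
                   [Z, Z, Z, Z]])        # 12x12
    # F_B: coupling of the top 12 error states to the gyro/accel bias errors.
    FB = np.block([[-R, Z],
                   [-skew(v) @ R, -R],
                   [-skew(p) @ R, Z],
                   [-skew(p_u) @ R, Z]]) # 12x6
    F = np.zeros((18, 18))
    F[:12, :12] = FA
    F[:12, 12:] = FB
    return F                             # bottom 6 rows (bias dynamics) stay zero
```

The same `np.block` pattern would build Ad_{X̂} in (44); note its column of ⌊·×⌋R terms carries a positive sign, mirroring the negated terms of F_B.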
