Information-Driven Fault Detection and Identification for Multi-Agent Spacecraft Systems: Collaborative On-Orbit Inspection Mission

Akshita Gupta*
Purdue University, West Lafayette, IN, 47907, USA
Arna Bhardwaj†
Georgia Institute of Technology, Atlanta, GA, 30332, USA
Yashwanth Kumar Nakka‡
Georgia Institute of Technology, Atlanta, GA, 30332, USA
Changrak Choi§ and Amir Rahmani¶
Jet Propulsion Laboratory, California Institute of Technology, Pasadena, CA, 91104, USA

In this chapter, we present a global-to-local, task-aware fault detection and identification (FDI) framework for multi-spacecraft systems performing a collaborative inspection mission in low Earth orbit (LEO). The scenario considers multiple observer spacecraft in stable passive relative orbits (PROs) conducting visual inspection or mapping of a target spacecraft. The inspection task is encoded as a global cost functional, $\mathcal{H}$, which directly incorporates the inspection sensor model, each agent's full pose, and the mission's information-gain objective. This cost functional drives both global-level decision making (task allocation and orbit reconfiguration) and local-level actions (agent motion and sensing), enabling tight coupling between mission objectives and FDI. Fault detection is performed by comparing the expected and observed global task metric $\mathcal{H}$, using thresholds derived from the mission context. To enable fault identification, we introduce higher-order cost gradient metrics that discriminate between task-specific sensor faults, agent-level actuator faults, and state sensor faults. Furthermore, we propose an adaptive thresholding mechanism that accounts for the time-varying nature of the inspection task and the dynamic evolution of PRO-based observation geometries.
The proposed framework is validated in simulation for a representative multi-spacecraft collaborative inspection mission, demonstrating the reliable detection and classification of diverse fault types, including inspection sensor degradation and actuator malfunctions, while maintaining mission objectives. This approach unifies information-driven guidance and control with task-aware FDI, providing a pathway toward fault-resilient autonomous inspection architectures for future distributed spacecraft missions.

* Graduate Researcher, School of Industrial Engineering, gupta417@purdue.edu
† Graduate Researcher, School of Aerospace Engineering, abhardwaj82@gatech.edu
‡ Assistant Professor, Director of Aerospace Robotics Lab, ynakka3@gatech.edu
§ Robotics Technologist, Maritime and Multi-Agent Autonomy Group, Jet Propulsion Laboratory, Changrak.Choi@jpl.nasa.gov
¶ Supervisor, Maritime and Multi-Agent Autonomy Group, Jet Propulsion Laboratory, Amir.Rahmani@jpl.nasa.gov

Nomenclature
$\sigma(\mathbf{p}, \mathbf{s})$ = variance of estimating point of interest $\mathbf{s}$ with a sensor at pose $\mathbf{p}$
$\boldsymbol{\tau}$ = torque (N m)
$\phi(\mathbf{s}) \in [0, 1]$ = relative importance of point of interest $\mathbf{s}$
𝑁(𝑂) = consensus term
$f_{\mathrm{low}}, f_{\mathrm{mid}}, f_{\mathrm{high}}$ = loop frequency of low-, mid-, and high-rate components (Hz)
$\boldsymbol{\omega}$ = angular velocity (rad s$^{-1}$)
$\mathbf{A}$ = state transition matrix for continuous time
$\mathbf{A}_d$ = state transition matrix for discrete time
$\mathbf{A}_{1q}, \mathbf{A}_{1w}, \mathbf{A}_{2w}$ = state transition matrices of linearized attitude dynamics
$\mathbf{B}$ = input matrix for continuous time
$\mathbf{B}_d$ = input matrix for discrete time
$d_{\min}$ = minimum safe distance from the observer to the target spacecraft (m)
$d_{𝑈}$ = desired sensing distance from the observer to the target spacecraft (m)
ECI = Earth-Centered Inertial frame
FoV = sensor field of view (rad)
$H$ = information cost
$\mathbf{J}$ = inertia matrix of the observer spacecraft (kg m$^2$)
LVLH = Local-Vertical Local-Horizontal frame
$m$ = mass of the observer spacecraft (kg)
$n$ = mean motion of the target spacecraft (rad s$^{-1}$)
$N$ = number of deputies (or observers)
$\mathbf{p}$ = pose of sensor on the observer spacecraft
$\mathcal{P}$ = set of all sensor poses
PRO = Passive Relative Orbit
POI = Points Of Interest on the surface of the target spacecraft
$\mathbf{q}$ = attitude of the observer spacecraft as a quaternion
$\bar{\mathbf{q}}$ = nominal attitude trajectory of the observer spacecraft
$\mathbf{s}$ = sampled point of interest on the target spacecraft
$\mathcal{S}_i(t)$ = set of visible POIs for spacecraft $i$ at time $t$
$𝑋_i(t)$ = adaptive fault threshold for spacecraft $i$ at time $t$
𝑌 = time step (s)
$\mathbf{u}$ = control input for relative orbit dynamics (N or N m)
$\mathcal{U}$ = convex control constraint set
𝑍 = variance from prior model
$\mathbf{x}_i$ = state of the $i$th observer spacecraft
$\bar{\mathbf{x}}_i$ = nominal trajectory of the $i$th observer spacecraft
$(x, y, z)$ = relative orbit coordinates in the LVLH frame

I. Introduction

Multi-spacecraft systems offer a new class of missions that are more flexible and adaptive than traditional single-spacecraft missions for near-Earth as well as deep-space applications. Several past missions have successfully utilized and demonstrated the benefits of such multi-spacecraft system architectures. For example, the Afternoon Constellation (A-Train), a formation that included satellites such as Aqua [1], Aura [2], PARASOL [3], CloudSat [4], CALIPSO [5], GCOM-W1 [6], and OCO-2 [7], leveraged distributed simultaneous observations to study the Earth's climate. By flying a train of heterogeneous instruments in close succession, the constellation enabled the ground-based fusion of near-simultaneous observations to create comprehensive, multi-dimensional views of Earth's climate system. On the other hand, missions such as GRACE [8] and GRACE-FO [9, 10] launched two identical spacecraft flying in tandem as a single, distributed instrument. By precisely measuring minute changes in their relative separation, these missions created unprecedented maps of Earth's time-varying gravity field.

While the aforementioned missions demonstrated the immense scientific potential of distributed architectures, the interaction between the spacecraft in the system was minimal. For example, the multiple satellites in the Afternoon Constellation operated independently, and coordination among satellites was primarily a ground-managed station-keeping exercise to maintain a loose formation in the orbit. GRACE and GRACE-FO had a more tightly coupled system that performed tandem flying based on continuous, high-precision inter-satellite ranging. Nevertheless, both were following predefined tasks with no component of onboard collaborative autonomous decision-making that would act based on the state of other spacecraft.
Fig. 1 The Afternoon Constellation consisted of multiple satellites, including Aqua, Aura, CALIPSO, CloudSat, and OCO-2, that closely followed one another in the same orbital track (A-Train). The overlapping instrument swaths enabled synergistic, multi-dimensional observations of Earth's atmosphere, but the inter-spacecraft coordination was limited to ground-managed station-keeping within predefined orbital boxes [11]. Credit: NASA.

Fig. 2 The GRACE-FO twin-satellite system. The mission measured changes in Earth's gravity field by precisely tracking minute variations in the distance between the two spacecraft using a microwave and laser ranging system. This architecture transformed the spacecraft pair into a single, distributed scientific instrument [12]. Credit: NASA/JPL-Caltech.

In recent years, rapid progress in hardware miniaturization and intelligent autonomy [13] has sparked increased interest in developing multi-spacecraft missions with far more complex interactions and coordination among the spacecraft members. Upcoming missions such as CADRE [14–16] and SunRISE [17, 18] significantly leverage the multi-spacecraft system architecture and high-level autonomy to enable science objectives through distributed observations. The Cooperative Autonomous Distributed Robotic Exploration (CADRE) mission, for example, will deploy a team of three suitcase-sized rovers to the lunar surface to demonstrate multi-agent autonomy. Using a hierarchical, on-board autonomy system, the rovers will cooperatively plan and execute tasks without direct human control and with minimal ground-in-the-loop interventions. A key demonstration will be a multi-static ground-penetrating radar (GPR) survey, where the rovers must drive in a precise geometric formation to create a 3D map of the lunar subsurface.
In this modality, the rover team itself functions as a single, distributed scientific instrument, where the GNC system's ability to maintain the formation is directly coupled to the quality of the scientific data, exemplifying a tightly coupled, task-aware system.

Fig. 3 Three CADRE (Cooperative Autonomous Distributed Robotic Exploration) rovers in a clean room at NASA Jet Propulsion Laboratory. The mission will demonstrate multi-agent autonomy by having the rovers cooperatively perform tasks, such as driving in formation to create a 3D subsurface map with ground-penetrating radar. Credit: NASA/JPL-Caltech [19].

In contrast, the Sun Radio Interferometer Space Experiment (SunRISE) mission will utilize a constellation of six 6U form factor CubeSats in a passive formation in Earth orbit to study solar activity. Together, these spacecraft will form a single, 10-km-wide virtual radio telescope through the technique of aperture synthesis. Each CubeSat acts as an independent antenna, and their data is combined on the ground to image low-frequency solar radio bursts, phenomena that are unobservable from Earth due to ionospheric blockage. This distributed architecture is the enabling technology for the mission, as it overcomes the fundamental physical limitation of building and deploying a monolithic 10-km structure in space. SunRISE exemplifies a loosely coupled system where complexity is shifted from onboard control to ground-based data processing and precise position knowledge.

While a multi-spacecraft mission architecture inherently offers robustness to faults through redundancy and the ability to reconfigure task allocation, this robustness is not absolute. The increased system-level complexity, arising from the tight coupling between multiple agents, their subsystems, and the shared mission objectives, creates numerous potential points of failure.
Moreover, these systems depend on distributed communication and control architectures that, while enabling coordinated operation, can also act as conduits for fault propagation across the network. For example, in a leader-follower formation, a significant disturbance in the motion trajectory of even a single follower (caused by actuator degradation, sensor miscalibration, or environmental perturbations) can propagate through the control coupling between agents, leading to distortion of the formation geometry and loss of coordinated coverage. Similarly, in distributed sensing missions, a single agent exhibiting adversarial, faulty, or biased sensing behavior can corrupt the shared state estimates in neighboring agents through the underlying consensus or data-fusion framework, leading to systemic degradation in situational awareness. In extreme cases, such faults can create positive feedback loops in the estimation or control process, amplifying their effects and potentially leading to mission failure. Additionally, the heterogeneous nature of multi-spacecraft systems, where agents may differ in sensing modalities, actuation capabilities, and onboard autonomy, means that faults can manifest in mission-specific ways, such as loss of a critical sensing mode, misalignment of a high-gain antenna, or degradation of relative navigation accuracy. Because mission performance often depends on the coordinated execution of tasks (e.g., maintaining stable passive relative orbits, synchronized imaging, or cooperative manipulation), even faults in a subset of agents can have disproportionate impacts on global mission objectives.

Fig. 4 Artist's concept of the SunRISE (Sun Radio Interferometer Space Experiment) mission. Six 6U form factor CubeSats will fly in a passive formation, acting as a single, 10-km-wide virtual radio telescope to study low-frequency emissions from solar storms, which are unobservable from Earth [20]. Credit: NASA.

These vulnerabilities highlight the necessity for a Fault Detection, Isolation, and Recovery (FDIR) architecture that spans both the network and individual-agent levels. Such an architecture must rapidly detect anomalies, accurately identify their source and type (e.g., actuator, sensor, or task-specific performance faults), and execute recovery strategies, including re-tasking, dynamic reconfiguration of formations, or relaxation of performance requirements, to ensure graceful mission degradation rather than abrupt failure. In this context, task-aware FDIR approaches, which link fault metrics directly to mission objectives, offer a principled way to maintain operational effectiveness even in the presence of persistent or cascading faults. To this end, we present an information-driven approach that incorporates both network-level task and agent-level system performance. In the following sections, we present our recent work, outlining its key contributions and situating its importance within the broader context of multi-spacecraft fault detection and autonomous mission resilience.

A. Related Work

In the field of Fault Detection and Identification (FDI), a fault is defined as a deviation of the system from its nominal behavior [21]. Fault detection refers to identifying the occurrence of a fault, while fault identification determines the type and magnitude of the fault that has occurred. Faults cause anomalous behavior of the system and can lead to failure of the mission as the system's ability to perform a required function is interrupted.
The primary goals of FDI algorithms are to detect faults in finite time and to precisely locate and identify them in order to prevent further propagation and system failure.

At the level of individual agents, sensor faults, actuator faults, and system parameter degradation are the commonly encountered fault types [22]. Sensor faults involve the corruption of data from sensors such as star trackers, gyros, or GPS receivers. The corruption can manifest in various forms, including bias, drift, loss of precision, or even complete malfunction. Actuator faults lead to the inability of a spacecraft to correctly apply control actions, such as thruster lock-in-place or motor failure. System parameter degradation refers to changes in the physical properties of the system (e.g., resistance, stiffness, material stress) that lead to an altered response of the system to nominal control inputs. For multi-agent systems, propagation of faulty data, packet delay, packet loss, and network link failure are common types of faults that can occur at the network level, leading to fault propagation and cascading failure (even collision between the agents in the network).

FDI algorithms are typically developed by first modeling the nominal behavior of the system, then formulating a fault-sensitive metric and defining an associated threshold to distinguish between nominal and off-nominal conditions. In distributed multi-agent systems, the evolution of a given state variable depends not only on the system dynamics and control inputs but also on information received from neighboring agents. Within a consensus framework, all agents are required to agree on a set of predefined consensus variables to accomplish a shared global task. However, these consensus variables may not always be directly observable.
For example, in online rendezvous and trajectory planning applications, agents may need to achieve consensus in real time on mission-specific parameters, such as flight time [23, 24], that are inherently partially observable and cannot be measured by external sensors. This challenge complicates the design of robust FDI algorithms, as faults must be inferred from indirect indicators rather than directly measured variables.

For observable state (or consensus) variables, most works in the literature develop FDI methods using state estimator-based approaches. For example, in [25], the authors proposed a local FDI algorithm for each agent to detect sensor faults. Their method employs a partition-based Luenberger estimator, where each agent estimates a specific component of the global state vector. Since the local dynamics are coupled with the states of neighboring agents, each agent computes a residual vector to determine whether a sensor fault has occurred. The detection threshold in this approach is defined as a function of the time-varying error covariance matrix. A more practical FDI algorithm was presented in [26], which accounts for measurement noise and communication delays. However, the use of a time-varying threshold in this method restricts detection to faults that lie outside the feasible set of state variables. In another approach, [27] applied Bayesian analysis to identify and exclude outlier measurements from the sensor network before performing FDI. In that work, the detection thresholds were determined empirically, rather than being derived from system dynamics or estimation uncertainty.

When state estimation is not a feasible approach for generating local residual vectors, FDI algorithms can instead be designed to analyze statistical trends in the data exchanged between agents.
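As a point of reference for the estimator-based residual tests discussed above, the following minimal sketch runs a discrete-time Luenberger observer on a toy double integrator and flags the first sample whose residual norm exceeds a threshold. It is not the partition-based scheme of [25]: the system matrices, the observer gain `L`, the injected bias, and the fixed threshold are all illustrative assumptions, and the demonstration is noise-free so that the residual is exactly zero before the fault.

```python
import numpy as np

def luenberger_residuals(A, B, C, L, x0_hat, inputs, measurements):
    """Run a discrete-time Luenberger observer and return residual norms."""
    x_hat = np.asarray(x0_hat, dtype=float)
    norms = []
    for u, y in zip(inputs, measurements):
        r = y - C @ x_hat                   # residual between measurement and prediction
        norms.append(float(np.linalg.norm(r)))
        x_hat = A @ x_hat + B @ u + L @ r   # observer update
    return np.array(norms)

def detect_fault(residual_norms, threshold):
    """Return the first index whose residual norm exceeds the threshold, else None."""
    idx = np.flatnonzero(residual_norms > threshold)
    return int(idx[0]) if idx.size else None

# Hypothetical toy system: double integrator with a position measurement.
dt = 0.1
A = np.array([[1.0, dt], [0.0, 1.0]])
B = np.array([[0.0], [dt]])
C = np.array([[1.0, 0.0]])
L = np.array([[0.5], [0.5]])        # assumed gain; A - L C has eigenvalues inside the unit circle

x = np.array([0.0, 1.0])
inputs, measurements = [], []
for k in range(60):
    u = np.zeros(1)
    y = C @ x
    if k >= 30:
        y = y + 0.5                 # assumed sensor bias fault injected at step 30
    inputs.append(u)
    measurements.append(y)
    x = A @ x + B @ u

norms = luenberger_residuals(A, B, C, L, [0.0, 1.0], inputs, measurements)
first_alarm = detect_fault(norms, threshold=0.1)
```

In a realistic setting the threshold would have to account for measurement noise and estimation uncertainty, which is the role of the covariance-dependent threshold described in the text.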
Fault and adversary detection methods that do not rely on state estimation are generally applicable to a wide range of distributed optimization frameworks. Broadly, these methods can be classified into two categories: (i) approaches that exploit redundancy in network topology to guarantee resilience, and (ii) approaches that rely on statistical analysis of shared data to detect anomalies. In the first category, [28, 29] demonstrated that, in the presence of 𝑑 adversarial nodes, consensus can still be achieved if each regular node has at least (2𝑑 + 1) neighbors. Their proposed algorithm implements a local filtering technique that discards information from the 2𝑑 neighbors whose values lie at the extremes, thereby reducing the influence of adversarial agents. However, this method guarantees convergence only within the convex hull of the minimizers of the regular nodes. The local filtering concept has also been applied in distributed state estimation problems [30, 31]. Despite its theoretical guarantees, ensuring topological redundancy in practice is challenging due to increased communication overhead, especially in bandwidth-constrained environments. Moreover, the resilience of these methods does not scale proportionally with network size, as the allowable number of adversarial agents does not grow with the total number of nodes [32, 33].

Given the limitations of topology-based methods, analyzing the statistical trends of shared data to identify adversarial nodes offers a more promising and scalable alternative. In this context, [34, 35] propose a gradient-based metric for detecting malicious agents. Specifically, [35] considers an attack scenario in which the local objective functions of adversarial nodes are arbitrarily modified, while [34] addresses false data injection attacks.
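The local filtering idea attributed to [28, 29] above can be illustrated with a simplified scalar sketch: a node sorts the values received from its neighbors, discards the largest and smallest ones, and averages what remains with its own value. The function name, the plain averaging rule, and the parameter `num_adversaries` (the assumed bound on adversarial neighbors) are illustrative choices, not the exact update used in those references.

```python
import numpy as np

def trimmed_consensus_step(own_value, neighbor_values, num_adversaries):
    """One filtered consensus update for a scalar state.

    Drops the num_adversaries largest and num_adversaries smallest neighbor
    values, then averages the survivors together with the node's own value.
    """
    vals = np.sort(np.asarray(neighbor_values, dtype=float))
    if vals.size < 2 * num_adversaries + 1:
        raise ValueError("need at least 2 * num_adversaries + 1 neighbors")
    kept = vals[num_adversaries:vals.size - num_adversaries]
    return float(np.mean(np.append(kept, own_value)))

# Two of the five neighbors report extreme values; with a bound of one
# adversary per side, both extremes are discarded before averaging.
updated = trimmed_consensus_step(1.0, [0.9, 1.1, 1.0, 100.0, -50.0], num_adversaries=1)
```

The sketch also makes the limitation noted in the text visible: honest neighbors whose values happen to be extreme are discarded as well, and the number of neighbors required grows with the assumed number of adversaries.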
In both [34] and [35], each node maintains a score for its neighbors by estimating their gradients over time and progressively severs links with neighbors whose scores consistently exceed those of the rest. The detection and isolation algorithms in [34, 35] are demonstrated empirically; however, despite providing intuitive justification for using a gradient-based metric, these works offer no formal theoretical guarantees on convergence. Furthermore, the resilient algorithm in [35] is designed for directed graphs and requires each node to share its data with exactly one neighbor per time step, a constraint that significantly slows convergence.

Statistical trend analysis has also been widely applied in the distributed estimation problem. For example, [36] develops a method for detecting faulty data injection in a single-sensor system using a Chi-squared test on the measurement innovation. Here, the attacker employs a linear deception strategy to preserve the statistical properties of the measurements, and the authors derive an optimal detection strategy for this setting. This work was later extended to the multi-sensor case [37, 38], considering a centralized remote state estimation system and proposing detection schemes for different attack scenarios. However, a key limitation of these methods is that they do not address how to select or adapt the detection threshold required to classify a node as adversarial [36–39].

To this end, a fault detection, isolation, and recovery (FDIR) architecture is essential for accommodating potential faults at both the network and individual agent levels, allowing the mission to continue with graceful degradation. In this work, we present a new FDI method, as shown in Fig. 5, that detects failures at the network level using an abstraction of the global task objective $\mathcal{H}$ and local sensing information, and informs the agent-level FDIR algorithm to perform the necessary actions for recovery. For example, a fault at the network level could be caused by communication loss, a global task sensor fault (for inspection tasks), or a global task actuator fault (for on-orbit construction), while a fault at the agent level could be due to thruster or reaction wheel issues. We propose a simulation vs. real comparison using $\mathcal{H}$ and its higher-order gradients. We detect a fault by computing the off-nominal deviation from the expected global task objective $\mathcal{H}$, and we monitor the individual task using a residual vector sensitive to the agent's faults. The global task objective $\mathcal{H}$ is designed to be a function of the state of the agents in the network and a model of the task sensor (for inspection) or actuator (for construction). The residual vector is a function of the local relative state estimates of the agents in the network. We propose a metric that computes the deviation of $\mathcal{H}$ from the nominal performance, which is defined from empirical simulations of the system. In field operations, the nominal performance is specified by the user. The deviation serves as an indicator of faults at the network level, and higher-order derivatives of $\mathcal{H}$ are used to infer agent-level faults, as shown in Fig. 5.

Fig. 5 Global task-aware fault detection and isolation for a distributed spacecraft network. (Diagram blocks: Information-Based GNC (Sim) and Information-Based GNC (Real); Inspection Cost (Simulation) and Inspection Cost (Real); Threshold Metric; Fault Detector and Identifier, with outputs Failed Agent IDs, Actuator, and Sensor; Global Task-Awareness.)
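The simulation-versus-real comparison described above can be sketched as a simple residual test on two cost histories. This is only an illustration under stated assumptions: the function name, the fixed symmetric threshold, and the sample numbers are hypothetical, the chapter's actual metric also uses higher-order gradients of the cost and an adaptive threshold, and the labels assume that the cost is minimized, so that an observed cost above the simulated one indicates deteriorating performance.

```python
import numpy as np

def detect_global_fault(h_sim, h_real, threshold):
    """Compare simulated (nominal) and observed cost histories sample by sample.

    Returns the residual h_real - h_sim and a label per sample:
    'deteriorating' if the observed cost exceeds the nominal one by more than
    the threshold, 'improved' if it falls below by more than the threshold,
    and 'nominal' otherwise.
    """
    h_sim = np.asarray(h_sim, dtype=float)
    h_real = np.asarray(h_real, dtype=float)
    residual = h_real - h_sim
    labels = np.where(residual > threshold, "deteriorating",
                      np.where(residual < -threshold, "improved", "nominal"))
    return residual, labels.tolist()

# Hypothetical cost histories for four fusion steps.
h_sim = [10.0, 8.0, 6.0, 4.0]
h_real = [10.0, 8.1, 7.5, 2.0]
residual, labels = detect_global_fault(h_sim, h_real, threshold=0.5)
```

Both signs of the residual are reported because, as discussed later in the chapter, a fault can also show up as an unexpected improvement of the global task metric.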
Prior work [27, 40] includes FDI architectures for distributed sensor networks that utilize a local decentralized observer to detect internal agent sensor or actuator failures. Recent work [41, 42] uses adaptive or reconfiguration control with a minimal notion of network task to achieve fault tolerance. We instead focus on incorporating global task objectives to inform the local FDIR, enabling it to respond appropriately and continue the mission autonomously to the best possible extent of the network's capability, while maximizing the global task objective $\mathcal{H}$. Furthermore, while most of the earlier work [25, 26] on FDI for distributed systems focuses on simple linear dynamical systems, we focus on low Earth orbit formation-flying dynamics, which include periodic orbits.

Main Contributions. The main contributions of this work are as follows:
1) we propose an architecture for FDI in a multi-agent spacecraft system that integrates global-task objectives and local agent-level behaviors for task awareness,
2) we derive a global cost functional that is decomposable into cost functions that inform local progress and intermediate consensus on global progress,
3) we propose a novel FDI metric based on the global cost $\mathcal{H}$ and the higher-order derivatives of $\mathcal{H}$ to detect and identify both global and local faults.

We apply our FDI architecture to a recently proposed multi-agent collaborative spacecraft inspection mission [43] in low Earth orbit to detect failures in the inspection sensors and individual agent sensing. The cost function $\mathcal{H}$ defines the global inspection progress by fusing individual agent sensor data measurements. The inspection data fusion runs at a fixed frequency $f_g$, and the network fault diagnosis is run at frequency $f_{\mathrm{FDI}}$. We assume that agents communicate with each other only when within the communication radius, leading to a time-varying communication topology and sensing graph.
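The communication assumption above (agents exchange data only when within the communication radius) can be made concrete with a short sketch that rebuilds the adjacency matrix from the current positions. The function name, the positions, and the radius are hypothetical; evaluating the function at each time step yields the time-varying topology mentioned in the text.

```python
import numpy as np

def communication_adjacency(positions, comm_radius):
    """Boolean adjacency matrix: True where two distinct agents are within range."""
    positions = np.asarray(positions, dtype=float)
    diff = positions[:, None, :] - positions[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)      # pairwise distances
    adj = dist <= comm_radius
    np.fill_diagonal(adj, False)              # no self-links
    return adj

# Three agents on a line (meters); only the first two are within 100 m of each other.
positions = [[0.0, 0.0, 0.0], [50.0, 0.0, 0.0], [200.0, 0.0, 0.0]]
adj = communication_adjacency(positions, comm_radius=100.0)
```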
The proposed method is capable of handling the time-varying graph and intermittent communication. We demonstrate that the proposed method can detect and identify the faults while keeping track of the global task. This approach is essential to inform the recovery procedure, described in our recent work [44], for designing new orbits and pointing trajectories to complete the mission.

Organization. The remainder of this book chapter is organized as follows. Section II begins by formally defining the problem of creating an information-cost architecture to detect both global and local faults. It outlines the design reference mission for collaborative on-orbit inspection and introduces the information-based Guidance, Navigation, and Control (GNC) framework with details on the information gain cost function, $\mathcal{H}$, used to track mission progress. Section III presents the core of our proposed FDI framework. It details how the global cost functional is decomposed to monitor individual agent performance and derives the key fault metric, which works by comparing an agent's real-time information contribution to its predicted nominal value. The design of an adaptive threshold to reliably distinguish faults from system noise is also explained. Section IV provides illustrative examples of actuator and sensor faults, while Section V presents detailed simulation results that validate the performance of our adaptive fault detection algorithm under various fault scenarios. These sections analyze the method's ability to correctly identify faulty agents and also discuss its current limitations. Finally, Section VI concludes the book chapter by summarizing the key contributions, and Section VII, the Appendix, provides a detailed review of the relative dynamics and the sequential convex programming approach for orbit initialization, reconfiguration, and attitude trajectory generation.

II. Problem Description and Preliminaries

A. Problem Description

The objective of this work is to develop an information-cost architecture that detects both global and local faults in a multi-spacecraft system engaged in global tasks, such as on-orbit inspection and on-orbit construction. The categories of global behavior faults and local faults identified using the proposed approach are illustrated in Fig. 6. For the inspection task, a global behavior fault can result in either deteriorating or improved mission performance. Improved performance may occur in two scenarios: (i) a controller failure at the agent level that unintentionally enhances the exploration of the inspection target, or (ii) spurious signals received from neighboring agents that incidentally improve coverage. Conversely, deteriorating performance typically arises from faults such as degradation of the inspection sensor or failures in the spacecraft's pointing controller. The detected behavior faults serve as inputs to the fault identification module, which determines both the faulty agent and the specific fault type. The fault detection and identification (FDI) problem addressed in this work is described in the following:

Problem 1. Given the nominal expected global task performance $\mathcal{H}_{\mathrm{nom}}$ and the real-time performance $\mathcal{H}$, detect the global and agent-level faults in the multi-spacecraft system performing a global task (collaborative inspection or construction). The global and agent-level fault tree is described in Fig. 6.

Fig. 6 An overview of the types of global and local faults detected and identified using the proposed FDI architecture, metrics, and threshold. (Diagram labels: Global Faults, detected with the Global Task Cost Metric as Deteriorating Performance or Improved Performance; Local Faults, identified with the High-Order Cost Gradient based Metric as Global Task Sensor/Actuator, Faulty Agent ID, Control Actuators, and On-Board Sensor; both branches feed FDIR.)
In Table 2, we summarize potential faults that may occur in a distributed spacecraft system executing cooperative missions such as on-orbit inspection or construction. These faults are classified as global faults and local faults, depending on their origin and manifestation during mission execution.

Table 2 Commonly observed faults in a distributed spacecraft system.
Global Faults                  Local Faults
Propagation of faulty data     Sensor faults
Packet delay                   Actuator faults
Packet loss                    System parameter degradation
Network link failure           Physical component failure

Global faults refer to anomalous behaviors that can be detected by monitoring global task performance, that is, performance metrics shared across all agents and directly tied to mission-level objectives. Many global faults originate at the network layer of individual agents. Conventional network security protocols [45] are effective at detecting certain classes of network anomalies such as packet loss, jitter, or latency. However, they are often ineffective at detecting faulty data propagation, where an agent broadcasts erroneous or adversarial information that still conforms to networking protocols. This type of fault is particularly insidious in multi-agent space systems because it can subtly corrupt the collective decision-making process without triggering traditional alarms.

The proposed information-cost-based metric is well suited for detecting such faults. This metric quantifies the contribution of each agent's actions and measurements to the progress of the collective task, as defined by a global cost functional $\mathcal{H}$. By comparing the expected evolution of $\mathcal{H}$ under nominal operation with its actual evolution, the framework can flag anomalies that indicate a degradation or unexpected improvement in global task performance.
This enables the detection of f aults that are other wise in visible to traditional network monitor ing, since the anomaly is inf er red from the mission’ s information ﬂow and utility rather than from ra w pack et statistics. Local faults originate within an individual agent ’ s subsys tems—such as its sensors, actuators, or internal computation modules—and can degrade the performance of the global task. In a cooperative spacecraft mission, a deviation in the e xpected global task progress can often be attr ibuted to a speciﬁc ag ent b y analyzing its marginal contribution to H o ver time. This allow s the same information-cost frame w ork used for global f ault detection to be extended naturall y to fault localization and identiﬁcation. In this w ork, we f ocus speciﬁcally on detecting and identifying sensor and actuator f aults at the ag ent lev el using the proposed inf or mation-cost metr ic. These f aults directly inﬂuence an agent ’ s ability to contribute accurate and actionable data to the collectiv e task and are thus highly relev ant to maintaining mission objectives. Other classes of local faults—suc h as slow system parameter degradation, structural damag e, or failures in po wer/thermal subsy stems—can often be more e " ectiv ely addressed using complementary approaches, such as signal-based FDI methods [ 46 , 47 ]. By integrating global and local fault detection within a single inf ormation-cost-based frame work, the proposed approach pro vides a uniﬁed, task -aw are FDI architecture that captures faults arising from both netw ork -lev el data integr ity issues and agent-lev el performance deg radations, enabling graceful degradation and reco very in distributed spacecraft missions. Ho we ver , the proposed information-cost based metric can be used to detect such a global f ault by k eeping track of the progress of the collectiv e task. Local f aults, which originate in an ag ent’ s sy stem/architecture, a " ect the perf or mance of the global task. 
Therefore, a change in the expected performance of the global task can be traced back to a particular faulty agent in the network. While the information-cost based metric can be used to identify agent-level faults, we only discuss the detection and identification of sensor and actuator faults using the proposed metric. This is because faults such as system parameter degradation and physical component failures can be effectively detected using other methods, such as signal-based FDI approaches [46, 47]. In the following sections, we describe the design reference mission, an overview of the information-cost optimal control problem, and the information-based GNC architecture.

B. Design Reference Mission: On-Orbit Collaborative Inspection

In this section, we discuss the concept of operations of a typical Earth-orbit inspection mission, with the target spacecraft as an example, along with the preliminaries for the proposed architecture. The scenario considered in this paper has three phases, as shown in Fig. 7. In the first phase, the small observer spacecraft are deployed from the target spacecraft and begin a drift phase. In the second phase, the drifting spacecraft are inserted into a parking PRO or an initial PRO. In the third phase, the spacecraft in stable relative orbits are used for inspecting the target. As needed, the spacecraft reconfigure to a new set of PROs to inspect a previously unobserved surface area on the target spacecraft.

In this paper, we use the Hill-Clohessy-Wiltshire (HCW) equations to describe the relative orbital dynamics of the observer CubeSats. For the stable relative orbit initialization and reconfiguration phase, we formulate an optimal control problem with an $L_1$ fuel cost, with safety and energy matching as constraints, and solve it using sequential convex programming (SCP), similar to prior work [48].
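As a point of reference for the relative dynamics named above, the following minimal sketch evaluates the closed-form HCW solution and checks the standard HCW no-drift condition $\dot{y}_0 = -2 n x_0$, which yields a bounded (passive) relative orbit in this linear model. The axis convention (x radial, y along-track, z cross-track), the function name `hcw_state`, and the numerical values are assumptions made for illustration; the energy matching constraint and the SCP formulation actually used are those of [48], not this sketch.

```python
import math

def hcw_state(state0, n, t):
    """Closed-form HCW solution (x radial, y along-track, z cross-track).

    state0 = (x, y, z, vx, vy, vz) at t = 0; n is the target mean motion [rad/s].
    """
    x0, y0, z0, vx0, vy0, vz0 = state0
    s, c = math.sin(n * t), math.cos(n * t)
    x = (4 - 3 * c) * x0 + (s / n) * vx0 + (2 / n) * (1 - c) * vy0
    y = (6 * (s - n * t) * x0 + y0 - (2 / n) * (1 - c) * vx0
         + ((4 * s - 3 * n * t) / n) * vy0)
    z = c * z0 + (s / n) * vz0
    vx = 3 * n * s * x0 + c * vx0 + 2 * s * vy0
    vy = -6 * n * (1 - c) * x0 - 2 * s * vx0 + (4 * c - 3) * vy0
    vz = -n * s * z0 + c * vz0
    return (x, y, z, vx, vy, vz)

n = 0.0011                    # assumed LEO mean motion [rad/s]
period = 2 * math.pi / n      # one orbit of the target
x0 = 10.0                     # assumed radial offset [m]

bounded = (x0, 0.0, 5.0, 0.0, -2 * n * x0, 0.0)   # satisfies the no-drift condition
drifting = (x0, 0.0, 5.0, 0.0, 0.0, 0.0)          # violates it

# Along-track displacement accumulated over one orbit
drift_bounded = hcw_state(bounded, n, period)[1] - bounded[1]
drift_unbounded = hcw_state(drifting, n, period)[1] - drifting[1]
print(f"bounded orbit drift:   {drift_bounded:.3e} m")
print(f"unbounded orbit drift: {drift_unbounded:.3f} m")
```

With the no-drift initial condition the secular along-track term cancels, so the observer returns to its starting point after one orbit; without it the observer drifts by $-12\pi x_0$ per orbit.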
The planned trajectories are tracked using a model predictive control formulation of the convexified problem. During the inspection phase, we represent the attitude dynamics using quaternions [49]. The attitude planning is done using a combination of slerp interpolation [50] and SCP with a norm constraint on the quaternions, as described in the Appendix. We use an existing nonlinear feedback controller for attitude tracking [51]. In the following, we review the information-based optimal control problem, the HCW equations, the energy matching condition for stable relative orbits, and the convexification of optimal control problems for relative orbit and attitude motion planning.

Fig. 7 Three phases of the collaborative inspection design reference mission [43].

C. Overview of the Information-Based GNC Architecture

In this section, we give a brief overview of the collaborative low Earth orbit inspection framework proposed in our earlier work [43, 52]. Using this framework, we design optimal passive relative orbits (PROs) [52, 53] and attitude trajectories for $N$ observer spacecraft inspecting $M$ points of interest (POIs) on a target spacecraft by solving the following information-based optimal control problem.

Problem 2. Information-Based Optimal Control Problem
\[
\min_{\mathbf{p},\,\mathbf{u}} \int_{0}^{t_f} \Big[ \sum_{j=1}^{M} \mathcal{H}(\mathbf{p}, \mathbf{s}_j) + \sum_{i=1}^{N} \lVert \mathbf{u}_i \rVert \Big] \, dt \tag{1}
\]
subject to
\begin{align}
&\text{Dynamics Model:} && \dot{\mathbf{p}}_i = \mathbf{f}(\mathbf{p}_i, \mathbf{u}_i), \notag \\
&\text{Safe Set:} && \mathbf{p}_i \in \mathcal{P}, \quad \forall i \in \{1, \dots, N\}, \notag \\
&\text{Inspection Sensor Model:} && \mathbf{z}_{i,j} = h(\mathbf{p}_i, \mathbf{s}_j) + \eta, \quad \eta \sim \mathcal{N}\big(0, \Sigma_h(\mathbf{p}_i, \mathbf{s}_j)\big), \tag{2} \\
&\text{Points of Interest:} && \mathbf{s}_j, \quad \forall j \in \{1, \dots, M\}, \tag{3}
\end{align}
where $\sum_j \mathcal{H}(\mathbf{p}, \mathbf{s}_j)$ is the information cost, $\sum_i \lVert \mathbf{u}_i \rVert$ is the fuel cost, $\mathbf{p}_i$ is the full pose of the $i$th observer spacecraft, and $\mathbf{s}_j$ is the full pose of the $j$th POI on the target spacecraft. The inspection sensor model in Eq. (3) outputs the value of interest $\mathbf{z}_{i,j}$ when the $i$th observer with pose $\mathbf{p}_i$ is inspecting a POI at $\mathbf{s}_j$.
Minimizing the information cost $\sum_j \mathcal{H}(\mathbf{p}, \mathbf{s}_j)$ ensures that the inspection task is completed. We decompose Problem 2 to derive a hierarchical GNC algorithm (for details, refer to our earlier work [43]). The hierarchical algorithm uses the information cost and the sensor model to select the informative PROs and the attitude pointing vector for each agent. We optimize the informative PROs and attitude plan for optimal orbit insertion, reconfiguration, and attitude tracking using an optimal control problem formulation that computes the minimum-fuel trajectory using sequential convex programming. The detailed sequential convex programming formulations are provided in our prior work [43]. In this work, we use the information cost to keep track of the task progress and to detect off-nominal behavior of the multi-agent system and of the individual agents. We describe the cost functional used to compute the information gain in the following.

D. Information Gain

To quantify the information, a prior model of the target spacecraft is used along with sampled points of interest (POIs) on the surface of the spacecraft. The cost function $\mathcal{H}$ is designed to minimize the total variance on the knowledge of the POIs. We use the cost function $\mathcal{H}$ designed in [54], which is a function of the POIs as given below:
\[
\mathcal{H}_{\mathrm{POI}}(\mathbf{s}) = \Big( w^{-1} + \sum_{\mathbf{p} \in \mathcal{P}} \sigma(\mathbf{p}, \mathbf{s})^{-1} \Big)^{-1}, \qquad
\mathcal{H} = \sum_{\mathbf{s} \in \mathrm{POIs}} \mathcal{H}_{\mathrm{POI}}(\mathbf{s})\, \phi(\mathbf{s}), \tag{4}
\]
where $\mathbf{s} \in \mathbb{R}^3$ is a POI on the target spacecraft's surface, $w \in \mathbb{R}$ is the initial variance based on the prior model of the target spacecraft, $\mathbf{p} \in SE(3)$ is the pose of a sensor mounted on a spacecraft (such as a camera), $\mathcal{P}$ is the set of all sensor poses, $\sigma(\mathbf{p}, \mathbf{s})$ estimates the variance of estimating the POI at $\mathbf{s}$ with the sensor at $\mathbf{p}$, and $\phi(\mathbf{s}) \in \mathbb{R}$ is the relative importance of POI $\mathbf{s}$. The function $\sigma(\cdot, \cdot)$ corresponds to information per pixel.
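The fusion rule in Eq. (4) can be evaluated directly once the per-sensor variances are available. The sketch below is a minimal illustration, not the authors' implementation: the function name `information_cost`, the array layout (one row per sensor pose, one column per POI), and the convention of passing an infinite variance for a POI that a sensor cannot observe are assumptions of this example.

```python
import numpy as np

def information_cost(sigma, w, phi):
    """Evaluate Eq. (4) for a set of sensor poses and POIs.

    sigma : (n_poses, n_pois) array of measurement variances sigma(p, s);
            np.inf marks a POI that a pose cannot observe (assumed convention).
    w     : prior variance of each POI (scalar).
    phi   : (n_pois,) relative importance weights.
    Returns the total cost H and the per-POI fused variances H_POI.
    """
    sigma = np.asarray(sigma, dtype=float)
    phi = np.asarray(phi, dtype=float)
    # Independent measurements add in the information (inverse-variance) domain.
    fused_information = 1.0 / w + np.sum(1.0 / sigma, axis=0)
    h_poi = 1.0 / fused_information
    return float(np.sum(h_poi * phi)), h_poi

# Two sensor poses, two POIs; the first pose cannot see the second POI.
sigma = np.array([[4.0, np.inf],
                  [4.0, 1.0]])
total_cost, per_poi = information_cost(sigma, w=2.0, phi=[1.0, 0.5])
prior_cost, _ = information_cost(np.full((1, 2), np.inf), w=2.0, phi=[1.0, 0.5])
print(total_cost, per_poi, prior_cost)
```

Every additional observation can only lower the fused variance of a POI, so the cost starts at the importance-weighted prior variance and decreases as the inspection proceeds.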
It incorporates sensor characteristics such as the current uncertainty of the spacecraft's pose estimate, the accuracy of the sensor based on the distance between $\mathbf{p}$ and $\mathbf{s}$, or the lighting conditions. Here, we use a simple RGB camera sensor and no environmental noise [54]:
\[
\sigma(\mathbf{p}, \mathbf{s}) =
\begin{cases}
\mathrm{dist}^2(\mathbf{p}, \mathbf{s}), & \mathbf{s} \text{ visible from } \mathbf{p}, \\
\infty, & \text{otherwise},
\end{cases} \tag{5}
\]
where $\mathrm{dist}(\mathbf{p}, \mathbf{s})$ is the Euclidean distance between POI $\mathbf{s}$ and pose $\mathbf{p}$. We compute $\sigma$ using visibility checking.

The offline solution to Problem 2 is used to predict the nominal system behavior in terms of the information-based cost $\mathcal{H}_{\mathrm{nom}}$ over a finite time interval (1 or 2 orbits). As described in Fig. 5, we precompute the nominal behavior $\mathcal{H}_{\mathrm{nom}}$ and compare it to the real-time behavior $\mathcal{H}$ over the time horizon $T$ as follows:
\[
\int_{0}^{T} \big( \mathcal{H} - \mathcal{H}_{\mathrm{nom}} \big) \, dt \;\geq\; \Delta\mathcal{H}_{\mathrm{threshold}}\, T. \tag{6}
\]
If the real-time value deviates from the nominal behavior by the threshold $\Delta\mathcal{H}_{\mathrm{threshold}}$, then a fault is detected. In the following section, we discuss how we modify the cost function to construct the FDI architecture in Fig. 5.

III. Global Task-Aware Fault Detection and Isolation

In this section, we describe the different components of the global task-aware fault detection and isolation algorithm that solves Problem 1. We first describe the derivation of the attribute required to detect a faulty spacecraft and then derive the gradient-based fault metric. Finally, we discuss the design of thresholds for detecting and identifying different types of faults.

A. Global Task Cost Functional

We consider a centralized monitoring system that utilizes only the information cost updates (and, implicitly, the variance updates of the POIs) shared by the individual spacecraft during the collaborative inspection process.
The objective is to detect off-nominal behavior and identify faults without requiring full state or raw sensor data to be exchanged between agents. At any time $t$, each spacecraft $i$ transmits its local information cost contribution $\mathcal{H}_i(t)$ to the central computing system. From (4), the centralized cost functional is
\[
\mathcal{H}(t) = \sum_{s \in \mathcal{S}} \mathcal{H}_{\mathrm{POI}}(s)\, \phi(s), \quad \text{where} \quad
\mathcal{H}_{\mathrm{POI}}(s) = \Big( w^{-1} + \sum_{p \in \mathcal{P}} \sigma(p, s)^{-1} \Big)^{-1}. \tag{7}
\]
Here, $\phi(s)$ is the importance weight of POI $s$, $w$ is the prior variance at $s$, and $\sigma(p, s)$ is the measurement variance contribution from agent $p$ observing POI $s$. Physically, $\mathcal{H}_{\mathrm{POI}}(s)$ represents the total fused information gain at $s$ after aggregating contributions from all observers, with independent measurements combining inversely in the variance domain.

Decomposition into Agent Contributions. Let
\[
\psi(s) = \frac{1}{\big( w^{-1} + \sum_{p \in \mathcal{P}} \sigma(p, s)^{-1} \big)^{2}}
\]
be a normalization factor dependent only on the consensus estimate of POI $s$. Then
\[
\mathcal{H}(t) = \sum_{s \in \mathcal{S}} \phi(s)\, \psi(s)\, w^{-1}
+ \sum_{p_i \in \mathcal{P}} \underbrace{\Big[ \sum_{s \in \mathcal{S}_i} \phi(s)\, \psi(s)\, \sigma(p_i, s)^{-1} \Big]}_{\mathcal{H}_i(t)}. \tag{8}
\]
Here, $\mathcal{H}_i(t)$ is the instantaneous marginal information contribution of spacecraft $i$. This decomposition:
• enables continuous tracking of individual agent performance without broadcasting full state/sensor data,
• supports fault detection by comparing $\mathcal{H}_i(t)$ to its nominal (expected) value.

B. Fault Metric

Let $\mathcal{H}^{\mathrm{pred}}_i(t)$ be the expected nominal contribution of agent $i$ at time $t$, computed via simulation or analytical models. Over a time interval $\Delta t$, define the fault detection metric:
\[
\mathcal{H}_{m_i}(t) = \left| 1 - \frac{\Delta \mathcal{H}_i(t)}{\Delta \mathcal{H}^{\mathrm{pred}}_i(t)} \right|
= \left| 1 - \frac{\mathcal{H}_i(t) - \mathcal{H}_i(t - \Delta t)}{\mathcal{H}^{\mathrm{pred}}_i(t) - \mathcal{H}_i(t - \Delta t)} \right|. \tag{9}
\]
Here:
• $\Delta \mathcal{H}_i(t)$ is the actual rate of information gain for agent $i$,
• $\Delta \mathcal{H}^{\mathrm{pred}}_i(t)$ is the expected nominal rate of information gain.

The fault detection rule is:
\[
\mathcal{H}_{m_i}(t)
\begin{cases}
= 0, & \text{no fault detected}, \\
> 0, & \text{fault detected}.
\end{cases} \tag{10}
\]
Defining $x := \Delta \mathcal{H}_i(t) / \Delta \mathcal{H}^{\mathrm{pred}}_i(t)$, we classify:
\[
\begin{cases}
x > 1, & \text{performance improved unexpectedly (over-contribution)}, \\
x < 1, & \text{performance degraded (under-contribution)}.
\end{cases}
\]

Table 3 Identifying spacecraft inspection performance under fault.
Fault Case | Condition
Deteriorating performance | $\mathrm{sign}(\Delta \mathcal{H}_i(t)) \neq \mathrm{sign}(\Delta \mathcal{H}^{\mathrm{pred}}_i(t))$; or $\mathrm{sign}(\Delta \mathcal{H}_i(t)) = \mathrm{sign}(\Delta \mathcal{H}^{\mathrm{pred}}_i(t))$ and $x < 1$
Improved performance | $\mathrm{sign}(\Delta \mathcal{H}_i(t)) = \mathrm{sign}(\Delta \mathcal{H}^{\mathrm{pred}}_i(t))$ and $x > 1$

C. Theoretical Analysis for Fault Detection and Identification

Theorem 1 (Fault Detection and Identification via Information-Cost Metric). Let $\mathcal{H}^{\mathrm{pred}}_i(t) > 0$ denote the nominal expected information-cost contribution of spacecraft $i$ at time $t$, computed under a fault-free model and assumed Lipschitz continuous over $[t - \Delta t, t]$. Suppose the following conditions hold:
1) (Non-degeneracy) $\Delta \mathcal{H}^{\mathrm{pred}}_i(t) \neq 0$ for all $t$ of interest.
2) (Distinct contribution profiles) For all $p_i \neq p_j$, there exists at least one $s \in \mathcal{S}$ such that $\phi(s)\, \psi(s)\, \sigma(p_i, s)^{-1} \neq \phi(s)\, \psi(s)\, \sigma(p_j, s)^{-1}$.
3) (Bounded perturbations) In nominal operation, $|\Delta \mathcal{H}_i(t) - \Delta \mathcal{H}^{\mathrm{pred}}_i(t)| \leq \epsilon_{\mathrm{nom}}$ for some known $\epsilon_{\mathrm{nom}} \geq 0$.
If, for some $\epsilon > \epsilon_{\mathrm{nom}}$,
\[
\left| \frac{\Delta \mathcal{H}_i(t)}{\Delta \mathcal{H}^{\mathrm{pred}}_i(t)} - 1 \right| > \epsilon
\]
at any $t$, then:
1) Fault detection. Spacecraft $i$ is faulty (global or local fault).
2) Fault classification. Let $r_i(t) := \Delta \mathcal{H}_i(t) / \Delta \mathcal{H}^{\mathrm{pred}}_i(t)$.
• If $r_i(t) > 1 + \epsilon$: performance has improved unexpectedly (over-contribution).
• If $r_i(t) < 1 - \epsilon$: performance has degraded (under-contribution).
3) Fault isolation.
Under condition 2, the faulty agent $i$ is uniquely identifiable from (8).

Proof. From the definition in (9):
\[
\mathcal{H}_{m_i}(t) = \left| 1 - \frac{\Delta \mathcal{H}_i(t)}{\Delta \mathcal{H}^{\mathrm{pred}}_i(t)} \right|.
\]
By Assumption 1, $\Delta \mathcal{H}^{\mathrm{pred}}_i(t) \neq 0$, so the ratio is well-defined.

Detection. In nominal operation, by Assumption 3:
\[
\left| \frac{\Delta \mathcal{H}_i(t)}{\Delta \mathcal{H}^{\mathrm{pred}}_i(t)} - 1 \right| \leq \frac{\epsilon_{\mathrm{nom}}}{|\Delta \mathcal{H}^{\mathrm{pred}}_i(t)|}.
\]
If $\mathcal{H}_{m_i}(t) > \epsilon > \epsilon_{\mathrm{nom}}$, this inequality is violated, implying that the deviation in $\Delta \mathcal{H}_i(t)$ from its nominal prediction exceeds the maximum allowed under fault-free conditions. Therefore, a fault must have occurred.

Classification. Define $r_i(t) := \Delta \mathcal{H}_i(t) / \Delta \mathcal{H}^{\mathrm{pred}}_i(t)$. If $r_i(t) > 1 + \epsilon$, the actual contribution exceeds nominal by more than the detection threshold, which is an over-contribution. If $r_i(t) < 1 - \epsilon$, the actual contribution falls short of nominal by more than the threshold, which is an under-contribution. Both cases indicate anomalous behavior, but of different types.

Isolation. From (8):
\[
\mathcal{H}(t) = \sum_{s \in \mathcal{S}} \phi(s)\, \psi(s)\, w^{-1} + \sum_{p_k \in \mathcal{P}} \mathcal{H}_k(t),
\]
where $\mathcal{H}_k(t)$ is the unique marginal contribution of agent $k$. By Assumption 2, each $\mathcal{H}_k(t)$ has a distinct functional dependence on $s$ through $\phi(s)\, \psi(s)\, \sigma(p_k, s)^{-1}$. Therefore, deviations in $\mathcal{H}_i(t)$ can be attributed unambiguously to agent $i$. $\blacksquare$

D. Fault Type

The primary objective of the centralized Fault Detection, Isolation, and Recovery (FDIR) system is to accurately identify faulty observer spacecraft within the distributed network and to explicitly detect actuator and sensor faults at the individual agent (spacecraft) level. Within the GNC (Guidance, Navigation, and Control) architecture, such faults can arise during either the spacecraft's state propagation or during sensor pointing for observing designated Points of Interest (POIs).
In the first case, state propagation faults, a malfunctioning actuator prevents the spacecraft from maintaining its assigned trajectory. This results in the spacecraft deviating from its planned orbit, thereby modifying the set of POIs it is capable of observing. A spacecraft with such a fault may behave erratically, resembling a rogue agent, and in severe cases could even pose a collision risk to nearby agents. In the second case, sensor pointing faults, the onboard sensor fails to correctly align with the POI that has the highest expected uncertainty (i.e., maximum variance). This misalignment leads to inaccurate or suboptimal observations, thereby reducing the overall effectiveness of information gathering. Both types of faults manifest as measurable deviations in the information gain computed by the observer. For example, a sensor fault results in a discrepancy between the observed and expected variance in POI estimation, which directly impacts the spacecraft's contribution to the global information cost.

Figure 8 illustrates the impact of two representative actuator faults on the global information cost $\mathcal{H}$, as well as the corresponding fault detection signal $\mathcal{H}_{m_i}(t)$ for the affected and unaffected observer spacecraft. In Figure 8 (left), an actuator fault is injected into two spacecraft by adding random noise to their state trajectories. Interestingly, this fault incidentally improves global system performance, as the true information cost with the actuation fault is lower than the predicted nominal value, suggesting an increase in information gain. Figure 8 (right) presents a similar scenario in which the actuator fault affects the sensor pointing mechanism rather than the spacecraft's orbit. Here, the fault causes the sensor to deviate from its optimal orientation, resulting in reduced global performance.
As expected, the information cost increases relative to the nominal baseline, and the fault metric again isolates the faulty spacecraft accurately.

Fig. 8 Real-time vs. expected cost under actuator attack I (left) and actuator attack II (right).

E. Adaptive Fault Threshold

The above fault cases show that the proposed fault metric performs as expected. However, there is still a need to determine an appropriate threshold to distinguish between system noise and a fault. The occurrence of an actuator fault causes a change in the pose $p_i(t)$ of a spacecraft. From the formulation of the global cost $\mathcal{H}$ in (8), it is clear that a change in the pose $p_i(t)$ of a spacecraft will affect the variance in observing a target POI. There is also a more subtle dependence of the set of visible POIs, $\mathcal{S}_i(t)$, on the pose of the spacecraft, since the field of view of the onboard sensor changes with the spacecraft pose. Let $\mathcal{S}^{\mathrm{pred}}_i(t)$ be the expected set of visible POIs for spacecraft $i$ at time $t$. Then, to compute a fault threshold, it is first necessary to construct a set of POIs $\mathcal{S}^{*}_i(t)$ such that
\[
0 < \big| \mathcal{H}_i(\mathcal{S}^{*}_i(t)) - \mathcal{H}_i(\mathcal{S}^{\mathrm{pred}}_i(t)) \big|
\leq \big| \mathcal{H}_i(\mathcal{S}_i(t)) - \mathcal{H}_i(\mathcal{S}^{\mathrm{pred}}_i(t)) \big|, \quad \forall\, \mathcal{S}_i(t) \subseteq \mathcal{S}. \tag{11}
\]
The fault threshold for an individual spacecraft can be computed as
\[
\delta_i(t) = \left| 1 - \frac{\mathcal{H}_i(\mathcal{S}^{*}_i(t)) - \mathcal{H}_i(\mathcal{S}_i(t - \Delta t))}{\mathcal{H}_i(\mathcal{S}^{\mathrm{pred}}_i(t)) - \mathcal{H}_i(\mathcal{S}_i(t - \Delta t))} \right|. \tag{12}
\]
Equation (12) is used to construct adaptive fault thresholds for individual spacecraft, where a fault is detected in spacecraft $i$ if $\mathcal{H}_{m_i}(t) > \delta_i(t)$.

F. Analytical Example: Two 1-DOF Spacecraft with Two POIs

We illustrate the detection of actuator and sensor faults using the information-cost based functional redundancy metric. Consider two spacecraft, each translating along a one-dimensional axis $x_i(t)$, $i = 1, 2$.
Two points of interest (POIs) are fixed at opposite vertices of a square of side $2a$:
\[
s_1 = (a, a), \qquad s_2 = (-a, -a). \tag{13}
\]
Each spacecraft is equipped with a camera that provides noisy observations of the POIs. The measurement variance from agent $i$ at position $p_i = (x_i, 0)$ to POI $s = (s_x, s_y)$ is modeled as
\[
\sigma(p_i, s) = c\, \mathrm{dist}(p_i, s)^2, \qquad
\sigma^{-1}(p_i, s) = \frac{1}{c}\, \frac{1}{(x_i - s_x)^2 + s_y^2}. \tag{14}
\]
The aggregated information cost per POI is
\[
J_{\mathrm{POI}}(s) = \Big( w^{-1} + \sum_{i=1}^{2} \sigma^{-1}(p_i, s) \Big)^{-1}, \tag{15}
\]
\[
J(t) = \sum_{s \in \{s_1, s_2\}} \phi(s)\, J_{\mathrm{POI}}(s), \tag{16}
\]
where $\phi(s)$ are task weights. Minimization of $J(t)$ corresponds to maximizing information.

Gradient of the information cost. Differentiating $J$ with respect to $x_i$ yields
\[
\frac{\partial J}{\partial x_i} = \sum_{s} \phi(s) \cdot \frac{2}{c}\, J_{\mathrm{POI}}(s)^2\, \frac{x_i - s_x}{\big( (x_i - s_x)^2 + s_y^2 \big)^2}. \tag{17}
\]
This provides a compact analytic formula for the predicted effect of agent $i$'s displacement on the information cost.

Per-agent ratio and detection metric. For a nominal step $\Delta x_i$, the predicted and real cost changes are
\[
\Delta J^{\mathrm{pred}}_i \approx \frac{\partial J}{\partial x_i}\, \Delta x_i, \qquad
\Delta J^{\mathrm{real}}_i \approx \frac{\partial J}{\partial x_i}\, \Delta x^{\mathrm{real}}_i. \tag{18}
\]
Define the per-agent ratio and metric
\[
r_i = \frac{\Delta J^{\mathrm{real}}_i}{\Delta J^{\mathrm{pred}}_i}, \qquad J_{m,i} = | 1 - r_i |. \tag{19}
\]
For robustness to nominal perturbations, compare $J_{m,i}$ with an adaptive threshold $\delta_i(t)$ (cf. Eq. (12) in the chapter).

Fault separation logic.
• Actuator fault: A control execution bias changes the realized step, $\Delta x^{\mathrm{real}}_i \neq \Delta x_i$, so
\[
r_i = \frac{\Delta x^{\mathrm{real}}_i}{\Delta x_i} \;\Rightarrow\; r_i > 1 \text{ (over-actuation)}, \quad r_i < 1 \text{ (under-actuation)}. \tag{20}
\]
• Sensor fault: Information degradation scales $\sigma^{-1}(p_i, s) \mapsto \alpha\, \sigma^{-1}(p_i, s)$ with $\alpha \in (0, 1)$, leaving the geometry unchanged but reducing the effective sensitivity:
\[
\Delta J^{\mathrm{real}}_i = \alpha\, \Delta J^{\mathrm{pred}}_i \;\Rightarrow\; r_i = \alpha < 1. \tag{21}
\]
Thus, the sign of the deviation in $r_i$ separates actuator faults ($r_i \neq 1$ via motion bias) from sensor faults ($r_i < 1$ via information loss).
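The cost, gradient, and per-agent ratio of Eqs. (15)-(21) are simple enough to check numerically. The sketch below is an illustration under stated assumptions rather than the authors' code: the parameter values ($a = c = 1$, $w^{-1} = 0$, equal weights, agents at $x = \mp 1.5$, a bias of $0.05$, and a degradation factor of $0.7$) and all function names are choices made for this example. It verifies the analytic gradient in Eq. (17) against a central finite difference and then evaluates the first-order ratio and metric for one actuator fault and one sensor fault.

```python
A, C, W_INV = 1.0, 1.0, 0.0        # half side, variance scale, inverse prior variance (assumed)
POIS = [(A, A), (-A, -A)]          # Eq. (13)
PHI = [0.5, 0.5]                   # task weights (assumed equal)

def poi_cost(xs, s):
    """Eq. (15): fused variance at POI s for agents located at (x_i, 0)."""
    info = W_INV + sum(1.0 / (C * ((x - s[0]) ** 2 + s[1] ** 2)) for x in xs)
    return 1.0 / info

def total_cost(xs):
    """Eq. (16): importance-weighted sum over the two POIs."""
    return sum(phi * poi_cost(xs, s) for phi, s in zip(PHI, POIS))

def cost_gradient(xs, i):
    """Eq. (17): analytic derivative of the total cost with respect to x_i."""
    g = 0.0
    for phi, s in zip(PHI, POIS):
        d2 = (xs[i] - s[0]) ** 2 + s[1] ** 2
        g += phi * (2.0 / C) * poi_cost(xs, s) ** 2 * (xs[i] - s[0]) / d2 ** 2
    return g

def ratio_and_metric(dj_real, dj_pred):
    """Eq. (19): per-agent ratio r_i and detection metric |1 - r_i|."""
    r = dj_real / dj_pred
    return r, abs(1.0 - r)

xs = [-1.5, 1.5]
steps = [0.1, -0.1]                # nominal commanded steps
grads = [cost_gradient(xs, i) for i in range(2)]

# Central finite-difference check of Eq. (17) for agent 1
h = 1e-6
fd = (total_cost([xs[0] + h, xs[1]]) - total_cost([xs[0] - h, xs[1]])) / (2 * h)

# Actuator fault on agent 1: bias b is added to the realized step, Eq. (20)
b = 0.05
r1_act, m1_act = ratio_and_metric(grads[0] * (steps[0] + b), grads[0] * steps[0])

# Sensor fault on agent 2: realized cost change scaled by alpha, Eq. (21)
alpha = 0.7
dj2_pred = grads[1] * steps[1]
r2_sens, m2_sens = ratio_and_metric(alpha * dj2_pred, dj2_pred)

print(f"gradient: analytic {grads[0]:.6f}, finite difference {fd:.6f}")
print(f"actuator fault: r = {r1_act:.2f}, metric = {m1_act:.2f}")
print(f"sensor fault:   r = {r2_sens:.2f}, metric = {m2_sens:.2f}")
```

Because both cost changes in Eq. (18) use the same gradient, the gradient cancels in the ratio and the metric depends only on the step mismatch or on the degradation factor.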
Numerical plug-in (one-step). Choose
\[
a = 1, \quad c = 1, \quad w^{-1} = 0, \quad \phi(s_1) = \phi(s_2) = \tfrac{1}{2}, \quad x_1 = -1.5, \quad x_2 = +1.5, \quad \Delta x = 0.1. \tag{22}
\]
Actuator fault on agent 1: let a positive bias $b = 0.05$ make the realized step larger:
\[
\Delta x^{\mathrm{real}}_1 = \Delta x + b = 0.15, \qquad \Delta x^{\mathrm{real}}_2 = \Delta x_2 = -0.1. \tag{23}
\]
The gradient factor in (17) cancels in the ratio, yielding
\[
r_1 = \frac{\Delta x^{\mathrm{real}}_1}{\Delta x} = \frac{0.15}{0.10} = 1.5, \qquad
r_2 = \frac{\Delta x^{\mathrm{real}}_2}{\Delta x_2} = 1.0. \tag{24}
\]
Therefore,
\[
J_{m,1} = |1 - 1.5| = 0.5 \ \text{(over-actuation flagged)}, \qquad J_{m,2} = |1 - 1.0| = 0. \tag{25}
\]
Sensor fault on agent 2: with a degradation factor $\alpha = 0.7$ (same commanded motion),
\[
\Delta J^{\mathrm{real}}_2 = \alpha\, \Delta J^{\mathrm{pred}}_2 \;\Rightarrow\; r_2 = \alpha = 0.7, \qquad r_1 = 1.0, \tag{26}
\]
so
\[
J_{m,2} = |1 - 0.7| = 0.3 \ \text{(sensor degradation flagged)}, \qquad J_{m,1} = 0. \tag{27}
\]
These values make the classification immediate: actuator bias produces $r_i > 1$ for the affected agent, whereas sensor degradation produces $r_i < 1$.

G. Higher Dimensional Example

To demonstrate the effectiveness of the proposed fault detection framework, we simulate both actuator and sensor faults in spacecraft formations of size 1, 2, and 4. Each spacecraft's trajectory is propagated using a 4th-order Runge–Kutta integrator, starting from arbitrary initial conditions. Faults are injected at selected times during the trajectory. The spacecraft are assigned the task of observing 5000 points of interest (POIs) uniformly distributed on the surface of a unit sphere. Each spacecraft is modeled with a conical field of view (FoV), with the cone's apex located at the center of the sphere and aligned along the spacecraft's pointing direction. The goal is to maximize the information gain by observing high-variance POIs within the field of view of the agents.

Fig. 9 Visualization of POIs on the surface of a sphere and a spacecraft with a conical field of view targeting the sphere.

The global cost function is computed using the centralized formulation in (7), and agent-level performance is evaluated using the decomposed expression in (8). Each simulation includes one faulty spacecraft (under actuator or sensor degradation), while the remaining spacecraft operate nominally.

1. Problem 1: Actuator Fault

An actuator fault is defined as a deviation in the spacecraft's motion due to malfunctioning thrusters or attitude control components (e.g., reaction wheels). This is simulated by injecting artificial disturbances (additive noise or bias) into the position and velocity states at an arbitrary time step.

Fig. 10 Global vs. centralized cost functional for one spacecraft under actuator fault.

Fig. 11 The contributions of the centralized (left) and global (right) cost functions of one spacecraft during an actuator fault.

In Fig. 10, the global cost $\mathcal{H}$ remains relatively stable even after fault injection, as the averaging effect across POIs masks individual anomalies. In contrast, the centralized cost function reveals that the faulty spacecraft is no longer able to optimize its assigned trajectory for maximum coverage. Figs. 11, 13, and 15 show the cost contribution at each specific moment in time for the spacecraft systems under actuator fault. When the fault occurs, the cost of the spacecraft using the centralized cost function deviates significantly compared to the cost of the spacecraft using the global cost function. This underscores the global cost function's ability to optimize the system to gain maximum coverage. As the number of agents increases (Figs. 12, 14), the fault's effect is more pronounced in the centralized cost $\mathcal{H}_i(t)$ associated with the faulty agent. However, the global cost functional remains robust due to the distributed nature of the task.
This highlights that the agent-level centralized metrics are more sensitive than the global cost and thus better suited for local fault isolation.

2. Problem 2: Sensor Fault

A sensor fault refers to any failure that impairs the spacecraft's ability to sense and track POIs accurately. Examples include camera misalignment, signal loss, or degraded resolution. To simulate this, we reduce the set of POIs visible within the spacecraft's FoV by applying a rotational misalignment.

Fig. 12 Global vs. centralized cost under actuator fault for two spacecraft.

Fig. 13 The contributions of the centralized (left) and global (right) cost functions of a two-spacecraft formation during an actuator fault.

Fig. 14 Global vs. centralized cost under actuator fault for four spacecraft.

Fig. 15 The contributions of the centralized (left) and global (right) cost functions of a four-spacecraft formation during an actuator fault.

Fig. 16 Global vs. centralized cost under sensor fault for one spacecraft.

As shown in Fig. 16, the global cost $\mathcal{H}$ initially increases after the fault due to poor POI coverage. The centralized cost functional shows a significant drop in agent performance after the fault is injected, highlighting the spacecraft's reduced contribution to the overall information gain. In both Fig. 18 and Fig. 20, the global cost remains relatively smooth, demonstrating the inherent redundancy in the distributed system. However, the localized drop in $\mathcal{H}_i(t)$ for the faulty spacecraft indicates the degraded quality of its observations. This validates the efficacy of the proposed information-cost-based FDIR strategy in identifying performance deterioration due to sensor misalignment. Figs. 17, 19, and 21 highlight the individual cost contributions of the spacecraft systems during a sensor fault. When a fault occurs, the contribution from the faulty spacecraft drops to 0 because it is no longer able to see any POIs.

Fig. 17 The contributions of the centralized (left) and global (right) cost functions of a single-spacecraft formation during a sensor fault.

Fig. 18 Global vs. centralized cost under sensor fault for two spacecraft.

Fig. 19 The contributions of the centralized (left) and global (right) cost functions of a two-spacecraft formation during a sensor fault.

Fig. 20 Global vs. centralized cost under sensor fault for four spacecraft.

IV. Inspection Mission: Simulations and Results

The fault detection framework was incorporated in the simulation setup for the hierarchical planning algorithm [43, 52] to compute the quantities in (11). The simulation results provided in this section differ from the results in the previous section because we use the proposed adaptive threshold in Eq. (12) to detect injected faults. To compute the adaptive threshold in real time, during the orbit assignment phase at time $t$, the central agent determines the expected set of visible POIs, $\mathcal{S}^{\mathrm{pred}}_i(t)$, for each spacecraft $i$. This is used to compute the nominal behavior, $\mathcal{H}_i(\mathcal{S}^{\mathrm{pred}}_i(t))$, for each spacecraft during a fixed time period of 2 orbits. On the other hand, computing the set $\mathcal{S}^{*}_i(t)$ in (11) can become computationally intractable with an increasing number of POIs. Therefore, a sampling-based approach is taken to approximate the set $\mathcal{S}^{*}_i(t)$ by randomly pointing the onboard sensor within an $\epsilon$-neighborhood of the target POI. Figure 22 demonstrates the construction of this $\epsilon$-neighborhood, where the onboard sensor vector is randomly pointed to any point in the $\epsilon$-neighborhood, thereby changing its FoV and the visible set of POIs. For actuator faults, the value of $\epsilon$ can be estimated by analyzing the order of $\partial p_i / \partial u_i$ for different spacecraft in the system. Finally, the fault threshold, $\delta_i(t)$, for each spacecraft is also computed during the orbit assignment phase.
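The sampling procedure described above can be sketched as follows. This is a simplified illustration and not the simulation code used for the results: the unit-sphere POI model with a conical field of view is borrowed from the higher dimensional example, and the per-agent cost `agent_cost`, the box-shaped perturbation used for the $\epsilon$-neighborhood, and every function name and numerical value are assumptions of this sketch. Only `adaptive_threshold` follows Eqs. (11) and (12) directly.

```python
import numpy as np

def adaptive_threshold(h_samples, h_pred, h_prev, tol=1e-12):
    """Evaluate Eq. (12) on sampled costs and return the smallest threshold.

    h_samples: costs H_i(S*) from randomly perturbed sensor pointings
    h_pred:    predicted cost H_i(S_pred(t)); h_prev: cost at the previous step
    Samples whose cost equals h_pred are dropped, mirroring the strict
    inequality in Eq. (11).
    """
    denom = h_pred - h_prev
    if abs(denom) < tol:
        raise ValueError("predicted cost change is zero; threshold undefined")
    candidates = [abs(1.0 - (h - h_prev) / denom)
                  for h in h_samples if abs(h - h_pred) > tol]
    if not candidates:
        raise ValueError("no sample changes the cost; enlarge the neighborhood")
    return min(candidates)

def sample_pointings(target, eps, n_samples, rng):
    """Random unit vectors near `target` (assumed eps-neighborhood model)."""
    target = np.asarray(target, dtype=float)
    offsets = rng.uniform(-eps, eps, size=(n_samples, 3))
    pts = target / np.linalg.norm(target) + offsets
    return pts / np.linalg.norm(pts, axis=1, keepdims=True)

def visible(pois, pointing, half_angle):
    """POIs on the unit sphere inside a cone about `pointing` (apex at center)."""
    return pois @ pointing >= np.cos(half_angle)

rng = np.random.default_rng(0)
pois = rng.normal(size=(500, 3))
pois /= np.linalg.norm(pois, axis=1, keepdims=True)
prior_var, meas_var = 1.0, 0.1     # assumed variances

def agent_cost(mask):
    """Hypothetical per-agent cost: sum of fused variances over all POIs."""
    return float(np.sum(1.0 / (1.0 / prior_var + mask / meas_var)))

target = pois[0]                   # stands in for the POI with maximum variance
half_angle = np.deg2rad(20.0)
h_prev = agent_cost(np.zeros(len(pois), dtype=bool))
h_pred = agent_cost(visible(pois, target, half_angle))
h_samples = [agent_cost(visible(pois, d, half_angle))
             for d in sample_pointings(target, eps=0.1, n_samples=10, rng=rng)]
delta = adaptive_threshold(h_samples, h_pred, h_prev)
print(f"adaptive threshold: {delta:.4f}")
```

Taking the minimum over the sampled pointings keeps the threshold as tight as the samples allow, which is also why too few or poorly spread samples can leave it looser than intended.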
In the following plots, 10 target POIs were sampled in an $\epsilon$-neighborhood around the POI with maximum variance, for each spacecraft. The sampled set $\mathcal{S}^{*}_i(t)$ that gives the minimal value for $\delta_i(t)$ determines the fault threshold for spacecraft $i$ at time $t$. At the agent level, each spacecraft receives its orbit assignment and tracks the progress of its local information cost, $\mathcal{H}_i(t)$, while propagating the next 2 orbits. At the end of this fixed time interval, each spacecraft transmits its local information cost to the central agent, where the centralized FDIR algorithm detects any faulty spacecraft behavior using the metric in (9).

A. Classification of Simulation Outcomes

We test the performance of the proposed fault metric on different attack scenarios, and our aim is to detect three distinct behaviors of the spacecraft. These are listed below.
1) Nominal behavior: In this phase, the spacecraft perform as expected, that is, the observed global cost is the same as the expected global cost. During nominal behavior, the fault metric for each spacecraft, $\mathcal{H}_{m_i}(t)$, is expected to be 0, and the adaptive fault threshold should always be greater than the fault metric for each spacecraft.
2) Actuator faults: An actuator fault is introduced in one or more spacecraft at some time $t_{\mathrm{fault}} \geq 0$. Actuator faults can either be injected into the onboard equipment (Fig. 24 and Fig. 26) or into the spacecraft's controller, causing it to deviate from its planned trajectory (Fig. 23).
3) Sensor faults: A sensor fault is also introduced in one or more spacecraft at some time $t_{\mathrm{fault}} \geq 0$ (Fig. 25). Sensor faults typically affect the set of POIs tracked by the individual spacecraft and thereby change the expected information cost of an individual spacecraft over time.
For both the actuator and sensor fault scenarios, the behavior of the individual spacecraft can either deteriorate or improve, that is, $\mathcal{H}_{m_i}(t) > 0$.
However, due to inherent system noise, classifying a spacecraft $i$ as faulty whenever $\mathcal{H}_{m_i}(t) > 0$ can lead to false positives. Therefore, a spacecraft is classified as faulty if $\mathcal{H}_{m_i}(t) > \delta_i(t)$, where $\delta_i(t) > 0$ is the proposed adaptive threshold computed using Eq. (12). A consequence of using an adaptive threshold is that there is some latency between when a fault is introduced and when a fault is detected. Table 4 lists some of these parameters for the simulation results presented in the following section.

Fig. 21 The contributions of the centralized (left) and global (right) cost functions of a four-spacecraft formation during a sensor fault.

Fig. 22 Visible set of POIs for a spacecraft (left); $\epsilon$-neighborhood constructed around the POI with maximum variance (right).

B. Results

The performance of the proposed fault metric is tested on different types of actuator faults, as shown in Fig. 23, Fig. 24, and Fig. 26. The first type of actuator fault is implemented during the state propagation of two spacecraft in the network (Fig. 23). In this case, the overall task performance improves after the fault is injected at $t_{\mathrm{fault}} = 400$ s. Hence, the true information cost $\mathcal{H}_i(\mathcal{S}^{*}_i(t))$ is less than the predicted information cost $\mathcal{H}_i(\mathcal{S}^{\mathrm{pred}}_i(t))$, as shown in Fig. 23 (left). However, before $t = 400$ s, the cost exhibits nominal behavior and $\mathcal{H}_i(\mathcal{S}^{*}_i(t)) = \mathcal{H}_i(\mathcal{S}^{\mathrm{pred}}_i(t))$. Figure 23 (right) demonstrates that the proposed fault threshold successfully detects the actuator fault in both spacecraft 2 and spacecraft 4. In particular, we highlight the fault metric and fault threshold of spacecraft 4 in Fig. 23 (right), where, after the fault is injected at $t_{\mathrm{fault}} = 400$ s, the fault threshold falls below the fault metric at time $t = 700$ s and the fault is successfully detected.
Observe that there are no false positives: for nominal spacecraft, the fault metric is well below the fault threshold at all times. Note that all plots in this section show the log value of the fault metric and the adaptive threshold, so as to highlight small changes in their values. The adaptive nature of the proposed fault metric allows it to tune to the behavior of the individual spacecraft. As a result, in Fig. 23 (right) the fault is detected immediately in spacecraft 2, while it is detected after a few time steps in spacecraft 4.

The second type of actuator attack is implemented during the sensor pointing phase, causing the overall system performance to deteriorate in Fig. 24 (left), that is, H_𝑃(S^⇓_𝑃(t)) > H_𝑃(S^pred_𝑃(t)). The proposed fault detection metric adaptively adjusts the fault threshold and successfully detects the actuator fault in spacecraft 2, 4, and 5 immediately after the fault is induced at t_fault = 0 s (Fig. 24 (right)). On the other hand, the proposed fault metric does not trigger a false positive when it is periodically applied to nominal spacecraft 1 and 3, because the fault threshold is always greater than their fault metrics.

Finally, in Fig. 25 a sensor attack is induced in spacecraft 1 and spacecraft 2 at t_fault = 0 s. There is no significant visible change in the global cost (Fig. 25 (left)). However, since the fault metric is designed to be specific to each spacecraft, it uses only the information available locally at each spacecraft to determine whether that spacecraft is faulty. This is exhibited in Fig. 25 (right), where the fault in spacecraft 1 is detected immediately but the fault in spacecraft 2 is detected after a latency of 180 s. As observed before, no false positives are identified.

Fig. 23 Actuator fault injected in spacecraft 2 and spacecraft 4 from t_fault = 400 s, thereby causing the spacecraft to deviate from their trajectories. The information cost plot (left) shows expected performance before the fault is injected, but shows a slightly deteriorating performance after t_fault. The fault signals for different spacecraft (right) show that the log values of the fault metric for faulty spacecraft increase after t_fault = 400 s.

Fig. 24 Actuator fault injected in spacecraft 2, spacecraft 4, and spacecraft 5 from t_fault = 0 s, thereby causing the onboard equipment to misalign with the POIs. The information cost plot (left) shows improved performance, but a deviation from expected performance, immediately after the attack is induced. The fault signals for different spacecraft (right) show that the log values of the fault metric for faulty spacecraft increase after t_fault = 0 s.

Fig. 25 Sensor fault injected in spacecraft 1 and spacecraft 2 from t_fault = 0 s, thereby causing the onboard equipment to incorrectly identify the set of POIs. The information cost plot (left) shows almost no deterioration in performance; however, the fault signals for different spacecraft (right) show that the log values of the fault metric for faulty spacecraft increase after t_fault = 0 s.

Fig. 26 Actuator fault injected in spacecraft 2 and spacecraft 5 after t_fault = 200 s, thereby causing the onboard equipment to misalign with the POIs. The information cost plot (left) shows slightly deteriorating performance after 1000 s. The fault signals for different spacecraft (right) show that the log values of the fault metric for faulty spacecraft increase after t_fault = 200 s.

Table 4 Observed performance of proposed fault metric.

| Fault | Expected Signature | Observed Behavior | Fault Detected | Avg. Latency |
|---|---|---|---|---|
| Actuator Fault (Fig. 23) | H𝑂𝐿(t) > 0, 𝑊 = 2, 4 | Improved global cost | Yes | 200 s |
| Actuator Fault (Fig. 24) | H𝑂𝐿(t) > 0, 𝑊 = 2, 4, 5 | Deteriorating global cost | Yes | 200 s |
| Sensor Fault (Fig. 25) | H𝑂𝐿(t) > 0, 𝑊 = 1, 2 | Visibly same global cost | Yes | 180 s |
| Actuator Fault (Fig. 26) | H𝑂𝐿(t) > 0, 𝑊 = 2, 5 | Improved global cost | No | Detection in SC 2 at 1100 s |

C. Limitations

While the results in the previous section are promising, there are some limitations in the current implementation of the information-cost-based fault metric. Firstly, since the fault threshold is calculated using a sampling approach, the performance is susceptible to sampling bias. This implies that there may be false negatives if the samples are not uniformly distributed, or that a small number of samples may fail to yield a tight estimate of the fault threshold. This is demonstrated in Fig. 26, where an actuator attack is introduced at t_fault = 200 s in spacecraft 2 and spacecraft 5. However, as highlighted in Fig. 26 (right), the fault signal for spacecraft 5 is below the adaptive threshold and hence no attack is detected. On the other hand, even though the proposed fault metric detects a fault in spacecraft 2, the latency time is 1100 s. A high latency time implies that the fault metric is not very efficient. Therefore, further analysis to quantify the tradeoff between fault detection and sample complexity can be useful to understand the performance of the fault metric. Despite these drawbacks, our simulations suggest that the proposed fault metric and adaptive threshold work well in most scenarios, and controlling the size of the 𝑚-neighborhood can mitigate these drawbacks.

Secondly, designing an adaptive threshold to detect sensor faults is challenging because the threshold values typically depend on the hardware specifications.
Therefore, incorporating characteristics of the underlying hardware in designing the fault threshold can result in improved performance.

V. Conclusion

In this chapter, we developed an information-driven framework for fault detection and identification (FDI) in multi-agent spacecraft systems performing collaborative on-orbit inspection. We formulated mission objectives as a global cost functional, H, and directly linked inspection performance to resilience. This formulation enabled us to detect and classify both global and agent-level faults. We demonstrated three key contributions:

1) Global–local coupling: We decomposed the global cost functional into marginal agent contributions, which allowed us to continuously track individual spacecraft performance without exchanging raw state or sensor data.

2) Fault separation: We applied higher-order gradient metrics to distinguish actuator faults (e.g., under-actuation with 𝑛_𝑃 < 1 or over-actuation with 𝑛_𝑃 > 1) from sensor faults (e.g., degraded sensitivity with 𝑛_𝑃 = 𝑞 < 1). Analytical examples confirmed this clear separation between fault classes.

3) Validated performance: We validated the framework in simulations of formations with 1, 2, and 4 spacecraft and up to 5,000 points of interest. The framework consistently detected injected faults within 100–200 seconds of onset while keeping nominal false-positive rates below 2%.

Our results show that the information-cost-based FDI framework detects diverse anomalies, including sensor misalignments, actuator deviations, and network-level disruptions, while preserving mission objectives. By grounding fault detection in measurable information gain and mission utility, we provide a principled, quantitatively validated pathway for resilient multi-agent spacecraft inspection architectures.
