Decentralized Multi-target Tracking in Urban Environments: Overview and Challenges
In multi-target tracking, sensor control involves dynamically configuring sensors to achieve improved tracking performance. Many of these techniques focus on sensors with memoryless states (e.g., waveform adaptation, beam scheduling, and sensor selec…
Authors: Donald J. Bucci Jr., Pramod K. Varshney
Donald J. Bucci Jr., Lockheed Martin ATL, Cherry Hill, NJ, USA, donald.j.bucci.jr@lmco.com
Pramod K. Varshney, Syracuse University, Syracuse, NY, USA, varshney@syr.edu

Abstract: In multi-target tracking, sensor control involves dynamically configuring sensors to achieve improved tracking performance. Many of these techniques focus on sensors with memoryless states (e.g., waveform adaptation, beam scheduling, and sensor selection), lending themselves to computationally efficient control strategies. Mobile sensor control for multi-target tracking, however, is significantly more challenging due to the complexity of the platform state dynamics. This platform complexity necessitates high-fidelity, non-myopic control strategies in order to achieve strong tracking performance while maintaining safe operation. These sensor control techniques are particularly important in non-cooperative urban surveillance applications including person of interest, vehicle, and unauthorized UAV interdiction. In this overview paper, we highlight the current state of the art in mobile sensor control for multi-target tracking in urban environments. We use this application to motivate the need for closer collaboration between the information fusion, tracking, and control research communities across three challenge areas relevant to the urban surveillance problem.

I. INTRODUCTION

An accurate and scalable multi-target tracking solution is a critical component of many wide-area urban surveillance systems. For example, human and vehicle detection with closed-circuit television (CCTV) networks leverages multiple bearing-only sensors to uniquely track targets throughout a city [1]–[4]. Another important area involves tracking unauthorized unmanned aerial vehicles (UAVs) using heterogeneous and spatially distributed sensors [5]–[7].
For commercial UAVs that stream video and telemetry data, passive RF detection mechanisms have also been suggested [8]–[10]. Across all of these applications, the deployment and positioning of the sensors over time have a major impact on multi-target tracking performance. This is especially true when tracking with passive sensors, which requires fusing multiple sensors to unambiguously resolve target position and velocity. Examples of these passive sensor types include received signal strength indicator (RSSI), time difference of arrival (TDOA), frequency difference of arrival (FDOA), and angle of arrival (AOA).

Sensor deployment and path planning for multi-target tracking fall under the broad research area of sensor control. Sensor control began receiving considerable attention from the information fusion community in the late 1990s [11], [12]. Many of the techniques in the area initially focused on dynamic reconfiguration of individual sensors in order to maintain strong target tracking performance (e.g., beam scheduling [13], waveform selection [14], [15]). In the early 2000s, however, the focus shifted to include sensor selection for wireless sensor network (WSN) applications [16], [17]. The size, weight, and power (SWaP) requirements of these systems necessitated sensor control techniques that could balance tracking performance with the energy cost of obtaining and communicating sensor measurements across the network [18]–[21]. Decentralized multi-target tracking techniques were proposed to maintain communication bandwidth scalability and resilience to sensor failure [22]–[27]. The majority of WSN applications focused on stationary installations, allowing offline solutions to the problem of sensor deployment optimization. A number of WSN deployment optimization solvers were proposed by drawing analogies to the NP-hard art gallery problem from computational geometry [28].
Metaheuristics formed the basis for many of these solvers, including techniques such as particle swarm optimization and genetic algorithms [29]–[32].

The sensor control problem for online path planning in the context of multi-target tracking, however, is significantly more challenging and less studied. As opposed to existing offline techniques for path planning in mobile sensor networks [33], the multi-target tracking variation of the problem necessitates an online solution due to the lack of a priori information on target trajectories.

Very few online mobile sensor control techniques in the current state of the art are capable of addressing the unique challenges associated with non-cooperative target surveillance in urban environments. This is primarily because the urban environment highly constrains sensor coverage and maneuverability based on terrain elevation and building geometries. The same shadowing issues that make target sensing difficult also introduce challenges in maintaining end-to-end network connectivity, thus rendering centralized fusion approaches impractical. Strong performance and safe operation of a mobile sensor network in this scenario requires an understanding of how the terrain impacts the relevant tracking and sensor control algorithms, all while maintaining decentralized operation.

The goal of this paper is to provide a brief summary of the current multi-target multi-sensor tracking approaches using a mobile sensor network. We use this summary to highlight the main limitations that prevent immediate application of these architectures to the urban environment. In the sections that follow, we briefly summarize the model generally assumed for the mobile sensor network control problem. Following this, we provide a brief literature review of the current state of the art in multi-target tracking with mobile sensor networks in non-urban environments.
We then conclude by discussing three open challenges related to urban surveillance with commercial off-the-shelf (COTS) UAVs, or more specifically, quadcopters.

II. PROBLEM OVERVIEW

A. Integrated Sensing and Control Architecture

Figure 1 shows a typical architecture used for decentralized target tracking and mobile sensor control for a single platform. A sensor interface provides derived target measurements, such as time of arrival, Doppler shift, TDOA/FDOA, RSSI, or AOA. The measurements are usually obtained under measurement origin uncertainty. That is, it is not known a priori which sensor measurements correspond to clutter and which to existing targets. In addition, measurements from targets may be missed (i.e., not detected) at a given time step. The posterior distributions for each target from the previous time step are propagated forward in time under known target birth and survival dynamics. The mechanism for performing this forward prediction is usually a variant of the Chapman-Kolmogorov integral [34], [35]. A data association process uses these predicted posterior distributions to generate a mapping from the measurements to newborn and persisting targets. The association map, sensor measurements, and platform telemetry are then used to apply the Bayes update for each target’s predicted posterior distribution. If the tracker update step is decentralized, a consensus process is used to jointly process the sensor log-likelihood messages over the network with one-hop neighbors. The updated posterior distribution per target is used to perform state extraction, which generates state estimates and covariance ellipses. The sensor control policy finally uses the updated posterior distribution to determine which platform control actions (e.g., heading, acceleration, or waypoint) to use for the next time step. A separate consensus procedure may also occur to synchronize agent control actions.

B.
POMDP Formulation for Mobile Sensor Control

Control of mobile sensor networks for multi-target tracking typically follows a partially observable Markov decision process (POMDP) formulation [36, Chapter 5.6]. In a POMDP, the target states (i.e., positions and velocities) evolve according to a Markov process and are observed indirectly through sensor measurements. Sensor states also evolve according to a Markov process based on the control action applied at the current time step. Depending on the inertial sensor and kinematic models, the corresponding relationship between platform state dynamics and control may be deterministic or stochastic with directly or partially observable states. The relationship between the sensor measurements and the target and platform states at the current time step is given by a set of likelihood functions. The reward function is designed to capture target tracking goals (e.g., minimized cumulative uncertainty in state estimates), obstacle and inter-agent collision avoidance requirements, and constraints on platform control actions. Given this reward function defined over target states, platform states, and platform actions, the goal is to construct a closed-loop control policy that maximizes the infinite horizon expected cost-to-go (i.e., discounted cumulative reward). The information available for a control policy at the current time step is the measurement history of all target and platform states and the control inputs used at each platform. For simplicity, the following discussion will assume that the sensor platform states are completely observable.

To prevent the growth of the control and state space dimensionality as new measurements are obtained, the POMDP model is typically reformulated into an equivalent Markov decision process (MDP). This is accomplished using a sufficient statistic that subsumes the measurements up until the current time step [37, Chapter 4.3].
The corresponding sufficient statistic is termed the belief state of the system. The belief state for the sensor control application here is the posterior probability of the target states given the observed measurements up until the current time step. In general, the belief state per target is estimated using application-specific variations of the recursive Bayes filter [35]. For the multi-target case, extensions exist for the joint target probability under soft associations [38] or for the multi-target probability distribution under the random finite sets (RFS) formalism [39].

The sensor control reward function at each time step is then mapped to the belief states through an information theoretic measure of the quality of the current target state estimate. The most commonly used measure is the mutual information [40] between the future states per target and the predicted sensor measurements obtained over a finite horizon lookahead window. Each plausible control action affects the locations of the platforms at future time steps, which in turn affects the posterior belief state for each target. The core idea is that the mutual information metric quantifies how the sharpness of the belief state per target changes over a finite horizon lookahead under each control action.

Despite this simplification, the resulting belief MDP state space is a subset of the space of multivariate probability distributions. These belief states are always continuous, even if the partially observable states are discrete. As a result, very few closed-form optimal policies for belief MDPs exist. The best-known solution is for the case of a linear Gaussian POMDP with quadratic cost. Here, the optimal solution reduces to a Kalman update of the belief state, and the control solution results from solving the discrete-time algebraic Riccati equation [41], [42].
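To make the mutual information reward concrete, the following minimal sketch considers the simplest setting in which it has a closed form: a scalar linear-Gaussian target state with measurement z = h x + noise. In that case the mutual information between state and measurement equals half the log-ratio of the prior to the posterior (Kalman-updated) variance. The function names, the one-dimensional geometry, and the range-dependent noise model are illustrative assumptions and do not come from any of the cited works.

```python
import math


def kalman_update_variance(prior_var, h, meas_var):
    """Posterior variance after a scalar Kalman update with z = h*x + noise."""
    gain = prior_var * h / (h * h * prior_var + meas_var)
    return (1.0 - gain * h) * prior_var


def mutual_information(prior_var, h, meas_var):
    """I(x; z) in nats for the scalar linear-Gaussian model.

    Equals 0.5 * ln(prior variance / posterior variance).
    """
    post_var = kalman_update_variance(prior_var, h, meas_var)
    return 0.5 * math.log(prior_var / post_var)


def best_action(prior_var, target_pos, sensor_pos, actions, base_var=1.0):
    """Myopic (one-step) action choice for a sensor moving along a line.

    Hypothetical noise model (an assumption for illustration only):
    the measurement variance grows with the squared sensor-to-target range.
    """
    def reward(action):
        rng = abs(target_pos - (sensor_pos + action))
        return mutual_information(prior_var, 1.0, base_var * (1.0 + rng * rng))

    return max(actions, key=reward)


if __name__ == "__main__":
    print(mutual_information(4.0, 1.0, 1.0))
    print(best_action(4.0, 10.0, 0.0, [-1.0, 0.0, 1.0]))
```

Under this assumed noise model the most informative action is simply the one that closes range. For non-Gaussian belief states no such closed form is available, and the reward has to be approximated numerically.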
In all other cases, the policies must be determined using approximate online dynamic programming techniques for infinite dimensional state spaces, such as model predictive control (MPC) or stochastic rollout [37, Chapter 6.4-5]. For these techniques, achieving real-time implementation of the control policy involves an application-specific treatment of computational complexity.

Fig. 1. Typical estimation and control architecture used for multi-target tracking. Network communication interfaces shown for decentralized operation. Interfaces shown as gray boxes. Algorithms shown as blue boxes.

III. RELATED WORK IN NON-URBAN ENVIRONMENTS

A. Myopic Control of Mobile Sensors

Ristic and Vo in [43] used the RFS formalism [39] to derive a myopic sensor control policy for a single integrator (i.e., velocity controlled) plant. The tracking algorithm was a particle filter approximation of the multi-target Bayes filter. This controller used the Rényi divergence between the predicted and the future expected multi-object posterior after obtaining measurements from range-only mobile sensors. Ristic et al. provided a similar myopic scheme in [44] for range-only tracking, but specialized the multi-object Rényi divergence of [43] to the more computationally tractable probability hypothesis density (PHD) filter [45].

Gostar et al. in [46] leveraged the Cauchy-Schwarz divergence for Poisson point processes [47] to derive myopic sensor control policies for a sequential Monte Carlo implementation of the labeled multi-Bernoulli (LMB) filter. To maximize computational efficiency, the Cauchy-Schwarz divergence between the PHDs of the LMB filter’s predict and update steps (per control action) was used as the reward function. This reward was efficiently realized by evaluating the difference between each predicted and updated particle system’s weights, scaled by each target’s probability of existence.
A similar approach was applied by Gostar et al. to the cardinality-balanced multi-target multi-Bernoulli (CB-MeMBer) filter in [48]. To further accelerate computation, both [46] and [48] applied certainty equivalent (i.e., noiseless) measurement models when predicting future filter states per control action.

Koohifar et al. in [49] provided a single sensor myopic control policy based on the steepest descent direction of the predicted posterior Cramér-Rao lower bound (PCRLB). This policy generalized their previous work in [50] by deriving the sensor likelihoods for an RSSI-only measurement. The RSSI measurement likelihood further modeled packet transmission statistics via a Bernoulli process. The plant model was assumed velocity controllable, where the heading at each control step was chosen from a fixed quantized set.

Hoffman and Tomlin in [51] leveraged a particle filtering solution and distributed myopic control policy for a constrained double integrator (i.e., acceleration controlled) plant using bearing-only or range-only measurements. The plant was designed to model the STARMAC quadcopter [52] moving at slow speeds. The reward function was the mutual information between the predicted future target states and measurements. To maintain computational tractability, mutual information was evaluated using either local contributions per node, or limited pair-wise contributions between nodes.

Dames and Kumar in [53] leveraged the PHD filter to construct a distributed sensor control scheme for indoor unmanned ground vehicles (UGVs). In contrast to [51], the policy used the mutual information between the predicted target states and the binary event of an agent observing an empty measurement set at subsequent time steps. Decentralized estimation and control was achieved by transitioning sensors through operating modes, and organizing them into smaller sub-groups where their PHD states were directly synchronized.

Meyer et al.
in [54] proposed a myopic gradient descent algorithm for a class of decentralized multi-target tracking algorithms based on loopy belief propagation (BP). The cost function used was the conditional entropy of the target states at the next time step given the expected sensor measurements at the next time step. The relevant BP messages and conditional entropy gradient were approximated using multiple particle systems under perfect knowledge of the number of targets and the target-to-measurement association. Although the simulations presented in [54] were for single integrator dynamics, the corresponding technique is general enough to accommodate non-linear sensor and target dynamic models.

Chung et al. in [55] proposed a decentralized myopic control strategy based on memoryless, zero cross-covariance, track-to-track fusion. The individual sensors provided range-bearing measurements, which were fused in a decentralized manner using the Kalman equations. The reward function was the determinant of the fused covariance matrix, which is directly related to the entropy of the fused target state estimate. The reward gradient was derived, and consisted of a sum of per-sensor reward gradients. As a result, the control policy was the same for each sensor based on its state, sensor model, and fused target covariance estimates. A similar controller was derived for the case where imperfect communications contribute to additional errors in the fused estimates.

B. Non-myopic Control of Mobile Sensors

Beard et al. in [56] used the generalized labeled multi-Bernoulli (GLMB) filter to apply the Cauchy-Schwarz divergence for Poisson point processes [47] as the sensor control reward function. The authors also proposed the use of RFS void probabilities to achieve collision avoidance with targets. An example of controlling a single range-bearing sensor tracking multiple targets under measurement origin uncertainty was presented.
A finite horizon controller was simulated that assumed a constant velocity plant with instantaneously controllable heading. Closed form equations of the Cauchy-Schwarz divergence for the case when the GLMB single object posterior densities are modeled as Gaussian mixtures were provided. The derivations in [56] for the GLMB can also be applied to the LMB filter as a special case, but exhibit higher complexity than those described by Gostar in [46], [48]. Implementation details of these control techniques, including pseudo-code, can be found in [57].

Dames and Kumar in [58] demonstrated a non-myopic tracking and control solution on real UGVs. The tracking and control algorithms included a particle filter implementation of the PHD filter and an online estimate of mutual information. Similar to [53], the non-myopic policy was achieved by evaluating the mutual information against the potential of observing empty measurement sets. Receding horizon control was achieved through a combination of efficient action-set generation and adaptive sequential information planning [59].

Atanasov et al. in [60] proposed a reduced value iteration (RVI) algorithm demonstrated on a target-linearized range-bearing measurement model for a single sensor. An important distinction was that the relationship between platform state and the observed measurements remained non-linear. The specific technique did not require linearized sensor platform dynamics, and as such, it was demonstrated in simulation for a single target tracking scenario under differential drive dynamics. This RVI algorithm was later generalized by Schlotfeldt et al. in [61] to an anytime planning algorithm. The resulting technique, denoted Anytime-RVI (ARVI), was decentralized and tested on a set of quadcopters attempting to localize ground-based robots using range and bearing estimates.

Ragi and Chong in [62] assumed linear-Gaussian state and measurement dynamics and applied a joint probabilistic data association (JPDA) tracker [38]. The sensor control technique used in this paper is known as nominal belief-state optimization (NBO) [63]. NBO is a POMDP approach that assumes the associated belief states (i.e., per target posteriors) are completely characterized by a normal distribution (presumably through a Kalman update). A certainty-equivalent principle was applied to remove the expectation across belief states. Both single and multi-step lookahead rollout approaches were provided. The approach in [62] additionally considered forward acceleration thrust and heading dynamics for the platform under wind force disturbances. Inter-agent collision and obstacle avoidance constraints were considered by including a scaled regularization parameter in the cost-to-go function.

Grocholsky et al. in [64] assumed a fixed wing aircraft with constant forward velocity and controllable yaw rate to implement a decentralized control rule for bearing-only sensors. Decentralized data fusion was achieved by leveraging the information form of the Kalman filter [65]. The control law used the expected mutual information gain of the information matrix at the beginning and end of a finite lookahead window. This law was made computationally feasible by linearizing the measurement and sensor state evolution dynamics and solving the resulting linear-quadratic-Gaussian (LQG) optimal control problem.

IV. CHALLENGES SPECIFIC TO URBAN SURVEILLANCE

A. Terrain-aware Tracking and Sensor Control

The primary sensor control challenge in urban surveillance involves understanding how the terrain and building geometries affect tracking performance.
In a camera-based solution, for example, the observed measurements are AOAs where the probability of detection is dependent on the platform’s ability to maintain line-of-sight (LOS) on targets. A similar argument can be used for passive RF observations from low power transmitters, where the detectability of multi-path effects is negligible¹. If targets maneuver into a non-line-of-sight (NLOS) region to all sensors, the uncertainty on the target’s position and velocity increases due to the lack of measurement updates. Thus, platform maneuvers that keep as many targets within LOS to their corresponding sensors will lead to an increase in the mutual information between predicted future target states and measurements. Consequently, the number of sensors that have LOS to a given target and their sensing geometry in the LOS region is also important.

There are a number of related studies that provide tracking functionality for targets constrained to road networks. Ulmke and Koch in [67] describe a particle filtering technique for tracking a single target maneuvering on partially obstructed road networks. The authors showed that improved tracking performance results when conditioning the measurement detection process on LOS/NLOS information. Ulmke et al. extended their work in [67] to the RFS formalism in [68] using the Gaussian-mixture cardinalized PHD filter [69]. Within these efforts, the detectable regions were constrained by the road network as observed from an overhead sensor. In the general urban surveillance case, the sensors may not necessarily be overhead. Furthermore, many practical target types will not be constrained to road networks (e.g., unauthorized UAVs).

The conditioning of the sensor control policy on LOS/NLOS sensing regions as described above necessitates a non-parametric approximation of the belief state per target.
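As a rough illustration of conditioning the detection process on LOS/NLOS information, the sketch below tests whether the straight segment between a sensor and a target crosses any building, and returns a state-dependent probability of detection. It assumes a flat 2D map with axis-aligned rectangular building footprints and a detection probability that drops to zero in NLOS. Both are simplifying assumptions made here for illustration, and all names and parameter values are hypothetical.

```python
def segment_hits_box(p, q, box):
    """Return True if the segment p->q intersects an axis-aligned box.

    box is (xmin, ymin, xmax, ymax). Uses the slab (Liang-Barsky style) test:
    the segment is clipped against the x slab and then the y slab.
    """
    (x0, y0), (x1, y1) = p, q
    xmin, ymin, xmax, ymax = box
    t_enter, t_exit = 0.0, 1.0
    for start, delta, lo, hi in ((x0, x1 - x0, xmin, xmax),
                                 (y0, y1 - y0, ymin, ymax)):
        if delta == 0:
            # Segment is parallel to this slab: reject if it lies outside it.
            if start < lo or start > hi:
                return False
            continue
        t0, t1 = (lo - start) / delta, (hi - start) / delta
        if t0 > t1:
            t0, t1 = t1, t0
        t_enter, t_exit = max(t_enter, t0), min(t_exit, t1)
        if t_enter > t_exit:
            return False
    return True


def has_los(sensor, target, buildings):
    """True if no building footprint blocks the sensor-to-target segment."""
    return not any(segment_hits_box(sensor, target, b) for b in buildings)


def detection_probability(sensor, target, buildings, pd_los=0.9):
    """Hypothetical detection model: pd_los in LOS, zero in NLOS."""
    return pd_los if has_los(sensor, target, buildings) else 0.0


if __name__ == "__main__":
    buildings = [(2.0, 2.0, 4.0, 4.0)]
    print(detection_probability((0.0, 3.0), (6.0, 3.0), buildings))  # blocked
    print(detection_probability((0.0, 0.0), (6.0, 0.0), buildings))  # clear
```

In a particle-based tracker, a test of this kind would have to be evaluated per particle and per candidate sensor action, which gives a sense of why the ray-casting cost grows quickly with map fidelity and lookahead depth.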
Across all applications, these approximation techniques are computationally complex and make implementing non-myopic policies at high update rates very challenging. A number of point-based value iteration (PBVI) approaches [70]–[72] have been proposed to solve loosely related target surveillance problems [73]–[75]. This computational complexity is made worse by the requirement to perform moderate to high fidelity ray-casting under each sensor action to identify the LOS/NLOS regions. In order to maintain the computational complexity of the PBVI approaches, these shadowing computations necessitate some form of GPU-based acceleration from the computational geometry literature [76].

¹ Examples of localization in multi-path rich environments have been proposed based on pattern recognition techniques [66]. These are outside the scope of this review.

Another important consideration is the incorporation of safety-guaranteed operation with respect to inter-agent and obstacle collision avoidance. Some studies such as [62] attempt to address the collision avoidance constraints for sensor control by directly penalizing the reward function estimate when targets are too close to other agents or other obstacles. A central issue, however, is how the safety constraint penalization term should be weighted when estimating the discounted cost-to-go. A regularization weight that is too small may not be capable of preventing a collision under the assumed dynamics of the platform. Conversely, a penalization that is too large may over-constrain the system and unnecessarily degrade the optimality of the policy. A better solution would be to select a POMDP solver that is capable of guaranteeing satisfiability of the safety constraints given an accurate map of the environment and agent positions.
Minimum-norm controllers that modify the planned action from sensor control using safety barrier functions are one option [77], [78], but can potentially over-compensate for safety when the optimization reward is not quantifiable by a control Lyapunov function. Computationally tractable implementations of this technique also require a plant model that is affine in the control actions. Path planning and graph traversal techniques, such as A* [79] and RRT [80], provide another option, but require a discretization of the platform state space that may not be kinematically feasible. Relevant work by the controls community applying such graph search techniques to the trajectory planning problem is given in [81]–[83].

When digital surface models (DSMs) of the terrain and buildings are not available a priori, an online estimate is usually computed via a simultaneous localization and mapping (SLAM) technique. The use of online estimated map data necessitates sensor control robustness under uncertainty. That is, the LOS/NLOS sensor control techniques should be designed to maintain strong performance up to a pre-specified level of error in the estimated map data. Similarly, the obstacle avoidance techniques should guarantee collision avoidance up to the same pre-specified level of map error.

B. Control Space Fidelity for Quadcopters

A major contributing factor to the current interest in urban surveillance with mobile sensor networks is the abundance of commercially available UAVs. Quadcopters, for example, provide vertical takeoff and landing functionality in addition to high agility maneuvers. These platforms and their flexible APIs for flight control tasking [84]–[86] make real-time experimentation of mobile sensor network applications very attractive.
The sensor control methods that exist in the current state of the art, however, make overly simplifying assumptions in the platform kinematics to further reduce computational complexity (e.g., first or second order integrator dynamics). As a result, the platforms are forced to maneuver at slower velocities so that the actions generated by the sensor control algorithms are representative of the POMDP state dynamics. A more critical flaw in this approach is that, under the dynamics of the urban environment, platforms may attempt to delay necessary actions for maintaining collision-free flight until it is too late.

Quadcopter platform dynamics have been studied extensively by the controls and aeronautics communities [87]. The quadcopter is a six degree-of-freedom system consisting of position and orientation in 3D Euclidean space. However, it provides only four actuation points consisting of the total upward thrust force and the roll, pitch, and yaw moments. This makes the quadcopter underactuated, implying that its position and orientation cannot be accelerated in any arbitrary direction. Instead, translational and rotational acceleration are achieved by applying time-varying attitude control. A naive incorporation of these plant state and action dynamics under a fixed discretization in a POMDP-based sensor control algorithm is not computationally feasible.

Early attempts at quadcopter control applied small-angle approximation techniques to linearize the flight dynamics around the hover state [88]. An important finding was made by Mellinger and Kumar in [89], [90], where the quadcopter was determined to be differentially flat in terms of its 3D Euclidean position and yaw angle. Differential flatness of a system implies that the original states and inputs can be rewritten as algebraic functions of (potentially fewer) state variables and their derivatives.
These algebraic functions define a diffeomorphism that ensures any trajectories of sufficient smoothness in the flat outputs will be sufficiently smooth in the original state and control space. For the quadcopter, the highest degree derivative of the flat position outputs in their expressions for the original control inputs is four (i.e., trajectory snap). Similarly, the highest degree derivative of the flat yaw output in the expressions for the original control inputs is two (i.e., yaw acceleration).

Using this insight, Mellinger and Kumar [89], [90] provided a series of waypoint-based quadcopter trajectory generation techniques that minimize the control effort under trajectory snap and yaw acceleration (i.e., minimum snap trajectories). These waypoint generation methods assume a concatenation of piecewise polynomial functions that pass through pre-defined waypoints. The trajectory polynomial coefficients are obtained by solving a computationally tractable quadratic program (QP). Regulating the original state dynamics of the quadcopter according to this trajectory can then be achieved through the use of a backstepping controller [87], [91].

The key takeaway from the above discussion is that the accuracy of the quadcopter control space in a mobile sensor control algorithm can be maintained provided that the actions commanded to the platform generate trajectories that are smooth up to the fourth derivative of position and the second derivative of yaw. For sensor control with a quadcopter platform, a natural solution is to assume that the plant consists of a fourth order differential equation on the flat outputs, with an input consisting of the trajectory snap at each time step. This trajectory snap input is termed a motion primitive [81], [92]. Under polynomial trajectories, these motion primitives induce a resolution-complete discretization in the flat outputs. Sikang et al.
in [92] derive this discretization and suggest optimal search techniques for trajectory generation between waypoints using A* [79]. For dynamic environments, Sikang et al. in [92] proposed a receding horizon control technique based on Lifelong Planning A* (LPA*). The techniques presented in [93] were shown to provide collision avoidance guarantees between static and dynamic obstacles, and generate robust paths with respect to random platform disturbances.

C. Tracking and Control Algorithm Decentralization

Decentralized operation is a critical requirement of an urban surveillance system. As discussed in Section IV-A, the terrain and building geometries present a strongly RF shadowed propagation environment. This poses a significant challenge for inter-agent communication, and thus renders centralized tracking and sensor control techniques impractical.

In general, decentralization of multi-target tracking and sensor control algorithms is very challenging. The BP tracking approaches discussed by Meyer et al. in [24] provide an intuitive framework for performing average consensus on the relevant belief state parameters with one-hop neighbors. For particle filtering approaches to the multi-target tracking problem, the BP approaches are decentralized using a consensus-over-weights approach [94]. Consensus-over-weights assumes that the particle systems sampled at each agent are identical, which necessitates the use of synchronized random number generators. Likelihood consensus [95] is a slightly less restrictive approach that overcomes the need for synchronized random number generators by projecting the sensor likelihood functions onto a common set of basis functions. Other alternatives to the consensus-over-weights scheme include fusion via Gaussian mixture approximations [96] and kernel-based methods [97].
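The average consensus step with one-hop neighbors mentioned above can be sketched with the standard linear consensus iteration, in which each agent repeatedly nudges its local value toward the values of its neighbors. The sketch assumes a connected, undirected communication graph and a scalar quantity per agent (for example, a local log-likelihood contribution). The function name, the example graph, and the step size are illustrative assumptions and are not taken from [24] or [94]. For this iteration to converge, the step size should be smaller than the reciprocal of the largest node degree.

```python
def average_consensus(values, neighbors, step=0.3, iterations=50):
    """Linear average consensus over an undirected graph.

    values:    dict mapping agent id -> local scalar value
    neighbors: dict mapping agent id -> list of one-hop neighbor ids
    Each round, every agent uses only its neighbors' previous values.
    """
    x = dict(values)
    for _ in range(iterations):
        # Synchronous update: the comprehension reads the old x throughout.
        x = {i: x[i] + step * sum(x[j] - x[i] for j in neighbors[i])
             for i in x}
    return x


if __name__ == "__main__":
    # Hypothetical four-agent line network: 0 - 1 - 2 - 3.
    local_values = {0: 1.0, 1: 2.0, 2: 3.0, 3: 6.0}
    graph = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
    print(average_consensus(local_values, graph, iterations=200))
```

Because the update is symmetric, the network-wide sum is preserved at every round, so all agents approach the global average without any agent ever seeing more than its one-hop neighborhood.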
Although these techniques work well when decentralizing target tracking algorithms, it is difficult to apply them to mobile sensor control policies for the urban environment. Since the techniques suggested in Section IV-A necessitate an online simulation-based approach, it is not immediately clear how a consensus algorithm should be constructed. One approach to circumvent this challenge is to implement the centralized mobile sensor control policy in a high-fidelity simulation and perform imitation learning to generate a decentralized policy. Imitation learning is a variation of reinforcement learning, where the goal is to make observations on a set of oracle control decisions and determine a non-parametric representation of the policy [98]. This type of learning has been applied regularly in robotics to learn specific robotic manipulator movements via kinesthetic examples [99]–[101]. In these efforts, a convolutional neural network (CNN) is commonly used as the approximation architecture for the state-action value function. A recent study by Gama et al. [102] has shown how the convolution and pooling operations used in CNNs can be generalized to support learning with signals supported over graphs. The resulting learning architecture, termed a graph neural network (GNN), may be capable of supporting an imitation learning procedure in which the one-hop features relevant to consensus are treated as signals supported over the communication network graph. A more thorough investigation of imitation learning of decentralized policies from centralized ones using GNNs is an ongoing and open area of research.

V. CONCLUSION

In this paper, we presented an overview of the mobile sensor control problem for multi-target tracking with a specific emphasis on urban surveillance problems.
In addition to providing a brief background on the sensor control POMDP formulation, we provided a detailed literature review of the current state of the art and suggested three challenge areas that have yet to receive considerable attention by the community. These three areas were terrain-aware tracking and sensor control, control space fidelity for quadcopters, and joint estimation and control algorithm decentralization. A number of these challenges are addressed separately in the information fusion, tracking, and control communities. We suggest a coordinated effort amongst these communities in order to arrive at solutions that are capable of addressing these challenges together in a computationally tractable and bandwidth-efficient manner.

REFERENCES

[1] Z. Chen, W. Liao, B. Xu, H. Liu, Q. Li, H. Li, C. Xiao, H. Zhang, Y. Li, W. Bao, and D. Yang, “Object tracking over a multiple-camera network,” in IEEE International Conference on Multimedia Big Data, 2015.
[2] L. Hou, W. Wan, K. Han, R. Muhammad, and M. Yang, “Human detection and tracking over camera networks: A review,” in International Conference on Audio, Language and Image Processing (ICALIP), 2016.
[3] L. Anuj and M. T. G. Krishna, “Multiple camera based multiple object tracking under occlusion: A survey,” in International Conference on Innovative Mechanisms for Industry Applications (ICIMIA), 2017.
[4] A. Y. Yang, S. Maji, C. M. Christoudias, T. Darrell, J. Malik, and S. S. Sastry, “Multiple-view object recognition in band-limited distributed camera networks,” in Third ACM/IEEE International Conference on Distributed Smart Cameras, 2009.
[5] D. Poullin, “Countering illegal UAV flights: Passive DVB radar potentiality,” in 19th International Radar Symposium (IRS), 2018.
[6] S. Jovanoska, M. Brötje, and W.
Koch, “Multisensor data fusion for UAV detection and tracking,” in 19th International Radar Symposium (IRS), 2018.
[7] S. R. Ganti and Y. Kim, “Implementation of detection and tracking mechanism for small UAS,” in International Conference on Unmanned Aircraft Systems (ICUAS), June 2016.
[8] W. D. Watson, “3D active and passive geolocation and tracking of unmanned aerial systems,” in IEEE International Symposium on Technologies for Homeland Security (HST), 2017.
[9] W. D. Watson and T. McElwain, “4D CAF for localization of co-located, moving, and RF coincident emitters,” in IEEE Military Communications Conference (MILCOM), 2016.
[10] P. Scerri, R. Glinton, S. Owens, D. Scerri, and K. Sycara, “Geolocation of RF emitters by many UAVs,” in AIAA Infotech@Aerospace Conference and Exhibit, 2007.
[11] A. O. Hero and D. Cochran, “Sensor management: Past, present, and future,” IEEE Sensors Journal, vol. 11, no. 12, pp. 3064–3074, December 2011.
[12] G. W. Ng and K. H. Ng, “Sensor management – What, why and how,” Information Fusion, vol. 1, no. 2, pp. 67–75, 2000.
[13] V. Krishnamurthy and R. J. Evans, “Hidden Markov model multiarm bandits: A methodology for beam scheduling in multitarget tracking,” IEEE Transactions on Signal Processing, vol. 49, no. 12, pp. 2893–2908, December 2001.
[14] D. J. Kershaw and R. J. Evans, “Optimal waveform selection for tracking systems,” IEEE Transactions on Information Theory, vol. 40, no. 5, pp. 1536–1550, September 1994.
[15] S. P. Sira, Y. Li, A. Papandreou-Suppappola, D. Morrell, D. Cochran, and M. Rangaswamy, “Waveform-agile sensing for tracking,” IEEE Signal Processing Magazine, vol. 26, no. 1, pp. 53–64, January 2009.
[16] K. Ramya, K. P. Kumar, and V. S. Rao, “A survey on target tracking techniques in wireless sensor networks,” International Journal of Computer Science and Engineering Survey, vol. 3, no. 4, 2012.
[17] N. Cao, S. Choi, E.
Masazade, and P. K. Varshney, “Sensor selection for target tracking in wireless sensor networks with uncertainty,” IEEE Transactions on Signal Processing, vol. 64, no. 20, pp. 5191–5204, July 2016.
[18] D. Guo and X. Wang, “Dynamic sensor collaboration via sequential Monte Carlo,” Journal on Selected Areas in Communications, vol. 22, pp. 1037–1047, August 2004.
[19] L. Zuo, R. Niu, and P. K. Varshney, “A sensor selection approach for target tracking in sensor networks with quantized measurements,” in IEEE International Conference on Acoustics, Speech and Signal Processing, 2008.
[20] E. Masazade and P. K. Varshney, “A market based dynamic bit allocation scheme for target tracking in wireless sensor networks,” in IEEE International Conference on Acoustics, Speech and Signal Processing, 2013.
[21] R. Niu, A. Vempaty, and P. K. Varshney, “Received-signal-strength-based localization in wireless sensor networks,” Proceedings of the IEEE, vol. 106, no. 7, pp. 1166–1182, June 2018.
[22] O. Hlinka, F. Hlawatsch, and P. M. Djuric, “Distributed particle filtering in agent networks: A survey, classification, and comparison,” IEEE Signal Processing Magazine, vol. 30, no. 1, pp. 61–81, January 2013.
[23] O. Hlinka, F. Hlawatsch, and P. M. Djuric, “Consensus-based distributed particle filtering with distributed proposal adaptation,” IEEE Transactions on Signal Processing, vol. 62, no. 12, pp. 3029–3041, June 2014.
[24] F. Meyer, T. Kropfreiter, J. L. Williams, R. Lau, F. Hlawatsch, P. Braca, and M. Z. Win, “Message passing algorithms for scalable multitarget tracking,” Proceedings of the IEEE, vol. 106, no. 2, pp. 221–259, February 2018.
[25] E. J. Msechu, S. I. Roumeliotis, A. Ribeiro, and G. B. Giannakis, “Decentralized quantized Kalman filtering with scalable communication cost,” IEEE Transactions on Signal Processing, vol. 56, no. 8, pp. 3727–3741, August 2008.
[26] A. Ribeiro, G. B. Giannakis, and S. I.
Roumeliotis, “SOI-KF: Distributed Kalman filtering with low-cost communications using the sign of innovations,” IEEE Transactions on Signal Processing, vol. 54, no. 12, pp. 4782–4795, December 2006.
[27] L. Zuo, “Conditional posterior Cramer-Rao lower bound and distributed target tracking in sensor networks,” Ph.D. dissertation, Syracuse University, 2010.
[28] A. Efrat, S. Har-Peled, and J. S. B. Mitchell, “Approximation algorithms for two optimal location problems in sensor networks,” in 2nd International Conference on Broadband Networks, 2005.
[29] Z. Bojkovic and B. Bakmaz, “A survey on wireless sensor networks deployment,” WSEAS Transactions on Communications, vol. 7, no. 12, pp. 1172–1181, 2008.
[30] R. V. Kulkarni and G. K. Venayagamoorthy, “Particle swarm optimization in wireless-sensor networks: A brief survey,” IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews), vol. 41, no. 2, pp. 262–267, March 2011.
[31] F. Domingo-Perez, J. L. Lazaro-Galilea, I. Bravo, E. Martin-Gorostiza, D. Salido-Monzu, A. Llana, and F. Govaers, “Sensor deployment for motion trajectory tracking with a genetic algorithm,” in IEEE International Conference on Industrial Technology (ICIT), 2015.
[32] J. Hu, J. Song, M. Zhang, and X. Kang, “Topology optimization for urban traffic sensor network,” Tsinghua Science and Technology, vol. 13, no. 2, pp. 229–236, April 2008.
[33] A. Singh, A. Krause, C. Guestrin, and W. J. Kaiser, “Efficient informative sensing using multiple robots,” Journal of Artificial Intelligence Research, vol. 34, pp. 707–755, 2009.
[34] S. Ross, “Chapter 4.2: Chapman-Kolmogorov equations,” in Introduction to Probability Models, 11th ed. Academic Press, 2014.
[35] Z. Chen, “Bayesian filtering: From Kalman filters to particle filters, and beyond,” Statistics, vol. 182, no. 1, pp. 1–69, 2003.
[36] D. P.
Bertsekas, Dynamic Programming and Optimal Control: Approximate Dynamic Programming. Athena Scientific, 2012.
[37] D. P. Bertsekas, Dynamic Programming and Optimal Control. Athena Scientific, 2017.
[38] Y. Bar-Shalom and T. E. Fortmann, Tracking and Data Association. Academic Press, 1988.
[39] R. Mahler, Advances in Statistical Multisource-Multitarget Information Fusion. Artech House, 2014.
[40] T. Cover and J. Thomas, Elements of Information Theory, 2nd ed. Wiley-Interscience, 2006.
[41] T. T. Georgiou and A. Lindquist, “The separation principle in stochastic control, redux,” IEEE Transactions on Automatic Control, vol. 58, no. 10, pp. 2481–2494, October 2013.
[42] K. J. Åström, Introduction to Stochastic Control Theory. Courier Corporation, 2012.
[43] B. Ristic and B.-N. Vo, “Sensor control for multi-object state-space estimation using random finite sets,” Automatica, vol. 46, no. 11, pp. 1812–1818, 2010.
[44] B. Ristic, B.-N. Vo, and D. Clark, “A note on the reward function for PHD filters with sensor control,” IEEE Transactions on Aerospace and Electronic Systems, vol. 47, no. 2, pp. 1521–1529, April 2011.
[45] B.-N. Vo, S. Singh, and A. Doucet, “Sequential Monte Carlo methods for multitarget filtering with random finite sets,” IEEE Transactions on Aerospace and Electronic Systems, vol. 41, no. 4, pp. 1224–1245, 2005.
[46] A. K. Gostar, R. Hoseinnezhad, T. Rathnayake, X. Wang, and A. Bab-Hadiashar, “Constrained sensor control for labeled multi-Bernoulli filter using Cauchy-Schwarz divergence,” IEEE Signal Processing Letters, vol. 24, no. 9, pp. 1313–1317, September 2017.
[47] H. G. Hoang, B. Vo, B. Vo, and R. Mahler, “The Cauchy–Schwarz divergence for Poisson point processes,” IEEE Transactions on Information Theory, vol. 61, no. 8, pp. 4475–4485, August 2015.
[48] A. K. Gostar, R. Hoseinnezhad, and A.
Bab-Hadiashar, “Multi-Bernoulli sensor control using Cauchy-Schwarz divergence,” in 19th International Conference on Information Fusion (FUSION), July 2016, pp. 651–657.
[49] F. Koohifar, I. Guvenc, and M. L. Sichitiu, “Autonomous tracking of intermittent RF source using a UAV swarm,” IEEE Access, vol. 6, pp. 15 884–15 897, 2018.
[50] F. Koohifar, A. Kumbhar, and I. Guvenc, “Receding horizon multi-UAV cooperative tracking of moving RF source,” IEEE Communications Letters, vol. 21, no. 6, pp. 1433–1436, 2017.
[51] G. M. Hoffmann and C. J. Tomlin, “Mobile sensor network control using mutual information methods and particle filters,” IEEE Transactions on Automatic Control, vol. 55, no. 1, pp. 32–47, January 2010.
[52] G. Hoffmann, H. Huang, S. Waslander, and C. Tomlin, “Quadrotor helicopter flight dynamics and control: Theory and experiment,” in AIAA Guidance, Navigation and Control Conference and Exhibit, 2007.
[53] P. Dames and V. Kumar, “Cooperative multi-target localization with noisy sensors,” in IEEE International Conference on Robotics and Automation (ICRA), 2013.
[54] F. Meyer, H. Wymeersch, M. Frohle, and F. Hlawatsch, “Distributed estimation with information-seeking control in agent networks,” IEEE Journal on Selected Areas in Communications, vol. 33, no. 11, pp. 2439–2456, 2015.
[55] T. H. Chung, J. W. Burdick, and R. M. Murray, “A decentralized motion control strategy for dynamic target tracking,” in IEEE International Conference on Robotics and Automation (ICRA), 2006.
[56] M. Beard, B. Vo, B. Vo, and S. Arulampalam, “Void probabilities and Cauchy–Schwarz divergence for generalized labeled multi-Bernoulli models,” IEEE Transactions on Signal Processing, vol. 65, no. 19, pp. 5047–5061, October 2017.
[57] M. Beard, “Estimation and control of multi-object systems with high-fidelity sensor models: A labelled random finite set approach,” Ph.D. dissertation, Curtin University, June 2016.
[58] P.
Dames and V. Kumar, “Autonomous localization of an unknown number of targets without data association using teams of mobile sensors,” IEEE Transactions on Automation Science and Engineering, vol. 12, no. 3, pp. 850–864, July 2015.
[59] B. Charrow, N. Michael, and V. Kumar, “Active control strategies for discovering and localizing devices with range-only sensors,” in Workshop on the Algorithmic Foundations of Robotics, 2014, pp. 55–71.
[60] N. Atanasov, J. L. Ny, K. Daniilidis, and G. J. Pappas, “Information acquisition with sensing robots: Algorithms and error bounds,” in IEEE International Conference on Robotics and Automation (ICRA), 2014.
[61] B. Schlotfeldt, D. Thakur, N. Atanasov, V. Kumar, and G. J. Pappas, “Anytime planning for decentralized multirobot active information gathering,” IEEE Robotics and Automation Letters, vol. 3, no. 2, pp. 1025–1032, 2018.
[62] S. Ragi and E. K. P. Chong, “UAV path planning in a dynamic environment via partially observable Markov decision process,” IEEE Transactions on Aerospace and Electronic Systems, vol. 49, no. 4, pp. 2397–2412, October 2013.
[63] S. A. Miller, Z. A. Harris, and E. K. P. Chong, “Coordinated guidance of autonomous UAVs via nominal belief-state optimization,” in American Control Conference, 2009.
[64] B. Grocholsky, A. Makarenko, and H. Durrant-Whyte, “Information-theoretic coordinated control of multiple sensor platforms,” in IEEE International Conference on Robotics and Automation (ICRA), 2003.
[65] J. Manyika and H. Durrant-Whyte, Data Fusion and Sensor Management: An Information-Theoretic Approach. Prentice Hall, 1994.
[66] E. Tsalolikhin, I. Bilik, and N. Blaunstein, “A single-base-station localization approach using a statistical model of the NLOS propagation conditions in urban terrain,” IEEE Transactions on Vehicular Technology, vol. 60, no. 3, pp. 1124–1137, March 2011.
[67] M. Ulmke and W.
Koch, “Road-map assisted ground moving target tracking,” IEEE Transactions on Aerospace and Electronic Systems, vol. 42, no. 4, pp. 1264–1274, October 2006.
[68] M. Ulmke, O. Erdinc, and P. Willett, “GMTI tracking via the Gaussian mixture cardinalized probability hypothesis density filter,” IEEE Transactions on Aerospace and Electronic Systems, vol. 46, no. 4, pp. 1821–1833, October 2010.
[69] B. Vo, B. Vo, and A. Cantoni, “Analytic implementations of the cardinalized probability hypothesis density filter,” IEEE Transactions on Signal Processing, vol. 55, no. 7, pp. 3553–3567, July 2007.
[70] H. Kurniawati, D. Hsu, and W. S. Lee, “SARSOP: Efficient point-based POMDP planning by approximating optimally reachable belief spaces,” in Robotics: Science and Systems, 2008.
[71] T. Smith and R. G. Simmons, “Point-based POMDP algorithms: Improved analysis and implementation,” in 21st Conference on Uncertainty in Artificial Intelligence (UAI), 2005.
[72] M. Kochenderfer, Decision Making Under Uncertainty: Theory and Application. MIT Press, 2015.
[73] M. Egorov, M. J. Kochenderfer, and J. J. Uudmae, “Target surveillance in adversarial environments using POMDPs,” in 13th AAAI Conference on Artificial Intelligence, 2016.
[74] D. Hsu, W. S. Lee, and N. Rong, “A point-based POMDP planner for target tracking,” in IEEE International Conference on Robotics and Automation (ICRA), 2008.
[75] R. He, A. Bachrach, and N. Roy, “Efficient planning under uncertainty for a target-tracking micro-aerial vehicle,” in IEEE International Conference on Robotics and Automation, 2010.
[76] L. J. Tomczak, “GPU ray marching of distance fields,” Ph.D. dissertation, Technical University of Denmark, 2012.
[77] A. D. Ames, X. Xu, J. W. Grizzle, and P. Tabuada, “Control barrier function based quadratic programs for safety critical systems,” IEEE Transactions on Automatic Control, vol. 62, no. 8, pp. 1–17, 2016.
[78] R. A. Freeman and P. V.
Kokotovic, “Inverse optimality in robust stabilization,” SIAM Journal on Control and Optimization, vol. 34, no. 4, pp. 1365–1391, 1996.
[79] P. E. Hart, N. J. Nilsson, and B. Raphael, “A formal basis for the heuristic determination of minimum cost paths,” IEEE Transactions on Systems Science and Cybernetics, vol. 4, no. 2, pp. 100–107, 1968.
[80] S. M. LaValle and J. J. Kuffner Jr., “Randomized kinodynamic planning,” The International Journal of Robotics Research (IJRR), vol. 20, no. 5, pp. 378–400, 2001.
[81] S. Liu, “Motion planning for aerial vehicles,” Ph.D. dissertation, University of Pennsylvania, 2018.
[82] J. Fink, A. Ribeiro, and V. Kumar, “Robust control for mobility and wireless communication in cyber–physical systems with application to robot teams,” Proceedings of the IEEE, vol. 100, no. 1, pp. 164–178, January 2012.
[83] J. Fink, A. Ribeiro, and V. Kumar, “Robust control of mobility and communications in autonomous robot teams,” IEEE Access, vol. 1, pp. 290–309, 2013.
[84] Dronecode. Micro Air Vehicle Communication Protocol (MAVLINK). [Online]. Available: https://mavlink.io/en/
[85] Dà-Jiāng Innovations (DJI). DJI Onboard-SDK. [Online]. Available: https://developer.dji.com/onboard-sdk/
[86] Open Source Robotics Foundation. Robot Operating System (ROS). [Online]. Available: http://www.ros.org/
[87] N. Michael, D. Mellinger, Q. Lindsey, and V. Kumar, “The GRASP multiple micro-UAV testbed,” IEEE Robotics and Automation Magazine, vol. 17, no. 3, pp. 56–65, 2010.
[88] G. Hoffmann, S. Waslander, and C. Tomlin, “Quadrotor helicopter trajectory tracking control,” in AIAA Guidance, Navigation and Control Conference and Exhibit, 2008.
[89] D. Mellinger and V. Kumar, “Minimum snap trajectory generation and control for quadrotors,” in IEEE International Conference on Robotics and Automation, 2011.
[90] D. Mellinger, N. Michael, and V.
Kumar, “Trajectory generation and control for precise aggressive maneuvers with quadrotors,” The International Journal of Robotics Research, vol. 31, no. 5, pp. 664–674, 2012.
[91] T. Lee, M. Leok, and N. H. McClamroch, “Geometric tracking control of a quadrotor UAV on SE(3),” in 49th IEEE Conference on Decision and Control (CDC), 2010.
[92] S. Liu, N. Atanasov, K. Mohta, and V. Kumar, “Search-based motion planning for quadrotors using linear quadratic minimum time control,” in IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2017.
[93] S. Liu, K. Mohta, N. Atanasov, and V. Kumar, “Towards search-based motion planning for micro aerial vehicles,” in International Conference on Robotics and Automation (ICRA), 2019.
[94] S. Farahmand, S. I. Roumeliotis, and G. B. Giannakis, “Set-membership constrained particle filter: Distributed adaptation for sensor networks,” IEEE Transactions on Signal Processing, vol. 59, no. 9, pp. 4122–4138, 2011.
[95] O. Hlinka, O. Sluciak, F. Hlawatsch, P. M. Djuric, and M. Rupp, “Likelihood consensus and its application to distributed particle filtering,” IEEE Transactions on Signal Processing, vol. 60, no. 8, pp. 4334–4349, 2012.
[96] J. Li and A. Nehorai, “Distributed particle filtering via optimal fusion of Gaussian mixtures,” IEEE Transactions on Signal and Information Processing over Networks, vol. 4, no. 2, pp. 280–292, June 2018.
[97] O. Tslil, O. Aharon, and A. Carmi, “Distributed estimation using particles intersection,” in 21st International Conference on Information Fusion (FUSION), 2018.
[98] S. Schaal, A. Ijspeert, and A. Billard, “Computational approaches to motor learning by imitation,” Philosophical Transactions of the Royal Society of London. Series B: Biological Sciences, vol. 358, no. 1431, pp. 537–547, 2003.
[99] J. Kober and J. Peters, “Imitation and reinforcement learning,” IEEE Robotics and Automation Magazine, vol. 17, no. 2, pp.
55–62, June 2010.
[100] A. Pervez, Y. Mao, and D. Lee, “Learning deep movement primitives using convolutional neural networks,” in IEEE-RAS 17th International Conference on Humanoid Robotics (Humanoids), 2017.
[101] C. Zhang, H. Zhang, and L. E. Parker, “Feature space decomposition for effective robot adaptation,” in IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2015.
[102] F. Gama, A. G. Marques, G. Leus, and A. Ribeiro, “Convolutional neural network architectures for signals supported on graphs,” IEEE Transactions on Signal Processing, vol. 67, no. 4, pp. 1034–1049, February 2019.