Autonomous Tracking of Intermittent RF Source Using a UAV Swarm

1 Autonomous T racking of Intermittent RF Source Using a U A V Swarm Farshad K oohifar , Ismail Guvenc, and Mihail L. Sichitiu Department of Electrical and Computer Engineering, North Carolina State Univ ersity , Raleigh, NC Email: { fkoohif, iguvenc, mlsichit } @ncsu.edu Abstract —Localization of a radio frequency (RF) transmitter with intermittent transmissions is considered via a gr oup of unmanned aerial vehicles (U A V s) equipped with omnidirectional recei ved signal strength (RSS) sensors. This group embarks on an autonomous patrol to localize and track the target with a speciﬁed accuracy , as quickly as possible. The challenge can be decomposed into two stages: 1) estimation of the target position given previous measurements (localization), and 2) planning the future trajectory of the tracking U A Vs to get lower expected localization error given current estimation (path planning). For each stage we compare two algorithms in terms of performance and computational load. For the localization stage, we compare a detection based extended Kalman ﬁlter (EKF) and a recursi ve Bayesian estimator . For the path planning stage, we compare steepest descent posterior Cramer -Rao lower bound (CRLB) path planning and a bio-inspir ed heuristic path planning . Our r esults show that the steepest descent path planning outperf orms the bio- inspired path planning by an order of magnitude, and recursive Bayesian estimator narr owly outperf orms detection based EKF . Index T erms —Cramer Rao lower bound, drone, Fisher infor - mation, intermittent transmitter , jammer , localization, steepest descent, tracking, U A V . I . I N T R O D U C T I O N Enabled by miniaturization of wide variety of sensors and communication modems, unmanned aerial vehicles (U A Vs), also colloquially known as drones, are increasingly used for broadband communications, situational a wareness, and localization [1]–[6]. Detecting, localizing, and tracking unau- thorized U A Vs hav e been identiﬁed as of utmost priority both in military and ci vilian settings [7]–[10]. In particular , such U A Vs may be used by malicious entities for jamming critical communication links [7] or to collect/hack data from a critical infrastructure, or they may simply be controlled by amateur drone users which may still introduce threats especially if ﬂying near unauthorized areas [10]. In order to interdict a target U A V , multiple U A Vs may need to approach close to the vicinity of the target U A V to launch cyber (e.g. deauthentication [11]) or physical (e.g. using a net to catch the target [12], [13]) attacks. A v ariety of sensors on U A Vs can be used for localizing and tracking targets, including visual sensors [14], radio frequency (RF) angle of arriv al sensors [15], RF time difference of arriv al [16], and receiv ed signal strength indicator (RSSI) sensors [17]. In our previous work, we have sho wn that a U A V swarm can accurately localize an RF transmitting target, using inex- pensiv e, omnidirectional RSSI sensors [18]. Howe ver not all targets transmit continuously , and often, frequency hopping is Tr a c k i n g  Drones Ta r g e t  D r o n e Intermit tentSignalsT ransmittedfr omtheT ar getDrone Ta r g e t  D r o n e  Tr a j e c t o r y Fig. 1: A swarm of UA Vs collaborating to localize and track an intermittently transmitting target U A V . The intermittent signals may be jamming signal, video/telemetry signal, or control signal to a ground station. used in drone communication link to improv e reliability and security . Hence, RSSI observations may be only intermittently av ailable, which makes the localization of the target UA V more challenging. In this paper we consider a group of U A Vs that are autonomously patrolling a zone of interest in order to ﬁnd an unauthorized intermittent RF transmitter as illustrated in Fig. 1. The swarm tries to localize the target using an estimation algorithm, and moves the U A Vs to get a better measurement in the next step using a path planner algorithm. Fig. 2 represents an ov erall vie w of the collaborati ve tracking process, in which each ne w target location estimate is used to update the path planning. W e compare two different path planning algorithms, namely steepest descent posterior Cramer-Rao lower bound (CRLB) and a bio-inspired method described in [19]. In this heuris- tic bio-inspired approach, the tracking agents move directly tow ards the last estimate of the tar get location. The former approach is signiﬁcantly more computation intensiv e, while the latter is simply a heuristic based approach with negligible processing burden. Furthermore, we compare two dif ferent estimation algorithms, namely a recursiv e Bayesian estimator and a detection based extended Kalman ﬁlter (EKF). Y et again the former is signiﬁcantly more process intensive, and the latter compromises between optimality and processing power . By comparing these options, we in vestigate the trade off between performance and computation resources. Relation between measurements, estimation, and path planning steps is summarized in Fig. 2. The rest of this paper is organized as follows. In Section II, we provide an overvie w of state of the art. In Section III we describe our approach to model the elements of our system, including stochastic target motion, stochastic target transmis- 2 Fig. 2: Process diagram for collaborativ e UA V tracking. sion, stochastic channel fading, and measurement noise. At the end of this section, we deri ve the measurement model that describes the RSSI measurements as an additiv e unbiased noise to expected v alue of the measurement. Building upon this model, in Section IV, we derive the Fisher information matrix (FIM) for the next estimation cycle and deriv e steepest descent path planning based on the determinant of the FIM. In Section V, we describe an EKF that relies on a minimum risk detector to decide if the tar get was transmitting at this time step, and update the estimation accordingly . In Section VI, we describe the Bayesian estimator that does not rely on a separate detector to recursiv ely update the posterior likelihood of target state. In Section VII, we present our simulation results, and in Section VIII, we giv e a summary of our ﬁndings. I I . L I T E R AT U R E R E V I E W Sev eral model predictive control based path planning meth- ods are proposed in the literature [20], [21]. In [18], we proposed a receding horizon path planning method to localize a continuously transmitting RF source. Despite using simple omnidirectional RSSI sensors, we show that path planning based on optimizing D-optimality criterion, paired with an extended Kalman ﬁlter as estimator , is a promising approach in autonomous cooperati ve localization. In [21], localization of an unauthorized transmitter is performed under the constraints of adhering to no-ﬂy zones; in [21] U A Vs are equipped with electronic support sensors that can detect the angle of arriv al of interference, hierarchical model predictive control is used, where the coarse preliminary paths are computed centrally and passed down to all U A Vs, to be further optimized locally and indi vidually by each UA V , using a receding horizon optimization technique. In [22], localization of an RF transmitter using ﬁxed anchor RSSI sensors is proposed. The proposed approach consists of two steps. First, the range between sensors and the target is estimated. Second, localization is achieved using estimated ranges. The node with the highest RSSI is considered as the reference anchor node. The RSSI measurements from all nodes except nodes close to this reference anchor are ignored. Localization is done heuristically , using least square error optimization of range error . In [23], mobile targets with intermittent transmission are localized using a state machine to switch between the fol- lowing four states: 1) global search, 2) approaching target, 3) locating target, and 4) target reacquisition. The cooperation is implemented in a decentralized fashion via 1) a cost function that takes into account the distance that UA V needs to trav el to arriv e at a location in which it can help the localization, 2) the number of currently helping U A Vs at that location, and 3) the number of neighboring UA Vs. If the cost function becomes negati ve, the U A V goes to help; otherwise it will continue its current task. Since the objective is to localize a device with intermittent transmission, the authors hav e introduced a path planning algorithm that re visits each location regularly while maintaining low energy consumption ﬂight paths. In [24], the authors in vestigate localizing simultaneous mul- tiple tar gets, where time dif ference of arri val based localization is used by the U A V swarm to estimate the position of the targets. In [25], posterior CRLB is used as the metric to ﬁnd the optimal trajectories for a group of U A Vs using only bearing sensors. Using a computationally efﬁcient method of updating the FIM introduced in [26], the dynamic of FIM is decomposed to a nonlinear autonomous response and measurement contribution. The autonomous response entirely depends on the dynamic model of target, and latest FIM, while the measurement contribution depends on the sensor model and position. Through simulations, one step and two step sensor paths are compared, while the number of U A Vs varies between one and three. The paper concludes that multi- step multi-UA V scenario results in the best root mean square of error in terms of tracking accuracy . I I I . S Y S T E M M O D E L In this section we describe the models that we are using to represent a group of U A Vs equipped with omnidirectional RSSI sensors that are recei ving intermittent, omnidirectional transmissions from a stochastically moving target. This system can be modeled by considering each subsystem and their interactions with each other . In Section III-A, we model a moving target with a linear time in variant stochastic process. In Section III-B, we derive the stochastic model for RSSI mea- surements. Starting from modeling the transmission process of the target with a Bernoulli process, we model the recei ved power at each RSSI sensor after Friis channel attenuation and log-normal fading, gi ven the distance from the target UA V . W e then add thermal noise to produce the ﬁnal measurement model as depicted in Fig. 3. A. T ar get Movement Model W e use a general stochastic linear time in variant model to describe target movement; our approach w orks for an y stochastic mov ement model that can be described as a linear model. For the rest of this paper we use an integral Brownian motion model: x ( k +1) = Ax ( k ) + w ( k ) , (1)  x ∗ 0 ( k ) y ∗ 0 ( k )  T = Cx ( k ) , (2) 3 Fig. 3: Flowchart summarizing the interaction of stochastic and deterministic subsystem models. where x ( k ) ∈ R M is the state vector of the linear mov ement process and M is the number of target state variables 1 , k ∈ N super script represents the time step, w k ∼ N ( 0 , Q ) is the Gaussian process noise with covariance matrix Q = E { w ( k ) w ( k ) T } , matrix A is the state transition matrix, and C is the output matrix that extracts the position of moving target in horizontal and vertical axes, x ∗ 0 ( k ) and y ∗ 0 ( k ) . In the double integral model, we have: A =     1 0 ∆ t 0 0 1 0 ∆ t 0 0 1 0 0 0 0 1     , C =  1 0 0 0 0 1 0 0  , (3) where ∆ t represents time step length. B. RSSI Measurement Model In this section we model the observ ation vector (RSSI mea- surements) as a deterministic function of unknown parameters (distances from U A Vs to target) plus an unbiased measurement noise: p ( k ) r = h ( k ) + v ( k ) , (4) where p ( k ) r ∈ R N represents the stochastic measurement vector process resulting from all N U A V measurements at time step k , the vector h ( k ) ∈ R N is the deterministic e xpected value of the measurement given the unknown parameter vector (distance vector between N U A Vs and the target), and the stochastic properties of the measurement are condensed to an unbiased measurement vector noise v ( k ) ∈ R N . Therefore: h ( k ) = E { p ( k ) r } , (5) v ( k ) = h ( k ) − p ( k ) r . (6) T o model the measurement process and ﬁnd h ( k ) and v ( k ) , we start from modeling a single RSSI sensor that is measuring 1 In case of the two dimensional dual integral mov ement model, M = 4 , and the elements of the state vector x ( k ) are position and speed in the two horizontal axises. receiv ed power from a single intermittent transmitter , through Friis attenuation, channel fading, and thermal noise. W e start from the transmitter and follow the power as it reaches the recei ver as depicted in Fig. 3. Having the model for a single receiver , we put all N measurements in a vector form, modeling the measurement vector that contains all the measurements in each time step from all U A Vs. W e model the intermittent transmission of the target as a Bernoulli process: p ( k ) t = s ( k ) p on , (7) where p ( k ) t ∈ R represents the intermittent transmitted po wer process 2 , p on represents transmission power when the trans- mitter is transmitting, and s ( k ) is an independent identically distributed Bernoulli process with the probability mass func- tion: L ( s ( k ) = 0) = q , L ( s ( k ) = 1) = 1 − q , (8) where q is the the probability of target not transmitting at any giv en time step. Each tracking U A V has an omnidirectional RSSI sensor on board. T o model the received signal power , we use Friis path loss equation. In the absence of measurement noise, the receiv ed po wer at an arbitrary recei ver at distance d away from the transmitter is giv en by: p ( k ) F = p ( k ) t G t G r ( λ/ 4 π d ) 2 , (9) where p ( k ) F represents the ideal receiv ed signal power in ab- sence of shadowing and measurement noise, G t is transmitter’ s antenna g ain, G r is recei v er’ s antenna gain, and λ is the wa ve length of the transmitted signal. W e assume transmitter’ s power , transmitter’ s frequency , transmitter’ s antenna gain, and receiv er’ s antenna gain are known constants. For ease of representation, we deﬁne G to represent the constant and known power gain: G , G t G r ( λ/ 4 π ) 2 . (10) 2 Even when the target transmits continuously , due to often used frequency hopping, recei vers may observe RSSI only intermittently , due to monitoring only certain portion of the spectrum at a giv en time. 4 0 10 20 30 40 Time (s) -75 -70 -65 -60 -55 -50 Power (dBm) p F (k) p * (k) r p r (k) Fig. 4: A simulated instance of the RSSI measurement stochas- tic process. The blue dotted line represents the RSSI in absence of fading and measurement noise. Red line represents an instance of the RSSI with fading but without measurement noise, and dots represent instances of RSSI with fading and measurement noise. Then in the presence of shado wing effects and without mea- surement noise, the received signal ( p ∗ r ) is: p ∗ r ( k ) = p ( k ) t G d 2 v ( k ) sh , (11) where v ( k ) sh ∼ ln N (0 , σ 2 sh ) represents the log normal shad- owing effect of the channel. Introduction of recei ved power measurement noise leads to the following: p ( k ) r = p ( k ) t G 1 d 2 v ( k ) sh + p ( k ) th , (12) where p ( k ) r is the measured received po wer by a single RSSI sensor at time interval k , and p ( k ) th ∼ N ( ¯ p th , σ 2 p th ) is the power measurement noise. A simulated instance of the measurement model in (12) is presented in Fig. 4, where we hav e simulated 4 UA Vs closing in on a stationary target. They start from a distance of 300 meters and move towards the target with a constant speed of 5 m/s. The dashed line represents the receiv ed power when target transmits continuously ( q = 0 ), while there is no shadowing and measurement noise. The solid line shows the effect of turning on the shadowing ( σ 2 sh = 1 ) and making the transmission intermittent ( q = 0 . 5 ). The black dots are the measurements of the 4 UA V while considering the measurement noise ( ¯ p th = − 70 dBM, σ p th = − 80 dBm). The measurement model comprises a deterministic compo- nent and an unbiased stochastic component. Noting that s ( k ) and v sh are independent: h ( k ) , E { p ( k ) r } = G 1 d 2 p on E { s ( k ) } E { v ( k ) sh } + ¯ p th , (13) and substituting for expected value of the Bernoulli random variable and log normal random variable, we have: h ( k ) = G 1 d 2 p on (1 − q ) e σ 2 sh / 2 + ¯ p th . (14) Therefore, the unbiased measurement noise ( v ( k ) ) is: v ( k ) = G d 2 p on [ s ( k ) v ( k ) sh + ( q − 1) e σ 2 sh / 2 ] + p ( k ) th − ¯ p th . (15) Equations (14) and (15) model a single RSSI sensor . The model captures the expected value of the measurement and the PDF of the measurement noise. In the next step, we consider sev eral RSSI sensors measuring simultaneously on board different U A Vs. W e consider that the underlying Bernoulli process that governs the transmission is identical for all sensors, but channel fading and thermal noise are independent for each sensor . W e stack the resulting measurements in vector form by ﬁrst deﬁning the distance vector d ( k ) ∈ R N between tracking UA Vs and the target, where elements of d ( k ) are the Euclidean distances between corresponding tracking U A V and the target in time step k : d ( k ) i =     Cx ( k ) −  x ∗ i ( k ) y ∗ i ( k )      . (16) The expected value of the measurements vector ( h ( k ) ) is the expected value of each RSSI readout that was deriv ed in (14), arranged in vector format corresponding to all N sensors on board of our swarm, which can be written using (16) as: h ( k ) = Gp on (1 − q ) e σ 2 sh / 2 h d ( k ) 1 − 2 , . . . , d ( k ) N − 2 i T + ¯ p th . (17) The unbiased noise associated with each sensor’ s measurement ( v ( k ) ) can be obtained from the unbiased noise deriv ed in (15), for an indi vidual receiv er , and can be arranged in v ector format corresponding to all N sensors on board of the swarm: v ( k ) =     Gp on [ s ( k ) v ( k ) sh 1 + ( q − 1) e σ 2 sh / 2 ] d ( k ) 1 − 2 + p ( k ) th 1 . . . Gp on [ s ( k ) v ( k ) sh N + ( q − 1) e σ 2 sh / 2 ] d ( k ) N − 2 + p ( k ) th N     − ¯ p th . (18) Therefore, the measurement vector ( z ( k ) ) is no w fully deriv ed based on (17) and (18) as follows: z ( k ) = h ( k ) + v ( k ) . (19) This is the measurement model that links U A Vs’ measure- ments in (19) to the target’ s position through (16), where z is the measurement vector . Each element of z is the RSSI measurement of the corresponding tracking U A V . I V . S T EE P E S T D E S C E N T P A T H P L A N N I N G A L G O R I T H M . In Section III, we have modeled the RSSI measurements from an intermittently transmitting target UA V , observed at tracking U A Vs. The stochastic nature of the model dictates how accurate the estimations can be even with optimal estima- tors. Noisy measurements with low joint likelihood curvature with the parameters of interest (target location) cannot provide additional information to the estimators to allo w them to reduce the expected estimation error . Naturally , the swarm should mov e in a way that the measurements that the y get have good stochastic characteristics and high information. Using the measurement model in (17) and (18), swarm paths can be ranked with respect to how much information each path’ s 5 measurements contain. The maximization of no vel information content of measurements is the one and only goal in the steepest descent path planning algorithm. W e make a distinction between information content of a measurement vector and the total posterior information. This key distinction is the motiv ation for introducing the concept of nov el information. W e give an example to further clarify this distinction. Suppose the location of a stationary tar get in x-axis is perfectly known in time step k , and we hav e no information about its location in y-axis. This is our prior information. Any measurement that gives any information about the location of the target in x-axis, is simply providing information that is redundant, because we hav e perfect information about it already in the prior . In contrast, an y information that we gather about target’ s location in y-axis will increase the total amount of information that we have about the target’ s location; The total information that we are going to hav e about target’ s location after consideration of the new measurements, is posterior information. The dif ference between prior and posterior information is the novel information content of the measurements. Compare this to the case if we had perfect information in the prior about the location in y-axis, and no information about the location in x-axis. It is clear that the information content of a measurement does not depend on prior information, but novel information of a measurement directly depends on prior information. It is also clear that we are interested in increasing the posterior information, and therefore we are only interested in novel information content of measurements. Intuitiv ely , a good path planning method should guide the U A Vs in such a way that the nov el information gathered from next set of measurements is maximized. In other words, giv en the prior information that we ha ve from the past measure- ments, ﬁnd the optimal path that leads to highest amount of posterior information. W ith this goal in mind, we do not intend to ﬁnd the measurement set that maximizes the total infor- mation in the next measurement, since redundant information is included in that. W e intend to ﬁnd the measurement set that, giv en our prior information, leads to maximum posterior information. In this section, we integrate the abov e intuition in a math- ematical framework. In Section IV -A, we use a well deﬁned metric on information and apply it to our system. In Sec- tion IV -B, we discuss how prior information and information in a ne w measurement set combine to form posterior information. Then we discuss ho w to measure novel information in a measurement set, and optimize for it. W e discuss ho w the posterior information formed in the last time step will loose some of its value and become outdated by the time the next measurement set is av ailable. W e discuss how this discounted posterior information will be used as prior for next time step. A. Information Metric W e use FIM as representation of information, which is widely considered to be a good indicator of av ailable infor- mation about a stochastic process [27], since FIM dictates the maximum achie vable estimation accuracy through the CRLB. Giv en this choice of information metric, we derive a procedure to calculate the amount of posterior information at each time step. T o express it formally , we calculate the posterior information about the tar get state at time step k , i.e. the posterior information about x ( k ) , giv en the prior information that we had in time step zero, and all the measurements from time step zero until time step k . T o be able to mathematically refer to all the measurements that we hav e from time step zero to time step k , we stack all of the measurement vectors up until time step k into a single accumulative measurement vector z ( k ) acc : z ( k ) acc ,  z (0) , . . . , z ( k )  T ∈ R kN + N . (20) Similarly , we deﬁne accumulati ve target state vector x ( k ) acc , which contains all the states of the target from time step zero to time step k : x ( k ) acc ,  x (0) , . . . , x ( k )  T ∈ R kM + M . (21) Giv en the accumulative measurement vector z ( k ) acc , the CRLB establishes the lowest achiev able estimation error of the accu- mulativ e target state vector x ( k ) acc , giv en known joint probability function L ( z ( k ) acc , x ( k ) acc ) . For an y unbiased estimator ˆ x ( k ) acc = g ( z ( k ) acc ) , the expected cov ariance matrix of the estimation error is not better than the CRLB [28]: E n || g ( z ( k ) acc ) − x ( k ) acc || 2 2 o ≥ J ( k ) acc − 1 , (22) where J ( k ) acc − 1 ∈ R ( kM + M ) × ( kM + M ) is the accumulative FIM with the elements given by: J ( k ) acc i,j = − E  ∂ 2 ln ( L ( z ( k ) acc , x ( k ) acc )) ∂ x ( k ) acc i ∂ x ( k ) acc j  . (23) Moreov er , J ( k ) acc − 1 in (22) represents the posterior information about all states in all steps. As the system progresses through time, the number of unknown parameters in x ( k ) acc and the number of a vailable measurements in z ( k ) acc increase. In next subsection, we use a recursi ve rule to ﬁnd the posterior Fisher information of current states of the target, given accumulative measurements. B. Prior , P osterior , and Measur ement Information Intuitiv ely , at each time step, we want to ﬁnd the next set point that will lead to the optimal posterior CLRB, rather than the entire CRLB. That is to say , we do not value better estimation of target states at previous time steps. Equiv alently , we want to set the next measurement path such that the next set of measurements lead to the optimal Fisher information submatrix corresponding only to the target state in next step. This implies that we do not need to handle and optimize the whole CRLB, but we are only interested in the lower -right M × M block of the CRLB. Calculation of the entire posterior FIM is both computationally expensi ve and unnecessary . In this case, where we are only interested in the lower right submatrix, we can use the posterior information submatrix recursion proposed in [26]. As a result, we will hav e the posterior CRLB matrix, without the need to handle gro wing 6 matrices. Using proposition 1 in [26], we know that the posterior FIM obeys the recursion: J k +1 | k = − D T 2  J k + D 1  − 1 D 2 + D 3 , (24) J k +1 = J k +1 | k + D 4 , (25) where J k is the FIM corresponding to x ( k ) giv en z ( k ) acc , and J k +1 | k is the information matrix corresponding to x ( k +1) giv en z ( k ) acc . The coefﬁcient matrices D 1 , D 2 , D 3 , and D 4 are deriv ed in [26]: D 1 , E {− ∆ x ( k ) x ( k ) log L ( x ( k +1) | x ( k ) ) } , (26) D 2 , E {− ∆ x ( k +1) x ( k ) log L ( x ( k +1) | x ( k ) ) } , (27) D 3 , E {− ∆ x ( k +1) x ( k +1) log L ( x ( k +1) | x ( k ) ) } , (28) D 4 , E {− ∆ x ( k +1) x ( k +1) log L ( z ( k +1) | x ( k +1) ) } , (29) where we use the denominator layout: ∆ θ ψ = ∇ ψ ∇ T θ , ∇ θ =  ∂ θ 1 , . . . , ∂ θ r  T . (30) The likelihood functions that we need in order to calculate the coefﬁcient matrices are L ( x ( k +1) | x ( k ) ) and L ( z ( k +1) | x ( k +1) ) . W e start from the former (which depends on tar get move- ment model) and the later (which depends on measurement model) is derived later in (35)-(48). From (1) we deriv e L ( x ( k +1) | x ( k ) ) : L ( x ( k +1) | x ( k ) ) = det (2 π Q ) ( − 1 / 2) × e − 1 2 ( x ( k +1) − Ax ( k ) ) T Q − 1 ( x ( k +1) − Ax ( k ) ) . (31) Intuitiv ely , (24) giv es us a method to move information through time. It shows how to calculate the contrib ution of the information about pre vious state of the target to information about current state of the target. On the other hand (25) giv es us a method to augment prior information with ne w information. Applying process model in (2) into (26), we can write: D 1 = E n ∆ x ( k ) x ( k ) 1 2 ( x ( k +1) − Ax ( k ) ) T Q − 1 ( x ( k +1) − Ax ( k ) ) o , (32) which implies: D 1 = A T Q − 1 A . (33) Similarly , for (27) and (28) we can write: D 2 = − A T Q − 1 , D 3 = Q − 1 . (34) In contrast to D 1 , D 2 , and D 3 , which only depend on the process model and handle propagation of information through time, the calculation of D 4 requires applying the measurement model. W e have L ( z ( k +1) | x ( k +1) ) equal to L ( z ( k +1) | h ( k +1) ) , since h ( k +1) is a deterministic function of x ( k +1) . Then, using (19), we can write: L ( z ( k +1) | x ( k +1) ) = L ( z ( k +1) | h ( k +1) ) = L v ( k +1) ( z ( k +1) − h ( k +1) ) . (35) Intuitiv ely , this means that the probability of observing a measurement giv en the process state is equal to probability of measurement noise being equal to difference between the measurement and the expected value of measurement. This can be further simpliﬁed by considering the on and off possibilities for the Bernoulli process. W e deﬁne α as the probability of observing a measurement noise when transmitter has transmitted, and β when it has not: α , L v ( k +1) ( z ( k +1) − h ( k +1) | s ( k +1) = 1) , (36) β , L v ( k +1) ( z ( k +1) − h ( k +1) | s ( k +1) = 0) , (37) where we have: L ( z ( k +1) | h ( k +1) ) = (1 − q ) α + q β . (38) W e deﬁne new dependent vectors k and l that describe measurement noise when the transmitter was on and off: k , v ( k +1) | s ( k +1) = 1 , (39) l , v ( k +1) | s ( k +1) = 0 , (40) where from (18), we can write: k i ( v sh , p th ) = Gp on [ v sh + ( q − 1) e σ 2 sh / 2 ] d − 2 i + p th − ¯ p th , (41) l i ( p th ) = Gp on ( q − 1) e σ 2 sh / 2 d − 2 i + p th − ¯ p th . (42) W e assume that the measurement process at each tracking U A V is independent, and as a result, we can rewrite (43) and (44) as: α = N Y i =1 L k i ( z ( k +1) i − h ( k +1) i ) , (43) β = N Y i =1 L l i ( z ( k +1) i − h ( k +1) i ) , (44) where since the distribution of v sh ∼ ln N (0 , σ 2 sh ) and p th ∼ N ( ¯ p th , σ 2 p th ) are known: ζ , (1 − q ) e σ 2 sh / 2 + z i ( k + 1) − h i ( k + 1) − p th + ¯ p th Gp on d − 2 i (45) L k i ( z ( k +1) i − h ( k +1) i ) = d 2 i Gp on Z ∞ 0 L v sh ( ζ ) L ( p th ) d p th = d 2 i Gp on Z ∞ 0 1 ζ σ sh √ 2 π e − (ln ζ ) 2 2 σ 2 sh 1 p 2 σ 2 th π e − ( p th − ¯ p th ) 2 2 σ 2 th d p th , (46) η , z ( k +1) i − h ( k +1) i − Gp on ( q − 1) e σ 2 sh / 2 d − 2 i , (47) L l i ( z ( k +1) i − h ( k +1) i ) = 1 p 2 σ 2 th π e − η 2 2 σ 2 th . (48) At this point we hav e deriv ed L ( z ( k +1) | x ( k +1) ) , but to get D 4 from (29), we need to numerically calculate the expected value of the Hessian of this lik elihood function. T o do so we use Monte Carlo e xpectation on numerical Hessian. W e pick a random measurement noise instance ( ˆ v ( k +1) ) from its distribution, giv en x ( k +1) , then calculate a corresponding Hessian: ˆ H = ∆ x ( k +1) x ( k +1) log L ( z ( k +1) | x ( k +1) )    ˆ v ( k +1) , (49) 7 Require: ˆ x k − 1 | k − 1 , P k − 1 | k − 1 , z k 1: procedure S T E E P E S T D E S C E N T P A T H P L A N N I N G 2: D 1 ← A T Q − 1 A 3: D 2 ← − A T Q − 1 4: D 3 ← Q − 1 5: for s in S do 6: x ∗ ( k +1) ← x ∗ ( k ) + cos( s )∆ L 7: y ∗ ( k +1) ← y ∗ ( k ) + sin( s )∆ L 8: D 4 ← D O E X P E C TA T I O N ( . ) 9: J ( k + 1) = − D T 2  J ( k ) + D 1  − 1 D 2 + D 3 + D 4 10: c ( s ) ← | J ( k + 1) | 11: end for 12: s opt = arg max c ( s ) 13: end procedure 1: procedure D O E X P E C TA T I O N 2: H ← 0 r,r 3: for try in 1 : t do 4: ˆ v ( k +1) ← E R RO R I N S TA N C E (.) 5: ˆ H ← ˆ H − H /t 6: end for 7: end procedure 8: procedure L ( z ( k +1) | x ( k +1) )    ˆ v ( k +1) 9: α ← β ← 1 10: for i in 1 : N do 11: ζ ← (1 − q ) e σ 2 sh / 2 + z i ( k +1) − h i ( k +1) − p th + ¯ p th Gp on d − 2 i 12: α ← α d 2 i Gp on R ∞ 0 1 ζ σ sh √ 2 π e − (ln ζ ) 2 2 σ 2 sh 1 2 σ 2 th π e − ( p th − ¯ p th ) 2 2 σ 2 th d p th 13: η ← z ( k +1) i − h ( k +1) i − Gp on ( q − 1) e σ 2 sh / 2 d − 2 i 14: β ← β 1 2 σ 2 th π e − η 2 2 σ 2 th 15: end for 16: L ( z k +1 | h k +1 ) ← (1 − q ) α + q β 17: end procedure Fig. 5: Steepest descent path planning. whose individual elements can be calculated as: ˆ h i,j ≈  log L ( z ( k +1) | x ( k +1) + δ i + δ j ) − log L ( z ( k +1) | x ( k +1) + δ i − δ j ) − log L ( z ( k +1) | x ( k +1) − δ i + δ j ) + log L ( z ( k +1) | x ( k +1) + δ i + δ j )  / 4∆ 2    ˆ v ( k +1) , (50) where ∆ is deriv ativ e step size, and δ i is the i th unit vector scaled to ∆ . Monte Carlo expectation of this Hessian leads to numerical ev aluation of D 4 since: D 4 = − E { ˆ H } , (51) The algorithm is presented in Fig. 5. Having numerical ev alu- ation of D 4 , the computationally efﬁcient recursion of Fisher information submatrix can be used. C. P ath Planning Intuitiv ely , we desire a path planning approach that leads to the optimal FIM. Since matrix spaces do not hav e a unique norm, our intuitive criterion of ﬁnding the optimal FIM is ambiguous. One of the most common approaches is to minimize the determinant of posterior CRLB or equiv alently , maximize the determinant of the FIM [29]. This criterion is called D-optimality criterion. W ith this criterion in mind, we deﬁne s ∈ [ − π , π ] N as the vector of UA V directions that is dictated by path planning. The next Fisher information submatrix is a function of our path planning decision on which direction the U A Vs should travel for the next time step. T o note this dependency explicitly , we use the notation J k +1 ( s ) for the next Fisher information submatrix and write the U A V path planning problem as: s opt = arg max s | J k +1 ( s ) | , (52) where s opt is the optimal direction that the U A Vs need to take, and J ( s ) is the FIM corresponding to states of the target at next time step. T o do this optimization, we need to ﬁnd the gradient of determinant of the FIM. The determinant of FIM in this case is a scalar function ov er the ﬁeld of mo vement directions of U A Vs. Note that at any time step, we may have already acquired some measurements and therefore we may hav e some amount of information about the location of the target and the FIM should take pre vious measurements into account and only consider the new information that will be generated. The path planning stage optimizes the information around a point in target state space. Choosing that point in target state space to optimize the information around it, needs a target state estimator to estimate where in the state space our target is (e.g. location and velocity for double integral mov ement model). In the next two sections we introduce two speciﬁc state estimators that can do this task: detection based EKF and Bayesian estimator . V . D ET E C T I O N B A S E D E K F W e introduce detection based EKF as a target location estimation approach. Intuitiv ely , if we kne w that the target has transmitted, we know how to update the estimation of the target location, giv en prior estimation and its expected error cov ariance. W e could use EKF to do the update. Similarly , if we knew target did not transmit, we know how to update our estimation, solely by predicting current target state from its pre vious estimation. Therefore one way of localization is to ﬁrst use a detector to detect if the target has transmitted, and decide ho w to update the estimation based on detector’ s output. W e refer to this estimator as detection based EKF . Detection based EKF relies on a central collectiv e decision to be made based on measurements in current time step to detect if the transmitter has transmitted or was silent in this time period. Deriv ation of optimal detection method requires a discussion of propagation of error through time, when the EKF does a correct detection, false alarm, misdetection, and correct rejection. Therefore we describe how error propagates in each of these scenarios, then use them to do optimal detection that reduces the ov erall determinant of estimation error cov ariance matrix. If a transmission is detected, the current measurement vector is used to update the EKF estimation. Otherwise, we update the 8 Require: ˆ x k − 1 | k − 1 , P k − 1 | k − 1 , z k 1: procedure D E T E C T I O N B A S E D E K F 2: ˆ x ( k | k − 1) ← A ˆ x ( k − 1 | k − 1) 3: P ( k | k − 1) ← AP ( k − 1 | k − 1) A T + Q 4: ˜ y ( k ) ← z ( k ) − h ( ˆ x ( k | k − 1) ) 5: S ( k ) ← H ( k ) T P ( k | k − 1) H ( k ) + R 6: K ( k ) ← P ( k | k − 1) H ( k ) S ( k ) − 1 7: U 11 ←    ( I − K ( k ) H ( k ) T ) P ( k | k − 1)    , 8: U 12 ←   P ( k | k − 1) + K ( k ) ˜ y ( k ) ( K ( k ) ˜ y ( k ) ) T   9: U 21 ←   P ( k | k − 1)   10: U 22 ←   P ( k | k − 1)   11: τ ( k ) ← p ( z |H 1 ) p ( z |H 0 ) > ( U 12 − U 22 ) q ( U 21 − U 11 )(1 − q ) 12: if τ k then 13: P ( k | k ) ← ( I − K ( k ) H ( k ) T ) P ( k | k − 1) 14: ˆ x k | k ← ˆ x ( k | k − 1) + K ( k ) ˜ y ( k ) 15: else 16: P ( k | k ) ← P ( k | k − 1) 17: ˆ x ( k | k ) ← ˆ x ( k | k − 1) 18: end if 19: retur n ˆ x ( k | k ) , P ( k | k ) 20: end procedure Fig. 6: Detection based EKF algorithm for target U A V location estimation. state estimation using one step ahead prediction and update the state estimation error covariance matrix P ( k | k ) accordingly as described in Fig. 6. W e use τ ( k ) to represent detection decision and ˆ x ( k | k − 1) is the prediction of tar get U A V state at time step k giv en measurements up until time step k − 1 . This prediction’ s cov ariance is represented with P ( k | k − 1) , observed residual is represented with ˜ y ( k ) , residual cov ariance is represented with S ( k ) , Kalman gain is represented with K ( k ) , H ( k ) is the Jacobian of h ( k ) in (17) which can be written as: H ( k ) = ∂ h ∂ x    ˆ x ( k | k − 1) (53) h i,j = Gp on (1 − q ) e σ 2 sh / 2 d − 4 j  c 1 ,i  C 1 x ( k ) − x ∗ j ( k )  + c 2 ,i  C 2 x ( k ) − y ∗ j ( k )      ˆ x ( k | k − 1) , (54) and R is the measurement error covariance matrix: R = E { v ( k ) v ( k ) T } , (55) where the individual elements of the matrix R are: r i,j =      ( Gp on ) 2 d − 2 i d − 2 j q (1 − q ) e σ 2 sh ,if i 6 = j ( Gp on ) 2 d − 4 i (1 − q )( e σ 2 sh − 1) e σ 2 sh + σ 2 th − (1 − q ) 2 e σ 2 sh ,if i = j . (56) Having calculated the consequence of each decision on estimation procedure, we turn into ﬁnding an optimal decision making method. The two hypotheses we need to choose between correspond to target having transmitted in time step k or not: H 0 : s ( k ) = 0 , H 1 : s ( k ) = 1 . (57) Intuitiv ely , the detector should not have a constant false alarm rate or ﬁxed probability of detection. The detector should take into account the expected cost of its actions, and adapt its threshold accordingly to minimize the total expected cost. Using the determinant of estimation error cov ariance matrix as cost function, we deriv e the cost associated with each of the possible outcomes of our detection method as follows: U 11 =    ( I − K ( k ) H ( k ) T ) P ( k | k − 1)    , (58) U 12 =   P ( k | k − 1) + K ( k ) ˜ y ( k ) ( K ( k ) ˜ y ( k ) ) T   , (59) U 21 = U 22 =   P ( k | k − 1)   , (60) where U 11 is cost of correct detection, U 12 is cost of false alarm, U 21 is cost of misdetection, and U 22 is cost of correct rejection. Having calculated the costs associated with each decision outcome, we use optimal Bayesian cost detector [30] to decide H 1 if the following condition is satisﬁed: p ( z |H 1 ) p ( z |H 0 ) > ( U 12 − U 22 ) q ( U 21 − U 11 )(1 − q ) . (61) V I . B A Y E S I A N E S T I M A T I O N As an alternati ve estimator for target location, we can use a Bayesian estimator . In contrast to detection based EKF estimator in Section V, Bayesian estimator does not rely on explicitly detecting if tar get has transmitted in the last time step or not. In the Bayesian estimation approach, we aim to achiev e minimum mean square error: ˆ x ( k | k ) = arg min ˜ x E { ( ˜ x − x ( k ) ) 2 } , (62) with the solution: ˆ x ( k | k ) = E { x ( k ) | z ( k ) , . . . , z (0) } , (63) = Z x ( k ) L ( x ( k ) | z ( k ) , . . . , z (0) ) d x ( k ) . (64) Therefore, we need to calculate L ( x ( k ) | z ( k ) , z k − 1 , . . . , z (0) ) , which is the likelihood function of the target state con- ditioned on all acquired measurements. Calculation of this likelihood function can be done recursi vely . First, us- ing the conditional likelihood function of the last step ( L ( x ( k − 1) | z ( k − 1) , . . . , z (0) ) ), we predict the conditional like- lihood function of the target state: L ( x ( k ) | z ( k − 1) , . . . , z (0) ) = Z L ( x ( k ) | x ( k − 1) ) L ( x ( k − 1) | z ( k − 1) , . . . , z (0) ) d x ( k − 1) , (65) where from (1), we can write: L ( x ( k ) | x ( k − 1) ) = | 2 π Q | − 1 2 e − 1 2 ( x ( k ) − Ax ( k − 1) ) T Q − 1 ( x ( k ) − Ax ( k − 1) ) . (66) Intuitiv ely , (65) is the prior belief in an y point of target state space being current target state. It is our prior in the sense that it describes how likely we consider the target to be in any point in tar get state space, before considering the new measurements. Our prior belief in any point in target state space x ( k ) depends on how likely it is for it to occur from all possible x ( k − 1) , and how strongly we belie ve in x ( k − 1) . Next, 9 Require: Posterior ( x ( k − 1) ) , z ( k ) 1: procedure B A Y E S I A N E S T I M A T O R 2: for all x ( k ) do 3: Prior ( x ( k ) ) ← P x ( k − 1) L ( x ( k ) | x ( k − 1) ) Posterior ( x ( k − 1) ) 4: end for 5: Normalize ( Prior ( x ( k ) )) 6: for all x ( k ) do 7: Posterior ( x ( k ) ) ← L ( z ( k ) | x ( k ) ) Prior ( x ( k ) ) 8: end for 9: Normalize ( Posterior ( x ( k ) )) 10: ˆ x ( k | k ) ← P x ( k ) x ( k ) Posterior ( x ( k ) ) 11: retur n ˆ x ( k | k ) , Posterior ( x ( k ) ) 12: end procedure Fig. 7: Bayesian estimator implementation. we update this conditional likelihood function using the new measurement: L ( x ( k ) | z ( k ) , . . . , z (0) ) = L ( x ( k ) | z ( k − 1) , . . . , z (0) ) L ( z ( k ) | x ( k ) ) L ( z ( k ) | z ( k − 1) , . . . , z (0) ) , (67) where: L ( z ( k ) | z ( k − 1) , . . . , z (0) ) = Z L ( z ( k ) | x ( k ) ) L ( x ( k ) | z ( k − 1) , . . . , z (0) ) d x ( k ) . (68) Intuitiv ely , (67) describes the posterior likelihood function, characterizing how new measurements affect our prior . If the new measurement z ( k ) has high likelihood to observ ed, assum- ing the target state is x ( k ) , we increase the posterior belief x ( k ) proportionally . If the measurements are unlikely to be observed at target state x ( k ) , we reduce our belief in x ( k ) proportionally . For the next time step, we use L ( x ( k ) | z ( k ) , . . . , z (0) ) to form our prior belief in x ( k +1) . The described recursiv e Bayesian estimation is mathemat- ically optimal, yet we do not implement it as is due to multidimensional integrations in (64), (65), and (68). W e make slight adjustments in implementation by partitioning the target state space and by exclusiv ely considering the region of interest. By partitioning the state space, we mean we quantize the values of states and tabulate the likelihoods in (64), (65), and (68). By exclusiv ely considering the region of interest, we mean we giv e zero prior to states that represent the target with more than max speed, and farther than the maximum distance. Fig. 7 represents our implementation of recursiv e Bayesian estimator . In this implementation, the integration over all state space in (65) is substituted with the summation over all quantized state space in third line of the Fig. 7. This substitution is intended to reduce the computational expense of the algorithm by quantizing the state space. Similarly , the integration in (68) is replaced with a one shot normalization in ﬁfth line of Fig. 7. The denominator in (67) does not depend on the target state, and acts as a normalization factor to ensure that the posterior belief retains unit volume under the curve. Since we hav e quantized the state space, we can simply sum up -20 0 20 40 X axis (m) -20 -10 0 10 20 Y axis (m) Target UAV1 UAV2 Fig. 8: Trajectories of a target U A V and two tracking U A Vs. T ABLE I: Default values of simulation parameters. Parameter Description V alue N Swarm size 2 v max Maximum speed of tracking UA V 5 m/s q Probability of target not transmitting 0.2 σ th Standard deviation of thermal noise -80 dBm σ sh Standard deviation of log-normal shadowing 1 Q Cov ariance of target movement process noise diag(2) p on T arget transmission power 30 dBm G Aggregated antenna gain 1 Default path planning method bio Default estimation method EKF the posterior belief over all the state space, and divide all the posterior beliefs by this constant to make sure the summation of the posterior beliefs remains equal to one. The expected value in (64) is also substituted with discrete expected value in tenth line of Fig. 7. Overall by changing the number of quantized state space points (increasing resolution), we can trade off computation expense and estimation accuracy . V I I . S I M U L AT I O N R E S U L T S In this section, we present our simulation results. In general at each scenario, the tracking UA Vs use one of the path planning methods to navig ate the area of interest, while using one of the estimation methods to update the target location that is used in their path planning method. Fig. 8 illustrates one instance of such simulation, with steepest descent method of path planning and detection based EKF estimation method. In Section VII-A, we in vestigate the performance of the optimal Bayesian cost detector in different noise scenarios. In Section VII-B, we compare the resulting D-criterion of the two path planing approaches. In Section VII-C, we in vestigate how changes in the stochastic parameters of the measurement process affect the error decay factor for EKF and Bayesian es- timators. Finally in Section VII-D, we present the computation time of each algorithm. Unless otherwise noted, we use the default parameters that are presented in T able I to generate the simulation results. W e assume target moves according to two dimensional double integral model. W e also assume that the tracking U A Vs are moving in the same two dimensional plane. It is worth 10 0 0.2 0.4 0.6 0.8 1 q 0 0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 Probability of False Alarm t h =-90 dBm t h =-80 dBm t h =-70 dBm t h =-60 dBm (a) Probability of false alarm v ersus q , probability of target not transmitting. 0 0.2 0.4 0.6 0.8 1 q 0 0.1 0.2 0.3 0.4 0.5 Probability of Misdetection t h =-90 dBm t h =-80 dBm t h =-70 dBm t h =-60 dBm (b) Probability of misdetection versus q , probability of target not transmitting. 0 0.2 0.4 0.6 0.8 1 q 0 0.2 0.4 0.6 0.8 1 Probability of Correct Detection t h =-90 dBm t h =-80 dBm t h =-70 dBm t h =-60 dBm (c) Probability of correct detection versus q , probability of target not transmitting. 0 0.2 0.4 0.6 0.8 1 q 0 0.2 0.4 0.6 0.8 1 Probability of Correct Ignore t h =-90 dBm t h =-80 dBm t h =-70 dBm t h =-60 dBm (d) Probability of correctly ignoring the measurements v ersus q , probability of target not transmitting. Fig. 9: Detection performance. mentioning that the methods presented in this paper are not restricted to two dimensions, and can be used in three dimensions without modiﬁcation. A. Detection P erformance W e present, in Fig. 9, the performance metrics of the optimal Bayesian detector stage of the detection based EKF estimator , across wide range of thermal noise and probability of target transmission ( 1 − q ). These performance metrics are probability of false positiv e (Fig. 9a), f alse negati ve (Fig. 9b), correct detection (Fig. 9c), and correct ignore (Fig. 9d). Obviously , when the probability of target transmission is zero or one, there is no error in detection and the detection becomes non- trivial when the transmission becomes probable but not certain. From Fig. 9d we observe that all across the thermal noise range of -90 dBm to -60 dBm, the optimal detector has a very good performance in ignoring the measurements when there has not been a transmission from the target. On the other hand, from Fig. 9c we observe an interesting phenomenon in the detector performance in correctly detecting the transmission of the target when the target has a small chance of being silent. These two observ ations are consistent with false positive and false negati ve rates in Fig. 9a (low false alarm across the range) and Fig. 9c (high miss when target has small chances of silence). This becomes e ven more interesting when we notice that reducing thermal noise reduces false alarm, but increases the misdetection. Intuitiv ely , and with a ﬁxed threshold on lik e- 11 -10 10 30 50 x (m), k= 1 50 30 10 -10 y (m) -15 -10 -5 0 5 10 -10 10 30 50 x (m), k= 2 50 30 10 -10 y (m) -15 -10 -5 0 5 10 -10 10 30 50 x (m), k= 3 50 30 10 -10 y (m) -15 -10 -5 0 5 10 -10 10 30 50 x (m), k= 4 50 30 10 -10 y (m) -15 -10 -5 0 5 10 -10 10 30 50 x (m), k= 5 50 30 10 -10 y (m) -15 -10 -5 0 5 10 -10 10 30 50 x (m), k= 6 50 30 10 -10 y (m) -15 -10 -5 0 5 10 -10 10 30 50 x (m), k= 7 50 30 10 -10 y (m) -15 -10 -5 0 5 10 -10 10 30 50 x (m), k= 8 50 30 10 -10 y (m) -15 -10 -5 0 5 10 -10 10 30 50 x (m), k= 9 50 30 10 -10 y (m) -15 -10 -5 0 5 10 (dB) (dB) (dB) (dB) (dB) (dB) (dB) (dB) (dB) Fig. 10: Evolution of the D-criterion (dB) over the entire region of interest at different time steps k . lihood ratio, we would not expect to see higher misdetection associated with lower measurement noise. Our in vestig ation shows that when noise lev el is lo w , the optimal detector be- comes conserv ativ e after acquiring a couple of measurements. That is to say , when the noise le vel is low and the target has low probability of silence, the prior information goes up rapidly , and the detector does not see any additional beneﬁt in risking to consider barely conﬁdence inspiring measurements. In this case, the optimal detector has so much prior information that it prefers to lose relati vely small nov el information content of the new measurements if the likelihood of the transmission of the tar get is not e xtremely high. In contrast, when noise is high or probability of target silence is high, the prior information is small and the detector is willing to consider any measurement that is likely to be generated from a transmitting target. B. D-Criterion The steepest descent path planning algorithm relies on estimation stage to localize the target. Subsequently , the steep- est descent path planning maximized the D-criterion on that location by choosing the optimal path. The optimization does not consider the estimation error , and it may optimize the D- criterion on a location that is far from the location of the target, and as a consequence, the estimation stage will have an ev en harder job localizing the target. Naturally , we are interested to in vestigate how optimization of D-criterion ov er a single point trickles down the information to the vicinity of that point. Our simulations sho w that information in our particular setup is not very sensitive to localization error . For instance, Fig. 10 illustrates the a vailable information over the entire re gion of interest as the U A Vs are follo wing steepest descent path planning. As the tracking U A Vs navigate through the region 12 0 2 4 6 8 10 Time (s) -30 -25 -20 -15 -10 -5 0 5 10 15 D Criterion (dB) Steepest Descent Bio-Inspired Fig. 11: Time progression of the D-criterion for the steepest descent and bio-inspired path planning. of interest, they gain information about their en vironment. W e note two observ ations: 1) the information gathered tends to fall rather rapidly over a certain radius; and 2) the information tends to be more or less homogeneous inside that radius. Next we compare the time progression of D-criterion for bio-inspired path planning and steepest descent path planning. Fig. 11 depicts the result for N = 2 , q = 0 . 2 , σ 2 th = − 60 dBm, and Q = diag (2) . The D-criterion for both cases exhibit close to linear increase, suggesting an exponential increase in D-criterion. Although in the initial steps, the bio-inspired approach keeps up with the steepest descent approach, it starts to loose performance as the tracking UA Vs get closer to the target. Such a behavior is expected, since as we get closer to the target, tracking U A Vs can make better measurements if they start to encircle the target rather than following a straight path to the tar get. The performance hit is approximately 10 dB at the end of the simulation. C. Estimator Error In Fig. 12, we compare the mean square estimation error of detection based EKF and Bayesian estimator . W e show the result for N = 2 , q = 0 . 2 , σ 2 th = − 60 dBm, and Q = diag (2) . The mean squared estimation error con ver ges below 50 for both estimators, with detection based EKF showing an increase in the ﬁrst update. This increase increase is due to second order dif ference equations that gov ern the process model. In this case, the errors in initial speed estimation will carry into the next step. W e note that by mean squared estimation error, we mean D. Pr ocessing T ime Although our implementation of the aforementioned algo- rithms are by no means computationally optimal, we provide the run times that we have observed to giv e a rough idea about the computational complexity of each approach. All codes are implemented using MA TLAB, and are ex ecuted on Intel 0 2 4 6 8 10 12 Time (s) 0 50 100 150 200 250 Mean Squared Estimation Error Detection Based EKF Bayesian Estimator Fig. 12: Time progression of the estimation error for the state vector x when using detection based EKF and Bayesian estimator . Core i7 6700k. The bio-inspired path planning and detection based EKF run time is on the order of measurement accuracy of MA TLAB (1 ms) for the entire simulation. On the other hand, Bayesian estimation processing time takes 30.8 seconds of runtime for each simulation. Bayesian estimation process- ing time standard deviation is observed to be 5.6 seconds. This result is obtained by subdi viding the target state space into 10000 regions. The processing time can be signiﬁcantly improv ed by considering a lower order state space for the target, and/or lo wer number of quantized subspaces. Mean steepest descent path planning processing time is measured to be 41.3 s for the entire simulation. Steepest descent path planning processing time standard de viation is measured to be 9.8 seconds. W e have used exhaustiv e optimization to ﬁnd the steepest path since we wanted to trade off extra computation for better D-criterion. The performance may be signiﬁcantly improv ed if other types of non-con vex optimization are used. V I I I . C O N C L U S I O N W e ha ve decomposed the challenge of tracking a moving intermittent RF source into two distinct phases of path plan- ning and estimation. W e have deri ved steepest descent path planning to address the ﬁrst challenge, and dev eloped detection based EKF and recursiv e Bayesian estimator to address the second challenge. W e ha ve compared the performance of steepest descent path planning with a heuristic bio-inspired path planning and ha ve shown that steepest descent method continues to gather more information compared to bio-inspired path planning method. The av ailable information through steepest descent path planning is 10 times more than the bio- inspired path planning after only 10 seconds of data gathering. W e have shown that this increase in performance comes at the price of higher computational cost. W e ha ve also compared the performance of detection based EKF with recursive Bayesian estimator . W e hav e sho wn that Bayesian estimator marginally 13 outperforms detection based EKF estimator in terms of local- ization error , but at a larger computational cost. A C K N O W L E D G E M E N T This research was supported in part by the National Science Foundation under the grant number CNS-1446570 and by the National Aeronautics and Space Administration under the grant number NNX17AJ94A. R E F E R E N C E S [1] Z. Liu, Y . Chen, B. Liu, C. Cao, and X. Fu, “HA WK: An unmanned mini-helicopter-based aerial wireless kit for localization, ” IEEE T rans. Mobile Computing , vol. 13, no. 2, pp. 287–298, Feb 2014. [2] H. Oh, S. Kim, H. s. Shin, and A. Tsourdos, “Coordinated standoff tracking of moving target groups using multiple UA Vs, ” IEEE T rans. Aer ospace and Electr on. Sys. , vol. 51, no. 2, pp. 1501–1514, April 2015. [3] R. Reshma, T . Ramesh, and P . Sathishkumar, “Security situational aware intelligent road trafﬁc monitoring using U A Vs, ” in Proc. Int. Conf . VLSI Systems, Arc hitectur es, T ec hnology and Applications (VLSI-SAT A) , Jan. 2016, pp. 1–6. [4] A. W allar, E. Plaku, and D. A. Sofge, “Reactive motion planning for unmanned aerial surveillance of risk-sensitive areas, ” IEEE T rans. Automation Science and Engineering , vol. 12, no. 3, pp. 969–980, July 2015. [5] A. L. Sumalan, D. Popescu, and L. Ichim, “Flood ev aluation in critical areas by U A V surveillance, ” in Pr oc. Int. Conf. Electr onics, Computers and Artiﬁcial Intelligence (ECAI) , June 2016, pp. 1–6. [6] A. Merwaday , A. T uncer, A. Kumbhar , and I. Guvenc, “Improved throughput coverage in natural disasters: Unmanned aerial base stations for public-safety communications, ” IEEE V ehicular T ec hnology Maga- zine , vol. 11, no. 4, pp. 53–60, 2016. [7] Z. Liu, H. Liu, W . Xu, and Y . Chen, “ An error-minimizing framework for localizing jammers in wireless networks, ” IEEE T rans. P ar allel and Distributed Sys. , vol. 25, no. 2, pp. 508–517, Feb 2014. [8] D. Orfanus, E. P . de Freitas, and F . Eliassen, “Self-organization as a supporting paradigm for military U A V relay networks, ” IEEE Commu- nications Letters , vol. 20, no. 4, pp. 804–807, 2016. [9] K. Hartmann and K. Giles, “UA V exploitation: A new domain for cyber power , ” in Proc. Int. Conf. Cyber Conﬂict (CyCon) , 2016, pp. 205–221. [10] I. Guvenc, O. Ozdemir , Y . Y apici, H. Mehrpouyan, and D. Matolak, “De- tection, localization, and tracking of unauthorized UAS and jammers, ” in Proc. IEEE/AIAA Digital A vionics Systems Conf. (DASC) , Sep. 2017. [11] C. G. L. Krishna and R. R. Murphy , “ A revie w on cybersecurity vulnerabilities for unmanned aerial vehicles, ” in Proc. Int. Symp. Safety, Security and Rescue Robotics (SSRR) , Oct. 2017, pp. 194–199. [12] K. Klausen, J. B. Moe, J. C. van den Hoorn, A. Gomola, T . I. Fossen, and T . A. Johansen, “Coordinated control concept for recovery of a ﬁxed-wing U A V on a ship using a net carried by multirotor UA Vs, ” in Pr oc. Int. Conf. Unmanned Aircr aft Systems (ICU AS) , June 2016, pp. 964–973. [13] I. Guvenc, F . Koohifar , S. Singh, M. L. Sichitiu, and D. Matolak, “De- tection, tracking, and interdiction for amateur drones, ” IEEE Commun. Mag. , Jan. 2018. [14] B. Co wan, N. Imanberdiye v , C. Fu, Y . Dong, and E. Kayacan, “A performance evaluation of detectors and descriptors for UA V visual tracking, ” in Proc. Int. Conf. Contr ol, Automation, Robotics and V ision (ICARCV) , Nov . 2016, pp. 1–6. [15] S. Xu, K. Doanay , and H. Hmam, “Distributed PLKF using delayed information sharing for 3D A O A target tracking, ” in Proc. Eur opean Signal Processing Conference (EUSIPCO) , Aug. 2016, pp. 2070–2074. [16] Y .-K. Jung, Y . Lee, and W .-S. Ra, “Optimal formation of cooperativ e multiple UA Vs for enhancing passiv e target tracking performance, ” in Pr oc. Conf. Industrial Electr onics Society , Oct. 2016, pp. 1004–1009. [17] J. Liang and Q. Liang, “RF emitter location using a network of small unmanned aerial vehicles (SUA Vs), ” in Pr oc. IEEE Int. Conf. Commun. (ICC) , June 2011, pp. 1–6. [18] F . Koohif ar , A. Kumbhar , and I. Guv enc, “Receding horizon multi- U A V cooperative tracking of moving RF source, ” IEEE Commun. Lett. , vol. 21, no. 6, pp. 1433–1436, Aug. 2016. [19] S. M. M. Dehghan, M. S. T a vakkol i, and H. Moradi, “Path planning for localization of an RF source by multiple UA Vs on the Crammer-Rao Lower Bound, ” in Proc Int. Conf. Robotics and Mechatr onics (ICRoM) , Feb 2013, pp. 68–73. [20] P . W . Sarunic and R. J. Evans, “Trajectory control of autonomous ﬁxed- wing aircraft performing multiple target passive detection and tracking, ” in Proc. Int. Con. Intelligent Sensors, Sensor Networks and Information Pr ocessing , Dec 2010, pp. 169–174. [21] P . Sarunic and R. Evans, “Hierarchical model predictive control of U A Vs performing multitarget-multisensor tracking, ” IEEE T rans. Aer ospace and Electr on. Sys. , vol. 50, no. 3, pp. 2253–2268, July 2014. [22] S. Pagano, S. Peirani, and M. V alle, “Indoor ranging and localisation algorithm based on receiv ed signal strength indicator using statistic parameters for wireless sensor networks, ” IET W ir eless Sensor Sys. , vol. 5, no. 5, pp. 243–249, 2015. [23] D. Pack, P . DeLima, G. T oussaint, and G. Y ork, “Cooperative control of U A Vs for localization of intermittently emitting mobile targets, ” IEEE T rans. Systems, Man, and Cybernetics, P art B: Cybernetics , vol. 39, no. 4, pp. 959–970, Aug. 2009. [24] K. Purvis, K. Astrom, and M. Khammash, “Estimation and optimal conﬁgurations for localization using cooperativ e UA Vs, ” IEEE T rans. Contr ol Sys. T echnol. , vol. 16, no. 5, pp. 947–958, Sep. 2008. [25] M. Hernandez, “Optimal sensor trajectories in bearings-only tracking, ” T echnical report, 2004. [Online]. A vailable: http://citeseerx.ist.psu.edu/ viewdoc/summary?doi=10.1.1.1.3703 [26] P . Tichavsk y , C. Muravchik, and A. Nehorai, “Posterior Cramer-Rao bounds for discrete-time nonlinear ﬁltering, ” IEEE T rans. Signal Pr o- cessing , vol. 46, no. 5, pp. 1386–1396, May 1998. [27] Y . Oshman and P . Davidson, “Optimization of observer trajectories for bearings-only target localization, ” IEEE T rans. Aer ospace and Electr on. Sys. , vol. 35, no. 3, pp. 892–902, Jul 1999. [28] S. M. Kay , Fundamentals of statistical signal processing: Estimation theory . Upper Saddle River , NJ, USA: Prentice-Hall, Inc., 1993. [29] N. H. Nguyen and K. Doanay , “Optimal geometry analysis for multi- static TO A localization, ” IEEE T rans. Signal Processing , vol. 64, no. 16, pp. 4180–4193, Aug. 2016. [30] S. Kay , Fundamentals of statistical signal processing: Detection theory , ser . Prentice Hall Signal Processing Series. Prentice-Hall PTR, 1998.

Autonomous Tracking of Intermittent RF Source Using a UAV Swarm

Original Paper

Comments & Academic Discussion

Leave a Comment

Original Paper

Related Papers

Comments & Academic Discussion

Leave a Comment