Estimation and Tracking of a Moving Target by Unmanned Aerial Vehicles
Authors: Jun-Ming Li, Ching Wen Chen, Teng-Hu Cheng
Abstract—An image-based control strategy along with estimation of target motion is developed to track dynamic targets without motion constraints. To the best of our knowledge, this is the first work that utilizes a bounding box as the image feature for tracking control and estimation of a dynamic target without motion constraints. The features generated from a You-Only-Look-Once (YOLO) deep neural network relax the assumption, common in most of the literature, that feature points are continuously available, and narrow the gap to practical applications. The challenges are that the motion pattern of the target is unknown and modeling its dynamics is infeasible. To resolve these issues, the dynamics of the target is modeled by a constant-velocity model and employed as the process model in an Unscented Kalman Filter (UKF); however, the process noise is uncertain, and system stability is sensitive to it. To ensure convergence of the estimate error, the noise covariance matrix is estimated from history data within a moving window. The estimated motion from the UKF is implemented as a feedforward term in the developed controller, so that tracking performance is enhanced. Simulations verify the efficacy of the developed estimator and controller.

Index Terms—Unscented Kalman Filter, Estimation, Tracking of moving targets, UAV

I. INTRODUCTION

Knowledge about the position and velocity of surrounding objects is important to booming fields such as self-driving cars, target tracking, and monitoring. When performing an object tracking task, the position and velocity of the tracked target are typically assumed to be available to achieve better control performance with visual servo controllers [1], [2].
When the target is not static, its velocity needs to be considered in the system dynamics to eliminate the tracking error and to compute an accurate motion command for the camera. However, obtaining this knowledge online is challenging, since the dynamics of the target might be complicated and unknown. Moreover, the measurement can fail unexpectedly; for example, the target can leave the field of view (FOV) of the camera, or go undetected due to unexpected occlusion. Several approaches have been proposed for estimating the position or velocity of a target, such as using a fixed camera [3], sensor networks [4]–[6], radar [7], or known reference information in the image scene [8]. To integrate with vision-based applications such as target tracking, exploration, visual servo control, and navigation [1], [9]–[11], an algorithm for a monocular camera to estimate the position and velocity of a moving target is developed in this work.

1 Department of Mechanical Engineering, National Chiao Tung University, Hsinchu, Taiwan 30010. Email: michael1874888@gmail.com, amyking90511@gmail.com, tenghu@g2.nctu.edu.tw. This research is supported by the Ministry of Science and Technology, Taiwan (Grant Numbers 107-2628-E-009-005-MY3 and 106-2622-8-009-017), and partially supported by Pervasive Artificial Intelligence Research (PAIR) Labs, Taiwan (Grant Number MOST 108-2634-F-009-006-).

To continuously estimate the position or velocity of a target, the target needs to remain in the field of view of the camera, and therefore the motion of the target should be considered. Structure from Motion (SfM) and Structure and Motion (SaM) methods are commonly used to reconstruct the relative position and motion between the vision system and objects [1], [2]. With knowledge of the length between two feature points, [12] proposed methods to estimate the position of stationary features.
In [13], the 3D Euclidean structure of a static object is estimated based on the linear and angular velocities of a single camera mounted on a mobile platform, and this assumption is relaxed in [14]. However, SfM can only estimate the position of the object, and the object is usually assumed to be stationary. To estimate the motion of moving objects, SaM is applied by exploiting knowledge of the camera motion. Nonlinear observers are proposed in [15] and [16] to estimate the structure of a moving object: the velocity of the object in [15] is assumed to be constant, and [16] relaxes the constant-velocity assumption to time-varying velocities for targets moving in a straight line or on a ground plane. In practice, measurements can be intermittent when the object is occluded, outside the camera FOV, etc. Dwell-time conditions are developed in [17]–[19] to guarantee that the state estimate error converges to an ultimate bound under intermittent measurement. However, the estimation in [17]–[19] relies on knowledge of the velocities of the moving object and the camera, while in practice the velocity of the target is usually unknown, and modeling its dynamics is complicated and challenging.

In fact, the target motion estimator and the vision-based controller are inseparable. Specifically, the output of a high-performance target motion estimator can be used as a feedforward term for the controller to keep the target in the field of view longer, which, in return, provides a longer period for the estimate error to converge. In this work, a dynamic monocular camera is employed to estimate the position and velocity of a moving target. Compared to a multi-camera system [20], a monocular camera has the advantage of reducing power consumption and the quantity of image data.
A You-Only-Look-Once (YOLO) deep neural network [21] is applied in this work for target detection, which relaxes the assumption of continuously available feature points and narrows the gap to practical applications, but it also introduces some challenges. That is, the detected box enclosing the target can lead to intermittent measurements, and the probability distribution of the noise induced by the inaccurate motion model may not be normal. An Unscented Kalman Filter (UKF) based algorithm is developed in this work to deal with intermittent measurements and to continuously estimate the target motion even when the target leaves the FOV. To deal with the uncertain noise during estimation, the method in [22] is applied to update the process noise covariance matrix online, guaranteeing the convergence and accuracy of the estimation.

II. PRELIMINARIES AND PROBLEM STATEMENT

A. Kinematics Model

Fig. 1. Kinematics model.

Based on the model in [19], Fig. 1 depicts the relationship between a moving target, a camera, and an image plane. The camera is rigidly mounted on the multirotor without relative motion. The subscript G denotes the inertial frame with its origin set arbitrarily on the ground, and the subscript C denotes the body-fixed camera frame with its origin fixed at the principal point of the camera, where $Z_c$ and $X_c$ are axes with the denoted directions. The vector $r_q = [x_q\ y_q\ z_q]^T$ denotes the position of the feature point of the target, which is unknown and to be estimated; $r_c = [x_c\ y_c\ z_c]^T$ denotes the position of the camera, which can be measured by the embedded GPS/motion capture system; and $r_{q/c} = [X\ Y\ Z]^T$ denotes the relative position between the feature point and the camera, all expressed in the camera frame. Their relation can be written as

$$r_{q/c} = r_q - r_c. \tag{1}$$

Taking the time derivative of both sides of (1) yields the relative velocity

$$\dot{r}_{q/c} = V_q - V_c - \omega_c \times r_{q/c}, \tag{2}$$

where $V_c \triangleq [v_{cx}\ v_{cy}\ v_{cz}]^T$ is the linear velocity of the camera and $\omega_c \triangleq [\omega_{cx}\ \omega_{cy}\ \omega_{cz}]^T$ is the angular velocity of the camera; both are control commands to be designed. In (2), $V_q = [v_{qx}\ v_{qy}\ v_{qz}]^T$ is the linear velocity of the dynamic target, which is unknown and needs to be estimated. To relax the limitations of existing results, the following assumption is made throughout this work.

Assumption 1. The trajectory of the target is unknown but bounded.

Since the dynamics of the camera and the target are coupled, the states of the overall system are defined as

$$x = [x_1\ x_2\ x_3\ x_q\ y_q\ z_q\ v_{qx}\ v_{qy}\ v_{qz}]^T. \tag{3}$$

To estimate the position and velocity of the target, the state $[x_1\ x_2\ x_3]^T = [\tfrac{X}{Z}\ \tfrac{Y}{Z}\ \tfrac{1}{Z}]^T$ is defined to facilitate the subsequent analysis. Taking the time derivative of both sides of (3) and using (2) yields a nonlinear function that represents the dynamics of the overall system:

$$\dot{x} = \begin{bmatrix} v_{qx}x_3 - v_{qz}x_1x_3 + \zeta_1 + \eta_1 \\ v_{qy}x_3 - v_{qz}x_2x_3 + \zeta_2 + \eta_2 \\ -v_{qz}x_3^2 + v_{cz}x_3^2 - (\omega_{cy}x_1 - \omega_{cx}x_2)x_3 \\ V_q \\ 0_{3\times 1} \end{bmatrix}, \tag{4}$$

where $\zeta_1, \zeta_2, \eta_1, \eta_2 \in \mathbb{R}$ are defined as

$$\begin{aligned} \zeta_1 &= \omega_{cz}x_2 - \omega_{cy} - \omega_{cy}x_1^2 + \omega_{cx}x_1x_2, \\ \zeta_2 &= -\omega_{cz}x_1 + \omega_{cx} + \omega_{cx}x_2^2 - \omega_{cy}x_1x_2, \\ \eta_1 &= (v_{cz}x_1 - v_{cx})x_3, \\ \eta_2 &= (v_{cz}x_2 - v_{cy})x_3. \end{aligned} \tag{5}$$

Remark 1. Since the trajectory and motion pattern of the target are unknown, the target is modeled by zero-acceleration (i.e., constant-velocity) dynamics as formulated in (4), which is reasonable over a short sampling time given the non-negligible mass of the moving target. The mismatch between the true and modeled dynamics can be treated as process noise in the UKF developed in the subsequent section.

B. Image Model

Fig. 2. The images of the dynamic targets are captured from an onboard camera on the multirotor in the Gazebo simulator.
Note that the center of the bounding box is considered as a feature point for the subsequent analysis, and the bounding boxes, enclosing the vehicles from different angles of inclination, are obtained from a YOLO network trained for this work. Projecting the feature point Q into the image frame using the pinhole model yields the projection point $q = [x_1\ x_2]^T \in \mathbb{R}^2$ as

$$x_1 = \frac{u - c_u}{f_x}, \qquad x_2 = \frac{v - c_v}{f_y}, \tag{6}$$

where $[u, v]^T$ denotes the position of the feature point in the image frame, $f_x$ and $f_y$ are the focal lengths in pixel units, and $[c_u, c_v]^T$ represents the position of the center of the image. The area of the bounding box is defined as $a$, and based on the pinhole model the relation between $a$ and $x_3$ can be expressed as

$$a = A f_x f_y x_3^2, \tag{7}$$

where $A$ is the side area of the target observed by the camera.^1

Assumption 2. The optical axis of the camera remains perpendicular to $A$ to ensure better detection accuracy from YOLO.

Remark 2. To estimate $x_3$ precisely from (7), $A$ needs to be accurate. Since $A$ is a fixed value, the optical axis of the camera needs to remain at a fixed angle relative to the plane of $A$.

C. Measurement Model

To correct the unobserved system states, the measurement is defined as $z = [u\ v\ a\ r_c^T]^T$, where $u$, $v$, and $a$ can be obtained directly from the detected bounding box, and $r_c$ is measurable as described in Section II-A. Using (1), (6), and (7), the estimated measurement $\hat{z}$ for the UKF can be obtained as

$$\hat{z} = \begin{bmatrix} f_x\hat{x}_1 + c_u \\ f_y\hat{x}_2 + c_v \\ A f_x f_y \hat{x}_3^2\, \mathrm{sgn}(\hat{x}_3) \\ \hat{r}_q - \hat{r}_{q/c} \end{bmatrix}, \tag{8}$$

where $\mathrm{sgn}(\cdot)$ is the signum function and $\hat{(\cdot)}$ denotes the estimate of the corresponding quantity obtained from the prediction step of the UKF developed in the next section. In (7), the bounding-box area $a$ remains positive regardless of the sign of $x_3$, which is positive since the depth is nonnegative. Therefore, to ensure that $\hat{x}_3$ converges to a positive value, the term $\mathrm{sgn}(\hat{x}_3)$ is added in (8).

Remark 3. Despite the aforementioned advantages, bounding boxes can be lost unexpectedly, or the Intersection over Union (IoU) may sometimes decrease, leading to intermittent or inaccurate measurements. These defects, inherited from data-driven detection, motivate the need for a Kalman filter for estimation. Moreover, as the target velocity changes, the state predicted by the constant-velocity dynamics model can be inaccurate, and the prediction error can be considered as process noise.

^1 Given a sedan as the target, $A$ is about 4.6 m × 1.5 m.

III. POSITION AND VELOCITY ESTIMATION

A. Unscented Kalman Filter

To estimate the state of a dynamic system with noisy or intermittent measurements, the Unscented Kalman Filter [23] is applied in this work. Based on (4) and (8), the UKF for the nonlinear dynamic system can be expressed as

$$x_{k+1} = F(x_k) + w_k, \tag{9}$$
$$z_k = H(x_k) + v_k, \tag{10}$$

where $w_k$ and $v_k$ represent the process and measurement noise, respectively, and $F(\cdot)$ and $H(\cdot)$ are the nonlinear dynamics and measurement models defined in (4) and (8), respectively. As noted in Remark 3, the YOLO detection might fail occasionally, which makes the measurement correction step in (10) unavailable. When this happens, the state is predicted only by the dynamics model in (9) and used as a feedforward term to keep the target in the field of view, which is reliable for a short period of time before the detection recovers.

B. Estimation of Noise Covariance Matrices

When applying a Kalman filter, the process and measurement noise covariance matrices are usually provided a priori. As mentioned in Remark 3, the unmodeled dynamics can be considered as process noise, and the convergence of the estimation is sensitive to the covariance matrix. It has been confirmed in our simulations that inaccurate constant covariance matrices can lead to large estimate errors or convergence failure.
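As a concrete illustration of Section II and the filter cycle above, the process model (4)-(5), the measurement map (8), and the predict-only behavior under lost detections can be sketched as follows. This is a minimal sketch, not the implementation used in this work: the `ukf` object in `ukf_cycle` is assumed to expose generic `predict()`/`update(z)` methods (any UKF library with that interface would do), and the relative position $r_{q/c}$ is recovered from $(x_1, x_2, x_3)$ via $X = x_1/x_3$, $Y = x_2/x_3$, $Z = 1/x_3$.

```python
import numpy as np

def process_model(x, v_c, w_c):
    """Right-hand side of (4)-(5) for the 9-state vector
    x = [x1, x2, x3, xq, yq, zq, vqx, vqy, vqz]."""
    x1, x2, x3 = x[0], x[1], x[2]
    vq = x[6:9]                     # target velocity (to be estimated)
    vcx, vcy, vcz = v_c             # camera linear velocity command
    wcx, wcy, wcz = w_c             # camera angular velocity command
    # Auxiliary terms from (5)
    zeta1 = wcz * x2 - wcy - wcy * x1**2 + wcx * x1 * x2
    zeta2 = -wcz * x1 + wcx + wcx * x2**2 - wcy * x1 * x2
    eta1 = (vcz * x1 - vcx) * x3
    eta2 = (vcz * x2 - vcy) * x3
    xdot = np.empty(9)
    xdot[0] = vq[0] * x3 - vq[2] * x1 * x3 + zeta1 + eta1
    xdot[1] = vq[1] * x3 - vq[2] * x2 * x3 + zeta2 + eta2
    xdot[2] = -vq[2] * x3**2 + vcz * x3**2 - (wcy * x1 - wcx * x2) * x3
    xdot[3:6] = vq                  # target position driven by its velocity
    xdot[6:9] = 0.0                 # zero acceleration (Remark 1)
    return xdot

def measurement_model(x, fx, fy, cu, cv, A):
    """Estimated measurement z_hat = [u, v, a, r_c] from (8)."""
    x1, x2, x3 = x[0], x[1], x[2]
    u = fx * x1 + cu                          # invert (6)
    v = fy * x2 + cv
    a = A * fx * fy * x3**2 * np.sign(x3)     # (7), signed so x3 converges > 0
    r_qc = np.array([x1 / x3, x2 / x3, 1.0 / x3])
    r_c = x[3:6] - r_qc                       # camera position via (1)
    return np.concatenate(([u, v, a], r_c))

def ukf_cycle(ukf, z, detection_ok):
    """One filter cycle: skip the correction (10) when the box is lost."""
    ukf.predict()                  # always propagate through (9)
    if detection_ok and z is not None:
        ukf.update(z)              # correct only when YOLO returns a box
    return ukf.x                   # predicted-only state still feeds the controller
```

With a static camera and a static target the derivative is identically zero, while a nonzero target velocity propagates into both the image-plane states and the target position, as (4) prescribes.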
To dynamically estimate the process noise covariance matrix, the method developed in [22] is applied in this work to estimate and update the covariance matrix online, so that faster and more reliable convergence can be obtained. That is, the process noise $w_k$ is assumed to be an uncorrelated, time-varying, nonzero-mean Gaussian white noise that satisfies

$$Q_k\delta_{kj} = \mathrm{cov}(w_k, w_j), \tag{11}$$

where $\delta_{kj}$ is the Kronecker delta function. By selecting a window of size $N$, the estimate of the process noise covariance matrix $\hat{Q}_{k-1} \in \mathbb{R}^{m\times m}$ can be expressed as

$$\hat{Q}_{k-1} = \sum_{j=1}^{N} v_j \left[ P_{k-j} + K_{k-j}\varepsilon_{k-j}\varepsilon_{k-j}^T K_{k-j}^T - \sum_{i=0}^{2n} \omega_i^c \left(\xi_{i,k-j/k-1-j} - \hat{X}_{k-j/k-1-j}\right)\left(\xi_{i,k-j/k-1-j} - \hat{X}_{k-j/k-1-j}\right)^T \right]. \tag{12}$$

Since $\hat{Q}_{k-1}$ might not be diagonal and positive definite, it is further converted into a diagonal, positive definite matrix as

$$\hat{Q}^*_{k-1} = \mathrm{diag}\left\{ |\hat{Q}_{k-1}(1)|, |\hat{Q}_{k-1}(2)|, \cdots, |\hat{Q}_{k-1}(m)| \right\}, \tag{13}$$

where $\hat{Q}_{k-1}(i)$ is the $i$-th diagonal element of $\hat{Q}_{k-1}$. The measurement noise, on the other hand, can be measured in advance.

IV. TRACKING CONTROL

In this section, a motion controller for the multirotor is designed using vision feedback. Compared to existing Image-Based Visual Servo (IBVS) control methods [24], the controller developed in this work not only uses feedback but also includes a feedforward term, obtained from the UKF developed in Section III-A, to compensate for the target motion and ensure better tracking performance. Most existing approaches focus either on estimating the target position/velocity or the camera position/velocity, but the controllers designed for the cameras are rarely discussed, and vice versa. Additionally, estimating the target motion and controlling the camera are highly coupled.
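A simplified sketch of the windowed covariance update of Section III-B is given below. It keeps only the dominant innovation term $K\varepsilon\varepsilon^T K^T$ of (12), uses uniform weights in place of the random weights $v_j$, and omits the sigma-point-spread correction; the full method is in [22]. The diagonalization step follows (13).

```python
import numpy as np
from collections import deque

class WindowedQEstimator:
    """Simplified moving-window estimate of the process noise covariance.

    Averages K @ eps @ eps.T @ K.T over the last N steps (the dominant
    term in (12)); the full method in [22] also subtracts the sigma-point
    spread and uses random weights v_j.  The result is forced diagonal
    and positive definite as in (13).
    """

    def __init__(self, N=150):
        self.terms = deque(maxlen=N)   # drops the oldest term automatically

    def push(self, K, eps):
        """K: Kalman gain, eps: innovation z - z_hat at one step."""
        self.terms.append(K @ np.outer(eps, eps) @ K.T)

    def estimate(self):
        Q_hat = np.mean(self.terms, axis=0)
        # (13): keep |diagonal| only so Q* is diagonal and positive definite
        return np.diag(np.abs(np.diag(Q_hat)))
```

In the filter loop, `push` would be called after each correction step, and the UKF's process noise covariance would be replaced by `estimate()` once the window is populated.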
That is, a high-performance motion controller can minimize the estimate error (i.e., the camera is controlled to keep the target in the field of view longer), which, in return, yields a precise feedforward term that facilitates tracking performance, and vice versa. Finally, since the YOLO deep neural network is employed to enclose the target in the image, the area of the enclosing box is defined as a new reference signal for the controller to track.

A. Target Recognition

YOLO [21] is a real-time object detection system with reasonable accuracy after training. Our YOLO network is trained on a large dataset, and its performance is verified before implementation in this work.

B. Controller

The IBVS controller based on [25] is employed in this work to achieve tracking control of dynamic targets. To this end, a feature vector $s(t) = [x_1, x_2, x_3]^T : [0, \infty) \to \mathbb{R}^3$ is defined as the control state, as given in (3). The visual error $e(t) : [0, \infty) \to \mathbb{R}^3$ to be controlled is defined as

$$e = s - s^*, \tag{14}$$

where $s^* \in \mathbb{R}^3$ is a constant desired feature vector predefined by the user (typically, $[x_1^*, x_2^*]^T$ is selected as the center of the image and $x_3^*$ is a function of the expected distance to the target). Taking the time derivative of (14) and using (4) yields the open-loop error system

$$\dot{e} = \dot{s} = L_e \begin{bmatrix} V_c - V_q \\ \omega_c \end{bmatrix},$$

where $V_c$ and $\omega_c$ are the control inputs, $V_q$ is the feedforward term estimated by the UKF, and $L_e \in \mathbb{R}^{3\times 6}$ is the interaction matrix defined as

$$L_e = \begin{bmatrix} -x_3 & 0 & x_1x_3 & x_1x_2 & -(x_1^2+1) & x_2 \\ 0 & -x_3 & x_2x_3 & x_2^2+1 & -x_1x_2 & -x_1 \\ 0 & 0 & x_3^2 & x_2x_3 & -x_1x_3 & 0 \end{bmatrix}. \tag{15}$$

Note that even as the error signal $e$ converges to zero, the position of the camera relative to the target is not unique, because the camera control inputs $V_c$ and $\omega_c$ have a higher dimension than $e$.
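The open-loop system above suggests the basic feedback-plus-feedforward law before any column of $L_e$ is specialized: solve $L_e\,[V_c - V_q;\ \omega_c] = -\lambda e$ for the least-norm input and add back the UKF velocity estimate. A minimal sketch, with $\lambda$ and the pseudo-inverse taken from the standard IBVS law:

```python
import numpy as np

def interaction_matrix(x1, x2, x3):
    """Interaction matrix L_e from (15)."""
    return np.array([
        [-x3, 0.0, x1 * x3, x1 * x2, -(x1**2 + 1.0), x2],
        [0.0, -x3, x2 * x3, x2**2 + 1.0, -x1 * x2, -x1],
        [0.0, 0.0, x3**2, x2 * x3, -x1 * x3, 0.0],
    ])

def ibvs_command(s, s_star, vq_hat, lam=1.0):
    """Camera twist [V_c; w_c] driving e = s - s* toward zero.

    Solves L_e @ [V_c - V_q; w_c] = -lam * e in the least-norm sense,
    then adds the UKF estimate vq_hat back as the feedforward term.
    """
    e = np.asarray(s, dtype=float) - np.asarray(s_star, dtype=float)
    L = interaction_matrix(*s)
    u = -lam * np.linalg.pinv(L) @ e    # [V_c - V_q; w_c]
    u[0:3] += vq_hat                    # feedforward: V_c = u_lin + V_q
    return u                            # [vcx, vcy, vcz, wcx, wcy, wcz]
```

With zero visual error the command reduces to pure feedforward, i.e., the camera simply matches the estimated target velocity, which is exactly the behavior the coupling argument above calls for.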
To keep the camera on the left-hand side of the target so as to maintain high detection accuracy from YOLO,^2 $v_{cx}$ is controlled so that the camera tracks the moving target while staying at the specified angle facing toward it:

$$v_{cx} = -\frac{w_{im}\, d_{exp}\, (\psi - \psi_{exp})}{FOV_u \times f_x}, \tag{16}$$

where the design is inspired by [26]. In (16), $w_{im}$ is the width of the image in pixels, $d_{exp}$ denotes the expected distance to the target, $FOV_u$ is the horizontal field of view of the camera, and $\psi$ and $\psi_{exp}$ are the current and expected angles of view with respect to the target, respectively. Since $v_{cx}$ is specified by (16), the corresponding column of the interaction matrix $L_e$ in (15) can be removed, which gives the reduced matrix $\hat{L}_e \in \mathbb{R}^{3\times 5}$:

$$\hat{L}_e = \begin{bmatrix} 0 & x_1x_3 & x_1x_2 & -(x_1^2+1) & x_2 \\ -x_3 & x_2x_3 & x_2^2+1 & -x_1x_2 & -x_1 \\ 0 & x_3^2 & x_2x_3 & -x_1x_3 & 0 \end{bmatrix}. \tag{17}$$

Using the Moore–Penrose pseudo-inverse of $\hat{L}_e$ and adding the feedforward term, the tracking controller for the camera is designed as

$$v_{cx} = -\frac{w_{im}\, d_{exp}\, (\psi - \psi_{exp})}{FOV_u \times f_x} + v_{qx},$$
$$[v_{cy}\ v_{cz}\ \omega_{cx}\ \omega_{cy}\ \omega_{cz}]^T = -\lambda \hat{L}_e^{+} e + [v_{qy}\ v_{qz}\ 0\ 0\ 0]^T.$$

The block diagram of the controller is shown in Fig. 3.

Fig. 3. Block diagram of the controller.

V. SIMULATIONS

A. Environment Setup

In the simulation,^3 a car is considered as a moving target and is tracked by a quadrotor on which the developed controller and the UKF are implemented. A camera on the quadrotor provides visual feedback. Specifically, bounding boxes are generated in the image to enclose detected cars, as shown in Fig. 2, by a pretrained YOLO deep neural network. The simulation is conducted in the ROS framework (16.04, kinetic) with the Gazebo simulator.

^2 Pose estimation of the target at this angle can be achieved by a well-trained YOLO, and the extension to multiple angles of view will be trained in the future.
^3 https://goo.gl/93EDnd
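Before looking at the numbers, note that the depth needed by the estimator follows directly from inverting (7). A small helper, using the sedan side area from the earlier footnote as an illustrative value (the focal lengths are whatever calibration returns):

```python
import math

def depth_from_box_area(a, A, fx, fy):
    """Invert (7): a = A * fx * fy * x3**2 with x3 = 1/Z,
    so the depth is Z = sqrt(A * fx * fy / a).

    a      -- detected bounding-box area in pixels^2
    A      -- side area of the target in m^2 (sedan: ~4.6 m x 1.5 m)
    fx, fy -- focal lengths in pixels (from camera calibration)
    """
    return math.sqrt(A * fx * fy / a)
```

This is only the static inversion; in the filter, $x_3$ is corrected through (8) rather than computed point-wise, which is what makes the estimate robust to noisy box areas.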
In the simulation environment, $A = 4.6\,\mathrm{m} \times 1.5\,\mathrm{m}$, and the resolution of the image is 640 × 480 at 50 fps. The intrinsic parameter matrix of the camera, obtained by calibration, is

$$K = \begin{bmatrix} 381.36 & 0 & 320.5 \\ 0 & 381.36 & 240.5 \\ 0 & 0 & 1 \end{bmatrix}.$$

The width $N$ of the moving time window is set to 150 with a sampling rate of 50 samples/sec. The initial process and measurement noise covariance matrices are selected as $\mathrm{diag}\{[20, 20, 500, 0.0001, 0.0001, 0.0001]\}$ and $\mathrm{diag}\{[0.08, 0.08, 0.02, 5, 5, 5, 1, 1, 1]\} \times 10^{-2}$, respectively, and the process noise covariance matrix is estimated online using (13). The initial locations of the car and the drone are $[0\ 0\ 0]^T$ and $[0\ 5.5\ 1.0]^T$, with initial orientations $[0\ 0\ 0]^T$ and $[0\ 0\ -\frac{\pi}{2}]^T$ in radians, respectively, all expressed in the global frame. The car is free to move on the X-Y plane, and its velocity is specified by real-time user commands.

B. Simulation Results

Fig. 4 depicts the position estimate errors of the moving vehicle over a simulation period of 223 seconds. The position estimate errors are reduced from 14% to 7% as the target transitions from time-varying velocity to constant velocity, despite some noise. Note that in practice the drone may not react fast enough to rapid changes of velocity, in which case the optical axis cannot remain facing the target directly, leading to a slight deviation in the depth estimate (i.e., $x_3$). Note also that the position estimate error along the Z-axis increases as the velocity along the Y-axis increases. This can be attributed to the fact that the increasing velocity of the vehicle along the Y-axis causes the quadrotor to tilt forward for tracking, which violates Assumption 2 that the optical axis faces the side of the vehicle, leading to a larger estimate error.

Remark 4. In the Gazebo environment, the velocity of the car is set below 4 m/s due to a large drag. As the speed of the car increases, the quadrotor accelerates with a tilt angle, which increases the chance of object detection failure. However, this problem can be resolved by expanding the training dataset with images from different angles of view, which will be part of the future work.

Fig. 4. Estimate and ground truth of the target trajectory in the X-Y plane in the global frame.

Fig. 5 depicts the velocity estimate errors of the moving vehicle. The estimation performance is slightly compromised when the vehicle accelerates (due to the constant-velocity model utilized in the UKF), but better performance can be expected by increasing the sensing rate of the UKF measurement. The increase of the velocity estimate error between 103 and 223 seconds can be attributed to the acceleration of the target in the Y-direction.

Fig. 5. Estimate and ground truth of the target velocity in the X and Y directions of the global frame.

Fig. 6 depicts the estimate error without process noise estimation, where the constant covariance matrix Q is kept at its initial value. Compared to Fig. 5, using the noise covariance matrix estimated by (13) yields better velocity estimation performance.

Fig. 6. Estimate and ground truth of the target velocity without estimation of the noise covariance matrix.

VI. CONCLUSION

In this work, a motion controller for a camera on a UAV is developed to track a dynamic target with unknown motion and without motion constraints. The unknown target motion is estimated by the developed UKF, with the process noise covariance matrix estimated from past data within a moving window, and the intermittent measurements caused by YOLO detection are addressed. The estimated target velocity is then included as a feedforward term in the developed controller to improve tracking performance. Compared to the case without noise estimation, the developed approach is shown to obtain better tracking performance.
Although Assumption 2 is restrictive in practice, this work is the first to demonstrate the feasibility of the overall control architecture, and future work will relax the assumption by training a YOLO network to detect the target from arbitrary angles. Additionally, eliminating the need for ground-truth knowledge will be another direction of future work.

REFERENCES

[1] N. R. Gans, A. Dani, and W. E. Dixon, "Visual servoing to an arbitrary pose with respect to an object given a single known length," in Proc. Am. Control Conf., Seattle, WA, USA, Jun. 2008, pp. 1261–1267.
[2] A. Dani, N. Gans, and W. E. Dixon, "Position-based visual servo control of leader-follower formation using image-based relative pose and relative velocity estimation," in Proc. Am. Control Conf., St. Louis, MO, USA, Jun. 2009, pp. 5271–5276.
[3] V. Chitrakaran, D. M. Dawson, W. E. Dixon, and J. Chen, "Identification of a moving object's velocity with a fixed camera," Automatica, vol. 41, pp. 553–562, 2005.
[4] A. Stroupe, M. Martin, and T. Balch, "Distributed sensor fusion for object position estimation by multi-robot systems," in Proc. IEEE Int. Conf. Robotics and Automation, Seoul, South Korea, May 2001, pp. 1092–1098.
[5] A. Kamthe, L. Jiang, M. Dudys, and A. Cerpa, "SCOPES: Smart cameras object position estimation system," in EWSN '09, Berlin, Heidelberg, Feb. 2009, pp. 279–295.
[6] H. S. Ahn and K. H. Ko, "Simple pedestrian localization algorithms based on distributed wireless sensor networks," IEEE Trans. Ind. Electron., vol. 56, no. 10, pp. 4296–4302, Mar. 2009.
[7] J. Schlichenmaier, N. Selvaraj, M. Stolz, and C. Waldschmidt, "Template matching for radar-based orientation and position estimation in automotive scenarios," in Proc. IEEE MTT-S Int. Conf. Microw. Intell. Mobility (ICMIM), Nagoya, Japan, Mar. 2017, pp. 95–98.
[8] Z. Huang, W. Liu, and J. Zhong, "Estimating the real positions of objects in images by using evolutionary algorithm," in Proc. Int. Conf. Mach. Vis. Inf. Technol., Singapore, Feb. 2017, pp. 34–39.
[9] J. Thomas, J. Welde, G. Loianno, K. Daniilidis, and V. Kumar, "Autonomous flight for detection, localization, and tracking of moving targets with a small quadrotor," IEEE Robot. Autom. Lett., vol. 2, no. 3, pp. 1762–1769, Jul. 2017.
[10] E. Palazzolo and C. Stachniss, "Information-driven autonomous exploration for a vision-based MAV," in Proc. ISPRS Int. Conf. Unmanned Aerial Vehicles in Geomatics (UAV-g), Bonn, Germany, Sep. 2017, pp. 59–66.
[11] D. Scaramuzza, M. Achtelik, L. Doitsidis, F. Fraundorfer, E. Kosmatopoulos, A. Martinelli, M. Achtelik, M. Chli, S. Chatzichristofis, L. Kneip, D. Gurdan, L. Heng, G. Lee, S. Lynen, L. Meier, M. Pollefeys, A. Renzaglia, R. Siegwart, J. Stumpf, P. Tanskanen, C. Troiani, and S. Weiss, "Vision-controlled micro flying robots: From system design to autonomous navigation and mapping in GPS-denied environments," IEEE Robot. Autom. Mag., vol. 21, no. 3, pp. 26–40, Aug. 2014.
[12] V. K. Chitrakaran and D. M. Dawson, "A Lyapunov-based method for estimation of Euclidean position of static features using a single camera," in Proc. Am. Control Conf., New York, NY, USA, Jul. 2007, pp. 1988–1993.
[13] D. Braganza, D. M. Dawson, and T. Hughes, "Euclidean position estimation of static features using a moving camera with known velocities," in Proc. IEEE Conf. Decis. Control, New Orleans, LA, USA, Dec. 2007, pp. 2695–2700.
[14] A. P. Dani, N. R. Fischer, and W. E. Dixon, "Single camera structure and motion," IEEE Trans. Autom. Control, vol. 57, no. 1, pp. 238–243, Jan. 2012.
[15] A. Dani, Z. Kan, N. Fischer, and W. E. Dixon, "Structure and motion estimation of a moving object using a moving camera," in Proc. Am. Control Conf., Baltimore, MD, USA, 2010, pp. 6962–6967.
[16] A. Dani, Z. Kan, N. Fischer, and W. E. Dixon, "Structure estimation of a moving object using a moving camera: An unknown input observer approach," in Proc. IEEE Conf. Decis. Control, Orlando, FL, USA, 2011, pp. 5005–5012.
[17] A. Parikh, T.-H. Cheng, and W. E. Dixon, "A switched systems approach to image-based localization of targets that temporarily leave the field of view," in Proc. IEEE Conf. Decis. Control, 2014, pp. 2185–2190.
[18] A. Parikh, T.-H. Cheng, and W. E. Dixon, "A switched systems approach to vision-based localization of a target with intermittent measurements," in Proc. Am. Control Conf., Jul. 2015, pp. 4443–4448.
[19] A. Parikh, T.-H. Cheng, H.-Y. Chen, and W. E. Dixon, "A switched systems framework for guaranteed convergence of image-based observers with intermittent measurements," IEEE Trans. Robot., vol. 33, no. 2, pp. 266–280, Apr. 2017.
[20] K. Zhang, J. Chen, B. Jia, and Y. Gao, "Velocity and range identification of a moving object using a static-moving camera system," in Proc. IEEE Conf. Decis. Control, Las Vegas, NV, USA, Dec. 2016, pp. 7135–7140.
[21] J. Redmon and A. Farhadi, "YOLOv3: An incremental improvement," arXiv preprint arXiv:1804.02767, 2018.
[22] S. Gao, G. Hu, and Y. Zhong, "Windowing and random weighting-based adaptive unscented Kalman filter," Int. J. Adapt. Control Signal Process., vol. 29, no. 2, pp. 201–223, Feb. 2015.
[23] E. A. Wan and R. van der Merwe, "The unscented Kalman filter for nonlinear estimation," in Proc. IEEE 2000 Adaptive Syst. Signal Process., Commun., Control Symp., Lake Louise, AB, Canada, Oct. 2000, pp. 153–158.
[24] F. Chaumette and S. Hutchinson, "Visual servo control part I: Basic approaches," IEEE Robot. Autom. Mag., vol. 13, no. 4, pp. 82–90, 2006.
[25] N. Shahriari, S. Fantasian, F. Flacco, and G. Oriolo, "Robotic visual servoing of moving targets," in Proc. IEEE/RSJ Int. Conf. Intell. Robots Syst., Tokyo, Japan, Nov. 2013, pp. 77–82.
[26] J. Pestana, J. L. Sanchez-Lopez, S. Saripalli, and P. Campoy, "Computer vision based general object following for GPS-denied multirotor unmanned vehicles," in Proc. Am. Control Conf., Portland, OR, USA, Jun. 2014, pp. 1886–1891.