Single-Agent On-line Path Planning in Continuous, Unpredictable and Highly Dynamic Environments

Universid ad T ´ ecnica Federico Sant a Mar ´ ıa Dep ar t amento de Inf orm ´ atica V alp ara ´ ıso – Chile SINGLE-A GENT ON-LINE P A TH PLANNING IN CONTINUOUS, UNPREDICT ABLE AND HIGHL Y D YNAMIC ENVIR ONMENTS T esis presen tada como requerimien to parcial para optar al grado acad ´ emico de MA G ´ ISTER EN CIENCIAS DE LA INGENIER ´ IA INF ORM ´ ATICA y al t ´ ıtulo profesional de INGENIER O CIVIL EN INF ORM ´ ATICA p or Nicol´ as Arturo Barriga Ric hards Comisi´ on Ev aluadora: Dr. Mauricio Solar (Gu ´ ıa, UTFSM) Dr. Horst H. v on Brand (UTFSM) Dr. John A tkinson (UdeC) NO V 2009 i Universid ad T ´ ecnica Federico Sant a Mar ´ ıa Dep ar t amento de Inf orm ´ atica V alp ara ´ ıso – Chile TITULO DE LA TESIS: SINGLE-A GENT ON-LINE P A TH PLANNING IN CONTIN- UOUS, UNPREDICT ABLE AND HIGHL Y D YNAMIC ENVI- R ONMENTS A UTOR: NICOL ´ AS AR TURO BARRIGA RICHARDS T esis presen tada como requerimien to parcial para optar al grado acad´ emico de Mag ´ ıster en Ciencias de la Ingenier ´ ıa Inform´ atica y al t ´ ıtulo pro- fesional de Ingeniero Civil en Inform´ atica de la Univ ersidad T ´ ecnica F ederico San ta Mar ´ ıa. Dr. Mauricio Solar Profesor Gu ´ ıa Dr. Horst H. v on Brand Profesor Correferen te Dr. John Atkinson Profesor Externo No v 2009. V alpara ´ ıso, Chile. ii R e al stupidity b e ats artiﬁcial intel ligenc e every time. T erry Pr atchett iii Ac kno wledgmen ts A mis hermanos, p or quienes he tratado de ser una mejor p ersona, para ser un buen ejemplo para ellos. A mis padres, de quienes hered ´ e lo que soy , tan to en cuerp o como en esp ´ ıritu. A mis abuelos y abuelas, especialmente mi Opapa, de quien aprend ´ ı a disfrutar la vida, y mi T ata, quien me ense ˜ n´ o el v alor de la familia. Quiero tambi ´ en agradecer a los que hicieron p osible esta tesis: Mauricio Aray a p or a yudarme a escoger el tema y en el desarrollo p osterior, a Ro drigo Aray a p or obli- garme a traba jar en medio de las v acaciones de verano, a Nicol´ as T roncoso, Diego Candel y Rob erto Bon v allet, y a mis profesores gu ´ ıas y correferentes, Mauricio Solar, Horst H. v on Brand y John A tkinson p or sus ap ortes duran te el desarrollo de esta tesis. Quiero apro vec har la op ortunidad para agradecer a los profesores que marcaron mi estad ´ ıa en la Univ ersidad: Horst H. von Brand, H´ ector Allende, Luis Salinas, Hub ert Hoﬀmann, Claudio Dib y Viktor Sl ¨ usarenk o. Finalmen te, quiero agradecer en esp ecial a P amela Saa v edra, p or estar junto a m ´ ı duran te los m´ as de diez meses que demor´ o en ser terminada esta tesis, sop ort´ andome y ap oy´ andome en los momentos m´ as dif ´ ıciles. V alpara ´ ıso, Chile Nicol´ as A. Barriga No viembre 2009 iv Resumen Este do cumen to es una tesis en el tema de planiﬁcaci´ on de caminos uniagen te y en l ´ ınea, para am bientes con tinuos, impredecibles y altamen te din´ amicos. El problema es encon trar y recorrer un camino sin colisiones para un rob ot holon´ omico, sin restric- ciones kino din´ amicas, mo vi´ endose en un am biente con v arios obst´ aculos o adv ersarios mo vi´ endose impredeciblemen te. Se asume la disponibilidad de informaci´ on perfecta del en torno en to do momen to. V arias v arian tes est´ aticas y din´ amicas del algoritmo “Rapidly Exploring Random T rees” (RR T) se exploran, as ´ ı como tam bi´ en un algoritmo ev olutivo para planiﬁcaci´ on en ambien tes din´ amicos llamado “Evolutionary Planner/Na vigator.” Se prop one una com binaci´ on de am b os algoritmos para sup erar las falencias de ambos y luego una com binaci´ on de RR T para planiﬁcaci´ on inicial y b ´ usqueda lo cal informada para na ve- gaci´ on, sumado a una heur ´ ıstica voraz simple para optimizaci´ on. Se demuestra que esta com binaci´ on de t´ ecnicas simples pro duce mejores respuestas en ambien tes altamente din´ amicos que las v ariantes RR T est´ andar. P alabras Cla v es: In teligencia artiﬁcial. planiﬁcaci´ on de rutas, RR T, multi-etapa, b ´ usqueda local informada, heur ´ ıstica v oraz, algoritmos evolutiv os v Abstract This document is a thesis on the sub ject of single-agent on-line path planning in con- tin uous,unpredictable and highly dynamic en vironmen ts. The problem is ﬁnding and tra versing a collision-free path for a holonomic rob ot, without kino dynamic restrictions, mo ving in an en vironmen t with several unpredictably moving obstacles or adversaries. The av ailabilit y of p erfect information of the environmen t at all times is assumed. Sev eral static and dynamic v ariants of the Rapidly Exploring Random T rees (RR T) algorithm are explored, as well as an evolutionary algorithm for planning in dynamic en vironments called the Ev olutionary Planner/Na vigator. A com bination of b oth kinds of algorithms is proposed to o vercome shortcomings in b oth, and then a com bination of a RR T v ariant for initial planning and informed lo cal searc h for na vigation, plus a simple greedy heuristic for optimization. W e sho w that this combination of simple tec hniques pro vides b etter resp onses to highly dynamic en vironments than the RR T extensions. Keyw ords: Artiﬁcial in telligence, motion planning, RR T, Multi-stage, informed lo cal search, greedy heuristics, ev olutionary algorithms vi Index of Con ten ts Ac knowledgmen ts iv Resumen v Abstract vi Index of Con ten ts vii List of T ables ix List of Figures x List of Algorithms xi 1 In tro duction 1 1.1 Problem F orm ulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2 1.2 Do cumen t Structure . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3 2 State of the Art 5 2.1 Rapidly-Exploring Random T ree . . . . . . . . . . . . . . . . . . . . . . 5 2.2 Retraction-Based RR T Planner . . . . . . . . . . . . . . . . . . . . . . 8 2.3 Execution Extended RR T . . . . . . . . . . . . . . . . . . . . . . . . . 8 2.4 Dynamic RR T . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9 2.5 Multipartite RR T . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11 2.6 Rapidly Exploring Ev olutionary T ree . . . . . . . . . . . . . . . . . . . 11 2.7 Multidimensional Binary Searc h T rees . . . . . . . . . . . . . . . . . . 13 2.8 Ev olutionary Planner/Na vigator . . . . . . . . . . . . . . . . . . . . . . 15 3 Prop osed T echniques 19 3.1 Com bining RR T and EP/N . . . . . . . . . . . . . . . . . . . . . . . . 19 3.1.1 The Combined Strategy . . . . . . . . . . . . . . . . . . . . . . 19 3.1.2 Algorithm Implementation . . . . . . . . . . . . . . . . . . . . . 20 vii 3.2 A Simple Multi-stage Probabilistic Algorithm . . . . . . . . . . . . . . 20 3.2.1 A Multi-stage Probabilistic Strategy . . . . . . . . . . . . . . . 21 3.2.2 Algorithm Implementation . . . . . . . . . . . . . . . . . . . . . 22 4 Exp erimen tal Setup and Results 27 4.1 Exp erimen tal Setup . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27 4.1.1 Dynamic Environmen t . . . . . . . . . . . . . . . . . . . . . . . 27 4.1.2 P artially Kno wn En vironment . . . . . . . . . . . . . . . . . . . 27 4.1.3 Unkno wn En vironment . . . . . . . . . . . . . . . . . . . . . . . 28 4.2 Implemen tation Details . . . . . . . . . . . . . . . . . . . . . . . . . . . 28 4.3 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31 4.3.1 Dynamic Environmen t Results . . . . . . . . . . . . . . . . . . . 32 4.3.2 P artially Kno wn En vironment Results . . . . . . . . . . . . . . 34 4.3.3 Unkno wn En vironment Results . . . . . . . . . . . . . . . . . . 34 5 Conclusions and F uture W ork 37 5.1 F uture W ork . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38 5.1.1 Algorithms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38 5.1.2 F ramework . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38 Bibliograph y 40 viii List of T ables 4.1 Dynamic Environmen t Results, map 1. . . . . . . . . . . . . . . . . . . 33 4.2 Dynamic Environmen t Results, map 2. . . . . . . . . . . . . . . . . . . 33 4.3 P artially Kno wn En vironment Results, map 1. . . . . . . . . . . . . . . 35 4.4 P artially Kno wn En vironment Results, map 2. . . . . . . . . . . . . . . 36 4.5 Unkno wn En vironment Results . . . . . . . . . . . . . . . . . . . . . . 36 ix List of Figures 2.1 RR T during execution . . . . . . . . . . . . . . . . . . . . . . . . . . . 7 2.2 The roles of the genetic op erators . . . . . . . . . . . . . . . . . . . . . 17 3.1 A Multi-stage Strategy for Dynamic Path Planning . . . . . . . . . . . 25 3.2 The arc op erator . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26 3.3 The mutation op erator . . . . . . . . . . . . . . . . . . . . . . . . . . . 26 4.1 The dynamic en vironment, map 1 . . . . . . . . . . . . . . . . . . . . . 28 4.2 The dynamic en vironment, map 2 . . . . . . . . . . . . . . . . . . . . . 29 4.3 The partially kno wn en vironment, map 1 . . . . . . . . . . . . . . . . . 30 4.4 The partially kno wn en vironment, map 2 . . . . . . . . . . . . . . . . . 31 4.5 The unknown environmen t . . . . . . . . . . . . . . . . . . . . . . . . . 32 4.6 Dynamic environmen t time . . . . . . . . . . . . . . . . . . . . . . . . . 34 4.7 Dynamic environmen t success rate . . . . . . . . . . . . . . . . . . . . . 35 x List of Algorithms 1 BuildRR T( q init , q goal ) . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6 2 Extend( T , q ) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6 3 RR TConnectPlanner( q init , q goal ) . . . . . . . . . . . . . . . . . . . . . . 7 4 Connect( T , q ) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7 5 Retraction-based RR T Extension . . . . . . . . . . . . . . . . . . . . . 8 6 Cho oseT arget( q , goal) . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9 7 DRR T() . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10 8 ReGro wRR T() . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10 9 T rimRR T() . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10 10 Inv alidateNo des( obstacl e ) . . . . . . . . . . . . . . . . . . . . . . . . . 11 11 MPRR TSearch( q init ) . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12 12 PruneAndPrep end( T , F , q init ) . . . . . . . . . . . . . . . . . . . . . . . . 13 13 SelectSample( F ) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13 14 ExtendT oT arget( T ) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14 15 EP/N . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16 16 Main() . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20 17 pro cessRR TEPN( time ) . . . . . . . . . . . . . . . . . . . . . . . . . . . 21 18 Main() . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22 19 pro cessMultiStage(time) . . . . . . . . . . . . . . . . . . . . . . . . . . 23 20 arc(path , ﬁrstCol) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24 21 mut(path , ﬁrstCol) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24 22 p ostPro cess( path ) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25 xi Chapter 1 In tro duction The dynamic p ath-planning problem consists in ﬁnding a suitable plan for eac h new conﬁguration of the en vironment b y recomputing a collision-free path using the new information a v ailable at eac h time step [HA92]. This kind of problem has to b e solv ed for example by a rob ot trying to na vigate through an area crowded with p eople, such as a shopping mall or sup ermark et. The problem has b een widely addressed in its sev eral ﬂa vors, suc h as cellular decomp osition of the conﬁguration space [Ste95], partial en vironmental knowledge [Ste94], high-dimensional conﬁguration spaces [KSLO96] or planning with non-holonomic constrain ts [LKJ99]. How ever, ev en simpler v ariations of this problem are complex enough that they can not b e solv ed with deterministic tec hniques, and therefore are worth y of study . This thesis is fo cused on algorithms for ﬁnding and trav ersing a collision-free path in tw o dimensional space, for a holonomic rob ot 1 , without kino dynamic restrictions 2 , in a highly dynamic en vironment, but for comparison purposes three diﬀeren t scenarios will b e tested: • Several unpredictably mo ving obstacles or adv ersaries. • Partially kno wn en vironmen t, where some obstacles b ecome visible when the rob ot approaches eac h one of them. 1 A holonomic robot is a robot in which the controllable degrees of freedom is equal to the total degrees of freedom. 2 Kino dynamic planning is a problem in which velocity and acceleration b ounds m ust b e satisﬁed 1 • T otally unkno wn en vironmen t, where every obstacle is initially in visible to the planner, and only b ecomes visible when the rob ot approaches it. Besides the obstacles in the second and third scenario w e assume that we hav e p erfect information of the environmen t at all times. W e will fo cus on con tinuous space algorithms and will not consider algorithms that use a discretized represen tation of the conﬁguration space 3 , suc h as D* [Ste95], because for high dimensional problems the conﬁguration space b ecomes in tractable in terms of b oth memory and computation time, and there is the extra diﬃcult y of calculating the discretization size, trading oﬀ accuracy v ersus computational cost. Only single agen t algorithms will b e considered here. On-line as well as oﬀ-line algorithms will b e studied. An on-line algorithm is one that is p ermanen tly adjusting its solution as the en vironmen t changes, while an oﬀ-line algorithm computes a solution only once (ho wev er, it can be executed many times). The oﬄine Rapidly-Exploring Random T ree (RR T) is eﬃcient at ﬁnding solutions, but the results are far from optimal, and m ust b e p ost-pro cessed for shortening, smo oth- ing or other qualities that might b e desirable in each particular problem. F urthermore, replanning RR Ts are costly in terms of computation time, as are evolutionary and cell-decomp osition approac hes. Therefore, the nov elty of this work is the mixture of the feasibilit y b eneﬁts of the RR Ts, the repairing capabilities of lo cal searc h, and the computational inexp ensiv eness of greedy algorithms, in to our light weigh t m ulti-stage algorithm. Our working h yp othesis will b e that a m ulti-stage algorithm, using diﬀer- en t techniques for initial planning and na vigation, outp erforms curren t probabilistic sampling techniques in highly dynamic en vironmen ts 1.1 Problem F orm ulation A t each time-step, the problem could be deﬁned as an optimization problem with satisﬁabilit y constrain ts. Therefore, given a path our ob jective is to minimize an ev aluation function (i.e., distance, time, or path-p oin ts), with the C free constrain t. 3 the space of p ossible p ositions that a ph ysical system ma y attain 2 F ormally , let the path ρ = p 1 p 2 . . . p n b e a sequence of p oints, where p i ∈ R n is a n - dimensional point ( p 1 = q init , p n = q goal ), O t ∈ O the set of obstacles p ositions at time t , and ev al : R n × O 7→ R an ev aluation function of the path dep ending on the ob ject p ositions. Our ideal ob jective is to obtain the optim um ρ ∗ path that minimizes our ev al function within a feasibility restriction in the form ρ ∗ = argmin ρ [ev al( ρ, O t )] with feas( ρ, O t ) = C free (1.1) where feas( · , · ) is a fe asibility function that equals C free if the path ρ is collision free for the obstacles O t . F or simplicity , we use very naiv e ev al( · , · ) and feas( · , · ) functions, but our approac h could b e extended easily to more complex ev aluation and feasibility functions. The feas( ρ, O t ) function used assumes that the rob ot is a p oin t ob ject in space, and therefore if no segments − − − → p i p i +1 of the path collide with any ob ject o j ∈ O t , w e say that the path is in C free . The ev al( ρ, O t ) function is the length of the path, i.e., the sum of the distances b et ween consecutive p oints. This could b e easily c hanged to an y other metric such as the time it would take to tra verse this path, accoun ting for smo othness, clearness or several other optimization criteria. 1.2 Do cumen t Structure In the following sections w e present sev eral path planning metho ds that can b e applied to the problem describ ed abov e. In section 2.1 we review the basic oﬄine, single-query RR T, a probabilistic metho d that builds a tree along the free conﬁguration space until it reac hes the goal state. Afterw ards, w e introduce the most p opular replanning v ari- an ts of RR T: Execution Extended RR T (ERR T) in section 2.3, Dynamic RR T (DRR T) in section 2.4 and Multipartite RR T (MP-RR T) in section 2.5. The Evolutionary Plan- ner/Na vigator (EP/N), along with some v ariants, is presen ted in section 2.8. Then, in section 3.1 w e present a mixed approac h, using a RR T to ﬁnd an initial solution and the EP/N to na vigate, and ﬁnally , in section 3.2 we presen t our new hybrid m ulti-stage algorithm, that uses RR T for initial planning and informed lo cal search for navigation, plus a simple greedy heuristic for optimization. Experimental results and compar- isons that show that this combination of simple tec hniques pro vides b etter resp onses 3 to highly dynamic environmen ts than the standard RR T extensions are presen ted in section 4.3. The conclusions and further work are discussed in section 5. 4 Chapter 2 State of the Art In this chapter w e presen t sev eral path planning metho ds that can b e applied to the problem describ ed ab o ve. First we will in tro duce v ariations of the Rapidly-Exploring Random T ree (RR T), a probabilistic metho d that builds a tree along the free conﬁg- uration space un til it reac hes the goal state. This family of planners is fast at ﬁnding solutions, but the solutions are far from optimal, and must be p ost-pro cessed for short- ening, smo othing or other qualities that migh t b e desirable in each particular problem. F urthermore, replanning RR Ts are costly in terms of computation time. W e then in- tro duce an ev olutionary planner with somewhat opp osite qualities: It is slow in ﬁnding feasible solutions in diﬃcult maps, but eﬃcient at replanning when a feasible solution has already b een found. It can also optimize the solution according to an y given ﬁtness function without the need for a p ost-processing step. 2.1 Rapidly-Exploring Random T ree One of the most successful probabilistic metho ds for oﬄine path planning currently in use is the Rapidly-Exploring Random T ree (RR T), a single-query planner for static en vironments, ﬁrst introduced in [La v98]. RR Ts w orks tow ards ﬁnding a con tinuous path from a state q init to a state q goal in the free conﬁguration space C free b y building a tree rooted at q init . A new state q rand is uniformly sampled at random from the conﬁguration space C . Then the nearest no de, q near , in the tree is located, and if q rand and the shortest path from q rand to q near are in C free , then q rand is added to the tree 5 (algorithm 1). The tree gro wth is stopp ed when a no de is found near q goal . T o sp eed up conv ergence, the search is usually biased to q goal with a small probability . In [KL00], t wo new features are added to RR T. First, the EXTEND function (algo- rithm 2) is in tro duced, whic h instead of trying to add q rand directly to the tree, mak es a motion to wards q rand and tests for collisions. Algorithm 1 BuildRR T( q init , q goal ) 1: T ← empt y tree 2: T . init( q init ) 3: while Distance( T , q goal ) > threshold do 4: q rand ← RandomConﬁg() 5: Extend( T , q rand ) 6: return T Algorithm 2 Extend( T , q ) 1: q near ← NearestNeigh b or( q , T ) 2: if NewConﬁg ( q , q near , q new ) then 3: T . add v ertex( q new ) 4: T . add edge( q near , q new ) 5: if q new = q then 6: return Reached 7: else 8: return Adv anced 9: return T rapp ed Then a greedier approac h is introduced (the CONNECT function, sho wn in algo- rithms 3 and 4), whic h repeats EXTEND until an obstacle is reac hed. This ensures that most of the time w e will b e adding states to the tree, instead of just rejecting new random states. The second extension is the use of tw o trees, ro oted at q init and q goal , whic h are grown tow ards eac h other (see ﬁgure 2.1). This signiﬁcantly decreases the time needed to ﬁnd a path. 6 Figure 2.1: RR T during execution Algorithm 3 RR TConnectPlanner( q init , q goal ) 1: T a ← tree ro oted at q init 2: T b ← tree ro oted at q goal 3: T a . init( q init ) 4: T b . init( q goal ) 5: for k = 1 to K do 6: q rand ← RandomConﬁg() 7: if not (Extend( T a , q rand ) = T rapp ed) then 8: if Connect( T b , q new ) = Reached then 9: return Path( T a , T b ) 10: Sw ap( T a , T b ) 11: return F ailure Algorithm 4 Connect( T , q ) 1: rep eat 2: S ← Extend( T , q ) 3: until ( S 6 = Adv anced) 7 2.2 Retraction-Based RR T Planner The Retraction-based RR T Planner presen ted in [ZM08] aims at impro ving the p er- formance of the standard oﬄine RR T in static en vironments with narro w passages. The basic idea of the Optimize( q r , q n ) function in algorithm 5 is to iterativ ely retract a randomly generated conﬁguration that is in C obs to the closest b oundary p oint in C free . So, instead of using the standard extension that tries to extend in a straigh t line from q near to q rand , it extends from q near to the closest p oint in C free to q rand . This giv es more samples in narro w passages. This technique could easily b e applied to on-line RR T planners. Algorithm 5 Retraction-based RR T Extension 1: q r ← a random conﬁguration in C space 2: q n ← the nearest neigh b or of q r in T 3: if CollisionF ree( q n , q r ) then 4: T . addV ertex( q r ) 5: T . addEdge( q n , q r ) 6: else 7: S ← Optimize( q r , q n ) 8: for all q i ∈ S do 9: Standard RR T Extension( T , q i ) 10: return T 2.3 Execution Extended RR T The Execution Extended RR T presented in [BV02] in tro duces tw o extensions to RR T to build an on-line planner, the w aypoint cac he and adaptive cost p enalt y search, whic h improv e re-planning eﬃciency and the quality of generated paths. ERR T uses a kd-tree (see section 2.7) to sp eed nearest neighbor lo ok-up, and do es not use bidirec- tional searc h. The w a yp oin t cac he is implemen ted b y k eeping a constan t size arra y of states, and whenever a plan is found, all the states in the plan are placed in the cac he with random replacemen t. Then, when the tree is no longer v alid, a new tree must b e 8 gro wn, and there are three possibilities for choosing a new target state, as sho wn in algorithm 6, whic h is used instead of RandomConﬁg() in previous algorithms. With probabilit y P[ go al ], the goal is chosen as the target; with probability P[ wayp oint ], a random w a yp oin t is chosen, and with the remaining probabilit y a uniform state is c hosen as b efore. In [BV02] the v alues used are P[ go al ]= 0 . 1 and P[ wayp oint ]= 0 . 6. Another extension is adaptive cost penalty searc h, where the planner adaptiv ely mo diﬁed a parameter to help it ﬁnd shorter paths. A v alue of 1 for b eta will alwa ys extend from the ro ot no de, while a v alue of 0 is equiv alen t to the original algorithm. Ho wev er, the pap er [BV02] lac ks implemen tation details and exp erimen tal results on this extension. Algorithm 6 Cho oseT arget( q , goal) 1: p ← UniformRandom(0 . 0 , 1 . 0) 2: i ← UniformRandom(0 . 0 , NumW a yP oints) 3: if 0 < p < GoalProb then 4: return q goal 5: else if GoalProb < p < GoalProb + W ayP oin tProb then 6: return W ayP ointCac he[ i ] 7: else if GoalProb + W ayP ointProb < p < 1 then 8: return RandomConﬁg() 2.4 Dynamic RR T The Dynamic Rapidly-Exploring Random T ree described in [FKS06] is a probabilistic analog to the widely used D* family of algorithms. It w orks by growing a tree from q goal to q init , as shown in algorithm 7. This has the adv an tage that the root of the tree do es not ha ve to be mo ved during the lifetime of the planning and execution. In some problem classes the rob ot has limited range sensors, th us mo ving or newly app earing obstacles will be near the robot, not near the goal. In general this strategy attempts to trim smaller branches that are farther a w ay from the ro ot. When new information concerning the conﬁguration space is received, the algorithm remo v es the newly-inv alid branc hes of the tree (algorithms 9 and 10), and gro ws the remaining tree, fo cusing, 9 with a certain probabilit y (empirically tuned to 0 . 4 in [FKS06]) to a vicinit y of the recen tly trimmed branc hes, b y using the w a yp oin t cache of the ERR T (algorithm 6). In exp eriments presen ted in [FKS06] DRR T v astly outp erforms ERR T. Algorithm 7 DRR T() 1: q robot ← the current rob ot p osition 2: T ← BuildRR T( q goal , q robot ) 3: while q robot 6 = q goal do 4: q next ← P arent( q robot ) 5: Mo ve from q robot to q next 6: for all obstacles that changed O do 7: In v alidateNo des( O ) 8: if Solution path contains an in v alid no de then 9: ReGro wRR T() Algorithm 8 ReGrowRR T() 1: T rimRR T() 2: GrowRR T() Algorithm 9 T rimRR T() 1: S ← ∅ , i ← 1 2: while i < T . size() do 3: q i ← T . no de( i ) 4: q p ← P arent( q i ) 5: if q p . ﬂag = INV ALID then 6: q i . ﬂag ← INV ALID 7: if q i . ﬂag 6 = INV ALID then 8: S ← S S { q i } 9: i ← i + 1 10: T ← CreateT reeF romNo des( S ) 10 Algorithm 10 Inv alidateNo des( obstacle ) 1: E ← FindAﬀectedEdges(obstacle) 2: for all e ∈ E do 3: q e ← ChildEndp oin tNo de( e ) 4: q e . ﬂag ← INV ALID 2.5 Multipartite RR T Multipartite RR T presen ted in [ZKB07] is another RR T v ariant which supp orts plan- ning in unkno wn or dynamic en vironmen ts. MP-RR T main tains a forest F of discon- nected sub-trees which lie in C free , but whic h are not connected to the ro ot no de q root of T , the main tree. A t the start of a given planning iteration, an y no des of T and F whic h are no longer v alid are deleted, and an y disconnected sub-trees whic h are created as a result are placed in to F (as seen in algorithms 11 and 12). With giv en probabil- ities, the algorithm tries to connect T to a new random state, to the goal state, or to the root of a tree in F (algorithm 13). In [ZKB07], a simple greedy smo othing heuristic is used, that tries to shorten paths by skipping in termediate no des. The MP-RR T is compared to an iterated RR T, ERR T and DRR T, in 2D, 3D and 4D problems, with and without smo othing. F or most of the experiments, MP-RR T mo destly outp erforms the other algorithms, but in the 4D case with smoothing, the p erformance gap in fav or of MP-RR T is muc h larger. The authors explained this fact due to MP-RR T b eing able to construct m uch more robust plans in the face of dynamic obstacle motion. Another algorithm that utilizes the concept of forests is Reconﬁgurable Random F orests (RRF) presen ted in [LS02], but without the success of MP-RR T. 2.6 Rapidly Exploring Ev olutionary T ree The Rapidly Exploring Evolutionary T ree, introduced in [MWS07] uses a bidirectional RR T and a kd-tree (see section 2.7) for eﬃcient nearest neigh b or searc h. The mo diﬁca- tions to the Extend() function are sho wn in algorithm 14. The re-balancing of a kd-tree is costly , and in this pap er a simple threshold on the n umber of nodes added b efore re-balancing was used. The authors suggest using the metho d describ ed in [AL02] and 11 Algorithm 11 MPRR TSearc h( q init ) 1: T ← the previous searc h tree, if an y 2: F ← the previous forest of disconnected sub-trees 3: q init ← the initial state 4: if T = ∅ then 5: q root ← q init 6: Insert( q root , T ) 7: else 8: PruneAndPrep end( T , F , q init ) 9: if T reeHasGoal( T ) then 10: return true 11: while searc h time/space remaining do 12: q new ← SelectSample( F ) 13: q near ← NearestNeigh b or( q new ,T ) 14: if q new ∈ F then 15: b connect ← Connect( q near , q new ) 16: if b connect and T reeHasGoal( T ) then 17: return true 18: else 19: b extend ← Extend( q near , q new ) 20: if b extend and IsGoal( q new ) then 21: return true 22: return false used in [BV02] to impro ve the searc h sp eed. The no velt y in this algorithm comes from the introduction of an evolutionary algorithm [BFM97] that builds a population of biases for the RR Ts. The genot yp e of the ev olutionary algorithm consists of a single rob ot conﬁguration for each tree. This conﬁguration is sampled instead of the uniform distribution. T o balance exploration and exploitation, the evolutionary algorithm was designed with 50% elitism. The ﬁtness function is related to the n umber of left and righ t branches tra v ersed during the insertion of a new no de in the kd-tree. The goal is to introduce a bias to the RR T algorithm which sho ws preference to nodes created 12 Algorithm 12 PruneAndPrep end( T , F , q init ) 1: for all q ∈ T , F do 2: if not No deV alid( q ) then 3: KillNo de( q ) 4: else if not ActionV alid( q ) then 5: SplitEdge( q ) 6: if not T = ∅ and q root 6 = q init then 7: if not ReRo ot( T , q init ) then 8: F ← F S T 9: T . init( q init ) Algorithm 13 SelectSample( F ) 1: p ← Random(0 , 1) 2: if p < p goal then 3: q new ← q goal 4: else if p < ( p goal + p forest ) and not Empt y ( F ) then 5: q new ← q ∈ SubT reeRo ots( F ) 6: else 7: q new ← RandomState() 8: return q new a wa y from the cen ter of the tree. The authors suggest com bining RET with DRR T or MP-RR T. 2.7 Multidimensional Binary Searc h T rees The kd-tree, ﬁrst in tro duced in [Ben75], is a binary tree in whic h ev ery node is a k-dimensional p oin t. Ev ery non-leaf no de generates a splitting h yp erplane that divides the space in to tw o subspaces. In the RR T algorithm, the num b er of p oin ts gro ws incre- men tally , un balancing the tree, thus slowing nearest-neighbor queries. Re-balancing a kd-tree is costly , so in [AL02] the authors present another approac h: A v ector of trees is constructed, where for n p oin ts there is a tree that contains 2 i p oin ts for each ”1” in 13 Algorithm 14 ExtendT oT arget( T ) 1: static p : p opulation, inc ← 1 2: p 0 : temp orary p opulation 3: if inc > length( p ) then 4: SortByFitness( p ) 5: p 0 ← n ull 6: for all i ∈ p do 7: if i is in upp er 50% then 8: AddIndividual( i, p 0 ) 9: else 10: i ← RandomState() 11: AddIndividual( i, p 0 ) 12: p ← p 0 13: inc ← 1 14: q r ← p (inc) 15: q near ← Nearest( T , q r ) 16: q new ← Extend( T , q near ) 17: if q new 6 = ∅ then 18: AddNo de( T , q new ) 19: AssignFitness( p (inc) , ﬁtness( q new ) 20: else 21: AssignFitness( p (inc) , 0) 22: return q new the i th place of the binary representation of n . As bits are cleared in the representation due to increasing n , the trees are deleted, and the points are included in a tree that corresp onds to the higher-order bit which is changed to ”1”. This general sc heme incurs in logarithmic-time o verhead, regardless of dimension. Exp eriment s sho w a substantial p erformance increase compared to a naiv e brute-force approac h. 14 2.8 Ev olutionary Planner/Na vigator An ev olutionary algorithm [BFM97] is a generic population-based meta-heuristic opti- mization algorithm. It is inspired in biological evolution, using methods such as indi- vidual selection, repro duction and m utation. The p opulation is comp osed of candidate solutions and they are ev aluated according to a ﬁtness function. The Evolutionary Planner/Na vigator presented in [XMZ96], [XMZT97], and [TX97] is an ev olutionary algorithm for path ﬁnding in dynamic en vironments. A high lev el description is shown in algorithm 15. A diﬀerence with RR T is that it can optimize the path according to any ﬁtness function deﬁned (length, smoothness, etc), without the need for a p ost-pro cessing step. Exp erimen tal tests ha v e shown it has go o d p er- formance for sparse maps, but no so m uc h for diﬃcult maps with narrow passages or to o crowded with obstacles. How ever, when a feasible path is found, it is very eﬃcient at optimizing it and adapting to the dynamic obstacles. Ev ery individual in the p op- ulation is a sequence of no des, representing no des in a path consisting of straigh t-line segmen ts. Each no de consists of an ( x, y ) pair and a state v ariable b with information ab out the feasibility of the p oin t and the path segmen t connecting it to the next p oint. Individuals hav e v ariable length. Since a path p can b e either feasible or unfeasible, tw o ev aluation functions are used. F or feasible paths (equation 2.1), the goal is to minimize distance trav eled, main tain a smo oth tra jectory and satisfy a clearance requiremen t (the rob ot should not approac h the obstacles to o closely). F or unfeasible paths, we use equation 2.2, taken from [Xia97], where µ is the num b er of intersections of a whole path with obstacles and η is the a verage num b er of intersections p er unfeasible segmen t. ev al f ( p ) = w d · dist( p ) + w s · smo oth( p ) + w c · clear( p ) (2.1) ev al u ( p ) = µ + η (2.2) EP/N uses eight diﬀeren t op erators, as shown in ﬁgure 2.2 (description taken from [XMZ96]): Cr ossover: Recom bines tw o (parent) paths in to t wo new paths. The paren t paths 15 Algorithm 15 EP/N 1: P ( t ): p opulation at generation t 2: t ← 0 3: Initialize( P ( t )) 4: Ev aluate( P ( t )) 5: while ( not termination-condition) do 6: t ← t + 1 7: Select op erator o j with probability p j 8: Select parent(s) from P ( t ) 9: Pro duce oﬀspring applying o j to selected paren t(s) 10: Ev aluate oﬀspring 11: Replace worst individual in P ( t ) b y new oﬀspring 12: Select b est individual p from P ( t ) 13: if F easible( p ) then 14: Mo ve along path p 15: Up date all individuals in P ( t ) with curren t p osition 16: if changes in en vironment then 17: Up date map 18: Ev aluate( P ( t )) 19: t ← t + 1 are divided randomly into tw o parts resp ectiv ely and recom bined: The ﬁrst part of the ﬁrst path with the second part of the second path, and the ﬁrst part of the second path with the second part of the ﬁrst path. Note that there can b e diﬀeren t n umbers of no des in the t wo paren t paths. Mutate 1: Used to ﬁne tune no de co ordinates in a feasible path for shap e adjustment. This op erator randomly adjusts no de co ordinates within some lo cal clearance of the path so that the path remains feasible afterw ards. Mutate 2: Used for large random changes of no de co ordinates in a path, which can b e either feasible or unfeasible. 16 Crossov er Delete Mutation 1 Swap Smooth Mutation 2 Insert Delete Repair Figure 2.2: The roles of the genetic op erators Insert-Delete: Operates on an unfeasible path b y inserting randomly generated new no des in to unfeasible path segmen ts and deleting unfeasible no des (i.e., path no des that are inside obstacles). Delete: Deletes no des from a path, whic h can b e either feasible or unfeasible. If the path is unfeasible, the deletion is done randomly . Otherwise, the op erator decides whether a no de should deﬁnitely b e deleted based on some heuristic knowledge, and if a no de is not deﬁnitely deletable, its deletion will b e random. Swap: Sw aps the co ordinates of randomly selected adjacen t no des in a path, which can b e either feasible or unfeasible. Smo oth: Smoothens turns of a feasible path by “cutting corners,” i.e., for a selected no de, the operator inserts tw o new no des on the t wo path segments connected to that no de respectively and deletes that selected no de. The nodes with sharp er turns are more likely to be selected. R ep air: Repairs a randomly selected unfeasible segmen t in a path b y “pulling” the 17 segmen t around its in tersecting obstacle. The probabilities of using each op erator is set randomly at the b eginning, and then are up dated according to the success ratio of each op erator, so more successful operators are used more often, and automatically c hosen according to the instance of the problem, eliminating the diﬃcult problem of hand tuning the probabilities. In [TX97], the authors include a memory buﬀer for each individual to store go o d paths from its ancestors, whic h ga ve a small p erformance gain. In [EAA04], the authors prop ose strategies for improving the stability and con- trolling p opulation div ersity for a simpliﬁed v ersion of the EP/N. An impro vemen t prop osed by the authors in [XMZT97] is using heuristics for the initial p opulation, instead of random initialization. W e will consider this impro vemen t in section 3.1. Other evolutionary algorithms hav e also b een prop osed for similar problems, in [NG04] a binary genetic algorithm is used for an oﬄine planner, and [NVTK03] presen ts an algorithm to generate curved tra jectories in 3D space for an unmanned aerial v ehicle. EP/N has b een adapted to an 8-connected grid mo del in [AR08] (with previous w ork in [AR05] and [Alf05]). The authors study t wo diﬀerent crosso ver op erators and four asexual operators. Exp erimen tal results for this new algorithm (EvP) in static unkno wn en vironments show that it is faster than EP/N. 18 Chapter 3 Prop osed T ec hniques 3.1 Com bining RR T and EP/N As men tioned in section 2, RR T v ariants pro duce suboptimal solutions, which must later be p ost-pro cessed for shortening, smo othing or other desired characteristics. On the other hand, EP/N, presen ted in section 2.8, can optimize a solution according to an y given ﬁtness function. How ev er, this algorithm is slo wer at ﬁnding a ﬁrst feasible solution. In this section w e prop ose a combined approach, that uses RR T to ﬁnd an initial solution to b e used as starting p oin t for EP/N, taking adv an tage of the strong p oin ts of b oth algorithms. 3.1.1 The Com bined Strategy Initial Solution EP/N as presented in section 2.8 can not ﬁnd feasible paths in a reasonable amount of time in any but very sparse maps. F or this reason, RR T will be used to generate a ﬁrst initial solution, ignoring the eﬀects pro duced b y dynamic ob jects. This solution will b e in the initial p opulation of the ev olutionary algorithm, along with random solutions. F easibilit y and Optimization EP/N is the resp onsible of regaining feasibility when it is lost due to a moving obstacle or a new obstacle found in a partially kno wn or totally unkno wn en vironment. If a 19 feasible solution can not b e found in a giv en amount of time, the algorithm is restarted, k eeping its old p opulation, but adding a new individual generated b y RR T. 3.1.2 Algorithm Implemen tation Algorithm 16 Main() 1: q robot ← is the curren t rob ot p osition 2: q goal ← is the goal p osition 3: while q robot 6 = q goal do 4: up dateW orld(time) 5: pro cessRR TEPN(time) The combined RR T-EP/N algorithm proposed here w orks b y alternating en viron- men t updates and path planning, as can b e seen in algorithm 16. The ﬁrst stage of the path planning (see algorithm 17) is to ﬁnd an initial path using a RR T technique, ignoring any cuts that might happ en during environmen t up dates. Thus, the RR T ensures that the path found do es not collide with static obstacles, but might collide with dynamic obstacles in the future. When a ﬁrst path is found, the na vigation is done by using the standard EP/N as sho wn in algorithm 15. 3.2 A Simple Multi-stage Probabilistic Algorithm In highly dynamic environmen ts, with man y (or a few but fast) relativ ely small mov- ing obstacles, regrowing trees are pruned to o fast, cutting a wa y important parts of the trees b efore they can be replaced. This dramatically reduces the p erformance of the algorithms, making them unsuitable for these classes of problems. W e b eliev e that b et- ter performance could be obtained b y sligh tly mo difying a RR T solution using simple obstacle-a voidance operations on the new colliding points of the path b y informed local searc h. The path could b e greedily optimized if the path has reached the feasibility condition. 20 Algorithm 17 pro cessRR TEPN( time ) 1: q robot ← the current rob ot p osition 2: q start ← the starting p osition 3: q goal ← the goal p osition 4: T init ← the tree ro oted at the rob ot p osition 5: T goal ← the tree ro oted at the goal position 6: path ← the path extracted from the merged RR Ts 7: q robot ← q start 8: T init . init( q robot ) 9: T goal . init( q goal ) 10: while time elapsed < time do 11: if First path not found then 12: RR T( T init , T goal ) 13: else 14: EP/N() 3.2.1 A Multi-stage Probabilistic Strategy If solving equation 1.1 is not a simple task in static environmen ts, solving dynamic v ersions turns out to b e even more diﬃcult. In dynamic path planning w e cannot w ait until reac hing the optimal solution b ecause we m ust deliver a “go od enough” plan within some time restriction. Th us, a heuristic approac h m ust b e dev elop ed to tac kle the on-line nature of the problem. The heuristic algorithms presen ted in sections 2.3, 2.4 and 2.5 extend a metho d developed for static en vironments, whic h pro duces p o or resp onse to highly dynamic en vironments and unw an ted complexity of the algorithms. W e prop ose a m ulti-stage combination of simple heuristic and probabilistic tec h- niques to solv e eac h part of the problem: F easibilit y , initial solution and optimization. F easibilit y The key p oin t in this problem is the hard constraint in equation 1.1 whic h m ust be met before even thinking ab out optimizing. The problem is that in highly dynamic en vironments a path turns rapidly from feasible to unfeasible — and the other w a y 21 around — even if our path do es not c hange. W e prop ose a simple informe d lo c al se ar ch to obtain paths in C free . The idea is to randomly searc h for a C free path b y modifying the nearest colliding segmen t of the path. As w e include in the searc h some kno wledge of the problem, the informe d term is coined to distinguish it from blind lo cal search. The details of the op erators used for the mo diﬁcation of the path are described in section 3.2.2. If a feasible solution can not b e found in a given amoun t of time, the algorithm is restarted, with a new starting p oin t generated b y a RR T v ariant. Initial Solution The problem with local searc h algorithms is that they repair a solution that it is assumed to be near the feasibilit y condition. T rying to pro duce feasible paths from scratc h with lo cal searc h (or even with evolutionary algorithms [XMZT97]) is not a go od idea due the randomness of the initial solution. Therefore, w e prop ose feeding the informed local searc h with a standar d RR T solution at the start of the planning, as can b e seen in ﬁgure 3.1. Optimization Without an optimization criterion, the path could gro w inﬁnitely large in time or size. Therefore, the ev al( · , · ) function m ust be minimized when a (temp orary) feasible path is obtained. A simple greedy tec hnique is used here: W e test eac h p oint in the solution to chec k if it can b e remov ed maintaining feasibilit y; if so, w e remov e it and chec k the follo wing p oin t, con tinuing un til reac hing the last one. 3.2.2 Algorithm Implemen tation Algorithm 18 Main() 1: q robot ← the current rob ot p osition 2: q goal ← the goal p osition 3: while q robot 6 = q goal do 4: up dateW orld(time) 5: pro cessMultiStage(time) 22 The m ulti-stage algorithm prop osed in this thesis w orks b y alternating en vironment up dates and path planning, as can b e seen in algorithm 18. The ﬁrst stage of the path planning (see algorithm 19) is to ﬁnd an initial path using a RR T tec hnique, ignoring an y cuts that migh t happ en during en vironmen t up dates. Th us, RR T ensures that the path found do es not collide with static obstacles, but migh t collide with dynamic obstacles in the future. When a ﬁrst path is found, the na vigation is done by alternating a simple informed lo cal searc h and a simple greedy heuristic as sho wn in ﬁgure 3.1. Algorithm 19 pro cessMultiStage(time) 1: q robot ← is the current rob ot p osition 2: q start ← is the starting p osition 3: q goal ← is the goal p osition 4: T init ← is the tree ro oted at the rob ot p osition 5: T goal ← is the tree ro oted at the goal position 6: path ← is the path extracted from the merged RR Ts 7: q robot ← q start 8: T init . init( q robot ) 9: T goal . init( q goal ) 10: while time elapsed < time do 11: if First path not found then 12: RR T( T init , T goal ) 13: else 14: if path is not collision free then 15: ﬁrstCol ← collision p oin t closest to rob ot 16: arc(path , ﬁrstCol) 17: m ut(path , ﬁrstCol) 18: p ostPro cess(path) The second stage is the informed lo cal searc h, whic h is a tw o step function composed b y the ar c and mutate op erators (algorithms 20 and 21). The ﬁrst one tries to build a square arc around an obstacle, b y inserting tw o new points b et w een t w o p oin ts in the path that form a segmen t colliding with an obstacle, as sho wn in ﬁgure 3.2. The second step in the function is a mutation op erator that mov es a p oin t close to an obstacle to 23 a random p oint in the vicinity , as explained graphically in ﬁgure 3.3. The m utation op erator is inspired b y the ones used in the Adaptiv e Evolutionary Planner/Navigator (EP/N) presen ted in [XMZT97], while the arc op erator is derived from the arc operator in the Ev olutionary Algorithm presented in [AR05]. Algorithm 20 arc(path , ﬁrstCol) 1: vicinity ← some vicinity size 2: randDev ← random( − vicinity , vicinity) 3: p oint1 ← path[ﬁrstCol] 4: p oint2 ← path[ﬁrstCol + 1] 5: if random()%2 then 6: newP oint1 ← (p oin t1[ X ] + randDev , p oin t1[ Y ]) 7: newP oint2 ← (p oin t2[ X ] + randDev , p oin t2[ Y ]) 8: else 9: newP oint1 ← (p oin t1[ X ] , p oin t1[ Y ] + randDev) 10: newP oint2 ← (p oin t2[ X ] , p oin t2[ Y ] + randDev) 11: if path segmen ts p oin t1-newP oint1-newP oint2-point2 are collision free then 12: Add new p oin ts betw een point1 and point2 13: else 14: Drop new p oin ts Algorithm 21 mut(path , ﬁrstCol) 1: vicinity ← some vicinity size 2: path[ﬁrstCol][X] + = random( − vicinit y , vicinity) 3: path[ﬁrstCol][Y] + = random( − vicinit y , vicinity) 4: if path segmen ts b efore and after path[ﬁrstCol] are collision free then 5: Accept new p oin t 6: else 7: Reject new p oin t The third and last stage is the greedy optimization heuristic, whic h can b e seen as a p ost-processing for path shortening, that eliminates in termediate no des if doing so do es not create collisions, as is described in the algorithm 22. 24 Algorithm 22 p ostPro cess( path ) 1: i ← 0 2: while i < path . size() − 2 do 3: if segment path[ i ] to path[ i + 2] is collision free then 4: Delete path[i+1] 5: else 6: i ← i + 1 Figure 3.1: A Multi-stage Strategy for Dynamic P ath Planning . This ﬁgure describ es the life-cycle of the m ulti-stage algorithm presen ted here. The RR T, informed lo cal search, and greedy heuristic are combined to pro duce a cheap solution to the dynamic path planning problem. 25 Figure 3.2: The arc op erator . This op erator draws an oﬀset v alue ∆ o ver a ﬁxed in terv al called vicinit y . Then, one of the t wo axes is selected to p erform the arc and t wo new consecutive p oints are added to the path. n 1 is placed at a ± ∆ of the p oin t b and n 2 at ± ∆ of p oin t c , b oth of them ov er the same selected axis. The axis, sign and v alue of ∆ are chosen randomly from an uniform distribution. Figure 3.3: The m utation op erator . This op erator dra ws t wo oﬀset v alues ∆ x and ∆ y o ver a vicinity region. Then the same p oin t b is mo v ed in b oth axes from b = [ b x , b y ] to b 0 = [ b x ± ∆ x , b y ± ∆ y ], where the sign and oﬀset v alues are c hosen randomly from an uniform distribution. 26 Chapter 4 Exp erimen tal Setup and Results 4.1 Exp erimen tal Setup Although the algorithms dev elop ed in this thesis are aimed at dynamic environmen ts, for the sak e of completeness they will also be compared in partially known environmen ts and in totally unkno wn en vironments, where some or all of the obstacles b ecome visible to the planner as the rob ot approaches each one of them, sim ulating a rob ot with limited sensor range. 4.1.1 Dynamic En vironmen t The ﬁrst environmen t for our exp erimen ts consists on t wo maps with 30 mo ving obsta- cles the same size of the rob ot, with a random sp eed b et ween 10% and 55% the sp eed of the rob ot. Go o d p erformance in this environmen t is the main focus of this thesis. This dynamic envir onments are illustrated in ﬁgures 4.1 and 4.2. 4.1.2 P artially Kno wn En vironmen t The second en vironment uses the same maps, but with a few obstacles, three to four times the size of the rob ot, that b ecome visible when the rob ot approac hes eac h one of them. This is the kind of environmen t that most dynamic RR T v arian ts w ere designed for. The p artial ly known envir onments are illustrated in ﬁgure 4.3 and 4.4. 27 Figure 4.1: The dynamic en vironment, map 1. The gr e en square is our robot, currently at the start p osition. The blue squares are the mo ving obstacles. The blue cross is the goal. 4.1.3 Unkno wn En vironmen t F or completeness sake, we will compare the diﬀerent tec hnique in a third en vironment, w ere we use one of the maps presen ted b efore, but all the obstacles will initially b e unkno wn to the planners, and will become visible as the robot approac hes them, forcing sev eral re-plans. This unknown envir onment is illustrated in ﬁgure 4.5. 4.2 Implemen tation Details The algorithms where implemented in C++ using the MoP a framew ork 1 partly dev el- op ed by the author. This framework features exact collision detection, three diﬀeren t map formats (including .pbm images from any graphic editor), dynamic, unknown and partially known en vironments and supp ort for easily adding new planners. One of the biggest do wnsides is that it only supp orts rectangular ob jects, so sev eral ob jects 1 MoP a homepage: https://csrg.inf.utfsm.cl/twiki4/bin/view/CSRG/MoPa 28 Figure 4.2: The dynamic en vironment, map 2. The gr e en square is our robot, currently at the start p osition. The blue squares are the mo ving obstacles. The blue cross is the goal. m ust b e used to represent other geometrical shap es, as in ﬁgure 4.4, composed of 1588 rectangular ob jects. There are several v ariations that can b e found in the literature when implemen ting RR T. F or all our RR T v ariants, the following are the details on where w e departed from the basics: 1. W e alw a ys use t wo trees rooted at q init and q g oal . 2. Our EXTEND function, if the p oin t cannot b e added without collisions to a tree, adds the mid p oin t b et ween the nearest tree no de and the nearest collision p oint to it. 3. In each iteration, w e try to add the new randomly generated point to b oth trees, and if successful in b oth, the trees are merged, as proposed in [KL00]. 4. W e believe that there migh t b e signiﬁcant p erformance diﬀerences b et ween al- lo wing or not allo wing the rob ot to adv ance to wards the node nearest to the goal 29 Figure 4.3: The partially kno wn environmen t, map 1. The gr e en square is our rob ot, curren tly at the start p osition. The yel low squares are the suddenly appearing obsta- cles. The blue cross is the goal. when the trees are disconnected, as prop osed in [ZKB07]. In p oint 4 ab ov e, the problem is that the rob ot w ould b ecome stuck if it enters a small conca ve zone of the environmen t (like a ro om in a building) while there are mo ving obstacles inside that zone, but otherwise it can lead to b etter p erformance. Therefore w e present results for b oth kinds of behavior: DRR T-adv and MP-RR T-adv mov e even when the trees are disconnected, while DRR T-noadv and MP-RR T-noadv only mov e when the trees are connected. In MP-RR T, the forest was handled b y simply replacing the oldest tree in it if the forest had reac hed the maximum allow ed size. Concerning the parameter selection, the probability for selecting a p oin t in the vicin- it y of a point in the w aypoint cache in DRR T was set to 0.4 as suggested in [FKS06]. The probability for trying to reuse a subtree in MP-RR T w as set to 0.1 as suggested in [ZKB07]. Also, the forest size w as set to 25 and the minim um size of a tree to b e sa ved in the forest w as set to 5 no des. 30 Figure 4.4: The partially kno wn environmen t, map 2. The gr e en square is our rob ot, curren tly at the start p osition. The yel low squares are the suddenly appearing obsta- cles. The blue cross is the goal. F or the com bined RR T-EP/N, it was considered the planner was stuc k after tw o sec- onds without a feasible solution in the p opulation, at which p oin t a new solution from a RR T v arian t is inserted into the p opulation. F or the simple m ulti-stage probabilis- tic algorithm, the restart is made after one second of encountering the same obstacle along the planned path. This second approach, which seems better, cannot b e applied to the RR T-EP/N, b ecause there is no single path to c heck for collisions, but instead a p opulation of paths. The restart times where man ually tuned. 4.3 Results The three algorithms w ere run a hundred times in each environmen t and map com- bination. The cutoﬀ time w as ﬁve minutes for all tests, after which the rob ot w as considered not to hav e reac hed the goal. Results are presented concerning: 31 Figure 4.5: The unknown environmen t. The gr e en square is our rob ot, currently at the start position. The blue cross is the goal. None of the obstacles is visible initially to the planners • Suc c ess r ate (S.R.): The p ercen tage of times the rob ot arriv ed at the goal, b efore reac hing the ﬁve min utes cutoﬀ time. This do es not accoun t for collisions or time the rob ot w as stopped w aiting for a plan. • Numb er of ne ar est neighb or lo okups p erforme d by e ach algorithm (N.N.): One of the p ossible bottlenecks for tree-based algorithms • Numb er of c ol lision che cks p erforme d (C.C.), which in our sp eciﬁc implementa- tion takes a signiﬁcan t percentage of the running time • Time it to ok the rob ot to reac h the goal, ± the standard deviation. 4.3.1 Dynamic En vironmen t Results The results in tables 4.1 and 4.2 sho w that the m ulti-stage algorithm tak es considerably less time than the DRR T and MP-RR T to reac h the goal, with far less collision chec ks. The com bined RR T-EP/N is a close second. It w as exp ected that nearest neigh b or 32 lo okups w ould b e m uc h lo wer in b oth combined algorithms than in the RR T v ariants, b ecause they are only p erformed in the initial phase and restarts, not during navigation. The com bined algorithms pro duce more consistent results within a map, as sho wn by their smaller standard deviations, but also across diﬀeren t maps. An interesting fact is that in map 1 DRR T is sligh tly faster than MP-RR T, and in map 2 MP-RR T is faster than DRR T. Ho wev er the diﬀerences are to o small to draw an y conclusions. Figures 4.6 and 4.7 sho w the times and success rates of the diﬀerent algorithms, when c hanging the num b er of dynamic obstacles in map 1. The simple m ulti-stage algorithm and the mixed RR T-EP/N clearly sho w the b est p erformance, while the DRR T-adv and MP-RR T-adv signiﬁcantly reduce their success rate when confronted to more than 30 moving obstacles. T able 4.1: Dynamic En vironment Results, map 1. Algorithm S.R.[%] C.C. N.N. Time[s] Multi-stage 99 23502 1122 6.62 ± 0.7 RR T-EP/N 100 58870 1971 10.34 ± 14.15 DRR T-noadv 100 91644 4609 20.57 ± 20.91 DRR T-adv 98 107225 5961 23.72 ± 34.33 MP-RR T-noadv 100 97228 4563 22.18 ± 14.71 MP-RR T-adv 94 118799 6223 26.86 ± 41.78 T able 4.2: Dynamic En vironment Results, map 2. Algorithm S.R.[%] C.C. N.N. Time[s] Multi-stage 100 10318 563 8.05 ± 1.47 RR T-EP/N 100 21785 1849 12.69 ± 5.75 DRR T-noadv 99 134091 4134 69.32 ± 49.47 DRR T-adv 100 34051 2090 18.94 ± 17.64 MP-RR T-noadv 100 122964 4811 67.26 ± 42.45 MP-RR T-adv 100 25837 2138 16.34 ± 13.92 33 Figure 4.6: Times for diﬀerent num b er of moving obstacles in map 1. 4.3.2 P artially Kno wn En vironmen t Results T aking b oth maps into consideration, the results in tables 4.3 and 4.4 show that b oth com bined algorithms are faster and more consisten t than the RR T v ariants, with the simple multi-stage algorithm being faster in b oth. These results were unexp ected, as the com bined algo rithms w ere designed for dynamic environmen ts. It is w orth to notice though, that in map 1 DRR T-adv is a close second, but in map 2 it is a close last, so its lac k of reliability do es not mak e it a go o d choice in this scenario. In this en vironment, as in the dynamic environmen t, in map 1 DRR T is faster than MP-RR T, while the opp osite happ ens in map 2. 4.3.3 Unkno wn En vironmen t Results Results in table 4.5 present the combined RR T-EP/N clearly as the faster algorithm in unkno wn environmen ts, with the multi-stage algorithm in second place. In con trast to 34 Figure 4.7: Success rate for diﬀeren t n umber of moving obstacles in map 1. T able 4.3: Partially Known En vironment Results, map 1. Algorithm S.R.[%] C.C. N.N. Time[s] Multi-stage 100 12204 1225 7.96 ± 2.93 RR T-EP/N 99 99076 1425 9.95 ± 2.03 DRR T-noadv 100 37618 1212 11.66 ± 15.39 DRR T-adv 99 12131 967 8.26 ± 2.5 MP-RR T-noadv 99 49156 1336 13.82 ± 17.96 MP-RR T-adv 97 26565 1117 11.12 ± 14.55 dynamic and partially known en vironments in this same map, MP-RR T is faster than DRR T. 35 T able 4.4: Partially Known En vironment Results, map 2. Algorithm S.R.[%] C.C. N.N. Time[s] Multi-stage 100 12388 1613 17.66 ± 4.91 RR T-EP/N 100 42845 1632 22.01 ± 6.65 DRR T-noadv 99 54159 1281 32.67 ± 15.25 DRR T-adv 100 53180 1612 32.54 ± 19.81 MP-RR T-noadv 100 48289 1607 30.64 ± 13.97 MP-RR T-adv 100 38901 1704 25.71 ± 12.56 T able 4.5: Unknown Environmen t Results Algorithm S.R.[%] C.C. N.N. Time[s] Multi-stage 100 114987 2960 13.97 ± 3.94 RR T-EP/N 100 260688 2213 10.69 ± 2.08 DRR T-noadv 98 89743 1943 18.38 ± 22.01 DRR T-adv 100 104601 2161 19.64 ± 34.87 MP-RR T-noadv 99 129785 1906 21.82 ± 27.23 MP-RR T-adv 100 52426 1760 16.05 ± 10.87 36 Chapter 5 Conclusions and F uture W ork The new multi-stage algorithm proposed here has goo d p erformance in very dynamic en vironments. It b eha v es particularly well when several small obstacles are mo ving around at random. This is explained by the fact that if the obstacles are constan tly mo ving, they will sometimes mov e out of the w ay b y themselv es, whic h our algorithm tak es adv antage of, while RR T based ones do not, they just drop branches of the tree that could pro v e useful again just a few momen ts later. The com bined RR T-EP/N, although ha ving more op erators, and automatic adjustment of the operator probabil- ities according to their eﬀectiveness, is still better than the RR T v ariants, but ab out 55% slo wer than the simple m ulti-stage algorithm. This is explained b y the n um b er of collision chec ks p erformed, more than twice than the m ulti-stage algorithm, b ecause collision chec ks m ust be p erformed for the entire p opulation, not just a single path. In the partially kno wn en vironment, ev en though the diﬀerence in collision c hecks is ev en greater than in dynamic en vironments, the RR T-EP/N p erformance is ab out 25% worse than the multi-stage algorithm. Ove rall, the RR T v arian ts are closer to the p erformance of both com bined algorithms. In the totally unkno wn environmen t, the com bined RR T-EP/N is about 30% faster than the simple m ulti-stage algorithm, and b oth outp erform the RR T v ariants, with m uch smaller times and standard deviations. All things considered, the simple multi-stage algorithm is the b est c hoice in most situations, with faster and more predictable planning times, a higher success rate, few er collision c hec ks performed and, abov e all, a m uch simpler implementation than all the 37 other algorithms compared. This thesis sho ws that a m ulti-stage approach, using diﬀerent techniques for initial plannning and navigation, outperforms curren t probabilistic sampling tec hniques in dynamic, partially kno wn and unknown envi ronmen ts. P art of the results presented in this thesis are published in [BALS09]. 5.1 F uture W ork W e prop ose sev eral areas of impro vemen t for the w ork presen ted in this thesis. 5.1.1 Algorithms The most promising area of improv ement seems to b e to exp erimen t with diﬀerent on-line planners such as a version of the EvP ([AR05] and [AR08]) mo diﬁed to work in contin uous conﬁguration space or a p oten tial ﬁeld navigator. Also, the lo cal search presen ted here could b eneﬁt from the use of more sophisticated operators and the parameters for the RR T v arian ts (suc h as forest size for MP-RR T), and the EP/N (suc h as p opulation size) could beneﬁt from b eing tuned sp eciﬁcally for this implemen tation, and not simply reusing the parameters found in previous work. Another area of researc h that could b e tac kled is extending this algorithm to higher dimensional problems, as RR T v ariants are known to w ork w ell in higher dimensions. Finally , as RR T v arian ts are suitable for kino dynamic planning, w e only need to adapt the on-line stage of the algorithm to hav e a new m ulti-stage planner for problems with kino dynamic constrain ts. 5.1.2 F ramew ork The MoP a framework could beneﬁt from the integration of a third part y logic lay er, with support for arbitrary geometrical shap es, a spatial scene graph and hierarchical maps. Some candidates w ould b e OgreODE [Ogr], Spring R TS [Spr] and OR TS [OR T]. Other possible impro v ements are adding support for other map formats, including discrimination of static and moving obstacles, limited sensor range simulation and 38 in tegration with external hardw are suc h as the Lego NXT [Leg], to run exp eriments in a more realistic scenario. 39 Bibliograph y [AL02] A. A tramento v and S.M. LaV alle. Eﬃcien t nearest neighbor searching for motion planning. In Pr o c e e dings of the IEEE International Confer enc e on R ob otics and A utomation , v olume 1, pages 632–637 vol.1, 2002. [Alf05] T. Alfaro. Un algoritmo evolutiv o para la resoluci´ on del problema de plan- iﬁcaci´ on de rutas de un rob ot m´ ovil. Master’s thesis, Departamen to de Inform´ atica, Univ ersidad T ´ ecnica F ederico San ta Mar ´ ıa, June 2005. [AR05] T. Alfaro and M. Riﬀ. An on-the-ﬂy ev olutionary algorithm for rob ot motion planning. L e ctur e Notes in Computer Scienc e , 3637:119, 2005. [AR08] T. Alfaro and M. Riﬀ. An ev olutionary na vigator for autonomous agen ts on unkno wn large-scale environmen ts. Intel ligent Automation and Soft Com- puting , 14(1):105, 2008. [BALS09] N.A. Barriga, M. Ara ya-Lopez, and M. Solar. Combining a probabilis- tic sampling technique and simple heuristics to solve the dynamic path planning problem. In Pr o c e e dings XXVIII International Confer enc e of the Chile an Computing Scienc e So ciety (SCCC) , 2009. [Ben75] J.L. Ben tley . Multidimensional binary searc h trees used for asso ciativ e searc hing. Communic ations of the A CM , 18(9):517, 1975. [BFM97] T. B¨ ac k, DB F ogel, and Z. Michalewicz. Handb o ok of Evolutionary Com- putation . T aylor & F rancis, 1997. [BV02] J. Bruce and M. V eloso. Real-time randomized path planning for rob ot na vigation. In Pr o c e e dings of the IEEE/RSJ International Confer enc e on Intel ligent R ob ots and Systems , v olume 3, pages 2383–2388 v ol.3, 2002. [EAA04] A. Elshamli, HA Ab dullah, and S. Areibi. Genetic algorithm for dynamic path planning. In Pr o c e e dings of the Canadian Confer enc e on Ele ctric al and Computer Engine ering , v olume 2, 2004. [FKS06] D. F erguson, N. Kalra, and A. Stentz. Replanning with RR Ts. In Pr o c e e d- ings of the IEEE International Confer enc e on R ob otics and Automation , pages 1243–1248, 15-19, 2006. 40 [HA92] Y ong K. Hw ang and Narendra Ah uja. Gross motion planning — a survey . A CM Computing Surveys , 24(3):219–291, 1992. [KL00] Jr. Kuﬀner, J.J. and S.M. LaV alle. RR T-connect: An eﬃcien t approach to single-query path planning. In Pr o c e e dings of the IEEE International Confer enc e on R ob otics and Automation , volume 2, pages 995–1001 v ol.2, 2000. [KSLO96] L.E. Ka vraki, P . Sv estk a, J.-C. Latombe, and M.H. Ov ermars. Probabilistic roadmaps for path planning in high-dimensional conﬁguration spaces. IEEE T r ansactions on R ob otics and Automation , 12(4):566–580, August 1996. [La v98] S.M. Lav alle. Rapidly-Exploring Random T rees: A new to ol for path plan- ning. T ec hnical rep ort, Computer Science Departmen t, Iow a State Univ er- sit y , 1998. [Leg] Lego Mindstorms. http://mindstorms.lego.com/ . [LKJ99] S.M. LaV alle and J.J. Kuﬀner Jr. Randomized kino dynamic planning. In Pr o c e e dings of the IEEE International Confer enc e on R ob otics and A utoma- tion , volume 1, 1999. [LS02] Tsai-Y en Li and Y ang-Chuan Shie. An incremen tal learning approac h to motion planning with roadmap managemen t. In Pr o c e e dings of the IEEE International Confer enc e on R ob otics and A utomation , volume 4, pages 3411–3416 vol.4, 2002. [MWS07] S.R. Martin, S.E. W right, and J.W. Sheppard. Oﬄine and online ev olu- tionary bi-directional RR T algorithms for eﬃcient re-planning in dynamic en vironments. In Pr o c e e dings of the IEEE International Confer enc e on A utomation Scienc e and Engine ering , pages 1131–1136, September 2007. [NG04] G. Nagib and W. Gharieb. P ath planning for a mobile rob ot using genetic algorithms. In Pr o c e e dings of the International Confer enc e on Ele ctric al, Ele ctr onic and Computer Engine ering , pages 185–189, 2004. [NVTK03] I.K. Nik olos, K.P . V ala v anis, N.C. Tsourveloudis, and A.N. Kostaras. Ev o- lutionary algorithm based oﬄine/online path planner for UA V navigation. IEEE T r ansactions on Systems, Man, and Cyb ernetics, Part B , 33(6):898– 912, December 2003. [Ogr] OgreODE. http://www.ogre3d.org/wiki/index.php/OgreODE . [OR T] OR TS – A free softw are R TS game engine. http://www.cs.ualberta.ca/~mburo/orts/ . [Spr] The Spring Pro ject. http://springrts.com/ . 41 [Ste94] A. Sten tz. Optimal and eﬃcient path planning for partially-kno wn environ- men ts. In Pr o c e e dings of the IEEE International Confer enc e on R ob otics and Automation , pages 3310–3317, 1994. [Ste95] A. Sten tz. The fo cussed D* algorithm for real-time replanning. In In- ternational Joint Confer enc e on Artiﬁcial Intel ligenc e , volume 14, pages 1652–1659. LA WRENCE ERLBAUM ASSOCIA TES L TD, 1995. [TX97] K.M. T ro jano wski and Z.J. Xiao. Adding memory to the Evolutionary Planner/Na vigator. In Pr o c e e dings of the IEEE International Confer enc e on Evolutionary Computation , pages 483–487, 1997. [Xia97] J. Xiao. Handb o ok of Evolutionary Computation , c hapter G3.6 The Evolu- tionary Planner/Navigator in a Mobile Rob ot En vironmen t. IOP Publish- ing Ltd., Bristol, UK, UK, 1997. [XMZ96] J. Xiao, Z. Mic halewicz, and L. Zhang. Ev olutionary Planner/Na vigator: Op erator p erformance and self-tuning. In International Confer enc e on Evo- lutionary Computation , pages 366–371, 1996. [XMZT97] J. Xiao, Z. Michalewicz, L. Zhang, and K. T ro jano wski. Adaptiv e Evo- lutionary Planner/Na vigator for mobile rob ots. Pr o c e e dings of the IEEE T r ansactions on Evolutionary Computation , 1(1):18–28, April 1997. [ZKB07] M. Zuck er, J. Kuﬀner, and M. Branic ky . Multipartite RR Ts for rapid replanning in dynamic en vironments. In Pr o c e e dings of the IEEE Inter- national Confer enc e on R ob otics and Automation , pages 1603–1609, April 2007. [ZM08] Liang jun Zhang and D. Mano c ha. An eﬃcien t retraction-based RR T plan- ner. In Pr o c e e dings of the IEEE International Confer enc e on R ob otics and A utomation , pages 3743–3750, Ma y 2008. 42

Single-Agent On-line Path Planning in Continuous, Unpredictable and Highly Dynamic Environments

Original Paper

Comments & Academic Discussion

Leave a Comment

Original Paper

Related Papers

Comments & Academic Discussion

Leave a Comment