From Drinking Philosophers to Asynchronous Path-Following Robots

Yunus Emre Sahin, Necmiye Ozay

Department of Electrical Engineering and Computer Science, University of Michigan, Ann Arbor, MI

Abstract

In this paper, we consider the multi-robot path execution problem where a group of robots move on predefined paths from their initial to target positions while avoiding collisions and deadlocks in the face of asynchrony. We first show that this problem can be reformulated as a distributed resource allocation problem and, in particular, as an instance of the well-known Drinking Philosophers Problem (DrPP). By careful construction of the drinking sessions capturing shared resources, we show that any existing solution to DrPP can be used to design robot control policies that are collectively collision- and deadlock-free. We then propose modifications to an existing DrPP algorithm to allow more concurrent behavior, and provide conditions under which our method is deadlock-free. Our method does not require robots to know or to estimate the speed profiles of other robots and results in distributed control policies. We demonstrate the efficacy of our method on simulation examples, which show competitive performance against the state-of-the-art.

Keywords: Multi-agent systems; Autonomous mobile robots; Concurrent systems; Deadlock avoidance.

1 Introduction

Multi-robot path planning (MRPP) has been one of the fundamental problems studied by the artificial intelligence and robotics communities. Quickly finding paths that take each robot from its initial location to its target location, and ensuring that robots execute these paths safely, has applications in many areas, from evacuation planning [15] to warehouse robotics [28], and from formation control [25] to coverage [17].
There are several challenges in multi-robot path planning, such as scalability, optimality, trading off centralized versus distributed decisions and the corresponding communication loads, and potential asynchrony. Planning optimal collision-free paths is known to be hard [24] even in synchronous centralized settings. Recently developed heuristics aim to address the scalability challenge when optimality is a concern [29]. Arguably, the problem gets even harder when there is non-determinism in the robot motions. One source of non-determinism is asynchrony, that is, the robots can move on their individual paths with different and time-varying speeds, and their speed profiles are not known a priori. This might happen due to many factors such as low battery levels, calibration errors, or waiting to give way to humans in the workspace. The goal of this paper is, given a collection of paths, one for each robot, to devise a distributed protocol so that the robots are guaranteed to reach their targets in the face of asynchrony and avoid all collisions along the way. We call this the Multi-Robot Path Execution (MRPE) problem. We study this problem at a discrete level, where the execution policies we generate can be used with any suitably abstracted continuous dynamics and low-level control policies that allow stopping the robot when needed. Moreover, our execution policies can interface with higher-level decision-making modules that lead to asynchrony by temporarily stopping the individual robots for emergencies, maintenance, or other reasons. In addition to MRPE, our execution policies can also be relevant for other applications such as transportation networks where vehicles travel on fixed tracks or manufacturing processes where workpieces follow complex conveyor networks.

Email addresses: ysahin@umich.edu (Yunus Emre Sahin), necmiye@umich.edu (Necmiye Ozay).
The key insight of the paper is to recast the MRPE problem as a drinking philosophers problem (DrPP) [3], an extension of the well-known dining philosophers problem [7]. DrPP is a resource allocation problem for distributed and concurrent systems. By partitioning the workspace into a set of discrete cells and treating each cell as a shared resource, we show how to construct drinking sessions such that the MRPE problem can be solved using any DrPP algorithm. Existing DrPP algorithms, such as [3, 10, 27], can be implemented in a distributed manner and enjoy properties such as fairness and deadlock-freeness, while also guaranteeing collision avoidance when applied to multi-robot planning. To allow more concurrent behavior and to improve the overall performance, we further modify [10], and derive conditions on the collection of paths such that collisions and deadlocks are guaranteed to be avoided.

The rest of the paper is organized as follows. Section 2 briefly presents related work. Section 3 formally defines the MRPE problem we are interested in solving. A brief summary of the DrPP and an existing solution is presented in Section 4. Section 5 recasts the MRPE problem as a DrPP, and shows that existing methods can be used to solve MRPE problems. Furthermore, this section provides modifications to [10] that allow more concurrent behavior. Section 6 shows that, when fed the same paths, our algorithm achieves competitive results with the state-of-the-art [16]. Section 7 concludes the paper.

2 Related Work

Multi-robot path planning deals with the problem of planning a collection of paths that take a set of robots from their initial positions to their goal locations without collisions. In this paper, we consider the type of problems where the workspace is partitioned into a set of discrete cells, each of which can hold at most one robot, and time is discretized.
Most of the research in this field has focused on finding optimal or suboptimal paths that minimize either the makespan (last arrival time) or the flowtime (sum of all arrival times) [9, 23, 29]. These methods require the duration of each action to be fixed to show optimality. In real life, however, robots cannot execute their paths perfectly. They might move slower or faster than intended due to various factors, such as low battery levels, calibration errors and other failures. Methods that deal with such uncertainties, which might lead to collisions or deadlocks if not handled properly, can be divided into two main groups.

In the first group, robots are allowed to replan their paths at run-time [18, 22, 26]. In this case, simpler path planning algorithms can be used, leaving the burden of collision avoidance to the run-time controllers. However, this approach might lead to deadlocks in densely crowded environments. Moreover, when the specifications are complex, changing paths might even lead to violations of the specifications. Therefore, replanning paths at run-time is not always feasible.

Alternatively, collisions and deadlocks can be avoided without needing to replan at run-time [6, 16, 19, 20, 21, 30, 31]. For instance, if the synchronization errors can be bounded, [6] and [21] show how to synthesize paths that are collision- and deadlock-free. This is achieved by overestimating the positions of robots and treating them as moving obstacles. However, this is a conservative approach, as the burden of collision and deadlock avoidance is moved to the offline planning part.

Alternatively, an execution policy can be used to decide when to move, slow down or stop robots on their predefined paths to avoid collisions and deadlocks. Given a collection of paths that are collision- and deadlock-free under perfect synchronization, [13] provides a method for robust execution under synchronization errors.
This work focuses on how to compute the temporal dependencies between robots and enforce a fixed ordering between each robot pair for all possible conflicts such that collisions and deadlocks are avoided. The same idea is used in [16], named Minimal Communication Policies (MCP), but the focus is on the planning phase. A delay probability for each agent is used in the planning phase to optimize the expected makespan. This method is later expanded to lifelong missions in warehouse environments [12]. Such a fixed ordering prevents collisions and deadlocks; however, it is limiting, as the performance of the multi-robot system depends highly on the exact ordering. If one of the robots experiences a failure at run-time and starts moving slowly, it might become the bottleneck of the whole system. In fact, we demonstrate the effects of such a scenario on the system performance and provide numerical results that show the robustness of our method.

When the collection of paths is known a priori, one can also find all possible collision and deadlock configurations, and prevent the system from reaching those. For instance, the distributed methods in [30] and [31] find deadlock configurations by abstracting robot paths into an edge-colored directed graph. However, this abstraction step might be conservative. Imagine a long passage which is not wide enough to fit more than one robot, and two robots crossing this passage in the same direction. The entire passage would be abstracted as a single node, and even though robots can enter the passage at the same time and follow each other safely, they would not be allowed to do so. Instead, each robot has to wait for the other to clear the entire passage before entering. Moreover, [31] requires that no two nodes in the graph are connected by two or more different colored edges.
This strong restriction limits the method's applicability to classical multi-robot path execution problems where robots move on a graph and the same two nodes might be connected with multiple edges in each direction.

As connectivity and autonomous capabilities of vehicles improve, cooperative intersection management problems draw significant attention from researchers [1, 2, 8, 32]. These problems are similar to the MRPE problem, as both require coordinating multiple vehicles to prevent collisions and deadlocks. Compared to traditional traffic light-based methods, cooperative intersection management methods offer improved safety, increased traffic flow and lower emissions. We refer the reader to [4] for a recent survey on this topic and the main solution approaches. Although the two problems seem similar, the setting of intersection management problems is tailored specifically for the existing road networks, and thus cannot be easily generalized to MRPE problems where robots/vehicles might be moving in non-structured environments.

Our method is based on reformulating the MRPE problem as a resource allocation problem. There are similar methods such as [19], which requires a centralized controller, and [20], which needs cells to be large enough to allow collision-free travel of up to two vehicles, instead of only one. We base our method on the well-known drinking philosophers algorithm. We show that any existing DrPP solution can be used to solve the MRPE problem if drinking sessions are constructed carefully. However, such methods require strong conditions on a collection of paths to hold, and limit the amount of concurrency in the system. To relax the conditions and to improve the performance, we provide a novel approach by taking the special structure of MRPE problems into account. We show that our method is less conservative than the naive approach, and provide numerical results to confirm the theoretical findings.
Our approach leads to control policies that can be deployed in a distributed form.

3 Problem Definition

We start by providing definitions that are used in the rest of the paper and formally state the problem we are interested in solving. Let a set $R = \{r_1, \ldots, r_N\}$ of robots share a workspace that is partitioned into a set $V$ of discrete cells. Two robots are said to be in collision if they occupy the same cell at the same time. An ordered sequence $\pi = \{\pi^0, \pi^1, \ldots\}$ of cells, where each $\pi^t \in V$, is called a path. The path segment $\{\pi^t, \pi^{t+1}, \ldots, \pi^{t'}\}$ is denoted by $\pi^{t:t'}$, and we write $v \in \pi$ if there exists a $t$ such that $\pi^t = v$. We assume that a path is given for each robot, and $\pi_n$ denotes the path associated with $r_n$. We allow paths to contain loops but require them to be finite. We use $\pi_n^{end}$ and $curr(r_n)$ to denote the final cell of $\pi_n$ and the number of successful transitions completed by $r_n$, respectively. We also define $next(r_n) \doteq curr(r_n) + 1$.

The motion of each robot is governed by a control policy, which issues one of two commands at every time step: (1) STOP and (2) GO. The STOP action forces a robot to stay in its current cell. If the GO action is chosen, the robot starts moving. The robot might or might not reach the next cell within one time step; however, we assume that a robot eventually progresses if the GO action is chosen constantly. This non-determinism models the uncertainties in the environment, such as battery levels or noisy sensors/actuators, which might lead to robots moving faster or slower than intended. It can also be due to a higher-level decision maker that forces the robot to stay put, for instance as an emergency stop to give way to humans in the workspace. We now formally define the problem we are interested in solving:

Problem 1. Given a collection $\Pi = \{\pi_1, \ldots, \pi_N\}$ of paths, design a control policy for each robot such that all robots eventually reach their final cells while avoiding collisions.

As stated in the problem definition, we assume that the predefined paths are known by every robot prior to the start of execution. There are many control policies that can solve Problem 1. For the sake of performance, policies that allow more concurrent behavior are preferred. In the literature, two metrics are commonly used to measure the performance: makespan (latest arrival time) and flowtime (sum of arrival times). Given a set of robots $R = \{r_1, \ldots, r_N\}$, if robot $r_n$ takes $t_n$ time steps to reach its final cell, the makespan and flowtime values are given by $\max_{1 \le n \le N} t_n$ and $\sum_{n=1}^{N} t_n$, respectively. These values decrease as the amount of concurrency increases. However, it might not be possible to minimize both makespan and flowtime at the same time, and the choice of policy might depend on the application.

To solve Problem 1, we propose a method that is based on the well-known drinking philosophers problem introduced by [3]. For the sake of completeness, this problem is explained briefly in Section 4.

4 Drinking Philosophers Problem

The drinking philosophers problem is a generalization of the well-known dining philosophers problem proposed by [7]. These problems capture the essence of conflict resolution, where multiple resources must be allocated to multiple processes. Given a set of processes and a set of resources, it is assumed that each resource can be used by at most one process at any given time. In our setting, processes and resources correspond to robots and to the discrete cells that partition the workspace, respectively. Similar to the mutually exclusive use of the resources, any given cell can be occupied by at most one robot to avoid collisions. In the DrPP setting, processes are called philosophers, and shared resources are called bottles.
A philosopher can be in one of three states: (1) tranquil, (2) thirsty, or (3) drinking. A tranquil philosopher may stay in this state for an arbitrary period of time or become thirsty at any time it wishes. A thirsty philosopher needs a non-empty subset of bottles to drink from. This subset, called a drinking session, is not necessarily fixed, and it could change over time. After acquiring all the bottles in its current drinking session, a thirsty philosopher starts drinking. After a finite time, when it no longer needs any bottles, the philosopher goes back to the tranquil state. The goal of the designer is to find a set of rules for each philosopher for acquiring and releasing bottles. A desired solution would have the following properties.

(i) Liveness: A thirsty philosopher eventually starts drinking. In our setting, liveness implies that each robot is eventually allowed to move.
(ii) Fairness: No philosopher is consistently favored over another. In the multi-robot setting, fairness indicates that there is no fixed priority order between robots.
(iii) Concurrency: Any pair of philosophers must be allowed to drink at the same time, as long as they wish to drink from different bottles. Analogously, no robot waits unnecessarily.

We base our method on the DrPP solution of [10], which is shown in Algorithm 1. This solution ensures liveness, fairness and concurrency. For the sake of completeness, we provide a brief summary of their solution, but refer the reader to [10] for the proof of correctness and additional details.

Each philosopher $p$ has a unique integer $id_p$ and keeps track of two non-decreasing integers: the session number $s\_num_p$ and the highest received session number $max\_rec_p$. These integers are used to keep a strict priority order between the philosophers. Conflicts are resolved according to this order, in favor of the philosopher with the higher priority.
We say that philosopher $p$ has higher priority than philosopher $r$ (denoted $p \prec r$) if $(s\_num_p, id_p) \prec (s\_num_r, id_r)$, that is, if $s\_num_p < s\_num_r$, or $s\_num_p = s\_num_r$ and $id_p < id_r$. In other words, a smaller session number indicates a higher priority, and in the case of identical session numbers, the philosopher with the smaller id has the higher priority. Note that the priority order is not fixed, as $s\_num$ values change over time.

Philosophers also keep track of several Boolean variables. Let $p$ and $r$ be two philosophers and $b$ be a bottle shared between them. Note that philosophers $p$ and $r$ can share multiple bottles with each other, but each bottle $b$ is shared by exactly two philosophers. For each bottle $b$ in philosopher $p$'s inventory, we define two Booleans $hold_p(b)$ and $req_p(b)$, and say that $p$ holds the bottle $b$ (or the request token for the bottle $b$) if the variable $hold_p(b)$ (or $req_p(b)$) holds true. Similarly, we say that $p$ needs $b$ if $need_p(b)$ holds true. Algorithm 1 ensures mutual exclusiveness, that is, $hold_p(b)$ and $hold_r(b)$ (similarly $req_p(b)$ and $req_r(b)$) cannot be true at the same time.

Philosophers are initialized such that each philosopher is in the tranquil state and $s\_num$ and $max\_rec$ are set to 0. Bottles and associated request tokens are shared between philosophers such that one philosopher holds the bottle while the other holds the associated request token. Since philosophers are in the tranquil state, all $need(b)$ variables are initially false.

The rules R1-R6 of Algorithm 1 are triggered by events. For example, if a philosopher $p$ wants to drink from a set of bottles $S$, it needs to become thirsty first by executing R1. Upon holding all the bottles in $S$, R2 is triggered and $p$ starts drinking. Similarly, $p$ triggers R4 when $p$ (i) needs $b$, (ii) does not currently hold $b$, and (iii) holds the associated request token $req_p(b)$.
Then, $p$ requests the bottle from $r$ by sending the message $(req_b, s\_num_p, id_p)$. Receiving such a message triggers R5 in $r$. If $r$ (1) does not need $b$, or (2) is thirsty and $(s\_num_p, id_p) \prec (s\_num_r, id_r)$, then it sets $hold_r(b)$ to false and sends the bottle to $p$. Sending a bottle is simply done by sending a message to $p$. Upon receiving such a message, $p$ sets $hold_p(b)$ to true by running R6. We refer the reader to [10] for more details.

5 Multi-robot navigation as a drinking philosophers problem

In this section we recast the multi-robot path execution problem as an instance of the drinking philosophers problem. We first show that a naive reformulation using existing DrPP solutions leads to conservative control policies. We then provide a solution that is based on Algorithm 1.

We begin with an intuitive explanation of the transformation from MRPE to DrPP, and then present the formal procedure. Given a set $V = \{v_1, \ldots, v_{|V|}\}$ of cells and a collection $\Pi = \{\pi_1, \ldots, \pi_N\}$ of paths, cells that appear in more than one path are called shared and the rest are called free.
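The shared/free classification just described can be computed directly from the collection of paths. The following is a minimal sketch, not the authors' implementation; the function name `classify_cells`, the dictionary-based representation of paths, and the example paths are all assumptions made for illustration. Cells of $V$ that no path visits are not reported by this sketch.

```python
from collections import Counter

def classify_cells(paths):
    """Split the cells visited by a collection of paths into shared and free.

    paths: dict mapping a robot name to its path (a list of cell labels).
    A cell is shared if it appears in more than one path, free otherwise.
    """
    visits = Counter()
    for path in paths.values():
        # A path may contain loops, so count each cell at most once per path.
        visits.update(set(path))
    shared = {cell for cell, count in visits.items() if count > 1}
    free = set(visits) - shared
    return shared, free

# Hypothetical paths, not taken from the paper's figure.
example_paths = {
    "r1": ["a", "b", "c"],
    "r2": ["d", "b", "e"],
    "r3": ["f", "c", "c", "g"],
}
shared_cells, free_cells = classify_cells(example_paths)
```

Because every robot is assumed to know all paths in advance, each robot could run such a computation locally and arrive at the same classification.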
Algorithm 1 Drinking Philosopher Algorithm by [10]

R1: becoming thirsty with session S
    for each bottle b ∈ S do
        need_p(b) ← true
    end for
    s_num_p ← max_rec_p + 1

R2: start drinking
    if holding all needed bottles then
        become drinking
    end if

R3: becoming tranquil, honoring deferred requests
    for each consumed bottle b do
        need_p(b) ← false
        if req_p(b) then
            [hold_p(b) ← false; Send(b)]
        end if
    end for

R4: requesting a bottle
    if (need_p(b) and ¬hold_p(b) and req_p(b)) then
        req_p(b) ← false
        Send(req_b, s_num_p, id_p)
    end if

R5: receiving a request from r, resolving a conflict
    upon reception of (req_b, s_num_r, id_r) do
        req_p(b) ← true
        max_rec_p ← max(max_rec_p, s_num_r)
        if (¬need_p(b)) or (p is thirsty and (s_num_r, id_r) ≺ (s_num_p, id_p)) then
            [hold_p(b) ← false; Send(b)]
        end if

R6: receive bottle
    upon reception of b do
        hold_p(b) ← true

We denote the set of all shared cells by $V_{shared}$. We assume that robots know all paths prior to the start of the execution, hence each robot knows which cells are shared and which ones are free. A shared cell must be occupied by at most one robot at any given time to avoid collisions. One can treat the robots as philosophers and shared cells as bottles to enforce this mutual exclusion requirement. In the multi-robot setting, the actions "moving between two free cells" or "occupying a free cell" of a robot are mapped into the tranquil state. Similarly, the "desire to move into a shared cell" and "moving towards or occupying a shared cell" are mapped into the thirsty and the drinking states, respectively.

Given any two arbitrary robots, we define a bottle for each cell that is visited by both. For example, if the $k$th cell $v_k \in V$ is visited both by $r_m$ and $r_n$, we define the bottle $b^k_{m,n}$. We denote the set of cells visited by both $r_m$ and $r_n$ by $V_{m,n} \doteq \{ v \mid \exists\, t_m, t_n : \pi_m^{t_m} = \pi_n^{t_n} = v \in V_{shared} \}$. Note that for a shared cell $v_k \in V_{m,n}$, there exists a single bottle shared between $r_n$ and $r_m$, and both $b^k_{m,n}$ and $b^k_{n,m}$ refer to the same object. Multiple bottles would be defined for a shared cell that is visited by more than two robots, where each bottle is shared between exactly two robots. We use $B_{m,n}$ and $B_m$ to denote the set of all bottles $r_m$ shares with $r_n$ and with all other robots, respectively. We then define $B_m(\mathcal{V})$, the set of bottles associated with the cells in $\mathcal{V} \subseteq V$ that $r_m$ shares with others, such that $B_m(\mathcal{V}) \doteq \{ b^k_{m,n} \in B_m \mid v_k \in \mathcal{V} \}$. We use the following example to illustrate the concepts above.

Fig. 1. An illustrative example showing partial paths of five robots. Robots, each assigned a unique color/pattern pair, are initialized on free cells. Shared cells are shown as hollow black rectangles. Each path eventually reaches a free cell that is not shown for the sake of simplicity.

Example 1. In the scenario depicted in Figure 1, the robot $r_1$ shares one bottle with $r_2$, $B_{1,2} = \{b^2_{1,2}\}$, three bottles with $r_4$, $B_{1,4} = \{b^1_{1,4}, b^2_{1,4}, b^4_{1,4}\}$, and one bottle with $r_5$, $B_{1,5} = \{b^6_{1,5}\}$. The set $B_1$ is the union of these three sets, as $r_1$ does not share any bottles with $r_3$. Given $\mathcal{V} = \{v_2\}$, we have $B_1(\mathcal{V}) = \{b^2_{1,2}, b^2_{1,4}\}$.

Bottles are used to indicate the priority order between robots over shared cells. For instance, if $b^k_{m,n}$ is currently held by robot $r_m$, then $r_m$ has a higher priority than $r_n$ over the shared cell $v_k$. Note that this order is dynamic, as bottles are sent back and forth. Collisions can be prevented simply by the following rule: "to occupy a shared cell $v_k$, the robot $r_n$ must be drinking from all the bottles in $B_n(v_k)$". That is, $r_n$ must set all the bottles in $B_n(v_k)$ as needed, and be in the drinking state. Upon arriving at a free cell, a drinking robot would become tranquil.
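To make the bottle construction concrete, the sketch below builds the sets $B_{m,n}$ and $B_m(\mathcal{V})$ from a collection of paths. It is an illustration under stated assumptions rather than the authors' code: the names `build_bottles` and `bottles_of`, the tuple representation of a bottle, and in particular the paths themselves are hypothetical. The paths were chosen only so that the result agrees with Example 1; they are not read off Figure 1.

```python
from itertools import combinations

def build_bottles(paths):
    """Map each unordered robot pair to the set of cells that both robots visit.

    One bottle exists per (cell, pair); a bottle is represented here by the
    tuple (cell, frozenset({m, n})), so b^k_{m,n} and b^k_{n,m} coincide.
    """
    bottles = {}
    for m, n in combinations(sorted(paths), 2):
        common = set(paths[m]) & set(paths[n])
        if common:
            bottles[frozenset((m, n))] = common
    return bottles

def bottles_of(robot, cells, bottles):
    """B_m(V): bottles that `robot` shares with others for the cells in `cells`."""
    return {
        (cell, pair)
        for pair, common in bottles.items() if robot in pair
        for cell in common & set(cells)
    }

# Hypothetical partial paths, chosen to be consistent with Example 1.
fig1_paths = {
    "r1": ["f1", "v1", "v2", "v4", "v6", "g1"],
    "r2": ["f2", "v2", "g2"],
    "r3": ["f3", "v5", "g3"],
    "r4": ["f4", "v4", "v2", "v1", "g4"],
    "r5": ["f5", "v5", "v6", "g5"],
}
fig1_bottles = build_bottles(fig1_paths)
b1_v2 = bottles_of("r1", {"v2"}, fig1_bottles)
```

With these assumed paths, `fig1_bottles` contains three cells for the pair (r1, r4), one each for (r1, r2) and (r1, r5), and no entry for (r1, r3), while `b1_v2` holds the two bottles of $r_1$ associated with $v_2$.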
The rule above prevents collisions, as $r_n$ is the only robot allowed to occupy $v_k$ while it is drinking from $B_n(v_k)$. However, this rule is not sufficient to ensure that all robots reach their final cells. Without the introduction of further rules, robots might end up in a deadlock. We formally define deadlocks as follows:

Definition 1. A deadlock is any configuration where a subset of robots, which have not reached their final cell, wait cyclically and choose the STOP action indefinitely.

To exemplify the insufficiency of the aforementioned rule, imagine the scenario shown in Figure 1. Robots $r_1$ and $r_4$ traverse the neighboring cells $v_1$ and $v_2$ in the opposite order. Assume $r_4$ is at $v_4$ and wants to proceed into $v_2$, and, at the same time, $r_1$ wants to move into $v_1$. Using the aforementioned rule, robots must be drinking from the associated bottles in order to move. Since they wish to drink from different bottles, both robots would be allowed to start drinking. After arriving at $v_1$, $r_1$ has to start drinking from $B_1(v_2)$ in order to progress any further. However, $r_4$ is currently drinking from $b^2_{1,4} \in B_1(v_2)$ and cannot stop drinking before leaving $v_2$. Similarly, $r_4$ cannot progress, as $r_1$ cannot release $b^1_{1,4}$ before leaving $v_1$. Consequently, the robots would not be able to make any further progress, and would stay in the drinking state forever.

5.1 Naive Formulation

We now show that deadlocks can be avoided by constructing the drinking sessions carefully. For the correctness of DrPP solutions, all drinking sessions must end in finite time. If drinking sessions are set such that a robot entering a shared cell is clear to move until it reaches a free cell without requiring additional bottles along the way, then all drinking sessions would end in finite time.
That is, if a robot is about to enter a segment which consists of consecutive shared cells, it is required to acquire not only the bottles associated with the first cell, but all the bottles on that segment. To formally state this requirement, let $S_n(t)$ denote the drinking session associated with cell $\pi_n^t$ for robot $r_n$. That is, $r_n$ should be drinking from all the bottles in $B_n(S_n(t))$ to occupy $\pi_n^t$. Now set

$S_n(t) = \pi_n^{t:t'} = \{\pi_n^t, \ldots, \pi_n^{t'}\}$    (1)

where $\pi_n^k \in V_{shared}$ for all $k \in [t, t']$ and $\pi_n^{t'+1} \in V_{free}$ is the first free cell after $\pi_n^t$. As long as it is drinking from $B_n(S_n(t))$, $r_n$ has the highest priority among all robots over the cells $\pi_n^{t:t'}$. Then, $r_n$ can constantly choose the action GO until eventually reaching the free cell $\pi_n^{t'+1}$, and would stop drinking in finite time. Therefore, any existing DrPP solution, such as [3, 10, 27], can be used to design the control policies that solve Problem 1 if the drinking sessions are constructed as in (1).

To illustrate, let us revisit the scenario in Figure 1. To be able to occupy $v_1$, $r_1$ needs to acquire all bottles in $B_1(\{v_1, v_2, v_4, v_6\})$. Then, $r_1$ is free to move all the way up to the last shared cell $v_6$ without requiring additional bottles. Note, however, that once $r_1$ reaches $v_2$, the bottles $B_1(v_1)$ are no longer needed to avoid deadlocks. To allow more concurrency, we introduce a new rule that lets robots drop bottles they no longer need:

R7: upon leaving a shared state
    for each b ∉ B_p(S_p(curr(p))) do
        need_p(b) ← false
        if req_p(b) then
            [hold_p(b) ← false; Send(b)]
        end if
    end for

Without R7, if $r_1$ is at $v_2$, $r_2$ would need to wait until $r_1$ leaves $v_6$. With R7, $r_1$ would release $b^2_{1,2}$ upon leaving $v_2$. Then, $r_2$ can move to $v_2$ while $r_1$ is at $v_4$.
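The naive session of Eq. (1) is simply the maximal run of consecutive shared cells starting at a given path index. A minimal sketch follows; the function name `naive_session`, the list-based path representation, and the path used in the example are assumptions for illustration only. The session is returned as an ordered list rather than a set to make the example easier to read.

```python
def naive_session(path, t, shared):
    """Drinking session S_n(t) of Eq. (1): the maximal run of consecutive
    shared cells of `path` that starts at index t.

    Returns an empty list if path[t] is a free cell.
    """
    session = []
    for cell in path[t:]:
        if cell not in shared:
            break  # first free cell reached: the session ends just before it
        session.append(cell)
    return session

# Hypothetical path for r1, consistent with the discussion of Figure 1.
r1_path = ["f1", "v1", "v2", "v4", "v6", "g1"]
shared = {"v1", "v2", "v4", "v6"}

session_entering_v1 = naive_session(r1_path, 1, shared)
session_at_v2 = naive_session(r1_path, 2, shared)
```

Here `session_entering_v1` covers all four shared cells, which is why $r_1$ must hold every bottle in $B_1(\{v_1, v_2, v_4, v_6\})$ before entering $v_1$. Once $r_1$ is at $v_2$, `session_at_v2` no longer contains $v_1$, so under R7 the bottles of $B_1(v_1)$ fall outside the current session and can be dropped.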
However, the control policies resulting from the naive formulation are conservative and lead to poor performance in terms of both makespan and flowtime. To illustrate using the scenario shown in Figure 1, if $r_1$ is currently at $v_1$, $r_5$ cannot move into $v_6$ since $b^6_{1,5}$ is held by $r_1$. This is a conservative action, since $r_5$ cannot cause a deadlock by moving to $v_6$: it moves to a free cell right after.

We have seen that small drinking sessions might lead to deadlocks, and large drinking sessions might lead to unnecessary waits and, thus, bad performance. Our goal is to construct drinking sessions that are as small as possible such that we can guarantee deadlock-freeness while allowing as much concurrent behavior as possible. To achieve this goal, we introduce a new drinking state and rules regarding its operation.

5.2 New Drinking State and New Rules

In this subsection, we propose a new drinking state for the philosophers, namely insatiable. This new state is used when a robot moves from a shared cell to another shared cell. We also add an additional rule R8 regarding this new state and modify the existing rules R4 and R5 of Algorithm 1 as R'4 and R'5, respectively:

R'4: requesting a bottle
    if (need_p(b) and ¬hold_p(b) and req_p(b)) then
        req_p(b) ← false
        Send(req_b, s_num_p, id_p, ds_p)
    end if

R'5: receiving a request from r, and resolving a conflict
    upon reception of (req_b, s_num_r, id_r, ds_r) do
        req_p(b) ← true
        max_rec_p ← max(max_rec_p, s_num_r)
        if ((1) or (2) or (3)) then
            [hold_p(b) ← false; Send(b)]
        end if
    where
    (1) ¬need_p(b)
    (2) p is thirsty and
        (a) (ds_r is thirsty and (s_num_r, id_r) ≺ (s_num_p, id_p)) or,
        (b) ds_r is insatiable,
    (3) p is insatiable with (B_p(S_1), B_p(S_2)) and,
        (a) ds_r is insatiable and,
        (b) b ∉ B_p(S_1) and,
        (c) (s_num_r, id_r) ≺ (s_num_p, id_p).
R8: becoming insatiable with tuple (B_p(S_1), B_p(S_2))
    for each bottle b ∈ B_p(S_2) do
        need_p(b) ← true
    end for
    if not holding all bottles in B_p(S_1 ∪ S_2) then
        become insatiable
    end if

The message structure used for requesting bottles is modified in R'4 and now includes the drinking state $ds_p \in \{tranquil, thirsty, drinking, insatiable\}$ of the sender. This information is used in conjunction with $id$ and $s\_num$ by the receiver to decide whether to grant or defer the request.

In the naive formulation, drinking sessions are set such that a robot entering a shared cell is free to move until it reaches a free cell, without requiring additional bottles along the way. The insatiable state is intended to soften this constraint. Assume robot $r_n$ wants to move to the shared cell $\pi_n^t$, and the first free cell after $\pi_n^t$ is $\pi_n^{t'+1}$ for some arbitrary $t' > t$, so that all the cells in between are shared. If $r_n$ enters the first shared cell without acquiring all the bottles until $\pi_n^{t'+1}$, it would need to acquire those bottles at some point along the way. If $r_n$ becomes thirsty to acquire those bottles, it risks losing the bottles it currently holds. If another robot $r_m$ with a higher priority needs and receives the bottles associated with the cell $r_n$ currently occupies, the two robots might collide.

The insatiable state allows a robot to request new bottles without risking the loss of any of the bottles it currently holds. In this state, the robot does not hold all the bottles it needs to start drinking, similar to the thirsty state. The difference between the two states is that an insatiable philosopher always has a higher priority than a thirsty philosopher, regardless of their session numbers. Moreover, an insatiable philosopher does not release, under any circumstance, any of the bottles needed to occupy the cell it is currently in. In other words, if $p$ is insatiable with $(B_1, B_2)$, then $p$ already has all bottles in $B_1$ and cannot release them.
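The grant-or-defer decision of R'5 can be written as a small predicate. This is a hedged sketch of conditions (1)-(3) only, assuming the receiver currently holds the bottle; it omits the updates of $req_p(b)$ and $max\_rec_p$ and the actual message passing. The class `Philosopher`, its fields, and the function names are hypothetical, with `locked` standing in for $B_p(S_1)$.

```python
from dataclasses import dataclass, field

@dataclass
class Philosopher:
    ident: int
    s_num: int = 0
    state: str = "tranquil"  # tranquil | thirsty | drinking | insatiable
    needed: set = field(default_factory=set)  # bottles b with need_p(b) true
    locked: set = field(default_factory=set)  # B_p(S_1) while insatiable

def precedes(s_num_a, id_a, s_num_b, id_b):
    """True if (s_num_a, id_a) precedes (s_num_b, id_b) lexicographically."""
    return (s_num_a, id_a) < (s_num_b, id_b)

def should_send(p, bottle, s_num_r, id_r, ds_r):
    """Conditions (1)-(3) of R'5: does p hand `bottle` to the requester r?"""
    if bottle not in p.needed:                      # condition (1)
        return True
    r_first = precedes(s_num_r, id_r, p.s_num, p.ident)
    if p.state == "thirsty":                        # condition (2)
        return ds_r == "insatiable" or (ds_r == "thirsty" and r_first)
    if p.state == "insatiable":                     # condition (3)
        return ds_r == "insatiable" and bottle not in p.locked and r_first
    return False  # p is drinking and needs the bottle: the request is deferred

# Hypothetical receiver: insatiable, holding "b1" for its current cell.
p = Philosopher(ident=1, s_num=2, state="insatiable",
                needed={"b1", "b2"}, locked={"b1"})
grant_locked = should_send(p, "b1", s_num_r=1, id_r=5, ds_r="insatiable")
grant_pending = should_send(p, "b2", s_num_r=1, id_r=5, ds_r="insatiable")
grant_to_thirsty = should_send(p, "b2", s_num_r=1, id_r=5, ds_r="thirsty")
```

In this example only `grant_pending` is true: the bottle protecting the currently occupied cell is never released, and a thirsty requester cannot take bottles from an insatiable philosopher even with a smaller session number.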
Moreover, $p$ is trying to acquire the bottles in $B_2$ to start drinking, and these bottles might be released if a robot with a higher priority requests them. An insatiable robot always has a higher priority than a thirsty robot. In case of identical drinking states, the relation $\prec$ is used to resolve the priority order.

The insatiable state and the rules regarding its operation might lead to deadlocks without careful construction of drinking sessions. We now explain how to construct drinking sessions to avoid deadlocks.

5.3 Constructing Drinking Sessions

To compute drinking sessions, we first need to define a new concept called Path-Graph:

Definition 2 The Path-Graph induced by the collection $\Pi = \{\pi_1, \ldots, \pi_N\}$ of paths is a directed edge-colored multigraph $G_\Pi = (V, E_\Pi, C)$ where $V$ is a set of nodes, one per cell in $\Pi$, $E_\Pi = \{(\pi^{t}_n, c_n, \pi^{t+1}_n) \mid \pi_n \in \Pi\}$ is the set of edges, representing the transitions of each path, and $C = \{c_1, \ldots, c_N\}$ is the set of colors, one per path (i.e., one per robot).

A Path-Graph is a graphical representation of a collection of paths, overlaid on top of each other. The nodes of this graph correspond to the discrete cells that partition the workspace, and the edges illustrate the transitions between them. The color of an edge indicates which robot is responsible for that transition. In other words, if $\pi_n$ has a transition from $u$ to $v$, then there exists a $c_n$-colored edge from $u$ to $v$ in $G_\Pi$, i.e., $(u, c_n, v) \in E_\Pi$.

Path-Graphs are useful to detect possible deadlock configurations. Intuitively, deadlocks occur when a subset of robots wait cyclically for each other. We first show that such configurations correspond to a rainbow cycle in the corresponding Path-Graph. A rainbow cycle is a closed walk where no color is repeated.

Let $\Pi$ be a collection of paths and $G_\Pi$ be the Path-Graph induced by it. Assume that a subset $\{r_1, \ldots, r_K\} \subseteq R$ of robots are in a deadlock configuration such that $r_n$ waits for $r_{n+1}$ for all $n \in \{1, \ldots, K\}$, where $r_{K+1} = r_1$. That is, $r_n$ cannot move any further because it wants to move to the cell that is currently occupied by $r_{n+1}$. Let $v_n$ denote the current cell of $r_n$. Since $r_n$ wants to move from $v_n$ to $v_{n+1}$, we have $e_n = (v_n, c_n, v_{n+1}) \in E_\Pi$. Then, $\omega = \{(v_1, c_1, v_2), \ldots, (v_K, c_K, v_1)\}$ is a rainbow cycle of $G_\Pi$. For instance, there are two rainbow cycles in Figure 1: $\omega_1 = \{(v_1, c_1, v_2), (v_2, c_4, v_1)\}$ and $\omega_2 = \{(v_2, c_1, v_4), (v_4, c_4, v_2)\}$.

The first idea that follows from this observation is to limit the number of robots in each rainbow cycle to avoid deadlocks. However, this is not enough, as rainbow cycles can intersect with each other and robots might end up waiting for each other to avoid eventual deadlocks. For instance, in the scenario illustrated in Figure 1, let $r_1$ and $r_4$ be at $v_1$ and $v_4$, respectively. The number of robots in each rainbow cycle is limited to one; nonetheless, this configuration will eventually lead to a deadlock.

We propose Algorithm 2 to construct the drinking sessions, which are used to prevent such deadlocks. Given a collection $\Pi$ of paths, let $G_\Pi = (V, E_\Pi, C)$ denote its Path-Graph. We first define an equivalence relation $\sim$ on $V$ such that each node is equivalent only to itself. We then find all rainbow cycles in $G_\Pi$. Let $\mathcal{W}$ denote the set of all rainbow cycles. For each rainbow cycle $W \in \mathcal{W}$, we expand the equivalence relation $\sim$ by declaring all nodes in $W$ to be equivalent.
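As an illustration of this construction, the following Python sketch builds the Path-Graph from a list of paths, searches for rainbow cycles, merges the nodes on each cycle, and repeats the search on the merged graph until no rainbow cycle remains. It is only a sketch under stated assumptions, not the authors' implementation: the function names `rainbow_cycles` and `equivalence_classes` and the toy paths are hypothetical, colors are taken to be path indices, and the brute-force depth-first search is exponential in the worst case. The search enumerates only simple rainbow cycles, which should give the same classes, since a rainbow closed walk that revisits a node splits into shorter rainbow cycles through that node.

```python
def rainbow_cycles(edges):
    """Yield node lists of simple cycles whose edge colors are pairwise distinct."""
    out = {}
    for u, c, v in edges:
        out.setdefault(u, []).append((c, v))

    def extend(start, node, visited, colors):
        for c, v in out.get(node, []):
            if c in colors:
                continue  # this color was already used on the walk
            if v == start:
                yield visited  # walk closes without repeating a color
            elif v not in visited:
                yield from extend(start, v, visited + [v], colors | {c})

    for s in list(out):
        yield from extend(s, s, [s], frozenset())


def equivalence_classes(paths):
    """Map every cell to its equivalence class (a frozenset of cells)."""
    cells = {v for p in paths for v in p}
    cls = {v: frozenset([v]) for v in cells}  # each cell equivalent only to itself
    edges = {(cls[p[t]], n, cls[p[t + 1]])
             for n, p in enumerate(paths) for t in range(len(p) - 1)}
    while True:
        rep = {q: q for q in set(cls.values())}  # union-find over current classes

        def find(q):
            while rep[q] != q:
                q = rep[q]
            return q

        merged = False
        for cycle in rainbow_cycles(edges):
            for q in cycle[1:]:
                a, b = find(cycle[0]), find(q)
                if a != b:
                    rep[b] = a
                    merged = True
        if not merged:
            return cls  # no class was merged: fixed point reached
        groups = {}
        for q in rep:
            groups.setdefault(find(q), set()).update(q)
        new = {q: frozenset(groups[find(q)]) for q in rep}
        cls = {v: new[cls[v]] for v in cells}
        # quotient graph: keep only edges between distinct classes
        edges = {(new[a], c, new[b]) for a, c, b in edges if new[a] != new[b]}


# Hypothetical toy instance (not Figure 1): robots 0 and 1 traverse
# v1, v2, v4 in opposite directions; robots 2 and 3 pass through v3.
paths = [["a", "v1", "v2", "v4", "b"],
         ["c", "v4", "v2", "v1", "d"],
         ["v1", "v3"],
         ["v3", "v4"]]
classes = equivalence_classes(paths)
print(sorted(classes["v3"]))  # ['v1', 'v2', 'v3', 'v4']
```

In this toy instance, $v_3$ lies on no rainbow cycle of the original graph and is merged only in the second pass, once $v_1$, $v_2$ and $v_4$ have been collapsed into a single node.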
Algorithm 2 find_equivalence_classes
Input: $G_\Pi$
Return: $\tilde{G}_\Pi$
    $\sim \leftarrow \emptyset$
    for $u \in G_\Pi$ do
        expand $\sim$ such that $(u, u) \in \sim$
    end for
    $\mathcal{W} \leftarrow$ find_rainbow_cycles($G_\Pi$)
    if $\mathcal{W} = \emptyset$ then
        $\tilde{G}_\Pi \leftarrow G_\Pi$
        return
    else
        for $W \in \mathcal{W}$ do
            for $u, v \in W$ do
                expand $\sim$ such that $(u, v) \in \sim$
            end for
        end for
        $\tilde{G}_\Pi \leftarrow$ find_quotient($G_\Pi$, $\sim$)
        find_equivalence_classes($\tilde{G}_\Pi$)
    end if

That is, if $u$ and $v$ are two nodes of the rainbow cycle $W$, we add the pair $(u, v)$ to the equivalence relation $\sim$. Note that, due to the transitivity of the equivalence relation, the nodes of two intersecting rainbow cycles belong to the same equivalence class. The relation $\sim$ partitions $V$ by grouping intersecting rainbow cycles together. We then find the quotient set $V/\sim$ and define a new graph $\tilde{G}_\Pi = (V/\sim, \tilde{E}_\Pi, C)$ where $([u], c_m, [v]) \in \tilde{E}_\Pi$ if $[u] \neq [v]$ and there exist $\alpha \in [u]$, $\beta \in [v]$ such that $(\alpha, c_m, \beta) \in E_\Pi$. That is, we create a node for each equivalence class. We then add a $c_m$-colored edge to $\tilde{G}_\Pi$ between the nodes corresponding to $[u]$ and $[v]$ if there is a $c_m$-colored edge in $G_\Pi$ from a node in $[u]$ to a node in $[v]$. We repeat the same process with $\tilde{G}_\Pi$ in a recursive manner until no more rainbow cycles are found.

Proposition 1 Algorithm 2 terminates in finitely many steps.

Proof: Since all paths are finite, the number of nodes $|V|$ in the Path-Graph $G_\Pi$ is finite. At each iteration, Algorithm 2 either finds a new graph $\tilde{G}_\Pi$ which has a smaller number of nodes, or returns $G_\Pi$. Therefore, Algorithm 2 is guaranteed to terminate in at most $|V|$ steps. ◻

Remark 1 Algorithm 2 needs to find all rainbow cycles of an edge-colored multigraph at each iteration, which can be done in the following way. Given $G = (V, E, C)$, obtain $E' \subseteq V \times V$ from $E$ by removing the coloring and replacing multiple edges between the same two nodes with a single edge. Then, find all simple cycles in the graph $G' = (V, E')$. Finally, check if these cycles can be colored as a rainbow cycle. Finding all simple cycles in a directed graph is time bounded by $O((|V| + |E|)(C + 1))$ and space bounded by $O(|V| + |E|)$, where $C$ is the number of cycles [14]. Although the number of cycles in a directed graph grows, in the worst case, exponentially with the number of nodes, this operation can be done efficiently in practice [11]. Deciding if a cycle can be rainbow colored can be posed as an exact set cover problem, which is NP-complete. This is essentially due to the fact that, in the worst case, the number of cycles in a multigraph can be exponential in the number of colors compared to the corresponding directed graph. However, the number of nodes decreases at each iteration of Algorithm 2, making computations easier. Moreover, while the worst-case complexity is high, these operations can usually be performed efficiently in practice.

When Algorithm 2 finds the fixed point, we set

$$\tilde{S}_n(t) \doteq S_n(t) \cap [\pi^{t}_n] \qquad (2)$$

where $S_n(t)$ is defined as in (1) and $[\pi^{t}_n]$ is the equivalence class of $\pi^{t}_n$. That is, $r_n$ must be drinking from all the bottles in $B_n(\tilde{S}_n(t))$ to be able to occupy $\pi^{t}_n$. We now revisit the example in Figure 1.

Example 2 Let $G_\Pi$ be given as in Figure 1. After the first recursion of Algorithm 2, $[v_1] = \{v_1, v_2, v_4\}$ and $[v_i] = \{v_i\}$ for $i \in \{3, 5, 6\}$. After the second recursion, $[v_1] = \{v_1, v_2, v_3, v_4, v_5\}$ and $[v_6] = \{v_6\}$. No rainbow cycles are found after the second recursion; therefore, $\tilde{S}_1(1) = \{v_1, v_2, v_4, v_6\} \cap \{v_1, v_2, v_3, v_4, v_5\} = \{v_1, v_2, v_4\}$.

While all cells except $v_6$ get merged into a single cell in Figure 1, constructing drinking sessions as in (2) allows multiple robots to simultaneously occupy cells $v_1$ through $v_5$. For instance, the drinking sessions of $r_1$ and $r_3$ are disconnected since they have no common cells.
Therefore, $r_1$ and $r_3$ can enter cells $v_1$ and $v_3$, respectively, at the same time. Similarly, such drinking sessions also allow $r_1$, $r_2$ and $r_3$ to simultaneously occupy cells $v_4$, $v_2$ and $v_5$, respectively. Even in this small example, we can see the benefit of using Eq. (2) instead of Eq. (1). The modification allows $r_1$ and $r_5$ to be at $v_4$ and $v_6$, respectively, which was not possible with the naive formulation.

Remark 2 Sessions constructed by (2) are always contained in the sessions constructed by (1). That is, when drinking sessions are found as in (2), robots need fewer bottles to move, and the resulting control policies are more permissive.

We now propose a control policy that prevents collisions and deadlocks when drinking sessions are constructed as in (2).

5.4 Control Strategy

We propose Algorithm 3 as a control policy to solve Problem 1. In order to implement Algorithm 3 in a distributed manner, we require the communication graph to be identical to the resource dependency graph. That is, if two robots visit a common cell, there must be a communication channel between them. We also assume that messages from one robot to another are received in the order in which they are sent.

We now briefly explain the flow of the control policy, which is illustrated in Figure 2, and then provide more details. Robots are initialized as follows. If robot $r_n$ starts at a shared cell, all bottles in its initial drinking session are given to $r_n$ and the related request tokens are given to the corresponding robots. To ensure that bottles can be assigned in this way, we require that the initial drinking sessions are disjoint. Robots with shared initial cells are then initialized in the drinking state, which is possible since they hold all the required bottles, and the remaining robots are initialized in the tranquil state. If the final cell is reached, the STOP action is chosen, as the robot has accomplished its task.
Otherwise, if a robot is in either the tranquil or the drinking state, the control policy chooses the action GO until the robot reaches the next cell. When a robot moves from a free cell to a shared cell, it first becomes thirsty and the control policy issues the action STOP until the robot starts drinking. When moving between shared cells, a robot becomes insatiable if it needs to acquire additional bottles, and the STOP action is chosen until the robot starts drinking again. If a robot $r_n$ is leaving, for the last time, a shared cell where another robot $r_m$'s path terminates, $r_n$ sends $r_m$ a cleared message. When a robot's path terminates at a shared cell, it must be careful not to arrive early and block others from progressing. Therefore, when a robot is about to move to a segment of consecutive shared cells which includes its final cell, it needs to wait for others to clear its final cell.

All robots are initialized as previously described. Let $r_n$ be an arbitrary robot. Lines 1−2 of Algorithm 3 ensure that $r_n$ does not move after reaching its final cell. Otherwise, let $\pi^{t}_n$ denote the next cell on $r_n$'s path. If $\pi^{t}_n$ is a free cell, the control policy chooses the GO action until the robot reaches $\pi^{t}_n$ (lines 5−8). Then, $r_n$ goes back to the tranquil state if it was drinking and sends a cleared message to a corresponding robot $r_m$ if (i) $\pi^{t-1}_n$ was the terminal cell for $r_m$ and (ii) $\pi^{t-1}_n$ will not be visited by $r_n$ again in the future (lines 9−13). When $\pi^{t}_n$ is a shared cell, there are two possible options: (i) If there is no free cell between the next cell and the final cell of $r_n$, i.e., $\pi^{end}_n \in S_n(t)$ where $S_n(t)$ is defined as in (1), the robot must wait for all other robots to clear this cell (lines 15−18). This wait is needed because, otherwise, $r_n$ might block others by arriving and staying indefinitely at its final cell. When all others clear its final cell, $r_n$ can start moving again.
(ii) If the final cell is not included in the drinking session, $r_n$ checks its drinking state. If tranquil, $r_n$ becomes thirsty with the drinking session $\tilde{S}_n(t)$ (lines 19−20). If drinking, $r_n$ becomes insatiable with $\langle B_n(\tilde{S}_n(t)), B_n(\tilde{S}_n(t+1)) \rangle$ (lines 21−22). Then, the robot waits until it starts drinking to move to the next cell (lines 24−26). When the robot starts drinking, it is allowed to move until it reaches $\pi^{t}_n$ (lines 27−29). Upon reaching $\pi^{t}_n$, the robot sends the cleared signal if needed (line 31), as previously explained.

We now show the correctness of Algorithm 3.

Theorem 1 Given an instance of Problem 1, using Algorithm 3 as a control policy solves Problem 1 if
    (1) initial drinking sessions are disjoint for each robot, i.e., $\tilde{S}_m(0) \cap \tilde{S}_n(0) = \emptyset$ for all $m \neq n$, and
    (2) final cells of each robot belong to a different equivalence class, i.e., $(\pi^{end}_m, \pi^{end}_n) \notin \sim$ for any $m \neq n$, where $\sim$ is computed according to Algorithm 2, and
    (3) there exists at least one free cell in each $\pi_n$.

The proof of Theorem 1 can be found in the Appendix. The first condition in Theorem 1 ensures that robots can be initialized correctly. Imagine a robot $r_n$ whose path starts with a shared cell. If $r_n$ is initialized in the tranquil state, it will momentarily violate the requirement that "$r_n$ must be drinking from all the bottles in $B_n(\tilde{S}_n(0))$ to be able to occupy $\pi^{0}_n$". If initial drinking sessions are disjoint for each robot, $r_n$ can immediately start drinking. Therefore, all such robots can be initialized in the drinking state if the first condition is satisfied. The second condition is required so that robots whose paths end in a shared cell do not block others from progressing by reaching their final cells early. The last condition is required because, otherwise, (1) and (2) cannot be satisfied at the same time.
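The three conditions of Theorem 1 can be checked mechanically before execution. The following Python sketch shows one way to do so. It rests on several assumptions: the function name `theorem1_conditions` and the toy data are hypothetical, the equivalence classes and the initial drinking sessions $\tilde{S}_n(0)$ are supplied by the caller rather than computed here, and a cell is treated as free when exactly one robot visits it.

```python
from collections import Counter
from itertools import combinations


def theorem1_conditions(paths, cell_class, initial_sessions):
    """Return a tuple of booleans for conditions (1), (2) and (3).

    paths            -- list of cell sequences, one per robot
    cell_class       -- dict mapping each cell to its equivalence class
    initial_sessions -- list of sets of cells, one initial session per robot
    """
    pairs = list(combinations(range(len(paths)), 2))
    # (1) initial drinking sessions are pairwise disjoint
    disjoint_sessions = all(
        initial_sessions[m].isdisjoint(initial_sessions[n]) for m, n in pairs)
    # (2) final cells belong to pairwise different equivalence classes
    distinct_final_classes = all(
        cell_class[paths[m][-1]] != cell_class[paths[n][-1]] for m, n in pairs)
    # (3) every path contains a cell visited by no other robot
    visits = Counter(cell for path in paths for cell in set(path))
    has_free_cell = all(
        any(visits[cell] == 1 for cell in path) for path in paths)
    return disjoint_sessions, distinct_final_classes, has_free_cell


# Hypothetical two-robot instance: v1, v2, v4 form one equivalence class.
merged = frozenset({"v1", "v2", "v4"})
cell_class = {cell: frozenset({cell}) for cell in "abcd"}
cell_class.update({cell: merged for cell in merged})

good = [["a", "v1", "v2", "v4", "b"], ["c", "v4", "v2", "v1", "d"]]
bad = [["a", "v1", "v2", "v4"], ["c", "v4", "v2", "v1"]]  # both end inside the class
sessions = [set(), set()]  # both robots start in free cells

print(theorem1_conditions(good, cell_class, sessions))  # (True, True, True)
print(theorem1_conditions(bad, cell_class, sessions))   # (True, False, True)
```

In the second instance both paths terminate inside the same equivalence class, so condition (2) fails even though every path still contains a free cell.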
Remark 3 If (1) of Theorem 1 is replaced by $S_m(0) \cap S_n(0) = \emptyset$, the naive formulation explained in Section 5.1 also solves Problem 1. However, as mentioned in Remark 2, drinking sessions constructed by (1) always contain sessions constructed by (2), that is, $S_n(t) \supseteq \tilde{S}_n(t)$. Therefore, condition (1) of Theorem 1 becomes more restrictive for the naive implementation.

Remark 4 The control policy given in Algorithm 3 satisfies liveness, fairness and concurrency properties. The liveness proof is shown in Theorem 1, and the fairness and concurrency properties follow directly from [10].

6 Results

In this section, we compare our rainbow-cycle-based method explained in Sections 5.2-5.4, denoted DrPP-RC, with other path execution methods using identical paths. First, we compare DrPP-RC with the naive method, denoted DrPP-N, which is explained in Section 5.1. This comparison demonstrates the performance improvement that results from the addition of the new drinking state. As stated in Remark 2, DrPP-RC uses smaller drinking sessions and allows more concurrency. We also provide results for DrPP-N without R7. This rule is an addition to the original DrPP solution of [3] and exploits the structure of the multi-robot path execution problem by allowing robots to drop bottles while in the drinking state.

Fig. 2. Flowchart of the control policy explained in Algorithm 3.

We further compare DrPP-RC with the Minimal Communication Policy of [16], denoted MCP, which prevents collisions and deadlocks by maintaining a fixed visiting order for each cell. A robot is allowed to enter a cell only if all the other robots that are planned to visit the cell earlier have already visited and left it. It is shown that, under mild conditions on the collection of paths, keeping this fixed order prevents collisions and deadlocks. We refer the reader to [16] for more details.
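The fixed-visiting-order rule described above can be sketched in a few lines of Python. This is an illustration of the rule as stated in the preceding paragraph only, not the MCP implementation of [16]; the class name `FixedVisitingOrder`, its methods, and the robot and cell labels are hypothetical.

```python
from collections import deque


class FixedVisitingOrder:
    """Per-cell queues holding the planned order in which robots visit the cell."""

    def __init__(self, planned_order):
        # planned_order: dict mapping a cell to the list of robots, earliest first
        self._queue = {cell: deque(robots) for cell, robots in planned_order.items()}

    def may_enter(self, robot, cell):
        """A robot may enter only when every earlier visitor has already left."""
        queue = self._queue[cell]
        return bool(queue) and queue[0] == robot

    def leave(self, robot, cell):
        """Record that the robot at the head of the queue has visited and left."""
        if not self.may_enter(robot, cell):
            raise ValueError(f"{robot} is not the next planned visitor of {cell}")
        self._queue[cell].popleft()


order = FixedVisitingOrder({"x": ["r1", "r2"]})
print(order.may_enter("r2", "x"))  # False: r1 is planned to visit x first
order.leave("r1", "x")
print(order.may_enter("r2", "x"))  # True: r1 has visited and left
```

Because the queues are fixed in advance, `r2` has to wait for `r1` at cell `x` no matter how slowly `r1` moves.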
We also note that the conditions required by [31] are too restrictive for the majority of the examples provided in this section. That is, some nodes in the Path-Graph are connected by more than one colored edge, hence [31] cannot be used. On the other hand, merging shared cells as in Equation (1) generates a quotient graph that satisfies the required conditions. Then, the performance of [31] is identical to that of DrPP-N without R7. However, as Section 6.1 shows, [31] still cannot solve all problems solved by DrPP-RC, and for the problems it can solve, it is significantly outperformed by both DrPP-RC and DrPP-N.

Algorithm 3 Control policy for $r_n$
 1: if $r_n$.is_final_cell_reached then
 2:     $r_n$.STOP
 3: else
 4:     $t \leftarrow next(r_n)$
 5:     if is_free($\pi^{t}_n$) then
 6:         while $\neg\, r_n$.is_reached($\pi^{t}_n$) do
 7:             $r_n$.GO
 8:         end while
 9:         $next(r_n) \leftarrow next(r_n) + 1$
10:         if $r_n$.is_drinking then
11:             $r_n$.get_tranquil()
12:             send_cleared_message_if_needed()
13:         end if
14:     else
15:         if $\pi^{end}_n \in S_n(t)$ then
16:             while $\neg$ cleared($\pi^{end}_n$) do
17:                 $r_n$.STOP
18:             end while
19:         else if $r_n$.is_tranquil then
20:             $r_n$.get_thirsty($\tilde{S}_n(t)$)
21:         else if $r_n$.is_drinking then
22:             $r_n$.get_insatiable($B_n(\tilde{S}_n(t))$, $B_n(\tilde{S}_n(t+1))$)
23:         end if
24:         while $\neg\, r_n$.is_drinking do
25:             $r_n$.STOP
26:         end while
27:         while $\neg\, r_n$.is_reached($\pi^{t}_n$) do
28:             $r_n$.GO
29:         end while
30:         $next(r_n) \leftarrow next(r_n) + 1$
31:         send_cleared_message_if_needed()
32:     end if
33: end if

To capture the uncertainty in the robot motions, each robot is assigned a delay probability. When the action GO is chosen, a robot either stays in its current cell with this probability, or completes its transition to the next cell before the next time step, which leads to asynchrony between the robots' motions.

Our implementation can be accessed from https://github.com/sahiny/philosophers .
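The delay model described above can be sketched as follows. This is a minimal illustration under stated assumptions and is not taken from the linked repository: the function names `go_step` and `steps_to_finish` are hypothetical, and the sketch simulates a single robot that always receives the GO action, so it ignores the STOP actions issued by any control policy.

```python
import random


def go_step(index, path, delay_probability, rng):
    """Apply one GO action: stay with probability delay_probability, else advance."""
    if index == len(path) - 1:
        return index  # already at the final cell
    if rng.random() < delay_probability:
        return index  # delayed: the robot stays in its current cell
    return index + 1  # transition completes before the next time step


def steps_to_finish(path, delay_probability, rng, max_steps=10_000):
    """Count the time steps one uninterrupted robot needs to traverse its path."""
    index, steps = 0, 0
    while index < len(path) - 1 and steps < max_steps:
        index = go_step(index, path, delay_probability, rng)
        steps += 1
    return steps


rng = random.Random(0)
path = ["v0", "v1", "v2", "v3", "v4"]
print(steps_to_finish(path, 0.0, rng))  # 4: one step per transition without delays
print(steps_to_finish(path, 0.5, rng))  # at least 4; the exact count depends on the seed
```

Two robots with different delay probabilities drift apart in time even when they start together, which is the asynchrony the execution policies have to tolerate.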
6.1 Randomly Generated Examples

There are 10 MRPE instances in [16], labelled random 1-10, where 35 robots navigate in 4-connected grids of size 30 × 30. In each example, randomly generated obstacles block 10% of the cells, and robots are assigned random but unique initial and final locations. The first of these randomly generated examples can be seen in Figure 3. All control policies use the same paths generated by the Approximate Minimization in Expectation algorithm of [16]. Delay probabilities of robots are sampled from the range $(0, 1 - 1/t_{max})$. Note that higher delay probabilities can be sampled as $t_{max}$ increases, resulting in slow-moving robots. Figure 4 reports the makespan and flowtime statistics averaged over 1000 runs for varying $t_{max}$ values. The delay probabilities are sampled randomly for each run, but kept identical over different control policies. As expected, both makespan and flowtime statistics increase with $t_{max}$, as higher delay probabilities result in slower robots. An illustrative run of the DrPP-RC algorithm for $t_{max} = 2$ and environment random1 can be seen at https://youtu.be/tht4ydW5iJA .

Fig. 3. Randomly generated example, denoted random1, consisting of 35 robots on a 30 × 30 grid with 10% blocked cells, which are shown in black. The path of each robot is shown with a unique color where the solid and hollow circles represent the initial and final cells, respectively. Free (i.e., used by a single robot) and shared (i.e., visited by more than one robot) cells are painted green and red, respectively. Among the cells that are visited by at least one robot, 44% (217/496) are shared cells, and a robot's path consists of 65% shared cells on average. These statistics are similar for other random examples.

From Figure 4, we first observe that the addition of R7 improves the flowtime performance of DrPP-N significantly, while its effect on makespan is negligible.
Secondly, we observe that DrPP-RC always performs better than DrPP-N. This is expected, as the drinking sessions for DrPP-N, which are computed by (1), are always larger than those of DrPP-RC, which are computed by (2). That is, robots using DrPP-N need more bottles to move, and thus wait more. Moreover, DrPP-N requires stronger assumptions to hold for a collection of paths. For instance, only one of the ten random examples (random7) satisfies the assumptions in Theorem 1 for DrPP-N. The number of instances that satisfy the assumptions increases to four for DrPP-RC (random 3, 4, 7, 10). The random1 example illustrated in Figure 3 originally violates the assumptions, but this is fixed for both drinking-based methods by adding a single cell to a robot's path. We note here that the sets of valid paths for the MCP and DrPP algorithms are non-comparable. There are paths that satisfy the assumptions of one algorithm and violate those of the other, and vice versa.

We also observe that makespan values are quite similar for the DrPP-RC and MCP methods, although MCP often performs slightly better in this regard. Given a collection of paths, the makespan is largely determined by the "slowest" robot, i.e., a robot with a long path and/or a high delay probability, regardless of the control policies. Therefore, makespan statistics do not necessarily reflect the amount of concurrency allowed by the control policies. Ideally, in the case of a slow-moving robot, we want the control policies not to stop or slow down other robots unnecessarily, but to allow them to move freely. The flowtime statistics reflect these properties better. From Figure 4, we see that flowtime values increase more significantly with $t_{max}$ for MCP, compared to DrPP-RC. This trend can be explained by how priority orders are maintained in each of the algorithms. As the delay probabilities increase, there is more uncertainty in the motion of robots.
MCP keeps a fixed priority order between robots, which might lead to robots waiting for each other unnecessarily. On the other hand, DrPP-RC dynamically adjusts this order, which leads to more concurrent behavior and hence to smaller flowtime values. Section 6.2 illustrates this phenomenon with a simple example.

6.2 Makespan versus Flowtime

As mentioned earlier, [16] assumes that delay probabilities are known a priori, and computes paths to minimize the expected makespan. Once the paths are computed, the priority order between robots is fixed to ensure that MCP policies are collision and deadlock-free. We now provide a simple example to illustrate the effect of using inaccurate delay probabilities in the path planning process. Imagine 3 robots sharing a 10 by 10 grid environment as shown in Figure 5. Assume that the delay probabilities for robots $r_1$, $r_2$ and $r_3$ are known to be $\{0, 0.4, 0.8\}$, respectively. If we compute paths to minimize the expected makespan, the resulting paths are straight lines for each robot. Paths $\pi_1$ and $\pi_2$ intersect at a single cell, for which $r_1$ has priority over $r_2$. Similarly, $\pi_2$ and $\pi_3$ also intersect at a single cell, for which $r_2$ has priority over $r_3$. We run this example using the inaccurate delay probabilities $\{0.8, 0.4, 0\}$ to see how the makespan and flowtime statistics are affected.

Over 1000 runs, the makespan values are found to be 48.30 and 45.77 steps for the MCP and DrPP-RC implementations, respectively. The makespan values are close because of the slow-moving $r_1$, which becomes the bottleneck of the system. Therefore, it is not possible to improve the makespan statistics by employing different control policies. However, the flowtime statistics are found to be 128.78 and 77.78 steps for the MCP and DrPP-RC implementations, respectively. This significant difference is the result of how a slow-moving robot is treated by each policy. For the MCP implementation, $r_2$ (resp. $r_3$) needs to wait for $r_1$ (resp. $r_2$) unnecessarily, since the priority order is fixed at the path planning phase. On the other hand, the DrPP-RC implementation allows robots to modify the priority order at run-time, resulting in improved flowtime statistics.

6.3 Warehouse Example

We also compare the performance of the control policies in a more structured warehouse-like environment. This warehouse example is taken from [16], and it has 35 robots as shown in Figure 6. The makespan and flowtime statistics, averaged over 1000 runs for varying $t_{max}$ values, are reported in Figure 7. Due to stronger assumptions on the collection of paths, DrPP-N is not able to handle this example. Similarly, the conditions required by [31] are too restrictive, hence it cannot solve this problem. Although paths can be altered to allow [31] to be used, this requires each aisle to be abstracted as one discrete cell, and limits the number of robots in each aisle to at most one. As a result, the performance of [31] would be significantly worse compared to both DrPP-RC and MCP, no matter how paths are generated.

Similar to Section 6.1, we observe that makespan values are better for MCP, but DrPP-RC scales better with $t_{max}$ for flowtime statistics. Upon closer inspection, we see that robots moving in narrow corridors in opposite directions lead to many rainbow cycles. By enforcing a one-way policy in each corridor, similar to [5], many of these rainbow cycles can be eliminated and the performance of our method can be improved. Indeed, Figure 7 reports the results when paths are modified such that no horizontal corridor has robots moving in opposing directions.

We further use the same warehouse example to demonstrate how DrPP-RC can be used in conjunction with a higher-level emergency stopping algorithm. In practical examples, robots carry shelves around the warehouse, which might make it dangerous for humans to work in the same space.
To guarantee safety for humans, we require robots to stop and give way if there is a human within a predefined radius. As the video at https://youtu.be/gVSKs1iKsQw shows, DrPP-RC guarantees that deadlocks and collisions are avoided in such cases.

Fig. 4. Makespan and flowtime statistics averaged over 1000 runs for the randomly generated environments under varying $t_{max}$ values. DrPP-based methods cannot be used in environments where the collection of paths violates the conditions in Theorem 1. The DrPP-N and DrPP-RC methods can solve 2 and 5 out of 10 randomly generated instances, respectively.

Fig. 5. A simple example to show the effects of a slow-moving robot. Robots $r_1$, $r_2$ and $r_3$ are colored in red, blue and green, respectively. Initial and final cells of the robots are marked with solid and hollow circles of their unique color, respectively.

7 Conclusions

In this paper, we presented a method to solve the multi-robot path execution (MRPE) problem. Our method is based on a reformulation of the MRPE problem as an instance of the drinking philosophers problem (DrPP). We showed that the existing solutions to the DrPP can be used to solve instances of MRPE problems if drinking sessions are constructed carefully. However, such an approach leads to conservative control policies. To improve the system performance, we provided a less conservative approach where we modified an existing DrPP solution. We provided conditions under which our control policies are shown to be collision and deadlock-free. We further demonstrated the efficacy of this method by comparing it with existing work. We observed that our method provides similar makespan performance to [16] while outperforming it in flowtime statistics, especially as the uncertainty in the robots' motion increases. This improvement can be explained mainly by our method's ability to change the priority order between robots during run-time, as opposed to keeping a fixed order.
Fig. 6. Illustration of a warehouse example on a 22 × 57 grid. Blocked cells are shown in black. Initial and final cells are marked with a solid and a hollow circle of a unique color, respectively.

Our current method and the derived conditions that guarantee collision and deadlock-freeness are limited to the multi-robot path execution problem where robot paths are assumed to be fixed a priori. Using such conditions to guarantee deadlock-freeness of replanning approaches, or designing life-long planning algorithms with similar guarantees, are interesting directions for future research. We are also interested in finding looser conditions that guarantee collision and deadlock-freeness, as the current conditions are sufficient but might not be necessary.

Acknowledgements

We thank Hang Ma from Simon Fraser University and Sven Koenig from University of Southern California for sharing their code for the MCP implementation in [16] with us. We also thank Ruya Karagulle for pointing out typos in Theorem 1. Last but not least, we thank the reviewers for their valuable comments and suggestions, which greatly improved the clarity and the presentation of the paper. This work is supported in part by ONR grant N00014-18-1-2501, NSF grant ECCS-1553873, and an Early Career Faculty grant from NASA's Space Technology Research Grants Program.

Fig. 7. Makespan and flowtime statistics averaged over 1000 runs for the warehouse environment under varying $t_{max}$ values. Dashed lines show the improvement obtained by modifying the paths to decrease the number of rainbow cycles. DrPP-N cannot solve this instance as the collection of paths violates the conditions in Theorem 1.

References

[1] Ahn, H. and Del Vecchio, D. [2017], ‘Safety verification and control for collision avoidance at road intersections’, IEEE Transactions on Automatic Control 63(3), 630–642.
[2] Carlino, D., Boyles, S. D. and Stone, P. [2013], Auction-based autonomous intersection management, in ‘16th International IEEE Conference on Intelligent Transportation Systems (ITSC 2013)’, IEEE, pp. 529–534.
[3] Chandy, K. M. and Misra, J. [1984], ‘The drinking philosophers problem’, ACM Transactions on Programming Languages and Systems (TOPLAS) 6(4), 632–646.
[4] Chen, L. and Englund, C. [2015], ‘Cooperative intersection management: A survey’, IEEE Transactions on Intelligent Transportation Systems 17(2), 570–586.
[5] Cohen, L., Uras, T. and Koenig, S. [2015], Feasibility study: Using highways for bounded-suboptimal multi-agent path finding, in ‘Eighth Annual Symposium on Combinatorial Search’.
[6] Desai, A., Saha, I., Yang, J., Qadeer, S. and Seshia, S. A. [2017], Drona: a framework for safe distributed mobile robotics, in ‘Proceedings of the 8th ICCPS’, ACM, pp. 239–248.
[7] Dijkstra, E. W. [1971], Hierarchical ordering of sequential processes, in ‘The origin of concurrent programming’, Springer, pp. 198–227.
[8] Dresner, K. and Stone, P. [2008], ‘A multiagent approach to autonomous intersection management’, Journal of Artificial Intelligence Research 31, 591–656.
[9] Felner, A., Stern, R., Shimony, S. E., Boyarski, E., Goldenberg, M., Sharon, G., Sturtevant, N., Wagner, G. and Surynek, P. [2017], Search-based optimal solvers for the multi-agent pathfinding problem: Summary and challenges, in ‘Tenth Annual Symposium on Combinatorial Search’.
[10] Ginat, D., Shankar, A. U. and Agrawala, A. K. [1989], An efficient solution to the drinking philosophers problem and its extensions, in ‘International Workshop on Distributed Algorithms’, Springer, pp. 83–93.
[11] Gupta, A. and Suzumura, T. [2021], ‘Finding all bounded-length simple cycles in a directed graph’, arXiv preprint arXiv:2105.10094.
[12] Hönig, W., Kiesel, S., Tinka, A., Durham, J. W. and Ayanian, N. [2019], ‘Persistent and robust execution of MAPF schedules in warehouses’, IEEE Robotics and Automation Letters 4(2), 1125–1131.
[13] Hönig, W., Kumar, T. S., Cohen, L., Ma, H., Xu, H., Ayanian, N. and Koenig, S. [2016], Multi-agent path finding with kinematic constraints, in ‘Twenty-Sixth International Conference on Automated Planning and Scheduling’.
[14] Johnson, D. B. [1975], ‘Finding all the elementary circuits of a directed graph’, SIAM Journal on Computing 4(1), 77–84.
[15] Luh, P. B., Wilkie, C. T., Chang, S.-C., Marsh, K. L. and Olderman, N. [2012], ‘Modeling and optimization of building emergency evacuation considering blocking effects on crowd movement’, IEEE Transactions on Automation Science and Engineering 9(4), 687–700.
[16] Ma, H., Kumar, T. S. and Koenig, S. [2017], Multi-agent path finding with delay probabilities, in ‘Thirty-First AAAI Conference on Artificial Intelligence’, pp. 3605–3612.
[17] McNew, J.-M., Klavins, E. and Egerstedt, M. [2007], Solving coverage problems with embedded graph grammars, in ‘International Workshop on Hybrid Systems: Computation and Control’, Springer, pp. 413–427.
[18] Panagou, D. [2014], Motion planning and collision avoidance using navigation vector fields, in ‘2014 IEEE International Conference on Robotics and Automation (ICRA)’, IEEE, pp. 2513–2518.
[19] Reveliotis, S. and Roszkowska, E. [2008], Conflict resolution in multi-vehicle systems: A resource allocation paradigm, in ‘2008 IEEE International Conference on Automation Science and Engineering’, IEEE, pp. 115–121.
[20] Roszkowska, E. and Reveliotis, S. [2013], ‘A distributed protocol for motion coordination in free-range vehicular systems’, Automatica 49(6), 1639–1653.
[21] Sahin, Y. E., Nilsson, P. and Ozay, N. [2020], ‘Multirobot coordination with counting temporal logics’, IEEE Transactions on Robotics.
[22] Şenbaşlar, B., Hönig, W. and Ayanian, N. [2019], Robust trajectory execution for multi-robot teams using distributed real-time replanning, in ‘Distributed Autonomous Robotic Systems’, Springer, pp. 167–181.
[23] Stern, R., Sturtevant, N., Felner, A., Koenig, S., Ma, H., Walker, T., Li, J., Atzmon, D., Cohen, L., Kumar, T. et al. [2019], ‘Multi-agent pathfinding: Definitions, variants, and benchmarks’, arXiv preprint arXiv:1906.08291.
[24] Surynek, P. [2010], An optimization variant of multi-robot path planning is intractable, in ‘Twenty-Fourth Conference on Artificial Intelligence’.
[25] Tanner, H. G., Pappas, G. J. and Kumar, V. [2004], ‘Leader-to-formation stability’, IEEE Transactions on Robotics and Automation 20(3), 443–455.
[26] Van Den Berg, J., Guy, S. J., Lin, M. and Manocha, D. [2011], Reciprocal n-body collision avoidance, in ‘Robotics research’, Springer, pp. 3–19.
[27] Welch, J. L. and Lynch, N. A. [1993], ‘A modular drinking philosophers algorithm’, Distributed Computing 6(4), 233–244.
[28] Wurman, P. R., D’Andrea, R. and Mountz, M. [2008], ‘Coordinating hundreds of cooperative, autonomous vehicles in warehouses’, AI Magazine 29(1), 9.
[29] Yu, J. and LaValle, S. M. [2016], ‘Optimal multirobot path planning on graphs: Complete algorithms and effective heuristics’, IEEE Transactions on Robotics 32(5), 1163–1177.
[30] Zhou, Y., Hu, H., Liu, Y., Lin, S.-W. and Ding, Z. [2018], ‘A distributed approach to robust control of multi-robot systems’, Automatica 98, 1–13.
[31] Zhou, Y., Hu, H., Liu, Y., Lin, S.-W. and Ding, Z. [2020], ‘A distributed method to avoid higher-order deadlocks in multi-robot systems’, Automatica 112, 108706.
[32] Zohdy, I. H. and Rakha, H. A. [2016], ‘Intersection management via vehicle connectivity: The intersection cooperative adaptive cruise control system concept’, Journal of Intelligent Transportation Systems 20(1), 17–32.
Appendix - Proof of Theorem 1

We first show that Algorithm 3 is collision-free. Assume that $r_n$ currently occupies the shared cell $\pi_n^{t}$. When a robot is about to move to a shared cell, the GO action is issued only when the robot is drinking (lines 20–23). Therefore, before reaching $\pi_n^{t}$, $r_n$ was in the drinking state and thus held all the bottles of the drinking session $\tilde{S}_n(t)$, that is, $B_n(\tilde{S}_n(t))$. If $\pi_n^{t+1}$ is a free cell, $r_n$ stays in the drinking state until it reaches $\pi_n^{t+1}$ (lines 28–35). Otherwise, it becomes insatiable with $\left(B_n(\tilde{S}_n(t)),\, B_n(\tilde{S}_n(t+1))\right)$. In neither scenario does $r_n$ release any bottle in $B_n(\pi_n^{t}) \subseteq B_n(\tilde{S}_n(t))$ before reaching $\pi_n^{t+1}$. Note also that, by construction of the drinking sessions, $\pi_n^{t} \subseteq \tilde{S}_n(t)$. Since bottles are mutually exclusive, no other robot can acquire the bottles in $B_n(\pi_n^{t})$ while $r_n$ is in $\pi_n^{t}$. Hence collisions are avoided, as no other robot is allowed to occupy $\pi_n^{t}$ before $r_n$ leaves.

We now show that Algorithm 3 is deadlock-free. As defined in Definition 1, a deadlock is any configuration in which a subset of robots that have not reached their final cells choose the STOP action indefinitely. As can be seen from Algorithm 3, there are only three cases in which a robot chooses the STOP action: (i) when the robot is already in its final cell (line 2), (ii) when there are no free cells from the next cell up to and including the final cell, and the final cell is not yet cleared by all other robots (line 14), (iii) when the robot is in the thirsty or insatiable state (line 19). In the following, we show that none of these cases can cause a deadlock.

We start by showing that neither (i) nor (ii) can cause a deadlock. To do so, assume that $r_n$ has reached its final cell and is causing a deadlock by blocking others from progressing. By condition (3) of Theorem 1, we know that there exists at least one free cell in each path.
Since we assumed that $r_n$ is blocking others by waiting in its final cell, $\pi_n^{\mathrm{end}}$ must be a shared cell. Then, there must be at least one free cell before $\pi_n^{\mathrm{end}}$. Let $\pi_n^{t}$ denote the last free cell on $\pi_n$. A robot reaching a free cell enters the tranquil state, if it is not already tranquil, due to line 34 of Algorithm 3. Otherwise, if $\pi_n^{t}$ is the first cell of $\pi_n$, $r_n$ would be in the tranquil state before trying to move forward, since all robots are initialized in the tranquil state. According to lines 12–15 of Algorithm 3, $r_n$ would wait in $\pi_n^{t}$ in the tranquil state until its final cell is cleared by all other robots. Since a tranquil robot does not need any bottles, no other robot could be waiting for $r_n$. This is a contradiction: a robot cannot reach its final cell and block others from progressing. Therefore, (i) cannot be a reason for a deadlock. Furthermore, we showed that a robot waiting due to (ii) stays in the tranquil state until all others clear its final cell. As stated, a tranquil robot does not need any bottles, and thus no other robot could be waiting for $r_n$. Thus, (ii) cannot cause deadlocks either.

We now show that (iii) cannot cause deadlocks. To do so, assume that a subset of robots are stuck due to (iii), i.e., they are all in the thirsty or insatiable state, and they need additional bottle(s) to move. If there were a robot that does not wait for any other robot, it would start drinking and moving. Therefore, some non-empty subset of these robots must be waiting circularly for each other. Without loss of generality, let $r_n$ be waiting for $r_{n+1}$ for $n \in \{1, \ldots, K\}$, where $r_{K+1} = r_1$. That is, $r_n$ holds some subset of the bottles that $r_{n-1}$ needs, and would not release them without acquiring some subset of bottles from $r_{n+1}$. There might be other robots choosing the STOP action indefinitely as well; however, the root cause of the deadlock is this circular wait. Once the circular waiting ends, all robots would start moving according to their priority ordering.

For the time being, assume that each robot starts from a free initial cell and moves towards a free cell through an arbitrary number of shared cells in between. We relax this assumption later. Firstly, we know that none of the robots can be in the tranquil or drinking state; otherwise they would be moving until reaching the next cell, as per lines 5–8, 20–23 and 28–31 of Algorithm 3. Secondly, we show that not all robots can be thirsty. Since a strict priority order is maintained between robots at all times, if all of them were thirsty, the robot with the highest priority would acquire all the bottles it needs according to R′5 and start drinking. A drinking robot starts moving, and therefore cannot be participating in a deadlock. Therefore, there must be at least one robot that is in the insatiable state. Thirdly, we show that if there is a deadlock, all robots participating in it must be in the insatiable state. To derive a contradiction, assume that at least one of the robots participating in the deadlock is thirsty. According to R′5, an insatiable robot always has a higher priority than a thirsty robot. Therefore, an insatiable robot cannot be waiting for a thirsty robot. Thus, all robots in a deadlock configuration must be in the insatiable state.

Let $\tilde{G}_\Pi$ be the graph returned by Algorithm 2 for the input Path-Graph $G_\Pi$. We showed that all robots are in the insatiable state. Let $\pi_n^{t_n}$ denote the cell that $r_n$ currently occupies, i.e., $\mathrm{curr}(r_n) = t_n$. That is, $r_n$ is insatiable with $\left(B_n(\tilde{S}_n(t_n)),\, B_n(\tilde{S}_n(t_n+1))\right)$ and needs all the bottles in $B_n(\tilde{S}_n(t_n) \cup \tilde{S}_n(t_n+1))$ to start drinking. Since $r_n$ currently occupies $\pi_n^{t_n}$, it holds all the bottles in $B_n(\tilde{S}_n(t_n))$. This implies that $\tilde{S}_n(t_n) \neq \tilde{S}_n(t_n+1)$, and that $r_n$ does not hold some of the bottles in $B_n(\tilde{S}_n(t_n+1))$. Now define $[\tilde{S}_n(t_n)] \doteq \bigcup_{v_i \in \tilde{S}_n(t_n)} \{[v_i]\}$. By construction, there must be two nodes in $\tilde{G}_\Pi$, one corresponding to $[\tilde{S}_n(t_n)]$ and another corresponding to $[\tilde{S}_n(t_n+1)]$, and a $c_n$-colored edge from $[\tilde{S}_n(t_n)]$ to $[\tilde{S}_n(t_n+1)]$ in $\tilde{G}_\Pi$. Similarly, $r_{n+1}$ holds all the bottles in $B_{n+1}(\tilde{S}_{n+1}(t_{n+1}))$ and is missing some of the bottles in $B_{n+1}(\tilde{S}_{n+1}(t_{n+1}+1))$. Since $r_n$ is waiting for $r_{n+1}$, either $[\tilde{S}_n(t_n+1)] = [\tilde{S}_{n+1}(t_{n+1})]$ or $[\tilde{S}_n(t_n+1)] = [\tilde{S}_{n+1}(t_{n+1}+1)]$ must hold. This implies that there exists a $c_n$-colored edge from $[\tilde{S}_n(t_n)]$ to either $[\tilde{S}_{n+1}(t_{n+1})]$ or $[\tilde{S}_{n+1}(t_{n+1}+1)]$. In a similar manner, either $[\tilde{S}_i(t_i+1)] = [\tilde{S}_{i+1}(t_{i+1})]$ or $[\tilde{S}_i(t_i+1)] = [\tilde{S}_{i+1}(t_{i+1}+1)]$ holds for each $i \in \{1, \ldots, K\}$.

Assume $[\tilde{S}_n(t_n+1)] = [\tilde{S}_{n+1}(t_{n+1})]$ holds for each $n \in \{1, \ldots, K\}$. Then, there exists a rainbow cycle $\left\{\left([\tilde{S}_1(t_1)], c_1, [\tilde{S}_2(t_2)]\right), \ldots, \left([\tilde{S}_K(t_K)], c_K, [\tilde{S}_1(t_1)]\right)\right\}$ in $\tilde{G}_\Pi$. However, this is a contradiction, since $\tilde{G}_\Pi$ must be free of rainbow cycles by construction and such a rainbow cycle would have been found by Algorithm 2.

Alternatively, let $m \in \{1, \ldots, K\}$ be arbitrary and let $[\tilde{S}_m(t_m+1)] = [\tilde{S}_{m+1}(t_{m+1}+1)]$ hold. For each $n \neq m$, assume $[\tilde{S}_n(t_n+1)] = [\tilde{S}_{n+1}(t_{n+1})]$ holds. Then, a rainbow cycle can be created as in the previous case, where the edges $\left([\tilde{S}_m(t_m)], c_m, [\tilde{S}_{m+1}(t_{m+1})]\right)$ and $\left([\tilde{S}_{m+1}(t_{m+1})], c_{m+1}, [\tilde{S}_{m+2}(t_{m+2})]\right)$ are replaced by $\left([\tilde{S}_m(t_m)], c_m, [\tilde{S}_{m+2}(t_{m+2})]\right)$. Again, this is a contradiction because $\tilde{G}_\Pi$ must be free of rainbow cycles. Similarly, we can construct a rainbow cycle when $[\tilde{S}_m(t_m+1)] = [\tilde{S}_{m+1}(t_{m+1}+1)]$ holds for more than one robot.
The length of the rainbow cycle decreases by the number of robots for which the previous condition holds. In the extreme case, assume $[\tilde{S}_n(t_n+1)] = [\tilde{S}_{n+1}(t_{n+1}+1)]$ for all $n \in \{1, \ldots, K\}$. In this case, we can no longer construct a rainbow cycle as before. However, this case implies that all robots are insatiable with $\left(B_n(S_n(t_n)),\, B_n(S_n(t_n+1))\right)$, where $[S_n(t_n+1)] = [S_m(t_m+1)]$ for each pair $(m, n)$. In such a case, the robot $r_p$ with the highest priority would obtain all the bottles in $B_p(S_p^2)$ due to R′5 (3a), and start drinking. Once $r_p$ no longer needs the bottles in $[S_p(t_p+1)]$, the robot with the second highest priority would start drinking. This pattern repeats until no robots are left waiting, which contradicts the assumption that there was a deadlock. Therefore, a deadlock configuration cannot be reached and (iii) cannot be a reason for a deadlock.

We now relax the assumption that all robots are initialized at a free cell. To do so, we “modify” all paths by prepending a virtual free cell. That is, all robots are initialized at a virtual free cell, which does not exist physically, and the next cell in a robot’s path is its original initial cell. Theorem 1 assumes that the initial drinking sessions are disjoint for each robot, i.e., $S_m(0) \cap S_n(0) = \emptyset$ for all $m \neq n$. Since the initial drinking sessions are disjoint, all robots whose initial cell is a shared cell can immediately start drinking. As a result, all of those robots can immediately “virtually move” into their original initial cells. All other robots, whose initial cells are free, can also move to their original initial cells immediately. Therefore, the assumption that all robots are initialized at a free cell is not restrictive.

Finally, we relax the assumption that each robot moves towards a free cell. Theorem 1 requires each path to have at least one free cell.
Then, up until reaching the final free cell, the assumption of moving towards a free cell is not restrictive. We know that deadlocks are prevented under this condition; therefore all robots are at least guaranteed to reach the final free cell in their paths. Theorem 1 also requires that the final drinking sessions are disjoint. Therefore, all robots would eventually be able to start drinking and reach their final locations.

Deadlocks occur when a subset of robots that have not reached their final cells choose the STOP action indefinitely. A robot chooses the STOP action only under three conditions, and we showed that none of these conditions can cause a deadlock. Thus, Algorithm 3 is deadlock-free. ◻
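
The proof hinges on the graph $\tilde{G}_\Pi$ containing no rainbow cycle. The following Python sketch illustrates what such a check could look like on a small edge-colored directed graph. It is not the paper's Algorithm 2: the function name `find_rainbow_cycle`, the `(source, color, target)` edge encoding, and the reading of "rainbow cycle" as a directed cycle in which no edge color (robot) appears twice are all assumptions made for illustration. Self-loops are skipped, and the brute-force search is only meant for small graphs.

```python
from typing import Dict, FrozenSet, Hashable, Iterable, List, Optional, Tuple

# Hypothetical encoding: (source node, color, target node).
# Nodes stand for equivalence classes of drinking sessions, colors for robots.
Edge = Tuple[Hashable, Hashable, Hashable]


def find_rainbow_cycle(edges: Iterable[Edge]) -> Optional[List[Edge]]:
    """Return one directed cycle whose edges all have distinct colors, or None."""
    out: Dict[Hashable, List[Tuple[Hashable, Hashable]]] = {}
    for u, color, v in edges:
        out.setdefault(u, []).append((color, v))

    def extend(start, node, used: FrozenSet, visited: FrozenSet, path: List[Edge]):
        for color, nxt in out.get(node, []):
            if color in used:
                continue  # a color may appear at most once in a rainbow cycle
            step = path + [(node, color, nxt)]
            if nxt == start and path:
                return step  # closed a cycle with at least two edges
            if nxt in visited:
                continue  # keep the cycle simple; this also skips self-loops
            found = extend(start, nxt, used | {color}, visited | {nxt}, step)
            if found:
                return found
        return None

    for start in out:
        cycle = extend(start, start, frozenset(), frozenset([start]), [])
        if cycle:
            return cycle
    return None


# Three robots waiting circularly on sessions A, B, C: a rainbow cycle exists.
circular_wait = [("A", "r1", "B"), ("B", "r2", "C"), ("C", "r3", "A")]
print(find_rainbow_cycle(circular_wait))

# A cycle that reuses a single robot's color is not a rainbow cycle.
print(find_rainbow_cycle([("A", "r1", "B"), ("B", "r1", "A")]))
```

In the first example each robot holds one session and waits for the next, which mirrors the circular wait ruled out in case (iii); the second example returns `None` because both edges belong to the same robot.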
