The Multi-AMR Buffer Storage, Retrieval, and Reshuffling Problem: Exact and Heuristic Approaches

Max Disselnmeyer∗, Thomas Bömer†, Laura Dörr‡, Bastian Amberg§, Anne Meyer¶

Abstract

Buffer zones are essential in production systems to decouple sequential processes. In dense floor storage environments, such as space-constrained brownfield facilities, manual operation is increasingly challenged by severe labor shortages and rising operational costs. Automating these zones requires solving the Buffer Storage, Retrieval, and Reshuffling Problem (BSRRP). While previous work has addressed scenarios in which the focus is limited to reshuffling and retrieving a fixed set of items, real-world manufacturing necessitates an adaptive approach that also incorporates arriving unit loads. This paper introduces the Multi-AMR BSRRP, coordinating a robot fleet to manage concurrent reshuffling, alongside time-windowed storage and retrieval tasks, within a shared floor area. We formulate a Binary Integer Programming (IP) model to obtain exact solutions for benchmarking purposes. As the problem is NP-hard, rendering exact methods computationally intractable for industrial scales, we propose a hierarchical heuristic. This approach decomposes the problem into an A* search for task-level sequence planning of unit load placements and a Constraint Programming (CP) approach for multi-robot coordination and scheduling. Experiments demonstrate orders-of-magnitude computation time reductions compared to the exact formulation. These results confirm the heuristic's viability as responsive control logic for high-density production environments.

Keywords: Autonomous Mobile Robots (AMR), Buffer Management, Integer Programming, Production Logistics, Multi-Robot Coordination.
∗ Karlsruhe Institute of Technology, Zirkel 2, 76131 Karlsruhe, Germany, max.disselnmeyer@kit.edu, ORCID: https://orcid.org/0009-0008-5689-2235
† Karlsruhe Institute of Technology, Zirkel 2, 76131 Karlsruhe, Germany, thomas.boemer@kit.edu, ORCID: https://orcid.org/0000-0003-4979-7455
‡ Karlsruhe Institute of Technology, Zirkel 2, 76131 Karlsruhe, Germany, laura.doerr@kit.edu, ORCID: https://orcid.org/0000-0002-8007-1815
§ Karlsruhe Institute of Technology, Zirkel 2, 76131 Karlsruhe, Germany, bastian.amberg@kit.edu, ORCID: https://orcid.org/0000-0001-6715-3819
¶ Karlsruhe Institute of Technology, Zirkel 2, 76131 Karlsruhe, Germany, anne.meyer@kit.edu, ORCID: https://orcid.org/0000-0001-6380-1348

Contents

1 Introduction
2 Related Work
  2.1 Foundations and Related Problems
  2.2 Extensions for Modeling the Buffer Storage, Retrieval, and Reshuffling Problem
  2.3 Heuristic and Learning-Based Approaches
  2.4 Research Gap
3 The Multi-AMR Buffer Storage, Retrieval, and Reshuffling Problem
  3.1 Static Lanes and Graph Overlay
  3.2 Model Simplifications
  3.3 Objective Function and Model Flexibility
4 Exact Problem Formulation
  4.1 Notation
  4.2 Decision Variables
  4.3 Objective Function
  4.4 Constraints
  4.5 Computational Complexity
5 Heuristic Approach
  5.1 Priority Queue Generation
  5.2 Operation Sequencing via A* Search
    5.2.1 Direct Retrieval Pruning
    5.2.2 State Space and Transitions
    5.2.3 Cost Function
  5.3 Multi-AMR Scheduling via CP-SAT
    5.3.1 Model Formulation
    5.3.2 Objective Function
  5.4 Trajectory Repair
6 Computational Experiments
  6.1 Experimental Design and Dataset
    6.1.1 Instance Generation and Parameters
    6.1.2 Computational Environment
    6.1.3 Exact Formulation Experiments and Benchmark Set
  6.2 Quantitative Benchmarking against Exact Formulation
    6.2.1 Quantitative Analysis
  6.3 Qualitative Analysis of Solution Behavior
    6.3.1 Reactive Conflict Resolution (8 × 3 Layout)
    6.3.2 Spatial Strategies in Brownfield Layouts
    6.3.3 Spatial Decision Policy
  6.4 Layout Sensitivity and Saturation Points
  6.5 Managerial Insights
    6.5.1 Layout Fragmentation and Access Efficiency
    6.5.2 Capacity-Adaptive Breathing Topology
    6.5.3 The 90% Stability Threshold
    6.5.4 Operational Robustness and Fleet Scalability
7 Conclusion and Future Work
A Multi-AMR Buffer Storage, Retrieval, and Reshuffling Problem - Complete Model
B NP-hardness of the BSRRP Problem
  B.1 Problem Definitions
  B.2 Polynomial-Time Reduction
  B.3 Proof of Correctness
  B.4 Polynomial-Time Reduction Complexity
  B.5 Conclusion

1 Introduction

The demand for automation in production supply and intralogistics, especially for autonomous mobile robots (AMR), is growing to optimize material flow and address critical labor shortages (Descartes Systems Group, 2023; Pytel et al., 2021). These challenges are amplified in brownfield manufacturing facilities, which, despite benefits regarding sustainability and land availability, present hurdles such as spatial limitations and legacy infrastructure (Andulkar, Le, & Berger, 2018). These constraints force the use of dense storage in the form of deep floor-level block layouts, where accessibility is restricted. AMRs offer greater flexibility than traditional Automated Guided Vehicles (AGVs) or forklifts, significantly improving material flow and productivity (Fragapane, de Koster, Sgarbossa, & Strandhagen, 2021). According to Grand View Research (2025), the AMR market is expected to grow by 14.4% from 2026 to 2033.
Efficient management of buffer zones (temporary storage areas decoupling production stages) is a critical challenge in these space-constrained environments. In manual operation, these zones suffer from labor shortages, operational inefficiencies, and time lost searching for materials. A representative real-world example for automation from the surface coating industry is illustrated in Figure 1, where a buffer zone must be integrated into the irregular residual space surrounding existing machinery.

[Figure 1, panel (a) labels: Buffer slot south access; Buffer slot west access; Buffer slot east access; Source; Sink; High-gloss Coating Machine; Aisles; Access point]

(a) Schematic layout discretizing the residual space around a high-gloss coating machine into static LIFO storage lanes.
(b) View from the source/sink area towards the coating machine (background wall on the left), highlighting the storage density.
(c) AMR designed to autonomously transport the specialized unit loads within constrained aisles.

Figure 1: Real-world buffer scenario from the surface coating industry. Automating material flow in such space-constrained brownfield facilities requires solving the Buffer Storage, Retrieval, and Reshuffling Problem (BSRRP) to ensure continuous machine supply.

Automating buffer zones with AMRs requires solving the Buffer Storage, Retrieval, and Reshuffling Problem (BSRRP). While previous research has established a model for buffer scenarios in which the focus is limited to retrieving and reshuffling a fixed set of unit loads (Disselnmeyer, Bömer, Pfrommer, & Meyer, 2024), real-world manufacturing environments also include storage. They involve an inflow of new unit loads and require the coordination of multiple AMRs within confined spaces, necessitating advanced optimization to avoid collisions and deadlocks while meeting strict retrieval deadlines.
This paper addresses these requirements by extending this retrieval-focused formulation to incorporate storage decisions and adapting it to the Multi-AMR setting. Our contributions are as follows:

• Exact Formulation (EF): We develop a Binary Integer Programming (IP) model for the Multi-AMR BSRRP. It couples storage, retrieval, and reshuffling decisions in dense storage, managing a fleet restricted to perimeter access like conventional forklifts.
• Complexity Analysis: We establish the computational complexity of the BSRRP, referencing a formal proof of NP-hardness via reduction from the Block Relocation Problem (BRP).
• Hierarchical Heuristic: We propose a scalable hierarchical heuristic that combines an A* algorithm for creating storage, retrieval, and reshuffling decisions with a Constraint Programming (CP) formulation for precise multi-AMR scheduling.
• Rigorous Validation: We employ a validation methodology in which heuristic solutions are injected into the exact IP model. This allows us to verify feasibility using a commercial solver and explicitly quantify the optimality gap, providing a ground-truth benchmark for the heuristic's performance.

The remainder of this paper is structured as follows: Section 2 reviews the relevant literature on the Block Relocation Problem and multi-AMR coordination to situate the BSRRP within the current research landscape. In Section 3, we provide a detailed description of the Multi-AMR BSRRP, including the logical subdivision of the storage space and the underlying operational assumptions. Section 4 presents the formal Binary Integer Programming formulation and discusses the computational complexity of the problem. The proposed hierarchical heuristic, combining A* with Constraint Programming, is detailed in Section 5.
Section 6 evaluates the performance of both the exact and heuristic approaches through extensive computational experiments and discusses managerial insights derived from the results. Finally, Section 7 concludes the paper and outlines avenues for future research.

2 Related Work

This section reviews the current state of research and defines the scientific context of the Multi-AMR BSRRP. First, we compare the problem to foundational approaches such as the Block Relocation Problem (BRP) and Multi-Agent Path Finding (MAPF). Second, we outline the specific modeling requirements for autonomous buffers operated by an AMR fleet, focusing on the transition from single-AMR models to integrated multi-AMR systems. Third, we discuss existing heuristics for complex logistics tasks. Finally, we identify the research gaps regarding holistic, real-time control in fragmented environments, which form the basis for the solutions presented in this paper.

2.1 Foundations and Related Problems

Container retrieval subject to strict LIFO constraints is formalized as the Block Relocation Problem (BRP) (Kim & Hong, 2006; Lersteau & Shen, 2022). This problem is closely related to the Pre-Marshalling Problem (PMP) and its intralogistics variant, the Unit-Load Pre-Marshalling Problem (UPMP) (Bömer, Disselnmeyer, & Meyer, 2025; Bömer, Pfrommer, Akizhanov, & Meyer, 2026; Pfrommer, Meyer, & Tierney, 2024). Analogously, the steel industry addresses the Slab Pre-Marshalling Problem (SPMP), which is governed by identical LIFO stacking constraints (Ge, Meng, Liu, Tang, & Zhao, 2020). The BRP serves as a foundation for the Buffer Reshuffling and Retrieval Problem (BRR), which focuses on unit load relocation and retrieval in constrained spaces (Disselnmeyer et al., 2024).
However, the Multi-AMR BSRRP presented in this paper introduces decisive additional challenges: the arrival and storage of new unit loads into the buffer, near real-time decision-making, and the coordination of multiple AMRs. Unlike PMP and UPMP, the BSRRP explicitly couples reshuffling and retrieval tasks with continuous fleet navigation in a shared workspace. While BRP research often assumes single-crane operations (Ji, Guo, Zhu, & Yang, 2015; Tang & Ren, 2010), the BSRRP utilizes AMRs with greater operational flexibility, enabling coordinated navigation within a shared workspace without rigid segmentation.

Furthermore, the BSRRP is distinct from other related optimization problems. It differs from the Yard Crane Scheduling Problem (YCSP) by avoiding crane-specific constraints like non-crossing (Kizilay & Eliiyi, 2021). While space-constrained AGV scheduling (Chen, Tiong, & Chen, 2019) addresses system deadlocks through capacity-aware production scheduling, it typically lacks the active reshuffling logic required for the deep LIFO stacks intrinsic to the BSRRP. The problem also differs from the Parallel Stack Loading Problem (PSLP) by managing continuous evolution rather than just initial placement (Boge & Knust, 2020). Unlike the Storage Location Assignment Problem (SLAP), which provides static optimal snapshots (Charris, Rojas-Reyes, & Montoya-Torres, 2018), the BSRRP manages ongoing reorganization. Additionally, while the Pallet Retrieval and Processing Problem (PRPP) optimizes the interface between transport and processing (Buckow, Goerigk, & Knust, 2025), it neglects the complex internal reshuffling defined by the BSRRP. Finally, unlike Robotic Mobile Fulfillment Systems (RMFS), which often deal with stochastic human picking times (Teck, Dewil, & Vansteenwegen, 2024), the BSRRP focuses on deterministic reshuffling to minimize travel distance.
2.2 Extensions for Modeling the Buffer Storage, Retrieval, and Reshuffling Problem

This study extends previous research on the BRR problem (Disselnmeyer et al., 2024) by incorporating storage operations into the buffer and multi-AMR management. We build upon the foundational BRP model by Borjian, Manshadi, Barnhart, and Jaillet (2015) for storing, retrieving, and reshuffling container stacks in a single-crane yard, adapting it to address key requirements for autonomous buffer zones:

1. Objective Function: Prioritizing the minimization of total AMR travel distance rather than the number of moves, which is the traditional focus of the BRP literature.
2. Unrestricted Relocation: Allowing the relocation of any blocking unit load rather than only those directly blocking the target; the latter restriction is common in the BRP literature.
3. Retrieval Time Windows: Incorporating strict time windows for retrieval to ensure process synchronization.
4. Time-Based Modeling: Utilizing discrete time steps for precise multi-AMR coordination and collision avoidance.
5. Storage Decisions: Accommodating new unit loads requiring placement during ongoing operations, rather than focusing solely on reshuffling and retrieval tasks.

2.3 Heuristic and Learning-Based Approaches

Given the NP-hard nature of BRP variants, exact methods are often intractable for real-time decision-making, prompting the widespread use of heuristics (Kim & Hong, 2006; Lersteau & Shen, 2022). However, the BSRRP extends beyond the purely combinatorial challenge of reshuffling: it necessitates the translation of moves into collision-free trajectories. Consequently, the problem encompasses both pathfinding, commonly solved using A* or Multi-Agent Path Finding (MAPF) techniques (Q. Wang, Veerapaneni, Wu, Li, & Likhachev, 2024), and task allocation, which relates to the Vehicle Routing Problem (VRP) (Archetti, Coelho, Speranza, & Vansteenwegen, 2025; Dantzig & Ramser, 1959).

This complexity highlights the potential of hybrid decomposition approaches. For example, Bömer, Koltermann, Pfrommer, and Meyer (2024) successfully applied a sequential method combining A* search for reshuffling logic with a mixed-integer program for multi-AMR tour planning in the UPMP context. This demonstrates the viability of leveraging well-established heuristics to solve coupled subproblems sequentially. In the broader context of pre-marshalling, recent advances have also employed Monte Carlo Tree Search (MCTS) to effectively manage the cascading chain effects of reshuffling moves (Z. Wang, Zhou, Che, & Gao, 2024).

Most recently, concurrent research has explicitly addressed the intersection of multi-agent pathfinding and movable obstacles in dense environments: Hu, Zhao, and Ren (2025) introduced M-PAMO (MAPF Among Movable Obstacles), utilizing Conflict-Based Search (CBS) to resolve dependencies between agents and movable blockers. Addressing extreme density, Makino and Ito (2025) proposed MAPF-HD (MAPF for high-density environments), utilizing swapping heuristics to manage obstructing agents. Finally, Fu et al. (2026) formalized the Block Rearrangement Problem (BRaP) for dense storage grids. Their approach models the system as a discrete sliding-tile puzzle where agents navigate within the grid to rearrange blocks. Complementing this, Geft, Zhang, Yu, and Bekris (2026) investigate the theoretical limits of relocation-free retrieval under uncertainty. They prove that relocations can be eliminated through robust storage arrangements if specific empty space ratios are maintained.
While these works establish important foundations for dense storage, they assume internal grid accessibility or focus on static sequence optimization. In contrast, the BSRRP addresses scenarios where AMRs are restricted to perimeter access (see Figure 2), due to the characteristics of the layout, unit loads, or AMR fleet, necessitating logic to handle LIFO access constraints from the outside. Furthermore, unlike the focus on minimizing moves between static snapshots found in these approaches, our work addresses the continuous temporal synchronization required for floor-handling AMRs to meet strict time windows.

Figure 2: Conceptual comparison of buffer accessibility: a) Access from the Perimeter, as defined in the BSRRP; b) In-Grid Access, typical for Robotic Mobile Fulfillment Systems (RMFS), where agents navigate within the grid. In the former, AMRs are restricted to the exterior aisles, necessitating strategic reshuffling of obstructing unit loads to access deep LIFO slots.

2.4 Research Gap

Despite the extensive literature on BRP and autonomous logistics, a significant research gap remains regarding the Multi-AMR BSRRP. As summarized in Table 1, existing research tends to isolate specific subproblems, failing to address the interdependent complexities of modern brownfield intralogistics. Specifically, the table categorizes state-of-the-art approaches across key methodological dimensions: handling operations, access mode (physical constraints), fleet configuration, temporal representation, and the solution method. Regarding temporal representation, we distinguish between three levels of abstraction that define when the system state is evaluated:

• Moves treat operations as a logical sequence; the system state is only updated between complete crane or robot movements, effectively ignoring durations.
• Steps partition time into fixed, synchronous intervals (ticks); the entire system state is recalculated at every interval (e.g., every 5 seconds), which is common in grid-based pathfinding but imposes rigid synchronization.
• Continuous representations allow for variable task durations. While implemented on a discrete time grid for computational feasibility, the model treats travel times as distance-dependent parameters rather than fixed steps, enabling precise synchronization with external deadlines.

Table 1: Comparison of related literature in the field of Block Relocation and Buffer Reshuffling.

| Reference | Handling Operations | Access Mode | Fleet | Temporal Representation | Method |
| Caserta, Schwarze, and Voß (2012) | Reshuffling & Retrieval | Perimeter | 1 Crane | Moves | IP |
| Borjian et al. (2015) | Storage, Reshuffling & Retrieval | Perimeter | 1 Crane | Continuous | IP |
| Ji et al. (2015) | Reshuffling & Retrieval | Perimeter | n Cranes | Moves | Genetic Algo. |
| Bömer et al. (2024) | Reshuffling only | Perimeter | n AMRs | Moves | CP |
| Disselnmeyer et al. (2024) | Reshuffling & Retrieval | Perimeter | 1 AMR | Continuous | IP |
| Hu et al. (2025) | Reshuffling only | In-Grid | n AMRs | Steps | CBS Search |
| Fu et al. (2026) | Reshuffling & Retrieval | In-Grid | n AMRs | Steps | Symbolic Planning |
| This Paper | Storage, Reshuffling & Retrieval | Perimeter | n AMRs | Continuous | IP + Heuristic |

Temporal Representation: Moves (abstract sequence), Steps (discrete synchronous), Continuous (variable durations).

Based on this comparison, we identify three specific gaps in the current body of work:

1. Lack of Integrated Models: Most existing approaches address only fragments of the problem. For instance, the BRR model (Disselnmeyer et al., 2024) effectively optimizes the reshuffling and retrieval of a fixed set of unit loads but neglects the storage of unit loads arriving from upstream processes. Conversely, literature on the dynamic BRP (e.g., Borjian et al. (2015)) accounts for incoming containers but typically minimizes the number of moves for a single crane.
This metric is insufficient for AMR fleets, where minimizing travel distance and execution time is critical to meeting strict retrieval deadlines in a spatially distributed buffer. Consequently, there is no model that integrates the conflicting objectives of storage, retrieval, and reshuffling into a single formulation.

2. Insufficient Multi-AMR Coordination for Perimeter Access: While the coordination of multiple agents is central to MAPF, scalability remains a hurdle in dense, interactive environments (Hua, Wang, & Ji, 2024; Q. Wang et al., 2024). Standard MAPF ignores the manipulation of unit loads and focuses solely on computing collision-free paths for the agents. Conversely, multi-crane BRP literature addresses manipulation but relies on zoning strategies or rail-bound constraints (Ji et al., 2015) that do not apply to flexible AMRs. Furthermore, recent concurrent studies on Block Rearrangement or Movable Obstacles (Fu et al., 2026; Hu et al., 2025) model the problem as a discrete sliding-tile puzzle where agents navigate within the grid (grid-based traffic control). This abstraction differs fundamentally from the BSRRP, where AMRs access the buffer from the perimeter. There is currently no model that couples the combinatorial complexity of BRP reshuffling with the coordinated traffic management of a multi-AMR fleet required to meet strict service time windows.

3. Absence of Fleet-Aware Heuristics: Addressing the BSRRP requires tightly integrating storage assignment, reshuffling logic, retrieval sequencing, and vehicle routing. Solving these problems in isolation is known to yield suboptimal results, as the interdependencies between subproblems are lost (ElWakil, Eltawil, & Gheith, 2022).
While heuristics exist for different BRP variants (see the survey by Lersteau & Shen, 2022), there is a lack of scalable approaches specifically designed to couple the combinatorial storage, retrieval, and reshuffling decisions with the spatio-temporal constraints of multi-AMR fleet routing.

3 The Multi-AMR Buffer Storage, Retrieval, and Reshuffling Problem

This section formally defines the Multi-AMR Buffer Storage, Retrieval, and Reshuffling Problem (BSRRP). We describe the physical storage environment, introduce the concept of Static Lanes as a static graph overlay to manage accessibility, and detail the objective function used to coordinate the fleet.

3.1 Static Lanes and Graph Overlay

The buffer consists of a continuous floor area where unit loads are stacked directly on the ground. To ensure accessibility and prevent deadlocks in this dense environment, we overlay a fixed graph structure onto the continuous space. Following the approach of Pfrommer, Meyer, and Tierney (2022), we partition the storage area into a set of Static Lanes, denoted as $\mathcal{I}$.

Figure 3 illustrates this concept using a 5 × 5 buffer layout. As shown in Figure 3a, the floor is first discretized into a grid of storage slots. In a second step (Figure 3b), these slots are grouped into Static Lanes based on their accessibility from the perimeter aisles.

[Figure 3 panels: (a) Physical Grid Layout (5 × 5); (b) Static Lane Overlay]

Figure 3: Logical decomposition of the storage area. (a) The continuous floor is discretized into slots. (b) These slots are grouped into Static Lanes (colored), where each lane acts as a LIFO stack accessible from the perimeter. The grey numbered rectangles represent the unit loads.

This partitioning serves three critical operational purposes:

1. LIFO Constraints: Each static lane $i \in \mathcal{I}$ functions as a Last-In-First-Out (LIFO) stack. As depicted in Figure 3b, a lane consists of a sequence of floor slots.
To retrieve a unit load stored deep within a lane $i$, all blocking unit loads placed in front of it (i.e., closer to the aisle) must first be reshuffled to empty slots in other lanes $k \in \mathcal{I} \setminus \{i\}$.

2. Deadlock Prevention: A major challenge in multi-AMR systems on dense grids is congestion. To prevent circular deadlocks, we enforce a strict resource locking mechanism: only one AMR may enter a lane at any given time. If multiple AMRs were to enter a single lane $i$ simultaneously, the leading robot would be blocked by the following one, causing a deadlock. Consequently, a lane $i \in \mathcal{I}$ is locked by an AMR during entry and exit operations.

3. Static Lane Configuration: While the allocation of unit loads to slots is dynamic, the lane layout itself (the set $\mathcal{I}$) is static. For the instances analyzed in this study, the lane configuration is pre-computed (e.g., using a maximum-flow network formulation (Pfrommer et al., 2022)) to maximize storage density while ensuring connectivity. Once generated, this layout remains fixed throughout the runtime.

3.2 Model Simplifications

To maintain computational tractability while capturing the core dynamics of the system, we apply the following abstractions:

• Kinematics: We assume constant AMR velocity and negligible acceleration/deceleration phases. This allows us to model travel time as a linear function of distance on the graph.
• Handling Times: Unit load handling (picking up and setting down) is modeled with constant duration, assuming standardized load carriers.
• Idealized Environment: We assume a deterministic environment without disruptions (e.g., human traffic or breakdowns), focusing the optimization on the logical coordination of the fleet.

3.3 Objective Function and Model Flexibility

The primary objective of the BSRRP is to minimize the total distance traveled by the AMR fleet while strictly satisfying all service time windows.
Given the set of unit loads $\mathcal{N}$ designated for retrieval, where each load $n \in \mathcal{N}$ has a release time $r_n$ and a deadline $d_n$ (derived from the retrieval window), the solver must determine a sequence of storage, reshuffling, and retrieval moves such that:

1. Every retrieval job is completed within its time window $[r_n, d_n]$.
2. No static lane constraints (LIFO, Capacity, Exclusive Access) are violated.
3. The sum of distances for all loaded and empty AMR moves is minimized.

The cumulative distance serves as a robust proxy for efficiency, reducing energy consumption and hardware wear. Furthermore, by strictly enforcing time window constraints, the model acts as a validation tool for strategic planning: if the operational requirements exceed the fleet's capacity given the layout, the model returns an infeasibility status, signaling the need for layout adjustments or fleet expansion.

4 Exact Problem Formulation

We formulate the Multi-AMR Buffer Storage, Retrieval, and Reshuffling Problem (BSRRP) as a binary integer program, referred to as the Exact Formulation (EF). This model builds directly upon the Buffer Reshuffling and Retrieval formulation introduced by Disselnmeyer et al. (2024). We extend this model to a multi-AMR setting by incorporating unit load arrivals (conceptually following the dynamic BRP formulation by Borjian et al. (2015)) while adding explicit constraints for collision-free fleet coordination and floor-level maneuvering.

4.1 Notation

Let $\mathcal{N} = \{1, \dots, N\}$ be the set of unit loads, $\mathcal{I} = \{0, \dots, I\}$ the set of lanes, and $\mathcal{T} = \{1, \dots, T\}$ the set of time steps. The fleet of AMRs is denoted by $\mathcal{V} = \{1, \dots, V\}$. A specific storage location is defined as $[i, j]$ for lane $i \in \mathcal{I}$ and depth position $j \in \mathcal{J}_i = \{1, \dots, J_i\}$.
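
To make the slot notation $[i, j]$ and the LIFO behavior of a lane concrete, the following minimal Python sketch models a single static lane as a stack. It assumes the depth convention from the notation table of this section ($j = 1$ is the deepest slot). The class `Lane` and its methods `store` and `blockers` are hypothetical names introduced only for illustration; they are not part of the formulation.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Lane:
    """One static lane, modeled as a LIFO stack (illustrative helper, not from the paper).

    loads[0] sits at depth position j = 1 (deepest slot);
    the last element is the load closest to the aisle.
    """
    lane_id: int
    capacity: int
    loads: List[int] = field(default_factory=list)

    def store(self, load: int) -> int:
        """Place a load in the next free slot and return its depth position j."""
        if len(self.loads) >= self.capacity:
            raise ValueError(f"lane {self.lane_id} is full")
        self.loads.append(load)
        return len(self.loads)

    def blockers(self, load: int) -> List[int]:
        """Loads in front of `load`, in the order they would have to be reshuffled."""
        index = self.loads.index(load)
        # Walk from the aisle side back towards the requested load (exclusive).
        return self.loads[:index:-1]


lane = Lane(lane_id=1, capacity=3)
for unit_load in (7, 4, 1):      # 7 ends up deepest, 1 closest to the aisle
    lane.store(unit_load)
print(lane.blockers(7))          # [1, 4]: both loads block access to load 7
print(lane.blockers(1))          # []: load 1 is directly accessible
```

The sketch only captures accessibility within one lane; where the blocking loads are moved to, and by which AMR, is exactly what the decision variables of the exact formulation determine.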

The planning horizon $T$ is the latest closing time of any arrival window $[a_n, a_n + \alpha_n]$ or retrieval window $[r_n, r_n + \rho_n]$, where $a_n$ and $r_n$ denote the window start times and $\alpha_n$ and $\rho_n$ the window lengths:

$$T = \max_{n, m \in \mathcal{N}} \{\, a_n + \alpha_n,\; r_m + \rho_m \,\} \tag{4.1}$$

The parameter $\tau_{ijkl}$ represents the time cost to move between slots. It defaults to the distance $d_{ijkl}$, but includes the handling time $h$ for loaded moves (one pick-up and one drop-off): $\tau_{ijkl} = \max(1, d_{ijkl} + 2h)$. We enforce a lower bound of 1 to ensure that every action consumes time. This prevents the solver from scheduling instantaneous zero-cost idle loops, ensuring that waiting explicitly advances the system state. Table 2 summarizes the sets, indices, and parameters.

Table 2: Sets, indices, and parameters

| Notation | Description |
| $\mathcal{N}, \mathcal{V}, \mathcal{T}$ | Sets of unit loads, AMRs, and time steps. |
| $\mathcal{I}, \mathcal{I}'$ | Set of all lanes (incl. I/O); set of all buffer lanes (excl. I/O). |
| $[i, j]$ | Slot at lane $i$ and position $j$ ($j = 1$ is deepest in the lane). |
| $d_{ijkl}$ | Distance between slots $[i, j]$ and $[k, l]$. |
| $\tau_{ijkl}$ | Travel and handling time (cost) between slots $[i, j]$ and $[k, l]$. |
| $h$ | Handling time for picking up or dropping a unit load. |
| $[a_n, a_n + \alpha_n]$ | Arrival time window for unit load $n$. |
| $[r_n, r_n + \rho_n]$ | Retrieval time window for unit load $n$. |

4.2 Decision Variables

We classify the binary decision variables into AMR actions and system state indicators. All variables are defined as 1 if the condition holds and 0 otherwise. First, AMR decision variables determine the specific tasks started by each vehicle $v$ at time $t$. We define variables for reshufflings ($x$), retrievals ($y$), and the storage of new arrivals ($z$). Additionally, the variable $e$ explicitly models empty travelling to accurately track the positions of the AMRs. Second, state variables are required to track the physical configuration of the system.
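The variable $b$ maps the inventory of unit loads to slots, while $c$ tracks the spatio-temporal position of every AMR to prevent collisions. The auxiliary variables $s$ and $g$ monitor the completion status of storage and retrieval requests, respectively.

AMR Decision Variables: Control the movement and handling tasks of each vehicle $v$.

\[ x^{ijkl}_{ntv} = 1 \text{ if AMR } v \text{ relocates } n \text{ from } [i,j] \text{ to } [k,l] \text{ at } t, \quad \forall i, k \in \mathcal{I}',\ \forall j \in \mathcal{J}_i,\ \forall l \in \mathcal{J}_k,\ \forall n \in \mathcal{N},\ \forall t \in \mathcal{T},\ \forall v \in \mathcal{V} \tag{4.2} \]

\[ y^{ij}_{ntv} = 1 \text{ if AMR } v \text{ retrieves } n \text{ from } [i,j] \text{ at } t, \quad \forall i \in \mathcal{I} \setminus \{I\},\ \forall j \in \mathcal{J}_i,\ \forall n \in \mathcal{N},\ \forall t \in \mathcal{T},\ \forall v \in \mathcal{V} \tag{4.3} \]

\[ z^{ij}_{ntv} = 1 \text{ if AMR } v \text{ stores } n \text{ into } [i,j] \text{ at } t, \quad \forall i \in \mathcal{I}',\ \forall j \in \mathcal{J}_i,\ \forall n \in \mathcal{N},\ \forall t \in \mathcal{T},\ \forall v \in \mathcal{V} \tag{4.4} \]

\[ e^{ijkl}_{tv} = 1 \text{ if AMR } v \text{ performs an empty drive from } [i,j] \text{ to } [k,l] \text{ at } t, \quad \forall i, k \in \mathcal{I},\ \forall j \in \mathcal{J}_i,\ \forall l \in \mathcal{J}_k,\ \forall t \in \mathcal{T},\ \forall v \in \mathcal{V} \tag{4.5} \]

State Variables: Track the location and completion status of loads and vehicles.

\[ b^{ij}_{nt} = 1 \text{ if unit load } n \text{ is in slot } [i,j] \text{ at time } t, \quad \forall i \in \mathcal{I}',\ \forall j \in \mathcal{J}_i,\ \forall n \in \mathcal{N},\ \forall t \in \mathcal{T} \tag{4.6} \]

\[ c^{ij}_{tv} = 1 \text{ if AMR } v \text{ is present at } [i,j] \text{ at time } t, \quad \forall i \in \mathcal{I},\ \forall j \in \mathcal{J}_i,\ \forall t \in \mathcal{T},\ \forall v \in \mathcal{V} \tag{4.7} \]

\[ s_{nt} = 1 \text{ if unit load } n \text{ is stored by time } t - 1, \quad \forall n \in \mathcal{N},\ \forall t \in \mathcal{T} \tag{4.8} \]

\[ g_{nt} = 1 \text{ if unit load } n \text{ is retrieved by time } t - 1, \quad \forall n \in \mathcal{N},\ \forall t \in \mathcal{T} \tag{4.9} \]

4.3 Objective Function

The objective is to minimize the total travel distance of the AMR fleet. Minimizing distance serves as a proxy for energy efficiency and reduces hardware wear. Furthermore, reducing unnecessary travel alleviates congestion, which indirectly aids the throughput of the system. Note that service deadlines are treated as hard constraints to guarantee service levels; therefore, the objective focuses purely on efficiency rather than tardiness penalties.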
The function sums the distances for retrieval ($y$), storage ($z$), reshuffling ($x$), and empty travel ($e$):

\[ \min \sum_{t \in \mathcal{T},\, v \in \mathcal{V}} \left[ \sum_{n \in \mathcal{N}} \sum_{\substack{i \in \mathcal{I} \\ j \in \mathcal{J}_i}} \left( y^{ij}_{ntv}\, d_{ijI1} + z^{ij}_{ntv}\, d_{01ij} \right) + \sum_{\substack{i, k \in \mathcal{I} \\ j \in \mathcal{J}_i \\ l \in \mathcal{J}_k}} d_{ijkl} \left( e^{ijkl}_{tv} + \sum_{n \in \mathcal{N}} x^{ijkl}_{ntv} \right) \right] \tag{4.10} \]

4.4 Constraints

The feasible region is governed by flow conservation, physical consistency, time windows, and traffic rules.

Flow Conservation & State Updates. We explicitly model the system dynamics using four flow conservation constraints that link the discrete AMR actions to the state of the unit loads and vehicles. Constraints (4.11) define the storage status $s_{nt}$. A unit load $n$ is considered stored at time $t$ if it was stored initially ($s_{n1}$) or if a storage action $z$ transporting it from the source has been completed. This summation explicitly accounts for the travel time $\tau_{01ij}$, ensuring the status only updates after the transport is finished. Constraints (4.12) analogously track the retrieval status $g_{nt}$. This variable is updated to 1 if a retrieval action $y$ has been initiated for unit load $n$ at any previous time step. Unlike storage, this updates at the start of the action to prevent the load from being accessed again. Constraints (4.13) govern the slot occupancy $b^{ij}_{nt}$ for every buffer slot $[i, j]$. The state at time $t$ is determined by the state at $t - 1$, plus any unit loads arriving via reshuffling ($x$) or new storage ($z$) after their respective travel times, minus any loads leaving the slot due to reshuffling ($x$) or retrieval ($y$). Finally, Constraints (4.14) ensure spatio-temporal continuity for the mobile robots. The AMR position variable $c^{ij}_{tv}$ tracks whether the vehicle $v$ is in the slot $[i, j]$ at time $t$.
The constraint updates the position by adding vehicles arriving from storage ($z$), retrieval ($y$), reshuffling ($x$), or empty travel ($e$) tasks, accounting for the specific travel duration $\tau$ of each, and subtracting vehicles that depart to initiate these tasks.

\[ s_{nt} = s_{n1} + \sum_{i \in \mathcal{I}'} \sum_{j \in \mathcal{J}_i} \sum_{t'=1}^{t - \tau_{01ij}} \sum_{v \in \mathcal{V}} z^{ij}_{nt'v} \quad \forall n \in \mathcal{N},\ \forall t \in \mathcal{T} \tag{4.11} \]

\[ g_{nt} = \sum_{i \in \mathcal{I} \setminus \{I\}} \sum_{j \in \mathcal{J}_i} \sum_{t'=1}^{t-1} \sum_{v \in \mathcal{V}} y^{ij}_{nt'v} \quad \forall n \in \mathcal{N},\ \forall t \in \mathcal{T} \tag{4.12} \]

\[ b^{ij}_{nt} = b^{ij}_{n(t-1)} + \sum_{v \in \mathcal{V}} \left[ \sum_{k \in \mathcal{I}'} \sum_{l \in \mathcal{J}_k} \left( x^{klij}_{n(t - \tau_{klij})v} - x^{ijkl}_{n(t-1)v} \right) - y^{ij}_{n(t-1)v} + z^{ij}_{n(t - \tau_{01ij})v} \right] \quad \forall i \in \mathcal{I}',\ \forall j \in \mathcal{J}_i,\ \forall n \in \mathcal{N},\ \forall t \in \mathcal{T} \setminus \{1\} \tag{4.13} \]

\[ \begin{aligned} c^{ij}_{tv} = {} & c^{ij}_{(t-1)v} + \sum_{n \in \mathcal{N}} z^{ij}_{n(t - \tau_{01ij})v} - \sum_{n \in \mathcal{N}} y^{ij}_{n(t-1)v} \\ & + \sum_{k \in \mathcal{I}'} \sum_{l \in \mathcal{J}_k} \sum_{n \in \mathcal{N}} x^{klij}_{n(t - \tau_{klij})v} + \sum_{k \in \mathcal{I}} \sum_{l \in \mathcal{J}_k} e^{klij}_{(t - \tau_{klij})v} \\ & - \sum_{k \in \mathcal{I}'} \sum_{l \in \mathcal{J}_k} \sum_{n \in \mathcal{N}} x^{ijkl}_{n(t-1)v} - \sum_{k \in \mathcal{I}} \sum_{l \in \mathcal{J}_k} e^{ijkl}_{(t-1)v} \end{aligned} \quad \forall i \in \mathcal{I}',\ \forall j \in \mathcal{J}_i,\ \forall t \in \mathcal{T} \setminus \{1\},\ \forall v \in \mathcal{V} \tag{4.14} \]

Note: Eq. (4.14) defines the motion for storage lanes $\mathcal{I}'$; analogous conservation constraints apply for the Source ($c^{01}_{tv}$) and Sink ($c^{I1}_{tv}$) nodes.

Initialization. To accurately simulate the system's evolution, we must define its starting state at $t = 1$ based on the input instance. Constraints (4.15) map the initial slot occupancy, setting the state variable $b^{ij}_{n1}$ to 1 if unit load $n$ occupies slot $[i, j]$ at the start of the planning horizon. Constraints (4.16) distinguish between initially stored unit loads and future arrivals; the storage status $s_{n1}$ is initialized to 1 for loads already present in the buffer, and 0 for those arriving at later time steps. Finally, Constraints (4.17) establish the initial AMR positions, assigning each AMR $v$ to its designated starting coordinates $[i, j]$ by setting the position variable $c^{ij}_{1v}$ accordingly.
\[ b^{ij}_{n1} = \begin{cases} 1, & \text{if unit load } n \text{ starts in slot } [i, j] \\ 0, & \text{otherwise} \end{cases} \quad \forall i \in \mathcal{I}',\ \forall j \in \mathcal{J}_i,\ \forall n \in \mathcal{N} \tag{4.15} \]

\[ s_{n1} = \begin{cases} 1, & \text{if unit load } n \text{ is initially stored in the buffer} \\ 0, & \text{otherwise} \end{cases} \quad \forall n \in \mathcal{N} \tag{4.16} \]

\[ c^{ij}_{1v} = \begin{cases} 1, & \text{if AMR } v \text{ starts in slot } [i, j] \\ 0, & \text{otherwise} \end{cases} \quad \forall i \in \mathcal{I},\ \forall j \in \mathcal{J}_i,\ \forall v \in \mathcal{V} \tag{4.17} \]

System Consistency. We ensure spatial and operational integrity through a set of constraints governing capacity, lane structure, and object permanence. Constraints (4.18) limit the capacity of each storage slot $[i, j]$, ensuring it holds at most one unit load at any given time $t$. Constraints (4.19) mandate a dense storage policy to ensure gapless lane utilization; a unit load may only occupy the outer position $j + 1$ if the adjacent inner position $j$ is also occupied. This ensures that the lanes are filled continuously, reflecting the physical constraints of floor block storage. Constraints (4.20) ensure that an AMR $v$ can only initiate a retrieval ($y$) or reshuffling ($x$) for a unit load $n$ from slot $[i, j]$ if that load is present ($b^{ij}_{nt} = 1$). Finally, Constraints (4.21) link the vehicle's actions to its physical location. The sum of all tasks (reshuffling $x$, empty travel $e$, and retrieval $y$) initiated by AMR $v$ at slot $[i, j]$ is bounded by the presence variable $c^{ij}_{tv}$, ensuring a robot must be at a location to operate from it.

\[ \sum_{n \in \mathcal{N}} b^{ij}_{nt} \le 1 \quad \forall i \in \mathcal{I}',\ \forall j \in \mathcal{J}_i,\ \forall t \in \mathcal{T} \tag{4.18} \]

\[ \sum_{n \in \mathcal{N}} b^{ij}_{nt} \ge \sum_{n \in \mathcal{N}} b^{i(j+1)}_{nt} \quad \forall i \in \mathcal{I}',\ \forall j \in \mathcal{J}_i \setminus \{J_i\},\ \forall t \in \mathcal{T} \tag{4.19} \]

\[ \sum_{v \in \mathcal{V}} \left( y^{ij}_{ntv} + \sum_{k \in \mathcal{I}'} \sum_{l \in \mathcal{J}_k} x^{ijkl}_{ntv} \right) \le b^{ij}_{nt} \quad \forall i \in \mathcal{I}',\ \forall j \in \mathcal{J}_i,\ \forall n \in \mathcal{N},\ \forall t \in \mathcal{T} \tag{4.20} \]

\[ \sum_{k \in \mathcal{I}'} \sum_{l \in \mathcal{J}_k} \sum_{n \in \mathcal{N}} x^{ijkl}_{ntv} + \sum_{k \in \mathcal{I}} \sum_{l \in \mathcal{J}_k} e^{ijkl}_{tv} + \sum_{n \in \mathcal{N}} y^{ij}_{ntv} \le c^{ij}_{tv} \quad \forall i \in \mathcal{I}',\ \forall j \in \mathcal{J}_i,\ \forall t \in \mathcal{T},\ \forall v \in \mathcal{V} \tag{4.21} \]

Note: Constraints (4.21) restrict actions in buffer lanes; separate presence constraints govern the Source (for $z$, $y$, $e$) and Sink (for $e$).
For brevity, we omit the explicit formulation and refer to the complete model in Appendix A.

Time Windows. We model time windows as hard constraints, rendering any solution that misses a deadline infeasible. Constraints (4.22) mandate that every unit load $n$ is successfully retrieved exactly once within its designated window $[r_n, r_n + \rho_n]$. Since the deadline applies to the arrival at the Sink $[I, 1]$, the valid start time for a retrieval action $y^{ij}_{ntv}$ from a specific slot $[i, j]$ is shifted earlier by the travel time $\tau_{ijI1}$. Constraints (4.23) explicitly forbid retrieval actions outside this valid interval, preventing premature or tardy deliveries.

\[ \sum_{i \in \mathcal{I} \setminus \{I\}} \sum_{j \in \mathcal{J}_i} \sum_{t = r_n - \tau_{ijI1}}^{r_n + \rho_n - \tau_{ijI1}} \sum_{v \in \mathcal{V}} y^{ij}_{ntv} = 1 \quad \forall n \in \mathcal{N} \tag{4.22} \]

\[ \sum_{i \in \mathcal{I} \setminus \{I\}} \sum_{j \in \mathcal{J}_i} \left( \sum_{t=1}^{r_n - \tau_{ijI1} - 1} \sum_{v \in \mathcal{V}} y^{ij}_{ntv} + \sum_{t = r_n + \rho_n - \tau_{ijI1} + 1}^{T} \sum_{v \in \mathcal{V}} y^{ij}_{ntv} \right) = 0 \quad \forall n \in \mathcal{N} \tag{4.23} \]

Note: Constraints (4.22) and (4.23) specifically govern retrieval windows. For brevity, we omit the explicit storage constraints, as they are mathematically symmetric to the retrieval case: they restrict the storage actions $z^{ij}_{ntv}$ at the Source to start in the arrival window $[a_n, a_n + \alpha_n]$ (see Appendix A).

Constraint (4.24) governs the entry logic for incoming unit loads, enforcing that each load $n$ is either stored in the buffer or directly cross-docked. The first term handles standard storage, ensuring the action $z$ occurs during the arrival window $[a_n, a_n + \alpha_n]$. The second term enables direct retrieval from the Source, which must satisfy two simultaneous conditions: the load must be available ($t \le a_n + \alpha_n$) and the resulting delivery to the Sink must meet the retrieval deadline ($t \ge r_n - \tau_{01I1}$). The upper bound of the summation enforces the tighter of these two limiting factors.
\[ \sum_{i \in \mathcal{I}'} \sum_{j \in \mathcal{J}_i} \sum_{t = a_n}^{a_n + \alpha_n} \sum_{v \in \mathcal{V}} z^{ij}_{ntv} + \sum_{t = r_n - \tau_{01I1}}^{\min(r_n + \rho_n - \tau_{01I1},\; a_n + \alpha_n)} \sum_{v \in \mathcal{V}} y^{01}_{ntv} = 1 \quad \forall n \in \mathcal{N} \tag{4.24} \]

Traffic Control. We implement traffic rules by modeling each static lane $i$ as a unary resource that can accommodate at most one vehicle at any given time $t$. To formally capture lane occupancy during multi-step transitions, we define the set of relevant start times $\Omega(t, \tau) = \{ t' \in \mathcal{T} \mid t - \tau < t' \le t \}$. This set identifies all past time steps $t'$ where an action of duration $\tau$ initiated at $t'$ would still be active at the current time $t$.

Constraints (4.25) enforce the unary capacity by aggregating three distinct modes of occupation. The first term accounts for static presence, where an AMR is waiting in the lane ($c^{ij}_{tv}$). The second term captures incoming actions (storage $z$, incoming reshuffling and empty travel $x$, $e$). Since the lane is blocked from the moment an AMR enters, we sum over the full duration window $\Omega(t, \tau)$. The third term handles outgoing actions (retrieval $y$, outgoing reshuffling and empty travel $x$, $e$). To avoid double-counting the vehicle's presence at the instant of departure (which is already captured by $c^{ij}_{tv}$), we sum over the strictly past interval $\Omega^*(t, \tau) = \Omega(t, \tau) \setminus \{t\}$. Collectively, the sum of these binary indicators must not exceed 1, ensuring collision-free operations.

\[ \begin{aligned} \sum_{v \in \mathcal{V}} \Bigg[ & \sum_{j \in \mathcal{J}_i} c^{ij}_{tv} \\ & + \sum_{j \in \mathcal{J}_i} \Bigg( \sum_{n \in \mathcal{N}} \sum_{t' \in \Omega(t,\, \tau_{\text{lane}}(j) + h)} z^{ij}_{nt'v} + \sum_{k \in \mathcal{I}'} \sum_{l \in \mathcal{J}_k} \sum_{n \in \mathcal{N}} \sum_{t' \in \Omega(t,\, \tau_{\text{lane}}(j) + h)} x^{klij}_{nt'v} + \sum_{k \in \mathcal{I}} \sum_{l \in \mathcal{J}_k} \sum_{t' \in \Omega(t,\, \tau_{\text{lane}}(j))} e^{klij}_{t'v} \Bigg) \\ & + \sum_{j \in \mathcal{J}_i} \Bigg( \sum_{n \in \mathcal{N}} \sum_{t' \in \Omega^*(t,\, \tau_{\text{lane}}(j) + h)} y^{ij}_{nt'v} + \sum_{k \in \mathcal{I}'} \sum_{l \in \mathcal{J}_k} \sum_{n \in \mathcal{N}} \sum_{t' \in \Omega^*(t,\, \tau_{\text{lane}}(j) + h)} x^{ijkl}_{nt'v} + \sum_{k \in \mathcal{I}} \sum_{l \in \mathcal{J}_k} \sum_{t' \in \Omega^*(t,\, \tau_{\text{lane}}(j))} e^{ijkl}_{t'v} \Bigg) \Bigg] \le 1 \end{aligned} \quad \forall i \in \mathcal{I}',\ \forall t \in \mathcal{T} \tag{4.25} \]

Finally, Constraints (4.26) enforce the physical accessibility of the block storage.
A slot $[i, j]$ is accessible only if the blocking slot $[i, j + 1]$ is empty. Consequently, any action (reshuffling $x$, retrieval $y$, or empty travel $e$) targeting or originating from $[i, j]$ is forbidden if $b^{i(j+1)}_{nt} = 1$.

\[ \sum_{n \in \mathcal{N}} \left( x^{ijkl}_{ntv} + y^{ij}_{ntv} + e^{ijkl}_{tv} \right) \le 1 - \sum_{n \in \mathcal{N}} b^{i(j+1)}_{nt} \quad \forall i \in \mathcal{I}',\ \forall j \in \mathcal{J}_i \setminus \{J_i\},\ \forall k \in \mathcal{I}',\ \forall l \in \mathcal{J}_k,\ \forall t \in \mathcal{T},\ \forall v \in \mathcal{V} \tag{4.26} \]

4.5 Computational Complexity

The BSRRP generalizes the Block Relocation Problem (BRP), which is known to be NP-hard (Caserta et al., 2012). Consequently, the BSRRP is also NP-hard. This relationship can be established via a polynomial-time reduction where a standard BRP instance is mapped to a restricted BSRRP instance by fixing the fleet size to one ($|\mathcal{V}| = 1$), eliminating arrivals ($a_n = 1$), setting empty travel costs to zero, and normalizing all loaded travel distances to unity ($d_{ijkl} = 1$). Under these conditions, the BSRRP objective of minimizing total travel distance becomes mathematically equivalent to the BRP objective of minimizing the total number of relocations. For a theoretical foundation, we provide a formal proof of this reduction in Appendix B. The proof explicitly constructs the mapping from BRP to BSRRP, demonstrating that intractability persists even under the specific constraints of our proposed model. Given this complexity, the EF is computationally intractable for large-scale instances, necessitating the heuristic approach described in Section 5.

5 Heuristic Approach

The Exact Formulation proposed in Section 4 provides an exact solution but becomes computationally intractable for large-scale instances due to the coupled complexity of multi-agent pathfinding, task scheduling, and the storing, retrieving, and reshuffling of unit loads. To address this, we propose a hierarchical heuristic approach that decomposes the global optimization problem into four sequential, tractable sub-problems.
This approach builds upon the multi-bay sorting strategies proposed by Bömer et al. (2024), extending them to handle the storage and retrieval operations and the multi-AMR constraints of the BSRRP.

[Figure 4 (flowchart): Input (orders $\mathcal{O}$, time windows) → Stage 1: Priority Queue Generation → linearized task sequence → Stage 2: Operation Sequencing via A* Search → transportation task sequence → Stage 3: Multi-AMR Scheduling → schedules for the AMRs → Stage 4: Trajectory Repair → Output: feasible multi-AMR schedules]

Figure 4: Overview of the hierarchical heuristic. The sequential stages transform the input from a linearized task sequence into strict precedence constraints, then into timed assignments, and finally into collision-free trajectories.

As illustrated in the flow chart, the proposed methodology processes the orders through four sequential stages. First, Priority Queue Generation linearizes the asynchronous orders into a strict execution sequence. Second, Operation Sequencing via A* Search uses this sequence to find a sequence of transportation tasks consisting of storage, retrieval, and reshuffling moves. Third, Multi-AMR Scheduling maps these moves to the AMR fleet via Constraint Programming. Finally, Trajectory Repair resolves any remaining spatiotemporal conflicts of the AMR positions using a three-tier priority mechanism to ensure collision-free execution. Table 3 summarizes the notation used throughout the heuristic formulation. In the following, each of the four components is explained.

5.1 Priority Queue Generation

The first stage of the heuristic transforms the set of asynchronous storage and retrieval requests into a linear execution sequence. Let $\mathcal{O}$ denote the complete set of orders, comprising both storage requests $\mathcal{O}_S$ and retrieval requests $\mathcal{O}_R$. Each order $o \in \mathcal{O}$ is constrained by a time window $[a_o, d_o]$, where $a_o$ represents the earliest release time and $d_o$ the hard deadline.
Standard sorting strategies, such as Earliest Due Date (EDD), often fail in scenarios with asynchronous release times, where a task with a late deadline might have a very late release time (a tight window), rendering it more critical than a task with an earlier deadline but a wide window. Executing the task with the wider window first may irreversibly consume temporal resources required by the tighter task immediately upon its release.

Table 3: Notation and parameters used in the heuristic approach

| Symbol | Description |
|---|---|
| Sets and Indices | |
| $\mathcal{O}$ | Set of storage and retrieval orders ($\mathcal{O} = \mathcal{O}_S \cup \mathcal{O}_R$) |
| $\mathcal{O}_S, \mathcal{O}_R$ | Subsets of storage and retrieval orders |
| $o, p$ | Indices for specific orders ($o, p \in \mathcal{O}$) |
| $\mathcal{G}$ | Set of dependency groups formed during priority assignment |
| $\mathcal{M}$ | Set of moves generated by A* search |
| $m_a, m_b$ | Indices for specific moves (storage, retrieval, reshuffling) |
| $\mathcal{V}$ | Set of Autonomous Mobile Robots (AMRs) |
| $u$ | Index for a specific unit load |
| Parameters and Variables | |
| $[a_o, d_o]$ | Arrival and retrieval time window for order $o$ |
| $D_g$ | Effective deadline of a group ($\min_{o \in g} d_o$) |
| $Q$ | Final prioritized execution queue |
| $P(u)$ | Assigned priority value for unit load $u$ (lower value indicates higher urgency) |
| $o \sim p$ | Enclosure relationship ($o$ is nested within $p$) |
| $\theta$ | Tabu tenure for cleared lanes |
| $\sigma$ | Search state tuple $(B, U_{src}, v_{pos})$ in A* search |
| $\Sigma'$ | Set of candidate successor states generated during expansion |
| $\gamma$ | Congestion scaling factor (set to 5) |
| $W$ | Penalty weight for time window deviations (set to 10,000) |
| $t_{\text{handling}}$ | Time required for load/unload operations |
| Functions and Variables | |
| $g_{\text{time}}(\sigma)$ | Elapsed schedule time (makespan) at state $\sigma$ |
| $h_{\text{est}}(\sigma)$ | Total heuristic estimate cost |
| $h_{\text{ops}}$ | Operational cost component |
| $h_{\text{block}}$ | Blocking cost component |
| $h_{\text{prio}}$ | Priority penalty |
| $h_{\text{pre\_store}}$ | Premature storage penalty |
| $A_{m,v}$ | Interval variable for move $m$ assigned to vehicle $v$ |
| $\text{start}_{m,v}, \text{end}_{m,v}$ | Start and end times of move assignments |
| $\tau_{i,j}$ | Travel time between locations $i$ and $j$ |

Algorithm 1: Enhanced Earliest Due Date Strategy
    Input: Set of orders $\mathcal{O}$
    Output: Priority queue $Q$
    $\mathcal{G}$ ← Partition $\mathcal{O}$ into groups using enclosure relation $o \sim p$ (Eq. 5.1)
    Sort $\mathcal{G}$ ascending by group deadline $D_g = \min_{o \in g}(d_o)$
    $Q$ ← Concatenate orders from sorted groups, locally sorted by $d_o$
    return $Q$

To resolve this, we employ an Enhanced Earliest Due Date strategy, summarized in Algorithm 1, which is based on the concept of enclosed windows. We define an enclosure relationship $o \sim p$ between two orders $o, p \in \mathcal{O}$ if the time window of one is entirely contained within the other:

\[ o \sim p \iff (a_o \le a_p \wedge d_o \ge d_p) \vee (a_p \le a_o \wedge d_p \ge d_o) \tag{5.1} \]

This binary relationship identifies pairs of tasks that compete for the same time interval. We utilize this relationship to partition the set of orders $\mathcal{O}$ into disjoint groups, effectively treating the orders as nodes in a graph where an edge exists between any pair $o, p$ if $o \sim p$. The resulting connected components form the groups, clustering temporally dependent tasks. The heuristic then generates the final priority queue by sorting these groups to ensure resource availability for critical tasks:

1. Inter-Group Sorting: The groups are sorted by the minimum deadline of their constituent orders ($\min_{o \in \text{group}} d_o$). This ensures that a cluster containing even a single urgent order is prioritized over a cluster of flexible tasks.
2. Intra-Group Sorting: Within each group, orders are sorted by their individual deadlines $d_o$.

This two-level sorting yields a strict priority ordering that explicitly protects orders with nested, tight windows from being preempted by non-critical tasks. Based on their final position in the queue $Q$, each unit load $u$ is assigned an integer priority value $P(u)$, where a lower numerical value indicates a higher urgency.
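The grouping and two-level sort described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `Order` class, the function names, and the union-find grouping are assumptions chosen for illustration, and the assignment of shared priority values $P(u)$ is left out.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Order:
    name: str
    release: int   # a_o, earliest release time
    deadline: int  # d_o, hard deadline


def encloses(o: Order, p: Order) -> bool:
    """Eq. (5.1): one time window is entirely contained in the other."""
    return ((o.release <= p.release and o.deadline >= p.deadline)
            or (p.release <= o.release and p.deadline >= o.deadline))


def enhanced_edd(orders: List[Order]) -> List[Order]:
    # Union-find: orders are nodes, enclosure relations are edges.
    parent = list(range(len(orders)))

    def find(i: int) -> int:
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(len(orders)):
        for j in range(i + 1, len(orders)):
            if encloses(orders[i], orders[j]):
                parent[find(i)] = find(j)

    # Connected components are the dependency groups.
    groups: Dict[int, List[Order]] = {}
    for i, order in enumerate(orders):
        groups.setdefault(find(i), []).append(order)

    # Inter-group sort by D_g = min deadline, intra-group sort by d_o.
    ranked = sorted(groups.values(), key=lambda g: min(o.deadline for o in g))
    return [o for g in ranked for o in sorted(g, key=lambda o: o.deadline)]


# Hypothetical instance: A encloses B, and C encloses D.
orders = [Order("A", 0, 50), Order("B", 10, 20),
          Order("C", 15, 60), Order("D", 20, 55)]
queue = [o.name for o in enhanced_edd(orders)]
print(queue)
```

In this hypothetical instance two groups form, {A, B} with $D_g = 20$ and {C, D} with $D_g = 55$, so the resulting queue is B, A, D, C.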
For example, a sequence of four unit loads might receive the priority assignments $P(u_1) = 1$, $P(u_3) = 1$ (if they share the same critical deadline), $P(u_4) = 2$, and $P(u_2) = 3$. During the subsequent A* search, these priority values are used to calculate penalties for invalid or non-optimal placement sequences.

5.2 Operation Sequencing via A* Search

The second stage converts the prioritized sequence of orders $\mathcal{O}$ into a dependency graph of moves $\mathcal{M}$, which explicitly defines the locations for all storage, reshuffling, and retrieval operations. This process employs a two-phase approach: first, a pre-processing step prunes trivial direct transfers from the source to the sink to reduce the search space, followed by an A* search that enforces the priority scheme established in Section 5.1 and determines storage locations for the orders by using a virtual AMR as a substitute for the robot fleet.

5.2.1 Direct Retrieval Pruning

Before initializing the search, we perform a Direct Retrieval Analysis to identify unit loads that can bypass the buffer entirely. A unit load $u$ is extracted for direct retrieval if its release time $a_u$ and deadline $d_u$ allow for a direct transfer from Source to Sink. Formally, if $a_u + \tau_{\text{source},\text{sink}} \le d_u$, the item is removed from the search space and assigned a direct move. This reduces the branching factor for the subsequent pathfinding.

5.2.2 State Space and Transitions

For the remaining orders, we search for an optimal sequence of operations. The state space is defined by the tuple $\sigma = (B, U_{src}, v_{pos})$, where $B$ represents the current buffer configuration (mapping unit loads to lanes and slots), $U_{src}$ is the set of pending unit loads at the source, and $v_{pos}$ tracks the position of a single virtual AMR to estimate and minimize the empty travel distances between consecutive transport tasks.
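One way to represent the state tuple $\sigma = (B, U_{src}, v_{pos})$ in code is as an immutable, hashable record, so that identical states reached along different paths can be detected. The following Python sketch is an assumption made for illustration (the class name, field names, and the string location label are not taken from the paper).

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Tuple


@dataclass(frozen=True)
class SearchState:
    """sigma = (B, U_src, v_pos); frozen so it can be hashed and de-duplicated."""
    buffer: Tuple[Tuple[str, ...], ...]  # B: one tuple per lane, deepest load first
    pending: FrozenSet[str]              # U_src: loads still waiting at the source
    amr_pos: str                         # v_pos: location label of the virtual AMR

    def accessible_loads(self) -> List[str]:
        """Under LIFO access only the front-most load of each lane is reachable."""
        return [lane[-1] for lane in self.buffer if lane]


# Two structurally equal states compare and hash identically.
s1 = SearchState((("u1", "u2"), ()), frozenset({"u3"}), "source")
s2 = SearchState((("u1", "u2"), ()), frozenset({"u3"}), "source")
visited = {s1}
print(s2 in visited, s1.accessible_loads())
```

Using tuples and a frozenset keeps every field immutable, which is what allows the state to serve as a dictionary key or set member during the search.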
Representing the fleet as a single virtual AMR is a necessary methodological simplification for fleet sizes $|\mathcal{V}| > 1$. Tracking the individual positions of multiple AMRs within the A* state space would lead to a combinatorial explosion, rendering the search intractable. By optimizing the sequence for a single virtual AMR, the heuristic inherently groups spatially proximate tasks and minimizes total empty travel. This produces a highly cohesive operation sequence that the subsequent CP-SAT scheduling stage can efficiently distribute and parallelize across the actual AMR fleet. Transitions between states correspond to three move types:

• Store: Places an arriving unit at the source into a valid empty slot in the buffer.
• Retrieve: Retrieves an accessible unit load from the front of a lane to the Sink.
• Reshuffle: Relocates a blocking unit load to a temporary position.

A state $\sigma$ is identified as a goal state (IsGoal) when all storage and retrieval orders from the priority queue $Q$ have been successfully executed and the buffer state is consistent. Once a goal is reached, the algorithm uses ReconstructPath to backtrack through the state transitions, yielding the final move sequence $\mathcal{M}$ for the subsequent scheduling phase.

To ensure scalability while maintaining the solution quality of standard A*, we employ three key algorithmic optimizations. First, to bound the branching factor in high-density scenarios, we implement a Beam Search strategy (Lowerre, 1976). At each expansion step, the generated successors are ranked by their $f$-cost, and only the top $k$ most promising candidates (with $k = 8$ in our experiments) are added to the open set. Second, we enforce Open Set Pruning as a memory protection mechanism. If the priority queue exceeds a safety threshold (set to 5,000 nodes), the worst-performing 50% of the nodes are discarded to prevent memory exhaustion and search stagnation.
Third, we utilize a lazy evaluation strategy for the heuristic cost. Upon node generation, only the computationally inexpensive operational costs are calculated. The expensive penalty components (blocking and priority violations) are deferred and computed only when the node is extracted from the priority queue, preventing wasted computation on unexplored nodes. The complete search procedure, integrating the beam search and memory protection strategies, is summarized in Algorithm 2.

To prevent cyclic behavior (e.g., immediate refilling of a cleared lane), the search maintains a short-term Tabu list. When a unit load is retrieved or reshuffled from a lane $l$, that lane becomes tabu for incoming storage or reshuffling moves. To balance fleet coordination with unrestricted lane access, we set a fixed short-term tenure of $\theta = 1$ state transition. This tenure effectively prevents immediate inverse operations (such as placing a unit load back into the position it just vacated) without restricting the solution space or locking down buffer capacity for extended periods, which proved more robust than dynamic, fleet-dependent tenures in testing.

5.2.3 Cost Function

The search is guided by a composite cost function $f(\sigma) = g_{\text{time}}(\sigma) + h_{\text{est}}(\sigma)$.
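As a rough illustration of how a best-first search ordered by $f = g + h$ can be combined with a beam width and open-set pruning, consider the following generic Python sketch. It is a simplified stand-in, not the authors' search: the function name and callback signatures are assumptions, the tabu filter is expected to be applied inside the user-supplied `successors` callback, and lazy evaluation of penalty terms is omitted.

```python
import heapq
from itertools import count


def beam_a_star(start, is_goal, successors, heuristic,
                beam_width=8, open_limit=5000):
    """Best-first search on f = g + h with beam-limited expansion.

    successors(state) must return an iterable of (next_state, step_cost).
    Returns the list of states from start to goal, or None on failure.
    """
    tie = count()  # tie-breaker so the heap never compares states directly
    open_heap = [(heuristic(start), next(tie), 0, start)]
    parent = {start: None}
    best_g = {start: 0}

    while open_heap:
        _, _, g, state = heapq.heappop(open_heap)
        if g > best_g.get(state, float("inf")):
            continue  # stale entry: a cheaper path to this state was found later
        if is_goal(state):
            path = []
            while state is not None:
                path.append(state)
                state = parent[state]
            return path[::-1]

        # Rank all successors by f-cost and keep only the top `beam_width`.
        ranked = sorted(
            ((g + cost + heuristic(nxt), g + cost, nxt)
             for nxt, cost in successors(state)),
            key=lambda item: item[0],
        )
        for f_new, g_new, nxt in ranked[:beam_width]:
            if g_new < best_g.get(nxt, float("inf")):
                best_g[nxt] = g_new
                parent[nxt] = state
                heapq.heappush(open_heap, (f_new, next(tie), g_new, nxt))

        # Memory protection: drop the worst half of the open set when too large.
        if len(open_heap) > open_limit:
            open_heap = heapq.nsmallest(len(open_heap) // 2, open_heap)
            heapq.heapify(open_heap)

    return None


# Toy usage on integers: from 0, step +1 or +2 (unit cost each) until reaching 4.
path = beam_a_star(
    start=0,
    is_goal=lambda n: n == 4,
    successors=lambda n: [(n + 1, 1), (n + 2, 1)],
    heuristic=lambda n: abs(4 - n) / 2,
)
print(path)
```

Because the beam discards successors outside the top $k$, a sketch like this trades the optimality guarantee of plain A* for a bounded branching factor; the default values 8 and 5,000 mirror the settings reported in the text.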
Algorithm 2: A* search
    Input: Initial state $\sigma_{init}$, priority queue $Q$
    Output: Move sequence $\mathcal{M}$
    OPEN ← $\{(\sigma_{init}, 0)\}$
    while OPEN ≠ ∅ do
        Select state $\sigma$ with lowest $f(\sigma)$ from OPEN
        if IsGoal($\sigma$) then return ReconstructPath($\sigma$)
        Step 1: Expansion & Tabu Validation
        Generate successors $\Sigma'$ by applying actions {Store, Retrieve, Reshuffle}
        Filter $\Sigma'$: remove actions violating tabu tenure $\theta$ (cycle prevention)
        Step 2: Evaluation & Beam Pruning
        Calculate cost $f(\sigma') = g_{\text{time}} + h_{\text{est}}$ for all $\sigma' \in \Sigma'$
        Sort $\Sigma'$ by $f$-score and add only top $k$ nodes to OPEN (beam width)
        Step 3: Memory Protection
        if |OPEN| > Limit then discard worst 50% of nodes from OPEN
    return Failure

Total Schedule Makespan. The term $g_{\text{time}}(\sigma)$ represents the total elapsed time of the operation sequence up to state $\sigma$. Unlike standard pathfinding, which might only sum loaded distances, our cost function accounts for the empty travel time required to travel between task locations, as well as any waiting time incurred if a vehicle arrives before a unit load's release window $a_o$ opens. This is achieved by employing a virtual AMR that sequentially performs the tasks, calculating the unloaded travel from the end position of the preceding move to the start location of the next. This allows penalizing inefficient sequencing (e.g., generating excessive empty travel for the virtual AMR between distant tasks) and optimizing for the true operational makespan.

Heuristic Cost Function. The heuristic $h_{\text{est}}(\sigma)$ is a weighted sum of four components designed to minimize future effort while enforcing the priority ordering. To maintain search tractability and ensure robust performance in high-density scenarios, the heuristic employs soft constraints within this cost function.
Rather than strictly forbidding undesirable states, which could lead to search stagnation, it penalizes violations (such as lane blockages or sequence inversions) through weighted penalty terms.

\[ h_{\text{est}}(\sigma) = h_{\text{ops}}(\sigma) + h_{\text{block}}(\sigma) + h_{\text{prio}}(\sigma) + h_{\text{pre\_store}}(\sigma) \tag{5.2} \]

1. Operational Costs ($h_{\text{ops}}$): Estimates the travel time $\tau$ required to complete all pending tasks.
   • Storage: We employ a Greedy Best-Match heuristic. The cost for storing pending unit loads is estimated by assigning each item to the nearest available slot that minimizes the total sequence: Source → Slot → Sink.
   • Retrieval: Sums the travel time $\tau_{i,\text{sink}}$ from each stored item's current position to the Sink.
2. Blocking Costs ($h_{\text{block}}$): Anticipates the immediate reshuffling effort required for currently blocked targets. For every blocked target $u$, the heuristic identifies the set of blockers and calculates the cost to relocate them to the nearest empty lane. If no empty lane is available, the algorithm applies two times the average reshuffle cost to reflect the operational delay of waiting for future retrievals to free up buffer capacity. This cost is scaled by a fixed penalty multiplier $\gamma = 5.0$. We found this fixed value to be effective for consistent congestion avoidance, as it ensures high-priority reshuffling is penalized uniformly regardless of the fleet size.
3. Priority Penalty ($h_{\text{prio}}$): Enforces the task sequence derived in Stage 1 by penalizing two types of structural violations based on the assigned priority values $P(u)$:
   • Sequence Inversion: Penalizes the retrieval of a lower-priority unit load while a higher-priority one is still pending in the buffer. Example: Retrieving an item with $P = 3$ while an item with $P = 1$ is not retrieved yet.
   • Stacking Violation: Penalizes any LIFO lane configuration where a lower-priority item is placed closer to the aisle than a higher-priority item.
Example: Placing an item with $P = 2$ in front of an item with $P = 1$ in the same lane. Since this placement guarantees a future reshuffle operation, it is penalized immediately to prune the search branch.
4. Premature Storage Penalty ($h_{\text{pre\_store}}$): Penalizes the storage of a low-priority unit load from the source while a high-priority unit load is waiting to be retrieved from the buffer. This guides the search to clear urgent retrievals first, freeing up buffer space before bringing in less urgent items.

The output of this stage is a sequence of moves $\mathcal{M}$. If move $m_a$ reshuffles a blocker for a retrieval move $m_b$, a strict precedence constraint $m_a \prec m_b$ is generated for the subsequent scheduling phase.

5.3 Multi-AMR Scheduling via CP-SAT

The third stage maps the sequence of moves $\mathcal{M}$ generated by the A* search onto the fleet of Autonomous Mobile Robots (AMRs) $\mathcal{V}$. While standard Vehicle Routing Problems (VRP) focus primarily on spatial routing, the BSRRP is dominated by complex temporal constraints and precedences between moves. Consequently, we model this stage as a Job Shop Scheduling Problem with Sequence-Dependent Setup Times, solved using the CP-SAT solver from Google OR-Tools.

5.3.1 Model Formulation

To emphasize the scheduling nature of the problem, the vehicles of the AMR fleet are modeled as a set of identical parallel machines $\mathcal{V} = \{1, \dots, V\}$. Each generated move $m \in \mathcal{M}$ is treated as a task that must be assigned to exactly one machine. Following standard Constraint Programming formulation, we define optional interval variables $A_{m,v}$ to represent the potential execution of move $m$ by vehicle $v$. If active, an interval $A_{m,v}$ is characterized by a start time $\text{start}_{m,v}$, an end time $\text{end}_{m,v}$, and a fixed processing duration $\tau_m$.
This duration combines the loaded travel time (as defined in Section 4) and the constant handling time:

\[ \tau_m = \tau_{\text{origin}(m),\, \text{dest}(m)} + t_{\text{handling}} \tag{5.3} \]

The model enforces the following constraints:

1. Assignment Completeness: Every move must be assigned to exactly one vehicle. This is enforced by constraining the sum of active assignment booleans to 1 for each move:

\[ \sum_{v \in \mathcal{V}} \mathbb{I}(A_{m,v}) = 1 \quad \forall m \in \mathcal{M} \tag{5.4} \]

2. Precedence Constraints: We enforce three types of hard dependencies to guarantee consistency and stack integrity:
   • Unit Load Flow: Sequential operations acting on the same unit load (e.g., reshuffle blocker → retrieve target) must maintain the order defined by the logical flow.
   • LIFO Inter-Slot Dependencies: Derived from the buffer geometry; if unit load A is stored in a slot strictly in front of unit load B within the same lane, the retrieval of A must complete before the retrieval of B can commence.
   • Lane Sequencing: To preserve the validity of the buffer states determined by the heuristic decomposition, we enforce strict sequencing for all moves accessing the same lane. If the A* search schedules move $m_a$ before $m_b$ on lane $l$, the scheduler is constrained to respect this order ($m_a \prec m_b$), preventing the optimizer from creating invalid lane configurations.
   For any such dependency pair $(m_a, m_b)$, we enforce $\text{end}_{m_a} \le \text{start}_{m_b}$.
3. Lane Capacity: We model storage locations as resource constraints.
   • Buffer Lanes (Unary Resource): Standard buffer lanes are modeled as unary resources. We apply a global NoOverlap constraint on the set of intervals assigned to any specific lane $l$:

\[ \text{NoOverlap}\left( \{ A_{m,v} \mid \text{dest}(m) = l \vee \text{origin}(m) = l \} \right) \tag{5.5} \]

   This constraint ensures that at any point in time $t$, at most one vehicle can execute a move involving lane $l$.
   • Source and Sink Queues: Modeled as infinite-capacity resources to track vehicle availability without restricting the number of concurrent AMRs at these locations.
4. Sequence-Dependent Transition Times: The setup time between two moves depends on the robot's location. If vehicle $v$ performs move $m_a$ immediately before $m_b$, a transition constraint ensures the gap covers the empty travel time:

\[ \text{start}_{m_b, v} \ge \text{end}_{m_a, v} + \tau_{\text{dest}(m_a),\, \text{origin}(m_b)} \tag{5.6} \]

5. Time Windows: Time windows are modeled using a hybrid approach to ensure feasibility.
   • Hard Constraints: Physical constraints are strictly enforced (e.g., a storage action cannot start before the unit load's arrival time $a_o$; a retrieval cannot end after the deadline $d_o$).
   • Soft Constraints: Operational targets (latest start, earliest finish) are treated as soft constraints. Violations are permitted to maintain feasibility but incur a weighted tardiness penalty in the objective function to prioritize service-level agreements.

5.3.2 Objective Function

The optimization objective is hierarchical, implemented using a weighted sum method. The primary goal is to minimize total tardiness, reflecting the strict service level requirements. The secondary goal is to minimize the sum of completion times (Flow Time), which implicitly minimizes unproductive empty travel and waiting times. The objective is formulated as:

\[ \min \sum_{m \in \mathcal{M}} \Big( \underbrace{W \cdot \max(0,\, \text{end}_m - d_m)}_{\text{Weighted Tardiness}} + \underbrace{\text{end}_m}_{\text{Flow Time}} \Big) \tag{5.7} \]

Here, the term $\max(0, \text{end}_m - d_m)$ calculates the strictly positive delay relative to the soft deadline $d_m$. $W$ is a large weighting constant (set to 10,000) ensuring that meeting deadlines strictly dominates operational efficiency.
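To make the weighted-sum objective (5.7) and the transition constraint (5.6) concrete, the following Python sketch evaluates both for a given, already timed schedule. It does not call the CP-SAT solver; the `ScheduledMove` record, the function names, and the toy numbers are assumptions chosen purely for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class ScheduledMove:
    name: str
    vehicle: int
    origin: str
    dest: str
    start: int
    end: int
    due: int  # soft deadline d_m


def objective(moves: List[ScheduledMove], weight: int = 10_000) -> int:
    """Eq. (5.7): weighted tardiness plus completion time, summed over all moves."""
    return sum(weight * max(0, m.end - m.due) + m.end for m in moves)


def transition_violations(
    moves: List[ScheduledMove],
    travel: Dict[Tuple[str, str], int],
) -> List[Tuple[str, str]]:
    """Eq. (5.6): consecutive moves of one vehicle must leave room for empty travel."""
    per_vehicle: Dict[int, List[ScheduledMove]] = {}
    for m in sorted(moves, key=lambda m: m.start):
        per_vehicle.setdefault(m.vehicle, []).append(m)

    violations = []
    for sequence in per_vehicle.values():
        for prev, nxt in zip(sequence, sequence[1:]):
            if nxt.start < prev.end + travel[(prev.dest, nxt.origin)]:
                violations.append((prev.name, nxt.name))
    return violations


# Hypothetical two-move schedule for a single vehicle.
moves = [
    ScheduledMove("m1", 0, "source", "lane1", start=0, end=4, due=10),
    ScheduledMove("m2", 0, "lane2", "sink", start=5, end=9, due=8),
]
travel = {("lane1", "lane2"): 2}

print(objective(moves))                      # m2 is one time unit late
print(transition_violations(moves, travel))  # m2 starts before the AMR can arrive
```

In this toy schedule, `m2` finishes one time unit after its soft deadline, so the tardiness term contributes 10,000 and dominates the flow-time terms (4 + 9), giving 10,013 in total. The second check flags the pair (`m1`, `m2`) because `m2` starts at 5 although the vehicle needs until 4 + 2 = 6 to reach `lane2`.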
5.4 Trajectory Repair

While the CP-SAT schedule ensures temporal validity, it ignores the presence of idle AMRs, which may park after performing a storage or reshuffling move and occupy the buffer lanes. To generate collision-free trajectories, the final stage post-processes the timeline following the structure of Algorithm 3:

Algorithm 3: Trajectory Repair Strategy
Input: Initial Schedule S
Output: Feasible Schedule S*
Step 1: Deadlock Resolution
    if symmetric deadlock detected between v1, v2 then
        Swap schedules of v1, v2
Step 2: Conflict Resolution Loop
    while collision detected between v1 (parking AMR) and v2 (incoming AMR) do
        if v1 is waiting empty and early shift feasible then
            // Priority 1: Reschedule
            Shift v1 departure to t < t_arrival(v2)
        else if valid eviction strategy for v1 exists then
            // Priority 2: Evict
            Insert best eviction move (Smart or Standard) for v1
        else
            // Priority 3: Delay
            Delay v2 until v1 departs (propagate downstream)
Step 3: Dependency Check
    foreach unit load u with end_store > start_retrieve do
        Shift retrieval forward to resolve violation
return S*

Step 1: Deadlock Resolution
The algorithm resolves symmetric head-on collisions (e.g., v1 moving A → B while v2 moves B → A) by swapping the AMRs' future task schedules. Since the AMRs are homogeneous, this resolves the deadlock instantly without additional travel time.

Step 2: Conflict Resolution Loop
Remaining spatial overlaps caused by parked AMRs are resolved using a hierarchical strategy prioritizing efficiency:
1. Priority 1 Reschedule: Exploits schedule slack to shift the parked AMR's next departure to an earlier time, clearing the lane before the incoming AMR arrives.
2. Priority 2 Evict: Inserts an explicit eviction move.
The algorithm prioritizes a Smart Eviction (moving the AMR directly to the start location of its next assigned task) over a Standard Eviction (relocating it to an available neutral position, such as a free buffer lane or the sink).
3. Priority 3 Delay: As a fallback, the incoming AMR is delayed, and this shift is propagated downstream to all dependent tasks.

Step 3: Dependency Check
Delays introduced in Step 2 may violate precedence constraints (e.g., pushing a storage task to complete after its retrieval was scheduled to begin). This step enforces the $\mathrm{end}_{\mathrm{store}} \le \mathrm{start}_{\mathrm{retrieve}}$ constraint for all unit loads, shifting retrieval tasks forward if necessary to guarantee consistency.

6 Computational Experiments

This section evaluates the performance of the proposed solution approaches for the Multi-AMR Buffer Storage, Retrieval, and Reshuffling Problem (BSRRP). The experimental study is designed to answer three primary research questions:
1. Benchmarking: What are the computational limits of the EF when establishing a ground truth of optimal solutions?
2. Heuristic Validation: How does the proposed hierarchical heuristic perform in terms of feasibility rates and solution quality compared to the optimal baselines?
3. Managerial Insights: How do operational parameters (specifically fleet size, access flexibility, and congestion levels) impact the stability and efficiency of space-constrained buffers?

6.1 Experimental Design and Dataset

To ensure a robust evaluation, we developed a comprehensive dataset reflecting the constraints of brownfield manufacturing facilities, such as limited floor space, high storage density, and complex traffic dynamics.
To guarantee full reproducibility and facilitate future research, the complete source code of the proposed heuristic and the discrete-event simulator, as well as all generated benchmark instances and detailed solutions, are made publicly available (see the Data and Code Availability Statement in the Acknowledgements). Our experimental design employs a two-tiered validation to address different evaluation objectives: solution quality benchmarking on small-scale instances and scalability analysis on large-scale instances.

6.1.1 Instance Generation and Parameters

Small-Scale Instances
For 3 × 3 and 4 × 4 grids, instances were generated stochastically to quantify the heuristic's optimality gap against the EF. We systematically varied the following parameters:
• Grid and Topology: 3 × 3 (9 slots) and 4 × 4 (16 slots) with access points distributed across 1, 2, or 4 sides.
• Fleet Size (|V|): 1 to 3 AMRs.
• Load-to-Slot Ratio: Ranged from 0.4 to 1.3 to test capacity limits.
• Temporal Constraints: Arrival and retrieval windows were generated with varying overlap (using random seeds for reproducibility) to enforce asynchronous operations. Negative arrival time windows were used to initialize instances with a set of randomly placed, pre-stored unit loads at t = 0.

Large-Scale Instances
For layouts exceeding exact computational limits, we developed a discrete-event constructive simulator. This tool generates task sequences by simulating buffer operations forward in time. It guarantees feasibility by explicitly simulating the necessary reshuffling of blocking unit loads, while intentionally ignoring AMR collisions, creating an idealized, continuous flow of operations. We utilize these densely packed sequences to stress-test the heuristic, allowing us to identify the saturation points.
Specifically, a saturation point is the maximum unit-load-to-slot ratio a layout can sustain before the heuristic fails to find a valid schedule within the simulated deadlines.
• Topology: Square blocks (5 × 5, 6 × 6), rectangular layouts (8 × 3), and industrial brownfield layouts.
• Task Generation: The simulator adaptively alternates storage and retrieval requests to maintain a target fill level (e.g., 80%), utilizing a back-to-front filling strategy.
• Idealized Time Estimation: Task durations are estimated using scaled Manhattan distances ($t_{\mathrm{op}} \approx 2.0 \times d_{\mathrm{manhattan}}$) plus fixed handling times, and reshuffling operations incur explicitly simulated time penalties.
• Time Window Derivation: Task start times are greedily assigned to the earliest available robot in a simulated fleet with two AMRs. The arrival ($[a_n, a_n + \alpha_n]$) and retrieval time windows ($[r_n, r_n + \rho_n]$) are then generated by applying a fixed temporal slack (e.g., ±15 time steps) around these simulated task start and end times.

6.1.2 Computational Environment

All experiments were conducted on a workstation equipped with an AMD Ryzen 9 5950X processor. The EF was solved using Gurobi Optimizer version 13.0.0 with a strict time limit of 3600 seconds (1 hour) per instance. The hierarchical heuristic was implemented in Python, utilizing Google OR-Tools (CP-SAT) for the scheduling stage. To ensure rapid convergence and stability during the experiments, the solver was configured with aggressive search parameters (linearization level 2, probing level 2) and utilized 8 parallel search workers. To simulate a realistic production control environment, we imposed a time limit of 300 seconds for both the A* search (Stage 2) and the CP-SAT solver (Stage 3). While typical runtimes are fractions of a second, this cap prevents indefinite stalling in degenerate cases.
Additionally, the algorithmic parameters of the heuristic (specifically the congestion factor γ = 5, penalty weight W = 10,000, and tabu tenure θ = 1) were calibrated based on preliminary experiments using a representative subset of small-scale instances to balance solution quality and computational speed.

6.1.3 Exact Formulation Experiments and Benchmark Set

To evaluate the EF and establish a ground truth, we generated a pool of 6,903 small-scale instances (3 × 3 and 4 × 4) across varying layout complexities and parameters. We subjected these instances to a two-stage process to test the computational limits of the EF and to build a benchmark set.

First, the EF was tasked with finding feasible solutions within a 3600-second time limit. This step identified 949 instances as solvable. We assume this step effectively separates operationally feasible scenarios from those rendered structurally impossible by tight temporal and spatial constraints. This assumption is based on the observation that when the EF found a feasible solution, it typically did so relatively quickly, which we verified by examining individual instances.

Second, to ensure a strict baseline for optimality gap calculations, we filtered this pool to instances where the EF achieved a solution with a MIP gap of ≤ 5%. This process yielded a final benchmark set of 810 instances. The composition of this benchmark, categorized by fleet size and access topology, is detailed in Table 4. The distribution highlights the computational boundaries of exact methods: the EF solved 716 instances in the 3 × 3 layout to the required gap, but only 94 instances in the 4 × 4 layout. Notably, the EF failed to solve any 4 × 4 instances with a single AMR, suggesting that the used parameter combination exceeds the capacity of a single robot.
We utilize this filtered benchmark set of 810 optimally or near-optimally solved instances (≤ 5% MIP gap) during the subsequent evaluation to validate the solution quality and feasibility rates of the proposed heuristic.

Table 4: Composition of the Benchmark Reference Set: 810 high-quality instances filtered from a pool of 6,903 generated small-scale scenarios.

| Parameter | Category | Small (3 × 3) | Large (4 × 4) |
|---|---|---|---|
| Fleet Size ($\lvert\mathcal{V}\rvert$) | 1 AMR | 151 | 0 |
| | 2 AMRs | 288 | 48 |
| | 3 AMRs | 277 | 46 |
| Access Directions | 1 Side | 72 | 1 |
| | 2 Sides | 373 | 54 |
| | 4 Sides | 271 | 39 |
| Total | All Instances | 716 | 94 |

6.2 Quantitative Benchmarking against Exact Formulation

We first analyze the quantitative performance of the heuristic against the benchmark set, followed by a qualitative assessment of its conflict resolution capabilities.

6.2.1 Quantitative Analysis

Table 5 summarizes the comparative performance of the exact formulation (EF) and the proposed heuristic. The results are first categorized by the physical layout and the fleet size (|V|). Under the section Solve Rate Heuristic, the first column gives the total number of benchmark instances, the second column lists how many were successfully solved by the heuristic, and the third column provides the resulting success rate. In the Computational Efficiency (Median) section, the table compares the median EF runtime against the median heuristic runtime, followed by the relative speedup factor. Finally, the Median Opt. Gap (%) column quantifies the solution quality by measuring the objective value deviation of the heuristic solution from the exact lower bound.

Table 5: Performance assessment: Feasibility, Runtime, and Optimality Gap comparison between the EF and the Heuristic approach across the reference set. The columns Number of Instances, Solved, and Rate form the group Solve Rate Heuristic; the columns EF Runtime, Heur. Runtime, and Speedup form the group Computational Efficiency (Median).

| Layout | Fleet ($\lvert\mathcal{V}\rvert$) | Number of Instances | Solved | Rate | EF Runtime | Heur. Runtime | Speedup | Median Opt. Gap (%) |
|---|---|---|---|---|---|---|---|---|
| 3 × 3 | 1 AMR | 151 | 139 | 92.1% | 515.5 s | 0.05 s | 10,310x | 2.56% |
| | 2 AMRs | 288 | 284 | 98.6% | 253.0 s | 0.08 s | 3,163x | 10.20% |
| | 3 AMRs | 277 | 275 | 99.3% | 307.2 s | 0.09 s | 3,413x | 14.58% |
| 4 × 4 | 2 AMRs | 48 | 44 | 91.7% | 1,524.3 s | 3.37 s | 452x | 4.70% |
| | 3 AMRs | 46 | 45 | 97.8% | 2,927.9 s | 2.52 s | 1,162x | 10.63% |

Solvability and Structural Limits
The proposed heuristic demonstrated high consistency in finding feasible solutions when evaluated against the benchmark. As detailed under the Solve Rate Heuristic section of Table 5, the heuristic achieved success rates comparable to the EF across all parameter combinations where exact solutions were obtainable. Specifically, the heuristic solved 698 out of 716 instances (97.5%) in the 3 × 3 layout. In the more complex 4 × 4 layout, it successfully solved 89 out of the 94 reference instances (94.7%). Notably, the EF was unable to solve any 4 × 4 instances restricted to a single AMR. As outlined in Section 6.1, this is a direct consequence of the instance generation parameters: the extensive reshuffling required to resolve deep blockages in a 4 × 4 grid simply consumes more time than the tight retrieval windows allow for a single AMR, rendering these scenarios operationally infeasible.

Figure 5: Performance comparison between the EF and the proposed Heuristic. Top panels: Runtime distributions (logarithmic scale) across 3 × 3 and 4 × 4 layouts. Bottom panels: Optimality gaps relative to the exact lower bound, categorized by unit-load-to-slot ratio.

Runtime Performance
While feasibility is the primary prerequisite for deployment, operational viability depends on computational speed. The runtime disparity, visualized in the top panels of Figure 5, highlights the impact of the problem's NP-hard complexity. While the EF frequently approaches the timeout of 3,600 seconds (e.g., a median of 2,927.9 s for 4 × 4 with 3 AMRs), the heuristic maintains stable sub-second to low-second runtimes.
For the 3 × 3 instances, the median computation time was consistently under 0.1 seconds, achieving speedup factors exceeding 3,000x. Even in the complex 4 × 4 instances, the median was approximately 3 seconds, well below the 300-second time limit.

Solution Quality and Optimality Gap
To assess the trade-off for this massive speedup, we evaluate the optimality gap (visualized in the bottom panels of Figure 5). To provide a rigorous assessment of solution quality, we calculate the True Optimality Gap. Unlike relative deviations between two feasible solutions, this metric compares the heuristic solution ($Z_{Heur}$) against the theoretical lower bound ($Z_{LB}$) established by the exact solver:

$$\mathrm{Gap} = \frac{Z_{Heur} - Z_{LB}}{Z_{Heur}} \times 100\% \tag{6.1}$$

The heuristic maintains high solution quality, with a median gap of just 4.70% for the challenging 4 × 4 layout with 2 AMRs. As expected, the gap increases with fleet size and exhibits a higher sensitivity to the unit-load-to-slot ratio. Medians typically range between 10% and 20% for larger configurations, reflecting the heuristic's tendency to prioritize rapid conflict resolution over finding the mathematically perfect spatial sequence.

Constraint Handling and Temporal Flexibility
A key distinction in this assessment lies in the treatment of time windows. The EF strictly enforces the start times at the storage slots (which are derived from the external deadlines in (4.22)) as hard constraints, classifying any temporal deviation as an infeasible solution. In contrast, the heuristic is designed to maintain operational flow by selectively relaxing these internal bounds. Specifically, it permits internal storage tasks to complete later and retrieval tasks to commence earlier than originally scheduled.
This approach ensures that the handovers at the Source and Sink remain perfectly synchronized with external processes, while providing the internal buffer operations with the temporal flexibility necessary to resolve spatial conflicts. Among the generated solutions, approximately 9.5% utilized this flexibility. While strictly speaking suboptimal compared to the rigid EF lower bound, these solutions are operationally superior in a production context, as they preserve the punctuality of external processes while preventing internal system lockups.

6.3 Qualitative Analysis of Solution Behavior

To validate the performance of the proposed approach, we analyze the operational behavior of the generated solutions in two distinct scenarios: reactive conflict resolution in high-density confined spaces and proactive capacity management in large-scale brownfield layouts. This analysis examines solutions for 8 × 3, 5 × 5, and 6 × 6 layouts, as well as a real-world brownfield case from the large-scale instance set.

Figure 6: Visualization of a solution in a high-density 8 × 3 layout (21 unit loads) with 4 AMRs. The bottom-left of each sub-figure indicates the corresponding decision variable from the EF (also visualized with the dotted arrows; white for empty drives; black for transportation of a UL), while T denotes the discrete time step of the event.

6.3.1 Reactive Conflict Resolution (8 × 3 Layout)

First, we examine a highly constrained environment serviced by a fleet of 4 AMRs (Utilization ≈ 90%). Figure 6 visualizes a complex interleaved reshuffling and storage sequence spanning from time step t = 345 to t = 446. The system's objective is to retrieve the target unit load (UL) 04, which is currently blocked.
1. Reshuffling (t = 345–349): The heuristic identifies UL 18 as blocking the retrieval of the target UL 04. AMR V2 is assigned to reshuffle UL 18 to a temporary position.
2. Storage (t = 375–401): Concurrently, new storage requests for UL 20 and UL 21 arrive. Rather than blocking the now-accessible lane for UL 04, the heuristic utilizes the space in front of the reshuffled UL 18. It directs the fleet to stack UL 20 (t = 375) and UL 21 (t = 401) in front of UL 18. This choice avoids creating new blockages, as UL 20 and UL 21 are scheduled for retrieval earlier than UL 18.
3. Retrieval (t = 440–446): With the blockage removed and incoming traffic diverted to non-critical lanes, AMR V1 navigates to the target lane and retrieves UL 04 within its designated retrieval window.

Overall, this scenario exemplifies how the heuristic resolves conflicts by leveraging temporal slack. It stacks UL 20 and UL 21 in front of UL 18 without creating blockages; this is permissible due to the specific retrieval order and maximizes the utility of the constrained floor space.

6.3.2 Spatial Strategies in Brownfield Layouts

To demonstrate the algorithm's capability to operate within complex layouts, we applied the heuristic to a real-world layout from the surface coating industry (Figure 7). This scenario uses the irregular floor area surrounding a fixed high-gloss coating machine (grey obstacle). The layout is characterized by the adjacent positioning of the Source and Sink (bottom right), requiring all AMR movements to be sequenced through the central aisle. Figure 7b illustrates the system state at t = 012. The heuristic utilizes this layout through two key mechanisms:
1. Static Lanes: The irregular floor plan is manually decomposed into a set of static lanes. Specifically, the orange and yellow zones are mapped to single-deep lanes (depth 1), while the green zone forms double-deep lanes (depth 2). Static lanes allow the algorithm to apply standard stack-based logic to handle LIFO constraints, regardless of the specific physical arrangement.
2. Look-Ahead Placement: The heuristic optimizes storage locations based on retrieval deadlines. For instance, although both UL 19 and UL 20 are retrieved late in the horizon, the heuristic distinguishes between them. UL 20, having the later deadline, is moved to a distant location (lane 8) during initial reshuffling, whereas UL 19 is kept in a closer slot (lane 16). This prioritization ensures that accessible buffer capacity is preserved for unit loads with earlier retrieval times.

These results illustrate that the proposed logic enables robust operations in layout-constrained environments, preventing deadlocks through the temporal sequencing of lane access.

(a) Topology (t = 001): The layout utilizes the residual space around a high-gloss coating machine (grey) with adjacent Source (blue) and Sink (green). The niches are manually abstracted into static lanes with varying depths: orange/yellow zones (depth 1) and green zones (depth 2).
(b) Operational State (t = 012): Early optimization. The snapshot captures a reshuffling operation where AMR V2 relocates unit load UL 19 to the single-deep yellow zone (lane 16). In contrast, the lower-priority UL 20 is moved to the distant yellow zone (lane 8), preserving closer slots for urgent unit loads.
Figure 7: Real-world use case (surface coating). This brownfield scenario demonstrates the conversion of dead floor space into an autonomous buffer. Unlike open grids, the layout is dictated by the footprint of the machinery. The visualization highlights the heuristic's capability to assign inventory to separated zones (depth 1 vs. depth 2) to optimize storage utilization within the irregular boundaries.

6.3.3 Spatial Decision Policy

To analyze the spatial allocation strategies of the heuristic, we evaluate the occupancy intensity and traffic density within the brownfield scenario.
As illustrated in Figure 8, occupancy intensity is measured by the average number of timesteps a storage slot holds a unit load, whereas traffic density quantifies the cumulative frequency of AMR visits at each lane access point. This evaluation reveals that the algorithm adopts a highly strategic utilization of the irregular space without explicit pre-programming. Rather than treating the available floor as a rigid grid, the system behaves organically, dynamically adapting its spatial footprint to the current operational load.

(a) Occupancy Intensity: Average duration a slot is occupied. Lighter colors indicate the slot is utilized most of the time.
(b) Traffic Density: Frequency of AMR visits at access points. Lighter colors indicate the access point is utilized frequently.
Figure 8: Algorithmic storage location assignment in the brownfield scenario. The heatmaps contrast buffer duration (a) with traffic flow (b), highlighting the differentiation between high-density storage and transfer zones.

The heatmaps reveal three key behaviors. Notably, these advanced spatial strategies emerge despite the methodological simplifications of the approach, specifically the heuristic nature of the A* search and the hierarchical decomposition of the problem. This demonstrates that the designed cost function captures global system dynamics, leading to the following emergent behaviors:
1. Lane-Blocking Avoidance: A notable feature in Figure 8a is the low utilization of the front positions in double-deep lanes, despite their proximity to the access point. The heuristic demonstrates a preference for utilizing distant single-deep slots rather than blocking an occupied rear slot in a closer lane. This confirms that the algorithm prioritizes accessibility over physical proximity, accepting longer travel distances to prevent future reshuffling operations.
2. Buffer Duration-Based Inventory Stratification: The system exhibits a distinct separation of inventory based on buffer duration. Deep slots in deep lanes show the highest cumulative occupancy (Figure 8a), indicating they are used to buffer unit loads with late deadlines. Conversely, slots adjacent to the Source and Sink exhibit the highest traffic density (Figure 8b) but comparatively low occupancy intensity. The heuristic adaptively treats these prime locations as a staging area with short buffer durations for immediate retrieval tasks.
3. Capacity-Adaptive Breathing Topology: The algorithm automatically adjusts its active storage area based on current inventory levels. During periods of low utilization, it allocates unit loads to nearby slots, minimizing AMR travel distances and leaving distant areas empty. As storage density increases, the algorithm dynamically expands the active storage area into the more distant slots. This ensures strict travel time optimization under normal conditions, while unlocking maximum capacity during peak congestion.

These behaviors suggest that the heuristic is capable of highly adaptive, context-aware decision making on complex floor plans.

6.4 Layout Sensitivity and Saturation Points

To isolate the impact of layout topology from pure capacity, we expanded the evaluation to compare rectangular configurations (8 × 3) against dense square blocks (5 × 5, 6 × 6) and irregular real-world layouts (e.g., Figure 7). These scenarios were evaluated across fleet sizes ranging from |V| = 2 to 6 AMRs. Since feasibility rates remained consistent regardless of fleet density (confirming the heuristic's deadlock avoidance), we aggregate these results to focus on the more decisive structural factors: number of access points and lane-to-depth ratio.
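The aggregation used in this section can be sketched in a few lines of Python. The sketch assumes that the Solve Rate column of Table 6 pools the Solved/Generated counts over all access configurations of a layout; this reading is inferred from the reported numbers, and the function name `pooled_solve_rate` is illustrative rather than taken from the published code.

```python
def pooled_solve_rate(counts):
    """Return the pooled solve rate in percent.

    counts: iterable of (solved, generated) pairs, one per access
    configuration; configurations that were not generated are omitted.
    """
    pairs = list(counts)  # materialize so generators can be summed twice
    solved = sum(s for s, _ in pairs)
    generated = sum(g for _, g in pairs)
    if generated == 0:
        raise ValueError("no generated instances to aggregate")
    return 100 * solved / generated


# Counts copied from Table 6 at a unit-load-to-slot ratio of 0.7.
rectangle_8x3 = [(30, 50), (50, 50)]        # 1-sided and 2-sided access
block_5x5 = [(0, 50), (85, 100), (50, 50)]  # 1-, 2-, and 4-sided access

print(pooled_solve_rate(rectangle_8x3))  # 80.0
print(pooled_solve_rate(block_5x5))      # 67.5
```

Both values agree with the Solve Rate column of Table 6, which supports the pooling assumption for these rows.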
Table 6 presents a sensitivity analysis, breaking down feasibility by the unit-load-to-slot ratio and access constraints. The table lists the absolute number of solved instances versus the total number of generated instances for each configuration. The results highlight three critical findings regarding system stability:

Table 6: Sensitivity Analysis: feasibility by unit-load-to-slot ratio and access directions. The table displays the success rate as a fraction (Solved/Generated) for various layouts.

| Layout Topology | Ratio | 1 Access Direction | 2 Access Directions | 3 Access Directions | 4 Access Directions | Solve Rate |
|---|---|---|---|---|---|---|
| Rectangle (8 × 3) | 0.7 | 30/50 | 50/50 | – | – | 80.0% |
| | 0.8 | 15/50 | 50/50 | – | – | 65.0% |
| | 0.9 | 5/50 | 50/50 | – | – | 55.0% |
| | 1.0 | 0/50 | 45/50 | – | – | 45.0% |
| Block (5 × 5) | 0.7 | 0/50 | 85/100 | – | 50/50 | 67.5% |
| | 0.8 | 0/50 | 70/100 | – | 50/50 | 60.0% |
| | 0.9 | 0/50 | 20/100 | – | 45/50 | 32.5% |
| | 1.0 | 0/50 | 10/100 | – | 40/50 | 25.0% |
| Block (6 × 6) | 0.7 | 0/50 | 15/100 | – | 50/50 | 32.5% |
| | 0.8 | 0/50 | 0/100 | – | 50/50 | 25.0% |
| | 0.9 | 0/50 | 5/100 | – | 25/50 | 15.0% |
| | 1.0 | 0/50 | 0/100 | – | 25/50 | 12.5% |
| Real-World (39 and 43 slots) | 0.5–0.8 | – | – | 450/450 | – | 100.0% |
| | 0.9 | – | – | 98/100 | – | 98.0% |
| | 1.0 | – | – | 75/100 | – | 75.0% |

Most notably, the coating industry layouts with high slot counts demonstrate exceptional stability compared to the theoretical block models. As shown in Table 6, feasibility remains at or near 100% up to a unit-load-to-slot ratio of 0.9. Even at full capacity, the system successfully solves between 70% and 80% of instances. The decline in solvability for these instances is attributable to two converging factors: primarily, the time required for deep reshuffling exceeds the theoretical baselines derived from simplified travel estimates, rendering some instances structurally infeasible. Additionally, for theoretically solvable but complex instances, the A* search (Stage 2) becomes a computational bottleneck, hitting time limits due to the search space explosion.

The analysis of the 8 × 3 layout confirms that the topology dictates performance.
Even at a saturation ratio of 1.0, this configuration maintains 90% feasibility provided it allows for 2-sided access (45/50 solved). This evidence suggests that a high number of independent access points facilitates parallel operations, effectively mitigating the congestion effects typically associated with high storage density.

In sharp contrast, deep square layouts (e.g., 6 × 6) exhibit significantly earlier saturation points. With limited access (1–2 sides), performance degrades rapidly even at moderate ratios (0.7). This confirms that for deep storage, optimization cannot fully compensate for a lack of reshuffling space; consequently, either lower storage densities or full perimeter access (4 sides) are requisite for reliable solvability.

6.5 Managerial Insights

The computational experiments provide guidelines for the design and operation of autonomous buffer zones. By synthesizing the results, we derive four principles for managing floor storage.

6.5.1 Layout Fragmentation and Access Efficiency

In brownfield facilities, contiguous areas for buffering are rarely available. The results on layout topology show that spatial fragmentation is not a disadvantage. Small, distributed rectangular buffers (e.g., 8 × 3) outperform large, symmetrical block layouts with deeper lanes (e.g., 6 × 6), even if these distributed zones are located further away from the source and sink.

Strategic Insight: Buffering does not require consolidating space into a single zone. Utilizing decentralized floor space near production lines is effective, provided these pockets maintain high access availability relative to storage slots. Distributed buffers with multiple access directions prevent bottlenecks more effectively than deep storage blocks.

6.5.2 Capacity-Adaptive Breathing Topology

Traditional material handling relies on static zoning, which requires manual intervention.
Heatmap analysis shows that the heuristic autonomously performs storage location assignment without explicit pre-configuration.

Strategic Insight: Despite the limitations of the A* search and the decomposition, our heuristic enables capacity-adaptive breathing topology by placing urgent unit loads at the periphery and less critical loads in deeper positions. This logic reorganizes the buffer in real time based on the production schedule, aiming to optimize the accessibility for the next retrieval window while the utilized area expands or contracts based on current demand.

6.5.3 The 90% Stability Threshold

The sensitivity analysis shows a stability threshold across all tested layouts. Solvability and system reliability decline when the unit-load-to-slot ratio exceeds 0.9. At higher ratios, the system lacks the unoccupied slots required for reshuffling.

Strategic Insight: Planners must distinguish between storage capacity and operational throughput capacity. A production buffer requires approximately 10% of slots as slack to resolve deadlocks. Operating above this 90% threshold transforms the buffer into a static storage yard, rendering it brittle to stochastic production requests.

6.5.4 Operational Robustness and Fleet Scalability

The algorithm maintains robustness across varying fleet sizes from 2 to 6 robots without parameter retuning. The system utilizes additional robots to reduce makespan until the maximum throughput capacity is reached.

Strategic Insight: This ensures operational robustness through fleet scalability. If a robot is removed for maintenance, the algorithm redistributes tasks and adapts the throughput capacity without causing an operational standstill. Process continuity is decoupled from the specific fleet count, providing investment security.
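For planning purposes, the 90% threshold from Section 6.5.3 can be turned into a quick sizing rule. The sketch below is an illustration under the assumption that the threshold acts as a simple cap on the unit-load-to-slot ratio; the helper names are hypothetical and integer arithmetic is used to avoid floating-point rounding.

```python
def max_operational_loads(slots: int, threshold_pct: int = 90) -> int:
    """Largest number of unit loads that keeps the ratio at or below the threshold."""
    return slots * threshold_pct // 100


def slack_slots(slots: int, threshold_pct: int = 90) -> int:
    """Slots that stay empty as reshuffling space at the operational limit."""
    return slots - max_operational_loads(slots, threshold_pct)


# Grid layouts evaluated in Section 6.4 (slot count = rows x columns).
for name, slots in [("8 x 3", 24), ("5 x 5", 25), ("6 x 6", 36)]:
    print(name, max_operational_loads(slots), slack_slots(slots))
# 8 x 3 21 3
# 5 x 5 22 3
# 6 x 6 32 4
```

For the 8 × 3 layout this cap gives 21 unit loads, the same fill level as the high-density scenario shown in Figure 6.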
7 Conclusion and Future Work

This paper addressed the Multi-AMR Buffer Storage, Retrieval, and Reshuffling Problem (BSRRP) by establishing an Exact Formulation (EF) for benchmarking and proposing a hierarchical heuristic. Our research demonstrates that automating dense floor storage in brownfield environments requires the integration of reshuffling logic and multi-AMR coordination, as implemented in our two-stage approach.

The computational experiments reveal a disparity in tractability between exact and heuristic methods. While exact approaches often fail to converge for dense industrial scenarios, the hierarchical heuristic achieves speedup factors ranging from 450x to over 3,000x compared to the exact solver.

Regarding operational robustness, our analysis identified a saturation threshold at a unit-load-to-slot ratio of 0.9. Beyond this ratio, the lack of reshuffling space renders the system brittle, confirming that 10% of capacity must be treated as slack to resolve deadlocks. Furthermore, strict time windows were identified as a source of infeasibility; treating deadlines as soft constraints allowed the heuristic to resolve high-traffic scenarios where the exact solver failed, confirming that flow continuity must take precedence over precision in time-critical environments.

Finally, the physical layout topology and the heuristic's capacity-adaptive behavior play a decisive role. Our experiments demonstrate that rectangular layouts with high access availability outperform deep square configurations by enabling parallel access. The heuristic adaptively adjusts the storage footprint based on current inventory levels: it utilizes nearby slots during low demand to minimize travel distances and expands into distant slots only when capacity requirements increase.
This allows the system to optimize storage depth based on buffer duration without explicit pre-configuration, adapting autonomously to irregular brownfield constraints.

This confirms that a decoupled approach, combining A* search for task sequencing with Constraint Programming for scheduling, is a viable path for industrial control. Operationally, future work will focus on integrating kinematic constraints (e.g., acceleration profiles, turning radii) into the scheduling logic. Additionally, implementing post-processing trajectory smoothing could further enhance path efficiency. Algorithmically, profiling identifies the A* search for task sequencing as the primary bottleneck in dense scenarios. Future optimization of this component, through pre-computed pattern databases or Multi-Agent Path Finding (MAPF) refinements, could reduce the remaining latency. Furthermore, this modular architecture establishes a data-driven infrastructure for online operations. The decoupled solver can serve as a baseline for real-time decision-making, where the system must continuously adapt to dynamic production streams and stochastic arrival rates in a data-rich environment.

Acknowledgements

This research was presented as a work-in-progress at the VeRoLog conference 2025.

The authors acknowledge the use of artificial intelligence (AI) tools during the preparation of this manuscript. Specifically, Gemini (versions 1.5 to 3.0; Google) and GitHub Copilot (Claude Sonnet 4.0 and 4.5) were used to help with aspects of software development, language improvement and editing, and the classification of the literature relevant to this study. The authors have reviewed and edited all content and assume full responsibility for the accuracy, integrity, and originality of the work presented.

Disclosure statement

The authors declare no conflicts of interest.
Funding

This work was supported by the European Union - NextGenerationEU under Grant 13IK0321.

Data and Code availability statement

The source code is available under https://anonymous.4open.science/r/buffer_reshuffling_and_retrieval_IP-CFEC. Instances and solutions are available under https://doi.org/10.6084/m9.figshare.31852507.

References

Andulkar, M., Le, D. T., & Berger, U. (2018). A multi-case study on Industry 4.0 for SME's in Brandenburg, Germany. Proceedings of the 51st Hawaii International Conference on System Sciences, 2018. doi: https://doi.org/10.24251/HICSS.2018.574

Archetti, C., Coelho, L., Speranza, M., & Vansteenwegen, P. (2025). Beyond fifty years of vehicle routing: Insights into the history and the future. European Journal of Operational Research. doi: https://doi.org/10.1016/j.ejor.2025.06.014

Boge, S., & Knust, S. (2020). The parallel stack loading problem minimizing the number of reshuffles in the retrieval stage. European Journal of Operational Research, 280(3), 940–952. doi: https://doi.org/10.1016/j.ejor.2019.08.005

Bömer, T., Koltermann, N., Pfrommer, J., & Meyer, A. (2024). Sorting multibay block stacking storage systems with multiple robots. In International Conference on Computational Logistics (pp. 34–48). doi: https://doi.org/10.1007/978-3-031-71993-6_3

Borjian, S., Manshadi, V. H., Barnhart, C., & Jaillet, P. (2015). Managing relocation and delay in container terminals with flexible service policies. doi: https://doi.org/10.48550/arXiv.1503.01535

Buckow, J.-N., Goerigk, M., & Knust, S. (2025). Integrated pallet retrieval and processing in warehouses under uncertainty. OR Spectrum, 1–39. doi: https://doi.org/10.1007/s00291-024-00806-7

Bömer, T., Disselnmeyer, M., & Meyer, A. (2025). A constraint programming approach for the multi-robot multibay unit load pre-marshalling problem. Procedia CIRP, 134, 508–513. doi: https://doi.org/10.1016/j.procir.2025.02.151

Bömer, T., Pfrommer, J., Akizhanov, D., & Meyer, A. (2026). Sorting multi-bay block stacking storage systems. Computers & Operations Research, 188, 107359. doi: https://doi.org/10.1016/j.cor.2025.107359

Caserta, M., Schwarze, S., & Voß, S. (2012). A mathematical formulation and complexity considerations for the blocks relocation problem. European Journal of Operational Research, 219(1), 96–104. doi: https://doi.org/10.1016/j.ejor.2011.12.039

Charris, E., Rojas-Reyes, J., & Montoya-Torres, J. (2018, 07). The storage location assignment problem: A literature review. International Journal of Industrial Engineering Computations, 10. doi: https://doi.org/10.5267/j.ijiec.2018.8.001

Chen, C., Tiong, L. K., & Chen, I.-M. (2019). Using a genetic algorithm to schedule the space-constrained AGV-based prefabricated bathroom units manufacturing system. International Journal of Production Research, 57(10), 3003–3019. doi: https://doi.org/10.1080/00207543.2018.1521532

Dantzig, G. B., & Ramser, J. H. (1959). The truck dispatching problem. Management Science, 6(1), 80–91. Retrieved from http://www.jstor.org/stable/2627477

Descartes Systems Group. (2023, May). How bad is the supply chain and logistics workforce challenge? https://www.descartes.com/resources/news/descartes-study-reveals-76-supply-chain-and-logistics-operations-are-experiencing (Accessed: 2026-01-28)

Disselnmeyer, M., Bömer, T., Pfrommer, J., & Meyer, A. (2024). The static buffer reshuffling and retrieval problem for autonomous mobile robots. In International Conference on Computational Logistics (pp. 18–33). Springer. doi: https://doi.org/10.1007/978-3-031-71993-6_2

ElWakil, M., Eltawil, A., & Gheith, M. (2022). On the integration of the parallel stack loading problem with the block relocation problem. Computers & Operations Research, 138, 105609. doi: https://doi.org/10.1016/j.cor.2021.105609

Fragapane, G., de Koster, R., Sgarbossa, F., & Strandhagen, J. O. (2021). Planning and control of autonomous mobile robots for intralogistics: Literature review and research agenda. European Journal of Operational Research, 294(2), 405–426. doi: https://doi.org/10.1016/j.ejor.2021.01.019

Fu, B., Chen, Z., Chandan, R., Barbosa, A., Caldara, M., Durham, J., & Pecora, F. (2026). Symbolic planning and multi-agent path finding in extremely dense environments with unassigned agents. doi: https://doi.org/10.48550/arXiv.2509.01022

Ge, P., Meng, Y., Liu, J., Tang, L., & Zhao, R. (2020). Logistics optimisation of slab pre-marshalling problem in steel industry. International Journal of Production Research, 58(13), 4050–4070. doi: https://doi.org/10.1080/00207543.2019.1641238

Geft, T., Zhang, W., Yu, J., & Bekris, K. (2026). Robust out-of-order retrieval for grid-based storage at maximum capacity. doi: https://doi.org/10.48550/arXiv.2601.19144

Grand View Research. (2025). Autonomous mobile robots market (2026 - 2033). https://www.grandviewresearch.com/industry-analysis/autonomous-mobile-robots-market (Accessed: 2026-01-28)

Hu, S., Zhao, S., & Ren, Z. (2025). Conflict-based search and prioritized planning for multi-agent path finding among movable obstacles. doi: https://doi.org/10.48550/arXiv.2509.26050

Hua, Y., Wang, Y., & Ji, Z. (2024, 06). Adaptive lifelong multi-agent path finding with multiple priorities. IEEE Robotics and Automation Letters, PP, 1–8. doi: https://doi.org/10.1109/LRA.2024.3392084

Ji, M., Guo, W., Zhu, H., & Yang, Y. (2015). Optimization of loading sequence and rehandling strategy for multi-quay crane operations in container terminals. Transportation Research Part E: Logistics and Transportation Review, 80, 1–19. doi: https://doi.org/10.1016/j.tre.2015.05.004

Kim, K. H., & Hong, G.-P. (2006). A heuristic rule for relocating blocks. Computers & Operations Research, 33(4), 940–954. doi: https://doi.org/10.1016/j.cor.2004.08.005

Kizilay, D., & Eliiyi, D. T. (2021). A comprehensive review of quay crane scheduling, yard operations and integrations thereof in container terminals. Flexible Services and Manufacturing Journal, 33(1), 1–42. doi: https://doi.org/10.1007/s10696-020-09385-5

Lersteau, C., & Shen, W. (2022). A survey of optimization methods for block relocation and premarshalling problems. Computers & Industrial Engineering, 172, 108529. doi: https://doi.org/10.1016/j.cie.2022.108529

Lowerre, B. T. (1976). The Harpy speech recognition system. Carnegie Mellon University.

Makino, H., & Ito, S. (2025). MAPF-HD: Multi-agent path finding in high-density environments. doi: https://doi.org/10.48550/arXiv.2509.06374

Pfrommer, J., Meyer, A., & Tierney, K. (2022). Solving the unit-load pre-marshalling problem in block stacking storage systems with multiple access directions. arXiv preprint arXiv:2207.09118.

Pfrommer, J., Meyer, A., & Tierney, K. (2024). Solving the unit-load pre-marshalling problem in block stacking storage systems with multiple access directions. European Journal of Operational Research, 313(3), 1054–1071. doi: https://doi.org/10.1016/j.ejor.2023.08.044

Pytel, S., Sitek, S., Chmielewska, M., Zuzańska-Żyśko, E., Runge, A., & Markiewicz-Patkowska, J. (2021). Transformation directions of brownfields: The case of the Górnośląsko-Zagłębiowska Metropolis. Sustainability, 13(4). doi: https://doi.org/10.3390/su13042075

Tang, L., & Ren, H. (2010). Modelling and a segmented dynamic programming-based heuristic approach for the slab stack shuffling problem. Computers & Operations Research, 37(2), 368–375. doi: https://doi.org/10.1016/j.cor.2009.05.011

Teck, S., Dewil, R., & Vansteenwegen, P. (2024). A simulation-based genetic algorithm for a semi-automated warehouse scheduling problem with processing time variability. Applied Soft Computing, 160, 111713. doi: https://doi.org/10.1016/j.asoc.2024.111713

Wang, Q., Veerapaneni, R., Wu, Y., Li, J., & Likhachev, M. (2024, 05). MAPF in 3D warehouses: Dataset and analysis. Proceedings of the International Conference on Automated Planning and Scheduling, 34, 623–632. doi: https://doi.org/10.1609/icaps.v34i1.31525

Wang, Z., Zhou, C., Che, A., & Gao, J. (2024). A policy-based Monte Carlo tree search method for container pre-marshalling. International Journal of Production Research, 62(13), 4776–4792. doi: https://doi.org/10.1080/00207543.2023.2279130

A Multi-AMR Buffer Storage, Retrieval, and Reshuffling Problem - Complete Model

Element | Notation | Description

Sets and Indices
Set of unit loads | $\mathcal{N}$, $n$ | Range: $\{1, 2, \ldots, N\}$
Set of time steps | $\mathcal{T}$, $t$ | Range: $\{1, 2, \ldots, T\}$
Set of all lanes | $\mathcal{I}$, $i, k$ | Range: $\{0, 1, \ldots, I\}$
Set of buffer lanes | $\mathcal{I}'$ | Range: $\{1, 2, \ldots, I-1\}$ (excluding sink and source)
Set of slots | $\mathcal{J}_{i,k}$, $j, l$ | Range: $\{1, 2, \ldots, J\}$
Set of vehicles | $\mathcal{V}$, $v$ | Range: $\{1, 2, \ldots, V\}$

General Notation and Parameters
Unit load | $(u_n, r_n, a_n)$ | A unit load with label $u_n$. Retrieval window opens at $r_n$, arrival window at $a_n$.
Slot | $[i, j]$ | Slot in the $i$th lane and $j$th slot (deepest $j = 1$, at the perimeter $j = J$).
Outermost slot | $[i, J_i]$ | The outermost slot $J_i$ in lane $i$.
Sink | $[I, 1] = [I, J_I]$ | Modelled as a slot in the lane with the highest value for $i$.
Source | $[0, 1] = [0, J_I]$ | Modelled as a slot in the lane with the lowest value for $i$.
Relocation move | $[i, j] \to [k, l]$ | A relocation move from slot $[i, j]$ to $[k, l]$.
Retrieval move | $[i, j] \to [I, J]$ | Retrieving a unit load from slot $[i, j]$.
Empty drive | $[i, j] \dashrightarrow [k, l]$ | Driving the AMR from slot $[i, j]$ to $[k, l]$ without transporting a load.
Storage move | $[0, 1] \to [i, j]$ | Storing a unit load to slot $[i, j]$.
Retrieval window | $[r_n, r_n + \rho_n]$ | Time window for retrieval of $u_n$; $\rho_n$ is the maximum delay.
Arrival window | $[a_n, a_n + \alpha_n]$ | Time window for arrival of $u_n$; $\alpha_n$ is the maximum delay.
Distance | $d_{ijkl}$ | Distance between the slots $[i, j]$ and $[k, l]$.
Travel time | $\tau_{ijkl}$ | Travel time (incl. handling time) between the slots $[i, j]$ and $[k, l]$.

Table 7: Sets, indices, and general notation for the Integer Programming model

\[ b_{ijnt} = \begin{cases} 1 & \text{if unit load } n \text{ is in } [i,j] \text{ at time } t, \\ 0 & \text{otherwise;} \end{cases} \quad \forall i \in \mathcal{I},\ \forall j \in \mathcal{J}_i,\ \forall n \in \mathcal{N},\ \forall t \in \mathcal{T} \tag{A.1} \]

\[ x_{ijklntv} = \begin{cases} 1 & \text{if unit load } n \text{ is relocated from } [i,j] \text{ to } [k,l] \text{ at time } t \text{ by AMR } v, \\ 0 & \text{otherwise;} \end{cases} \quad \forall i,k \in \mathcal{I}',\ \forall j \in \mathcal{J}_i,\ \forall l \in \mathcal{J}_k,\ \forall n \in \mathcal{N},\ \forall t \in \mathcal{T},\ \forall v \in \mathcal{V} \tag{A.2} \]

\[ y_{ijntv} = \begin{cases} 1 & \text{if unit load } n \text{ is retrieved from } [i,j] \text{ at time } t \text{ by AMR } v, \\ 0 & \text{otherwise;} \end{cases} \quad \forall i \in \mathcal{I} \setminus \{I\},\ \forall j \in \mathcal{J}_i,\ \forall n \in \mathcal{N},\ \forall t \in \mathcal{T},\ \forall v \in \mathcal{V} \tag{A.3} \]

\[ g_{nt} = \begin{cases} 1 & \text{if unit load } n \text{ has been retrieved at time } t' \in \{1, \ldots, t-1\}, \\ 0 & \text{otherwise;} \end{cases} \quad \forall n \in \mathcal{N},\ \forall t \in \mathcal{T} \tag{A.4} \]

\[ z_{ijntv} = \begin{cases} 1 & \text{if unit load } n \text{ is stored in } [i,j] \text{ at time } t \text{ by AMR } v, \\ 0 & \text{otherwise;} \end{cases} \quad \forall i \in \mathcal{I}',\ \forall j \in \mathcal{J}_i,\ \forall n \in \mathcal{N},\ \forall t \in \mathcal{T},\ \forall v \in \mathcal{V} \tag{A.5} \]

\[ s_{nt} = \begin{cases} 1 & \text{if unit load } n \text{ has been stored at time } t' \in \{1, \ldots, t-1\}, \\ 0 & \text{otherwise;} \end{cases} \quad \forall n \in \mathcal{N},\ \forall t \in \mathcal{T} \tag{A.6} \]

\[ e_{ijkltv} = \begin{cases} 1 & \text{if AMR } v \text{ drives from slot } [i,j] \text{ to slot } [k,l] \text{ at time } t \text{ without transporting a unit load,} \\ 0 & \text{otherwise;} \end{cases} \quad \forall i,k \in \mathcal{I},\ \forall j \in \mathcal{J}_i,\ \forall l \in \mathcal{J}_k,\ \forall t \in \mathcal{T},\ \forall v \in \mathcal{V} \tag{A.7} \]

\[ c_{ijtv} = \begin{cases} 1 & \text{if AMR } v \text{ is at slot } [i,j] \text{ at time } t, \\ 0 & \text{otherwise;} \end{cases} \quad \forall i \in \mathcal{I},\ \forall j \in \mathcal{J}_i,\ \forall t \in \mathcal{T},\ \forall v \in \mathcal{V} \tag{A.8} \]

\[ T = \max_{n \in \mathcal{N},\, m \in \mathcal{N},\, d_n < \infty} \{ a_n + \alpha_n,\ d_m + \delta_n \} \tag{A.9} \]

\[ \tau_{ijkl} = \begin{cases} \max(1, d_{ijkl}) & \text{if performing an empty drive } e_{ijklt}, \\ \max(1, d_{ijkl} + 2h) & \text{otherwise;} \end{cases} \tag{A.10} \]

Starting Constraints

\[ b_{ijn1} = \begin{cases} 1, & \text{if unit load } n \text{ starts in slot } [i,j] \\ 0, & \text{otherwise} \end{cases} \quad \forall i \in \mathcal{I}',\ \forall j \in \mathcal{J}_i,\ \forall n \in \mathcal{N} \tag{A.11} \]

\[ s_{n1} = \begin{cases} 1, & \text{if unit load } n \text{ is initially stored in the buffer zone} \\ 0, & \text{otherwise} \end{cases} \quad \forall n \in \mathcal{N} \tag{A.12} \]

\[ c_{ij1v} = \begin{cases} 1, & \text{if vehicle } v \text{ starts in slot } [i,j] \\ 0, & \text{otherwise} \end{cases} \quad \forall i \in \mathcal{I},\ \forall j \in \mathcal{J}_i,\ \forall v \in \mathcal{V} \tag{A.13} \]

Objective function

\[ \min \sum_{i \in \mathcal{I}} \sum_{j \in \mathcal{J}_i} \sum_{t \in \mathcal{T}} \sum_{v \in \mathcal{V}} \Bigg[ \sum_{n \in \mathcal{N}} \big( y_{ijntv} \cdot d_{ijI1} + z_{ijntv} \cdot d_{01ij} \big) + \sum_{k \in \mathcal{I}} \sum_{l \in \mathcal{J}_k} \Big( \sum_{n \in \mathcal{N}} x_{ijklntv} + e_{ijkltv} \Big) \cdot d_{ijkl} \Bigg] \tag{A.14} \]

Subject to:

\[ \sum_{i \in \mathcal{I}'} \sum_{j \in \mathcal{J}_i} b_{ijnt} \le s_{nt}, \quad \forall n \in \mathcal{N},\ \forall t \in \mathcal{T} \tag{A.15} \]

\[ g_{nt} \ge s_{nt} - \sum_{i \in \mathcal{I}'} \sum_{j \in \mathcal{J}_i} b_{ijnt}, \quad \forall n \in \mathcal{N},\ \forall t \in \mathcal{T} \tag{A.16} \]

\[ \sum_{n \in \mathcal{N}} b_{ijnt} \le 1, \quad \forall i \in \mathcal{I}',\ \forall j \in \mathcal{J}_i,\ \forall t \in \mathcal{T} \tag{A.17} \]

\[ \sum_{n \in \mathcal{N}} b_{ijnt} \ge \sum_{n \in \mathcal{N}} b_{i(j+1)nt}, \quad \forall i \in \mathcal{I}',\ \forall j \in \mathcal{J}_i \setminus \{J_i\},\ \forall t \in \mathcal{T} \tag{A.18} \]

\[ \sum_{k \in \mathcal{I}'} \sum_{l \in \mathcal{J}_k} \sum_{n \in \mathcal{N}} x_{ijklntv} + \sum_{k \in \mathcal{I}} \sum_{l \in \mathcal{J}_k} e_{ijkltv} + \sum_{n \in \mathcal{N}} y_{ijntv} \le c_{ijtv}, \quad \forall i \in \mathcal{I}',\ \forall j \in \mathcal{J}_i,\ \forall t \in \mathcal{T},\ \forall v \in \mathcal{V} \tag{A.19} \]

\[ \sum_{k \in \mathcal{I}} \sum_{l \in \mathcal{J}_k} e_{I1kltv} \le c_{I1tv}, \quad \forall t \in \mathcal{T},\ \forall v \in \mathcal{V} \tag{A.20} \]

\[ \sum_{k \in \mathcal{I}'} \sum_{l \in \mathcal{J}_k} z_{klntv} + \sum_{k \in \mathcal{I}} \sum_{l \in \mathcal{J}_k} e_{01kltv} + y_{01ntv} \le c_{01tv}, \quad \forall t \in \mathcal{T},\ \forall v \in \mathcal{V} \tag{A.21} \]

\[ \sum_{v \in \mathcal{V}} y_{01ntv} + s_{nt} \le 1, \quad \forall n \in \mathcal{N},\ \forall t \in \mathcal{T} \tag{A.22} \]

\[ \sum_{v \in \mathcal{V}} \Big( y_{ijntv} + \sum_{k \in \mathcal{I}'} \sum_{l \in \mathcal{J}_k} x_{ijklntv} \Big) \le b_{ijnt}, \quad \forall i \in \mathcal{I}',\ \forall j \in \mathcal{J}_i,\ \forall n \in \mathcal{N},\ \forall t \in \mathcal{T} \tag{A.23} \]

\[ b_{ijnt} = b_{ijn(t-1)} + \sum_{v \in \mathcal{V}} \Big[ \sum_{k \in \mathcal{I}'} \sum_{l \in \mathcal{J}_k} \big( x_{klijn(t-\tau_{klij})v} - x_{ijkln(t-1)v} \big) - y_{ijn(t-1)v} + z_{ijn(t-\tau_{01ij})v} \Big], \quad \forall i \in \mathcal{I}',\ \forall j \in \mathcal{J}_i,\ \forall n \in \mathcal{N},\ \forall t \in \mathcal{T} \setminus \{1\} \tag{A.24} \]

\[ g_{nt} = \sum_{i \in \mathcal{I} \setminus \{I\}} \sum_{j \in \mathcal{J}_i} \sum_{t'=1}^{t-1} \sum_{v \in \mathcal{V}} y_{ijnt'v}, \quad \forall n \in \mathcal{N},\ \forall t \in \mathcal{T} \tag{A.25} \]

\[ s_{nt} = s_{n1} + \sum_{i \in \mathcal{I}'} \sum_{j \in \mathcal{J}_i} \sum_{t'=1}^{t-\tau_{01ij}} \sum_{v \in \mathcal{V}} z_{ijnt'v}, \quad \forall n \in \mathcal{N},\ \forall t \in \mathcal{T} \tag{A.26} \]

\[ \begin{aligned} c_{ijtv} = {} & c_{ij(t-1)v} + \sum_{n \in \mathcal{N}} z_{ijn(t-\tau_{01ij})v} - \sum_{n \in \mathcal{N}} y_{ijn(t-1)v} + \sum_{k \in \mathcal{I}'} \sum_{l \in \mathcal{J}_k} \sum_{n \in \mathcal{N}} x_{klijn(t-\tau_{klij})v} \\ & + \sum_{k \in \mathcal{I}} \sum_{l \in \mathcal{J}_k} e_{klij(t-\tau_{klij})v} - \sum_{k \in \mathcal{I}'} \sum_{l \in \mathcal{J}_k} \sum_{n \in \mathcal{N}} x_{ijkln(t-1)v} - \sum_{k \in \mathcal{I}} \sum_{l \in \mathcal{J}_k} e_{ijkl(t-1)v}, \\ & \forall i \in \mathcal{I}',\ \forall j \in \mathcal{J}_i,\ \forall t \in \mathcal{T} \setminus \{1\},\ \forall v \in \mathcal{V} \end{aligned} \tag{A.27} \]

\[ c_{I1tv} = c_{I1(t-1)v} + \sum_{i \in \mathcal{I} \setminus \{I\}} \sum_{j \in \mathcal{J}_i} \sum_{n \in \mathcal{N}} y_{ijn(t-\tau_{ijI1})v} + \sum_{i \in \mathcal{I}} \sum_{j \in \mathcal{J}_i} \big( e_{ijI1(t-\tau_{ijI1})v} - e_{I1ij(t-1)v} \big), \quad \forall t \in \mathcal{T} \setminus \{1\},\ \forall v \in \mathcal{V} \tag{A.28} \]

\[ c_{01tv} = c_{01(t-1)v} - \sum_{i \in \mathcal{I}'} \sum_{j \in \mathcal{J}_i} \sum_{n \in \mathcal{N}} z_{ijn(t-1)v} + \sum_{i \in \mathcal{I}} \sum_{j \in \mathcal{J}_i} \big( e_{ij01(t-\tau_{ijIJ})v} - e_{01ij(t-1)v} \big) - \sum_{n \in \mathcal{N}} y_{01n(t-1)v}, \quad \forall t \in \mathcal{T} \setminus \{1\},\ \forall v \in \mathcal{V} \tag{A.29} \]

\[ \sum_{i \in \mathcal{I} \setminus \{I\}} \sum_{j \in \mathcal{J}_i} \sum_{t=1}^{r_n - \tau_{ijI1} - 1} \sum_{v \in \mathcal{V}} y_{ijntv} = 0, \quad \forall n \in \mathcal{N} \tag{A.30} \]

\[ \sum_{i \in \mathcal{I} \setminus \{I\}} \sum_{j \in \mathcal{J}_i} \sum_{t = r_n - \tau_{ijI1}}^{r_n + \rho_n - \tau_{ijI1}} \sum_{v \in \mathcal{V}} y_{ijntv} = 1, \quad \forall n \in \mathcal{N} \tag{A.31} \]

\[ \sum_{i \in \mathcal{I} \setminus \{I\}} \sum_{j \in \mathcal{J}_i} \sum_{t = r_n + \rho_n - \tau_{ijI1} + 1}^{T} \sum_{v \in \mathcal{V}} y_{ijntv} = 0, \quad \forall n \in \mathcal{N} \tag{A.32} \]

\[ \sum_{i \in \mathcal{I}'} \sum_{j \in \mathcal{J}_i} \sum_{t=1}^{a_n - 1} \sum_{v \in \mathcal{V}} z_{ijntv} = 0, \quad \forall n \in \mathcal{N} \tag{A.33} \]

\[ \sum_{i \in \mathcal{I}'} \sum_{j \in \mathcal{J}_i} \sum_{t = a_n}^{a_n + \alpha_n} \sum_{v \in \mathcal{V}} z_{ijntv} + \sum_{t = r_n - \tau_{01I1}}^{\min(r_n + \rho_n - \tau_{01I1},\, a_n + \alpha_n)} \sum_{v \in \mathcal{V}} y_{01ntv} = 1, \quad \forall n \in \mathcal{N} \tag{A.34} \]

\[ \sum_{i \in \mathcal{I}'} \sum_{j \in \mathcal{J}_i} \sum_{t = a_n + \alpha_n + 1}^{T} \sum_{v \in \mathcal{V}} z_{ijntv} = 0, \quad \forall n \in \mathcal{N} \tag{A.35} \]

\[ \begin{aligned} \sum_{v \in \mathcal{V}} \Bigg[ & \sum_{j \in \mathcal{J}_i} c_{ijtv} + \sum_{j \in \mathcal{J}_i} \Big( \sum_{n \in \mathcal{N}} \sum_{t' \in \Omega^{in}_{z}} z_{ijnt'v} + \sum_{k,l} \sum_{n \in \mathcal{N}} \sum_{t' \in \Omega^{in}_{x}} x_{klijnt'v} + \sum_{k,l} \sum_{t' \in \Omega^{in}_{e}} e_{klijt'v} \Big) \\ & + \sum_{j \in \mathcal{J}_i} \Big( \sum_{n \in \mathcal{N}} \sum_{t' \in \Omega^{out}_{y}} y_{ijnt'v} + \sum_{k,l} \sum_{n \in \mathcal{N}} \sum_{t' \in \Omega^{out}_{x}} x_{ijklnt'v} + \sum_{k,l} \sum_{t' \in \Omega^{out}_{e}} e_{ijklt'v} \Big) \Bigg] \le 1, \quad \forall i \in \mathcal{I}',\ \forall t \in \mathcal{T} \end{aligned} \tag{A.36} \]

\[ \sum_{n \in \mathcal{N}} \big( x_{ijklntv} + y_{ijntv} + e_{ijkltv} \big) \le 1 - \sum_{n \in \mathcal{N}} b_{i(j+1)nt}, \quad \forall i \in \mathcal{I}',\ \forall j \in \mathcal{J}_i \setminus \{1\},\ \forall k \in \mathcal{I}',\ \forall l \in \mathcal{J}_k,\ \forall t \in \mathcal{T},\ \forall v \in \mathcal{V} \tag{A.37} \]

B NP-hardness of the BSRRP Problem

The Block Relocation Problem (BRP) is known to be NP-hard (Caserta et al., 2012). We establish the NP-hardness of the Buffer Storage, Retrieval, and Reshuffling Problem by providing a polynomial-time reduction from any instance of BRP to BSRRP.

B.1 Problem Definitions

Definition B.1 (BRP - Block Relocation Problem) Given:
• A set of containers $C = \{c_1, \ldots, c_N\}$
• A set of stacks $S = \{s_1, \ldots, s_M\}$ with maximum height $H$
• A priority function $p : C \to \{1, \ldots, N\}$ assigning unique priorities to containers
• An initial configuration function $f : C \to S \times \{1, \ldots, H\}$ mapping containers to positions
Find a sequence of relocations minimizing the total number of moves $k$ while retrieving containers in ascending priority order.

Definition B.2 (BSRRP - Buffer Storage, Retrieval, and Reshuffling Problem) Given:
• A set of unit loads $U = \{u_1, \ldots, u_N\}$
• A buffer zone with lanes $L = \{l_1, \ldots, l_M\}$ of maximum depth $H$
• Time windows $[e_i, l_i]$ for each unit load $u_i$
• An initial configuration function $g : U \to L \times \{1, \ldots, H\}$
Find a sequence of relocations minimizing total travel distance $D$ while retrieving unit loads within their time windows.

B.2 Polynomial-Time Reduction

We present a polynomial-time transformation $T$ that maps any instance $I_{BRP}$ of BRP to an instance $I_{BSRRP} = T(I_{BRP})$ of BSRRP.

Definition B.3 (Transformation T) Let $I_{BRP} = (C, S, p, f, H)$ be any instance of BRP. We construct $I_{BSRRP} = (U, L, W, g, H)$ as follows:

1. Structure Preservation:
• Create a set of unit loads $U$ such that $|U| = |C|$ (same number of elements)
• Create a set of lanes $L$ such that $|L| = |S|$ (same number of stacks/lanes)
• For each container $c_i \in C$, create a corresponding unit load $u_i \in U$
• For each container position $(s, h) = f(c_i)$, set the position of the corresponding unit load $g(u_i) = (l, h)$, where $l$ is the lane corresponding to stack $s$

2. Time Window Construction: Let $k_{\max}(N) = \frac{N(N-1)}{2}$ be the maximum possible relocations for $N = |U| = |C|$ unit loads. For each $u_i$ corresponding to container $c_i$ with priority $p(c_i) = i$:
• $e_i = (i - 1) \cdot (k_{\max}(N) + 1) + 1$
• $l_i = i \cdot (k_{\max}(N) + 1)$

3. Distance Metric: Define the distance function $d(x, y)$ between any two positions $x, y$ in the BSRRP instance as follows:

\[ d(x, y) = \begin{cases} 1, & \text{if the move from } x \text{ to } y \text{ involves carrying a unit load (loaded move)} \\ 0, & \text{if the move from } x \text{ to } y \text{ does not involve carrying a unit load (unloaded move)} \end{cases} \]

This distance metric directly counts the number of loaded moves (relocations and retrievals), since unloaded moves contribute zero distance.

B.3 Proof of Correctness

Lemma B.1 (Correctness of Mapping T) The mapping of containers $c_i$ to unit loads $u_i$ and stacks $s_j$ to lanes $l_j$ defined by the transformation $T$ preserves all accessibility relationships between the elements.

Proof B.1 As described in the definition of transformation $T$, the following holds:
• For each $c_i \in C$, there exists a corresponding $u_i \in U$.
• For each $s_j \in S$, there exists a corresponding $l_j \in L$.
• The position of each element $u_i$ in $I_{BSRRP}$ corresponds to the position of the corresponding element $c_i$ in $I_{BRP}$ (i.e., if $f(c_i) = (s, h)$, then $g(u_i) = (l, h)$, where $l$ is the lane corresponding to $s$).
Therefore, for any containers $c_i, c_j \in C$:
• If $c_i$ blocks $c_j$ in $I_{BRP}$, then $u_i$ blocks $u_j$ in $I_{BSRRP}$.
• If $c_i$ is accessible in $I_{BRP}$, then $u_i$ is accessible in $I_{BSRRP}$.
• The mapping $g$ preserves the relative positions of all elements.
Thus, any valid access and relocation sequence in one problem has a corresponding valid sequence in the other problem.
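The time-window construction of Definition B.3 can be sketched in a few lines. This is a minimal illustration rather than the authors' implementation; the function names `k_max` and `build_time_windows` are hypothetical and chosen here only for readability.

```python
def k_max(n: int) -> int:
    """Upper bound N(N-1)/2 on relocations, as used in Definition B.3."""
    return n * (n - 1) // 2


def build_time_windows(n: int) -> dict[int, tuple[int, int]]:
    """Return {i: (e_i, l_i)} for unit loads with BRP priorities 1..n."""
    width = k_max(n) + 1  # time units reserved per unit load
    return {i: ((i - 1) * width + 1, i * width) for i in range(1, n + 1)}


windows = build_time_windows(3)
# Consecutive windows must not overlap: e_i = l_{i-1} + 1.
assert all(windows[i][0] == windows[i - 1][1] + 1 for i in range(2, 4))
print(windows)  # {1: (1, 4), 2: (5, 8), 3: (9, 12)}
```

For N = 3 the bound is k_max = 3, so each unit load receives a window of four time units, and the windows tile the horizon from 1 to 12 without gaps.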
Lemma B.2 (Time Window Correctness) The construction of the time windows enforces the priority ordering of BRP while allowing for all necessary relocations for each unit load in $I_{BSRRP}$.

Proof B.2 The time windows $[e_i, l_i]$ for each unit load $u_i$ (corresponding to container $c_i$ with BRP priority $i$) are defined as $e_i = (i - 1)(k_{\max}(N) + 1) + 1$ and $l_i = i(k_{\max}(N) + 1)$. This construction creates sequential and strictly nonoverlapping time windows such that $e_i = l_{i-1} + 1$. Consequently, operations related to $u_i$ can only begin after the time window for $u_{i-1}$ has closed, thus directly enforcing the ascending priority order of the BRP.

The duration of each time window for $u_i$ is $l_i - e_i + 1 = k_{\max}(N) + 1$ time units. In the BSRRP instance, each unit of time corresponds to a loaded move (either a relocation or a retrieval), according to the defined distance metric. The value $k_{\max}(N) = \frac{N(N-1)}{2}$ represents an upper bound on the total number of relocations required for the entire BRP instance. Allocating $k_{\max}(N) + 1$ time units (i.e., potential loaded moves) for each individual unit load $u_i$ is therefore amply sufficient to accommodate:
• Any relocations necessary to access $u_i$ after $u_1, \ldots, u_{i-1}$ have been retrieved. The number of such relocations for $u_i$ alone will not exceed $N - 1$, which is less than or equal to $k_{\max}(N)$ for $N \ge 2$.
• The single loaded move required for the retrieval of $u_i$ itself.
Thus, any valid sequence of BRP operations can be mapped to the BSRRP instance within the constructed time windows, respecting the priority order and allowing for all required relocations and retrievals.

Theorem B.3 (Solution Equivalence) An optimal solution to $I_{BRP}$ with $k$ relocations exists if and only if an optimal solution to $I_{BSRRP}$ with total distance $D = k + N$ exists.
Proof B.3 Let $Sol_{BRP}$ be a feasible sequence of operations for $I_{BRP}$ involving $k$ relocations and $N$ mandatory retrievals. Let $Sol_{BSRRP}$ be the corresponding sequence of operations for $I_{BSRRP}$. Under the defined distance metric, unloaded moves in $Sol_{BSRRP}$ have a distance of 0 and loaded moves have a distance of 1. Each of the $k$ relocations in $Sol_{BRP}$ corresponds to exactly one loaded move in $Sol_{BSRRP}$ (moving the relocated item). Each of the $N$ retrievals in $Sol_{BRP}$ corresponds to exactly one loaded move in $Sol_{BSRRP}$ (moving the retrieved item out). Therefore, the total distance $D$ for $Sol_{BSRRP}$ is precisely the sum of the distances of the loaded moves: $D = (k \times 1) + (N \times 1) = k + N$.

($\Rightarrow$) Optimal BRP Solution implies Optimal BSRRP Solution: Assume $Sol^*_{BRP}$ is an optimal solution to $I_{BRP}$ with $k^*$ relocations. The corresponding $Sol^*_{BSRRP}$ has total distance $D^* = k^* + N$. Suppose, for contradiction, that $Sol^*_{BSRRP}$ is not optimal for $I_{BSRRP}$. Then there must exist a different feasible solution $Sol'_{BSRRP}$ with total distance $D' < D^*$. Since distance only counts loaded moves, $D'$ must correspond to some number of relocations $k'$ and the $N$ retrievals, such that $D' = k' + N$. Given $D' < D^*$, we have $k' + N < k^* + N$, which implies $k' < k^*$. The sequence $Sol'_{BSRRP}$ corresponds to a valid sequence $Sol'_{BRP}$ with $k'$ relocations (due to Lemma B.1 and Lemma B.2). This contradicts the assumption that $Sol^*_{BRP}$ with $k^*$ relocations was optimal for $I_{BRP}$. Therefore, $Sol^*_{BSRRP}$ must be optimal for $I_{BSRRP}$.

($\Leftarrow$) Optimal BSRRP Solution implies Optimal BRP Solution: Assume $Sol^*_{BSRRP}$ is an optimal solution to $I_{BSRRP}$ with total distance $D^*$. As shown above, $D^*$ must be equal to the number of loaded moves, which corresponds to $k^*$ relocations and $N$ retrievals, so $D^* = k^* + N$. This solution $Sol^*_{BSRRP}$ corresponds to a valid BRP solution $Sol^*_{BRP}$ with $k^* = D^* - N$ relocations. Suppose, for contradiction, that $Sol^*_{BRP}$ is not optimal for $I_{BRP}$. Then there must exist a different feasible solution $Sol'_{BRP}$ with $k'$ relocations such that $k' < k^*$. From the ($\Rightarrow$) direction, this $Sol'_{BRP}$ corresponds to a BSRRP solution $Sol'_{BSRRP}$ with total distance $D' = k' + N$. Since $k' < k^*$, it follows that $D' < D^*$. This contradicts the assumption that $Sol^*_{BSRRP}$ with distance $D^*$ was optimal for $I_{BSRRP}$. Therefore, $Sol^*_{BRP}$ must be optimal for $I_{BRP}$.

Thus, an optimal solution with $k$ relocations exists for $I_{BRP}$ if and only if an optimal solution with total distance $D = k + N$ exists for $I_{BSRRP}$.

B.4 Polynomial-Time Reduction Complexity

The transformation $T$ operates in polynomial time with respect to the input size (primarily $N$):
• Configuration mapping: $O(N)$
• Time window computation: the loop runs $N$ times with constant-time arithmetic operations inside, giving $O(N)$.
• Distance metric definition: $O(1)$
Thus, the transformation $T$ is polynomial-time.

B.5 Conclusion

Since the reduction runs in polynomial time, preserves solution feasibility and optimality for all instances, and the BRP is proven NP-hard, the BSRRP is also NP-hard. □
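As a small illustration of the accounting behind $D = k + N$ in Theorem B.3, the sketch below evaluates the 0/1 distance metric of Definition B.3 on a hand-written move sequence. The `Move` type and the example sequence are hypothetical and serve only to make the counting argument concrete; they are not taken from the authors' code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Move:
    kind: str  # "relocation", "retrieval", or "empty"

    @property
    def loaded(self) -> bool:
        # Relocations and retrievals carry a unit load; empty drives do not.
        return self.kind in ("relocation", "retrieval")


def total_distance(moves: list[Move]) -> int:
    """Sum the 0/1 metric: loaded moves cost 1, unloaded moves cost 0."""
    return sum(1 for move in moves if move.loaded)


# Hypothetical sequence for N = 3 unit loads needing k = 2 relocations.
sequence = [
    Move("relocation"), Move("empty"), Move("retrieval"),
    Move("empty"), Move("relocation"), Move("empty"),
    Move("retrieval"), Move("empty"), Move("retrieval"),
]
k = sum(move.kind == "relocation" for move in sequence)
n = sum(move.kind == "retrieval" for move in sequence)
assert total_distance(sequence) == k + n
print(total_distance(sequence))  # 5
```

The empty drives between loaded moves do not change the total, which is why minimizing $D$ in the constructed instance is equivalent to minimizing the number of relocations $k$.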
