The Multi-AMR Buffer Storage, Retrieval, and Reshuffling Problem: Exact and Heuristic Approaches

Buffer zones are essential in production systems to decouple sequential processes. In dense floor storage environments, such as space-constrained brownfield facilities, manual operation is increasingly challenged by severe labor shortages and rising …

Authors: Max Disselnmeyer, Thomas Bömer, Laura Dörr

Max Disselnmeyer ∗, Thomas Bömer †, Laura Dörr ‡, Bastian Amberg §, Anne Meyer ¶

Abstract

Buffer zones are essential in production systems to decouple sequential processes. In dense floor storage environments, such as space-constrained brownfield facilities, manual operation is increasingly challenged by severe labor shortages and rising operational costs. Automating these zones requires solving the Buffer Storage, Retrieval, and Reshuffling Problem (BSRRP). While previous work has addressed scenarios in which the focus is limited to reshuffling and retrieving a fixed set of items, real-world manufacturing necessitates an adaptive approach that also incorporates arriving unit loads. This paper introduces the Multi-AMR BSRRP, coordinating a robot fleet to manage concurrent reshuffling, alongside time-windowed storage and retrieval tasks, within a shared floor area. We formulate a Binary Integer Programming (IP) model to obtain exact solutions for benchmarking purposes. As the problem is NP-hard, rendering exact methods computationally intractable for industrial scales, we propose a hierarchical heuristic. This approach decomposes the problem into an A* search for task-level sequence planning of unit load placements and a Constraint Programming (CP) approach for multi-robot coordination and scheduling. Experiments demonstrate orders-of-magnitude computation time reductions compared to the exact formulation. These results confirm the heuristic's viability as responsive control logic for high-density production environments.

Keywords: Autonomous Mobile Robots (AMR), Buffer Management, Integer Programming, Production Logistics, Multi-Robot Coordination.
∗ Karlsruhe Institute of Technology, Zirkel 2, 76131 Karlsruhe, Germany, max.disselnmeyer@kit.edu, ORCID: https://orcid.org/0009-0008-5689-2235
† Karlsruhe Institute of Technology, Zirkel 2, 76131 Karlsruhe, Germany, thomas.boemer@kit.edu, ORCID: https://orcid.org/0000-0003-4979-7455
‡ Karlsruhe Institute of Technology, Zirkel 2, 76131 Karlsruhe, Germany, laura.doerr@kit.edu, ORCID: https://orcid.org/0000-0002-8007-1815
§ Karlsruhe Institute of Technology, Zirkel 2, 76131 Karlsruhe, Germany, bastian.amberg@kit.edu, ORCID: https://orcid.org/0000-0001-6715-3819
¶ Karlsruhe Institute of Technology, Zirkel 2, 76131 Karlsruhe, Germany, anne.meyer@kit.edu, ORCID: https://orcid.org/0000-0001-6380-1348

Contents

1 Introduction
2 Related Work
2.1 Foundations and Related Problems
2.2 Extensions for Modeling the Buffer Storage, Retrieval, and Reshuffling Problem
2.3 Heuristic and Learning-Based Approaches
2.4 Research Gap
3 The Multi-AMR Buffer Storage, Retrieval, and Reshuffling Problem
3.1 Static Lanes and Graph Overlay
3.2 Model Simplifications
3.3 Objective Function and Model Flexibility
4 Exact Problem Formulation
4.1 Notation
4.2 Decision Variables
4.3 Objective Function
4.4 Constraints
4.5 Computational Complexity
5 Heuristic Approach
5.1 Priority Queue Generation
5.2 Operation Sequencing via A* Search
5.2.1 Direct Retrieval Pruning
5.2.2 State Space and Transitions
5.2.3 Cost Function
5.3 Multi-AMR Scheduling via CP-SAT
5.3.1 Model Formulation
5.3.2 Objective Function
5.4 Trajectory Repair
6 Computational Experiments
6.1 Experimental Design and Dataset
6.1.1 Instance Generation and Parameters
6.1.2 Computational Environment
6.1.3 Exact Formulation Experiments and Benchmark Set
6.2 Quantitative Benchmarking against Exact Formulation
6.2.1 Quantitative Analysis
6.3 Qualitative Analysis of Solution Behavior
6.3.1 Reactive Conflict Resolution (8 × 3 Layout)
6.3.2 Spatial Strategies in Brownfield Layouts
6.3.3 Spatial Decision Policy
6.4 Layout Sensitivity and Saturation Points
6.5 Managerial Insights
6.5.1 Layout Fragmentation and Access Efficiency
6.5.2 Capacity-Adaptive Breathing Topology
6.5.3 The 90% Stability Threshold
6.5.4 Operational Robustness and Fleet Scalability
7 Conclusion and Future Work
A Multi-AMR Buffer Storage, Retrieval, and Reshuffling Problem - Complete Model
B NP-hardness of the BSRRP Problem
B.1 Problem Definitions
B.2 Polynomial-Time Reduction
B.3 Proof of Correctness
B.4 Polynomial-Time Reduction Complexity
B.5 Conclusion

1 Introduction

The demand for automation in production supply and intralogistics, especially for autonomous mobile robots (AMR), is growing to optimize material flow and address critical labor shortages (Descartes Systems Group, 2023; Pytel et al., 2021). These challenges are amplified in brownfield manufacturing facilities, which, despite benefits regarding sustainability and land availability, present hurdles such as spatial limitations and legacy infrastructure (Andulkar, Le, & Berger, 2018). These constraints force the use of dense storage in the form of deep floor-level block layouts, where accessibility is restricted. AMRs offer greater flexibility than traditional Automated Guided Vehicles (AGVs) or forklifts, significantly improving material flow and productivity (Fragapane, de Koster, Sgarbossa, & Strandhagen, 2021). According to Grand View Research (2025), the AMR market is expected to grow by 14.4% from 2026 to 2033.

Efficient management of buffer zones (temporary storage areas decoupling production stages) is a critical challenge in these space-constrained environments. In manual operation, these zones suffer from labor shortages, operational inefficiencies, and time lost searching for materials. A representative real-world example for automation from the surface coating industry is illustrated in Figure 1, where a buffer zone must be integrated into the irregular residual space surrounding existing machinery.

Figure 1: Real-world buffer scenario from the surface coating industry. (a) Schematic layout discretizing the residual space around a high-gloss coating machine into static LIFO storage lanes (legend: buffer slots with south, west, and east access; source; sink; high-gloss coating machine; aisles; access point). (b) View from the source/sink area towards the coating machine (background wall on the left), highlighting the storage density. (c) AMR designed to autonomously transport the specialized unit loads within constrained aisles. Automating material flow in such space-constrained brownfield facilities requires solving the Buffer Storage, Retrieval, and Reshuffling Problem (BSRRP) to ensure continuous machine supply.

Automating buffer zones with AMRs requires solving the Buffer Storage, Retrieval, and Reshuffling Problem (BSRRP). While previous research has established a model for buffer scenarios in which the focus is limited to retrieving and reshuffling a fixed set of unit loads (Disselnmeyer, Bömer, Pfrommer, & Meyer, 2024), real-world manufacturing environments also include storage. They involve an inflow of new unit loads and require the coordination of multiple AMRs within confined spaces, necessitating advanced optimization to avoid collisions and deadlocks while meeting strict retrieval deadlines.
This paper addresses these requirements by extending this retrieval-focused formulation to incorporate storage decisions and adapting it to the Multi-AMR setting. Our contributions are as follows:

• Exact Formulation (EF): We develop a Binary Integer Programming (IP) model for the Multi-AMR BSRRP. It couples storage, retrieval, and reshuffling decisions in dense storage, managing a fleet restricted to perimeter access like conventional forklifts.
• Complexity Analysis: We establish the computational complexity of the BSRRP, referencing a formal proof of NP-hardness via reduction from the Block Relocation Problem (BRP).
• Hierarchical Heuristic: We propose a scalable hierarchical heuristic that combines an A* algorithm for creating storage, retrieval, and reshuffling decisions with a Constraint Programming (CP) formulation for precise multi-AMR scheduling.
• Rigorous Validation: We employ a validation methodology in which heuristic solutions are injected into the exact IP model. This allows us to verify feasibility using a commercial solver and explicitly quantify the optimality gap, providing a ground-truth benchmark for the heuristic's performance.

The remainder of this paper is structured as follows: Section 2 reviews the relevant literature on the Block Relocation Problem and multi-AMR coordination to situate the BSRRP within the current research landscape. In Section 3, we provide a detailed description of the Multi-AMR BSRRP, including the logical subdivision of the storage space and the underlying operational assumptions. Section 4 presents the formal Binary Integer Programming formulation and discusses the computational complexity of the problem. The proposed hierarchical heuristic, combining A* with Constraint Programming, is detailed in Section 5.
Section 6 evaluates the performance of both the exact and heuristic approaches through extensive computational experiments and discusses managerial insights derived from the results. Finally, Section 7 concludes the paper and outlines avenues for future research.

2 Related Work

This section reviews the current state of research and defines the scientific context of the Multi-AMR BSRRP. First, we compare the problem to foundational approaches such as the Block Relocation Problem (BRP) and Multi-Agent Path Finding (MAPF). Second, we outline the specific modeling requirements for autonomous buffers operated by an AMR fleet, focusing on the transition from single-AMR models to integrated multi-AMR systems. Third, we discuss existing heuristics for complex logistics tasks. Finally, we identify the research gaps regarding holistic, real-time control in fragmented environments, which form the basis for the solutions presented in this paper.

2.1 Foundations and Related Problems

Container retrieval subject to strict LIFO constraints is formalized as the Block Relocation Problem (BRP) (Kim & Hong, 2006; Lersteau & Shen, 2022). This problem is closely related to the Pre-Marshalling Problem (PMP) and its intralogistics variant, the Unit-Load Pre-Marshalling Problem (UPMP) (Bömer, Disselnmeyer, & Meyer, 2025; Bömer, Pfrommer, Akizhanov, & Meyer, 2026; Pfrommer, Meyer, & Tierney, 2024). Analogously, the steel industry addresses the Slab Pre-Marshalling Problem (SPMP), which is governed by identical LIFO stacking constraints (Ge, Meng, Liu, Tang, & Zhao, 2020). The BRP serves as a foundation for the Buffer Reshuffling and Retrieval Problem (BRR), which focuses on unit load relocation and retrieval in constrained spaces (Disselnmeyer et al., 2024).
However, the Multi-AMR BSRRP presented in this paper introduces decisive additional challenges: the arrival and storage of new unit loads into the buffer, near real-time decision-making, and the coordination of multiple AMRs. Unlike PMP and UPMP, the BSRRP explicitly couples reshuffling and retrieval tasks with continuous fleet navigation in a shared workspace. While BRP research often assumes single-crane operations (Ji, Guo, Zhu, & Yang, 2015; Tang & Ren, 2010), the BSRRP utilizes AMRs with greater operational flexibility, enabling coordinated navigation within a shared workspace without rigid segmentation.

Furthermore, the BSRRP is distinct from other related optimization problems. It differs from the Yard Crane Scheduling Problem (YCSP) by avoiding crane-specific constraints like non-crossing (Kizilay & Eliiyi, 2021). While space-constrained AGV scheduling (Chen, Tiong, & Chen, 2019) addresses system deadlocks through capacity-aware production scheduling, it typically lacks the active reshuffling logic required for the deep LIFO stacks intrinsic to the BSRRP. The problem also differs from the Parallel Stack Loading Problem (PSLP) by managing continuous evolution rather than just initial placement (Boge & Knust, 2020). Unlike the Storage Location Assignment Problem (SLAP), which provides static optimal snapshots (Charris, Rojas-Reyes, & Montoya-Torres, 2018), the BSRRP manages ongoing reorganization. Additionally, while the Pallet Retrieval and Processing Problem (PRPP) optimizes the interface between transport and processing (Buckow, Goerigk, & Knust, 2025), it neglects the complex internal reshuffling defined by the BSRRP. Finally, unlike Robotic Mobile Fulfillment Systems (RMFS), which often deal with stochastic human picking times (Teck, Dewil, & Vansteenwegen, 2024), the BSRRP focuses on deterministic reshuffling to minimize travel distance.

2.2 Extensions for Modeling the Buffer Storage, Retrieval, and Reshuffling Problem

This study extends previous research on the BRR problem (Disselnmeyer et al., 2024) by incorporating storage operations into the buffer and multi-AMR management. We build upon the foundational BRP model by Borjian, Manshadi, Barnhart, and Jaillet (2015) for storing, retrieving, and reshuffling container stacks in a single-crane yard, adapting it to address key requirements for autonomous buffer zones:

1. Objective Function: Prioritizing the minimization of total AMR travel distance rather than the number of moves, which is the traditional focus of the BRP literature.
2. Unrestricted Relocation: Allowing the relocation of any blocking unit load, not just those directly blocking the target; the latter restriction is a common assumption in the BRP literature.
3. Retrieval Time Windows: Incorporating strict time windows for retrieval to ensure process synchronization.
4. Time-Based Modeling: Utilizing discrete time steps for precise multi-AMR coordination and collision avoidance.
5. Storage Decisions: Accommodating new unit loads requiring placement during ongoing operations, rather than focusing solely on reshuffling and retrieval tasks.

2.3 Heuristic and Learning-Based Approaches

Given the NP-hard nature of BRP variants, exact methods are often intractable for real-time decision-making, prompting the widespread use of heuristics (Kim & Hong, 2006; Lersteau & Shen, 2022). However, the BSRRP extends beyond the purely combinatorial challenge of reshuffling: it necessitates the translation of moves into collision-free trajectories. Consequently, the problem encompasses both pathfinding, commonly solved using A* or Multi-Agent Path Finding (MAPF) techniques (Q. Wang, Veerapaneni, Wu, Li, & Likhachev, 2024), and task allocation, which relates to the Vehicle Routing Problem (VRP) (Archetti, Coelho, Speranza, & Vansteenwegen, 2025; Dantzig & Ramser, 1959).

This complexity highlights the potential of hybrid decomposition approaches. For example, Bömer, Koltermann, Pfrommer, and Meyer (2024) successfully applied a sequential method combining A* search for reshuffling logic with a mixed-integer program for multi-AMR tour planning in the UPMP context. This demonstrates the viability of leveraging well-established heuristics to solve coupled subproblems sequentially. In the broader context of pre-marshalling, recent advances have also employed Monte Carlo Tree Search (MCTS) to effectively manage the cascading chain effects of reshuffling moves (Z. Wang, Zhou, Che, & Gao, 2024).

Most recently, concurrent research has explicitly addressed the intersection of multi-agent pathfinding and movable obstacles in dense environments: Hu, Zhao, and Ren (2025) introduced M-PAMO (MAPF Among Movable Obstacles), utilizing Conflict-Based Search (CBS) to resolve dependencies between agents and movable blockers. Addressing extreme density, Makino and Ito (2025) proposed MAPF-HD (MAPF for high-density environments), utilizing swapping heuristics to manage obstructing agents. Finally, Fu et al. (2026) formalized the Block Rearrangement Problem (BRaP) for dense storage grids. Their approach models the system as a discrete sliding-tile puzzle where agents navigate within the grid to rearrange blocks. Complementing this, Geft, Zhang, Yu, and Bekris (2026) investigate the theoretical limits of relocation-free retrieval under uncertainty. They prove that relocations can be eliminated through robust storage arrangements if specific empty space ratios are maintained.
While these works establish important foundations for dense storage, they assume internal grid accessibility or focus on static sequence optimization. In contrast, the BSRRP addresses scenarios where AMRs are restricted to perimeter access (see Figure 2) due to the characteristics of the layout, unit loads, or AMR fleet, necessitating logic to handle LIFO access constraints from the outside. Furthermore, unlike the focus on minimizing moves between static snapshots found in these approaches, our work addresses the continuous temporal synchronization required for floor-handling AMRs to meet strict time windows.

Figure 2: Conceptual comparison of buffer accessibility: (a) Access from the Perimeter, as defined in the BSRRP; (b) In-Grid Access, typical for Robotic Mobile Fulfillment Systems (RMFS), where agents navigate within the grid. In the former, AMRs are restricted to the exterior aisles, necessitating strategic reshuffling of obstructing unit loads to access deep LIFO slots.

2.4 Research Gap

Despite the extensive literature on BRP and autonomous logistics, a significant research gap remains regarding the Multi-AMR BSRRP. As summarized in Table 1, existing research tends to isolate specific subproblems, failing to address the interdependent complexities of modern brownfield intralogistics. Specifically, the table categorizes state-of-the-art approaches across key methodological dimensions: handling operations, access mode (physical constraints), fleet configuration, temporal representation, and the method used to solve the corresponding problem. Regarding the temporal representation, we distinguish between three levels of abstraction that define when the system state is evaluated:

• Moves treat operations as a logical sequence; the system state is only updated between complete crane or robot movements, effectively ignoring durations.
• Steps partition time into fixed, synchronous intervals (ticks); the entire system state is recalculated at every interval (e.g., every 5 seconds), which is common in grid-based pathfinding but imposes rigid synchronization.
• Continuous representations allow for variable task durations. While implemented on a discrete time grid for computational feasibility, the model treats travel times as distance-dependent parameters rather than fixed steps, enabling precise synchronization with external deadlines.

Table 1: Comparison of related literature in the field of Block Relocation and Buffer Reshuffling.

Reference                         | Handling Operations              | Access Mode | Fleet    | Temporal Representation | Method
----------------------------------|----------------------------------|-------------|----------|-------------------------|------------------
Caserta, Schwarze, and Voß (2012) | Reshuffling & Retrieval          | Perimeter   | 1 Crane  | Moves                   | IP
Borjian et al. (2015)             | Storage, Reshuffling & Retrieval | Perimeter   | 1 Crane  | Continuous              | IP
Ji et al. (2015)                  | Reshuffling & Retrieval          | Perimeter   | n Cranes | Moves                   | Genetic Algo.
Bömer et al. (2024)               | Reshuffling only                 | Perimeter   | n AMRs   | Moves                   | CP
Disselnmeyer et al. (2024)        | Reshuffling & Retrieval          | Perimeter   | 1 AMR    | Continuous              | IP
Hu et al. (2025)                  | Reshuffling only                 | In-Grid     | n AMRs   | Steps                   | CBS Search
Fu et al. (2026)                  | Reshuffling & Retrieval          | In-Grid     | n AMRs   | Steps                   | Symbolic Planning
This Paper                        | Storage, Reshuffling & Retrieval | Perimeter   | n AMRs   | Continuous              | IP + Heuristic

Temporal Rep.: Moves (abstract sequence), Steps (discrete synchronous), Continuous (variable durations).

Based on this comparison, we identify three specific gaps in the current body of work:

1. Lack of Integrated Models: Most existing approaches address only fragments of the problem. For instance, the BRR model (Disselnmeyer et al., 2024) effectively optimizes the reshuffling and retrieval of a fixed set of unit loads but neglects the storage of unit loads arriving from upstream processes. Conversely, literature on the dynamic BRP (e.g., Borjian et al. (2015)) accounts for incoming containers but typically minimizes the number of moves for a single crane.
This metric is insufficient for AMR fleets, where minimizing travel distance and execution time is critical to meeting strict retrieval deadlines in a spatially distributed buffer. Consequently, there is no model that integrates the conflicting objectives of storage, retrieval, and reshuffling into a single formulation.
2. Insufficient Multi-AMR Coordination for Perimeter Access: While the coordination of multiple agents is central to MAPF, scalability remains a hurdle in dense, interactive environments (Hua, Wang, & Ji, 2024; Q. Wang et al., 2024). Standard MAPF ignores the manipulation of unit loads and focuses solely on computing collision-free paths for the agents. Conversely, multi-crane BRP literature addresses manipulation but relies on zoning strategies or rail-bound constraints (Ji et al., 2015) that do not apply to flexible AMRs. Furthermore, recent concurrent studies on Block Rearrangement or Movable Obstacles (Fu et al., 2026; Hu et al., 2025) model the problem as a discrete sliding-tile puzzle where agents navigate within the grid (grid-based traffic control). This abstraction differs fundamentally from the BSRRP, where AMRs access the buffer from the perimeter. There is currently no model that couples the combinatorial complexity of BRP reshuffling with the coordinated traffic management of a multi-AMR fleet required to meet strict service time windows.
3. Absence of Fleet-Aware Heuristics: Addressing the BSRRP requires tightly integrating storage assignment, reshuffling logic, retrieval sequencing, and vehicle routing. Solving these problems in isolation is known to yield suboptimal results, as the interdependencies between subproblems are lost (ElWakil, Eltawil, & Gheith, 2022).
While heuristics exist for different BRP variants (see the survey by Lersteau & Shen, 2022), there is a lack of scalable approaches specifically designed to couple the combinatorial storage, retrieval, and reshuffling decisions with the spatio-temporal constraints of multi-AMR fleet routing.

3 The Multi-AMR Buffer Storage, Retrieval, and Reshuffling Problem

This section formally defines the Multi-AMR Buffer Storage, Retrieval, and Reshuffling Problem (BSRRP). We describe the physical storage environment, introduce the concept of Static Lanes as a static graph overlay to manage accessibility, and detail the objective function used to coordinate the fleet.

3.1 Static Lanes and Graph Overlay

The buffer consists of a continuous floor area where unit loads are stacked directly on the ground. To ensure accessibility and prevent deadlocks in this dense environment, we overlay a fixed graph structure onto the continuous space. Following the approach of Pfrommer, Meyer, and Tierney (2022), we partition the storage area into a set of Static Lanes, denoted as $\mathcal{I}$.

Figure 3 illustrates this concept using a 5 × 5 buffer layout. As shown in Figure 3a, the floor is first discretized into a grid of storage slots. In a second step (Figure 3b), these slots are grouped into Static Lanes based on their accessibility from the perimeter aisles.

Figure 3: Logical decomposition of the storage area. (a) Physical grid layout (5 × 5): the continuous floor is discretized into slots. (b) Static lane overlay: these slots are grouped into Static Lanes (colored), where each lane acts as a LIFO stack accessible from the perimeter. The grey numbered rectangles represent the unit loads.

This partitioning serves three critical operational purposes:

1. LIFO Constraints: Each static lane $i \in \mathcal{I}$ functions as a Last-In-First-Out (LIFO) stack. As depicted in Figure 3b, a lane consists of a sequence of floor slots.
To retrieve a unit load stored deep within a lane $i$, all blocking unit loads placed in front of it (i.e., closer to the aisle) must first be reshuffled to empty slots in other lanes $k \in \mathcal{I} \setminus \{i\}$.
2. Deadlock Prevention: A major challenge in multi-AMR systems on dense grids is congestion. To prevent circular deadlocks, we enforce a strict resource locking mechanism: only one AMR may enter a lane at any given time. If multiple AMRs were to enter a single lane $i$ simultaneously, the leading robot would be blocked by the following one, causing a deadlock. Consequently, a lane $i \in \mathcal{I}$ is locked by an AMR during entry and exit operations.
3. Static Lane Configuration: While the allocation of unit loads to slots is dynamic, the lane layout itself (the set $\mathcal{I}$) is static. For the instances analyzed in this study, the lane configuration is pre-computed (e.g., using a maximum-flow network formulation (Pfrommer et al., 2022)) to maximize storage density while ensuring connectivity. Once generated, this layout remains fixed throughout the runtime.

3.2 Model Simplifications

To maintain computational tractability while capturing the core dynamics of the system, we apply the following abstractions:

• Kinematics: We assume constant AMR velocity and negligible acceleration/deceleration phases. This allows us to model travel time as a linear function of distance on the graph.
• Handling Times: Unit load handling (picking up and setting down) is modeled with constant duration, assuming standardized load carriers.
• Idealized Environment: We assume a deterministic environment without disruptions (e.g., human traffic or breakdowns), focusing the optimization on the logical coordination of the fleet.

3.3 Objective Function and Model Flexibility

The primary objective of the BSRRP is to minimize the total distance traveled by the AMR fleet while strictly satisfying all service time windows.
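
As a small illustration of the lane rules from Section 3.1, the following sketch models a single static lane as a LIFO stack with an exclusive lock. It is a minimal example under stated assumptions, not the paper's implementation: the class name `StaticLane`, its methods, and the load and AMR labels are hypothetical and chosen only for this illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class StaticLane:
    """Hypothetical model of one static lane; index 0 is the deepest slot."""
    capacity: int
    loads: List[str] = field(default_factory=list)  # deepest load first
    locked_by: Optional[str] = None  # AMR currently inside the lane, if any

    def can_store(self) -> bool:
        """A lane accepts a new load only while a slot is still free."""
        return self.capacity > len(self.loads)

    def blocking_loads(self, target: str) -> List[str]:
        """Loads between the target and the aisle, in the order they must leave."""
        depth = self.loads.index(target)
        return list(reversed(self.loads[depth + 1:]))

    def try_lock(self, amr: str) -> bool:
        """Exclusive access: at most one AMR may hold the lane at a time."""
        if self.locked_by is None or self.locked_by == amr:
            self.locked_by = amr
            return True
        return False

    def unlock(self, amr: str) -> None:
        """Release the lane, but only if the caller actually holds it."""
        if self.locked_by == amr:
            self.locked_by = None


# UL1 is deepest, UL7 sits next to the aisle.
lane = StaticLane(capacity=3, loads=["UL1", "UL4", "UL7"])
to_reshuffle = lane.blocking_loads("UL1")
first_lock = lane.try_lock("amr_a")
second_lock = lane.try_lock("amr_b")
print(to_reshuffle, first_lock, second_lock)
```

In this example, retrieving `UL1` first requires moving `UL7` and then `UL4` to other lanes, and the second AMR is refused entry while the first one holds the lane.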
Given the set of unit loads $\mathcal{N}$ designated for retrieval, where each load $n \in \mathcal{N}$ has a release time $r_n$ and a deadline $d_n$ (derived from the retrieval window), the solver must determine a sequence of storage, reshuffling, and retrieval moves such that:

1. Every retrieval job is completed within its time window $[r_n, d_n]$.
2. No static lane constraints (LIFO, capacity, exclusive access) are violated.
3. The sum of distances for all loaded and empty AMR moves is minimized.

The cumulative distance serves as a robust proxy for efficiency, reducing energy consumption and hardware wear. Furthermore, by strictly enforcing time window constraints, the model acts as a validation tool for strategic planning: if the operational requirements exceed the fleet's capacity given the layout, the model returns an infeasibility status, signaling the need for layout adjustments or fleet expansion.

4 Exact Problem Formulation

We formulate the Multi-AMR Buffer Storage, Retrieval, and Reshuffling Problem (BSRRP) as a binary integer program, referred to as the Exact Formulation (EF). This model builds directly upon the Buffer Reshuffling and Retrieval formulation introduced by Disselnmeyer et al. (2024). We extend this model to a multi-AMR setting by incorporating unit load arrivals, conceptually following the dynamic BRP formulation by Borjian et al. (2015), while adding explicit constraints for collision-free fleet coordination and floor-level maneuvering.

4.1 Notation

Let $\mathcal{N} = \{1, \dots, N\}$ be the set of unit loads, $\mathcal{I} = \{0, \dots, I\}$ the set of lanes, and $\mathcal{T} = \{1, \dots, T\}$ the set of time steps. The fleet of AMRs is denoted by $\mathcal{V} = \{1, \dots, V\}$. A specific storage location is defined as $[i, j]$ for lane $i \in \mathcal{I}$ and depth position $j \in \mathcal{J}_i = \{1, \dots, J_i\}$.
The planning horizon $T$ is the latest closing time of any arrival window $[a_n, a_n + \alpha_n]$ or retrieval window $[r_n, r_n + \rho_n]$, where $a_n$ and $r_n$ are the window start times and $\alpha_n$ and $\rho_n$ the window lengths:

$$T = \max_{n, m \in \mathcal{N}} \{ a_n + \alpha_n,\; r_m + \rho_m \} \tag{4.1}$$

The parameter $\tau_{ijkl}$ represents the time cost to move between slots. It defaults to the distance $d_{ijkl}$, but includes handling time $h$ for loaded moves: $\tau_{ijkl} = \max(1, d_{ijkl} + 2h)$. We enforce a lower bound of 1 to ensure that every action consumes time. This prevents the solver from scheduling instantaneous zero-cost idle loops, ensuring that waiting explicitly advances the system state. Table 2 summarizes the sets, indices, and parameters.

Table 2: Sets, indices, and parameters

Notation                                   | Description
-------------------------------------------|------------------------------------------------------------------------
$\mathcal{N}, \mathcal{V}, \mathcal{T}$    | Sets of unit loads, AMRs, and time steps.
$\mathcal{I}, \mathcal{I}'$                | Set of all lanes (incl. I/O); set of all buffer lanes (excl. I/O).
$[i, j]$                                   | Slot at lane $i$ and position $j$ ($j = 1$ is deepest in the lane).
$d_{ijkl}$                                 | Distance between slots $[i, j]$ and $[k, l]$.
$\tau_{ijkl}$                              | Travel and handling time (cost) between slots $[i, j]$ and $[k, l]$.
$h$                                        | Handling time for picking up or dropping a unit load.
$[a_n, a_n + \alpha_n]$                    | Arrival time window for unit load $n$.
$[r_n, r_n + \rho_n]$                      | Retrieval time window for unit load $n$.

4.2 Decision Variables

We classify the binary decision variables into AMR actions and system state indicators. All variables are defined as 1 if the condition holds and 0 otherwise. First, AMR decision variables determine the specific tasks started by each vehicle $v$ at time $t$. We define variables for reshufflings ($x$), retrievals ($y$), and the storage of new arrivals ($z$). Additionally, the variable $e$ explicitly models empty travelling to accurately track the positions of the AMRs. Second, state variables are required to track the physical configuration of the system.
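
Returning briefly to the parameters of Section 4.1, the sketch below computes the planning horizon of Eq. (4.1) and the time cost $\tau$ for a toy instance. The function names, the dictionary layout, and the numbers are hypothetical. The treatment of empty moves as `max(1, d)` is an assumption read from the statement that $\tau$ "defaults to the distance" with a lower bound of 1; the source gives the explicit formula only for loaded moves.

```python
from typing import Dict, Tuple

Window = Tuple[int, int]  # (window start, window length) in time steps


def planning_horizon(arrival: Dict[str, Window], retrieval: Dict[str, Window]) -> int:
    """Eq. (4.1): latest closing time over all arrival and retrieval windows."""
    ends = [start + length for start, length in arrival.values()]
    ends += [start + length for start, length in retrieval.values()]
    return max(ends)


def time_cost(distance: int, handling: int, loaded: bool) -> int:
    """Time cost tau between two slots.

    Loaded moves add pick-up and drop-off handling (2h), as in the text.
    Empty moves are assumed to cost the plain distance.
    Both are bounded below by 1 so that every action consumes time.
    """
    raw = distance + 2 * handling if loaded else distance
    return max(1, raw)


# Hypothetical toy instance: UL1 arrives and is later retrieved, UL2 is only retrieved.
arrival = {"UL1": (2, 3)}                    # (a_n, alpha_n)
retrieval = {"UL1": (10, 4), "UL2": (6, 5)}  # (r_n, rho_n)

horizon = planning_horizon(arrival, retrieval)
loaded_cost = time_cost(distance=5, handling=2, loaded=True)
idle_cost = time_cost(distance=0, handling=2, loaded=False)
print(horizon, loaded_cost, idle_cost)
```

For this instance the horizon is 14 (the retrieval window of `UL1` closes last), the loaded move costs 9 time steps, and waiting in place still costs one time step.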
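The variable $b$ maps the inventory of unit loads to slots, while $c$ tracks the spatio-temporal position of every AMR to prevent collisions. The auxiliary variables $s$ and $g$ monitor the completion status of storage and retrieval requests, respectively.

AMR Decision Variables: Control the movement and handling tasks of each vehicle $v$.

$$x^{ijkl}_{ntv} = 1 \text{ if AMR } v \text{ relocates } n \text{ from } [i,j] \text{ to } [k,l] \text{ at } t, \quad \forall i, k \in \mathcal{I}',\ \forall j \in \mathcal{J}_i,\ \forall l \in \mathcal{J}_k,\ \forall n \in \mathcal{N},\ \forall t \in \mathcal{T},\ \forall v \in \mathcal{V} \tag{4.2}$$

$$y^{ij}_{ntv} = 1 \text{ if AMR } v \text{ retrieves } n \text{ from } [i,j] \text{ at } t, \quad \forall i \in \mathcal{I} \setminus \{I\},\ \forall j \in \mathcal{J}_i,\ \forall n \in \mathcal{N},\ \forall t \in \mathcal{T},\ \forall v \in \mathcal{V} \tag{4.3}$$

$$z^{ij}_{ntv} = 1 \text{ if AMR } v \text{ stores } n \text{ into } [i,j] \text{ at } t, \quad \forall i \in \mathcal{I}',\ \forall j \in \mathcal{J}_i,\ \forall n \in \mathcal{N},\ \forall t \in \mathcal{T},\ \forall v \in \mathcal{V} \tag{4.4}$$

$$e^{ijkl}_{tv} = 1 \text{ if AMR } v \text{ performs an empty drive from } [i,j] \text{ to } [k,l] \text{ at } t, \quad \forall i, k \in \mathcal{I},\ \forall j \in \mathcal{J}_i,\ \forall l \in \mathcal{J}_k,\ \forall t \in \mathcal{T},\ \forall v \in \mathcal{V} \tag{4.5}$$

State Variables: Track the location and completion status of loads and vehicles.

$$b^{ij}_{nt} = 1 \text{ if unit load } n \text{ is in slot } [i,j] \text{ at time } t, \quad \forall i \in \mathcal{I}',\ \forall j \in \mathcal{J}_i,\ \forall n \in \mathcal{N},\ \forall t \in \mathcal{T} \tag{4.6}$$

$$c^{ij}_{tv} = 1 \text{ if AMR } v \text{ is present at } [i,j] \text{ at time } t, \quad \forall i \in \mathcal{I},\ \forall j \in \mathcal{J}_i,\ \forall t \in \mathcal{T},\ \forall v \in \mathcal{V} \tag{4.7}$$

$$s_{nt} = 1 \text{ if unit load } n \text{ is stored by time } t - 1, \quad \forall n \in \mathcal{N},\ \forall t \in \mathcal{T} \tag{4.8}$$

$$g_{nt} = 1 \text{ if unit load } n \text{ is retrieved by time } t - 1, \quad \forall n \in \mathcal{N},\ \forall t \in \mathcal{T} \tag{4.9}$$

4.3 Objective Function

The objective is to minimize the total travel distance of the AMR fleet. Minimizing distance serves as a proxy for energy efficiency and reduces hardware wear. Furthermore, reducing unnecessary travel alleviates congestion, which indirectly aids the throughput of the system. Note that service deadlines are treated as hard constraints to guarantee service levels; therefore, the objective focuses purely on efficiency rather than tardiness penalties.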
The function sums the distances for retrieval ($y$), storage ($z$), reshuffling ($x$), and empty travel ($e$):

$$\min \sum_{t \in \mathcal{T}} \sum_{v \in \mathcal{V}} \left[ \sum_{n \in \mathcal{N}} \sum_{i \in \mathcal{I}} \sum_{j \in \mathcal{J}_i} \left( y^{ij}_{ntv}\, d_{ijI1} + z^{ij}_{ntv}\, d_{01ij} \right) + \sum_{i, k \in \mathcal{I}} \sum_{j \in \mathcal{J}_i} \sum_{l \in \mathcal{J}_k} d_{ijkl} \left( e^{ijkl}_{tv} + \sum_{n \in \mathcal{N}} x^{ijkl}_{ntv} \right) \right] \tag{4.10}$$

4.4 Constraints

The feasible region is governed by flow conservation, physical consistency, time windows, and traffic rules.

Flow Conservation & State Updates

We explicitly model the system dynamics using four flow conservation constraints that link the discrete AMR actions to the state of the unit loads and vehicles.

Constraints (4.11) define the storage status $s_{nt}$. A unit load $n$ is considered stored at time $t$ if it was stored initially ($s_{n1}$) or if a storage action $z$ transporting it from the source has been completed. This summation explicitly accounts for the travel time $\tau_{01ij}$, ensuring the status only updates after the transport is finished.

Constraints (4.12) analogously track the retrieval status $g_{nt}$. This variable is updated to 1 if a retrieval action $y$ has been initiated for unit load $n$ at any previous time step. Unlike storage, this updates at the start of the action to prevent the load from being accessed again.

Constraints (4.13) govern the slot occupancy $b^{ij}_{nt}$ for every buffer slot $[i, j]$. The state at time $t$ is determined by the state at $t - 1$, plus any unit loads arriving via reshuffling ($x$) or new storage ($z$) after their respective travel times, minus any loads leaving the slot due to reshuffling ($x$) or retrieval ($y$).

Finally, constraints (4.14) ensure spatio-temporal continuity for the mobile robots. The AMR position variable $c^{ij}_{tv}$ tracks whether the vehicle $v$ is in the slot $[i, j]$ at time $t$.
The constraint updates the position by adding vehicles arriving from storage ($z$), retrieval ($y$), reshuffling ($x$), or empty travel ($e$) tasks, accounting for the specific travel duration $\tau$ of each, and subtracting vehicles that depart to initiate these tasks.

$$s_{nt} = s_{n1} + \sum_{i \in \mathcal{I}'} \sum_{j \in \mathcal{J}_i} \sum_{t'=1}^{t - \tau_{01ij}} \sum_{v \in \mathcal{V}} z^{ij}_{nt'v} \quad \forall n \in \mathcal{N},\ \forall t \in \mathcal{T} \tag{4.11}$$

$$g_{nt} = \sum_{i \in \mathcal{I} \setminus \{I\}} \sum_{j \in \mathcal{J}_i} \sum_{t'=1}^{t-1} \sum_{v \in \mathcal{V}} y^{ij}_{nt'v} \quad \forall n \in \mathcal{N},\ \forall t \in \mathcal{T} \tag{4.12}$$

$$b^{ij}_{nt} = b^{ij}_{n(t-1)} + \sum_{v \in \mathcal{V}} \left[ \sum_{k \in \mathcal{I}'} \sum_{l \in \mathcal{J}_k} \left( x^{klij}_{n(t - \tau_{klij})v} - x^{ijkl}_{n(t-1)v} \right) - y^{ij}_{n(t-1)v} + z^{ij}_{n(t - \tau_{01ij})v} \right] \quad \forall i \in \mathcal{I}',\ \forall j \in \mathcal{J}_i,\ \forall n \in \mathcal{N},\ \forall t \in \mathcal{T} \setminus \{1\} \tag{4.13}$$

$$\begin{aligned} c^{ij}_{tv} = {} & c^{ij}_{(t-1)v} + \sum_{n \in \mathcal{N}} z^{ij}_{n(t - \tau_{01ij})v} - \sum_{n \in \mathcal{N}} y^{ij}_{n(t-1)v} \\ & + \sum_{k \in \mathcal{I}'} \sum_{l \in \mathcal{J}_k} \sum_{n \in \mathcal{N}} x^{klij}_{n(t - \tau_{klij})v} + \sum_{k \in \mathcal{I}} \sum_{l \in \mathcal{J}_k} e^{klij}_{(t - \tau_{klij})v} \\ & - \sum_{k \in \mathcal{I}'} \sum_{l \in \mathcal{J}_k} \sum_{n \in \mathcal{N}} x^{ijkl}_{n(t-1)v} - \sum_{k \in \mathcal{I}} \sum_{l \in \mathcal{J}_k} e^{ijkl}_{(t-1)v} \end{aligned} \quad \forall i \in \mathcal{I}',\ \forall j \in \mathcal{J}_i,\ \forall t \in \mathcal{T} \setminus \{1\},\ \forall v \in \mathcal{V} \tag{4.14}$$

Note: Eq. (4.14) defines the motion for storage lanes $\mathcal{I}'$; analogous conservation constraints apply for the Source ($c^{01}_{tv}$) and Sink ($c^{I1}_{tv}$) nodes.

Initialization

To accurately simulate the system's evolution, we must define its starting state at $t = 1$ based on the input instance. Constraints (4.15) map the initial slot occupancy, setting the state variable $b^{ij}_{n1}$ to 1 if unit load $n$ occupies slot $[i, j]$ at the start of the planning horizon. Constraints (4.16) distinguish between initially stored unit loads and future arrivals; the storage status $s_{n1}$ is initialized to 1 for loads already present in the buffer, and 0 for those arriving at later time steps. Finally, Constraints (4.17) establish the initial AMR positions, assigning each AMR $v$ to its designated starting coordinates $[i, j]$ by setting the position variable $c^{ij}_{1v}$ accordingly.
$$b^{ij}_{n1} = \begin{cases} 1, & \text{if unit load } n \text{ starts in slot } [i,j] \\ 0, & \text{otherwise} \end{cases} \quad \forall i \in \mathcal{I}',\ \forall j \in \mathcal{J}_i,\ \forall n \in \mathcal{N} \tag{4.15}$$

$$s_{n1} = \begin{cases} 1, & \text{if unit load } n \text{ is initially stored in the buffer} \\ 0, & \text{otherwise} \end{cases} \quad \forall n \in \mathcal{N} \tag{4.16}$$

$$c^{ij}_{1v} = \begin{cases} 1, & \text{if AMR } v \text{ starts in slot } [i,j] \\ 0, & \text{otherwise} \end{cases} \quad \forall i \in \mathcal{I},\ \forall j \in \mathcal{J}_i,\ \forall v \in \mathcal{V} \tag{4.17}$$

System Consistency

We ensure spatial and operational integrity through a set of constraints governing capacity, lane structure, and object permanence. Constraints (4.18) limit the capacity of each storage slot $[i, j]$, ensuring it holds at most one unit load at any given time $t$. Constraints (4.19) mandate a dense storage policy to ensure gapless lane utilization; a unit load may only occupy the outer position $j + 1$ if the adjacent inner position $j$ is also occupied. This ensures that the lanes are filled continuously, reflecting the physical constraints of floor block storage. Constraints (4.20) ensure that an AMR $v$ can only initiate a retrieval ($y$) or reshuffling ($x$) for a unit load $n$ from slot $[i, j]$ if that load is present ($b^{ij}_{nt} = 1$). Finally, Constraints (4.21) link the vehicle's actions to its physical location. The sum of all tasks (reshuffling $x$, empty travel $e$, and retrieval $y$) initiated by AMR $v$ at slot $[i, j]$ is bounded by the presence variable $c^{ij}_{tv}$, ensuring a robot must be at a location to operate from it.

$$\sum_{n \in \mathcal{N}} b^{ij}_{nt} \le 1 \quad \forall i \in \mathcal{I}',\ \forall j \in \mathcal{J}_i,\ \forall t \in \mathcal{T} \tag{4.18}$$

$$\sum_{n \in \mathcal{N}} b^{ij}_{nt} \ge \sum_{n \in \mathcal{N}} b^{i(j+1)}_{nt} \quad \forall i \in \mathcal{I}',\ \forall j \in \mathcal{J}_i \setminus \{J_i\},\ \forall t \in \mathcal{T} \tag{4.19}$$

$$\sum_{v \in \mathcal{V}} \left( y^{ij}_{ntv} + \sum_{k \in \mathcal{I}'} \sum_{l \in \mathcal{J}_k} x^{ijkl}_{ntv} \right) \le b^{ij}_{nt} \quad \forall i \in \mathcal{I}',\ \forall j \in \mathcal{J}_i,\ \forall n \in \mathcal{N},\ \forall t \in \mathcal{T} \tag{4.20}$$

$$\sum_{k \in \mathcal{I}'} \sum_{l \in \mathcal{J}_k} \sum_{n \in \mathcal{N}} x^{ijkl}_{ntv} + \sum_{k \in \mathcal{I}} \sum_{l \in \mathcal{J}_k} e^{ijkl}_{tv} + \sum_{n \in \mathcal{N}} y^{ij}_{ntv} \le c^{ij}_{tv} \quad \forall i \in \mathcal{I}',\ \forall j \in \mathcal{J}_i,\ \forall t \in \mathcal{T},\ \forall v \in \mathcal{V} \tag{4.21}$$

Note: Constraints (4.21) restrict actions in buffer lanes; separate presence constraints govern the Source (for $z$, $y$, $e$) and Sink (for $e$).
For brevity, we omit the explicit formulation and refer to the complete model in Appendix A.

Time Windows

We model time windows as hard constraints, rendering any solution that misses a deadline infeasible. Constraints (4.22) mandate that every unit load $n$ is successfully retrieved exactly once within its designated window $[r_n, r_n + \rho_n]$. Since the deadline applies to the arrival at the Sink $[I, 1]$, the valid start time for a retrieval action $y^{ij}_{ntv}$ from a specific slot $[i, j]$ is shifted earlier by the travel time $\tau_{ijI1}$. Constraints (4.23) explicitly forbid retrieval actions outside this valid interval, preventing premature or tardy deliveries.

$$\sum_{i \in \mathcal{I} \setminus \{I\}} \sum_{j \in \mathcal{J}_i} \sum_{t = r_n - \tau_{ijI1}}^{r_n + \rho_n - \tau_{ijI1}} \sum_{v \in \mathcal{V}} y^{ij}_{ntv} = 1 \quad \forall n \in \mathcal{N} \tag{4.22}$$

$$\sum_{i \in \mathcal{I} \setminus \{I\}} \sum_{j \in \mathcal{J}_i} \left( \sum_{t = 1}^{r_n - \tau_{ijI1} - 1} \sum_{v \in \mathcal{V}} y^{ij}_{ntv} + \sum_{t = r_n + \rho_n - \tau_{ijI1} + 1}^{T} \sum_{v \in \mathcal{V}} y^{ij}_{ntv} \right) = 0 \quad \forall n \in \mathcal{N} \tag{4.23}$$

Note: Constraints (4.22) and (4.23) specifically govern retrieval windows. For brevity, we omit the explicit storage constraints, as they are mathematically symmetric to the retrieval case: they restrict the storage actions $z^{ij}_{ntv}$ at the Source to start in the arrival window $[a_n, a_n + \alpha_n]$ (see Appendix A).

Constraint (4.24) governs the entry logic for incoming unit loads, enforcing that each load $n$ is either stored in the buffer or directly cross-docked. The first term handles standard storage, ensuring the action $z$ occurs during the arrival window $[a_n, a_n + \alpha_n]$. The second term enables direct retrieval from the Source, which must satisfy two simultaneous conditions: the load must be available ($t \le a_n + \alpha_n$) and the resulting delivery to the Sink must meet the retrieval deadline ($t \ge r_n - \tau_{01I1}$). The upper bound of the summation enforces the tighter of these two limiting factors.
$$\sum_{i \in \mathcal{I}'} \sum_{j \in \mathcal{J}_i} \sum_{t = a_n}^{a_n + \alpha_n} \sum_{v \in \mathcal{V}} z^{ij}_{ntv} + \sum_{t = r_n - \tau_{01I1}}^{\min(r_n + \rho_n - \tau_{01I1},\; a_n + \alpha_n)} \sum_{v \in \mathcal{V}} y^{01}_{ntv} = 1 \quad \forall n \in \mathcal{N} \tag{4.24}$$

Traffic Control

We implement traffic rules by modeling each static lane $i$ as a unary resource that can accommodate at most one vehicle at any given time $t$. To formally capture lane occupancy during multi-step transitions, we define the set of relevant start times $\Omega(t, \tau) = \{ t' \in \mathcal{T} \mid t - \tau < t' \le t \}$. This set identifies all past time steps $t'$ where an action of duration $\tau$ initiated at $t'$ would still be active at the current time $t$.

Constraints (4.25) enforce the unary capacity by aggregating three distinct modes of occupation. The first term accounts for static presence, where an AMR is waiting in the lane ($c^{ij}_{tv}$). The second term captures incoming actions (storage $z$, incoming reshuffling and empty travel $x$, $e$). Since the lane is blocked from the moment an AMR enters, we sum over the full duration window $\Omega(t, \tau)$. The third term handles outgoing actions (retrieval $y$, outgoing reshuffling and empty travel $x$, $e$). To avoid double-counting the vehicle's presence at the instant of departure (which is already captured by $c^{ij}_{tv}$), we sum over the strictly past interval $\Omega^*(t, \tau) = \Omega(t, \tau) \setminus \{t\}$. Collectively, the sum of these binary indicators must not exceed 1, ensuring collision-free operations.

$$\begin{aligned} \sum_{v \in \mathcal{V}} \Bigg[ & \sum_{j \in \mathcal{J}_i} c^{ij}_{tv} \\ & + \sum_{j \in \mathcal{J}_i} \Bigg( \sum_{n \in \mathcal{N}} \sum_{t' \in \Omega(t, \tau_{\text{lane}}(j) + h)} z^{ij}_{nt'v} + \sum_{k \in \mathcal{I}'} \sum_{l \in \mathcal{J}_k} \sum_{n \in \mathcal{N}} \sum_{t' \in \Omega(t, \tau_{\text{lane}}(j) + h)} x^{klij}_{nt'v} + \sum_{k \in \mathcal{I}} \sum_{l \in \mathcal{J}_k} \sum_{t' \in \Omega(t, \tau_{\text{lane}}(j))} e^{klij}_{t'v} \Bigg) \\ & + \sum_{j \in \mathcal{J}_i} \Bigg( \sum_{n \in \mathcal{N}} \sum_{t' \in \Omega^*(t, \tau_{\text{lane}}(j) + h)} y^{ij}_{nt'v} + \sum_{k \in \mathcal{I}'} \sum_{l \in \mathcal{J}_k} \sum_{n \in \mathcal{N}} \sum_{t' \in \Omega^*(t, \tau_{\text{lane}}(j) + h)} x^{ijkl}_{nt'v} + \sum_{k \in \mathcal{I}} \sum_{l \in \mathcal{J}_k} \sum_{t' \in \Omega^*(t, \tau_{\text{lane}}(j))} e^{ijkl}_{t'v} \Bigg) \Bigg] \le 1 \end{aligned} \quad \forall i \in \mathcal{I}',\ \forall t \in \mathcal{T} \tag{4.25}$$

Finally, Constraints (4.26) enforce the physical accessibility of the block storage.
A slot $[i, j]$ is accessible only if the blocking slot $[i, j + 1]$ is empty. Consequently, any action (reshuffling $x$, retrieval $y$, or empty travel $e$) targeting or originating from $[i, j]$ is forbidden if $b^{i(j+1)}_{nt} = 1$.

$$\sum_{n \in \mathcal{N}} \left( x^{ijkl}_{ntv} + y^{ij}_{ntv} + e^{ijkl}_{tv} \right) \le 1 - \sum_{n \in \mathcal{N}} b^{i(j+1)}_{nt} \quad \forall i \in \mathcal{I}',\ \forall j \in \mathcal{J}_i \setminus \{J_i\},\ \forall k \in \mathcal{I}',\ \forall l \in \mathcal{J}_k,\ \forall t \in \mathcal{T},\ \forall v \in \mathcal{V} \tag{4.26}$$

4.5 Computational Complexity

The BSRRP generalizes the Block Relocation Problem (BRP), which is known to be NP-hard (Caserta et al., 2012). Consequently, the BSRRP is also NP-hard. This relationship can be established via a polynomial-time reduction where a standard BRP instance is mapped to a restricted BSRRP instance by fixing the fleet size to one ($|\mathcal{V}| = 1$), eliminating arrivals ($a_n = 1$), setting empty travel costs to zero, and normalizing all loaded travel distances to unity ($d_{ijkl} = 1$). Under these conditions, the BSRRP objective of minimizing total travel distance becomes mathematically equivalent to the BRP objective of minimizing the total number of relocations.

For a theoretical foundation, we provide a formal proof of this reduction in Appendix B. The proof explicitly constructs the mapping from BRP to BSRRP, demonstrating that intractability persists even under the specific constraints of our proposed model. Given this complexity, the EF is computationally intractable for large-scale instances, necessitating the heuristic approach described in Section 5.

5 Heuristic Approach

The Exact Formulation proposed in Section 4 provides an exact solution but becomes computationally intractable for large-scale instances due to the coupled complexity of multi-agent pathfinding, task scheduling, and the storing, retrieving, and reshuffling of unit loads. To address this, we propose a hierarchical heuristic approach that decomposes the global optimization problem into four sequential, tractable sub-problems.
This approach builds upon the multi-bay sorting strategies proposed by (Bömer et al., 2024), extending them to handle the storage and retrieval operations and the multi-AMR constraints of the BSRRP.

[Figure 4: Overview of the hierarchical heuristic. Flowchart: Input (orders $\mathcal{O}$, time windows) → Stage 1: Priority Queue Generation → linearized task sequence → Stage 2: Operation Sequencing via A* Search → transportation task sequence → Stage 3: Multi-AMR Scheduling → schedules for the AMRs → Stage 4: Trajectory Repair → Output (feasible multi-AMR schedules). The sequential stages transform the input from a linearized task sequence into strict precedence constraints, then into timed assignments, and finally into collision-free trajectories.]

As illustrated in the flowchart, the proposed methodology processes the orders through four sequential stages. First, Priority Queue Generation linearizes the asynchronous orders to provide a strict execution sequence. Second, Operation Sequencing via A* search uses this sequence to find a sequence of transportation tasks consisting of storage, retrieval, and reshuffling moves. Third, Multi-AMR Scheduling maps these moves to the AMR fleet via Constraint Programming. Finally, Trajectory Repair resolves any remaining spatiotemporal conflicts of the AMR positions using a three-tier priority mechanism to ensure collision-free execution. Table 3 summarizes the notation used throughout the heuristic formulation. The following subsections explain each of the four components.

5.1 Priority Queue Generation

The first stage of the heuristic transforms the set of asynchronous storage and retrieval requests into a linear execution sequence. Let $\mathcal{O}$ denote the complete set of orders, comprising both storage requests $\mathcal{O}_S$ and retrieval requests $\mathcal{O}_R$. Each order $o \in \mathcal{O}$ is constrained by a time window $[a_o, d_o]$, where $a_o$ represents the earliest release time and $d_o$ the hard deadline.
Standard sorting strategies, such as Earliest Due Date (EDD), often fail in scenarios with asynchronous release times, where a task with a late deadline might have a very late release time (a tight window), rendering it more critical than a task with an earlier deadline but a wide window. Executing the task with the wider window first may irreversibly consume temporal resources required by the tighter task immediately upon its release.

To resolve this, we employ an Enhanced Earliest Due Date strategy, summarized in Algorithm 1, which is based on the concept of enclosed windows. We define an enclosure relationship $o \sim p$ between two orders $o, p \in \mathcal{O}$ if the time window of one is entirely contained within the other:

$$o \sim p \iff (a_o \le a_p \land d_o \ge d_p) \lor (a_p \le a_o \land d_p \ge d_o) \tag{5.1}$$

This binary relationship identifies pairs of tasks that compete for the same time interval. We utilize this relationship to partition the set of orders $\mathcal{O}$ into disjoint groups, effectively treating the orders as nodes in a graph where an edge exists between any pair $o, p$ if $o \sim p$. The resulting connected components form the groups, clustering temporally dependent tasks.

Table 3: Notation and parameters used in the heuristic approach

| Symbol | Description |
| Sets and Indices | |
| $\mathcal{O}$ | Set of storage and retrieval orders ($\mathcal{O} = \mathcal{O}_S \cup \mathcal{O}_R$) |
| $\mathcal{O}_S, \mathcal{O}_R$ | Subsets of storage and retrieval orders |
| $o, p$ | Indices for specific orders ($o, p \in \mathcal{O}$) |
| $\mathcal{G}$ | Set of dependency groups formed during priority assignment |
| $\mathcal{M}$ | Set of moves generated by A* search |
| $m_a, m_b$ | Indices for specific moves (storage, retrieval, reshuffling) |
| $\mathcal{V}$ | Set of Autonomous Mobile Robots (AMRs) |
| $u$ | Index for a specific unit load |
| Parameters and Variables | |
| $[a_o, d_o]$ | Time window of order $o$ (earliest release time $a_o$, hard deadline $d_o$) |
| $D_g$ | Effective deadline of a group ($\min_{o \in g} d_o$) |
| $Q$ | Final prioritized execution queue |
| $P(u)$ | Assigned priority value for unit load $u$ (lower value indicates higher urgency) |
| $o \sim p$ | Enclosure relationship (the window of one order is nested within that of the other) |
| $\theta$ | Tabu tenure for cleared lanes |
| $\sigma$ | Search state tuple $(B, U_{\text{src}}, v_{\text{pos}})$ in A* search |
| $\Sigma'$ | Set of candidate successor states generated during expansion |
| $\gamma$ | Congestion scaling factor (set to 5) |
| $W$ | Penalty weight for time window deviations (set to 10,000) |
| $t_{\text{handling}}$ | Time required for load/unload operations |
| Functions and Variables | |
| $g_{\text{time}}(\sigma)$ | Elapsed schedule time (makespan) at state $\sigma$ |
| $h_{\text{est}}(\sigma)$ | Total heuristic estimate cost |
| $h_{\text{ops}}$ | Operational Cost component |
| $h_{\text{block}}$ | Blocking Cost component |
| $h_{\text{prio}}$ | Priority Penalty |
| $h_{\text{pre\_store}}$ | Premature Storage Penalty |
| $A_{m,v}$ | Interval variable for move $m$ assigned to vehicle $v$ |
| $\text{start}_{m,v}, \text{end}_{m,v}$ | Start and end times of move assignments |
| $\tau_{i,j}$ | Travel time between locations $i$ and $j$ |

Algorithm 1: Enhanced Earliest Due Date Strategy
    Input: Set of orders O
    Output: Priority queue Q
    G ← Partition O into groups using the enclosure relation o ∼ p (Eq. 5.1)
    Sort G ascending by group deadline D_g = min_{o ∈ g} d_o
    Q ← Concatenate orders from sorted groups, locally sorted by d_o
    return Q

The heuristic then generates the final priority queue by sorting these groups to ensure resource availability for critical tasks:

1. Inter-Group Sorting: The groups are sorted by the minimum deadline of their constituent orders ($\min_{o \in \text{group}} d_o$). This ensures that a cluster containing even a single urgent order is prioritized over a cluster of flexible tasks.
2. Intra-Group Sorting: Within each group, orders are sorted by their individual deadlines $d_o$.

This two-level sorting yields a strict priority ordering that explicitly protects orders with nested, tight windows from being preempted by non-critical tasks. Based on their final position in the queue $Q$, each unit load $u$ is assigned an integer priority value $P(u)$, where a lower numerical value indicates a higher urgency.
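The grouping and two-level sorting described above can be sketched in Python. This is a minimal illustration, not the authors' implementation: the names `Order`, `encloses`, and `enhanced_edd` are hypothetical, the connected components are computed with a simple union-find, and the four example windows are invented for demonstration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Order:
    name: str
    release: int   # a_o, earliest release time
    deadline: int  # d_o, hard deadline


def encloses(o: Order, p: Order) -> bool:
    """Eq. (5.1): the window of one order lies entirely inside the other."""
    return (o.release <= p.release and o.deadline >= p.deadline) or (
        p.release <= o.release and p.deadline >= o.deadline
    )


def enhanced_edd(orders: list[Order]) -> list[Order]:
    # Union-find over order indices: groups are the connected
    # components of the graph whose edges are enclosure pairs.
    parent = list(range(len(orders)))

    def find(i: int) -> int:
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(len(orders)):
        for j in range(i + 1, len(orders)):
            if encloses(orders[i], orders[j]):
                parent[find(i)] = find(j)

    groups: dict[int, list[Order]] = {}
    for i, order in enumerate(orders):
        groups.setdefault(find(i), []).append(order)

    # Inter-group sort by D_g = min deadline, intra-group sort by d_o.
    ranked = sorted(groups.values(), key=lambda g: min(o.deadline for o in g))
    return [o for g in ranked for o in sorted(g, key=lambda o: o.deadline)]


# Hypothetical instance: B is nested in both A and C, so A, B, C form
# one group even though A and C do not enclose each other.
orders = [Order("A", 0, 20), Order("B", 15, 18), Order("C", 10, 25), Order("D", 26, 40)]
queue = [o.name for o in enhanced_edd(orders)]
print(queue)  # ['B', 'A', 'C', 'D']
```

The pairwise enclosure test makes this sketch quadratic in the number of orders, which is an assumption of convenience here rather than a statement about the original implementation.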
For example, a sequence of four unit loads might receive the priority assignments $P(u_1) = 1$, $P(u_3) = 1$ (if they share the same critical deadline), $P(u_4) = 2$, and $P(u_2) = 3$. During the subsequent A* search, these priority values are used to calculate penalties for invalid or non-optimal placement sequences.

5.2 Operation Sequencing via A* Search

The second stage converts the prioritized sequence of orders $\mathcal{O}$ into a dependency graph of moves $\mathcal{M}$, which explicitly defines the locations for all storage, reshuffling, and retrieval operations. This process employs a two-phase approach: first, a pre-processing step prunes trivial direct transfers from the source to the sink to reduce the search space, followed by an A* search that enforces the priority scheme established in Section 5.1 and determines storage locations for the orders by using a virtual AMR as a substitute for the robot fleet.

5.2.1 Direct Retrieval Pruning

Before initializing the search, we perform a Direct Retrieval Analysis to identify unit loads that can bypass the buffer entirely. A unit load $u$ is extracted for direct retrieval if its release time $a_u$ and deadline $d_u$ allow for a direct transfer from Source to Sink. Formally, if $a_u + \tau_{\text{source}, \text{sink}} \le d_u$, the item is removed from the search space and assigned a direct move. This reduces the branching factor for the subsequent pathfinding.

5.2.2 State Space and Transitions

For the remaining orders, we search for an optimal sequence of operations. The state space is defined by the tuple $\sigma = (B, U_{\text{src}}, v_{\text{pos}})$, where $B$ represents the current buffer configuration (mapping unit loads to lanes and slots), $U_{\text{src}}$ is the set of pending unit loads at the source, and $v_{\text{pos}}$ tracks the position of a single virtual AMR to estimate and minimize the empty travel distances between consecutive transport tasks.
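One way to represent the state tuple $\sigma = (B, U_{\text{src}}, v_{\text{pos}})$ is as an immutable, hashable record, so that states can be stored in sets and dictionaries during the search. The sketch below is an assumption-laden illustration: the class `State`, the functions `store` and `retrieve`, and the position labels such as `"lane0"` are hypothetical and not taken from the paper.

```python
from dataclasses import dataclass, replace

Lane = tuple[str, ...]  # unit loads in one lane, index 0 = deepest slot


@dataclass(frozen=True)
class State:
    buffer: tuple[Lane, ...]  # B: buffer configuration
    pending: frozenset[str]   # U_src: unit loads waiting at the source
    v_pos: str                # position of the single virtual AMR


def store(state: State, load: str, lane: int, capacity: int) -> State:
    """Move a pending load from the source to the front of `lane`."""
    if load not in state.pending or len(state.buffer[lane]) >= capacity:
        raise ValueError("illegal store move")
    lanes = list(state.buffer)
    lanes[lane] = lanes[lane] + (load,)
    return replace(
        state,
        buffer=tuple(lanes),
        pending=state.pending - {load},
        v_pos=f"lane{lane}",
    )


def retrieve(state: State, lane: int) -> tuple[State, str]:
    """Take the front (last stored) load of `lane` to the sink (LIFO)."""
    if not state.buffer[lane]:
        raise ValueError("lane is empty")
    lanes = list(state.buffer)
    *rest, load = lanes[lane]
    lanes[lane] = tuple(rest)
    return replace(state, buffer=tuple(lanes), v_pos="sink"), load


s0 = State(buffer=(("u1",), ()), pending=frozenset({"u2"}), v_pos="source")
s1 = store(s0, "u2", lane=0, capacity=2)
s2, retrieved = retrieve(s1, lane=0)
print(s1.buffer, retrieved)
```

Because only the front of each lane can be appended to or removed from, the LIFO access rule is enforced by construction in this representation.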
Representing the fleet as a single virtual AMR is a necessary methodological simplification for fleet sizes $|\mathcal{V}| > 1$. Tracking the individual positions of multiple AMRs within the A* state space would lead to a combinatorial explosion, rendering the search intractable. By optimizing the sequence for a single virtual AMR, the heuristic inherently groups spatially proximate tasks and minimizes total empty travel. This produces a highly cohesive operation sequence that the subsequent CP-SAT scheduling stage can efficiently distribute and parallelize across the actual AMR fleet.

Transitions between states correspond to three move types:

• Store: Places an arriving unit at the source into a valid empty slot in the buffer.
• Retrieve: Retrieves an accessible unit load from the front of a lane to the Sink.
• Reshuffle: Relocates a blocking unit load to a temporary position.

A state $\sigma$ is identified as a goal state (IsGoal) when all storage and retrieval orders from the priority queue $Q$ have been successfully executed and the buffer state is consistent. Once a goal is reached, the algorithm uses ReconstructPath to backtrack through the state transitions, yielding the final move sequence $\mathcal{M}$ for the subsequent scheduling phase.

To ensure scalability while maintaining the solution quality of standard A*, we employ three key algorithmic optimizations. First, to bound the branching factor in high-density scenarios, we implement a Beam Search strategy (Lowerre, 1976). At each expansion step, the generated successors are ranked by their $f$-cost, and only the top $k$ most promising candidates (with $k = 8$ in our experiments) are added to the open set. Second, we enforce Open Set Pruning as a memory protection mechanism. If the priority queue exceeds a safety threshold (set to 5,000 nodes), the worst-performing 50% of the nodes are discarded to prevent memory exhaustion and search stagnation.
Third, we utilize a lazy evaluation strategy for the heuristic cost. Upon node generation, only the computationally inexpensive operational costs are calculated. The expensive penalty components (blocking and priority violations) are deferred and computed only when the node is extracted from the priority queue, preventing wasted computation on unexplored nodes. The complete search procedure, integrating the beam search and memory protection strategies, is summarized in Algorithm 2.

To prevent cyclic behavior (e.g., immediate refilling of a cleared lane), the search maintains a short-term Tabu list. When a unit load is retrieved or reshuffled from a lane $l$, that lane becomes tabu for incoming storage or reshuffling moves. To balance fleet coordination with unrestricted lane access, we set a fixed short-term tenure of $\theta = 1$ state transition. This tenure effectively prevents immediate inverse operations (such as placing a unit load back into the position it just vacated) without restricting the solution space or locking down buffer capacity for extended periods, which proved more robust than dynamic, fleet-dependent tenures in testing.

5.2.3 Cost Function

The search is guided by a composite cost function $f(\sigma) = g_{\text{time}}(\sigma) + h_{\text{est}}(\sigma)$.
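To make the role of $f = g + h$ concrete, the following generic sketch orders an open set by $f$-cost and applies the beam width and open-set limit described in Section 5.2.2. It is an illustration under stated assumptions rather than the authors' code: the function `beam_a_star` and its parameters are hypothetical, the tabu filter and lazy evaluation are omitted, and the toy problem on integers is invented purely to exercise the routine.

```python
import heapq
from itertools import count


def beam_a_star(start, is_goal, successors, h, beam_width=8, open_limit=5000):
    """Best-first search on f = g + h with beam and open-set pruning.

    `successors(state)` returns (next_state, step_cost) pairs.
    Returns the list of states from start to goal, or None on failure.
    """
    tie = count()  # unique tie-breaker so the heap never compares states
    open_heap = [(h(start), next(tie), 0, start, None)]
    parents = {}  # expanded state -> predecessor, used to rebuild the path
    while open_heap:
        _, _, g, state, parent = heapq.heappop(open_heap)
        if state in parents:
            continue  # state was already expanded
        parents[state] = parent
        if is_goal(state):
            path = [state]
            while parents[path[-1]] is not None:
                path.append(parents[path[-1]])
            return path[::-1]
        # Beam pruning: keep only the k successors with the lowest f-cost.
        children = [
            (g + cost + h(nxt), g + cost, nxt)
            for nxt, cost in successors(state)
            if nxt not in parents
        ]
        children.sort(key=lambda item: item[0])
        for f_child, g_child, child in children[:beam_width]:
            heapq.heappush(open_heap, (f_child, next(tie), g_child, child, state))
        # Open-set pruning: discard the worst half once the limit is exceeded.
        if len(open_heap) > open_limit:
            open_heap = heapq.nsmallest(len(open_heap) // 2, open_heap)
            heapq.heapify(open_heap)
    return None


# Toy problem: walk from 0 to 5 with steps +1 (cost 1) or +2 (cost 3).
path = beam_a_star(
    start=0,
    is_goal=lambda n: n == 5,
    successors=lambda n: [(n + 1, 1), (n + 2, 3)],
    h=lambda n: max(0, 5 - n),
)
print(path)  # [0, 1, 2, 3, 4, 5]
```

Both pruning steps trade completeness for bounded memory: a node discarded by the beam or the open-set limit is never revisited, so the search may return `None` or a suboptimal path on harder instances.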
Algorithm 2: A* search
    Input: Initial state σ_init, priority queue Q
    Output: Move sequence M
    OPEN ← {(σ_init, 0)}
    while OPEN ≠ ∅ do
        Select state σ with lowest f(σ) from OPEN
        if IsGoal(σ) then return ReconstructPath(σ)
        Step 1: Expansion & Tabu Validation
        Generate successors Σ′ by applying actions {Store, Retrieve, Reshuffle}
        Filter Σ′: remove actions violating tabu tenure θ (cycle prevention)
        Step 2: Evaluation & Beam Pruning
        Calculate cost f(σ′) = g_time + h_est for all σ′ ∈ Σ′
        Sort Σ′ by f-score and add only the top k nodes to OPEN (beam width)
        Step 3: Memory Protection
        if |OPEN| > Limit then discard worst 50% of nodes from OPEN
    return Failure

Total Schedule Makespan

The term $g_{\text{time}}(\sigma)$ represents the total elapsed time of the operation sequence up to state $\sigma$. Unlike standard pathfinding, which might only sum loaded distances, our cost function accounts for the empty travel time required to travel between task locations, as well as any waiting time incurred if a vehicle arrives before a unit load's release window $a_o$ opens. This is achieved by employing a virtual AMR that sequentially performs the tasks, calculating the unloaded travel from the end position of the preceding move to the start location of the next. This allows the search to penalize inefficient sequencing (e.g., generating excessive empty travel for the virtual AMR between distant tasks) and to optimize for the true operational makespan.

Heuristic Cost Function

The heuristic $h_{\text{est}}(\sigma)$ is a weighted sum of four components designed to minimize future effort while enforcing the priority ordering. To maintain search tractability and ensure robust performance in high-density scenarios, the heuristic employs soft constraints within this cost function.
Rather than strictly forbidding undesirable states, which could lead to search stagnation, it penalizes violations (such as lane blockages or sequence inversions) through weighted penalty terms.

$$h_{\text{est}}(\sigma) = h_{\text{ops}}(\sigma) + h_{\text{block}}(\sigma) + h_{\text{prio}}(\sigma) + h_{\text{pre\_store}}(\sigma) \tag{5.2}$$

1. Operational Costs ($h_{\text{ops}}$): Estimates the travel time $\tau$ required to complete all pending tasks.
   • Storage: We employ a Greedy Best-Match heuristic. The cost for storing pending unit loads is estimated by assigning each item to the nearest available slot that minimizes the total sequence: Source → Slot → Sink.
   • Retrieval: Sums the travel time $\tau_{i, \text{sink}}$ from each stored item's current position to the Sink.
2. Blocking Costs ($h_{\text{block}}$): Anticipates the immediate reshuffling effort required for currently blocked targets. For every blocked target $u$, the heuristic identifies the set of blockers and calculates the cost to relocate them to the nearest empty lane. If no empty lane is available, the algorithm applies two times the average reshuffle cost to reflect the operational delay of waiting for future retrievals to free up buffer capacity. This cost is scaled by a fixed penalty multiplier $\gamma = 5.0$. We found this fixed value to be effective for consistent congestion avoidance, as it ensures high-priority reshuffling is penalized uniformly regardless of the fleet size.
3. Priority Penalty ($h_{\text{prio}}$): Enforces the task sequence derived in Stage 1 by penalizing two types of structural violations based on the assigned priority values $P(u)$:
   • Sequence Inversion: Penalizes the retrieval of a lower-priority unit load while a higher-priority one is still pending in the buffer. Example: Retrieving an item with $P = 3$ while an item with $P = 1$ is not retrieved yet.
   • Stacking Violation: Penalizes any LIFO lane configuration where a lower-priority item is placed closer to the aisle than a higher-priority item. Example: Placing an item with $P = 2$ in front of an item with $P = 1$ in the same lane. Since this placement guarantees a future reshuffle operation, it is penalized immediately to prune the search branch.
4. Premature Storage Penalty ($h_{\text{pre\_store}}$): Penalizes the storage of a low-priority unit load from the source while a high-priority unit load is waiting to be retrieved from the buffer. This guides the search to clear urgent retrievals first, freeing up buffer space before bringing in less urgent items.

The output of this stage is a sequence of moves $\mathcal{M}$. If move $m_a$ reshuffles a blocker for a retrieval move $m_b$, a strict precedence constraint $m_a \prec m_b$ is generated for the subsequent scheduling phase.

5.3 Multi-AMR Scheduling via CP-SAT

The third stage maps the sequence of moves $\mathcal{M}$ generated by the A* search onto the fleet of Autonomous Mobile Robots (AMRs) $\mathcal{V}$. While standard Vehicle Routing Problems (VRP) focus primarily on spatial routing, the BSRRP is dominated by complex temporal constraints and precedences between moves. Consequently, we model this stage as a Job Shop Scheduling Problem with Sequence-Dependent Setup Times, solved using the CP-SAT solver from Google OR-Tools.

5.3.1 Model Formulation

To emphasize the scheduling nature of the problem, the vehicles of the AMR fleet are modeled as a set of identical parallel machines $\mathcal{V} = \{1, \dots, V\}$. Each generated move $m \in \mathcal{M}$ is treated as a task that must be assigned to exactly one machine. Following standard Constraint Programming formulation, we define optional interval variables $A_{m,v}$ to represent the potential execution of move $m$ by vehicle $v$. If active, an interval $A_{m,v}$ is characterized by a start time $\text{start}_{m,v}$, an end time $\text{end}_{m,v}$, and a fixed processing duration $\tau_m$.
This duration combines the loaded travel time (as defined in Section 4) and the constant handling time:

$$\tau_m = \tau_{\text{origin}(m), \text{dest}(m)} + t_{\text{handling}} \tag{5.3}$$

The model enforces the following constraints:

1. Assignment Completeness: Every move must be assigned to exactly one vehicle. This is enforced by constraining the sum of active assignment booleans to 1 for each move:

$$\sum_{v \in \mathcal{V}} \mathbb{I}(A_{m,v}) = 1 \quad \forall m \in \mathcal{M} \tag{5.4}$$

2. Precedence Constraints: We enforce three types of hard dependencies to guarantee consistency and stack integrity:
   • Unit Load Flow: Sequential operations acting on the same unit load (e.g., reshuffle blocker → retrieve target) must maintain the order defined by the logical flow.
   • LIFO Inter-Slot Dependencies: Derived from the buffer geometry; if unit load A is stored in a slot strictly in front of unit load B within the same lane, the retrieval of A must complete before the retrieval of B can commence.
   • Lane Sequencing: To preserve the validity of the buffer states determined by the heuristic decomposition, we enforce strict sequencing for all moves accessing the same lane. If the A* search schedules move $m_a$ before $m_b$ on lane $l$, the scheduler is constrained to respect this order ($m_a \prec m_b$), preventing the optimizer from creating invalid lane configurations.
   For any such dependency pair $(m_a, m_b)$, we enforce $\text{end}_{m_a} \le \text{start}_{m_b}$.
3. Lane Capacity: We model storage locations as resource constraints.
   • Buffer Lanes (Unary Resource): Standard buffer lanes are modeled as unary resources. We apply a global NoOverlap constraint on the set of intervals assigned to any specific lane $l$:

$$\text{NoOverlap}\left( \{ A_{m,v} \mid \text{dest}(m) = l \lor \text{origin}(m) = l \} \right) \tag{5.5}$$

     This constraint ensures that at any point in time $t$, at most one vehicle can execute a move involving lane $l$.
   • Source and Sink Queues: Modeled as infinite-capacity resources to track vehicle availability without restricting the number of concurrent AMRs at these locations.
4. Sequence-Dependent Transition Times: The setup time between two moves depends on the robot's location. If vehicle $v$ performs move $m_a$ immediately before $m_b$, a transition constraint ensures the gap covers the empty travel time:

$$\text{start}_{m_b, v} \ge \text{end}_{m_a, v} + \tau_{\text{dest}(m_a), \text{origin}(m_b)} \tag{5.6}$$

5. Time Windows: Time windows are modeled using a hybrid approach to ensure feasibility.
   • Hard Constraints: Physical constraints are strictly enforced (e.g., a storage action cannot start before the unit load's arrival time $a_o$; a retrieval cannot end after the deadline $d_o$).
   • Soft Constraints: Operational targets (latest start, earliest finish) are treated as soft constraints. Violations are permitted to maintain feasibility but incur a weighted tardiness penalty in the objective function to prioritize service-level agreements.

5.3.2 Objective Function

The optimization objective is hierarchical, implemented using a weighted sum method. The primary goal is to minimize total tardiness, reflecting the strict service level requirements. The secondary goal is to minimize the sum of completion times (Flow Time), which implicitly minimizes unproductive empty travel and waiting times. The objective is formulated as:

$$\min \sum_{m \in \mathcal{M}} \Big( \underbrace{W \cdot \max(0, \text{end}_m - d_m)}_{\text{Weighted Tardiness}} + \underbrace{\text{end}_m}_{\text{Flow Time}} \Big) \tag{5.7}$$

Here, the term $\max(0, \text{end}_m - d_m)$ calculates the non-negative delay relative to the soft deadline $d_m$. $W$ is a large weighting constant (set to 10,000) ensuring that meeting deadlines strictly dominates operational efficiency.
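The effect of the weight $W$ in Eq. (5.7) can be checked with a small worked example. The sketch below only evaluates the objective for given end times; it does not solve the scheduling model. The function `schedule_cost`, the move names, and all numbers are hypothetical.

```python
def schedule_cost(end_times, deadlines, weight=10_000):
    """Evaluate Eq. (5.7): weighted tardiness plus flow time."""
    tardiness = sum(max(0, end_times[m] - deadlines[m]) for m in end_times)
    flow_time = sum(end_times.values())
    return weight * tardiness + flow_time


deadlines = {"m1": 50, "m2": 90}

# Schedule A meets both deadlines but finishes m1 late in its window.
on_time = {"m1": 40, "m2": 90}
# Schedule B has a smaller flow time but misses the deadline of m2 by 1.
late = {"m1": 10, "m2": 91}

cost_on_time = schedule_cost(on_time, deadlines)  # 0 * 10000 + 130
cost_late = schedule_cost(late, deadlines)        # 1 * 10000 + 101
print(cost_on_time, cost_late)  # 130 10101
```

Although schedule B saves 29 time units of flow time, a single unit of tardiness costs 10,000, so the on-time schedule is preferred. This dominance holds as long as the total flow time difference between two candidate schedules stays below $W$.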
5.4 Trajectory Repair

While the CP-SAT schedule ensures temporal validity, it ignores the presence of idle AMRs, which may park after performing a storage or reshuffling move and occupy the buffer lanes. To generate collision-free trajectories, the final stage post-processes the timeline following the structure of Algorithm 3:

Algorithm 3: Trajectory Repair Strategy
    Input: Initial schedule S
    Output: Feasible schedule S*
    Step 1: Deadlock Resolution
        if symmetric deadlock detected between v1, v2 then
            swap schedules of v1, v2
    Step 2: Conflict Resolution Loop
        while collision detected between v1 (parking AMR) and v2 (incoming AMR) do
            if v1 is waiting empty and early shift feasible then
                // Priority 1: Reschedule
                shift v1 departure to t < t_arrival(v2)
            else if valid eviction strategy for v1 exists then
                // Priority 2: Evict
                insert best eviction move (Smart or Standard) for v1
            else
                // Priority 3: Delay
                delay v2 until v1 departs (propagate downstream)
    Step 3: Dependency Check
        foreach unit load u with end_store > start_retrieve do
            shift retrieval forward to resolve violation
    return S*

Step 1: Deadlock Resolution. The algorithm resolves symmetric head-on collisions (e.g., $v_1$ moving A → B while $v_2$ moves B → A) by swapping the AMRs' future task schedules. Since the AMRs are homogeneous, this resolves the deadlock instantly without additional travel time.

Step 2: Conflict Resolution Loop. Remaining spatial overlaps caused by parked AMRs are resolved using a hierarchical strategy prioritizing efficiency:

1. Priority 1 Reschedule: Exploits schedule slack to shift the parked AMR's next departure to an earlier time, clearing the lane before the incoming AMR arrives.

2. Priority 2 Evict: Inserts an explicit eviction move.
The algorithm prioritizes a Smart Eviction (moving the AMR directly to the start location of its next assigned task) over a Standard Eviction (relocating it to an available neutral position, such as a free buffer lane or the sink).

3. Priority 3 Delay: As a fallback, the incoming AMR is delayed, and this shift is propagated downstream to all dependent tasks.

Step 3: Dependency Check. Delays introduced in Step 2 may violate precedence constraints (e.g., pushing a storage task to complete after its retrieval was scheduled to begin). This step enforces the $\mathrm{end}_{\mathrm{store}} \le \mathrm{start}_{\mathrm{retrieve}}$ constraint for all unit loads, shifting retrieval tasks forward if necessary to guarantee consistency.

6 Computational Experiments

This section evaluates the performance of the proposed solution approaches for the Multi-AMR Buffer Storage, Retrieval, and Reshuffling Problem (BSRRP). The experimental study is designed to answer three primary research questions:

1. Benchmarking: What are the computational limits of the EF when establishing a ground truth of optimal solutions?

2. Heuristic Validation: How does the proposed hierarchical heuristic perform in terms of feasibility rates and solution quality compared to the optimal baselines?

3. Managerial Insights: How do operational parameters (specifically fleet size, access flexibility, and congestion levels) impact the stability and efficiency of space-constrained buffers?

6.1 Experimental Design and Dataset

To ensure a robust evaluation, we developed a comprehensive dataset reflecting the constraints of brownfield manufacturing facilities, such as limited floor space, high storage density, and complex traffic dynamics.
To guarantee full reproducibility and facilitate future research, the complete source code of the proposed heuristic and the discrete-event simulator, as well as all generated benchmark instances and detailed solutions, are made publicly available (see the Data and Code Availability Statement in the Acknowledgements). Our experimental design employs a two-tiered validation to address different evaluation objectives: solution quality benchmarking on small-scale instances and scalability analysis on large-scale instances.

6.1.1 Instance Generation and Parameters

Small-Scale Instances. For 3 × 3 and 4 × 4 grids, instances were generated stochastically to quantify the heuristic's optimality gap against the EF. We systematically varied the following parameters:

• Grid and Topology: 3 × 3 (9 slots) and 4 × 4 (16 slots) with access points distributed across 1, 2, or 4 sides.

• Fleet Size ($|\mathcal{V}|$): 1 to 3 AMRs.

• Load-to-Slot Ratio: Ranged from 0.4 to 1.3 to test capacity limits.

• Temporal Constraints: Arrival and retrieval windows were generated with varying overlap (using random seeds for reproducibility) to enforce asynchronous operations. Negative arrival time windows were used to initialize instances with a set of randomly placed, pre-stored unit loads at $t = 0$.

Large-Scale Instances. For layouts exceeding exact computational limits, we developed a discrete-event constructive simulator. This tool generates task sequences by simulating buffer operations forward in time. It guarantees feasibility by explicitly simulating the necessary reshuffling of blocking unit loads, while intentionally ignoring AMR collisions, creating an idealized, continuous flow of operations. We utilize these densely packed sequences to stress-test the heuristic, allowing us to identify the saturation points.
Specifically, a saturation point is the maximum unit-load-to-slot ratio a layout can sustain before the heuristic fails to find a valid schedule within the simulated deadlines.

• Topology: Square blocks (5 × 5, 6 × 6), rectangular layouts (8 × 3), and industrial brownfield layouts.

• Task Generation: The simulator adaptively alternates storage and retrieval requests to maintain a target fill level (e.g., 80%), utilizing a back-to-front filling strategy.

• Idealized Time Estimation: Task durations are estimated using scaled Manhattan distances ($t_{\mathrm{op}} \approx 2.0 \times d_{\mathrm{manhattan}}$) plus fixed handling times, and reshuffling operations incur explicitly simulated time penalties.

• Time Window Derivation: Task start times are greedily assigned to the earliest available robot in a simulated fleet with two AMRs. The arrival ($[a_n, a_n + \alpha_n]$) and retrieval time windows ($[r_n, r_n + \rho_n]$) are then generated by applying a fixed temporal slack (e.g., ±15 time steps) around these simulated task start and end times.

6.1.2 Computational Environment

All experiments were conducted on a workstation equipped with an AMD Ryzen 9 5950X processor. The EF was solved using Gurobi Optimizer version 13.0.0 with a strict time limit of 3600 seconds (1 hour) per instance. The hierarchical heuristic was implemented in Python, utilizing Google OR-Tools (CP-SAT) for the scheduling stage. To ensure rapid convergence and stability during the experiments, the solver was configured with aggressive search parameters (linearization level 2, probing level 2) and utilized 8 parallel search workers. To reflect a realistic production control environment, we imposed a time limit of 300 seconds for both the A* search (Stage 2) and the CP-SAT solver (Stage 3). While typical runtimes are fractions of a second, this cap prevents indefinite stalling in degenerate cases.
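The reported solver settings can be collected in one place, as in the following sketch. It assumes that "linearization level" and "probing level" correspond to the CP-SAT parameter fields `linearization_level` and `cp_model_probing_level`; that mapping, the class name `SchedulerSettings`, and the helper `apply_settings` are assumptions for illustration and are not taken from the authors' code. The stand-in object only keeps the example runnable without OR-Tools installed.

```python
from dataclasses import asdict, dataclass
from types import SimpleNamespace


@dataclass(frozen=True)
class SchedulerSettings:
    """Values reported in the text; field names follow CP-SAT parameter names."""
    max_time_in_seconds: float = 300.0  # time limit for the scheduling stage
    num_workers: int = 8                # parallel search workers
    linearization_level: int = 2        # assumed mapping of "linearization level 2"
    cp_model_probing_level: int = 2     # assumed mapping of "probing level 2"


def apply_settings(solver, settings: SchedulerSettings):
    """Copy each setting onto solver.parameters and return the solver."""
    for name, value in asdict(settings).items():
        setattr(solver.parameters, name, value)
    return solver


# Stand-in for a solver object exposing a `.parameters` attribute.
stand_in = SimpleNamespace(parameters=SimpleNamespace())
apply_settings(stand_in, SchedulerSettings())
```

With OR-Tools available, the same helper would presumably be called on a `cp_model.CpSolver()` instance instead of the stand-in.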
Additionally, the algorithmic parameters of the heuristic (specifically the congestion factor $\gamma = 5$, penalty weight $W = 10{,}000$, and tabu tenure $\theta = 1$) were calibrated based on preliminary experiments using a representative subset of small-scale instances to balance solution quality and computational speed.

6.1.3 Exact Formulation Experiments and Benchmark Set

To evaluate the EF and establish a ground truth, we generated a pool of 6,903 small-scale instances (3 × 3 and 4 × 4) across varying layout complexities and parameters. We subjected these instances to a two-stage process to test the computational limits of the EF and to build a benchmark set.

First, the EF was tasked with finding feasible solutions within a 3600-second time limit. This step identified 949 instances as solvable. We assume this step effectively separates operationally feasible scenarios from those rendered structurally impossible by tight temporal and spatial constraints. This assumption is based on the observation that when the EF found a feasible solution, it typically did so relatively quickly, which we verified by examining individual instances.

Second, to ensure a strict baseline for optimality gap calculations, we filtered this pool to instances where the EF achieved a solution with a MIP gap of ≤ 5%. This process yielded a final benchmark set of 810 instances. The composition of this benchmark, categorized by fleet size and access topology, is detailed in Table 4. The distribution highlights the computational boundaries of exact methods: the EF solved 716 instances in the 3 × 3 layout to the required gap, but only 94 instances in the 4 × 4 layout. Notably, the EF failed to solve any 4 × 4 instances with a single AMR, suggesting that the used parameter combination exceeds the capacity of a single robot.
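The two-stage filtering described above can be sketched as follows. The record type `EFResult`, its field names, and the example values are hypothetical; the relative gap is assumed to follow the common convention |objective − bound| / |objective|, which the text does not spell out.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class EFResult:
    instance_id: str
    objective: Optional[float]  # best feasible objective; None if none found in the time limit
    lower_bound: float          # best bound reported by the exact solver


def relative_gap(result: EFResult) -> float:
    """Relative MIP gap of a feasible result (assumed convention, see lead-in)."""
    if result.objective is None:
        raise ValueError("no feasible solution available")
    if result.objective == 0:
        return 0.0 if result.lower_bound == 0 else float("inf")
    return abs(result.objective - result.lower_bound) / abs(result.objective)


def build_benchmark(results: List[EFResult],
                    max_gap: float = 0.05) -> Tuple[List[EFResult], List[EFResult]]:
    """Stage 1 keeps instances with a feasible solution; stage 2 keeps gap <= max_gap."""
    solvable = [r for r in results if r.objective is not None]
    benchmark = [r for r in solvable if relative_gap(r) <= max_gap]
    return solvable, benchmark


# Hypothetical pool of four instances.
pool = [
    EFResult("a", 100.0, 98.0),  # gap 2%: kept
    EFResult("b", 100.0, 80.0),  # gap 20%: solvable, but filtered out
    EFResult("c", None, 60.0),   # no feasible solution: dropped in stage 1
    EFResult("d", 50.0, 50.0),   # proven optimal: kept
]
solvable, benchmark = build_benchmark(pool)
benchmark_ids = [r.instance_id for r in benchmark]
```

Applied to the paper's pool, the first stage corresponds to the 949 solvable instances and the second to the 810 benchmark instances.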
We utilize this filtered benchmark set of 810 optimally or near-optimally solved instances (≤ 5% MIP gap) during the subsequent evaluation to validate the solution quality and feasibility rates of the proposed heuristic.

Table 4: Composition of the Benchmark Reference Set: 810 high-quality instances filtered from a pool of 6,903 generated small-scale scenarios.

| Parameter | Category | Small (3 × 3) | Large (4 × 4) |
|---|---|---|---|
| Fleet Size ($\lvert\mathcal{V}\rvert$) | 1 AMR | 151 | 0 |
| | 2 AMRs | 288 | 48 |
| | 3 AMRs | 277 | 46 |
| Access Directions | 1 Side | 72 | 1 |
| | 2 Sides | 373 | 54 |
| | 4 Sides | 271 | 39 |
| Total | All Instances | 716 | 94 |

6.2 Quantitative Benchmarking against Exact Formulation

We first analyze the quantitative performance of the heuristic against the benchmark set, followed by a qualitative assessment of its conflict resolution capabilities.

6.2.1 Quantitative Analysis

Table 5 summarizes the comparative performance of the exact formulation (EF) and the proposed heuristic. The results are first categorized by the physical layout and the fleet size ($|\mathcal{V}|$). Under the section Solve Rate Heuristic, the first column gives the total number of benchmark instances, the second column lists how many were successfully solved by the heuristic, and the third column provides the resulting success rate. In the Computational Efficiency (Median) section, the table compares the median EF runtime against the median heuristic runtime, followed by the relative speedup factor. Finally, the Median Opt. Gap (%) column quantifies the solution quality by measuring the objective value deviation of the heuristic solution from the exact lower bound.

Table 5: Performance assessment: Feasibility, Runtime, and Optimality Gap comparison between the EF and the Heuristic approach across the reference set. Column groups: Solve Rate Heuristic (Number of Instances, Solved, Rate); Computational Efficiency (Median) (EF Runtime, Heur. Runtime, Speedup).

| Layout | Fleet ($\lvert\mathcal{V}\rvert$) | Number of Instances | Solved | Rate | EF Runtime | Heur. Runtime | Speedup | Median Opt. Gap (%) |
|---|---|---|---|---|---|---|---|---|
| 3 × 3 | 1 AMR | 151 | 139 | 92.1% | 515.5 s | 0.05 s | 10,310x | 2.56% |
| | 2 AMRs | 288 | 284 | 98.6% | 253.0 s | 0.08 s | 3,163x | 10.20% |
| | 3 AMRs | 277 | 275 | 99.3% | 307.2 s | 0.09 s | 3,413x | 14.58% |
| 4 × 4 | 2 AMRs | 48 | 44 | 91.7% | 1,524.3 s | 3.37 s | 452x | 4.70% |
| | 3 AMRs | 46 | 45 | 97.8% | 2,927.9 s | 2.52 s | 1,162x | 10.63% |

Solvability and Structural Limits. The proposed heuristic demonstrated high consistency in finding feasible solutions when evaluated against the benchmark. As detailed under the Solve Rate Heuristic section of Table 5, the heuristic achieved success rates comparable to the EF across all parameter combinations where exact solutions were obtainable. Specifically, the heuristic solved 698 out of 716 instances (97.5%) in the 3 × 3 layout. In the more complex 4 × 4 layout, it successfully solved 89 out of the 94 reference instances (94.7%).

Notably, the EF was unable to solve any 4 × 4 instances restricted to a single AMR. As outlined in Section 6.1, this is a direct consequence of the instance generation parameters: the extensive reshuffling required to resolve deep blockages in a 4 × 4 grid simply consumes more time than the tight retrieval windows allow for a single AMR, rendering these scenarios operationally infeasible.

Figure 5: Performance comparison between the EF and the proposed Heuristic. Top panels: Runtime distributions (logarithmic scale) across 3 × 3 and 4 × 4 layouts. Bottom panels: Optimality gaps relative to the exact lower bound, categorized by unit-load-to-slot ratio.

Runtime Performance. While feasibility is the primary prerequisite for deployment, operational viability depends on computational speed. The runtime disparity, visualized in the top panels of Figure 5, highlights the impact of the problem's NP-hard complexity. While the EF frequently approaches the timeout of 3,600 seconds (e.g., a median of 2,927.9 s for 4 × 4 with 3 AMRs), the heuristic maintains stable sub-second to low-second runtimes.
For the 3 × 3 instances, the median computation time was consistently under 0.1 seconds, achieving speedup factors exceeding 3,000x. Even in the complex 4 × 4 instances, the median was approximately 3 seconds, well below the 300-second time limit.

Solution Quality and Optimality Gap. To assess the trade-off for this massive speedup, we evaluate the optimality gap (visualized in the bottom panels of Figure 5). To provide a rigorous assessment of solution quality, we calculate the True Optimality Gap. Unlike relative deviations between two feasible solutions, this metric compares the heuristic solution ($Z_{\mathrm{Heur}}$) against the theoretical lower bound ($Z_{\mathrm{LB}}$) established by the exact solver:

$$\mathrm{Gap} = \frac{Z_{\mathrm{Heur}} - Z_{\mathrm{LB}}}{Z_{\mathrm{Heur}}} \times 100\% \tag{6.1}$$

The heuristic maintains high solution quality, with a median gap of just 4.70% for the challenging 4 × 4 layout with 2 AMRs. As expected, the gap increases with fleet size and exhibits a higher sensitivity to the unit-load-to-slot ratio. Medians typically range between 10% and 20% for larger configurations, reflecting the heuristic's tendency to prioritize rapid conflict resolution over finding the mathematically perfect spatial sequence.

Constraint Handling and Temporal Flexibility. A key distinction in this assessment lies in the treatment of time windows. The EF strictly enforces the start times at the storage slots (which are derived from the external deadlines in (4.22)) as hard constraints, classifying any temporal deviation as an infeasible solution. In contrast, the heuristic is designed to maintain operational flow by selectively relaxing these internal bounds. Specifically, it permits internal storage tasks to complete later and retrieval tasks to commence earlier than originally scheduled.
This approach ensures that the handovers at the Source and Sink remain perfectly synchronized with external processes, while providing the internal buffer operations with the temporal flexibility necessary to resolve spatial conflicts. Among the generated solutions, approximately 9.5% utilized this flexibility. While strictly speaking suboptimal compared to the rigid EF lower bound, these solutions are operationally superior in a production context, as they preserve the punctuality of external processes while preventing internal system lockups.

6.3 Qualitative Analysis of Solution Behavior

To validate the performance of the proposed approach, we analyze the operational behavior of the generated solutions in two distinct scenarios: reactive conflict resolution in high-density confined spaces and proactive capacity management in large-scale brownfield layouts. This analysis examines solutions for 8 × 3, 5 × 5, and 6 × 6 layouts, as well as a real-world brownfield case from the large-scale instance set.

Figure 6: Visualization of a solution in a high-density 8 × 3 layout (21 unit loads) with 4 AMRs. The bottom-left of each sub-figure indicates the corresponding decision variable from the EF (also visualized with the dotted arrows; white for empty drives; black for transportation of a UL), while T denotes the discrete time step of the event.

6.3.1 Reactive Conflict Resolution (8 × 3 Layout)

First, we examine a highly constrained environment serviced by a fleet of 4 AMRs (utilization ≈ 90%). Figure 6 visualizes a complex interleaved reshuffling and storage sequence spanning from time step t = 345 to t = 446. The system's objective is to retrieve the target unit load (UL) 04, which is currently blocked.

1. Reshuffling (t = 345–349): The heuristic identifies UL 18 as blocking the retrieval of the target UL 04. AMR V2 is assigned to reshuffle UL 18 to a temporary position.

2. Storage (t = 375–401): Concurrently, new storage requests for UL 20 and UL 21 arrive. Rather than blocking the now-accessible lane for UL 04, the heuristic utilizes the space in front of the reshuffled UL 18. It directs the fleet to stack UL 20 (t = 375) and UL 21 (t = 401) in front of UL 18. This choice avoids creating new blockages, as UL 20 and UL 21 are scheduled for retrieval earlier than UL 18.

3. Retrieval (t = 440–446): With the blockage removed and incoming traffic diverted to non-critical lanes, AMR V1 navigates to the target lane and retrieves UL 04 within its designated retrieval window.

Overall, this scenario exemplifies how the heuristic resolves conflicts by leveraging temporal slack. It stacks UL 20 and UL 21 in front of UL 18 without creating blockages; this is permissible due to the specific retrieval order and maximizes the utility of the constrained floor space.

6.3.2 Spatial Strategies in Brownfield Layouts

To demonstrate the algorithm's capability to operate within complex layouts, we applied the heuristic to a real-world layout from the surface coating industry (Figure 7). This scenario uses the irregular floor area surrounding a fixed high-gloss coating machine (grey obstacle). The layout is characterized by the adjacent positioning of the Source and Sink (bottom right), requiring all AMR movements to be sequenced through the central aisle. Figure 7b illustrates the system state at t = 012. The heuristic utilizes this layout through two key mechanisms:

1. Static Lanes: The irregular floor plan is manually decomposed into a set of static lanes. Specifically, the orange and yellow zones are mapped to single-deep lanes (depth 1), while the green zone forms double-deep lanes (depth 2). Static lanes allow the algorithm to apply standard stack-based logic to handle LIFO constraints, regardless of the specific physical arrangement.

2. Look-Ahead Placement: The heuristic optimizes storage locations based on retrieval deadlines. For instance, although both UL 19 and UL 20 are retrieved late in the horizon, the heuristic distinguishes between them. UL 20, having the later deadline, is moved to a distant location (lane 8) during initial reshuffling, whereas UL 19 is kept in a closer slot (lane 16). This prioritization ensures that accessible buffer capacity is preserved for unit loads with earlier retrieval times.

These results illustrate that the proposed logic enables robust operations in layout-constrained environments, preventing deadlocks through the temporal sequencing of lane access.

Figure 7: Real-world use case (surface coating). This brownfield scenario demonstrates the conversion of dead floor space into an autonomous buffer. Unlike open grids, the layout is dictated by the footprint of the machinery. The visualization highlights the heuristic's capability to assign inventory to separated zones (depth 1 vs. depth 2) to optimize storage utilization within the irregular boundaries.
(a) Topology (t = 001): The layout utilizes the residual space around a high-gloss coating machine (grey) with adjacent Source (blue) and Sink (green). The niches are manually abstracted into static lanes with varying depths: orange/yellow zones (depth 1) and green zones (depth 2).
(b) Operational State (t = 012): Early optimization. The snapshot captures a reshuffling operation where AMR V2 relocates unit load UL 19 to the single-deep yellow zone (lane 16). In contrast, the lower-priority UL 20 is moved to the distant yellow zone (lane 8), preserving closer slots for urgent unit loads.

6.3.3 Spatial Decision Policy

To analyze the spatial allocation strategies of the heuristic, we evaluate the occupancy intensity and traffic density within the brownfield scenario.
As illustrated in Figure 8, occupancy intensity is measured by the average number of time steps a storage slot holds a unit load, whereas traffic density quantifies the cumulative frequency of AMR visits at each lane access point. This evaluation reveals that the algorithm adopts a highly strategic utilization of the irregular space without explicit pre-programming. Rather than treating the available floor as a rigid grid, the system behaves organically, dynamically adapting its spatial footprint to the current operational load.

Figure 8: Algorithmic storage location assignment in the brownfield scenario. The heatmaps contrast buffer duration (a) with traffic flow (b), highlighting the differentiation between high-density storage and transfer zones.
(a) Occupancy Intensity: Average duration a slot is occupied. Lighter colors indicate the slot is utilized most of the time.
(b) Traffic Density: Frequency of AMR visits at access points. Lighter colors indicate the access point is utilized frequently.

The heatmaps reveal three key behaviors. Notably, these advanced spatial strategies emerge despite the methodological simplifications of the approach, specifically the heuristic nature of the A* search and the hierarchical decomposition of the problem. This demonstrates that the designed cost function captures global system dynamics, leading to the following emergent behaviors:

1. Lane-Blocking Avoidance: A feature in Figure 8a is the low utilization of the front positions in double-deep lanes, despite their proximity to the access point. The heuristic demonstrates a preference for utilizing distant single-deep slots rather than blocking an occupied rear slot in a closer lane. This confirms that the algorithm prioritizes accessibility over physical proximity, accepting longer travel distances to prevent future reshuffling operations.

2. Buffer Duration-Based Inventory Stratification: The system exhibits a distinct separation of inventory based on buffer duration. Deep slots in deep lanes show the highest cumulative occupancy (Figure 8a), indicating they are used to buffer unit loads with late deadlines. Conversely, slots adjacent to the Source and Sink exhibit the highest traffic density (Figure 8b) but comparatively low occupancy intensity. The heuristic adaptively treats these prime locations as a staging area with short buffer durations for immediate retrieval tasks.

3. Capacity-Adaptive Breathing Topology: The algorithm automatically adjusts its active storage area based on current inventory levels. During periods of low utilization, it allocates unit loads to nearby slots, minimizing AMR travel distances and leaving distant areas empty. As storage density increases, the algorithm dynamically expands the active storage area into the more distant slots. This ensures strict travel time optimization under normal conditions, while unlocking maximum capacity during peak congestion.

These behaviors suggest that the heuristic is capable of highly adaptive, context-aware decision making on complex floor plans.

6.4 Layout Sensitivity and Saturation Points

To isolate the impact of layout topology from pure capacity, we expanded the evaluation to compare rectangular configurations (8 × 3) against dense square blocks (5 × 5, 6 × 6) and irregular real-world layouts (e.g., Figure 7). These scenarios were evaluated across fleet sizes ranging from $|\mathcal{V}| = 2$ to 6 AMRs. Since feasibility rates remained consistent regardless of fleet density, which confirms the heuristic's deadlock avoidance, we aggregate these results to focus on the more decisive structural factors: number of access points and lane-to-depth ratio.
Table 6 presents a sensitivity analysis, breaking down feasibility by the unit-load-to-slot ratio and access constraints. The table lists the absolute number of solved instances versus the total number of generated instances for each configuration. The results highlight three critical findings regarding system stability:

Table 6: Sensitivity Analysis: feasibility by unit-load-to-slot ratio and access directions. The table displays the success rate as a fraction (Solved/Generated) for various layouts.

| Layout Topology | Ratio | 1 Access Direction | 2 Access Directions | 3 Access Directions | 4 Access Directions | Solve Rate |
|---|---|---|---|---|---|---|
| Rectangle (8 × 3) | 0.7 | 30/50 | 50/50 | – | – | 80.0% |
| | 0.8 | 15/50 | 50/50 | – | – | 65.0% |
| | 0.9 | 5/50 | 50/50 | – | – | 55.0% |
| | 1.0 | 0/50 | 45/50 | – | – | 45.0% |
| Block (5 × 5) | 0.7 | 0/50 | 85/100 | – | 50/50 | 67.5% |
| | 0.8 | 0/50 | 70/100 | – | 50/50 | 60.0% |
| | 0.9 | 0/50 | 20/100 | – | 45/50 | 32.5% |
| | 1.0 | 0/50 | 10/100 | – | 40/50 | 25.0% |
| Block (6 × 6) | 0.7 | 0/50 | 15/100 | – | 50/50 | 32.5% |
| | 0.8 | 0/50 | 0/100 | – | 50/50 | 25.0% |
| | 0.9 | 0/50 | 5/100 | – | 25/50 | 15.0% |
| | 1.0 | 0/50 | 0/100 | – | 25/50 | 12.5% |
| Real-World (39 and 43 slots) | 0.5–0.8 | – | – | 450/450 | – | 100.0% |
| | 0.9 | – | – | 98/100 | – | 98.0% |
| | 1.0 | – | – | 75/100 | – | 75.0% |

Most notably, the coating industry layouts with high slot counts demonstrate exceptional stability compared to the theoretical block models. As shown in Table 6, feasibility remains at or near 100% up to a unit-load-to-slot ratio of 0.9. Even at full capacity, the system successfully solves between 70% and 80% of instances. The decline in solvability for these instances is attributable to two converging factors: primarily, the time required for deep reshuffling exceeds the theoretical baselines derived from simplified travel estimates, rendering some instances structurally infeasible. Additionally, for theoretically solvable but complex instances, the A* search (Stage 2) becomes a computational bottleneck, hitting time limits due to the search space explosion.

The analysis of the 8 × 3 layout confirms that the topology dictates performance.
Even at a saturation ratio of 1.0, this configuration maintains 90% feasibility provided it allows for 2-sided access (45/50 solved). This evidence suggests that a high number of independent access points facilitates parallel operations, effectively mitigating the congestion effects typically associated with high storage density.

In sharp contrast, deep square layouts (e.g., 6 × 6) exhibit significantly earlier saturation points. With limited access (1–2 sides), performance degrades rapidly even at moderate ratios (0.7). This confirms that for deep storage, optimization cannot fully compensate for a lack of reshuffling space; consequently, either lower storage densities or full perimeter access (4 sides) are requisite for reliable solvability.

6.5 Managerial Insights

The computational experiments provide guidelines for the design and operation of autonomous buffer zones. By synthesizing the results, we derive four principles for managing floor storage.

6.5.1 Layout Fragmentation and Access Efficiency

In brownfield facilities, contiguous areas for buffering are rarely available. The results on layout topology show that spatial fragmentation is not a disadvantage. Small, distributed rectangular buffers (e.g., 8 × 3) outperform large, symmetrical block layouts with deeper lanes (e.g., 6 × 6), even if these distributed zones are located further away from the source and sink.

Strategic Insight: Buffering does not require consolidating space into a single zone. Utilizing decentralized floor space near production lines is effective, provided these pockets maintain high access availability relative to storage slots. Distributed buffers with multiple access directions prevent bottlenecks more effectively than deep storage blocks.

6.5.2 Capacity-Adaptive Breathing Topology

Traditional material handling relies on static zoning, which requires manual intervention.
Heatmap analysis shows that the heuristic autonomously performs storage location assignment without explicit pre-configuration.

Strategic Insight: Despite the limitations of the A* search and the decomposition, our heuristic enables a capacity-adaptive breathing topology by placing urgent unit loads at the periphery and less critical loads in deeper positions. This logic reorganizes the buffer in real time based on the production schedule, aiming to optimize the accessibility for the next retrieval window while the utilized area expands or contracts based on current demand.

6.5.3 The 90% Stability Threshold

The sensitivity analysis shows a stability threshold across all tested layouts. Solvability and system reliability decline when the unit-load-to-slot ratio exceeds 0.9. At higher ratios, the system lacks the unoccupied slots required for reshuffling.

Strategic Insight: Planners must distinguish between storage capacity and operational throughput capacity. A production buffer requires approximately 10% of slots as slack to resolve deadlocks. Operating above this 90% threshold transforms the buffer into a static storage yard, rendering it brittle to stochastic production requests.

6.5.4 Operational Robustness and Fleet Scalability

The algorithm maintains robustness across varying fleet sizes from 2 to 6 robots without parameter retuning. The system utilizes additional robots to reduce makespan until the maximum throughput capacity is reached.

Strategic Insight: This ensures operational robustness through fleet scalability. If a robot is removed for maintenance, the algorithm redistributes tasks and adapts the throughput capacity without causing an operational standstill. Process continuity is decoupled from the specific fleet count, providing investment security.
7 Conclusion and Future Work

This paper addressed the Multi-AMR Buffer Storage, Retrieval, and Reshuffling Problem (BSRRP) by establishing an Exact Formulation (EF) for benchmarking and proposing a hierarchical heuristic. Our research demonstrates that automating dense floor storage in brownfield environments requires the integration of reshuffling logic and multi-AMR coordination, as implemented in our two-stage approach.

The computational experiments reveal a disparity in tractability between exact and heuristic methods. While exact approaches often fail to converge for dense industrial scenarios, the hierarchical heuristic achieves speedup factors ranging from 450x to over 3,000x compared to the exact solver.

Regarding operational robustness, our analysis identified a saturation threshold at a unit-load-to-slot ratio of 0.9. Beyond this ratio, the lack of reshuffling space renders the system brittle, confirming that 10% of capacity must be treated as slack to resolve deadlocks. Furthermore, strict time windows were identified as a source of infeasibility; treating deadlines as soft constraints allowed the heuristic to resolve high-traffic scenarios where the exact solver failed, confirming that flow continuity must take precedence over precision in time-critical environments.

Finally, the physical layout topology and the heuristic's capacity-adaptive behavior play a decisive role. Our experiments demonstrate that rectangular layouts with high access availability outperform deep square configurations by enabling parallel access. The heuristic adaptively adjusts the storage footprint based on current inventory levels: it utilizes nearby slots during low demand to minimize travel distances and expands into distant slots only when capacity requirements increase.
This allows the system to optimize storage depth based on buffer duration without explicit pre-configuration, adapting autonomously to irregular brownfield constraints. This confirms that a decoupled approac h—combining A ∗ searc h for task sequencing with Con- strain t Programming for scheduling—is a viable path for industrial con trol. Op erationally , future w ork will focus on integrating kinematic constrain ts (e.g., acceleration profiles, turning radii) in to the scheduling logic. Additionally , implementing p ost-processing tra jectory smoothing could further enhance path efficiency . Algorithmically , profiling identifies the A ∗ searc h for task sequencing as the primary b ottlenec k in dense scenarios. F uture optimization of this comp onen t — through pre-computed pattern databases or Multi-Agen t Path Finding (MAPF) refinemen ts — would eliminate remaining latency . F urthermore, this mo dular arc hitecture establishes a data-driv en infrastructure for online op erations. The decoupled solv er can serve as a baseline for real-time decision-making, where the system must con tin uously adapt to dynamic production streams and sto c hastic arriv al rates in a data-ric h en vironmen t. 41 A c kno wledgemen ts This research w as presen ted as a work-in-progress at the V eRoLog conference 2025. The authors ackno wledge the use of artificial intelligence (AI) to ols during the preparation of this manuscr ipt. Specifically , Gemini (versions 1.5 to 3.0; Google) and GitHub Copilot (Claude Sonnet 4.0 and 4.5) were used to help with asp ects of softw are dev elopment, language improv e- men t and editing, and the classification of the literature relev an t to this study . The authors ha ve review ed and edited all conten t and assume full resp onsibilit y for the accuracy , in tegrit y , and originality of the w ork presented. Disclosure statement The authors declare no conflicts of interest. 
Funding

This work was supported by the European Union - NextGenerationEU under Grant 13IK0321.

Data and Code availability statement

The source code is available under https://anonymous.4open.science/r/buffer_reshuffling_and_retrieval_IP-CFEC. Instances and solutions are available under https://doi.org/10.6084/m9.figshare.31852507.

References

Andulkar, M., Le, D. T., & Berger, U. (2018). A multi-case study on Industry 4.0 for SME's in Brandenburg, Germany. Proceedings of the 51st Hawaii International Conference on System Sciences, 2018. doi: https://doi.org/10.24251/HICSS.2018.574

Archetti, C., Coelho, L., Speranza, M., & Vansteenwegen, P. (2025). Beyond fifty years of vehicle routing: Insights into the history and the future. European Journal of Operational Research. doi: https://doi.org/10.1016/j.ejor.2025.06.014

Boge, S., & Knust, S. (2020). The parallel stack loading problem minimizing the number of reshuffles in the retrieval stage. European Journal of Operational Research, 280(3), 940–952. doi: https://doi.org/10.1016/j.ejor.2019.08.005

Bömer, T., Koltermann, N., Pfrommer, J., & Meyer, A. (2024). Sorting multibay block stacking storage systems with multiple robots. In International conference on computational logistics (pp. 34–48). doi: https://doi.org/10.1007/978-3-031-71993-6_3

Borjian, S., Manshadi, V. H., Barnhart, C., & Jaillet, P. (2015). Managing relocation and delay in container terminals with flexible service policies. doi: https://doi.org/10.48550/arXiv.1503.01535

Buckow, J.-N., Goerigk, M., & Knust, S. (2025). Integrated pallet retrieval and processing in warehouses under uncertainty. OR Spectrum, 1–39. doi: https://doi.org/10.1007/s00291-024-00806-7

Bömer, T., Disselnmeyer, M., & Meyer, A. (2025). A constraint programming approach for the multi-robot multibay unit load pre-marshalling problem. Procedia CIRP, 134, 508–513. doi: https://doi.org/10.1016/j.procir.2025.02.151

Bömer, T., Pfrommer, J., Akizhanov, D., & Meyer, A. (2026). Sorting multi–bay block stacking storage systems. Computers & Operations Research, 188, 107359. doi: https://doi.org/10.1016/j.cor.2025.107359

Caserta, M., Schwarze, S., & Voß, S. (2012). A mathematical formulation and complexity considerations for the blocks relocation problem. European Journal of Operational Research, 219(1), 96–104. doi: https://doi.org/10.1016/j.ejor.2011.12.039

Charris, E., Rojas-Reyes, J., & Montoya-Torres, J. (2018, 07). The storage location assignment problem: A literature review. International Journal of Industrial Engineering Computations, 10. doi: https://doi.org/10.5267/j.ijiec.2018.8.001

Chen, C., Tiong, L. K., & Chen, I.-M. (2019). Using a genetic algorithm to schedule the space-constrained AGV-based prefabricated bathroom units manufacturing system. International Journal of Production Research, 57(10), 3003–3019. doi: https://doi.org/10.1080/00207543.2018.1521532

Dantzig, G. B., & Ramser, J. H. (1959). The truck dispatching problem. Management Science, 6(1), 80–91. Retrieved from http://www.jstor.org/stable/2627477

Descartes Systems Group. (2023, May). How bad is the supply chain and logistics workforce challenge? https://www.descartes.com/resources/news/descartes-study-reveals-76-supply-chain-and-logistics-operations-are-experiencing. (Accessed: 2026-01-28)

Disselnmeyer, M., Bömer, T., Pfrommer, J., & Meyer, A. (2024). The static buffer reshuffling and retrieval problem for autonomous mobile robots. In International conference on computational logistics (pp. 18–33). Springer. doi: https://doi.org/10.1007/978-3-031-71993-6_2

ElWakil, M., Eltawil, A., & Gheith, M. (2022). On the integration of the parallel stack loading problem with the block relocation problem. Computers & Operations Research, 138, 105609. doi: https://doi.org/10.1016/j.cor.2021.105609

Fragapane, G., de Koster, R., Sgarbossa, F., & Strandhagen, J. O. (2021). Planning and control of autonomous mobile robots for intralogistics: Literature review and research agenda. European Journal of Operational Research, 294(2), 405–426. doi: https://doi.org/10.1016/j.ejor.2021.01.019

Fu, B., Chen, Z., Chandan, R., Barbosa, A., Caldara, M., Durham, J., & Pecora, F. (2026). Symbolic planning and multi-agent path finding in extremely dense environments with unassigned agents. doi: https://doi.org/10.48550/arXiv.2509.01022

Ge, P., Meng, Y., Liu, J., Tang, L., & Zhao, R. (2020). Logistics optimisation of slab pre-marshalling problem in steel industry. International Journal of Production Research, 58(13), 4050–4070. doi: https://doi.org/10.1080/00207543.2019.1641238

Geft, T., Zhang, W., Yu, J., & Bekris, K. (2026). Robust out-of-order retrieval for grid-based storage at maximum capacity. doi: https://doi.org/10.48550/arXiv.2601.19144

Grand View Research. (2025). Autonomous mobile robots market (2026 - 2033). https://www.grandviewresearch.com/industry-analysis/autonomous-mobile-robots-market. (Accessed: 2026-01-28)

Hu, S., Zhao, S., & Ren, Z. (2025). Conflict-based search and prioritized planning for multi-agent path finding among movable obstacles. doi: https://doi.org/10.48550/arXiv.2509.26050

Hua, Y., Wang, Y., & Ji, Z. (2024, 06). Adaptive lifelong multi-agent path finding with multiple priorities. IEEE Robotics and Automation Letters, PP, 1–8. doi: https://doi.org/10.1109/LRA.2024.3392084

Ji, M., Guo, W., Zhu, H., & Yang, Y. (2015). Optimization of loading sequence and rehandling strategy for multi-quay crane operations in container terminals. Transportation Research Part E: Logistics and Transportation Review, 80, 1–19. doi: https://doi.org/10.1016/j.tre.2015.05.004

Kim, K. H., & Hong, G.-P. (2006). A heuristic rule for relocating blocks. Computers & Operations Research, 33(4), 940–954. doi: https://doi.org/10.1016/j.cor.2004.08.005

Kizilay, D., & Eliiyi, D. T. (2021). A comprehensive review of quay crane scheduling, yard operations and integrations thereof in container terminals. Flexible Services and Manufacturing Journal, 33(1), 1–42. doi: https://doi.org/10.1007/s10696-020-09385-5

Lersteau, C., & Shen, W. (2022). A survey of optimization methods for block relocation and premarshalling problems. Computers & Industrial Engineering, 172, 108529. doi: https://doi.org/10.1016/j.cie.2022.108529

Lowerre, B. T. (1976). The Harpy speech recognition system. Carnegie Mellon University.

Makino, H., & Ito, S. (2025). MAPF-HD: Multi-agent path finding in high-density environments. doi: https://doi.org/10.48550/arXiv.2509.06374

Pfrommer, J., Meyer, A., & Tierney, K. (2022). Solving the unit-load pre-marshalling problem in block stacking storage systems with multiple access directions. arXiv preprint arXiv:2207.09118.

Pfrommer, J., Meyer, A., & Tierney, K. (2024). Solving the unit-load pre-marshalling problem in block stacking storage systems with multiple access directions. European Journal of Operational Research, 313(3), 1054–1071. doi: https://doi.org/10.1016/j.ejor.2023.08.044

Pytel, S., Sitek, S., Chmielewska, M., Zuzańska-Żyśko, E., Runge, A., & Markiewicz-Patkowska, J. (2021). Transformation directions of brownfields: The case of the Górnośląsko-Zagłębiowska Metropolis. Sustainability, 13(4). doi: https://doi.org/10.3390/su13042075

Tang, L., & Ren, H. (2010). Modelling and a segmented dynamic programming-based heuristic approach for the slab stack shuffling problem. Computers & Operations Research, 37(2), 368–375. doi: https://doi.org/10.1016/j.cor.2009.05.011

Teck, S., Dewil, R., & Vansteenwegen, P. (2024).
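A simulation-based genetic algorithm for a semi-automated warehouse scheduling problem with processing time variability. Applied Soft Computing, 160, 111713. doi: https://doi.org/10.1016/j.asoc.2024.111713

Wang, Q., Veerapaneni, R., Wu, Y., Li, J., & Likhachev, M. (2024, 05). MAPF in 3D warehouses: Dataset and analysis. Proceedings of the International Conference on Automated Planning and Scheduling, 34, 623–632. doi: https://doi.org/10.1609/icaps.v34i1.31525

Wang, Z., Zhou, C., Che, A., & Gao, J. (2024). A policy-based Monte Carlo tree search method for container pre-marshalling. International Journal of Production Research, 62(13), 4776–4792. doi: https://doi.org/10.1080/00207543.2023.2279130

A Multi-AMR Buffer Storage, Retrieval, and Reshuffling Problem - Complete Model

| Element | Notation | Description |
|---|---|---|
| Sets and Indices | | |
| Set of unit loads | $\mathcal{N}$, $n$ | Range: $\{1, 2, \dots, N\}$ |
| Set of time steps | $\mathcal{T}$, $t$ | Range: $\{1, 2, \dots, T\}$ |
| Set of all lanes | $\mathcal{I}$, $i, k$ | Range: $\{0, 1, \dots, I\}$ |
| Set of buffer lanes | $\mathcal{I}'$ | Range: $\{1, 2, \dots, I-1\}$ (excluding sink and source) |
| Set of slots | $\mathcal{J}_{i,k}$, $j, l$ | Range: $\{1, 2, \dots, J\}$ |
| Set of vehicles | $\mathcal{V}$, $v$ | Range: $\{1, 2, \dots, V\}$ |
| General Notation and Parameters | | |
| Unit load | $(u_n, r_n, a_n)$ | A unit load with label $u_n$. Retrieval window opens at $r_n$, arrival window at $a_n$. |
| Slot | $[i, j]$ | Slot in the $i$th lane and $j$th slot (deepest $j = 1$, at the perimeter $j = J$). |
| Outermost slot | $[i, J_i]$ | The outermost slot $J_i$ in lane $i$. |
| Sink | $[I, 1] = [I, J_I]$ | Modelled as a slot in the lane with the highest value for $i$. |
| Source | $[0, 1] = [0, J_I]$ | Modelled as a slot in the lane with the lowest value for $i$. |
| Relocation move | $[i, j] \rightarrow [k, l]$ | A relocation move from slot $[i, j]$ to $[k, l]$. |
| Retrieval move | $[i, j] \rightarrow [I, J]$ | Retrieving a unit load from slot $[i, j]$. |
| Empty drive | $[i, j] \dashrightarrow [k, l]$ | Driving the AMR from slot $[i, j]$ to $[k, l]$ without transporting a load. |
| Storage move | $[0, 1] \rightarrow [i, j]$ | Storing a unit load to slot $[i, j]$. |
| Retrieval window | $[r_n, r_n + \rho_n]$ | Time window for retrieval of $u_n$; $\rho_n$ is the maximum delay. |
| Arrival window | $[a_n, a_n + \alpha_n]$ | Time window for arrival of $u_n$; $\alpha_n$ is the maximum delay. |
| Distance | $d_{ijkl}$ | Distance between the slots $[i, j]$ and $[k, l]$. |
| Travel time | $\tau_{ijkl}$ | Travel time (incl. handling time) between the slots $[i, j]$ and $[k, l]$. |

Table 7: Sets, indices, and general notation for the Integer Programming model

$$ b_{ijnt} = \begin{cases} 1 & \text{if unit load } n \text{ is in } [i,j] \text{ at time } t, \\ 0 & \text{otherwise;} \end{cases} \quad \forall i \in \mathcal{I},\ \forall j \in \mathcal{J}_i,\ \forall n \in \mathcal{N},\ \forall t \in \mathcal{T} \tag{A.1} $$

$$ x_{ijklntv} = \begin{cases} 1 & \text{if unit load } n \text{ is relocated from } [i,j] \text{ to } [k,l] \text{ at time } t \text{ by AMR } v, \\ 0 & \text{otherwise;} \end{cases} \quad \forall i,k \in \mathcal{I}',\ \forall j \in \mathcal{J}_i,\ \forall l \in \mathcal{J}_k,\ \forall n \in \mathcal{N},\ \forall t \in \mathcal{T},\ \forall v \in \mathcal{V} \tag{A.2} $$

$$ y_{ijntv} = \begin{cases} 1 & \text{if unit load } n \text{ is retrieved from } [i,j] \text{ at time } t \text{ by AMR } v, \\ 0 & \text{otherwise;} \end{cases} \quad \forall i \in \mathcal{I} \setminus \{I\},\ \forall j \in \mathcal{J}_i,\ \forall n \in \mathcal{N},\ \forall t \in \mathcal{T},\ \forall v \in \mathcal{V} \tag{A.3} $$

$$ g_{nt} = \begin{cases} 1 & \text{if unit load } n \text{ has been retrieved at time } t' \in \{1, \dots, t-1\}, \\ 0 & \text{otherwise;} \end{cases} \quad \forall n \in \mathcal{N},\ \forall t \in \mathcal{T} \tag{A.4} $$

$$ z_{ijntv} = \begin{cases} 1 & \text{if unit load } n \text{ is stored in } [i,j] \text{ at time } t \text{ by AMR } v, \\ 0 & \text{otherwise;} \end{cases} \quad \forall i \in \mathcal{I}',\ \forall j \in \mathcal{J}_i,\ \forall n \in \mathcal{N},\ \forall t \in \mathcal{T},\ \forall v \in \mathcal{V} \tag{A.5} $$

$$ s_{nt} = \begin{cases} 1 & \text{if unit load } n \text{ has been stored at time } t' \in \{1, \dots, t-1\}, \\ 0 & \text{otherwise;} \end{cases} \quad \forall n \in \mathcal{N},\ \forall t \in \mathcal{T} \tag{A.6} $$

$$ e_{ijkltv} = \begin{cases} 1 & \text{if the AMR drives from slot } [i,j] \text{ to slot } [k,l] \text{ at time } t \text{ without transporting a unit load,} \\ 0 & \text{otherwise;} \end{cases} \quad \forall i,k \in \mathcal{I},\ \forall j \in \mathcal{J}_i,\ \forall l \in \mathcal{J}_k,\ \forall t \in \mathcal{T},\ \forall v \in \mathcal{V} \tag{A.7} $$

$$ c_{ijtv} = \begin{cases} 1 & \text{if the AMR } v \text{ is at slot } [i,j] \text{ at time } t, \\ 0 & \text{otherwise;} \end{cases} \quad \forall i \in \mathcal{I},\ \forall j \in \mathcal{J}_i,\ \forall t \in \mathcal{T},\ \forall v \in \mathcal{V} \tag{A.8} $$

$$ T = \max_{n \in \mathcal{N},\, m \in \mathcal{N},\, d_n < \infty} \{ a_n + \alpha_n,\ d_m + \delta_n \} \tag{A.9} $$

$$ \tau_{ijkl} = \begin{cases} \max(1, d_{ijkl}) & \text{if performing an empty drive } e_{ijklt}, \\ \max(1, d_{ijkl} + 2h) & \text{otherwise;} \end{cases} \tag{A.10} $$

Starting Constraints

$$ b_{ijn1} = \begin{cases} 1, & \text{if unit load } n \text{ starts in slot } [i,j] \\ 0, & \text{otherwise} \end{cases} \quad \forall i \in \mathcal{I}',\ \forall j \in \mathcal{J}_i,\ \forall n \in \mathcal{N} \tag{A.11} $$

$$ s_{n1} = \begin{cases} 1, & \text{if unit load is initially stored in buffer zone} \\ 0, & \text{otherwise} \end{cases} \quad \forall n \in \mathcal{N} \tag{A.12} $$

$$ c_{ij1v} = \begin{cases} 1, & \text{if vehicle } v \text{ starts in slot } [i,j] \\ 0, & \text{otherwise} \end{cases} \quad \forall i \in \mathcal{I},\ \forall j \in \mathcal{J}_i,\ \forall v \in \mathcal{V} \tag{A.13} $$

Objective function

$$ \min \sum_{i \in \mathcal{I}} \sum_{j \in \mathcal{J}_i} \sum_{t \in \mathcal{T}} \sum_{v \in \mathcal{V}} \Bigg[ \sum_{n \in \mathcal{N}} \Big( y_{ijntv} \cdot d_{ijI1} + z_{ijntv} \cdot d_{01ij} \Big) + \sum_{k \in \mathcal{I}} \sum_{l \in \mathcal{J}_k} \Big( \sum_{n \in \mathcal{N}} x_{ijklntv} + e_{ijkltv} \Big) \cdot d_{ijkl} \Bigg] \tag{A.14} $$

Subject to:

$$ \sum_{i \in \mathcal{I}'} \sum_{j \in \mathcal{J}_i} b_{ijnt} \le s_{nt}, \quad \forall n \in \mathcal{N},\ \forall t \in \mathcal{T} \tag{A.15} $$

$$ g_{nt} \ge s_{nt} - \sum_{i \in \mathcal{I}'} \sum_{j \in \mathcal{J}_i} b_{ijnt}, \quad \forall n \in \mathcal{N},\ \forall t \in \mathcal{T} \tag{A.16} $$

$$ \sum_{n \in \mathcal{N}} b_{ijnt} \le 1, \quad \forall i \in \mathcal{I}',\ \forall j \in \mathcal{J}_i,\ \forall t \in \mathcal{T} \tag{A.17} $$

$$ \sum_{n \in \mathcal{N}} b_{ijnt} \ge \sum_{n \in \mathcal{N}} b_{i(j+1)nt}, \quad \forall i \in \mathcal{I}',\ \forall j \in \mathcal{J}_i \setminus \{J_i\},\ \forall t \in \mathcal{T} \tag{A.18} $$

$$ \sum_{k \in \mathcal{I}'} \sum_{l \in \mathcal{J}_k} \sum_{n \in \mathcal{N}} x_{ijklntv} + \sum_{k \in \mathcal{I}} \sum_{l \in \mathcal{J}_k} e_{ijkltv} + \sum_{n \in \mathcal{N}} y_{ijntv} \le c_{ijtv}, \quad \forall i \in \mathcal{I}',\ \forall j \in \mathcal{J}_i,\ \forall t \in \mathcal{T},\ \forall v \in \mathcal{V} \tag{A.19} $$

$$ \sum_{k \in \mathcal{I}} \sum_{l \in \mathcal{J}_k} e_{I1kltv} \le c_{I1tv}, \quad \forall t \in \mathcal{T},\ \forall v \in \mathcal{V} \tag{A.20} $$

$$ \sum_{k \in \mathcal{I}'} \sum_{l \in \mathcal{J}_k} z_{klntv} + \sum_{k \in \mathcal{I}} \sum_{l \in \mathcal{J}_k} e_{01kltv} + y_{01ntv} \le c_{01tv}, \quad \forall t \in \mathcal{T},\ \forall v \in \mathcal{V} \tag{A.21} $$

$$ \sum_{v \in \mathcal{V}} y_{01ntv} + s_{nt} \le 1, \quad \forall n \in \mathcal{N},\ \forall t \in \mathcal{T} \tag{A.22} $$

$$ \sum_{v \in \mathcal{V}} \Big( y_{ijntv} + \sum_{k \in \mathcal{I}'} \sum_{l \in \mathcal{J}_k} x_{ijklntv} \Big) \le b_{ijnt}, \quad \forall i \in \mathcal{I}',\ \forall j \in \mathcal{J}_i,\ \forall n \in \mathcal{N},\ \forall t \in \mathcal{T} \tag{A.23} $$

$$ b_{ijnt} = b_{ijn(t-1)} + \sum_{v \in \mathcal{V}} \Big[ \sum_{k \in \mathcal{I}'} \sum_{l \in \mathcal{J}_k} \Big( x_{klijn(t-\tau_{klij})v} - x_{ijkln(t-1)v} \Big) - y_{ijn(t-1)v} + z_{ijn(t-\tau_{01ij})v} \Big], \quad \forall i \in \mathcal{I}',\ \forall j \in \mathcal{J}_i,\ \forall n \in \mathcal{N},\ \forall t \in \mathcal{T} \setminus \{1\} \tag{A.24} $$

$$ g_{nt} = \sum_{i \in \mathcal{I} \setminus \{I\}} \sum_{j \in \mathcal{J}_i} \sum_{t'=1}^{t-1} \sum_{v \in \mathcal{V}} y_{ijnt'v}, \quad \forall n \in \mathcal{N},\ \forall t \in \mathcal{T} \tag{A.25} $$

$$ s_{nt} = s_{n1} + \sum_{i \in \mathcal{I}'} \sum_{j \in \mathcal{J}_i} \sum_{t'=1}^{t-\tau_{01ij}} \sum_{v \in \mathcal{V}} z_{ijnt'v}, \quad \forall n \in \mathcal{N},\ \forall t \in \mathcal{T} \tag{A.26} $$

$$ c_{ijtv} = c_{ij(t-1)v} + \sum_{n \in \mathcal{N}} z_{ijn(t-\tau_{01ij})v} - \sum_{n \in \mathcal{N}} y_{ijn(t-1)v} + \sum_{k \in \mathcal{I}'} \sum_{l \in \mathcal{J}_k} \sum_{n \in \mathcal{N}} x_{klijn(t-\tau_{klij})v} + \sum_{k \in \mathcal{I}} \sum_{l \in \mathcal{J}_k} e_{klij(t-\tau_{klij})v} - \sum_{k \in \mathcal{I}'} \sum_{l \in \mathcal{J}_k} \sum_{n \in \mathcal{N}} x_{ijkln(t-1)v} - \sum_{k \in \mathcal{I}} \sum_{l \in \mathcal{J}_k} e_{ijkl(t-1)v}, \quad \forall i \in \mathcal{I}',\ \forall j \in \mathcal{J}_i,\ \forall t \in \mathcal{T} \setminus \{1\},\ \forall v \in \mathcal{V} \tag{A.27} $$

$$ c_{I1tv} = c_{I1(t-1)v} + \sum_{i \in \mathcal{I} \setminus \{I\}} \sum_{j \in \mathcal{J}_i} \sum_{n \in \mathcal{N}} y_{ijn(t-\tau_{ijI1})v} + \sum_{i \in \mathcal{I}} \sum_{j \in \mathcal{J}_i} \Big( e_{ijI1(t-\tau_{ijI1})v} - e_{I1ij(t-1)v} \Big), \quad \forall t \in \mathcal{T} \setminus \{1\},\ \forall v \in \mathcal{V} \tag{A.28} $$

$$ c_{01tv} = c_{01(t-1)v} - \sum_{i \in \mathcal{I}'} \sum_{j \in \mathcal{J}_i} \sum_{n \in \mathcal{N}} z_{ijn(t-1)v} + \sum_{i \in \mathcal{I}} \sum_{j \in \mathcal{J}_i} \Big( e_{ij01(t-\tau_{ijIJ})v} - e_{01ij(t-1)v} \Big) - \sum_{n \in \mathcal{N}} y_{01n(t-1)v}, \quad \forall t \in \mathcal{T} \setminus \{1\},\ \forall v \in \mathcal{V} \tag{A.29} $$

$$ \sum_{i \in \mathcal{I} \setminus \{I\}} \sum_{j \in \mathcal{J}_i} \sum_{t=1}^{r_n - \tau_{ijI1} - 1} \sum_{v \in \mathcal{V}} y_{ijntv} = 0, \quad \forall n \in \mathcal{N} \tag{A.30} $$

$$ \sum_{i \in \mathcal{I} \setminus \{I\}} \sum_{j \in \mathcal{J}_i} \sum_{t = r_n - \tau_{ijI1}}^{r_n + \rho_n - \tau_{ijI1}} \sum_{v \in \mathcal{V}} y_{ijntv} = 1, \quad \forall n \in \mathcal{N} \tag{A.31} $$

$$ \sum_{i \in \mathcal{I} \setminus \{I\}} \sum_{j \in \mathcal{J}_i} \sum_{t = r_n + \rho_n - \tau_{ijI1} + 1}^{T} \sum_{v \in \mathcal{V}} y_{ijntv} = 0, \quad \forall n \in \mathcal{N} \tag{A.32} $$

$$ \sum_{i \in \mathcal{I}'} \sum_{j \in \mathcal{J}_i} \sum_{t=1}^{a_n - 1} \sum_{v \in \mathcal{V}} z_{ijntv} = 0, \quad \forall n \in \mathcal{N} \tag{A.33} $$

$$ \sum_{i \in \mathcal{I}'} \sum_{j \in \mathcal{J}_i} \sum_{t = a_n}^{a_n + \alpha_n} \sum_{v \in \mathcal{V}} z_{ijntv} + \sum_{t = r_n - \tau_{01I1}}^{\min(r_n + \rho_n - \tau_{01I1},\ a_n + \alpha_n)} \sum_{v \in \mathcal{V}} y_{01ntv} = 1, \quad \forall n \in \mathcal{N} \tag{A.34} $$

$$ \sum_{i \in \mathcal{I}'} \sum_{j \in \mathcal{J}_i} \sum_{t = a_n + \alpha_n + 1}^{T} \sum_{v \in \mathcal{V}} z_{ijntv} = 0, \quad \forall n \in \mathcal{N} \tag{A.35} $$

$$ \sum_{v \in \mathcal{V}} \Bigg[ \sum_{j \in \mathcal{J}_i} c_{ijtv} + \sum_{j \in \mathcal{J}_i} \Big( \sum_{n \in \mathcal{N}} \sum_{t' \in \Omega^{\text{in}}_{z}} z_{ijnt'v} + \sum_{k,l} \sum_{n \in \mathcal{N}} \sum_{t' \in \Omega^{\text{in}}_{x}} x_{klijnt'v} + \sum_{k,l} \sum_{t' \in \Omega^{\text{in}}_{e}} e_{klijt'v} \Big) + \sum_{j \in \mathcal{J}_i} \Big( \sum_{n \in \mathcal{N}} \sum_{t' \in \Omega^{\text{out}}_{y}} y_{ijnt'v} + \sum_{k,l} \sum_{n \in \mathcal{N}} \sum_{t' \in \Omega^{\text{out}}_{x}} x_{ijklnt'v} + \sum_{k,l} \sum_{t' \in \Omega^{\text{out}}_{e}} e_{ijklt'v} \Big) \Bigg] \le 1, \quad \forall i \in \mathcal{I}',\ \forall t \in \mathcal{T} \tag{A.36} $$

$$ \sum_{n \in \mathcal{N}} \big( x_{ijklntv} + y_{ijntv} + e_{ijkltv} \big) \le 1 - \sum_{n \in \mathcal{N}} b_{i(j+1)nt}, \quad \forall i \in \mathcal{I}',\ \forall j \in \mathcal{J}_i \setminus \{1\},\ \forall k \in \mathcal{I}',\ \forall l \in \mathcal{J}_k,\ \forall t \in \mathcal{T},\ \forall v \in \mathcal{V} \tag{A.37} $$

B NP-hardness of the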
BSRRP Problem

The Block Relocation Problem (BRP) is known to be NP-hard (Caserta et al., 2012). We establish the NP-hardness of the Buffer Storage, Retrieval, and Reshuffling Problem (BSRRP) by providing a polynomial-time reduction from any instance of BRP to BSRRP.

B.1 Problem Definitions

Definition B.1 (BRP - Block Relocation Problem) Given:
• A set of containers $C = \{c_1, \dots, c_N\}$
• A set of stacks $S = \{s_1, \dots, s_M\}$ with maximum height $H$
• A priority function $p : C \to \{1, \dots, N\}$ assigning unique priorities to containers
• An initial configuration function $f : C \to S \times \{1, \dots, H\}$ mapping containers to positions
Find a sequence of relocations minimizing the total number of moves $k$ while retrieving containers in ascending priority order.

Definition B.2 (BSRRP - Buffer Storage, Retrieval, and Reshuffling Problem) Given:
• A set of unit loads $U = \{u_1, \dots, u_N\}$
• A buffer zone with lanes $L = \{l_1, \dots, l_M\}$ of maximum depth $H$
• Time windows $[e_i, l_i]$ for each unit load $u_i$
• An initial configuration function $g : U \to L \times \{1, \dots, H\}$
Find a sequence of relocations minimizing total travel distance $D$ while retrieving unit loads within their time windows.

B.2 Polynomial-Time Reduction

We present a polynomial-time transformation $T$ that maps any instance $I_{BRP}$ of BRP to an instance $I_{BSRRP} = T(I_{BRP})$ of BSRRP.

Definition B.3 (Transformation T) Let $I_{BRP} = (C, S, p, f, H)$ be any instance of BRP. We construct $I_{BSRRP} = (U, L, W, g, H)$ as follows:

1. Structure Preservation:
• Create a set of unit loads $U$ such that $|U| = |C|$ (same number of elements)
• Create a set of lanes $L$ such that $|L| = |S|$ (same number of stacks/lanes)
• For each container $c_i \in C$, create a corresponding unit load $u_i \in U$
• For each container position $(s, h) = f(c_i)$, set the position of the corresponding unit load $g(u_i) = (l, h)$, where $l$ is the lane corresponding to stack $s$

2. Time Window Construction: Let $k_{max}(N) = \frac{N(N-1)}{2}$ be the maximum possible relocations for $N = |U| = |C|$ unit loads. For each $u_i$ corresponding to container $c_i$ with priority $p(c_i) = i$:
• $e_i = (i - 1) \cdot (k_{max}(N) + 1) + 1$
• $l_i = i \cdot (k_{max}(N) + 1)$

3. Distance Metric: Define the distance function $d(x, y)$ between any two positions $x, y$ in the BSRRP instance as follows:

$$ d(x, y) = \begin{cases} 1, & \text{if the move from } x \text{ to } y \text{ involves carrying a unit load (loaded move)} \\ 0, & \text{if the move from } x \text{ to } y \text{ does not involve carrying a unit load (unloaded move)} \end{cases} $$

This distance metric directly counts the number of loaded moves (relocations and retrievals), since unloaded moves contribute zero distance.

B.3 Proof of Correctness

Lemma B.1 (Correctness of Mapping T) The mapping of containers $c_i$ to unit loads $u_i$ and stacks $s_j$ to lanes $l_j$ defined by the transformation $T$ preserves all accessibility relationships between the elements.

Proof B.1 As described in the definition of transformation $T$, the following holds:
• For each $c_i \in C$, there exists a corresponding $u_i \in U$.
• For each $s_j \in S$, there exists a corresponding $l_j \in L$.
• The position of each element $u_i$ in $I_{BSRRP}$ corresponds to the position of the corresponding element $c_i$ in $I_{BRP}$ (i.e., if $f(c_i) = (s, h)$, then $g(u_i) = (l, h)$, where $l$ is the lane corresponding to $s$).
Therefore, for any containers $c_i, c_j \in C$:
• If $c_i$ blocks $c_j$ in $I_{BRP}$, then $u_i$ blocks $u_j$ in $I_{BSRRP}$.
• If $c_i$ is accessible in $I_{BRP}$, then $u_i$ is accessible in $I_{BSRRP}$.
• The mapping $g$ preserves the relative positions of all elements.
Thus, any valid access and relocation sequence in one problem has a corresponding valid sequence in the other problem.

Lemma B.2 (Time Window Correctness) The construction of the time windows enforces the priority ordering of BRP while allowing for all necessary relocations for each unit load in $I_{BSRRP}$.

Proof B.2 The time windows $[e_i, l_i]$ for each unit load $u_i$ (corresponding to container $c_i$ with BRP priority $i$) are defined as $e_i = (i - 1)(k_{max}(N) + 1) + 1$ and $l_i = i(k_{max}(N) + 1)$. This construction creates sequential and strictly non-overlapping time windows, such that $e_i = l_{i-1} + 1$. Consequently, operations related to $u_i$ can only begin after the time window for $u_{i-1}$ has closed, thus directly enforcing the ascending priority order of the BRP.

The duration of each time window for $u_i$ is $l_i - e_i + 1 = k_{max}(N) + 1$ time units. In the BSRRP instance, each unit of time corresponds to a loaded move (either a relocation or a retrieval), according to the defined distance metric. The value $k_{max}(N) = \frac{N(N-1)}{2}$ represents an upper bound on the total number of relocations required for the entire BRP instance. Allocating $k_{max}(N) + 1$ time units (i.e., potential loaded moves) for each individual unit load $u_i$ is therefore amply sufficient to accommodate:
• Any relocations necessary to access $u_i$ after $u_1, \dots, u_{i-1}$ have been retrieved. The number of such relocations for $u_i$ alone will not exceed $N - 1$, which is less than or equal to $k_{max}(N)$ for $N \ge 2$.
• The single loaded move required for the retrieval of $u_i$ itself.
Thus, any valid sequence of BRP operations can be mapped to the BSRRP instance within the constructed time windows, respecting the priority order and allowing for all required relocations and retrievals.

Theorem B.3 (Solution Equivalence) An optimal solution to $I_{BRP}$ with $k$ relocations exists if and only if an optimal solution to $I_{BSRRP}$ with total distance $D = k + N$ exists.

Proof B.3 Let $Sol_{BRP}$ be a feasible sequence of operations for $I_{BRP}$ involving $k$ relocations and $N$ mandatory retrievals. Let $Sol_{BSRRP}$ be the corresponding sequence of operations for $I_{BSRRP}$. Under the defined distance metric, unloaded moves in $Sol_{BSRRP}$ have a distance of 0 and loaded moves have a distance of 1. Each of the $k$ relocations in $Sol_{BRP}$ corresponds to exactly one loaded move in $Sol_{BSRRP}$ (moving the relocated item). Each of the $N$ retrievals in $Sol_{BRP}$ corresponds to exactly one loaded move in $Sol_{BSRRP}$ (moving the retrieved item out). Therefore, the total distance $D$ for $Sol_{BSRRP}$ is precisely the sum of the distances of the loaded moves: $D = (k \times 1) + (N \times 1) = k + N$.

($\Rightarrow$) Optimal BRP Solution implies Optimal BSRRP Solution: Assume $Sol^*_{BRP}$ is an optimal solution to $I_{BRP}$ with $k^*$ relocations. The corresponding $Sol^*_{BSRRP}$ has total distance $D^* = k^* + N$. Suppose, for contradiction, that $Sol^*_{BSRRP}$ is not optimal for $I_{BSRRP}$. Then there must exist a different feasible solution $Sol'_{BSRRP}$ with total distance $D' < D^*$. Since distance only counts loaded moves, $D'$ must correspond to some number of relocations $k'$ and the $N$ retrievals, such that $D' = k' + N$. Given $D' < D^*$, we have $k' + N < k^* + N$, which implies $k' < k^*$. The sequence $Sol'_{BSRRP}$ corresponds to a valid sequence $Sol'_{BRP}$ with $k'$ relocations (due to Lemma B.1 and Lemma B.2). This contradicts the assumption that $Sol^*_{BRP}$ with $k^*$ relocations was optimal for $I_{BRP}$. Therefore, $Sol^*_{BSRRP}$ must be optimal for $I_{BSRRP}$.

($\Leftarrow$) Optimal BSRRP Solution implies Optimal BRP Solution: Assume $Sol^*_{BSRRP}$ is an optimal solution to $I_{BSRRP}$ with total distance $D^*$. As shown above, $D^*$ must be equal to the number of loaded moves, which corresponds to $k^*$ relocations and $N$ retrievals, so $D^* = k^* + N$. This solution $Sol^*_{BSRRP}$ corresponds to a valid BRP solution $Sol^*_{BRP}$ with $k^* = D^* - N$ relocations. Suppose, for contradiction, that $Sol^*_{BRP}$ is not optimal for $I_{BRP}$. Then there must exist a different feasible solution $Sol'_{BRP}$ with $k'$ relocations such that $k' < k^*$. From the ($\Rightarrow$) direction, this $Sol'_{BRP}$ corresponds to a BSRRP solution $Sol'_{BSRRP}$ with total distance $D' = k' + N$. Since $k' < k^*$, it follows that $D' < D^*$. This contradicts the assumption that $Sol^*_{BSRRP}$ with distance $D^*$ was optimal for $I_{BSRRP}$. Therefore, $Sol^*_{BRP}$ must be optimal for $I_{BRP}$.

Thus, an optimal solution with $k$ relocations exists for $I_{BRP}$ if and only if an optimal solution with total distance $D = k + N$ exists for $I_{BSRRP}$.

B.4 Polynomial-Time Reduction Complexity

The transformation $T$ operates in polynomial time with respect to the input size (primarily $N$):
• Configuration mapping: $O(N)$
• Time window computation: The loop runs $N$ times with constant-time arithmetic operations inside, giving $O(N)$.
• Distance metric definition: $O(1)$
Thus, the transformation $T$ is polynomial-time.

B.5 Conclusion

Since the reduction runs in polynomial time, preserves solution feasibility and optimality for all instances, and the BRP is proven NP-hard, the BSRRP is also NP-hard. □
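
The transformation $T$ from Definition B.3 can be sketched in a few lines. The following Python example is an illustrative reading of the three construction steps, not the authors' code: the class names `BRPInstance` and `BSRRPInstance`, the dictionary encoding of positions keyed by priority, and the function names are all assumptions made for this sketch.

```python
from dataclasses import dataclass
from typing import Dict, Tuple

Position = Tuple[int, int]  # (stack or lane index, height or depth index)


@dataclass(frozen=True)
class BRPInstance:
    """Hypothetical BRP encoding: positions[i] = f(c_i) for the container with priority i."""
    positions: Dict[int, Position]
    num_stacks: int
    max_height: int


@dataclass(frozen=True)
class BSRRPInstance:
    """Hypothetical BSRRP encoding produced by the transformation."""
    positions: Dict[int, Position]            # g(u_i)
    time_windows: Dict[int, Tuple[int, int]]  # [e_i, l_i]
    num_lanes: int
    max_depth: int


def k_max(n: int) -> int:
    """Upper bound N(N-1)/2 on the number of relocations, as used in Definition B.3."""
    return n * (n - 1) // 2


def time_window(i: int, n: int) -> Tuple[int, int]:
    """Window [e_i, l_i] for the unit load with priority i among n unit loads."""
    width = k_max(n) + 1
    return (i - 1) * width + 1, i * width


def distance(loaded: bool) -> int:
    """Distance metric of step 3: loaded moves cost 1, unloaded moves cost 0."""
    return 1 if loaded else 0


def transform(brp: BRPInstance) -> BSRRPInstance:
    """Map a BRP instance to a BSRRP instance in O(N) time."""
    n = len(brp.positions)
    windows = {i: time_window(i, n) for i in brp.positions}
    # Stacks map one-to-one to lanes, so positions are copied unchanged.
    return BSRRPInstance(dict(brp.positions), windows, brp.num_stacks, brp.max_height)


if __name__ == "__main__":
    brp = BRPInstance({1: (0, 1), 2: (0, 2), 3: (1, 1)}, num_stacks=2, max_height=2)
    print(transform(brp).time_windows)
```

For $N = 3$ the window width is $k_{max}(3) + 1 = 4$, so the sketch yields the windows $[1, 4]$, $[5, 8]$, and $[9, 12]$, which satisfy $e_i = l_{i-1} + 1$ as required in Proof B.2.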
