📝 Original Info
- Title:
- ArXiv ID: 2512.20239
- Date:
- Authors: Unknown
📝 Abstract
Formalized the 2D hierarchical rectangle packing problem • Proposed Multi-level Logic-based Benders Decomposition, outperforming monolithic and Bottom-Up methods📄 Full Content
Our objective is to find the smallest rectangular container (called a block) that can contain the rectangular items so that they do not overlap. The width and height of the block are to be decided, unlike in the case of strip packing (where the container has a fixed width) or bin packing (multiple bins with fixed dimensions). In this paper, the packing problem itself is hierarchical. This means that an item to be packed is either an individual rectangle with fixed dimensions or an occurrence of another block, meaning that the optimized block contains a subblock that corresponds to another packing problem, whose solution serves as a template. To find the dimensions of such a block occurrence, it is necessary to recursively pack such a subblock, which itself consists of individual rectangles and possibly occur-rences of other blocks. This hierarchy of blocks is a static property of the problem instance, and can be conveniently represented as an out-tree, as Fig. 1a later demonstrates.
In packing applications, such a hierarchy is often an intrinsic property of the problem. In the design of integrated circuits, engineers also solve such a hierarchical packing problem. They need to design lower-level components, such as operational amplifiers, so that when they are used as parts of more complex components, they can be efficiently put together so that the overall area of the circuit is minimized. Individual components need to be encapsulated and isolated, with well-defined boundaries. Therefore, the problem cannot be solved as a single-block packing problem, and the hierarchy needs to be explicitly considered.
Another example of the application of the hierarchical packing problem is logistics. Assume that goods have the same height, so the problem essentially boils down to the two-dimensional case. Individual items could be transported directly in the main container, but in practice, they are packed into boxes (putting together items of the same customer), which can also be packed into larger boxes or crates to make handling them easier. Finally, these are put into the main container. This problem could be modeled as a hierarchical packing problem with three levels of hierarchy.
In this paper, we formally describe the problem of hierarchical packing of two-dimensional items with an arbitrary number of levels, which, to the best of our knowledge, has not been formally studied in the literature. In Section 2, related work on topics of packing, its applications, and the use of decomposition methods is investigated. In Section 3, the problem is formally described, and models for Mixed-Integer Linear Programming (MILP) and Constraint Programming (CP) solvers are proposed. In Section 4, heuristics for finding initial solutions and the baseline Bottom-Up method are outlined.
In Section 5, we develop a heuristic method based on Logic-based Benders Decomposition (LBBD) to alleviate the natural hierarchical property of the problem. Experiments in Section 6 show that the proposed method outperforms both exact monolithic models and the baseline Bottom-Up method in synthetic instances with up to 7 levels of hierarchy, with up to 80 items per block and limited computation time. Finally, in Sections 7 and 8, we elaborate on properties of the hierarchical packing problem and possible generalizations to industrial applications.
Packing and cutting problems belong to the most important topics in operations research, which brought us many decomposition approaches, such as column generation (Gilmore andGomory, 1961, 1965). These problems have been extensively studied in the past. The recent survey paper by Oliveira et al. (2023) serves as an introduction to the two-dimensional problems. Surveys of Iori et al. (2021); Oliveira et al. (2016) focus on exact approaches and heuristics, respectively.
MILP is often an approach of choice to tackle rectangle packing problems. Space-indexed (or grid-indexed) models, similar to the time-indexed formulations found in the scheduling domains, were utilized for 2D (Beasley, 1985) and 3D (Allen et al., 2012) packing problems. These models have a pseudo-polynomial number of variables, but are known to offer good linear relaxation. “Normal patterns” and similar concepts were successfully used to reduce the number of variables. Polynomial relative position models were described in Chen et al. (1995), and found application in many CP-based approaches (Berger et al., 2009;Korf et al., 2010), as well as in this paper.
Their use in MILP context is problematic due to the necessity to use “big-M” to model the or-constraint encoding the relative position between pairs of rectangles.
Packing a set of rectangles into a single container is a computationally difficult problem by itself. Classical Benders’ decomposition was used in Zhang et al. (2025) to remove the “big-M” coefficients from the relative position model for optimization of the layout of integrated circuits. Benders decomposition with combinatorial cuts was utilized in Côté et al. (2014) to solve a strip packing problem, and later in Côté et al. (2019) for 2D bin packing problems. In both papers, the complicated subproblem of deciding whether a set of rectangles fits a container is solved using decomposition. Crucially, the master problem utilizes relaxation through contiguous 1D bin packing, which was also used in Delorme et al. (2017). As was done in the context of 1D bin packing applications (Dell’Amico et al., 2020;Lewis and Bonnet, 2025), column generation and branch-and-price were used mostly for packing 2D objects in the smallest number of bins in Pisinger and Sigurd (2007);Cintra et al. (2008). The authors used CP and dynamic programming to solve this pricing problem, respectively. Efficient calculation of lower bounds is necessary for good performance of the search algorithms. In the context of strip packing, early work was done in Martello et al. (2003). Authors of Alvarez-Valdes et al. (2009); Boschetti and Montaletti (2010) extended these results and developed new methods, which promise to improve the trivial sum-of-rectangle-areas lower bound.
However, the more advanced methods mostly work with fixed orientations of rectangles. The case with free orientation was investigated in Delorme et al. (2017). Similarly, pre-processing techniques were developed to simplify the problem (Boschetti and Montaletti, 2010), which reduce the width of the strip, or increase the widths of the items. The method for finding the minimum square was developed in Martello and Monaci (2015).
2D packing problems are found in many domains. These include design of integrated circuits (Xu et al., 2017;Zhu et al., 2023;Grus and Hanzálek, 2024), where the goal is to find the smallest possible placement of components, which also satisfies connectivity and other constraints. Specifically in Xu et al. (2017), the hierarchical property of the circuit is considered, which closely resembles our studied problem. The facility layout problem (Kubalík et al., 2023;de Lira-Flores et al., 2019) is another application in which packing is used to design efficient factory or office layouts with respect to the flow of people and material, with a two-level hierarchy of packing sub-problems encountered in the latter paper. Finally, the close relationship between periodic scheduling with period-induced hierarchy of tasks (Grus et al., 2025) interestingly connects the two classical research areas. In Novak et al. (2019), scheduling with uncertain processing times was transformed into packing of complex “F-shapes” to model mixed-criticality of tasks.
However, to the best of our knowledge, no previous work delved into the problem of hierarchical, or nested, rectangle packing, which is the main topic of this paper. Problems with hierarchical, or recursive structure, similar to ours, were investigated in other domains and in the context of bilevel optimization (Camacho-Vallejo et al., 2024). The two-level vehicle routing problem was investigated in Raidl et al. (2014Raidl et al. ( , 2015)), solved using decomposition methods and heuristics, principally similar to this paper. Hierarchical planning is an extension of classical planning with task hierarchy (Bercher et al., 2019). Task hierarchy can be used to model the planning “physics”, but also to introduce advice to the planner to reduce the search space. Hierarchical scheduling is prevalent in edge computing, where resources (Peixoto et al., 2022) or schedulers themselves (Lane et al., 2022) are organized according to their capabilities into a hierarchy.
Decomposition methods are often applied in a recursive or nested manner, as we study in this paper. Column generation was used in multiple levels to solve multi-stage guillotine cutting in Cintra et al. (2008). Nested branchand-price was used to solve vehicle routing with complex inter-dependencies in Tilk et al. (2019). Nested LBBD was used to plan first global and then local response to disasters in Guo et al. (2025). Similarly, three-level decomposition was applied for home healthcare planning in Algendi et al. (2026).
Benders decomposition with two levels was utilized, together with dynamic programming, in Sohrabi et al. (2024).
In Chen et al. (2019) and Le Blansch (2022), the authors tackled a onedimensional bin packing case -putting items into low-level bins, which need to fit upper-level bins. Developed methods could be used with an arbitrary number of levels, which is also true for this paper. However, our two-dimensional case complicates the problem. In applications, the authors often informally come upon the hierarchical packing case, and their developed methods are relevant for this work as inspiration and baselines. In the case of integrated circuit design (Xu et al., 2017;Zhu et al., 2023), the natural hierarchy of components is solved in a bottom-up manner; this is the source for the baseline method we use in this paper in Section 4.2. The two-level facility layout problem (de Lira-Flores et al., 2019) was solved using a mathematical optimization solver to design an optimal plant with a special room for the process equipment. The core of their formulation is a (two-level) special case of the problem studied here.
In this section, we formally describe the 2D hierarchical rectangle packing (2DHRP) problem derived from application-specific domains (Xu et al., 2017;de Lira-Flores et al., 2019). As outlined in Section 1, the 2DHRP problem packs rectangles into lower-level rectangular blocks, which each need to be packed into upper-level rectangular blocks together with the respective rectangles of that level.
The problem instance can be visualized as a weighted directed out-tree in Fig. 1a. Each block B i is represented by a larger labeled node. The rectangles R i j of the block are represented by the smaller leaf nodes connected to the block nodes (having the same color). Root node (B 1 in Fig. 1a) and its corresponding block are referred to as a top node and top block. The edge between a block and another block/rectangle means that the block at the start of the edge contains the block/rectangle at the end (note that we assume the hierarchy does not contain “diamonds”; there is always exactly one path from root to any other node). The number of occurrences of the included block/rectangle is given by the value next to the edge.
Crucially, all individual occurrences of the same block are packed in the same way; this constraint originates from the circuit design domain, which requires the reuse of designed components, which are represented by blocks.
This also means that each occurrence of the block has the same dimensions.
Unlike the occurrence of the blocks, each rectangle is independent of the other. Thus, to model two rectangles with the same dimensions, they would be represented by two distinct leaf nodes (i.e., weights of their incoming edges have weight 1).
Formally, let B = {B 1 , . . . , B i , . . . , B n } be a set of blocks. Each block B i consists of a set of rectangles
and a set of block occurrences
. The mapping σ(I i k ) describes which blocks are directly included in B i ; they are the children of B i . This also means that there is an edge from node B i to B i ′ in the hierarchy graph. This mapping between the hierarchy of the instance and the associated packing is highlighted in Fig. 1b.
The main task is to minimize the size of the top block while respecting packing constraints across the hierarchy. Each block B i needs to be assigned its integer width and height W i , H i , so no two objects O i u , O i v of the object set O i = I i ∪ R i overlap and are all are within the boundaries of the block
Hierarchy with four blocks and eight rectangles.
(b) Optimal solution. Notice, that two occurrences given by its dimensions W i , H i . This is expressed by the constraints:
where (x, y) are the integer coordinates of the bottom-left corner of the rectangle or block occurrence, and (w, h) are its integer width and height.
When a rectangle R i j is packed, its dimensions are chosen from a set of available pre-defined variants:
where both the width and height of each variant are integers. Exactly one of the available variants has to be selected per rectangle. Note that the rotation of a single-variant rectangle is modeled by adding another reflected variant.
On the other hand, the dimensions of the block occurrence I i k are given by the packing of its reference block B i ′ , i ′ = σ(I i k ). This means that w
A packing specifying coordinates and dimensions for each rectangle, block occurrence, and dimensions of each block, which respects Eqs. (1) to (3), is a feasible solution of the 2DHRP problem.
The objective is to minimize the size of the top block (w.l.o.g., the top block can be assumed to be B 1 ). Let its dimensions be W = W 1 , H = H 1 . Although minimization of area W • H is a natural objective, the proxy criterion of half-perimeter W + H is minimized instead. From previous work (Xu et al., 2017) and preliminary experiments, given a limited computation time, optimization of the half-perimeter proxy seems to be more efficient even with respect to the final area. Furthermore, the square-like solutions this objective prefers are of interest for both the logistics and circuit design applications. Naturally, when a case with one dimension fixed is encountered, the other dimension is directly minimized in a strip packing manner.
An illustration of the problem is shown in Fig. 1. The colors of the rectangles correspond to the color of their block’s node in the hierarchy. We can see that the block B 2 was used twice (blue boundary) and consists of a single blue rectangle R 2 1 and a block occurrence of red block B 4 . For the top block B 1 , we need to pack the two (blue) block occurrences of B 2 with a single (green) occurrence of B 3 and three purple rectangles
We first formulate the 2DHRP problem using MILP. The relative-positionbased approach, previously developed in de Lira-Flores et al. ( 2019), is used.
For block B i , i ∈ {1, . . . , n}, the partial model is the following:
The real variables x, y, w, h model the positions and dimensions of the objects in the block, while W i , H i model its boundary (Eqs. ( 5) and ( 6)). Nonoverlapping is resolved using big-M constraints in Eq. ( 7)-Eq. ( 11). There, binary variables r i,k u,v determine whether
Finally, the dimensions of the objects need to be constrained. For rectangles, one of the possible variants is selected using binary variables s i t in Eqs. ( 14) and ( 15), where s i j,t = 1 means variant t was selected for rectangle R i j . The size of block occurrences is coupled to the boundary variables of their relevant block using Eqs. ( 12) and ( 13). Note that these are the constraints that connect several partial single-block packing models into a monolithic 2DHRP model.
Additional constraints, which can be included, are those enforcing the “absence of cycles”, individually in left-right and up-down directions. A similar approach was used in Park and Klabjan (2017):
The values of the added variables g i,x u correspond to the topological ordering of the objects in both the vertical and horizontal directions. The addition of these constraints usually improved the performance of the MILP solver.
Altogether, we refer to “partial MILP model” consisting of equations ( 5)-( 24) (without any objective) for B i as Part M ILP B i
. When partial models are combined across the hierarchy, we obtain a monolithic MILP model minimizing the half-perimeter of the top block, further denoted as M-MILP:
We also provide a CP model as an alternative to the MILP model. Interval variables are used to model both position and dimensions of all rectangular objects. For each block B i , i ∈ {1, . . . , n} the following model is created:
The dimensions and positions of the objects are described by the properties of the interval variables x, y, and the boundary by the integer variables W i , H i . Boundary constraints are enforced by Eqs. ( 27) and ( 28), and the absence of overlaps is achieved by Eq. ( 29). Block occurrences are related to their relevant blocks by Eqs. ( 30) and ( 31). The selection of variants of the rectangles is done using optional interval variables in Eq. ( 32)-Eq. ( 34).
Note that the optional interval variables have fixed length (given the variant with which they are associated); the length of x and y intervals of rectangles is free, and the solver fixes them using the alternative constraints.
The well-known concept of cumulative-resource constraints, powerful in project scheduling, can also be advantageously used to add valid inequalities, thereby tightening the constraints. For each block B i :
pulse(where,height) creates a signal, that is equal to height where the interval where is present, and 0 otherwise. These equations apply when there are no block occurrences within B i . The single-dimensional cumulative constraints ensure that resource consumption (in case of Eq. ( 40), the resource consumption refers to the length of the associated “other-dimension” interval) does not exceed capacity (total height for Eq. ( 40)) at any time.
The effect of block occurrence I i k can be included in the mentioned constraints using heightOf operator, which passes the dynamic width and height
to newly constructed pulses.
Altogether, equations ( 27)-( 41) form the “partial CP model” Part CP B i for block B i . The monolithic model M-CP is obtained as:
Due to the complexity of the single-block packing problem alone, it is necessary to provide good initial solutions to the MILP or CP solvers. Two well-known heuristics can be used: the bottom left heuristic (Chazelle, 1983) and the best fit heuristic (Imahori and Yagiura, 2010). The bottom left heuristic utilizes the simplified approach of Martello and Monaci (2015): objects are packed one-by-one in the bottom left manner, using a single permutation of objects sorted by their area. The best fit heuristic is run three times, to utilize all three position selection strategies described in Imahori and Yagiura (2010). The heuristic expects one of the dimensions of the block to be fixed. If there is no such constraint (e.g., for the top block minimizing half-perimeter), the width of the block is set to the square root of the block’s expected area, estimated from its rectangles.
The runtime of both heuristics is negligible, and both of them are called whenever a solution to a single-block packing problem is needed; whenever a CP or MILP solver is to be started, and there is no existing solution, heuristics provide a warm start. Furthermore, solutions of lower-level blocks can be utilized to construct solutions for upper-level blocks, constructing an initial solution for the entire monolithic model rapidly.
Due to the presence of variants, the lower bounds developed for strip packing in, e.g., Alvarez-Valdes et al. (2009), cannot be directly used. Thus, the minimum area bounds are calculated in the following manner. For each block B i , area estimates across its block occurrences I i and areas of rectangles R i (specifically, their smallest variant) are combined:
The main point of interest is the lower bound of the top block: LB area .
Given these area bounds, the minimum half-perimeter bound can also be derived:
The half-perimeter bound for the top block is shortened to LB W+H .
Even with a good initial solution, solvers using monolithic models outlined in Section 3 struggle to optimize large instances within the limited computation time. Therefore, it is natural to decompose the problem and solve it in parts, even without an optimality guarantee. A simple way to do this is to use the Bottom-Up method, which was utilized in Xu et al. (2017); Zhu et al. (2023). Due to the out-tree hierarchy, packing solutions for the leaf blocks can be constructed directly, and these solutions are subsequently provided to the upper-level blocks. It is only necessary to pass information about the width and height of the child blocks, whose block occurrences behave as rectangles with a set of newly generated variants. When all blocks are processed (in the reverse topological order), a feasible solution of the original problem is obtained.
At the top level, the half-perimeter of the block is minimized. In lower levels, it is more important to provide packing solutions, which will work well with other block occurrences and rectangles at upper levels. Since no information from the upper levels is passed down, several packing variants must be generated. This is achieved by generating a set of maximum widths
for each block and solving the single-block strip-packing problem for each such width. The number of possible packing variants for each block depends on the desired number of variants N . With more variants, there is a greater chance that one of them will match well with the other objects, but less time will be spent optimizing each packing variant individually.
In addition, it is crucial to make a good selection of the W i max values. In this paper, they are selected by uniformly partitioning the suitable range of aspect ratios (widest and tallest possible packing), and calculating the dimensions given area estimate LB area (B i ). Altogether, a feasible packing variant with maximum width W i max,q is obtained by solving the following packing problem with a strip-packing-like objective:
This is done for each block and each variant q. The no-overlap and boundary constraints Eqs. Since we are mostly interested in obtaining good solutions in a reasonable amount of time, we need to divide such a pre-defined time limit T among all blocks of the instance in the following manner:
The more “complex” the block, the proportionally more time is allocated;
this should achieve a similar block quality across the entire instance. τ (B i ) is then uniformly divided between each packing variant of B i , as the authors outlined in Xu et al. (2017). Together, this Bottom-Up approach is referred to as BU N , where N is the number of packing variants generated for each block with the exception of the top block. Note that more complex time management and variant generation strategies could be employed. Finally, any solver could be used to solve the isolated single-block packing problem that is encountered for each block and packing variant. We elaborate on the choice of the solver in Section 6. This also means that the solution provided by heuristics in Section 4.1 could be directly used as a feasible packing variant, without explicitly utilizing any MILP or CP solver. This fully heuristic variant of Bottom-Up is denoted as HEUR in Section 6.
The Bottom-Up works very well despite its simplicity. However, it cannot reason which packing variants would be useful at the top node. To counter this weakness, Bottom-Up needs to generate multiple variants to ensure one of them actually works well. This wastes computation time (by investigating useless variants) and requires additional control by the user (how many/which variants, time management, etc.). This motivated us to develop a more informed decomposition method, which could outperform the Bottom-Up baseline.
In this section, we describe the main contribution of our paper: the LBBD-based method, which aims to overcome the drawbacks of the Bottom-Up method mentioned in the last paragraph of Section 4.2. We first describe the decomposition in its exact formulation, enabling us to find an optimal solution. Later, in Section 5.2, a heuristic version of this decomposition is described that produces good solutions in a limited time. That version is used in our experiments.
Our decomposition method based on a recursive algorithm is illustrated in Fig. 2. Fig. 2a shows the control flow diagram of the procedure for block B i . The input of the procedure is an additional constraint (none for the top block and maximum allowed width for any other block), and the output of the procedure is the packing for B i that respects all constraints. The algorithm starts by processing the top block, where it tries to minimize its half-perimeter, and recursively enters the respective block’s children with a maximum width constraint derived from the parent while minimizing the height of the block.
At each B i , the “master problem” needs to be solved. Only the partial model (( 5)-( 24) for MILP) of B i from Section 3 is constructed, and the additional constraint passed from the parent block of B i is added. Without the partial models for the children of B i , the width and height variables
would be free variables, and the dimensions of the block occurrences would be set arbitrarily (see Eqs. ( 12), ( 13), ( 30) and ( 31)). Therefore, these variables need to be constrained in another manner. This is done by using the constant area bound:
For MILP, this is approximated as a piecewise linear function. The set of potentially feasible solutions given by this constraint for B i ′ ∈ C i is the convex space delimited by the blue hyperbole shown in Fig. 2b. Thus, the dimensions of occurrences of B i ′ at least follow the lowest possible area. These constraints, cuts, which are described later, and partial model from Section 3, create the “master problem” in Fig. 2a. For MILP, this model for some (non-top) block i is:
The solution to this problem, depicted as PLAN in Fig. 2a, that is influenced by constraint Eq. ( 50) from the parent of B i , may not be a feasible partial solution for the original 2DHRP problem. It is necessary to verify that the dimensions of each child block B i ′ ∈ C i are valid; i.e., the child block can be truly packed into a boundary with dimensions W i ′ PLAN , H i ′ PLAN , which are the dimensions of block occurrences of B i ′ in the “master problem” solution.
For that purpose, the same procedure of Fig. 2a is started for each B i ′ . B i ′ again optimizes model ( 49)-( 53): minimizing H i ′ and adding the constraint
The procedure is recursively initiated for children of B i ′ . Eventually, feasible packing of B i ′ given the constraints is found, with actual feasible dimensions W i ′ ACT , H i ′ ACT . This packing (ACT in Fig. 2a) is returned to B i . B i checks whether the actual packing of B i ′ follows the suggested dimen-
the solution of the “master problem” of B i is feasible with respect to the children subproblems, and optimal packing of the original problem was found. Otherwise, the following cut is added to the “master problem” model to reduce the space of potentially feasible dimensions for each child block B i ′ that could not be verified:
If no solution was found, W i ′ PLAN was too narrow, and we add cut
Then, the “master problem” for B i is solved again with these additional cuts. If “master problem” is solved optimally, these cuts remove only infeasible pairs W i ′ , H i ′ , and after a sufficient number of iterations of the loop of Fig. 2a the “master problem” would result in a solution that is feasible given the children of B i .
The way in which cuts (54) reduce the search space is shown in Fig. 2b.
There, potentially feasible pairs of width and height of one of the child block B i ′ , i ′ ∈ C i , correspond to the white region, while infeasible pairs correspond to the gray region. Potentially feasible pairs refer to dimensions of block occurrences of B i ′ , that are feasible with respect to the current set of cuts introduced for B i ′ . With each iteration of Fig. 2a, a new cut derived from the verification of a new pair W i ′ PLAN , H i ′ PLAIN , may be added to model B i ′ more precisely.
The first reduction of potentially feasible pairs is done by the hyperbolic curve corresponding to Eq. ( 48). In the first iteration, PLAN dimension pair (A) was suggested as the “master problem” solution of B i , and taller ACT dimension pair (B) was verified as the B i ′ subproblem. This created the first cut. In the next iteration, a new PLAN dimension pair (C) was suggested, but again the child solution had a greater height (D). Finally, PLAN (E) was successfully verified in the third iteration by (F). The cuts obtained by the first two iterations reduced the space by introducing two “stairs”. Note that the orange tops (including B, D, E/F) of the stairs belong to a potentially feasible region, while the vertical faces (including A and C) do not.
The method described in the previous section produces an optimal solution, but it relies on an optimality proof for each single-block packing problem to produce valid cuts, which is time-consuming and, even for small-sized instances, makes the method practically inapplicable. In this section, several heuristic modifications are developed, sacrificing optimality of the method but achieving good results in a reasonable time. The recursive procedure works in a similar way as in Section 5.1, but its control flow diagram is extended to stop optimization early, as shown in Fig. 3a.
First, while the original LBBD automatically decides which dimensions of blocks to explore, we still need to manage the allocation of the computation time. While the “master problem” is optimized, periodic checks are performed to determine whether the solver has found a new solution. If no improvement was achieved within the local time limit (improvement period), the computation is aborted, and the current (possibly non-optimal) solution PLAN is returned.
Note that such a solution, when passed from child block to parent block, could create a cut (as in Fig. 3b) that may remove some otherwise feasible width-and-height pairs. This means that the cut overconstrains the problem as the dark gray area no longer contains only infeasible pairs, as was the case in Section 5.1, but may also contain feasible pairs. That could prevent an optimal solution from being found, since the solver cannot utilize them.
However, such non-optimal approaches often yield a good time-performance ratio (Raidl et al., 2014). For the purpose of the heuristic-LBBD, we view the cuts as being valid, and a dimension pair being considered infeasible as a statement regarding this heuristic setting, not the original 2DHRP.
Since it may take many iterations of the original loop of Fig. 2a for the “master problem” to produce a feasible packing, we need to ensure that at least some packing is always found early. This is done by the “restricted master problem” step in Fig. 3a. This step solves structurally similar model as in “master problem”. However, instead of constraining the width and height of the child blocks by their area and generated cuts, they are fixed to the dimensions of feasible packing found by the child blocks:
The solver is partially initialized with relative positions from the solution of the “master problem”. Solution of the “restricted master problem” is actually a feasible packing of B i since it uses the feasible packing for each child.
It is then used as an upper bound while iterating, and when the time limit allocated for the block is reached, the best solution so far is returned.
When B i is being optimized, once the control flow leaves the main loop and enters block “Fine-tuning” in Fig. 3a, there is an opportunity to improve the packing so the parent of B i can strengthen its cuts. From perspective of parent of B i , let W i PLAN be the width suggested by the parent, and let W i
ACT , H i ACT be the dimensions of the best solution found after leaving the main loop in Fig. 3a for current block B i .
Improving Width of ACT: Solution ACT is a feasible packing of B i (from one of the iterations of the loop). Since B i was optimized in a strip packing manner by minimizing the height, it can be further improved by minimizing its width. This is done by re-solving the same “restricted master problem” model, but with objective min W i and setting
The solver is warm-started with the existing solution ACT. Once optimized, solution LEFT is obtained, with
The solution should be “to the left” of ACT in Fig. 3b, and thanks to the smaller width, it is an improvement on the original solution ACT. Both LEFT and ACT solutions are returned to the parent of B i once the control flow exits the diagram Fig. 3a. Then, the parent of block B i adds the same cut as in Eq. ( 54):
ACT , then this does not need to be validated by solving the subproblem for B i since a feasible child solution is already known.
Strengthening the Cut: Similarly, while still in the “Fine-tuning” part of the diagram for B i , we can try to find a closest packing with a smaller height. To do this, the height decrease α is selected and the modified “master problem” model (with children of B i modeled using generated cuts) is solved:
Solution should satisfy:
Due to the imprecise modeling of child blocks, this solution may not be a feasible packing of B i , but its width can be interpreted as a lower bound for fixed height
Thus, once B i finishes and returns control to its parent, the parent adds a wider cut to better model B i :
This expands the original cut to the right, as is shown in Fig. 3b; the parent of B i reduces the set of potentially feasible dimensions of its representation of B i . If α = 1, the cut should not overestimate the height of any solution with width between W i ACT and W i RIGHT . Risking this guarantee, a greater reduction of search space can be obtained by setting α to larger values, e.g., α = ⌊0.05 • H i ACT ⌋. Finally, the case with α = 0 corresponds to omitting the computation of RIGHT altogether.
The result of fine-tuning can be seen in Fig. 3b. We can see the initial hyperbole and an additional cut. Calculation was initiated by red (W i PLAN , H i PLAN ) pair from the parent of block B i . However, found (W i ACT , H i ACT ) has height greater than the original suggestion. Then, fine-tuning was performed and produced solutions (W i LEFT , H i LET ) and (W i RIGHT , H i RIGHT ). LEFT solution improved the existing ACT solution, RIGHT expanded the cut and significantly reduced the solution space, as the dark gray area suggests. In Section 6, we test three versions of our proposed method, abbreviated as LBBD: LBBD 0 does not use the decremented-height part of fine-tuning at all, LBBD 1 sets α = 1, and LBBD R uses the radical strategy with α = ⌊0.05 • H i ACT ⌋.
In this section, the computation on the LBBD 1 is demonstrated using a three-level instance with hierarchy shown in Fig. 4 with individual rectangles shown later in Fig. 7.
The experiment ran for 10 minutes. The sequence diagram in Fig. 5 shows how the decomposition progressed in the first 120 seconds. We can see how the solver first solves the “master problem” (first purple bar) for the top block B 1 , and at 20s it enters its children’s subproblems to verify whether the proposed dimensions work or whether cuts need to be introduced. This leads to the same procedure being done in the B 2 block, which further calls B 4 at 27s. Then B 3 is called at 32s. After that, the “restricted master problem” is rapidly solved in B 1 to obtain the first feasible solution at 33s.
In the first 120 seconds, four iterations of the “master problem” of B 1 were finished. In the third iteration from 72s to 100s, the inner loop for the B 2 was run twice, before the control was returned to B 1 . Furthermore, we can observe how the decomposition searches the possible dimensions of B 2 in Fig. 6. We see that B 1 focused on a solution that utilized tall variants of
We implemented the algorithms using Python 3.10. Experiments were performed on Intel Xeon E5-2690 using a single thread. CP Optimizer v22.1 was used as a CP solver, and Gurobi Optimizer v12.0 as a MILP solver.
The problem described in this paper does not utilize any standard instance sets found in the literature. Inspired by Grus and Hanzálek (2024), several sets of instances inspired by the placement of analog integrated circuits were generated to compare the monolithic methods, the existing Bottom-Up approach (Xu et al., 2017), and the proposed heuristic-LBBD method.
The generated instance sets are outlined in Table 1, and are provided in Grus et al. (2025). The way the instances were generated is described in the following sections.
Each instance set is characterized by the number of levels l. Each instance has blocks organized in a randomly generated hierarchy. This was done so the maximum path from the top block to one of its leaves contained exactly l blocks (thus, a single-level instance contains only the top block).
We generated instances with up to seven levels, which spans the typical complexity of designed analog integrated circuits. The average depth of the nodes, and the average number of blocks per instance are reported in Table 1 for each set of generated instances.
With the hierarchy determined, rectangles and block occurrences are generated for each node in the graph. One block occurrence per block was created, given the generated hierarchy. However, for sets L3-M and L4-M, multiple occurrences of the same block were allowed. The number of rectangles to generate was randomly sampled for each block. This value was as low as 5 and as high as 40-80 objects (relevant number of components for integrated circuits). For instances with more levels (and, thus, implicitly more blocks), the upper bound on the number of rectangles was set to the smaller value. The average number of objects per block and the average total number of objects are shown in Table 1.
Each rectangle was generated with up to 5 variants (with the exception of L1-NV, where only a single variant was generated) by first sampling an area from the pre-defined interval and then sampling the aspect ratio of the variant. Generation starts at the top block, and the interval of possible areas is multiplied by a randomly sampled value from the “area multiplier” interval (0.5; 1) whenever the child block is recursively entered. This way, the size of the rectangles is reduced for lower-level blocks.
For the two level instances, three sets L2-S, L2-I, L2-L were generated.
The only difference between these sets is the aforementioned “area multiplier” interval. L2-S used interval (0.1; 0.3); this made rectangles sampled in the child block much smaller than those in the parent. L2-I used (0.3, 0.7), and L2-L used (0.7, 1.0). These instances were used to test whether and how the size of lower-level blocks (which depends on the size of their rectangles) affects the optimization.
The computation time was fixed for each method given the instance set.
The value is reported in the last column of Table 1, and was kept fixed independently of the number of rectangles of the specific instance. For instance, with fewer levels, 10-30 minutes were provided, and a time limit of up to 4 hours was used for the most complex ones.
For multi-level instances, the time is managed as described in Section 4.2 and Section 5.2. Time is allocated proportionally among blocks for Bottom-Up. LBBD uses an improvement period of 10 seconds before each optimization of the single-block packing problem is aborted (i.e., when the objective does not improve). The main loop for each block in Fig. 3a is limited by 30 seconds before the best-so-far solution is “fine-tuned” and returned to the parent block. Table 1: Instance sets and their characteristics.
We compared the heuristic baseline HEUR and monolithic models M-CP, M-MILP on L1-NV and L1 instance sets to determine which of the two solving techniques should be used as the backbone of the decomposition methods.
To compare the results, mean (and median in parentheses) values of W+H and AREA gaps across the instance sets are reported in Table 2. For a given instance and solution with dimensions W, H, define:
As Table 2 shows, the best results are reported by the M-CP method, both for the area and half-perimeter. The difference is quite significant on L1-NV instances with a single variant per rectangle. The difference is smaller for the multi-variant instances L1, but the median of M-CP for the W+H gap is still better: 2.96 instead of 3.97 for M-MILP.
An alternative objective for CP was also examined, optimizing the area W • H explicitly. The results in Table 2 for M-CP AREA show that using the area objective does not help, but rather diminishes the overall performance of the CP solver. The performance according to the size of the instance is visualized in Fig. 8. There, for each instance of L1, L1-NV, the value of W + H gap is shown with respect to the instance’s number of rectangles. We can see that the gap reported by the heuristic HEUR improves with increasing number of rectangles, but exact methods are still mostly better, even though their performance worsens. M-CP performs best for the mid-sized instances with 20 -55 rectangles, but for the larger ones, the M-MILP wins. Since most of the blocks of the multiple-level instances were generated with around 50 rectangles, M-CP derived single-block packing solver was used as a solver for both decomposition methods BU and LBBD approaches.
As a final note, the experiment with M-CP and a time limit of 10 hours was performed. The mean values of W + H gap were 3.65 for L1-NV (4.91 for 10-minute M-CP) and 2.25 for L1 (3.63 for 10-minute M-CP). Although these values are not obtained from proven optimal solutions, they provide insight into the gap between lower bounds and solutions found by the compared methods.
In our preliminary experiments, we tested several alternative models for CP Optimizer, including omission of the 1D cumulative constraints, explicit pairwise modeling with two noOverlap constraints, and usage of integer variables instead of interval ones. Furthermore, we also experimented with ORtools CP solver (Perron and Didier, 2025), using its noOverlap2D constraint (specific case of geost constraint). However, the model used in this paper provided the overall best results.
In this section, we primarily study the performance of decomposition methods: Bottom-Up BU and LBBD LBBD. BU uses a different number of variants per block: BU 3 , BU 5 , BU 9 , BU 13 , BU 25 use 3, 5, 9, 13, and 25 variants, respectively.
LBBD versions differ in their use of fine-tuning of Section 5. We can see that the monolithic methods M-CP and M-MILP are not per-forming well for the multi-level instances. Interestingly, it actually seems that M-MILP outperforms its counterpart, but it is still much worse than decomposition methods. Thus, we do not include them in the comparison later in the paper. Furthermore, differences between the various two-level instance sets are not significant. It suggests that different scaling of the lower-level block and its rectangles does not have a significant effect on the methods.
If we focus on different versions of BU, it is not straightforward to pinpoint the winner, with the results being very close on all instance sets. The same holds for LBBD versions, and also for the values of half-perimeter and area.
However, LBBD R reports non-trivially better results; on average, about 0.3% smaller W+H is reported than any BU version; a notable gain given how close both approaches probably are to the theoretical lower bound.
When we compare the “best bound” columns BU B and LBBD B , the differences between the two approaches are not that significant, with BU B winning for two datasets regarding W + H gap by a small margin. This suggests that the Bottom-Up approach is quite powerful, but the incorrect choice of the number of variants to explore negatively affects the individual versions.
In Tables 5 and6, we report the half-perimeter and area results for more complex instances. An example of a complex six-level instance and its solution is shown in Fig. 9.
As before, Table 5 shows how the “best bound” LBBD B outperforms its Bottom-Up counterpart BU B . The difference between their W + H gap is between -0.1 percentage point (for L4, where BU 25 produced outstanding solutions) up to 2.3 percentage point for L7. The difference seems to be larger for L3-M and L4-M, where multiple occurrences of the same block are used. This suggests that the multiple occurrences of the “nonoptimally” packed block may lead to multiplication of the wasted space in the upper levels. Visually, this is presented for one instance of L3-M in Fig. 10, where solutions obtained by the best performing version of each method are shown. Thus, a more informed approach that can reason about a block’s dimensions beforehand may be much better suited for such a scenario.
If we focus only on individual versions of both methods, we can see that the proposed method performs better, and the difference between them seems to increase with the number of levels; from 0.2 up to 2.5 percentage points, for instance, set L7. It can be clearly seen that the small number of solutions generated by BU 3 is detrimental, since there is no guarantee that a suitable partial packing was produced throughout the hierarchy. However, not even BU 13 or BU 25 are good enough to defeat LBBD methods, probably due to the amount of time wasted on optimizing useless variants.
This can be further studied in the box plot generated for instances of L7, shown in Fig. 11. It is clear that the width of the inter-quartile range is much greater for the versions of BU in comparison to versions of LBBD. This suggests that LBBD produces more consistent results. When the different versions of LBBD are compared, there is not much difference between them with respect to the results. Thus, it may not be necessary to strengthen the cuts generated by subproblems, at least for the instances considered in this paper, simplifying the overall method.
Altogether, the results suggest that the proposed LBBD method performs better on the generated instances. This is especially true for more complex instances with more blocks spread across more levels, as the results for sets L6, L7 demonstrated. However, we need to consider that both LBBD and Bottom-up methods are finding a solution very close to the theoretical lower bound, and therefore, even the straightforward Bottom-Up method is a suitable way to tackle the 2DHRP.
The proposed method seems to, on average, outperform the Bottom-Up method for a diverse set of instances. Furthermore, it is solver independent, since any modeling method and appropriate solver could be utilized to solve the single-block packing problem, as long as the width-and-height cuts can be generated and added to the solver. This could be extended to 3D to apply a similar decomposition method for more practical logistics problems.
Proposed decomposition could potentially be utilized on single-level packing instances. If the complexity of such an instance is too much, a “virtual hierarchy” can be artificially created by clustering some rectangles together to produce “virtual blocks”.
For practical application, such as in the design of integrated circuits of Xu et al. (2017); Grus and Hanzálek (2024), some other objectives need to be considered (e.g., length of the components interconnections), but also complex constraints regarding, for example, non-uniform minimum distances between rectangles. Both of these problems complicate the way the cuts are generated, and how to evaluate the quality of the child subproblem solutions, and pass such information back to the parent.
In this paper, we formalized a hierarchical packing problem, which models the core features of the packing and placements problems found in (i) design of integrated circuits, (ii) design and planning of facility layouts, and (iii) packing in logistics.
Due to the complexity of the problem, we implemented a baseline Bottom-
Up method, and we proposed an LBBD method. The main advantage of the proposed method is that it is up to the parent block to select a suitable dimension of the subblocks, rather than randomly generating them as in the Bottom-Up method.
We evaluated the MILP and CP solvers on single-level packing instances and used the CP solver further within the decomposition methods due to its performance. When we compared the Bottom-Up and LBBD methods on instances with between two and seven levels, we showed that our proposed LBBD is superior to the Bottom-Up.