Enterprise Resource Planning Using Multi-type Transformers in Ferro-Titanium Industry

Notice: This research summary and analysis were automatically generated using AI technology. For absolute accuracy, please refer to the original arXiv source.

Combinatorial optimization problems such as the Job-Shop Scheduling Problem (JSP) and the Knapsack Problem (KP) are fundamental challenges in operations research, logistics, and enterprise resource planning (ERP). These problems often require sophisticated algorithms to achieve near-optimal solutions within practical time constraints. Recent advances in deep learning have introduced transformer-based architectures as promising alternatives to traditional heuristics and metaheuristics. We leverage the Multi-Type Transformer (MTT) architecture to address these benchmarks in a unified framework. We present an extensive experimental evaluation across standard benchmark datasets for JSP and KP, demonstrating that MTT achieves competitive performance across different sizes of these benchmark problems. We showcase the potential of multi-type attention on a real application in the ferro-titanium industry. To the best of our knowledge, we are the first to apply multi-type transformers in a real manufacturing setting.


💡 Research Summary

This paper investigates the use of a Multi‑Type Transformer (MTT) architecture to solve two classic combinatorial optimization problems that are central to enterprise resource planning (ERP): the 0‑1 Knapsack Problem (KP) and the Job‑Shop Scheduling Problem (JSP). The authors argue that traditional exact methods (branch‑and‑bound, dynamic programming) and classic meta‑heuristics (tabu search, genetic algorithms, simulated annealing) struggle to deliver high‑quality solutions within acceptable runtimes for large‑scale instances. Recent advances in neural combinatorial optimization (NCO) have shown that transformer‑based models can learn problem structure and generate competitive solutions, but standard transformers employ a single attention mechanism that is ill‑suited for heterogeneous entities such as items versus capacity or jobs versus machines.
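As a point of reference for the exact methods mentioned above, the 0-1 knapsack problem admits a classic dynamic-programming solution in O(n × capacity) time. The sketch below is a textbook baseline, not code from the paper:

```python
def knapsack_01(values, weights, capacity):
    """Exact 0-1 knapsack via dynamic programming, O(n * capacity).

    best[c] holds the maximum value achievable with total weight <= c.
    """
    best = [0] * (capacity + 1)
    for v, w in zip(values, weights):
        # Iterate capacities downward so each item is used at most once.
        for c in range(capacity, w - 1, -1):
            best[c] = max(best[c], best[c - w] + v)
    return best[capacity]
```

The pseudo-polynomial dependence on `capacity` is exactly why such exact methods become impractical at the large instance sizes the paper targets.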

To address this, the paper adopts the Multi‑Type Transformer introduced by Drakulić et al. (2025). MTT integrates several attention heads, each specialized for a particular node or edge type, while sharing most parameters across heads. This design enables the model to capture distinct relational patterns without proliferating the number of learnable parameters.

Both KP and JSP are reformulated as heterogeneous graphs. In the knapsack formulation, items are represented as nodes connected to a single capacity node, with binary selection variables and a feasibility mask that blocks infeasible selections. In the job‑shop formulation, operations are nodes, precedence constraints are encoded as conjunctive edges, and machine‑level mutual‑exclusivity is encoded as disjunctive edges. Node features include processing times, machine identifiers, release times, and partial schedule information; edge features capture precedence direction and conflict relationships.
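The two graph encodings described above can be sketched with a minimal heterogeneous-graph container. This is an illustrative data structure of my own devising (the paper does not specify its internal representation); `HeteroGraph`, `knapsack_graph`, and `jobshop_graph` are hypothetical names:

```python
from dataclasses import dataclass, field

@dataclass
class HeteroGraph:
    # node_type -> list of feature vectors; edge_type -> list of (src, dst)
    nodes: dict = field(default_factory=dict)
    edges: dict = field(default_factory=dict)

def knapsack_graph(values, weights, capacity):
    """Items as nodes, all connected to a single capacity node."""
    g = HeteroGraph()
    g.nodes["item"] = [[v, w] for v, w in zip(values, weights)]
    g.nodes["capacity"] = [[capacity]]
    g.edges["item-capacity"] = [(i, 0) for i in range(len(values))]
    return g

def jobshop_graph(jobs):
    """jobs: one list per job of (machine_id, processing_time) operations."""
    g = HeteroGraph()
    ops, op_index = [], {}
    for j, job in enumerate(jobs):
        for k, (m, p) in enumerate(job):
            op_index[(j, k)] = len(ops)
            ops.append([m, p])
    g.nodes["operation"] = ops
    # Conjunctive edges: precedence between consecutive operations of a job.
    g.edges["conjunctive"] = [
        (op_index[(j, k)], op_index[(j, k + 1)])
        for j, job in enumerate(jobs) for k in range(len(job) - 1)
    ]
    # Disjunctive edges: mutual exclusivity between operations sharing a machine.
    by_machine = {}
    for idx, (m, _) in enumerate(ops):
        by_machine.setdefault(m, []).append(idx)
    g.edges["disjunctive"] = [
        (a, b) for grp in by_machine.values()
        for i, a in enumerate(grp) for b in grp[i + 1:]
    ]
    return g
```

In a real implementation, node feature vectors would also carry release times and partial-schedule information, as the summary notes.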

The MTT encoder processes these graphs by first flattening them into a token sequence, then applying type‑specific attention blocks. Each block computes queries, keys, and values using matrices that are learned only for the corresponding type, allowing the model to attend differently to item‑capacity relations versus job‑machine relations. Parameter sharing across blocks ensures that the same backbone can be trained jointly on both problem families, facilitating transfer learning and reducing the need for problem‑specific hyper‑parameter tuning.
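One simplified reading of "type-specific attention" is that each token is projected through the query/key/value matrices of its own type, after which a single attention pass runs over all tokens. The NumPy sketch below illustrates that interpretation only; it is not the authors' implementation, and `multi_type_attention` is a hypothetical name:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def multi_type_attention(tokens, token_types, proj, d_k):
    """tokens: (n, d) array; token_types: length-n list of type names;
    proj: type -> {'Q','K','V'} of (d, d_k) projection matrices.

    Each token is projected with its own type's matrices, then one
    shared scaled-dot-product attention is computed over all tokens.
    """
    Q = np.stack([tokens[i] @ proj[t]["Q"] for i, t in enumerate(token_types)])
    K = np.stack([tokens[i] @ proj[t]["K"] for i, t in enumerate(token_types)])
    V = np.stack([tokens[i] @ proj[t]["V"] for i, t in enumerate(token_types)])
    attn = softmax(Q @ K.T / np.sqrt(d_k))  # (n, n) attention weights
    return attn @ V                          # (n, d_k) mixed representations
```

Because only the small per-type projection matrices differ while the rest of the block is shared, adding a type costs far fewer parameters than adding a full attention layer, which matches the parameter-sharing claim above.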

Experimental evaluation covers six knapsack instance sizes (n = 50, 60, 70, 80, 90, 100) and a range of JSP configurations with varying numbers of jobs and machines. For each size, the authors compare MTT against (i) an exact OR‑Tools solver (used as a ground‑truth benchmark), (ii) classic meta‑heuristics, and (iii) recent single‑type transformer baselines. Results show that MTT consistently attains an average optimality gap of 2–5 % relative to the exact solver while reducing wall‑clock time by 30–60 %. In the largest JSP instances (e.g., 30 jobs on 10 machines), MTT matches the makespan of the best meta‑heuristic but does so in roughly half the time.
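The 2–5 % figures above are presumably the standard relative optimality gap against the exact solver's objective. A minimal sketch of that metric, assuming a minimization objective such as makespan:

```python
def optimality_gap(solution_value, optimal_value):
    """Relative gap to the exact optimum, as a percentage.

    Assumes a minimization problem (e.g., makespan), so a larger
    solution_value means a worse solution and a positive gap.
    """
    return 100.0 * (solution_value - optimal_value) / optimal_value
```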

A real‑world case study is presented from a ferro‑titanium manufacturing plant in Montreal. The plant must allocate raw material batches (knapsack) and schedule a set of machining operations (job‑shop) on a limited set of furnaces, rollers, and finishing stations. Previously, the plant used two separate ERP modules, each requiring manual parameter tuning and substantial analyst effort. After integrating the unified MTT solution, daily planning time dropped from 45 minutes to 18 minutes. Operational metrics improved: equipment utilization rose by an average of 3.2 %, and instances of capacity overflow fell by 1.8 %. The study demonstrates that a single learned model can replace multiple handcrafted heuristics in a production environment.

The authors acknowledge several limitations. First, memory consumption grows sharply for very large graphs (thousands of items or hundreds of operations), which may hinder deployment on commodity hardware. Second, the current implementation is static; it does not incorporate reinforcement‑learning‑based online adaptation for dynamic disruptions such as machine breakdowns or urgent rush orders. Third, interpretability remains limited—while attention weights can be visualized, it is not straightforward to map a specific weight to a concrete operational rule.

Future work is outlined along three dimensions: (a) developing memory‑efficient sampling or pruning strategies to scale MTT to industrial‑size problems, (b) coupling the architecture with reinforcement learning to enable real‑time re‑scheduling, and (c) enhancing explainability through attention‑weight attribution and rule extraction techniques.

In summary, the paper provides a thorough empirical validation that multi‑type transformers can serve as a unified, high‑performance solver for heterogeneous combinatorial optimization tasks within ERP. It bridges the gap between recent NCO research and practical manufacturing applications, offering a concrete roadmap for deploying deep‑learning‑based optimization in real‑world industrial settings.

