Supervised Learning for the (s,S) Inventory Model with General Interarrival Demands and General Lead Times

Notice: This research summary and analysis were automatically generated using AI technology. For full accuracy, please refer to the original arXiv source.

The continuous-review (s,S) inventory model is a cornerstone of stochastic inventory theory, yet its analysis becomes intractable for non-Markovian systems, where evaluating long-run performance measures typically relies on costly simulation. This paper proposes a supervised learning framework, built on a neural network model, for approximating stationary performance measures of (s,S) inventory systems with generally distributed demand interarrival times and lead times under lost sales. Simulations are first used to generate training labels, after which the neural network is trained. Once trained, the network delivers almost instantaneous predictions of key system metrics, such as the stationary distribution of the inventory level, the expected cycle time, and the probability of lost sales. We find that a small number of low-order moments of the input distributions suffices to train the neural networks and to accurately capture the steady-state distribution. Extensive numerical experiments demonstrate high accuracy over a wide range of system parameters, so the trained network effectively replaces repeated, costly simulation runs. The framework extends easily to other inventory models, offering a fast and efficient alternative for analyzing complex stochastic systems.


💡 Research Summary

The paper addresses the long‑standing difficulty of analyzing continuous‑review (s,S) inventory systems when the inter‑demand times and lead times follow arbitrary, non‑exponential distributions. In such non‑Markovian settings, traditional analytical techniques break down and practitioners must rely on costly, time‑consuming discrete‑event simulations to obtain steady‑state performance measures such as the inventory‑level distribution, average replenishment cycle time, and the probability of a lost sale. The authors propose a supervised‑learning framework that replaces repeated simulation with a neural‑network (NN) surrogate model capable of delivering near‑instantaneous predictions of these metrics.

The methodology consists of three stages. First, data generation: the authors sample a wide variety of demand‑interarrival and lead‑time distributions from the Phase‑type (PH) family, which is dense in the class of non‑negative continuous distributions — any such distribution can be approximated arbitrarily closely by a PH distribution — ensuring that the training set covers the full space of realistic stochastic inputs. For each sampled pair of distributions, a high‑fidelity discrete‑event simulation is run offline to compute the true steady‑state quantities, which serve as labels.
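The first stage can be sketched as follows. This is a minimal illustration, not the paper's exact sampling scheme: it draws a random PH(α, T) distribution of a given order and computes its first raw moments via the standard identity m_k = k! · α(−T)^(−k)·1 (the log-transformed moments later serve as network features). The rate ranges and the Dirichlet prior on α are illustrative assumptions.

```python
import numpy as np

def sample_ph(order, rng):
    """Sample a random phase-type (PH) distribution of a given order.

    Returns (alpha, T): the initial probability vector and the
    sub-generator matrix. A simple random construction for illustration,
    not the authors' sampling procedure.
    """
    alpha = rng.dirichlet(np.ones(order))
    rates = rng.uniform(0.5, 2.0, size=(order, order))   # transition rates
    np.fill_diagonal(rates, 0.0)
    exit_rates = rng.uniform(0.5, 2.0, size=order)       # absorption rates
    T = rates.copy()
    # Diagonal makes each row of the full generator sum to zero.
    np.fill_diagonal(T, -(rates.sum(axis=1) + exit_rates))
    return alpha, T

def ph_moments(alpha, T, n=5):
    """First n raw moments of PH(alpha, T): m_k = k! * alpha @ (-T)^{-k} @ 1."""
    inv = np.linalg.inv(-T)
    ones = np.ones(T.shape[0])
    moments, power, fact = [], np.eye(T.shape[0]), 1.0
    for k in range(1, n + 1):
        power = power @ inv
        fact *= k
        moments.append(fact * alpha @ power @ ones)
    return np.array(moments)

rng = np.random.default_rng(0)
alpha, T = sample_ph(3, rng)
m = ph_moments(alpha, T)
log_m = np.log(m)   # log-transformed moments used as NN input features
```

Each sampled (α, T) pair would then be fed to the discrete-event simulator to produce the steady-state labels.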

Second, model design and training: each distribution is represented by its first n moments (the authors find n = 5 sufficient). Moments are log‑transformed to improve numerical stability and concatenated into a fixed‑size feature vector. The NN is a modest multilayer perceptron (3–4 hidden layers, ReLU activations) with three outputs: (i) a probability vector approximating the inventory‑level distribution, (ii) a scalar for the mean cycle time, and (iii) a scalar for the lost‑sale probability. A composite loss combines cross‑entropy (for the distribution) and mean‑squared error (for the scalar outputs). Training uses the Adam optimizer and early‑stopping based on a validation set.
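The architecture and composite loss described above can be sketched in plain NumPy (a forward pass only; layer widths, initialization, and the equal weighting of the loss terms are illustrative assumptions, and the paper's actual training code lives in the linked repository):

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def softmax(x):
    z = x - x.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

class SurrogateMLP:
    """Tiny MLP sketch: log-moment features -> (inventory-level
    distribution, mean cycle time, lost-sale probability)."""
    def __init__(self, in_dim, hidden, n_levels, rng):
        dims = [in_dim] + hidden
        self.layers = [(rng.normal(0, np.sqrt(2.0 / a), (a, b)), np.zeros(b))
                       for a, b in zip(dims[:-1], dims[1:])]
        h = hidden[-1]
        self.W_dist = rng.normal(0, 0.1, (h, n_levels))  # distribution head
        self.W_scal = rng.normal(0, 0.1, (h, 2))         # cycle time, lost-sale prob

    def forward(self, x):
        for W, b in self.layers:
            x = relu(x @ W + b)
        dist = softmax(x @ self.W_dist)
        cycle, lost = (x @ self.W_scal).T
        return dist, cycle, lost

def composite_loss(dist_pred, dist_true, cyc_pred, cyc_true, lost_pred, lost_true):
    """Cross-entropy on the distribution head plus MSE on the scalar heads."""
    ce = -np.mean(np.sum(dist_true * np.log(dist_pred + 1e-12), axis=1))
    mse = np.mean((cyc_pred - cyc_true) ** 2) + np.mean((lost_pred - lost_true) ** 2)
    return ce + mse

rng = np.random.default_rng(1)
net = SurrogateMLP(in_dim=10, hidden=[64, 64, 64], n_levels=20, rng=rng)
x = rng.normal(size=(8, 10))          # a batch of concatenated log-moment vectors
dist, cycle, lost = net.forward(x)
```

In practice this would be trained with Adam and early stopping, as the paper describes.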

Third, inference and policy optimization: once trained, the NN can predict the three performance measures for any new pair of demand and lead‑time distributions by simply feeding the moments into the network. The authors report average relative errors below 0.2 % and KL‑divergences under 0.01, while inference time drops from minutes (simulation) to milliseconds—a speed‑up of four orders of magnitude. With these rapid predictions, the authors embed the NN inside an optimization loop to search for the cost‑minimizing (s*, S*) thresholds under a standard cost structure (holding, ordering, and lost‑sale costs). Grid search and Bayesian optimization experiments demonstrate that the NN‑based approach finds near‑optimal policies 100× faster than a pure simulation‑based search, with negligible cost differences.
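The surrogate-in-the-loop policy search can be sketched as a grid search over (s, S) where a fast `predict` callable stands in for the trained network. The cost coefficients and the `toy_predict` stand-in below are hypothetical placeholders used only to make the loop runnable; they are not the paper's cost parameters or model.

```python
import itertools

def expected_cost(s, S, predict, h=1.0, K=50.0, p=10.0):
    """Long-run cost rate under a standard structure: holding (h per unit),
    ordering (K per cycle), and lost sales (p per lost-sale probability unit).
    `predict` stands in for the trained surrogate: it maps a policy (s, S)
    to (mean inventory level, expected cycle time, lost-sale probability)."""
    mean_inv, cycle, lost = predict(s, S)
    return h * mean_inv + K / cycle + p * lost

def grid_search(predict, s_range, gap_range):
    """Exhaustive search over s and the reorder gap S - s > 0."""
    best = None
    for s, gap in itertools.product(s_range, gap_range):
        c = expected_cost(s, s + gap, predict)
        if best is None or c < best[0]:
            best = (c, s, s + gap)
    return best

# Hypothetical stand-in with a built-in trade-off, for illustration only.
def toy_predict(s, S):
    mean_inv = (s + S) / 2.0                 # holding grows with thresholds
    cycle = 1.0 + 0.1 * (S - s)              # wider gap -> longer cycles
    lost = max(0.0, 0.5 - 0.05 * s)          # higher s -> fewer lost sales
    return mean_inv, cycle, lost

cost, s_star, S_star = grid_search(toy_predict, range(0, 11), range(1, 21))
```

Because each surrogate call takes milliseconds rather than minutes, the same loop also works as the inner evaluator for Bayesian optimization, as reported in the paper.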

Key contributions are: (1) a complete label‑train‑predict pipeline for non‑Markovian inventory systems, (2) empirical evidence that low‑order moments capture enough information to reconstruct steady‑state behavior, and (3) an open‑source Python package (https://github.com/eliransher/inventory_AI.git) that implements data generation, model training, and inference. The paper also discusses limitations—single‑item, unit‑size lost sales, independence between demand and lead time, and the need for extensions to multi‑item or multi‑echelon settings—and outlines future research directions, including incorporation of correlated demand‑lead‑time processes, reinforcement‑learning for dynamic (s,S) policies, and multi‑objective formulations that consider service‑level metrics such as backorder duration.

In summary, the study demonstrates that supervised neural networks, trained on a modest set of distribution moments, can serve as highly accurate and ultra‑fast surrogates for the steady‑state analysis of complex (s,S) inventory models, effectively eliminating the computational bottleneck of simulation and opening the door to real‑time inventory policy optimization in environments with general stochastic demand and lead‑time characteristics.

