CORL: Reinforcement Learning of MILP Policies Solved via Branch and Bound

Reading time: 1 minute

📝 Original Info

  • Title: CORL: Reinforcement Learning of MILP Policies Solved via Branch and Bound
  • ArXiv ID: 2512.11169
  • Date: 2025-12-11
  • Authors: Akhil S Anand, Elias Aarekol, Martin Mziray Dalseg, Magnus Stålhane, Sébastien Gros

📝 Abstract

Combinatorial sequential decision-making problems are typically modeled as mixed-integer linear programs (MILPs) and solved via branch-and-bound (B&B) algorithms. The inherent difficulty of building MILP models that accurately represent stochastic real-world problems leads to suboptimal performance in practice. Recently, machine-learning methods have been applied to build MILP models optimized for decision quality rather than for how accurately they represent the real-world problem. However, these approaches typically rely on supervised learning, assume access to true optimal decisions, and use surrogates for the MILP gradients. In this work, we introduce a proof-of-concept CORL framework that fine-tunes an MILP scheme end-to-end using reinforcement learning (RL) on real-world data to maximize its operational performance. We enable this by casting an MILP solved by B&B as a differentiable stochastic policy compatible with RL. We validate the CORL method on a simple, illustrative combinatorial sequential decision-making example.
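The abstract's key move is to treat an MILP solved by B&B as a stochastic policy that RL can optimize directly. Since the full text is omitted here, the sketch below shows one common way to realize such a policy rather than the authors' exact construction: perturb the learnable MILP cost parameters with Gaussian noise and estimate the gradient of expected reward with the score-function (REINFORCE) trick, so no gradient through the B&B solver itself is needed. The solver call (`scipy.optimize.milp`, which runs HiGHS branch-and-bound), the toy constraint, the reward function, and all parameter values are illustrative assumptions.

```python
# A minimal sketch of a perturbation-based stochastic MILP policy
# trained with a score-function gradient. Illustrative only; not the
# paper's exact formulation.
import numpy as np
from scipy.optimize import milp, LinearConstraint, Bounds

rng = np.random.default_rng(0)
n = 4  # number of binary decision variables (hypothetical toy size)

def solve_milp(cost):
    """Minimize cost @ x over binary x with sum(x) <= 2, solved by B&B (HiGHS)."""
    res = milp(
        c=cost,
        constraints=LinearConstraint(np.ones((1, n)), -np.inf, 2),
        integrality=np.ones(n),                # every variable is integer
        bounds=Bounds(np.zeros(n), np.ones(n)),
    )
    return res.x.round()

def reward(x):
    """Hypothetical stochastic real-world return of executing decision x."""
    true_value = np.array([1.0, 3.0, 2.0, 0.5])
    return float(true_value @ x) + 0.1 * rng.standard_normal()

theta = np.zeros(n)   # learnable MILP cost parameters
sigma = 0.3           # std. dev. of the Gaussian policy perturbation
lr = 0.02
baseline = 0.0        # running-average baseline for variance reduction

for step in range(500):
    eps = sigma * rng.standard_normal(n)
    x = solve_milp(-(theta + eps))  # sample a decision: maximize perturbed score
    r = reward(x)
    # Score-function (REINFORCE) gradient of the smoothed objective
    # E_eps[ R(x*(theta + eps)) ]:  grad = E[ (R - b) * eps / sigma^2 ]
    theta += lr * (r - baseline) * eps / sigma**2
    baseline += 0.1 * (r - baseline)

print("learned cost parameters:", theta.round(2))
print("greedy decision:", solve_milp(-theta))
```

The perturbation makes the otherwise piecewise-constant solver output a smooth function of theta in expectation, which is what makes the MILP policy compatible with standard policy-gradient RL; the paper's own gradient construction may differ.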

📄 Full Content

...(The full text is omitted here due to its length. Please see the complete article on the site.)
