An exploration for higher efficiency in multi objective optimisation with reinforcement learning

Reading time: 5 minutes

📝 Original Info

  • Title: An exploration for higher efficiency in multi objective optimisation with reinforcement learning
  • ArXiv ID: 2512.10208
  • Date: 2025-12-11
  • Authors: Mehmet Emin Aydin

📝 Abstract

Efficiency in optimisation and search processes remains one of the key challenges affecting the performance and adoption of optimisation algorithms. Utilising a pool of operators, rather than a single operator, to handle move operations within a neighbourhood remains promising, but finding an optimum or near-optimum sequence of operators requires further investigation. One promising idea is to generalise gained experiences and to investigate how to utilise them. Although numerous works address this issue for single-objective optimisation, multi-objective cases have received little attention in this regard. A generalised approach based on multi-objective reinforcement learning appears to offer a remedy and good solutions. This paper overviews a proposed generalisation approach, with certain stages completed and other phases outstanding, intended to help demonstrate the efficiency of using multi-objective reinforcement learning.

📄 Full Content

An Exploration for higher efficiency in multi objective optimisation with reinforcement learning

Mehmet Emin Aydin 1,2

1 University of the West of England, School of Computing and Creative Technologies, Bristol, UK mehmet.aydin@uwe.ac.uk

2 Istanbul Ticaret University, Dept. of Industrial Engineering, Istanbul, Türkiye meaydin@ticaret.edu.tr

KEYWORDS – Adaptive operator selection, multi-objective reinforcement learning, generalisation of experiences, set union knapsack problem.


1 INTRODUCTION

This paper introduces a generalisation approach based on reinforcement learning, in order to suggest a highly efficient swarm-intelligence-based problem solver for combinatorial optimisation problems. It serves partly as a position paper and partly as a demonstration of completed work stages. Developing a general problem solver has been one of the original targets of AI studies, attracting researchers since the golden age of artificial intelligence. However, practice and studies in the field suggest that devising such a general problem solver is either extremely difficult or impossible [1].
Combinatorial optimisation problems are known to be very difficult, classified as NP-hard and/or NP-complete, which escalates the challenge of formulating efficient problem-solving algorithms. Recently, heuristic-based approaches such as swarm intelligence and evolutionary algorithms have been studied extensively to ease these difficulties. One approach that has emerged and proven successful is the use of multiple operators within population-based algorithms to keep the search diversified without losing intensification [2]. Reinforcement learning has attracted researchers' attention as a means of building efficient rules for adaptively selecting the most suitable operator subject to the search circumstances [3], [4]. Although the overall idea has been studied towards a mature level, no approach has yet considered and examined the problem from the point of view of generalising experiences.

This paper investigates how an efficient approach can be devised to generalise the experience gained in solving particular cases and to utilise it on different variants, similar use cases, and dissimilar combinatorial problem types, i.e. gaining experience while solving one typical combinatorial problem (e.g. the travelling salesman problem) and applying it to a very different problem such as job scheduling. For this purpose, the approach introduced here has two phases: (i) binarification of the problems, and (ii) transfer learning. By means of a binary representation, all problems can be translated into a common space. Once binarification is completed and the suitability and usefulness of the binary operators in producing neighbouring solutions are sufficiently characterised, a well-devised and customised machine learning algorithm such as reinforcement learning can learn the characteristics of the operators and thereby lead to generalisation of experiences.
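To make the two-phase idea concrete, the sketch below shows how adaptive operator selection over a common binary space could look in practice: a small pool of binary move operators is selected epsilon-greedily, with a Q-learning-style running value estimate per operator, and a two-objective reward is scalarised by a weighted sum. This is only a minimal illustrative sketch, not the paper's implementation; the operator pool, the toy objectives, and the weighted-sum scalarisation are all assumptions made for illustration.

```python
# Minimal sketch (not the paper's method): adaptive selection over a pool
# of binary move operators with epsilon-greedy, Q-learning-style credit
# assignment. Operators, objectives, and scalarisation are illustrative.
import random

def one_flip(bits):
    """Flip one random bit."""
    i = random.randrange(len(bits))
    return bits[:i] + [1 - bits[i]] + bits[i + 1:]

def segment_invert(bits):
    """Invert a random contiguous segment of bits."""
    i, j = sorted(random.sample(range(len(bits)), 2))
    return bits[:i] + [1 - b for b in bits[i:j]] + bits[j:]

def swap_pair(bits):
    """Swap the values at two random positions."""
    i, j = random.sample(range(len(bits)), 2)
    out = bits[:]
    out[i], out[j] = out[j], out[i]
    return out

OPERATORS = [one_flip, segment_invert, swap_pair]

def objectives(bits):
    """Two toy objectives on the common binary space (both maximised):
    f1 counts set bits, f2 rewards alternation between neighbours."""
    f1 = sum(bits)
    f2 = sum(1 for a, b in zip(bits, bits[1:]) if a != b)
    return f1, f2

def scalarised(bits, weights=(0.5, 0.5)):
    """Weighted-sum scalarisation of the two objectives."""
    return sum(w * f for w, f in zip(weights, objectives(bits)))

def adaptive_search(n_bits=64, steps=5000, alpha=0.1, eps=0.2):
    """Epsilon-greedy operator selection with a running value per operator."""
    q = [0.0] * len(OPERATORS)  # one value estimate per operator
    current = [random.randint(0, 1) for _ in range(n_bits)]
    best = current[:]
    for _ in range(steps):
        if random.random() < eps:                       # explore
            k = random.randrange(len(OPERATORS))
        else:                                           # exploit
            k = max(range(len(OPERATORS)), key=lambda i: q[i])
        candidate = OPERATORS[k](current)
        reward = scalarised(candidate) - scalarised(current)
        q[k] += alpha * (reward - q[k])                 # credit assignment
        if reward >= 0:                                 # accept non-worsening moves
            current = candidate
            if scalarised(current) > scalarised(best):
                best = current[:]
    return best, q

if __name__ == "__main__":
    best, q = adaptive_search()
    print("best scalarised value:", round(scalarised(best), 2))
    print("operator value estimates:", [round(v, 3) for v in q])
```

Under this sketch, the learned per-operator value estimates are the "experience": in a transfer-learning phase they could be carried over as the starting policy for a different binarified problem, rather than being learned from scratch.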
This paper is organised as follows. Section 2 overviews the background of the study and the foundational knowledge in this respect, covering the relevant work. Section 3 elaborates on how operator selection can be conducted adaptively using reinforcement learning and how that leads to generalisation of experience. Section 4 presents some proof-of-concept experimental studies, while Section 5 concludes with the findings and the future direction of this study.

This content is AI-processed based on open access ArXiv data.
