Large Language Model as Meta-Surrogate for Data-Driven Many-Task Optimization: A Proof-of-Principle Study
In many-task optimization scenarios, surrogate models are valuable for mitigating the computational burden of repeated fitness evaluations across tasks. This study proposes a novel meta-surrogate framework to assist many-task optimization by leveraging the knowledge-transfer strengths and emergent capabilities of large language models (LLMs). We formulate a unified framework for many-task fitness prediction by defining a universal model with metadata to fit a group of problems. Fitness prediction is performed on metadata and decision variables, enabling efficient knowledge sharing across tasks and adaptability to new tasks. The LLM-based meta-surrogate treats fitness prediction as conditional probability estimation, employing a unified token-sequence representation for task metadata, inputs, and outputs. This approach facilitates efficient inter-task knowledge sharing through shared token embeddings and captures complex task dependencies via multi-task model training. Experimental results demonstrate the model's emergent generalization ability, including zero-shot performance on problems with unseen dimensions. When integrated into evolutionary transfer optimization (ETO), our framework supports dual-level knowledge transfer, at both the surrogate and individual levels, enhancing optimization efficiency and robustness. This work establishes a novel foundation for applying LLMs in surrogate modeling, offering a versatile solution for many-task optimization.
💡 Research Summary
The paper introduces a novel meta‑surrogate framework that leverages large language models (LLMs) to address many‑task offline data‑driven optimization (MaTOP) problems, where each task may have a different dimensionality, objective description, and data modality. Traditional multi‑task surrogate approaches such as multi‑task Gaussian processes (MTGPs) or heterogeneous MTGPs suffer from three fundamental drawbacks: (i) poor scalability because the joint covariance matrix grows as O((TN)³) with T tasks and N samples per task; (ii) a strict requirement that all tasks share an identical input space, which is rarely true in real‑world optimization; and (iii) limited expressiveness of fixed kernels, which struggle to capture highly non‑linear cross‑task relationships.
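The scalability drawback can be made concrete with a back-of-the-envelope sketch. The cubic cost of factorizing the joint covariance matrix is standard for exact GPs; the helper name below is ours, not the paper's:

```python
def joint_covariance_cost(T: int, N: int) -> int:
    """Floating-point cost (up to a constant factor) of factorizing the
    joint covariance matrix of an exact multi-task GP. The matrix has
    T*N rows (T tasks, N samples per task), so a Cholesky factorization
    scales as (T*N)**3 -- the O((TN)^3) bottleneck cited above."""
    return (T * N) ** 3

# Doubling the number of tasks multiplies the cost eightfold:
# joint_covariance_cost(20, 100) / joint_covariance_cost(10, 100) == 8.0
```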
To overcome these issues, the authors propose to treat fitness prediction as conditional probability estimation p(y|x,m), where x denotes decision variables, m denotes task metadata (objective description, dimensionality, constraints, etc.), and y is the objective value. All three components are serialized into a unified token sequence using a text‑centric representation. The LLM is then fine‑tuned on a combined dataset comprising samples from all tasks, learning to map the tokenized (m, x) pair to the tokenized y. This design eliminates the need for handcrafted feature engineering or explicit dimensionality alignment: heterogeneous inputs are naturally handled by the tokenizer, and shared token embeddings enable automatic knowledge transfer across tasks.
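The serialization step can be sketched as follows. The exact prompt template, delimiter tokens, and metadata fields are our assumptions for illustration; the paper's actual format may differ:

```python
def serialize_sample(metadata: dict, x: list, y=None) -> str:
    """Serialize task metadata m, decision variables x, and (optionally)
    the objective value y into one token-friendly string, mirroring the
    text-centric p(y | x, m) formulation. The [META]/[X]/[Y] markers are
    hypothetical delimiters, not the paper's."""
    meta = (f"task: {metadata['name']}; dim: {metadata['dim']}; "
            f"objective: {metadata['objective']}")
    xs = " ".join(f"{v:.4f}" for v in x)
    prompt = f"[META] {meta} [X] {xs} [Y]"
    # During fine-tuning the target y is appended; at inference the LLM
    # completes the sequence after the [Y] marker.
    return prompt if y is None else f"{prompt} {y:.4f}"
```

Because heterogeneous inputs all reduce to strings, a 5-dimensional and a 50-dimensional task pass through the same tokenizer with no dimensionality alignment.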
The meta‑surrogate is evaluated on a suite of 20 benchmark optimization problems ranging from 5 to 50 dimensions, covering a variety of function families (e.g., sphere, Rosenbrock, Rastrigin). Crucially, the model demonstrates zero‑shot generalization: when presented with tasks of unseen dimensionalities (e.g., 30‑dimensional problems not encountered during training), it still predicts fitness values with 15–25% lower error than state‑of‑the‑art MTGP and heterogeneous GP baselines. Inference speed remains essentially constant regardless of the number of tasks, thanks to the LLM's parallel token processing, making it suitable for real‑time surrogate‑assisted optimization.
Beyond standalone prediction, the meta‑surrogate is integrated into an Evolutionary Transfer Optimization (ETO) pipeline, enabling dual‑level knowledge transfer. At the surrogate level, the meta‑surrogate's predictions and associated uncertainty estimates are used to prune the search space and reduce the number of expensive real fitness evaluations. At the individual level, the traditional ETO operators (crossover, mutation) are biased by the surrogate's confidence scores, guiding the evolutionary search toward promising regions. Experiments show that this combined approach reduces the average number of fitness evaluations by over 30% and improves final solution quality by more than 10% compared with standard ETO that relies on task‑specific surrogates or no surrogate at all.
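The surrogate-level transfer can be sketched as a pre-screening step. This is a minimal sketch under our own naming (`surrogate_assisted_step` and its parameters are hypothetical), showing only the pruning idea; the paper additionally biases crossover and mutation by confidence scores, which is omitted here:

```python
def surrogate_assisted_step(population, surrogate, true_fitness, keep_ratio=0.3):
    """One generation of surrogate-level pre-screening (minimization
    assumed): the meta-surrogate cheaply ranks all offspring, and only
    the most promising fraction receives an expensive real evaluation."""
    # Rank candidates by predicted fitness (one cheap surrogate call each).
    ranked = sorted(population, key=surrogate)
    n_keep = max(1, int(keep_ratio * len(ranked)))
    survivors = ranked[:n_keep]
    # Spend the expensive evaluation budget only on the pre-screened set.
    return [(ind, true_fitness(ind)) for ind in survivors]
```

With `keep_ratio=0.3`, roughly 70% of the real evaluations per generation are skipped, which is consistent in spirit with the reported 30%+ reduction in total evaluations.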
Key contributions of the work are:
- Formulation of a unified fitness prediction paradigm (X, M, F, D) that treats heterogeneous many‑task optimization as a single conditional language modeling problem.
- Design of a token‑based representation that encodes task metadata, decision variables, and objective values, enabling LLMs to serve as a meta‑surrogate capable of learning complex, non‑linear cross‑task relationships without explicit kernel engineering.
- Empirical evidence of emergent generalization, particularly zero‑shot performance on unseen dimensions, highlighting the adaptability of LLMs to new optimization contexts.
- Seamless integration of the meta‑surrogate into existing ETO algorithms, achieving knowledge transfer at both the surrogate and individual levels and thereby enhancing efficiency and robustness of many‑task optimization.
The authors acknowledge limitations such as the substantial GPU memory and data requirements for fine‑tuning large LLMs, and the sensitivity of performance to the design of task metadata. Future directions include employing parameter‑efficient fine‑tuning techniques (e.g., LoRA, QLoRA), automating metadata generation, and testing the framework on large‑scale industrial problems with complex constraints. Overall, the study establishes a promising new avenue for applying LLMs beyond natural language processing, positioning them as powerful, scalable meta‑surrogates for data‑driven many‑task optimization.