FoMEMO: Towards Foundation Models for Expensive Multi-objective Optimization

Notice: This research summary and analysis were automatically generated using AI. For full accuracy, please refer to the original arXiv source.

Expensive multi-objective optimization is a prevalent and crucial concern in many real-world scenarios, where sample efficiency is vital because only a limited number of evaluations are available to recover the true Pareto front for decision making. Existing works either rebuild Gaussian process surrogates from scratch for each objective in each new problem encountered, or rely on extensive past domain experiments for pre-training deep learning models, making them hard to generalize and impractical for the variety of emerging applications in the real world. To address this issue, we propose a new paradigm named FoMEMO (Foundation Models for Expensive Multi-objective Optimization), which establishes a foundation model conditioned on any domain trajectory and user preference, and facilitates fast in-context optimization based on the predicted preference-wise aggregated posteriors. Rather than requiring extensive real-world domain experiments for training, we demonstrate that pre-training the foundation model on a diverse set of hundreds of millions of synthetic data points can yield superior generalization and optimization performance on unknown problems, without any subsequent model training or updates during the optimization process.


💡 Research Summary

The paper introduces FoMEMO, a novel framework that brings the concept of foundation models to the domain of expensive multi‑objective optimization (EMO). Traditional multi‑objective Bayesian optimization (MOBO) methods rely on Gaussian processes (GPs) that must be rebuilt from scratch for each new problem and each objective, leading to high computational overhead and poor scalability when evaluations are costly. Recent learning‑based approaches attempt to pre‑train deep models, but they assume the availability of large real‑world experimental datasets, which are rarely accessible in emerging applications.

FoMEMO addresses both issues by pre‑training a single transformer‑based foundation model on a massive synthetic corpus—hundreds of millions of samples generated from GP priors with diverse dimensionalities, numbers of objectives, and trajectory lengths. Each training instance consists of (i) a past evaluation trajectory Dₙ = {(xᵢ, yᵢ)}, (ii) a query point x, and (iii) a user preference vector λ sampled from the simplex. The target is the negative scalarized value g = –s_λ(x), where s_λ is a Tchebycheff scalarization. The model learns to predict the aggregated posterior distribution qθ(g | x, Dₙ; λ) directly, without any intermediate GP inference.
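The synthetic-data recipe above can be sketched concretely. The snippet below is an illustrative reconstruction, not the authors' code: it draws one training instance by sampling each objective from a GP prior (via a Cholesky factor of an RBF kernel), sampling a preference vector from the simplex, and computing the Tchebycheff-scalarized target g = −s_λ(x) with s_λ(y) = max_i λ_i |y_i − z*_i|, where z* is the ideal point estimated from the trajectory. The function name and default sizes are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def rbf_kernel(X, lengthscale=0.3):
    # Squared-exponential kernel over input points X of shape (n, d).
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / lengthscale**2)

def sample_instance(dim=3, n_obj=2, n_ctx=16):
    """Draw one synthetic training instance: a trajectory D_n, a query x,
    a preference vector lambda, and the Tchebycheff-scalarized target g."""
    # Context points plus one query point, all in the unit cube.
    X = rng.uniform(0.0, 1.0, size=(n_ctx + 1, dim))
    K = rbf_kernel(X) + 1e-6 * np.eye(n_ctx + 1)  # jitter for stability
    L = np.linalg.cholesky(K)
    # One independent GP prior draw per objective.
    Y = L @ rng.standard_normal((n_ctx + 1, n_obj))
    lam = rng.dirichlet(np.ones(n_obj))           # preference on the simplex
    z_star = Y[:n_ctx].min(axis=0)                # ideal point from the trajectory
    # Tchebycheff scalarization: s_lambda(y) = max_i lam_i * |y_i - z*_i|.
    s = np.max(lam * np.abs(Y[n_ctx] - z_star))
    g = -s                                        # regression target for q_theta
    return (X[:n_ctx], Y[:n_ctx]), X[n_ctx], lam, g
```

In training, millions of such instances with varying `dim`, `n_obj`, and `n_ctx` would be streamed to the transformer, which learns to map (trajectory, query, preference) directly to a distribution over g.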

During inference, a user supplies the observed trajectory from an unseen real‑world EMO problem together with any known preferences (or none). The foundation model, in a single forward pass, outputs preference‑conditioned posterior distributions for all λ. These posteriors are then used to construct acquisition functions. Two families are proposed: (a) preference‑based acquisition, which evaluates expected improvement or hyper‑volume improvement for a specific λ, and (b) preference‑free acquisition, which averages over sampled λ to guide exploration when preferences are unknown. Crucially, no further model training or parameter updates are required; optimization proceeds entirely in‑context.
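The preference-free case can be sketched as follows. This is a minimal illustration under assumed interfaces: `posterior_fn(x, lam)` stands in for the foundation model's forward pass returning the mean and standard deviation of the aggregated posterior, and `best_fn(lam)` returns the best observed scalarized value for that preference. The acquisition averages expected improvement (EI) over preference vectors sampled from the simplex; the paper's exact acquisition definitions may differ.

```python
import math
import numpy as np

rng = np.random.default_rng(1)

def _phi(z):
    # Standard normal CDF via the error function.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def _pdf(z):
    # Standard normal density.
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def expected_improvement(mu, sigma, best):
    # EI for maximizing g: E[max(g - best, 0)] under N(mu, sigma^2).
    sigma = max(sigma, 1e-12)
    z = (mu - best) / sigma
    return (mu - best) * _phi(z) + sigma * _pdf(z)

def preference_free_acquisition(posterior_fn, x_cand, best_fn,
                                n_obj=2, n_pref=32):
    """Average EI over preference vectors drawn from the simplex,
    guiding exploration when no user preference is given."""
    lams = rng.dirichlet(np.ones(n_obj), size=n_pref)
    vals = []
    for lam in lams:
        mu, sigma = posterior_fn(x_cand, lam)
        vals.append(expected_improvement(mu, sigma, best_fn(lam)))
    return float(np.mean(vals))
```

With a known preference, one would instead call `expected_improvement` once for that fixed λ; either way, only forward passes through the pre-trained model are needed, with no gradient updates.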

The authors benchmark FoMEMO on a suite of synthetic test functions and on real‑world engineering tasks such as neural architecture search, circuit design, and scientific simulation. Across all settings, FoMEMO consistently outperforms state‑of‑the‑art GP‑based MOBO methods (e.g., ParEGO, qEHVI, PESMO) and recent meta‑learning approaches (e.g., BOFormer). The advantage is most pronounced under tight evaluation budgets (≤ 50 function calls), where FoMEMO rapidly approximates the Pareto front with higher hyper‑volume and better coverage.

Key contributions are: (1) demonstrating that a foundation model trained solely on synthetic data can generalize to real EMO problems, (2) introducing the notion of preference‑conditioned aggregated posteriors as a universal surrogate, and (3) enabling fast, training‑free in‑context optimization via transformer inference. Limitations include reliance on GP‑generated synthetic functions, which may not capture all real‑world complexities, and the substantial computational cost of pre‑training large transformers. Future work could explore richer synthetic generators, handling of constraints, discrete variables, and scaling to even larger model sizes.

In summary, FoMEMO establishes a new paradigm where a single, pre‑trained foundation model serves as a universal surrogate for expensive multi‑objective optimization, delivering few‑shot generalization, high sample efficiency, and practical applicability across diverse domains.

