Synthesizing Systems with Optimal Average-Case Behavior for Ratio Objectives


We show how to automatically construct a system that satisfies a given logical specification and has an optimal average behavior with respect to a specification with ratio costs. When synthesizing a system from a logical specification, it is often the case that several different systems satisfy the specification. In this case, it is usually not easy for the user to state formally which system she prefers. Prior work proposed to rank the correct systems by adding a quantitative aspect to the specification. A desired preference relation can be expressed with (i) a quantitative language, which is a function assigning a value to every possible behavior of a system, and (ii) an environment model defining the desired optimization criteria of the system, e.g., worst-case or average-case optimal. In this paper, we show how to synthesize a system that is optimal for (i) a quantitative language given by an automaton with a ratio cost function, and (ii) an environment model given by a labeled Markov decision process. The objective of the system is to minimize the expected (ratio) costs. The solution is based on a reduction to Markov Decision Processes with ratio cost functions which do not require that the costs in the denominator are strictly positive. We find an optimal strategy for these using a fractional linear program.

💡 Research Summary

The paper addresses the problem of synthesizing a reactive system that not only satisfies a given Boolean (qualitative) specification but also optimizes a quantitative objective expressed as a long‑run ratio of two cost functions. In many synthesis scenarios, multiple implementations satisfy the logical specification, yet designers lack a formal way to express preferences among them. The authors adopt the two‑step framework introduced by Bloem et al.: (i) a quantitative language that maps each infinite execution to a real value, and (ii) an environment model that determines how the value is to be optimized (worst‑case, average‑case, etc.).

The quantitative language considered here is a ratio objective. An automaton A equipped with two non‑negative cost functions c₁ (good events) and c₂ (bad events) assigns to an infinite word w the value

R(w) = lim_{m→∞} lim inf_{ℓ→∞}  ( Σ_{i=m}^{ℓ} c₁(δ*(q₀, w₀…w_i), w_{i+1}) ) / ( 1 + Σ_{i=m}^{ℓ} c₂(δ*(q₀, w₀…w_i), w_{i+1}) ).

The added “+1” in the denominator prevents division by zero and does not affect the limit when the denominator diverges. This objective captures the long‑run average ratio between accumulated good and bad costs, extending the classic mean‑payoff (average) objective.
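As a rough illustration of how this value behaves, the following sketch evaluates the quotient on a finite prefix of a word, for a fixed start index m and with ℓ taken as the end of the prefix. The function `prefix_ratio`, the one-state automaton, and its cost tables are hypothetical and chosen only for illustration. Costs are charged on the pair (current state, letter read), which matches the formula up to an index shift.

```python
from fractions import Fraction

def prefix_ratio(delta, c1, c2, q0, word, m=0):
    """Accumulated c1 divided by (1 + accumulated c2) over word[m:].

    delta: (state, letter) -> next state
    c1, c2: (state, letter) -> non-negative cost
    """
    q = q0
    num = den = 0
    for i, letter in enumerate(word):
        if i >= m:  # only positions from m onward contribute
            num += c1[(q, letter)]
            den += c2[(q, letter)]
        q = delta[(q, letter)]
    return Fraction(num, 1 + den)

# Toy one-state automaton (an assumption, not taken from the paper).
delta = {("q", "x"): "q", ("q", "y"): "q"}
c1 = {("q", "x"): 1, ("q", "y"): 0}
c2 = {("q", "x"): 0, ("q", "y"): 1}

# On prefixes of the periodic word (x y)^k the quotient is k / (1 + k),
# so the "+1" matters less and less as the denominator grows.
for k in (1, 10, 1000):
    print(k, prefix_ratio(delta, c1, c2, "q", "xy" * k))
```

A finite prefix can only approximate the value, since the definition takes limits over both m and ℓ.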

For the environment, the paper uses a labeled Markov decision process (L‑MDP). An L‑MDP consists of a finite set of states, a set of actions, a probabilistic transition function, and a labeling function that maps each state to an input symbol. All actions are enabled in every state, which allows the system’s strategy to be defined purely as a mapping from states to actions (memoryless, pure strategies). The environment therefore behaves probabilistically rather than adversarially, enabling an average‑case analysis.

The synthesis problem is cast as a two‑player game: the system must guarantee the qualitative specification (modeled by a safety automaton) against all possible inputs, while simultaneously minimizing the expected ratio objective under the probabilistic environment. To solve this, the authors construct the product of the specification automaton (augmented with the two cost functions) and the L‑MDP. The product’s states are pairs (automaton state, MDP state); transitions inherit the probabilities from the MDP and the costs from the automaton. Consequently, the original synthesis problem reduces to finding an optimal strategy in a ratio‑cost MDP.
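The product construction described above can be sketched as follows. This is a minimal sketch under stated assumptions, not the paper's exact construction: it assumes that the automaton reads letters of the form (input label of the current MDP state, chosen action), and the names `build_product`, `aut_delta`, `mdp_trans`, and `label`, as well as the toy data, are hypothetical.

```python
from fractions import Fraction

def build_product(aut_delta, c1, c2, mdp_trans, label):
    """Product of a cost automaton and a labeled MDP.

    aut_delta: (q, (input, action)) -> q'
    c1, c2:    (q, (input, action)) -> cost
    mdp_trans: (s, action) -> {s': probability}
    label:     s -> input
    Returns transition and cost tables keyed by ((q, s), action).
    """
    trans, cost1, cost2 = {}, {}, {}
    aut_states = {q for (q, _) in aut_delta}
    for q in aut_states:
        for (s, a), dist in mdp_trans.items():
            letter = (label[s], a)
            if (q, letter) not in aut_delta:
                continue  # move not allowed by the (safety) automaton
            q_next = aut_delta[(q, letter)]
            # Probabilities come from the MDP, costs from the automaton.
            trans[((q, s), a)] = {(q_next, s2): p for s2, p in dist.items()}
            cost1[((q, s), a)] = c1[(q, letter)]
            cost2[((q, s), a)] = c2[(q, letter)]
    return trans, cost1, cost2

# Toy data (assumptions for illustration only).
half = Fraction(1, 2)
mdp_trans = {
    ("s0", "a"): {"s0": half, "s1": half},
    ("s0", "b"): {"s1": Fraction(1)},
    ("s1", "a"): {"s0": Fraction(1)},
    ("s1", "b"): {"s0": half, "s1": half},
}
label = {"s0": "x", "s1": "y"}
letters = [(i, a) for i in ("x", "y") for a in ("a", "b")]
states = ("q0", "q1")
aut_delta = {(q, l): ("q0" if l[1] == "a" else "q1") for q in states for l in letters}
c1 = {(q, l): int(q == "q1") for q in states for l in letters}
c2 = {(q, l): int(l[1] == "a") for q in states for l in letters}

trans, cost1, cost2 = build_product(aut_delta, c1, c2, mdp_trans, label)
print(len(trans), "state-action pairs in the product")
```

Every entry of `trans` is again a probability distribution, so the result can be treated as an ordinary MDP with two cost functions.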

A crucial technical contribution is the treatment of ratio‑cost MDPs without requiring strictly positive denominator costs. The authors assume the MDP is unichain (every strategy induces a Markov chain with a single recurrent class), which guarantees the existence of optimal memoryless pure strategies. They then formulate the optimization as a fractional linear program (FLP). By introducing variables for the expected accumulated costs of c₁ and c₂ and applying the Charnes‑Cooper transformation, the FLP is converted into a standard linear program (LP). Solving this LP yields the optimal expected ratio and a corresponding stationary strategy.
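To make the transformation concrete, here is one common textbook way to write a ratio objective over a unichain MDP as a fractional linear program and then linearize it. This is a sketch, not necessarily the paper's exact formulation: the variables x(s,a) (long-run state-action frequencies), the transition probabilities p(s | s',a'), and the scaling variable t are notation assumed here. The linearization is valid only when the denominator is strictly positive on the feasible set; the paper's handling of zero denominator costs is not reproduced.

```latex
% Fractional linear program over state-action frequencies x(s,a)
\begin{align*}
\text{minimize}\quad
  & \frac{\sum_{s,a} x(s,a)\, c_1(s,a)}{\sum_{s,a} x(s,a)\, c_2(s,a)} \\
\text{subject to}\quad
  & \sum_{a} x(s,a) = \sum_{s',a'} p(s \mid s',a')\, x(s',a')
    \quad \text{for all } s, \\
  & \sum_{s,a} x(s,a) = 1, \qquad x(s,a) \ge 0 .
\end{align*}

% Charnes-Cooper substitution:
%   t = 1 / \sum_{s,a} x(s,a) c_2(s,a),   y(s,a) = t \cdot x(s,a)
\begin{align*}
\text{minimize}\quad
  & \sum_{s,a} y(s,a)\, c_1(s,a) \\
\text{subject to}\quad
  & \sum_{a} y(s,a) = \sum_{s',a'} p(s \mid s',a')\, y(s',a')
    \quad \text{for all } s, \\
  & \sum_{s,a} y(s,a)\, c_2(s,a) = 1, \\
  & \sum_{s,a} y(s,a) = t, \qquad y(s,a) \ge 0, \quad t \ge 0 .
\end{align*}
```

Any feasible solution has t > 0, because the normalization constraint forces some y(s,a) to be positive. The frequencies are then recovered as x(s,a) = y(s,a) / t, and a stationary strategy can be read off by choosing, in each state s, an action a with x(s,a) > 0.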

The paper contrasts its approach with earlier work: Derman (1970) studied fractional objectives but required strictly positive denominator costs and expressed the objective directly as a ratio of expectations, not as a trace‑based limit inferior. De Alfaro (2005) considered similar MDP models but did not address synthesis. The present work thus generalizes prior models, integrates a probabilistic environment, and provides an algorithmic solution for the synthesis problem.

An experimental case study demonstrates the method on a server‑client scenario. Requests are treated as “bad” events (cost c₂ = 1) and acknowledgments as “good” events (cost c₁ = 1). The L‑MDP models stochastic request arrivals and processing delays. The synthesized strategy minimizes the expected request/acknowledgment ratio, achieving a substantially lower average ratio than a worst‑case optimal strategy. This validates that the ratio objective captures a meaningful performance metric and that the proposed algorithm can compute optimal controllers automatically.

In summary, the paper introduces a novel framework for average‑case optimal synthesis with ratio objectives. It formalizes ratio‑based quantitative specifications, models the environment as a labeled MDP, reduces the problem to a ratio‑cost MDP, and solves it via linear programming. The result is a practical method for automatically generating systems that satisfy logical requirements while optimizing nuanced performance criteria expressed as long‑run ratios.
