FAIRFORMER: A transformer architecture for discrete fair division
We propose a deep neural network-based solution to the problem of allocating indivisible goods under additive subjective valuations without monetary transfers, trading off economic efficiency with envy-based fairness. We introduce FairFormer, an amortized, permutation-equivariant two-tower transformer that encodes items and agents as unordered token sets, applies self-attention within each set, and uses item-to-agent cross-attention to produce per-item assignment distributions in a single forward pass. FairFormer is trained end-to-end to maximize expected log-Nash welfare on sampled instances, requiring no solver supervision, unrolled allocation procedures, or fairness labels. At test time, we discretize by row-wise $\arg\max$ and apply a lightweight post-processing routine that transfers items to eliminate violations of envy-freeness up to one item while prioritizing improvements in Nash welfare. Our approach generalizes beyond its training regime and achieves near-optimal welfare (e.g., for uniformly sampled valuations, $96$–$97\%$ for Nash welfare; $95$–$96\%$ for utilitarian welfare), outperforming strong baselines in solution quality and/or runtime.
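As a concrete illustration of the discretize-then-repair step described above, here is a minimal sketch assuming additive valuations. `ef1_repair` and `nash_welfare` are hypothetical, illustrative stand-ins for the paper's post-processing routine, not its actual algorithm:

```python
import numpy as np

def is_ef1(bundles, V):
    """EF1 check: no agent envies another's bundle once the envied
    bundle's best item (from the envier's perspective) is removed.
    bundles[i] = list of item indices held by agent i; V is (m, n),
    rows = items, columns = agents."""
    n = V.shape[1]
    for i in range(n):
        u_own = sum(V[j, i] for j in bundles[i])
        for k in range(n):
            if k == i or not bundles[k]:
                continue
            u_other = sum(V[j, i] for j in bundles[k])
            best_drop = max(V[j, i] for j in bundles[k])
            if u_own < u_other - best_drop:
                return False
    return True

def nash_welfare(bundles, V):
    """Product of the agents' additive utilities."""
    return float(np.prod([sum(V[j, i] for j in b)
                          for i, b in enumerate(bundles)]))

def ef1_repair(bundles, V, max_iters=100):
    """Greedy repair sketch: while EF1 is violated, apply the
    single-item transfer that maximizes Nash welfare."""
    bundles = [list(b) for b in bundles]
    n = V.shape[1]
    for _ in range(max_iters):
        if is_ef1(bundles, V):
            break
        best, best_nw = bundles, -1.0
        for k in range(n):          # donor agent
            for i in range(n):      # receiving agent
                if i == k:
                    continue
                for j in bundles[k]:
                    trial = [list(b) for b in bundles]
                    trial[k].remove(j)
                    trial[i].append(j)
                    nw = nash_welfare(trial, V)
                    if nw > best_nw:
                        best, best_nw = trial, nw
        bundles = best
    return bundles
```

For example, with three identical items and two agents who both value every item at 1, giving everything to agent 0 violates EF1, and one transfer restores it.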
💡 Research Summary
The paper tackles the classic problem of allocating indivisible items among agents with additive, subjective valuations, where monetary transfers are prohibited. While the literature has long studied the trade‑off between economic efficiency (social welfare) and fairness (envy‑freeness and its relaxations EF1/EFX), existing algorithms are either combinatorial (NP‑hard MNW, round‑robin) or learning‑based but still tied to a procedural backbone (e.g., differentiable round‑robin, Lagrangian EF1 constraints). The authors propose a fundamentally different approach: a single‑shot neural allocator that directly maps a valuation matrix to an allocation in one forward pass, without any solver supervision, fairness labels, or unrolled optimization.
Model architecture (FairFormer).
- Input: an m×n non‑negative valuation matrix V (rows = items, columns = agents).
- Tokenization: two disjoint token sets are created by applying exchangeable linear projections ϕ_I and ϕ_A to V and Vᵀ, yielding item embeddings I₀∈ℝ^{m×d} and agent embeddings H₀∈ℝ^{n×d}. No positional encodings are used, guaranteeing permutation‑equivariance.
- Two‑tower encoder: each tower passes through L layers of self‑attention (FFSelfAttn) to capture intra‑set interactions.
- Cross‑attention fusion: items act as queries, agents as keys/values (FFCrossAttn) producing contextualized item embeddings Z₀∈ℝ^{m×d}.
- Global consistency: K additional item‑wise self‑attention layers refine Z, yielding ˜Z.
- Bilinear scoring: a compatibility matrix D = ˜Z H_Lᵀ (size m×n) is computed. A learned scalar α adds a residual of the raw values, forming the modified utility matrix Sθ(V) = D + αV. This residual preserves the obvious “high‑value‑item → high‑probability” signal while allowing the network to learn fairness‑related adjustments.
- Allocation distribution: a temperature‑scaled softmax is applied row‑wise, A = softmax(Sθ(V)/τ) ∈ (Δⁿ)ᵐ, so each item's row of A is a probability distribution over the n agents (matching the row‑wise argmax used at test time). The temperature τ is annealed during training, providing a smooth path from fractional to near‑one‑hot outputs.
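The pipeline above can be condensed into a minimal single-head, single-layer numpy sketch. This uses random, untrained weights; plain linear tokenizers stand in for the exchangeable projections ϕ_I and ϕ_A, α is fixed rather than learned, and FFN/LayerNorm sublayers are omitted for brevity:

```python
import numpy as np

rng = np.random.default_rng(0)
d = 16  # embedding width

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attn(Q, K, V_):
    """Single-head scaled dot-product attention."""
    return softmax(Q @ K.T / np.sqrt(Q.shape[-1]), axis=-1) @ V_

def sa(X):
    """One self-attention layer with fresh random weights
    (stands in for a trained FFSelfAttn block)."""
    Wq, Wk, Wv = (rng.standard_normal((d, d)) / np.sqrt(d) for _ in range(3))
    return attn(X @ Wq, X @ Wk, X @ Wv)

def ca(X, Y):
    """Cross-attention: X queries, Y keys/values (FFCrossAttn)."""
    Wq, Wk, Wv = (rng.standard_normal((d, d)) / np.sqrt(d) for _ in range(3))
    return attn(X @ Wq, Y @ Wk, Y @ Wv)

def fairformer_forward(V, tau=0.5, alpha=1.0):
    m, n = V.shape
    # Tokenization: item tokens from rows of V, agent tokens from rows of V^T.
    I0 = V @ (rng.standard_normal((n, d)) / np.sqrt(n))     # (m, d)
    H0 = V.T @ (rng.standard_normal((m, d)) / np.sqrt(m))   # (n, d)
    I1, H1 = sa(I0), sa(H0)     # two-tower self-attention (one of L layers)
    Z = sa(ca(I1, H1))          # item->agent fusion + global refinement
    D = Z @ H1.T                # bilinear compatibility matrix, (m, n)
    S = D + alpha * V           # residual on the raw valuations
    return softmax(S / tau, axis=1)  # per-item distribution over agents

V = rng.random((5, 3))          # 5 items, 3 agents
A = fairformer_forward(V)
assignment = A.argmax(axis=1)   # row-wise argmax discretization
```

Each row of `A` sums to one, and the final `argmax` recovers a discrete allocation from the fractional one.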
Training objective.
The model is trained end‑to‑end to maximize the expected logarithm of Nash welfare:
L_NW(θ) = E_{V∼D}[ Σᵢ log( Σⱼ A_{ji} V_{ji} ) ],
where A = softmax(Sθ(V)/τ) is the fractional allocation, i ranges over agents and j over items. The inner sum is agent i's fractional utility, so maximizing L_NW maximizes the expected logarithm of the Nash welfare product.
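A minimal numpy sketch of the per-instance objective (in actual training an autograd framework would supply gradients, and the loss is averaged over sampled valuation matrices; names here are illustrative):

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def log_nash_welfare(S, V, tau=1.0, eps=1e-8):
    """Single-instance objective: fractional allocation
    A = softmax(S / tau) row-wise, agent i's fractional utility
    u_i = sum_j A[j, i] * V[j, i], objective = sum_i log u_i."""
    A = softmax(S / tau, axis=1)      # (m, n), rows sum to 1
    u = (A * V).sum(axis=0)           # per-agent fractional utilities
    return float(np.log(u + eps).sum())
```

For instance, with two agents who each value only "their own" item, scores that route each item to the agent valuing it yield a higher objective than scores that route items the other way.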