Sponsored Questions and How to Auction Them
📝 Original Info
- Title: Sponsored Questions and How to Auction Them
- ArXiv ID: 2512.03975
- Date: 2025-12-03
- Authors: Kshipra Bhawalkar, Alexandros Psomas, Di Wang
📝 Abstract
Online platforms connect users with relevant products and services using ads. A key challenge is that a user's search query often leaves their true intent ambiguous. Typically, platforms passively predict relevance based on available signals and in some cases offer query refinements. The shift from traditional search to conversational AI provides a new approach. When a user's query is ambiguous, a Large Language Model (LLM) can proactively offer several clarifying follow-up prompts. In this paper we consider the following: what if some of these follow-up prompts can be "sponsored," i.e., selected for their advertising potential. How should these "suggestion slots" be allocated? And, how does this new mechanism interact with the traditional ad auction that might follow? This paper introduces a formal model for designing and analyzing these interactive platforms. We use this model to investigate a critical engineering choice: whether it is better to build an end-to-end pipeline that jointly optimizes the user interaction and the final ad auction, or to decouple them into separate mechanisms for the suggestion slots and another for the subsequent ad slot. We show that the VCG mechanism can be adopted to jointly optimize the sponsored suggestion and the ads that follow; while this mechanism is more complex, it achieves outcomes that are efficient and truthful. On the other hand, we prove that the simple-to-implement modular approach suffers from strategic inefficiency: its Price of Anarchy is unbounded.
📄 Full Content
Platform presents a question ℓ, e.g., the suggested prompt “I am looking for trail running shoes.”
User interacts with the question (e.g., clicks), generating a signal σ.
Platform and advertisers update their belief about the user’s state to the posterior Dℓ,σ.
Based on the refined intent, one of the advertisers is allocated an ad slot.
[…] trail running shoes," an advertiser can guide a user with relevant intent directly toward their products, and produce better outcomes for the users, advertisers and platform. Given the limited display space and the varying utility advertisers derive from different suggestions, the platform faces a critical question: How should these “suggestion” slots be allocated? And, how does this new mechanism for allocating suggestions interact with the traditional ad auction that might follow? In this paper, we introduce a formal model for designing and analyzing these problems. We use this model to investigate a key design choice: whether to build an end-to-end mechanism that jointly optimizes the suggestion and downstream ad auction, or to adopt a simpler modular approach that decouples them into separate auctions.
We introduce a new model for thinking about advertising in interactive platforms. Our model combines information elicitation and (ad) allocation. While directly inspired by modern applications like sponsored suggestions in conversational AI, a key strength of our model is its generality. We give a framework flexible enough to model many different forms of platform-user interaction, including sponsored clickable suggestions (prompts), sponsored questions (that the LLM makes), or other formats.
Our setup is as follows. A platform has a single ad slot for sale to one of $n$ strategic advertisers. The user’s intent is captured by a state of nature $\theta$ that encapsulates complex, noisy user features and intent vectors that traditional keyword matching often misses. For the advertiser, $\theta$ represents the latent information about the user that is critical for determining an ad’s relevance. We assume $\theta$ is unknown to all parties and is sampled from a public prior distribution $D$. Each advertiser $i$ has (1) a private base value $v_i \ge 0$, the same across all states, and (2) a state-dependent conversion rate $\alpha_i(\theta) \in \mathbb{R}_+$ that represents the ad’s relevance for advertiser $i$ given $\theta$. Advertisers have quasilinear utility, i.e., $u_i = v_i \cdot \alpha_i(\theta) - p_i$, where $p_i$ is the payment charged by the platform.
At the heart of our model is an information elicitation stage: alongside any organic content presented to the user, the platform can choose to ask one of k possible sponsored questions. Asking question ℓ generates a signal σ, observable by everyone, drawn from a distribution Q ℓ (. | θ).
Here a “question” is the LLM-generated prompt (“I am looking for trail running shoes”), and a “signal” is the publicly observable, measurable outcome of the user clicking or ignoring it. The probability of a user’s response depends on their latent intent, a relationship captured by the probability distribution Q ℓ (. | θ). Given a signal σ generated by question ℓ, advertisers update their belief about the state θ to a common posterior distribution D ℓ,σ . This, in turn, implies updated (expected) conversion rates for the ad slot.
Example 1.1 (Running shoes). A user is asking for information about “running shoes.” The platform is uncertain about their specific needs. We model the user’s intent as the state of nature $\theta = (\theta_{\text{terrain}}, \theta_{\text{exp}}) \in \{0, 1\}^2$, where $\theta_{\text{terrain}} = 1$ if the user is a trail runner and 0 if they are a road runner; $\theta_{\text{exp}} = 1$ if the user is an experienced runner, and 0 if they are a beginner. The prior belief $D$ is that all four states in $\Theta = \{(0, 0), (0, 1), (1, 0), (1, 1)\}$ are equally likely; that is, $D(\theta) = 1/4$ for all $\theta$.
There are two advertisers. Advertiser 1 sells professional trail running shoes. Their private base value is high, v 1 = $50. Their conversion rate α 1 (θ) is highest for an experienced trail runner, e.g., α 1 (1, 1) = 0.9. It is lower for a beginner trail runner, α 1 (1, 0) = 0.3, and zero for all road runners (α 1 (0, •) = 0). Advertiser 2 sells popular road running shoes for beginners. Their base value is v 2 = $30. Their conversion rate α 2 (θ) is highest for a beginner road runner, α 2 (0, 0) = 0.8. It is lower for an experienced road runner, α 2 (0, 1) = 0.4, and zero for all trail runners (α 2 (1, •) = 0).
Before the ad auction, the platform can show one of two possible sponsored suggestions. One possible choice is ℓ trail : “I am looking for trail running shoes.” If the user clicks on this suggestion, this generates a signal σ trail,click , which, for simplicity, we assume perfectly reveals their interest in trails; that is, if the signal is σ trail,click , everyone learns that θ terrain = 1. The posterior distribution D ℓ trail ,σ trail,click is the uniform distribution over (1, 0) (trail, beginner) and (1, 1) (trail, experienced). If the user does not click on this suggestion this generates a signal σ trail,no click , which, for simplicity, we assume perfectly reveals the user’s lack of interest in trails. D ℓ trail ,σ trail,no click is the uniform distribution over (0, 0) (road, beginner) and (0, 1) (road, experienced). A different choice for a suggestion is ℓ targeted : “I am an experienced trail runner.” If this is clicked, we assume it completely reveals that the state is θ = (1, 1); otherwise, it is revealed that the state is not θ = (1, 1), yielding a posterior distribution that is uniform over (0, 0) (road, beginner), (1, 0) (trail, beginner), and (0, 1) (road, experienced).
In this paper, we are interested in understanding the trade-offs between two fundamental design philosophies: an end-to-end approach that jointly optimizes the suggestion and auction stages, versus a modular approach that decouples them.
We first analyze the end-to-end approach, where advertisers report their base values, and the platform uses this information to decide which question to present, which ad to show, and how much to charge the advertisers. We show that the platform can instantiate this as an end-to-end VCG auction. This auction is truthful, i.e., it incentivizes the advertisers to report their true base values. An interesting observation is that the VCG payment can be decomposed into a second price payment (coming from the ad allocation step) and an additional payment based on the influence an advertiser’s bid had on the selection of the question. While a unified VCG auction may seem complex, we argue it is a strong candidate for practical systems, especially those with autobidders, because of its efficiency and truthfulness guarantees, which, as we will show, are lacking in simpler modular approaches.
Next, we consider a natural alternative to the end-to-end approach: a modular mechanism that decouples the auction into two stages (one for the question, and one for the ad). We consider a VCG-per-stage design: advertisers first bid on which question to ask, and after a question is chosen and a signal is observed, they bid again for the final ad slot. In this setup, an advertiser’s value for a given question is simply their expected utility from the subsequent ad auction. We prove that this intuitive behavior constitutes a pure Nash equilibrium: advertisers bid this expected utility in the first stage and their true value in the second. This seemingly elegant equilibrium hides significant practical challenges. For example, an advertiser’s value for a question is difficult to compute, as it depends on the unknown values and strategies of all other bidders. A tempting solution is for the platform to assist. Perhaps, for example, it could collect advertisers’ base values and compute the stage-1 bids on their behalf. We show that this “proxy” approach creates incentives for advertisers to misreport their underlying values to manipulate the outcome. Furthermore, the proposed bidding language (bidding on every question) fails to capture the context-dependent nature of conversational AI, where an advertiser’s value for a question can change dramatically based on the preceding dialogue. For example, an advertiser might value a question about trail running differently if they know that the user is an experienced runner; asking them to bid on the question by itself might miss such nuances.
Even if these practical hurdles could be overcome, the modular approach suffers from a fatal flaw. We prove that the equilibrium of the VCG-per-stage mechanism has an unbounded Price of Anarchy. That is, its outcome can be arbitrarily less efficient than the one achieved by the unified, end-to-end auction. This severe inefficiency is not an artifact of the VCG-per-stage design; we find that similar issues arise when using other common formats, such as first-price or all-pay auctions for the initial question slots.
The main take-home message is that a unified, end-to-end mechanism is a more attractive and robust solution for allocating sponsored suggestions. If a modular design is pursued for its apparent simplicity, our results show that significant further research is needed to develop a mechanism that is both strategically simple for advertisers and achieves good social welfare.
In summary, our contributions are as follows:
• We introduce a formal model for sponsored suggestions in interactive platforms.
• We observe that the end-to-end VCG mechanism is truthful (DSIC) in this setting and provide a simple characterization of its payments.
• We identify and characterize an intuitive pure Nash equilibrium for the natural “VCG-per-stage” modular mechanism.
• We prove that this modular mechanism has an unbounded Price of Anarchy, and show this severe inefficiency persists even when the platform uses first-price or all-pay auctions in the suggestion stage.
Many search engines and other products show query refinements or suggested queries to help users. For most popular search engines, there are policies which commit to showing the best results for the user, and which preclude them from considering commercial interest (such as advertiser bids or ad revenue) to optimize these queries [Mic25, Goob]. Exceptions exist; for example, Google’s Relevant Search for Content (RSOC) allows publishers to use “funnel RPM” in optimization of suggested queries shown on their webpage [Goo25], thus allowing advertisers to inform which suggestions are shown to the user (note though that Google AFS policies [Gooa] still require that first and foremost user experience is optimized). Closer to our interest here, there is growing interest from practice in integrating advertising into conversational search. For instance, Perplexity recently launched an experiment featuring sponsored follow-up prompts [Per24]. At the same time, the academic literature on the topic is nascent but rapidly growing. Feizi et al. [FHRS25] discuss challenges and opportunities in this area. One major stream of research focuses on how to natively integrate ads into a final, LLM-generated answer. Various models have been proposed, including token-based auctions [DML+24], bidding for placement within an LLM summary [DFK+24], fusing organic and sponsored content [MTK24], embedding ads in an independently generated organic answer [HLRS24], and bidding to influence the fine-tuning of the LLM itself [SCS25]. Banchio et al. [BMP25] study the problem of ad placement in a conversational search; specifically, they study the equilibria under first- and second-price auctions with respect to the timing of showing an ad. In contrast, our work models the preceding, interactive stage as well: using suggestions to influence the flow of the conversation itself.
Bergemann et al. [BBD+25] study a setting where agents have private information about both their preferences and a common state, showing that standard mechanisms fail. They propose a solution via “data-driven mechanisms,” which use post-allocation data (like observed clicks) to adjust payments and restore truthfulness. While our information structure differs (advertisers’ state-dependent conversion rates are public information in our paper), our work addresses the same high-level challenge for the platform: designing a mechanism that must jointly elicit preferences and handle uncertainty about a common state.
We consider the problem of a platform with a single ad slot for sale, to one of n strategic advertisers. We will refer to the ad slot as the item. The item is described by a state of nature θ ∈ Θ, for some finite set Θ, that is initially unknown to both the platform and the advertisers. θ is sampled from a known distribution D; we write D(θ) for the probability that the state of nature is θ.
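Each advertiser $i$ has a private base value $v_i \ge 0$ for the item, which stays the same across all states $\theta$. There is also a publicly known state-dependent conversion rate $\alpha_i(\theta) \ge 0$ that represents the item’s relevance for advertiser $i$ given $\theta$. The value of advertiser $i$ for an item with state $\theta$ is the base value multiplied by the conversion rate, i.e., $v_i \cdot \alpha_i(\theta)$. Advertisers have quasilinear utilities: when allocated an item with state $\theta$ for a payment $p_i$, advertiser $i$ has utility
$$u_i = v_i \cdot \alpha_i(\theta) - p_i.$$
Before the item is allocated, the platform presents one of $k$ possible questions. Presenting question $\ell$ reveals a random signal $\sigma \in \Sigma_\ell$; $\sigma$ is drawn from a conditional distribution over signals $Q_\ell(\cdot \mid \theta)$. We write $S_\ell(\sigma)$ for the marginal probability of observing $\sigma$ when question $\ell$ is presented:
$$S_\ell(\sigma) = \sum_{\theta \in \Theta} D(\theta)\, Q_\ell(\sigma \mid \theta).$$
Upon observing $\sigma$, the posterior over states is
$$D_{\ell,\sigma}(\theta) = \frac{D(\theta)\, Q_\ell(\sigma \mid \theta)}{S_\ell(\sigma)}.$$
Therefore, upon observing signal $\sigma$ from question $\ell$, the posterior expected conversion rate for advertiser $i$ is
$$\alpha_i(\ell, \sigma) = \mathbb{E}_{\theta \sim D_{\ell,\sigma}}[\alpha_i(\theta)] = \sum_{\theta \in \Theta} D_{\ell,\sigma}(\theta)\, \alpha_i(\theta).$$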
We call v i α i (ℓ, σ) the effective value of advertiser i given (ℓ, σ).
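Example 2.1 (Running shoes (continuation of Example 1.1)). Recall that $\Theta = \{(\theta_{\text{terr}}, \theta_{\text{exp}}) \in \{0, 1\}^2\}$ with uniform prior $D$. We have two advertisers with private base values $v_1 = 50$, $v_2 = 30$, and public conversion rates:
$$\alpha_1(1, 1) = 0.9, \quad \alpha_1(1, 0) = 0.3, \quad \alpha_1(0, 1) = \alpha_1(0, 0) = 0,$$
$$\alpha_2(0, 0) = 0.8, \quad \alpha_2(0, 1) = 0.4, \quad \alpha_2(1, 1) = \alpha_2(1, 0) = 0.$$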
There are two questions:
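- Terrain question $\ell_{\text{terr}}$ (“I am looking for trail running shoes”) with signals $\Sigma_{\text{terr}} = \{\text{click}, \text{no-click}\}$ and
$$Q_{\text{terr}}(\text{click} \mid \theta) = \begin{cases} 1 & \text{if } \theta_{\text{terr}} = 1, \\ 0 & \text{otherwise,} \end{cases} \qquad Q_{\text{terr}}(\text{no-click} \mid \theta) = 1 - Q_{\text{terr}}(\text{click} \mid \theta).$$
Thus, $S_{\text{terr}}(\text{click}) = S_{\text{terr}}(\text{no-click}) = 1/2$. The posteriors are: $D_{\text{terr},\text{click}}$ is uniform over $\{(1, 1), (1, 0)\}$, and $D_{\text{terr},\text{no-click}}$ is uniform over $\{(0, 1), (0, 0)\}$. Hence, the posterior expected conversion rates are
$$\alpha_1(\ell_{\text{terr}}, \text{click}) = 0.6, \quad \alpha_2(\ell_{\text{terr}}, \text{click}) = 0, \quad \alpha_1(\ell_{\text{terr}}, \text{no-click}) = 0, \quad \alpha_2(\ell_{\text{terr}}, \text{no-click}) = 0.6.$$
Therefore, the effective value of advertiser 1 is $50 \cdot 0.6 = 30$ when the signal is “click,” and 0 otherwise. The effective value of advertiser 2 is 0 when the signal is “click,” and $30 \cdot 0.6 = 18$ otherwise. Therefore, ex-ante, $\ell_{\text{terr}}$ induces a lottery that with probability 1/2 gives effective values 30 and 0, and with probability 1/2 the effective values are 0 and 18.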
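- Targeted question $\ell_{\text{tgt}}$ (“I am an experienced trail runner”) with signals $\Sigma_{\text{tgt}} = \{\text{click}, \text{no-click}\}$ and
$$Q_{\text{tgt}}(\text{click} \mid \theta) = \begin{cases} 1 & \text{if } \theta = (1, 1), \\ 0 & \text{otherwise,} \end{cases} \qquad Q_{\text{tgt}}(\text{no-click} \mid \theta) = 1 - Q_{\text{tgt}}(\text{click} \mid \theta).$$
Thus $S_{\text{tgt}}(\text{click}) = 1/4$ and $S_{\text{tgt}}(\text{no-click}) = 3/4$. The posteriors are: $D_{\text{tgt},\text{click}}$ is a point mass on $(1, 1)$, and $D_{\text{tgt},\text{no-click}}$ is uniform over $\{(1, 0), (0, 1), (0, 0)\}$. Hence the posterior expected conversion rates are
$$\alpha_1(\ell_{\text{tgt}}, \text{click}) = 0.9, \quad \alpha_2(\ell_{\text{tgt}}, \text{click}) = 0, \quad \alpha_1(\ell_{\text{tgt}}, \text{no-click}) = 0.1, \quad \alpha_2(\ell_{\text{tgt}}, \text{no-click}) = 0.4.$$
Therefore, the effective value of advertiser 1 is $50 \cdot 0.9 = 45$ when the signal is “click,” and $50 \cdot 0.1 = 5$ otherwise. The effective value of advertiser 2 is 0 when the signal is “click,” and $30 \cdot 0.4 = 12$ otherwise. Therefore, ex-ante, $\ell_{\text{tgt}}$ induces a lottery that with probability 1/4 gives effective values 45 and 0, and with probability 3/4 the effective values are 5 and 12.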
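The posterior update in this example can be checked numerically. The following is a minimal Python sketch, not code from the paper: the names `CLICK_PROB`, `signal_prob`, `posterior`, and `effective_value` are illustrative choices, and the click likelihoods encode the example's simplifying assumption that a click perfectly reveals the relevant part of the state.

```python
from itertools import product

STATES = list(product([0, 1], repeat=2))   # (terrain, experience)
PRIOR = {s: 0.25 for s in STATES}          # uniform prior D
BASE_VALUE = {1: 50.0, 2: 30.0}            # base values v_1, v_2
ALPHA = {                                  # public conversion rates alpha_i(theta)
    1: {(1, 1): 0.9, (1, 0): 0.3, (0, 1): 0.0, (0, 0): 0.0},
    2: {(1, 1): 0.0, (1, 0): 0.0, (0, 1): 0.4, (0, 0): 0.8},
}
# CLICK_PROB[question][state] = Q_l(click | state); no-click is the complement.
CLICK_PROB = {
    "terr": {s: float(s[0] == 1) for s in STATES},
    "tgt": {s: float(s == (1, 1)) for s in STATES},
}


def likelihood(question, signal, state):
    """Q_l(signal | state) for the two-signal questions of the example."""
    p = CLICK_PROB[question][state]
    return p if signal == "click" else 1.0 - p


def signal_prob(question, signal):
    """Marginal probability S_l(signal) under the prior."""
    return sum(PRIOR[s] * likelihood(question, signal, s) for s in STATES)


def posterior(question, signal):
    """Bayes posterior D_{l,signal} over states."""
    z = signal_prob(question, signal)
    return {s: PRIOR[s] * likelihood(question, signal, s) / z for s in STATES}


def effective_value(adv, question, signal):
    """Base value times posterior expected conversion rate."""
    post = posterior(question, signal)
    rate = sum(post[s] * ALPHA[adv][s] for s in STATES)
    return BASE_VALUE[adv] * rate


if __name__ == "__main__":
    for q, sig in product(CLICK_PROB, ["click", "no-click"]):
        values = [round(effective_value(a, q, sig), 6) for a in (1, 2)]
        print(q, sig, signal_prob(q, sig), values)
```

Running the script lists, for each question and signal, the signal probability and the pair of effective values, which can be compared against the two lotteries described above.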
After the question is presented, and the signal is randomly drawn and revealed to all parties, the platform can allocate the item.
Overall, an instance of our problem is parameterized by: (1) the public prior distribution D over states, (2) public conditional distributions Q ℓ (σ | θ) for every question ℓ, signal σ, and state θ, (3) private base values v i for each advertiser i, and (4) public conversion rates α i (θ) for each advertiser i and state θ.
Given an instance, the timing of events is as follows:
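1. Nature samples the item’s state θ from D. θ is not revealed to the platform or the advertisers.
2. The advertisers possibly interact with the platform.
3. The platform presents a question ℓ.
4. Nature draws a signal σ from Q ℓ (. | θ), and reveals it to the platform and the advertisers.
5. The advertisers possibly interact with the platform.
6. The platform allocates the item to one of the advertisers (possibly in a randomized way), and charges payments.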
Given this timing, the platform picks the rules of the game, and commits to them ex ante. The rules may condition on publicly observed signals σ and any messages solicited from advertisers in steps (2) and (5). In particular, the platform specifies (i) what input (if any) advertisers provide in steps (2) and (5); (ii) how the question ℓ is selected in step (3); and (iii) the allocation and payment rules used in step (6). The platform’s goal is to maximize expected welfare: the expectation, over the platform’s (possibly randomized) choice of question and the induced signal, of the sum over advertisers of each advertiser’s base value times the (posterior) conversion rate they are allocated. The goal of each advertiser is to maximize expected utility: its base value times conversion rate minus its payment.
In this paper, we study two different approaches for solving the platform problem. First, in the direct revelation approach, the platform asks each advertiser to report their base values in step (2) (no interaction occurs in step (5)) and commits to a truthful mechanism. We define the setting in detail, and study such mechanisms, in Section 3. Second, we consider modular two-stage mechanisms: the platform commits to two auctions: one for picking the question in step (3), and one for allocating the item in step (6). Messages/bids are therefore solicited in both steps (2) and (5). Here, performance is evaluated at a Nash equilibrium. We define the setting in detail, and study such mechanisms in Section 4.
In this section, we study the design of a single, end-to-end mechanism for both information acquisition and allocation. This corresponds to the direct revelation approach, where the platform commits to a set of rules, asks advertisers to report their private information (their base values), and then manages the entire process (which question to ask, who gets the item, how much to charge). The timing of a direct mechanism is as follows:
1. Nature samples the item’s state θ from D. θ is not revealed to the platform or the advertisers.
2. Each advertiser i submits to the platform a bid b i (possibly different from their base value v i ).
3. The platform presents a question ℓ.
4. Nature draws a signal σ from Q ℓ (. | θ), and reveals it to the platform and the advertisers.
5. The platform allocates the item to one of the advertisers (possibly in a randomized way), and charges payments.
A direct mechanism M = (λ, q, p) consists of three functions, that depend on the reported bids b = (b 1 , b 2 , . . . , b n ), from the single interaction with the advertisers in step (2) above:
• The suggestion rule $\lambda(b)$ is a probability distribution over the available questions; we write $\lambda_\ell(b)$ for the probability that the mechanism picks question $\ell$. It holds that $\sum_{\ell=1}^{k} \lambda_\ell(b) = 1$, for all $b$.
• The allocation rule $q(b, \sigma)$: For a given report $b$ and a realized signal $\sigma$ (drawn from $Q_\ell(\cdot \mid \theta)$, where $\ell$ is the question sampled from $\lambda(b)$), $q_i(b, \sigma)$ is the probability that advertiser $i$ receives the item. It holds that $\sum_{i=1}^{n} q_i(b, \sigma) \le 1$, for all $b$ and $\sigma$.
• The payment rule p(b, σ): For a given report b and a realized signal σ, p i (b, σ) is the payment charged to advertiser i.
For deterministic mechanisms, λ ℓ (b) and q i (b, σ) are indicators for ℓ being chosen/advertiser i winning the item.
Let $x_i(b) = \mathbb{E}_{\ell \sim \lambda(b),\, \sigma \sim S_\ell(\cdot)}[\alpha_i(\ell, \sigma) \cdot q_i(b, \sigma)]$ be the expected delivered conversion rate for advertiser $i$, where the expectation is over the randomness of the mechanism and the random signal. Intuitively, if $\alpha_i(\ell, \sigma)$ is a click-through rate, $x_i(b)$ is the “expected number of clicks” allocated to advertiser $i$. Similarly, slightly overloading notation, let $p_i(b) = \mathbb{E}_{\ell \sim \lambda(b),\, \sigma \sim S_\ell(\cdot)}[p_i(b, \sigma)]$ be the expected payment of advertiser $i$. The interim utility of advertiser $i$ with true base value $v_i$, who submits a bid $b_i$ while the others report $b_{-i}$, is therefore
$$u_i(b_i, b_{-i}; v_i) = v_i \cdot x_i(b_i, b_{-i}) - p_i(b_i, b_{-i}).$$
Our notion of truthfulness is dominant strategy incentive compatibility. A mechanism is dominant strategy incentive compatible (DSIC) if truthtelling is a dominant strategy:
$$v_i \cdot x_i(v_i, b_{-i}) - p_i(v_i, b_{-i}) \;\ge\; v_i \cdot x_i(b_i, b_{-i}) - p_i(b_i, b_{-i})$$
for all $i$, $v_i$, $b_i$ and $b_{-i}$. We are interested in designing a DSIC mechanism that maximizes expected social welfare.
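The DSIC condition can be probed by brute force on a finite grid of values and bids. The sketch below is an illustration under stated assumptions rather than anything from the paper: it fixes two hypothetical posterior conversion rates (`RATES`), skips the question stage entirely, and compares a second price auction on effective bids with a first price variant. A grid check can only refute DSIC or fail to refute it on that grid; it is not a proof.

```python
from itertools import product

# Hypothetical posterior conversion rates for two advertisers (assumption).
RATES = [0.6, 0.5]


def _winner(bids):
    """Index of the highest effective bid b_i * alpha_i, plus all effective bids."""
    eff = [b * a for b, a in zip(bids, RATES)]
    return max(range(len(bids)), key=eff.__getitem__), eff


def second_price(bids):
    """Return (x, p): delivered conversion rates and payments."""
    w, eff = _winner(bids)
    x = [RATES[i] if i == w else 0.0 for i in range(len(bids))]
    p = [0.0] * len(bids)
    p[w] = max(e for j, e in enumerate(eff) if j != w)
    return x, p


def first_price(bids):
    """Same allocation, but the winner pays its own effective bid."""
    w, eff = _winner(bids)
    x = [RATES[i] if i == w else 0.0 for i in range(len(bids))]
    p = [0.0] * len(bids)
    p[w] = eff[w]
    return x, p


def utility(mech, i, value, bids):
    """Quasilinear utility v_i * x_i(b) - p_i(b)."""
    x, p = mech(list(bids))
    return value * x[i] - p[i]


def is_dsic_on_grid(mech, grid, n=2, tol=1e-9):
    """True if no grid misreport beats truthful bidding for any grid profile."""
    for i in range(n):
        for value in grid:
            for others in product(grid, repeat=n - 1):
                def profile(own):
                    return others[:i] + (own,) + others[i:]
                truthful = utility(mech, i, value, profile(value))
                if any(utility(mech, i, value, profile(lie)) > truthful + tol
                       for lie in grid):
                    return False
    return True


GRID = list(range(11))
print("second price passes grid check:", is_dsic_on_grid(second_price, GRID))
print("first price passes grid check:", is_dsic_on_grid(first_price, GRID))
```

The second price rule passes the check while the first price rule fails it, since a winning advertiser can shade its bid and still win at a lower payment.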
Our first result establishes that the welfare-maximizing direct mechanism is truthful. This is achieved by defining the suggestion and allocation rules to greedily maximize the reported welfare, and then applying the classic Vickrey-Clarke-Groves (VCG) payment structure.
Lemma 3.1. The direct mechanism M * = (λ * , q * , p V CG ) that maximizes expected social welfare is Dominant Strategy Incentive Compatible (DSIC), where:
• The suggestion rule $\lambda^*(b)$ deterministically chooses the question $\ell^* = \ell^*(b)$ that maximizes the ex-ante expected welfare, i.e., $\ell^* = \arg\max_{\ell} \mathbb{E}_{\sigma \sim S_\ell(\cdot)}\left[\max_{j=1,\dots,n} b_j \cdot \alpha_j(\ell, \sigma)\right]$.
• The allocation rule $q^*(b, \sigma)$ allocates the item to the advertiser $i^*$ with the highest posterior value after suggestion $\ell^*(b)$ is presented and signal $\sigma$ is observed: $i^*(b, \sigma) = \arg\max_{j \in \{1,\dots,n\}} b_j \cdot \alpha_j(\ell^*(b), \sigma)$. That is, $q^*_{i^*} = 1$ and $q^*_j = 0$ for $j \ne i^*$.
• The payment rule p V CG charges each advertiser i the externality they impose on all other advertisers: the expected welfare of the optimal solution without i in the instance, minus the expected welfare of everyone other than i in the optimal solution with i.
Before proving the lemma, we give a simple example of how the mechanism works.
Example 3.2 (Mechanism $M^*$ applied to Example 2.1). Consider applying $M^*$ to the instance from Example 2.1 given (truthful) bids $b_1 = 50$, $b_2 = 30$. Given these bids, the mechanism first chooses a question to present.
Choosing the question $\ell^*(b)$. As discussed in Example 2.1, question $\ell_{\text{terr}}$ induces a lottery that, with probability 1/2, gives effective values (base value times posterior expected conversion rate) 30 and 0, and with probability 1/2, the effective values are 0 and 18. Therefore, the expected welfare under $\ell_{\text{terr}}$ is $\frac{1}{2} \cdot 30 + \frac{1}{2} \cdot 18 = 24$. On the other hand, $\ell_{\text{tgt}}$ induces a lottery that, with probability 1/4, gives effective values 45 and 0, and with probability 3/4, the effective values are 5 and 12. Therefore, the expected welfare under $\ell_{\text{tgt}}$ is $\frac{1}{4} \cdot 45 + \frac{3}{4} \cdot 12 = 20.25$. Hence the mechanism picks $\ell^*(b) = \ell_{\text{terr}}$.
Allocating the item. Since ℓ terr is picked, then with probability 1/2 the signal is “click,” the induced effective values are 30 and 0, and advertiser 1 receives the item. Otherwise, the signal is “no-click,” the induced values are 0 and 18, and advertiser 2 receives the item.
Payments. The expected value of advertiser 1 in the current solution is 15: they receive the item when the signal from asking ℓ terr is “click” (this is a probability 1/2 event), and their effective value is 30 when this happens. For advertiser 2, their expected value is 9: they receive the item when the signal from asking ℓ terr is “no-click” (this is a probability 1/2 event), and their effective value is 18 when this happens.
What would happen if advertiser 1 did not participate? In that case, the question would be picked to optimize just for advertiser 2, who is indifferent between the two questions, since its expected value is (still) 9 under both choices. The welfare of advertiser 2 is therefore the same with or without advertiser 1, which means that advertiser 1 does not impose any externality, so its payment is zero.
What would happen if advertiser 2 did not participate? Similarly, the question would be picked to optimize just for advertiser 1, who is also indifferent between the two questions: ℓ terr gives an expected value of 30/2 = 15, and ℓ tgt gives an expected value of 45/4 + 5 • 3/4 = 15. Therefore, advertiser 2 does not impose any externality, so its payment is also zero.
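The numbers in this example can be reproduced with a short script. This is a minimal sketch, not the authors' implementation: the tables `SIGNAL_PROB` and `POST_RATE` hard-code the signal probabilities and posterior conversion rates of the running example, and the function names `expected_welfare` and `end_to_end_vcg` are illustrative.

```python
# Signal probabilities S_l(sigma) and posterior rates alpha_i(l, sigma)
# for the running-shoes instance (advertiser 1 first, advertiser 2 second).
SIGNAL_PROB = {"terr": {"click": 0.5, "no-click": 0.5},
               "tgt": {"click": 0.25, "no-click": 0.75}}
POST_RATE = {("terr", "click"): [0.6, 0.0], ("terr", "no-click"): [0.0, 0.6],
             ("tgt", "click"): [0.9, 0.0], ("tgt", "no-click"): [0.1, 0.4]}


def expected_welfare(bids, question, exclude=None):
    """E_sigma[max_j b_j * alpha_j(question, sigma)], optionally without one advertiser."""
    total = 0.0
    for signal, prob in SIGNAL_PROB[question].items():
        rates = POST_RATE[(question, signal)]
        values = [b * a for j, (b, a) in enumerate(zip(bids, rates)) if j != exclude]
        total += prob * max(values, default=0.0)
    return total


def end_to_end_vcg(bids):
    """Pick the welfare-maximizing question and compute expected VCG payments."""
    best_q = max(SIGNAL_PROB, key=lambda q: expected_welfare(bids, q))
    payments = []
    for i in range(len(bids)):
        # Expected welfare of everyone except i in the chosen outcome.
        others = 0.0
        for signal, prob in SIGNAL_PROB[best_q].items():
            values = [b * a for b, a in zip(bids, POST_RATE[(best_q, signal)])]
            winner = max(range(len(bids)), key=values.__getitem__)
            if winner != i:
                others += prob * values[winner]
        # Optimal expected welfare in the instance without i.
        without_i = max(expected_welfare(bids, q, exclude=i) for q in SIGNAL_PROB)
        payments.append(without_i - others)
    return best_q, payments


question, payments = end_to_end_vcg([50.0, 30.0])
print(question, payments)
```

With truthful bids (50, 30) the script selects the terrain question and both payments come out as zero up to floating-point rounding, in line with the calculation above.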
We will now prove Lemma 3.1 which shows that mechanism M * is DSIC.
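Proof of Lemma 3.1. Fix bids $b$. For any question $\ell$ and signal $\sigma$, the reported welfare from allocating to $i$ is $b_i \alpha_i(\ell, \sigma)$. Hence, the maximum reported welfare for fixed $\ell$ is $\mathbb{E}_{\sigma \sim S_\ell}\left[\max_j b_j \alpha_j(\ell, \sigma)\right]$, attained by the allocation rule $q^*(b, \sigma)$, which allocates, for each realized $\sigma$, to a maximizer of $b_j \alpha_j(\ell, \sigma)$. Optimizing over $\ell$ gives the suggestion rule $\lambda^*(b)$ above; therefore, our rules maximize expected reported welfare over all feasible outcomes (choice of $\ell$ and signal-contingent allocation). Define
$$W_{-i}(b_{-i}) := \max_{\ell} \mathbb{E}_{\sigma \sim S_\ell}\left[\max_{j \ne i} b_j \alpha_j(\ell, \sigma)\right],$$
the optimal expected reported welfare in the instance without advertiser $i$. Advertiser $i$’s payment is then:
$$p^{VCG}_i(b) = W_{-i}(b_{-i}) - \sum_{j \ne i} b_j x_j(b).$$
Therefore, $i$’s expected utility when its true base value is $v_i$ and it reports $b_i$ is:
$$v_i x_i(b) - p^{VCG}_i(b) = v_i x_i(b) + \sum_{j \ne i} b_j x_j(b) - W_{-i}(b_{-i}).$$
Since $(\lambda^*, q^*)$ selects, for each $b$, the outcome that maximizes the expected reported welfare $\sum_j b_j x_j$, reporting $b_i = v_i$ makes the mechanism pick the outcome that maximizes $v_i x_i + \sum_{j \ne i} b_j x_j$. The remaining term $W_{-i}(b_{-i})$ does not depend on $b_i$, so truthful reporting maximizes $i$’s utility for every $b_{-i}$.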
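The next lemma gives a simplified way of calculating the VCG payments in our problem. Specifically, the VCG payment of advertiser $i$ can be decomposed into (1) an externality from stage 1, plus (2) an expected second price payment.
Lemma 3.3. For every advertiser $i$ and bids $b$, writing $\ell^* = \ell^*(b)$,
$$p^{VCG}_i(b) = \underbrace{\max_{\ell} \mathbb{E}_{\sigma \sim S_\ell}\left[\max_{j \ne i} b_j \alpha_j(\ell, \sigma)\right] - \mathbb{E}_{\sigma \sim S_{\ell^*}}\left[\max_{j \ne i} b_j \alpha_j(\ell^*, \sigma)\right]}_{\text{Stage-1 externality}} + \underbrace{\mathbb{E}_{\sigma \sim S_{\ell^*}}\left[q^*_i(b, \sigma) \cdot \max_{j \ne i} b_j \alpha_j(\ell^*, \sigma)\right]}_{\text{expected Stage-2 second price payment}}.$$
Proof. Starting from the definition of $p^{VCG}_i(b)$ we have
$$p^{VCG}_i(b) = \max_{\ell} \mathbb{E}_{\sigma \sim S_\ell}\left[\max_{j \ne i} b_j \alpha_j(\ell, \sigma)\right] - \mathbb{E}_{\sigma \sim S_{\ell^*}}\left[\sum_{j \ne i} q^*_j(b, \sigma)\, b_j \alpha_j(\ell^*, \sigma)\right]. \tag{1}$$
Let $s_i(\sigma) := \max_{j \ne i} b_j \alpha_j(\ell^*, \sigma)$. For every realized $\sigma$, the welfare-maximizing $q^*(b, \sigma)$ allocates only to maximizers of $b_j \alpha_j(\ell^*, \sigma)$. Therefore, for every signal $\sigma$,
$$\sum_{j \ne i} q^*_j(b, \sigma)\, b_j \alpha_j(\ell^*, \sigma) = s_i(\sigma) \sum_{j \ne i} q^*_j(b, \sigma).$$
If $s_i(\sigma) > 0$, since $\sum_j q^*_j(b, \sigma) = 1$, we have $\sum_{j \ne i} q^*_j(b, \sigma) = 1 - q^*_i(b, \sigma)$. If $s_i(\sigma) = 0$, both sides are 0 regardless of $q^*$. Therefore, for all $\sigma$,
$$\sum_{j \ne i} q^*_j(b, \sigma)\, b_j \alpha_j(\ell^*, \sigma) = \left(1 - q^*_i(b, \sigma)\right) s_i(\sigma). \tag{2}$$
Taking an expectation in (2) and substituting into (1) gives
$$p^{VCG}_i(b) = \underbrace{\max_{\ell} \mathbb{E}_{\sigma \sim S_\ell}\left[\max_{j \ne i} b_j \alpha_j(\ell, \sigma)\right] - \mathbb{E}_{\sigma \sim S_{\ell^*}}\left[s_i(\sigma)\right]}_{\text{Stage-1 externality}} + \underbrace{\mathbb{E}_{\sigma \sim S_{\ell^*}}\left[q^*_i(b, \sigma)\, s_i(\sigma)\right]}_{\text{expected Stage-2 second price}}.$$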
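The split of the VCG payment into a Stage-1 externality and an expected Stage-2 second price can be checked numerically. With only two advertisers the Stage-1 term is always zero, because a lone remaining advertiser has the same expected value under every question. The sketch below therefore adds a hypothetical third advertiser (an assumption, not part of the paper's example) with base value 60 and conversion rate 0.8 only in state (trail, beginner); its posterior rates in `POST_RATE` follow from Bayes' rule on the uniform prior.

```python
SIGNAL_PROB = {"terr": {"click": 0.5, "no-click": 0.5},
               "tgt": {"click": 0.25, "no-click": 0.75}}
# Posterior rates for advertisers 1 and 2 of the running example, plus a
# HYPOTHETICAL advertiser 3 (rate 0.8 in state (1, 0), zero elsewhere).
POST_RATE = {("terr", "click"): [0.6, 0.0, 0.4],
             ("terr", "no-click"): [0.0, 0.6, 0.0],
             ("tgt", "click"): [0.9, 0.0, 0.0],
             ("tgt", "no-click"): [0.1, 0.4, 0.8 / 3]}


def effective(bids, question, signal):
    return [b * a for b, a in zip(bids, POST_RATE[(question, signal)])]


def welfare(bids, question, skip=None):
    """Expected maximum effective bid under a question, optionally skipping one advertiser."""
    return sum(
        prob * max((e for j, e in enumerate(effective(bids, question, sig)) if j != skip),
                   default=0.0)
        for sig, prob in SIGNAL_PROB[question].items())


def payment_parts(bids, i):
    """Return (stage-1 externality, expected second price, direct VCG payment) for advertiser i."""
    best_q = max(SIGNAL_PROB, key=lambda q: welfare(bids, q))
    w_minus_i = max(welfare(bids, q, skip=i) for q in SIGNAL_PROB)
    stage1 = w_minus_i - welfare(bids, best_q, skip=i)
    stage2 = 0.0
    others_welfare = 0.0
    for sig, prob in SIGNAL_PROB[best_q].items():
        eff = effective(bids, best_q, sig)
        winner = max(range(len(eff)), key=eff.__getitem__)
        if winner == i:
            stage2 += prob * max(e for j, e in enumerate(eff) if j != i)
        else:
            others_welfare += prob * eff[winner]
    return stage1, stage2, w_minus_i - others_welfare


for i in range(3):
    s1, s2, direct = payment_parts([50.0, 30.0, 60.0], i)
    print(f"advertiser {i + 1}: stage1={s1:.2f} stage2={s2:.2f} direct={direct:.2f}")
```

In this instance advertiser 1 pays only a second price component, advertiser 2 pays only a Stage-1 externality (its presence switches the chosen question from the targeted one to the terrain one), and in each case the two parts sum to the directly computed VCG payment.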
In contrast to the end-to-end approach, in this section we study modular mechanisms where the platform decouples the problem into two distinct stages. First, it runs an auction to decide which question to present. Second, after the signal from that question is publicly observed, it runs a separate auction to allocate the item. This approach is simpler to implement, but requires advertisers to act strategically. The timing of a modular mechanism is as follows:
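1. Nature samples the item’s state $\theta$ from $D$. $\theta$ is not revealed to the platform or the advertisers.
2. Each advertiser $i$ submits a stage 1 bid vector $b^Q_i = (b^Q_{i,1}, \dots, b^Q_{i,k})$, where $b^Q_{i,\ell}$ is their bid for having question $\ell$ presented.
3. The platform presents a question $\ell$.
4. Nature draws a signal $\sigma$ from $Q_\ell(\cdot \mid \theta)$, and reveals it to the platform and the advertisers.
5. Each advertiser $i$ submits a bid $b^A_i$ for the item.
6. The platform allocates the item to one of the advertisers (possibly in a randomized way), and charges payments.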
Analytically, we will treat this as a one-shot normal-form game: before play begins, each advertiser commits to a contingent plan for both stages. That is, advertisers simultaneously submit (i) a bid for each question and (ii) a bidding function for the item, mapping (ℓ, σ) to a stage-2 bid. The subsequent “stages” are merely the platform executing those plans after the signal is realized. Later in this section, we prove that a certain behavior is a Nash equilibrium under a certain modular mechanism in this normal form game; we also note, however, that the same behavior is a subgame-perfect equilibrium for the corresponding sequential game.
A modular mechanism $M = (M^Q, M^A)$ is composed of two separate mechanisms. The stage 1 mechanism, $M^Q$, takes as input bids $\{b^Q_i\}$, and consists of: (i) a question selection rule $\lambda(b^Q)$, which gives a probability distribution over the available questions; we write $\lambda_\ell(b^Q)$ for the probability that the mechanism picks question $\ell$ and have that $\sum_{\ell=1}^{k} \lambda_\ell(b^Q) = 1$, and (ii) a stage 1 payment rule $p^Q(b^Q)$ that charges each advertiser $i$ a payment $p^Q_i(b^Q)$.
The stage 2 mechanism, $M^A$, takes as input bids $\{b^A_i\}$, and consists of: (i) an item selection rule $q(b^A; \ell, \sigma)$, that allocates the item to advertiser $i$ with probability $q_i(b^A; \ell, \sigma)$, and (ii) a stage 2 payment rule $p^A(b^A; \ell, \sigma)$ that charges each advertiser $i$ a payment $p^A_i(b^A; \ell, \sigma)$.
We evaluate modular mechanisms in their worst-case Nash equilibria. A strategy $s_i$ for an advertiser $i$ specifies what they will do at every interaction point. That is, a strategy $s_i$ consists of (1) a Stage 1 bidding vector $b^Q_i = (b^Q_{i,1}, \dots, b^Q_{i,k})$, which specifies their bid for each question $\ell$, and (2) a bidding function $b^A_i(\ell, \sigma)$ which specifies what they will bid for the item for every possible question $\ell$ and signal $\sigma$ that could be revealed. Let $s = (s_1, \dots, s_n)$ be a strategy profile for the advertisers. This profile determines the bids in both stages, and thus the final outcome and advertisers’ utilities. Let $U_i(s_i; s_{-i})$ be the expected utility for advertiser $i$ when they play strategy $s_i$ and all other advertisers play according to strategies in $s_{-i}$, where the expectation is over the strategies of others, the environment (random state of the item, signal), and randomness in the modular mechanism. A strategy profile $s^*$ is a Nash equilibrium if no advertiser has a profitable unilateral deviation; that is, if for every advertiser $i$, and any strategy $s_i$, $U_i(s^*_i; s^*_{-i}) \ge U_i(s_i; s^*_{-i})$.
We compare the social welfare of the Nash equilibrium to that of the optimal allocation assuming full information. Note that the optimal allocation is an example of what the direct revelation mechanism obtains. The worst-case inefficiency of a Nash equilibrium in a model is captured by a concept called the price of anarchy: the ratio of the social welfare of the optimal direct revelation outcome to that of the worst-case Nash equilibrium.
We instantiate the general modular framework with a specific, natural choice for the two mechanisms. Specifically, we suggest running VCG for each stage separately.
Working backwards, in the second stage, arguably the most natural choice is VCG, which boils down to a second price auction on “effective” values. Concretely, after a suggestion $\ell$ has been chosen (in step (3)) and a signal $\sigma$ has been publicly observed (in step (4)), the platform holds the following auction: allocate the item to the advertiser with the highest effective bid, i.e. the advertiser $i^*$ with the highest $b^A_i \cdot \alpha_i(\ell, \sigma)$, and charge them the second-highest effective bid
$$p^A_{i^*}(b^A; \ell, \sigma) = \max_{j \ne i^*} b^A_j \cdot \alpha_j(\ell, \sigma).$$
Anticipating the truthful and efficient outcome of the second stage auction, each advertiser can calculate its expected utility for any given question $\ell$. Concretely, for a fixed question $\ell$, an advertiser can simulate a draw $\sigma$ from $S_\ell(\cdot)$, and the resulting payoff from the subsequent second price auction. This induces a well-defined expected utility $R_i(\ell; v_i)$, for each advertiser $i$ and question $\ell$. Therefore, the stage 1 problem boils down to picking a mechanism for the following “public projects” problem: there are $k$ projects to choose from (the questions), and $n$ agents (the advertisers), each with a private value $R_i(\ell; v_i)$ for each project. For this problem, VCG is again the natural, truthful solution: pick the project (question) that maximizes social welfare, and charge each agent its externality. Concretely, given a bid $b^Q_{i,\ell}$ from each advertiser $i$ for each question $\ell$: present the question $\ell^*$ with maximum sum of bids, i.e. $\ell^* = \arg\max_{\ell} \sum_{i=1}^{n} b^Q_{i,\ell}$, and charge each advertiser $i$ its externality
$$p^Q_i(b^Q) = \max_{\ell} \sum_{j \ne i} b^Q_{j,\ell} - \sum_{j \ne i} b^Q_{j,\ell^*}.$$
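Example 4.1. Consider the instance from Theorem 2.1. Assume that advertiser 1 submits:
• Stage 1 bids: $b^Q_{1,\mathrm{tgt}} = 20$ and $b^Q_{1,\mathrm{terr}} = 21$.
• Stage 2 bidding function: $b^A_1(\ell, \sigma) = v_1 = 50$ for all $(\ell, \sigma)$.
Assume that advertiser 2 submits:
• Stage 1 bids: $b^Q_{2,\mathrm{tgt}} = 12$ and $b^Q_{2,\mathrm{terr}} = 9$.
• Stage 2 bidding function: $b^A_2(\ell, \sigma) = v_2 = 30$ for all $(\ell, \sigma)$.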
Then the mechanism proceeds as follows. First, since $\sum_i b^Q_{i,\mathrm{tgt}} = 20 + 12 > 21 + 9 = \sum_i b^Q_{i,\mathrm{terr}}$, the question chosen in the first stage is $\ell^* = \ell_{\mathrm{tgt}}$. To compute the stage 1 VCG externality for advertiser 1 we have: $\max_\ell b^Q_{2,\ell} = \max\{9, 12\} = 12$ and $b^Q_{2,\ell^*} = 12$, so $p^Q_1 = 0$. To compute the stage 1 VCG externality for advertiser 2 we have: $\max_\ell b^Q_{1,\ell} = \max\{21, 20\} = 21$ and $b^Q_{1,\ell^*} = 20$, so $p^Q_2 = 1$.
Then, a signal is realized in stage 2. Since $\ell_{\mathrm{tgt}}$ was chosen:
• If $\sigma$ = click (prob. 1/4), the effective bids are $(b^A_1 \alpha_1, b^A_2 \alpha_2) = (50 \cdot 0.9, 30 \cdot 0) = (45, 0)$. Advertiser 1 wins. The second-highest effective bid is 0, so the stage 2 payment is $p^A_1 = 0$.
• If $\sigma$ = no-click (prob. 3/4), the effective bids are $(50 \cdot 0.1, 30 \cdot 0.4) = (5, 12)$. Advertiser 2 wins and pays the second-highest effective bid 5, so $p^A_2 = 5$.
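The two stages of Example 4.1 can be traced with a short script. This is a sketch, not the paper's implementation: the function names `stage1_vcg` and `stage2_second_price` are hypothetical, and the stage 1 bids are read off from the sums $20 + 12$ and $21 + 9$ used in the example.

```python
from fractions import Fraction as F

def stage1_vcg(q_bids):
    """q_bids[i][l] is advertiser i's stage-1 bid for question l.

    Returns the question with the highest total bid and each
    advertiser's VCG externality payment.
    """
    questions = list(q_bids[0])
    total = {l: sum(b[l] for b in q_bids) for l in questions}
    chosen = max(questions, key=lambda l: total[l])
    payments = []
    for b in q_bids:
        # Total bid of everyone else, per question.
        others = {l: total[l] - b[l] for l in questions}
        payments.append(max(others.values()) - others[chosen])
    return chosen, payments

def stage2_second_price(a_bids, alphas):
    """Second-price auction on effective bids b_i * alpha_i."""
    eff = [b * a for b, a in zip(a_bids, alphas)]
    winner = max(range(len(eff)), key=lambda i: eff[i])
    price = max(e for j, e in enumerate(eff) if j != winner)
    return winner, price

# Stage 1 bids (advertiser 1 first, then advertiser 2).
q_bids = [{"tgt": 20, "terr": 21}, {"tgt": 12, "terr": 9}]
chosen, stage1_payments = stage1_vcg(q_bids)

# Stage 2 under l_tgt, with base-value bids (50, 30).
a_bids = [F(50), F(30)]
click = stage2_second_price(a_bids, [F(9, 10), F(0)])
no_click = stage2_second_price(a_bids, [F(1, 10), F(2, 5)])
```

Winners are reported as zero-based indices, so index 0 is advertiser 1 and index 1 is advertiser 2.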
We first prove that the natural mechanism from Section 4.1, VCG per stage, has a natural pure Nash equilibrium: every advertiser bids their (true) expected utility for each question, as well as their true base value. We defer the proof of Theorem 4.2 to Appendix A.
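Theorem 4.2. Let $R_i(\ell; v)$ denote the expected utility of advertiser $i$, in expectation over the signal, when the realized question is $\ell$, the realized signal is $\sigma$, and everyone bids their base value in stage 2. Then the strategy profile $b^Q_{i,\ell} = R_i(\ell; v)$ and $b^A_i(\ell, \sigma) = v_i$, for all advertisers $i$, questions $\ell$, and signals $\sigma$, is a pure Nash equilibrium.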
Note that, as described in the first part of this section, advertisers simultaneously submit (i) a bid for each question and (ii) a bidding function for the item, mapping $(\ell, \sigma)$ to a stage-2 bid. Suppose instead that the platform only collected base values $b_i$, and then computed stage-1 values $R_i(\ell; v)$ on the advertisers’ behalf (using $v_i = b_i$). In this proxy variant, truthfully reporting $b_i = v_i$ is not a Nash equilibrium. The proof of the following proposition is deferred to Appendix A.
The equilibrium outcome in Theorem 4.2 might differ from the welfare-maximizing outcome (the outcome of the mechanism as described in Section 3). Specifically, while the stage 2 outcome of the equilibrium of Theorem 4.2 maximizes welfare for every realized question $\ell$ and signal $\sigma$, the stage 1 question selected maximizes expected quasilinear utility, i.e., it is a question in $\arg\max_\ell \sum_{i=1}^n R_i(\ell; v)$.
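On the other hand, the welfare-maximizing outcome selects the question that maximizes expected value, i.e., a question in $\arg\max_\ell \mathbb{E}_{\sigma \sim S_\ell}\left[\max_i v_i \cdot \alpha_i(\ell, \sigma)\right]$.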
Therefore, it is natural to ask how inefficient this equilibrium outcome can be. And, more generally, how bad is the Price of Anarchy of the proposed mechanism $M = (M^{VCG}_Q, M^{VCG}_A)$?
Our next theorem shows that not only do inefficient equilibria exist, but even the pure Nash equilibrium suggested in Theorem 4.2 has infinite Price of Anarchy.
Theorem 4.4. The Price of Anarchy of the mechanism $M = (M^{VCG}_Q, M^{VCG}_A)$ is unbounded. Moreover, for all c ≥ 10, there exists an instance such that the optimal welfare is at least c times the welfare of the equilibrium from Theorem 4.2.
Proof. Fix an integer m ≥ 3 and a parameter δ > 0 (fixed later in this proof). Our instance is constructed as follows:
• States and questions:
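- $\Theta = \{1, 2, \ldots, m\}$ with the uniform prior $D(\theta = t) = 1/m$.
- A fully revealing question $\ell = 1$: $\Sigma_1 = \{\sigma_1, \ldots, \sigma_m\}$ with $Q_1(\sigma_t \mid \theta = t) = 1$ for all $t$. Thus, $D_{1,\sigma_t}$ is a point mass on state $t$.
- An uninformative question $\ell = 2$: $\Sigma_2 = \{\sigma\}$ with $Q_2(\sigma \mid \theta) = 1$ for all $\theta$. Thus, $D_{2,\sigma} = D$.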
• Advertisers:
- There are $n = m$ advertisers, each with a base value $v_i = 1$.
- For each state $i \in \{1, \ldots, n\}$, $\alpha_i(i) = 1$. That is, advertiser $i$ is the “primary” advertiser for state $i$.
- For every state $i \in \{1, \ldots, n\} \setminus \{3\}$, $\alpha_{i-1}(i) = 1 - \delta$ (where $i - 1$ is read as $n$ for $i = 1$). That is, advertiser $i - 1$ is the “secondary” advertiser for state $i$.
- For state $i = 3$, $\alpha_1(3) = 1 - \delta$ and $\alpha_2(3) = 0$. That is, advertiser 1 is the “secondary” advertiser for state 3 as well (also the secondary for state 2, and the primary for state 1), while advertiser 2 is not the secondary advertiser for any state.
- All remaining conversion rates are 0: $\alpha_i(t) = 0$ for all other pairs $(i, t)$.
This construction ensures that we have exactly two advertisers with positive bids in each state (which creates competition under the revealing question). We now compute induced (signal) conversion rates and compare the two questions.
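Induced conversion rates. Under question $\ell = 1$, for every realized $\sigma_t$ the posterior is a point mass at state $t$; therefore, $\alpha_i(1, \sigma_t) = \alpha_i(t)$ for every state $t$ (and signal $\sigma_t$) and advertiser $i$. On the other hand, under the uninformative question $\ell = 2$, the posterior equals the prior; therefore, $\alpha_i(2, \sigma) = \mathbb{E}_{\theta \sim D}[\alpha_i(\theta)] = \frac{1}{m} \sum_{t=1}^m \alpha_i(t)$. From our construction, we have: $\alpha_1(2, \sigma) = \frac{3-2\delta}{m}$, $\alpha_2(2, \sigma) = \frac{1}{m}$, and $\alpha_i(2, \sigma) = \frac{2-\delta}{m}$ for all $i \notin \{1, 2\}$.
Stage 2 outcomes. In stage 2, every advertiser's effective bid equals its induced conversion rate (since truthful bidding is a dominant strategy, and $v_i = 1$).
If $\ell = 1$, in every state $t$, there are two positive effective values: 1 from the primary advertiser (advertiser $t$) and $1 - \delta$ from the secondary advertiser (advertiser $t - 1$ for $t \notin \{1, 3\}$, advertiser 1 for $t = 3$, and advertiser $n$ for $t = 1$). Thus, for any state $t$, the welfare (the value generated) is 1, but the utility of the winning advertiser (the primary advertiser) is $\delta$. Taking an expectation over the uniform prior we have that $R_i(1; v) = \delta/m$ for all $i$.
If $\ell = 2$, the effective values are the averaged rates computed above: $\frac{3-2\delta}{m}$ for advertiser 1, $\frac{1}{m}$ for advertiser 2, and $\frac{2-\delta}{m}$ for all $i \notin \{1, 2\}$. Therefore, under VCG (a second-price auction) with these bids, advertiser 1 wins, the (expected) value is $\frac{3-2\delta}{m}$ (for $\delta < 1$, we have $\frac{3-2\delta}{m} > \frac{2-\delta}{m}$), the revenue is $\frac{2-\delta}{m}$, and $R_i(2; v) = 0$ for $i \neq 1$, but $R_1(2; v) = \frac{3-2\delta}{m} - \frac{2-\delta}{m} = \frac{1-\delta}{m}$.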
Stage 1 outcome for $M^{VCG}_Q$. Recall that the equilibrium from Theorem 4.2 prescribes that $b^Q_{i,\ell} = R_i(\ell; v)$, for all advertisers $i$ and questions $\ell$. Given the stage 2 calculations, we therefore have that the total bid on question $\ell = 1$ is $\sum_{i=1}^n R_i(1; v) = \delta$. On the other hand, for question $\ell = 2$ the total bid is $\sum_{i=1}^n R_i(2; v) = R_1(2; v) = \frac{1-\delta}{m}$. By picking $\delta$ such that $\frac{1-\delta}{m} > \delta$ (e.g., $\delta = 1/m^2$), we have that the mechanism picks question $\ell = 2$. Therefore, in the prescribed equilibrium for the mechanism $M = (M^{VCG}_Q, M^{VCG}_A)$, the overall outcome is to pick question $\ell = 2$, which results in advertiser 1 always winning in stage 2, for a total (expected) welfare of $\frac{3-2\delta}{m}$. On the other hand, presenting question $\ell = 1$ results in the “primary” advertiser winning the item in every state $t$, for a total (expected) welfare of 1. Hence the ratio of the optimal welfare to the equilibrium welfare is at least $\frac{m}{3-2\delta} \geq \frac{m}{3}$, which exceeds any given $c$ once $m \geq 3c$.
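The quantities in this proof can be checked numerically for a concrete choice of $m$ and $\delta$. The script below is a sketch that rebuilds the conversion-rate table from the construction and recomputes welfare and total stage 1 bids for both questions, assuming truthful stage 2 bids and the stage 1 bids $b^Q_{i,\ell} = R_i(\ell; v)$ from Theorem 4.2. The helper names (`build_alpha`, `second_price`, `analyze`) are hypothetical.

```python
from fractions import Fraction as F

def build_alpha(m, delta):
    """alpha[i][t]: conversion rate of advertiser i in state t (1-indexed)."""
    ids = range(1, m + 1)
    alpha = {i: {t: F(0) for t in ids} for i in ids}
    for t in ids:
        alpha[t][t] = F(1)  # advertiser t is primary for state t
        if t == 3:
            secondary = 1   # advertiser 1 replaces advertiser 2 here
        elif t == 1:
            secondary = m   # wrap around
        else:
            secondary = t - 1
        alpha[secondary][t] = 1 - delta
    return alpha

def second_price(eff):
    """eff maps advertiser -> effective value (all base values are 1).

    Returns (winner's value, winner id, winner's utility).
    """
    ranked = sorted(eff, key=eff.get, reverse=True)
    win, runner_up = ranked[0], ranked[1]
    return eff[win], win, eff[win] - eff[runner_up]

def analyze(m, delta):
    alpha = build_alpha(m, delta)
    ids = range(1, m + 1)
    # Question 1 (revealing): one auction per state, each with prob. 1/m.
    welfare1, R1 = F(0), {i: F(0) for i in ids}
    for t in ids:
        value, win, util = second_price({i: alpha[i][t] for i in ids})
        welfare1 += value / m
        R1[win] += util / m
    # Question 2 (uninformative): one auction on prior-averaged rates.
    avg = {i: sum(alpha[i].values()) / m for i in ids}
    welfare2, win2, util2 = second_price(avg)
    total_bid1 = sum(R1.values())
    total_bid2 = util2  # only the winner has positive utility
    chosen = 1 if total_bid1 >= total_bid2 else 2
    return welfare1, welfare2, total_bid1, total_bid2, chosen

w1, w2, bid1, bid2, chosen = analyze(10, F(1, 100))
```

With $m = 10$ and $\delta = 1/m^2$, the stage 1 rule selects the uninformative question even though the revealing question yields welfare 1.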
It is tempting to conjecture that perhaps one can bypass Theorem 4.4 by tweaking the mechanism. Arguably, the most natural choice would be to change the VCG payment in stage 1 to other natural payment formats, e.g., first-price payments, or all-pay payments (but still select the question with the highest total bid). As we show next, the result of Theorem 4.4 is robust to such changes. The proof of Theorem 4.5 is deferred to Appendix A.
Theorem 4.5. The conclusion of Theorem 4.4 continues to hold when the stage 1 mechanism, $M^{VCG}_Q$, is replaced by a mechanism that uses the same “highest sum of bids” allocation rule but with one of the following payment rules: (i) First-Price: Only the advertiser who submitted the single highest bid for the winning question pays that bid, and (ii) All-Pay: Every advertiser pays their submitted bid for the winning question.
In this work, we introduce a formal model for interactive sponsored search, capturing the growing trend of platforms eliciting user intent through suggestions before conducting a final ad auction. In this model, we analyzed two different design choices: an end-to-end, direct revelation mechanism that jointly optimizes the choice of suggestion and the subsequent ad allocation, and a simpler, modular two-stage mechanism that decouples the suggestion auction from the ad auction. Our results suggest that the direct revelation format provides overall better outcomes for the platform and advertisers.
Our work opens several avenues for future research. One direction is to move beyond welfare maximization and study mechanisms designed for revenue maximization. Further afield, a natural evolution of this technology would be hybrid systems that present a mix of organic and sponsored follow-up questions, possibly over multiple rounds of interaction. Analyzing the complex dynamics of such hybrid systems, and how they affect the choice of mechanism, is an important next step.
[...] not want to decrease its bid for the winning question: any decrease of $\epsilon > 0$ would result in a different question being asked. Currently, advertiser 1 gets utility $R_1(2; v) - \delta = \frac{1-\delta}{m} - \delta$, which is at least $R_1(1; v) = \frac{\delta}{m}$ for all $\delta < \frac{1}{m+2}$. Second, the only thing that an advertiser $i \neq 1$ can do to affect the outcome is to bid strictly more than $\frac{\delta}{m}$ on question 1. However, $R_i(1; v) = \frac{\delta}{m}$, so, since under both payment rules (first-price and all-pay) they will end up paying their bid, this deviation is not profitable. Finally, if advertiser $i$’s deviation, for $i \neq 1$, does not affect the outcome (question asked), then its payment cannot be further reduced (since it is already bidding zero for the winning question).
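The threshold on $\delta$ used in the deviation argument for advertiser 1 can be verified by a short rearrangement (valid for $m \geq 1$ and $\delta \geq 0$):

```latex
\[
  \frac{1-\delta}{m} - \delta \;\geq\; \frac{\delta}{m}
  \;\iff\; 1 - \delta - m\delta \;\geq\; \delta
  \;\iff\; 1 \;\geq\; (m+2)\,\delta
  \;\iff\; \delta \;\leq\; \frac{1}{m+2}.
\]
```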
Allowing advertiser interest to steer user conversation should be done with utmost care to ensure high user quality. Companies’ policies and regulations might limit what is practically feasible. Our paper should be viewed as theoretically understanding what such a design would look like. We don’t make any claims about feasibility from a policy or regulatory perspective, or about concrete plans of our employers.
This can be, e.g., a click-through rate, or a conversion value.
Nature samples the item’s state θ from D. θ is not revealed to the platform or the advertisers.
The advertisers possibly interact with the platform.