A Collaborative Mechanism for Crowdsourcing Prediction Problems


Machine Learning competitions such as the Netflix Prize have proven reasonably successful as a method of “crowdsourcing” prediction tasks. But these competitions have a number of weaknesses, particularly in the incentive structure they create for the participants. We propose a new approach, called a Crowdsourced Learning Mechanism, in which participants collaboratively “learn” a hypothesis for a given prediction task. The approach draws heavily from the concept of a prediction market, where traders bet on the likelihood of a future event. In our framework, the mechanism continues to publish the current hypothesis, and participants can modify this hypothesis by wagering on an update. The critical incentive property is that a participant will profit an amount that scales according to how much her update improves performance on a released test set.


💡 Research Summary

The paper begins by critiquing the incentive structure of traditional machine‑learning competitions such as the Netflix Prize. Those contests reward a single winner, require participants to keep their methods private, and often waste the contributions of many teams because the “winner‑takes‑all” prize discourages collaboration and sharing. To address these shortcomings, the authors propose a novel framework called a Crowdsourced Learning Mechanism (CLM), which adapts the core ideas of prediction markets to the problem of learning a hypothesis.

A CLM is defined by a tuple (H, O, Cost, Payout). H is a convex hypothesis space, O is the outcome space (e.g., a test set revealed after the mechanism closes), Cost(w, w′) is the fee charged when a participant updates the current hypothesis w to w′, and Payout(w, w′; X) is the reward paid after the true outcome X is observed. The mechanism repeatedly publishes the current hypothesis w_t; any participant may propose an update w_t → w′ and pays the associated Cost. After a predetermined number of rounds, the test data X is released and each participant receives the corresponding Payout.
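The round structure above can be sketched in a few lines of Python. This is an illustrative toy, not the paper's implementation: the class and method names are invented, the cost rule is a flat placeholder fee, and the payout follows the paper's incentive idea that a trader's profit equals the loss improvement her update produces on the released test set.

```python
# Hypothetical sketch of a Crowdsourced Learning Mechanism (CLM) round loop.
# Names (CLM, trade, settle) and the flat-fee cost rule are illustrative only;
# the paper derives Cost/Payout pairs jointly from a generalized scoring rule.

class CLM:
    def __init__(self, w0, loss):
        self.w = w0          # current published hypothesis (a scalar here)
        self.loss = loss     # generalized scoring rule L(w, X)
        self.trades = []     # log of (w_old, w_new, trader, fee)

    def cost(self, w_new):
        # Placeholder cost rule: a flat fee of zero.
        return 0.0

    def trade(self, trader, w_new):
        # A participant moves the published hypothesis w -> w_new and pays Cost.
        fee = self.cost(w_new)
        self.trades.append((self.w, w_new, trader, fee))
        self.w = w_new
        return fee

    def settle(self, X):
        # Once the test set X is revealed, each trader's net profit is the
        # loss improvement her update produced, minus the fee she paid.
        profits = {}
        for w_old, w_new, trader, fee in self.trades:
            gain = self.loss(w_old, X) - self.loss(w_new, X) - fee
            profits[trader] = profits.get(trader, 0.0) + gain
        return profits

# Squared loss over a scalar test set: L(w, X) = mean((x - w)^2)
def sq(w, X):
    return sum((x - w) ** 2 for x in X) / len(X)

m = CLM(w0=0.0, loss=sq)
m.trade("alice", 1.0)   # moves the hypothesis toward the truth
m.trade("bob", 2.0)
print(m.settle([2.0, 2.0, 2.0]))  # alice earns 3.0, bob earns 1.0
```

Note how the settlement rule makes each participant's reward depend only on how much her own update improved the published hypothesis, which is the key incentive property the abstract describes.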

The incentive design hinges on a Generalized Scoring Rule (GSR) L : H × O → ℝ, which the mechanism designer selects to reflect the loss function of the learning task. A GSR is a continuous function such that, for every distribution P over outcomes, the set of hypotheses minimizing the expected loss, W_L(P) = arg min_{w∈H} E_{X∼P}[L(w, X)], is nonempty.
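As a concrete illustration of the expected-loss minimizer W_L(P), consider squared loss L(w, X) = (X − w)², for which the minimizer is the mean of P. The snippet below is a quick numerical check of this standard fact; the distribution and grid search are assumptions chosen purely for illustration, not anything from the paper.

```python
# Sketch: for squared loss L(w, X) = (X - w)^2, the expected-loss
# minimizer W_L(P) = argmin_w E_{X~P}[L(w, X)] is the mean of P.

outcomes = [0.0, 1.0, 2.0]   # support of an example distribution P
probs    = [0.2, 0.5, 0.3]   # P(X = outcome)

def expected_loss(w):
    return sum(p * (x - w) ** 2 for x, p in zip(outcomes, probs))

mean = sum(p * x for x, p in zip(outcomes, probs))   # E[X] = 1.1

# A grid search over [0, 2] confirms the minimizer sits at the mean.
grid = [i / 1000 for i in range(2001)]
best = min(grid, key=expected_loss)
print(best, mean)  # both 1.1
```

In the CLM setting, this is what ties the market to the learning task: a trader who believes the outcome distribution is P maximizes her expected profit by moving the published hypothesis toward an element of W_L(P).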

