No Last Mile: A Theory of the Human Data Market


The standard framing treats structured human-data work as transitional, a bridge between today’s imperfect models and a future state where automation is complete. We challenge this view by modeling structured human data as a persistent production input: evaluation, rubric-based judgment, auditing, exception handling, and continual updates that convert raw model capability into dependable, deployable performance. These activities accumulate into a reusable AI capability stock that raises productivity by improving reliability on existing tasks and by expanding the frontier of task families for which AI can be used at high confidence. Crucially, this capability stock depreciates as tasks and contexts drift, standards evolve, and new edge cases emerge. In a tractable baseline model, an interior steady state implies a closed-form, strictly positive long-run labor share devoted to structured human-data work whenever depreciation is positive, a “no last mile” result in which maintenance demand persists even as models improve. We then microfound aggregate capability with a portfolio of task families featuring diminishing returns, frontier entry, and complementarity, generating reallocation toward low-maturity and bottleneck families and a Roy-style mechanism for within-structured wage dispersion. Finally, we map model objects to observable proxies using standard data layers, and provide a conservative calibration suggesting a 5–7% steady-state structured labor share in the long run.


💡 Research Summary

The paper “No Last Mile: A Theory of the Human Data Market” challenges the common view that structured human‑data work (such as evaluation, rubric‑based judgment, auditing, and exception handling) is merely a transitional bridge to fully automated AI systems. Instead, the authors model this work as a persistent production input that creates a reusable “AI capability stock,” denoted \(k\). This stock raises aggregate productivity by improving reliability on existing AI‑exposed tasks and by expanding the frontier of task families that can be deployed with high confidence. Crucially, the capability stock depreciates over time because of technological, environmental, and organizational drift, collectively captured by a depreciation rate \(\delta_k\).

The baseline model assumes a fixed total labor endowment \(\bar L\) that can be allocated between an unstructured production sector (\(U\)) and a structured human‑data sector (\(S\)). Output follows a Cobb‑Douglas form \(Y_t = A(k_t)\,K^{\alpha} L_{U,t}^{1-\alpha}\), where \(A(k)=\bar A k^{\gamma}\) with \(0<\gamma<1\). Structured labor accumulates capability according to \(k_{t+1} = (1-\delta_k)k_t + \eta L_{S,t}\). Wages are set by marginal products: the unstructured wage is \(w_U = (1-\alpha)A(k)K^{\alpha}L_U^{-\alpha}\); the structured wage equals the marginal value of an extra unit of capability, \(w_S = \eta v\), where \(v = \frac{\partial Y/\partial k}{r+\delta_k}\).
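To make the accumulation and wage-setting mechanics concrete, here is a minimal simulation sketch in Python. Every parameter value below, as well as the fixed 10% structured allocation, is an illustrative assumption for demonstration, not a value taken from the paper.

```python
# Minimal simulation sketch of the baseline model.
# All parameter values below are illustrative assumptions,
# not the paper's calibration.

A_BAR = 1.0      # TFP scale, A(k) = A_bar * k**gamma
ALPHA = 0.33     # capital share
GAMMA = 0.05     # capability elasticity, 0 < gamma < 1
ETA = 0.5        # accumulation efficiency
DELTA_K = 0.15   # capability depreciation rate
R = 0.04         # interest rate
K = 1.0          # fixed capital stock
L_BAR = 1.0      # total labor endowment

def output(k: float, l_u: float) -> float:
    """Y = A(k) * K^alpha * L_U^(1 - alpha), with A(k) = A_bar * k^gamma."""
    return A_BAR * k**GAMMA * K**ALPHA * l_u**(1 - ALPHA)

def wages(k: float, l_u: float) -> tuple[float, float]:
    """Marginal-product wages in the two sectors."""
    y = output(k, l_u)
    w_u = (1 - ALPHA) * y / l_u          # dY/dL_U
    v = (GAMMA * y / k) / (R + DELTA_K)  # unit capability value: (dY/dk)/(r + delta_k)
    w_s = ETA * v                        # structured wage, w_S = eta * v
    return w_u, w_s

# Iterate k_{t+1} = (1 - delta_k) * k_t + eta * L_S under a fixed,
# illustrative 10% structured allocation; k converges to eta * L_S / delta_k.
l_s = 0.10 * L_BAR
k = 0.1
for _ in range(200):
    k = (1 - DELTA_K) * k + ETA * l_s

w_u, w_s = wages(k, L_BAR - l_s)
print(f"long-run k = {k:.3f}, w_U = {w_u:.3f}, w_S = {w_s:.3f}")
```

Because the allocation is held fixed here, \(w_U\) and \(w_S\) generally differ; the interior equilibrium discussed next is precisely the allocation at which they coincide.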

Imposing the interior‑equilibrium condition \(w_U = w_S\), together with the steady‑state requirement that accumulation offsets depreciation, \(\eta L_S^{*} = \delta_k k^{*}\), yields a closed‑form steady‑state share of labor devoted to structured work:

\[
\frac{L_S^{*}}{\bar L} = \frac{\gamma\,\delta_k}{\gamma\,\delta_k + (1-\alpha)\,(r+\delta_k)}.
\]

The share is strictly positive whenever \(\delta_k > 0\): as long as capability depreciates, some labor is permanently devoted to maintaining it, which is the “no last mile” result.
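As a numerical sanity check on this formula, the snippet below evaluates the share under a few depreciation rates; the values of \(\alpha\), \(\gamma\), \(r\), and \(\delta_k\) are assumptions chosen for demonstration, not the paper's conservative calibration.

```python
# Evaluate the closed-form steady-state structured labor share.
# Parameter values are illustrative assumptions, not the paper's calibration.

def structured_share(alpha: float, gamma: float, delta_k: float, r: float) -> float:
    """L_S*/L_bar = gamma*delta_k / (gamma*delta_k + (1 - alpha)*(r + delta_k))."""
    return gamma * delta_k / (gamma * delta_k + (1 - alpha) * (r + delta_k))

for delta_k in (0.10, 0.15, 0.20):
    share = structured_share(alpha=0.33, gamma=0.05, delta_k=delta_k, r=0.04)
    print(f"delta_k = {delta_k:.2f} -> structured share = {share:.1%}")
```

Under these assumed values the share lands between roughly 5% and 6%, in the same ballpark as the 5–7% range quoted in the abstract, and it falls to zero as \(\delta_k \to 0\).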

