Future of Work with AI Agents: Auditing Automation and Augmentation Potential across the U.S. Workforce

Notice: This research summary and analysis were automatically generated using AI technology. For absolute accuracy, please refer to the original arXiv source.

The rapid rise of compound AI systems (a.k.a., AI agents) is reshaping the labor market, raising concerns about job displacement, diminished human agency, and overreliance on automation. Yet, we lack a systematic understanding of the evolving landscape. In this paper, we address this gap by introducing a novel auditing framework to assess which occupational tasks workers want AI agents to automate or augment, and how those desires align with the current technological capabilities. Our framework features an audio-enhanced mini-interview to capture nuanced worker desires and introduces the Human Agency Scale (HAS) as a shared language to quantify the preferred level of human involvement. Using this framework, we construct the WORKBank database, building on the U.S. Department of Labor’s O*NET database, to capture preferences from 1,500 domain workers and capability assessments from AI experts across over 844 tasks spanning 104 occupations. Jointly considering the desire and technological capability divides tasks in WORKBank into four zones: Automation “Green Light” Zone, Automation “Red Light” Zone, R&D Opportunity Zone, Low Priority Zone. This highlights critical mismatches and opportunities for AI agent development. Moving beyond a simple automate-or-not dichotomy, our results reveal diverse HAS profiles across occupations, reflecting heterogeneous expectations for human involvement. Moreover, our study offers early signals of how AI agent integration may reshape the core human competencies, shifting from information-focused skills to interpersonal ones. These findings underscore the importance of aligning AI agent development with human desires and preparing workers for evolving workplace dynamics.


💡 Research Summary

This paper tackles the pressing question of how emerging AI agents—autonomous, tool‑using systems built on large language models—will reshape the U.S. labor market. Existing studies have either focused on a handful of occupations or taken a capital‑centric view that neglects workers’ preferences. To fill this gap, the authors design a task‑level auditing framework that simultaneously captures (1) workers’ desire for automation or augmentation and (2) experts’ assessments of current AI capability.

The framework draws occupational tasks from the U.S. Department of Labor’s O*NET database, selecting 844 complex, multi‑step tasks across 104 occupations. For each task, an “audio‑enhanced mini‑interview” invites participants to speak about their work and attitudes toward AI, followed by two Likert‑scale questions: (a) automation desire (Aw) and (b) the preferred level of human involvement on the newly introduced Human Agency Scale (HAS, H1–H5). The audio component is meant to elicit richer, more reflective responses than plain text. Workers must confirm familiarity with each task, ensuring that ratings are grounded in real experience.

In parallel, 52 AI researchers and developers evaluate the same tasks for “technological capability” (Ct), i.e., how well current AI agents could perform the task autonomously or with assistance. The dual‑perspective data are merged into the AI Agent Worker Outlook & Readiness Knowledge Bank (WORKBank), a publicly available dataset that links worker preferences, expert capability judgments, and O*NET metadata.

By crossing Aw and Ct, the authors map tasks onto a 2 × 2 matrix that yields four zones:

  1. Automation “Green Light” (high desire, high capability) – tasks such as scheduling, basic data entry, and routine report generation where workers want automation and current AI can deliver it.
  2. Automation “Red Light” (low desire, high capability) – technically feasible tasks that workers prefer to keep under human control, notably many software‑development and business‑analysis activities.
  3. R&D Opportunity (high desire, low capability) – tasks where workers strongly want AI assistance but existing technology falls short (e.g., complex decision‑making, nuanced client communication, strategic planning).
  4. Low Priority (low desire, low capability) – tasks with little interest from both sides, often low‑value administrative chores.

The analysis shows that roughly 41% of the task–occupation mappings fall into the Red Light or Low Priority zones, indicating a misalignment between current AI investment (which heavily targets software and analytics) and worker preferences.
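The four-zone assignment amounts to a threshold rule over the two scores. A minimal sketch follows; note that the midpoint cutoff of 3 on a 1–5 Likert scale is an assumption for illustration, not the paper's actual criterion:

```python
# Hypothetical sketch of the WORKBank four-zone mapping.
# Assumption: Aw and Ct are on a 1-5 Likert scale and a task counts
# as "high" on either axis at or above the midpoint of 3. The paper's
# exact aggregation and cutoffs may differ.

def classify_task(automation_desire: float, capability: float,
                  threshold: float = 3.0) -> str:
    """Map a task's worker automation desire (Aw) and expert-rated
    technological capability (Ct) onto one of the four zones."""
    high_desire = automation_desire >= threshold
    high_capability = capability >= threshold
    if high_desire and high_capability:
        return "Automation Green Light"   # workers want it, AI can do it
    if high_capability:
        return "Automation Red Light"     # feasible, but workers resist
    if high_desire:
        return "R&D Opportunity"          # wanted, but AI falls short
    return "Low Priority"                 # neither wanted nor feasible

# Example: a task workers want automated (Aw = 4.2) that current AI
# handles poorly (Ct = 2.1) lands in the R&D Opportunity zone.
print(classify_task(4.2, 2.1))  # R&D Opportunity
```

In practice the per-task Aw and Ct values would be averages over multiple worker and expert raters, but the zone logic itself reduces to this 2 × 2 comparison.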

The Human Agency Scale (HAS) provides a nuanced language for the automation–augmentation continuum. Across occupations, 45.2% of workers favor H3 (equal partnership), suggesting that collaborative human‑AI workflows will dominate. However, many sectors—especially healthcare, education, and customer‑facing roles—lean toward H4 or H5, emphasizing that human judgment and oversight remain essential.

A further contribution is the mapping of tasks to core skill categories and wage levels. The authors find a shift: high‑wage “information‑processing” skills are becoming more automatable and thus less central, while “interpersonal,” “organizational,” and “leadership” skills are gaining relative importance and are associated with higher HAS levels. This suggests that AI agents will offload routine cognition, freeing workers to focus on social and strategic competencies.

Policy and managerial implications are drawn from these findings. Companies should rapidly deploy AI in Green Light tasks to capture productivity gains, while allocating R&D resources to the Opportunity zone to address worker‑driven demand. Policymakers are urged to design upskilling and transition programs that emphasize the emerging interpersonal skill set and to create safeguards that preserve human agency where workers deem it critical.

Limitations include the U.S.-centric sample, potential response bias despite audio prompts, and the reliance on expert judgments that may quickly become outdated as foundation models evolve.

In sum, the paper delivers a comprehensive, worker‑centered audit of AI agent readiness, introduces a novel human‑agency metric, and provides actionable insights for researchers, developers, firms, and policymakers navigating the future of work with AI.

