Usage, Effects and Requirements for AI Coding Assistants in the Enterprise: An Empirical Study

Notice: This research summary and analysis were automatically generated using AI technology. For absolute accuracy, please refer to the [Original Paper Viewer] below or the Original ArXiv Source.

The rise of large language models (LLMs) has accelerated the development of automated techniques and tools for supporting various software engineering tasks, e.g., program understanding, code generation, software testing, and program repair. As CodeLLMs are being employed toward automating these tasks, one question that arises, especially in enterprise settings, is whether these coding assistants and the code LLMs that power them are ready for real-world projects and enterprise use cases, and how they impact the existing software engineering process and user experience. In this paper, we survey 57 developers from different domains and with varying software engineering skills about their experience with AI coding assistants and CodeLLMs. We also review 35 user surveys on the usage, experience, and expectations of professionals and students using AI coding assistants and CodeLLMs. Based on our study findings and analysis of existing surveys, we discuss the requirements for AI-powered coding assistants.


💡 Research Summary

The paper “Usage, Effects and Requirements for AI Coding Assistants in the Enterprise: An Empirical Study” investigates whether AI‑driven coding assistants, powered by large language models (LLMs), are ready for real‑world enterprise projects and how they influence software development processes. The authors combine two research strands: (1) an original survey of 57 developers from a large technology company, and (2) a systematic meta‑analysis of 35 publicly available user‑survey studies published between 2023 and 2025.

Survey of 57 practitioners
The internal survey was conducted in May 2025 and collected responses from a diverse set of employees across finance, operations, software engineering, and research divisions. Participants varied widely in experience (from less than five years to more than twenty years) and in programming language expertise (Python, Java, JavaScript dominate, but C, Go, Rust, COBOL, etc., are also represented). Respondents reported using up to three AI tools, most commonly GitHub Copilot, ChatGPT (web interface), IBM watsonx Code Assistant, and Gemini. The questionnaire comprised 25 items covering motivation, most‑beneficial help type, perceived productivity gains, security and trust concerns, and desired future features.

Key findings from this survey:

  • Productivity – 68 % of respondents reported a 12‑25 % speed‑up in coding tasks, especially for repetitive boilerplate generation and unit‑test creation.
  • Primary use cases – code completion/generation, unit‑test generation, debugging assistance, and code explanation were the most frequently used capabilities.
  • Security & trust – the biggest worry was the possibility of generated code containing vulnerabilities, followed by concerns about licensing and intellectual‑property compliance.
  • Desired enhancements – participants asked for explainable suggestions (natural‑language rationale), project‑specific prompt persistence, deeper IDE integration for pull‑request review, and agentic automation that can perform multi‑step tasks such as bug fixing or CI/CD orchestration.

Meta‑analysis of 35 surveys
The authors performed a systematic literature search using terms like “AI coding assistant survey” and filtered for studies that involved human participants, focused on software‑engineering tasks, and were publicly accessible. After screening, 35 papers (2 from 2023, 14 from 2024, 19 from 2025) remained. To extract structured data at scale, two state‑of‑the‑art LLMs (Gemini 2.5 Pro and Claude Sonnet 4) were prompted to parse each paper for: (a) AI tools examined, (b) supported SE tasks, (c) participant professions, and (d) study goals. The LLM‑generated extracts were manually verified.
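The extraction pipeline described above can be sketched as a prompt-and-validate loop. The schema below mirrors the four fields the authors extracted (tools, tasks, professions, goals); the function names, field names, and prompt wording are illustrative assumptions, not the paper's actual prompts, and the LLM call itself is left out:

```python
import json

# The four fields extracted from each survey paper in the meta-analysis:
# (a) AI tools examined, (b) SE tasks, (c) participant professions, (d) study goals.
# Field names and descriptions here are illustrative, not the authors' exact schema.
EXTRACTION_SCHEMA = {
    "tools": "list of AI coding assistants examined",
    "tasks": "list of software-engineering tasks studied",
    "professions": "list of participant professions",
    "goals": "short description of the study's goals",
}


def build_extraction_prompt(paper_text: str) -> str:
    """Assemble a prompt asking an LLM to return the schema as one JSON object."""
    field_lines = "\n".join(f"- {k}: {v}" for k, v in EXTRACTION_SCHEMA.items())
    return (
        "Extract the following fields from the survey paper below and "
        "respond with a single JSON object:\n"
        f"{field_lines}\n\nPAPER:\n{paper_text}"
    )


def parse_extraction(llm_response: str) -> dict:
    """Parse the LLM's JSON reply and check all schema keys are present.

    A lightweight stand-in for the manual-verification step the paper describes.
    """
    record = json.loads(llm_response)
    missing = [k for k in EXTRACTION_SCHEMA if k not in record]
    if missing:
        raise ValueError(f"extraction missing fields: {missing}")
    return record
```

Validating the parsed record against a fixed schema is what makes extracts from two different LLMs (here, Gemini 2.5 Pro and Claude Sonnet 4) directly comparable before human review.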

Aggregated results reveal several systematic patterns:

  • Tool concentration – Over 70 % of the surveyed studies examined only ChatGPT or GitHub Copilot, leaving a sparse evidence base for newer assistants such as Gemini, Claude, Amazon Q, or specialized enterprise tools.
  • Task focus – Code generation, unit‑test generation, debugging, and code‑understanding dominate (≈80 % of all task mentions). Quality‑assurance or security‑focused tasks appear in only four studies, indicating a research gap.
  • Participant homogeneity – The majority of respondents are software engineers; only a small fraction are students, researchers, or non‑technical professionals. No study compared usage across different roles within the same organization.
  • Temporal scope – Most surveys are short‑term, measuring immediate productivity or single‑task outcomes. Longitudinal studies that assess maintainability, technical debt, or organizational ROI are virtually absent.
  • Agentic workflow omission – Despite the emergence of “agentic” coding assistants (e.g., Cursor, Replit, RooCode) that can execute multi‑step operations, none of the 35 surveys explicitly addressed user experiences with such capabilities.

Derived requirements for enterprise‑grade AI coding assistants
Synthesizing the internal survey and the meta‑analysis, the authors propose eight high‑level requirements that future tools should satisfy:

  1. Security & vulnerability analysis – Integrated static/dynamic analysis and real‑time vulnerability scanning of generated code.
  2. Explainability – Automatic natural‑language justification for each suggestion to aid debugging and learning.
  3. Contextual personalization – Persistent, project‑specific prompt and configuration storage to reduce repetitive prompt engineering.
  4. Agentic automation – Ability to orchestrate multi‑step tasks such as bug fixing, refactoring, or CI/CD pipeline updates without manual intervention.
  5. Broad language & platform support – Equal coverage for legacy languages (COBOL, Fortran) and emerging ones (Rust, Go).
  6. Collaboration & version‑control integration – Real‑time suggestions during pull‑request review, automated code‑review comments, and seamless Git integration.
  7. Educational scaffolding – Tiered tutorials, code walkthroughs, and “explain‑as‑you‑type” features for novices and students.
  8. Governance & compliance – Automated detection of licensing conflicts, copyright attribution, and policy‑driven usage controls.
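To make requirement 3 (contextual personalization) concrete, a minimal sketch of persistent, project-specific prompt storage might look like the following. The file name `.assistant_prompts.json` and both function names are hypothetical choices for illustration, not an API from any of the tools surveyed:

```python
import json
from pathlib import Path

# Hypothetical per-project store; a real assistant might use its own config dir.
PROMPT_FILE = ".assistant_prompts.json"


def save_project_prompts(project_dir: str, prompts: dict) -> Path:
    """Persist project-specific prompts so they survive across sessions,
    sparing the user repetitive prompt engineering."""
    path = Path(project_dir) / PROMPT_FILE
    path.write_text(json.dumps(prompts, indent=2))
    return path


def load_project_prompts(project_dir: str) -> dict:
    """Load stored prompts for a project; return an empty dict if none exist."""
    path = Path(project_dir) / PROMPT_FILE
    if not path.exists():
        return {}
    return json.loads(path.read_text())
```

For example, saving `{"style": "Follow PEP 8", "tests": "Use pytest"}` once per repository would let the assistant prepend those instructions to every request without the developer restating them.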

Limitations
The authors acknowledge several constraints: the internal survey’s modest sample size (57) limits statistical generalizability; reliance on LLM‑based extraction may introduce parsing errors; and the meta‑analysis lacks quantitative performance metrics (e.g., defect density reduction, cost savings). Moreover, the surveyed literature is heavily skewed toward North‑American and European contexts, leaving regional usage patterns under‑explored.

Future research directions
The paper calls for:

  • Larger, multi‑industry, multi‑regional longitudinal studies that track code quality, maintenance effort, and business ROI over months or years.
  • Systematic evaluation of agentic assistants, including user‑experience studies and benchmark comparisons across tools.
  • Empirical studies that involve non‑developer stakeholders (e.g., data scientists, system administrators) to broaden the understanding of cross‑functional adoption.
  • Development of standardized metrics for security, explainability, and governance that can be incorporated into tool evaluation frameworks.

Conclusion
AI coding assistants are already delivering measurable productivity gains in enterprise settings, especially for repetitive coding, test generation, and debugging. However, the current research landscape is narrowly focused on a few dominant tools, short‑term tasks, and homogeneous developer populations. To mature these assistants for broader, mission‑critical enterprise adoption, vendors and researchers must address security, explainability, personalization, agentic automation, and governance. The eight requirements articulated in this study provide a concrete roadmap for both academic research and product development, aiming to bridge the gap between promising LLM capabilities and the rigorous demands of real‑world software engineering.

