How much are LLMs changing the language of academic papers after ChatGPT? A multi-database and full text analysis
This study investigates how Large Language Models (LLMs) are influencing the language of academic papers by tracking 12 LLM-associated terms across six major scholarly databases (Scopus, Web of Science, PubMed, PubMed Central (PMC), Dimensions, and OpenAlex) from 2015 to 2024. Using over 2.4 million PMC open-access publications (2021-July 2025), we also analysed full texts to assess changes in the frequency and co-occurrence of these terms before and after ChatGPT’s initial public release. Across databases, delve (+1,500%), underscore (+1,000%), and intricate (+700%) had the largest increases between 2022 and 2024. Growth in LLM-term usage was much higher in STEM fields than in social sciences and arts and humanities. In PMC full texts, the proportion of papers using underscore six or more times increased by over 10,000% from 2022 to 2025, followed by intricate (+5,400%) and meticulous (+2,800%). Nearly half of all 2024 PMC papers using any LLM term also included underscore, compared with only 3%-14% of papers before ChatGPT in 2022. Papers using one LLM term are now much more likely to include other terms. For example, in 2024, underscore strongly correlated with pivotal (0.449) and delve (0.311), compared with very weak associations in 2022 (0.032 and 0.018, respectively). These findings provide the first large-scale evidence based on full-text publications and multiple databases that some LLM-related terms are now being used much more frequently and in combination. The rapid uptake of LLMs to support scholarly publishing is a welcome development, reducing the language barrier to academic publishing for non-English speakers.
💡 Research Summary
This paper provides a comprehensive, data-driven investigation of how large language models (LLMs), epitomized by the public release of ChatGPT, are reshaping the vocabulary of scholarly articles. The authors adopt a two-pronged approach: (1) a metadata analysis across six major bibliographic databases (Scopus, Web of Science, PubMed, PubMed Central, Dimensions, and OpenAlex) covering 2015-2024, and (2) a full-text analysis of more than 2.4 million open-access papers from PubMed Central (PMC) spanning 2021 to July 2025. Twelve LLM-associated terms were tracked, among them “delve”, “underscore”, “intricate”, “meticulous”, and “pivotal”.
For the database-level study, the authors extracted term frequencies from titles, abstracts, and author-provided keywords, then aggregated counts by year and by broad disciplinary category (STEM versus the social sciences and arts and humanities). Growth rates were estimated by fitting linear regressions to log-transformed counts, and differences among fields were tested with ANOVA and post-hoc Tukey comparisons.
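As an illustration of that growth-rate estimation, the sketch below fits a linear regression to log-transformed yearly counts. The counts, variable names, and `annual_growth_rate` function are hypothetical stand-ins for this summary, not code from the paper.

```python
import numpy as np
from scipy import stats

def annual_growth_rate(years, counts):
    """Fit log(count) ~ year by least squares and return the implied
    annual growth rate (e.g., 0.25 means +25% per year)."""
    log_counts = np.log(np.asarray(counts, dtype=float))
    slope, intercept, r, p, se = stats.linregress(years, log_counts)
    # A slope b on the log scale is a multiplicative factor exp(b)
    # per year, i.e., an annual growth rate of exp(b) - 1.
    return np.exp(slope) - 1.0

# Illustrative yearly counts of papers containing a term (made up)
years = [2020, 2021, 2022, 2023, 2024]
counts = [900, 1000, 1100, 7500, 17600]
print(f"Estimated annual growth rate: {annual_growth_rate(years, counts):+.1%}")
```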
The PMC full-text component involved rigorous preprocessing: HTML stripping, tokenization, lemmatization, and a context-aware filter to separate stylistic uses of ambiguous words (e.g., “underscore” as a verb rather than the punctuation character) from unrelated senses. Term occurrences per article were counted, and co-occurrence matrices were built. Pearson correlation coefficients quantified the strength of pairwise associations, and the authors compared the pre-ChatGPT window (up to 2022) with the post-ChatGPT window (2023-2025).
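A minimal reconstruction of the co-occurrence step might look like the following; the regex matching, sample texts, and shortened term list are simplifications assumed for illustration, not the authors’ actual pipeline.

```python
import re
import numpy as np
import pandas as pd

TERMS = ["delve", "underscore", "intricate", "meticulous", "pivotal"]

def term_presence(full_text: str) -> dict:
    """Return a 0/1 indicator per term; word-boundary stem matching is
    a crude stand-in for the paper's lemmatization and context filter."""
    text = full_text.lower()
    return {t: int(bool(re.search(rf"\b{t}\w*\b", text))) for t in TERMS}

# Hypothetical corpus of full texts (placeholders for PMC articles)
corpus = [
    "These findings underscore the pivotal role of ...",
    "We delve into the intricate mechanisms that ...",
    "A meticulous analysis underscores pivotal gaps ...",
]

df = pd.DataFrame([term_presence(doc) for doc in corpus])
# Pairwise Pearson correlation between the binary indicators
print(df.corr(method="pearson"))
```

On 0/1 indicators, Pearson’s r reduces to the phi coefficient, so a value like 0.449 between two terms can be read directly as how much more likely they are to appear in the same paper.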
Key findings include: (i) Across all databases, “delve”, “underscore”, and “intricate” exhibited the largest surges between 2022 and 2024, with relative increases of roughly 1,500%, 1,000%, and 700% respectively; growth was markedly higher in STEM fields than in the social sciences and arts and humanities. (ii) In the PMC full-text corpus, the proportion of papers containing “underscore” six or more times rose by more than 10,000% from 2022 to 2025, while “intricate” and “meticulous” increased by 5,400% and 2,800% respectively. By 2024, nearly half of all papers using any LLM term also featured “underscore”, up from only 3%-14% before ChatGPT in 2022. (iii) Co-occurrence patterns have strengthened dramatically: in 2024, “underscore” correlated with “pivotal” at r = 0.449 (versus 0.032 in 2022) and with “delve” at r = 0.311 (versus 0.018 in 2022), indicating that the presence of one LLM-related term now makes the inclusion of additional terms substantially more likely. (iv) The authors argue that this lexical diffusion reflects a broader adoption of LLM-assisted writing tools, which may lower language barriers for non-English-speaking scholars and enrich the expressive repertoire of academic discourse.
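For orientation, the percentage figures above are consistent with the usual relative-change calculation (assumed here, since the paper’s exact computation is not reproduced in this summary):

$$\text{relative increase} = \frac{n_{\text{after}} - n_{\text{before}}}{n_{\text{before}}} \times 100\%$$

so growth from, say, 100 term-bearing papers in 2022 to 1,600 in 2024 would register as +1,500%.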
The study acknowledges several limitations. Database coverage and metadata quality vary, potentially biasing term counts. Ambiguous terms may still be mis‑classified despite context filters, inflating apparent usage. The PMC analysis is restricted to open‑access literature, leaving subscription‑based journals under‑represented.
Future research directions proposed include: (a) applying semantic network analysis to trace how LLM‑related concepts evolve and interlink over time; (b) conducting finer‑grained disciplinary analyses to uncover adoption patterns in humanities and social sciences; and (c) complementing quantitative metrics with qualitative surveys or interviews of authors to capture perceived benefits and challenges of LLM‑driven writing assistance.
In sum, this work delivers the first large‑scale, multi‑database and full‑text evidence that LLM‑related terminology is not only increasing in frequency but also clustering together within scholarly articles, especially in STEM domains. The rapid lexical uptake signals a transformative moment for academic publishing, suggesting that LLM technologies are beginning to reshape how researchers communicate, potentially democratizing access to scientific discourse for a more globally diverse research community.