AI Corruption of Science Papers Becomes Evident

Science writer Charles Blue reports on an effort to track the use of AI to write papers — and the sobering results. From Phys.org:
To shed light on just how widespread LLM content is in academic writing, a team of U.S. and German researchers analyzed more than 15 million biomedical abstracts on PubMed to determine if LLMs have had a detectable impact on specific word choices in journal articles.
Their investigation revealed that since the emergence of LLMs there has been a corresponding increase in the frequency of certain stylistic word choices within the academic literature. These data suggest that at least 13.5% of the papers published in 2024 were written with some amount of LLM processing. The results appear in the open-access journal Science Advances.
“Massive study detects AI fingerprints in millions of scientific papers,” July 6, 2025
Analyzing language patterns, the authors found that, after chatbots (large language models or LLMs) became available three years ago, writing styles shifted significantly: away from the excess use of "content words" to an excess use of "stylistic and flowery" word choices, such as "showcasing," "pivotal," and "grappling."
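The detection idea itself is simple enough to sketch: count how often the flagged style words appear in each year's abstracts, then compare the post-LLM frequency against a pre-LLM baseline; the gap is the "excess" usage. The Python sketch below illustrates that calculation under stated assumptions. The three marker words, the toy corpus, and the plain baseline average are illustrative stand-ins, not the authors' actual pipeline, which models counterfactual frequencies for hundreds of individual words across 15 million abstracts.

    import re

    # A few of the style words the study flags; the real analysis tracks
    # hundreds of words. This tiny set is purely illustrative.
    STYLE_WORDS = {"showcasing", "pivotal", "grappling"}

    def word_frequency(abstracts, words):
        """Fraction of abstracts containing at least one marker word."""
        if not abstracts:
            return 0.0
        hits = sum(1 for text in abstracts
                   if words & set(re.findall(r"[a-z]+", text.lower())))
        return hits / len(abstracts)

    def excess_frequency(abstracts_by_year, words, baseline_years, target_year):
        """Observed target-year frequency minus the pre-LLM baseline mean.

        The paper projects a counterfactual trend from earlier years;
        a simple baseline average is a deliberate simplification here.
        """
        baseline = sum(word_frequency(abstracts_by_year[y], words)
                       for y in baseline_years) / len(baseline_years)
        return word_frequency(abstracts_by_year[target_year], words) - baseline

    # Toy usage with made-up abstracts:
    corpus = {
        2021: ["We measured enzyme kinetics in vitro."],
        2022: ["Mutation rates were estimated from sequence data."],
        2024: ["These pivotal findings, showcasing a novel pathway..."],
    }
    print(excess_frequency(corpus, STYLE_WORDS, [2021, 2022], 2024))  # 1.0

Because careful LLM users may strip such telltale words (as the authors concede below), any estimate built this way is only a floor, which is why the paper reports 13.5% as a lower bound.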
Considerable Discrepancy Between Fields
From the open-access paper:
Large language models (LLMs) like ChatGPT can generate and revise text with human-level performance. These models come with clear limitations, can produce inaccurate information, and reinforce existing biases. Yet, many scientists use them for their scholarly writing. But how widespread is such LLM usage in the academic literature? To answer this question for the field of biomedical research, we present an unbiased, large-scale approach: We study vocabulary changes in more than 15 million biomedical abstracts from 2010 to 2024 indexed by PubMed and show how the appearance of LLMs led to an abrupt increase in the frequency of certain style words. This excess word analysis suggests that at least 13.5% of 2024 abstracts were processed with LLMs. This lower bound differed across disciplines, countries, and journals, reaching 40% for some subcorpora. We show that LLMs have had an unprecedented impact on scientific writing in biomedical research, surpassing the effect of major world events such as the COVID pandemic.
Dmitry Kobak et al., "Delving into LLM-assisted writing in biomedical publications through excess vocabulary." Sci. Adv. 11, eadt3813 (2025). DOI: 10.1126/sciadv.adt3813
The authors noted that their detection method turned up considerable discrepancies between fields of study and suggested reasons for them:
Our estimated lower bound on LLM usage ranged from below 5% to more than 40% across different PubMed-indexed research fields, affiliation countries, and journals. This heterogeneity could correspond to actual differences in LLM adoption. For example, the high lower bound on LLM usage in computational fields (20%) could be due to computer science researchers being more familiar with and willing to adopt LLM technology. In non-English speaking countries, LLMs can help authors with editing English texts, which could justify their extensive use. Last, authors publishing in journals with expedited and/or simplified review processes might be reaching for LLMs to write low-effort articles.
Through excess vocabulary
But they qualify their assumptions:
It is possible that native and non-native English speakers actually use LLMs equally often, but native speakers may be better at noticing and actively removing unnatural style words from LLM outputs. Our method would not be able to pick up the increased frequency of such more advanced LLM usage.
Through excess vocabulary
Lost in Content
Kobak et al. worry that what is gained in correct English may be lost in content because “LLMs are infamous for making up references (10), providing inaccurate summaries, and making false claims that sound authoritative and convincing.”
The authors of any given paper actually did the work behind the study and will therefore be attuned to even small discrepancies. The LLM did none of that work. So the more heavily LLMs are relied on, the less likely it is that incorrect or false material will be spotted before publication unless it is truly bizarre.
The study authors also worry about LLM plagiarism and homogeneity:
This makes LLM outputs less diverse and novel than human-written text. Such homogenization can degrade the quality of scientific writing. For instance, all LLM-generated introductions on a certain topic might sound the same and would contain the same set of ideas and references, thereby missing out on innovations and exacerbating citation injustice. Even worse, it is likely that malign actors such as paper mills will use LLMs to produce fake publications.
It would be remarkable if the paper mills are not already deep into LLM-generated papers.
The fact that there may be no turning back does not mean, of course, that we will all just muddle through somehow. Reliance on machine-written material may signal a long and continuing decline in the quality of research in many fields. Paper mills may even become irrelevant if machine-written papers are the norm. We shall see.
Cross-posted at Mind Matters News.