Are we coming close to accurate AI detection?
KEY TAKEAWAYS
- The findings of a recent study suggest that accurate detection of AI-generated text is achievable.
- The researchers propose that accuracy depends on tailoring detectors to specific fields and types of writing.

The meteoric rise of large language models, such as ChatGPT, is likely to result in a rapid increase in the use of generative artificial intelligence (AI) in academic publishing. This presents a quandary for journal publishers and editorial teams as they strive to develop guidance and ‘stay ahead’ of the technology. Attitudes currently vary between journals, ranging from The Lancet limiting AI use to improving readability, to Nature adopting a firm stance against the use of generative AI to create images. Regardless of the detail in individual guidelines, enforcement relies on accurate detection of AI-generated content, a technology that has, to date, been viewed as flawed. A recent Nature News article by McKenzie Prillaman spotlights research on a potential solution: the development of more specialist detectors.
Developing a specialist AI detector
As Prillaman reports, a recent study published in Cell Reports Physical Science suggests that tailoring AI detectors so that they are trained to check specific types of writing may result in more reliable detection methods.
The research group, Desaire et al., took 100 published (ie, human-written) introductions from articles in various chemistry journals and prompted ChatGPT-3.5 to generate 200 introductions in similar styles. These documents were used to train their machine learning classifier, which distinguishes AI- from human-generated content via 20 different features of writing style (a simplified sketch of this feature-based approach follows the list below). When the model was tested on further articles, the group found that:
- the detector identified AI-generated documents with 98–100% accuracy
- human-written documents were detected with 96% accuracy
- the model outperformed other more general detectors, such as OpenAI’s AI classifier and ZeroGPT, in detecting AI-generated documents
- the model performed similarly when tested on writing from chemistry journals beyond those it was trained on, but not when tested on more general science magazine writing.
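Neither Prillaman's article nor the summary above spells out the full pipeline, but the core idea, hand-crafted style features fed to a standard classifier, can be illustrated briefly. The Python sketch below is a minimal illustration assuming scikit-learn is available; the five features, the logistic-regression model, and the placeholder texts are illustrative stand-ins, not the 20 features or the classifier actually used by Desaire et al.

```python
# Minimal sketch of a style-feature text classifier, in the spirit of the
# approach described above. Features and model are illustrative stand-ins,
# NOT the 20 features or the classifier used by Desaire et al.
import re
import numpy as np
from sklearn.linear_model import LogisticRegression

def style_features(text: str) -> list[float]:
    """Extract a few simple writing-style features from one document."""
    sentences = [s for s in re.split(r"[.!?]+\s*", text) if s]
    words = text.split()
    sent_lens = [len(s.split()) for s in sentences] or [0]
    n_words = max(len(words), 1)
    return [
        float(np.mean(sent_lens)),             # mean sentence length (words)
        float(np.std(sent_lens)),              # variation in sentence length
        text.count(",") / n_words,             # comma density
        text.count("(") / n_words,             # parenthetical density
        sum(w[0].isupper() for w in words) / n_words,  # capitalisation rate
    ]

# Hypothetical training data: 1 = human-written, 0 = AI-generated.
human_texts = ["...human-written journal introductions would go here..."]
ai_texts = ["...ChatGPT-generated introductions would go here..."]

X = np.array([style_features(t) for t in human_texts + ai_texts])
y = np.array([1] * len(human_texts) + [0] * len(ai_texts))

clf = LogisticRegression().fit(X, y)

# Classify a new, unseen document.
new_doc = "An unseen journal introduction to classify."
label = clf.predict([style_features(new_doc)])[0]
print("human" if label == 1 else "AI-generated")
```

The published study worked with far more training documents and a richer feature set; the sketch is intended only to show the shape of the workflow, which is to featurise the text, train a classifier, and then score unseen documents.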
Implications for scientific publishers
The group concluded that their detector outperformed its contemporaries because it was trained specifically on academic publications. They propose that this tailored approach is vital for the development of accurate AI detectors suitable for use by academic publishers.
