- A ‘ChatGPT-authored’ scientific paper highlights the promise and pitfalls of using AI in research and publications.
Use of artificial intelligence (AI) in scientific publishing seems inevitable. While the full capabilities of this fast-changing technology are yet to be determined, some in medical publishing have begun to explore ways to harness the potential of generative AI, while others urge caution and lament a lack of structured guidance. Recently, as reported by Gemma Conroy in Nature News, Professor Roy Kishony and his student, Tal Ifargan, provided new fuel for the debate by asking ChatGPT to conduct research and write a paper from scratch.
Kishony and Ifargan used a ‘data to paper’ system, in which software acted as an intermediary between humans and generative AI. This system automatically prompted ChatGPT to follow the steps of scientific research, from hypothesis generation to development of a scientific manuscript. In less than an hour, ChatGPT developed a study objective; wrote code to analyse a large, publicly available dataset; and drew conclusions based on its findings and existing literature, which it reported in a 19-page research article.
The study highlighted some promising aspects of incorporating AI into research and publication pathways, namely reduced timelines and the potential to quickly generate written summaries. However, it also shone a light on a number of limitations and risks:
- False narratives: In this case, ChatGPT claimed to ‘address a gap in the literature’, although the subject (a link between diabetes risk and diet and exercise) was already well investigated.
- Decrease in research quality: Kishony flagged the risk of generative AI leading to ‘p-hacking’ or a flood of low-quality research papers.
- Incapable of self-correction: Stephen Heard of Scientist Sees Squirrel also provided commentary and analysis on the limitations exposed by the study, including generative AI’s lack of accuracy. Expert human intervention was required throughout to spot and correct errors.
- Regurgitating existing ideas: Heard also emphasised that generative AI creates content based on existing source material, thus perpetuating biases and reducing innovation and creativity.
- Hallucinations: As explained by Jie Yee Ong in The Chainsaw, ‘hallucinations’ are a well-known problem with generative AI. This study was no exception, with ChatGPT generating fake citations despite access to the published literature. As Ong puts it, “for now, it is best not to treat everything ChatGPT spits out as gospel”.
Kishony and Ifargan’s carefully planned study allowed generative AI’s work to be checked for accuracy by human experts. Researchers agree that these human checks and balances remain essential to ensuring the credibility of scientific research and publications in which AI plays a role.