
Manuscripts faked with ChatGPT: more than a dozen papers exposed

Author: Web of Science

Compiled by: Dushani

On August 9, a paper presenting new solutions to a complex mathematical equation was published in the physics journal Physica Scripta. At first glance, nothing about the article seemed amiss. But Guillaume Cabanac, a computer scientist and scientific detective at the University of Toulouse in France, was reading page 3 of the manuscript when he noticed an unusual phrase: "Regenerate response."

Anyone familiar with ChatGPT will recognize this phrase: it is the label on the button you press when you are not satisfied with the AI's answer and want it to generate a new one.

Cabanac quickly posted a screenshot of the page on PubPeer. He had previously exposed more than a dozen papers with similar problems.


A screenshot of the Physica Scripta manuscript posted on PubPeer; Cabanac highlighted the phrase "Regenerate response" in yellow.

What was found was just the tip of the iceberg

Physica Scripta is published by IOP Publishing, based in Bristol, UK. Kim Eggleton, the publisher's head of peer review and research integrity, said the authors later confirmed to the journal that they had used ChatGPT to help draft the manuscript.

The paper was submitted in May, and a revised version followed in July. Nothing unusual was spotted during two months of peer review and typesetting. IOP Publishing has now decided to retract the paper because the authors did not declare their use of the tool at submission.

"This is against our ethics policy." Eggleton said.

Similar cases are not uncommon. Since April, Cabanac has flagged more than a dozen papers on PubPeer. They contain phrases that point to ChatGPT use, such as "Regenerate response" or "As an AI language model, I ...".


A screenshot of another flagged paper posted on PubPeer, with the phrase "As an AI language model, I..." highlighted in yellow.
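Checks of this kind can be partly automated. The following minimal Python sketch scans a manuscript's text for such telltale phrases; it is an illustration only, and the phrase list and function name are assumptions rather than Cabanac's actual tooling.

# Minimal sketch: scan manuscript text for telltale chatbot phrases.
# The phrase list below is illustrative, not exhaustive.
TELLTALE_PHRASES = [
    "regenerate response",
    "as an ai language model",
    "i cannot generate",
]

def flag_chatbot_traces(text: str) -> list[tuple[str, int]]:
    """Return (phrase, character offset) pairs for each match found."""
    lowered = text.lower()
    hits = []
    for phrase in TELLTALE_PHRASES:
        start = 0
        while (pos := lowered.find(phrase, start)) != -1:
            hits.append((phrase, pos))
            start = pos + len(phrase)
    return hits

sample = "Please note that as an AI language model, I cannot generate specific tables."
for phrase, pos in flag_chatbot_traces(sample):
    print(f"Possible chatbot trace at offset {pos}: {phrase!r}")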

In a paper published in Resources Policy, a journal owned by Elsevier, Cabanac spotted other typical ChatGPT phrases. The paper's authors are affiliated with Liaoning University in Shenyang and the Chinese Academy of International Trade and Economic Cooperation, under the Ministry of Commerce, in Beijing.

At first, he merely thought that some of the equations in the paper seemed meaningless. But when he reached the paper's third table, a sentence above it gave the game away: "Please note that as an AI language model, I cannot generate specific tables or test them ...".

A spokesperson for Elsevier said they were "aware of the issue" and were investigating it.


A screenshot from the Resources Policy paper, in which Cabanac highlighted the phrase "Please note that as an AI language model, I cannot generate specific tables or test ...".

In fact, many publishers, including Elsevier and Springer Nature, have said that authors are allowed to use ChatGPT and other large language model (LLM) tools to help prepare their manuscripts, provided that they declare the use of AI or AI-assisted technology.

But Cabanac found that none of the authors of the flagged papers had declared their use of ChatGPT. They were caught only because they handled the text carelessly, forgetting to remove even the most obvious traces of AI-generated content.

This suggests that the number of papers whose authors were more careful with the text, hiding their use of ChatGPT, is probably far higher than is currently known.

"These discoveries are just the tip of the iceberg." Cabanac said.

Working with other scientific detectives and researchers, Cabanac has found the same problem in non-peer-reviewed conference papers and preprints. He posted them on PubPeer, and some of the authors admitted that they had used ChatGPT to help write their work without declaring it.

The cat-and-mouse game is getting harder and harder

Long before ChatGPT, scientists were already battling papers written by computer software.

In 2005, three researchers at the Massachusetts Institute of Technology developed a paper-generating program called SCIgen. The program is free to download and use, and the papers it produces are entirely fake. The developers wanted to test whether such meaningless manuscripts could pass the screening of conferences that, they believed, existed only to make money.

In 2012, Cyril Labbé, a computer scientist at the University of Grenoble Alpes in France, discovered 85 fake SCIgen-generated papers in conference proceedings published by the Institute of Electrical and Electronics Engineers (IEEE). Two years later, Labbé found more than 120 SCIgen papers in IEEE and Springer publications. The two publishers subsequently removed these gibberish papers from their subscription services.

Labbé has since built a detection website for SCIgen that lets anyone upload a suspect manuscript and check whether it was generated by the program.

Articles generated by SCIgen often contain subtle but detectable traces: characteristic language patterns, for example, or "unusual expressions" produced by automatic translation tools.

In contrast, if researchers remove the telltale phrases, the fluent text generated by more sophisticated chatbots is "almost impossible" to detect.

"This is essentially an arms race between scammers and those trying to keep them out," said Matt Hodgkinson, research integrity manager at the UK Research Integrity Office in London.

Elisabeth Bik, a well-known detector of academic fraud, said the rapid rise of ChatGPT and other generative AI tools will provide firepower for paper mills: companies that forge manuscripts and sell them to researchers who want to ramp up their publication output quickly.

"It's going to make matters worse," Bik said, "and I'm very concerned that academia has poured in with papers that we don't even know anymore." ”

There are more and more speculators, and not enough gatekeepers

David Bimler, a research-integrity detective who goes by the pseudonym Smut Clyde and a retired psychologist from Massey University in Palmerston North, New Zealand, points out that journal papers concealing their use of large language model tools reflect a deeper problem: busy peer reviewers often don't have time to thoroughly check manuscripts for red flags of machine-generated text.

"The number of janitors can't keep up." Bimler said.

Hodgkinson offers a suggestion that might help: ChatGPT and other large language models tend to give users fake references. That could be a useful clue for peer reviewers trying to spot the use of these tools in manuscripts. "If the citation doesn't exist, that's a red flag," he said.
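That check, too, can be partly automated. As a minimal sketch, assuming the cited work has a DOI, a reviewer could query the public Crossref REST API and treat a missing record as a prompt for closer scrutiny, not as proof of fraud, since some genuine works lack DOIs or are indexed elsewhere.

# Minimal sketch: does a cited DOI resolve in the public Crossref database?
# A missing record is a red flag, not proof of fabrication.
import requests

def doi_exists(doi: str) -> bool:
    """Return True if Crossref has a record for this DOI."""
    resp = requests.get(f"https://api.crossref.org/works/{doi}", timeout=10)
    return resp.status_code == 200

print(doi_exists("10.1038/d41586-023-02477-w"))  # the Nature article in Resources; expected True
print(doi_exists("10.1234/made.up.reference"))   # fabricated DOI for illustration; expected False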

For example, the Retraction Watch website reported on a millipede-research preprint written with ChatGPT. Henrik Enghoff, a millipede researcher at the Natural History Museum of Denmark, downloaded the paper and noticed that although it cited his work, the cited references did not match the content of the preprint.

Rune Stensvold, a microbiologist at the State Serum Institute in Copenhagen, encountered the forged citations himself. When a student asked him for a copy of a paper he had supposedly co-authored with a colleague in 2006, Stensvold found that the article simply didn't exist. It turned out that the student had asked an AI chatbot to recommend papers on the parasite genus Blastocystis, and the chatbot had pieced together a reference bearing Stensvold's name.

"It looks real," Stensvold says, "and this thing tells me that when I'm going to review a paper, I should probably look at the references section first." ”

Resources

https://www.nature.com/articles/d41586-023-02477-w

https://www.nature.com/articles/nature03653

https://www.nature.com/articles/nature.2014.14763

https://www.nature.com/articles/d41586-021-01436-7

https://retractionwatch.com/2023/07/07/publisher-blacklists-authors-after-preprint-cites-made-up-studies/
