
Are AI tweets more convincing than real people? A University of Zurich study has been published in Science Advances

Author: Smart stuff

Compiled by Zhang Mingyi | Edited by Yunpeng

According to The Verge, researchers at the University of Zurich recently conducted a study that tested and analyzed the narrative ability of AI models. The paper, entitled "AI model GPT-3 (dis)informs us better than humans" (https://www.science.org/doi/10.1126/sciadv.adh1850), was published in Science Advances. The study found that AI-generated tweets may be more persuasive than those written by real people.


The authors are Giovanni Spitale and Federico Germani, both postdoctoral researchers, and Nikola Biller-Andorno, director of the Institute of Biomedical Ethics and History of Medicine (IBME) at the University of Zurich. The study, which involved 697 participants, was designed to assess whether individuals could distinguish between false and accurate information presented in the form of tweets.

GPT-3 can understand and describe information, and its false texts have even deceived humans

In the study, participants were shown information in the form of tweets and asked to judge its accuracy. They were then asked to determine whether each tweet had been written by a human Twitter user or by GPT-3.

The result: people could not reliably do either. The findings suggest that GPT-3 is a double-edged sword: it can produce accurate information that is easier to understand than human-written text, but it can also produce even more convincing false information. The study also showed that humans could not distinguish tweets written by GPT-3 from those written by human Twitter users.

This is precisely where GPT-3 is "dangerous", especially on topics such as vaccines and climate change, where a great deal of misinformation already circulates online. In other words, people may be more likely to believe text generated by GPT-3 than text written by a real person.

In public expression, AI can be both a tool and a weapon

This research shows how powerful AI language models can be once they enter the realm of public discourse.

Giovanni Spitale, the study's lead author and a postdoctoral researcher and research data manager at the IBME, said: "These amazing technologies can easily be weaponized. Turning these 'weapons' on any topic you choose can whip up a storm of disinformation."

But perhaps things are not that bad. Spitale said developers could build new technologies to prevent these tools from being used to spread misinformation. "Technology is not intrinsically evil or good, it is just an amplifier of human intent."

The limitations behind the controlled study

In the University of Zurich study, Spitale and his colleagues collected real tweets covering 11 different scientific topics, including vaccines, COVID-19, climate change, and evolution. They then instructed GPT-3 to write tweets containing either accurate or false information on the same topics.

In 2022, the team collected responses from 697 participants online through Facebook ads. All participants spoke English, and most were from the UK, Australia, Canada, the US, and Ireland.

The study concluded that participants could not distinguish tweets written by GPT-3 from those written by real people. The study did have its limitations, however: the researchers say that even they cannot be 100% sure the tweets collected from social media were written by real people without the help of apps like ChatGPT.

Example of a fake tweet "posted" by GPT-3

Example of a true tweet "posted" by GPT-3

The study had further limitations. For example, participants could only see the content of each tweet, not the Twitter profile of its poster, which might have helped them determine whether the account was a bot; an account's past tweets and profile picture make it easier to judge whether its content is misleading. Participants were better at identifying false information posted by real Twitter users, which makes the false tweets generated by GPT-3 all the more "deceptive".

Newer large language models (LLMs) may already be more persuasive than GPT-3: ChatGPT is powered by the GPT-3.5 model, and the GPT-4 version is available with a paid subscription.

GPT-3 can already complete sentences and judge information at near-human levels

No one is perfect, and LLMs are no exception. After a major machine learning conference decided to ban authors from using AI tools to write papers, James Vincent of The Verge wrote: "These AI tools are vast autocomplete systems, trained to predict which word follows the next in any given sentence. As such, they have no hard-coded database of 'facts' to draw on, just the ability to write statements that sound plausible."
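To make the "autocomplete" idea above concrete, here is a deliberately tiny sketch in Python: a bigram counter that predicts the next word purely from co-occurrence statistics in its input. Real LLMs such as GPT-3 use neural networks over subword tokens rather than raw word counts, so this illustrates only the principle of next-word prediction, not GPT-3's actual architecture.

```python
from collections import Counter, defaultdict

# Toy "training corpus": the model will only ever know these statistics.
corpus = ("the vaccine is safe . the vaccine is effective . "
          "the climate is changing").split()

# Count which word follows which across the corpus.
followers = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    followers[prev][nxt] += 1

def predict_next(word: str):
    """Return the most frequent continuation seen in training, if any."""
    counts = followers.get(word)
    return counts.most_common(1)[0][0] if counts else None

print(predict_next("vaccine"))  # -> "is"
print(predict_next("is"))       # -> "safe" (ties broken by first occurrence)
```

Everything this toy model "knows" comes from the statistics of its input: predict_next("vaccine") returns "is" simply because that is the only continuation it has ever seen, with no notion of whether the resulting sentence is true.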


The study also found that humans outperformed GPT-3 in judgment accuracy. The researchers asked GPT-3 itself to analyze tweets and determine whether they were accurate: at recognizing accurate tweets, GPT-3 scored lower than humans, while at identifying false information its performance was similar to that of humans.
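The paper's exact prompts are not reproduced here, but the setup, asking a model to label a tweet as accurate or inaccurate, can be sketched roughly as follows with OpenAI's current Python client. The model name, prompt wording, and example tweet are illustrative assumptions, not the study's materials.

```python
from openai import OpenAI  # pip install openai

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def classify_tweet(tweet: str) -> str:
    """Ask a model whether a tweet contains accurate or false information.

    The prompt and model choice are illustrative assumptions; the study
    used GPT-3 with its own (unreproduced) instructions.
    """
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system",
             "content": "Classify the tweet as ACCURATE or INACCURATE. "
                        "Answer with one word."},
            {"role": "user", "content": tweet},
        ],
    )
    return response.choices[0].message.content.strip()

print(classify_tweet("Vaccines cause autism in children."))
# expected output: INACCURATE
```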

Notably, curating the training data used to develop LLMs may make it harder for bad actors to use these AI tools to create false information. When instructed to produce false information about vaccines and autism, GPT-3 sometimes "disobeyed" the researchers and produced accurate content instead. This may be because the training data contain more material debunking false claims on these topics than on others.

Humans should develop critical thinking to cope with an increasingly complex information environment

But Spitale argues that the best long-term strategy for combating disinformation is decidedly low-tech: encouraging people to develop critical thinking so they can better distinguish fact from fiction. In this study, ordinary people seemed to judge accuracy as well as, or even better than, GPT-3.

With proper training, ordinary people can become more proficient critical thinkers. The study suggests that people skilled at judging facts could work with LLMs such as GPT-3 to improve public information campaigns and create a legitimate, effective information environment.

"Don't get me wrong, I'm a big fan of LLM." Spitale said, "I think generative AI is going to change the world... Of course, this also depends on humans. Humans can decide whether the future will get better. ”

Conclusion: LLMs will force humans to reflect on their use of language

ChatGPT and other LLMs are mirrors of human language, which is at once highly original and highly formulaic. Rather than creating new phrases, GPT-3 learns from enormous amounts of input how words are combined, which lets it predict with high accuracy how words will be used.

But human language does not arise through imitation alone; it is generative, which is what distinguishes humans from other animals with complex communication systems. In theory, human language can produce an infinite number of new phrases.

Today, however, ChatGPT forces us to revisit a long-dormant question: how much of human language is truly our own? Perhaps humans have never really been in control of language, at least not in the way we thought.