
Can large language models identify fake news? One study evaluated models such as ChatGPT

Author: Technology Times

In recent years, fake news and online rumors have become a serious social problem: they not only distort public perception and judgment, but also threaten social stability and security. To address this challenge, many researchers and developers have tried to use artificial intelligence (AI) to assist with fact-checking and information verification.


Kevin Matthe Caramancion, a researcher at the University of Wisconsin-Stout, recently conducted a study evaluating how well four of the best-known LLMs, namely OpenAI's ChatGPT-3.5 and ChatGPT-4.0, Google's Bard/LaMDA, and Microsoft's Bing AI, can determine whether news items are true.

His findings, published on the preprint server arXiv, provide a valuable reference for the future use of these advanced models to combat online rumors.

In an interview with Tech Xplore, Caramancion said: "My recent paper was inspired by the need to understand the capabilities and limitations of various LLMs in combating online rumors. My goal was to rigorously test the proficiency of these models in differentiating fact from fiction, using a controlled simulation experiment and an established fact-checking agency as a benchmark."

"We used a test suite of 100 news items verified by independent fact-checkers to evaluate the performance of these big language models," he says. We presented each news item to these models under controlled conditions, and then grouped their responses into three categories: true, false, and partially true/false. We measure the effectiveness of these models against their accuracy compared to verified facts provided by independent bodies. ”


Caramancion found that, of the 100 test items, only Bing AI correctly identified all the real news items without misjudging any fake news as real. The other three LLMs showed varying error rates, with ChatGPT-4.0 performing worst: it correctly identified only 67% of real news and misjudged 23% of fake news as real.

Caramancion believes these results suggest that current LLMs cannot fully replace humans in fact-checking and need further improvement and optimization. He recommends that when these models are used, information should be verified against other sources and methods, with awareness of the models' possible biases and limitations.


"I hope my research will draw attention to the potential and challenges of LLM in identifying fake news, as well as reflections on their impact and responsibility in society," he said. I also hope that my research will inspire more researchers and developers to explore and improve these models so that they can better serve the well-being of humanity. ”

#ArtificialIntelligence #LargeLanguageModels #FakeNews #FactChecking #ChatGPT
