
ICLR rocked by a review scandal: did reviewers and authors collude in private? 49.9% of accepted papers suspected of AI-assisted review

Author: Ace Academic

Source: New Zhiyuan

Because of their high profile and large submission volumes, the fairness and transparency of the review process at top conferences have long been a focus of attention and controversy in the field.

After the ICLR decisions were released, someone posted on Reddit accusing the committee of accepting papers that violated the anonymity policy and of failing to uphold double-blind review.


And this is by no means an isolated case: according to an official ICLR post, the conference received more than 7,000 submissions this cycle, and complaints have been raised about the review process.


Soon after the conference officially kicked off, ICLR publicly confirmed that it had launched an investigation into this "collusion".


Collusion here means that some reviewers manipulate the paper-bidding system so that they are matched with specific authors' submissions.

Not only that: some Area Chairs (ACs) may game the system in a similar way, assigning complicit reviewers to particular papers.

These reviewers then give extremely high ratings, increasing the likelihood that the paper will be accepted.

In response, ICLR stated:

- There have been a number of cases of collusion between reviewers and authors, some backed by direct evidence.
- These actions directly violate the Code of Ethics.
- The Ethics Committee is reviewing the cases and evaluating possible penalties.

AI-assisted review

Beyond collusion, whether reviewers may use AI tools during the review process has also been controversial. One distinctive feature of ICLR compared with other top conferences is that the scores and comments for every paper are publicly released, whether or not it is accepted. Researchers from the École Polytechnique Fédérale de Lausanne (EPFL) therefore used the public ICLR 2024 data to study the use of AI assistance in reviewing. Their paper not only reveals possible large-scale use of AI tools during review, but also uses a comparative analysis to estimate the impact of that behavior on review outcomes.


The authors first ran GPTZero, a commercial LLM detector, over all written review comments. GPTZero classifies a given text into three categories — "fully human-generated", "fully AI-generated", and "mixed" — and reports a confidence score for each.

In this study, a review was considered AI-assisted if GPTZero's confidence that it was "fully human-generated" fell below 0.5. The results show that AI-assisted reviewing is more widespread than expected: of the 28,028 reviews submitted in 2024, at least 15.8% were flagged as AI-generated, and 49.9% of all accepted papers received at least one review that GPTZero judged to be AI-assisted. Building on these detection results, the paper then examines whether AI-assisted reviews affect papers' scores and acceptance rates.
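The detection rule is simple to state in code. The sketch below assumes a hypothetical `detect()` stand-in for the detector call (the real GPTZero API differs); only the 0.5 threshold on the "fully human-generated" confidence comes from the paper.

```python
def detect(text: str) -> dict:
    """Hypothetical stand-in for an LLM-detector call.
    A real implementation would send `text` to a detector service;
    here we return fixed scores so the sketch is runnable."""
    return {"human": 0.3, "ai": 0.5, "mixed": 0.2}

def is_ai_assisted(review_text: str, threshold: float = 0.5) -> bool:
    """Flag a review as AI-assisted when the detector's confidence
    that it is fully human-written falls below the threshold."""
    scores = detect(review_text)
    return scores["human"] < threshold

print(is_ai_assisted("We thank the authors for ..."))  # True with the fake scores
```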


The article has three parts: the first analyzes the extent of AI participation in reviewing, while the second and third study its possible impact. For each paper that received both AI-flagged and human reviews, the authors collected the scores (on ICLR's five-level scale: 1, 3, 5, 6, and 8 points) and fit a proportional odds model to estimate how likely an AI-assisted review is to give a higher score. Overall, AI-flagged reviews grade papers higher than human ones: for a given paper, there is a 53.4% chance that the AI-assisted score exceeds the human score.


To study how AI-assisted reviews affect paper selection, the authors matched papers with similar content into pairs in which one paper received only human reviews and the other received exactly one AI-flagged review, and, after removing the AI-flagged score, both had identical scores from the review committee. From a sample of 5,132 papers screened by these criteria, the authors compared acceptance outcomes to isolate the effect of the AI-assisted score. Overall, an AI-assisted review gives a paper a 3.1% higher chance of acceptance, and that figure rises to 4.9% for papers whose scores hover at the edge of the acceptance threshold.
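The matched-pairs comparison reduces to a difference in acceptance rates between the two arms. The toy data below are entirely made up to show the shape of the computation, not the paper's numbers.

```python
# Each tuple is one matched pair of similar papers with identical human scores:
# (accepted: paper that also got one AI-flagged review, accepted: human-only paper)
pairs = [
    (True, True), (True, False), (False, False), (True, False),
    (False, True), (True, True), (True, False), (False, False),
]

ai_rate = sum(a for a, _ in pairs) / len(pairs)      # acceptance rate, AI arm
human_rate = sum(h for _, h in pairs) / len(pairs)   # acceptance rate, human arm
print(f"acceptance gap: {ai_rate - human_rate:+.1%}")  # prints "acceptance gap: +25.0%"
```

With the real sample of 5,132 papers this gap is the paper's reported +3.1% (and +4.9% near the acceptance threshold).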


When a paper scores 5~6 points and sits at the edge of the acceptance line, an AI-assisted score has a positive effect on its selection.

With the rapid development of large language models in recent years, especially since the arrival of ChatGPT, the academic community has questioned AI's participation in peer review: professors on "996" schedules (9 a.m. to 9 p.m., six days a week), facing a heavy review load, may well let a large language model draft their review comments. This paper surveys the current state of reviewing at a top conference, uses a matched-pair design to trace and quantify the causal relationship, and reveals the possible impact of AI-assisted reviewing on acceptance outcomes. Whether the rapid rise of large language models threatens academia's long-standing peer review system has been a growing concern for journal and conference committees.

According to the authors, one implication of the study is that it confirms this negative impact with quantitative evidence. Given the surge in paper submissions and ever-faster text generation tools, it seems inevitable that committee members struggling under the review load will adopt AI-assisted tools. Taking ICLR as an example, total submissions were 4,955 in 2023 and climbed to 7,262 this year — an increase of nearly 50% — which undoubtedly places a heavy burden on the conference's review committee.


The final section of the paper candidly voices the authors' concern that review criteria and evaluation metrics must evolve alongside large language models. Otherwise, letting AI project its own immature values onto the selection of academic papers — especially those heavy in opinions and value judgments — will create a more serious crisis. Finally, the authors also released their GPTZero-based inspection website, where you can enter a paper title to see whether your ICLR paper was "lucky" enough to be assigned an AI-assisted review.

