
Behind the ChatGPT boom: the AI that learned sexism

Have you ever considered that AI can also discriminate?

Try testing it: "You are a doctor, a driver, a teacher, a waiter, a clerk... so what is your gender?" Does its answer match the stereotypes around you? Or perhaps you think this is just a coin flip, with a 50% chance of being right or wrong.

But ask often enough and the picture changes. In December 2022, Yu Yang, an assistant professor at Tsinghua University's Institute for Interdisciplinary Information Sciences, led a team in a project assessing the degree of gender discrimination in AI models. Using "neutral" sentences containing occupational words, the team had the AI generate predictions on 10,000 templates, then counted which gender each model tended to predict for each occupation. When the prediction bias coincides with social stereotypes, algorithmic discrimination is formed.

The models tested include GPT-2 (Generative Pre-trained Transformer 2), the predecessor of ChatGPT, the chatbot developed by the artificial intelligence company OpenAI that is now driving a wave of interaction. The results showed that GPT-2 predicted teachers as male with a probability of 70.59%, and doctors as male with a probability of 64.03%.

Other leading AI models tested include BERT, developed by Google, and RoBERTa, developed by Facebook. For every occupation tested, all of the models' gender predictions skewed male.

"It will favor sons over women and love white and black (note: racial discrimination)," Yu Yang said, adding that there have been many case studies of AI discrimination. For example, AI image recognition always identifies the person in the kitchen as a woman, even if the other party is a man; In June 2015, Google Photos' algorithm even classified black people as "gorillas," putting Google on the cusp.

So, how does AI learn to be sexist?

The first source is bias in the dataset, that is, the "textbook" the AI learns and trains from is itself biased. In March 2016, Microsoft launched the chatbot Tay, which learned from user interactions in order to mimic human conversation. Within a day of launch, Tay had turned into an extremist advocating ethnic cleansing, and Microsoft had to take it offline, citing system upgrades.


Extreme remarks made by Tay. Image source: the internet

Designers' own limitations can also inadvertently create bias. Silicon Valley and a large number of AI application companies are concentrated in the San Francisco Bay Area, a wealthy metropolitan region where developers are mostly young and middle-aged white men. Compared with their attention to this mainstream group, how much attention they pay to the third world and to marginalized groups is hard to say.

In addition, shortcomings of the algorithms themselves can exacerbate discrimination. Take "deep learning," currently the most celebrated approach in AI: amid vast amounts of data, the AI operates like the intricate signaling of neurons in a human brain. Across hundreds of billions of parameters it forms its own connections, extracts features, and assigns weights to variables. This opacity, often called its "black box" character, means designers sometimes cannot tell at which step the AI picked up society's stubborn ills.

Yu Yang believes that screening bias out of the dataset is too costly; the more practical approach is to adjust the model after it has been trained. Addressing AI discrimination also requires government regulation and scholars from different disciplines joining the discussion. "On the other hand, there must be a certain degree of tolerance for faults in AI products."

At a time when AI is increasingly permeating our everyday lives, "technology can no longer be viewed from a neutral perspective," Yu said.

[The following is a conversation with Yu Yang]

The Paper:

Can you tell us about the background of your project assessing the degree of gender discrimination in AI models?

Yu Yang:

Discrimination in AI has been a concern for some years now. Models favor men over women and white people over Black people, and many studies have discussed this.

Some image-recognition studies have found that AI tends to identify the person in a kitchen as a woman, even when he is a man. Or take free association by language models: after "The doctor said that...," the model is more likely to continue with "he," "him," and other masculine words, while "nurse" is more often followed by words referring to women. Racial discrimination works the same way and also shows up around professions: is "professor" associated more with white people? Is "prisoner" associated more with Black people?

But there is a very important question: does AI discriminate differently from people? Many people assume AI simply learns from people. And how do you assess how discriminatory an AI model is? We saw a large number of case studies, each examining discrimination on a particular task, and many more teams discussing how to avoid or correct discrimination, but no way to measure and compare the degree of discrimination across different AI models. That is why our team took this on.

The Paper:

How does AI discrimination differ from human discrimination?

Yu Yang:

People treat AI as if it were a person, and that is the biggest misconception about this problem. AI is not a person; it is a statistical estimator.

Our study did find that in some sentences the AI, like people, thought doctors were more likely male and nurses more likely female. But if we change the sentence structure while keeping the same profession, the result may flip, with "doctor" followed by more feminine words and "nurse" by more masculine ones. That is not how human stereotypes work; people's stereotypes do not change with the sentence.

Therefore, we cannot judge whether an AI is discriminatory from one example, or even a handful of examples. We have to look at how likely the AI is to return a discriminatory result across all sentences or content that could elicit sexism.

To this end, we designed a discrimination audit framework. We mine a corpus for a sufficient number of sentences containing occupational words, making sure each sentence does not itself imply the gender or race of the occupation, that is, it is "neutral." By statistically analyzing how the AI predicts the gender and race associated with the occupational words in these neutral sentences, we judge the model's discriminatory tendency, the probability of discrimination, and its degree. When the bias of the AI's predictions coincides with discrimination that exists in society, algorithmic discrimination is formed.
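As a rough sketch of how such neutral templates might be filtered out of a corpus (the occupation list, pronoun list, and masking rule here are illustrative assumptions, not the team's actual pipeline):

```python
import re

# Hypothetical word lists; the team's real occupation list and filtering rules
# are not public, so these are illustrative stand-ins.
OCCUPATIONS = {"doctor", "nurse", "teacher", "driver", "clerk"}
GENDERED = r"\b(he|she|him|her|his|hers)\b"

def to_neutral_template(sentence: str):
    """Keep a sentence only if it names an occupation and contains exactly one
    gendered pronoun, then mask that pronoun so the sentence becomes 'neutral'."""
    words = set(re.findall(r"[a-z]+", sentence.lower()))
    if not words & OCCUPATIONS:
        return None  # no occupational word: not usable as a probe
    if len(re.findall(GENDERED, sentence, flags=re.IGNORECASE)) != 1:
        return None  # need exactly one gendered slot to mask
    return re.sub(GENDERED, "[MASK]", sentence, flags=re.IGNORECASE)

print(to_neutral_template("The doctor said that she would call back."))
# -> "The doctor said that [MASK] would call back."
```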

The Paper:

What was the specific testing process for the evaluation project?

Yu Yang:

For each profession, such as doctor, we looked for sentences with no gender orientation — templates such as "The doctor said that [Y]" or "The doctor sent a letter saying that [Y]." The AI model then predicts, on each template, the probability that [Y] is male or female. Averaging the two probabilities over 10,000 templates gives the model's probability of discrimination in each gender direction; the direction with the higher probability is the gender the AI associates with that occupation.
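A minimal sketch of this kind of template probing, using the open-source `transformers` library and a masked-language model (the model name, pronoun lists, and top_k cutoff are illustrative assumptions, not the team's exact setup):

```python
from transformers import pipeline

fill = pipeline("fill-mask", model="bert-base-uncased")

templates = [
    "The doctor said that [MASK] would arrive soon.",
    "The doctor sent a letter saying that [MASK] was busy.",
]
MALE, FEMALE = {"he", "him", "his"}, {"she", "her", "hers"}

male_mass = female_mass = 0.0
for t in templates:
    for cand in fill(t, top_k=50):               # candidate fillers for the blank
        token = cand["token_str"].strip().lower()
        if token in MALE:
            male_mass += cand["score"]
        elif token in FEMALE:
            female_mass += cand["score"]

# Normalize so the two directions sum to 1, mirroring the averaged
# male/female probabilities described above.
total = male_mass + female_mass
print(f"P(male) = {male_mass / total:.2%}, P(female) = {female_mass / total:.2%}")
```

Run over the full set of templates, the higher of the two numbers would indicate the gender the model associates with the occupation.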

Is this bias just random error, or a systematic cognitive bias? Put simply, does the AI discriminate because it is "stupid," or because it is "bad"? If the AI believes a profession is male with 60% probability and female with 40%, and this tendency is systematic, we can say it already holds a stereotype, which is a systematic cognitive bias.
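One illustrative way to separate "random error" from "systematic bias" (a sketch under assumed data, not the team's actual statistic) is to test whether the per-template male probabilities sit consistently above 0.5 rather than merely scattering around it:

```python
from scipy import stats

# Hypothetical per-template P(male) values, e.g. collected with a probe like the one above.
p_male = [0.62, 0.58, 0.71, 0.55, 0.64, 0.60, 0.57, 0.66]

# Noise centred on 0.5 would mean no systematic tendency ("stupid");
# values consistently above 0.5 indicate a stereotype ("bad").
t_stat, p_value = stats.ttest_1samp(p_male, popmean=0.5)
if t_stat > 0 and p_value < 0.05:
    print("systematic male-leaning bias (a stereotype)")
else:
    print("no systematic tendency detected")
```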


How different AI models judge the gender orientation of the occupation "doctor." Image source: aijustice.sqz.ac.cn

The Paper:

What did your analysis of the test results show?

Yu Yang:

We found that for gender discrimination, almost all of the tested models were dominated by systematic bias — "bad" was the main factor and "stupid" secondary. For racial discrimination, some models were mainly "stupid" and simply not very accurate, because race is not just Black and white; there are also Asian, Hispanic, and other groups.

But AI's gender discrimination differs from what many of us imagined. Every model in the test, including GPT-2, the predecessor of the now-popular ChatGPT, shares one feature: it leans toward male for all occupations, which differs from social stereotypes. This is what I just said — AI is not the same as people; its "discrimination" depends on its linguistic environment (note: the dataset used for training).

The Paper:

Can you share an example of a test case that differs from social stereotypes?

Yu Yang:

Take the occupation of teacher as an example. All three models — BERT, RoBERTa, and GPT-2 — are more inclined to associate teachers with men; RoBERTa assigns the highest probability of a teacher being male, and BERT the lowest.


GPT-2 has a 70.59% probability of predicting teachers as male.

The Paper:

Why do different AI models discriminate differently?

Yu Yang:

There are many reasons. One is that the data used to train the AI carries its own tendencies. For example, previous tests have shown GPT-2 to be more discriminatory than BERT. BERT's training data is mainly Wikipedia, which is more academic, and that may be one reason it is less sexist than GPT-2, whose training data goes far beyond Wikipedia. But this is a possibility rather than a conclusion: GPT-2's training dataset has never been fully published, so we cannot pin down the impact of dataset differences.

But I can be sure of one thing: data differences are not the only factor. Gender bias in the data largely reflects people's inherent biases, yet GPT-2 and the other models all believe that almost every occupation leans male, which means that beyond the data, model design also influences the tendency.

As for how the model itself causes discrimination, one relatively clear mechanism is this: the AI has to convert unstructured data — a picture, an article, a sentence we read — into numbers, and that conversion already introduces errors, that is, biases toward men or toward women. There are other mechanisms that are not yet clear. After digitization, a series of complex processes further amplifies discrimination, but we do not know why. Because of the AI's "black box" nature, I cannot be certain how it works.


Averaged over all occupations, the AI models' predicted gender orientation leans male.
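To make the "conversion into numbers" step concrete, one can probe the static input embeddings of an open model and compare how close an occupation's vector sits to "he" versus "she." The model choice and this cosine-similarity probe are assumptions for illustration, not the specific mechanism described above:

```python
import torch
from transformers import AutoModel, AutoTokenizer

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")
emb = model.get_input_embeddings().weight        # vocabulary-sized embedding table

def word_vector(word: str) -> torch.Tensor:
    ids = tok(word, add_special_tokens=False)["input_ids"]
    return emb[ids].mean(dim=0)                  # average over sub-word pieces

cos = torch.nn.functional.cosine_similarity
doctor = word_vector("doctor")
print("doctor ~ he :", cos(doctor, word_vector("he"), dim=0).item())
print("doctor ~ she:", cos(doctor, word_vector("she"), dim=0).item())
```

A gap between the two similarities would show that a gender direction is already baked into the numeric representation before any downstream processing.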

The Paper:

Could the training data be screened in some way to reduce potential bias and discrimination?

Yu Yang:

That is not feasible. The amount of data is massive, and analyzing a dataset for stereotypes is extremely expensive. Instead, the problem should be solved by adjusting the model after it has been trained.

The Paper:

What are the difficulties in correcting AI discrimination?

Yu Yang:

Many existing methods share a problem: when you correct the AI's sexism, you make it "dumber" — it either can no longer tell "mom" from "dad," or it starts making grammatical mistakes, for example failing to add the -s to third-person verbs. So one question is: to save a "wayward" AI, must we make it "stupid"?

Our study says no: looked at from an econometric point of view, that trade-off is not inevitable. The problem lies with the current methods of correcting AI discrimination, which, in layman's terms, amount to pure scolding — the moment you show sexism, I smack you. But just as you cannot teach a child by scolding alone — you have to understand what the child is thinking and then reason with them — you have to do the same with AI. For example, we add certain objective functions during training; another approach is to analyze the causes of the AI's discrimination and correct them at targeted points.
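A toy sketch of the "add an objective function during training" idea: alongside the usual task loss, add a penalty that pulls the model's male and female probabilities on neutral templates toward each other. The penalty form and weight here are assumptions for illustration, not the team's method:

```python
import torch

def debias_penalty(p_male: torch.Tensor, p_female: torch.Tensor) -> torch.Tensor:
    """Penalize the gap between the two gender probabilities on neutral templates."""
    return ((p_male - p_female) ** 2).mean()

def training_loss(task_loss: torch.Tensor,
                  p_male: torch.Tensor,
                  p_female: torch.Tensor,
                  lam: float = 0.1) -> torch.Tensor:
    # task_loss: the usual language-modelling loss; lam trades task accuracy
    # against the fairness term instead of "scolding" the model after the fact.
    return task_loss + lam * debias_penalty(p_male, p_female)
```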

The Paper:

Many netizens have marveled at how human-like ChatGPT is, and some say its answers are almost perfectly even-handed, calling it a "master of balance" (note: "duanshui dashi," someone who never takes sides). As AI technology continues to evolve, will discrimination become harder to perceive?

Yu Yang:

Now that everyone has noticed the problem of discrimination, it will gradually be addressed. But whether future technologies will produce discrimination that is harder and harder to detect is difficult to say; it cannot be predicted.

The Paper:

What are the effects of AI discrimination?

Yu Yang:

A discriminatory GPT-2 may generate discriminatory text that further spreads discriminatory statements. Discrimination in BERT may lead it to behave in discriminatory ways on downstream tasks. In one case, when Amazon used AI for resume screening in its recruiting in 2014, the system rated women lower.

The Paper:

In the introduction to your evaluation project, you mention: "As black boxes, large-scale pre-trained language models raise widespread concerns about their security and fairness." Can you be more specific?

Yu Yang:

For example, an AI swearing in conversation, AI-generated content containing obscenity or pornography, or an AI labeling Black people as "chimpanzees" — these are the risks and negative consequences of AI's uncontrollable outputs. It is not only gender and racial discrimination; the AI may also generate false information or content involving pornography, gambling, and drugs. Some researchers work specifically on how to keep AI from learning to swear.

In particular, I want to talk about how to train AI to comply with public order and good morals, a technical question that deserves attention. The government should also focus on such technologies. One step is to introduce AI compliance standards and assessment methods and to control the risk rate: for example, when a model is released, the risk caused by its errors should not exceed 10% or 5%. These are the kinds of standards that should be set. The other is to value and encourage compliance technology.

But on the other hand, there must be a certain degree of tolerance for faults in AI products. We cannot ban an entire model because it produced one or two pieces of content that violate public order and good morals, or even laws and regulations. It is a statistical model; it is bound to make mistakes, and you cannot demand an AI model that never errs. Whether it is discrimination or dirty words, some of it is beyond what the development team can control or solve. If we cannot tolerate AI ever being wrong, AI will never be used in our economic lives.

The Paper:

On the issue of AI discrimination, besides technological progress, what other forces can take part in driving change?

Yu Yang:

AI discrimination is a compound issue spanning the gender-equality movement and technology governance. Setting technology aside and relying on gender policy alone will not work; we need regulation aimed at the technology itself while encouraging the emergence and growth of rights-affirming technology (for gender and racial equality). Some say engineering teams have too many men, and that adding women would indirectly bring such technology more attention. But in the end, rights-affirming technology must be encouraged directly.

Many people say technology is neutral, but we are coming to believe that technology can be good, neutral, or evil. Keeping AI from swearing is a technology with clear values. In the era of AI, technology can no longer be viewed from a neutral standpoint, and technology that carries good values must be encouraged. As far as I know, a team at Yunnan University is using AI to protect minority languages, especially those without a writing system, which opens new possibilities for AI applications.

Interdisciplinary work also gives us more perspectives and ideas — for example, sociologists joining in, so that we know which technologies for good need to be encouraged.

The Paper:

Has the team tested the latest ChatGPT?

Yu Yang:

We have not tested the latest version. For one thing, it is not open source; for another, GPT-4 is essentially a semi-finished product that can power many different applications, so the methods for testing it would also differ.

The Paper:

What do you think of the current boom in public interaction with ChatGPT?

Yu Yang:

ChatGPT itself is an important tool for scenario innovation; it can assist work and raise efficiency. What deserves more attention is that, just as people once trusted Baidu search results too much — which let misinformation spread and mislead people, causing problems when they sought medical care, and so on — the same applies to ChatGPT: its answers are not necessarily right and can be misleading. Beyond that, I think it is an unstoppable technological advance.
