Efforts to Make AI Based on Text Less Racist and Terrible


In another test, Xudong Shen, a doctoral student at the National University of Singapore, assessed language models based on how much they stereotype people by gender or whether they identify as queer, transgender, or nonbinary. He found that larger AI programs tend to engage in more stereotyping. Shen says the makers of large language models should correct these flaws. OpenAI researchers have also found that language models tend to grow more toxic as they get bigger; they say they don't understand why.

Text generated by large language models is coming ever closer to language that looks or sounds like it came from a human, yet it still fails to grasp things requiring reasoning that almost all people understand. In other words, as some researchers put it, this AI is a fantastic bullshitter, capable of convincing both AI researchers and other people that the machine understands the words it generates.

UC Berkeley psychology professor Alison Gopnik studies how toddlers and young people learn, in order to apply that understanding to computing. Children, she said, are the best learners, and the way children learn language stems in large part from their knowledge of, and interaction with, the world around them. By contrast, large language models have no connection to the world, making their output less grounded in reality.

“The definition of bullshitting is to talk a lot and have it sound kind of plausible, but there’s no common sense behind it,” Gopnik says.

Yejin Choi, an associate professor at the University of Washington and leader of a group studying common sense at the Allen Institute for AI, has put GPT-3 through dozens of tests and experiments to document how it can go wrong. Sometimes it repeats itself. Other times it devolves into generating toxic language even when starting from harmless or harmful text.

To teach AI more about the world, Choi and a team of researchers created PIGLeT, an AI trained in a simulated environment to understand things about physical experience that people learn growing up, such as the fact that it’s a bad idea to touch a hot stove. That training enabled a relatively small language model to outperform others on common sense reasoning tasks. Those results, she said, show that scale is not the only winning recipe and that researchers should consider other ways to train models. Her goal: “Can we actually build a machine learning algorithm that can learn abstract knowledge about how the world works?”


Choi also works on ways to reduce the toxicity of language models. Earlier this month, she and her colleagues introduced an algorithm that learns from offensive text, similar to the approach taken by Facebook AI Research; they say it reduces toxicity better than several existing techniques. Large language models can be toxic because of humans, she says. “That’s the language that’s out there.”

Perversely, some researchers have found that attempts to fine-tune models and remove bias from them can end up hurting marginalized people. In a paper published in April, researchers from UC Berkeley and the University of Washington found that Black people, Muslims, and people who identify as LGBT are particularly disadvantaged.

The authors say the problem stems, in part, from the humans who label the data misjudging whether language is toxic or not. That leads to bias against people who use language differently than white people. The paper’s coauthors say this can cause self-stigmatization and psychological harm, and even force people to code-switch. OpenAI researchers did not address this issue in their recent paper.

Jesse Dodge, a research scientist at the Allen Institute for AI, reached a similar conclusion. He examined efforts to reduce negative stereotypes of gays and lesbians by removing from the training data of a large language model any text containing the words “gay” or “lesbian.” He found that such efforts to filter language can produce data sets that effectively erase people with these identities, making language models less able to handle text written by or about those groups of people.
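To see why this kind of filtering backfires, consider a minimal sketch of a naive word-based block-list filter of the sort the study critiques (this is an illustrative toy, not Dodge's actual pipeline, and the block list and corpus here are hypothetical):

```python
# A naive block-list filter: drop any document containing a blocked word.
# The side effect is that benign text written by or about the very groups
# the filter is meant to protect gets erased from the training data.
BLOCK_LIST = {"gay", "lesbian"}  # hypothetical, simplified block list


def keep_document(text: str) -> bool:
    """Return True if the document survives the block-list filter."""
    words = {w.strip(".,!?\"'").lower() for w in text.split()}
    return BLOCK_LIST.isdisjoint(words)


corpus = [
    "The lesbian couple celebrated their anniversary.",  # benign, but dropped
    "A recipe for sourdough bread.",                     # kept
]
filtered = [doc for doc in corpus if keep_document(doc)]
# Only the bread recipe survives: the filter has removed an inoffensive
# sentence about a same-sex couple, thinning out exactly the kind of text
# a model would need to handle language by or about those communities.
```

A model trained on the filtered corpus never sees ordinary, non-toxic uses of these identity terms, which is the erasure effect Dodge documented.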

Dodge says the best way to deal with bias and inequality is to improve the data used to train language models, rather than trying to remove bias after the fact. He recommends better documenting the sources of training data and recognizing the limitations of text scraped from the web, which can overrepresent people who can afford internet access and have the time to make a website or post a comment. He also urges documenting how content is filtered and avoiding blanket use of block lists to filter content scraped from the web.
