Sentiment analysis of Twitch emotes

2021-12-14 11:58:06

Author | Martin Anderson

Translated by | Marco Wei

Planning | Ling Min

In recent years, the growing use of emoticons, emoji, emotes, GIFs, and other non-written forms of expression on social media platforms has made it increasingly difficult for data scientists to study the global sociological landscape, at least to the extent that global sociological trends can be read from people's public statements.

Natural language processing (NLP) has become a very powerful sentiment analysis tool over the past decade, but it has failed to keep up with a rapidly changing, cross-language vocabulary of web slang and abbreviations, and it also struggles with image-based posts on social networking sites such as Facebook and Twitter.

Because these few large social media platforms are the only truly hyperscale resources that this type of research can rely on, AI needs to keep pace with them.

In July, a paper proposed a new approach that uses a database of 30,000 tweets to categorize and predict the emotions a post triggers, based on the "reaction GIFs" (see figure below) that users post in reply. The paper found that such image-based responses are in many respects easier to gauge, since most of them do not contain sarcasm, a well-known weak point of sentiment analysis.

The researchers treat the animated reaction GIFs that people post as "reduction metrics" and analyze their use in a paper published in 2021.

In the first half of 2021, Boston University led a research team that trained a machine learning model to predict which content is likely to become popular on Twitter. In August 2021, British scholars compiled a large dataset of Twitter sentiment in seven languages by comparing social media trends in the use of emoticons (faces and symbols typed with numbers, letters and punctuation) and emoji (pictographs of faces, objects and symbols).

Twitch emotes

Now, researchers in the United States have developed a machine learning method that can better understand, classify, and measure the ever-evolving pseudo-vocabulary of emotes on Twitch.

Emotes are neologisms on Twitch that are used to express emotions, moods, or niche in-jokes. Because they are by definition newly coined expressions, the real challenge for a machine learning system is not to classify every new emote as it appears, since cataloguing may never keep pace with how quickly emotes come and go. The aim is rather for the machine to understand the structure behind these expressions, and to build a system that recognizes them as "temporary" words or compound phrases whose sentiment has to be judged entirely from context.

Variations on the Pepe the Frog meme: simply changing the suffix gives an emote a completely different meaning.

The image above is from "FeelsGoodMan: Inferring Semantics of Twitch Neologisms" (https://arxiv.org/pdf/2108.08411.pdf), a paper published by three researchers at a social media analytics firm in San Francisco.

Changing meaning after going viral

Although these emotes are novel and most of them are short-lived, Twitch users often dig up old emote material and recycle it, which causes trained sentiment analysis frameworks to misjudge it. Tracing how the meaning of an emote changes as it evolves often reveals that the emotion or intention it represents today is the complete opposite of what it meant when it was created.

For example, the researchers note that the Pepe the Frog meme, once misused by the far right, has almost completely lost that political meaning among Twitch users.

The image of the frog and its catchphrase "Feels Good Man" first appeared in a 2005 comic by American illustrator Matt Furie, and became a far-right icon in the 2010s. Vox wrote in 2017 that the right-wing appropriation of the meme had survived Furie's attempts to dissociate himself from it, but the San Francisco researchers behind the paper disagree:

In the early 2010s, Furie's cartoon frog was adopted by right-wing posters on various online forums, including 4chan (an anonymous forum). Since then, Furie has worked to reclaim the meaning of Pepe the Frog, and on Twitch a large number of non-hateful, positive frog emotes have gone mainstream, so that the happy frog (FeelsGoodMan) and its sad counterpart (FeelsBadMan) are now used mostly in their literal sense.

Follow-up troubles

Efforts to pin down the general meaning of a meme like this are often frustrated when it becomes popular and then changes meaning. By then, the emotes may already have been labeled "hate" or "nationalism (US)" and packaged into long-lived open source repositories. Later NLP research projects that use this data may never check whether the labels are still correct, either because they have no means of auditing the data or because they are not aware that auditing is needed.

The consequences of such stale labels are obvious. If a "political classification" algorithm is trained on a Twitch emote dataset labeled in 2017, the heavy use of Pepe emotes will make Twitch appear to have a very pronounced far-right tendency. Twitch may or may not be full of far-right streamers, but you cannot rely on frogs to verify that.

The political meaning of the Pepe meme seems to have been unceremoniously discarded by Twitch's 140 million users (41% of whom are under the age of 24). Without any coordination, they have very efficiently taken Pepe back from the political groups that appropriated the image, and redefined it in their own way.

Methods and data

The researchers found that labeled Twitch emote datasets were "virtually non-existent," even though an earlier study counted a total of eight million Twitch emotes, 400,000 of which appeared within a single week.

A 2017 study that tried to predict popular emotes on Twitch still scored only 0.39, even after limiting the prediction to the top 30 emotes.

To address this problem, the San Francisco researchers took a new approach to older data: they split it 80/20 into training and test sets and applied "traditional" machine learning algorithms that had not previously been used on Twitch data, namely naive Bayes, random forests (RF), support vector machines (SVMs, with linear kernels), and logistic regression.
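
The split-and-classify setup described above can be illustrated with a minimal sketch. It assumes a handful of made-up chat lines with made-up labels (`TOY_CHAT`), and the helper names `split_80_20`, `train_naive_bayes` and `predict` are hypothetical. It hand-rolls a multinomial naive Bayes with add-one smoothing using only the standard library; it is not the researchers' code, features, or data.

```python
import math
import random
from collections import Counter, defaultdict

# Hypothetical toy chat lines with made-up labels; not data from the paper.
TOY_CHAT = [
    ("what a clutch play FeelsGoodMan", "positive"),
    ("great stream today FeelsGoodMan", "positive"),
    ("love this game PogChamp", "positive"),
    ("so hyped PogChamp FeelsGoodMan", "positive"),
    ("nice win PogChamp", "positive"),
    ("lost again FeelsBadMan", "negative"),
    ("this lag is awful FeelsBadMan", "negative"),
    ("so boring ResidentSleeper", "negative"),
    ("worst game ever FeelsBadMan", "negative"),
    ("stream died ResidentSleeper FeelsBadMan", "negative"),
]


def split_80_20(rows, seed=0):
    """Shuffle a copy of rows and split it 80/20 into train and test."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * 0.8)
    return shuffled[:cut], shuffled[cut:]


def train_naive_bayes(rows):
    """Count label frequencies and per-label token frequencies."""
    label_counts = Counter()
    token_counts = defaultdict(Counter)
    vocab = set()
    for text, label in rows:
        label_counts[label] += 1
        for token in text.split():
            token_counts[label][token] += 1
            vocab.add(token)
    return label_counts, token_counts, vocab


def predict(model, text):
    """Return the label with the highest log-probability for text."""
    label_counts, token_counts, vocab = model
    total = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in label_counts.items():
        score = math.log(count / total)  # log prior
        denom = sum(token_counts[label].values()) + len(vocab)
        for token in text.split():
            if token in vocab:  # tokens unseen in training are ignored
                score += math.log((token_counts[label][token] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


train_rows, test_rows = split_80_20(TOY_CHAT)
model = train_naive_bayes(train_rows)
correct = sum(predict(model, text) == label for text, label in test_rows)
print(f"held-out accuracy: {correct}/{len(test_rows)}")
```

Treating each emote as an ordinary token is the point of the sketch: a bag-of-words model needs no special handling for emotes as long as they occur in the training messages.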

This approach outperformed the baseline from previous studies by 63.8 percent, and the LOOVE (Learning Out Of Vocabulary Emotions) framework that the researchers developed makes it possible to recognize new words and add their newly inferred meanings to existing models.

The LOOVE (Learning Out Of Vocabulary Emotions) framework structure developed by the researchers

LOOVE supports unsupervised training of word embeddings together with periodic retraining and fine-tuning, which avoids the need for labeled datasets. Given the number of emotes and the speed at which they evolve, keeping a labeled dataset up to date in real time would be unrealistic.

In the project, the researchers trained an emote "pseudo-dictionary" on an unlabeled Twitch dataset. During training, the model generated 444,714 embeddings of words, emotes, and emoji.

In addition, they augmented the VADER lexicon with emoji and emoticon lexicons. Besides the EC dataset, they also used three publicly available datasets sampled from Twitter, Rotten Tomatoes, and Yelp for ternary sentiment classification.
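
As a rough illustration of what augmenting a lexicon means for ternary classification, the sketch below merges a tiny word lexicon with a tiny emoji/emoticon lexicon and maps a summed score to positive, neutral, or negative. Everything here is an assumption made for illustration: the entries, their scores, the `threshold` value, and the function names `augment` and `classify`. It does not use the VADER library or reproduce its scoring rules.

```python
# Hypothetical base lexicon in the spirit of a valence lexicon:
# token -> made-up sentiment score.
BASE_LEXICON = {"good": 2.0, "great": 3.0, "bad": -2.5, "terrible": -2.0}

# Hypothetical emoji/emoticon additions, also with made-up scores.
EMOJI_LEXICON = {":)": 2.0, ":(": -2.0, "😂": 1.5, "😡": -2.5}


def augment(base, extra):
    """Return a new lexicon with the extra entries added to the base."""
    merged = dict(base)
    merged.update(extra)
    return merged


def classify(text, lexicon, threshold=0.5):
    """Sum token scores and map the total to one of three classes."""
    score = sum(lexicon.get(token, 0.0) for token in text.split())
    if score > threshold:
        return "positive"
    if score < -threshold:
        return "negative"
    return "neutral"


augmented = augment(BASE_LEXICON, EMOJI_LEXICON)
for message in ["that match :(", "good game :)", "see you tomorrow"]:
    print(message, "->", classify(message, BASE_LEXICON), "/", classify(message, augmented))
```

A message whose only emotional signal is an emoticon is scored as neutral by the base lexicon and picks up a polarity only after the augmentation, which is the gap the added lexicons are meant to close.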

Because the project used more than one methodology and dataset, the results varied, but the best-performing configuration beat the baseline from previous studies by 7.36 percentage points.

The researchers believe that the lasting value of the project lies in the continued development of the LOOVE framework, which combines word-to-vector (W2V) embeddings trained on more than 331 million Twitch chat messages with the k-nearest neighbors (KNN) method.

The authors conclude that the driving force behind the framework is an emote pseudo-dictionary that can be used to predict the sentiment of unknown expressions. Using this pseudo-dictionary, they created a sentiment table covering 22,507 emotes, arguably the first interpretation of emotes at this scale.
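
One plausible reading of how embeddings and KNN could combine into such a pseudo-dictionary is sketched below. The three-dimensional vectors, the seed scores, and the names `EMBEDDINGS`, `SEED_SENTIMENT`, `cosine` and `infer_sentiment` are all hypothetical; real W2V vectors would be learned from chat logs and have far more dimensions, and the article does not describe the authors' exact procedure.

```python
import math

# Hypothetical hand-made 3-d embeddings standing in for learned W2V vectors.
EMBEDDINGS = {
    "happy": (0.9, 0.1, 0.0),
    "great": (0.8, 0.2, 0.1),
    "love": (0.85, 0.05, 0.1),
    "sad": (0.1, 0.9, 0.0),
    "awful": (0.05, 0.85, 0.1),
    "boring": (0.1, 0.8, 0.2),
    "FeelsGoodMan": (0.88, 0.12, 0.05),
    "FeelsBadMan": (0.08, 0.9, 0.05),
}

# Hypothetical seed lexicon: known words with a made-up score in [-1, 1].
SEED_SENTIMENT = {
    "happy": 0.8, "great": 0.7, "love": 0.9,
    "sad": -0.7, "awful": -0.9, "boring": -0.5,
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def infer_sentiment(token, k=3):
    """Average the seed scores of the k labeled words nearest to token."""
    target = EMBEDDINGS[token]
    ranked = sorted(
        SEED_SENTIMENT,
        key=lambda word: cosine(target, EMBEDDINGS[word]),
        reverse=True,
    )
    nearest = ranked[:k]
    return sum(SEED_SENTIMENT[word] for word in nearest) / len(nearest)


# Build the pseudo-dictionary for every token that has no seed score.
pseudo_dictionary = {
    token: infer_sentiment(token)
    for token in EMBEDDINGS
    if token not in SEED_SENTIMENT
}
print(pseudo_dictionary)
```

Because the score comes from neighbors in the embedding space rather than from a hand-written label, retraining the embeddings on newer chat logs would also refresh the inferred sentiment of an emote whose usage has shifted.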

https://www.unite.ai/understanding-twitch-emotes-in-sentiment-analysis/
