Author | Martin Anderson
Translated by | Marco Wei
Planning | Ling Min
In recent years, the growing use of emoji, emoticons, emotes, GIFs, and other non-textual forms of expression on social media platforms has made it increasingly difficult for data scientists to study sociological trends on a global scale, since those trends are usually inferred from what people say in public posts.
While natural language processing (NLP) has been a very powerful sentiment analysis tool over the past decade, it has failed to keep up with fast-changing, cross-language web slang and abbreviations, and it also struggles with image-based posts on social networking sites like Facebook and Twitter.
Because these few large social media platforms are the only hyperscale resources that this type of research can really rely on, AI has to keep pace with them.
In July, a paper proposed a new approach that uses a database of 30,000 tweets to categorize and predict the emotions a post triggers, based on the "reaction GIFs" (see figure below) that users post in reply. The paper found that such image-based responses are comparatively easy to measure, as most of them do not contain the great weakness of sentiment analysis: sarcasm.

Researchers refer to the animated reaction GIFs people use as "reduction metrics" and analyze their use in a paper published in 2021.
In the first half of 2021, a research team led by Boston University trained a machine learning model to predict what might become popular on Twitter. In August 2021, British scholars compiled a large dataset of Twitter sentiment in seven languages by comparing how people on social media use emoticons (faces made of numbers, letters and punctuation) and emoji (pictures of faces, objects and symbols).
Twitch emotes
Now, researchers in the United States have developed a machine learning method that can better understand, classify, and measure the constantly evolving pseudo-vocabulary of emotes on Twitch.
Emotes are neologisms on Twitch that are used to express emotions, moods, or niche in-jokes. Because they are by definition newly invented expressions, the hardest task for a machine learning system is not cataloguing the new emotes that are constantly being generated, since any catalogue may fall behind the rate at which they appear. Instead, the machine needs to better understand the structure behind these expressions, and to treat them as "temporary" words or compound phrases whose emotional meaning has to be judged entirely from context.
Emotes in the style of the happy frog (Pepe): simply changing the suffix gives a completely different meaning.
The image above is from a paper published by three researchers at a social media analytics firm in San Francisco, "FeelsGoodMan: Inferring Semantics of Twitch Neologisms" (https://arxiv.org/pdf/2108.08411.pdf).
Shifting meaning after going viral
Although these emotes are novel for a while and most are short-lived, Twitch users often dig up old emote material and recycle it, which causes trained sentiment analysis frameworks to misjudge it. Tracing how the meaning of an emote changes as it evolves often reveals that the emotion or intent it represents now is the complete opposite of what it was when first created.
For example, the researchers note that on Twitch the Pepe the Frog meme, once misappropriated by the far right, has almost completely lost that political meaning.
The image of the frog and its classic phrase "Feels Good Man" first appeared in a 2005 comic by American illustrator Matt Furie, and then became a far-right icon around 2010. Vox wrote in 2017 that although Furie had disowned this use of the character, the right-wing appropriation had stuck. The San Francisco researchers behind the paper do not agree:
In the early 2010s, Furie's cartoon frog was used in right-wing propaganda on various online forums, including 4chan (an anonymous imageboard). Since then, Furie has fought to reclaim the original meaning of Pepe the Frog, and on Twitch a large number of non-hateful, positive frog emotes have become mainstream, so that the happy frog and its sad frog counterpart are now used in ways much closer to their literal meaning.
Downstream problems
Studies that rely on this kind of meme image are often tripped up by the way its meaning shifts after it becomes popular. After all, these emotes have already been labeled "hate" or "nationalism (US)" and packaged into long-lived open source repositories. Later NLP research projects that use this data may not verify that the labels are still correct, either because they have no means of auditing the data or because they are not aware that auditing is needed.
The consequences of such stale labels are obvious: if a "political classification" algorithm is trained on a Twitch emote dataset from 2017, then the heavy use of sad frog emotes will make Twitch appear to have a very pronounced far-right leaning. Sure, maybe Twitch is full of far-right streamers, but you cannot rely on frog emotes to verify that.
The political significance of the frog meme seems to have been unceremoniously discarded by Twitch's 140 million users (41% of whom are under the age of 24). Collectively, they have very efficiently taken Pepe the Frog back from the political actors who appropriated the image, and redefined it in their own way.
Methods and data
The researchers found that labeled Twitch emote datasets were "virtually non-existent," even though a previous study counted a total of eight million Twitch emotes, 400,000 of which appeared within a single week.
A 2017 study that predicted popular emotes on Twitch still scored only 0.39, even after limiting the prediction to the top 30 emotes.
To address this problem, the San Francisco researchers took a new approach to existing data: they split it into training and test sets at an 80/20 ratio and applied naive Bayes, random forests (RF), support vector machines (SVMs, with linear kernels), and logistic regression, "traditional" machine learning algorithms that had not previously been applied to Twitch data.
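To make the setup concrete, here is a minimal sketch of an 80/20 split followed by one of the four algorithm families named above, a multinomial naive Bayes classifier, written in plain Python. It is not the researchers' code: the chat lines, labels, emote tokens, and the names `split_80_20` and `NaiveBayes` are all hypothetical, and a real experiment would use a full dataset and a proper machine learning library.

```python
import math
import random
from collections import Counter, defaultdict

# Toy labeled chat lines (hypothetical; not taken from the paper's data).
DATA = [
    ("FeelsGoodMan nice play", "pos"),
    ("FeelsGoodMan great stream", "pos"),
    ("PogChamp what a win", "pos"),
    ("PogChamp nice clutch", "pos"),
    ("great win FeelsGoodMan", "pos"),
    ("FeelsBadMan lost again", "neg"),
    ("FeelsBadMan bad luck", "neg"),
    ("that was bad FeelsBadMan", "neg"),
    ("lost the round FeelsBadMan", "neg"),
    ("bad play lost", "neg"),
]


def split_80_20(rows, seed=0):
    """Shuffle a copy of rows and return (train, test) at an 80/20 ratio."""
    rows = rows[:]
    random.Random(seed).shuffle(rows)
    cut = int(len(rows) * 0.8)
    return rows[:cut], rows[cut:]


class NaiveBayes:
    """Multinomial naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, rows):
        self.class_counts = Counter(label for _, label in rows)
        self.token_counts = defaultdict(Counter)
        for text, label in rows:
            self.token_counts[label].update(text.split())
        self.vocab = {t for c in self.token_counts.values() for t in c}
        return self

    def predict(self, text):
        total = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n in self.class_counts.items():
            counts = self.token_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n / total)  # log prior of the class
            for tok in text.split():
                if tok in self.vocab:  # skip tokens never seen in training
                    score += math.log((counts[tok] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


train, test = split_80_20(DATA)
model = NaiveBayes().fit(train)
accuracy = sum(model.predict(t) == y for t, y in test) / len(test)
print(f"train={len(train)} test={len(test)} accuracy={accuracy:.2f}")
```

With only ten examples the accuracy figure means nothing; the point is the shape of the pipeline: split once, fit on the 80 percent, and score only on the held-out 20 percent.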
The algorithms reached a performance of 63.8 percent, higher than the baseline of previous studies, and the LOOVE (short for "Learning Out Of Vocabulary Emotions") framework developed by the researchers made it possible to recognize new words and add these entirely new definitions to existing models.
The LOOVE (Learning Out Of Vocabulary Emotions) framework structure developed by the researchers
LOOVE relies on embeddings trained without supervision, which are regularly retrained and fine-tuned, so it avoids the need for labeled datasets. Given the number of emotes and the speed at which they evolve, keeping a labeled dataset up to date in real time is simply unrealistic.
In the project, the researchers trained a "pseudo-dictionary" of emotes on an unlabeled Twitch dataset; during training, the model generated 444,714 embeddings of emoji, emoticons, and emotes.
In addition, they added emote and emoji vocabulary to the VADER lexicon and, alongside the previously mentioned EC dataset, used three publicly available datasets sampled from Twitter, Rotten Tomatoes, and Yelp for ternary sentiment classification.
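The effect of adding emote entries to a sentiment lexicon can be illustrated with a deliberately simplified sketch. The small dictionary below only stands in for a lexicon such as VADER's; the scores, the emote names, the `classify` function, and the 0.5 threshold are all assumptions made for illustration, and VADER's real scoring rules are considerably more involved than a mean of token scores.

```python
# Toy base lexicon standing in for a real one; the scores are illustrative only.
BASE_LEXICON = {"good": 1.9, "win": 2.8, "bad": -2.5, "lost": -1.3}

# Hypothetical emote scores, e.g. taken from an automatically built pseudo-dictionary.
EMOTE_SCORES = {"FeelsGoodMan": 2.0, "FeelsBadMan": -2.0}


def classify(text, lexicon, threshold=0.5):
    """Return 'positive', 'negative' or 'neutral' from the mean score of known tokens."""
    scores = [lexicon[tok] for tok in text.split() if tok in lexicon]
    mean = sum(scores) / len(scores) if scores else 0.0
    if mean >= threshold:
        return "positive"
    if mean <= -threshold:
        return "negative"
    return "neutral"


extended = {**BASE_LEXICON, **EMOTE_SCORES}
message = "FeelsBadMan again"
before = classify(message, BASE_LEXICON)  # the emote is unknown to the base lexicon
after = classify(message, extended)       # the emote now carries a score
print(before, after)
```

A message whose only sentiment-bearing token is an emote is invisible to the base lexicon and falls into the neutral class; once the emote has an entry, the same message lands in one of the other two classes.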
Because the project used more than one methodology and dataset, the results varied, but the best-performing configuration beat the baseline of previous studies by 7.36 percentage points.
The researchers believe the lasting value of the project lies in the continued development of the LOOVE framework, whose embeddings were trained on more than 331 million Twitch chat messages using word-to-vector (W2V) and the K-nearest neighbor method (KNN).
The authors conclude that the driving force behind the framework is a pseudo-dictionary of emotes that can be used to predict the sentiment of unknown expressions. Using this pseudo-dictionary, they created a sentiment table covering 22,507 emotes, arguably the first emote interpretation at this scale.
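One plausible reading of how embeddings and KNN combine into such a pseudo-dictionary is sketched below: an emote with no known sentiment borrows the average score of its nearest labeled neighbours in embedding space. This is an illustration under stated assumptions, not the LOOVE implementation. The three-dimensional vectors, the lexicon scores, the emote names, and the function `knn_sentiment` are all made up; real W2V embeddings would be learned from chat logs and have many more dimensions.

```python
import math

# Hypothetical 3-d embeddings; real W2V vectors would be learned and much larger.
EMBEDDINGS = {
    "happy": (0.9, 0.1, 0.0),
    "great": (0.8, 0.2, 0.1),
    "love": (0.85, 0.15, 0.05),
    "sad": (-0.9, 0.1, 0.0),
    "awful": (-0.8, 0.2, 0.1),
    "hate": (-0.85, 0.1, 0.05),
    "FeelsGoodMan": (0.88, 0.12, 0.02),  # emote with no lexicon entry
    "FeelsBadMan": (-0.87, 0.12, 0.03),  # emote with no lexicon entry
}

# Known sentiment scores for ordinary words (made-up values).
LEXICON = {"happy": 2.5, "great": 3.0, "love": 3.2,
           "sad": -2.1, "awful": -2.8, "hate": -2.7}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def knn_sentiment(token, k=3):
    """Average the lexicon scores of the k labeled words nearest to token."""
    target = EMBEDDINGS[token]
    ranked = sorted(LEXICON, key=lambda w: cosine(target, EMBEDDINGS[w]), reverse=True)
    neighbours = ranked[:k]
    return sum(LEXICON[w] for w in neighbours) / k


# Every embedded token without a lexicon entry gets an inferred score.
pseudo_dictionary = {t: knn_sentiment(t) for t in EMBEDDINGS if t not in LEXICON}
for emote, score in pseudo_dictionary.items():
    print(f"{emote}: {score:+.2f}")
```

Because the scores are inferred from unlabeled embeddings, the table can be rebuilt whenever the embeddings are retrained, which matches the article's point that hand-labeling cannot keep up with new emotes.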
https://www.unite.ai/understanding-twitch-emotes-in-sentiment-analysis/