
ChatGPT labels data 20 times more cheaply than humans and outperforms them on 80% of tasks

Pine, reporting from Aofeisi

QbitAI | Official account: QbitAI

AI has taken over yet another "human job", and this one is closely tied to training AI itself:

Data labeling.

A study from the University of Zurich found that humans have essentially no advantage over ChatGPT in either cost or performance:

On cost, ChatGPT averages less than $0.003 per annotation, about 20 times cheaper than crowdsourcing platforms.

On performance, across tasks such as relevance, stance, and theme, ChatGPT also "crushes" human crowd workers, coming out ahead on four tasks out of five.

After the paper came out, some commenters joked that the claim "generating training data requires human labor" is now a thing of the past.

Others asked: "Is there hope that the digitization and restoration of ancient books will now speed up?"

And some, happy to stir the pot, tweeted bluntly:

This directly threatens the livelihoods of platform workers.

So how exactly did ChatGPT take over the "job" of data annotators?

ChatGPT has an advantage on 80% of tasks

First, we need to understand what data labeling actually involves.

Simply put, data labeling here means annotating text such as social media posts: sorting it by topic or concept, or judging its stance and sentiment.

The labeled data can then serve as a training set or an evaluation benchmark for NLP models.

In the past, this work was done by hand, for example through MTurk, a crowdsourcing platform widely used for data annotation.

Human annotation also involves a finer division of labor, for example between professionally trained data annotators and crowd workers on platforms like MTurk.

The former has an advantage in producing high-quality data but naturally has a higher cost, while the latter is cheaper but the quality fluctuates with the difficulty of the task.

So the research team set out to investigate the potential of large language models (LLMs) for this work. They compared the annotation performance of ChatGPT (based on GPT-3.5), used zero-shot with no additional training, against that of MTurk crowd workers.

The comparison is based on a sample of 2,382 tweets previously collected by the research team.

Both ChatGPT and the MTurk workers labeled the tweets for five tasks: "relevance, stance, theme, policy, and utility."
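To make "zero-shot" concrete, here is a minimal sketch of how such an annotation request could be assembled: the model receives only the task instruction and the allowed labels, with no labeled examples. The instruction text, the label names, and the `fake_model` stand-in are all assumptions for illustration; they are not the study's actual codebook or API calls.

```python
from typing import Callable, List


def build_zero_shot_prompt(instruction: str, labels: List[str], tweet: str) -> str:
    """Assemble a prompt that contains task instructions only (no labeled examples)."""
    options = ", ".join(labels)
    return (
        f"{instruction}\n"
        f"Answer with exactly one of: {options}.\n\n"
        f"Tweet: {tweet}\n"
        "Label:"
    )


def parse_label(reply: str, labels: List[str]) -> str:
    """Map a free-text model reply onto one of the allowed labels."""
    cleaned = reply.strip().lower()
    # Check longer labels first so "irrelevant" is not mistaken for "relevant".
    for label in sorted(labels, key=len, reverse=True):
        if label.lower() in cleaned:
            return label
    return "unparsed"


def annotate(tweet: str, instruction: str, labels: List[str],
             model: Callable[[str], str]) -> str:
    """Label one tweet using any callable that maps a prompt to a reply."""
    prompt = build_zero_shot_prompt(instruction, labels, tweet)
    return parse_label(model(prompt), labels)


def fake_model(prompt: str) -> str:
    """Hypothetical stand-in for a real LLM call; always gives the same reply."""
    return " Relevant."


# Hypothetical instruction and labels, invented for this example.
INSTRUCTION = "Decide whether the tweet is relevant to the topic being studied."
LABELS = ["relevant", "irrelevant"]

print(annotate("Platforms should explain why posts get removed.",
               INSTRUCTION, LABELS, fake_model))
```

Swapping `fake_model` for a function that calls a real model is the only change needed to run this against an actual LLM; the parsing step matters because model replies are free text rather than clean labels.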

Two criteria were used for evaluation:

Accuracy: the percentage of labels from ChatGPT and from MTurk crowd workers that match the correct labels;

Intercoder agreement: the share of items on which two coders of the same type assign the same label, computed separately for ChatGPT, MTurk crowd workers, and professional data annotators.
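The two criteria can be sketched in a few lines. This is an illustrative version using simple percent agreement, averaged over every pair of coders in a group; the toy labels below are invented for the example and are not data from the study.

```python
from itertools import combinations
from typing import List, Sequence


def accuracy(predicted: Sequence[str], gold: Sequence[str]) -> float:
    """Fraction of items whose predicted label matches the reference label."""
    if len(predicted) != len(gold):
        raise ValueError("label sequences must have the same length")
    if not gold:
        raise ValueError("need at least one labeled item")
    matches = sum(p == g for p, g in zip(predicted, gold))
    return matches / len(gold)


def intercoder_agreement(codings: List[Sequence[str]]) -> float:
    """Mean pairwise percent agreement among coders of the same type."""
    pairs = list(combinations(codings, 2))
    if not pairs:
        raise ValueError("need at least two coders")
    # Agreement between two coders is the share of items given the same label.
    return sum(accuracy(a, b) for a, b in pairs) / len(pairs)


# Toy data (invented): reference labels and two runs of the same annotator type.
gold = ["relevant", "irrelevant", "relevant", "relevant"]
run_1 = ["relevant", "irrelevant", "irrelevant", "relevant"]
run_2 = ["relevant", "irrelevant", "relevant", "relevant"]

print(accuracy(run_1, gold))                 # 0.75
print(intercoder_agreement([run_1, run_2]))  # 0.75
```

Percent agreement is the simplest choice here; chance-corrected measures such as Cohen's kappa are a common alternative when label distributions are skewed.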

It turns out that ChatGPT outperforms MTurk crowd workers in accuracy on four of the five tasks.

On intercoder agreement, ChatGPT surpasses even professional data annotators on all tasks.

On cost, as mentioned at the beginning, ChatGPT is on average about 20 times cheaper per annotation than human labor, not to mention that AI can work 24/7.
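The cost gap is easy to work through with the article's own figures. The sketch below treats $0.003 as an upper bound (the article says "less than"), derives the implied crowdsourcing cost from the 20x ratio, and assumes one annotation per tweet, which is a simplification for illustration only.

```python
# Figures quoted in the article.
CHATGPT_COST_PER_ANNOTATION = 0.003  # USD, upper bound ("less than $0.003")
COST_RATIO = 20                      # crowdsourcing is "about 20 times" more expensive
N_TWEETS = 2382                      # size of the tweet sample


def implied_human_cost(ai_cost: float, ratio: float) -> float:
    """Per-annotation crowdsourcing cost implied by the quoted ratio."""
    return ai_cost * ratio


def total_cost(cost_per_annotation: float, n_items: int, passes: int = 1) -> float:
    """Total cost of labeling n_items, each labeled `passes` times."""
    return cost_per_annotation * n_items * passes


human_cost = implied_human_cost(CHATGPT_COST_PER_ANNOTATION, COST_RATIO)

# Assumption: a single annotation per tweet.
print(f"Implied crowdsourcing cost per annotation: ${human_cost:.2f}")
print(f"ChatGPT, one pass over the sample: ${total_cost(CHATGPT_COST_PER_ANNOTATION, N_TWEETS):.2f}")
print(f"Crowdsourcing, one pass over the sample: ${total_cost(human_cost, N_TWEETS):.2f}")
```

Because the per-annotation figure is an upper bound and the ratio is approximate, these totals are rough orders of magnitude rather than the study's actual spending.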

Not everyone was convinced by the research team's conclusion, however. One commenter said:

These five tasks are too narrow, and so is their range of difficulty. A conclusion drawn from them alone is of questionable reliability.

Others mocked the study for its small sample size:

Only 2,382 tweets were sampled.

The threat to livelihoods goes beyond data labeling

For now, it is hard to say whether AI will completely replace any particular kind of job, but it will affect human work to some extent.

Last week, OpenAI released an analysis report estimating that about 80% of the U.S. workforce could have at least 10% of their work tasks affected by GPT models, and that about 19% of workers could see at least half of their tasks affected.

And the higher the salary, the more exposed the profession tends to be.

The report also lists specific occupations likely to be affected, in descending order of exposure:

Translation practitioners, text creators (including poets, writers, etc.), publicists, mathematicians, tax preparers, blockchain engineers, financial workers, media practitioners...

In addition, OpenAI's CEO Altman has said on more than one occasion that "AI will replace some existing jobs."

Not long ago, the Midjourney V5 upgrade also left many human artists saying their jobs are no longer safe.

Hmm... do you think your job is still safe?

Paper Address:

https://arxiv.org/abs/2303.15056

Reference Links:

https://twitter.com/arankomatsuzaki/status/1640521970608402435
