
Baidu Intelligent Cloud launches the Xiling AI Sign Language Platform: a thousand-word text becomes sign language in seconds

Zhidongxi (WeChat official account: zhidxcom)

Author | Cheng Qian

Editor | Xinyuan

Zhidongxi reported on March 3 that Baidu Intelligent Cloud today launched the Xiling AI Sign Language Platform, which further lowers the barrier to producing and generating sign language content and builds a barrier-free information channel for 28 million hearing-impaired people.

Wu Tian, vice president of Baidu Group, said: "Public-facing digital humans such as Luo Tianyi, and service-oriented digital humans such as those used by banks, are appearing more and more in our lives."

Behind a digital human's appearance, movements, and services is a whole series of AI technologies. Through cross-modal innovation and technical support, the Baidu Intelligent Cloud Xiling AI Sign Language Platform reduces the deployment cost of sign language translation and improves deployment efficiency.

The launch event was hosted by the first AI sign language anchor created on the "Baidu Intelligent Cloud Xiling" digital human platform. The anchor had previously gone live during the Winter Olympics, providing 24-hour sign language translation services for hearing-impaired viewers.


▲The first AI sign language anchor created on the "Baidu Intelligent Cloud Xiling" digital human platform

Last year, Baidu Intelligent Cloud launched its digital human platform, "Baidu Intelligent Cloud Xiling", which provides low-cost technical support for generating and operating digital humans.

Today's release of the Baidu Intelligent Cloud Xiling AI Sign Language Platform targets a more specialized and smaller group, hearing-impaired people, and uses technology to support public welfare.

1. Sign language digital human platform: hour-level deployment, plug and play

The Baidu Intelligent Cloud Xiling AI Sign Language Platform consists of the AI sign language platform itself and an AI sign language all-in-one machine. The online platform can be deployed within hours. The all-in-one machine comes in two models, the fully offline V3 and the device-cloud integrated P3, and can be plugged in and used on site, offline.


▲AI sign language platform all-in-one machine

Li Shiyan, head of the Baidu Intelligent Cloud AI Human-Computer Interaction Laboratory, said that the platform has five major characteristics: it hears clearly, translates accurately, performs well, deploys quickly, and produces content quickly.

To convert video and speech into text accurately, Baidu Intelligent Cloud built the SMLTA speech recognition model, which recognizes speech data accurately; the recognition accuracy for sign language translation has reached 98%.

Using the sign language translation engine, the researchers built a natural sign language NLP translation model based on the "national sign language grammar rules". Working with the national sign language expert group, they generated nearly 10 million "natural sign language corpus" sentences as training data.

The platform uses a digital human driving engine that coordinates portrait rendering, the action engine, lip-shape driving, and expression driving. It also includes an action fusion algorithm designed specifically for sign language performance, producing coherent movement that is closer to the way real signers express themselves.


▲Action fusion algorithm

While maintaining recognition accuracy, the Baidu Intelligent Cloud Xiling AI Sign Language Platform supports hour-level deployment and minute-level production of real-time synthesized sign language video.

Li Shiyan said that the ratio of sign language interpreters to hearing-impaired people in China is 1:2080, and that an information gap separates most hearing-impaired people from the rest of society.

Both the "14th Five-Year Plan for the Construction of Barrier-free Environment" and the "14th Five-Year Plan for the Development of Radio, Television and Online Audio-Visual Science and Technology" call for attention to information accessibility and AI-based barrier-free broadcasting.

Sign language differs from speech in that it is a visual language. In spoken Chinese we can simply say "the cat catches the mouse", but a visual language is comparatively slower: the viewer needs to see the cat, then the mouse, and then the action of catching before the information is conveyed accurately.

Baidu's sign language digital human therefore integrates AI capabilities across the whole pipeline. Voice and video data are first converted into Chinese text by the speech recognition engine, then converted into sign language codes by the translation engine. Finally, with the help of the digital human action fusion algorithm, the system generates sign language video that reflects what was heard clearly, translates it accurately, and performs it well.
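The three-stage flow described above (speech recognition, translation into sign language codes, then rendering) can be illustrated with a minimal Python sketch. Everything in it is a hypothetical stand-in: the function names, the `SignClip` type, and the fixed 600 ms per sign are assumptions made for illustration, and the stages are trivial placeholders rather than Baidu's actual engines or API.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SignClip:
    """One rendered sign (hypothetical type, not part of any Baidu API)."""
    code: str          # sign language code for a single sign
    duration_ms: int   # how long the avatar performs the sign


def recognize_speech(audio: bytes) -> str:
    """Stage 1 placeholder: a real system would run a speech recognition model.
    Here the 'audio' is simply UTF-8 text so the sketch stays runnable."""
    return audio.decode("utf-8")


def translate_to_sign_codes(text: str) -> List[str]:
    """Stage 2 placeholder: one upper-case code per whitespace-separated word.
    A real translation engine would also reorder and condense the sentence."""
    return [word.upper() for word in text.split()]


def render_clips(codes: List[str], ms_per_sign: int = 600) -> List[SignClip]:
    """Stage 3 placeholder: attach an assumed fixed duration to each sign.
    A real action engine would blend the motions between adjacent signs."""
    return [SignClip(code, ms_per_sign) for code in codes]


def run_pipeline(audio: bytes) -> List[SignClip]:
    """Chain the three stages in the order the article describes."""
    text = recognize_speech(audio)
    codes = translate_to_sign_codes(text)
    return render_clips(codes)


clips = run_pipeline(b"cat catches mouse")
for clip in clips:
    print(clip.code, clip.duration_ms)
```

Keeping each stage behind its own function means any one of them can be swapped for a real model without touching the others, which is the main point of the sketch.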

At present, sign language translation faces three major challenges: difficult deployment, scarce data, and demanding requirements.

First, railway stations, airports, hospitals, and similar places are all settings where hearing-impaired people need support, but real-world environments are diverse, and their network and acoustic conditions are complex.

Second, sign language is a genuinely low-resource language. The small volume of available data limits improvements in the quality of digital sign language translation.

Third, sign language expresses meaning differently from the language of hearing people, so a sign language translation system must not only be efficient but also guarantee translation accuracy.

As a result, there is growing demand for sign language translation platforms that deploy faster and cost less.

2. Optimized for online and offline scenarios: real-time, accurate sign language translation

The Baidu Intelligent Cloud Xiling AI Sign Language Platform has four major functions: video sign language synthesis, live-broadcast sign language synthesis, text to sign language, and speech to sign language. All are optimized for online and offline scenarios.

Online, the platform covers three forms of content: text and images, video, and live broadcast. It has been adapted and optimized for each type of data, with video sign language synthesis for news, films, TV series, and similar content; support for text content such as news articles, documents, and novels; and support for live scenarios such as sports broadcasts and live events.

Notably, the platform's text-to-sign-language function needs only a few seconds to synthesize sign language from a thousand-word text.


▲ Text-to-sign language

The online version is already in use in the CCTV News app, where it has generated more than 200 sign language videos with a combined total of more than 100 million views.

Beyond online scenarios, there is also urgent demand for barrier-free service windows at offline venues.

According to the 2015 "Study on the Needs of Sign Language Interpreters and Translation Services for Deaf People in China", only 2.75% of deaf people who bought tickets at railway station sign language translation windows succeeded in purchasing their train tickets, and 3.56% of hearing-impaired hospital users would refuse to seek medical treatment because communication is inconvenient.

Barrier-free facilities that are fast to deploy and low in cost are therefore especially important, and the Baidu Intelligent Cloud Xiling AI Sign Language Platform makes it possible to set up barrier-free windows quickly.

3. Building a sign language translation model: three major difficulties

Finally, Yuan Tiantian, deputy dean of the Technical College for the Deaf at Tianjin University of Technology, Gao Liang, director of Baidu's speech technology department, and He Zhongjun, chairman of the Baidu Artificial Intelligence Technology Committee, discussed in depth the technology behind the product.


▲Roundtable forum with Yuan Tiantian, deputy dean of the Technical College for the Deaf at Tianjin University of Technology, Gao Liang, director of Baidu's speech technology department, and He Zhongjun, chairman of the Baidu Artificial Intelligence Technology Committee

Yuan Tiantian said that in working with deaf students, her team found that hearing-impaired students and hearing people lack effective means of communicating with each other and can feel anxious when they try. Using artificial intelligence to assist that communication is a good approach.

The platform's characteristics are inseparable from Baidu AI's speech technology. Gao Liang said that the key is solving the real-time problem in live broadcast scenarios: recognition for the digital human in the AI sign language platform is continuous and has to be both fast and accurate, which places higher demands on the model. Baidu Intelligent Cloud uses its latest large speech model technology to achieve higher accuracy while recognizing in real time.

He Zhongjun said that sign language translation is actually harder than traditional text translation, because it combines speech processing, text translation, and vision technology. Converting text into sign language codes in particular poses three major difficulties. First, the word order differs and expressions do not correspond directly, so the word order has to be adjusted. Second, the vocabularies differ: the general sign language dictionary contains only 8,000 words, far fewer than real applications require. Third, speech is faster than signing, so the translation has to condense the language to keep up in real time.

Building on its accumulated machine translation technology, Baidu Intelligent Cloud built a sign language translation model that learns automatically from real training data how to control output length and how to work with speech recognition output, so that it produces coherent sign language sentences.

In practice, sign language translation more commonly takes the form of Signed Chinese, in which words and sentences are rendered in gestures following the word order of hearing people's speech. Natural sign language, however, is closer to the reading habits of hearing-impaired people. It requires adjusting the word order and omitting unnecessary words so that the expression is more accurate and more concise.
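The difference between the two styles can be shown with a toy Python sketch for a simple subject-verb-object sentence. The role tags, the rule of dropping function words, and the rule of signing the action last are illustrative assumptions only; they are not the national sign language grammar rules and not the platform's translation model.

```python
from typing import List, Tuple

# Each token is (word, role); the roles are hypothetical labels for this sketch:
# "subj", "verb", "obj", or "func" for a function word.
Tagged = List[Tuple[str, str]]


def signed_order(tagged: Tagged) -> List[str]:
    """Signed-Chinese style: sign every word in the spoken word order."""
    return [word.upper() for word, _ in tagged]


def natural_order(tagged: Tagged) -> List[str]:
    """Toy 'natural' style: drop function words, show the participants
    first, and put the action last (an assumed rule for illustration)."""
    content = [(word, role) for word, role in tagged if role != "func"]
    participants = [word.upper() for word, role in content if role != "verb"]
    actions = [word.upper() for word, role in content if role == "verb"]
    return participants + actions


sentence = [
    ("the", "func"),
    ("cat", "subj"),
    ("catches", "verb"),
    ("the", "func"),
    ("mouse", "obj"),
]

print(signed_order(sentence))
print(natural_order(sentence))
```

The toy version shortens the sequence from five signs to three, which also hints at why condensing matters when signing has to keep pace with faster speech.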

He Zhongjun said that existing machine learning technology depends on big data, yet natural sign language datasets are particularly small, and almost no sign language data was available for training. The researchers therefore set up a dedicated sign language project and worked with hearing-impaired students at Tianjin University of Technology to label a large amount of real data, which, combined with advanced algorithms, produced the current results.

Yuan Tiantian added that facial expressions, body movements, and gestures are equally important in the way hearing-impaired people express themselves. By integrating these multiple channels of expression, the Baidu Intelligent Cloud Xiling AI Sign Language Platform comes closer to their expressive habits.

Conclusion: Building a bridge between AI technology and barrier-free communication

Using AI to drive the generation of sign language translation video can further lower the technical barrier to sign language translation. Baidu Intelligent Cloud is committed to covering multiple scenarios, including radio and television, finance, travel, medical care, government and enterprise services, and cultural tourism, to bring convenience to hearing-impaired people in many areas of life.

The Baidu Intelligent Cloud Xiling AI Sign Language Platform builds speech interaction modes and professional terminology recognition for different scenarios, improving the adaptability of AI sign language platforms in more specialized vertical fields and allowing sign language digital humans to build communication bridges for more hearing-impaired people.
