Neural Machine Translation: Progress and Challenges

Dai Guangrong and Liu Siqi

Abstract: With the acceleration of globalization and the growing intensity of international communication, traditional human translation alone can no longer meet the rapidly growing demand for translation. Machine translation, with its convenience and speed, has gradually entered people's lives and ushered in a new stage of development. As the latest paradigm of machine translation, neural machine translation has greatly improved translation quality, and some experts claim that machine translation can now achieve results "close to human translation" or "equivalent to human translation", which has triggered a debate in academia over whether machine translation should replace human translation. In what respects has neural machine translation made progress, what problems and challenges does it still face, and along what lines can its quality be further improved? Focusing on these questions, this paper briefly reviews the quality gains of neural machine translation, analyzes the problems and challenges it faces, and discusses paths to quality improvement at multiple levels, in order to provide a reference for the research and development of neural machine translation systems.

Keywords: Neural Machine Translation, Machine Translation Quality, Quality Improvement, Machine Translation Challenge

1. Introduction

Machine translation (MT) refers to the process by which a computer converts one natural language into another, drawing on knowledge from computer science, mathematics, and linguistics (Feng 2004: 1). From the initial rule-based approach, through statistical methods, to the neural network approach, machine translation technology has grown increasingly mature, greatly improving translation speed and efficiency as well as translation quality; some experts even claim that neural machine translation can achieve "near human parity" or "human parity" (Hassan et al. 2018). In response to the public discussion of "whether machine translation can replace human translation" triggered by these quality gains, scholars have expressed a range of views (Zhu 2018; Pym & Torres-Simón 2021).

After decades of development, machine translation has gradually gained recognition and trust. China has issued the New Generation Artificial Intelligence Development Plan, elevating artificial intelligence to a national strategy. According to the 2022 China Translation and Language Service Industry Development Report, language service companies broadly endorse the "machine translation + post-editing" working model; the application of machine translation in the industry is becoming ever more widespread, and its importance and necessity are increasingly apparent.

Current machine translation encompasses human-computer co-translation, human-computer interaction, and neural machine translation (Wang & Kong Xinke 2021: 75). Neural machine translation (NMT) is a new-generation machine translation method that uses artificial neural networks modeled on the brain's neurons to translate language in an end-to-end fashion. Compared with other machine translation methods, neural machine translation offers stronger generalization ability, a simpler architecture, and less reliance on domain expertise (Sutskever et al. 2014; Bentivogli et al. 2016; Toral & Sánchez-Cartagena 2017; Li et al. 2018). Neural machine translation is not perfect, however: it still handles long sentences poorly, adapts poorly across domains, and produces translations of unstable quality (Qin Ying 2018; Guo Wanghao et al. 2021); a considerable gap remains between its output and human translation, and there is still a long way to go before machine translation is comparable to human translation. In view of this, this paper analyzes the progress of neural machine translation over earlier methods, examines its challenges, and discusses paths to improving its quality in the era of big data, in order to provide a reference for machine translation research.

2. Comparative advantages of neural machine translation over other machine translation approaches

The name artificial neural network (ANN) derives from the biological term "neuron". The human brain contains hundreds of millions of neurons, which are connected by dendrites and axons into biological neural networks (BNNs). A neuron receives stimulus signals through its dendrites and, depending on signal strength, transmits them along its axon to other neurons, ultimately producing a response. Artificial neural networks mimic this mode of operation: neurons at one end receive an information stimulus and pass it to the next layer of neurons, the connections between layers carry different weights, the signal is transmitted and weighted layer by layer, and the neurons at the other end finally produce a response (Koehn 2020: 30-31). The history of neural networks stretches back more than 70 years, but it was not until Google launched GNMT (Google Neural Machine Translation) in 2016 that neural machine translation became widely known to the public. Over the last decade, with the gradual accumulation of translation corpora, the massive collection of electronic texts on the World Wide Web, and the growth of computing power, machine translation has been integrated ever more effectively with neural network models (O'Brien 2020: 378).
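
To make the layer-by-layer weighting just described concrete, here is a minimal sketch (our own illustration, not drawn from the cited sources) of a signal passing through a tiny feedforward network in Python:

```python
# A minimal illustration of how a stimulus is passed and weighted layer by
# layer in a feedforward neural network; weights here are random, untrained.
import numpy as np

def relu(x):
    return np.maximum(0, x)

rng = np.random.default_rng(0)
x = rng.normal(size=4)            # input "stimulus" received at one end
W1 = rng.normal(size=(4, 8))      # weights between input and hidden layer
W2 = rng.normal(size=(8, 3))      # weights between hidden and output layer

h = relu(x @ W1)                  # hidden neurons weight and transform the signal
y = h @ W2                        # output neurons produce the "response"
print(y)
```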

From the perspective of historical development, machine translation falls broadly into two categories: the rationalist approach, i.e. rule-based machine translation (RBMT), and the empirical approach, also known as corpus-based machine translation (CBMT), which in turn comprises example-based machine translation (EBMT), statistical machine translation (SMT), and neural machine translation (NMT) (Li et al. 2018).

Rule-based machine translation relies on language rules compiled by experts in the form "IF ... THEN ...": the source text is matched against the translation rules, and if it satisfies a given rule, the corresponding target-language output under that rule is produced. This method suffers from high compilation costs in time and labor, an inability to keep pace with language change, contradictions between rules, and difficulty in expanding coverage (Li et al. 2015; Feng & Ding 2021).
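
As a toy illustration of the "IF ... THEN ..." idea (our own sketch with an invented two-rule "grammar", not an actual RBMT system), the coverage problem is visible as soon as an input falls outside the hand-written rules:

```python
# Toy rule-based translation: apply the first hand-written rule that fires.
# Real RBMT systems use large hand-crafted grammars and dictionaries.
import re

RULES = [
    # (IF the source matches this pattern, THEN produce this target template)
    (re.compile(r"^(\w+) is a (\w+)$"), "{0} est un {1}"),
    (re.compile(r"^I like (\w+)$"),     "J'aime {0}"),
]

def rbmt_translate(source: str) -> str:
    for pattern, template in RULES:
        m = pattern.match(source)
        if m:                                    # IF the rule matches ...
            return template.format(*m.groups())  # ... THEN emit its target
    return source                                # no rule covers this input

print(rbmt_translate("I like apples"))  # toy lexicon: the noun is left untranslated
```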

Example-based machine translation draws on a database of translation examples and translation dictionaries: the source text is matched against similar translation examples, the differences are identified, and the translation dictionary is consulted to fill them in. This method places high demands on corpus size and, combined with system limitations, makes it difficult to exploit examples fully, which ultimately limits its adaptability; it faded from the machine translation stage after statistical machine translation emerged (Hou Qiang & Hou Ruili 2019: 31).

Statistical machine translation consists of a translation model and a language model. The translation model learns translation knowledge from a bilingual parallel corpus, while the language model learns the patterns of the target language from a monolingual corpus. Statistical machine translation requires no manually written rules; by changing how translation knowledge is acquired, it broke through the earlier bottleneck and "dominated" the machine translation stage until neural machine translation appeared. However, because it relies purely on statistics and can exploit only limited linguistic knowledge, it easily produces output in which individual words and phrases are correctly matched yet the whole is incoherent and obscure, seriously affecting readability (Li et al. 2015: 4).
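
The division of labor between the two models is usually written as the standard noisy-channel decision rule (a textbook formula, not quoted from the works cited above): the chosen translation $\hat{e}$ of a source sentence $f$ maximizes the product of the two models,

$$\hat{e} = \arg\max_{e} P(e \mid f) = \arg\max_{e} \underbrace{P(f \mid e)}_{\text{translation model}}\;\underbrace{P(e)}_{\text{language model}},$$

where $P(f \mid e)$ is learned from the bilingual parallel corpus and $P(e)$ from monolingual target-language data.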

Neural machine translation is composed of an input layer, hidden layers, and an output layer. The input layer encodes the source text into vectors; the hidden layers process these vectors into feature representations the computer can work with, extracting features along different dimensions through repeated processing; and the output layer finally converts the processed vectors into the target language (Koehn 2020; Feng Zhiwei 2010; Qin Ying 2018; Xiao Tong & Zhu 2021). Unlike the machine translation models above, neural machine translation introduces techniques such as long short-term memory networks and attention mechanisms, which make the output more accurate and fluent and improve the readability of the translation (Hou Qiang & Hou Ruili 2019). Because language is represented inside a neural network as real-valued vectors, it is difficult to explain from a linguistic perspective what exactly happens within the network (Liu Yang 2017: 1147). With the development of artificial intelligence, neural machine translation is showing a trend toward interpretability, which we discuss further in Section 5.
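
The encode-attend-decode pipeline described above can be sketched in a few lines of Python. This is a deliberate simplification we add for illustration, with a toy vocabulary and random untrained weights; real systems learn LSTM- or Transformer-based encoders and decoders from parallel data:

```python
# Simplified sketch of the encode -> attend -> decode pipeline of NMT.
import numpy as np

rng = np.random.default_rng(0)

SRC_VOCAB = {"the": 0, "cat": 1, "sleeps": 2}
TGT_VOCAB = ["le", "chat", "dort"]
D = 16  # embedding / hidden size

E_src = rng.normal(size=(len(SRC_VOCAB), D))   # input layer: source embeddings
W_out = rng.normal(size=(D, len(TGT_VOCAB)))   # output layer weights

def softmax(x):
    x = x - x.max()
    e = np.exp(x)
    return e / e.sum()

def encode(tokens):
    # "Hidden layer": here simply the embedded source vectors; a real encoder
    # would run an RNN or Transformer over them.
    return np.stack([E_src[SRC_VOCAB[t]] for t in tokens])

def decode_step(enc_states, query):
    # Attention: weight each source vector by its similarity to the query.
    weights = softmax(enc_states @ query)
    context = weights @ enc_states
    # Output layer: turn the context vector into a target-word distribution.
    return softmax(context @ W_out), weights

enc = encode(["the", "cat", "sleeps"])
probs, attn = decode_step(enc, query=enc.mean(axis=0))
print("attention over source words:", attn.round(2))
print("predicted target word:", TGT_VOCAB[int(probs.argmax())])
```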

3. Quality evaluation methods and quality improvements of neural machine translation

Neural machine translation is a leap in the history of machine translation, and it has quickly become the main object of machine translation research since its birth.

There are three main methods for assessing the quality of machine translation: first, automatic evaluation methods represented by BLEU (bilingual evaluation understudy) (Sutskever et al. 2014; Jean et al. 2015); second, manual evaluation methods such as error classification, scoring, and ranking of machine-translated texts (Burchardt et al. 2017; Isabelle et al. 2017); and third, semi-automatic methods that combine automatic and manual evaluation (Bentivogli et al. 2016; Wu et al. 2016; Castilho et al. 2017b). Results vary across evaluation methods, but overall neural machine translation has made breakthrough progress and is currently the best-performing machine translation method in terms of accuracy and fluency (Toral & Sánchez-Cartagena 2017; Popović 2017), with the gains in fluency more significant than the gains in accuracy (Moorkens 2018; Van Brussel et al. 2018).
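
As a minimal illustration of the automatic route, BLEU scores n-gram overlap between machine output and one or more reference translations. The sketch below uses NLTK's implementation on a single invented sentence pair; toolkits such as sacreBLEU are more common for reporting corpus-level results:

```python
# Minimal BLEU illustration with NLTK; the score here is a toy sentence-level
# value and is not comparable to corpus-level results reported in papers.
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction

reference = [["the", "cat", "is", "on", "the", "mat"]]   # human reference(s)
hypothesis = ["the", "cat", "sat", "on", "the", "mat"]   # machine output

score = sentence_bleu(reference, hypothesis,
                      smoothing_function=SmoothingFunction().method1)
print(f"sentence-level BLEU: {score:.3f}")
```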

At the word level, neural machine translation better handles issues such as inflection, word-order adjustment, and word choice (Bentivogli et al. 2016; Toral & Sánchez-Cartagena 2017; Li Mei 2021). The improvement in word-order adjustment is the most significant, with machine output closer to the reference translation (Toral & Sánchez-Cartagena 2017), and verb order improving most of all (Popović 2017; Castilho et al. 2017b). Neural machine translation has also made great strides in morphology, handling subject-verb agreement better (Isabelle et al. 2017) and producing more fluent translations for morphologically rich languages (Klubička et al. 2017).

At the sentence level, neural machine translation can handle shifts between syntactic functions and sentence patterns, and its language is more natural and fluent (Isabelle et al. 2017; Xiao Weiqing & Gao Jiahui 2020; Li Mei 2021). The main reason is its "whole in, whole out" strategy, which overcomes the defect of statistical machine translation, where words (or phrases) served as the translation unit and the relationships between them were severed, thereby making sentences more readable (Qin Ying 2018).

At the discourse level, neural machine translation has made considerable progress in coherence and cohesion (Zhang et al. 2020), for example through additional context encoders (Wang et al. 2017; Voita et al. 2018; Ma et al. 2020), context-aware decoders (Maruf & Haffari 2018; Zhang et al. 2018), and extended translation units (Tiedemann & Scherrer 2017; Scherrer et al. 2019).

In addition, neural machine translation has also made breakthroughs in the processing of non-verbal information, which can add, remove, or transform punctuation marks according to the context (Avramidis et al. 2019; Xiao Weiqing and Gao Jiahui 2020), and some scholars have designed ASR models that can reduce punctuation errors and applied them to neural machine translation (Ding et al. 2021). Since the addition or deletion of punctuation marks involves more complex issues such as semantic analysis, post-editing is still indispensable.

4. Challenges in improving the quality of neural machine translation

Although the quality of neural machine translation has improved by leaps and bounds, with marked gains in accuracy and fluency, it still produces some puzzling translations. The challenges facing neural machine translation come from many directions and domains and cannot all be covered here; this section focuses on three prominent problems: the translation of rare and out-of-vocabulary words, the translation of long sentences, and omissions.

4.1 Translation of rare and out-of-vocabulary words

At the lexical level, the problem of rare words is particularly prominent for neural machine translation. Because the complexity of training grows sharply with vocabulary size, vocabularies are generally kept small, usually between 30,000 and 80,000 entries (Hou Qiang & Hou Ruili 2021: 56). With language evolving rapidly in the Internet era, neural machine translation inevitably encounters rare words, also known as out-of-vocabulary (OOV) words, which degrade translation quality.

Example (1) shows how four major online machine translation systems handled a sentence containing rare words (tested on 2022-03-08).

(1) Source text: Metaverse NFTs are unique digital items where the ownership and other information is coded into the token.

DeepL Translation: Metaverse NFTs are unique digital items whose ownership and other information is encoded into tokens.

Baidu Translation: Metaverse NFTs are unique digital items where ownership and other information are encoded into tokens.

Youdao Translation: Metaverse NFTs are unique digital items in which ownership and other information are encoded into tokens.

Google Translate: Metaverse NFTs are unique digital items where ownership and other information are encoded into tokens.

In example (1), "metaverse" is derived from "meta" + "verse" and refers to "a virtual world linked to and created by technological means, mapped onto and interacting with the real world, a digital living space with a new social system"(1); it attracted wide attention and discussion in 2021 and became one of the buzzwords of the year. "NFT" (non-fungible token) denotes a non-fungible token, a form of digital asset born of the metaverse. None of the four major online machine translation systems correctly rendered these two rare words: two systems copied the source terms directly, while the other two translated them only partially rather than accurately. The problem of rare words evidently remains prominent in neural machine translation systems.

Various methods have been tried to address this problem. Luong et al. (2015) use a positional annotation method, marking the positions of rare words in the training data so that, after translation, rare words can be output with positional information via dictionary lookup or by retrieving the corresponding translation. Gulcehre et al. (2016), observing that rare words such as named entities are often copied directly from the source in real translation, replace out-of-vocabulary words with the corresponding source-language words in the target output. Sennrich et al. (2016) compress the vocabulary into a limited set of sub-word units by decomposing words. Luong & Manning (2016) combine word-level and character-level neural translation models to handle out-of-vocabulary words in both the source and the target language.
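
The sub-word idea of Sennrich et al. (2016) can be illustrated with a short byte-pair-encoding (BPE) learner; the sketch below follows the merge procedure described in their paper on a small toy vocabulary, repeatedly merging the most frequent adjacent symbol pair so that rare words can later be segmented into known sub-word units:

```python
# Toy byte-pair-encoding (BPE) learner after Sennrich et al. (2016).
import re
from collections import Counter

def get_pair_counts(vocab):
    pairs = Counter()
    for word, freq in vocab.items():
        symbols = word.split()
        for a, b in zip(symbols, symbols[1:]):
            pairs[(a, b)] += freq
    return pairs

def merge_pair(pair, vocab):
    pattern = re.compile(r"(?<!\S)" + re.escape(" ".join(pair)) + r"(?!\S)")
    return {pattern.sub("".join(pair), word): freq for word, freq in vocab.items()}

# Words are represented as space-separated characters with an end-of-word mark.
vocab = {"l o w </w>": 5, "l o w e r </w>": 2,
         "n e w e s t </w>": 6, "w i d e s t </w>": 3}

for _ in range(10):                      # learn 10 merge operations
    pairs = get_pair_counts(vocab)
    if not pairs:
        break
    best = max(pairs, key=pairs.get)
    vocab = merge_pair(best, vocab)
    print("merged:", best)
```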

4.2 Long sentence translation

Long sentences have always been one of the difficulties in improving the quality of neural machine translation. Several studies (Bentivogli et al. 2016; Toral & Sánchez-Cartagena 2017; Koehn & Knowles 2017; Toral & Way 2018) have shown that once a sentence grows beyond a certain number of words, the quality of neural machine translation drops rapidly, whereas statistical machine translation performs more stably. Below that threshold, neural machine translation handles long sentences better than statistical machine translation, with significantly higher quality. For example, Toral & Sánchez-Cartagena (2017) found that phrase-based machine translation becomes more accurate than neural translation only when sentences exceed 40 words, while in Koehn & Knowles (2017) the figure is 60. Popović (2017) found no significant advantage for phrase-based machine translation on long sentences, and according to Van Brussel et al. (2018), neural machine translation still performs best on long sentences of 40 words or more.

The conclusions of these studies differ because of the different language pairs, translation directions, and text types involved. To illustrate the long-sentence problem in neural machine translation, we took a long sentence from a news corpus and tested it on four major online machine translation systems (tested on 2022-03-15), as shown in example (2):

(2) Source text: And there were many, many “Smiths” among them, including a historically famous fellow named John Smith, the leader of Jamestown Colony, the first English settlement in North America.

DeepL Translation: There are many, many "Smiths" among them, including the historical John Smith, the leader of the Jamestown colony and the first British settlement in North America.

Baidu Translation: There are many, many "Smiths" among them, including the famous John Smith, the leader of the Jamestown colony, which was the first British colony in North America.

Youdao Translation: There are many, many "Smiths" among them, including the famous John Smith, who was the leader of the first British colony in North America, the Jamestown colony.

Google Translate: There are many, many "Smiths" among them, including a historically famous John Smith, who was the leader of the colony of Jamestown, the first British colony in North America.

In the source text, the leader of Jamestown Colony is named John Smith, and "the first English settlement in North America" is an appositive modifying "Jamestown Colony". Among the four systems, DeepL's translation contains a reference error, treating both "the leader of Jamestown Colony" and "the first English settlement in North America" as modifiers of John Smith. Baidu's translation is referentially vague: readers may be unsure whether the reference points to the preceding "colony" or its "leader". Youdao and Google, however, provide correct translations, and Youdao even uses dashes to handle the parenthetical insertion in the source flexibly, echoing the observation in Section 3 that neural machine translation can add, remove, or transform punctuation marks according to the context.

At present there are two main ways of dealing with long-sentence translation: one is to split the long sentence into fragments and then combine the fragment translations; the other is to increase the expressive capacity of neural machine translation by adding external memory (Li et al. 2018). Although these methods have achieved some results, actual machine translation practice and the case above show that there is still much room for improvement in handling long sentences.
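
A rough sketch of the first strategy follows (our own simplification: `translate` below is a placeholder for any sentence-level MT system, and real systems split on syntactic parses rather than punctuation, since naive splitting can break cross-clause dependencies):

```python
# Sketch of the "split, translate fragments, recombine" strategy for long sentences.
import re

def translate(fragment: str) -> str:
    return f"<MT({fragment.strip()})>"   # stand-in for a real MT call

def translate_long_sentence(sentence: str, max_words: int = 20) -> str:
    if len(sentence.split()) <= max_words:
        return translate(sentence)
    # Naively split after commas/semicolons; real systems use syntactic parses.
    fragments = re.split(r"(?<=[,;])\s+", sentence)
    return " ".join(translate(f) for f in fragments)

print(translate_long_sentence(
    "And there were many, many Smiths among them, including John Smith, "
    "the leader of Jamestown Colony, the first English settlement in North America."))
```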

4.3 The low visibility of omission errors

The visibility of an omission error refers to "the expectancy of an omission error when reading only the translation, i.e., the degree to which the omission is evident when reading the translation from a monolingual perspective" (Van Brussel et al. 2018: 3802). Although neural machine translation produces highly fluent output, the accompanying problem of translations that read smoothly yet cannot be trusted is troubling: without checking against the source text, it is difficult to identify errors in machine translation (Castilho et al. 2017a; Castilho et al. 2017b; Moorkens 2018; Ustaszewski 2019). In earlier rule-based and phrase-based machine translation, omissions usually coincided with disfluencies: if a passage read awkwardly, one could guess that the machine had dropped something from the source. In neural machine translation the character of omission errors has changed, and readers may detect no omission at all when reading the translation on its own.

Example (3) illustrates the problem (tested on 2022-03-23):

(3) Source text: Folds of scarlet drapery shut in my view to the right hand; to the left were the clear panes of glass, protecting, but not separating me from the drear November day.

DeepL Translation: The folds of the scarlet curtains blocked my view; on the left is a transparent glass window that protects me, but does not separate me from the dreary November weather.

Comparing the source with the machine output, it is easy to see that "to the right hand" has been left untranslated. Reading the translation alone, however, "the scarlet curtain folds blocking my view" seems perfectly fine and reads smoothly; this is precisely the low visibility of omission errors. In neural machine translation, omissions pose great challenges for post-editing and for monolingual quality evaluation. Although the opacity of neural models makes it hard to determine exactly how omission errors arise, researchers have devised ways to mitigate them. Wu et al. (2016) add a coverage term to beam search, prompting the model to output the translation most likely to cover all of the input. Yang et al. (2019) use contrastive learning: correct reference translations serve as positive examples, negative examples are generated by automatically deleting words from them, and the model is trained to assign higher probability to the positives than to the negatives, which effectively reduces omission errors in neural machine translation.
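
The negative-example construction in Yang et al.'s contrastive approach can be sketched roughly as follows (our simplification; the deletion probability and the loss that would consume these pairs are assumptions made for illustration):

```python
# Sketch of building negative examples for contrastive training against
# omission errors: randomly delete words from a correct reference translation.
import random

def make_negative(reference_tokens, drop_prob=0.15, rng=random.Random(0)):
    kept = [tok for tok in reference_tokens if rng.random() > drop_prob]
    # Guarantee the negative example really differs from the positive one.
    return kept if len(kept) < len(reference_tokens) else reference_tokens[:-1]

reference = "the scarlet drapery shut in my view to the right hand".split()
negative = make_negative(reference)
print("positive:", " ".join(reference))
print("negative:", " ".join(negative))
# A contrastive loss would then push the model to assign the positive example
# a higher probability than the corrupted negative example.
```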

5. The path to improving the quality of neural machine translation

In recent years, machine translation evaluation campaigns such as WMT (Workshop on Machine Translation) and IWSLT (International Workshop on Spoken Language Translation) have been held annually, showcasing the latest achievements in machine translation and exploring ways to improve its quality. In an increasingly open and inclusive research environment, more and more open-source software is being used to further explore quality improvement. The problems facing neural machine translation urgently require the academic community to explore paths to quality improvement in a comprehensive way.

5.1 Integrating the advantages of different machine translation modes and methods

Although neural machine translation is the most advanced machine translation method, it still has shortcomings relative to traditional methods, such as heavy dependence on the size and quality of the corpus, a lack of guidance from grammatical rules, and high hardware requirements (Hou Qiang & Hou Ruili 2019). Koehn & Knowles (2017) show that neural machine translation outperforms statistical machine translation on high-resource language pairs but underperforms on low-resource pairs. Popović (2017) compared the translation problems arising in neural and phrase-based machine translation and found that their prominent problems are different and complementary. These studies suggest that combining the strengths of different machine translation modes yields better translation results.

In recent years, some scholars have begun to explore combining the advantages of different machine translation modes and methods to train neural models, with some success. Niehues et al. (2016) proposed feeding both the source text and a pre-translation produced by phrase-based machine translation into the neural system as input. Marie & Fujita (2018) combined features of statistical and neural machine translation in a reranking system that selects the best translation from the n-best lists of the two models. Zhang et al. (2020) reranked the n-best list of a neural system by means of phrase-based forced decoding. Although these methods have achieved some results, problems remain: the quality of pre-translations produced by statistical machine translation cannot be guaranteed, and it has not been verified whether such models improve translation quality for low-resource language pairs. Combining the advantages of different systems nonetheless remains a promising direction, and the academic community needs to keep exploring combinations that work better.
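
The shared idea behind these reranking approaches can be sketched as combining per-candidate scores from different systems and re-sorting the n-best list; all candidate strings, scores, and weights below are invented purely for illustration:

```python
# Illustrative n-best reranking: combine NMT and SMT-style feature scores
# with hand-set weights and pick the highest-scoring candidate.
n_best = [
    {"hyp": "candidate translation A", "nmt_logprob": -4.2, "smt_score": -6.0},
    {"hyp": "candidate translation B", "nmt_logprob": -4.5, "smt_score": -5.1},
    {"hyp": "candidate translation C", "nmt_logprob": -5.0, "smt_score": -4.8},
]

WEIGHTS = {"nmt_logprob": 0.7, "smt_score": 0.3}

def combined_score(candidate):
    # Weighted sum of the scores contributed by each system.
    return sum(WEIGHTS[k] * candidate[k] for k in WEIGHTS)

best = max(n_best, key=combined_score)
print("selected:", best["hyp"])
```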

5.2 Human-Computer Interaction

Human-computer interaction refers to "project collaboration, knowledge-base co-construction, and machine learning achieved through the interplay of human and machine capabilities" (Xiao Fenghua & Yin Baien 2019: 39). Neural machine translation may seem to involve only "inscrutable" internal computation, with users obtaining the desired translation through a few mouse clicks and without engaging their own minds, but in fact such operation is inseparable from human interaction with the system.

First of all, from the perspective of project collaboration, the manifestation of human-computer interaction is "based on big data, artificial intelligence and mobile Internet, combining machine intelligence and human intelligence, balancing the high efficiency of machine translation and the high quality of human translation" (Xiao Fenghua and Yin Baien 2019:38). In this mode, machine translation can save human time and cost, and human beings can improve the quality of machine translation output, so as to complement each other's advantages. For example, Cui Qiliang and Lei Xuefa (2016) proposed a human-computer interaction-based translation strategy from three aspects: human-assisted machine translation, machine-assisted human translation and self-learning of translation systems. Based on the concept of human-computer interaction, Ji Chunyuan et al. (2019) developed a networked intelligent translation system by constructing an expert semantic database and other methods, which successfully improved the reliability and intelligence of machine translation. Huang et al. (2021) developed an interactive machine translation software that can update the translation in real time according to the user's input, and intelligently recommend the translation by learning the user's translation habits and translation history.

Secondly, from the perspective of knowledge-base co-construction, human-computer interaction takes the form of translators providing high-quality corpora for neural machine translation, supplying valuable reference translations for quality evaluation, and offering constructive evaluations of machine-translated output. Way (2013) points out that this is the age of machine translation, and that machine translation can only grow if its developers work closely with human translators. At present, neural machine translation faces the problem of data sparseness, especially the shortage of low-resource corpora, which requires translators to provide more high-quality parallel corpora in vertical domains for training, so that the machine can carry out comprehensive deep learning of language phenomena and translation patterns.

Finally, from the perspective of machine learning, human-computer interaction mainly takes the form of integrating prior knowledge into the modeling process and using prior knowledge to increase the interpretability of machine translation. Methods for integrating prior knowledge include fusing monolingual corpora, bilingual dictionaries, and linguistic knowledge (Li et al. 2018: 2738). On the one hand, Zhang et al. (2017) designed a framework based on posterior regularization for integrating prior knowledge into neural machine translation, and Niehues & Cho (2017) integrated data labeled with linguistic features into translation models through multi-task learning; such methods of fusing prior knowledge improve translation quality and optimize the translation model. On the other hand, interpretability has long been a pain point in the development of neural machine translation, bearing on the discovery of unknown scientific knowledge, the reliability of machine translation systems, and the avoidance of algorithmic discrimination (Zhang et al. 2021: 727-728). Shi et al. (2016) showed that some syntactic information can be extracted from the internal representations of string-based neural machine translation, and Bau et al. (2018) used unsupervised methods to discover the linguistic information contained in individual neurons. In the future we may hope to open the "black box" of neural machine translation and explain its translation process through prior knowledge, especially linguistic knowledge, so as to refine translation models and improve translation quality.

6. Prospects

As the latest progress in machine translation, neural network machine translation has achieved remarkable results, but there is still a lot of room for development. As artificial intelligence penetrates into all aspects of our lives, the demand for machine translation is increasing, and the use scenarios of machine translation will become more and more extensive. In this context, it is particularly important to improve the quality of neural network machine translation, which requires the exchange and cooperation of experts in various fields to work together towards the ultimate goal of artificial intelligence. As Koehn (2020:13) argues, the "curse" to be broken by machine translation research is not to achieve a perfect translation, but to reduce the error rate. Our goal is not to replace human translation with machine translation, but to use it to facilitate human translation activities to the greatest extent, so that machine translation can become a productive force and contribute to national economic development and social progress.

(References omitted)

(This article was first published in Foreign Language Teaching, Issue 1, 2023)
