Neural Machine Translation on Scarce-Resource Condition: A case-study on Persian-English

Reading time: 6 minutes
...

๐Ÿ“ Original Info

  • Title: Neural Machine Translation on Scarce-Resource Condition: A case-study on Persian-English
  • ArXiv ID: 1701.01854
  • Date: 2017-01-10
  • Authors: Mohaddeseh Bastan, Shahram Khadivi, Mohammad Mehdi Homayounpour

๐Ÿ“ Abstract

Neural Machine Translation (NMT) is a new approach to Machine Translation (MT), and due to its success it has attracted the attention of many researchers in the field. In this paper, we study the NMT model on the Persian-English language pair to analyze the model and investigate its appropriateness for scarce-resource scenarios, the situation that exists for Persian-centered translation systems. We adjust the model for the Persian language and find the best parameters and hyperparameters for two tasks: translation and transliteration. We also apply some preprocessing to the Persian dataset, which yields an increase of about one BLEU point. In addition, we have modified the loss function to enhance the word alignment of the model. This new loss function yields a total improvement of 1.87 BLEU points in translation quality.


📄 Full Content

Neural Machine Translation on Scarce-Resource Condition: A case-study on Persian-English

* Shahram Khadivi contributed to this work while he was with Amirkabir University of Technology.

Mohaddeseh Bastan

Shahram Khadivi*

Mohammad Mehdi Homayounpour

Computer Engineering and Information Technology Dept., Amirkabir University of Technology, Tehran, Iran. Email: {m.bastan, khadivi, homayoun}@aut.ac.ir

Abstract— Neural Machine Translation (NMT) is a new approach to Machine Translation (MT), and due to its success it has attracted the attention of many researchers in the field. In this paper, we study the NMT model on the Persian-English language pair to analyze the model and investigate its appropriateness for scarce-resource scenarios, the situation that exists for Persian-centered translation systems. We adjust the model for the Persian language and find the best parameters and hyperparameters for two tasks: translation and transliteration. We also apply some preprocessing to the Persian dataset, which yields an increase of about one BLEU point. In addition, we have modified the loss function to enhance the word alignment of the model. This new loss function yields a total improvement of 1.87 BLEU points in translation quality.
Keywords: neural machine translation; cost function; alignment model; text preprocessing

I. INTRODUCTION

Neural networks have recently received great attention. They have been used in many applications such as speech recognition [1], image processing [2], and natural language processing [3], and have achieved remarkable results. Since the introduction of these networks and their considerable results in different applications, many researchers have adopted neural networks as a solution for problems in their own fields. MT, a subcategory of natural language processing, was first approached with neural networks by Castaño in 1997 [4]. These networks have since been applied to many different language pairs. In this paper, we propose a neural model for Persian translation for the first time. We use the TensorFlow MT model [5], released by Google in 2015, and improve the base model with a new feature obtained from the statistical model. The new model adds a term to the cost function which measures the difference between the alignment obtained from the neural model and that obtained from the statistical model. This cost is then used to improve both the accuracy and the convergence time of the NMT.
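The idea of adding an alignment-mismatch term to the training cost can be sketched as follows. This is a minimal illustration, not the paper's TensorFlow implementation: the function names, the squared-difference penalty, and the weight `lam` are assumptions chosen for clarity; the paper only specifies that the cost measures the difference between the neural and statistical alignments.

```python
import numpy as np

def cross_entropy(probs, target_ids):
    # Standard NMT loss: mean negative log-likelihood of the gold target tokens.
    # probs: (T, V) predicted distributions; target_ids: (T,) gold token ids.
    return -np.mean(np.log(probs[np.arange(len(target_ids)), target_ids]))

def alignment_penalty(attention, ref_alignment):
    # Mean squared difference between the model's soft attention matrix (T x S)
    # and a 0/1 alignment matrix produced by a statistical aligner (T x S).
    return np.mean((attention - ref_alignment) ** 2)

def combined_loss(probs, target_ids, attention, ref_alignment, lam=0.5):
    # Total cost = translation loss + weighted alignment-mismatch term.
    return cross_entropy(probs, target_ids) + lam * alignment_penalty(attention, ref_alignment)
```

When the neural attention matches the statistical alignment exactly, the penalty vanishes and the loss reduces to the usual cross-entropy; otherwise the extra term pushes the attention toward the statistical alignment during training.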
The paper is organized as follows. In Section II, Statistical Machine Translation (SMT) and NMT and the corresponding mathematics are introduced. Section III reviews the literature on NMT. In Section IV, our NMT model is presented. In Section V, the experiments and the improvements of the new model over the baselines are discussed. Finally, Section VI concludes the paper.

II. STATISTICAL AND NEURAL MACHINE TRANSLATION

MT is the automation of translation between human languages [6]. Two of the most successful approaches to machine translation are SMT and NMT, which are discussed in the following subsections.
A. Statistical Machine Translation

A common SMT model finds the target sentence e: y1, y2, …, yT from the source sentence f: x1, x2, …, xS by maximizing the following term [7]:

p(e|f) ∝ p(e) · p(f|e)    (1)

In this equation, p(e) is the language model, which helps the output to be natural and grammatical, and p(f|e) is the translation model, which ensures that e is normally interpreted as f and not as something else [8]. Most MT systems use a log-linear model instead of this pure form, in order to include more features in the final equation. The model then becomes [8]:

log p(e|f) = Σ_{m=1}^{M} λ_m h_m(e, f) − log Z    (2)

Here h_m denotes the m-th feature of the SMT system and λ_m its corresponding weight. The term Z is a normalization term which is independent of the weights. Fig. 1 shows the architecture of an SMT system; the model searches through different possibilities using its features as shown.

Alignment is one of the features of MT, and in this paper the same alignment as described in [10] is used for estimating the parameters of the SMT.
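The log-linear combination of features above can be made concrete with a small sketch. The feature values and weights below are hypothetical, invented for illustration; in a real SMT decoder the features would be log probabilities from the language model, translation model, and so on, with weights tuned on held-out data.

```python
def log_linear_score(features, weights):
    # log p(e|f) up to the normalization term -log Z:
    # a weighted sum of M feature functions h_m(e, f).
    return sum(w * h for w, h in zip(weights, features))

# Hypothetical feature values for two candidate translations:
# [log LM probability, log translation-model probability, word penalty]
candidates = {
    "hyp_a": [-4.2, -3.1, -5.0],
    "hyp_b": [-6.0, -2.5, -4.0],
}
weights = [1.0, 1.0, 0.1]

# Decoding picks the candidate with the highest log-linear score;
# since Z depends only on f, it can be ignored when comparing candidates.
best = max(candidates, key=lambda e: log_linear_score(candidates[e], weights))
```

Because the normalization term is the same for every candidate translation of a given source sentence, the decoder only needs to compare the weighted feature sums.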

Figure 1. Architecture of the translation approach based on the log-linear model [9]

B. Neural Machine Translation

Deep neural networks (DNNs) have shown impressive results in machine learning tasks. The success of these networks is mostly the result of their hierarchical structure. DNNs work like a processing pipeline in which each layer solves part of the problem and feeds its result into the next layer, with the last layer generating the output [11]. DNNs are powerful because of their ability to perform parallel computations over several steps [12]. Most NMT models consist of two parts, including an encoder which

…(Full text truncated)…
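The encoder-decoder structure introduced just before the truncation can be sketched minimally. This toy numpy version, with randomly initialized weights standing in for trained parameters and greedy decoding, is an assumption-laden illustration of the general architecture, not the paper's TensorFlow model (which would use trained RNN cells and attention).

```python
import numpy as np

rng = np.random.default_rng(0)
H, V = 8, 20  # hidden size and toy vocabulary size; illustrative only

# Randomly initialized parameters stand in for trained weights.
W_enc = rng.standard_normal((H, H + V)) * 0.1
W_dec = rng.standard_normal((H, H + V)) * 0.1
W_out = rng.standard_normal((V, H)) * 0.1

def one_hot(i, n):
    v = np.zeros(n)
    v[i] = 1.0
    return v

def encode(src_ids):
    # Encoder: fold the source token sequence into a fixed-size state vector.
    h = np.zeros(H)
    for i in src_ids:
        h = np.tanh(W_enc @ np.concatenate([h, one_hot(i, V)]))
    return h

def decode(h, max_len=5, bos=0):
    # Decoder: emit target tokens one at a time, conditioned on the
    # encoder state and the previously emitted token.
    out, prev = [], bos
    for _ in range(max_len):
        h = np.tanh(W_dec @ np.concatenate([h, one_hot(prev, V)]))
        logits = W_out @ h
        prev = int(np.argmax(logits))  # greedy decoding
        out.append(prev)
    return out

translation = decode(encode([3, 7, 12]))
```

The encoder compresses the source sentence into a single vector, and the decoder unrolls that vector into a target sequence; attention-based variants additionally let the decoder look back at per-token encoder states at every step.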


Reference

This content is AI-processed based on ArXiv data.
