Original Info
- Title: Neural Machine Translation on Scarce-Resource Condition: A case-study on Persian-English
- ArXiv ID: 1701.01854
- Date: 2017-01-10
- Authors: Mohaddeseh Bastan, Shahram Khadivi, Mohammad Mehdi Homayounpour
Abstract
Neural Machine Translation (NMT) is a new approach to Machine Translation (MT), and due to its success it has attracted the attention of many researchers in the field. In this paper, we study the NMT model on the Persian-English language pair to analyze the model and investigate its appropriateness for scarce-resource scenarios, the situation that exists for Persian-centered translation systems. We adjust the model for the Persian language and find the best parameters and hyperparameters for two tasks: translation and transliteration. We also apply some preprocessing to the Persian dataset, which yields an improvement of about one point in terms of BLEU score. We have also modified the loss function to enhance the word alignment of the model. This new loss function yields a total improvement of 1.87 BLEU points in translation quality.
Full Content
* Shahram Khadivi has contributed to this work when he was with Amirkabir University of Technology.
Neural Machine Translation on Scarce-Resource Condition: A case-study on Persian-English
Mohaddeseh Bastan
Shahram Khadivi*
Mohammad Mehdi Homayounpour
Computer Engineering and Information Technology Dept.
Amirkabir University of Technology, Tehran, Iran
Email: {m.bastan, khadivi, homayoun}@aut.ac.ir
Abstract: Neural Machine Translation (NMT) is a new approach to Machine Translation (MT), and due to its success it has attracted the attention of many researchers in the field. In this paper, we study the NMT model on the Persian-English language pair to analyze the model and investigate its appropriateness for scarce-resource scenarios, the situation that exists for Persian-centered translation systems. We adjust the model for the Persian language and find the best parameters and hyperparameters for two tasks: translation and transliteration. We also apply some preprocessing to the Persian dataset, which yields an improvement of about one point in terms of BLEU score. We have also modified the loss function to enhance the word alignment of the model. This new loss function yields a total improvement of 1.87 BLEU points in translation quality.
Keywords: neural machine translation; cost function; alignment model; text preprocessing
I. INTRODUCTION
Neural networks have recently received great attention. These networks have been used in many applications such as speech recognition [1], image processing [2], and natural language processing [3], and have achieved remarkable results. Since the introduction of these networks and their considerable results in different applications, researchers in many fields have been using neural networks as a solution to their problems. MT, which is a subcategory of natural language processing, was first approached with neural networks by Castaño in 1997 [4].
For machine translation, these networks have been used for many different language pairs. In this paper, we propose a neural model for Persian translation for the first time. We use the TensorFlow MT model [5], which was released by Google in 2015. We improve the base model with a new feature obtained from the statistical model: the new model adds a term to the cost function that measures the difference between the alignment obtained from the neural model and the one obtained from the statistical model. This cost is then used to improve both the accuracy and the convergence time of the NMT.
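The idea of such an alignment-aware cost can be sketched as follows. This is a minimal illustration, not the paper's actual implementation: the function name, the squared-error distance between the attention matrix and the statistical alignment, and the weight `lam` are all assumptions made here for clarity.

```python
import numpy as np

def nmt_loss_with_alignment(log_probs, targets, attention, ref_alignment, lam=0.5):
    """Cross-entropy loss plus a penalty on the distance between the model's
    attention matrix and a reference alignment from a statistical aligner.

    log_probs:     (T, V) log-probabilities over the target vocabulary
    targets:       (T,)   reference target word ids
    attention:     (T, S) attention weights (each row sums to 1)
    ref_alignment: (T, S) 0/1 alignment matrix from a statistical model
    lam:           weight of the alignment term (hypothetical value)
    """
    # Standard NMT objective: negative log-likelihood of the reference words.
    ce = -np.mean(log_probs[np.arange(len(targets)), targets])
    # Extra term: how far the neural attention is from the statistical alignment.
    align_penalty = np.mean((attention - ref_alignment) ** 2)
    return ce + lam * align_penalty
```

With this shape of objective, a model whose attention agrees with the statistical alignment pays no extra cost, while a disagreeing one is pushed toward the statistical alignment during training.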
The paper is organized as follows. In Section II, Statistical Machine Translation (SMT) and NMT and the corresponding mathematics are introduced. Section III reviews the NMT literature. In Section IV, our NMT model is presented. In Section V, the experiments and the improvements of the new model in comparison with the baselines are discussed. Finally, Section VI concludes the paper.
II. STATISTICAL AND NEURAL MACHINE TRANSLATION
MT is the automation of translation between human languages [6]. Two of the most successful models for machine translation are SMT and NMT, which are discussed in the following subsections.
A. Statistical Machine Translation
A common SMT model finds the target sentence e = y1, y2, ..., yT given the source sentence f = x1, x2, ..., xS by maximizing the following term [7]:

p(e|f) ∝ p(e) · p(f|e)    (1)

In this equation, p(e) is the language model, which helps the output to be natural and grammatical, and p(f|e) is the translation model, which ensures that e is normally interpreted as f, and not as something else [8].
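Equation (1) can be made concrete with a toy example. The candidate sentences and all probability values below are invented purely for illustration; the point is that the noisy-channel model ranks candidates by the product of language-model and translation-model probabilities, computed in log space to avoid underflow.

```python
import math

# Hypothetical scores for three candidate target sentences e, given one
# fixed source sentence f: language-model probability p(e) and
# translation-model probability p(f|e). All numbers are made up.
candidates = {
    "the house is small": {"p_e": 0.020, "p_f_given_e": 0.30},
    "house the small is": {"p_e": 0.0001, "p_f_given_e": 0.35},
    "the home is little": {"p_e": 0.010, "p_f_given_e": 0.10},
}

def noisy_channel_score(p_e, p_f_given_e):
    # log p(e) + log p(f|e), the log of the right-hand side of (1)
    return math.log(p_e) + math.log(p_f_given_e)

best = max(candidates, key=lambda e: noisy_channel_score(**candidates[e]))
```

Here the ungrammatical candidate has the highest translation-model score, but its tiny language-model probability makes the fluent candidate win overall, which is exactly the division of labor the two factors in (1) are meant to provide.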
Most MT systems use a log-linear model instead of the pure form, in order to include more features in the final equation. The model then becomes [8]:

log p(e|f) = Σ_{m=1}^{M} λ_m h_m(e, f) − log Z(f)    (2)

In this equation, h_m denotes the m-th feature of the SMT system and λ_m its corresponding weight. The term Z is a normalization term that is independent of the weights. Fig. 1 shows the architecture of an SMT system; the model searches through the different possibilities using its features, as shown.
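A small sketch of the log-linear scoring in (2), with invented feature values and weights: since log Z(f) is the same for every candidate e under a fixed source f, it can be dropped when the goal is only to rank candidates.

```python
def log_linear_score(features, weights):
    """Unnormalized log-linear score: sum over m of lambda_m * h_m(e, f).
    The normalizer log Z(f) is constant across candidates for a fixed
    source sentence f, so it is omitted for ranking purposes."""
    return sum(weights[m] * h for m, h in features.items())

# Hypothetical feature values h_m(e, f) (e.g. log LM score, log TM score,
# alignment score) for two candidate translations, and weights lambda_m.
weights = {"lm": 1.0, "tm": 0.8, "align": 0.5}
cand_a = {"lm": -4.2, "tm": -3.1, "align": -1.0}
cand_b = {"lm": -9.0, "tm": -2.8, "align": -1.2}

best = max([cand_a, cand_b], key=lambda f: log_linear_score(f, weights))
```

This is also where the log-linear form earns its keep over (1): new features such as an alignment model can be added simply by extending the feature dictionary and tuning one more weight.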
Alignment is one of the features of MT; in this paper, the same alignment as described in [10] is used for estimating the parameters of the SMT.
Figure 1. Architecture of Translation approach based on log-linear model [9]
B. Neural Machine Translation
Deep neural networks (DNNs) have shown impressive results in machine learning tasks. The success of these networks is mostly the result of their hierarchical structure. DNNs work like pipeline processing, in which each layer solves part of the problem and feeds its result into the next layer, and the last layer generates the output [11]. DNNs are powerful because of their ability to perform parallel computations over several steps [12].
Most NMT models consist of two parts: an encoder which
…(Full text truncated)…
This content is AI-processed based on ArXiv data.