The Improvement of Negative Sentences Translation in English-to-Korean Machine Translation

This paper describes an algorithm for translating English negative sentences into Korean in English-to-Korean machine translation (EKMT). The proposed algorithm is based on a comparative study of English and Korean negative sentences. Earlier translation systems could not render English negative sentences as accurate Korean equivalents. We establish a new algorithm for negative-sentence translation and evaluate it.


💡 Research Summary

The paper addresses a persistent weakness in English‑to‑Korean machine translation (EKMT): the accurate rendering of negative sentences. While English expresses negation through a variety of particles, auxiliary verbs, and adverbs placed before the main verb, Korean typically encodes negation by attaching negative endings (‑지 않다, ‑지 못하다) to the verb stem, by using the negative existential 없다, or by adding negative adverbs such as 전혀 or 절대. This structural mismatch has caused existing EKMT systems—both rule‑based and statistical—to produce translations that either lose the negation entirely or generate unnatural Korean sentences.

To remedy this, the authors first performed a large‑scale comparative analysis of English and Korean negative constructions. Using a bilingual corpus of news, technical documents, and conversational text, they extracted 500 representative negative sentences and classified them into twelve distinct types: simple negation (not, no), auxiliary‑verb negation (do not, cannot), negative adverbs (never, hardly), double negation, emphatic negation (by no means), negative pronouns (nothing, nobody), negative prepositional phrases (without, lack of), idiomatic negations (no doubt, not only … but also), weak negations (scarcely, barely), conditional negation (if not), negative objects (no reason), and complex combinations of the above. For each English pattern they identified the most natural Korean counterpart, taking into account the intensity of negation (e.g., “hardly” → “거의 … 않다”, “never” → “절대 … 않다”).

Building on this taxonomy, the paper proposes a three‑stage algorithm. The first stage, Negation Structure Identification, augments a standard Korean morphological analyzer with a dedicated list of English negative tokens and employs dependency parsing to locate the scope of each negation (which verb or auxiliary it modifies). The second stage, Negation Intensity Scoring, assigns a numeric strength (1–5) to each negative token based on lexical semantics and adjusts it with surrounding modifiers such as “almost” or “just”. The third stage, Transformation Rule Application, uses the intensity score together with a set of handcrafted mapping rules to select the appropriate Korean negative ending or adverb. For instance, a high‑intensity score triggers the use of “절대 … 않다”, a medium score yields “거의 … 않다”, and a low score results in the plain “… 않다”. Double negatives are handled by collapsing them into a single strong negative (“전혀 … 않다”) to avoid inadvertent affirmation. Idiomatic expressions are stored in an exception list and replaced with fixed Korean equivalents.

The algorithm is implemented as a plug‑in to an existing phrase‑based EKMT system, allowing it to operate in a hybrid fashion: when a negative structure is detected, the plug‑in overrides the default translation; otherwise, the baseline statistical model proceeds unchanged.

Evaluation involved three comparison systems: (1) a commercial EKMT engine, (2) a state‑of‑the‑art neural MT system, and (3) a legacy rule‑based negation module. The test set comprised 500 diverse negative sentences. Automatic metrics showed a BLEU increase of 2.8–3.5 points over the baselines, while human judges rated meaning accuracy at 84.3 % (versus 68.7 % for the best baseline) and naturalness at 81.5 %. Error analysis highlighted a dramatic reduction in complex‑negation errors—from a 40 % failure rate in the baselines to just 8 % with the proposed method. Remaining errors were largely due to rare idioms and context‑dependent emphatic forms, suggesting avenues for future improvement.

The authors discuss the limitations of a purely rule‑based approach, noting the need for periodic updates to handle newly emerging negative expressions. They propose integrating a neural meaning‑preservation component that could learn intensity scores from data, thereby reducing manual effort. Extending the framework to other language pairs with divergent negation strategies is also mentioned as a promising direction.

In conclusion, the study demonstrates that explicit negation structure detection combined with a quantitative intensity model and a comprehensive rule set can substantially improve English‑to‑Korean translation of negative sentences. The approach yields measurable gains in both automatic and human‑evaluated quality, especially for complex and emphatic negations, and provides a solid foundation for further research into nuanced cross‑lingual negation handling.