Indonesian-English Neural Machine Translation
Thesis title in Czech: Indonésko-anglický neuronový strojový překlad
Thesis title in English: Indonesian-English Neural Machine Translation
Key words: strojový překlad, hluboké neuronové sítě, Transformer, indonéština
English key words: machine translation, deep neural networks, Transformer, Indonesian
Department: Institute of Formal and Applied Linguistics (32-UFAL)
Supervisor: Mgr. Martin Popel, Ph.D.
The current state of the art in machine translation is the Transformer architecture for neural machine translation. However, most research focuses on a small number of languages (English, German, French, Czech) for which enough parallel training data is available. The goal of this thesis is to apply neural machine translation (Transformer) to the Indonesian-English language pair, focusing on two domains: translation of TED talks and movie subtitles. The first step will be a review of the available parallel and monolingual training data, as well as of related work on English-Indonesian translation. After baseline Transformer systems have been built and evaluated for both directions, one direction (probably Indonesian-to-English) will be chosen for further improvement using techniques such as backtranslation (Sennrich et al., 2016) and domain adaptation (Chu and Wang, 2018).
