Thesis (Selection of subject) (version: 385)
Thesis details
Indonesian-English Neural Machine Translation
Thesis title in Czech: Indonésko-anglický neuronový strojový překlad
Thesis title in English: Indonesian-English Neural Machine Translation
Key words: strojový překlad, hluboké neuronové sítě, Transformer, indonéština
English key words: machine translation, deep neural networks, Transformer, Indonesian
Academic year of topic announcement: 2018/2019
Thesis type: diploma thesis
Thesis language: English
Department: Institute of Formal and Applied Linguistics (32-UFAL)
Supervisor: Mgr. Martin Popel, Ph.D.
Author: hidden - assigned and confirmed by the Study Dept.
Date of registration: 14.01.2019
Date of assignment: 21.01.2019
Confirmed by Study dept. on: 25.04.2019
Date and time of defence: 09.09.2019 09:00
Date of electronic submission: 26.07.2019
Date of submission of printed version: 26.07.2019
Date of completed defence: 09.09.2019
Opponents: Mgr. Michal Novák, Ph.D.
 
 
 
Guidelines
The current state of the art in machine translation is the Transformer architecture for neural machine translation. However, most research focuses on only a limited number of languages (e.g. English, German, French, Czech) for which enough parallel training data is available. The goal of this thesis is to apply neural machine translation (Transformer) to the Indonesian-English language pair, focusing on two domains: translation of TED talks and of movie subtitles. The first step will be a review of the available parallel and monolingual training data, as well as of related work on English-Indonesian translation. After baseline Transformer systems have been built and evaluated for both directions, one direction (probably Indonesian-to-English) will be chosen for further improvement using techniques such as backtranslation (Sennrich et al., 2016) and domain adaptation (Chu and Wang, 2018).
References
Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Łukasz Kaiser, and Illia Polosukhin. Attention Is All You Need. In I. Guyon, U. V. Luxburg, S. Bengio, H. Wallach, R. Fergus, S. Vishwanathan, and R. Garnett, editors, Advances in Neural Information Processing Systems 30, pages 6000–6010. Curran Associates, Inc., 2017. URL http://papers.nips.cc/paper/7181-attention-is-all-you-need.pdf.

Martin Popel and Ondřej Bojar. Training Tips for the Transformer Model. The Prague Bulletin of Mathematical Linguistics, No. 110, 2018, pp. 43–70.

Dzmitry Bahdanau, Kyunghyun Cho, and Yoshua Bengio. Neural Machine Translation by Jointly Learning to Align and Translate. arXiv:1409.0473 (2014).

Rico Sennrich, Barry Haddow, and Alexandra Birch. Improving Neural Machine Translation Models with Monolingual Data. ACL 2016.


Chenhui Chu and Rui Wang. A Survey of Domain Adaptation for Neural Machine Translation. arXiv:1806.00258 (2018).
 
Charles University | Information system of Charles University | http://www.cuni.cz/UKEN-329.html