Multi-Target Machine Translation
| Thesis title in Czech: | Strojový překlad do mnoha jazyků současně |
|---|---|
| Thesis title in English: | Multi-Target Machine Translation |
| Key words: | Neural machine translation, Multi-target MT, linguistic relatedness |
| English key words: | Neural machine translation, Multi-target MT, linguistic relatedness |
| Academic year of topic announcement: | 2019/2020 |
| Thesis type: | diploma thesis |
| Thesis language: | English |
| Department: | Institute of Formal and Applied Linguistics (32-UFAL) |
| Supervisor: | doc. RNDr. Ondřej Bojar, Ph.D. |
| Author: | hidden - assigned and confirmed by the Study Dept. |
| Date of registration: | 19.09.2019 |
| Date of assignment: | 16.01.2020 |
| Confirmed by Study dept. on: | 24.01.2020 |
| Date and time of defence: | 14.09.2020 09:00 |
| Date of electronic submission: | 31.07.2020 |
| Date of submission of printed version: | 30.07.2020 |
| Date of proceeded defence: | 14.09.2020 |
| Opponents: | Ing. Tom Kocmi, Ph.D. |
Guidelines
In highly multilingual environments such as the European Union and its bodies (e.g. the European Parliament), there is a need to translate a given input into many target languages. Each target language can be served by a dedicated pairwise model, but this approach is very expensive both in model size (on disk as well as when loaded into GPU or CPU memory) and in overall training cost, because each model has to be trained independently.
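A back-of-the-envelope illustration of this cost: with one model per target language, storage and memory grow linearly in the number of targets, while a single shared multi-target model stays constant. The parameter count, precision, and language count in the sketch below are illustrative assumptions, not figures from the thesis.

```python
# Hypothetical sizing sketch: N dedicated pairwise models vs. one shared
# multi-target model. All numbers are illustrative assumptions.

PARAMS_PER_MODEL = 65_000_000   # roughly a "base" Transformer (Vaswani et al., 2017)
BYTES_PER_PARAM = 4             # float32 weights
NUM_TARGET_LANGUAGES = 24       # e.g. the official EU languages

def gib(num_params: int) -> float:
    """Convert a parameter count to GiB of float32 storage."""
    return num_params * BYTES_PER_PARAM / 2**30

pairwise_total = NUM_TARGET_LANGUAGES * PARAMS_PER_MODEL  # one model per target
shared_total = PARAMS_PER_MODEL                           # one model for all targets

print(f"{NUM_TARGET_LANGUAGES} pairwise models: {gib(pairwise_total):.2f} GiB")
print(f"one shared multi-target model: {gib(shared_total):.2f} GiB")
```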
Neural machine translation (NMT), the state-of-the-art approach to machine translation, offers an interesting possibility: training multilingual models, i.e. models that can handle multiple source or target languages. There is even evidence that for low-resource language pairs, this multilingual approach can improve translation quality by reusing knowledge from other languages in the model. Furthermore, GPU parallelization makes it possible to produce many target versions concurrently, which is desirable e.g. for live translation of transcribed speech. The goal of the thesis is to focus on multi-target translation, i.e. NMT models translating from one source language into many target languages. The thesis should empirically explore the trade-offs in translation quality, disk and memory size, and overall training cost across setups with fewer or more target languages in one model. A useful extension would be to study the linguistic relatedness of the languages in the mixed models and its impact on the gains or losses in translation quality of multi-target models. Translation quality will be evaluated by automatic measures, although a small sanity check with manual evaluation on a very limited set of languages is desirable.
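One common way to obtain such a one-to-many model, used e.g. in the massively multilingual systems referenced below, is to train a single model on the concatenated parallel data with a special token prepended to each source sentence that tells the decoder which target language to produce. A minimal sketch of that preprocessing step, with an illustrative (not prescribed) token format:

```python
# Minimal sketch of target-language tagging for multi-target NMT.
# The '<2xx>' token format is an illustrative convention, not a fixed standard.

def tag_source(source_sentence: str, target_lang: str) -> str:
    """Prepend a target-language tag so one shared model knows which
    language to decode into, e.g. '<2de> Hello world .'"""
    return f"<2{target_lang}> {source_sentence}"

# The same English source tagged for several target languages; at inference
# time these tagged inputs can be decoded in one GPU batch, yielding all
# target versions concurrently.
source = "The committee adopted the proposal ."
for lang in ["de", "cs", "fr"]:
    print(tag_source(source, lang))
```

The rest of the pipeline (a shared subword vocabulary over all languages, a standard Transformer, batched decoding) stays unchanged, which is what keeps the multi-target model's size independent of the number of target languages.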
References
Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N Gomez, Łukasz Kaiser, and Illia Polosukhin. Attention is All you Need. In I. Guyon, U. V. Luxburg, S. Bengio, H. Wallach, R. Fergus, S. Vishwanathan, and R. Garnett, editors, Advances in Neural Information Processing Systems 30, pages 6000–6010. Curran Associates, Inc., 2017.
Orhan Firat, Kyunghyun Cho, and Yoshua Bengio. Multi-Way, Multilingual Neural Machine Translation with a Shared Attention Mechanism. In Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 866–875, San Diego, California, June 2016. Association for Computational Linguistics.
Exploring Massively Multilingual, Massive Neural Machine Translation. Google AI Blog, 2019. https://ai.googleblog.com/2019/10/exploring-massively-multilingual.html
Ian Goodfellow, Yoshua Bengio, and Aaron Courville. Deep Learning. MIT Press, Cambridge, MA, USA, 2016.
Martin Popel and Ondřej Bojar. Training Tips for the Transformer Model. The Prague Bulletin of Mathematical Linguistics, 110:43–70, 2018. ISSN 0032-6585.
Michał Ziemski, Marcin Junczys-Dowmunt, and Bruno Pouliquen. The United Nations Parallel Corpus. In Language Resources and Evaluation (LREC'16), Portorož, Slovenia, May 2016.