Thesis (Selection of subject)
Thesis details
Enriching Neural MT through Multi-Task Training
Thesis title in Czech: Obohacování neuronového strojového překladu technikou sdíleného trénování na více úlohách
Thesis title in English: Enriching Neural MT through Multi-Task Training
Key words: multi-task neuronový strojový překlad NMT Transformer němčina
English key words: multi-task neural machine translation NMT Transformer German
Academic year of topic announcement: 2017/2018
Thesis type: diploma thesis
Thesis language: English
Department: Institute of Formal and Applied Linguistics (32-UFAL)
Supervisor: doc. RNDr. Ondřej Bojar, Ph.D.
Author: Mgr. Dominik Macháček - assigned and confirmed by the Study Dept.
Date of registration: 18.04.2018
Date of assignment: 24.04.2018
Confirmed by Study dept. on: 09.05.2018
Date and time of defence: 11.09.2018 09:00
Date of electronic submission: 18.07.2018
Date of submission of printed version: 20.07.2018
Date of proceeded defence: 11.09.2018
Opponents: Mgr. Jindřich Helcl, Ph.D.
Guidelines
Multi-task training has shown promising results in various areas of deep learning, including neural machine translation (NMT). NMT has also already benefited from enriching the source or the target language with additional linguistic information (e.g. part-of-speech tags, CCG supertags and others) in the sequence-to-sequence model with attention.

The goal of the thesis is to experiment with the Transformer architecture for NMT and to employ various types of additional information about the source or target side in a multi-task setup, aiming at better translation quality. Specifically, the work will carry out a reasonably large set of experiments in which the NMT system is trained to perform not only the translation but also one or more additional tasks for the source or target sentence. For example, the system can be trained to translate and label the source with part-of-speech tags, or to translate and identify named entities in the source. The motivation for including these additional tasks is to improve the generalization capacity of the NMT model: if the model is capable of identifying named entities, it should also spot them during translation and translate them accordingly.

The thesis will explore primarily a "simplified" multi-task setup which does not require any changes to the network architecture. The multiple tasks are reflected only in the training data, e.g. by interleaving tokens with their annotations or by alternating segments for the tasks considered, as sketched below. More advanced techniques of multi-task training are also possible.
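To make the data-only setup concrete, the following is a minimal Python sketch of both schemes; it is not the thesis's actual pipeline, and the example sentence, tags and the <pos> task marker are illustrative assumptions.

def interleave_tokens(tokens, tags):
    # Interleave each source token with its annotation: w1 t1 w2 t2 ...
    assert len(tokens) == len(tags)
    out = []
    for tok, tag in zip(tokens, tags):
        out.extend([tok, tag])
    return " ".join(out)

def alternating_examples(src, tgt, tags, task_token="<pos>"):
    # Emit one ordinary translation pair and one tagging pair per sentence;
    # the auxiliary task is marked by a pseudo-token prepended to the source.
    yield (src, tgt)
    yield (task_token + " " + src, " ".join(tags))

if __name__ == "__main__":
    src_tokens = "Der Hund bellt".split()
    pos_tags = ["ART", "NN", "VVFIN"]
    print(interleave_tokens(src_tokens, pos_tags))
    # -> Der ART Hund NN bellt VVFIN
    for pair in alternating_examples("Der Hund bellt", "Pes štěká", pos_tags):
        print(pair)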

The experiments will be carried out on the German-Czech translation task and possibly other language pairs, too. The translation direction will be chosen depending on the available automatic annotation for each of the languages. The outputs will be evaluated primarily with automatic methods. For the final runs, a small manual evaluation is also desirable.
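For the automatic part of the evaluation, a minimal sketch using the sacrebleu Python package is given below; the file names are illustrative assumptions.

import sacrebleu

# Hypothetical files with one detokenized sentence per line.
with open("hypotheses.cs") as h, open("references.cs") as r:
    hyps = [line.strip() for line in h]
    refs = [line.strip() for line in r]

# corpus_bleu takes the hypotheses and a list of reference streams.
bleu = sacrebleu.corpus_bleu(hyps, [refs])
print("BLEU = %.2f" % bleu.score)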
References
Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Łukasz Kaiser, and Illia Polosukhin. Attention is All you Need. In I. Guyon, U. V. Luxburg, S. Bengio, H. Wallach, R. Fergus, S. Vishwanathan, and R. Garnett, editors, Advances in Neural Information Processing Systems 30, pages 6000–6010. Curran Associates, Inc., 2017.

Rico Sennrich and Barry Haddow. Linguistic Input Features Improve Neural Machine Translation. In Proceedings of the First Conference on Machine Translation, pages 83–91, Berlin, Germany, August 2016. Association for Computational Linguistics.

Jan Niehues and Eunah Cho. Exploiting Linguistic Resources for Neural Machine Translation Using Multi-task Learning. In Proceedings of the Second Conference on Machine Translation, pages 80–89, Copenhagen, Denmark, September 2017. Association for Computational Linguistics.

Orhan Firat, Kyunghyun Cho, and Yoshua Bengio. Multi-Way, Multilingual Neural Machine Translation with a Shared Attention Mechanism. In Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 866–875, San Diego, California, June 2016. Association for Computational Linguistics.