Non-Autoregressive Neural Machine Translation
| Title (Czech) | Neautoregresivní neuronový strojový překlad |
|---|---|
| Title (English) | Non-Autoregressive Neural Machine Translation |
| Keywords (Czech) | strojový překlad, hluboké učení, zpracování přirozených jazyků |
| Keywords (English) | machine translation, deep learning, natural language processing |
| Academic year of announcement | 2014/2015 |
| Thesis type | doctoral dissertation |
| Language | English |
| Department | Institute of Formal and Applied Linguistics (32-UFAL) |
| Supervisor | prof. RNDr. Jan Hajič, Dr. |
| Author | hidden - assigned and confirmed by the Student Office |
| Date of registration | 06.10.2014 |
| Date of assignment | 06.10.2014 |
| Date of confirmation by the Student Office | 06.10.2014 |
| Date and time of defense | 09.02.2022 14:00 |
| Date of electronic submission | 15.11.2021 |
| Date of printed submission | 16.11.2021 |
| Date of defense | 09.02.2022 |
| Opponents | Kevin Duh, Mgr. Martin Popel, Ph.D. |
References
- Bahdanau, D. – Cho, K. – Bengio, Y. Neural Machine Translation by Jointly Learning to Align and Translate. CoRR. 2014, abs/1409.0473. ISSN 2331-8422.
- Vaswani, A. – Shazeer, N. – Parmar, N. – Uszkoreit, J. – Jones, L. – Gomez, A. N. – Kaiser, Ł. – Polosukhin, I. Attention is all you need. In Advances in Neural Information Processing Systems 30, p. 6000–6010, Long Beach, CA, USA, December 2017. Curran Associates, Inc.
- Gu, J. – Bradbury, J. – Xiong, C. – Li, V. O. K. – Socher, R. Non-Autoregressive Neural Machine Translation. In 6th International Conference on Learning Representations, ICLR 2018, Vancouver, BC, Canada, April 2018. Available at: https://openreview.net/forum?id=B1l8BtlCb
- Lee, J. – Mansimov, E. – Cho, K. Deterministic Non-Autoregressive Neural Sequence Modeling by Iterative Refinement. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, p. 1173–1182, Brussels, Belgium, November 2018. Association for Computational Linguistics. Available at: http://www.aclweb.org/anthology/D18-1149
- Ghazvininejad, M. – Levy, O. – Liu, Y. – Zettlemoyer, L. Mask-Predict: Parallel Decoding of Conditional Masked Language Models. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), p. 6111–6120, Hong Kong, China, November 2019. Association for Computational Linguistics. doi: 10.18653/v1/D19-1633. Available at: https://www.aclweb.org/anthology/D19-1633
- Kaiser, L. – Bengio, S. – Roy, A. – Vaswani, A. – Parmar, N. – Uszkoreit, J. – Shazeer, N. Fast Decoding in Sequence Models Using Discrete Latent Variables. In Dy, J. – Krause, A. (Ed.) Proceedings of the 35th International Conference on Machine Learning, volume 80 of Proceedings of Machine Learning Research, p. 2390–2399, Stockholmsmässan, Stockholm, Sweden, 10–15 Jul 2018. PMLR. Available at: http://proceedings.mlr.press/v80/kaiser18a.html
- Saharia, C. – Chan, W. – Saxena, S. – Norouzi, M. Non-Autoregressive Machine Translation with Latent Alignments. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), p. 1098–1108, Online, November 2020. Association for Computational Linguistics. doi: 10.18653/v1/2020.emnlp-main.83. Available at: https://www.aclweb.org/anthology/2020.emnlp-main.83
Preliminary scope of work
In recent years, neural machine translation has become the de facto standard approach to machine translation. Using a neural network, the source sentence is processed into a hidden intermediate representation in a continuous vector space, from which the target sentence is generated word by word.
The neural network translation model is autoregressive, which means that the output word probability distributions are conditioned on the previously generated words. This property constrains the otherwise highly parallelizable computation to be sequential. Non-autoregressive translation models the output distributions as conditionally independent. This assumption allows the sentence generation algorithm to be parallelized, which brings significant speed-ups in decoding. However, the translation quality of these models is lower due to a higher modeling error. In this thesis, we bring together a number of techniques for improving the translation quality of non-autoregressive translation models while preserving their high decoding speed. To provide a fair comparison, we evaluate optimization methods that were invented for, and previously used only with, autoregressive translation in the context of non-autoregressive translation.
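The contrast between the two factorizations is the technical core of the topic: an autoregressive model decodes according to p(y|x) = ∏_t p(y_t | y_<t, x), while a non-autoregressive model assumes p(y|x) = ∏_t p(y_t | x). The following Python sketch illustrates what this means for the decoding loop. It is illustrative only and not code from the thesis: the `toy_logits` function, the `VOCAB` list, and the fixed-length interface of the non-autoregressive decoder are hypothetical stand-ins for a trained decoder.

```python
# Minimal sketch contrasting autoregressive and non-autoregressive
# decoding. "toy_logits" is a hypothetical stand-in for a trained
# decoder; it just derives arbitrary scores from a hash of its inputs.

import numpy as np

VOCAB = ["<eos>", "the", "cat", "sat", "down"]


def toy_logits(source, prefix, position):
    """Stand-in for one decoder step: a score for every vocabulary item."""
    seed = hash((tuple(source), tuple(prefix), position)) % (2**32)
    return np.random.default_rng(seed).normal(size=len(VOCAB))


def autoregressive_decode(source, max_len=10):
    # p(y|x) = prod_t p(y_t | y_<t, x): each step conditions on the
    # previous outputs, so the loop is inherently sequential.
    prefix = []
    for t in range(max_len):
        word = VOCAB[int(np.argmax(toy_logits(source, prefix, t)))]
        if word == "<eos>":  # the model itself decides when to stop
            break
        prefix.append(word)
    return prefix


def non_autoregressive_decode(source, length):
    # p(y|x) = prod_t p(y_t | x): positions are conditionally
    # independent, so all steps could run in parallel; the target
    # length must be supplied up front instead of emerging from <eos>.
    logits = [toy_logits(source, [], t) for t in range(length)]
    return [VOCAB[int(np.argmax(step))] for step in logits]


if __name__ == "__main__":
    src = ["die", "katze", "sass"]
    print(autoregressive_decode(src))         # words produced one by one
    print(non_autoregressive_decode(src, 4))  # all four positions at once
```

Because no position can signal the end of the sentence to the others, the non-autoregressive decoder needs the target length as an explicit input; this is one reason why target length prediction and iterative refinement (as in Lee et al., 2018, and Ghazvininejad et al., 2019, listed above) play a central role in this line of work.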