Thesis (Selection of subject) (version: 368)
Thesis details
Mutual Relation of Machine Translation and Quality Estimation
Thesis title in Czech: Vzájemný vztah strojového překladu a odhadu kvality
Thesis title in English: Mutual Relation of Machine Translation and Quality Estimation
Key words: strojový překlad|odhad kvality|strojové učení|hluboké učení
English key words: machine translation|quality estimation|machine learning|deep learning
Academic year of topic announcement: 2019/2020
Thesis type: diploma thesis
Thesis language: English
Department: Institute of Formal and Applied Linguistics (32-UFAL)
Supervisor: Mgr. Aleš Tamchyna, Ph.D.
Author: hidden - assigned and confirmed by the Study Dept.
Date of registration: 07.04.2020
Date of assignment: 14.04.2020
Confirmed by Study dept. on: 26.04.2022
Date and time of defence: 07.09.2022 09:00
Date of electronic submission: 21.07.2022
Date of submission of printed version: 25.07.2022
Date of proceeded defence: 07.09.2022
Opponents: Mgr. Martin Popel, Ph.D.
Advisors: doc. RNDr. Ondřej Bojar, Ph.D.
Guidelines
Machine translation quality estimation (QE) is a machine learning problem within the area of natural language processing. Given a source sentence and its machine translation, the goal of QE is to predict how good the MT output is. The prediction can be performed at the level of the whole sentence (typically, some type of anticipated edit distance needed to manually fix the candidate) or at the level of individual tokens (typically marking which words are “good” and which are “bad”).
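
To make the sentence-level setting concrete, the following minimal sketch (Python, purely illustrative and not part of the assignment) computes an HTER-like label: the word-level edit distance between the MT output and a human post-edit, normalized by the post-edit length. Token-level QE would instead tag each MT word as “good” or “bad”.

    def hter(mt_tokens, postedit_tokens):
        """Word-level Levenshtein distance between the MT output and its post-edit,
        normalized by the post-edit length (a common sentence-level QE label)."""
        m, n = len(mt_tokens), len(postedit_tokens)
        dist = [[0] * (n + 1) for _ in range(m + 1)]
        for i in range(m + 1):
            dist[i][0] = i
        for j in range(n + 1):
            dist[0][j] = j
        for i in range(1, m + 1):
            for j in range(1, n + 1):
                cost = 0 if mt_tokens[i - 1] == postedit_tokens[j - 1] else 1
                dist[i][j] = min(dist[i - 1][j] + 1,        # deletion
                                 dist[i][j - 1] + 1,        # insertion
                                 dist[i - 1][j - 1] + cost) # substitution
        # (Full HTER additionally allows block shifts; omitted here for brevity.)
        return dist[m][n] / max(n, 1)

    # One QE training instance: source, MT hypothesis, and the derived gold label.
    src = "Kočka sedí na rohožce .".split()
    mt = "The cat sit on the mat .".split()
    pe = "The cat sits on the mat .".split()
    print(hter(mt, pe))  # 1 substitution / 7 post-edit tokens ≈ 0.14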

QE is highly relevant in the translation/localization industry where it can help predict savings when using MT. Translators can also use it as a guide when deciding whether to post-edit an MT output or whether to translate the sentence from scratch. For end users, QE can warn about unreliable parts in MT output, e.g. when reading machine-translated web pages.

The goal of the thesis is to explore the relationship between sentence-level QE and MT, i.e. to empirically study the relative power of the two models. If the MT system is strong, can we expect relatively simple QE models to perform well? Or does the QE model need to be strictly more powerful (e.g. in the number of parameters) in order to judge the MT outputs?

The thesis will focus primarily on the design and implementation of experiments studying the relationship between existing MT and QE implementations. The exact settings will be refined in the early stages of the work. The core experiments will build upon existing large data sources for English and Czech, specifically the parallel corpus CzEng and its derived versions in which one side is produced automatically by a machine translation system. The student will train several QE systems on varying training corpus sizes and with varying numbers of model parameters, and will evaluate how accurately these QE systems predict translation quality for MT systems trained on similarly varying corpus sizes.
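
A rough sketch of the resulting experimental grid is given below (assumptions mine: the concrete corpus sizes, model sizes and toolkits will be fixed during the work, and train_mt, train_qe and evaluate_qe are hypothetical stubs standing in for the actual MT and QE implementations).

    import itertools

    MT_CORPUS_SIZES = [100_000, 1_000_000, 10_000_000]  # parallel sentence pairs for MT
    QE_CORPUS_SIZES = [10_000, 100_000, 1_000_000]      # (source, MT output, label) triples
    QE_MODEL_SIZES = ["small", "base", "large"]         # rough parameter budgets

    def train_mt(n_sentences):
        # Placeholder: train an MT system on n_sentences pairs drawn from CzEng.
        return f"mt-{n_sentences}"

    def train_qe(mt_system, n_triples, model_size):
        # Placeholder: train a QE model on triples derived from the MT system's outputs.
        return f"qe-{mt_system}-{n_triples}-{model_size}"

    def evaluate_qe(qe_system, mt_system, test_set):
        # Placeholder: score the QE model on the MT system's outputs for test_set,
        # e.g. correlation of predicted vs. gold sentence-level quality.
        return 0.0

    results = {}
    for mt_size in MT_CORPUS_SIZES:
        mt_system = train_mt(mt_size)
        for qe_size, qe_model in itertools.product(QE_CORPUS_SIZES, QE_MODEL_SIZES):
            qe_system = train_qe(mt_system, qe_size, qe_model)
            results[(mt_size, qe_size, qe_model)] = evaluate_qe(qe_system, mt_system, "newstest")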

The final evaluation will also test MT, QE and their relationship on test sets from different domains, including, e.g., transcripts of spoken language as available in IWSLT. In addition to the custom MT systems, the student will assess QE performance on a set of off-the-shelf English->Czech MT systems such as Lindat Translation or Google Translate, or on their archived outputs from past WMT evaluation campaigns.
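
For the sentence-level comparison, the natural summary statistic (assuming Pearson correlation is used, as in the WMT QE shared tasks; the snippet below is an illustrative sketch only) is the correlation between the QE model's predicted scores and the gold quality labels, computed separately for each MT system and test domain.

    from math import sqrt

    def pearson(predicted, gold):
        """Pearson correlation between predicted and gold sentence-level scores."""
        n = len(predicted)
        mp, mg = sum(predicted) / n, sum(gold) / n
        cov = sum((p - mp) * (g - mg) for p, g in zip(predicted, gold))
        sp = sqrt(sum((p - mp) ** 2 for p in predicted))
        sg = sqrt(sum((g - mg) ** 2 for g in gold))
        return cov / (sp * sg)

    # Hypothetical scores for one (MT system, test domain) pair.
    predicted = [0.10, 0.35, 0.80, 0.05]  # QE model predictions
    gold = [0.12, 0.40, 0.65, 0.00]       # gold HTER labels
    print(pearson(predicted, gold))       # closer to 1.0 = better sentence-level QE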

The thesis will be supervised in collaboration with Memsource (https://www.memsource.com/). If possible, the test sets used in the thesis could include non-public data available to Memsource.
References
Bojar, Ondřej, Ondřej Dušek, Tom Kocmi, Jindřich Libovický, Michal Novák, Martin Popel, Roman Sudarikov, and Dušan Variš. "CzEng 1.6: Enlarged Czech-English Parallel Corpus with Processing Tools Dockered." In Text, Speech, and Dialogue: 19th International Conference, TSD 2016, Lecture Notes in Computer Science, No. 9924, pp. 231-238. Springer International Publishing, 2016. ISBN 978-3-319-45509-9, ISSN 0302-9743.

Fonseca, Erick, Lisa Yankovskaya, André FT Martins, Mark Fishel, and Christian Federmann. "Findings of the WMT 2019 shared task on quality estimation." In Proceedings of the Fourth Conference on Machine Translation. 2019.

Kepler, Fabio, Jonay Trénous, Marcos Treviso, Miguel Vera, António Góis, M. Amin Farajian, António V. Lopes, and André FT Martins. "Unbabel's Participation in the WMT19 Translation Quality Estimation Shared Task." arXiv preprint arXiv:1907.10352 (2019).

Lindat Translation, online service. https://lindat.mff.cuni.cz/services/transformer/

Martins, André FT, Marcin Junczys-Dowmunt, Fabio N. Kepler, Ramón Astudillo, Chris Hokamp, and Roman Grundkiewicz. "Pushing the limits of translation quality estimation." Transactions of the Association for Computational Linguistics 5 (2017): 205-218.

Specia, Lucia, Marco Turchi, Nicola Cancedda, Marc Dymetman, and Nello Cristianini. "Estimating the sentence-level quality of machine translation systems." In 13th Conference of the European Association for Machine Translation, pp. 28-37. 2009.