Measures of Machine Translation Quality
| Field | Value |
|---|---|
| Title (Czech) | Měření kvality strojového překladu |
| Title (English) | Measures of Machine Translation Quality |
| Keywords (Czech) | strojový překlad, vyhodnocování kvality, automatické metriky, anotace |
| Keywords (English) | machine translation, evaluation, automatic metrics, annotation |
| Academic year of announcement | 2013/2014 |
| Thesis type | master's thesis |
| Thesis language | English |
| Department | Institute of Formal and Applied Linguistics (32-UFAL) |
| Supervisor | doc. RNDr. Ondřej Bojar, Ph.D. |
| Author | hidden |
| Date of registration | 11.04.2014 |
| Date of assignment | 02.05.2014 |
| Confirmed by the student affairs office | 12.05.2014 |
| Date and time of defence | 08.09.2014 09:00 |
| Date of electronic submission | 31.07.2014 |
| Date of printed submission | 31.07.2014 |
| Date of defence | 08.09.2014 |
| Opponents | doc. RNDr. Vladislav Kuboň, Ph.D. |
Guidelines for Elaboration
Methods of machine translation evaluation are necessary for measuring progress in the field, for selecting the best system for a given translation task, and for the day-to-day development of machine translation systems.
The aim of the thesis is to explore the area of machine translation evaluation, covering both manual and automatic methods. For manual methods, we seek a technique that is fast (and therefore inexpensive), requires little or no training of the annotators, and is reliable in the sense that annotators reach a sufficient level of agreement in their judgements. For automatic methods, we primarily demand a high correlation with human judgements. Automatic methods rely on one or more reference translations. Recently, methods on the boundary between manual and automatic evaluation have also emerged: substantial manual annotation is carried out in a preparatory phase (such as constructing many reference translations), and this dataset then serves in an automatic evaluation.

Following a survey of recent advances in the field, the thesis should carry out an experiment with one manual evaluation method (due to limited resources for annotation) and contrast its strengths and weaknesses with other manual methods. The design and development of an annotation interface for the method is an important part of the thesis. The selected method should be versatile in the sense that the obtained annotations can be reused for further evaluation of other systems or of other variants of the systems' output. To verify this, the thesis should explore the applicability of the annotations in the automatic tuning of an MT system. For automatic methods, a broad comparison of available techniques is desirable, empirically evaluating their correlation with human judgements.
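The two quantities the assignment centres on are inter-annotator agreement (for manual methods) and correlation with human judgements (for automatic metrics). Below is a minimal sketch of how both might be computed; the `cohens_kappa` and `pearson` helpers and all toy data are hypothetical illustrations, not part of the assignment or of any particular evaluation toolkit.

```python
# Sketch of two statistics mentioned in the assignment:
# Cohen's kappa (inter-annotator agreement, corrected for chance)
# and Pearson correlation (automatic metric vs. human judgement).
from collections import Counter
from math import sqrt


def cohens_kappa(labels_a, labels_b):
    """Agreement between two annotators, corrected for chance agreement."""
    assert len(labels_a) == len(labels_b)
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    # Chance agreement: both annotators pick the same label independently.
    expected = sum(freq_a[lab] * freq_b[lab] for lab in freq_a) / (n * n)
    # Note: undefined (division by zero) if expected agreement is 1.
    return (observed - expected) / (1 - expected)


def pearson(xs, ys):
    """Linear correlation between metric scores and human scores."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


if __name__ == "__main__":
    # Hypothetical quality labels from two annotators on six sentences.
    ann1 = ["good", "bad", "good", "good", "bad", "good"]
    ann2 = ["good", "bad", "bad", "good", "bad", "good"]
    print("kappa:", round(cohens_kappa(ann1, ann2), 3))   # 0.667

    # Hypothetical system-level scores: automatic metric vs. human judgement.
    metric = [0.31, 0.27, 0.35, 0.22]
    human = [0.62, 0.55, 0.70, 0.40]
    print("pearson r:", round(pearson(metric, human), 3))
```

Kappa corrects raw agreement for the agreement two annotators would reach by chance, which is why it is a common reliability check for exactly the kind of fast, low-training annotation the assignment calls for.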
Recommended Literature
Ondřej Bojar: Čeština a strojový překlad. ÚFAL, Praha, Czechia, ISBN 978-80-904571-4-0, 168 pp., 2012.

Ondřej Bojar, Matouš Macháček, Aleš Tamchyna, Daniel Zeman: Scratching the Surface of Possible Translations. In: Text, Speech and Dialogue: 16th International Conference, TSD 2013, Proceedings. Lecture Notes in Computer Science, Vol. 8082, Springer Verlag, Berlin / Heidelberg, ISBN 978-3-642-40584-6, ISSN 0302-9743, pp. 465-474, 2013.

Omar F. Zaidan, Chris Callison-Burch: Feasibility of Human-in-the-Loop Minimum Error Rate Training. In: Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing (EMNLP '09), Vol. 1. Association for Computational Linguistics, pp. 52-61, 2009.

Chris Callison-Burch, Cameron Fordyce, Philipp Koehn, Christof Monz, Josh Schroeder: Further Meta-Evaluation of Machine Translation. In: Proceedings of the Third Workshop on Statistical Machine Translation (StatMT '08). Association for Computational Linguistics, pp. 70-106, 2008.