Thesis (Selection of subject) (version: 368)
Thesis details
Thesis title in Czech: Mezijazykový přenos znalostí v úloze odpovídání na otázky
Thesis title in English: Crosslingual Transfer in Question Answering
Key words: odpovídání na otázky, transfer znalostí, SQuAD
English key words: question answering, crosslingual transfer, SQuAD
Academic year of topic announcement: 2019/2020
Thesis type: diploma thesis
Thesis language: Czech
Department: Institute of Formal and Applied Linguistics (32-UFAL)
Supervisor: RNDr. Milan Straka, Ph.D.
Author: hidden - assigned and confirmed by the Study Dept.
Date of registration: 03.12.2019
Date of assignment: 03.12.2019
Confirmed by Study dept. on: 10.12.2019
Date and time of defence: 03.02.2020 09:00
Date of electronic submission: 09.01.2020
Date of submission of printed version: 06.01.2020
Date of proceeded defence: 03.02.2020
Opponents: Mgr. Rudolf Rosa, Ph.D.
 
 
 
Guidelines
Question answering is a long-studied task, with dozens of datasets available for English. However, resources for other languages are much scarcer.

The goal of this thesis is to devise a method for training a question answering system for Czech, based on the well-known SQuAD question answering dataset. Apart from simple translation-based baselines, a suitable crosslingual transfer method (building, for example, on bilingual word embeddings or on multilingual BERT pretraining) should be devised and evaluated.
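As a minimal illustration of the bilingual-word-embedding route mentioned above, a shared crosslingual space can be obtained by learning an orthogonal mapping between two monolingual embedding matrices (the supervised Procrustes solution; the referenced Artetxe et al. method is an unsupervised, self-learning extension of this idea). The data below is purely synthetic and all names are illustrative, not part of the thesis assignment:

```python
import numpy as np

def procrustes_map(X, Y):
    # Orthogonal W minimizing ||X W - Y||_F, where row i of X (source
    # language) and row i of Y (target language) are a seed-dictionary
    # translation pair. Closed form via SVD of X^T Y.
    U, _, Vt = np.linalg.svd(X.T @ Y)
    return U @ Vt

# Toy setup: pretend the target embeddings are an unknown rotation
# of the source embeddings, then try to recover that rotation.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 4))          # 100 "source-language" vectors
Q, _ = np.linalg.qr(rng.normal(size=(4, 4)))  # hidden true rotation
Y = X @ Q                              # "target-language" vectors

W = procrustes_map(X, Y)
print(np.allclose(X @ W, Y))           # the mapping recovers the rotation
```

Once such a `W` is learned from a small seed dictionary, a QA model trained on English SQuAD embeddings can, in principle, be applied to Czech vectors mapped through `W` into the shared space.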
References
- Pranav Rajpurkar, Jian Zhang, Konstantin Lopyrev, Percy Liang. SQuAD: 100,000+ Questions for Machine Comprehension of Text, https://arxiv.org/abs/1606.05250

- Minjoon Seo, Aniruddha Kembhavi, Ali Farhadi, Hannaneh Hajishirzi. Bidirectional Attention Flow for Machine Comprehension, https://arxiv.org/abs/1611.01603

- Mikel Artetxe, Gorka Labaka, Eneko Agirre. A robust self-learning method for fully unsupervised cross-lingual mappings of word embeddings, https://arxiv.org/abs/1805.06297

- Jacob Devlin, Ming-Wei Chang, Kenton Lee, Kristina Toutanova. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding, https://arxiv.org/abs/1810.04805
 
Charles University | Information system of Charles University | http://www.cuni.cz/UKEN-329.html