Mezijazykový přenos znalostí v úloze odpovídání na otázky
| Thesis title in Czech: | Mezijazykový přenos znalostí v úloze odpovídání na otázky |
|---|---|
| Thesis title in English: | Crosslingual Transfer in Question Answering |
| Key words: | odpovídání na otázky, transfer znalostí, SQuAD |
| English key words: | question answering, crosslingual transfer, SQuAD |
| Academic year of topic announcement: | 2019/2020 |
| Thesis type: | diploma thesis |
| Thesis language: | Czech |
| Department: | Institute of Formal and Applied Linguistics (32-UFAL) |
| Supervisor: | RNDr. Milan Straka, Ph.D. |
| Author: | hidden - assigned and confirmed by the Study Dept. |
| Date of registration: | 03.12.2019 |
| Date of assignment: | 03.12.2019 |
| Confirmed by Study dept. on: | 10.12.2019 |
| Date and time of defence: | 03.02.2020 09:00 |
| Date of electronic submission: | 09.01.2020 |
| Date of submission of printed version: | 06.01.2020 |
| Date of defence held: | 03.02.2020 |
| Opponents: | Mgr. Rudolf Rosa, Ph.D. |
Guidelines
Question answering is a long-studied task, with dozens of datasets available for English. Resources for other languages, however, are much scarcer.
The goal of this thesis is to devise a method for training a question answering system for Czech, based on the well-known SQuAD question answering dataset. Apart from simple translation-based baselines, a suitable crosslingual transfer method (building, for example, on bilingual word embeddings or on multilingual BERT pretraining) should be devised and evaluated.
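To make the multilingual-BERT route concrete, below is a minimal sketch using the Hugging Face `transformers` library, which is not part of the assignment itself: a multilingual BERT model is fine-tuned on English SQuAD and then applied directly to Czech text. The checkpoint name is a placeholder; in practice the model would first be fine-tuned on SQuAD before any Czech input is seen.

```python
# Minimal sketch of the zero-shot crosslingual baseline: multilingual
# BERT fine-tuned on English SQuAD, applied to Czech without any Czech
# training data. Assumes the Hugging Face `transformers` library; the
# checkpoint name is a placeholder for an mBERT model that has already
# been fine-tuned on SQuAD.
from transformers import pipeline

qa = pipeline(
    "question-answering",
    model="bert-base-multilingual-cased",  # placeholder: fine-tune on SQuAD first
)

# Czech question/context pair; the transfer relies only on the shared
# multilingual representation space learned during pretraining.
result = qa(
    question="Kdo napsal Babičku?",
    context="Babička je román, který napsala Božena Němcová v roce 1855.",
)
print(result["answer"], result["score"])
```

The translation-based baselines mentioned in the guidelines would instead machine-translate SQuAD into Czech and train a monolingual model on the translated data; the sketch above avoids translation entirely.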
References
- Pranav Rajpurkar, Jian Zhang, Konstantin Lopyrev, Percy Liang. SQuAD: 100,000+ Questions for Machine Comprehension of Text, https://arxiv.org/abs/1606.05250
- Minjoon Seo, Aniruddha Kembhavi, Ali Farhadi, Hannaneh Hajishirzi. Bidirectional Attention Flow for Machine Comprehension, https://arxiv.org/abs/1611.01603
- Mikel Artetxe, Gorka Labaka, Eneko Agirre. A robust self-learning method for fully unsupervised cross-lingual mappings of word embeddings, https://arxiv.org/abs/1805.06297
- Jacob Devlin, Ming-Wei Chang, Kenton Lee, Kristina Toutanova. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding, https://arxiv.org/abs/1810.04805