Shrnutí napříč jazyky
Thesis title in Czech: | Shrnutí napříč jazyky |
---|---|
Thesis title in English: | Cross-Language Summarization |
Key words: | shrnutí, CLS, XLS, NLP |
English key words: | Summarization, CLS, XLS, NLP |
Academic year of topic announcement: | 2024/2025 |
Thesis type: | diploma thesis |
Thesis language: | Czech |
Department: | Institute of Formal and Applied Linguistics (32-UFAL) |
Supervisor: | RNDr. Jiří Hana, Ph.D. |
Author: | hidden - assigned by the advisor |
Date of registration: | 14.05.2024 |
Date of assignment: | 16.05.2024 |
Guidelines |
The thesis should explore methods for cross-language summarization (CLS).
CLS is the task of summarizing a document written in one language in a different language. Typically, the task is solved by a pipeline that applies machine translation before or after summarization, but end-to-end methods also exist. As part of the thesis, a prototype system should be developed and evaluated. It should summarize documents written in a low-resource language, such as Maltese, in another language, for example English. |
References |
Wang, J. et al. (2022). A Survey on Cross-Lingual Summarization. Transactions of the Association for Computational Linguistics, 10, 1304–1323. doi: https://doi.org/10.1162/tacl_a_00520
Bhattacharjee, A. et al. (2021). CrossSum: Beyond English-Centric Cross-Lingual Summarization for 1,500+ Language Pairs. arXiv:2112.08804.
Huot, F. et al. (2023). μPLAN: Summarizing using a Content Plan as Cross-Lingual Bridge. arXiv:2305.14205.
Wan, X. et al. (2010). Cross-language document summarization based on machine translation quality prediction. In Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics (pp. 917–926).
Ladhak, F. et al. (2020). WikiLingua: A new benchmark dataset for cross-lingual abstractive summarization. arXiv:2010.03093. |