Thesis (Selection of subject) (version: 368)
Thesis details
Cross-Language Summarization (Shrnutí napříč jazyky)
Thesis title in Czech: Shrnutí napříč jazyky
Thesis title in English: Cross-Language Summarization
Key words: shrnutí|CLS|XLS|NLP
English key words: Summarization|CLS|XLS|NLP
Academic year of topic announcement: 2024/2025
Thesis type: diploma thesis
Thesis language: Czech
Department: Institute of Formal and Applied Linguistics (32-UFAL)
Supervisor: RNDr. Jiří Hana, Ph.D.
Author: hidden - assigned by the advisor
Date of registration: 14.05.2024
Date of assignment: 16.05.2024
Guidelines
The thesis should explore methods for cross-language summarization (CLS).
CLS is the task of summarizing a document written in one language in a different language. Typically, this involves machine translation either before or after summarization (a pipeline approach), but there are also end-to-end methods that produce the target-language summary directly.
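
The two pipeline orders mentioned above (translation before or after summarization) can be illustrated with a minimal Python sketch. All names here (`Translator`, `Summarizer`, `lead_sentence`, `toy_translate`) are hypothetical placeholders and not part of the thesis assignment; the toy components only stand in for a real machine translation system and a real summarizer.

```python
from typing import Callable

# Hypothetical component types: any MT system or summarizer could be plugged in.
Translator = Callable[[str], str]
Summarizer = Callable[[str], str]


def translate_then_summarize(doc: str, translate: Translator, summarize: Summarizer) -> str:
    """Early translation: translate the document, then summarize in the target language."""
    return summarize(translate(doc))


def summarize_then_translate(doc: str, summarize: Summarizer, translate: Translator) -> str:
    """Late translation: summarize in the source language, then translate the summary."""
    return translate(summarize(doc))


# Toy stand-ins so the sketch runs without any model (assumptions, not real systems).
def lead_sentence(text: str) -> str:
    """Lead-1 baseline: keep only the first sentence."""
    first = text.split(". ")[0].strip()
    return first if first.endswith(".") else first + "."


def toy_translate(text: str) -> str:
    """Placeholder translator: tags the text instead of translating it."""
    return "[en] " + text


if __name__ == "__main__":
    doc = "First sentence. Second sentence."
    print(translate_then_summarize(doc, toy_translate, lead_sentence))
    print(summarize_then_translate(doc, lead_sentence, toy_translate))
```

With real components the two orders generally differ: translating first exposes the summarizer to translation errors across the whole document, while summarizing first requires a summarizer for the source language, which may be hard to obtain for a low-resource language.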

As part of the thesis, a prototype system should be developed and evaluated. It should summarize documents written in a low-resource language, such as Maltese, in another language, for example English.
References
Wang, J. et al. (2022). A Survey on Cross-Lingual Summarization. Transactions of the Association for Computational Linguistics, 10, 1304–1323. https://doi.org/10.1162/tacl_a_00520

Bhattacharjee, A. et al. (2021). CrossSum: Beyond English-Centric Cross-Lingual Summarization for 1,500+ Language Pairs. arXiv:2112.08804.

Huot, F. et al. (2023). μPLAN: Summarizing using a Content Plan as Cross-Lingual Bridge. arXiv:2305.14205.

Wan, X. et al. (2010). Cross-language document summarization based on machine translation quality prediction. In Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics (pp. 917–926).

Ladhak, F. et al. (2020). WikiLingua: A new benchmark dataset for cross-lingual abstractive summarization. arXiv:2010.03093.