Tackling misinformation in Serbo-Croatian: corpora and experiments

Thesis title in Czech:	Řešení dezinformací v srbochorvatštině: korpusy a experimenty
Thesis title in English:	Tackling misinformation in Serbo-Croatian: corpora and experiments
Keywords (Czech):	NLP|dezinformace|srbochorvatština|korpus|klasifikace
Keywords (English):	NLP|misinformation|fake news|Serbo-Croatian|corpora|classification
Academic year of topic announcement:	2022/2023
Thesis type:	master's thesis (diploma thesis)
Thesis language:
Department:	Institute of Formal and Applied Linguistics (32-UFAL)
Supervisor:	RNDr. Jiří Hana, Ph.D.
Author:	hidden - assigned and confirmed by the Study Department
Date of registration:	29.03.2023
Date of assignment:	30.03.2023
Date of confirmation by the Study Department:	08.03.2024

Guidelines

Explore the area of misinformation in the news written in Serbo-Croatian (Serbian, Croatian, Bosnian, Montenegrin; closely related South Slavic languages).

- Create a news corpus with metadata describing whether each article is trustworthy and, if not, in which respect
- Evaluate the possibilities of automatic processing, for example:
  - Classification of news articles: binary (truthful vs. misinformation) or multilabel (fake news, pseudoscience, conspiracy theory, etc.)
  - Claim detection: extraction of claims from articles
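
As a rough illustration of the corpus metadata and the two classification setups listed above, one possible record layout is sketched below. This is only a minimal sketch under stated assumptions: the class name `Article`, its fields, and the `LABELS` tuple are hypothetical and are not prescribed by the assignment; the label names are simply the examples given in the list.

```python
from dataclasses import dataclass, field

# Hypothetical label inventory, copied from the examples in the guidelines.
LABELS = ("fake news", "pseudoscience", "conspiracy theory")


@dataclass
class Article:
    """One corpus record; every field name here is an illustrative assumption."""
    url: str
    text: str
    # Misinformation categories assigned by an annotator.
    # An empty set is read as "trustworthy" in this sketch.
    labels: set[str] = field(default_factory=set)

    def binary_target(self) -> int:
        """Binary setup: 1 if any misinformation label is present, else 0."""
        return int(bool(self.labels))

    def multilabel_target(self) -> list[int]:
        """Multilabel setup: multi-hot vector aligned with LABELS."""
        return [int(name in self.labels) for name in LABELS]


if __name__ == "__main__":
    article = Article(
        url="https://example.org/a1",  # placeholder URL
        text="...",
        labels={"pseudoscience"},
    )
    print(article.binary_target())      # 1
    print(article.multilabel_target())  # [0, 1, 0]
```

Keeping the labels as a set and deriving both targets from it means the same annotation can serve the binary and the multilabel experiments without duplicating metadata.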

References

- Max Glockner, Yufang Hou, and Iryna Gurevych. 2022. Missing Counter-Evidence Renders NLP Fact-Checking Unrealistic for Misinformation. In Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, pages 5916–5936, Abu Dhabi, United Arab Emirates. Association for Computational Linguistics.
- Zhijiang Guo, Michael Schlichtkrull, and Andreas Vlachos. 2022. A Survey on Automated Fact-Checking. Transactions of the Association for Computational Linguistics, 10:178–206.
- Isabelle Augenstein. 2021. Towards Explainable Fact Checking. ArXiv, abs/2108.10274.
- Nikola Ljubešić and Davor Lauc. 2021. BERTić - The Transformer Language Model for Bosnian, Croatian, Montenegrin and Serbian. In Proceedings of the 8th Workshop on Balto-Slavic Natural Language Processing, pages 37–42, Kyiv, Ukraine. Association for Computational Linguistics.
- James Thorne, Andreas Vlachos, Christos Christodoulopoulos, and Arpit Mittal. 2018. FEVER: a Large-scale Dataset for Fact Extraction and VERification. In Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pages 809–819, New Orleans, Louisiana. Association for Computational Linguistics.