Thesis (Selection of subject) (version: 368)
Thesis details
Self-Supervised Summarization via Reinforcement Learning
Thesis title in Czech: Automatická sumarizace z neanotovaných dat pomocí zpětnovazebního učení
Thesis title in English: Self-Supervised Summarization via Reinforcement Learning
Key words: sumarizace|zpětnovazební učení|jazykový model|učení s vlastním dohledem
English key words: summarization|reinforcement learning|language model|self-supervision
Academic year of topic announcement: 2023/2024
Thesis type: diploma thesis
Thesis language: English
Department: Institute of Formal and Applied Linguistics (32-UFAL)
Supervisor: doc. RNDr. Ondřej Bojar, Ph.D.
Author: hidden - assigned and confirmed by the Study Dept.
Date of registration: 21.12.2023
Date of assignment: 21.12.2023
Confirmed by Study dept. on: 21.12.2023
Date and time of defence: 10.06.2024 09:00
Date of electronic submission: 02.05.2024
Opponents: RNDr. Milan Straka, Ph.D.
Advisors: Mgr. Aleš Tamchyna, Ph.D.
Guidelines
The goal of the master thesis is to explore the approach of reinforcement learning (RL) for automatic text summarization.
Text summarization datasets (large collections of pairs of long texts and their abridged counterparts) are expensive to create and thus often consist of pre-existing data pairs, such as paper-abstract or news article-highlights, which differ semantically from true summaries.
The thesis will design, implement and experiment with a learning approach that reduces the need for supervised training data through self-supervision: one part of the model (the "summarizer") will learn to summarize, while another part (the "predictor") will assess the usefulness of each token for this summarization, thus providing a training signal for the first part. Notably, this signal is not a direct indication of how to generate the summary, so it needs to be processed in the framework of reinforcement learning (RL). An inherent part of the exploration is how to best combine this RL signal with the standard supervised summarization objective.
As a starting point, the predictor will be trained once and fixed throughout the training of the summarizer. As an extension, the thesis may explore the option of joint or iterative training.
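For illustration, the combination of the RL signal with the supervised objective could be sketched roughly as follows, using PyTorch (which the references below cite). This is a minimal sketch only, not the assignment's prescribed method: the function `combined_loss`, the scalar per-sequence `reward` produced by the predictor, and the mixing weight `alpha` are all hypothetical names introduced here.

```python
import torch
import torch.nn.functional as F

def combined_loss(logits, target_ids, sampled_ids, reward, alpha=0.5):
    """Interpolate a supervised loss with a REINFORCE-style RL loss.

    logits:      (batch, seq_len, vocab) summarizer outputs
    target_ids:  (batch, seq_len) reference summary tokens (supervised signal)
    sampled_ids: (batch, seq_len) tokens sampled from the summarizer
    reward:      (batch,) scalar usefulness score from the predictor
    alpha:       hypothetical mixing weight between the two objectives
    """
    # Supervised objective: cross-entropy against the reference summary.
    ce = F.cross_entropy(logits.transpose(1, 2), target_ids)

    # RL objective (REINFORCE): log-probability of the sampled tokens,
    # weighted by the predictor's reward (detached, so gradients only
    # flow into the summarizer).
    log_probs = F.log_softmax(logits, dim=-1)
    sampled_lp = log_probs.gather(-1, sampled_ids.unsqueeze(-1)).squeeze(-1)
    rl = -(reward.detach().unsqueeze(-1) * sampled_lp).mean()

    return alpha * ce + (1 - alpha) * rl
```

In practice the supervised and sampled sequences would come from separate decoding passes; the shared `logits` tensor here only keeps the sketch compact.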

References
Fabbri, Alexander R., et al. "SummEval: Re-evaluating summarization evaluation." Transactions of the Association for Computational Linguistics 9 (2021): 391-409.

Nallapati, Ramesh, et al. "Abstractive text summarization using sequence-to-sequence RNNs and beyond." arXiv preprint arXiv:1602.06023 (2016).

Paszke, Adam, et al. "PyTorch: An imperative style, high-performance deep learning library." Advances in Neural Information Processing Systems 32 (2019).

Gao, Yang, Wei Zhao, and Steffen Eger. "SUPERT: Towards new frontiers in unsupervised evaluation metrics for multi-document summarization." arXiv preprint arXiv:2005.03724 (2020).

Dessì, Roberto, et al. "Cross-Domain Image Captioning with Discriminative Finetuning." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2023.
 
Charles University | Information system of Charles University | http://www.cuni.cz/UKEN-329.html