Thesis (Selection of subject) (version: 368)
Thesis details
Thesis title in Czech: Textová inference v generování přirozeného jazyka
Thesis title in English: Textual inference in natural language generation
Key words: generování přirozeného jazyka|jazykový model|textová inference
English key words: natural language generation|language model|textual inference
Academic year of topic announcement: 2024/2025
Thesis type: dissertation
Thesis language:
Department: Institute of Formal and Applied Linguistics (32-UFAL)
Supervisor: Mgr. et Mgr. Ondřej Dušek, Ph.D.
Author:
Guidelines
State-of-the-art large language models (LLMs) have shown impressive capabilities in natural language generation (NLG), especially in following instructions [1]. However, they often produce factually inaccurate statements [2] and struggle with inference operations [3]. The end-to-end generation setting further contributes to a lack of interpretability [4]. Moreover, existing benchmarks, such as dialogue datasets and WebNLG [5], require little inference. Tabular datasets [6] offer only limited, pre-prepared highlights, and benchmarks for longer texts that require extensive content selection remain scarce [7].

This project aims to improve inference accuracy in NLG systems by integrating symbolic textual inference operations with LLMs, enabling the generation of statements that are logically entailed by the provided data. The objectives include improving content selection and responsiveness to human feedback through explicit operations, which make the model's actions transparent and verifiable.
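
To make the notion of symbolic textual inference operations concrete, the following minimal Python sketch shows how executable operations over tabular input could yield statements that are entailed by the data. The operation names (count, argmax, compare), the data schema, and the verbalizations are illustrative assumptions for this sketch, not part of the proposal itself.

# Toy input data: one record per entity, as in data-to-text benchmarks.
table = [
    {"team": "Alpha", "wins": 11, "losses": 3},
    {"team": "Beta", "wins": 7, "losses": 7},
    {"team": "Gamma", "wins": 2, "losses": 12},
]

# Each operation computes a fact directly from the data, so the resulting
# statement is entailed by construction and can be checked by re-execution.
def op_count(rows):
    return f"There are {len(rows)} teams."

def op_argmax(rows, key):
    best = max(rows, key=lambda r: r[key])
    return f"{best['team']} has the most {key} ({best[key]})."

def op_compare(rows, a, b, key):
    ra = next(r for r in rows if r["team"] == a)
    rb = next(r for r in rows if r["team"] == b)
    relation = "more" if ra[key] > rb[key] else "fewer"
    return f"{a} has {relation} {key} than {b}."

# An LLM could select and verbalize such operations; the symbolic trace
# keeps the generation step interpretable.
for statement in (op_count(table),
                  op_argmax(table, "wins"),
                  op_compare(table, "Alpha", "Beta", "wins")):
    print(statement)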

To achieve these aims, the project will build on and extend LLM architectures. It will explore selected adaptation techniques, such as finetuning on symbolic operation data [8], architecture or input extensions for symbolic operations, training specialized operation models, multi-model setups [9], iterative generation approaches [10], and multi-task or reinforcement learning [11].
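
As an illustration of the first listed technique, the sketch below finetunes a small causal LM with parameter-efficient LoRA adapters [8] on an example annotated with an explicit symbolic operation. The base model (gpt2), the LoRA hyperparameters, and the data/operation/text example format are assumptions made for this sketch; the actual model and annotation scheme are left open by the proposal.

from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, TaskType, get_peft_model

base = "gpt2"  # hypothetical small base model, for illustration only
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(base)

# LoRA trains a small set of added low-rank parameters instead of the
# full model; hyperparameters here are illustrative defaults.
peft_config = LoraConfig(task_type=TaskType.CAUSAL_LM,
                         r=8, lora_alpha=16, lora_dropout=0.1)
model = get_peft_model(model, peft_config)
model.print_trainable_parameters()

# One training example (assumed format): the target text is paired with
# the symbolic operation that entails it, so the model learns to ground
# its statements in explicit operations.
example = ("data: Alpha wins=11 | Beta wins=7\n"
           "operation: compare(Alpha, Beta, wins)\n"
           "text: Alpha has more wins than Beta.")
batch = tokenizer(example, return_tensors="pt")
loss = model(**batch, labels=batch["input_ids"]).loss  # standard LM loss
loss.backward()

Pairing each target text with the operation that entails it would let evaluation verify a generated statement against the executed operation rather than against free-form model output.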
References
[1] Zhao, Wayne Xin, et al. "A Survey of Large Language Models." arXiv preprint arXiv:2303.18223 (2023).
[2] Thomson, Craig, Ehud Reiter, and Barkavi Sundararajan. "Evaluating Factual Accuracy in Complex Data-to-Text." Computer Speech & Language 80 (2023): 101482.
[3] Creswell, Antonia, Murray Shanahan, and Irina Higgins. "Selection-Inference: Exploiting Large Language Models for Interpretable Logical Reasoning." arXiv preprint arXiv:2205.09712 (2022). / Wu, Zhaofeng, et al. "Reasoning or Reciting? Exploring the Capabilities and Limitations of Language Models through Counterfactual Tasks." arXiv preprint arXiv:2307.02477 (2023).
[4] Rogers, Anna, et al. "A Primer in BERTology: What We Know About How BERT Works." Transactions of the Association for Computational Linguistics 8 (2020): 842–866. doi:10.1162/tacl_a_00349.
[5] Gardent, Claire, et al. "The WebNLG Challenge: Generating Text from RDF Data." Proceedings of the 10th International Conference on Natural Language Generation. 2017.
[6] Borisov, Vadim, et al. "Deep Neural Networks and Tabular Data: A Survey." IEEE Transactions on Neural Networks and Learning Systems (2022).
[7] Puduppully, Ratish, et al. "Data-to-Text Generation with Content Selection and Planning." Proceedings of AAAI. 2019.
[8] Lialin, Vladislav, et al. "Scaling Down to Scale Up: A Guide to Parameter-Efficient Fine-Tuning." arXiv preprint arXiv:2303.15647 (2023). doi:10.48550/arXiv.2303.15647.
[9] Wu, Qingyun, et al. "AutoGen: Enabling Next-Gen LLM Applications via Multi-Agent Conversation Framework." arXiv preprint arXiv:2308.08155 (2023).
[10] Malmi, Eric, et al. "Text Generation with Text-Editing Models." Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies: Tutorial Abstracts. 2022, pp. 1–7. doi:10.18653/v1/2022.naacl-tutorials.1.
[11] Lee, Harrison, et al. "RLAIF: Scaling Reinforcement Learning from Human Feedback with AI Feedback." arXiv preprint arXiv:2309.00267 (2023).