Thesis (Selection of subject) (version: 368)
Thesis details
Efektivní metody pro systémy generování přirozeného jazyka
Thesis title in Czech: Efektivní metody pro systémy generování přirozeného jazyka
Thesis title in English: Efficient methods for neural natural language generation systems
Key words: generování přirozeného jazyka|jazykový model|efektivita|destilace modelu
English key words: natural language generation|language model|efficiency|model distillation
Academic year of topic announcement: 2024/2025
Thesis type: dissertation
Thesis language:
Department: Institute of Formal and Applied Linguistics (32-UFAL)
Supervisor: Mgr. et Mgr. Ondřej Dušek, Ph.D.
Author:
Guidelines
Current state-of-the-art approaches to natural language generation generally use pretrained neural language models (Xiang et al., 2022; Zhao et al., 2023). While the latest-generation models achieve high sample efficiency when adapted from limited amounts of in-domain data, e.g. via few-shot learning or zero-shot instructions (Axelsson & Skantze, 2023; Kasner & Dušek, 2024), the models themselves require vast amounts of pretraining data and considerable computational power simply to run, which limits their practicality. On the other hand, previous-generation approaches based on smaller language models (Harkous et al., 2020; Kale & Rastogi, 2020) required large volumes of in-domain data to produce viable outputs, which again hampered their practicality. The aim of this project is to attain the upsides of both types of approaches: high sample efficiency and high computational efficiency at inference time.
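As an illustration of the zero-shot, instruction-based paradigm described above, the following minimal Python sketch prompts an off-the-shelf instruction-tuned model for data-to-text generation. The model name, the example attributes, and the prompt wording are illustrative placeholders only; the proposal does not prescribe any of them.

# Minimal sketch of zero-shot instruction-based data-to-text generation.
# The model, the attributes, and the prompt are illustrative placeholders.
from transformers import pipeline

generator = pipeline("text2text-generation", model="google/flan-t5-small")

data = {"name": "Blue Spice", "food": "Italian", "area": "riverside"}
prompt = (
    "Describe the following restaurant in one sentence. "
    f"Attributes: {data}"
)
# The pipeline returns a list of dicts with the generated string.
print(generator(prompt, max_new_tokens=40)[0]["generated_text"])

No in-domain training data is used here, which is precisely the sample-efficiency advantage of this paradigm; its cost is the size of the pretrained model that must be run at inference time.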

To achieve the main aim, the thesis will explore different training techniques and/or model architecture modifications. These may include, for instance, model distillation (Sanh et al., 2019; Hsieh et al., 2023), data selection and filtering (Arun et al., 2020), synthetic data generation (Elder et al., 2020), self-training (Li et al., 2021), or iterative and pipeline approaches to text generation (Kasner & Dušek, 2022; Malmi et al., 2022).
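To make one of the listed techniques concrete, the following minimal sketch shows a standard knowledge-distillation training loss, combining soft teacher targets with cross-entropy on gold labels. The hyperparameters (temperature, mixing weight) are illustrative assumptions, not values fixed by the thesis.

# Minimal sketch of a Hinton-style knowledge-distillation loss.
# Temperature and alpha are illustrative placeholder values.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits: torch.Tensor,
                      teacher_logits: torch.Tensor,
                      labels: torch.Tensor,
                      temperature: float = 2.0,
                      alpha: float = 0.5) -> torch.Tensor:
    # Soft targets: teacher distribution at raised temperature.
    soft_teacher = F.softmax(teacher_logits / temperature, dim=-1)
    log_soft_student = F.log_softmax(student_logits / temperature, dim=-1)
    kd = F.kl_div(log_soft_student, soft_teacher,
                  reduction="batchmean") * temperature ** 2
    # Hard targets: standard cross-entropy against gold labels.
    ce = F.cross_entropy(student_logits, labels)
    return alpha * kd + (1.0 - alpha) * ce

In a distillation setup of this kind, teacher_logits would come from a frozen large model and student_logits from the smaller model being trained, so that the smaller model approximates the teacher's output distribution while remaining cheap to run at inference time.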
References
A. Arun et al., “Best Practices for Data-Efficient Modeling in NLG: How to Train Production-Ready Neural Models with Less Data,” in Proceedings of the 28th International Conference on Computational Linguistics: Industry Track, Online: International Committee on Computational Linguistics, Dec. 2020, pp. 64–77. Accessed: Dec. 17, 2020. [Online]. Available: https://www.aclweb.org/anthology/2020.coling-industry.7
A. Axelsson and G. Skantze, “Using Large Language Models for Zero-Shot Natural Language Generation from Knowledge Graphs,” in Workshop on Multimodal, Multilingual Natural Language Generation, Prague, Czechia: arXiv, Sep. 2023. doi: 10.48550/arXiv.2307.07312.
H. Elder, A. O’Connor, and J. Foster, “How to Make Neural Natural Language Generation as Reliable as Templates in Task-Oriented Dialogue,” in Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), Online: Association for Computational Linguistics, Nov. 2020, pp. 2877–2888. doi: 10.18653/v1/2020.emnlp-main.230.
H. Harkous, I. Groves, and A. Saffari, “Have Your Text and Use It Too! End-to-End Neural Data-to-Text Generation with Semantic Fidelity,” in COLING, Online, Dec. 2020. Accessed: Apr. 20, 2020. [Online]. Available: https://aclanthology.org/2020.coling-main.218/
C.-Y. Hsieh et al., “Distilling Step-by-Step! Outperforming Larger Language Models with Less Training Data and Smaller Model Sizes,” in Findings of ACL, Toronto, Canada: arXiv, Jul. 2023. doi: 10.48550/arXiv.2305.02301.
M. Kale and A. Rastogi, “Text-to-Text Pre-Training for Data-to-Text Tasks,” in Proceedings of the 13th International Conference on Natural Language Generation, Dublin, Ireland: Association for Computational Linguistics, Dec. 2020, pp. 97–102. Accessed: Mar. 31, 2021. [Online]. Available: https://www.aclweb.org/anthology/2020.inlg-1.14
Z. Kasner and O. Dušek, “Beyond Reference-Based Metrics: Analyzing Behaviors of Open LLMs on Data-to-Text Generation.” arXiv, Jan. 18, 2024. doi: 10.48550/arXiv.2401.10186.
Z. Kasner and O. Dušek, “Neural Pipeline for Zero-Shot Data-to-Text Generation,” in Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), Dublin, Ireland: Association for Computational Linguistics, May 2022, pp. 3914–3932. doi: 10.18653/v1/2022.acl-long.271.
X. Li, S. Stevens-Guille, A. Maskharashvili, and M. White, “Self-Training for Compositional Neural NLG in Task-Oriented Dialogue,” in Proceedings of the 14th International Conference on Natural Language Generation, Aberdeen, Scotland, UK: Association for Computational Linguistics, Aug. 2021, pp. 87–102. Accessed: Sep. 21, 2021. [Online]. Available: https://aclanthology.org/2021.inlg-1.10
E. Malmi et al., “Text Generation with Text-Editing Models,” in Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies: Tutorial Abstracts, Seattle, United States: Association for Computational Linguistics, Jul. 2022, pp. 1–7. doi: 10.18653/v1/2022.naacl-tutorials.1.
V. Sanh, L. Debut, J. Chaumond, and T. Wolf, “DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter,” in 5th Workshop on Energy Efficient Machine Learning and Cognitive Computing - NeurIPS 2019, Dec. 2019. Accessed: Oct. 22, 2020. [Online]. Available: http://arxiv.org/abs/1910.01108
J. Xiang, Z. Liu, Y. Zhou, E. P. Xing, and Z. Hu, “ASDOT: Any-Shot Data-to-Text Generation with Pretrained Language Models,” in Findings of EMNLP, Abu Dhabi, UAE: arXiv, Dec. 2022. doi: 10.48550/arXiv.2210.04325.
Y. Zhao, H. Zhang, S. Si, L. Nan, X. Tang, and A. Cohan, “Investigating Table-to-Text Generation Capabilities of LLMs in Real-World Information Seeking Scenarios,” in EMNLP Industry Track, Singapore: arXiv, Dec. 2023. doi: 10.48550/arXiv.2305.14987.