Thesis (Selection of subject) (version: 368)
Thesis details
Efektivní metody pro systémy generování přirozeného jazyka
Thesis title in Czech: Efektivní metody pro systémy generování přirozeného jazyka
Thesis title in English: Efficient methods for neural natural language generation systems
Key words: generování přirozeného jazyka|jazykový model|efektivita|destilace modelu
English key words: natural language generation|language model|efficiency|model distillation
Academic year of topic announcement: 2024/2025
Thesis type: dissertation
Thesis language:
Department: Institute of Formal and Applied Linguistics (32-UFAL)
Supervisor: Mgr. et Mgr. Ondřej Dušek, Ph.D.
Author:
Guidelines
Current state-of-the-art approaches to natural language generation generally use pretrained neural language models (Xiang et al., 2022; Zhao et al., 2023). While the latest-generation models achieve high sample efficiency when adapted from limited amounts of in-domain data, e.g. via few-shot learning or zero-shot instructions (Axelsson & Skantze, 2023; Kasner & Dušek, 2024), the models themselves require vast amounts of pretraining data and considerable computational power simply to run, which limits their practicality. On the other hand, previous-generation approaches based on smaller language models (Harkous et al., 2020; Kale & Rastogi, 2020) required large volumes of in-domain data to produce viable outputs, which again hampered their practicality. The aim of this project is to attain the upsides of both types of approaches: high sample efficiency and high computational efficiency at inference time.
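As an illustration of the zero-shot, instruction-based paradigm described above, the following minimal Python sketch prompts an off-the-shelf instruction-tuned model for data-to-text generation. The model name, the example attributes, and the prompt wording are illustrative placeholders only; the proposal does not prescribe any of them.

# Minimal sketch of zero-shot instruction-based data-to-text generation.
# The model, the attributes, and the prompt are illustrative placeholders.
from transformers import pipeline

generator = pipeline("text2text-generation", model="google/flan-t5-small")

data = {"name": "Blue Spice", "food": "Italian", "area": "riverside"}
prompt = (
    "Describe the following restaurant in one sentence. "
    f"Attributes: {data}"
)
# The pipeline returns a list of dicts with the generated string.
print(generator(prompt, max_new_tokens=40)[0]["generated_text"])

No in-domain training data is used here, which is precisely the sample-efficiency advantage of this paradigm; its cost is the size of the pretrained model that must be run at inference time.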

To achieve the main aim, the thesis will explore different training techniques and/or model architecture modifications. These may include, for instance, model distillation (Sanh et al., 2019; Hsieh et al., 2023), data selection and filtering (Arun et al., 2020), synthetic data generation (Elder et al., 2020), self-training (Li et al., 2021), or iterative and pipeline approaches to text generation (Kasner & Dušek, 2022; Malmi et al., 2022).
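To make one of the listed techniques concrete, the following minimal sketch shows a standard knowledge-distillation training loss, combining soft teacher targets with cross-entropy on gold labels. The hyperparameters (temperature, mixing weight) are illustrative assumptions, not values fixed by the thesis.

# Minimal sketch of a Hinton-style knowledge-distillation loss.
# Temperature and alpha are illustrative placeholder values.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits: torch.Tensor,
                      teacher_logits: torch.Tensor,
                      labels: torch.Tensor,
                      temperature: float = 2.0,
                      alpha: float = 0.5) -> torch.Tensor:
    # Soft targets: teacher distribution at raised temperature.
    soft_teacher = F.softmax(teacher_logits / temperature, dim=-1)
    log_soft_student = F.log_softmax(student_logits / temperature, dim=-1)
    kd = F.kl_div(log_soft_student, soft_teacher,
                  reduction="batchmean") * temperature ** 2
    # Hard targets: standard cross-entropy against gold labels.
    ce = F.cross_entropy(student_logits, labels)
    return alpha * kd + (1.0 - alpha) * ce

In a distillation setup of this kind, teacher_logits would come from a frozen large model and student_logits from the smaller model being trained, so that the smaller model approximates the teacher's output distribution while remaining cheap to run at inference time.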
References
A. Arun et al., “Best Practices for Data-Efficient Modeling in NLG: How to Train Production-Ready Neural Models with Less Data,” in Proceedings of the 28th International Conference on Computational Linguistics: Industry Track, Online: International Committee on Computational Linguistics, Dec. 2020, pp. 64–77. Accessed: Dec. 17, 2020. [Online]. Available: https://www.aclweb.org/anthology/2020.coling-industry.7
A. Axelsson and G. Skantze, “Using Large Language Models for Zero-Shot Natural Language Generation from Knowledge Graphs,” in Workshop on Multimodal, Multilingual Natural Language Generation, Prague, Czechia: arXiv, Sep. 2023. doi: 10.48550/arXiv.2307.07312.
H. Elder, A. O’Connor, and J. Foster, “How to Make Neural Natural Language Generation as Reliable as Templates in Task-Oriented Dialogue,” in Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), Online: Association for Computational Linguistics, Nov. 2020, pp. 2877–2888. doi: 10.18653/v1/2020.emnlp-main.230.
H. Harkous, I. Groves, and A. Saffari, “Have Your Text and Use It Too! End-to-End Neural Data-to-Text Generation with Semantic Fidelity,” in COLING, Online, Dec. 2020. Accessed: Apr. 20, 2020. [Online]. Available: https://aclanthology.org/2020.coling-main.218/
C.-Y. Hsieh et al., “Distilling Step-by-Step! Outperforming Larger Language Models with Less Training Data and Smaller Model Sizes,” in Findings of ACL, Toronto, Canada: arXiv, Jul. 2023. doi: 10.48550/arXiv.2305.02301.
M. Kale and A. Rastogi, “Text-to-Text Pre-Training for Data-to-Text Tasks,” in Proceedings of the 13th International Conference on Natural Language Generation, Dublin, Ireland: Association for Computational Linguistics, Dec. 2020, pp. 97–102. Accessed: Mar. 31, 2021. [Online]. Available: https://www.aclweb.org/anthology/2020.inlg-1.14
Z. Kasner and O. Dušek, “Beyond Reference-Based Metrics: Analyzing Behaviors of Open LLMs on Data-to-Text Generation.” arXiv, Jan. 18, 2024. doi: 10.48550/arXiv.2401.10186.
Z. Kasner and O. Dušek, “Neural Pipeline for Zero-Shot Data-to-Text Generation,” in Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), Dublin, Ireland: Association for Computational Linguistics, May 2022, pp. 3914–3932. doi: 10.18653/v1/2022.acl-long.271.
X. Li, S. Stevens-Guille, A. Maskharashvili, and M. White, “Self-Training for Compositional Neural NLG in Task-Oriented Dialogue,” in Proceedings of the 14th International Conference on Natural Language Generation, Aberdeen, Scotland, UK: Association for Computational Linguistics, Aug. 2021, pp. 87–102. Accessed: Sep. 21, 2021. [Online]. Available: https://aclanthology.org/2021.inlg-1.10
E. Malmi et al., “Text Generation with Text-Editing Models,” in Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies: Tutorial Abstracts, Seattle, United States: Association for Computational Linguistics, Jul. 2022, pp. 1–7. doi: 10.18653/v1/2022.naacl-tutorials.1.
V. Sanh, L. Debut, J. Chaumond, and T. Wolf, “DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter,” in 5th Workshop on Energy Efficient Machine Learning and Cognitive Computing - NeurIPS 2019, Dec. 2019. Accessed: Oct. 22, 2020. [Online]. Available: http://arxiv.org/abs/1910.01108
J. Xiang, Z. Liu, Y. Zhou, E. P. Xing, and Z. Hu, “ASDOT: Any-Shot Data-to-Text Generation with Pretrained Language Models,” in Findings of EMNLP, Abu Dhabi, UAE: arXiv, Dec. 2022. doi: 10.48550/arXiv.2210.04325.
Y. Zhao, H. Zhang, S. Si, L. Nan, X. Tang, and A. Cohan, “Investigating Table-to-Text Generation Capabilities of LLMs in Real-World Information Seeking Scenarios,” in EMNLP Industry Track, Singapore: arXiv, Dec. 2023. doi: 10.48550/arXiv.2305.14987.