Practical neural dialogue management using pretrained language models
| Thesis title in Czech: | Praktický neuronový dialogový manažer s použitím předtrénovaných jazykových modelů |
|---|---|
| Thesis title in English: | Practical neural dialogue management using pretrained language models |
| Key words: | dialogové systémy, předtrénované jazykové modely, zpracování přirozeného jazyka, dialogový manažer |
| English key words: | dialogue systems, pretrained language models, natural language processing, dialogue management |
| Academic year of topic announcement: | 2021/2022 |
| Thesis type: | diploma thesis |
| Thesis language: | English |
| Department: | Institute of Formal and Applied Linguistics (32-UFAL) |
| Supervisor: | Mgr. et Mgr. Ondřej Dušek, Ph.D. |
| Author: | hidden - assigned and confirmed by the Study Dept. |
| Date of registration: | 19.09.2022 |
| Date of assignment: | 19.09.2022 |
| Confirmed by Study dept. on: | 27.09.2022 |
| Date and time of defence: | 05.09.2023 09:00 |
| Date of electronic submission: | 20.07.2023 |
| Date of submission of printed version: | 24.07.2023 |
| Date of proceeded defence: | 05.09.2023 |
| Opponents: | doc. RNDr. Ondřej Bojar, Ph.D. |
Guidelines
While much of today's dialogue systems research is dedicated to end-to-end neural models (Lin et al., 2020; Peng et al., 2021), this approach is notorious for requiring large amounts of annotated data, which is costly to obtain, and neural generative models are generally unsafe to use in practical applications due to their tendency to hallucinate, i.e., produce ungrounded outputs (Ji et al., 2022). Dialogue systems for practical applications thus remain composed of multiple separate modules (language understanding, state tracking, dialogue policy, language generation). Hybrid Code Networks (HCN; Williams et al., 2017) are a neural data-driven architecture that combines all modules apart from language generation and allows training on limited data, but they do not take advantage of recent developments in the field, i.e., pretrained language models (Radford et al., 2019; Lewis et al., 2020).
The goal of this thesis is to explore HCN-based or similar neural architectures for practical dialogue modeling while making use of pretrained language models. The implemented architecture will be tested on the language understanding – state tracking – dialogue policy combination, and it will be evaluated in a limited data setting.
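To make the HCN idea concrete, the following is a minimal illustrative sketch of an HCN-style dialogue policy: learned features are combined with hand-coded action masking, and a policy scores a fixed set of action templates. All names here (the toy vocabulary, actions, and the `slots_filled` state flag) are hypothetical; in a real system the bag-of-words featurizer would be replaced by representations from a pretrained language model such as BERT, and the untrained linear layer by a trained recurrent policy.

```python
import numpy as np

# Toy stand-ins (hypothetical, for illustration only).
VOCAB = ["book", "table", "cancel", "bye"]
ACTIONS = ["ask_slot", "confirm_booking", "cancel_booking", "goodbye"]

def featurize(utterance: str, state: dict) -> np.ndarray:
    """Bag-of-words utterance features plus binary dialogue-state features.
    A pretrained LM encoder would replace this in a full implementation."""
    tokens = utterance.lower().split()
    bow = np.array([1.0 if w in tokens else 0.0 for w in VOCAB])
    state_feats = np.array([1.0 if state.get("slots_filled") else 0.0])
    return np.concatenate([bow, state_feats])

def action_mask(state: dict) -> np.ndarray:
    """Hand-coded business rules, the 'code' part of HCN:
    e.g., never confirm a booking before all slots are filled."""
    mask = np.ones(len(ACTIONS))
    if not state.get("slots_filled"):
        mask[ACTIONS.index("confirm_booking")] = 0.0
    return mask

# Untrained toy policy weights; a real HCN learns these from dialogues.
rng = np.random.default_rng(0)
W = rng.normal(size=(len(ACTIONS), len(VOCAB) + 1))

def next_action(utterance: str, state: dict) -> str:
    """Score action templates and apply the HCN-style action mask."""
    logits = W @ featurize(utterance, state)
    logits[action_mask(state) == 0.0] = -np.inf  # masked actions never win
    return ACTIONS[int(np.argmax(logits))]
```

The key design point the sketch shows is that masking makes the neural policy safe for deployment: regardless of what the learned scores say, actions ruled out by the dialogue state cannot be selected.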
References
J. Devlin, M.-W. Chang, K. Lee, and K. Toutanova, “BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding,” in Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics (NAACL-HLT), Minneapolis, MN, USA, Jun. 2019. https://www.aclweb.org/anthology/N19-1423
Z. Ji et al., “Survey of Hallucination in Natural Language Generation,” arXiv:2202.03629 [cs], Feb. 2022. http://arxiv.org/abs/2202.03629

M. Lewis et al., “BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension,” in Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, Online, Jul. 2020, pp. 7871–7880. doi: 10.18653/v1/2020.acl-main.703.

Z. Lin, A. Madotto, G. I. Winata, and P. Fung, “MinTL: Minimalist Transfer Learning for Task-Oriented Dialogue Systems,” in Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), Online, Nov. 2020, pp. 3391–3405. doi: 10.18653/v1/2020.emnlp-main.273.

B. Peng, C. Li, J. Li, S. Shayandeh, L. Liden, and J. Gao, “Soloist: Building Task Bots at Scale with Transfer Learning and Machine Teaching,” Transactions of the Association for Computational Linguistics, vol. 9, pp. 807–824, Aug. 2021. doi: 10.1162/tacl_a_00399.

A. Radford, J. Wu, R. Child, D. Luan, D. Amodei, and I. Sutskever, “Language Models are Unsupervised Multitask Learners,” OpenAI, Feb. 2019. https://openai.com/blog/better-language-models/

J. D. Williams, K. Asadi, and G. Zweig, “Hybrid Code Networks: practical and efficient end-to-end dialog control with supervised and reinforcement learning,” in Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics, Vancouver, Canada, Jul. 2017. https://aclanthology.org/P17-1062/