Thesis (Selection of subject) (version: 368)
Thesis details
Konverzační agenti pro dialog zaměřený na úkoly
Thesis title in Czech: Konverzační agenti pro dialog zaměřený na úkoly
Thesis title in English: Conversational agents for task-oriented dialogue
Key words: dialogové systémy|dialog zaměřený na úkoly|velké jazykové modely|zpracování přirozeného jazyka|chatbot
English key words: dialogue systems|task-oriented dialogue|large language models|natural language processing|chatbot
Academic year of topic announcement: 2023/2024
Thesis type: diploma thesis
Thesis language:
Department: Institute of Formal and Applied Linguistics (32-UFAL)
Supervisor: Mgr. et Mgr. Ondřej Dušek, Ph.D.
Author: hidden - assigned and confirmed by the Study Dept.
Date of registration: 15.04.2024
Date of assignment: 15.04.2024
Confirmed by Study dept. on: 15.04.2024
Guidelines
Prompted large language models (LLMs; Zhao et al., 2023) have shown promising abilities on a number of tasks, including task-oriented dialogue, such as searching for and booking restaurants, trains, etc. (Hudeček & Dušek, 2023). There is large potential for further performance improvements using advanced prompting techniques or extensions, such as chain-of-thought prompting (splitting the task into reasoning steps; Kojima et al., 2022) or external tool usage (calling external APIs or functions and integrating their outputs into generated responses; Schick et al., 2023). An integration of these two particular techniques, interleaving reasoning steps ("thoughts") and tool-calling actions, is the recently proposed ReAct architecture (Yao et al., 2023), which has been shown to work well on multiple tasks, including question answering and fact verification.
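To make the ReAct idea concrete, a minimal sketch of such a thought/action/observation loop is shown below. The LLM is replaced by a scripted stand-in, and the single restaurant-search tool, together with all names (`fake_llm`, `TOOLS`, `react`), is purely illustrative and not part of the assignment:

```python
# Minimal ReAct-style loop: the model alternates "Thought" (reasoning)
# and "Action" (tool call) steps; tool outputs are fed back as
# "Observation" lines until the model emits a "Final" answer.

# Hypothetical tool registry mapping tool names to callables.
TOOLS = {
    "search_restaurant": lambda query: "Found: Golden Wok (Chinese, centre)",
}

def fake_llm(prompt: str) -> str:
    """Scripted stand-in for a real LLM call, for illustration only."""
    if "Observation" not in prompt:
        return ("Thought: The user wants a restaurant; I should search.\n"
                "Action: search_restaurant[chinese centre]")
    return ("Thought: I have a matching result.\n"
            "Final: Golden Wok serves Chinese food in the centre.")

def react(user_turn: str, max_steps: int = 5) -> str:
    """Run the ReAct loop for one user turn."""
    prompt = f"User: {user_turn}"
    for _ in range(max_steps):
        step = fake_llm(prompt)
        prompt += "\n" + step
        if "Final:" in step:                      # model is done
            return step.split("Final:", 1)[1].strip()
        if "Action:" in step:                     # model requested a tool
            action = step.split("Action:", 1)[1].strip()
            name, arg = action.split("[", 1)
            observation = TOOLS[name.strip()](arg.rstrip("]"))
            prompt += f"\nObservation: {observation}"
    return "Sorry, I could not complete the request."
```

A real implementation would replace `fake_llm` with a call to an actual LLM API and parse its output more robustly, but the control flow of interleaved reasoning and tool use is the same.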

The aim of this thesis is to explore different prompting and inference strategies in LLMs, inspired by ReAct, for task-oriented dialogue. The approach may combine different LLMs or different prompts for specific dialogue-related subtasks, as well as the usage of external tools by the LLMs. The implementation will explore the currently most prominent OpenAI LLMs as well as open-source LLMs for comparison. The experiments will be performed mostly on the MultiWOZ multi-domain task-oriented benchmark (Budzianowski et al., 2018), which offers suitably detailed annotation. The approach will be evaluated using automatic metrics and user simulation (Lee et al., 2019; potentially with LLM-based user simulators), and will include a small-scale manual evaluation or error analysis. The evaluation may potentially extend to datasets other than MultiWOZ (e.g., Schema-Guided Dialogue; Rastogi et al., 2020) to check the generalization abilities of the approach.
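As a rough illustration of the kind of automatic metric involved, the sketch below computes a simplified "inform"-style rate: the fraction of dialogues in which the system offered an entity matching the user's goal. This is a hypothetical, heavily simplified helper for illustration, not the official MultiWOZ evaluation script:

```python
# Simplified "inform"-style metric: a dialogue counts as a hit if the
# system offered at least one entity that satisfies the user goal.
# The dialogue representation here is invented for the example.

def inform_rate(dialogues):
    """dialogues: list of dicts with 'goal_entities' and 'offered_entities' sets."""
    if not dialogues:
        return 0.0
    hits = sum(
        1 for d in dialogues
        if set(d["offered_entities"]) & set(d["goal_entities"])  # any overlap
    )
    return hits / len(dialogues)

# Toy example: one matched dialogue, one mismatched dialogue.
example = [
    {"goal_entities": {"golden_wok"}, "offered_entities": {"golden_wok"}},
    {"goal_entities": {"hotel_a"}, "offered_entities": {"hotel_b"}},
]
```

The real MultiWOZ metrics additionally check requested slot values ("success") and response fluency (BLEU), which is why the benchmark's detailed annotation matters.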
References
Budzianowski, P., Wen, T.-H., Tseng, B.-H., Casanueva, I., Ultes, S., Ramadan, O. and Gašić, M., 2018. MultiWOZ - A Large-Scale Multi-Domain Wizard-of-Oz Dataset for Task-Oriented Dialogue Modelling. In: Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing. Brussels, Belgium. pp.5016–5026. <https://aclanthology.org/D18-1547/> .
Hudeček, V. and Dušek, O., 2023. Are Large Language Models All You Need for Task-Oriented Dialogue? In: Proceedings of the 24th Meeting of the Special Interest Group on Discourse and Dialogue. SIGDIAL 2023. Prague, Czechia: Association for Computational Linguistics. pp.216–228. <https://aclanthology.org/2023.sigdial-1.21> .
Kojima, T., Gu, S.S., Reid, M., Matsuo, Y. and Iwasawa, Y., 2022. Large Language Models are Zero-Shot Reasoners. https://doi.org/10.48550/arXiv.2205.11916.
Lee, S., Zhu, Q., Takanobu, R., Zhang, Z., Zhang, Y., Li, X., Li, J., Peng, B., Li, X., Huang, M. and Gao, J., 2019. ConvLab: Multi-Domain End-to-End Dialog System Platform. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics: System Demonstrations. Florence, Italy: Association for Computational Linguistics. pp.64–69. https://doi.org/10.18653/v1/P19-3011.
Rastogi, A., Zang, X., Sunkara, S., Gupta, R. and Khaitan, P., 2020. Schema-Guided Dialogue State Tracking Task at DSTC8. In: DSTC8 Workshop @ AAAI-2020. [online] New York, NY, USA. <http://arxiv.org/abs/2002.01359>.
Schick, T., Dwivedi-Yu, J., Dessì, R., Raileanu, R., Lomeli, M., Zettlemoyer, L., Cancedda, N. and Scialom, T., 2023. Toolformer: Language Models Can Teach Themselves to Use Tools. https://doi.org/10.48550/arXiv.2302.04761.
Yao, S., Zhao, J., Yu, D., Du, N., Shafran, I., Narasimhan, K. and Cao, Y., 2023. ReAct: Synergizing Reasoning and Acting in Language Models. In: ICLR 2023. Kigali, Rwanda. https://doi.org/10.48550/arXiv.2210.03629.
Zhao, W.X., Zhou, K., Li, J., Tang, T., Wang, X., Hou, Y., Min, Y., Zhang, B., Zhang, J., Dong, Z., Du, Y., Yang, C., Chen, Y., Chen, Z., Jiang, J., Ren, R., Li, Y., Tang, X., Liu, Z., Liu, P., Nie, J.-Y. and Wen, J.-R., 2023. A Survey of Large Language Models. https://doi.org/10.48550/arXiv.2303.18223.
Charles University | Information system of Charles University | http://www.cuni.cz/UKEN-329.html