Thesis details
Design of LLM prompts for iterative data exploration
Title in Czech: Design of LLM prompts for iterative data exploration
Title in English: Design of LLM prompts for iterative data exploration
Keywords: large language models|data exploration|interactive programming|artificial intelligence
Academic year of announcement: 2023/2024
Thesis type: bachelor's thesis
Thesis language:
Department: Department of Distributed and Dependable Systems (32-KDSS)
Supervisor: Mgr. Tomáš Petříček, Ph.D.
Author: Mikoláš Fromm - assigned and confirmed by the Student Affairs Department
Date of registration: 25.09.2023
Date of assignment: 26.09.2023
Date of confirmation by the Student Affairs Department: 23.11.2023
Guidelines
Large language models (LLMs) [4] are increasingly used to create data exploration scripts [3]. However, generating an entire script in a single step makes it difficult for users to understand and validate the result. In contrast, "iterative prompting" [1, 5] makes it possible to build a programmatic data exploration tool in which the user is repeatedly offered a range of options and constructs a script by choosing one of them at each step. However, doing so is not as convenient as specifying a query in natural language.

The aim of the thesis is to combine the two approaches. It will design an example integration of an LLM with an iterative prompting data exploration system. The integration will be subject to design evaluation and performance benchmarking that compares several approaches to building such a system. In the resulting system, the user will write a query in natural language, and the system developed for the thesis will use an LLM (with an appropriately constructed prompt, possibly inspired by emerging prompt patterns [2]) to iteratively advise the user which of the options offered by the "iterative prompting" system to choose. As with other conversational agents [6], this may increase user understanding of the problem [7]. The thesis work will consist of developing a system for data exploration (focusing on tabular data) based on iterative prompting and integrating it with an LLM. It will then explore and evaluate different ways of constructing LLM prompts for obtaining recommendations to control the system.
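
As a rough illustration of the loop described above, the following Python sketch shows one possible way an LLM could be asked, step by step, which offered option to choose. It is only a sketch under stated assumptions, not the design the thesis will arrive at: the names `build_prompt`, `parse_choice`, `recommend_path`, `next_options` and `llm` are hypothetical, the prompt wording is a placeholder, and the LLM is modelled as a plain function from prompt text to reply text so that no particular API is implied.

```python
import re
from typing import Callable, List, Optional, Sequence


def build_prompt(query: str, chosen: Sequence[str], options: Sequence[str]) -> str:
    """Compose the prompt for one step: user goal, choices so far, numbered options.

    The wording is a placeholder; finding good wording is what the thesis evaluates.
    """
    lines = [
        "You help a user explore tabular data step by step.",
        f"User query: {query}",
        "Choices made so far: " + (" -> ".join(chosen) if chosen else "(none)"),
        "Options offered now:",
    ]
    lines += [f"{i}. {option}" for i, option in enumerate(options, start=1)]
    lines.append("Reply with the number of the single best option.")
    return "\n".join(lines)


def parse_choice(reply: str, options: Sequence[str]) -> Optional[str]:
    """Return the option named by the first integer in the reply, or None if unusable."""
    match = re.search(r"\d+", reply)
    if match is None:
        return None
    index = int(match.group()) - 1  # options are numbered from 1 in the prompt
    return options[index] if 0 <= index < len(options) else None


def recommend_path(
    query: str,
    next_options: Callable[[Sequence[str]], Sequence[str]],  # hypothetical: the iterative prompting system
    llm: Callable[[str], str],  # hypothetical: any function mapping a prompt to a reply
    max_steps: int = 10,
) -> List[str]:
    """Repeatedly ask the LLM which offered option to pick until no options remain."""
    chosen: List[str] = []
    for _ in range(max_steps):
        options = next_options(chosen)
        if not options:
            break  # the exploration script is complete
        choice = parse_choice(llm(build_prompt(query, chosen, options)), options)
        if choice is None:
            break  # unusable reply: stop rather than guess on the user's behalf
        chosen.append(choice)
    return chosen
```

In this sketch the LLM only recommends; an actual system would presumably show each recommendation to the user, who still confirms or overrides it. Passing `llm` as a function keeps the loop independent of any provider and lets different prompt designs be compared by swapping `build_prompt`.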
References
[1] Petricek, T. The Gamma: Programmatic Data Exploration for Non-programmers. In 2022 IEEE Symposium on Visual Languages and Human-Centric Computing (VL/HCC), pp. 1-7. IEEE, 2022.
[2] White, J., Fu, Q., Hays, S., Sandborn, M., Olea, C., Gilbert, H., Elnashar, A., Spencer-Smith, J. and Schmidt, D.C., 2023. A prompt pattern catalog to enhance prompt engineering with ChatGPT. Available at: https://arxiv.org/pdf/2302.11382.pdf
[3] Maddigan, P. and Susnjak, T., 2023. Chat2VIS: Generating data visualisations via natural language using ChatGPT, Codex and GPT-3 large language models. IEEE Access.
[4] OpenAI. ChatGPT API documentation. Available at: https://platform.openai.com/docs/introduction, Accessed 9/2023
[5] Petricek, T., 2017. Data exploration through dot-driven development. In 31st European Conference on Object-Oriented Programming (ECOOP 2017).
[6] Fast, E., Chen, B., Mendelsohn, J., Bassen, J. and Bernstein, M.S., 2018, April. Iris: A conversational agent for complex tasks. In Proceedings of the 2018 CHI conference on human factors in computing systems (pp. 1-12).
[7] Reicherts, L. and Rogers, Y., 2020, July. Do make me think! How CUIs can support cognitive processes. In Proceedings of the 2nd Conference on Conversational User Interfaces (pp. 1-4).
 