Thesis topics (Thesis selection)

Unsupervised Open Information Extraction with Large Language Models

Thesis title in Czech:	Neomezená extrakce informací bez učitele pomocí velkých jazykových modelů
Thesis title in English:	Unsupervised Open Information Extraction with Large Language Models
Keywords (Czech):	hluboké učení|předtrénované jazykové modely|extrakce informací|strojové učení bez učitele
Keywords (English):	deep learning|pretrained language models|unsupervised machine learning|information extraction
Academic year of topic announcement:	2022/2023
Thesis type:	diploma thesis
Thesis language:	English
Department:	Institute of Formal and Applied Linguistics (32-UFAL)
Supervisor:	doc. RNDr. Pavel Pecina, Ph.D.
Author:	hidden - assigned and confirmed by the Study Department
Date of registration:	24.03.2023
Date of assignment:	24.03.2023
Date of confirmation by the Study Department:	31.03.2023
Date and time of defence:	10.09.2024 09:00
Date of electronic submission:	18.07.2024
Date of submission of printed version:	18.07.2024
Date of completed defence:	10.09.2024
Opponents:	RNDr. Martin Holub, Ph.D.

Guidelines

Open Information Extraction (OIE) is an NLP task that involves extracting relations between entities in textual corpora. Some OIE methods use linguistic knowledge to extract these relations in an unsupervised manner. Recent studies have indicated that pre-trained Large Language Models (LLMs) represent linguistic as well as relational information. Recognising this, the IELM benchmark (Wang et al., 2022) seeks to exploit the relational information stored in LLMs to extract entities and their relations by converting an LLM into a zero-shot OIE system. The goal of this thesis is to improve OIE as outlined by IELM and, following that, to investigate how linguistic constraints and knowledge prompting applied to the input control the behaviour of the information extraction process.

References

Wang, Chenguang, Xiao Liu, and Dawn Song. "IELM: An Open Information Extraction Benchmark for Pre-Trained Language Models." arXiv preprint arXiv:2210.14128 (2022).
Wang, Chenguang, et al. "Zero-Shot Information Extraction as a Unified Text-to-Triple Translation." arXiv preprint arXiv:2109.11171 (2021).