Thesis topics (Thesis selection)


Named entity recognition in the biomedical domain

Thesis title in Czech:	Rozpoznávání pojmenovaných entit v biomedicínské doméně
Thesis title in English:	Named entity recognition in the biomedical domain
Keywords in Czech:	Rozpoznávání pojmenovaných entit\|biomedicínská doména\|hluboké neuronové sítě
Keywords in English:	Named entity recognition\|biomedical domain\|deep neural networks
Academic year of topic announcement:	2018/2019
Thesis type:	diploma thesis
Thesis language:	English
Department:	Institute of Formal and Applied Linguistics (32-UFAL)
Supervisor:	doc. RNDr. Pavel Pecina, Ph.D.
Author:	hidden - assigned and confirmed by the Study Department
Date of registration:	29.09.2018
Date of assignment:	29.09.2018
Date of confirmation by the Study Department:	25.04.2019
Date and time of defence:	08.09.2021 09:00
Date of electronic submission:	22.07.2021
Date of printed submission:	22.07.2021
Date of completed defence:	08.09.2021
Opponents:	RNDr. Jana Straková, Ph.D.

Guidelines

Named entity recognition (NER) is an information extraction task that aims to recognize and extract particular entities in a text. One issue with NER is that its models are domain-specific, so this thesis focuses strictly on entities from the biomedical domain. Another issue is that several synonymous terms may refer to a single entity, which leads to the problem of entity disambiguation. Given the popularity of neural networks and their success in NLP tasks, the work should use a neural network architecture for named entity disambiguation, as described by Eshel et al. [1]. One subtask of the thesis is to map words and entities to a vector space using word embeddings, which aim to capture textual context similarity and coherence [2]. The main output of the thesis will be a model that attempts to disambiguate entities of the biomedical domain, using scientific journal articles (PubMed and Embase) as the documents of interest.
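To make the embedding subtask concrete, the following is a minimal sketch of one simple way to use a shared vector space for disambiguation: average the vectors of the context words and pick the candidate entity whose vector is closest by cosine similarity. It is not the neural architecture of Eshel et al. [1], nor the training procedure of Yamada et al. [2]. The function names, the entity identifiers, and all vectors are hypothetical toy values chosen only for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def mean_vector(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def disambiguate(context_words, candidates, word_vecs, entity_vecs):
    """Return the candidate entity closest to the averaged context vector.

    Words missing from word_vecs are skipped; returns None when no
    context word has a vector.
    """
    known = [word_vecs[w] for w in context_words if w in word_vecs]
    if not known:
        return None
    context = mean_vector(known)
    return max(candidates, key=lambda e: cosine(context, entity_vecs[e]))


# Hypothetical 3-dimensional toy embeddings (assumption, not real data).
WORD_VECS = {
    "patient": [0.9, 0.1, 0.0],
    "fever": [0.8, 0.2, 0.1],
    "temperature": [0.5, 0.1, 0.6],
    "weather": [0.0, 0.1, 0.9],
}
ENTITY_VECS = {
    "Cold_(disease)": [0.9, 0.2, 0.0],
    "Cold_(temperature)": [0.1, 0.1, 0.9],
}

# The ambiguous mention "cold" resolved in two different contexts.
clinical = disambiguate(["patient", "fever"], list(ENTITY_VECS), WORD_VECS, ENTITY_VECS)
ambient = disambiguate(["weather", "temperature"], list(ENTITY_VECS), WORD_VECS, ENTITY_VECS)
print(clinical, ambient)
```

In a real system the vectors would be learned from a corpus and the similarity score would typically be one feature among several, rather than the sole decision rule.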

References

[1] Eshel, Yotam, et al. “Named Entity Disambiguation for Noisy Text.” Proceedings of the 21st Conference on Computational Natural Language Learning (CoNLL 2017), 2017.
[2] Yamada, Ikuya, et al. “Joint Learning of the Embedding of Words and Entities for Named Entity Disambiguation.” Proceedings of the 20th SIGNLL Conference on Computational Natural Language Learning, 2016.