Thesis (Selection of subject)
Thesis details
Searching classes in the Wikidata ontology
Thesis title in Czech: Vyhledávání tříd v ontologii Wikidata
Thesis title in English: Searching classes in the Wikidata ontology
Academic year of topic announcement: 2023/2024
Thesis type: diploma thesis
Thesis language:
Department: Department of Software Engineering (32-KSI)
Supervisor: doc. Mgr. Martin Nečaský, Ph.D.
Author: hidden - assigned and confirmed by the Study Dept.
Date of registration: 27.02.2024
Date of assignment: 27.02.2024
Confirmed by Study dept. on: 27.02.2024
Guidelines
When users need to reuse an ontology containing millions of classes without prior knowledge of it, they struggle to find the classes relevant to their interests. One way to reuse such a large ontology is through the data structure creation process of the Dataspecer tool, which lets users create data structures from classes of various ontologies. The initial step of the creation process is the root selection phase, which aims to find a meaningful class in the chosen ontology to serve as the root of the data structure. One such ontology is the Wikidata ontology, part of the free and collaborative knowledge base Wikidata.

This thesis aims to design, implement, and evaluate multiple methods for class search in the Wikidata ontology, in the context of the root selection phase of the Dataspecer tool connected to that ontology. The root selection phase should disregard previously created data structures and the user's iterative search attempts. The designed solutions should consider both the structural features of the ontology and embedding methods while keeping the user interface as simple as possible.
References
[1] Yunfan Gao, Yun Xiong, Xinyu Gao, Kangxiang Jia, Jinliu Pan, Yuxi Bi, Yi Dai, Jiawei Sun, Qianyu Guo, Meng Wang, Haofen Wang. Retrieval-Augmented Generation for Large Language Models: A Survey. 2024. https://arxiv.org/abs/2312.10997v4
[2] Duy-Hoa Ngo, Madonna Kemp, Donna Truran, Bevan Koopman, Alejandro Metke-Jimenez. Semantic Search for Large Scale Clinical Ontologies. 2022. https://arxiv.org/abs/2201.00118v1
[3] F. Ilievski, K. Shenoy, H. Chalupsky, N. Klein, P. Szekely. A study of concept similarity in Wikidata. Semantic Web (preprint), pp. 1-20. 2023.
[4] Emma J. Gerritse, Faegheh Hasibi, Arjen P. de Vries. Graph-Embedding Empowered Entity Retrieval. 2020. https://arxiv.org/abs/2005.02843v1
 
Charles University | Information system of Charles University | http://www.cuni.cz/UKEN-329.html