Thesis (Selection of subject)


A benchmark for querying multi-model data

Thesis title in Czech:	Benchmark pro dotazování multi-modelových dat
Thesis title in English:	A benchmark for querying multi-model data
Academic year of topic announcement:	2024/2025
Thesis type:	diploma thesis
Thesis language:
Department:	Department of Software Engineering (32-KSI)
Supervisor:	Ing. Pavel Koupil, Ph.D.
Author:

Guidelines

NoSQL and multi-model systems address the so-called variety of big data. Besides the relational model, we also distinguish, e.g., hierarchical (document) and graph data, and both the expressive power and the efficiency of querying differ across these representations.

The aim of this thesis is to survey existing benchmarks, especially those for multi-model or NoSQL database systems, and to identify their limitations. Based on this analysis, the student will either extend an existing benchmark or design an entirely new one that reflects the characteristics of modern database systems and their typical query languages. Finally, the student will experimentally validate the proposed approach.

References

Zhang, Chao, et al. "UniBench: A benchmark for multi-model database management systems." Technology Conference on Performance Evaluation and Benchmarking. Springer, Cham, 2018.

Conrad, André, et al. "EvoBench: Benchmarking Schema Evolution in NoSQL." Technology Conference on Performance Evaluation and Benchmarking. Springer, Cham, 2021.

Holubová, Irena, Pavel Koupil, and Jiaheng Lu. "Self-Adapting Design and Maintenance of Multi-Model Databases." Proceedings of the 26th International Database Engineered Applications Symposium. 2022.

Lu, Jiaheng, and Irena Holubová. "Multi-model Data Management: What's New and What's Next?" Proceedings of the 20th International Conference on Extending Database Technology. 2017.

https://db-engines.com/en/ranking

Preliminary scope of work

NoSQL and multi-model systems represent the so-called variety of big data. In addition to the relational model, we also distinguish, for example, hierarchical (document) or graph data, and the capabilities and efficiency of querying differ across these representations.

The aim of the thesis is to focus on existing benchmarks, primarily those for multi-model or NoSQL database systems, and to identify their limitations. Based on this analysis, the student will extend an existing benchmark or design an entirely new one that reflects the properties of modern database systems and the query languages typical for them. Finally, the student will carry out an experimental verification of the proposed approach.