Thesis (Selection of subject) (version: 285)
Assignment details
Experimental Evaluation of Big Data Generator BDgen
Thesis title in Czech: Experimental Evaluation of Big Data Generator BDgen
Thesis title in English: Experimental Evaluation of Big Data Generator BDgen
Key words: Big Data, data generating, project BDgen
English key words: Big Data, data generating, project BDgen
Academic year of topic announcement: 2018/2019
Type of assignment: diploma thesis
Thesis language:
Department: Department of Software Engineering (32-KSI)
Supervisor: doc. RNDr. Irena Holubová, Ph.D.
Author:
Guidelines
A number of Big Data generators currently exist, but their capabilities are often limited. The software project BDgen was implemented with an emphasis on universality, generality and extensibility. The authors of the project have conducted preliminary experiments demonstrating its features.

The aim of the thesis is to become acquainted with the project in detail and to compare it extensively with similar existing tools, both in a textual survey and through experiments. A related goal is to experimentally test selected Big Data processing tools using the outputs of BDgen. If necessary, appropriate debugging or extension of the project is also expected.
References
BDgen http://bdgen.skvaril.net/

Holubová, I. - Kosek, J. - Minařík, K. - Novák, D.: Big Data a NoSQL databáze. Grada, Prague, Czech Republic, October 2015. ISBN 978-80-247-5466-6. (http://www.ksi.mff.cuni.cz/bigdata/)

BigDataBench, A Big Data Benchmark Suite. ICT, Chinese Academy of Sciences. http://prof.ict.ac.cn/BigDataBench

Ming, Z. - Luo, C. - Gao, W. - Han, R. - Yang, H. - Wang, L. - Zhan, J.: BDGS: A Scalable Big Data Generator Suite in Big Data Benchmarking. CoRR abs/1401.5465 (2014). http://arxiv.org/abs/1401.5465

Sakr, S. - Gaber, M.: Large Scale and Big Data: Processing and Management.

Rabl, T.: Big Data Generation. Middleware System Research Group, University of Toronto.

Transaction Processing Performance Council (TPC): http://www.tpc.org/
Preliminary scope of work
The aim of the thesis is to become familiar with the defended software project BDgen and with the tests and experiments conducted on it so far, as well as with similar tools and their corresponding experiments. Based on this knowledge, the author will design and carry out more detailed and extensive experiments that demonstrate the features of the project and compare it with similar tools. In addition, the author will use the outputs of the generator to experimentally evaluate selected tools for processing Big Data.
 
Charles University | Information system of Charles University | http://www.cuni.cz/UKEN-329.html