Thesis details
Complex experiment support through the IVIS framework
Thesis title in Czech: Complex experiment support through the IVIS framework
Thesis title in English: Complex experiment support through the IVIS framework
Academic year of topic announcement: 2023/2024
Thesis type: diploma thesis
Thesis language:
Department: Department of Distributed and Dependable Systems (32-KDSS)
Supervisor: prof. RNDr. Tomáš Bureš, Ph.D.
Author: Bc. David Košťál - assigned and confirmed by the Study Department
Date of registration: 26.01.2024
Date of assignment: 29.01.2024
Date of confirmation by the Study Department: 01.02.2024
Guidelines
Many fields, such as business, healthcare, and technology, rely heavily on data-oriented systems. Data are collected and processed, and experiments are run on them, often as long series of calculations organised into complex workflows. This process needs to be modelled, managed, run, and evaluated.

The process also needs oversight. In addition to computing the outputs themselves, the outputs need to be evaluated against given metrics (e.g. accuracy). To fully explore the performance of an experiment run, data about its execution (including the computational resources used) also need to be evaluated. Computing the metrics is necessary to oversee the process and prevent its degradation, which may otherwise result, for instance, from distribution shifts in the data. It is also essential to be able to perform meta-analysis on historical data, summarising a series of experiments over a longer period. Furthermore, performing many experiments may lead to losing track of the origin of results and prevent repeatability, which is why support for traceability is essential.
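
As a rough illustration of the oversight described above, the following Python sketch computes an accuracy metric for a single run and summarises a history of runs. The names `RunRecord`, `accuracy`, and `summarise` are hypothetical and are not part of IVIS or ExtremeXP; the choice of wall-clock time as the recorded resource is likewise an assumption.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class RunRecord:
    """One experiment run: its metric value plus data about its execution (hypothetical)."""
    run_id: str
    accuracy: float
    wall_time_s: float  # assumed stand-in for "used computational resources"


def accuracy(predicted, expected):
    """Return the fraction of predictions that match the expected labels."""
    if len(predicted) != len(expected):
        raise ValueError("predicted and expected must have the same length")
    if not expected:
        raise ValueError("cannot compute accuracy of an empty run")
    correct = sum(p == e for p, e in zip(predicted, expected))
    return correct / len(expected)


def summarise(history):
    """Aggregate a series of runs, as a very simple form of meta-analysis."""
    return {
        "runs": len(history),
        "mean_accuracy": mean(r.accuracy for r in history),
        "total_wall_time_s": sum(r.wall_time_s for r in history),
    }
```

Keeping the metric value and the execution data in the same record is one way to let a later summary compare quality against cost; a real deployment would likely track more resources than time alone.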

Although tools and frameworks for data experimentation exist, these problems, especially ensuring traceability, are still not satisfactorily addressed by the state of the art.

Among other efforts, these problems have been a focal point of the currently running project ExtremeXP (Horizon Europe), to which this master's thesis contributes. Within this scope, the thesis aims to design and implement a data storage schema for viewing experiment results, comparing different approaches to an experiment, and ensuring traceability of the resulting data.

Technically, this will be implemented as an extension to the existing IVIS framework (developed at Charles University). The models proposed by the thesis and the corresponding implementation will specifically support the storage of the following data:
• Designed experiment models
• Features of the performed experiments with references to datasets
• Metadata needed for the traceability of the resulting data, including the metrics used
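
To make the three kinds of stored data listed above more concrete, here is a minimal Python sketch (Python 3.9+) of what such records might look like. It is an illustration only: the class names, field names, and the `trace` helper are assumptions made for this example and do not describe the schema designed in the thesis or any existing IVIS API.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class DatasetRef:
    """Reference to a dataset used by a run (hypothetical fields)."""
    dataset_id: str
    version: str


@dataclass
class ExperimentModel:
    """A designed experiment model: a named sequence of workflow steps."""
    model_id: str
    name: str
    workflow_steps: list[str]


@dataclass
class ExperimentRun:
    """A performed experiment: its features, dataset references, and metrics."""
    run_id: str
    model_id: str
    datasets: list[DatasetRef]
    parameters: dict[str, float]
    metrics: dict[str, float] = field(default_factory=dict)


def trace(run, models):
    """Collect the metadata that links a run's results back to their origin."""
    model = models[run.model_id]  # raises KeyError if the model is unknown
    return {
        "run": run.run_id,
        "model": model.name,
        "steps": list(model.workflow_steps),
        "datasets": [f"{d.dataset_id}@{d.version}" for d in run.datasets],
        "metrics_used": sorted(run.metrics),
    }
```

In this sketch a run refers to its model and datasets by identifier rather than embedding them, so the same model or dataset version can be shared by many runs and a result can still be traced back to exactly what produced it.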

The results will be evaluated within the context of the cases coming from the ExtremeXP project.
References
Wang, F., Liu, P., Pearson, J., Azar, F., & Madlmayr, G. (2006, April). Experiment management with metadata-based integration for collaborative scientific research. In 22nd International Conference on Data Engineering (ICDE'06) (pp. 96-96). IEEE.

Wlodarczyk, T. W. (2012, December). Overview of time series storage and processing in a cloud environment. In 4th IEEE International Conference on Cloud Computing Technology and Science Proceedings (pp. 625-628). IEEE.

https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=1617464

https://vldb.org/conf/1996/P274.PDF

Kuć, R., & Rogoziński, M. (2016). ElasticSearch server. Packt Publishing.

Kuć, R., & Rogoziński, M. (2013). Mastering ElasticSearch. Packt Publishing.
 