Thesis (Selection of subject) (version: 368)
Thesis details
High-performance exploration and querying of selected multi-dimensional spaces in life sciences
Thesis title in Czech: Vysoce výkonné prohledávání a dotazování ve vybraných mnohadimenzionálních prostorech v přírodních vědách
Thesis title in English: High-performance exploration and querying of selected multi-dimensional spaces in life sciences
Key words: vysokodimenzionální data; vyhledávání informací; chemoinformatika; redukce dimenzionality; cytometrie; vyhledávání multimédií
English key words: high-dimensional data; information retrieval; cheminformatics; dimensionality reduction; cytometry; multimedia retrieval
Academic year of topic announcement: 2015/2016
Thesis type: dissertation
Thesis language: English
Department: Department of Software Engineering (32-KSI)
Supervisor: RNDr. David Bednárek, Ph.D.
Author: hidden - assigned and confirmed by the Study Dept.
Date of registration: 25.09.2015
Date of assignment: 25.09.2015
Confirmed by Study dept. on: 05.10.2015
Date and time of defence: 15.12.2020 10:45
Date of electronic submission: 09.09.2020
Date of submission of printed version: 03.09.2020
Date of completed defence: 15.12.2020
Opponents: Enrico Glaab
  prof. Mgr. Daniel Svozil, Ph.D.
 
 
Guidelines
Two areas of computing have been particularly successful at employing parallelism: high-performance computing and database technology. However, these areas use fundamentally different programming approaches: procedural programming (FORTRAN, C) and declarative or functional languages (SQL, XQuery), respectively. Thus, applications that combine computationally demanding parts with large, complex data require frequent and inefficient switching between the two programming environments. A number of attempts exist to bridge the gap, ranging from in-database analytics to data-centric programming languages such as Hadoop Pig or ECL; nevertheless, none of these approaches is generally applicable.

The goal of this thesis is to evaluate existing approaches and to design a programming environment targeted at parallel and distributed data processing. The environment shall allow database technology elements, similar in principle to tables and joins, to be interwoven with procedural elements such as arrays, loops, and branches. The environment may consist of languages, libraries, code transformations, compilers, schedulers, and/or runtime elements.
References
Nadathur Satish, Changkyu Kim, Jatin Chhugani, Hideki Saito, Rakesh Krishnaiyer, Mikhail Smelyanskiy, Milind Girkar, and Pradeep Dubey. 2012. Can traditional programming bridge the Ninja performance gap for parallel computing applications? In Proceedings of the 39th Annual International Symposium on Computer Architecture (ISCA '12). IEEE Computer Society, Washington, DC, USA, 440-451.
 
Charles University | Information system of Charles University | http://www.cuni.cz/UKEN-329.html