Comparison of Approaches for Querying of Chemical Compounds
Thesis title in Czech: | Porovnání přístupů k dotazování chemických sloučenin |
---|---|
Thesis title in English: | Comparison of Approaches for Querying of Chemical Compounds |
Key words in Czech: | Chemická databáze, Chemické sloučeniny, Benchmark, Hledání podgrafů, Grafová databáze, Izomorfismus podgrafů |
English key words: | Chemical database, Chemical Compounds, Benchmark, Subgraph querying, Graph database, Subgraph isomorphism |
Academic year of topic announcement: | 2015/2016 |
Thesis type: | diploma thesis |
Thesis language: | English |
Department: | Department of Software Engineering (32-KSI) |
Supervisor: | doc. RNDr. Irena Holubová, Ph.D. |
Author: | hidden - assigned and confirmed by the Study Dept. |
Date of registration: | 12.04.2016 |
Date of assignment: | 26.04.2017 |
Confirmed by Study dept. on: | 02.05.2017 |
Date and time of defence: | 17.06.2019 09:00 |
Date of electronic submission: | 10.05.2019 |
Date of submission of printed version: | 10.05.2019 |
Date the defence took place: | 17.06.2019 |
Opponents: | prof. RNDr. Jaroslav Pokorný, CSc. |
Advisors: | doc. RNDr. David Hoksza, Ph.D. |
Guidelines |
Chemical compounds form a distinctive kind of graph data set with its own patterns of use and querying. Various approaches to storing and querying chemical compounds currently exist: compounds can be represented as general graphs or as specialized strings (e.g., in the SMILES format), queried using dedicated languages (e.g., the SMARTS language), and indexed using dedicated indexes (e.g., GString). The aim of the thesis is to describe, discuss and, in particular, experimentally compare the existing approaches to efficient storage and querying of chemical compounds, including NoSQL graph databases and relational databases. |
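To make the graph view of a compound concrete, the following is a minimal sketch of substructure querying as labelled subgraph matching. It is only an illustration and not a method prescribed by the thesis or used by any of the cited systems: the helper names `make_graph` and `find_matches` are hypothetical, hydrogens are left implicit, aromaticity is ignored, and the search is plain backtracking with no index such as GString.

```python
# Hypothetical illustration: a molecule as a labelled graph and a
# substructure query as (non-induced) subgraph matching.
# Assumptions: implicit hydrogens, no aromaticity, no indexing.

def make_graph(atoms, bonds):
    """Build a graph from element symbols and (i, j, bond_order) triples."""
    adj = {i: {} for i in range(len(atoms))}
    for i, j, order in bonds:
        adj[i][j] = order
        adj[j][i] = order
    return atoms, adj


def find_matches(pattern, target):
    """Return all mappings (pattern atom index -> target atom index)."""
    p_atoms, p_adj = pattern
    t_atoms, t_adj = target
    results = []

    def extend(mapping):
        k = len(mapping)  # index of the next pattern atom to place
        if k == len(p_atoms):
            results.append(tuple(mapping))
            return
        for cand in range(len(t_atoms)):
            # Each target atom is used at most once; element must agree.
            if cand in mapping or t_atoms[cand] != p_atoms[k]:
                continue
            # Every pattern bond from atom k to an already placed atom
            # must exist in the target with the same bond order.
            if all(t_adj[cand].get(mapping[n]) == order
                   for n, order in p_adj[k].items() if n < k):
                extend(mapping + [cand])

    extend([])
    return results


# Ethanol (SMILES: CCO) and acetic acid (SMILES: CC(=O)O) as graphs.
ethanol = make_graph(["C", "C", "O"], [(0, 1, 1), (1, 2, 1)])
acetic_acid = make_graph(["C", "C", "O", "O"],
                         [(0, 1, 1), (1, 2, 2), (1, 3, 1)])

# Query patterns: a C-O single bond and a C=O double bond.
c_single_o = make_graph(["C", "O"], [(0, 1, 1)])
carbonyl = make_graph(["C", "O"], [(0, 1, 2)])

print(find_matches(c_single_o, ethanol))    # C-O occurs in ethanol
print(find_matches(carbonyl, acetic_acid))  # C=O occurs in acetic acid
print(find_matches(carbonyl, ethanol))      # no C=O in ethanol
```

In practice such a query would typically be written as a SMARTS pattern and evaluated by a cheminformatics toolkit or a database; the sketch only shows the underlying subgraph isomorphism problem that those systems have to solve efficiently.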
References |
Holubová, I. - Kosek, J. - Minařík, K. - Novák, D.: Big Data a NoSQL databáze. Grada, Prague, Czech Republic, October 2015. ISBN 978-80-247-5466-6. [http://www.ksi.mff.cuni.cz/bigdata/]
PubChem. http://pubchem.ncbi.nlm.nih.gov/
ZINC. http://zinc.docking.org/
ChEMBL. https://www.ebi.ac.uk/chembl/
SMILES - A Simplified Chemical Language. http://www.daylight.com/dayhtml/doc/theory/theory.smiles.html
SMARTS - A Language for Describing Molecular Patterns. http://www.daylight.com/dayhtml/doc/theory/theory.smarts.html
Haoliang Jiang, Haixun Wang, Shuigeng Zhou: GString: A Novel Approach for Efficient Search in Graph Databases. http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=4221705
Sherif Sakr - Eric Pardede: Graph Data Management: Techniques and Applications.
Vojtech Šípek: Vizuální dotazování v chemických databázích pomocí SMARTS vzorů. Bachelor thesis. MFF UK, 2014. |