The course gives an overview of state-of-the-art techniques for content-based similarity search in multimedia databases (MDB) or, more generally, in collections of unstructured data. Unlike classic (relational) databases with exact-match querying, an MDB requires extracting features from the multimedia objects and retrieving them by similarity. The second part of the course focuses on indexing, which makes searching the database efficient (fast).
Last update: T_KSI (28.04.2005)
Similarity Search - The Metric Space Approach, P. Zezula, G. Amato, V. Dohnal, M. Batko, Springer, 2006
Image Databases, V. Castelli, L.D. Bergman (eds.), Wiley, 2002
Metric Indexing in Information Retrieval, T. Skopal, Ph.D. thesis, TU Ostrava, 2004 (available on the lecturer's website)
Multimedia Systems and Content-based Management, S. Deb, Idea Group Publishing, 2004
Image Retrieval, C. Jorgensen, Scarecrow Press, 2003
+ internet resources and links on the lecturer's website
Last update: Skopal Tomáš, prof. RNDr., Ph.D. (18.04.2006)
1. Introduction: Multimedia databases (MDB). The motivation for searching in MDB, applications. Modalities of searching and querying. Text-based vs. content-based retrieval. Feature extraction and similarity measures. Indexing.
2. MDB formalization, querying semantics, similarity as relevance to a query. Query result quality (effectiveness) and search performance (efficiency), and how both are measured.
3. Feature extraction and similarity measures. Vector representations, strings/sequences, sets, graphs. Properties of similarity measures, metric axioms. Discussion about measures and similarity theories.
4. Queries - range query, k nearest neighbors, reverse nearest neighbor, closest pair, similarity join.
5. Retrieval modalities. Querying, relevance feedback, browsing, navigation in query result, classification. Application interfaces. Examples.
6. Applications: Image retrieval (colors, textures and shape features). Fingerprint, iris, music, protein, text and XML retrieval.
7. Mapping methods and dimensionality reduction. Approximation vs. filtration. Latent semantics as a part of feature extraction. Linear projections: LSI, random projections, FastMap, SparseMap, MetricMap. Non-linear projections.
8. Metric access methods vs. spatial access methods. The curse of dimensionality. Distance distribution and intrinsic dimensionality.
9. Static metric access methods (MAM): (m)vp-tree, gh-tree, GNAT.
10. Dynamic MAM: M-tree and its modifications, PM-tree, LPM-tree.
11. Pivot-based methods. Global and local pivots. M-tree vs. LAESA. vp-forest, D-index. Pivot selection methods.
12. Approximate and probabilistic methods of similarity search. AC and PAC search. Non-metric search.
Last update: T_KSI (28.04.2005)