Combining text-based and vision-based semantics
Thesis title in Czech: | Combining text-based and vision-based semantics |
---|---|
Title in English: | Combining text-based and vision-based semantics |
Keywords: | semantics, semantic similarity measurement, text, image, vector space model |
Keywords in English: | semantics, semantic similarity measurement, text, image, vector space model |
Academic year of announcement: | 2010/2011 |
Thesis type: | diploma thesis |
Thesis language: | English |
Department: | Institute of Formal and Applied Linguistics (32-UFAL) |
Supervisor: | RNDr. Martin Holub, Ph.D. |
Author: | hidden - assigned and confirmed by the Study Department |
Date of registration: | 12.11.2010 |
Date of assignment: | 12.11.2010 |
Date and time of defence: | 06.09.2011 00:00 |
Date of electronic submission: | 05.08.2011 |
Date of printed submission: | 05.08.2011 |
Date of completed defence: | 06.09.2011 |
Opponents: | RNDr. Jana Straková, Ph.D. |
Guidelines |
Automatic measurement of semantic similarity based on
unsupervised statistical vector space models has proved successful in many active and growing areas of research [1]. High-quality measurement is becoming increasingly important in many applied fields. This thesis focuses on integrating text-based and vision-based semantics. The Internet offers an opportunity to exploit the relationship between images and texts [6]. The goal of this thesis is to use both vision-based and text-based semantics to create a multimodal semantic space from images and texts in order to improve the measurement of semantic similarity. The student's task will be to build a semantic space model from corpora [3], [4] and to extract bags of visual words from images that share some topic characteristics or themes. The resulting multimodal semantic spaces should then be applied to tasks such as measuring word similarity or concept clustering [2], which may in turn be helpful in applications such as query reformulation in information retrieval [5]. The quality of the semantic similarity measurement should be tested using the Rubenstein and Goodenough similarity ratings, and/or TOEFL synonym tests, and/or noun/verb/concept clustering. |
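As an illustration only, the following minimal Python sketch shows one possible way to combine a text-based vector and a bag-of-visual-words vector for each word and to compare the resulting similarities with human ratings. It is not the method prescribed by this assignment. The weighted concatenation scheme, the parameter `alpha`, all function names, and all numbers (co-occurrence counts, visual word counts, and ratings) are assumptions invented for the example; the ratings are not taken from the Rubenstein and Goodenough data set.

```python
import math

def l2_normalize(vec):
    """Scale a vector to unit Euclidean length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)

def combine(text_vec, visual_vec, alpha=0.5):
    """Concatenate the normalized text and visual vectors.
    alpha weights the text part, (1 - alpha) the visual part (an assumed scheme)."""
    text_part = [alpha * x for x in l2_normalize(text_vec)]
    visual_part = [(1.0 - alpha) * x for x in l2_normalize(visual_vec)]
    return text_part + visual_part

def cosine(u, v):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu > 0 and nv > 0 else 0.0

def ranks(values):
    """1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx > 0 and sy > 0 else 0.0

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

# Hypothetical toy data: text co-occurrence counts and visual word counts.
text_vectors = {"car": [4, 1, 0], "automobile": [3, 1, 0], "forest": [0, 1, 5]}
visual_vectors = {"car": [2, 0, 1, 0], "automobile": [2, 0, 1, 1], "forest": [0, 3, 0, 1]}

# Multimodal space: one combined vector per word.
multimodal = {w: combine(text_vectors[w], visual_vectors[w]) for w in text_vectors}

# Invented human similarity ratings for word pairs (not real benchmark values).
rated_pairs = [("car", "automobile", 3.9), ("car", "forest", 0.4), ("automobile", "forest", 0.5)]

model_scores = [cosine(multimodal[a], multimodal[b]) for a, b, _ in rated_pairs]
human_scores = [rating for _, _, rating in rated_pairs]

print("model similarities:", [round(s, 3) for s in model_scores])
print("Spearman correlation with ratings:", round(spearman(model_scores, human_scores), 3))
```

In a real experiment, the toy dictionaries would be replaced by vectors built from the corpora and image collections, and the rank correlation with the human ratings would serve as one possible quality measure of the multimodal space.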
References |
[1] Turney, P. D. and Pantel, P. 2010. From Frequency to Meaning: Vector Space Models of Semantics. Journal of Artificial Intelligence Research (JAIR), 37(1):141-188. AI Access Foundation.
[2] Turney, P. D. 2006. Similarity of semantic relations. Computational Linguistics, 32(3):379-416.
[3] Baroni, M. and Lenci, A. 2010. Distributional Memory: A general framework for corpus-based semantics. Computational Linguistics. To appear.
[4] Baroni, M. and Zamparelli, R. 2010. Nouns are vectors, adjectives are matrices: Representing adjective-noun constructions in semantic space. Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP 2010), East Stroudsburg PA: ACL, 1183-1193.
[5] Manning, C. D., Raghavan, P., and Schuetze, H. 2008. Introduction to Information Retrieval. Cambridge University Press, Cambridge, UK.
[6] Hare, J. S., Samangooei, S., Lewis, P. H., and Nixon, M. S. 2008. Semantic spaces revisited: investigating the performance of auto-annotation and semantic retrieval using semantic spaces. In Proceedings of the 2008 international conference on Content-based image and video retrieval (CIVR '08). ACM, New York, NY, USA, 359-368. |
Preliminary scope of work |
The goal of this thesis is to use both vision-based and text-based semantics
to create a multimodal semantic space from images and texts in order to improve the measurement of semantic similarity. |
Preliminary scope of work in English |
The goal of this thesis is to use both vision-based and text-based semantics
to create a multimodal semantic space from images and texts in order to improve the measurement of semantic similarity. |