Thesis topics (Thesis selection)

Combining text-based and vision-based semantics

Thesis title in Czech:	Combining text-based and vision-based semantics
Thesis title in English:	Combining text-based and vision-based semantics
Keywords:	semantics, semantic similarity measurement, text, image, vector space model
Keywords in English:	semantics, semantic similarity measurement, text, image, vector space model
Academic year of topic announcement:	2010/2011
Thesis type:	diploma thesis
Thesis language:	English
Department:	Institute of Formal and Applied Linguistics (32-UFAL)
Supervisor:	RNDr. Martin Holub, Ph.D.
Author:	hidden - assigned and confirmed by the Student Affairs Office
Date of registration:	12.11.2010
Date of assignment:	12.11.2010
Date and time of defence:	06.09.2011 00:00
Date of electronic submission:	05.08.2011
Date of printed submission:	05.08.2011
Date of completed defence:	06.09.2011
Opponents:	RNDr. Jana Straková, Ph.D.

Guidelines

Automatic measurement of semantic similarity, supported by
unsupervised statistical vector space models, has proved successful in
many active and growing areas of research [1]. High-quality measurement
is becoming increasingly important in many applied fields.

This thesis focuses on the integration of text-based and vision-based
semantics. The Internet offers an opportunity to exploit the relation
between images and texts [6].

The goal of this thesis is to use both vision-based and text-based
semantics to create a multimodal semantic space from images and texts,
in order to improve the measurement of semantic similarity. The
student's task will be to build a semantic space model from corpora
[3], [4] and to extract bags of visual words from images that share
topic characteristics or themes.
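
One way to picture such a multimodal space is the sketch below. It is only an illustration under stated assumptions, not the method prescribed by this assignment: each word is assumed to already have a text-based co-occurrence vector and a bag-of-visual-words count vector, the two are L2-normalized and concatenated, and similarity is measured by cosine. The weight alpha, the function names, and all toy counts are hypothetical.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def combine(text_vec, visual_vec, alpha=0.5):
    """Concatenate the normalized text and visual vectors.

    alpha is an assumed tuning parameter: it weights the text modality,
    while 1 - alpha weights the visual modality.
    """
    text_part = [alpha * x for x in l2_normalize(text_vec)]
    visual_part = [(1.0 - alpha) * x for x in l2_normalize(visual_vec)]
    return text_part + visual_part


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b) if norm_a > 0 and norm_b > 0 else 0.0


# Invented toy counts: text co-occurrence features and visual-word histograms.
text_space = {
    "car": [10, 2, 0],
    "automobile": [8, 3, 1],
    "banana": [0, 1, 9],
}
visual_space = {
    "car": [5, 0, 1, 0],
    "automobile": [4, 1, 1, 0],
    "banana": [0, 6, 0, 3],
}

# The multimodal space holds one concatenated vector per word.
multimodal = {w: combine(text_space[w], visual_space[w]) for w in text_space}

sim_related = cosine(multimodal["car"], multimodal["automobile"])
sim_unrelated = cosine(multimodal["car"], multimodal["banana"])
```

With these toy counts, sim_related comes out much higher than sim_unrelated. Normalizing each modality before concatenation keeps the longer or denser vector from dominating the combined space; other fusion schemes are equally possible.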

The resulting multimodal semantic spaces should then be applied to
tasks such as measuring word similarity or concept clustering [2],
which may in turn be helpful in applications such as query
reformulation in information retrieval [5]. The quality of the
semantic similarity measurement should be tested using the Rubenstein
and Goodenough similarity ratings, and/or the TOEFL synonym test,
and/or noun/verb/concept clustering.
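
The two word-level evaluations named above could be sketched as follows. The assignment does not say which statistic to use against the human ratings; Spearman rank correlation is assumed here as one common choice. The function names, the rating values, and the toy similarity table are all hypothetical and are not taken from the actual Rubenstein and Goodenough or TOEFL data.

```python
import math


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y) if sd_x > 0 and sd_y > 0 else 0.0


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))


def choose_synonym(target, candidates, similarity):
    """TOEFL-style item: pick the candidate most similar to the target."""
    return max(candidates, key=lambda c: similarity(target, c))


# Invented scores for four word pairs: human ratings vs. model similarities.
human = [3.9, 3.1, 1.2, 0.4]
model = [0.91, 0.88, 0.20, 0.35]
rho = spearman(human, model)

# Invented similarity table standing in for a real semantic space.
toy_sim = {("enormous", "huge"): 0.9, ("enormous", "tiny"): 0.1}
answer = choose_synonym(
    "enormous",
    ["tiny", "huge", "green"],
    lambda a, b: toy_sim.get((a, b), 0.0),
)
```

A higher rho means the model orders the word pairs more like the human raters do; accuracy on the synonym test is simply the fraction of items for which choose_synonym returns the correct candidate.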

References

[1] Turney, P. and Pantel, P. 2010. From Frequency to Meaning:
Vector Space Models of Semantics. Journal of Artificial Intelligence
Research (JAIR), 37(1):141-188. AI Access Foundation.

[2] Turney, P. D. 2006. Similarity of semantic relations. Computational
Linguistics, 32(3):379-416.

[3] Baroni, M. and Lenci, A. 2010. Distributional Memory: A general
framework for corpus-based semantics. Computational Linguistics. To appear.

[4] Baroni, M. and Zamparelli, R. 2010. Nouns are vectors, adjectives are
matrices: Representing adjective-noun constructions in semantic
space. Proceedings of the Conference on Empirical Methods in Natural
Language Processing (EMNLP 2010), East Stroudsburg PA: ACL, 1183-1193.

[5] Manning, C. D., Raghavan, P., and Schuetze, H. 2008. Introduction to
Information Retrieval. Cambridge University Press, Cambridge, UK.

[6] Hare, J. S., Samangooei, S., Lewis, P. H., and Nixon, M. S.
2008. Semantic spaces revisited: investigating the performance of
auto-annotation and semantic retrieval using semantic spaces. In
Proceedings of the 2008 international conference on Content-based
image and video retrieval (CIVR '08). ACM, New York, NY, USA,
359-368.

Preliminary scope of work

The goal of this thesis is to use both vision-based and text-based
semantics to create a multimodal semantic space from images and texts,
in order to improve the measurement of semantic similarity.
