Towards concept visualization through image generation
| Field | Value |
|---|---|
| Thesis title in Czech | Vizualizace konceptů pomocí generování obrazu |
| Title in English | Towards concept visualization through image generation |
| Keywords (Czech) | jazyk, obraz, sémantika |
| Keywords (English) | text2image, Cross-modal Mapping, Distributed Semantics, Convolutional Neural Networks, Visual Feature Inversion |
| Academic year of announcement | 2014/2015 |
| Thesis type | master's thesis |
| Language | English |
| Department | Ústav formální a aplikované lingvistiky (32-UFAL) |
| Supervisor | doc. RNDr. Pavel Pecina, Ph.D. |
| Author | hidden - assigned and confirmed by the student affairs office |
| Date of registration | 19.03.2015 |
| Date of assignment | 27.03.2015 |
| Date confirmed by the student affairs office | 15.07.2015 |
| Defense date and time | 03.02.2016 09:00 |
| Electronic submission date | 21.01.2016 |
| Printed submission date | 04.12.2015 |
| Date of defense | 03.02.2016 |
| Reviewers | doc. Ing. Zdeněk Žabokrtský, Ph.D. |
Guidelines
Computational linguistics and computer vision share a common approach to embedding the semantics of linguistic/visual units in vector representations. Moreover, thanks to recent advances in neural network methods, high-quality semantic representations can now be constructed effectively. Nevertheless, our understanding of these representations remains limited, so they need to be assessed in an intuitive way.
Cross-modal mapping is a mapping between the vector semantic embeddings of words and the visual representations of the corresponding objects in images. Inverting an image representation means learning to reconstruct the original image from its visual feature vectors (SIFT, HOG, and CNN features). The goal of this project is to build a complete pipeline in which word representations are transformed into image vectors using cross-modal mapping, and these vectors are then projected into pixel space using inversion. This suggests a novel way to inspect and evaluate the semantics encoded in word representations: generating pictures that depict them.
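The two stages of the proposed pipeline can be sketched with simple linear (ridge-regression) maps on synthetic data. This is only an illustrative toy, not the thesis method: the dimensions, the choice of a closed-form linear mapping, and all variable names (`word_vecs`, `vis_feats`, `ridge_fit`, etc.) are assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy data: 100 words with 50-dim embeddings, paired with
# 256-dim visual feature vectors extracted from 64-pixel images.
n, d_word, d_vis, d_pix = 100, 50, 256, 64
word_vecs = rng.normal(size=(n, d_word))
images = rng.normal(size=(n, d_pix))
# Simulated visual features: an arbitrary fixed transform of the images.
feat_proj = rng.normal(size=(d_pix, d_vis))
vis_feats = images @ feat_proj

def ridge_fit(X, Y, lam=1.0):
    """Closed-form ridge regression: W = (X^T X + lam*I)^-1 X^T Y."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ Y)

# Stage 1: cross-modal mapping from word space to visual feature space.
W_cross = ridge_fit(word_vecs, vis_feats)
# Stage 2: feature inversion from visual feature space back to pixel space.
W_inv = ridge_fit(vis_feats, images)

# Full pipeline: word vector -> visual features -> pixels.
generated = word_vecs @ W_cross @ W_inv
print(generated.shape)  # (100, 64)
```

In the actual project, the inversion stage would operate on real SIFT/HOG/CNN features and could be a learned non-linear decoder rather than a linear map; the linear version only makes the two-stage structure concrete.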
References
[1] Angeliki Lazaridou, Elia Bruni, and Marco Baroni. Is this a wampimuk? Cross-modal mapping between distributional semantics and the visual world. In Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics, 2014.
[2] C. Vondrick, A. Khosla, T. Malisiewicz, and A. Torralba. HOGgles: Visualizing Object Detection Features. In The IEEE International Conference on Computer Vision (ICCV), 2013.
[3] Karol Gregor, Ivo Danihelka, Alex Graves, and Daan Wierstra. DRAW: A Recurrent Neural Network For Image Generation. Google DeepMind.
[4] Tomas Mikolov, Kai Chen, Greg Corrado, and Jeffrey Dean. Efficient Estimation of Word Representations in Vector Space. In Proceedings of Workshop at ICLR, 2013.