Výpočetní modely slovotvorby
|Thesis title in Czech:||Výpočetní modely slovotvorby|
|Thesis title in English:||Computational Models of Word Formation|
|Key words:||vektorová reprezentace slov, slovotvorba, morfologie|
|English key words:||vector space models, word formation, morphology|
|Academic year of topic announcement:||2023/2024|
|Type of assignment:||dissertation|
|Department:||Institute of Formal and Applied Linguistics (32-UFAL)|
|Supervisor:||doc. Ing. Zdeněk Žabokrtský, Ph.D.|
|Until very recently, word formation data resources harmonized across multiple natural languages were almost non-existent (Batsuren et al., 2019; Kyjánek et al., 2019), which limited the development of models whose validity could be tested empirically in a multilingual setting. The aim of the thesis is to develop, implement, and evaluate word formation models that make use of modern distributional vector space word representations (word embedding models), with a special focus on derivational morphology (Bonami & Paperno, 2018) and on multilingual aspects (Ruder et al., 2019). Optionally, the optimization criteria used in the models can be interpreted in terms of information theory and might reflect hierarchical interactions in a language’s vocabulary, biological and cognitive biases relevant to natural languages, and language evolution perspectives.|
| Batsuren, K., Bella, G., & Giunchiglia, F. (2019, July). CogNet: A Large-Scale Cognate Database. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics (pp. 3136-3145).
 Kyjánek, L., Žabokrtský, Z., Ševčíková, M., & Vidra, J. (2019). Universal Derivations Kickoff: A Collection of Harmonized Derivational Resources for Eleven Languages. In Proceedings of the Second International Workshop on Resources and Tools for Derivational Morphology (pp. 101-110).
 Bonami, O., & Paperno, D. (2018). Inflection vs. derivation in a distributional vector space. Lingue e linguaggio, 17(2), 173-196.
 Ruder, S., Vulić, I., & Søgaard, A. (2019). A survey of cross-lingual word embedding models. Journal of Artificial Intelligence Research, 65, 569-631.