Sentence representations with similarity interpretation
Thesis title in Czech: | Větné reprezentace s interpretací podobnosti |
---|---|
Thesis title in English: | Sentence representations with similarity interpretation |
Keywords (Czech): | neuronové sítě|větné embeddingy |
Keywords in English: | neural networks|sentence embeddings |
Academic year of topic announcement: | 2022/2023 |
Thesis type: | diploma thesis |
Thesis language: | Czech |
Department: | Institute of Formal and Applied Linguistics (32-UFAL) |
Supervisor: | Mgr. Vojtěch Hudeček, Ph.D. |
Author: | hidden |
Date of registration: | 15.02.2023 |
Date of assignment: | 15.02.2023 |
Date of confirmation by the Study Department: | 12.12.2023 |
Date and time of defence: | 13.02.2024 09:00 |
Date of electronic submission: | 11.01.2024 |
Date of printed submission: | 11.01.2024 |
Date of completed defence: | 13.02.2024 |
Opponents: | Mgr. Jindřich Libovický, Ph.D. |
Guidelines |
Obtaining sentence representations (embeddings) from neural network models is now common practice. However, their use mostly relies on simple similarity measures. This is sufficient in most cases, but for some systems it may lack desirable features.
The goal of this work is to explore novel ways of using sentence embeddings so that they are better suited to working with structured data or provide more semantic information when measuring similarity. To achieve this, we plan to incorporate different sentence annotations into the representation learning process. Such representations would be able to provide specific information about why two texts are similar, on either the syntactic or the semantic level. Possible applications include retrieval-based systems and building knowledge bases from unstructured text. |
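As a rough illustration of the contrast drawn above, the sketch below compares a single cosine similarity over whole embeddings with similarities computed over separate slices of the same vectors. It is only a toy example under stated assumptions: the vectors are made-up numbers rather than model output, and the helper `segment_similarities`, the feature names and the slice boundaries are hypothetical and not taken from the thesis.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def segment_similarities(u, v, segments):
    """Cosine similarity for each named slice of the embedding.

    `segments` maps a (hypothetical) feature name to a (start, end) index range.
    """
    return {name: cosine(u[s:e], v[s:e]) for name, (s, e) in segments.items()}


# Toy 4-dimensional "embeddings" (made-up numbers, not model output).
a = [1.0, 0.0, 1.0, 0.0]
b = [1.0, 0.0, 0.0, 1.0]

# Hypothetical split of the vector into two interpretable parts.
segments = {"feature_a": (0, 2), "feature_b": (2, 4)}

overall = cosine(a, b)                               # one opaque score
per_feature = segment_similarities(a, b, segments)   # one score per slice

print(overall)
print(per_feature)
```

In this toy case the overall score is moderate, while the per-slice scores show that the two vectors agree fully on one part and not at all on the other, which is the kind of explanation a single similarity value cannot give.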
References |
Juri Opitz and Anette Frank. 2022. SBERT studies Meaning Representations: Decomposing Sentence Embeddings into Explainable Semantic Features. In Proceedings of the 2nd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 12th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), pages 625–638, Online only. Association for Computational Linguistics.
Nils Reimers and Iryna Gurevych. 2019. Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), pages 3982–3992, Hong Kong, China. Association for Computational Linguistics.
Nils Reimers and Iryna Gurevych. 2020. Making Monolingual Sentence Embeddings Multilingual using Knowledge Distillation. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 4512–4525, Online. Association for Computational Linguistics. |