Joint Learning of Syntax and Semantics
Thesis title in Czech: | Joint Learning of Syntax and Semantics |
---|---|
Thesis title in English: | Joint Learning of Syntax and Semantics |
Key words: | sémantika, syntaxe, joint learning, latentní proměnné, jazyková nezávislost |
English key words: | semantics, syntax, joint learning, latent variables, language-independent |
Academic year of topic announcement: | 2012/2013 |
Thesis type: | diploma thesis |
Thesis language: | English |
Department: | Institute of Formal and Applied Linguistics (32-UFAL) |
Supervisor: | doc. RNDr. Ondřej Bojar, Ph.D. |
Author: | hidden - assigned and confirmed by the Study Dept. |
Date of registration: | 11.11.2011 |
Date of assignment: | 11.11.2011 |
Confirmed by Study dept. on: | 07.12.2012 |
Date and time of defence: | 21.01.2013 00:00 |
Date of electronic submission: | 07.12.2012 |
Date of submission of printed version: | 07.12.2012 |
Date of proceeded defence: | 21.01.2013 |
Opponents: | RNDr. David Mareček, Ph.D. |
Guidelines |
Recovering the full meaning of a text requires structural analysis of both its syntax and its semantics. Current state-of-the-art approaches treat the modeling of syntax (syntactic parsing) and semantics (semantic role labelling) separately. Fully specifying hand-crafted features for joint modeling of syntax and semantics is impractical, because the complex structures involved and their interactions are not well understood. Latent variables that are automatically induced from training data (Lluis and Marquez 2008) can be used to capture both structures without the need for hand-crafted features.
The goal of this thesis is to develop a model for joint learning of syntax and semantics, defined as a multi-task machine learning problem with latent variables. Appropriate supervised learning and parsing algorithms will be devised and evaluated using several measures of parsing and role labelling accuracy. The thesis will use the publicly available data in seven languages from the CoNLL 2009 shared task (Hajič et al. 2009) and, optionally, the Prague Czech-English Dependency Treebank 2.0 (once available). |
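As an illustration of the kind of measures the evaluation could rely on, the following minimal sketch computes a labeled attachment score for syntactic dependencies, a labeled F1 for semantic dependencies, and their plain average as a joint score. It is a simplified approximation, not the official CoNLL 2009 scorer; the function names, the tuple encodings, and the averaging step are assumptions made for this example.

```python
from typing import List, Set, Tuple

# Hypothetical encodings chosen for this sketch:
# one (head index, dependency label) pair per token,
Arc = Tuple[int, str]
# and one (predicate index, argument index, role label) triple per semantic dependency.
SemDep = Tuple[int, int, str]


def labeled_attachment_score(gold: List[Arc], pred: List[Arc]) -> float:
    """Fraction of tokens whose predicted head and label both match the gold arc."""
    if len(gold) != len(pred):
        raise ValueError("gold and predicted arcs must cover the same tokens")
    if not gold:
        return 0.0
    correct = sum(1 for g, p in zip(gold, pred) if g == p)
    return correct / len(gold)


def semantic_labeled_f1(gold: Set[SemDep], pred: Set[SemDep]) -> float:
    """Harmonic mean of precision and recall over labeled semantic dependencies."""
    if not gold or not pred:
        return 0.0
    true_positives = len(gold & pred)
    precision = true_positives / len(pred)
    recall = true_positives / len(gold)
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def joint_score(las: float, sem_f1: float) -> float:
    """Plain average of the syntactic and semantic scores (an assumed combination)."""
    return (las + sem_f1) / 2


# Toy three-token sentence: the last arc and one role label are predicted wrongly.
gold_arcs = [(2, "SBJ"), (0, "ROOT"), (2, "OBJ")]
pred_arcs = [(2, "SBJ"), (0, "ROOT"), (2, "ADV")]
gold_sem = {(2, 1, "A0"), (2, 3, "A1")}
pred_sem = {(2, 1, "A0"), (2, 3, "A2")}

las = labeled_attachment_score(gold_arcs, pred_arcs)
sem_f1 = semantic_labeled_f1(gold_sem, pred_sem)
print(las, sem_f1, joint_score(las, sem_f1))
```

Reporting the syntactic and semantic scores separately as well as jointly makes it visible whether a joint model improves one task at the expense of the other.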
References |
Andrea Gesmundo, James Henderson, Paola Merlo, and Ivan Titov. 2009. A Latent Variable Model of Synchronous Syntactic-Semantic Parsing for Multiple Languages. In CoNLL 2009 Shared Task, Conference on Computational Natural Language Learning (CoNLL-09), Boulder, Colorado, USA.
Xavier Lluis and Lluis Marquez. 2008. A joint model for parsing syntactic and semantic dependencies. In Proceedings of CoNLL 2008, pages 188–192, Manchester, UK.
Jan Hajič, Massimiliano Ciaramita, Richard Johansson, Daisuke Kawahara, Maria Antònia Martí, Lluís Màrquez, Adam Meyers, Joakim Nivre, Sebastian Padó, Jan Štěpánek, Pavel Straňák, Mihai Surdeanu, Nianwen Xue, and Yi Zhang. 2009. The CoNLL-2009 Shared Task: Syntactic and Semantic Dependencies in Multiple Languages. In Proceedings of the Thirteenth Conference on Computational Natural Language Learning (CoNLL): Shared Task, pages 1–18. ACL. ISBN 978-1-932432-29-9. |