Thesis (Selection of subject)
Thesis details
Mapping the Prague Dependency Treebank Annotation Scheme onto Robust Minimal Recursion Semantics
Thesis title in Czech:
Thesis title in English:
Academic year of topic announcement: 2008/2009
Thesis type: diploma thesis
Thesis language: English
Department: Institute of Formal and Applied Linguistics (32-UFAL)
Supervisor: doc. RNDr. Markéta Lopatková, Ph.D.
Author: hidden - assigned and confirmed by the Study Dept.
Date of registration: 06.11.2008
Date of assignment: 06.11.2008
Date and time of defence: 01.02.2010 00:00
Date of electronic submission: 01.02.2010
Date of proceeded defence: 01.02.2010
Opponents: Mgr. Jan Štěpánek, Ph.D.
 
 
 
Guidelines
The Prague Dependency Treebank 2.0 [PDT 2.0, 2006] is a corpus containing a large amount of Czech text with complex and interlinked morphological (2 million words), syntactic (1.5 million words) and complex semantic annotation (0.8 million words). In keeping with the Praguian linguistic tradition (namely the Functional Generative Description, see [Sgall et al., 1986]), a dependency-based formalism was adopted in PDT for describing both the surface syntax and the complex semantic representation of a sentence.

Robust Minimal Recursion Semantics (RMRS) is a formal system for describing flat semantics which is designed to allow deep and shallow processing to use compatible semantic representations. RMRS is based on predicate calculus with generalized quantifiers, see [Copestake, 2004/2006]. The RMRS representation consists of a bag of labeled elementary predicates and their arguments, a list of scoping constraints, and a unique handle that provides a hook into the representation (using the same formalism as Minimal Recursion Semantics, see [Copestake et al., 2005]).
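
To make the components listed above concrete, the following is a minimal sketch of how such a representation could be held in memory. All class and field names (`ElementaryPredicate`, `Argument`, `Qeq`, `Rmrs`, `args_of`) are assumptions made for illustration and are not taken from the RMRS literature or from any existing tool. The toy analysis of "every dog barks" is hand-written and only shows where each component goes.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class ElementaryPredicate:
    label: str      # handle that labels this predicate, e.g. "h4"
    predicate: str  # predicate name, e.g. "_dog_n"
    arg0: str       # the predicate's own variable, e.g. "x1"


@dataclass(frozen=True)
class Argument:
    label: str  # label of the predicate this argument belongs to
    role: str   # argument role, e.g. "ARG1" or "RSTR"
    value: str  # a variable or a handle


@dataclass(frozen=True)
class Qeq:
    hole: str   # handle-valued argument position
    label: str  # label that must be reachable from that position


@dataclass
class Rmrs:
    top: str                                                        # the unique top handle
    eps: List[ElementaryPredicate] = field(default_factory=list)    # bag of predicates
    args: List[Argument] = field(default_factory=list)              # their arguments
    hcons: List[Qeq] = field(default_factory=list)                  # scoping constraints


def args_of(rmrs: Rmrs, label: str) -> List[Argument]:
    """Collect the arguments attached to the predicate with the given label."""
    return [a for a in rmrs.args if a.label == label]


# Hand-written toy analysis of "every dog barks" (illustrative only).
example = Rmrs(
    top="h0",
    eps=[
        ElementaryPredicate("h1", "_every_q", "x1"),
        ElementaryPredicate("h4", "_dog_n", "x1"),
        ElementaryPredicate("h5", "_bark_v", "e1"),
    ],
    args=[
        Argument("h1", "RSTR", "h2"),
        Argument("h1", "BODY", "h3"),
        Argument("h5", "ARG1", "x1"),
    ],
    hcons=[Qeq("h2", "h4")],
)
```

Keeping the arguments in a list separate from the predicates means that a shallow processor, which may know a predicate but not its arguments, can still emit a valid if underspecified structure.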

The goal of the thesis is to translate a sample subcorpus of PDT from the dependency-based annotation scheme into the RMRS formalism. The task is divided into several steps:
1. becoming familiar with the complex annotation scheme of PDT, http://ufal.mff.cuni.cz/pdt2.0/
2. becoming familiar with RMRS, [Copestake et al., 2005]
3. gathering the necessary surface and deep syntactic information from the different layers of PDT
4. designing and implementing scripts for converting PDT data into the RMRS representation
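
As a rough illustration of what step 4 might involve, the sketch below walks a toy dependency tree and emits flat predicates and argument relations. It is not the method of the thesis: the `TNode` class, the `convert` function, the `_lemma_pos` naming of predicates, and in particular the `ROLE_MAP` table from PDT functors (ACT, PAT, ADDR) to numbered argument roles are all assumptions made for this example. Real PDT data is stored in a richer format and would require considerably more care.

```python
from dataclasses import dataclass, field
from itertools import count
from typing import List, Tuple

# Hypothetical mapping from PDT functors to RMRS-style argument roles.
ROLE_MAP = {"ACT": "ARG1", "PAT": "ARG2", "ADDR": "ARG3"}


@dataclass
class TNode:
    """A simplified stand-in for a node of a deep-syntactic dependency tree."""
    lemma: str
    pos: str       # coarse part of speech: "v", "n", ...
    functor: str   # relation to the parent, e.g. "PRED", "ACT", "PAT"
    children: List["TNode"] = field(default_factory=list)


def convert(root: TNode) -> Tuple[list, list]:
    """Flatten a tree into (predicates, arguments) in depth-first order."""
    ids = count(1)
    eps, args = [], []

    def visit(node: TNode) -> str:
        i = next(ids)
        label = f"h{i}"
        # Verbs get an event variable, everything else an entity variable.
        var = ("e" if node.pos == "v" else "x") + str(i)
        eps.append((label, f"_{node.lemma}_{node.pos}", var))
        for child in node.children:
            child_var = visit(child)
            role = ROLE_MAP.get(child.functor)
            if role is not None:  # unmapped functors are skipped in this sketch
                args.append((label, role, child_var))
        return var

    visit(root)
    return eps, args


# Toy tree for the Czech sentence "Petr čte knihu" (Petr reads a book).
tree = TNode("číst", "v", "PRED", [
    TNode("Petr", "n", "ACT"),
    TNode("kniha", "n", "PAT"),
])
eps, args = convert(tree)
```

The sketch ignores quantifiers and scoping constraints entirely, and silently drops dependents whose functor has no entry in the table. A real conversion would have to decide how to represent those.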

The RMRS formalism has been used to represent English and German. Translating a large amount of Czech text from PDT into the RMRS representation should confirm the ability of RMRS to cover linguistic phenomena typical of a typologically different language (an inflective language with rich morphology and free word order).
References
[PDT 2.0, 2006]
http://ufal.mff.cuni.cz/pdt2.0/

[Sgall et al., 1986]
Petr Sgall, Eva Hajičová, Jarmila Panevová (1986) The Meaning of the Sentence in Its Semantic and Pragmatic Aspects. Dordrecht, Reidel.

[Copestake et al., 2005]
Ann Copestake, Daniel P. Flickinger, Carl Pollard, and Ivan A. Sag (2005) Minimal Recursion Semantics: An Introduction. Journal of Research on Language and Computation, Vol. 3, No. 4, pp. 281-332.

[Copestake, 2004/2006]
Ann Copestake, Robust Minimal Recursion Semantics (Draft),
http://www.cl.cam.ac.uk/~aac10/papers.html