Thesis details
Mapping the Prague Dependency Treebank Annotation Scheme onto Robust Minimal Recursion Semantics
Thesis title in Czech:
Title in English:
Academic year of topic announcement: 2008/2009
Thesis type: diploma thesis
Thesis language: English
Department: Institute of Formal and Applied Linguistics (32-UFAL)
Supervisor / advisor: doc. RNDr. Markéta Lopatková, Ph.D.
Author: hidden - assigned and confirmed by the Student Affairs Office
Date of registration: 06.11.2008
Date of assignment: 06.11.2008
Date and time of defence: 01.02.2010 00:00
Date of electronic submission: 01.02.2010
Date of completed defence: 01.02.2010
Opponents: Mgr. Jan Štěpánek, Ph.D.
 
 
 
Guidelines
The Prague Dependency Treebank 2.0 [PDT 2.0, 2006] is a corpus of Czech texts with rich, interlinked annotation on three layers: morphological (2 million words), syntactic (1.5 million words), and complex semantic (0.8 million words). In keeping with the Praguian linguistic tradition (namely the Functional Generative Description, see [Sgall et al., 1986]), PDT adopts a dependency-based formalism for describing both the surface syntax and the complex semantic representation of a sentence.

Robust Minimal Recursion Semantics (RMRS) is a formal system for describing flat semantics, designed so that deep and shallow processing can use compatible semantic representations. RMRS is based on predicate calculus with generalized quantifiers, see [Copestake, 2004/2006]. An RMRS representation consists of a bag of labeled elementary predicates and their arguments, a list of scoping constraints, and a unique handle that provides a hook into the representation (using the same formalism as Minimal Recursion Semantics, see [Copestake et al., 2005]).
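
To make the three components named above more concrete, the following is a minimal Python sketch of how such a representation might be held in memory. It is an illustration only, not the data model used in the thesis or in any RMRS tool: the class and field names (Rmrs, ElementaryPredicate, Argument, ScopeConstraint), the predicate names, and the role names ARG1/ARG2 are all assumptions chosen for this example, as is the toy analysis of the Czech sentence "Pes hledá kost" ("The dog is looking for a bone").

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ElementaryPredicate:
    label: str      # handle that labels this predicate, e.g. "h1"
    predicate: str  # predicate name (hypothetical naming), e.g. "_pes_n"
    variable: str   # the predicate's own variable, e.g. "x2"


@dataclass(frozen=True)
class Argument:
    label: str  # label of the predicate the argument belongs to
    role: str   # argument role (assumed names), e.g. "ARG1"
    value: str  # variable or handle that fills the role


@dataclass(frozen=True)
class ScopeConstraint:
    hole: str   # handle that must be resolved
    label: str  # label that the hole is constrained to outscope


@dataclass
class Rmrs:
    top: str  # the unique handle that hooks into the representation
    predicates: list = field(default_factory=list)   # bag of predicates
    arguments: list = field(default_factory=list)    # kept separate from predicates
    constraints: list = field(default_factory=list)  # scoping constraints

    def arguments_of(self, label):
        """Collect the role -> value pairs attached to one predicate label."""
        return {a.role: a.value for a in self.arguments if a.label == label}


# Toy analysis of "Pes hledá kost"; all identifiers are illustrative.
example = Rmrs(
    top="h0",
    predicates=[
        ElementaryPredicate("h1", "_hledat_v", "e1"),
        ElementaryPredicate("h2", "_pes_n", "x2"),
        ElementaryPredicate("h3", "_kost_n", "x3"),
    ],
    arguments=[
        Argument("h1", "ARG1", "x2"),
        Argument("h1", "ARG2", "x3"),
    ],
    constraints=[ScopeConstraint("h0", "h1")],
)
```

Keeping the arguments in their own list, rather than inside each predicate, mirrors the description of a bag of predicates "and their arguments": a shallow processor that knows a predicate but not all of its arguments can then still contribute a partial representation.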

The goal of the thesis is to translate a sample subcorpus of PDT from the dependency-based annotation scheme into the RMRS formalism. The task is divided into several steps:
1. becoming familiar with the complex annotation scheme of PDT, http://ufal.mff.cuni.cz/pdt2.0/
2. becoming familiar with RMRS, [Copestake et al., 2005]
3. gathering the necessary surface and deep syntactic information from the different layers of PDT
4. designing and implementing scripts that convert PDT data into the RMRS representation
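
The conversion in step 4 could, under strong simplifying assumptions, look like the Python sketch below. It is not the method developed in the thesis. The TNode class is a toy stand-in for a node of the deep syntactic layer and does not reflect the real PDT data format; the mapping from the functors ACT and PAT to the roles ARG1 and ARG2 is an assumption made for illustration; and the sketch ignores scoping constraints and does not distinguish event variables from entity variables.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class TNode:
    """Toy dependency node (hypothetical, not the real PDT format)."""
    lemma: str
    functor: str
    children: list = field(default_factory=list)


# Assumed, illustrative mapping from functors to argument roles.
FUNCTOR_TO_ROLE = {"ACT": "ARG1", "PAT": "ARG2"}


def convert(root):
    """Walk the tree depth-first and return (predicates, arguments)."""
    ids = count(1)
    predicates = []  # tuples: (label, predicate name, variable)
    arguments = []   # tuples: (label of governing predicate, role, variable)

    def visit(node):
        n = next(ids)
        label, var = f"h{n}", f"x{n}"
        predicates.append((label, f"_{node.lemma}", var))
        for child in node.children:
            child_var = visit(child)
            role = FUNCTOR_TO_ROLE.get(child.functor)
            if role is not None:
                # The child's variable fills a role of the governing predicate.
                arguments.append((label, role, child_var))
        return var

    visit(root)
    return predicates, arguments


# "Pes hledá kost" as a toy tree: the verb governs an actor and a patient.
tree = TNode("hledat", "PRED", [TNode("pes", "ACT"), TNode("kost", "PAT")])
predicates, arguments = convert(tree)
```

Dependents whose functor has no entry in the mapping still yield a predicate but no argument link, so a real conversion would need an explicit treatment for each remaining functor.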

The RMRS formalism is used to represent English and German. Translating a large amount of Czech text from PDT into the RMRS representation should confirm that RMRS can cover linguistic phenomena typical of a typologically different language (an inflectional language with rich morphology and free word order).
References
[PDT 2.0, 2006]
http://ufal.mff.cuni.cz/pdt2.0/

[Sgall et al., 1986]
Petr Sgall, Eva Hajičová, Jarmila Panevová (1986) The Meaning of the Sentence in Its Semantic and Pragmatic Aspects. Dordrecht, Reidel.

[Copestake et al., 2005]
Ann Copestake, Daniel P. Flickinger, Carl Pollard, and Ivan A. Sag (2005) Minimal Recursion Semantics: An Introduction. Journal of Research on Language and Computation, Vol. 3, No. 4, p. 281-332.

[Copestake, 2004/2006]
Ann Copestake, Robust Minimal Recursion Semantics (Draft),
http://www.cl.cam.ac.uk/~aac10/papers.html
 
Charles University | Charles University Information System