SubjectsSubjects(version: 970)
Course, academic year 2024/2025
   Login via CAS
Introductory Seminar on Mathematical Linguistics I - NPFL002
Title: Úvodní seminář matematické lingvistiky I
Guaranteed by: Institute of Formal and Applied Linguistics (32-UFAL)
Faculty: Faculty of Mathematics and Physics
Actual: from 2020
Semester: winter
E-Credits: 3
Hours per week, examination: winter s.:0/2, C [HT]
Capacity: unlimited
Min. number of students: unlimited
4EU+: no
Virtual mobility / capacity: no
State of the course: cancelled
Language: Czech
Teaching methods: full-time
Guarantor: doc. RNDr. Vladimír Petkevič, CSc.
Class: Informatika Mgr. - volitelný
Classification: Informatics > Computer and Formal Linguistics
Is co-requisite for: NPFL031
Annotation -
In the seminar the subject of mathematical (computational) linguistics is defined, its foundations and relation to related fields such as: general linguistics, computer science, artificial intelligence, various special fields of mathematics (especially the theory of formal languages and automata theory, algebra and other). The main objective consists in how natural language (rather than a formal one, e.g. a programming language) can be processed by exact mathematical and computer science methods and formalisms, the stress being laid on morphology and syntax. The basic elements of the theory of
Last update: T_UFAL (23.05.2003)
Literature - Czech

F. Čermák (1994): Jazyk a jazykověda. Pražská imaginace. Praha.

V. Fromkin, R. Rodman (1978): An Introduction to Language. Holt, Rinehart and Winston, New York.

A. V. Gladkij, I. A. Mel'čuk (1969): Elementy matematičeskoj lingvistiki. Moskva.

A. V. Gladkij (1973): Formal'nyje grammatiki i jazyki. Moskva.

Hajič J. (in press): Disambiguation of Rich Inflection (Computational Morphology of Czech), Vol. 1. Karolinum Charles University Press, Prague.

Harrison, M.A. (1978): Introduction to Formal Language Theory. Addison-Wesley Publ. Comp, Inc., Reading (Mass.).

Hajičová E., J. Panevová, P. Sgall (2002): Úvod do teoretické a počítačové lingvistiky. I. svazek - Teoretická lingvistika. Nakladatelství Karolinum, Univerzita Karlova v Praze, Praha.

Roland Hausser (2001): Foundations of Computational Linguistics. Human-Computer Communication in Natural Language. Springer-Verlag, Berlin, Heidelberg, New York.

J. E. Hopcroft, J. D. Ullman (1978): Formálne jazyky a automaty. Alfa, Bratislava.

Chomsky N. (1957): Syntactic Structures. Mouton, The Hague.

M. Chytil (1978): Teorie automatů a formálních jazyků. [Lecture notes.] SPN, Praha.

M. Chytil (1984): Automaty a gramatiky. SNTL Praha.

Karlsson F., Voutilainen A., Heikkilä J., Antilla A. (eds.) (1995): Constraint Grammar. A Language-Independent System for Parsing Unrestricted Text. Mouton de Gruyter, Berlin New York.

G. K. Krulee (1991): Computer Processing of Natural Language. Prentice-Hall, Inc., New Jersey.

J. Lyons (1968): Introduction to General Linguistics. Cambridge University Press, Cambridge.

Marcus S. (1969): Algebraické modely v lingvistice. Academia, Praha.

Mel'čuk I. A. (1988): Dependency Syntax. State University of New York Press, Albany.

Nebeský L. (1987): Kombinatorické vlastnosti větných struktur. Acta Universitatis Carolinae, Philologica, Monographia XCV - 1987, Univerzita Karlova, Praha.

Novotný M. (1988): S algebrou od jazyka ke gramatice a zpět. Academia, Praha.

Oliva K., M. Hnátková, V. Petkevič, P. Květoň (2000): The Linguistic Basis of a Rule-Based Tagger of Czech. Text, Speech and Dialogue. Proceedings of the Third International Workshop, TSD 2000, LNAI 1902, Brno 2000, 3-8.

B.H. Partee, A. ter Meulen, R. E. Wall (1990): Mathematical Methods in Linguistics. Kluwer Academic Publishers. Dordrecht, Boston, London.

Roche E., Y. Schabes (eds.) (1997): Finite-State Language Processing. The Massachusetts Institute of Technology, Mass.

Rozenberg, G., A. Salomaa (eds.) (1997): Handbook of Formal Languages. Springer-Verlag Berlin Heidelberg.

Sgall P. et al. (1986): Úvod do syntaxe a sémantiky. Academia, Praha.

Tesniere L. (1959): Eléments de syntaxe structurale. Klincksieck, Paris.

Last update: T_UFAL (23.05.2003)
Syllabus -
A. What is mathematical linguistics (ML in the sequel)?

  • definition of concepts: natural language processing (NLP); computational linguistics; formal linguistics; statistical linguistics; quantitative linguistics; algebraic linguistics; applied linguistics
  • relation of ML to mathematics, especially to: the theory of formal languages and automata; algebra; mathematical logic; probability and statistics; relation of ML to logic and to computer science
  • natural and artificial languages
  • computational analysis and synthesis of language: an outline
  • the tasks and objectives of ML

B. The foundations of the mathematical theory of languages: theory of formal languages and automata

  • formal generative grammars - Chomsky hierarchy of grammars and automata and its linguistic motivation
  • generative and recognizing procedures
  • recursion, recursive and recursively-enumerable languages
  • relation of formal grammars and languages (natural and artificial)
  • linguistically motivated examples of formal languages generated by these grammars
  • generative (weak) and explicative power of grammars

C. Immediate-constituent phrase-structure grammars

  • motivation and definition of immediate-constituent grammars
  • main features of immediate-constituent grammars
  • word order and hierarchical constituent structure
  • languages generated by immediate-constituent grammars and their properties
  • immediate-constituent tree

D. Introduction to transformational grammars

  • base component, phrase-structure, phrase-structure grammar, phrase-marker
  • transformation (preserving and nonpreserving meaning), transformational grammar
  • deep and surface structure of sentence
  • pros and cons of transformational grammars

E. Introduction to automata theory

  • automata as mathematical structures (devices)
  • types of automata from the viewpoint of their weak (generative) power and their relation to formal grammars: finite-state automata, pushdown store automata, linear-bounded automata, Turing machines
  • automata as acceptors and generators
  • examples of natural (sub)languages processed by various kinds of automata
  • the exploitation of finite-state automata for the processing of morphology and syntax of natural languages; two-level morphology

F. Dependency grammars

  • basic characteristics of dependency grammars
  • generative and explicative power of dependency grammars, their comparison with the Chomsky hierarchy and their status within the theory of formal languages
  • algebraic sentence representation in the dependency framework: directed graph
  • word order
  • projectivity and nonprojectivity of syntactic structures, unbounded dependencies
  • coordination and its description in dependency grammar
  • comparison of immediate-constituent and dependency grammars from the aspect of explicative power
  • convergence of immediate-constituent and dependency grammars: attempt at reconciliation

G. Categorial grammars

  • basic characteristics of categorial grammars
  • reasons for introducing categorial grammars and their adequacy for various types of languages
  • relation of categorial grammars to immediate-constituent and dependency grammars

F. Structural complexity of natural languages from the Chomsky hierarchy perspective

  • Capabilities and limitations of regular and context-free grammars for the description of various morphological and syntactic phenomena in natural languages
  • Context-sensitive languages. Typically context-sensitive structures in natural languages
  • Grammars on the boundary between context-free and context-sensitive languages: indexed grammars, tree-adjoining (TAG) grammars, head grammars

H. Automatic morphological analysis and disambiguation of natural languages

  • main objective and problems of morphological analysis
  • lemmatization of word forms and multi-word expressions
  • morphological disambiguation, its problems and methods
  • ambiguity in natural languages, its types and methods of discrimination

Last update: T_UFAL (23.05.2003)
 
Charles University | Information system of Charles University | http://www.cuni.cz/UKEN-329.html