NLP Applications - NPFL093
Czech title: Aplikace NLP
Guaranteed by: Institute of Formal and Applied Linguistics (32-UFAL)
Faculty: Faculty of Mathematics and Physics
Valid: from 2012
Semester: summer
E-Credits: 5
Hours per week, examination: summer s.:2/1 MC [hours/week]
Capacity: unlimited
Min. number of students: unlimited
State of the course: taught
Language: Czech, English
Teaching methods: full-time
Guarantor: doc. RNDr. Vladislav Kuboň, Ph.D.
Class: Informatics, Master's - Mathematical Linguistics
Classification: Informatics > Computer and Formal Linguistics
Annotation -
Last update: T_UFAL (10.05.2010)

The main goal of the course is to introduce the basic types of natural language processing (NLP) applications and to give students a chance to work with some of them in seminars. The course covers machine translation, machine-aided human translation tools, localization tools, information retrieval and extraction, question answering, speech recognition, spelling and grammar checking, text generation, and related applications.
Literature -
Last update: T_UFAL (10.05.2010)

Handbook of Natural Language Processing, eds. N. Indurkhya and F. Damerau, CRC Press, 2010.

Foundations of Statistical Natural Language Processing, C. Manning and H. Schütze, MIT Press, 1999.

Syllabus -
Last update: T_UFAL (10.05.2010)

1. Introduction - an overview of basic application components.

2. Spelling checker

Dictionary-based methods vs. checking for illegal character combinations, string similarity metrics, communication with the user.
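As an illustration of a string similarity metric used in dictionary-based spelling checking, the following minimal sketch computes the Levenshtein (edit) distance and uses it to rank correction candidates. The syllabus does not say which metrics the course covers, so the choice of Levenshtein distance, the function names `levenshtein` and `suggest`, and the `max_distance` threshold are all assumptions made for this example.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions, and substitutions."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]  # turning a[:i] into the empty string takes i deletions
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute (or keep) the character
        prev = curr
    return prev[-1]


def suggest(word, dictionary, max_distance=2):
    """Return dictionary words within max_distance edits, closest first (hypothetical helper)."""
    scored = sorted((levenshtein(word, w), w) for w in dictionary)
    return [w for d, w in scored if d <= max_distance]
```

For example, `suggest("speling", ["spelling", "spewing", "grammar"])` keeps the two words that are one edit away and drops "grammar".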

3. Grammar checking

Error patterns vs. syntactic analysis, types of detectable errors, interaction with the user, RFODG and LanGR.

4. Machine-assisted human translation

Translation memory and its variants in commercial products, controlled language, glossary hierarchies.
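The core idea of a translation memory, reusing the stored translation of the most similar previously translated segment, can be sketched as follows. This is only an illustrative sketch, not how any particular commercial product works: the function name `best_match`, the 0.75 threshold, and the use of the Python standard library's `difflib.SequenceMatcher` as the similarity measure are assumptions, and the sample segments in the usage note are made up.

```python
from difflib import SequenceMatcher


def best_match(segment, memory, threshold=0.75):
    """Return (source, target, score) for the most similar stored segment, or None.

    `memory` maps source-language segments to their stored translations.
    """
    best = None
    for source, target in memory.items():
        # ratio() is a similarity score in [0, 1]; 1.0 means identical strings
        score = SequenceMatcher(None, segment.lower(), source.lower()).ratio()
        if best is None or score > best[2]:
            best = (source, target, score)
    if best is not None and best[2] >= threshold:
        return best
    return None  # no stored segment is similar enough
```

With a memory such as `{"Open the file.": "Otevřete soubor."}`, the query "Open the files." would return the stored pair as a fuzzy match for the translator to post-edit, while an unrelated segment returns `None`.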

5. Machine Translation

Google Translate vs. rule-based commercial systems (Systran, PC Translator), quality evaluation methods, evaluation of translation competitions, the Euromatrix project.

6. Localization

Differences between translation and localization, commercial localization tools.

7. Generation

Text generation from the tectogrammatical layer.

8. Information retrieval and extraction

Basic models, evaluation metrics, text similarity metrics, lemmatization, stop words, the role of linguistic tools, the Malach project.
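Among the evaluation metrics commonly used in information retrieval are precision, recall, and their harmonic mean F1. The sketch below shows how they are computed for a single query; the syllabus does not list the specific metrics taught, so this selection and the function name `precision_recall_f1` are assumptions for illustration.

```python
def precision_recall_f1(retrieved, relevant):
    """Compute precision, recall, and F1 for one query.

    `retrieved` holds the document IDs returned by the system,
    `relevant` holds the IDs judged relevant for the query.
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)  # relevant documents that were actually returned
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    f1 = 2 * precision * recall / (precision + recall) if hits else 0.0
    return precision, recall, f1
```

If a system returns four documents of which two are among the three relevant ones, precision is 2/4, recall is 2/3, and F1 is 4/7.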

9. Question answering

Dialog systems, multimodal communication.

10. Speech synthesis and recognition

Basic problems and algorithms.

11. Semantic web

Exploitation of linguistic methods for searching for information on the web, the role of the tectogrammatical layer.