Course, academic year 2024/2025
Natural Language Processing - NPFL124
Title: Zpracování přirozeného jazyka
Guaranteed by: Institute of Formal and Applied Linguistics (32-UFAL)
Faculty: Faculty of Mathematics and Physics
Actual: from 2020
Semester: summer
E-Credits: 4
Hours per week, examination: summer s.:2/1, C+Ex [HT]
Capacity: unlimited
Min. number of students: unlimited
4EU+: no
Virtual mobility / capacity: no
State of the course: taught
Language: Czech, English
Teaching methods: full-time
Additional information: https://ufal.mff.cuni.cz/courses/npfl124
Guarantor: doc. Ing. Zdeněk Žabokrtský, Ph.D.
Teacher(s): doc. RNDr. Ondřej Bojar, Ph.D.
Mgr. Jindřich Helcl, Ph.D.
Mgr. Jindřich Libovický, Ph.D.
doc. RNDr. Pavel Pecina, Ph.D.
RNDr. Daniel Zeman, Ph.D.
doc. Ing. Zdeněk Žabokrtský, Ph.D.
Annotation
The goal of the course is to give students knowledge of, and hands-on experience with, basic (mostly statistical) methods in Natural Language Processing. Students will become acquainted with fundamental components such as corpora and language models, as well as with complex end-user applications such as machine translation.
Last update: Vidová Hladká Barbora, doc. Mgr., Ph.D. (03.05.2019)
Course completion requirements

To pass the course, you need to submit homework assignments and pass a written test.

Homework assignments

  • Assignments will be set in class and specified on the course website.
  • To get the credit, you need at least 50% of the total achievable points for the assignments.
  • If you miss the deadline, there is a second deadline two weeks later, but your points for the assignment will be multiplied by 0.5; after the second deadline, you get 0 points.

Test

  • There will be a written test at the end of the semester.
  • To pass the exam, you need at least 50% of the total points on the test.

Grading

  • Your grade is based on the average of your performance; the test and the homework assignments are weighted 1:1.
  • ≥ 90%: grade 1 (excellent)
  • ≥ 70%: grade 2 (very good)
  • ≥ 50%: grade 3 (good)
  • < 50%: grade 4 (fail)
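
The rules above can be sketched in a few lines of Python. This is only an illustration, not an official calculator: the function names are hypothetical, "two weeks" is assumed to mean 14 days after the first deadline, and a student who misses either 50% threshold is assumed to receive grade 4 regardless of the average.

```python
def assignment_points(raw_points: float, days_late: int) -> float:
    """Points credited for one assignment under the late-submission rule."""
    if days_late <= 0:
        return raw_points          # submitted by the first deadline
    if days_late <= 14:            # assumption: second deadline = 14 days later
        return raw_points * 0.5    # late points are multiplied by 0.5
    return 0.0                     # after the second deadline


def final_grade(homework_pct: float, test_pct: float) -> int:
    """Grade (1-4) from homework and test scores given as percentages."""
    # Assumption: failing either 50% threshold means failing the course.
    if homework_pct < 50 or test_pct < 50:
        return 4
    average = (homework_pct + test_pct) / 2   # test and homework weighted 1:1
    if average >= 90:
        return 1
    if average >= 70:
        return 2
    return 3                       # average is at least 50 at this point


if __name__ == "__main__":
    print(assignment_points(10, days_late=7))
    print(final_grade(homework_pct=80, test_pct=60))
```

For example, 80% on the homework and 60% on the test average to 70%, which falls in the grade 2 band.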
Last update: Žabokrtský Zdeněk, doc. Ing., Ph.D. (13.06.2019)
Literature

Electronic study materials are provided for each lecture.

Recommended literature beyond the basic requirements:

  • Manning, C. D., Schuetze, H.: Foundations of Statistical Natural Language Processing. MIT Press, Cambridge, 1999.
  • Koehn, P.: Statistical Machine Translation. Cambridge University Press, New York, 2010.
  • Manning, C., Raghavan, P., Schuetze, H.: Introduction to Information Retrieval. Cambridge University Press, 2008.

Last update: Vidová Hladká Barbora, doc. Mgr., Ph.D. (25.01.2018)
Syllabus
  • Motivation for NLP. Probability models and information theory, basic notions.
  • Language models, smoothing.
  • Hidden Markov models.
  • Language data resources, experiments in NLP.
  • Morphological tagging.
  • Syntactic analysis.
  • Overview of machine translation approaches.
  • Statistical machine translation.
  • Linguistic features in machine translation.
  • Information retrieval.
  • Term weights.
  • Document classification and clustering.
  • Word embeddings.

Last update: Vidová Hladká Barbora, doc. Mgr., Ph.D. (25.01.2018)
 