Course, academic year 2024/2025
Language Technologies in Practice - NPFL128
Title: Jazykové technologie v praxi
Guaranteed by: Institute of Formal and Applied Linguistics (32-UFAL)
Faculty: Faculty of Mathematics and Physics
Valid: from 2021
Semester: summer
E-Credits: 4
Hours per week, examination: summer s.:2/1, MC [HT]
Capacity: unlimited
Min. number of students: unlimited
4EU+: no
Virtual mobility / capacity: no
State of the course: taught
Language: English
Teaching methods: full-time
Additional information: https://ufal.mff.cuni.cz/courses/npfl128
Guarantor: RNDr. Jiří Hana, Ph.D.
Teacher(s): RNDr. Jiří Hana, Ph.D.
Incompatibility: NPFL096
Interchangeability: NPFL096
Is incompatible with: NPFL096
Is interchangeable with: NPFL096
Annotation -
The course surveys solutions to common NLP tasks ranging from entity recognition to text generation. It evaluates various approaches (machine learning, rules, larger resources, ...) and their combinations. Part of the course consists of students presenting and discussing papers relevant to a given topic. Each student implements a prototype system solving a particular task.
Last update: Vidová Hladká Barbora, doc. Mgr., Ph.D. (31.01.2019)
Course completion requirements -
  • leading discussion on selected papers (max 2 papers per person)
  • programming project

Last update: Hana Jiří, RNDr., Ph.D. (10.06.2019)
Literature -
  • Koskenniemi, Kimmo. 1983. Two-level Morphology: A General Computational Model for Word-Form Recognition and Production. University of Helsinki, Department of General Linguistics.
  • Goldsmith, John. 2001. Unsupervised Learning of the Morphology of a Natural Language.
  • Yarowsky, David and Richard Wicentowski. 2000. Minimally supervised morphological analysis by multimodal alignment. Proceedings of ACL-2000, Hong Kong, pages 207-216.
  • Schone, Patrick and Daniel Jurafsky. 2001. Knowledge-Free Induction of Inflectional Morphologies. Proceedings of the North American Chapter of the Association for Computational Linguistics.
  • Cucerzan, Silviu. 2007. Large-Scale Named Entity Disambiguation Based on Wikipedia Data.
  • Daiber, Joachim, Max Jakob, Chris Hokamp and Pablo N. Mendes. 2013. Improving Efficiency and Accuracy in Multilingual Entity Extraction. Proceedings of the 9th International Conference on Semantic Systems (I-Semantics).
  • Surdeanu, Mihai, David McClosky, Mason R. Smith, Andrey Gusev and Christopher D. Manning. 2011. Customizing an Information Extraction System to a New Domain. Proceedings of the ACL 2011 Workshop on Relational Models of Semantics.
  • Reiter, Ehud and Robert Dale. 2000. Building Natural Language Generation Systems. Cambridge University Press.

Last update: Vidová Hladká Barbora, doc. Mgr., Ph.D. (31.01.2019)
Syllabus -
  • processing morphology
    • engineering approach to morphology, lemmatization
    • unsupervised and lightly-supervised morphology
    • Linguistica, Yarowsky & Wicentowski 2000, Schone & Jurafsky 2001, Morfessor
  • sentiment analysis
  • entities
    • named, unnamed and structured entities
    • recognition, normalization, standardization
    • linking, knowledge graphs
  • intent detection
  • relation extraction
  • Natural Language Generation (NLG)
    • generation of documents vs. short texts/phrases
    • classical NLG vs. neural NLG
    • document planning, microplanning, lexicalization, realization

Last update: Vidová Hladká Barbora, doc. Mgr., Ph.D. (31.01.2019)
     