Course, academic year 2025/2026
   
Natural language processing on computational cluster - NPFL118
Czech title: Zpracování přirozeného jazyka na výpočetním clusteru
Guaranteed by: Institute of Formal and Applied Linguistics (32-UFAL)
Faculty: Faculty of Mathematics and Physics
Valid: from 2025 to 2025
Semester: winter
E-Credits: 3
Hours per week, examination: winter s.: 0/2, C [HT]
Capacity: unlimited
Min. number of students: unlimited
4EU+: no
Virtual mobility / capacity: no
State of the course: taught
Language: Czech, English
Teaching methods: full-time
Additional information: http://ufal.mff.cuni.cz/courses/npfl118
Guarantor: RNDr. Milan Straka, Ph.D.
Mgr. Martin Popel, Ph.D.
Mgr. Rudolf Rosa, Ph.D.
Teacher(s): Mgr. Martin Popel, Ph.D.
Mgr. Rudolf Rosa, Ph.D.
RNDr. Milan Straka, Ph.D.
Class: DS, Mathematical Linguistics
Classification: Informatics > Computer and Formal Linguistics
Annotation -
The aim of the course is to introduce methods required in natural language processing (processing huge data sets in a distributed environment and performing machine learning) and to show how to execute them effectively on the ÚFAL computational Linux cluster. The course will cover the ÚFAL network and cluster architecture, SGE (Sun/Oracle/Son of Grid Engine), related Linux tools, and best practices. The whole course will be taught in the first several weeks of the semester.
Last update: Mírovský Jiří, RNDr., Ph.D. (09.04.2026)
Course completion requirements -

Solving the given assignments and active participation during the course.

To be able to meaningfully participate in the course and to complete the assignments, it is necessary to have access to the ÚFAL computational cluster. The course is therefore highly suitable for ÚFAL PhD students, but unsuitable for other students, apart from exceptional cases.

Last update: Mírovský Jiří, RNDr., Ph.D. (08.04.2026)
Literature -

Data-Intensive Text Processing with MapReduce; Jimmy Lin and Chris Dyer; Morgan & Claypool Publishers, 2010

Slurm - https://slurm.schedmd.com/

Apache Spark - https://spark.apache.org/

TensorFlow - https://www.tensorflow.org/

Last update: Mírovský Jiří, RNDr., Ph.D. (08.04.2026)
Syllabus -

Technological difficulties with processing big data

ÚFAL network and cluster architecture

Slurm - architecture, commands

Related Linux tools

Last update: Mírovský Jiří, RNDr., Ph.D. (08.04.2026)
 