Course, academic year 2023/2024
Natural language processing on computational cluster - NPFL118
Czech title: Zpracování přirozeného jazyka na výpočetním clusteru
Guaranteed by: Institute of Formal and Applied Linguistics (32-UFAL)
Faculty: Faculty of Mathematics and Physics
Valid: from 2022
Semester: winter
E-Credits: 3
Hours per week, examination: winter s.:0/2, C [HT]
Capacity: unlimited
Min. number of students: unlimited
4EU+: no
Virtual mobility / capacity: no
State of the course: taught
Language: Czech, English
Teaching methods: full-time
Additional information: http://ufal.mff.cuni.cz/courses/npfl118
Guarantor: RNDr. Milan Straka, Ph.D.
Mgr. Martin Popel, Ph.D.
Mgr. Rudolf Rosa, Ph.D.
Annotation -
Last update: T_UFAL (04.05.2017)
The aim of the course is to introduce methods required in natural language processing (processing huge data sets in a distributed environment and performing machine learning) and to show how to execute them effectively on the ÚFAL computational Linux cluster. The course will cover the ÚFAL network and cluster architecture, SGE (Sun/Oracle/Son of Grid Engine), related Linux tools, and best practices. The whole course will be taught in the first several weeks of the semester.
Course completion requirements -
Last update: Mgr. Rudolf Rosa, Ph.D. (26.09.2022)

Solving the given assignments and active participation during the course.

To be able to meaningfully participate in the course and to complete the assignments, it is necessary to have access to the ÚFAL computational cluster. The course is therefore highly suitable for ÚFAL PhD students, but unsuitable for other students, apart from exceptional cases.

Literature -
Last update: Mgr. Martin Popel, Ph.D. (01.10.2022)

Data-Intensive Text Processing with MapReduce; Jimmy Lin and Chris Dyer; Morgan & Claypool Publishers, 2010

Slurm - https://slurm.schedmd.com/

Apache Spark - https://spark.apache.org/

TensorFlow - https://www.tensorflow.org/

Syllabus -
Last update: Mgr. Martin Popel, Ph.D. (01.10.2022)

Technological difficulties with processing big data

ÚFAL network and cluster architecture

Slurm - architecture, commands

Related Linux tools

 