Course, academic year 2018/2019
Practical Fundamentals of Probability and Statistics for Computer Linguistics - NPFL081
Title in Czech: Praktické základy pravděpodobnosti a statistiky pro komputační lingvistiku
Guaranteed by: Institute of Formal and Applied Linguistics (32-UFAL)
Faculty: Faculty of Mathematics and Physics
Valid: from 2017
Semester: winter
E-Credits: 3
Hours per week, examination: winter s.:0/2 C [hours/week]
Capacity: unlimited
Min. number of students: unlimited
State of the course: taught
Language: English
Teaching methods: full-time
Additional information:
Guarantor: RNDr. Martin Holub, Ph.D.
Class: Informatics (Master's) - elective
Classification: Informatics > Computer and Formal Linguistics
Annotation -
Last update: SLEZA (22.05.2007)
ONLY for students in the EM Program in LCT, see
The aim of the course is to introduce the elementary probabilistic and statistical principles, techniques and methods used to solve computational linguistics (natural language processing) tasks. An essential part of the course is active work with data and an introduction to the R workflow while solving a given task. Part of the course will consist of individual study of mutually agreed materials.
Course completion requirements
Last update: RNDr. Martin Holub, Ph.D. (15.10.2017)

Students should regularly attend the classes and pass two written tests during the term and/or complete an assigned task in R. Both theoretical knowledge and practical skills will be tested.

Literature
Last update: RNDr. Martin Holub, Ph.D. (15.10.2017)

Ross, Sheldon M. A First Course in Probability (7th ed.). Prentice Hall, 2005.

Gonick, Larry and Woollcott Smith. The Cartoon Guide to Statistics. Harper Resource, 2005.

Syllabus -
Last update: RNDr. Martin Holub, Ph.D. (15.10.2017)
  • mathematical probability, its definition and calculation
  • random variable (discrete and continuous) and its probability distribution
  • distribution function, quantile function, density
  • statistical independence
  • expected value and variance
  • properties of binomial and normal distributions
  • random sampling
  • parameters of distributions, parameter estimation, t-test
  • statistical hypothesis testing, critical values
  • contingency tables, hypothesis testing in contingency tables
  • chi-square distribution, chi-square tests
  • entropy, conditional entropy, mutual information
  • basics of programming in the R system (
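
As an illustration only of the syllabus items on contingency tables, chi-square tests, entropy and mutual information, the following is a minimal sketch. It is written in Python rather than the R used in the course, the function names are hypothetical, and the counts are invented for the example; none of it is taken from the course materials.

```python
import math


def entropy(probs):
    """Shannon entropy in bits; zero-probability outcomes contribute nothing."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def table_stats(table):
    """Return (chi2, H(X), H(Y), H(Y|X), I(X;Y)) for a table of counts.

    Rows index X, columns index Y. Assumes every row and column total
    is positive, so that all expected counts are non-zero.
    """
    n = sum(sum(row) for row in table)
    row_tot = [sum(row) for row in table]
    col_tot = [sum(col) for col in zip(*table)]

    # Pearson chi-square statistic: sum of (observed - expected)^2 / expected,
    # where expected = row total * column total / n under independence.
    chi2 = sum(
        (obs - r * c / n) ** 2 / (r * c / n)
        for row, r in zip(table, row_tot)
        for obs, c in zip(row, col_tot)
    )

    h_x = entropy(r / n for r in row_tot)
    h_y = entropy(c / n for c in col_tot)
    h_xy = entropy(obs / n for row in table for obs in row)  # joint entropy
    h_y_given_x = h_xy - h_x          # chain rule: H(Y|X) = H(X,Y) - H(X)
    mutual_info = h_y - h_y_given_x   # I(X;Y) = H(Y) - H(Y|X)
    return chi2, h_x, h_y, h_y_given_x, mutual_info


# Invented 2x2 counts, for illustration only.
example = [[30, 10], [20, 40]]
chi2, h_x, h_y, h_y_given_x, mi = table_stats(example)
print(f"chi2={chi2:.3f}  H(X)={h_x:.3f}  H(Y)={h_y:.3f}  "
      f"H(Y|X)={h_y_given_x:.3f}  I(X;Y)={mi:.3f}")
```

For a 2x2 table the statistic would be compared against a critical value of the chi-square distribution with one degree of freedom. A table whose cells match the expected counts exactly gives a statistic of zero and zero mutual information.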
