Course, academic year 2017/2018
Introduction to Machine Learning - NPFL054
Czech title: Úvod do strojového učení
Guaranteed by: Institute of Formal and Applied Linguistics (32-UFAL)
Faculty: Faculty of Mathematics and Physics
Actual: from 2014 to 2017
Semester: winter
E-Credits: 6
Hours per week, examination: winter s.:2/2 C+Ex [hours/week]
Capacity: unlimited
Min. number of students: unlimited
State of the course: taught
Language: Czech, English
Teaching methods: full-time
Additional information: https://ufal.mff.cuni.cz/course/npfl054
Guarantor: Mgr. Barbora Vidová Hladká, Ph.D.
RNDr. Martin Holub, Ph.D.
Class: DS, mathematical linguistics
Computer Science (Bc.)
Computer Science (Mgr.) - Mathematical Linguistics
Classification: Informatics > Informatics, Software Applications, Computer Graphics and Geometry, Database Systems, Didactics of Informatics, Discrete Mathematics, External Subjects, General Subjects, Computer and Formal Linguistics, Optimization, Programming, Software Engineering, Theoretical Computer Science
Annotation -
Last update: Mgr. Barbora Vidová Hladká, Ph.D. (25.01.2018)

This one-semester introductory course provides both the theoretical background and the basic machine learning algorithms, explained independently of any single application domain and illustrated on a broad spectrum of multidisciplinary applications. The lab sessions are application-oriented and aim at practical experience with machine learning in different fields. The course is intended for students of the bachelor study programme. Introductory knowledge of probability and statistics is required. The course can be taught either in Czech or in English, depending on the students' preference.
Course completion requirements -
Last update: Mgr. Barbora Vidová Hladká, Ph.D. (09.10.2017)

Students have to submit three assignments (Homework #1, #2, #3) during the term so that their total homework score exceeds the required score limit:

  • Total max score: 90 pts
  • Required score limit: 65 pts

Students have to pass three written tests during the term so that their total test score exceeds the required score limit:

  • Test 1 max score: 20 pts
  • Test 2 max score: 20 pts
  • Test 3 max score: 100 pts
  • Total max score: 140 pts
  • Required score limit: 75 pts

Obtaining the course credit is a prerequisite for taking the examination in the course.

Details about the homework assignments and tests are published on the course site.

Literature -
Last update: Mgr. Barbora Vidová Hladká, Ph.D. (25.01.2018)

• Mitchell, Tom: Machine Learning. McGraw-Hill, 1997.

• James, Gareth, Daniela Witten, Trevor Hastie and Robert Tibshirani: An Introduction to Statistical Learning. Springer, 2013.

• Lantz, Brett: Machine Learning with R. Packt Publishing, 2013.

Requirements for the exam -
Last update: Mgr. Barbora Vidová Hladká, Ph.D. (09.10.2017)

The exam takes the form of an oral examination. Obtaining the course credit is a prerequisite for taking the examination in the course.

The examination requirements correspond to the course syllabus. The details are published on the course site.

Syllabus -
Last update: Mgr. Barbora Vidová Hladká, Ph.D. (25.01.2018)

1. Introduction: what machine learning is, motivating examples, the interdisciplinary nature of machine learning, supervised vs. unsupervised learning, and applications of machine learning.

2. Decision Tree learning: decision tree structure, ID3 algorithm, splitting criteria, incorporating continuous-valued attributes, handling missing attribute values.
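
To make the splitting criterion concrete, here is a minimal sketch (Python and the helper names entropy and information_gain are assumptions for illustration; the course itself does not prescribe a language) of the entropy-based information gain that ID3 maximizes when choosing an attribute to split on:

    import math
    from collections import Counter

    def entropy(labels):
        """Shannon entropy (in bits) of a list of class labels."""
        n = len(labels)
        return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

    def information_gain(examples, attribute, label):
        """Entropy reduction achieved by splitting `examples` on `attribute`.
        Each example is a dict; `attribute` and `label` are its keys."""
        gain = entropy([e[label] for e in examples])
        n = len(examples)
        for value in {e[attribute] for e in examples}:
            subset = [e[label] for e in examples if e[attribute] == value]
            gain -= len(subset) / n * entropy(subset)
        return gain

    # Toy data: ID3 would greedily pick the attribute with the highest gain.
    data = [{"outlook": "sunny", "play": "no"},
            {"outlook": "sunny", "play": "no"},
            {"outlook": "rain", "play": "yes"},
            {"outlook": "overcast", "play": "yes"}]
    print(information_gain(data, "outlook", "play"))  # 1.0: the split is perfect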

3. Naive Bayes classifier: Bayes' theorem, posterior probability, maximum likelihood estimation, Bayesian belief networks, the K2 algorithm.
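
As a sketch of the underlying computation (again Python, purely illustrative), a categorical naive Bayes classifier estimates P(class) and P(attribute value | class) by maximum likelihood and predicts the class maximizing their product:

    from collections import Counter, defaultdict

    def train_nb(examples, label):
        """MLE estimates: class counts and per-class attribute-value counts."""
        prior = Counter(e[label] for e in examples)
        cond = defaultdict(Counter)  # (class, attribute) -> value counts
        for e in examples:
            for a, v in e.items():
                if a != label:
                    cond[(e[label], a)][v] += 1
        return prior, cond

    def predict_nb(prior, cond, x):
        """Maximize P(c) * prod_a P(x_a | c). Plain MLE: an unseen attribute
        value zeroes out a class (smoothing is deliberately omitted here)."""
        total = sum(prior.values())
        scores = {c: nc / total for c, nc in prior.items()}
        for c, nc in prior.items():
            for a, v in x.items():
                scores[c] *= cond[(c, a)][v] / nc
        return max(scores, key=scores.get)

    data = [{"outlook": "sunny", "play": "no"}, {"outlook": "sunny", "play": "no"},
            {"outlook": "rain", "play": "yes"}, {"outlook": "rain", "play": "yes"}]
    prior, cond = train_nb(data, "play")
    print(predict_nb(prior, cond, {"outlook": "rain"}))  # 'yes'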

4. Experiment evaluation: accuracy, cross-validation, error estimation, bootstrapping, the ROC curve, statistical significance, confidence intervals.
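
A minimal sketch of k-fold cross-validation (the train/predict hooks are hypothetical placeholders for any learner; it assumes at least k examples):

    import random

    def cross_val_accuracy(examples, train, predict, k=10, seed=0):
        """k-fold cross-validation: train on k-1 folds, test on the held-out
        fold, and average the k accuracy estimates. `examples` are (x, y)
        pairs; `train(data)` returns a model, `predict(model, x)` a label."""
        data = examples[:]
        random.Random(seed).shuffle(data)
        folds = [data[i::k] for i in range(k)]
        accs = []
        for i, test in enumerate(folds):
            train_set = [e for j, fold in enumerate(folds) if j != i for e in fold]
            model = train(train_set)
            accs.append(sum(predict(model, x) == y for x, y in test) / len(test))
        return sum(accs) / k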

5. Linear and logistic regression.
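
For a taste of the optimization involved, a sketch of logistic regression fitted by batch gradient descent (NumPy assumed; learning rate and step count are arbitrary illustration values):

    import numpy as np

    def fit_logistic(X, y, lr=0.1, steps=1000):
        """Logistic regression by gradient descent on the negative
        log-likelihood. X: (n, d) array, y: (n,) array of 0/1 labels."""
        w, b = np.zeros(X.shape[1]), 0.0
        for _ in range(steps):
            p = 1.0 / (1.0 + np.exp(-(X @ w + b)))  # sigmoid of the linear score
            w -= lr * X.T @ (p - y) / len(y)
            b -= lr * np.mean(p - y)
        return w, b

    # Toy 1-D data: class 1 for larger x.
    X = np.array([[0.0], [1.0], [2.0], [3.0]])
    y = np.array([0, 0, 1, 1])
    w, b = fit_logistic(X, y)
    print(1.0 / (1.0 + np.exp(-(X @ w + b))))  # probabilities increase with x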

6. Instance-based learning: distance criteria, the k-NN algorithm, the discrete and continuous case, the curse of dimensionality.
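
The discrete case fits in a few lines; a sketch (NumPy assumed) of majority-vote k-NN under Euclidean distance:

    import numpy as np
    from collections import Counter

    def knn_predict(X_train, y_train, x, k=3):
        """Discrete k-NN: majority vote among the k nearest training points.
        (The continuous case would average the k neighbours' values instead.)"""
        d = np.linalg.norm(X_train - x, axis=1)
        nearest = np.argsort(d)[:k]
        return Counter(y_train[i] for i in nearest).most_common(1)[0][0]

    X = np.array([[0, 0], [0, 1], [5, 5], [6, 5]])
    y = np.array(["a", "a", "b", "b"])
    print(knn_predict(X, y, np.array([5, 6])))  # 'b'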

7. Support vector machines: the maximum-margin separator, finding the hyperplane, linear and non-linear separation, the kernel trick.
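
A quick illustration of the kernel trick (this assumes scikit-learn is installed; the course does not prescribe it): a linear SVM cannot separate XOR-style data, while an RBF-kernel SVM can, because the kernel implicitly maps the points into a higher-dimensional feature space:

    import numpy as np
    from sklearn.svm import SVC  # assumes scikit-learn is available

    # XOR-style data: not linearly separable in the input space.
    X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
    y = np.array([0, 1, 1, 0])

    linear = SVC(kernel="linear").fit(X, y)
    rbf = SVC(kernel="rbf", gamma=1.0, C=10.0).fit(X, y)
    print(linear.score(X, y))  # below 1.0: no separating hyperplane exists
    print(rbf.score(X, y))     # typically 1.0: separable in the implicit space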

8. Overfitting, regularization.
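
One concrete form of regularization, sketched with NumPy (the data here are synthetic, for illustration only): ridge regression adds an L2 penalty lam * ||w||^2 to least squares, shrinking the weights and trading a little bias for lower variance:

    import numpy as np

    def ridge_fit(X, y, lam=1.0):
        """Ridge regression, closed form: w = (X^T X + lam * I)^{-1} X^T y."""
        return np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)

    rng = np.random.default_rng(0)
    X = rng.normal(size=(20, 5))
    y = X @ np.array([1.0, 0.0, 0.0, 0.0, 0.0]) + 0.1 * rng.normal(size=20)
    for lam in (0.0, 1.0, 100.0):
        print(lam, np.round(ridge_fit(X, y, lam), 3))  # weights shrink as lam grows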

9. Ensemble methods: combination of classifiers, voting, bagging, boosting, AdaBoost, Random Forests.
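
The core of bagging is short enough to sketch (the train/predict hooks are hypothetical placeholders for any base learner): each model is trained on a bootstrap resample and the ensemble predicts by majority vote:

    import random
    from collections import Counter

    def bagging_predict(train, predict, examples, x, n_models=25, seed=0):
        """Bagging: train each model on a bootstrap resample (sampling with
        replacement), then combine the predictions by majority vote."""
        rng = random.Random(seed)
        votes = []
        for _ in range(n_models):
            sample = [rng.choice(examples) for _ in range(len(examples))]
            votes.append(predict(train(sample), x))
        return Counter(votes).most_common(1)[0][0]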

10. Clustering: dendrograms, (non)hierarchical clustering, the K-means algorithm.
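
A sketch of the K-means algorithm in its classic Lloyd form (NumPy assumed; initialization by sampling k data points, one of several common choices):

    import numpy as np

    def kmeans(X, k, steps=100, seed=0):
        """Lloyd's algorithm: alternately assign points to the nearest
        centroid and move each centroid to the mean of its cluster."""
        rng = np.random.default_rng(seed)
        centroids = X[rng.choice(len(X), size=k, replace=False)]
        for _ in range(steps):
            d = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
            assign = d.argmin(axis=1)
            new = np.array([X[assign == j].mean(axis=0) if np.any(assign == j)
                            else centroids[j] for j in range(k)])
            if np.allclose(new, centroids):  # converged: assignments stable
                break
            centroids = new
        return centroids, assign

    # Two well-separated blobs; K-means recovers their centers.
    rng = np.random.default_rng(1)
    X = np.vstack([rng.normal(0, 0.1, (10, 2)), rng.normal(3, 0.1, (10, 2))])
    centroids, assign = kmeans(X, 2)
    print(np.round(centroids, 2))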

11. Principles of neural networks learning.
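
To close the syllabus with something concrete, a minimal sketch (NumPy assumed; architecture and hyperparameters are illustrative choices) of a one-hidden-layer network trained by backpropagation, learning XOR, which no single linear unit can represent:

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    def train_mlp(X, y, hidden=4, lr=0.5, steps=5000, seed=0):
        """One hidden layer of sigmoid units, trained by backpropagation
        of the squared error; batch gradient descent."""
        rng = np.random.default_rng(seed)
        W1, b1 = rng.normal(0, 1, (X.shape[1], hidden)), np.zeros(hidden)
        W2, b2 = rng.normal(0, 1, (hidden, 1)), np.zeros(1)
        for _ in range(steps):
            h = sigmoid(X @ W1 + b1)                         # forward pass
            out = sigmoid(h @ W2 + b2)
            d_out = (out - y[:, None]) * out * (1 - out)     # backward pass
            d_h = (d_out @ W2.T) * h * (1 - h)
            W2 -= lr * h.T @ d_out; b2 -= lr * d_out.sum(axis=0)
            W1 -= lr * X.T @ d_h;   b1 -= lr * d_h.sum(axis=0)
        return W1, b1, W2, b2

    X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
    y = np.array([0.0, 1.0, 1.0, 0.0])
    W1, b1, W2, b2 = train_mlp(X, y)
    print(np.round(sigmoid(sigmoid(X @ W1 + b1) @ W2 + b2).ravel(), 2))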

 