Course, academic year 2018/2019
Introduction to Machine Learning - NPFL054
Title in Czech: Úvod do strojového učení
Guaranteed by: Institute of Formal and Applied Linguistics (32-UFAL)
Faculty: Faculty of Mathematics and Physics
Valid: from 2018
Semester: winter
E-Credits: 5
Hours per week, examination: winter s.:2/2 C+Ex [hours/week]
Capacity: unlimited
Min. number of students: unlimited
State of the course: taught
Language: Czech, English
Teaching methods: full-time
Additional information: https://ufal.mff.cuni.cz/course/npfl054
Guarantor: Mgr. Barbora Vidová Hladká, Ph.D.
Class: DS, Mathematical Linguistics
Informatics Bc.
Informatics Mgr. - Mathematical Linguistics
Classification: Informatics > Informatics, Software Applications, Computer Graphics and Geometry, Database Systems, Didactics of Informatics, Discrete Mathematics, External Subjects, General Subjects, Computer and Formal Linguistics, Optimization, Programming, Software Engineering, Theoretical Computer Science
Annotation -
Last update: Mgr. Barbora Vidová Hladká, Ph.D. (25.01.2018)
This introductory course covers both the theoretical background and practical Machine Learning (ML) algorithms. The ML methods discussed in the course are not limited to any specific domain and can be applied in many different fields. Lab sessions aim to give practical experience with ML tasks. Introductory knowledge of probability and statistics is required, as well as general programming skills.
Course completion requirements -
Last update: Mgr. Barbora Vidová Hladká, Ph.D. (09.10.2017)

Students have to submit three assignments (Homework #1, #2, #3) during the term so that the sum of scores in the HWs exceeds the required score limit:

  • Total max score: 90 pts
  • Required score limit: 65 pts

Students have to pass three written tests during the term so that the sum of scores in the tests exceeds the required score limit:

  • Test1 max score: 20 pts
  • Test2 max score: 20 pts
  • Test3 max score: 100 pts
  • Total max score: 140 pts
  • Required score limit: 75 pts

Obtaining the course credit is a prerequisite for taking the examination in the course.

Details about the homework assignments and tests are published on the course site.
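
As an informal illustration, the two credit conditions stated above can be sketched as a short check. The function name `earns_credit` and the sample scores are hypothetical, and reading "exceeds" as strictly greater than the limit is an assumption; the course site remains the authoritative source.

```python
# Limits taken from the course completion requirements above.
HW_REQUIRED = 65      # required score limit for Homework #1-#3 (max 90 pts)
TEST_REQUIRED = 75    # required score limit for Test1-Test3 (max 140 pts)


def earns_credit(hw_scores, test_scores):
    """Return True if both score sums exceed their required limits.

    Assumption: "exceeds" is read as strictly greater than the limit.
    """
    return sum(hw_scores) > HW_REQUIRED and sum(test_scores) > TEST_REQUIRED


# Hypothetical scores, for illustration only.
print(earns_credit([25, 22, 20], [15, 12, 60]))  # sums 67 and 87
print(earns_credit([30, 30, 30], [20, 20, 35]))  # sums 90 and 75
```

Under this strict reading, the second student reaches the test limit exactly and therefore does not meet the condition.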

Literature -
Last update: Mgr. Barbora Vidová Hladká, Ph.D. (25.01.2018)

James, Gareth, Daniela Witten, Trevor Hastie, and Robert Tibshirani: An Introduction to Statistical Learning. Springer, 2013.

Lantz, Brett: Machine Learning with R. Packt Publishing, 2013.

Goodfellow, Ian, Yoshua Bengio, and Aaron Courville. Deep Learning. MIT Press, 2016.

Exam requirements -
Last update: Mgr. Barbora Vidová Hladká, Ph.D. (09.10.2017)

The exam takes the form of an oral examination. Obtaining the course credit is a prerequisite for taking the examination in the course.

The examination requirements correspond to the course syllabus. The details are published on the course site.

Syllabus -
Last update: Mgr. Barbora Vidová Hladká, Ph.D. (25.01.2018)

Machine learning - basic concepts. What is machine learning, motivating examples of practical applications, theoretical foundations of machine learning. Supervised and unsupervised learning. Classification and regression tasks. Training and test examples. Feature vectors. Target variable and prediction function. Machine learning development cycle. Curse of dimensionality. Clustering algorithms.

Decision tree learning. Decision tree learning algorithm, splitting criteria, pruning.

Linear regression. Least squares cost function.

Instance-based learning. k-NN algorithm.

Logistic regression. Discriminative classifier.

Naive Bayes learning. Naive Bayes classifier. Bayesian belief networks.

Support Vector Machines. Large margin classifier, soft margin classifier. Kernel functions. Multiclass classification.

Ensemble methods. Bagging and boosting. AdaBoost algorithm. Random Forests.

Parameters in ML. Tuning of learning parameters. Grid search. Gradient descent algorithm. Maximum likelihood estimation.

Predictor evaluation. Working with test data. Sample error, generalization error. Cross-validation, leave-one-out method. Bootstrap methods. Performance measures. Evaluation of binary classifiers. ROC curve.

Statistical tests. Statistical hypotheses, one-sample and two-sample t-tests, chi-square goodness-of-fit test. Significance level, p-value. Using statistical tests for classifier evaluation. Confidence level, confidence intervals.

Overfitting. How to recognize and avoid it. Decision tree pruning. Regularization.

Dimensionality reduction. General principles of feature selection. Filters, wrappers, embedded methods.

Feature selection using information gain. Principal Component Analysis.

Foundations of Neural Networks. Single perceptron. Single hidden layer neural networks. Back-propagation training. Multi-layer feed-forward models. Remarks on Deep Learning.

 