Course, academic year 2024/2025
Data Science 2 - NMFP436
Title: Data Science 2
Guaranteed by: Department of Probability and Mathematical Statistics (32-KPMS)
Faculty: Faculty of Mathematics and Physics
Actual: from 2024
Semester: summer
E-Credits: 5
Hours per week, examination: summer s.:2/2, C+Ex [HT]
Capacity: unlimited
Min. number of students: unlimited
4EU+: no
Virtual mobility / capacity: no
State of the course: taught
Language: English, Czech
Teaching methods: full-time
Guarantor: RNDr. Václav Kozmík, Ph.D.
doc. RNDr. Michal Pešta, Ph.D.
Teacher(s): RNDr. Karel Kozmík
RNDr. Václav Kozmík, Ph.D.
Mgr. Ondřej Týbl, Ph.D.
Class: M Mgr. FPM
M Mgr. FPM > Compulsory elective
M Mgr. PMSE
M Mgr. PMSE > Compulsory elective
Classification: Informatics > Software Applications
Mathematics > Probability and Statistics
Is pre-requisite for: NMFP556
Annotation -
Machine learning is a crucial part of big data analysis. It is widely used and has proved successful at solving complex tasks in many fields. This course introduces the basic principles of machine learning and their use in practice. It presents the most widely used methods, such as decision trees and neural networks, which will be implemented in Python during the practicals. We will focus on the analysis of real data and the interpretation of results.
Last update: Branda Martin, doc. RNDr., Ph.D. (11.12.2020)
Aim of the course -

An introduction to the basic principles of machine learning and their use in practice.

Last update: Zichová Jitka, RNDr., Dr. (06.05.2021)
Course completion requirements -

Details can be found on the webpage: https://www2.karlin.mff.cuni.cz/~kozmikk/DS2.php

Last update: Kozmík Václav, RNDr., Ph.D. (09.02.2022)
Literature -

Ian Goodfellow, Yoshua Bengio, Aaron Courville: Deep Learning, MIT Press, 2016.

Jürgen Schmidhuber: Deep learning in neural networks: An overview, Neural networks 61 (2015): 85-117.

Jerome H. Friedman: Stochastic gradient boosting, Computational Statistics & Data Analysis 38 (2002): 367-378.

Last update: Kozmík Václav, RNDr., Ph.D. (11.12.2020)
Teaching methods -

Lecture + exercises.

Last update: Zichová Jitka, RNDr., Dr. (06.05.2021)
Requirements to the exam -

The exam consists of solving a practical task in Python, followed by a discussion of the selected algorithm, its theoretical background, and the results achieved in the practical task. Each student will receive a data set together with a description of the prediction task to be solved.

Last update: Kozmík Václav, RNDr., Ph.D. (21.04.2022)
Syllabus -

Lectures:

• introduction to machine learning, motivation, examples

• general methods in machine learning: splitting a dataset into training and validation sets, over-fitting, regularization

• methods using decision trees: decision trees, random forest, gradient boosting

• methods using neural networks: simple neural networks, convolutional neural networks, recurrent neural networks

• clustering methods – supervised vs unsupervised

• other classification methods – support vector machine, naive Bayes

Practicals:

• Practicals will be held in a computer lab, using Python

• Machine learning algorithms will be applied to real data
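
As an illustration of two of the general topics listed above (splitting a dataset into training and validation sets, and over-fitting), here is a minimal sketch in plain Python. It is not taken from the course materials: the synthetic data, the 1-nearest-neighbour classifier, and all function names (make_data, train_validation_split, predict_1nn, error_rate) are assumptions chosen only for illustration.

```python
import random


def make_data(n=200, noise=0.2, seed=1):
    """Synthetic 1-D data (hypothetical): label is 1 if x > 0.5, flipped with probability `noise`."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = rng.random()
        y = 1 if x > 0.5 else 0
        if rng.random() < noise:
            y = 1 - y  # label noise that a model can memorize but not generalize from
        rows.append((x, y))
    return rows


def train_validation_split(rows, validation_fraction=0.25, seed=0):
    """Shuffle a copy of the rows and split it into (training, validation) parts."""
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    n_val = int(len(shuffled) * validation_fraction)
    return shuffled[n_val:], shuffled[:n_val]


def predict_1nn(train, x):
    """Predict the label of x as the label of the closest training point."""
    nearest = min(train, key=lambda row: abs(row[0] - x))
    return nearest[1]


def error_rate(train, rows):
    """Fraction of rows whose label differs from the 1-NN prediction."""
    wrong = sum(1 for x, y in rows if predict_1nn(train, x) != y)
    return wrong / len(rows)


data = make_data()
train, validation = train_validation_split(data)

# 1-NN memorizes the training set, so its training error is zero;
# only the validation error says something about unseen data.
train_error = error_rate(train, train)
validation_error = error_rate(train, validation)

print(f"training error:   {train_error:.2f}")
print(f"validation error: {validation_error:.2f}")
```

The training error is zero by construction, because every training point is its own nearest neighbour, so it carries no information about how the model behaves on new data. The gap between the two numbers is what the validation set is meant to reveal.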

Last update: Kozmík Václav, RNDr., Ph.D. (11.12.2020)
Entry requirements

Necessary:

  • Basic calculus: derivatives, integrals, Taylor expansion, etc.
  • Basic probability and statistics: probability distributions, central limit theorem, statistical tests and hypotheses, Fisher information, maximum likelihood estimators
  • Basic programming skills (in any language)

Good to know:

  • Python: some basics will be covered, but the course can be challenging for students with no prior Python experience

Last update: Omelka Marek, doc. Ing., Ph.D. (19.11.2021)
 
Charles University | Information system of Charles University | http://www.cuni.cz/UKEN-329.html