Course, academic year 2023/2024
Introduction to data analysis - NMMB333
Title: Základy analýzy dat
Guaranteed by: Department of Algebra (32-KA)
Faculty: Faculty of Mathematics and Physics
Valid: from 2020
Semester: winter
E-Credits: 5
Hours per week, examination: winter s.:2/2, C+Ex [HT]
Capacity: unlimited
Min. number of students: unlimited
4EU+: no
Virtual mobility / capacity: no
State of the course: taught
Language: Czech
Teaching methods: full-time
Guarantor: RNDr. Pavel Charamza, CSc.
Class: M Mgr. MMIB
M Mgr. MMIB > Compulsory elective
Classification: Mathematics > Mathematics, Algebra, Differential Equations, Potential Theory, Didactics of Mathematics, Discrete Mathematics, Math. Econ. and Econometrics, External Subjects, Financial and Insurance Math., Functional Analysis, Geometry, General Subjects, Real and Complex Analysis, Mathematics General, Mathematical Modeling in Physics, Numerical Analysis, Optimization, Probability and Statistics, Topology and Category
Annotation -
Last update: doc. Mgr. et Mgr. Jan Žemlička, Ph.D. (09.05.2018)
The course covers standard methods of data analysis, including current trends in big data analysis using machine learning. Models are built on real data in the R environment.
Course completion requirements -
Last update: doc. Mgr. et Mgr. Jan Žemlička, Ph.D. (28.10.2019)

Students must pass an oral exam.

Literature -
Last update: doc. Mgr. et Mgr. Jan Žemlička, Ph.D. (09.05.2018)

Zvára, Karel: Regresní analýza, Academia, 1989

Hebák, Hustopecký: Vícerozměrné statistické metody 1, 2, 3, Informatorium, 2007

Kolaczyk, Csardi: Statistical Analysis of Network Data with R, Springer, 2014

Munzert, Rubba, Meissner, Nyhuis: Automated Data Collection with R, Wiley, 2015

Requirements to the exam -
Last update: doc. Mgr. et Mgr. Jan Žemlička, Ph.D. (28.10.2019)

Students must pass an oral exam. The exam requirements correspond to the material covered in the lectures and practicals.

Syllabus -
Last update: doc. Mgr. et Mgr. Jan Žemlička, Ph.D. (09.05.2018)

1) Basics of linear regression, logistic regression, lasso regression, principles of hypothesis testing, likelihood ratio tests, stepwise algorithms

2) Basics of multivariate statistics - principal component analysis, factor analysis, cluster analysis

3) Discrimination measures - Kolmogorov-Smirnov statistic, Gini coefficient, Somers' D

4) Backtesting principles, cross-validation and bootstrapping

5) Regression trees, random forests

6) Gradient boosting

7) Bayesian networks, neural networks

8) Linear optimization, support vector machines

Labs: Programming in R, practical work with data

 