Course, academic year 2018/2019
Introduction to data analysis - NMMB333
Title in Czech: Základy analýzy dat
Guaranteed by: Department of Algebra (32-KA)
Faculty: Faculty of Mathematics and Physics
Valid: from 2018 to 2019
Semester: winter
E-Credits: 5
Hours per week, examination: winter semester: 2/2, C+Ex [hours/week]
Capacity: unlimited
Min. number of students: unlimited
State of the course: taught
Language: Czech
Teaching methods: full-time
Guarantor: RNDr. Pavel Charamza, CSc.
Class: M Mgr. MMIB
M Mgr. MMIB > Compulsory elective
Classification: Mathematics > Mathematics, Algebra, Differential Equations, Potential Theory, Didactics of Mathematics, Discrete Mathematics, Math. Econ. and Econometrics, External Subjects, Financial and Insurance Math., Functional Analysis, Geometry, General Subjects, Real and Complex Analysis, Mathematics General, Mathematical Modeling in Physics, Numerical Analysis, Optimization, Probability and Statistics, Topology and Category
Annotation -
Last update: doc. Mgr. et Mgr. Jan Žemlička, Ph.D. (09.05.2018)
The lectures cover standard methods of data analysis, including current trends in big data analysis using machine learning. Models are built on real data in the R environment.
Literature -
Last update: doc. Mgr. et Mgr. Jan Žemlička, Ph.D. (09.05.2018)

Zvára, Karel: Regresní analýza, Academia, 1989

Hebák, Hustopecký: Vícerozměrné statistické metody 1, 2, 3, Informatorium, 2007

Kolaczyk, Csardi: Statistical Analysis of Network Data with R, Springer, 2014

Munzert, Rubba, Meissner, Nyhuis: Automated Data Collection with R, Wiley, 2015

Syllabus -
Last update: doc. Mgr. et Mgr. Jan Žemlička, Ph.D. (09.05.2018)

1) Basics of linear regression, logistic regression, lasso regression, principles of hypothesis testing, likelihood ratio tests, stepwise algorithms

2) Basics of multidimensional statistics - principal component analysis, factor analysis, cluster analysis

3) Discrimination measures - Kolmogorov-Smirnov statistic, Gini coefficient, Somers' D

4) Backtesting principles, cross-validation and bootstrapping

5) Regression trees, random forests

6) Gradient boosting

7) Bayesian networks, neural networks

8) Linear optimization, support vector machines

Labs: Programming in R, practical work with data

Charles University | Information system of Charles University