Seminar on Data Mining - NAIL121

Title:	Seminář dobývání znalostí
Guaranteed by:	Department of Theoretical Computer Science and Mathematical Logic (32-KTIML)
Faculty:	Faculty of Mathematics and Physics
Actual:	from 2023
Semester:	summer
E-Credits:	4
Hours per week, examination:	summer s.:1/2, MC [HT]
Capacity:	unlimited
Min. number of students:	1
4EU+:	no
Virtual mobility / capacity:	no
State of the course:	taught
Language:	Czech, English
Teaching methods:	full-time

Guarantor:	Mgr. Marta Vomlelová, Ph.D.
Teacher(s):	Mgr. Marta Vomlelová, Ph.D.
Class:	Informatika Bc.

Annotation -

The lectures introduce machine learning tools and the use of library functions. Seminar participants analyze a given data set and submit their results as a seminar paper.

Last update: Šámal Robert, doc. Mgr., Ph.D. (01.06.2018)

Aim of the course -

The course provides basic experience with data preprocessing and machine learning algorithms.

Last update: Vomlelová Marta, Mgr., Ph.D. (14.05.2021)

Course completion requirements -

Students have to complete small tasks during the practicals in the first part of the semester. They then analyze a selected data set, present the results, and submit the analysis as a semester project.

Last update: Vomlelová Marta, Mgr., Ph.D. (20.05.2025)

Literature -

Willi Richert, Luis Pedro Coelho: Building Machine Learning Systems with Python, Packt Publishing 2013

Gareth James, Daniela Witten, Trevor Hastie and Robert Tibshirani: An Introduction to Statistical Learning with Applications in R, Springer 2013

Last update: Vomlelová Marta, Mgr., Ph.D. (15.05.2024)

Syllabus -

The seminar provides hands-on experience in data analysis. It builds on the course Introduction to Machine Learning.

The lectures introduce machine learning tools and the use of library functions. Seminar participants analyze a given dataset and submit their results as a seminar paper.

The lectures cover:

graphs (scatter plots, box plots, other basic graphs, and graph annotations)

groupby function and group statistics

simple classification and regression models

evaluation with respect to different error functions

outlier identification and missing data handling.
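
As a minimal sketch of three of the topics above (group statistics with groupby, missing data handling, and outlier identification), assuming pandas and a hypothetical toy table that is not part of the course materials. The column names, the median fill, and the 1.5 * IQR rule are illustrative choices, not the methods prescribed by the seminar:

```python
import pandas as pd

# Hypothetical toy data; the datasets used in the seminar are not specified here.
df = pd.DataFrame({
    "species": ["a", "a", "a", "b", "b", "b"],
    "length": [1.0, 1.2, None, 3.0, 3.1, 9.0],
})

# Missing data handling: fill each gap with the median of its own group.
group_median = df.groupby("species")["length"].transform("median")
df["length"] = df["length"].fillna(group_median)

# Group statistics: one row per species, one column per statistic.
stats = df.groupby("species")["length"].agg(["mean", "median", "count"])

# Outlier identification with the common 1.5 * IQR rule (one option among many).
q1 = df["length"].quantile(0.25)
q3 = df["length"].quantile(0.75)
iqr = q3 - q1
is_outlier = (df["length"] < q1 - 1.5 * iqr) | (df["length"] > q3 + 1.5 * iqr)
outliers = df[is_outlier]

print(stats)
print(outliers)
```

Filling within groups rather than with a global median keeps the imputed value close to its own group. Whether that is appropriate depends on the dataset.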

Depending on the specific dataset, we may further focus on:

maps (geopandas),

time series,

TF-IDF vectorization of text,

clustering and the Apriori algorithm.
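
To illustrate one of the optional topics, the following is a minimal sketch of TF-IDF vectorization in plain Python. It assumes whitespace tokenization and the unsmoothed weighting tf * ln(N / df). Library implementations often use smoothed or normalized variants, so their numbers may differ. The three example documents are hypothetical:

```python
import math
from collections import Counter

# Hypothetical mini corpus, used only for illustration.
docs = [
    "data mining finds patterns in data",
    "machine learning learns from data",
    "maps show spatial patterns",
]

tokenized = [doc.split() for doc in docs]
n_docs = len(tokenized)

# Document frequency: the number of documents that contain each term.
doc_freq = Counter(term for tokens in tokenized for term in set(tokens))


def tfidf(tokens):
    """Return a sparse TF-IDF vector (term -> weight) for one document."""
    counts = Counter(tokens)
    return {
        term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
        for term, count in counts.items()
    }


vectors = [tfidf(tokens) for tokens in tokenized]

for doc, vec in zip(docs, vectors):
    print(doc, vec)
```

Terms that occur in only one document, such as "mining", receive a higher weight than terms shared across documents, such as "patterns".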

Last update: Vomlelová Marta, Mgr., Ph.D. (15.05.2024)