|
|
|
||
|
The course provides a practical introduction to data science. The lectures discuss the phases of a data science project and the related technologies and methods. In the practicals, the individual steps are applied to real-world data. Part of the lectures also focuses on the specifics of Big Data. The added value is practical experience from data science projects at the company Profinit, which is rarely found in textbooks.
The course is intended for students of the Big Data Processing specialization and for students of other specializations who want to gain a basic overview of the field of data science.
Last update: Zavoral Filip, RNDr., Ph.D. (17.03.2021)
|
|
||
|
During the practicals, students will receive (or choose, subject to the instructors' approval) a suitable real-world data set. Using this data set, they will gradually experiment with the methods discussed in the lectures. The results of this ongoing data processing will be described in two written reports (one in the middle and one at the end of the semester), which will be graded with points. Credit will be awarded for reaching the required minimum number of points. Points above that minimum will be added to the points gained from the written exam test.
Last update: Zavoral Filip, RNDr., Ph.D. (16.03.2021)
|
|
||
|
Sinan Ozdemir: Principles of Data Science
Tony Ojeda, Sean Patrick Murphy, Benjamin Bengfort, Abhijit Dasgupta: Practical Data Science Cookbook
Frank Kane: Hands-On Data Science and Python Machine Learning
Last update: Zavoral Filip, RNDr., Ph.D. (16.03.2021)
|
|
||
|
What is data science, typical use cases.
Data science decathlon (an overview of related methods, algorithms and technologies).
Map of follow-up lectures, organization of the course, requirements for credit / exam.
Motivation and problems of data science: a view from industry.
Limits of statistical methods, distortion.
Technologies for data science I: overview of popular representatives (technology stack), Python and data science.
Phases of a data science project, the CRISP-DM methodology.
Business understanding, data understanding.
Methods of data exploration and visualization.
Creating a useful and understandable report.
Data preparation (cleaning, transformation, feature extraction, ...).
Modeling I: basic statistical models and performance evaluation.
Modeling II: applied Bayesianism.
Data science in modern database systems.
Big Data science, MapReduce and data science.
Apache Spark and data science.
Technologies for data science II: MLOps versioning, documentation, ...
Business view of a data science project.
Last update: Zavoral Filip, RNDr., Ph.D. (16.03.2021)
|