Introduction to Data Engineering - NDBI046
Title: Úvod do datového inženýrství
Guaranteed by: Department of Software Engineering (32-KSI)
Faculty: Faculty of Mathematics and Physics
Valid: from 2025
Semester: summer
E-Credits: 5
Hours per week, examination: summer s.: 2/2, C+Ex [HT]
Capacity: unlimited
Min. number of students: unlimited
4EU+: no
Virtual mobility / capacity: no
State of the course: taught
Language: Czech, English
Teaching methods: full-time
Additional information: https://teaching.mff.cuni.cz/ndbi046-web/
Guarantor: Mgr. Petr Škoda, Ph.D.
Teacher(s): Ing. Pavel Koupil, Ph.D.
Mgr. Petr Škoda, Ph.D.
Incompatibility: NDBX046
Interchangeability: NDBX046
Is incompatible with: NDBX046
Is interchangeable with: NDBX046
Annotation
The goal of this course is to give an overview of the operations and techniques commonly used in a typical data processing workflow. These include data retrieval, cleaning, transformation, validation, cataloging, versioning, documentation, publication via API, integration, search, compression, encryption, and working with large and distributed data.
Last update: Kopecký Michal, RNDr., Ph.D. (09.09.2020)
Course completion requirements

Homework assignments are given during the semester; completing them earns the course credit.

The final exam is a written test.

Last update: Škoda Petr, Mgr., Ph.D. (16.01.2024)
Exam requirements

Obtaining the homework credit is a prerequisite for taking the final exam.

Last update: Škoda Petr, Mgr., Ph.D. (04.08.2020)
Syllabus
  • The role of data engineering
  • OLTP vs. OLAP
  • Data cube and related operations
  • Business intelligence
  • Data marketplace, data warehouse, data lake, and data contracts
  • Data warehouse modeling
  • Data pipeline (ETL) and data processing workflows
  • Data lineage and provenance
  • Data quality, dimensions, and metrics
  • Frameworks for data management and data governance
  • Methods for trustworthy and efficient data sharing
  • Data cataloging, metadata, data versioning
  • Data dictionary, data semantics, ontology
Last update: Škoda Petr, Mgr., Ph.D. (28.11.2025)
Entry requirements

The course expects working knowledge from the NPRG036 (Data Formats) course.

Last update: Škoda Petr, Mgr., Ph.D. (27.04.2021)