Course, academic year 2023/2024
Data Processing in Python - JEM207
Title: Data Processing in Python
Czech title: Data Processing in Python
Guaranteed by: Institute of Economic Studies (23-IES)
Faculty: Faculty of Social Sciences
Actual: from 2023 to 2023
Semester: both
E-Credits: 5
Examination process: written
Hours per week, examination: 2/0, Ex [HT]
Capacity: winter:130 / unknown (60)
summer:unknown / unknown (60)
Min. number of students: unlimited
4EU+: no
Virtual mobility / capacity: no
State of the course: taught
Language: English
Teaching methods: full-time
Note: course can be enrolled in outside the study plan
enabled for web enrollment
priority enrollment if the course is part of the study plan
you can enroll for the course in winter and in summer semester
Guarantor: PhDr. František Čech, Ph.D.
Mgr. Martin Hronec
Teacher(s): PhDr. František Čech, Ph.D.
Mgr. Martin Hronec
Mgr. Bc. Vít Macháček, Ph.D.
Ing. Alena Pavlova
Mgr. Jan Šíla, M.Sc.
Class: Courses for incoming students
Annotation -
Last update: Mgr. Jan Šíla, M.Sc. (06.02.2023)
The course is taught in person and we expect students to come to the class to attend the lectures and seminars.

The aim of the course is to provide hands-on experience in programming in Python with a special emphasis on data manipulation and processing.

Students will learn the basics of pandas, NumPy and Matplotlib, and will collect web data with API requests and BeautifulSoup. They will also be guided through modern social-coding and open-source technologies such as GitHub, Jupyter and Open Data.

The students will gain experience using the data from the IES website and subject evaluation protocols.

The course makes use of DataCamp online resources (https://www.datacamp.com) to provide students with reliable yet simple material for learning Python programming.
Aim of the course -
Last update: Mgr. Martin Hronec (06.02.2020)

After passing the course, students will be able to carry out a data-oriented software project in Python: download data from APIs or directly from the web, pre-process it, analyze it and visualize it. They will also be able to do so in a reproducible way that follows standard software-development practice, using version control.

Literature -
Teaching methods -
Last update: Mgr. Martin Hronec (06.02.2020)

Please see the course GitHub repository (https://github.com/vitekzkytek/PythonDataIES/blob/master/README.md).

Requirements to the exam -
Last update: Mgr. Jan Šíla, M.Sc. (05.12.2023)

The final grade consists of four parts:

  • Homework assignments (5%)
  • Midterm (25%)
  • Presentation of work in progress on the final project (10%) - at least 50% required from this part
  • Final project (60%) - at least 50% required from this part

More information is available on the course GitHub (https://github.com/vitekzkytek/PythonDataIES/blob/master/README.md).

Grading scale (according to Dean's Provision 17/2018):

  • A: above 90 (not inclusive)
  • B: between 80 (not inclusive) and 90 (inclusive)
  • C: between 70 (not inclusive) and 80 (inclusive)
  • D: between 60 (not inclusive) and 70 (inclusive)
  • E: between 50 (not inclusive) and 60 (inclusive)
  • F: below 50 (inclusive)
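
The weighting and the grading scale above can be sketched in a few lines of Python. This is only an illustration, not the official grading procedure: the function and variable names are hypothetical, each part is expressed as a share between 0.0 and 1.0, and the sketch assumes that missing a required 50% minimum results in an F, which the requirements do not state explicitly.

```python
# Weights of the four parts, in percent of the final grade (from the list above).
WEIGHTS = {"homework": 5, "midterm": 25, "presentation": 10, "project": 60}

# Parts that require at least 50% of their own points.
MIN_SHARE_REQUIRED = {"presentation": 0.5, "project": 0.5}


def final_score(shares):
    """Combine per-part shares (0.0 to 1.0) into a score out of 100."""
    return sum(WEIGHTS[part] * shares[part] for part in WEIGHTS)


def letter_grade(score):
    """Map a score out of 100 to a letter; every lower bound is exclusive."""
    for letter, lower_bound in (("A", 90), ("B", 80), ("C", 70), ("D", 60), ("E", 50)):
        if score > lower_bound:
            return letter
    return "F"


def course_grade(shares):
    """Return the letter grade for the given per-part shares.

    Assumption (not stated in the requirements): failing a required
    minimum leads to an F regardless of the total score.
    """
    for part, minimum in MIN_SHARE_REQUIRED.items():
        if shares[part] < minimum:
            return "F"
    return letter_grade(final_score(shares))
```

Because the lower bounds are exclusive, a score of exactly 90 maps to B and a score of exactly 50 maps to F.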
Syllabus -
Last update: Mgr. Martin Hronec (05.02.2024)

Previous experience with general coding is assumed. The course is designed for students who have at least some basic coding experience. It does not need to be very advanced, but they should be familiar with concepts such as for loops, if and else, variables and functions.

No knowledge of Python is required for entering the course.

| Week | Date | L/S | Topic | Lecturer | Deadline |
| --- | --- | --- | --- | --- | --- |
| 1 | 19.2. | S | Seminar 0: Setup (Jupyter, VScode, Git, OS basics) | Martin + Alena |  |
| 1 | 20.2. | L | Python basics | Martin |  |
| 2 | 27.2. | L | Python basics II | Jan |  |
| 3 | 4.3. | S | Seminar 1: Basics | Alena | HW 1 |
| 3 | 5.3. | L | Numpy | Jan |  |
| 4 | 12.3. | L | Pandas I | Martin |  |
| 5 | 18.3. | S | Seminar 2: Numpy & pandas | Alena | HW 2 |
| 5 | 19.3. | L | Pandas II + Matplotlib | Martin |  |
| 6 | 26.3. | L | Data formats, APIs | Jan |  |
| 7 | 2.4. | S | Seminar 3: Data formats & APIs | Alena | HW 3 |
| 7 | 8.4. | - | MIDTERM | Alena, Jan & Martin |  |
| 8 | 9.4. | L | Algorithmic problem solving | Jan |  |
| 9 | 15.4. | S | MIDTERM solution | Alena |  |
| 9 | 16.4. | L | Data science | Martin |  |
| 10 | 23.4. | L | How to code (avoiding spaghetti code) | Martin | Project proposal |
| 11 | 29.4. | S | Seminar 5: Data science case-study | Alena |  |
| 11 | 30.4. | L | Databases | Jan | Topic approved |
| 12 | 7.5. | L | Guest Lecture + Beer after lecture @ https://pivo-klub.cz/ | TBD |  |
| 13 | 12.-16.5. | - | WiP: Project consultations | Alena, Jan & Martin |  |
| 14 | 20.-23.5. | - | WiP: Project consultations | Alena, Jan & Martin |  |

Entry requirements -
Last update: Mgr. Martin Hronec (06.02.2020)

Previous experience with general coding is assumed. The course is designed for students who have at least some basic coding experience. It does not need to be very advanced, but they should be familiar with concepts such as for loops, if and else, variables and functions.

No knowledge of Python is required for entering the course.

Registration requirements -
Last update: Mgr. Martin Hronec (04.10.2022)

The course is primarily for master's and advanced bachelor's students.

 