Course, academic year 2023/2024
Data Processing and Analysis for the Humanities - NPFL143
Title: Zpracování a analýza dat pro humanitní a společenské vědy
Guaranteed by: Institute of Formal and Applied Linguistics (32-UFAL)
Faculty: Faculty of Mathematics and Physics
Valid: from 2023
Semester: summer
E-Credits: 2
Hours per week, examination: summer s.:0/2, C [HT]
Capacity: unlimited
Min. number of students: unlimited
4EU+: no
Virtual mobility / capacity: no
State of the course: not taught
Language: Czech
Teaching methods: full-time
Guarantor: RNDr. Martin Holub, Ph.D.
Class: Informatics Mgr. - elective
Classification: Informatics > Computer and Formal Linguistics
Annotation
Last update: RNDr. Jiří Mírovský, Ph.D. (24.05.2023)
Computer data processing has become a methodological prerequisite for the vast majority of scientific fields, including the humanities and social sciences. The course is taught through illustrative examples and guides students from the very basics of working with data to solving practical problems with tools implemented in the R software environment. It does not assume any previous knowledge of programming and is intended primarily for students of the humanities and social sciences at any level (Bc/Mgr/PhD).
Aim of the course
Last update: RNDr. Jiří Mírovský, Ph.D. (24.05.2023)

Students will learn to use the R system independently to process and analyze data from the humanities and social sciences. The course provides systematic technical support for mastering the basics of artificial intelligence in the follow-up course "Artificial Intelligence for the Humanities" [NPFL142].

Course completion requirements
Last update: RNDr. Jiří Mírovský, Ph.D. (24.05.2023)

The course ends with a written credit test. Attendance at the practical sessions is mandatory. Credit is awarded for active work throughout the semester and for submitting the homework assigned during the semester.

Literature
Last update: RNDr. Martin Holub, Ph.D. (06.06.2023)
  • Grolemund, Garrett and Hadley Wickham: R for Data Science: Import, Tidy, Transform, Visualize, and Model Data. O'Reilly Media, 2016. [https://r4ds.hadley.nz/]
  • Gonick, Larry and Woollcott Smith: The Cartoon Guide to Statistics. Harper Resource, 2005.
  • Arnold, Taylor and Lauren Tilton: Humanities Data in R. Exploring Networks, Geospatial Data, Images, and Text. Springer, 2015. [https://link.springer.com/book/10.1007/978-3-319-20702-5]

Syllabus
Last update: RNDr. Martin Holub, Ph.D. (06.06.2023)
Basics of using the R system, elementary programming skills
  • Data vectors, data tables, working with data files
  • Methods for textual data processing
  • Tools from the tidyverse package
  • Data visualization
  • Regular expressions

Elementary knowledge of probability and statistics

  • Theoretical and empirical probability distributions
  • Contingency tables
  • Simulation of random processes
  • Binomial and normal distributions
  • Using simple statistical tests

Support for experimenting with Artificial Intelligence

 