Course, academic year 2023/2024
Introduction to Data Analysis in R - ASGV00154
Title: Úvod do analýzy dat v R
Guaranteed by: Department of Sociology (21-KSOC)
Faculty: Faculty of Arts
Valid: from 2023
Semester: winter
Points: 0
E-Credits: 3
Examination process: winter s.:
Hours per week, examination: winter s.:0/2, C [HT]
Capacity: unknown / unlimited (15)
Min. number of students: unlimited
4EU+: no
Virtual mobility / capacity: no
Key competences:  
State of the course: taught
Language: Czech
Teaching methods: full-time
Level:  
Note: course can be enrolled in outside the study plan
enabled for web enrollment
Guarantor: Mgr. Aleš Vomáčka
Teacher(s): Mgr. Aleš Vomáčka
Annotation -
Last update: Mgr. Aleš Vomáčka (19.09.2023)
This course is taught in Czech.

The course is an introduction to R, a programming language developed for statistical data analysis. No previous knowledge of R is assumed, but basic knowledge of descriptive statistics and prior experience with data analysis are prerequisites. For particularly motivated students of the Department of Sociology FF UK, the minimum requirement is completion of the first-year courses Statistics 1 (ASG100117), Statistics 1 Seminar (ASG100118) and Sociological Data Processing (ASG100118).

The course is built on a modern approach to data analysis in R, using the RStudio development environment and the Tidyverse "grammar". This approach is probably the most widespread in the user community today.

Learning R is a long-term undertaking. It requires a much larger time investment than mastering GUI software such as SPSS. The reward is far greater flexibility and a universal tool for data processing, analysis, visualization, programming and automation. Although the course will not get that far, the libraries and tools now available for R also make it possible to build interactive graphical applications, web pages and presentations and, beyond standard statistical analysis, to use machine learning tools. The course makes most sense for students who want to specialize in quantitative methods in their sociology studies and who are prepared to study on their own and build on the modest foundations the course offers.

Participation in classes requires your own laptop with an Internet connection (Eduroam or other).
Aim of the course -
Last update: Mgr. Aleš Vomáčka (19.09.2023)

The aim of the course is to introduce students to the R programming environment for statistical analysis, with a focus on the modern approach to working in R with the Tidyverse packages. In particular, students will learn to manipulate data efficiently (dplyr package) and to visualize data flexibly and efficiently (ggplot2 package). In addition, attention is paid to the forcats package (working with categorical variables, i.e. factors) and the stringr package (working with text variables).

Course completion requirements -
Last update: Mgr. Jaromír Mazák, Ph.D. (27.03.2022)

To complete the course, students must submit a semester assignment by the end of the academic year in which they enrolled in the course. The assignment consists of replicating a given data analysis. The exact assignment for the given academic year will be made available to students at the beginning of the course. In every case, students must submit three things: the script, which must be fully functional (i.e. it must run without error from start to finish without outside intervention), the data the script needs to run, and the output the script generates.

Literature -
Last update: Mgr. Aleš Vomáčka (19.09.2023)

Primary literature:

* Wickham, H., & Grolemund, G. (2017). R for Data Science: Import, Tidy, Transform, Visualize, and Model Data (1st edition). O’Reilly Media. Available online: https://r4ds.had.co.nz/

 

Secondary literature:

* Rodrigues, B. (2020). Modern R with the tidyverse. Available online: https://b-rodrigues.github.io/modern_R/

* Chang, W. (2018). R Graphics Cookbook. Available online: http://users.metu.edu.tr/ozancan/R%20Graphics%20Cookbook.pdf Alternatively, a shorter version is available here: http://www.cookbook-r.com/Graphs/

* Navarro, D. (n.d.). Learning Statistics with R: A tutorial for psychology students and other beginners. Available online: https://open.umn.edu/opentextbooks/textbooks/learning-statistics-with-r-a-tutorial-for-psychology-students-and-other-beginners

 

Teaching methods -
Last update: Mgr. Jaromír Mazák, Ph.D. (27.03.2022)

Seminar.

Syllabus -
Last update: Mgr. Petra Poncarová (20.09.2022)

Topics:

0. Before the course starts - install R, RStudio and the Tidyverse on your own at home, following our instructions

1. What you will learn in the course (motivation), what you have to complete, R as software, RStudio as the user interface, materials and where to find help, base R vs. Tidyverse, examples of working in base R, data structures in R, built-in functions in R.

2. Data import, data file transformations (dplyr package; select, filter, arrange, mutate, summarize functions)

3. Working with multiple variables at once (across function)

4. Data file manipulation (pivot_longer, pivot_wider, *_join, bind_rows, bind_cols functions)

5. Revision of functions from the dplyr and tidyr packages

6. Working with factors (forcats package)

7. Exploring data using visualization (ggplot2 package) - part 1

8. Exploring data using visualization (ggplot2 package) - part 2

9. Aesthetic and functional editing of graphs (ggplot2 package, scales package)

10. Working with text variables (stringr package)

11. Introduction to RMarkdown and generating analytical outputs in various formats

12. Revision

 
Charles University | Information system of Charles University | http://www.cuni.cz/UKEN-329.html