Course, academic year 2023/2024
Základy R - APS300426
English title: Basic R
Guaranteed by: Department of Psychology (21-KPS)
Faculty: Faculty of Arts
Valid: from 2023 to 2023
Semester: both
Points: 0
E-Credits: 3
Examination process:
Hours per week, examination: 1/1, Z (credit) [HT]
Capacity: winter: 25 / unspecified (20)
summer: unspecified / unspecified (20)
Minimum number of students: unlimited
4EU+: no
Virtual mobility / capacity for virtual mobility: no
Competences:  
State of the course: taught
Language of instruction: English
Teaching method: in person
Level:  
Additional information: https://dl1.cuni.cz/course/view.php?id=13126
Note: the course can be enrolled in outside the study plan
enabled for web enrollment
the course can be enrolled in during both the winter and summer semesters
Guarantor: Mgr. Jana Dlouhá
Teacher: Mgr. Jana Dlouhá
Is incompatible with: APS300426E
Annotation - English
Last update: Mgr. Jana Dlouhá (19.12.2021)
This course introduces students to data science methods and their application in the R language. It builds on knowledge of statistical methods acquired in a bachelor's degree or through self-study.

Data science combines several fields, including mathematics, statistics, computer science, information science, machine learning and artificial intelligence. An article in the Harvard Business Review calls the data scientist "The Sexiest Job of the 21st Century" (Davenport & Patil, 2012). The most commonly used tools in this area are Python, SQL and R.

R is a programming language and environment designed for statistical analysis of data and its graphical display. It is an implementation of the programming language S under a free license. Because it is free, R has already outpaced commercial software such as SPSS in terms of number of users. At the same time, it offers a number of features beyond those of other free software, such as JASP or jamovi. The functionality of the R environment can be extended using libraries called packages, of which more than 15,000 are available in the CRAN repository. R is thus very versatile and can be used for many different tasks.

Davenport, Thomas H., and D. J. Patil. "Data Scientist: The Sexiest Job of the 21st Century." Harvard Business Review 90, no. 10 (October 2012): 70–76.
Rodriguez Salgado, J. J. (2021, December 9). What does a data scientist do? Breaking down the responsibilities of data scientists. DataCamp Community. Retrieved December 19, 2021, from https://www.datacamp.com/community/blog/what-does-a-data-scientist-do

Literature -
Last update: Mgr. Jana Dlouhá (13.01.2022)

Required:
Wickham, H. (2016). ggplot2: Elegant Graphics for Data Analysis. Springer-Verlag New York. URL: https://ggplot2.tidyverse.org

Wickham, H., & Grolemund, G. (2017). R for data science: import, tidy, transform, visualize and model data. O'Reilly. URL: https://r4ds.had.co.nz/index.html

R package documentation - https://www.rdocumentation.org/ (used packages)

R manuals https://cran.r-project.org/manuals.html

Recommended:
Field, A. P., Miles, J., & Field, Z. (2014). Discovering statistics using R. Sage.

Mair, P. (2018). Modern Psychometrics with R. In Use R! Springer International Publishing. https://doi.org/10.1007/978-3-319-93177-7

Grolemund, G., & Wickham, H. (2017). R for Data Science. O’Reilly Media. https://r4ds.had.co.nz/

Zamora Saiz, A., Quesada González, C., Hurtado Gil, L., & Mondéjar Ruiz, D. (2020). An Introduction to Data Analysis in R. In Use R! Springer International Publishing. https://doi.org/10.1007/978-3-030-48997-7 (selected chapters)

R Document Collections, Journals and Proceedings https://www.r-project.org/other-docs.html (including a list of books and other publications related to R)

Exam requirements -
Last update: Mgr. Jana Dlouhá (13.01.2022)

Attendance is not mandatory, but highly recommended.

Credit will be awarded to students who actively participate in a reasonable number of lectures and exercises. Missed attendance can be compensated for by completing assigned tasks and reading.

The exam will take place on an agreed date at the end of the semester. Each student will be assigned a case study based on the material covered during the course. Students perform an analysis and briefly (10 min) present their procedure and conclusions to their classmates.

Exam evaluation

  • preparation and cleaning of data for analysis
  • performing a basic exploratory analysis
  • choice of methods suitable for answering questions from the case study
  • analysis using selected methods
  • interpretation of results and answers to case study questions
  • presentation of results

Syllabus - English
Last update: Mgr. Jana Dlouhá (21.02.2023)

  1. Introduction
    a. the R environment and software for working with it, installing packages and troubleshooting (different operating systems, missing libraries, R versions, Stack Overflow)
    b. data types, base R functions
    c. saving and loading data, .RData files, work environment
    d. R documentation and CRAN, creating a project and its structure
    e. DataCamp courses
  2. Introduction II
    a. R syntax, loops, conditions, apply-family functions, writing functions, "OOP" in R
    b. Git, installing packages from GitHub
    c. best practices, defensive programming
  3. Tidyverse packages – ggplot2, dplyr, tidyr, readr, purrr, tibble, stringr, forcats
  4. Statistics in R – correlation, regression, t-test, ANOVA, chi-square, probability and distributions
  5. Data visualization in R (ggplot2, plotly, lattice, gganimate)
  6. Visualization best practices
  7. R Markdown – slides, HTML pages, PDF files, docx documents, LaTeX, BibTeX and CSS basics
  8. R Shiny
  9. Psychometrics in R – lavaan, psych, psychometrics, mirt, mirtCAT
  10. Missing data, types of missing data, consequences for parametric statistics, imputation methods, multiple imputation
  11. Text analysis and text mining – quanteda, word2vec; basic steps (); text statistics and summaries, readability indices, word frequency, similarity
  12. Basics of unsupervised and supervised machine learning
  13. Preparation for the exam - selection of suitable methods of analysis and visualization for different types of data, communication of your findings

Entry requirements -
Last update: Mgr. Jana Dlouhá (13.01.2022)
Students will need their own laptops with any operating system (Windows, Linux, macOS) during the course. Downloading and installing the necessary software will be part of the introductory lesson.
 
Charles University | Charles University Information System