Course, academic year 2016/2017
Data Science with R - JEM181
Title: Data Science with R
Czech title: Data Science with R
Guaranteed by: Institute of Economic Studies (23-IES)
Faculty: Faculty of Social Sciences
Actual: from 2016 to 2016
Semester: winter
E-Credits: 6
Examination process: winter s.:combined
Hours per week, examination: winter s.:2/2, Ex [HT]
Capacity: 84 / 84 (60)
Min. number of students: unlimited
4EU+: no
Virtual mobility / capacity: no
State of the course: taught
Language: English
Teaching methods: full-time
Note: course can be enrolled in outside the study plan
enabled for web enrollment
priority enrollment if the course is part of the study plan
Guarantor: prof. PhDr. Ladislav Krištoufek, Ph.D.
Teacher(s): Mgr. Ladislav Habiňák, MBA
prof. PhDr. Ladislav Krištoufek, Ph.D.
Annotation
An introductory course in data science with applications in the R programming environment. Special focus is placed on data visualization, data and text mining, and machine learning methods.
Last update: Krištoufek Ladislav, prof. PhDr., Ph.D. (23.09.2016)
Aim of the course

The main aim of the course is to train students to properly analyze specific datasets with methods outside the standard econometric framework, using the R programming environment.

Last update: Krištoufek Ladislav, prof. PhDr., Ph.D. (23.09.2016)
Literature

Mandatory literature:

  • Toomey, Dan (2014): R for Data Science, Packt Publishing Ltd., Birmingham, UK
  • Zumel, Nina & Mount, John (2014): Practical Data Science with R, Manning Publications Co., Shelter Island, NY, USA

Additional suggested literature:

  • Grolemund, Garrett (2014): Hands-On Programming with R, O'Reilly Media Inc., Sebastopol, CA, USA
  • Ojeda, Tony et al. (2014): Practical Data Science Cookbook, Packt Publishing Ltd., Birmingham, UK
Last update: Krištoufek Ladislav, prof. PhDr., Ph.D. (25.09.2017)
Teaching methods

Lectures + Seminars (2 parallel classes, Tuesdays and Wednesdays):

  • Group 1: Tuesdays 9:30 - 12:20 (room 016) with a break of 10 minutes
  • Group 2: Wednesdays 9:30 - 12:20 (room 016) with a break of 10 minutes

Software: R and RStudio (available on all computers in room 016)

Last update: Krištoufek Ladislav, prof. PhDr., Ph.D. (01.10.2018)
Requirements to the exam

The final grade consists of four components (100 points in total):

  • DataCamp assignments: 25 (5*5)
  • Active participation during lectures and seminars: 10
  • Final project: 35
  • Final test: 30

Grading scale (according to Dean's Provision 17/2018):

  • A: above 90 (not inclusive)
  • B: between 80 (not inclusive) and 90 (inclusive)
  • C: between 70 (not inclusive) and 80 (inclusive)
  • D: between 60 (not inclusive) and 70 (inclusive)
  • E: between 50 (not inclusive) and 60 (inclusive)
  • F: below 50 (inclusive)
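
The point breakdown and grading scale above can be sketched in a few lines of code. This is only an illustration, written in Python for brevity rather than in R: the names `MAX_POINTS`, `total_points`, and `letter_grade` are hypothetical and are not part of any official course tooling. It assumes the component maxima and the exclusive lower bounds exactly as listed above.

```python
# Maximum points per component, taken from the list of grade components.
# The dictionary keys are hypothetical labels chosen for this sketch.
MAX_POINTS = {
    "datacamp": 25,       # 5 assignments * 5 points
    "participation": 10,
    "project": 35,
    "test": 30,
}


def total_points(scores):
    """Sum the component scores after checking each against its maximum."""
    for name, value in scores.items():
        if name not in MAX_POINTS:
            raise KeyError(f"unknown component: {name}")
        if not 0 <= value <= MAX_POINTS[name]:
            raise ValueError(f"{name} must be between 0 and {MAX_POINTS[name]}")
    return sum(scores.values())


def letter_grade(total):
    """Map a point total to a letter grade.

    Each lower bound is exclusive, so exactly 90 points is a B
    and exactly 50 points is an F.
    """
    for bound, letter in ((90, "A"), (80, "B"), (70, "C"), (60, "D"), (50, "E")):
        if total > bound:
            return letter
    return "F"
```

For instance, a student with 20 + 10 + 30 + 25 = 85 points falls in the B band, and a total of exactly 90 points is still a B because the lower bound for an A is not inclusive.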

DataCamp.com assignments:

  • Assignment #1 - by the end of Week #4:
    • Introduction to R
  • Assignment #2 - by the end of Week #7:
    • Intro to Exploratory Data Analysis (optional)
    • Training and Evaluating of Regression Models
    • Issues to Consider
    • Tree-based Methods
  • Assignment #3 - by the end of Week #10:
    • Introduction to Machine Learning
  • Assignment #4 - by the end of Week #12:
    • Text Mining: Bag of Words
  • Assignment #5 - by the end of Week #12:
    • Unsupervised Learning in R
Last update: Krištoufek Ladislav, prof. PhDr., Ph.D. (10.10.2018)
Syllabus
Chapter references in parentheses appear to follow the literature list: ZM = Zumel & Mount, T = Toomey, G = Grolemund.
  • Week #1-#2: Course information + R basics (ZM 1, G 3-5)

  • Week #3: Loading data, cleaning data, sampling (ZM 2-4)

  • Week #4: Model evaluation (ZM 5)

  • Week #4-#5: Memorization methods (ZM 6)

  • Week #6: Correlations, linear and logistic regressions and beyond (ZM 7, T 4-5)

  • Week #7: Clustering (T 1, ZM 8)

  • Week #8-#9: Data and text mining sequences (T 2-3)

  • Week #10: Reducing training variance & Generalized additive models (ZM 9)

  • Week #11: Machine learning techniques (ZM 9, T 10-12)

  • Week #12: aLook Analytics presentation

Last update: Krištoufek Ladislav, prof. PhDr., Ph.D. (01.10.2018)
Entry requirements

There are no formal course requirements. However, knowledge up to the level of the Statistics (JEB105) and Econometrics I (JEB109) courses is assumed and expected.

Last update: Krištoufek Ladislav, prof. PhDr., Ph.D. (26.09.2016)
Registration requirements

There are no formal course requirements. However, knowledge up to the level of the Statistics (JEB105) and Econometrics I (JEB109) courses is assumed and expected.

Last update: Krištoufek Ladislav, prof. PhDr., Ph.D. (26.09.2016)
 
Charles University | Information system of Charles University | http://www.cuni.cz/UKEN-329.html