Title: Data analysis in R and Python
Czech title: Analýza dat v prostředí R a Python
Language: English, Czech
Guarantor: prof. Mgr. Vojtěch Janoušek, Ph.D.
doc. Mgr. Ondrej Lexa, Ph.D.
Teacher(s): prof. Mgr. Vojtěch Janoušek, Ph.D.
doc. Mgr. Ondrej Lexa, Ph.D.
Annotation -
The course is taught in English when at least one international student is enrolled. This practical course is aimed at senior undergraduate and postgraduate students. It is intended to: a) explain the fundamentals of data processing and visualization in geology, as well as how computing algorithms work in general; b) present the basics of the R and Python programming languages; c) illustrate the usability and versatility of both languages for everyday calculations and for producing publication-quality graphics; d) demonstrate the use of both languages in reproducible research (with an emphasis on structural geology and whole-rock geochemistry).
Literature -
Learning materials (only for students):

Web links:

de Vries A: Using R with Jupyter Notebooks

Jupyter: Open source, interactive data science and scientific computing across over 40 programming languages

The R Project for Statistical Computing

Dive into Python 3

Scientific Python Lecture Notes

Wikipedia: R (programming language)

Wikipedia: Python (programming language)



Requirements for the exam -
The examination is a practical test in which participants are required to write several short programs in the R and Python programming languages.

Syllabus -
1. Introduction to data analysis and algorithmization I. [OL]

  • Reproducible research
  • Data Analysis in Earth Sciences
  • Why Python?
  • Let’s install our scientific computing environment

2. Introduction to data analysis and algorithmization II. [VJ]

  • Why R? – a bit of history and its current upswing
  • How does a computer program work?
  • Fundamental data types, algorithmization, typical parts of a computer program, object-oriented programming

3. Fundamentals of the Python language I. [OL]

Introduction to Jupyter Notebooks and JupyterLab

Python crash course, basics of Python programming

  • Variables and simple data types
  • Advanced data types
  • Built-in functions and operators
  • Blocks and loops
  • User-defined functions
  • Errors and exceptions

4. Fundamentals of the Python language II. [OL]

Scientific Python


  • Introduction to NumPy
  • Visualizations with Matplotlib and Seaborn
  • Data input and output

5. Fundamentals of the R language I. [VJ]


Introduction, fundamental data types and basic operations with them

  • Interactive/batch mode
  • Help and documentation
  • Main data types, attributes
  • Vectors
  • Matrices and arrays
  • Factors
  • Lists

6. Fundamentals of the R language II. [VJ]

Programming and graphics

  • Data import and output from/to files
  • Graphical functions and their main parameters
  • Printing and exporting graphics (PDF, PostScript…)
  • Programming in R – conditional execution, loops, user-defined functions
  • R community, CRAN, mailing lists, useR! conferences
  • Expanding R by additional packages (libraries)

7. Python applications I. [OL]

Calculations and statistics

  • Advanced NumPy and SciPy
  • Data analysis and manipulation with Pandas

8. Python applications II. [OL]

Directional statistics

  • Basics of directional statistics in 2D and 3D
  • Advanced analyses of 3D orientation data – APSG

9. R applications I. [VJ]

Calculations and statistics

  • Simple geochemical recalculations
  • On usefulness of matrices
  • Descriptive statistics in R
  • Working with large and complex datasets

10. R applications II. [VJ]

  • Graphics in R – examples from whole-rock geochemistry
  • Binary diagrams and Harker plots
  • Ternary diagrams
  • Spiderplots
  • Calculating simple petrogenetic models, including graphical output
