Course, academic year 2023/2024
UNIX and work with genomic data - MB170C47
Title: UNIX and work with genomic data
Czech title: UNIX a práce s genomickými daty
Guaranteed by: Department of Zoology (31-170)
Faculty: Faculty of Science
Actual: from 2023
Semester: winter
E-Credits: 2
Examination process: winter s.:
Hours per week, examination: winter s.:0/3, C [DS]
Capacity: 28
Min. number of students: 10
4EU+: no
Virtual mobility / capacity: no
State of the course: taught
Language: English
Note: enabled for web enrollment
Guarantor: RNDr. Radka Reifová, Ph.D.
Teacher(s): Mgr. Václav Janoušek, Ph.D.
RNDr. Libor Mořkovský, Ph.D.
RNDr. Radka Reifová, Ph.D.
Mgr. Anastasija Sedláková, Ph.D.
Annotation -
Last update: Ing. Jindřiška Peterková (12.09.2023)
As the field of biology evolves, biologists increasingly require advanced computational skills and expanded computational resources. An essential tool in this domain is the Unix command line, which also facilitates remote access to more powerful computing platforms. Furthermore, tools like git are indispensable for the reproducibility of research, ensuring consistency and reliability in findings.


We present an updated course with a focus on remote computing and code reproducibility. Participants will gain sufficient skills and confidence in Unix-like environments to process and analyze their own genomic data. Besides plenty of hands-on exercises, we will also provide an overview of the computational environments used in academic as well as commercial bioinformatics setups.
Syllabus -
Last update: Mgr. Václav Janoušek, Ph.D. (15.09.2015)

I. Introduction to Unix - Learn about the Unix philosophy.

II. Basic Unix - Learn to use the basic commands and shell features (cd, ls, ll, mkdir, mv, cp, pwd, htop, screen, grep, less, head, tail, cat, cut, sort, uniq, paste, join, globbing, pipes).

III. Advanced Unix - Learn basics of awk, sed, regular expressions, shell scripting, shell variables, parallel, subshells.

IV. Introduction to Genomics - Learn how ‘genomes’ are made.

V. Data visualization - Learn how to format your data for effective visualization and how to use RStudio, tidyr, dplyr and ggplot2 to explore your data visually.

VI. Read quality assessment - Learn how to use Unix to explore FASTQ files, calculate some basic statistics, assess read quality, filter out low-quality reads.

VII. Genome assembly - Learn how to do a (small) genome assembly.

VIII. Variant calling - Learn how to use the original NGS reads and a genome assembly to call variants.

IX. Standard annotation formats - Learn how information on genes, variants and genome properties is stored (GFF, VCF, BED formats) and how to obtain summaries quickly (bedtools, vcftools, etc.).

X. A lot of practice.
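
To give a flavour of the practice in points II, III and VI, here is a minimal sketch of exploring a FASTQ file with a pipeline of basic tools. It is an illustration only, not course material: the file name, the three reads and the variable names are made up, and the code assumes a well-formed FASTQ file in which every record takes exactly four lines (no wrapped sequences).

```shell
#!/bin/sh
# Hypothetical example data: a tiny FASTQ file with three reads.
# Assumption: each record is exactly four lines (header, sequence, '+', qualities).
workdir=$(mktemp -d)
fastq="$workdir/reads.fastq"
cat > "$fastq" <<'EOF'
@read1
ACGTACGT
+
IIIIIIII
@read2
ACGT
+
IIII
@read3
ACGTACGT
+
IIIIIIII
EOF

# Count reads: the total number of lines divided by four.
n_reads=$(awk 'END { print NR / 4 }' "$fastq")

# Tabulate read lengths: take the sequence line of each record (line 2 of 4),
# print its length, then count how often each length occurs.
# The final awk turns the "count length" output of uniq -c into "length<TAB>count".
length_table=$(awk 'NR % 4 == 2 { print length($0) }' "$fastq" \
    | sort -n \
    | uniq -c \
    | awk '{ print $2 "\t" $1 }')

echo "reads: $n_reads"
echo "$length_table"

# Remove the temporary example data.
rm -rf "$workdir"
```

The same pipeline works on a real file by pointing `fastq` at it; for compressed data, the input could be streamed with `zcat` or `gzip -dc` instead of being read directly.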

 
Charles University | Information system of Charles University | http://www.cuni.cz/UKEN-329.html