Course, academic year 2023/2024
Advanced Programming in Parallel Environment - NPRG058
Title: Pokročilé programování v paralelním prostředí
Guaranteed by: Department of Software Engineering (32-KSI)
Faculty: Faculty of Mathematics and Physics
Actual: from 2020 to 2023
Semester: winter
E-Credits: 6
Hours per week, examination: winter s.:2/2, C+Ex [HT]
Capacity: unlimited
Min. number of students: unlimited
4EU+: no
Virtual mobility / capacity: no
State of the course: taught
Language: Czech, English
Teaching methods: full-time
Guarantor: doc. RNDr. Martin Kruliš, Ph.D.
RNDr. Jakub Yaghob, Ph.D.
Pre-requisite: NPRG042
Annotation -
A practical seminar that continues the Programming in Parallel Environment lectures, focusing on more advanced aspects of parallel programming. The main objective is to give students hands-on experience with more complicated problems in programming multiprocessor NUMA servers and in employing additional parallel devices, especially GPGPUs (CUDA) and Xeon Phi devices. The students will be given several problems, which will be analyzed during the lectures and implemented by the students as home assignments. The results will be verified and subjected to a collective discussion.
Last update: T_KSI (27.04.2015)
Literature -

James Reinders: Intel Threading Building Blocks: Outfitting C++ for Multi-core Processor Parallelism, O'Reilly

Benedict Gaster, Lee Howes, David R. Kaeli, Perhaad Mistry, Dana Schaa: Heterogeneous Computing with OpenCL, Morgan Kaufmann; 2nd edition (November 27, 2012)

Shane Cook: CUDA Programming: A Developer's Guide to Parallel Computing with GPUs (Applications of GPU Computing Series)

OpenCL - Online Manual

CUDA Online Documentation

Last update: T_KSI (01.05.2013)
Syllabus -

The seminar will present the following problems:

  • Task scheduling on multicore CPUs and NUMA systems
  • Synchronization on multi-core CPUs and multiprocessor systems
  • Efficiency of data transfers between additional devices and host memory
  • Load balancing between CPU and additional accelerators
  • Transforming problems into data parallel tasks and their mapping to GPUs
  • Shared memory access, cache-aware programming, and atomic operations on GPU
  • Solving irregular workloads on GPUs (persistent threads, dynamic parallelism)
  • Xeon Phi devices and the most important differences between Intel MIC and GPU architectures

Last update: T_KSI (27.04.2015)