Optimizing memory layouts of data structures designed for parallel systems
| Thesis title in Czech | Optimalizace paměťových rozložení datových struktur určených pro paralelní systémy |
|---|---|
| Thesis title in English | Optimizing memory layouts of data structures designed for parallel systems |
| Key words | paralelní, systém, paměť, GPU, optimalizace, rozložení |
| English key words | parallel, system, memory, GPU, optimization, layout |
| Academic year of topic announcement | 2021/2022 |
| Thesis type | dissertation |
| Thesis language | English |
| Department | Department of Distributed and Dependable Systems (32-KDSS) |
| Supervisor | doc. RNDr. Martin Kruliš, Ph.D. |
| Author | hidden - assigned and confirmed by the Study Dept. |
| Date of registration | 19.09.2022 |
| Date of assignment | 19.09.2022 |
| Confirmed by Study dept. on | 21.09.2022 |
Guidelines
In the past two decades, mainstream hardware has experienced a significant shift towards parallelism. Multicore CPUs, as well as manycore GPUs, are present in commodity PCs and laptops, servers, and specialized computing devices (e.g., in the IoT domain). Unfortunately, most contemporary applications and algorithms are not ready for such a radical change and do not utilize this hardware to its full potential.
The main objective of this thesis is to investigate the impact of the organization of data in memory on the performance of parallel algorithms. This involves designing cache-aware and cache-oblivious data structures, devising methods for reducing write collisions or memory-bank conflicts, and planning data transfers between memory spaces (e.g., host-device memory or global-shared memory) so that they overlap with computation. Selecting an optimal layout is often elusive, so it requires experimental evaluation or even dynamic auto-tuning. These in turn require the design of seamless layout transformation techniques, performance metrics, layout cost models, and possibly machine-learning methods for selecting the best layout. The outlined research will improve the state of knowledge of methods for designing parallel applications and spawn new semi-automatic methods for designing and implementing data structures for high-performance computing.
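To illustrate the kind of layout transformation the topic refers to, the following is a minimal CPU-side sketch, not part of the thesis assignment: it converts a hypothetical array-of-structures (AoS) particle record into a structure-of-arrays (SoA) layout, in which each field is contiguous in memory. On GPUs and SIMD CPUs this typically lets neighbouring threads or lanes access consecutive addresses, and a kernel that reads only one field does not pull the other fields into cache. All names (`ParticleAoS`, `to_soa`, `shift_x`) are invented for this example.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical particle record in array-of-structures (AoS) form:
// the fields of each particle are interleaved in memory.
struct ParticleAoS {
    float x, y, z;   // position components
    float mass;
};

// Structure-of-arrays (SoA) form: each field is stored in its own
// contiguous array, which favours coalesced/vectorized access.
struct ParticlesSoA {
    std::vector<float> x, y, z, mass;
};

// A simple "seamless" layout transformation: AoS -> SoA.
ParticlesSoA to_soa(const std::vector<ParticleAoS>& aos) {
    ParticlesSoA soa;
    soa.x.reserve(aos.size());
    soa.y.reserve(aos.size());
    soa.z.reserve(aos.size());
    soa.mass.reserve(aos.size());
    for (const ParticleAoS& p : aos) {
        soa.x.push_back(p.x);
        soa.y.push_back(p.y);
        soa.z.push_back(p.z);
        soa.mass.push_back(p.mass);
    }
    return soa;
}

// Example "kernel": shift all x coordinates. In the SoA layout this
// is one linear sweep over a single contiguous array, touching
// neither y, z, nor mass.
void shift_x(ParticlesSoA& soa, float dx) {
    for (float& v : soa.x) v += dx;
}
```

The same transformation run in the opposite direction (SoA back to AoS) would be needed when interfacing with code that expects the record layout; an auto-tuner as envisioned in the thesis would pick between such layouts based on measured performance or a cost model.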
References