Thesis (Selection of subject) (version: 368)
Thesis details
Linux Kernel Live Dump
Thesis title in Czech: Dump z běžícího linuxového jádra
Thesis title in English: Linux Kernel Live Dump
Key words: crashdump|Linux|debugování|postcopy
English key words: crashdump|Linux|debugging|postcopy
Academic year of topic announcement: 2022/2023
Thesis type: diploma thesis
Thesis language: English
Department: Department of Distributed and Dependable Systems (32-KDSS)
Supervisor: Mgr. Michal Koutný
Author: hidden - assigned and confirmed by the Study Dept.
Date of registration: 07.03.2023
Date of assignment: 08.03.2023
Confirmed by Study dept. on: 24.10.2023
Opponents: Mgr. Martin Děcký, Ph.D.
Advisors: prof. Ing. Petr Tůma, Dr.
Guidelines
The Linux kernel can dump the state of the system when a fatal error is encountered and the system can no longer execute. The dump provides exhaustive information that helps determine the likely cause of the crash.

The goal of the thesis is to extend the dump functionality to "live dumps", that is, the ability to dump the state at arbitrary moments during system execution. Such dumps can be useful when troubleshooting non-fatal issues, where crashes do not happen or where reboots are too intrusive.

Live dumps are already possible for guest kernels when coordinated by a hypervisor. However, the thesis focuses primarily on bare-metal setups and on setups with confidential virtual machines, where live dumps are not available.

An ideal dump should contain a snapshot of CPU and memory state that corresponds to what a simultaneous interrupt observes during a classic kernel panic, but without stopping all CPUs for the duration of the dump (which may not be feasible on production systems, given realistic memory sizes and target device throughputs). The thesis should not deviate from this goal in a way that would interfere with the debugging purpose of the dump. Additionally, the dumping mechanism could serve as a vehicle for live migration (with a focus on confidential VMs).
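To make the feasibility concern concrete, the following back-of-the-envelope sketch estimates how long all CPUs would have to stay stopped if the whole memory were written out in one go. The memory size and throughput are hypothetical figures chosen for illustration, and `stop_time_seconds` is an illustrative helper, not part of any kernel interface.

```python
GIB = 1024 ** 3


def stop_time_seconds(memory_bytes: int, throughput_bytes_per_s: float) -> float:
    """Time needed to write all of memory to the dump target at a sustained rate."""
    return memory_bytes / throughput_bytes_per_s


# Hypothetical machine: 512 GiB of RAM, dump target sustaining 2 GiB/s.
pause = stop_time_seconds(512 * GIB, 2 * GIB)
print(f"{pause:.0f} s")  # 256 s
```

Under these assumed numbers, a stop-the-world dump would pause the system for more than four minutes, which illustrates why the dump should proceed while the CPUs keep running.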

Possible approaches are similar to the dumps taken during live VM migration (under a hypervisor) or to a modified suspend to disk. These ideas should not be limiting; a broader analysis is also expected.
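As an illustration of the consistency property such approaches aim for, the toy model below takes a snapshot page by page while a workload keeps writing. It uses copy-on-write, which is a different technique from post-copy migration or suspend to disk, and it is shown here only as an assumption-laden sketch: the class `LiveDump` and its methods are hypothetical names, and "pages" are plain list elements rather than real memory.

```python
class LiveDump:
    """Toy copy-on-write snapshot over a list of 'pages' (hypothetical model)."""

    def __init__(self, memory):
        self.memory = memory   # live pages, still modified by the workload
        self.saved = {}        # page index -> content at snapshot time
        self.next_page = 0     # index of the next page the dumper will emit

    def write(self, index, value):
        """Workload write: keep the old content if the page is not dumped yet."""
        if index >= self.next_page and index not in self.saved:
            self.saved[index] = self.memory[index]
        self.memory[index] = value

    def dump_one(self, out):
        """Dumper step: emit the snapshot-time content of the next page."""
        i = self.next_page
        out.append(self.saved.pop(i, self.memory[i]))
        self.next_page += 1

    def done(self):
        return self.next_page == len(self.memory)


memory = ["a", "b", "c", "d"]
dump = LiveDump(memory)
out = []

dump.dump_one(out)     # page 0 is dumped first
dump.write(0, "A")     # already dumped, nothing to preserve
dump.write(2, "C")     # not dumped yet, original "c" is preserved
dump.write(2, "CC")    # second write must not overwrite the preserved copy
while not dump.done():
    dump.dump_one(out)

print(out)     # ['a', 'b', 'c', 'd']
print(memory)  # ['A', 'b', 'CC', 'd']
```

The dump matches the memory content at the moment the snapshot started, even though the workload was never paused. A real kernel implementation would additionally have to capture CPU state and intercept writes at the page-table level, which this sketch does not attempt.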

The implementation (but not necessarily the analysis) should target the mainline Linux kernel on x86_64. The implementation should be accompanied by an evaluation of the impact of the dump on common workloads and of the restrictions it imposes on them.
References
[1] https://www.kernel.org/doc/html/latest/admin-guide/kdump/kdump.html
[2] https://lwn.net/Kernel/Index/#Crash_dumps
[3] M. Hines, U. Deshpande, and K. Gopalan. Post-copy live migration of virtual machines. In SIGOPS Operating Systems Review, July 2009.
[4] https://research.ibm.com/publications/secure-live-migration-of-encrypted-vms
[5] Linux Kernel Mailing List archive, https://lore.kernel.org/lkml/
 
Charles University | Information system of Charles University | http://www.cuni.cz/UKEN-329.html