Thesis (Selection of subject) (version: 368)
Thesis details
Distributed job execution in IVIS Framework
Thesis title in Czech: Distribuované vykonávání jobu v IVIS Framework
Thesis title in English: Distributed job execution in IVIS Framework
Key words: distribuovaný výpočet|cloud|zpracování dat|javascript
English key words: distributed computing|cloud|data processing|javascript
Academic year of topic announcement: 2022/2023
Thesis type: Bachelor's thesis
Thesis language: English
Department: Department of Distributed and Dependable Systems (32-KDSS)
Supervisor: prof. RNDr. Tomáš Bureš, Ph.D.
Author: hidden - assigned and confirmed by the Study Dept.
Date of registration: 29.09.2022
Date of assignment: 02.10.2022
Confirmed by Study dept. on: 23.11.2022
Date and time of defence: 07.09.2023 09:00
Date of electronic submission: 14.07.2023
Date of submission of printed version: 14.07.2023
Date of proceeded defence: 07.09.2023
Opponents: Mgr. Vojtěch Horký, Ph.D.
 
 
 
Guidelines
IVIS is a web-based framework for creating data analytics and visualization web applications. The framework allows complex data processing through the mechanism of tasks and jobs. A task is a collection of Python scripts and metadata that specifies the parameters of the scripts. A job is an instantiation of the task with a concrete set of parameters and input datasets and with the specification of triggers that govern when the job is executed.
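
The relationship between a task and a job can be illustrated with a minimal Python sketch. All class and field names here (`Task`, `Job`, `param_spec`, `triggers`, the example values) are hypothetical and chosen for illustration only; they are not taken from the IVIS-core data model.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    """A collection of scripts plus metadata describing their parameters."""
    name: str
    scripts: list[str]
    param_spec: dict[str, type]  # hypothetical: parameter name -> expected type


@dataclass
class Job:
    """A task instantiated with concrete parameters, inputs, and triggers."""
    task: Task
    params: dict[str, object]
    input_datasets: list[str]
    triggers: list[str] = field(default_factory=list)

    def validate(self) -> None:
        """Check that the concrete parameters match the task's specification."""
        missing = set(self.task.param_spec) - set(self.params)
        if missing:
            raise ValueError(f"missing parameters: {sorted(missing)}")
        for key, expected in self.task.param_spec.items():
            if not isinstance(self.params[key], expected):
                raise TypeError(f"parameter {key!r} must be {expected.__name__}")


# One task can back many jobs that differ only in parameters, inputs, or triggers.
task = Task("moving_average", ["main.py"], {"window": int})
job = Job(task, {"window": 5}, ["sensor_readings"], triggers=["every 60 s"])
job.validate()  # raises if the parameters do not fit the specification
```

Keeping the parameter specification on the task and the concrete values on the job mirrors the split described above: the task says what may be configured, the job says how it is configured for one run.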

In its current form, the execution of a job may utilize only the host machine of the IVIS server. This assumption is embedded in almost every part of the job lifecycle implementation, from creation through scheduling to execution. This thesis aims to extend the task and job subsystems to allow for external computation resources (individual machines and machine pools) for job execution.

The solution will require extending the IVIS-core server with:

1. a way to add external executor nodes
2. a way to configure an executor node on a per-job basis
3. a way to securely communicate with the executor nodes over the internet
4. new logic for scheduling remotely executing jobs
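
How the four extensions listed above might fit together can be sketched as follows. This is an illustration under stated assumptions, not the design of the thesis: the names `Executor`, `ExecutorRegistry`, `assign`, and `schedule` are invented, the host name is a placeholder, and secure communication is reduced to a single `uses_tls` flag that is checked before a job is dispatched.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Executor:
    name: str
    address: str           # hypothetical: host the server would contact
    uses_tls: bool = True  # stand-in for "secure communication" (item 3)


# The status quo: jobs run on the server's own host.
LOCAL = Executor("local", "localhost", uses_tls=False)


class ExecutorRegistry:
    def __init__(self) -> None:
        self._executors: dict[str, Executor] = {LOCAL.name: LOCAL}
        self._job_executor: dict[str, str] = {}

    def add_executor(self, executor: Executor) -> None:
        """Item 1: register an external executor node."""
        if executor.name in self._executors:
            raise ValueError(f"executor {executor.name!r} already exists")
        self._executors[executor.name] = executor

    def assign(self, job_id: str, executor_name: str) -> None:
        """Item 2: configure the executor on a per-job basis."""
        if executor_name not in self._executors:
            raise KeyError(f"unknown executor {executor_name!r}")
        self._job_executor[job_id] = executor_name

    def schedule(self, job_id: str) -> Executor:
        """Item 4: decide where the job runs; unassigned jobs stay local."""
        name = self._job_executor.get(job_id, LOCAL.name)
        executor = self._executors[name]
        # Item 3: refuse to dispatch to a remote node over an insecure channel.
        if executor.name != LOCAL.name and not executor.uses_tls:
            raise PermissionError(f"executor {name!r} is remote but not secured")
        return executor


registry = ExecutorRegistry()
registry.add_executor(Executor("remote-1", "executor.example.org"))
registry.assign("job-42", "remote-1")
target = registry.schedule("job-42")      # the registered remote node
fallback = registry.schedule("job-7")     # no assignment, so the local host
```

Defaulting unassigned jobs to the local executor keeps existing jobs working unchanged, which matches the requirement that local execution remain supported.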

While executing, a job utilizes the IVIS Python package to coordinate with the IVIS-core server. This includes the reception of input data, connection to a storage and indexing solution (currently Elasticsearch), and sending requests to the IVIS server to allow side effects such as creating derived datasets and storage of the state of the job. Because the data storage is centralized on the IVIS server host, remotely executing jobs will require a secure way to communicate with the IVIS server and the storage and indexing server (Elasticsearch).

The solution should enable the user of the IVIS framework to run jobs on a particular machine (local or remote) as well as on a configured pool of automatically managed machines. The implementation aims to support local execution (the status quo), execution on a publicly available machine, and execution using a pool backed by a local computing cluster or a commercial cloud service provider.
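
The three kinds of execution target named above can be modelled behind one interface, so that the scheduler does not need to know which kind it is talking to. The sketch below is hypothetical throughout: the class names and host names are invented, and the pool is simplified to round-robin over a fixed list of hosts, whereas an automatically managed pool would also start and release machines on demand.

```python
from abc import ABC, abstractmethod
from itertools import cycle


class ExecutionTarget(ABC):
    """Common interface: answer which host should run the next job."""

    @abstractmethod
    def pick_host(self) -> str: ...


class LocalTarget(ExecutionTarget):
    """Local execution on the server host (the status quo)."""

    def pick_host(self) -> str:
        return "localhost"


class RemoteMachineTarget(ExecutionTarget):
    """A single, explicitly configured remote machine."""

    def __init__(self, host: str) -> None:
        self.host = host

    def pick_host(self) -> str:
        return self.host


class PoolTarget(ExecutionTarget):
    """A pool of machines; here simplified to round-robin over fixed hosts."""

    def __init__(self, hosts: list[str]) -> None:
        if not hosts:
            raise ValueError("a pool needs at least one host")
        self._hosts = cycle(hosts)

    def pick_host(self) -> str:
        return next(self._hosts)


pool = PoolTarget(["node-a.example.org", "node-b.example.org"])
picked = [pool.pick_host() for _ in range(3)]
# picked == ["node-a.example.org", "node-b.example.org", "node-a.example.org"]
```

With this shape, supporting a cluster or a commercial cloud provider would mean adding another `ExecutionTarget` subclass rather than changing the scheduling code.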

The solution will be validated using an artificial set of tasks, jobs, and datasets.
References
[1] https://github.com/smartarch/ivis-core
 
Charles University | Information system of Charles University | http://www.cuni.cz/UKEN-329.html