Distributed job execution in IVIS Framework
| Field | Value |
|---|---|
| Thesis title (Czech) | Distribuované vykonávání jobu v IVIS Framework |
| Thesis title (English) | Distributed job execution in IVIS Framework |
| Keywords (Czech) | distribuovaný výpočet, cloud, zpracování dat, javascript |
| Keywords (English) | distributed computing, cloud, data processing, javascript |
| Academic year of announcement | 2022/2023 |
| Thesis type | bachelor's thesis |
| Thesis language | English |
| Department | Department of Distributed and Dependable Systems (32-KDSS) |
| Supervisor | prof. RNDr. Tomáš Bureš, Ph.D. |
| Author | hidden - assigned and confirmed by the Study Department |
| Date of registration | 29.09.2022 |
| Date of assignment | 02.10.2022 |
| Confirmed by the Study Department | 23.11.2022 |
| Date and time of defence | 07.09.2023 09:00 |
| Date of electronic submission | 14.07.2023 |
| Date of printed submission | 14.07.2023 |
| Date of defence | 07.09.2023 |
| Opponents | Mgr. Vojtěch Horký, Ph.D. |
Guidelines
IVIS is a web-based framework for creating data analytics and visualization web applications. The framework allows complex data processing through the mechanism of tasks and jobs. A task is a collection of Python scripts and metadata that specifies the parameters of the scripts. A job is an instantiation of the task with a concrete set of parameters and input datasets and with the specification of triggers that govern when the job is executed.
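The task/job relationship described above can be sketched as a minimal data model. This is an illustrative sketch only; the class names, fields, and the `validate` helper are assumptions for exposition, not the actual IVIS-core schema:

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List

@dataclass
class Task:
    """A task: a collection of Python scripts plus metadata
    that specifies the parameters those scripts accept."""
    name: str
    scripts: Dict[str, str]       # file name -> Python source code
    param_spec: Dict[str, type]   # declared parameter names and their types

@dataclass
class Job:
    """A job: an instantiation of a task with concrete parameters,
    input datasets, and triggers governing when it runs."""
    task: Task
    params: Dict[str, Any]
    input_datasets: List[str]
    triggers: List[str] = field(default_factory=list)  # e.g. cron expressions

    def validate(self) -> bool:
        """Check that the job supplies every parameter the task declares,
        with a value of the declared type."""
        return all(
            name in self.params and isinstance(self.params[name], expected)
            for name, expected in self.task.param_spec.items()
        )
```

A job that omits a declared parameter would then fail `validate()`, mirroring the idea that a job is only meaningful as a fully parameterized instance of its task.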
In its current form, a job may execute only on the host machine of the IVIS server. This fact manifests in almost every part of the job lifecycle implementation, from creation through scheduling to execution. This thesis aims to extend the task and job subsystems to allow external computation resources (individual machines and machine pools) to be used for job execution. The solution will require extending the IVIS-core server with:

1. a way to add external executor nodes,
2. a way to configure an executor node on a per-job basis,
3. a way to communicate securely with the executor nodes over the internet,
4. new logic for scheduling remotely executed jobs.

While executing, a job uses the IVIS Python package to coordinate with the IVIS-core server. This includes receiving input data, connecting to a storage and indexing solution (currently Elasticsearch), and sending requests to the IVIS server to perform side effects such as creating derived datasets and storing the job's state. Because data storage is centralized on the IVIS server host, remotely executing jobs will require a secure way to communicate with both the IVIS server and the storage and indexing server (Elasticsearch).

The solution should enable the user of the IVIS framework to run jobs on a particular machine (local or remote) as well as on a configured pool of automatically managed machines. The implementation aims to support local execution (the status quo), execution on a publicly available machine, and execution using a pool backed by a local computing cluster or a commercial cloud service provider. The solution will be validated using an artificial set of tasks, jobs, and datasets.
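The scheduling extension outlined above (local execution as the default, with optional per-job executor configuration) can be illustrated with a small sketch. None of these names come from IVIS-core; `Executor`, `pick_executor`, and the capacity bookkeeping are hypothetical placeholders for the per-job executor resolution the thesis calls for:

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional

class ExecutorType(Enum):
    LOCAL = "local"    # the status quo: run on the IVIS server host
    REMOTE = "remote"  # a single registered external machine
    POOL = "pool"      # an automatically managed pool of machines

@dataclass
class Executor:
    name: str
    type: ExecutorType
    capacity: int      # maximum number of concurrently running jobs
    running: int = 0   # jobs currently assigned to this executor

def pick_executor(requested: Optional[str], executors: List[Executor]) -> Executor:
    """Resolve a job's executor setting to a concrete executor.

    Falls back to local execution when the job does not request an
    executor, and refuses executors that are already at capacity.
    """
    name = requested if requested is not None else "local"
    for ex in executors:
        if ex.name == name:
            if ex.running >= ex.capacity:
                raise RuntimeError(f"executor {ex.name!r} is at capacity")
            return ex
    raise KeyError(f"unknown executor {name!r}")
```

In a real deployment the scheduler would also need the secure transport mentioned above (e.g. mutually authenticated TLS) so that the executor node can reach both the IVIS server and Elasticsearch; that part is orthogonal to the selection logic sketched here.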
Recommended literature
[1] https://github.com/smartarch/ivis-core |