Thesis (Selection of subject)

Online training of deep neural networks for classification

Thesis title in Czech:	Online trénování hlubokých neuronových sítí pro klasifikaci
Thesis title in English:	Online training of deep neural networks for classification
Key words:	neuronové sítě, varianční autoenkodér, online učení, klasifikace
English key words:	neural network, variational autoencoder, online learning, classification
Academic year of topic announcement:	2017/2018
Thesis type:	diploma thesis
Thesis language:	English
Department:	Department of Theoretical Computer Science and Mathematical Logic (32-KTIML)
Supervisor:	prof. RNDr. Ing. Martin Holeňa, CSc.
Author:	Mgr. Jiří Tumpach - assigned and confirmed by the Study Dept.
Date of registration:	08.05.2018
Date of assignment:	08.05.2018
Confirmed by Study dept. on:	18.07.2019
Date and time of defence:	16.09.2019 09:00
Date of electronic submission:	18.07.2019
Date of submission of printed version:	19.07.2019
Date of proceeded defence:	16.09.2019
Opponents:	Jakub Kořenek

Guidelines

Online training refers to the training of machine learning models when their training data is updated online. In such situations, the probability distribution of the training data is typically non-stationary and evolves over time. Online training is used with both main kinds of supervised machine learning, classification and regression, as well as in the context of unsupervised learning. Because the training data is updated online and evolves, online learning has to deal with several specific problems, such as possible concept shift, the influence of the time horizon, the possibility of forgetting, and the influence of the speed of forgetting.
In this decade, deep neural networks are probably the most popular and fastest-developing kind of machine learning model. For them, however, research into online learning, especially supervised online learning, and into the above-mentioned problems is only beginning. This makes the proposed topic very timely, particularly in areas that provide evolving data in the large amounts needed for deep learning, such as web content analysis, network intrusion detection, or malware detection. The last-mentioned area is the intended application domain of the proposed research.

References

See http://www.cs.cas.cz/~martin/diplomka50.html

Preliminary scope of work in English

Online training refers to the training of machine learning models when their training data is updated online. In such situations, the probability distribution of the training data is typically non-stationary and evolves over time. Online training has been used with both main kinds of supervised machine learning, classification and regression, as well as in the context of unsupervised learning. Because the training data is updated online and evolves, online training has to tackle several specific problems, such as possible concept shift, the influence of the time horizon, the possibility of forgetting, and the influence of the speed of forgetting.
In this decade, deep neural networks are probably the most popular and fastest-developing kind of machine learning model. For them, however, research into online learning, especially supervised online learning, and into dealing with the above-mentioned problems is only starting. This makes the proposed topic very timely, particularly in areas that provide evolving data in the large amounts needed for deep learning, such as web content analysis, network intrusion detection, or malware detection. The last-mentioned area is the intended application domain of the proposed research.