Thesis (Selection of subject) (version: 368)
Thesis details
Thesis title in Czech: Integrace evolučních algoritmů a zpětnovazebního učení
Thesis title in English: Integration of Evolutionary algorithms and Reinforcement Learning
Key words: Evoluční algoritmy|Zpětnovazební učení|Evoluce|Učení|Hybridní algoritmy
English key words: Evolutionary algorithms|Reinforcement learning|Evolution|Learning|Hybrid algorithms
Academic year of topic announcement: 2023/2024
Thesis type: dissertation
Thesis language: Czech
Department: Ústav informatiky AV ČR, v.v.i. (32-UIAV)
Supervisor: Mgr. Roman Neruda, CSc.
Author: hidden - assigned and confirmed by the Study Dept.
Date of registration: 28.02.2024
Date of assignment: 28.02.2024
Confirmed by Study dept. on: 29.02.2024
Guidelines
Reinforcement learning (RL) is an important and complex area of machine learning research. Recent developments in gradient-based deep learning models have brought many successful applications that solve hard real-world problems. Gradient approaches utilize data intensively, yet they represent a local search procedure that is hard to parallelize. Evolutionary algorithms, on the other hand, are robust population-based global search algorithms with strong exploration, but they utilize data far less intensively.

The goal of the work is to explore the possibilities of integrating and hybridizing these two paradigms to design efficient RL algorithms, e.g. by the following four approaches. First, exploration procedures can make use of data-intensive gradient algorithms, applying them in new quality-diversity or novelty search methods based on gradient information. For example, evolutionary algorithms can inject deep learning RL actors into the population, or use a deep learning critic for efficient fitness prediction, sometimes called a surrogate fitness. The second approach, so-called evolutionary RL, can use evolutionary neural architecture search to optimize an actor's architecture and a gradient-based algorithm for its learning. Another promising area of research is training a critic that rewards not just performance but novelty as well, leading to better exploration. Finally, evolution strategies represent a sound optimization technique for the numerous hyper-parameters of RL algorithms, which are often quite problematic to fine-tune by hand.
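As an illustration, one hybridization pattern from the paragraph above — injecting gradient refinement into an evolutionary population — can be sketched on a toy problem. Everything here (the quadratic "reward", the analytic gradient, the population sizes) is a hypothetical stand-in, not part of the assignment; a real instantiation would replace the toy reward with episodic returns of a policy network.

```python
import random

random.seed(0)

DIM = 5
TARGET = [0.5] * DIM  # toy optimum standing in for an unknown optimal policy


def reward(theta):
    """Toy episodic return: peaks at 0 when theta matches TARGET."""
    return -sum((t, g) == (t, g) and (t - g) ** 2 for t, g in zip(theta, TARGET))


def gradient_step(theta, lr=0.2):
    """One analytic 'policy gradient' step on the toy reward."""
    return [t + lr * 2.0 * (g - t) for t, g in zip(theta, TARGET)]


def mutate(theta, sigma=0.1):
    """Gaussian mutation: the evolutionary exploration operator."""
    return [t + random.gauss(0.0, sigma) for t in theta]


def hybrid_es(pop_size=20, generations=50):
    pop = [[random.uniform(-1, 1) for _ in range(DIM)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=reward, reverse=True)
        elite = pop[: pop_size // 4]
        # gradient injection: refine the current best individual locally
        pop[0] = gradient_step(pop[0])
        # refill the rest by mutating elites (population-based global search)
        pop[1:] = [mutate(random.choice(elite)) for _ in range(pop_size - 1)]
    return max(pop, key=reward)


best = hybrid_es()
```

The division of labor mirrors the text: mutation supplies broad exploration, while the injected gradient step exploits local structure that mutation alone would reach slowly.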

We expect this endeavor to result in a proposal of original hybrid RL algorithms that will be implemented and tested on standard benchmark tasks.
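The hyper-parameter tuning approach mentioned in the guidelines can likewise be sketched as a minimal (1+λ) evolution strategy searching over the learning rate of a toy training run. The quadratic loss and all names below are hypothetical illustrations, not a prescribed implementation; in practice the inner call would launch a full RL training run.

```python
import random

random.seed(1)


def train_and_evaluate(lr, steps=30):
    """Toy 'training run': gradient descent on f(x) = x^2 from x = 5.
    Returns final reward (negative loss); diverges if lr is too large."""
    x = 5.0
    for _ in range(steps):
        x -= lr * 2.0 * x
    return -x * x


def one_plus_lambda_es(lam=8, generations=20, sigma=0.3):
    """(1+lambda)-ES over log10(learning rate): keep the parent unless
    some child of the current generation performs at least as well."""
    parent = -3.0  # log10(lr): start at lr = 1e-3
    parent_fit = train_and_evaluate(10 ** parent)
    for _ in range(generations):
        for _ in range(lam):
            child = parent + random.gauss(0.0, sigma)
            fit = train_and_evaluate(10 ** child)
            if fit >= parent_fit:
                parent, parent_fit = child, fit
    return 10 ** parent, parent_fit


lr, fit = one_plus_lambda_es()
```

Searching in log-space is a common choice for learning rates, since useful values span several orders of magnitude; the (1+λ) acceptance rule guarantees the tuned configuration is never worse than the starting one.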
References
[1] Sutton, R. S., & Barto, A. G. (2018). Reinforcement learning: An introduction (2nd ed.). The MIT Press.

[2] De Jong, K. A. (2016). Evolutionary Computation: A Unified Approach. MIT Press, Cambridge, MA, USA.

[3] Sigaud, O., & Stulp, F. (2019). Policy search in continuous action domains: An overview. Neural Networks, 113. https://doi.org/10.1016/j.neunet.2019.01.011

[4] Majid, A., Saaybi, S., Rietbergen, T., Francois, V., Prasad, V., & Verhoeven, C. (2021). Deep reinforcement learning versus evolution strategies: A comparative survey. TechRxiv. https://doi.org/10.36227/techrxiv.14679504.v2

[5] Sigaud, O. (2022). Combining evolution and deep reinforcement learning for policy search: A survey. ACM Transactions on Evolutionary Learning and Optimization. https://doi.org/10.1145/3569096

[6] Qian, H., & Yu, Y. (2021). Derivative-free reinforcement learning: A review. Frontiers of Computer Science, 15, 156336. https://doi.org/10.1007/s11704-020-0241-4
Charles University | Information system of Charles University | http://www.cuni.cz/UKEN-329.html