Thesis (Selection of subject)

Evolution strategies for policy optimization in transformers

Thesis title in Czech:	Evoluční strategie pro optimalizaci policy v transformerech
Thesis title in English:	Evolution strategies for policy optimization in transformers
Key words:	Evoluční strategie|Transformery|Optimalizace policy|Novelty
English key words:	Evolution strategies|Transformers|Policy optimization|Novelty
Academic year of topic announcement:	2022/2023
Thesis type:	diploma thesis
Thesis language:	English
Department:	Department of Theoretical Computer Science and Mathematical Logic (32-KTIML)
Supervisor:	Mgr. Roman Neruda, CSc.
Author:	hidden - assigned and confirmed by the Study Dept.
Date of registration:	12.04.2023
Date of assignment:	12.04.2023
Confirmed by Study dept. on:	07.12.2023
Date and time of defence:	13.02.2024 09:00
Date of electronic submission:	11.01.2024
Date of submission of printed version:	11.01.2024
Date of completed defence:	13.02.2024
Opponents:	Mgr. Martin Pilát, Ph.D.

Guidelines

Evolution strategies have proven successful in policy optimization as an alternative to deep reinforcement learning on traditional network architectures, owing to their scalability and ease of parallelization. Moreover, evolution strategies can employ stronger exploration approaches, such as novelty search or quality diversity.

The goal of this work is to explore the performance of these strategies on modern transformer architectures. The student will implement evolution strategies and test their ability to improve the parameters of self-attention layers in transformer networks. This approach will be experimentally validated on standard benchmark tasks, such as a selection of Atari games or the MuJoCo Humanoid task.
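
For orientation, a single evolution-strategy update can be sketched as follows. This is a minimal illustration under stated assumptions, not the method prescribed by the assignment: it uses one common variant (mirrored Gaussian perturbations with rank-based fitness shaping), the parameter vector `theta` stands in for the flattened weights of a self-attention layer, and `toy_fitness` is a hypothetical placeholder for an episode return from a benchmark environment. All function names and hyperparameter values are chosen here for illustration.

```python
import numpy as np


def centered_ranks(x):
    """Map values to evenly spaced ranks in [-0.5, 0.5] (higher value -> higher rank)."""
    ranks = np.empty(len(x), dtype=np.float64)
    ranks[np.argsort(x)] = np.arange(len(x))
    return ranks / (len(x) - 1) - 0.5


def es_step(theta, fitness_fn, rng, pop_size=50, sigma=0.1, lr=0.05):
    """One ES update using mirrored (antithetic) Gaussian perturbations."""
    half = pop_size // 2
    eps = rng.standard_normal((half, theta.size))
    eps = np.concatenate([eps, -eps])            # each perturbation and its mirror
    # In a real setting each evaluation is an independent rollout,
    # which is what makes the method easy to parallelize.
    fitness = np.array([fitness_fn(theta + sigma * e) for e in eps])
    weights = centered_ranks(fitness)            # rank-based fitness shaping
    grad = weights @ eps / (len(eps) * sigma)    # search-gradient estimate
    return theta + lr * grad                     # ascent step on fitness


def toy_fitness(params):
    """Hypothetical stand-in for an episode return; maximal at params == 1."""
    return -float(np.sum((params - 1.0) ** 2))


def run_demo(steps=300, seed=0):
    """Optimize a 5-dimensional parameter vector starting from zeros."""
    rng = np.random.default_rng(seed)
    theta = np.zeros(5)
    for _ in range(steps):
        theta = es_step(theta, toy_fitness, rng)
    return theta


if __name__ == "__main__":
    final = run_demo()
    print("final parameters:", final)
    print("final fitness:", toy_fitness(final))
```

In the setting described above, `fitness_fn` would instead load the perturbed vector into the transformer policy, run it in the environment, and return the episode reward. A novelty-seeking variant would replace or mix this fitness with a distance between behaviour descriptors, which this sketch does not cover.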

References

[1] Pagliuca P, Milano N and Nolfi S (2020) Efficacy of Modern Neuro-Evolutionary Strategies for Continuous Control Optimization. Front. Robot. AI 7:98. doi: 10.3389/frobt.2020.00098

[2] Chen L, Lu K, et al (2021) Decision Transformer: Reinforcement Learning via Sequence Modeling. arXiv preprint arXiv:2106.01345. https://doi.org/10.48550/arXiv.2106.01345

[3] Conti E, Madhavan V, et al (2018) Improving exploration in evolution strategies for deep reinforcement learning via a population of novelty-seeking agents. In Proc. of the 32nd Int. Conf. on Neural Information Processing Systems (NIPS'18). Curran Associates Inc., Red Hook, NY, USA, 5032–5043. https://dl.acm.org/doi/10.5555/3327345.3327410