Evolution strategies for policy optimization in transformers
Thesis title in Czech: | Evoluční strategie pro optimalizaci policy v transformerech |
---|---|
Thesis title in English: | Evolution strategies for policy optimization in transformers |
Key words: | Evoluční strategie|Transformery|Optimalizace policy|Novelty |
English key words: | Evolution strategies|Transformers|Policy optimization|Novelty |
Academic year of topic announcement: | 2022/2023 |
Thesis type: | diploma thesis |
Thesis language: | English |
Department: | Department of Theoretical Computer Science and Mathematical Logic (32-KTIML) |
Supervisor: | Mgr. Roman Neruda, CSc. |
Author: | hidden - assigned and confirmed by the Study Dept. |
Date of registration: | 12.04.2023 |
Date of assignment: | 12.04.2023 |
Confirmed by Study dept. on: | 07.12.2023 |
Date and time of defence: | 13.02.2024 09:00 |
Date of electronic submission: | 11.01.2024 |
Date of submission of printed version: | 11.01.2024 |
Date of proceeded defence: | 13.02.2024 |
Opponents: | Mgr. Martin Pilát, Ph.D. |
Guidelines |
Evolution strategies have proven successful in policy optimization as an alternative to deep reinforcement learning on traditional network architectures, owing to their scalability and ease of parallelization. Moreover, evolution strategies can employ more effective exploration approaches, such as novelty search or quality diversity.
The goal of this work is to explore the performance of these strategies on modern transformer architectures. The student will implement evolution strategies and test their ability to improve the parameters of self-attention layers in transformer networks. This approach will be experimentally validated on standard benchmark tasks, such as a selection of Atari games or the MuJoCo Humanoid environment. |
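As a rough illustration of the kind of evolution strategy the guidelines refer to, the following is a minimal sketch of an isotropic Gaussian evolution strategy with antithetic (mirrored) sampling, written in plain Python. The function names `evolution_strategy` and `toy_fitness`, the hyperparameter values, and the toy quadratic objective are assumptions made for illustration only and are not taken from the thesis. In the setting described above, the fitness would instead be the episode return of a transformer policy whose parameters are flattened into the vector `theta`.

```python
import random


def evolution_strategy(fitness, theta, sigma=0.1, lr=0.05, pop=50, iters=200, seed=0):
    """Maximize `fitness` with a basic Gaussian evolution strategy.

    Hypothetical helper for illustration: perturbs the parameter vector with
    Gaussian noise, evaluates mirrored pairs, and follows the resulting
    search-gradient estimate.
    """
    rng = random.Random(seed)
    theta = list(theta)
    n = len(theta)
    for _ in range(iters):
        grad = [0.0] * n
        for _ in range(pop):
            # Sample one perturbation direction and evaluate both mirrored points.
            eps = [rng.gauss(0.0, 1.0) for _ in range(n)]
            f_plus = fitness([t + sigma * e for t, e in zip(theta, eps)])
            f_minus = fitness([t - sigma * e for t, e in zip(theta, eps)])
            diff = f_plus - f_minus
            for i in range(n):
                grad[i] += diff * eps[i]
        # Antithetic estimate of the gradient: sum / (2 * pop * sigma).
        scale = lr / (2.0 * pop * sigma)
        theta = [t + scale * g for t, g in zip(theta, grad)]
    return theta


def toy_fitness(params):
    """Toy objective (assumption): negative squared distance to a fixed target."""
    target = [1.0, -2.0, 0.5]
    return -sum((p - t) ** 2 for p, t in zip(params, target))


if __name__ == "__main__":
    print(evolution_strategy(toy_fitness, [0.0, 0.0, 0.0]))
```

Each fitness evaluation in the inner loop is independent of the others, which is what makes this family of methods easy to parallelize. A novelty-seeking variant would, under similar assumptions, replace or mix `fitness` with a measure of behavioral distance to previously seen policies.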
References |
[1] Pagliuca P, Milano N and Nolfi S (2020) Efficacy of Modern Neuro-Evolutionary Strategies for Continuous Control Optimization. Front. Robot. AI 7:98. doi: 10.3389/frobt.2020.00098
[2] Chen L, Lu K, et al (2021) Decision Transformer: Reinforcement Learning via Sequence Modeling. arXiv preprint arXiv:2106.01345. https://doi.org/10.48550/arXiv.2106.01345
[3] Conti E, Madhavan V, et al (2018) Improving exploration in evolution strategies for deep reinforcement learning via a population of novelty-seeking agents. In Proc. of the 32nd Int. Conf. on Neural Information Processing Systems (NIPS'18). Curran Associates Inc., Red Hook, NY, USA, 5032–5043. https://dl.acm.org/doi/10.5555/3327345.3327410 |