Evolution strategies for policy optimization in transformers
Thesis title in Czech: | Evoluční strategie pro optimalizaci policy v transformerech |
---|---|
Title in English: | Evolution strategies for policy optimization in transformers |
Keywords (Czech): | Evoluční strategie|Transformery|Optimalizace policy|Novelty |
Keywords in English: | Evolution strategies|Transformers|Policy optimization|Novelty |
Academic year of topic announcement: | 2022/2023 |
Thesis type: | diploma thesis |
Thesis language: | English |
Department: | Department of Theoretical Computer Science and Mathematical Logic (32-KTIML) |
Supervisor: | Mgr. Roman Neruda, CSc. |
Author: | hidden - assigned and confirmed by the Study Department |
Date of registration: | 12.04.2023 |
Date of assignment: | 12.04.2023 |
Date of confirmation by the Study Department: | 07.12.2023 |
Date and time of defence: | 13.02.2024 09:00 |
Date of electronic submission: | 11.01.2024 |
Date of printed submission: | 11.01.2024 |
Date of completed defence: | 13.02.2024 |
Opponents: | Mgr. Martin Pilát, Ph.D. |
Guidelines |
Evolution strategies have proven successful in policy optimization as an alternative to deep reinforcement learning on traditional network architectures, owing to their scalability and ease of parallelization. Moreover, evolution strategies can employ stronger exploration approaches, such as novelty search or quality diversity.
The goal of this work is to explore the performance of these strategies on modern transformer architectures. The student will implement evolution strategies and test their ability to optimize the parameters of self-attention layers in transformer networks. The approach will be validated experimentally on standard benchmark tasks, such as a selection of Atari games or the MuJoCo Humanoid environment. |
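To illustrate the kind of update such an implementation revolves around, here is a minimal sketch of one evolution-strategy step on a flat parameter vector. It is not the thesis implementation. The function names (`es_step`, `toy_fitness`, `run_es`), the hyperparameter values, and the quadratic toy objective are all assumptions made for illustration. In the setting described above, `theta` would stand for the flattened self-attention parameters and `fitness_fn` for the return of one episode in the benchmark environment.

```python
import numpy as np


def es_step(theta, fitness_fn, rng, pop_size=50, sigma=0.1, lr=0.05):
    """One evolution-strategy update using mirrored Gaussian perturbations.

    All hyperparameter defaults are illustrative assumptions.
    """
    half = pop_size // 2
    eps = rng.standard_normal((half, theta.size))
    eps = np.concatenate([eps, -eps])  # mirrored (antithetic) pairs

    # Evaluate every perturbed parameter vector; this loop is the part
    # that parallelizes trivially across workers.
    returns = np.array([fitness_fn(theta + sigma * e) for e in eps])

    # Centered rank transform: weights lie in [-0.5, 0.5], so the update
    # does not depend on the scale of the raw returns.
    ranks = returns.argsort().argsort().astype(np.float64)
    weights = ranks / (len(ranks) - 1) - 0.5

    # Weighted sum of perturbations estimates the ascent direction.
    grad = weights @ eps / (len(eps) * sigma)
    return theta + lr * grad


def toy_fitness(params):
    # Hypothetical stand-in for an episode return; maximal at params == 3.
    return -float(np.sum((params - 3.0) ** 2))


def run_es(dim=5, steps=300, seed=0):
    rng = np.random.default_rng(seed)
    theta = np.zeros(dim)
    for _ in range(steps):
        theta = es_step(theta, toy_fitness, rng)
    return theta
```

Calling `run_es()` moves the parameters from the origin toward the optimum of the toy objective. Only fitness values are needed, never gradients of the policy, which is why the same loop can in principle be applied to any parameter subset of a network.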
References |
[1] Pagliuca P, Milano N and Nolfi S (2020) Efficacy of Modern Neuro-Evolutionary Strategies for Continuous Control Optimization. Front. Robot. AI 7:98. doi: 10.3389/frobt.2020.00098
[2] Chen L, Lu K, et al (2021) Decision Transformer: Reinforcement Learning via Sequence Modeling. arXiv preprint arXiv:2106.01345. https://doi.org/10.48550/arXiv.2106.01345
[3] Conti E, Madhavan V, et al (2018) Improving exploration in evolution strategies for deep reinforcement learning via a population of novelty-seeking agents. In Proc. of the 32nd Int. Conf. on Neural Information Processing Systems (NIPS'18). Curran Associates Inc., Red Hook, NY, USA, 5032–5043. https://dl.acm.org/doi/10.5555/3327345.3327410 |