Fast Algorithms for Attention Mechanism
| Thesis title in Czech: | Rychlé algoritmy pro mechanismus pozornosti |
|---|---|
| Thesis title in English: | Fast Algorithms for Attention Mechanism |
| Key words: | Strojové učení, Velké jazykové modely, Transformátory, Lineární algebra, Polynomy |
| English key words: | Machine learning, Large language models, Transformers, Linear algebra, Polynomials |
| Academic year of topic announcement: | 2023/2024 |
| Thesis type: | Bachelor's thesis |
| Thesis language: | English |
| Department: | Department of Applied Mathematics (32-KAM) |
| Supervisor: | doc. Mgr. Petr Kolman, Ph.D. |
| Author: | hidden - assigned and confirmed by the Study Dept. |
| Date of registration: | 03.05.2024 |
| Date of assignment: | 03.05.2024 |
| Confirmed by Study dept. on: | 05.05.2024 |
| Date and time of defence: | 28.06.2024 09:00 |
| Date of electronic submission: | 09.05.2024 |
| Date of submission of printed version: | 09.05.2024 |
| Date of proceeded defence: | 28.06.2024 |
| Opponents: | Ing. Uladzislau Yorsh |
| Advisors: | Timothy Chu, Ph.D. |
Guidelines
In today's language models, one of the key ingredients of the so-called transformer-based models is the attention mechanism. This mechanism can be represented by matrix multiplications together with a normalization step in which matrix rows are converted into probability distributions. Usual implementations of the attention mechanism use the exponential function for the normalization and altogether require quadratic time, which presents a critical computational bottleneck in the generation stage.
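For concreteness, the following is a minimal NumPy sketch of the quadratic-time computation described above; the scaling by sqrt(d), the array names, and the sizes are illustrative assumptions rather than part of the assignment.

```python
import numpy as np

def softmax_attention(Q, K, V):
    """Standard quadratic-time attention on (n, d) arrays Q, K, V.

    The n x n score matrix QK^T is the bottleneck: forming and normalizing
    it costs O(n^2 * d) time and O(n^2) memory.
    """
    scores = Q @ K.T / np.sqrt(Q.shape[1])         # (n, n) pairwise scores
    scores -= scores.max(axis=1, keepdims=True)    # subtract row max for numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=1, keepdims=True)  # each row becomes a probability distribution
    return weights @ V                             # (n, d) weighted combinations of rows of V

# Illustrative usage with assumed sizes
rng = np.random.default_rng(0)
n, d = 512, 64
Q, K, V = (rng.standard_normal((n, d)) for _ in range(3))
out = softmax_attention(Q, K, V)                   # shape (512, 64)
```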
The task of the student will be:

* to provide an overview of the general transformer architecture, in particular with respect to the computational demands,
* to attempt to make attention computations faster.
References
[1] Josh Alman, Zhao Song: Fast Attention Requires Bounded Entries. Proceedings of Annual Conference on Neural Information Processing Systems, 2023
[2] Angelos Katharopoulos, Apoorv Vyas, Nikolaos Pappas, François Fleuret: Transformers are RNNs: Fast Autoregressive Transformers with Linear Attention. Proceedings of the 37th International Conference on Machine Learning, 2020
[3] Praneeth Kacham, Vahab Mirrokni, Peilin Zhong: PolySketchFormer: Fast Transformers via Sketching Polynomial Kernels. arXiv:2310.01655, 2023
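For intuition only, the sketch below illustrates the kernel-feature-map idea behind the linear attention of [2], which avoids the quadratic score matrix; the chosen feature map and array shapes are assumptions for illustration, not the paper's exact implementation.

```python
import numpy as np

def elu_plus_one(x):
    # Positive feature map; elu(x) + 1 as suggested in [2], reproduced here as an assumption.
    return np.where(x > 0, x + 1.0, np.exp(x))

def linear_attention(Q, K, V, phi=elu_plus_one):
    """Kernelized attention sketch: exp(q.k) is replaced by phi(q).phi(k).

    Because phi(K)^T V is only a d x d matrix, it can be formed once and
    reused for every query, giving O(n * d^2) time instead of O(n^2 * d).
    Q, K, V are (n, d) arrays.
    """
    Qp, Kp = phi(Q), phi(K)                  # (n, d) non-negative features
    KV = Kp.T @ V                            # (d, d) summary of keys and values
    normalizer = Qp @ Kp.sum(axis=0)         # (n,) row-normalization terms
    return (Qp @ KV) / normalizer[:, None]   # (n, d), each row a normalized weighted average
```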