Thesis (Selection of subject) (version: 381)
Thesis details
Fast Algorithms for Attention Mechanism
Thesis title in Czech: Rychlé algoritmy pro mechanismus pozornosti
Thesis title in English: Fast Algorithms for Attention Mechanism
Key words (in Czech): Strojové učení|Velké jazykové modely|Transformátory|Lineární algebra|Polynomy
English key words: Machine learning|Large language models|Transformers|Linear algebra|Polynomials
Academic year of topic announcement: 2023/2024
Thesis type: Bachelor's thesis
Thesis language: English
Department: Department of Applied Mathematics (32-KAM)
Supervisor: doc. Mgr. Petr Kolman, Ph.D.
Author: hidden - assigned and confirmed by the Study Dept.
Date of registration: 03.05.2024
Date of assignment: 03.05.2024
Confirmed by Study dept. on: 05.05.2024
Date and time of defence: 28.06.2024 09:00
Date of electronic submission: 09.05.2024
Date of submission of printed version: 09.05.2024
Date of proceeded defence: 28.06.2024
Opponents: Ing. Uladzislau Yorsh
Advisors: Timothy Chu, Ph.D.
Guidelines
One of the key ingredients of today's language models, the so-called transformer-based models, is the attention mechanism. This mechanism can be represented by matrix multiplications together with a normalization step in which matrix rows are converted into probability distributions. Usual implementations of the attention mechanism use the exponential function for the normalization and altogether require quadratic time, which presents a critical computational bottleneck in the generation stage.
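
For illustration only (not part of the assignment text), the following minimal Python/NumPy sketch shows the computation described above: the product of query and key matrices followed by a row-wise exponential normalization. Forming the n x n score matrix is what makes the computation quadratic in the sequence length n. All names and sizes below are hypothetical.

import numpy as np

def softmax_attention(Q, K, V):
    # Q, K, V: arrays of shape (n, d); returns an (n, d) array.
    d = Q.shape[1]
    scores = Q @ K.T / np.sqrt(d)                    # (n, n) score matrix -- quadratic in n
    scores -= scores.max(axis=1, keepdims=True)      # subtract row maxima for numerical stability
    weights = np.exp(scores)                         # exponential normalization ...
    weights /= weights.sum(axis=1, keepdims=True)    # ... so each row becomes a probability distribution
    return weights @ V                               # weighted averages of the value vectors

# Example usage on random data (hypothetical sizes)
rng = np.random.default_rng(0)
n, d = 8, 4
Q, K, V = (rng.normal(size=(n, d)) for _ in range(3))
out = softmax_attention(Q, K, V)
print(out.shape)   # (8, 4)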

The task of the student will be:
* to provide an overview of the general transformer architecture, in particular with respect to the computational demands,
* to attempt to make attention computations faster (an illustrative sketch of one such approach follows this list).
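
As a hedged illustration of what "faster attention" can mean, the following sketch follows the linear-attention idea of reference [2]: the exponential similarity is replaced by a kernel feature map phi, so the n x n score matrix never has to be formed and the cost drops from O(n^2 d) to O(n d^2). The feature map phi(x) = elu(x) + 1 is one choice used in that paper; everything else (function names, shapes) is assumed for the example. This is the non-causal (full) variant; the causal, autoregressive version in [2] maintains running sums instead.

import numpy as np

def phi(X):
    # Feature map elu(x) + 1 (strictly positive), one choice used in [2].
    return np.where(X > 0, X + 1.0, np.exp(np.minimum(X, 0.0)))

def linear_attention(Q, K, V):
    # Q, K, V: arrays of shape (n, d); cost O(n * d^2) instead of O(n^2 * d).
    Qp, Kp = phi(Q), phi(K)          # (n, d) feature maps replacing the exponential similarity
    KV = Kp.T @ V                    # (d, d) summary of all keys and values
    Z = Qp @ Kp.sum(axis=0)          # (n,) row normalization terms
    return (Qp @ KV) / Z[:, None]    # (n, d) output without ever forming an (n, n) matrix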
References
[1] Josh Alman, Zhao Song: Fast Attention Requires Bounded Entries. Proceedings of the Annual Conference on Neural Information Processing Systems (NeurIPS), 2023.
[2] Angelos Katharopoulos, Apoorv Vyas, Nikolaos Pappas, François Fleuret: Transformers are RNNs: Fast Autoregressive Transformers with Linear Attention. Proceedings of the 37th International Conference on Machine Learning (ICML), 2020.
[3] Praneeth Kacham, Vahab Mirrokni, Peilin Zhong: PolySketchFormer: Fast Transformers via Sketching Polynomial Kernels. arXiv:2310.01655, 2023.
 