Fast Algorithms for Attention Mechanism
| Thesis title in Czech: | Rychlé algoritmy pro mechanismus pozornosti |
|---|---|
| Thesis title in English: | Fast Algorithms for Attention Mechanism |
| Key words: | Strojové učení, Velké jazykové modely, Transformátory, Lineární algebra, Polynomy |
| English key words: | Machine learning, Large language models, Transformers, Linear algebra, Polynomials |
| Academic year of topic announcement: | 2023/2024 |
| Thesis type: | Bachelor's thesis |
| Thesis language: | English |
| Department: | Department of Applied Mathematics (32-KAM) |
| Supervisor: | doc. Mgr. Petr Kolman, Ph.D. |
| Author: | hidden - assigned and confirmed by the Study Dept. |
| Date of registration: | 03.05.2024 |
| Date of assignment: | 03.05.2024 |
| Confirmed by Study dept. on: | 05.05.2024 |
| Date and time of defence: | 28.06.2024 09:00 |
| Date of electronic submission: | 09.05.2024 |
| Date of submission of printed version: | 09.05.2024 |
| Date of proceeded defence: | 28.06.2024 |
| Opponents: | Ing. Uladzislau Yorsh |
| Advisors: | Timothy Chu, Ph.D. |
Guidelines
In today's language models, one of the key ingredients of the so-called transformer-based models is the attention mechanism. This mechanism can be represented by matrix multiplications together with a normalization step in which matrix rows are converted into probability distributions. Usual implementations of the attention mechanism use the exponential function for the normalization and altogether require quadratic time, which presents a critical computational bottleneck in the generation stage.
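For concreteness, the following is a minimal NumPy sketch of the quadratic-time computation described above; the scaling by sqrt(d), the array names, and the sizes are illustrative assumptions rather than part of the assignment.

```python
import numpy as np

def softmax_attention(Q, K, V):
    """Standard quadratic-time attention on (n, d) arrays Q, K, V.

    The n x n score matrix QK^T is the bottleneck: forming and normalizing
    it costs O(n^2 * d) time and O(n^2) memory.
    """
    scores = Q @ K.T / np.sqrt(Q.shape[1])         # (n, n) pairwise scores
    scores -= scores.max(axis=1, keepdims=True)    # subtract row max for numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=1, keepdims=True)  # each row becomes a probability distribution
    return weights @ V                             # (n, d) weighted combinations of rows of V

# Illustrative usage with assumed sizes
rng = np.random.default_rng(0)
n, d = 512, 64
Q, K, V = (rng.standard_normal((n, d)) for _ in range(3))
out = softmax_attention(Q, K, V)                   # shape (512, 64)
```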
The task of the student will be:

* to provide an overview of the general transformer architecture, in particular with respect to the computational demands,
* to attempt to make attention computations faster.
References
[1] Josh Alman, Zhao Song: Fast Attention Requires Bounded Entries. Proceedings of Annual Conference on Neural Information Processing Systems, 2023
[2] Angelos Katharopoulos, Apoorv Vyas, Nikolaos Pappas, François Fleuret: Transformers are RNNs: Fast Autoregressive Transformers with Linear Attention. Proceedings of the 37th International Conference on Machine Learning, 2020
[3] Praneeth Kacham, Vahab Mirrokni, Peilin Zhong: PolySketchFormer: Fast Transformers via Sketching Polynomial Kernels. arXiv:2310.01655, 2023
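For intuition only, the sketch below illustrates the kernel-feature-map idea behind the linear attention of [2], which avoids the quadratic score matrix; the chosen feature map and array shapes are assumptions for illustration, not the paper's exact implementation.

```python
import numpy as np

def elu_plus_one(x):
    # Positive feature map; elu(x) + 1 as suggested in [2], reproduced here as an assumption.
    return np.where(x > 0, x + 1.0, np.exp(x))

def linear_attention(Q, K, V, phi=elu_plus_one):
    """Kernelized attention sketch: exp(q.k) is replaced by phi(q).phi(k).

    Because phi(K)^T V is only a d x d matrix, it can be formed once and
    reused for every query, giving O(n * d^2) time instead of O(n^2 * d).
    Q, K, V are (n, d) arrays.
    """
    Qp, Kp = phi(Q), phi(K)                  # (n, d) non-negative features
    KV = Kp.T @ V                            # (d, d) summary of keys and values
    normalizer = Qp @ Kp.sum(axis=0)         # (n,) row-normalization terms
    return (Qp @ KV) / normalizer[:, None]   # (n, d), each row a normalized weighted average
```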