Thesis details
Textual Ciphers as a Tool for Better Understanding the Transformers
Thesis title in Czech: Textové šifry jako nástroj pro lepší pochopení modelů Transformer
Title in English: Textual Ciphers as a Tool for Better Understanding the Transformers
Keywords (Czech): Transformer|interpretovatelnost|NLP|deep learning|šifry
Keywords (English): Transformer|interpretability|NLP|deep learning|ciphers
Academic year of announcement: 2023/2024
Thesis type: bachelor's thesis
Thesis language: English
Department: Institute of Formal and Applied Linguistics (32-UFAL)
Supervisor: Mgr. Jindřich Libovický, Ph.D.
Author: hidden - assigned and confirmed by the Student Affairs Department
Date of registration: 29.09.2023
Date of assignment: 29.09.2023
Date confirmed by Student Affairs Department: 13.10.2023
Date and time of defence: 28.06.2024 09:00
Date of electronic submission: 09.05.2024
Date of print submission: 09.05.2024
Date of defence: 28.06.2024
Opponents: Ing. Zdeněk Kasner
Guidelines
Basic textual ciphers (substitution, the Caesar cipher, the Vigenère cipher) transform meaningful texts into strings that are incomprehensible at first glance, and deciphering them without the cipher key takes considerable effort. Transformer models, which are used intensively in Natural Language Processing (NLP), including today's very popular language modeling, can be trained to decipher such texts even with relatively few parameters. Doing so requires reverse-engineering the cipher algorithm and enough knowledge of the language to let the model guess the key internally. Unlike standard NLP problems such as machine translation, question answering, or sentiment analysis, there is very little interference from cultural aspects of meaning: the task consists purely of language and computation. This makes deciphering an ideal task for studying which language phenomena are easiest for Transformers to rely on in this noisy setup.
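To make the setup concrete, below is a minimal Python sketch of the three ciphers mentioned above, restricted to lowercase a-z; the function names and the example key are illustrative choices, not part of the assignment.

```python
import random
import string

ALPHABET = string.ascii_lowercase

def caesar(text: str, shift: int) -> str:
    """Shift every letter by a fixed offset; other characters pass through."""
    return "".join(
        ALPHABET[(ALPHABET.index(c) + shift) % 26] if c in ALPHABET else c
        for c in text
    )

def vigenere(text: str, key: str) -> str:
    """Shift the i-th letter by the offset given by the i-th key letter (cycled)."""
    out, i = [], 0
    for c in text:
        if c in ALPHABET:
            shift = ALPHABET.index(key[i % len(key)])
            out.append(ALPHABET[(ALPHABET.index(c) + shift) % 26])
            i += 1
        else:
            out.append(c)
    return "".join(out)

def substitution(text: str, seed: int = 0) -> str:
    """Replace each letter via a fixed random permutation of the alphabet."""
    table = str.maketrans(ALPHABET, "".join(random.Random(seed).sample(ALPHABET, 26)))
    return text.translate(table)

print(caesar("attack at dawn", 3))        # dwwdfn dw gdzq
print(vigenere("attack at dawn", "key"))  # kxrkgi kx bkal
```

Note that a Caesar cipher is a Vigenère cipher with a one-letter key and also a special case of monoalphabetic substitution, which is why the three form a natural difficulty scale for a model to learn.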

By experimenting with various test sets focusing on different types of language features (for instance, statistical and information-theoretic properties such as character distribution, word frequencies, n-gram perplexity, or length; linguistic features such as dependency tree complexity, part-of-speech statistics, or the presence of named entities), the student will estimate which language phenomena make deciphering easy and which make it difficult. This analysis will serve as a proxy for understanding the training dynamics of Transformer models in the early stages of training.
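As an illustration of the planned analysis, the sketch below scores sentences on two of the statistical properties listed above (length and character-distribution entropy); the feature set and the example sentences are assumptions for exposition, not the thesis's actual evaluation code.

```python
import math
from collections import Counter

def char_entropy(text: str) -> float:
    """Shannon entropy (bits per character) of the character unigram distribution."""
    counts = Counter(text)
    total = sum(counts.values())
    return -sum((n / total) * math.log2(n / total) for n in counts.values())

def features(sentence: str) -> dict:
    """Per-sentence features to correlate with deciphering accuracy later."""
    return {"length": len(sentence), "char_entropy": char_entropy(sentence)}

for s in ["the cat sat on the mat", "zymurgy vexes quixotic kvetchers"]:
    print(f"{s!r}: {features(s)}")
```

Deciphering accuracy on each test set could then be correlated with such feature scores to see which properties help or hinder the model.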
References
Goodfellow, I., Bengio, Y., & Courville, A. (2016). Deep Learning. Cambridge, MA, USA: MIT Press.

Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., ... & Polosukhin, I. (2017). Attention is all you need. Advances in neural information processing systems, 30.

Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J. D., Dhariwal, P., ... & Amodei, D. (2020). Language models are few-shot learners. Advances in neural information processing systems, 33, 1877-1901.

Greydanus, S. (2017). Learning the enigma with recurrent neural networks. arXiv preprint arXiv:1708.07576.

Aldarrab, N., & May, J. (2020). Can Sequence-to-Sequence Models Crack Substitution Ciphers? arXiv preprint arXiv:2012.15229.
 