Investigating Large Language Models' Representations Of Plurality Through Probing Interventions
| Thesis title in Czech: | Zkoumání reprezentace plurálu ve velkých jazykových modelech prostřednictvím sondovacích intervencí |
|---|---|
| Thesis title in English: | Investigating Large Language Models' Representations Of Plurality Through Probing Interventions |
| Key words: | probing \| interpretace \| jazykový model \| neuronová síť |
| English key words: | probing \| interpretation \| language model \| neural network |
| Academic year of topic announcement: | 2022/2023 |
| Thesis type: | diploma thesis |
| Thesis language: | English |
| Department: | Institute of Formal and Applied Linguistics (32-UFAL) |
| Supervisor: | RNDr. David Mareček, Ph.D. |
| Author: | hidden - assigned and confirmed by the Study Dept. |
| Date of registration: | 30.06.2022 |
| Date of assignment: | 30.06.2022 |
| Confirmed by Study dept. on: | 29.08.2022 |
| Date and time of defence: | 02.09.2022 09:00 |
| Date of electronic submission: | 20.07.2022 |
| Date of submission of printed version: | 25.07.2022 |
| Date of proceeded defence: | 02.09.2022 |
| Opponents: | Mgr. Jindřich Helcl, Ph.D. |
Guidelines
With the advent of large neural language models (LLMs), the field of interpretability has grown, developing new methods and analyses that aim to determine how and why these models work. Chief among these methods is probing: simple models, called probes, are trained to extract linguistic features from an LLM's internal representations. Probing has shown that these representations contain both syntactic and semantic information. However, the degree to which the information encoded in these representations is actually used by the model remains unclear.
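To make the probing setup concrete, here is a minimal sketch of training a linear (logistic-regression) probe. The representations and labels are synthetic stand-ins: in the thesis setting they would be per-token hidden states extracted from an LLM, labeled singular/plural. The function name `train_linear_probe` and the hand-rolled gradient-descent training loop are illustrative assumptions, not the thesis's actual implementation.

```python
import numpy as np

# Synthetic stand-ins for LLM hidden states and plurality labels.
rng = np.random.default_rng(0)
n, d = 200, 16
labels = rng.integers(0, 2, size=n)        # 0 = singular, 1 = plural (toy labels)
reps = rng.normal(size=(n, d))             # toy "representations"
reps[:, 0] += 2.0 * labels                 # plant the feature along one direction

def train_linear_probe(X, y, lr=0.1, steps=500):
    """Logistic-regression probe trained with plain gradient descent."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))   # sigmoid of probe logits
        grad = p - y                              # dLoss/dlogit for cross-entropy
        w -= lr * (X.T @ grad) / len(y)
        b -= lr * grad.mean()
    return w, b

w, b = train_linear_probe(reps, labels)
acc = ((reps @ w + b > 0).astype(int) == labels).mean()
```

If the probe reaches high accuracy, the feature is linearly decodable from the representations; whether the model actually *uses* it is what the interventions below are meant to test.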
The goal of this thesis is to explore probing and causal intervention methods to investigate subject-verb agreement with respect to the subject's plurality. We first train probes to extract the plurality feature from an LLM's internal representations. Then, while the LLM performs its task, we alter its representations by either projecting them onto the probe's decision boundary (removing all information about the linguistic feature) or moving them across the boundary, flipping the probe's decision. If the LLM's behavior changes accordingly under this intervention, it likely used the probed information to make its predictions; otherwise, it did not.
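For a linear probe with weights w and bias b, both interventions described above are affine moves along the probe's normal vector. A minimal numpy sketch, assuming a linear probe (the function name `project_to_boundary` and the `alpha` parameterization are illustrative, not from the thesis):

```python
import numpy as np

def project_to_boundary(x, w, b, alpha=1.0):
    """Move representation x along the normal of a linear probe's
    decision boundary w·x + b = 0.

    alpha=1.0 lands x exactly on the boundary (the probe becomes
    maximally uncertain, erasing the probed feature); alpha=2.0
    mirrors x to the other side, flipping the probe's decision.
    """
    signed_dist = (np.dot(w, x) + b) / np.dot(w, w)
    return x - alpha * signed_dist * w

# Toy example: boundary is the plane x[0] = 0.
x = np.array([3.0, 4.0])
w = np.array([1.0, 0.0])
erased = project_to_boundary(x, w, 0.0, alpha=1.0)   # → array([0., 4.])
flipped = project_to_boundary(x, w, 0.0, alpha=2.0)  # → array([-3., 4.])
```

In the experimental loop, this edit would be applied to a layer's hidden state mid-forward-pass, after which the model's agreement prediction is compared against its unedited behavior.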
References
Tenney et al.: What do you learn from context? Probing for sentence structure in contextualized word representations. https://arxiv.org/abs/1905.06316