Investigating Large Language Models' Representations Of Plurality Through Probing Interventions
| Thesis title in Czech: | Zkoumání reprezentace plurálu ve velkých jazykových modelech prostřednictvím sondovacích intervencí |
|---|---|
| Title in English: | Investigating Large Language Models' Representations Of Plurality Through Probing Interventions |
| Keywords (Czech): | probing, interpretace, jazykový model, neuronová síť |
| Keywords (English): | probing, interpretation, language model, neural network |
| Academic year of announcement: | 2022/2023 |
| Thesis type: | master's thesis |
| Language of the thesis: | English |
| Department: | Institute of Formal and Applied Linguistics (32-UFAL) |
| Supervisor: | RNDr. David Mareček, Ph.D. |
| Author: | hidden - assigned and confirmed by the Student Office |
| Date of registration: | 30.06.2022 |
| Date of assignment: | 30.06.2022 |
| Date of confirmation by the Student Office: | 29.08.2022 |
| Date and time of the defense: | 02.09.2022 09:00 |
| Date of electronic submission: | 20.07.2022 |
| Date of printed submission: | 25.07.2022 |
| Date of the defense: | 02.09.2022 |
| Opponents: | Mgr. Jindřich Helcl, Ph.D. |
Guidelines
With the advent of large neural language models (LLMs), the field of interpretability has grown, developing new methods and analyses that aim to determine how and why these models work. Chief among these methods is probing. In probing, simple models, called probes, are trained to extract linguistic features from LLMs' internal representations; probing has shown that these representations contain both syntactic and semantic information. However, the degree to which the information encoded in these representations is actually used by the models remains unclear.
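As a concrete illustration of the probing step, the sketch below trains a linear probe (a logistic-regression classifier) to predict plurality from activation vectors. It is not the thesis implementation: the arrays `hidden_states` and `is_plural` are placeholder data standing in for activations extracted from a real LLM and their plurality labels.

```python
# Minimal linear-probe sketch (illustrative only).
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
d_model = 768  # assumed hidden-state dimensionality

# Placeholder data standing in for real LLM activations and plurality labels.
hidden_states = rng.normal(size=(1000, d_model))
is_plural = rng.integers(0, 2, size=1000)

X_train, X_test, y_train, y_test = train_test_split(
    hidden_states, is_plural, test_size=0.2, random_state=0
)

# The probe: a logistic-regression classifier over the representations.
probe = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print(f"probe accuracy: {probe.score(X_test, y_test):.3f}")
```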
The goal of this thesis is to explore probing and causal intervention methods to investigate subject-verb agreement with respect to the subject's plurality. We first train probes to extract the plurality feature from an LLM's internal representations. Then, while running the LLM on its task, we alter its representations either by projecting them onto the probe's decision boundary (removing the probe-recoverable information about the linguistic feature) or by moving them across the boundary (flipping the probe's decision). If the LLM's behavior changes accordingly under the intervention, it likely used the probed information to make its predictions; otherwise, it did not.
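A minimal sketch of the intervention, assuming a linear probe with weight vector `w` and bias `b` (e.g. `probe.coef_[0]` and `probe.intercept_[0]` from the classifier above): projecting a representation onto the probe's decision hyperplane zeroes the probe's margin, while reflecting it across the hyperplane inverts the probe's decision. The function and variable names are hypothetical.

```python
# Sketch of the boundary-projection / decision-flipping intervention.
import numpy as np

def intervene(h, w, b, mode="erase"):
    """Move representation h relative to the hyperplane w.h + b = 0.

    mode="erase": project h onto the hyperplane (probe output becomes 0 margin).
    mode="flip":  reflect h across the hyperplane (probe decision is inverted).
    """
    scaled_distance = (h @ w + b) / (w @ w)  # signed distance scaled by 1/||w||
    if mode == "erase":
        return h - scaled_distance * w       # lands exactly on the boundary
    if mode == "flip":
        return h - 2.0 * scaled_distance * w # mirror image on the other side
    raise ValueError(f"unknown mode: {mode}")

# Example usage with the fitted scikit-learn probe from the sketch above:
# w, b = probe.coef_[0], probe.intercept_[0]
# h_erased  = intervene(hidden_states[0], w, b, mode="erase")
# h_flipped = intervene(hidden_states[0], w, b, mode="flip")
```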
References
Tenney et al.: What do you learn from context? Probing for sentence structure in contextualized word representations (https://arxiv.org/abs/1905.06316)