Thesis topics (Thesis selection)

Evaluation of gender bias of Large Language Models in natural contexts

Thesis title in Czech:	Evaluace genderové zaujatosti velkých jazykových modelů v přirozených kontextech
Thesis title in English:	Evaluation of gender bias of Large Language Models in natural contexts
Keywords (Czech):	genderová zaujatost, velké jazykové modely, evaluace
Keywords (English):	gender bias, large language models, evaluation
Academic year of topic announcement:	2023/2024
Thesis type:	diploma thesis
Thesis language:
Department:	Institute of Formal and Applied Linguistics (32-UFAL)
Supervisor:	RNDr. David Mareček, Ph.D.
Author:	hidden - assigned and confirmed by the Study Department
Date of registration:	08.01.2024
Date of assignment:	08.01.2024
Date of confirmation by the Study Department:	08.01.2024

Guidelines

Large Language Models are trained on vast amounts of data collected from the internet, and this data often reflects the biases present in society. As a result, language models can inadvertently perpetuate and even amplify these biases. For example, the models often learn and reproduce stereotypes about gender roles, i.e., they may associate certain professions or qualities with a specific gender.

Many evaluation datasets exist for measuring gender bias in language models. Almost all of them are created artificially, either by filling words into templates or by asking annotators to write sentences that may contain stereotypical gender biases. In addition, they usually evaluate bias only for specific groups of words, such as professions (e.g. doctor vs. nurse).
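The template-filling approach mentioned above can be illustrated with a minimal sketch. The templates, the word lists, and the helper name `fill_templates` are hypothetical and chosen only for illustration; they are not taken from any existing dataset.

```python
from itertools import product

# Hypothetical templates and word lists, for illustration only.
TEMPLATES = [
    "The {profession} said that {pronoun} would be late.",
    "The {profession} finished {possessive} shift.",
]
PROFESSIONS = ["doctor", "nurse"]
PRONOUNS = {
    "male": {"pronoun": "he", "possessive": "his"},
    "female": {"pronoun": "she", "possessive": "her"},
}


def fill_templates(templates, professions, pronouns):
    """Return one record per (template, profession) pair,
    holding one filled sentence for each gender."""
    records = []
    for template, profession in product(templates, professions):
        sentences = {
            # str.format ignores keyword arguments the template does not use.
            gender: template.format(profession=profession, **forms)
            for gender, forms in pronouns.items()
        }
        records.append({"profession": profession, "sentences": sentences})
    return records


examples = fill_templates(TEMPLATES, PROFESSIONS, PRONOUNS)
for record in examples:
    print(record["profession"], record["sentences"])
```

Each record pairs sentences that differ only in the gendered word, so a model's preference between them can be compared. Such sentences are artificial by construction and cover only the words placed in the lists.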

The goal of this thesis is to build a new evaluation dataset for detecting gender bias that is based on real texts and evaluates biases across the whole vocabulary (we suppose that words like ‘yoga’, ‘children’, ‘clamp’, and ‘tire’ are also sources of stereotypical bias).

The outputs of the thesis could be:
- analysis of gender bias across different types of words
- causal tracing of such gender bias in Transformers
- application of the new dataset in existing methods for mitigating gender bias in large language models

References

1. Vig et al.: Causal Mediation Analysis for Interpreting Neural NLP: The Case of Gender Bias. arXiv, 2020 (https://arxiv.org/pdf/2004.12265.pdf)

2. Stanczak and Augenstein: A Survey on Gender Bias in Natural Language Processing. arXiv, 2021 (https://arxiv.org/pdf/2112.14168.pdf)

3. Meng et al.: Locating and Editing Factual Associations in GPT. 36th Conference on Neural Information Processing Systems, 2022 (https://proceedings.neurips.cc/paper_files/paper/2022/file/6f1d43d5a82a37e89b0665b33bf3a182-Paper-Conference.pdf)