Thesis topics (Thesis selection)


Parameter-efficient unlearning of sensitive data in Large Language Models

Title in Czech:	Efektivní odnaučování citlivých údajů ve velkých jazykových modelech
Title in English:	Parameter-efficient unlearning of sensitive data in Large Language Models
Keywords (Czech):	velké jazykové modely|strojové odnaučení|citlivá data|autorské právo|efektivní doladění
Keywords (English):	large language models|machine unlearning|user privacy|copyright|parameter-efficient fine-tuning
Academic year of topic announcement:	2024/2025
Thesis type:	diploma thesis
Thesis language:	English
Department:	Institute of Formal and Applied Linguistics (32-UFAL)
Supervisor:	Mgr. Jindřich Helcl, Ph.D.
Author:	hidden - assigned and confirmed by the Study Department
Date of registration:	13.11.2024
Date of assignment:	13.11.2024
Date of confirmation by the Study Department:	13.11.2024
Date and time of defence:	09.06.2025 09:00
Date of electronic submission:	29.04.2025
Date of printed submission:	30.04.2025
Date of defence held:	09.06.2025
Opponents:	doc. RNDr. Ondřej Bojar, Ph.D.

Guidelines

The student shall investigate methods for the unlearning of sensitive and copyrighted information from large language models and combine them with parameter-efficient fine-tuning methods.

The student will focus on the following paradigms:
large language models (language transformers)
machine unlearning, e.g., gradient ascent, negative preference optimisation, logit difference
parameter-efficient tuning, e.g., low-rank adaptation, side-loaded networks
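
As a rough illustration of two of the unlearning objectives named above, the sketch below writes gradient ascent and negative preference optimisation (NPO) as losses over per-sequence log-probabilities of forget-set examples, following the NPO formulation in Zhang et al. (2024). It also counts the trainable parameters that a low-rank adapter adds to one weight matrix. All function names, the plain-list inputs, and the default `beta` are assumptions made for this example; it is not the method used in the thesis.

```python
import math


def gradient_ascent_loss(logp_theta):
    """Gradient ascent on the forget set, written as a loss to minimise.

    logp_theta: log-probabilities log pi_theta(y|x) of forget sequences
    under the current model (hypothetical input format: a list of floats).
    Minimising the mean log-likelihood is unbounded below.
    """
    return sum(logp_theta) / len(logp_theta)


def npo_loss(logp_theta, logp_ref, beta=0.1):
    """NPO loss: (2 / beta) * mean(log(1 + (pi_theta / pi_ref) ** beta)).

    logp_ref holds the log-probabilities of the same sequences under a
    frozen reference model. The loss is bounded below by zero.
    """
    terms = []
    for lt, lr in zip(logp_theta, logp_ref):
        z = beta * (lt - lr)
        # Numerically stable softplus: log(1 + exp(z)).
        terms.append(max(z, 0.0) + math.log1p(math.exp(-abs(z))))
    return (2.0 / beta) * sum(terms) / len(terms)


def lora_trainable_params(d_out, d_in, rank):
    """Trainable parameters of a rank-r update B @ A for a d_out x d_in
    weight matrix, with B of shape (d_out, rank) and A of shape (rank, d_in)."""
    return rank * (d_out + d_in)


if __name__ == "__main__":
    forget_logps = [-5.0, -7.5]   # made-up values, current model
    ref_logps = [-5.0, -6.0]      # made-up values, reference model
    print(gradient_ascent_loss(forget_logps))
    print(npo_loss(forget_logps, ref_logps, beta=0.1))
    print(lora_trainable_params(4096, 4096, 8))
```

In this form, the gradient-ascent loss keeps decreasing as the model assigns lower probability to the forget data, whereas the NPO loss flattens towards zero. In a combined setup, only the adapter parameters counted by `lora_trainable_params` would receive gradients from either loss.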

The student will compare the various approaches with one another, evaluate the obtained results, and propose a reliable strategy for parameter-efficient unlearning in large language models on real data.

References

Parameter Efficient Finetuning:

Hu, Edward J., et al. "LoRA: Low-rank adaptation of large language models." arXiv preprint arXiv:2106.09685 (2021).
https://arxiv.org/abs/2106.09685

Dettmers, Tim, et al. "QLoRA: Efficient finetuning of quantized LLMs." Advances in Neural Information Processing Systems 36 (2024).
https://proceedings.neurips.cc/paper_files/paper/2023/hash/1feb87871436031bdc0f2beaa62a049b-Abstract-Conference.html

Zhang, Zhengxin, et al. "Quantized Side Tuning: Fast and Memory-Efficient Tuning of Quantized Large Language Models." Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 1–17, Bangkok, Thailand (2024).
https://aclanthology.org/2024.acl-long.1

Machine Unlearning in LLMs:

Zhang, Ruiqi, et al. "Negative preference optimization: From catastrophic collapse to effective unlearning." arXiv preprint arXiv:2404.05868 (2024).
https://arxiv.org/abs/2404.05868

Ji, Jiabao, et al. "Reversing the Forget-Retain Objectives: An Efficient LLM Unlearning Framework from Logit Difference." arXiv preprint arXiv:2406.08607 (2024).
https://arxiv.org/abs/2406.08607

Nguyen, Thanh Tam, et al. "A survey of machine unlearning." arXiv preprint arXiv:2209.02299 (2022).
https://arxiv.org/abs/2209.02299