Thesis Topics (Thesis Selection)

Methods of Domain Adaptation for Speech Recognition

Thesis title in Czech:	Metody doménové adaptace pro rozpoznávání řeči
Thesis title in English:	Methods of Domain Adaptation for Speech Recognition
Academic year of topic announcement:	2019/2020
Thesis type:	diploma thesis
Thesis language:	Czech
Department:	Institute of Formal and Applied Linguistics (32-UFAL)
Supervisor:	doc. RNDr. Ondřej Bojar, Ph.D.
Author:	hidden - assigned and confirmed by the Study Dept.
Date of registration:	20.02.2020
Date of assignment:	16.03.2020
Date of confirmation by the Study Dept.:	01.06.2020
Date and time of defence:	08.07.2020 09:00
Date of electronic submission:	28.05.2020
Date of printed submission:	28.05.2020
Date of completed defence:	08.07.2020
Opponents:	Mgr. et Mgr. Ondřej Dušek, Ph.D.

Guidelines

The quality of automatic speech recognition (ASR) depends critically on how well the training data match the test data. Domain adaptation techniques adjust a more general system to improve its performance in a particular situation.

The goal of the thesis is to explore methods of domain adaptation for speech recognition. The thesis should consider adaptation at various levels, from adaptation to a given subject area (e.g. economics vs. computational linguistics) to adaptation to individual talks given by a known speaker on a known topic.

An inherent part of the thesis is the empirical evaluation of the discussed or proposed methods. Specifically, the work should start with creating a baseline ASR system for spoken Czech and then carry out a series of domain adaptation experiments at various levels. The quality of the system will be evaluated automatically using the standard WER (word error rate) measure.
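For orientation, WER is conventionally computed as the word-level edit distance (substitutions, deletions and insertions) between the reference transcript and the ASR hypothesis, divided by the number of words in the reference. The following is a minimal sketch under that assumption. The function name `word_error_rate` and the whitespace tokenization are illustrative choices, not part of the assignment, and a real evaluation would typically also normalize case and punctuation first.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Illustrative sketch: tokenization is plain whitespace splitting,
    with no text normalization.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1       # reference word missing in hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr

    return prev[-1] / len(ref)
```

Because insertions are counted as errors, the value can exceed 1.0 when the hypothesis is much longer than the reference.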

References

Mohri, M., Pereira, F., & Riley, M. (2008). Speech recognition with weighted finite-state transducers. In *Springer Handbook of Speech Processing* (pp. 559-584). Springer, Berlin, Heidelberg.

Young, S. et al. (2006). The HTK book. *Cambridge University Engineering Department*, *3*, 75.

Goodman, J. (2001). A bit of progress in language modeling. *arXiv preprint cs/0108005*.

Peddinti, V., Povey, D., & Khudanpur, S. (2015). A time delay neural network architecture for efficient modeling of long temporal contexts. In *Sixteenth Annual Conference of the International Speech Communication Association*.

Goodfellow, I., Bengio, Y., & Courville, A. (2016). *Deep learning*. MIT Press.