Thesis (Selection of subject)


Improving text-to-speech in spoken dialogue systems by employing user’s feedback

Thesis title in Czech:	Využití uživatelské odezvy pro zvýšení kvality řečové syntézy
Thesis title in English:	Improving text-to-speech in spoken dialogue systems by employing user’s feedback
Key words:	syntéza řeči, fonetický slovník, uživatelská odezva, strojové učení, FST, rozpoznávání řeči
English key words:	speech synthesis, phonetic dictionary, user feedback, machine learning, FST, speech recognition
Academic year of topic announcement:	2016/2017
Thesis type:	diploma thesis
Thesis language:	Czech
Department:	Institute of Formal and Applied Linguistics (32-UFAL)
Supervisor:	prof. Ing. Zdeněk Žabokrtský, Ph.D.
Author:	hidden - assigned and confirmed by the Study Dept.
Date of registration:	27.01.2017
Date of assignment:	27.01.2017
Confirmed by Study dept. on:	26.04.2017
Date and time of defence:	07.09.2017 09:30
Date of electronic submission:	19.07.2017
Date of submission of printed version:	21.07.2017
Date of proceeded defence:	07.09.2017
Opponents:	Mgr. Nino Peterek, Ph.D.
Advisors:	Mgr. Ondřej Plátek

Guidelines

Although spoken dialogue systems have greatly improved, they still cannot handle conversations about unknown topics and remain fragile. We will investigate methods that can improve spoken dialogue systems by correcting, or even learning, the pronunciation of unknown words. This should provide a better user experience, since mispronounced words, proper nouns in particular, are highly undesirable. Incorrect pronunciation is caused by an imperfect phonetic representation, typically a phonetic dictionary. We aim to detect incorrectly pronounced words by exploiting user feedback together with prior knowledge of the pronunciation, and to correct the transcriptions accordingly. Furthermore, the learned phonetic transcriptions can be used to improve the speech recognition module by refining its models. Models used in speech recognition cannot handle words that are missing from their vocabulary or that lack a phonetic representation. Extracting such words from user utterances and adding them to the vocabulary should lead to better overall performance.

References

Huang, Xuedong, et al. Spoken language processing: A guide to theory, algorithm, and system development. Prentice Hall PTR, 2001.
Psutka, Josef, et al. Mluvíme s počítačem česky. 2006.
Pappu, Aasish. Knowledge Discovery Through Spoken Dialog. Diss. Carnegie Mellon University, 2014.
Pappu, Aasish K., and Alexander I. Rudnicky. "Knowledge acquisition strategies for goal-oriented dialog systems." (2014):