Restoring and improving the technical quality of audio recordings using machine learning methods
Thesis title in Czech: | Restaurování a vylepšování technické kvality zvukových nahrávek metodami strojového učení |
---|---|
Thesis title in English: | Restoring and improving the technical quality of audio recordings using machine learning methods |
Key words: | Hluboká neuronová síť, audio, Konvoluční neuronová síť, TensorFlow, kvalita |
English key words: | Deep Neural Network, audio, Convolutional Neural Network, TensorFlow, quality |
Academic year of topic announcement: | 2020/2021 |
Thesis type: | diploma thesis |
Thesis language: | English |
Department: | Institute of Formal and Applied Linguistics (32-UFAL) |
Supervisor: | Mgr. Nino Peterek, Ph.D. |
Author: | Mgr. Adam Lechovský - assigned and confirmed by the Study Dept. |
Date of registration: | 07.05.2021 |
Date of assignment: | 07.05.2021 |
Confirmed by Study Dept. on: | 25.05.2021 |
Date and time of defence: | 07.09.2022 09:00 |
Date of electronic submission: | 21.07.2022 |
Date of submission of printed version: | 25.07.2022 |
Date of defence held: | 07.09.2022 |
Opponents: | Mgr. et Mgr. Ondřej Dušek, Ph.D. |
Guidelines
The thesis will focus on the use of current machine learning methods from the field of artificial intelligence to improve the quality of recordings that are damaged in various ways or dynamically imbalanced.
Open-source audio data, together with artificially degraded versions of the same data, will be used for training and evaluation. The thesis will use evaluation procedures that capture the technical quality of audio recordings both objectively and subjectively.
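As a concrete illustration of this setup, the following is a minimal sketch in Python (assuming only NumPy; the helper names `degrade` and `snr_db` and all parameter values are illustrative, not part of the assignment). It shows the two ingredients named above: producing an artificially degraded copy of a clean recording, and scoring an estimate against the clean reference with a simple objective measure, here a plain signal-to-noise ratio.

```python
# Minimal sketch, not part of the assignment: NumPy-only degradation and
# objective scoring. `degrade` and `snr_db` are hypothetical helper names.
import numpy as np


def degrade(clean, clip_level=0.3, noise_std=0.01, seed=0):
    """Artificially damage a recording: hard clipping plus additive Gaussian noise."""
    rng = np.random.default_rng(seed)
    damaged = np.clip(clean, -clip_level, clip_level)            # simulated clipping
    damaged = damaged + rng.normal(0.0, noise_std, clean.shape)  # simulated noise floor
    return damaged.astype(np.float32)


def snr_db(reference, estimate):
    """Objective quality measure: SNR of an estimate against the clean reference, in dB."""
    error = reference - estimate
    return 10.0 * np.log10(np.sum(reference ** 2) / (np.sum(error ** 2) + 1e-12))


if __name__ == "__main__":
    sr = 16000                                   # sample rate in Hz
    t = np.arange(sr) / sr                       # one second of sample times
    clean = 0.8 * np.sin(2 * np.pi * 440.0 * t)  # stand-in for a real open-source recording
    degraded = degrade(clean.astype(np.float32))
    print(f"SNR of the degraded input: {snr_db(clean, degraded):.1f} dB")
    # A restoration model would be trained to map `degraded` back towards `clean`,
    # and its output would be scored with the same objective measure.
```

In practice, standardized objective measures such as PESQ or STOI would likely replace the plain SNR shown here, and subjective quality would be assessed separately, e.g. through listening tests.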
References
- Kamath, U., Liu, J., Whitaker, J.: Deep Learning for NLP and Speech Recognition. Springer, 2019. doi: 10.1007/978-3-030-14596-5. https://link.springer.com/book/10.1007/978-3-030-14596-5
- Watanabe, S., et al.: New Era for Robust Speech Recognition. Springer, 2017. doi: 10.1007/978-3-319-64680-0. https://link.springer.com/book/10.1007/978-3-319-64680-0
- Jiang, L., Hu, R., Wang, X., Zhang, M.: Low bitrates audio bandwidth extension using a deep auto-encoder. In: Ho, Y.-S., Sang, J., Ro, Y.M., Kim, J., Wu, F. (eds.) PCM 2015. LNCS, vol. 9314, pp. 528–537. Springer, Heidelberg, 2015. doi: 10.1007/978-3-319-24075-6_51. https://link.springer.com/chapter/10.1007/978-3-319-24075-6_51
- Mack, W., Habets, E. A. P.: Declipping Speech Using Deep Filtering. 2019 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), 2019, pp. 200–204. doi: 10.1109/WASPAA.2019.8937287. https://ieeexplore.ieee.org/abstract/document/8937287
- Naithani, G., Parascandolo, G., Barker, T., Pontoppidan, N. H., Virtanen, T.: Low-latency sound source separation using deep neural networks. 2016 IEEE Global Conference on Signal and Information Processing (GlobalSIP), 2016, pp. 272–276. doi: 10.1109/GlobalSIP.2016.7905846. https://ieeexplore.ieee.org/abstract/document/7905846