Dance Recognition from Audio Recordings
Thesis title in Czech: | Rozpoznávání tance ze zvukových záznamů |
---|---|
Thesis title in English: | Dance Recognition from Audio Recordings |
Keywords: | ballroom, dance, genre, classification, CNN, audio, music |
Academic year of topic announcement: | 2018/2019 |
Thesis type: | diploma thesis |
Thesis language: | English |
Department: | Department of Theoretical Computer Science and Mathematical Logic (32-KTIML) |
Supervisor: | Jan Čech |
Author: | hidden - assigned and confirmed by the Study Department |
Date of registration: | 24.04.2019 |
Date of assignment: | 26.04.2019 |
Date of confirmation by the Study Department: | 15.05.2019 |
Date and time of defence: | 03.02.2020 09:00 |
Date of electronic submission: | 06.01.2020 |
Date of printed submission: | 07.01.2020 |
Date of completed defence: | 03.02.2020 |
Opponents: | Mgr. Josef Moudřík |
Guidelines |
Recognizing the dance (Waltz, Tango, Rumba, etc.) which fits the music being played is a challenging problem for non-professionals. To the best of our knowledge, an automatic system able to predict the matching dance categories, given a short music sample, does not exist yet. Developing such a system is realistic considering recent progress in deep convolutional neural networks.
A related problem, recognizing the dance category from a visual recording, is also largely unexplored. It is interesting because it requires distinguishing fine-grained human movements in sequential data with certain repetitive patterns, and it is thus complementary to standard human action or activity recognition, which has been well studied.
Guidelines:
(1) Review the literature on dance recognition and related problems.
(2) Collect a dataset of labelled examples, namely short audio or audio/visual recordings annotated with dance categories. Hundreds to thousands of examples will be required.
(3) Audio recognition: train a deep net classifier that takes music samples as input and predicts the dance category, i.e. the dance style that fits the music. Evaluate the classifier on an independent test set.
(4) Optionally, visual recognition: train a deep net classifier that takes a video stream of a dance as input and predicts the dance category. Evaluate the classifier on an independent test set.
(5) Consider a fusion of the audio and video classifiers.
The thesis will investigate an open research problem. The goal is to review the literature, collect an appropriate dataset, propose and train a model (a deep-network-based classifier), and evaluate the model on an independent test set, i.e. a portion of the dataset unseen during training. |
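The requirement to evaluate on a portion of the dataset unseen during training can be illustrated with a small sketch. This is only one possible way to hold out a test set, not the method used in the thesis: the function name `stratified_split`, the 20 % test fraction, and the toy labels are assumptions made for illustration. The split is stratified so that every dance category appears in both parts, and it assumes each category has at least two recordings.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_fraction=0.2, seed=0):
    """Return (train_indices, test_indices) so that each dance category
    is represented in both parts in roughly the same proportion."""
    # Group recording indices by their dance category.
    by_label = defaultdict(list)
    for index, label in enumerate(labels):
        by_label[label].append(index)

    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    train, test = [], []
    for label in sorted(by_label):
        indices = by_label[label]
        rng.shuffle(indices)
        # Hold out at least one recording per category for testing.
        n_test = max(1, round(len(indices) * test_fraction))
        test.extend(indices[:n_test])
        train.extend(indices[n_test:])
    return sorted(train), sorted(test)


# Hypothetical toy dataset: 10 recordings for each of three dances.
labels = ["Waltz"] * 10 + ["Tango"] * 10 + ["Rumba"] * 10
train_idx, test_idx = stratified_split(labels)
```

The classifier would then be trained only on the recordings indexed by `train_idx` and scored only on those indexed by `test_idx`; the two index lists never overlap.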
References |
Ian Goodfellow, Yoshua Bengio, and Aaron Courville. Deep Learning. MIT Press, 2016. ISBN 9780262035613. http://www.deeplearningbook.org
Hareesh Bahuleyan. Music Genre Classification using Machine Learning Techniques. arXiv preprint arXiv:1804.01149, 2018.
Caroline Chan, Shiry Ginosar, Tinghui Zhou, Alexei A. Efros. Everybody Dance Now. arXiv preprint arXiv:1808.07371, 2018.
Kaiming He, Georgia Gkioxari, Piotr Dollár, Ross Girshick. Mask R-CNN. In Proc. IEEE ICCV, 2017, pp. 2980--2988.
Joao Carreira, Andrew Zisserman. Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset. In Proc. IEEE CVPR, 2017, pp. 4724--4733.
Diogo C. Luvizon, David Picard, Hedi Tabia. 2D/3D Pose Estimation and Action Recognition using Multitask Deep Learning. In Proc. IEEE CVPR, 2018, pp. 5137--5146. |