Thesis details
Dance Recognition from Audio Recordings
Thesis title in Czech: Rozpoznávání tance ze zvukových záznamů
Thesis title in English: Dance Recognition from Audio Recordings
Keywords: ballroom, dance, genre, classification, CNN, audio, music
Academic year of topic announcement: 2018/2019
Thesis type: diploma thesis
Thesis language: English
Department: Department of Theoretical Computer Science and Mathematical Logic (32-KTIML)
Supervisor: Jan Čech
Author: hidden - assigned and confirmed by the Study Department
Date of registration: 24.04.2019
Date of assignment: 26.04.2019
Date of confirmation by the Study Department: 15.05.2019
Date and time of defence: 03.02.2020 09:00
Date of electronic submission: 06.01.2020
Date of printed submission: 07.01.2020
Date of defence held: 03.02.2020
Opponents: Mgr. Josef Moudřík
Guidelines
Recognizing the dance (Waltz, Tango, Rumba, etc.) which fits the music being played is a challenging problem for non-professionals. To the best of our knowledge, an automatic system able to predict the matching dance categories, given a short music sample, does not exist yet. Developing such a system is realistic considering recent progress in deep convolutional neural networks.

A related problem, recognizing the dance category from a visual recording, is largely unexplored too. The problem is interesting because it requires distinguishing fine-grained human movements in sequential data with certain repetitive patterns. It is thus complementary to standard human action or activity recognition, which has been well studied.

(1) Review the literature on dance recognition and related problems.
(2) Collect a dataset of labelled examples, namely short audio or audio/visual recordings with annotated labels of the dance categories. Hundreds to thousands of examples will be required.
(3) Audio Recognition: Train a deep net classifier that takes music samples as input and predicts the label of the dance category, i.e. the dance style which fits the music. Evaluate the classifier on an independent test set.
(4) Optionally, Visual Recognition: Train a deep net classifier that takes a video stream with a dance as an input and predicts the label of the dance category. Evaluate the classifier on an independent test set.
(5) Consider a fusion of both Audio and Video classifiers.
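
Step (5) leaves the fusion method open. One simple option is late fusion, i.e. a weighted average of the class probabilities produced by the two classifiers. The sketch below assumes that choice; the function name `late_fusion`, the parameter `audio_weight`, and the example probabilities are hypothetical and are not taken from the thesis.

```python
def late_fusion(audio_probs, video_probs, audio_weight=0.5):
    """Weighted average of two per-class probability dicts (hypothetical helper)."""
    if not 0.0 <= audio_weight <= 1.0:
        raise ValueError("audio_weight must lie in [0, 1]")
    if audio_probs.keys() != video_probs.keys():
        raise ValueError("both classifiers must score the same dance categories")
    # Combine the two scores for every dance category.
    fused = {
        label: audio_weight * audio_probs[label]
        + (1.0 - audio_weight) * video_probs[label]
        for label in audio_probs
    }
    # The fused prediction is the category with the highest combined score.
    best = max(fused, key=fused.get)
    return best, fused


# Illustrative numbers only, not results from the thesis.
audio = {"Waltz": 0.6, "Tango": 0.3, "Rumba": 0.1}
video = {"Waltz": 0.2, "Tango": 0.7, "Rumba": 0.1}
label, fused = late_fusion(audio, video)
```

With equal weights, a confident video prediction can override a less confident audio prediction. In practice the weight would presumably be tuned on validation data rather than on the test set.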

The thesis will investigate an open research problem. The goal is to review the literature, collect an appropriate dataset, propose and train a model, a deep network based classifier, and evaluate the model on an independent test set, i.e. a portion of the dataset unseen during the training.
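
The independent test set can be illustrated with a minimal sketch that holds out a fixed fraction of each dance category and measures accuracy on the held-out part. The helper names `stratified_split` and `accuracy`, the 20 % test fraction, and the file names are assumptions made for illustration; the assignment does not prescribe a particular split.

```python
import random
from collections import defaultdict


def stratified_split(examples, test_fraction=0.2, seed=0):
    """Hold out a fraction of each label's examples (at least one per label)."""
    by_label = defaultdict(list)
    for path, label in examples:
        by_label[label].append(path)
    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    train, test = [], []
    for label in sorted(by_label):
        paths = by_label[label]
        rng.shuffle(paths)
        n_test = max(1, round(len(paths) * test_fraction))
        test += [(p, label) for p in paths[:n_test]]
        train += [(p, label) for p in paths[n_test:]]
    return train, test


def accuracy(predicted, expected):
    """Fraction of examples whose predicted label matches the annotation."""
    if len(predicted) != len(expected) or not expected:
        raise ValueError("label lists must be non-empty and of equal length")
    return sum(p == e for p, e in zip(predicted, expected)) / len(expected)


# Hypothetical dataset: ten clips for each of two dance categories.
examples = [(f"{label}_{i}.wav", label)
            for label in ("Waltz", "Tango") for i in range(10)]
train, test = stratified_split(examples)
```

If several clips are cut from the same recording, they would likely need to be kept in the same subset so that the test set stays independent of the training data.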
References
Ian Goodfellow, Yoshua Bengio, and Aaron Courville. Deep Learning. MIT Press, 2016. ISBN 9780262035613. http://www.deeplearningbook.org

Hareesh Bahuleyan. Music Genre Classification using Machine Learning Techniques. arXiv preprint arXiv:1804.01149, 2018.

Caroline Chan, Shiry Ginosar, Tinghui Zhou, Alexei A. Efros. Everybody dance now. arXiv preprint arXiv:1808.07371, 2018.

Kaiming He, Georgia Gkioxari, Piotr Dollár, Ross Girshick. Mask R-CNN. In Proc. IEEE ICCV, 2017, pp. 2980--2988.

Joao Carreira, Andrew Zisserman. Quo vadis, action recognition? A new model and the Kinetics dataset. In Proc. IEEE CVPR, 2017, pp. 4724--4733.

Diogo C. Luvizon, David Picard, Hedi Tabia. 2D/3D Pose Estimation and Action Recognition using Multitask Deep Learning. In Proc. IEEE CVPR, 2018, pp. 5137--5146.
Charles University | Charles University Information System