Thesis details
Smoothness of Functions Learned by Neural Networks
Thesis title in Czech: Hladkost funkcí naučených neuronovými sítěmi
Thesis title in English: Smoothness of Functions Learned by Neural Networks
Keywords (Czech): strojové učení, neuronové sítě, hladkost, zobecňování
Keywords (English): machine learning, neural networks, smoothness, generalization
Academic year of announcement: 2019/2020
Thesis type: Bachelor's thesis
Thesis language: English
Department: Institute of Formal and Applied Linguistics (32-UFAL)
Supervisor: Mgr. Tomáš Musil
Author: hidden - assigned and confirmed by the Study Department
Date of registration: 12.05.2020
Date of assignment: 12.05.2020
Confirmed by the Study Department on: 22.05.2020
Date and time of defence: 07.07.2020 09:00
Date of electronic submission: 04.06.2020
Date of printed submission: 04.06.2020
Date of defence: 07.07.2020
Opponents: RNDr. Milan Straka, Ph.D.
Guidelines
Classical machine learning theory states that models are subject to the bias-variance tradeoff: if their capacity is too low, they underfit the training set and thus perform poorly on the test set as well; if it is too high, they overfit the training set, displaying (near-)perfect performance on it, but again perform poorly on the test set because they pick up spurious patterns in the data.
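
The tradeoff is easy to reproduce with polynomial regression. The following is a minimal sketch, assuming NumPy is available; the target function, noise level, and degrees are illustrative choices, not taken from the thesis. Low degrees underfit, while a degree matching the number of training points interpolates them and the test error blows up (NumPy may also warn that the high-degree fit is poorly conditioned, which is itself part of the story).

    import numpy as np

    rng = np.random.default_rng(0)
    x_train = np.sort(rng.uniform(-1, 1, 20))
    y_train = np.sin(3 * x_train) + rng.normal(0, 0.1, size=20)
    x_test = np.linspace(-1, 1, 200)
    y_test = np.sin(3 * x_test)

    for degree in [1, 3, 9, 19]:
        coeffs = np.polyfit(x_train, y_train, degree)  # least-squares polynomial fit
        train_mse = np.mean((np.polyval(coeffs, x_train) - y_train) ** 2)
        test_mse = np.mean((np.polyval(coeffs, x_test) - y_test) ** 2)
        print(f"degree {degree:2d}: train MSE {train_mse:.4f}, test MSE {test_mse:.4f}")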

Recent research shows that this does not apply to modern neural networks (NNs): they often have the capacity to fit (interpolate) the training set perfectly, yet despite being extremely “overfit” they generalize well. This suggests that the training process contains some form of implicit regularization that biases the learned functions in a way that aids generalization. How this implicit regularization works, however, is an open problem.

In the thesis we will explore the hypothesis that training NNs with gradient descent tends to learn smooth functions, where “smoothness” is understood in some intuitive sense. If we also assume that smooth functions are good for generalization, this would explain why NNs generalize. We focus on the first part of the hypothesis: whether training NNs yields smooth functions. We formalize this notion of smoothness and run experiments to see under what conditions smooth functions are actually learned.
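
One concrete way to turn “smoothness” into a number for a 1-D function is the total variation of its first derivative, approximated on a grid. This is only an illustrative candidate measure, not necessarily the formalization the thesis will adopt; a minimal sketch, assuming NumPy:

    import numpy as np

    def complexity(f, lo=-1.0, hi=1.0, n=1000):
        xs = np.linspace(lo, hi, n)
        slopes = np.diff(f(xs)) / np.diff(xs)   # finite-difference derivative
        return np.sum(np.abs(np.diff(slopes)))  # total variation of the slope

    print(complexity(lambda x: 2 * x + 1))      # linear: ~0
    print(complexity(np.abs))                   # one kink: ~2
    print(complexity(lambda x: np.sin(10 * x))) # oscillating: large

Under this measure a piecewise-linear fit with few kinks is “simple” even if it interpolates the data, whereas a rapidly oscillating interpolant is “complex”, matching the intuition the hypothesis relies on.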

Specifically, we will propose measures of function complexity (the inverse of smoothness) and measure the complexity of NNs trained with various hyperparameters. We will use the simplest NN possible: a two-layer network with ReLU activation. We will compare the computed complexity of NNs with that of other models, such as polynomial interpolation. We will start with synthetic datasets and later move on to simple real data (MNIST, CIFAR). This way, we will determine which training procedures lead to smooth functions being learned. The empirical study may in turn inform theory: based on the data, one may be able to formulate precise conditions under which smooth functions are learned.
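
As a sketch of this simplest setup (assuming PyTorch; the width, learning rate, and step count are arbitrary illustrative choices), the following trains a two-layer ReLU network by full-batch gradient descent on synthetic 1-D data until it (nearly) interpolates:

    import torch

    torch.manual_seed(0)
    x = torch.linspace(-1, 1, 20).unsqueeze(1)        # 20 training inputs
    y = torch.sin(3 * x) + 0.1 * torch.randn_like(x)  # noisy targets

    model = torch.nn.Sequential(                      # two-layer ReLU network
        torch.nn.Linear(1, 100),
        torch.nn.ReLU(),
        torch.nn.Linear(100, 1),
    )
    opt = torch.optim.SGD(model.parameters(), lr=0.1)

    for step in range(10_000):                        # full-batch gradient descent
        opt.zero_grad()
        loss = torch.mean((model(x) - y) ** 2)
        loss.backward()
        opt.step()

    print(f"train MSE after training: {loss.item():.6f}")  # small value = near-interpolation

Evaluating the trained model on a dense grid and scoring it with a complexity measure such as the one sketched above, across widths, learning rates, and initializations, is then exactly the kind of comparison (against, e.g., polynomial interpolation of the same data) that the study calls for.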
References
Chiyuan Zhang, Samy Bengio, Moritz Hardt, Benjamin Recht, and Oriol Vinyals. Understanding deep learning requires rethinking generalization. In International Conference on Learning Representations, 2017.
Hartmut Maennel, Olivier Bousquet, and Sylvain Gelly. Gradient Descent Quantizes ReLU Network Features. arXiv preprint arXiv:1803.08367, 2018.
Mikhail Belkin, Daniel Hsu, Siyuan Ma, and Soumik Mandal. Reconciling modern machine learning practice and the bias-variance trade-off. arXiv preprint arXiv:1812.11118, 2018.
Behnam Neyshabur et al. Exploring generalization in deep learning. In Advances in Neural Information Processing Systems, 2017.
Stuart Geman, Elie Bienenstock, and René Doursat. Neural networks and the bias/variance dilemma. Neural Computation, 4(1):1–58, 1992. doi: 10.1162/neco.1992.4.1.1. URL https://doi.org/10.1162/neco.1992.4.1.1
Ian Goodfellow, Yoshua Bengio, and Aaron Courville. Deep Learning. MIT Press, 2016.
Trevor Hastie, Robert Tibshirani, and Jerome Friedman. The Elements of Statistical Learning, volume 1. Springer, 2001.