Thesis details
Neural networks and tree-based credit scoring models
Thesis title in Czech: Neuronové sítě a stromové metody v kreditních skóringových modelech
Thesis title in English: Neural networks and tree-based credit scoring models
Keywords in English: machine learning, loan default model, logistic regression, random forests, neural networks
Academic year of topic announcement: 2016/2017
Thesis type: bachelor's thesis
Thesis language: English
Department: Institute of Economic Studies (23-IES)
Supervisor: prof. PhDr. Ladislav Krištoufek, Ph.D.
Author: hidden - assigned by the supervisor
Date of registration: 04.06.2017
Date of assignment: 04.06.2017
Date and time of defence: 11.09.2018 09:00
Venue of defence: Opletalova - Opletalova 26, O105, Opletalova - room no. 105
Date of electronic submission: 31.07.2018
Date of defence held: 11.09.2018
Opponents: Mgr. Nicolas Fanta
URKUND check:
List of academic literature
Athey, Susan, and Guido Imbens. “The State of Applied Econometrics - Causality and Policy Evaluation.” arXiv Preprint arXiv:1607.00699, 2016.

Varian, Hal R. “Big Data: New Tricks for Econometrics.” Journal of Economic Perspectives 28, no. 2 (May 2014): 3–28.

Krauss, Christopher, Xuan Anh Do, and Nicolas Huck. “Deep Neural Networks, Gradient-Boosted Trees, Random Forests: Statistical Arbitrage on the S&P 500.” European Journal of Operational Research 259, no. 2 (June 2017): 689–702.

James, Gareth, Daniela Witten, Trevor Hastie, and Robert Tibshirani. An Introduction to Statistical Learning. Vol. 103. Springer Texts in Statistics. New York, NY: Springer New York, 2013.

Murphy, Kevin P. Machine Learning: A Probabilistic Perspective. Adaptive Computation and Machine Learning Series. Cambridge, MA: MIT Press, 2012.
Preliminary scope of work in English
The thesis introduces several machine learning techniques and asks whether, for loan default prediction, they can perform comparably to or better than standard logistic regression models.
Default predictions at large institutions are usually modelled with logistic regression alone, so I will try to show that machine learning models can replace this standard approach or be used alongside it.

The data come from Lending Club, the largest peer-to-peer lending platform in the US, and cover 2007-2017. The dataset contains ~300 thousand completed loans with ~30 relevant variables.
The dataset will be split at random into a training subset and a smaller testing subset. The models will be fitted on the training subset and then run on the testing subset so that their performance can be compared.
The selected machine learning models will primarily be decision trees and random forests (James et al., An Introduction to Statistical Learning) and artificial neural networks (Murphy, Machine Learning: A Probabilistic Perspective).
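
As an illustration of the procedure described above, the following is a minimal sketch in standard-library Python, not the thesis's actual code. It splits a dataset at random, fits a one-split decision tree (a "stump") on the training subset by minimising Gini impurity, and measures accuracy on the testing subset. The function names, the 80/20 split, the single debt-to-income feature and the synthetic data are all assumptions made for this example; no Lending Club data is used.

```python
import random

def train_test_split(rows, test_fraction=0.2, seed=0):
    """Shuffle a copy of rows and cut it into (train, test) lists."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

def gini(labels):
    """Gini impurity of a non-empty list of 0/1 labels."""
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)

def majority(labels):
    """Most frequent label in a non-empty list (ties go to 1)."""
    return 1 if 2 * sum(labels) >= len(labels) else 0

def fit_stump(rows):
    """Fit a one-split tree on non-empty (feature, label) rows.

    Returns (threshold, class_at_or_below, class_above).
    threshold is None when no split lowers the impurity.
    """
    pairs = sorted(rows)
    labels = [y for _, y in pairs]
    n = len(pairs)
    best_score = gini(labels)
    best = (None, majority(labels), majority(labels))
    for i in range(1, n):
        lo, hi = pairs[i - 1][0], pairs[i][0]
        if lo == hi:
            continue  # cannot split between equal feature values
        left, right = labels[:i], labels[i:]
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if score < best_score:
            best_score = score
            best = ((lo + hi) / 2.0, majority(left), majority(right))
    return best

def predict(stump, x):
    """Predict 0/1 for a single feature value."""
    threshold, below, above = stump
    if threshold is None:
        return below
    return below if x <= threshold else above

# Synthetic stand-in for loan records (assumption, not real data):
# a debt-to-income ratio and a noisy default flag.
rng = random.Random(1)
loans = []
for _ in range(500):
    dti = rng.uniform(0.0, 40.0)
    defaulted = 1 if dti + rng.gauss(0.0, 5.0) > 25.0 else 0
    loans.append((dti, defaulted))

train, test = train_test_split(loans, test_fraction=0.2, seed=42)
stump = fit_stump(train)
accuracy = sum(predict(stump, x) == y for x, y in test) / len(test)
print("threshold:", stump[0], "test accuracy:", accuracy)
```

A random forest would average many deeper trees grown on resampled data, and a neural network would replace the threshold rule with learned weights. The split-fit-evaluate structure stays the same for each model.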

The thesis will have a theoretical and an empirical part. The theoretical part will first review the current use of machine learning in economics and then review the machine learning techniques themselves.
In the empirical part I will use the data to build several models for predicting default. The final chapter will compare the results and conclude.
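
The record does not say which metric the final comparison uses. As one possibility only, the sketch below computes the area under the ROC curve (AUC), a common choice for default models, as the share of defaulter/non-defaulter pairs in which the defaulter receives the higher predicted score. The function name `roc_auc` and the two sets of scores are hypothetical and exist only for illustration.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (defaulter, non-defaulter) pairs ranked correctly.

    Ties between a defaulter's and a non-defaulter's score count as half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one example of each class")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Hypothetical test labels (1 = default) and predicted default
# probabilities from two made-up models.
y_test = [0, 0, 1, 1, 0, 1]
model_a = [0.1, 0.4, 0.35, 0.8, 0.2, 0.7]
model_b = [0.3, 0.2, 0.6, 0.9, 0.1, 0.5]

print("model A AUC:", roc_auc(y_test, model_a))
print("model B AUC:", roc_auc(y_test, model_b))
```

The pairwise loop is quadratic in the number of loans, so for a dataset of the size described a rank-based formula would be preferable. The loop is kept here because it mirrors the definition directly.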
Charles University | Charles University Information System