Thesis (Selection of subject) (version: 368)
Thesis details
Determinants of Used Car Prices
Thesis title in Czech: Determinanty cen použitých aut
Thesis title in English: Determinants of Used Car Prices
Key words: trh s použitými auty; cena použitého auta; Frekvenční modelové průměrování; Evropa; hedonická regrese
English key words: used car market; used car price; Frequentist Model Averaging; Europe; hedonic regression
Academic year of topic announcement: 2021/2022
Thesis type: diploma thesis
Thesis language: English
Department: Institute of Economic Studies (23-IES)
Supervisor: doc. PhDr. Jozef Baruník, Ph.D.
Author: hidden - assigned by the advisor
Date of registration: 15.06.2022
Date of assignment: 15.06.2022
Date and time of defence: 21.06.2023 09:00
Venue of defence: Opletalova, O206, room no. 206
Date of electronic submission: 02.05.2023
Date of proceeded defence: 21.06.2023
Opponents: PhDr. Jiří Kukačka, Ph.D.
 
 
 
References
Gegic, E., Isakovic, B., Keco, D., Masetic, Z., & Kevric, J. (2019). Car Price Prediction using Machine Learning Techniques. TEM Journal, 8(1), 113–118. https://doi.org/10.18421/TEM81-16
Meng, S.-M., Liu, L.-J., Kuritsyn, M., & Pechnikov, V. (2019). Price Determinants on Used Car Auction in Taiwan. International Journal of Asian Social Science, 9(1), 48–58. https://doi.org/10.18488/JOURNAL.1.2019.91.48.58
Noor, K., & Jan, S. (2017). Vehicle Price Prediction System using Machine Learning Techniques. International Journal of Computer Applications, 167(9), 975–8887. www.pakwheels.com.
Pal, N., Arora, P., Kohli, P., Sundararaman, D., & Palakurthy, S. S. (2019). How Much Is My Car Worth? A Methodology for Predicting Used Cars’ Prices Using Random Forest. Advances in Intelligent Systems and Computing, 886, 413–422. https://doi.org/10.1007/978-3-030-03402-3_28
Pudaruth, S. (2014). Predicting the Price of Used Cars using Machine Learning Techniques. International Journal of Information & Computation Technology, 4(7), 753–764. http://www.irphouse.com
Shen, G., Wang, Y., & Zhu, Q. (2011). A new model for residual value prediction of the used car based on BP neural network and nonlinear curve fit. Proceedings - 3rd International Conference on Measuring Technology and Mechatronics Automation, ICMTMA 2011, 2, 682–685. https://doi.org/10.1109/ICMTMA.2011.455
Wu, J.-D., Hsu, C.-C., & Chen, H.-C. (2009). An expert system of price forecasting for used cars using adaptive neuro-fuzzy inference. Expert Systems with Applications, 36(4), 7809–7817. https://doi.org/10.1016/J.ESWA.2008.11.019
Preliminary scope of work in English
Motivation

Used cars hold an important position in the vehicle market in the Czech Republic. According to data from the Car Importers Association, new registrations of used cars made up around 40 % of all cars registered in the Czech Republic over the last decade, and in 2021 the share rose to 47 %. With purchasing power eroded by high inflation and disrupted supply chains limiting the number of new cars available, this share will very likely increase further in the coming years. Despite this trend and the relative importance of the market for the economy, the behavior of customers in the used car market is not well understood; in particular, the determinants of used car prices are not well mapped. Research in this area is further complicated by low data availability.
The aim of this thesis is to address this problem and fill the research gap by building a new dataset from online used car listings and then using it to analyze which factors determine used car prices, and how strongly. The final step will be to use the identified determinants to build a model that predicts prices accurately.

Hypotheses

1. Hypothesis #1: Machine learning methods predict used car prices more accurately than the benchmark OLS model
2. Hypothesis #2: There is a significant risk premium for buying a car directly from the previous owner rather than through a dealership
3. Hypothesis #3: The importance of individual determinants varies across different types of cars


Methodology

The first important step in the research process will be to collect data from various websites with used car sale offers using a web scraping program and to build a dataset of used car attributes. The subsequent analysis will examine the effect of these attributes on used car prices. First, the model will be estimated by OLS; if high collinearity is found among the independent variables, Lasso regression will also be applied. Further, quantile regression will be used, for the first time in the analysis of used car price determinants, to examine whether the determinants differ across price levels. After this analysis, several models will be built to predict used car prices. We will again use the OLS model from the first part of the analysis, which will serve as the benchmark, and compare its performance with various machine learning methods (decision tree, random forest, k-nearest neighbors, and support vector regression). From these methods we will then select the one that predicts prices most accurately.
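The benchmark comparison described above can be illustrated with a minimal, standard-library-only sketch. It is not the thesis implementation: the data are synthetic, the attribute names (age, mileage, private) and every coefficient, including the sign of the private-seller effect, are assumptions made up for illustration, and k-nearest neighbors stands in for the whole family of machine learning methods.

```python
import math
import random

random.seed(0)  # fixed seed so the sketch is reproducible

def make_listing():
    """Draw one synthetic listing: (features, log price). All numbers are made up."""
    age = random.uniform(0, 15)              # years
    mileage = random.uniform(0, 300)         # thousand km
    private = float(random.random() < 0.4)   # 1 = previous owner, 0 = dealership
    log_price = (13.0 - 0.08 * age - 0.002 * mileage - 0.05 * private
                 + random.gauss(0, 0.1))
    return [age, mileage, private], log_price

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def fit_ols(xs, ys):
    """OLS with an intercept via the normal equations (X'X) beta = X'y."""
    rows = [[1.0] + x for x in xs]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(k)]
    return solve(xtx, xty)

def predict_ols(beta, x):
    return beta[0] + sum(b * v for b, v in zip(beta[1:], x))

def predict_knn(train_x, train_y, x, scale, k=5):
    """Average log price of the k nearest training listings (features rescaled)."""
    dist = [(sum(((a - b) / s) ** 2 for a, b, s in zip(t, x, scale)), y)
            for t, y in zip(train_x, train_y)]
    dist.sort(key=lambda d: d[0])
    return sum(y for _, y in dist[:k]) / k

def rmse(pred, actual):
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(pred, actual)) / len(actual))

data = [make_listing() for _ in range(400)]
train, test = data[:300], data[300:]
train_x, train_y = [d[0] for d in train], [d[1] for d in train]
test_x, test_y = [d[0] for d in test], [d[1] for d in test]

beta = fit_ols(train_x, train_y)
scale = [15.0, 300.0, 1.0]  # feature ranges, used to put distances on a common scale
rmse_ols = rmse([predict_ols(beta, x) for x in test_x], test_y)
rmse_knn = rmse([predict_knn(train_x, train_y, x, scale) for x in test_x], test_y)

print("OLS coefficients (const, age, mileage, private):", [round(b, 4) for b in beta])
print("out-of-sample RMSE  OLS: %.3f  k-NN: %.3f" % (rmse_ols, rmse_knn))
```

Because the synthetic prices are linear in the attributes by construction, the OLS benchmark is expected to do well here; the sketch only shows the mechanics of fitting on one sample and comparing out-of-sample errors, and says nothing about which method would perform best on real listings.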
Expected Contribution
The contribution will lie partly in the creation of a new, unique dataset suitable for analyzing the determinants of used car prices. Further, although existing literature addresses the same research question, to the author's knowledge this will be the first work focused on the Czech market. It can offer valuable insight into the behavior of participants in the domestic used car market, and the prediction models based on these determinants can serve as a reference point for both sellers and buyers in estimating the optimal price at which to sell or buy a car, respectively. Moreover, quantile regression will be applied for the first time to the analysis of used car price determinants, which makes it possible to examine how their effects differ across price levels.
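For reference, the standard quantile regression estimator can be written as follows. The notation is introduced here and is not taken from the proposal: p_i is the price of listing i, x_i its vector of attributes, and τ the quantile of interest; modelling the logarithm of the price is an assumption.

```latex
\hat{\beta}(\tau) = \arg\min_{\beta} \sum_{i=1}^{n} \rho_\tau\!\left(\ln p_i - x_i^{\top}\beta\right),
\qquad
\rho_\tau(u) = u\,\bigl(\tau - \mathbf{1}\{u < 0\}\bigr), \quad \tau \in (0,1).
```

Estimating the coefficients at several values of τ (for example 0.1, 0.5 and 0.9) and comparing them is what allows the effect of an attribute to differ between cheap and expensive cars.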

Outline

1. Introduction
2. Literature Review
3. Data Description
4. Creation of the Dataset
5. Dataset Overview
6. Methodology
7. Results
8. Conclusion
 