Thesis (Selection of subject)

Predicting Stock Market Trends from News Articles

Thesis title in Czech:	Předpovídání trendů akciového trhu z novinových článků
Thesis title in English:	Predicting Stock Market Trends from News Articles
Key words:	Předpovídání, akciový trh, novinové články
English key words:	Prediction, stock market, news articles
Academic year of topic announcement:	2017/2018
Thesis type:	diploma thesis
Thesis language:	English
Department:	Institute of Formal and Applied Linguistics (32-UFAL)
Supervisor:	doc. RNDr. Vladislav Kuboň, Ph.D.
Author:	hidden - assigned and confirmed by the Study Dept.
Date of registration:	22.05.2018
Date of assignment:	11.06.2018
Confirmed by Study dept. on:	07.08.2018
Date and time of defence:	11.09.2018 09:00
Date of electronic submission:	19.07.2018
Date of submission of printed version:	20.07.2018
Date of completed defence:	11.09.2018
Opponents:	doc. Mgr. Barbora Vidová Hladká, Ph.D.

Guidelines

Interest in predicting stock market trends stems from the desire to make profitable investment decisions. The Random Walk Theory and the Efficient Market Hypothesis suggest that it is impossible to outwit the market and that stock prices change at random. However, recent advances in machine learning and the growing availability of large-scale data have made it possible to apply state-of-the-art algorithms to this problem.
The thesis will attempt to analyze financial news articles, one of the most valuable sources of textual information influencing the stock market, in order to predict changes in stock prices. Although much work has been done in this area, many weak points remain. The thesis should address some of them.
The first issue concerns evaluation standards. Existing evaluation methods focus on various subtasks, choose different time frames for the prediction, and use different observation windows. Therefore, even when both the textual data and the stock market information are available, it is not self-evident how to align them in order to get meaningful results. The thesis should explore the possible data alignment options and find the ones yielding the best performance.
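As a rough illustration of what one alignment option could look like, the sketch below labels each trading day by the direction of the closing price over a prediction horizon and pairs it with the articles published in an observation window ending on that day. The function names, the `window` and `horizon` parameters, and the toy prices and headlines are all assumptions made for illustration; they are not taken from the thesis.

```python
from datetime import date, timedelta


def direction_labels(closes, horizon=1):
    """Label each trading day with 1 if the close `horizon` trading days
    later is higher than that day's close, else 0."""
    days = sorted(closes)
    return {
        days[i]: int(closes[days[i + horizon]] > closes[days[i]])
        for i in range(len(days) - horizon)
    }


def align(articles, closes, window=1, horizon=1):
    """Pair each labelled trading day with the texts published in the
    `window` calendar days ending on that day (inclusive)."""
    labels = direction_labels(closes, horizon)
    samples = []
    for day, label in sorted(labels.items()):
        start = day - timedelta(days=window - 1)
        texts = [text for published, text in articles if start <= published <= day]
        if texts:  # skip days with no news inside the observation window
            samples.append((day, texts, label))
    return samples


# Toy data, invented for illustration only.
closes = {
    date(2018, 1, 2): 100.0,
    date(2018, 1, 3): 101.5,
    date(2018, 1, 4): 101.0,
    date(2018, 1, 5): 102.0,
}
articles = [
    (date(2018, 1, 2), "Index opens the year higher"),
    (date(2018, 1, 3), "Tech shares rally"),
    (date(2018, 1, 4), "Investors turn cautious"),
]

samples = align(articles, closes, window=2, horizon=1)
for day, texts, label in samples:
    print(day, label, texts)
```

Varying `window` and `horizon` in a setup like this is one way to compare alignment options under otherwise identical conditions.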
Second, most researchers have used different datasets, so the implemented approaches cannot be compared directly. Moreover, most of these datasets were never published. If feasible, we will compare some of the existing approaches on the same dataset and explore the impact of various textual, meta-textual and non-textual features, both those mentioned in previous work and those we devise ourselves.
Last but not least, the accuracy of the best classifiers never exceeds 65% on the binary classification task of predicting the direction of the change (against a baseline of roughly 50%), which shows that there is still room for improvement. We will try to achieve better results by combining the findings from previous research with our own conclusions.
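To make the comparison against the baseline concrete, the following minimal sketch computes the accuracy of a classifier's up/down predictions next to that of a majority-class baseline. The helper names and the label sequences are hypothetical and serve only to show the arithmetic; they are not results from the thesis or from the cited papers.

```python
from collections import Counter


def accuracy(gold, predicted):
    """Fraction of positions where the predicted label equals the gold label."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    return sum(g == p for g, p in zip(gold, predicted)) / len(gold)


def majority_baseline(train_labels, n):
    """Predict the most frequent training label for all n evaluation items."""
    most_common = Counter(train_labels).most_common(1)[0][0]
    return [most_common] * n


# Hypothetical labels: 1 = index moved up, 0 = index moved down or stayed flat.
train_labels = [1, 1, 0, 1, 0, 1]
gold = [1, 0, 1, 1, 0, 1, 0, 1]
predicted = [1, 0, 0, 1, 1, 1, 0, 1]

model_acc = accuracy(gold, predicted)
baseline_acc = accuracy(gold, majority_baseline(train_labels, len(gold)))
print(f"model: {model_acc:.3f}, majority baseline: {baseline_acc:.3f}")
```

Because the share of up days in a given period is rarely exactly one half, a majority baseline can sit somewhat above 50%, so it is worth reporting it alongside the classifier accuracy.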

References

De Fortuny, Enric Junqué, et al. "Evaluating and understanding text-based stock price prediction models." Information Processing & Management 50.2 (2014): 426-441.

Ding, Xiao, et al. "Deep learning for event-driven stock prediction." IJCAI. 2015.

Schumaker, Robert P., and Hsinchun Chen. "Textual analysis of stock market prediction using breaking financial news: The AZFin text system." ACM Transactions on Information Systems (TOIS) 27.2 (2009): 12.

Vargas, Manuel R., Beatriz S. L. P. de Lima, and Alexandre G. Evsukoff. "Deep learning for stock market prediction from financial news articles." 2017 IEEE International Conference on Computational Intelligence and Virtual Environments for Measurement Systems and Applications (CIVEMSA). IEEE, 2017.

Zhai, Yuzheng, Arthur Hsu, and Saman K. Halgamuge. "Combining news and technical indicators in daily stock price trends prediction." International Symposium on Neural Networks. Springer, Berlin, Heidelberg, 2007.

Preliminary scope of work in English

In this work we attempted to predict the upward or downward movement of the S&P 500 index from news articles published by Bloomberg and Reuters. We employed an SVM classifier and conducted multiple experiments to better understand the shape of the data and the specifics of the task. As a result, we established common evaluation settings for all our subsequent experiments. We then incorporated various features into the model and replicated several approaches previously suggested in the literature. We identified some non-trivial dependencies in the data, which helped us achieve high accuracy on the development set. However, none of the models that we built showed comparable performance on the test set. We conclude that although some trends or patterns can be identified in a particular dataset, such findings rarely transfer to other data. Our experiments support the idea that the stock market changes at random and that high prediction quality can be achieved only on particular datasets and under very specific settings, not for the task of stock market prediction in general.