Predicting Stock Market Trends from News Articles
Thesis title in Czech: | Předpovídání trendů akciového trhu z novinových článků |
---|---|
Thesis title in English: | Predicting Stock Market Trends from News Articles |
Keywords in Czech: | Předpovídání, akciový trh, novinové články |
Keywords in English: | Prediction, stock market, news articles |
Academic year of topic announcement: | 2017/2018 |
Thesis type: | diploma thesis |
Thesis language: | English |
Department: | Institute of Formal and Applied Linguistics (32-UFAL) |
Supervisor: | doc. RNDr. Vladislav Kuboň, Ph.D. |
Author: | hidden - assigned and confirmed by the Study Department |
Date of registration: | 22.05.2018 |
Date of assignment: | 11.06.2018 |
Date of confirmation by the Study Department: | 07.08.2018 |
Date and time of defence: | 11.09.2018 09:00 |
Date of electronic submission: | 19.07.2018 |
Date of submission of the printed version: | 20.07.2018 |
Date of completed defence: | 11.09.2018 |
Opponents: | doc. Mgr. Barbora Vidová Hladká, Ph.D. |
Guidelines |
The goal of predicting stock market trends stems from the desire to make profitable investment decisions. The Random Walk Theory and the Efficient Market Hypothesis suggest that it is impossible to outwit the market and that stock prices change at random. However, recent advances in machine learning and the growing availability of large-scale data have made it possible to apply state-of-the-art algorithms to this problem.
The thesis will attempt to analyze financial news articles, one of the most valuable sources of textual information influencing the stock market, in order to predict changes in stock prices. This idea has been studied before, but despite the substantial body of work in this area, many weak points remain. The thesis should address some of them.
The first issue concerns evaluation standards. Existing evaluation methods focus on various subtasks, choose various time frames for the prediction, and use different observation windows. Therefore, even if both the textual data and the stock market information are available, it is not self-evident how to align them in order to get meaningful results. The thesis should explore the possible data alignment options and find the ones yielding the best performance.
Second, most researchers used different datasets, so the implemented approaches cannot be compared directly. Moreover, most of these datasets were never published. If feasible, we will try to compare some of the existing approaches on the same dataset and explore the impact of various textual, meta-textual and non-textual features, both those mentioned in previous work and those we come up with ourselves.
Last but not least, the accuracy of the best classifiers never exceeds 65% for the binary classification task focusing on the directionality of the change (with a ~50% baseline), which shows that there is still room for improvement. We will try to achieve better results by combining the findings from previous research with our own conclusions. |
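The data alignment question raised above can be made concrete with a small Python sketch. It is an illustration under stated assumptions, not the procedure used in the thesis: the function name `align_news_with_labels`, the parameters `window_days` (observation window) and `horizon_days` (prediction time frame), and the input formats are all hypothetical. It pairs the news published within a window ending on a trading day with the direction of the closing-price change over the following trading days.

```python
from datetime import timedelta

def align_news_with_labels(articles, closes, window_days=1, horizon_days=1):
    """Pair news text with the direction of a later price change.

    articles:     list of (publication_date, text) tuples (assumed format)
    closes:       dict mapping trading date -> closing price (assumed format)
    window_days:  calendar days of news collected before each prediction
    horizon_days: trading days ahead at which the price change is measured
    """
    trading_days = sorted(closes)
    examples = []
    for i, day in enumerate(trading_days):
        j = i + horizon_days
        if j >= len(trading_days):
            break  # no future closing price left to compare against
        window_start = day - timedelta(days=window_days)
        # Keep news published after window_start, up to and including `day`.
        texts = [text for pub, text in articles if window_start < pub <= day]
        if not texts:
            continue  # nothing to predict from on this day
        # Label 1 = price went up over the horizon, 0 = flat or down.
        label = 1 if closes[trading_days[j]] > closes[day] else 0
        examples.append((day, " ".join(texts), label))
    return examples
```

Varying `window_days` and `horizon_days` gives different alignments of the same raw data, which is the kind of choice the guidelines ask to be explored. Whether news from the prediction day itself may be used depends on publication times relative to market close; this sketch works with dates only and ignores that detail.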
References |
De Fortuny, Enric Junqué, et al. "Evaluating and understanding text-based stock price prediction models." Information Processing & Management 50.2 (2014): 426-441.
Ding, Xiao, et al. "Deep learning for event-driven stock prediction." IJCAI. 2015.
Schumaker, Robert P., and Hsinchun Chen. "Textual analysis of stock market prediction using breaking financial news: The AZFin text system." ACM Transactions on Information Systems (TOIS) 27.2 (2009): 12.
Vargas, Manuel R., Beatriz SLP de Lima, and Alexandre G. Evsukoff. "Deep learning for stock market prediction from financial news articles." Computational Intelligence and Virtual Environments for Measurement Systems and Applications (CIVEMSA), 2017 IEEE International Conference on. IEEE, 2017.
Zhai, Yuzheng, Arthur Hsu, and Saman K. Halgamuge. "Combining news and technical indicators in daily stock price trends prediction." International symposium on neural networks. Springer, Berlin, Heidelberg, 2007. |
Preliminary scope of work in English |
In this work we attempted to predict the upward/downward movement of the S&P 500 index from news articles published by Bloomberg and Reuters. We employed an SVM classifier and conducted multiple experiments to better understand the shape of the data and the specifics of the task. As a result, we established common evaluation settings for all our subsequent experiments. We then incorporated various features into the model and replicated several approaches previously suggested in the literature. We were able to identify some non-trivial dependencies in the data, which helped us achieve high accuracy on the development set. However, none of the models that we built showed comparable performance on the test set. We conclude that, whereas some trends or patterns can be identified in a particular dataset, such findings are usually barely transferable to other data. Our experiments support the idea that the stock market changes at random and that high-quality prediction may be achieved only on particular datasets and under very special settings, but not for the task of stock market prediction in general. |
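The gap between development-set and test-set performance described above is easiest to judge against a trivial baseline. The following Python sketch shows one way to report that comparison; the helper names `accuracy`, `majority_baseline` and `report` are hypothetical and are not taken from the thesis. The baseline here always predicts the most frequent label seen in training, which for a roughly balanced up/down task is close to 50%.

```python
from collections import Counter

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("label lists must have the same length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def majority_baseline(y_train, y_eval):
    """Accuracy of always predicting the most frequent training label."""
    majority_label = Counter(y_train).most_common(1)[0][0]
    return accuracy(y_eval, [majority_label] * len(y_eval))

def report(name, y_train, y_eval, y_pred):
    """Summarise one evaluation split: accuracy, baseline, and their gap."""
    acc = accuracy(y_eval, y_pred)
    base = majority_baseline(y_train, y_eval)
    return {"split": name, "accuracy": acc, "baseline": base, "gain": acc - base}
```

Calling `report` once for the development set and once for the test set, with the same training labels, puts the two gains side by side. A model whose gain is clearly positive on the development set but near zero on the test set shows the lack of transferability that the abstract reports.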