Thesis (Selection of subject)
Thesis details
Deep Learning for Term Structure Forecasting
Thesis title in Czech: Deep Learning for Term Structure Forecasting
Thesis title in English: Deep Learning for Term Structure Forecasting
English key words: artificial neural networks, machine learning, Nelson-Siegel, term structure, yield curves, financial markets, bonds, computational learning theory, backtesting, forecasting
Academic year of topic announcement: 2018/2019
Thesis type: diploma thesis
Thesis language: English
Department: Institute of Economic Studies (23-IES)
Supervisor: doc. PhDr. Jozef Baruník, Ph.D.
Author: hidden - assigned by the advisor
Date of registration: 22.05.2019
Date of assignment: 22.05.2019
References
Core Bibliography:

Baruník, J., & Malinská, B. (2016). Forecasting the term structure of crude oil futures prices with neural networks. Applied Energy, 164, 366–379.

Bliss, R. R. (1996). Testing term structure estimation methods. FRB Atlanta Working Paper 96-12a.

Chollet, F., & Allaire, J. J. (2018). Deep Learning with R. Shelter Island, NY: Manning Publications.

Diebold, F. X., & Li, C. (2006). Forecasting the term structure of government bond yields. Journal of Econometrics, 130(2), 337–364.

Diebold, F. X., & Mariano, R. S. (1995). Comparing predictive accuracy. Journal of Business & Economic Statistics, 13(3), 253–263.

Diebold, F. X., & Rudebusch, G. D. (2013). Yield Curve Modeling and Forecasting: The Dynamic Nelson-Siegel Approach. Princeton, NJ: Princeton University Press.

Enders, W. (2010). Applied Econometric Time Series (3rd ed.). Hoboken, NJ: John Wiley & Sons, p. 463.

Gers, F. A., Schmidhuber, J., & Cummins, F. (2000). Learning to forget: Continual prediction with LSTM. Neural Computation, 12(10), 2451–2471.

Hochreiter, S., & Schmidhuber, J. (1997). Long short-term memory. Neural Computation, 9(8), 1735–1780.

Hornik, K., Stinchcombe, M., & White, H. (1989). Multilayer feedforward networks are universal approximators. Neural Networks, 2(5), 359–366.

Kuan, C.-M., & White, H. (1994). Artificial neural networks: An econometric perspective. Econometric Reviews, 13(1), 1–91.

Prieto, A., Atencia, M., & Sandoval, F. (2013). Advances in artificial neural networks and machine learning. Neurocomputing, 121, 1–4.

Zhang, G., Patuwo, B. E., & Hu, M. Y. (1998). Forecasting with artificial neural networks: The state of the art. International Journal of Forecasting, 14(1), 35–62.
Preliminary scope of work in English
Motivation:

The term structure of government bond yields is an invaluable indicator of market expectations about the future of the economy. These risk-free yields affect the decision making of many macroeconomic agents: the monetary policy of central banks, lending rates in the credit market, and, consequently, the asset allocations of firms and households. Forecasting the development of these yields is therefore of pivotal interest to a wide range of market participants and policy makers, since the smaller the error of a model's predictions, the better informed the resulting decisions can be. Unfortunately, forecasting in financial economics is often challenging: the relationships under analysis lack a clear structure, and liquid markets are generally efficient, adapting quickly to inefficiencies that temporarily arise.

Artificial neural networks (ANNs) are nonlinear models that have achieved unprecedented predictive results in various real-world tasks. The availability of data and computational power has enabled advances in areas such as perceptual recognition, internet data analysis, and multi-agent systems, to mention but a few (Prieto et al., 2013). In essence, many of these applications come down to forecasting a time series akin to our financial data, regardless of the origin of the series. Neural networks are universal function approximators (Hornik et al., 1989) capable of achieving any degree of accuracy on reasonable functions (Kuan & White, 1994), even with shallow architectures. Increasing the depth of the network, however, gives the model richer representation spaces in which complex features can be learned more easily. This greatly reduces the need for feature engineering and for the large numbers of neurons per layer that shallow networks may require to fit intricate dependencies well. Since ANNs are machine learning algorithms trained on patterns in the data, they are especially useful when abundant data are available and the underlying process is unclear and nonlinear (Enders, 2010), making parametric models difficult to estimate and too restrictive in their functional form.

In this thesis, we aim to establish whether ANN models have predictive power when applied to term structure forecasting, and which models perform best. Baruník & Malinská (2016) have previously shown that these nonlinear models can outperform more traditional ones in the dynamic Nelson-Siegel setting when forecasting the term structure of oil prices at certain maturities and horizons. We follow up on this literature and delve deeper into robustly optimising more advanced neural network architectures in order to find objectively well-performing models for our data of government bond yields.


Hypotheses:

We analyse whether ANN models forecast intraday bond yields well. Extensive hyperparameter tuning is carried out and various kinds of ANN models are trained to find the optimal ones. We test the following hypotheses about the models' out-of-sample performance; a minimal sketch of the benchmark models follows the list:

1. ANN models have statistically significantly smaller errors than the benchmark random walk model.
2. ANN models outperform the traditionally used AR and VAR models.
3. Deep neural network models outperform parsimonious shallow neural network structures.
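
For concreteness, the following is a minimal Python sketch of the benchmark models from hypotheses 1 and 2, run on simulated stand-in data; the lag order of 5 and the simulated factor series are illustrative assumptions rather than the thesis's final specification.

    import numpy as np
    from statsmodels.tsa.ar_model import AutoReg
    from statsmodels.tsa.api import VAR

    rng = np.random.default_rng(0)
    factors = rng.normal(size=(500, 3)).cumsum(axis=0)  # stand-in for three factor series
    train, test = factors[:400], factors[400:]

    # Hypothesis 1 benchmark: random walk ("tomorrow equals today").
    rw_forecast = np.repeat(train[-1:], len(test), axis=0)

    # Hypothesis 2 benchmarks: a univariate AR(5) per factor and a joint VAR(5).
    ar_forecasts = np.column_stack([
        AutoReg(train[:, i], lags=5).fit().forecast(steps=len(test))
        for i in range(train.shape[1])
    ])
    var_model = VAR(train).fit(5)
    var_forecast = var_model.forecast(train[-var_model.k_ar:], steps=len(test))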


Methodology:

Our data are intraday, high-frequency US and EU government bond futures prices with medium- to long-term maturities. The data are preprocessed into the structure the models require, using a form of interpolation to obtain constant-maturity futures prices, and then split into training, validation, and out-of-sample test sets. Cross-validation is used to make the analysis less dependent on the specific period of the test set. After fitting, the models are evaluated using several statistics (MAE, RMSE, an asymmetric loss function) and compared using the Diebold-Mariano (1995) test and other multi-model tests to assess the significance of the differences.
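
As an illustration of the comparison step, below is a minimal Python sketch of the Diebold-Mariano (1995) test under its asymptotic normal approximation; the loss function and forecast horizon h are left as parameters (squared error is the assumed default), and the rectangular-kernel long-run variance is one common implementation choice.

    import numpy as np
    from scipy import stats

    def dm_test(e1, e2, h=1, loss=np.square):
        """Diebold-Mariano statistic for equal predictive accuracy of two error series."""
        d = loss(np.asarray(e1)) - loss(np.asarray(e2))  # loss differential
        T = len(d)
        # Long-run variance of d: autocovariances up to lag h-1 (rectangular kernel).
        gamma = [np.mean((d[k:] - d.mean()) * (d[:T - k] - d.mean())) for k in range(h)]
        lrv = gamma[0] + 2 * sum(gamma[1:])
        dm = d.mean() / np.sqrt(lrv / T)
        p_value = 2 * stats.norm.sf(abs(dm))  # two-sided asymptotic p-value
        return dm, p_value

Passing loss=np.abs compares absolute rather than squared errors; hypothesis 1, for instance, can then be assessed by feeding the ANN and random walk error series to dm_test.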

Our modelling methodology is twofold. First, we use the dynamic Nelson-Siegel framework (Diebold & Li, 2006) to estimate factor loadings for our datasets; this compresses the information from several maturities into a smaller dimension (Diebold & Rudebusch, 2013). Second, ANN models are trained on the three factor time series and used to forecast the factors. In this way the whole term structure can be predicted by training our ANN models to recover the three parameters of the Nelson-Siegel model. Bliss (1996) shows that the other approaches to term structure modelling (no-arbitrage and equilibrium affine models) do not perform as well for yield estimation.
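
A minimal sketch of the factor extraction step, following the two-step approach of Diebold & Li (2006) with the decay parameter lambda held fixed; 0.0609 is the value they use for maturities measured in months, while the shape of the yields array and the maturity grid are assumptions about the preprocessed data.

    import numpy as np

    def ns_loadings(tau, lam=0.0609):
        """Nelson-Siegel loadings at maturities tau (Diebold & Li, 2006):
        y_t(tau) = b1_t + b2_t*(1 - exp(-lam*tau))/(lam*tau)
                 + b3_t*((1 - exp(-lam*tau))/(lam*tau) - exp(-lam*tau))"""
        x = lam * np.asarray(tau, dtype=float)
        slope = (1 - np.exp(-x)) / x
        return np.column_stack([np.ones_like(x), slope, slope - np.exp(-x)])

    def extract_factors(yields, tau, lam=0.0609):
        """Cross-sectional OLS of each period's curve on the fixed loadings.
        yields: (n_periods, n_maturities) array; returns (n_periods, 3) factor series."""
        X = ns_loadings(tau, lam)
        betas, *_ = np.linalg.lstsq(X, np.asarray(yields).T, rcond=None)
        return betas.T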

The ANN models themselves can leverage the time dependence of our data through the Long Short-Term Memory (LSTM) architecture of Hochreiter & Schmidhuber (1997), later extended with a forget gate by Gers et al. (2000). This is a recurrent type of network with additional internal operations and weights that determine how much of each past state the model should remember. The architecture is nowadays used in myriad real-world applications of a sequential nature, for instance Google's digital assistant. LSTM is therefore the prime candidate for our time series forecasting.
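
To fix ideas, here is a minimal Keras sketch of an LSTM that maps a rolling window of the three factor series to a one-step-ahead forecast; the window length of 20, the layer width, and the commented-out training settings are illustrative assumptions to be replaced by the tuned values.

    import numpy as np
    from tensorflow import keras

    def make_windows(series, window=20):
        """Slice a (T, 3) factor series into (samples, window, 3) inputs and next-step targets."""
        X = np.stack([series[i:i + window] for i in range(len(series) - window)])
        y = series[window:]
        return X, y

    model = keras.Sequential([
        keras.Input(shape=(20, 3)),
        keras.layers.LSTM(32),   # gated recurrent layer: learns how much past state to keep
        keras.layers.Dense(3),   # one-step-ahead forecast of the three factors
    ])
    model.compile(optimizer="adam", loss="mse")
    # model.fit(X_train, y_train, validation_data=(X_val, y_val), epochs=50, batch_size=64)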


Expected Contribution:

Following up on the currently scarce literature that uses neural networks for term structure forecasting, we explore the performance of more recent and complex architectures for modelling intraday bond yields. This requires carefully designing and carrying out extensive hyperparameter tuning of the models in order to find the best-performing ones for our problem. Hyperparameter tuning is an often overlooked and little-discussed part of training neural networks in the financial academic literature, because it is time-consuming to program, computationally intensive (Ioffe & Szegedy, 2015), unconstrained, and prone to overfitting (Hinton et al., 2015). Yet it is a crucial step for tasks where ANNs can leverage nonlinear relationships in the data, especially when the data are abundant and of higher frequency, as in our case; some model architectures may train well while others perform poorly, casting a bad light on the whole ANN family of models. This tuning and overfitting prevention is an essential technical part of our thesis, complementing the financial aspects and making the results trustworthy. Lastly, our applied analysis of this large high-frequency dataset is a contribution to the academic literature in itself.
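
As a sketch of what such tuning can look like, the random search below draws configurations from a hypothetical search space; both the space and the train_and_validate helper (which would fit a model on the training set and return its validation loss) are illustrative assumptions.

    import random

    search_space = {
        "units": [16, 32, 64, 128],
        "layers": [1, 2, 3],
        "dropout": [0.0, 0.2, 0.5],
        "learning_rate": [1e-2, 1e-3, 1e-4],
    }

    best_config, best_loss = None, float("inf")
    for _ in range(50):  # fixed trial budget
        config = {name: random.choice(values) for name, values in search_space.items()}
        loss = train_and_validate(config)  # hypothetical helper: fit model, return val. loss
        if loss < best_loss:
            best_config, best_loss = config, loss

Random search over a constrained space, with validation loss as the selection criterion, keeps the number of trials fixed and guards against the overfitting risk discussed above.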


Outline:

1. Introduction: motivation and summary of research goals

2. Literature Review: survey of relevant literature

3. Methodology: theory behind the models

4. Data: preprocessing, descriptive statistics, visualisations

5. Results and Discussion: comparison of forecasting models, out-of-sample performance

6. Conclusion: summary of main findings and future research possibilities
 