One trading strategy we can adopt is to first model the market, forecast returns for the next trading day, and then go long or short depending on whether the forecasted returns are positive or negative. Model the market, you say? **What foolery!** Who can possibly hope to capture the millions of individual (human and machine) decisions, which themselves occur in response to millions of external stimuli, and which all interact with each other in mysterious and magical ways to produce market outcomes?

But that is not what a model does. We build a model to capture the essentials of what drives whichever process we are interested in. We **abstract away** from the details that we think are unnecessary or that overly complicate our lives. These choices can be contentious, as someone else may consider those details critical and your model useless if it doesn’t capture them. I personally don’t put much faith in forecasting returns, but thanks to modern technology we can test how successful our forecasts are. And besides, if macro-economists can attempt to model whole economies, why pick on an undergrad who tries to model the stock market?

I’ll be using the **ARMA (Auto Regressive Moving Average)** and the **GARCH (Generalized Auto Regressive Conditional Heteroskedasticity)** models in combination to compute forecasts. This has been done previously by Quintuitive, who runs an extremely informative and helpful blog. I have learned much from his writings and owe a lot of my learning to him. He has also generously hosted the code for testing these models. I used his code with some very slight modifications, which I’ll detail further on.

**Bring on the models**

Before that though, a brief introduction to ARMA and GARCH is in order. We are using the ARMA model (with to-be-optimized parameters p and q) to specify the **conditional mean** of the stock market returns and the GARCH(1,1) model to specify the **conditional variance** of these returns. Conditional on what, you ask? Conditional on the information set we provide the model with. So, for example, let’s say we have stock data from 2001-01-01 to 2005-01-01. If today is 2004-12-01, the forecast for tomorrow will be based on today’s returns and on a predetermined window of past returns, for example the last 500 trading days. Why do we create a window for the information set, instead of using all the data available to us? This lets us be more flexible in determining the process we are trying to model: it **allows the process to evolve** instead of just updating with each new data point. It also makes the model more **responsive to structural change** in the Data Generating Process (DGP). This paper provides more information on how to select the information set.
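
To make the rolling-window idea concrete, here is a minimal Python sketch. It is not Quintuitive’s code and it does not fit ARMA+GARCH: the forecaster is a deliberately simple AR(1) least-squares stand-in, and the names `ar1_forecast` and `rolling_positions` are hypothetical, chosen only for illustration. What it shows is the mechanics: each forecast uses only the most recent `window_size` returns, and the position is the sign of the forecast.

```python
from statistics import mean


def ar1_forecast(window):
    """Stand-in forecaster (an assumption, not ARMA+GARCH):
    fit r_t = c + phi * r_{t-1} by least squares on the window
    and return the one-step-ahead forecast."""
    x = window[:-1]  # lagged returns
    y = window[1:]   # returns one day later
    x_bar, y_bar = mean(x), mean(y)
    var_x = sum((xi - x_bar) ** 2 for xi in x)
    if var_x == 0:
        # No variation in the window: fall back to the window mean.
        return y_bar
    phi = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / var_x
    c = y_bar - phi * x_bar
    return c + phi * window[-1]


def rolling_positions(returns, window_size=500, forecaster=ar1_forecast):
    """For each day t >= window_size, forecast the return of day t from
    the previous window_size returns only, then go long (+1), short (-1),
    or stay flat (0) according to the sign of the forecast."""
    positions = []
    for t in range(window_size, len(returns)):
        window = returns[t - window_size:t]  # information set for day t
        forecast = forecaster(window)
        if forecast > 0:
            positions.append(1)
        elif forecast < 0:
            positions.append(-1)
        else:
            positions.append(0)
    return positions
```

Any function that maps a list of past returns to a single number can be passed as `forecaster`, so a proper ARMA+GARCH fit could be swapped in without changing the windowing logic.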

**What is ARMA?**

The ARMA(p,q) model consists of an auto-regressive part and a moving average part, with p auto-regressive terms and q moving average terms. The values of p and q are together referred to as the order of the model. The ARMA model is good for modelling situations in which a system is **vulnerable to exogenous, unexplained shocks** (a stock market!) but also shows **mean-reverting behaviour** (did you say stock market?). The moving-average part helps model the exogenous shocks to market returns, and the auto-regressive part allows the process to mean-revert. The mean-reversion bit is also why it’s better to have a rolling window for our information set.
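
For reference, one standard textbook way of writing the ARMA(p,q) model for the return $r_t$ is given below. The symbols are notational assumptions that do not appear elsewhere in this post: $c$ is a constant, $\phi_i$ are the auto-regressive coefficients, $\theta_j$ are the moving-average coefficients, and $\varepsilon_t$ is the shock at time $t$.

$$
r_t = c + \sum_{i=1}^{p} \phi_i \, r_{t-i} + \varepsilon_t + \sum_{j=1}^{q} \theta_j \, \varepsilon_{t-j}
$$

The first sum is the auto-regressive part (past returns) and the second sum is the moving-average part (past shocks). The conditional mean of $r_t$ is everything on the right-hand side except $\varepsilon_t$.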

**What is GARCH?**

The GARCH(1,1) model is useful for modelling time series in which today’s variance is a function of the previous period’s variance and the previous period’s shock. This helps model the **‘volatility clustering’** we see in the stock market, where periods of high volatility are clustered around each other and take some time to decay. There are many variations of the ARCH model (GARCH is just one of them), some of which may be better suited to pairing with ARMA, but I’m not familiar with them so I didn’t use them. Here is an easy-to-read paper on ARCH models.
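
The GARCH(1,1) variance recursion is short enough to sketch directly. The Python below is an illustrative sketch rather than a fitting routine: it assumes the parameters `omega`, `alpha` and `beta` are already known (in practice they are estimated from data), and the function name `garch11_variance` is hypothetical. It shows how one large shock raises the conditional variance and how that effect then decays gradually, which is the clustering described above.

```python
def garch11_variance(residuals, omega, alpha, beta):
    """Conditional variance path of a GARCH(1,1) model:

        sigma2[t] = omega + alpha * eps[t-1]**2 + beta * sigma2[t-1]

    `residuals` are the shocks eps (returns minus their conditional mean).
    The parameters are taken as given; this sketch does not estimate them.
    """
    if not (omega > 0 and alpha >= 0 and beta >= 0 and alpha + beta < 1):
        raise ValueError("need omega > 0, alpha, beta >= 0 and alpha + beta < 1")
    # Start the recursion at the unconditional (long-run) variance.
    sigma2 = [omega / (1 - alpha - beta)]
    for eps in residuals:
        sigma2.append(omega + alpha * eps ** 2 + beta * sigma2[-1])
    # One element longer than `residuals`: the last value is the
    # one-step-ahead variance forecast.
    return sigma2
```

With `omega=0.1`, `alpha=0.1` and `beta=0.8`, for instance, feeding in a single large shock followed by zeros produces a variance that jumps and then drifts back towards its long-run level over several steps instead of dropping back immediately.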

A good textbook I found for understanding and implementing these models is Time Series Analysis and its Applications: With R Examples.

In the next post I will be going over the code posted by Quintuitive to test the forecasts made by this model.