To ascertain the robustness of an investment strategy in the financial markets, professionals often default to backtesting. Backtesting is a process where professionals test the buying and selling rationale of their investment strategy using historical prices, and then measure the risk and return the strategy would have produced across the historical backtest period. The performance results of the backtest are then deemed to be somewhat indicative of how the strategy will perform in the future.
However, the dataset usually needs a few checks before any backtest is run. Beginners often skip these checks, and the resulting mistakes can easily go unnoticed. This is dangerous because no matter how great an investment strategy is, garbage input will produce garbage results. In this article, we look at 3 pre-backtesting errors which beginners usually commit.
1) Unaligned Dates
It is very common to backtest with different financial assets. For instance, an investment strategy could involve both the SPDR S&P500 Index ETF (SPY) and the DAX Futures (FDAX). Beginners often download historical prices for the same start and end date, and then proceed to backtest.
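However, it is important to note that different financial assets may be listed in different countries, each with its own holiday calendar. In the example above, the SPY does not trade on 4th July because it is a holiday in the US, while the FDAX still trades. This results in a price for the FDAX on 4th July but no price for the SPY on the same date. If the two series are simply placed side by side, prices from different days can end up being compared with each other, so the series should first be aligned on their common dates.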
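One possible way to align two price series is to keep only the dates on which both assets have a price. The sketch below assumes each series is stored as a dictionary keyed by date; the function name `align_on_common_dates` and all prices are made up for illustration and are not real market data.

```python
from datetime import date


def align_on_common_dates(series_a, series_b):
    """Keep only the dates present in both series (an inner join on date)."""
    common = sorted(series_a.keys() & series_b.keys())
    return [(d, series_a[d], series_b[d]) for d in common]


# Hypothetical prices: SPY has no entry for 4th July, FDAX does.
spy = {
    date(2017, 7, 3): 242.0,
    date(2017, 7, 5): 243.0,
}
fdax = {
    date(2017, 7, 3): 12400.0,
    date(2017, 7, 4): 12430.0,
    date(2017, 7, 5): 12450.0,
}

aligned = align_on_common_dates(spy, fdax)
for day, spy_price, fdax_price in aligned:
    print(day, spy_price, fdax_price)
```

Dropping unmatched dates is only one option. Depending on the strategy, it may be preferable to keep every date and fill the gaps instead, which leads to the next problem.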
2) Missing Data
Sometimes, the data obtained has missing data points, even if it came from Bloomberg, one of the top data providers in the world. The data could be missing because, for example, there was a mistake in the recording of prices at the exchange. Whatever the cause, this will create holes in the dataset which we use for backtesting.
Many professionals use a forward filling methodology to overcome this problem, where they assume that the missing data is equivalent to the previous trading day’s prices. For example, suppose the stock price of Apple is missing on 10th July 2013. We would then assume that the stock price of Apple on 10th July is the same as that of 9th July 2013.
However, this solution has a caveat when several consecutive days of data are missing. In such cases, the same price is repeated for several periods, which makes the asset look as if it did not move at all over those days.
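Forward filling, together with a guard against long gaps, could be sketched as follows. This is a minimal illustration, assuming missing prices are represented as `None`; the function name `forward_fill`, the `max_gap` parameter and the prices are hypothetical.

```python
def forward_fill(prices, max_gap=1):
    """Replace None with the last known price, but fill at most `max_gap`
    consecutive missing points; anything beyond that stays None for review."""
    filled = []
    last_known = None
    gap = 0
    for price in prices:
        if price is not None:
            last_known = price
            gap = 0
            filled.append(price)
        else:
            gap += 1
            can_fill = last_known is not None and gap <= max_gap
            filled.append(last_known if can_fill else None)
    return filled


# Hypothetical series: one isolated hole, then three missing days in a row.
raw = [100.0, None, 101.0, None, None, None, 102.0]
clean = forward_fill(raw, max_gap=1)
print(clean)  # [100.0, 100.0, 101.0, 101.0, None, None, 102.0]
```

Leaving the longer gap partly unfilled makes it visible, so that the analyst can decide whether to source the data elsewhere or exclude that period.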
3) Disjoint Prices
Due to different kinds of corporate actions such as dividends and stock splits, stock prices often go through a series of big jumps or dips. One example would be the recent dividend pay-out from Apple. Apple announced a cash dividend of USD0.63, and its stock price dropped by USD0.63 on the ex-dividend date on 11th May 2017. If we were to use stock prices that showed this drop of USD0.63, our investment strategy might misinterpret it as market forces causing the price to drop and wrongly recommend an action to take (i.e. buy, sell or hold).
Another example is in the case of stock splits for Apple. Over its trading history, Apple has undergone a total of 4 stock splits, with its most recent one in June 2014. Stock prices change drastically when undergoing stock splits: Apple’s price was divided by 7 in its last stock split because it was a 7-for-1 split.
As such, it is important for us to use continuous price series rather than disjoint prices. Yahoo Finance, for one, tends to provide stock prices that are already adjusted for both dividends and splits.
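To illustrate what an adjusted series looks like, the sketch below back-adjusts raw closing prices so that older prices are rescaled to be comparable with the latest ones. It assumes one common convention: prices before a split are divided by the split ratio, and prices before an ex-dividend date are multiplied by (1 - dividend / previous close). Data providers may use different methods, and the function name `back_adjust` and all prices here are made up. For simplicity, it also assumes that a split and a dividend never fall on the same date.

```python
def back_adjust(closes, splits=None, dividends=None):
    """Back-adjust raw closes (oldest first).

    splits:    {ex_date_index: ratio}, e.g. 7.0 for a 7-for-1 split
    dividends: {ex_date_index: cash dividend per share}
    The most recent prices are left unchanged; older ones are rescaled.
    """
    splits = splits or {}
    dividends = dividends or {}
    adjusted = list(closes)
    factor = 1.0
    # Walk backwards so each event affects only prices before its ex-date.
    for i in range(len(closes) - 1, -1, -1):
        adjusted[i] = closes[i] * factor
        if i in splits:
            factor /= splits[i]
        if i in dividends and i > 0:
            factor *= 1.0 - dividends[i] / closes[i - 1]
    return adjusted


# Hypothetical 7-for-1 split taking effect at index 2.
split_adjusted = back_adjust([700.0, 707.0, 101.0, 102.0], splits={2: 7.0})

# Hypothetical USD1.00 dividend with ex-date at index 1.
dividend_adjusted = back_adjust([100.0, 99.0], dividends={1: 1.0})

print([round(p, 2) for p in split_adjusted])
print([round(p, 2) for p in dividend_adjusted])
```

In both made-up examples the adjusted series no longer shows the artificial jump, so a strategy reading it would see only the price moves caused by the market.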
Backtesting is an important process frequently used for testing an investment strategy. In fact, all kristals at Kristal.ai are required to undergo some form of backtesting to ensure that they are suitable for investment. However, the steps to collect and clean the data prior to backtesting are equally necessary yet often neglected by many analysts out there. To better understand how we perform the entire backtest process at Kristal.ai, please feel free to comment on our blog or reach out to me at firstname.lastname@example.org.