Years ago, while pitching 4pm estimated prices for non-US stocks (to replace the local closing prices) to a mutual fund company president, I said the estimated price is closer to the next day’s open 65% of the time. He countered that doing so incorrectly modified the local price 35% of the time and that, moreover, the local price had been used since the beginning of the industry. The company’s general counsel, however, supported my pitch with the reciprocal view: not using the estimate meant using the wrong price 65% of the time. I used this equivalent but more convincing argument in all subsequent meetings. Recalling this event while reading Captain Cook and the art of negative discovery of the southern continent made me reexamine my results.
I have focused on performance during 9 years of history when trades occurred on 315 days - but why not also focus on non-model day performance?
Consider these 1921 days. On them net return was -11.50%. (Some days had very large losses, so although returns were positive on 53% of the days, the net for all days was negative.)
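The parenthetical above can look paradoxical, so here is a minimal sketch of the arithmetic. The ten returns are made-up numbers chosen only to illustrate the effect, and `win_ratio_and_net` is a hypothetical helper, not part of the model.

```python
def win_ratio_and_net(daily_returns):
    """Return (share of days with a positive return, additive net return)."""
    wins = sum(1 for r in daily_returns if r > 0)
    return wins / len(daily_returns), sum(daily_returns)


# Hypothetical daily returns in percent: six small gains, four larger losses.
sample = [0.4, 0.3, 0.5, 0.2, 0.4, 0.3, -1.0, -0.9, -0.6, -0.5]
ratio, net = win_ratio_and_net(sample)
print(f"win ratio {ratio:.0%}, net return {net:.1f}%")
```

More than half the days win, yet the net is negative because the losing days are larger than the winning days.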
The table shows this non-model day weakness has persisted through history.
Results are one-sided. Column 6 shows that in every year but one model day win ratio was higher by at least 12%. Column 9 shows that in every year, average daily return was at least .08% higher. Column 12 shows that in only 2 years was model net return not the highest - and this excess was achieved by trading on only 15% of the days. Particularly compelling is the full period return difference (60.2% vs -11.5%) in the last row.
(Strong results, but my physicist friend’s suggestion, why not "Just buy-and-hold SPY", has merit. Its return over 2236 days (holding 24 hours/day) was 119% while model return over 315 days (holding only 4 hours/day) was 60.2%. Now time is the wrong variable to measure risk (Post 8), but still it's pleasant to earn such a return with so little time exposure. And a trade-war market collapse is more likely during the longer period.)
So is there any reason, besides the hassle of trading every day, to not use the model? One major concern is "data snooping." The model is based on GDP-weighted momentum in the morning, but I've refined the model to also depend on several other variables - volatility, ETF premium,... - which weren't obvious until I started testing. This snooping, looking at test results and then refining the model, casts a shadow on it. To address this, a pure statistician, whose papers are among the world's most referenced, has agreed to review my model for snooping. Results will be posted.
I notified the New York Times author that his reported 24 year results were concentrated in the first 15 years while results during the last 9 years differed dramatically. Surprisingly, no response so I'm writing a letter to the editor.
This post repeats the analysis in post 17 with a focus on Europe and China. Results are constructed somewhat differently. It uses additive (more intuitive) rather than compound returns, and it uses 9:45am prices (less noisy) rather than the 9:30am open.
Between 1994 and 2008 SPY returned 101% during the day while losing 2% at night (these are additive returns, not the compound returns listed in the last post), whereas between 2009 and 2018 it returned 68% during the day and 72% at night. So during the last 9 years, day returns roughly equaled night returns. But what about Europe and China?
Both regions have liquid ETFs (FXI for China and EZU for Europe) which can approximate day and night returns.
Consider China, which is open from 9pm to 4am. FXI's return from 4pm to 9:45am approximates this period, with the proviso that China between 4am and 9:45am and between 4pm and 9pm is unobservable. (OK, some of these periods could be measured by Asia/Europe ETFs or futures but these are illiquid.) China being closed from 4am to 9pm can be approximated by FXI from 9:45am to 4pm.
Similarly, Europe open from 4am till 11:30am can be approximated by EZU from 4pm to 11:30am, while Europe closed from 11:30am to 4am can be approximated by EZU from 11:30am to 4pm.
These approximations find China and Europe dropping while open and rising significantly while closed.
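As a rough sketch of how such open/closed returns can be added up from two prices per day, consider the function below. The name `day_night_additive` and the sample prices are assumptions for illustration only. For FXI the first price would be the 9:45am price and the second the 4pm close.

```python
def day_night_additive(morning_prices, close_prices):
    """Sum simple returns for the day window (morning -> close) and the
    night window (close -> next morning). Lists are aligned by date."""
    day = sum(c / m - 1 for m, c in zip(morning_prices, close_prices))
    night = sum(m_next / c - 1
                for c, m_next in zip(close_prices, morning_prices[1:]))
    return day, night


# Hypothetical prices for three consecutive sessions.
morning = [100.0, 102.0, 101.0]   # e.g. 9:45am prices
close = [101.0, 100.0, 103.0]     # 4pm closes
day, night = day_night_additive(morning, close)
print(f"day {day:.2%}, night {night:.2%}")
```

Because the returns are summed rather than compounded, each day counts equally no matter what happened before it.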
These results suggest a robotic model such as buy FXI every day at 9:45am and sell at 4pm. This system overcomes transaction costs but faces the danger that the "rise overnight; drop during trading hours" phenomenon stops as it did for SPY in 2008.
Also, these results might partially explain my world GDP-weighted momentum system. If non-US market rises are less frequent than drops, then when they do rise, the US market follows their lead.
I hear you saying: if the 2008 day-night shift happens for non-US markets, thereby killing the buy-FXI-every-day system, won't it also kill buy-SPY-on-selected-days? Possibly, but trading FXI every day returns about .02%/day after transaction costs with far more market exposure, whereas selected trading returns .15%/day. This should provide more time to "see" the regime shift.
Also, buy-FXI-every-day certainly depends on China rising when its markets are closed. It's speculation that buy-at-noon depends on this phenomenon, so perhaps China could fall during closed periods and buy-at-noon would continue to work.
Any system, however, can (will) stop working, but since buy-SPY-at-noon has worked for 9 years why not follow it? Time will tell, but for now all is well!
On Sunday Feb 4th the New York Times discussed a report that found: "Buying SPY at Close and Selling at Open" returned 571% versus a daytime return of -5%:
(Many stories focus on SPY since it's the world's most traded instrument.) This daily trading of SPY paralleled my system but at opposite times - I propose "Buying SPY at Noon and Selling at Close." They reported losses in SPY during the day, forcing the question: if SPY was dropping during the day, how could I select afternoons when it rose?
The study used 25 years of SPY data. But since GDP-based ETFs have only been liquid for 8 years, I examined daytime vs overnight profit for 8 and 10 year sub periods.
To stay consistent with the NY Times article, in the table below I use compound returns rather than the additive returns used in past posts. Recall the grammar school problem of a $100 item whose price is $121 after two 10% price increases. So, for a run of gains, compound returns are higher than my previous additive returns.
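The grammar school problem can be written out directly. This is only an illustration of the two conventions; the function names are mine.

```python
def additive_return(returns):
    """Sum of the per-period returns."""
    return sum(returns)


def compound_return(returns):
    """Growth of one unit invested through every period, minus one."""
    growth = 1.0
    for r in returns:
        growth *= 1.0 + r
    return growth - 1.0


two_increases = [0.10, 0.10]
print(round(additive_return(two_increases), 4))   # 0.2
print(round(compound_return(two_increases), 4))   # 0.21
```

Note the ordering flips when gains and losses mix: a 10% gain followed by a 10% loss is 0% additive but -1% compound.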
Note that results in the last ten years (row 3) differ from the prior fifteen (row 2): daytime returns shift dramatically from negative 39% to positive 53%. This answered my question. SPY has not been dropping during afternoons for more than 8 years. (Although the quoted study is correct, I've notified the NY Times about this sub period difference.)
Comparing afternoons in the last 2 rows shows the SPY_PM model's impressive performance. In the last 1765 afternoons SPY returned 23% or .01%/day. On model afternoons (329 days) it returned 57% or .17%/day. In 10 million random selections of 329 days not one selection had a better return. Strong significance is when the system is beaten by fewer than 1 in 100 random trials. These results are better than 1 in 10,000,000.
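For readers who want to try the random-selection test, here is a minimal sketch of the idea. The data are synthetic, the trial count is far below 10 million, and `monte_carlo_p_value` is a hypothetical name, so treat it as an outline of the method rather than my actual test code.

```python
import random


def monte_carlo_p_value(all_returns, n_selected, model_total,
                        trials=10_000, seed=0):
    """Share of random n_selected-day selections whose additive return
    is strictly better than the model's total."""
    rng = random.Random(seed)
    beats = 0
    for _ in range(trials):
        sample = rng.sample(all_returns, n_selected)  # without replacement
        if sum(sample) > model_total:
            beats += 1
    return beats / trials


# Synthetic afternoon returns in percent (an assumption, not SPY data).
rng = random.Random(1)
afternoons = [rng.gauss(0.01, 0.5) for _ in range(1765)]
p = monte_carlo_p_value(afternoons, 329, model_total=57.0, trials=2000)
print(f"share of random selections beating the model: {p}")
```

With a small number of trials a result of zero only says the true share is small; tightening the bound to 1 in 10,000,000 needs correspondingly more trials.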
Financial advisors and academics both oppose market timing. Their advice: When the market drops wait out the storm. But one recent academic work, "Volatility-Managed Portfolios" by Alan Moreira and Tyler Muir, disagrees saying: Reduce your investments when volatility increases. Don't just sit tight.
Their justification uses simple arithmetic. The Sharpe ratio (an accepted measure of return-to-risk) is return divided by volatility. The higher the better. Now volatility is reasonably predictable (except for large jumps like Feb 5th) whereas returns are totally unpredictable. Here’s the arithmetic: if the numerator (return) stays constant while the denominator (volatility) increases, the Sharpe ratio decreases. By decreasing investment we're less committed during low Sharpe ratio periods.
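The arithmetic can be sketched in a few lines. The 8% return and the volatility levels are made-up inputs, and the inverse-volatility sizing rule is a simple stand-in I chose for illustration; it is not necessarily the scaling rule the paper uses.

```python
def sharpe_ratio(expected_return, volatility):
    """Return divided by volatility, both in the same units (percent here)."""
    return expected_return / volatility


def volatility_scaled_weight(target_vol, current_vol, max_weight=1.0):
    """Hypothetical sizing rule: shrink the position as volatility rises,
    capped at max_weight so low volatility never implies leverage."""
    return min(max_weight, target_vol / current_vol)


# Same expected return, two volatility regimes.
print(sharpe_ratio(8.0, 10.0))   # 0.8
print(sharpe_ratio(8.0, 20.0))   # 0.4

# With a 10% volatility target, the position halves when volatility doubles.
print(volatility_scaled_weight(10.0, 20.0))   # 0.5
print(volatility_scaled_weight(10.0, 5.0))    # 1.0
```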
The authors tested their thesis with seven investment strategies always finding better returns by decreasing investment size when the market becomes more volatile. One of these strategies was momentum. By happenstance, my GDP-based momentum system, called SPY_PM, had been following their recommendation in two ways.
Low volatility (VIX). Earlier posts discussed relaxing SPY_PM's triggers during low VIX periods to achieve more trades. I didn't follow Moreira and Muir exactly by increasing assets during these periods, but more frequent trading gave an almost equivalent result.
High VIX. Originally SPY_PM was backtested with data back to 2013. During this period VIX exceeded 30 only during the China plunge of late 2016. (Even the Euro scare in the summer of 2012, solved with words by that “brilliant Italian” Mario Draghi, only brought VIX to 25.) See the chart below. Since SPY_PM wasn't tested with high VIX, it doesn’t trade when VIX exceeds 30. Implicitly I'm following their recommendation by eliminating trades during high volatility periods.
Moreira and Muir and the SPY_PM model reach the same conclusion from different directions. They're efficient market guys saying volatility can't predict returns. SPY_PM finds, however, that volatility does predict PM returns. (See Post 13 at etf12trade.com) One possible explanation is: GDP-weighted momentum posits that the US slowly follows world movement. But in high VIX periods the US acts more independently, while in low VIX periods the US follows the world.
Coincidentally, before I read the article, Post 15 compared SPY_PM having a VIX filter of 30 with a model having a VIX filter of 20. The filter of 30 was luck, so after reading the article I retested SPY_PM back to 2011, including the Euro sovereign debt crisis (Spain and Italy). Result: It works best with a filter of 23.
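Comparing filters amounts to dropping the afternoons whose noon VIX is above a cap and re-adding the returns. A minimal sketch, with invented trades and a hypothetical `filtered_return` helper (the real retest of course uses the model's own trade list):

```python
def filtered_return(trades, vix_cap):
    """Keep trades whose noon VIX is at or below vix_cap.
    trades: list of (vix_at_noon, afternoon_return_pct).
    Returns (number of trades kept, additive return of those trades)."""
    kept = [ret for vix, ret in trades if vix <= vix_cap]
    return len(kept), sum(kept)


# Invented trades, for illustration only.
trades = [(12.0, 0.5), (22.0, -0.3), (28.0, 0.4), (35.0, -1.0)]
for cap in (20, 23, 30):
    count, total = filtered_return(trades, cap)
    print(f"VIX cap {cap}: {count} trades, net {total:.1f}%")
```

Picking the cap that looks best on past data is itself a form of snooping, so the chosen value deserves an out-of-sample check.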
Early in this blog series a physicist asked why always holding SPY, called Buy&Hold, wasn’t a better strategy than trading at noon and selling at 4pm, called SPY_PM. My too quick response focused on holding time and risk. I found the model holds SPY only 2% of the time and then concluded, incorrectly, that it's only one fiftieth as risky. But then an ex-CEO mentioned that Time was the wrong variable - one hour on Saturday evening when all markets are closed shouldn't be equated with one hour on Wednesday afternoon when the Fed is making announcements - so Post 8 compared volatility to compare periods.
Using volatility, Post 8 found SPY_PM is still less risky than Buy&Hold. Given the recent volatility and market drop, this post extends that analysis. It compares Buy&Hold losses with SPY_PM losses, looking at both holding time and volatility. Columns 2 and 3 group the number and amount of daily Buy&Hold losses into the categories of column 1. This is the physicist's strategy and also the strategy recommended by Warren Buffett for his wife. Columns 4 and 5 categorize all changes for every afternoon, that is, SPY changes between noon and 4pm. Columns 6 and 7 show afternoons selected by SPY_PM. Finally, columns 8 and 9 further restrict SPY_PM afternoons to only those afternoons when VIX at noon was less than 20.
Over the last 7 years the physicist's strategy (col 2 and 3) clearly wins, returning 104%, but not without some pain: ten days with losses between 3 and 5% and 165 days with losses between 1 and 3%. With SPY_PM (col 6 and 7) not one day had a loss greater than 2% and only 8 days had losses between 1 and 2%. Of course this comes with a lower return of 42.3%.
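The count and amount columns come from sorting losing days into size categories. Here is a minimal sketch of that grouping; the bucket edges and the sample returns are assumptions, since the table's exact categories aren't reproduced here.

```python
def bucket_losses(returns_pct, edges=(0.0, 1.0, 2.0, 3.0, 5.0)):
    """Group losing days by loss size.
    Returns {(low, high): [count, total_return]} where a loss of size L
    (a positive number) lands in the bucket with low <= L < high."""
    bounds = list(edges) + [float("inf")]
    buckets = {(lo, hi): [0, 0.0] for lo, hi in zip(bounds, bounds[1:])}
    for r in returns_pct:
        if r >= 0:
            continue  # only losing days are categorized
        loss = -r
        for (lo, hi), cell in buckets.items():
            if lo <= loss < hi:
                cell[0] += 1
                cell[1] += r
                break
    return buckets


# Invented daily returns in percent.
sample_days = [0.5, -0.4, -1.5, -1.2, -3.5, 2.0]
for (lo, hi), (count, total) in bucket_losses(sample_days).items():
    print(f"loss {lo}-{hi}%: {count} days, {total:.1f}% total")
```

Running the same function on all days, all afternoons, and model afternoons gives the paired columns for each strategy.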
SPY_PM's power becomes quite clear by comparing with trading every afternoon (col 4 and 5). Imagine on 1701 days buying SPY at noon and selling MOC. (Trading costs would be just 3-4% but let's ignore these.) This strategy returns 37%. SPY_PM, however, trades on only 493 days yet returns 42%!
The final two columns are a response to the last week of volatility. As mentioned in earlier posts, SPY_PM is now volatility adjusted so its triggers get wider when VIX is low. At the other extreme, what if one didn't trade on high VIX days? The result isn't conclusive. Yes, instead of 8 days with a 1-2% loss there are only 5, so less risk, but net return drops from 42% to 33%. As always, thoughts on this comparison and other ideas in here will be appreciated.
My model has powerful significance. Normal significance means a T-stat from a regression > 1.96 and less than 1 in 20 Monte Carlo tests beating a model. My model has a T-stat > 6 and only 1 in 1,000,000 Monte Carlo tests beat my model. Academics have noted, however, that its "out-of-sample" (OOS) performance is what really matters. Given I published my model in July, I now have six months of OOS data. The table to the right shows an OOS WinRatio of 70%.
OK, so my model works, really works, and its story, momentum, has pedigree - it's even accepted by academics who preach market efficiency. But I can't explain why only GDP-weighted momentum works. Why can't US-weighted momentum predict the US market? I've spent time with accepted statistical modeling and with some questionable data snooping techniques, yet neither could find any relation between SPY AM moves and SPY PM moves.
The first row above shows the average data-snooped result. It has very little variance. All models produced about the same results as all days. It's as if an "invisible hand" makes the correlation zero. And yet the power of a GDP-weighted momentum model is undebatable. Thoughts from readers will be appreciated.
With year-end the financial press will be reporting the performance of various sectors and strategies. Following them, although a few days early, this post reports performance of my trade-SPY-at-noon strategy. First it reports performance during the last 6 years, then at 6 volatility (VIX) levels.
The year-by-year performance could never have been predicted. (So for stock selection I believe in the efficient-market hypothesis, see Post 1; for instance, the odds of the health-care industry beating energy - or any sector beating any other - in 2018 are 50%.) We’ll see again (as discussed in posts 3, 7 and 10) that noon-strategy returns depend on volatility. The beauty of this is that by looking at VIX level we can continuously estimate the strategy’s returns. With time this can't be done. No one can estimate which sector will perform best next year.
The table shows the 2017 Win Ratio was 67.4%, that is, returns were positive on 67.4% of the days. Net return was low at 2.8% because of low volatility. As in Post 10, yearly significance in the Monte Carlo column (using only one-sixth of the full sample) almost passed the 5% threshold. The full six years have astounding significance.
And note the dual ability of VIX. It allows estimation of strategy returns and of strategy significance.
The table of six VIX quantiles shows these dual predictive abilities of VIX even more forcefully. On days when VIX has its three lowest levels the average daily return is .07% and Monte Carlo significance is weak. On the three highest level days average return is .25% with Monte Carlo always significant and below 1% twice. (The 15.7-18.3 level is a bit of an outlier but still significant at the 5% level.)
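A table like this can be built by sorting days by VIX, cutting them into six equal-sized groups, and averaging the return in each. The sketch below assumes that construction; the function name and the twelve sample days are hypothetical.

```python
def mean_return_by_vix_group(days, n_groups=6):
    """days: list of (vix, return_pct). Sort by VIX, split into n_groups
    nearly equal groups, and return [(vix_low, vix_high, mean_return), ...]."""
    ordered = sorted(days, key=lambda d: d[0])
    n = len(ordered)
    rows = []
    for g in range(n_groups):
        chunk = ordered[g * n // n_groups:(g + 1) * n // n_groups]
        if not chunk:
            continue  # fewer days than groups
        rets = [ret for _, ret in chunk]
        rows.append((chunk[0][0], chunk[-1][0], sum(rets) / len(rets)))
    return rows


# Invented days where return rises with VIX, for illustration only.
days = [(10 + i, 0.01 * i) for i in range(12)]
for low, high, mean_ret in mean_return_by_vix_group(days):
    print(f"VIX {low}-{high}: average daily return {mean_ret:.3f}%")
```

A Monte Carlo significance figure for each group would then be computed within that group's days only.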
The charts below repeat the table to emphasize the correlation between VIX and return. In the yearly chart, rank order would be exact except that 2017 with lowest volatility is minimally higher than 2014 with the second lowest volatility. The pink band shows VIX between 10% and 90% of its range during each year.
The volatility chart displays the median of 6 VIX quantiles against average daily return. The three lowest VIX groups have the 3 lowest returns. Again the 5th group is visually, but not significance wise, an outlier.
To conclude this year, the July comment from a physicist, "why not just buy-and-hold SPY", needs an answer. Yes, in 2017, when SPY returned 20.1%, trading at noon and 4pm can hardly be justified with its 2.7% return. I would, however, say it can be justified with a longer view:
In 2015 SPY B&H returned 1.3% while the noon-strategy returned 8.5%;
In 2016 SPY B&H returned 12.0% while the noon-strategy returned 5.7%.
This post has been deleted. It only explained the error in Post 11