Why you can’t test intraday strategies with daily data

Let’s use this strategy trading the XLP ETF as an example. We want to buy XLP at the open if it opens down more than 0.1% and the previous day lost more than 0.25%. At first sight, we could test this strategy with the following simple Python code:

import numpy as np
import pandas as pd
import quandl

xlp_daily=quandl.get("YAHOO/XLP",start_date="2005-1-1",end_date="2013-1-1")

"""
    Yesterday must have been a down day of at least 0.25%.
    If XLP opens down more than 0.1% today, go long and exit on the close.
"""
signal =( ( -1 + xlp_daily.Open / xlp_daily.shift(1).Close < -0.001) &
          ( -1 + xlp_daily.shift(1).Close / xlp_daily.shift(2).Close < -0.0025) )

multiplier=1E5

profits=(signal*multiplier*(xlp_daily.Close-xlp_daily.Open))
returns=(signal*(-1+xlp_daily.Close/xlp_daily.Open))

profits.cumsum().plot()
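Before trusting the plot, it helps to sanity-check the signal logic on a toy frame. The prices below are made up: the third row opens 0.2% down after a 0.5% down day, so only it should trigger.

```python
import pandas as pd

# Three days of made-up OHLC data; only the last day should trigger:
# day 2 closed 0.5% below day 1, and day 3 opens 0.2% below day 2's close.
xlp = pd.DataFrame({'Open':  [100.0, 100.0, 99.301],
                    'Close': [100.0, 99.5, 99.4]})

signal = ((-1 + xlp.Open / xlp.shift(1).Close < -0.001) &
          (-1 + xlp.shift(1).Close / xlp.shift(2).Close < -0.0025))
print(signal.tolist())  # [False, False, True]
```

The NaNs produced by the shifts compare as False, so the first rows never trigger spuriously.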

Plotting the profits curve, we get the same result as the linked blog post:

What happens if we code this strategy in Quantopian?

import numpy as np


def initialize(context):
    schedule_function(func=sell_at_close,
                      date_rule=date_rules.every_day(),
                      time_rule=time_rules.market_close(minutes=5),
                      half_days=True)
    schedule_function(record_daily_values,
                      date_rules.every_day(),
                      time_rules.market_open(hours=0,minutes=3))  
    

def handle_data(context, data):
    xlp = sid(19659)
    open_today = data.history(xlp,'open',1,'1d')[-1]
    if (xlp not in context.portfolio.positions) and (not np.isnan(open_today)) and (not get_open_orders(xlp)):
        price = data.current(xlp,'price')
        closes = data.history(xlp,'close',3,'1d')
        pct_yesterday = -1 + closes[-2]/closes[-3]
        pct_today = -1 + open_today/closes[-2]
        if (pct_yesterday <= -0.0025) & (pct_today < -0.001):
            order_target(xlp, 5000/price)
            log.info('buying @%s' % price)
            
def sell_at_close(context, data):
    for stock in context.portfolio.positions:
        order_target_percent(stock, 0.0, style=MarketOrder())
        log.info('selling @%s' % data.current(stock,'price'))
       

def record_daily_values(context,data):
    # The record() function allows you to create a custom time-series plot of any variable in your simulation.
    # In this example we will record and plot the market exposure and leverage of our portfolio over time. 
    record(leverage = context.account.leverage)
    record(market_exposure=context.account.net_leverage)

First we notice that even when trading a small $5,000 position, the platform warns us that many sell orders placed near the close (5 minutes before) cannot be filled:

2005-02-18 15:34 handle_data:32 INFO buying @23.42
2005-02-18 21:55 sell_at_close:37 INFO selling @23.47
2005-02-18 22:00 WARN Your order for -213 shares of XLP failed to fill by the end of day and was canceled.
2005-02-22 21:55 sell_at_close:37 INFO selling @23.1
2005-03-10 15:31 handle_data:32 INFO buying @23.33
2005-03-10 21:55 sell_at_close:37 INFO selling @23.4
2005-03-10 22:00 WARN Your order for -214 shares of XLP has been partially filled. 145 shares were successfully sold. 69 shares were not filled by the end of day and were canceled.
2005-03-11 21:55 sell_at_close:37 INFO selling @23.26
2005-03-16 15:33 handle_data:32 INFO buying @23.04
2005-03-16 21:55 sell_at_close:37 INFO selling @23.09
2005-03-16 22:00 WARN Your order for -217 shares of XLP has been partially filled. 75 shares were successfully sold. 142 shares were not filled by the end of day and were canceled.

If we look at the cumulative returns, they look much worse than our simple backtest:

Why such different results?

  • Bad data: Yahoo open prices contain many errors.
  • Low liquidity: the simple backtest assumes you can trade at the open and close prices. Most of the time this is not the case, even with small positions.
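Quantopian's slippage model caps fills at a fraction of each bar's traded volume, which is what produces the partial fills in the log above. A minimal sketch of the idea (the 2.5% participation cap and the 3,000-share bar are illustrative assumptions, not Quantopian's exact parameters):

```python
import numpy as np

def capped_fill(target_shares, bar_volume, max_participation=0.025):
    # Shares we can realistically fill in one bar, assuming we can take
    # at most a fixed fraction of that bar's traded volume.
    return np.minimum(abs(target_shares),
                      max_participation * bar_volume) * np.sign(target_shares)

# Illustrative numbers: selling 213 shares into a thin closing bar with
# 3,000 shares traded allows only 2.5% * 3000 = 75 shares.
print(capped_fill(-213, bar_volume=3000))  # -75.0
```

This is exactly the kind of constraint a close-to-close vectorised backtest ignores.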


Implementing Andreas Clenow's Plunger Indicator in Python

I wanted to check a mean reversion model as described in this post from Andreas Clenow. For this system we need to implement a simple indicator, the Clenow Plunger. Its purpose is to measure the “pain” of trend followers when price moves against the trend.

First some imports:

import pandas as pd
import numpy as np
import talib
from talib import MA_Type
import quandl
import matplotlib.pyplot as plt
from matplotlib.finance import candlestick_ohlc
import matplotlib.dates as mdates
from matplotlib.pyplot import style
from matplotlib.dates import DateFormatter, WeekdayLocator,\
    DayLocator, MONDAY

Load daily bars of ES (S&P 500 e-mini) futures from Quandl:

sp500_daily = quandl.get("CHRIS/CME_ES1", 
                         start_date="2012-1-1", 
                         end_date="2014-12-30")

And define our indicator:

def EMA(closes, periods = 50):
    ema = talib.MA(closes.values, timeperiod = periods, matype = MA_Type.EMA)
    return pd.Series(data=ema, index=closes.index)

def ClenowPlunger(ohlcDaily):
    trendUp = EMA(ohlcDaily.Close, periods = 50) >  EMA(ohlcDaily.Close, periods = 100)
    trendDown = EMA(ohlcDaily.Close, periods = 50) <  EMA(ohlcDaily.Close, periods = 100)
    twentyDayHigh = pd.rolling_max(ohlcDaily.High, 20)
    twentyDayLow = pd.rolling_min(ohlcDaily.Low, 20)
    atr = pd.Series(data=talib.ATR(ohlcDaily.High.values, ohlcDaily.Low.values, ohlcDaily.Close.values, timeperiod=50),index=ohlcDaily.index)
    plungeUp = (twentyDayHigh - ohlcDaily.Close) / atr
    plungeDown = (ohlcDaily.Close - twentyDayLow) / atr
    clenowPlunger = trendUp*plungeUp+trendDown*plungeDown
    return pd.DataFrame(data={'ClenowPlunger':clenowPlunger,'TrendUp':trendUp,'TrendDown':trendDown},index=clenowPlunger.index)
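A side note: pd.rolling_max and pd.rolling_min used above were deprecated and later removed from pandas; on current versions the same windows are expressed through the .rolling() accessor. A quick sketch with synthetic data:

```python
import numpy as np
import pandas as pd

# Synthetic highs/lows: 30 business days of steadily rising prices.
idx = pd.date_range("2013-01-01", periods=30, freq="B")
high = pd.Series(np.linspace(100.0, 110.0, 30), index=idx)
low = high - 1.0

# Equivalent to pd.rolling_max(high, 20) and pd.rolling_min(low, 20)
twentyDayHigh = high.rolling(20).max()
twentyDayLow = low.rolling(20).min()
```

The first 19 values are NaN, just as with the old functions, since the window is not yet full.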

We will test our indicator on 2013 data for the e-mini S&P 500, so we can check whether it matches the behaviour described in the linked article:

sp500_daily['ClenowPlunger']= ClenowPlunger(sp500_daily).ClenowPlunger

sp500_2013 = sp500_daily['2013'].copy()  # copy the slice so adding columns later does not raise SettingWithCopyWarning

Let’s plot the result:

style.use('ggplot')

mondays = WeekdayLocator(MONDAY)
alldays = DayLocator()
weekFormatter = DateFormatter('%b %d')
dayFormatter = DateFormatter('%d')
    
fig = plt.figure()
ax1 = fig.add_subplot(211)
sp500_2013['Date']= mdates.date2num(sp500_2013.index.to_pydatetime())
sp500_candlestick = [tuple(x) for x in sp500_2013[['Date', 'Open', 'High', 'Low', 'Close']].to_records(index=False)]
candlestick_ohlc(ax1, sp500_candlestick)  
ax1.set_ylabel('ES Futures')
ax1.xaxis.set_major_locator(mondays)
ax1.xaxis.set_minor_locator(alldays)
ax1.xaxis.set_major_formatter(weekFormatter)
ax1.xaxis_date()
ax1.autoscale_view()
plt.setp(plt.gca().get_xticklabels(), rotation=45, horizontalalignment='right')

ax2 = fig.add_subplot(212)
ax2.set_ylabel('Clenow Plunger')
sp500_2013['ClenowPlunger'].plot(ax=ax2)

If you had to be long DAX one day per week, which one would it be?

This post replicates the plots in this tweet from @Kkta90:

First some imports:

from datetime import datetime, date, time
import pickle
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

Load 15-minute bars of continuously adjusted DAX futures from 2006 to 2016. I cannot redistribute the data, so you will have to obtain it from your broker or data provider of choice:

with open('dax-15min-2006-2016.pkl', 'rb') as f:
    dax_15min=pickle.load(f)

Build the cumulative product of the mean 15-minute returns for each day of the week:

dax_15min_rets = (dax_15min.Close/dax_15min.shift(1).Close)

dax_15min_rets_monday=-1+dax_15min_rets[dax_15min_rets.index.dayofweek==0].groupby(lambda t: time(t.hour,t.minute)).aggregate(np.mean).cumprod()
dax_15min_rets_tuesday=-1+dax_15min_rets[dax_15min_rets.index.dayofweek==1].groupby(lambda t: time(t.hour,t.minute)).aggregate(np.mean).cumprod()
dax_15min_rets_wednesday=-1+dax_15min_rets[dax_15min_rets.index.dayofweek==2].groupby(lambda t: time(t.hour,t.minute)).aggregate(np.mean).cumprod()
dax_15min_rets_thrusday=-1+dax_15min_rets[dax_15min_rets.index.dayofweek==3].groupby(lambda t: time(t.hour,t.minute)).aggregate(np.mean).cumprod()
dax_15min_rets_friday=-1+dax_15min_rets[dax_15min_rets.index.dayofweek==4].groupby(lambda t: time(t.hour,t.minute)).aggregate(np.mean).cumprod()
dax_15min_rets_avg=-1+dax_15min_rets.groupby(lambda t: time(t.hour,t.minute)).aggregate(np.mean).cumprod()
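The five near-identical assignments above can also be generated with a dict comprehension. A sketch on synthetic data (the constant one-basis-point bar return and the 08:00–22:00 session are purely illustrative):

```python
import pandas as pd
from datetime import time

# Synthetic stand-in for dax_15min_rets: two weeks of 15-minute gross returns
# during an assumed 08:00-22:00 session, weekdays only.
idx = pd.date_range("2016-01-04 08:00", "2016-01-15 22:00", freq="15min")
idx = idx[(idx.dayofweek < 5) & (idx.hour >= 8) & (idx.hour <= 21)]
rets = pd.Series(1.0001, index=idx)  # constant 1 bp per bar, for illustration

days = ['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday']
curves = {name: -1 + (rets[rets.index.dayofweek == dow]
                      .groupby(lambda t: time(t.hour, t.minute))
                      .mean()
                      .cumprod())
          for dow, name in enumerate(days)}
curves['Average'] = -1 + (rets.groupby(lambda t: time(t.hour, t.minute))
                          .mean().cumprod())
```

Each entry in `curves` is the same kind of intraday cumulative-return series built above, keyed by weekday.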

And plot:

def add_subplot(id,title, rets):
    plt.subplot(2, 3, id)
    plt.plot(rets)
    plt.title(title)
    plt.xlim(time(8,15), time(22,15))
    plt.xticks([time(8,0),time(10,0),time(12,0),time(14,0),time(16,0),time(18,0),time(20,0),time(22,0)])
    plt.ylim(-0.0010, 0.0015)
    
add_subplot(1,'Monday', dax_15min_rets_monday)
add_subplot(2,'Tuesday', dax_15min_rets_tuesday)
add_subplot(3,'Wednesday', dax_15min_rets_wednesday)
add_subplot(4,'Thursday', dax_15min_rets_thrusday)
add_subplot(5,'Friday', dax_15min_rets_friday)
add_subplot(6,'Average', dax_15min_rets_avg)
    
plt.suptitle('DAX 2006-2016')
plt.show()

This is what you get:

Note that the return series for each day doesn’t start at 0. This is because of the mean overnight gap. I found it interesting to keep this information, but you can always subtract it, as in the tweet from @Kkta90.
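To reproduce the tweet's normalisation, subtract each curve's first value (the mean overnight gap) so every series starts at 0. A minimal sketch with illustrative numbers:

```python
import pandas as pd

curve = pd.Series([0.0004, 0.0006, 0.0009])  # illustrative cumulative returns
curve_ex_gap = curve - curve.iloc[0]         # remove the mean overnight gap
print(curve_ex_gap.iloc[0])                  # 0.0
```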

VIX closes on a 1 year low and below 12, what happens next?

This tweet from @aaljechin looked interesting. Let’s look at what happens after a one-year low in the VIX index:

  • The VIX must close below its minimum of the last 255 trading days (roughly one business year).
  • No other one-year low in the past month.
  • The VIX must close below 12.

First get some data:

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import scipy as sp
import scipy.stats
import quandl

vix_index = quandl.get("YAHOO/INDEX_VIX")
vix_fut = quandl.get("CHRIS/CBOE_VX1")

Create signals for both the index and the futures. There were no VIX futures traded before June 2005.

# vix closes on a 1 year low and below 12
signal1 = ( (pd.rolling_min(vix_index.shift(1).Close,255) > vix_index.Close)  &
          (vix_index.Close < 12) )

# and no match in the last month
signal2 = signal1 & (pd.rolling_sum(signal1.shift(1),20) == 0)

signal = signal2[signal2==True]

# no VIX futures before this date...
signal_fut = signal['2005-6-20':]

These conditions find 16 events:

def pct_ret(close, amount):
    rets = (close.shift(-amount) / close) - 1
    return rets
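As a quick sanity check of the helper (restated here so the snippet runs standalone): going from 10 to 12 over two bars is a 20% forward return.

```python
import pandas as pd

def pct_ret(close, amount):
    # forward percentage return `amount` bars ahead
    return (close.shift(-amount) / close) - 1

close = pd.Series([10.0, 11.0, 12.0])
print(pct_ret(close, 2).iloc[0])  # ~0.2; the last two entries are NaN
```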

# 15 occasions
ret5days_index = pct_ret(vix_index.Close, 5)[signal.index].dropna()
ret10days_index = pct_ret(vix_index.Close, 10)[signal.index].dropna()
ret15days_index = pct_ret(vix_index.Close, 15)[signal.index].dropna()
ret30days_index = pct_ret(vix_index.Close, 30)[signal.index].dropna()
ret60days_index = pct_ret(vix_index.Close, 60)[signal.index].dropna()

# 20% avg 
ret60days_index.mean()

On average, the VIX is 20% above its one-year low 60 days after the event. This heatmap shows each instance:

vix_index_heatmap=vix_index.copy()

for i in range(1, 60):
    vix_index_heatmap[str(i)] = 100*pct_ret(vix_index["Close"], i)

vix_index_heatmap.drop(["Open","High","Low","Close","Volume","Adjusted Close"], axis=1, inplace=True)
vix_index_heatmap=vix_index_heatmap.loc[signal.index].dropna()

plt.figure(figsize=(16, 13))
sns.heatmap(vix_index_heatmap, yticklabels=False, annot=True, fmt=".0f")
plt.ylabel("VIX on 1 year low events")
plt.xlabel("Days after low")
plt.title("VIX year low vs. index returns N days later")

Can we trade this with VIX futures? Let’s check:

# 6 occasions
ret5days_fut = pct_ret(vix_fut.Close, 5)[signal_fut.index].dropna()
ret10days_fut = pct_ret(vix_fut.Close, 10)[signal_fut.index].dropna()
ret15days_fut = pct_ret(vix_fut.Close, 15)[signal_fut.index].dropna()
ret30days_fut = pct_ret(vix_fut.Close, 30)[signal_fut.index].dropna()
ret60days_fut = pct_ret(vix_fut.Close, 60)[signal_fut.index].dropna()

# -11% avg
ret60days_fut.mean()

In the six occasions the signal fired, we get an average return of -11% after 60 days. It looks like all the profit, and then some, is eaten by the contango in the VIX futures.
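A back-of-the-envelope check of that drag: suppose the front-month future trades 5% above spot (both numbers below are illustrative) and spot is unchanged at expiry. The future must converge down to spot, so a long position loses roughly the premium on every roll.

```python
spot = 14.0    # VIX spot at entry (illustrative)
front = 14.7   # front-month future, ~5% contango premium (assumed)

# If spot is flat, the future converges down to spot by expiry:
pnl_per_roll = spot / front - 1
print(round(100 * pnl_per_roll, 2))  # -4.76 (% per roll with spot unchanged)
```

Compounded over a couple of rolls, a loss of this size is enough to flip a +20% index move into a negative futures return.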

vix_fut_heatmap=vix_fut.copy()

for i in range(1, 60):
    vix_fut_heatmap[str(i)] = 100*pct_ret(vix_fut["Close"], i)

vix_fut_heatmap.drop(["Open","High","Low","Close","Settle", "Change",  "Total Volume",  "EFP","Prev. Day Open Interest"], axis=1, inplace=True)
vix_fut_heatmap=vix_fut_heatmap.loc[signal_fut.index].dropna()

plt.figure(figsize=(16, 13))
sns.heatmap(vix_fut_heatmap, yticklabels=False, annot=True, fmt=".0f")
plt.ylabel("VIX on 1 year low events")
plt.xlabel("Days after low")
plt.title("VIX year low vs. futures returns N days later")

SPY ETF: Three down days and gap up?

Backtesting this overnight mean reversion trading idea from Quantified Strategies:

  • SPY must be down 3 days in a row (close to close).
  • Entry at the close of the 3rd down day.
  • Exit at the next day’s open.

import numpy as np
import pandas as pd
import quandl

spy_daily=quandl.get("YAHOO/INDEX_SPY",start_date="2001-1-1")

signal =( (spy_daily.Close < spy_daily.shift(1).Close) &
          (spy_daily.shift(1).Close < spy_daily.shift(2).Close) &
          (spy_daily.shift(2).Close < spy_daily.shift(3).Close))

multiplier=1E5

profits=(signal*multiplier*(spy_daily.shift(-1).Open-spy_daily.Close))
returns=(signal*(-1+spy_daily.shift(-1).Open/spy_daily.Close))

profits.cumsum().plot()

# hit rate
len(profits[profits>0])/signal.sum()

# sharpe
np.sqrt(252) * (np.mean(returns)) / np.std(returns)
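Since the same two statistics recur in these backtests, they can be wrapped in a small helper. A sketch on toy data (the numbers are made up; note that pandas' .std() uses ddof=1, so it differs marginally from np.std above):

```python
import numpy as np
import pandas as pd

def summarize(profits, returns, signal):
    # Hit rate over triggered trades, and annualised Sharpe of the daily
    # returns series (zeros on non-signal days included, as above).
    hit_rate = len(profits[profits > 0]) / signal.sum()
    sharpe = np.sqrt(252) * returns.mean() / returns.std()
    return hit_rate, sharpe

# Toy data: two winning trades out of three signals.
signal = pd.Series([True, True, True, False])
profits = pd.Series([50.0, -20.0, 30.0, 0.0])
returns = pd.Series([0.005, -0.002, 0.003, 0.0])
hit, sharpe = summarize(profits, returns, signal)
print(round(hit, 3))  # 0.667
```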

Looks like it has lost its edge in the last three years:

Backtest of Gold futures (GC) strategy with 3 day holding period

I’m going to backtest this Gold futures strategy from @Dburgh:

First we import the basic packages: pandas and numpy for time series manipulation and quandl for retrieving data:

import numpy as np
import pandas as pd
import quandl

Then we need to obtain daily data for Gold futures (GC) from Quandl:

gold_daily = quandl.get("CHRIS/CME_GC1", 
                         start_date="2001-5-14", 
                         end_date="2016-11-23")
# use Last column as Close
gold_daily.columns=['Open', 'High', 'Low', 'Close', 
                     'Change', 'Settle', 'Volume', 
                     'Open Interest']

Since the algorithm logic is pretty simple, we can backtest it using a vectorised approach instead of an event-driven loop:

multiplier=100

signal=( (gold_daily.Low > gold_daily.shift(5).Close) & 
        (gold_daily.shift(3).Open > gold_daily.shift(9).Low) & 
        (gold_daily.shift(7).Open > gold_daily.shift(8).Close) & 
        (gold_daily.index.dayofweek == 2) )

# hold time 3 days
profits=(signal*multiplier*(gold_daily.shift(-3).Close-gold_daily.Close))
returns=(signal*(-1+gold_daily.shift(-3).Close/gold_daily.Close))

Now we can plot the cumulative returns:

profits.cumsum().plot()

And this is what we get:

Want to know the hit rate of this strategy? Around 65%

len(profits[profits>0])/signal.sum()

Sharpe ratio? 0.89

np.sqrt(252) * (np.mean(returns)) / np.std(returns)

Backtesting S&P 500 futures (ES) long-only 1 day hold strategy

I am going to backtest the following strategy from @Dburgh in python:

First we import the basic packages: pandas and numpy for time series manipulation and quandl for retrieving data:

import numpy as np
import pandas as pd
import quandl

Then we need to obtain daily data for ES futures from Quandl:

sp500_daily = quandl.get("CHRIS/CME_ES1", 
                         start_date="2000-1-1", 
                         end_date="2016-10-1")
sp500_daily.columns=['Open', 'High', 'Low', 'Close', 
                     'Change', 'Settle', 'Volume', 
                     'Open Interest']

Since the algorithm logic is pretty simple, we can backtest it using a vectorised approach instead of an event-driven loop:

multiplier=50

# close[0] <= close[9] && 
# low[0] <= low[1] && 
# low[3] <= high[6] && 
# volume[0] <= volume[1]
signal = ((sp500_daily.Close <= sp500_daily.shift(9).Close) & 
          (sp500_daily.Low <= sp500_daily.shift(1).Low) & 
          (sp500_daily.shift(3).Low <= sp500_daily.shift(6).High) & 
          (sp500_daily.Volume <= sp500_daily.shift(1).Volume))
# hold time 1 day        
profits=(signal*multiplier*(sp500_daily.shift(-1).Close-sp500_daily.Close))
returns=(signal*(-1+sp500_daily.shift(-1).Close/sp500_daily.Close))

Now we can plot the cumulative returns:

profits.cumsum().plot()

And this is what we get:

Want to know the hit rate of this strategy? Around 60%

len(profits[profits>0])/signal.sum()

Sharpe ratio? Near 1

np.sqrt(252) * (np.mean(returns)) / np.std(returns)