When we backtest a strategy on a portfolio, we get a simple analysis of a single period in time. There are ways to “stress test” a strategy, such as Monte Carlo simulation, random portfolios, or shuffling the returns in a random order. I could never really wrap my head around Monte Carlo, and shuffling the returns seemed to be a better approach because the actual returns of the backtest are used, but it misses one important thing: the impact of consecutive periods of returns. If we are backtesting a strategy and we want to minimize max drawdown, consecutive down periods have a significant impact on max drawdown. If, for example, the max drawdown occurred over 4 consecutive months in 2008, we want to keep those 4 months together when shuffling returns.

In my opinion, a better way to shuffle returns is to shuffle “blocks” of returns. This is nothing new; the TradingBlox software does Monte Carlo analysis this way. I had a look at the boot and tseries packages for their bootstrap functions, but they were not giving me what I wanted. I wanted to visualize a number of equity curves with blocks of returns randomly shuffled.

To accomplish this in R, I wrote two functions. The shuffle_returns function takes three arguments: an xts object of returns, the number of samples to run (i.e. how many equity curves to generate), and the number of periods of returns that make up a ‘block’. The ran_gen function is a helper called within shuffle_returns; it generates the random numbers, each repeated once per period in a block, that are used to reorder the returns block by block.

shuffle_returns returns an xts object with the random blocks of returns so we can do further analysis such as max drawdown, plotting, or pretty much anything in the PerformanceAnalytics package that takes an xts object as an argument.

This is not a perfect implementation of this idea, so if anybody knows of a better way I’d be glad to hear from you.

The example below uses sample data from edhec and generates 100 equity curves with blocks of 5 consecutive periods of returns.

R code:

require(PerformanceAnalytics)

#Function that grabs a random number and then repeats that number r times
ran_gen <- function(x, r){
  #x is an xts object of asset returns
  #r is for how many consecutive returns make up a 'block'
  vec <- c()
  total_length <- length(x)
  n <- total_length/r
  for(i in 1:n){
    vec <- append(vec,c(rep(sample(1:(n*100),1), r)))
  }
  diff <- as.integer(total_length - length(vec))
  vec <- append(vec, c(rep(sample(1:(n*100),1), len = diff)))
  return(vec)
}

shuffle_returns <- function(x, n, r){
  #x is an xts object of asset returns
  #n is the number of samples to run
  #r is for how many consecutive returns make up a 'block' and is passed to ran_gen
  mat <- matrix(data = x, nrow = length(x))
  for(i in 1:n){
    temp_random <- ran_gen(x = x, r = r)
    temp_mat <- as.matrix(cbind(x, temp_random))
    temp_mat <- temp_mat[order(temp_mat[,2]),]
    temp_ret_mat <- matrix(data = temp_mat[,1])
    mat <- cbind(mat, temp_ret_mat)
  }
  final_xts <- xts(mat, index(x))
  return(final_xts)
}

#get sample data
data(edhec)
a <- edhec[,1]
a <- head(a, -1)

start <- Sys.time()
yy <- shuffle_returns(a, 100, 5)
chart.CumReturns(yy[,1:NCOL(yy)], wealth.index = TRUE, ylab = "Equity", main ="Generated Equity Curve of Shuffled Returns")
end <- Sys.time()
print(end-start)
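Once the shuffled returns are in hand, one natural follow-up is the distribution of max drawdown across the generated curves. Here is a minimal base-R sketch of that idea. It is not part of the original code: max_drawdown and ret_mat are names made up for the example, and the returns are simulated stand-ins rather than the output of shuffle_returns. PerformanceAnalytics has its own maxDrawdown function, which is probably the more natural choice on an xts object.

```r
# Maximum drawdown of one series of periodic returns (hypothetical helper).
max_drawdown <- function(r) {
  wealth <- cumprod(1 + r)              # equity curve, starting capital of 1
  peak   <- cummax(c(1, wealth))[-1]    # running high-water mark, including the start value
  max(1 - wealth / peak)                # largest peak-to-trough loss, as a positive fraction
}

set.seed(42)
# Stand-in data (an assumption): 100 columns of 120 simulated monthly returns.
ret_mat <- matrix(rnorm(120 * 100, mean = 0.005, sd = 0.03), nrow = 120)

dd <- apply(ret_mat, 2, max_drawdown)   # one max drawdown per equity curve
print(quantile(dd, c(0.05, 0.5, 0.95))) # spread of drawdowns across curves
```

With the real output, something like apply(coredata(yy), 2, max_drawdown) should give one drawdown per shuffled curve.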


What you are describing is, I think, a bootstrap sampling method. As you said, it is not a new problem. To account for autocorrelation, i.e. runs of consecutive returns, there are the block bootstrap and the stationary bootstrap. Explaining the two methods is simple but not short. In R you can easily run both of them with the function ‘tsbootstrap’ in the ‘tseries’ package.
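For readers who want to see the block idea without installing anything, here is a rough base-R sketch of a moving block bootstrap. It is not the tsbootstrap function from tseries and makes no claim to match its internals; block_bootstrap is a hypothetical helper and the returns are simulated. Unlike the shuffle in the post, blocks here are drawn with replacement, so some periods can appear more than once and others not at all.

```r
# Moving block bootstrap sketch (hypothetical helper, base R only).
block_bootstrap <- function(x, b) {
  n <- length(x)
  stopifnot(b >= 1, b <= n)
  n_blocks <- ceiling(n / b)
  # Draw block start positions with replacement from every valid start.
  starts <- sample.int(n - b + 1, n_blocks, replace = TRUE)
  idx <- unlist(lapply(starts, function(s) s:(s + b - 1)))
  x[idx[seq_len(n)]]                     # trim so the result has length n
}

set.seed(1)
x  <- rnorm(100, mean = 0.005, sd = 0.03) # made-up monthly returns
xb <- block_bootstrap(x, b = 5)           # one resampled series, blocks of 5
```

Each run of 5 values in xb is a stretch of 5 consecutive periods from x, which is what keeps a cluster of bad months together.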

Thanks, I’ll take a harder look at the tseries package.

You might also be interested in checking out the “Maximum Entropy” bootstrap. I haven’t ever applied this method to testing trading strategies, but I’m sure it’s applicable:

http://cran.r-project.org/web/packages/meboot/index.html

Interesting post. Is there any statistic function that takes a data series as input and returns the persistence factor of the data series or the median block size of consecutive losses or consecutive wins?

In R, please find the package ‘np’ and refer to function ‘b.star’, which I think should work for you:

Description

b.star is a function which computes the optimal block length for the continuous variable data using the method described in Patton, Politis and White (2009).

Thanks for the info, there are so many packages on R… it’s a great community!

I’m not familiar with the persistence factor; could you send a link or some info on it? There might be something in PerformanceAnalytics that deals with consecutive losses; otherwise I’m sure we could write a function to find the count of N consecutive gains or losses.
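As a rough idea of what such a function could look like, base R’s rle() (run-length encoding) does most of the work. This is only a sketch: streaks is a made-up name, the returns are invented, and a return of exactly zero is treated as neither a gain nor a loss.

```r
# Summarise runs of consecutive gains and losses (hypothetical helper).
streaks <- function(r) {
  runs <- rle(sign(r))                    # runs of +1 (gain), -1 (loss), 0 (flat)
  loss_runs <- runs$lengths[runs$values == -1]
  win_runs  <- runs$lengths[runs$values == 1]
  list(
    longest_win     = max(c(0, win_runs)),
    longest_loss    = max(c(0, loss_runs)),
    median_loss_run = median(loss_runs)   # NA if there are no losing periods
  )
}

# Invented returns: 2 gains, 3 losses, 1 gain, 1 loss, 3 gains.
r <- c(0.02, 0.01, -0.03, -0.01, -0.02, 0.04, -0.01, 0.03, 0.02, 0.01)
s <- streaks(r)
print(s)
```

Counting runs of at least N losses would then be sum(loss_runs >= N) inside the same function.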

Thanks for the suggestions. Ross, maybe there are better alternatives to the persistence factor. One premise in the post is that if a strategy has consecutive periods of losses, this has a significant impact on max drawdown. My premise is slightly orthogonal to it: yes, consecutive periods have a significant impact, but that impact can work the other way, i.e. it can help reduce max drawdown considerably. For example, say the underlying strategy produces a trending/persistent equity curve, up or down. Then how about applying equity-curve-based money management to the strategy, such as taking trades only when the strategy’s equity curve is above its 10-period MA? Similarly, one can vary position size. The key to all this, IMO, is the persistence factor of the strategy’s equity curve.
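To make the equity curve filter concrete, here is one possible reading of it in base R. Everything in it is an assumption made for illustration: the name equity_curve_filter, the simple moving average, the one-period lag, and the simulated returns. The equity curve being tested is that of the unfiltered strategy.

```r
# Trade the next period only if the unfiltered equity curve ended the current
# period above its k-period simple moving average (hypothetical helper).
equity_curve_filter <- function(r, k = 10) {
  equity <- cumprod(1 + r)
  # Trailing moving average; NA for the first k - 1 periods.
  ma <- as.numeric(stats::filter(equity, rep(1 / k, k), sides = 1))
  signal <- as.numeric(equity > ma)
  signal[is.na(signal)] <- 0              # stay flat until the average exists
  signal <- c(0, head(signal, -1))        # lag one period to avoid look-ahead
  r * signal                              # returns actually taken
}

set.seed(7)
r <- rnorm(120, mean = 0.005, sd = 0.03)  # made-up monthly strategy returns
filtered <- equity_curve_filter(r, k = 10)
```

Whether this reduces max drawdown depends on how persistent the equity curve really is, which is the commenter’s point.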

Hi rb,

I have modified your function ran_gen() so that there is NO for loop, and it uses bootstrap sampling (by calling runif()). Also, I have replaced the argument x with n, as the data is NOT actually used by the function; ONLY length(x) is used. These modifications reduce the execution time by about 10-15%.

As you have willingly shared your code for free, I am sharing some of my improvements made to your code, as I planned to use this for my Monte Carlo simulations.

boot_ran_gen <- function(n, r)
{
  #n is the length of object of asset returns
  #r is for how many consecutive returns make up a 'block'
  i <- trunc(n/r)
  j <- n %% r
  if( (i*r+j) != n ) stop("logical error")

  # For i times, runif() returns a random number between 1 and n
  # Duplicate r times, for each row
  idxNum <- round(runif(i, min=1, max=n))
  idxNum <- idxNum[ rep(1:length(idxNum), rep(r, length(idxNum))) ]

  # For ONE (1) time, runif() returns a random number between 1 and n
  # Duplicate j times, this row and append to idxNum
  modNum <- round(runif(1, min=1, max=n))
  modNum <- modNum[ rep(1, j) ]

  c(idxNum, modNum)
}

Note: The modNum at the end handles the case where n is not a multiple of r. For example, if n=100 and r=3, then idxNum is made up of 33 blocks of 3 (i is the integer quotient of n/r), and modNum is ONE block of 1 (j is the remainder).

Thanks for sharing… appreciate it!

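Note that sample(1:n, i, replace=T) does nearly the same job as round(runif(i, min=1, max=n)), but the two are not exactly equivalent: with round(runif()), the endpoints 1 and n are each drawn only about half as often as the interior values, whereas sample() is uniform over 1:n. Essentially, bootstrap is just sampling with replacement.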
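A quick way to compare the two calls is to tabulate a large number of draws from each. This is a small illustrative check with arbitrary values of n and draws; rounding a uniform draw on [1, n] leaves the endpoints 1 and n with only half-width intervals, so they show up less often than with sample().

```r
set.seed(123)
n     <- 5
draws <- 100000

# Relative frequency of each value 1..n under the two approaches.
freq_runif  <- table(factor(round(runif(draws, min = 1, max = n)), levels = 1:n)) / draws
freq_sample <- table(factor(sample(1:n, draws, replace = TRUE), levels = 1:n)) / draws

print(round(rbind(runif = freq_runif, sample = freq_sample), 3))
```

In boot_ran_gen the numbers only serve as sort keys, so the difference is unlikely to matter much there, but sample() or sample.int() is the safer habit when the draws are used as indices.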

The concept of Monte Carlo shouldn’t be too difficult to understand. It’s just running a numerical simulation. You can think of Nassim Taleb’s example of estimating π by picking points uniformly from the square [0,1] × [0,1] and then seeing what fraction of them fall within a circle with radius ½ centred at (½,½). The circle has area π/4 and the square has area 1, so that fraction, multiplied by 4, estimates π.
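The π example takes only a few lines of R. This is a sketch with an arbitrary seed and sample size; the estimate will wobble around π from run to run.

```r
set.seed(2024)
n_points <- 100000

# Uniform random points in the unit square [0,1] x [0,1].
x <- runif(n_points)
y <- runif(n_points)

# A point is inside the circle of radius 1/2 centred at (1/2, 1/2)
# when its squared distance from the centre is at most (1/2)^2.
inside <- (x - 0.5)^2 + (y - 0.5)^2 <= 0.25

pi_hat <- 4 * mean(inside)   # fraction inside estimates pi/4
print(pi_hat)
```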

Similarly, if you didn’t know how to value e.g. a swap formulaically, you could compute a bunch of “random” (you hope, representative) samples of “what could happen” and deduce the probabilities of those outcomes from your sim.