Corn Calendar Spread Drivers

Introduction

The aim of this write-up is to investigate which fundamental features drive corn calendar spreads.

For each calendar spread we start out with a random forest model that tries to forecast the value of the spread with input features consisting of the stock-to-usage numbers of

  • Argentina
  • Brazil
  • China
  • Russia
  • Ukraine
  • United States
  • World
  • World without China

as well as the number of days the front month contract has to expiry. The stock-to-usage numbers are determined from the monthly WASDE reports and reflect the amount of ending stocks relative to consumption for each of the regions listed.
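As a minimal illustration of the ratio described above, the sketch below computes a stock-to-usage number from ending stocks and consumption. The function name and the figures are hypothetical and are not taken from any WASDE report.

```python
def stock_to_usage(ending_stocks, consumption):
    """Return ending stocks as a fraction of consumption."""
    if consumption <= 0:
        raise ValueError("consumption must be positive")
    return ending_stocks / consumption


# Illustrative figures only; both inputs must be in the same units.
ratio = stock_to_usage(ending_stocks=30.0, consumption=300.0)
print(f"stock-to-usage: {ratio:.1%}")
```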

Due to the randomness involved when constructing random forest models we choose ten different seed values when training the models. We also perform hyperparameter tuning of

  • the number of estimators
  • maximum depth
  • minimum impurity decrease

over a 3-fold cross-validation of the training sample. We leave 20% of the total data set for out-of-sample testing. The feature importances of the trees making up the best models are recorded. The means and standard deviations of the feature importances are shown on bar charts with error bars.
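The training procedure described above could be sketched roughly as follows. This assumes scikit-learn, which the write-up does not name, and uses synthetic data, a random 80/20 split, three seeds instead of ten, and made-up grid values to keep the example small. The function name is hypothetical.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV, train_test_split


def tree_importances_over_seeds(X, y, seeds, param_grid):
    """Tune one forest per seed; pool the importances of the best models' trees."""
    # Leave 20% of the data for out-of-sample testing (random split assumed).
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=0
    )
    per_tree, test_scores = [], []
    for seed in seeds:
        search = GridSearchCV(
            RandomForestRegressor(random_state=seed),
            param_grid,
            cv=3,  # 3-fold cross-validation on the training sample
        )
        search.fit(X_train, y_train)
        best = search.best_estimator_
        per_tree.extend(tree.feature_importances_ for tree in best.estimators_)
        test_scores.append(best.score(X_test, y_test))  # out-of-sample R^2
    per_tree = np.array(per_tree)
    return per_tree.mean(axis=0), per_tree.std(axis=0), test_scores


# Synthetic stand-in data: only the first of four features drives the target.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
y = 3.0 * X[:, 0] + 0.1 * rng.normal(size=200)

# Hypothetical grid values; the write-up does not state the ones it used.
param_grid = {
    "n_estimators": [20, 40],
    "max_depth": [3, 6],
    "min_impurity_decrease": [0.0, 0.01],
}
mean_imp, std_imp, test_scores = tree_importances_over_seeds(
    X, y, seeds=range(3), param_grid=param_grid
)
print(mean_imp.round(3), std_imp.round(3))
```

The returned means and standard deviations are the quantities one would draw as bars and error bars.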

An interesting aspect of machine learning models, in particular random forests, boosted trees and neural networks, is that they are able to capture nonlinearities and interaction effects. By interaction effects we mean how the combination of two or more features influences the value we are trying to model. An example of this is how the United States stock-to-usage affects the ZN spread when the front month contract has lots of time versus only a month left to expiry. In order to study the

  • linear,
  • non-linear and
  • pairwise interaction

terms of the calendar spread models we make use of the fingerprint method of Li, Turkington and Yazdani.

Quick Overview of the Fingerprint method

This section is technical and quite mathematical. The interested reader is encouraged to follow along, but its main purpose is to serve as a quick reminder of how the functions are constructed. Feel free to skip to the next section if you are not interested in the technical details. The material follows Li, Turkington and Yazdani directly.

Denote the model prediction function \(\hat{f}\) we are trying to find as

\[ \hat{y} = \hat{f}(x_1, \dots, x_m) \] In general the prediction function depends on the \(m\) input parameters or features. The partial dependence function only depends on one of the features, \(x_k\). For a given value of \(x_k\), this partial dependence function returns the expected value of the prediction over all other possible values for the other predictors, which we denote as \(x_{\backslash k}\). The partial dependence function is then defined as

\[ \hat{y}_k = \hat{f}_k(x_k) = E\left[\hat{f}(x_k, x_{\backslash k})\right] = \int \hat{f}(x_1, \dots, x_m) \, p(x_{\backslash k}) \, dx_{\backslash k} \] where \(p(x_{\backslash k})\) is the probability distribution over \(x_{\backslash k}\).

In practice we follow the following steps:

  1. Choose a value of the feature \(x_k\), say \(\alpha\)
  2. Combine this value with one of the actual input vectors for the remaining variables, \(x_{\backslash k}\), and generate a new prediction from the function: \(\hat{y} = \hat{f} (x_1, \dots, x_{k-1}, \alpha, x_{k+1}, \dots, x_m)\).
  3. Repeat step 2 with every input vector for \(x_{\backslash k}\), holding the value for \(x_k = \alpha\) constant, and record all predictions.
  4. Average all the predictions for this value of \(x_k\) to arrive at the value of the partial dependence function at that point, \(\hat{f}_k(\alpha)\).
  5. Repeat steps 1 through 4 for any desired values of \(x_k\) and plot the resulting function.
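The steps above can be sketched in a few lines of plain Python. Here `predict` is assumed to be any callable that maps a list of feature rows to a list of predictions (the `predict` method of a fitted regressor would be one example), and `toy_predict` is a made-up stand-in for a fitted model.

```python
def partial_dependence(predict, rows, k, grid):
    """Average prediction at each grid value of feature k (steps 1 to 5)."""
    curve = []
    for alpha in grid:  # step 1: choose a value for x_k
        # steps 2-3: overwrite feature k with alpha in every observed row
        modified = [list(row[:k]) + [alpha] + list(row[k + 1:]) for row in rows]
        predictions = predict(modified)
        # step 4: average the predictions for this value of x_k
        curve.append(sum(predictions) / len(predictions))
    return curve  # step 5: one averaged prediction per grid value


# Toy stand-in for a fitted model: f(x) = 2*x0 + 3*x1.
def toy_predict(rows):
    return [2.0 * r[0] + 3.0 * r[1] for r in rows]


rows = [[0.0, 1.0], [1.0, 2.0], [2.0, 3.0]]
pd_curve = partial_dependence(toy_predict, rows, k=0, grid=[0.0, 1.0, 2.0])
print(pd_curve)  # [6.0, 8.0, 10.0]
```

For this linear toy model the curve is the straight line \(2\alpha + 3\bar{x}_1\) with \(\bar{x}_1 = 2\), as expected.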

The partial dependence function will have small deviations if a given variable has little influence on the model’s predictions. Alternatively, if the variable is highly influential, we will observe large fluctuations in prediction based on changing the input values.

Next, we decompose a variable’s marginal impact into a linear component and a nonlinear component by obtaining the best fit (least squares) regression line for the partial dependence function. We define the linear prediction effect, the predictive contribution of the linear component, as the mean absolute deviation of the linear predictions around their average value. Mathematically we write,

\[\text{Linear Prediction Effect}(x_k) = \frac{1}{N} \sum^{N}_{i=1}\left| \hat{l}_k(x_{k,i}) - \frac{1}{N} \sum^{N}_{j=1} \hat{f}_k(x_{k,j}) \right|\]

In the above equation, for a given predictor \(x_k\), the prediction \(\hat{l}_k(x_{k,i})\) results from the linear least squares fit of its partial dependence function \(\hat{f}_k\), and \(x_{k,i}\) is the \(i\)th value of \(x_k\) in the dataset.

Next, we define the nonlinear prediction effect, the predictive contribution of the nonlinear component, as the mean absolute deviation of the total marginal (single variable) effect around its corresponding linear effect. When this procedure is applied to an ordinary linear model, the nonlinear effects equal precisely zero, as they should. Mathematically we write,

\[ \text{Nonlinear Prediction Effect}(x_k) = \frac{1}{N} \sum^{N}_{i=1}\left| \hat{l}_k(x_{k,i}) - \hat{f}_k(x_{k,i}) \right| \]
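A minimal sketch of the two effects, assuming the partial dependence values have already been evaluated at the observed values of the feature. The helper names are hypothetical, and the least-squares line is fitted by hand to keep the example free of dependencies.

```python
def fit_line(xs, ys):
    """Ordinary least-squares line through (xs, ys); returns (slope, intercept)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def linear_nonlinear_effects(xs, pd_values):
    """Mean absolute deviations that define the two effects for one feature."""
    n = len(xs)
    slope, intercept = fit_line(xs, pd_values)
    line = [slope * x + intercept for x in xs]
    mean_pd = sum(pd_values) / n
    # Linear effect: fitted line around the average partial prediction.
    linear = sum(abs(fit - mean_pd) for fit in line) / n
    # Nonlinear effect: partial dependence around its fitted line.
    nonlinear = sum(abs(fit - f) for fit, f in zip(line, pd_values)) / n
    return linear, nonlinear


xs = [-2.0, -1.0, 0.0, 1.0, 2.0]
straight = linear_nonlinear_effects(xs, [2.0 * x + 1.0 for x in xs])
curved = linear_nonlinear_effects(xs, [x ** 2 for x in xs])
print(straight, curved)
```

A straight partial dependence curve has a purely linear effect, while the symmetric parabola has a flat best-fit line, so its effect is entirely nonlinear.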

A similar method can be applied to isolate the interaction effects attributable to pairs of variables \(x_k\) and \(x_l\), simultaneously. The procedure for doing this is the same as given earlier, but in step 1 values for both variables are chosen jointly. The partial dependence function can then be written as

\[ \hat{y}_{k,l} = \hat{f}_{k,l}(x_k, x_l) = E\left[\hat{f}(x_k, x_l, x_{\backslash (k,l)})\right] = \int \hat{f}(x_1, \dots, x_m) \, p(x_{\backslash (k,l)}) \, dx_{\backslash (k,l)} \] where \(x_{\backslash (k,l)}\) denotes all features other than \(x_k\) and \(x_l\).

We define the pairwise interaction effect as the demeaned joint partial prediction of the two variables minus the demeaned partial predictions of each variable independently. When this procedure is applied to an ordinary linear model, the interaction effects equal precisely zero, as they should. Mathematically we write,

\[ \text{Pairwise Interaction Effect}(x_k, x_l) = \frac{1}{N^2} \sum^{N}_{i=1} \sum^{N}_{j=1} \left| \hat{f}_{k,l}(x_{k,i}, x_{l, j}) - \hat{f}_k(x_{k,i}) - \hat{f}_l(x_{l, j})\right| \]
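The pairwise effect can be sketched in the same style, with each partial dependence function demeaned before the difference is taken. This brute-force version evaluates the model on every pair of observed values and every row, so in practice one would likely subsample or use a coarser grid. The function names and toy models are hypothetical.

```python
def demean(values):
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def pairwise_interaction_effect(predict, rows, k, l):
    """Mean absolute gap between the joint and the summed single-feature effects."""
    xk = [row[k] for row in rows]
    xl = [row[l] for row in rows]

    def averaged(fixed):
        # Partial dependence with the features in `fixed` pinned to given values.
        modified = [[fixed.get(j, v) for j, v in enumerate(row)] for row in rows]
        predictions = predict(modified)
        return sum(predictions) / len(predictions)

    f_k = demean([averaged({k: a}) for a in xk])
    f_l = demean([averaged({l: b}) for b in xl])
    joint = demean([averaged({k: a, l: b}) for a in xk for b in xl])

    n = len(rows)
    total = 0.0
    for i in range(n):
        for j in range(n):
            total += abs(joint[i * n + j] - f_k[i] - f_l[j])
    return total / n ** 2


rows = [[0.0, 1.0], [1.0, 2.0], [2.0, 3.0]]
additive = pairwise_interaction_effect(
    lambda rs: [2.0 * r[0] + 3.0 * r[1] for r in rs], rows, 0, 1
)
product = pairwise_interaction_effect(
    lambda rs: [r[0] * r[1] for r in rs], rows, 0, 1
)
print(additive, product)
```

The additive toy model gives an interaction effect of zero, while the product model gives a strictly positive one.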

Corn Calendars

The corn curve consists of the following contract codes:

Code  Month
H     Mar
K     May
N     Jul
U     Sep
Z     Dec

Below we split the analysis into consecutive and longer dated calendars. The consecutive calendars are formed using the consecutive contract codes in the table above. These are

  • HK
  • KN
  • NU
  • UZ
  • ZH
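As a small illustration, the consecutive calendars can be generated by pairing each contract code with the next one and wrapping from Z back to H. The names below are hypothetical.

```python
# Corn contract month codes in calendar order, as in the table above.
MONTH_CODES = {"H": "Mar", "K": "May", "N": "Jul", "U": "Sep", "Z": "Dec"}


def consecutive_calendars(codes):
    """Pair each code with the next one, wrapping from the last to the first."""
    codes = list(codes)
    return [codes[i] + codes[(i + 1) % len(codes)] for i in range(len(codes))]


print(consecutive_calendars(MONTH_CODES))  # ['HK', 'KN', 'NU', 'UZ', 'ZH']
```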

The longer dated calendars are examples of other contract pairs we have been interested in in the past and include

  • HN
  • NZ
  • ZN
  • UH

Consecutive Calendars

The consecutive calendars are those that make up the roll schedule of the systematic strategies we consider where corn is involved. A detailed understanding of the features that cause the values of these calendar spreads to change will help in determining how to roll the curves forward.

C HK

The plot below shows the feature importance of the corn HK calendar spread. The filled bars and error bars denote the mean and standard deviation of the feature importance associated with each of the features. The vertical red dashed line shows the threshold of feature importance where all the features are equally important. In the case of the HK spread we see that the most important features are

  • daysdiff
  • china
  • brazil
  • worldnochina

It is interesting to note that the HK spread places a greater importance on the Brazil stock-to-usage than on that of the United States. This might be because the H and K contracts reflect when the South American corn stocks come to market.

Another interesting observation is the high importance placed on the daysdiff parameter. This implies that there will be a definite seasonal behaviour present in the model.
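The equal-importance threshold used to pick out the main features can be sketched as follows, assuming the importances are normalised to sum to one, so that the threshold is one over the number of features (1/9 for the nine inputs used here). The function name and the importances are made up purely for illustration.

```python
def above_uniform_threshold(importances):
    """Keep features whose importance exceeds 1 / (number of features)."""
    threshold = 1.0 / len(importances)
    selected = {name: v for name, v in importances.items() if v > threshold}
    # Sort the surviving features from most to least important.
    return threshold, sorted(selected, key=selected.get, reverse=True)


# Made-up importances for four generic features (not model output).
toy = {"a": 0.5, "b": 0.3, "c": 0.15, "d": 0.05}
threshold, main_features = above_uniform_threshold(toy)
print(threshold, main_features)  # 0.25 ['a', 'b']
```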

The facet plot below shows scatter plots of the HK spread as a function of each of the four main features above. Superimposed on top of the scatterplots are linear best fit plots to highlight any linear trends that are present. For the three stock-to-usage related features we see clear negative slopes highlighting the fact that higher stock-to-usage numbers are generally associated with spreads that are more negative or contango.

The daysdiff facet above shows a negative slope. Note that as the front month gets closer to expiry the linear fit is lower compared to dates that are further away from expiry. This shows that the HK spread tends to drift into contango as it approaches expiry. This is a phenomenon that we encounter in many commodities that have to be stored in silos and warehouses.

C KN

The KN part of the curve places the greatest importance on

  • worldnochina
  • unitedstates
  • world

Here it is interesting to note that the importance of the Brazilian stock-to-usage number has fallen away, implying that Brazilian stocks are more tightly coupled with the H contract. We also see that the daysdiff parameter has become less predictive compared to the HK model, which signals that the KN spread has less of a drift toward contango as it approaches expiry.

The facet plot below shows the top three features from above together with daysdiff. Notice that the linear best fits to the three facets with stock-to-usage numbers fail to capture the nonlinear behaviour associated with decreasing stock levels.

C NU

The NU spread is totally dominated by the stock-to-usage numbers of the United States. A distant second and third are world and worldnochina.

The nonlinear dynamics with respect to the stock-to-usage numbers of the main driving feature are obvious in the facet plot below.

C UZ

Similar to the NU spread, we see that the UZ spread is also dominated by the stock-to-usage numbers of the United States. We also observe that the dependence on daysdiff is much greater than for the NU spread.

The facet plot below again highlights the nonlinear dynamics associated with the stock-to-usage numbers. It is interesting to note the decreasing linear behaviour as the spread goes into expiry. This shows that the spread exhibits a tendency to decrease, or become more contango, going into expiry.

C ZH

From our previous analysis we gather that Brazilian stocks might enter the picture once more because of the inclusion of the H contract in the ZH spread. This is exactly what we observe in the feature importance bar plot below. Here the main contributing features are

  • daysdiff
  • argentina
  • brazil
  • ukraine
  • russia
  • china

So far, this has been the spread with the greatest number of important input features. However, the high dependence on daysdiff points to a drift in time that dominates the overall evolution of the spread. It is interesting to note that the Argentine and Brazilian stocks are found to be meaningful predictors of the ZH spread. This is probably because the Argentine and Brazilian stocks come to market at the same time.

The scatterplots below show how the spread behaves for different values of the main features. For the features associated with stock-to-usage numbers we see clear and intuitive linear downtrends. This spread seems to not have such a strong nonlinear dependence on the input features as the previous ones. Notice the strong downtrend with respect to the daysdiff parameter showing that the curve tends to become more contango or more negative as the front month nears expiry.

Longer Dated Calendars

The longer dated calendars do not make up part of the roll schedule we employ when trading corn in a systematic manner. They are other calendar spreads that we have found interesting in the past.

HN

The presence of the H contract should point to the fact that the South American stocks might become valuable predictive features. This is supported by the feature importance bar plot below. The main contributing model features are

  • worldnochina
  • daysdiff
  • brazil
  • china
  • unitedstates

Notice that the nonlinear dynamics with respect to the stock-to-usage features have returned for this longer dated spread. However, there is still the drift toward contango that we have also seen in the shorter dated consecutive calendar spreads.

NZ

The NZ spread is totally dominated by United States and global stock-to-usage numbers. The other parameters hardly enter into the picture.

The facet plot below shows that there is not much of a linear drift towards contango in the NZ spread. However, the nonlinear dynamics associated with United States and global stock-to-usage numbers are clear.

ZN

It is interesting to note the presence of the Brazilian stock-to-usage numbers in the feature importance bar plot below. The H contract is not part of this spread, yet Brazilian stocks can still influence the ZN spread. Notice also the high importance associated with the daysdiff feature.

The ZN spread's behaviour with respect to stock-to-usage numbers is surprisingly linear compared to the majority of results shown throughout the rest of the write-up. The drift toward contango is also quite evident from around 200 days to expiry onward.

UH

In the UH spread the presence of the H contract does not bring the South American stock-to-usage numbers into play, implying that the majority of the moves can be attributed to the U contract. As in the case of the UZ spread, we see that the feature importance is again dominated by the United States stock-to-usage numbers.

The nonlinear contribution of the dominating feature, United States stock-to-usage, is clear in the facet plot below. Notice also the drift toward contango as the front contract moves toward expiry.

Remarks

The current version of the write-up only shows the feature importance of the consecutive and longer dated calendar spreads when modeling the spreads with respect to

  • Argentina
  • Brazil
  • China
  • Russia
  • Ukraine
  • United States
  • World
  • World without China

as well as the number of days the front month contract has to expiry. We have seen that different spreads have different main predictive features. In particular, most calendar spreads that involve the H contract (HK, ZH and HN) rely heavily on the Brazilian stock-to-usage numbers, with UH being the exception. Similarly, spreads involving the U contract are dominated by United States stock-to-usage. Throughout we have seen that the majority of the calendar spreads have a tendency to drift toward contango as the number of days to expiry decreases to zero. This negative carry is something to keep in mind when deciding how to roll our corn exposure forward along the curve.

In the next iteration of this write-up we will include the specific nonlinear and interaction effects present for each of the spreads.

Mauritz van den Worm
Portfolio Manager and Quantitative Researcher

My research interests include the use of artificial intelligence in managing commodity portfolios
