Natural Gas Price and Crude Oil Price History
Below is a long term chart showing monthly prices for Natural Gas and Crude Oil (on log scales of course):

There is an obvious long-term correlation between the two, but it is also apparent that at times they diverge from each other. I looked at the regression relationship between the two (not shown here), but a simpler and more common way to compare two price series is to look at the ratio of their prices. Not surprisingly, it is important to look at that ratio on a log scale as well. Here is the ratio of Natural Gas price to Crude Oil price for the same time period:
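As a minimal sketch of why the log scale matters for a ratio, the snippet below computes the log of the gas-to-oil ratio for a few made-up prices. The function name `log_ratio` and the price values are hypothetical and are not the data behind the chart.

```python
import math

def log_ratio(gas_prices, oil_prices):
    """Return log(gas / oil) for each aligned pair of prices."""
    if len(gas_prices) != len(oil_prices):
        raise ValueError("price series must be the same length")
    return [math.log(g / o) for g, o in zip(gas_prices, oil_prices)]

# Hypothetical prices for illustration only.
gas = [2.0, 4.0, 8.0]
oil = [40.0, 40.0, 40.0]
ratios = log_ratio(gas, oil)   # raw ratios are 0.05, 0.10, 0.20

# Each doubling of the ratio is the same step on the log scale.
step_1 = ratios[1] - ratios[0]
step_2 = ratios[2] - ratios[1]
```

On the raw scale the second move (0.10 to 0.20) looks twice as large as the first (0.05 to 0.10), while on the log scale both are the same size, which is what you want when comparing relative moves over a long history.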

There are a few interesting dynamics going on here. There is a long-term trend: from 1992 through 2002 or so, the ratio was trending upward (meaning natural gas was getting more expensive relative to crude oil), and since then the overall trend has been downward. Beyond that, you also see shorter-term dynamics where the ratio swings quickly over the course of just a few months.
With this kind of long-term shift in behavior, you need to be careful about trying to come up with any absolute ratio level that might signal a point where the ratio is due to shift the other way.
Fitting an ARIMA Model to the Price Ratio
Looking closer at this ratio data, I found some interesting time series dynamics that could be fit with an ARIMA model. This indicates some non-randomness on some scale, although overall the ratio may appear to be a random walk. While I don't have the time or space to go into all the details, I looked at the residual errors from that ARIMA model and found some interesting forecasting/trading implications.
Applying some basic control chart rules to the ARIMA model residual errors, I came up with a basic forecasting tool that indicates when Natural Gas prices (relative to crude specifically) are due for a move up or down. Here is a statistical process control chart on those residual errors:

(For the stat geeks: I used 2 sigma control limits to be a bit more liberal, but that should still give you roughly 95% confidence, assuming the residuals are approximately normal.) When this chart gets above the upper line noted by red squares, it indicates a bearish signal for natural gas, and when it gets below the lower line, it indicates a bullish signal.
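The 95% figure for 2 sigma limits can be checked with a quick calculation. This is only a sketch and it assumes the residuals are normally distributed, which the post does not verify; `normal_coverage` is a hypothetical helper name.

```python
import math

def normal_coverage(k):
    """Probability that a normal variable lies within k standard deviations of its mean."""
    return math.erf(k / math.sqrt(2))

two_sigma = normal_coverage(2.0)    # about 0.954
three_sigma = normal_coverage(3.0)  # about 0.997

# Share of points expected outside the limits by chance alone.
false_alarm_2s = 1.0 - two_sigma    # roughly 1 point in 22
false_alarm_3s = 1.0 - three_sigma  # roughly 1 point in 370
```

So 2 sigma limits trade a higher false alarm rate for more signals than the conventional 3 sigma limits would give.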
What is this really doing?
Boiling this down: this looks at the price ratio of natural gas to crude and fits the time series behavior of that ratio with an ARIMA model. Then it looks at the model fit errors to note when the ratio is moving up or down too fast relative to where it should be, based on the ARIMA model of how the ratio should randomly walk and wobble around.
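To make these steps concrete, here is a minimal sketch, not the actual model behind the charts. It assumes the ARIMA model is a plain AR(1) with an intercept, fit by least squares to the log ratio, and it uses simulated data as a stand-in for the real ratio. Sigma is estimated with the sample standard deviation of the residuals, which is a simplification (a classic individuals chart would typically use moving ranges). The names `fit_ar1` and `control_limits` are hypothetical.

```python
import random
import statistics

def fit_ar1(series):
    """Least-squares fit of x[t] = c + phi * x[t-1] + e[t]; returns (c, phi, residuals)."""
    prev, curr = series[:-1], series[1:]
    mean_prev = statistics.fmean(prev)
    mean_curr = statistics.fmean(curr)
    sxx = sum((p - mean_prev) ** 2 for p in prev)
    sxy = sum((p - mean_prev) * (q - mean_curr) for p, q in zip(prev, curr))
    phi = sxy / sxx
    c = mean_curr - phi * mean_prev
    residuals = [q - (c + phi * p) for p, q in zip(prev, curr)]
    return c, phi, residuals

def control_limits(residuals, n_sigma=2.0):
    """Return (lower, center, upper) limits at n_sigma standard deviations."""
    center = statistics.fmean(residuals)
    sigma = statistics.stdev(residuals)
    return center - n_sigma * sigma, center, center + n_sigma * sigma

# Simulated log ratio (assumption: AR(1) with phi = 0.9, for illustration only).
random.seed(0)
x = [-2.0]
for _ in range(500):
    x.append(-0.2 + 0.9 * x[-1] + random.gauss(0, 0.1))

c, phi, resid = fit_ar1(x)
lower, center, upper = control_limits(resid)

# Months whose fit error falls outside the limits are the candidate signals.
out_of_control = [i for i, e in enumerate(resid) if e < lower or e > upper]
```

A residual below `lower` means the ratio fell faster than the model expected (the bullish case for natural gas), and one above `upper` means it rose faster than expected (the bearish case).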
How would it perform if traded?
Below is a plot summarizing all the signals above. A buy is issued when the chart above has a point plotted below the lower control limit, and that long position is closed when the MA of the residual error plotted above rises back above the center line. Here is what it looks like:
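The entry and exit rules can be sketched as a small state machine. The long side follows the rule described above; the short side is assumed to be its mirror image (enter above the upper limit, exit when the MA falls back below the center line), which the post does not spell out. The 3-period window, the limits, and the residual values are all made up for illustration, and `moving_average` and `positions` are hypothetical names.

```python
def moving_average(values, window=3):
    """Trailing moving average; the first window - 1 entries are None."""
    out = [None] * (window - 1)
    for i in range(window - 1, len(values)):
        out.append(sum(values[i - window + 1:i + 1]) / window)
    return out

def positions(charted, lower, upper, center=0.0):
    """Position after each point is observed: +1 long, -1 short, 0 flat."""
    pos, state = [], 0
    for v in charted:
        if v is not None:
            if state == 0:
                if v < lower:
                    state = 1      # below lower limit: go long
                elif v > upper:
                    state = -1     # above upper limit: go short (assumed mirror rule)
            elif state == 1 and v > center:
                state = 0          # MA back above center line: close long
            elif state == -1 and v < center:
                state = 0          # MA back below center line: close short (assumed)
        pos.append(state)
    return pos

# Made-up residuals for illustration only.
resid = [0.0, -0.3, -0.6, -0.6, 0.1, 0.6, 0.5, 0.6, -0.4, -0.5]
ma = moving_average(resid, window=3)
pos = positions(ma, lower=-0.45, upper=0.45)
# pos == [0, 0, 0, 1, 1, 0, 0, -1, -1, 0]
```

Because a position is only closed once the MA crosses the center line, trades in this sketch last for at least one period, which fits the longer-term nature of monthly data.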

Pretty amazing and interesting results. There are some that seem obvious but many that do not, for example going short back in December of 2008 after Natural Gas had already dropped quite a bit.
Final Thoughts and Fine Print
So this looks pretty impressive. But it is a backtested system, which is always unrealistic to some degree if one were to use it going forward, although I think that is somewhat minimized because this is not an overfit or over-optimized system. It is a very basic statistical system using standard statistical models and common statistical rules. Note that the results do not include any slippage costs, etc. I really like this overall methodology and will definitely look into it more in the future.
Where is Natural Gas Price Headed?
Finally, note that at the end of the trading chart there is a long position issued in March of this year. That is still an open long position. Keep in mind this really is a ratio of Natural Gas to Crude Oil, so technically it could mean that Natural Gas goes up in the coming months, Crude Oil goes down, or both. Also keep in mind that this is monthly data, so the signal is longer term and means nothing for the next few days or even weeks. This may seem to disagree with my Natural Gas inventory blog post, where I said that, so far, it did not look like there would be a price spike this winter. But these are really two different things being measured and forecasted, and just because one does not forecast a price increase does not mean the other can't. In fact, I see these being used together.
Please see the disclaimer.
...
20 comments:
So, you didn't use some of the data to build the model and then test it against a hold out sample? Is that what you meant by:
"But it is a back tested system which is always unrealistic to some degree if one were to use this going forward..."?
Why has the ratio broken down? If you know anything about the fundamentals of NG, you would know that over the last 4 years, domestic production is up 58%. This ratio means nothing any more.
Dave,
Correct, what I am showing here is all backtested with no hold out data set. And for something like this it would be difficult to do that at all because of the fact that this is a long term system with minimal trades and even though there is data back to the early 90's there is not enough data really to split it into two data sets to do that. However, it is still very impressive even for a backtested result because of the fact there was no real optimization or over-fitting involved. Bottom line, I am still personally very encouraged by this but as with anything, you can't ever be certain what the future will hold. I am curious what the next few months will look like since this system says to be long natural gas right now. I may enter some paired trade of short crude and long natural gas to try to take advantage of that.
Anon,
As you may notice, my expertise is in statistical modeling, not really in the fundamentals of any particular sector. But in doing these analyses I often learn a lot about something like natural gas. So, the ratio has trended down over the last 6 years or so, which probably speaks to your comment about domestic production, but I don't think the ratio has "broken down"; it just changes over time.
What I am doing here is not just looking at historical extremes in the ratio, I am fitting the time dynamics behavior of the ratio over time with an ARIMA model and noting when that behavior is too high or too low on a relative basis compared to the ARIMA model and using that as a trading signal.
So I disagree that this ratio means nothing now, I think the ratio still is important, it is just that the nature of that importance evolves over time and the ARIMA model gives a statistical way of tracking that evolution and still generating sound signals that are valid on a 2-5 month timeframe I believe.
Sorry Caveman, I wasn't trying to slam you; I understand that many are not in the NG business. But I wanted to point out that the ground you're standing on has shifted. A few years ago we thought we would run out of NG and have to make up the difference via LNG; hence we built a number of regas terminals that are now sitting largely unused because of the massive surplus of gas being produced.
So while oil is a very international commodity, international movements of NG (LNG) only occur at the margins where domestic supply is inadequate. This is mainly Japan/Asia, then Europe, and if no one will buy it, it finally diverts to the US and we take it at bargain prices.
BUT, no one anticipated the huge production increase in the US, so there are no liquefaction plants operating in the US to send the excess back across the pond where they could actually use it.
People used to think that you could use a mmBtu equivalent energy to kind of get an idea of where NG should be relative to #2 FO, but that idea is less and less valid given the dramatic shift in NG production.
I don't know how else to put it; there has been a major technology shock introduced that I believe disturbs a longer-term statistical analysis.
BTW, everyone and their brother has been talking up this trade on the NG/crude ratio, long NG short crude. Do a search on Seeking Alpha. Or look at the growth of UNG shares over the last month (it's doubled...). That says to me there are way too many people on that side of this trade and, barring supply disruptions, the storage number is going to be.....high.
Looking at this again, what you are really doing is arb'ing the time-lag correction to the ratio. I would think this would work better on a much shorter time-scale. Arb'ing the lag correction isn't a bad idea; it works in 'other places...'
Anon, no worries, I didn't take offense to your comments. I just really enjoy these discussions and debates. You bring up some great points. And in fact, all the attention and volume on UNG has been exactly what has made me hesitate to get long UNG. I totally agree in not liking to be on the same side of everyone. I think what we may see happen will be more crude dropping than nat gas going up. But the short crude/long nat gas trade would still work in that case and the ratio would still bounce back up.
I had planned to look at these types of relationships on a shorter timeframe like weekly instead of monthly to start.
I'm looking at this right now on the daily, which shows some higher ratios in 2000 (>0.3 after smoothing w/ a 20MA). I'm very curious how you fit a model here; what are your regressors? And how did you transform the data to stationary (there is high autocorrelation, although interesting partial ac's)?
Sorry I'm not much of a stats background but I find this very interesting and am trying to replicate this in Stata.
BTW, re: UNG; you don't really want it even if you did want to be long NG; the monthly roll, given the steep contango in the market, seriously erodes your value each month. UNG publishes their trades from the previous day, and you can see that the roll doesn't work out for them well...I liked the MacroShares approach better.
Anon,
First of all, I did this on monthly data. I was curious about daily data and may look at that in the future but that will give you a different look and different dynamics than what I got.
The monthly ratio data was stationary enough, so I fit an AR(1) ARIMA model (one autoregressive term) on the log of the monthly ratio data. That produced a solid model with no autocorrelation in the residuals. Then I took the residuals from that ARIMA model and looked at those for my signals. A simple individuals control chart with 2 sigma limits actually worked pretty well, but a 3MA control chart with 2 sigma limits worked a bit better.
If you are working with daily data, first make sure you take a log transform on the ratio data. Then you probably need to take a first difference to get it stationary if it is still not stationary after the log transform. Then you would have to play around with an ARIMA model on the differenced series but I would guess that you should be able to get a model on it.
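A minimal sketch of that preprocessing, using a simulated daily ratio whose log follows a random walk (an assumption made purely for illustration, not a claim about the real data). The lag-1 autocorrelation is used here only as a crude check, not a formal stationarity test; `log_diff` and `lag1_autocorr` are hypothetical names.

```python
import math
import random

def log_diff(ratio):
    """First difference of the log ratio: log(r[t]) - log(r[t-1])."""
    logs = [math.log(r) for r in ratio]
    return [b - a for a, b in zip(logs, logs[1:])]

def lag1_autocorr(series):
    """Sample lag-1 autocorrelation, a crude check for non-stationarity."""
    n = len(series)
    mean = sum(series) / n
    dev = [x - mean for x in series]
    return sum(a * b for a, b in zip(dev, dev[1:])) / sum(d * d for d in dev)

# Simulated daily ratio (illustration only).
random.seed(42)
log_level = [math.log(0.1)]
for _ in range(1000):
    log_level.append(log_level[-1] + random.gauss(0, 0.02))
ratio = [math.exp(v) for v in log_level]

level_ac = lag1_autocorr([math.log(r) for r in ratio])  # near 1 for a random walk
diff_ac = lag1_autocorr(log_diff(ratio))                # near 0 after differencing
```

If the lag-1 autocorrelation of the log series sits very close to 1 and drops sharply after differencing, that suggests fitting the model on the differenced series.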
But sometimes I find that if you use weekly or monthly data you can get better models because you are picking up on different longer term dynamics and if you look at daily data the daily noise just makes it hard to fit much of anything but it is still worth trying with the daily data.
On UNG, I am aware of the problems, but I still personally believe that, in spite of all that, if natural gas prices really went up, UNG would go up too. In its history, it has done a pretty good job of tracking the overall price action in natural gas over the course of months, which is what I am interested in.
So ultimately, I am thinking along the lines of the short USO, long UNG play here more than anything, but I am not in a hurry either. I am fine not trading this at all if it just doesn't look right.
By the way, where are you getting your historical data for natural gas daily prices? Just curious.
I get my historicals from LIM via Trendwise. Basically it's a time-series repository for financials, settles, forwards, etc. It gets populated from the providers, in this case the combined Pit+Electronic session for NG, using the spot continuation contract. Same for CL on the NYMEX.
Anon, seems to me the excess production and it's different this time talk are as popular as the pairs trade you urge caution on. At the end of the day, they're both essential hydrocarbons, so a long-term ratio analysis should work.
"they're both essential hydrocarbons, so a long-term ratio analysis should work"
And here's why not: the marginal supply in crude moves to the best bid. But the marginal NG supply in the US is stuck here; it cannot leave the US; it can only go into storage. Previously the marginal supply had to come *in* to the US via LNG or Canadian/Mexican imports, much like crude; this is no longer the case with NG.
94 injection number today...if that's not bearish I don't know what would convince you. But hey, this is a stats/technical discussion, not fundamentals, so I digress.
Anon 2 or 3... not sure :)
Continued great discussion, and please do continue to "digress"! Although I am focused on statistical and data analysis because that is my expertise, that does not mean I am not interested in fundamentals. I just like to make clear where my expertise is and isn't. I find the most powerful analysis is when you can tie the statistical result back to some fundamental understanding of the process, so I especially appreciate the fundamental insight. Please keep it coming!
I have the injection data as well. I just did not initially look at it that closely, but I may look a bit closer, although I think the inventory data which I looked at in my post last week will encompass the injection data as well.
And one final point, ultimately, this ratio analysis that was the subject of this post says that the ratio is due to move up. But keep in mind, that does not have to mean that nat gas will move up. It could end up being that crude moves down and I kind of think we will see both and probably see crude moving down more dramatically than nat gas moving up.
I'll call myself "Anonoman" as I had all the posts except the one at June 25, 2009 11:29 AM. :-)
I completely agree that your correction to the "ratio" is going to come on the crude more likely than on the NG side. The injection data is your bible; it's really the only point in the market where you get some actual *tie* to the physical reality of the system. Or if you have proprietary pipe scrape models....
But ultimately everyone weekly is trying to guess the injection number, and more importantly, the 'end-of-Oct' storage number. Where you put that end-of-Oct number at should determine your bias; this year especially.
Hurricane risk is slightly below normal, mainly because dust coming off the Sahara is keeping the Cape Verde waters from really being able to do anything, along with some moisture along the W. African coast. Also, off-shore is less important than it's been in the past, as lots of production is on-land in shale.
Finally, vol is historically high around the tropics seasons (and the rest of the tenors for that matter), so the market is already priced extremely high in terms of vol. I take that as vol is over-bought.
There is a fundamental problem right now, which is that there is a finite amount of storage in North America. Consider AECO and watch its levels as it will probably be the first indicator.
I got an ARIMA fit yesterday that was surprisingly good, then did another with a 20% hold-out. Haven't run a new backcast against the hold-outs, but it looks promising. I have to say I am surprised. Still looking at your residual signals...and looking at my own. The window size of the MA has a big effect on the magnitude of the residuals (I suppose you'd expect that), so it's a little challenging to figure out the correct P and Q values for the ARIMA model.
The AC's/PAC's didn't give a lot of clues as to where those values might be; I had significants at lags 5, 36, 38...
Anonoman,
Sounds good. That end of October inventory number is key it seems as I saw in my inventory analysis. I was thinking about trying to fit a seasonal ARIMA model to the monthly inventory data itself to forecast October inventory.
Your ARIMA model sounds interesting. The window of the MA definitely makes a big difference although with the monthly data I was working with there wasn't much choice. But with daily data you may want to try an EWMA also. But also keep in mind that doing a model on daily data should point you to shorter term dynamics. If I am interested in longer term dynamics (like changes over months or years) I often use weekly or monthly data because it focuses on that long term action and ignores the shorter term movement and noise. So if you are fitting a model on daily data it will be fitting very different dynamics on a shorter term basis.
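For reference, an EWMA replaces the fixed window with a smoothing weight. The sketch below assumes the standard recursion and the standard asymptotic EWMA control limit, with the weight `alpha`, the 2 sigma multiplier, and the input values chosen only for illustration; `ewma` and `ewma_limit` are hypothetical names.

```python
import math

def ewma(values, alpha=0.2):
    """Exponentially weighted moving average: s[t] = alpha * x[t] + (1 - alpha) * s[t-1]."""
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    smoothed = []
    for x in values:
        if not smoothed:
            smoothed.append(x)  # start the recursion at the first observation
        else:
            smoothed.append(alpha * x + (1 - alpha) * smoothed[-1])
    return smoothed

def ewma_limit(sigma, alpha, n_sigma=2.0):
    """Asymptotic control limit half-width for an EWMA of independent residuals."""
    return n_sigma * sigma * math.sqrt(alpha / (2 - alpha))

# Made-up residuals for illustration only.
residuals = [0.0, 1.0, 1.0, 1.0]
smoothed = ewma(residuals, alpha=0.5)        # [0.0, 0.5, 0.75, 0.875]
half_width = ewma_limit(sigma=1.0, alpha=0.5)
```

A smaller `alpha` behaves like a longer MA window, so the same tradeoff between smoothness and responsiveness applies, and the limits tighten accordingly.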
If you want, you could email me your daily data set and I could play around with the daily data some also.
Anonoman,
Oh yeah, on the hurricane risk, I agree. See my post on the hurricane outlook for this season. Also, even with an active hurricane season, it takes a lot to make a sustained impact on price. It looks like the Katrina year was the only year that had a sustained impact on NG price.
Well, the double shot of Gustav/Ike last year was supportive of prices as well. Gustav did some damage to the LOOP which is bullish crude while Ike took out a ton of power lines that made it a lot harder to get production started again. The timing of the two storms meant that many rigs were just restarted or about to after Gustav before they had to evac again. That combo killed about 200bcf of production.
With Katrina, it wasn't that storm in particular that hurt production, it was the Katrina/Rita/Wilma combo that kept crews from getting back to the rigs.
The gulf infrastructure is surprisingly robust and last year Cuba took a lot of energy out of the gulf-track entry Cape Verde-type storms. The loop flow in the gulf isn't that warm yet either, so that takes a lot of potential energy out of the system.
It's possible to take a storm track, the surge, wind and wave actions expected, and draw it across the MMS blocks to get an idea of who needs to evac, and what effect that will have on production.
Did you read the EIC report on hurricane impacts to NG production? It was a pretty good analysis. MMS? What is that? I assume something to do with the NG production rigs, etc.
I have a lot of interest in and experience playing around with hurricane forecasting data, so you will see more blog posts in the future on that. Sounds like you have a good handle on that as well. I think we will have a light season, although I am wondering if it won't be as light as the latest seasonal forecasts from Colorado State.
MMS (Minerals Management Service) covers the mineral rights blocks in the Gulf. MMS reports who has each block, and you can then tie the block owner to production numbers to see how much production is in the block. The blocks give you the geography of the infrastructure. Then you supplement with rig types (jack-up, etc.) and their corresponding ratings for wave action, wind, etc. And throw in the shut-in and start-up times along w/ evac leads and lags, and you get an idea of what's going to happen.
Although everyone is forecasting a below-normal season, the waters in the inner gulf are getting pretty warm where the entry track is getting over 30 and the coast has some spots that are warmer. It worries me; so does the edge of Verde temps.
So finished up an ARIMA on the log ratio; second-differences went stationary with a rather strange significant AC and PC 30+ lags out...weird. Y-Residuals were quite low, then went to the 2-sigma signal and backtested. It 'halfway' worked; 18 success, 17 fails, total of 12%+ net. Guess I need to refine the signals a bit more; hopefully it's not just a mean reversion. Dataset starts 6/24/99
With a long data set and a large AC and PC chart, you will get random and spurious significant lags that don't necessarily mean anything. If the significance line for the AC correlations is at 95% confidence, then you should on average get 1 false "significant" lag per 20 lags plotted. Now, if it is showing a bunch of significant lags, then it may actually mean something is going on at a longer-term scale that you have yet to fit from a model standpoint. Daily data can sometimes be more difficult to fit than longer-term data. And with daily data, you won't be able to get a model that is analogous to my monthly model, since it would have to look back 30+ lags to get out to a month, if you know what I mean.
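The "1 in 20" point can be illustrated with pure noise. The sketch below computes the sample autocorrelation of simulated white noise and counts how many lags cross the usual approximate 95% band of 1.96 / sqrt(n). The series length, number of lags, and seed are arbitrary choices, and `acf` is a hypothetical helper.

```python
import math
import random

def acf(series, max_lag):
    """Sample autocorrelations for lags 1..max_lag."""
    n = len(series)
    mean = sum(series) / n
    dev = [x - mean for x in series]
    denom = sum(d * d for d in dev)
    return [sum(dev[t] * dev[t - k] for t in range(k, n)) / denom
            for k in range(1, max_lag + 1)]

# White noise has no real autocorrelation at any lag.
random.seed(1)
noise = [random.gauss(0, 1) for _ in range(2000)]

max_lag = 40
bound = 1.96 / math.sqrt(len(noise))   # approximate 95% significance band
correlations = acf(noise, max_lag)
flagged = [k for k, r in enumerate(correlations, start=1) if abs(r) > bound]

# With a 95% band, about 5% of lags are expected to be flagged by chance.
expected_false_alarms = 0.05 * max_lag
```

Any lags that end up in `flagged` here are spurious by construction, so a stray significant lag at 36 or 38 on real data is not by itself evidence of structure.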