Estimating Spot Price and Smooth Forward Curve in Electricity

UNIVERSITY OF CALGARY
Estimating Spot Price and Smooth Forward Curve in Electricity Markets
with Bayesian Penalized Spline
by
Azamed Yehuala Gezahagne
A DISSERTATION
SUBMITTED TO THE FACULTY OF GRADUATE STUDIES
IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE
DEGREE OF DOCTOR OF PHILOSOPHY
DEPARTMENT OF MATHEMATICS AND STATISTICS
CALGARY, ALBERTA
April, 2014
c Azamed Yehuala Gezahagne 2014
Abstract
The first part of this thesis presents a Bayesian penalized spline approach to constructing smooth forward curves in electricity markets. Since electricity should be delivered as
a continuous flow, power contracts have settlement periods rather than a fixed delivery
time. In addition, electricity forward curves have strong seasonal shape. Our approach provides a method for estimating a continuous forward price curve from market forward prices
quoted over a period. The approach is illustrated using observed market data from the
Mid-Columbia (Mid-C) and California Oregon Border (COB) pricing hubs.
Since Mid-C is a liquid market where forward contracts are quoted every day and COB
is an illiquid hub, a two step estimation procedure is developed from Bayesian perspective.
First, the Mid-C smooth curve is constructed using Bayesian penalized spline. Next, the
COB smooth curve is estimated by adding a spread to the constructed Mid-C smooth curve
and incorporating the positive spread between the two hubs as an informative prior.
In the second part, we present a mean reverting model for the electricity spot price and
we employ a Bayesian penalized spline approach to model the deterministic seasonal function
exhibited in the monthly averages. Based on the historical spot price from Alberta power
market, we calibrated the model parameters, also by implementing Bayesian estimation
techniques.
i
Acknowledgements
First, I would like to thank my supervisor, Dr. Antony Frank Ware for all his encouragement,
support, valuable comments during the preparation of this thesis. Your extreme patience
when things were very slow set an example of a great mentor and a role model. I also thank
my committee members for their time and willingness to work with me; your discussion,
ideas, and feedback have been absolutely invaluable.
Second, I am thankful to the Department of Mathematics & Statistics. Special thanks
go to the graduate program administrators Joanne Mellard, Carissa Matthews and Yanmei
Fei for their administrative supports throughout the entire program. I am also very grateful
to Ryan Trelford, Niroshan Withanage both PhD students at U of C and Mohammed Abdi
for their support and valuable comments.
Finally, I sincerely thank my wife Lidiya Teklu and my daughters Sarah, Dinah and
Liya who supported me on many difficult times along the way and being an inspiration to
complete this thesis.
ii
Dedication
To: Lidiya, Sarah, Dinah and Liya
whose never ending support, understanding, and patience made this thesis possible.
iii
Table of Contents
Abstract . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . . . .
Dedication . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Table of Contents . . . . . . . . . . . . . . . . . . . . . . . . . . .
List of Tables . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
List of Figures . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
List of Symbols . . . . . . . . . . . . . . . . . . . . . . . . . . . .
1
Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . .
1.1 Background of Thesis . . . . . . . . . . . . . . . . . . . . . .
1.2 Review of Literature . . . . . . . . . . . . . . . . . . . . . .
1.2.1 Spot Price Modelling . . . . . . . . . . . . . . . . . .
1.2.2 Forward Price Modelling . . . . . . . . . . . . . . . .
1.3 Market Description . . . . . . . . . . . . . . . . . . . . . . .
1.3.1 Alberta Electricity Market . . . . . . . . . . . . . . .
1.3.2 Mid-Columbia (Mid-C) and California Oregon Border
1.3.3 Inter Continental Exchange (ICE) . . . . . . . . . . .
1.4 Overview of Thesis . . . . . . . . . . . . . . . . . . . . . . .
2
Bayesian Estimation of Spot Price Model . . . . . . . . . . .
2.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . .
2.2 Bayesian Estimation Technique . . . . . . . . . . . . . . . .
2.3 Specification of Prior . . . . . . . . . . . . . . . . . . . . . .
2.3.1 Noninformative Priors . . . . . . . . . . . . . . . . .
2.3.2 Conjugate Priors . . . . . . . . . . . . . . . . . . . .
2.4 Markov Chain Monte Carlo (MCMC) . . . . . . . . . . . . .
2.4.1 Gibbs Sampling Technique . . . . . . . . . . . . . . .
2.4.2 Metropolis Hastings Algorithm . . . . . . . . . . . .
2.5 O-U Spot Price Model . . . . . . . . . . . . . . . . . . . . .
2.5.1 Maximum Likelihood Estimation (MLE) . . . . . . .
2.5.2 MLE of the O-U Model . . . . . . . . . . . . . . . . .
2.5.3 Bayesian Estimation of the O-U Process . . . . . . .
2.6 Statistical Inferences . . . . . . . . . . . . . . . . . . . . . .
2.6.1 Numerical Example . . . . . . . . . . . . . . . . . . .
2.7 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . .
3
Bayesian Penalized Spline Smoothing . . . . . . . . . . . . .
3.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . .
3.2 Parametric and Nonparametric Regression . . . . . . . . . .
3.3 Smoothing and Regression Penalized Spline . . . . . . . . .
3.3.1 Smoothing Spline . . . . . . . . . . . . . . . . . . . .
3.3.2 Bayesian Smoothing Spline . . . . . . . . . . . . . . .
3.3.3 Regression Spline . . . . . . . . . . . . . . . . . . . .
3.3.4 Penalized Spline . . . . . . . . . . . . . . . . . . . . .
3.3.5 Bias-Variance Trade-off . . . . . . . . . . . . . . . . .
iv
. . . . . . . . .
. . . . . . . . .
. . . . . . . . .
. . . . . . . . .
. . . . . . . . .
. . . . . . . . .
. . . . . . . . .
. . . . . . . . .
. . . . . . . . .
. . . . . . . . .
. . . . . . . . .
. . . . . . . . .
. . . . . . . . .
. . . . . . . . .
(COB) Markets
. . . . . . . . .
. . . . . . . . .
. . . . . . . . .
. . . . . . . . .
. . . . . . . . .
. . . . . . . . .
. . . . . . . . .
. . . . . . . . .
. . . . . . . . .
. . . . . . . . .
. . . . . . . . .
. . . . . . . . .
. . . . . . . . .
. . . . . . . . .
. . . . . . . . .
. . . . . . . . .
. . . . . . . . .
. . . . . . . . .
. . . . . . . . .
. . . . . . . . .
. . . . . . . . .
. . . . . . . . .
. . . . . . . . .
. . . . . . . . .
. . . . . . . . .
. . . . . . . . .
. . . . . . . . .
i
ii
iii
iv
vi
vii
ix
1
1
4
5
8
10
10
11
12
13
17
17
18
19
19
21
22
23
23
26
26
28
29
32
32
34
36
36
38
39
39
40
41
42
43
3.3.6 Smoothing Parameter Selection . . . . . . . . . . . . . . . . . . . . . 44
3.4 Illustrative Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
3.5 Bayesian Penalized Spline Smoothing . . . . . . . . . . . . . . . . . . . . . 47
4
Extracting Smooth Forward Curve From Average Based Contacts with a
Bayesian Penalized Spline . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51
4.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51
4.2 Forward Price Modelling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53
4.3 Average Based Power Contracts . . . . . . . . . . . . . . . . . . . . . . . . . 55
4.4 Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57
4.5 Bayesian Penalized Smoothing . . . . . . . . . . . . . . . . . . . . . . . . . . 59
4.5.1 MCMC Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60
4.5.2 Mid-Columbia (Mid-C) Forward Prices Data . . . . . . . . . . . . . . 61
4.6 Numerical Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62
4.7 Long Term Forward Curve using Bayesian Penalized Spline . . . . . . . . . . 63
4.8 Estimating COB Smooth Forward Curve . . . . . . . . . . . . . . . . . . . . 65
4.9 Numerical Result . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67
4.9.1 COB Long Term Smooth Curve . . . . . . . . . . . . . . . . . . . . . 69
4.9.2 Statistical Inference . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70
4.10 Evolution of Mid-C and COB Forward Curves over Trading Days . . . . . . 71
4.11 Financial Implications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73
4.11.1 Financial reporting preparation . . . . . . . . . . . . . . . . . . . . . 74
4.11.2 Asset valuation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74
4.11.3 Risk management and reporting. . . . . . . . . . . . . . . . . . . . . 74
4.12 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74
5
Estimating Electricity Spot Price with Bayesian Penalized Spline . . . . . . . 76
5.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76
5.2 Historical Spot Prices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78
5.2.1 Non-Normality of Electricity Spot Prices . . . . . . . . . . . . . . . . 79
5.2.2 Seasonality of Electricity Spot Prices . . . . . . . . . . . . . . . . . . 81
5.2.3 Spikes of Electricity Spot Prices . . . . . . . . . . . . . . . . . . . . . 82
5.3 Spot Price Modelling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 83
5.4 Spline Modelling of the Seasonal Function . . . . . . . . . . . . . . . . . . . 83
5.5 Extracting Smooth Seasonal Function . . . . . . . . . . . . . . . . . . . . . . 85
5.6 Bayesian Parameter Estimation of the Spot Price model . . . . . . . . . . . 87
5.6.1 Calibration of Parameters . . . . . . . . . . . . . . . . . . . . . . . . 88
5.6.2 Numerical results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89
5.6.3 Constructing the Daily Spot Prices . . . . . . . . . . . . . . . . . . . 91
5.7 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 93
6
Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 94
6.1 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 94
6.2 Future Research . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 99
Bibliography . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101
v
List of Tables
2.1
Point estimates of the O-U interest rate model . . . . . . . . . . . . . . . . .
34
3.1
MSE of each fit relative to the MSE of penalized spline . . . . . . . . . . . .
46
4.1
Bayesian estimates of the spread term and its 95% Bayes posterior interval .
70
5.1
5.2
Monthly average of spot prices from January 2011- December 2012 . . . . . 86
Bayesian estimates and their standard error of O-U spot price model parameters 91
vi
List of Figures and Illustrations
1.1
1.2
1.3
2.1
2.2
3.1
4.1
4.2
4.3
4.4
4.5
4.6
4.7
4.8
5.1
5.2
5.3
5.4
Alberta hourly spot price in 2011 and 2012 . . . . . . . .
The North West Electric Power Markets (www.ferc.gov)
Columbia River & Bonneville dam (www.wikipedia.org).
Corps of Engineers, 2003. . . . . . . . . . . . . . . . . . .
. . . . . . . . . . .
. . . . . . . . . . .
Source: U.S. Army
. . . . . . . . . . .
5
11
11
Transition density of the O-U process starting at x0 = 0 with µ = 0, θ =
1, σ 2 = 2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
O-U sample path with X0 = 0.10, θ = 0.18, µ = 0.08, and σ = 0.02 . . . . . .
27
33
Scatter plot, linear fit, polynomial of order 5, B spline of order 6 and penalized
spline . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
47
Mid-C estimated smooth forward curve (in red) and actual forward prices
quoted on December 01, 2009 (in green) for delivery between January 2010 December 2010 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Mid-C observed forward prices on December 01, 2009 delivery between 20102012 (in green), Mid-C estimated smooth forward curve (in blue) and calculated monthly averages from the estimated smooth curve (in red). . . . . . .
Smooth forward price curve by Bayesian penalized splines on Mid-C and COB
forward contract prices on December 01, 2009. The Mid-C smooth curve (in
dark), COB smooth curve estimated by adding a spread to Mid-C(in blue),
COB observed forward prices (in green), monthly averages of COB forward
prices constructed from the smooth curve (in red) . . . . . . . . . . . . . . .
COB observed forward prices on December 01, 2009 delivery between 20102012 (in green), COB estimated smooth forward curve (in blue) and calculated
monthly averages from the estimated smooth curve (in red). . . . . . . . . .
Histogram of after burn in samples from the posterior distribution of the
spread γ0 and its 95% Bayesian posterior intervals . . . . . . . . . . . . . . .
Fitted surface for Mid-C forward prices traded between December 01- December 31, 2009 for delivery on January 2010-December 2010 . . . . . . . . . . .
Forward curve change for Mid-C forward prices traded between December 01December 31, 2009 for delivery on January 01, 2010 . . . . . . . . . . . . . .
The fitted forward price curves for COB (upper sheet in red) and Mid-C (lower
sheet in blue) over the 22 trading days between December 01- December 31,
2009 for delivery on January 2010-December 2010. A constant spread is added
to obtain the COB smooth forward price curves from Mid-C curves . . . . .
Alberta historical spot price from January 01, 2011 - December 31, 2012 . .
Normality test of Alberta historical hourly spot price returns from January
01, 2011 - December 31, 2012 . . . . . . . . . . . . . . . . . . . . . . . . . .
Autocorrelation test of Alberta spot price returns from January 01, 2011 December 31, 2012 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Spikes in Alberta spot market from January 01, 2011 - December 31, 2012 .
vii
63
65
68
69
70
71
72
73
79
80
81
82
5.5
5.6
5.7
5.8
5.9
Smooth seasonal function and Alberta monthly average spot prices
Normality test on Log de-seasonalized daily spot prices . . . . . . .
Normality test on transformed log de-seasonalized daily spot prices
Actual (in blue) and constructed (in red) daily spot prices . . . . .
Historical and constructed spot prices and their monthly averages .
viii
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
86
89
90
92
93
List of Symbols and Abbreviations
Symbol
Definition
AESO
CDF
COB
CV
GCV
IG
iid
ICE
LSQ
MCMC
Mid-C
MLE
MSE
MW
MWh
NRMSE
O-U
QQ
QML
RMSE
U of C
SDE
SE
US
UK
Alberta Electric System Operator
Cumulative Density Function
California Oregon Border
Cross Validation
Generalized Cross Validation
Inverse Gamma
independent identically distributed
Inter Continental Exchange
Least Square
Monte Carlo Markov Chain
Mid-Columbia
Maximum Likelihood Estimation
Mean Squared Error
Megawatt
Megawatt Hour
Normalized Root Mean Squared Error
Ornstein-Uhlenbeck
Quantile Quantile
Quasi Maximum Likelihood
Root Mean Squared Error
University of Calgary
Stochastic Differential Equation
Standard Error
United States
United Kingdom
ix
Chapter 1
Introduction
1.1 Background of Thesis
This thesis is mostly motivated by three literatures namely,
• Bayesian estimation of financial models by Gray (2002).
• Estimating the interest rate term structure of Treasury and corporate debt
with Bayesian penalized spline by Li and Yu (2005).
• Extracting and applying smooth forward curves from average-based commodity contracts with seasonal variation by Benth, Koekebakker and Ollmar (2007).
The first paper outlines a Bayesian methodology for estimating the parameters of a
continuous time financial models such as Black-Scholes option pricing model and Vasicek’s
model of the short-term interest rate. The posterior distribution of the parameters are
estimated using Bayesian paradigm and Markov Chain Monte Carlo (MCMC) technique is
implemented to estimate the model parameters. Though these models are relatively simple
models where numerical computation is not necessary and analytical solution is possible, the
paper shows the ease with Bayesian numerical computation leads to a wide spread application
in finance.
The second paper presents a Bayesian penalized spline smoothing technique to estimate
the interest rate term structure of Treasury and corporate debt. Jarrow, Ruppert and Yu
(2004) were the first to adopt a nonparametric penalized spline model by minimizing the
sum of the least square criterion to estimate the Treasury bond term structure. Due to the
small number of data available on the market for the individual corporate debt, they add a
1
linear credit spread to the Treasury and they use a semi-parametric approach to estimate
the term structure of the individual corporate debt.
Li and Yu ( 2005) however, provide a Bayesian approach to estimate the interest rate
term structure of Treasury and corporate debt by modelling the forward rate f as a spline
f = δ T B,
where δ is a vector consisting of smoothing spline coefficients and B is a matrix of spline
basis functions such as power basis. Bayesian penalized spline is applied to estimate the
term structure of the Treasury and corporate debt by minimizing the lack of fit and the
roughness penalty measure
n
1X
[Pi − µi (δ)]2 + αρ(δ),
n i=1
where Pi is the market bond prices which are observed, µi is model price from the penalized
spline, ρ(δ) is a roughness penalty measure, and α is the smoothing parameter which controls
the compromise between the lack of fit and the roughness measure. The roughness measure
ρ(δ) can be taken as a prior information about the spline coefficient and the lack of fit term
1
n
Pn
i=1 [Pi
− µi (δ)]2 is used to estimate the spline coefficient with the observed data.
The paper suggests that for Bayesian approach the small sample size available on the
market for the individual corporate debt is not an issue. Moreover, the smoothing parameter
α used for smoothing needs not to be selected as in the frequentist case. It will rather be
obtained as a bi product from the Bayesian approach and the MCMC algorithm employed by
taking the ratio of the variances of the posteriors. The paper also shows the term structure
fitted curves for both bonds by applying the methodology 21 times independently on the
market prices of 117 US Treasury and 5 AT&T bonds from April 1994 to December 31, 1995
in the University of Houston fixed income database.
The third paper proposes a method of extracting a smooth electricity forward curve from
the observed market forward prices with settlement over a period. In case of power markets,
such a contract is based on the average of the spot price over the period specified on the
2
contract. The smooth curve constructed is assumed to be the prices of a forward contract
with settlement at a fixed time with in the period if such a contract would have been traded.
Due to the seasonal dependency of power prices, the author model the forward curve f (t)
as a sum of two continuous functions s(t) and (t) the former for the seasonality of the
forward curve and the latter for the adjustment function between the forward curve and the
its seasonality function.
For any fixed time t in the period [t0 , t1 ], the forward curve f (t) is written as
f (t) = s(t) + (t),
t ∈ [t0 , t1 ].
The paper presented three different parametric forms for the seasonal term as
1. s(t) = 0,
2. s(t) as a trigonometric function as in Lucia and Schwartz (2002),
3. A spot forecast from a bottom-up model from Agder Energi AS.
Moreover, the adjustment function (t) is modelled as a smoothing spline of order 4 using
the maximum smoothness criteria by minimizing
Z t1
[00 (t)]2 dt,
t0
over the subset of twice differentiable functions as in Adams and van Deventer (1994).
Using the available 21 Nord pool forward contact prices available on May 4, 2005 and
using a spline consisting of 32 polynomials, a smooth forward curve is constructed. It is
shown that, the choice of the seasonal function has little effect especially on the front end of
the curve however at the back end of the curve, the seasonal term is crucial in determining the
shape of the curve because there is a limited number of market prices available to determine
the shape as the weekly or monthly data goes out only for one year.
3
In this thesis, we follow a mixture of the above papers in the sense we employ Bayesian
penalized spline smoothing to estimate electricity forward curves. However our approach is
different from the above literatures for the following reasons.
First, we consider electricity contracts where such contracts are average based contracts
where settlement occurs over a delivery period instead of fixed time delivery as in interest
rate markets. Market data are also not available for fixed time to delivery of electricity and
such products are not traded in the market.
Second, unlike the second paper, where the forward curve is a mixture of a parametric
seasonality function and a polynomial spline adjustment function, we adopt a full nonparametric Bayesian spline smoothing by which we let the market data to speak for themselves
in terms of determining the seasonality shape. In addition, the Bayesian approach deals
naturally with a small data set.
1.2 Review of Literature
Over the last few years there has been much interest in modelling prices of electricity. These
models are mostly stochastic in nature and they fall short in terms of explaining the complexity associated with power price dynamics such as seasonality, spikes, mean reversion and
high volatility.
Figure 1.1 shows Alberta hourly spot price between January 01, 2011 and December 31,
2012. We can see that the prices show spikes up to $1000 per MWh and revert back to a
long run equilibrium price level. They also show a seasonal trend of low price period in the
second quarter and high prices in the third quarter.
4
H our ly Sp ot Pr ic e ( $/MW h)
1500
1000
500
0
0
1000
2000
3000
4000
5000
6000
7000
8000
9000
6000
7000
8000
9000
H ourly Sp ot Pr ic e ( $/MW h)
H our s in 2011
1500
1000
500
0
0
1000
2000
3000
4000
5000
H our s in 2012
Figure 1.1: Alberta hourly spot price in 2011 and 2012
Generally, electricity price modelling approaches can be grouped into two classes as spot
price models and forward price models.
1.2.1 Spot Price Modelling
Most of the spot price modelling approach involves at least two factors: one for capturing
the short-term hourly or daily price dynamics modelled by mean reversion and very high
volatility and the second factor representing the long term price behaviour in the forward
market. The hourly spot price dynamics can thus be captured by fitting the models to
historical spot prices.
Schwartz (2000) was the first to model the spot price accounting for the mean reversion.
Lucia and Schwartz (2000) modelled the spot price as an exponential function of the sum of
a mean reverting OU process Xt and a deterministic seasonal function f (t). The spot price
5
St can be written as
dXt = −αXt dt + σ dWt
St = exp(f (t) + Xt ),
where W is a standard Brownian motion and σ is a volatility and α is the speed of mean
reversion. But their model didn’t incorporate the spike shown in electricity prices.
To incorporate jumps, Clewlow and Strickland (2000) suggested a jump component using
a Poisson process Nt with intensity λ and a jump size of J. The model can be written as
dXt = −αXt dt + σ dWt + Jt dNt
St = exp(f (t) + Xt ).
Deng (2002) presents analytical results which are based on transform analysis described in
Duffie (2000). These models to have an upward jump followed by a downward jump, the
mean reverting speed parameter α must be extremely high.
Benth (2005) also suggested that spot prices can be modelled as a linear combination of
independent pure mean reverting jump processes
St =
n
X
(i)
wi Yt ,
(i)
dYt
(i)
(i)
= −αYt dt + σi dLt , i = 1 . . . , n,
i=1
where wi are positive weights and Lt are independent increasing c´adlag pure jump processes.
Positive prices are guaranteed in this model by allowing only positive jumps.
Another approach by Barlow (2002) modelled the demand supply equilibrium, where the
underlying electricity demand is an O-U process Xt and for some supply function u the spot
price is determined from the demand supply equilibrium equation
u(St ) = Xt ,
where u : R+ → [0, k].
Cartea and Figueroa (2005) presented a mean reverting jump diffusion model where the
deterministic seasonal function is determined by fitting the monthly historical averages with
6
a Fourier series of order 5. The paper calibrated the model parameters using the UK and
Wales historical electricity spot price data and they pointed out the difficulty of calibrating
the model to historical spot prices. The paper also noted that maximum likelihood estimation
approach employed for parameter calibration gives incorrect result.
Although the above models provide good and reliable descriptions of the dynamics of
electricity prices and are relatively simple to modify and extend, one of the shortcoming of
these models is that it is challenging and difficult to estimate parameters with a very few
market data. In addition, these models follow a parametric approach in terms of modelling
the seasonal function. Our approach, instead, estimates all the model parameters involved
based on Bayesian estimation using the historical data. A new modelling approach using
Bayesian penalized splines will be used for estimating the seasonal function.
Another approach by Liebl (2013) proposed a functional factor model that relates the
hourly electricity spot prices yth of day t with a smooth price - demand function Xt using
yth = Xt (uth ) + th ,
where uth denotes electricity demand at hour h of day t and th is assumed to be a white
noise process. The price - demand functions Xt (u) is modelled by a functional factor model
defined by
Xt (u) =
K
X
βtk fk (u),
k=1
where the factors or basis functions fk are time constant and βtk are factor scores allowed to
be nonstationary time series. First the author estimated the daily price - demand functions
ˆt by cubic spline smoothing for all days. The second step is to determine a K < ∞
X
dimensional common functional basis system f1 , . . . , fK for the estimated price - demand
ˆ1, . . . , X
ˆT .
functions X
7
1.2.2 Forward Price Modelling
Backing out forward prices from spot price based models will in general not be consistent
with market observed forward prices. As a solution to this, a lot of research has focused
on modelling the evolution of the whole forward curve. Market models for forward prices
are dated back to Black (1976) where a single forward contract is considered. Clewlow and
Strickland (1999), Koekabaker and Ollmar (2001) and Carmona and Durrleman (2003) apply
the idea of term structure modelling from Heath, Jarrow and Merton (1992) to model the
dynamics of the whole forward curve. The stochastic differential equation for the dynamics
of the forward curve with maturity T in these models is given by
n
X
dF (t, T )
= µ(t, T )dt +
σk (t, T )dWk (t),
F (t, T )
k=1
t ≤ T,
where W = (W1 , . . . , Wn ) is n-dimensional standard Brownian motion, and the drift term µ
and the n volatilities σk are deterministic functions of the current date t and maturity T .
Such models have the advantage that the market can be considered as being complete
and standard risk-neutral pricing may be used. One of the drawbacks of these models is that
forward prices do not reveal information about the price behaviour on a fine granularity of
time scale such as hourly or daily. In addition, unlike interest rate term structure models
with a fixed maturity T , electricity forward contracts have delivery periods, and market data
for fixed time delivery do not exist. As a result, these models are not directly suitable for
modelling electricity forward curves.
Fleten and Lemming (2005) presented a different approach of constructing approximated
high resolution electricity forward price curves. Average based contracts with delivery over
a period [T1 , T2 ] were considered, as in Bjerksund (2000). Let F (T1 , T2 ) be the price of the
forward contract with delivery in [T1 , T2 ] and ft be modelling constructs price of a forward
contract for fixed delivery time t within the period [T1 , T2 ] if such a contract is traded in the
8
market. Then
F (T1 , T2 ) = PT2
1
t=T1
e−rt
T2
X
e−rt f
t.
t=T1
The forward prices are constrained using the bid ask market information from the Nordic
power exchange as
F (T1 , T2 )bid ≤ PT2
T2
X
e−rt f
1
t=T1
e−rt
t
≤ F (T1 , T2 )ask .
t=T1
The seasonal variation shape of the forward curve is thus optimized using the above bid ask
information bundled with information from the forecasts generated by bottom up models.
Benth, Koekebakker and Ollmar (2007) also presented a method of estimating a smooth
forward curve from these average based contracts as, described in Section 1.1.
Borak and Weron (2008) introduced the dynamic semiparametric factor model (DFSM)
for modelling electricity forward curves. The model represents the observed forward electricity prices Yt,j using a linear combination of nonparametric loading functions ml and
parameterized common factors Zt,l as
Yt,j = m0 (Xt,j ) +
L
X
Zt,l ml (Xt,j ) + t,j ,
l=1
where Xt,j is observable covariates that represents the delivery periods. The loading functions
ml (.) is modeled with B-spline as
ml (Xj ) =
K
X
al,k ψk (Xj ),
k=1
where K is the number of knots, ψk (.) are the splines and al,k are spline coefficients. The
estimation is thus carried out by least square minimization scheme.
Fleten and Lemming (2005) and Benth, Koekebakker and Ollmar (2007) models employ
either a parametric form or a bottom up forecast for the seasonal function. Borak and
Weron (2008) approach however is heavily dependent on the number of loading functions
employed in the model. In addition, small market data poses a problem in the least square
minimization scheme.
9
Our approach however does not rely on any parametric form or forecast but provides
a method of constructing the smooth forward curve using Bayesian spline, which naturally
works well with small dataset. We also provide a method of estimating forward curves for
an illiquid hub from a liquid hub by incorporating a prior information about the hubs.
1.3 Market Description
The empirical analysis of this thesis is based on the market data from ICE quoted to Mid-C
market hub and the historical spot prices from Alberta. The following two sections give an
overview of the Alberta and Mid-Columbia markets as well as the summary of the forward
contracts traded at ICE.
1.3.1 Alberta Electricity Market
Since deregulation in 1996, Alberta electricity market operates as a wholesale power market
where the power produced and consumed as well as prices for each and every hour of the
year are facilitated and regulated by the Alberta Electric System Operator (AESO).
One of the key features of Alberta electricity spot market is that the price is highly volatile
from hour to hour. Unexpected short term outage at the power plant coupled with extreme
weather can spike the price as high as $1000/MWh for several hours. On the other hand,
surplus production together with low demand makes the price to fall as low as $0/MWh.
However, longer term prices, which can be thought of as an average of many hourly
prices, are much more stable because overall price trends tend to be driven by economical
forecasts such as demand growth, capacity additions and retirement.
Alberta has about 14,000 MW of installed electricity generation capacity as well as 21,000
kilometres of transmission lines. Together, this system continuously delivers electricity to
homes, farms and businesses in every corner of the province. About 41 percent of Alberta’s
installed electricity generation capacity is from coal and almost 40 percent from natural gas.
10
Alberta also uses water, wind, biomass and waste heat as forms of electricity generation.
We collected Alberta hourly spot prices from January 01, 2010 to December 31, 2012 to
be used for the empirical analysis in the spot price modelling approach which we present in
Chapter 5.
1.3.2 Mid-Columbia (Mid-C) and California Oregon Border (COB) Markets
The Mid Columbia (Mid-C) is a power market hub located in the state of Washington. It
serves as a common market pricing point used for trading physical and financial electricity
products through brokers in the northwest electric market, which also includes the states of
Oregon, Idaho, Utah, Nevada, Montana, Wyoming and part of California.
The region relies on hydroelectric production for approximately two thirds of its electricity
needs. The hydropower is generated from hydro dams built across the Columbia river, which
flows from the base of the Canadian Rockies in southeastern British Columbia to the states
of Washington and Oregon and flows to the Pacific ocean.
Figure 1.2: The North West Electric
Power Markets (www.ferc.gov)
Figure 1.3: Columbia River & Bonneville
dam (www.wikipedia.org). Source: U.S.
Army Corps of Engineers, 2003.
11
The volume of water in the Columbia river varies seasonally, depending on the timing
and volume of melting snowpack and precipitation. In Canada, the highest flows occur
between May and August; the lowest between December and February. In the U.S., peak
flows typically occur between April and June; the lowest also occur between December and
February.
Based on the seasonal variations of the water level through out the year, electricity
prices also show a seasonal price pattern opposite to the water level. When peak flows occur
between April and June, power prices will be generally at their lowest and in the months
between December and February they will be relatively higher.
The California Oregon Border (COB) is another pricing hub located in the state of Oregon
where the price pattern is dictated by Mid-C prices. Unlike Mid-C which is one of the most
liquid hubs in North America where power contracts are traded daily, COB is an illiquid hub
with limited market activities.
1.3.3 Inter Continental Exchange (ICE)
A market based in Atlanta, Georgia that facilitates the electronic exchange of energy commodities, ICE operates completely as an electronic exchange. It is linked directly to individuals and firms looking to trade in oil, natural gas, jet-fuel, emissions, electric power and
commodity derivatives.
ICE was established in May 2000, and has been at the forefront of the commodities
exchange market since then. The ICE network offers companies the ability to trade energy
commodities with another company around the clock and spanning the globe. ICE also
facilitates the exchange of emissions contracts and over-the-counter energy exchange.
One of the many products of ICE traded is electricity forward with delivery over a period
rather than at a fixed point in time which is the main focus of the thesis. The delivery
periods are standardized and may range from monthly to yearly contracts and they may
overlap such as yearly overlaps with quarterly and quarterly contracts overlap with monthly.
12
On the delivery period, cash settlement is based up on the mathematical average of the daily
prices calculated by averaging the peak hourly and and off peak hourly prices published by
ICE. We can think this settlement as a fixed verses float swaps where we are swapping the
monthly forward contract price (fixed price) by the daily spot price (float price).
Mid-C average based forward contracts are traded at ICE based on the time of use as
peak or off peak. Peak electricity forwards are defined based on the times of the delivery of
electricity during the peak hours defined as Monday to Saturday between hour ending 07:00
to hour ending 22:00 www.theice.com.
The market data used for empirical analysis is Mid-C forward prices traded between
December 01, 2009 - December 31, 2009 to be delivered on the peak hours between the years
2010 and 2012. The 2010 prices are monthly data, however 2011-2013 are quarterly prices
which are a single forward price for the three months of the year.
1.4 Overview of Thesis
The thesis focuses on three problems concerning the construction of electricity price curves.
The first involves the development of a Bayesian penalized spline model and associated
methodologies for the construction of a smooth forward price curve delivery at a fixed time
from average based electricity contracts delivery over a period. The proposed model is
numerically illustrated using the monthly market data quoted to Mid-Columbia pricing hub
on December 01, 2009 for delivery on January - December 2010.
The second concerns the construction of another smooth forward price curves for an
illiquid hub (COB) with limited market information by adding a spread term to the already
estimated curve for a liquid hub (Mid-C). These entail the construction of joint Bayesian
models that incorporate the positive spread between the hubs as an informative prior. It
also concerns with the evolution of these forward prices over different trading days. The
methodology is applied 22 times independently to Mid-C and COB forward prices obtained
13
from December 01 - December 31, 2009.
The third concerns the estimation of spot prices, with special attention to the construction
of daily prices from historical settled prices in the Alberta electricity market. This involves
the construction of models that flexibly represent the complex dependence structures between
seasonality, spikes and mean reverting that is exhibited in electricity spot prices. The thesis
is thus organized as follows.
Chapter 2 presents brief introduction to Bayesian estimation techniques that focuses on
Bayes formula, choice of priors and MCMC algorithm. Bayesian parametric estimation of
an O-U type spot price model and detail numerical results on the parameter estimates and
their Bayesian credible intervals are presented.
In Chapter 3, we provide a brief background introduction on smoothing spline, penalized
spline and Bayesian penalized spline that are necessary for the development of this thesis.
The chapter begins by laying out the theoretical basis of regression spline models, and it
illustrates by example the power of penalized spline models over other regression model.
MCMC algorithm and hierarchical Bayesian methodologies are also discussed in detail.
In Chapter 4, we discuss the construction of a smooth forward curve based on Bayesian
penalized spline. Our approach follows Bjerksund (2000) which models the relationship
between the forward price function at any fixed time t and the average based contracts over
the delivery period [T1 , T2 ]. Let F (t, T1 , T2 ) be the contract price of electricity at time t to
be delivered over a period [T1 , T2 ] and assuming constant cash flow throughout the delivery
period, the expression of the average based contract is
F (t, T1 , T2 ) =
Z T2
w(t, u)f (t, u) du,
T1
where
e−r(t−u)
−r(t−u) du
T1 e
w(t, u) = R T2
and f (t, u) is is the price of a forward with fixed-delivery time at u. By taking zero interest
14
rate, we can get a very good approximation to the above model as
F (t, T1 , T2 ) ≈
1
T2 − T1
Z T2
f (t, u) du.
T1
Assuming delivery occurs at a discrete delivery times, the above integral turns into a sum
and the whole expression will turn into an average of f (t, u) over the delivery periods which
we use for our numerical computation.
The forward curve f inside the above integral is modelled as a spline the same way as Li
and Yu (2005) and we estimate the smoothing parameter from the ratio of variances of the
posterior distribution inside the MCMC algorithm employed. The chapter also presents a
way of extending the smooth forward curve to a long term price curve beyond the one year.
A two step construction methodology from Bayesian perspective is also presented in
Chapter 4. The Mid-Columbia smooth forward curve is first estimated with a Bayesian
penalized spline model. The California Oregon Border curve is then constructed by adding
a spread to the already estimated Mid-C curve. The positive spread between Mid-C and
COB is incorporated as an informative prior. Finally, the methodology is illustrated using
the market data from Mid-C and few COB monthly forward prices and results from the
simulation study is summarized and the fitted surfaces are presented.
Motivated by the ease of parameter estimation of a spot price model using Bayesian
paradigm, we present spot price modelling approach using a Bayesian penalized spline
smoothing in Chapter 5. Our approach follows Lucia and Schwartz (2002) spot price modelling approach but we apply penalized spline smoothing to the monthly average of historical
spot prices to estimate the deterministic seasonal function f . After removing the seasonal
function from the log of the historical spot price, we left with a pure stochastic O-U process
for which we can estimates the parameters using Bayesian and MCMC algorithm. Using the
estimated parameters and the seasonal function, spot prices are constructed back to see how
the model fits the observed prices. Mathematical formulations, graphical representations
and numerical results are presented in detail.
15
Finally, in Chapter 6, results from the thesis are summarized, issues associated with
spline modelling are discussed, and recommendations are presented regarding the methods
considered throughout the thesis. Potential areas for future research are pointed out as well.
16
Chapter 2
Bayesian Estimation of Spot Price Model
2.1 Introduction
Electricity spot and forward price models are mostly stochastic in nature and are governed
by a set of continuous-time stochastic differential equations. While this modelling approach
is very common in theoretical analysis, numerical estimation of model parameters is difficult
and poses serious challenges.
The natural choice for empirical estimation of model parameters is to use a maximum
likelihood estimation (MLE) approach. However in most cases the likelihood function is
unknown or has a nonstandard form which makes the implementation difficult. One approach
is to employ an approximate likelihood function but it induces bias in parameters values as
noted in Gray (2005).
In addition, MLE is based on the assumption of large number of samples, and as a result
the approach may produce incorrect estimates under a limited number of samples case. This
is the case pointed out by Cartea and Figueora (2005) who applied MLE on UK and Wales
electricity markets. They found out that the MLE approach produces negative values for
parameters that should otherwise be positive.
In order to avoid these difficulties posed in parameter estimation, we propose Bayesian
approach as a natural solution. The Bayesian paradigm makes use of a prior information
about the parameters and models the joint posterior distribution conditioning on the given
data using Bayes’ formula. Estimation of the parameters is performed using a Markov Chain
Monte Carlo algorithm based on Gibbs sampling and Metropolis Hastings algorithm.
In this chapter, we examine the Bayesian estimation approach to estimate parameters of
a continuous time O-U spot price model where the mean reversion parameters do not have
17
a standard density. An MCMC scheme based on Gibbs sampling for the long run mean and
volatility parameters and Metropolis Hastings algorithm for the mean reversion is employed.
The estimates for the parameters are then compared against results from the maximum
likelihood estimation.
The remainder of the chapter is organized as follows. Section 2.2 presents a brief overview
of Bayesian estimation framework. Section 2.3 provides noninformative and conjugate priors.
MCMC algorithms together with Gibbs and Metropolis Hastings steps are presented in
Section 2.4. The continuous time O-U spot price model together with MLE and Bayesian
estimation of the parameter are discussed in Section 2.5. Section 2.6 illustrates the Bayesian
methodology applied to a simulated samples from the proposed model and numerical results
are presented and summarized in a table.
2.2 Bayesian Estimation Technique
Bayesian statistical inferences about the parameter θ are made conditional on the observed
data and takes the parameters as random variables. Suppose that θ has a probability
distribution p(θ). Then from Bayes theorem we have
p(Y |θ)p(θ) = p(Y, θ) = p(θ|Y )p(Y ).
Conditional on the observed data Y , the distribution of θ is
p(θ|Y ) =
p(Y |θ)p(θ)
.
p(Y )
Given the data Y , the p.d.f. p(Y |θ) may be regarded as a function of θ, which we recall as
the likelihood function of θ given Y and written as L(θ|Y ). So, given a likelihood L(θ|Y )
and a prior probability density p(θ), the posterior density for θ is given as
p(θ|Y ) =
p(θ)L(θ|Y )
.
p(Y )
18
The factor p(Y ) does not depend on θ and, can thus be considered a constant, yielding
p(θ|Y ) ∝ p(θ)L(θ|Y ).
This expression condenses the technical core of the Bayesian estimation technique: the primary task of any estimation is to develop the model p(θ, Y ) and perform the necessary
computation to summarize p(θ|Y ). In most cases, analytic evaluation of the estimates
θˆ = E[θ|Y ] is impossible so that alternative approaches such as Markov Chain Monte Carlo
(MCMC) can be used. Moreover, specification of the prior p(θ) is also a challenge so that
noninformative and conjugate priors are considered.
2.3 Specification of Prior
The prior distribution represents a subjective information or an expression of belief about
an unknown quantity before the data are available. Generally, there are two classes of prior
distributions: noninformative priors and conjugate (informative) priors.
2.3.1 Noninformative Priors
In cases where we do not have a strong prior belief or lack of information about the parameter, it is advisable to use noninformative prior which is dominated by the likelihood function
as in Tanner (1996). We considered two cases to illustrates the choice of noninformative
priors.
Case 1: Normal population (mean unknown, variance known).
Consider an iid sample from a N (θ, σ 2 ) population, where σ 2 is known and θ is unknown.
Note that the likelihood function f (Y |θ) is equal to
n
Y
®
1
1 (yi − θ)2
exp
−
2
2
σ2
i=1 2πσ
´
n
1
1X
(yi − θ)2
=
exp
−
.
(2πσ 2 )n/2
2 i=1
σ2
(
19
)
Since σ 2 is given, (2πσ 2 )−n/2 is considered a constant and does not help to determine the
shape of the likelihood hence can be omitted. We also note that
n
X
(yi − θ)2 =
i=1
since
Pn
i=1 (yi
n
X
[(yi − y¯) + (¯
y − θ)2 =
i=1
n
X
(yi − y¯)2 + n(¯
y − θ)2 ,
i=1
− y¯)(¯
y − θ) = 0. Thus, the likelihood can be re-expressed as
®
1
exp −
2
Pn
i=1 (yi
´
− y¯)2 + n(¯
y − θ)2
.
σ2
Conditional on the data,
−
n
1X
(yi − y¯)2
2 i=1
σ2
is a constant. Hence, we have the further reduction that the likelihood can be expressed as
®
´
y − θ)2
1 n(¯
exp −
.
2
σ2
Under the prior p(θ) ∝ constant, it follows that the posterior density is N (¯
y , σ 2 /n). Under
the prior p(θ) = N (θ0 , σ02 ), where θ0 and σ0 are known constants, it can be shown that the
posterior density is normal with mean given by a weighted average of the prior mean and
the sample average:
Ç
θ0
1/σ02
1/σ02 + n/σ 2
å
Ç
å
n/σ 2
+ y¯
,
1/σ02 + n/σ 2
and the posterior variance is equal to (1/σ02 + n/σ 2 )−1 . Note that as σ0 → ∞ (n is fixed)
or as n → ∞ (σ0 fixed) the prior flattens out and we recover the posterior under the prior
p(θ) ∝ constant discussed in Gelman (1995) and Tanner (1996).
In general, if we can write the likelihood L(θ|Y ) in the form g[ψ(θ) − t(Y )], then the
likelihood is in data-translated form, since different datasets only shift the likelihood through
t(Y ) as discussed in Tanner (1996). So we can take p(ψ(θ)) ∝ constant, since it is on the
ψ(θ) scale that the likelihood is data-translated thus, a noninformative prior would say
that any value is reasonable on the ψ(θ) scale. Transforming back to the θ scale, we have
p(θ) ∝ | ∂ψ(θ)
|.
∂θ
20
Case 2: Normal population (mean known, variance unknown)
Let y1 , · · · , yn be an iid sample from N (µ, σ 2 ), then L(σ|Y ) = σ −n exp
s2 =
Pn
i=1 (yi
Ä
−ns2
2σ 2
ä
, where
− θ)2 /n. Now note that
Ç
σ −n
−ns2
exp
s−n
2σ 2
å
ï
Å ãò
ñ
Ç
Ç
ååô
−n
s2
× exp
exp log
2
σ2
n
= exp(−n(log σ − log s) − exp[−2(log σ − log s)]).
2
σ
= exp −n log
s
In this case, g(y) = exp(−ny −
n
2
exp(−2y)), t(Y ) = log s and ψ(θ) = log σ. Hence, arguing
as above we would put a flat prior on log σ, i.e. p(log σ) ∝ constant, which implies that
p(σ) ∝ σ1 .
When both θ and σ are unknown, we argue that knowledge of θ should be independent
of σ and vice versa. Thus, p(θ, σ) = p(θ)p(σ 2 ) ∝ constant × σ12 ∝
1
σ2
as presented in Gelman
(1995) and Tanner (1996).
2.3.2 Conjugate Priors
In some situations, the data analyst may not wish to be vague and may be willing to use
some smooth distribution that is a specific member of a family of distributions. One such
class of distributions is the conjugate class.
If the likelihood is based on a random sample of size n from an exponential family
distribution with density

m
X
g(θ)h(y) exp 
j=1


φj (θ)tj (y) ,
then a conjugate prior density of the form

m
X
p(θ) ∝ [g(θ)]b exp 
j=1


φj (θ)aj 
combines with the likelihood to yield a posterior density having the same form as the prior
density, but with b, a1 , · · · , am replaced by
b0 = b + n, a0j = aj +
m
X
i=1
21
tj (y), j = 1, · · · , m.
It can be seen that the gamma distribution is conjugate to the Poisson, the normal distribution is conjugate to the normal (with known σ and unknown µ), and the inverse gamma
(IG)1 is conjugate to the normal (unknown σ and known µ).
2.4 Markov Chain Monte Carlo (MCMC)
MCMC is an estimation approach which provides an efficient estimates to the parameter
of interest asymptotically. It also provides the finite sample distribution. The sampling
based MCMC method avoids the computation of a high-dimensional integral necessary for
obtaining the marginal distribution of parameters. Instead, it generates random samples
from any nonstandard likelihood functions. These random samples converge to finite sample
distribution of the estimates.
MCMC methods have been mostly applied in Bayesian applications where probability
densities are known up to a constant of proportionality. The interest of estimation is on the
posterior marginal distribution p(θ|Y ) and the point estimate θˆ = E[θ|Y ]. Computing the
expectation involves integration and is often analytically difficult as θ is high dimensional.
MCMC method thus generates random samples of observations from p(θ|Y ) namely {θ n :
n = 1, . . . , N } and provides a density estimator that converges in distribution to the marginal
density p(θ|Y ). Thus, posterior mean and standard deviation of parameter θ can thus be
evaluated using
θˆ =
‘
Var(θ|Y
)
1
N
1 X
θn
N n=1
(2.1)
N T
1 X
=
θ n − θˆ θ n − θˆ .
N − 1 n=1
θ ∼ Inverse-gamma(α, β), if the p.d.f of θ is p(θ) =
22
β α −(α+1) −β/θ
e
,
Γ(α) θ
α > 0, β > 0 and θ > 0.
(2.2)
2.4.1 Gibbs Sampling Technique
Gibbs sampling is an MCMC sampling-based approach used for estimating marginal density.
Gibbs sampler has been applied in the context of stochastic models involving large numbers
of variables. Gelfand and Smith (1990) employed the technique in a variety of models,
including multinomial, multivariate normal, hierarchical and k groups mean models.
Assume the distribution of interest be π(θ), where θ = (θ1 , . . . , θd )0 and the full conditional distributions πi (θi ) = π(θi |θ−i ), i = 1, . . . , d are completely known and can be sampled
from. Here θ−i = (θ1 , θ2 , . . . , θi−1 , θi+1 , . . . , θd ) is the vector θ with its ith component removed. The interest is to draw samples from π when direct sampling scheme is not available
but drawing from πi is possible. Gibbs sampling provides a way of sampling successive draws
from the full conditional posterior distributions. The steps in the algorithm are
(0) 0
(0)
1. Set initial values θ(0) = θ1 , . . . , θd
.
(j) 0
(j)
2. Obtain a new value θ(j) = θ1 , . . . , θd
(j−1)
from θd
through successive gen-
eration of values
(j)
(j−1)
, . . . , θd
(j)
(j)
(j−1)
(j−1)
(j)
θ1 ) ∼ π θ1 |θ2
(j−1)
θ2 ) ∼ π θ2 |θ1 , θ3
, . . . , θd
,
,
..
.
(j)
(j)
θd ) ∼ π θd |θ1 , . . . , θd−1 .
3. Update j = j + 1 and repeat step 2 until convergence is achieved.
When convergence is reached, the resulting value θ(j) is a draw from π.
2.4.2 Metropolis Hastings Algorithm
Metropolis algorithm is another MCMC algorithm required when one or more of the full
conditionals are not available in closed form but the functional form of joint density p(θ|Y ) =
23
p(θ1 , . . . , θd |Y ) is known. Gibbs sampler cannot be applied to situations where conditional
densities are unavailable or not easy to sample directly from.
The Metropolis algorithm was first developed in Metropolis (1953). It can be used to
generate random samples from any target distribution known up to a normalizing constant,
regardless of its analytical complexity and its dimensionality.
The interest is to generate random samples that have a stationary distribution π. One of
the contributions of Metropolis algorithm is in constructing the transition density p(θ, φ) that
realizes a Markov chain with equilibrium distribution π. It is equivalent to find transition
matrix P that realizes a reversible chain, which satisfies π = πP . That is, for all i 6=
j, πi pij = πj pji . In other word, p(θ, φ) must be a reversible chain and satisfy the detailed
balance equation
π(θ)p(θ, φ) = π(φ)p(φ, θ).
The proposed state φ is generated from a symmetric proposal transition density that
satisfies q(θ, φ) = q(φ, θ). The new state is thus accepted with a probability α(θ, φ) given by
®
´
π(φ)
α(θ, φ) = min 1,
,
π(θ)
and the new state is rejected with probability 1 − α(θ, φ).
The algorithm was further generalized to an arbitrary transition density in Hastings
(1970). The proposal density q(θ, φ) does not have to be symmetric. The proposed state φ
in this case is accepted with probability α(θ, φ) given by
®
´
π(φ)q(φ, θ)
.
α(θ, φ) = min 1,
π(θ)q(θ, φ)
There are other acceptance probability specifications that generate reversible chains, such
as in Barker (1965). Peskun (1973) concludes that the Metropolis acceptance-rejection rule
is optimal in the sense of having the smallest asymptotic variance for the resulting estimates.
By the standard Markov chain theory, if the chain is irreducible, aperiodic and reversible,
24
it will converge to its equilibrium distribution π(x), which is the case with the Metropolis
algorithm. Moreover, after a sufficient burn-in steps, the samples produced from the chain
can be used to approximate the target distribution.
The closeness of the proposal density to the the target distribution is one of the factors
which determine the efficiency of Metropolis Hastings sampler. Common candidate proposal
densities are multivariate normal density and multivariate t distribution. Theoretically, the
proposal density in MH algorithm is not required to have any good global properties and
can take any form in Hastings (1973) generalization.
The steps in Metropolis Hastings algorithm can be summarized as below.
1. Initialize the iteration counter of the chain j = 1 and set initial values θ (0) .
2. Move the chain to a new value φ generated from the density q(θ (j−1) , ·).
3. Evaluate the acceptance probability of the move α(θ (j−1) , φ) given by
®
´
π(φ)q(φ, θ)
.
α(θ, φ) = min 1,
π(θ)q(θ, φ)
4. Generate an independent uniform quantity u and
• If u ≤ α then the move is accepted and θ (j) = φ.
• If u > α then the move is rejected and θ (j) = θ(j−1) .
5. Change the counter from j to j + 1 and return to step 2 until convergence is
reached.
The capability of MCMC algorithms to work with high dimensions leads to many applications in Finance. Aguilar and West (2000) use MCMC and sequential filter to estimate
dynamic factor models, with international exchange rates data. The applications on termstructure models are mostly with stochastic volatility models. Jaquier, Polson & Rossi (1994)
shows that the Bayesian Markov chain estimators outperform estimators from moment and
QML methods on stochastic volatility models.
25
2.5 O-U Spot Price Model
The O-U process is the unique solution to the following stochastic differential equation
dXt = θ(µ − Xt )dt + σdWt , X0 = x0 ,
where σ is interpreted as the volatility or diffusion, µ is the long-run mean (equilibrium)
value of the process, and θ is the speed of reversion.
Using the Itˆo lemma with f (t, x) = xet , one can show the explicit solution of the above
SDE as
−θt
Xt = µ + (x0 − µ)e
The random variable It =
R t −θ(t−u)
e
dW
u
0
+σ
Z t
0
e−θ(t−u) dWu .
appearing on the right-hand side of the above
equation is normally distributed with mean zero and variance
Z t
1
(1 − e−2θt ).
2θ
e−2(t−u) du =
0
Thus, Xt is normally distributed with mean µ + (x0 − µ) e−t and variance
Ç
σ2
(1
2θ
− e−2θt ).
å
Xt ∼ N µ + (x0 − µ)e
−θt
,
σ2
(1 − e−2θt ) .
2θ
2.5.1 Maximum Likelihood Estimation (MLE)
A maximum likelihood estimate (MLE) of θ is a value of θ that maximizes the likelihood
L(θ|Y ) or, equivalently, the loglikelihood l(θ|Y ).
ˆ is unique, θˆ is the value of θ
In situations where the maximum likelihood estimate (θ)
best supported by the data. Alternatively, θˆ is the value of θ which makes the observed
value of the data most likely. It is noted that if g(θ) is a 1-1 function of θ and θˆ is an MLE
ˆ is an MLE of g(θ). This is called the invariance property of MLE.
of θ, then g(θ)
Suppose the likelihood is differentiable, unimodal, bounded above, and θ is of dimension
d. Then the mode θˆ is obtained by
26
0.8
0.6
3.0
P(t,x)
2.5
0.4
2.0
0.0
Tim
e
0.2
1.5
-3
1.0
-2
-1
0
x
1
0.5
2
3
Figure 2.1: Transition density of the O-U process starting at x0 = 0 with µ = 0, θ = 1, σ 2 = 2
1. Differentiating the loglikelihood with respect to θ to obtain the d × 1 vector
∂l(θ|Y )/∂θ.
2. Setting the d dimensional likelihood equations to 0, i.e. ∂l(θ|Y ) = 0.
3. Solving for θ.
When the loglikelihood is quadratic, the likelihood equations are linear in θ and the
solution is relatively easy to obtain. In more general situations where a closed-form solution
of the likelihood equations cannot be found, one may use an iterative numerical method to
locate a mode as in Tanner (1996).
27
2.5.2 MLE of the O-U Model
Assume a discrete sample of size n + 1 at equally spaced intervals h, X = (X0 , Xh , . . . , Xnh ).
The O-U process has a normal transition density
Ç
p(Xjh |X(j−1)h ) =
å− 21
πσ 2
(1 − e−2θh )
θ
θ(Xjh − X(j−1)h e−θh − µ(1 − e−θh ))2
exp −
.
σ 2 (1 − e−2θh )
"
#
The likelihood function L(θ, µ, σ 2 |X) is given by
L(θ, µ, σ 2 |X) =
n
Y
p(Xjh |X(j−1)h )
j=1
Ç
=
å− n2
πσ 2
(1 − e−2θh )
θ

θ

exp 
−
n
P
[Xjh − X(j−1)h e−θh − µ(1 − e−θh )]2
j=1
σ 2 (1
−
e−2θh )


.

The log-likelihood function l(θ, µ, σ 2 |X) = log(L(θ, µ, σ 2 |X)) can be derived from the above
conditional density function as
l(θ, µ, σ 2 |X) =
n
X
j=1
ln p(Xjh |X(j−1)h )
Ç
å
n
n
σ2
= − ln(π) − ln
(1 − e−2θh )
2
2
θ
n
X
θ
−
[Xjh − X(j−1)h e−θh − µ(1 − e−θh )]2 .
σ 2 (1 − e−2θh ) j=1
The maximum likelihood estimates (MLE) of the parameters can be found by taking the
partial derivative of the log-likelihood function and set it to zero and solving the equation
28
as follows.
∂l(θ, µ, σ 2 |X)
= 0
∂µ
=
n
X
θ
2[Xjh − X(j−1)h e−θh − µ(1 − e−θh )](1 − e−θh )
σ 2 (1 − e−2θh ) j=1
n
P
⇒ µMLE =
[Xjh − X(j−1)h e−θh ]
j=1
n(1 − e−θh )
∂l(θ, µ, σ 2 |X)
= 0
∂σ 2
n
X
n
θ
= − 2+ 4
[Xjh − X(j−1)h e−θh − µ(1 − e−θh )]2
−2θh
2σ
σ (1 − e
) j=1
2
⇒ σMLE
=
n
X
2θ
[Xjh − X(j−1)h e−θh − µ(1 − e−θh )]2
−2θh
n(1 − e
) j=1
∂l(θ, µ, σ 2 |X)
= 0
∂θ
=
n
2θhe−θh X
[(Xjh − µ)(X(j−1)h − µ) − e−θh (X(j−1)h − µ)2 ]
σ 2 (1 − e−2θh ) j=1
n
P
⇒ θMLE
(X − µ)(X(j−1)h − µ)
1 j=1 jh
= − ln
.
n
P
h
(X(j−1)h − µ)2
j=1
2.5.3 Bayesian Estimation of the O-U Process
Bayesian estimation technique is a widely used technique in finance and there are many
valuable contributions of the Bayesian approach to research. For example, Gray (2002) has
studied the Vasicek interest rate model using the Bayesian approach.
Assume a discrete sample of n + 1 interest rates at equally spaced intervals h, X =
(X0 , Xh , . . . , Xnh ). The O-U interest rate has a normal transition density
Ç
p(Xjh |X(j−1)h ) =
å− 21
πσ 2
(1 − e−2θh )
θ
θ(Xjh − X(j−1)h e−θh − µ(1 − e−θh ))2
.
exp −
σ 2 (1 − e−2θh )
"
29
#
The likelihood function L(θ, µ, σ 2 |X) is given by
2
L(θ, µ, σ |X) =
n
Y
p(Xjh |X(j−1)h )
j=1
Ç
=

å− n2
2
πσ
(1 − e−2θh )
θ
1
,
σ2
Assuming a flat prior p(σ 2 ) ∝
θ

exp 
−
n
P
[Xjh − X(j−1)h e−θh − µ(1 − e−θh )]2
j=1


.

σ 2 (1 − e−2θh )
which corresponds to IG(0,0), and the fact that the
inverse gamma (IG) is conjugate prior to the normal (unknown σ and known µ), we have
the conditional density
Ñ
p(σ 2 |X, µ, θ) ∼ IG
Ç
å
n
X
n
θ
,
[Xjh − X(j−1)h e−θh − µ(1 − e−θh )]2
2 1 − e−2θh j=1
é
.
Moreover, the normal distribution is conjugate prior to the normal (with known σ and
unknown µ), and we have the conditional density
Ñ
p(µ|X, σ 2 , θ) ∼ N
n
Xjh − X(j−1)h e−θh σ 2 (1 − e−2θh )
1X
,
n j=1
1 − e−θh
2θn(1 − e−θh )2
é
.
However, the conditional density of θ is non-standard. So assuming a flat prior on θ in
combination with the likelihood L(θ|X), we can use the normal based inference approach as
described below to get the candidate density for θ.
Normal-based inference
Let θˆ be the mode of the likelihood for the data set Y . Then in certain situations
ˆ ∼ N (0, C),
(θ − θ)
(2.3)
ˆ
where C is the d × d variance-covariance matrix of (θ − θ).
In Bayesian paradigm, θˆ is fixed (conditional on the data Y ) and θ is considered as
random. Given the model and the data, (2.3) indicates that the posterior density of θ is
ˆ C).
normal with mean θˆ and variance-covariance matrix C, i.e. p(θ|Y ) = N (θ,
The Bayesian justification of (2.3) is an expansion of the loglikelihood about the fixed
value θˆ (conditional on Y ),
ˆ ) + (θ − θ)
ˆ T S(θ|Y
ˆ ) − 1 (θ − θ)
ˆ T J(θ|Y
ˆ )(θ − θ)
ˆ + r(θ|Y ),
l(θ|Y ) = l(θ|Y
2
30
ˆ ) = −∂ 2 l(θ|Y )∂θ2 | ˆ and J(θ|Y ) is called the observed information matrix. Since
where J(θ|Y
θ
ˆ ) = 0, assuming that r(θ|Y ) can be neglected, then
S(θ|Y
1
ˆ T J(θ|Y
ˆ )(θ − θ)},
ˆ
L(θ|Y ) = exp(l(θ|Y )) ∼ exp{− (θ − θ)
2
which implies that the likelihood/posterior density is proportional to the multivariate normal
ˆ ) as described in Tanner
density with mean θˆ and variance-covariance matrix C = J −1 (θ|Y
(1996).
Based on the above description, we can assume that the candidate density of θ is given
by
ˆΣ ,
p(θ|X, µ, σ 2 ) ∼ N θ,
where
n
P
(X − µ)(X(j−1)h − µ)
1 j=1 jh
ˆ
θ = θM LE = − ln
,
n
P
h
(X(j−1)h − µ)2
ñ
j=1
2
2
∂ log L(θ, µ, σ |Y )
−
∂θ2
Σ =
ô−1
=
θˆ
B2
,
A0 BC − AB 0 C + ABC 0
where
A = he−θh
B =
C =
σ 2 (1 − e−2θh )
2θ
n
X
[(Xjh − µ)(X(j−1)h − µ) − e−θh (X(j−1)h − µ)2 ]
j=1
0
A
= −θhe−θh
B0 =
σ2
[(1 + 2hθ)e−2hθ − 1]
2θ2
C 0 = he−θh
n
X
(X(j−1)h − µ)2 .
j=1
So using the above candidate density and applying Metropolis Hastings method, we can
draw iterates for θ.
31
2.6 Statistical Inferences
Statistical inference of the model can then be conducted on the basis of a simulated samples
of observations from p(θ|Y ) namely {θ n : n = 1, . . . N }. The Bayesian estimate of θ as well
as the numerical standard error estimates can be obtained from
θˆ =
‘
Var(θ|Y
)
N
1 X
θn
N n=1
N T
1 X
=
θ n − θˆ θ n − θˆ .
N − 1 n=1
(2.4)
(2.5)
It can be shown that θˆ tends to E(θ|Y ) as N tends to infinity. Other statistical inference on
θ can be carried out based on the simulated sample, {θ n : n = 1, . . . N }. For instance, the
2.5% and 97.5% points of the sampled distribution of an individual parameter give a 95%
posterior interval.
As the posterior distribution of θ given Y describes the distributional behaviour of θ
with the given data, the dispersion of θ can be analyzed through Var(θ|Y ), with an estimate
given by (2.4) based on the sample covariance matrix of the simulated observations. The
‘
positive square root of of the diagonal element in Var(θ|Y
) is commonly taken as a standard
error estimate of the Bayesian estimates.
2.6.1 Numerical Example
In this example, a sequence of monthly interest rates is simulated according to the O-U
model with X0 = 0.10, θ = 0.18, µ = 0.08, and σ = 0.02 as shown in Figure 2.2. Bayesian
estimates with conjugate prior distributions are first sampled. Using the Gibbs sampling
technique, we draw σ 2 from an inverted-gamma and µ from a normal and the samples are
accepted with probability one. However, we use the Metropolis Hastings algorithm to draw
θ from the candidate normal density and the sample is accepted with some probability. As a
result, a sequence of iterates for (σ 2 , θ, µ) is generated and, after an initial warm up period,
32
0.10
0.00
0.05
Xt
0.15
Vasciek Interest Rate Path
0
100
200
300
t
Figure 2.2: O-U sample path with X0 = 0.10, θ = 0.18, µ = 0.08, and σ = 0.02
these draws converge to draws from the true, but unknown, joint distribution of parameters.
The output from the chain is used to derive point estimates of the parameters, as well as
the numerical density of any function of these parameters as in Gray (2002).
In this example, 5000 Bayesian estimates are used for each of the parameters and 2500
post warm up samples (after burn in samples) are considered. Then, the Bayesian estimates
along with maximum likelihood estimates and their numerical standard errors estimates are
obtained via (2.4) and (2.5). The results are presented in Table 2.1.
Table 2.1 shows that the standard errors computed from the simulated observations
are relatively small for the Bayesian estimates. For large sample size of 500, the Bayesian
posterior interval for the mean reverting parameter θ is tighter comparing to the Maximum
Likelihood approach. However, for the other two parameters µ & σ both approaches result
in almost the same results because their posterior distributions have standard forms. So
in case of nonstandard posterior distribution, Bayesian approach gives a tighter posterior
interval and better estimate for the parameters.
33
ML estimates
n
µ
0.076
0.023
(0.060,0.097)
5%
σ
Bayesian estimates
θ
µ
σ
θ
100
Mean
Std Err
95% Ints
Bias
0.023
0.248
0.003
0.082
(0.016,0.026) (0.081,0.369)
15%
37.7%
0.074
0.021
(0.060,0.096)
7.5%
0.023
0.003
(0.017,0.027)
15%
0.223
0.063
(0.098,0.336)
23.9%
300
Mean
0.082
0.021
0.212
Std Err
0.006
0.002
0.060
95% Ints (0.0793,0.0809) (0.018,0.024) (0.145,0.300)
Bias
2.5%
5%
17.8%
0.084
0.007
(0.075,0.094)
5%
0.022
0.001
(0.018,0.025)
10%
0.191
0.046
(0.130,0.299)
6%
500
Mean
0.081
0.021
0.194
Std Err
0.005
0.001
0.050
95% Ints (0.0796,0.0803) (0.019,0.021) (0.151,0.233)
Bias
1.3%
5%
7.8%
0.082
0.021
0.005
0.001
(0.0797,0.091) (0.019,0.021)
2.5%
5%
0.185
0.034
(0.154,0.212)
2.8%
Table 2.1: Point estimates of the O-U interest rate model
2.7 Conclusion
Spot and forward price models often use continuous time stochastic equations to model
power price dynamics. Numerical estimation of these models, however, is bottlenecked by a
variety of problems. The scarcity of historical data coupled with the complex or nonstandard
nature of the likelihood function makes the regular MLE estimation approach unfeasible. An
alternative to this will be to adopt a Bayesian approach.
The advantage of Bayesian framework is the capability of incorporating prior information
about the parameters and the ease to work with small number of sample data. Unlike
most frequentist approaches that provide point estimates, the Bayesian approach provides a
distribution for the parameters from which we can draw any kind of statistical inference.
In this chapter, we discussed a Bayesian parameter estimation of a continuous time O-U
spot price model using MCMC scheme. The long run mean µ and the diffusion parameters
σ in the model have a standard distribution, so Gibbs sampling is used to draw samples.
However, the mean reversion has unknown density and Metropolis Hastings algorithms were
applied.
34
The O-U model we illustrated has a normal transition density and an exponential type
likelihood function. In cases like this, the Bayesian and MLE approaches may give the same
result though Bayesian outperforms MLE on the estimation of the mean reversion parameter
θ.
35
Chapter 3
Bayesian Penalized Spline Smoothing
3.1 Introduction
Regression spline smoothing techniques are among some of the most widely used smoothing
methods in finance. A lot of research has been done in terms of interest rate term structure
estimation. McCulloch (1971, 1975), Vasicek and Fong (1982), Shea (1984, 1985) estimate
the discount function with regression splines. Chambers, Carleton, and Waldman (1984)
estimate the yield curve with polynomials. Adams and van Deventer (1994) estimate the
term structure using smoothing splines, applying the maximum smoothness criteria.
After the initial work by Eliers and Marx (1996), the penalized spline smoothing approach
become more popular in theoretical statistics and has recently been applied to financial
models. The underlying principle of penalized spline smoothing is to estimate an unknown
smooth function by replacing the function with a high dimensional basis representation and
impose a penality on the spline coefficient to achieve a smooth fit.
Ruppert, Wand, and Carroll (2003) linked the idea of penalized spline smoothing to linear mixed models and showed the practical application to perform any regression analysis.
Recently, Jarrow, Ruppert, and Yu (2004) estimate the term structure of US government
Treasury bond with a nonparametric penalized spline model by minimizing the sum of least
square criterion and a roughness penalty term. They also adopt a semi-parametric model
to estimate the corporate debt term structure by adding a constant credit spread to the
Treasury term.
Financial application of penalized spline smoothing modelling approaches are mostly
applied to estimating the term structure of interest rates. In terms of modelling the term
36
structure of energy forward prices using spline smoothing, not much work has been done
so far. Benth, Koekebakker and Ollmar (2007) construct a smooth curve from observed
Nord Pool electricity forward prices with settlement over a period using a polynomial spline
with a maximum smoothness property, so that the newly constructed smooth forward curve
matches the observed market prices.
Li and Yu (2005) employed the Bayesian approach to estimate the interest rate term
structure of Treasury and corporate debt as in Jarrow, Ruppert, and Yu (2004). Bayesian
penalized spline is a class of penalized spline smoothing which treats the spline coefficients as
random and assigns prior densities to characterize prior information about parameter values
before estimation. The posterior distribution of the parameters given the data is thus used
for statistical inference. The authors found out that unlike the frequentist approach the
small size of the corporate debt poses no particular difficulty.
Most of the spline smoothing and regression modelling approaches are used for interest
rate term structure modelling. Motivated by the work of Li and Yu (2005) and Benth,
Koekebakker and Ollmar (2007), we employ a Bayesian penalized spline smoothing approach
to extract a smooth forward curve from average based power contracts and to model the
seasonal function in spot price models.
In this chapter, we summarize and provide background information on smoothing and
regression penalized splines. As our estimation of the electricity price curves in the next
two chapters depend on Bayesian penalized spline, we present a brief introduction of these
smoothing spline approaches in a Bayesian framework and the numerical algorithm employed
using Markov Chain Monte Carlo (MCMC).
This chapter is organized as follows. Section 3.2 gives a brief definition of parametric
and nonparametric regression. Smoothing and regression penalized spline are discussed
and summarized in Section 3.3. Section 3.4 illustrates the different regression techniques
using example from Strontium isotopes ratio data. Bayesian penalized spline smoothing
37
and MCMC algorithm are also presented in the later Section 3.5. For additional details
on smoothing spline regression we recommend Wahba (1978, 1990), Green and Silverman
(1994), Hastie and Tibshirani (1990), Eliers and Marx (1996), Ruppert and Caroll (1997,
2002), Eubank (1999), Ruppert, Wand and Caroll (2003) and Jarrow, Ruppert, and Yu
(2004).
3.2 Parametric and Nonparametric Regression
Regression analysis builds mathematical models that estimate the functional relationship of
a dependent variable to one or more independent variables. These models may be used to
predict responses at unobserved and/or future values of the independent variables. In the
simple case when both the dependent variable y and the independent variable x are scalar
given bivariate observations (xi , yi ) for i = 1, . . . , n, a regression model relates dependent
and independent variables as follows:
yi = f (xi ) + i ,
i = 1, . . . , n,
(3.1)
where f is an unknown regression function that we wish to estimate and i ∼ N (0, σ 2 ) are
zero-mean identical independent random errors with a common variance σ 2 . We can also
write the above regression model as
yi ∼ N (f (xi ), σ 2 ).
The goal of regression analysis is to construct a model for f and estimate it based on
noisy data. Both parametric and nonparametric techniques are commonly used to find the
regression function f . A common approach is to approximate f by polynomial of order m
f (x) = β0 + β1 x + β2 x2 + . . . + βm xm .
(3.2)
The above polynomial regression is a familiar type of parametric regression and it assumes
that the form of f is known except for finitely many unknown parameters. This assumptions
38
is too restrictive and the approximations may be too crude for some applications, in particular
fitting wiggly curves like power price curves.
A nonparametric regression model does not assume a predetermined form. Instead, it
makes assumptions on qualitative properties of f . The basic idea of nonparametric regression
is to depend on the data and decide which function fits the best without imposing any specific
form on f . So it is more flexible and can uncover structure in the data that might otherwise
be missed.
Smoothing and regression penalized splines are examples of nonparametric regression and
they are a powerful and versatile tool for building models to exploit structure in data. The
idea of these penalized spline is to estimate a smooth f using a spline (basis functions) by
choosing the number and locations of knots for the spline and to introduce a penalty to
prevent overfitting. In the next sections, we provide a brief introduction about penalized
splines and Bayesian penalized splines which are a focus of this thesis.
3.3 Smoothing and Regression Penalized Spline
3.3.1 Smoothing Spline
Given data y = [y1 , . . . , yn ]T = [y(x1 ), . . . , y(xn )]T with a < x1 , . . . , xn < b and consider the
regression model (3.1)
yi = f (xi ) + i ,
i = 1, . . . , n.
The penalized sum of squares S(f ) is defined as
S(f ) = n−1
n
X
(yi − f (xi ))2 + λ
i=1
Z b
{f (m) (x)}2 dx,
(3.3)
a
where f is assumed to be a 2m − 1 differentiable function on the interval [a,b].
A smoothing spline fˆ is defined as the minimizer over twice differentiable functions f of
the penalized sum of squares:
fˆ = arg min S(f ).
39
The integral term above is considered as a roughness penalty and λ > 0 is a smoothing
parameter which governs the tradeoff between smoothness and goodness-of-fit. Wahba (1978)
noted that if y cannot be interpolated exactly by a polynomial of degree less tham m, then
the solution is well known to be unique, and to be a polynomial spline of degree 2m − 1 and
to have 2m − 2 continuous derivatives on the interval [xi , xi+1 ],
i = 1, 2, . . . , n − 1.
Taking the natural choice for the spline, which is a cubic polynomial smoothing spline
(m = 2), Green (1987) suggested that the penalty term is quadratic and can be expressed
as
λ
Z b
{f 00 (x)}2 dx = λf T Df ,
a
where f = [f (x1 ), . . . f (xn )]T and D an n × n positive semidefinite matrix with rank n − 2
defined as in Eubank (1999).
With the quadratic penality assumption, the smoothing spline
fˆ = (I + λD)−1 y
minimizes S(f ). In addition, the vector fˆ uniquely defines the smoothing spline as shown
in Berry, Caroll and Ruppert (2002).
3.3.2 Bayesian Smoothing Spline
The Bayesian approach to the above smoothing spline specifies the prior distribution for f
proportional to a density from a normal family as
Ç
å
λ
p(f ) ∝ exp − 2 f T Df .
σ
Then it follows that the posterior distribution of f given y, σ2 will be
p(f |y, σ2 ) ∝ p(y|f , σ2 )p(f )
∝ N(f , σ2 )p(f )
∼ N((I + λD)−1 y, σ2 (I + λD)−1 ).
40
3.3.3 Regression Spline
Since smoothing spline depends on the number of knots placed on each xi , it becomes computationally less practical when n becomes large as pointed out by Berry, Caroll and Ruppert
(2002). An alternative approach is to express f (x) using a spline basis representation as
f (x) = B(x)δ where B(x) is a spline basis and δ is a coefficient vector. The spline basis can
be any form such as B-spline as in Eilers and Marx (1996) or mth degree truncated power
basis as in Ruppert and Carroll (2000).
In this thesis, we consider the latter one with k fixed knots k1 , . . . , kK and a basis of
m T
B(x) = (1, x, x2 , . . . , xm , (x − k1 )m
+ , . . . , (x − kK )+ ) . Using the truncated power spline
basis, we can express f as
f (x) = β0 + β1 x + β2 x2 + . . . + βm xm +
K
X
βm+k (x − kt )m
+,
(3.4)
t=1
where (x − kt )+ = max{0, (x − kt )}. Denoting the coefficient vector as
δ = [β0 , β1 , β2 , . . . βm , βm+1 , . . . , βm+K ]T
= [β, γ]T ,
with β = [β0 , β1 , β2 , . . . βm ], γ = [βm+1 , . . . , βm+K ] and
m
m
m
B(x) = [1, xi , x2i , . . . , xm
i , (xi − k1 )+ , (xi − k2 )+ , . . . , (xi − kK )+ ]1≤i≤n
= [U (x), Z(x)].
m
m
m
with U (x) = [1, xi , x2i , . . . , xm
i ], Z(x) = [(xi − k1 )+ , (xi − k2 )+ , . . . , (xi − kK )+ ], we can write
(3.4) as
f (x) = B(x)δ
(3.5)
= U (x)β + Z(x)γ
and the regression model (3.1) can be written as
y = B(x)δ + = U (x)β + Z(x)γ + ,
41
(3.6)
(3.7)
with y = (y1 , y2 , . . . , yn )T and ∼ N(0, σ 2 In ).
The above model (3.6) is parametric and can be estimated with ordinary least squares
yˆ = B(B T B)−1 B T y.
This technique is referred as regression spline smoothing. Numerical computation of the
parametric regression spline smoothing is easy and fast because it inherent the advantage
of parametric modelling. However, it has drawbacks in fitting ”wiggly” curves as stated in
Denison (1997).
3.3.4 Penalized Spline
An alternative approach to optimize the fit is achieved by imposing a penalty on spline
coefficients, specifically, choosing large number of knots e.g min{n/4, 40} suggested by Ruppert (2002) and avoiding overfitting by constraining the spline coefficient. This leads to an
optimization problem
min ||y − B(x)δ||2 ,
β
subject to ||γ||2 ≤ λ,
for any λ ≥ 0 and γ = [βm+1 , . . . , βm+K ]. The above problem can be written as
min ||y − B(x)δ||2 + λδ T Dδ,
β
where D is a block diagonal matrix with [0m+1,m+1 , IK ] and λ ≥ 0 is a smoothing parameter
and λδ T Dδ is a roughness penalty measure as shown by Ruppert and Carroll (1997). The
solution to the above minimization problem is given by
yˆ = B(B T B + λD)−1 B T y.
(3.8)
The model (3.8) is called a penalized smoother. Draper and Smith (1982) used (3.8) in
parametric regression to reduce the variability of the estimates. The smoothness of the
42
estimate is a function of the smoothing parameter λ. The larger the value of the smoothing
parameter, the more the fit shrinks towards a polynomial fit, while smaller values of λ result
in over fitting as pointed out by Eubank (1999). The matrix
Sλ = B(B T B + λD)−1 B T ,
(3.9)
in (3.8) is called a smoother matrix. The degree of smoother, corresponding to the smoothing
parameter λ can be defined as the trace of the smoother matrix. That is
df = tr(Sλ ),
which tells the number of fitted parameters.
3.3.5 Bias-Variance Trade-off
To measure the accuracy of the smoother, whether it fits observations sufficiently well or
not, we need a target criterion that quantifies a model’s performance. Even though there is
no universally accepted measure, some criteria are widely accepted and used in practice. We
discuss a criterion called mean squared error (MSE) that is widely accepted in regression
models.
Consider the loss function L
L(λ) =
1 ˆ
||f − f ||2 .
n
We define the mean squared error (MSE) as
MSE(λ) = EL(λ)
1
= E( ||fˆ − f ||2 )
n
1
=
E||(Efˆ − f ) + (fˆ − Efˆ)||2
n
1
2
1
=
E||(Efˆ − f )||2 + E(Efˆ − f )T (fˆ − Efˆ) + E||(fˆ − Efˆ)||2
n
{z
} n
|n
E(Efˆ−f )=Efˆ−Ef =fˆ−fˆ=0
43
So, we have
1
1
||Efˆ − f ||2 + E||fˆ − Efˆ||2
n
n
σ2
1
||(I − Sλ )f ||2 + tr(Sλ2 )
=
n
n
MSE(λ) =
= bias2 (λ) + var(λ).
We note that the squared bias measures how well fˆ approximates the true function f and
it depends on the true function, while the variance, which measures how well the function
can be estimated, does not depend on the function f . So the MSE reflects the bias-variance
trade-off. Big λ values result in smaller variance and higher bias; however smaller λ gives
the opposite. Thus, the over all goal is to find the optimal smoothing parameter that strikes
a balance between these two conflicting aspects of penalized spline smoothing. In the next
section, we discuss the most widely used methodologies by frequentists to select the optimal
smoothing parameter.
3.3.6 Smoothing Parameter Selection
The penalizing smoother that we mentioned in the previous section represents the compromise between the goodness of fit and the penalty to the departure from the observed
data. The balance is controlled by the smoothing parameter λ and it can be selected using
data-driven methodologies provided next.
Cross Validation (CV)
The Cross Validation (CV) estimate of the penalized smoother matrix (3.9) is defined as
n 1X
[i] 2
CV(λ) =
yi − yˆi
;
n i=1
[i]
where yˆi is exactly the same as the fit yˆi except with the ith element deleted. For the linear
smoother (3.8), we can show that
[i]
yˆi
=
X
i6=j
Sλij
yi ,
1 − Sλii
44
where Sλij is the ij th entry and Sλii is the diagonal element of the smoother matrix (3.9).
Hastie and Tibshirani (1990) showed that
Ç
n
1X
yi − yˆi
CV(λ) =
n i=1 1 − Sλii
å2
.
(3.10)
Therefore, the cross-validation criterion can be calculated using the fit based on the whole
sample and the diagonal elements of the smoother matrix (3.9).
Generalized Cross Validation
Replacing Sλii by the average of the diagonal element tr(Sλ )/n in (3.10) as in Craven and
Wahba (1979), we have the Generalized Cross Validation(GCV) criterion
Ç
n
1X
yi − yˆi
GCV(λ) =
n i=1 1 − tr(Sλ )/n
=
å2
(yi − yˆi )2
.
{ n1 tr(I − Sλ )}2
1
n
Pn
i=1
(3.11)
We note that the GCV criterion is the minimizer of GCV(λ), and since tr(Sλ )/n ∼ o(1) in
the neighbourhood of the optimal λ, we can approximate
E(GCV(λ)) ≈ MSE(λ)(1 + o(1)).
Wahba (1990) and Gu (2002) proved that under a certain regularity condition GCV(λ) is a
consistent estimate of the relative loss function and is invariant under orthogonal transformation.
3.4 Illustrative Example
As an illustrative example, we consider the fossil data in R package SemiPar. This dataset has
106 observations on fossil shells. The response is ratio of strontium isotopes (strontium.ratio),
and the covariate is age in millions of years (age) as presented in Bralower, Fullagar, Paull,
Dwyer and Leckie (1997).
45
We fit a nonparametric regression model to the data as
strontium.ratio = f (age) + ε.
Estimating f can be carried out in one of those regression methods we discussed. So we fit
the data by
• a simple linear regression model using polynomial of order m = 1 in (3.2).
• a polynomial regression model using polynomial of order m = 5 in (3.2).
• a B-spline using 6 B-cubic spline basis functions with 6 degree of freedom.
• a penalized cubic smoothing spline of order 4 (m = 2) in (3.3) and 6 degree of
freedom.
We put the scatter plot of the fossil data together with linear, polynomial of order 5, 6
B-cubic spline and penalized smoothing spline fits in Figure 3.1. By looking at the fits, we
can tell that the penalized spline fits the data better. However, one of the criteria that would
help to quantify the better fitting is to calculate the mean squared error (MSE). The smaller
the MSE, the better the fit is.
Table 3.1 summarizes MSE values of different fits relative to MSE of the penalized fit.
The MSE of a B-spline using 6 B-cubic spline fit is almost twice the MSE of the penalized
spline fit, however it is better than the other two parametric regression models as they have
higher MSE. So with small MSE together with better fit shown in the Figure 3.1, penalized
spline is the better way to fit the data than the other approaches. The mean squared error
of each fit relative to the penalized spline fit MSE summarized as follows.
Type of fit
MSE relative to MSE of penalized spline
Linear
10.2
Polynomial of order 5
2.4
B-Spline of order 6
1.9
Table 3.1: MSE of each fit relative to the MSE of penalized spline
46
0.70750
0.70740
0.70730
Strontium ratio
0.70720
Penalized Spline
6 B-Spline fit
Polynomial of order 5
Linear Fit
95
100
105
110
115
120
Age
Figure 3.1: Scatter plot, linear fit, polynomial of order 5, B spline of order 6 and penalized
spline
3.5 Bayesian Penalized Spline Smoothing
Consider the penalized spline smoothing model (3.6)
y = U β + Zγ + ,
where matrix U is a polynomial and Z is a set of truncated polynomial basis functions. Now
we treat β and γ as a random variables with prior distribution specified as
• β ∝ constant or β ∼ N(0, σβ2 Im+1 ),
• γ ∼ N(0, σγ2 IK ) ,
• ∼ N(0, σ2 In ),
with γ and are assumed independent. Now If we specify a full prior distribution to the
whole parameter set θ = (β, γ, σ2 , σγ2 ) in the model above, we will get a full Bayesian
47
penalized smoothing framework.
First we consider a prior distribution for β as uniform density p(β) ∝ constant as in
Berry, Carroll and Ruppert (2002) and the popular prior choice for σ2 , σγ2 are from the
family of inverted gamma distributions. This choice is due to the fact that inverted gamma
distributions are conjugate priors to exponential likelihood functions. So we can assume
σ2 ∼ IG (Θ , Γ )
σγ2 ∼ IG (Θγ , Γγ )
with the hyper parameters chosen small enough to get a non-informative proper prior.
Using the above specification of the prior distribution and Bayes’ formula, we can hierarchically derive the posterior distribution of δ = (β, γ) given (y; σ2 ; σγ2 ) as follows. First
the prior distribution for δ will be N(0, D) where


D=


0
0
0 σγ2 IK

.

Next, the posterior distribution can be derived as
p(β, γ|y; σ2 ; σγ2 ) ∝ p(y|β; γ; σ2 ; σγ2 )p(γ|σ2 ; σγ2 )p(β|σ2 ; σγ2 )
σ2
1
= exp − 2 ||y − Bδ||2 + 2 δ T Dδ
2σ
σγ
(
"
The expression
σ2
δˆ = (B T B + 2 D)−1 B T y,
σγ
minimizes the above expression so that by denoting
Σδ = (B T B +
σ2 −1
D) ,
σγ2
and
δˆ = Σδ B T y,
48
#)
.
(3.12)
we will get
oô
1 n
T −1
ˆ
ˆ
∝ exp − 2 [δ − δ] Σδ [δ − δ] .
2σ
ñ
p(δ|y; σ2 ; σγ2 )
(3.13)
It follows that,
ˆ σ 2 Σδ ).
δ|y; σ2 ; σγ2 ∼ N(δ,
(3.14)
The posterior distribution of σ2 and σγ2 can be derived the same way as
p(σ2 |y; δ; σγ2 ) ∝ p(y|δ; σ2 ; σγ2 )p(σ2 |δ; σγ2 )
Ç
å
n
1
∝ IG Θ + , Γ + ||y − Bδ||2 ;
2
2
p(σγ2 |y; δ; σ2 ) ∝ p(y|δ; σ2 ; σγ2 )p(σγ2 |δ; σ2 )
Ç
(3.15)
å
K
1
∝ IG Θγ + , Γγ + ||γ||2 .
2
2
(3.16)
Then the posterior distribution of f = Bδ, conditional on (y, σ2 ; σγ2 ) is Normal distribution with
E[f |y; σ2 ; σγ2 ] = E(Bδ) = BE(δ) = B δˆ
Cov[f |y; σ2 ; σγ2 ] = Cov(Bδ) = B T Cov(δ)B = σ2 B T Σδ B.
(3.17)
(3.18)
The corresponding spline smoother for the Bayesian penalized spline model will have the
form of
σ2
fˆ = B(B T B + 2 D)−1 B T y.
σγ
The ratio
σ2
σγ2
(3.19)
in (3.19) plays the same role as the smoothing parameter λ.
Numerical estimation is performed using iterative Markov Chain Monte Carlo (MCMC).
Since the conditional distributions of the variance terms are inverted gamma and δ is normal,
we can sample directly using Gibbs sampling. We thus estimate the smoothing parameter λ
as a ratio of the mode of the two inverted gamma distributions.
Next, we consider a case when the prior distribution for β is p(β) = N(0, σβ2 Im+1 ) with
large σβ2 as in Li and Yu (2005) and Krivobokova (2006). The prior distribution for δ will
be N(0, D) where
49


D=


σβ2 Im+1
0
0
σγ2 IK

.

The variance of the error term σ2 has an inverted gamma distribution σ2 ∼ IG (Θ , Γ )
and we use another inverse gamma prior for σγ2 as σγ2 ∼ IG (Θγ , Γγ ) with the hyper parameters
in these distribution assumed to be constants. The posterior distribution of δ = [β, γ] is
given by
1
σ2
∝ exp − 2 ||y − Bδ||2 + 2 δ T Dδ
2σ
σγ
"
p(δ|y; σ2 ; σγ2 )
!#
.
(3.20)
We note that unlike the uniform prior case above, this posterior distribution in the normal prior case is not standard. Numerical estimation is performed using iterative Markov
Chain Monte Carlo (MCMC) algorithm as follows. First, the conditional distributions of
the variance terms are inverted gamma and can be sample directly using Gibbs sampling.
However, the distribution of δ is not standard therefore the Metropolis Hasting algorithm
can be implemented, as in Hasting (1979). The smoothing parameter λ will be estimated as
a ratio of the mode of the two inverted gamma distributions.
50
Chapter 4
Extracting Smooth Forward Curve From Average
Based Contacts with a Bayesian Penalized Spline
4.1 Introduction
Electricity forward prices play a major role in risk management serving as an indication to the
future dynamics of the market. Hedging of any financial or physical position heavily depends
on these prices. Fleten and Lemming (2003) pointed out that they are also important
information carrier for operational and investment decisions.
However, electricity forward contracts that are traded on any power exchange are limited
in numbers and if they exist they are quoted to deliver one MW of electricity over a longer
period of time instead of fixed time to delivery. As a result practitioners, power marketers,
risk managers and power producing companies need to estimate a continuous forward price
curve that can best represent prices at a fixed time inside the settlement period.
Fleten and Lemming (2003) suggested a method for generating forward price curves
based on a combination of information from the bid ask market data and forecasts from
bottom up models. A constrained least square (LSQ) procedure that minimizes the squared
differences between the model variables and the bottom-up forecast values subject to the
bid-ask constraints is used. Moreover, a smoothing term is added to the objective function
to prevent large jumps in the forward curve.
Benth, Koekebakker and Ollmar (2007) proposed a method of extracting smooth forward
price curve by modelling the forward curve as a sum of a seasonal function and an adjustment
function. The seasonal function is assumed to be deterministic and it has a parametric form
and a maximum smoothness criterion is applied to the adjustment function to construct a
51
smooth curve.
The Fleten and Lemming (2003) approach is subject to the availability of bottom-up
models and numerical computation deficiencies tied to constrained optimization. Benth,
Koekebakker and Ollmar (2007) methodology assumes a parametric form for the seasonal
function and the available market data do not account determining the seasonal shape of
the curve. Although, in their paper they find that the form of the seasonal term makes little
difference to the estimate.
Another approach by Borak and Weron (2008) introduced the dynamic semiparametric
factor model (DFSM) for modelling electricity forward curves. The model employed factor
representation - a linear combination of nonparametric loading functions (linearized with
B-splines) and parameterized common factors. The authors compared their model with
Benth, Koekebakker and Ollmar (2007) and they noted that the model yields less pronounced
seasonality in the far end of the curve. They also compared their results with Fleten and
Lemming (2003) and found out that DFSM curves are smoother and less closely follow the
quoted prices. The authors suggested that to get more accurate in-sample fits, more loading
factors should be used at the cost of computational speed and universality (robustness).
Moreover, the misspecification of the loading function can be harmful.
To avoid the shortcoming that results in parametric modelling of the seasonal function
and minimization of an objective function that involves many parameters, we suggest a
fully non-parametric Bayesian spline approach by which the available monthly market data
decides the seasonal variation that is exhibited by electricity forward curves. Unlike other
approaches that depend on large sample data, the small number of market data available
will not be a problem for the Bayesian approach we employ. Using the quarterly and yearly
market data coupled with the seasonal shape decided by the monthly market data, the
proposed model can also be used to estimate forward price curves for longer term.
In addition, using the estimated smooth forward for a liquid hub like Mid-C and adding
52
a spread term to it, we can construct a smooth forward curve for markets with few samples
of market data such as COB. The positive spread between the two hubs will be incorporated
as an informative prior.
The chapter is organized as follows. Section 4.2 provides a brief review of the literature
relating to forward price model. A discussion on average based contracts is presented in
Section 4.3. The spline model setup for average based contracts is given in Section 4.4.
Section 4.5 presents detail on Bayesian penalized spline, MCMC algorithm and the market
data description. Numerical results of the Bayesian spline model on the market data and
the smooth graph are presented in 4.6. Long term forward curve construction is given in
Section 4.7.
Section 4.8 presents a two step estimation approach of constructing the COB forward
curve from the estimated Mid-C curve by adding a spread. The numerical result and statistical inferences about the spread term are given in 4.9. Evolution of both Mid-C and COB
forward curves over different trading days together with their fitted surfaces are provided in
4.10.
4.2 Forward Price Modelling
Commodity forward price models date back to Black (1976) where the time dynamics of future prices are described by a set of stochastic differential equations. Clewlow and Strickland
(1999) mentioned that most commodity pricing models fall in two categories: The first group
of models is based on short rate bond pricing models. Spot price models are thus derived
from these short rate models by incorporating additional terms that capture the behaviour
of a spot price. Then, the forward prices will be backed out from these spot prices. Lucia
and Schwartz (2002) proposed one and two factor models with time dependent volatility and
different seasonal functions using data from the Nord pool.
The second group of models is based on the Heath-Jarrow- Merton (HJM) models where
53
the entire forward curve dynamics are modelled instead of modelling the spot price. The first
Market models for forward prices are Black’s model (Black, 1976) where a single forward
contract is considered. Clewlow and Stickland (1999) , Koekabaker and Ollmar (2001) and
Carmona and Durrleman (2003) apply the idea of term structure modelling from Heath,
Jarrow and Merton (1992) to model the dynamics of the whole forward curve.
The Stochastic Differential Equation for the dynamics of the forward curve with maturity
T in these models is given by
n
X
dF (t, T )
= µ(t, T )dt +
σk (t, T )dWk (t),
F (t, T )
k=1
t ≤ T,
where W = (W1 , . . . , Wn ) is n-dimensional standard Brownian motion, and the drift term µ
and the n volatilities σk are deterministic function of the current date t and maturity T .
Such models have the advantage that the market can be considered as being complete
and standard risk-neutral pricing may be used. One of the drawbacks of these models is that
forward prices do not reveal information about the price behaviour on a fine granularity of
time scale such as hourly or daily. In addition, unlike the interest rate term structure models
with a fixed maturity T , electricity forwards have delivery periods.
These models are also heavily influenced by the input data used to calibrate the model
parameters. Moreover, these models need a longer history data which are not available
in electricity market due to the deregulation of these markets that has happened recently.
Instead of depending on a predetermined parametric model, we propose to use a fully nonparametric Bayesian approach by taking the observed forward prices from the market as a
driving force to determine different characteristics of the forward curve.
In electricity markets like Mid-Columbia where most of the electricity is generated from
hydropower, seasonality plays a major role in determining the forward prices. Heavy rain
and snow melt in the month of April, May and June coupled with the low demand in the
mild summer push down power prices in the second quarter of the year. Conversely, in the
winter season, high demand and low inflow reverses the power prices to a high level.
54
Forward Contracts for Mid-Columbia are traded over the counter at the Inter Continental
Exchange (ICE). These contracts have unique specifications in that the delivery of electricity
is over a period of time rather than a specific delivery time. These delivery periods can be
a month, quarter, year or long term. As a result, regular term structure forward models
with fixed maturity can not be used. As pointed out by Fleten and Lemming (2003), these
forward electricity contracts can be viewed as a portfolio of basic forward contracts each
with different times to maturity, one for each point in time during the delivery period.
4.3 Average Based Power Contracts
Consider a forward contract on a commodity over the delivery period [T1 , T2 ] with T1 < T2 <
∞. Let F (τ, T1 , T2 ) represent the price of a forward contract at time τ < T1 < T2 to deliver
1 MW of power between times of delivery T1 and T2 . Now, if the spot price of the power at
time t is S(t), then the payoff of the forward contract is
Z T2
T1
e−rt {S(t) − F (τ, T1 , T2 )}dt,
where r is the risk free interest rate. Since the premium to enter into this kind of contract
is zero, the arbitrage-free price F (τ, T1 , T2 ) yields that
F (τ, T1 , T2 ) =
Z T2
T1
w(r; t)E[S(t)|Fτ ]dt,
(4.1)
where
w(r; t) =
e−rt
1 −rT1
(e
− e−rT2 )
r
and E is the risk-neutral expectation as in Benth and Koekebakker (2005). Benth, Koekebakker and Ollmar (2007) showed that the forward price at time τ with delivery at a fixed
time t ≥ τ is
f (τ, t) = E[S(t)|Fτ ].
(4.2)
Substituting (4.2) into (4.1), we will get
F (τ, T1 , T2 ) =
Z T2
T1
55
w(r; t)f (τ, t)dt.
(4.3)
Power contracts are structured in such a way that the delivery of the MW is either
hourly, daily or monthly so we assume that the settlement is at discrete times t1 , t2 , . . . , tN
with t1 = T1 the start of the delivery time and tN = T2 is the end of the delivery time of
the proposed contract. This results in the above integration to be performed as summation
over this discrete time as
F (τ, T1 , T2 ) =
j=N
X
w(r; tj )f (τ, tj ),
(4.4)
j=1
with
e−rτj
w(r; tj ) = PN −rt .
k
k=1 e
We note that the average based forward contracts can be considered as fixed vs float swaps
because they are swapping the fixed price (forward price) with the floating price (spot price)
on the delivery periods. If the contracts are settled at the end of the delivery period T2 , then
w will have a form of w(r, t) = 1/N as noted by Benth, Koekebakker and Ollmar (2007).
If we set τ = 0 in (4.4), we have
F (0, T1 , T2 ) =
j=N
X
w(r; tj )f (0, tj ).
j=1
Without risk of ambiguity, we can drop 0 term in the expression and write the above expression as
F (T1 , T2 ) =
j=N
X
w(r; tj )f (tj ).
(4.5)
j=1
In this thesis we present a technique of computing a smooth forward curve f (t) based
on observed average based market forward prices F (T1 , T2 ) with delivery over the settlement
period.
We modelled the forward curve f (t) as a regression model
yi = f (ti ) + i
where y is observed market data , f is an unknown smoothing function that we wish to estimate and i ∼ N (0, σ 2 ) are zero-mean identical independent random errors with a common
56
variance σ 2 .
We consider Mid-C power market, where contracts are based on the average of the spot
prices of electricity over the delivery period. The delivery period could range from one
month to many years and we employ a Bayesian penalized spline approach to extract smooth
forward curves for a fixed delivery from the market observed forward prices of average based
contracts.
4.4 Model
Let Fi , i = 1, 2, . . . , n be the observed market forward price of power where delivery starts
at the first day of the ith month, Ti1 and ends the last day of the month, Ti2 . From the
previous section and (4.5), the model prices Pi of the average based contracts are related to
the forward curve f (tij ) through the expression
Pi =
j=N
X
w(r; tij )f (tij ).
(4.6)
j=1
where tij are delivery days of the ith month. We express the observed market forward price
Fi from the model price Pi by adding some noises (disturbances) i as
Fi = Pi + i ,
(4.7)
where i , i = 1, 2, . . . , n are assumed to be normal with zero mean and constant variance.
The forward curve f is modelled as a spline like (3.5) and we write f (t) = B(t)δ, where
B(t) is a vector of truncated polynomial basis function and δ is spline coefficient vector. So
using the m-degree truncated power basis, the forward f (t) will have the form
2
m
f (t) = β0 + β1 t + β2 t + . . . + βm t +
K
X
βm+j (t − kj )m
+,
j=1
with
δ = [β0 , β1 , β2 , . . . , βm , βm+1 , . . . , βm+K ]T ,
57
and
m
m
B(t) = [1, t, t2 , . . . , tm , (t − k1 )m
+ , (t − k2 )+ , . . . , (t − kK )+ ],
where (t − kj )+ = max{0, (t − kj )} and {kj }j=K
j=1 are K fixed spline knots. Power functions
are chosen because they are convenient and easy for numerical simulation however, other
basis such as B-spline as in de Boor (1978) or periodic spline can also be used.
With the above representation of the forward f , we can write the model price Pi as
Pi (δ) =
j=N
X
w(r; tij )B(tij )δ
(4.8)
j=1
The classical penalized spline minimizes the sum of the lack of fit of the data and a
roughness penalty measure in the form
n
1X
[Fi − Pi (δ)]2 + λg(δ)
min
n i=1
where Fi are the observed forward prices, Pi are the model prices from the penalized spline
model and λ is the smoothing parameter. Jarrow, Ruppert and Yu (2004) pointed out that
the roughness penalty measure λg(δ) = λδ T Dδ with D symmetrical and positive definite
penalty matrix minimizes the above penalized sum of squares errors criterion.
The positive smoothing parameter λ controls the trade off between the lack of fit term
and the roughness penalty measure. A larger λ shrinks the spline coefficients towards zero
and smaller λ puts more weight on the lack of term making the model errors small. So the
choice of λ will be performed to control the compromise between these two terms in the
minimization problem. However, in the Bayesian approach the roughness measure can be
treated as a prior information on the spline coefficient and the lack of fit term will be the
point of interest to estimate the spline coefficient with the observed prices.
58
4.5 Bayesian Penalized Smoothing
Consider the model (4.7) in matrix form
F =P +
where
∼ N(0, σ2 I),
P = [P1 , P2 , . . . , Pn ]T ,
F = [F1 , F2 , . . . , Fn ]T
where Pi is given by (4.8) and n is the number of months where market prices are observed.
Now we impose normal prior distribution on the polynomial coefficients with β0 , β1 , β2 , . . . βm
distributed as N(0, σβ2 ) and the knot coefficients βm+1 , βm+2 . . . , βm+K has a distribution of
N(0, σγ2 ).
Now if we assume σβ2 → ∞ so that the polynomial coefficients β0 , β1 , β2 , . . . βm are not penalized and the spline coefficients are penalized, the posterior distribution of the coefficients
is proportional to
n
σ2
1 X
exp − 2
[Fi − Pi (δ)]2 + 2 δ T Dδ
2σ i=1
σγ
(
"
Now if we set the smoothing parameter λ to be equal to
σ2
σγ2
#)
.
(4.9)
which is the ratio of the variances
as in Lindley and Smith (1972), the mode of the distribution (4.9) minimizes the above sum
of squared errors criterion, as shown in Ruppert and Carroll (1997).
The two variances which define the smoothing parameter are not given, so we use the
hierarchical Bayesian framework to estimate them. In Chapter 3, we presented how this step
by step Bayesian approach can be performed and we follow the same approach to choose
conjugate priors for the coefficient vector and the error variance.
Assume the prior distribution of the coefficient vector δ ∼ N(0, D), where


D=


σβ2 Im+1
0
0
σγ2 IK
59

.

The variance of the error term σ2 has an inverted gamma distribution σ2 ∼ IG (Θ , Γ ) and
we use another inverse gamma prior for σγ2 as σγ2 ∼ IG (Θγ , Γγ ) with the hyper parameters
in these distribution assumed to be constants. Using (3.12), the posterior distribution of
δ = [β, γ] is given by
1
σ2
∝ exp − 2 [F − P (δ)]T [F − P (δ)] + 2 δ T Dδ
2σ
σγ
"
p(δ|F ; σ2 ; σγ2 )
!#
.
(4.10)
The posterior distribution of σ2 and σγ2 can be obtained from (3.15) and (3.16) and we get
Ç
å
n
1
∝ IG Θ + , Γ + [F − P (δ)]T [F − P (δ)] ;
2
2
Ç
å
K
1
2
2
2
p(σγ |F ; δ; σ ) ∝ IG Θγ + , Γγ + ||γ|| .
2
2
p(σ2 |F ; δ; σγ2 )
Since the posterior distributions of σ2 and σγ2 are standard distributions, we directly can
sample from their respective distributions using the Gibbs sampling algorithm. However the
posterior distribution of δ is non standard and we adopt the Metropolis-Hasting algorithm.
To acquire a vague (diffuse) prior, we take the values of Θ , Θγ and Γ to be small and Γγ
to be large. As a result, the impact of the prior distribution on the posterior distribution will
be minimal. The smoothing parameter λ =
σ2
σγ2
is a bi-product of the hierarchical Bayesian
approach and this is one of the advantage of Bayesian penalized smoothing over other forms
of penalized spline smoothing.
4.5.1 MCMC Algorithm
The conditional distributions of the variances are standard distributions, which are Inverted
Gamma and samples can be drawn directly. However, the coefficient vector δ has a nonstandard distribution and we apply the Metropolis-Hasting algorithm to draw samples. The
Markov Chain Monte Carlo (MCMC) algorithm is presented below.
î
ó
Step 1. First, at the initial iteration of the chain set initial values δ (0) = β (0) , γ (0) .
Step 2. At iteration m, obtain a new value σ2 from Inverted Gamma distribution
Ç
å
ó î
ó
n
1î
(m−1) T
(m−1)
IG Θ + , Γ + F − P (δ
) F − P (δ
) .
2
2
60
Step 3. Draw σγ2 from Inverted Gamma distribution
Ç
å
K
1
IG Θγ + , Γγ + ||γ (m−1) ||2 .
2
2
Step 4. Move the chain to a new value φ generated from the density q(δ (m−1) ) ∼ N(δ (m−1) , Σ).
Step 5. Evaluate the acceptance probability of the move α(δ (m−1) , φ) given by
®
´
q(φ, δ)
α(δ, φ) = min 1,
.
q(δ, φ)
Step 6. Generate an independent uniform quantity u and
• If u ≤ α then the move is accepted and δ (m) = φ.
• If u > α then the move is rejected and δ (m) = δ (m−1) .
4.5.2 Mid-Columbia (Mid-C) Forward Prices Data
Inter Continental Exchange (ICE) is the leading network of regulated exchanges and clearinghouses for financial and commodity markets www.theice.com. Monthly electricity forwards
are traded on ICE. A typical ICE electricity contract for Mid-C market has the following
specification
Description - A monthly cash settled Exchange Futures Contract based upon the mathematical average of daily prices calculated by averaging the peak hourly electricity prices published by ICE for the location specified in Reference Price.
Reference Price - “ELECTRICITY-MID C PEAK-ICE” - the price for a Pricing Date
will be that day’s Specified Price per MWh of on-peak electricity for delivery
on the Delivery Date, stated in U.S. Dollars, published by ICE at www.theice.
com, under the headings“Market Data: Indices: Market: ICE OTC: Report:
North American Power: Hub: Mid C Peak” or any successor headings, that
reports prices effective on that Pricing Date.
61
Pricing Date - Each Monday through Saturday, excluding NERC holidays, that prices are
reported for the Delivery Date.
Our empirical analysis is based on forward prices of the above specified contract. The market
data is for the monthly forward contracts delivery between January 2010- December 2010 of
Mid-C Peak hours from December 01, 2009.
4.6 Numerical Results
We apply a Bayesian penalized spline smoothing approach to the market data to estimate
the smooth forward curve that can represent forward prices of fixed time delivery in the
settlement period. The knots for the spline are chosen as 12 equally spaced knots located
at the delivery months {Ti }ni=1 . The knots chosen as the automatic quantile knots used
in penalized spline by Ruppert and Carroll (1997) also considered and we found out that
the knots at the delivery month fit the data better than the automatic quantile knots. A
quadratic spline of degree 2 is enough to smooth the data.
For MCMC, a burn-in phase of 2000 simulation draws was enough to achieve the convergence and the last 2/3 of the draws were used for calculating the mode of the distribution. In
the above Bayesian mixed models, the estimates of the variance components are sensitive to
the prior specification. So, after experimenting different values of the hyperparameters, we
considered Θ ∼ IG (0.001, 0.001) & Γγ ∼ IG (0.001, 1000). In general, the hyperparameters
has to be relatively small so that the posterior distribution will not be affected by the prior
assumption. The fitted smooth curve which represents the daily forward prices is shown in
Figure 4.1
We also report the mean squared error (MSE) of the estimated fitted curve calculated as
PN
MSE =
t=1 (ft
− fˆt )2
N
where N is the number of months in a year and ft are observed monthly forward prices and
62
65
60
For w ar d Pr ic e ( $/MW h)
55
50
45
40
35
30
25
Dec
Jan
Feb
Mar
Apr
May
Jun
Jul
Aug
Sep
Oct
Nov
Dec
Jan
Feb
De live r y Mont hs
Figure 4.1: Mid-C estimated smooth forward curve (in red) and actual forward prices quoted
on December 01, 2009 (in green) for delivery between January 2010 - December 2010
fˆt are monthly average prices calculated from the smooth forward curve estimated using
Bayesian penalized spline model. The MSE of the fit shown in Figure 4.1 is 0.0035. This
value is too small which shows the smooth curve matches the monthly prices and can be
considered as an arbitrage free curve. The small MSE coupled with the shape of the curve
which captures the monthly seasonality perfectly makes our modelling approach theoretically
and fundamentally sound.
4.7 Long Term Forward Curve using Bayesian Penalized Spline
Most monthly electricity forward contracts are traded (or quoted) for about a year. Beyond
one year, market prices are obtained in a quarterly basis so instead of 12 prices for one year,
we will have only 4 prices. However, we can extend forward curves beyond one year by fitting
a smooth curve to the available market data using Bayesian penalized spline smoothing. The
63
approach is exactly the same as the previous section except we tune the spline basis matrix
by extending the number of basis and knots to cover the future prices.
Consider F = [F1 , F2 , . . . , Fn1 ]T as market data for n1 months and assume ∼ N(0, σ2 I).
Let F2 and 2 be observed future price values and let F ∗T = [F , F2 ]T and ∗T = [, 2 ]T hold
the observed monthly prices and the future quarterly or yearly prices.
Suppose that we wish to extend the curve to n2 months into the future. The smoothing
can be done by computing the new spline smoothing matrix B ∗ for n1 +n2 months. This can
be done by extending the set of knots used to compute B - a basis to smooth the observed
prices F , and adding a new basis to forecast n2 forward prices.
It follows that B ∗ will have the following form

B∗ = 



B
0
B(1) B(2)

.

Since more knots and basis functions are added to include long term prices, we need to
construct a new penalty matrix D ∗ as

D∗ = 



D
0
D(1) D(2)

.

Then the form of the smoothing spline matrix B ∗ enables the penalized smoother to be
written as
T
Sλ = B ∗ (B ∗T B ∗ + λD ∗ )−1 B ∗ ,
(4.11)
and the long term prices Fˆ ∗ can be found by
T
Fˆ ∗ = B ∗ (B ∗T B ∗ + λD ∗ )−1 B ∗ F ∗ ,
(4.12)
which we can make use of the quarterly forward prices beyond the first year as a future
values F2 for F ∗ as in Currie (2004) and Lee and Durb´an (2012).
Next, we pass the new basis matrix and penalty matrix into our Bayesian penalized spline
approach presented in Section 4.5. We use the Mid-C 12 months market data for delivery
64
between January 2010 to December 2010 and we use the quarterly prices from the market
for periods between 2011-2012. The fitted smooth curve for the long term contract obtained
by fitting the available monthly and quarterly observed prices using 36 knots located at the
delivery months and a truncated power basis of degree 2 is presented in Figure 4.2. The
calculated monthly average prices are also shown.
70
65
For war d Pr ic e ( $/MW h)
60
55
50
45
40
35
30
2010
2011
2012
2013
De live r y Dat e
Figure 4.2: Mid-C observed forward prices on December 01, 2009 delivery between 2010-2012
(in green), Mid-C estimated smooth forward curve (in blue) and calculated monthly averages
from the estimated smooth curve (in red).
4.8 Estimating COB Smooth Forward Curve
We have estimated a smooth forward curve of Mid-C electricity forward contracts delivery
over a period using Bayesian penalized spline. Next, we provide a Bayesian estimation of
the California Oregon Border (COB) pricing hub with informative prior specification on the
parameters. Since COB is illiquid hub, the number of forward prices quoted is very small.
So we need to estimate the COB smooth forward curve by taking advantage of the liquidity
65
of Mid-C.
Jarrow, Ruppert and Yu (2004) constructed the corporate term structure by adding a
spread to the estimated Treasury term structure due to the small number of corporate bonds
observed in the market. So,
fcorporate (t) = fTreasury (t) + spread
Li and Yu (2005) pointed out that Treasury bonds are backed by the US government who
has a higher credit grade than any corporate debt, a positive premium should be added to
adjust the risk of default by a corporate debt. As a result, a positive constant prior from a
uniform distribution is assumed for the spread.
We follow analogous approach to Jarrow, Ruppert and Yu (2004) and Li and Yu (2005)
in electricity market perspective and we model the the COB forward curve fCOB (t) from
Mid-C forward curve fMid-C (t) by adding a spread term:
fCOB (t) = fMid-C (t) + spread,
where the spread assumed to be either a constant, linear or quadratic spread as in Li and
Yu (2005). In this section, we consider the constant spread case so that
fCOB (t) = fMid-C (t) + γ0 = δCOB B(t),
where
δCOB = [β0 + γ0 , β1 , β2 , . . . βm , βm+1 , . . . , βm+K ]T .
When we move power from Mid-C to COB, the price of this power at COB is always
Mid-C power prices plus additional cost for transmission line and transmission line losses.
So the power prices at COB is always higher than the Mid-C prices. As a result of this, we
incorporate prior subjective information about the positive spread into the Bayesian model
as informative priors. Following the same approach as Jarrow, Ruppert and Yu (2004) and
66
Li and Yu (2005), we assume the spread of the two hubs is between 0 and a positive constant
prior to estimation, where the prior for the spread is assumed to have a uniform distribution
on (0, s):
p(γ0 ) ∝ I[0,s] (γ0 ).
The posterior distribution for the constant spread γ0 is:
p(γ0 |FCOB , σ2 )
PNCOB
1
∝ I[0,s] (γ0 ) exp −
(FCOB − δCOB B(t))2
σ2
!
(4.13)
where NCOB is the number of forward months where market prices are observed for COB.
This distribution is nonstandard and we implement the Metropolis-Hasting algorithm.
The posterior distribution of the error variance is an Inverted-Gamma distribution and
can be written as
p(σ2 |FCOB , γ0 , A , B )
NCOB
∼ IG
+ A ,
2
PNCOB
1
(FCOB − δCOB B(t))2
+ B .
2
!
(4.14)
To fully implement the Gibbs sampling and Metropolis-Hasting steps, we need to pick the
value of s to complete the specification of the prior. This can be done either using historical
analysis or taking a prior information from the market. We take s to be 8 before estimation
as the historical analysis shows the spread between Mid-C and COB is on average $5 and
can rise as high as $8. However, this assumption can be tweaked based on any subjective
information.
4.9 Numerical Result
We apply Gibbs sampling algorithm to sample the error variance term σ2 and MetropolisHasting steps for the spread γ0 using the Mid-C monthly forward prices quoted on December
01, 2009 for delivery between January 2010 - December 2010. To improve the shape of the
67
curve, we incorporate the available market data for COB on the same day. These prices are
6 in number: three of them are monthly prices quoted for January, February and March
2010 and the other three are for the second, third and fourth quarter. The knots are chosen
65
For war d Pr ice ($/MWh)
60
55
50
45
40
35
30
Jan
Feb
Mar
Apr
May
Jun
Jul
Aug
Sep
Oct
Nov
Dec
Jan
Feb
Deliver y days between Jan 2010-Dec 2010
Figure 4.3: Smooth forward price curve by Bayesian penalized splines on Mid-C and COB
forward contract prices on December 01, 2009. The Mid-C smooth curve (in dark), COB
smooth curve estimated by adding a spread to Mid-C(in blue), COB observed forward prices
(in green), monthly averages of COB forward prices constructed from the smooth curve (in
red)
as equally spaced and located at the delivery months. A spline of degree 2 is sufficient to fit
the data. A burn in phase of 6000 simulation draws was adequate to guarantee convergence
and two-third of the draws were used to calculate the estimate and its posterior interval.
Figure 4.3 presents the fitted forward price curves for both Mid-C and COB average based
power forward contracts on December 01, 2009. The observed prices for COB are flat prices
for quarters 2,3 and 4.
68
4.9.1 COB Long Term Smooth Curve
The observed prices for COB is few in number. There are two annual forward prices for 2011
and 2012. However we can again make use of the available MidC market data to estimate the
long term curve of COB. First we extend the knots and spline basis function to capture all
the futures months. Second, we estimate the Mid-C long term forecast as we did in Section
4.7. Finally we add a spread and we estimate the long term forward curve of COB.
70
65
For w ar d Pr ic e ( $/MW h)
60
55
50
45
40
35
Q4
Q1
Q2
Q3
Q4
Q1
Q2
Q3
Q4
Q1
Q2
Q3
Q4
Q1
De live r y Dat e s
Figure 4.4: COB observed forward prices on December 01, 2009 delivery between 2010-2012
(in green), COB estimated smooth forward curve (in blue) and calculated monthly averages
from the estimated smooth curve (in red).
Figure 4.4 presents the observed forward prices on December 01, 2009 for delivery Jan
2010- Dec 2012. The estimated long term smooth curve for COB obtained by adding a
spread to Mid-C curve and the calculated monthly average prices are presented. We can see
that the estimated smooth curve gives seasonal shape to the flat annual price at the back
end.
69
4.9.2 Statistical Inference
Statistical inference about the spread term γ0 can be done without the assumption of large
samples.The Bayesian paradigm allows us to obtain the posterior interval, a 100(1 − α)
credible interval from MCMC sample draws.
500
450
400
350
300
250
200
150
100
50
0
3.5
4
4.5
5
5.5
6
6.5
Figure 4.5: Histogram of after burn in samples from the posterior distribution of the spread
γ0 and its 95% Bayesian posterior intervals
The estimate value of γ0 together with the 95% Bayesian credible interval is summarized
in Table 4.1. We noticed that there is a 95% probability that the spread value lies on the
Bayesian estimate Bayes Credible Interval
4.9887
(4.3551, 5.6636)
Table 4.1: Bayesian estimates of the spread term and its 95% Bayes posterior interval
interval (4.3551, 5.6636) and Bayesian estimates the spread value to be $4.9887 which is close
to the average spread value of $5 between the two markets on December 01, 2009.
70
4.10 Evolution of Mid-C and COB Forward Curves over Trading Days
Next, we collected 22 trading days monthly forward contract prices quoted from December
01, 2009 - December 31, 2009 for delivery between January - December 2010. We applied the
Bayesian Penalized smoothing 22 times independently to the 22 days monthly forward data
sets. This is the assumption that forward prices F (t, T1 , T2 ) and F (t+∆, T1 , T2 ) are independent for any trading days t with ∆t = 1 day. The fitted surface constructed independently
for these market price data is presented in 4.6.
65
For w ar d Pr ic e s ( $/MW h)
60
55
50
45
40
35
30
25
25
20
15
10
Tr ading Day s ( De c 01-31, 2009)
5
0
0
50
150
100
200
250
300
350
400
De live r y dat e s b e twe e n J anuar y - De c e mb e r 2010
Figure 4.6: Fitted surface for Mid-C forward prices traded between December 01- December
31, 2009 for delivery on January 2010-December 2010
From the smooth surface 4.6, we can see a strong seasonality over the year. In the second
quarter which is the month of April, May and June forward prices are very low because of
the excess rain in the state of Washington that lifts the water level up in the hydro dams
which results in excessive hydro power from these dams. This fact coupled with a mild
summer with relatively low power demand in the second quarter brings the power price at
an annual low. In the third quarter which is the month of July, August and September, the
71
hot summer pushes the demand high which results in higher power prices. Winter months
are considered as a low hydro season and the cold weather in the winter result in higher
prices.
We also notice that forward prices over different trading days follow the same seasonal
shape except a small adjustment. The forward curve can move up or down based on new
information coming into the market such as weather forecast, short term outage, transmission
failure, water spill from dams and soon.
58
56
For w ar d Pr ic e ( $/MW h)
54
52
50
48
46
44
42
28/11/09
03/12/09
08/12/09
13/12/09
18/12/09
23/12/09
28/12/09
02/01/10
C ale nde r t ime
Figure 4.7: Forward curve change for Mid-C forward prices traded between December 01December 31, 2009 for delivery on January 01, 2010
To explore the term structure at a given delivery time, we can fix the delivery time T
to T0 and we can see the development of f (t, T0 ) for a given T0 . This is understood as a
stochastic process and is very useful in risk management because it shows the daily volatility
of the contract. Figure 4.7 shows the forward curve development of a contact with delivery
on January 01, 2010, which is basically the left edge of the surface in Figure 4.6.
We also collect the available market data for COB over those trading days. By adding
72
a spread to each of the estimated Mid-C curve and applying the Bayesian penalized spline
smoothing 22 times independently, we are able to see the evolution of COB forward prices
over different trading days. The fitted surfaces over different trading days for both Mid-C
and COB are provided in Figure 4.8.
70
65
Forward Prices ($/MWh)
60
55
50
45
40
35
30
25
25
20
15
10
5
Trading Days Dec 01ï31, 2009
0
0
50
100
150
200
250
300
350
Delivery Days between Jan 01ïDec 31, 2010
Figure 4.8: The fitted forward price curves for COB (upper sheet in red) and Mid-C (lower
sheet in blue) over the 22 trading days between December 01- December 31, 2009 for delivery
on January 2010-December 2010. A constant spread is added to obtain the COB smooth
forward price curves from Mid-C curves
4.11 Financial Implications
Energy companies are subject to various sorts of risk, including operational, credit, regulatory
and financial risk. Continuous smooth forward curves constructed from observed market
prices will have important implications for the practice of risk management in the electricity
markets. In terms of managing risks, the main usage of these curves can be grouped into
three main categories.
73
400
4.11.1 Financial reporting preparation
Energy companies prepare financial statement every quarter and at the end of each year.
They use forward curves as inputs to derivative models to calculate the fair value of financial
instruments carried on the balance sheet. However, miscalculating or misrepresenting these
price inputs may expose the company to a regulatory risk which results the company at
risk of investigation in the future or penalty. So having a continuous smooth forward curve
provides a more accurate mark to market (fair value) of any financial position.
4.11.2 Asset valuation
Since the operating characteristics of electricity is more granular and seasonal than available
market quotes, using shaped curves would provide the company with a better estimate of
the asset’s value. With a better estimate of the asset’s value, the company would be in a
better position to plan for future investment or manage the asset’s net risk.
4.11.3 Risk management and reporting.
Energy companies use Value-at-Risk (VaR) as one of their risk metrics to measure a financial
risk. However, to accurately calculate VaR, the model needs forward curves that are seasonally adjusted and in fine granularity. In addition, trading desk position limits monitoring
process also needs these smooth forward curves for risk reporting purpose.
4.12 Summary
Representing forward prices by a continuous smooth curve is crucial for pricing electricity
derivatives, hedging financial and physical positions and also mark to market valuation. In
this chapter we have suggested a new approach for generating forward price curves based
on a Bayesian penalized spline smoothing approach applied on the observed market data.
We presented the methodology in detail and also showed how to implement the proposed
74
technique using the Mid-Columbia power pricing hub market data. The ease and capability
of Bayesian approach to work with a small number of market data to construct a smooth
forward price curve makes our approach computationally tractable.
Due to the small number of market data available, we model the California Oregon
Border (COB) forward prices using the observed market prices from the liquid hub (MidC) by adding a spread term that is assumed to have a uniform prior. The positive spread
between Mid-C and COB is a good subjective information served as a prior in Bayesian
framework. The small MSE of the fit together with seasonal shape of the curves captured
well makes our approach theoretically as well as fundamentally sound.
75
Chapter 5
Estimating Electricity Spot Price with Bayesian
Penalized Spline
5.1 Introduction
Deregulation of the electricity markets across the world creates an opportunity for a competitive market where electricity prices are no longer controlled by regulators but determined
by supply demand balance and are organized by power trading exchanges or independent
system operators. In Alberta, prices are kept in balance and system security is maintained
on a second by second basis by Alberta Electricity System Operator (AESO). However, the
non-storability or very expensive to store nature of electricity results in prices to be sensitive
to demand and supply imbalance which results in spot prices to exhibit unique features such
as strong seasonality, extreme spikes and mean-reverting behaviours.
A significant increase in price variations after deregulation opened the door for new power
based financial products to mitigate the risks incurred due to fluctuation of prices in both
physical and financial markets. Pricing these energy contracts need models to capture the
stylized properties of spot prices. Significant contributions have been made by Schwartz
(1997) by introducing an O-U type of model to capture the mean reversion behaviour of
electricity spot prices. Lucia and Schwartz (2002) extend these models to two factor model
where the second factor accounts for a deterministic seasonal component.
The spot models mentioned above explain the mean reverting and seasonal behaviour of
electricity prices but not the spike observed in the spot market. One way of solving this issue
is to incorporate a jump term in the above models. Clewlow and Strickland (2001) introduced
a model that accounts for mean reversion and spike, although they did not provide closed
76
form solutions for the forward. Cartea and Figueroa (2005) presented the closed form for
the forward and also showed the difficulty of calibrating the spot price models to historical
spot prices.
Modelling the deterministic seasonal function is first suggested by Pilipovi´c (1998) by a
sinusoidal function and Knittel and Roberts (2001) using a constant piece-wise function and
Lucia and Schwartz (2002) as a function between weekdays and monthly seasonal component.
Cartea and Figueroa (2005) proposed modelling the deterministic seasonal function by fitting
the available monthly averages of historical data with a Fourier series of order 5.
However, the above modelling approaches are fully parametric and rely on the calibration
of parameters involved in the model using historical spot prices. Cartea and Figueroa (2005)
pointed out that as the number of parameters to be calibrated grows to make the model
more sophisticated, it creates problem when dealing with markets with not enough data and
it adds difficulty and instability to the spot model calibration.
Instead of using parametric form of modelling seasonal function, we propose a fully
nonparametric approach using penalized spline smoothing approach by which the seasonal
function is modelled as a periodic spline. This approach avoids at least two difficulties
associated with calibration: first since it is nonparametric, we don’t need to calibrate any
parameters and second unlike the regular modelling approaches that rely on large number
of data, small number of available market data will not be an issue.
The main contributions of this chapter can be seen in two different ways. First, we
present a spot price model that explains the mean reverting, seasonality and extreme spike
behaviours of electricity prices and employ a Bayesian parameter estimation approach to
calibrate the model parameters to the Alberta power market. Second, we employ a penalized
spline modelling approach to the deterministic seasonal component of spot price models. This
modelling approach avoids the difficulty associated with calibration of model parameters. In
addition, we do not impose any parametric form to model the seasonal component instead
77
we let the market data to determine the form of the function.
The rest of the chapter is organized as follows. Historical spot prices from Alberta
electricity market from January 01, 2011 to December 31, 2012 is provided in Section 5.2.
Detailed data analysis such as tests for normality, seasonality and spike of the spot prices
are also presented. Different spot price models and a short literature review on these models
are summarized in Section 5.3. In Section 5.4, we present the spot price model and a
brief overview of penalized spline modelling approach to construct the deterministic seasonal
function.
Section 5.5 discusses a simulation study to estimate the seasonal function using the
monthly averages of the historical spot prices. MCMC algorithm employed to calibrate
the O-U model parameters are discussed in Section 5.6. Numerical results are summarized
in tables and resulting graphs are also shown. Finally in Section 5.7, we provide a short
summary of the results of the chapter.
5.2 Historical Spot Prices
Electricity markets show a strong evidence of spikes and mean reversion in spot prices,
which are not common in other markets. Spikes and mean reversion are very pronounced in
the hourly spot markets. The prices revert around an equilibrium mean level with unusual
periods of volatility. These periods of volatility accounts for the spike behaviour observed in
Alberta power market.
We can simply see these unique features from the market data. Figure 5.1 shows hourly
spot prices and daily averages. The historical data are taken from the hourly pool price
from Alberta Electric System Operator (AESO www.ets.aeso.ca) ranging from January
01, 2011-December 31, 2012.
78
Alberta Spot Price ï Hourly
Spot Price ($/MWh)
1500
1000
500
0
0
2000
4000
6000
8000
10000
12000
14000
16000
18000
Hours
Alberta Spot Price ï Daily
Spot Price ($/MWh)
1500
1000
500
0
0
100
200
300
400
500
600
700
Days
Figure 5.1: Alberta historical spot price from January 01, 2011 - December 31, 2012
5.2.1 Non-Normality of Electricity Spot Prices
The regular Black Scholes model which assumes prices to be log-normally distributed or
equivalently saying the returns of the prices have a normal distribution will fail in electricity
markets. The reason is that unexpected or rare events such as spikes in the case of electricity
market occur most often than predicted by the normal distribution. This can be easily verified by plotting the quantilie-quantile (QQ) plot of the returns. For a normally distributed
data, the QQ plot shows that the data form a straight line.
79
800
Figure 5.2 shows the QQ plot of these hourly returns and it shows that there are so
many data points that deviates from the straight line and this simply explains the extreme
non-normality nature of the distribution. The fat tailed property of the distribution is clear
that there is a probability different from zero with returns higher than value of 0.7 instead
of zero if they were perfectly normal as shown on the straight line.
0.999
0.997
0.99
0.98
0.95
0.90
Probability
0.75
0.50
0.25
0.10
0.05
0.02
0.01
0.003
0.001
ï6
ï4
ï2
0
2
4
6
Returns
Figure 5.2: Normality test of Alberta historical hourly spot price returns from January 01,
2011 - December 31, 2012
80
8
5.2.2 Seasonality of Electricity Spot Prices
One of the main assumptions of the regular Black Scholes model is returns are assumed to
be independent. However, in electricity spot market this assumption is no longer valid. This
can be verified by testing the autocorrelation structure of the returns. The periodic nature of
the autocorrelation function shown in Figure 5.3 suggests that there is a strong seasonality
in electricity markets. Moreover, the lags between these highly correlated points which in
these case is 24 hours shows an intraday seasonality is more pronounced in Alberta.
Autocorrelation Coefficient
0.8
0.6
0.4
0.2
0
ï0.2
0
50
100
Lag (Hours)
Figure 5.3: Autocorrelation test of Alberta spot price returns from January 01, 2011 December 31, 2012
81
150
5.2.3 Spikes of Electricity Spot Prices
Electricity spot prices exhibit extreme jumps which is mostly triggered by different reasons
such as unexpected increase in demand due to extreme weather or shortfall of supply due to
unplanned outage of power plants. These extreme price spikes however cannot be predicted
by a normal distribution as it is shown above.
We study the frequency of these spikes using the historical spot prices by repeatedly
counting price returns and filtering them out from the time series data if they fall beyond
three standard deviations. Figure 5.4 shows the cumulative distribution of the spikes that
are filtered out from the historical data and the QQ plot of the remaining returns after taking
the spikes out.
Percentage of Jumps
Cumulative Distribution of Number of Jumps in Alberta Spot Price Returns
0.12
0.1
0.08
0.06
0.04
0.02
0
2
4
6
8
10
12
14
Iteration
Probability
Normality Test for Alberta Hourly Spot Price Returnsï Jumps Removed
0.999
0.997
0.99
0.98
0.95
0.90
0.75
0.50
0.25
0.10
0.05
0.02
0.01
0.003
0.001
ï0.5
ï0.4
ï0.3
ï0.2
ï0.1
0
0.1
0.2
0.3
0.4
0.5
Returns
Probability
Normality Test for Alberta Hourly Spot Price Returns
0.999
0.997
0.99
0.98
0.95
0.90
0.75
0.50
0.25
0.10
0.05
0.02
0.01
0.003
0.001
ï6
ï4
ï2
0
2
4
6
8
Returns
Figure 5.4: Spikes in Alberta spot market from January 01, 2011 - December 31, 2012
82
It is shown that about 11% of the returns are considered price spikes in Alberta spot
market from January 2011- december 2012. This percentage is quite astonishing comparing
to other markets such as UK and Wales and Nord pool where the number of jumps per year
is between 8 and 9 as calibrated by Cartea and Figueroa (2005). After leaving these price
spikes out from the data series, the QQ plot shows that returns become close to normally
distributed.
5.3 Spot Price Modelling
Our spot price modelling approach follows the same approach as Lucia and Schwartz (2002).
The spot price St can be modelled as an exponential function of the sum of an O-U process
Xt and a deterministic seasonal function f (t). So, the spot price St is written as
dXt = α(µ − Xt ) dt + σ dWt
St = exp(f (t) + Xt ),
(5.1)
(5.2)
where W is a standard Brownian motion and σ is a volatility and α is the speed of mean
reversion.
Our approach thus apply a Bayesian paradigm to calibrate the parameters (α, µ, σ) involved in the stochastic part of the model Xt in (5.2). We also employ Bayesian penalized
spline approach to model the deterministic seasonal function f (t) in (5.2).
5.4 Spline Modelling of the Seasonal Function
One of the drawbacks of modelling the seasonal function using the parametric form we
mentioned above is that it is very difficult to estimate the parameters given the large number
of parameters coupled with a limited number of market data. Clewlow and Strickland (2000)
and Knittel and Roberts (2001) pointed out this shortcoming and Cartea and Figueroa (2005)
proposed a maximum likelihood estimator (MLE) approach to estimate all the parameters
83
involved from historical data. However, the MLE approach employed on England and Wales
market data yields incorrect estimate, i.e negative values for certain parameters that should
otherwise to be positive and the methodology heavily depends on the initial values of the
parameters. They believed that the issue is mainly due to small number of data in the
market.
Our approach instead employ a Bayesian penalized smoothing approach where the seasonal function is modelled by a periodic spline function where parameter estimation is not
an issue. Moreover, unlike the frequentist approach where large number of data is a requirement, Bayesian approach works well with the small number of data available in the
market.
To estimate the seasonal function, we first need to remove the seasonal function from the
logarithm of historical spot prices as
Xt = ln St − f (t),
and approximation of the seasonal function f (t) from the historical spot prices can be done
by minimizing
min || ln St − f (t)||2 .
To solve the above minimization problem, we model the seasonal function f (t) as a spline
and we write f (t) = B(t)δ, where B(t) is a vector of Fourier series basis function and δ is
spline coefficient vector.
The parameter vector δ are chosen to minimize the squared errors between observed
monthly average prices and the seasonal function
min
1X
[ln Sti − f (ti )]2 ,
n i
which can be solved by Bayesian penalized spline smoothing approach discussed in Chapter
3.
84
If we let Y = [ln St1 , ln St2 , . . . , ln Stn ]T be the monthly average log spot prices observed
in the market, the classical penalized spline minimizes the sum of the lack of fit of the data
and a roughness penalty measure as
1X
[Y − B(t)δ]2 + λg(δ)
n
where B(t)δ is the seasonal function represented as a periodic spline, λ is the smoothing
parameter. The roughness penalty measure g(δ) = δ T P δ with P is a penalty matrix, which
is assumed to be symmetric and positive definite as in Eilers and Marx (1996).
5.5 Extracting Smooth Seasonal Function
The seasonal function f (t) is a deterministic function which represent the observed seasonal
behaviour in the spot market. From the autocorrelation test shown in Figure 5.3, we noticed
that there is a strong seasonality that occurs in a electricity spot prices. So our approach
focus on constructing the seasonal function using the monthly averages of the spot prices.
We apply penalized spline smoothing approach to the monthly average spot price data
to estimate the smooth seasonal function that can best represents the monthly seasonal
behaviour observed in electricity prices. The monthly average prices that are constructed
from the hourly spot prices to be used by the penalized spline smoothing for the extraction of
the smooth seasonal function are given in Table 5.1. The knots used for the spline smoothing
are chosen as 12 equally spaced knots located at the months. Assuming the seasonal function
is periodic, a periodic spline of period 12 months is considered to smooth the data. The
fitted smooth seasonal function together with the historical monthly average prices is shown
in Figure 5.5
85
Month Price
01
81.80
02
83.06
03
49.80
04
46.96
05
30.88
06
60.60
07
64.80
08
91.45
09
103.48
10
80.56
11
97.83
12
54.44
Table 5.1: Monthly average of spot prices from January 2011- December 2012
110
Mont hly A ve r age Pr ic e ( $/MW h)
100
90
80
70
60
50
40
30
0
1
2
3
4
5
6
7
8
9
10
11
Mont hs
Figure 5.5: Smooth seasonal function and Alberta monthly average spot prices
86
12
5.6 Bayesian Parameter Estimation of the Spot Price model
Bayesian parameter estimation methodology has been widely used in Finance. Gray (2001)
employed the approach for estimating the parameters of a continuous time financial models
such as Black-Scholes option pricing model and Vasicek’s model of the short-term interest
rate. Though these models are relatively simple models where numerical computation is
not necessary and analytical solution is possible, the author shows the ease with Bayesian
numerical computation leads to a wide spread application in Finance.
Cartea and Figueroa (2005) pointed out that a maximum likelihood estimator (MLE)
approach to estimate parameters involved in Spot price model yield incorrect estimate. They
conclude that the issue is mainly due to small number of data in the market and the approach
depends on the initial values of the parameters. We however apply the Bayesian parameter
estimation which can work for small number of market data and does not depend on the
initial estimate os the parameters.
In order to estimate the model parameters, the estimated seasonal function f (t) can be
removed from the historical daily prices and we are left with pure stochastic OU process Xt
Xt = ln St − f (t),
where Xt is described by
dXt = θ(µ − Xt ) dt + σ dWt .
Now, we can perform the task of estimating the parameters of the mean reversion OU
process from the log de-seasonalized time series data. This can easily be done using the
Bayesian estimation techniques and MCMC algorithm discussed briefly in Chapter 2. The
summary of parameter estimation is presented below but one can also refer Chapter 2 for
detail presentation of the procedure.
Solving the above SDE using Ito formula, we can show that
Xt = µ + (x0 − µ)e−θt + σ
87
Z t
0
e−θ(t−u) dWu ,
and it follows that
å
Ç
Xt ∼ N µ + (x0 − µ)e
−θt
,
σ2
(1 − e−2θt ) .
2θ
So the given O-U process (5.2) has a normal transition density and its likelihood function
is from an exponential family of distributions. Assuming a flat prior p(σ 2 ) ∝
1
,
σ2
which cor-
responds to IG(0,0), and the fact that the inverse gamma (IG) is conjugate prior to normal,
we have a conditional density p(σ 2 |X, µ, θ) from Inverted Gamma family and p(µ|X, σ 2 , θ)
from Normal family. However, the conditional density of θ is non-standard. So assuming a
flat prior on θ in combination with the likelihood, we can take the candidate density for θ
from Normal family.
5.6.1 Calibration of Parameters
The conditional distributions of σ and µ are standard distributions, which are Inverted
Gamma and Normal respectively and sample can be drawn directly. However, the mean
reversion rate θ has a nonstandard distribution and we apply the Metropolis Hasting algorithm to draw samples. The Markov Chain Monte Carlo (MCMC) algorithm that carries
out the required parameter estimation is presented below.
Step 1. First, at the initial iteration of the chain set initial values for θ(0)
Step 2. At iteration m, obtain a new value σ 2 from Inverted Gamma distribution
Ñ
p(σ 2 |X, µ, θ) ∼ IG
Ç
å
n
X
n
θ
[Xjh − X(j−1)h e−θh − µ(1 − e−θh )]2
,
2 1 − e−2θh j=1
Step 3. Draw µ from Normal distribution
Ñ
p(µ|X, σ 2 , θ) ∼ N
n
Xjh − X(j−1)h e−θh σ 2 (1 − e−2θh )
1X
,
n j=1
1 − e−θh
2θn(1 − e−θh )2
é
.
é
.
Step 4. Move the chain to a new value φ generated from the density q(θ(m−1) ) ∼ N(θ(m−1) , Σ)
Step 5. Evaluate the acceptance probability of the move α(θ(m−1) , φ) given by
®
´
q(φ, θ)
α(θ, φ) = min 1,
.
q(θ, φ)
88
Step 6. Generate an independent uniform quantity u and
• If u ≤ α then the move is accepted and θ(m) = φ.
• If u > α then the move is rejected and θ(m) = θ(m−1) .
5.6.2 Numerical results
A sequence of two years daily log-deseasonalized historical Alberta spot prices were used to fit
the mean reverting OU process and estimate the model parameters using MCMC described
in Section 5.6. However, we found out that the daily log-deseasonalized spot prices are not
normally distributed as shown by the QQ plot and histogram in Figure 5.6
Histogram of logïdeseasonalized daily prices
90
80
70
60
50
40
30
20
10
0
ï2.5
ï2
ï1.5
ï1
ï0.5
0
0.5
1
1.5
2
2.5
Normal Probability Plot
Probability
0.999
0.997
0.99
0.98
0.95
0.90
0.75
0.50
0.25
0.10
0.05
0.02
0.01
0.003
0.001
ï2
ï1.5
ï1
ï0.5
0
0.5
1
1.5
2
2.5
Data
Figure 5.6: Normality test on Log de-seasonalized daily spot prices
As a result, we need to transform these prices into a normal distribution and calibrate an
O-U process. The transformation is based on first transforming the log-deseasonalized daily
prices to a uniform distribution and then applying inverse normal cumulative distribution
89
function to get a normal distribution. The histogram and QQ plot of the transformed prices
are shown in Figure 5.7
Histogram of transformed logïdeseasonalized daily prices
60
50
40
30
20
10
0
ï3
ï2
ï1
0
1
2
3
Normal Probability Plot
Probability
0.999
0.997
0.99
0.98
0.95
0.90
0.75
0.50
0.25
0.10
0.05
0.02
0.01
0.003
0.001
ï3
ï2.5
ï2
ï1.5
ï1
ï0.5
Data
0
0.5
1
1.5
2
Figure 5.7: Normality test on transformed log de-seasonalized daily spot prices
We consider Gibbs sampling approach by first drawing σ 2 from an inverted-gamma and
µ from a normal and we accept them with probability one. However, we use MetropolisHastings algorithm to draw θ from the candidate normal density and the sample is accepted
with some probability.
As a result, a sequence of iterates for (σ 2 , θ, µ) are generated and after an initial warm
up period, these draws converge and used to derive point estimates of the parameters. 5000
Bayesian estimates are used for each of the parameters and 2500 post warm up samples
(after burn in samples) are taken to derive the mode of these distributions.
90
The Bayesian estimates along with their 95% posterior interval are obtained and the
results are presented in the table below.
θ
µ
σ
Bayesian Estimate
0.5187
-0.4636
0.8741
Bayes Credible Interval
(0.5172, 0.5202)
Standard error
0.0007
(-0.4660, -0.4612) (0.8220, 0.9258)
0.0012
0.0009
Table 5.2: Bayesian estimates and their standard error of O-U spot price model parameters
5.6.3 Constructing the Daily Spot Prices
After estimating the parameter values, we can generate 730 realizations of the daily OU
process using
s
−θ∆t
Xt+∆t = µ + (Xt − µ)e
+
σ2
(1 − e−2θ∆t )Wt ,
2θ
where µ = −0.4636, σ = 0.8741, θ = 0.5287, ∆t = 1 and Wt ∼ N(0, 1). Now using the above
simulated values of Xt and the smooth seasonal function f (t) estimated as a spline, we can
construct back the daily spot prices as
St = exp(Xt + f (t)).
The newly constructed daily spot prices together with the actual daily pool price is shown
in Figure 5.8. Spikes, mean reverting and seasonal behaviours of electricity spot prices are
captured fairly well.
91
H ist or ic al & Simulat e d Pr ic e s
700
Ele c t r ic ity Sp ot Pr ic e ( $/MW h)
600
500
400
300
200
100
0
2011
2012
2013
2014
2015
Dat e
Figure 5.8: Actual (in blue) and constructed (in red) daily spot prices
To check whether seasonality is explained by our approach, the monthly average prices
from both the actual and constructed prices are calculated and shown in Figure 5.9. We can
clearly see that the monthly shape of the prices is modelled and captured well. Low power
prices are shown in the months of April, May and June and higher prices in June, July and
August. This phenomenon is consistent with the Alberta power market as the hot summer
in the third quarter results in an increases in cooling demand which forces prices to spike.
92
800
800
800
600
600
600
400
400
400
200
200
200
0
2012
2014
0
2012
2014
0
800
800
800
600
600
600
400
400
400
200
200
200
0
0
2012
2014
2012
2014
0
800
800
800
600
600
600
400
400
400
200
200
200
0
2012
2014
0
2012
2014
0
2012
2014
2012
2014
2012
2014
Figure 5.9: Historical and constructed spot prices and their monthly averages
5.7 Summary
In this chapter we analyzed spot prices in Alberta electricity market. Deregulation of electricity markets created competition and power price variations. However, practitioners and
researchers struggled with lack of market data to test their models and also calibration difficulty to estimate model parameters. We proposed a Bayesian approach to overcome these
difficulties and then use the available market data to calibrate the model parameters with
ease. In addition, we relax a rigid parametric modelling approach of the seasonal function
and employ a fully nonparametric spline approach by which the data itself determines what
type of form the seasonal function should have. Finally, the simulated price path captures all
of the unique features exhibited in the spot price which validates the mathematical soundness
of our approach.
93
Chapter 6
Conclusion
6.1 Summary
The main objective of the thesis was the development of a Bayesian penalized spline methods
for constructing a smooth forward curve in electricity markets. In the thesis, we developed
a Bayesian spline models and their associated methodologies that adequately address the
complex characteristics exhibited in spot and forward price of electricity markets. The
Bayesian penalized spline approach we adopted is attractive in that the resulting models can
solve difficulties encountered by frequentist such as parameter estimation with small number
of market data.
Throughout the thesis, we assumed the forward curve can be modelled by a spline using
a truncated power spline basis function the same way as Ruppert, Carroll and Yu (2003)
and we applied penalty as a function of a smoothing parameter which governs the tradeoff
between the lack of fit and smoothness. As a result, the function we estimated reflects key
features of the market data but retains some degree of smoothness.
Representing electricity forward prices by one smooth continuous curve is crucial in pricing derivatives, marking to market valuations and investment decisions. It is also required if
no-arbitrage forward price models are to be used for risk management. Common frequentist
approaches employ a parametric regression and constrained optimization models to construct the smooth forward curve. The lack of market data and the complexity of parameter
calibration makes these approaches hard to implement.
Our Bayesian penalized spline smoothing methodology, instead incorporates a prior
knowledge about the parameter and takes the observed market data to model the posterior distribution of parameter of interest to be used for any statistical inference. The ease
94
and efficiency to work with small number of data coupled with the capability of engaging
a prior information makes Bayesian approach very attractive to financial modelling. This
makes mathematical sense, as most electricity markets are deregulated recently and longer
history data is not available to test the common non-Bayesian models which rely on theory
of large number.
Our methodologies were motivated by forward prices data quoted to Mid-Columbia (MidC) and California Oregon Border (COB) pricing hubs and Alberta spot prices. The positive
spread between Mid-C and COB forward prices is too strong to ignore and it is incorporated
as a prior information to model a smooth forward curve of COB by borrowing the strong
liquidity of Mid-C and adding a spread term. In addition, we employ our approach to model
the deterministic seasonal function that is exhibited in the long term electricity spot price
dynamics. This is a new way of modelling the seasonal function using a Bayesian penalized
spline approach because most of the regular methodologies use a parametric sinusoidal function to model seasonality. Rather than imposing a parametric form, we let the historical
data to determine the form of the seasonal function.
In Chapter 2, we discussed a general Bayesian framework giving detail attention to Bayes
formula, specification of priors and hierarchical Bayesian parameter estimation using MCMC
and Bayesian statistical inference. We illustrated the Bayesian paradigm using a continuous
time stochastic differential equation of O-U type spot price model. The model involves three
parameters: mean reversion rate θ, long run mean µ and volatility (diffusion) parameter σ.
Since the model under discussion has a normal transition density and an exponential type
likelihood function, we proposed a noninformative flat priors for σ and µ and derived the
posterior distribution of these two parameters in closed form. However, since the density of
the mean reversion rate θ is nonstandard, we used a Normal based inference and approximate
the conditional density to be from a Normal family.
Empirical analysis is performed using MCMC algorithm. First, Gibbs sampling is applied
95
to draw samples for σ and µ from their respective distribution of Inverted gamma and
Normal families respectively. Next, since the posterior distribution of θ is unknown, we
implemented Metropolis- Hastings steps and we estimated the parameters by finding the
mode of their respective posterior distributions. Maximum likelihood estimates of these
parameters were derived in closed form and the results were compared against the Bayesian
estimates. We found out that even though the two approaches are very close in estimating
the two parameters σ and µ, Bayesian approach outperforms MLE in estimating θ with a
very tight credible interval.
In Chapter 3, we presented a brief introduction to spline smoothing and Bayesian penalized spline smoothing techniques. We illustrated different regression approaches (linear,
polynomial, spline smoothing and penalized spline smoothing) by fitting a nonparametric
regression model to the fossil data in R package SemiPar. This dataset has 106 observations
on fossil shells. The response is ratio of strontium isotopes (strontium.ratio), and the
covariate is age in millions of years (age). The coefficients of the spline basis functions used
for smoothing were assumed to be all deterministic and extra steps of Cross Validation and
Generalized Cross Validation analyses are used for selecting the smoothing parameter. The
result showed that penalized spline smoothing has a better fit by capturing most of the data
points and it has the lowest mean squared error (MSE) of all other approaches considered.
We proposed a Bayesian penalized spline framework by relaxing the deterministic assumption imposed onto the spline coefficients vector, and assumed that the coefficient vectors are
random. We specified Normal prior distributions with mean 0 and variance terms and we
derived the respective posterior distributions. These posterior distributions are standard
and nonstandard in nature so that we employed MCMC algorithm hierarchically using a
combination of Gibbs and Metropolis-Hastings steps. Unlike the frequentist approach, the
smoothing parameter in Bayesian framework is found as bi-product of the MCMC algorithm
employed as the ratio of the two variance terms that appeared in the prior distributions.
96
In Chapter 4, we provided a Bayesian approach to construct the smooth forward curves
of Mid-Columbia (Mid-C) and California Oregon Border (COB) power pricing hubs with a
penalized spline modelling approach. We developed a method of estimating a smooth curve
for illiquid markets such as COB from a liquid market such as Mid-C in a two step approach.
First, we estimate the term structure of forward prices quoted to Mid-Columbia market
hub, which is the most liquid market hub in the Northwest US. The smoothing parameter
is automatically calculated in the model as a ratio of posterior variances from the MCMC
algorithm and does not need to be selected using CV or GCV as in the frequentist approach.
Next, the California Oregon Border pricing hub smooth forward curve is estimated by
adding a spread to the constructed Mid-Columbia forward curve by taking account of the
prior knowledge of the positive credit spread into the Bayesian model as an informative
prior. The small sample size of the California Oregon Border forward prices available on
the market poses no additional challenge to the proposed Bayesian methodology. One of the
advantages of the Bayesian model in this case is the capability of incorporating valuable prior
information about the parameter and the flexibility to work with small market data. The
proposed methodology was illustrated using ICE historical monthly forward prices quoted to
Mid-C on December 01, 2009 delivery on January 2010 - December 2010 and COB quarterly
prices for the same period.
By extending the spline basis and knots used for estimating the Mid-Columbia forward
curve, we developed a Bayesian model to construct a long term smooth forward curve beyond
the one year period. We employed the same approach to estimate the long term forward
curve for COB by taking advantage of the liquidity of Mid-C and adding a positive spread.
Beyond 2010, COB forward prices are flat prices i.e one annual price for the year 2011
and another one for 2012 however by borrowing the strong seasonal shape variations from
Mid-Columbia we were able to get a seasonal shapes to COB for 2011 and 2012.
We applied the methodology 22 times independently to the 22 trading days data set
97
quoted to Mid-C from December 01- December 31, 2009 delivery on January - December
2010. Next we estimated independently the COB forward curves over the 22 days by adding
a constant spread to the fitted forward curve for Mid-C over the same period. The fitted
forward curves are shown at the end of the chapter. We found out that, forward curves over
the 22 days have the same shape for the same delivery month except a small adjustment
of moving the previous day curve upward or downward by a risk premium. This risk premium is mostly based on a new market information such as short term generation outages,
transmission outage, weather forecast, wind, etc.
In Chapter 5, we illustrated how the Bayesian framework can also be applied to the spot
price based forward curve modelling. First, we presented a spot price model that explains
the mean reverting and seasonality behaviours of electricity prices and we applied a Bayesian
estimation approach discussed in Chapter 2 to calibrate the model parameters. Second, we
employed a penalized spline modelling approach as in Chapter 4 to the deterministic seasonal
component of spot price models using the historical monthly average prices. This modelling
approaches avoided any parametric form imposed to model the seasonal component.
The proposed approach was illustrated using Alberta spot prices between January 2010December 2012. We tested the Normality of Alberta spot prices and the results showed that
the spot prices are actually non-Normal and the deviation from Normality is extreme. As a
result, we recommended not to use a regular Black Scholes type modelling for Alberta spot
price. We also tested the seasonality of the time series data by running the autocorrelation
test on them and the result suggested that Alberta spot prices show strong seasonality.
We found out that the highly correlated data points occurred every other 24 hours of lag.
As a result, we suggested that any Alberta spot price modelling should take into account
the strong seasonality exhibited. In addition, we filtered the spikes from the data set by
repeatedly removing price returns that lie outside of three standard deviation range. It is
shown that after 14 iterations, we filtered 10% of the returns considered as a spike so the
98
frequency of the jumps defined as the total number of jumps divided by the annualized
number of observation in Alberta is about 10 which is close to the UK and Wales spot price
jump frequency of 8.58 reported by Cartea and Figueroa (2005).
Based on the calibrated parameter, we simulated daily realizations using the continuous
time O-U process. We then added the smooth seasonal component estimated using penalized
spline back to the simulated prices and reconstruct the spot prices. The result showed that
seasonality, spike and mean reversion characteristics seen in the spot price dynamics were
captured reasonably well. Unlike the regular approach of modelling the seasonal component
parametrically, we proposed a new nonparametric modelling approach using penalized spline
by which the historical prices determine the form of the seasonal component.
6.2 Future Research
The research in this thesis has opened up a number of opportunities and issues which should
be explored in a future work.
The Bayesian penalized spline approach to construct the smooth forward curve based on
the market data discussed in Chapter 4 needs to be further explored. In particular, different
form of spline basis functions such as B-splines or periodic spline should be implemented fully
in order to clearly determine the advantages and disadvantages of the truncated power spline
basis considered in the thesis. In addition, we selected the knots to be fixed knots located
at the delivery months however this assumption should be further relaxed and explored by
proposing a new way of selecting the number and location of knots. Although our approach
validated that after a minimum number of knots are attained, increasing the number of knots
do not add any value to the smoothing, it should be tested with a different sets of knots
to completely understand if the estimation depends on the number and location of knots or
not.
The spot based forward modelling illustrated using the three year Alberta historical spot
99
prices definitely requires a more complete and in-depth study than we provided. Although
our development of the penalized spline approach depends on the monthly average prices
shown in the three year history and was restricted to forecasting the daily prices-our approach
can be conceptually extended to the hourly level. However, a careful study of the peak and
off peak hours with in a typical day and between days within a week in this setting are
essential; in addition, the inexorable computational issues need to be addressed.
Our modelling approaches assumed forward curves over different trading days are independent and we model the forward curve as a function of a single variate of delivery month.
However, this assumption should be further studied by modelling the forward curve as a
function of a calendar time and time to delivery or delivery month. In many interest rate
term structure modelling, this can be done by jointly modelling as function of time and
maturity date using tensor product of spline basis functions.
Although our empirical investigation shows forward prices are heavily depend on the time
of the year when delivery occurs than which day of the year the forward prices are quoted,
the multivariate modelling approach may result in a smooth curve over trading days as well
as over the delivery month. However, a careful study of the correlation structure in this
setting is again crucial. The curse of dimensionality and computational cost are also another
challenges because the dimension of tensor product matrices increases rapidly so that the
issues should be addressed. Filling in these holes would be a worthwhile assignment of future
work. We intend to address some of these issues in future work.
100
Bibliography
[1] Adams, K. J. and Van Deventer, D. R. (1994). Fitting yield curves and forward rate
curves with maximum smoothness. Journal of Fixed Income, 4, 52-56.
[2] Aguilar, O. and West, M. (2000). Bayesian dynamic factor models and portfolio allocation. Journal of Business and Economic Statistics, 18, 338-357.
[3] Barker, A. (1965). Monte Carlo calculations of the radial distribution functions for
proton-electron plasma. Austrian Journal of Physics, 18, 119-33.
[4] Barlow, M. (2002). A diffusion model for electricity prices. Mathematical Finance, Vol
12, 4, 287-298.
[5] Benth, F. E. and Ollmar, F. (2005). Stochastic modelling of financial electricity contracts. Preprint.
[6] Benth, F. E., Koekebakker, S. and Ollmar, F. (2007). Extracting and applying smooth
forward curves from average-based commodity contracts with seasonal variation. Journal of Derivatives, 15, 1.
[7] Bjerksund, P., Rasmussen, H. and Stensland, G. (2000). Valuation and risk management in the Nordic electricity market, Working paper. Institute of Finance and
Management Science, Norwegian School of Economics and Business Administration.
[8] Black, F. (1976). The pricing of commodity contracts. Journal of Financial Economics,
Vol.3, 167-179.
[9] Bralower, T. J., Fullagar, P. D., Paul, C. K., Dwyer, G. S. and Leckie, R. M. (1997).
Mid-Cretaceous strontium-isotope stratigraphy of deep-sea sections. Geological Society
of America Bulletin, 109, 1421-1442.
101
[10] Borak, S. and Weron, R. (2008). A semiparametric factor model for electricity forward
curve dynamics. Journal of Energy Markets, Vol. 1, 3, 3-16.
[11] Carmona, R. and Durrleman, V. (2003). Pricing and hedging spread options. SIAM
Review, Vol. 45, 4, 627-685.
[12] Cartea, A. and Figueroa, M. (2005). Pricing in electricity markets: a mean reverting
jump diffusion model with seasonality. Applied Mathematical Finance, Vol 12, 4, 313335.
[13] Chambers, D. R., Carleton, W. T. and Waldman, D. W. (1984). A new approach to
estimation of the term structure of interest rates. Journal of Financial and Quantitative
Analysis, 19, 233-252.
[14] Clewlow, L. and Strickland, C. (1999a). A multifactor model for energy derivatives.
Quantitative Finance Research Group, Working paper, University of Technology, (Sydney).
[15] Clewlow, L. and Strickland, C. (1999b). Valuing energy options in a one factor model
fitted to forward prices, Quantitative Finance Research Group, Working paper, University of Technology (Sydney).
[16] Clewlow, L. and Strickland, C. (2000). Energy derivatives: Pricing and risk management. Lacima Publications, London, England.
[17] Cortazar, G. and Schwartz, E. S. (1994). The valuation of commodity contingent
claims. Journal of Derivatives, 1, 27-39.
[18] Craven, P. and Wahba, G. (1979). Smoothing noisy data with spline functions. Numerische Mathematik, 31, 377-403.
[19] Currie, I. D., Durb´an, M. and Eilers, P. H. C. (2004). Smoothing and forecasting
mortality rates. Statistical Modelling, 4, 279-298.
102
[20] de Boor, C. (1978). A practical guide to splines. Springer Verlag.
[21] Denison, D. G. T., Mallick, B. K. and Smith, A. F. M. (1997). Automatic Bayesian
curve fitting. Journal of the Royal Statistical Society. Series B (Statistical Methodology), Vol. 60, 2, 333-350.
[22] Deng, S. (2000). Stochastic models of energy commodity prices and their applications:
Mean-reversion with jumps and spikes. Working paper PWP-073.
[23] Draper, N. and Smith, H. (1998). Applied regression analysis. Wiley
[24] Duffie, D., Pan, J. and Singleton, K. (2000). Transform analysis and asset pricing for
affine jump-diffusions. Econometrica, Vol 68 ,6, 1343-1376.
[25] Duffie, D. and Singleton, K. (1999). Modelling term structures of defaultable bonds.
Review of Financial Studies, 12, 197-226.
[26] Eilers, P. H. C. and Marx, B. D. (1996). Flexible smoothing with B-splines and penalties. Statistical Science, 11, 89-121.
[27] Edwards, D. W. (2010). Energy trading and investing. McGraw Hill.
[28] Fahrmeir, L. and Kneib T. (2011). Bayesian smoothing and regression for longitudinal,
spatial and event history data. Oxford Statistical Science Series.
[29] Fleten, S. E. and Lemming, J. (2003). Constructing forward price curves in electricity
markets. Energy Economics, 25, 409-424.
[30] Gamerman, D. (1997). Markov Chain Monte Carlo, stochastic simulation for Bayesian
inference. Chapman & Hall/ CRC.
[31] Gelfand, A. E. and Smith, A. F. M. (1990). Sampling based approaches to calculating
marginal densities. Journal of the American Statistical Association, Vol. 85, 410, 398409.
103
[32] Gelman, A., Carlin, J. B., Stern, H. and Rubin, D. (1995). Bayesian data analysis.
Chapman & Hall/ CRC.
[33] Gilks, W. R., Richardson, S. and Spiegelhalter, D. J. (1996). Markov Chain Monte
Carlo in practice. Chapman & Hall/ CRC.
[34] Gray, P. (2002). Bayesian estimation of financial models. Accounting and Finance Vol.
42, 111-130.
[35] Gray, P. (2005). Bayesian estimation of short rate models. Australian Journal of
Management, 30:1.
[36] Green, P. (1987). Penalized likelihood for general semiparametric regression models.
Internat. Statist. Rev., 55, 245-259.
[37] Green, P. J. and Silverman B. W. (1994). Nonparametric regression and generalized
linear models: A roughness penalty approach. Chapman & Hall/CRC.
[38] Gu, C. (2002). Smoothing spline anova models Springer Series in Statistics.
[39] Hastie, T. and Tibshirani, R. (1990). Generalized additive models. London. Chapman
and Hall.
[40] Hastings, W. K. (1970). Monte Carlo sampling methods using Markov chains and their
applications. Biometrika, 57(1), 97-109.
[41] Heath, D., Jarrow, R. and Merton A. (1992). Bond pricing and the term structure
of interest rates: A new methodology for contingent claims valuation. Econometrica,
77-105.
[42] Jacquier, E., Polson, N. and Rossi, P. (2000). Bayesian analysis of stochastic volatility
models. Journal of Business & Economic Statistics, American Statistical Association,
Vol. 12, 4, 371- 389.
104
[43] Jarrow, R., Ruppert, D. and Yu, Y. (2004). Estimating the term structure of corporate
debt with a semiparametric penalized spline model. Journal of the American Statistical
Association, 99, 57-66.
[44] Knittel, C. R. and Roberts, M. R. (2001). An empirical examination of deregulated
electricity prices. Power Working Papers, PWP-087.
[45] Krivobokova, T. (2006) Theoretical and practical aspects of penalized spline smoothing. Bielefeld (Germany): Bielefeld University.
[46] Krivobokova, T. (2007). A note on penalized splines with correlated errors. Journal of
the American Statistical Association Vol. 102, 480, 1328-1337.
[47] Lang, S. and Brezger, A. (2004). Bayesian P-splines. Journal of Computational and
Graphical Statistics, 13, 183-212.
[48] Lee, D. and Durb´an, M. (2012). Seasonal modulation mixed models for time series
forecasting. Working paper. Statistics and Economics Series, Universidad Carlos III de
Madrid.
[49] Lee, S. Y., Poon, W. Y. and Song, X. Y. (2007). Bayesian analysis of the factor model
with finance applications. Quantitative Finance, Vol.7, 3, 343-356.
[50] Liebl, D. (2013). Modeling and forecasting electricity spot prices: A functional data
perspective. Annals of Applied Statistics Vol. 7, 3, 1562-1592.
[51] Liechty, J. C., Liechty, M. W. and M¨
uller, P. (2004). Bayesian correlation estimation.
Biometrika, Vol.91, 1, 1-14.
[52] Lindley, D. V. and Smith, A. F. M. (1972). Bayes estimates for the linear model (with
discussion). Journal of the Royal Statistical Society, Series B 34, 1-41.
105
[53] Lucia, J. and Schwartz, E. S. (2000). Electricity prices and power derivatives: Evidence
from the Nordic power exchange. Review of Derivatives Research, 5, 5-50.
[54] Li, M. and Yu Y. (2005). Estimating the interest rate term structure of treasury and
corporate debt with Bayesian penalized splines. Journal of Data Science, 3, 223-240.
[55] Metropolis, N., Rosenbluth A. W., Rosenbluth, M. N., Teller, A. H., and Teller E.
(1953). Equation of State Calculations by Fast Computing Machines. The Journal of
Chemical Physics, 21, 1087.
[56] McCulloch, J. H. (1971). Measuring the term structure of interest rates. Journal of
Business, 44, 19-31.
[57] McCulloch, J. H. (1975). The tax-adjusted yield curve. Journal of Finance, 30, 811830.
[58] Peskun P. (1973). Optimum Monte Carlo sampling using Markov chains. Biometrika,
60(3), 607-12.
[59] Pilipovi´c, D. (1998) Energy risk: Valuing and managing energy derivatives. McGrawHill, New York, USA.
[60] Ramsay, J. O. and Silverman B. W. (2005). Functional Data Analysis. Springer Series
in Statistics.
[61] Ruppert, D. and Carroll, R. (1997). Penalized regression splines. Working paper, Cornell University.
[62] Ruppert, D. and Carroll, R. J. (2000). Spatially-adaptive penalties for spline fitting.
Australian and New Zealand Journal of Statistics, 42, 205-223.
[63] Ruppert, D. (2002). Selecting the number of knots for penalized splines. Journal of
Computational and Graphical Statistics, 11, 735-757.
106
[64] Ruppert, D., Wand, M. P. and Carroll R. J. (2003). Semiparametric regression. Cambridge University Press.
[65] S¨arkk¨a, S. (2013). Bayesian filtering and smoothing. Cambridge University Press.
[66] Schwartz, E. S. (1997). The stochastic properties of commodity prices. Implication for
valuation and hedging. Journal of Finance, 52, 923-973.
[67] Schwartz, T. (1998). Estimating the term structure of corporate debt. Review of
Derivatives Research, 2, 193-230.
[68] Shea, G. (1984). Pitfalls in smoothing interest rate term structure data: Equilibrium
models and spline approximations. Journal of Financial and Quantitative Studies, 19,
253-269.
[69] Shea, G. (1985). Interest rate term structure estimation with exponential splines: A
note. Journal of Finance, 40, 319-325.
[70] Simonoff, J. S. (1996). Smoothing methods in statistics Springer Series in Statistics.
[71] Tanggaard, C. (1997). Nonparametric smoothing of yield curves. Review of Quantitative Finance and Accounting, 9, 251-267.
[72] Tanner, M. A. and Wong, W. H. (1987). The calculation of posterior distributions
by data augmentation. Journal of the American Statistical Association, Vol. 38, 398,
528-540.
[73] Tanner, M. A. (1996). Tools for statistical inference. Springer.
[74] Vasicek, O. A. and Fong, H. G. (1982). Term structure modelling using exponential
splines. Journal of Finance, 37, 339-348.
107
[75] Wahba, G. (1978). Improper priors, spline smoothing and the problem of guarding
against model errors in regression. Journal of the Royal Society. Series B (Methodological), Vol. 40, 3, 364-372.
[76] Wahba, G. (1990). Spline models for observational data, Philadelphia. Society for
Industrial and Applied Mathematics (SIAM).
108