Modeling financial default: two approaches
Susan Thomas
3 November, 2014
Goals
- Seek to model and predict the financial health of a firm.
- Two approaches:
  1. Statistical: Z-score of accounting ratios
     - Identify two sets of firms: those that defaulted on payments and those that have not.
     - Calculate accounting ratios from the balance sheets of both.
     - Identify accounting ratios which are significantly different for the defaulted firms compared to non-defaulted firms.
     - Apply these to existing firms to predict default.
  2. Theoretical: Merton’s approach.
Approach 1: Binary choice models of default
Motivation
- Default modelling helps us identify factors that are important in determining the ability of firms to pay back borrowed money.
- The model also generates weights for each factor to capture a single measure of financial health of the firm.
- This single measure of financial health is then mapped into a probability of default $p$, which is a component of the credit risk calculation.
- Note: Here $p$ is calculated for a firm, not a single bond.
Econometric approaches
- Discriminant Analysis (DA) is used to determine which variables discriminate between two (or more) naturally occurring groups. For example: default and no default.
- However, DA requires some stringent assumptions on the data that are usually not met in practice.
- References for Discriminant Analysis:
  - http://www.statsoft.com/textbook/stdiscan.html
  - http://www2.chass.ncsu.edu/garson/pa765/discrim.htm
- Probit and Logit Analysis have become the accepted way of modeling financial default. Probit and Logit models are a subset of the wider class of Binary Choice Models.
Binary Choice Models
- We want to predict the probability of default for a firm.
- Let $Z_i$ be a random variable with the following properties:
  \[ Z_i = 1 \text{ with probability } P_i \quad \text{(default)} \]
  \[ Z_i = 0 \text{ with probability } (1 - P_i) \quad \text{(no-default)} \]
- $E(Z_i) = 1 \cdot P_i + 0 \cdot (1 - P_i) = P_i$
- Therefore, the expected value of $Z_i$ is equal to the probability of the event occurring.
- Note that the probability of the event depends on $i$; i.e., it varies by firm characteristics.
- \[ P_i = G(\beta_0 + \beta_1 x_{1,i} + \beta_2 x_{2,i} + \beta_3 x_{3,i} + \ldots + \beta_k x_{k,i}) = G(\beta' X_i) \]
- If we select a $G(\cdot)$ and can estimate $\beta$, then:
  \[ \hat{P}_i = G(\hat{\beta}' X_i) \]
Linear Probability Model
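- Ideally, we would like to allow $\beta' X_i$ to vary freely but restrict $G(\beta' X_i)$ to lie in $[0, 1]$:
  \[ -\infty < \beta' X_i < +\infty, \qquad 0 \le G(\beta' X_i) \le 1 \]
- Cumulative distribution functions (CDFs) satisfy this property. In principle any CDF can be chosen.
  \[ F(\beta' X_i) = \int_{-\infty}^{\beta' X_i} \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{1}{2}\epsilon^2\right) d\epsilon \quad \text{(Probit)} \]
  \[ F(\beta' X_i) = \frac{e^{\beta' X_i}}{1 + e^{\beta' X_i}} \quad \text{(Logit)} \]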
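The two CDFs can be evaluated directly to see that they map any real index into [0, 1]. This is a minimal sketch using only the Python standard library; the function names and the sample index values are illustrative assumptions, not part of the slides.

```python
import math


def probit_cdf(x: float) -> float:
    """Standard normal CDF, written with the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def logit_cdf(x: float) -> float:
    """Standard logistic CDF e^x / (1 + e^x), in a numerically stable form."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    ex = math.exp(x)
    return ex / (1.0 + ex)


# Hypothetical values of the index beta'X_i, from very negative to very positive.
for index in (-50.0, -2.0, 0.0, 2.0, 50.0):
    print(index, probit_cdf(index), logit_cdf(index))
```

Both functions return 0.5 at an index of zero and stay inside [0, 1] however large the index becomes, which is the property the slide asks of G.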
Latent Variable Interpretation of Binary Models
– The Concept of Z-Score
- Suppose we write:
  \[ Z_i^* = \beta_0 + \beta_1 x_{1,i} + \beta_2 x_{2,i} + \beta_3 x_{3,i} + \ldots + \beta_k x_{k,i} + \epsilon_i = \beta' X_i + \epsilon_i \]
- $x_{1,i}, \ldots, x_{k,i}$ are accounting ratios and possibly other macroeconomic and management factors.
- Then $Z_i^*$ can be interpreted as the financial health of the firm.
- But $Z_i^*$ is not observable. It is a latent variable.
- Instead, we observe a proxy, $Z_i$, where
  \[ Z_i = 1 \text{ if } Z_i^* \le 0, \qquad Z_i = 0 \text{ if } Z_i^* > 0 \]
- Alternative PDFs of $\epsilon_i$ give either the Probit (standard normal) or the Logit (standard logistic) model.
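The step from the latent-variable rule to a default probability can be written out. The derivation below is a sketch that assumes $\epsilon_i$ has a CDF $F$ that is symmetric about zero, which holds for both the standard normal and the standard logistic distributions.

```latex
\begin{align*}
P(Z_i = 1) &= P(Z_i^* \le 0) \\
           &= P(\epsilon_i \le -\beta' X_i) \\
           &= F(-\beta' X_i) \\
           &= 1 - F(\beta' X_i)
\end{align*}
```

With $F = \Phi$ this is $1 - \Phi(\beta' X_i)$: under this sign convention a larger $\beta' X_i$ (better financial health) means a lower probability of default.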
Choice between Probit and Logit Model
- In principle one can use uniform or log-normal distributions.
- The standardized logistic and normal distributions are symmetric and almost identical, except at the tails.
- The logistic distribution is fatter in the tails, so that extreme values have a slightly higher probability of occurrence than under the normal distribution.
- If the data is moderately balanced (between 0 and 1) we use the probit model. Otherwise we use the logit model.
- Estimates from the logit/probit models are related by the approximate rule
  \[ \hat{\beta}_{Logit} \approx 1.6 \, \hat{\beta}_{Probit} \]
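One way to see why a factor of about 1.6 links the two sets of estimates is to compare the normal CDF at x with the logistic CDF at 1.6x. The sketch below does this on a grid; the grid range and step are arbitrary choices made for illustration.

```python
import math


def normal_cdf(x: float) -> float:
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def logistic_cdf(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


SCALE = 1.6  # the approximate logit/probit scaling factor from the slide

# Grid from -4 to 4 in steps of 0.01 (an arbitrary illustrative choice).
grid = [i / 100.0 for i in range(-400, 401)]
gaps = [abs(normal_cdf(x) - logistic_cdf(SCALE * x)) for x in grid]
max_gap = max(gaps)
worst_x = grid[gaps.index(max_gap)]
print(f"largest gap {max_gap:.4f} at x = {worst_x:+.2f}")
```

The largest gap between the two curves stays below two percentage points, and it occurs away from the centre, consistent with the remark that the distributions differ mainly in the tails.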
Outputs of the probit/logit models: Z-score
- The factors that determine the financial health of the firm: $X_i^*$
- The financial health of the firm, $Z_i^*$:
  \[ \hat{Z}_i^* = \hat{\beta}' X_i^* \]
  This is called the Z-score.
- Observed values on the financial health of the firm ($Z_i$) are either 0 or 1. $\hat{Z}_i^*$ is continuous.
- $\hat{Z}_i^*$ can be used to rank or score firms in terms of their financial health.
- The probit and logit models are the most widely used forms of Scoring Models. These models have wide applicability in other situations; e.g., ranking customers based on their willingness (probability) to buy a product.
- The Z-score can be used to put firms into rating categories. But the cut-offs have to be set exogenously.
Outputs of the probit and logit models
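- The probability of default:
  \[ P(Z_i = 1) = 1 - \Phi(\hat{\beta}' X_i^*) \]
- Marginal effects:
  \[ \text{Probit model:} \quad \frac{\partial P_i}{\partial X_{ik}^*} = -\phi(\hat{\beta}' X_i^*) \, \hat{\beta}_k \]
  \[ \text{Logit model:} \quad \frac{\partial P_i}{\partial X_{ik}^*} = -\frac{e^{\hat{\beta}' X_i^*}}{\left(1 + e^{\hat{\beta}' X_i^*}\right)^2} \, \hat{\beta}_k \]
- The marginal effect of $X_i^*$ on the probability $P_i$ is not fixed as in the linear regression model, but varies with $X_i^*$.
- Instead, use the odds ratio because it is independent of $X_i^*$.
- For example, in the logit model:
  \[ \frac{P_i}{1 - P_i} = e^{-\hat{\beta}' X_i^*} \]
  → The odds in favour of the event.
- The odds ratio:
  \[ \left. \frac{P_i}{1 - P_i} \right|_{X_{ik} = X_{ik}^* + 1} \Bigg/ \left. \frac{P_i}{1 - P_i} \right|_{X_{ik} = X_{ik}^*} = e^{-\hat{\beta}_k} \]
- This measures how the odds of an event change with a one unit change in the variable, and is independent of the value of $X_k$.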
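The logit outputs listed here can be checked numerically. The sketch below uses the sign convention of these slides (default when the latent score is at or below zero, so the default probability is 1/(1 + e^{β'X})). The coefficient values and the firm's ratios are hypothetical, chosen only for illustration.

```python
import math

# Hypothetical estimates: an intercept followed by two slope coefficients.
beta = [0.5, 1.2, -0.8]
# Hypothetical firm: the leading 1.0 multiplies the intercept.
x = [1.0, 0.4, 0.3]


def index(beta, x):
    """The linear index beta'x."""
    return sum(b * v for b, v in zip(beta, x))


def logit_default_prob(beta, x):
    """P(Z = 1) = 1 - Lambda(beta'x) = 1 / (1 + exp(beta'x))."""
    return 1.0 / (1.0 + math.exp(index(beta, x)))


def logit_marginal_effect(beta, x, k):
    """dP/dx_k = -beta_k * exp(beta'x) / (1 + exp(beta'x))^2."""
    z = index(beta, x)
    return -beta[k] * math.exp(z) / (1.0 + math.exp(z)) ** 2


def odds(p):
    return p / (1.0 - p)


k = 1  # look at the first slope coefficient
p0 = logit_default_prob(beta, x)

x_plus = list(x)
x_plus[k] += 1.0  # one unit increase in variable k
p1 = logit_default_prob(beta, x_plus)

odds_ratio = odds(p1) / odds(p0)
print("default probability:", p0)
print("marginal effect    :", logit_marginal_effect(beta, x, k))
print("odds ratio         :", odds_ratio, "vs exp(-beta_k) =", math.exp(-beta[k]))
```

Changing the values in `x` changes the marginal effect but leaves the odds ratio at exp(-β_k), which is the reason the slide recommends reporting odds ratios.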
Diagnostics for the Probit/Logit models
- Significance of the coefficients: whether each coefficient differs from 0 (⇔ whether its odds ratio differs from 1).
- Overall fit of the model:
  - Likelihood ratio (LR) test
  - Score (LM) test
  - Wald test
  - Classification tables (all require the specification of a cut-off probability for classification, usually 0.5):
    - Sensitivity: proportion of actual 1s correctly predicted
    - Specificity: proportion of actual 0s correctly predicted
    - False positive: proportion of predicted 1s that are actually 0
    - False negative: proportion of predicted 0s that are actually 1
An example of classification table
              Actual 0   Actual 1   Total
Fitted 0           250        100     350
Fitted 1            50        600     650
Total              300        700    1000
- Sensitivity: 600/700
- Specificity: 250/300
- False Positive: 50/650
- False Negative: 100/350
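The four rates can be recomputed from the cell counts of the table. This is a small sketch; the dictionary layout and the variable names are assumptions made for illustration.

```python
# Cell counts from the classification table, keyed by (fitted, actual).
table = {(0, 0): 250, (0, 1): 100, (1, 0): 50, (1, 1): 600}

actual_1 = table[(0, 1)] + table[(1, 1)]  # firms that actually are 1
actual_0 = table[(0, 0)] + table[(1, 0)]  # firms that actually are 0
fitted_1 = table[(1, 0)] + table[(1, 1)]  # firms the model predicts as 1
fitted_0 = table[(0, 0)] + table[(0, 1)]  # firms the model predicts as 0

sensitivity = table[(1, 1)] / actual_1      # 600/700
specificity = table[(0, 0)] / actual_0      # 250/300
false_positive = table[(1, 0)] / fitted_1   # 50/650
false_negative = table[(0, 1)] / fitted_0   # 100/350

for name, value in [("sensitivity", sensitivity),
                    ("specificity", specificity),
                    ("false positive", false_positive),
                    ("false negative", false_negative)]:
    print(f"{name:15s} {value:.3f}")
```

Note that sensitivity and specificity are computed over the actual classes, while the false positive and false negative rates, as defined in these slides, are computed over the predicted classes.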
Building a z-score model for Indian firms
Modeling issues
- Which universe of companies?
- What time period is used in estimating the model?
- What is the definition of default used?
- Which ratios? Selection of variables
- Which model? Model selection
Sample of companies
- Models of default probability differ across company characteristics.
- Moody’s has two separate models for public and private companies.
  - Some types of information are not available for private companies.
  - The behaviour of the ratios themselves with respect to default differs between these two sets.
Sample of time
- A random sample from the universe is selected over a time period. It is important that the random sample contains some “minimum” number of defaults.
- Data should be spread over a “sufficient” time period which allows for out-of-time validation.
- For India, out-of-time validation is tough; even where the data span is long enough, there have been many regulatory changes. However, this is not debilitating to the exercise.
The definition of default
- Moody’s has five ways to define default. In the model for Germany, default is defined as coming from bankruptcy data.
- The model output is the probability that the company has payments that are 90 days past due to banks.
- In India also, we might define default as 90 days past due to banks.
Selection of variables
- Data on $X_i = [X_{1i}, X_{2i}, X_{3i}, X_{4i}]$
  - $X_{1i}$ = Firm characteristics (accounting data): measures of coverage, liquidity, leverage, profitability, dividend history, group affiliation
  - $X_{2i}$ = Firm characteristics (stock market data): price-earnings ratio, market capitalization
  - $X_{3i}$ = Loan characteristics: size of loan, collateral, quality of guarantor
  - $X_{4i}$ = Industry characteristics, macroeconomic factors
What ratios to select?
- Basic considerations: how many ratios?
  - Too few will not capture all information.
  - Too many will make for a good in-sample fit, but a poor out-of-sample fit.
  - Too many will tend to confound intuition in a multivariate setting.
  - Too many ratios will also be costly in data requirements.
- The number of variables is chosen by trading off parsimony in data against intuitive accuracy of the coefficients on the variables. Typically, variables are selected on the basis of best performance as measured by defined metrics.
Example: Cash Ratio to identify default vs. non-default firms
[Figure: ROC curve and ROC convex hull for the Cash Ratio, AUC 0.411. Both axes run from 0.0 to 1.0.]
Example: Debt-Equity ratio to identify default vs. non-default firms
[Figure: ROC curve and ROC convex hull for the Debt Equity Ratio, AUC 0.713. Both axes run from 0.0 to 1.0.]
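The AUC reported for each ratio can be read as a pairwise comparison: the share of (defaulter, non-defaulter) pairs in which the defaulter has the higher value of the ratio. The sketch below computes it that way. The function name and the sample values are hypothetical, and this is not necessarily how the figures in the slides were produced.

```python
def auc(default_values, nondefault_values):
    """Share of (defaulter, survivor) pairs where the defaulter's ratio is higher.

    Ties count as half a win.
    """
    wins = 0.0
    for d in default_values:
        for n in nondefault_values:
            if d > n:
                wins += 1.0
            elif d == n:
                wins += 0.5
    return wins / (len(default_values) * len(nondefault_values))


# Hypothetical debt-equity ratios, not taken from the CMIE data.
defaulters = [3.0, 5.0]
survivors = [1.0, 4.0]
print(auc(defaulters, survivors))  # 0.75
```

An AUC above 0.5 means high values of the ratio go with default; an AUC below 0.5, as for the cash ratio, means high values go with non-default, so the ratio still carries information once its direction is reversed.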
How to do model selection
- Log-likelihood function value
- Out-of-sample prediction performance
  - out-of-time
  - out-of-sample (companies that didn’t exist in the estimation set)
  - out-of-universe (sectoral selection)
Power curves and accuracy ratios
Power curves
The power curve is a graph which answers a simple question about a model:
- How many of the firms that the model categorised as having a relatively high probability of default have actually defaulted?
Example of two models and their output
- Dataset: we have 20 firms (named A-T).
- We observe that four of them have defaulted (B, J, M, N).
- Models: we have two candidate models, M1 and M2, and their output probability of default for all the firms.
The model outputs
Firms in decreasing order of Probability of Default (PoD):
Rank    1  2  3  4  5  6  7  8  9 10
M1      B  T  M  S  R  Q  N  J  A  C
M2      B  S  T  R  Q  J  A  C  O  P

Rank   11 12 13 14 15 16 17 18 19 20
M1      O  P  L  D  E  F  H  G  I  K
M2      L  M  N  D  E  F  H  G  I  K
The power curve
- The x-axis: fraction of firms in the sample, ordered by descending probability of default (falls between 0 and 1).
- The y-axis: ratio of the number of defaulted firms within $X \le X^*$ to the total number of defaulted firms in the sample.
An example of the power curve
[Figure: power curves for M1 and M2, shown with the curve of a model with no prediction capability and the curve of a model with perfect prediction capability. Both axes run from 0.0 to 1.0.]
Interpreting the power curve
- The 45° line is the line of zero prediction.
- The step line close to the y-axis is the line of perfect prediction.
- The closer the model’s power curve is to the 45° line, the worse the model is.
- The accuracy ratio is a relative, quantitative measure of the performance of one model compared to another model.
- The closer the model’s power curve is to the 45° line, the closer the accuracy ratio is to 0.
- The larger the accuracy ratio, the better the model.
- The accuracy ratio varies between 0, for a model with no explanatory power, and 1, for the model with the highest explanatory power.
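The power curve and an accuracy ratio can be computed for the two example rankings of the 20 firms A-T, of which B, J, M and N defaulted. The slides do not spell out a formula for the accuracy ratio. This sketch assumes one common definition: the area between the model's curve and the 45° line, divided by the same area for a perfect model, with areas taken by the trapezoid rule.

```python
DEFAULTED = {"B", "J", "M", "N"}
# Firms in decreasing order of predicted probability of default.
M1 = list("BTMSRQNJACOPLDEFHGIK")
M2 = list("BSTRQJACOPLMNDEFHGIK")


def power_curve(ranking, defaulted):
    """Points (x, y): fraction of firms examined vs. fraction of defaults captured."""
    n, n_def = len(ranking), len(defaulted)
    xs, ys, hits = [0.0], [0.0], 0
    for k, firm in enumerate(ranking, start=1):
        hits += firm in defaulted
        xs.append(k / n)
        ys.append(hits / n_def)
    return xs, ys


def area(xs, ys):
    """Trapezoidal area under a piecewise-linear curve."""
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for x0, x1, y0, y1 in zip(xs, xs[1:], ys, ys[1:]))


def accuracy_ratio(ranking, defaulted):
    model = area(*power_curve(ranking, defaulted))
    # A perfect model ranks every defaulter ahead of every non-defaulter.
    perfect_order = sorted(ranking, key=lambda firm: firm not in defaulted)
    perfect = area(*power_curve(perfect_order, defaulted))
    random_area = 0.5  # area under the 45-degree line
    return (model - random_area) / (perfect - random_area)


ar_m1 = accuracy_ratio(M1, DEFAULTED)
ar_m2 = accuracy_ratio(M2, DEFAULTED)
print(f"AR(M1) = {ar_m1:.3f}, AR(M2) = {ar_m2:.3f}")
```

Under this definition M1 scores roughly 0.72 and M2 roughly 0.31, because M1 places all four defaulters within its top eight firms while M2 leaves two of them at ranks 12 and 13.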
The CMIE credit model
The data set
- CMIE has defined companies as having defaulted using in-house expertise. Default information was also obtained from Centurion Bank, IDBI Bank and UTI Bank.
- The full dataset spans 1990-2001, but the default rate is representative of the population only during 1996-2000. Since we are predicting the one-year-ahead probability of default, our sample consists of default and non-default firms for 1995-1999.
- In this period, there were 584 default episodes and a total of 17981 non-default events.
Selected ratios and the signs on first derivatives
Ratio                    Pr(default)
Debt equity ratio        +
Times interest ratio     −
Fixed assets turnover    −
Productivity ratio       −
Cash ratio               −
Alternative models
- Altman’s Z-Score model: Working Capital/Total assets, Earnings/Total Assets, Return on Assets, Cash ratio
  \[ \text{Z-score}_i = 6.56 X_{1i} + 3.26 X_{2i} + 6.73 X_{3i} + 1.05 X_{4i} + 3.25 \]
- Moody’s model (proxy): Equity/Total assets, Total Debt ratio, EBDIT, Profit margin, Cash ratio, Sales, Trade Creditor’s ratio, Productivity ratio.
- Our sample model: Debt equity ratio, Times interest ratio, Fixed assets turnover, Cash ratio, Productivity ratio.
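The Altman formula quoted above is a weighted sum, so scoring and ranking firms takes only a few lines. In this sketch the weights and intercept come from the slide, while the firm names and ratio values are hypothetical.

```python
# Weights and intercept as quoted in the slide.
ALTMAN_WEIGHTS = (6.56, 3.26, 6.73, 1.05)
ALTMAN_INTERCEPT = 3.25


def altman_z(x1, x2, x3, x4):
    """Z-score = 6.56*X1 + 3.26*X2 + 6.73*X3 + 1.05*X4 + 3.25."""
    ratios = (x1, x2, x3, x4)
    return sum(w * x for w, x in zip(ALTMAN_WEIGHTS, ratios)) + ALTMAN_INTERCEPT


# Hypothetical firms with made-up ratio values (X1, X2, X3, X4).
firms = {
    "Firm P": (0.25, 0.10, 0.08, 0.30),
    "Firm Q": (-0.05, -0.02, 0.01, 0.05),
}

# Rank from the highest score to the lowest.
ranked = sorted(firms, key=lambda name: altman_z(*firms[name]), reverse=True)
for name in ranked:
    print(name, round(altman_z(*firms[name]), 3))
```

In Altman's model a higher score indicates a healthier firm, so the firm at the top of this ranking is the one judged furthest from distress.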
Power curves for models
[Figure: power curves for Altman’s Z-score 1, Altman’s Z-score 2, Moody’s proxy and the CMIE Credit Model, added to the plot one model at a time. The vertical axis runs from 0.0 to 1.0.]
Accuracy ratios for alternative models
Model                Accuracy ratio
Altman Z-Score 1     0.2379
Altman Z-Score 2     0.3248
Moody’s proxy        0.4920
CMIE Credit Model    0.5722
The economic content of the model
- The model uses ratios from each of the six categories.
- We find that the direction and magnitude of the sensitivity of the ratios are meaningful.
- For example, one of the ratios used is leverage, which is found to be one of the most important for predicting default.
- In addition, the model also incorporates:
  1. A management proxy
  2. Industry effects
  3. Macro-economic effects
Approach 2: The Merton model
The structural credit risk model approach
- A firm defaults when the market value of its assets is less than the debt it has to repay.
- Inputs: firm’s balance sheet and financial structure.
- Balance sheet equation: assets = liabilities + shareholder equity. A loss in asset value can be absorbed as long as it is less than shareholder equity. After that, the firm is technically bankrupt: it can no longer repay its debt.
- Then,
  - The market value of liabilities/debt = $B$.
  - The market value of equity = $E$.
  - Market value of assets = $V$.
  - Then $V = E + B$
- The book value of $B$ can be observed, and the market value of $E$ is observed.
- But $V$ is difficult to observe accurately: balance sheets only provide an accounting version of the firm’s assets.
- Solution: Estimate $V$ using an option pricing formula.
How equity is like a call option
- An option is a financial contract, of which there are two types: calls and puts.
- A call option gives a positive payoff when the market price of the asset is higher than a defined (“strike”) price. If the market price is less than the strike price, the call option has a zero payoff.
- In the context of a firm with equity holders and bond holders:
  - Equity $E$ has a positive payoff only when the asset value $V > B$, the debt.
  - The payoff to equity is similar to the payoff of a call option.
  - Then, holding $E$ is like holding a call option on $V$, the value of the firm.
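The analogy can be made concrete with the payoff function of a call: equity holders receive whatever asset value is left after the debt is repaid, and nothing otherwise. This is a minimal sketch. The asset values are hypothetical, and the payoff shown is before any premium paid for the option.

```python
def call_payoff(spot: float, strike: float) -> float:
    """Payoff of a call at expiry: max(spot - strike, 0)."""
    return max(spot - strike, 0.0)


B = 1250.0  # debt to be repaid, playing the role of the strike price

# Hypothetical asset values V at the repayment date.
for V in (1000.0, 1250.0, 1400.0, 1500.0):
    print(f"V = {V:6.0f}  payoff to equity = {call_payoff(V, B):5.0f}")
```

For every asset value at or below B the equity payoff is zero, and above B it rises one-for-one with V.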
Example of a call option payoff, B = 1250
[Figure: payoff to the call option buyer (Rs., vertical axis from -200 to 200) against the spot price on expiration (Rs., horizontal axis from 1000 to 1500).]
Inputs required to reverse-calculate V
1. Market price of equity, $E$
2. Volatility of equity, $\sigma_E$
3. Value of debt, $B$
4. Maturity $T$, which is the horizon over which to estimate the probability of default of the firm.
5. Risk-free interest rate, $r_f$
Note: This approach can only work for listed firms.
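One way to back out V from these five inputs is to solve the two standard Merton equations, E = V·N(d1) − B·e^{−rT}·N(d2) and σ_E·E = N(d1)·σ_V·V, for V and the asset volatility σ_V. The sketch below does so by fixed-point iteration, which is one of several possible solution methods and not necessarily the one used for the results in these slides. The slides do not define DtD; taking it to be d2 is an assumption made here, and all input numbers are hypothetical.

```python
import math


def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def d1_d2(V, B, sigma_V, r, T):
    d1 = (math.log(V / B) + (r + 0.5 * sigma_V ** 2) * T) / (sigma_V * math.sqrt(T))
    return d1, d1 - sigma_V * math.sqrt(T)


def equity_value(V, B, sigma_V, r, T):
    """Black-Scholes call value: equity as a call on assets V with strike B."""
    d1, d2 = d1_d2(V, B, sigma_V, r, T)
    return V * norm_cdf(d1) - B * math.exp(-r * T) * norm_cdf(d2)


def solve_assets(E, sigma_E, B, r, T, tol=1e-10, max_iter=1000):
    """Fixed-point iteration on the two Merton equations for (V, sigma_V)."""
    V = E + B                    # starting guess: V = E + B
    sigma_V = sigma_E * E / V    # starting guess: de-levered equity volatility
    for _ in range(max_iter):
        d1, d2 = d1_d2(V, B, sigma_V, r, T)
        V_new = (E + B * math.exp(-r * T) * norm_cdf(d2)) / norm_cdf(d1)
        sigma_V_new = sigma_E * E / (norm_cdf(d1) * V_new)
        converged = abs(V_new - V) < tol * V and abs(sigma_V_new - sigma_V) < tol
        V, sigma_V = V_new, sigma_V_new
        if converged:
            break
    return V, sigma_V


def distance_to_default(V, B, sigma_V, r, T):
    """ASSUMPTION: DtD is taken to be d2; the slides do not give a formula."""
    return d1_d2(V, B, sigma_V, r, T)[1]


# Hypothetical inputs: market cap, equity volatility, debt, risk-free rate, horizon.
E, sigma_E, B, r_f, T = 4000.0, 0.40, 6000.0, 0.07, 1.0
V_hat, sigma_V_hat = solve_assets(E, sigma_E, B, r_f, T)
print("estimated V      :", V_hat)
print("estimated sigma_V:", sigma_V_hat)
print("DtD (as d2)      :", distance_to_default(V_hat, B, sigma_V_hat, r_f, T))
```

A quick sanity check is to feed the estimated V and σ_V back into `equity_value` and confirm that the observed E is recovered.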
From Merton to KMV
- Merton (1974) showed how to use the Black-Scholes formula to calculate the market value of the firm’s assets $V$ from the equity price $E$ and the equity volatility $\sigma_E$.
- This was not immediately adopted in the financial industry.
- The KMV Corporation was a company created by three economics professors in the 1990s. They created the first market product using the Merton model to calculate the Distance to Default (DtD) for an individual firm.
- If DtD is 0 (or negative), then the firm is bankrupt. The larger the DtD, the further the firm is from defaulting on a payment, or bankruptcy.
- KMV then mapped the DtD for all U.S. firms to estimates of default probabilities, and published these results as newsletters that were sold for a subscription fee.
- KMV Corp. was bought out by Moody’s in 2002 for USD 210 million.
Using bonds instead of equity?
- The same approach used to estimate $V$ from the equity price can also be applied to get $V$ from the bond price of the firm.
- But most firms have more active speculative trading in their shares than in their bonds.
- A real world example: The stock price of Enron had dropped to $3 long before the credit rating on its bonds dropped below investment grade.
Applying the Merton model to an Indian firm
An example of DtD calculation: TISCO
The inputs used to calculate DtD at a given point in time are:
Input                      Symbol
Marketcap                  $E$
Debt                       $B$
Volatility                 $\sigma_E$
Risk free interest rate    $r_f$
Using these values, we estimate $V$ and DtD for TISCO as:
              $\hat{V}$    DtD
March 1990    2924         2.7
March 2000    9523         1.3
[Figure: Marketcap of TISCO, 1990-2007. Vertical axis: Tata Steel market cap, 0 to 60000.]
[Figure: DtD of TISCO, 1990-2007. Vertical axis: Tata Steel DtD, 0 to 4.]
[Figure: Leverage of TISCO, 1990-2007. Vertical axis: Tata Steel leverage, 2 to 6.]
Testing the Merton model versus credit rating agencies
- Credit rating upgrade/downgrade events are widely used in the real world.
- Broadly speaking, a good deal of measurement has been done on the average failure probabilities associated with each rating category.
- Rating agencies are routinely criticised for reacting ‘too late’ (e.g. East Asia, Enron).
- Corporate bond prices do drop when ratings drop, so a tool for predicting rating downgrades has practical use.
- Hence we pose the question: Does Merton’s model guide us on future credit rating changes?
Estimating p from DtD in India
- We have a simple and ready way to calculate the value of firm assets $V$ if the firm is listed and we observe the market price of its equity.
- We can calculate how close the firm is to default.
- This is a real-time measure of the creditworthiness of the firm.
- However, so far, it is only a relative measure of the creditworthiness of the firm: we still do not have a probability of default $p$.
- To map the DtD to $p$, we need a well-established history of known firm defaults and non-defaults, as well as their related DtD values.
- However, such a database is not well established or widely available in India today.
- One source of default data is the banks. But this data is currently privately held, and can only be operationalised internally.
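If such a history were available, the simplest mapping from DtD to p would be an empirical default frequency per DtD bucket. The sketch below illustrates the idea only. The history is entirely hypothetical, and the bucket width and function name are assumptions.

```python
from collections import defaultdict

# Hypothetical history: (DtD at the start of the year, defaulted within the year?).
history = [
    (0.4, True), (0.8, True), (0.9, False),
    (1.5, False), (1.7, True),
    (2.6, False),
    (3.2, False), (3.9, False),
]


def empirical_pd_by_bucket(history, width=1.0):
    """Default frequency in each DtD bucket [k*width, (k+1)*width)."""
    counts = defaultdict(lambda: [0, 0])  # bucket index -> [defaults, firms]
    for dtd, defaulted in history:
        bucket = int(dtd // width)
        counts[bucket][0] += int(defaulted)
        counts[bucket][1] += 1
    # Key each bucket by its lower edge.
    return {b * width: d / n for b, (d, n) in sorted(counts.items())}


pd_table = empirical_pd_by_bucket(history)
for lower_edge, pd in pd_table.items():
    print(f"DtD in [{lower_edge:.0f}, {lower_edge + 1:.0f}): p = {pd:.2f}")
```

With a real database each bucket would need enough firm-years for the frequency to be a stable estimate, which is the data requirement the slide points to.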
Homework
- What variables are available in the public domain that indicate the financial health of your chosen firm?