EXTREME RISK INITIATIVE —NYU SCHOOL OF ENGINEERING WORKING PAPER SERIES
Fat Tails, Model Uncertainty and the Law of Very Large Numbers
Nassim Nicholas Taleb
School of Engineering, New York University
This is extracted from Chapter 6 of Silent Risk. This chapter is in progress and comments are welcome. It has been slowed
down by the author’s tinkering with explicit expressions for partial expectations of asymmetric alpha-stable distributions and
the accidental discovery of semi-closed form techniques for assessing convergence for convolutions of one-tailed power laws.
Contents

A. The "Pinker Problem"
I. The Problem of Matching Errors
II. Generalizing Mean Deviation as Partial Expectation
III. Class of Stable Distributions
   III-A. Results (III-A1. Relative convergence; III-A2. Speed of convergence)
   III-B. Stochastic Alpha or Mixed Samples
IV. Symmetric NonStable Distributions in the Subexponential Class
   IV-A. Symmetric Mixed Gaussians, Stochastic Mean
   IV-B. Half cubic Student T (Lévy Stable Basin)
   IV-C. Cubic Student T (Gaussian Basin)
V. Asymmetric NonStable Distributions in the Subexponential Class
   V-A. One-tailed Pareto Distributions
   V-B. The Lognormal and Borderline Subexponential Class
VI. Asymmetric Distributions in the Superexponential Class
   VI-A. Mixing Gaussian Distributions and Poisson Case
   VI-B. Skew Normal Distribution
   VI-C. Super-thin tailed distributions: Subgaussians
VII. Acknowledgement
Appendix: Methodology, Proofs, Etc.
   A. Cumulants
   B. Derivations using explicit E(|X|)
   C. Derivations using the Hilbert Transform and β = 0
References
You observe data and get some confidence that the average is represented by the sample thanks to a standard metrified "n". Now what if the data were fat tailed? How much more do you need? What if the model were uncertain – we had uncertainty about the parameters or the probability distribution itself?

Main Results: In addition to explicit extractions of partial expectations for alpha-stable distributions, one main result in this paper is the expression of how uncertainty about parameters (in terms of parameter volatility) translates into a larger (or smaller) required n. Model Uncertainty: The practical import is that model uncertainty worsens inference, in a quantifiable way.
Fig. 1: How thin tails (Gaussian) and fat tails (1 < α ≤ 2) converge to the mean.
A. The "Pinker Problem"

It is also necessary to debunk a fallacy: with commonly discussed fat-tailed processes we simply do not have enough data to naively estimate a sum and make series of claims about the stability of systems, the pathology of people reacting to risks, etc. A surprising result: for the case with tails equivalent to the "Pareto 80/20 rule" (a tail exponent α = 1.16) one needs 10^11 times more data than the Gaussian.
Take a certain sample size in the conventional Gaussian domain, say n = 30, or some other such heuristically used number. Assuming we are comfortable with such a number of summands, how much larger (or smaller) an n does one need for the same error under a different process? And how do we define errors in the absence of a standard deviation, which might not exist (power laws with exponents close to 2), or be too unreliable (power laws with exponents > 2, that is, finite variance but infinite kurtosis)?
It is strange that, given the dominant role of fat tails, nobody thought of calculating some practical equivalence table. How can people compare averages concerning street crime (very thin tailed) to casualties from war (very fat tailed) without some sample adjustment?1
Perhaps the problem lies at the core of the law of large numbers: the average is not as "visible" as other statistical dimensions; there is no sound statistical procedure to derive the properties of power-law-tailed data by estimating the mean – typically estimation is done by fitting the tail exponent (via, say, the Hill estimator or some other method), or by dealing with extrema. Yet many articles still make comparisons about the mean, since the mean is what descriptive statistics and, alas, decisions, are based on.
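As an illustration of the gap (a minimal Monte Carlo sketch, not from the paper; the centered-Pareto stand-in and the trial counts are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(42)

def mad_of_sample_mean(draw, n, trials=2000):
    """Mean absolute deviation of the sample mean around the true mean (0)."""
    return np.abs(draw((trials, n)).mean(axis=1)).mean()

n = 1000
alpha = 1.16  # tail exponent equivalent to the "80/20" rule
# Pareto with minimum 1, centered at its mean alpha/(alpha - 1)
pareto = lambda size: (1 + rng.pareto(alpha, size)) - alpha / (alpha - 1)
gauss = lambda size: rng.standard_normal(size)

err_pareto = mad_of_sample_mean(pareto, n)
err_gauss = mad_of_sample_mean(gauss, n)
print(err_pareto, err_gauss)
```

Even with a thousand summands, the fat-tailed estimate of the mean remains an order of magnitude (or more) less precise than the Gaussian one.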
I. The Problem of Matching Errors

By the weak law of large numbers, consider a sum of random variables X1, X2, ..., Xn, independent and identically distributed with finite mean m, that is E[Xi] < ∞; then (1/n) Σ_{1≤i≤n} X_i converges to m in probability, as n → ∞. And the idea is that we live with finite n.
We get most of the intuitions from closed-form and semi-closed-form expressions working with:
• stable distributions (which allow for a broad span of fat tails by varying the α exponent, along with the asymmetry via the β coefficient);
• stable distributions with mixed α exponent;
• other symmetric distributions with fat tails (such as mixed Gaussians, Gamma-Variance Gaussians, or simple stochastic volatility).
More complicated situations entailing more numerical tinkering are also covered: Pareto classes, lognormal, etc.
Instability of Mean Deviation

Indexing with p the property of the variable X^p and with g the Gaussian X^g:

n_p : E( | Σ_{i=1}^{n_p} (X_i^p − m^p) / n_p | ) = E( | Σ_{i=1}^{n_g} (X_i^g − m^g) / n_g | )    (1)
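The matching exercise in Equation 1 can be sketched numerically; here a Student T with 3/2 degrees of freedom stands in for a fat-tailed process (an illustrative assumption, not the paper's stable-law calculation), and n is doubled until its error matches the Gaussian benchmark:

```python
import numpy as np

rng = np.random.default_rng(0)

def mean_abs_error(draw, n, trials=4000):
    # Estimate E | (1/n) sum_i X_i - m | by Monte Carlo (m = 0 here)
    return np.abs(draw((trials, n)).mean(axis=1)).mean()

n_g = 30
target = mean_abs_error(lambda s: rng.standard_normal(s), n_g)

# Fat-tailed stand-in: Student T with 3/2 degrees of freedom (infinite variance)
t32 = lambda s: rng.standard_t(1.5, s)

# Double n until the fat-tailed error matches the thin-tailed benchmark
n_p = n_g
while mean_abs_error(t32, n_p) > target:
    n_p *= 2
print(n_g, n_p)
```

The required n_p comes out orders of magnitude above n_g = 30, in line with the equivalence tables below.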
1 The Pinker Problem: a class of naive empiricism. It has been named so in reference to the sloppy use of statistical techniques in social science and policy making, based on a theory promoted by the science writer S. Pinker [1] about the drop of violence, a theory that rests on such statistical fallacies, since wars – unlike domestic violence – are fat tailed. But this is a very general problem with the (irresponsible) mechanistic use of statistical methods in social science and biology.
Fig. 2: The ratio of cumulants C2/C1 for a symmetric powerlaw, as a function of the tail exponent α.

And since we know that convergence for the Gaussian happens at speed n^{1/2}, we can compare to the convergence of other classes.
We are expressing in Equation 1 the expected error (that is, a risk function) in L1 as mean absolute deviation from the observed average, to accommodate the absence of variance – but assuming, of course, the existence of the first moment, without which there is no point discussing averages.
Typically, in statistical inference, one uses standard deviations of the observations to establish the sufficiency of n. But in fat tailed data standard deviations do not exist, or, worse, when they do exist, as in power laws with tail exponent > 3, they are extremely unstable, particularly in cases where kurtosis is infinite.
Using mean deviations of the samples (when these exist) doesn't accommodate the fact that fat tailed data hide properties. The "volatility of volatility", or the dispersion around the mean deviation, increases nonlinearly as the tails get fatter. For instance, a stable distribution with tail exponent at 3/2 matched to exactly the same mean deviation as the Gaussian will deliver measurements of mean deviation 1.4 times as unstable as the Gaussian.
Using mean absolute deviation for "volatility", and its mean deviation for the "volatility of volatility", expressed in the L1 norm as the C1 and C2 cumulants:

C1 = E(|X − m|)

C2 = E( |X − E(|X − m|)| )

We can see that matching mean deviations does not go very far in matching cumulants (see Appendix 1).
Further, a sum of Gaussian variables will have its extreme values distributed as a Gumbel, while a sum of fat tailed variables will follow a Fréchet distribution regardless of the number of summands. The difference is not trivial, as shown in the figures: in 10^6 realizations of an average with 100 summands, we can expect to observe maxima > 4000 × the average, while for a Gaussian we can hardly encounter more than 5 ×.
II. Generalizing Mean Deviation as Partial Expectation

It is unfortunate that even if one matches mean deviations, the dispersion of the distributions of the mean deviations (and their skewness) would be such that a "tail" would remain markedly different in spite of a number of summands that allows the matching of the first order cumulant. So we can match the special part of the distribution, the expectation > K or < K, where K can be any arbitrary level.

Let Ψ(t) be the characteristic function of the random variable. Let θ be the Heaviside theta function. Since sgn(x) = 2θ(x) − 1,

Ψ^θ(t) = ∫_{−∞}^{∞} e^{itx} (2θ(x − K) − 1) dx = 2i e^{iKt} / t

And the special expectation becomes, by convoluting the Fourier transforms, where F is the distribution function for x:

∫_K^{∞} x dF(x) = E(X | X>K) P(X > K) = −i (∂/∂t) ∫_{−∞}^{∞} Ψ(t − u) Ψ^θ(u) du |_{t=0}    (2)

Mean deviation becomes a special case of Equation 2: E(|X|) = ∫_μ^{∞} x dF(x) − ∫_{−∞}^{μ} x dF(x).
III. Class of Stable Distributions

Assume alpha-stable the class S of probability distributions that is closed under convolution: S(α, β, μ, σ) represents the stable distribution with tail index α ∈ (0, 2], symmetry parameter β ∈ [−1, 1], location parameter μ ∈ ℝ, and scale parameter σ ∈ ℝ+. The Generalized Central Limit Theorem gives sequences a_n and b_n such that the distribution of the shifted and rescaled sum Z_n = (Σ_i^n X_i − a_n)/b_n of n i.i.d. random variates X_i, the distribution function of which F_X(x) has asymptotes 1 − cx^{−α} as x → +∞ and d(−x)^{−α} as x → −∞, weakly converges to the stable distribution

S_{1<α<2}( α∧2, (c−d)/(c+d), 0, 1 ).

We note that the characteristic functions are real for all symmetric distributions. [We also note that the convergence is not clear across papers [2], but this doesn't apply to symmetric distributions.]

Note that the tail exponent α used in nonstable cases is somewhat, but not fully, different for α = 2, the Gaussian case, where it ceases to be a powerlaw – the main difference is in the asymptotic interpretation. But for convention we retain the same symbol, as it corresponds to the tail exponent, while using it differently in more general non-stable power law contexts.

The characteristic function Ψ(t) of a variable X^α with scale σ will be, using the expression for α > 1 (see Zolotarev [3], Samorodnitsky and Taqqu [4]):

Ψ_α(t) = exp( iμt − |tσ|^α ( 1 − iβ tan(πα/2) sgn(t) ) )

which, for an n-summed variable (the equivalent of mixing with equal weights), becomes:

Ψ_α(t) = exp( iμnt − | n^{1/α} t σ |^α ( 1 − iβ tan(πα/2) sgn(t) ) )
A. Results

Let X^α ∈ S be the centered variable with a mean of zero, X^α = (Y^α − μ). We write Ω(α, β, μ, σ, K) ≡ E(X^α | X^α > K) P(X^α > K) under the stable distribution above. From Equation 2:

E(X | X>K) P(X > K) = (1/2π) ∫_{−∞}^{∞} α σ^α |u|^{α−2} ( 1 + iβ tan(πα/2) sgn(u) ) exp( |uσ|^α ( −1 − iβ tan(πα/2) sgn(u) ) + iKu ) du

with explicit solution:

Ω(α, β, σ, 0) = −(σ/(πα)) Γ(−1/α) ( (1 + iβ tan(πα/2))^{1/α} + (1 − iβ tan(πα/2))^{1/α} )    (3)
and semi-explicit generalized form:

Ω(α, β, σ, K) = σ Γ((α−1)/α) ( (1 + iβ tan(πα/2))^{1/α} + (1 − iβ tan(πα/2))^{1/α} ) / ( 2π (β² tan²(πα/2) + 1) )
+ Σ_{k=1}^{∞} (−1)^k i^k K^k Γ((k+α−1)/α) (β² tan²(πα/2) + 1)^{(1−k)/α} ( (1 + iβ tan(πα/2))^{(k−1)/α} + (1 − iβ tan(πα/2))^{(k−1)/α} ) / ( 2π σ^{k−1} k! )    (4)
Our formulation in Equation 4 generalizes and simplifies the commonly used one from Wolfe [5], from which Hardin [6] got the explicit form, promoted in Samorodnitsky and Taqqu [4] and Zolotarev [3]:

E(|X|) = (2σ/π) Γ(1 − 1/α) ( β² tan²(πα/2) + 1 )^{1/(2α)} cos( tan^{−1}( β tan(πα/2) ) / α )
Which allows us to prove the following statements:
1) Relative convergence: The general case with β ≠ 0: for so and so, assuming so and so, (precisions) etc.,

n_α^β = 2^{α/(1−α)} π^{α/(2−2α)} ( √(n_g) Γ((α−1)/α) ( (1 − iβ tan(πα/2))^{1/α} + (1 + iβ tan(πα/2))^{1/α} ) )^{α/(α−1)}    (5)

with alternative expression:

n_α^β = π^{α/(2−2α)} ( √(n_g) Γ((α−1)/α) sec( tan^{−1}( β tan(πα/2) ) )^{1/α} / sec( tan^{−1}( β tan(πα/2) ) / α ) )^{α/(α−1)}    (6)

Which in the symmetric case β = 0 reduces to:

n_α = π^{α/(2−2α)} ( √(n_g) Γ((α−1)/α) )^{α/(α−1)}    (7)
2) Speed of convergence: ∀k ∈ ℕ+ and α ∈ (1, 2],

E( | Σ_{i=1}^{kn} (X_i^α − m_α) / (kn) | ) / E( | Σ_{i=1}^{n} (X_i^α − m_α) / n | ) = k^{1/α − 1}    (8)
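The scaling in Equation 8 can be checked by simulation; a Student T with 3/2 degrees of freedom serves here as a proxy for a variable in the α = 3/2 stable basin (the identity is exact only for stable variables, so the match is approximate):

```python
import numpy as np

rng = np.random.default_rng(1)
alpha, k, n, trials = 1.5, 4, 100, 4000

def err(m):
    # E | sample mean of m iid Student T(3/2) variates |
    return np.abs(rng.standard_t(alpha, size=(trials, m)).mean(axis=1)).mean()

ratio = err(k * n) / err(n)
print(ratio, k ** (1 / alpha - 1))  # theoretical value: 4**(-1/3) ≈ 0.63
```

Note how much slower this is than the Gaussian k^{−1/2}: quadrupling the data shrinks the error by only about a third.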
Table I shows the equivalence of summands between processes.
TABLE I: Corresponding n_α, or how many summands are needed for an equivalent α-stable distribution. The Gaussian case is α = 2. For the case with equivalent tails to the 80/20 one needs 10^11 times more data than the Gaussian.

α      n_α             n_α^{β=±1/2}    n_α^{β=±1}
1      Fughedaboudit   -               -
9/8    6.09 × 10^12    2.8 × 10^13     1.86 × 10^14
5/4    574,634         895,952         1.88 × 10^6
11/8   5,027           6,002           8,632
3/2    567             613             737
13/8   165             171             186
7/4    75              77              79
15/8   44              44              44
2      30              30              30
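The symmetric column of Table I can be reproduced directly from Equation 7 (a quick sketch, with n_g = 30 as the Gaussian benchmark):

```python
import numpy as np
from scipy.special import gamma

def n_alpha(a, n_g=30.0):
    # Equation 7: summands needed for a symmetric alpha-stable process
    # to match the mean absolute error of a Gaussian sample of size n_g
    return np.pi ** (a / (2 - 2 * a)) * (np.sqrt(n_g) * gamma((a - 1) / a)) ** (a / (a - 1))

for a in (1.25, 1.5, 1.75):
    print(a, round(n_alpha(a)))
print(2.0, n_alpha(2.0))  # recovers the Gaussian benchmark n_g = 30
```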
Remark 1. The ratio of mean deviations of distributions in S is homogeneous of degree k^{1/α − 1}. This is not the case for other, "nonstable" classes.

Proof. (Sketch) From the characteristic function of the stable distribution. Other distributions need to converge to the basin S.
B. Stochastic Alpha or Mixed Samples

Define the mixed population X_ᾱ and ξ(X_ᾱ) as the mean deviation of ...

Proposition 1. For so and so,

ξ(X_ᾱ) ≥ Σ_{i=1}^{m} ω_i ξ(X_{α_i})

where ᾱ = Σ_{i=1}^{m} ω_i α_i and Σ_{i=1}^{m} ω_i = 1.
Proof. A sketch for now: ∀α ∈ (1, 2), where γ is the Euler–Mascheroni constant ≈ 0.5772, ψ^{(1)} the first derivative of the Polygamma function ψ(x) = Γ′[x]/Γ[x], and H_n the nth harmonic number:

∂²ξ/∂α² = ( 2σ Γ((α−1)/α) n^{1/α − 1} / (π α⁴) ) ( ψ^{(1)}((α−1)/α) + ( 1 + log(n) + γ − H_{−1/α} ) ( 1 + log(n) + γ + 2α − H_{−1/α} ) )
Fig. 3: Asymmetries and Mean Deviation (E(|X|) as a function of β, for α = 5/4, 3/2, 7/4).

Fig. 4: Mixing distributions (∂²ξ_α/∂α² as a function of α): the effect is pronounced at lower values of α, as tail uncertainty creates more fat-tailedness.
which is positive for values in the specified range, keeping α < 2, beyond which the process would no longer converge to the Stable basin. The expression is also decreasing in α, as can be seen in Figure 4. The implication is that one's sample underestimates the required "n". (Commentary).
IV. Symmetric NonStable Distributions in the Subexponential Class
A. Symmetric Mixed Gaussians, Stochastic Mean

While mixing Gaussians raises the kurtosis, which makes it convenient to simulate fat-tailedness, mixing means has the opposite effect, as if it were more "stabilizing". We can observe a similar effect of "thin-tailedness" as far as the n required to match the standard benchmark. The situation is the result of multimodality, noting that stable distributions are unimodal (Ibragimov and Chernin [7]) and infinitely divisible (Wolfe [8]). For X_i Gaussian with mean μ,

E(|X|) = μ erf( μ/(√2 σ) ) + √(2/π) σ e^{−μ²/(2σ²)},

and we keep the average at μ ± δ with probability 1/2 each. With the perfectly symmetric case μ = 0 and sampling with equal
probability:

(1/2) ( E_{+δ} + E_{−δ} ) = δ erf( δ/(√2 σ) ) + √(2/π) σ e^{−δ²/(2σ²)}

Fig. 5: Different speed: the fatter tailed processes are not just more uncertain; they also converge more slowly.
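The half-sum above is the mean absolute deviation of a folded normal; a quick check against simulation of the ±δ mixture (δ = σ = 1 chosen arbitrarily for illustration):

```python
import numpy as np
from scipy.special import erf

rng = np.random.default_rng(3)
delta, sigma, N = 1.0, 1.0, 500_000

# Closed form above: mean absolute deviation of the +/- delta Gaussian mixture
closed = delta * erf(delta / (np.sqrt(2) * sigma)) \
    + np.sqrt(2 / np.pi) * sigma * np.exp(-delta**2 / (2 * sigma**2))

sign = rng.choice([-1.0, 1.0], size=N)
x = sign * delta + rng.normal(0.0, sigma, size=N)
print(closed, np.abs(x).mean())
```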
B. Half cubic Student T (Lévy Stable Basin)

Relative convergence:

Theorem 1. For all so and so, (details), etc.,

c1 ≤ E( | Σ_{i=1}^{kn} (X_i^α − m^α) / (kn) | ) / E( | Σ_{i=1}^{n} (X_i^α − m^α) / n | ) ≤ c2    (9)

where:

c1 = k^{1/α − 1}

c2 = 2^{7/2} π^{1/2} / Γ(−1/4)²

Note that, because of the instability of distributions outside the basin, they end up converging to S_{min(α,2)}; so at k = 2, n = 1, Equation 9 becomes an equality, and as k → ∞ we satisfy the equalities in ?? and 8.

Proof. (Sketch) The characteristic function for α = 3/2 (K_{3/4} being the modified Bessel function of the second kind):

Ψ(t) = ( 3^{3/8} |t|^{3/4} / ( 2^{1/8} Γ(3/4) ) ) K_{3/4}( √(3/2) |t| )

leading to the convoluted density p2 for a sum n = 2:

p2(x) = ( Γ(5/4) / ( 2√3 Γ(3/4)² Γ(7/4) ) ) ₂F₁( 5/4, 2; 7/4; −2x²/3 )
Fig. 6: Student T with exponent = 3: |(1/n) Σ^n x_i| as a function of n. This applies to the general class of symmetric power law distributions.
C. Cubic Student T (Gaussian Basin)

Student T with 3 degrees of freedom (the higher exponent resembles the Gaussian). We can get a semi-explicit density for the Cubic Student T:

p(x) = 6√3 / ( π (x² + 3)² )

and we have the characteristic function:

ϕ(t) = E[e^{itX}] = (1 + √3 |t|) e^{−√3 |t|}

hence the n-summed characteristic function is:

ϕ(t) = (1 + √3 |t|)^n e^{−n√3 |t|}

and the pdf of Y is given by:

p(x) = (1/π) ∫_0^{+∞} (1 + √3 t)^n e^{−n√3 t} cos(tx) dt

Using

∫_0^{∞} t^k e^{−t} cos(st) dt = T_{1+k}( 1/√(1+s²) ) k! / (1 + s²)^{(k+1)/2}

where T_a(x) is the T-Chebyshev polynomial,2 the pdf p(x) can be written:

p(x) = ( (n² + x²/3)^{(−n−1)/2} / (√3 π) ) Σ_{k=0}^{n} (n² + x²/3)^{(n−k)/2} T_{k+1}( 1/√( x²/(3n²) + 1 ) ) n! / (n − k)!
which allows explicit solutions for specific values of n, but not for the general form:

{E_n}_{1≤n<∞} = { 2√3/π, 3√3/(2π), 34/(9√3 π), 71√3/(64π), 3138√3/(3125π), 899/(324√3 π), 710162√3/(823543π), 425331√3/(524288π), 33082034/(14348907√3 π), 5719087√3/(7812500π), ... }

2 With thanks to Abe Nassen and Jack D'Aurizio on Math Stack Exchange.
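The first terms of the sequence can be verified by Monte Carlo on the mean of n Student T(3) variates (a sketch; the trial count is arbitrary):

```python
import numpy as np

rng = np.random.default_rng(5)

def E_n(n, trials=400_000):
    # E | (1/n) sum of n iid Student T(3) variates |, by Monte Carlo
    return np.abs(rng.standard_t(3, size=(trials, n)).mean(axis=1)).mean()

print(E_n(1), 2 * np.sqrt(3) / np.pi)        # first exact term
print(E_n(2), 3 * np.sqrt(3) / (2 * np.pi))  # second exact term
```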
Fig. 7: Sums of bets (p = 1/2, p = 1/8) converge rapidly to the Gaussian basin but remain clearly subgaussian for small samples.

Fig. 8: For asymmetric binary bets, such as betting against the long shot (p = 1/100), at small values of p, convergence is slower.
V. Asymmetric NonStable Distributions in the Subexponential Class

A. One-tailed Pareto Distributions

B. The Lognormal and Borderline Subexponential Class

VI. Asymmetric Distributions in the Superexponential Class

A. Mixing Gaussian Distributions and Poisson Case

B. Skew Normal Distribution
This is the most intractable case mathematically, though apparently the most prevalent when we discuss fat tails [9].
C. Super-thin tailed distributions: Subgaussians

Consider a sum of Bernoulli variables X. The sum Σ_n ≡ Σ_{i≤n} x_i follows a Binomial Distribution. Assuming np ∈ ℕ+ to simplify:

E(|Σ_n|) = −2 Σ_{0≤x≤np} (x − np) C(n, x) p^x (1 − p)^{n−x}
E(|Σ_n|) = −2 (1 − p)^{n(1−p)−2} p^{np+1} Γ(np + 2) ( (p − 1) C(n, np+1) λ₁ − p (np + 2) C(n, np+2) λ₂ )

where (₂F̃₁ being the regularized hypergeometric function):

λ₁ = ₂F̃₁( 1, n(p − 1) + 1; np + 2; p/(p−1) )

and

λ₂ = ₂F̃₁( 2, n(p − 1) + 2; np + 3; p/(p−1) )
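The identity behind the sum (twice the lower partial expectation equals the mean absolute deviation around np, since deviations around the mean cancel) can be checked directly for a small case; n = 20, p = 1/4 are chosen so that np is an integer, as assumed:

```python
from math import comb

n, p = 20, 0.25  # np = 5 is a positive integer, as assumed
m = n * p

pmf = [comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(n + 1)]
direct = sum(abs(x - m) * pmf[x] for x in range(n + 1))
# the text's identity: -2 times the lower partial sum of (x - np) weights
partial = -2 * sum((x - m) * pmf[x] for x in range(int(m) + 1))
print(direct, partial)
```

The two quantities coincide up to floating-point error.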
VII. Acknowledgement

Colman Humphrey,...

Appendix: Methodology, Proofs, Etc.

A. Cumulants
We have in the Gaussian case indexed by g:

C2^g = ( erf(1/√π) + e^{−1/π} ) C1^g

which is ≈ 1.30 C1^g.
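The ratio can be recomputed from the folded-normal formula E|X − a| = 2φ(a) + a erf(a/√2) for X ~ N(0,1) (a standard identity, applied with a = E|X|):

```python
import numpy as np
from scipy.special import erf
from scipy.stats import norm

c1 = np.sqrt(2 / np.pi)          # C1 = E|X| for X ~ N(0, 1)
a = c1
c2 = 2 * norm.pdf(a) + a * erf(a / np.sqrt(2))  # C2 = E| X - E|X| |
print(c2 / c1)  # equals erf(1/sqrt(pi)) + exp(-1/pi) ≈ 1.30
```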
For a powerlaw distribution, cumulants are more unwieldy:

C1^{α=3/2} = 2√(6/π) ( Γ(5/4) / Γ(3/4) ) σ

where Γ₁ = Γ(3/4), Γ₂ = Γ(5/4), Γ₃ = Γ(1/4), H₁ = ₂F₁( 3/4, 5/4; 7/4; −πΓ₁²/Γ₃² ), and H₂ = ₂F₁( 1/2, 5/4; 3/2; −Γ₃²/(πΓ₁²) ):

C2^{α=3/2} = ( σ / ( 2√6 π^{3/2} Γ₁³ (πΓ₁² + Γ₃²)^{5/4} ) ) ( 384 π^{9/2} Γ₂⁴ √(πΓ₁² + Γ₃²) Γ₁ H₁ + 24 π^{9/4} Γ₁³ Γ₂³ √3 √(πΓ₁² + Γ₃²) (2Γ₁² + 3Γ₃²)(H₂ + 2) − 2√2 π^{3/4} H₁ + 1536 Γ₂⁵ √(πΓ₁² + Γ₃²) H₂ + π^{9/2} )
B. Derivations using explicit E(|X|)

See Wolfe [5], from which Hardin [6] got the explicit form.
C. Derivations using the Hilbert Transform and β = 0

Section obsolete since I found forms for asymmetric stable distributions. Some commentary on Hilbert transforms for symmetric stable distributions, given that for Z = |X|, dF_Z(z) = dF_X(x)(1 − sgn(x)), that type of thing.

The Hilbert transform for a function f (see Hlusek [10], Pinelis [11]):

H(f) = (1/π) p.v. ∫_{−∞}^{∞} f(x) / (t − x) dx

Here p.v. means principal value in the Cauchy sense, in other words

p.v. ∫_{−∞}^{∞} = lim_{a→∞} lim_{b→0} ( ∫_{−a}^{−b} + ∫_{b}^{a} )

We have:

E(|X|) = (∂/∂t) H(Ψ(0)) = (1/π) (∂/∂t) p.v. ∫_{−∞}^{∞} Ψ(z) / (t − z) dz |_{t=0}

E(|X|) = (1/π) p.v. ∫_{−∞}^{∞} Ψ(z) / z² dz

In our case:

E(|X|) = −(1/π) p.v. ∫_{−∞}^{∞} e^{−|tσ|^α} / t² dt = (2/π) Γ( (α−1)/α ) σ
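A numeric check of the last identity, using the regularized form (2/π) ∫₀^∞ (1 − Ψ(t))/t² dt of the principal-value integral (an equivalent rewriting, assumed here for numerical stability):

```python
import numpy as np
from scipy.integrate import quad
from scipy.special import gamma

alpha, sigma = 1.5, 1.0
f = lambda t: (1 - np.exp(-((sigma * t) ** alpha))) / t**2

# split at t = 1 to help quad with the t**(-1/2) behavior near 0
numeric = (2 / np.pi) * (quad(f, 0, 1)[0] + quad(f, 1, np.inf)[0])
closed = (2 / np.pi) * gamma((alpha - 1) / alpha) * sigma
print(numeric, closed)
```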
References

[1] S. Pinker, The Better Angels of Our Nature: Why Violence Has Declined. Penguin, 2011.
[2] V. V. Uchaikin and V. M. Zolotarev, Chance and Stability: Stable Distributions and Their Applications. Walter de Gruyter, 1999.
[3] V. M. Zolotarev, One-Dimensional Stable Distributions. American Mathematical Soc., 1986, vol. 65.
[4] G. Samorodnitsky and M. S. Taqqu, Stable Non-Gaussian Random Processes: Stochastic Models with Infinite Variance. CRC Press, 1994, vol. 1.
[5] S. J. Wolfe, "On the local behavior of characteristic functions," The Annals of Probability, pp. 862–866, 1973.
[6] C. D. Hardin Jr., "Skewed stable variables and processes," DTIC Document, Tech. Rep., 1984.
[7] I. Ibragimov and K. Chernin, "On the unimodality of stable laws," Theory of Probability & Its Applications, vol. 4, no. 4, pp. 417–419, 1959.
[8] S. J. Wolfe, "On the unimodality of infinitely divisible distribution functions," Probability Theory and Related Fields, vol. 45, no. 4, pp. 329–335, 1978.
[9] I. Zaliapin, Y. Y. Kagan, and F. P. Schoenberg, "Approximating the distribution of Pareto sums," Pure and Applied Geophysics, vol. 162, no. 6–7, pp. 1187–1228, 2005.
[10] M. Hlusek, "On distribution of absolute values," 2011.
[11] I. Pinelis, "On the characteristic function of the positive part of a random variable," arXiv preprint arXiv:1309.5928, 2013.