Inefficient Markets VII: predictability
Damien Challet
[email protected]
October 29, 2014

Previous episodes
- Adaptive Complex Market Hypothesis
- Models and mathematical methods for: heterogeneity, learning, interaction, price predictability

Dynamics of predictability
- Geanakoplos & Farmer (2008)
- Persistence?

Predictability: where from?
- Wrong backtesting
- Finite buying power
- Market impact
- Behavioural biases
- Laziness or ingenuity
- Backtesting difficulties
- The Rise of the Machines
- Legal constraints
- Trading constraints
- Who cares?

From ideas to trading
1. Data
2. Backtesting
3. Portfolio of strategies
4. Build portfolio draft
5. Risk management → final portfolio
6. Trading

What can go wrong
1. Data
2. Backtesting
3. Portfolio of strategies
4. Build portfolio draft
5. Risk management → final portfolio
6. Trading

How Science works
1. Acquire data
2. Lots of data
3. Find patterns: graphs, statistics
4. Design a model, test it, validate it
5. New data that invalidate the model: find a new model

How Making Money works
1. Acquire data
2. Lots of data
3. Find patterns: graphs, statistics
4. Design a trading strategy
5. Test it, validate it
6. New data that invalidate the strategy: find a new strategy

Theory vs practice
Theoretical [mathematical] finance
1. Assumptions, mostly wrong
2. Fancy theorems
3. Works some of the time
NOTA BENE: Always respect the nature of real markets

Wrong assumptions: example I
GARCH models
- fit a GARCH model to real data: find negative volatility
- jumps
- long-range memory
Solution:
- other volatility models (Heston, etc.)
- include long-range memory: FIGARCH, Zumbach & Lynch

Example I: volatility response function
Power-law decay, bumps at typical human time scales
Taken from Zumbach and Lynch (2001)

Example I: volatility mugshots
Time reversal asymmetry: volatility at large time horizons influences smaller time horizons
Taken from Borland et al. (2005)

Wrong assumptions: example II
Risk management
- cf. the 2007, 2008 crisis
Speculation
- wrong tools: EMA?
- missed opportunities
- you are the fish: cf. option pricing

Nature of markets: Random walks
- Price returns: $\log p_{t+1} = \log p_t + r_{t+1}$
- IID returns: $p_t \sim$ random walk
- Log-normal RW: $P(\log p_t) = \frac{1}{\sqrt{2\pi t}\,\sigma}\, e^{-\frac{(p_t - p_1)^2}{2\sigma^2 t}}$ (in the exponent and in the diffusion laws, $p_t$ stands for the log-price)
- Diffusion: $E[(p_t - p_1)^2] \propto \sigma^2 t$ and $E[|p_t - p_1|] \propto \sigma\sqrt{t}$

Predictability detection
- Biased Gaussian random walk ($y_t = x_t r_{t,1}$), $E_1(y) = \mu$:
  $\sum_{t=1}^{\Delta t} y_t \simeq \mu\,\Delta t \pm \sigma\sqrt{\Delta t}$
- Time needed to detect bias:
  $\mu\,\Delta t \ge K\,\sigma\sqrt{\Delta t}$, hence $\sqrt{\Delta t} \ge K\,\frac{\sigma}{\mu} \propto \frac{1}{S}$
- Statistics point of view: $\sqrt{\Delta t}\,\frac{\mu}{\sigma} \sim$ t-stat

Sharpe ratio
Sharpe ratio (1994)
Sharpe ratio ≡ signal-to-noise ratio
- Raw: $S = \frac{E_{t_0 \to t_1}(x r)}{\sqrt{\mathrm{var}(x r)}}$
- With respect to benchmark $B$: $S_B = \frac{E_{t_0 \to t_1}(x r - r_B)}{\sqrt{\mathrm{var}(x r - r_B)}}$, where $r_B$ is the benchmark return

Improving a Sharpe ratio: compressed sensing
From wired.com/magazine/2010/02/ff_algorithm/all/1

Improving a Sharpe ratio: Karhunen-Loeve Transform
From C. Maccone, Deep Space Flight and Communications: Exploiting the Sun as a Gravitational Lens

Nature of markets: mean-reverting and trending
Mizuno et al. (2007)
1. Autoregressive fit: $P_t = \bar P_t + \eta_t$, where $\bar P_t = \sum_{k=1}^{K} w_k P_{t-k}$ is the optimal moving average and $E(\eta_t \eta_{t'}) = \sigma_\varepsilon\,\delta_{t,t'}$
   $w_k$: Yule-Walker equations, ensures uncorrelated noise
2. Moving average of moving average: $\Phi_t = \frac{1}{M}\sum_{\tau=0}^{M-1} \bar P_{t-\tau}$
3. Is $\bar P$ attracted or repulsed by $\Phi_t$?
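Steps 1 and 2 of the Mizuno et al. recipe can be sketched numerically. This is a minimal sketch under stated assumptions: the weights $w_k$ are obtained by ordinary least squares on lagged prices, used here as a stand-in for solving the Yule-Walker equations, and the function names `optimal_moving_average` and `super_moving_average` are hypothetical, not taken from the slides or the paper.

```python
import numpy as np

def optimal_moving_average(p, K):
    """Fit P_t ~ sum_{k=1..K} w_k P_{t-k} by least squares (assumed stand-in
    for Yule-Walker). Returns the weights w and the fitted series P_bar,
    aligned with p[K:]."""
    p = np.asarray(p, dtype=float)
    n = len(p)
    # Row for time t (t = K .. n-1) holds P_{t-1}, ..., P_{t-K}
    X = np.column_stack([p[K - k: n - k] for k in range(1, K + 1)])
    y = p[K:]
    w, *_ = np.linalg.lstsq(X, y, rcond=None)
    return w, X @ w

def super_moving_average(p_bar, M):
    """Phi_t = (1/M) * sum_{tau=0..M-1} P_bar_{t-tau} (trailing average).
    Entry j of the result corresponds to time index j + M - 1 of p_bar."""
    kernel = np.ones(M) / M
    return np.convolve(np.asarray(p_bar, dtype=float), kernel, mode="valid")

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    prices = 100 + np.cumsum(rng.normal(0.0, 1.0, 2000))  # synthetic random walk
    w, p_bar = optimal_moving_average(prices, K=5)
    phi = super_moving_average(p_bar, M=10)
    print("weights:", np.round(w, 3))
    print("last P_bar - Phi:", p_bar[-1] - phi[-1])
```

The sign and size of `p_bar - phi` is the quantity whose relation to the next price move answers question 3, whether the smoothed price is attracted or repulsed by $\Phi_t$.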
Detecting mean-reverting and trending
Empirical fact, with $\bar P_t$ the optimal moving average of the price:
  $\bar P_{t+1} - \bar P_t = -\frac{b_t}{M-1}\,(\bar P_t - \Phi_t) + f_t$, where $E_t(f) = 0$

Detecting mean-reverting and trending
The same fact written as a force deriving from a quadratic potential:
  $\bar P_{t+1} - \bar P_t = -\frac{b_t}{M-1}\,\frac{1}{2}\,\frac{d}{d\bar P_t}(\bar P_t - \Phi_t)^2 + f_t$, where $E_t(f) = 0$

Detecting mean-reverting and trending
- measure of $b_t$: long memory
- $b_t < 0$ → over-diffusion at short times
- $b_t > 0$ → under-diffusion at short times
- c.f. variance ratio tests

Always respect the underlying process
Over/under-diffusive processes (approximate generalization)
- Biased Gaussian random walk ($y_t = x_t r_{t,1}$), $E_1(y) = \mu$:
  $\sum_{t=1}^{\Delta t} y_t \simeq \mu\,\Delta t \pm (\sigma\,\Delta t)^H$
- Time needed to detect bias:
  $\mu\,\Delta t \ge K\,(\sigma\,\Delta t)^H$, hence $(\Delta t)^{1-H} \ge K\,\frac{\sigma^H}{\mu} \propto \frac{1}{S}$
- Statistics point of view: $(\Delta t)^{1-H}\,\frac{\mu}{\sigma^H} \sim$ t-stat

How to measure H?
- Hurst exponent: tricky
- R: more than 7 methods
- 8th: Mizuno
- Variance ratio tests
- $H$ determines the style of strategies (trend following vs mean-reverting)

Strategy: definition
A trading strategy $x_t$
- $x_t$ = position to hold at time $t$
- Simplest case: $x_t \in \{-1, 0, +1\}$
- Example: signal $s_t$,
  $x_t = +1$ if $s_t > +\theta$, $x_t = -1$ if $s_t < -\theta$
  $\theta = 0$ first.
- Several signals $s_{i,t}$ (BEWARE):
  $x_t = \sum_{i,k} a_{i,k}\, s_{i,t-k}$
  OR
  $x_t = \sum_i \left[\Theta(s_{i,t} - \theta) - \Theta(\theta - s_{i,t})\right]$
- Backtest gain: $x_t r_t$

Strategy goodness: measures
Backtesting: "what if?"
- $x_t$: strategy, position at time $t$
- log-gain from $t_0$ to $t_1$: $G_{t_0 \to t_1} = \sum_{t=t_0}^{t_1} x_t r_{t,1}$
Performance measures of $x$
- percentage of positive trades: $\phi = \frac{1}{t_1 - t_0}\sum_{t=t_0}^{t_1} \Theta(x_t r_{t,1})$, with $\Theta$ the Heaviside step function
- gain ratio: $\gamma = \frac{\sum_{t=t_0}^{t_1} x_t r_{t,1}}{\sum_{t=t_0}^{t_1} |x_t r_{t,1}|}$

Backtesting
- Freeman, J. D. Behind the smoke and mirrors: Gauging the integrity of investment simulations. Financial Analysts Journal (1992), 26–31.
- Leinweber, D. J. Stupid data miner tricks: overfitting the S&P 500. The Journal of Investing 16, 1 (2007), 15–22.

Backtesting
Methodological problem?
IN sample only: Leinweber, Nerds on Wall Street: Math, Machines and Wired Markets (2009)
- Bangladesh butter production with S&P500 next returns: $R^2 = 0.75$. If it was up 1%, the S&P 500 was up 2% the next year. Conversely, if butter production was down 10%, you could predict the S&P 500 would be down 20%.
- Bangladesh butter production + US cheese production: $R^2 = 0.95$
- Bangladesh butter production + US cheese production + Bangladesh sheep population: $R^2 = 0.99$
OUT sample: $R^2 = 0$: OVERFITTING

Backtesting
IN and OUT samples
- Sliding windows
- Alternate windows
Add
- Transaction costs
- Market impact
- Other costs
Also
- Wrong data
- Subtly wrong data
- Trading problems

Google Trends: Search Volume Index

GT: example

Google Hedge Fund
In "Googled: The End of the World As We Know It", Ken Auletta
- Sergey Brin: "We should run a hedge fund."
- Eric Schmidt: "Sergey, among your many ideas, this is the worst"
- Sergey Brin: "No, we can do it because we have so much information."
- Eric Schmidt: "[...] legal complications [...] NO!"

GT and predictability: claims
Keywords: ticker, company names
- Bordino et al. 2011: increase in SVI → increase of traded volume
- Da et al. 2013 [2004-2008]: increase in SVI → higher stock prices in the next 2 weeks
- Joseph et al. 2011 [2005-2008]: increase in SVI → higher stock prices in the next week
- Takeda et al. 2013 [2008-2011]: weak for future returns, strong for future volume
- Kristoufek 2013 [2004-2013]: portfolio weight $\sim \mathrm{SVI}^{-\alpha}$
- Preis et al. 2013 [2004-2011]: fancy keywords; relative increase in SVI → lower index in the next week

Counter-example

A practitioner point of view
1. Trading strategies
2. Backtest period
3. Assets
4. Keywords
5. Download GT data
6. Timescale of returns
7. Parameters
8. Input GT data only
9. Input past returns only
10. Input both
11. Compare

Prediction: past returns vs GT data
- Nowcasting: Choi and Varian (2009)
- Forecasting: Da et al. (2013)

1. Trading strategies
- Linear methods
- Conditional predictability
- Ensemble learning methods

2+3 Backtest period, assets
2. Backtest period
- Whole period
- Sliding in/out-of-sample periods
3. Choice of assets
- Index components: S&P100

4. Keywords
Recipe for disaster:
1. Think of finance-related keywords: finance, debt, CDS, bonds, crisis
2. Use Google Sets: finance → marketing, real estate, insurance, accounting, debt consolidation, investing, [...]

4. Keywords: example
Preis et al. (2013): contrarian strategy

4. Keywords: null hypothesis?
1. 100 classic cars
2. 100 classic arcade video games
3. 200 classic illnesses/ailments

  illnesses               t-stat   cars                 t-stat   arcade games      t-stat
  multiple sclerosis       -2.1    Chevrolet Impala      -1.9    Moon Buggy         -2.1
  muscle cramps            -1.9    Triumph 2000          -1.9    Bubbles            -2.0
  premenstrual syndrome    -1.8    Jaguar E-type         -1.7    Rampage            -1.7
  alopecia                  2.2    Iso Grifo              1.7    Street Fighter      2.3
  gout                      2.2    Alfa Romeo Spider      1.7    Crystal Castles     2.4
  bone cancer               2.4    Shelby GT 500          2.4    Moon Patrol         2.7

4. Keywords
[Figure, IN SAMPLE: t-stat (from -1 to 3) versus k (from 0 to 200); labelled points: "debt", "Moon Patrol"]

4. Keywords: example
[Figure, four panels: Cars, Games, Illnesses, Preis et al.; each shows cumulated performance from 2004 to 2012]

4. Keywords
KISS:
- Symbols
- Company names
- Key products

5. GT data
1. Weekly
2. Starts in 2004
3. Data not available before 2008-08
4. File format change in 2012-01

  before                     after
  Nov 27 2005, 1.14, 5%      2005-11-27 - 2005-12-03,31
  Dec 4 2005, 1.00, 5%       2005-12-04 - 2005-12-10,28

5. GT daily data
[Figure, two panels: AAPL 20081201−20090430, values on a 20 to 100 scale, January to May]

5. GT daily data: delay
(downloaded 2014-01-20, 09:02:00 UTC)

Prediction: binary inputs
[Figure: cumulated performance, 2006 to 2014, for three input sets: GT + returns, GT, returns]

Backtest: GT + returns
[Figure, "GT data + price returns": performance, net exposure, gross exposure and number of stocks, 2006 to 2012]

Prediction: GT data only
[Figure, "GT data": performance, net exposure, gross exposure and number of stocks, 2006 to 2012]

Prediction: returns only
[Figure, "Price returns": performance, net exposure, gross exposure and number of stocks, 2006 to 2012]

Prediction: comparison
[Figure, two columns: "GT data" and "Price returns"; each shows performance, net exposure, gross exposure and number of stocks, 2006 to 2012]

Market state: Clustering
- $N$ objects, $i = 1, \cdots, N$
- $T$ properties $x_{i,t}$, $t = 1, \cdots, T$
- Normalisation: $E(x_i) = 0$, $E(x_i^2) = 1$
- Group objects in $K$ clusters, $s_i \in \{1, \cdots, K\}$
- Similarity measure?

Clustering
K-means
- Fix $K$
- Find $s_i \in \{1, \cdots, K\}$ that minimise the cost function
  $H = \sum_s \sum_i \delta_{s_i,s}\,(X_s - x_i)^2$
  $X_s = \frac{1}{n_s}\sum_i \delta_{s_i,s}\, x_i$
  $n_s = \sum_i \delta_{s_i,s}$, the number of objects in cluster $s$
- Minimisation?
- Value of $K$?
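One common answer to the "Minimisation?" question is Lloyd's iteration: alternate between assigning each object to its nearest centre and recomputing each centre $X_s$ as the mean of its members. The slides do not say which minimiser is intended, so the choice of Lloyd's iteration, the random initialisation and the function name `kmeans` are assumptions of this sketch.

```python
import numpy as np

def kmeans(x, K, n_iter=100, seed=0):
    """Lloyd's iteration for the cost H = sum_s sum_i delta(s_i, s) |X_s - x_i|^2.

    x: array of shape (N, T), one row per object.
    Returns the labels s (shape (N,)) and the final cost H.
    The result is a local minimum and depends on the initial centres."""
    x = np.asarray(x, dtype=float)
    rng = np.random.default_rng(seed)
    # Initial centres: K distinct objects picked at random (an assumption)
    centres = x[rng.choice(len(x), size=K, replace=False)]
    for _ in range(n_iter):
        # Squared distance of every object to every centre, shape (N, K)
        d2 = ((x[:, None, :] - centres[None, :, :]) ** 2).sum(axis=2)
        s = d2.argmin(axis=1)
        # X_s = mean of the members of cluster s; empty clusters keep their centre
        new_centres = np.array([
            x[s == k].mean(axis=0) if np.any(s == k) else centres[k]
            for k in range(K)
        ])
        converged = np.allclose(new_centres, centres)
        centres = new_centres
        if converged:
            break
    H = ((x - centres[s]) ** 2).sum()
    return s, H

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    data = np.vstack([rng.normal(0, 0.1, (20, 5)), rng.normal(5, 0.1, (20, 5))])
    labels, cost = kmeans(data, K=2)
    print(labels, cost)
```

The sketch leaves the "Value of K?" question open: $H$ decreases mechanically as $K$ grows, so it cannot be used on its own to pick $K$.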
Maximum likelihood clustering
Marsili (2003)
- Correlation matrix $C_{ij} = E(x_i x_j) \ge 0$
- $C_{ij}$ has $O(N^2)$ coefficients (too many)
- Clustering by correlations: dimensionality reduction
- Cluster = objects with roughly the same cross-correlation
- Ansatz: $C$ diagonal by blocks
  $C_{i,j} = 1$ if $i = j$
  $C_{i,j} = c_s$ if $s_i = s_j$, $i \ne j$
  $C_{i,j} = 0$ if $s_i \ne s_j$
- $n_s = \sum_i \delta_{s,s_i}$
- $c_s = \sum_{i,j} \delta_{s,s_i}\,\delta_{s,s_j}\, C_{i,j}$

Clustering
- Stochastic model for $x_{i,t}$:
  $\tilde x_{i,t} = \frac{\sqrt{g_{s_i}}\,\eta_{s_i,t} + \varepsilon_{i,t}}{\sqrt{1 + g_{s_i}}}$, with $\eta$ and $\varepsilon$ iid and $\sim N(0, 1)$
- Cross-correlation inside cluster $s$:
  $C_{i,j} = \frac{g_{s_i}\,\delta_{s_i,s_j} + \delta_{i,j}}{1 + g_{s_i}}$
- Model of time series given by
  $G = \{g_1, \cdots, g_K\}$: how many clusters, correlation
  $S = \{s_1, \cdots, s_N\}$: cluster attribution

Clustering
- Model of time series given by
  $G = \{g_1, \cdots, g_K\}$: how many clusters, correlation
  $S = \{s_1, \cdots, s_N\}$: cluster attribution
- Likelihood:
  $P(x|S, G) = \prod_{t=1}^{T} E_{\eta,\varepsilon}\left[\prod_{i=1}^{N} \delta(x_{i,t} - \tilde x_{i,t})\right] \propto P(S, G|x)$
- Exponentiation of Dirac functions with
  $\delta(x) = \int_{-\infty}^{+\infty} \frac{dk}{2\pi}\, e^{ikx}$

Clustering: maximum likelihood
- Gaussian integration → $P(S, G|x) \propto e^{T \mathcal{L}\{S,G\}}$
  $\mathcal{L}\{S, G\} = -\frac{1}{2}\sum_s \left[(1 + g_s)\left(n_s - \frac{g_s c_s}{1 + g_s n_s}\right) - n_s \ln(1 + g_s) + \ln(1 + g_s n_s)\right]$
- Log-likelihood $\mathcal{L}$; maximisation: $\frac{\partial \mathcal{L}}{\partial g_s} = 0$
  $\hat g_s = \frac{c_s - n_s}{n_s^2 - c_s}$ if $n_s > 0$, and $\hat g_s = 0$ if $n_s = 0$
- Up to an additive constant:
  $\mathcal{L}_c(S) = \frac{1}{2}\sum_{s,\, n_s > 0}\left[\ln\frac{n_s}{c_s} + (n_s - 1)\ln\frac{n_s^2 - n_s}{n_s^2 - c_s}\right]$

Clustering: maximum log-likelihood
Problem: $s_i$ is discrete. Maximize $\mathcal{L}_c$ w.r.t. $S$?
- Enumerate: $O(K^N)$
- Random search
  1. Start with arbitrary $S$
  2. Propose $s_i \to s$ for all $i$
  3. Compute differences in $\mathcal{L}_c$ for each $i$
  4. Keep the single move that improves $\mathcal{L}_c$ the most
  5. Stop when no move improves $\mathcal{L}_c$
- Merging algorithm
  1. Start with $N$ clusters, $s_i = i$
  2. Merge the two clusters $s'$, $s''$ for which $\mathcal{L}_c$ is the most improved
  3. Repeat $N - 1$ times

Clustering
$\mathcal{L}_c = \sum_s l_s$: superposition of terms
Merge $r$ and $s$ into $q$:
1. $n_q = n_r + n_s$
2. $c_q$: recompute from $x_i$
3. merge $l_r$ and $l_s$ into $l_q$
Possible outcomes:
- $l_q > l_r + l_s$
- $l_q < l_r + l_s$, $l_q > \max(l_r, l_s)$
- $l_q < l_r + l_s$, $l_q < \max(l_r, l_s)$: no links in dendrogram

Clustering: assets
Clusters: economic sectors
1. electric and computers
2. electric and computers
3. mixed
4. gold
5. banks

Clustering: days
- $x_{i,t}$: matrix, $N \simeq T$: transpose and cluster
- Clusters of days → state

Clustering: days
Day states: way of sectors co-moving

Clustering: days
- Date → state
- 5 meaningful states + 1 random state
[Table residue: dates 1990/01/02, 1990/01/03, 1990/01/04, 1990/01/05 with their cluster labels; values as extracted: 1 6 4 2 and 1 44 5 2]
- Claim: after crash, same sequence of states

Clustering of days: predictability?
- At the close of time $t$, state $\mu_t$
- $E(r_t|\mu_t)$ very significantly non-zero
- Is $E(r_{t+1,1}|\mu_t)$ significantly non-zero?
  $E(r_{t+1,1}|\mu_t) = \sum_\nu W(\mu_t \to \nu)\, E(r_{t+1}|\nu)$
  where $W(\mu_t \to \nu)$ changes slowly as a function of time
- Raw Sharpe ratio: $S_{\mu,\mathrm{raw}} = \frac{E(r_{t+1,1}|\mu)}{\sqrt{\mathrm{var}(r_{t+1,1}|\mu)}}$
- Benchmark: $E(r_{t+1,1})$, $\delta r_{t+1,1} = r_{t+1,1} - E(r_{t+1,1})$
  $S_\mu = \frac{E(\delta r_{t+1,1}|\mu)}{\sqrt{\mathrm{var}(\delta r_{t+1,1}|\mu)}}$

Clustering of days: predictability?
- For stock $i$: $H_i = E(S_{i,\mu})$
- Some predictability
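The per-state Sharpe ratios $S_{\mu,\mathrm{raw}}$ and $S_\mu$ can be sketched as follows. This is a minimal sketch under stated assumptions: `states[t]` holds the day-state $\mu_t$ known at the close of day $t$, `next_returns[t]` holds the following return $r_{t+1,1}$, the benchmark is the unconditional sample mean, and the function name `state_sharpe` is hypothetical. The sketch is in-sample only, so it does not address the backtesting pitfalls discussed earlier.

```python
import numpy as np

def state_sharpe(states, next_returns):
    """Return {state: (raw Sharpe, benchmark-adjusted Sharpe)}.

    Raw:      E(r | mu) / sqrt(var(r | mu))
    Adjusted: E(dr | mu) / sqrt(var(dr | mu)), with dr = r - E(r).
    Each state needs at least two observations with non-zero spread."""
    states = np.asarray(states)
    r = np.asarray(next_returns, dtype=float)
    delta = r - r.mean()  # subtract the unconditional benchmark E(r)
    out = {}
    for mu in np.unique(states):
        mask = states == mu
        # Subtracting a constant leaves the conditional variance unchanged,
        # so one standard deviation serves both ratios
        sd = r[mask].std(ddof=1)
        out[int(mu)] = (r[mask].mean() / sd, delta[mask].mean() / sd)
    return out

if __name__ == "__main__":
    mu = [0, 0, 0, 1, 1, 1]
    r_next = [0.02, 0.03, 0.04, 0.0, -0.01, -0.02]
    for state, (raw, adj) in state_sharpe(mu, r_next).items():
        print(state, round(raw, 3), round(adj, 3))
```

Averaging the adjusted ratio over states for one stock gives a quantity in the spirit of $H_i = E(S_{i,\mu})$; whether that average should be weighted by state frequency is not specified in the slides.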