Outline Introduction Concepts Stats Examples Software Extensions Discussion

Flexible modelling of the cumulative effects of time-varying exposures
Applications in environmental, cancer and pharmaco-epidemiology

Antonio Gasparrini
Department of Medical Statistics
London School of Hygiene and Tropical Medicine (LSHTM)

Centre for Statistical Methodology – LSHTM
28 November 2014

Gasparrini A Flexible modelling of the cumulative effects of time-varying exposures LSHTM

Outline

1 Introduction
2 Conceptual model
3 Statistical model
4 Examples
5 Software
6 Extensions
7 Discussion

Temporal aspects

- The relationship between a risk factor and the associated health effect always implies a temporal dependency: a common problem in biomedical research
- This issue encompasses study designs and statistical models:
  - Tobacco smoke and CVD risk
  - Occupational exposure and incidence of cancer
  - Drug intake and beneficial or side effects
  - Short-term temperature variation and mortality
- A topic (somewhat) neglected in methodological research

Previous research

- Standard statistical approaches do not directly characterize this temporal structure
- Challenge: modelling (potentially complex) temporal patterns of risk due to time-varying exposures
- Models previously proposed in cancer epidemiology (Thomas 1988, Hauptmann 2000, Richardson 2009) and pharmaco-epidemiology (Abrahamowicz 2012)

Limitations

- Incomplete statistical development: e.g. no measures of uncertainty
- Poor software implementation: ad-hoc routines, computational issues, convergence problems
- Lack of a consistent conceptual and interpretational framework

Distributed lag models

- DLMs were proposed by Almon (Econometrica 1965) in econometrics for time series data, then applied in environmental epidemiology by Schwartz (Epidemiology 2000)
- Armstrong (Epidemiology 2006) extended them to distributed lag non-linear models (DLNMs), applicable to non-linear exposure-response associations
- A far more developed statistical framework, but only applicable to time series data

Conceptual representation

[Figure: single exposure event, forward perspective. Effect over time at t, t+1, t+2, ..., t+L]

Conceptual representation

[Figure: multiple exposure events, backward perspective. Effect at time t of exposures at t-L, ..., t-2, t-1, t]

Assumptions

Under specific assumptions, these two perspectives can be merged together:
- assumption of identical effects (fundamental)
- assumption of independence

These conditions underpin the conceptual framework for defining and modelling DLNMs

Conceptual representation

[Figure: new lag dimension. Forward and backward perspectives shown on a common lag scale from 0 to L. Axes: Time (Lags), Effect]

Exposure-lag-response associations

The risk is represented by a function $s(x_{t-\ell_0}, \dots, x_{t-L})$ defined in terms of both intensity and timing of a series of past exposures, expressed through:
- an exposure-response function $f(x)$ for exposure $x$
- a lag-response function $w(\ell)$ for lag $\ell$

Generating a bi-dimensional exposure-lag-response function $f \cdot w(x, \ell)$, whose integral provides:

\[ s(x_{t-\ell_0}, \dots, x_{t-L}) = \int_{\ell_0}^{L} f \cdot w(x_{t-\ell}, \ell) \, d\ell \approx \sum_{\ell=\ell_0}^{L} f \cdot w(x_{t-\ell}, \ell) \]

Distributed lag models (DLMs)

Given an exposure history at time $t$ for lags $\ell = \ell_0, \dots, L$:

\[ \mathbf{q}_{x,t} = [x_{t-\ell_0}, \dots, x_{t-\ell}, \dots, x_{t-L}]^{T} \]

and assuming a linear exposure-response, we can write:

\[ s(\mathbf{q}_{x,t}; \boldsymbol{\eta}) = \mathbf{q}_{x,t}^{T} \mathbf{C} \boldsymbol{\eta} = \mathbf{w}_{x,t}^{T} \boldsymbol{\eta} \]

where $\mathbf{C}$ is obtained from the lag vector $\boldsymbol{\ell} = [\ell_0, \dots, \ell, \dots, L]^{T}$ by applying a specific basis transformation

Distributed lag non-linear models (DLNMs)

First the matrix $\mathbf{R}_{x,t}$ is obtained by applying a second basis transformation to $\mathbf{q}_{x,t}$

Then we define a tensor product:

\[ \mathbf{A}_{x,t} = (\mathbf{1}_{v_\ell}^{T} \otimes \mathbf{R}_{x,t}) \odot (\mathbf{C} \otimes \mathbf{1}_{v_x}^{T}) \]

which forms the crossbasis:

\[ s(\mathbf{q}_{x,t}; \boldsymbol{\eta}) = (\mathbf{1}_{v_x \cdot v_\ell}^{T} \mathbf{A}_{x,t}) \boldsymbol{\eta} = \mathbf{w}_{x,t}^{T} \boldsymbol{\eta} \]

The problem reduces to choosing a basis for each of $\mathbf{q}_{x,t}$ and $\boldsymbol{\ell}$, defining exposure-response and lag-response functions, respectively

Alternative study designs

[Figure, four panels: Time series, Case-control, Longitudinal, Cohort. Exposure histories of subjects A, B and C at event or observation times]

First example

Temperature and all-cause mortality

- Research area where DLNMs were originally proposed
- Time series data with daily death counts and temperature measurements between 1st Jan 1993 and 31st Dec 2006 in London (845,215 deaths in total)
- In this setting, exposure histories are simply derived by 'lagging' the temperature series

Quasi-Poisson GLM

Analysis with a generalized linear model with quasi-Poisson family, controlling for trends and day of the week

\[ \log(\mu_t) = \alpha + s_x(\mathbf{q}_{x,t}; \boldsymbol{\eta}) + \sum_{p=1}^{P} s_z(z_t; \boldsymbol{\beta}_z) \]

Here spline functions are used to specify both $f(x)$ and $w(\ell)$
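As a minimal sketch of the 'lagging' step in the time series setting, the base-R code below builds the matrix whose row t holds the exposure history q_{x,t} and then applies a set of lag weights, which play the role of the product Cη in the DLM formula. The temperature values, the helper name lag_matrix and the weights w are made up for illustration and do not come from the talk; the dlnm package is not used here.

```r
# Toy daily temperature series (made-up values, for illustration only)
temp <- c(12, 14, 15, 13, 18, 21, 19, 16)
maxlag <- 3

# Hypothetical helper: row t holds [x_t, x_{t-1}, ..., x_{t-maxlag}].
# The first 'maxlag' rows have incomplete histories and are padded with NA.
lag_matrix <- function(x, maxlag) {
  n <- length(x)
  Q <- sapply(0:maxlag, function(l) c(rep(NA, l), x[seq_len(n - l)]))
  colnames(Q) <- paste0("lag", 0:maxlag)
  Q
}

Q <- lag_matrix(temp, maxlag)

# Assumed lag weights (standing in for C %*% eta in a linear DLM);
# in a real analysis these would be estimated from the data
w <- c(0.4, 0.3, 0.2, 0.1)

# Cumulative contribution s(q_t) = sum over lags of w_l * x_{t-l}
s <- drop(Q %*% w)
```

Rows with incomplete histories give NA in s, which mirrors the loss of the first L observations when a lagged series enters a regression model.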

Exposure-lag-response

[Figure: 3-D exposure-lag-response surface. Axes: Temperature (C), Lag, RR]

Summaries

[Figure, four panels: Exposure-lag-response (3-D surface); Lag-response at temperature 22C; Exposure-response at lag 4; Overall cumulative exposure-response. Axes: Temperature (C), Lag, RR]

Second example

Radon exposure and lung cancer mortality

- 3,347 subjects working in the Colorado Plateau mines between 1950–1960, 258 lung cancer deaths
- Yearly exposure history to radon (WLM) and tobacco smoke (pack×100) reconstructed from 5-year age periods

Proportional hazard model

Analysis with Cox proportional hazards model using age as time axis, controlling for smoking and calendar year.
For subject i:

\[ \log[h_i(t)] = \log[h_0(t)] + s_x(\mathbf{q}_{x,it}; \boldsymbol{\eta}_x) + s_z(\mathbf{q}_{z,it}; \boldsymbol{\eta}_z) + \gamma u_{it} \]

Different functions used to specify $f(x)$ and $w(\ell)$: constant, piecewise constant, quadratic B-spline

Exposure-lag-response

[Figure: Linear-by-constant surface. Axes: WLM/year, Lag (years), HR]

Exposure-lag-response

[Figure: Spline-by-constant surface. Axes: WLM/year, Lag (years), HR]

Exposure-lag-response

[Figure: Linear-by-spline surface. Axes: WLM/year, Lag (years), HR]

Exposure-lag-response

[Figure: Step-by-step surface. Axes: WLM/year, Lag (years), HR]

Exposure-lag-response

[Figure: Spline-by-spline surface. Axes: WLM/year, Lag (years), HR]

Lag-response curves from DLNMs

[Figure: curves for Spline-by-spline and Spline-by-piecewise models. Axes: Lag (years), RR for 100 WLM/year]

Exposure-responses at different lags

[Figure: curves for Lag 5, Lag 15 and Lag 25. Axes: WLM/year, RR]

Third example

MMR vaccine and ITP risk

- Data from 35 children receiving the MMR (measles, mumps, rubella) vaccine and admitted to hospital for idiopathic thrombocytopenic purpura (ITP) within 12-24 months of age
- Replicating and extending a previous analysis using the self-controlled case series design (Whitaker 2006)

Conditional Poisson regression

Analysis with conditional Poisson regression controlling for age.
For subject i at age a:

\[ \log(\lambda_{iat}) = \alpha_i + s_x(\mathbf{q}_{x,it}; \boldsymbol{\eta}_x) + f(a_{it}; \boldsymbol{\gamma}) \]

- Single exposure event modelled with a binary variable
- Exposure-response assumed linear, lag-response modelled with spline or piecewise constant functions

Lag-response

[Figure: curves for Spline and Piecewise constant models. Axes: Lag (days), IRR]

Fourth example

Tobacco and lung cancer incidence

- 1,479 cases and 1,918 controls from three case-control studies within the Synergy network
- Yearly exposure history to tobacco smoke (cigarette/day) reconstructed from questionnaires

Logistic regression

Analysis with logistic regression controlling for sex

\[ \mathrm{logit}(\mu_i) = \alpha + s_x(\mathbf{q}_{x,i}; \boldsymbol{\eta}_x) + \gamma u_i \]

Different functions used to specify $f(x)$ and $w(\ell)$: log, piecewise constant, quadratic B-spline

Exposure-lag-response

[Figure: Log-by-spline surface. Axes: Cig/day/year, Lag (years), OR]

Lag-response curves from DLNMs

[Figure: curves for Spline and Step models. Axes: Lag (years), OR for 20 cig/day/year]

Dynamic prediction of risk

[Figure: cumulative OR over years for a smoking history with Quit, Relapse and Cessation events. Axes: years, Cig/day/year, Cumulative OR]

Fifth example

Trial on the effect of a drug

- 50 subjects followed for 4 weeks
- Time-varying treatment randomly allocated in two of the four weeks, each with a different dose selected at random
- Outcome measured at the end of the 28 days

Linear regression

Analysis with linear regression controlling for sex

\[ y_i = \alpha + s_x(\mathbf{q}_{x,i}; \boldsymbol{\eta}_x) + \gamma u_i + \epsilon_i \]

- Exposure-response assumed linear
- Lag-response modelled with spline or decay functions

Exposure-lag-response

[Figure: linear-by-spline surface. Axes: Dose, Lag (days), Effect]

Lag-response

[Figure: Spline function. Axes: Lag (days), Effect at dose 60]

Lag-response

[Figure: Decay function. Axes: Lag (days), Effect at dose 60]

Software implementation

- The framework is fully implemented in the R package dlnm, available from CRAN (Gasparrini JSS 2011)
- The package contains a new vignette focusing on applications beyond time series data

The R package dlnm

Example of code

    library(dlnm)
    library(survival)  # provides coxph() and Surv()

    # Cross-basis from the matrix of exposure histories Q, lags 2 to 40 years:
    # quadratic B-splines for both the exposure and the lag dimension
    cb <- crossbasis(Q, lag=c(2,40),
      argvar=list(fun="bs", degree=2, knots=59.4, cen=0),
      arglag=list(fun="bs", degree=2, knots=13.3, int=FALSE))

    # Cox model on the age time axis, adjusting for smoking and calendar time
    model <- coxph(Surv(agest,ageexit,ind) ~ cb + smoke + caltime, data)

    # Predictions on a grid of exposures from 0 to 250 WLM/year
    pred <- crosspred(cb, model, at=0:25*10)

    plot(pred, "3d", xlab="WLM/year", ylab="Lag (years)", zlab="RR")
    plot(pred, var=100, xlab="Lag (years)", ylab="RR")
    plot(pred, lag=15, xlab="WLM/year", ylab="RR")

Simulations

[Figure: simulated scenarios Linear-Constant, Plateau-Decay and Exponential-Peak. True surfaces and curves compared with the AIC average and AIC samples. Axes: Exposure, Lag, HR]

Penalized DLNMs

- Currently, the bi-dimensional exposure-lag-response function $f \cdot w(x, \ell)$ is specified using completely parametric methods
- However, simple DLMs have also been proposed in Bayesian (Welty 2008) or penalized versions (Zanobetti 2000, Rushworth 2013, Obermeier 2015)
- An obvious extension is to develop a semi-parametric version of DLNMs through penalized splines
- The development may be facilitated by 'embedding' the R package mgcv in dlnm, exploiting the existing GAM implementation

Interactions in DLNMs

- Interactions in DLNMs would allow the exposure-lag-response association to vary depending on the value of other predictors (see also Rushworth 2013)
- This corresponds to relaxing the assumption of identical effects
- This development extends the framework to a wide range of new applications
- However, it entails non-trivial methodological problems

Time-varying DLNMs

[Figure, four panels: Japan, Spain, UK, USA. Exposure-response curves for different years. Axes: Summer temperature percentile, RR]

Some advantages

- DLNMs offer a flexible way to model exposure-lag-response associations
- Unified framework based on a general conceptual and statistical definition, applicable in various study designs
- Complete software implementation, models can be fitted with standard regression routines

Some limitations

- The DLNM framework is only applicable to time-varying (non-constant) exposures
- It requires the availability of exposure histories (possibly reconstructed)
- Model selection procedures still under-developed

Main references

Gasparrini A. Modeling exposure-lag-response associations with distributed lag non-linear models. Statistics in Medicine. 2014;33(5):881-899.

Gasparrini A & Armstrong B. The R package dlnm. http://cran.r-project.org/web/packages/dlnm/index.html

E-mail: [email protected]

Other references

Abrahamowicz et al (2006). Modeling cumulative dose and exposure duration provided insights regarding the associations between benzodiazepines and injuries. Journal of Clinical Epidemiology, 59(4):393–403.
Abrahamowicz et al (2012). Comparison of alternative models for linking drug exposure with adverse effects. Statistics in Medicine, 31:1014–1030.
Almon S (1965). The distributed lag between capital appropriations and expenditures. Econometrica, 33(1):178–196.
Armstrong (2006). Models for the relationship between ambient temperature and daily mortality. Epidemiology, 17(6):624–631.
Berhane et al (2008). Using tensor product splines in modeling exposure-time-response relationships: application to the Colorado Plateau Uranium Miners cohort. Statistics in Medicine, 27(26):5484–96.
Gasparrini et al (2010). Distributed lag non-linear models. Statistics in Medicine, 29(21):2224–2234.
Gasparrini (2011). Distributed lag linear and non-linear models in R: the package dlnm. Journal of Statistical Software, 43(8):1–20.
Hauptmann et al (2000). Analysis of exposure-time-response relationships using a spline weight function. Biometrics, 56(4):1105–8.
Heaton et al (2014). Extending distributed lag models to higher degrees. Biostatistics, 15(2):398–412.
Langholz et al (1999). Latency analysis in epidemiologic studies of occupational exposures: application to the Colorado Plateau uranium miners cohort. American Journal of Industrial Medicine, 35(3):246–56.
Leffondre et al (2002). Modeling smoking history: a comparison of different approaches. American Journal of Epidemiology, 156(9):813.
Obermeier et al (2015). Flexible distributed lag models and their application to geophysical data. Journal of the Royal Statistical Society: Series B, ahead of print.
Richardson (2009). Latency models for analyses of protracted exposures. Epidemiology, 20:395–399.
Rushworth et al (2013). Distributed lag models for hydrological data. Biometrics, 69:537–544.
Schwartz (2000). The distributed lag between air pollution and daily deaths. Epidemiology, 11(3):320–326.
Sylvestre & Abrahamowicz (2009). Flexible modeling of the cumulative effects of time-dependent exposures on the hazard. Statistics in Medicine, 28(27):3437–53.
Thomas (1983). Statistical methods for analyzing effects of temporal patterns of exposure on cancer risks. Scand J Work Environ Health, 9(4):353–366.
Thomas (1988). Models for exposure-time-response relationships with applications to cancer epidemiology. Annual Review of Public Health, 9:451–82.
Welty et al (2008). Bayesian distributed lag models: estimating effects of particulate matter air pollution on daily mortality. Biometrics, 65:282–291.
Whitaker et al (2006). Tutorial in biostatistics: The self-controlled case series method. Statistics in Medicine, 25:1768–1797.