Department of Statistics
STATISTICS COLLOQUIUM

PO-RU LOH
Department of Epidemiology
Harvard School of Public Health

Bayesian Mixed Model Association Statistics in Linear Time

MONDAY, May 19, 2014, at 4:00 PM
Eckhart 133, 5734 S. University Avenue
Refreshments following the seminar in Eckhart 110.

ABSTRACT

Linear mixed models (LMM) are a powerful statistical tool for identifying loci associated with phenotypes and avoiding confounding. Mixed model analysis is computationally demanding, however, and is becoming infeasible as study sizes reach the scale of 100,000 samples. Existing algorithms rely on spectral analysis of a genetic relationship matrix (GRM) at total time cost O(MN^2), where M is the number of markers and N is the sample size. Additionally, these methods implicitly assume an infinitesimal genetic architecture in which all markers are causal.

I will present a fast O(MN)-time mixed model association algorithm, BOLT-LMM, which increases power by generalizing the LMM to model noninfinitesimal (sparse) genetic architectures via a Bayesian mixture prior on marker effect sizes, used within a retrospective hypothesis testing framework. BOLT-LMM performs a variational iteration that circumvents computing the GRM by operating directly on raw genotypes stored compactly in memory. When specialized to the infinitesimal model, BOLT-LMM achieves additional speedup, matching existing methods at dramatically reduced time and memory cost. I will describe preliminary results of applying BOLT-LMM to analyze 60,000 samples from the recently released Genetic Epidemiology Research on Aging (GERA) data set.

_______________________________

For further information, or to ask about building access for persons with disabilities, please contact Kirsten Wellman at 773.702.8333 or send email ([email protected]). If you wish to subscribe to our email list, please visit the following website: https://lists.uchicago.edu/web/arc/statseminars.