The Role of Data in Healthcare Innovation
Abel Kho MD, MS
May 2nd 2014

Outline
• Electronic Health Records (EHRs) as an enabling data platform
• Genetic studies (bench to bedside)
• Population studies: “Planting trees to see the forest”
• Addressing data privacy

Changes in Adoption of Basic and Comprehensive EHR
[Figure]
DesRoches CM, Charles D, Furukawa MF, et al. Adoption of Electronic Health Records Grows Rapidly, But Fewer Than Half of US Hospitals Had At Least A Basic System in 2012. Health Aff (Millwood). 2013;32(8).

EHR Adoption and Meaningful Use
• 1,807 providers (goal 1,486)
• 1,751 (118% of goal) of our enrolled providers are live on their EHR products
• 1,228 (83% of goal) of our enrolled providers have achieved Meaningful Use (MU)
• CHITREC enrolled providers have received almost $20M in EHR Incentive Program funds

[Figure: labeled "Coordinating Center"]

Type II Diabetes Case Algorithm
[Figure]
* Abnormal lab = random glucose > 200 mg/dl, fasting glucose > 125 mg/dl, or hemoglobin A1c ≥ 6.5%.

Type II Diabetes Control Algorithm
[Figure]

Mega-Analysis (adjusted)
TCF7L2: 3,353 cases, 3,352 controls
Kho et al. Use of diverse electronic medical record systems to identify genetic risk for type 2 diabetes within a genome-wide association study. JAMIA 2012.

eMERGE Sample Size

Site         eMERGE I       eMERGE I    eMERGE II Participants   eMERGE II   eMERGE I & II
             Participants   Genotyped   (Enrolled/Targeted)      Genotyped   Genotyped
GHC/UW       2,820          2,789       3,561                    786         3,575
Marshfield   20,000         4,210       20,000                   777         4,987
Mayo         3,769          3,755       19,000                   3,185       6,940
NU           10,500         1,907       10,500                   3,055       4,962
VU           70,000         6,055       140,000                  27,173      33,228
Geisinger    N/A            N/A         19,650                   4,191       4,191
Mt. Sinai    N/A            N/A         21,000                   16,000      16,000
CCMC/CHB     N/A            N/A         40,051                   5,586       5,586
CHOP         N/A            N/A         40,000                   8,000       8,000
Total        107,089        18,716      313,762                  68,753      87,469

Distributed Common Identity For Integration of Regional Health Data (DCIFIRHD)

HealthLNK Data Description
[Figure: HealthLNK Patient Count. Vertical axis: patients, in millions (0.0 to 8.0). Categories: Non-deduplicated, De-duplicated. Label: Total Chicago Patients.]

Visit Data, Chicago Only (n = 1,492,144)

White                              408,241 (27.4%)
Black                              521,972 (35.0%)
Asian                              49,597 (3.3%)
American Indian / Alaska Native    15,780 (1.1%)
Pacific Islander                   3,168 (0.2%)
Other/Unknown/Declined             350,805 (23.5%)
Hispanic (Ethnicity)               247,231 (16.6%)
Median Age (in Years)              42

Sample size/cohort comparison, by residential ZIP code, BRFSS* vs. HealthLNK

Source                                 Min     Median   Mean    Max
IL BRFSS, Chicago 2011 respondents     4       15       16      33
HealthLNK, patients with 2010 visit    1,339   10,031   9,270   21,289

* CDC Behavioral Risk Factor Surveillance System survey, Chicago sub-sample from Illinois dataset.

Diabetes prevalence estimate by residential ZIP
Percent = (# of patients with > 1 diabetes mellitus diagnosis code or lab criteria met) / (# of patients with visit in 2006-2010)

Variability within a ZIP code can be as large as, or larger than, variability between ZIP codes.
Pah AR, Behrens JJ, Goel S, Kho AN. Unzipping Zip Codes: A Methodology to Assign Deidentified Health Data to Smaller Geographic Localities. AMIA CRI 2014.
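
The prevalence estimate by residential ZIP can be sketched in a few lines of Python. This is a minimal illustration, not the HealthLNK implementation: the record layout and the field names (zip, dm_dx_codes, lab_criteria_met) are hypothetical, the patients are made up, and the input is assumed to be already limited to patients with a visit in 2006-2010, so every record counts toward the denominator.

```python
from collections import defaultdict


def diabetes_prevalence_by_zip(patients):
    """Return {zip_code: percent of patients meeting the diabetes criterion}.

    Assumes every record is a patient with a visit in the study window,
    so each record contributes to the denominator of its ZIP code.
    """
    totals = defaultdict(int)  # patients per ZIP (denominator)
    cases = defaultdict(int)   # patients meeting the criterion (numerator)
    for p in patients:
        totals[p["zip"]] += 1
        # Criterion from the slide: more than one diabetes mellitus
        # diagnosis code, OR lab criteria met (precomputed flag here).
        if p["dm_dx_codes"] > 1 or p["lab_criteria_met"]:
            cases[p["zip"]] += 1
    return {z: 100.0 * cases[z] / totals[z] for z in totals}


# Hypothetical records, for illustration only.
patients = [
    {"zip": "60601", "dm_dx_codes": 2, "lab_criteria_met": False},  # case
    {"zip": "60601", "dm_dx_codes": 0, "lab_criteria_met": False},
    {"zip": "60601", "dm_dx_codes": 1, "lab_criteria_met": True},   # case
    {"zip": "60601", "dm_dx_codes": 1, "lab_criteria_met": False},
    {"zip": "60602", "dm_dx_codes": 3, "lab_criteria_met": False},  # case
    {"zip": "60602", "dm_dx_codes": 0, "lab_criteria_met": False},
    {"zip": "60602", "dm_dx_codes": 1, "lab_criteria_met": False},
    {"zip": "60602", "dm_dx_codes": 0, "lab_criteria_met": False},
]

prevalence = diabetes_prevalence_by_zip(patients)
for zip_code, pct in sorted(prevalence.items()):
    print(f"{zip_code}: {pct:.1f}%")
```

A single diagnosis code without a qualifying lab does not count as a case in this sketch, which follows the "> 1 diagnosis code" wording of the estimate.
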
Difference in median household income from ACS
[Figure]

Need to disaggregate patient data from ZIP code to evaluate small area effects
Input data:
• Patient records with demographics (age, gender, race)
• Census data at block group level
Methodology:
• Monte Carlo simulation to distribute patient cases
• Fit simulation data with semi-variogram
• Create Kriged surface using semi-variogram
Output is a probabilistic patient case contour map

Example: Diabetes in Chicago
• Health records from 7 healthcare institutions in Chicago [1]
• Examining diabetes cases (Type 1 and 2) [2] from 2010
• 190,069 total cases
• Population data from Census 2010
[1] HealthLNK
[2] A. Elixhauser, C. Steiner, D. R. Harris, and R. M. Coffey. Comorbidity Measures for Use with Administrative Data. Medical Care, 36(1):8–27, January 1998.
Northwestern University

Probabilistic maps from simulation
[Figure panels: Raw data; Simulation alone]

Probabilistic maps from simulation
[Figure panels: Simulation alone; Simulation + Kriged surface]

But how good is it?
• Scraped data on houses for sale from Zillow
• A house has features we use similarly to demographics: beds, baths, price
• The exact address of each house is known and is used to quantify performance
• Apply the same methodology to 656 houses across 9 ZIP codes

Looking at housing data
[Figure]

Advantages of this method
• Aggregation at the ZIP code level can obscure small area effects
• Re-capture this detail using probabilistic methods without requiring detailed patient health information
• Produces finer spatial resolution, resulting in “hot-spot” detection
• Ability to re-aggregate to meaningful geographic areas (i.e., community area)

The Monte Carlo (MC) portion is available at: https://bitbucket.org/adamrpah/geographic-record-disaggregation
The GIS portion is coming shortly.

Addressing Data Privacy

HIPAA Expert Determination (abridged)
Certify via “generally accepted statistical and scientific principles & methods, that the risk is very small that the information could be used, alone or in combination with other reasonably available information, by the anticipated recipient to identify the subject of the information.”

Summary
• Comprehensive capture of contiguous EHR data can be a powerful engine for new discoveries
• Methods exist to “unlock” data from where it resides while protecting privacy/identity

Acknowledgements
• Northwestern University: Katie Jackson, Jess Behrens, Adam Pah, Sara Lake, Satyender Goel
• UIC: Bill Galanter, John Lazaro, Denise Hynes, Neil Bahroos, Jerry Krishnan
• University of Chicago Medical Center: David Meltzer, Chris Lyttle, Ben Vekhter
• Cook County Hospital and Clinics: Bill Trick, Amanda Grasso
• Alliance of Chicago: Erin Kaleba, Andrew Hamilton, Fred Rachman
• Rush University Medical Center: Bala Hota, Shannon Sims
• Loyola University: Ron Price, Rich Kennedy
• Vanderbilt University: Brad Malin
• UIC Intern team: Ariadna Garcia, Pravin Babu Karuppaiah, Shazia Sathar, Ulas Keles (Sid Battacharya, Faculty mentor)
• Becker Friedman Institute: Jörn Boehnke, John Eric Humphries, Scott Kominers (Harvard)