Integration of NGS Desktop Sequencers to Build a Global Genomic Network for Pathogen Traceback and Outbreak Detection: Description of international (GMI, WHO) and national (GenomeTrakr, 100K) activities

Marc Allard, Ph.D.
Microbiologist, Division of Microbiology, ORS, Center for Food Safety and Applied Nutrition, FDA
Feb. 26th, 2014, Carleton University

Roles for Sequence-based Subtyping
1. Attribution
2. Surveillance
3. Risk assessment and modeling
4. Define a legal adulterant
5. Replace traditional bacteriological typing procedures

Role of Sequence-based Subtyping during Outbreak Investigation (cluster ID and source tracking)
Is a particular isolate part of the outbreak?
– Or is it a sporadic or unrelated case?
Have we seen this isolate before?
– geospatial distribution of clones from different agricultural regions
– persistent clone in a manufacturing facility
Does this food/environmental isolate match this clinical isolate?
– can we link an isolate from food or a facility to an outbreak?

S. Enteritidis
[Figure: PFGE patterns. XbaI pattern JEGX01.0004; BlnI pattern JEGA26.0002. Labels: Similar Patterns; Very Different Patterns; Same PFGE but unrelated to the event; Distinct outbreak related groups.]

S. Bareilly Outbreak (April-June 2012)
[Figure: tree of S. Bareilly isolates. Tip labels give FDA and SAL isolate numbers with source and origin: food and environmental isolates (for example coriander powder, India; frozen raw shrimp, India; fish stomach, Vietnam; irrigation water, USA) and clinical isolates from NY and MD. Branch annotations: PFGE Match; <=5 SNPs; 20-25 SNPs; 110-130 SNPs.]
NGS distinguishes geographical structure among closely related Salmonella Bareilly strains

Our Current Model
FDA, USDA, CDC
State,
Local, Federal and Foreign Public Health Agencies
Academia
NCBI, EMBL, DDBJ (Public Access Database)
DATA ANALYSIS
DATA ASSEMBLY AND STORAGE
Network of Sequencers
DATA ACQUISITION

Global Microbial Identifier
http://www.g-m-i.org/
• Make novel genomic technologies and informatics tools available for improved global patient diagnostics, surveillance, research and public health response.
• Develop a global system to aggregate, share, mine and use microbiological genomic data to address global public health and clinical challenges, a high impact area in need of focused effort.
• 500 members in 30 countries
Work groups
1. Political challenges, outreach and building a global network
2. Repository and storage of sequence and metadata
3. Analytical approaches
4. Ring trials and quality assurance
5. Pilot project

Expansion of FDA Network to site in economically-developing country
Benefits
1. Add diversity to genome database by opening new strain collections and access to incurred food, animal and environmental samples available to each site
2. Identify gaps in project assumptions and resources that would interfere with global expansion of NGS networks
Considerations
1. Participate as a full member of the USFDA network
2. Submit data and metadata to public database
3. Focus on sequencing Salmonella food and environmental isolates
4. Resources available for use of sequencer for other projects
$500,000, Argentina, 3 years

Network of Sequencers
7 state health depts. + 10 FDA-ORA
Inputs
o 1 MiSeq system
o Sufficient reagents to sequence > 300 genomes per year
o Dedicated scientific staff (bioinformatics and/or laboratory support) through Oak Ridge Institute for Science and Education (ORISE)
o Bioinformatics and laboratory support, analysis pipeline
Deliverables
o Minimum ~300 genomes with metadata uploaded to NCBI per annum, minimum 20X coverage
o Food and environmental related bacterial (prefer Salmonella) isolates

MiSeq benchtop NGS system (Illumina)
[Figure: data flow with steps numbered 1 to 4 on the slide. Labels: Register BioSample metadata at NCBI; BioSample; BioProject; SampleSheet; Data Transfer; Illumina BaseSpace Cloud; FDA-CFSAN Isilon storage drive; Outside FDA network; Inside FDA network.]

Data Generation
18 State and Federal Health labs
Register strains with NCBI
Illumina MiSeq data generation
Data transfer
Data transfer to FDA
Data QC and Submission at FDA
Batch QC of data
Conversion to SRA format
Upload to SRA
CFSAN Genomic Information Management System (GIMS) integration
Automated WGS accessions
Genome assembly + PGAP annotation
Hybrid k-mer and reference-based SNP calling analysis
NCBI pathogen detection pipeline
BioProject: http://www.ncbi.nlm.nih.gov/bioproject/183844

Data Generation
18 State and Federal Health labs plus the world
Register strains with NCBI
Collect genome sequence
Data transfer
Data transfer to FDA
Data QA/QC on site
Batch QC of data
Conversion to SRA format
Upload to SRA (CLC Plugin)
CFSAN Genomic Information Management System (GIMS) integration
Automated WGS accessions
Genome assembly + PGAP annotation
Hybrid k-mer and reference-based SNP calling analysis
NCBI pathogen detection pipeline
BioProject: http://www.ncbi.nlm.nih.gov/bioproject/183844

FDA-State Desktop Pilot called GenomeTrakr
http://www.ncbi.nlm.nih.gov/bioproject/183844
MN and VA are the newest partners.
Sinaloa, Mexico is the 1st international partner.
Completed SRA experiments: ~2000 records to date.

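The 20X minimum coverage listed under Deliverables above can be checked with the usual depth estimate: total sequenced bases divided by genome size. The following is a minimal Python sketch under stated assumptions. The function names, the 4.8 Mb genome size, the 250 bp read length, and the read counts are illustrative choices, not values from the talk and not part of the GenomeTrakr pipeline.

```python
def estimated_coverage(read_count: int, read_length: int, genome_size: int) -> float:
    """Average depth of coverage = total sequenced bases / genome size."""
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")
    return read_count * read_length / genome_size


def meets_minimum(coverage: float, minimum: float = 20.0) -> bool:
    """True when the estimated depth reaches the 20X minimum from the slide."""
    return coverage >= minimum


if __name__ == "__main__":
    GENOME_SIZE = 4_800_000  # assumed approximate Salmonella genome size, in bases
    READ_LENGTH = 250        # assumed read length, in bases
    for reads in (200_000, 400_000, 1_000_000):  # hypothetical run yields
        depth = estimated_coverage(reads, READ_LENGTH, GENOME_SIZE)
        print(f"{reads:>9} reads: {depth:5.1f}X, meets 20X: {meets_minimum(depth)}")
```

This estimate ignores read trimming, unmapped reads, and uneven depth across the genome, so a production QC step would more likely measure depth from the actual alignment or assembly.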
Partners with sequencers
United Kingdom, Denmark, Italy, Argentina, Brazil, Germany, Canada
Partners with isolates
Ireland, Mexico, Turkey, Colombia, Chile

MINIMAL PATHOGEN METADATA (FOODBORNE OUTBREAKS)
What
  sample_name
  organism
  strain/isolate
  Category (attribute_package)
    1a) Clinical/Host-associated
      1a1) specific_host
      1a2) isolation_source
      1a3) host-disease
    OR
    1b) Environmental/Food/Other
      1b1) isolation_source
When
  collection_date
Where
  Geographic location
    6a) geo_loc_name
    OR
    6b) lat_lon
Who
  collected by

Example Surveillance Workflow
Existing Salmonella clade in combined Eubacteria kmer tree
[Figure: tree snapshots labeled After Day 1, After Day 2, After Day 3.]
Extract cluster, Montevideo serovar
• 2 existing genomes
• 39 new genomes
• Extract sub-tree
• Re-root on outlier
Compute new subtree
• Reference tree based on SNPs
• Provides additional resolution
• Integrate with metadata
[Figure label: FDA_2010_142_Pistachio-3]

Public/Private Partnership
• UC Davis
• FDA
• NCBI
• BGI@UCDavis
• Agilent Technologies
• CDC
• Affiliate members
– Mars, Inc.
– Harvard hospital system
– Poultry Industry members
– Culture collections
– SEEKING ADDITIONAL PARTNERS

Role of PacBio RS Technology
• Yield closed genome assemblies to improve accuracy of clustering and proper assembly of large repetitive elements (phage, plasmid, CRISPR)
• Yield data on epigenetic modification of genomes, possibly to further discriminate strains and/or provide information about virulence and pathogenicity

With Pacific Biosciences technology we’ve sequenced over 60 Salmonella and Listeria genomes and their associated mobile elements for complete reference genomes.
[Figure: Chromosome and Plasmid maps, Salmonella enterica subsp. enterica serovar Cubana.]

Methylation motifs from 14 Genomes
[Figure: methylation motif chart; the motif sequences are not recoverable from the extracted text. Legend: MTase identified; MTase unknown; novel MTase. Labels: I, II, III, orphan. Genomes: Listeria monocytogenes J1-220; Listeria monocytogenes J1816; S. Bareilly; Salmonella enterica subsp. diarizonae; S. Abaetetuba; S. Abony; S. Anatum; S. Braenderup; S. Cubana; S. Heidelberg CFSAN002069; S. Heidelberg CFSAN002064; S. Heidelberg CFSAN000318; S. Montevideo; S. Typhimurium.]

Large scale Salmonella enterica subsp. enterica phylogeny inferred from 156 genomes across 78 serovars
Timme et al., in preparation
Reference-free approach for gathering SNPs
ML tree inferred from ~119,000 SNPs
Timme et al. in Gen. Biol. Evol. 5(11):2109-

Collaborations
FDA CDRH: Sequencing as a diagnostic device. High performance computing.
NIST: Standards for genomic sequencing.
FDA CVM: MDR isolates from NARMS collection.
DOJ: Microbial Forensics for FERN support.
DOD: R&D on traceback using metagenomics.

[Figure label: FDA_2010_142_Pistachio-3]
SNP = Single Nucleotide Polymorphism
TTCCCTAGCAC
TTCCTTAGCAC

ONE THING’S FOR CERTAIN: IT TAKES A VILLAGE TO GET THERE!

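The SNP definition above can be illustrated with the two 11-base sequences shown on the slide. The sketch below counts the positions at which two already aligned, equal-length sequences differ. It is only a pairwise comparison for illustration: the function name is hypothetical, and the SNP calling described in the talk (hybrid k-mer and reference-based analysis on whole genomes) involves read alignment and quality filtering that this example does not attempt.

```python
def snp_positions(seq_a: str, seq_b: str) -> list:
    """Return 0-based positions where two aligned, equal-length sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = zip(seq_a.upper(), seq_b.upper())
    return [i for i, (a, b) in enumerate(pairs) if a != b]


if __name__ == "__main__":
    # The two sequences from the slide
    first = "TTCCCTAGCAC"
    second = "TTCCTTAGCAC"
    diffs = snp_positions(first, second)
    print(f"{len(diffs)} SNP(s) at 0-based position(s) {diffs}")
    # prints: 1 SNP(s) at 0-based position(s) [4]
```

The two sequences differ at a single site (C versus T at the fifth base), which is one SNP. Summing such differences across whole-genome comparisons gives pairwise SNP distances of the kind annotated on the S. Bareilly tree.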
Many thanks to the following:
Division of Microbiology-FDA: Eric Brown, Peter Evans, Ruth Timme, Narjol Gonzalez, Yi Chen, Maria Hoffman, Christine Keys, George Kastanis, Tim Muruvanda, Rebecca Bell, Cary Pirone, Andrea Ottesen, Charlie Wang, Jie Zheng, Justin Payne
Division of Biostatistics-FDA: Errol Strain, Yan Luo, James Pettengill
CVM: Cong Li, Pat McDermott, Shaohua Zhao
CDC: John Besser, Eija Trees, Lee Katz, Patti Fields
Division of Molecular Biology-FDA: Chris Elkins, Darcy Hanes, Palmer Orlandi
FDA Division of Field Sciences: Rebecca Dreisch
NYPH: Bill Wolfgang, Kimberly Musser and colleagues
MPH: Alvina Chu and colleagues
National Institutes of Health (NCBI): David Lipman, Jim Ostell, William Klimke, Martin Shumway
Office of Regulatory Science-FDA: Steve Musser, Kelly Bunning, Don Zink

Questions
Eric Brown: [email protected]
Peter Evans: [email protected]
Marc Allard: [email protected]