Seeds of Discovery (SeeD)
Large-scale application of GbS in the SeeD project: ‘Rightsizing’ of methods and initial results
Sarah Hearne, Alberto Romero, Huihui Li, Carolina Sansaloni, Cesar Petroli, Martha Willcox, Aleyda Sierra, Hector Galvez, Manuel Martinez, Sukhwinder Singh, Marc Ellis, Giovanny Soca, Gary Atlin, Andrzej Kilian, Ed Buckler, Peter Wenzl

[Diagram labels]
• International Maize Improvement Consortium (IMIC)
• Wheat Yield Consortium (WYC)
• Seeds of Discovery (SeeD): new genetic variation to raise future crop production
• Take it to the Farmers (TTF)
• Genetic resources; breeding programs; cultivar adoption, agronomy; increased agricultural production

Why SeeD?
[Figure: global average yield (tons per hectare) of wheat and maize by year, 1960 to 2050, with anticipated demand by 2050 (FAO). Source: USDA PDS database]
• Climate change
• Soil degradation and falling water tables
• Costs of fertilizer and energy
• Genetic erosion

Genetic resources for food security
Research emphasis (breeding-oriented vs. upstream) by trait complexity:
• Genetically simple traits [some diseases, phenology]: ‘low-hanging fruits’ for breeding
• Genetically complex traits [heat/drought tolerance]: main emphasis; mobilize novel alleles for complex traits into breeding programs
• Upstream: seek collaborations to mine data for basic research

Strategy
1 Molecular atlases
• Underutilized sources of genetic variation
• Selection imprints
• Heterotic patterns (maize)
• Hidden translocations (wheat)
• Rare recombinants
Genomic association
2 Novel alleles and allele donors
• Novel, beneficial alleles, haplotypes
• Markers linked to loci and alleles that control priority traits
• Genetically distinct ‘donor accessions’
3 ‘Bridging germplasm’

Project areas
1 Molecular atlases (diversity surveys)
2 Novel alleles and ‘allele donors’ (GWAS)
3 Pre-breeding ‘bridging germplasm’
4 Information management
5 Capacities (genetic-analysis service)

GbS

Genotyping by sequencing (GbS)
• Transition from genotyping-by-assay (gel, hybridization) towards
genotyping-by-sequencing
• Similar to the transition from analogue to digital photography
• Simultaneously discovers DNA polymorphisms and classifies their allelic states: an advantage for characterizing unknown genetic diversity in genebanks
• Minimizes ascertainment bias
• Configurable platform: adjust No. of markers vs. No. of DNA samples
Two ‘flavors’:
• DArT: ~60-70K markers, SNP & PAV, ~20-35% missing data, lower error rates, calling of heterozygotes for a subset of SNP markers, no imputation; used for maize & wheat diversity surveys
• Cornell: ~800K markers, SNP only, ~60% missing data, higher error rates, no heterozygote detection, imputation; used for maize GWAS

Genetic-analysis service (SAGA)
● Provide services, based on modern genomics platforms, which address the needs of demand-driven, impact-oriented agricultural R&D
● Partnership with DArT (Diversity Arrays Technology) in Australia
● Objectives:
  • Economies of scale for characterizing SeeD samples using GbS
  • Genome-profiling & value-adding services to scientists in Mexico and the region
  • Vehicle for capacity building

IT ‘ecosystem’ of SeeD
• Database & interfaces for primary data (KDDart, IBFieldbook), for managing experiments (inventories, germplasm evaluation, etc.)
• Web portal & data warehouse (Germinate): , and validated genotypic & phenotypic data
• To be open-sourced from the first production version onwards (2015)
• Data access layer
• Visualization tools (Flapjack, CurlyWhirly, …)
• Database modules
• Genebank management (GrinGlobal)
• Web services
• (Collaboration with DArT and James Hutton Inst.)
• High-level data repository (Genesys): passport & summarized data

Wheat diversity survey
● 42,000 accessions sequenced to date using DArTseq
● One individual per accession
● ~30,000 SNP and ~30,000 PAV per sample
● Comprehensive diversity analysis and design of AM panels are underway
● Positioning of markers using new consensus map
● Target: Characterize up to 160,000 accessions (120,000 from CIMMYT)

Building AM panels
[Diagram labels: phenotypic values; genetic diversity; core set / AM panel]

Maize diversity survey
● To get an accurate representation of maize landraces, we need to score heterozygotes
● DArTseq is based on multiple REs whose combination deliberately generates a smaller number of fragments for deeper sequencing
● The PstI enzyme used for DArTseq partly overlaps with ApeKI (Cornell), giving partly overlapping representations
● Can score heterozygotes at many loci because multiple copies of each tag are sequenced (ca. 2 M fragments are typically sequenced per sample)

Genotyping bulks
● Can genotype bulks and derive population-level allele frequencies; this reduces the cost of the diversity survey by more than an order of magnitude
● Most accessions are genetically heterogeneous landraces, so multiple individuals need to be genotyped (SSRs: 15–30 individuals)
● The allele frequencies derived are representative of allele frequencies in the accessions (populations)
● PAVs: genetic distances among populations [figure: pools vs. individual samples]

No. of individuals per bulk?
● Compared separately assembled bulks of increasing size
● Little change above bulk sizes of 32
● Used bulks of 30 leaf discs from 30 individuals for diversity survey
● Pooling at leaf-disc and DNA sample levels gave indistinguishable results

[Table: comparison of bulks of increasing size; row and column labels are bulk sizes]
         4     8    12    16    20    24    28    32    36    40    44    48
   4  22.3  21.5  18.8  17.2  17.8  17.1  17.5    17  17.9  17.7  17.4  17.2
   8  11.7   9.8   6.4   4.5   4.5   4.1   3.2   4.3   4.3   4.2   4.2   3.5
  12  12.2     9   6.1   3.9   3.7   2.1   3.2   3.3   2.8   3.1   3.3   2.6
  16  11.2     9   5.1   2.6   2.4   1.8   2.3   2.4   1.9   2.3   1.7     2
  20  11.9   9.4   3.7   2.6   3.3   3.1   2.4   2.1     3   2.1   2.8   2.3
  24  12.2   9.7   5.9   2.1   1.4   2.2   2.7   1.6     3   2.4   1.4     2
  28  10.2   9.7   6.6   3.9     4   3.5   2.3     3   3.2   2.7   3.4   2.5
  32  11.9   9.1   5.2   2.5   2.3   1.7   2.2     2   1.7   2.2   1.6   1.8
  36  11.9   8.8   4.3   2.2   2.7   2.5   2.1   1.5   2.5   1.7   2.3   1.7
  40  11.3   8.8     5   2.4   2.3   1.4   1.2   1.7   1.6   1.2   1.6     1
  44  11.7     9   4.7   1.9   2.1   2.3   2.1   1.4   2.4   1.8   1.8   1.6
  48  11.1   8.1   4.7   2.6   2.5   1.7   1.7   2.3   2.3   1.8     2   1.1

[Diagram: Accession 1, Accession 2, Accession 3, ..., Accession 40,000; 30 plants each; 1 DNA sample each]

Molecular Atlas
Genetic relationships amongst accessions, selection footprints, race classification, etc.
Started to genotype up to 40,000 accessions
• High-density genome profiles from “bulk” samples
• Allele frequencies within accessions
Just finished 20,000 accessions…
• > 230,000 SNP identified (likely to increase upon re-calling the entire set)
• Only 20% map to B73 reference genome! Whole-genome re-sequencing of ca. 20 landraces in progress
• Enriched for gene-rich regions (methylation filtration effect)
• Target: Characterize up to 40,000 accessions (27,000 from CIMMYT)
Figure axes: No.
of SNP within window; position on chromosome

Next steps
● Environmental-selection footprints
  • 18,500 accessions with good-quality geo-location data
  • Extracted long-term abiotic environment data
  • Identify allele/haplotype-frequency gradients across environmental clines in entire genebank collection
● Breeding-selection footprints
  • Multiple cycles of recurrent-selection populations genotyped
  • Identify response to selection
● Race-specific footprints

Maize GWAS
[Diagram labels: Accession 1 … Accession 4,500; Tester; GbS; Field trials; GWAS]
● Existing core collection of 4,500 landraces, three adaptation zones
● Assumption: haplotypes replicated across accessions, so testcross one individual per accession with adaptation-zone-specific hybrid
● Genotyped testcross parents

Field trials for GWAS
Collected 700,000 data points from 34 trials across 14 locations
Traits evaluated:
• Abiotic stresses: heat, drought, low N
• Biotic stresses: tar spot, ear rot, stalk rot, Turcicum, Cercospora
• Grain quality: hardness, starch, oil, amino acids, phenolics

GbS profiles of testcross parents
● Genotyped both with Cornell GBS and DArTseq methods
  • Maximize marker density (Cornell)
  • Enable identification of heterozygote regions (DArTseq)
[Figure labels: Highland, Subtropical, Tropical; 36 Latin American countries]
● Imputation based on prevalent haplotypes detected in ca.
40,000 maize samples genotyped on Cornell platform
● Little genetic structure

Proof of concept: Days to silking
● GWAS approach works
● Marker density just sufficient
[Figure label: Anthesis: teosinte-derived inversion]

Tar spot disease complex
● Up to 46% yield loss
● Caused by Phyllachora maydis and Monographella maydis in association
[Figure labels: yield; testcrosses; accessions; tar spot incidence]
Chromosome 9 (position 139,172,758): P = 1.01e-7

Next steps
1 Molecular atlases
Genomic association
2 Novel alleles and allele donors
3 ‘Bridging germplasm’
New breeding approaches and technologies; new tools such as GS
Elite germplasm selected by breeders
• Breeder-ready lines & populations with new, beneficial alleles for priority characters in elite genetic backgrounds: joint linkage/association mapping & trait mobilization into breeding programs
• Molecular markers linked to beneficial alleles and statistical models for estimating breeding values to accelerate genetic progress in breeding programs

Maize ‘bridging germplasm’
Useful novel alleles & haplotypes
Early generation lines & pools enriched for favorable alleles
…using multiple strategies defined by trait complexity and breeder needs (desired input germplasm, demand for new sources)

Breeder demand | Trait complexity: Monogenic (1-3) | Oligogenic (4-10) | Polygenic (>10)
Urgent | DH from landrace & landrace / line crosses, selfing | DH from landrace & landrace / line crosses, selfing | GS with MABC for BC1S1 development
Medium-term | MABC | MARS & prediction index | GS with MABC for BC1S2 development
Long-term | MABC & GS | MARS, prediction index & GS | GS with MABC for BC1S2 development

Wheat ‘bridging germplasm’
[Figure: crossing scheme with Exotic 1, Exotic 2, Elite 1 and Elite 2; ratios 50:50 and 25:75; Family 1 and Family 2 of fixed lines]
• 200 exotics (synthetics, landraces; FIGS)
• 10 elites selected by
breeders
• Currently at TC1F3 stage
• TC chains with partly overlapping elite parents
[Figure: linked topcrosses among exotic and elite parents (Elite 2, Elite 3, …); ratios 50:50 and 25:75; each topcross yields a family of fixed lines]

Linked topcross panel (LTP)
Linked Topcross Panel (LTP) for joint linkage/association mapping, to identify novel exotic alleles that are expressed across several elite genetic backgrounds

Thank you!
http://seedsofdiscovery.org ([email protected])

Legend: participants from Mexican institutions; participants from CIMMYT; participants from other countries
Jonás Aguirre (UNAM), Flavio Aragón (INIFAP), Odette Avendaño (LANGEBIO), Ed Buckler (Cornell Univ.), Juan Burgueño, Vijay Chaikam, Alain Charcosset (AMAIZING), Gabriela Chávez (INIFAP), Jiafa Chen, Charles Chen, Andrés Christen (CIMAT), Angelica Cibrian (LANGEBIO), Héctor M. Corral (AGROVIZION), Moisés Cortés (CNRG), Sergio Cortez (UPFIM), Denise Costich, Lino de la Cruz (UdeG), Armando Espinosa (INIFAP), Néstor Espinosa (INIFAP), Gilberto Esquivel (INIFAP), Luis Eguiarte (UNAM), Gaspar Estrada (UAEM), Juan D. Figueroa (CINVESTAV), Pedro Figueroa (INIFAP), Jorge Franco (UDR), Guillermo Fuentes (INIFAP), Amanda Gálvez (UNAM), Héctor Gálvez (SAGA), Karen García, Silverio García (ITESM), Noel Gómez (INIFAP), Gregor Gorjanc (Roslin Inst.), Sarah Hearne, Carlos Hernández, Juan M.
Hernández (INIFAP), Víctor Hernández (INIFAP), Luis Herrera (LANGEBIO), John Hickey (Roslin Inst.), Huntington Hobbs, Puthick Hok (DArT), Javier Ireta (INIFAP), Andrzej Kilian (DArT), Huihui Li, Francisco J. Manjarrez (INIFAP), David Marshall (JHI), César Martínez, Carlos G. Martínez (UAEM), Manuel Martínez (SAGA), Iain Milne (JHI), Terrence Molnar, Moisés M. Morales (UdeG), Henry Ngugi, Alejandro Ortega (INIFAP), Iván Ortíz, Leodegario Osorio (INIFAP), Natalia Palacios, José Ron Parra (UdeG), Tom Payne, Javier Peña, Cesar Petroli (SAGA), Kevin Pixley, Ernesto Preciado (INIFAP), Matthew Reynolds, Sebastian Raubach (JHI), María Esther Rivas (BIDASEM), Carolina Roa, Alberto Romero (Cornell Univ.), Ariel Ruíz (INIFAP), Carolina Saint-Pierre, Jesús Sánchez (UdeG), Gilberto Salinas, Yolanda Salinas (INIFAP), Carolina Sansaloni (SAGA), Ruairidh Sawers (LANGEBIO), Sergio Serna (ITESM), Paul Shaw (JHI), Rosemary Shrestha, Aleyda Sierra (SAGA), Pawan Singh, Sukhwinder Singh, Giovanni Soca, Ernesto Solís (INIFAP), Kai Sonder, Maria Tattaris, Maud Tenaillon (AMAIZING), Fernando de la Torre (CNRG), Heriberto Torres (Pioneer), Samuel Trachsel, Grzegorz Uszynski (DArT), Ciro Valdés (UANL), Griselda Vásquez (INIFAP), Humberto Vallejo (INIFAP), Víctor Vidal (INIFAP), Eduardo Villaseñor (INIFAP), Prashant Vikram, Martha Willcox, Peter Wenzl, Víctor Zamora (UAAAN) Contributed at the beginning: Gary Atlin, Michael Baum (ICARDA), David Bonnett, Paul Brennan (CropGen), Etienne Duveiller, Mustapha ElBouhssini (ICARDA), Marc Ellis, Ky Matthews, Bonnie Furman, Marta Lopes, George Mahuku, Francis Ogbonnaya (ICARDA), Ken Street (ICARDA)
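
Illustrative sketch for the ‘Genotyping bulks’ and ‘No. of individuals per bulk?’ slides: the slides state that bulks yield population-level allele frequencies and that results change little once bulks are large enough. The Python below is only a minimal sketch of that idea under stated assumptions, not the DArTseq or SeeD calling pipeline, which the slides do not describe. The function names, the use of simple read-count ratios, and the binomial sampling model are all hypothetical choices made for illustration.

```python
import math


def bulk_allele_frequency(ref_reads: int, alt_reads: int) -> float:
    """Estimate the alternate-allele frequency in a bulk from read counts.

    Assumption (not from the slides): each individual contributes equally
    to the pool and both alleles are sequenced without bias.
    """
    total = ref_reads + alt_reads
    if total == 0:
        raise ValueError("no reads at this locus")
    return alt_reads / total


def sampling_se(p: float, n_individuals: int, ploidy: int = 2) -> float:
    """Binomial standard error of an allele frequency estimated from
    n_individuals sampled at random from a population with frequency p."""
    if n_individuals <= 0:
        raise ValueError("n_individuals must be positive")
    return math.sqrt(p * (1.0 - p) / (ploidy * n_individuals))


if __name__ == "__main__":
    # Hypothetical locus: 30 reference reads and 10 alternate reads.
    print("bulk frequency:", bulk_allele_frequency(30, 10))
    # Sampling error shrinks with bulk size, with diminishing returns.
    for n in (4, 8, 16, 30, 48):
        print(n, "individuals -> SE", round(sampling_se(0.5, n), 3))
```

Under this simple model the standard error falls with the square root of the bulk size, so going from 4 to 16 individuals halves it, while going from 30 to 48 changes it much less. This is only one possible reading of why larger bulks add little; the slides' own comparison is empirical.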