Series Training Courses on
Genetic Analysis and Plant Breeding
23-27 June 2014
International Center for Tropical Agriculture (CIAT), Km 17, Recta
Cali-Palmira, Apartado Aéreo 6713, Cali, Colombia
Phone: +57 2 4450000 (direct), +1 650 8336625 (via USA)
Fax: +57 2 4450073 (direct), +1 650 8336626 (via USA)
Presenter: Dr. Jiankang Wang, CIMMYT China and Chinese Academy of Agricultural Sciences (CAAS);
[email protected] or [email protected]
When:
Monday 23rd – Friday 27th June 2014
Where:
CIAT, Cali, Colombia
Contact:
For logistics, contact Janeth Vargas, CIAT ([email protected]). For technical
issues, contact Luis Narro, CIMMYT Colombia ([email protected]), Joe Tohme, CIAT
([email protected]), or Jiankang Wang, CIMMYT China and CAAS ([email protected]).
Support:
HarvestPlus Challenge Program, CIAT and CIMMYT
Objectives of the Course
Through lectures, practices and discussions, you will learn:
• Plant breeding methodology
• Applied quantitative genetics
• Estimation of recombination between two linked loci
• Construction of genetic linkage maps
• Principles of QTL mapping and statistical comparison of different mapping methods
• Identification of quantitative trait genes with additive (and dominance) effects
• Identification of quantitative trait genes with epistasis effects
• QTL by environment analysis
• Modeling of plant breeding
• Comparison and optimization of plant breeding strategies
• Integration of known gene information into conventional plant breeding
Who should attend?
Those who are interested in applied quantitative genetics, linkage analysis, linkage map construction,
QTL mapping, simulation and optimization of breeding strategies will benefit from this course.
Participants should be familiar with basic methods in plant genetics, plant breeding and statistics.
Computers and software: Each participant must bring a laptop computer that can run Microsoft
Windows applications. A USB memory stick will be distributed to all participants at the beginning of
the course. This contains the lecture presentations, the QTL IciMapping integrated software,
QU-GENE simulation tools, exercises and answers, etc.
Registration details: The registration fee for this course is USD 120. Registration includes a complete
set of electronic course notes, transportation, lunch and coffee breaks.
Numbers are limited and places will be offered on a first-come, first-served basis. Please indicate that
you would like to attend as soon as possible by direct email to Janeth Vargas
([email protected]). Your place will be confirmed on receipt of the registration form and
payment. Please state your profession on the form so that lecturers can tailor demonstrations and
course elements to your subject of interest or research.
For international payments through electronic transfers:
Account name: Centro Internacional de Agricultura Tropical - CIAT
Bank Name: HSBC USA
Account number: 193136341
Bank Address: Doral Office - 4090 NW 97th Avenue, Miami, FL 33178
ABA : 021001088
ACH : 067009390
Swift : MRMDUS33
Internal Number: HIA03
Participant's name
For national payment:
Account name: Centro Internacional de Agricultura Tropical – CIAT
Bank Name: Banco Popular
Account number: 583-72072-7
Internal Number: HIA03
Participant's name
For registrations, the bank deposit receipt should be included with the application, together with the
name of the participant to whom the payment should be credited.
Venue: The training course will be run by the International Maize and Wheat Improvement Center
(CIMMYT) and the International Center for Tropical Agriculture (CIAT), and held at CIAT
headquarters in Cali, Colombia.
Program
Each day, four lectures of 50 minutes each will be delivered: two in the morning and two in the
afternoon. Each pair of lectures will be followed by practicals.
June 23 (Monday): Arrival
June 24 (Tuesday): Quantitative Genetics and Plant Breeding
Genetic population and population structure, additive and dominance genetic model, mating designs
and estimation of genetic variance and heritability, prediction of genetic gain, correlated selection
and index selection etc.
Lecture 1.1: History and Contents of Quantitative Genetics
Lecture 1.2: Introduction of Population Genetics
Practical 1.1: Find the structure of genetic populations
Practical 1.2: Find the number of effective factors under multi-factor hypothesis
Lecture 1.3: Classical Quantitative Genetics
Lecture 1.4: Quantitative Genetics and Plant Breeding
Practical 1.3: ANOVA and estimation of heritability
Practical 1.4: Genetic mating designs, estimation of genetic variances and genetic gain
June 25 (Wednesday): Genetic Linkage Analysis and Map Construction
Estimation of recombination frequency between two loci in biparental genetic populations, genetic
interference and mapping function, linkage map construction, handling redundant markers,
integration of multiple linkage maps to generate a consensus map
Lecture 2.1: Linkage Analysis between Two Loci (Two-point Analysis)
Lecture 2.2: Estimation of Recombination Frequency in Different Genetic Populations
Practical 2.1: Install the QTL IciMapping software and get familiar with it
Practical 2.2: Handling the redundancy of markers (BIN functionality in QTL IciMapping)
Practical 2.3: Estimation of recombination frequency between two loci (Tool 2pointREC in
QTL IciMapping)
Lecture 2.3: Three-point Linkage Analysis
Lecture 2.4: Genetic Map Construction: Grouping, Ordering and Rippling
Practical 2.4: Linkage map construction (MAP functionality in QTL IciMapping)
Practical 2.5: Construction of consensus maps (CMP functionality in QTL IciMapping)
June 26 (Thursday): QTL Linkage Mapping
Principle of QTL mapping, conventional Interval Mapping, Inclusive Composite Interval Mapping
(ICIM) of QTLs, two-dimensional scanning for additive by additive interactions, two-dimensional
scanning for additive by additive, additive by dominance, dominance by additive, and dominance by
dominance interactions
Lecture 3.1: Single Marker Analysis and the Conventional Interval Mapping
Lecture 3.2: Inclusive Composite Interval Mapping (ICIM) of additive (dominance) QTL
Practical 3.1: Use of Single Marker Analysis to identify QTL (BIP functionality in QTL IciMapping)
Practical 3.2: Use of Interval Mapping to identify QTL (BIP functionality in QTL IciMapping)
Lecture 3.3: Inclusive Composite Interval Mapping (ICIM) of epistasis QTL
Lecture 3.4: LOD Threshold and QTL Detection Power Simulation
Practical 3.3: Use of Inclusive Composite Interval Mapping to identify QTL (BIP functionality in QTL
IciMapping)
Practical 3.4: Use of Inclusive Composite Interval Mapping to identify epistatic QTL (BIP simulation
functionality in QTL IciMapping)
June 27 (Friday): Other QTL Mapping Methods and Breeding Simulation
QTL mapping with chromosome segment substitution (CSS) lines, other QTL mapping methods, QTL
mapping in nested association mapping (NAM) populations, frequently asked questions in QTL
mapping studies, principles of breeding simulation, defining genetic models in QU-GENE, defining
breeding methods in QuLine, comparing breeding methods through simulation, and use of known
genes in plant breeding
Lecture 4.1: QTL Mapping with Chromosome Segment Substitution (CSS) Lines and Segregation
Distortion Loci Mapping
Lecture 4.2: Frequently Asked Questions and Answers in QTL Mapping
Practical 4.1: Comparison of mapping methods by simulation (BIP functionality in QTL IciMapping)
Practical 4.2: Other functionalities of the QTL IciMapping software (MET: identification of QTL by
environment interactions; CSL: QTL mapping with chromosome segment substitution lines; SDL:
identification of segregation distortion loci; NAM: QTL mapping in NAM populations)
Lecture 4.3: Principle of Modeling and Breeding Simulation
Lecture 4.4: Applications of Breeding Simulation
Practical 4.3: Define a genetic model and a breeding parental population for the QU-GENE engine
Practical 4.4: Define a breeding strategy for the simulation tool QuLine
Practical 4.5: Run a QU-GENE simulation experiment; view and explain the simulation results
16:30, June 27: Course Closing & Certificates
Exercises
Note: Exercises 1-7 can be completed in Microsoft Excel.
Note: Exercises 8-16 can be completed in the QTL IciMapping software.
Note: Exercises 18-20 can be completed in the QU-GENE and QuLine software.
Exercise 1. Assume that red flower color is dominant over white flower color, and that two alleles,
A and a, affect flower color. Red individuals in an F2 population may have genotype AA or Aa. F3
families are needed to determine the genotype of red-flowered individuals in the F2 population.
When no segregation for flower color is observed in an F3 family, the family is said to be derived
from the homozygous genotype AA. In contrast, when segregation is observed, the family is said to
be derived from the heterozygous genotype Aa.
• If one F2 individual has the genotype Aa, and 5 individuals from its selfed seed are grown in the
following F3 family, what is the probability that the F2 individual will be misclassified as AA?
• If we wish the error probability to be below 0.05, how many F3 individuals should be grown?
• If we wish the error probability to be below 0.01, how many F3 individuals should be grown?
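The reasoning behind Exercise 1 can be sketched in a few lines of Python (the course itself suggests Excel, so the language and the function names here are only illustrative). The sketch assumes that each selfed progeny of an Aa plant is red with probability 3/4, independently of the others, so an Aa plant is misclassified as AA only when all n progeny are red.

```python
import math

def misclassification_prob(n_f3: int, p_dominant: float = 0.75) -> float:
    """Probability that all n_f3 progeny of an Aa plant show the dominant phenotype."""
    return p_dominant ** n_f3

def min_family_size(alpha: float, p_dominant: float = 0.75) -> int:
    """Smallest family size n for which p_dominant**n is below alpha."""
    n = math.ceil(math.log(alpha) / math.log(p_dominant))
    # Guard against floating-point ties exactly at the boundary.
    while p_dominant ** n >= alpha:
        n += 1
    return n

print(round(misclassification_prob(5), 4))  # 0.2373
print(min_family_size(0.05))                # 11
print(min_family_size(0.01))                # 17
```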
Exercise 2. The following table gives the frequency distribution of height (inches) among British
women.
Interval   Midpoint   Number of women
53-55      54         5
55-57      56         33
57-59      58         254
59-61      60         813
61-63      62         1340
63-65      64         1454
65-67      66         750
67-69      68         275
69-71      70         56
71-73      72         11
73-75      74         4
• Calculate the sample mean and sample variance.
• Draw the bar graph of the frequency distribution.
• Test whether the height can be fitted by a normal distribution.
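As a rough cross-check for the Excel work in Exercise 2, the grouped-data calculations can be sketched in Python. This is only an illustration: it treats every woman in a class as standing at the class midpoint, uses the n - 1 divisor for the variance, and treats the first and last classes as open-ended when computing expected counts. All function names are hypothetical.

```python
import math

# Class midpoints (inches) and counts copied from the Exercise 2 table.
midpoints = list(range(54, 76, 2))
counts = [5, 33, 254, 813, 1340, 1454, 750, 275, 56, 11, 4]

def grouped_mean_var(mids, freqs):
    """Sample size, mean and variance (n - 1 divisor) of grouped data."""
    n = sum(freqs)
    mean = sum(m * f for m, f in zip(mids, freqs)) / n
    var = sum(f * (m - mean) ** 2 for m, f in zip(mids, freqs)) / (n - 1)
    return n, mean, var

def normal_cdf(x, mu, sd):
    return 0.5 * (1.0 + math.erf((x - mu) / (sd * math.sqrt(2.0))))

def expected_counts(mids, n, mu, sd, half_width=1.0):
    """Expected class counts under a normal distribution with open-ended tails."""
    out = []
    for i, m in enumerate(mids):
        lo = -math.inf if i == 0 else m - half_width
        hi = math.inf if i == len(mids) - 1 else m + half_width
        out.append(n * (normal_cdf(hi, mu, sd) - normal_cdf(lo, mu, sd)))
    return out

n, mean, var = grouped_mean_var(midpoints, counts)
expected = expected_counts(midpoints, n, mean, math.sqrt(var))
print(n, round(mean, 2), round(var, 2))
```

The expected counts can then be compared with the observed counts in a chi-square goodness-of-fit test; classes with very small expected counts (the upper tail here) are usually pooled with their neighbours first.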
Exercise 3. The M-N blood groups in man are determined by two alleles at a locus, and the three
genotypes correspond with the three blood groups. The following table gives the blood group
frequencies in two populations.
Genotype   Population I   Population II
MM         475            233
MN         89             385
NN         5              129
Total      569            747
• Calculate gene frequencies and genotypic frequencies in the two populations.
• Test whether the two populations are in Hardy-Weinberg equilibrium (HWE).
• Test whether the two populations have the same structure (you can test whether the two populations
have equal gene frequencies).
• Suppose we mix the two populations together to form a mixture population. Test whether the mixture
population is in HWE.
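The first two tasks of Exercise 3 can be sketched as follows. The sketch counts alleles directly from the co-dominant genotypes and uses a chi-square goodness-of-fit test with 1 degree of freedom (3 classes, minus 1, minus 1 estimated allele frequency), with 3.841 as the 5% critical value. The function name is hypothetical.

```python
CHI2_CRIT_1DF = 3.841  # 5% critical value of chi-square with 1 degree of freedom

def allele_freq_and_hwe(n_mm, n_mn, n_nn):
    """Return the frequency of allele M and the chi-square statistic for HWE."""
    n = n_mm + n_mn + n_nn
    p = (2 * n_mm + n_mn) / (2 * n)  # gene counting
    q = 1.0 - p
    observed = (n_mm, n_mn, n_nn)
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return p, chi2

p1, chi1 = allele_freq_and_hwe(475, 89, 5)     # Population I
p2, chi2_ = allele_freq_and_hwe(233, 385, 129)  # Population II
for name, p, chi in (("I", p1, chi1), ("II", p2, chi2_)):
    verdict = "consistent with HWE" if chi < CHI2_CRIT_1DF else "deviates from HWE"
    print(f"Population {name}: p(M) = {p:.4f}, chi-square = {chi:.3f}, {verdict}")
```

The same function can be applied to the pooled counts for the mixture population. Note that the expected number of NN individuals in Population I is below 5, so the chi-square approximation there is rough.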
Exercise 4 (optional). The ABO blood groups in man are determined by three alleles at a locus. The
following table gives the blood group frequencies in one population.
Blood type   A (AA+AO)   B (BB+BO)   AB    OO     Total
Frequency    2162        738         228   2876   6004
• Calculate the frequencies of alleles A, B and O using the EM algorithm.
• Test whether the population is in Hardy-Weinberg equilibrium (HWE).
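A minimal sketch of the EM (gene-counting) iteration for Exercise 4 is given below. It assumes random mating when splitting the A and B phenotypes into homozygotes and heterozygotes in the E-step; the starting values, tolerance and function name are arbitrary choices made for this illustration.

```python
def abo_em(n_a, n_b, n_ab, n_o, max_iter=1000, tol=1e-12):
    """Estimate frequencies (p, q, r) of alleles A, B and O by EM gene counting."""
    n = n_a + n_b + n_ab + n_o
    p = q = r = 1.0 / 3.0  # arbitrary starting values
    for _ in range(max_iter):
        # E-step: expected genotype counts within the A and B phenotypes.
        n_aa = n_a * p * p / (p * p + 2 * p * r)
        n_ao = n_a - n_aa
        n_bb = n_b * q * q / (q * q + 2 * q * r)
        n_bo = n_b - n_bb
        # M-step: count alleles among the 2n gene copies.
        p_new = (2 * n_aa + n_ao + n_ab) / (2 * n)
        q_new = (2 * n_bb + n_bo + n_ab) / (2 * n)
        r_new = (2 * n_o + n_ao + n_bo) / (2 * n)
        change = abs(p_new - p) + abs(q_new - q) + abs(r_new - r)
        p, q, r = p_new, q_new, r_new
        if change < tol:
            break
    return p, q, r

p, q, r = abo_em(2162, 738, 228, 2876)
print(f"p(A) = {p:.4f}, q(B) = {q:.4f}, r(O) = {r:.4f}")
```

With the estimates in hand, the expected phenotype counts n(p² + 2pr), n(q² + 2qr), 2npq and nr² can be compared with the observed counts in a chi-square test with 1 degree of freedom (4 classes, minus 1, minus 2 independent allele frequencies).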
Exercise 5. The following table gives the frequency distributions on flower length (mm) of tobacco in
two fixed parental lines P1 and P2, and their F1 and F2 generations (East 1913). Assuming P1 has all the
alleles increasing flower length, P2 has all the alleles reducing flower length.
• Estimate mean and variance of each population.
• Estimate the effective number of genes on flower length.
• Estimate the gene effect under the polygene hypothesis.
Flower length (mm) classes: 34 37 40 43 46 49 52 55 58 61 64 67 70 73 76 79 82 85 88 91 94 97 100
Pop.   Size   Frequencies
P1     211    1 21 140 49
F1     98     4 10 41 40 3
P2     168    13 45 91 19
F2     444    3 9 18 47 55 93 75 60 43 25 7 8 1
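One common way to approach the second and third tasks of Exercise 5 is the Castle-Wright estimator; the handout does not say which estimator the course uses, so treat this as an assumption. It supposes equal, additive, unlinked genes, with the environmental variance estimated from a non-segregating generation such as the F1. The numbers in the usage lines are made up for illustration and are not taken from the table.

```python
def castle_wright(mean_p1, mean_p2, var_f2, var_env):
    """Effective number of genes: (P1 - P2)^2 / (8 * genetic variance in F2)."""
    var_genetic = var_f2 - var_env
    if var_genetic <= 0:
        raise ValueError("F2 variance must exceed the environmental variance")
    return (mean_p1 - mean_p2) ** 2 / (8.0 * var_genetic)

def additive_effect_per_gene(mean_p1, mean_p2, n_genes):
    """Additive effect a of each gene if the parental difference is 2 * k * a."""
    return abs(mean_p1 - mean_p2) / (2.0 * n_genes)

# Hypothetical inputs, NOT the tobacco data from the table.
k = castle_wright(mean_p1=100.0, mean_p2=60.0, var_f2=50.0, var_env=10.0)
a = additive_effect_per_gene(100.0, 60.0, k)
print(k, a)  # 5.0 4.0
```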
Exercise 6. Three replicated observations of four wheat isogenic lines are given in the following
table.
Genotype   Replicate 1   Replicate 2   Replicate 3
AABB       232           233           242
AAbb       224           220           200
aaBB       219           211           209
aabb       151           152           151
• Conduct ANOVA by considering main effects at locus A and locus B, and the interaction between the
two loci.
• From ANOVA, calculate the variance components at locus A and locus B.
• From ANOVA, calculate the variance component of the interaction between locus A and locus B.
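The sums of squares for Exercise 6 can be checked with a short sketch of a balanced two-way ANOVA. It treats the three replicates as independent observations within each genotype (a completely randomized layout, which is an assumption; the handout does not describe the field design). The function name is hypothetical.

```python
# Observations from the Exercise 6 table, keyed by (locus A genotype, locus B genotype).
data = {
    ("AA", "BB"): [232, 233, 242],
    ("AA", "bb"): [224, 220, 200],
    ("aa", "BB"): [219, 211, 209],
    ("aa", "bb"): [151, 152, 151],
}

def two_way_anova(cells):
    """Sums of squares for A, B, A x B and error in a balanced factorial."""
    all_obs = [y for ys in cells.values() for y in ys]
    n = len(all_obs)
    cf = sum(all_obs) ** 2 / n  # correction factor
    ss_total = sum(y * y for y in all_obs) - cf
    ss_cells = sum(sum(ys) ** 2 / len(ys) for ys in cells.values()) - cf

    def ss_factor(idx):
        totals, sizes = {}, {}
        for key, ys in cells.items():
            totals[key[idx]] = totals.get(key[idx], 0) + sum(ys)
            sizes[key[idx]] = sizes.get(key[idx], 0) + len(ys)
        return sum(t * t / sizes[level] for level, t in totals.items()) - cf

    ss_a, ss_b = ss_factor(0), ss_factor(1)
    ss_ab = ss_cells - ss_a - ss_b
    ss_error = ss_total - ss_cells
    df_error = n - len(cells)
    return {"A": ss_a, "B": ss_b, "AxB": ss_ab, "error": ss_error, "df_error": df_error}

result = two_way_anova(data)
ms_error = result["error"] / result["df_error"]
for source in ("A", "B", "AxB"):
    # Each effect has 1 degree of freedom, so its mean square equals its sum of squares.
    print(source, round(result[source], 3), "F =", round(result[source] / ms_error, 2))
```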
Exercise 7. Means of the nine genotypes at two unlinked loci were given in the following table. The
two phenotypes have a 15:1 ratio in F2 population. Genotypes of the two parental lines are AABB and
aabb.
      AA   Aa   aa
BB    13   13   13
Bb    13   13   13
bb    13   13   1
• Calculate the two additive effects, the two dominance effects and the four epistasis effects.
• For an F2 population, calculate population mean and genetic variance.
• For an F3 population, calculate population mean and genetic variance.
• For a RIL population, calculate population mean and genetic variance.
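The population means and genetic variances asked for in Exercise 7 can be sketched as below. The sketch assumes selfing without selection from the F1, so that heterozygosity at each locus is 1/2 in F2, 1/4 in F3 and 0 in RILs, and it multiplies single-locus frequencies because the two loci are unlinked. The variance is the variance among individual genotypic values in that generation. Names are illustrative.

```python
A_GENOTYPES = ("AA", "Aa", "aa")
B_GENOTYPES = ("BB", "Bb", "bb")

# Genotypic means from the Exercise 7 table: 13 everywhere except aabb = 1.
MEANS = {(a, b): 13.0 for a in A_GENOTYPES for b in B_GENOTYPES}
MEANS[("aa", "bb")] = 1.0

def mean_and_variance(heterozygosity):
    """Population mean and genetic variance for a given single-locus heterozygosity."""
    hom = (1.0 - heterozygosity) / 2.0
    freqs = (hom, heterozygosity, hom)  # same order as the genotype tuples
    mean = 0.0
    mean_sq = 0.0
    for fa, a in zip(freqs, A_GENOTYPES):
        for fb, b in zip(freqs, B_GENOTYPES):
            f = fa * fb  # unlinked loci: joint frequency is the product
            mean += f * MEANS[(a, b)]
            mean_sq += f * MEANS[(a, b)] ** 2
    return mean, mean_sq - mean ** 2

f2 = mean_and_variance(0.5)
f3 = mean_and_variance(0.25)
ril = mean_and_variance(0.0)
print("F2 ", f2)   # (12.25, 8.4375)
print("F3 ", f3)
print("RIL", ril)  # (10.0, 27.0)
```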
Exercise 8. Suppose a co-dominant marker is linked with a resistant gene, and the resistant
phenotype is dominant to susceptible phenotype. One resistant inbred parent P1 has the marker
type MM, and one susceptible inbred parent P2 has the marker type mm. In an F2 population of the
two parents, frequencies of the three marker types for resistant phenotype and susceptible
phenotype were given in the following table.
Phenotype     Marker type   Frequency
Resistant     MM            572
Resistant     Mm            1161
Resistant     mm            14
Susceptible   MM            3
Susceptible   Mm            22
Susceptible   mm            569
• Test whether the three marker types fit the 1:2:1 segregation ratio.
• Test whether the two phenotypes fit the 3:1 segregation ratio.
• Test whether the marker locus is linked with the resistance locus.
• Calculate the recombination frequency between the marker locus and the resistance locus.
• In case the marker is also dominant, i.e. MM and Mm cannot be distinguished, work out the
sample sizes of the four genotypic groups. Re-calculate the recombination frequency if the
marker is dominant.
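Exercise 8 is meant to be done with the 2pointREC tool in QTL IciMapping; the sketch below is only an independent cross-check in Python and does not reproduce that tool's algorithm. It assumes coupling phase (P1 contributes the M marker allele together with the resistance allele), derives the six class probabilities from gamete frequencies (1 - r)/2 and r/2, and maximizes the log-likelihood by a simple grid search.

```python
import math

# Counts from the Exercise 8 table.
resistant = {"MM": 572, "Mm": 1161, "mm": 14}
susceptible = {"MM": 3, "Mm": 22, "mm": 569}

def chi_square(observed, ratios):
    """Goodness-of-fit chi-square against an expected segregation ratio."""
    n, total = sum(observed), sum(ratios)
    return sum((o - n * k / total) ** 2 / (n * k / total) for o, k in zip(observed, ratios))

def log_likelihood(r):
    """Log-likelihood of recombination frequency r (coupling phase, F2)."""
    p_res = {"MM": (1 - r * r) / 4, "Mm": (1 - r + r * r) / 2, "mm": r * (2 - r) / 4}
    p_sus = {"MM": r * r / 4, "Mm": r * (1 - r) / 2, "mm": (1 - r) ** 2 / 4}
    return sum(
        resistant[g] * math.log(p_res[g]) + susceptible[g] * math.log(p_sus[g])
        for g in resistant
    )

marker_totals = [resistant[g] + susceptible[g] for g in ("MM", "Mm", "mm")]
chi_marker = chi_square(marker_totals, (1, 2, 1))  # compare with 5.991 (2 df)
chi_pheno = chi_square(
    [sum(resistant.values()), sum(susceptible.values())], (3, 1)
)  # compare with 3.841 (1 df)

grid = [i / 10000 for i in range(1, 5000)]  # r from 0.0001 to 0.4999
r_hat = max(grid, key=log_likelihood)
print(round(chi_marker, 3), round(chi_pheno, 3), r_hat)
```

For the dominant-marker case, the MM and Mm counts within each phenotype would be pooled and the corresponding class probabilities added before maximizing.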
Exercise 9. Use the barley DH population (…\Examples\MAP\BarleyDH.map, BarleyDH.xls or
BarleyDH.xlsx) to construct the genetic linkage maps.
• Construct the seven linkage maps of barley.
• Output the seven barley linkage maps.
• Split one chromosome into two at the largest marker interval.
• Identify the segregation distortion loci in this population.
Exercise 10. Use the rice F2 population (…\Examples\MAP\RiceF2.map, RiceF2.xls or RiceF2.xlsx) to
construct the genetic linkage map.
• Construct the 12 rice linkage maps.
• Output the 12 rice linkage maps.
• Split one chromosome into two at the largest marker interval.
• Identify the segregation distortion loci in this population.
Exercise 11. Use the barley DH population (…\Examples\BIP\BarleyDH.bip, BarleyDH.xls or
BarleyDH.xlsx) to conduct QTL mapping.
• Find additive QTLs controlling kernel weight by using Interval Mapping and ICIM, and determine the
source of the QTL allele that increases kernel weight.
• Compare the mapping results from different mapping parameters.
• Find the largest interaction from ICIM epistatic mapping.
Exercise 12. Use the rice F2 population (…\Examples\BIP\RiceF2.bip, RiceF2.xls or RiceF2.xlsx) to
conduct QTL mapping.
• Find additive and dominance QTLs controlling the resistance by using Interval Mapping and ICIM. For
the identified QTLs, determine the source of the allele that reduces the resistance.
• Compare the mapping results from different mapping parameters.
Exercise 13. Compare two mapping methods by simulation using RIL populations with a size of 200:
Assume there are 5 chromosomes, each of 150 cM, and evenly distributed with 16 markers. Two
traits of interest are plant height and grain yield.
Plant height has a heritability of 0.7, and is controlled by 3 independent QTLs, located at 18 cM, 55
cM, and 101 cM on chromosomes 1, 2, and 3, respectively. Additive effects of the three QTL are 10
cm, 4 cm, and -6 cm, and the population mean is 100 cm. Dominance and gene interaction are not
considered.
Grain yield has a heritability of 0.5, and is controlled by 7 QTLs. One QTL is located at 25 cM on
chromosome 1; two are located at 35 cM and 73 cM on chromosome 2; two are located at 18 cM,
and 55 cM on chromosome 3; and two are located at 39 cM and 131 cM on chromosome 4. Additive
effects of the 7 QTL are 1 t/ha, -1 t/ha, 1 t/ha, 1 t/ha, 1 t/ha, -1 t/ha, and 1 t/ha, respectively, and the
population mean is 3 t/ha. Dominance and gene interaction are not considered.
Assume the support interval is 10 cM, i.e., in a simulated population one predefined QTL is declared
to be correctly identified if there is a significant peak in a chromosomal interval of 10 cM. The true
QTL location is at the center of the support interval. One hundred populations are simulated.
• Draw the average LOD profile of IM and ICIM for plant height and grain yield.
• Find out the detection power of IM and ICIM for each plant height and grain yield QTL.
• Find out the false discovery rate of IM and ICIM for plant height and grain yield.
• What else can you find from the power simulation?
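Before running the simulation in Exercise 13, it can help to see what error variance the stated heritabilities imply. The sketch below is a back-of-the-envelope calculation, not a description of how QTL IciMapping sets up its simulations. It assumes RILs derived by selfing, homozygous genotypic values of m ± a at each QTL, the Haldane mapping function, and heritability defined as Vg / (Vg + Ve) on a line basis. Linked QTLs on the same chromosome contribute a covariance term.

```python
import math

def haldane_r(distance_cm):
    """One-meiosis recombination frequency from map distance (Haldane, assumed)."""
    return 0.5 * (1.0 - math.exp(-2.0 * distance_cm / 100.0))

def ril_correlation(distance_cm):
    """Correlation of +/-1 genotype codes at two loci in selfed RILs.

    Uses R = 2r / (1 + 2r) for RILs by selfing, so 1 - 2R = (1 - 2r) / (1 + 2r).
    """
    r = haldane_r(distance_cm)
    return (1.0 - 2.0 * r) / (1.0 + 2.0 * r)

def genetic_variance(qtls):
    """qtls: list of (chromosome, position_cM, additive_effect)."""
    vg = sum(a * a for _, _, a in qtls)
    for i in range(len(qtls)):
        for j in range(i + 1, len(qtls)):
            ci, pi, ai = qtls[i]
            cj, pj, aj = qtls[j]
            if ci == cj:  # only linked QTLs covary
                vg += 2.0 * ai * aj * ril_correlation(abs(pi - pj))
    return vg

def error_variance(vg, h2):
    return vg * (1.0 - h2) / h2

plant_height = [(1, 18, 10.0), (2, 55, 4.0), (3, 101, -6.0)]
grain_yield = [(1, 25, 1.0), (2, 35, -1.0), (2, 73, 1.0), (3, 18, 1.0),
               (3, 55, 1.0), (4, 39, -1.0), (4, 131, 1.0)]

vg_ph = genetic_variance(plant_height)
vg_gy = genetic_variance(grain_yield)
print(vg_ph, round(error_variance(vg_ph, 0.7), 2))  # 152.0 65.14
print(round(vg_gy, 3), round(error_variance(vg_gy, 0.5), 3))
```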
Exercise 14. In Exercise 13, assume each chromosome is evenly distributed with 31 markers, i.e., the
marker density is 5 cM. How will the denser markers change the QTL detection?
Exercise 15. Use the rice CSSL population (…\Examples\CSL\CslMapping.csl, CslMapping.xls, or
CslMapping.xlsx) to conduct QTL mapping.
• Calculate the broad sense heritability for grain length.
• Find out the donor chromosomal segments which affect the grain length; explain whether these
segments increase or reduce the grain length.
• How stable is the expression of these donor chromosomal segments?
Exercise 16. Use your own data in the QTL IciMapping software.
Exercise 17. Do you have any comments and suggestions on the QTL IciMapping software? Do you
have any questions concerning the use of your own mapping populations in QTL IciMapping?
Exercise 18. Build a QU-GENE input file (QUG), and then run the QU-GENE engine to generate one
GES file and one POP file. The POP file consists of 50 homozygous parental lines. Other requirements
are:
• One environment type named Obregon in the TPE (target population of environments).
• Two traits, Maturity and Yield, have broad sense heritabilities of 0.4 and 0.2 at the individual
plant level, respectively. The among-plot error variance is assumed to be equal to the
within-plot error variance for both traits.
• There are two alleles at each locus.
• Six genes control Maturity, where m=140 days, a=3 days, and d=0 days. The shortest genetic
Maturity will be 122 days, and the longest genetic Maturity will be 158 days.
• Twenty-one per se genes control Yield, where m=5000 kg/ha, a=100 kg/ha, and d=0 kg/ha. The
last two maturity genes have a pleiotropic effect of a=100 kg/ha on yield as well. The longer the
maturity, the higher the yield. The lowest genetic Yield will be 2700 kg/ha, and the highest
genetic Yield will be 7300 kg/ha.
• The six maturity genes are linked with six yield genes on six respective chromosomes, i.e. 1A, 1B,
1D, 2A, 2B, and 2D, with recombination frequencies 0.05, 0.12, 0.15, 0.23, 0.30, and 0.17. The
other 15 yield genes are located on the other 15 chromosomes.
• All alleles have a frequency of 0.5 in the initial parental population, consisting of 50 homozygous
pure lines.
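The extreme genetic values quoted in Exercise 18 follow from the midpoint m plus or minus the sum of the additive effects, since d=0 and no epistasis is specified. A quick arithmetic check in Python (not QU-GENE input syntax; the function name is illustrative):

```python
def genetic_range(midpoint, additive_effects):
    """Lowest and highest homozygous genotypic values: m -/+ sum of |a|."""
    spread = sum(abs(a) for a in additive_effects)
    return midpoint - spread, midpoint + spread

# Maturity: six genes with a = 3 days around m = 140 days.
maturity_range = genetic_range(140, [3] * 6)
# Yield: 21 per se genes plus 2 pleiotropic maturity genes, each with a = 100 kg/ha.
yield_range = genetic_range(5000, [100] * 21 + [100] * 2)

print(maturity_range)  # (122, 158)
print(yield_range)     # (2700, 7300)
```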
Exercise 19. Build a QuLine input file (QMP) based on the following breeding procedure.
A simplified modified pedigree breeding method
Generation   Grown in Obregon
AxB          100 single crosses made from 50 parental lines
F1           10 plants for each F1; no selection; each F1 population is harvested in bulk
F2           500 plants for each F2 population; select 2% with medium maturity; selected F2
             plants are harvested individually
F3           30 plants in each F2:3 family; 20% of families are selected by yield; each selected
             family is harvested in bulk
F4           30 plants in each F2:4 family; no selection; each selected family is harvested in bulk
F5           30 plants in each F2:5 family; randomly select 5 from each family; selected plants
             are harvested individually
F6           50 plants in each F6 family; two replications; select 5% of F6 families based on
             yield; each selected family is harvested in bulk
             Purification and propagation
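When writing the QMP file for Exercise 19, it may help to track how many families each generation carries forward. The sketch below is plain arithmetic under one reading of the table (selected proportions applied to the totals shown, every selected F2 plant giving one F2:3 family, every selected F5 plant giving one F6 family); it is not QuLine syntax.

```python
crosses = 100                                # single crosses, one F1 population each
f2_plants = crosses * 500                    # 500 F2 plants per cross
f3_families = f2_plants * 2 // 100           # 2% of F2 plants selected for maturity
f3_selected = f3_families * 20 // 100        # 20% of F2:3 families selected by yield
f4_families = f3_selected                    # no selection in F4
f5_families = f4_families                    # F2:5 families grown from F4 bulks
f6_families = f5_families * 5                # 5 plants taken from each F2:5 family
final_lines = f6_families * 5 // 100         # 5% of F6 families selected by yield

print(f2_plants, f3_families, f3_selected, f6_families, final_lines)
# 50000 1000 200 1000 50
```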
Exercise 20. Run QuLine using the GES and POP files from Ex. 18 and the QMP file from Ex. 19.
• Find out the genetic gains on yield and maturity after one breeding cycle.
• Find out when the genetic gain on yield reaches a selection plateau, assuming one generation
can be grown in one year.