Using Genetics, Genomics, and Breeding to Understand Diverse Maize Germplasm
Sherry A. Flint-Garcia
United States Department of Agriculture–Agricultural Research Service (USDA–ARS)
Corn breeding and the genetic analysis of agronomic traits rely on phenotypic variation and on genetic variation in the genes controlling those traits. My research program focuses on understanding genetic diversity in maize so that we can mine beneficial alleles from the appropriate germplasm sources for continued corn improvement.
The process of domestication, which began about 9,000 years ago, has had profound consequences for maize: relative to teosinte, modern corn has moderately reduced genetic diversity across nearly all genes in the genome, and severely reduced diversity at key genes targeted by domestication. The question that remains is whether these reductions in genetic diversity have affected our ability to make progress in corn breeding today.
As an outcrossing species, maize has tremendous genetic variation compared to most other crops. The combination of genome-wide association study (GWAS) approaches, large HapMap datasets, and germplasm resources is leading to important discoveries about the relationship between genetic diversity and phenotypic variation in inbred lines.
However, among the traits targeted during domestication and breeding are many yield component traits, including number of ears, kernel row number, seed size, and kernel composition. Therefore, we must reintroduce variation from landraces and/or teosinte if we hope to learn how domestication has impacted these yield component traits, and yield itself.
Using Genetics, Genomics, and Breeding to Understand Diverse Maize Germplasm
Sherry Flint-Garcia
USDA-ARS Columbia, MO
Outline
Introduction to Maize Domestication & Diversity
Inbred Lines
Teosinte, the wild ancestor
Breeding with Zea
Maize Domestication
Teosinte (ssp. parviglumis) → Landraces → Inbred Lines
Artificial Selection
Domesticated from Zea mays ssp. parviglumis
Single domestication event ~9,000 years ago in Mexico
Intermediate form of landraces; populations adapted across the Americas to specific microclimates and/or human uses
Consequences of Artificial Selection
Teosintes → Domestication → Maize Landraces → Plant Breeding → Maize Inbred Lines
Unselected (Neutral) Genes: 98% (~49,000) maize genes
Domestication Genes and Improvement Genes: 2% (~1,000) maize genes
Resources: Nested Association Mapping (NAM), Maize ATLAS Project, Germplasm Enhancement of Maize (GEM), Teosinte NILs, Zea Synthetic, Teosinte Synthetic
Linkage-Based QTL Mapping
“Genome Scan”
Identify genomic regions that contribute to variation and estimate QTL effects
Parent 1 × Parent 2 → F1 → F2 population
Genotype + Phenotype → Statistics for Mapping
[Figure: genome scan plotted against Position (cM).]
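The genome-scan idea on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the analysis behind the talk: the F2 population is simulated, markers are evenly spaced with a made-up recombination fraction, and each marker is tested with a simple single-marker regression. All function names and parameter values are hypothetical.

```python
import math
import random


def simulate_gamete(n_markers, rec_frac, rng):
    """Return one F1 gamete: 0 = Parent 1 allele, 1 = Parent 2 allele."""
    allele = rng.randint(0, 1)
    gamete = []
    for _ in range(n_markers):
        gamete.append(allele)
        # A crossover between adjacent markers switches the parental strand.
        if rng.random() < rec_frac:
            allele = 1 - allele
    return gamete


def simulate_f2(n_plants, n_markers, qtl_index, additive_effect, rng, rec_frac=0.1):
    """Simulate F2 genotypes (count of Parent 2 alleles: 0, 1, 2) and phenotypes."""
    genotypes, phenotypes = [], []
    for _ in range(n_plants):
        g1 = simulate_gamete(n_markers, rec_frac, rng)
        g2 = simulate_gamete(n_markers, rec_frac, rng)
        genotype = [a + b for a, b in zip(g1, g2)]
        genotypes.append(genotype)
        # Assumed model: one additive QTL plus standard normal noise.
        phenotypes.append(additive_effect * genotype[qtl_index] + rng.gauss(0.0, 1.0))
    return genotypes, phenotypes


def lod_score(marker, phenotypes):
    """LOD for regressing phenotype on allele count at a single marker."""
    n = len(phenotypes)
    mean_x = sum(marker) / n
    mean_y = sum(phenotypes) / n
    sxx = sum((x - mean_x) ** 2 for x in marker)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(marker, phenotypes))
    rss_null = sum((y - mean_y) ** 2 for y in phenotypes)
    if sxx == 0 or rss_null == 0:
        return 0.0
    rss_marker = rss_null - sxy ** 2 / sxx
    if rss_marker <= 0:
        return float("inf")
    return (n / 2) * math.log10(rss_null / rss_marker)


def genome_scan(genotypes, phenotypes):
    """Return one LOD score per marker position."""
    n_markers = len(genotypes[0])
    return [lod_score([g[m] for g in genotypes], phenotypes) for m in range(n_markers)]


if __name__ == "__main__":
    rng = random.Random(1)
    genotypes, phenotypes = simulate_f2(300, 20, qtl_index=12, additive_effect=1.0, rng=rng)
    for position, lod in enumerate(genome_scan(genotypes, phenotypes)):
        print(position, round(lod, 1))
```

Because neighboring markers are linked to the QTL, the LOD profile rises over a broad region rather than at a single point, which is the low-resolution behavior that linkage mapping is known for.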
Association Analysis
Utilize natural populations
Exploit extensive historical recombination
Candidate gene approach:
[Figure: Gene X haplotypes for Line 1 through Line N, derived from ancestral chromosomes, shown next to a subject population with phenotype values from 1.3 m to 2.5 m.]
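A candidate-gene association test can be sketched as a comparison of phenotype means between allele classes at one site, with a permutation test for significance. This is a minimal sketch under stated assumptions: the allele calls below are made up, the phenotype values simply reuse the heights printed on the slide, and the pairing of alleles with heights is hypothetical. It ignores population structure and kinship, which a real analysis of diverse lines would need to model (for example with the MLM (Q+K) approach named later in this talk).

```python
import random
import statistics


def allele_effect(alleles, phenotypes, allele):
    """Mean phenotype of lines carrying `allele` minus the mean of all other lines."""
    carriers = [p for a, p in zip(alleles, phenotypes) if a == allele]
    others = [p for a, p in zip(alleles, phenotypes) if a != allele]
    return statistics.mean(carriers) - statistics.mean(others)


def permutation_p_value(alleles, phenotypes, allele, n_perm=2000, seed=0):
    """Two-sided permutation p-value for the allele effect at one site."""
    rng = random.Random(seed)
    observed = abs(allele_effect(alleles, phenotypes, allele))
    shuffled = list(phenotypes)
    hits = 0
    for _ in range(n_perm):
        # Shuffling phenotypes breaks any real allele-phenotype link.
        rng.shuffle(shuffled)
        if abs(allele_effect(alleles, shuffled, allele)) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


if __name__ == "__main__":
    # Hypothetical data: one SNP in "Gene X" scored on six lines.
    alleles = ["A", "A", "A", "G", "G", "G"]
    heights_m = [1.3, 1.4, 1.5, 1.8, 2.0, 2.5]
    print("effect of G (m):", allele_effect(alleles, heights_m, "G"))
    print("permutation p:", permutation_p_value(alleles, heights_m, "G"))
```

With only six lines the p-value cannot be small no matter how large the effect is, which illustrates why association panels need many lines to have useful power.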
Linkage vs. Association Analysis
Linkage (QTL) Mapping
Structured population: BC, F2, RIL, etc.
Analysis of 2 alleles
High power
Low resolution: 10-20 cM (10-20 Mb in maize)
Genome scan: Don’t know anything about the genetics underlying the trait
Association Mapping
Unstructured population: unrelated or distantly related lines
Analysis of many alleles
Low power
High resolution: 1000-5000 bp in maize (depends on the species/population)
Candidate gene testing: pathway or candidates previously identified. Used as a validation method.
Genome scan: Don’t know anything about the genetics underlying the trait. Used as a discovery method.
Nested Association Mapping
NAM Founders
[Figure: tree of maize inbred lines based on 89 SSR loci (scale bar 0.1), with groups labeled Non-Stiff Stalk, Stiff Stalk, Sweet, Popcorn, Tropical-Subtropical, and Mixed; four ssp. parviglumis accessions (parvi-03, parvi-30, parvi-36, parvi-49) are included.]
Flint-Garcia, et al. (2005) Plant J.
NAM Development
Association Mapping
Yu, et al. (2008) Genetics; McMullen, et al. (2009) Science
Corn Tracker
NAM Kernel Composition
Fiber
Starch: Amylopectin, Amylose
Protein: Zeins, Amino Acid Profiles
Oil: Fatty Acid Profiles
Jason Cook
NIR conducted by Syngenta Seeds, Inc.
Oil Composition QTL in NAM
[Figure: Joint Linkage Mapping - Oil, across chromosomes 1-10, plotted against Genetic Distance (cM).]
[Figure: Joint Linkage Mapping - Oil, plotted against Physical Distance (bp).]
Cook, et al. (2012) Plant Phys.
Oil: 4.4%, 5.3%, 3.6%, 3.9%
Chr. 6 Oil Candidate: DGAT1-2
Encodes acyl-CoA:diacylglycerol acyltransferase
Fine mapped by Pioneer-Dupont
Zheng, et al. (2008) Nature Genetics
High parent = 19% oil
High allele = 0.29% additive effect
High allele has a phenylalanine insertion in the C-terminus
Cook, et al. (2012) Plant Phys.
NAM Genome Wide Association (GWAS)
1.6 Million HapMap.v1 SNPs projected onto NAM
Bootstrap (80%) sampling to test robustness of models
[Figure: GWAS - Oil, plotted against Physical Distance (bp).]
Cook, et al. (2012) Plant Phys.
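The resampling step on the NAM GWAS slide can be illustrated with a small sketch. The slides only say "Bootstrap (80%) sampling", so the details here are assumptions: 80% of lines are drawn without replacement in each round, the "model" is reduced to picking the single SNP with the highest squared correlation to the trait, and support is reported as the percentage of rounds in which a SNP was picked. The function names and the simulated data are hypothetical.

```python
import random


def squared_correlation(x, y):
    """Squared Pearson correlation; returns 0.0 if either vector is constant."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return sxy ** 2 / (sxx * syy)


def resample_support(genotypes, phenotypes, n_resamples=100, fraction=0.8, seed=0):
    """Percentage of resamples in which each SNP is the best single predictor."""
    rng = random.Random(seed)
    n_lines = len(phenotypes)
    n_snps = len(genotypes[0])
    subset_size = int(round(fraction * n_lines))
    counts = [0] * n_snps
    for _ in range(n_resamples):
        # Assumption: subsample lines without replacement.
        subset = rng.sample(range(n_lines), subset_size)
        y = [phenotypes[i] for i in subset]
        best_snp = max(
            range(n_snps),
            key=lambda s: squared_correlation([genotypes[i][s] for i in subset], y),
        )
        counts[best_snp] += 1
    return [100.0 * c / n_resamples for c in counts]


if __name__ == "__main__":
    rng = random.Random(2)
    # Simulated data: 200 lines, 10 unlinked SNPs, SNP 3 affects the trait.
    genotypes = [[rng.randint(0, 1) for _ in range(10)] for _ in range(200)]
    phenotypes = [1.0 * g[3] + rng.gauss(0.0, 1.0) for g in genotypes]
    for snp, support in enumerate(resample_support(genotypes, phenotypes)):
        print(snp, support)
```

A SNP that is selected in most resamples is robust to which lines happen to be included; one that is selected only occasionally is more likely to reflect sampling noise.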
DGAT 1-2 (Chr 6: 105,013,351-105,020,258)
NAM Population: 24 HapMap.v1 SNPs in DGAT
281 Association Panel: 2 55K SNPs in DGAT (plus the 3 bp indel)
[Figure: positions of markers M1-M5 in DGAT1-2; M2 is the Phe insertion.]
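Marker | Trait | Population | Analysis Method | BPP | P-Value | Effect
M1 | Oil | 282 Assn. | MLM (Q+K) | - | 1.2E-04 | 0.18
M2 | Oil | 282 Assn. | MLM (Q+K) | - | 9.9E-04 | 0.16
M3 | Oil | NAM | GWAS - Bootstrap | 31 | - | 0.18
M4 | Oil | 282 Assn. | MLM (Q+K) | - | 4.3E-05 | 0.19
M4 | Starch | NAM | GWAS - Bootstrap | * | - | -0.38
M5 | Oil | NAM | GWAS - Bootstrap | * | - | 0.13
M5 | Starch | NAM | GWAS - Bootstrap | * | - | -0.31
* BPP values of 51, 67, and 11 are given for these three rows; which value belongs to which row is not recoverable here.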
Cook, et al. (2012) Plant Phys.
DGAT 1-2 (Chr6: 105,013,351-105,020,258)
[Figure: B73 vs. Non-B73 genotypes at markers M1-M5; M2 is the indel.]
Cook, et al. (2012) Plant Phys.
Summary – NAM Kernel Composition
Genetic Architecture of Kernel Composition Traits
Governed by many QTL (N = 21-26) with small to moderate effects
GWAS results confirm many QTL
DGAT is our favorite gene, but we still don’t have the complete story!
NAM In General
We can identify the common alleles in maize, but still have problems with rare alleles.
Cook, et al. (2012) Plant Phys.
Ames Plant Introduction Station Inbreds
2,815 inbred lines from the Ames PI Station
Genotyping-by-sequencing (GBS) - 681,257 SNPs
Romay, et al. (2013) Genome Biology
Ames Plant Introduction Station Inbreds
More than half of the SNPs in collection are rare!
Expired PVPs
77%
48%
42%
302 Association panel = 75%, NAM founders = 57%. Romay, et al. (2013) Genome Biology
Development of Teo Introgression Libraries
10 teosinte (parviglumis) accessions
B73 × teosinte → F1 → BC1 → BC2 → BC3 → BC4
Library Development
10 libraries of 887 BC4S2/DH Near Isogenic Lines (NILs)
Illumina GoldenGate 768 SNP assay
Lines shown: Z031E0035, Z035E0012
BC4S4 NILs: GBS & RAD sequencing = 33,000–600,000 SNPs
Lines to be released in 2013
Library Coverage
10 maize-teosinte libraries
804 BC4S2 NILs and 83 BC4DH NILs
Each line:
2.3 chromosomal segments
4.1% of the teosinte genome
3.3X genome coverage
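The 4.1% figure can be put in context with the textbook expectation for backcrossing: each backcross to the recurrent parent halves the expected donor share, so after n backcrosses the expected donor fraction is (1/2)^(n+1). The sketch below only evaluates that formula. It assumes no selection on the introgressed segments, and it reads "4.1% of the teosinte genome" as the donor share carried by a typical line. The function name is hypothetical.

```python
def expected_donor_fraction(n_backcrosses):
    """Expected donor genome fraction after n backcrosses to the recurrent parent.

    n = 0 is the F1 (half donor); each backcross halves the expectation.
    Assumes no selection for or against donor segments.
    """
    return 0.5 ** (n_backcrosses + 1)


if __name__ == "__main__":
    for n, label in enumerate(["F1", "BC1", "BC2", "BC3", "BC4"]):
        print(f"{label}: {100 * expected_donor_fraction(n):.3f}% expected donor genome")
    # Value reported on the slide, for comparison with the BC4 expectation.
    reported_percent = 4.1
    print(f"reported per-line teosinte content: {reported_percent}%")
```

Selfing or doubling haploids after BC4 fixes the retained segments as homozygous but does not change this expectation in the absence of selection, so the BC4 value is the relevant comparison for both the BC4S2 and BC4DH lines.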
[Figure: library coverage along chromosomes c1-c10, with a key running from 0 to 9.]
BC4S2 vs BC4DH from PI384071 donor
[Figure: coverage along chromosomes c1-c10, shown separately for S2 and DH lines, with a key running from 0 to 9.]
Application 1. Empirical Genetics
1000 Selected Genes
What do these selected genes do?
What traits were targeted by artificial selection during domestication/breeding?
Are selected genes important?
Tillering/branching?
Auxin response factor, ARF1
[Figure: Inbreds vs. Teosinte along ARF1, positions 1-3000 bp, y-axis from 0 to 0.02.]
Application 2. QTL Mapping
Phenotyped at up to 18 reps:
Maturity, plant & ear height, kernel row number, kernel weight, kernel shape, leaf length-width (Zhengbin Liu)
Grain composition: protein, starch, oil (Avi Karn)
[Figure: genome scan for Days to Anthesis.]
Kernel Row Number
Pop Add. Eff. (rows)
Z029 -3.4
Z030 -1.8
Z031 -0.6
Z032 -1.4
Z033 -1.0
Z034 -1.1
Z035 -1.1
Z036 -0.3
Z037 0.2
Z038 -0.9
Pop Add. Eff. (rows)
Z029 ---
Z030 -1.4
Z031 -1.4
Z032 -1.1
Z033 -2.1
Z034 -2.3
Z035 -1.7
Z036 -0.6
Z037 -0.7
Z038 -1.4
Seed Weight (50 kernels)
Pop Add. Eff. (g)
Z029 -1.2
Z030 -0.9
Z031 -0.6
Z032 -1.2
Z033 -0.4
Z034 -1.3
Z035 -2.3
Z036 -1.2
Pop Add. Eff. (g)
Z029 -0.9
Z030 0.2
Z031 -0.6
Z032 0.8
Z033 -1.5
Z034 3.3
Z035 2.4
Z036 -0.2
Pop Add. Eff. (g)
Z029 3.7
Z030 0.0
Z031 -1.2
Z032 -1.2
Z033 -0.6
Z034 -1.1
Z035 0.6
Z036 -0.3
Application 3. Reintroduce Variation
[Figure: seed size and kernel composition (Carbohydrate, Protein, Fat) for Teosinte (N = 11), Landraces (N = 17), and Inbred Lines (N = 27).]
[Figure: Zein Profile showing the α family, β, γ, and δ zeins; kernel tissues labeled Endosperm and Embryo.]
Biological hypothesis: A loss of genetic variation results in a loss of phenotypic variation.
Teosinte NIL Collaborations
Hibbard – Corn Rootworm Resistance
Tracy – Germination
AgReliant – Agronomic Traits
Harmon – Circadian Clock
Hoekenga – Iron Bioavailability
Buckler – Flowering
Nelson – NCLB
Balint-Kurti – SCLB & GLS
Smith – Corn Smut
Brutnell – Shade Avoidance
Baxter – Ionomics
Dallo – Mal de Rio Cuarto Virus
Turlings – Terpenes & Armyworms
Synthetic Populations
NAM Synthetic
×
Zea Synthetic
Teosinte Synthetic
B73 & NAM parents are inbreds; teosinte has NEVER been inbred
Zea Synthetic Inbreeding Depression
38% B73, 2% each NAM parent, 12% teosinte
S0 Male S0 Female
Selfed (S1)
F ≈ 0.5
Full Sib (FS)
F ≈ 0
924 pairs
Ginnie Morrison
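The F values on this slide follow from the standard inbreeding recursion for selfing, F_t = 1 - (1 - F_0)(1/2)^t, and a paired selfed versus full-sib comparison gives a simple estimate of inbreeding depression per unit of F. The sketch below is illustrative only: it assumes the S0 plants are non-inbred and unrelated (which is why the full-sib F is only approximately 0), and the trait means in the example are made-up numbers, not results from the 924 pairs.

```python
def inbreeding_after_selfing(generations, f_start=0.0):
    """Inbreeding coefficient after repeated selfing: 1 - (1 - F0) * (1/2)**t."""
    return 1.0 - (1.0 - f_start) * 0.5 ** generations


def depression_per_unit_f(full_sib_mean, selfed_mean, f_full_sib=0.0, f_selfed=0.5):
    """Change in trait mean per unit increase in F, from one FS/S1 comparison."""
    return (full_sib_mean - selfed_mean) / (f_selfed - f_full_sib)


if __name__ == "__main__":
    for t in range(5):
        print(f"S{t}: F = {inbreeding_after_selfing(t):.4f}")
    # Hypothetical trait means (e.g. plant height in cm) for one pair of families.
    print(depression_per_unit_f(full_sib_mean=200.0, selfed_mean=170.0))
```

Pairing each selfed family with a full-sib family from the same S0 plants keeps the genetic background comparable, so the difference between the two means mostly reflects the 0.5 difference in F.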
Zea Synthetic Doubled Haploids
38% B73, 2% each NAM parent, 12% teosinte
Anna Selby
Genotype by GBS
Identity by Descent (IBD)
Doubled Haploids (DH)
Per se Hybrids
Association Analysis & Genomic Selection
Goal: 2000 DH
First 800+ in 2013
Another 700 in Puerto Rico
AgReliant producing more
2014 trial: MO, NC, NY, IA
“Crazy?” Idea
7000 BC
2014
Agronomics
Fertilizer
Density
Mechanization
Select only on yield
? ideotype
Teosinte Synthetic
75% B73 (SS), 25% teosinte
Acknowledgements
NSF Maize Diversity Project
Syngenta Seeds
AgReliant Genetics
Susan Melia-Hancock, Kate Guill, Jason Cook, Zhengbin Liu, Ginnie Morrison, Christopher Bottoms, Avi Karn, Anna Selby
www.panzea.org