In search of new genomic regions involved in maize domestication.

Alvarez-Mejia Cesar, Martinez de la Vega Octavio, Herrera-Estrella Luis, Herrera-Estrella Alfredo, and Vielle-Calzada Jean-Philippe.

Laboratorio Nacional de Genómica para la Biodiversidad; Langebio Cinvestav Irapuato. Km. 9.6, Libramiento Norte Carretera Irapuato-León, CP 36821, Irapuato, Guanajuato, México.

Introduction

Maize was domesticated from teosinte (Zea mays ssp. parviglumis) in Mexico ~9000 years ago, presumably in a region of the Balsas river drainage, at the intersection of the states of Michoacán, Guerrero and México [1,3]. Although some genes such as TB1 and TGA1 were shown to have been affected by artificial selection related to domestication [2,4], their function is not sufficient to explain the drastic morphological differences that distinguish teosinte and maize, suggesting that an important portion of the genomic regions that contributed to maize domestication remains to be explored. We have initiated a comparative genomic procedure to find new genomic regions containing candidate genes that were influenced by maize domestication.

Distribution of teosinte in Mexico. Adapted from Matsuoka et al. 2005.

Methodology and Results

Comparative Genomic Analysis (ZmB73 vs. Palomero)

- ≥ 95% identity.
- ≥ 200 bp 100% identical.
- Non-redundant at the genome level.
- Selection of Nearly Identical Sequence Regions (NIdSRs).

Gene Target

- Length ≥ 1 Kb.
- Gene annotation.
- High concentration of NIdSRs per 150 Kb.
- Frequency of recombination ≥ 1 cM/Mb.

Polymorphic analysis determination

- Selection of 10 candidate genes.
- Selection and annotation of 100 Kb around the selected NIdSR.
- Polymorphic analysis of three regions contained within the 100 Kb.
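The thresholds listed above can be read as a two-stage filter. The following is a minimal sketch under stated assumptions, not the pipeline used for the poster: the `Alignment` record and its field names are hypothetical, "non-redundant" is taken to mean a single mapping location in ZmB73, and the NIdSR density criterion (NIdSRs per 150 Kb) is left out because it depends on the whole-genome distribution.

```python
from dataclasses import dataclass


@dataclass
class Alignment:
    """One ZmB73-vs-Palomero pairwise alignment (hypothetical record layout)."""
    identity_pct: float      # overall percent identity of the alignment
    longest_exact_bp: int    # longest stretch that is 100% identical
    genome_hits: int         # number of locations the sequence maps to in ZmB73
    length_bp: int           # pairwise alignment length
    cm_per_mb: float         # local recombination frequency
    annotated_gene: bool     # overlaps a ZmB73 gene annotation


def is_nidsr(a: Alignment) -> bool:
    """Stage 1: >= 95% identity, >= 200 bp identical, non-redundant."""
    return (
        a.identity_pct >= 95.0
        and a.longest_exact_bp >= 200
        and a.genome_hits == 1  # assumption: "non-redundant" means one location
    )


def is_gene_target(a: Alignment) -> bool:
    """Stage 2: NIdSR of >= 1 Kb, annotated, recombination >= 1 cM/Mb."""
    return (
        is_nidsr(a)
        and a.length_bp >= 1000
        and a.annotated_gene
        and a.cm_per_mb >= 1.0
    )
```

A candidate that passes stage 1 but lies in a region recombining at less than 1 cM/Mb is kept as an NIdSR but not as a gene target.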

A. Genomic comparison of Zea mays ssp. mays B73 inbred line [6,7] (ZmB73) and Zea mays ssp. mays Palomero landrace [8] (ZmPal) and identification of Nearly Identical Sequence Regions (NIdSRs)

Distribution of NIdSRs in continuous 150 Kb genomic segments

Class      NIdSRs per 150 Kb segment   Number of 150 Kb segments
1          0                           4637
2          1                           3561
3          2                           2275
4          3                           1371
5          4                           852
6          5                           455
7 (tb1)    6                           272
8          7                           132
9 (tga1)   8                           90
10         9                           48
11         10                          22
12         11                          12

Classes 13 to 26 hold segments with 12, 13, 14, 15, 18, 19, 22, 23, 24, 26, 47, 49, 64 and 96 NIdSRs, respectively; each of these classes contains between 1 and 8 segments.

Red line: distribution of NIdSRs in segments of 150 Kb each.

Blue line: frequency of recombination (in cM/Mb [5]).

Grey lines: syntenic analysis of the 150 Kb regions containing the highest number of NIdSRs.

The NIdSR distribution in the genome of ZmB73 was grouped into arbitrary classes according to the NIdSR content of each 150 Kb segment. Whereas the classes containing the highest number of NIdSRs (classes 17 to 26) are related to chloroplast and mitochondrial genomic insertions, the classes with the lowest number of NIdSRs (classes 1 to 5) are related to neutral genes.
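The grouping described above amounts to counting NIdSRs in consecutive 150 Kb windows and then tallying how many windows share each count. A minimal sketch, assuming 0-based NIdSR start coordinates on a single chromosome; the function names are illustrative and the class numbering itself is left out, since the classes are described as arbitrary.

```python
from collections import Counter

SEGMENT_BP = 150_000  # segment size used for the distribution


def nidsrs_per_segment(positions, chrom_length_bp):
    """Count NIdSRs in each consecutive 150 Kb segment of one chromosome."""
    n_segments = -(-chrom_length_bp // SEGMENT_BP)  # ceiling division
    counts = [0] * n_segments
    for pos in positions:
        counts[pos // SEGMENT_BP] += 1
    return counts


def segments_per_count(counts):
    """Tally how many segments contain 0, 1, 2, ... NIdSRs."""
    return Counter(counts)


# Toy chromosome of 600 Kb with four NIdSRs.
counts = nidsrs_per_segment([10, 149_999, 150_000, 460_000], 600_000)
tally = segments_per_count(counts)
```

In this toy case the four segments hold 2, 1, 0 and 1 NIdSRs, so the tally reports one segment with none, two segments with one, and one segment with two.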

Classification and distribution of NIdSRs.

Neutral gene   Chr    Class   NIdSRs/150 Kb   cM/Mb
bz2            chr3   2       1               1.12
adh1           chr1   1       0               1.2
umc128         chr1   1       0               1.46
an1            chr1   1       0               1.49
gbl1           chr9   4       3               3.99
fus6           chr1   1       0               4.36

Genes with low variability: tb1 (chr1), tga1 (chr4), Cd transporter (chr5), Cu transporter (chr5), Multicopper oxidase (chr5)

Class   NIdSRs/150 Kb   cM/Mb
4       3               1.03
8       7               0.3
7       6               1.12
9       8               1.08
6       5               1.22

The distribution of NIdSRs in the ZmB73 genome shows some particularities: (1) zones with the highest content of NIdSRs correspond to regions with chloroplast and mitochondrial identity; (2) zones with the lowest content of NIdSRs correspond to regions with high nucleotide variability, often related to redundant or repeated sequences, or to neutral genes. For further investigation, we selected the classes in which the genomic segments containing tb1 and tga1 are classified. The selection of these classes also takes into consideration a recombination frequency of at least 1 cM/Mb, a pairwise length representation of at least 1 Kb, and ZmB73 gene annotation. Close to 200 genes were analyzed and 7 were selected in a pilot screen to study nucleotide variability and test for neutrality.

B. Initial studies of nucleotide variability in gene target regions.

From a list of 200 genes, we selected 7 genes to analyse nucleotide variability in 16 native landraces and 16 local Balsas teosinte populations. We also plan to pursue our analysis of previously identified regions containing heavy metal response genes affected by domestication [8]. Identification and comparison of these genes with their genomic sequence in the Mo17 inbred line confirmed a drastic reduction in nucleotide variability.

NIdSR               Chr    Length (bp)   Identity (%)   Mo17 comparison    Annotation
E09Contig186729.1   chr8   1538          100            100% (509 bp)      Hypothetical protein SORBIDRAFT_03g013535
E09Contig86112.1    chr7   1206          100            99.60% (502 bp)    NIN-like protein 1 [Zea mays]
E09Contig157162.1   chr1   1311          100            99.21% (505 bp)    Leucine-rich repeat transmembrane protein kinase 1
E09Contig156774.1   chr8   1046          100            99.59% (492 bp)    ER degradation-enhancing alpha-mannosidase-like 1
E09Contig17638.1    chr2   1306          99.92          99.61% (508 bp)    Antiporter/drug/transporter/transporter [Zea mays]
E09Contig210809.1   chr7   1090          99.91          99.59% (482 bp)    ARF gap like zinc finger protein ZIGA3
E09Contig189736.1   chr9   2534          100            100% (525 bp)      Leucine-rich repeat transmembrane protein kinase 2 [Zea mays]
E09Contig10846.1    chr5   2405          100            99.61% (518 bp)    ATPase cadmium transporter [Zea mays]
E09Contig188819.1   chr5   2276          100            99.81% (513 bp)    Copper transporter [Zea mays]
E09Contig74849.1    chr5   1589          100            100% (516 bp)      Multicopper oxidase protein [Zea mays]
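The Mo17 comparison figures are percent identities over a short aligned stretch. As a worked illustration, a value of 99.60% over 502 bp would correspond to 2 mismatching positions (500/502). The sketch below assumes two pre-aligned sequences of equal length and counts gap characters as mismatches; how the poster's alignments treated gaps is not stated, and the function name is illustrative.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent of aligned columns that carry the same base in both sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    # Assumption: a gap ('-') never counts as a match.
    matches = sum(1 for x, y in zip(seq_a, seq_b) if x == y and x != "-")
    return 100.0 * matches / len(seq_a)


# Toy 502 bp alignment that differs at the last two positions.
reference = "A" * 502
sample = "A" * 500 + "CC"
identity = percent_identity(reference, sample)  # 500 / 502 of columns match
```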

C. Selection of widely diverse regions for analysis of nucleotide variability.

Taking the position of the target gene as a reference, we will analyse nucleotide variability in a region encompassing 100 kb around the coding sequence. This procedure could offer hints of possible selective sweep events.
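The poster does not name the statistic used to quantify nucleotide variability. One common choice is nucleotide diversity (π), the average number of pairwise differences per site among aligned sequences; a local drop in π around a gene is one of the signals usually associated with a selective sweep. The sketch below is an assumption on that basis, with an illustrative function name, and expects sequences that are already aligned to the same length.

```python
from itertools import combinations


def nucleotide_diversity(seqs):
    """Average pairwise differences per site among aligned sequences (pi)."""
    if len(seqs) < 2:
        raise ValueError("need at least two sequences")
    length = len(seqs[0])
    if length == 0 or any(len(s) != length for s in seqs):
        raise ValueError("sequences must be non-empty and of equal length")
    total_diffs = 0
    n_pairs = 0
    for s, t in combinations(seqs, 2):
        total_diffs += sum(1 for a, b in zip(s, t) if a != b)
        n_pairs += 1
    return total_diffs / n_pairs / length


# Three toy haplotypes of 4 sites: pairwise differences are 1, 2 and 1.
pi = nucleotide_diversity(["AAAA", "AAAT", "AATT"])
```

Computed separately for each region sampled within the 100 kb, values of this kind could be compared between landraces and teosinte populations.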

Example of genomic structure and annotation of 100 kb around a target gene (A). Blue: ZmB73 genome linear trend. Red: location of Palomero sequences with at least 95% shared identity. Regions for nucleotide polymorphic analysis are marked in green.

Summary.

a) Genomic sequence comparisons between a maize landrace and an inbred line are useful to find new genomic regions involved in domestication.

b) Regions containing a high number of NIdSRs can often represent organellar genome insertions.

c) A pilot analysis of nucleotide variability is underway for new widely distributed regions showing low polymorphism in B73, Palomero, and Mo17.

Acknowledgements.

We thank Patrick Schnable for recombination frequency data, and MaizeGDB and maizesequence for the B73 genome data. This work was supported by grant ZEA-2006 from SAGARPA, CONACyT, and the Howard Hughes Medical Institute International Scholars Program.

References

1. Moeller, D. A., Tenaillon, M. I., Tiffin, P. Genetics 176, 1799-1809 (2007).

2. Clark, R. M., Wagler, T. N., Quijada, P., Doebley, J. Nature Genetics 38, 594-597 (2006).

3. Matsuoka, Y. Breeding Science 55, 383-390 (2005).

4. Doebley, J., Gaut, B., Smith, B. Cell 127, 1309-1321 (2006).

5. Liu, S., Yeh, C.-T., Ji, T., Ying, K., Wu, H., Tang, H. M., Fu, Y., Nettleton, D., Schnable, P. S. PLoS Genetics 5, e1000733 (2009).

6. http://www.maizegdb.org

7. http://www.maizesequence.org

8. Vielle-Calzada, J.-P., Martinez de la Vega, O., Hernandez-Guzman, G., et al. Science 326, 1078 (2009).
