Proteomic studies of the environmentally important Methylocella silvestris


Proteomic studies of the environmentally important methanotroph Methylocella silvestris

Konstantinos Thalassinos, Vibhuti Patel, Susan Slade, Nisha Patel, Joanne Connolly, Andrew Crombie, Colin Murrell, James Scrivens

The Central Dogma of Molecular Biology

Proteome Complexity

Human genome: 20,000 – 25,000 genes [1]

A single gene can give rise to many different proteins by the process of alternative splicing

In complex genes, alternative splicing can generate dozens or even hundreds of different mRNA isoforms [2]

It is estimated that almost 50% of all proteins contain one or more post-translational modifications [3]

[1] International Human Genome Sequencing Consortium (2004). Finishing the euchromatic sequence of the human genome. Nature 431 (7011), 931–945.

[2] Missler, M. and Sudhof, T. C. (1998). Neurexins: three genes and 1001 products. Trends Genet. 14, 20–26.

[3] Apweiler, R., Hermjakob, H. and Sharon, N. (1999). On the frequency of protein glycosylation, as deduced from analysis of the SWISS-PROT database. Biochim. Biophys. Acta 1473, 4–8.

What is Proteomics?

Proteomics is the global characterisation of protein products (sequence, post-translational modifications, protein-protein interactions) expressed by a given genome at a specific point in time

Unlike the genome, the proteome is a dynamic entity

Challenges and limitations

Biological

Dynamic range

4 orders of magnitude in cells and 10 orders of magnitude in plasma

Post-translational modifications

Alternative splicing

Technical

Sample requirements

Complex, time-consuming sample preparation

Days of experimental time

Pedrioli, P. G., et al., (2004). A common open representation of mass spectrometry data and its application to proteomics research. Nat. Biotechnol. 22, 1459–1466.

Proteomics Approaches

Profiling

Identify which proteins are present in a sample

Differential

Probe for changes in protein expression levels between a number of environmental states

Identify the presence and map the position of all post-translational modifications present on each protein

Essentials of a Mass Spectrometer

A mass spectrometer separates ions according to their mass-to-charge ( m/z ) ratio
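As a small worked illustration of the m/z ratio, the sketch below computes the m/z of a positive ion formed by adding protons to a neutral molecule, [M + zH]z+. The function name and the example mass of 1000.0 Da are illustrative assumptions, not values from this study.

```python
PROTON_MASS = 1.007276  # mass of a proton in Da


def mz(neutral_mass: float, charge: int) -> float:
    """Return the m/z of an ion [M + zH]z+ formed by adding `charge` protons."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return (neutral_mass + charge * PROTON_MASS) / charge


# Hypothetical peptide of neutral mass 1000.0 Da at charge states 1+ to 3+
for z in (1, 2, 3):
    print(z, round(mz(1000.0, z), 4))
```

The same molecule appears at a lower m/z as its charge state increases, which is why ESI spectra of peptides typically show several peaks per species.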

Direct Sample Introduction

Liquid Chromatography

ESI

MALDI

Quadrupole

TOF

Ion Trap

FT

MS and Tandem MS data

MS/MS fragment ion nomenclature
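To make the nomenclature concrete, the following is a minimal sketch that computes singly charged b-ions (N-terminal fragments) and y-ions (C-terminal fragments) for an unmodified peptide from standard monoisotopic residue masses. The function name is an assumption for illustration; modifications and higher charge states are not handled.

```python
# Monoisotopic amino acid residue masses in Da
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565   # Da
PROTON = 1.007276   # Da


def fragment_ions(peptide):
    """Return (b, y) lists of singly charged fragment m/z values.

    b[i-1] is the b-ion containing the first i residues;
    y[i-1] is the y-ion containing the last i residues.
    """
    masses = [RESIDUE_MASS[aa] for aa in peptide]
    n = len(masses)
    b = [sum(masses[:i]) + PROTON for i in range(1, n)]
    y = [sum(masses[n - i:]) + WATER + PROTON for i in range(1, n)]
    return b, y


b_ions, y_ions = fragment_ions("PEPTIDE")
for i, (b, y) in enumerate(zip(b_ions, y_ions), start=1):
    print(f"b{i} = {b:.4f}   y{i} = {y:.4f}")
```

Complementary fragments b(i) and y(n - i) together account for the whole peptide, so their m/z values sum to the neutral peptide mass plus two protons.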

Limitations of Current Database Search Programs

Finding a peptide match after a database search is easy, but knowing whether it is correct is not

It is almost always possible to match an MS/MS spectrum to a peptide in the database

Incorrect matches often (but not always) result from use of low-quality peptide MS/MS data to search the database

Actual peptide sequence is not in the database searched (under the search conditions used)

Probability of a false positive assignment is much higher for proteins identified with only one peptide (known as one-hit wonders)

According to publishing guidelines, more than one peptide per protein is required
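The "more than one peptide per protein" rule can be sketched as a simple filter over search results. The input format (pairs of protein and peptide identifiers) and all names below are assumptions for illustration, not the output format of any particular search program.

```python
from collections import defaultdict


def confident_proteins(peptide_hits, min_peptides=2):
    """Keep proteins supported by at least `min_peptides` distinct peptides.

    `peptide_hits` is an iterable of (protein_id, peptide_sequence) pairs.
    """
    peptides_by_protein = defaultdict(set)
    for protein, peptide in peptide_hits:
        peptides_by_protein[protein].add(peptide)
    return {
        protein: sorted(peptides)
        for protein, peptides in peptides_by_protein.items()
        if len(peptides) >= min_peptides
    }


# Hypothetical hits: protB is a one-hit wonder and is dropped
hits = [("protA", "AAK"), ("protA", "GGR"), ("protB", "LLK"), ("protA", "AAK")]
print(confident_proteins(hits))
```

Repeated matches to the same peptide are counted once, so a protein needs two different peptide sequences to pass.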

Methylocella silvestris

Acidophilic aerobic methanotroph

Methanotrophs use methane as their sole carbon and energy source

Key position in the global methane cycle

Effects of multi-carbon substrates on activity

Soluble methane monooxygenase (sMMO)

The recently discovered genus Methylocella is capable of utilising certain multi-carbon compounds as well as methane

Aims

To measure changes in protein expression of Methylocella silvestris under varying growth conditions.

Relate changes in proteome to important biological pathways.

Compare existing methods and new approaches using the same instrumentation and software without bias.

Proteomes analysed

[Figure: chemical structures of the growth substrates; the labelled structures are methane and acetate]

Creation of protein database

M. silvestris genome recently published

Predict all open reading frames

Use custom Perl scripts to create appropriate FASTA formatted database
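The slides mention custom Perl scripts, which are not shown. As a rough stand-in, the Python sketch below formats predicted protein sequences as FASTA records. The identifiers, descriptions and sequence are hypothetical, and the ORF prediction step itself is assumed to have been done already.

```python
import textwrap


def to_fasta(records, width=60):
    """Format (identifier, description, sequence) tuples as FASTA text."""
    lines = []
    for identifier, description, sequence in records:
        # Header line: '>' followed by the identifier and a free-text description
        lines.append(f">{identifier} {description}".rstrip())
        # Wrap the sequence to a fixed line width
        lines.extend(textwrap.wrap(sequence, width))
    return "\n".join(lines) + "\n"


# Hypothetical predicted ORF translations
records = [("orf0001", "hypothetical protein", "MKT" * 25)]
print(to_fasta(records), end="")
```

A fixed line width is a convention rather than a requirement, but it keeps the database readable for tools and people alike.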

Experimental approaches used

SDS-PAGE gels

No quantitation

iTRAQ

Uses 4 isobaric tags for quantitation

MSᴱ (Identityᴱ)

Uses an internal standard for quantitation

Database search results saved in MySQL database
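The schema used in the study is not given. The sketch below shows one way such results could be stored and queried, using Python's built-in sqlite3 module as a stand-in for MySQL so that it runs without a server. The table name, column names and rows are all hypothetical.

```python
import sqlite3

# In-memory SQLite database used here in place of a MySQL server
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE identification (
        protein_id       TEXT    NOT NULL,
        method           TEXT    NOT NULL,
        growth_substrate TEXT    NOT NULL,
        n_peptides       INTEGER NOT NULL,
        coverage_pct     REAL
    )
    """
)

# Hypothetical search results
rows = [
    ("orf0001", "iTRAQ", "methane", 3, 12.5),
    ("orf0002", "iTRAQ", "methane", 1, 4.0),
    ("orf0001", "IdentityE", "methane", 9, 48.0),
    ("orf0003", "IdentityE", "acetate", 2, 20.0),
]
conn.executemany("INSERT INTO identification VALUES (?, ?, ?, ?, ?)", rows)

# Count proteins per method that are supported by two or more peptides
counts = dict(
    conn.execute(
        """
        SELECT method, COUNT(DISTINCT protein_id)
        FROM identification
        WHERE n_peptides >= 2
        GROUP BY method
        """
    )
)
print(counts)
```

Keeping all methods in one table makes comparisons between methodologies a matter of simple grouped queries.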

Proteins identified by each methodology

[Figure: proteins identified by each methodology; panels: Gels, iTRAQ, Identityᴱ]

Overlap of protein identifications

Including single-peptide identifications

Two peptides or more
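The overlap between methodologies amounts to set operations on protein identifiers. A minimal sketch, with made-up identifiers and assumed function names:

```python
def overlap_summary(ids_by_method):
    """Return (shared, unique) for a dict mapping method name -> set of protein IDs.

    `shared` holds proteins found by every method; `unique[method]` holds
    proteins found by that method only.
    """
    shared = set.intersection(*ids_by_method.values())
    unique = {}
    for method, ids in ids_by_method.items():
        others = set().union(
            *(s for m, s in ids_by_method.items() if m != method)
        )
        unique[method] = ids - others
    return shared, unique


# Hypothetical identifications
ids = {
    "Gels": {"p1", "p2"},
    "iTRAQ": {"p2", "p3"},
    "IdentityE": {"p2", "p3", "p4"},
}
shared, unique = overlap_summary(ids)
print(shared, unique)
```

Running it once on all identifications and once on those with two or more peptides gives the two views named on this slide.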

iTRAQ reporter ion ratios

Replicate analyses of iTRAQ-labelled samples
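Reporter ion ratios can be sketched as each channel's intensity divided by that of a reference channel. The choice of 114 as the reference and the example intensities are assumptions for illustration; the slides do not state which sample carried which tag.

```python
REPORTERS = (114, 115, 116, 117)  # the four iTRAQ reporter ion channels


def reporter_ratios(intensities, reference=114):
    """Return the ratio of each reporter channel to the reference channel."""
    ref = intensities[reference]
    if ref <= 0:
        raise ValueError("reference channel intensity must be positive")
    return {
        channel: intensities[channel] / ref
        for channel in REPORTERS
        if channel != reference
    }


# Hypothetical reporter intensities from one MS/MS spectrum
spectrum = {114: 1000.0, 115: 2000.0, 116: 500.0, 117: 1000.0}
print(reporter_ratios(spectrum))
```

In practice, ratios from several peptides of the same protein would be combined (for example by taking a median) before comparing replicate analyses.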

Identityᴱ estimated quantitation

Log(e) ratio plot of common proteins expressed under methane and acetate growth.
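The values behind such a plot can be sketched as the natural logarithm of the methane-to-acetate abundance ratio for each protein found under both conditions. The amounts and names below are hypothetical.

```python
import math


def log_ratios(methane, acetate):
    """Natural-log ratio (methane / acetate) for proteins present in both dicts.

    Proteins missing from either condition, or with non-positive amounts,
    are skipped because the log ratio is undefined for them.
    """
    common = methane.keys() & acetate.keys()
    return {
        protein: math.log(methane[protein] / acetate[protein])
        for protein in sorted(common)
        if methane[protein] > 0 and acetate[protein] > 0
    }


# Hypothetical estimated amounts per protein
methane = {"p1": 2.0, "p2": 1.0, "p3": 5.0}
acetate = {"p1": 1.0, "p2": 4.0}
print(log_ratios(methane, acetate))
```

Positive values indicate higher levels under methane and negative values higher levels under acetate; a log scale treats a two-fold increase and a two-fold decrease symmetrically.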

Summary

Metric                                           Gels           iTRAQ            Identityᴱ
Protein loading                                  14 µg          800 – 1000 µg    0.5 – 0.75 µg
Total experimental time                          4 days         6 days           Less than 3 days
Total instrument time                            30-40 hours    30-40 hours      6 hours per sample
Number of proteins identified                    95             171              399
Average number of peptides per protein           3.7            2                10
Average sequence coverage                        17.2 %         11.5 %           50.4 %
Dynamic range covered by relative quantitation   -              3                4
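For readers unfamiliar with the sequence coverage metric in the summary, the sketch below computes it as the percentage of residues in a protein that are covered by at least one identified peptide. The sequences are made up and the function name is an assumption.

```python
def sequence_coverage(protein, peptides):
    """Percentage of residues in `protein` covered by at least one peptide."""
    covered = [False] * len(protein)
    for peptide in peptides:
        if not peptide:
            continue
        # Mark every occurrence of the peptide in the protein sequence
        start = protein.find(peptide)
        while start != -1:
            for i in range(start, start + len(peptide)):
                covered[i] = True
            start = protein.find(peptide, start + 1)
    return 100.0 * sum(covered) / len(protein)


# Hypothetical 10-residue protein with two overlapping peptides
print(sequence_coverage("MKTAYIAKQR", ["MKT", "TAY"]))
```

Overlapping peptides do not count twice, so coverage can never exceed 100 %.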

Conclusions

All the methodologies employed provided good profiling coverage of the respective proteome.

iTRAQ and Identityᴱ both provide information on protein identity and changes in expression.

Identityᴱ

More confident protein identifications

Lower protein requirements

Significantly lower instrument demands

Significantly reduced sample preparation time

Provides a stand-alone quantitative estimate of the proteins present in any given sample

Publication

A comparison of labelling and label-free mass spectrometry-based proteomics approaches.

Konstantinos Thalassinos, Vibhuti Patel, Susan Slade, Nisha Patel, Joanne Connolly, Andrew Crombie, Colin Murrell, James Scrivens.

Journal of Proteome Research

Ongoing studies

Continue the studies on the effect of growth substrate on the proteome of M. silvestris.

Relate results back to M. silvestris cell biochemistry, in particular the pathways of multi-carbon substrate assimilation

Biological significance of results obtained

Distinct protein profiles for each substrate:

Certain proteins only expressed under methane.

Certain proteins only expressed under acetate.

Significantly lower levels of key enzyme soluble methane monooxygenase (sMMO) when grown under acetate.

Pathway mapping

Kyoto Encyclopedia of Genes and Genomes (KEGG)

Manually curated database of biological pathways.

KEGG Automatic Annotation Server (KAAS)

Functional annotation of genes by BLAST comparisons against KEGG database.

Develop a program to map the KAAS results back onto KEGG pathways (http://www.genome.jp/kegg/)
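The program itself is not described. One possible shape for it is sketched below, assuming KAAS results are available as tab-separated lines of gene identifier and KO (KEGG Orthology) number, with the KO field empty for unassigned genes. The KO numbers and the KO-to-pathway table are made-up placeholders; in practice that table would be taken from KEGG.

```python
from collections import defaultdict


def parse_kaas(text):
    """Parse lines of 'gene_id<TAB>KO'; genes without a KO are skipped."""
    assignments = {}
    for line in text.splitlines():
        fields = line.strip().split("\t")
        if len(fields) >= 2 and fields[1]:
            assignments[fields[0]] = fields[1]
    return assignments


def genes_by_pathway(assignments, ko_to_pathways):
    """Group gene IDs by the pathways their assigned KO belongs to."""
    pathways = defaultdict(set)
    for gene, ko in assignments.items():
        for pathway in ko_to_pathways.get(ko, ()):
            pathways[pathway].add(gene)
    return dict(pathways)


# Hypothetical KAAS output and placeholder KO-to-pathway table
kaas_text = "orf0001\tK99991\norf0002\norf0003\tK99992\n"
ko_to_pathways = {
    "K99991": ["map00680"],
    "K99992": ["map00680", "map00620"],
}
print(genes_by_pathway(parse_kaas(kaas_text), ko_to_pathways))
```

Grouping by pathway makes it straightforward to overlay the proteins identified under each growth substrate onto the corresponding pathway maps.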

Biological pathways

Acknowledgements


Multi-dimensional protein identification techniques (MudPIT)

Profiling Quantitation

Proteome

Label with isobaric tags

Tryptic digest

Strong cation exchange chromatography

LC-ESI-MS/MS

Database searches

Analyse ratios of tags (iTRAQ reporter ions 114, 115, 116 and 117)

Protein identification

Relative levels of identified proteins

Alternative approach to profiling and quantitative proteomics

Waters Identityᴱ and Expressionᴱ

500 ng sample loading plus an internal standard

Approximately 2 hours analysis/sample

New approach to profiling and quantitative proteomics

Validation of MSᴱ data

Identityᴱ and Expressionᴱ

Quantitation is based on relationship between ESI signal response and protein concentration

The average ESI signal response of the 3 most intense tryptic peptides per mole of protein is constant (CV +/- 10%) $

Quantify at the protein level (gross changes) or peptide level (minor fluctuations in protein expression)

$ Silva et al., Absolute Quantification of Proteins by LCMSᴱ, MCP vol. 5 issue 1 (2006) 144-156
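The relationship above suggests a simple estimate: if the mean signal of the three most intense tryptic peptides per mole is roughly constant, a protein's amount can be scaled from an internal standard of known amount. The sketch below illustrates that arithmetic only; the function names, intensities and units are assumptions, and it is not the vendor software's implementation.

```python
def top3_response(peptide_intensities):
    """Mean intensity of the three most intense peptides of a protein."""
    top = sorted(peptide_intensities, reverse=True)[:3]
    if len(top) < 3:
        raise ValueError("at least three peptides are needed")
    return sum(top) / 3


def estimate_amount(protein_intensities, standard_intensities, standard_amount):
    """Scale the known standard amount by the ratio of top-3 responses."""
    return (
        standard_amount
        * top3_response(protein_intensities)
        / top3_response(standard_intensities)
    )


# Hypothetical intensities; the standard is assumed to be loaded at 50 fmol
amount = estimate_amount([300, 200, 100, 50], [600, 400, 200], 50.0)
print(amount)
```

Because only the top three peptides are used, the estimate does not depend on how many peptides a protein happens to yield.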

Protein identifications common to iTRAQ and Identityᴱ

Proteins identified in the iTRAQ and Identityᴱ experiments, including one-hit wonders.
