Protein Purification Overview

Proteins are the molecules that do interesting things in cells – they are involved in the
production of all other molecules and cell structure, and they are involved in a cell’s ability to
respond to changes in its environment. It’s not clear how many proteins there are in an
organism, but a “best guess” for humans is about 50,000, and for bacteria maybe 10,000 or so.
There are a variety of ways to investigate protein function. Some of these are made possible by
advances in molecular biology (now we can knock out the expression of a gene, or create
altered versions, and look at the effects, both on the organism and on molecular events in the
cell) – and these are useful, because they look at the protein in its cellular context. But often it
is necessary to purify the protein away from the other thousands of cellular proteins, so that its
structure can be examined, or its interactions with other molecules studied more specifically.
Thus, a great deal of protein chemistry has to do with the principles and problems of protein
purification, and even when not related to purification per se, can be instructive and applicable
to all sorts of protein work.
The tremendous variety in protein structure and chemistry makes the task of purification
feasible but difficult. The overall challenge is to find methods that will separate proteins into
two (or more) different fractions based on some property (charge, size, density, etc.). The
protein of interest is thus put into a smaller group of proteins that are similar in one way (let's
say, proteins that are large). One can then take this group and separate its members based on
another property, such as whether they bind to (+) charged surfaces. At each step one looks for
the protein of interest, hoping to find one
fraction that has lots of the protein of interest, but few other proteins.
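The idea of narrowing down the mixture one property at a time can be sketched in a few lines of Python. This is only a toy illustration: the protein names, masses, charges and cutoffs are all made up, and binding to a (+) charged surface is crudely modeled as "net charge below some threshold", which ignores charge distribution.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Protein:
    name: str
    mass_kda: float    # molecular mass in kDa (hypothetical values)
    net_charge: float  # net charge at the working pH (hypothetical values)


def split(proteins, predicate):
    """Divide proteins into (those that satisfy predicate, the rest)."""
    yes = [p for p in proteins if predicate(p)]
    no = [p for p in proteins if not predicate(p)]
    return yes, no


# Entirely hypothetical mixture; "target" is the protein of interest.
mixture = [
    Protein("target", 120.0, -8.0),
    Protein("A", 15.0, -3.0),
    Protein("B", 140.0, +5.0),
    Protein("C", 30.0, +2.0),
    Protein("D", 110.0, -1.0),
]

# Step 1: keep the large proteins (the 100 kDa cutoff is an arbitrary choice).
large, small = split(mixture, lambda p: p.mass_kda >= 100.0)

# Step 2: of the large proteins, which ones stick to a (+) charged surface?
bound, unbound = split(large, lambda p: p.net_charge < -5.0)

print("large fraction:", [p.name for p in large])
print("bound fraction:", [p.name for p in bound])
```

In a real purification the "predicate" is a physical separation, and an assay on each fraction takes the place of reading the name off a list.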
One important thing to understand is that the process of designing purification strategies is
mostly empirical. A method is applied, the investigators find the protein of interest,
characterize the different fractions and decide if this was a good method to use (or not) and
then they go again. These steps also tell the investigators something about the protein. It is true
that we can now often learn the primary structure of an interesting protein before we even begin to
try to purify it (by identifying its DNA sequence and decoding it), but that only takes us
part way. The folded, native structure of a protein dictates its interactions with other
molecules, and hence, the relevant purification strategies. It is also true that genetic
engineering has allowed us to manipulate proteins by attaching tags that allow purification
schemes that have little to do with the protein's native structure. Still, these methods are seldom
used as the sole purification step, and they are still based on the same sorts of principles that
drive a more classical protein purification approach (that is, based on the particular properties
of the protein desired) – and optimization of such steps is still largely empirical.
Preparative vs. Analytical Methods
Before launching into the specific approaches and methods that are most often used, it is useful
to distinguish between preparative and analytical methods:
Preparative methods: the purpose is to “prepare”, so the samples are large, not
consumed but recovered, and the goals are good yield and purity of the sample of
interest.
Analytical methods: the purpose is to analyze, which often involves consuming the protein, so
small samples are taken for analysis; precision, accuracy and resolution are key.
In purifications, we use analytical methods to find the protein of interest, and determine how
much other protein has yet to be gotten rid of. Obviously, analytical methods are also used in
non-preparative contexts, but when purifying, the distinction is good to keep in mind –
especially that bit about taking small samples for analysis (because reserving generous amounts
for analysis takes away from the yield of the purification).
Properties of proteins that can be used in purification approaches
Size: Proteins range from about 2000 Daltons (amu) to over 1,000,000 Daltons, though most
are in the 10-150 kD range
Shape: Spherical to asymmetric. Affects movement through media (such as a gel). Methods
that presume to separate proteins of different sizes or molecular weight are often actually based
on the “Stokes radius” which is the effective radius of a molecule tumbling in space.
Charge: Whatever acidic and basic amino acids are not involved in salt bridges may carry
charge (depending on the pH of the solution). Separations can be based on net charge, or on
charge distribution.
Isoelectric point (pI): this is the pH at which the net charge on a protein is zero. Each
protein will have its characteristic pI. Proteins can be applied to a pH gradient in an
electric field, and they will stop moving at the pH at which their charge is zero.
Charge distribution: uneven charge distribution can affect the way proteins bind to
charged beads (called ion exchange resins). Even if a protein has a net zero charge, (+)
patches may bind to (-) beads.
Hydrophobicity: Non-polar amino acids that are not buried in the center may be available to
bind to beads with non-polar groups attached, if the water cages around them can be got rid of
(for example with high salt, which gives the H2O molecules something else to do).
Solubility: depending on the particular solvent, salts, pH, temperature, etc., proteins may stay in
solution or not. Insolubility leads to precipitation, and precipitates are easily separated from
soluble proteins by centrifugation.
Density: most proteins are between 1.3 and 1.4 g/cm³, but some are not. Proteins that contain
lipids are less dense, and some containing phosphate groups are denser.
Unusual properties:
Metal binding, thermostability, protease resistance
Affinity methods
These are in a bit of a special group because they depend on the much more specific interactions
that proteins make with other molecules.
Ligand Binding: Ligands are small molecules that bind to proteins, such as substrates,
effector molecules, cofactors, etc. Often these can be used to select for only those
proteins that bind to them. Sometimes one can separate a functional class of molecules
(such as those that bind to ATP) or very specific proteins.
Immunoaffinity: if antibody molecules that recognize the protein are available, these can be used.
Genetically engineered purification “handles”: fusion proteins in which a specific
binding moiety is added to one end of the protein (e.g., a 6X-His tag added to a protein
allows it to bind to a Ni column)
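The isoelectric point described under "Charge" above can be roughly estimated from a sequence. The sketch below sums Henderson-Hasselbalch charges over the ionizable groups and finds the pH where the total crosses zero. The pKa values are approximate, assumed numbers (different sources list somewhat different ones), and the calculation treats every group as free in solution, so it ignores salt bridges and the folded structure. Treat the result as a starting guess, not a measurement.

```python
# Approximate pKa values (assumed; published tables differ from one another).
POSITIVE_PKA = {"K": 10.5, "R": 12.5, "H": 6.0, "N_TERM": 9.0}
NEGATIVE_PKA = {"D": 3.9, "E": 4.1, "C": 8.3, "Y": 10.1, "C_TERM": 3.1}


def net_charge(sequence: str, ph: float) -> float:
    """Estimate net charge of a chain at a given pH (one-letter amino acid codes)."""
    seq = sequence.upper()
    charge = 0.0
    for group, pka in POSITIVE_PKA.items():
        n = 1 if group == "N_TERM" else seq.count(group)
        # Fraction of basic groups that are protonated (+1 each).
        charge += n / (1.0 + 10.0 ** (ph - pka))
    for group, pka in NEGATIVE_PKA.items():
        n = 1 if group == "C_TERM" else seq.count(group)
        # Fraction of acidic groups that are deprotonated (-1 each).
        charge -= n / (1.0 + 10.0 ** (pka - ph))
    return charge


def isoelectric_point(sequence: str, tol: float = 1e-4) -> float:
    """Bisect between pH 0 and 14; net charge only falls as pH rises."""
    lo, hi = 0.0, 14.0
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if net_charge(sequence, mid) > 0:
            lo = mid  # still positive, so the pI lies at higher pH
        else:
            hi = mid
    return (lo + hi) / 2.0


if __name__ == "__main__":
    peptide = "ACDKEH"  # hypothetical peptide, for illustration only
    pi_estimate = isoelectric_point(peptide)
    print(f"estimated pI: {pi_estimate:.2f}")
    print(f"net charge at pH 7: {net_charge(peptide, 7.0):+.2f}")
```

A protein sitting at a pH below its estimated pI carries a net (+) charge and one above it a net (-) charge, which is the starting point for choosing ion exchange conditions.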
Purification strategies
1. “Antecedents”
a. Assay (analytical)
b. Source
2. Purification
a. Prepare a soluble fraction (extract, lysate or homogenate)
b. Centrifugation
i. Differential centrifugation
ii. Separation based on size, shape, density
c. Bulk separations
i. Differential precipitation
ii. Bulk application of adsorption methods (e.g., bulk phase ion-exchange)
d. Chromatography -- General principles
i. LC vs. HPLC
ii. Gel filtration (size, shape)
iii. Ion exchange (charge, charge distribution)
iv. Hydrophobic interaction
v. Affinity
e. Electrophoresis
i. Gel electrophoresis (charge, size, shape)
ii. Isoelectric focusing
3. Characterization
Antecedents: “things that come before”.
Assay
Having a way to identify the protein of interest is ABSOLUTELY CRITICAL. You have to
know this before you even start. Ideally, the assay should be easy and quantitative, and be
related to the biological function of the protein. These conditions are not always met – an assay
for function may be cumbersome, or difficult to quantitate, or may fail to pick up just one
protein (in one of the example papers, a column fractionation revealed two distinct peaks of
the same activity – indicating two different proteins that can do the same thing).
Source
Sometimes your source is an option, sometimes not. If you are purifying a recombinant human
protein that is being expressed by yeast cells, you will obviously start with yeast cells. You
may try to grow or manipulate the expression system so that your protein of interest is
relatively abundant. You may be faced with the task of purifying from a given source (if you
are looking for the version of protein that is made in rat brain, then you will have to start with
rat brain). But sometimes there are choices. (Calmodulin, a small protein that is important for
how cells respond to Ca++ signals, is found in high amounts in skeletal muscle, and that would
seem to be the obvious source for purification, but skeletal muscle also contains another
protein, Troponin C, which has chemical properties that are very similar to those of calmodulin,
and co-purifies with it in many different separation procedures. A better source is smooth
muscle, because there is little Troponin C in that kind of muscle.)
Purification
Prepare a soluble fraction: lysate, homogenate, extract
Most separation procedures depend on your protein of interest being soluble, or at least
suspendable in solution. The various methods of disrupting the tissue, and the solution that you
use to do so, are the two key features of this part.
Mechanical methods include using a blender, a homogenizer (Polytron), a sonicator, French
pressure cell, or Dounce homogenizer. The Dounce is the gentlest of these – it basically shears cell
membranes between glass surfaces. It is so gentle in fact that it cannot deal with sturdy
structures like cell walls (plant, yeast or bacteria) or with connective tissue.
The solution used is also important. Choices need to be made about the pH, and thus, the kind
of buffer, and the concentration of the buffer and of other salts (ionic strength). Other reagents
are often included to help preserve the integrity and activity of the protein: protease inhibitors,
reducing agents, metal ions (or metal ion chelators), or other more specific reagents. Finally,
temperature can be important – most of the proteins we work with will have a shorter active
lifetime at elevated temperature, and any contaminating proteases can be more active when the
temperature is high. When you consider that some of the disruption methods cause temperature
increases (by friction), it's clear that some care should be taken to counteract these increases.
The terms used to describe the soluble fraction are mostly interchangeable – there are formal
differences in meaning between a lysate, a homogenate and an extract, but it is not really
important what you call it.
Centrifugation
After making some sort of suspension of the cell or tissue, the next step is generally to
centrifuge at a relatively low speed and recover the supernatant (unless of course, what you are
studying is trapped in the insoluble material). The speed you use for the centrifugation will
determine what is in this fraction. If you spin slowly, you will only pellet big things (a gently
lysed suspension of mammalian cells, centrifuged at relatively slow speed, will pellet only the
large and heavy nuclei). This is handy if you are studying nuclear proteins. If you are not, you
might take the “post nuclear” supernatant (hereafter sup, which is pronounced “soup”), and
spin it much harder. People working with truly cytoplasmic proteins will often consider the
“S100” fraction as their starting point (it is the sup from a centrifugation at 100,000 x g) – at these
speeds the organelles, small membrane vesicles (microsomes, which are really bits of torn ER)
and collections of cytoskeletal components are pelleted. Of course, sometimes you want these
things, and you take the pellet. Or, a more specifically designed centrifugation can get you the cellular
component you want – this is called “differential centrifugation”.
Centrifugation is also used to separate proteins from each other based on their density, mass or
size, as the solution can be set up so that the rate of molecules moving through the medium
depends on their density and/or their size.
One thing to note with centrifugation – when it is specified in a protocol it is either done in
terms of RCF (relative centrifugal force, generally given as # x g) and time, OR speed (rpm),
rotor type, and time. This is because what really matters is the RCF – which is determined
by the speed and the radius of the rotor. At a given speed, the larger the radius, the higher the
RCF, but the dependence on speed is not linear: RCF = 1.12 x 10⁻⁵ x radius (cm) x rpm². If a
protocol is given with an
rpm for one rotor that is different from the one you have, you should look up the radius, find
the RCF and then figure out how to get that using your rotor. In published reports, you will
generally see the RCF, though sometimes the rotor, and very seldom will anyone specify the
make and model of the centrifuge, because that is usually irrelevant.
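The rotor conversion just described is easy to get wrong by hand, so here is a minimal sketch of it in Python. It uses the same relationship given above (RCF = 1.12 x 10⁻⁵ x radius in cm x rpm²). The function names and the two rotor radii in the example are made up for illustration; look up the real radius for your rotor.

```python
import math

RCF_CONSTANT = 1.12e-5  # gives RCF in multiples of g when the radius is in cm


def rcf_from_rpm(rpm: float, radius_cm: float) -> float:
    """Relative centrifugal force (x g) for a given speed and rotor radius."""
    return RCF_CONSTANT * radius_cm * rpm ** 2


def rpm_from_rcf(rcf: float, radius_cm: float) -> float:
    """Speed (rpm) needed to reach a given RCF with a rotor of this radius."""
    return math.sqrt(rcf / (RCF_CONSTANT * radius_cm))


if __name__ == "__main__":
    # Protocol: 10,000 rpm in a rotor with an 8 cm radius (hypothetical).
    target_rcf = rcf_from_rpm(10_000, 8.0)
    # Our rotor has a 10 cm radius (hypothetical), so it needs a lower speed.
    our_rpm = rpm_from_rcf(target_rcf, 10.0)
    print(f"protocol RCF: {target_rcf:.0f} x g")
    print(f"speed for our rotor: {our_rpm:.0f} rpm")
```

Because RCF goes with the square of the speed, doubling the rpm quadruples the force, while doubling the radius only doubles it.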
Bulk Separations
This refers to the practice of taking a large batch of sample, and setting the conditions such that
the molecules are basically divided into two sets that can be separated from each other. For
example, if beads that are (+) charged are added to a solution of proteins, some will bind to the
beads, and others will not. If you separate the beads from the rest of the solution, you will
have two groups of proteins – those that bound and those that did not. You can usually get the
bound proteins off of the beads by changing the ionic conditions, and thus elute them into a
separate container. It's likely that your protein will be much more abundant in one or the
other fraction, and you can get rid of a whole lot of proteins that you are not interested in by
continuing on with just that fraction.
Other bulk-type separations rely on setting up conditions that cause one group of proteins to
precipitate (i.e., “fall out of solution”). Salt precipitation is often the first step in purification.
When the salt concentration is raised, the water molecules that tend to cover hydrophobic
patches on the surfaces of proteins are more likely to be engaged in electrostatic interaction
with the salt ions, and the proteins tend to “glom” together by those exposed hydrophobic
patches – when they get big enough, they precipitate. The amount of salt dictates which
proteins stay soluble and which precipitate. (See Ammonium sulfate handouts for more
information). One very convenient feature of this kind of precipitation is that it is usually
reversible – the protein's folded structure is not compromised, so by changing the salt
concentration again, the precipitated proteins can be recovered and are usually active. Thus
either fraction might be chosen.
Other methods can be devised to selectively precipitate (or not) the protein of interest. High
heat generally denatures proteins, because when their H-bonds are ripped apart, they unfold and
exposed hydrophobic regions bind to those of other proteins forming large precipitates (think
cooked egg white). However, some small proteins are heat-resistant, and will not precipitate.
The proteins precipitated in this way are generally not recoverable, but if you want the
heat-stable proteins, you can achieve a good purification with just this one step.
Chromatography – General Principles
Usually after one or a few bulk separation steps, some sort of column chromatography is used.
In chromatography, molecules in a mobile phase are retarded in their movement across a
stationary phase based on the degree to which they either interact with the stationary phase, or
remain soluble in the mobile phase.
The classic example of chromatography is the separation of plant pigments in an organic
solvent system, on paper or a thin layer of silica (TLC is thin layer chromatography). The
pigments that are most soluble in the solvent system move the fastest, because they stay mostly
in the mobile phase. The molecules that are less soluble tend to get retarded because in
moments of insolubility, they get stuck on the stationary phase.
Most of the solvent systems that we use with proteins and in column chromatography are
aqueous, and it is not so much a matter of solubility but interaction with the stationary phase
itself. The stationary phase is usually some sort of bead (sometimes called gel, or resin or
matrix, but in fact, they are generally all some sort of particle that can be made into a slurry and
poured into a column). We then set up a plumbing system such that we can pass a solution
through the column (by gravity or by pump), load our protein, continue to pass the solution
through, and ultimately elute the proteins. The ones that interact less with the beads will elute
first – those with more interaction will elute later. As the main types of chromatography are
described below, realize that they all involve molecules in a mobile phase, which are retarded, to a
greater or lesser degree, by their interaction with the stationary phase.
When separating molecules in this fashion, it is customary to collect many fractions, so that the
molecules with differing degrees of interaction with the column are captured separately from
each other. This allows for better resolution (the ability to “see” two things as separate entities)
– and to separate molecules that may differ only slightly in their ability to interact with the
resin. The various fractions are tested for the presence of the desired protein, and also for total
protein, and the fractions that contain our protein are generally pooled to collect it.
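Deciding which fractions to pool is a trade-off between yield and purity, and a quick calculation can make that trade-off visible. The sketch below is one possible rule of thumb, not a standard procedure: it pools every fraction whose assay signal is at least some share of the peak fraction, then reports how much of the activity was recovered and the activity per mg of protein in the pool (usually called specific activity, a term not used above). All numbers are invented.

```python
def pool_fractions(activity, protein_mg, threshold=0.5):
    """Pool fractions whose activity is at least threshold x the peak fraction.

    activity:   assay units found in each fraction
    protein_mg: total protein (mg) in each fraction
    Returns (indices pooled, share of total activity recovered,
             activity units per mg protein in the pool).
    Assumes at least one fraction has activity and pooled protein is non-zero.
    """
    peak = max(activity)
    chosen = [i for i, a in enumerate(activity) if a >= threshold * peak]
    pooled_activity = sum(activity[i] for i in chosen)
    pooled_protein = sum(protein_mg[i] for i in chosen)
    recovery = pooled_activity / sum(activity)
    return chosen, recovery, pooled_activity / pooled_protein


if __name__ == "__main__":
    # Hypothetical profile for eight fractions off a column.
    activity = [0, 2, 10, 40, 35, 8, 1, 0]                  # assay units
    protein_mg = [1.0, 3.0, 2.0, 1.0, 1.0, 2.0, 4.0, 1.0]   # total protein

    for threshold in (0.5, 0.2):
        chosen, recovery, per_mg = pool_fractions(activity, protein_mg, threshold)
        print(f"threshold {threshold}: pool {chosen}, "
              f"recovery {recovery:.0%}, {per_mg:.1f} units/mg")
```

Lowering the threshold pulls in the shoulders of the peak: more of the activity is recovered, but the pool carries more unwanted protein per unit of activity.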
There are a variety of different types of chromatography that differ in the properties used to
achieve separation – the main ones are gel filtration, ion exchange, hydrophobic interaction and
affinity – which will be described in more detail in subsequent sections.
LC vs. HPLC
LC stands for liquid chromatography, meaning that the mobile phase is liquid. (GC is gas
chromatography, where the mobile phase is gas or vapor). HPLC adds “High Performance” to
the name (it used to stand for high pressure, and in fact usually does involve higher pressure).
It is carried out in columns that may be packed with the same sorts of resins used in standard
LC, but that are run under higher-pressure conditions. These columns are usually not practical for
large-scale separations, and are more commonly used for analysis. One type of separation that is rarely
used except in HPLC is “reverse phase” chromatography. Decades ago, early
chromatographers worked with silica beads and organic solvent systems (such as the TLC of
pigments described above), and that was considered “normal phase”. Reverse phase means that
the beads are non-polar (usually a hydrocarbon chain – C18 is very common) and the solvent
system is more polar, and as with TLC, the issue is solubility and whether the molecules spend
more time in the mobile phase or stationary phase.
Gel Filtration Chromatography (also called Size Exclusion)
The beads of these resins are a bit like Wiffle balls. They have pores, and spaces inside. Very
large molecules that cannot enter the pores travel between the Wiffle balls through the column.
In a sense they do not really interact with the beads at all, they just pass by. (We say they are
traveling in the void volume, or Vo – which is the space between the beads.) Smaller
molecules may enter some pores, and have to travel through more space (that is, not only the
space between the beads, but also through the internal space of the beads, called the included
volume, or Vi). They go slower because, in a sense, they have to travel farther. The smaller the
molecule, the more beads it actually goes into. Very small molecules will enter basically
every bead they encounter, and will elute last. Thus, we predict that if we know the Vo + Vi of
the column, the smallest molecules should come out in that volume. This assumes that the
molecules are not interacting in any OTHER way with the beads. The beads for gel filtration
are made to be inert, but sometimes molecules will stick to them a bit, and elute later than you
would expect.
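The picture above (big molecules come out at Vo, the smallest at Vo + Vi, everything else in between) can be written as one simple relationship: elution volume = Vo + K x Vi, where K is the fraction of the internal volume a molecule can get into (0 for fully excluded, 1 for fully included). K is a symbol introduced here for the sketch; it is not used in the text above. The column volumes in the example are made up.

```python
def elution_volume(vo_ml: float, vi_ml: float, k: float) -> float:
    """Expected elution volume for a molecule that can enter a fraction k of Vi."""
    if not 0.0 <= k <= 1.0:
        raise ValueError("k must be between 0 (excluded) and 1 (fully included)")
    return vo_ml + k * vi_ml


def accessible_fraction(ve_ml: float, vo_ml: float, vi_ml: float) -> float:
    """Work backwards from an observed elution volume to the apparent k."""
    return (ve_ml - vo_ml) / vi_ml


if __name__ == "__main__":
    vo, vi = 8.0, 14.0  # hypothetical column: 8 mL void volume, 14 mL included volume
    print("excluded protein elutes at", elution_volume(vo, vi, 0.0), "mL")
    print("small molecule elutes at", elution_volume(vo, vi, 1.0), "mL")
    print("mid-sized protein elutes at", elution_volume(vo, vi, 0.5), "mL")

    # An apparent k above 1 means the molecule came out later than Vo + Vi,
    # which suggests it is sticking to the beads rather than just sieving.
    print("apparent k for a 25 mL peak:", round(accessible_fraction(25.0, vo, vi), 2))
```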
One more thing about the Vo + Vi assumption – it is determined empirically, but can be
guessed at by determining the volume of the cylindrical bed of the resin. If you measure the
height of the resin (not the column itself, but the resin bed), and know the internal diameter of
the tube, you can calculate the bed volume as π x r² x h. There are three things that take up space
in the column: the Vo, the Vi and the space the bead material itself takes up, Vg. Although proteins
do not travel in the Vg (it would be like walking through walls), we generally ignore it, because
it's hard to figure out what it really is – so we take the bed volume and assume that if we run
one bed volume through the column, all molecules should come off.
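As a quick worked example of the bed volume estimate, here is a small Python helper. The column dimensions are hypothetical, and the function name is just for illustration. Note that the calculation needs the radius, which is half the internal diameter, and that 1 cm³ is 1 mL.

```python
import math


def bed_volume_ml(inner_diameter_cm: float, bed_height_cm: float) -> float:
    """Volume of the cylindrical resin bed in mL (1 cm^3 = 1 mL)."""
    radius_cm = inner_diameter_cm / 2.0
    return math.pi * radius_cm ** 2 * bed_height_cm


if __name__ == "__main__":
    # Hypothetical column: 1.6 cm internal diameter, resin bed 30 cm high.
    volume = bed_volume_ml(1.6, 30.0)
    print(f"bed volume: {volume:.1f} mL")
```

For this made-up column the bed volume comes to about 60 mL, so by the assumption above, running roughly that much buffer through should bring off all of the molecules.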
There are many gel filtration resins that can be purchased, with defined size ranges (see
handout). Often their name gives a clue as to the sizes of proteins that they separate. You
should, however, be aware that this is just an estimate – in fact, whether a protein can enter the
pores (or not) of a given bead depends not so much on its molecular weight, but on its Stokes
radius, which is the effective radius when tumbling in solution. We generally say that this
method separates on the basis of molecular size rather than weight. Also, because it is a
non-denaturing method, multimeric proteins that have subunits held together by non-covalent bonds
are likely to separate at their conglomerate size, and may be very large.