Abstract

Dipeptidyl peptidase IV is a serine oligopeptidase utilised by the periodontopathic, asaccharolytic
bacterium Porphyromonas gingivalis in amino acid degradation and peptide scavenging, and has been
implicated in the pathogenesis of adult periodontitis. The aims of this project were to characterise the
three-dimensional structure of dipeptidyl peptidase IV and its relation to the enzyme's function, to aid in
the design of future inhibitors for the treatment of periodontitis. Preliminary X-ray crystallography of
native dipeptidyl peptidase IV crystals provided a 2.5 Å resolution diffraction dataset, and X-ray
diffraction on selenomethionine-derivatised crystals provided an additional 3-wavelength multi-wavelength
anomalous dispersion dataset for determination of crystallographic phases. Prior to the start of the project
an atomic model was automatically built using the Solve/Resolve software. The WinCoot and CCP4 program
packages were then used for manual and automated refinement of the constructed model. Following
refinement, the best model had an R-factor of 21.3%, with a corresponding Rfree value of 26.9%.
The average B factor of all protein atoms for this model was 49.9 Å², and the RMS deviations
of bond distances and angles were 0.0156 Å and 1.9147° respectively. Structural alignment with
available eukaryotic and prokaryotic homologs revealed that dipeptidyl peptidase IV bore strong
structural congruity with them; in particular, dipeptidyl peptidase IV from the gram-negative bacterium
Stenotrophomonas maltophilia showed the greatest degree of structural alignment. The active site of
dipeptidyl peptidase IV, however, showed greater alignment with its eukaryotic homologs, dipeptidyl
peptidase IV from both Homo sapiens and Sus scrofa, suggesting that the structure-function relationship
of this prokaryotic enzyme is closer to that of its eukaryotic homologs than its prokaryotic homologs.
Alexander Fullwood - 1024400
Crystal structure determination of dipeptidyl
peptidase IV from Porphyromonas gingivalis
Supervisor: Prof. Vilmos Fülöp
2013
Table of Contents
1.1 - Periodontal disease & Porphyromonas gingivalis
1.2 - Dipeptidyl peptidase IV: the prolyl oligopeptidase family
1.3 - Treatment of Periodontal disease: inhibition of dipeptidyl peptidase IV
1.4 - Homologs of P. gingivalis dipeptidyl peptidase IV
1.4.1 - Homo sapiens dipeptidyl peptidase IV
1.4.2 - Sus scrofa dipeptidyl peptidase IV
1.4.3 - Stenotrophomonas maltophilia dipeptidyl peptidase IV
1.4.4 - P. gingivalis prolyl tripeptidyl peptidase
1.5 - Structural Biology: X-ray Diffraction/MAD
1.6 - Aims
2.1 - Expression, purification & crystallisation
2.2 - X-Ray Diffraction
2.3 - Data Processing
2.3.1 - Solve/Resolve
2.3.2 - WinCoot
2.3.3 - CCP4
7.1 - Protein Sequence materials
7.1.1 - PgDPPIV sequence
7.1.2 - HsDPPIV sequence
7.1.3 - SsDPPIV sequence
7.1.4 - SmDPPIV sequence
7.1.5 - PgPTP sequence
7.1.6 - CLUSTAL W2 multiple sequence alignment
7.1.7 - ClustalW2 Sequence alignment Scores
1 - Introduction
1.1 - Periodontal disease & Porphyromonas gingivalis
Adult periodontitis is an inflammatory disease of the periodontium,
which maintains and supports the teeth in the oral cavity. Continual
destruction of the periodontium results in the eventual loss of the
connective tissue between the teeth and the gum. Adult periodontitis
occurs as a result of the presence of oral periodontopathogenic
anaerobes, the most common of which are Porphyromonas gingivalis
and Tannerella forsythia, which can be observed in afflicted patients
at frequencies of 85.7% and 60.7% respectively(1). These 2 pathogens, along with Treponema denticola,
form the so-called red complex of periodontopathogens.
Figure 1: 0.1% uranyl acetate stained electron micrograph of P. gingivalis strain ATCC 33277.
P. gingivalis is a gram-negative, black-pigmented, asaccharolytic bacterium (figure 1(2))
which metabolises oligopeptides as an alternative carbon/energy source to glucose and other
carbohydrates(3). It invades and replaces the facultative gram-positive bacteria in the host periodontium
by formation of a complex subgingival biofilm at the tooth's surface, referred to as a plaque, where its
presence often results in the destruction of the gingival connective tissues, including the alveolar bone(4)
and the periodontal ligament(5, 6). In addition to periodontal disease, P. gingivalis has also been
implicated in rheumatoid arthritis(7, 8), closely-associated peri-implantitis(9, 10), premature labour(11) and
atherosclerosis(12).
High proteolytic activity, as well as the release of inflammatory mediators by host leukocytes (proinflammatory cytokines, matrix metalloproteinases and prostanoids), are both key factors in the
development of periodontal diseases. The high proteolytic activity in the oral cavity occurs as a result of
a large battery of proteases and peptidases utilised by P. gingivalis, including enzymes with trypsin-like,
collagenolytic and glycylprolyl peptidase activity. A well characterised family of such enzymes is
the gingipains: Lys/Arg-specific cysteine proteinases that are responsible for 85% of the total extracellular
activity(13) and contribute to much of P. gingivalis' pathogenicity, including the evasion of the host
immune system, establishment of chronic inflammation, cellular adhesion, and vascular permeability and
bleeding(3).
1.2 - Dipeptidyl peptidase IV: the prolyl oligopeptidase family
Dipeptidyl peptidase IV from P. gingivalis (PgDPPIV) is another enzyme that has been implicated in the
virulence of P. gingivalis, in this case by mouse knock-out experiments(14). PgDPPIV was first
identified as an enzyme having a role only in peptide scavenging and amino acid degradation(15);
however, current research also implicates PgDPPIV in a number of proteolytic activities on peptides(16), as
well as being a major participant in the asaccharolytic growth of P. gingivalis(17). PgDPPIV is a 723 amino
acid residue integral membrane protein located in the outer periplasmic membrane, where it is exposed
to the extracellular environment. PgDPPIV belongs to the S9B subfamily of prolyl oligopeptidases (POPs),
which hydrolyse the C-terminal side of peptide bonds of linear oligopeptides (up to ~30 residues)
containing the amino acid residue Proline at their P1 sites. The POP family constitutes a great number of
enzymes, such as prolyl oligopeptidase (S9A), oligopeptidase B (S9A), dipeptidyl aminopeptidase X (S9B),
acylaminoacyl peptidase (S9C) and prolyl tripeptidyl peptidase (S9B). POP family enzymes are unrelated
to the classical trypsin and subtilisin families of peptidases; however, the mechanism of binding and
catalysis maintains a degree of similarity: catalysis is driven by a catalytic triad consisting of the
residues Serine (nucleophile), Aspartate and Histidine (acid/base), which are housed in a highly conserved
α/β hydrolase domain consisting of 8 β-sheets connected by α-helices.
All POP family proteins share a catalytic serine consensus sequence, GxSxGGφφ, (figure 2(18)) where x is
any residue and φ is a hydrophobic residue. The DPPIV subfamily enzyme motif is generally comprised of
GWSYGGφφ, however there are DPPIV enzymes which are exceptions. The mechanism of catalysis for
these serine peptidases involves the formation of a negatively charged covalent acyl transition state
intermediate, which is stabilised by 2 hydrogen bonds with residues in a nearby oxyanion binding
pocket. In chymotrypsin family enzymes these hydrogen donors are provided by the backbone NH
groups of the catalytic Serine and nearby Glycine, and in subtilisin family enzymes they are provided by
the backbone NH of the catalytic Serine and the side-chain NH2 amide of Asparagine. In POP family
enzymes these hydrogen donors are the backbone NH of the 2nd x residue in the POP Serine consensus
sequence, and the OH hydroxyl group of an upstream Tyrosine residue. In DPPIV family enzymes the 2nd x
position is itself occupied by a Tyrosine residue, whose backbone NH provides the hydrogen donor for
oxyanion stabilisation.
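As an illustrative sketch only (it was not part of the analysis in this report), the consensus sequence GxSxGGφφ described above can be written as a regular expression and used to scan a protein sequence. The set of residues treated as hydrophobic (φ), the function name and the example fragment are all assumptions made for illustration; the fragment is not a real protein sequence.

```python
import re

# Residues treated as hydrophobic (phi). This set is an assumption for
# illustration; the report does not define which residues count as phi.
HYDROPHOBIC = "AVLIMFWYC"

# G-x-S-x-G-G followed by two hydrophobic residues, where x is any residue.
POP_MOTIF = re.compile(rf"G.S.GG[{HYDROPHOBIC}]{{2}}")


def find_pop_motifs(sequence):
    """Return (1-based start position, matched text) for each non-overlapping hit."""
    return [(m.start() + 1, m.group()) for m in POP_MOTIF.finditer(sequence.upper())]


# Hypothetical fragment containing the DPPIV-type motif GWSYGG.
example_fragment = "MKTAGWSYGGFVTLE"
print(find_pop_motifs(example_fragment))
```

A stricter pattern such as `GWSYGG` could be substituted to search only for the motif typical of the DPPIV subfamily, bearing in mind that some DPPIV enzymes are exceptions.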
Figure 2: Partial protein sequence alignment of the C-terminal end of prolyl oligopeptidase (POP, subfamily S9A) and other S9 family
enzymes: rat liver acylaminoacyl-peptidase (ACPH, S9C subfamily), human protein 3p2l (later acylaminoacyl-peptidase, S9C subfamily), rat
liver dipeptidyl peptidase IV (DPPIV, S9B subfamily) and yeast dipeptidyl peptidase B (DAPB, S9B subfamily). Highlighted in the red box is the
serine consensus sequence GXSXGG present in all 5 sequences.
In addition to the α/β hydrolase domain, the POP family enzymes also contain a second domain, a
β-propeller, which acts as a gating mechanism by which substrate size and stereochemistry may be
selected for to provide substrate specificity. The β-propeller of the POP family is a primarily β-sheet
structure, which consists of 7 blades arranged in an ellipsoid, radial, open-velcro topology to form a
solvent-exposed opening, with each blade consisting of a stack of a varying number of anti-parallel
β-sheets, typically 4. DPPIV enzymes differ slightly in their β-propeller topology, containing an
additional 8th blade, with the 4th blade in the domain being comprised of 7 β-sheets: 5 β-sheets in the
primary blade and 2 β-sheets in the subdomain, the role of which is described in section 1.4.
PgDPPIV is specific for X-Pro and X-Ala dipeptides, which it cleaves from the N-terminus of
oligopeptide substrates. This has been evidenced by the cleavage of glycylprolyl dipeptides from type-1
collagen partially treated with collagenase from Clostridium histolyticum(15). Type-1 collagen is a helical,
polymeric protein (consisting of repeating tripeptide monomers of Gly-Pro-X) that intertwines with
other collagen molecules to form a superhelical homotrimeric quaternary structure. Specificity for X-Pro
and Pro-X cleavage sites is not apparent in other proteases and peptidases, thus providing a niche of
activity for the DPPIV family enzymes. DPPIVs have also been shown to act upon hydroxylated Prolines, a
post-translational modification that increases the stereoelectronic stability of the collagen triple helix in
eukaryotic organisms (X-HyPro-Gly)(19). More recent evidence however suggests that while PgDPPIV can
hydrolyse type-1 collagen, it does not directly hydrolyse homotrimeric type-1 collagen in vivo, but instead
promotes the activity of host-derived matrix metalloproteinase 2 (MMP-2) (gelatinase) and MMP-1
(collagenase) to aid in type-1 collagen hydrolysis, as well as having a role in mediating the
adhesion of P. gingivalis fimbriae to fibronectin(20).
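The N-terminal dipeptide specificity described above can be summarised in a few lines of Python. This is a simplified sketch under stated assumptions: peptides are given in one-letter code, only the identity of the second residue (Pro or Ala) is checked, and a minimum length of 3 residues is assumed so that something remains after cleavage. The function name and example peptides are hypothetical, and real substrate preferences (for example for hydroxylated Proline) are not modelled.

```python
def dppiv_cleave(peptide, p1_residues=("P", "A")):
    """Split off the N-terminal dipeptide if the 2nd residue is Pro or Ala.

    Returns (dipeptide, remainder), or None if the simplified rule is not met.
    """
    peptide = peptide.upper()
    # Assumption: at least 3 residues, so a remainder exists after cleavage.
    if len(peptide) < 3 or peptide[1] not in p1_residues:
        return None
    return peptide[:2], peptide[2:]


# Hypothetical collagen-like fragment (Gly-Pro-X repeats).
print(dppiv_cleave("GPAGPK"))  # X-Pro at the N-terminus
print(dppiv_cleave("GKAGPK"))  # 2nd residue is Lys, so the rule is not met
```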
1.3 - Treatment of Periodontal disease: inhibition of dipeptidyl peptidase IV
The observed role of DPPIV in the pathogenicity of P. gingivalis provides a potential drug target for the
treatment of adult periodontitis and other implicated diseases. However, there is very little
published data on the inhibition of PgDPPIV. Current research into the treatment of periodontal disease
has focused on other areas of virulence, such as the reliance on communication between
periodontopathogens, particularly the symbiotic relationships between P. gingivalis and the other red
complex species, and with Aggregatibacter actinomycetemcomitans in particular(21). There is also much
research describing the importance of fimbriae in plaque formation(22). Targeting P. gingivalis with
currently available antibiotics has thus far proven difficult, as both encapsulated and non-encapsulated
P. gingivalis are capable of invading host gingival fibroblasts, making internalised P. gingivalis resistant
to current antibiotics(23). Currently, vaccination in combination with passive immunisation and probiotic
therapy is the favoured approach to the treatment of periodontal disease(24). Mus
musculus monoclonal antibodies in particular have been put forward for potential use in passive
immunisation. One such antibody, MAb-Pg-DAP-1, was produced by Teshirogi et al in 2003(25) using
highly purified PgDPPIV as an immunogen. The developed MAb was also capable of inhibiting DPPIV
in gram-negative species such as Porphyromonas endodontalis and Prevotella loescheii, but was not able
to inhibit DPPIV present in the gram-positive species Streptococcus mutans and Actinomyces viscosus.
HsDPPIV present in blood serum was also not inhibited.
Only 2 papers to date have been published on direct PgDPPIV inhibition. Gilmore et al published a paper
in 2006 describing the kinetic effects of the binding of a biotinylated dipeptide proline diphenyl
phosphonate inhibitor, H2N-Glu(biotinyl-PEG)-ProP(OPh)2 (figure 3), on DPPIV-like serine
peptidases(26). For the porcine homolog (Sus scrofa DPPIV, SsDPPIV) it was shown to be an irreversible
inhibitor with a second-order rate constant (ki/Ki) for inhibition of 1.57 × 10³ M⁻¹ min⁻¹, which is
favourable in comparison to other P1 proline diphenyl phosphonate dipeptide inhibitors. Studies carried
out on P. gingivalis strain W83 did not elucidate the kinetics of inhibition of PgDPPIV; however, the
inhibitor did prove to be an efficient probe for the presence of DPPIVs in crude sonicates of W83 using
western blot analysis of the 80 kDa protein. This was to their knowledge the first DPPIV active site to be
labelled in this manner, and may have applications beyond the treatment of periodontal disease, in other
serine-protease directed diseases such as diabetes, mitochondrial disease and rheumatoid arthritis.
Figure 3: Chemical structure of the biotinylated dipeptide proline diphenyl phosphonate inhibitor,
H2N-Glu(biotinyl-PEG)-ProP(OPh)2.
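To give a feel for what a second-order rate constant of this size means, the short Python sketch below converts it into an approximate inactivation half-life. It assumes the standard simplification for irreversible inhibitors that, when the inhibitor concentration is well below Ki, the pseudo-first-order inactivation rate is approximately (ki/Ki) × [I]. The 10 µM inhibitor concentration and the function name are hypothetical choices for illustration and are not taken from Gilmore et al.

```python
import math

# ki/Ki for SsDPPIV as quoted in the text, in M^-1 min^-1.
K_SECOND_ORDER = 1.57e3


def inactivation_half_life(inhibitor_molar, k_second_order=K_SECOND_ORDER):
    """Approximate half-life (min) of active enzyme, assuming [I] << Ki."""
    k_obs = k_second_order * inhibitor_molar  # pseudo-first-order rate, min^-1
    return math.log(2) / k_obs


# Hypothetical inhibitor concentration of 10 micromolar.
half_life = inactivation_half_life(10e-6)
print(f"Approximate half-life: {half_life:.1f} min")
```

Under these assumptions the half-life at 10 µM is roughly 44 minutes; halving the inhibitor concentration doubles the half-life.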
The other paper, also published in 2006, by Bodet et al describes the effects of a non-dialysable material
(NDM) extract from cranberry juice on the proteolytic activities of the red complex bacteria P. gingivalis,
T. forsythia and T. denticola(27). They characterised the hydrolytic activity of PgDPPIV on synthetic
chromogenic peptides, as well as the total P. gingivalis proteolytic activity on type-1 collagen and
transferrin. Their results showed that this NDM reduced the activity of PgDPPIV by 50% at a
concentration of 150 µg mL⁻¹ (figure 4); however, compared to other enzymes such as Arg-gingipain and
Lys-gingipain the inhibition of DPPIV is much weaker, with Arg-gingipain and Lys-gingipain being inhibited
at concentrations of 75 µg mL⁻¹ and 25 µg mL⁻¹ respectively. Combined with 30% inhibition of total
collagenase activity at 50 µg mL⁻¹ and 95% inhibition of total transferrin degradation activity at
150 µg mL⁻¹, there is still an implication that this NDM may be effective in the treatment of adult
periodontitis.
Figure 4: Effect of the cranberry NDM fraction on DPPIV activity of P. gingivalis. The degradation obtained
in the absence of NDM was given a value of 100%. *P < 0.05 between NDM of various concentrations and
control without NDM.
It is evident that thus far screening techniques have not identified any potential inhibitors of PgDPPIV.
Direct structure-based drug design, however, may identify potential inhibitors. This technique is utilised
to develop novel inhibitors tailored specifically for their target, ensuring selectivity for
the target, and has proven to be a much more efficient and cost-effective development process
compared to high-throughput screening. However, to design inhibitors in such a way, one first needs to
have an understanding of the structural features that can be attributed to the function of the target
protein: the so-called structure-function relationship. For enzymes such as PgDPPIV, this would require
an understanding of the structural events occurring at the active site, particularly of the residues making
up the binding pockets of the natural substrates. To ascertain this knowledge, biophysical techniques
are commonly used to determine the 3-dimensional structure of proteins. Of the structures currently
available in the Protein Data Bank (PDB), 88.2% were solved using X-ray crystallography, while 10.9%
were solved using nuclear magnetic resonance (NMR). If there is sufficient similarity between structures,
homologs of the protein in question can be used to aid in the design of such structure-specific inhibitors.
The inhibition of Homo sapiens DPPIV (HsDPPIV) is well characterised in the treatment of human
diseases; DPPIV inhibitors, also referred to as gliptins (figure 5), are currently used in the treatment of
type II diabetes mellitus. Inhibitors of DPPIVs have been sought after since the late 80s and early
90s; however, the first clinically approved gliptin, sitagliptin (previously MK-0431, figure 5A(28)), was only
FDA approved in 2006. Mechanistically it is a competitive inhibitor of HsDPPIV, preventing the degradation
of the gastrointestinal incretins glucagon-like peptide-1 and gastric inhibitory polypeptide, which are
released in response to meal ingestion in the gastrointestinal tract. This inhibition results in an increase
in extracellular insulin and a decrease in the production of glucagon. Currently there are 7 gliptins
available on the market, with the most recent, alogliptin (figure 5G(29)), being FDA approved in 2013. The
structure-activity relationship of HsDPPIV and inhibitors such as vildagliptin (figure 5B) is discussed in
section 1.4.1. Since their inception, direct drug design analysis has been utilised to optimise inhibitor
selection for HsDPPIV(30).
Figure 5: Chemical structures of 7 gliptins either approved or currently in development/clinical trials. These are (in chronological order of
published research): A) sitagliptin, B) vildagliptin, C) saxagliptin, D) linagliptin, E) dutogliptin, F) gemigliptin and G) alogliptin.
1.4 - Homologs of P. gingivalis dipeptidyl peptidase IV
To date, no structure of PgDPPIV has appeared in the literature. However, a combination of
sequence alignment (based on Clustal W2 alignment, see 7.1, appendix) with reading of the currently
available literature has identified potential homolog candidates of PgDPPIV: HsDPPIV, SsDPPIV,
Stenotrophomonas maltophilia DPPIV (SmDPPIV) and P. gingivalis prolyl tripeptidyl peptidase (PgPTP).
Prior attempts at crystallographic study of PgDPPIV have had no success using molecular replacement
techniques to resolve the structure of PgDPPIV using such homologs(31); however, structural
comparisons in combination with the interpretation of conserved residues may elucidate whether the
structure-activity relationship of PgDPPIV is conserved, and perhaps aid us in the pursuit of new
inhibitors of PgDPPIV.
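Pairwise percentages of the kind reported for these homologs can be illustrated with a minimal Python sketch that computes percent identity from two already-aligned sequences. This uses one common definition (identical positions divided by positions where neither sequence has a gap); it is an assumption that this matches how the Clustal W2 scores in the appendix were calculated. The function name and the short aligned fragments are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over aligned columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    identical = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap or res_b == gap:
            continue  # gap columns are not counted
        compared += 1
        if res_a == res_b:
            identical += 1
    return 100.0 * identical / compared if compared else 0.0


# Hypothetical aligned fragments (not taken from the appendix alignment).
print(percent_identity("GWSYGG-FV", "GWSYGGAFI"))
```

For the two fragments shown, 7 of the 8 ungapped columns are identical, giving 87.5%.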
1.4.1 - Homo sapiens dipeptidyl peptidase IV
HsDPPIV (shown in figure 6) shares 25.45% of its sequence with PgDPPIV and is the most extensively
studied homolog of PgDPPIV, particularly in the study of DPPIV family enzyme inhibitors. HsDPPIV has a
role in amino acid degradation and peptide scavenging(32); however, it is also implicated in glucose
homeostasis(33), T-cell signalling(34), chemotaxis(35) and carcinogenesis(36). Confirmed protein
interactions include interactions with adenosine deaminase(37), fibronectin(38), collagen(39), HIV gp120
protein(40), chemokine receptor CXCR4(41) and tyrosine phosphatase CD45(42). Diseases which have been
associated with the action of HsDPPIV include type 2 diabetes mellitus(43), obesity(44), tumour
growth(45), and HIV infection(46).
Figure 6: Cartoon representation of the dimeric configuration of HsDPPIV from an A) front perspective and
B) top-down perspective. Rendering was accomplished with Pymol, using crystal structure 1nu8 from the PDB.
The structure of HsDPPIV used in this report was that of HsDPPIV crystallised with the ligand diprotin A
(Ile-Pro-Ile), solved at 2.5 Å resolution by Thoma et al(47), with more current information sourced from
the 2006 review of POP family peptidases by Rea D & Fülöp V(48). HsDPPIV is a single polypeptide
composed of 766 amino acid residues, with a polypeptide pair associating as a homodimer (figure 6), the
formation of which has been linked with the regulation of catalytic activity(49); however, dimerisation is
not required to achieve catalytic activity(50). The 2 conserved domains, the α/β hydrolase domain at the
C-terminus and the β-propeller domain at the N-terminus, are shown in figures 7A and 7B
respectively. Each monomer contains 9 N-glycosylated Asparagine residues and 5 disulphide bridges: 4
stabilising the β-propeller domain and 1 housed in the α/β hydrolase domain.
Figure 7: Cartoon representations of A) the α/β hydrolase domain of HsDPPIV, showing the 8 β-sheet topology of the domain and B) the β-propeller
domain of HsDPPIV, showing the 8 bladed topology and pore of the domain. Rendering was accomplished with Pymol, using
crystal structure 1nu8 from the PDB.
The active site sits in the interface between the α/β hydrolase domain (Gln508-Pro766) and the
β-propeller domain (Arg54-Asn497), and has 2 solvent-exposed entrances: a side entrance (Figure 8A) and
an entrance through the β-propeller (Figure 8B); however, the mode of entry still remains
controversial. The opening to the active site through the β-propeller has a length of 14
Å and a width of 7 Å. These tunnel dimensions imply that the tunnel allows for passage of an extended
peptide or a hairpin loop, but not for a folded α-helix. The blades of the β-propeller are divided into 2
subdomains: blades II-V and blades I & VI-VIII. Bending of blade I and the subtle bending of blades II–IV
result in the formation of the slightly larger side entrance. Both entrances are large enough to negate
the need for conformational changes. The membrane binding domain, found only in the integral
membrane form of DPPIV, is located at the N-terminus. It comprises a short cytoplasmic tail (Met1-Arg6)
and a single transmembrane helix (Val7-Leu28). This domain does not simply anchor the protein, but
also contributes to the formation of the quaternary structure(51).
Figure 9 shows the motifs involved in dimerisation of HsDPPIV. The subdomain which extends from
blade IV of the β-propeller domain forms an anti-parallel β-sheet structure that is the primary
dimerisation motif for HsDPPIV, and is involved in the formation of a salt bridge
with the blade IV subdomain of the partner monomer. Residues Phe713–Thr736 from the α/β hydrolase
domain comprise a loop that also plays a role in the dimerisation of HsDPPIV. In the monomeric form
these residues are exposed, resulting in distortion of the catalytic triad(52). Rasmussen et al
hypothesise that this loop forms a lid-type structure that covers the side opening in the monomeric
form(53). Residues Arg658–Tyr661 form a short α-helix which also contributes to the dimer interface.
Figure 8: Cartoon representation of HsDPPIV, showing A) the solvent-exposed side entrance and B) the solvent-exposed β-propeller entrance
to the active site. The α/β hydrolase domain is coloured in orange & the β-propeller domain is coloured in blue. The substrate shown is
diprotin A (Ile-Pro-Ile) in a tetrahedral intermediate complex with Ser630. Rendering was accomplished with Pymol, using crystal structure
1nu8 from the PDB.
Figure 9: Cartoon representation of HsDPPIV homodimer with highlighted residues involved in dimerisation: Residues Arg658–Tyr661 (red)
and Phe713–Thr736 (blue) from the α/β hydrolase domain and the anti-parallel β-sheet stack subdomain (yellow) protruding from the 2nd
β-sheet of blade 4 in the β-propeller domain. Figure 9A is from a front perspective, and figure 9B is a top-down perspective. Rendering was
accomplished with Pymol, using crystal structure 1nu8 from the PDB.
Figure 10: Stick/cartoon diagram of the HsDPPIV active site. Shown are the ligand diprotin A (Ile-Pro-Ile) in a tetrahedral
intermediate complex with Ser630 (white), S1 hydrophobic residues Val711, Val656, Tyr662, Tyr666 & Trp659 (magenta), oxyanion
stabilising residues Tyr547 & Tyr631 (green), catalytic triad Ser630, Asp708 & His740 (orange) and Glu205-Glu206 contributed from the
β-propeller domain (cyan). Rendering was accomplished with Pymol, using crystal structure 1nu8 from the PDB.
Figure 10 shows the active site of HsDPPIV with the ligand Ile-Pro-Ile covalently bound. The residues that
form the active site are contributed mostly by the α/β hydrolase domain. The catalytic triad is
comprised of residues Ser630, Asp708 & His740. In the state shown above, His740 is protonated, with
the cationic charge being stabilised by hydrogen bonding with the side chain of Asp708. The residues
responsible for the stabilisation of the negatively charged oxyanion group of the acyl tetrahedral
intermediate formed during catalysis are Tyr547 and Tyr631. The S1 binding site consists of the
hydrophobic residues Val711, Val656, Tyr662, Tyr666, Trp659 and Tyr631; Tyr662 & Tyr666 stack
adjacent to the P1 residue, with the P2 and P1' side chains facing into the cavity of the active site.
Residues Glu205 and Glu206, provided by blade IV of the β-propeller domain, interact with the N-terminus
of peptide substrates, indicating the requirement for a positively charged substrate N-terminus.
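For readers who wish to highlight these residues themselves, the groupings above can be collected into a small lookup table and turned into selection strings. The sketch below is plain Python and does not call PyMOL; it only builds text in the `resi 630+708+740` style that PyMOL accepts as a selection expression. The dictionary keys, function name and optional chain argument are hypothetical conveniences, and this is not necessarily how the figures in this report were produced.

```python
# Residue numbers for HsDPPIV as listed in the text above.
ACTIVE_SITE = {
    "catalytic_triad": [630, 708, 740],
    "oxyanion_stabilising": [547, 631],
    "s1_pocket": [711, 656, 662, 666, 659, 631],
    "n_terminus_binding": [205, 206],
}


def selection_string(residues, chain=None):
    """Build a 'resi a+b+c' style selection, optionally restricted to one chain."""
    resi = "+".join(str(number) for number in sorted(set(residues)))
    selection = f"resi {resi}"
    if chain:
        selection = f"chain {chain} and {selection}"
    return selection


for role, residues in ACTIVE_SITE.items():
    print(role, "->", selection_string(residues))
```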
As can be seen from the gliptins shown in figure 5, the compounds which can inhibit HsDPPIV activity
are highly variable in their appearance; however, all the current gliptin inhibitors contain either a
cyanopyrrolidine functional group (a 5-membered pyrrolidine ring with a nitrile moiety, also known as
pyrrolidine-2-nitrile), such as vildagliptin and saxagliptin, or a modified cyanopyrrolidine ring,
or a pyrrolidine ring with the nitrile present in cyanopyrrolidine replaced with hydrogen, fluoro,
acetylene, or methanol functional groups(54). The presence of a nitrile group enhances the potency of
such drugs by inducing partial transient covalent trapping of the Ser630 hydroxyl of HsDPPIV by the
nitrile, as well as hydrogen bonding with nearby Tyr547(55).
Studies on saxagliptin(56) indicated a 2-step mechanism in which an initial encounter complex is formed,
followed by formation of a covalent intermediate. Ionisation of the Asp708-His740 catalytic pair enhances
the nucleophilicity of Ser630, and thus its ability to be targeted for covalent addition. Hydrogen bonding
of the P2 terminal amine of the inhibitor to residues Glu205/Glu206, contributed to the enzyme active site
by the β-propeller, involves the formation of very short, strong hydrogen bonds, as characterised using
1H NMR. Positioning of the pyrrolidine ring between the stacked S1 binding residues Tyr662 and Tyr666 also
provides additional binding energy(53). Generally speaking, inhibitors lacking the pyrrolidine-2-nitrile ring
have a 10-fold reduction in their potency. Weaker inhibitors based on Valine pyrrolidide showed hydrogen
bonding of the carbonyl group of the inhibitor to residues Asp124 and Asn710. Experiments on
non-cyanopyrrolidines revealed that alteration of the steric constraints of the pyrrolidine ring in the S1
pocket greatly reduced drug potency; for example, the introduction of a methyl group effectively destroyed
all inhibitor potency(54). Increasing the steric bulk at the S2 binding region, however, showed an increase
in inhibition.
The observations described above make it clear that the structure and positioning of residues in
the active site of HsDPPIV play a clear role in its function: the aforementioned structure-function
relationship. As mentioned previously, structural characteristics of the HsDPPIV active site have seen use
in optimisation of the selection of inhibitors that bind the active site. Such knowledge can provide
insight into the development of similar inhibitors of PgDPPIV, provided there is great enough homology of
the active site between the two homologs. If there is strong homology in the active site residues of PgDPPIV,
then it may still be possible to identify other regions that are unique to PgDPPIV, such as the hydrogen
bonding of the Valine pyrrolidide inhibitor described above.
1.4.2 - Sus scrofa dipeptidyl peptidase IV
The 766-residue SsDPPIV shows high sequence similarity to its eukaryotic ortholog HsDPPIV,
bearing 88.38% sequence similarity. With PgDPPIV it shares 25.45% similarity, the same as PgDPPIV
does with HsDPPIV. High sequence homology to HsDPPIV may imply conserved function; however, the
structure of SsDPPIV further elucidated a mechanism which has thus far not been observed in HsDPPIV.
The 1.8 Å resolution structure of native SsDPPIV, which was solved by Engel in 2003(57), revealed that
unlike HsDPPIV, which is observed in crystals as a homodimer, SsDPPIV crystallises
as both a homodimer and a homotetramer (figure 11), with tetramerisation believed to be involved in
cell-cell contacts at the cell surface. The tetramerisation of 2 homodimers is brought about via
interactions between the glycosylated blade IV of each β-propeller domain, which comprises hydrophilic
residues Asn279-Gln286. The residues from each dimer form an anti-parallel β-sheet, with further
contributions from blade V. From the α/β hydrolase domain, helix Met746–Ser764 and the loop
comprising residues Phe713-Thr736 (in particular helix Gln714–Asp725 and strand Asp729–Thr736)
constitute the central dimerisation motif, as in HsDPPIV. The 2nd β-sheet from the
blade IV subdomain also contributes to dimerisation via stabilisation of these loops, as in HsDPPIV.
The N-terminal β-propeller domain comprises residues Arg54-Asn497, and the α/β hydrolase domain
comprises residues Gln508-Pro766. The I & VI-VIII, II-V asymmetric 8-bladed topology of the β-propeller
domain is consistent with that found in HsDPPIV, with the first blade (Phe53–Tyr58) and last blade
(Glu499–Met503) forming a non-covalent linkage. Again there are 2 solvent-exposed openings to the
active site: through the β-propeller and through an exposed side entrance generated by the kinked
arrangement of blade I and blades II-IV. The diameter of the β-propeller opening is 9 Å and 15 Å from blade IV
to VIII and from blade II to VI respectively, with the tunnel widening to 15 Å and 25 Å towards the
interface. The distance from the surface of the opening to the active site is 37 Å. The side entrance is
oval with dimensions of 15 and 22 Å, and measures 20 Å from the surface to the active site. The blades
are stabilized by numerous cysteine disulfide bonds, which comprise all disulfide bonds in the
monomer aside from Cys649-Cys762, which cross-links the C-terminal helix Met746-Ser764 with the
β-sheet Cys649-Val652, stabilizing the α/β hydrolase domain, as in HsDPPIV. Similarly, all glycosylation sites
are present in the β-propeller domain apart from Asn685, half of which are orientated away from the α/β
hydrolase domain. Only Asn279 on blade IV is post-translationally modified.
The active sites of HsDPPIV and SsDPPIV are also highly conserved, being almost identical. The catalytic
triad comprises residues Ser630, Asp708 and His740. The hydrogen donors of the oxyanion stabilization
pocket are Tyr631 and Tyr547. The pyrrolidine ring is accommodated by a hydrophobic pocket formed
by side chains of Tyr666, Tyr662, Val711, Val656, and Trp659. In crystallization with an inhibitor, the
P2-carbonyl oxygen sits in an electrostatic pocket formed by the side chains of Arg125 (positioned on the
hairpin loop between strands 2 and 3 of blade II) and Asn710. Glu205 and Glu206, positioned on a short
helical insertion within strand 1 of the β-propeller blade IV, interact with the free amino terminus of the
P2 residue.
Figure 11: Cartoon representation of the SsDPPIV homotetramer, taking on a quaternary conformation as a dimer of dimers. Rendering was
accomplished with Pymol, using crystal structure 2aj8 from the PDB.
1.4.3 - Stenotrophomonas maltophilia dipeptidyl peptidase IV
To date, the 741-residue SmDPPIV (figure 12) is the only bacterial dipeptidyl peptidase IV for which the
structure has been solved(58). Sequence alignment with Clustal W2 indicates that PgDPPIV and SmDPPIV
share 27.66% of their sequence, a higher similarity than that observed with HsDPPIV and SsDPPIV.
However, this similarity is much lower than would perhaps be expected between bacterial
species, compared to the homology demonstrated between the 2 eukaryotic orthologs. The GWSYGGφφ
sequence found in the conserved DPPIV motif differs in SmDPPIV, where the sequence is GWSNGGYM,
with the introduction of asparagine predicted to participate in substrate recognition of
4-hydroxyproline, as evidenced by an N611Y mutation resulting in a decrease to 30.6% of the wild type
hydrolytic activity. Interestingly, asparagine is also found in the same position in the GxSxGG consensus
of POP, for which there are no observed bacterial homologs.
The 2.8 Å resolution structure of SmDPPIV is again very highly conserved, with the key topological
features of both the α/β hydrolase domain (Gln484-Pro741) and the β-propeller domain (Leu39-Ala483)
being very similar to those found in HsDPPIV and SsDPPIV; however, it contains an α-helix (containing a
Glu206/Glu207 di-glutamate motif) situated between the 1st and 2nd sheets of blade IV that is not
observed in the eukaryotic homologs. Additionally, the 2nd and 3rd sheets of blade II are shorter than those
found in SsDPPIV, generating a larger side entrance. Arg125, which is believed to take part in substrate
recognition, is located in between these 2 sheets, but is displaced from the active site. Finally, the
β-propeller is also displaced relative to its position from the α/β hydrolase domain in other DPPIV
enzymes, following a rotation of 10°.
The catalytic triad comprises residues Ser610, Asp685, and His717, with the active site sitting in the
domain interface. Residues Tyr524 and Asn611 comprise the oxyanion stabilising residues. Val636,
Trp639, Tyr642, Tyr646, and Val688 form the hydrophobic binding pocket. As mentioned previously
residues Glu206/Glu207 provide the N-terminus substrate binding pocket. The residues found in the
active site of SmDPPIV are shifted from those residues in the active site of SsDPPIV, leading to a
substantial increase in the size of the hydrophobic binding pocket of SmDPPIV.
Figure 12: Cartoon representation of the SmDPPIV homodimer. Rendering was accomplished with Pymol, using crystal structure 2ecf from
the PDB.
1.4.4 - P. gingivalis prolyl tripeptidyl peptidase
In 1999 Banbula et al were able to deduce the presence of another oligopeptidase that took part in the
growth and host-evasion mechanisms of P. gingivalis: prolyl tripeptidyl peptidase (PgPTP)(59), also a
member of the S9B subfamily. Failure to inactivate with cysteine and metalloproteinase inhibitors,
combined with observed inactivation with diisopropyl fluorophosphate (DFP), implicated its role and
mechanism as a serine peptidase. Clustal W2 sequence alignment of the 732-residue PgPTP revealed the
presence of the DPPIV family consensus sequence GWSYGG, and a total sequence similarity of 25.86%
with PgDPPIV. The structure of PgPTP was resolved at 2.1 Å by Kiyoshi Ito et al in 2006 (figure 13)(60).
Again overall topology of both the α/β hydrolase domain (Asn471–Leu732) and the β-propeller (Glu45–
Lys470) is highly conserved. Like SmDPPIV, there are differences in the β-propeller domain compared to
the eukaryotic homologues; however, some of these differences can be attributed to this enzyme's
function as a tripeptidyl peptidase, and not a dipeptidyl peptidase, such as the widening of the side
entrance to facilitate larger substrates. The subunit interface is dominated by hydrophobic interactions
and contains four salt bridges between Lys232 and Glu245 (of blade IV subdomain Lys232–Thr256),
and between Arg669 and Asp730 (Gly688–His731 loop), as seen in the homologs. Perhaps paradoxically,
PgPTP bears little structural homology to other tripeptidyl peptidases.
The catalytic triad consists of residues Ser603, Asp678 and His710. The hydrophobic pocket consists of
residues Val629, Trp632, Tyr635, Tyr639 and Val680, with an additional residue, Val681, also
participating in the hydrophobic binding of the substrate. Tyr518 and Tyr604 comprise the oxyanion
stabilising residues, much as they do in the dipeptidyl peptidases described thus far. The glutamate
motif is again present as residue Glu205; however, the hydrogen bonding role of Glu206, which is not
present in PgPTP, is replaced by Glu636.
Figure 13: Cartoon representation of the PgPTP homodimer. Rendering was accomplished with Pymol, from crystal structure 2d5l from the
PDB.
1.5 - Structural Biology: X-ray Diffraction/MAD
X-ray diffraction is used to determine the electron density of the molecules that constitute the crystal,
which can then be interpreted to fit an atomic model, which in current times is done computationally.
X-rays are fired at a fixed crystal and interact with the electron cloud surrounding the
molecules within the crystal. The x-rays are diffracted off the electrons, and are detected using an x-ray
detector, typically an imaging plate, which measures the intensities (I) and positions of the diffracted
x-rays. By crystallising a molecule of interest, a lattice is formed composed of repeating units (referred to
as unit cells), where molecules within the unit cell which have no internal planes of symmetry are
referred to as asymmetric units. Crystallisation of molecules permits amplification of the diffracted
x-ray, as the intensity from a single molecule is too small to be detected. The electron density is defined by
the equation

$$\rho(x, y, z) = \frac{1}{V} \sum_{h} \sum_{k} \sum_{\ell} |F(h, k, \ell)| \, e^{-2\pi i (hx + ky + \ell z) + i\alpha(h, k, \ell)}$$

where $x, y, z$ are the real-space coordinates (as fractions of the unit cell axes), h, k, ℓ are the Miller
indices (integers describing the orientations of a plane or set
of planes within a lattice in relation to the unit cell), V is the volume of the unit cell,
$\alpha(h, k, \ell)$ is the phase of
the incident x-rays (a mathematical function describing the fraction of a sinusoidal wave cycle that has
elapsed relative to the wave cycle origin), and F is a structure factor (a mathematical description of the
ability of an atom to scatter incident x-rays), where $|F(h, k, \ell)|^2 \propto I(h, k, \ell)$.
Determination of phase is important because phase cannot be detected by any currently available
detectors. Multi-wavelength anomalous dispersion is one such technique used to determine phases,
which makes use of an anomalous scatterer. Anomalous scattering is scattering that results in a
change of the phase of the incident radiation as a result of inelastic collision with the scatterer. This results in an
electronic transition of a low energy electron to a higher electron shell. An electromagnetic wave
causes an electronic transition when the energy of the incident photon at a given wavelength, λ,
is equal to the energy difference between the electron shells. The incorporation of selenium
into methionine residues provides a heavy atom which acts as an anomalous scatterer. For selenium this
usually occurs as a transition from the K shell (1S subshell) to the 5S subshell, which occurs at a photon
wavelength of 0.9795 Å, corresponding to the x-ray region of the electromagnetic spectrum. The energy of
this transition corresponds to approximately 12.7 keV, which is described as the K shell absorption edge
of selenium. In MAD the change in phase when scattering is observed at multiple wavelengths allows for the
determination of the phases of the incident x-rays of the crystal.
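The correspondence between the quoted wavelength (0.9795 Å) and energy (approximately 12.7 keV) of the selenium K edge follows from the Planck relation E = hc/λ. The following is a minimal Python sketch of that conversion; the function and constant names are illustrative only and are not part of any crystallographic software used in this project.

```python
# Convert between X-ray wavelength (in Å) and photon energy (in keV) via E = hc / λ.
# hc is expressed in keV·Å, derived from the CODATA values of h, c and the electron volt.
HC_KEV_ANGSTROM = 12.398419843


def wavelength_to_energy_kev(wavelength_angstrom: float) -> float:
    """Return the photon energy in keV for a wavelength given in Å."""
    if wavelength_angstrom <= 0:
        raise ValueError("wavelength must be positive")
    return HC_KEV_ANGSTROM / wavelength_angstrom


def energy_kev_to_wavelength(energy_kev: float) -> float:
    """Return the wavelength in Å for a photon energy given in keV."""
    if energy_kev <= 0:
        raise ValueError("energy must be positive")
    return HC_KEV_ANGSTROM / energy_kev


# Wavelength quoted in the text for the selenium K absorption edge.
se_k_edge_kev = wavelength_to_energy_kev(0.9795)
print(f"Se K edge: {se_k_edge_kev:.2f} keV")
```

The same helper can be used to estimate the energies of the other wavelengths in a MAD experiment once they are known.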
1.6 - Aims
The primary objective of this project was to build, refine and characterise an atomic model of the
three-dimensional structure of PgDPPIV, using x-ray diffraction data collected from PgDPPIV crystals.
Along with a standard synchrotron x-ray
dataset, selenomethionine-derived crystals would be utilised with multi-wavelength anomalous
dispersion (MAD) to resolve the crystallographic phases of the crystal. The determination of the
three-dimensional structure, how it relates to the enzymatic activity of PgDPPIV, and structural comparisons
with the structurally-available homologs of PgDPPIV (HsDPPIV, SsDPPIV, SmDPPIV, PgPTP) will hopefully
provide some new insight into the design of selective PgDPPIV inhibitors, providing potential treatment
for adult periodontitis and other PgDPPIV-associated disease states.
2 - Materials & Methods
2.1 - Expression, purification & crystallisation
Preliminary expression, purification and crystallisation of PgDPPIV was carried out by Rea D et al(31).
A plasmid vector containing PgDPPIV (Thr21-Lys723) and an N-terminus polyHistidine-tag was constructed
and used to transform Escherichia coli strain BL21 (DE3). After extraction, PgDPPIV was purified using
Nickel affinity chromatography. Selenomethionine-derived crystals were prepared by transformation of
auxotrophic E. coli strain B834 (DE3) and supplementation with proteinogenic amino acids (excluding
methionine) and Selenomethionine, followed by purification as above. Both were crystallised using the
hanging drop vapour diffusion technique in 40% 2-methyl-2,4-pentanediol (MPD) and 100 mM Tris-HCl
pH 7.5 buffer, with subsequent microseeding at 35-40% MPD and 100 mM Tris-HCl pH 8.0 several months
after initial crystallisation.
Figure 14: Visible light photograph of PgDPPIV crystals diffracting to 2.7 Å resolution, with the
largest dimension of 0.3 mm.
2.2 - X-Ray Diffraction
Two x-ray diffraction datasets were collected ahead of the start of the project: standard synchrotron
radiation diffraction of the hanging drop vapour diffusion crystals, which yielded an electron density
with a resolution of 2.5 Å, and a 3-wavelength MAD dataset from the selenomethionine-derived crystals.
Tuneable wavelengths were selected near the absorption edge of selenium. Initial synchrotron data
collection and processing statistics were not available to be listed in this section; however, table 1
contains partial processing statistics acquired from the data provided for this project. This includes the
X-ray diffraction pattern for PgDPPIV shown in figure 15, which was generated from the provided
PgDPPIV mtz file, which stores the reflection data.
Figure 15: FPK X-ray diffraction pattern of PgDPPIV, acquired from the provided mtz reflections.
2.3 - Data Processing
2.3.1 - Solve/Resolve
The Solve/Resolve suite was used for preliminary solution of the crystallographic phases of the PgDPPIV
crystal using the MAD dataset, as well as being used for the initial automated model building of the
PgDPPIV asymmetric unit (61-67). This was done in advance of starting work on the project.
2.3.2 - WinCoot
Refinement and manipulation of the atomic coordinates of PgDPPIV was carried out using the WinCoot
(Crystallographic Object-Oriented Toolkit) software, on the Windows operating system(68, 69). For the sake
of practicality, only a single monomer of PgDPPIV was refined, with later superpositioning being carried
out on other monomers present in the asymmetric unit. For each residue, electron density fitting,
geometric restraint fitting, rotamer fitting, Asn/Gln B factor analysis and intermolecular bonding
environment analysis were assessed, as well as H2O positioning and hydrogen bonding environments.
2.3.3 - CCP4
The CCP4 suite(70) was used for much of the refinement, which was carried out in conjunction with model
building. Programs contained within the CCP4 suite were accessed using the Windows ccp4i GUI(71).
Software packages that were used in the refinement primarily include Refmac5(72, 73) for automated
refinement, rampage(74) for stereochemical verification of the built model (via construction of
Ramachandran plots: torsion angle φ (about the N-Cα bond) against torsion angle ψ (about the Cα-C bond)),
superpose(75) to superposition the secondary structure of the refined monomer of PgDPPIV onto other
monomers present in the asymmetric unit prior to Refmac5 refinement, and Baverage(70), which was used
for the B factor analysis of the constructed model. Additionally, sfcheck(76) was used for evaluating other
structure-factor parameters.
Translation/Libration/Screw-motion (TLS) refinement protocols(77) were also utilised in Refmac5 refinement of
PgDPPIV. TLS is defined as a mathematical method used to predict the local displacement of atoms that
are part of a rigid body, i.e. domains, which displace around a mean position. TLS bodies were
determined by input of the restrained refinement model of PgDPPIV to the TLS Motion Determination Server.
The asymmetric unit of the PgDPPIV crystal consists of 8 molecules; model building of the electron density
has shown that the PgDPPIV crystal comprises 8 monomeric PgDPPIV chains (A-H), which are arranged
as a homotetramer of homodimers. The unit cell symmetry of the PgDPPIV crystal is characterised by a
P21 space group, defining a monoclinic crystal system with a single axis of crystallographic symmetry
relating an asymmetric unit at coordinates (x, y, z) to an asymmetric unit at the coordinates
(−x, y + ½, −z) as a result of a 180° screw turn (figure 16). The asymmetric unit of PgDPPIV is
shown in figure 17(78), with a visual representation of the symmetry of the asymmetric unit in the same cell
shown in figure 18. The unit cell parameters for a, b, c (Å) and α, β, γ (°) are 117.0, 112.9, 310.9, 90.0, 95.0
and 90.0 respectively.
Figure 16: Diagrammatic representation of the P21 monoclinic space group, showing the 180° screw turn
symmetry of the asymmetric unit at (x, y, z) and its symmetry mate at (−x, y + ½, −z).
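As an illustration of the P21 symmetry and the unit cell parameters quoted above, the following minimal Python sketch applies the standard P21 screw operation for a monoclinic cell with the unique axis along b, which maps fractional coordinates (x, y, z) to (−x, y + ½, −z), and computes the monoclinic cell volume V = abc·sin β. The choice of b as the unique axis is an assumption based on β being the only non-90° angle; the function names are illustrative and do not come from any of the software used in this project.

```python
import math


def p21_symmetry_mate(x: float, y: float, z: float) -> tuple:
    """Apply the P21 screw operation (unique axis b, assumed) to fractional coordinates.

    A 180° rotation about b inverts x and z; the screw component adds a
    translation of half a cell edge along b. Results are wrapped into [0, 1).
    """
    return ((-x) % 1.0, (y + 0.5) % 1.0, (-z) % 1.0)


def monoclinic_cell_volume(a: float, b: float, c: float, beta_deg: float) -> float:
    """Volume of a monoclinic unit cell (α = γ = 90°): V = a·b·c·sin(β)."""
    return a * b * c * math.sin(math.radians(beta_deg))


# Unit cell parameters quoted in the text (Å and degrees).
volume = monoclinic_cell_volume(117.0, 112.9, 310.9, 95.0)
print(f"Unit cell volume: {volume:.0f} Å^3")

# Arbitrary example position; applying the operation twice returns the start point.
mate = p21_symmetry_mate(0.10, 0.20, 0.30)
back = p21_symmetry_mate(*mate)
print(mate, back)
```

With the parameters above the volume comes out at roughly 4.1 × 10^6 Å³, and because the operation is a two-fold screw, applying it twice returns the original position (modulo a full cell translation along b).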
Table 1 shows the refinement statistics for 3 output models of PgDPPIV: the unrefined model, the
refinement model using restrained fit without TLS, and the refinement model using restrained fit with TLS.
Refinement without TLS was carried out at 30 cycles of restrained fit refinement, while refinement with
TLS was carried out with 20 cycles of TLS refinement, followed by 20 cycles of restrained fit refinement.
For TLS refinement, 3 designated TLS bodies were available for each monomer: residues 21-36
(N-terminus), residues 37-469 (β-propeller domain) and residues 470-723 (α/β hydrolase domain).
The model built with solve/resolve had an initial R-factor of 28.1%, and an Rfree value of 31%. The
R-factor (or reliability factor) is a parameter described by the equation

$$R = \frac{\sum_{hk\ell} \big|\, |F_{obs}| - |F_{calc}| \,\big|}{\sum_{hk\ell} |F_{obs}|}$$

which is a measure of the difference between the observed structure factors (F) of the original electron density
map (Fobs) and the structure factors from the rebuilt electron density map calculated from
the model (Fcalc), summed over all reflections. Thus a lower R-value is indicative of a better fit of the built model to the original map. The
R-value typically ranges from 0.6 for disordered molecules to ~0.2 for organised macromolecules. Rfree is
a variable used to detect R-factor bias from refinement; it is calculated in the same way as the
R-factor, but only from a fixed percentage of the data set (typically 5-10%) that is excluded from
refinement. Rfree is typically higher than the R-factor. Analysis shows that restrained fit without TLS
refinement reduced the R-factor of the
model to 21.3%, and the Rfree value to 26.9%, while the TLS-refined model had an R-factor of 23.9% and
an Rfree value of 28.2%. Structural analysis used in figures featured in this section will utilise the lowest
R-factor model.
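To make the R-factor and Rfree definitions concrete, the following is a minimal Python sketch that computes both from paired lists of observed and calculated structure factor amplitudes, taking R as the sum of the absolute amplitude differences divided by the sum of the observed amplitudes. It is not the Refmac5 implementation: it assumes the two sets of amplitudes are already on a common scale, the function names are hypothetical, and the amplitudes shown are invented toy values rather than PgDPPIV data.

```python
def r_factor(f_obs, f_calc):
    """R = sum(| |Fobs| - |Fcalc| |) / sum(|Fobs|) over paired reflections."""
    if len(f_obs) != len(f_calc) or not f_obs:
        raise ValueError("f_obs and f_calc must be non-empty and of equal length")
    numerator = sum(abs(abs(o) - abs(c)) for o, c in zip(f_obs, f_calc))
    denominator = sum(abs(o) for o in f_obs)
    return numerator / denominator


def r_work_and_r_free(f_obs, f_calc, free_flags):
    """Split reflections by a free-set flag and return (R_work, R_free).

    Reflections flagged True are treated as the free set, i.e. the small
    subset that is excluded from refinement and used only for validation.
    """
    work = [(o, c) for o, c, free in zip(f_obs, f_calc, free_flags) if not free]
    free = [(o, c) for o, c, free in zip(f_obs, f_calc, free_flags) if free]
    r_work = r_factor([o for o, _ in work], [c for _, c in work])
    r_free = r_factor([o for o, _ in free], [c for _, c in free])
    return r_work, r_free


# Toy amplitudes (invented for illustration only).
f_obs = [100.0, 50.0, 80.0, 20.0]
f_calc = [90.0, 55.0, 80.0, 25.0]
free_flags = [False, False, False, True]

r_all = r_factor(f_obs, f_calc)
r_work, r_free = r_work_and_r_free(f_obs, f_calc, free_flags)
print(r_all, r_work, r_free)
```

Because the free reflections never influence the refined model, a gap between the two values that widens during refinement is a common sign of over-fitting.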
Table 1: Refinement parameters for PgDPPIV crystal data sets; unrefined, refinement using restrained fit without TLS, and refinement using
restrained fit with TLS. *These B factor values represent the average across all chains (A-H). **B factor values for water molecules may be
incorrect, as there were no subsequent refinements of the model with waters incorporated into the solvent.
| Refinement parameters | Unrefined | Refined | TLS Refined |
|---|---|---|---|
| Resolution range (Å) | 77.429-2.5 | 77.429-2.5 | 77.429-2.5 |
| Number of used reflections | 270,646 | 270,646 | 270,646 |
| Percentage Observed (%) | 99.0897 | 99.0907 | 99.0907 |
| Percentage of free reflections (%) | 1.9945 | 1.9945 | 1.9945 |
| Overall correlation coefficient | 0.8910 | 0.9381 | 0.9218 |
| Free correlation coefficient | 0.8696 | 0.9035 | 0.8933 |
| Cruickshank's DPI for coordinate error (Å) | 0.5016 | 0.3892 | 0.4255 |
| DPI based on free R-factor (Å) | 0.3190 | 0.2792 | 0.2909 |
| Overall figure of merit | 0.7513 | 0.8077 | 0.7935 |
| R-factor | 0.2814 | 0.2129 | 0.2387 |
| Rfree | 0.3097 | 0.2688 | 0.2824 |
| Wilson B factor (Å²) | - | 64.18 | 64.18 |
| Average B Factors* | | | |
| Total protein atoms (Å²) | 26.299 | 49.928 | 16.281 |
| Main chain atoms (Å²) | 25.749 | 47.614 | 15.372 |
| Side chain atoms (Å²) | 26.839 | 51.436 | 17.186 |
| Water molecules (Å²) | - | 30** | - |
| RMS Deviations | | | |
| Bond Lengths (Å) | 0.0127 | 0.0156 | 0.0160 |
| Bond Angles (°) | 1.6723 | 1.9147 | 1.9042 |
| Chiral Volume | 0.1222 | 0.1087 | 0.1063 |
| Average B factor Main chain atoms (Å²) | 0.388 | 2.760 | 0.733 |
| Average B factor Side chain atoms (Å²) | 1.270 | 3.409 | 1.067 |
Stereochemistry remains relatively unchanged with refinement. Polypeptide backbone stereochemistry
assessed with rampage showed that for the unrefined model, 5115 (91.2%) residues fall in the favoured
region and 360 (6.4%) residues reside in the allowed region. 133 (2.4%) residues were outliers. In the final
non-TLS refined model, 5087 (90.7%) residues were in favoured regions, 386 (6.9%) were in allowed regions and
135 (2.4%) in outlier regions. The TLS refined model contained 5137 (91.6%) residues in the favoured
region, 347 (6.2%) residues in the allowed region and 124 (2.2%) in the outlier region.
B factors are a parameter that defines the motion of an individual atom, and are often referred to as
"temperature factors", as increased motion correlates with increase in temperature (this is an
advantage conferred by using a cryostream in x-ray diffraction, as it reduces temperature and hence motion,
allowing for higher resolution structures). A smaller B factor is indicative of lower atomic motion, and
vice versa. Analysis of the models shows distinctive sets of B factor data for each model. The original
model had an average B factor for all protein atoms of 26.3 Å²; for the non-TLS refined model this value
was 49.9 Å², while for the TLS refined model this was 16.3 Å². The average B factors of the main chain
atoms were 25.7 Å², 47.6 Å² and 15.4 Å² respectively, while the average B factors for side chain atoms
were 26.8 Å², 51.4 Å² and 17.2 Å² respectively. The average B factors across each chain for the original
model are also consistent with one another (table 2). For the non-TLS refined model (table 3) this
consistency was not observed: B factors remained consistent across chains A-F; however, chain G has a
higher average B factor for all atoms of 54.7 Å², while chain H deviates further with an average B factor of
87.0 Å². Figure 18 shows a localised B factor putty of chains A-H, which shows that for chains A-H higher
B factors generally reside within the β-propeller, with some higher B factors at the N-terminus. In chain
G the B factors of the α/β hydrolase are slightly exaggerated, and for chain H the highest B factors show
observable bias in the β-propeller domain. The average B factor for total protein atoms of the TLS-refined
model (table 4) also showed similar inconsistencies, with chain B presenting higher average B factors
(36.0 Å²), while chains E and H showed lower average B factors (6.8 Å² and 7.0 Å²). Across the entire
asymmetric unit of the non-TLS refined model, 760 H2O molecules (avg. 95 per monomer) were placed
in electron density signal peak positions with RMS deviation of 1. The minimum water distances were
assigned to 2.4 Å, with maximum water distances set to 3.2 Å. The B factor of the waters positioned was
30 Å²; however, as refinement could not be carried out post-addition of H2O due to software
complications, this may not be totally representative.
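To give the B factor values above a physical scale, an isotropic B factor can be related to the mean-square atomic displacement ⟨u²⟩ through the standard relation B = 8π²⟨u²⟩. The following minimal Python sketch converts the three average protein B factors quoted above into root-mean-square displacements; the function name is illustrative, and the calculation assumes purely isotropic displacement, which is a simplification for the TLS-refined model.

```python
import math


def b_factor_to_rms_displacement(b_factor: float) -> float:
    """Return the RMS displacement (Å) for an isotropic B factor (Å^2).

    Uses B = 8 * pi^2 * <u^2>, so sqrt(<u^2>) = sqrt(B / (8 * pi^2)).
    """
    if b_factor < 0:
        raise ValueError("B factor must be non-negative")
    return math.sqrt(b_factor / (8.0 * math.pi ** 2))


# Average B factors (Å^2) for all protein atoms, as quoted in the text.
average_b = {"unrefined": 26.3, "non-TLS refined": 49.9, "TLS refined": 16.3}
rms = {name: b_factor_to_rms_displacement(b) for name, b in average_b.items()}
for name, value in rms.items():
    print(f"{name}: {value:.2f} Å")
```

On this scale the non-TLS refined average of 49.9 Å² corresponds to an RMS displacement of a little under 0.8 Å.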
RMS deviations of Bond Lengths (Å) and Bond Angles (°) are variables associated with a restrained fit
refinement. A value close to 0 is indicative of an overrepresentation of geometry in the atomic model.
Refinement where these values are equal to 0 is referred to as a rigid fit, which may or may not
correlate with the atomic fitting of the model to the electron density. For the original model these
values are 0.0127 Å and 1.6723° respectively, while for non-TLS refinement they are 0.0156 Å and
1.9147°, and for TLS refinement they are 0.0160 Å and 1.9042°. Increase of these values with restrained
refinement is indicative of reduced geometric constraints of the atomic model.
Tables 2-4: Table of average B factors & RMS deviations for 2) the unrefined model, 3) non-TLS refined model and 4) TLS
refined model. B factors acquired using Baverage.
Table 2 (unrefined model):

| Chain | Total protein atoms (Å²) | Main chain atoms (Å²) | Side chain atoms (Å²) | Average B factor RMS main chain atoms | Average B factor RMS side chain atoms |
|---|---|---|---|---|---|
| ALL | 26.299 | 25.749 | 26.839 | 0.388 | 1.270 |
| A | 26.000 | 25.436 | 26.562 | 0.384 | 1.288 |
| B | 27.936 | 27.404 | 28.467 | 0.470 | 1.523 |
| C | 26.075 | 25.466 | 26.682 | 0.405 | 1.362 |
| D | 25.940 | 25.375 | 26.504 | 0.379 | 1.229 |
| E | 25.713 | 25.134 | 26.291 | 0.398 | 1.313 |
| F | 26.127 | 25.567 | 26.686 | 0.398 | 1.298 |
| G | 26.216 | 25.682 | 26.749 | 0.358 | 1.164 |
| H | 26.313 | 25.856 | 26.768 | 0.314 | 0.979 |

Table 3 (non-TLS refined model):

| Chain | Total protein atoms (Å²) | Main chain atoms (Å²) | Side chain atoms (Å²) | Average B factor RMS main chain atoms | Average B factor RMS side chain atoms |
|---|---|---|---|---|---|
| ALL | 49.928 | 47.614 | 51.436 | 2.760 | 3.409 |
| A | 41.989 | 39.798 | 44.173 | 2.178 | 3.091 |
| B | 44.406 | 41.928 | 46.876 | 2.298 | 3.367 |
| C | 40.805 | 38.454 | 43.148 | 2.171 | 3.280 |
| D | 44.914 | 42.719 | 47.102 | 2.531 | 3.211 |
| E | 40.277 | 38.080 | 42.467 | 2.200 | 3.149 |
| F | 45.359 | 41.085 | 43.225 | 2.362 | 3.158 |
| G | 54.651 | 52.480 | 56.814 | 2.879 | 3.352 |
| H | 87.024 | 86.366 | 87.681 | 5.458 | 4.667 |

Table 4 (TLS refined model):

| Chain | Total protein atoms (Å²) | Main chain atoms (Å²) | Side chain atoms (Å²) | Average B factor RMS main chain atoms | Average B factor RMS side chain atoms |
|---|---|---|---|---|---|
| ALL | 16.281 | 15.372 | 17.186 | 0.733 | 1.067 |
| A | 17.371 | 16.336 | 18.402 | 0.734 | 1.171 |
| B | 35.966 | 33.823 | 38.101 | 1.908 | 2.595 |
| C | 16.081 | 14.991 | 17.168 | 0.763 | 1.234 |
| D | 14.854 | 13.982 | 15.724 | 0.639 | 0.962 |
| E | 6.765 | 6.518 | 7.011 | 0.312 | 0.430 |
| F | 16.165 | 15.137 | 17.191 | 0.723 | 1.132 |
| G | 16.027 | 15.264 | 16.787 | 0.614 | 0.840 |
| H | 7.015 | 6.924 | 7.105 | 0.168 | 0.172 |
Figure 16: Cartoon rendering of the 8-chained asymmetric unit of PgDPPIV. The asymmetric unit is
topologically organised as a homotetramer of homodimers. Water molecules present in the solvent are
shown as small magenta coloured spheres. Rendering was accomplished with Pymol.
Figure 17: Cartoon representation of the symmetry of the asymmetric unit of PgDPPIV in the unit cell.
Figure 17A shows the relative front view of the unit cells as shown in figure 15, along primary axes a and c.
Figure 17B shows the relative top view of the unit cell, along axes b and c. Figure 17C shows the side view
of the unit cell, along axes a and b. Rendering was accomplished with Pymol.
Figure 18: Cartoon B factor putty representation of the monomers constituting the asymmetric unit of PgDPPIV; Chains A to H as labelled.
The β-propeller is oriented to the left, while the α/β hydrolases are oriented to the right. Rendering accomplished with Pymol.
The active dimeric and monomeric models of PgDPPIV are shown in figure 19. The N-terminus region of
the protein extends past the α/β hydrolase domain, as is seen in other homologs. Transmembrane
region prediction using TMHMM(79, 80) suggests residues Met1-Pro4 form the periplasmic portion of the
protein, with residues Val5-Gly22 forming the transmembrane spanning region of the protein. As
described previously, the α/β hydrolase domain comprises residues 470-723, and the β-propeller
domain residues 37-469: these are shown in figure 20. Tables 5 and 6 show the residue positions of the
secondary structure elements of the β-propeller and the α/β hydrolase domains respectively.
Assessment of these secondary structure elements shows that the β-propeller domain contains some
deviations from the known topological secondary structure organisation of this domain;
specifically, a number of β-sheets that would be expected in the structure of the propeller are absent.
Blades I-III and V-VIII would be expected to contain 4 β-sheets each, and blade IV 7 β-sheets (5 in the main
blade, and 2 in the subdomain), which we do not see in PgDPPIV. For blades I-VIII we see 3, 4, 3, 7, 4, 4,
3 and 3 β-sheets respectively. Sequence alignment of blade structure with SmDPPIV indicates that both
blades I & III lack β-sheet 1 (Arg43-Ser47, Ile144-Phe147), while blades VII & VIII lack β-sheet 4
(Thr410-Lys411, Lys452-Arg455). Additional α-helices are present in blade II (Val127-Arg129) and blade
VI (Ser328-Ala333). The α/β hydrolase domain shows much less discrepancy in secondary structure; however,
an extra α-helix is present between β6 and β7, although it is possible this could be due to an interruption
of an α-helix present in other homologs. It is difficult to determine which residue this constitutes.
The solvent-exposed opening of the β-propeller shows an ellipsoid topology, measuring 19.4 Å between
blades III and VII at its greatest and 12.9 Å between blades I and V at its shortest. The cavity opens
towards the interface, although this is difficult to measure due to the bending of blades I-III resulting in a
loss of oval geometric symmetry and the formation of the side entrance. The distance between blades IV
and VIII measures 19.3 Å, while the distance between blades I and VI measures 19.7 Å. The average
distance of the catalytic serine to the blades at the β-propeller entrance is 42.4 Å. The solvent-exposed
side entrance is slightly larger than that of the β-propeller entrance, with diameters of 23.7 Å and 20.5
Å. The entrance sits approximately 25.5 Å from the active site serine. The diameters of the β-propeller
entrance are slightly larger than those of comparable measurements in HsDPPIV and SsDPPIV, and may
perhaps have some role in the opening of the active site to slightly larger oligopeptide substrates. The side
entrance, conversely, is only slightly larger than in the eukaryotic homologs.
Figure 19: Cartoon representation of the dimeric configuration of PgDPPIV from A) the front perspective B) top-down perspective, and C) the
monomeric configuration of PgDPPIV. Rendering was accomplished with Pymol.
Figure 20: Cartoon representations of A) the α/β hydrolase domain of PgDPPIV, showing the 8 β-sheet topology of the domain and B) the β-propeller domain of PgDPPIV, showing the 8 bladed topology and pore of the domain. Secondary structure features are also annotated.
Rendering was accomplished with Pymol.
Tables 5 & 6: Table of 5) residues comprising the secondary structure elements of the PgDPPIV β-propeller domain and 6) residues
comprising the secondary structure elements of the PgDPPIV α/β hydrolase domain, based on automatic secondary structure determination
by Pymol.
Table 5 (β-propeller domain):

| Sub Domain | Secondary Structure | Residues |
|---|---|---|
| N-terminus | α1 | Leu27-Ser32 |
| Blade I | β1A | His52-Met57 |
| | β1B | Ala63-Asn68 |
| | β1C | Val75-Ser80 |
| Blade II | β2A | Gln93-Val97 |
| | β2B | His103-Thr108 |
| | β2C | Ala121-Asp126 |
| | α2 | Val127-Arg129 |
| | β2D | Asn130-Pro134 |
| Blade III | β3A | Met153-Arg158 |
| | β3B | Asn161-Lys166 |
| | β3C | Asp170-Gln174 |
| Blade IV | β4A | Ile184-Asn186 |
| | α3 | Trp191-Phe197 |
| | β4B | Met203-Trp205 |
| | β4C | Phe211-Asp218 |
| | β4’1 | Glu224-Met229 |
| | β4’2 | Glu237-Lys242 |
| | β4D | Thr252-Asn259 |
| | β4E | Arg263-Val268 |
| Blade V | β5A | Arg280-Phe283 |
| | β5B | Leu290-Leu295 |
| | β5C | Asp301-His308 |
| | β5D | Leu312-Met321 |
| Blade VI | α3 | Ser328-Ala333 |
| | β6A | Lys335-Ala338 |
| | β6B | Gly340-Ser346 |
| | β6C | His353-Tyr357 |
| | β6D | His364-Arg366 |
| Blade VII | β7A | Thr375-Asp381 |
| | β7B | Gly384-Ser390 |
| | β7C | Arg398-Tyr401 |
| Blade VIII | β8A | Thr418-Phe423 |
| | β8B | Tyr429-Ser435 |
| | β8C | Val442-Arg448 |
| | α4 | Val461-Ala469 |

Table 6 (α/β hydrolase domain):

| Sub Domain | Secondary Structure | Residues |
|---|---|---|
| α/β hydrolase | β1 | Glu476-Thr482 |
| | β2 | Gly485-Val493 |
| | β3 | Val506-Gln510 |
| | α1 | Trp527-Lys534 |
| | β4 | Val537-Asp542 |
| | α2 | Glu551-Thr557 |
| | α3 | Val563-Gln578 |
| | β5 | Ile587-Trp592 |
| | α4 | Ser593-Gly606 |
| | β6 | Ala612-Val616 |
| | α5 | Trp622-Phe624 |
| | α6 | Ser627-Met634 |
| | α7 | Ala641-Ser647 |
| | α8 | Ala649-Gln655 |
| | β7 | Asn659-Gly665 |
| | α9 | Leu673-Ala686 |
| | β8 | Asp691-Tyr695 |
| | α10 | Thr707-Asn722 |
Regions evidenced to be involved in dimerisation in HsDPPIV are also present in the PgDPPIV model.
The 2 anti-parallel β-sheets of the blade IV subdomain comprise residues Pro223-Lys242. The
α-helix/β-sheet loop which functions in catalytic triad relaxation upon dimerisation is composed of
residues Leu673-Met696. Residues Trp622-Phe624 constitute the short α-helix that also contributes to
the dimerisation interface. These conserved dimerisation motifs are indicated in figure 21.
Figure 21: Cartoon representation of PgDPPIV homodimer with highlighted residues involved in
dimerization: Residues Trp622–Phe624 (red) and Leu673–Thr696 (blue) from the α/β hydrolase domain
and the antiparallel β-sheet stack subdomain (Pro223-Lys242; yellow) protruding from the 2nd β-sheet of
blade 4 in the β-propeller domain. Figure 21A is from a front perspective, and figure 21B is a top-down
perspective. Rendering was accomplished with Pymol.
The primary dimerisation motif, the subdomain of blade IV of the β-propeller, is the only dimerisation
region that forms hydrogen bonds with the monomeric partner. The 2 β-sheets are formed by residues Glu224-Met229 and Glu237-Lys242. Figure 22 shows the hydrogen bonding environment of the interface. The
intermolecular contacts are formed by hydrogen bonding involving 3 residues: the hydrogen donor
Arg226, and the hydrogen acceptors Asp238 and Pro236 of the partner monomer. The NE nitrogen of
the side-chain guanidinium group of Arg226 hydrogen bonds to the carboxyl oxygen of the side chain of
Asp238, at a distance of 2.8-3.0 Å, while the NH1 nitrogen of the guanidinium group hydrogen bonds to
the backbone carbonyl oxygen of Pro236, at a distance of 2.8-3.4 Å. Additionally, the NH2 nitrogen of
Arg226 in chain B hydrogen bonds to the side-chain carboxyl oxygen of Glu224 from
the same chain, at a distance of 3.3 Å. This hydrogen bond is not observed in chain A, where the
corresponding distance is 5.2 Å because Glu224 is positioned away from the interface.
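The distance criterion used above to decide whether the Arg226-Glu224 contact counts as a hydrogen bond can be sketched as follows. This is only an illustration: the 3.5 Å cutoff is an assumed, commonly used donor-acceptor limit rather than a value stated in this work, and the coordinates in the last example are hypothetical.

```python
import math

# Assumed donor-acceptor distance cutoff in Å (a common convention, not a value from the text).
HBOND_CUTOFF = 3.5

def heavy_atom_distance(atom_a, atom_b):
    """Euclidean distance in Å between two (x, y, z) coordinates."""
    return math.dist(atom_a, atom_b)

def within_hbond_range(distance, cutoff=HBOND_CUTOFF):
    """True if a donor-acceptor distance is short enough to count as a hydrogen bond."""
    return distance <= cutoff

# Arg226 NH2 to Glu224 side-chain oxygen distances as reported in the text.
reported = {"chain A": 5.2, "chain B": 3.3}
classified = {chain: within_hbond_range(d) for chain, d in reported.items()}
print(classified)

# Hypothetical coordinates, used only to show the distance calculation itself.
example_distance = heavy_atom_distance((0.0, 0.0, 0.0), (1.0, 2.0, 2.0))
print(example_distance)
```

Under this assumed cutoff the chain B contact is classed as a hydrogen bond and the chain A contact is not, in line with the description above.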
Figure 22: Ribbon/line representation of the hydrogen bonding environment of the primary dimerisation region of PgDPPIV; residues
Pro223-Lys242. Rendering was accomplished with Pymol.
Figures 23 and 24 show the structural alignments of the α/β hydrolase and β-propeller domains
(respectively) of PgDPPIV with the comparable domains of HsDPPIV, SsDPPIV, SmDPPIV and PgPTP.
Alignment with Pymol was followed by quantification of the RMSD between each pair of aligned
structures. For the α/β hydrolase domain alignment, the RMS deviations for the four homologs are 17.135,
16.269, 15.932 and 16.994 respectively. Likewise for the β-propeller domain, the RMSDs measure
17.799, 17.941, 17.252 and 17.476. RMS deviations of overall structure alignment were 22.065, 23.091,
18.303 and 18.350. These RMS deviation values indicate that for all 3 structural components
that were aligned, SmDPPIV shows the greatest structural similarity. From overall structure alignment
the prokaryotic homologs are better fits than the eukaryotic homologs, whereas for the individual
domains there is very little difference in RMS deviations between eukaryotic and prokaryotic homologs. HsDPPIV has the worst fit to the α/β
hydrolase domain while SsDPPIV has the worst fit to the β-propeller domain, although for the β-propeller domain there is little variation in RMS deviation values.
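The ranking of homologs described above can be reproduced directly from the reported RMS deviations. The following is a minimal sketch that only tabulates the values quoted in the text; the dictionary layout and variable names are illustrative and are not part of the original analysis.

```python
# RMS deviations reported in the text, keyed by aligned component and homolog.
rmsd = {
    "alpha/beta hydrolase": {"HsDPPIV": 17.135, "SsDPPIV": 16.269, "SmDPPIV": 15.932, "PgPTP": 16.994},
    "beta-propeller": {"HsDPPIV": 17.799, "SsDPPIV": 17.941, "SmDPPIV": 17.252, "PgPTP": 17.476},
    "overall": {"HsDPPIV": 22.065, "SsDPPIV": 23.091, "SmDPPIV": 18.303, "PgPTP": 18.350},
}

# Lowest RMSD = best fit, highest RMSD = worst fit, for each component.
best_fit = {part: min(values, key=values.get) for part, values in rmsd.items()}
worst_fit = {part: max(values, key=values.get) for part, values in rmsd.items()}

# Spread between the worst and best fit, to show how much the homologs differ.
spread = {part: round(max(v.values()) - min(v.values()), 3) for part, v in rmsd.items()}

for part in rmsd:
    print(part, "best:", best_fit[part], "worst:", worst_fit[part], "spread:", spread[part])
```

The spread is small for the two isolated domains and larger for the overall alignment, which is where the prokaryotic and eukaryotic homologs separate.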
Figure 23: Ribbon representation of the structural alignment of the α/β hydrolase domain of PgDPPIV
(magenta) with the homologous α/β hydrolase domains of A) HsDPPIV, B) SsDPPIV, C) SmDPPIV and D)
PgPTP. Rendering accomplished with Pymol.
Figure 24: Ribbon representation of the structural alignment of the β-propeller domain of PgDPPIV
(magenta) with the homologous β-propeller domains of A) HsDPPIV, B) SsDPPIV, C) SmDPPIV and D)
PgPTP. Rendering accomplished with Pymol.
Figure 25: Stick/cartoon diagram of the PgDPPIV active site. Shown are the S1 hydrophobic residues Val619, Trp622, Tyr625, Tyr629 and
Val671 (magenta), oxyanion stabilising residues Tyr511 & Tyr594 (green), catalytic triad Ser593, Asp668 & His700 (orange) and Glu195-Glu196
contributed from the β-propeller domain (cyan). Rendering accomplished with Pymol.
The active site of PgDPPIV is shown in figure 25, with residues selected on the basis of sequence
alignment with its homologs. Sequence alignment suggests that the residues involved in substrate
binding or catalysis are identical to those found in eukaryotic homologs. Alignments of the active site of
PgDPPIV with all 4 of its homologs are shown in figure 26. As expected of a serine peptidase, the
catalytic triad of PgDPPIV comprises residues Ser593, Asp668 and His700. The oxyanion stabilising
residues are Tyr511, located upstream of the catalytic triad, and Tyr594, neighbouring the catalytic
serine. The S1 hydrophobic binding pocket is formed by residues Val619, Trp622, Tyr625, Tyr629 and
Val671. The di-glutamate residues which bind the amino-terminus of PgDPPIV's oligopeptide substrate are
Glu195 & Glu196, which are provided by the β-propeller as they are in HsDPPIV/SsDPPIV/SmDPPIV.
Figure 26: Stick/cartoon diagram of the PgDPPIV active site (magenta) aligned with the homologous active sites of A) HsDPPIV, B) SsDPPIV,
C) SmDPPIV and D) PgPTP. Rendering accomplished with Pymol.
RMS deviation values for the active site alignments were 0.926, 0.912, 5.148 and 7.074 for HsDPPIV,
SsDPPIV, SmDPPIV and PgPTP respectively. This is in agreement with the sequence alignment of the active site residues, where
prokaryotic homologs with differing residues in the active site show expectedly less similarity.
Combined structural and sequence alignment of the active site suggests stronger homology with the
functional features of the eukaryotic members of the DPPIV family, particularly concerning the core of the α/β hydrolase
domain, despite the weaker structural homology observed in the tertiary structure of this domain.
Compared to HsDPPIV, the PgDPPIV active site shows a general slight shift. The di-glutamate motif is shifted the most, measuring a 0.9-1.4 Å shift, while the catalytic triad residues Ser593,
Asp668 and His700 are shifted 0.7-0.8 Å (2.6 Å shift of the oxygen due to an alternate rotamer
conformation; see discussion), 0.6-0.8 Å and 0.4-0.6 Å, respectively. Tyr629 is shifted the least, with the
shift measuring 0.4-0.6 Å. Comparison with SsDPPIV shows a more varied shift of the active site. Tyr511 is rotated, with
the aromatic ring rotated to a perpendicular rotamer conformation (100°). The shift is shorter at the
backbone (0.9-1.0 Å), while at the C1, C4 and oxygen atoms the shift increases by 1.4 Å, 2.3 Å and 2.8 Å
respectively. The catalytic triad is shifted 0.4-1.0 Å, 0.5-0.8 Å and 0.5-0.8 Å for Ser593, Asp668 and
His700 respectively. Tyr625 and Tyr629 show the smallest shifts, of 0.5-0.6 Å and 0.4-0.6 Å
respectively. The shifts compared to both homologs show a slightly more closed conformation of the
active site, measuring on average 0.1-0.2 Å closure across the active site, with little rotation of the active
site residues (aside from Tyr511 compared to SsDPPIV and Ser593 compared to HsDPPIV). Due to
differences in residue composition, and thus poor alignment, measured comparisons with prokaryotic
homologs were not made.
B factors (main chain & side chain) for the active site residues of the A chain monomer are (with values
shown in parentheses): Glu195 (39.7 Å²), Glu196 (37.0 Å²), Tyr511 (34.0 Å²), Ser593 (36.3 Å²), Tyr594 (32.8
Å²), Val619 (37.9 Å²), Trp622 (34.6 Å²), Tyr625 (35.4 Å²), Tyr629 (31.1 Å²), Asp668 (32.7 Å²), Val671 (34.8
Å²), His700 (34.5 Å²). These B factors are lower than the average B factor for all atoms (49.9 Å²), as
would be expected of residues that are contained within the core of the protein. When 20 < B < 40,
atom positions are well defined, albeit with errors of 0.5 Å being a possibility. This could negate some of the
observations that have been stated above, and may indicate that the measured distances actually bear little
significance.
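To put the B factors above on a length scale, the sketch below averages the active site values quoted in the text and converts them to an approximate RMS atomic displacement. The conversion assumes the standard isotropic relation B = 8π²⟨u²⟩, which is not derived in this work, and the resulting displacement is an indication of positional spread rather than a rigorous coordinate error.

```python
import math

# Active site B factors (Å^2) for chain A, as listed in the text.
active_site_b = {
    "Glu195": 39.7, "Glu196": 37.0, "Tyr511": 34.0, "Ser593": 36.3,
    "Tyr594": 32.8, "Val619": 37.9, "Trp622": 34.6, "Tyr625": 35.4,
    "Tyr629": 31.1, "Asp668": 32.7, "Val671": 34.8, "His700": 34.5,
}
OVERALL_MEAN_B = 49.9  # Å^2, average over all protein atoms (from the text)

def rms_displacement(b_factor):
    """RMS displacement in Å, assuming the isotropic relation B = 8*pi^2*<u^2>."""
    return math.sqrt(b_factor / (8 * math.pi ** 2))

mean_active_b = sum(active_site_b.values()) / len(active_site_b)
all_below_average = all(b < OVERALL_MEAN_B for b in active_site_b.values())

print(f"mean active site B = {mean_active_b:.1f} Å^2")
print(f"approx. RMS displacement = {rms_displacement(mean_active_b):.2f} Å")
print(f"all active site residues below overall mean: {all_below_average}")
```

For B factors between 20 and 40 Å² this relation gives displacements of roughly 0.5 to 0.7 Å, which is the same order of magnitude as several of the shifts measured between active sites.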
For the structural interpretations and comparisons that have been made, the non-TLS refinement model
was selected as the better of the 2 models that were produced from refinement. Both the
non-TLS and TLS refined models have an Rfree value which is higher than their R-factor, which is an
indication that the model has not been over-interpreted through R-factor bias. The non-TLS refined model
was selected based primarily on its lower R-factor and Rfree, which can be interpreted as a better model
fit to the electron density compared to the TLS model; however other important factors were also
considered, as while many of the refinement parameters show little variance between refinements,
other parameters vary greatly. B factors in particular took part in this selection, as a great
degree of variation between the 3 models was observed. B factors show a correlative relationship with
resolution, where an increase in resolution results in a decrease in the B factors, due to
decreased motion of the atoms in the model. This relationship can be used to draw comparisons with
structures at similar resolution ranges, which would bear similar average B factors. The 2.8 Å
structure of SmDPPIV does generally agree with the average B factors of the non-TLS model, with
average B factors measuring 41.0 Å² for all protein atoms, 40.5 Å² for main chain atoms, 41.6 Å² for side
chain atoms and 30.9 Å² for water molecules(58). The 2.1 Å resolution structure of PgPTP also bears
similar values, measuring 38 Å² for all protein atoms(60). The comparisons do however suggest that
average B factors for the non-TLS model are slightly higher than would be expected. This is likely the
result of high B factors of chains G & H present in the asymmetric unit; if these chains corresponded
with chains A-F, then average B factors would be 43.0 Å², 40.3 Å² and 44.5 Å² for the aforementioned
atoms. Exaggeration of these B factors correlates with poor electron density fit of these chains,
particularly chain H, which could result from poor superposition of refined chain A onto these chains, or
could be indicative that these chains adopt an alternate conformation in the asymmetric unit.
In terms of geometric constraints the non-TLS refinement is also relatively sound, with RMS deviations
varying very little between non-TLS and TLS refined models. However the refinement with TLS shows
better backbone stereochemistry, with fewer residues in outlying regions.
Manual alterations of φ/ψ angles in chain A were made with WinCoot (either with chi angle rotation or
peptide bond flipping), and did show improvement in stereochemistry; the original A chain monomer
contained 648 (92.3%) residues in favoured regions, 42 (6.0%) residues in allowed regions and 12 (1.7%)
residues in outlier regions. The refined monomer contained 675 (96%), 23 (3.3%) and 4 (0.7%) residues in
favoured/allowed/outlier regions respectively; however, after final superpositioning and refinement these were
altered to 657 (93.6%), 37 (5.3%) and 8 (1.1%) residues, respectively. This could be indicative that
attempted resolution of the stereochemistry of some residues imposed too rigid a geometric fit on the
model, which contradicted the electron density fit. However this change in the A chain
alone does not account for the similar Ramachandran profiles of the unrefined and non-TLS refined models,
possibly indicating that the Ramachandran profile is much poorer in chains which underwent
superpositioning by chain A. Comparatively the chain with the poorest electron density fit, chain H, contains
590 (84.1%), 81 (11.5%) and 31 (4.4%) residues in favoured/allowed/outlying regions. Ideally there would be no
residues in outlying regions; 98% would be in the favoured regions, while the remaining 2% would be in
allowed regions. The stereochemical profile of non-TLS refined PgDPPIV, with 90.7%, 6.4% and 2.4% of residues
in favoured/allowed/outlying regions, has a lower percentage of residues in favoured regions and a higher
percentage of residues in allowed and outlying regions.
For comparison, Rampage assessment of the isolated chain A monomers of the models used in
structural alignment shows that the 2.8 Å SmDPPIV has 92%, 7.8% and 0.3%, 1.8 Å SsDPPIV has 95.9%,
29% and 0.1%, 2.1 Å HsDPPIV has 93.8%, 6.1% and 1.0%, and 2.1 Å PgPTP has 96.3%, 3.4% and 0.3% of
residues in favoured/allowed/outlying regions. It is clear that further refinement of the
stereochemistry of the PgDPPIV asymmetric unit is required.
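The Ramachandran percentages quoted for chain A follow directly from the residue counts. The sketch below recomputes them for the original monomer and for the monomer after final superpositioning and refinement, and checks each against the ideal profile stated above (98% favoured, no outliers). The function names are illustrative; the counts are those given in the text.

```python
def ramachandran_profile(favoured, allowed, outlier):
    """Return (favoured, allowed, outlier) percentages, rounded to one decimal place."""
    total = favoured + allowed + outlier
    return tuple(round(100 * n / total, 1) for n in (favoured, allowed, outlier))

def meets_ideal(profile, min_favoured=98.0):
    """Ideal profile as stated in the text: at least 98% favoured and no outliers."""
    favoured_pct, _, outlier_pct = profile
    return favoured_pct >= min_favoured and outlier_pct == 0.0

# Residue counts for chain A as reported in the text.
original_chain_a = ramachandran_profile(648, 42, 12)
final_chain_a = ramachandran_profile(657, 37, 8)

print("original:", original_chain_a, "ideal:", meets_ideal(original_chain_a))
print("final:   ", final_chain_a, "ideal:", meets_ideal(final_chain_a))
```

Neither profile reaches the ideal, which is consistent with the conclusion that further stereochemical refinement is required.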
During manual refinement with WinCoot, there were a few residues which did not optimally fit the electron
density: residues Ala402, Ile403 and Leu454, which are shown in figure 27. In the image the 2Fobs-Fcalc
map is shown in blue, while the Fobs-Fcalc maps are shown in green (positive density) and red (negative
density); atoms placed in negative density are unfavourable density fits. The side chains of
Ala402, Ile403 and Leu454 are visibly located within the red region. Manipulation of the model was
unsuccessful in finding geometrically viable alternative conformations. Interestingly Leu454 is contained
within the theoretical β-sheet 4 of blade VIII which is missing from the model; however whether this is
due to the selected conformation for Leu454 is unknown. It is possible that the missing blades of the β-propeller are the result of poor model building, such as in the presented case of Leu454; however
observation of the electron density fit of the predicted β-sheet regions in WinCoot shows no aberrant
fitting of atoms to the electron density map. It is likely that the absence of β-sheets comes down to the
conformation assumed during crystallisation of the protein.
There were 2 possible rotamers for the catalytic serine, which ultimately altered the
hydrogen bonding distance to the nitrogen atom of the imidazole ring of His700, depending on the
rotamer selected. In the conformation shown in figure 26, the distance between the serine oxygen and
the His700 nitrogen is 3.3 Å, while in the alternative rotamer conformation this distance is shortened to
2.1 Å. For HsDPPIV, SsDPPIV, SmDPPIV and PgPTP these distances are 2.7 Å, 2.6 Å, 3.6 Å and 2.8 Å
respectively. In either conformation, the distance deviates from those found in the other
homologs, with the longer hydrogen bond conformation showing a deviation similar to that of
SmDPPIV. For the 2.1 Å rotamer, closer hydrogen bonding could be indicative of stronger nucleophilic
activity of the serine; this conformation would mean Ser593 might be orientated towards the binding site
when ligand is bound in the acyl intermediate state. Further co-crystallisation with a ligand is required to
confirm whether this 2.1 Å distance is altered in the bound state.
Figure 27: Electron-density map fit of atomic model of residues Ala402/Ile403 (A) and residue Leu454 (B) from chain A. Screenshot taken
with WinCoot.
Curiously, although structurally PgDPPIV bears the strongest resemblance to its prokaryotic homologs,
particularly SmDPPIV, the active site bears an uncanny resemblance to those of the HsDPPIV/SsDPPIV enzymes.
Such homology could make the development of a drug inhibitor with
specificity for only PgDPPIV difficult, if it is based on inhibitors which bind only directly to the active site.
However, the slight aforementioned differences in the dimensions of the active site could provide sufficient
specificity; whether these are true positioning differences or errors which can be explained by the possible
0.5 Å position errors indicated by B factor analysis of these residues is debatable. Further
characterisation of residues found near the active site could be utilised in the synthesis of an inhibitor.
Although the structure that has been characterised provides some insight into the structure-function
relationship of this enzyme, slightly higher B factors compared to homologs, poor stereochemical
refinement of the polypeptide backbone and the absence of predicted secondary structures lead
to the conclusion that further refinement of the atomic model is required to better elucidate the
structural features of this pathologically important enzyme. Previous attempts at the refinement of
single-wavelength anomalous dispersion (SAD) data from crystals of PgDPPIV with a resolution range down to 2.5
Å produced a refinement model with an R-factor of 22.7% and an Rfree of 28.8% (Mistry, unpublished).
Our model, produced using MAD reflection data, is the model of PgDPPIV which
has the best fit to the electron density map to date; however a higher resolution structure is still
necessary to provide a more reliable model of PgDPPIV. This will likely mean a return to
research into more optimised crystallisation conditions of PgDPPIV crystals, which thus far has proven
difficult due to the slow rate of crystal growth of current PgDPPIV crystals. Alternatively other avenues of
improving resolution could be considered, including post-crystallisation methods such as crystal
dehydration (addition of dehydration solution or extraction of water) or crystal annealing (returning the
crystal to room temperature after flash-freezing, before returning it to the cryostream), which reduce the
solvent content of the crystal and increase the packing of the molecules in the crystal lattice, and which have
been shown to increase the resolution of some crystals(81).
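For reference, the R-factor and Rfree values compared above measure the fractional disagreement between observed and calculated structure factor amplitudes. The sketch below uses the conventional definition, R = Σ||Fobs| - |Fcalc|| / Σ|Fobs|, with scaling between the two sets omitted for brevity; the amplitudes are hypothetical and serve only to show the calculation. The R and Rfree percentages are those quoted in this work for the MAD-derived model and the earlier SAD-derived model.

```python
def r_factor(f_obs, f_calc):
    """Conventional crystallographic R-factor (scale factor omitted for simplicity)."""
    numerator = sum(abs(abs(o) - abs(c)) for o, c in zip(f_obs, f_calc))
    denominator = sum(abs(o) for o in f_obs)
    return numerator / denominator

# Hypothetical amplitudes, for illustration only (not data from this project).
f_obs_example = [100.0, 80.0, 60.0, 40.0]
f_calc_example = [90.0, 85.0, 50.0, 45.0]
example_r = r_factor(f_obs_example, f_calc_example)
print(f"example R = {example_r:.3f}")

# Reported statistics (percent): (R-factor, Rfree).
reported = {"MAD model": (21.3, 26.9), "SAD model": (22.7, 28.8)}
gaps = {name: round(r_free - r, 1) for name, (r, r_free) in reported.items()}
print(gaps)
```

Rfree is calculated in the same way over a set of reflections excluded from refinement, so a modest gap between Rfree and R suggests the model has not been over-fitted to the working data.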
5 - Acknowledgements
I would like to give my thanks to Prof. Vilmos Fülöp, not only for advising and helping to direct the course
of my work on the project, but also for assisting with enquiries related to both the computational and
theoretical components of the x-ray crystallography and the subsequent processing of data, including
data refinement.
6 - Bibliography
1.
Yang HW, Huang YF, Chou MY. Occurrence of Porphyromonas gingivalis and Tannerella
forsythensis in periodontally diseased and healthy subjects. Journal of Periodontology. 2004;75(8):1077-83.
2.
Chen T. P. gingivalis ATCC 33277 (1) Porphyromonas gingivalis Genome Project: The Forsyth
Institute; 2002 [cited 2013 2nd June]. Available from: http://www.pgingivalis.org/ATCC33277(1).htm.
3.
Bostanci N, Belibasakis GN. Porphyromonas gingivalis: an invasive and evasive opportunistic oral
pathogen. Fems Microbiology Letters. 2012;333(1):1-9.
4.
Baker PJ, Evans RT, Roopenian DC. ORAL INFECTION WITH PORPHYROMONAS-GINGIVALIS AND
INDUCED ALVEOLAR BONE LOSS IN IMMUNOCOMPETENT AND SEVERE COMBINED IMMUNODEFICIENT
MICE. Archives of Oral Biology. 1994;39(12):1035-40.
5.
DeCarlo AA, Windsor LJ, Bodden MK, Harber GJ, BirkedalHansen B, BirkedalHansen H. Activation
and novel processing of matrix metalloproteinases by a thiol proteinase from the oral anaerobe
Porphyromonas gingivalis. Journal of Dental Research. 1997;76(6):1260-70.
6.
Matsuda N, Takemura A, Taniguchi S, Amano A, Shizukuishi S. Porphyromonas gingivalis reduces
mitogenic and chemotactic responses of human periodontal ligament cells to platelet-derived growth
factor in vitro. Journal of Periodontology. 1996;67(12):1335-41.
7.
Wegner N, Wait R, Sroka A, Eick S, Nguyen KA, Lundberg K, et al. Peptidylarginine Deiminase
From Porphyromonas gingivalis Citrullinates Human Fibrinogen and alpha-Enolase Implications for
Autoimmunity in Rheumatoid Arthritis. Arthritis and Rheumatism. 2010;62(9):2662-72.
8.
Mikuls TR, Thiele GM, Deane KD, Payne JB, O'Dell JR, Yu F, et al. Porphyromonas gingivalis and
Disease-Related Autoantibodies in Individuals at Increased Risk of Rheumatoid Arthritis. Arthritis and
Rheumatism. 2012;64(11):3522-30.
9.
Eganhouse K, Keller JC, Drake D, Grigsby W, Wu-Yuan CD. Attachment of Porphyromonas
gingivalis to titanium surfaces. Journal of Dental Research. 1993;72(ABSTR. SPEC. ISSUE):140.
10.
Dennison DK, Huerzeler MB, Quinones C, Caffesse RG. CONTAMINATED IMPLANT SURFACES AN IN-VITRO COMPARISON OF IMPLANT SURFACE COATING AND TREATMENT MODALITIES FOR
DECONTAMINATION. Journal of Periodontology. 1994;65(10):942-8.
11.
Leon R, Silva N, Ovalle A, Chaparro A, Ahurnada A, Gajardo M, et al. Detection of
Porphyromonas gingivalis in the amniotic fluid in pregnant women with a diagnosis of threatened
premature labor. Journal of Periodontology. 2007;78(7):1249-55.
12.
Li L, Messas E, Batista EL, Levine RA, Amar S. Porphyromonas gingivalis infection accelerates the
progression of atherosclerosis in a heterozygous apolipoprotein E-deficient murine model. Circulation.
2002;105(7):861-7.
13.
Potempa J, Pike R, Travis J. Titration and mapping of the active site of cysteine proteinases from
Porphyromonas gingivalis (Gingipains) using peptidyl chloromethanes. Biological Chemistry. 1997;378(3-4):223-30.
14.
Kumagai Y, Konishi K, Gomi T, Yagishita H, Yajima A, Yoshikawa M. Enzymatic properties of
dipeptidyl aminopeptidase IV produced by the periodontal pathogen Porphyromonas gingivalis and its
participation in virulence. Infection and Immunity. 2000;68(2):716-24.
15.
Abiko Y, Hayakawa M, Murai S, Takiguchi H. GLYCYLPROLYL DIPEPTIDYLAMINOPEPTIDASE FROM
BACTEROIDES-GINGIVALIS. Journal of Dental Research. 1985;64(2):106-11.
16.
Banbula A, Bugno M, Goldstein J, Yen J, Nelson D, Travis J, et al. Emerging family of proline-specific peptidases of Porphyromonas gingivalis: Purification and characterization of serine dipeptidyl
peptidase, a structural and functional homologue of mammalian prolyl dipeptidyl peptidase IV. Infection
and Immunity. 2000;68(3):1176-82.
17.
Oda H, Saiki K, Tonosaki M, Yajima A, Konishi K. Participation of the secreted dipeptidyl and
tripeptidyl aminopeptidases in asaccharolytic growth of Porphyromonas gingivalis. Journal of
Periodontal Research. 2009;44(3):362-7.
18.
Rawlings ND, Polgar L, Barrett AJ. A NEW FAMILY OF SERINE-TYPE PEPTIDASES RELATED TO
PROLYL OLIGOPEPTIDASE. Biochemical Journal. 1991;279:907-8.
19.
Kotch FW, Guzei IA, Raines RT. Stabilization of the collagen triple helix by O-methylation of
hydroxyproline residues. Journal of the American Chemical Society. 2008;130(10):2952-3.
20.
Kumagai Y, Yagishita H, Yajima A, Okamoto T, Konishi K. Molecular mechanism for connective
tissue destruction by dipeptidyl aminopeptidase IV produced by the periodontal pathogen
Porphyromonas gingivalis. Infection and Immunity. 2005;73(5):2655-64.
21.
Wu Y-m, Yan J, Chen L-l, Gu Z-y. Association between infection of different strains of
Porphyromonas gingivalis and Actinobacillus actinomycetemcomitans in subgingival plaque and clinical
parameters in chronic periodontitis. Journal of Zhejiang University-Science B. 2007;8(2):121-31.
22.
Lin XH, Wu J, Xie H. Porphyromonas gingivalis minor fimbriae are required for cell-cell
interactions. Infection and Immunity. 2006;74(10):6011-5.
23.
Irshad M, van der Reijden WA, Crielaard W, Laine ML. In Vitro Invasion and Survival of
Porphyromonas gingivalis in Gingival Fibroblasts; Role of the Capsule. Archivum Immunologiae Et
Therapiae Experimentalis. 2012;60(6):469-76.
24.
Persson GR. Immune responses and vaccination against periodontal infections. Journal of
Clinical Periodontology. 2005;32:39-53.
25.
Teshirogi K, Hayakawa M, Ikemi T, Abiko Y. Production of monoclonal antibody inhibiting
dipeptidylaminopeptidase IV activity of Porphyromonas gingivalis. Hybridoma and Hybridomics.
2003;22(3):147-51.
26.
Gilmore BF, Carson L, McShane LL, Quinn D, Coulter WA, Walker B. Synthesis, kinetic evaluation,
and utilization of a biotinylated dipeptide proline diphenyl phosphonate for the disclosure of dipeptidyl
peptidase IV-like serine proteases. Biochemical and Biophysical Research Communications.
2006;347(1):373-9.
27.
Bodet C, Piche M, Chandad F, Grenier D. Inhibition of periodontopathogen-derived proteolytic
enzymes by a high-molecular-weight fraction isolated from cranberry. Journal of Antimicrobial
Chemotherapy. 2006;57(4):685-90.
28.
Weber AE, Kim D, Beconi M, Eiermann G, Fisher M, He HB, et al. MK-0431 is a potent, selective,
dipeptidyl peptidase IV inhibitor for the treatment of type 2 diabetes. Diabetes. 2004;53:A151-A.
29.
Feng J, Zhang ZY, Wallace MB, Stafford JA, Kaldor SW, Kassel DB, et al. Discovery of alogliptin: A
potent, selective, bioavailable, and efficacious inhibitor of dipeptidyl peptidase IV. Journal of Medicinal
Chemistry. 2007;50(10):2297-300.
30.
Ghate M, Jain SV. Structure Based Lead Optimization Approach in Discovery of Selective DPP4
Inhibitors. Mini-Reviews in Medicinal Chemistry. 2013;13(6):888-914.
31.
Rea D, Lambeir AM, Kumagai Y, De Meester I, Scharpe S, Fulop V. Expression, purification and
preliminary crystallographic analysis of dipeptidyl peptidase IV from Porphyromonas gingivalis. Acta
Crystallographica Section D-Biological Crystallography. 2004;60:1871-3.
32.
Tiruppathi C, Miyamoto Y, Ganapathy V, Leibach FH. GENETIC-EVIDENCE FOR ROLE OF DPP-IV IN
INTESTINAL HYDROLYSIS AND ASSIMILATION OF PROLYL PEPTIDES. American Journal of Physiology.
1993;265(1):G81-G9.
33.
Drucker DJ. Minireview: The glucagon-like peptides. Endocrinology. 2001;142(2):521-7.
34.
Ansorge S, Buhling F, Kahne T, Lendeckel U, Reinhold D, Tager M, et al. CD26 dipeptidyl
peptidase IV in lymphocyte growth regulation. Cellular Peptidases in Immune Functions and Diseases.
1997;421:127-40.
35.
Ludwig A, Schiemann F, Mentlein R, Lindner B, Brandt E. Dipeptidyl peptidase IV (CD26) on T
cells cleaves the CXC chemokine CXCL11 (I-TAC) and abolishes the stimulating but not the desensitizing
potential of the chemokine. Journal of Leukocyte Biology. 2002;72(1):183-91.
36.
Busek P, Malik R, Sedo A. Dipeptidyl peptidase IV activity and/or structure homologues (DASH)
and their substrates in cancer. International Journal of Biochemistry & Cell Biology. 2004;36(3):408-21.
37.
Ludwig K, Fan H, Dobers J, Berger M, Reutter W, Bottcher C. 3D structure of the CD26-ADA
complex obtained by cryo-EM and single particle analysis. Biochemical and Biophysical Research
Communications. 2004;313(2):223-9.
38.
Cheng HC, Abdel-Ghany M, Pauli BU. A novel consensus motif in fibronectin mediates dipeptidyl
peptidase IV adhesion and metastasis. Journal of Biological Chemistry. 2003;278(27):24600-7.
39.
Loster K, Zeilinger K, Schuppan D, Reutter W. THE CYSTEINE-RICH REGION OF DIPEPTIDYL
PEPTIDASE-IV (CD 26) IS THE COLLAGEN-BINDING SITE. Biochemical and Biophysical Research
Communications. 1995;217(1):341-8.
40.
Blanco J, Valenzuela A, Herrera C, Lluis C, Hovanessian AG, Franco R. The HIV-1 gp120 inhibits
the binding of adenosine deaminase to CD26 by a mechanism modulated by CD4 and CXCR4 expression.
Febs Letters. 2000;477(1-2):123-8.
41.
Herrera C, Morimoto C, Blanco J, Mallol J, Arenzana F, Lluis C, et al. Comodulation of CXCR4 and
CD26 in human lymphocytes. Journal of Biological Chemistry. 2001;276(22):19532-9.
42.
Ishii T, Ohnuma K, Murakami A, Takasawa N, Kobayashi S, Dang NH, et al. CD26-mediated
signaling for T cell activation occurs in lipid rafts through its association with CD45RO. Proceedings of the
National Academy of Sciences of the United States of America. 2001;98(21):12138-43.
43.
Holst JJ, Deacon CF. Inhibition of the activity of dipeptidyl peptidase IV as a treatment for type 2
diabetes. Diabetes. 1998;47(11):1663-70.
44.
Medeiros MD, Turner AJ. Metabolism and functions of neuropeptide Y. Neurochemical
Research. 1996;21(9):1125-32.
45.
Kajiyama H, Kikkawa F, Suzuki T, Shibata K, Ino K, Mizutani S. Prolonged survival and decreased
invasive activity attributable to dipeptidyl peptidase IV overexpression in ovarian carcinoma. Cancer
Research. 2002;62(10):2753-7.
46.
Proost P, De Meester I, Schols D, Struyf S, Lambeir AM, Wuyts A, et al. Amino-terminal
truncation of chemokines by CD26/dipeptidylpeptidase IV - Conversion of RANTES into a potent inhibitor
of monocyte chemotaxis and HIV-1-infection. Journal of Biological Chemistry. 1998;273(13):7222-7.
47.
Thoma R, Loffler B, Stihle M, Huber W, Ruf A, Hennig M. Structural basis of proline-specific
exopeptidase activity as observed in human dipeptidyl peptidase-IV. Structure. 2003;11(8):947-59.
48.
Rea D, Fulop V. Structure-function properties of prolyl oligopeptidase family enzymes. Cell
Biochemistry and Biophysics. 2006;44(3):349-65.
49.
Chien CH, Huang LH, Chou CY, Chen YS, Han YS, Chang GG, et al. One site mutation disrupts
dimer formation in human DPP-IV proteins. Journal of Biological Chemistry. 2004;279(50):52338-45.
50.
Suzuki Y, Erickson RH, Sedlmayer A, Chang SK, Ikehara Y, Kim YS. DIETARY-REGULATION OF RAT
INTESTINAL ANGIOTENSIN-CONVERTING ENZYME AND DIPEPTIDYL PEPTIDASE-IV. American Journal of
Physiology. 1993;264(6):G1153-G9.
51.
Chung KM, Cheng JH, Suen CS, Huang CH, Tsai CH, Huang LH, et al. The dimeric transmembrane
domain of prolyl dipeptidase DPP-IV contributes to its quaternary structure and enzymatic activities.
Protein Science. 2010;19(9):1627-38.
52.
Chien CH, Tsai CH, Lin CH, Chou CY, Chen X. Identification of hydrophobic residues critical for
DPP-IV dimerization. Biochemistry. 2006;45(23):7006-12.
53.
Rasmussen HB, Branner S, Wiberg FC, Wagtmann N. Crystal structure of human dipeptidyl
peptidase IV/CD26 in complex with a substrate analog. Nature Structural Biology. 2003;10(1):19-25.
54.
Simpkins LM, Bolton S, Pi Z, Sutton JC, Kwon C, Zhao G, et al. Potent non-nitrile dipeptidic
dipeptidyl peptidase IV inhibitors. Bioorganic & Medicinal Chemistry Letters. 2007;17(23):6476-80.
55.
Oefner C, D'Arcy A, Mac Sweeney A, Pierau S, Gardiner R, Dale GE. High-resolution structure of
human apo dipeptidyl peptidase IV/CD26 and its complex with 1-[({2-[(5-iodopyridin-2-yl)amino]ethyl}amino)-acetyl]-2-cyano-(S)-pyrrolidine. Acta Crystallographica Section D-Biological
Crystallography. 2003;59:1206-12.
56.
Kim YB, Kopcho LM, Kirby MS, Hamann LG, Weigelt CA, Metzler WJ, et al. Mechanism of Gly-Pro-pNA cleavage catalyzed by dipeptidyl peptidase-IV and its inhibition by saxagliptin (BMS-477118).
Archives of Biochemistry and Biophysics. 2006;445(1):9-18.
57.
Engel M, Hoffmann T, Wagner L, Wermann M, Heiser U, Kiefersauer R, et al. The crystal
structure of dipeptidyl peptidase IV(CD26) reveals its functional regulation and enzymatic mechanism.
Proceedings of the National Academy of Sciences of the United States of America. 2003;100(9):5063-8.
58.
Nakajima Y, Ito K, Toshima T, Egawa T, Zheng H, Oyama H, et al. Dipeptidyl Aminopeptidase IV
from Stenotrophomonas maltophilia Exhibits Activity against a Substrate Containing a 4-Hydroxyproline
Residue. Journal of Bacteriology. 2008;190(23):7819-29.
59.
Banbula A, Mak P, Bugno M, Silberring J, Dubin A, Nelson D, et al. Prolyl tripeptidyl peptidase
from Porphyromonas gingivalis - A novel enzyme with possible pathological implications for the
development of periodontitis. Journal of Biological Chemistry. 1999;274(14):9246-52.
60.
Ito K, Nakajima Y, Xu Y, Yamada N, Onohara Y, Ito T, et al. Crystal structure and mechanism of
tripeptidyl activity of prolyl tripeptidyl aminopeptidase from Porphyromonas gingivalis. Journal of
Molecular Biology. 2006;362(2):228-40.
61.
Terwilliger TC, Berendzen J. Automated MAD and MIR structure solution. Acta Crystallographica
Section D-Biological Crystallography. 1999;55:849-61.
62.
Terwilliger TC, Kim SH, Eisenberg D. GENERALIZED-METHOD OF DETERMINING HEAVY-ATOM
POSITIONS USING THE DIFFERENCE PATTERSON FUNCTION. Acta Crystallographica Section A. 1987;43:1-5.
63.
Terwilliger TC, Eisenberg D. UNBIASED 3-DIMENSIONAL REFINEMENT OF HEAVY-ATOM
PARAMETERS BY CORRELATION OF ORIGIN-REMOVED PATTERSON FUNCTIONS. Acta Crystallographica
Section A. 1983;39(SEP):813-7.
64.
Terwilliger TC, Eisenberg D. ISOMORPHOUS REPLACEMENT - EFFECTS OF ERRORS ON THE PHASE
PROBABILITY-DISTRIBUTION. Acta Crystallographica Section A. 1987;43:6-13.
65.
Terwilliger TC. MAD PHASING - TREATMENT OF DISPERSIVE DIFFERENCES AS ISOMORPHOUS
REPLACEMENT INFORMATION. Acta Crystallographica Section D-Biological Crystallography. 1994;50:17-23.
66.
Terwilliger TC. MAD PHASING - BAYESIAN ESTIMATES OF F-A. Acta Crystallographica Section D-Biological Crystallography. 1994;50:11-6.
67.
Terwilliger TC, Berendzen J. Bayesian correlated MAD phasing. Acta Crystallographica Section D-Biological Crystallography. 1997;53:571-9.
68.
Emsley P, Cowtan K. Coot: model-building tools for molecular graphics. Acta Crystallographica
Section D-Biological Crystallography. 2004;60(Sp. 1):2126-32.
69.
Emsley P, Lohkamp B, Scott WG, Cowtan K. Features and development of Coot. Acta
Crystallographica Section D. 2010;66(4):486-501.
70.
Collaborative Computational Project, Number 4. The CCP4 suite: programs for protein crystallography. Acta Cryst. 1994;D50:760-3.
71.
Potterton E, Briggs P, Turkenburg M, Dodson E. A graphical user interface to the CCP4 program suite.
Acta Cryst. 2003;D59:1131-7.
72.
Murshudov GN, Skubak P, Lebedev AA, Pannu NS, Steiner RA, Nicholls RA, et al. REFMAC5 for
the refinement of macromolecular crystal structures. Acta Crystallogr. 2011;D67:355-67.
73.
Murshudov GN, Vagin AA, Dodson EJ. Refinement of macromolecular structures by the
maximum-likelihood method. Acta Crystallographica Section D-Biological Crystallography. 1997;53:240-55.
74.
Lovell SC, Davis IW, Arendall III WB, de Bakker PIW, Word JM, Prisant MG, et al. Structure
validation by Cα geometry: φ/ψ and Cβ deviation. Proteins: Structure, Function & Genetics; 2002. p.
437-50.
75.
Krissinel E, Henrick K. Secondary-structure matching (SSM), a new tool for fast protein structure
alignment in three dimensions. Acta Cryst. 2004;D60:2256-68.
76.
Vaguine AA, Richelle J, Wodak SJ. SFCHECK: a unified set of procedures for evaluating the quality
of macromolecular structure-factor data and their agreement with the atomic model. Acta
Crystallographica Section D-Biological Crystallography. 1999;55:191-205.
77.
Winn MD, Murshudov GN, Papiz MZ. Macromolecular TLS refinement in REFMAC at moderate
resolutions. Methods in Enzymology. 2003;374:300-21.
78.
Cockcroft KC. A Hypertext Book of Crystallographic Space Group
Diagrams and Tables: Birkbeck College; 1999 [cited 2013 7th June]. Available from:
http://img.chem.ucl.ac.uk/sgp/large/004ay1.htm.
79.
Sonnhammer EL, von Heijne G, Krogh A. A hidden Markov model for predicting transmembrane
helices in protein sequences. Proceedings / International Conference on Intelligent Systems for
Molecular Biology ; ISMB International Conference on Intelligent Systems for Molecular Biology.
1998;6:175-82.
80.
Krogh A, Larsson B, von Heijne G, Sonnhammer ELL. Predicting transmembrane protein topology
with a hidden Markov model: Application to complete genomes. Journal of Molecular Biology.
2001;305(3):567-80.
81.
Heras B, Martin JL. Post-crystallization treatments for improving diffraction quality of protein
crystals. Acta Crystallographica Section D-Biological Crystallography. 2005;61:1173-80.
7 - Appendix
7.1 - Protein Sequence materials
7.1.1 - PgDPPIV sequence
MKRPVIILLLGIVTMCAMAQTGDKPVDLKEITSGMFYARSAGRGIRSMPDGEHYTEMNRERTAIVRYNYASGKAVDTLFSIERARE
CPFKQIQNYEVSSTGHHILLFTDMESIYRHSYRAAVYDYDVRRNLVKPLSEHVGKVMIPTFSPDGRMVAFVRDNNIFIKKFDFDTE
VQVTTDGQINSVLNGATDWVYEEEFGVTNLMSWSADNAFLAFVRSDESAVPEYRMPMYEDKLYPEDYTYKYPKAGEKNSTVSLHLY
NVADRNTKSVSLPIDADGYIPRIAFTDNADELAVMTLNRLQNDFKMYYVHPKSLVPKLILQDMNKRYVDSDWIQALKFTAGGGFAY
VSEKDGFAHIYLYDNKGVMHRRITSGNWDVTKLYGVDASGTVFYQSAEESPIRRAVYAIDAKGRKTKLSLNVGTNDALFSGNYAYY
INTYSSAATPTVVSVFRSKGAKELRTLEDNVALRERLKAYRYNPKEFTIIKTQSALELNAWIVKPIDFDPSRHYPVLMVQYSGPNS
QQVLDRYSFDWEHYLASKGYVVACVDGRGTGARGEEWRKCTYMQLGVFESDDQIAAATAIGQLPYVDAARIGIWGWSYGGYTTLMS
LCRGNGTFKAGIAVAPVADWRFYDSVYTERFMRTPKENASGYKMSSALDVASQLQGNLLIVSGSADDNVHLQNTMLFTEALVQANI
PFDMAIYMDKNHSIYGGNTRYHLYTRKAKFLFDNL
7.1.2 - HsDPPIV sequence
MKTPWKVLLGLLGAAALVTIITVPVVLLNKGTDDATADSRKTYTLTDYLKNTYRLKLYSLRWISDHEYLYKQENNILVFNAEYGNS
SVFLENSTFDEFGHSINDYSISPDGQFILLEYNYVKQWRHSYTASYDIYDLNKRQLITEERIPNNTQWVTWSPVGHKLAYVWNNDI
YVKIEPNLPSYRITWTGKEDIIYNGITDWVYEEEVFSAYSALWWSPNGTFLAYAQFNDTEVPLIEYSFYSDESLQYPKTVRVPYPK
AGAVNPTVKFFVVNTDSLSSVTNATSIQITAPASMLIGDHYLCDVTWATQERISLQWLRRIQNYSVMDICDYDESSGRWNCLVARQ
HIEMSTTGWVGRFRPSEPHFTLDGNSFYKIISNEEGYRHICYFQIDKKDCTFITKGTWEVIGIEALTSDYLYYISNEYKGMPGGRN
LYKIQLSDYTKVTCLSCELNPERCQYYSVSFSKEAKYYQLRCSGPGLPLYTLHSSVNDKGLRVLEDNSALDKMLQNVQMPSKKLDF
IILNETKFWYQMILPPHFDKSKKYPLLLDVYAGPCSQKADTVFRLNWATYLASTENIIVASFDGRGSGYQGDKIMHAINRRLGTFE
VEDQIEAARQFSKMGFVDNKRIAIWGWSYGGYVTSMVLGSGSGVFKCGIAVAPVSRWEYYDSVYTERYMGLPTPEDNLDHYRNSTV
MSRAENFKQVEYLLIHGTADDNVHFQQSAQISKALVDVGVDFQAMWYTDEDHGIASSTAHQHIYTHMSHFIKQCFSLP
7.1.3 - SsDPPIV sequence
MKTPWKVLLGLLGIAALVTVITVPVVLLNKGTDDAAADSRRTYTLTDYLKSTFRVKFYTLQWISDHEYLYKQENNILLFNAEYGNS
SIFLENSTFDELGYSTNDYSVSPDRQFILFEYNYVKQWRHSYTASYDIYDLNKRQLITEERIPNNTQWITWSPVGHKLAYVWNNDI
YVKNEPNLSSQRITWTGKENVIYNGVTDWVYEEEVFSAYSALWWSPNGTFLAYAQFNDTEVPLIEYSFYSDESLQYPKTVRIPYPK
AGAENPTVKFFVVDTRTLSPNASVTSYQIVPPASVLIGDHYLCGVTWVTEERISLQWIRRAQNYSIIDICDYDESTGRWISSVARQ
HIEISTTGWVGRFRPAEPHFTSDGNSFYKIISNEEGYKHICHFQTDKSNCTFITKGAWEVIGIEALTSDYLYYISNEHKGMPGGRN
LYRIQLNDYTKVTCLSCELNPERCQYYSASFSNKAKYYQLRCFGPGLPLYTLHSSSSDKELRVLEDNSALDKMLQDVQMPSKKLDV
INLHGTKFWYQMILPPHFDKSKKYPLLIEVYAGPCSQKVDTVFRLSWATYLASTENIIVASFDGRGSGYQGDKIMHAINRRLGTFE
VEDQIEATRQFSKMGFVDDKRIAIWGWSYGGYVTSMVLGAGSGVFKCGIAVAPVSKWEYYDSVYTERYMGLPTPEDNLDYYRNSTV
MSRAENFKQVEYLLIHGTADDNVHFQQSAQLSKALVDAGVDFQTMWYTDEDHGIASNMAHQHIYTHMSHFLKQCFSLP
7.1.4 - SmDPPIV sequence
MRHLFASLAFMLATSTVAHAEKLTLEAITGPLPLSGPTLMKPKVAPDGSRVTFLRGKDSDRNQLDLWSYDIGSGQTRLLVDSKVVL
PGTETLSDEEKARRERQRIAAMTGIVDYQWSPDAQRLLFPLGGELYLYDLKQEGKAAVRQLTHGEGFATDAKLSPKGGFVSFIRGR
NLWVIDLASGRQMQLTADGSTTIGNGIAEFVADEEMDRHTGYWWAPDDSAIAYARIDESPVPVQKRYEVYADRTDVIEQRYPAAGD
ANVQVKLGVISPAEQAQTQWIDLGKEQDIYLARVNWRDPQHLSFQRQSRDQKKLDLVEVTLASNQQRVLAHETSPTWVPLHNSLRF
LDDGSILWSSERTGFQHLYRIDSKGKAAALTHGNWSVDELLAVDEKAGLAYFRAGIESARESQIYAVPLQGGQPQRLSKAPGMHSA
SFARNASVYVDSWSNNSTPPQIELFRANGEKIATLVENDLADPKHPYARYREAQRPVEFGTLTAADGKTPLNYSVIKPAGFDPAKR
YPVAVYVYGGPASQTVTDSWPGRGDHLFNQYLAQQGYVVFSLDNRGTPRRGRDFGGALYGKQGTVEVADQLRGVAWLKQQPWVDPA
RIGVQGWSNGGYMTLMLLAKASDSYACGVAGAPVTDWGLYDSHYTERYMDLPARNDAGYREARVLTHIEGLRSPLLLIHGMADDNV
LFTNSTSLMSALQKRGQPFELMTYPGAKHGLSGADALHRYRVAEAFLGRCLKP
7.1.5 - PgPTP sequence
MKKTIFQQLFLSVCALTVALPCSAQSPETSGKEFTLEQLMPGGKEFYNFYPEYVVGLQWMGDNYVFIEGDDLVFNKANGKSAQTTR
FSAADLNALMPEGCKFQTTDAFPSFRTLDAGRGLVVLFTQGGLVGFDMLARKVTYLFDTNEETASLDFSPVGDRVAYVRNHNLYIA
RGGKLGEGMSRAIAVTIDGTETLVYGQAVHQREFGIEKGTFWSPKGSCLAFYRMDQSMVKPTPIVDYHPLEAESKPLYYPMAGTPS
HHVTVGIYHLATGKTVYLQTGEPKEKFLTNLSWSPDENILYVAEVNRAQNECKVNAYDAETGRFVRTLFVETDKHYVEPLHPLTFL
PGSNNQFIWQSRRDGWNHLYLYDTTGRLIRQVTKGEWEVTNFAGFDPKGTRLYFESTEASPLERHFYCIDIKGGKTKDLTPESGMH
RTQLSPDGSAIIDIFQSPTVPRKVTVTNIGKGSHTLLEAKNPDTGYAMPEIRTGTIMAADGQTPLYYKLTMPLHFDPAKKYPVIVY
VYGGPHAQLVTKTWRSSVGGWDIYMAQKGYAVFTVDSRGSANRGAAFEQVIHRRLGQTEMADQMCGVDFLKSQSWVDADRIGVHGW
SYGGFMTTNLMLTHGDVFKVGVAGGPVIDWNRYEIMYGERYFDAPQENPEGYDAANLLKRAGDLKGRLMLIHGAIDPVVVWQHSLL
FLDACVKARTYPDYYVYPSHEHNVMGPDRVHLYETITRYFTDHL
7.1.6 - CLUSTAL W2 multiple sequence alignment
Key residues of the active site are colour-annotated for ease of comparison across the alignment. The catalytic triad is annotated in red, the
oxyanion-stabilising residues in blue, the hydrophobic binding residues in yellow, and the amino-terminus binding residues in cyan.
PgDPPIV   -----------MKRPVIILLLGIVTMCAMAQTGDKPVDLKEITSGMFYARSAGRG-IRSM 48
HsDPPIV   MKTPWKVLLGLLGAAALVTIITVPVVLLNKGTDDATADSRKTYTLTDYLKNTYRLKLYSL 60
SsDPPIV   MKTPWKVLLGLLGIAALVTVITVPVVLLNKGTDDAAADSRRTYTLTDYLKSTFRVKFYTL 60
SmDPPIV   ---MRHLFASLAFMLATSTVAHAEKLTLEAITGPLPLSGPTLMKPKVAPDGSRVTFLRGK 57
PgPTP     ---MKKTIFQQLFLSVCALTVALPCSAQSPETSGKEFTLEQLMPGGKEFYNFYPEYVVGL 57

PgDPPIV   PDGEHYTEMNRERTAIVRYNYASGKAVDTLFSVERARECPFKQIQNYEVSSTGHHILLFT 108
HsDPPIV   RWISDHEYLYKQENNILVFNAEYGNSS--VFLENSTFDEFGHSINDYSISPDGQFILLEY 118
SsDPPIV   QWISDHEYLYKQENNILLFNAEYGNSS--IFLENSTFDELGYSTNDYSVSPDRQFILFEY 118
SmDPPIV   DSDRNQLDLWSYDIGSGQTRLLVDSKVVLPGTETLSDEEKARRERQRIAAMTGIVDYQWS 117
PgPTP     QWMGDNYVFIEGDD--LVFNKANGKSAQTTRFSAADLNALMPEGCKFQTTDAFPSFRTLD 115

PgDPPIV   DMESIYRHSYRAAVYDYDVR---RNLVKPLSEHVGKVMIPTFSPDGRMVAFVRDNNIFIK 165
HsDPPIV   NYVKQWRHSYTASYDIYDLN---KRQLITEERIPNNTQWVTWSPVGHKLAYVWNNDIYVK 175
SsDPPIV   NYVKQWRHSYTASYDIYDLN---KRQLITEERIPNNTQWITWSPVGHKLAYVWNNDIYVK 175
SmDPPIV   PDAQRLLFPLGGELYLYDLKQEGKAAVRQLTHGEGFATDAKLSPKGGFVSFIRGRNLWVI 177
PgPTP     AGRGLVVLFTQGGLVGFDML---ARKVTYLFDTNEETASLDFSPVGDRVAYVRNHNLYIA 172

PgDPPIV   K-----FDFDTEVQVTTDGQINSILNGATDWVYEEEFG-VTNLMSWSADNAFLAFVRSDE 219
HsDPPIV   I-----EPNLPSYRITWTGKEDIIYNGITDWVYEEEVFSAYSALWWSPNGTFLAYAQFND 230
SsDPPIV   N-----EPNLSSQRITWTGKENVIYNGVTDWVYEEEVFSAYSALWWSPNGTFLAYAQFND 230
SmDPPIV   D-----LASGRQMQLTADG-STTIGNGIAEFVADEEMD-RHTGYWWAPDDSAIAYARIDE 230
PgPTP     RGGKLGEGMSRAIAVTIDGTETLVYG---QAVHQREFG-IEKGTFWSPKGSCLAFYRMDQ 228

PgDPPIV   SAVPEYR-MPMYEDK--IYPEDYTYKYPKAGEKNSTVSLHLYNVADRNTKSVSLPIDADG 276
HsDPPIV   TEVPLIE-YSFYSDESLQYPKTVRVPYPKAGAVNPTVKFFVVNTDSLSSVTNATSIQITA 289
SsDPPIV   TEVPLIE-YSFYSDESLQYPKTVRIPYPKAGAENPTVKFFVVDTRTLSPNASVTSYQIVP 289
SmDPPIV   SPVPVQKRYEVYADR----TDVIEQRYPAAGDANVQVKLGVISPAEQAQTQWIDLGKEQD 286
PgPTP     SMVKPTPIVDYHPLE----AESKPLYYPMAGTPSHHVTVGIYHLATGKTVYLQTGEPKEK 284

PgDPPIV   ---------YIPRIAFTDNADELAVMTLNRLQN-------DFKMYYVHPKSLVAKLILQD 320
HsDPPIV   PASMLIGDHYLCDVTWAT-QERISLQWLRRIQNYSVMDICDYDESSGRWNCLVARQHIEM 348
SsDPPIV   PASVLIGDHYLCGVTWVT-EERISLQWIRRAQNYSIIDICDYDESTGRWISSVARQHIEI 348
SmDPPIV   --------IYLARVNWRD-PQHLSFQRQSRDQK-------KLDLVEVTLASNQQRVLAHE 330
PgPTP     ---------FLTNLSWSPDENILYVAEVNRAQN-----ECKVNAYDAETGRFVRTLFVET 330

PgDPPIV   MNKRYVDSDWIQALKFTAGGG--FAYVSEKDGFAHIYLYDNKGVMHRRITSGNWDVTKLY 378
HsDPPIV   STTGWVGRFRPSEPHFTLDGNSFYKIISNEEGYRHICYFQIDKKDCTFITKGTWEVIGIE 408
SsDPPIV   STTGWVGRFRPAEPHFTSDGNSFYKIISNEEGYKHICHFQTDKSNCTFITKGAWEVIGIE 408
SmDPPIV   TSPTWVPLHN--SLRFLDDGS--ILWSSERTGFQHLYRIDSKGK-AAALTHGNWSVDELL 385
PgPTP     DKHYVEPLHP---LTFLPGSNNQFIWQSRRDGWNHLYLYDTTGRLIRQVTKGEWEVTNFA 387

PgDPPIV   GVDASGTVFYQSAEE-SPIRRAVYAIDAKG-RKTKLSLNVGTN-------DALFSGNYAY 429
HsDPPIV   ALTSDYLYYISNEYKGMPGGRNLYKIQLSD-YTKVTCLSCELNPERCQYYSVSFSKEAKY 467
SsDPPIV   ALTSDYLYYISNEHKGMPGGRNLYRIQLND-YTKVTCLSCELNPERCQYYSASFSNKAKY 467
SmDPPIV   AVDEKAGLAYFRAGIESARESQIYAVPLQGGQPQRLSKAPGMH-------SASFARNASV 438
PgPTP     GFDPKGTRLYFESTEASPLERHFYCIDIKGGKTKDLTPESGMH-------RTQLSPDGSA 440

PgDPPIV   YINTYSSAATPTVVSVFRSKDAKELRTLE----DNVALRERLKAYRYNPKEFTIIKTQSG 485
HsDPPIV   YQLRCSGPGLP-LYTLHSSVNDKGLRVLE----DNSALDKMLQNVQMPSKKLDFIILN-E 521
SsDPPIV   YQLRCFGPGLP-LYTLHSSSSDKELRVLE----DNSALDKMLQDVQMPSKKLDVINLH-G 521
SmDPPIV   YVDSWSNNSTPPQIELFRANGEKIATLVENDLADPKHPYARYREAQRPVEFGTLTAADGK 498
PgPTP     IIDIFQSPTVPRKVTVTN-IGKGSHTLLE-------AKNPDTGYAMPEIRTGTIMAADGQ 492

PgDPPIV   LELNAWIVKPIDFDPSRHYPVLMVQYSGPNSQQVLDRYS----FDWEHYLASKG-YVVAC 540
HsDPPIV   TKFWYQMILPPHFDKSKKYPLLLDVYAGPCSQKADTVFR----LNWATYLASTENIIVAS 577
SsDPPIV   TKFWYQMILPPHFDKSKKYPLLIEVYAGPCSQKVDTVFR----LSWATYLASTENIIVAS 577
SmDPPIV   TPLNYSVIKPAGFDPAKRYPVAVYVYGGPASQTVTDSWPGRGDHLFNQYLAQQG-YVVFS 557
PgPTP     TPLYYKLTMPLHFDPAKKYPVIVYVYGGPHAQLVTKTWR-SSVGGWDIYMAQKG-YAVFT 550

PgDPPIV   VDGRGTGARGEEWRKCTYMQLGVFESDDQIAAATAIGQLPYVDAARIGIWGWSYGGYTTL 600
HsDPPIV   FDGRGSGYQGDKIMHAINRRLGTFEVEDQIEAARQFSKMGFVDNKRIAIWGWSYGGYVTS 637
SsDPPIV   FDGRGSGYQGDKIMHAINRRLGTFEVEDQIEATRQFSKMGFVDDKRIAIWGWSYGGYVTS 637
SmDPPIV   LDNRGTPRRGRDFGGALYGKQGTVEVADQLRGVAWLKQQPWVDPARIGVQGWSNGGYMTL 617
PgPTP     VDSRGSANRGAAFEQVIHRRLGQTEMADQMCGVDFLKSQSWVDADRIGVHGWSYGGFMTT 610

PgDPPIV   MSLCRGNGTFKAGIAVAPVADWRFYDSVYTERFMRTPK--ENASGYKMSSALDVAS-QLQ 657
HsDPPIV   MVLGSGSGVFKCGIAVAPVSRWEYYDSVYTERYMGLPTPEDNLDHYRNSTVMSRAENFKQ 697
SsDPPIV   MVLGAGSGVFKCGIAVAPVSKWEYYDSVYTERYMGLPTPEDNLDYYRNSTVMSRAENFKQ 697
SmDPPIV   MLLAKASDSYACGVAGAPVTDWGLYDSHYTERYMDLPAR--NDAGYREARVLTHIE-GLR 674
PgPTP     NLMLTHGDVFKVGVAGGPVIDWNRYEIMYGERYFDAPQE--NPEGYDAANLLKRAG-DLK 667

PgDPPIV   GNLLIVSGSADDNVHLQNTMLFTEALVQANIPFDMAIYMDKNHSIYGGNTRYHLYIRKAK 717
HsDPPIV   VEYLLIHGTADDNVHFQQSAQISKALVDVGVDFQAMWYTDEDHGIASSTAHQHIYTHMSH 757
SsDPPIV   VEYLLIHGTADDNVHFQQSAQLSKALVDAGVDFQTMWYTDEDHGIASNMAHQHIYTHMSH 757
SmDPPIV   SPLLLIHGMADDNVLFTNSTSLMSALQKRGQPFELMTYPGAKHGLSG-ADALHRYRVAEA 733
PgPTP     GRLMLIHGAIDPVVVWQHSLLFLDACVKARTYPDYYVYPSHEHNVMG-PDRVHLYETITR 726

PgDPPIV   FLFDNL--- 723
HsDPPIV   FIKQCFSLP 766
SsDPPIV   FLKQCFSLP 766
SmDPPIV   FLGRCLKP- 741
PgPTP     YFTDHL--- 732
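The number at the end of each alignment row is the running count of residues placed so far in that sequence, with gap characters excluded. This can be checked with a minimal Python sketch, assuming gaps are written as "-". The function name running_counts is illustrative and is not part of ClustalW2; the two segments are the first two PgDPPIV rows of the alignment.

```python
def running_counts(segments, start=0):
    """Return the cumulative residue count after each aligned segment.

    Gap characters ('-') occupy an alignment column but are not residues,
    so they are skipped when counting.
    """
    counts = []
    total = start
    for segment in segments:
        total += sum(1 for ch in segment if ch != "-")
        counts.append(total)
    return counts


# First two PgDPPIV rows of the alignment (60 columns each).
pg_segments = [
    "-----------MKRPVIILLLGIVTMCAMAQTGDKPVDLKEITSGMFYARSAGRG-IRSM",
    "PDGEHYTEMNRERTAIVRYNYASGKAVDTLFSVERARECPFKQIQNYEVSSTGHHILLFT",
]

print(running_counts(pg_segments))  # [48, 108]
```

The same check applied to every row is a quick way to confirm that no residues were lost or duplicated when the alignment was transcribed.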
7.1.7 - ClustalW2 Sequence alignment Scores
Sequence 1   Sequence 2   Score (%)
PgDPPIV      PgPTP        25.86
PgDPPIV      HsDPPIV      25.45
PgDPPIV      SsDPPIV      25.45
PgDPPIV      SmDPPIV      27.66
PgPTP        HsDPPIV      18.31
PgPTP        SsDPPIV      19.67
PgPTP        SmDPPIV      25.82
HsDPPIV      SsDPPIV      88.38
HsDPPIV      SmDPPIV      21.59
SsDPPIV      SmDPPIV      21.32
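A pairwise score of this kind is a percentage identity: the number of identical residue pairs divided by the number of positions compared, with gap positions left out. The Python sketch below illustrates that calculation for two aligned rows of equal length. It assumes gaps are written as "-", and the function name percent_identity is illustrative. ClustalW2 computes its scores from its own pairwise alignments, so values obtained from rows of the multiple alignment will not necessarily reproduce the table.

```python
def percent_identity(row_a, row_b):
    """Percentage of identical residues over the ungapped aligned columns."""
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have the same length")
    compared = 0
    identical = 0
    for a, b in zip(row_a, row_b):
        if a == "-" or b == "-":
            continue  # a gap in either row is not a compared position
        compared += 1
        if a == b:
            identical += 1
    return 100.0 * identical / compared if compared else 0.0


# Final alignment block of the human and porcine sequences.
hs_tail = "FIKQCFSLP"
ss_tail = "FLKQCFSLP"
print(round(percent_identity(hs_tail, ss_tail), 2))
```

For this short C-terminal block, eight of the nine compared positions are identical.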
Word count excluding abstract, table of contents, figure captions, tables, bibliography and appendix
materials: 9,561