Pdb_v2_draft1_review_script
Hello, and welcome to a tutorial on
the RCSB Protein Data Bank, or
RCSB PDB. The PDB archive
contains information about
experimentally-determined
structures of proteins, nucleic acids,
and complex assemblies. As a
member of the wwPDB, the RCSB
PDB curates and annotates PDB
data, and provides a variety of tools
and resources to access these data.
Slide 2
The RCSB Protein Data Bank
Materials prepared by:
Sawsan Khuri, Ph.D. and
Cynthia Perreault-Micale, Ph.D.
www.openhelix.com
Updated: Q1 2011
Version 6
In this introductory tutorial we will
explore the RCSB PDB and teach
you how to use some of the many
tools and resources it offers to find
the structural information you are
interested in.
OpenHelix is not affiliated with the
RCSB PDB, but is offering training
on this resource.
This tutorial was prepared by Drs.
Sawsan Khuri and Cynthia
Perreault-Micale for OpenHelix.
Slide 3
RCSB Protein Data Bank Agenda

Introduction & Credits


Basic Searching & Browsing
Result Options

Structure Summary Page


Advanced Searching
Tools & Education

Summary

Exercises
RCSB PDB: www.pdb.org
Copyright OpenHelix. No use or reproduction without express written consent
First, I will outline the agenda of
this tutorial.
We will start with an introduction to
the PDB and provide credits to its
creators and curators.
Next, we will go through the basic
methods available to search and
browse through RCSB PDB.
I will then go through the results
pages in detail, including the
different results options.
We will also discuss the structure
summary page.
This will be followed by an
overview of some advanced
searching strategies.
Then we will look at some of the
additional features offered,
including the available tools and
education resources.
After that I will offer a summary of
the main points learned and show
you some exercises that you can do
to practice your new skills.
Let’s now begin the Introduction
and Credits.
Slide 4
Introduction & Credits
Easy to navigate
& customize
The RCSB PDB is managed by 2 members of RCSB
RCSB PDB to Access:
• Molecule of the Month
• Experimental Data
• Structure Comparisons
• MyPDB Alerts
• Many Tools and Resources
The RCSB PDB is a tremendous
resource that you can use to study
information about biological
macromolecules found in the PDB
archive. It is free and available
online. Here you see the homepage
that you can find at the URL shown.
PDB is an archive where researchers
can submit and access
experimentally-determined
molecular structure data, and that is
managed and curated and made
freely available to anyone who
wishes to access it. PDB does not
contain only protein data – as the
name may imply – it includes a
wide variety of macromolecules
including nucleic acids and protein-nucleic acid complexes. The RCSB
PDB offers tools that you can use to
visualize, manipulate, download and
analyze macromolecular structure
data – all in a user friendly format.
Researchers from a wide variety of
scientific disciplines, students,
teachers, and the general public use
the RCSB PDB’s related resources
for accessing these data. For
example, students and educators can
read about structures in a Molecule
of the Month column, and then view
individual entries in an interactive
molecular viewer. Structural
biologists can study the available
experimental data to recreate similar
experiments. Computational
biologists can compare how similar
different structures are by sequence
or structure classification.
Pharmacologists can receive
MyPDB email alerts when a
structure is released that relates to
drugs under development. All of
these tools and resources help users
explore a structural view of biology.
On the left of the homepage you can
see how easy it is to navigate to just
what you need in RCSB PDB. There
are groups of links for background
information, data deposition,
searching, browsing, tools and
educational resources. These menus
are movable so that you can
customize your homepage view. If
you are going to be doing a lot of
searching you may want to move
the Search section to the top of the
homepage, for example. All you
have to do is drag it to where you
want it to be because RCSB PDB
utilizes a widget framework. Your
preferences will be remembered so
that you can essentially customize
the RCSB PDB homepage to
whatever works best for your needs.
And we shall soon see how you can
customize your data pages too. This
is only one of the many great
features that make it easy to use this
incredible resource.
The RCSB PDB is a not-for-profit
consortium managed by two
members of the Research
Collaboratory for Structural
Bioinformatics: Rutgers, The State
University of New Jersey and the
University of California San Diego.
The RCSB PDB is a member of the
Worldwide Protein Data Bank
(wwPDB), a group of
organizations that maintain the
single PDB archive of structural
data and make it freely available to
the public. The wwPDB
organizations function as
deposition, data processing and
distribution centers for PDB data.
You can link to both the RCSB PDB
and wwPDB websites right from
www.pdb.org. OpenHelix is a
separate company that provides
training on public resources like the
RCSB PDB; we do not develop or
maintain these resources. Full credit
goes to these groups.
Slide 5
Homepage
PDB Statistics
New Features
News
Latest Structures
www.pdb.org
Here’s a full view of the RCSB
PDB homepage. In the lower middle
section is one of their most popular
regular features: the Molecule of the
Month, which is always worth
checking out. We’ll enlarge that for
a closer look. And you can explore
the links provided in this section to
learn more about this molecule, or
to go to any of the previously
featured molecules of the month.
Molecules of the Month are
categorized.
This molecule belongs to the
Infrastructure and Communication
category. Other categories include
Protein Synthesis, Enzymes, Health
and Disease, Biological Energy, and
Biotechnology and Energy. Click on
any category icon to view the
molecules in that category.
Molecules of the Month can also be
viewed by titles and dates by
selecting these links. Underneath is
a link to a similar feature at the
Protein Structure Initiative
Structural Biology Knowledgebase
website.
Below this is the latest structures
widget that cycles between all
entries released in the past week.
On the right side of the homepage is
the news and updates section. Here
it is enlarged. This section
highlights any new information or
important changes that have been
made to RCSB PDB. The dropdown
menu at the top lets you view new
website features added during
different releases - learn about the
PDB Mobile application for the
iPhone here, for example. RCSB
PDB news can be viewed in this
section. Below it you can read
wwPDB news and find out how to
access snapshots of the PDB
archive.
At the top you find a link to the
PDB statistics. We will click on it to
open it, and look at it in the next
slide.
Slide 6
PDB Has Grown Exponentially
‘Drillable’ Data Distribution Summaries
Now more
than 70,000!
PDB began in
1971 with only
7 structures
RCSB PDB provides a wealth of
statistics for you. Any type of
breakdown you are interested in can
probably be found here, including
data distribution summaries that
allow drilling down. We will
explore this feature in an upcoming
section of the tutorial. To show you
the overall content growth of all
released structures I will select the
Growth of Released Structures link
here.
This nicely illustrates how the PDB
archive has grown over the years –
the word exponential comes to
mind! This is an amazing tribute to
the advances in molecular biology,
biochemistry and computer
technology that we have
experienced. When it started, in
1971 at the Brookhaven National
Laboratory, the PDB had all of
seven structures. Their coordinates
were stored on punch cards and
computer tape. Currently, PDB
contains more than 70,000
structures, and continues to grow.
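To put a number on that word "exponential", here is a rough back-of-the-envelope sketch in Python. It uses only the two figures quoted above (7 structures in 1971, more than 70,000 now) and takes 2011, the update year of this tutorial, as the end point. It assumes perfectly smooth exponential growth, which is a simplification and not something the RCSB PDB statistics pages claim.

```python
import math

# Figures quoted in the narration; 2011 is this tutorial's update year.
start_year, start_count = 1971, 7
end_year, end_count = 2011, 70_000  # "more than 70,000", so treat this as a lower bound

years = end_year - start_year            # elapsed time in years
growth_factor = end_count / start_count  # how many times larger the archive became

# Assumed model (a simplification): count(t) = start_count * exp(k * t)
k = math.log(growth_factor) / years
annual_rate = math.exp(k) - 1            # average fractional growth per year
doubling_time = math.log(2) / k          # years needed for the archive to double

print(f"average annual growth: {annual_rate:.1%}")
print(f"doubling time: {doubling_time:.1f} years")
```

Under these assumptions the archive grew by roughly 26 percent per year, which corresponds to doubling about every three years.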
Slide 7
Structural Diversity - Constantly Increasing!
Glucagon
Insulin
Many Enzymes
Interferon
Channels
Complexes
“Molecular Machinery: A Tour of the Protein Data Bank”
http://www.rcsb.org/pdb/static.do?p=education_discussion/educational_resources/index.html
Shown here is part of the
“Molecular Machinery: A Tour of
the Protein Data Bank” poster that
you can find in the Educational
Resources section of RCSB PDB. It
so nicely highlights PDB’s
tremendous structural diversity.
Although it is called the Protein
Data Bank, the PDB holds
structures of many macromolecules,
including nucleic acids. This
screenshot shows some
representative structures of
glucagon, interferon and insulin, but
PDB also contains structures of
many enzymes, channels and
complexes. Some of these
macromolecules are very
complicated, but you are provided
with all the details you need to
understand them.
These structures have been obtained
through a number of experimental
techniques, including X-ray
crystallography, Nuclear Magnetic
Resonance (or NMR), and electron
microscopy. These methods yield
three-dimensional information to
give us detailed pictures. In this
tutorial, we shall focus on how to
access protein structures that have
been resolved using X-ray
crystallography or NMR, but
remember there are others, like those
solved with electron microscopy,
which you can explore on your own.
Slide 8
Here is an example of a structure
summary page – and we will later
go through this with you step by
step, beginning with how to get to
this result from the homepage,
through how to download the
structure.
An Example of a Structure Summary Page
Explore
next
Biological Assembly – 2 asymmetric units
β-strands
α-helices
loops
sugar-binding
domains
The protein described in this entry
page looks like this. Here we are
looking at the biological assembly
which is composed of two
asymmetric units in this protein, but
that will not always be the case. We
will teach you more about the
differences between the biological
assemblies and asymmetric units
soon. Note that you can see all of
the so-called “secondary structure”
elements found in this protein, such
as alpha helices forming the
dimerization domain, beta strands,
and loops. For the sake of this
tutorial, we shall not go into the
functions of the proteins we look at,
but in case you are curious, this is a
bacterial protein called AraC which
is involved in transcription
regulation. The beta-barrel shape of
this structure binds to an L-arabinose sugar, which allows
the protein to bind to DNA and
work properly. We’ll see the sugar
bound to this protein later on when
we look at PDB entry 2arc. These
are the types of molecular details
RCSB PDB can help you to uncover
about your molecule of choice. And
the goal of this tutorial is to teach
you the most efficient way to find
the information you want in RCSB
PDB.
Let’s go back to the left-hand menu
and continue to explore a couple
more of the important background
links you find here. First we will
select the News & Publications link.
Slide 9
On the News & Publications page
you can access a tremendous
amount of information. The
Publications link provides access to
a list of journal articles published by
the RCSB PDB.
More Background Information
Slide 10
Getting Started: Organization, Icons & Help
Help
Click to
return to
homepage
Access from
homepage
Icons
Flash Player
Help
widget
The Policies page details the data
usage policies, and describes how to
cite PDB structures, the RCSB
PDB, and the wwPDB. Other
helpful links contained in the home
menu include FAQs, or Frequently
Asked Questions, contact and
feedback addresses and forms, and
more background information to
read and link out from.
Let me make you aware of
one last, and very important, page.
This is the “Getting
Started” page that you can access
from the bottom of the RCSB PDB
homepage. Here you can learn
some navigation hints. For example,
wherever you are in the RCSB PDB
you can easily navigate back to the
homepage by clicking on the logo in
the upper left corner. Here you can
also learn the meaning of many of
the icons used. At any point during
your session, you can access online
help by clicking on the Help buttons
which appear in many places. They
are indicated by a question mark.
This will open an extensive
selection of options for you to
browse for help. And you can access
the Help menu from the top of the
page as well. There are icons to
indicate the molecular viewers,
database searches, external sites,
viewing lists, downloading and
reports. The icon with the “f”
indicates that a Flash player is
required.
Clicking on “This webpage” checks
your system for compatibility. Also,
the Getting Started page offers
useful notes, definitions and advice
on how to use its resources. In
addition to this step-by-step tutorial
that has been prepared for you by
OpenHelix, the RCSB PDB has
many of its own tutorials and user guides.
Slide 11
[End of Introduction & Credits]
That concludes the Introduction and
Credits section of our tutorial.
RCSB Protein Data Bank Agenda

Introduction & Credits


Basic Searching & Browsing
Result Options

Structure Summary Page


Advanced Searching
Tools & Education

Summary

Exercises
[Beginning of Basic Searching &
Browsing] We will now review
some basic searching and browsing
techniques.
RCSB PDB: www.pdb.org
Slide 12
Structure Deposition & Unique PDB Identifiers
PDB ID examples:
2ara
AraC transcription factor
Customize
1lr5
Auxin Binding Protein
1e5s
Proline hydroxylase
(not case-sensitive)
Customize your
homepage view


www.pdb.org
Deposit your structural data from the homepage
Data will be validated and assigned a unique PDB ID
As I mentioned earlier, PDB is the
world’s archive of macromolecular
structures. Researchers who solve
macromolecular structures deposit
their data with the wwPDB. The
RCSB PDB offers a specialized tool
for deposition and validation of
structural data. Deposited structures
are given a unique identifier code
called a PDB ID. The entries are
then processed and validated by
wwPDB annotators, and then made
available in the public archive and
from the wwPDB member websites.
Here are some examples of PDB
IDs. They are four alphanumeric
characters long and are not case
sensitive. You can search for a
particular ID in upper or lower case
letters, but within the website you
find them written in lower case
letters. The older codes may have
been abbreviations of the protein
whose structure they represent, for
example, 2ara is the PDB ID for
AraC, the bacterial protein that you
saw in an earlier slide. However, as
more and more structures were
deposited, the denotations became
simply a unique identifier with no
other relevance to the protein itself.
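If you handle PDB IDs in your own scripts, the rule just described can be sketched in a few lines of Python. This is only an illustration of the rule as stated in this tutorial (four alphanumeric characters, not case sensitive, shown in lower case on the website). The function name normalize_pdb_id is a hypothetical helper, not part of any RCSB PDB tool.

```python
import re

# Rule taken from the narration: four alphanumeric characters, case-insensitive.
_PDB_ID_PATTERN = re.compile(r"[A-Za-z0-9]{4}")

def normalize_pdb_id(text):
    """Return the lower-case form of a PDB ID, or None if it is not four alphanumerics."""
    candidate = text.strip()
    if _PDB_ID_PATTERN.fullmatch(candidate) is None:
        return None
    return candidate.lower()

# The three example IDs from the slide, in mixed case, plus one malformed string.
examples = ["2ARA", "1lr5", " 1E5S ", "arac-1"]
normalized = [normalize_pdb_id(e) for e in examples]
print(normalized)
```

The check only tests the shape of the ID; it cannot tell you whether an entry with that ID actually exists in the archive.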
As I mentioned in the introduction –
you can customize your homepage
view by moving the menus where
you like. All you have to do is drag
them. So if you are depositing
structures frequently, you may want
to move the Deposition menu to the
top, for example. Also, be sure to
check out the additional
customization options here.
Slide 13
Basic Search Methods – From the Homepage
Search
Help
Search
Booleans,
wildcard &
complex queries
accepted

www.pdb.org
Many search methods in Top Bar Search
You can search the RCSB PDB
using a number of different
strategies. The simplest method is to
enter a search term directly into the
query box. From here, you can
search by PDB ID or keyword.
Boolean operators, wildcard
searches and complex queries using
quotes for exact phrases and
parentheses to group concepts are
also accepted through this box. The
help menu next to the search box
describes all of these options. In
addition, you can select other search
types from the dropdown menu.
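As a small illustration of how quotes and parentheses combine in a complex query, here is a Python sketch that assembles a query string. The helper names phrase and group are hypothetical, and the upper-case AND and OR spellings are an assumption; check the help menu next to the search box for the exact syntax it accepts.

```python
def phrase(words):
    """Wrap a multi-word term in double quotes so it is matched as an exact phrase."""
    return f'"{words}"'

def group(operator, terms):
    """Join terms with a Boolean operator and wrap the result in parentheses."""
    return "(" + f" {operator} ".join(terms) + ")"

# Exact phrase, combined with a parenthesized group of alternatives.
organisms = group("OR", ["bacteria", "archaea"])
query = " AND ".join([phrase("transcription factor"), organisms])
print(query)
```

Grouping the alternatives in parentheses keeps the OR from being applied to the quoted phrase as well.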
For our first example, let’s leave the
default selection of PDB ID or Text,
enter the PDB ID 2ara here, and
click on the Search button.
Slide 14
A PDB ID Search Returns Structure Summary Page
To more
data
Structure title, ID,
citations, descriptions,
sources, methodology,
structure & more
Easy access to all
search methods

A Structure Summary page for every structure
You are taken directly to the
Structure Summary page for the
PDB ID given, provided it is an ID
for a released entry. This method is
most useful if you already know the
PDB ID you would like to have a
look at, say you found it in a
publication, or a colleague told you
about it.
The structure summary page is
essentially the homepage for each
and every one of the tens of
thousands of PDB structures. At the
top you find the title and ID, access
to many pages of additional
information regarding this structure
and a multitude of additional
options for displaying,
downloading, and sharing this
structure. On the summary page
itself there is even more information
than you can see in this cropped
screenshot. This page presents
citations, descriptions, sources,
methodology, the structure with
many viewing options, and more.
We will go over this structure
summary page in detail in the
upcoming results section, but I just
wanted you to see how easy it is to
find your structure of interest simply
from a basic search, using the top
homepage navigation bar.
Let’s now look at some of the other
search methods, which you find in
the Search menu in the left
navigation bar present on all of the
PDB pages.
Slide 15
The first search type on the menu, which we are now showing cropped
at the top corner of this slide, is the
Advanced Search. This is a
powerful search tool we will explore
in an upcoming section of the
tutorial.
Advanced Search & Latest Release
Show/Hide
Mouse over
Left hand
Search
menu
List of Results
Click to see
data
distribution
summary for
this subset
PDB releases new structures every
week. Clicking on the Latest
Release link will show you a list of
the structures that were made
available to the public that week.
Before we discuss the list of latest
releases, I’d like to explore the
query refinements. I’ll expand this
section of the page by clicking this
“Show” link.
When the Query Refinements
section opens we see data
distribution summaries for the latest
releases. This drill-down view
enables you to see the major
characteristics of the latest releases
because data distribution summaries
are organized by major categories,
such as taxonomy and experimental
method. From this page you can
view only the entries you want, or
explore the entire list of releases. If
you mouse over any link a chart
appears so that you can visualize the
percentage of that group within the
broader category. Here you see that
bacterial structures account for
about 39 percent of the Taxonomy
category.
Clicking on a link will take you to a
new webpage containing only that
subset of results. From there you
can continue to narrow down your
results because the Query
Refinements section will be there,
as well. These data distribution
summaries are available for drilling
down the entire contents of the
RCSB PDB, as we briefly
mentioned in the introduction
section, and are also available from
search results, as we will see in an
upcoming section of the tutorial.
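The idea behind a data distribution summary, and behind drilling down into one of its groups, can be sketched in Python as follows. The taxonomy labels below are hypothetical sample data made up for this illustration; they are not taken from an actual weekly release.

```python
from collections import Counter

# Hypothetical taxonomy labels for a small batch of released entries.
taxonomy = ["Bacteria", "Eukaryota", "Bacteria", "Viruses", "Eukaryota",
            "Bacteria", "Eukaryota", "Archaea", "Eukaryota", "Bacteria"]

# Distribution summary: the share of each group within the Taxonomy category.
counts = Counter(taxonomy)
total = sum(counts.values())
shares = {group: 100 * n / total for group, n in counts.items()}

for group, share in sorted(shares.items(), key=lambda item: -item[1]):
    print(f"{group}: {share:.0f}%")

# Drilling down keeps only the entries that belong to the chosen group.
bacterial_only = [label for label in taxonomy if label == "Bacteria"]
```

The subset that remains after drilling down can be summarized again in the same way, which is what lets you keep narrowing your results.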
Additional Query Refinement
options include the “Refine Query”
link that takes you to the Advanced
Search form, and the “Remove
Similar” option, which lets you
filter your results by sequence
similarity. Just open the dropdown
menu and make your selection.
Click here to show or hide the
Query Refinements section.
Slide 16
Here you see the results for the
latest releases with the Query
Refinements section hidden. Let’s
now focus on the list of latest
releases.
Latest Release Search
Left hand
Search
menu
Click PDB ID,
structure or title to go
to Structure Summary
Jump to a
results page
At the top the results are categorized
for you into query results tabs. We
are looking at the structure hits by
default, but you can also look at the
associated citations and ligand hits.
There is an option to check all or
none of the checkboxes beside the
records. There are also
Display/Download options, options
for generating reports, as well as
sorting and page length options and
the ability to jump to a particular
results page if you want.
Clicking on a PDB ID, thumbnail
structure or title takes you to the
Structure Summary page for the
entry.
Slide 17
Clicking on New Structure Papers
from the left hand menu enables you
to view the primary citations
associated with the newest PDB
structures, with links to PubMed.
New Structure Papers Search
Left hand
Search
menu
To related
structures & articles
To searchable PubMed
abstract & NCBI PubMed
abstract
In the top query results tabs you can
see that we are looking at the
citations page by default, but you
can change this to view the PDB
structures, chemical components
and ligands, Gene Ontology (GO)
annotations, or SCOP or CATH hits
found in the latest PDB release.
There are again options for
displaying, downloading, sorting and
jumping.
Next you can see the list of
citations, including the title, authors
and PubMed ID. This is followed by
links to a searchable PubMed
abstract from the structure summary
page, and then links to the NCBI
PubMed abstracts. You can also link
to related structures and articles
from here.
Slide 18
Clicking on this “Sequence Search”
takes you to the Advanced Search
option for sequences. You can enter
either a PDB ID and the associated
chain ID, or paste a sequence into
the sequence input box. You can
search using a protein or nucleotide
sequence.
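If you prepare sequences in a script before pasting them in, a quick check of whether a sequence looks like nucleotide or protein can help you catch mix-ups. The Python sketch below is a simple heuristic written for this illustration, with a hypothetical function name; it is not how the RCSB PDB search form decides, and a short peptide made only of A, C, G, T and N residues would be misclassified.

```python
NUCLEOTIDE_LETTERS = set("ACGTUN")

def guess_sequence_type(sequence):
    """Heuristic: call it nucleotide if every letter is one of A, C, G, T, U, or N."""
    # Ignore whitespace, digits, and other non-letter characters from pasted text.
    letters = {ch for ch in sequence.upper() if ch.isalpha()}
    if not letters:
        raise ValueError("no sequence letters found")
    return "nucleotide" if letters <= NUCLEOTIDE_LETTERS else "protein"

print(guess_sequence_type("ATGGCGTAA"))
print(guess_sequence_type("MAEAQNDPLL"))
```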
Sequence Search
Left hand
Search
menu
Input PDB ID &
chain ID or paste
sequence
Search methods

BLAST vs. FASTA: See RCSB PDB Help or OpenHelix tutorials
There are multiple search
algorithms you can use – BLAST,
FASTA, or PSI-BLAST. A
comparison of these methodologies
is beyond the scope of this
introductory tutorial, but the help
contains references for these. In
addition, OpenHelix has tutorials
devoted to these search tools.
Slide 19
The Chemical Components search
lets you search the Chemical
Component Dictionary of RCSB
PDB entries.
Chemical Components Search
Left hand
Search
menu
There are three different search
options that can be accessed by
these tabs. Currently, we are
viewing the Structure search page
which can be used to search with the
chemical structure of a ligand.
In addition, you can use the
Name/Identifier search interface to
base your search on a chemical ID,
InChI descriptor, or the name of a
chemical component. The
Formula/Weight tab provides a
chemical formula or formula
expression search, a molecular
weight search, or a combination.
Another browser for the Chemical
Component Dictionary, Ligand
Expo, is also available by selecting
the links on the lower section of the
Chemical Components search page.
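To see how a chemical formula relates to a molecular weight, which is what the Formula/Weight tab lets you search on, here is a minimal Python sketch. It handles only flat formulas with no brackets or charges, and the spaced formula style used here is an assumption made for readability. The example is L-arabinose, the sugar bound by AraC, whose formula is C5H10O5.

```python
import re

# Standard atomic masses (g/mol) for a few common elements.
ATOMIC_MASS = {"H": 1.008, "C": 12.011, "N": 14.007, "O": 15.999, "S": 32.06, "P": 30.974}

def molecular_weight(formula):
    """Sum atomic masses for a flat formula such as 'C5 H10 O5' (no brackets or charges)."""
    total = 0.0
    # Each match is an element symbol followed by an optional count (a missing count means 1).
    for symbol, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        total += ATOMIC_MASS[symbol] * (int(count) if count else 1)
    return total

arabinose = molecular_weight("C5 H10 O5")  # L-arabinose
print(round(arabinose, 2))
```

The result is about 150.13 g/mol. An element missing from the small table raises a KeyError, which is a reasonable signal that the table needs extending.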
Slide 20
Unreleased Entries Search
Left hand
Search
menu


Find the
current
status of a
structure
On hold,
processing,
withdrawn,
& more
Searching Unreleased Entries allows
you to find out the current status of
a structure. A structure may be on
hold awaiting publication of the
paper, for example, or may still be
in processing by the wwPDB staff.
You may be wondering why you
would need to know. Well, this
search is most useful if you wish to
write a grant to elucidate the
structure of a protein for example.
Here is where you can find out if it
has already been determined and is
soon to be released.
Structures that are submitted to the
PDB often remain on hold for a few
months or even a year before they
are released. This is done mostly to
protect the intellectual property of
the authors. When authors submit a
structure to the PDB, they have the
option to ask the PDB not to release
that structure until the paper is
published or the patent is filed. Most
journals require authors to deposit
the structure to the PDB before
submitting a manuscript – and to
release it when the paper is
published.
Slide 21
Browse Database or Search Tree
Left hand
Search
menu
Search here

Click
Mouse over

Expand


Expand
more
Several
ways to
browse
Mouse
over &
click to
access
Expand &
browse
Search
Tree
Selecting the Browse Database
provides many options that allow
you to see the PDB structures
associated with particular terms or
classification groups. The first three
tabs, including the default selection
we are currently viewing, are the
three Gene Ontology, or GO,
category browsers for Biological
Process, Cell Component and
Molecular Function. Many of you
may already know that GO terms
are controlled standardized
vocabulary terms that describe gene
products uniformly. Additional tabs
offer other categories of information
that you can browse through as well.
Note that data from external
resources, such as the Gene
Ontology Consortium shown here,
are highlighted in orange.
Browsing through each one of these
different categories will vary a bit
based on their content, but you will
always be provided with a
description and instructions. I will
just show you a general example of
your options on this page. First of
all you can simply mouse over any
heading to see the number of
associated structures. And you can
click on any headings to access that
particular list of structures.
Secondly, you could just toggle
open a category if you have one in
mind.
In addition, searching through the
tree is possible using this search
box. Enter your search term into the
search box, and click on “Find in
Tree”. The Ontology tree will move
to where your term is displayed.
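Conceptually, "Find in Tree" walks down the tree until it reaches a heading that contains your term. The Python sketch below shows that idea with a depth-first search over a tiny, hypothetical fragment of a term tree. It is an illustration only and says nothing about how the RCSB PDB browser is actually implemented.

```python
# Hypothetical fragment of a term tree; the real browser holds far more terms.
tree = {
    "biological process": {
        "regulation of biological process": {
            "regulation of transcription": {},
        },
        "metabolic process": {},
    },
}

def find_in_tree(node, term, path=()):
    """Depth-first search returning the path of headings down to the first matching term."""
    for name, children in node.items():
        here = path + (name,)
        if term.lower() in name.lower():
            return here
        found = find_in_tree(children, term, here)
        if found:
            return found
    return None

print(find_in_tree(tree, "transcription"))
```

The returned path lists every heading that has to be expanded to reach the term, which mirrors how the tree opens to where your term is displayed.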
The last link in the Search menu,
entitled Histograms, allows you to
select histograms detailing PDB
entries by categories – we will leave
these statistics for you to explore on
your own.
Slide 22
[End of Basic Searching &
Browsing] As you have seen, RCSB
PDB offers a searching or browsing
method for every kind of structural
biology research project.
RCSB Protein Data Bank Agenda

Introduction & Credits


Basic Searching & Browsing
Result Options

Structure Summary Page


Advanced Searching
Tools & Education

Summary

Exercises
RCSB PDB: www.pdb.org
Slide 23
Let’s search for the phrase
“transcription factor”, and then
examine the results. I will use
quotes so that the exact phrase is
searched on. Clicking the Search
button will submit the query, as we
have seen before.
Many Result Options Available
Search
www.pdb.org

Our query will be: “transcription factor”
[Beginning of Result Options] Let’s
look at getting the most out of the
results by familiarizing ourselves
with some of the options available.
Slide 24
Understanding Your Results & Options
MyPDB
login &
registration
Here are our search results for the
phrase “transcription factor”. I will
enlarge the left hand navigation
menu so you can see it better. Here
it shows how many results our
query has returned, and these are
also shown in the top bar in the
structure hits tab. We are currently
viewing the Query Results but if we
navigated away from our results, to
the homepage for example, this
menu would still appear in the left
navigation area and clicking on the
Query Results would bring up the
results again.
The next link shows the Query
Details, including your query in
XML format. The Query History
presents the past queries you have
done during this web session. If you
have a past query that you would
like to change slightly, then you
could use the Refine/Modify option
in the Query History. This will take
you to an advanced search form that
we will discuss soon.
The selection here allows you to
save your query to MyPDB. This
feature allows you to save queries
that you may want to repeat. They
can even be repeated during each
weekly update, and emails about the
latest structures matching your
queries will be sent to you. This is a
free service the RCSB PDB offers.
You simply register here or login if
you already have set up an account.
The Query History form has an
option to save your query to
MyPDB too. In addition, there is a
MyPDB Login at the top of the
RCSB PDB pages. If you would like
more information about this service
just follow the links to read more.
Here you can find details about
what type of queries can be run on a
scheduled basis, for example.
Slide 25
Upper Tabs & More Options
Results
breakdown
You may also
have these tabs
Next slide
Let’s focus back on the query
results page. The upper tabs show
the breakdown of results into
different categories. By default you
will be taken to the structure results
(as indicated by the white tab), but
you can also view the other results
associated with your query by
selecting different tabs.
Below that are some more of the
result browser options. These
functions are particularly useful for
large datasets. First of all, the
checkbox allows you to select all or
none of the structures from your list.
Of course you can simply go down
your list and check any of the
individual structures you are
interested in as well.
To the right of the checkbox is the
Display/Download dropdown menu.
The first item in this menu is the
“View IDs Selected” option that
lets you view a list of the PDB IDs.
Here you see a screenshot showing
a partial list of the IDs. The next
option allows you to display only
the records you have selected. If you
click any or all records, and select
this option, they will be shown in a
new window. The last menu item
provides downloading options.
Selecting it takes you to an
extensive page of downloading
options that we will see in the next
slide.
Slide 26
Downloading
Structure
Download
PDB IDs
entered
FASTA file
Download
PDB IDs
entered
Download
Services
This is the page you will see by
selecting the “Download Selected”
option. You can download
coordinates and/or experimental
data for one or many structures
listed here. This box is already
filled in with the IDs of the
structures that were checked on the
results page. Below it you can
select the type of download you
want and whether you want an
uncompressed file or a compressed
file. The default settings are set for a
gzipped coordinate file in mmCIF
format.
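If you would rather script this step than click through the form, here is a minimal Python sketch that builds download links for a list of checked PDB IDs. The base URL, the file-naming pattern, and the function name build_download_urls are assumptions made for illustration and are not taken from this tutorial, so check the RCSB PDB File Download Services documentation for the current addresses before relying on them.

```python
# Assumed base URL and naming pattern; not taken from the tutorial.
BASE_URL = "https://files.rcsb.org/download"

# File extensions for the two coordinate formats (assumed mapping).
EXTENSIONS = {"mmcif": "cif", "pdb": "pdb"}


def build_download_urls(pdb_ids, file_format="mmcif", compressed=True):
    """Return one download URL per PDB ID, using the assumed URL pattern."""
    ext = EXTENSIONS[file_format]
    suffix = ".gz" if compressed else ""
    urls = []
    for pdb_id in pdb_ids:
        cleaned = pdb_id.strip().upper()
        if len(cleaned) != 4:
            raise ValueError(f"not a four-character PDB ID: {pdb_id!r}")
        urls.append(f"{BASE_URL}/{cleaned}.{ext}{suffix}")
    return urls


# Mirrors the form's choices: format and compressed vs. uncompressed.
print(build_download_urls(["2arc"]))
```

The two keyword arguments correspond to the choices on the form: the type of file, and whether it is compressed.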
FASTA formatted files with the
sequences of the structures can be
downloaded from this section.
Below it is a list of the RCSB PDB
File Download Services.
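Once you have a FASTA file, reading it in a script is straightforward. Here is a minimal Python sketch of a FASTA parser. The header text and sequences in the example are made up for illustration, and the exact header layout of files from the RCSB PDB may differ.

```python
def parse_fasta(text):
    """Parse FASTA-formatted text into a dict mapping header -> sequence."""
    records = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new record starts; keep the header without the ">" sign.
            header = line[1:]
            records[header] = ""
        elif header is not None:
            # Sequence lines may wrap, so join them together.
            records[header] += line
    return records


# Hypothetical two-chain file; headers and sequences are invented.
example = ">ENTRY1_A hypothetical chain\nMKTAYIAK\nQRQISFVK\n>ENTRY1_B hypothetical chain\nGSHMLE\n"
for name, sequence in parse_fasta(example).items():
    print(name, len(sequence))
```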
Slide 27
Report Generation
Image Collage
Customizable Table
Experimental
Let’s return to our results page
for our transcription factor query to
look at the rest of the options. A
dropdown menu allows you to
generate reports. Image Collage lets
you choose from low, medium or
high resolution image collages. Here
is an example of the medium
resolution collage, which shows the
images from all the results we
selected.
With the Customizable Table option
you can choose which result items
to include in the table from a long
list of selections. The table can be
exported into spreadsheet tools, or
downloaded as a CSV file.
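A CSV download is easy to work with outside a spreadsheet too. Here is a minimal Python sketch that reads such a table and keeps the entries at or below a chosen resolution value. The column names, IDs, and values are all hypothetical; the real columns depend on the items you choose for your table.

```python
import csv
import io

# Hypothetical export; column names and values are made up for illustration.
EXAMPLE_CSV = """PDB ID,Resolution,Experimental Method
ENTRY1,2.9,X-RAY DIFFRACTION
ENTRY2,1.8,X-RAY DIFFRACTION
ENTRY3,,SOLUTION NMR
"""


def ids_at_or_below(csv_text, max_resolution):
    """Return the IDs whose resolution value is at or below max_resolution."""
    selected = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        value = row["Resolution"].strip()
        # Rows without a resolution value are skipped.
        if value and float(value) <= max_resolution:
            selected.append(row["PDB ID"])
    return selected


print(ids_at_or_below(EXAMPLE_CSV, 2.0))
```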
There are also detailed
Experimental Reports available,
provided as tables.
Slide 28
Sorting, Displaying & More Result Options
Sorting &
Display menus
ID, Title
& options
download
structure visualization
view PDB file
Again you have sorting and display
menus on the top right hand corner
of the results page. You can sort
your results by PDB IDs, Release
Date, Residue Count (meaning
length of protein) or Resolution.
The default sorting is by Release
Date with the most recently released
entry shown first.
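If you have exported your results, you can reproduce this ordering yourself. Here is a minimal Python sketch of the default sort, with the most recent release date first. The records are hypothetical, and sorting the other fields in ascending order is my assumption, not something the site documents here.

```python
from datetime import date

# Hypothetical result records; IDs, dates, and values are made up.
results = [
    {"pdb_id": "ENTRY1", "release_date": date(2009, 5, 12), "resolution": 2.9},
    {"pdb_id": "ENTRY2", "release_date": date(2011, 1, 4), "resolution": 1.8},
    {"pdb_id": "ENTRY3", "release_date": date(2010, 8, 30), "resolution": 2.2},
]


def sort_results(records, key="release_date"):
    """Sort newest first for release_date; ascending for any other key (assumed)."""
    newest_first = key == "release_date"
    return sorted(records, key=lambda record: record[key], reverse=newest_first)


print([record["pdb_id"] for record in sort_results(results)])
print([record["pdb_id"] for record in sort_results(results, key="resolution")])
```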
Let’s look more closely at the
entries on a typical results page. To
the left is the box you click in to
select this accession, the PDB ID
number, and the title of the
structure. I will enlarge this so you
can see the icons below the PDB ID
number. These icons let you
download the file, view the PDB
file, and view the structure using
Jmol, respectively.
Slide 29
Basic Entry Information & Access to More
Click ID,
structure, or
title to access
structure
summary page
Authors,
Release Date,
Classification,
Experiment,
Compound &
Citation
Display full polymer or ligand
details for this entry or all results
Mouse over thumbnail
The query results browser provides
Authors, Release Date,
Classification, Experiment,
Compound and Citation information
for each entry. If you click on an
author name, a list opens with all
PDB structures they have submitted.
This is followed by the Release
Date. A wider functional
classification follows, in this case
transcription, and then the
Experiment. Here we see that this
experiment was X-ray diffraction,
and the resolution achieved was 2.9
Angstroms. The lower the value in
Angstroms, the better and more
detailed the structure. Structures
with higher resolution values are
considered to be “low resolution”,
meaning less precise.
Next is a summary of compounds
found in this entry. The options next
to the compounds let you display
full polymer or ligand details for
this entry, or for all of your results.
Here I have opened the polymer
details. There are three polymers in
this structure. The annotation in this
section will vary--as you might
expect--depending on the
availability of pertinent information,
so it is worth paying close attention
to. You may see EC numbers if the
structure is an enzyme, whether or
not a mutation is present (as we do
here for polymer one), if a catalytic
domain is contained within the
polymer, and many more important
structural details. If a citation is
available it will be listed in the
following section.
Below the PDB ID is a thumbnail
image of this structure. If you
mouse over it you can see it
enlarged.
If you wish to view this result in
more detail, you can click on the
PDB ID, structure or title to access
the structure summary page of the
entry you are interested in. Structure
summary pages will be the focus of
the next section.
Slide 30
[End of Result Options] We have
now finished our review of the
basics of the results pages.
[Beginning of Structure Summary
Page] In this section we will focus
on the structure summary page and
look at some of the tools you can
use to visualize your structures.
Slide 31
Structure Summary Page






One for every structure
Organized similarly
It provides access to a
wealth of data
Customizable with a
widget framework
Re-arranged sections
in red
Use Reset Layout to
return to default
For this section we will use the E.
coli AraC page for our example of a
typical page. Perform a quick
search for 2ARC and view the
summary page shown.
Enter 2arc
2arc
Report tabs
Title & ID
Primary Citation
Menus
Molecular Description
Images & Viewing Options
Source
Ligand Component
Related Entries
MyPDB
Experiment Details
Deposition Summary
External Data (orange)
Reset Layout
Each of the structures in the PDB
has a structure summary page
organized like this one. This, if you
like, is the homepage of this
structure, with a wealth of data
available within it. We will go over
each section in the upcoming slides,
but first let me provide a brief
overview of the organization.
Starting at the top there are report
tabs taking you to detailed reports
on a variety of subjects, followed by
the title and ID. To the right are
options for downloading and more.
Then you find citation information,
molecular descriptions, images and
different interactive viewing
options, source data, MyPDB,
related PDB entries, a deposition
summary, a ligand chemical
component section (if the structure
you are viewing contains a ligand),
experimental details, and data from
external sources (denoted by
orange), such as ligand and domain
annotations, and Structural Biology
Knowledgebase data.
To the left you will always have
access to the RCSB PDB navigation
menus that we have previously
examined. Not only are these left
navigation menus customizable, as
we have seen previously, but the
structure summary page is as well.
Individual sections on the structure
summary page can be hidden or rearranged by either selecting the hide
buttons or dragging on the arrows
you find in the blue navigation
areas. Here is an example. The
sections I re-arranged are shown in
red. It is certainly easy to imagine
how valuable this can be – simply
put the information you need the
most in the position you find to be
the most convenient.
Cookies are used to store your
preferences and they will remain for
any structure summary page you
select, until you click on “Reset
Layout” at the bottom of the page.
Slide 32
Structure Summary Page - Top
Reports
Title, ID
Citation
AraC AND L-arabinose
Let’s now take a closer look at the
top of a typical structure summary
page. Notice that from here you can
also start another search – this main
search toolbar remains at the top of
all web pages within the RCSB
PDB site. Another nice feature is the
structure explorer in the left hand
navigation bar. The last structure
you viewed will appear here so that
you can easily return to it.
The information found in the first
part of the Structure Summary page
is very straightforward. It all starts
with the title and PDB ID. We will
look more closely at the report tabs
shown here in a minute. To the right
are links to display files. When you
click on them, a dropdown menu
opens with choices. The links for
downloading files are next, and here
is the menu of selections. After that
you have options to share this page
via a social bookmarking
aggregator.
Next you have the Primary Citation
section including the article title, the
names of the authors and a link to
the PubMed Abstract. Click here to
search for related articles. If you
select the “Read More & Search
PubMed Abstracts” link, an expanded
view of the citation section will
appear. Here I am only showing a
portion of the bottom of it which
includes a neat search box. Any
word within the abstract or keyword
can be selected by clicking on it and
it will be added to this box. The
query will use these keywords to
search for all other PDB structures
with the same terms in their
PubMed abstracts.
Slide 33
Structure Summary Page - Top, Part 2
“Looking at Structures”
Education section
Asymmetric Unit options:
Toggle
through
views
Larger
image
Click here
Under the Primary Citation section
is the Molecular Description section
containing the general classification
(here, Transcription Factor) and the
weight, length, polymer, chain and
fragment details.
To the right is an image of the
structure and some visualization
choices. The Biological Assembly
image shows how this protein could
look in its quaternary, or active,
form. In this case, AraC is a dimer.
Clicking on these buttons allows
you to toggle through views of the
biological assembly and of the
asymmetric unit. The asymmetric
unit contains the unique part of a
crystal structure, and is used by
crystallographers to refine the
coordinates of the structure against
the experimental data. To learn
more about asymmetric units and
biological assemblies, select the
help button in this window. It will
take you to the RCSB PDB’s online
resource called “Looking at
Structures: Introduction to
Biological Assemblies and the PDB
Archive”.
Many viewers are available to you
in the section directly below the
image. Here you see the options for
the Biological Assembly view. The
viewing options vary depending on
whether you are interested in the
Biological Assembly or the
Asymmetric Unit. The Asymmetric
Unit options include KiNG, Jmol
and WebMol Viewers, all of which
require a Java applet to be loaded
onto your computer before you can
view the structure. All do so more
or less automatically with minimum
fuss, and all allow a certain degree
of image manipulation such as
adding some visualization
preferences or moving the molecule
around in three-D. Some viewers
are more geared towards specific
needs, such as Ligand Explorer, for
example. This viewer lets you select
which specific ligand to focus on
when several are present in the same
structure – we will see this viewer
soon. Which viewer you choose
depends on personal or institutional
preferences, as well as on how much
you will need to manipulate the
image once you have loaded it. A
complete discussion of all of these
viewers and their capabilities is
beyond the scope of this tutorial.
Click on the plus sign to see a larger
view of this image.
Next I will show you a couple of
quick examples of some of the
images you can see. Let’s click here
to view the biological assembly with
Jmol.
Slide 34
Here is an example of the biological
assembly viewed with Jmol. You
have many visualization options at
your fingertips with this viewer.
You can easily take measurements
of angles and distances, rotate and
zoom in on your molecule just by
using your mouse.
Jmol Viewer
Right click
Menus to
easily change
coloring &
display styles
Scripting box
Help tab
Mouse over
for info &
click to
display
http://jmol.sourceforge.net/
Menu options in this section allow
you to change the coloring and
display styles of the image. Here
you can also reset the display and
export the image in several formats.
If you right click, a series of
detailed cascading menus appears
providing you with more options.
The script box allows users familiar
with Jmol scripting to customize
their displays even more.
Interactive Jmol scripting
documentation, and several other
Help menus for Jmol can be
accessed by selecting this tab.
Currently, we are viewing the
Annotations tab that shows domain
assignments. If you mouse over the
domain assignments more
information is provided. Also,
clicking on the domain assignments
displays them in Jmol.
Jmol is not developed by the RCSB
PDB. It is open source software –
available to everyone at the address
shown.
Slide 35
Structure Summary Page - Bottom
MyPDB
Source
Deposition
Summary
Related
PDB Entries
Ligand
Component
Experimental
Details
External
Annotations
To move
sections to
customize
Let’s return to the structure
summary page now to look at the
lower half of it. The Source section
describes the species that this
protein is naturally found in, and the
expression system that was used to
clone it for protein isolation.
Return to
default
Related PDB entries are located in
this section. Next is the Ligand
Chemical Component section. This
is present only for structures which
contain a ligand and it presents
ligand-specific links and viewing
options; we will explore this on the
next slide.
After that there are several External
Annotation sections shown in
orange. Again, external annotations
display information from resources
outside of the RCSB PDB, such as
ligand annotations from BindingDB
and BindingMOAD databases, and
models and protein targets from the
Structural Biology Knowledgebase.
Here is the MyPDB Personal
Annotations widget. As we
discussed earlier, this allows you to
save personal annotations, add
structures to your favorites list, and
access saved information from a
summary page in your MyPDB
account.
In the Deposition Summary section
you find the author names and
deposition, release and modification
dates for the structure. The
Experimental Details section
contains the experimental method
that was used to resolve the
structure, and the main experimental
parameters such as the resolution
and unit cell parameters for crystal
structures.
As with other displays, you can use
these controls to customize the
layout of this page. To move a
section of this page, drag it by the
double-headed arrows in the dark
blue title areas. And here again is
the “Reset Layout” link to return the
page to its default settings.
Next we will explore the Ligand
Chemical Component section more
closely.
Slide 36
Ligand Chemical Component - Ligand Explorer
And again, a reminder that the
Ligand Chemical Component
section is present on structure
summary pages of only those
structures containing ligands. Our
structure is the protein AraC
complexed with L-arabinose, and so
we have additional information
regarding this ligand. The structure
summary page for the apo, or
unbound, form of AraC (PDB ID
2ara) would not contain this
section, however.
If you mouse over the linked ARA
you can view a thumbnail of the
structure of this ligand. Clicking on
the linked ARA takes you to RCSB
PDB’s ligand summary page.
This Search link lets you find other
structures that contain ARA as a
free ligand, and you can click here
to download the ARA ligand file.
Interaction images generated by
PoseView software can be seen by
clicking on this thumbnail image.
Selecting the Ligand Explorer
viewer is a nice way to really focus
in on the ligand-containing part of
your structure. Here you see a
screenshot of this viewer and some
of the options in the upper and
lower left hand panels. Next I will
choose to view bridged hydrogen
bonds, and this is only one of the
many choices you can make. An
additional feature of this viewer is
that it allows you to select a specific
ligand if several are present. If the
structure you are interested in
contains ligands, then do take the
time to look at it with the Ligand
Explorer viewer.
Slide 37
Sequence Report
Graphic of
secondary
structures
Choose domain
assignments or
annotations from
dropdown menu
Specify how you
want to view
graphic
Set viewing preferences
Results
Tabs
Select more
annotations
References below
Here are the tabs that are across the
top of the structure summary page
that allow you to access more
specific results. We have been
examining the summary tab, but
now we are looking at the first of
the detailed report tabs, the
Sequence and Structure Details
section. This section does a lot more
than simply give you the sequence
of the protein. It has a beautiful
graphic showing you where the
secondary structures are with
respect to the sequence, and you can
also utilize this “More annotations”
menu to choose domain assignments
or secondary structure annotations.
At the bottom of the page there is a
preferences section that allows you
to specify how you would like to
view the graphic in terms of three
dimensional and page parameters.
Once you have made your selections
you simply click on “Submit” and
the sequence will be shown
according to your preferences. You
will also find references in the lower
section that is not shown in this
cropped screenshot.
Slide 38
Annotations Report
Database
links
SCOP
CATH
PFAM
SBKB
Customize
The next tab takes you to the
Annotations section. The sections
present the SCOP, CATH and
PFAM classifications, and
Structural Biology Knowledgebase
data for the AraC structure,
similar to what we saw on the
structure summary page in the
external resources section.
After the titles there are links to all
of the databases. The top of this
page also provides the same options
we have seen before to display,
download and share this page. All of
the report pages offer these options
in the same upper right hand corner.
The data on this page are
highlighted in orange to indicate
that these are data from other
sources, as we have also seen
previously. The links in blue can be
used to conduct an RCSB PDB
search using those keywords, and
take you to a query results page
containing the structures that
correspond to that keyword phrase.
In effect, this gives you the
structural homologs of this protein.
Here is what you would see if you
clicked on the jelly rolls topology
link in the CATH classification
section. There are many structures
for you to explore.
This is another page that can be
customized using the arrows or Hide
buttons. Not all data pages have this
feature, but if the arrows and
buttons are present, then you can
customize that particular page.
Slide 39
Next is the Sequence Similarity
page, which provides clusters (or
groups) of structures at different
levels of sequence similarity. This
page allows you easy access to
related structures by simply
browsing through the clusters at
different similarity thresholds.
Sequence Similarity Report
The clusters are based on a weekly
BLAST analysis of all proteins with
more than 20 amino acids in the
PDB. Many PDB entries contain
several chains, so the sequence
similarity is defined on a chain-by-chain basis, with the results returned
for the entire structure. Detailed
documentation is available here.
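To make the chain-by-chain idea concrete, here is a minimal Python sketch. It takes a table of cluster assignments per chain and returns every entry that shares a cluster with any chain of the query entry. The entry IDs, chain IDs, and cluster numbers are hypothetical, and this is only an illustration of the idea, not the RCSB PDB's actual implementation.

```python
from collections import defaultdict


def related_entries(query_id, chain_clusters):
    """Return entries sharing a sequence cluster with any chain of query_id.

    chain_clusters maps (entry_id, chain_id) -> cluster number.
    """
    # Group entries by cluster, ignoring which chain put them there.
    members = defaultdict(set)
    for (entry_id, _chain_id), cluster in chain_clusters.items():
        members[cluster].add(entry_id)

    # Collect the members of every cluster that a query chain belongs to.
    related = set()
    for (entry_id, _chain_id), cluster in chain_clusters.items():
        if entry_id == query_id:
            related |= members[cluster]
    related.discard(query_id)
    return related


# Hypothetical assignments at a single similarity threshold.
chain_clusters = {
    ("ENTRY1", "A"): 1,
    ("ENTRY1", "B"): 2,
    ("ENTRY2", "A"): 1,
    ("ENTRY3", "A"): 2,
    ("ENTRY4", "A"): 3,
}
print(sorted(related_entries("ENTRY1", chain_clusters)))
```

Clustering is done per chain, but the answer is a set of whole entries, which is why ENTRY2 and ENTRY3 are both returned even though each matches a different chain of ENTRY1.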
This shows some of the current
statistics on redundancy you find
when following the documentation
link. There is also much more
information here if you need it.
Below you find the sequence
clusters, including the similarity
cutoffs, ranks, the number of chains
in the cluster and the cluster
number. Here is an example of the
details for the 95% similarity
cluster. Much more information is
available that you can pursue in
many directions.
Slide 40
The 3D Similarity report presents
structural similarities found using
the jFATCAT-rigid algorithm. A
40% sequence identity clustering is
applied to reduce the number of
results. Ranked results are shown in
this table with details and scores,
and the legend is shown to the right.
3D Similarity Report
Scroll to
Download &
Help sections
Clicking on “view” will take you to
structure alignment details
presenting a summary of the
alignment results, a Jmol view of
the alignment, and a textual
representation of the alignment.
Scrolling further down this page
will take you to downloading and
help sections.
Slide 41
Literature Report
Scroll to
Here we are viewing the new
literature report. It summarizes the
articles that are related to our
structure. The top Primary citation
section is the same as what we saw
previously on the structure summary
page so we won’t review it again
here. Below it are the MeSH, or
Medical Subject Heading, terms.
Clicking on a term will query for
structures with that particular term
in their abstract as we also have
seen previously.
If you scroll down the literature
page a little further you will see
some additional information
provided by a collaboration with
BioLit. As a part of this
collaboration, PubMed Central is
searched for any articles containing
PDB IDs. Here you see the PubMed
Central articles found to contain
2ARC. The corresponding abstracts,
figures and legends are shown for
our structure, with links to any other
PDB IDs that are also included in
these articles.
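To give a feel for what searching article text for PDB IDs involves, here is a minimal Python sketch. It relies on the classic PDB ID format of a digit from 1 to 9 followed by three letters or digits. This is my own illustration and not BioLit's actual method. Note that a pattern like this also matches things that are not PDB IDs, such as years, so the hits are only candidates to check against the archive.

```python
import re

# Classic PDB ID shape: one digit (1-9) followed by three letters or digits.
PDB_ID_PATTERN = re.compile(r"\b[1-9][A-Za-z0-9]{3}\b")


def find_pdb_id_candidates(text):
    """Return candidate PDB IDs in order of first appearance, upper-cased."""
    seen = []
    for match in PDB_ID_PATTERN.findall(text):
        candidate = match.upper()
        if candidate not in seen:
            seen.append(candidate)
    return seen


sentence = "AraC (PDB ID 2ARC) was compared with the apo form 2ara."
print(find_pdb_id_candidates(sentence))
```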
Slide 42
The Biology and Chemistry tab
provides information about protein
and nucleic acid chains, and ligands
in a structure.
Biology & Chemistry Report
Help
Help
Help
Mouse over
for details
Here in the lower section of this
page, under the Gene Details
section, is the name of the organism
in which this protein is found,
and other molecular biology
experimental details. You can
always get help with the
terminology used by mousing over
many of the terms to read a
definition. And of course you see
the question marks indicating that
help is only a click away.
Slide 43
Methods Report




Precise
experimental
details
Organized
similarly
Mouse over
to read
definitions
Sections
depend on
methodology
In the Materials and Methods
section are details of the
experiments that were conducted.
Here I have only shown the top part
of this webpage as an example, but
it is organized similarly to the other
report pages we have just seen. And
you also have similar options; for
example you can mouse over terms
to see detailed definitions.
The top section reviews the
crystallization conditions and
methods, information about the
crystal, and diffraction details.
Further down on this page you can
find sections on refinement, and
software and computing. Of course
what you find on this page will
depend on what structure you are
looking at. Structures solved by
NMR will have information
pertaining to that type of
experiment.
Slide 44
Geometry Report
Top
Details of bond
lengths & angles
Middle
Lower
Color-coded
For structural biologists and
biochemists interested in detailed
information about the bond lengths
and angles between atoms in this
structure, RCSB PDB has a
webpage dedicated to geometry.
The data are available either in table
format or as interactive graphs, and
here you see only the top of the
page with the available options and
the beginning of the bond length
data. The other sections are
organized similarly and all data are
color-coded based on fold deviation
scores.
Here you can see the middle of the
page with the bond angle data.
Lower down on this page is a table
of dihedral angle information too.
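If you are curious how bond lengths and bond angles follow from atomic coordinates, here is a minimal Python sketch of the underlying arithmetic. The coordinates are made-up points chosen so that the answers are easy to check; they are not taken from any PDB entry, and this is not the code behind the geometry page.

```python
import math


def bond_length(a, b):
    """Distance between two atoms given as (x, y, z) tuples, in the same units."""
    return math.dist(a, b)


def bond_angle(a, b, c):
    """Angle at atom b, in degrees, formed by the atoms a-b-c."""
    v1 = [p - q for p, q in zip(a, b)]
    v2 = [p - q for p, q in zip(c, b)]
    dot = sum(x * y for x, y in zip(v1, v2))
    cos_theta = dot / (math.hypot(*v1) * math.hypot(*v2))
    # Guard against tiny floating-point overshoots outside [-1, 1].
    cos_theta = max(-1.0, min(1.0, cos_theta))
    return math.degrees(math.acos(cos_theta))


# Made-up coordinates chosen for easy checking.
print(bond_length((0.0, 0.0, 0.0), (3.0, 4.0, 0.0)))
print(bond_angle((1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 1.0, 0.0)))
```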
Slide 45
Links Report
SCOP & CATH
Scroll down to
The Links section provides links to
many structural databases that
contain a large amount of data and
information about the folds and
domains of your protein. Each
database has its own strengths
and displays, and is used for slightly
different purposes. They are nicely
categorized for you. Here you see
databases clustered into groups
entitled structure summary,
structure features and ligand
features. And you can scroll down
the page to find even more, such as
those devoted to secondary
structure, like DSSP.
You will find links to most of the
well-curated structural classification
databases like SCOP and CATH
which we saw previously in the
main body of the structure summary
page. They are in the structure
classification and comparison
section here. Clicking on these links
would open up the webpage relevant
to your protein in that database in a
new window.
Slide 46
[End of Structure Summary Page]
We have now completed our
overview of the structure summary
page.
[Beginning of Advanced Searching]
It is time to move on to a brief
overview of some of the Advanced
Searching options.
Slide 47
Advanced Search Access & Basics
Query Types
Access from homepage top or left side search menu
Now let’s go back and take a
detailed look at the Advanced
Search tool. You can get to the
Advanced Search tool either from
the toolbar on the top or from the
Search menu in the left navigation
area widget.
Advanced Search is a fully
customizable form which allows
you to broaden or narrow your
search according to search
parameters relevant to your project.
For example, you can data mine for
a particular group of proteins that
share some properties, or you can
search using the name of a
particular ligand. These are the
input boxes where you begin
classifying your query. You add or
remove parameters by adding or
removing query boxes, and you add
a box by selecting Add Search
Criteria to the right.
You have several Query types in
each query box you open up. These
refer to the results pages, and the
different components of a results
page. Notice you can scroll down
for more options. From here you can
search for a keyword, a particular
ligand, and you can limit the search
to an Enzyme Classification from
the Biology section, to name just a
few of the many choices.
Let’s look at an example in the next
slide.
Slide 48
Advanced Search - Number of Entities Query
Protein
5
5


Use Result Count to see
how many results you will
obtain
Click on Add Search
Criteria to add a query box
Here I am just showing a screenshot
of the Advanced Search Interface
and not the upper and side
navigation bars so you can see this
more clearly. Let’s say, for this
example, that I want to look at
structures with a certain number of
distinct entities. This is obviously
going to give us very general results
so we will need to learn some
methods to narrow our results. In
the query type dropdown menu I
will select Number of Entities in the
Structure Features section.
Once I make this selection, I get a
variety of options that may narrow
down my search if it is needed.
From the dropdown menu I will
select protein as the entity type, but
you can see there are additional
choices of entity types, as well.
Then I will choose five as the
minimum and maximum number of
entities so that we will retrieve only
structures with five protein entities.
These are distinct entities. If there
are two identical protein chains in a
structure, they are counted together
as a single entity.
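Here is a minimal Python sketch of what counting distinct entities means. It treats chains with identical sequences as one entity, which is my simplification for illustration; in the archive itself, entities are assigned during annotation. The chain IDs and sequences are made up.

```python
def count_distinct_entities(chain_sequences):
    """Count distinct sequences among the chains of one structure.

    chain_sequences maps chain ID -> sequence. Identical sequences
    are counted once (a simplifying assumption).
    """
    return len(set(chain_sequences.values()))


# Hypothetical structure: chains A and B are identical copies.
chains = {"A": "MKTAYIAK", "B": "MKTAYIAK", "C": "GSHMLE"}
print(count_distinct_entities(chains))
```

Three chains are present, but only two distinct entities are counted, because chains A and B are the same molecule.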
In order to get an idea of how large
a set of results you are likely to
obtain, you can click on Result
Count. Clicking on Result Count
shows that many structures have
five distinct protein entities; clearly
we need to narrow the search a bit.
To do this we shall add another
query box.
To add another query box, click on
Add Search Criteria on the far
right.
Slide 49
Advanced Search Methods - Adding Query Boxes
Here is the next query box added.
We will choose “Has Ligands” from
the Chemical Components section
of the Query Type dropdown menu,
and leave the default menu for “Has
Ligands” set at “Yes”.
Slide 50
Adding More
Query Boxes
Macromolecule
Type
Structures
vs. ligands,
start over,
or submit
query
Experimental
Method
Remove Similar &
Match ALL or ANY
And then we will continue to refine
our query by adding additional
query boxes using the Add Search
Criteria selection.
Next we search for a specific
Macromolecule Type. This selection
is in the Structure Features section
of the Query Types dropdown
menu. For the sake of this example,
we shall search for a protein, by
selecting Yes, and we will require
that it does not contain DNA or
RNA, and that it is not a DNA/RNA
hybrid, by selecting No for the last
three options.
For our final subquery, we will
select the Experimental Method
category under the Methods section
of the dropdown menu. I specify
that I only want to look at structures
resolved using X-ray
crystallography, and below it we will
specify that there must be
experimental data, by selecting Yes.
Here is the section allowing you to
select the algorithm to automatically
remove structures whose sequences
are very similar to each other. You
can choose to make use of this
feature by clicking in this box and
setting the threshold of sequence
identity you would like the
algorithm to use – as we have seen
earlier in this tutorial. For our query
we will check the box and select
95% for the homolog removal. Also,
right below this you can choose to
match all or any of the above search
conditions. We have been adding
query boxes with an AND Boolean
function because, by default, this
setting matches all of the
conditions. You could select “any”
from this lower menu instead, and
then you would essentially be using
OR Boolean logic: the AND between
all of the query boxes would be
changed to an OR.
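Here is a minimal Python sketch of the difference between matching all and matching any of the conditions. The criteria mirror three of the query boxes we have added, but the dictionary keys and the example structure are hypothetical; this is not how the RCSB PDB search is implemented.

```python
def matches(structure, criteria, match_all=True):
    """Apply every criterion, then combine with AND (all) or OR (any)."""
    outcomes = [criterion(structure) for criterion in criteria]
    return all(outcomes) if match_all else any(outcomes)


# Criteria mirroring the example query; the keys are hypothetical.
criteria = [
    lambda s: s["protein_entities"] == 5,
    lambda s: s["has_ligands"],
    lambda s: s["method"] == "X-RAY DIFFRACTION",
]

# A made-up structure that satisfies two of the three conditions.
example = {"protein_entities": 5, "has_ligands": False, "method": "X-RAY DIFFRACTION"}
print(matches(example, criteria, match_all=True))
print(matches(example, criteria, match_all=False))
```

With match_all set to True the example structure is rejected, because it has no ligands. With match_all set to False it is accepted, because at least one condition holds.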
The Results dropdown menu in this
section lets you specify whether you
want to search for structures or
ligands. We will leave it at the
default setting to search for
structures. You can also make use of
the Clear All Parameters button if
you want to start again. To start the
search, click on the Submit Query
button. We will see our results in
the next slide.
Slide 51
Advanced Search Results
Our
query
Here are the results you get from the
search parameters we have
specified. Note that our advanced
query is summarized at the top of
the page. We asked for structures
that have five protein entities and
ligands, that contain protein but not
DNA or RNA, and that have
associated X-ray crystallography
experimental data, and we removed
homologs at the 95% sequence
identity threshold from our results
list.
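As a rough illustration of what removing homologs at a 95% identity threshold means, here is a small Python sketch. It is not the algorithm RCSB PDB uses. It assumes the sequences are already aligned to equal length and greedily keeps an entry only if it is below the threshold against every entry kept so far. The entry names and sequences are invented.

```python
def identity(a, b):
    """Fraction of identical positions in two equal-length, pre-aligned sequences."""
    if not a or len(a) != len(b):
        raise ValueError("sequences must be non-empty and of equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def remove_homologs(entries, threshold=0.95):
    """Keep an entry only if it is below the threshold against all kept entries."""
    kept = []
    for name, seq in entries:
        if all(identity(seq, kept_seq) < threshold for _, kept_seq in kept):
            kept.append((name, seq))
    return [name for name, _ in kept]


# Invented 20-residue sequences for illustration.
entries = [
    ("entry_1", "ACDEFGHIKLMNPQRSTVWY"),
    ("entry_2", "ACDEFGHIKLMNPQRSTVWA"),  # 19 of 20 positions match entry_1
    ("entry_3", "GCDEFGHIKLMNPQRSTVWA"),  # 18 of 20 positions match entry_1
]
representatives = remove_homologs(entries, threshold=0.95)
```

Here entry_2 is dropped because it reaches the 95% threshold against entry_1, while entry_3 stays because it is only 90% identical. Lowering the threshold removes more entries and leaves a shorter, more diverse list.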
Several dozen structures matched
these criteria. Now that you are
familiar with some of the methods
and tools RCSB PDB offers for
searching, you should be able to
narrow your search further in
whatever way most effectively
locates structures relevant to your
research.
Slide 52
[End of Advanced Searching] I hope
by this time you have come to
appreciate how much information is
stored in, and can be accessed
through, the RCSB Protein Data
Bank.
[Beginning of Tools & Education]
Now we will briefly look at some of
the additional tools and educational
resources RCSB PDB provides.
Slide 53
Tools & Educational Resources
Click on Compare Structures
www.pdb.org
RCSB PDB provides many tools for
accessing PDB data and educational
resources for learning about
biomolecular structures. They are
easy to access from the left-hand
side of the homepage. I will enlarge
it here. In the tools section you can
access the form to download data
files. We have seen this before, but
if you want direct access to it, then
it is good to know that you can
easily find it right on the homepage.
FTP Services are listed. You can
access the major directories here.
Clicking on File Formats takes you
to a page with extensive information
on the formats used. The following
links take you to information about
the SOAP and RESTful Web
Services, which developers use to
access PDB data remotely from their
own programs. If you select the
“Widgets” link you can learn more
about the web widgets offered.
These are small pieces of code that
add RCSB PDB functions to your
own website or blog. Let’s look at
the comparison tool offered here on
the next slide.
Slide 54
The RCSB PDB Comparison Tool
To Expert mode - vary parameters
Example
Sequence
Read more
Structure
The RCSB PDB Comparison Tool
can be used to calculate pairwise
sequence or structure alignments. If
I open this dropdown menu you will
see that you can choose between a
variety of methods for the
comparisons. The top ones are for
sequence comparisons, and the
bottom ones are for structure
alignments. This tool is also
available as a downloadable web
widget and, in addition, it can be
accessed from sequence similarity
pages. Its functionality is integrated
into the sequence clusters that we
looked at previously.
The “Align custom files” option will
take you to expert mode for this
tool. This will enable you to have
more control over the structure
alignment algorithms via the
parameters menu, and is
recommended only for users
who have a good understanding of
alignment algorithms. If you click
here you can view an example. You
can read more about this tool by
following the links here on this
page.
Methodology references and
acknowledgements are found in the
section below.
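If you have not met pairwise sequence alignment before, the following Python sketch shows the idea with a classic global alignment score (Needleman-Wunsch). This is a textbook method chosen for illustration and is not necessarily one of the methods in the Comparison Tool's menu. The scoring values (match +1, mismatch -1, gap -1) are assumptions made for this example.

```python
def needleman_wunsch_score(a, b, match=1, mismatch=-1, gap=-1):
    """Return the best global alignment score of sequences a and b."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]

    # Aligning a prefix against nothing costs one gap per residue.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap

    # Each cell takes the best of: align two residues, or open a gap in either sequence.
    for i in range(1, rows):
        for j in range(1, cols):
            pair = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + pair,
                score[i - 1][j] + gap,
                score[i][j - 1] + gap,
            )
    return score[-1][-1]


identical = needleman_wunsch_score("GATTACA", "GATTACA")
one_insertion = needleman_wunsch_score("AC", "AGC")
```

Two identical seven-letter sequences score 7, and aligning AC against AGC scores 1 (two matches and one gap). Structure alignment methods work on three-dimensional coordinates rather than letters, so they are not covered by this sketch.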
Slide 55
Let’s now return to the homepage to
take a quick look at educational
resources available from the RCSB
PDB.
Educational Resources
Click on Molecule of the Month
www.pdb.org
The “Understanding PDB Data”
link will take you to an extensive
page of information topics, as you
can see here. The table of contents
shows the subjects covered. It is a
beautifully illustrated and clearly
written document that provides
information helpful to both
beginners and advanced researchers.
Selecting the Educational Resources
link brings you to a large page
highlighting a variety of resources
including a Kiosk Viewer that
displays animations of PDB
structures, posters, tutorials, details
about molecular animation,
classroom activities and lessons,
upcoming events and more.
Next we will take a look at the
Molecule of the Month feature by
clicking here.
Slide 56
The Molecule of the Month resource
uses text, images and interactive
Jmol displays to teach users about
many different types of structures
and their biological significance.
Molecules are grouped into the
broad categories seen here.
Alternatively, if you want to access
a list view of the archive by titles,
dates or categories, then click here.
Clicking on any of the category
icons on this page will bring up a
list of subcategories to choose from.
You can click to learn more about
any of the molecules within these
subcategories. In addition, moving
your mouse over any molecule will
provide more information about it.
Molecule of the Month
View by Title, Date or Category
Click on
Slide 57
More opportunity to engage with the
community of PDB users is
available from the News area on the
homepage.
Community Resources
News & Publications
Access the Newsletter to hear about
new software and features, and to
obtain tips on getting the most out of
the RCSB PDB.
If you become a regular user you
may also want to consider joining
the PDB electronic listserv to keep
up with developments on the site
and in this field. There are also
links to related discussion groups
that may be relevant to PDB users.
We hope that the many resources
from the RCSB PDB will enhance
your knowledge and increase the
effectiveness of your research.
Slide 58
[End of Tools & Education] That
completes our look at some of the
tools and educational resources you
can find in RCSB PDB.
[Beginning of Summary] I will now
give you a brief summary of what
we have seen in this tutorial.
Slide 59
Summary
An Information Portal
Many search options
Special features to keep you current
Lots of tools & services
Customize view
www.pdb.org
Right from the homepage you can
read a wonderful and accurate
description of the RCSB Protein
Data Bank: it is your “Information
Portal to Biological Macromolecular
Structures”. RCSB PDB is an
incredible resource that gives you
access to a wealth of structural
information. It would be impossible
to summarize all that RCSB PDB
offers, or even all that we have
covered in this tutorial, in just a
couple of slides, so I will briefly
highlight a few points here.
RCSB PDB has special features
right on its homepage that can help
you to keep current in structural
biology, including its popular
Molecule of the Month feature. So
many general and specific search
options are available that it is quick
and easy to find all the information
you want. You can search for a
structure using the method that best
suits your needs. And you can
customize the homepage so that the
menus are where they are easiest for
you.
Many tools and services are
available, with new ones constantly
being added. You also have easy access to
these tools and services right from
the homepage navigation areas.
MyPDB is a free service that will
store your favorite queries and send
you weekly results that include the
latest new structures matching your
queries.
Slide 60
A Protein Information Goldmine
Structure Summary pages provide detailed results
Easy ways to visualize & manipulate structures
The structure summary page is the
homepage for each structure in
RCSB PDB. It contains an extensive
amount of information, and also
gives you access to many other
detailed results pages. You can also
find information on sequence details,
methodology and many other
features of the structure.
RCSB PDB also provides you with
many tools to view and manipulate
the structures you want to learn
more about. You can examine the
biological assemblies, as well as the
asymmetric units, with many
different viewers. Here’s an image
of AraC. So, at the click of a mouse,
you can access vast amounts of
information about the structure you
are interested in.
Thank you for taking the time to
view this tutorial!
Slide 61
[End of Summary] That completes
our summary of RCSB PDB.
[Beginning of Exercises] We will
now provide you with exercises that
you can perform to practice all that
you have learned about RCSB PDB.