1957

http://www.pdb.org/

Experimental approaches for structural biology

• X-ray crystallography

• NMR

• cryoEM

cryoEM

Where to get structural data?

• biological molecules

– PDB – Protein Data Bank http://www.pdb.org (free)

– NDB – Nucleic Acid Database http://ndbserver.rutgers.edu/

• organic molecules

– CSD – Cambridge Structural Database (paid)

PDB History

1957

• Myoglobin structure determined

1970’s

• Discussions on how to establish an archive of protein structures

• PDB established at Brookhaven

– Oct 1971, 7 structures

1980’s

• Technology takes off

– molecular biology, instrumentation, computer hardware and software

• Number of structures increases

• Structural biology is able to focus on medical problems

• IUCr requires data deposition to the PDB

1990’s

• Complexity of structures increases

• Structural genomics begins

Current state of the PDB

• 20. 11. 2012 – 86 344 structures in the PDB archive

• 8 225 new structures deposited in 2012 so far

• Depositions by macromolecule type

– 92.6 % Proteins (79 959 structures)

– 2.8 % Nucleic acids (2456 structures)

– 4.5 % Protein-nucleic acid complexes (3905 structures)

• Depositions by experimental technique:

– 88.0% X-ray diffraction (75 957 structures)

– 11.2% solution NMR (9702 structures)

– 0.5% cryo-EM (468 structures)

data as of 26. 11. 2012, http://www.pdb.org/pdb/static.do?p=general_information/pdb_statistics/index.html

PDB ID

• Each structure in the PDB is represented by a 4-character identifier of the form [0-9][a-z,0-9][a-z,0-9][a-z,0-9]

• 1B3T
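The identifier pattern above can be checked with a regular expression. This is a minimal sketch: the function name `is_pdb_id` is made up for illustration, and matching is case-insensitive on the assumption that `1B3T` and `1b3t` name the same entry.

```python
import re

# Pattern from the slide: one digit followed by three letters or digits.
# re.IGNORECASE is an assumption so that both "1b3t" and "1B3T" are accepted.
PDB_ID_PATTERN = re.compile(r"[0-9][a-z0-9]{3}", re.IGNORECASE)


def is_pdb_id(text: str) -> bool:
    """Return True if text has the four-character PDB ID shape."""
    return PDB_ID_PATTERN.fullmatch(text) is not None


print(is_pdb_id("1B3T"))   # True
print(is_pdb_id("B3T1"))   # False: first character must be a digit
print(is_pdb_id("1B3TX"))  # False: five characters
```

The check only tests the shape of the string; it does not tell whether an entry with that ID actually exists in the archive.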

Data formats of PDB

• PDB format, mmCIF (and the derived XML format PDBML)

• Dictionary resources at: http://mmcif.pdb.org/

• mmCIF is the PDB archival format

– all data released in all three formats

PDB Format

• legacy format

• http://www.wwpdb.org/docs.html

• Fortran-like, 80 columns wide

• not structured enough to describe complicated 3D objects

• its limits have been broken several times

– 99,999 atoms, 34 (or 58) chains

• readable by most programs

model – chain – residue – atom
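Because the PDB format is column-based, a record is read by slicing fixed character positions rather than by splitting on whitespace. The sketch below parses the leading fields of one ATOM record. The `Atom` class, the function name, and the sample coordinates are illustrative assumptions, not taken from a real entry; the column positions follow the published ATOM record layout.

```python
from dataclasses import dataclass


@dataclass
class Atom:
    serial: int
    name: str
    res_name: str
    chain_id: str
    res_seq: int
    x: float
    y: float
    z: float


def parse_atom_line(line: str) -> Atom:
    """Slice one ATOM record by its fixed column positions."""
    # Columns are 1-based in the format description; Python slices are 0-based.
    return Atom(
        serial=int(line[6:11]),        # columns 7-11
        name=line[12:16].strip(),      # columns 13-16
        res_name=line[17:20].strip(),  # columns 18-20
        chain_id=line[21],             # column 22
        res_seq=int(line[22:26]),      # columns 23-26
        x=float(line[30:38]),          # columns 31-38
        y=float(line[38:46]),          # columns 39-46
        z=float(line[46:54]),          # columns 47-54
    )


# Illustrative record, assembled field by field so the widths are visible.
SAMPLE = (
    "ATOM  "    # record name, columns 1-6
    "    1"     # atom serial number, columns 7-11
    " "         # column 12, unused
    " N  "      # atom name, columns 13-16
    " "         # alternate location, column 17
    "MET"       # residue name, columns 18-20
    " "         # column 21, unused
    "A"         # chain identifier, column 22
    "   1"      # residue sequence number, columns 23-26
    "    "      # insertion code and padding, columns 27-30
    "  27.340"  # x, columns 31-38
    "  24.430"  # y, columns 39-46
    "   2.614"  # z, columns 47-54
)

atom = parse_atom_line(SAMPLE)
print(atom.chain_id, atom.res_name, atom.res_seq, atom.name, atom.x, atom.y, atom.z)
```

The fixed widths also explain the size limits of the format: a five-character serial field cannot hold more than 99,999, and the chain identifier has only a single column.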

mmCIF language

• based on community-agreed definitions

• allows adding new features and customization

• mmCIF categories are easily transformed to database tables

• not designed to be read by humans; data should be viewed through programs and databases

• http://ich.vscht.cz/~cechp/mmcif/
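To illustrate how an mmCIF category maps onto a table, the sketch below reads one simple `loop_` block into a list of row dictionaries, with the item names acting as column headers. It is a deliberately reduced reader written for this note: the function name `parse_loop` is hypothetical, the sample values are invented, and quoted or multi-line values are not handled. Real files should be read with a dedicated mmCIF library.

```python
def parse_loop(text: str) -> list[dict[str, str]]:
    """Parse a single, simple mmCIF loop_ block into a list of row dicts."""
    columns: list[str] = []
    rows: list[dict[str, str]] = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line == "loop_" or line.startswith("#"):
            continue
        if line.startswith("_"):
            # "_category.item" names one column of the table.
            columns.append(line.split(".", 1)[1])
        else:
            # Simplification: values are assumed to contain no spaces or quotes.
            rows.append(dict(zip(columns, line.split())))
    return rows


# Invented sample data using item names from the atom_site category.
SAMPLE = """\
loop_
_atom_site.id
_atom_site.label_atom_id
_atom_site.label_comp_id
_atom_site.Cartn_x
1 N  MET 27.340
2 CA MET 26.266
"""

rows = parse_loop(SAMPLE)
for row in rows:
    print(row)
```

Each dictionary corresponds to one row of an `atom_site` table, which is the sense in which categories translate directly into database tables.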

Pubmed, MEDLINE, Entrez etc.

http://www.pubmed.gov

http://www.pubmed.org

NCBI

National Institutes of Health (NIH) – U.S. government

National Library of Medicine (NLM)

National Center for Biotechnology Information (NCBI)

NCBI (founded 1988, http://www.ncbi.nlm.nih.gov/ )

• Genomic sequences: GenBank – open-access annotated collection of all available nucleotide sequences; doubles roughly every 18 months (October 2008 – 97 381 682 336 bp), new release every 2 months, accession number (e.g. U49845) required upon publication

• OMIM – Online Mendelian Inheritance in Man, db of diseases together with their genetic components

• PubChem (http://pubchem.ncbi.nlm.nih.gov/) – db of small organic molecules, including information about their bioactivities

• Entrez (http://www.ncbi.nlm.nih.gov/sites/gquery) – federated search engine offering unified access to all NCBI databases

MEDLINE

• journal citations and abstracts for biomedical literature

• since 1996 – free access to MEDLINE via PubMed

• PubMed – Web-based retrieval system developed by the NCBI at the NLM; it is part of NCBI's Entrez

• PubMed contains

– abstracts

– links to full-text articles

– links to other databases

– …and much more

What’s in Pubmed

• Most PubMed records are MEDLINE citations.

– citations and author abstracts from approx. 5 200 biomedical journals

– diverse topics: microbiology, delivery of health care, nutrition, pharmacology and environmental health.

– currently over 19 million references dating back to 1948

– new material added Tuesday through Saturday

– about 90% of records are from English-language sources or have English abstracts

– approximately 79% of the citations include the published abstract

What’s in Pubmed

• PubMed Central (PMC)

– http://www.pubmedcentral.nih.gov/

– db of free full texts

– since 2007, papers funded by the NIH must be freely available through PMC no later than 12 months after publication

• NCBI Bookshelf

– http://www.ncbi.nlm.nih.gov/sites/entrez?db=books

– free biomedical books (biochemistry, molecular biology, …)

MeSH

• created in 1960 by the NLM

• "Medical Subject Headings."

– the authority list of the biomedical terms

– used for indexing journal articles for MEDLINE

• It imposes uniformity and consistency on the indexing of biomedical literature.

• MeSH Tree.

• Citations are indexed manually.

• http://www.nlm.nih.gov/bsd/disted/video/index.html

• MeSH vocabulary is organized by 16 main branches:

1. Anatomy

2. Organisms

3. Diseases

4. Chemicals and Drugs

5. Analytical, Diagnostic and Therapeutic Techniques and Equipment

6. Psychiatry and Psychology

7. Biological Sciences

8. Natural Sciences

9. Anthropology, Education, Sociology and Social Phenomena

10. Technology, Industry, Agriculture

11. Humanities

12. Information Science

13. Named Groups

14. Health Care

15. Publication Characteristics

16. Geographic Locations

Search Pubmed

• each citation has a unique PubMed ID (PMID), www.pubmed.org/PMID

• Boolean operators

– must be UPPERCASE!

– AND is default

– parentheses: salmonella AND (hamburger OR eggs)

• phrase searching

– “kidney failure”, kidney failure*, kidney failure[tw]

• author names

– natural or inverted order (“julia wong”, “wong julia”)

– searching last name only – use [au] tag (wheeler[au])

Search tags

• [ad] – affiliation of the first author

• [all] – all fields

• [au] – author

• [dp] – date of publication, yyyy/mm/dd; month and day are optional

• [ta] – journal title (abbreviated, full), see Journals database http://www.ncbi.nlm.nih.gov/journals

• [mh] - MeSH term

• [majr] – MeSH major topic

• [ti] – title

• [tiab] – title + abstract

• citation sensor

– choi blood 2008

• related articles

– sorted from most to least relevant

• All, Review, Free full text
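The Boolean operators and search tags above combine into a single query string. The sketch below assembles one programmatically. The helper names (`tagged`, `any_of`, `all_of`) are hypothetical, and the search URL prefix `https://pubmed.ncbi.nlm.nih.gov/?term=` is an assumption about the current PubMed web interface rather than something given in these slides.

```python
from urllib.parse import urlencode


def tagged(term: str, tag: str) -> str:
    """Attach a PubMed search tag such as au, ti or dp to a term."""
    return f"{term}[{tag}]"


def any_of(*terms: str) -> str:
    """Join alternatives with an uppercase OR inside parentheses."""
    return "(" + " OR ".join(terms) + ")"


def all_of(*terms: str) -> str:
    """Join required terms with an uppercase AND."""
    return " AND ".join(terms)


query = all_of(
    tagged("salmonella", "ti"),
    any_of("hamburger", "eggs"),
    tagged("wheeler", "au"),
    tagged("2008", "dp"),
)
print(query)
# salmonella[ti] AND (hamburger OR eggs) AND wheeler[au] AND 2008[dp]

# Assumed URL prefix; urlencode escapes spaces, brackets and parentheses.
url = "https://pubmed.ncbi.nlm.nih.gov/?" + urlencode({"term": query})
print(url)
```

Writing the operators in uppercase inside the helpers avoids the most common mistake noted above, and the parentheses make the grouping explicit instead of relying on the default AND.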
