Introduction to IntAct
Pablo Porras Millán, IntAct
pporras@ebi.ac.uk

Session outline
• Introduction to protein-protein interactions (PPIs)
  - What are PPIs?
  - Representing PPIs
  - PPI databases
• IntAct: the molecular interactions database at the EBI
  - Data structure and curation model
  - Using the IntAct website

Introduction to protein-protein interactions (PPIs)

A definition…
Protein-protein interactions (PPIs): physical and selective contacts that happen between pairs of proteins, in certain molecular regions and in a defined biological context.
Interactome: the totality of PPIs that happen in a cell, in an organism or in a specific biological context.
EMBL-EBI
Proteasome image from Hook, B. and Schagat, T. [Internet] 2011. Available from: www.promega.com/resources/articles/pubhub/functional-proteomics-techniques-to-isolate-and-characterize-the-human-proteasome/

Why protein-protein interactions?
[Diagram labels: gene level, DNA, RNA, protein level]
1 protein = 1 function: WRONG!
1 protein = n functions = n networks!
1. To predict a protein's biological function
   • “guilt by association”
   • proteins with similar functions should cluster together
2. To improve characterization of protein complexes and pathways
   • interaction networks work as a draft map that brings detail to biological processes and pathways

Protein-protein interaction detection methods
Low-throughput and high-throughput methods shown on the slide:
• Yeast two-hybrid (Y2H)
• Tandem affinity purification + mass spectrometry (TAP-MS)
• X-ray diffraction studies
No single method can accurately reproduce a true binary interaction observed under physiological conditions: every interaction detected experimentally is fundamentally artefactual.

Representing PPIs: interaction domains
[Figure: overlap in sequence ranges]

Representing PPIs: the problem with complexes
• Some experimental methods generate complex data, e.g. tandem affinity purification (TAP)
• There are two algorithms to transform this information into binary data [Figure]

Interaction databases: types
De Las Rivas & Fontanillo, PLoS Computational Biology, PMID: 20589078.

Primary databases: coverage and biases
• Human PPIs coverage in the main public primary databases (Dec 2009). De Las Rivas & Fontanillo, PLoS Computational Biology, PMID: 20589078.
• Popularity bias in publicly available databases (2013). Rolland et al., Cell, PMID: 25416956.

A standard for PPIs representation: the IMEx consortium
www.imexconsortium.org
Orchard et al., Nature Methods, PMID: 22453911.

IntAct: the molecular interactions database at the EBI

IntAct goals & achievements
1. Publicly available repository of molecular interactions (mainly PPIs)
   - >530K binary interaction evidences taken from >13,800 publications (September 2015)
2. Data is standards-compliant and available via our website, for download at our FTP site or via PSICQUIC
   - www.ebi.ac.uk/intact
   - ftp://ftp.ebi.ac.uk/pub/databases/intact
   - www.ebi.ac.uk/Tools/webservices/psicquic/view/main.xhtml
3. Provide open-access versions of the software to allow installation of local IntAct nodes.

IntAct: data storage schema
Entry
[A] Publication level (entry): Publication
  [B] Experiment level: Experiment 1, Experiment 2
    [C] Interaction level: Interaction 1, Interaction 2, … / Interaction 3, Interaction 4, …
      [D] Participant level: Participant 1, Participant 2
        [E] Feature level: Features (per participant)

IntAct: PSI-MI ontology

IntAct curation pipeline: “Lifecycle of an Interaction”
[Diagram labels: sanity checks (nightly), reject, public web site, publication (full text), accept, p2, I, p1, curation manual, report, exp, FTP site, check, CVs, annotate, IMEx, report, MatrixDB, curator, MINT, DIP, super curator]

IntAct: the role of the curator
Data sources:
• Direct submission: large datasets from high-throughput projects
• Published molecular interactions data
Curation
Cross-references:
• Gene Ontology: function
• UniProtKB: protein sequences
• ChEBI: small molecules
• Ensembl: genome sequences
• InterPro: families and domains
• Others: structures, organism, tissue...

UniProt Knowledge Base
Interactions can be mapped to the canonical sequence, to splice variants or to postprocessed chains.
www.uniprot.org

IntAct as a common curation platform
Specific curation focus/expertise:
• General curation, domain int.
• General curation, large scale
• UniProt entry related
• Extracellular matrix
• Model organisms
• Immune system
• Commercial curation
• Cellular mechanics
• Regulatory interactions
• Host-pathogen interactions
• Cardiovascular proteins
• Other DBs
Common curation platform
Specific data dissemination platforms

IntAct: home page
www.ebi.ac.uk/intact

IntAct webpage-based search
• Details of interaction
• Link to external resource (UniProtKB)
• Details about controlled vocabulary term describing interaction detection method

IntAct: changing the layout

IntAct: download formats

PSI-MI TAB columns
MITAB 2.5 standard columns (15):
• ID(s) interactor A & B
• Alt. ID(s) interactor A & B
• Alias(es) interactor A & B
• Interaction detection method(s)
• Publication 1st author(s)
• Publication identifier(s)
• Taxid interactor A & B
• Interaction type(s)
• Source database(s)
• Interaction identifier(s)
• Confidence value(s)
MITAB 2.7 specific columns (+27):
• Expansion method(s)
• Biological role(s) of interactors
• Experimental role(s) of interactors
• Type(s) of interactors
• Properties (CrossReference) of interactors / interaction
• Annotation(s) of interactors / interaction
• Host organism(s)
• Parameters of interaction
• Creation and update dates
• Checksum(s) of interactors / interaction
• Negative
• Feature(s) of interactors
• Stoichiometry(s) of interactors
• Participant(s) identification method(s)

Interaction detail in IntAct

Filtering results

IntAct: visualizing results as a network

IntAct: browse menu

IntAct: other searches

IntAct: advanced search

IntAct: MIQL syntax search

More about IntAct: on-line EBI courses
www.ebi.ac.uk/training/online/course/intact-molecular-interactions-ebi

Acknowledgements
MI Team leader: Sandra Orchard
Developing team: Max Koch
Curation team: Margaret Duesbury, Birgit Meldal, Mariaestela Ortiz
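The slide on complexes states that two algorithms turn co-purification data (such as TAP results) into binary pairs, but the figure naming them is not part of the text. The sketch below assumes they are the two commonly used models, spoke expansion (bait paired with each prey) and matrix expansion (every participant paired with every other). The function names and the example proteins are hypothetical.

```python
from itertools import combinations


def spoke_expansion(bait, preys):
    """Pair the bait with each prey: n preys give n binary pairs."""
    return [(bait, prey) for prey in preys]


def matrix_expansion(participants):
    """Pair every participant with every other one.

    n participants give n * (n - 1) / 2 binary pairs.
    """
    return list(combinations(participants, 2))


# Hypothetical TAP result: bait "A" co-purified with preys "B", "C" and "D".
bait, preys = "A", ["B", "C", "D"]
spoke_pairs = spoke_expansion(bait, preys)
matrix_pairs = matrix_expansion([bait] + preys)

print("spoke: ", spoke_pairs)
print("matrix:", matrix_pairs)
```

Under these assumptions the matrix model reports prey-prey pairs that the spoke model never produces, which is one reason the MITAB 2.7 "Expansion method(s)" column is worth checking before treating a pair as a directly observed binary interaction.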
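The data storage schema slide describes five nested levels: publication, experiment, interaction, participant and feature. A minimal sketch of that nesting as Python dataclasses follows. The class and field names are illustrative assumptions, not the actual IntAct database model, and the example identifiers are placeholders.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Feature:            # [E] feature level
    description: str


@dataclass
class Participant:        # [D] participant level
    identifier: str
    features: List[Feature] = field(default_factory=list)


@dataclass
class Interaction:        # [C] interaction level
    label: str
    participants: List[Participant] = field(default_factory=list)


@dataclass
class Experiment:         # [B] experiment level
    label: str
    interactions: List[Interaction] = field(default_factory=list)


@dataclass
class Publication:        # [A] publication level (entry)
    identifier: str
    experiments: List[Experiment] = field(default_factory=list)


def count_participants(publication):
    """Count participants across every experiment and interaction."""
    return sum(
        len(interaction.participants)
        for experiment in publication.experiments
        for interaction in experiment.interactions
    )


# Placeholder entry: one publication, one experiment, one interaction.
entry = Publication(
    identifier="placeholder-publication",
    experiments=[
        Experiment(
            label="Experiment 1",
            interactions=[
                Interaction(
                    label="Interaction 1",
                    participants=[
                        Participant("Participant 1", [Feature("binding region")]),
                        Participant("Participant 2"),
                    ],
                )
            ],
        )
    ],
)
print(count_participants(entry))
```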
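The 15 MITAB 2.5 columns listed on the PSI-MI TAB slide can be read with a few lines of Python. This is a minimal sketch: the dictionary keys are names chosen here for illustration, the example line is made up (placeholder accessions and identifiers), and the parser assumes the usual MITAB conventions of tab-separated columns, "|" between multiple values and "-" for an empty column. Splitting on "|" is naive and would break on quoted values that contain a pipe.

```python
# Keys are illustrative names for the 15 MITAB 2.5 columns, in column order.
MITAB25_COLUMNS = [
    "id_a", "id_b", "alt_id_a", "alt_id_b", "alias_a", "alias_b",
    "detection_method", "first_author", "publication_id",
    "taxid_a", "taxid_b", "interaction_type", "source_db",
    "interaction_id", "confidence",
]


def parse_mitab25_line(line):
    """Return a dict mapping each MITAB 2.5 column to a list of values."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < len(MITAB25_COLUMNS):
        raise ValueError(
            f"expected at least {len(MITAB25_COLUMNS)} columns, got {len(fields)}"
        )
    record = {}
    # zip() stops after 15 columns, so extra MITAB 2.7 columns are ignored.
    for name, raw in zip(MITAB25_COLUMNS, fields):
        record[name] = [] if raw == "-" else raw.split("|")
    return record


# Made-up example line; accessions and identifiers are placeholders.
example_line = "\t".join([
    "uniprotkb:P12345", "uniprotkb:Q67890", "-", "-", "-", "-",
    'psi-mi:"MI:0018"(two hybrid)', "Doe et al. (2015)", "pubmed:00000000",
    "taxid:9606(human)", "taxid:9606(human)",
    'psi-mi:"MI:0915"(physical association)', 'psi-mi:"MI:0469"(IntAct)',
    "intact:EBI-0000000", "intact-miscore:0.50",
])

record = parse_mitab25_line(example_line)
print(record["id_a"], record["id_b"], record["detection_method"])
```

A MITAB 2.7 file downloaded from the website could be passed through the same function line by line; only the first 15 columns would be kept.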