GENOMICS AND PROTEOMICS

UNIT-I

Eukaryotic genomes:
• All of the eukaryotic nuclear genomes that have been studied are, like the human version, divided into two or more linear DNA molecules, each contained in a different chromosome; all eukaryotes also possess smaller, usually circular, mitochondrial genomes.
• The only general eukaryotic feature not illustrated by the human genome is the presence in plants and other photosynthetic organisms of a third genome, located in the chloroplasts.
• Eukaryotic nuclear genomes vary greatly in size, the smallest being less than 10 Mb in length and the largest over 100 000 Mb.
• The nuclear genome of the yeast S. cerevisiae is, at 12 Mb, about 0.004 times the size of the human nuclear genome.
• In yeast, the genes themselves are more compact, having fewer introns, and the spaces between the genes are relatively short, with much less space taken up by genome-wide repeats and other non-coding sequences.
• More complex organisms have less compact genomes.

Organelle genome:
Human mitochondrial genome:
o Small in size (about 16.6 kb).
o Limited in function.
o 13 protein-encoding genes.
o Genes encode proteins related to electron transport activity.
Chloroplast genome:
o Contains a partial set of the genes involved in photosynthesis (both light and dark reactions).

Genomics of Microbes and Microbiomes:
o Microbial genomics is largely the identification and characterization of the genetic composition of microbes. The ability to process and analyse the genomic data collected from microbial organisms is a cornerstone of modern bioinformatics.
o The microbiome is the collection of all microbes, such as bacteria, fungi, and viruses, and their genes, that naturally live on and inside our bodies. Although microbes are so small that a microscope is needed to see them, they contribute in big ways to human health and wellness.
o A person’s core microbiome is formed in the first years of life but can change over time in response to different factors including diet, medications, and environmental exposures.
o Differences in the microbiome may lead to different health effects from environmental exposures and may also help determine individual susceptibility to certain illnesses.
o Some microbes alter environmental substances in ways that make them more toxic, while others act as a buffer and make environmental substances less harmful.

Genome sequencing technologies:
Next Gen Sequencing:
Introduction to early sequencing techniques:
In 1975, Sanger introduced a rapid method for determining DNA sequences based on primed synthesis with DNA polymerase. In 1977, Allan Maxam and Walter Gilbert proposed another way to sequence DNA, by chemical degradation, in which terminally labelled DNA fragments were chemically cleaved at specific bases and separated by gel electrophoresis. Later, Applied Biosystems (ABI) introduced commercial DNA sequencers based on automated technology, including the ABI Prism 3700 with 96 capillaries. The first human genome was sequenced in 2003, after an effort of 13 years, at an estimated cost of 2.5 billion USD.

Birth of Next gen sequencing:
The first NGS device was introduced in 2005 and named the GS20. It combined single-molecule emulsion PCR with pyrosequencing (a shotgun sequencing procedure based on sequencing by synthesis) and was used to sequence the entire genome of Mycoplasma genitalium. In pyrosequencing, DNA synthesis is performed within a complex reaction that includes the enzymes ATP sulfurylase and luciferase and the substrates adenosine 5′ phosphosulfate and luciferin, in such a way that the pyrophosphate group released upon addition of a nucleotide results in the production of detectable light. Later, GS FLX Titanium was developed in 2006.

2nd Gen sequencing of HT NGS:
The second-generation HT-NGS platforms can generate from about five hundred million bases of raw sequence (Roche) to billions of bases (Illumina, SOLiD) in a single run.
The principle is based on the emulsion PCR amplification of DNA fragments, to make the light signal strong enough for reliable base detection by the CCD cameras.

3rd Gen Sequencing of HT NGS:
Although PCR amplification has revolutionized DNA analysis, in some instances the amplification used in 2nd Gen sequencing may introduce base sequence errors or favour certain sequences over others, thus changing the relative frequency and abundance of the various DNA fragments that existed before amplification. To overcome this, the sequence would ideally be determined directly from a single DNA molecule, without the need for PCR amplification and its potential for distortion of abundance levels; this would also allow the ultimate miniaturization into the nanoscale and minimal use of biochemicals. Sequencing from a single DNA molecule is now called the “third generation of HT-NGS technology”.

Genome Assembly:
o Genome assembly refers to the process of putting nucleotide sequences into the correct order. Genome assembly is made easier by the existence of public databases.
o If a genome assembly is finished to the level of whole linear chromosomes, the ends will contain the tandem (consecutive) repeat sequences found within telomeres, ranging from 5-mer to 27-mer units repeated several thousand times. These repeats both protect the end of the chromosome from deterioration, chromosomal fusion, or recombination, and act as a mechanism for senescence and for triggering apoptosis.

Comparative Genomics and its applications:
Introduction:
o Comparative genomics is the direct comparison of the complete genetic material of one organism against that of another to gain a better understanding of how species evolved and to determine the function of genes and non-coding regions in genomes.
o It includes a comparison of gene number, gene content, and gene location, the length and number of coding regions (called exons) within genes, the amount of non-coding DNA in each genome, and conserved regions maintained in both prokaryotic and eukaryotic groups of organisms.
o Comparative genomics can not only trace the evolutionary relationships between organisms but also reveal differences and similarities within and between species.

Methodology:
Genome Correspondence:
The first step in comparative genomics is determining the correct correspondence of chromosomal segments and functional elements across the species compared. This involves determining orthologous segments of DNA, which descend from the same region in the common ancestor of the species compared, and paralogous regions, which arose by duplication events prior to the divergence of the species compared.

Applications:
1. Once genome correspondence is established, comparative genomics can aid gene identification.
2. Comparative genomics provides a powerful way to distinguish regulatory motifs from non-functional patterns based on their conservation.
3. Comparative genomics has wide applications in the fields of molecular medicine and molecular evolution.
4. Comparative genomics is used for the identification of drug targets for many infectious diseases.
5. Comparative analysis of the genomes of individuals with a genetic disease against those of healthy individuals may reveal clues for eliminating that disease.
6. Comparative genomics helps in selecting model organisms.
7. Comparative genomics also helps in the clustering of regulatory sites, which can help in the recognition of unknown regulatory regions in other genomes.

UNIT-II

Computational Tools for gene expression analysis:
Computational tools for data integration:
1. Desiderata: If researcher A wants to use a database kept and maintained by researcher B, the “quick and dirty” solution is for researcher A to write a program that will translate data from one format into another.
2. Data Standards: Data standards are technical standards that define representations of data and hence provide an understanding of data that is common to all database developers. Standards are more relevant to future data sets.
Standards are indeed an essential element of efforts to achieve data integration of future datasets.
3. Data Normalization: Data normalization is the process through which data taken on the “same” biological phenomenon by different instruments, procedures, or researchers can be rendered comparable.
4. Data Warehousing: Data warehousing is a centralized approach to data integration. The maintainer of the data warehouse obtains data from other sources and converts them into a common format, with a global data schema and indexing system for integration and navigation.
5. Data Federation: Data federation calls for scientists to maintain their own specialized databases encapsulating their particular areas of expertise and to retain control of the primary data, while still making it available to other researchers.

Data Presentation:
1. Graphical interfaces: Graphical interfaces are used for molecular visualization.
2. Tangible physical interfaces: Tangible, physical models that a human being can manipulate directly with his or her hands are an extension of the two-dimensional graphical environment.
3. Automated literature searching: Although full-text articles are available in digital formats such as PDF, HTML, or TIF files, these formats limit the possibilities for computer searching and retrieval of full text in databases. In the future, wider use of structured documents tagged with XML will make intelligent searching of full text feasible, fast, and informative and will allow readers to locate, retrieve, and manipulate specific parts of a publication.

Hierarchical Clustering:
o Hierarchical clustering is a method of cluster analysis that seeks to build a hierarchy of clusters. Strategies for hierarchical clustering generally fall into two categories:
1. Agglomerative: This is a "bottom-up" approach: each observation starts in its own cluster, and pairs of clusters are merged as one moves up the hierarchy.
2. Divisive: This is a "top-down" approach: all observations start in one cluster, and splits are performed recursively as one moves down the hierarchy.
o MATLAB, NCSS, and Stata are some examples of software for hierarchical clustering.

STS:
o Expanded as “Sequence tagged sites”.
o An STS is a relatively short sequence (200 to 500 bp) that can be specifically amplified by PCR and detected in the presence of all other genomic sequences, and whose location in the genome is mapped.
o It was first introduced by Olson et al. in 1989.
o STS-based PCR produces a simple and reproducible pattern on agarose or polyacrylamide gel.
o The DNA sequence of an STS may contain repetitive elements (sequences that appear elsewhere in the genome), but as long as the sequences at both ends of the site are unique and conserved, researchers can uniquely identify this portion of the genome using tools usually present in any laboratory.
o STS include markers such as microsatellites (SSRs, STMS or SSRPs), SCARs, CAPS, and ISSRs.

Expressed sequence tags (ESTs):
o Expressed sequence tags (ESTs) are fragments of mRNA sequences derived through single sequencing reactions performed on randomly selected clones from cDNA libraries.
o To date, over 45 million ESTs have been generated from over 1400 different species of eukaryotes.
o EST projects are used either to complement existing genome projects or to serve as low-cost alternatives for purposes of gene discovery.
o In addition, with improvements in accuracy and coverage, ESTs are beginning to find application in fields such as phylogenetics, transcript profiling and proteomics.

GSS:
o The GSS (genome survey sequence) division of GenBank is similar to the EST division, with the exception that most of the sequences are genomic in origin, rather than cDNA (mRNA).
o It should be noted that two classes (exon trapped products and gene trapped products) may be derived via a cDNA intermediate.
o Care should be taken when analyzing sequences from either of these classes, as a splicing event could have occurred and the sequence represented in the record may be interrupted when compared to the genomic sequence.
o The GSS division contains (but is not limited to) the following types of data:
   o random "single pass read" genome survey sequences
   o cosmid/BAC/YAC end sequences
   o exon trapped genomic sequences
   o Alu PCR sequences
   o transposon-tagged sequences

Transcriptome analysis:
o Transcriptome analysis is the study of the transcriptome, that is, the complete set of RNA transcripts produced by the genome under specific circumstances or in a specific cell, using high-throughput methods.
o Transcriptomic techniques have been particularly useful in identifying the functions of genes.
o Transcriptomics also allows identification of pathways that respond to or ameliorate environmental stresses.

Uses of Transcriptome Analysis:
o Transcriptome analysis is most commonly used to compare specific pairs of samples. The differences may be due to different external environmental conditions.
o Transcriptome studies can classify cancer beyond anatomical location and histopathology. Outcome predictions can establish gene-based benchmarks to predict tumor prognosis and therapy response.
o The transcriptomes of stem cells help in understanding the processes of cellular differentiation and embryonic development.
o Because of its very broad approach, transcriptome analysis is a great source for identifying targets for treatment.

UNIT-III

Molecular Systems Biology:
o Molecular systems biology integrates many types of molecular knowledge, which can best be achieved by the synergistic use of models and experimental data.
o Two main approaches to systems biology can be distinguished.
o Top-down systems biology is a method to characterize cells using system-wide data originating from the omics technologies in combination with modelling.
o Bottom-up systems biology does not start with data but with a detailed model of a molecular network built on the basis of its molecular properties.
o In this approach, molecular networks can be studied quantitatively, leading to predictive models that can be applied in drug design and in the optimization of product formation in bioengineering.

Kinetic models:
o The kinetic model approach is a type of modelling often used in systems biology.
o Kinetic models use experimentally determined kinetic parameters and network structure, and have proven to be very promising.
o Many kinetic models of this type are found in the JWS Online database for systems biology.
o An early kinetic model of this kind was developed by Hoefnagel et al. for pyruvate metabolism in Lactococcus lactis.
o When precise and detailed knowledge of the kinetics of the molecular components is available, so-called computer experimentation can be carried out, which serves as an adequate substitute for true experimentation.

Biomass objective function:
o The biomass objective function describes the rate at which all of the biomass precursors are made in the correct proportions.
o One can formulate the biomass objective function at different levels of detail.
Basic Level:
o The formulation process starts with defining the macromolecular content of the cell and then the metabolites that make up each macromolecular group.
o With this information, it is possible to detail the required amounts of the metabolites that are needed, along with the associated reaction pathways.
Intermediate level:
o This level calculates the biosynthetic energy needed to synthesize the macromolecules whose building blocks are directly accounted for in a curated metabolic network.
Advanced Level:
o Advanced biomass objective functions can be formed by detailing the necessary vitamins, elements, and cofactors required for growth as well as determining core components necessary for cellular viability.
o Inclusion of vitamins, elements, and cofactors allows for the analysis of a broader coverage of network functionality and required network activity.
o Another advanced approach is not only to define the wild-type biomass content of the cell, but also to generate a separate biomass objective function that contains the minimally functional content of the cell.

Biotechnological applications of systems biology:
o Identifying or engineering microbial strains
o Transcriptomics
o Proteomics and metabolomics
o Predictive computational models and machine learning
o Re-engineering strains
o Bioprocessing
o Meta-omics

Pharmacogenomics and Drug Discovery:
o Pharmacogenomics (sometimes called pharmacogenetics) is a field of research that studies how a person’s genes affect how he or she responds to medications.
o Pharmacogenetic studies can be used at various stages of drug development.
o The effect of drug target polymorphisms on drug response can be assessed and identified.
o This helps prevent the occurrence of severe adverse drug reactions and leads to better outcomes of clinical trials.
o Variations in drug response can be better studied with the wider application of pharmacogenomic methods such as genome-wide scans, haplotype analysis and candidate gene approaches.
o The efficacy of a drug is, to a great extent, determined by appropriate target selection, which can be guided by pharmacogenomic methods.
o Pharmacogenomics can also be used to identify the target population that would benefit the most from the drug.

Agricultural Genomics:
o Agricultural genomics is a rich field that has been contributing to advances in crop development for decades.
o From sequencing reference genomes to genotyping for genome-wide association studies to genomic prediction, advances in technology and applications have led to breakthroughs in crop improvement.
o These innovations have resulted in elite cultivars that have been selected for agriculturally desirable traits, including high yield, stress tolerance and pest resistance.
o One potential way for genomics to contribute to crop improvement and food security is through the collection-wide sequencing and classification of established seed banks or gene banks, in which important agricultural species are stored and maintained in large collections organized by taxonomy and origin.
o The preservation of seed resources ensures that the natural genetic diversity captured by the collections will not be lost.
o Accurately cataloguing these resources by using genomic data provides precise and usable information for breeders and scientists, while simplifying the effort by identifying redundancy in the collection.

UNIT-IV

Qualitative proteome technology:
Introduction:
o Proteomics is crucial for early disease diagnosis, for prognosis, and for monitoring disease development.
o Proteomics is one of the most significant methodologies for understanding gene function, although it is much more complex than genomics.
[Figure: overview of proteomic technology]

Gel Based approaches:
1. SDS PAGE:
o SDS-PAGE is a high-resolution technique for the separation of proteins according to their size, and thus facilitates the approximation of molecular weight.
o In electrophoresis generally, the proteins in a mixture migrate at different velocities according to the ratio between their charge and mass; SDS gives proteins a nearly uniform charge-to-mass ratio, so migration in SDS-PAGE depends mainly on size.
o Example:
o The protein profiling of Mycoplasma bovis and Mycoplasma agalactiae through SDS-PAGE has high diagnostic value, as these species are difficult to differentiate with routine diagnostic procedures.
2. 2 D Gel Electrophoresis:
o Two-dimensional polyacrylamide gel electrophoresis (2D-PAGE, also written 2-DE) is an efficient and reliable method for the separation of proteins on the basis of their mass and charge.
o 2D-PAGE is capable of resolving ~5,000 different proteins, depending on the size of the gel.
o The proteins are separated by charge in the first dimension and by mass in the second dimension.
o 2-DE has been successfully applied to the characterization of post-translational modifications and mutant proteins and to the evaluation of metabolic pathways.
o Example:
o Proteins of Listeria monocytogenes involved in host–pathogen interactions were analyzed with 2-DE, and 30 different proteins of two strains were identified.
3. 2D-DIGE:
o It is expanded as “Two-dimensional differential gel electrophoresis”.
o 2D-DIGE uses proteins labeled with CyDye, which can be easily visualized by exciting the dye at a specific wavelength.
o Example:
o Cell wall proteins (CWPs) of the toxic dinoflagellate Alexandrium catenella labeled with Cy3 have been identified through 2D-DIGE.
o 2-DE remains a method of choice in proteomic research, though certain limitations reduce its potential as a principal separation technique in modern proteomics.

Gel Free Approaches:
o In recent years, most development efforts have been focused on alternative approaches, such as promising gel-free proteomics.
o With the appearance of MS-based proteomics, an entirely new toolbox has become available for quantitative analysis. Important examples include ICAT labelling and MS-based SILAC (Stable Isotope Labeling with Amino acids in Cell culture).
o These novel approaches were initially pitched as replacements for gel-based methods.

Quantitative Proteome Technology:
1. ICAT labelling:
o ICAT (isotope-coded affinity tag) is an isotopic labeling method in which chemical labelling reagents are used for the quantification of proteins.
o ICAT has also expanded the range of proteins that can be analyzed and permits the accurate quantification and sequence identification of proteins from complex mixtures.
o The ICAT reagents comprise an affinity tag for the isolation of labeled peptides, an isotopically coded linker, and a reactive group.
o Example:
o Systematic proteome quantification was carried out using ICAT during the cell cycle of Saccharomyces cerevisiae, which supported the understanding of gene functions.
2. Stable Isotope Labelling with Amino Acids in Cell Culture:
o SILAC is an MS-based approach for quantitative proteomics that depends on metabolic labelling of the whole cellular proteome.
o SILAC has been developed as an expedient technique to study the regulation of gene expression, cell signalling, and post-translational modifications.
o Additionally, SILAC is a vital technique for studying secretion pathways and secreted proteins in cell culture.
o Example:
o SILAC was used for quantitative proteome analysis of B. subtilis in two physiological states such as growth during phosphate and succinate starvation.
o The intracellular stability of almost 600 proteins from human adenocarcinoma cells has been analysed through “dynamic SILAC”, and the overall protein turnover rate was determined.
3. Isobaric tag for relative and absolute quantitation:
o iTRAQ is a multiplex protein labelling technique for protein quantification based on tandem mass spectrometry.
o This technique relies on labelling the protein with isobaric tags (4-plex and 8-plex) for relative and absolute quantitation.
o The technique comprises labelling of the N-terminus and side-chain amine groups of proteins, which are then fractionated through liquid chromatography and finally analysed through MS.
o Understanding gene regulation is essential to understanding disease mechanisms; protein quantitation using iTRAQ is therefore an appropriate method, as it helps to identify and quantify proteins simultaneously.
o Example:
o iTRAQ has been applied for quantitative analysis of membrane and cellular proteins of Thermobifida fusca grown in the absence and presence of cellulose.
o iTRAQ was a useful tool for determining the molecular processes involved in the development and function of natural killer (NK) cells.
4. X-ray crystallography:
o X-ray crystallography is the most preferred technique for three-dimensional structure determination of proteins.
o Highly purified, crystallized samples are exposed to X-rays, and the resulting diffraction patterns are processed to produce information about the size of the repeating unit that forms the crystal and about the crystal packing symmetry.
o Example:
o X-ray crystallography revealed the structure of the C-terminal fragment of FtsZ and of the FtsZ-ZipA binding complex of E. coli.

Functional Proteome Technology:
Introduction:
o Functional proteomics constitutes an emerging research area in the proteomic field that is “focused to monitor and analyse the spatial and temporal properties of the molecular networks and fluxes involved in living cells”.
Targets:
The 2 major targets of functional proteomics are:
1. The elucidation of the biological functions of unknown proteins, and
2. The definition of cellular mechanisms at the molecular level.
Conclusion and Future Directions:
o Understanding protein function and unravelling cellular mechanisms at the molecular level constitute a major need in modern biology.
o With the availability of full genome sequences, these goals can be achieved by determining which macromolecules interact with a given protein in a specific manner.
o Proteomic analyses of the protein complexes occurring in vivo will disclose the identity of the individual components and whether they differ from one territory to another.

Methods, algorithm and Tools in computational proteomics:
Introduction:
o In the 1950s, amino acid sequencing was already possible using Edman degradation, and the first computer programs for amino acid sequencing appeared.
o Iterative searching forms the basis of many of the modern and more complex software algorithms for correlating MS data to peptide and sequence databases.
o The first MS database search engine with a probability-based scoring function was released in 1999 and formed the basis of the popular semi-commercial search engine MASCOT.

Spectrum Interpretation:
o Proteins and peptides can be identified both from MS spectra (peptide mass fingerprinting, PMF) and from MS/MS spectra (peptide fragment fingerprinting, PFF).
o In MS (PMF), the mass pattern obtained by measuring the peptide masses of a purified protein or simple protein mixture, enzymatically digested or chemically cleaved, is compared to theoretical mass patterns generated in silico from a protein database.
o In MS/MS, both the intact mass of the peptide (parent ion mass) and the fragment ions are recorded.
o De novo sequencing can also be used for validating search results obtained from database-dependent search algorithms.

Sequence tagging approach:
o The sequence tagging approach, a method similar to de novo sequencing, aims at finding short tags and delta masses which can be used to search a sequence database.
o The sequence tag approach has recently regained momentum due to the high mass accuracy of modern mass spectrometers.
o The sequence tagging approach makes the computational task of finding the correct de novo sequence path less complex, more accurate, and computationally less expensive for modified peptides.

Tools:
1. Accessing tools: CPTAC, GDC, PDC, TCIA
2. Data Processing: CDAP, COSMO, DREAM AI, MassQC, MSInspector, OMICS
3. Data analysis: ARHT, Black Sheep, ChEA3.

Proteome Database:
neXtProt:
o neXtProt is a database of the human proteome containing information on over 20,000 human proteins; the existence of the vast majority of these has been determined through direct identification or through the identification of their mRNA transcripts.
o Link: https://www.nextprot.org/
o Operating Institution: Swiss Institute of Bioinformatics
Proteopedia:
o Proteopedia is a wiki-style database containing information on a wide array of biological macromolecules.
o The site offers generalized information about the molecule of interest, in addition to providing 3D visuals of the structure of the molecule and information on its interactions, locations, and targets.
o Link: http://proteopedia.org/wiki/index.php/Main_Page
o Operating Institution: The Israel Structural Proteomics Center.
Protein Data Bank:
o The Protein Data Bank is a database of the structures of proteins and nucleic acids.
o The site offers scientists and educators the ability both to upload new information on known sequences and proteins or on completely novel entities, and to download sequence and other scientific information.
o The database allows for searching based on sequence, function, ligand, and drug targets, and offers visual representations of the 3D structure of proteins, the pathways they are involved in, the mechanisms of drug and ligand interaction, and where their sequences lie on human chromosomes.
o Link: https://www.rcsb.org/
o Operating Institutions: Rutgers University, University of California San Diego (San Diego Supercomputer Center), University of California San Francisco.
Human Protein Atlas:
o The Human Protein Atlas is a database containing the location of all known human proteins, with three major sub-divisions.
o The Tissue Atlas shows protein expression levels and their localization in all human tissues, the Cell Atlas shows the expression and sub-cellular localization of proteins in 64 human cell lines, and the Pathology Atlas shows the aberrant expression patterns of proteins in 18 different types of human cancers.
o In addition to these Atlases, the database contains information on wide classes of human proteins, protocols and methods for human protein experimentation, and access to antibodies and cell lines for research.
o Link: https://www.proteinatlas.org/
o Operating Institutions: KTH Royal Institute of Technology, Uppsala University, Science for Life Laboratory.
STRING:
o STRING is a proteomic database focusing on the networks and interactions of proteins in a wide array of species.
o STRING allows searching for one or multiple proteins at a time, with the ability to limit the search to the desired species.
o Link: https://string-db.org/
o Operating Institutions: Swiss Institute of Bioinformatics, Novo Nordisk Foundation Centre for Protein Research, European Molecular Biology Laboratory - Heidelberg, University of Zurich.

UNIT-V

Techniques to Study Protein-Protein Interaction:
Types of Protein-Protein interaction:
o Protein interactions are fundamentally characterized as “stable or transient”, and both types of interactions can be either strong or weak.
o Stable interactions are those associated with proteins that are purified as multi-subunit complexes, and the subunits of these complexes can be identical or different.
o Transient interactions are temporary in nature and typically require a set of conditions that promote the interaction, such as phosphorylation, conformational changes or localization to discrete areas of the cell.

Methods to Study Protein-Protein Interaction:
1. Co-immunoprecipitation:
o The interacting protein is bound to the target antigen, which is in turn bound by an antibody that is immobilized on a support.
o Immunoprecipitated proteins and their binding partners are commonly detected by sodium dodecyl sulfate–polyacrylamide gel electrophoresis (SDS-PAGE) and western blot analysis.
2. Pull Down Assays:
o Pull-down assays are similar in methodology to co-immunoprecipitation (co-IP) because of the use of a beaded support to purify interacting proteins.
o The difference between these two approaches, though, is that while co-IP uses antibodies to capture protein complexes, pull-down assays use a "bait" protein to purify any proteins in a lysate that bind to the bait.
o Pull-down assays are ideal for studying strong or stable interactions or those for which no antibody is available for co-immunoprecipitation.
3. Crosslinking Protein Interaction Analysis:
o Crosslinking interacting proteins is an approach to stabilize or permanently join the components of interaction complexes.
o Once the components of an interaction are covalently crosslinked, other analysis steps can be carried out while the original interacting complex is maintained.
4. Label transfer protein interaction analysis:
o Label transfer involves crosslinking interacting molecules (i.e., bait and prey proteins) with a labelled crosslinking agent and then cleaving the linkage between the bait and prey so that the label remains attached to the prey.
o This method is particularly valuable because of its ability to identify proteins that interact weakly or transiently with the protein of interest.
5. Far western blot analysis:
o Just as pull-down assays differ from co-IP by using tagged proteins instead of antibodies to detect protein–protein interactions, far-western blot analysis differs from western blot analysis: protein–protein interactions are detected by incubating electrophoresed proteins with a purified, tagged bait protein instead of a target protein-specific antibody.
o The term "far" was adopted to emphasize this distinction.

Interactome databases:
Introduction:
o Several public databases collect published protein-protein interaction (PPI) data and provide researchers access to their curated datasets. These usually reference the original publication and the experimental method that determined every individual interaction.
o The 6 databases for interactome studies are:
1. Biological General Repository for Interaction Datasets (BioGRID)
2. Molecular INTeraction database (MINT)
3. Biomolecular Interaction Network Database (BIND)
4. Database of Interacting Proteins (DIP)
5. IntAct molecular interaction database
6. Human Protein Reference Database (HPRD)

Modificomics:
o Definition: The study of the post-translational modifications of the proteins associated with a particular genome.
o Post-translational modifications of proteins have key functions in the regulation of various cellular processes.

Application of Proteomics in Clinical and biomedicine:
Application in Biomarker Discovery:
o In medicine, a biomarker can be a traceable substance that is introduced into an organism as a means to examine organ function or other aspects of health.
o Proteomics technology has been extensively used in molecular medicine, especially for biomarker discovery.
o Through global protein profiling of body fluids, proteomics can identify invaluable disease-specific biomarkers.
o Expression proteomics enables biomarker detection through comparison of protein expression profiles between normal samples and disease-affected ones.
o After biomarkers have been identified by a mass spectrometry-based approach, they need to be processed using bioinformatics analyses and also need to be reproduced in different populations.
Application in Drug Discovery:
o As drug discovery is an inherently complex and costly process, new emerging technologies such as proteomics can facilitate and accelerate it.
o Drug discovery has many stages, and indeed it is a multidisciplinary field using genomics, proteomics, metabolomics, bioinformatics, and systems biology.
o Proteomics plays a major role in the target identification step.
o Proteomics studies are also useful for examining drug action, toxicity, resistance, and efficacy.
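
The comparison of protein expression profiles between normal and disease-affected samples, described under biomarker discovery above, can be illustrated with a small calculation. The following Python sketch is only an illustration under stated assumptions: the protein names (P1 to P3), the abundance values, and the 2-fold cut-off are all hypothetical, and a real study would add statistical testing and validation in independent populations.

```python
import math
from statistics import mean


def log2_fold_changes(normal, disease):
    """Return {protein: log2(mean disease abundance / mean normal abundance)}."""
    changes = {}
    # Only proteins measured in both groups can be compared.
    for protein in normal.keys() & disease.keys():
        ratio = mean(disease[protein]) / mean(normal[protein])
        changes[protein] = math.log2(ratio)
    return changes


def candidate_biomarkers(changes, threshold=1.0):
    """Proteins whose abundance changes at least 2-fold (|log2 FC| >= threshold)."""
    return sorted(p for p, fc in changes.items() if abs(fc) >= threshold)


# Hypothetical abundances (arbitrary units), three replicates per group.
normal = {"P1": [10.0, 12.0, 11.0], "P2": [5.0, 5.0, 5.0], "P3": [8.0, 8.0, 8.0]}
disease = {"P1": [40.0, 44.0, 48.0], "P2": [5.0, 6.0, 4.0], "P3": [2.0, 2.0, 2.0]}

changes = log2_fold_changes(normal, disease)
print(candidate_biomarkers(changes))  # proteins passing the assumed cut-off
```

With these made-up values, P1 is four times more abundant in the disease group and P3 four times less abundant, so both pass the cut-off, while P2 is unchanged. The log2 scale is used so that increases and decreases of the same magnitude are treated symmetrically.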

Application of Proteomics in Agriculture:
o Since proteins are main constituents of plants and plant-derived foods, proteomic technology can monitor and characterize the protein content of foods and its changes during production, using two-dimensional polyacrylamide gel electrophoresis (2D-PAGE) and chromatography techniques in combination with mass spectrometry.
o In plant science, plant proteomics methods can help to identify quality biomarkers to design better and safer breeding.
o Proteomics is a useful tool for the identification of microbial contamination in plants.
o Proteomic techniques are increasingly used for breed quality control.
o Proteomics studies have identified numerous proteins that play crucial roles in plant growth and development.
o Proteomics is used to study seed growth regulators.
o Proteomics is used to evaluate genetically modified crops.
o The proteomics approach is also an efficient tool for analysing agricultural crop biomass.