APPLICATIONS OF BIOINFORMATICS IN AGRICULTURE

Zulqarnain Javed
Submitted to: Yadish Bukhari
UNIVERSITY OF AGRICULTURE FSD (CABB)
2010-ag-3553

Bioinformatics and Its Applications in Plant Biology

Bioinformatics is a new field of science, but it is advancing rapidly across every area of biotechnology. Just as it serves medicine by providing genome information for various organisms, it also serves agriculture: microorganisms play an important role in agriculture, and bioinformatics provides genomic information about them. Bioinformatics tools play a significant role in identifying the genes present in the genomes of these species. They also make it possible to predict the functions of different genes and the factors that affect them. With this information, scientists can develop improved plant varieties that tolerate drought and herbicides and resist pests. Similarly, specific genes can be modified in livestock to improve the production of meat and milk.

Although plants have changed over the course of evolution, large parts of their genomes have remained conserved, and this information used to be difficult to extract. Since the arrival of bioinformatics tools, it has become possible to extract the required information from the genomes of specific plants. Two plant species whose genomes have been completely mapped are Arabidopsis thaliana and Oryza sativa, known in English as thale cress and rice respectively. Thale cress is a small plant that often grows on rocky ground; it is a model organism rather than a food crop. Researchers took an interest in it because of its small genome and use it to study plant developmental processes. Its genome is distributed across 5 chromosomes and is on the order of 100 Mbp in size. It completes a generation in about five to six weeks.
Understanding the genes of A. thaliana and their expression provides information about the proteins of other plants and their expression. Knowing the A. thaliana genome has many uses, but the major one is that it can help increase the yield of plants.

Many plants have been made insect resistant by incorporating the desired genes. Bacillus thuringiensis (Bt) is a soil bacterium that produces proteins toxic to certain insect pests. After its genome was mapped, researchers incorporated the genes for these proteins into plants to make them resistant to insects. Corn, cotton, and potatoes have been made insect resistant in this way so far. Because the plant carries the bacterial genes, it produces the Bt protein in its own tissues; when a susceptible insect eats the plant, the protein damages the insect's gut, the insect stops feeding, and it ultimately dies. Bt corn is one food crop that has been modified by inserting bacterial genes in this way. The use of Bt genes in plant genomes has allowed farmers to use much smaller amounts of insecticide. As a result, productivity can increase, and the reduced use of insecticides is beneficial for human health.

Changes to a plant's genome can also raise its nutritional value. For example, genes have been inserted into the rice genome so that the grain produces provitamin A (beta-carotene), which the body converts to Vitamin A. Vitamin A is important for the eyes, and a deficiency may result in blindness. Genetically modified rice of this kind is intended to help reduce blindness caused by Vitamin A deficiency.

Some cereal varieties have been developed that can grow in poor soils and are drought resistant. With such varieties, areas with low soil fertility can also be used for cultivation.
Applications of Bioinformatics in Agricultural Research

The science of bioinformatics has many beneficial uses in the modern world. Its main applications can be grouped as follows:
1) Single gene analysis
2) Biochemical pathways
3) Molecular techniques

1. Single gene analysis

Comparative studies: Comparative studies involve analyzing and comparing the genetic material of different species in order to study the functions of genes, the mechanisms of inherited diseases, and species evolution. Various bioinformatics tools are used to compare the number of genes, their locations, and their biochemical functions in different organisms.

Sequence alignment: Sequence alignment is of two types: pairwise alignment and multiple alignment. A pairwise alignment is either a global alignment or a local alignment. BLAST and FASTA, the most widely used tools, are examples of local alignment.

2. Biochemical pathways

KEGG (Kyoto Encyclopedia of Genes and Genomes) is a suite of databases and associated software that integrates current knowledge on molecular interaction networks, information about genes and proteins, and information about chemical compounds and reactions.

3. Molecular techniques

Several online tools help molecular biologists design effective PCR primers, and numerous freely available web-based resources support PCR and primer design. They cover primer design, vector screening [VecScreen], gene prediction [Genscan], restriction analysis, and probe design [Primrose].

Current status and future trends:
- To encourage the submission of all sequence data into the public domain through repositories
- To provide rational annotation of genes, proteins, and phenotypes
- To elaborate relationships both within the plants' data and between plants and other organisms
- EST sequence-aided research
- Conserving plant data using biodiversity informatics
- Using plant genes as phylogenetic markers

Other areas of application of bioinformatics in agriculture are:
- Plant systematics
- Plant breeding
- Biopharmaceuticals and edible vaccines
- Biodiversity informatics

Biological sequences such as DNA, RNA, and protein sequences are the most fundamental objects of a biological system at the molecular level. Several plant genomes have been sequenced to a high quality, including Arabidopsis. Advances in sequencing technologies create opportunities in bioinformatics for managing, processing, and analyzing the sequences. Shotgun sequencing is currently the most common method in genome sequencing: pieces of DNA are sheared randomly, cloned, and sequenced in parallel. Software has been developed to piece together the random overlapping segments, which are sequenced separately, into a coherent and accurate continuous sequence.

The exponential growth of genomics has brought computational challenges: systematically collecting, storing, organizing, manipulating, visualizing, and analyzing the large amounts of biological information that come from experiments carried out by biologists. Thus bioinformatics, in its broad sense, can be seen as providing both the infrastructure and the scientific framework in which biologists take information and use computers to help convert it into knowledge. Although bioinformatics is a newly recognized discipline, an impressive diversity of bioinformatics resources is already available. A wide array of commercial resources exists, some of them ideally suited to specific tasks, but many resources are freely available.
Many of the databases and analysis tools we describe here are hosted by government or academic research centers and can be accessed via user-friendly web interfaces.

The term "sequence analysis" in biology means subjecting a DNA or peptide sequence to sequence alignment, sequence database searches, repeated sequence searches, or other bioinformatics methods on a computer. In bioinformatics, a sequence alignment is a way of arranging DNA, RNA, or protein sequences to identify regions of similarity that may be a consequence of functional, structural, or evolutionary relationships between the sequences. Sequence analysis can be used to assign function to genes and proteins by studying the similarities between the compared sequences. Many tools and techniques are now available that perform sequence comparisons (sequence alignment) and analyze the resulting alignment to understand the underlying biology.

In protein functional studies, we compare the protein sequence against secondary (or derived) protein databases that contain information on motifs, signatures, and protein domains. Highly significant hits against these pattern databases allow us to approximate the biochemical function of the query protein [38, 39].

Motif finding, also known as profile analysis, constructs global multiple sequence alignments that attempt to align short conserved sequence motifs among the sequences in the query set. This is usually done by first constructing a general global multiple sequence alignment, after which the highly conserved regions are isolated and used to construct a set of profile matrices.
The profile matrix for each conserved region is arranged like a scoring matrix, but its frequency counts for each amino acid or nucleotide at each position are derived from the conserved region's character distribution rather than from a more general empirical distribution.

Comparing sequences provides a foundation for many bioinformatics tools and may allow inference of the function, structure, and evolution of genes and genomes. For example, sequence comparison provides a basis for building a consensus gene model like UniGene. Many computational methods have also been developed for homology identification. Although sequence comparison is highly useful, it should be noted that it is based on sequence similarity between two strings of text, which may not correspond to homology (relatedness to a common ancestor in evolution), especially when the confidence level of a comparison result is low. Also, homology may not mean conservation in function.
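
The profile matrix described above can be illustrated with a minimal sketch. It assumes a conserved region has already been isolated from a multiple sequence alignment as gap-free rows of equal length. The function names (build_profile, consensus, log_odds_score), the pseudocount of 1, the uniform background, and the example rows are all hypothetical choices made for illustration; they are not taken from any particular tool.

```python
import math
from collections import Counter

DNA_ALPHABET = "ACGT"  # assumption: DNA; use the 20 amino-acid letters for proteins


def build_profile(region, alphabet=DNA_ALPHABET, pseudocount=1.0):
    """Turn one gap-free conserved region into per-position character frequencies."""
    if not region:
        raise ValueError("region must contain at least one sequence")
    width = len(region[0])
    if any(len(row) != width for row in region):
        raise ValueError("all rows of the region must have the same length")
    if any(set(row) - set(alphabet) for row in region):
        raise ValueError("region contains characters outside the alphabet")
    # The pseudocount keeps unseen characters from getting a frequency of zero.
    total = len(region) + pseudocount * len(alphabet)
    profile = []
    for pos in range(width):
        counts = Counter(row[pos] for row in region)
        profile.append({ch: (counts[ch] + pseudocount) / total for ch in alphabet})
    return profile


def consensus(profile):
    """Most frequent character at each position of the profile."""
    return "".join(max(column, key=column.get) for column in profile)


def log_odds_score(profile, candidate):
    """Score a candidate against the profile, assuming a uniform background."""
    if len(candidate) != len(profile):
        raise ValueError("candidate must be as long as the profile")
    score = 0.0
    for column, ch in zip(profile, candidate):
        background = 1.0 / len(column)  # assumption: all characters equally likely
        score += math.log2(column[ch] / background)
    return score


# Hypothetical conserved region: four aligned six-base motif instances.
region = ["TATAAT", "TATGAT", "TACAAT", "TATAAT"]
profile = build_profile(region)
print(consensus(profile))
print(log_odds_score(profile, "TATAAT"), log_odds_score(profile, "GGGCCC"))
```

Each column of the profile sums to 1, and a candidate that resembles the conserved region scores higher than an unrelated one. As noted above, a high score only indicates sequence similarity; it does not by itself establish homology or conserved function.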