Good morning, everyone!

This baby girl was born happy and innocent. She just wanted to play and have fun all the time. But destiny decided otherwise. Forty years later, she started feeling different. Something was going wrong in her body, so she decided to see a doctor. After extensive tests, the doctors found abnormal cells in her body. They looked like this:

She was diagnosed with BREAST CANCER!

A crash course in cancer!

What is cancer? How is it caused? These are some of the questions that plague us.

A normal cell looks like this:

Three important features of a normal cell:
1. It has a regular shape and size
2. It has normal DNA
3. It divides normally

When a cell undergoes mutation, these features are altered, leading to a cancerous cell. Cancerous cells divide abnormally, causing tumors. A mutation is a change in the sequence of base pairs in DNA.

Coming to breast cancer...

What is it? How is it caused?

Normal female breasts are made up of:
1) Lobules: milk-producing glands
2) Ducts: milk-transporting tubes
3) Stroma: fatty tissue surrounding the ducts and lobules
4) Blood vessels and lymph vessels

Breast cancer is a malignant tumor that develops from cells of the breast.

Some facts about breast cancer:
• Breast cancer is the most common cancer among women.
• It also occurs in men.
• It is estimated that in 2003 about 211,300 new cases will be diagnosed among women.
• It can be hereditary!

Doctors cannot explain why one person gets cancer and another doesn't. They can help by diagnosing the cancer at an early stage and slowing or stopping its development.

But we are not doctors. We are BIO-INFORMATICISTS. We have the opportunity to understand the disease and inform many others, including the doctors, about it.

Nobody has yet discovered the true cause of this disease. But we know that it can be inherited. We also know the risk factors involved. Some of them are:
1. Prolonged exposure to estrogen, a female hormone
2. Late childbearing (having a first child after about age 30)
3. Dense breasts, with a high proportion of lobular and ductal tissue
4. Exposure to radiation
5. High alcohol consumption

Genetics of breast cancer

So... no cure?? What next?? Are pain and suffering the only answers??

In 1994, scientists discovered two tumor suppressor genes:
1. BRCA1
2. BRCA2

Harmful mutations in these genes greatly increase the risk of breast cancer. After 1994, many other genes linked to breast cancer were discovered, but those genes are not directly involved in tumor formation.

BRCA1 was identified on the long arm of chromosome 17.
BRCA2 was identified on the long arm of chromosome 13.

A quick fact

About 45% of familial breast cancers, and more than 80% of inherited cases involving both breast and ovarian cancer, are reported to be linked to mutations in BRCA1.

The function of these genes was not clear until studies on a related protein in yeast revealed their normal role: they participate in repairing radiation-induced breaks in double-stranded DNA. This means that mutations might disable this repair mechanism, leading to more errors in DNA replication. These defective genes can be passed from parents to offspring.

I have concentrated on BRCA1 to make it easy for myself as well as for you all. From our very own NCBI web site, I gathered a lot of information on BRCA1.

A brief explanation

NCBI is the mother of all genomic web sites. Every new nucleotide, protein, or genomic sequence becomes readily available in its databases. There you can:
• Search over 18 vast databases for related sequences
• Translate sequences
• Learn a lot of other things

The latest development is SYNTHETIC LETHAL SCREENING

When cell death is caused by the combination of two or more mutations, none of which is lethal on its own, the genes involved are said to be synthetically lethal. This indicates a functional relationship between the corresponding gene products.

So what's my objective??

I chose to work on the BRCA1 gene. I decided to find related genes in other organisms that belong to different classes.
I want to learn how this gene evolved in different organisms. Maybe this information will help me better understand the gene and its function. NCBI was helpful in finding the related genes in other organisms.

Related sequences??

BRCA1 is a gene, and most genes encode a product. Gene products are mostly proteins. In related sequences, the gene products are very similar.

How will this help us?

We will compare the gene products in all the selected organisms. This will lead us to a point where we can understand the evolution of the specific gene we are looking at. Isn't that simple?? At least it sounds like it.

Chosen sequences

I chose sequences from the following organisms (listed with scientific name, common name, and the class they belong to).

Scientific name      Common name        Class/Kingdom
Homo sapiens         Man                Mammalia
Pan troglodytes      Chimp              Mammalia
Gallus gallus        Hen                Aves
Xenopus laevis       Frog               Amphibia
Drosophila           Fruit fly          Insecta
Caenorhabditis       Nematode           Nematoda
Plasmodium           Malaria parasite   Protist
Arabidopsis          Thale cress        Plant

Did you find anything strange in the list of sequences?

Yes... how come I included Plasmodium (a single-celled parasite), Arabidopsis (a plant), a nematode, and an insect? Scientists have reported related proteins in these organisms. Isn't it surprising??

Let's go ahead and analyze these sequences using the programs that Dr. Joshi taught us. It's fun, and very interesting.

NOTE: I am using only those programs that are required for my analysis.

Important analyses

% Similarity and identity: Using the doublegap.csh script, I calculated the percentage similarity and identity between the proteins.

Results

The human and chimp proteins showed 99% similarity.
The Arabidopsis, Drosophila, and nematode proteins showed very little similarity to the human and chimp proteins.
The frog and hen proteins came closer to the human and chimp proteins in terms of similarity.

Multiple sequence alignment: Using the pileup program, I aligned all the sequences. I ran into a problem. Plasmodium had a very long amino acid sequence.
This forced pileup to insert more gaps than its limit allows (the limit can actually be changed). So I left out the Plasmodium sequence. I used the following command to run the pileup program:

    pileup -MAXSeg=2000 -MAXGAP=5000 -OUTfile1=listgen.msf @listgen

This command lets me raise the gap limit to 5000 and the maximum segment alignment to 2000. Notice that the two add up to 7000. The total should always be 7000 (I don't know why; it is built into the program).

After aligning the protein sequences, I used PRETTY to find the consensus of the aligned sequences. Do you want to take a look at the PRETTY file?

Consensus sequences

Br1: Drosophila
Br2: Chimp
Br3: Hen
Br4: Frog
Br6: Human
Br7: Arabidopsis
Br8: Nematode

Some of it is confusing. Some of it makes sense. To understand it better, let's go ahead and get a phylogenetic picture.

Phylogenetic tree: I went to the ClustalW web site and fed it the pileup output file to get a beautiful phylogenetic tree.

The result???

An amazing phylogenetic tree that supports the theory of evolution.

Phylogenetic tree

Analysis of the phylogenetic tree

According to the theory of evolution, the prokaryotes sit at the bottom of the phylogenetic tree. Prokaryotes are followed by nematodes. Nematodes are followed by insects, amphibians, birds, and mammals. So we are at the top of the phylogenetic tree.

Humans and chimps are both mammals, so their proteins are closely related.

It looks great! But how does this help??

HOPE!!

These techniques will help scientists work with a different living system to understand how the gene functions. Yeast (which we did not discuss) is one of the living systems used extensively in research. Maybe in the future we will find out the actual cause of this dreadful disease and find a cure for it.

We should never lose hope. It is the only thing that will inspire us to reach our goals!
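
A small extra for the curious

If you want to see what the % identity and % similarity numbers from the analysis actually measure, here is a minimal Python sketch of the idea. It is not the doublegap.csh script, and it will not reproduce my results. The toy sequences are invented (they are not real BRCA1 data), the residue groups are a simplified assumption about which amino acids count as "similar", and the function names are hypothetical. Columns containing a gap are simply skipped.

```python
# Toy aligned protein fragments, invented for illustration (NOT real BRCA1 data).
# All strings have the same length; '-' marks a gap, as in an aligned MSF file.
ALIGNED = {
    "human": "MDLSAVKEEL",
    "chimp": "MDLSAVKEEL",
    "hen":   "MDLSGIREEL",
    "fly":   "MN-TAWQ-DF",
}

# Assumed groups of chemically similar residues (a simplification; real
# programs normally use a scoring matrix instead of fixed groups).
SIMILAR_GROUPS = ["ILVM", "FYW", "KRH", "DE", "ST", "NQ", "AG"]


def compared_columns(a, b):
    """Return the residue pairs from columns where neither sequence has a gap."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    return [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]


def is_similar(x, y):
    """True if the residues are identical or fall in the same assumed group."""
    return x == y or any(x in group and y in group for group in SIMILAR_GROUPS)


def percent_identity(a, b):
    """Percentage of gap-free columns where the two residues are identical."""
    cols = compared_columns(a, b)
    if not cols:
        return 0.0
    return 100.0 * sum(x == y for x, y in cols) / len(cols)


def percent_similarity(a, b):
    """Percentage of gap-free columns where the residues are identical or similar."""
    cols = compared_columns(a, b)
    if not cols:
        return 0.0
    return 100.0 * sum(is_similar(x, y) for x, y in cols) / len(cols)


if __name__ == "__main__":
    reference = ALIGNED["human"]
    for name, seq in ALIGNED.items():
        if name == "human":
            continue
        ident = percent_identity(reference, seq)
        simil = percent_similarity(reference, seq)
        print(f"human vs {name}: identity {ident:.1f}%, similarity {simil:.1f}%")
```

Similarity is always at least as high as identity, because every identical pair also counts as similar. That is why closely related proteins score high on both, while distant ones can still show some similarity even when few residues match exactly.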