Normal Variant:
>gi|32880123|gb|AAP88892.1|
MGPWGWKLRWTVALLLAAAGTAVGDRCERNEFQCQDGKCISYKWVCDGSAECQDG
SDESQETCLSVTCKSGDFSCGGRVNRCIPQFWRCDGQVDCDNGSDEQGCPPKTCS
QDEFRCHDGKCISRQFVCDSDRDCLDGSDE
ASCPVLTCGPASFQCNSSTCIPQLWACDNDPDCEDGSDEWPQRCRGLYVFQGDSS
PCSAFEFHCLSGECIHSSWRCDGGPDCKDKSDEENCAVATCRPDEFQCSDGNCIG
SRQCDREYDCKDMSDEVGCVNVTLCEGPN
KFKCHSGECITLDKVCNMARDCRDWSDEPIKECGTNECLDNNGGCSHVCNDLKIG
YECLCPDGFQLVAQRRCEDIDECQDPDTCSQLCVNLEGGYKCQCEEGFQLDPHTK
ACKAVGSIAYLFFTNRHEVRKMTLDRSEYTSLIPNLRNVVALDTEVASNRIYWSD
LSQRMICSTQLDRAHGVSSYDTVISRDIQAPDGLAVDWIHSNIYWTDSVLGTVSV
ADTKGVKRKTLFRENGSKPRAIVVDPVHGFMYWTDWGTPAKIKKGGLNGVDIYSL
VTENIQWPNGITLDLLSGRLYWVDSKLHSISSIDVNGGNRKTILEDEKRLAHPFS
LAVFEDKVFWTDIINEAIFSANRLTGSDVNLLAENLLSPEDMVLFHNLTQPRGVN
WCERTTLSNGGCQYLCLPAPQINPHSPKFTCACPD
GMLLARDMRSCLTEAEAAVATQETSTVRLKVSSTAVRTQHTTTRPVPDTSRLPGA
TPGLTTVEIVTMSHQALGDVAGRGNEKKPSSVRALSIVLPIVLLVFLCLGVFLLW
KNWRLKNINSINFDNPVYQKTTEDEVHICHNQDGYSYPSRQMVSLEDDVA

Disease Variant:
>gi|62088398|dbj|BAD92646.1|
SGSGHCLAEAASMGPWGWKLRWTVALLLAAAGTAVGDRCERNEFQCQDGKCISYK
WVCDGSAECQDGSDESQETCLSVTCKSGDFSCGGRVNRCIPQFWRCDGQVDCDNG
SDEQGCPPKTCSQDEFRCHDGKCISRQFVC
DSDRDCLDGSDEASCPVLTCGPASFQCNSSTCIPQLWACDNDPDCEDGSDEWPQR
CRGLYVFQGDSSPCSAFEFHCLSGECIHSSWRCDGGPDCKDKSDEENCAVATCRP
DEFQCSDGNCIHGSRQCDREYDCKDMSDEV
GCVNVTLCEGPNKFKCHSGECITLDKVCNMARDCRDWSDEPIKECGTNECLDNNG
GCSHVCNDLKIGYECLCPDGFQLVAQRRCEDIDECQDPDTCSQLCVNLEGGYKCQ
CEEGFQLDPHTKACKAVGSIAYLFFTNRHE
VRKMTLDRSEYTSLIPNLRNVVALDTEVASNRIYWSDLSQRMICSTQLDRAHGVS
SYDTVISRDIQAPDGLAVDWIHSNIYWTDSVLGTVSVADTKGVKRKTLFRENGSK
PRAIVVDPVHGFMYWTDWGTPAKIKKGGLN
GVDIYSLVTENIQWPNGITLDLLSGRLYWVDSKLHSISSIDVNGGNRKTILEDEK
RLAHPFSLAVFEDKVFWTDIINEAIFSANRLTGSDVNLLAENLLSPEDMVLFHNL
TQPRGVNWCERTTLSNGGCQYLCLPAPQIN
PHSPKFTCACPDGMLLARDMRSCLTEAEAAVATQETSTVRLKVSSTAVRTQHTTT
RPVPDTSRLPGATPGLTTVEIVTMSHQALGDVAGRGNEKKPSSVRALSIVLPIVL
LVFLCLGVFLLWKNWRLKNINSINFDNPVYQKTTEDEVHICHNQDGYSYPSMVSL
EDDVA

Above, you’ve been given FASTA sequences of an unknown protein and that same unknown protein with a change in amino acid sequence.
The unknown protein with no amino acid sequence changes is the normal variant in the population (wild-type). The protein sequence with the amino acid change(s) results in a variant of the protein that does not function properly. These sequences will be used as part 2 of your Proteomics Laboratory. This protein has an important function for humans. Humans would not be able to survive if they did not have this protein. Furthermore, individuals who have a variant that does not function properly are very sick. These patients have a disease in which thick mucus builds up in the lungs. This mucus blocks the ability of the patient to breathe properly. Often these patients die at a very young age. Follow the questions below to determine what protein we are studying, where the change in amino acid sequence is, and what disease individuals who carry the non-functional variant have.

Above, you’ve been given FASTA sequences of an unknown protein and that same unknown protein with a change in amino acid sequence. The unknown protein with no amino acid sequence changes is the normal variant in the population (wild-type). The protein sequence with the amino acid change(s) results in a variant of the protein that does not function properly. These sequences will be used as part 2 of your Proteomics Laboratory. This protein has an important function for humans. Humans would not be able to survive if they did not have this protein. Furthermore, individuals who have a variant that does not function properly are very sick. Instead of having red blood cells that are biconcave (like those we saw in the microscopy lab), these patients have red blood cells that look like sickles. This has important effects on how much oxygen can be carried by these cells. Follow the questions below to determine what protein we are studying, where the change in amino acid sequence is, and what disease individuals who carry the non-functional variant have.
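A FASTA record consists of a header line beginning with `>` followed by one or more lines of sequence. If you would like to check sequence lengths or clean up pasted sequences before using the web tools, the following is a minimal sketch in Python. The function name `parse_fasta` and the two short toy records are made up for illustration; they are not the laboratory sequences.

```python
def parse_fasta(text):
    """Return a dict mapping each header (without '>') to its sequence."""
    records = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            header = line[1:]
            records[header] = []
        elif header is not None:
            # Sequence lines may contain stray spaces after copy and paste.
            records[header].append(line.replace(" ", ""))
    return {name: "".join(parts) for name, parts in records.items()}


# Hypothetical toy records, not the sequences from this lab.
example = """>toy_normal
MGPWGWKLRW
TVALLLAAAG
>toy_variant
MGPWGWKLRW
TVALLAAAG
"""

sequences = parse_fasta(example)
for name, seq in sequences.items():
    print(name, len(seq))
```

To use it on the laboratory sequences, paste the two records into a text string in place of `example`.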
Now, let’s work through the steps to figure out which protein we are studying.
1. Take your unknown sequence (normal variant) and place that sequence into BLAST. The first entry should be your protein of interest. Write down the name of the protein.
2. Look at the list of homologs returned by your BLAST search. How many proteins from the list can be declared homologs? (Hint: Think about E-values.)
3. From your list of homologs, choose 5 from species other than Homo sapiens. Note the identities and similarities (positives) in the slot below. Also note the Latin name and the common name for each species. For instance, if you chose one from Mus musculus, you would have chosen the mouse.
4. For the 5 homologs you’ve chosen above, find the sequences as you did last week, and put them in FASTA format. Then copy and paste them into a Microsoft Word file as you did last week. We will use these sequences later.
5. Now let’s find some information about your protein. To do this, use the NCBI protein program (the first one you used last week). In the box at the top of the page, type the name of your protein, and then search. You will get back many hits from your search. Find either the first or second entry for Homo sapiens.
6. In these entries there should be a summary paragraph. This summary paragraph will tell you all about the protein. In the space provided below, discuss the function of the unknown protein as summarized in the paragraph.
7. In the space provided below, note which disease a patient might have if they can only make a non-functional variant of your protein.
8. Now let’s find some information about this disease. From the summary paragraph, write down the given information about the disease. At this point, you should use the web to search for further information about the disease. Note the symptoms in detail, which people are most likely to be affected, where the disease would most likely be found, the life span of patients, etc.
9. 
Now, let’s go ahead and do some sequence alignments. Go to ClustalW. Align your normal human sequence with the five homologs (be sure to place your human sequence first). Note the alignment scores between the human sequence and the protein sequence of each homolog.
10. Using ClustalW, make a cladogram using the normal variant human sequence and the 5 homologs. What conclusions can you draw from your cladogram?
11. Now let’s figure out how your disease variant differs from your normal variant. Use ClustalW. Place the normal variant and the disease variant in the sequence box and align them. In the space below, state in detail the amino acid number and the change in amino acid sequence.
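
Once two sequences are lined up, finding the change in the last step amounts to walking along both and noting where they stop matching. The following is a minimal sketch of that idea in Python, using the standard library's `difflib.SequenceMatcher`. This is a generic text comparison, not ClustalW and not a biological alignment with a scoring matrix, so treat it only as a way to double-check what you read off your ClustalW output. The function name `list_differences` and the toy peptides are made up for illustration.

```python
from difflib import SequenceMatcher


def list_differences(normal, variant):
    """List the places where two sequences differ.

    Each entry is (kind, position, normal_residues, variant_residues),
    where position is the 1-based residue number in the normal sequence
    at which the change starts.
    """
    # autojunk=False turns off a heuristic that can skip frequent
    # characters in long strings, which protein sequences would trigger.
    matcher = SequenceMatcher(None, normal, variant, autojunk=False)
    changes = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            changes.append((tag, i1 + 1, normal[i1:i2], variant[j1:j2]))
    return changes


# Hypothetical toy peptides, not the sequences from this lab.
toy_normal = "MKTAYIAKQR"
toy_substitution = "MKTAYIVKQR"  # residue 7 changed from A to V
toy_deletion = "MKTAYIQR"        # residues 7-8 (AK) removed

print(list_differences(toy_normal, toy_substitution))
print(list_differences(toy_normal, toy_deletion))
```

A "replace" entry corresponds to a substitution, a "delete" entry to residues missing from the variant, and an "insert" entry to residues present only in the variant.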