BI PAH questions 2009

Step 1. Getting sequences and learning about your protein

Gene database at NCBI
9. Is this a primary or a secondary database? Explain your answer.
10. What type of information may be available for a gene in this database?
11. What subset of sequences is included in this database?
12. Explain what RefSeq is. (You will have to search the NCBI Web Site to find this; go back to the home page and select NCBI Web Site from the drop-down list.)

PTGS entry
13. What is the official Gene symbol?
14. What is the Gene name?
15. What is the GeneID number?
16. Where in the human genome is this gene located?
17. What is the function of this protein?
18. What specific role does it play in the cell?
19. What is the RefSeq accession number for the mRNA of this gene?
20. What is the RefSeq accession number for the protein sequence (it is the product of the gene)?
21. Is this a primary or secondary database? Why?
22. Would you expect to see the same protein sequence represented more than once? Why or why not?
23. What is the SwissProt accession number? (Hint: for proteins it often starts with a P.)
24. What are the alternate names for this protein?
25. Where in the cell is this protein located? (If it is known it will be in the comments; if it is not listed, your answer is "unknown".)
26. Does this protein exist as a monomer, dimer (homo or hetero), etc.?

OMIM
27. What information is available at this site?
28. Is this a primary or a secondary database? Why?
29. What book is the information in OMIM based on?
30. Would you expect to find information on an infectious disease such as Herpes in this database? Why or why not?
31. What disease is associated with the protein you are studying?

Project specific questions
32. In addition to the general questions, answer the following PAH-specific questions (use information from these databases or the Berg reading material; note that you may need to click on links). Please indicate which source each answer comes from.
    a. What metabolic pathway does this protein belong to?
    b. Which three amino acid residue numbers bind the iron atom, and how was this determined?
    c. Which residue is modified by phosphorylation? Residue 16 (SwissProt)
    d. Diagram the reaction catalyzed by this enzyme (include structures).
    e. What is the difference between PKU and the related disease hyperphenylalaninemia?
    f. Suggest an explanation for why there is a range of symptoms among different hyperphenylalaninemia individuals. (I'm looking for your own ideas here.)
    g. Normally, what fraction of phenylalanine is converted to tyrosine?

Step 2. Finding related sequences and setting up a multi-sequence FASTA file

BLAST (at NCBI)
33. What does this acronym stand for and what is it used for?
34. (a) From pg 110: In your own words, describe what FASTA format is.

Step 3. Creating the MSA using ClustalW

34. What information can be obtained from a multiple sequence alignment of related proteins?
35. What are three ways this information can be used?
36. What types of sequences can be aligned by ClustalW?
37. Print the output and hand one copy in at the end of today's lab. Also answer the following questions.
38. What is the mutation? Write it in the format "Res123Res", where the first Res is the three-letter code for the amino acid in the un-mutated (wild type) protein and the second Res is the amino acid in the mutated protein. In place of "123" put the amino acid residue number of the mutation.
39. Is the mutation in a region of conservation? How do you know?
40. If the mutation were in a region of conservation, what would this suggest?
41. What properties differ between the mutant and normal protein amino acid (see Betts and Russell, 2003)?

Step 4. Analysis of the 3D structure

42. Experimental method used to obtain the data:
43. Resolution of the structure:
44. Species the protein was obtained from:
45. List any ligands, cofactors or metal ions included in the structure (particularly important for enzymes):
46. Project specific questions. These can be answered using the PDB-specific information available at RCSB, or by going to the journal article on which this structure is based (direct link to the abstract at RCSB).
    a. Who is the first author, and what is the journal name, for the primary citation for this crystal structure?
    b. Was the entire PAH protein used to obtain the crystal structure (if not, what portion was included)?
    c. What molecules were bound to the protein and what 'real' molecules do they represent?
    d. Compare and contrast the structure of the substrate analog with phenylalanine.
47. The actual secondary structure of your protein is shown in Control Panel. Write the actual secondary structure above the MSA that was generated in Step 3. Use the symbol h for helix and s for beta sheet (write nothing for loops). (Answer on your MSA.)
48. What groups, and what exact atom of each group, are involved in the H-bond (for example, backbone O of Glu14)?
49. What type of secondary structure are these groups involved in?
50. List any changed, lost, or new H-bonds between the variant amino acid and other groups.
51. If a different group (or groups) is associated with the variant amino acid than with the normal amino acid, what type(s) of structure is this group involved in?
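
Note on the "Res123Res" notation from question 38: the format can be checked mechanically. The sketch below is only an illustration, assuming the 20 standard amino acids and their usual three-letter codes. The function names format_mutation and parse_mutation are hypothetical helpers, not part of any course software, and the example mutation Ala123Gly is made up (it is not the answer to question 38).

```python
import re

# Three-letter codes for the 20 standard amino acids (assumption: no
# non-standard residues such as selenocysteine are needed here).
THREE_LETTER_CODES = {
    "Ala", "Arg", "Asn", "Asp", "Cys", "Gln", "Glu", "Gly", "His", "Ile",
    "Leu", "Lys", "Met", "Phe", "Pro", "Ser", "Thr", "Trp", "Tyr", "Val",
}

# Wild-type code, residue number, mutant code, with nothing else around them.
MUTATION_PATTERN = re.compile(r"^([A-Z][a-z]{2})(\d+)([A-Z][a-z]{2})$")


def format_mutation(wild_type, position, mutant):
    """Build a 'Res123Res' string from its three parts."""
    for code in (wild_type, mutant):
        if code not in THREE_LETTER_CODES:
            raise ValueError(f"unknown amino acid code: {code!r}")
    if position < 1:
        raise ValueError("residue numbers start at 1")
    return f"{wild_type}{position}{mutant}"


def parse_mutation(text):
    """Split a 'Res123Res' string into (wild_type, position, mutant)."""
    match = MUTATION_PATTERN.match(text)
    if match is None:
        raise ValueError(f"not in Res123Res format: {text!r}")
    wild_type, position, mutant = match.group(1), int(match.group(2)), match.group(3)
    # Reuse the formatter's checks so both directions agree on what is valid.
    format_mutation(wild_type, position, mutant)
    return wild_type, position, mutant


if __name__ == "__main__":
    # Made-up example: alanine at residue 123 replaced by glycine.
    print(format_mutation("Ala", 123, "Gly"))
    print(parse_mutation("Ala123Gly"))
```

Parsing an answer back into its parts is a quick way to confirm that the wild-type residue comes first, the residue number is in the middle, and the mutant residue comes last.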