Small Project #2

Project number 2: To work with, and think about, binding sites
for DNA binding proteins (DUE December 1st by 11:59 pm)
Zheng et al. published a paper in 2004 in which they used a couple of different
methods to identify genes and operons in the sequenced E. coli strain MG1655
that are regulated by CAP. They found that ~180 promoter regions were
activated by CAP•cAMP and about 20 were repressed by CAP•cAMP. The end result
of this work is shown in Tables 1, 2 and 3 of the paper (which you can
download from the class website).
For this project, two main things need to be done.
1a. Collect 10 CAP binding site sequences from Tables 1 and/or 2. Line them
up with each other and generate a consensus sequence. (Show this work, and
include the names of the genes that you used.) (8 pts)
1b. Take your aligned sequences and put them into the weblogo program
(http://weblogo.berkeley.edu/logo.cgi) to create a WebLogo like the one
below.
How does yours compare to the one below (Logo’s CAP consensus)? (4 pts)
2a. Find a CAP site from Table 1 or Table 2 in Zheng et al. Choose one whose
gene lacks an obvious reason to need CAP regulation. That is, don't pick a
sugar catabolism gene.
Next, find the CAP binding site in the promoter region of its gene, and find
the -35, -10, and +1 sites as well as the start codon of the first gene
regulated by it. These should be mapped out on the gene sequence as shown
below. Indicate whether your gene shows type I or type II activation by
CAP. (12 pts)
2b. Write a few sentences on why you think your gene/operon is regulated by
CAP•cAMP. Be sure to include some information on what the gene/operon does
(more than the small description in Table 1/2). (6 pts)
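For part 1a, the consensus can be worked out by eye, but a short script is a handy cross-check. The sketch below is only one possible approach: the function name `consensus`, the `min_fraction` cut-off, and the toy sequences are all made up for illustration (the toy sequences are NOT real CAP sites from Zheng et al.). It assumes the sequences have already been lined up and are all the same length.

```python
from collections import Counter


def consensus(aligned, min_fraction=0.5):
    """Return a consensus string for equal-length, pre-aligned sequences.

    A column gets its most common base if that base appears in at least
    min_fraction of the sequences; otherwise it gets 'N'.
    (min_fraction = 0.5 is an arbitrary choice for this sketch.)
    """
    lengths = {len(s) for s in aligned}
    if len(lengths) != 1:
        raise ValueError("all aligned sequences must have the same length")
    result = []
    for column in zip(*aligned):  # walk the alignment column by column
        base, count = Counter(column).most_common(1)[0]
        result.append(base if count / len(aligned) >= min_fraction else "N")
    return "".join(result)


# Made-up placeholder sequences, NOT real CAP sites from the paper.
toy_sites = [
    "TGTGA",
    "TGTGA",
    "TGCGA",
    "AGTGA",
]
print(consensus(toy_sites))
```

Swapping in your own 10 aligned sites for `toy_sites` gives a quick check on the consensus you built by hand; the hand alignment and gene names are still what gets graded.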
An example: if you chose lacZ (which you won't because it is a
sugar catabolism gene)
1. Go to the KEGG genes database (http://www.genome.jp/kegg/genes.html).
2. Type the species and gene name into the search box like this: eco:lacZ.
3. This will bring you to a KEGG page with information about your gene. Down
near the bottom is the gene's sequence from its start to its finish.
Unfortunately, you need more than this: you need the promoter region, which is
upstream of the gene. To get that, enter 400 bases in the "upstream" box; that
should be more than enough (I added 300 in this case).
Hit the "NT seq" button.
4. Finally, the sequence is at hand. Your gene is in blue at the bottom, the
space between your gene and the next one upstream is in black, and the
upstream gene (lacI, pointing in the same direction as lacZ) is shown in blue
up at the top. If the upstream gene is more than 400 bp away from your gene,
you won't see it unless you add more DNA in step 3. Green text upstream means
the next gene is in the opposite orientation.
5. Lastly, take this DNA sequence into Word and find and highlight the CAP
site(s) etc. as outlined on the first page. Don't just blindly paste your CAP
sequence into Word and hit find. If the CAP binding site is broken by a line
return (paragraph return), Word won't find it. So, best to look for it by eye.
TTCCCGACTGGAAAGCGGGCAGTGAGCGCAACGCAATTAATGTGAGTTAGCTCACTCATTAGGC
                      ^^^            ^^^^^^^^^^^^^^^^^^^^^^
                      lacI stop      CAP
ACCCCAGGCTTTACACTTTATGCTTCCGGCTCGTATGTTGTGTGGAATTGTGAGCGGATAACAA
         ^^^^^^                  ^^^^^^      ^
         -35                     -10         +1
TTTCACACAGGAAACAGCTATGACCATGATTACGGATTCACTGGCCGTCGTTTTACAACGTCGT
                   ^^^
                   lacZ start
GACTGGGAAAACCCT
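The line-return problem from step 5 goes away if the sequence is joined into one string before searching. Here is a minimal sketch of that idea, using the lacZ upstream sequence from this handout. The helper names (`clean`, `reverse_complement`, `find_site`) are invented for this example, and the query is simply a stretch chosen because it straddles the first line break. The search also checks the reverse complement, on the assumption that a site may be listed on either strand.

```python
def clean(seq_text):
    """Drop all whitespace (including line breaks) and upper-case the sequence."""
    return "".join(seq_text.split()).upper()


def reverse_complement(seq):
    """Return the reverse complement of an A/C/G/T string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_site(sequence, site):
    """Return (0-based start, strand) for every match of site on either strand.

    Note: a perfectly palindromic site is reported once per strand.
    """
    hits = []
    for strand, query in (("+", site), ("-", reverse_complement(site))):
        start = sequence.find(query)
        while start != -1:
            hits.append((start, strand))
            start = sequence.find(query, start + 1)
    return hits


# The lacZ upstream region exactly as printed in this handout, line breaks and all.
lac_region = clean("""
TTCCCGACTGGAAAGCGGGCAGTGAGCGCAACGCAATTAATGTGAGTTAGCTCACTCATTAGGC
ACCCCAGGCTTTACACTTTATGCTTCCGGCTCGTATGTTGTGTGGAATTGTGAGCGGATAACAA
TTTCACACAGGAAACAGCTATGACCATGATTACGGATTCACTGGCCGTCGTTTTACAACGTCGT
GACTGGGAAAACCCT
""")

# This query spans the end of line 1 and the start of line 2,
# which is exactly the case a plain text search in Word would miss.
print(find_site(lac_region, "TTAGGCACCCCAGG"))
```

A search like this only finds exact matches, so it is a way to double-check a site you have already spotted by eye, not a replacement for looking.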
Note: Table 2 indicates where the CAP sites are relative to the
+1 site. From that info, you should be able to find reasonable -35 and -10
candidates. Realize they may be off the standard sigma70 consensus by a bit,
like the ones for lacZYA are.
BUT: Table 1 indicates where the CAP sites are relative to the
start of the gene (i.e., relative to the “A” of “ATG” if the start
codon is an ATG start; there are sometimes GTG and TTG starts).
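To see how the two reference points in the note differ, the sketch below computes the center of the lac CAP site relative to the +1 site and relative to the "A" of the ATG. Several things here are assumptions made for illustration: the function names; the numbering convention (the reference base is +1, the base just upstream is -1, and there is no position 0), which may not match the tables exactly; the 22-bp extent used for the lac CAP site; and the -50 cut-off in `guess_activation_type`, which is only a rough stand-in for the usual description that type II sites overlap the -35 region (centered near -41.5) while type I sites sit further upstream (near -61.5 or beyond).

```python
def relative_coordinate(index, reference_index):
    """Convert a 0-based string index to a promoter-style coordinate.

    Assumed convention: the reference base is +1, the base just
    upstream of it is -1, and there is no position 0.
    """
    offset = index - reference_index
    return offset + 1 if offset >= 0 else offset


def site_center(site_start, site_length, reference_index):
    """Center of a site lying entirely upstream of the reference base."""
    first = relative_coordinate(site_start, reference_index)
    last = relative_coordinate(site_start + site_length - 1, reference_index)
    return (first + last) / 2


def guess_activation_type(center_from_plus1):
    """Rough heuristic only; the -50 cut-off is an assumption of this sketch."""
    return "type II" if center_from_plus1 > -50 else "type I"


# lacZ upstream region from this handout, joined into a single string.
region = (
    "TTCCCGACTGGAAAGCGGGCAGTGAGCGCAACGCAATTAATGTGAGTTAGCTCACTCATTAGGC"
    "ACCCCAGGCTTTACACTTTATGCTTCCGGCTCGTATGTTGTGTGGAATTGTGAGCGGATAACAA"
    "TTTCACACAGGAAACAGCTATGACCATGATTACGGATTCACTGGCCGTCGTTTTACAACGTCGT"
    "GACTGGGAAAACCCT"
)

cap_site = "TAATGTGAGTTAGCTCACTCAT"          # 22-bp lac CAP site (assumed extent)
cap_start = region.find(cap_site)            # 0-based index of the site
plus1 = region.find("AATTGTGAGCGG")          # first A here is the lac +1 site
atg_a = region.find("ATGACCATG")             # the A of the lacZ start codon

center_vs_plus1 = site_center(cap_start, len(cap_site), plus1)
center_vs_atg = site_center(cap_start, len(cap_site), atg_a)

print("CAP center relative to +1: ", center_vs_plus1)
print("CAP center relative to ATG:", center_vs_atg)
print("Heuristic call:", guess_activation_type(center_vs_plus1))
```

The same site gets two quite different numbers depending on the reference point, which is why it matters whether your position came from Table 1 or Table 2.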