Supplemental Methods

Protein Purification

T. weissflogii was grown as previously described¹ in 25L polycarbonate carboys bubbled with sterile-filtered air. Cells from 300L of culture were harvested by filtration. Cells from a 2L culture with 20mCi/L ¹⁰⁹Cd added to the medium were harvested by filtration and added to the cell pellet as a tracer. The pellet was resuspended in lysis buffer (50mM Tris-HCl pH 7.0, 1mM EDTA, 0.1mM DTT and 0.1% phenylmethylsulfonyl fluoride) and lysed by sonication on ice. Progress of cell lysis was followed by microscopy. The crude cell lysate was centrifuged for 30 minutes at 10,000 rpm in a Beckman JA-10 rotor at 4°C to remove frustules and other cell debris. This low-speed supernatant was then transferred and centrifuged at 50,000 rpm in a Beckman 60 Ti rotor for 2h at 4°C to pellet membrane fragments. The high-speed supernatant was subjected to sequential 30% and 70% ammonium sulfate precipitations.

The pellet from the 70% ammonium sulfate precipitation was resuspended in 200mL of 25mM sodium phosphate (pH 6.8) and 0.5M ammonium sulfate (HICA buffer). Of this, 100mL was loaded at 2mL/min onto a 20mL butyl hydrophobic interaction chromatography (HIC) column and washed with 5 column volumes of HICA buffer at 5mL/min. Protein was eluted with a 10 column volume, 0.5 to 0M gradient of ammonium sulfate in 25mM phosphate buffer (pH 6.8) run at 2.5mL/min. Forty 5mL fractions were collected. The HIC column was run twice, and the carbonic anhydrase (CA) activity in collected fractions was detected as previously described². Briefly, equal volumes of each fraction were run on 10% nondenaturing gels, which were then soaked in 0.1% bromthymol blue in 1x Laemmli running buffer (without SDS)³. Saturated CO2 gas blown over the gel revealed yellow bands of CA activity. Gels were then dried and radiolabeled bands were imaged by phosphorimaging.
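The HIC elution volumes quoted above can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only: the function names are hypothetical, and the salt calculation assumes a linear gradient with no dead volume, neither of which is stated in the methods.

```python
def gradient_summary(column_ml, column_volumes, flow_ml_min, fraction_ml):
    """Total gradient volume, run time and number of fractions."""
    gradient_ml = column_ml * column_volumes
    return {
        "gradient_ml": gradient_ml,
        "run_min": gradient_ml / flow_ml_min,
        "fractions": gradient_ml / fraction_ml,
    }


def salt_at(volume_ml, gradient_ml, start_m=0.5, end_m=0.0):
    """Ammonium sulfate molarity after volume_ml of the gradient.

    Assumes a linear gradient and ignores column dead volume.
    """
    return start_m + (end_m - start_m) * volume_ml / gradient_ml


# Values taken from the text: 20 mL column, 10 column volumes,
# 2.5 mL/min, 5 mL fractions.
summary = gradient_summary(column_ml=20, column_volumes=10,
                           flow_ml_min=2.5, fraction_ml=5)
print(summary)
print(salt_at(100, summary["gradient_ml"]))
```

With these inputs the gradient volume divides into exactly forty 5mL fractions, which agrees with the number of fractions reported.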
Pooled CA-containing fractions were concentrated by ultrafiltration and exchanged into 10mM histidine buffer (pH 6.0) using a Bio-Rad 10DG column. The protein mixture was then injected onto an HPLC with a TSK-DEAE ion-exchange column and eluted with a NaCl gradient. Fractions were assayed for CA activity and ¹⁰⁹Cd label as described above, and fractions containing the enzyme were pooled. Purified protein was assayed for enzyme activity, purity (Coomassie staining) and ¹⁰⁹Cd label using non-denaturing gels. N-terminal sequencing was performed by automated Edman degradation at the Princeton University Core Facility. Internal peptides were generated by trypsin digestion of the purified Cd-CA by the method of Fernandez and Mische (1996)⁴. Peptides were purified by C-18 capillary HPLC and sequenced as above. The peptide sequence of the amino terminus was NQSNTSSSTSKASLTPDQIVAALQERGWQAIVTEFSLLN and that of the internal peptide was IVIPSISPAQGAEL.

Sequence determination: PCR, cloning, sequencing

T. weissflogii total RNA was extracted from 400mL of mid-log-phase culture using TriReagent (Sigma, St Louis MO). cDNA was synthesized using the Clontech Smart RACE kit following the manufacturer's instructions. The nearly full-length cDNA was cloned in three steps (see Supplementary Figure 1). Degenerate primers P65-6 (5'-GGITGGCARACIGARATHG-3') and P65-3b (5'-ARYTCIGCICCYTGIGCIGG-3') were designed using the amino-terminal and internal peptide sequences. PCR reactions carried out with these primers yielded a 620 base pair product that contained both primer sequences and also encoded the amino-terminal peptide sequence downstream of the region targeted by P65-6 and upstream of the region targeted by P65-3b. Thus it was clear that we had amplified and cloned a cDNA fragment that encodes a large portion of the amino-terminal region of the Cd-CA.
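As an illustration of how a degenerate primer relates to the peptide it was designed from, the sketch below decodes each primer codon into the set of amino acids it can encode. It is not the authors' design procedure. It assumes the standard genetic code, treats inosine (I) as pairing with any base, and uses hypothetical helper names (`reverse_complement`, `residues_per_codon`, `covers`).

```python
from itertools import product

# IUPAC nucleotide codes used by the two primers. Inosine (I) is treated
# as matching any base, which is an assumption of this sketch.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "H": "ACT", "D": "AGT",
    "N": "ACGT", "I": "ACGT",
}
COMPLEMENT = {
    "A": "T", "C": "G", "G": "C", "T": "A",
    "R": "Y", "Y": "R", "H": "D", "D": "H", "N": "N", "I": "I",
}

# Standard genetic code, built from the usual TCAG-ordered amino acid string.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def reverse_complement(seq):
    """Reverse complement of a sequence written in IUPAC codes."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def residues_per_codon(seq):
    """For each complete codon, the set of amino acids it can encode."""
    result = []
    for start in range(0, len(seq) - len(seq) % 3, 3):
        codon = seq[start:start + 3]
        concrete = product(*(IUPAC[base] for base in codon))
        result.append({CODON_TABLE["".join(c)] for c in concrete})
    return result


def covers(seq, peptide):
    """True if the leading codons of seq can encode peptide."""
    sets = residues_per_codon(seq)
    return len(sets) >= len(peptide) and all(
        aa in options for aa, options in zip(peptide, sets)
    )


# Primer sequences as given in the text.
P65_6 = "GGITGGCARACIGARATHG"
P65_3B = "ARYTCIGCICCYTGIGCIGG"

# The reverse primer anneals to the coding strand, so it is decoded
# from its reverse complement.
print(residues_per_codon(P65_6))
print(residues_per_codon(reverse_complement(P65_3B)))
print(covers(reverse_complement(P65_3B), "PAQGAE"))
```

The last line checks the reverse primer against the residues PAQGAE from the end of the internal peptide IVIPSISPAQGAEL. Only complete codons are decoded, so the trailing partial codon of each primer is ignored.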
Two nested non-degenerate primers, CdCT-2 (5'-GGTCGACGTCGATCCTCAAGGC-3') and CdCT-1 (5'-CATCTTGAAATGCGTCCACGGACG-3'), were then used sequentially in conjunction with Clontech primers UP and NUP to amplify the last two thirds of the cDNA encoding CDCA1. Because the Cd-CA takes the form of three direct repeats of 200 amino acids each, the cDNA fragment encoding the amino terminus that was cloned in the first step did not overlap the larger fragment encoding the carboxy terminus that was cloned in the second step. Two separate fragments were cloned even though the primers used to amplify the second fragment were designed from the sequence of the first fragment. In the final step, to amplify the nearly full-length cDNA encoding all three direct repeats in the CDCA1 sequence, PCR was first performed using primers CCA-4 (5'-CGGAATTCTCCCTCCTCAACG-3') and CCA-3 (5'-CAAAAACTTGACCACATCCAA-3'). The resulting product was diluted 1:10,000 and a nested PCR reaction was run using primers CCA-4 and CCA-2 (5'-CGACATCGTCGAGGCCTTGAC-3') and Clontech hot-start Advantage Polymerase. The final PCR product was gel purified and cloned into a non-expression plasmid (TOPO-TA, Invitrogen). An additional clone of the entire CDCA1 gene was generated using mutagenic PCR primers 5'-CTCCCTCCTCAACGAAATGGT-3' (CCA-070302A) and 5'-CCCCGTCACAGCCATCATCTAAGG-3' (CCA-070302B), and its sequence was determined as described above. The cdca1 sequence has been submitted to GenBank (accession #AY772014).

X-ray Absorption Spectroscopy

Spectra were measured at the Stanford Synchrotron Radiation Laboratory (SSRL) on beamline 7-3, using a Si(220) double crystal monochromator, with the SPEAR storage ring containing 70-100 mA at 3.0GeV. Harmonics were rejected by detuning one monochromator crystal to 60% of peak intensity, and samples were maintained at 10K in an Oxford Instruments helium flow cryostat.
Spectra were measured using a 30-element Ge array detector, and incident and transmitted x-ray intensities were measured using Ar-filled ionization chambers. X-ray energy was calibrated with reference to the lowest energy inflection of a Cd metal foil, assumed to be 26714.0 eV. CdCA1 protein, purified from T. weissflogii as described above, was used for analyses. Because the sample was dilute (~7 µM Cd), a total of 59 scans, each taking 25min, were averaged in order to obtain adequate signal to noise.

References

1. Lane, T. W. & Morel, F. M. M. Proc. Natl. Acad. Sci. 97, 4627-4631 (2000).
2. Roberts, S., Lane, T. & Morel, F. Journal of Phycology 33, 845-850 (1997).
3. Sambrook, J., Fritsch, E. F. & Maniatis, T. Molecular Cloning: A Laboratory Manual (Cold Spring Harbor Laboratory Press, Cold Spring Harbor, New York, 1989).
4. Fernandez, J. & Mische, S. M. The Protein Protocols Handbook. J. M. Walker, ed., p 405-414, Humana Press, Totowa, NJ (1996).

Supplementary Figure 1. a.) The derived amino acid sequence of CDCA1: three repeats are located within CDCA1 sequenced from T. weissflogii. b.) Alignment of the three repeats (R1-R3, where * indicates where the sequence continues at the beginning of the subsequent repeat) of CDCA1 from Thalassiosira weissflogii, and that derived from the entire homologous gene from the genome of the marine diatom Thalassiosira pseudonana (Tp-CdCA).

a.
NQSNTSSSTSKASLTPDQIVAALQERGWQAEIVTEFSLLNEMVDVDPQGILKCVDGRGSDNTQFCGPKMPG
GIYAIAHNRGVTTLEGLKQITKEVASKGHVPSVHGDHSSDMLGCGFFKLWVTGRFDDMGYPRPQFDADQGA
KAVENAGGVIEMHHGSHAEKVVYINLVENKTLEPDEDDQRFIVDGWAAGKFGLDVPKFLIAAAATVEMLGG
PKKAKIVIPSISPAQIAEALQGRGWDAEIVTDASMAGQLVDVRPEGILKCVDGRGSDNTIMGGPKMPGGIY
AIAHNRGVTSIEGLKQITKEVASKGHLPSVHGDHSSDMLGCGFFKLWVTGRFDDMGYPRPQFDADQDANAV
KDAGGIIEMHHGSHTEKVVYINLLANKTLEPNENDQRFIVDGWAADKFGLDVPKFLIAAAATVEMLGGPKN
AKIVVPSITPPQIVSALRGRGWKASIVKASTMSSELKRVDPQGILKCVDGRGSDNTQFGGPKMPGGIYAIA
HNRGVTTLEGLKDITREVASKGHVPSVHGDHSSDMLGCGFFKLWLTGRFDDMGYPRPEFDADQGALAVRAA
GGVIEMHHGSHEEKVVYINLVSGMTLEPNEHDQRFIVDGWAASKFGLDVVKFLVAAAATVEMLGGPKKAKI
VIP*

b.