EVERYTHING YOU NEED TO KNOW TO GET STARTED IN SYNTHETIC BIOLOGY
Janie Brennan
Purdue University
5/16/12

Outline
• Introduction (definitions and history)
• Fundamentals of molecular biology
• Assembly methods
• Part characterization
• Modeling
• Design theory
• More about iGEM
• Other stuff you might want to know

Disclaimer
• This is NOT a comprehensive look at Synthetic Biology….just an introduction
• It may not be *everything* you need to know, but I’ll try to get close
• For everything else, there’s Wikipedia
• For more detailed topics, other people will be a much better source of information:
  • Transcriptional control strategies, the Registry, and knowledge about what types of parts are possible: Sean Kearney
  • Lab Equipment: Jenna Rickus and Rickus lab grad students
  • Cloning particulars/Reagents: Dow AgroSciences

What is synthetic biology (synbio)?
• Synthetic Biology is:
  • The design and construction of new biological parts, devices, and systems, and
  • The re-design of existing, natural biological systems for useful purposes. (from SyntheticBiology.org)
• Basically, it’s genetic engineering (aka molecular biology) for practical purposes
• Try to look at genetics from engineering paradigm
  • Genes and different genetic components = “parts”
  • Groups of components = “devices”
  • Model devices as circuits

Who’s in charge?
• The Biobricks Foundation (BBF) is one of the biggest initiatives worldwide driving the development of synbio
• Goals of the BBF:
  • Educate the public about synbio and biotechnology
  • Make biotech simple to do and available to the public
  • Define ethical concerns and practices in synbio
• The BBF developed and maintains the Parts Registry

How does iGEM fit into this?
• Both the BBF and iGEM were started by Drew Endy and Tom Knight of MIT in the early 2000s
• iGEM has increased the public interest in synbio and helped drive the development of standard parts and methods
• BBF and iGEM are also related to the BioBuilder Foundation, which hosts the educational BioBuilder website
  • This is where Dr. Clase got a lot of the IT 227 course material
  • Based a lot on the material developed for the MIT course 20.020, developed by Natalie Kuldell (who is President of BioBuilder)

Outline
• Introduction (definitions and history)
• Fundamentals of molecular biology
• Assembly methods
• Part characterization
• Modeling
• Design theory
• More about iGEM
• Other stuff you might want to know

Fundamentals of molecular biology… ….or “What makes up a device?”
• For this section, I drew a lot of basic information from the Registry: http://partsregistry.org
• You can also find most of this (minus the synbio-specific terminology) in any genetics or biochemistry textbook (or Wikipedia)
• Some of this is probably review, but I figured it would be better to include too much rather than too little

Synbio jargon
• A “part” is a DNA sequence encoding some part of the genetic machinery, including but not limited to:
  • Promoters
  • Ribosome binding sites (RBS)
  • Protein coding regions
  • Terminators
• A “device” is a group of parts that work together for specific functions, such as:
  • Protein production
  • Sensing/reporting
  • Measurement
  • Signal inversion
  • Cell signaling
  • Cell motility

More jargon
• A “chassis” is the organism (host) containing the devices
  • Usually E. coli or S. cerevisiae (yeast)
  • Also can be bacteriophage (virus), plants, mammals, etc……anything, really
• “Cloning” does not refer to Dolly the sheep!
  • Refers to manipulation of recombinant DNA

Promoters
• Control transcription of downstream coding region by recruiting RNA polymerases
• Can be constitutive (always “on”) or controlled by external signals
• Often constitute the main logical part of your device….
  • The promoter is basically an “if” statement (to use a programming analogy)
  • Sensitivity to external signals composes the AND/OR/NOT logic
• Promoters can be sensitive to things like repressor proteins, activator proteins/transcription factors, metals, or other factors in cell signaling
• Promoters have a wide range of strengths and activities
• They can even be designed to suit your own purposes! http://partsregistry.org/Promoters/Design

Ribosome binding sites (RBS)
• Control translation level by recruiting ribosomes to the mRNA transcript
• Therefore, an RBS is necessary if you want to make a protein
• Most are constitutive
• Many strengths are available
• Like promoters, RBSs can be designed to suit your project: http://partsregistry.org/Ribosome_Binding_Sites/Design

Protein coding sequences (CDS)
• DNA sequences that encode mRNA transcripts that later get translated into proteins by ribosomes
• Begin with a start codon (ATG) encoding methionine
• End with stop codons (often TAA TAA)
• Can also include special features at each end
  • At N-terminus (front end): export tags, attachment tags, or protein cleavage sites
  • At C-terminus (tail end): degradation tags
• Common protein products:
  • Reporters (produce a measurable signal like GFP, etc)
  • Transcriptional regulators (activate/repress transcription)
  • Selection markers (antibiotic resistance, etc)
  • Enzymes for biosynthesis, DNA modification (ligases/polymerases/etc.), degradation/cleavage (proteases), etc.
  • Membrane proteins (surface display, transporters, channels, pumps)
  • Receptors/ligands
• As you might have guessed, CDSs can also be designed: http://partsregistry.org/Help:Protein_coding_sequences/Design

Terminators
• Cause transcription to stop (the polymerase falls off the DNA and releases the mRNA)
• Can terminate transcription on the forward strand, reverse strand, or both
• Don’t stop transcription with 100% efficiency
  • Strength can be tuned/designed, too
• Can be rho-dependent or rho-independent
  • Rho-independent terminators work based on stem-loop structures formed by GC-rich sequences (figure not shown)
  • The Registry only includes rho-independent terminators
• You need one (or two) of these in all your devices!

Outline
• Introduction (definitions and history)
• Fundamentals of molecular biology
• Assembly methods
• Part characterization
• Modeling
• Design theory
• More about iGEM
• Other stuff you might want to know

How do I make my device?
• Lots of ways to do it, but there are two I would specifically recommend:
  • 3A Assembly (similar to Standard Assembly)
    • Based on traditional cloning methods
    • If a part is designed to work with these methods, it is said to be a “Biobrick”
  • Gibson Assembly
    • Developed recently
    • Rapidly becoming one of the methods of choice
• The next few slides will give you an idea of both methods, but first let’s talk about some of the tools you will be using
  • Enzymes
  • Plasmids

Restriction Enzymes
• Also called restriction endonucleases
• Do the “cutting” in “cut & paste”:
  • Recognize specific DNA sequences (recognition sites) and cleave DNA at specific locations
  • Can cut straight across both DNA strands (blunt ends) or make a staggered cut with overhangs (sticky ends)
• Named for organism where found
  • Example: in EcoRI, Eco = from E. coli, R = strain reference, and I = first enzyme isolated from that strain

Other Enzymes
• Ligase
  • Responsible for the “pasting” in “cut & paste”
  • Bonds 2 complementary pieces of DNA
• DNase, RNase: break down DNA/RNA
• CIP (Calf Intestinal Phosphatase): prevents self-ligation by removing phosphate groups from sticky ends
• Polymerase: constructs DNA/RNA

Plasmids
• Most molecular biology work is done with plasmids!
• Occur naturally in bacteria
• We use specially designed plasmids
• Plasmids are circular DNA containing the following:
  • Origin of replication (ori)
  • Antibiotic resistance (or other selection marker)
    • The antibiotic kills off nearly all bacteria that don’t contain your plasmid
    • Alternatively, can use a positive selection marker (GFP, blue/white screening with X-gal) to label the colonies that do have your plasmid
  • Your parts and devices!!
    • Usually contain a region known as a “polylinker” or multiple cloning site (MCS): lots of restriction sites in a row where you can cut with restriction enzymes and insert your parts
• Can be high- or low-copy number (# copies maintained in the cell at any one time)
  • High-copy used for cloning (construction) (ex: pUC)
  • Low-copy used for expression (transcription/translation) (ex: pET vectors and T7 expression vectors)

Traditional cloning methods
• Quite literally, involves cutting and pasting DNA fragments
• Lots more jargon:
  • Miniprep = small-scale DNA purification
    • 1-20 mL of bacterial culture
    • Larger scales called “midiprep”, “maxiprep”, and “megaprep”
  • Digestion = “cutting” DNA with restriction enzymes
  • Ligation = “pasting” DNA pieces together
  • Transformation = insertion of a plasmid into a host
  • Competent = ready to be transformed
  • Culturing = growing organism in the lab
  • Run a gel = separate DNA fragments by size, usually in agarose gel
    • Properly, “agarose gel electrophoresis”
  • Expression = when the bacteria are actively making the protein from the gene of interest

Lab work flowchart for traditional cloning
[Flowchart; boxes listed in reading order]
• Obtain component fragments: Isolate Plasmid DNA (Miniprep) → Copy or Cut Out Fragment (PCR or Digestion) → Separate Fragments (Electrophoresis) → Purify Fragment (Gel Purification)
• Combine fragments (ligate) + Insert into E. coli (transform)
• Check Plasmid Accuracy: Isolate Plasmid DNA (Miniprep) → Cut into Fragments (Digestion) → Separate Fragments (Electrophoresis); DNA Sequencing
• Unfinished or Incorrect Plasmids: go back and repeat
• Finished Plasmids: Transform into expression host → Express protein, purify, test properties

iGEM/Biobrick Assembly Methods
• Use specially-designed polylinker sequences that make it easy to piece parts together in whatever order you want
  • http://partsregistry.org/Help:Standards/Assembly#Registry_Supported_Assembly_Standards
• They are idempotent: newly composed parts will automatically adhere to the standard without any further manipulation
• Specific prefixes and suffixes are added to each part so they work with each other and with standard plasmid backbones
• There are several different polylinker sequences available
  • Known as “RFCs”
  • RFC10 = the BioBrick standard….not optimal!
  • RFC23 = the Silver standard (like the BioBrick standard but better)
  • I recommend using RFC23 or RFC25
• Many of them are compatible with the Standard and 3A Assembly Methods…..but not all!
  • Check each part you want to use to make sure it’s compatible!!
• To learn more: http://partsregistry.org/Help:Standards

Standard Assembly
• The original BioBrick Assembly Method
• Puts parts together one-by-one
• Very easy, but can be time-consuming
• Also unreliable; therefore, this method is no longer recommended by iGEM or the Registry
• Uses 4 enzymes: EcoRI, XbaI, SpeI, and PstI
• Overhanging ends for XbaI (X) and SpeI (S) are compatible, but when they are ligated together, they form a new site that can’t be cut by either enzyme (ablated site)
  • This site is known as a “scar”, since it is a region of DNA that doesn’t code for anything and can’t be gotten rid of

3A Assembly
• 3A = 3 antibiotic
• Advantages:
  • 97% colonies correct
  • No gel purification
  • No PCR
  • Easy
  • Best method if only combining 2 parts
• Uses same restriction sites as Standard Assembly…just slightly different strategy
• Compatible with all the same RFCs as Standard Assembly….just requires more antibiotics and lots of different plasmid backbones (available in the Registry)
• Also produces a scar site
• This is probably the best method to use if you don’t want to do any PCR or if you want to put together a long string of parts
• Protocol: http://partsregistry.org/Help:Protocol/3A_Assembly

Gibson Assembly
• Developed recently, but becoming quite popular due to these advantages:
  • Scarless (good for fusion proteins and RBS placement)
  • DNA can come from any source (blunt-end fragments, genomes, biobricks, etc)
  • Great for putting multiple (>2) parts together at the same time
  • Sequences don’t have to follow any particular standard (RFC)
• Disadvantages:
  • Need custom primers for PCR; these must be ordered in advance (and can be expensive)
  • Will not work for repetitive sequences!
• Commercial versions are available: NEB sells a Gibson Assembly master mix, and IDT sells synthesized “gBlocks” fragments that can be assembled this way

Gibson Assembly
• How it works:
  • PCR out genes with specialized primers
    • Adjoining DNA fragments should have a 20 bp overlap
  • Obtain a series of fragments with overlapping sequence homology
  • Purified amplified DNA fragments are then mixed with a “Gibson Master Mix” consisting of T5 exonuclease, Phusion polymerase, and Taq ligase
    • T5 exonuclease chews back on the 5’ end
    • The overlapping ends are then annealed together (the newly-added homologous regions line up)
    • Finally, the fragments are repaired and fused with Phusion polymerase and Taq ligase
• Puts all parts together simultaneously without any scars!!!
• Great website: http://synbio.org.uk/dna-assembly/guidetogibsonassembly.html

Synthesized Parts
• Another way to get scarless fusion of all your parts (and without much work) is to get them synthesized by a company!
• Can be rather expensive (~$400 per kb)
• Need to make sure that you can clone it into standard iGEM plasmids afterwards
  • Otherwise, will not qualify as a “part” for the iGEM competition (and it won’t be much use to anyone else who wants to use it)
  • Add restriction sites to either side of your part/device so that it works with standard RFCs/assembly methods

Where do I get my parts? The Registry!
• http://partsregistry.org/Main_Page
• Many of them don’t actually work….
  • Be careful to use only parts with a Registry star and/or a green “W”
• Good source for basic stuff including: RBS, terminators, GFP, characterization devices, logic devices, some promoters
• Even if the part is listed in the Registry, it may not have been included in the iGEM Repository…
  • To check, go to partsregistry.org, click on “DNA Repositories”, and then type your part number into the top blue box (“Part list”). This will search the repositories and tell you where you might find it
  • Other option: on the part’s main page, click “Get This Part” in the upper right hand corner. This will take you to a page with the availability and location in current and past repositories, as well as quality control information
• Sometimes, the Registry can send you a part, even if it’s not in the repository: http://partsregistry.org/DNA_Requests

If it’s not in the Registry, what do I do?
You have 2 options:
1. Get it straight from the source: PCR out of a natural genome
  • Organism may not be readily available….
  • Organism genome may not be compatible with your host organism genome (e.g., a part from a plant may not work if placed into E. coli because of different codon biases)
  • Need to add correct prefix and suffix so it will work with Registry standards (RFCs)
2. Order it (synthesize it) from IDT (iGEM partner) or GenScript
  • You will need the DNA sequence
  • Natural DNA sequences can be found from many sources:
    • Academic papers (search through Web of Science Database available through Purdue Libraries)
    • GenBank: http://www.ncbi.nlm.nih.gov/genbank/
    • BRENDA (enzyme database): http://www.brenda-enzymes.info/

If using a natural or synthesized sequence….
….be sure to optimize the codon sequences to work in your chassis!! (although don’t change the prefix and suffix; they must still work with the assembly methods!!!)
Codon Optimization protocol:
1. Go to http://genomes.urv.cat/OPTIMIZER/
2. Insert the DNA sequence you would like to optimize.
3. Select a) to use the OPTIMIZER data for codon frequency.
4. Select the organism in which you will be expressing (most likely E. coli K12)
5. Choose Codon Usage (HEG) for highly expressed genes.
6. Choose the genetic code you would like to use (Eubacterial)
7. Use guided random for the method.
8. Click next and a report will be created with the optimized sequence.
Once you have your optimized sequence, be sure to check that you didn’t accidentally add any extraneous restriction sites that might interfere with your assembly method!!!!

In silico cloning
• BEFORE you ever work in the lab, it is imperative to make sure that everything will work!!
• Do this with computer programs that will simulate assembly procedures
  • DNA2.0 (iGEM is a partner)
    • https://www.dna20.com/genedesigner2/
  • Geneious (this is what my lab uses….I really really like it)
  • pDRAW32 (never used this…I found it online…I think it’s free)
  • VectorNTI (this is the industrial-level software….what Dow Agro uses… it’s really expensive)

Another thing to check: Rare codons
• Besides checking that there aren’t any restriction sites within your sequence (none that will be used in assembly, anyway), you also need to check for rare codons
• Rare codons are codons that aren’t used very often by your chassis, so there aren’t very many tRNAs corresponding to them
  • Therefore, a rare codon can either slow translation or even cause it to stop (especially if there are multiple rare codons)
• Even though you “optimized” your coding sequence, the assembly process still has the potential to introduce rare codons into the overall sequence
  • Therefore, you need to use in silico cloning to check for them in the fully assembled device
• To check for rare codons (E. coli only), use this tool: http://nihserver.mbi.ucla.edu/RACC/

For more information…
• Browse the Registry and iGEM websites
  • All this information and more is there…but it can be hard to find
• Read the following paper (in the Dropbox under Background Info): “Engineering BioBrick vectors from BioBrick parts” by Reshma P Shetty, Drew Endy, and Thomas F Knight Jr, Journal of Biological Engineering, 2:5, 2008.

Outline
• Introduction (definitions and history)
• Fundamentals of molecular biology
• Assembly methods
• Part characterization
• Modeling
• Design theory
• More about iGEM
• Other stuff you might want to know

I’ve made my device….now what?
• You need to characterize it!!
  • http://partsregistry.org/Characterization_of_Parts
• Usually, characterization methods use fluorescent tags (GFP = green fluorescent protein, RFP = red fluorescent protein, etc)
  • As part is made/used, the cells fluoresce
  • Measure the fluorescence levels and quantify!
• A standardized unit for measuring transcription is “PoPS”, or Polymerases Per Second
  • http://partsregistry.org/PoPS
  • Defined as the number of times that RNA polymerase passes a specific site on DNA per unit of time
  • Used by inserting the test gene into the same regulatory region as a fluorescent protein coding region, then measuring fluorescence

More on characterization
• To learn more about characterization methods, read the following paper (in the Dropbox under Background Info):
  • “Refinement and standardization of synthetic biological parts and devices” by Barry Canton, Anna Labno, and Drew Endy, Nature Biotechnology, 26:7, 2008.
• The paper also contains a sample datasheet showing what kind of data you need to collect for proper characterization

Outline
• Introduction (definitions and history)
• Fundamentals of molecular biology
• Assembly methods
• Part characterization
• Modeling
• Design theory
• More about iGEM
• Other stuff you might want to know

Modeling
• We at Purdue think it’s important to not just make the parts and devices, but have a good understanding of how they work
• Therefore, it is important that you model your system!
• There are 2 main ways you can do this:
  • Graphical software: TinkerCell, CellDesigner
  • Mathematical method: MATLAB (Mathworks is a partner with iGEM)
• No matter which way you pick, you’ll still need to have a basic understanding of the math!!!
  • It’s not as bad as it sounds, I promise!
• Examples of iGEM projects that use good models (there aren’t very many):
  • Purdue 2009 (I did this one! So yes, I’m a little biased)
  • British Columbia 2011 (won Best Model at iGEM Americas)
    • Good documentation at: http://2011.igem.org/Team:British_Columbia/Model1

Intro to the maths
• It all boils down to solving a set of differential equations
• The differential equations are derived from knowledge of how the system works
  • First, draw a pathway of what occurs physically in your system
  • Then fill in the defining equations
    • Most enzymes work via well-defined kinetics, e.g. Michaelis-Menten
    • Other phenomena can usually be estimated by zero, 1st, or 2nd order reactions (just like in freshman chemistry!)
    • If necessary, account for cell growth with the logistic model of cell growth
• Once you have your set of differential equations, you will have lots of unknown parameters you have to supply
  • You can sometimes find these in literature
  • For enzymes, the BRENDA database has a LOT of kinetic constants
  • Otherwise, you can estimate them based on physiologically relevant values (find these in academic papers, etc)
• Good animation for introduction to enzyme kinetics: http://www.wiley.com/college/pratt/0471393878/student/animations/enzyme_kinetics/
• Also, the Wikipedia article is quite good: http://en.wikipedia.org/wiki/Enzyme_kinetics

Outline
• Introduction (definitions and history)
• Fundamentals of molecular biology
• Assembly methods
• Part characterization
• Modeling
• Design theory
• More about iGEM
• Other stuff you might want to know

Design theory and strategies
As we talked about in IT227, there are some things you want to keep in mind when designing your system:
1. You want to make it have “sections” which can be tested individually
  a. This is not unlike computer programming, where you want to divide your code up so you can test individual functions
  b. Later, you can put all the pieces together once you know they all work alone
2. There are many ways in which a genetic circuit can function … you might want to think of alternatives when you start designing your device (for example, http://openwetware.org/wiki/BioBuilding:_Synthetic_Biology_for_Students:_Lab_1)
3. Once you have your device put together, you might want to optimize it to work better
  a. Before you put everything together, think about how you might do this
  b. Optimization can be done in tandem with modeling: once you have your basic system, you can model it. Then change some parameters in your model to see how you can make the output better! Translate that into your device and repeat.
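
The model-then-tweak loop described above can be sketched in a few lines of code. This is only a rough illustration, not anyone's actual project model: it assumes a hypothetical device in which a promoter drives an enzyme that converts a substrate into a product. mRNA is made at a zero-order rate and degrades at a first-order rate, the enzyme is translated from the mRNA, and the conversion follows Michaelis-Menten kinetics. Every name (`simulate`, `sweep_promoter`, `BASE`) and every parameter value is a made-up placeholder; real values would have to come from the literature, BRENDA, or your own measurements.

```python
# Illustrative sketch only: the device, the function names, and all parameter
# values are hypothetical placeholders, not measured quantities.

def derivatives(state, params):
    """Right-hand side of the ODEs for state = (mRNA, enzyme, substrate, product)."""
    m, e, s, p = state
    # Michaelis-Menten rate of substrate -> product conversion by the enzyme
    rate = params["k_cat"] * e * s / (params["K_m"] + s)
    dm = params["k_tx"] - params["d_m"] * m       # zero-order synthesis, 1st-order decay
    de = params["k_tl"] * m - params["d_e"] * e   # translation, 1st-order decay
    return (dm, de, -rate, rate)


def rk4_step(state, params, dt):
    """Advance the state by one classical 4th-order Runge-Kutta step."""
    def shifted(k, scale):
        return tuple(x + scale * kx for x, kx in zip(state, k))

    k1 = derivatives(state, params)
    k2 = derivatives(shifted(k1, dt / 2), params)
    k3 = derivatives(shifted(k2, dt / 2), params)
    k4 = derivatives(shifted(k3, dt), params)
    return tuple(
        x + dt / 6 * (a + 2 * b + 2 * c + d)
        for x, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def simulate(params, state0=(0.0, 0.0, 100.0, 0.0), t_end=600.0, dt=0.1):
    """Integrate from t = 0 to t_end and return the final state."""
    state = state0
    for _ in range(int(round(t_end / dt))):
        state = rk4_step(state, params, dt)
    return state


# Placeholder parameters (arbitrary units, time in minutes)
BASE = {"k_tx": 0.5, "d_m": 0.2, "k_tl": 2.0, "d_e": 0.02,
        "k_cat": 0.001, "K_m": 10.0}


def sweep_promoter(strengths, t_end=60.0):
    """Vary the transcription rate (promoter strength) and report final product."""
    results = {}
    for k_tx in strengths:
        params = dict(BASE, k_tx=k_tx)
        results[k_tx] = simulate(params, t_end=t_end)[3]
    return results


if __name__ == "__main__":
    for k_tx, product in sweep_promoter([0.1, 0.5, 2.0]).items():
        print(f"k_tx = {k_tx}: product after 60 min = {product:.2f}")
```

The integrator is written in plain Python so the sketch runs without MATLAB or any extra packages; the same equations could just as well be handed to MATLAB’s `ode45`. Swapping `k_tx` for another parameter in `sweep_promoter` (for example `k_tl` to mimic a different RBS) gives the same kind of what-if comparison.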
Outline
• Introduction (definitions and history)
• Fundamentals of molecular biology
• Assembly methods
• Part characterization
• Modeling
• Design theory
• More about iGEM
• Other stuff you might want to know

iGEM!
• So, you’re on an iGEM team. What exactly does that mean?
• The best way to find out is to look at lots of older projects
  • Try to look mostly at the winning projects….(although the rest can often give you good insight as to what to avoid in your own project)
• Goals of iGEM:
  • Learn about synbio!
  • Spread the word to the public
  • Enforce safety and ethical practices regarding recombinant DNA
  • Stimulate progress in synbio
    • Several iGEM teams have actually published their results!
  • Provide a venue for undergraduates to conduct their own creative research projects

How iGEM works
• The competition (Jamboree) is in the fall (Oct/Nov)
• The rest of the year (primarily in summer), teams from around the world develop a new and creative project
• In fall, teams attend regional competitions (new in 2011)
  • Each team is graded on certain criteria (more later)
  • If a team satisfies all of the criteria, a gold medal is awarded
  • You can also achieve a silver or a bronze medal
  • Other awards are also given, including Best Model, Best New Biobrick, etc.
• The best teams from each regional move on to the World competition at MIT where they compete for the Golden BioBrick!

Getting a gold medal
• More info: http://2012.igem.org/Judging
• In summary, you will need to do the following:
  • Register your team (already done!)
  • Complete the judging form (when you get closer to the Jamboree)
  • Develop a Team Wiki
    • Includes a project description, modeling information, a lab notebook, information about team members, safety, and anything else you want people to see before the Jamboree
    • This is your first impression to the judges, so make it thorough!
  • Present a poster and a talk at the Jamboree
  • Submit at least one new BioBrick part or device before the deadline
    • You can also *extensively* document an older part used for a new application instead
    • The new part *must* work with existing RFCs (i.e., have the correct prefix and suffix and be in an iGEM plasmid)
  • Show that your new part/device works as expected
  • Characterize operation of at least one new BioBrick part or device and enter information into the Registry
  • Plus one of the following:
    • Improve an existing part and document it in the Registry
    • Significantly help another iGEM team (debug their part, model their system, etc)
    • Outline a **new** approach to an issue in Human Practice in synbio (don’t just do a survey!! The judges have seen zillions of them!)

Things to note
• It is VERY difficult to successfully complete an iGEM project…mostly due to time and cost constraints
• Therefore, do NOT waste time!
  • The first few weeks (when you nail down your design) are CRUCIAL, and they can easily get you behind schedule
  • It takes a lot more reading and searching than you probably think
• Also, lots of other teams have lots more funding, dedicated grad students/post docs, and/or courses that focus on developing the summer’s iGEM project (e.g., MIT)
  • You have to compete against them, even though you may not have the same resources
  • European teams especially have started growing in strength: schools there are putting a lot of focus and time into developing a good iGEM team

Outline
• Introduction (definitions and history)
• Fundamentals of molecular biology
• Assembly methods
• Part characterization
• Modeling
• Design theory
• More about iGEM
• Other stuff you might want to know

Other useful resources
• When doing a literature review, the best place to find academic articles is the Web of Science database
  • Yes, it’s better than Google Scholar
• Access through Purdue Libraries:
  • Go to http://lib.purdue.edu and click on the orange Databases tab
  • Find “Web of Science” in the drop-down menu
  • Use the search tools to find what you want
  • Each article will have a link to the full text (button image not shown)
  • Click it, and it will take you to a page with links to the pdf
  • If it’s not available on the internet, Purdue Libraries can do an Interlibrary Loan: they find it at another school, scan it, and send it to you
• For the basics behind all the lab protocols, a good resource is Wiley Current Protocols in Molecular Biology
  • Access via same Database list (“Current Protocols series”)
  • Note: these detail the “traditional” protocols….more often than not, people nowadays use kits to do many of these techniques
  • However, these will give you a good idea of how the kits work
• Molecular Cloning: A Laboratory Manual by T. Maniatis
  • This is basically the bible of molecular biology
• Introduction to Genetic Engineering by W. Sofer
  • Good intro and history section for how molecular biology developed (and the basic terminology)

That’s all, folks!
• Or rather, that’s all I can think of right now.
• If you have any other questions, don’t hesitate to ask. If I can’t answer them, we can find someone who can.
• I really want to see Purdue succeed this year…I think we have a good shot at a Gold medal, as long as we stay on track and don’t get lost in the details. I will do my best to make sure this happens!