Molecular Replacement
Ronan Keegan

Molecular Replacement and the Phase problem
• Molecular replacement is the process of solving the phase problem for an unknown structure by placing the atomic model of a related, known structure in the unit cell of the unknown structure in such a way as to best reproduce the observed structure factors.
• The known model, once placed, may be used to calculate phases which, in combination with the observed structure factors for the unknown structure, allow the model to be rebuilt and refined.
• The calculation involves a 6-dimensional search over all possible orientations and translations of the known model in the unit cell of the unknown structure.
• This calculation is generally too time-consuming to perform in full, so it is usually split into two parts:
  – A 3-dimensional search over all possible orientations to determine the orientation of the model.
  – A 3-dimensional search over all possible translations to determine the position of the oriented model.

Solving structures using Molecular Replacement
• In 2006, over 67% of structures deposited in the PDB were solved by MR

Before you start
• Number of molecules to search for
  – Matthews_coef: given the molecular weight of the target and the dimensions of the target cell, it returns a set of probabilities for the number of molecules in the asymmetric unit
• Check for twinning
  – Ctruncate or Sfcheck can be used to examine the data and assess the likelihood that it is twinned
• Is there NCS?
  – helps with refinement
  – use self rotation function

Finding a search model
• Use online resources such as the EBI or OCA services to run FASTA sequence-matching searches
  – http://www.ebi.ac.uk/Tools/fasta33/
  – http://oca.weizmann.ac.il/oca-bin/ocamain
• Use locally held related PDB models if available
• Use secondary structure matching based on the best-scoring models from the sequence-based search
  – http://www.ebi.ac.uk/msd-srv/ssm/
• Look for domain components (SCOP) or multimeric forms of search models (PISA)

What makes a good search model
• General rule of thumb: the sequence identity of the homologue to the target must be > 30% for the process to work
• Where sequence identity is low it is important to get as good a sequence alignment as possible
  – Use multiple alignment, pulling in many related sequences, rather than pair-wise alignment
  – Profile-fitting alignment, e.g. Blast

Preparing your search model
• Signal-to-noise problem
  – Anything in your model that is not likely to be in your target structure needs to be removed, as it will only contribute to the background noise
  – Prune back side chains that aren’t aligned
  – Cut out flexible loops
  – Cut out waters
• Various programs in CCP4 help to do this
  – Chainsaw: prunes side chains based on a given alignment
  – Molrep: creates its own alignment and prunes side chains accordingly
  – Cutting loops: look at B-factors; if they are above an acceptable threshold, cut out those residues using PDBCur
  – Removing waters and other small molecules: use PDBset

CCP4 and MR
• Molecular Replacement programs:
  – Molrep
  – Phaser
  – Amore
• Automated MR:
  – Balbes
  – MrBUMP
• Helper Applications:
  – Matthews_coef, Chainsaw, Pdbcur, Pdbset, Coordformat, Superpose, PISA

Molrep
• Molrep is a program for automated molecular replacement in the case where a homologous structure has already been identified.
• The program will attempt to find the number of molecules expected in the asymmetric unit as entered by the user.
• A PDB file for the best solution is output.
• Additional options
  – Self rotation function
  – Search for model in a map
  – Alignment only
• Can perform individual steps in more difficult cases

Molrep Output
• Look at the output log file
  – Examine RF scores
  – Examine TF scores
  – Look at the Contrast score
• Check to see if the number of molecules asked for has been found
• Output PDB file will contain the best positioned model

[Figure: annotated Molrep log output. Cross Rotation Function panel: list of top RF peaks, Euler angles (CCP4), polar angles. Translation Function panel: list of top solutions, polar angles, fractional translation, R factor, score, contrast of solution.]

Phaser
• Based on the use of maximum likelihood methods to find a solution for the phases
• Likelihood measures the agreement of the model with the data by using probabilities
• Allows use of Ensembles as search models in MR, improving the chances of finding the correct position for the template

Running Phaser
• CCP4i GUI
• Highly automated
  – input SFs in MTZ format, template search model(s) and sequence information
• Can perform search over all possible alternative space groups
• Specify each component of Ensemble
• Specify composition of asymmetric unit
  – number of molecules and searches for heterogeneous complexes

Phaser Output
• Z-scores for rotation and translation functions
  – Translation score > 5 = good solution
  – < 5 and > 3 = potential to be a good solution
  – < 3 = poor solution
• LLG scores
  – log likelihood gain score
  – higher value = greater confidence in result
• Top positioned PDB file output along with corresponding MTZ including phase columns

Balbes
• Automated MR from model selection through to initial refinement
• Balbes has its own cut-down version of the PDB to use for searching for potential search models for a target.

Accessing Balbes
• Balbes can be accessed via the YSBL web service page at York University.
• Create an account and upload your MTZ file and sequence information
• It is also possible to run Balbes locally via CCP4i. This requires installation of the Balbes database on the local machine
http://www.ysbl.york.ac.uk/YSBLPrograms/index.jsp

Web Server Results
• Summary of processing for each spacegroup
• Final best result highlighted
• For each spacegroup, the log file and all output files are made available for download (5 days)
• If the user opted to use the ARP/wARP server, a link to the ARP/wARP results is provided

Balbes Output
• Spacegroup-specific output:
  – Download files
  – Main summary file showing results of MR and refinement for each template model that was used.
  – Q value scoring

MrBUMP
• An automation framework for doing Molecular Replacement
• Brute-force approach that puts a particular emphasis on generating a variety of search models
• It can use both Molrep and Phaser for MR
• Uses a variety of helper applications to find and prepare search models (FASTA, Clustalw, Chainsaw)
• To source search models it accesses various online databases such as the PDB and the SCOP database.
• In favourable cases it will give a one-button solution
• In complicated cases it will suggest likely search models for further manual investigation (lead generation)

The MrBUMP Pipeline
[Figure: MrBUMP pipeline flowchart with boxes: Target MTZ & Sequence; Process Target Details; Template Model Search (N templates); Model Preparation (N x M models); Molecular Replacement; Refinement; Phase Improvement; Check scores and exit or select the next model]

Search for and preparation of templates
• Search step automatically performs a FASTA search, multiple alignment of the results, and searches for domains and multimers based on the hits from the FASTA search
• Best templates are then prepared using several methods:
  – Molrep model alignment and side chain pruning
  – Chainsaw side chain pruning
  – Polyalanine model
• An Ensemble model is also created for Phaser
[Figure labels: Domain 1, Domain 2, Ensemble]

MrBUMP CCP4i interface
• Select mode of operation
• Input files and column label details
• Template search options
• Search model preparation options
• Molecular Replacement and Refinement options
• Additional options

[Figure: summarised results screen, showing the best search model so far with the file location for this model, and the list of sorted results so far]

MrBUMP Output
• Log file gives summary of models tried and results of MR
  – May get several putative solutions
  – Ease of subsequent model re-building and model completion may depend on choice of solution
  – Worth checking “poor” solutions
• Top solution available from ccp4i

Summary
• Molecular replacement is the process of retrieving the phase information for a target structure using a related, known structure
• CCP4 provides several programs and helper tools to perform MR
• If you are having difficulty try them all!
• If no search model is obvious use Balbes or MrBUMP to do the work for you
• Before the experiment, use the MrBUMP or Balbes “Model Search” modes to check whether good models are available

Tutorial Material
• Tutorial Document:
  – CCP4-Workshop\Tutorial\MR\APS-MrBUMP-tutorial-2010.pdf
• Tutorial Data:
  – $CCP4\examples\mr_tutorial\2006
• MR and MrBUMP
  – Step-by-step guide to doing MR using Molrep and Phaser
  – Automated using MrBUMP
• BALBES:
  – YSBL web server: http://www.ysbl.york.ac.uk/YSBLPrograms/index.jsp
  – Create an account
  – Upload MTZ and sequence

Acknowledgements
• Alexei Vagin, Garib Murshudov and Fei Long
• Randy Read, Gabor Bunkoczi, Airlie McCoy
• Martyn Winn, Norman Stein
• CCP4 group
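
Worked example for the "Before you start" slide
The number of molecules in the asymmetric unit can be sanity-checked by hand before running Matthews_coef. The sketch below computes the Matthews coefficient V_M = V / (Z x MW) and the usual solvent-content estimate 1 - 1.23 / V_M for each candidate copy number. It does not reproduce the probabilities that Matthews_coef reports. The function names, the example cell, the four symmetry operators and the 25 kDa molecular weight are all illustrative assumptions, not values from the tutorial data.

```python
import math


def cell_volume(a, b, c, alpha, beta, gamma):
    """Unit cell volume in cubic Angstroms from lengths (A) and angles (degrees)."""
    ca, cb, cg = (math.cos(math.radians(x)) for x in (alpha, beta, gamma))
    return a * b * c * math.sqrt(1 - ca * ca - cb * cb - cg * cg + 2 * ca * cb * cg)


def matthews_table(volume, n_symops, mol_weight, max_copies=8):
    """List (copies per ASU, V_M, solvent fraction) for physically possible copy numbers.

    volume     : unit cell volume in cubic Angstroms
    n_symops   : number of symmetry operators of the space group
    mol_weight : molecular weight of one copy in Daltons
    """
    rows = []
    for n in range(1, max_copies + 1):
        # Matthews coefficient: cell volume per Dalton of protein
        vm = volume / (n_symops * n * mol_weight)
        # Standard approximation, assuming a typical protein density
        solvent = 1.0 - 1.23 / vm
        if solvent <= 0:
            break  # more protein than the cell can hold
        rows.append((n, vm, solvent))
    return rows


if __name__ == "__main__":
    # Hypothetical orthorhombic cell with 4 symmetry operators and a 25 kDa target
    volume = cell_volume(60.0, 70.0, 100.0, 90.0, 90.0, 90.0)
    for n, vm, solvent in matthews_table(volume, n_symops=4, mol_weight=25000.0):
        print(f"{n} copies: V_M = {vm:.2f} A^3/Da, solvent = {100 * solvent:.0f}%")
```

Copy numbers giving a solvent content near the middle of the range seen for protein crystals (roughly 50%) are the ones worth searching for first; Matthews_coef turns this judgement into explicit probabilities.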