DNA Dev Notes – Relaxing Indexing & Other Pointless things

Gleb, Graeme, Harry @ day 2 of DNA dev meeting

Decided that the best thing to do is probably to have two runs of pointless: the first with the 2 × 3° of data collected for cell refinement, and the second with the first 15 degrees or so of the data actually collected.

Run #1: 2 × 3° of data

Input: pointer to an image, e.g. /tmp/data/usr24323/postref-bert1_1_001.img

Will perform a triclinic index, integrate the data and feed the results into pointless. Will then return the point group (expressed as a space group).

Alert the user if the “correct” point group has lower symmetry than the indexed lattice, and predict the completeness/multiplicity of the resulting data set. This is essentially a detailed message for the user: TAKE NO ACTION.

Need to include with this an assessment of the reliability of the result, e.g. “We can say with 75% confidence that the point group is probably correct.”

Actions:

1. HRP: Verify reliability from “real” DNA examples (will involve liaising with ESRF to get data).
2. GW: Verify reliability against JCSG data.
3. HRP: Add “A” matrix output for each solution from indexing, to allow the reference images to be reintegrated and BEST run, to provide more informative output.
4. GW: Implement a lightweight strategy prediction for this (using BEST, already done).
5. OS: Ensure that the information from the collect request can be made available to the scheduler to enable the above calculations.
6. GW: Assess the likely time cost of this.

Discussion & Comments

After thinking about this for a while, we came to the conclusion that the best thing to do would really be to collect this data, then WAIT for the results, then compute the strategy, since by then you will have the refined mosaic spread and a much better idea of the correct point group to collect in. This will also help the post-refinement and later integration steps (e.g. stop them from breaking).
To ensure that this works, it is critical that the input parameters in the image header are correct. If they are not, the reliability is likely to be substantially diminished.

Run #2: 1 × 15° of data

Once the proper data collection has started, it should be trivial to pass the processed reflection files into the same system as above and get a much more reliable estimate of the point group (GW has tested this and found around 15° to be reliable). This could then provide another warning, this time with a little more force!

Actions:

1. GW: Implement this (lower priority than the above, though).
2. GW: Assess time costs.

Relaxing the Indexing Criteria

The indexing criteria in DNA are currently too strict, and in many cases reject the correct solution. The quality of the statistics used (rmsd, number of rejected reflections) is strongly correlated with the accuracy of the distance and beam position. Assuming that the input parameters are correct, the following limits are reasonable:

  RMS φ ≈ 0.5 Δφ
  RMS x ≈ 0.5 pixel

If the values are higher than this, warn the user that the numbers may be a little out. If they are 5 (five) times these limits, abort, since something is almost certainly wrong (most probably the centre/distance/wavelength). Abort, therefore, only when the solution is very bad or when the images are blank.
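
The warn/abort rule above could be sketched roughly as follows. This is only an illustration of the thresholds in these notes, not the DNA implementation: the function name assess_indexing, its parameter names and the returned strings are all hypothetical, and the blank-image check is not modelled here.

```python
def assess_indexing(rms_phi, rms_x, delta_phi,
                    phi_fraction=0.5, x_limit=0.5, abort_factor=5.0):
    """Classify an indexing solution as 'accept', 'warn' or 'abort'.

    rms_phi   -- RMS error in phi (degrees)
    rms_x     -- RMS positional error (pixels)
    delta_phi -- oscillation width per image (degrees)

    All names and the return values are hypothetical; only the numeric
    limits (0.5 * delta_phi, 0.5 pixel, factor of 5) come from the notes.
    """
    if delta_phi <= 0:
        raise ValueError("delta_phi must be positive")

    phi_limit = phi_fraction * delta_phi

    # Express each statistic as a multiple of its limit and keep the worst.
    worst = max(rms_phi / phi_limit, rms_x / x_limit)

    if worst >= abort_factor:
        # Five times the limit: something is almost certainly wrong
        # (most probably the centre/distance/wavelength).
        return "abort"
    if worst > 1.0:
        # Above the limit but usable: warn that the numbers may be a little out.
        return "warn"
    return "accept"


if __name__ == "__main__":
    print(assess_indexing(rms_phi=0.2, rms_x=0.3, delta_phi=1.0))
    print(assess_indexing(rms_phi=0.6, rms_x=0.3, delta_phi=1.0))
    print(assess_indexing(rms_phi=0.2, rms_x=2.5, delta_phi=1.0))
```

Taking the worse of the two ratios means a single bad statistic is enough to trigger a warning or an abort; whether the two should instead be combined in some other way is left open in the notes.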