The layout of each task window, i.e. the number of folders present and whether these folders are open or closed by default, depends on the choices made in the Protocol folder of the task (see Introduction). Although certain folders are closed by default, there are specific reasons why you should or may want to look at them. These reasons are described in the Task Window Layout sections below.

Merge Datasets (CAD)
See documentation in Reflection Data Utilities module.

Scale Datasets - SCALEIT and FHSCAL
For the scaling of derivative to native datasets, two CCP4 programs are available: SCALEIT and FHSCAL. The tutorial on isomorphous replacement by I. Tickle describes the strengths and weaknesses of those programs. Note that there is no unique solution to the problem of scaling together two different datasets. Various problems can arise from:
Scale Datasets with Anomalous Dispersion Data
The Scale Datasets task will run SCALEIT to scale together all the DPHn (the dispersive difference for the nth wavelength). It will optionally do a cross-comparison of the anomalous data sets. This involves rerunning SCALEIT with the input:

    LABIN FP=FPHn SIGFP=SIGFPHn FPH1=FPHm SIGFPH1=SIGFPHm DPH1=DPHm SIGDPH1=SIGDPHm

for all possible pairwise combinations of wavelengths n and m. From these runs, the cross-comparison Rfactor and normal probability for the acentric data are extracted.

Optionally, the task can also analyse dispersive differences by rerunning SCALEIT with the input:

    LABIN FP=FPH(+)n SIGFP=SIGFPH(+)n FPH1=FPH(-)n SIGFPH1=SIGFPH(-)n

From this analysis, the normal probabilities for the acentric and centric data and the Rfactor are extracted. The input MTZ file must contain the FPH(+)n and FPH(-)n. If you do not have data in this form, you can run the mtzMADmod program, which converts DPHn to the appropriate form. This program is not interfaced. A better solution is to use the latest version of the TRUNCATE program, which retains the FPH(+)n and FPH(-)n on output.

The results of both these analyses are tabulated in a summary file called project_jobid_scaleit.summary.

Scale Datasets - Task Window Layout
In the Protocol folder of the Scale Datasets task, you can choose:
Features to look out for in the Scale Datasets Task are:
See program documentation: SCALEIT, FHSCAL.

Solution Files

Heavy Atom (.ha) Files
Heavy atom (HA) files are short files which keep a record of the proposed heavy atom sites in a structure. They are analogous to the MR files of the Molecular Replacement module. The format of the file is similar to the ATOM input line for the MLPHARE heavy atom refinement program. There is one line per atom site, and the line is free format, beginning with the word ATOM:

    ATOM atom_name x y z occupancy anomalous_occupancy BFAC B-factor

The interface to MLPHARE can use an HA file as input and HA files are output by:
HA files are generated with a default file name, project_jobid_n.ha, where n = 1, 2, 3, ... If you select an HA file from the menu under the View Files from Job button, it will be displayed in an HA file viewer which is similar to the MR file viewer and which has some simple functionality to edit the file. Picking a line in the file will put a # character at the beginning of the line, and this line will then be ignored on input to MLPHARE. A second pick will remove the # character. There is a Change All button at the bottom of the viewer which will add or remove #'s from all ATOM lines. There is also an Edit Columns button which presents options to set the atom name, occupancy, anomalous occupancy and Bfactor for all the atoms in the file.

Prepare Data for HA Search - Revise, Ecalc, MTZ2various
You will need to run this task for the following cases:
In the Prepare Data for HA Search task window you should only need to identify the type of your data and which phasing program you intend to run, and the interface will make the necessary conversions described below.

MAD data is rescaled by the Revise program to give an estimate of the normalised anomalous scattering magnitude (given the column label FM by Rantan but sometimes referred to as FA in the literature). The input data can be in the form of F(+) and F(-) for each wavelength, or anomalous differences Dano for each wavelength. The output FM can then be used in similar fashion to a single anomalous difference (Dano) or isomorphous difference (Diso). The theory behind this is described in the Revise program documentation.

Data conversion
Direct methods programs such as Shelx and Rantan usually work with normalised data rather than the structure factors which are normally used in macromolecular crystallography, so structure factor data must be converted to normalised structure amplitudes for use in direct methods programs. The Shelx program has an internal procedure to do this conversion, but data intended for the Rantan program must go through the Ecalc program, which calculates normalised structure amplitudes (usually given the column label E). Rantan and all other CCP4 programs work with experimental data in MTZ file format, but Shelx requires the data in an ASCII format described in the Shelx documentation. The Prepare Data for HA Search task will use Mtz2various to convert an MTZ file to Shelx format.

See program documentation: Revise, MTZ2various, Ecalc.

Acorn - ab initio Phasing
Acorn can be used for phasing when you have atomic resolution data and a suitable starting point, such as a known fragment (which in favourable cases can be a randomly-placed atom) or approximate phases. It can also be used to find heavy atom substructures at lower resolutions.

In the Protocol folder of the Acorn task, you can choose:
The program requires normalised structure factors. These can be supplied in the input MTZ file, or alternatively the task will run ECALC to generate them. The task may also require an input PDB file if a starting fragment is being supplied.

See program documentation: Acorn

SHELX - Heavy Atom Search
The SHELX program can be obtained from THE SHELX HOMEPAGE. The CCP4i interface is for Shelx-90. To ensure that CCP4i scripts can find the Shelx program, the full path name of the program needs to be entered in the Configure Interface window, which is accessed from a button on the right hand side of the main window.

For more information on the SHELX program, see THE SHELX HOMEPAGE. This has references to various FAQs: The Shelx Homepage, section 7, and Thomas Schneider's FAQs. It also has a document on Macromolecular applications of SHELX, written for International Tables Vol. F.

RANTAN - Direct Methods
The RANTAN Direct Methods program can be applied to solving MAD data or isomorphous replacement data. The Interface will set the key input parameters appropriately for the type of data.

For isomorphous data, RANTAN works optimally with the input in the form of normalised amplitudes rather than structure factors, so the Interface will usually run the ECALC program to convert SFs to normalised amplitudes. Alternatively, the Interface will allow input of either precalculated normalised amplitudes, or normalised amplitudes and initial phases.

MAD data for RANTAN will be preprocessed by the REVISE program (see above), which generates estimates of FM, the normalised anomalous scattering magnitude. The input to REVISE is the FP, FPH(+)n and FPH(-)n for dataset n. These data should have been scaled by the SCALEIT program. REVISE also needs to know the wavelength, f' and f'' for each dataset.

See program documentation: RANTAN, ECALC, REVISE, SCALEIT.
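The conversion from structure factor amplitudes to normalised amplitudes referred to in the sections above can be pictured with a minimal Python sketch. This is not ECALC's actual algorithm: it assumes only the simplest textbook definition, E = F / sqrt(<F^2>), with the mean taken within equal-count resolution shells, and it ignores symmetry (epsilon) factors and any smoothing between shells. The function name `normalise_amplitudes` and its arguments are hypothetical.

```python
import math


def normalise_amplitudes(reflections, n_shells=2):
    """Return normalised amplitudes E for a list of (d_spacing, F) pairs.

    Illustrative only: E = F / sqrt(mean(F^2)) within each resolution
    shell. Epsilon factors and shell smoothing are deliberately ignored.
    """
    if not reflections:
        return []
    # Order reflection indices from low resolution (large d) to high.
    order = sorted(range(len(reflections)), key=lambda i: -reflections[i][0])
    n_shells = max(1, min(n_shells, len(order)))
    e_values = [0.0] * len(reflections)
    for s in range(n_shells):
        # Equal-count shells; each shell holds at least one reflection.
        start = s * len(order) // n_shells
        end = (s + 1) * len(order) // n_shells
        shell = order[start:end]
        mean_f2 = sum(reflections[i][1] ** 2 for i in shell) / len(shell)
        for i in shell:
            f = reflections[i][1]
            e_values[i] = f / math.sqrt(mean_f2) if mean_f2 > 0 else 0.0
    return e_values
```

With this definition the mean of E^2 within each shell is 1, which removes the overall fall-off of intensity with resolution. For real work the E values should come from ECALC itself.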
Professs - NCS from HA
The Professs task will take heavy atom coordinates from a PDB file or a Heavy Atom (.ha) file and attempt to find NCS operators relating subsets of sites. Any operators found will be listed in the log file.

See program documentation: Professs

Oasis - SAD/SIR phasing
Oasis takes SAD or SIR data and uses direct methods to break the phase ambiguity. The program requires heavy atom sites, either from a Heavy Atom (.ha) file or entered manually. The task will optionally run DM to perform density modification after phasing.

See program documentation: Oasis

Generate Patterson Map
The Generate Patterson Map Task performs the following:
Optionally:

Excluding Large Intensity Differences
Erroneously large intensity differences can affect a Patterson map disproportionately because the quantity used, the intensity, is the square of the structure factor, and the square of a large number is a very large number. The effect seen in the Patterson map is ridges. It is therefore usually a good idea to exclude the reflections with very high differences: FPH-FP from the difference Patterson and FPH(+)-FPH(-) from the anomalous difference Patterson. By default the Interface will run the SCALEIT program to analyse the data and use the value of 4.1*RMS(FPH-FP), which is a reasonable first estimate of a suitable cutoff. It may be worthwhile to try different cutoff values and look at the resultant Patterson map; the value used can be set at the top of the Exclude Reflections folder. Excluding 'good' reflections tends to degrade the map, so avoid a cutoff that rejects more reflections than necessary. For very good data it may be unnecessary to exclude any data.

The SCALEIT log file also has a table of Isomorphous and (if appropriate) Anomalous differences, which shows the number of reflections with given differences as a function of resolution shell.

Generate Patterson Map - Task Window Layout
Features to look out for in the Generate Patterson Map Task are:
See program documentation: SCALEIT, FFT, MAPMASK, PEAKMAX, NPO, VECTORS, HAVECS.

Real Space Patterson Search - RSPS
This task runs the RSPS program to find heavy atom sites from a Patterson. It may also be used to test previously determined heavy atom coordinates.

Run MLPHARE
MLPHARE can be used to refine either isomorphous or anomalous data. Check the 'Use anomalous difference data' box at the top of the MLPHARE interface if appropriate.

The initial default interface only provides for describing one derivative or wavelength; click on the Add Another Derivative button under the 'MTZ in' section to open space for additional data.

The minimal input then required is some initial heavy atom definitions in the folder Describe Derivatives & Refinement. For each derivative enter a name, and the name of the HA file containing the data for that derivative. Alternatively, enter the atoms explicitly by changing the Use data 'from file' menu option to Use data 'entered below' and then typing in the information. The Cut and Paste tool may be useful. For anomalous data you will need to enter the same HA file for each wavelength.

It is possible to edit the HA files 'on line' by clicking the View button on the file selection line. The HA file viewer has some simple editing tools, but more complex changes may need to be done in an editor.

The output MTZ file contains columns PHIB_mlphare1, FOM_mlphare1, etc. If you use this file as input to another MLPHARE run, set a new unique column name extension: change the parameter 'Output label identifier' from mlphare1 to mlphare2, for instance.

Each run of MLPHARE within the Interface also outputs one HA file for each derivative. These HA files can be used as input to the next MLPHARE run.

The SCALEIT documentation states: "MLPHARE has a built in weighting scheme which means that it doesn't do much harm to include less good data in phasing. After all the poor hkl should get low FOMs, and then DM can use the few reflections with reasonable phases to help in the phase extension procedure."

The MLPHARE program documentation has several helpful hints, e.g.: "NB: If an occupancy becomes near to 0.0 the coordinate shifts will possibly be meaningless", and a whole section of Notes on usage.

Suggested input numbers for Estimated Lack of Closure:
Data Harvesting
MLPHARE is one of the Data Harvesting programs. See Data Harvesting in CCP4i for implications for the Interface.

Maps
The MLPHARE interface has the option to output double difference maps, which can be used to search for further heavy atoms. In this case the PEAKMAX program will also be run to list the peaks to a PDB file and to an HA file with the name project_jobid_label_peaks.ha, where label is the MTZ column label of the derivative FPH. If you wish to do any other analysis on the map, it can be input to the 'Generate Patterson Map' task when the 'Run FFT ...' option at the top of the task window has been toggled off.

It is easiest to create maps by running the FFT task inside the Run Mlphare task. Do this by toggling on the option to 'Generate double difference maps files ...'. In some cases it may be necessary to (re)create maps independently from the MLPHARE task. It is not possible to do this through the Create Task-Specific Maps task in the Map & Mask Utilities module. You should attempt to do it through the Run FFT - Create Map task in the Map & Mask Utilities module only if you know exactly what you are doing.

See program documentation: MLPHARE, PEAKMAX, FFT.

See also
MIRTutorial(Bath)
(the HTML equivalent of $CDOC/Iso_repl_itickle_tut.bath.ps)
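
As an illustration of the Heavy Atom (.ha) file format described under Solution Files, the following Python sketch reads ATOM lines of the form `ATOM atom_name x y z occupancy anomalous_occupancy BFAC B-factor` and skips lines that begin with `#`, mirroring how such lines are ignored on input to MLPHARE. It is a sketch under stated assumptions, not code from CCP4i: the names `HeavyAtomSite`, `parse_ha_text` and `toggle_comment` are hypothetical, and lines that do not match the layout are silently skipped.

```python
from dataclasses import dataclass


@dataclass
class HeavyAtomSite:
    name: str
    x: float
    y: float
    z: float
    occupancy: float
    anomalous_occupancy: float
    b_factor: float


def parse_ha_text(text):
    """Parse HA-file text into a list of sites (hypothetical helper)."""
    sites = []
    for raw in text.splitlines():
        line = raw.strip()
        # Blank lines and lines switched off with '#' are ignored.
        if not line or line.startswith("#"):
            continue
        tokens = line.split()  # free format: any whitespace separates fields
        if len(tokens) < 9 or tokens[0].upper() != "ATOM":
            continue
        if tokens[7].upper() != "BFAC":
            continue
        x, y, z, occ, aocc = (float(t) for t in tokens[2:7])
        sites.append(
            HeavyAtomSite(tokens[1], x, y, z, occ, aocc, float(tokens[8]))
        )
    return sites


def toggle_comment(line):
    """Add a leading '#' to a line, or remove it if already present."""
    return line[1:] if line.startswith("#") else "#" + line
```

`toggle_comment` imitates the effect of picking a line in the HA file viewer: applying it twice returns the original line.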