CCP4 Tutorial: Session 3

3a) Scaling and analysing datasets

The Problem

You now have a file containing native data for GerE, and MAD data for a selenomethionine derivative. First, we scale each wavelength of the MAD data to the native dataset, so that all data is on the same scale. At the same time, we analyse the MAD data to estimate the strength of the dispersive and anomalous signals.

Exercise

3.1 Select the Experimental Phasing module, and open the Scale and Analyse Datasets task window.
3.2 On the first line, enter a suitable job title such as
3.3 On the second line, select
3.4 Select the input MTZ file
Now select the columns from the MTZ file. The first line has the native F_nat and SIGF_nat. Then select columns for the 4 wavelengths, using the button Add Derivative Data to add more columns. (It might be easier here to load the file $DATA/session3a.def which already has these parameters set.) You should end up with:
Check that the output MTZ file is given as
3.5 You should not need to change anything else. Select Run -> Run Now.
3.6 When the job has finished, return to the main window, highlight the job in the Job List, and select View Files from Job -> View Log Graphs. This task outputs a large number of graphs for analysing the data, and we will just look at some of them.
3.7 We can gauge the strength of the dispersive differences by looking at the graphs Centric Normal probability v resolution and Acentric Normal probability v resolution ... for each pair of wavelengths, e.g. ... FP = FSElrm FPH = FSEinfl SIGFSEinfl DSEinfl SIGDSEinfl. For each graph, look at the line Gradient_on_reflection_prob.lt.0.9. Use the crosswires to estimate a rough value, e.g. for the low-remote against the inflection, the value is about 1.182 for centric data and 1.254 for acentric data. The values can be summarised as (these values are contained in the file View Files from Job -> scaleit.summary):

Table: Normal Probability for acentric data

Normal Prob. |  FSEpeak   FSElrm   FSEhrm
------------------------------------------
FSEinfl      |  1.071     1.254    1.515

Table: Normal Probability for Centric data

Normal Prob. |  FSEpeak   FSElrm   FSEhrm
------------------------------------------
FSEinfl      |  0.975     1.182    1.470

This shows that the difference in f' values is smallest from the inflection to the peak, and largest from the inflection to the high-wavelength remote (the inflection point has the smallest f').
3.8 We can gauge the strength of the anomalous differences by looking at the graph Acentric Normal probability v resolution ... for F(+) and F(-) of each wavelength, e.g. ... FP = F(+)SEinfl FPH = F(-)SEinfl SIGF(-)SEinfl. For each graph, look at the line Gradient_on_reflection_prob.lt.0.9, and use the crosswires to estimate a rough value.
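Both steps above read a gradient off a normal probability plot. As a rough illustration of what such a gradient measures, the following minimal Python sketch sorts differences normalised by their standard deviations, pairs them with the quantiles expected for a standard normal distribution, and fits a straight line. It is not SCALEIT's implementation: the function name, the plotting positions and the synthetic data are assumptions made for illustration, and it does not attempt to reproduce the reflection selection implied by the graph label Gradient_on_reflection_prob.lt.0.9. Under these assumptions, a gradient close to 1 means the differences are about what the error estimates predict, and a larger gradient suggests real signal.

```python
import random
from statistics import NormalDist, mean


def normal_probability_gradient(deltas, sigmas):
    """Least-squares slope of observed against expected normal quantiles."""
    # Normalised differences, sorted to give the observed order statistics.
    observed = sorted(d / s for d, s in zip(deltas, sigmas))
    n = len(observed)
    # Expected standard-normal quantiles at plotting positions (i + 0.5) / n.
    std_normal = NormalDist()
    expected = [std_normal.inv_cdf((i + 0.5) / n) for i in range(n)]
    # Slope of the best-fit line through (expected, observed).
    mx, my = mean(expected), mean(observed)
    sxy = sum((x - mx) * (y - my) for x, y in zip(expected, observed))
    sxx = sum((x - mx) ** 2 for x in expected)
    return sxy / sxx


# Synthetic (hypothetical) data: unit sigmas, with and without extra signal.
rng = random.Random(0)
sigmas = [1.0] * 5000
noise_only = [rng.gauss(0.0, 1.0) for _ in sigmas]   # errors only
with_signal = [rng.gauss(0.0, 1.5) for _ in sigmas]  # errors plus signal

g_noise = normal_probability_gradient(noise_only, sigmas)
g_signal = normal_probability_gradient(with_signal, sigmas)
print(round(g_noise, 2), round(g_signal, 2))
```

With this synthetic input the first gradient should come out near 1 and the second near 1.5, which mirrors how the tabulated values are read: the further above 1, the stronger the difference relative to its estimated error.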
The values for the anomalous differences can be summarised as:

F(+)SEhrm  v F(-)SEhrm    1.01
F(+)SEinfl v F(-)SEinfl   1.17
F(+)SEpeak v F(-)SEpeak   1.43
F(+)SElrm  v F(-)SElrm    1.34

This shows that the high-wavelength remote has hardly any anomalous signal, i.e. a low value of f'' at this wavelength. The peak wavelength has the largest f'', while the other 2 wavelengths have intermediate values.

3b) Preparing datasets for finding heavy atoms

The Problem

You are going to use a direct methods approach for locating the Se sites. In this section, you will prepare the MAD data for use in the direct methods program RANTAN. This task runs REVISE to generate the normalised anomalous scattering magnitude FM, and then ECALC to calculate the corresponding normalised structure factor E.

Exercise

3.20 Select the Experimental Phasing module, and open the Prepare Data for HA Search task window.
3.21 On the first line, enter a suitable job title such as
3.22 On the next line, select
3.23 Select the input MTZ file
Check that the output MTZ file is given as
3.24 Now fill in the section Anomalous Data as follows:
3.25 You should not need to change anything else, so select Run -> Run Now.

3c) Find heavy atoms

The Problem

You have generated a column of E values which give a wavelength-independent measure of the anomalous scattering due to the Se sites. The Se sites can be found from the E values by Patterson methods, but here you will use a direct methods approach.

Exercise

3.40 Select the Experimental Phasing module, and open the Rantan - Direct Methods task window.
3.41 On the first line, enter a suitable job title such as
3.42 On the second line, select
3.43 Select the input MTZ file
3.44 You should not need to change anything else, so select Run -> Run Now.
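The E values used as input here are normalised structure factors. As a minimal sketch of the general idea only, the Python below divides each amplitude by the root-mean-square amplitude of its resolution shell, so that the mean of E squared is 1 in every shell. This is not ECALC's procedure, which handles further details not shown here; the function name, the equal-sized shells and the six example reflections are all hypothetical.

```python
from math import sqrt


def normalise_in_shells(amplitudes, resolutions, n_shells):
    """Return E = F / sqrt(<F^2>) with <F^2> taken per resolution shell."""
    # Order reflections from low resolution (large d) to high resolution.
    order = sorted(range(len(amplitudes)),
                   key=lambda i: resolutions[i], reverse=True)
    e_values = [0.0] * len(amplitudes)
    shell_size = -(-len(order) // n_shells)  # ceiling division
    for start in range(0, len(order), shell_size):
        shell = order[start:start + shell_size]
        mean_f2 = sum(amplitudes[i] ** 2 for i in shell) / len(shell)
        for i in shell:
            e_values[i] = amplitudes[i] / sqrt(mean_f2)
    return e_values


# Hypothetical amplitudes and resolutions (in Angstrom), for illustration.
amplitudes = [120.0, 95.0, 60.0, 40.0, 22.0, 15.0]
resolutions = [8.0, 6.5, 4.0, 3.5, 2.8, 2.7]

e_values = normalise_in_shells(amplitudes, resolutions, n_shells=3)
mean_e2 = sum(e * e for e in e_values) / len(e_values)
print([round(e, 3) for e in e_values], round(mean_e2, 3))
```

Normalising per shell removes the overall fall-off of intensity with resolution, which is why E values are the usual input for direct methods.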
3.45 When the job has finished, view the output MTZ file by selecting View Files from Job -> gere_MAD_rantan1.mtz in the main window. The output file has 48 columns:
3.46 From each of these phase sets, the task calculates a map and locates peaks, which may correspond to Se sites. These peaks are output in both orthogonal and fractional coordinates. Click on View Files from Job to reveal a list of output files. For each phase set, there will be a .pdb (orthogonal coordinates) and a .ha (fractional coordinates) file, for example TEST_5_1.pdb and TEST_5_1.ha for phase set 1. The default peak search produces approximately 15 peaks; we expect there to be 12 Se sites for this protein (2 each for 6 chains). (Note that RANTAN starts from random phase sets, so the results are not always the same.)

3d) Heavy atom refinement

The Problem

You now have 3 sets of possible Se sites. Heavy atom refinement and phasing are done using the program MLPHARE. The stages are:
For the tutorial, we just do the 1st stage (exercise 3d) and the last stage (exercise 3e).

Exercise

3.60 Select the Experimental Phasing module, and open the Run Mlphare task window.
3.61 On the first line, enter a suitable job title such as
3.62 In the first section, select:
3.63 Select the input MTZ file:
Check that the output MTZ file is given as
3.64 In the section Data Harvesting, leave as:
3.65 In the section Key parameters, enter resolution limits (we exclude data which does not help phasing):
3.66 In the section Describe Derivatives & Refinement, enter a name for the derivative:
3.67 Select Run -> Run Now.
3.68 Click on View Files from Job -> TEST_9_1.ha and look at the list of refined sites. The occupancies of sites 7 and 12 are now below 0:
3.69 Return to the Run Mlphare task window. In the section Describe Derivatives & Refinement, add in XYZ refinement, and change the occupancy refinement to alternate occupancy and B factor:
3.70 Update the heavy atom file:
3.71 Select Run -> Run Now. The interface will ask you whether you want to overwrite gere_MAD_mlphare1.mtz. This is OK, so click Delete File.
3.72 When the job has finished, you can check the refined Se sites as before.

There are several ways the refinement of the Se sites can be optimised:
3e) Phasing

The Problem

We assume you have now found all the Se sites, and have refined their positions, real occupancies and B factors. A file is provided in DATA with the correct Se coordinates. To get the best phases, we now include all wavelengths together, and use the anomalous signal as well.

Exercise

3.80 In the Run Mlphare task window, enter a suitable job title such as:
3.81 In the first section, select:
3.82 Select the input MTZ file:
Check that the output MTZ file is given as
3.83 In the section Key parameters, enter resolution limits:
3.84 In the section Describe Derivatives & Refinement, Derivative Number 1, enter a name for the derivative:
3.85 Repeat 3.84 for the other 3 wavelengths.
3.86 Select Run -> Run Now.
3.87 When the job has finished, return to the main window, highlight the job in the Job List, and select View Files from Job -> View Log Graphs. Graphs are given for each wavelength, for both the last refinement cycle and the final phasing cycle. Look in particular at:
3f) Density Modification

The Problem

The phases output by MLPHARE can be used to generate an electron density map for the native data. However, the map is likely to be easier to interpret if density modification is performed first. (In fact, MLPHARE gives realistic Figures of Merit and therefore density modification usually works well.) Density modification (also known as Density Improvement) can be done using the program DM.

Exercise

3.100 Select the Density Improvement module, and open the Run DM task window.
3.101 On the first line, enter a suitable job title such as
3.102 Select the input MTZ file:
3.103 Enter the solvent content as
3.104 Everything else can be left as default, so Run -> Run Now.

3g) Testing the hand

The Problem

The procedure for locating the Se sites cannot distinguish between a particular set of sites and the same set of sites transformed through a point of inversion, i.e. it cannot distinguish the hand of the solution. Therefore, the previous phasing run should be repeated using the opposite hand.

Then we look at two things:
Exercise

Re-run the previous 2 exercises, but use the following file of sites instead:
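As background to the point inversion described in The Problem above, the following minimal Python sketch shows what inverting a set of fractional coordinates through the origin looks like. It is an illustration only and not a replacement for the file of sites provided with the tutorial: the function name and the example coordinates are hypothetical, and the sketch assumes the inversion centre is at the origin and that the space group itself does not change, which does not hold for every space group (enantiomorphic pairs, for example, also require switching the space group).

```python
def invert_hand(sites):
    """Map each fractional site (x, y, z) to (-x, -y, -z), wrapped into [0, 1)."""
    return [tuple((-c) % 1.0 for c in site) for site in sites]


# Hypothetical fractional coordinates, chosen only for illustration.
sites = [(0.25, 0.5, 0.125), (0.75, 0.375, 0.0625)]

inverted = invert_hand(sites)
restored = invert_hand(inverted)
print(inverted)
print(restored == sites)
```

Applying the inversion twice returns the original sites, which is a quick sanity check that a file of opposite-hand sites was generated consistently.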