Molecular Replacement Tutorial

This tutorial solves one protein structure using the structure of a similar protein.

The Problem

The example is to solve cardiotoxin which is:

This protein is now in the PDB as 1tgx.pdb (solved by A. Bilwes, B. Rees, D. Moras, J. Mol. Biol. 1994, V239, p-122).

Outline of the Method

1) Make an estimate of the number of molecules in the asymmetric unit so we know how many molecules to look for with the molecular replacement programs.
2) Look at our experimental data - are there any problems?
3) Run the molecular replacement program to find solutions.
4) Refine the phases using NCS (non-crystallographic symmetry) phased refinement.

The Data Files

Files in directory $DATA
model.pdb - contains coordinates of the model we will use to solve cardiotoxin.
Files in the directory $RESULTS
matthews.log - the log file from Cell Content Analysis
For next steps see the picture here
1.1 Change to the Coordinate Utilities
module. Select the Cell Content Analysis
task.
1.2 Select the MTZ file - the program will read
the space group and cell dimensions from the MTZ file (so you do not need
to type them in):
MTZ file
DATA
cardiotoxin.mtz.
1.3 Enter the molecular weight of the protein. The
protein has 60 residues and we assume an average residue weight of 100 Dalton,
so 60 * 100 = 6000:
Molecular weight of protein 6000.
1.4 Click the Run Now button.
1.5 Look at the output in the window - it shows a table
of the Matthews coefficient and percentage solvent content as a function of
the number of molecules in the asymmetric unit.
1.6 Close the Cell Content
Analysis window.
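The table from step 1.5 can also be checked by hand. The following is a minimal sketch, not the code of the CCP4 program: it computes the Matthews coefficient as V_M = V_cell / (Z * n * MW) and approximates the solvent fraction as 1 - 1.23 / V_M. The molecular weight comes from step 1.3; the cell volume and the number of symmetry operators are placeholder values chosen for illustration, not those of cardiotoxin, and should be replaced with the values read from the MTZ header.

```python
def matthews_table(cell_volume, n_symops, mol_weight, max_molecules=6):
    """Return rows of (n_molecules, V_M in A^3/Da, percent solvent).

    cell_volume : unit cell volume in cubic Angstroms
    n_symops    : number of symmetry operators of the space group (Z)
    mol_weight  : molecular weight of one molecule in Dalton
    """
    rows = []
    for n_mol in range(1, max_molecules + 1):
        v_m = cell_volume / (n_symops * n_mol * mol_weight)
        # Common approximation for proteins: solvent fraction = 1 - 1.23 / V_M
        solvent_percent = 100.0 * (1.0 - 1.23 / v_m)
        rows.append((n_mol, v_m, solvent_percent))
    return rows


mol_weight = 60 * 100        # 60 residues * 100 Da, as in step 1.3
cell_volume = 240000.0       # ASSUMPTION: placeholder, not the cardiotoxin cell
n_symops = 4                 # ASSUMPTION: placeholder, not the real space group

rows = matthews_table(cell_volume, n_symops, mol_weight)
for n_mol, v_m, solvent in rows:
    print(f"{n_mol:2d} molecules  V_M = {v_m:5.2f}  solvent = {solvent:5.1f} %")
```

Rows whose solvent content falls in a physically sensible range are the candidates for the number of molecules to search for in the later steps.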
a) Create a Patterson map and search it for peaks.
We expect a big peak at the origin (position 0,0,0), but if there is another
big peak (perhaps about 0.25 the size of the origin peak) then there may be
a pure translation between the molecules in the asymmetric unit,
and the structure will be more difficult to solve.
(The theory behind this is explained on the web site of Bernhard
Rupp:
http://www-structure.llnl.gov/xray/101index.html
For more information, go to the section on Phasing Techniques on
this website, and click on NCS with native Patterson maps)
b) Create a Wilson plot, which is an indication of the
self-consistency of the data. Also find the average B-value of the
data - this can be used to help the molecular replacement program.
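To make the link between the Wilson plot and the average B-value concrete, here is a minimal sketch, assuming the standard Wilson relation ln(<I> / sum f^2) = ln K - 2B (sin(theta)/lambda)^2, where (sin(theta)/lambda)^2 = 1 / (4 d^2). It is not the code used by the CCP4 task. The function names and the synthetic resolution shells are made up for illustration; real data would supply the mean intensity ratio per resolution shell.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares straight line; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def wilson_b(d_spacings, intensity_ratios):
    """Estimate (B, K) from per-shell d-spacings (A) and <I>/sum(f^2) ratios."""
    xs = [1.0 / (4.0 * d * d) for d in d_spacings]   # (sin(theta)/lambda)^2
    ys = [math.log(r) for r in intensity_ratios]
    slope, intercept = fit_line(xs, ys)
    return -slope / 2.0, math.exp(intercept)         # slope = -2B


# Synthetic shells generated with B = 20 and K = 0.5 (illustrative values only)
shells = [4.0, 3.5, 3.0, 2.7, 2.5, 2.3]
ratios = [0.5 * math.exp(-2.0 * 20.0 / (4.0 * d * d)) for d in shells]

b_est, scale = wilson_b(shells, ratios)
print(f"B = {b_est:.2f} A^2, scale K = {scale:.3f}")
```

With real data the points do not lie exactly on a line, so the fit is normally restricted to the higher-resolution shells where the plot is close to linear.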
For the next steps look at picture here
2.1) Change to the Molecular Replacement
module and select the Analyse Data for MR
task.
2.2) Select the input experimental data
MTZ in
DATA cardiotoxin.mtz
2.3) Select input model:
PDB in DATA
model.pdb
2.4) Enter the Number of residues in the asymmetric
unit - this is:
number of molecules in asymmetric unit *
number of residues per molecule
= 3 * 60
= 180
Number of residues in asymmetric
unit 180
2.5) Run the
job. You can now Close the Analyse
Data window.
2.6) Look at the log file when the job has finished.
In the main CCP4i window click on the job called mr_analyse
and then from menu View Files from Job select
View Log File. In the log file is output
from the programs FFT which created the Patterson map and Peakmax
which searched for peaks in the map. To find what we want
click on the Find button and enter the text
List of peaks. You should now see a table
like this:
Hints
In fact the values you get may be different, for example:
You may also see a different number of peaks, for example:
The difference is an effect of the width of the Patterson origin peak being
related to the resolution range of data included when generating the Patterson
map. At lower resolutions the origin peak may overlap neighbouring grid points
in the map, and result in apparent extra peaks in these adjacent positions.
Including higher resolution data narrows the origin peak and reduces the
effect; try changing the high resolution limit from 4.0 A to 3.0 A in the
Define Map folder, and re-run.
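The check described for the peak list can be sketched in a few lines. This is an illustration under stated assumptions, not part of Peakmax: the peak list is a hypothetical example rather than output from this data set, the 0.25 threshold is the rough figure quoted earlier in this tutorial, and `origin_radius` is an invented parameter that treats peaks very close to the origin as part of the origin peak (the overlap effect described in the Hints).

```python
def flag_translation_peaks(peaks, fraction=0.25, origin_radius=0.05):
    """Return non-origin peaks at least `fraction` of the origin peak height.

    peaks : list of (x, y, z, height) with x, y, z in fractional coordinates.
    Each result is (x, y, z, height relative to the origin peak).
    Raises ValueError if no peak lies at the origin.
    """
    def near_origin(x, y, z):
        # Distance to the nearest lattice point along each axis
        return all(min(c % 1.0, 1.0 - (c % 1.0)) < origin_radius
                   for c in (x, y, z))

    origin_height = max(h for x, y, z, h in peaks if near_origin(x, y, z))
    return [(x, y, z, h / origin_height)
            for x, y, z, h in peaks
            if not near_origin(x, y, z) and h >= fraction * origin_height]


# Hypothetical peak list, for illustration only
peaks = [
    (0.00, 0.00, 0.00, 100.0),   # origin peak
    (0.02, 0.00, 0.00, 60.0),    # grid point adjacent to the origin
    (0.50, 0.00, 0.31, 30.0),    # large off-origin peak
    (0.12, 0.40, 0.20, 8.0),     # ordinary small peak
]

flagged = flag_translation_peaks(peaks)
for x, y, z, rel in flagged:
    print(f"peak at ({x:.2f}, {y:.2f}, {z:.2f}) is {rel:.2f} of the origin peak")
```

An empty result suggests there is no strong translation between the molecules; any flagged peak deserves a closer look before running molecular replacement.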
2.7) Now go to the bottom of the log file where
you will see:
Wilson Plot
This is a typical Wilson plot - no problems here!
Amplitude Analysis v. Resolution
This plot has the usual shape for an amplitude
versus resolution plot, with a 'water' peak at about 4A.
Average B v. Residue
This shows the difference from the
mean - in this PDB file all the B values are set to 20.
This is not interesting for this protein.
Quit from
the two windows which display the log file and the graphs.
For the next few steps look at picture here
3.1) From the Molecular
Replacement menu select MolRep
- auto MR
3.2) The default mode for
running MolRep is good:
Do molecular
replacement performing rotation
and translation function
3.3) Select the input experimental
data file.
MTZ in DATA
cardiotoxin.mtz.
3.4) Select the input model.
Model in
DATA model.pdb
3.5) Look down the window a little:
set the
Search for
3 monomers in
the asymmetric unit.
3.6) From the Run
menu select Run
Now.
MolRep will take a long time to
run - if it takes too long you can look at the prepared output files: $RESULTS/molrep.log
and $RESULTS/mr_model_molrep1.pdb.
Go to directory
RESULTS
then
File
molrep.log
and then click on Display
and Exit
The log file lists many possible
solutions. After the rotation function:
After a translation function:
The program runs the translation function three times
to find three different molecules. For the second run of the translation
function it will take the best solution from the first run and try to find
another molecule which will fit well with the first solution. For
the third run of the translation function it will keep the best solution
from the first run and the second run and try to find molecule number three.
It is not possible to give a single value for a good score - this
depends on many things - but it is a good sign if the best score (the correlation
factor in the column labelled Corr) is much bigger than the second
best score. When you are looking for several molecules, the best score
for the first (and perhaps second) molecule may not be very good, but you
hope that the best score with all three molecules placed is much better than for
other possible solutions.
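One simple way to put a number on "much bigger than the second best score" is to compare the top two values in the Corr column. The sketch below is only an illustration: the scores are hypothetical, not taken from molrep.log, and the function is not part of MolRep. No cut-off is applied, because the tutorial gives none.

```python
def score_contrast(corr_scores):
    """Return (best, second_best, ratio) for a list of Corr values."""
    ranked = sorted(corr_scores, reverse=True)
    if len(ranked) < 2:
        raise ValueError("need at least two candidate solutions")
    best, second = ranked[0], ranked[1]
    if second <= 0:
        raise ValueError("second-best score must be positive to form a ratio")
    return best, second, best / second


# Hypothetical Corr values for the candidate solutions of one translation search
corr_scores = [0.268, 0.412, 0.255, 0.249]

best, second, ratio = score_contrast(corr_scores)
print(f"best = {best:.3f}, second = {second:.3f}, ratio = {ratio:.2f}")
```

A ratio close to 1 means the top solution does not stand out from the alternatives; a clearly larger ratio is the encouraging case described above.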
3.7) The output file mr_model_molrep1.pdb
contains three copies of the input model moved to the right positions in
the asymmetric unit. The three
molecules will pack together something like this.
For the next few steps look at the picture here
4.1) Go to the Refinement
module and choose Run Refmac5
4.2) Enter the name of the data file:
MTZ in
DATA
cardiotoxin.mtz
Enter the name of the input files - this is the coordinate
file output from MolRep:
PDB in
RESULTS model_molrep1.pdb
If you have not run molrep, use the model_molrep1.pdb file found in DATA:
PDB in
DATA model_molrep1.pdb
4.3) Now you must tell the program that it must
keep the non-crystallographic symmetry by keeping the three chains
similar. Click on the line with the folder title
Non-crystallographic Symmetry
4.4) Click on the button
Add non-X restraint
4.5) Click three times on the button
Add chain
4.6) Now set up like this:
Restrain together chain A
4.7) Make sure you are not creating a data harvesting file. Select:
Do not create harvest file
from the Data Harvesting section.
4.8) Run the program by clicking Run
and Run Now.
4.9) You can look at a log file by using View
Any File and selecting
Go to directory
RESULTS
File type
log CCP4 log filename
filter *.log
Viewer View
Log Graphs
and then select file:
File
mr_refmac.log
and then click on Display and Exit
Go to the bottom of the list of
Tables in File and
select
Rfactor analysis, stats vs cycle
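As a reminder of what the R factor in this table measures, here is a minimal sketch of the conventional definition R = sum(| |Fobs| - |Fcalc| |) / sum(|Fobs|). It assumes the observed and calculated amplitudes are already on the same scale; Refmac5 applies its own scaling, which this sketch omits, and the amplitudes below are made-up numbers for illustration.

```python
def r_factor(f_obs, f_calc):
    """Conventional crystallographic R factor for amplitudes on a common scale."""
    if len(f_obs) != len(f_calc):
        raise ValueError("f_obs and f_calc must have the same length")
    numerator = sum(abs(abs(fo) - abs(fc)) for fo, fc in zip(f_obs, f_calc))
    denominator = sum(abs(fo) for fo in f_obs)
    return numerator / denominator


# Made-up amplitudes for four reflections
f_obs = [100.0, 80.0, 60.0, 40.0]
f_calc = [90.0, 85.0, 50.0, 45.0]

r_value = r_factor(f_obs, f_calc)
print(f"R = {r_value:.3f}")
```

A value that decreases from cycle to cycle indicates that the model is fitting the observed data better as refinement proceeds.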
Prepared by Liz Potterton (lizp@ysbl.york.ac.uk) &
Eleanor Dodson, July 2000