CCP4i: Graphical User Interface
Refinement Module

Module Overview
Overview of Restraints Handling
Run Refmac5 - Maximum Likelihood refinement
Refmac Folder Names
Refmac - Task Window Layout
Data Harvesting
Maps
Edit Restraints in PDB File
Monomer Library Sketcher
Merge monomer libraries
Run ARP/wARP v5.0 ('Arp_waters')
Running ARP/wARP v5.0 with Refmac - This is the most common usage of ARP/wARP, to find waters
The ARP/wARP Stand-Alone Interface - no longer supported
NCS Phased Refinement - NCSref
Input
Methodology
Reviewing Results
Maps
Tidy Waters - Watertidy
Create/Edit TLS File
Analyse anisotropic U parameters
Analyse TLS parameters
Run SFCHECK & PROCHECK - Checking the fit between structure and experimental data, and checking the geometry
Note on Running PROCHECK within CCP4i
CHECK - Task Window Layout
The Refinement module contains the following tasks:
Run Refmac5
Edit Restraints in PDB File
Monomer Library Sketcher
Merge monomer libraries
NCS Phased Refinement
Tidy Waters
Create/Edit TLS File
Analyse anisotropic U parameters
Analyse TLS parameters
Run SFCHECK & PROCHECK

Module Overview

Refinement in CCP4 is usually done using the program Refmac5, although Restrain is also available. Before proceeding with refinement, it is important to check that there are suitable restraints in place. This should be the case for the protein and common ligands, but needs to be checked for unusual ligands and protein linkages (e.g. disulphide bonds). Therefore, the first step should be to run Refmac5 in Review Mode (select "review restraints" in the protocol section of the Run Refmac5 task). Certain restraints will be placed in the PDB file, and these can be checked and changed if necessary via the Edit Restraints in PDB File task. Restraint dictionaries for ligands can be generated and/or edited using the Monomer Library Sketcher. Multiple dictionaries can be merged with the Merge monomer libraries task.

Refinement of the model can then proceed via the Run Refmac5 task, as described below. The special case of NCS Phased Refinement is treated in a separate task. For TLS refinement, an input file is required to define the TLS groups, and this can be generated in the Create/Edit TLS File task.

After refinement, a number of analyses and checks can be performed. The Tidy Waters task rationalises water molecules according to the protein chain with which they are most likely associated. The Analyse anisotropic U parameters task provides analyses of U's obtained from a full anisotropic refinement, while the Analyse TLS parameters task provides analyses of parameters from a TLS refinement. Finally, the Run SFCHECK & PROCHECK task should be run to validate the model structure.

The layout of each task window, i.e. the number of folders present, and whether these folders are open or closed by default, depends on the choices made in the Protocol folder of the task (see Introduction). Although certain folders are closed by default, there are specific reasons why you should or may want to look at them. These reasons are described in the Task Window Layout sections below.

Overview of Restraints Handling

Using Refmac for 'restrained refinement', 'structure idealisation' or 'TLS & restrained refinement' requires restraints to idealised geometry, and these are set up within Refmac itself (previous versions of Refmac required the Protin program). The options for 'unrestrained refinement' and 'rigid body refinement' do not require restraints. Although restraints can be generated automatically, you are very strongly recommended to check them on the first run of Refmac, particularly for any ligand. The 'review restraints' option performs the restraint generation but no refinement.

Refmac restraint generation will attempt to find a match in the geometry library for all residue types in the input PDB file; this will handle all standard amino acids and nucleic acids and their link regions automatically. The Review Restraints folder in the Refmac window controls the setting up of restraints. The user should be careful with the library descriptions of ligand molecules and wary that the names of monomers in the geometry library and the atom naming conventions may not be the same as in their structure. You can view the monomer in the geometry library using the Monomer Library Sketcher and the option under the File pull-down menu to Read File -> Load Monomer from Library. If you have an unusual ligand you can use the Monomer Library Sketcher to create a geometry library description from either an input PDB coordinate file or from a structure that you sketch. If you generate your own geometry library, it must be given as one of the input files to Refmac.

Refmac also needs additional restraints to define disulphide bonds, cis-peptides, inter-residue links and any modified residues. These are input to Refmac as the SSBOND, CISPEP, LINK and MODRES cards in the PDB file. They are also generated automatically and written to the PDB file when Refmac is run in 'review restraints' mode. The automatic restraint generation is very dependent on the quality of the input coordinates; the results should be reviewed and edited in the Edit Restraints in PDB File task. Note that the format used by Refmac for the MODRES and LINK lines is slightly non-standard: there is an option in the Edit Restraints in PDB File task to write the output PDB file in standard format.
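As a quick check before opening the Edit Restraints in PDB File task, the restraint cards already present in a PDB file can be listed with a few lines of Python. This is only a minimal sketch: the function name and the three example lines are made up for illustration, and the only format fact it relies on is that PDB record names occupy the first six columns.

```python
from collections import defaultdict

# The four restraint record types named in the text above.
RESTRAINT_RECORDS = ("MODRES", "SSBOND", "LINK", "CISPEP")


def collect_restraint_records(pdb_lines):
    """Group restraint header lines by record name (hypothetical helper)."""
    found = defaultdict(list)
    for line in pdb_lines:
        # PDB record names occupy the first six columns of each line.
        name = line[:6].strip()
        if name in RESTRAINT_RECORDS:
            found[name].append(line.rstrip("\n"))
    return dict(found)


# Made-up example lines, not taken from a real structure.
example = [
    "SSBOND   1 CYS A    6    CYS A  127",
    "CISPEP   1 SER A   58    PRO A   59",
    "ATOM      1  N   LYS A   1      26.5  28.1  10.2",
]
records = collect_restraint_records(example)
for name, lines in records.items():
    print(name, len(lines))
```

For a real file, pass `open("model.pdb")` (a placeholder name) instead of the example list.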

If there is non-crystallographic symmetry within the structure and you wish to restrain the related chains to be similar (as is recommended in early cycles of refinement) then use the Non-crystallographic symmetry folder in the Refmac window.

There is a tutorial introducing restraint handling. For an introduction to this tutorial (and a route to the data to be used), see CCP4 Tutorial: Contents.

Run Refmac5

Note that the pre-5.0 version of Refmac is still available.

The Protocol folder in Refmac requires the user to choose the type of refinement (default 'restrained') and the type of phase information to be used (default 'none'). For certain types of refinement, TLS parameters can be input as a fixed contribution to structure factors. Next, it offers the choice of cycling with the program ARP_WATERS (ARP/wARP v5.0, analysing waters only or all atoms). Maps can be generated in various formats, and the extent of map coverage (including the border) may be chosen here.

Depending on the selected Protocol option, the Files folder requires input of various files (MTZ, PDB, TLS, library files), and names for the output files will be suggested. If map files are to be generated (through the toggle button in the Protocol folder) and the Preferences are set such that output map files may be chosen explicitly, the Files folder will contain lines for 'Fwt map' and 'DelFwt map'. Default filenames will be suggested, but they may be changed.

Depending on the selected Protocol option, other important choices can be made in the Required Parameters folder:

  • The method to be used for the calculation of the crystallographic residual (maximum likelihood or least-squares).
  • The number of refinement cycles ('minicycles') to be run and, in the case of cycling with ARP_waters, Refmac cycles ('macrocycles' or 'external refinement cycles').
  • Inclusion or exclusion of hydrogens from the structure factor calculation.
  • The resolution range of the data to be used in this refinement run.
  • The type of weighting (matrix/gradient), the relative weights of the X-ray and geometry terms, and the inclusion or exclusion of experimental sigmas in this weighting. The gold-coloured 'Message line help' at the top of the window indicates the direction of change (increase or decrease) for tightening the geometry.
  • The type of temperature factor refinement.
  • The handling of the 'free' reflections. Note that the MTZ label of the free R column is set here, rather than in the Files folder - you should make sure this is set correctly.

Refmac Folder Names

The Refmac window potentially has the following foldersª:
Folder name | Visible for | Comment
Data Harvesting | REST, TLSª | default is 'Do not create harvest file', but get ready to produce harvest files automatically in future
Required Parameters | REST, UNRE, RIGID, IDEA, TLSª | open by default
Setup Restraints | REVIEW, REST, IDEA, TLSª | open for REVIEW (with revised defaults), closed for the other options
Non-crystallographic symmetry | REVIEW, REST, IDEA, TLSª | closed by default
ARP_waters Parameters | REST, UNRE, TLSª | open by default, but only if the ARP/wARP toggle button in the Protocol folder is 'On'
TLS Parameters | TLSª | open by default when TLS & restrained refinement selected
Rigid Domains Definition | RIGIDª | open by default when rigid body refinement
Partial Structure Factors | REST, UNRE, RIGID, TLSª | closed by default
Data Output to MTZ file | REST, UNRE, RIGID, TLSª | closed by default
Crystal parameters | all options | closed by default
Scaling Fobs and Fc | REST, UNRE, RIGID, TLSª | closed by default
SigmaA Estimation | REST, UNRE, RIGID, TLSª | closed by default
Other Parameters | REST, UNRE, RIGID, IDEA, TLSª | closed by default
Monitoring | all options | closed by default
Geometric parameters | REVIEW, REST, IDEA, TLSª | closed by default
ª REVIEW = review restraints, REST = restrained refinement, UNRE = unrestrained refinement, RIGID = rigid body refinement, IDEA = structure idealisation, TLS = TLS & restrained refinement
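The folder table above can be read as a simple lookup from protocol option to visible folders. The sketch below encodes it as a Python dictionary; the names `FOLDER_VISIBILITY` and `folders_for` are invented for this illustration and are not part of CCP4i, and 'all options' is assumed to mean all six protocol options including REVIEW.

```python
# Protocol option abbreviations, as defined in the table footnote.
ALL_OPTIONS = {"REVIEW", "REST", "UNRE", "RIGID", "IDEA", "TLS"}
NO_REVIEW_IDEA = {"REST", "UNRE", "RIGID", "TLS"}
RESTRAINED = {"REVIEW", "REST", "IDEA", "TLS"}

# Folder name -> protocol options for which the folder is visible.
FOLDER_VISIBILITY = {
    "Data Harvesting": {"REST", "TLS"},
    "Required Parameters": {"REST", "UNRE", "RIGID", "IDEA", "TLS"},
    "Setup Restraints": RESTRAINED,
    "Non-crystallographic symmetry": RESTRAINED,
    "ARP_waters Parameters": {"REST", "UNRE", "TLS"},
    "TLS Parameters": {"TLS"},
    "Rigid Domains Definition": {"RIGID"},
    "Partial Structure Factors": NO_REVIEW_IDEA,
    "Data Output to MTZ file": NO_REVIEW_IDEA,
    "Crystal parameters": ALL_OPTIONS,  # assumption: 'all options' includes REVIEW
    "Scaling Fobs and Fc": NO_REVIEW_IDEA,
    "SigmaA Estimation": NO_REVIEW_IDEA,
    "Other Parameters": {"REST", "UNRE", "RIGID", "IDEA", "TLS"},
    "Monitoring": ALL_OPTIONS,  # assumption: 'all options' includes REVIEW
    "Geometric parameters": RESTRAINED,
}


def folders_for(option):
    """Return the folders listed as visible for one protocol option."""
    if option not in ALL_OPTIONS:
        raise ValueError(f"unknown protocol option: {option}")
    return [name for name, options in FOLDER_VISIBILITY.items() if option in options]


print(folders_for("REVIEW"))
```

The Protocol and Files folders are not in the table and so are not modelled here.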

Refmac - Task Window Layout

Following the Protocol, Files, and Required Parameters folders, features to look out for in the Refmac task are:

Protocol option | Folder title | Importance | Comment
REVIEW, REST, IDEA, TLSª | Setup Restraints | | The default options are slightly different for the REVIEW mode.
RIGIDª | Rigid Domains Definition | Initialise rotation and translation parameters | Define translation and/or rotation to be applied before refinement starts
Cycle with ARP_waters | ARP/wARP Parameters | Merge atoms closer than... | Default depends on Protocol choice (waters only, or all atoms)
Cycle with ARP_waters | ARP/wARP Parameters | Refine (only water atoms, or all atoms) | Protocol choice refers to analysis, this refers to refinement of 'waters only, or all atoms'
REST, UNRE, RIGID, TLSª | Partial Structure Factors | Include partial structure factors from known partial structures | See Refmac partial structure factors
REST, UNRE, RIGID, TLSª | Data Output to MTZ file | Labels for MTZ output | Defaults suggested
REST, UNRE, RIGID, TLSª | Scaling Fobs and Fc | | NOTE: When doing ML refinement, the scale factors are only used to calculate R values
REST, UNRE, RIGID, TLSª | Scaling Fobs and Fc | Apply anisotropic scaling | Different from program default, which is NOT to apply anisotropic scaling
REST, UNRE, RIGID, TLSª | Scaling Fobs and Fc | Use experimental sigmas | Are yours good enough?
REST, UNRE, RIGID, TLSª | Scaling Fobs and Fc | Fix scale and B for low resolution structures | Scale (i.e. subkeyword SCBULK) needs to be negative
REST, UNRE, RIGID, TLSª | SigmaA Estimation | Use 'free' reflections | SigmaA can be successfully estimated using only 200 reflections
REST, UNRE, RIGID, TLSª | SigmaA Estimation | Fix scale and B for low resolution structures | Only use with EXTREME care; scale needs to be negative
REST, UNRE, RIGID, IDEA, TLSª | Other parameters | | Splitting resolution range, limit Bvalue range, damping parameters
REVIEW, REST, IDEA, TLSª | Geometric parameters | Target values for the geometry | Sensible defaults in place
ª REVIEW = review restraints, REST = restrained refinement, UNRE = unrestrained refinement, RIGID = rigid body refinement, IDEA = structure idealisation, TLS = TLS & restrained refinement

Data Harvesting

Refmac is one of the Data Harvesting programs. See Data Harvesting in CCP4i for implications for the Interface.

Maps

Refmac calculates special weighted differences to be used as coefficients for electron density maps. It is easiest to create those maps by running the FFT task inside the Refmac task. Do this by toggling on the option 'Generate weighted difference maps files ...' in the Protocol folder. The format of these maps can then be chosen.

The Refmac map coefficients FWT and DELFWT should be treated as 'composite single Fourier coefficients'. Therefore, even though their appearance suggests their use for difference (nF1-mF2) maps, Refmac coefficients should be used in the calculation of a 'simple' map if the map is calculated independently from the Refmac task with the Run FFT - Create Map task in the Map & Mask Utilities module.

Maps may also be (re)created independently from the Refmac task, through the Create Task-Specific Maps task in the Map & Mask Utilities module. The only input this task requires is the job number of the original job - all other parameters will be restored from the database. This task will produce the appropriate maps, even when they were not calculated in the original job.

See program documentation: Refmac5 or Refmac5@YSBL.

For further information on TLS refinement, see Martyn's page.

Edit Restraints in PDB File

This task simplifies editing the restraint records (MODRES, SSBOND, LINK and CISPEP) in the PDB file. Whether Refmac uses the restraints specified in the PDB file depends on the input commands from the Setup Restraints folder in the Refmac task interface. You will normally enter the name of an input PDB file (though this is not essential) and by default the changed records will be written out to the same file - but you can give an alternative output file name.

Information

The first folder provides some information to help you with the rest of the task. When the spacegroup is entered, the symmetry operations for that spacegroup are listed. The residue modifications and links currently defined in the geometry libraries are also listed.

MODRES - Modified Residues

By default the name of a residue in the PDB file should correspond to the name of a monomer in the geometry libraries and the restraints defined for that monomer will be used when refining the residue. The MODRES records can be used to handle two types of non-default cases:

Modified residues
If there has been a small chemical change to a standard residue then a MODRES record will define that change. You should identify the modified residue (enter residue name, sequence number, chain ID and insertion code) and the LibNam (i.e. the name of the modification in the library).
Renaming residues
This is a means of aliasing the residue name from the PDB file to a different monomer name in the libraries. This may be necessary because the PDB file format does not allow residue names greater than three characters but the geometry library names may be up to six characters. To use the MODRES record in renaming mode: enter the PDB file residue name, sequence number, chain ID and insertion code; click on the button in the Rename column and enter the LibNam, the name of the monomer in the library file.

Note: CCP4 uses an extension to the MODRES record in the PDB file: the LibNam field is increased from 3 to 6 characters (extending into the field usually reserved for comments) and an extra 'mode' field is added to the end of the record. When using the MODRES record in rename mode the mode field is either RENAME or left blank. When MODRES is used to modify a residue the mode field contains the name of the modification (as taken from the geometry libraries) and the LibNam field is left blank.

SSBOND - Disulphide Bonds

Refmac will usually do a good job of finding close cysteine residues and automatically generating the restraints to maintain a disulphide bond, but you may need to enter definitions for difficult cases. Normally the residue name, sequence number, chain ID and insertion code should be entered for both residues in the disulphide bond.
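The idea of 'finding close cysteine residues' can be illustrated with a simple distance search over the cysteine SG atoms. This is a sketch of the general idea only, not Refmac's actual algorithm: the 2.5 Å cutoff is an assumed value chosen to sit a little above a typical S-S bond length of about 2.0 Å, the labels and coordinates are made up, and symmetry-related molecules are ignored.

```python
import math
from itertools import combinations


def find_close_sg_pairs(sg_atoms, cutoff=2.5):
    """Return pairs of SG atoms closer than `cutoff` (in Angstrom).

    sg_atoms is a list of (label, (x, y, z)) tuples, one per cysteine SG.
    The cutoff is an illustrative assumption, not a Refmac parameter.
    """
    pairs = []
    for (label_a, xyz_a), (label_b, xyz_b) in combinations(sg_atoms, 2):
        distance = math.dist(xyz_a, xyz_b)
        if distance <= cutoff:
            pairs.append((label_a, label_b, round(distance, 2)))
    return pairs


# Made-up coordinates for three SG atoms.
sg_atoms = [
    ("CYS A 6", (0.0, 0.0, 0.0)),
    ("CYS A 127", (2.03, 0.0, 0.0)),
    ("CYS A 30", (10.0, 0.0, 0.0)),
]
print(find_close_sg_pairs(sg_atoms))
```

Pairs that such a search misses, for example because of poor starting coordinates, are the 'difficult cases' that need an explicit SSBOND entry.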

There are two cases which can complicate matters:

Alternative conformations
It is possible for cysteine residues to adopt two or more alternative conformations, and not all the conformations will have the same disulphide bonding. The PDB format does not allow for definition of alternative conformers in the SSBOND record, so the LINK record must be used instead with the link ID set to 'SS'.
Symmetry Operators
Disulphide bonds can be between residues in different asymmetric units of the crystal. It is not usually necessary for the user to specify the symmetry relation between the two linked units, since the program can usually deduce it by testing inter-atomic distances to all possible neighbouring molecules in the crystal. All you need to do for disulphide bonds which are not within the same asymmetric unit is click on the button in the column labelled 'Find Symm'. Then the symmetry operations '1555' and '2555' will be written to the PDB file and this will prompt Refmac to check for links to neighbouring asymmetric units (the PDB conventions for defining symmetry operations are defined in the documentation for REMARK 290 of the PDB guide for remarks 4-999).
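In the PDB convention referred to above, a symmetry code such as '1555' combines a symmetry operator number with three digits that give the unit cell translation along a, b and c, where 5 means no translation. A minimal decoder, with a function name invented for this sketch, might look like this:

```python
def decode_symop(code):
    """Split a PDB symmetry code into (operator number, cell translation).

    The last three digits are the translations along a, b and c, each
    offset by 5; the leading digits are the symmetry operator number.
    """
    code = code.strip()
    if len(code) < 4 or not code.isdigit():
        raise ValueError(f"not a valid symmetry code: {code!r}")
    operator = int(code[:-3])
    translation = tuple(int(digit) - 5 for digit in code[-3:])
    return operator, translation


print(decode_symop("1555"))  # identity operator, no translation
print(decode_symop("2565"))  # operator 2, shifted one cell along b
```

So '1555' is the identity operation on the original asymmetric unit, and '2555' is the second symmetry operator with no additional cell translation.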

LINK - Inter-Residue Bonds

The LINK record is a means of defining any non-standard bond or interaction between residues. The standard links between consecutive peptides, nucleic acids and sugar residues are deduced automatically by the program from the coordinates and it is not necessary to enter them unless the coordinates are poor or the link is non-standard.

To define a link you must enter the atom name, alternate indicator, the residue name, sequence number, chain ID and insertion code for two atoms. The link ID is a CCP4 specific field on the end of the LINK record which should contain the name of a type of link defined in the CCP4 geometry libraries.

Symmetry Operators
Links can be between residues in different asymmetric units of the unit cell. It is not usually necessary for the user to specify the symmetry relation between the two linked units, since the program can usually deduce it by testing inter-atomic distances to all possible neighbouring molecules in the crystal. All you need to do for links which are not within the same asymmetric unit is click on the button in the column labelled 'Find Symm'. Then the symmetry operations '1555' and '2555' will be written to the PDB file and this will prompt Refmac to check for links to neighbouring asymmetric units (the PDB conventions for defining symmetry operations are defined as before).

Note: CCP4 uses an extension to the PDB definition of a LINK record; there is an extra ID field at the end of the record.

CISPEP - Cis Peptides

To restrain a bond to a cis-peptide linkage you must enter the residue name, sequence number, chain ID and insertion code for the two consecutive residues connected by the cis-peptide bond. Note that using this record is equivalent, within Refmac, to defining a link of type CIS or PCIS between the two residues.

Note: The standard PDB definition of the CISPEP record includes two fields, the model number and the angle, which are not used in Refmac and are not in the interface.
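Whether a peptide bond is cis can be judged from its omega torsion angle (CA, C of the first residue, then N, CA of the next): omega is near 0° for cis and near 180° for trans. The sketch below computes the torsion from four coordinates; the 30° tolerance is an assumed, commonly used threshold rather than a Refmac setting, and the function names are invented for illustration.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def dihedral_degrees(p0, p1, p2, p3):
    """Torsion angle in degrees defined by four points."""
    b0, b1, b2 = _sub(p1, p0), _sub(p2, p1), _sub(p3, p2)
    n1, n2 = _cross(b0, b1), _cross(b1, b2)
    length_b1 = math.sqrt(_dot(b1, b1))
    m1 = _cross(n1, tuple(x / length_b1 for x in b1))
    return math.degrees(math.atan2(_dot(m1, n2), _dot(n1, n2)))


def is_cis_peptide(ca1, c1, n2, ca2, tolerance=30.0):
    """True if |omega| is within `tolerance` degrees of 0 (assumed cutoff)."""
    return abs(dihedral_degrees(ca1, c1, n2, ca2)) < tolerance


# Idealised planar test geometries, not real coordinates.
cis = ((0.0, 1.0, 0.0), (0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 0.0))
trans = ((0.0, 1.0, 0.0), (0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, -1.0, 0.0))
print(is_cis_peptide(*cis), is_cis_peptide(*trans))
```

Only the magnitude of omega is used, so the sign convention of the torsion does not affect the result.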

Merge monomer libraries

This task adds a new geometry description, as held in a .cif file, to a library file. The purpose is simply to enable all a user's geometry descriptions to be held in one library file.

Run ARP/wARP v5.0 ('Arp_waters')

CCP4 only distribute ARP/wARP v5.0 (renamed to arp_waters), and within CCP4i this can only be run as part of the REFMAC procedure.

The program ARP/wARP updates a model by removing poorly fitted atoms and adding new atoms into unsatisfied electron density. All newly found atoms are considered to be water atoms, but if not all of the protein structure has been found, these 'water' atoms may be indicating the whereabouts of protein.

ARP/wARP works best with high resolution data, and is usually run in cycles in conjunction with REFMAC or another refinement program which does an overall refinement. ARP/wARP requires the input of pre-calculated difference maps and so scripts to run ARP/wARP also run the FFT program which generates the maps and may also require MAPMASK to put the maps into the required asymmetric unit for ARP/wARP.

Running ARP/wARP v5.0 with Refmac

ARP/wARP v5.0 can be run with REFMAC from the REFMAC interface. There is a button in the Protocol folder of the REFMAC interface to switch on use of ARP/wARP and then a folder presenting ARP/wARP Parameters will be opened further down the interface. By default the procedure will remove and refine only water atoms. It can be applied to all atoms but this is only recommended for very high resolution data (~1Å). The parameters controlling the removal and addition of atoms are discussed in the ARP/wARP documentation (see REMOVE and FIND keywords). REFMAC outputs the labeled columns FWT and DELFWT which are the structure factors for weighted difference maps. These maps are the optimal input for ARP/wARP and are used in the automated map generation.

The ARP/wARP Stand-Alone Interface

CCP4 distribute ARP/wARP version 5.0 (renamed to arp_waters), which is substantially older than the current version of the program. The option to run arp_waters through a stand-alone interface is no longer supported.

The full version of ARP/wARP, including a comprehensive CCP4i interface, can be obtained from http://www.arp-warp.org/

See program documentation: ARP_WATERS.

NCS Phased Refinement

NCS Phased Refinement is expected to be very useful in the early stages of refining molecular replacement solutions when the structure contains NCS models. The method uses the phases that 'dm' derives from NCS averaging, solvent flattening and histogram matching as input phases to Refmac.

Input

The task requires you to input:

  • a model which contains all of the chains in the asymmetric unit. CCP4i will automatically extract the names and residue ranges of the chains to define the NCS Units and list them in the task window.
  • an MTZ file containing the experimental Fs and a FreeR flag. You can also, optionally, input experimental phases.
  • an estimate of the solvent content; the Cell Content Analysis task in the Coordinate Utilities module may be helpful in determining this.

Methodology

    If the user has not input experimental phases, the task runs Refmac for one cycle to generate phases for input to 'dm'. The task normally runs for two cycles and on each cycle:

    1. runs Pdbset to split the coordinate file, which makes step 2 possible
    2. runs Lsqkab to determine the translation and rotation between the NCS elements
    3. runs 'dm' to do solvent flattening, histogram matching and NCS averaging, inputting the calculated translation and rotation
    4. runs Refmac with input phases taken from 'dm' for 5 cycles
    5. runs Refmac for one cycle without input phases but with the refined structure to generate unbiased phases

    After the refinement cycles, steps 1-3 are repeated so 'dm' produces averaged phases. Three maps are generated automatically: the averaged map and the weighted difference maps based on phases from Refmac.
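The order of operations described above can be written out as a plan of steps. The sketch below only builds a list of descriptive strings; it does not run any CCP4 program, and the function name and step wording are invented for illustration.

```python
def ncs_phased_refinement_plan(has_experimental_phases, n_cycles=2):
    """List the program steps of the task in the order described above."""
    steps = []
    if not has_experimental_phases:
        # Without experimental phases, Refmac first supplies phases for dm.
        steps.append("refmac: 1 cycle to generate starting phases for dm")
    for cycle in range(1, n_cycles + 1):
        steps.extend([
            f"cycle {cycle}: pdbset - split coordinate file",
            f"cycle {cycle}: lsqkab - NCS rotation and translation",
            f"cycle {cycle}: dm - solvent flattening, histogram matching, NCS averaging",
            f"cycle {cycle}: refmac - 5 cycles with dm phases",
            f"cycle {cycle}: refmac - 1 cycle without input phases",
        ])
    # After the refinement cycles, steps 1-3 are repeated once more.
    steps.extend([
        "final: pdbset - split coordinate file",
        "final: lsqkab - NCS rotation and translation",
        "final: dm - averaged phases",
        "final: generate averaged map and weighted difference maps",
    ])
    return steps


for step in ncs_phased_refinement_plan(has_experimental_phases=False):
    print(step)
```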

    Reviewing Results

    This is a novel procedure and we welcome all feedback and questions. Please send them to: garib@ysbl.york.ac.uk.

    Maps

    It is easiest to create maps by running the FFT task inside the NCS Phased Refinement task. The option to 'Generate weighted difference maps files in ...' in this task is toggled on by default. The format of these maps can then be chosen.

    Maps may also be (re)created independently from the NCS Phased Refinement task, through the Create Task-Specific Maps task in the Map & Mask Utilities module. The only input this task requires is the job number of the original job - all other parameters will be restored from the database. This task will produce the appropriate maps, even when they were not calculated in the original job.

    In some cases it may be necessary to create maps through the Run FFT - Create Map task in the Map & Mask Utilities module. Extreme care needs to be taken in the choice of map type and coefficients. See above for Refmac Maps.

    See program documentation: Refmac, 'dm', LSQKAB, PDBSET.

    Tidy Waters - Watertidy

    This task rationalises waters at the end of refinement.

    See program documentation: Watertidy, Distang.

    Create/Edit TLS File

    TLS refinement requires a definition of the TLS groups to be used, and outputs the TLS parameters found for each group. This information is held in special TLSIN / TLSOUT files. These files are simple ASCII files which can be hand-edited, but they can also be created and edited with the "Create/Edit TLS File" task.

    You should first specify whether you are creating a TLS file from scratch, or editing an existing TLS file. You should then add or change the values in the "TLS group definitions" folder. You can have one or more groups, and each group can consist of one or more atom ranges.
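Because the TLS files are plain ASCII, a group definition can also be written by a short script. The sketch below assumes the commonly seen TLSIN layout of a TLS title line followed by RANGE lines; the exact spacing and the ALL atom selection are assumptions from memory and should be compared with a file written by the Create/Edit TLS File task before use. The function name is hypothetical.

```python
def format_tls_groups(groups):
    """Format TLS group definitions as text (assumed TLSIN-style layout).

    groups is a list of (title, ranges) where ranges is a list of
    (chain_id, first_residue, last_residue) tuples.
    """
    lines = []
    for title, ranges in groups:
        lines.append(f"TLS    {title}")
        for chain, first, last in ranges:
            # Assumed layout: quoted 'chain + residue number + dot' pairs.
            lines.append(f"RANGE  '{chain}{first:4d}.' '{chain}{last:4d}.' ALL")
        lines.append("")  # blank line between groups
    return "\n".join(lines)


# One group per chain, each with a single residue range (made-up values).
text = format_tls_groups([
    ("Chain A", [("A", 1, 120)]),
    ("Chain B", [("B", 1, 120)]),
])
print(text)
```

A group with several atom ranges is expressed by giving more than one tuple in its `ranges` list.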

    Analyse anisotropic U parameters

    This task provides various analyses of anisotropic U parameters, held in ANISOU records of a PDB file, such as might have been obtained from a full anisotropic refinement.

    See program documentation: Anisoanl.

    Analyse TLS parameters

    This task provides various analyses of TLS parameters, such as might have been obtained from a TLS refinement in Refmac or Restrain. It will also generate anisotropic U parameters from the input TLS parameters, which can then be visualised in an appropriate graphics program. In this case, it is important to remember that the U values have been derived from TLS rather than being refined independently.

    See program documentation: Tlsanl.

    Run SFCHECK & PROCHECK

    The program PROCHECK analyses the geometry of a structure and SFCHECK analyses the goodness of fit between structure and experimental data. Both programs are useful for identifying badly refined regions of a protein.

    Both programs output their analyses in the form of postscript files. From SFCHECK there is one file, but from PROCHECK there are, by default, ten postscript files. After you have run the task, you can use the option to View Files from Job (from the Database Menu on the right hand side of the Main Window) and select an output file to display. The file will be displayed using the postscript display program that is defined in your local CCP4i configuration (see Postscript Viewers). Also, when running PROCHECK, there is an option to print the output files automatically.

    Note on Running PROCHECK within CCP4i

    When PROCHECK is run through CCP4i, the script runs in the directory $CCP4_SCR so additional output files can be found there. There is a file procheck.prm which controls the running of PROCHECK. When the task is run by CCP4i, a default version of this file is copied to $CCP4_SCR if there is not one there already. This file is then edited to set the parameter which controls whether plots are colour or monochrome. If you wish to adjust any other PROCHECK parameters, you should edit $CCP4_SCR/procheck.prm before running the task (this will have to be done outside the Interface).

    Note that when PROCHECK is run within the Interface, the output file names are changed from the PROCHECK form (which is molecule_01_ramachand.ps, molecule_02_allramach.ps etc.) to molecule_jobnumber_ramachan.ps, molecule_jobnumber_allramach.ps etc. so file names are unique for each run of the program.

    CHECK - Task Window Layout

    Features to look out for in the SFCHECK&PROCHECK task are:

    Folder title | Importance | Comment

    See program documentation: PROCHECK, PROCHECK MANUAL, SFCHECK, $CDOC/sfcheckdoc.ps.
