|
Creating a full unique set of reflections with the correct FreeRflags

For successful cross validation:
Different programs have different philosophies for dealing with FreeR reflections:
Choosing a FreeR fraction

It is important to choose a fraction that is large enough so that the statistics are sensible (at least 500 reflections seems to be the consensus at the moment), but small enough so that as many reflections as possible are still used for the refinement. This is of course always true, whichever philosophy is chosen for the selection of the FreeR reflections!

How to Convert Files?

Starting from CCP4

When you are ready to start the first refinement, or preferably as soon as you collect the native data:

Convert to Other Formats from CCP4

You can use the jiffy MTZ2VARIOUS to convert from MTZ to XPLOR/CNS, TNT or SHELX formats quite simply. They all have different conventions, but MTZ2VARIOUS attempts to reproduce them (see program documentation: MTZ2VARIOUS).
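Before moving on to the conversion examples, the rule of thumb from "Choosing a FreeR fraction" (at least 500 free reflections) can be turned into a quick check. The following is only a sketch: `min_free_percent` is a hypothetical helper, not a CCP4 program, it is written for POSIX sh with awk rather than csh, and the count 10208 is taken from the NREFlection line of the CNS/XPLOR example file in this document.

```shell
# Hypothetical helper (not part of CCP4): print the smallest whole-number
# percentage of NREF reflections that still yields at least MIN_FREE
# (default 500) free reflections, or "NA" if NREF is too small.
# Usage: min_free_percent NREF [MIN_FREE]
min_free_percent() {
    awk -v n="$1" -v m="${2:-500}" 'BEGIN {
        if (n <= 0 || m > n) { print "NA"; exit }
        p = int(100 * m / n)          # round down first ...
        if (p * n < 100 * m) p++      # ... then round up if that fell short
        print p
    }'
}

min_free_percent 10208    # prints 5
```

Rounding up rather than to the nearest percent keeps the free set at or above the chosen minimum; whether a larger fraction is affordable still depends on how many reflections the refinement can spare.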
Examples

MTZ to CNS/XPLOR

    # test set flagged with TEST=1, working set with TEST=0
    #
    mtz2various \
    hklin pc553_19f-unique.mtz \
    HKLOUT xplor.hkl \
    <<eof
    # All these labels can be set and will be handled appropriately:
    #   LABIN FP=F SIGFP=SIGF [FPART PHIPART PA PB PC PD PHIB WEIGHT ] FREE=FreeR_flag
    # (labels in [ ] are optional; each one needs its own assignment if used)
    LABIN FP=F SIGFP=SIGF FREE=FreeR_flag
    OUTPUT CNS/XPLOR
    #
    END
    eof
    exit

MTZ to SHELX Intensities

    mtz2various \
    hklin lmw.mtz \
    HKLOUT shelxout.hkl \
    <<eof
    OUTPUT SHELX
    # Optional labels that can also be assigned: IP SIGIP FP(+) FP(-) IP(+) IP(-)
    LABIN FP=FRBP SIGFP=SIGFRBP FREE=FreeR_flag
    # This will always output Is; and will rescale the data to fit the format.
    # You can override the default by setting SCAL yourself.
    SCALE 0.01
    #
    END
    eof

MTZ to TNT - working set

    # TNT uses a different asymmetric unit of reciprocal space to CCP4. Dale has
    # programs to convert the data if necessary.
    # The data is separated into a free set and a working set.
    #
    mtz2various \
    hklin lisa.wright/lmw.mtz \
    HKLOUT lisa.wright/tnt_work.hkl \
    <<eof
    LABIN FP=FP SIGFP=SIGFP FREE=FreeR_flag
    OUTPUT TNT
    EXCLUDE FREER 0
    #
    END
    eof
    #

MTZ to TNT - free set

    mtz2various \
    hklin lisa.wright/lmw.mtz \
    HKLOUT lisa.wright/tnt_free.hkl \
    <<eof
    LABIN FP=FP SIGFP=SIGFP FREE=FreeR_flag
    OUTPUT TNT
    INCLUDE FREER 0
    #
    END
    eof
    exit

Convert to CCP4 from Other Formats

These are all ASCII formats, so F2MTZ can be used in a straightforward way. After all these conversions you need to uniqueify the MTZ file. Run

    uniqueify {-f FreeLABel} mydata.mtz

The script guesses what style of file is being imported, by looking at the
distribution of FreeR_flags: it estimates the percentage of reflections flagged as the FreeR set, then pads out the missing reflections and converts the flags to the CCP4 style of (0, 1, ..., (n-1)).

SHELX "input", SHELX "output", TNT, CNS/XPLOR

Examples

Starting from CNS/XPLOR (complicated CNS/XPLOR to MTZ)

    #!/bin/csh -f
    #
    # Start of the input file, shown here for reference:
    #
    # NREFlection= 10208
    # ANOMalous=FALSe { equiv. to HERMitian=TRUE}
    # DECLare name=FOBS DOMAin=RECIprocal type=COMP END
    # DECLare name=SIGMA DOMAin=RECIprocal type=REAL END
    # DECLare name=FPART DOMAin=RECIprocal type=COMP END
    # DECLare name=WEIGHT DOMAin=RECIprocal type=REAL END
    # DECLare name=TEST DOMAin=RECIprocal type=INTE END
    # INDE     6    0    0 FOBS=  1259.884     0.000 SIGMA=    38.561
    #                   FPART=     0.000     0.000 WEIGHT=     1.000 TEST=         0
    # INDE     8    0    0 FOBS=   827.600     0.000 SIGMA=    30.983
    #                   FPART=     0.000     0.000 WEIGHT=     1.000 TEST=         0
    #
    f2mtz \
    hklin suying/b-over.hkl \
    hklout suying/b-over.mtz \
    <<eof
    # skip the NREF and DECLARE lines
    SKIP 7
    # For XPLOR you would probably need: SKIP 0
    CELL 55.19 79.73 66.68 90.00 90.00 90.00
    SYMM C2221
    #
    # f2mtz assumes a free format without any character data
    # So you must either remove these from the file, or design
    # a format statement to skip the labels.
    #
    # You have to get this format right! nX ignores n characters.
    # Count characters
    FORMT '(6x,3F5.0,6X,2f10.0,7X,f10.0,/,25X,2f10.0,8X,F10.0,6x,F10.0)'
    #
    #1234561234512345123451234561234567890123456789012345671234567890
    # INDE     6    0    0 FOBS=  1259.884     0.000 SIGMA=    38.561
    #1234567890123456789012345123456789012345678901234567812345678901234561234567890
    #                   FPART=     0.000     0.000 WEIGHT=     1.000 TEST=         0
    #
    #
    LABO H K L FRBP PHIB SIGFRBP FPART PHIPART WEIGHT FreeR_flag
    #
    # PHIPART is a phase, so it is given column type P here
    CTYPO H H H F P Q F P W I
    END
    eof
    #
    uniqueify suying/b-over.mtz
    exit

Starting from SHELX Intensities

    f2mtz \
    hklin pc553_19.hkl \
    hklout pc553_19i.mtz \
    <<eof
    CELL 37.144 39.422 44.021 90.00 90.00 90.00
    SYMM P212121
    # FreeR_flag (type I) is optional: drop it from LABO and CTYPO
    # if the file has no flag column
    LABO H K L I SIGI FreeR_flag
    CTYPO H H H J Q I
    END
    eof
    #
    # To reduce Is to Fs - use truncate
    #
    truncate \
    hklin pc553_19i.mtz \
    hklout pc553_19f.mtz \
    <<eof
    LABI IMEAN=I SIGIMEAN=SIGI
    END
    eof
    #
    # If you read a FreeR_flag, you will now have to rescue it -
    # TRUNCATE ignores it.
    #
    cad hklin1 pc553_19f.mtz \
    hklin2 pc553_19i.mtz \
    hklout pc553_19f-free.mtz \
    <<eof
    LABI FILE 1 ALLIN
    LABI FILE 2 E1=FreeR_flag
    END
    eof
    #
    # Modify FreeR_flags
    # (if you rescued a FreeR_flag with cad, run uniqueify on pc553_19f-free.mtz instead)
    uniqueify pc553_19f.mtz
    #

Starting from TNT or old SHELX (FreeR assigned to 10%)

    # First edit the TNT to assign flag 1 to working set and 0 to free set;
    # then cat both TNT files together:
    #
    # ( sed 's/$/ 1/' $SCRATCH/tnt-work.hkl ; sed 's/$/ 0/' $SCRATCH/tnt-free.hkl ) > $SCRATCH/tnt-all.hkl
    #
    # Example piece (columns aligned to match the FORMT statement used for this file):
    # HKL  -22   0   4  2010.9   134.7  1000.0  0.0000 1
    # HKL  -22   0   5  4005.2    83.1  1000.0  0.0000 1
    # HKL  -22   0   6  3661.5    91.1  1000.0  0.0000 1
    # HKL  -22   0   7  2321.9    59.7  1000.0  0.0000 1
    # ....
    # HKL  -21   1   9   488.4   143.9  1000.0  0.0000 0
    # HKL  -20   0   6   329.5   202.9  1000.0  0.0000 0
    # HKL  -20   0  11  1009.2   146.7  1000.0  0.0000 0
    # HKL  -20   4  10  1989.1    46.5  1000.0  0.0000 0
    # ....
    #
    f2mtz \
    hklin tnt_all.hkl \
    hklout tnt_all.mtz \
    <<eof
    CELL 37.144 39.422 44.021 90.00 90.00 90.00
    SYMM P212121
    LABO H K L F SIGF FreeRflag
    CTYPO H H H F Q I
    #
    # See above comments about formats. You need to skip the HKL label.
    #
    FORMT '(4x,3F4.0,2F8.0,16X,F4.0)'
    # or, if PHI and FOM given
    # LABO H K L F SIGF PHIB FOM FreeRflag
    # CTYPO H H H F Q P W I
    # FORMT '(4x,3F4.0,4F8.0,F4.0)'
    END
    eof
    #
    # uniqueify will now complete hkl list and add FreeRflags
    # (run it on the file f2mtz has just written)
    #
    uniqueify -f FreeRflag tnt_all.mtz

Starting from SHELX I and FC

    #!/bin/csh -f
    #
    f2mtz HKLIN ./1bxo*-sf.hkl \
    hklout $CCP4_SCR/junk.mtz \
    <<eof
    TITLE X-PLOR to MTZ
    CELL 96.980 46.650 65.710 90.00 115.57 90.00
    LABOUT H K L I SIGI FC PHIC
    # Intensities need column type J (as in the SHELX Intensities example), not I
    CTYPE H H H J Q F P
    SKIP 2
    SYMM C2
    eof
    if($status) exit
    truncate \
    hklin $CCP4_SCR/junk.mtz \
    hklout $CCP4_SCR/junk1.mtz \
    <<eof
    LABI IMEAN=I SIGIMEAN=SIGI
    TRUNCATE YES
    END
    eof
    #
    if($status) exit
    cad \
    hklin1 $CCP4_SCR/junk1.mtz \
    hklin2 $CCP4_SCR/junk.mtz \
    hklout ./1bxo-sf.mtz \
    <<eof
    LABI FILE 1 ALLIN
    LABI FILE 2 E1=FC E2=PHIC
    END
    eof
|
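The uniqueify step described earlier depends on the distribution of flag values in the imported file, so it can be useful to look at that distribution by hand before running f2mtz. The following is a minimal sketch, assuming the flag is the last whitespace-separated field on every line, as in the TNT example piece. `flag_distribution` is a hypothetical helper written for POSIX sh with awk; it is not part of CCP4 and does not reproduce the logic of the uniqueify script itself.

```shell
# Hypothetical helper (not part of CCP4): tally the last field of every
# non-blank line of an ASCII reflection file and print one line per flag
# value as "flag count percent", sorted numerically by flag value.
# Usage: flag_distribution FILE
flag_distribution() {
    LC_ALL=C awk '
        NF > 0 { count[$NF]++; total++ }
        END {
            for (f in count)
                printf "%s %d %.1f\n", f, count[f], 100 * count[f] / total
        }' "$1" | sort -n
}
```

Running `flag_distribution` on the concatenated TNT file should show two flag values, with the free set (flag 0) at roughly the percentage that was originally assigned. Header lines or separator lines in the file are counted too, so strip them first if they are present.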