Helical Image Procssing of AChR Tubes Embedded in Ice CT 0. Introduction a. Programs described here have been developed by CT at Tokyo, Stanford and MRC LMB with different ideas from the original MRC package. They are, I believe, easier to use and will give you more detailed information you may want. Compatibility with MRC packages are intended but not perfect (to be consistent with CT's earlier versions). b. The most fundamental difference is interpolation is done in real space first and not in Fourier space. Unless your transform is fairly redundant, bilinear interpolation to obtain amplitude and phase in Fourier space does not seem to be satisfactory. Therefore, first of all, digitised image itself is interpolated so that layer-lines come exactly on the sampling points in the Fourier sapce. (This actually save much time afterwards; layer-line data extraction program becomes much simpler.) For this purpose, auto-correlation has been implemented and found to be an indispensable tool. Once you determine the rotation angle and repeat distance accurately, you get a perfect transform. 1. Densitormetry a. You can scan either way (i.e. scanning raster either perpendicular or parallel to the helix axis). Both display and "box" programmes take care of the direction automatically. b. Some of the programs assume that you took pictures at 34,900 x and scanned at 15 um interval. c. Conventions are: 1. emulsion side up 2. gain 1 (maximum gain) 3. condenser aperture (labeled "motor") 5 4. x-dim (longer than y) y-dim 6 6 1 (i.e. scan in one direction; stepsize 15 um; speed 6) (dimensions = panel meter readings/6) 2. Display a. To use MRC Tone program, run AXCHGMRC program to swap axes; MRC programs LABEL or INTERPOL possibly work for this purpose (if your "QUOTA" is big enough). If you use my MRCVTEC program, specify the coordinates as you see on the print out; if you use MRC TONE program, (0,0) comes at the left bottom origin. MRC Tone program does not take care of the display range automatically, you have to know standard deviation of your image. To do so, $ IMGSTAT [x:file name] (hereafter, [] shows optional parameter(s)) then, specify the origin of the box (usually middle of your image, e.g. (1000, 100) if your image sizes are 3,000 x 300) and box size (e.g. 800 x 100). It seems OK to use mean value +- sigma for the density range. Also check if the standard deviation is almost the same for weakly and strongly defocused images. b. Since AXCHG and TONE take rather long time, it is convenient to use a batch file. Edit [your_account.achhlx]TONE.COM. 3. Initial boxing and Fourier transform --- measure layer-line heights and rotation angle of helix axis wrt sampling raster a) Determine the center of tube by eye and specify box parameters; whenever possible, the length of the tube masked-off should be 1024; Remember the box width (i.e. maximum diameter of the tube). b) Edit "filename".BOX (rotation angle = 0; 256 x 1024; see example) *** Example of Box parameters ([chikashi.emc]hn1593.box) ! title achhlx hn1593 ! input file [-.achhlx]hn1593.img ! output file (optionally projection file as well) hn1593.imb hn1593.prj ! horizontal (i.e. perpendicular to helix axis) dimension (start:end) 44 237 ! vertical (start:end) 500 1524 !final size in x and z direction 256 1024 ! finally 512 1024 !rotation angle (degree) and apodization column # (integer) for each !side -0.01 3 !float method (should be 1) & scaling (0, automatic; 1, no change; others, each density is multiplied by that number) 1 1. c) type 'HFT' and enter ID (e.g. hn1593; don't put extension (.img etc)) d) type 'OUT' and enter necessary parameters; try default values first except "start and end pages" (specify 1 and 2, respectively). e) identify (2, -2) and (-2, 4) and check phase difference across meridian; since both h and k are multiples of two, phase should be even; if not, most likely the center of box is off the helix axis; note a simple relationship between offset (dx) and phase deviation (dp) at distance (dy = 2*Rpeak usually) dp = dy*dx*360/256 = 1.40625*dx*dy (degree) for 256 x nz transform. Therefore, if the center of box is off axis by 1 grid unit, phase deviation will be dy phase deviation phase of right hand side ----------------------- will be smaller if you 10 14 deg apply positive shift 20 28 30 42 40 56 50 70 at distance dy. If the offset is > 0.5, it is better to transform again but may not be necessary at this point. Note that the phase correction due to omega-tilt could easily be > 180; only the layer-lines with small Z should be inspected carefully. Also note that (2, -2) and (-2, 4) usually have opposite sign and the correction for tilt is also opposite direction. f) Compare phases of the (-1, 2) across the meridian. If it is ~180 off, many other reflections ((1,0), (-1, 4), (1, 2), (0, 3), (0,5)) should be odd; usually the (1, 3) or (1, 2) has small |n| (0 to 5) and (0, 5) should have |n| of a multiple of 5 (usually 25 or 30), which are related to the diameter of the tube by the formula Rpeak = 256*(n+1)/(2*pi*rc) = 40.76*(n+1)/rc where rc is the radius for the centre of mass that contributes that particular layer-line. rc's are quite different in the case of AChR tube depending on the layer-line (e.g. (0, 2) has a very small rc). g) Now we can make a table consisting of (h, k) Z parity Rpeak n r l ------------------------------------------------- usually, (-1,2), (-2,4) | (2,-2), (1,0), (0,2), (-1,4) | (2, 0), (1,2), (0,4) | (1,3), (0,5) can be identified. '|'s separate layer-lines belonging to differnt branches in the (n,l) plot. Difference between consecutive layer-lines in the same branch is constant, dn. Note that n for (-1, 2) is always dn and that for (-2, 4) is 2*dn. If both h and k are even, parity should be even. Other reflections with even k follows (-1, 2). Note that correction from omega-tilt could be > 180 degree for high Z layer-lines. Suppose n for (0, 5) is 25 and that for (1, 3) is 2. dn is therefore either 27 or 23. Then the n for (1, 2) will be either 25*4/5 - 27 = -7 or 25*4/5 - 23 = -3 accordingly. From the peak position of (1, 2) layer-line, the correct one is easily chosen. Once dn is determined, n for other layer-lines are readily determined using n for (0,4), (0,2). Suppose n for (0,5) is 30, instead of 25; then the n for (1, 2) will be even because dn is even in this case and (0,4) is always even. Thus, from the parity of the (1,2) reflection, we can tell what is the correct n. Calculate the corresponding peak radii in real space and check consistency. h) Determination of layer-line number is the most difficult and frustlating thing. There are only small number of constraints and you can think of many sets almost equally possible. constraint (i): l for (0, 5) should be a multiple of 5; (ii) (2, -2) and (-2, 4) should have even l; since the l for the (2, -2) and (-2, 4) is small, this is probably the most sensitive measure. However, there are lots of possibility even though you restrict the repeat distance to be about 1000 g.u. This problem could be resolved by autocorrelation calculation. 4. Autocorrelation function --- determination of true pitch and rotation angle i) edit control datafile IMGNAME.ACF; !reference file name [-.achhlx]hn1593.img !centre of box (horizontal vertical) --- (a) 140 700 !size (horizontal vertical) 185 511 --- should be bigger than test ones !test file name [-.achhlx]hn1593.img !centre of the box 140 1600 ! and size 160 100 !ccf frame size ( X,Y) or default (d) --- (b) d !valid ccf area --- (c) o o !float method (Vertical,Horizntal,Edge,average of whole Area) t v --- (d) !lowpass/bandpass parameters (sampling in real space and cut-off !in Angstrom) 4.3 50 100 30 --- (e) a) coordinate system is now consistent with other programs; i.e. vertical direction always corresponds to the helix axis (it may be either x or y axis). b) ccf frame size: cross-correlation function is calculated using simple radix 2 FFT; hence the frame size should be power of 2. Default sizes are minimum numbers bigger than box sizes. This might cause problem since low-pass or band-pass filtering is done at run time and overlap of the areas close to the box boundary might cause undesirable effect. In those cases, specify bigger numbers. c) valid ccf area: since the ccf is calculated using FFT, some portions of ccf can be artefacts unless frame sizes correspond to integral numbers of unit cell dimensions. To avoid to get spurious ccf maximum and minimize computation time, you can specify "valid ccf area". For helical particle, it should be 'o' for both directions; 'o' stands for 'overlap' and means 'test area' should be inside the reference area. d) floating is critical to get right answer; there are several options: V: vertical edge; h: Horizontal edge; E: all edges; A: average of box area; R: use value from reference image; T: use value from test image; or any numbers for floating (specify two numbers). Best resuts seem to be got if 'T V' are used as long as no big density gradient along the vertical axis of the image. e) if no components exist below a certain spacing, it makes perhaps more sense to use bandpass filtering. For bandpass filtering, specify 4 parameters: real space sampling distance (i.e. densitometer sampling), resolutions for no amplitude modification, 1/e at lower resolution and 1/e at higher resolution. The formula used for bandpass filtering is 2 ------------------------ 1 + exp(a(xc-x)*(xc-x)). At xc, amplitude is unmodified. For low pass filtering, specify only 3 parameters. f) To activate the program, type $ ACF and enter control data file name. This command procedure makes another temporary command procedure and submit it; a log file is created in your root directory. Print it with /nofeed option (i.e. $ PRN). 5. Determination of box size and rotation angle, and Fourier transform a) ACF will give you position of test image at the highest correlationin the reference image, thus provide you true pitch and rotation angle. Check if this true pitch is consistent with the layer-line heights. If even order (for k) layer-lines look consistent but odd order layer- lines not, most likely the true pitch is twice that value. Since (0,3) layer-line is usually hard to see, (1,3) or(0, 5) will be critical. Thus, it might help to process less defocused image at the same time if you don't see those layer-lines clearly. b> ACF may give you false peaks. This is especially the case if there is a significant gradient of optical density along the particle. It might be worth compensating the gradient by polynomial fitting (Murray has a program). At least, measure the mean density of the test and reference area using IMGSTAT program and specify floating densities accordingly. It is also important to measure the inclination of the layer-lines; this could be a big constraint where you should find ACF peaks. If possible, calculate ACF using another part of the same particle and find consistent peaks. In the worst case I have encountered so far, true peak was about 80% of the highest. 6. Final adjustment of box size and rotation; bigger Fourier transform a) Since the accuracy of ACF is not very high due to local perturbation and noise, it is probably necessary to adjust the size of the box and the rotation angle. Sometimes high resolution layer-lines (e.g. (2,4)) help to determine the box size. It is probably necessary to extract those layer-line data. b) Check the phase across the meridian again. Critical check is possible only here because, in the case of receptor, phase changes quite rapidly in Z direction. Again, only (1,-2), (2,-2), (2,-4) are the suitable layer-lines. Note that the effect of tilt is opposite with (2, -2) and (2, -4). So if necessary, shift the box origin and change the size of the transform to 512 x 1024. 7. Make a table for all possible layer-lines a. To know the position of layer-lines and make sure that the assignment of Bessl orders are OK, run NLTABL. To activate the program, type $ NLTABL You have to supply 1) image id (just for title) 2) Bessel order (N) and layer-line number (L) for (1,0) layer-line 3) ibid for (0,1) layer-line 4) box width (full width ! output shows half width at present) 5) repeat distance in grid unit 6) ctf cut-off (c/r for using default value 140) 7) where to write (ttd3: for printing out on the laser printer directly). This program provides you, other than layer-line number and Bessel order for all possible layer-lines (Bessel order < 90), 1) selection rule of this particle 2) possible stretch of layer-line data; Rmin shows the expected position of peak at the edge of the box (calculated from |n| + 1 = 2*pi*rR); Rmax shows the radius where CTF is zero. 3) axial layer-line height (Z) in grid unit for the frame size of 512 so that you can see the position in a radial plot of the transform. 4) unit cell parameters; only two parameters are constant over various radii: z-coordinate (in Angstrom) and azimuthal angle (in degree) of the unit vectors; others, inclination angles for (1,0), (0,1) and (1,-2), included angle, x-coordinate (actually, 2*azimuthal angle * radius) or unit vectors, and length of unit vectors, all vary depending on the radius. The program does not take care of the control of laser printer, so you might get nonsense print out; in that case, write onto a disk file and print out by LPR. b. To know the CTF cutoff position, run CTFDSP and CTFAVG programs; to activate those programs, type $ CTFPLT image_id (e.g. hc772) A batch job is activated, which finds a FFT file in the scratch disk (x:) and output plots on the laser-printer. CTFDSP plots mean radial amplitude distribution over 5 sectors and make a listing file (.ctf); CTFAVG program uses the listing file and plots overall transform. Unfortunately, there is no que for the laser-printer, you might not be able to get plot. Then you have to do it again. Your FFT file should be the final one (i.e. x-dimension 512). It is very difficult usually to see the CTF zero in less defocussed picture (you need carbon film). If you take the picture at 20,000 A, CTF cut-off will come at about 90 grid unit. 8. Extract layer-line data a) edit hlxs.cnt (see example below); omega and shift corrections should be zero at this moment. hn1593b.fft ! Input file hn1593b-2 (corrected) ! title 4.30 4.30 ! sampling size in real space (Angstrom) -0.53 0.06 0 0 ! omega-tilt(deg), x-shift (g.u.), y-shift (g.u) 13 100 1 ! number of layer-lines and width of display (0 to 100) in this example; 1 for display of right hand side only to minimize your waiting time; width of display may be 80 for "search". 0 0 0 ! layer-line number (l), bessel order (n), and height (Z) in g.u. for each layer-line 1 49 1 15 27 15 30 54 30 ..... 116 20 116 130 -2 130 b) $sub hlxs or $fsub hlxs (using FAST batch que). c) go to your root directory and print the log file on printer with no feed option ($prn hlxs.log). 9. Refine helix axis position a. There are two parameters to adjust: inclination of the helix axis from the projected plane (omega-tilt) and displacement of the helix axis from the centre of box in the projected plane (x-shift). b. Choose only good peaks: the purpose is to refine the axis position, therefore we must choose peaks apparently not affected by noise. If phase gradient across the peak is large (say 40 deg), it is perhaps better to discard. Since we are dealing with ice image, there is no execuse for "shearing" or uneveness between two sides and, hence, there is no reason to treat two sides separately. Thus, the phase and amplitude of the points at the same distance from the axis compared. Output of HLXS gives you averaged amplitude and phase deviation from the ideal value. Supply those numbers to SRCH program (not *absolute* phase difference). Note that the sign of omega is different between MRC program and this program. c. Example of control data file for SRCH hp3538a ! title 512 1024 ! FFT frame size (512 x 1024) 0.2 0.01 21 ! tilt (start, step size, number of step) -0.2 0.01 21 ! xshift (ibid) 23 1 ! #of peaks incorporated; # of selection rule 15 56 417 47 ! Z (not L), distance across meridian, ! amplitude (should be <9999), phase deviation ! (not phase difference) 25 ! Bessel order 15 66 389 -11 25 36 114 288 -17 -40 ....... d. As mentioned, you should choose good peaks only. The axis position should not be affected by bad peaks (noise). Therefore, if you find bad peaks in the final detailed residual table, you should check the goodness again. Also check, majority of the peaks come to the minimum at similar tilt angles; if not, Bessel orders you specified might be wrong ! e. Sometimes, due to big noise, phase gradient across the peak happens to be OK but phase itself is not consistent with other peaks; because we can take advantage of twofold, it might help to examine the phase (not phase deviation) in the original layer-line data. f. If x-shift is bigger than 0.3 grid unit, re-box and transform again; the difference could be minor, but you should check it. I suspect equator could be affected. Re-SEARCHing is not really necessary if x-shift was less than 1 grid unit; difference is usually 0.01 degree or 0.01 grid unit at the most. 10. Determination of the stretch of layer-line to be stored a. You now have got "axis orientation parameters"; edit hlxs.cnt and specify those parameters. Run HLXS again and print layer-line data. b. The size of stretch is different depending on the purpose. For the initial fitting, use only strong layer-lines and good portions only. For averaging of several transforms, we should try as much as possible. Of course, inner side (i.e. near the Z-axis) is determined by the width of the particle, and the outermost limit is determined by CTF cut-off. Therefore, by this stage, you should have run CTFDSP and CTFAVG, and NLTABL. However, data are usually dominated by noise but not by CTF, knowledge about CTF is not essential for fitting. c. Anyway, compare two sides, and also examine phase gradient along the layer-lines. If the two-fold is OK, peaks at outer radii should have similar (or 180 degree different) phases with the main peaks. If the gradient looks OK but there are some bad peaks, we can consider those will be due to noise and can expect they will be averaged out. d. My criteria for judging the quality of the peaks: 0-45 deg -- good; 45-70 deg -- torelable; higher than 70 -- bad. If two successive peaks are higher than 45 deg. in resiudals, we should stop before those bad peaks. One bad peak may still be all right (the amplitude is weighted down as cos(phase residual)). For the initial fitting you shoud stop before any bad peak. 11. Store layer-line data in disk files a. Make a control data file using "hlxs.cnt" as a template; choose only 'good' layer-lines. x:hc771a.fft hc771a.nea ! output file near side hc771a.far ! output file far sdie hc771a redo (lower rot ang) 4.30 4.30 -0.14 -0.07 0 0 0 15 100 1 0 0 0 0 73 0 73 ! left hand side start-end; right hand side 1 -45 1 50 86 50 86 ! 17 25 17 26 73 26 73 ! ....... ....... 246 -10 246 8 36 8 36 ! You add the marked lines. Stick to the convention for the file names, i.e. '.nea' and '.far', because an averaging program assume those extensions. b. Type HLXFL, then image id (e.g. hc771a); your control data file shold have '.hlx' extension. c. This is the only program remaining from MRC package without extensive modification. It does a lot of unnecessary calculation and is quite slow. 12. Average of two-sides a. A separate average program is provided for this purpose. Type $ nfavg image_id (e.g. hc772a) The program automatically finds .nea and .far files in your default directory and make an .avg file. 13. Find two-fold origin and make two-fold forced layer-line data a. Finding two-fold origin is done by minimizing phase deviation from the ideal values (i.e. 0 or 180 degrees). Phase residual is an amplitude- weighted mean sum (|F| * | phase deviation |) residual = ----------------------------------------------- sum(|F|) You can use the data points which have high enough amplitudes to reduce the effect of transition points. We haven't had much experience about the threshold but tentatively use 3% of maximum amplitude. When two-fold forced layer-line data are made, amplitude is weighted down by cos(phase deviation). b. Edit control data file ([.emc]twofold.cnt) ! title (this is used for the title of two-fold forced dataset) HC3314 ! input file X:HC3314A.avg ! number of layer-lines (0 for all) and repeat distance (Angstrom) 0 6790 ! amplitude cutoff (% of maximum amplitude) 3 ! list of layer-lines with weights different from 1.0 (L and weight) 0 0 ! equator is usually excluded ! azimuthal rotation (start, end, and stepsize; in degree); the order ! of start and end does not matter 22.83 22.87 0.01 ! shift along the helix axis (Angstrom) -8.78 -8.74 0.01 ! output file if needed (otherwise leave a blank line; 'n' is also fine x:hc3314a.2fd c. To run the program interactively, type $ H2FLD twofold if the name of the control data file is "twofold.cnt" (".cnt" is the default extension; so you don't have to type it). Alythough the program asks you the control data file name if you haven't specified in the command line, it is much more convenient to do it as above because you can use command editing feature of VMS (hit up arrow key) to save typing the control data file name each time. To get the hard copy, type $ sub hlx2fld or $ fsub hlx2fld if you have a command file 'hlx2fld.com", and when the job is done go to your root directory and $lpr hlx2fld.log. You can use the same comand file name (i.e. h2fld) as the program name. d. Example of output: --- HLX2FLD --- Ver. 2.00 Thu Dec 29 17:12:41 1988 title of the layer-line : hc2954b new title : hc2954b side : Avg in : X:hc2954b.avg repeat distance : 1815.00 Angstrom layer-line sampling : 0.000454 reciprocal Angstrom maximum amplitude : 132.35 cutoff amplitude : 6.62 ( 5.00%) sum of amplitude used : 10234.6 out of 10666.4 number of points used : 426 out of 531 control data in : twofold.cnt z phi | 0.640 0.650 0.660 0.670 0.680 ------------------------------------------------- -4.450: 16.098 16.097 16.108 16.121 16.140 .................... -4.420: 16.096 16.094 16.105 16.119 16.138 -4.410: 16.095 16.094 16.105 16.119 16.138 ------------------------------------------------- minimum residual: 16.094 at phi= 0.650, z= -4.420 Phase residual for each layer-line at global minimum (phi= 0.650, z= -4.420) -------------------------------------------------------------- L N 1/Z weight point amp residual (A) (g.u) (%) (degree) -------------------------------------------------------------- 1: 0 0 0.0 0.00 -- ( 0- 84) -- -- 2: 3 49 605.0 1.00 24 ( 48- 80) 4.07 14.6 ......... 13: 57 -2 31.8 1.00 15 ( 3- 19) 2.11 39.9 -------------------------------------------------------------- @@ point: in line 2," 24 ( 48-80) " means that this layer-line consists of 48th to 80th (offset 0) grid points and only 24 points have amplitude higher than the threshold level (this case 5% of maximum amplitude) and have contributed the phase residual calculation. @@ amp: sum of amplitude for the points with the amplitude higher than the threshold level. e. Consideration about the parameters: 14. Initial fit a. Finding common origin is done by minimizing phase residuals between reference and test datasets. The number given by this program (HLXFIT) is ______________________________________ / sum (|Fref|*(Phase difference)**2) residual = /--------------------------------------- \/ sum(|Fref|) where summations are taken over the terms with amplitude higher than a cut-off level. Previous programs compared all terms with whatever weak amplitudes; this resulted in an apparently bad phase residual for specimens with rapidly oscilating phase. b. To run program in interactive mode, type $ hfit controldata_file; then you can reactivate the program by pressing up the arrow key. It would be advisable to stick to one radial scaling factor (usually 1.000) until you can specify quite narrow ranges for phi (azimuthal rotation) and z (shift along the helix axis). Note that you don't have to use actual repeat distance. If you are trying to average different particles with the same (similar) selection rule, it is best to use the value appropriate for the averaged dataset. c. This program also gives information on amplitude difference. R factor is defined sum{ ||Fref| - |Ftest||} R-factor = ---------------------------- sum {|Fref|} where the summations are again taken over those terms that have amplitudes higher than cut-off level. Note that number of points used is different depending on the scaling factor and cut-off level. d. Example of (a portion of) output HLXFIT: Find common origin for helical particles Ver. 1.00 --- Mon Sep 26 18:07:54 1988 --- Reading in reference data from 'hc772.avg' --- Reading in test data from 'hc772.sft' !!! No corresponding layer-line to (n= -65, l= 14) @@@ layer-line does not exist in the test data is put zero-weight; @@@ so you don't have to worry about creating nonexistent layer-lines --- HLXFIT --- Ver. 1.00 Mon Sep 26 18:07:59 1988 [ref] hc772.avg : hc772.sft [test] control data file : hfit.cnt title of ref. layer-line : AChR hc772 (3part fitted to 2fold forced avg) test layer-line : AChR hc772 shifted by -5 deg 5 A side : Avg (ref.) Avg (test) repeat distance : 3800.00 Angstrom layer-line sampling : 0.000454 (ref.) 0.000454 (test) Angstrom inv amplitude cut-off level : 15.00 ( 5.00% of max. amp) radial scaling factor: 0.9800 ------------------------------------------------- z phi | 4.800 4.900 5.000 5.100 5.200 ------------------------------------------------- -5.200: 25.085 24.689 24.558 24.696 25.057 -5.100: 25.075 24.679 24.548 24.678 25.037 .................... -4.800: 25.143 24.749 24.619 24.724 25.074 ------------------------------------------------- minimum residual: 24.5481 at phi= 5.000, z= -5.100 number of points: 402 @@@ Number of points actually compared total amplitude: 16382 @@@ Sum of amp. used for comparison amp scaling factor: 1.1058 @@@ To make mean amplitude the same R factor: 0.2365 test dataset must be multiplied by this **** Minimum locations **** --------------------------------------------------------------- Rsc :phi shift: z shift: amp (points) : residual : R factor --------------------------------------------------------------- 0.9600 4.9000 -5.2000 16348 ( 400) 59.763 0.4011 0.9800 5.0000 -5.1000 16382 ( 402) 24.548 0.2365 --------------------------------------------------------------- ****Phase residual for each layer-line at global minimum**** phi= 5.000, z= -5.100 Rscal = 0.9800 Phase residual = 24.5481 Rfactor = 0.2365 (test amplitude multiplied by; 1.1058) cut-off amplitude: 15.00 ( 5.00% of max amp) ------------------------------------------------------------------ L N 1/Z weight point ref-amp-test residual (A) (g.u) (%) (degree) ------------------------------------------------------------------ 1: 0 0 0.0 0.00 -- ( 0- 84) -- -- 2: 1 -45 3800.0 1.00 17 ( 45- 74) 3.19 2.92 38.022 3: 12 25 316.7 1.00 34 ( 24- 73) 7.75 7.28 46.951 4: 14 -65 271.4 0.00 -- ( 64- 75) -- -- 5: 24 50 158.3 1.00 18 ( 49- 80) 4.20 4.24 9.226 6: 26 -40 146.2 1.00 28 ( 37- 73) 7.43 7.46 36.029 7: 37 30 102.7 1.00 19 ( 31- 73) 2.46 2.39 12.572 8: 38 -15 100.0 1.00 42 ( 14- 81) 11.63 11.54 37.496 9: 39 -60 97.4 1.00 0 ( 57- 85) 0.00 0.00 0.000 25: 125 25 30.4 1.00 6 ( 20- 51) 0.79 0.80 5.659 26: 126 -20 30.2 1.00 2 ( 23- 48) 0.20 0.21 6.340 ------------------------------------------------------------------ @@@ Layer-lines not existing in test data are discarded @@@ Numbers in parentheses show data points existing in reference data set; the number before the parentheses is that of points used for comparison. This number depends on scaling factor and cut-off level you specified, and possibly zero. @@@ Weight factors are multiplied to layer-line data at the beginning; all the subsequent calculations assume all the terms have unit weight. e. Example of control data ([chikashi.emc]hfit.cnt) !title test of hlxfit !reference file hc772.avg !llmax and repeat distance (llmax 0 to include everything) 0 3800 ! distance in Angstrom ! amplitude cutoff (% of max amp used for comparison) and penalty (1/0) 5 0 ! 1 for 90 deg penalty for non-existing part ! weight for special layer-lines (L, weight) 0 0 ! weight of equator (L= 0) is set to 0 ! file name hc772.sft !iside (1 for far side) and ipole (1 for upside down) 0 0 !phimin phimax delphi (in degree) 4.8 5.2 0.2 ! order of min and max doesn't matter !z range (in Angstrom) -4.8 -5.2 0.2 ! order of min and max doesn't matter !r range ! this number is multiplied to test data in recp. space 0.96 1.04 0.02 ! if >1, data go outer radii f. I don't have much experience with different levels of cut-off, but apparently 3 to 5 % seems to be a reasonable number. Note that "penalty" for non-exisiting data points in test dataset is not fully implemented or debugged; the penalty causes a problem when radial scaling factor should be corrected and, therefore, I don't think useful. 16. Extract all the possible layer-lines a. There is a program for making control data file for HLXFL which extracts all possible layer-lines using the one you made for extracting strong layer-lines (11.) as a template. b. To activate the program, type $ MKHLXC You have to supply 1) template file name (e.g. hc772c.hlx) 2) (n, l) for (1,0) 3) (n, l) for (0,1) 4) repeat distance in grid unit 5) box width (full width !) 6) ctf cut-off (in grid unit) The program assumes possible error of 5 for ctf cut-off and box width; thus, layer-line data stored will be bigger than allowed if your specified values are correct. c. change output file names and title; convention for file name is ***w.nea etc. d. At the moment, this program does not take care of the effect of titling, and HLXFL happily output 0 amplitude, 0 phase data points; the presence of 0 0 0 at the beginning might cause unexpected error in some of the programs. So, adjust starting radius by hand. This will be fixed soon. e. HLXFL using new control data (this may take ridiculously long time). Then, NFAVG. to get average data. 16. Average of different portions from the same tube a. At least at the first stage, since we don't have good reference data set, each dataset from different portions of the same tube must be put to common two-fold origin and averaged. To do this, 2nd and 3rd portions etc are roughly fitted to the 2-fold forced first data set consisting of only strong layer-lines (using HFIT); then each dataset is put to its 2fold origin refined using H2fld. It might be a good idea to fit each dataset to the average but I haven't tried it yet. b. To verage using HAVG, first edit control data file (e.g. havg.cnt) ! title hf600/601a+b (lowpassed (84/4); ctf corrected 84/134) ! average file name x:hf600x1.lpc ! sampling size and Fourier frame size (e.g. 4.3 A; 512) 4.3 512 ! repeat distance (in Angstrom) !!! must be consistent with FIT or 2FLD 1904.9 ! weighting scheme (N, F, S) N !--- First file -- ! file name x:hf600abw.cor ! equator was corrected; weakly defocused ! iside (1: for far side) ipole (1 for upside-down) 0 0 ! fitting parameters (delphi, delz, Rscal) 0.07 0.1 1. ! common scaling factor 1 ! list of layer-lines with special weight factors !--- Second file x:hf601abw.lpc ! This is the lowpass filtered dataset 0 0 0 -0.01 1. ! common scaling factor 2.2 ! due to low-pass filtering ! list of layer-lines with special weight factors 0 1 ! equator must be treated separately (i.e. this case ! same weight as weakly-defocused pitcure) c. This new averaging program automatically takes care of layer-lines missing in some of the datasets. Several weighting schemes are possible; in any case, layer-lines are summed after applying scaling factor (weight of the LL specified in layer-line data set * scaling factor specified in the control data file) = LLWEIGHT*CNT_SCAL. In the final stage, they are scaled differently according to the 'weighting scheme'. 'N': if some layer-lines are missing in some files, common scaling factors for those files are assigned as the weight of the missing LLs; the scaling factor will be 1/(sum of LLWEIGHT*SNT_SCAL); therefore, those layer-lines will be weighted down; if datasets are complete, only the ratio of scaling factors for different datasets alters the final results. 'F': the scaling factor is 1/(sum of existing LLWEIGHT*CNTSCAL); therefore, no reduction occurs. 'S': the scaling factor is 1/(number of sides existing); thefore, if you specify scaling factors not 1, averaged data may be multiplied by a constant; this option can be used for scaling. Normally, 'N' is the option to use. d. To activate the program type HAVG [control_data_file (.cnt can be spared)] for batch, or r [-.exe]hlxavg control_data_file for interactive job. 17. Prepare full layer-line data a. Average different portions as described in the previous section. It seems better to make averaged for limited data (i.e. strong LLs only) and for full data set. Find two-fold origin for limited data using H2FLD. Make sure that the averaged amplitudes are reasonable. b. Print out layer-line data using LLPR; phase origin should be shifted to (common) two-fold origin determined in a. so that you can judge the quality of the data. c. Make a list of layer-line contained in the file. Type $ HTABL (and layer-line data file name) The program make a list of layer-lines (L,N, start radius, end radius) and write onto a file 'image-id'.tbl. Print it using LPR. d. Determine the stretches of the layer-line data to be incorporated. e. Edit layer-lines. i) It is certainly possible to edit layer-line data file directly using an editor, but perhaps easier to edit the data file made by HTABL. It consists ofet lines showing start and end radii of layer-lines contained in the input file. Using editor you can specify the good portion for each layer-line (just replace start and end radii). To delete a whole layer-line, insert 'd' (or 'D') at the head of the corresponding line. There are two other options (Z and A), but they are only for dealing with old averaging programs. You can change the weight for specified layer-lines by specifying new weight factors at the end of the corresponding lines. When you are done, type $ HEDIT to activate the program. ii) example of control data ! HLXTBL Vers. 1.00 --- Wed Oct 19 10:47:22 1988 ! input file name hc3.avg ! output file name hc3l.new <-- default name ! new title carb avg (h9834,hc771, hc1013) limited <--- write something here !-------------------------------------------------------- ! put D to delete, Z to keep at 0 weight and A to add !-------------------------------------------------------- ! L N : start end !-------------------------------------------------------- 0 0 : 0 134 <--- edit start and end radii 1 -45 : 47 93 16 70 : 68 105 2. <--- weight factor is set to 2. 17 25 : 25 117 d 18 -20 : 25 54 <--- delete this layer-line 19 -65 : 63 108 18. Correction for Contrast Transfer Function a. We correct equatorial data treating the recptor tube as a weak phase - weak amplitude object, assuming 7% amplitude contrast. Since the correction around ctf zero is very big, equatorial data should be edited so that the data end at several points before the ctf zero. Note that the ctf changes so rapidly arouond the zero points, 3 or 4 points difference makes a big change. To activate the program, type $ hctfcor [control data file] and control data file name if you didn't specify it in the command line. b. example of control data file: [chikashi.emc]hctfcor.cnt ! title --- used as a new title hn3392 compensated (equator only; 90) ! input file hn3392abcw.avg ! output file x:hn3392abcw.cor ! number of layer-lines (0 for all) to read in and write ! and repeat distance (Angstrom) 0 4500 !ctf zero (grid unit) 90 !amplitude contrast (%) 7 !list of layer-line numbers (A for all) to be included 0 ! this case, only L= 0, i.e. eqator is processed 19. Average of different defocus images a. We have tried two methods: (i) simply average two datasets up to their respective ctf-zero's; (ii) lowpass filter strongly defocused dataset so that more weight is put on lower-resolution data. First one is easier but resultant CTF is not very uniform; by adding 8,000 and 20,000 A defocus data, composite ctf will be about 0.4 at (1, 0); highest resolution data are weighted down to 0.7 to 0.8. By lowpass filtering strongly defocused data, CTF at (1,0) will be > 0.7 and highest resolution data remain intact. Both methods do not rely on the exacat profile of CTF (only use the location of ctf-zero) and thus must be safe. b. Overall contrast transfer is described by overall_ctf(r) = ctfw(r) + w1*ctfs(r)*exp(-w2*r^2/ctfzero^2) ctfw: ctf for weakly defocused image ctfs: ctf for strongly defocused image w1 and w2 must be determined so that overall ctf is as uniform as possible. I tentatively use 2.2 for w1 and 4 for w2; these values give overall ctf of 0.72 for (1,0) reflection. By using higher values for w1 and w2, it is possible to get higer values for ctf at low resolution; however, a valley between two peaks (one formed by strongly defocused image, the other by weakly defocused image) becomes deeper. Note these values affect amplitude scaling factor in averaging. Remember that the above values apply only to images taken at ~8,000 and ~20,000 A defocuses; if the difference in defocus is smaller, as in some of the pictures I took at Stanford, stronger low pass filtering is necessary to achieve that value. b. Example of the control data ([unwin.emc]hlpass.cnt) !title hc6231abcw low-pass (85; steepness 4) !input file hc6231abcw.new !output file hc6231abcw.lps !number of layer-lines (0 for all) and repeat distance (Angstrom) 0 4059.2 ! need true pitch !ctf zero (grid unit) 85 ! steepness (weighting factor w2, described above) 4 !list of layer-line numbers to be **excluded** 0 ! equator should be treated separately c. To activate the program $ hlpass [control_data_file_name (default: .cnt)] d. Example of the output -------------------------------------------------------------- L N 1/Z Y R weight (A) (grid) (grid) -------------------------------------------------------------- 1: 3 45 1353.1 50- 77 ( 50.03- 77.02) ( 0.250; 0.037) 2: 16 25 253.7 20- 84 ( 21.80- 84.45) ( 0.769; 0.019) 22: 129 0 31.5 1- 32 ( 69.97- 76.94) ( 0.066; 0.038) 23: 132 45 30.8 37- 51 ( 80.59- 87.90) ( 0.027; 0.014) 24: 145 25 28.0 26- 40 ( 82.83- 88.23) ( 0.022; 0.013) -------------------------------------------------------------- axial height-| | | weighting factor at | | two end points | |- radii of two end points from origin |- distance from meridian e. Edit control data file for HAVG; remember weighting factor for strongly defocused data. Run HAVG. 20. Average of different images belonging to the same class a. If the selection rules are different but belong to the same class, it may still be possible to average them in the Fourier space. You have to check how much distortion is expected by examining the output from NLTABL. b. If the distortion is small enough, next step is renumbering the layer-line datasets according to a unified selection rule. First, find best image or largest common denominator (i.e. intermediate selection rule). Then run a renumbering program: $ HRENUM You have to supply 1) original file name [def: .avg] 2) new file name [def. original_filename.ren] 3) (n,l) for (1,0) in original indexing scheme 4) (n,l) for (0,1) 5) l for (1,0) in new indexing scheme 6) l for (0,1) As always, hit carriage-return if you are happy with the default. Please note that the layer-lines are stroed in ascending order of L in the original dataset. New layer-lines are sorted according to new layer-line number (L). c. This renumbering is almost trivial except (2,-3) which could be above the equator (i.e. L=1) or below the equator (i.e. L= -1). If the sign of L is different between new and original indexing (i) sign of N should be opposite, and (ii) Friedel pair of the original Fourier term should be used. (Make a NLPLOT and index points using (h,k)). d. With the new fitting program you don't have to worry about aligning and adding blank layer-lines. Just fit test datasets to two-fold forced reference dataset (use only good layer-lines). e. If the radial scaling correction is needed, use HAVG program for scaling, shifting and rotation. It is enough to fill control data for only one side. Then LLPR to plot fitted (full set) layer-line data. 21. Treatment of equatorial data a. There are two things need careful consideration about equatorial data. One is box width; the other is weight for equator in averaging. b. Determination of box-width is not very simple due to weak amplitude contrast. Since we do exact compensation for equatorial data, the phase contrast ripple along the edge of the tube must be included as a part of the structure; otherwise, the density profile near the edge will be altered. c. Weight for equator needs a special care, since we apply different scheme for correcting ctf for equatorial and off-equatorial layer- lines. 22. Fourier-Bessel inversion a. Since the routine for Bessel function used in the MRC programs has a considerable error for higher order Bessel's, a completely new program, optimized for AChR tube is written. This new version is more than 10 times faster than CT's old version. Output file is binary and has different structure if two-fold is assumed (i.e. only real part is stored; see "program.doc" for more detail) b. Prepare control data file (HFB.CNT). !-- input file name hnfitlpc.2fd !-- No of layer-lines (0 for all) twofold (1/0) 0 1 ! use all layer-lines and force two-fold !-- rmin rmax delr (in Angstrom) 0 400 5 ! rmax/delr < 128 (current limitation) !-- output file x:hnfitlpc.lg5 ! .lg5 stands for "5 A interval little g" c. $ SUB HFB to activate the program. d. Example of output: 23. Fourier synthesis a. At present only cylindrical, horizontal (perpendicular to the helix axis and vertical parallel (to the helix axis) sections can be calculated. b. We should start from cylindrical section, because it is most directly related to the reconstruction method. It is very instructive to consider which stripes correspond to which layer-line. c. Example of control data file ([chikashi.emc]HCLND.CNT) !-- Control data file for HCLND !-- title of this reconstruction hn3391x2 lpc fitted (2.2/4) eq=0.22 !-- file name for little g x:hnfitlpc.lg5 !-- true pitch,llmax (to limit # of layer lines; 0 for all), 1 ! if twofold is forced (0 otherwise) 3681 0 1 !-- min and max radius of sections 180 355 ! interval is prefixed by HLXFB !-- min, max and stepsize for phi !9.0 56.2 0.6 (used in the Nature paper) 0. 48. 1. ! these are OK for usual display !-- min, max and stepsize for z (-16 224 4 :used in the Nature paper) 0 200 5 !-- weight for equator 0.22 ! see below (effective weight ~0.7) !-- weight for other ll's if not 1 142 0 ! L = 142 is discarded ! ll number (L) and weight !-- scaling factor (0. for automatic scaling); -2500. ! varies depending on the image contrast !-- output (con: for console) x:hnfitlpc.cld d. for x-sections (vertical parallel sections) !-- Control data file for HXSEC !-- title of this reconstruction hf600x1 fitted eq=0.2 !-- file name for little g f:hf600x1.lg2 !-- true pitch, llmax (0 for all) and 2 for twofold 2202 0 2 !-- min and max radii, interval in x-direction (Angstrom); !-- overscan (1/0) 255 385 5 1 !-- y direction; there are three ways to define the parameters L25 R43 2.5 ! left and right end (degree); interval (angstrom) !c34 a18 2.5 ! centre angle and full width (degree); ! interval (angstrom) !c34 w100 2.5 ! centre angle (degree) full width (angstrom) ! interval (angstrom) !-- min, max and stepsize for z 0. 90. 2.5 !-- weight for equator 0.2 !-- weight for other ll's if not 1 ! ll number (L) and weight !-- scaling factor (0. for automatic scaling) -2500. ! negative for ice image !-- output (con: for console) f:hf600x1.x @@ overscan: this flag refers to whether all the available radii for gnl(r) are used for the map. If 0, gnl(r) terms within the radii specified by rmin and rmax are used; thus at the outermost radii, only one line will be non-zero; if 1, all gnl(r) terms available in the gnl(r) file are used to fill the map as much as possible. @@ If 'console' is specified as the output device, only up to 44 points in the horizontal direction are displayed. Anyway, $stw132 to put the terminal into 132 column modes. f. horizontal section !-- Control data file for HRZ !-- title of this reconstruction Native hn3391x2 eq = 0.3 !-- file name for little g [chikashi.emc]nat.lg5 !-- true pitch,llmax (for limiting # of layer lines; 0 for all); ! twofold ( 1 if twofold forced) 4446.2 0 1 !-- min and max radius of sections; interval (x,y); overscan (1/0) 0 360 5 1 ! min, max, radius could be 0 (for everyting) !-- min, max for phi (degree) 0. 359.99 !-- min, max and stepsize for z (Angstrom) 0. 100. 5. !-- weight for equator 0.3 !-- weight for other ll's if not 1 ! ll number (L) and weight !-- scaling factor (0. for automatic scaling) -1400. !-- output (con: for console) x:natf.hrz !con g. consideration on the weight of equator 23. Output of the map a. ISOCLND for printing out cylindrical sections; run [chikashi.work]isoclnd program is fully interactive to print out 1st to 10th sections, enter 1 10. b. IMGPLT for printing out other kind of sections. run [chikashi.work]imgplt Miscellaneous programs: * HFTAMP : Since the Fourier transform of a helical particle provides only 1st and 4th quadrants, you need to generate 2nd quadrant from 4th quadrant. This program extract the amplitude of Fourier transform and interpolate along the horizontal direction so that you get an undistorted transform. If 1st and 4th quadrants are required and thus only interpolation is needed, use FTAMP. Programs are interactive. Interpolation factor: if the length of the box was longer than 1024, the positions of layer-lines have appeared higher than non-interpolated transforms; that means the interpolation factor (i.e. how many times the transform must be expanded in the horizontal direction) must be larger than 2 for 512x1024 transform. * HSHIFT : rotate and translate phase origin; cann't do phase inversion or scaling correction (use HAVG). * MAGCOR : correct magnification by bilinear interpolation; if the image is small, you can use [public.image.exe]INTERPOL. At the moment, you can reduce the image only by decreasing image dimension (i.e. must keep (0, 0) as it is). (Interactive) * HRZTOCAV : calculate cylindrical average from horizontal sections. Before trying this, you might have to masking off only required portion. 3DBOX and 3DMSK can be used for this purpose. Interactive. You can specify sectors to be included in the average. Quick and dirty code. 1. input file 2. output file 3. origin (x,y) 4. radial sampling (end, interval) 5. z-sections used (start, end) 6. angles for sectors (start, end) 7. ibid if necessary (up to 10) or C/R to stop * HRZTOCLD : calculate densities on cylinders from horizontal sections. Useful for getting general ideas on subunits concentric about the molecule axis. (Interactive) * 3DMSK : mask off one molecule along the minimum/maximum density points on rays projected from the center of molecule to the image periphery. The densities in the area are the same as they were whereas the masked area is filled by a specified number. (Interactive) * VOLUME : calculate volume recovery from density maps. At present the diameter of the limiting circle is specified in grid unit; thus, it will not be accurate if the sampling distance is different in x-y directions. (Interactive)