Job: Orientation Diagnostics

A new job in CryoSPARC v4.4+ to diagnose the presence of preferred orientation

Description

Orientation Diagnostics is a job (new in CryoSPARC v4.4, updated in v4.5) that can aid in diagnosing the presence of preferred orientation. It includes and builds upon 3DFSC (Tan et al., 2017) and Fourier Sampling (Baldwin & Lyumkis, 2020).

By default, Orientation Diagnostics reports the conical FSC area ratio, or cFAR (v4.4+) and Relative signal (v4.5+). cFAR values below 0.5 generally indicate the presence of preferred orientation. cFAR accounts for both the viewing direction distribution and the signal content present within each particle by quantifying the variance of directional half-map Fourier correlations across the viewing sphere.

Relative signal captures FSC variation as a function of viewing direction. Regions of low relative signal can help identify missing views whose absence has a deleterious effect on map anisotropy.

If particles are supplied, the job will also report the Sampling Compensation Factor or SCF* (Baldwin & Lyumkis, 2020 — see below for more information about the significance of the star). SCF* values below 0.81 generally indicate the presence of preferred orientation. SCF* characterizes the sampling of Fourier shells by considering the viewing directions of all particles, without accounting for the signal content contained therein. A junk particle is given equal weight to a real particle.

We provide a short summary of the two metrics below. Please see the detailed definitions at the end of this job guide or the Orientation Diagnostics tutorial for more information.

cFAR and SCF* at a glance...

Input

  • Volume or Volumes (all classes)

    • New in v4.5: If Volumes (all classes) are supplied from an upstream classification job, orientation diagnostics will be computed for each class volume. Note that this input is a volumes group input.

  • [Optional] Particles

    • If supplied, Fourier sampling (and its associated metric, SCF*) will be computed along with other per-particle diagnostics

  • [Optional] Mask

    • If supplied, half-maps will be masked prior to the computation of conical FSCs

NOTE: The cFSC plot produced during the final iteration of all refinement jobs uses the auto-tightened mask (mask_fsc_auto). This mask is a low-level output of all refinements and will be automatically used by Orientation Diagnostics:

  • when connecting a refined volume group as an input to Orientation Diagnostics

  • when using the 'Build Orientation Diagnostics' quick action

To use a custom mask, connect the input mask group. Note that if a custom mask is connected, the results may differ from the output of an upstream refinement.

Common Parameters

Conical FSC

  • Number of Directions

    • The conical FSC Area Ratio (cFAR) metric is relatively robust to the number of conical axis directions. However, reducing this number can speed up the job for volumes with large box sizes. Increasing this number will create denser spherical plots.

Sampling Compensation Factor

  • Symmetry

    • Particle viewing directions will be expanded to account for symmetry. This parameter should be set to the symmetry applied in the upstream refinement.

Output

  • 3DFSC volume

    • A volume of radial cFSC curves interpolated at each 3D voxel location

Common Next Steps

  • Particle picking (to find missing views)

More details about cFAR, SCF*, and Relative Signal

cFSC Weighted Area-under-Curve (wAuC)

Given a conical Fourier shell correlation, $C_r(\hat{v})$ (where $r$ is the Fourier radius in wave number and $\hat{v}$ is the conical axis), we define the wAuC as a weighted sum of $C_r(\hat{v})$ over Fourier radii $r$.

In this weighted sum, the Fourier radius, $r$, ranges from the DC component to the radius at which $C_r(\hat{v})$ first crosses 0.143. Intuitively, for each Fourier radius $r$, the correlation between the half maps is multiplied by the surface area of the Fourier shell at $r$. The final result is proportional to the ‘mass’ of the cone in units of correlation.
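The weighted sum described above can be sketched as follows. This is a minimal illustration rather than CryoSPARC's implementation: the function name `wauc` is hypothetical, the shell weight is taken to be $r^2$ (proportional to the shell surface area $4 \pi r^2$), and the radius at which the curve first drops below 0.143 is assumed to be excluded from the sum.

```python
import numpy as np

def wauc(cfsc, threshold=0.143):
    """Weighted area under one conical FSC curve (illustrative sketch).

    cfsc[r] is the half-map correlation at Fourier radius r (in wave
    number), with r = 0 being the DC component.
    """
    cfsc = np.asarray(cfsc, dtype=float)
    below = np.nonzero(cfsc < threshold)[0]
    # Assumption: stop just before the first radius where the curve drops
    # below the threshold; if it never does, use the whole curve.
    r_stop = below[0] if below.size else cfsc.size
    r = np.arange(r_stop)
    # r**2 is proportional to the surface area of the Fourier shell at r.
    return float(np.sum(r**2 * cfsc[:r_stop]))

# The curve crosses 0.143 at r = 3, so only r = 0, 1, 2 contribute.
print(wauc([1.0, 1.0, 0.5, 0.1, 0.9]))
```

Because the weights grow with $r^2$, correlations at high spatial frequency contribute far more to the sum than those near the DC component.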

cFSC Area Ratio (cFAR)

We define the cFAR as the ratio of the minimum to the maximum $\text{wAuC}(\hat{v})$ over $\hat{v}$, or

$$\text{cFAR} \overset{\Delta}{=} \frac{\min_{\hat{v}} \text{wAuC}(\hat{v})}{\max_{\hat{v}} \text{wAuC}(\hat{v})}.$$
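As a small worked example of this ratio, the sketch below computes cFAR from a list of per-direction wAuC values. The function name `cfar` and the input values are hypothetical; only the min/max ratio itself comes from the definition above.

```python
import numpy as np

def cfar(wauc_per_direction):
    """Ratio of the smallest to the largest wAuC over conical axes."""
    w = np.asarray(wauc_per_direction, dtype=float)
    if w.max() <= 0:
        raise ValueError("the largest wAuC must be positive")
    return float(w.min() / w.max())

# Hypothetical wAuC values for four conical axes.
score = cfar([120.0, 95.0, 60.0, 110.0])
print(score)  # 60 / 120 = 0.5
```

A perfectly isotropic map would have identical wAuC values in every direction and therefore a cFAR of 1.0.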

tFSC Area Ratio (tFAR)

New in CryoSPARC v5

In some cases, especially with small membrane proteins or targets with unimodal or bimodal viewing direction distributions, the cFAR score can indicate that a map is very anisotropic when it is in fact usable. Often, these targets with a pathologically low cFAR score have a good tFAR score (the same area ratio as the cFAR but calculated with tFSCs). Starting in CryoSPARC v5.0, we began reporting the tFAR score for use in these targets.

Sampling Compensation Factor (SCF*) (Baldwin and Lyumkis, 2020)

SCF* measures orientation bias in particle viewing directions — it allows CryoSPARC to convert a potentially difficult-to-parse viewing distribution into a single metric.

SCF* is computed from the statistics of ‘Fourier sampling’. Given a single particle, Fourier sampling is a binary function that indicates which ‘slab’ (i.e., a stack of slices) of Fourier voxels is affected when the particle is used in back-projection. Summing this sampling over many particles reveals whether large regions of Fourier space are poorly sampled or missing completely. The degree to which this occurs is measured by the Sampling Compensation Factor, a single number that quantifies the extent to which anisotropic viewing direction distributions attenuate the global FSC value.

Mathematical Definition

To compute the SCF, we consider the sampling of Fourier bins at a particular radius, $R$. In total there are ~$2 \pi R^2$ unique bins in Fourier space. For each particle viewing direction $\hat{v}$, we compute its associated sampling via the slab condition (Baldwin and Lyumkis, 2020):

$$\text{Sp}(\hat{k}, \hat{v}, R) \overset{\Delta}{=} \boldsymbol{1} \left( |\hat{k} \cdot \hat{v}| \leq \frac{1}{2 R} \right),$$

where $\hat{k}$ is a unit vector that defines a Fourier voxel on the shell of radius $R$, and $\boldsymbol{1}(\cdot)$ is the indicator function. Intuitively, this function returns 1 for all Fourier voxels that belong to a ring of radius $R$, within a plane orthogonal to $\hat{v}$ (and 0 otherwise). Next, we sum $\text{Sp}(\hat{k}, \hat{v}, R)$ over all particle viewing directions to produce $\text{Sp}$, a single set of values that indicate the number of times each bin was sampled. The SCF can then be computed as

$$\text{SCF} = \left( \langle \text{Sp} \rangle \left\langle \frac{1}{\text{Sp}} \right\rangle \right)^{-1},$$

where $\frac{1}{\text{Sp}}$ denotes element-wise reciprocals, and $\langle \cdot \rangle$ is the arithmetic mean. This value is positive and always less than or equal to 1. Higher values indicate more uniform sampling distributions. If any of the sampling values are zero, the expression above is undefined. To account for this possibility, we report

$$\text{SCF*} = \left( \frac{\langle \text{Sp}^* \rangle}{p} \left( p \left\langle \frac{1}{\text{Sp}^*} \right\rangle + q \right) \right)^{-1},$$

where $q$ is the fraction of zero sampling bins, $p = 1 - q$, and $\text{Sp}^*$ are the non-zero sampling values. Note that if $q=0$, then $\text{SCF} = \text{SCF*}$. Please refer to (Baldwin and Lyumkis, 2021) for more details.
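The definitions in this section can be sketched numerically as follows. This is an illustration under stated assumptions, not CryoSPARC's code: the function names `sampling_counts` and `scf_star` are hypothetical, the Fourier bins are passed in as an array of unit vectors, and the slab condition uses the absolute value of the dot product so that each particle samples a band around the plane orthogonal to its viewing direction.

```python
import numpy as np

def sampling_counts(k_hat, v_hat, R):
    """Sum the slab condition over all viewing directions.

    k_hat: (K, 3) unit vectors for Fourier bins on the shell of radius R.
    v_hat: (N, 3) unit particle viewing directions.
    Returns a length-K array: how many particles sample each bin.
    """
    dots = np.abs(k_hat @ v_hat.T)  # (K, N) values of |k . v|
    return np.sum(dots <= 1.0 / (2.0 * R), axis=1)

def scf_star(sp):
    """SCF* from per-bin sampling counts; equals SCF when no bin is zero."""
    sp = np.asarray(sp, dtype=float)
    q = np.mean(sp == 0)            # fraction of unsampled bins
    p = 1.0 - q
    if p == 0:
        raise ValueError("no Fourier bin was sampled")
    nz = sp[sp > 0]                 # Sp*: the non-zero sampling values
    return float(1.0 / ((nz.mean() / p) * (p * np.mean(1.0 / nz) + q)))

print(scf_star([2, 2, 2, 2]))  # uniform sampling
print(scf_star([0, 2, 2, 2]))  # one quarter of the bins unsampled
```

In this toy input, uniform sampling gives a value of 1, and leaving a quarter of the bins unsampled lowers the value to about 0.6, below the 0.81 guideline quoted earlier in this guide.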

Relative Signal (tFSC)

The conical sections used to compute cFSCs cannot be easily mapped to viewing directions. To map FSC values to viewing directions, we use a toroidal section. We define a toroidal section to be the volume swept out by a cone, whose axis is orthogonal to the viewing direction, as it spins about the viewing direction. A toroidal section contains the Fourier components that would be populated by a particle with the same viewing direction, dilated to account for some error (or 'wiggle') in the pose estimate. We set the toroidal half-angle such that its volume is approximately equal to that of a cone of the same half-angle.

Conical vs. toroidal sections and their relation to the Fourier Slice Theorem.
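The geometric difference between the two kinds of section can be sketched with two membership tests on unit Fourier directions. This is an illustrative assumption about the geometry described above, not CryoSPARC's implementation: both function names are hypothetical, and the conical test includes the antipodal cone on the assumption that Fourier transforms of real-valued volumes are Friedel symmetric.

```python
import numpy as np

def in_conical_section(k_hat, axis, half_angle):
    """True where k_hat lies within half_angle of the cone axis.

    Assumption: the antipodal cone is included as well.
    """
    return np.abs(k_hat @ axis) >= np.cos(half_angle)

def in_toroidal_section(k_hat, view_dir, half_angle):
    """True where k_hat lies within half_angle of the plane orthogonal
    to view_dir, i.e. the region swept by a cone spun about view_dir."""
    return np.abs(k_hat @ view_dir) <= np.sin(half_angle)

z = np.array([0.0, 0.0, 1.0])
k = np.array([[0.0, 0.0, 1.0],    # along z
              [1.0, 0.0, 0.0],    # in the plane orthogonal to z
              [0.0, 0.0, -1.0]])  # antipode of z
alpha = np.deg2rad(20.0)
print(in_conical_section(k, z, alpha))   # directions near the z axis
print(in_toroidal_section(k, z, alpha))  # directions near the xy plane
```

For a viewing direction along z, the toroidal section selects Fourier components near the central slice that a particle with that view would populate, whereas a cone about z selects components that such a particle does not sample at all.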

Using these toroidal sections, we compute a set of FSC curves. This set allows us to define a number, which we call relative signal, for each viewing direction. Relative signal is the wAuC of each curve, normalized with respect to the maximum within the set. The curve with the greatest wAuC in the set has a relative signal of 1.0, while a theoretical curve that is 0 at every frequency would have a relative signal of 0.0.

FSC curves computed within a toroidal section. Each curve is coloured by its relative signal, which we define to be the wAuC of each curve, normalized with respect to the maximum. We visualize each curve as a single number on an azimuth-elevation chart, and a 3D viewing sphere that encompasses a low-pass-filtered 3D volume.
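The normalization that defines relative signal can be sketched in a few lines. The function name `relative_signal` and the input values are hypothetical; the wAuC values are assumed to have been computed from the toroidal FSC curves beforehand.

```python
import numpy as np

def relative_signal(wauc_per_view):
    """Normalize per-view wAuC values by the largest value in the set."""
    w = np.asarray(wauc_per_view, dtype=float)
    if w.max() <= 0:
        raise ValueError("the largest wAuC must be positive")
    return w / w.max()

# Hypothetical wAuC values for three viewing directions.
rel = relative_signal([50.0, 100.0, 200.0])
print(rel)        # the best-sampled view is 1.0
print(rel.min())  # smallest relative signal in the set
```

If tFAR is read as the same min/max area ratio as cFAR applied to tFSC curves, as described in the tFAR section, then the smallest relative signal in the set equals that ratio.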

Starting in CryoSPARC v5, the tFSC curves are plotted in the "All toroidal fscs" plot.

Plot Explanations

v4.4 only plots

cFSC wAuC vs. conical axis

The wAuC of a cFSC curve is a proxy for directional signal content. If wAuC is relatively constant when the conical axis is varied, then the signal is isotropic in viewing direction. This plot helps illustrate the variation in cFSC wAuC and can aid in diagnosing the ‘structure’ of anisotropy.

cFSC resolution vs. conical axis

This plot is similar to the plot above, but visualizes the 0.143 crossing, rather than wAuC, of each cFSC curve.

cFSC summaries within azimuth/elevation regions

To add back in a coarse notion of directionality to the above plot, we reproduce the same statistics for twelve different regions of axis space. This plot can help identify differences in cFSC variance to further elucidate the source of anisotropy.

Summary of cFSC curves

We summarize all cFSC curves in this plot, visualizing statistics as a function of spatial frequency rather than conical axis. As of v4.5, this plot also displays the cFAR score.

In blue: statistics over cFSC curves: mean, min, max, +/- one standard deviation plotted against spatial frequency. In green: histogram over 0.143 crossings of the same curves.

Starting in v5.0 Orientation Diagnostics also produces a plot of all cFSC curves, colored by their relative wAuC. This may be helpful for understanding the distribution of cFSC curves in some cases.

Also starting in v5.0, raw cFSC curves are written to a csfsc.csv file in the job's directory (it cannot be downloaded through the GUI). In this file, the first column, wave_number, gives the wave number for the resolution shell in that row. The resolution of that row is therefore

$$\mathrm{Resolution} = \frac{\mathrm{box\_size} \times \mathrm{pixel\_size}}{\mathrm{wave\_number}}$$

All columns other than the first contain the cFSC value for a single cone at that row's resolution. Each column header is a 3D vector identifying the cone by the vector pointing from the origin to the center of the cone's base. In general, the exact values of these numbers are not expected to be important; what matters is that they uniquely identify a given cone.
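A file with this layout could be read as sketched below. This is a hedged example: the helper name `read_cfsc_csv` is hypothetical, the exact header formatting of the real file is assumed rather than known (the sample text here is made up), and `box_size` (in pixels) and `pixel_size` (in Å) must be supplied by the user.

```python
import csv
import io

def read_cfsc_csv(fileobj, box_size, pixel_size):
    """Return (cone_labels, rows); each row is (resolution, cfsc_values)."""
    reader = csv.reader(fileobj)
    header = next(reader)
    cone_labels = header[1:]  # first column is wave_number
    rows = []
    for row in reader:
        if not row:
            continue
        wave_number = float(row[0])
        # Resolution = box_size * pixel_size / wave_number (DC has none).
        if wave_number > 0:
            resolution = box_size * pixel_size / wave_number
        else:
            resolution = float("inf")
        rows.append((resolution, [float(x) for x in row[1:]]))
    return cone_labels, rows

# Made-up sample with two cones; real headers may be formatted differently.
sample = 'wave_number,"(0.0, 0.0, 1.0)","(1.0, 0.0, 0.0)"\n0,1.0,1.0\n10,0.9,0.5\n'
labels, rows = read_cfsc_csv(io.StringIO(sample), box_size=100, pixel_size=1.0)
print(labels)
print(rows[1])  # wave number 10 with a 100 Å box edge
```

For a real job, the same function could be given an open file handle, for example `open(path, newline="")`, where `path` points to the csfsc.csv file in the job's directory.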

Relative signal by viewing direction

Relative signal visualized in a 2D azimuth-elevation chart (left), and in a 3D coloured scatter plot (right) with a low-pass-filtered volume embedded within. Low relative signal (i.e., darker colours) indicates a region with under-represented views.

Relative signal vs. viewing direction (left: azimuth/elevation, right: 3D coloured scatter plot with structure embedded inside).

Starting in v5.0, Orientation Diagnostics also plots all of the tFSC curves. This plot also reports the tFAR score.

Relative signal in azimuth / elevation regions

New in v4.5. Relative signal within twelve regions of the viewing sphere, defined by different limits on azimuth and elevation. For each region, we show the projection of the structure from the central viewing direction, as well as the mean relative signal within that region. Regions with low relative signal represent missing or underrepresented views in the data.

Relative signal within 12 azimuth/elevation regions.

3DFSC volume

The 3DFSC volume (Tan et al., 2017) is another way to summarize cFSC curves by storing them in a volume whose voxels are interpolated from cFSC values at the nearest conical axis. In this plot, we visualize the 3DFSC volume via central slices.

Central slices of the 3DFSC volume (Tan et al., 2017) composed of interpolated cFSC values.

Fourier Sampling

The Fourier Sampling plot is only generated if particles are connected.

These plots visualize the Fourier sampling accumulated over a random subset of the particle viewing directions (default: 10000). Fourier sampling is antipodally symmetric (and hence only has ~$2 \pi R^2$ bins), so we visualize only the $z>0$ hemisphere, following the original publication (Baldwin & Lyumkis, 2020). N.B., the elevation / azimuth plot on the left should be visually similar to the Posterior Precision plot, as posterior precision measures Fourier sampling modulated by the CTF.

The Fourier sampling $\text{Sp}$ (see equation in mathematical definition) visualized in 3D (right) and via an azimuth / elevation parameterization of the hemisphere (left).

Particle scale factor vs. viewing direction

The scale factor plot is only generated if particles are connected and their per-particle scales are not all 1.0.

This figure visualizes the average particle scale for a set of viewing directions (uniformly sampled on the viewing sphere).

(top) particle scale factors visualized by colour as function of particle viewing direction. (bottom) particle scale factor histogram (reproduced in other refinement jobs).

References

Tan et al. (2017). Addressing preferred specimen orientation in single-particle cryo-EM through tilting. Nature Methods 14(8), 793-796.

Baldwin, P. R., & Lyumkis, D. (2020). Non-uniformity of projection distributions attenuates resolution in Cryo-EM. Progress in Biophysics and Molecular Biology 150, 160-183.
