Job: 3D Classification (BETA)
3D classification without alignment.
This job type has been substantially improved from its original release in CryoSPARC v3.3. Changes in v4.0 and v4.1 are described below.
3D Classification (BETA), first introduced in v3.3, can help discover discrete heterogeneity in single particle cryo-EM datasets. This job currently implements a version of 3D classification without alignment — a classification routine that can complement the Heterogeneous Refinement and 3D Variability jobs in finding new discrete classes of data.
In CryoSPARC v4.0, 3D Classification was updated with several notable improvements, including FSC regularization, focus and solvent mask inputs, new convergence criteria, and a number of new diagnostic plots and outputs.
Note that in CryoSPARC v4.0+, cloning a 3D classification job that was created in CryoSPARC v3.3 will fail to launch due to a change in the inputs and parameters of the job type. Instead, please create a 3D classification job from scratch in v4.0 and re-connect the desired inputs and set parameters.
Under the hood, 3D Classification (BETA) uses a combination of Online and Full-Batch Expectation Maximization (O-EM and F-EM, respectively). These algorithms alternate between (1) computing the most likely class assignments for each particle image in a batch based on the current 3D class volumes, and (2) updating each 3D volume based on these assignments.
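The alternation described above can be illustrated with a toy example. The sketch below is not CryoSPARC's implementation: it assumes short 1D vectors in place of particle images and volumes, an isotropic Gaussian noise model, and no projection or CTF. The function names `e_step` and `m_step`, the learning rate, and the batch size are all hypothetical choices made for illustration.

```python
import numpy as np

def e_step(images, volumes, sigma=1.0):
    """Posterior class probabilities for each image given the current class templates."""
    # Squared distance between every image and every template: (n_images, n_classes)
    sq_dist = ((images[:, None, :] - volumes[None, :, :]) ** 2).sum(axis=2)
    log_p = -0.5 * sq_dist / sigma**2
    log_p -= log_p.max(axis=1, keepdims=True)  # stabilise before exponentiating
    p = np.exp(log_p)
    return p / p.sum(axis=1, keepdims=True)

def m_step(images, posteriors):
    """Posterior-weighted mean of the images for each class."""
    weights = posteriors / (posteriors.sum(axis=0, keepdims=True) + 1e-12)
    return weights.T @ images

# Synthetic data: two classes of 8-element templates plus Gaussian noise.
rng = np.random.default_rng(0)
true_templates = np.array([[0.0] * 8, [3.0] * 8])
labels = rng.integers(0, 2, size=400)
images = true_templates[labels] + rng.normal(scale=1.0, size=(400, 8))

# Start both classes near the dataset mean, slightly offset from each other.
volumes = np.stack([images.mean(axis=0) - 0.5, images.mean(axis=0) + 0.5])

# Online EM (assumed form): one pass over mini-batches, blending each batch estimate in.
learning_rate, batch_size = 0.5, 100
for start in range(0, len(images), batch_size):
    batch = images[start:start + batch_size]
    posteriors = e_step(batch, volumes)
    volumes = (1 - learning_rate) * volumes + learning_rate * m_step(batch, posteriors)

# Full-batch EM: every iteration uses all images.
for _ in range(10):
    posteriors = e_step(images, volumes)
    volumes = m_step(images, posteriors)

assignments = posteriors.argmax(axis=1)
```

In this toy setting the online pass gives a rough starting estimate from a fraction of the data at a time, and the full-batch iterations then refine it using every image.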
- Particles (with
- [Optional] Initial Volumes
- To be used with the input initialization mode. The number of initial volumes must match the number of classes.
- [Optional] Solvent mask
- If not supplied, a solvent mask is computed by dilating and soft-padding the consensus volume.
- [Optional] Focus mask
- If not supplied, only the solvent mask will be used (i.e., the focus mask will be set to a volume of all ones).
Number of classes: Number of classes to use in the job. Note that this can be significantly larger than in Heterogeneous Refinement for the same computational cost.
Target resolution: Desired resolution of each 3D map. This, combined with the extent of the particle images, determines the 3D box size.
Output data after every F-EM iter (updated in v4.1.2): This option may be useful for larger datasets where one may want to monitor the 3D volumes prior to the completion of the job. Note that as of CryoSPARC v4.1.2, this option can only be turned on if class re-ordering is turned off (see below).
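The link between Target resolution, particle extent, and box size can be estimated with the Nyquist limit: a map with pixel size `p` cannot represent detail finer than `2 * p`. The sketch below assumes the extent is given in Ångströms and rounds up to an even box size. The function name `min_box_size` and the rounding rule are assumptions for illustration; the job's exact box size selection may differ.

```python
import math

def min_box_size(extent_angstrom: float, target_resolution_angstrom: float) -> int:
    """Smallest even box size whose Nyquist limit reaches the target resolution."""
    # Pixel size = extent / box size, and Nyquist resolution = 2 * pixel size,
    # so the box size must be at least 2 * extent / target resolution.
    n = math.ceil(2.0 * extent_angstrom / target_resolution_angstrom)
    return n + (n % 2)  # round odd sizes up to the next even number

box_a = min_box_size(300.0, 6.0)   # 300 Å extent, 6 Å target
box_b = min_box_size(256.0, 10.0)  # 256 Å extent, 10 Å target
```

Under these assumptions, a coarser target resolution gives a smaller box, which is one reason classification at modest resolution is cheap per class.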
If 3D Classification is not producing good results, adjusting the following parameters may be a good starting point to get improved results:
O-EM learning rate init (default updated in v4.0): For a fixed O-EM batch size and epoch value, larger values will generally result in fewer populated classes.
Use FSC to filter each class (new in v4.0): FSC filtering may be turned off to match the filtering behaviour of 3D classification in CryoSPARC v3.3.x.
Convergence criterion (%) (new in v4.0): Primary stopping criterion, defined as the percentage of particles that have switched classes across F-EM iterations. Increasing this value may result in ‘early stopping’ of the optimization.
RMS density change convergence check (new in v4.0): If some particles have a high probability of being in two or more different classes, the primary criterion may not be sufficient. Turning on this parameter will force the job to also monitor the root mean square change of the class volumes directly, which provides a secondary source of convergence information.
Per-particle scale (new in v4.1): Per-particle optimization can be turned off and scales can be set to their upstream values (input) or to a constant value of 1.0 (
Force hard classification (new in v4.0): Turns off weighted back projection. This may improve performance for smaller targets where the standard optimization may ‘smear’ a portion of particles across several classes.
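The two convergence quantities described above can be sketched as follows. This is an illustrative reading, not the job's code: the function names, the choice of class sizes as the weights in the RMS average, and the threshold value are all assumptions.

```python
import numpy as np

def percent_switched(prev_assign, curr_assign):
    """Percentage of particles whose class changed between two F-EM iterations."""
    prev_assign, curr_assign = np.asarray(prev_assign), np.asarray(curr_assign)
    return 100.0 * np.mean(prev_assign != curr_assign)

def weighted_rms_change(prev_volumes, curr_volumes, class_sizes):
    """Mean of per-class RMS voxel change, weighted by class size (assumed weighting)."""
    prev_volumes = np.asarray(prev_volumes, dtype=float)
    curr_volumes = np.asarray(curr_volumes, dtype=float)
    n_classes = prev_volumes.shape[0]
    sq_diff = (curr_volumes - prev_volumes).reshape(n_classes, -1) ** 2
    rms_per_class = np.sqrt(sq_diff.mean(axis=1))
    weights = np.asarray(class_sizes, dtype=float)
    return float((weights / weights.sum()) @ rms_per_class)

# Four particles, one of which moves from class 2 to class 1.
switched = percent_switched([0, 1, 2, 2], [0, 1, 1, 2])

# Two tiny 2x2x2 class volumes; only the first one changes.
old_vols = np.zeros((2, 2, 2, 2))
new_vols = old_vols.copy()
new_vols[0] += 1.0
rms = weighted_rms_change(old_vols, new_vols, class_sizes=[3, 1])

threshold_percent = 2.0  # hypothetical value, for illustration only
primary_converged = switched <= threshold_percent
```

A larger threshold makes `primary_converged` true sooner, which corresponds to the 'early stopping' behaviour mentioned for the Convergence criterion (%) parameter.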
Other salient considerations with regards to parameters:
Reorder classes by size (new in v4.1.2): With this parameter turned on (default), classes will be reordered according to their size (i.e., number of assigned particles) at the end of classification, prior to output generation. To avoid potential confusion regarding class outputs, this option must be turned off if Output data after every F-EM iter is turned on.
- All particles
- Particles for each class
- 3D volumes for each class
- Volume series of all 3D maps (
- Solvent mask (passthrough input or auto-generated)
- Focus mask (passthrough input if provided)
- Consensus volume
- Further classification of subsets of classes
A number of significant improvements to 3D Classification were added in CryoSPARC v4.0 and later releases. We list them below.
- Per-particle scale optimization (v4.1+)
- By default, 3D Classification will perform per-particle scale optimization before starting the main EM classification loop.
- FSC-based filtering (v4.0+)
- By default, during both O-EM and F-EM iterations, 3D Classification will filter each class volume by its intra-class FSC curve.
- Convergence criteria (v4.0+)
- F-EM iterations will conclude when one of two stopping criteria is met:
- % of particles that switch classes falls below a threshold (primary stopping criterion)
- weighted mean RMS density change falls below a threshold (optional, secondary criterion)
- Separate focus and solvent mask inputs (v4.0+)
- 3D Classification accepts two different types of masks: a solvent mask, $M_s$, and a focus mask, $M_f$. During optimization we use the following real-space volume for all likelihood computations of class $k$:
$$\tilde{V}_k = M_s \odot \left[ M_f \odot V_k + (1 - M_f) \odot V_c \right]$$
where $V_k$ is the volume of class $k$ and $V_c$ is the consensus reconstruction.
If $M_f$ is not provided, we set $M_f = 1$. Otherwise, we also plot real-space slices and projections of the mask overlaid on the consensus volume map:
Focus mask overlaid on real-space slices.
- Filtered consensus volume output (v4.4+)
- The consensus map is now filtered in accordance with its FSC. The resulting map is output by the job for inspection.
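The role of the separate focus and solvent masks can be illustrated with a small sketch. It assumes one plausible combination that is consistent with the description in this guide: class density is used inside the focus mask, the consensus reconstruction is used elsewhere, and the solvent mask is applied to the result. With no focus mask, the focus mask is all ones and only the solvent mask has an effect. The function name and array layout are hypothetical, and this is not the job's actual code.

```python
import numpy as np

def masked_class_volume(class_vol, consensus_vol, solvent_mask, focus_mask=None):
    """Combine a class volume with the consensus volume under the two masks (assumed form)."""
    if focus_mask is None:
        # No focus mask supplied: treat it as a volume of all ones.
        focus_mask = np.ones_like(class_vol)
    blended = focus_mask * class_vol + (1.0 - focus_mask) * consensus_vol
    return solvent_mask * blended

# Tiny 1D stand-ins for 3D volumes.
class_vol = np.array([1.0, 2.0, 3.0, 4.0])
consensus_vol = np.array([10.0, 10.0, 10.0, 10.0])
solvent_mask = np.array([1.0, 1.0, 1.0, 0.0])
focus_mask = np.array([1.0, 0.0, 0.5, 1.0])

focused = masked_class_volume(class_vol, consensus_vol, solvent_mask, focus_mask)
unfocused = masked_class_volume(class_vol, consensus_vol, solvent_mask)
```

Because everything outside the focus mask is identical across classes in this form, differences in likelihood between classes come only from the focused region.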
Starting with CryoSPARC v4.0, 3D Classification outputs several new diagnostic plots listed below.
This histogram can help diagnose poor classification results by showing whether some particles have significant probability mass in more than one class. The ESS (Effective Sample Size) is a measure of how many classes each particle appears to belong to with significant probability. An ESS of 1.0 indicates that a particle is confidently assigned to only one class. An ESS of 2.0 would mean that a particle belongs with substantial probability to two classes. When many particles have a large ESS (> 1), this indicates that there is significant uncertainty in classification, and the classes may be overlapping or similar.
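A common definition of effective sample size for a discrete distribution is the inverse of the sum of squared probabilities, which matches the behaviour described above (1.0 for a single confident class, 2.0 for an even split over two classes). The sketch below assumes that definition; the guide does not state the exact formula used by the job, and the function name is hypothetical.

```python
import numpy as np

def effective_sample_size(posteriors):
    """Per-particle ESS, assuming ESS = 1 / sum_k p_k^2 over class probabilities."""
    p = np.asarray(posteriors, dtype=float)
    return 1.0 / (p ** 2).sum(axis=1)

# Rows are particles, columns are class probabilities (each row sums to 1).
posteriors = np.array([
    [1.0, 0.0, 0.0],   # confidently in one class
    [0.5, 0.5, 0.0],   # evenly split between two classes
    [0.8, 0.1, 0.1],   # mostly one class, some uncertainty
])
ess = effective_sample_size(posteriors)
```

A histogram of `ess` over all particles would then pile up near 1.0 for a confident classification and spread to larger values when classes overlap.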
This plot shows the real-space difference between the consensus map and each class map, regularized by the class FSC (if FSC regularization is turned on). This can quickly show areas of heterogeneity.
This diagram shows how many particles switched classes across F-EM iterations (output starts at the second F-EM iteration). An edge, (i,j), is drawn with a thickness, colour, and opacity defined by the number of particles that switch from class i to class j.
This diagram visualizes class flow in a matrix format. Each column represents a 1D distribution of the particles in a given class at the current F-EM iteration. Each row represents the class which the particles belonged to at the previous iteration. In other words, each square in this grid represents an edge in the bipartite class flow graph above. This form of class flow can be useful in visualizing 'minor' edges that are difficult to see in the bipartite graph, and it can greatly improve clarity for class flow with large (25+) numbers of classes.
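The matrix form of class flow can be sketched by counting transitions between consecutive F-EM iterations. In the hypothetical helper below, rows index the previous class and columns index the current class, following the description above; normalising each column then gives the distribution of origins for each current class. The function names are assumptions for illustration.

```python
import numpy as np

def class_flow_matrix(prev_assign, curr_assign, n_classes):
    """Count particles moving from each previous class (row) to each current class (column)."""
    prev_assign = np.asarray(prev_assign)
    curr_assign = np.asarray(curr_assign)
    flow = np.zeros((n_classes, n_classes), dtype=int)
    np.add.at(flow, (prev_assign, curr_assign), 1)  # accumulates repeated index pairs
    return flow

def column_distributions(flow):
    """Normalise each column so it sums to 1 (empty classes stay all zero)."""
    col_totals = flow.sum(axis=0, keepdims=True)
    return flow / np.maximum(col_totals, 1)

prev = [0, 0, 1, 1, 2]
curr = [0, 1, 1, 1, 0]
flow = class_flow_matrix(prev, curr, n_classes=3)
dist = column_distributions(flow)
n_switched = int(flow.sum() - np.trace(flow))  # off-diagonal entries are switches
```

Each off-diagonal entry of `flow` corresponds to one edge of the bipartite class flow graph, and the diagonal holds the particles that stayed in place.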
This histogram now includes both total assignments and the ‘effective size’ of the class. The latter is a sum of the probability mass in that class. When the assignments and effective size bars are differently sized, this indicates that there is uncertainty in the classification, as many particles have probabilities that are spread out between classes (an effect included in the effective size) compared to the class where they have the maximum probability (the assignments).
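The two quantities in this histogram can be computed from a matrix of per-particle class probabilities, as in the sketch below. The function name and the example probabilities are hypothetical; the definitions follow the description above (assignments count the maximum-probability class, effective size sums probability mass).

```python
import numpy as np

def class_sizes(posteriors):
    """Return (hard assignment counts, effective sizes) for each class."""
    p = np.asarray(posteriors, dtype=float)
    # Assignments: each particle counts once, towards its most probable class.
    assignments = np.bincount(p.argmax(axis=1), minlength=p.shape[1])
    # Effective size: total probability mass that falls in each class.
    effective = p.sum(axis=0)
    return assignments, effective

posteriors = np.array([
    [0.9, 0.1],
    [0.6, 0.4],
    [0.2, 0.8],
])
assignments, effective = class_sizes(posteriors)
```

Here class 0 receives two assignments but an effective size of about 1.7, so the gap between the two bars reflects the probability mass that its particles share with class 1.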