Job: 2D Classification
Rapidly classify particle images based on their in-plane rotation.
Single particle cryo-EM images are essentially 2D projections of 3D objects. The ultimate goal of single-particle analysis is reconstruction of one or more 3D volumes; however, the calculations necessary for these reconstructions are expensive, relatively slow, and sensitive to noise and outliers in the data. Thus, it can be beneficial to perform an initial “clean-up” step in 2D to quickly discard particle images which are obviously junk or the wrong particle, and also to quickly visualize the contents of a dataset before proceeding further with processing.
2D Classification groups particles into a specified number of classes. Because the technique is using only two dimensions, the particles are only rotated and translated in-plane (i.e., only rotated and translated as you could with a flat, printed version of the image). The average of all particles in a class (called a class average) typically has a significantly better signal-to-noise ratio than any single particle image. As such, it is much easier to identify and discard a bad class average than it is a single particle image.
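The statistics behind this improvement can be sketched numerically: averaging N aligned images suppresses the noise variance by roughly 1/N while the signal is preserved. The following is an illustrative simulation (the 1D "particle", noise level, and SNR estimate are all hypothetical, not CryoSPARC code):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 1D "particle": a smooth signal buried in heavy noise.
signal = np.sin(np.linspace(0, 2 * np.pi, 128))
noise_sigma = 5.0

def snr(images, true_signal):
    """Rough SNR estimate: signal power over residual noise power of the average."""
    avg = images.mean(axis=0)
    residual = avg - true_signal
    return np.sum(true_signal**2) / np.sum(residual**2)

# Simulate many noisy copies of the same (pre-aligned) particle.
stack = signal + rng.normal(0, noise_sigma, size=(1000, 128))

# Averaging N copies reduces noise variance by ~1/N, so SNR grows ~N.
snr_1 = snr(stack[:1], signal)
snr_100 = snr(stack[:100], signal)
print(snr_100 / snr_1)  # roughly a 100x improvement
```

With this seed, averaging 100 images improves the estimated SNR by roughly two orders of magnitude over a single image, which is why class averages are so much easier to assess than raw particles.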
In CryoSPARC, particle coordinates and extracted particle images are both "particle" type outputs. For 2D Classification, the particles must have been extracted (i.e., they must have a “blob” field). If you encounter an error message when you launch a 2D Classification job that reads:
AssertionError: Non-optional inputs from the following input groups and their slots are not connected: particles.blob. Please connect all required inputs.
This parameter sets the number of classes into which particles will be grouped. In general, as the number of classes increases, the ability to separate images into different viewing directions, or into “good” and “bad” classes improves. With too few classes, “junk” particles may be grouped into a “good” class because there is not enough room for a class that is entirely junk. However, with too many classes the computation can become slow, and there may not be enough signal in the images within each class to successfully sort particles.
For the typical dataset with particles numbering in the hundreds of thousands, 50–200 classes is a good starting place. However, this parameter has a significant impact on the results and effectiveness of 2D Classification; as such, some experimentation may be necessary (and is recommended) in order to find the best number of classes for a given dataset.
This parameter sets the highest (i.e. finest) spatial frequency used throughout the job, for both alignment and averaging.
In most cases, the default setting does not need to be changed. Limiting the resolution available to 2D Classification can reduce overfitting in cases where it is present, but if resolution is too low the algorithm will not be able to align or classify the particles. In general, we recommend using a lower resolution when spiky overfitting artifacts are observed.
In general, we do not recommend using a higher resolution for this parameter than the default setting, as higher resolution details should be resolved in 3D rather than 2D.
In some cases, it can be beneficial to use high resolution detail when averaging particle images, but limit the algorithm to use a lower resolution while aligning particles. This parameter allows setting the maximum resolution for alignment.
Setting the maximum alignment resolution using Maximum alignment res (A) provides the same effect as setting the Maximum resolution (A) parameter, namely, reducing overfitting by removing high-frequency information. This parameter must be set to the same or a lower resolution than Maximum resolution (A).
However, once particles have been aligned at the lower Maximum alignment res (A) resolution, they are averaged together at the Maximum resolution (A). The class averages will therefore be of a higher resolution, but without the risk of overfitting due to noise.
This parameter sets the lowest frequency which will be included in the alignment step; essentially, the particle images are high-pass filtered to this frequency during alignment. This setting is most useful when there are large, uninteresting parts of the images which degrade alignment (most commonly micelles or other neighboring contaminants). In such cases, setting the Minimum alignment res between 40–60 Å typically gives the best results.
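Together, the minimum and maximum alignment resolutions define a band-pass filter on the images used for alignment. A minimal sketch of such a filter in Fourier space (illustrative only; CryoSPARC's internal filtering and parameter handling may differ):

```python
import numpy as np

def bandpass(image, pixel_size_A, min_res_A, max_res_A):
    """Keep only spatial frequencies between 1/min_res_A and 1/max_res_A."""
    n = image.shape[0]
    f = np.fft.fftfreq(n, d=pixel_size_A)          # frequencies in 1/Angstrom
    fx, fy = np.meshgrid(f, f, indexing="ij")
    r = np.hypot(fx, fy)
    mask = (r >= 1.0 / min_res_A) & (r <= 1.0 / max_res_A)
    return np.real(np.fft.ifft2(np.fft.fft2(image) * mask))

img = np.random.default_rng(1).normal(size=(128, 128))
# High-pass at 50 A (like Minimum alignment res) and low-pass at 6 A
# (like Maximum alignment res), assuming a 1 A pixel size.
aligned_view = bandpass(img, pixel_size_A=1.0, min_res_A=50.0, max_res_A=6.0)
```

Note that the high-pass side also removes the DC component, so the filtered image has approximately zero mean; only the mid-band detail used for alignment remains.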
2D Classification starts with randomly generated initial guesses for the 2D class averages. Early in the classification process, these class averages begin to improve, but are still quite far from correct. Therefore, it is important that the 2D classification algorithm account for uncertainty in the class averages; the Initial classification uncertainty factor parameter controls this effect.
This parameter controls how long 2D Classification should remain uncertain about the particles’ class assignments. Increasing the Initial classification uncertainty factor will make the algorithm treat class assignments as uncertain for a larger number of iterations.
To avoid the problem of contamination from neighboring particles, the class averages (not the particle images) are masked with a circular mask at each iteration of 2D classification. Generally, we recommend that this parameter is set slightly larger than the particle diameter. It is also best to keep Re-center 2D classes on when using a circular mask, so that the particle is centered in the mask as its alignment improves.
The inner and outer diameters of this mask’s soft edge are controlled by the Circular mask diameter (A) and Circular mask diameter outer (A) parameters, respectively.
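A soft-edged circular mask of this kind can be sketched as follows, with a cosine taper between the inner and outer diameters. This is an illustrative construction; CryoSPARC's actual mask generation may differ:

```python
import numpy as np

def soft_circular_mask(box_px, inner_d_px, outer_d_px):
    """1 inside the inner diameter, 0 outside the outer diameter,
    cosine-tapered soft edge in between."""
    c = (box_px - 1) / 2.0
    y, x = np.indices((box_px, box_px))
    r = np.hypot(y - c, x - c)
    r_in, r_out = inner_d_px / 2.0, outer_d_px / 2.0
    t = np.clip((r_out - r) / (r_out - r_in), 0.0, 1.0)
    # cosine taper for a smooth (soft) edge
    return 0.5 - 0.5 * np.cos(np.pi * t)

# e.g. a 128 px box with an 80 px inner and 100 px outer mask diameter
mask = soft_circular_mask(box_px=128, inner_d_px=80, outer_d_px=100)
```

The mask equals 1.0 at the box center, falls smoothly through the soft-edge annulus, and is exactly 0.0 in the corners, so masked class averages have no hard edges that would ring in Fourier space.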
Class averages of filaments can be aligned such that the filament is vertical in each 2D class. This allows for estimates of an approximate in-plane rotation for each filament image. Note that this is only approximate, and does not attempt to determine the polarity of the class averages.
This parameter is turned on by default for helical targets. It should be left off for non-helical targets, since there is no meaningful sense of “vertical” for non-helical objects.
When particles are picked, the same physical particle may be picked and extracted two or more times at different positions on the particle, resulting in duplicate particles in the dataset. Over the course of 2D Classification, the duplicated particle images will be aligned to the same 2D class, and their center positions will be updated. Ideally, this will result in the center positions of the duplicate particle images moving towards the same point on the physical particle.
In general, this is a desirable effect. Particle images should become more centered as 2D Classification progresses. However, the presence of duplicate particles causes issues in downstream processing, and so they must be removed from the particle stack before progressing.
This parameter detects and removes duplicate particles at the end of 2D Classification if their picked positions, plus the translations of the center position modeled by the job, are within the radius specified by Minimum separation distance (A).
Note that the input particles must have locations for this function of 2D Classification to work — particles imported as a particle stack with no attached micrographs (and therefore no location information) cannot have duplicates removed.
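The duplicate-removal rule can be illustrated with a greedy sketch that keeps a particle only if its updated center is at least the minimum separation distance from every particle already kept. This is a simplified stand-in, not CryoSPARC's implementation:

```python
import numpy as np

def remove_duplicates(positions_A, min_sep_A):
    """Greedy duplicate removal: keep a particle only if it is at least
    min_sep_A (in Angstroms) from every particle already kept."""
    kept = []
    for p in positions_A:
        if all(np.hypot(*(p - q)) >= min_sep_A for q in kept):
            kept.append(p)
    return np.array(kept)

# Updated particle centers after 2D Classification; the first two picks of
# the same physical particle have converged to nearly the same point.
centers = np.array([[100.0, 100.0], [105.0, 102.0], [400.0, 250.0]])
unique = remove_duplicates(centers, min_sep_A=20.0)
print(len(unique))  # 2 -- the second pick is flagged as a duplicate
```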
By default, 2D Classification creates class averages by averaging together all the particles in a class, each at their best pose. In other words, each particle is forced to use its maximum probability pose during averaging, hence the name "Force max". This parameter is on by default, and turning it off instead causes the job to marginalize over pose — essentially “blurring” each particle image out over all likely poses. For a more thorough explanation of this process, please see the Expectation Maximization page.
Turning this parameter off can, in some cases, improve 2D classification results with small or low-contrast particles. However, it can often take more iterations for the class averages to converge. We therefore recommend increasing the Number of online-EM iterations when this option is turned off. Increasing the Batchsize per class may also improve results.
Turning this parameter off increases the amount of time needed to perform a 2D Classification. For example, classifying 50,000 particles from EMPIAR-10261 into 50 classes took 11 minutes with Force max turned on, and 61 minutes with Force max turned off.
The parameters discussed in this section are Number of online-EM iterations, Number of final full iterations, and Batchsize per class.
In 2D Classification, the class averages are initialized randomly. These random class averages contain no signal and are far from any true projection of the 3D target. Thus, even providing a small amount of good information will likely result in a significant improvement of the average.
At early iterations of classification, it would be inefficient to compare the poor quality random class averages to every single particle. CryoSPARC therefore performs several iterations in which only a subset of the particles are aligned to the class averages.
These iterations are called online-EM iterations, and Number of online-EM iterations controls the number of online-EM iterations performed by the job. Typically 20 is a good starting place, but if the particles have low signal-to-noise ratios, or if Force max over poses/shifts is turned off, more iterations will likely be required (for example, 40).
Batchsize per class controls the number of particles used in these online-EM iterations. By choosing a number of particles per class instead of the total number, the amount of information provided to each class is held constant regardless of the number of classes. Again, we find the default of 100 is generally sufficient, but
if the particles have a low signal-to-noise ratio,
if Force max over poses/shifts is turned off, or
if there is a rare particle type present in the sample
it may be beneficial to increase this parameter. A good starting value may be between 200 and 500.
Once these online-EM iterations have completed, the class averages are typically very high quality. They therefore need significantly more information to improve, so 2D Classification performs a number of final iterations in which all particles are used. These are the Number of final full iterations, and one is typically sufficient.
By default, the 2D class average formation model is allowed to make some pixels in the model negative. Turning this parameter on forces each pixel in the class average to be greater than or equal to zero. This constraint on the 2D class averages can substantially change how the classes look, and can improve results in some cases, but can also create streaking or textural artefacts in some cases. Given this, we generally do not recommend turning this parameter on.
This parameter is similar to Enforce non-negativity in that it imposes a constraint on the 2D class averages. Specifically, when this parameter is turned on, the corners of the class average must be zero, and the class average image must also be smooth. This constraint generally results in classes with a "flattened" background (i.e. solvent) region, and can improve results in some cases, but can also create streaking or textural artefacts in others. Given this, we generally do not recommend turning this parameter on.
For a typical cryo-EM workflow, particle images are extracted from a micrograph and have not been corrected for the Contrast Transfer Function. If particle images have been premultiplied or do not need CTF correction (e.g., negative stain data), this parameter should be turned off.
This parameter turns off CTF correction by setting the following CTF parameters in the particles’ metadata:
Amplitude contrast is set to 1.0
Defocus is set to 0.0
Spherical aberration is set to 0.0
These values are set for the particles and retained in the output, so subsequent jobs using the output particles will also not correct for the CTF. If CTF correction needs to be re-activated at a later step, the CTF parameters from a prior job can be connected using the low-level interface, or the particles can be re-extracted from micrographs with Force re-extract CTFs from micrographs turned on.
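The effect of this parameter on particle metadata can be sketched as follows. The field names follow cryosparc-tools conventions (`ctf/amp_contrast`, `ctf/df1_A`, etc.) but should be treated as illustrative here:

```python
import numpy as np

# A toy stand-in for a particle dataset's CTF-related columns.
n = 4
particles = {
    "ctf/amp_contrast": np.full(n, 0.07),
    "ctf/df1_A": np.full(n, 15000.0),   # defocus along axis 1 (Angstrom)
    "ctf/df2_A": np.full(n, 14500.0),   # defocus along axis 2 (Angstrom)
    "ctf/cs_mm": np.full(n, 2.7),       # spherical aberration (mm)
}

def disable_ctf(p):
    """Mimic 'CTF correction off': amplitude contrast 1.0, defocus 0.0,
    spherical aberration 0.0 -- making the CTF effectively constant."""
    p["ctf/amp_contrast"][:] = 1.0
    p["ctf/df1_A"][:] = 0.0
    p["ctf/df2_A"][:] = 0.0
    p["ctf/cs_mm"][:] = 0.0

disable_ctf(particles)
```

Because these values are written into the particle metadata itself, every downstream job that reads the output will see the "flat" CTF, which is why re-extraction or a low-level connection is needed to restore correction.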
During alignment, particles are assigned fractional weights to some number of classes. Then, during backprojection (the step in which the class averages are updated), the particles contribute to each class according to their weight in that class. This allows the updated class averages to reflect uncertainty in the class assignments of an individual particle.
In some cases, it can be beneficial to force the final class reconstructions to use only each particle’s “best” class (i.e., the class for which that particle has the highest weight). For instance, if a large number of good particles have a small weight in a junk class, they may make that class appear better than it truly is.
Turning this parameter on forces the particles to contribute all of their weight to their single best class. Note that this applies only to the final iteration, the iteration in which the output 2D class average images and classifications are produced.
Particles output from a 2D Classification job have a class membership and a 2D pose (translation and in-plane rotation). This allows for removal of, say, all particles in a particular class using a Select 2D Classes job.
The class averages output is simply the collection of images for each class; it does not contain any information about the particles themselves. This output is used by subsequent jobs to find the class average images.
If Remove duplicate particles is turned on, any particles rejected as duplicates are output in this group.
At each iteration of 2D Classification, the current 2D class averages are displayed to help assess convergence. In addition, several diagnostic plots are produced.
At each iteration, the current class averages are plotted. The following information is overlaid on the class averages as well:
The number of particles in each class (top of each class)
A scale bar to help assess particle size (left side of left-most class, every other row)
The resolution at which the half-sets’ Fourier Ring Correlation (FRC, 2D equivalent to the FSC) correlate at 0.5 (bottom-left of each class)
At each iteration, particles are assigned to classes based on the error between that class’s average and the particle image. CryoSPARC marginalizes over class, meaning that each particle contributes to any number of classes, with more of that particle’s weight going to classes to which it is more similar.
This histogram measures the effective number of classes to which each particle is contributing. If a particle is confidently assigned to only one class, it will have an effective sample size (ESS) of 1.0. If a particle splits its density between three classes in a ratio of 50% / 25% / 25%, its effective sample size will be approximately 2.7. A particle which is equally likely to belong to all classes will have an effective sample size equal to the number of classes.
Formally, the effective sample size of an image $X_i$ is the reciprocal of the sum of that particle’s squared class probabilities:

$$\mathrm{ESS}(X_i) = \frac{1}{\sum_{k} p_{ik}^2}$$

where $p_{ik}$ is the probability that $X_i$ belongs to class $k$.
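The formula is easy to check directly (`effective_sample_size` is a hypothetical helper written for this example, not part of CryoSPARC):

```python
import numpy as np

def effective_sample_size(p):
    """ESS of one particle's class probabilities: 1 / sum(p_k^2)."""
    p = np.asarray(p, dtype=float)
    return 1.0 / np.sum(p**2)

print(effective_sample_size([1.0, 0.0, 0.0]))        # 1.0: one confident class
print(effective_sample_size([0.5, 0.25, 0.25]))      # ~2.67 for a 50/25/25 split
print(effective_sample_size([0.25, 0.25, 0.25, 0.25]))  # 4.0: uniform over 4 classes
```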
Early in the 2D Classification job, class assignments are not confident because the quality of the class averages is very poor. Therefore, most particles have a relatively high effective sample size:
As classification continues, the 2D class averages improve, which in turn makes class assignments more confident. Particles therefore have a lower effective sample size as classification converges:
The quartiles of the effective sample size are also output at the end of each iteration in the Event Log:
Several problems can cause effective sample size to remain high through the end of the 2D Classification job:
Overlapping classes: if two or more classes look similar, the job assigns particles to all of them. If you see classes which look similar by eye and effective sample size remains high, this is a likely cause. The high effective sample size is most likely not a concern in this case.
Incomplete classification: if effective sample size remains high and 2D classes still look noisy or blurry, the classification may not have converged yet. 2D Classification should be repeated with an increased number of online-EM iterations to give the job more time to converge.
Poor data quality: if the input particle stack is overwhelmingly junk picks or has low signal-to-noise ratio, class assignments may never become confident. In this case, it may be beneficial to
These plots provide another way of assessing the confidence of class assignments. Rather than plotting the number of classes each particle is assigned to, these plots show the probability that each particle has for its most likely class. For instance, if a particle is assigned to only one class, the probability of its best class would be 1.0. If a particle splits its density between three classes in a ratio of 50%/25%/25%, the probability of its best class will be 0.5.
Informally, you could think of this value as how confident the 2D Classification job is that, if you forced it to pick a single class for each particle, it would get the class assignment right.
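A minimal sketch of this diagnostic (a hypothetical helper, not CryoSPARC code):

```python
import numpy as np

def best_class_probability(p):
    """The probability mass a particle assigns to its most likely class."""
    return float(np.max(p))

print(best_class_probability([0.5, 0.25, 0.25]))  # 0.5
print(best_class_probability([1.0, 0.0, 0.0]))    # 1.0: fully confident
```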
Early in the classification, the class averages are too noisy for confident assignment, so the probability of the best class will be low for most particles:
As classification proceeds and the class averages improve, particles will be assigned with more confidence and the histogram will move to the right:
The quartiles of the probability of best class are also output at the end of each iteration in the Event Log:
If the probability of the best class remains clustered at low probabilities, the potential causes and solutions are similar to those of a constant high effective sample size. That section of this page provides potential troubleshooting steps.
By far the most common pathology in 2D Classification is streaky or noisy classes which have no clear protein features, like those above. In general, if your 2D Classification results look like those above, changing the following parameters may help:
Turn Force max over poses/shifts off.
Increase the Number of online-EM iterations. Increasing the default of 20 up to 40 is a good starting place, but as many as 80 may ultimately be required. Experimentation is usually required to find the right value for a given dataset. This setting helps because when particles have low signal-to-noise ratios, the class averages need more iterations to accumulate enough signal to converge.
Increase the Batchsize per class, especially if the particle has a low signal-to-noise ratio. A good starting value is perhaps 200, but a higher value may be required. Changing this setting gives the algorithm more information each time it updates the 2D class averages. This is especially helpful in combination with turning off Force max over poses/shifts, because it counteracts the “blurring” effect of pose marginalization by increasing the total number of particles each iteration sees.
Unfortunately, all of the above parameter changes make the job significantly slower. They are thus not set this way by default.
In cryo-EM processing generally, there is no way to algorithmically identify a class average as a catch-all “junk class”. Instead, one hopes that requesting a greater number of classes creates additional classes into which junk particles can segregate. This only works if the junk particles look more like each other than they do the good particles — at the end of the day, particles are always assigned to the class they match the best.
This creates a challenge for any cryo-EM processing algorithm. We must be confident enough in our class averages to properly assign particles among them, but must also allow them to change enough that they properly reflect the true data. The classification confidence is tuned by the Initial classification uncertainty factor (ICUF) parameter.
When the ICUF is high, 2D Classification treats the class averages as unreliable during the early iterations. This means that even if a particle aligns well to a given class average, it contributes to other classes as well because all alignments are treated as lower quality. This forces all classes to start to look similar in the early iterations. Conversely, a low ICUF means the job will treat all alignments as higher quality right from the first iteration, meaning that classes which begin different will remain different. This often results in a final set of class averages which look more different from each other, but may have fewer unique views of the same object.
Changing the ICUF has two main effects. First, with a lower ICUF, class averages stabilize more quickly, since particles are “blurred” across classes to a lesser degree. Note that this effect is highly dataset-dependent. If the particles are small or have low SNR, their class averages will be treated as high quality when in fact they are very poor, which can prevent convergence entirely.
Second, if the ICUF is increased, all classifications are treated as more uncertain. If a particle has a clear best class assignment but ICUF is high, it will still contribute some information to classes for which it has a poor score, since the algorithm is forced to be uncertain about the good alignment. This, in turn, makes those low-quality class averages better, so in future iterations the particle will actually align better to those classes and the cycle will continue.
The end result is that 2D Classification jobs with a higher ICUF tend to have more classes which look like each other, and more classes that look like the "average" particle, rather than rare objects or junk.
Note that this is not necessarily an unalloyed good. If 2D class averages will be used for Template Picking, having more views is good; however, if the class averages will be used to filter bad particles, having more good views from the same particles may mean that junk particles have been distributed among the good classes, which makes them impossible to remove by selecting or excluding classes.
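One loose way to build intuition for the ICUF is a softmax "temperature" acting on hypothetical per-class match scores: raising the uncertainty flattens the class weights, spreading a particle's contribution across classes. This analogy is purely illustrative and is not CryoSPARC's actual implementation:

```python
import numpy as np

def class_weights(scores, uncertainty):
    """Softmax of match scores with a temperature-like uncertainty factor."""
    s = np.asarray(scores, dtype=float) / uncertainty
    e = np.exp(s - s.max())          # numerically stable softmax
    return e / e.sum()

scores = np.array([10.0, 6.0, 5.0])  # hypothetical per-class match scores

confident = class_weights(scores, uncertainty=1.0)   # low-ICUF analog
uncertain = class_weights(scores, uncertainty=10.0)  # high-ICUF analog

# With high uncertainty the particle spreads weight across classes, so even
# poorly-matching classes receive some of its signal.
print(confident.max() > uncertain.max())  # True
```

In this toy model, the low-uncertainty weights concentrate almost entirely on the best class, while the high-uncertainty weights stay relatively even, mirroring how a high ICUF lets poorly-matching classes continue to accumulate signal in early iterations.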
2D Classification is fast and scales to very large particle sets well, which makes it an appealing job for particle curation. However, it can be difficult to visually identify good class averages, especially for an unfamiliar target. Moreover, good particles may end up classified into junk classes and vice-versa. This makes particle curation with 2D Classification somewhat risky in the sense that it is possible to inadvertently exclude good particles from downstream processing. Therefore, in general, we recommend that only the most obvious junk classes are removed with 2D Classification, and additional curation to retain the best particles be done in 3D.
For example, consider this 2D Classification result using images of a GPCR from EMPIAR 11350 (Akasaka et al. 2022). First, a 2D Classification job is run requesting 200 classes and with Force max over poses/shifts on (default). This job completes in twenty-five minutes on one GPU and produces four classes which have clear GPCR features and 196 classes which do not.
Turning off Force max improves the quality of 2D classes, with thirteen good classes and 187 bad classes:
However, this 2D classification job took 1,571 minutes (26 hours and 11 minutes) to complete on the same single GPU as the previous job — a sixty-fold slowdown. The resulting map is significantly improved both in terms of resolution (4.25 Å) and visible map quality:
all particles (i.e., the extracted particles before any 2D classification was performed)
one good class (e.g., the first Ab Initio map, or perhaps a similar target solved previously)
several bad maps (e.g., maps from iteration 0 of an Ab Initio Reconstruction, or some other noise volume)
then good particles tend to sort into the first map, while the others collect junk:
The resulting good GPCR map contains 141,216 particles, approximately three times as many as the 2D classification job with Force max off. Additionally, this Heterogeneous Refinement job finishes in ninety-seven minutes on only one GPU: approximately four times slower than the 2D Classification job with Force max turned on, but producing a significantly improved final result.
Performing Non-Uniform Refinement on the good class from the Heterogeneous Refinement job produces the best map by far, reaching 3.4 Å and displaying clearly improved map quality:
In contrast, the particles selected via Heterogeneous Refinement have a qualitatively more even distribution of particles (note that the color scale differs between the two plots) and a significantly improved cFAR score of 0.43.
It may be, then, that 2D classification struggles to separate a particular view of this GPCR from noise or empty micelles. This results in those particles being excluded from downstream analysis despite the fact that they are actually of high quality.
As a final sanity check, a 2D Classification job (with Force max off) on the particles selected by Heterogeneous Refinement shows 56,750 particles in 21 class averages with clear GPCR density, indicating that these particles are indeed good picks which were hidden by the preponderance of junk particles in the initial 2D Classification jobs.
The remaining classes are likely a mix of junk particles and good particles which cannot easily be distinguished from junk in 2D methods.
In conclusion, for smaller targets and targets for which not all views have clear features, it may be beneficial to skip 2D Classification altogether and perform particle curation using only 3D methods, like Ab Initio Reconstruction and Heterogeneous Refinement.
Force max over poses/shifts switches from the default mode of maximization over pose (top) to marginalization over pose (bottom). In maximization, only the most likely pose is used to contribute to the class average. The others are discarded entirely. When we marginalize over pose, each pose is weighted by its probability. Then those weighted images are combined, and that combination is added to the class average.
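The difference between the two modes can be sketched directly. Here `posed_images` stands in for one particle image rendered at several candidate in-plane poses, with a probability for each pose (all values hypothetical):

```python
import numpy as np

rng = np.random.default_rng(2)

# One particle at 4 candidate in-plane poses (flattened to 1D for simplicity),
# with a probability assigned to each pose.
posed_images = rng.normal(size=(4, 32))
pose_probs = np.array([0.7, 0.2, 0.05, 0.05])

# Force max ON: only the single most likely pose contributes to the average.
max_contrib = posed_images[np.argmax(pose_probs)]

# Force max OFF: marginalize over pose, i.e. sum the pose-weighted images.
marginal_contrib = (pose_probs[:, None] * posed_images).sum(axis=0)
```

When the probability peak is sharp (one pose near 1.0), the two contributions are nearly identical; when probability is spread across poses, marginalization produces a genuinely different (blurred) contribution, at the cost of backprojecting every pose instead of one.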
Force max is on by default for computational efficiency. In principle, it is best to weight particle poses by their probability to account for uncertainty; however, in most cases the probability peak is very sharp and well defined, so there is little difference between the two settings.
This is why turning Force max off mostly improves alignment of small, low-SNR particles. In these cases, the probability is distributed more broadly over the poses, so the averages produced by marginalization are significantly different from those produced by maximization. This is also why turning Force max off slows 2D Classification jobs; each backprojection takes time, and when marginalizing over pose the job must backproject each particle in a number of poses instead of just one.
Kumar, K. et al. Structure of a Signaling Cannabinoid Receptor 1-G Protein Complex. Cell 176, 448-458.e12 (2019).
Xu, H. et al. Structural Basis of Nav1.7 Inhibition by a Gating-Modifier Spider Toxin. Cell 176, 702-715.e14 (2019).
Akasaka, H. et al. Structure of the active Gi-coupled human lysophosphatidic acid receptor 1 complexed with a potent agonist. Nature Communications 13, 5417 (2022).
2D Classification is fast and useful in many cases. However, it also has important limitations and caveats to consider. We encourage all users to read the sections below and keep them in mind while preparing to run or analyze a 2D Classification job. The common issues section discusses the two most common failure modes of 2D Classification: streaky or noisy classes, and results with too many or too few junk classes. It also provides some suggested parameter tweaks to improve results when these issues are observed. The final section provides an in-depth explanation of why, in some cases, 3D particle curation workflows are recommended over 2D classification.
this error means you have connected particle locations only, and must extract them (using the Extract From Micrographs job) before performing 2D Classification.
2D Classification also requires images to have CTF estimates, which will be included when particles are extracted from micrographs with CTF estimates (e.g., from Patch CTF Estimation).
It can be helpful to increase this parameter when good and bad particles are expected to look very similar. This may avoid the classification getting stuck in a local minimum in which junk particles have been confidently grouped into a good class or vice-versa. See the discussion of the Initial classification uncertainty factor for more information.
Because the CTF delocalizes particle signal beyond the particle boundary, it is important to extract particles with a box size that is much larger than the particles themselves. When grids are crowded, a large box size can result in neighboring particles also being present in the image (i.e., crowding). In this case, alignments suffer because the contaminating neighbor is also considered in the alignment. At its worst, this effect can result in several 2D class averages with two or more particles in them, rather than one particle at the center of each average.
The median for particles in each class (bottom right of each class)
This plot displays the noise model for the current iteration. Essentially, the noise model measures how reliable the images and class averages are at each resolution shell. Current sigma (in blue) is the noise model used in the current iteration while aligning particles. It may be different from the current noise (in orange), which is calculated directly from the images and averages, due to annealing parameters. Finally, initial sigma (in green) is the noise calculated from 100 random images at the start of 2D Classification. More information about the noise model and its impact on classification and alignment is available on the Expectation Maximization page.
use an upstream particle curation job to reduce the number of obvious empty ice or contaminant particles,
try the advice in the common issues section, or
try proceeding directly to 3D methods such as Ab Initio Reconstruction and Heterogeneous Refinement.
A 2D Classification job is almost always followed by a Select 2D Classes job to remove particles belonging to classes which appear to be junk or contaminant. Alternately, if a 3D volume for the target already exists, it can be used to pick 2D classes based on their similarity to projections of the reference volume.
Once 2D Classification produces classes which are free of obvious contaminants and noise, users typically move on to 3D methods, starting with Ab Initio Reconstruction. Note that the 2D pose estimates are not used in subsequent 3D jobs, so it is not strictly necessary to perform 2D Classification before 3D techniques.
The selected classes have only 28,953 particles. After ab initio reconstruction and refinement, the resulting map refines to 6.8 Å and shows evidence of overfitting, orientation bias, and other pathologies:
This is a reasonable result. However, rather than using 2D Classification only, we could instead filter good particles from bad using a Heterogeneous Refinement job. If we provide a Heterogeneous Refinement job with
Furthermore, the results of Orientation Diagnostics jobs run on each of the refined maps above reveal one potential reason for this significant improvement: the particles which were selected by 2D classification are missing an entire set of views, yielding a cFAR score of 0.09. cFAR scores below 0.5 generally indicate map anisotropy (i.e., that orientation bias is harming the map quality).