Job: Rebalance 2D Classes
Group 2D class averages into superclusters and, optionally, balance the number of particles in each supercluster.
Particles must have been 2D Classified, and should come from the same job as the 2D Class Averages.
Particles and 2D class averages should come from the same job. Note that Rebalance 2D Classes analyzes the class averages and not the particles themselves, so results will be better with higher-quality (i.e., less noisy) class averages.
Particles will be dropped from the larger superclasses such that the smallest superclass is at least this fraction of the size of the largest superclass. A Rebalance factor of 0.0 does not discard any particles. A Rebalance factor of 1.0 randomly discards particles from all superclasses (except the smallest) until all superclasses are the same size.
This should optimally be set to the number of unique views in the 2D class averages. This number is typically not known precisely, and so some experimentation is often necessary. Note that if Do superclassification is turned off, this number must equal the number of 2D class averages in the input.
Provided the Rebalance factor is not 0.0, this parameter will determine the maximum superclass size rather than the Rebalance factor. Setting this parameter to some integer $N$ is functionally equivalent to setting the Rebalance factor to $n/N$, where $n$ is the number of particles in the smallest class.
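The relationship between the Rebalance factor and Override maximum superclass size can be illustrated with a short Python sketch. This is not the job's implementation: the function names `superclass_cap` and `rebalance` are hypothetical, the labels are synthetic, and rounding the cap down to an integer is an assumption.

```python
import numpy as np

def superclass_cap(sizes, rebalance_factor, max_size=None):
    """Largest number of particles any superclass may keep (None = no cap)."""
    if rebalance_factor <= 0.0:
        return None  # a factor of 0.0 discards nothing
    if max_size is not None:
        return int(max_size)  # the override replaces the factor-derived cap
    smallest = min(sizes)
    # smallest / cap >= factor  is equivalent to  cap <= smallest / factor
    return int(smallest / rebalance_factor)  # assumption: round down

def rebalance(labels, rebalance_factor, max_size=None, seed=0):
    """Return a boolean mask: True = particle kept, False = particle excluded."""
    labels = np.asarray(labels)
    rng = np.random.default_rng(seed)
    ids, counts = np.unique(labels, return_counts=True)
    cap = superclass_cap(counts, rebalance_factor, max_size)
    keep = np.ones(labels.shape[0], dtype=bool)
    if cap is None:
        return keep
    for superclass, count in zip(ids, counts):
        if count > cap:
            members = np.flatnonzero(labels == superclass)
            # Randomly pick the surplus particles to exclude.
            drop = rng.choice(members, size=count - cap, replace=False)
            keep[drop] = False
    return keep

# Three synthetic superclasses with 1000, 400, and 100 particles.
labels = np.repeat([0, 1, 2], [1000, 400, 100])
kept = rebalance(labels, rebalance_factor=0.5)
print(np.bincount(labels[kept]))  # → [200 200 100]
```

With a smallest superclass of $n = 100$ particles, passing `max_size=400` gives the same superclass sizes as a Rebalance factor of $100/400 = 0.25$.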
Particles remaining in the dataset after rebalancing classes according to the Rebalance factor (or Override maximum superclass size).
The templates are unchanged from the input.
Particles excluded from the dataset after rebalancing classes according to the Rebalance factor (or Override maximum superclass size).
Rebalance 2D Classes creates an Affinity Matrix which displays how similar 2D classes are to each other. It is this affinity that is used to group the class averages into the requested number of superclasses.
Say we start with ten class averages and we want to group them into two superclasses.
First, we calculate the affinity of the classes for each other. The affinity is a measure of how similar the two classes look, and varies from 0.0 (not similar at all) to 1.0 (identical).
We can map the pairwise affinities on a matrix, where the row and column represent a specific class, and each cell is colored by the similarity between the class in its row and the class in its column.
If we rearrange the rows and columns such that classes with a high affinity for each other are adjacent, we can easily see the superclasses as square blocks along the diagonal of the matrix. This structure arises naturally in a well-clustered matrix, since the classes in a superclass all have high affinity for each other and low affinity for classes outside it.
If the matrix does not have a clear pattern of squares for each superclass, or if the superclasses have members which “project” darker colors in their row and column, it may be that a different number of superclasses is needed.
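The reordering described above can be sketched in Python. The affinity matrix and superclass labels here are synthetic, and the sketch does not reproduce how the job computes affinities or assigns superclasses. The helper names `reorder_by_superclass` and `block_contrast` are hypothetical, and the contrast score is an illustrative assumption rather than something the job reports.

```python
import numpy as np

def reorder_by_superclass(affinity, labels):
    """Permute rows and columns so members of each superclass are adjacent."""
    order = np.argsort(labels, kind="stable")
    return affinity[np.ix_(order, order)], order

def block_contrast(affinity, labels):
    """Mean affinity within superclasses minus mean affinity between them."""
    labels = np.asarray(labels)
    same = labels[:, None] == labels[None, :]
    off_diagonal = ~np.eye(len(labels), dtype=bool)
    within = affinity[same & off_diagonal].mean()
    between = affinity[~same].mean()
    return within - between

# Ten class averages in two superclasses, interleaved on purpose.
labels = np.array([0, 1, 0, 1, 0, 1, 0, 1, 0, 1])
rng = np.random.default_rng(0)
same = labels[:, None] == labels[None, :]
affinity = np.where(same, 0.9, 0.1) + rng.uniform(-0.05, 0.05, size=(10, 10))
affinity = (affinity + affinity.T) / 2  # pairwise affinity is symmetric
np.fill_diagonal(affinity, 1.0)         # each class is identical to itself

sorted_affinity, order = reorder_by_superclass(affinity, labels)
print(order)  # → [0 2 4 6 8 1 3 5 7 9]
print(round(float(block_contrast(affinity, labels)), 2))
```

After reordering, the top-left and bottom-right 5×5 blocks of `sorted_affinity` hold the high within-superclass affinities. A contrast near zero would correspond to a matrix without clear squares, the situation in which a different number of superclasses may be needed.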
This job is most useful as a diagnostic to assess distribution of particles among views before moving to 3D, and often the outputs are not directly used in following jobs.
The improved map may be useful for downstream tasks or for repeating particle picking, if the underrepresented views are present but not being picked. If, however, underrepresented views are simply not present in the micrographs, it is unlikely that this technique (or any other) will recover an isotropic map.
Wong, Wilson, et al. "Cryo-EM structure of the Plasmodium falciparum 80S ribosome bound to the anti-protozoan drug emetine." Elife 3 (2014): e03080.
Tan, Y. Z., et al. "Addressing preferred specimen orientation in single-particle cryo-EM through tilting." Nature Methods 14 (2017): 793–796.
Campbell, Melody G., et al. "2.8 Å resolution reconstruction of the Thermoplasma acidophilum 20S proteasome using cryo-electron microscopy." Elife 4 (2015): e06380.
Rebalance 2D Classes analyzes the input 2D class averages and groups similar class averages into superclasses. Optionally, particles can be randomly excluded from the more populated superclasses to balance the number of particles in each superclass.
In some rare cases, rebalancing particles among views can improve initial results in the case of severe orientation bias. If your Ab initio reconstruction shows evidence of severe bias (such as a flat map or a map with severe streaking), setting the Rebalance factor relatively high (e.g., 0.8) can improve results slightly.
If a 3D refinement of the particles exists, will provide quantitative description of orientation bias that may or may not exist in the particles, and whether or not that bias results in a significantly anisotropic map.
Similarly, if a 3D refinement of the particles exists, will directly rebalance the particles based on viewing direction rather than by 2D Class.