# Job: Rebalance 2D Classes

## At a Glance

<figure><img src="https://1916621962-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-M7DGv3GkRvGGpbVPCgg%2Fuploads%2F752f9pyy5ZXfv7q28onz%2Fcover-image.png?alt=media&#x26;token=a97a05b7-f855-443e-978f-1f7adeee920a" alt=""><figcaption><p>Class averages produced from EMPIAR 10028 data (Wong et al. 2014).</p></figcaption></figure>

Group 2D class averages into superclasses and, optionally, balance the number of particles in each superclass.

## Description

Rebalance 2D Classes analyzes the 2D class averages produced by [2D Classification](https://guide.cryosparc.com/processing-data/all-job-types-in-cryosparc/particle-curation/job-2d-classification) jobs and groups similar class averages into *superclasses*. Optionally, particles can be randomly excluded from the more populated superclasses to balance the number of particles in each superclass.

## Inputs

### Particles

Particles must have been 2D Classified, and should come from the same job as the 2D Class Averages.

### 2D Class Averages

Particles and 2D class averages should come from the same job. Note that Rebalance 2D Classes analyzes the class averages and not the particles themselves, so results will be better with higher-quality (i.e., less noisy) class averages.

## Commonly Adjusted Parameters

### Rebalance factor

<figure><img src="https://1916621962-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-M7DGv3GkRvGGpbVPCgg%2Fuploads%2FzCOGP1sldWxwGQmPvRon%2Frebal-factor_2.png?alt=media&#x26;token=86519633-061b-433b-9145-68192dfe4657" alt=""><figcaption></figcaption></figure>

Particles will be dropped from the larger superclasses such that the smallest superclass is at least this fraction of the size of the largest superclass. A `Rebalance factor` of 0.0 does not discard any particles. A `Rebalance factor` of 1.0 randomly discards particles from all superclasses (except the smallest) until all superclasses are the same size.
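The rule above can be sketched in a few lines of plain Python. This is only an illustration of the arithmetic, not the job's actual implementation: the function name `rebalance`, its arguments, and the per-particle label list are all hypothetical.

```python
import random
from collections import defaultdict


def rebalance(superclass_ids, rebalance_factor, seed=0):
    """Split particle indices into (kept, excluded) lists.

    superclass_ids: one superclass label per particle (hypothetical input).
    rebalance_factor: float in [0, 1]; 0.0 keeps every particle.
    """
    members = defaultdict(list)
    for index, label in enumerate(superclass_ids):
        members[label].append(index)

    if rebalance_factor <= 0.0 or not members:
        return list(range(len(superclass_ids))), []

    smallest = min(len(indices) for indices in members.values())
    # Largest allowed size so that smallest / largest >= rebalance_factor.
    max_size = int(smallest / rebalance_factor)

    rng = random.Random(seed)
    kept, excluded = [], []
    for indices in members.values():
        shuffled = indices[:]
        rng.shuffle(shuffled)  # random choice of which particles to drop
        kept.extend(shuffled[:max_size])
        excluded.extend(shuffled[max_size:])
    return sorted(kept), sorted(excluded)


# Three superclasses with 1000, 400 and 200 particles (made-up numbers).
ids = [0] * 1000 + [1] * 400 + [2] * 200
kept, excluded = rebalance(ids, 0.5)
sizes = [sum(1 for i in kept if ids[i] == label) for label in (0, 1, 2)]
print(sizes)  # [400, 400, 200]
```

With a factor of 0.5 the smallest superclass (200 particles) caps every other superclass at 400, so only the largest superclass loses particles.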

### Number of superclasses or templates (integer)

This should ideally be set to the number of unique views in the 2D class averages. This number is typically not known precisely, so some experimentation is often necessary. Note that if `Do superclassification` is turned off, this number must equal the number of 2D class averages in the input.

### Override maximum superclass size

Provided the `Rebalance factor` is not 0.0, this parameter will determine the maximum superclass size rather than the `Rebalance factor`. Setting this parameter to some integer $N$ is functionally equivalent to setting the `Rebalance factor` to $n/N$, where $n$ is the number of particles in the smallest superclass.
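As a worked example with made-up numbers (not taken from any dataset): if the smallest superclass holds $n = 2000$ particles and the override is set to $N = 5000$, the equivalent `Rebalance factor` would be

$$
\frac{n}{N} = \frac{2000}{5000} = 0.4
$$

so no superclass would keep more than 5000 particles, and the smallest superclass would be at least 40% of the size of the largest.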

## Outputs

### Particles selected

Particles remaining in the dataset after rebalancing classes according to the `Rebalance factor` (or `Override maximum superclass size`).

### Templates

The templates are unchanged from the input.

### Particles excluded

Particles excluded from the dataset after rebalancing classes according to `Rebalance factor` (or `Override maximum superclass size`).

### Plots

Rebalance 2D Classes creates an Affinity Matrix which displays how similar 2D classes are to each other. It is this affinity that is used to group the class averages into the requested number of superclasses.

Say we start with ten class averages and we want to group them into two superclasses.

<figure><img src="https://1916621962-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-M7DGv3GkRvGGpbVPCgg%2Fuploads%2FJxuuGgZjKSUavEV2062E%2Fclass-averages.png?alt=media&#x26;token=b57ae418-5e7e-4236-9dae-b208c851eb64" alt=""><figcaption><p>Class averages from EMPIAR 10096 (Tan et al. 2017)</p></figcaption></figure>

First, we calculate the affinity of the classes for each other. The affinity is a measure of how similar the two classes look, and varies from 0.0 (not similar at all) to 1.0 (identical).

<figure><img src="https://1916621962-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-M7DGv3GkRvGGpbVPCgg%2Fuploads%2FwD3CiAsEFBlgDQzUqSom%2Fwhat-is-score.png?alt=media&#x26;token=9b771acb-5f7f-4018-91de-0a37dc77c322" alt="" width="563"><figcaption></figcaption></figure>

We can map the pairwise affinities onto a matrix in which each row and each column represents a specific class, and each cell is colored by the similarity between the class in its row and the class in its column.
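To make the idea of an affinity matrix concrete, here is a minimal sketch. This page does not describe how the job actually computes affinity, so the sketch substitutes a stand-in metric: the Pearson correlation of pixel values, clipped to the range 0.0 to 1.0. The function names and the flattened toy "images" are hypothetical, and a real comparison would likely also need to account for in-plane rotations and shifts between class averages.

```python
import math


def affinity(a, b):
    """Stand-in similarity between two flattened images, in [0, 1].

    Assumption: Pearson correlation clipped to [0, 1]. The job's real
    metric is not described on this page.
    """
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    da = [x - mean_a for x in a]
    db = [y - mean_b for y in b]
    norm = math.sqrt(sum(x * x for x in da) * sum(y * y for y in db))
    if norm == 0.0:
        return 0.0  # a constant image carries no shape information
    correlation = sum(x * y for x, y in zip(da, db)) / norm
    return min(1.0, max(0.0, correlation))


def affinity_matrix(images):
    """Symmetric matrix of pairwise affinities; row i and column i are class i."""
    n = len(images)
    matrix = [[1.0] * n for _ in range(n)]  # each class is identical to itself
    for i in range(n):
        for j in range(i + 1, n):
            matrix[i][j] = matrix[j][i] = affinity(images[i], images[j])
    return matrix


# Toy 4-pixel "class averages": the first two share a pattern, the third is inverted.
images = [[0, 1, 0, 1], [0, 2, 0, 2], [1, 0, 1, 0]]
for row in affinity_matrix(images):
    print(row)
```

The first two toy images differ only in intensity scale and so score as highly similar, while the third is anti-correlated with both and is clipped to 0.0.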

<figure><img src="https://1916621962-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-M7DGv3GkRvGGpbVPCgg%2Fuploads%2FuS8SRe9RfaN6vY7Pgp4X%2Fun-sorted.png?alt=media&#x26;token=00c5e368-643c-4563-9c67-48b7d6ad7408" alt=""><figcaption></figcaption></figure>

If we rearrange the rows and columns such that classes with a high affinity for each other are adjacent, we can easily see the superclasses as square blocks along the diagonal of the matrix. This structure arises naturally in a well-clustered matrix, since the classes within a superclass all have high affinity for each other and low affinity for classes outside it.
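The rearrangement step can be sketched as follows, assuming each class has already been assigned a superclass label. How those labels are obtained is not covered here; the `labels` list, the function name, and the toy matrix are hypothetical.

```python
def sort_by_superclass(matrix, labels):
    """Reorder rows and columns so classes in the same superclass are adjacent."""
    # sorted() is stable, so classes keep their original order within a superclass.
    order = sorted(range(len(labels)), key=lambda i: labels[i])
    sorted_matrix = [[matrix[i][j] for j in order] for i in order]
    return order, sorted_matrix


# Toy affinity matrix: classes 0 and 2 look alike, as do classes 1 and 3.
matrix = [
    [1.0, 0.1, 0.9, 0.2],
    [0.1, 1.0, 0.2, 0.8],
    [0.9, 0.2, 1.0, 0.1],
    [0.2, 0.8, 0.1, 1.0],
]
labels = [0, 1, 0, 1]

order, sorted_matrix = sort_by_superclass(matrix, labels)
print(order)  # [0, 2, 1, 3]
for row in sorted_matrix:
    print(row)
```

After sorting, the high values (0.9 and 0.8) sit in two 2×2 blocks on the diagonal, which is the square structure described above.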

<figure><img src="https://1916621962-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-M7DGv3GkRvGGpbVPCgg%2Fuploads%2FK1Y5grlLHQkde4bKWUQ2%2Fsorted.png?alt=media&#x26;token=dbceb1d7-2cb8-4a1e-b50d-a4ffb87efc8f" alt=""><figcaption></figcaption></figure>

If the matrix does not have a clear pattern of squares for each superclass, or if the superclasses have members which “project” darker colors in their row and column, it may be that a different number of superclasses is needed.
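The job itself presents this check visually. As one hypothetical way to put a number on it, the sketch below compares the mean affinity within superclasses to the mean affinity between them. The function name and this particular measure are assumptions for illustration and are not something the job reports.

```python
def block_contrast(matrix, labels):
    """Mean affinity within superclasses and between them (off-diagonal pairs only)."""
    within, between = [], []
    n = len(labels)
    for i in range(n):
        for j in range(i + 1, n):
            target = within if labels[i] == labels[j] else between
            target.append(matrix[i][j])

    def mean(values):
        return sum(values) / len(values) if values else float("nan")

    return mean(within), mean(between)


matrix = [
    [1.0, 0.1, 0.9, 0.2],
    [0.1, 1.0, 0.2, 0.8],
    [0.9, 0.2, 1.0, 0.1],
    [0.2, 0.8, 0.1, 1.0],
]

# A grouping that matches the matrix: high within, low between.
print(block_contrast(matrix, [0, 1, 0, 1]))
# A poor grouping: within drops below between.
print(block_contrast(matrix, [0, 0, 1, 1]))
```

A small gap between the two means, or a within-superclass mean that falls below the between-superclass mean, would correspond to a matrix without clear squares.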

## Common Next Steps

This job is most useful as a diagnostic for assessing the distribution of particles among views before moving to 3D, and its outputs are often not used directly in subsequent jobs.

In rare cases of *severe* orientation bias, rebalancing particles among views can improve the initial results of [*Ab initio* reconstruction](https://guide.cryosparc.com/processing-data/all-job-types-in-cryosparc/3d-reconstruction/job-ab-initio-reconstruction). If your *Ab initio* reconstruction shows evidence of severe bias (such as a flat map or a map with severe streaking), setting the `Rebalance factor` relatively high (e.g., 0.8) can improve results slightly.

<figure><img src="https://1916621962-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-M7DGv3GkRvGGpbVPCgg%2Fuploads%2FviRd5aLnYpVfCfLoPjzc%2Frebal-ab-init.png?alt=media&#x26;token=f9c34e8c-2e26-495c-8933-3d78a89cf19c" alt=""><figcaption><p>Example Ab Initio maps produced from EMPIAR 10025 (Campbell et al. 2015)</p></figcaption></figure>

The improved map may be useful for downstream tasks or for repeating particle picking, if the underrepresented views are *present* but not being *picked*. If, however, the underrepresented views are simply not present in the micrographs, it is unlikely that this technique (or any other) will recover an isotropic map.

## Recommended Alternatives

If a 3D refinement of the particles exists, [Orientation Diagnostics](https://guide.cryosparc.com/processing-data/all-job-types-in-cryosparc/utilities/job-orientation-diagnostics) will provide a quantitative description of any orientation bias in the particles, and of whether that bias results in a significantly anisotropic map.

Similarly, if a 3D refinement of the particles exists, [Rebalance Orientations](https://guide.cryosparc.com/processing-data/all-job-types-in-cryosparc/particle-curation/job-rebalance-orientations) will directly rebalance the particles based on viewing direction rather than by 2D Class.

## References

1. Wong, Wilson, et al. "Cryo-EM structure of the Plasmodium falciparum 80S ribosome bound to the anti-protozoan drug emetine." *eLife* 3 (2014): e03080.
2. Tan, Y. Z., et al. "Addressing preferred specimen orientation in single-particle cryo-EM through tilting." *Nature Methods* 14 (2017): 793–796.
3. Campbell, Melody G., et al. "2.8 Å resolution reconstruction of the Thermoplasma acidophilum 20S proteasome using cryo-electron microscopy." *eLife* 4 (2015): e06380.

