Job: Subset Particles by Statistic



At a Glance

Split particles into groups based on the value of a statistic.

Description

Often, some statistic of an alignment or classification (such as per-particle scale or the distance a particle shifted during an alignment) indicates that several subpopulations exist in a particle stack. Subset Particles by Statistic simplifies the process of separating these populations, either by modeling the selected statistic using a Gaussian mixture model (GMM) with a selected number of components, or with manually entered thresholds.

In some cases, separating particles in this way is an efficient means of eliminating junk or otherwise low-quality particles. For an example of this use case, see the Subsetting By Per-Particle Scale section below.
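The "Cluster by gaussian fitting" behavior can be sketched in a few lines of NumPy. This is an illustrative re-implementation, not CryoSPARC's actual code: it fits a 1D GMM by expectation-maximization, then assigns each particle to its most probable component, numbering sets from lowest to highest mean as the job does.

```python
import numpy as np

def fit_gmm_1d(x, n_components, n_iter=200):
    """Fit a 1D Gaussian mixture by expectation-maximization (illustrative)."""
    # Initialize means from spread-out quantiles of the data.
    means = np.quantile(x, np.linspace(0.1, 0.9, n_components))
    stds = np.full(n_components, x.std() + 1e-6)
    weights = np.full(n_components, 1.0 / n_components)
    for _ in range(n_iter):
        # E-step: responsibility of each component for each particle.
        dens = weights * np.exp(-0.5 * ((x[:, None] - means) / stds) ** 2) \
               / (stds * np.sqrt(2.0 * np.pi))
        resp = dens / (dens.sum(axis=1, keepdims=True) + 1e-300)
        # M-step: re-estimate weights, means, and standard deviations.
        nk = resp.sum(axis=0) + 1e-12
        weights = nk / len(x)
        means = (resp * x[:, None]).sum(axis=0) / nk
        stds = np.sqrt((resp * (x[:, None] - means) ** 2).sum(axis=0) / nk) + 1e-9
    return weights, means, stds

def split_by_gmm(x, n_components):
    """Assign each particle to its most probable component, renumbered so
    that set 0 has the lowest mean (matching the job's output numbering)."""
    weights, means, stds = fit_gmm_1d(x, n_components)
    # The sqrt(2*pi) constant is dropped: it does not affect the argmax.
    dens = weights * np.exp(-0.5 * ((x[:, None] - means) / stds) ** 2) / stds
    labels = dens.argmax(axis=1)
    return means.argsort().argsort()[labels]
```

For two well-separated populations of per-particle scales, this recovers the low-scale and high-scale groups directly from the 1D statistic.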

Inputs

Subset Particles by Statistic can model the distribution of a statistic for a particular particle stack, or it can model the distribution of the difference in a statistic. When operating on a single statistic, just one Particles input exists. When operating on a difference, the Particles input disappears and is replaced with the Initial Particles and Final Particles inputs.

Particles

When operating on a single statistic, such as “Per-particle scale”, only this input is available.

The particles are separated using the statistic selected in the Subset by parameter. Output particles will have the same metadata as this input (e.g., the pose will remain unchanged by the filtering operation).

Initial Particles and Final Particles

When operating on a difference statistic, such as “Absolute difference in 3D shift (A)”, only this pair of inputs is available.

First, the absolute difference in the selected statistic is calculated between the Initial Particles and Final Particles. Taking Absolute difference in 3D shift (A) as an example, if one particle had a difference of (0, 1) between the two particle stacks and another had a difference of (-1, 0), both would be at the 1.0 position of the histogram.

The particles are then separated by the distribution of these differences. When particle groups are output, they inherit their metadata (e.g., poses) from the Final Particles input.

Since differences are absolute, Subset Particles by Statistic will find the same groups no matter which input a given particle stack is connected to. You can therefore always place the particle stack with your desired metadata (poses, CTF values, etc.) in the Final Particles input.
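A minimal NumPy sketch of the difference computation, using "Absolute difference in 2D shift (A)" as the statistic. The uids, shifts, and pixel size are made-up values, and matching stacks by uid is an assumption made for illustration:

```python
import numpy as np

# Hypothetical per-particle records: a uid plus a 2D shift (in pixels).
initial_uid = np.array([3, 1, 2])
initial_shift = np.array([[0.0, 1.0], [2.0, 0.0], [1.0, 1.0]])
final_uid = np.array([1, 2, 3])
final_shift = np.array([[2.0, 0.0], [1.0, 0.0], [-1.0, 1.0]])
psize_A = 1.06  # assumed pixel size in Angstroms

# Match particles between the two stacks by uid before differencing.
order = np.argsort(initial_uid)
idx = order[np.searchsorted(initial_uid[order], final_uid)]

# Absolute (unsigned) difference in shift, converted from pixels to Angstroms.
diff_A = psize_A * np.linalg.norm(final_shift - initial_shift[idx], axis=1)
```

Note that the second and third particles differ by (0, -1) and (-1, 0) pixels respectively, yet land at the same position in the histogram, because only the magnitude of the difference is kept.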

Commonly Adjusted Parameters

Subset by

This parameter selects the statistic by which particles will be separated into groups.

Statistic
Description

Per-particle scale

Per-particle scale accounts for local variations in greyscale and is typically taken as a proxy for ice thickness, but it is also affected by overall particle quality and other factors. Note that if the volume is highly anisotropic due to orientation bias, the per-particle scale may be unreliable.

Particle picking NCC score

The Normalized Cross Correlation (NCC) score measures how well the particle image matches the template, blob, or ring used to select the particle. It is only set during particle picking.

Particle picking power score

The power score measures the overall contrast of the patch surrounding a particle pick. It is only set during particle picking.

Average defocus (A)

The average defocus of a particle takes into account astigmatism and any CTF refinements that have been performed on the particle stack.

2D alignment error

The 2D alignment error is a measure of the mismatch between the particle image and the 2D class average to which it is aligned, in the particle’s optimal pose. While a higher 2D alignment error means the particle is a poorer match for its class average than other particles, this may be because the particle is low quality or because the class average represents a slightly different viewing direction. Thus, when 3D alignments are available, 3D alignment error should be preferred.

3D alignment error

The 3D alignment error is a measure of the mismatch between the particle image and the projection of the volume in the particle’s optimal pose. A higher error may generally correlate with poorer images, but the error is also affected by the volume’s overall quality.

Class probability - 2D

A particle’s 2D class probability is the probability assigned to that particle’s best class during 2D Classification. Note that this does not filter particles by which class they are assigned to, only the confidence of their assignment. See the Class Probability section of this page for more info. Probability modes typically work best with manual thresholds.

Class probability - 3D

When a particle is classified using Heterogeneous Refinement or 3D Classification, it is assigned a probability of belonging to each class. This mode filters by the sum or maximum of all probabilities for the classes selected with the Class indices parameter. See the Class Probability section of this page for more info. Probability modes typically work best with manual thresholds.

Class ESS - 2D

The Effective Sample Size (ESS) is an alternate measure of the confidence in a particle’s class assignment. Higher ESS values indicate lower confidence in the particle’s class. An ESS of 1.0 indicates that a particle is only a member of a single class. ESS modes typically work best with manual thresholds.

Class ESS - 3D

As above, but for the class assignments of a 3D classification.

Total motion (A)

Total motion measures the distance the particle moved while the movie was collected. More motion is generally considered to correlate with poorer images due to blurring.

X location (fraction)

The X location is a value ranging from 0.0 to 1.0, with 0.0 representing the left-most pixel in a micrograph and 1.0 the right-most.

Y location (fraction)

The Y location is a value ranging from 0.0 to 1.0, with 0.0 representing the bottom-most pixel in a micrograph and 1.0 the top-most.

Helical tilt angle (deg)

The helical tilt angle is only meaningful for helical proteins. It measures the angle of deviation of the particle image relative to a vertical helix.

3DVA component X

When particles are analyzed with 3D Variability Analysis, they are assigned a coordinate along each component. This statistic represents the particle’s value along the Xth component, where X is the value entered into 3DVA Component Number.

Absolute difference in 3D pose (deg)

The difference in 3D pose is the absolute difference in degrees between a particle’s pose in the initial and final datasets. Note that this single value takes into account the particle’s rotation in all three degrees of freedom.

Absolute difference in 3D shift (A)

The difference in 3D shift is the absolute difference in X/Y shift, refined by a 3D method, between the initial and final datasets. Note that in 3D refinements, there are still only two degrees of freedom for the shift.

Absolute difference in 2D pose (deg)

The difference in 2D pose is the absolute difference in 2D rotation between the initial and final datasets.

Absolute difference in 2D shift (A)

The difference in 2D shift is the absolute difference in X/Y shift, refined by 2D Classification, between the initial and final datasets.

Absolute difference in average defocus (A)

The difference in average defocus is the absolute difference in the average defocus (see above) between the initial and final datasets. Typically, the average defocus only changes during a Local CTF Refinement job, or a 3D refinement in which on-the-fly CTF refinement is enabled.
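To make the ESS behavior concrete, one widely used definition of effective sample size over class posteriors is the inverse participation ratio, 1 / Σ pᵢ². This formula is an assumption made here for illustration (CryoSPARC's exact definition is not spelled out on this page), but it reproduces the behavior described above: a particle firmly in one class has ESS 1.0, and higher values mean the probability is spread over more classes.

```python
import numpy as np

def class_ess(posteriors):
    """Effective sample size of a class-posterior vector, assumed here to be
    the inverse participation ratio 1 / sum(p_i^2) for illustration."""
    p = np.asarray(posteriors, dtype=float)
    return 1.0 / (p ** 2).sum()

# A particle entirely in one class has ESS 1.0; a particle split evenly
# over k classes has ESS k (maximum uncertainty).
```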

Class indices and Class probability filtering mode

These parameters only appear when Subset by is set to “Class probability - 3D”.

Class indices selects classes by index (starting with 0) over which the probability should be evaluated. Class probability filtering mode controls whether the particles should be retained if the sum or the maximum of the probabilities in the selected classes is greater than the threshold.

3DVA Component Number

This parameter only appears when Subset by is set to “3DVA component X”. It selects which component from a 3D Variability Analysis job is used to separate particles.

Note that 3DVA components are never removed from a particle stack, only updated. For example, consider a workflow in which you first run a 3DVA job with 4 components, then run a 3DVA job on the resulting particle stack with only 2 components. In this case, components 0 and 1 refer to the coordinates from the second job, but components 2 and 3 refer to the coordinates from the first job.

Curation mode

If Curation mode is set to “Cluster by gaussian fitting”, a Gaussian mixture model (GMM) is fit to the statistic selected with Subset by. The number of Gaussian components in the GMM is set by Number of gaussians, which only appears in this curation mode. Particles are then output in the set corresponding to the Gaussian component for which they have the highest probability.

If Curation mode is set to “Split by manual thresholds”, the user manually selects thresholds which divide the particle groups. The Number of thresholds parameter creates the selected number of Threshold N parameters, which are then used to divide the particles.
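In “Split by manual thresholds” mode, the grouping is simple binning by the chosen thresholds. A minimal NumPy sketch with made-up scales and thresholds (the values are illustrative, not from a real dataset):

```python
import numpy as np

# Hypothetical per-particle scales and two manually chosen thresholds.
scales = np.array([0.62, 0.95, 1.01, 1.30, 0.88])
thresholds = [0.8, 1.1]   # Number of thresholds = 2, giving 3 output sets

# np.digitize bins each value by the thresholds; set 0 holds the lowest
# values, matching the job's low-to-high set numbering.
set_index = np.digitize(scales, thresholds)
```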

Minimum probability to include particle in cluster (%)

This parameter sets the minimum probability a particle must have to be included in the output. A particle has a 0% probability of belonging to a cluster when we are certain it does not belong to the cluster, and a 100% probability when we are certain it does. In practice, particles will essentially never have a 0% probability but may have 100% probability.

For example, consider a particle stack with the following distribution of per-particle scales:

This distribution of per-particle scales looks like it has three distinct groups, so we will use a GMM with three Gaussians. The resulting model looks like this:

We can use this GMM to assign each particle to either the red, pink, or green group, depending on the height of the Gaussians at that position. We can then use these probabilities to split the particles into groups. Each particle is assigned into the group with the highest probability at that particle’s per-particle scale. This would result in the following groups:

With this technique, we keep all particles for future use. However, some particles have an equal probability of belonging to two Gaussian components. In this example, particles with per-particle scales around 1.0 are equally likely to belong to the red and pink components.

In some cases, it may be preferable to remove particles with an uncertain group assignment. This is what the Minimum probability parameter does. In this example, say we set the Minimum probability to 80%. This removes particles with low-confidence group assignments; in other words, we have traded reduced particle count for increased confidence. In this way, the Minimum probability parameter is conceptually related to the operation performed in the Class Probability Filter job.
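A short sketch of what the probability cutoff does, using made-up GMM responsibilities (these are illustrative numbers, not CryoSPARC output):

```python
import numpy as np

# Made-up GMM responsibilities for five particles over two components
# (each row sums to 1).
resp = np.array([
    [0.99, 0.01],
    [0.55, 0.45],   # ambiguous: near the crossing point of two Gaussians
    [0.10, 0.90],
    [0.81, 0.19],
    [0.50, 0.50],   # maximally ambiguous
])
min_prob = 0.80     # "Minimum probability to include particle in cluster (%)" = 80

labels = resp.argmax(axis=1)          # group with the highest probability
kept = resp.max(axis=1) >= min_prob   # drop low-confidence assignments
```

Only the first, third, and fourth particles survive the 80% cutoff; the two ambiguous particles are discarded rather than assigned to a group.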

Outputs

Particles set N

The separated input particles are output in N sets, depending on the selection of thresholds or GMM components. Sets are numbered from low to high, with set 0 having the lowest mean value for the selected statistic.

Note that if the selected statistic is a difference statistic, the metadata (including pose, CTF estimate, etc.) will come from the Final Particles input.

Common Next Steps

The next steps will depend on the selected statistic, but this job is often used to select a subset of interest for further refinement.

Separating particles by per-particle scale and removing the low-scale group can improve refinement results in some cases. See the Subsetting By Per-Particle Scale section below for an example of this use case.

Recommended Alternatives

If you are using this job to split particles up into evenly-spaced regions of 3DVA coordinate space, the Intermediates mode of 3D Variability Display may prove easier to use.
Subsetting By Per-Particle Scale

In this example, we performed a Non-Uniform Refinement of 241,210 TRPV1 particles (EMPIAR-10059) with C4 symmetry and Minimize over per-particle scale turned on. The resulting map had a GSFSC resolution of 2.69 Å.

The per-particle scales show a clear bi-modal distribution. Typically, per-particle scales are expected to be normally distributed about 1.0.

One explanation for the bimodal distribution is that there are two populations of particles: one in thick ice and one in thin ice. The particles in thick ice would need a higher per-particle scale to have the same greyscale as the particles in thin ice, giving rise to a bimodal distribution.

However, it is also possible that junk or low-quality particles do not match the TRPV1 volume very well even in their best orientation, and so are assigned a lower per-particle scale to reduce their overall error. If this is the case, filtering out these particles will yield a better reconstruction.

We can use Subset Particles by Statistic to separate the particles into two groups with a Gaussian Mixture Model. Subset Particles by Statistic also plots the viewing direction distribution of each set, which we can check to ensure that the different scales do not simply correlate with a specific viewing direction.

Although there is some difference between the two viewing direction plots, there’s no clear orientation bias in either one. We can now refine each of these sets separately and evaluate the particle quality based on the final map. Since cluster 1 contains more particles than cluster 0, a random subset of the particles from cluster 1 will be used to make the comparison even.

With the same number of particles in each refinement, the high-scale particles produce a significantly better map. It seems likely that the low scale factors in the original refinement were, in fact, accounting for remaining junk particles in the stack. Note also that both particle stacks’ per-particle scales have recentered around 1.0 — the actual value of the per-particle scale is a function of the overall range of the greyscale across the entire particle stack.

Class Probability

The Class Probability modes of Subset Particles by Statistic replace the functionality of Class Probability Filter, which is a legacy job as of CryoSPARC v4.7.

When particles are classified (in, e.g., 2D Classification, 3D Classification, or Heterogeneous Refinement) they are assigned a probability of belonging to each class. Once the job completes, particles are assigned to the class they have the highest probability of belonging to.

For example, consider a 3D Classification with 3 classes. A particle with probabilities of [0.8, 0.1, 0.1] for the three classes would be assigned to class 0 and included in the class 0 output. However, a particle with probabilities of [0.34, 0.33, 0.33] would also be assigned to class 0, even though the assignment of this second particle is far less confident than the first.

When a particle stack must be very clean (for instance, before performing 3D Flexible Refinement) it may be beneficial to keep only the most confidently-assigned particles. Taking the example above, suppose class 0 is the good class and classes 1 and 2 are junk. One could create a dataset which contained only the first particle (and others like it) using Subset Particles by Statistic and setting the following parameters:

Subset by: Class probability - 3D
Class indices: 0
Curation mode: Split by manual thresholds
Threshold 1: 0.75

In this example, classes 0 and 1 are good. The threshold is set to 0.75, and Class probability filtering mode is set to “sum”.

If instead classes 0 and 1 were both good, Class indices could be set to 0, 1. Particles would then be retained if the sum of their probabilities of belonging to class 0 and class 1 was greater than 0.75. In this case, particles with probabilities [0.8, 0.1, 0.1] or [0.5, 0.4, 0.1] would be kept. This makes sense if both class 0 and class 1 are high-quality classes of the same volume: particles may not be confidently assigned to one of the two, but they are definitely good if they are split among those two classes.

In this example, classes 0 and 1 are good, but represent different particles. The threshold is still set to 0.75, but now the Class probability filtering mode is set to “max”. Thus, only particles confidently assigned to one or the other good class are retained.

Now, suppose class 0 and class 1 are both good, but they represent different targets. Good particles should be confidently assigned to one of the two classes — particles which are split between two different-but-good volumes are likely low quality. In this case, Class probability filtering mode should be set to “max”. Now, the [0.8, 0.1, 0.1] particle will be kept but the [0.5, 0.4, 0.1] particle rejected.
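The sum-versus-max behavior can be checked with a few lines of NumPy, using the posteriors from the discussion above (made-up values for illustration):

```python
import numpy as np

# Made-up 3D class posteriors for three particles across three classes.
posteriors = np.array([
    [0.80, 0.10, 0.10],
    [0.50, 0.40, 0.10],
    [0.34, 0.33, 0.33],
])
class_indices = [0, 1]   # Class indices parameter
threshold = 0.75         # manual threshold

# "sum" mode: classes 0 and 1 are interchangeable (same volume).
keep_sum = posteriors[:, class_indices].sum(axis=1) > threshold
# "max" mode: classes 0 and 1 are distinct targets; demand a confident
# assignment to one of them.
keep_max = posteriors[:, class_indices].max(axis=1) > threshold
```

Under "sum", both the [0.8, 0.1, 0.1] and [0.5, 0.4, 0.1] particles pass; under "max", only the first does.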

CryoSPARC Dataset Fields

Subset Particles by Statistic operates by filtering particles based on the value (or differences in values) of a dataset field. For most users, this is an unimportant detail; for users interested in using CryoSPARC Tools to incorporate similar analyses in their scripts, we list the dataset fields related to each operation below.
Statistic
Field name(s)

Per-particle scale

The value of alignments3D/alpha_min

Particle picking NCC score

The value of pick_stats/ncc_score

Particle picking power score

The value of pick_stats/power

Average defocus (A)

The average of ctf/df1_A and ctf/df2_A

2D alignment error

The value of alignments2D/error

3D alignment error

The value of alignments3D/error

Class probability - 2D

The value of alignments2D/class_posterior

Class probability - 3D

The sum of the elements of alignments3D_multi/class_posterior selected by the Class indices parameter.

Class ESS - 2D

The value of alignments2D/class_ess

Class ESS - 3D

The value of alignments3D_multi/class_ess

Total motion (A)

The summed length of all steps of the particle’s path recorded in the file located at motion/path

X location (fraction)

The value of location/center_x_frac

Y location (fraction)

The value of location/center_y_frac

Helical tilt angle (deg)

The rotation about the imaging axis in degrees. This can be calculated from the pose recorded in rotation vector format in alignments3D/pose.

3DVA component X

The value of components_mode_X/value, where X is replaced by the component number.

Absolute difference in 3D pose (deg)

The absolute rotational difference between the particle’s alignments3D/pose in the two datasets.

Absolute difference in 3D shift (A)

alignments3D/psize_A times the Euclidean distance between the particle’s alignments3D/shift in each of the two datasets.

Absolute difference in 2D pose (deg)

The absolute value of the difference in alignments2D/pose of the particle in each of the two datasets, converted from radians into degrees.

Absolute difference in 2D shift (A)

alignments2D/psize_A times the Euclidean distance between the particle’s alignments2D/shift in each of the two datasets.

Absolute difference in average defocus (A)

The absolute value of the difference in the average of each particle’s ctf/df1_A and ctf/df2_A in the two datasets.
