
Job: Reference Based Auto Select 2D (BETA)



At a Glance

Select 2D Classes based on their similarity to a 3D reference.

Description

Reference Based Auto Select 2D selects 2D classes based on their similarity to a user-provided 3D volume. This allows for improved selection of broader sets of 2D classes in an automated workflow, and may help users select classes representing rare views.

Inputs

Particles

This input is optional. If provided, particles which belong to any class selected by Reference Based Auto Select 2D will also be retained.

These particles should come from the same job as the input 2D class averages.

2D class averages

These class averages will be compared to the reference volume. Classes which meet the user-selected criteria will be selected; the others will be rejected. Note that Reference Based Auto Select 2D compares the class averages to projections from the 3D volume — it does not compare the particle images to the volume at any point.

Reference volume

2D classes are accepted or rejected based on their similarity to projections of this volume. First, the 2D class averages are aligned to the volume in the same way that individual images are aligned to the volume in a 3D refinement. Correlations are then calculated between a projection of the volume along that pose and the 2D class average.

Commonly Adjusted Parameters

Exclude classes worse than resolution (A)

Classes with a resolution worse than this parameter (i.e., a higher numerical value) are rejected without being compared to the reference.
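
As a minimal sketch of this pre-filter, the following splits classes by resolution before any comparison to the reference. The function name and the `(class_id, resolution)` tuple layout are hypothetical and are not CryoSPARC's data model.

```python
def split_by_resolution(classes, max_resolution_A):
    """Split (class_id, resolution_A) pairs into kept and excluded class ids.

    A class is excluded when its resolution value is numerically larger
    (i.e., worse) than max_resolution_A. Hypothetical helper for illustration.
    """
    kept, excluded = [], []
    for class_id, resolution_A in classes:
        if resolution_A > max_resolution_A:
            excluded.append(class_id)
        else:
            kept.append(class_id)
    return kept, excluded


# Illustrative values only
classes = [(0, 6.5), (1, 12.0), (2, 25.3), (3, 9.9)]
kept, excluded = split_by_resolution(classes, max_resolution_A=10.0)
print("kept:", kept, "excluded:", excluded)
```

Only the kept classes would go on to be aligned and scored against the reference.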

Selection mode

Reference Based Auto Select 2D scores each 2D class against the reference and then selects classes using one of five modes: thresholds on Sobel scores, cluster, thresholds only, top N classes by correlation score, or top K percent by correlation score.

thresholds on Sobel scores

Thresholds on Sobel scores is a new selection mode available starting in CryoSPARC v4.7. It performs well on a variety of datasets and is the new default value for this parameter.

In this mode, two scores are calculated: a Sobel score and an overall correlation score.

Sobel scores, like HOG correlations, measure how similar the overall shapes of two images are. A lower Sobel score indicates a better match between the two images. Sobel scores do not have a defined range (unlike correlation scores, which range from 0 to 1). Therefore, the Sobel scores in Reference Based Auto Select 2D are normalized such that the best-matching pair of class average and volume projection has a Sobel score of 1.0 and all other pairs have a score greater than 1.0.
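
The normalization property can be illustrated with a short sketch. The text does not say how the rescaling is computed. Dividing every raw score by the lowest raw score is an assumption that merely reproduces the stated behaviour (best match equals 1.0, all others greater). The function name is hypothetical.

```python
def normalize_sobel_scores(raw_scores):
    """Rescale raw scores so the best (lowest) score becomes 1.0.

    ASSUMPTION: division by the minimum; the actual normalization used by
    the job is not specified in this guide.
    """
    best = min(raw_scores)
    if best <= 0:
        raise ValueError("raw scores must be positive to normalize by the minimum")
    return [score / best for score in raw_scores]


# Illustrative raw scores for three class averages (lower is better)
raw = [4.0, 2.0, 5.0]
print(normalize_sobel_scores(raw))
```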

The overall correlation score provides another means of separating images with similar Sobel scores. Higher correlation scores are better.

When the selection mode is set to thresholds on Sobel scores, you must provide the particle diameter to the Circular mask diameter (A) parameter. This diameter must be less than or equal to the Circular mask diameter (A) parameter setting from the source 2D Classification. The mask avoids spuriously high Sobel scores resulting from the high-contrast circle created by masking during 2D Classification.

Cluster

In all non-Sobel modes, two correlations are calculated for each 2D class average and its reference projection: the histogram of ordered gradients (HOG) correlation and the Pearson correlation coefficient.

The HOG correlation calculates the similarity in edge position and orientation between the class average and the projection — loosely, the similarity in overall shape. The Pearson correlation coefficient calculates the pixel-by-pixel similarity between the class average and the projection.
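
For readers unfamiliar with the pixel-by-pixel measure, here is a minimal pure-Python sketch of the standard Pearson correlation coefficient between two equally sized images. It is a textbook formula, not CryoSPARC's implementation, and the function name is hypothetical. The HOG correlation is not sketched here because it requires a gradient-histogram descriptor.

```python
import math


def pearson_correlation(image_a, image_b):
    """Pearson correlation coefficient between two equally sized 2D images."""
    a = [pixel for row in image_a for pixel in row]
    b = [pixel for row in image_b for pixel in row]
    if len(a) != len(b):
        raise ValueError("images must contain the same number of pixels")
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    covariance = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        raise ValueError("correlation is undefined for a constant image")
    return covariance / math.sqrt(var_a * var_b)


# A projection and a brightness-scaled copy of it correlate perfectly
projection = [[1, 2], [3, 4]]
class_average = [[3, 5], [7, 9]]  # 2 * projection + 1
print(pearson_correlation(class_average, projection))
```

Because the coefficient is invariant to overall scaling and offset of the pixel values, it compares image content rather than absolute intensity.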

First, the 2D class averages are clustered into groups with similar Pearson and HOG correlations. The algorithm checks clusterings using between two and ten clusters, and uses the best-fitting number of clusters. Next, the cluster which has, on average, the highest combined correlation score is selected and all others are rejected.
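
The final selection step of cluster mode can be sketched as follows. The clustering itself (trying two to ten clusters and keeping the best fit) is not shown. Cluster labels are assumed to come from any clustering routine, and the combined score is assumed to be the sum of the Pearson and HOG correlations. The function name is hypothetical.

```python
from collections import defaultdict


def select_best_cluster(labels, pearson, hog):
    """Return indices of classes in the cluster with the highest mean combined score.

    ASSUMPTION: combined score = Pearson + HOG; labels are precomputed.
    """
    combined_by_cluster = defaultdict(list)
    for label, p, h in zip(labels, pearson, hog):
        combined_by_cluster[label].append(p + h)

    def mean_score(label):
        values = combined_by_cluster[label]
        return sum(values) / len(values)

    best_label = max(combined_by_cluster, key=mean_score)
    return [i for i, label in enumerate(labels) if label == best_label]


# Illustrative values: cluster 0 holds the two well-matching classes
labels = [0, 0, 1, 1, 1]
pearson = [0.9, 0.8, 0.2, 0.3, 0.1]
hog = [0.7, 0.6, 0.2, 0.1, 0.3]
print(select_best_cluster(labels, pearson, hog))
```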

Thresholds only

The user provides thresholds for the Pearson and HOG correlations. Classes for which either correlation is worse than the threshold are rejected. Only classes which have both correlations better than the threshold are selected.
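
A minimal sketch of this rule, assuming each class carries a `(pearson, hog)` score pair. The function name and data layout are hypothetical. Treating a score exactly equal to its threshold as passing is also an assumption, since the text does not define that case.

```python
def select_by_thresholds(scores, pearson_threshold, hog_threshold):
    """Split class ids into selected and rejected lists.

    scores maps class_id -> (pearson, hog). A class is selected only when
    both correlations meet their thresholds.
    """
    selected, rejected = [], []
    for class_id, (pearson, hog) in scores.items():
        if pearson >= pearson_threshold and hog >= hog_threshold:
            selected.append(class_id)
        else:
            rejected.append(class_id)
    return selected, rejected


# Illustrative values: class "b" fails on Pearson, class "c" fails on HOG
scores = {"a": (0.8, 0.7), "b": (0.3, 0.9), "c": (0.9, 0.2)}
print(select_by_thresholds(scores, pearson_threshold=0.5, hog_threshold=0.5))
```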

Top N classes by correlation score

Classes are ordered by the sum of the HOG and Pearson correlations. The first N classes are selected and the remainder are rejected.

If either the HOG or Pearson threshold is greater than 0.0, classes in these top N will be rejected anyway if their correlation scores are lower than the set threshold(s).
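
The ranking and the optional threshold check can be sketched as below. The function name and the `(pearson, hog)` data layout are hypothetical, and a threshold is treated as inactive unless it is greater than 0.0, following the description above.

```python
def select_top_n(scores, n, pearson_threshold=0.0, hog_threshold=0.0):
    """Return the ids of the top-n classes by Pearson + HOG, best first.

    scores maps class_id -> (pearson, hog). Classes in the top n are still
    dropped if they fall below a threshold that is set above 0.0.
    """
    def passes(value, threshold):
        return threshold <= 0.0 or value >= threshold

    ranked = sorted(scores, key=lambda cid: sum(scores[cid]), reverse=True)
    return [
        cid
        for cid in ranked[:n]
        if passes(scores[cid][0], pearson_threshold)
        and passes(scores[cid][1], hog_threshold)
    ]


# Illustrative values
scores = {"a": (0.9, 0.8), "b": (0.5, 0.9), "c": (0.7, 0.2), "d": (0.1, 0.1)}
print(select_top_n(scores, n=3))
print(select_top_n(scores, n=3, hog_threshold=0.5))
```

In the second call, class "c" ranks in the top three but is rejected anyway because its HOG correlation is below the threshold.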

Top K percent by correlation score

Classes are ordered by the sum of the HOG and Pearson correlations. The top K percent are selected and the remainder are rejected. For instance, setting this parameter to 66 selects the two thirds of class averages with the highest sum of Pearson and HOG correlations.

If either the HOG or Pearson threshold is greater than 0.0, classes in the top K percentile will be rejected anyway if their correlation scores are lower than the set threshold(s).
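
A minimal sketch of converting a percentage into a class count and ranking by the summed correlations. How fractional counts are rounded is not stated; rounding up is an assumption, as are the function name and data layout. The optional threshold check described above is omitted for brevity.

```python
import math


def select_top_k_percent(scores, k_percent):
    """Return the ids of the top k_percent of classes by Pearson + HOG.

    scores maps class_id -> (pearson, hog).
    ASSUMPTION: the class count is rounded up.
    """
    count = math.ceil(len(scores) * k_percent / 100)
    ranked = sorted(scores, key=lambda cid: sum(scores[cid]), reverse=True)
    return ranked[:count]


# Illustrative values: six classes with decreasing scores
scores = {
    "c0": (0.9, 0.9), "c1": (0.8, 0.8), "c2": (0.7, 0.7),
    "c3": (0.6, 0.6), "c4": (0.5, 0.5), "c5": (0.4, 0.4),
}
print(select_top_k_percent(scores, 66))  # two thirds of six classes
```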

Low pass filter volume

If this parameter is on, the reference is low-pass filtered to the FRC resolution of the 2D class average. Filtering occurs after the volume is aligned to the 2D class averages but before correlation scores are calculated. This is generally expected to improve results: the input reference may contain higher-resolution features that are absent from the (generally lower-resolution) class averages, and these features would otherwise lower the correlation scores.

Maximum alignment resolution

While aligning the volume to the class averages, only frequencies up to this resolution will be considered. In general, setting this parameter to a higher numerical value (i.e., worse resolution) will help prevent overfitting. This may be especially useful when 2D class averages are noisy.

Outputs

Particles selected

If particles were provided as an input, particles belonging to the selected class averages are output here and otherwise unchanged. This output is not present if particles were not provided.

Templates selected

This output contains the selected class averages.

Particles excluded

If particles were provided as an input, particles belonging to the excluded class averages are output here and otherwise unchanged. This output is not present if particles were not provided.

Templates excluded

This output contains the rejected class averages.

Plots

Examples in this section are from a job run on particles from EMPIAR 10288 (Kumar et al. 2019).

First, 2D class averages which are excluded solely on the basis of their resolution are displayed. The resolution is written out at the top of the class averages. When selecting by thresholds on Sobel scores, the class averages are also normalized to appear as similar to the reference projections as possible.

Next, class averages are plotted by their Pearson and HOG correlations. This plot reflects the Selection mode chosen by the user:

  • In thresholds on Sobel scores and thresholds only modes, the class averages are plotted according to the scores and thresholds are displayed with blue and red lines.

  • In cluster mode, the cluster means are marked with stars.

  • In either top N classes by correlation score or top K percent by correlation score modes, there are no additional chart elements.

Next, accepted class averages are plotted alongside the projection of the reference volume which aligned to them. The odd images in each row (1st, 3rd, etc.) are 2D class averages, with the Pearson and HOG correlation scores displayed at the top and bottom of the image, respectively. The even images (2nd, 4th, etc.) are projections of the reference volume in the same pose as the 2D class average.

If the 2D class average is high quality and the reference was well aligned, these images should be indistinguishable.

The same plot is produced for the excluded classes:

Note that for noisy classes (for instance, the top-left class of the above image), the reference volume is still displayed in its best-fitting pose. However, this display makes it clear (by eye) that the volume used to calculate correlation is not truly present in the class.

In the plot of rejected 2D class averages, the thresholds are coloured red if they were used to reject a class (for instance, the Pearson correlation coefficient in the top-left class). If multiple good classes are rejected by this job, investigating these classes and determining which threshold caused them to be rejected can help fine-tune the threshold settings.

Common Next Steps

Reference Based Auto Select 2D is typically run manually once per sample type to determine useful parameter selections for future automated use on similar data.

References

  1. Kumar, K. et al. Structure of a Signaling Cannabinoid Receptor 1-G Protein Complex. Cell 176, 448-458.e12 (2019).

When processing multiple datasets of the same or similar targets, it is often possible to automate large swaths of the processing pipeline due to similar particle size, data quality, etc. Non-interactive 2D class selection can be performed with naive thresholds (choosing classes with high resolution, low ESS, or high particle count) in a Select 2D Classes job. However, this strategy runs the risk of rejecting classes which represent rare or hard-to-align views, or of including junk classes with artifactually high FRC resolutions.

We expect this job to be most useful in automated processing pipelines for the same target. When processing data for a target for the first time, the user can determine the appropriate filtering method and threshold values. These can then be re-used in subsequent collections as part of a Workflow to pick appropriate classes and proceed to high-resolution reconstruction.

Once useful threshold settings have been determined, this job can be included in a Workflow, between 2D Classification and Ab Initio Reconstruction jobs. In this way, the workflow can proceed from preprocessing through 3D refinement without any user intervention, extending the degree of automation possible for known samples.

The job can also be useful when attempting to select good classes from a 2D classification that was run with a very large number of classes (e.g. >200). In this case, if a coarse-resolution reference of the target is available, this job can help select matching 2D classes without having to make potentially hundreds of manual selections in an interactive Select 2D Classes job.

Data from EMPIAR 10288 (Kumar et al. 2019) used to produce this figure.