
Tutorial: EPU AFIS Beam Shift Import

A tutorial covering how to split exposures into groups based on beam shift values, for data collected in EPU's AFIS mode.



Introduction

Thermo Fisher’s EPU is a commonly used data acquisition software package for single particle analysis. Typically, SPA data has been collected by manually moving the stage, placing different regions within the hole at the optical center of the microscope. This avoids the strong aberrations that result from off-axis use of the objective lens, but moving the stage and waiting for it to settle introduces delays. Advances such as EPU’s Aberration-Free Image Shift (AFIS) collection mode allow multiple holes to be targeted without stage movement in between. Importantly, AFIS and the associated microscope calibration service make it possible to target holes that do not lie at the optical center of the microscope without inducing severe artefacts, which significantly speeds up data collection.

For many datasets collected in AFIS mode, it is still worthwhile to estimate residual higher-order aberrations such as coma via the Global CTF Refinement job: if there are any residual aberrations, correcting them may lead to improved structures. However, doing so requires grouping movies into subsets (Exposure Groups) with similar optical conditions, which include the amount of applied beam shift. Since refinement of higher-order CTF aberrations is done separately for each exposure group, the assignment of movies into exposure groups can significantly affect the fitted aberration values, which in turn affects the resolution achieved by subsequent refinements. In CryoSPARC v4.4+, we have integrated the import of beam shift values from EPU sessions collected in AFIS mode, allowing exposure group assignments based on applied beam shift. The following tutorial covers:

  • how to import movies/micrographs with beam shift values

  • how to assign movies/micrographs into exposure groups based on beam shift

  • merging beam shift values into pre-v4.4 movie/micrograph datasets, without re-processing from scratch

  • continuing processing of data exported from CryoSPARC Live

Use case #1: Clustering movies into exposure groups at import time

This use case covers the situation where processing of a dataset starts in CryoSPARC from scratch, i.e., all processing steps after motion correction (including particle picking) have not yet been done. For existing CryoSPARC exposure datasets, datasets processed in CryoSPARC Live, or datasets with existing particles, please refer to use case #2 below.

Import Movies or Import Micrographs

For movie or micrograph datasets collected via EPU’s AFIS mode, beam shift values can now be imported along with other metadata. As an example, here is a screenshot of an output data directory from a data collection session using EPU. Note that the movies we would like to import are in .eer format, and the associated files containing the beam shifts are in .xml format. Each EER movie has a corresponding xml file.

In CryoSPARC’s Import Movies and Import Micrographs jobs, you will now notice an “XML Import” section that allows specification of an absolute path wildcard to the directory containing the XML files. This path is in addition to the wildcard expression pointing to the raw movie files. For the above example, we have set the two wildcard expressions to the following:

  • Movies data path: /bulk9/data/EPU_apof_JLR/20230212a_T64/Images-Disc1/GridSquare_11564642/Data/*.eer

  • EPU XML metadata path: /bulk9/data/EPU_apof_JLR/20230212a_T64/Images-Disc1/GridSquare_11564642/Data/*.xml

Here, the two wildcard expressions point to the same directory and differ only in the file extension filter.

There are 4 additional parameters that assist in finding correspondences between the .eer and .xml files. These parameters specify the number of characters to cut off of the beginning and end of the movie and XML filenames in order to match them to each other, one-to-one:

  • Length of movie filename prefix to cut for XML correspondence: Use this field to specify the number of characters to cut off the prefix of the imported movie filename, to match with the XML filename.

  • Length of movie filename suffix to cut for XML correspondence: Use this field to specify the number of characters to cut off the suffix of the imported movie filename, to match with the XML filename.

  • Length of XML filename prefix to cut for movie correspondence: Use this field to specify the number of characters to cut off the prefix of the XML filename, to match with the imported movie filename.

  • Length of XML filename suffix to cut for movie correspondence: Use this field to specify the number of characters to cut off the suffix of the XML filename, to match with the imported movie filename.

In this case, we need to trim the eight characters of _EER.eer off the end of each movie filename, and the four characters of .xml off the end of each XML filename. Thus, we’ll set the movie suffix parameter to 8, the XML suffix parameter to 4, and leave the other two parameters empty.
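The correspondence logic can be sketched in a few lines of Python: cut a fixed number of characters off each filename’s start and end, then match on what remains. This is an illustrative reimplementation of the matching described above, not CryoSPARC’s actual code; the `trim_key` helper and the example filenames are hypothetical.

```python
# Sketch of the prefix/suffix trimming used to pair movies with XML files.
# trim_key and the filenames below are illustrative, not CryoSPARC internals.
def trim_key(filename: str, prefix_cut: int = 0, suffix_cut: int = 0) -> str:
    """Drop prefix_cut characters from the start and suffix_cut from the end."""
    return filename[prefix_cut : len(filename) - suffix_cut]

movies = ["FoilHole_001_Data_123_EER.eer", "FoilHole_002_Data_456_EER.eer"]
xmls = ["FoilHole_001_Data_123.xml", "FoilHole_002_Data_456.xml"]

# Movie suffix cut = 8 ("_EER.eer"); XML suffix cut = 4 (".xml").
xml_by_key = {trim_key(x, suffix_cut=4): x for x in xmls}
pairs = {m: xml_by_key.get(trim_key(m, suffix_cut=8)) for m in movies}
```

After trimming, both names reduce to the same stem (e.g. `FoilHole_001_Data_123`), giving the one-to-one movie-to-XML mapping the import job needs.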

After inputting the parameters and running the job, a scatter plot of the beam shift values is displayed in the event log if the XML import was successful. The event log also prints an example of the movie and XML paths after the prefix/suffix trim is applied, so you can verify that they are aligned and match in structure. Any XML files that are absent, corrupt, or missing beam shift values are flagged as having missing beam shifts; in the example image, two exposures are missing beam shift values. Check the event log to confirm that beam shift values were successfully read for the majority of exposures; if not, a warning is displayed highlighted in orange.

Pre-processing

Next, exposures must be pre-processed via motion correction (applicable to movies) and CTF estimation. CTF estimation is required to cluster exposures by the applied beam shift. The recommended motion correction job is Patch Motion Correction, and the recommended CTF estimation job is Patch CTF Estimation.

Clustering via Exposure Group Utilities

The next step is to cluster exposures into groups based on the applied beam shift, so that exposures with similar beam shift values are placed into the same exposure group. This is done by running Exposure Group Utilities in the cluster&split mode.

First, connect the output exposures from the Import Movies or Import Micrographs job above. Then, set the “Input Selection” to exposure and the “Action” to cluster&split. Finally, set the number of clusters. In this case, based on the beam shift scatter plot above, we counted 61 clusters, corresponding to the 61 unique “rings” (each ring comprising 8 collection sites arranged in a circle around one hole). Note that it is not necessary for the number of clusters to match the number of holes precisely. Indeed, depending on the layout and orientation of holes on the grid, the beam shift distribution may not form neat clusters and may appear more continuous. In any case, note the following when choosing the number of clusters:

  • With too few clusters, there will be greater intra-exposure-group variability in the beam shift, possibly leading to less accuracy when fitting the higher-order aberrations

  • With too many clusters, there will be fewer exposures and particles per exposure group, possibly limiting the precision of the fit higher-order aberration values. In extreme cases, too few particles per exposure group could impact the stability of the Global CTF Refinement aberration fitting algorithm, as there is a minimum cumulative amount of signal in each exposure group that is needed to fit the aberration parameters. This is important to keep in mind, as aberration estimation is done independently for each exposure group.

The “Clustering method” may also be tweaked. The most important factor when clustering exposures is that clusters are reasonably uniform in both:

  • the number of exposures they contain, and

  • the range of beam shift values they span

The default, agglomerative clustering, works well on a variety of datasets, but k-means is also available. K-means clustering works better when the exposures’ beam shift values form isotropic clusters with most points located close to the mean, or when the spread of beam shifts is more “continuous” and doesn’t form neat discrete clusters. In these cases, k-means ensures that clusters remain relatively uniform in the range of beam shift values each one spans. Agglomerative clustering may perform better when clusters form more irregular shapes, such as the “rings” in this example.
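To build intuition for how k-means groups beam shifts, here is a minimal, self-contained k-means sketch applied to synthetic 2D beam shift values. It is purely illustrative: the data, the deterministic initialization, and the implementation are assumptions for the sketch, not CryoSPARC’s internals.

```python
import numpy as np

# Minimal k-means on synthetic 2D beam-shift values -- an illustrative
# stand-in for the "kmeans" clustering method, not CryoSPARC's internals.
rng = np.random.default_rng(0)

# Three synthetic collection areas: tight blobs of beam-shift coordinates.
blob_centers = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
shifts = np.concatenate(
    [c + 0.05 * rng.standard_normal((50, 2)) for c in blob_centers]
)

def kmeans(points, k, iters=20):
    # Deterministic init for the sketch: one evenly spaced seed point per cluster.
    centroids = points[:: len(points) // k][:k].copy()
    for _ in range(iters):
        # Assign each point to its nearest centroid...
        dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # ...then move each centroid to the mean of its assigned points.
        for j in range(k):
            if np.any(labels == j):
                centroids[j] = points[labels == j].mean(axis=0)
    return labels

labels = kmeans(shifts, k=3)  # each cluster would become one exposure group
sizes = np.bincount(labels, minlength=3)
```

With well-separated blobs, each cluster ends up uniform in both membership count and beam-shift extent, which is the property we want in exposure groups.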

Once the number of clusters is chosen, queue and run the job. At the first checkpoint, the exposure group clustering result is shown. If any exposures are missing beam shift values, they are placed into their own separate exposure group, and the total number of exposure groups output by the job will be one more than the parameter value.

The output exposures are now ready for downstream processing, such as particle picking and extraction. Be sure to experiment with Global CTF Refinement to see whether clustering particles into exposure groups helps obtain better resolutions. Note that only exposure groups with an adequate number of particles should have their aberrations refined, as Global CTF Refinement depends on having enough signal across the particle images in each exposure group.
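Before running Global CTF Refinement, it can be useful to tally how many particles each exposure group contains. The sketch below does this with NumPy on a synthetic array standing in for the per-particle exposure group IDs (in CryoSPARC `.cs` particle files this assignment lives in the `ctf/exp_group_id` field); the `MIN_PARTICLES` threshold is illustrative, not an official CryoSPARC value.

```python
import numpy as np

# Tally particles per exposure group before aberration refinement.
# The array below is synthetic; in a real CryoSPARC particle dataset the
# assignment lives in the "ctf/exp_group_id" field of the .cs file.
exp_group_id = np.array([0] * 5000 + [1] * 4200 + [2] * 150)

groups, counts = np.unique(exp_group_id, return_counts=True)

MIN_PARTICLES = 1000  # illustrative threshold, not an official CryoSPARC value
for g, n in zip(groups, counts):
    note = "" if n >= MIN_PARTICLES else "  <- may be too few for aberration fitting"
    print(f"exposure group {g}: {n} particles{note}")
```

Groups that fall well below the others in particle count are candidates for merging with a neighbouring group, or for being left out of aberration refinement.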

Use case #2: Clustering exposures from live, or clustering exposures from pre-v4.4 CryoSPARC versions

This use case covers the following situations:

  • Exposures have been initially processed via CryoSPARC Live or via CryoSPARC versions before v4.4, with or without associated particle stacks, and re-clustering of exposure groups based on beam shift is desired

In this case, the following steps (outlined below) allow for re-clustering of exposure groups:

  • Running an Import Beam Shift job in order to retrieve the exposures’ beam shift values;

  • Clustering the movies/micrographs into exposure groups via Exposure Group Utilities, with input particles provided to the job

If you are importing fresh movies or micrographs into CryoSPARC v4.4+, use case #1 above covers the basic import workflow and is recommended reading first.

Import Beam Shift

Navigate to the job builder, locate the new “Import Beam Shift” job under the imports section, and build the job. This job was created to add beam shift information to existing exposure datasets in CryoSPARC, without the need to re-import the movies/micrographs from scratch.

Next, connect the existing movies or micrographs dataset from CryoSPARC as input to the Import Beam Shift job. This may be a movie dataset exported from CryoSPARC Live, or a movie dataset processed in regular CryoSPARC. Ensure that the entire movie dataset is input to the job (i.e., if any exposure curation filtered out some exposures, use the exposures from upstream of that job). When movies are connected as input, the Import Beam Shift job keeps the existing movies’ UIDs rather than generating new UIDs the way other import jobs do. These existing UIDs are required when updating particles’ exposure group assignments in Exposure Group Utilities, to match particles to the exposures they came from.

Here, the example movie/micrograph filename is the same as the XML filename, except for the trailing _EER.eer. Due to these extra characters, the beam shift import was not successful, and CryoSPARC warned that it did not find beam shifts for any of the 2797 exposures. To fix this, we can set the “Length of movie filename suffix to cut for XML correspondence” parameter to 8 to cut off the trailing 8 characters and find proper matches between the XML and movie files. Re-running the job, we see that XML files were found for all but two exposures, which happen to be missing from this dataset:

Finally, if the XML import was successful and beam shifts were present in the XML files, a beam shift scatter plot will be displayed in the event log as in use case #1. The UIDs and all input slots (e.g. motion correction or CTF estimation results) will have been pulled from the input dataset, meaning we do not have to repeat these steps if they have already been done.

Clustering via Exposure Group Utilities

  • Connect the output exposures from “Import Beam Shift” and the existing particle dataset to Exposure Group Utilities;

  • As in use case #1, set the “Input Selection” to exposure, specify the “Action” as cluster&split, and specify the number of clusters and clustering method;

  • Activate the “Correspond particles to exposures and enforce consistency of exposure group IDs” parameter

  • If particles were previously split into more than one exposure group, set the “Combine strategy” to take_mode

    • This ensures that when particles from different exposure groups are combined into the same group, the aberration values for the entire group will be set to the mode (most common value) amongst particles in the group. Since exposure group clustering is done with the purpose of re-running Global CTF Refinement, aberrations will be re-refined and this is not a point of concern.
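The take_mode strategy amounts to a per-field majority vote among the particles that land in the same new group. A minimal sketch, with a made-up field and values rather than the exact `.cs` schema:

```python
import numpy as np

# "take_mode" sketch: when particles from several old exposure groups land in
# one new cluster, set each group-level value to the most common (mode) value
# among the cluster's particles. The field values below are made up.
def take_mode(values):
    vals, counts = np.unique(values, return_counts=True)
    return vals[counts.argmax()]

# Per-particle beam-tilt x values after merging two old groups (illustrative):
tilt_x = np.array([0.12, 0.12, 0.12, 0.30, 0.30])
merged = take_mode(tilt_x)  # 0.12, carried by 3 of the 5 particles
```

Since these group-level aberration values will be re-refined by Global CTF Refinement anyway, the mode is just a reasonable placeholder for the combined group.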

In our case, particles were previously from only one exposure group, so we don’t need to change the combine strategy. Thus, we’ll run the job with the following input parameters:

Checkpoints 1 and 2 will show the exposure and particle datasets’ exposure group information prior to clustering, respectively, in a table format in the event log. In most cases, particles and exposures will initially be all pooled into one exposure group, unless they were assigned different exposure group IDs upon import. Checkpoint 1 will also show the beam shift scatter plot labelled by the assigned exposure groups. Checkpoints 3 and 4 will show the exposure and particle datasets’ exposure group information after clustering. If “Correspond particles to exposures” was activated, the particles and exposures datasets should be consistent.

Next Steps

Once particles have been picked and a relatively high-resolution structure has been obtained, use Global CTF Refinement to fit higher-order aberration values. If you followed use case #2, it’s possible to do an apples-to-apples comparison of resolutions before and after clustering particles into exposure groups, by running two Global CTF Refinement jobs and two Homogeneous Reconstruction Only jobs with a fixed mask.

If particles were already picked, we also do not have to repeat particle picking: particles can instead be assigned to exposure groups based on which exposures they came from. This is also done via the Exposure Group Utilities job, using the settings described in use case #1 together with the modifications listed above.

As in use case #1, provide the XML directory wildcard expression that points to the directory containing the original XML files; these parameters are identical to those in Import Movies, and the use case #1 instructions apply. If needed, specify the four “Length of movie/XML filename prefix/suffix…” parameters to correctly match movie filenames to XML filenames. The values of these parameters are most quickly determined by running the job with all defaults and observing the event log, which prints examples of the trimmed file paths to help determine the number of characters to cut.

In our example dataset, the resolution improvements obtained via exposure group clustering were rather modest, indicating that the microscope was quite well calibrated. However, examples of more significant improvements have been documented on the CryoSPARC discussion forum.

References

Dustin Morado’s EPU_group_AFIS repository for clustering strategies, as well as his detailed forum post describing the motivation for exposure group clustering when collecting in AFIS mode.

Example illustrating how to determine the number of characters to cut for the movie and XML filenames.
Example of a tree view for the workflow (starting from importing micrographs) until exposure group utilities.
Example of processing movies exported from CryoSPARC Live. Movies were processed through Import Beam Shift to tag them with beam shifts, then, together with particles, were passed through Exposure Group Utilities to cluster them based on beam shift. Global CTF Refinement was performed twice: once on the initial exposure group assignments (J135: all particles in one group) and once on the new assignments (J136). Two final reconstructions were done, with fixed poses and identical masks, to compute FSCs.
A modest increase in resolution (1.53 Å) induced by exposure group clustering into 61 clusters, compared to the baseline of 1.54 Å associated with keeping all exposures in one cluster. Note that poses and masks were identical, thus the only variables changed between these two reconstructions were the fitted high-order aberrations from Global CTF Refinement.