Cryo-EM Data Processing in cryoSPARC: Introductory Tutorial

We recommend starting off with the T20S Tutorial to become familiar with the workflow in cryoSPARC. This dataset is a subset of 20 movies from the EMPIAR-10025 T20S Proteasome dataset. While not a representative example of the complexity of most cryo-EM projects today, it is a good way to become familiar with the interface and software features and to learn how cryoSPARC organizes jobs and projects.

For a refresher on the interface, projects and jobs, please see User Interface.

Introduction: Dashboard, Projects, Workspaces and Jobs

The Dashboard provides at-a-glance information on your Projects, Workspaces and status of Jobs. It also shows the change log for new versions of cryoSPARC. The header and footer contain links to Projects view, Workspaces, the Resource Manager and the identity of the current user.

cryoSPARC organizes your workflow by Project, e.g., P1, P2, etc. Projects contain one or more Workspaces, which in turn house Jobs.

Projects are strict divisions. Files and jobs from different projects are stored in dedicated project directories and jobs cannot be connected from one project to another.

Workspaces allow for logical separation of jobs and workflows so they can be more easily managed in a large project. Jobs may be connected across workspaces and each job may belong to more than one workspace.
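The containment rules above can be pictured with a small data model. This is only an illustrative Python sketch: the Job class, its fields, and the connect helper are hypothetical names and do not reflect how cryoSPARC stores projects internally.

```python
from dataclasses import dataclass, field


@dataclass
class Job:
    uid: str                                        # e.g. "J1" (hypothetical identifier)
    project: str                                    # owning project, e.g. "P1"
    workspaces: set = field(default_factory=set)    # a job may sit in several workspaces
    inputs: list = field(default_factory=list)      # upstream jobs connected to this one


def connect(upstream: Job, downstream: Job) -> None:
    """Connect two jobs, refusing connections that cross a project boundary."""
    if upstream.project != downstream.project:
        raise ValueError("jobs in different projects cannot be connected")
    downstream.inputs.append(upstream)


# Two jobs in the same project but in different workspaces can be connected.
import_job = Job("J1", "P1", {"W1"})
motion_job = Job("J2", "P1", {"W2"})
connect(import_job, motion_job)

# The same job may belong to more than one workspace.
motion_job.workspaces.add("W1")

# A job from another project is rejected.
other = Job("J1", "P2", {"W1"})
try:
    connect(other, motion_job)
    crossed = True
except ValueError:
    crossed = False

print(sorted(motion_job.workspaces), crossed)
```

The point of the sketch is the asymmetry: workspace membership is a loose label on a job, while the project is a hard boundary.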

Step 1: Create a Project

  • Navigate to the Projects view by selecting the drawer icon in the header, or the Projects button in the footer.

  • To create a project, press N or press the "+" in the header. A modal window appears to enter new project details.

  • Enter a project Title and browse for a location for the associated Project directory with the File Browser. The project directory should already exist. cryoSPARC populates it with job directories as you create jobs. All files associated with the project will be stored inside the selected project directory. You may also enter a Description for your project.

  • Press "Create". The new project now appears on the Projects page.

Step 2: Create a Workspace

Use Workspaces to organize or separate portions of the cryo-EM workflow for convenience or experimentation. Create at least one Workspace within a Project before running a Job.

  • Select the project number (e.g., "P8") to open the Project.

  • Alternatively, select the Projects drop-down in the header. This opens a searchable list of all Projects associated with your user account. Select an entry to open the associated project.

  • Create a New Workspace with the "+ Add" button in the header or press N on your keyboard. Alternatively, select New Workspace from the Project Details panel on the right side of the screen. Set a Title (may be changed later) and optionally a description.

  • Select "Create"

Step 3: Download the Tutorial Dataset

  • Log in to the machine where cryoSPARC is installed via command-line.

  • Navigate to or create a directory into which to download the test dataset (approx. 8 GB). This location should have read permissions for the user running cryoSPARC.

  • Run the command cryosparcm downloadtest while in this directory. This downloads a subset of the T20S dataset.

  • Run tar -xf empiar_10025_subset.tar to extract the downloaded archive.
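Before downloading, it can help to confirm that the target directory is accessible and has room for the roughly 8 GB dataset. The following is a minimal sketch using only the Python standard library. The function name check_download_dir is hypothetical, and the check applies to whichever user runs the script, so run it as the user that runs cryoSPARC.

```python
import os
import shutil
import tempfile

REQUIRED_BYTES = 8 * 1024**3  # approx. 8 GB, as stated in this step


def check_download_dir(path: str, required_bytes: int = REQUIRED_BYTES) -> dict:
    """Report whether `path` is readable/writable and has enough free space."""
    free = shutil.disk_usage(path).free
    return {
        "readable": os.access(path, os.R_OK | os.X_OK),
        "writable": os.access(path, os.W_OK),
        "enough_space": free >= required_bytes,
        "free_gb": round(free / 1024**3, 1),
    }


# Demonstrated on a throwaway directory; replace with the real download location.
with tempfile.TemporaryDirectory() as demo_dir:
    report = check_download_dir(demo_dir)
    print(report)
```

Note that the archive and its extracted contents occupy space at the same time, so more than the download size may be needed in practice.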

Step 4: Import Movies

  • In cryoSPARC, navigate to the new Workspace. To do so, navigate to the project to see a list of workspaces, and then into the workspace.

  • Select the Job Builder in the right sidebar. The Job Builder displays all available job types by category (e.g., workflows, imports, motion correction).

  • Select the Import Movies job type in the Job Builder. This creates a new job within the current Workspace, displayed as a card. By default, new jobs are set to Building status, indicated on the job card in purple. To change parameters, toggle between active or inactive building states with B on your keyboard, or press the "Building" badge on the job card (see also Keyboard Shortcut Reference for more keyboard shortcuts).

  • Select the Movies data path: Press the file browse icon and select the movie files (.mrc or .tif format). To select multiple files, use a wildcard, e.g., *.mrc. This selects all files that match the wildcard expression. The file browser displays the list of selected files along with the number of matches at the bottom. For this tutorial dataset, navigate to the directory where the test data was downloaded. Use the wildcard expression *.tif to select all TIFF format movies in the folder. There should be 20 imported movies.

  • Select the Gain reference path with the file browser: Select the single .mrc file in the folder where the test data was downloaded.

  • Edit Job parameters from the Builder; enter the following parameters (obtained from the original publication in eLife).

    1. Raw pixel size (A): 0.6575

    2. Accelerating voltage (kV): 300

    3. Spherical aberration (mm): 2.7

    4. Total exposure dose (e/A^2): 53

  • After changing a parameter, the blue D icon changes to a green S. This indicates the parameter is different from its default value.

  • Select Queue to start the import. Use the subsequent modal to select a lane/node on which to run the job. The available lanes depend on your installation configuration. Select a lane, then press Create.

  • The Import Movies job queues and starts running. Look for the Job card in the workspace to monitor its status.

  • To open a Job and view its progress, select the Job number at the top left of the Job card. Alternatively, select the job card and press the spacebar on your keyboard.

  • This opens the Inspect view, which shows a streaming log of the real-time progress for the Import Job. Scroll through the stream log to view results. Select a checkpoint to find a specific location in the stream log or 'Show from top' to return to the beginning. The Inspect view also shows all the job's outputs on the right-hand side.

  • To exit the job/close the inspect view, press the spacebar again, or press the × on the top-right.

  • Once finished, the job's status indicator changes to "Completed" in green. Additional actions and detailed information for the job are available in the details panel. The Output of the import job, i.e., the 20 imported movies, is available on the right-hand side of the stream log.

Completed "Import Movies" job
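The wildcard selection used for the Movies data path in this step can be previewed outside cryoSPARC. The sketch below assumes the file browser's wildcard behaves like ordinary shell-style pattern matching, which is what Python's fnmatch implements. The file names in the listing are hypothetical and the real names in the test dataset will differ.

```python
from fnmatch import fnmatch

# Hypothetical directory listing; the real file names will differ.
listing = [
    "movie_0001.tif",
    "movie_0002.tif",
    "movie_0003.tif",
    "gain_reference.mrc",
    "notes.txt",
]


def select(files, pattern):
    """Return the files whose names match a shell-style wildcard pattern."""
    return sorted(name for name in files if fnmatch(name, pattern))


movies = select(listing, "*.tif")      # what the Movies data path would pick up
gain_refs = select(listing, "*.mrc")   # candidates for the Gain reference path

print(len(movies), "movies matched")
print(gain_refs)
```

Checking the match count this way mirrors the count shown at the bottom of the file browser; for the tutorial dataset the expected count is 20.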

Step 5: Motion Correction

  • In the Job Builder, select Patch motion correction (multi) (parallelized over multiple GPUs, if you have them available). This creates a new job in Building state so that its inputs and parameters are editable in the right side panel.

  • The motion correction job requires raw movies as Inputs. Open the previously completed Import Movies job, then drag and drop the Outputs of the Import Movies job onto the Movies placeholder in the Job Builder.

  • Once dropped, the connected output name appears in the Job Builder as an Input.

  • Queue the job and select a lane. It is generally not necessary to adjust the Patch Motion Correction job parameters; they are automatically tuned based on the data.

Step 6: CTF estimation

  • Select Patch CTF Estimation (multi) in the Job Builder to create a new job.

  • This job type requires micrographs as the Input. Open the previous patch motion correction job, and drag and drop the Output (20 micrographs) into the Micrograph placeholder in the Job Builder. Queue the job to start.

Step 7: Particle Picking

Start off by manually picking some particles from the processed micrographs to generate templates for further automatic picking.

Manual picking is a good idea because it verifies data quality and sets expectations for what particle images, projections, and structures should look like. This informs parameters for tuning automatic picking algorithms and determining whether the picks are good.

cryoSPARC also offers blob-based picking with the "Blob picker" job. If desired, use this instead of manual picks to generate templates quickly.

Select the Blob picker job, then enter the Job Builder and set Min Particle Diameter to 100 and Max Particle Diameter to 200.

  • Select Manual picker from the Job Builder.

  • Locate the exposures output from the previously completed patch CTF estimation job. Drag and drop these into the Micrograph placeholder in the Job Builder. Queue the job.

  • Manual Picker is an Interactive job. Inspect the job to display all micrographs used as inputs on the left hand side. Select 'Name', 'Defocus', 'CTF fit' or 'Picks' to re-order the list.

  • To pick from a particular micrograph, scroll through the list to locate a micrograph and select it (purple highlight).

  • Box size: We generally recommend selecting a box size that is at least double the diameter of the particle. In this case, enter a box size of 384. The box size controls how much of the micrograph is cropped around each particle location. Larger box sizes capture more of the high-resolution signal, which is spread out spatially due to the effect of defocus (CTF) in the microscope. However, larger box sizes significantly increase computational expense in further processing.

  • Hover over a particle and left-click to select. Selections appear as a green circle around the particle to indicate the box size. The number of picked particles in each micrograph appears in the 'Picks' column and the total number of picks is available at the bottom of the list. For this tutorial, we recommend picking a total of approximately 100 particles across the 20 micrographs. Try to capture a diversity of top and side views of the proteasome. Your picks are saved automatically, and closing the browser does not affect them.

  • To un-pick a particle, right-click the green circle over it.

  • Use the lowpass filter slider on the left side to adjust the micrograph display (this does not affect the results of the job) and/or adjust the box size if needed.

  • Once satisfied with the picks, select "Done Picking! Extract Particles" at the top. The job proceeds to extract the picked particles.

When using the Blob picker job, connect its output to an Inspect Picks job (details below) followed by Extract from Micrographs. Connect the output from this last job to 2D Classification in the next step.
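Box sizes in this step are entered in pixels, while particle dimensions are usually quoted in Angstroms, so it is useful to convert between the two. The sketch below assumes the micrographs keep the raw pixel size of 0.6575 Å entered at import. The helper names and the 120 Å diameter are hypothetical and used only to illustrate the "at least double the diameter" rule of thumb.

```python
import math

PIXEL_SIZE_A = 0.6575  # raw pixel size from the import step (Å per pixel)


def pixels_to_angstrom(n_pixels: int, pixel_size: float = PIXEL_SIZE_A) -> float:
    """Physical extent of a box that is n_pixels wide."""
    return n_pixels * pixel_size


def angstrom_to_pixels(length_a: float, pixel_size: float = PIXEL_SIZE_A) -> int:
    """Round up so the box is never smaller than the requested length."""
    return math.ceil(length_a / pixel_size)


manual_box_a = pixels_to_angstrom(384)  # box size entered in the Manual Picker
print(f"384 px box spans {manual_box_a:.1f} Å")

# Hypothetical particle diameter, for illustration only.
diameter_a = 120.0
min_box_px = angstrom_to_pixels(2 * diameter_a)
print(f"2 x {diameter_a:.0f} Å needs at least {min_box_px} px")
```

In practice the computed minimum is usually rounded up further to a convenient size, which is a judgment call rather than something this sketch decides.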

Step 8: Template-based Automatic Particle Picking

  • To generate templates for automatic picking, select 2D Classification from the Job Builder. Drag and drop the particles output from the Manual Picker (or Blob picker) job into the Particle stacks input.

  • Change the Number of 2D classes parameter to 10, then Queue the job.

  • The 2D Classification job proceeds through a number of iterations over a few minutes. Once complete, create a Select 2D job from the Job Builder. Drag and drop both the particles and class_averages to their corresponding Inputs in the Job Builder. Queue the Select 2D job.

  • The Select 2D job enters Waiting mode (fuchsia). This indicates that an interactive job is ready for interaction. Open the job to inspect it (press the spacebar or select the job number).

  • Select two templates (one top view and one side view). Press Done to complete the job.

  • From the Job Builder, create a Template Picker job. This uses the generated 2D templates to automatically pick particles from micrographs.

  • Drag and drop the templates_selected (Count: 2) from the Select 2D job into the Templates input. Close the Select 2D job inspector.

  • Locate and open the previously completed Patch CTF Estimation job. Drag and drop the exposures (Count: 20) into the Micrographs input.

  • Set the Particle diameter in Angstrom parameter to 190. Queue the job. The template picker finds particles that match the templates and returns statistics for each pick. These statistics may be used to filter for high quality picks.

Step 9: Inspect Picks

Use the Inspect picks job to view and interactively adjust the results of template-based (and blob-based) automatic particle picking.

  • Select Inspect particle picks from the Job Builder.

  • Drag and drop both the particles and micrographs outputs from the previously completed Template Picker job. Queue the job.

  • Browse through and select from the list of micrographs on the left side.

  • The histogram on the left side shows statistics across all pick locations, including false positives and true particles. True particles generally have a high Normalized Cross Correlation (NCC) score (indicating agreement in shape with the templates) and a high Power score (indicating the presence of significant signal). Picks with too little Power are typically false positives containing only ice, while picks with very high Power are typically carbon edges, ice crystals, aggregated particles, etc.

  • Make adjustments to the parameters below if needed. All adjustments are saved automatically.

    • Adjust the lowpass filter slider if needed to better view the picks.

    • Adjust the box size to make it easier to see the location of picks. Often a very small box size (32) can be helpful. The box size is not used in the outputs of this job; Inspect Picks only outputs particle (x, y) location coordinates.

    • Adjust the NCC slider (approx. 0.350).

    • Adjust the Power threshold slider. This helps to remove false positives (approx. between 1075 and 1745).

  • Once satisfied with the picks, select "Done Picking! Output Locations!". This completes the Inspect picks job and saves the selected particle locations.
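The effect of the NCC and Power sliders can be thought of as a simple filter over the picks. The sketch below is a hypothetical illustration, not cryoSPARC's implementation: the Pick class and keep function are invented names, the pick values are made up, and the thresholds are the approximate values suggested in this step.

```python
from dataclasses import dataclass


@dataclass
class Pick:
    x: float
    y: float
    ncc: float    # normalized cross-correlation with the templates
    power: float  # local signal power


def keep(pick: Pick, ncc_min=0.35, power_min=1075.0, power_max=1745.0) -> bool:
    """Keep picks that match the templates and fall inside the power window."""
    return pick.ncc >= ncc_min and power_min <= pick.power <= power_max


# Hypothetical picks, chosen to illustrate each rejection case.
picks = [
    Pick(100, 200, ncc=0.52, power=1300),  # plausible particle
    Pick(340, 180, ncc=0.20, power=1250),  # poor match to the templates
    Pick(610, 450, ncc=0.45, power=600),   # too little power (e.g. empty ice)
    Pick(820, 90, ncc=0.48, power=2400),   # too much power (e.g. carbon edge)
]

selected = [p for p in picks if keep(p)]
print(f"kept {len(selected)} of {len(picks)} picks")
```

Suitable thresholds vary between datasets, which is why the job exposes them as interactive sliders alongside the histogram.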

Step 10: Extract from Micrographs

This job extracts particles from the respective micrographs.

  • Select Extract from Micrographs (parallelized over multiple GPUs, if you have them available).

  • Open the recently-completed Inspect Picks job. Drag and drop both the micrographs and particles outputs into the corresponding inputs on the Job Builder.

  • In the Job Builder, look under the Particle Extraction section and change the Extraction box size (pix) to 440.

  • Queue the job.

Step 11: 2D Classification

  • Select 2D Classification from the Job Builder.

  • Drag and drop the particles output from the previously completed Extract from Micrographs job into the input.

  • Queue the job.

  • In the stream log, preview images of class averages appear after each iteration. Classification into 50 classes (the default number) takes about 15 minutes on a single GPU.

  • Once complete, proceed to Select 2D classes.

Step 12: Select 2D classes (Interactive)

  • Choose Select 2D classes from the Job Builder.

  • Drag and drop both the particles and class_averages outputs from the most recently completed 2D Classification job.

  • Queue the job. Once the data is loaded, the job status changes to Waiting and Interactive class selection mode is ready.

  • Select each "good" class. Use both the number of particles and the provided class resolution score to identify good classes of particles. The interactive job provides several ways to sort the classes in ascending or descending order based on:

    • # of particles: The total number of particles in each class

    • Resolution: The relative resolution of all particles in the class (Å)

    • ECA: Effective classes assigned

      Note: Avoid selecting classes that contain only a partial particle or a non-particle junk image.

  • When finished, select Done at the top right side of the window. The job completes.

Example of good 2D class selections
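Sorting by particle count and resolution, as described in this step, amounts to ranking the classes and then applying judgment. The sketch below uses hypothetical per-class statistics and illustrative cutoffs. It is not a substitute for looking at the class averages, since partial particles or junk can still score well numerically.

```python
# Hypothetical per-class statistics: (class index, particle count, resolution in Å).
classes = [
    (0, 4200, 6.1),
    (1, 150, 18.4),
    (2, 3100, 6.8),
    (3, 40, 25.0),
    (4, 2600, 7.5),
]

# Illustrative cutoffs only; suitable values depend on the dataset.
MIN_PARTICLES = 500
MAX_RESOLUTION_A = 10.0

# Sort by particle count (descending), then keep well-populated, well-resolved classes.
ranked = sorted(classes, key=lambda c: c[1], reverse=True)
candidates = [
    idx
    for idx, count, res in ranked
    if count >= MIN_PARTICLES and res <= MAX_RESOLUTION_A
]

print(candidates)
```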

Step 13: Ab-initio Reconstruction

  • Select Ab-initio reconstruction from the Job Builder.

  • Drag and drop the particles_selected output from the most recently completed Select 2D classes job into the Particle stacks input in the Job Builder.

  • Queue the job. Results appear in real-time in the stream log as iterations progress. Ab-initio reconstruction should resolve the T20S structure to a coarse resolution.

Completed Ab-initio job

Step 14: Homogeneous Refinement

  • Select Homogeneous refinement from the Job Builder.

  • Drag and drop both the particles_all_classes and volume_class_0 from the recently completed ab-initio reconstruction into the Particle Stacks and Initial Volume inputs, respectively.

  • Set the following parameters:

    • Refinement box size: 256

    • Symmetry: D7

  • Queue the job. Results appear in real time in the stream log. The refinement job performs a rapid gold-standard refinement using the branch-and-bound algorithm. The job displays the current resolution and other diagnostic information for each iteration.

  • Once complete, download the volume and/or mask directly from the Outputs section on the right hand side: Select the drop-down to choose the outputs you wish to download. A refinement job outputs a map_sharp, the final refined volume with automatic B-factor sharpening applied and filtered to the estimated FSC resolution.
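The two parameters set in this step have simple numerical consequences that are worth working out. The sketch below assumes that a refinement box size of 256 Fourier-crops the 440-pixel extracted particles, so the physical box extent stays the same and each pixel covers more Angstroms. That interpretation, and the helper name dihedral_order, are assumptions made for illustration.

```python
RAW_PIXEL_SIZE_A = 0.6575  # Å per pixel, from the import step
EXTRACT_BOX_PX = 440       # extraction box size from Step 10
REFINE_BOX_PX = 256        # refinement box size from this step

# Assumption: particles are Fourier-cropped from 440 to 256 pixels,
# so the box covers the same physical area with coarser sampling.
effective_pixel_a = RAW_PIXEL_SIZE_A * EXTRACT_BOX_PX / REFINE_BOX_PX

# The Nyquist limit is two pixels per period: no finer detail can be represented.
nyquist_a = 2 * effective_pixel_a


def dihedral_order(n: int) -> int:
    """Number of rotations in the dihedral point group D_n."""
    return 2 * n


print(f"effective pixel size: {effective_pixel_a:.3f} Å")
print(f"Nyquist limit:        {nyquist_a:.2f} Å")
print(f"D7 symmetry operations: {dihedral_order(7)}")
```

Under these assumptions the reported resolution cannot be better than the Nyquist limit at this box size, and D7 symmetry means each particle image contributes 14 symmetry-related views to the reconstruction.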

Step 15: Sharpening

For optimal results in publications and for model-building, it is often necessary to re-sharpen and adjust the B-factor.

  • Sharpen the result of the refinement with the Sharpening Tools from the Utilities section in the Job Builder.

  • Drag and drop the volume output of the previous refinement job into the input volume input.

  • Set a B-Factor. Get a good starting B-Factor value from the final Guinier plot in the stream log of the refinement job you previously ran. Input the B-Factor as a negative value.

  • Queue the job.

  • Once complete, download the sharpened map from the output.

  • After visually assessing the map, optionally run another sharpening job (clear or clone the existing one) with a different B-Factor.
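To see why the B-Factor is entered as a negative value, consider the conventional B-factor weighting, in which Fourier amplitudes at spatial frequency s = 1/d are multiplied by exp(-B s^2 / 4). The sketch below uses that textbook form. Whether cryoSPARC applies exactly this expression is an assumption, and the value of -100 Å^2 is hypothetical.

```python
import math


def sharpening_factor(b_factor: float, resolution_a: float) -> float:
    """Amplitude scale exp(-B * s^2 / 4) at spatial frequency s = 1 / resolution."""
    s = 1.0 / resolution_a
    return math.exp(-b_factor * s * s / 4.0)


B = -100.0  # hypothetical B-factor in Å^2; read the real value from the Guinier plot

# Print the amplitude scale at a few resolutions, from coarse to fine.
for d in (10.0, 5.0, 3.0):
    print(f"{d:>4.1f} Å: x{sharpening_factor(B, d):.2f}")
```

With a negative B the factor exceeds 1 and grows toward high resolution, which boosts fine detail; a positive B would do the opposite and blur the map. Over-sharpening amplifies noise as well, which is why visual assessment after each attempt is recommended.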

Step 16: Inspect Workflows

Once you have assembled a workflow of connected jobs within a project, switch to the Tree View to understand how jobs are connected to obtain the final result. Press the flowchart icon in the top header.

Within the tree view, select jobs and modify/connect them in the same way as previously demonstrated in the Card view. For more information on the Tree view and other useful tips, see User Interface.
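The tree view is, in effect, a dependency graph of jobs. As a rough sketch of the workflow built in this tutorial, the graph can be written down and ordered with Python's standard graphlib module (Python 3.9 or later). The job labels are descriptive names chosen here, with "(templates)" marking the first classification round used to generate picking templates; they are not cryoSPARC job identifiers.

```python
from graphlib import TopologicalSorter  # Python 3.9+

# Each job maps to the set of jobs whose outputs it consumes in this tutorial.
workflow = {
    "Import Movies": set(),
    "Patch Motion Correction": {"Import Movies"},
    "Patch CTF Estimation": {"Patch Motion Correction"},
    "Manual Picker": {"Patch CTF Estimation"},
    "2D Classification (templates)": {"Manual Picker"},
    "Select 2D (templates)": {"2D Classification (templates)"},
    "Template Picker": {"Select 2D (templates)", "Patch CTF Estimation"},
    "Inspect Picks": {"Template Picker"},
    "Extract from Micrographs": {"Inspect Picks"},
    "2D Classification": {"Extract from Micrographs"},
    "Select 2D": {"2D Classification"},
    "Ab-initio Reconstruction": {"Select 2D"},
    "Homogeneous Refinement": {"Ab-initio Reconstruction"},
    "Sharpening Tools": {"Homogeneous Refinement"},
}

# A topological order lists every job after all the jobs it depends on.
order = list(TopologicalSorter(workflow).static_order())
for step, job in enumerate(order, start=1):
    print(f"{step:2d}. {job}")
```

The Template Picker is the one job here with two upstream connections, which is the kind of branching that is easier to see in the tree view than in the card view.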

Conclusion

Now that you have refined the data to a high-resolution structure, apply more advanced processing techniques. Explore the job builder and other documentation to see the available job types and processing options. Common workflows include:

  • Multiple rounds of 2D classification to remove more junk particles

  • Heterogeneous ab-initio reconstruction to find multiple unexpected conformational states or multiple distinct particles in the data

  • Heterogeneous refinement to refine multiple conformations and simultaneously classify particles

  • Sub-classification to identify small, slightly differing populations

  • Non-uniform refinement to account for disordered regions and local variations in a structure

  • Masked/local refinements to focus on sub-regions of a structure

  • Re-pick with multiple higher quality 2D classes

  • Local or per-particle CTF re-estimation

For detailed explanations on all available job types and commonly adjusted parameters, see All Job Types in cryoSPARC.

Check back to see updates to this guide, as new features and algorithms are in constant development within cryoSPARC.