Get Started with CryoSPARC: Introductory Tutorial (≤v3.3)
In this tutorial, we will process a small dataset from movies to reconstructed density map. If you are new to data processing in CryoSPARC, we highly recommend following along!
The information in this section applies to CryoSPARC ≤v3.3. For CryoSPARC v4.0+, please see: Get Started with CryoSPARC: Introductory Tutorial (v4.0+)
We recommend starting off with the T20S Tutorial to become familiar with the workflow in CryoSPARC. This dataset is a subset of 20 movies from the EMPIAR-10025 T20S Proteasome dataset. While not a representative example of the complexity of most cryo-EM projects today, it is a good way to become familiar with the interface and software features and to learn how CryoSPARC organizes jobs and projects.
For a refresher on the interface, projects and jobs, please see the User Interface and Usage Guide.
Overview of processing the T20S dataset from raw movie data to a high-resolution 3D structure.
The Dashboard provides at-a-glance information on your Projects, Workspaces and status of Jobs. It also shows the change log for new versions of CryoSPARC. The header and footer contain links to Projects view, Workspaces, the Resource Manager and the identity of the current user.
CryoSPARC organizes your workflow by Project, e.g., P1, P2, etc. Projects contain one or more Workspaces, which in turn house Jobs.
Projects are strict divisions. Files and jobs from different projects are stored in dedicated project directories and jobs cannot be connected from one project to another.
Workspaces are soft divisions, and allow for logical separation of jobs and workflows so they can be more easily managed in a large project. Jobs may be connected across workspaces and each job may belong to more than one Workspace.
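The rules above (projects as strict divisions, workspaces as soft divisions) can be pictured with a small conceptual model. This is only an illustrative sketch, not CryoSPARC's internal data model or API: the class names, the `connect` helper, and the identifiers "W1", "W2", "J1" and "J2" are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Job:
    uid: str                      # hypothetical identifier, e.g. "J1"
    project_uid: str              # the single project that owns this job
    workspace_uids: set = field(default_factory=set)
    parents: list = field(default_factory=list)


@dataclass
class Project:
    uid: str                      # e.g. "P1"
    workspaces: set = field(default_factory=set)
    jobs: dict = field(default_factory=dict)

    def add_workspace(self, workspace_uid):
        self.workspaces.add(workspace_uid)

    def add_job(self, job_uid, workspace_uid):
        """Create the job if needed and place it in the given workspace."""
        if workspace_uid not in self.workspaces:
            raise ValueError(f"{workspace_uid} is not a workspace of {self.uid}")
        job = self.jobs.setdefault(job_uid, Job(job_uid, self.uid))
        job.workspace_uids.add(workspace_uid)  # a job may sit in several workspaces
        return job


def connect(parent, child):
    """Link a parent job to a child job; only allowed within one project."""
    if parent.project_uid != child.project_uid:
        raise ValueError("jobs from different projects cannot be connected")
    child.parents.append(parent.uid)


p1 = Project("P1")
p1.add_workspace("W1")
p1.add_workspace("W2")
import_job = p1.add_job("J1", "W1")
motion_job = p1.add_job("J2", "W2")
connect(import_job, motion_job)   # allowed: same project, different workspaces
p1.add_job("J1", "W2")            # the same job now belongs to both workspaces
```

Attempting to `connect` jobs owned by two different `Project` objects raises a `ValueError` in this model, mirroring the strict project boundary.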
Navigate to the Projects view by clicking on the drawer icon in the header, or from the Projects button in the footer:
To create a project, press the n key or click the "+ Add" button in the header. A dialog window appears to enter new project details.
Enter a project Title and browse for a location for the associated Project directory with the File Browser. The project directory should already exist. CryoSPARC populates it with job directories as you create jobs. All files associated with the project will be stored inside the selected project directory. You may also enter a Description for your project.
Click "Create". The new project now appears on the Projects page.
Use Workspaces to organize or separate portions of the cryo-EM workflow for convenience or experimentation. Create at least one Workspace within a Project before running a Job.
Select the project number (e.g., "P67") to open the Project.
Alternatively, select the Projects drop-down in the header. This opens a searchable list of all Projects associated with your user account. Select an entry to open the associated project.
Create a New Workspace with the "+ Add" button in the header or press N on your keyboard. Alternatively, select New Workspace from the Project Details panel on the right side of the screen. Set a Title (may be changed later) and optionally a description.
Click "Create". This will create the new Workspace.
Log in to the machine where CryoSPARC is installed via the command line.
Navigate to or create a directory into which to download the test dataset (approx. 8 GB). This location should have read permissions for the Linux user account running CryoSPARC.
Run the command cryosparcm downloadtest while in this directory. This downloads a subset of the T20S dataset.
Run tar -xf empiar_10025_subset.tar to extract the downloaded data.
In CryoSPARC, navigate to the new Workspace. To do so, navigate to the project to see a list of workspaces, and then into the workspace.
Select the Job Builder in the right sidebar. The Job Builder displays all available job types by category (e.g., workflows, imports, motion correction, etc.). A tutorial on the Job Builder is available here.
Select the Import Movies job type in the Job Builder. This creates a new job within the current Workspace, displayed as a card. By default, new jobs are set to Building status, indicated on the job card in purple. To change parameters, select the Building job and toggle between active or inactive building states with B on your keyboard, or click the "Building" badge on the job card.
Select the Movies data path: Click the file browse icon and select the movie files (.mrc or .tif format). To select multiple files, use a wildcard, e.g., *.mrc. This selects all files that match the wildcard expression. The file browser displays the list of selected files along with the number of matches at the bottom. For this tutorial dataset, navigate to the directory where the test data was downloaded. Use the wildcard expression *.tif to select all TIFF format movies in the folder. There should be 20 imported movies.
Select the Gain reference path with the file browser: Select the single .mrc file in the folder where the test data was downloaded.
Edit Job parameters from the Builder; enter the following parameters (obtained from the original publication in eLife).
Raw pixel size (Å): 0.6575
Accelerating voltage (kV): 300
Spherical aberration (mm): 2.7
Total exposure dose (e/Å^2): 53
After changing a parameter, the blue D icon changes to a green S. This indicates the parameter is different from its default value.
Click Queue to start the import. Use the subsequent dialog to select a lane/node on which to run the job. The available lanes depend on your installation configuration. By default, import and interactive jobs will run on the master node as they are not resource intensive. Press the Create button.
The Import Movies job queues and starts running. Look for the Job card in the workspace to monitor its status.
To open a Job and view its progress, click on the Job number on the top left hand side of the Job card. Alternatively, select the job card and press the spacebar on your keyboard.
This opens the Inspect view, which shows a streaming log of the real-time progress for the Import Job. Scroll through the stream log to view results. Select a checkpoint to find a specific location in the stream log or click 'Show from top' to return to the beginning. Additional actions and detailed information for the job are available in the details panel. The Output of the import job, i.e., the 20 imported movies, is available on the right hand side of the event log.
To exit the job/close the inspect view, press the spacebar again, or press the × button on the top-right of the dialog.
Once finished, the job's status indicator changes to "Completed" in green.
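The wildcard selection used for the import can be reproduced outside CryoSPARC as a quick sanity check on the extracted test data. The sketch below is a hypothetical helper, not part of CryoSPARC; it assumes the extracted directory directly contains the 20 .tif movies and the single .mrc gain reference described above, and the directory path you pass in is your own.

```python
from pathlib import Path


def list_matches(directory, pattern):
    """Return the sorted names of files in `directory` matching a wildcard."""
    return sorted(p.name for p in Path(directory).glob(pattern) if p.is_file())


def check_tutorial_data(directory):
    """Count movies and gain references and compare with the tutorial's numbers."""
    movies = list_matches(directory, "*.tif")
    gain_refs = list_matches(directory, "*.mrc")
    return {
        "movies": len(movies),
        "gain_refs": len(gain_refs),
        # The tutorial expects 20 movies and a single gain reference file.
        "ok": len(movies) == 20 and len(gain_refs) == 1,
    }
```

Calling `check_tutorial_data` on the directory where the archive was extracted returns a small report; an `"ok"` value of `False` suggests the download or extraction was incomplete, or that the files sit in a subdirectory.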
For our next stage of processing, we will be performing Motion Correction on the imported movies. Motion Correction refers to the alignment and averaging of input movies into single-frame micrographs, for use downstream.
In the Job Builder, select Patch motion correction (multi) (parallelized over multiple GPUs, if you have them available). This creates a new job in building state so that its inputs and parameters are editable in the right side panel.
The Patch Motion Correction job requires raw movies as Inputs. Open the previously completed Import Movies job, then drag and drop the Outputs of the Import Movies job onto the Movies placeholder in the Job Builder.
Once dropped, the connected output name appears in the Job Builder as an Input:
If you have multiple GPUs available, you can speed up the processing time by setting the Number of GPUs to parallelize parameter within the "Compute settings" section to the number of GPUs you would like to assign to that job.
Queue the job and select a lane. It is generally not necessary to adjust the Patch Motion Correction job parameters; they are automatically tuned based on the data.
Once the job starts to run, the card updates with a preview image.
Our next step is to perform Contrast Transfer Function (CTF) Estimation. This stage involves the estimation of several CTF parameters in the dataset, including the defocus and astigmatism of each micrograph.
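To give a feel for what is being estimated, here is a minimal sketch of a 1D CTF model evaluated with the voltage (300 kV) and spherical aberration (2.7 mm) entered at import. It is not the algorithm used by Patch CTF Estimation. The sign convention, the amplitude contrast of 0.1, and the example defocus are assumptions made for illustration; conventions differ between software packages.

```python
import math


def electron_wavelength_angstrom(voltage_kv):
    """Relativistic electron wavelength in Å for an accelerating voltage in kV."""
    volts = voltage_kv * 1e3
    return 12.2643 / math.sqrt(volts + 0.97845e-6 * volts ** 2)


def ctf_1d(s, defocus_angstrom, voltage_kv=300.0, cs_mm=2.7, amp_contrast=0.1):
    """CTF value at spatial frequency s (1/Å) for a given underfocus in Å.

    Assumed convention: chi = pi*lambda*dz*s^2 - (pi/2)*Cs*lambda^3*s^4 and
    CTF = -sqrt(1 - w^2)*sin(chi) - w*cos(chi), with amplitude contrast w.
    """
    lam = electron_wavelength_angstrom(voltage_kv)
    cs = cs_mm * 1e7  # convert mm to Å
    chi = (math.pi * lam * defocus_angstrom * s ** 2
           - 0.5 * math.pi * cs * lam ** 3 * s ** 4)
    return (-math.sqrt(1.0 - amp_contrast ** 2) * math.sin(chi)
            - amp_contrast * math.cos(chi))


wavelength = electron_wavelength_angstrom(300.0)   # roughly 0.0197 Å
# Sample the CTF for a hypothetical defocus of 1.5 micrometres (15000 Å).
samples = [(s / 100.0, ctf_1d(s / 100.0, 15000.0)) for s in range(0, 31, 5)]
```

The oscillations in `samples` shift as the defocus changes, which is why the defocus of each micrograph has to be estimated before the data can be corrected.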
Select Patch CTF Estimation (multi) in the Job Builder to create a new job.
This job type requires micrographs as the input. Open the previous Patch Motion Correction job, and drag and drop the output (20 micrographs) into the Micrograph placeholder in the Job Builder.
You can connect outputs of jobs that haven't completed into the inputs of a building job. In this case, the newly created job will start to run automatically when all parent jobs have completed. This makes it easy to queue up a series of jobs to run without having to wait until they're completed to queue them manually.
As with Patch Motion Correction, the job will complete faster by allocating multiple GPUs. This can be configured with the Number of GPUs to parallelize parameter.
Queue the job to start. It is generally not necessary to adjust the Patch CTF Estimation job parameters; they are automatically tuned based on the data.
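The queuing behaviour described above, where a job starts automatically once all of its parent jobs have completed, amounts to running jobs in dependency order. The sketch below illustrates that idea with Python's standard `graphlib` module (Python 3.9+). It is not CryoSPARC's scheduler, and the `workflow` dictionary is simply the chain of jobs built so far in this tutorial plus the next one.

```python
from graphlib import TopologicalSorter

# Each job maps to the set of parent jobs whose outputs it consumes.
workflow = {
    "Import Movies": set(),
    "Patch Motion Correction": {"Import Movies"},
    "Patch CTF Estimation": {"Patch Motion Correction"},
    "Blob Picker": {"Patch CTF Estimation"},
}


def run_order(dependencies):
    """Return jobs in an order where every job follows all of its parents."""
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()
    order = []
    while sorter.is_active():
        for job in sorter.get_ready():   # jobs whose parents have all finished
            order.append(job)            # "run" the job
            sorter.done(job)             # mark complete, unblocking its children
    return order


order = run_order(workflow)
```

For this linear chain, `order` lists the four jobs in the sequence they were connected; with branching workflows, several jobs may become ready at the same time.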
It is important to attain a large number of high-quality particles for an optimal reconstruction. The Blob Picker is a common starting point for particle picking as it is a quick way to attain an initial set of particle images that can be used to refine picking techniques over time.
Blob picking is a good idea because it verifies data quality and sets expectations for what particle images, projections, and structures should look like. We'll use blob picks to generate a set of templates that can be used as an input to the Template Picker, which will generate a set of much higher-quality picks matching the two primary 2D views of the T20S structure.
Select the Blob Picker job, then enter the Job Builder and set Min. Particle Diameter to 100 and Max Particle Diameter to 200.
Locate the exposures output from the previously completed Patch CTF Estimation job. Drag and drop these into the Micrograph placeholder in the Job Builder. Queue the job.
Use the Inspect picks job to view and interactively adjust the results of blob-based (and template-based) automatic particle picking.
Select Inspect Particle Picks from the Job Builder.
Drag and drop both the particles and micrographs outputs from the previously completed Blob Picker job. Queue the job.
Once the job is ready to interact with, it will be marked as 'Waiting' and an "Interactive" tab will be available in the job details dialog:
Browse through and select from the list of micrographs on the left side.
The histogram on the left side shows statistics across all pick locations, including false positives and true particles. The y-axis measures the Power Score, and the x-axis measures the Normalized Cross-Correlation (NCC) Score for each particle pick. True particles generally have high NCC scores (indicating agreement in shape with the templates) and a moderate-to-high Power score (indicating the presence of significant signal). Picks that have too little power are false positives containing only ice, while picks with very high power are carbon edges, ice crystals, aggregate particles, etc.
Make adjustments to the parameters below if needed. All adjustments are saved automatically.
Adjust the lowpass filter slider if needed to better view the picks.
Adjust the box size to make it easier to see the location of picks. Often a very small box size (32) can be helpful.
The box size is not used in computing the outputs of this job; Inspect Picks only outputs particle (x, y) location coordinates.
Adjust the NCC slider (approx. 0.350).
Adjust the Power threshold slider (approx. between 1075 and 1745). This helps to remove false positives.
Once satisfied with the picks, select the "Done Picking! Output Locations!" button. This completes the Inspect Particle Picks job and saves the selected particle locations.
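Conceptually, the NCC and Power sliders act as a threshold filter on the pick scores. The following sketch shows that filtering logic on made-up data; the dictionary layout and the example picks are hypothetical and do not reflect how CryoSPARC stores picks internally. The thresholds are the approximate values suggested above.

```python
def filter_picks(picks, ncc_min, power_min, power_max):
    """Keep picks with NCC >= ncc_min and power within [power_min, power_max]."""
    return [
        p for p in picks
        if p["ncc"] >= ncc_min and power_min <= p["power"] <= power_max
    ]


# Invented example picks, one per situation described in the text.
picks = [
    {"x": 512, "y": 640, "ncc": 0.52, "power": 1300.0},    # plausible particle
    {"x": 90, "y": 2048, "ncc": 0.41, "power": 400.0},     # too little power (ice)
    {"x": 3000, "y": 120, "ncc": 0.45, "power": 2500.0},   # very high power (e.g. carbon edge)
    {"x": 1500, "y": 1500, "ncc": 0.20, "power": 1200.0},  # poor shape agreement
]

kept = filter_picks(picks, ncc_min=0.350, power_min=1075, power_max=1745)
kept_locations = [(p["x"], p["y"]) for p in kept]
```

Only the (x, y) locations of the surviving picks are carried forward, matching the note that Inspect Picks outputs location coordinates only.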
This job extracts particles from micrographs, and writes them out to disk for downstream jobs to read.
Select Extract from Micrographs (parallelized over multiple GPUs, if you have them available).
Open the recently-completed Inspect Picks job. Drag and drop both the micrographs and particles outputs into the corresponding inputs on the Job Builder.
In the Job Builder, look under the Particle Extraction section and change the Extraction box size (pix) to 440.
We generally recommend selecting a box size that is at least double the diameter of the particle. The box size controls how much of the micrograph is cropped around each particle location. Larger box sizes capture more of the high-resolution signal that is spread out spatially due to the effect of defocus (CTF) in the microscope. However, larger box sizes significantly increase computational expense in further processing. To mitigate this, you can use the Downsample Particles job to speed up processing for jobs that do not require the full spectrum of data, such as 2D Classification.
Queue the job.
Once the job completes, you'll notice the number of resulting particles is less than the input. This is because the particle extraction process excludes picks that are too close to the edge of the micrograph.
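A simple way to picture the edge exclusion is to test whether a box centred on each pick fits entirely inside the micrograph. The sketch below assumes exactly that rule; the precise criterion CryoSPARC applies is not stated here, and the 4096 x 4096 micrograph size and pick coordinates are hypothetical.

```python
def fits_in_micrograph(x, y, box_size, width, height):
    """True if a box_size x box_size box centred on (x, y) lies inside the image."""
    half = box_size // 2
    return half <= x <= width - half and half <= y <= height - half


def exclude_edge_picks(picks, box_size, width, height):
    """Drop pick locations whose extraction box would cross the image border."""
    return [
        (x, y) for (x, y) in picks
        if fits_in_micrograph(x, y, box_size, width, height)
    ]


# Hypothetical pick locations (pixels) on a hypothetical 4096 x 4096 micrograph.
picks = [(100, 2000), (2048, 2048), (3900, 300), (220, 220)]
kept = exclude_edge_picks(picks, box_size=440, width=4096, height=4096)
```

With a 440-pixel box, any pick closer than 220 pixels to a border is dropped in this model, which is why larger box sizes tend to discard more picks.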
2D Classification is a commonly used job in cryo-EM processing to get a first look at the data, group particles by 2D view, remove false positive picks, and even to get early insights into potential heterogeneity present in the dataset. In this step, we will use 2D Classification to group particles by 2D view, and then use the resulting class averages (also referred to as "templates") to improve our particle picking.
Select 2D Classification from the Job Builder.
Drag and drop the particles output from the previously completed Extract from Micrographs job into the input and queue the job.
In the stream log, preview images of class averages appear after each iteration. Classification into 50 classes (the default number) takes about 15 minutes on a single GPU.
The quality of classes depends on the quality of input particles. While we could use these blob-picked particles directly for 3D Reconstruction, we can obtain better results by repeating our particle picking using these templates generated by 2D Classification. Thus, in this tutorial, we will use these classes to inform the Template Picker of the shape of our target structure. The next step in this workflow is the Select 2D Classes job.
Select 2D Classes allows us to select a subset of the generated templates from 2D Classification, and to reject the rest.
Select Select 2D Classes from the Job Builder.
Drag and drop both the particles and class_averages outputs from the most recently completed 2D Classification job.
Queue the job. Once the data is loaded, the job status changes to Waiting and Interactive class selection mode is ready.
Select a "good" class for each distinct view of the structure; in this case, a top view and a side view. Use both the number of particles and the provided class resolution score to identify good classes of particles. The interactive job provides several ways to sort the classes in ascending or descending order based on:
# of particles: The total number of particles in each class
Resolution: The relative resolution of all particles in the class (Å)
ECA: Effective classes assigned
Use the sort and selection controls to quickly sort and filter the class selection. Each class has a right-click context menu that allows for selecting a set of classes above or below a particular criterion.
Avoid selecting classes that contain only a partial particle or a non-particle junk image.
When finished, select Done at the top right side of the window. The job completes.
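The sorting and "select above or below" controls described above can be thought of as ordinary sort and filter operations over per-class statistics. The sketch below uses an invented `ClassAverage` record and made-up numbers purely to illustrate the logic; it is not the interactive job's implementation.

```python
from dataclasses import dataclass


@dataclass
class ClassAverage:
    index: int
    num_particles: int
    resolution_angstrom: float   # smaller values mean better resolution
    eca: float                   # effective classes assigned


def sort_classes(classes, key, descending=True):
    """Sort classes by one attribute, e.g. 'num_particles' or 'eca'."""
    return sorted(classes, key=lambda c: getattr(c, key), reverse=descending)


def select_classes(classes, min_particles, max_resolution_angstrom):
    """Keep classes with enough particles and a resolution at least this good."""
    return [
        c for c in classes
        if c.num_particles >= min_particles
        and c.resolution_angstrom <= max_resolution_angstrom
    ]


# Made-up statistics for four classes.
classes = [
    ClassAverage(0, 4200, 8.5, 1.1),
    ClassAverage(1, 150, 22.0, 3.4),
    ClassAverage(2, 3100, 9.2, 1.3),
    ClassAverage(3, 40, 30.0, 5.0),
]

by_count = [c.index for c in sort_classes(classes, "num_particles")]
selected = [c.index for c in select_classes(classes, 1000, 10.0)]
```

Numeric criteria like these only narrow the candidates; the final choice still depends on visually checking that each class shows a complete particle.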
The Template Picker operates similarly to the Blob Picker but accepts an input set of templates, which it uses to more precisely pick particles that match the shape of the target structure.
Connect the 'selected classes' output of the Select 2D Classes job into the templates input
Connect the output of the Patch CTF Estimation job into the micrographs input
Set the Particle diameter (Å) value to 190
Queue the job. It should take around 15 seconds to process the dataset.
As with the Blob Picker, use the Inspect picks job to view and interactively adjust the results of template-based automatic particle picking.
Select Inspect Particle Picks from the Job Builder.
Drag and drop both the particles and micrographs outputs from the previously completed Template Picker job. Queue the job.
Select an NCC score of around 0.350
Select a power score between around 930 and 1990
We will now repeat the extraction process given the new set of pick locations generated by the latest Inspect Picks job.
Select Extract from Micrographs (parallelized over multiple GPUs, if you have them available).
Open the recently-completed Inspect Picks job. Drag and drop both the micrographs and particles outputs into the corresponding inputs on the Job Builder.
In the Job Builder, look under the Particle Extraction section and change the Extraction box size (pix) to 440.
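Box sizes are given in pixels while particle diameters are given in Å, so it can help to convert between the two. The sketch below assumes the micrographs keep the raw pixel size of 0.6575 Å entered at import (that is, no downsampling during motion correction, which is an assumption) and that downsampling preserves the field of view, as Fourier cropping does. The function names are hypothetical.

```python
RAW_PIXEL_SIZE = 0.6575  # Å per pixel, as entered in the Import Movies job


def pixels_to_angstrom(n_pixels, pixel_size=RAW_PIXEL_SIZE):
    """Physical length in Å spanned by n_pixels."""
    return n_pixels * pixel_size


def angstrom_to_pixels(length_angstrom, pixel_size=RAW_PIXEL_SIZE):
    """Number of pixels (possibly fractional) spanning a length in Å."""
    return length_angstrom / pixel_size


def downsampled_pixel_size(box_size, new_box_size, pixel_size=RAW_PIXEL_SIZE):
    """Pixel size after resampling a box from box_size to new_box_size pixels."""
    return pixel_size * box_size / new_box_size


box_angstrom = pixels_to_angstrom(440)          # about 289.3 Å
template_pixels = angstrom_to_pixels(190)       # about 289 pixels
binned_pixel = downsampled_pixel_size(440, 220) # about 1.315 Å per pixel
nyquist_limit = 2 * binned_pixel                # about 2.63 Å
```

The last two values illustrate the trade-off behind downsampling: halving the box size quarters the number of pixels per particle, but also halves the finest resolution the images can represent.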
Select 2D Classification from the Job Builder.
Drag and drop the particles output from the previously completed Extract from Micrographs job into the input and queue the job.
The particles extracted from the template picker result in much higher quality 2D classes. Proceed to the next step to filter out the highest quality classes for 3D reconstruction.
Select Select 2D Classes from the Job Builder.
Drag and drop both the particles and class_averages outputs from the most recently completed 2D Classification job.
Once queued and running, switch to the Interactive tab and select all of the good quality classes.
In this case, we want to keep all true particles in our dataset (rather than just selecting one top and one side view, as previously done in step 11), and reject all false positives.
The visual quality of a 2D class, together with its resolution, number of particles, and ECA, can all provide proxy measurements of the quality of the underlying particles that comprise the class.
Note that clear "junk" classes, corresponding to non-particle images, blurry images, ice crystals, etc., should be rejected at this stage.
After you click 'Done', the job generates outputs for each group of classes and particles: one for the selected set and another for the excluded set.
Now that we have a set of good quality particle picks, we can proceed into 3D Reconstruction.
Select Ab-initio Reconstruction from the Job Builder.
Drag and drop the particles_selected output from the most recently completed Select 2D classes job (classes selected from the result of the template picker) into the Particle stacks input in the Job Builder.
Note: You do not need to enforce symmetry during Ab-initio Reconstruction.
Queue the job. Results appear in real-time in the stream log as iterations progress. Ab-initio reconstruction should resolve the T20S structure to a coarse resolution.
Now that we have a coarse resolution 3D density, we can refine the density to high-resolution using the Homogeneous Refinement job.
Select Homogeneous Refinement from the Job Builder.
Drag and drop both the particles_all_classes and volume_class_0 from the recently completed Ab-initio Reconstruction into the Particle Stacks and Initial Volume inputs, respectively.
Set the following parameter:
Symmetry: D7
Queue the job. Results appear in real time in the stream log. The refinement job performs a rapid gold-standard refinement using the Expectation Maximization and branch-and-bound algorithms. The job displays the current resolution, measured via Fourier Shell Correlation, and other diagnostic information for each iteration.
Once complete, download the volume and/or mask directly from the Outputs section on the right hand side: Select the drop-down to choose the outputs you wish to download. A refinement job outputs a map_sharp, the final refined volume with automatic B-factor sharpening applied and filtered to the estimated FSC resolution.
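To illustrate how a resolution number is read off a Fourier Shell Correlation curve, the sketch below finds where a curve first drops below a threshold and converts that shell index to Å. It assumes the commonly used 0.143 criterion for gold-standard FSC and linear interpolation between shells; the exact procedure in the refinement job may differ, and the example curve is invented.

```python
def fsc_resolution(fsc_curve, box_size, pixel_size, threshold=0.143):
    """Resolution in Å where the FSC curve first falls below `threshold`.

    fsc_curve[k] is the correlation in Fourier shell k (k = 0 is the DC term),
    so shell k corresponds to a resolution of box_size * pixel_size / k.
    """
    if len(fsc_curve) < 2 or fsc_curve[0] <= threshold:
        raise ValueError("need at least two shells, starting above the threshold")
    extent = box_size * pixel_size  # box edge length in Å
    for k in range(1, len(fsc_curve)):
        prev, cur = fsc_curve[k - 1], fsc_curve[k]
        if cur < threshold:
            # Linearly interpolate the crossing point between shells k-1 and k.
            frac = (prev - threshold) / (prev - cur)
            return extent / ((k - 1) + frac)
    # The curve never crosses: report the resolution of the last shell.
    return extent / (len(fsc_curve) - 1)


# Invented toy curve for a 100-pixel box at 1.0 Å per pixel.
toy_curve = [1.0, 0.9, 0.8, 0.243, 0.043]
toy_resolution = fsc_resolution(toy_curve, box_size=100, pixel_size=1.0)
```

In this toy case the crossing falls halfway between shells 3 and 4, so the reported value is 100 / 3.5 Å; real curves have many more shells and cross at much finer resolution.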
For optimal results in publications and for model-building, it is often necessary to re-sharpen and adjust the B-factor. This step is optional as Homogeneous Refinement already outputs a sharpened volume (map_sharp) with a B-factor reported within the Guinier plot.
Sharpen the result of the refinement with the Sharpening Tools from the Utilities section in the Job Builder.
Drag and drop the volume output from the previous refinement job into the volume input.
Set a B-Factor. Get a good starting B-Factor value from the final Guinier plot in the stream log of the refinement job you previously ran. It is recommended to input a B-Factor (as a negative value) within ±20 of the value reported in the plot. In this case, the two bounds are -56.7 and -96.7.
Activate the Generate new FSC mask parameter, which will generate a new mask for the purposes of FSC calculation from the input volume. This is done by thresholding, dilating, and padding the input structure.
Queue the job. Once complete, download the sharpened map (map_sharp) from the output.
After visually assessing the map, optionally run another sharpening job (clear or clone the existing one) with a different B-Factor.
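The ±20 rule for choosing trial B-factors is simple arithmetic, sketched below. The reported value of 76.7 is inferred from the two bounds quoted above (-56.7 and -96.7) rather than stated directly. The `sharpening_gain` function assumes the common convention in which Fourier amplitudes at resolution d are scaled by exp(-B / (4 d²)); this is an illustrative assumption, not a description of the Sharpening Tools job internals.

```python
import math


def b_factor_candidates(guinier_b, spread=20.0):
    """Return (less sharp, more sharp) negative B-factors around the Guinier value."""
    b = -abs(guinier_b)  # sharpening B-factors are entered as negative values
    return (round(b + spread, 1), round(b - spread, 1))


def sharpening_gain(b_factor, resolution_angstrom):
    """Amplitude scale factor at a given resolution under exp(-B / (4 d^2))."""
    return math.exp(-b_factor / (4.0 * resolution_angstrom ** 2))


candidates = b_factor_candidates(76.7)
# Relative boost applied to signal at 3 Å for each candidate B-factor.
gains_at_3A = [sharpening_gain(b, 3.0) for b in candidates]
```

A more negative B-factor boosts high-resolution amplitudes more strongly, which sharpens fine detail but also amplifies noise, so it is worth comparing the resulting maps visually.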
Once you have assembled a workflow of connected jobs within a project, you can switch to the Tree View to understand how jobs are connected to obtain the final result. Click the flowchart icon in the top header:
Within the Tree View, you can select jobs and modify/connect them in the same way as previously demonstrated in the Card view. For more information on the Tree view and other useful tips, see the User Interface and Usage Guide.
Now that you have refined the data to a high-resolution structure, you can apply more advanced processing techniques. Explore the job builder and other documentation to see the available job types and processing options. Common workflows include:
3D Variability Analysis to explore both discrete and continuous heterogeneity in the dataset
Non-Uniform Refinement to improve resolutions by accounting for disordered regions and local variations in a structure
Heterogeneous Refinement or 3D Classification to refine multiple conformations and simultaneously classify particles
Sub-classification to identify small, slightly differing populations
Heterogeneous Ab-initio Reconstruction to find multiple unexpected conformational states or multiple distinct particles in the data
Multiple rounds of 2D Classification to remove more junk particles
Masked/Local Refinements to focus on sub-regions of a structure
Re-pick with multiple higher quality 2D classes
Global (per-exposure-group) or Local (per-particle) CTF Refinement
For detailed explanations on all available job types and commonly adjusted parameters, see:
Check back to see updates to this guide, as new features and algorithms are in constant development within CryoSPARC.