Case Study: End-to-end processing of encapsulated ferritin (EMPIAR-10716)
Processing EMPIAR-10716 with a focus on high-symmetry ab-initio reconstruction, local symmetry, non-point-group symmetry, symmetry expansion, and custom geometry operations.
The aim of this case study is to demonstrate some advanced tools and processes within CryoSPARC that enable processing of structures with unconventional symmetry. The dataset covered in this case study is of an encapsulin nanocompartment containing four encapsulated ferritin (EncFtn) decamers, originally collected by Jennifer Ross, et al. (2022). The raw data is publicly available for download as EMPIAR-10716.
The main topics of focus covered in this case study include: high-symmetry ab-initio reconstruction, local symmetry, non-point-group symmetry, symmetry expansion, and custom geometry operations.
Encapsulins are a type of protein that contain internal cargo, and this encapsulin is an icosahedrally-symmetric molecule from the bacterium Haliangium ochraceum. The geometry of the nanocompartment is best illustrated in Figure 1:
Due to the complicated geometry of this nanocompartment, processing it can be challenging. The four EncFtn decamers are arranged in an approximate tetrahedral shape, which complicates solving this structure due to the mismatch in symmetry of the cargo and the shell. Furthermore, each EncFtn molecule itself has 5-fold dihedral symmetry, meaning there is an additional symmetry mismatch between the tetrahedral arrangement and the internal D5 symmetry of each EncFtn molecule.
While refinement of the encapsulin is fairly straightforward, recovering high resolution in the internal EncFtn is difficult without custom steps that take care to respect the geometry of the cargo. This case study walks through the steps we took to handle this geometry in CryoSPARC, achieving a final high-resolution structure of both encapsulin and encapsulated ferritin. The previously published map of encapsulated ferritin from this dataset reached resolutions of 5-6 Å; using the techniques of this case study, we were able to resolve the encapsulated ferritin to a sub-3 Å structure.
This case study is divided into two sections, each with subsections covering the major processing tasks:
Section A: Encapsulin Processing
A1: Preprocessing and Particle Picking in CryoSPARC Live
A2: 2D Classification
A3: Encapsulin 3D Reconstruction
Section B: Encapsulated Ferritin Processing
B1: Group Re-alignment on Tetrahedron
B2: Custom Symmetry Expansion
B3: Group Re-alignment on Encapsulated Ferritin
B4: Local Refinement
All processing was done in CryoSPARC and CryoSPARC Live v4.5.
A: Encapsulin Processing
This case study begins with processing the dataset, with the goal of reconstructing encapsulin.
A1: Preprocessing and Particle Picking
Preprocessing of exposures consists of import, motion correction, and CTF estimation. These steps can either be completed separately using individual jobs in CryoSPARC, or simultaneously using CryoSPARC Live. The latter can be quicker as it allows processing exposures in a streaming fashion, where one exposure can be imported, motion corrected, and CTF estimated all in sequence (i.e. without waiting for all other exposures to finish each step).
We will use CryoSPARC Live to perform import, motion correction, CTF estimation, particle picking and extraction.
If you haven’t used CryoSPARC Live before, you can review this Start to Finish Guide.
Download Raw Data (Subset)
This dataset comprises 8,109 movies.
Due to the large size of the dataset, we chose to only process a subset of the 8,109 movies. As we will see, that was sufficient for high resolution reconstruction due to the high symmetry present, but further improvements in map quality are possible if the entire dataset is used.
Thus, to make processing quicker, we will only download the movies in two subdirectories, GridSquare_16285984 and GridSquare_16286188, which amount to 2,815 movies in total.
cd /path/to/rawdata  # navigate to a container directory to hold the raw data
wget https://www.ebi.ac.uk/empiar/world_availability/10716/data/micrographs/GridSquare_16285984/. && wget https://www.ebi.ac.uk/empiar/world_availability/10716/data/micrographs/GridSquare_16286188/.
Set up Live Session
Create a new CryoSPARC Project, and within this project, create a new Live Session. Under the configuration tab, enter the following configuration information and parameters:
| Parameter | Value |
| --- | --- |
| Raw pixel size (A) | 0.326 |
| Accelerating voltage (kV) | 300 |
| Spherical Aberration | 2.7 |
| Total exposure dose (e/A^2) | 40.509 |
| Save Results in 16-bit floating point | Yes |
| Output F-crop factor | 0.5 |
| Minimum particle diameter | 160 |
| Maximum particle diameter | 230 |
| Use circular blob | Yes |
| Use ring blob | Yes |
| Extraction box size | 800 |
| Fourier crop to box size | 512 |
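The pixel sizes implied by these settings can be checked with a little arithmetic. The sketch below assumes that the extraction box size is given in pixels of the F-cropped (motion-corrected) micrograph; the variable names are illustrative and are not CryoSPARC parameter names.

```python
# Values taken from the Live session configuration above.
raw_pixel_size_A = 0.326     # raw pixel size (Å)
f_crop_factor = 0.5          # output F-crop factor
extraction_box_px = 800      # extraction box size (pixels)
fourier_crop_box_px = 512    # box size after Fourier cropping (pixels)

# Assumption: F-cropping by 0.5 halves the sampling, doubling the pixel size.
micrograph_pixel_size_A = raw_pixel_size_A / f_crop_factor

# Assumption: the extraction box is measured in micrograph pixels.
box_extent_A = extraction_box_px * micrograph_pixel_size_A

# Fourier cropping keeps the physical extent but samples it with fewer pixels.
particle_pixel_size_A = box_extent_A / fourier_crop_box_px
nyquist_A = 2 * particle_pixel_size_A

print(f"micrograph pixel size: {micrograph_pixel_size_A:.3f} Å")
print(f"box extent:            {box_extent_A:.1f} Å")
print(f"particle pixel size:   {particle_pixel_size_A:.3f} Å")
print(f"Nyquist limit:         {nyquist_A:.2f} Å")
```

Under these assumptions the box spans roughly 522 Å, more than twice the maximum particle diameter of 230 Å, and the Nyquist limit of the extracted particles is close to 2 Å.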
In the Configuration Tab, create two exposure groups, with the following fields set:
Use at least one Preprocessing GPU worker. Set the number of Reconstruction GPU workers to 1 (note that reconstruction tasks, i.e. ab-initio and refinement, will not be used in Live for this case study).
Click “Start session” to begin processing. CryoSPARC Live will automatically begin motion correction, CTF estimation, particle picking, and particle extraction.
In the Overview Tab, modify the upper CTF fit resolution threshold to 6 Å.
In the Picking Tab, adjust the Normalized Cross Correlation (NCC) and Power Score sliders to remove false positive picks.
CryoSPARC Live will work through the exposures and process them until extraction is complete for each exposure. The exposures have finished processing when the number of processed exposures equals the number of total exposures.
We are now done with CryoSPARC Live for this case study. For the remainder of the processing, we will use the standard CryoSPARC interface.
Navigate to the top dropdown menu, and click “Go to session workspace”; in this workspace we will carry out the rest of the jobs.
A2: 2D Classification
Once we have an exported stack of particles in CryoSPARC, we will use 2D Classification to curate our particle stack and remove false positive particle picks.
In the session workspace, locate the most recent “Live Particle Export” job. Add these particles to a new 2D Classification job with the following parameters:
| Parameter | Value |
| --- | --- |
| Number of classes | 80 |
| Minimum separation distance (A) | 60 |
| Number of GPUs to parallelize | 3 |
Once the 2D Classification job is complete, use quick actions to queue a Select 2D Classes job. Select classes that show high resolution detail in the encapsulin. Note that since the encapsulin is the dominant signal in the images at this stage, the interior cargo will likely remain blurry and ill-defined for most of the classes, even the classes for which encapsulin is well-defined. Reject all classes that have multiple overlapping encapsulin particles, are empty, include ice or carbon edges, or otherwise have junk in them.
A3: Encapsulin 3D Reconstruction
The figure below illustrates the workflow for subsection A3 of this case study.
Initial Model Generation
Now that we have a set of curated particles, we will move onto 3D initial model generation. High-symmetry structures often require special treatment during initial model generation, as the particle images for these types of structures typically look very similar to each other at low and medium resolutions, regardless of the particle pose. This lack of information in the data makes it difficult for algorithms like Ab-Initio Reconstruction to reconstruct the correct structure when using default parameters. Typically, running Ab-Initio Reconstruction in this setting will yield “flattened” density, with all particles assigned to the same viewing direction.
There are two options to work around the lack of information in the images:
1. Enforce symmetry during Ab-Initio Reconstruction: This will guarantee that a symmetric structure is found.
2. Disable “Enforce non-negativity”: This parameter has empirically been found to help discourage ab-initio from producing flattened models.
We can see the difference between options 1 and 2 in the final structures found by Ab-initio in each case:
The first option is undesirable because we would like to preserve the internal asymmetric structure within the encapsulin as best as possible. The internal structure doesn’t follow an icosahedral symmetry like the outer shell does, so enforcing symmetry will prevent any details from being resolved inside of the encapsulin.
Taking the particles from the Select 2D Classes job, we will build an Ab-initio Reconstruction job with the following parameters:
| Parameter | Value |
| --- | --- |
| Maximum Resolution (Angstroms) | 8 |
| Initial Resolution (Angstroms) | 25 |
| Center Structures in Real space | Off |
| Enforce non-negativity | Off |
Volume Alignment Tools (Symmetry Alignment)
After obtaining an initial model of the encapsulin, we would like to refine it to high-resolution. Since we gave Ab-Initio no symmetry information, the structure is oriented arbitrarily. This is fine for intrinsically asymmetric structures, but for symmetric structures, we must ensure they are aligned to the symmetry axes if we later want to enforce symmetry or enable symmetry relaxation.
Alignment of the initial volume to the symmetry axes is also done automatically by the subsequent Homogeneous Refinement job, but we include it explicitly here to get familiar with the Volume Alignment Tools job’s parameters, inputs and outputs.
To do this, build a Volume Alignment Tools job, activate symmetry alignment, and input “I” as the symmetry string. Connect both the volume and particles from Ab-Initio Reconstruction to this job. The output volume should be aligned to the icosahedral symmetry axes.
Refinement of encapsulin
Now that we have an aligned model, we can refine this to high resolution using a Homogeneous Refinement job. For this job, we will use symmetry relaxation to give the refinement the best chance of preserving the asymmetry of the encapsulin contents.
Setting the symmetry relaxation to “maximization” enables symmetry relaxation. It can also be set to “marginalization”, which uses a slightly different method for finding the optimal pose.
Setting the dynamic mask start resolution to 1 Å causes the job to use no mask, which is important as dynamic masking can remove lower-contrast asymmetric details that we’d like to preserve, such as the internal contents of the encapsulin.
From here, with ~250k particles, we obtained a C1-refined structure of encapsulin at around 3.0 Å.
Homogeneous Reconstruction Only
Now that we have a high-resolution reference structure, there are many avenues to further improve resolutions using reference-based algorithms for latent variable estimation. In this tutorial, we’ll use Global CTF Refinement to correct for high-order aberrations. We’ll also create a symmetrized version of the encapsulin portion of the reference, and then use this for Particle Subtraction to generate particle images with only signal from the internal contents present. This will prepare us for section B of this case study: processing the internal encapsulated ferritin structure.
To get an icosahedral (I) symmetric reference for subtracting the encapsulin away, we don’t have to repeat a full refinement.
Instead, connect the particles from the previous C1 refinement to a Homogeneous Reconstruction Only job, and run the job with a symmetry of I specified. This will work as we have ensured our initial reference to the homogeneous refinement was symmetry aligned.
Note that despite enforcing symmetry in this Homogeneous Reconstruction Only job, the particles will retain their C1 alignments. The particles therefore remain suitable for downstream processing of the tetrahedral arrangement of EncFtn, and we won’t lose the effort put into preserving the symmetry-break. This would not be true if we re-ran a refinement with symmetry enforced, as alignments would be re-calculated against the symmetric reference.
Application of symmetry will result in a significant increase in resolution owing to the greater number of asymmetric units contributing to the structure. In our case, the resolution improved from 3.0 Å to 2.5 Å over the C1-refined structure.
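To make "greater number of asymmetric units" concrete, the sketch below builds the icosahedral rotation group numerically and counts its operators. The generators are our own choice for illustration (a 5-fold axis through the icosahedron vertex (0, 1, φ) and a 3-fold axis through the face centre (1, 1, 1)); they are not meant to reproduce CryoSPARC's internal axis convention. The same routine is applied to D5, the symmetry of a single EncFtn decamer.

```python
import math

def rotation_matrix(axis, angle_deg):
    """Rotation about `axis` by `angle_deg`, via Rodrigues' formula."""
    norm = math.sqrt(sum(a * a for a in axis))
    x, y, z = (a / norm for a in axis)
    c = math.cos(math.radians(angle_deg))
    s = math.sin(math.radians(angle_deg))
    t = 1.0 - c
    return (
        (c + t * x * x, t * x * y - s * z, t * x * z + s * y),
        (t * x * y + s * z, c + t * y * y, t * y * z - s * x),
        (t * x * z - s * y, t * y * z + s * x, c + t * z * z),
    )

def matmul(a, b):
    """Product of two 3x3 matrices stored as nested tuples."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3))
        for i in range(3)
    )

def key(m, decimals=6):
    """Rounded, hashable form of a matrix, used to spot duplicates."""
    return tuple(round(v, decimals) + 0.0 for row in m for v in row)

def generate_group(generators):
    """Close a set of rotations under multiplication (finite groups only)."""
    identity = ((1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0))
    elements = {key(identity): identity}
    frontier = [identity]
    while frontier:
        next_frontier = []
        for m in frontier:
            for g in generators:
                product = matmul(g, m)
                k = key(product)
                if k not in elements:
                    elements[k] = product
                    next_frontier.append(product)
        frontier = next_frontier
    return list(elements.values())

phi = (1.0 + math.sqrt(5.0)) / 2.0
icosahedral_ops = generate_group([
    rotation_matrix((0.0, 1.0, phi), 72.0),   # 5-fold axis through a vertex
    rotation_matrix((1.0, 1.0, 1.0), 120.0),  # 3-fold axis through a face centre
])
d5_ops = generate_group([
    rotation_matrix((0.0, 0.0, 1.0), 72.0),   # 5-fold axis
    rotation_matrix((1.0, 0.0, 0.0), 180.0),  # perpendicular 2-fold axis
])

print(len(icosahedral_ops), "operators in I")
print(len(d5_ops), "operators in D5")
```

With 60 operators, each particle image contributes 60 views of the shell's asymmetric unit when I symmetry is enforced, so a stack of ~250k particles behaves like roughly 15 million asymmetric-unit observations.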
Mask generation using Volume Tools
The Particle Subtraction job takes in a set of previously-aligned particle images, the corresponding reference volume to which they’ve been aligned, and a mask covering the region of the volume that we would like removed from the particle images.
We’d like to subtract the encapsulin away, leaving particle images with just the encapsulated ferritin inside. To do this, we need to use Particle Subtraction, and provide it with a mask covering just the encapsulin.
To obtain this mask, download the volume from the upstream Homogeneous Reconstruction Only job. Open this volume in UCSF ChimeraX, and select a threshold value that preserves most of the encapsulin while removing all of the internal density.
Note down this threshold value.
Create a Volume Tools job and connect the volume from the reconstruction only job, with the following parameters:
| Parameter | Value |
| --- | --- |
| Type of input volume | map |
| Type of output volume | mask |
| Threshold | chosen value |
| Dilation radius (pix) | 4 |
| Soft padding (pix) | 18 |
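The roles of the threshold, dilation radius, and soft padding can be illustrated on a one-dimensional density profile. This is only a sketch: the cosine falloff and the function name `soft_mask_1d` are assumptions made for illustration, and the actual edge profile used by Volume Tools may differ.

```python
import math

def soft_mask_1d(density, threshold, dilation_px, padding_px):
    """Toy 1D mask: binarize at `threshold`, dilate, then add a soft edge."""
    inside = [i for i, v in enumerate(density) if v >= threshold]
    mask = []
    for i in range(len(density)):
        # Distance (in pixels) to the nearest above-threshold sample.
        d = min((abs(i - j) for j in inside), default=math.inf)
        if d <= dilation_px:
            mask.append(1.0)  # binarized region plus dilation
        elif d < dilation_px + padding_px:
            frac = (d - dilation_px) / padding_px
            mask.append(0.5 * (1.0 + math.cos(math.pi * frac)))  # soft edge
        else:
            mask.append(0.0)  # outside the mask
    return mask

# A 4-pixel-wide "shell wall" in the middle of a 64-pixel profile.
profile = [0.0] * 30 + [1.0] * 4 + [0.0] * 30
mask = soft_mask_1d(profile, threshold=0.5, dilation_px=4, padding_px=18)

for index in (30, 26, 17, 9, 8):
    print(index, round(mask[index], 3))
```

The mask is 1 within 4 pixels of the thresholded density, falls smoothly to 0 over the next 18 pixels, and is 0 beyond that. The soft edge is what avoids sharp-edge artefacts when the mask is applied.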
Global CTF Refinement
Connect the particles and volume from the upstream symmetry-enforced Reconstruction Only to a Global CTF Refinement job. Connect the mask from the previous Volume Tools job. Activate “Fit anisotropic mag”, to allow for estimation of anisotropic magnification; this will also set the number of iterations to 2.
Global CTF Refinement works best when operating on the reference with the largest mass and highest resolution available, and on particles with refined alignments. Global CTF parameters are a function of the microscope (i.e., they have little dependence on which region of the protein is being refined), and so can be optimized with the 2.5 Å symmetry-enforced encapsulin map, which is large and high-quality.
Homogeneous Reconstruction Only
To get a reference generated from CTF-refined particles, we’ll repeat reconstruction using the CTF-refined particles.
Create a Homogeneous Reconstruction Only job. Connect the particles from the previous Global CTF Refinement along with the mask from the Volume Tools job, set the symmetry to “I”, and launch the job.
Particle Subtraction
Connect the particles, volume, and mask from the previous Homogeneous Reconstruction Only job to a Particle Subtraction job. Since this mask only covers the encapsulin, the particles will have encapsulin subtracted away and internal contents preserved.
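Conceptually, subtraction removes from each image the projection of the masked part of the reference. The toy below uses a 2D "volume" projected to 1D and deliberately ignores particle pose, CTF, scale, and noise, all of which the real job has to account for; the function names are hypothetical.

```python
def project(volume):
    """Project a 2D 'volume' (list of rows) to 1D by summing each column."""
    return [sum(column) for column in zip(*volume)]

def subtract_masked_signal(image, reference, mask):
    """Subtract the projection of (mask * reference) from an image."""
    masked_reference = [
        [value * weight for value, weight in zip(ref_row, mask_row)]
        for ref_row, mask_row in zip(reference, mask)
    ]
    shell_projection = project(masked_reference)
    return [pixel - shell for pixel, shell in zip(image, shell_projection)]

size = 5
# Shell density (2.0) on the border, cargo density (1.0) at the centre.
reference = [
    [2.0 if r in (0, size - 1) or c in (0, size - 1) else 0.0 for c in range(size)]
    for r in range(size)
]
reference[2][2] = 1.0
# Mask covering only the shell.
shell_mask = [
    [1.0 if r in (0, size - 1) or c in (0, size - 1) else 0.0 for c in range(size)]
    for r in range(size)
]

image = project(reference)  # idealized, noise-free particle image
subtracted = subtract_masked_signal(image, reference, shell_mask)
print(image)
print(subtracted)
```

Only the cargo's projection survives the subtraction, which is the property we rely on in section B.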
B: Encapsulated Ferritin Processing
We now have subtracted particles containing just the four copies of encapsulated ferritin. However, we have a tricky case of symmetry to handle:
Within each EncFtn, there is D5 (5-fold and 2-fold) symmetry
The four encapsulated ferritin are arranged in a tetrahedral configuration. However, this is not equivalent to a tetrahedral point symmetry group, because tetrahedral point symmetry has 3-fold symmetry at each vertex, but EncFtn is 5-fold symmetric.
Thus, EncFtn has two different types of symmetry. First, each individual copy of EncFtn has D5 point group symmetry, which may be referred to as a local symmetry. Second, the overall tetrahedral arrangement imparts a non-point-group, four-fold symmetry, which must be treated with custom operations to superimpose each of the four EncFtn.
The figure below demonstrates the processing chain we will use for refining the encapsulated ferritin.
Homogeneous Reconstruction & Local Refinement (demonstration)
Now that we have subtracted particles, we might first attempt to refine the interior.
First, we launch a Homogeneous Reconstruction job to generate an initial reference from the subtracted particles.
Next, using the subtracted particles and reference, launch a Homogeneous Refinement. To prevent the job from using masking altogether, set the “Dynamic mask start resolution” to 1 Å.
With our stack of 250k subtracted particles, the Homogeneous Refinement reached a resolution of 10 Å.
Next, we attempted to locally refine one of the encapsulated ferritin proteins. By using the map eraser in UCSF ChimeraX, a mask was generated around one protein, and the structure was locally refined.
This Local Refinement stalled at a claimed resolution of ~6.9 Å. The non-protein streaking artefacts visible in the map above are a characteristic sign of overfitting. There are two potential reasons why this Local Refinement was unsuccessful:
There are many junk particles in the dataset. This is supported by the fact that one of the 3D classes obtained in the next step does not show a clear tetrahedral configuration of the four EncFtn instances, instead showing a disordered “ring” of density where three distinct EncFtn would be expected:
Initial particle alignments are not close enough to their optimal values. Local Refinement can only move particle alignments by so much, and the larger the search space we give it, the greater the potential for overfitting. Local Refinement works best when it does not have to search a large range of poses, and when tight Gaussian priors are used to limit the drift of alignments.
Both factors above make it more difficult for local refinement to solve for a high-resolution structure with minimal artefacts.
B1: Group Re-alignment on Tetrahedron
Instead of refinement, we’ll attempt to perform 3D Classification on the tetrahedral arrangement of EncFtn. Since 3D Classification doesn’t update alignments, we’ll be able to see if there is heterogeneity in the orientations of the “tetrahedra”.
To bias the classification as little as possible, we’ll provide a spherical mask covering the inside of the encapsulin. If we used a mask generated from the consensus subtracted particles, we may bias the classification to only find classes of the tetrahedron oriented in the same orientation as the consensus structure.
Obtain spherical mask (UCSF ChimeraX)
Before starting 3D Classification, we’d like to use a mask that excludes the corners of the box, along with any other residual density from the encapsulin. At this stage the best mask to use would be a soft spherical mask centered at the box center, as this biases the classification the least (as opposed to a mask contoured to the consensus density).
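The reason a centred sphere biases the classification least is that it looks the same from every orientation. A small sketch of such a binary mask base on a coarse grid, written as plain nested lists, is given below. The box size, radius, and centre convention (`box_size // 2`) are illustrative assumptions, and writing the result to an MRC file is not shown.

```python
def spherical_mask(box_size, radius_px):
    """Binary sphere centred in a cubic box, indexed as mask[z][y][x]."""
    center = box_size // 2  # assumed centre convention
    radius_sq = radius_px ** 2
    return [
        [
            [
                1.0
                if (x - center) ** 2 + (y - center) ** 2 + (z - center) ** 2 <= radius_sq
                else 0.0
                for x in range(box_size)
            ]
            for y in range(box_size)
        ]
        for z in range(box_size)
    ]

box = 16
mask = spherical_mask(box, radius_px=5)

# The mask is unchanged when the axes are permuted, i.e. it does not single
# out any direction in the box.
axes_equivalent = all(
    mask[z][y][x] == mask[x][z][y]
    for z in range(box)
    for y in range(box)
    for x in range(box)
)
print(axes_equivalent)
```

A mask contoured to the consensus density would not pass a check like this, which is why it could favour classes oriented like the consensus structure.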
Generate a mask base in UCSF ChimeraX
Download the map from the most recent homogeneous reconstruction job. Open the map in a new ChimeraX session. Navigate to the “Tools” tab on the menu bar, and head to Tools > Volume Data > Map Eraser. The map eraser should open by default in the center of the volume; if not, adjust the pink sphere’s position to the center of the volume via clicking and dragging. Adjust the size of the map eraser to approximately surround the internal disordered density of the encapsulin; refer to the image below for an example:
Lower the density threshold all the way to the lowest value in the volume — you should see a large cube of density. This is so that we can erase all density outside of a central sphere, leaving us with a spherical mask base in the center.
Use the threshold operation to binarize the cube:
vop threshold #1 maximum <threshold_value_here> setMaximum 1
Now click “Erase outside sphere”. You should now have a solid sphere of density:
Finally, save this map. Click File > Save... and change the “Map” dropdown to the thresholded volume. Give the file a name, and click save:
Import Mask into CryoSPARC
In your CryoSPARC Workspace, create an Import 3D Volumes job. Provide the path to the mask base. Change the “Type of volume being imported” to mask, and run the job.
Volume Tools (padding)
Connect the imported mask to a new Volume Tools job. Set the following parameters, to pad the mask with a width of ~20 pixels:
| Parameter | Value |
| --- | --- |
| Type of input volume | mask |
| Type of output volume | mask |
| Threshold | 0.5 |
| Soft padding width | 20 |
This job will produce a softly-padded mask as its output.
Group Re-alignment
Create a 3D Classification job. Take the particles from the Particle Subtraction job, along with the spherical mask, and connect the mask to the “Solvent mask” input. Use the following parameters for the 3D Classification job:
| Parameter | Value |
| --- | --- |
| Number of classes | 20 |
| Filter resolution (Å) | 8 |
| O-EM batch size per class | 2000 |
| O-EM learning rate init | 0.9 |
| O-EM learning rate half-life (%) | 0 |
| Force hard classification | On |
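As we understand it, forcing hard classification means each particle is assigned wholly to its most probable class rather than being shared between classes. A minimal sketch of that idea, with made-up class probabilities and a hypothetical helper name:

```python
def hard_assign(posteriors):
    """Return, for each particle, the index of its most probable class."""
    return [
        max(range(len(probs)), key=probs.__getitem__)
        for probs in posteriors
    ]

# Made-up class probabilities for three particles and three classes.
posteriors = [
    [0.70, 0.20, 0.10],
    [0.40, 0.45, 0.15],
    [0.05, 0.15, 0.80],
]
assignments = hard_assign(posteriors)
print(assignments)
```

The second particle is only marginally more likely to belong to class 1 than class 0, but under hard assignment it contributes to class 1 alone.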
Looking at the volume series from 3D Classification, we can see that the volumes are not in total alignment; 3D Classification spent most of its capacity in finding similar volumes oriented differently relative to each other. This is not surprising, since we have not updated alignments since the initial C1 refinement of the entire encapsulin/encapsulated ferritin complex. We can place volumes back into register, as well as update particle alignments for each class, by using Align 3D Maps.
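The key point of this group re-alignment is that one rigid transformation is found per class and then applied to every particle in that class. The toy below shows the bookkeeping with in-plane angles only. In 3D the poses and offsets are rotations that are composed rather than added, and the order of composition depends on the pose convention. All names and values here are hypothetical.

```python
def realign_particles(particles, class_offsets_deg):
    """Apply one rotation offset per class to every particle in that class."""
    updated = []
    for particle in particles:
        offset = class_offsets_deg[particle["class_id"]]
        new_angle = (particle["angle_deg"] + offset) % 360.0
        updated.append({**particle, "angle_deg": new_angle})
    return updated

particles = [
    {"uid": 1, "class_id": 0, "angle_deg": 10.0},
    {"uid": 2, "class_id": 1, "angle_deg": 350.0},
    {"uid": 3, "class_id": 1, "angle_deg": 20.0},
]
# Offsets that bring each class volume into register with the reference;
# class 0 is the reference itself, so its offset is zero.
class_offsets_deg = {0: 0.0, 1: 30.0}

realigned = realign_particles(particles, class_offsets_deg)
print([p["angle_deg"] for p in realigned])
```

Because only one offset per class is estimated, there are far fewer free parameters than in per-particle alignment, which is what makes this step robust when the references are still at low resolution.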
Create an Align 3D Maps job. Activate the “Update particle alignments” parameter. Connect the All volumes output from the 3D Classification job to the “Maps to align (volumes group)” input. Connect the All particles output to the “Particles (all)” input. Connect the spherical mask to the Reference mask input. Finally, pick one of the 3D classes to serve as the reference map — this can be the best resolved class — and connect it to the Reference map input. The results are shown in the video of the orange (left) volume series below, compared to the un-aligned series from 3D classification in gray (right).
Homogeneous Reconstruction Only
Using the spherical mask from the previous 3D Classification, launch a Homogeneous Reconstruction Only job. This will produce a consensus structure after the previous Align 3D Maps job.
The next step we must do is “effect” the non-point-group symmetry operators through the custom symmetry expansion step.
B2: Custom Symmetry Expansion
The next portion of the case study describes how CryoSPARC can be used to handle non-point-group symmetry. In this case, we would like to use CryoSPARC to superimpose the four copies of EncFtn, such that they can be combined into one structure for a final refinement, as a priori it is expected that each of these four units are indistinguishable and are the same structure. Symmetry-averaging that involves point group symmetries can normally be handled via symmetry expansion or refinement with symmetry enforced. Since this symmetry does not follow a point group, particular steps are required to obtain properly symmetry-averaged structures.
To do this, we first need a mask around each of the four EncFtn copies. This can be done most quickly using ChimeraX’s Segment Map tool; our Mask Creation Tutorial covers this step in much more detail. Using the consensus map, adjust the threshold value until exactly 4 disconnected discs are present in the density, one corresponding to each EncFtn (without any EncFtn splitting into multiple disconnected blobs). The segmentation option “Group by connectivity”, found under the Segmenting Options drop-down menu, works well for this dataset:
This then produced four segments, each covering one of the EncFtn units.
Following the remainder of the mask generation tutorial, we are able to generate four mask bases that will subsequently be imported into CryoSPARC. Save each of these mask bases with a format such as encftn_maskbase1.mrc, encftn_maskbase2.mrc, etc., in a directory.
1x Import 3D Volumes
Build an Import 3D Volumes job. Set the path to a wildcard pointing to all four masks (for example, /path/to/directory/encftn_maskbase*.mrc). Change the type of imported volume to mask, and hit run. The job will import all four mask bases and present them as outputs.
4x Volume Tools
The next step is to generate dilated and softly-padded masks from these four mask bases.
To do this, we will need to run four Volume Tools jobs, one with each of the four masks connected as an input:
| Parameter | Value |
| --- | --- |
| Type of input volume | mask |
| Type of output volume | mask |
| Threshold | value selected in the mask generation section |
| Dilation radius | anywhere between 2-8, depending on how large masks are desired; we chose 2 |
| Soft padding width | at least 16; we chose 16 |
4x Volume Alignment Tools
The next task is to effect the non-point-group symmetry expansion step. In summary:
We use Volume Alignment Tools to shift each of the four masks to the center of the box.
Volume Alignment Tools simultaneously adjusts the positions of the volume and particles accordingly, to match each of the four shifted masks. Volume Alignment Tools also re-generates the unique identifiers (UIDs) of each particle, so that CryoSPARC knows to treat each image containing four copies of EncFtn as four separate observations of our protein of interest.
Align 3D Maps is then used to rotationally align (i.e. superimpose) all four volumes.
Create four Volume Alignment Tools jobs. Connect each of the masks from the previous four Volume Tools jobs as the mask inputs. Connect the volume and particles from the homogeneous reconstruction only as the volume and particle inputs to each of the Volume Alignment Tools jobs. Finally, set the following parameters and run the jobs:
| Parameter | Value |
| --- | --- |
| Re-center to mask center of mass | On |
| Reassign UIDs | On |
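The bookkeeping behind this custom expansion can be sketched as follows: for each of the four masks, every particle is copied, given a new identifier, and associated with the shift that moves that mask's centre of mass to the box centre. The record layout, the sequential UIDs, and the single-voxel toy masks are all hypothetical. In the real job the shift is also propagated to each particle's own 2D shift according to its pose, which is omitted here.

```python
import itertools

def center_of_mass(mask):
    """Centre of mass of a mask given as {(x, y, z): weight}."""
    total = sum(mask.values())
    return tuple(
        sum(coord[axis] * weight for coord, weight in mask.items()) / total
        for axis in range(3)
    )

def expand_particles(particles, masks, box_center):
    """Make one copy of every particle per mask, each with a fresh UID."""
    new_uid = itertools.count(start=1)  # hypothetical UID scheme
    expanded = []
    for copy_index, mask in enumerate(masks):
        com = center_of_mass(mask)
        # Shift that moves this mask's centre of mass to the box centre.
        shift = tuple(b - c for b, c in zip(box_center, com))
        for particle in particles:
            expanded.append({
                "uid": next(new_uid),
                "source_uid": particle["uid"],
                "copy": copy_index,
                "recenter_shift": shift,
            })
    return expanded

# Four single-voxel toy "masks" at tetrahedron-like positions in a 16^3 box.
masks = [
    {(12, 12, 12): 1.0},
    {(12, 4, 4): 1.0},
    {(4, 12, 4): 1.0},
    {(4, 4, 12): 1.0},
]
particles = [{"uid": 101}, {"uid": 102}]
expanded = expand_particles(particles, masks, box_center=(8, 8, 8))
print(len(expanded), "expanded particles from", len(particles), "images")
```

Each original image now appears four times, once per EncFtn copy, and each copy is treated downstream as an independent observation.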
Align 3D Maps
Next, we use Align 3D Maps to correct the rotational mis-alignment of the four volumes from the previous Volume Alignment Tools jobs. This step completes the custom symmetry expansion.
Create an Align 3D Maps Job.
Pick one of the four EncFtn volumes from one of the previous Volume Alignment Tools jobs; this volume will serve as the reference, and the other three will be aligned to it. This will establish the orientation of the consensus of the subsequent 3D classification. (This orientation is not the final one that will be used for the highest resolution refinement, as when we later incorporate D5 symmetry, we will have to align the consensus to the D5 symmetry axes. For now, refinement is proceeding in C1, and we are ignoring the D5 symmetry until we have a cleaner subset of particles.)
Connect this reference volume and the accompanying mask to the “Reference Map” and “Reference Mask” inputs, respectively. Leave the “Maps to align (volumes group)” input empty.
Enable the “Update particle alignments” parameter. Connect the four EncFtn volumes from each of the Volume Alignment Tools jobs as connections under the “Maps to align (individual volumes)” group. Finally, connect each of the particle sets corresponding to the four EncFtn volumes as individual connections under “Particles (map to align, connection X)”. Phew! Now run the job.
B3: Group Re-alignment on Encapsulated Ferritin
At this stage of processing, we now have all of the EncFtn superimposed. However, the particle stack is quite dirty — there are many junk particles, as evidenced by the previous 3D Classification results. The first step we’ll do is repeat 3D Classification, this time using the expanded particle dataset and a mask covering only one encapsulated ferritin.
3D Classification is preferred over local refinement at this stage for a few reasons.
3D Classification is less sensitive than local refinement to junk. When dirty particle stacks are given to local refinement, a common outcome is artefacts and overfitting. When dirty particle stacks are given to 3D Classification, it is often the case that poor particles can be separated from good particles reasonably well via their class assignments
In this case, (a) the particle stack is contaminated by many junk particles and (b) the particles’ alignments are far from coherently superimposing the particles onto one rigid structure, as we will see.
When alignments are far away from their optimal values, Local Refinement will struggle to align the particles
Heterogeneous Refinement and Local Refinement both attempt to estimate alignments on a per-particle basis, which can be problematic when there is still a lot of junk, and when alignments are far off from their optimal values.
3D Classification freezes alignments, thus is not able to use alignments as a free variable to overfit.
To overcome the fixed alignments of particles, 3D Classification can be combined with Align 3D Maps to allow for re-alignment of classes on a volume basis.
Group Re-alignment (Repeat x2)
The next step consists of 3D Classification followed by Volume Alignment Tools and Align 3D Maps; this step will be repeated twice in order to iteratively improve the quality of our classification.
3D Classification
Build a 3D Classification job. Connect the Particles for map {0,1,2,3} inputs all as connections to the input particle group. Leave the initial volumes and focus mask inputs empty. Connect the mask from the reference volume chosen in the previous step to the Solvent mask input. Use the following parameters:
| Parameter | Value |
| --- | --- |
| Number of classes | 20 |
| Filter resolution (Å) | 6 |
| O-EM batch size per class | 2000 |
| O-EM learning rate init | 1 |
| O-EM learning rate half-life (%) | 0 |
| Force hard classification | On |
Despite aligning each of the encapsulated ferritin to a common reference, the volumes have significant diversity both in their position and contents!
Repeating Align 3D Maps (described in the next sub-section) will help position these volumes back into register, as best as possible, and set us up best for a final Local Refinement.
Volume Alignment Tools (D5 symmetry alignment)
Before running Align 3D Maps, we will use Volume Alignment Tools to align the structure to the D5 symmetry axes. This is in preparation for the final local refinement we’ll do to high-resolution.
Create a Volume Alignment Tools job.
Pick the class from the previous 3D Classification that shows the strongest 5-fold symmetry. Connect this class to the volume input, and connect its corresponding particles to the particles input. Connect the solvent mask from 3D Classification to the mask input.
Activate the “Do symmetry alignment” parameter. Set the symmetry string to “D5”. Run the job.
Align 3D Maps
Create an Align 3D Maps job. Activate the “Update particle alignments” parameter. Connect the All volumes group from the 3D classification to the Maps to Align (volumes group) input group. Connect the All Particles group from 3D classification to the Particles (all) input group. Use the volume and mask from the previous Volume Alignment Tools job as the reference map and reference mask. Run the job, and observe that the volumes are much closer to alignment than previously:
After two iterations of Group Re-alignment, much more detail is beginning to form in the classes. Classes can be visualized by downloading the volume series from the Align 3D Maps job:
Inspect the classes, and note down which classes are of the intact EncFtn structure, show clear 5-fold symmetry, and are not at low resolution. Below is an example of the classification for 20 classes, with classes selected for further refinement highlighted in blue.
B4: Local Refinement
Create a Local Refinement job. Connect each particles group corresponding to each of the selected classes from the previous Align 3D Maps job. Connect the volume and mask from the latest upstream Volume Alignment Tools job to the Initial Volume and Static Mask inputs, respectively. Use the following parameters:
| Parameter | Value |
| --- | --- |
| Use pose/shift gaussian prior during alignment | On |
| Standard deviation (deg) of prior over rotation | 6 |
| Standard deviation (A) of prior over shifts | 3 |
| Re-center rotations each iteration? | On |
| Re-center shifts each iteration? | On |
| Symmetry | D5 |
| Number of extra final passes | 0 |
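To give a feel for how tight these priors are, the sketch below evaluates an unnormalized Gaussian log-prior for a few pose deviations, using the standard deviations from the table. How this term is combined with the image likelihood inside Local Refinement is not covered in this case study, so this is only an illustration of the prior's shape; the function name is hypothetical.

```python
def log_gaussian_prior(delta, sigma):
    """Unnormalized log of a zero-mean Gaussian prior evaluated at `delta`."""
    return -0.5 * (delta / sigma) ** 2

rotation_sigma_deg = 6.0  # standard deviation of prior over rotation
shift_sigma_A = 3.0       # standard deviation of prior over shifts

for deviation_deg in (0.0, 6.0, 12.0, 18.0):
    penalty = log_gaussian_prior(deviation_deg, rotation_sigma_deg)
    print(f"rotation off by {deviation_deg:4.1f} deg: log-prior {penalty:5.2f}")

for deviation_A in (0.0, 3.0, 6.0):
    penalty = log_gaussian_prior(deviation_A, shift_sigma_A)
    print(f"shift off by {deviation_A:3.1f} Å: log-prior {penalty:5.2f}")
```

The penalty grows quadratically, so a pose three standard deviations (18 degrees) from its starting value is penalized nine times more than one a single standard deviation away. This only works in our favour if the starting poses are already close to correct, which is what the preceding group re-alignment steps were for.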
By selecting a good subset of 3D classes, we retained ~445k of ~1,009k particles for this Local Refinement. This subset refined to a resolution of ~2.8 Å, whereas the previous Local Refinement of the encapsulated ferritin only reached the 6-9 Å range.
To help understand these results, it’s helpful to examine what changed from the initial Homogeneous Refinement job on all four EncFtn molecules. Why did it work better now?
The particles now had accurate-enough starting orientations. Earlier in the workflow, orientations were very poor, and likely too far away from their optimal values.
Resolving the internal orientation diversity via group re-alignment was crucial to accomplish this. Group re-alignment allowed us to put particles in register, before the reference was high-resolution enough to allow for per-particle-pose estimation
Local Refinement also worked better because there was minimal heterogeneity — the broken/misaligned ferritin classes were removed in this final 3D Classification.
Finally, Local Refinement had access to a greater number of particles, due to the custom symmetry expansion and D5 symmetry enforcement.
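The effect of expansion and symmetry on the amount of data can be made explicit with the approximate particle counts quoted above. The numbers below are the rounded figures from this case study, so the results are estimates only.

```python
copies_per_image = 4          # EncFtn decamers per encapsulin
d5_order = 10                 # 5-fold x 2-fold operators in D5

expanded_particles = 1_009_000   # after custom symmetry expansion (approx.)
retained_particles = 445_000     # after selecting good 3D classes (approx.)

source_images = expanded_particles / copies_per_image
retained_fraction = retained_particles / expanded_particles
asymmetric_units = retained_particles * d5_order
max_units_per_image = copies_per_image * d5_order

print(f"source images:             ~{source_images:,.0f}")
print(f"fraction retained:         {retained_fraction:.1%}")
print(f"asymmetric units averaged: ~{asymmetric_units:,}")
print(f"maximum units per image:   {max_units_per_image}")
```

Even after discarding more than half of the expanded particles, D5 averaging means several million asymmetric units contribute to the final map, compared with ~250k whole-cargo images in the initial attempt.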
Encapsulated Ferritin Density
Comparing our density map from this case study to the previously published density map, we can see that accounting for the challenging symmetry of this sample paid off!
To see if this density map was plausible, we re-refined the protein sequence from the atomic model PDB 5N5F. The 5N5F atomic model was obtained by Didi He, et al. (2019) via Phenix refinement into a 2.1 Å map from x-ray diffraction.
Though this structure is encapsulated ferritin from the same species of bacteria as the cryo-EM map, it wasn’t known to us whether we could expect the conformation of 5N5F to be identical to that of our density map. The conformations could differ, possibly because the encapsulated ferritin in the cryo-EM sample was imaged inside of encapsulin (instead of being crystallized).
The above figure shows the density map obtained from this case study, sharpened to a B-factor of -60. This is overlaid on the 5N5F atomic model from the PDB (left column), and the re-refined atomic model (right column). The additional density present near the N-terminus of the amino acid sequence enabled modelling an extra three residues (GLU5, SER4, and SER3) that were not present in the atomic model from XRD.
And that’s a wrap! This case study highlighted how to use CryoSPARC to handle the unique geometry and symmetry of the encapsulated ferritin dataset. Further standard processing workflows that weren’t explored in this case study could further improve results, for example:
Repeated 3D Classification to remove more junk
Reference Based Motion Correction
Local CTF (defocus) refinement
Process all movies in the dataset
Within each of the encapsulin nanocompartments, there are 4 × 5 × 2 = 40 asymmetric units available for symmetry-averaging. The remainder of this tutorial focuses on how we can use CryoSPARC to align these asymmetric units, remove broken particles/further curate the particles, and refine to high resolution.