Tutorial: Mask Creation
Mask Selection and Generation in UCSF ChimeraX
In many circumstances it is important to specify a region of a 3D volume. Whether this is to indicate a sub-volume to refine in Local Refinement, a volume to subtract for Particle Subtraction, or simply the region in which to calculate the GSFSC, we use a mask to select the volume.
A mask is another 3D volume with the same box and pixel size as the volume to which it will be applied. The mask has a value of 0.0 outside the region to select and a value of 1.0 inside. When the volume is multiplied by the mask, the result is a box that is empty except in the region of interest, which retains the same information as the original volume.
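As a minimal sketch of the multiplication described above, the following toy example uses a 1D list in place of a 3D volume. The values and the helper name `apply_mask` are illustrative assumptions, not part of CryoSPARC or ChimeraX.

```python
# Toy 1D "volume" and mask with the same number of samples (same box size).
volume = [0.2, 0.1, 0.9, 1.4, 1.1, 0.3, 0.2, 0.1]
mask = [0.0, 0.0, 1.0, 1.0, 1.0, 0.0, 0.0, 0.0]


def apply_mask(volume, mask):
    """Multiply a volume by a mask, sample by sample (hypothetical helper)."""
    if len(volume) != len(mask):
        raise ValueError("mask and volume must have the same box size")
    return [v * m for v, m in zip(volume, mask)]


masked = apply_mask(volume, mask)
print(masked)  # zero outside the mask, original values inside
```

The size check mirrors the requirement that the mask and the volume share the same box.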
Masks for cryoEM must be “softened” by adding a smooth transition between the 1.0 values inside the mask and the 0.0 values outside the mask. This softening prevents ringing artifacts, which severely degrade alignment. A softer mask introduces less severe artifacts but includes more information from outside the region of interest. The Volume Tools job can create a binarized mask (1.0 inside and 0.0 outside) with a soft edge in a single step from an input map or mask.
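To illustrate why a soft edge matters, the sketch below compares a hard-edged and a soft-edged 1D mask and measures how much of each mask's power sits at high spatial frequencies. The box size, the 8-sample raised-cosine falloff, and the "half of Nyquist" cutoff are arbitrary assumptions chosen for the demonstration. They are not CryoSPARC defaults.

```python
import cmath
import math

N = 64  # number of samples in the toy 1D box
WIDTH = 8  # soft-edge width in samples (arbitrary choice for this sketch)


def dft_power(signal):
    """Return |X_k|^2 for k = 0..n-1 using a direct (slow) DFT."""
    n = len(signal)
    power = []
    for k in range(n):
        x_k = sum(s * cmath.exp(-2j * math.pi * k * i / n)
                  for i, s in enumerate(signal))
        power.append(abs(x_k) ** 2)
    return power


def high_freq_fraction(signal):
    """Fraction of total power at or beyond half of the Nyquist frequency."""
    power = dft_power(signal)
    n = len(signal)
    # Indices n/4 .. 3n/4 hold the highest-frequency half of the spectrum.
    return sum(power[n // 4: 3 * n // 4 + 1]) / sum(power)


def soft_value(i):
    """1.0 in the core (samples 24..39), raised-cosine falloff outside it."""
    if 24 <= i < 40:
        return 1.0
    d = 24 - i if i < 24 else i - 39  # distance in samples from the core
    if d <= WIDTH:
        return 0.5 * (1.0 + math.cos(math.pi * d / (WIDTH + 1)))
    return 0.0


hard = [1.0 if 24 <= i < 40 else 0.0 for i in range(N)]
soft = [soft_value(i) for i in range(N)]

print("hard edge:", high_freq_fraction(hard))
print("soft edge:", high_freq_fraction(soft))
```

The soft-edged mask should report a much smaller high-frequency fraction, which is the 1D analogue of reduced ringing.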
The ideal softness for a given dataset and subvolume typically must be determined empirically, but we recommend a minimum soft padding width of where resolution is the GSFSC resolution in Å and apix is the pixel size in Å. In some cases, it can be beneficial to add a wider soft edge or even an expansion of the 1.0 values (i.e., making the mask “wider”, called dilation). It is often best to try a few different combinations of dilation and padding and determine which produces the best results.
One very important thing to consider is that masks which are too tight or too high-resolution introduce shared information into both half-maps, which breaks independence between the two half sets and artificially inflates the GSFSC curve. The magnitude of this effect is directly related to how much “ringing” is present in the mask’s frequency domain; thus sharper masks produce more severe artificial correlations. In jobs which produce GSFSC curves, be wary of results in which the Tight FSC curve does not closely follow the Corrected curve. A mismatch between these curves typically indicates an over-tight mask.
One common application of masks is Local Refinement. Local Refinement is a powerful technique for improving the quality of smaller sub-volumes of a map. However, when the masked sub-volume is too small, not enough information remains for reliable image alignment. This can result in overfitting, characterized by noise, shells, or “blips”, especially surrounding the edge of the mask. If a small subdomain must be masked out for Local Refinement, we recommend applying a Gaussian prior to reduce overfitting. More information on Gaussian Priors is available on the job page.
The first step in all mask creation workflows is the creation of a mask base. We focus here on the three most common workflows:
erasing regions of an existing map
using volume segmentation on an existing map
creating a mask using a molecular model
For all three techniques we will use ChimeraX, a molecular visualization tool developed by the Resource for Biocomputing, Visualization, and Informatics at UCSF. ChimeraX is free for academic, government, nonprofit, and personal use and can be licensed for commercial use.
The visuals for mask creation use data from EMPIAR-10073. This dataset was originally collected and processed by Nguyen et al.
Throughout this page we make a distinction between mask bases and masks. Mask bases are created by a user to specify the region the mask ought to cover. They may have a hard edge and may or may not be binarized. Masks, on the other hand, are ready for use in CryoSPARC. They have been binarized, dilated, and padded. Typically a mask base is plugged into a Volume Tools job to create a mask.
This technique uses watershed segmentation to split a volume into regions. Masks are specified by deleting regions outside the desired volume. This technique remains relatively simple while also allowing for the construction of complex masks and is the method we recommend for most purposes.
Blurring a volume before creating your mask achieves two aims. Practically, it is significantly easier to select a region of interest when high-frequency noise has been attenuated with a blurring operation. Theoretically, it is important that the mask does not introduce high-frequency correlations between the two half maps. Only building blurred masks helps reduce the chance of this happening.
Open the volume in ChimeraX. This example uses the results of a non-uniform refinement. For all commands in this example, the base volume is volume #1.
Blur the volume using a Gaussian filter: volume gaussian #1 sDev 2. Increasing the value for sDev will make the map more blurry. This command creates a new, blurred volume. The blurred base volume is volume #2.
In this step, the Segger tool in ChimeraX splits the blurred volume into several regions.
Contour the blurred map until no noise is visible and the region which will ultimately form the mask has the desired topology.
Open Segger by clicking Tools > Volume Data > Segment Map.
Click "Segment" in the Segment Map pane to produce the segmentation. In this case, the segmentation is model #3. This segments the map into several regions, each of which is a distinct color.
In this step, we build our mask around the region of interest by hiding regions we wish to exclude from the mask. Because there is no undo operation when working with segmented maps, we recommend that users only hide the regions rather than outright deleting them. This also makes it easier to generate a complementary volume for Particle Subtraction, which we will do later on.
Open the "Shortcuts Options" dropdown in the Segment Map pane.
Control-click a region you wish to exclude from the mask to select it. The region should be surrounded by a light-green outline.
Click the Hide button in Shortcuts Options to hide the region. The hidden region remains selected until a new region is clicked, so clicking “Show” will return the region to a visible state.
Proceed to hide all regions which are to be excluded from the mask.
Control-click-and-drag selects multiple regions in a box
Control-shift-click adds or removes a region from the current selection
If a region contains parts of the map you wish to keep and parts you wish to exclude, select only that region and click “Ungroup” in Shortcuts Options. This will break the region into smaller subregions. Repeat the process until you can isolate the regions you wish to exclude.
When this process is completed, the segmentation model should have all regions outside your desired mask hidden and all regions in the mask shown.
CryoSPARC cannot accept Segger segmentations, so they must first be converted to .mrc format.
Control-click and drag over the entire segmentation to select all visible regions.
In the Segment Map pane, click File > Save selected regions to .mrc file. The filename you choose at this stage is not important, as the resulting .mrc file has some issues we will fix in the next step.
This step generates a new volume, the mask base. In this example, the mask base is volume #4.
If you are performing Local Refinement, it can be helpful at this stage to create the complementary mask for Particle Subtraction. To do this, we delete all the regions we used to create the mask base, then save another .mrc file with the remaining regions.
With the regions used to create the mask base selected, click “Delete” in the Segment Map pane.
In the Segment Map pane, click “All” next to “Show regions:”. This reveals the regions you hid during Step 3.
Select all regions and save an .mrc file as in Step 4. In this example, the particle subtraction mask base is volume #5.
When Segger saves the regions to an .mrc file, it crops the box size to perfectly fit the mask base. This results in a box size that is different from the map’s box size, meaning CryoSPARC will not know where to position the mask. To resolve this problem, we must first resample the mask base onto the original map’s box.
When using commands in ChimeraX, ensure that you are using the correct numbers for your maps, as they may differ from those printed here if you did not take the optional step 5, or your ChimeraX session already had maps or models loaded into it prior to starting this tutorial.
Use the command volume resample #4 onGrid #1 to resample the mask base onto the original map’s box. Note that the resulting maps are positioned in the same region of space and contain the same information, but the box sizes are different. In this case, the volume saved by Segger has a box size of 96 x 94 x 129 voxels, while the map and resampled volumes both have box sizes of 380 x 380 x 380.
Save the resampled volume (in this example, volume #5 if a particle subtraction volume was not created and volume #6 if a particle subtraction volume was created). This is the mask base, so we recommend using an informative name.
(If creating a particle subtraction mask) repeat the above steps to resample the Particle Subtraction mask base onto the map volume. In this case, the necessary command would be volume resample #5 onGrid #1. This is the completed mask base for Particle Subtraction.
Both of the mask bases are now resampled and ready for import to CryoSPARC, via the Import 3D Volumes job. Once imported, the mask bases can be converted to masks via thresholding, dilation, and padding in the Volume Tools job.
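The resampling step in this method can be pictured with a 1D toy example: the cropped data is placed back into a zero-filled box of the original size at the correct offset. This is only a sketch under simplifying assumptions (identical pixel size, an integer offset, no interpolation). The helper `embed_in_box` and its `origin_index` parameter are hypothetical, and ChimeraX's volume resample command handles the general 3D case itself.

```python
def embed_in_box(cropped, origin_index, box_size):
    """Place a cropped 1D mask base into a zero-filled box of the original size.

    origin_index is the position in the full box where the cropped data starts
    (a hypothetical parameter standing in for the map origin).
    """
    if origin_index < 0 or origin_index + len(cropped) > box_size:
        raise ValueError("cropped region does not fit inside the box")
    full = [0.0] * box_size
    full[origin_index:origin_index + len(cropped)] = cropped
    return full


cropped = [0.4, 1.0, 0.7]  # tightly cropped mask base
full = embed_in_box(cropped, origin_index=5, box_size=12)
print(len(cropped), "->", len(full))
```

The values are unchanged and sit at the same position in space. Only the surrounding box differs, which is why the cropped and resampled volumes overlay exactly in ChimeraX.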
For very simple masks, this technique is much faster than Method One. However, it can be susceptible to creating undesired noise and care must be taken when analyzing the resulting masks.
Blurring a volume before creating your mask achieves two aims. Practically, it is significantly easier to select a region of interest when high-frequency noise has been attenuated with a blurring operation. Theoretically, it is important that the mask does not introduce high-frequency correlations between the two half maps. Only building blurred masks helps reduce the chance of this happening.
Open the volume in ChimeraX. This example uses the results of a non-uniform refinement. For all commands in this example, the base volume is volume #1.
Blur the volume using a Gaussian filter: volume gaussian #1 sDev 2. Increasing the value for sDev will make the map more blurry. This command creates a new, blurred volume. The blurred base volume is volume #2.
Since the Volume Eraser tool directly modifies the volume it operates on, you must create another copy of the blurred volume if you plan on creating a mask for Particle Subtraction.
Copy the map with volume copy #2. In this example, the copy is volume #3.
In this step, we will erase the regions outside the mask using the Volume Eraser tool. This tool creates a sphere and allows you to erase everything either inside or outside the sphere. There is no undo function for this tool, so be careful when erasing volumes. If the process is long and complicated, you can create copies of the volume as you go using volume copy as above.
Open the volume eraser tool: Right Mouse ribbon menu > Erase. You should see a sphere appear. Holding down right-click and dragging your mouse moves the sphere. Erasing inside or outside the sphere is accomplished with the buttons in the Map Eraser pane.
Using the sphere, erase all regions outside your desired mask. Perform a close inspection of your final volume, being careful to notice small regions left behind by imprecise eraser placement.
Save the erased map as your mask base.
To create the mask base for Particle Subtraction, subtract the erased and blurred map (in this case, #2) from the unerased and blurred copy (in this case, #3): volume subtract #3 #2. If the result has negative values in most voxels and an unexpected, noisy shape, the arguments were likely given in the wrong order.
Save the resulting mask bases to .mrc files and upload the files to CryoSPARC.
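The effect of argument order in the subtraction step of this method can be seen in a 1D toy example. The values below are invented for illustration. The point is only that subtracting the erased map from the unerased one leaves the complementary region with no negative values, while the reverse order produces negative values.

```python
# Toy 1D stand-ins for the blurred maps in this example.
unerased = [0.0, 0.5, 0.8, 0.9, 0.7, 0.4, 0.0]  # like volume #3
erased = [0.0, 0.0, 0.0, 0.9, 0.7, 0.0, 0.0]    # like volume #2 after erasing


def subtract(a, b):
    """Return a - b, sample by sample."""
    return [x - y for x, y in zip(a, b)]


def count_negative(values):
    """Count samples below zero."""
    return sum(1 for v in values if v < 0.0)


complement = subtract(unerased, erased)   # analogous to: volume subtract #3 #2
wrong_order = subtract(erased, unerased)  # arguments swapped

print("correct order negatives:", count_negative(complement))
print("swapped order negatives:", count_negative(wrong_order))
```

A quick count of negative values like this is one way to check the result before saving it.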
ChimeraX can create volumes based on molecular models using a command called molmap. Generating mask bases using this technique is by far the simplest. A single command creates a mask base around (in this example) chain U: molmap #2/U 16 onGrid #1.
If onGrid is left out of this command, a seemingly-correct mask base will be generated, but it will be on the wrong grid and so unusable!
In this example, #2 is our molecular model, we generated a mask base with a resolution of 16 angstroms, and #1 is our map from CryoSPARC. Note that even though none of the information in the mask base comes from the map, you still must have it loaded so that the mask base is on the correct grid.
In this command, resolution only sets the level of detail in the resulting simulated map. It will not affect the quality of refinements using the mask.
Note also that any resolution can be selected. ChimeraX is not simulating any electron microscopy process — it is simply generating a volume using the provided model and the specified resolution. We recommend that masks are never generated with a resolution better than (i.e., never a value lower than) 12 Å.
Masks created using molmap are already on the correct grid and can immediately be saved and uploaded to CryoSPARC.
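For repeated use, the molmap workflow can be written out as a short ChimeraX command script. The sketch below only builds the text of such a script. The file names are placeholders, and it assumes a fresh ChimeraX session in which the map opens as #1, the model as #2, and molmap writes its result to #3. Check the model numbers in your own session before saving.

```python
def molmap_mask_script(map_path, model_path, chain, resolution, out_path):
    """Build ChimeraX commands that create and save a molmap mask base.

    Assumes a fresh session: map is #1, model is #2, molmap output is #3.
    """
    if resolution < 12:
        # Follows the recommendation above: never a value lower than 12 Å.
        raise ValueError("use a resolution value of 12 or larger for mask bases")
    return "\n".join([
        f"open {map_path}",
        f"open {model_path}",
        f"molmap #2/{chain} {resolution} onGrid #1",
        f"save {out_path} models #3",
    ])


# Placeholder file names; replace them with your own paths.
script = molmap_mask_script("map.mrc", "model.pdb", "U", 16, "mask_base_chainU.mrc")
print(script)
```

The printed text can be pasted into the ChimeraX command line one line at a time, or saved as a .cxc file and opened in ChimeraX.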
A mask base is converted to a mask by the following steps:
The mask base is “binarized”. All values greater than a user-selected threshold are set to 1.0 and all values below this threshold are set to 0.0.
The resulting binary volume is dilated. Additional pixels within a user-selected distance from the volume surface are also set to 1.0.
The binary volume has a soft edge added (padding). This edge gradually decreases from 1.0 to 0.0 and has a user-specified width.
All of these steps can be performed simultaneously via a Volume Tools job, and more information about the necessary parameters is available on that job page.
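The three steps can be sketched on a 1D profile as follows. This is an illustration of the idea only, not the Volume Tools implementation. In particular, the raised-cosine shape of the soft edge is an assumption, and all function names are hypothetical.

```python
import math


def binarize(values, threshold):
    """Set values above the threshold to 1.0 and all others to 0.0."""
    return [1.0 if v > threshold else 0.0 for v in values]


def distance_to_mask(binary):
    """Distance (in samples) from each sample to the nearest 1.0 sample."""
    ones = [i for i, v in enumerate(binary) if v == 1.0]
    if not ones:
        raise ValueError("mask is empty; try a lower threshold")
    return [min(abs(i - j) for j in ones) for i in range(len(binary))]


def dilate(binary, radius):
    """Also set samples within `radius` of the mask to 1.0."""
    return [1.0 if d <= radius else 0.0 for d in distance_to_mask(binary)]


def pad(binary, width):
    """Add a soft edge falling from 1.0 to 0.0 over `width` samples.

    The raised-cosine falloff is an assumed shape for this sketch.
    """
    out = []
    for d in distance_to_mask(binary):
        if d == 0:
            out.append(1.0)
        elif d < width:
            out.append(0.5 * (1.0 + math.cos(math.pi * d / width)))
        else:
            out.append(0.0)
    return out


base = [0.0, 0.1, 0.0, 0.0, 0.0, 0.0, 0.3, 0.9,
        0.8, 0.2, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
binary = binarize(base, threshold=0.5)  # samples 7 and 8
dilated = dilate(binary, radius=1)      # samples 6 through 9
mask = pad(dilated, width=4)            # soft edge on both sides
print([round(v, 3) for v in mask])
```

Note that the weak value of 0.1 near the start of the profile is removed by the threshold, while dilation and padding widen only the region that survives binarization.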
The binarization threshold will be different each time depending on the input mask base, and can be determined with a volume visualization tool like ChimeraX. A threshold should be selected such that there are no floating “specks” of density and the desired topology of the mask is preserved.
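One way to think about threshold selection is to count how many disconnected pieces survive at each candidate threshold. The 1D sketch below counts contiguous runs above a threshold. The profile values are invented, and `count_islands` is a hypothetical helper. For a real 3D map, a connected-component labelling routine such as `scipy.ndimage.label` could play the same role, although visual inspection in ChimeraX remains the approach described here.

```python
def count_islands(values, threshold):
    """Count contiguous runs of samples above the threshold."""
    islands = 0
    inside = False
    for v in values:
        above = v > threshold
        if above and not inside:
            islands += 1  # a new run starts here
        inside = above
    return islands


# Invented profile: one main peak plus two weak "specks".
profile = [0.0, 0.15, 0.0, 0.0, 0.3, 0.9, 0.8, 0.35, 0.0, 0.12, 0.0]

for threshold in (0.1, 0.2, 0.5):
    print(threshold, count_islands(profile, threshold))
```

A threshold at which only the intended region remains, without shrinking it more than necessary, is a reasonable starting point.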
The amount of dilation required also depends on both the dataset and the sub-volume. Generally adding a few pixels of dilation helps to prevent over-tight masks, and allows for the inclusion of newly-resolved density in the masked volume.
Padding is an essential component of masking in cryoEM to prevent ringing artifacts. We recommend a minimum padding width of where resolution is the GSFSC resolution in Å and apix is the pixel size in Å, but the optimal result can require significantly larger padding widths. We therefore recommend that users try a variety of mask dilation and padding combinations to find the ideal combination.
Eric F. Pettersen et al., “UCSF ChimeraX: Structure Visualization for Researchers, Educators, and Developers,” Protein Science 30, no. 1 (January 2021): 70–82, https://doi.org/10.1002/pro.3943.
Thi Hoang Duong Nguyen et al., “Cryo-EM Structure of the Yeast U4/U6.U5 Tri-snRNP at 3.7 Å Resolution,” Nature 530, no. 7590 (February 1, 2016): 298–302, https://doi.org/10.1038/nature16940.
Grigore Pintilie and Wah Chiu, “Comparison of Segger and Other Methods for Segmentation and Rigid-Body Docking of Molecular Components in Cryo-EM Density Maps,” Biopolymers 97, no. 9 (September 2012): 742–60, https://doi.org/10.1002/bip.22074.