Tutorial: Mask Creation

Mask Selection and Generation in UCSF ChimeraX

What Is a Mask?

In many circumstances it is important to specify a region of a 3D volume. Whether this is to indicate a sub-volume to refine in Local Refinement, a volume to subtract for Particle Subtraction, or simply the region in which to calculate the GSFSC, we use a mask to select the volume.

A mask is another 3D volume with the same box and pixel size as the volume to which it will be applied. The mask has a value of 0.0 outside the region to select and a value of 1.0 inside. When the volume is multiplied by the mask, the result will be a box that is empty except for in the region of interest, which will have the same information as the original volume.
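The multiplication described above can be sketched with NumPy. The toy 8 x 8 x 8 volume, the cubic mask region, and the variable names are all hypothetical and only serve to illustrate the idea.

```python
import numpy as np

# Hypothetical 8 x 8 x 8 volume filled with random density values.
rng = np.random.default_rng(0)
volume = rng.normal(size=(8, 8, 8))

# Binary mask with the same box size: 1.0 inside a central cube, 0.0 outside.
mask = np.zeros_like(volume)
mask[2:6, 2:6, 2:6] = 1.0

# Applying the mask is an element-wise multiplication.
masked = volume * mask

# Inside the region the values are unchanged; outside, the box is empty.
inside = mask == 1.0
print(np.array_equal(masked[inside], volume[inside]))  # True
print(np.count_nonzero(masked[~inside]))               # 0
```

Because the operation is element-wise, the mask and the volume must share the same box size, which is why the box size of a mask base matters later in this tutorial.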

Masks for cryoEM must be “softened” by adding a smooth transition between the 1.0 values inside the mask and the 0.0 values outside the mask. This softening prevents ringing artifacts, which severely degrade alignment. A softer mask introduces less severe artifacts, but includes more information from outside the region of interest. The Volume Tools job can be used to create a binarized (1.0 inside and 0.0 outside) mask with a soft edge, all in one step, from an input map or mask.

The ideal softness for a given dataset and subvolume typically must be determined empirically, but we recommend a minimum soft padding width of $5 \times \frac{\mathrm{resolution}}{\mathrm{apix}}$ pixels, where resolution is the GSFSC resolution in Å and apix is the pixel size in Å. In some cases, it can be beneficial to add a wider soft edge or even an expansion of the 1.0 values (i.e., making the mask “wider”, called dilation). It is often best to try a few different combinations of dilation and padding and determine which produces the best results.
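As a quick worked example of the recommended minimum, the small helper below evaluates 5 × resolution / apix. The function name and the input values are hypothetical, and the result is interpreted as a width in pixels.

```python
def min_soft_padding_px(resolution_A: float, apix_A: float) -> float:
    """Minimum recommended soft padding width in pixels: 5 * resolution / apix."""
    if resolution_A <= 0 or apix_A <= 0:
        raise ValueError("resolution and pixel size must be positive")
    return 5.0 * resolution_A / apix_A


# Hypothetical inputs: GSFSC resolution in Å, pixel size in Å.
print(min_soft_padding_px(4.0, 1.0))  # 20.0
print(min_soft_padding_px(3.0, 1.5))  # 10.0
```

This value is only a lower bound; wider padding may still give better results for a particular dataset.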

Why do masks need a soft edge?

When we view particles, we view them in the spatial domain. However, particles are aligned in the frequency domain. To convert an image to its frequency domain representation we must take the 2D Fourier Transform. This is where the difficulty with hard-edged masks arises.

A sharp transition in the spatial domain (the “normal” way of looking at images) becomes an infinitely fluctuating wave in the frequency domain. This wave artifact is known as “ringing”. When we align images with ringing artifacts, we run the risk of lining up the artifact from our sharp mask instead of the information from the image itself.

The softer the edge is in the spatial domain, the less ringing we observe. However, the softer the mask’s edge, the more of the volume outside our region of interest we include.
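The trade-off can be illustrated numerically with a one-dimensional mask profile. The sketch below compares a hard-edged profile with a soft-edged one and measures how much of each profile's spectral power lies at high frequencies. The sizes, the raised-cosine edge shape, and the 0.1 cycles-per-pixel cutoff are assumptions chosen for illustration only.

```python
import numpy as np

n = 256
x = np.arange(n)
center, half_width, soft_px = n // 2, 40, 20  # hypothetical sizes in pixels

# Distance beyond the hard edge of the mask; zero or negative inside the mask.
dist = np.abs(x - center) - half_width

# Hard mask: 1.0 inside, 0.0 outside, with an abrupt transition.
hard = (dist <= 0).astype(float)

# Soft mask: raised-cosine falloff from 1.0 to 0.0 over `soft_px` pixels.
t = np.clip(dist / soft_px, 0.0, 1.0)
soft = 0.5 * (1.0 + np.cos(np.pi * t))


def high_freq_fraction(profile, cutoff=0.1):
    """Fraction of spectral power above `cutoff` (in cycles per pixel)."""
    power = np.abs(np.fft.rfft(profile)) ** 2
    freqs = np.fft.rfftfreq(profile.size)
    return power[freqs > cutoff].sum() / power.sum()


print("hard edge:", high_freq_fraction(hard))
print("soft edge:", high_freq_fraction(soft))
```

The hard-edged profile should report a noticeably larger high-frequency fraction than the soft-edged one, at the cost of the soft profile extending 20 pixels further into the surrounding volume.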

Common Pitfalls

Mask too tight

One very important thing to consider is that masks which are too tight or too high-resolution introduce shared information into both half-maps, which breaks independence between the two half sets and artificially inflates the GSFSC curve. The magnitude of this effect is directly related to how much “ringing” is present in the mask’s frequency domain; thus sharper masks produce more severe artificial correlations. In jobs which produce GSFSC curves, be wary of results in which the Tight FSC curve does not closely follow the Corrected curve. A mismatch between these curves typically indicates an over-tight mask.

Mask too small

One common application of masks is Local Refinement. Local Refinement is a powerful technique for improving the quality of smaller sub-volumes of a map. However, when the masked sub-volume is too small, not enough information remains for reliable image alignment. This can result in overfitting, characterized by noise, shells, or “blips”, especially surrounding the edge of the mask. If a small subdomain must be masked out for Local Refinement, we recommend applying a Gaussian prior to reduce overfitting. More information on Gaussian Priors is available on the job page.

Mask Base Creation

The first step in all mask creation workflows is the creation of a mask base. We focus here on the three most common workflows:

  • erasing regions of an existing map

  • using volume segmentation on an existing map

  • creating a mask using a molecular model

For all three techniques we will use ChimeraX, a molecular visualization tool developed by the Resource for Biocomputing, Visualization, and Informatics at UCSF. ChimeraX is free for academic, government, nonprofit, and personal use and can be licensed for commercial use.

The visuals for mask creation use data from EMPIAR-10073. This dataset was originally collected and processed by Nguyen et al.

Mask Bases

Throughout this page we make a distinction between mask bases and masks. Mask bases are created by a user to specify the region the mask ought to cover. They may have a hard edge and may or may not be binarized. Masks, on the other hand, are ready for use in CryoSPARC. They have been binarized, dilated, and padded. Typically a mask base is plugged into a Volume Tools job to create a mask.

Method One: volume segmentation

This technique uses watershed segmentation to split a volume into regions. Masks are specified by deleting regions outside the desired volume. This technique remains relatively simple while also allowing for the construction of complex masks and is the method we recommend for most purposes.

Step 1: Open and blur the input volume

Blurring a volume before creating your mask achieves two aims. Practically, it is significantly easier to select a region of interest when high-frequency noise has been attenuated with a blurring operation. Theoretically, it is important that the mask does not introduce high-frequency correlations between the two half maps. Building masks only from blurred volumes helps reduce the chance of this happening.

  1. Open the volume in ChimeraX. This example uses the results of a non-uniform refinement. For all commands in this example, the base volume is volume #1.

  2. Blur the volume using a Gaussian filter: volume gaussian #1 sDev 2. Increasing the value for sDev will make the map more blurry. This command creates a new, blurred volume. The blurred base volume is volume #2.
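To see what the blur does to high-frequency noise, the sketch below applies a Gaussian filter to a noisy toy volume with SciPy. It is not what ChimeraX runs internally. We assume here that sDev is a physical distance in Å, so it is divided by a hypothetical pixel size to obtain a width in voxels; the toy volume and noise level are also made up.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

rng = np.random.default_rng(0)

# Hypothetical 32-voxel box containing a cube of density plus noise.
clean = np.zeros((32, 32, 32))
clean[10:22, 10:22, 10:22] = 1.0
noisy = clean + rng.normal(scale=0.5, size=clean.shape)

apix = 1.0   # hypothetical pixel size in Å
s_dev = 2.0  # Gaussian standard deviation in Å (assumed meaning of sDev)
blurred = gaussian_filter(noisy, sigma=s_dev / apix)

# Compare the noise level in an empty corner of the box, away from the cube.
corner = (slice(0, 6),) * 3
print(blurred[corner].std() < noisy[corner].std())  # True
```

A larger standard deviation suppresses more noise but also smears the boundary of the region of interest, so the contour level in the next step may need adjusting.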

Step 2: Segment the volume

In this step, the Segger tool in ChimeraX splits the blurred volume into several regions.

  1. Contour the blurred map until no noise is visible and the region which will ultimately form the mask has the desired topology.

  2. Open Segger by clicking Tools > Volume Data > Segment Map.

  3. Click "Segment" in the Segment Map pane to produce the segmentation. In this case, the segmentation is model #3. This segments the map into several regions, each of which is a distinct color.

Step 3: Hide unwanted regions

In this step, we build our mask around the region of interest by hiding regions we wish to exclude from the mask. Because there is no undo operation when working with segmented maps, we recommend that users only hide the regions rather than outright deleting them. This also makes it easier to generate a complementary volume for Particle Subtraction, which we will do later on.

  1. Open the "Shortcuts Options" dropdown in the Segment Map pane.

  2. Control-click a region you wish to exclude from the mask to select it. The region should be surrounded by a light-green outline.

  3. Click the Hide button in Shortcuts Options to hide the region. The hidden region remains selected until a new region is clicked, so clicking “Show” will return the region to a visible state.

  4. Proceed to hide all regions which are to be excluded from the mask.

    • Control-click-and-drag selects multiple regions in a box

    • Control-shift-click adds or removes a region from the current selection

    • If a region contains parts of the map you wish to keep and parts you wish to exclude, select only that region and click “Ungroup” in Shortcuts Options. This will break the region into smaller subregions. Repeat the process until you can isolate the regions you wish to exclude.

When this process is completed, the segmentation model should have all regions outside your desired mask hidden and all regions in the mask shown.

Step 4: Create the mask base

CryoSPARC cannot accept Segger segmentations, so they must first be converted to .mrc format.

  1. Control-click and drag over the entire segmentation to select all visible regions.

  2. In the Segment Map pane, click File > Save selected regions to .mrc file. The filename you choose at this stage is not important, as the resulting .mrc file has some issues we will fix in the next step.

This step generates a new volume, the mask base. In this example, the mask base is volume #4.

Step 5 (optional): Generate the complementary mask base for Particle Subtraction

If you are performing Local Refinement, it can be helpful at this stage to create the complementary mask for Particle Subtraction. To do this, we delete all the regions we used to create the mask base, then save another .mrc file with the remaining regions.

  1. With the regions used to create the mask base selected, click “Delete” in the Segment Map pane.

  2. In the Segment Map pane, click “All” next to “Show regions:”. This reveals the regions you hid during Step 3.

  3. Select all regions and save an .mrc file as in Step 4. In this example, the particle subtraction mask base is volume #5.

Step 6: Fix box size

When Segger saves the regions to an .mrc file, it crops the box size to perfectly fit the mask base. This results in a box size that is different from the map’s box size, meaning CryoSPARC will not know where to position the mask. To resolve this problem, we must first resample the mask base onto the original map’s box.

When using commands in ChimeraX, ensure that you are using the correct numbers for your maps. They may differ from those printed here if you skipped the optional Step 5, or if your ChimeraX session already had maps or models loaded before starting this tutorial.

  1. Use the command volume resample #4 onGrid #1 to resample the mask base onto the original map’s box. Note that the resulting maps are positioned in the same region of space and contain the same information, but the box sizes are different. In this case, the volume saved by Segger has a box size of 96 x 94 x 129 voxels, while the map and resampled volumes both have box sizes of 380 x 380 x 380.

  2. Save the resampled volume (in the case of this example, volume #5 if a particle subtraction volume was not created and volume #6 if a particle subtraction volume was created). This is the mask base, so we recommend using an informative name.

  3. (If creating a particle subtraction mask) repeat the above steps to resample the Particle Subtraction mask base onto the map volume. In this case, the necessary command would be volume resample #5 onGrid #1. This is the completed mask base for Particle Subtraction.
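To make the box-size problem concrete, the sketch below places a small cropped array back into a larger empty box. It is a simplified stand-in for resampling, not what ChimeraX does: it assumes the cropped volume sits at a whole-voxel offset and needs no interpolation. The function name embed_in_box and the example sizes are hypothetical.

```python
import numpy as np


def embed_in_box(cropped, offset, box_shape):
    """Place a cropped volume into an empty box at an integer voxel offset."""
    for o, s, b in zip(offset, cropped.shape, box_shape):
        if o < 0 or o + s > b:
            raise ValueError("cropped volume does not fit inside the target box")
    full = np.zeros(box_shape, dtype=cropped.dtype)
    region = tuple(slice(o, o + s) for o, s in zip(offset, cropped.shape))
    full[region] = cropped
    return full


# Hypothetical cropped mask base and target box.
cropped = np.ones((4, 3, 5), dtype=np.float32)
full = embed_in_box(cropped, offset=(2, 1, 0), box_shape=(8, 8, 8))

print(full.shape)                   # (8, 8, 8)
print(full.sum() == cropped.sum())  # True
```

The information in the cropped volume is unchanged; only the box it lives in, and its position within that box, are made to match the original map.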

Step 7: Upload to CryoSPARC

Both of the mask bases are now resampled and ready for import to CryoSPARC, via the Import 3D Volumes job. Once imported, the mask bases can be converted to masks via thresholding, dilation, and padding in the Volume Tools job.

Method Two: volume eraser

For very simple masks, this technique is much faster than Method One. However, it can be susceptible to creating undesired noise and care must be taken when analyzing the resulting masks.

Step 1: Open and blur the input volume

Blurring a volume before creating your mask achieves two aims. Practically, it is significantly easier to select a region of interest when high-frequency noise has been attenuated with a blurring operation. Theoretically, it is important that the mask does not introduce high-frequency correlations between the two half maps. Building masks only from blurred volumes helps reduce the chance of this happening.

  1. Open the volume in ChimeraX. This example uses the results of a non-uniform refinement. For all commands in this example, the base volume is volume #1.

  2. Blur the volume using a Gaussian filter: volume gaussian #1 sDev 2. Increasing the value for sDev will make the map more blurry. This command creates a new, blurred volume. The blurred base volume is volume #2.

Step 2: Create copies of the blurred volume

Since the volume eraser tool directly modifies the volume it operates on, you must create another copy of the blurred volume if you plan on creating a mask for Particle Subtraction.

  1. Copy the map with volume copy #2. In this example, the copy is volume #3.

Step 3: Erase the region outside the desired mask

In this step, we will erase the regions outside the mask using the Volume Eraser tool. This tool creates a sphere and allows you to erase everything either inside or outside the sphere. There is no undo function for this tool, so be careful when erasing volumes. You can create copies of the volume as you go if the process is long and complicated using volume copy as above.

  1. Open the volume eraser tool: Right Mouse ribbon menu > Erase. You should see a sphere appear. Holding down right-click and dragging your mouse moves the sphere. Erasing inside or outside the sphere is accomplished with the buttons in the Map Eraser pane.

  2. Using the sphere, erase all regions outside your desired mask. Perform a close inspection of your final volume, being careful to notice small regions left behind by imprecise eraser placement.

  3. Save the erased map as your mask base.

Step 4: (Optional) Create a particle subtraction mask

  1. Subtract the erased and blurred map (in this case, #2) from the unerased and blurred map (in this case, #3): volume subtract #3 #2. If the result has negative values in most voxels and an unexpected and noisy shape, the arguments were likely given in the wrong order.
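The effect of argument order can be seen in a toy example. The one-dimensional arrays below are hypothetical stand-ins for the unerased and erased blurred maps.

```python
import numpy as np

# Hypothetical density profile standing in for the unerased, blurred map.
full_map = np.array([0.0, 0.2, 0.8, 1.0, 0.9, 0.3, 0.0])

# The erased copy keeps only the region of interest; the rest is set to zero.
erased_map = full_map.copy()
erased_map[4:] = 0.0

# Correct order: unerased minus erased leaves only the erased-away region.
complement = full_map - erased_map

# Wrong order: the same region appears with negative values.
wrong_order = erased_map - full_map

print(complement.min() >= 0)   # True
print(wrong_order.max() <= 0)  # True
```

The correct order yields non-negative density covering exactly the part of the map that was erased, which is the region a Particle Subtraction mask should cover.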

Step 5: Upload the mask bases to CryoSPARC

  1. Save the resulting mask bases to .mrc files and upload the files to CryoSPARC.

Method Three: molmap

ChimeraX can create volumes based on molecular models using a command called molmap. Generating mask bases using this technique is by far the simplest — using a single command, we can create a mask around (in this example) chain U: molmap #2/U 16 onGrid #1.

If onGrid is left out of this command, a seemingly-correct mask base will be generated, but it will be on the wrong grid and so unusable!

In this example, #2 is our molecular model, we generated a mask base with a resolution of 16 angstroms, and #1 is our map from CryoSPARC. Note that even though none of the information in the mask base comes from the map, you still must have it loaded so that the mask base is on the correct grid.

In this command, resolution merely sets the level of detail in the resulting simulated map. It will not affect the quality of refinements using the mask.

Note also that any resolution can be selected. ChimeraX is not simulating any electron microscopy process — it is simply generating a volume using the provided model and the specified resolution. We recommend that masks are never generated with a resolution better than (i.e., never a value lower than) 12 Å.

Masks created using molmap are already on the correct grid and can immediately be saved and uploaded to CryoSPARC.

Converting a mask base to a mask

A mask base is converted to a mask by the following steps:

  1. The mask base is “binarized”. All values greater than a user-selected threshold are set to 1.0 and all values below this threshold are set to 0.0.

  2. The resulting binary volume is dilated. Additional pixels within a user-selected distance from the volume surface are also set to 1.0.

  3. The binary volume has a soft edge added (padding). This edge gradually decreases from 1.0 to 0.0 and has a user-specified width.

All of these steps can be performed simultaneously via a Volume Tools job, and more information about the necessary parameters is available on that job page.
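The three steps can be sketched with NumPy and SciPy. This is only an illustration and not the Volume Tools implementation: the function name make_soft_mask, its parameter names, and the raised-cosine shape of the soft edge are our assumptions.

```python
import numpy as np
from scipy.ndimage import distance_transform_edt


def make_soft_mask(mask_base, threshold, dilation_px, padding_px):
    """Binarize, dilate, and soft-pad a mask base (illustrative sketch)."""
    if padding_px <= 0:
        raise ValueError("padding_px must be positive")

    # Step 1: binarize. Values above the threshold are inside the mask.
    binary = mask_base > threshold
    if not binary.any():
        raise ValueError("threshold removes the entire mask base")

    # Distance in pixels from each outside voxel to the nearest inside voxel
    # (0 for voxels that are already inside).
    dist = distance_transform_edt(~binary)

    # Step 2: dilate. Voxels within dilation_px of the surface map to t = 0.
    # Step 3: pad. t rises from 0 to 1 across the next padding_px pixels.
    t = np.clip((dist - dilation_px) / padding_px, 0.0, 1.0)

    # Raised-cosine falloff: 1.0 at t = 0, 0.0 at t = 1 (assumed edge shape).
    return 0.5 * (1.0 + np.cos(np.pi * t))


# Hypothetical mask base: a cube of density 0.8 in a 32-voxel box.
base = np.zeros((32, 32, 32))
base[12:20, 12:20, 12:20] = 0.8
mask = make_soft_mask(base, threshold=0.5, dilation_px=2, padding_px=6)

print(mask[16, 16, 16])  # 1.0 (inside the mask base)
print(mask[16, 16, 21])  # 1.0 (inside the dilated region)
print(mask[0, 0, 0])     # 0.0 (far outside)
```

Between the dilated surface and the end of the padding, the values fall smoothly from 1.0 to 0.0, which is the soft edge described above.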

The binarization threshold will be different each time depending on the input mask base, and can be determined with a volume visualization tool like ChimeraX. A threshold should be selected such that there are no floating “specks” of density and the desired topology of the mask is preserved.
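In addition to visual inspection, floating specks can be checked for by counting connected regions above a candidate threshold. The sketch below uses SciPy for this. The helper name count_components and the toy mask base are hypothetical, and the default face-connected neighborhood of scipy.ndimage.label is an arbitrary choice.

```python
import numpy as np
from scipy.ndimage import label


def count_components(mask_base, threshold):
    """Number of connected regions with values above the threshold."""
    _, n_components = label(mask_base > threshold)
    return n_components


# Hypothetical mask base: one main body plus an isolated low-density speck.
base = np.zeros((16, 16, 16))
base[4:10, 4:10, 4:10] = 0.8
base[13, 13, 13] = 0.3

for threshold in (0.2, 0.5):
    print(threshold, count_components(base, threshold))
# 0.2 2
# 0.5 1
```

A count greater than the number of bodies you expect suggests the threshold is low enough to keep specks, although a single connected region does not by itself guarantee the desired topology.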

The amount of dilation required also depends on both the dataset and the sub-volume. Generally adding a few pixels of dilation helps to prevent over-tight masks, and allows for the inclusion of newly-resolved density in the masked volume.

Padding is an essential component of masking in cryoEM to prevent ringing artifacts. We recommend a minimum padding width of $5 \times \frac{\mathrm{resolution}}{\mathrm{apix}}$ pixels, where resolution is the GSFSC resolution in Å and apix is the pixel size in Å, but the optimal result can require significantly larger padding widths. We therefore recommend that users try a variety of mask dilation and padding combinations to find the ideal combination.

Next steps

  • Job: Local Refinement

  • Job: Particle Subtraction

References

  1. Eric F. Pettersen et al., “UCSF ChimeraX: Structure Visualization for Researchers, Educators, and Developers,” Protein Science: A Publication of the Protein Society 30, no. 1 (January 2021): 70–82, https://doi.org/10.1002/pro.3943.

  2. Thi Hoang Duong Nguyen et al., “Cryo-EM Structure of the Yeast U4/U6.U5 Tri-snRNP at 3.7 Å Resolution,” Nature 530, no. 7590 (February 1, 2016): 298–302, https://doi.org/10.1038/nature16940.

  3. Grigore Pintilie and Wah Chiu, “Comparison of Segger and Other Methods for Segmentation and Rigid-Body Docking of Molecular Components in Cryo-EM Density Maps,” Biopolymers 97, no. 9 (September 2012): 742–60, https://doi.org/10.1002/bip.22074.
