# Helical symmetry in cryoSPARC

Many proteins that form filaments can be characterized as having some form of helical symmetry. In cryo-EM, symmetry is a powerful tool to both increase the effective signal in a dataset, and to provide a structural constraint on the reconstruction. With helical symmetry, the latter point is especially important in order to arrive at correct, high-resolution structures. However, a major challenge in reconstruction of helical assemblies is determining the parameters that define the helical symmetry. While it is often necessary, there is currently no universal method for determining or validating helical symmetry parameters. This page details how helical symmetry is defined and treated in cryoSPARC, as well as some tools developed in cryoSPARC to assist in the exploration of symmetry parameters.

# Definition of helical symmetry in cryoSPARC

This section covers some of the math involved in representing helical symmetry. Similar to point group symmetry, proteins with helical symmetry are composed of a single asymmetric unit that exists in multiple different spatial positions. With point group symmetry, the transformations that relate one copy of the subunit to all others are characterized by pure rotations around a fixed point. With helical symmetry, these transformations are characterized instead by a simultaneous rotation ("twist") and translation ("rise") along the same axis (called the helical axis, "meridian", or the "screw" axis). The image below shows an example of a simple one-start helical lattice on the right, compared to a continuous helix on the left [1].

Global helical symmetry can be defined through two sets of equivalent parameters:

• Helical rise $\Delta z$ (Å), and twist $\Delta \phi$ (º)

• Helical pitch $p$ (Å), number of subunits per full turn $n$, and hand $h \in \{+1, -1\}$, where by convention $h = +1$ corresponds to right-handed helices and $h = -1$ corresponds to left-handed helices

The helical pitch, number of subunits per turn, and hand contain the same information as the rise and twist. Note that $p$, $n$, and $\Delta z$ are strictly positive, whereas $h$ and $\Delta \phi$ can take on either sign and thus serve to indicate the handedness of the helix. From $p$, $n$, and $h$ values, the rise and twist can be calculated as:

$\Delta \phi = \frac{360º\;h}{n} ; \;\; \Delta z = \frac{p}{n}$

And similarly, from $\Delta z$ and $\Delta \phi$, the pitch, number of subunits per turn, and hand can be calculated as:

$n = \frac{360º}{|\Delta \phi|} ; \; \; p = \frac{360º \Delta z}{|\Delta \phi|} ; \; \; h = \mathrm{sign}(\Delta \phi)$
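The two conversions above can be sketched in a few lines of Python. This is only an illustration of the formulas, not cryoSPARC code; the function names and the example values (a 5 Å rise and a -20º twist) are hypothetical.

```python
def pitch_to_rise_twist(pitch, n_per_turn, hand):
    """Convert (p, n, h) to (rise in Å, twist in degrees)."""
    if pitch <= 0 or n_per_turn <= 0 or hand not in (1, -1):
        raise ValueError("pitch and n must be positive; hand must be +1 or -1")
    twist = 360.0 * hand / n_per_turn
    rise = pitch / n_per_turn
    return rise, twist


def rise_twist_to_pitch(rise, twist):
    """Convert (rise in Å, twist in degrees) to (p, n, h)."""
    if rise <= 0 or twist == 0:
        raise ValueError("rise must be positive and twist must be non-zero")
    n_per_turn = 360.0 / abs(twist)
    pitch = 360.0 * rise / abs(twist)
    hand = 1 if twist > 0 else -1
    return pitch, n_per_turn, hand


# Hypothetical left-handed helix: 5 Å rise, -20 degree twist.
pitch, n_per_turn, hand = rise_twist_to_pitch(5.0, -20.0)
print(pitch, n_per_turn, hand)                        # 90.0 18.0 -1
print(pitch_to_rise_twist(pitch, n_per_turn, hand))   # (5.0, -20.0)
```

Note that $n$ does not need to be an integer: a helix with a non-integer number of subunits per turn simply does not repeat exactly after one pitch.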

In some cases, helical assemblies will show more ambiguity in the number of subunits per turn than they do in pitch, making the $(n,p,h)$ basis more useful to explore and decouple the space of symmetry parameters. On the other hand, the $(\Delta \phi, \Delta z)$ basis more clearly represents the information needed to impose symmetry during a refinement, as it more directly relates to important low-level parameters such as the number of asymmetric units to perform symmetry-averaging over. For this reason, in cryoSPARC, both sets of parameters are used at different points in a helical processing workflow. In particular, helical refinements use the $(\Delta \phi, \Delta z)$ basis (and this is what is stored internally), and the symmetry search utility uses either basis.

## Advanced notes on helical symmetry

CryoSPARC also supports additional point group symmetry associated with a helical assembly, namely cyclic symmetries (with a cyclic axis coincident with the helical axis) and dihedral symmetries (with the dyad axis perpendicular to the helical axis). These are often referred to as "n-start" helices, where n is the cyclic order.

It should also be noted that in proteins that form a helical lattice, there are many ambiguities in the characterization of helical symmetry. For the purposes of symmetry imposition during refinement, the helical twist and rise pair should:

• Correspond to a valid symmetry transform that translates an individual asymmetric unit to another identical asymmetric unit on the helical lattice, and

• Have the minimum helical rise out of all such valid transformations

The first condition ensures that asymmetric units are properly averaged together, and the second condition ensures that symmetry is used to its maximum extent.
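The two conditions can be made concrete with a small check on an idealized lattice. The sketch below is an illustration only and does not reflect how cryoSPARC validates symmetry internally. It assumes the true lattice parameters are already known (here a hypothetical one-start helix with a 40º twist and 10 Å rise), tests whether each candidate pair maps the lattice onto itself, and then keeps the valid candidates with the smallest rise. The function names and the `cyclic_order` argument are assumptions made for this example.

```python
def is_valid_transform(cand_twist, cand_rise, twist, rise, cyclic_order=1, tol=1e-6):
    """True if (cand_twist, cand_rise) maps the helical lattice onto itself."""
    # The candidate rise must be a whole number of lattice steps along the axis.
    steps = cand_rise / rise
    if abs(steps - round(steps)) > tol:
        return False
    steps = round(steps)
    # After that many steps the twist must agree, up to rotations of the
    # cyclic point group around the helical axis (multiples of 360 / C_n).
    sector = 360.0 / cyclic_order
    residual = (cand_twist - steps * twist) % sector
    return min(residual, sector - residual) < tol


def minimal_rise_transforms(candidates, twist, rise, cyclic_order=1, tol=1e-6):
    """Keep the valid candidates that share the smallest positive rise."""
    valid = [
        c for c in candidates
        if c[1] > 0 and is_valid_transform(c[0], c[1], twist, rise, cyclic_order, tol)
    ]
    if not valid:
        return []
    best = min(r for _, r in valid)
    return [c for c in valid if abs(c[1] - best) < tol]


# Hypothetical (twist in degrees, rise in Å) candidates.
candidates = [(40.0, 10.0), (80.0, 20.0), (120.0, 30.0), (-40.0, 10.0), (40.0, 5.0)]
print(minimal_rise_transforms(candidates, twist=40.0, rise=10.0))  # [(40.0, 10.0)]
```

In this example, (80º, 20 Å) and (120º, 30 Å) are also valid transforms, but they skip subunits and so would average over fewer asymmetric units than the minimal-rise pair.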

Note that even abiding by these conditions may produce multiple valid symmetry parameterizations. This is especially true when the helical assembly has an additional cyclic or dihedral point group symmetry associated with it. For example, EMPIAR-10019 with an additional C6 point group present has a helical rise (between six-fold symmetric layers) of $21.8 \; Å$. However, multiple twist values are valid due to the six-fold symmetry. Along with the reported twist value of $29.4º$, a helical twist of $\frac{-1}{6}(360º) + 29.4º = -30.6º$ will also perform the equivalent operation during reconstruction.
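The full set of equivalent twists for a given cyclic order can be enumerated by adding multiples of $360º / n$ to the reported twist and wrapping the result into a single turn. The helper below is a hypothetical sketch of that arithmetic, not a cryoSPARC function.

```python
def equivalent_twists(twist_deg, cyclic_order):
    """Twists in (-180, 180] equivalent to twist_deg under C_n about the helical axis."""
    sector = 360.0 / cyclic_order
    twists = []
    for k in range(cyclic_order):
        t = twist_deg + k * sector
        # Wrap into [-180, 180), then move -180 to +180 for a unique representative.
        t = (t + 180.0) % 360.0 - 180.0
        if t == -180.0:
            t = 180.0
        twists.append(round(t, 6))
    return sorted(twists)


# Twist reported for EMPIAR-10019 in the text above, with C6 symmetry.
print(equivalent_twists(29.4, 6))
# [-150.6, -90.6, -30.6, 29.4, 89.4, 149.4]
```

All six values describe the same symmetry when combined with the same rise of 21.8 Å and the C6 point group.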

In addition to these ambiguities, there may also be ambiguities in the hand of helical symmetry. Since the hand of a cryo-EM map is not constrained (projection images carry no information about the correct hand), it's possible for maps to reach high resolution but have an incorrect hand. It's generally only possible to observe the hand of a density map once the resolution is high enough that structural motifs such as alpha helices can be seen; if these are left-handed, that likely indicates the hand of the map is inverted. If symmetry was applied, and the map has the wrong hand, the helical twist should have its sign inverted in order to produce the correct symmetry. Note that the Volume Tools job can invert the hand of a map.
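The reason the twist changes sign while the rise does not can be seen by mirroring an idealized helix. The sketch below places point-like subunits on a helix, reflects them through a plane containing the helical axis, and measures the twist between consecutive subunits before and after. It is a geometric illustration only; the function names and the unit radius are assumptions, and the twist and rise are taken from the EMPIAR-10019 example above.

```python
import math


def helix_points(twist_deg, rise, n_subunits, radius=1.0):
    """Coordinates of point-like subunits on a helix around the z axis."""
    points = []
    for k in range(n_subunits):
        phi = math.radians(k * twist_deg)
        points.append((radius * math.cos(phi), radius * math.sin(phi), k * rise))
    return points


def measured_twist(p0, p1):
    """Signed azimuthal angle (degrees) from p0 to p1, wrapped to [-180, 180)."""
    delta = math.degrees(math.atan2(p1[1], p1[0]) - math.atan2(p0[1], p0[0]))
    return (delta + 180.0) % 360.0 - 180.0


points = helix_points(29.4, 21.8, 4)
# Mirror through the y-z plane (x -> -x), which inverts the hand.
mirrored = [(-x, y, z) for x, y, z in points]

print(round(measured_twist(points[0], points[1]), 3))      # 29.4
print(round(measured_twist(mirrored[0], mirrored[1]), 3))  # -29.4
print(round(mirrored[1][2] - mirrored[0][2], 3))           # 21.8
```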

Finally, more advanced cases exist, such as:

• helical proteins with multiple starts but without cyclic symmetry (i.e. "screw symmetry", or "n-start C1 helical symmetry")

• pseudo-helical symmetry (e.g. microtubules with a symmetry-breaking seam)

The bottom line is that in most cases of global helical symmetry, a valid twist/rise representation will exist. In the first case of screw symmetry, these are reducible into a valid representation within the global twist/rise parameterization, as described by He and Scheres [2]. On the other hand, microtubules with seams may or may not be able to fit into a global helical symmetry formulation. In cases where you suspect the helical symmetry is not global, it may be insightful to refine the structures both with and without symmetry applied, and observe the differences.

# Exploration of symmetry parameters

Note: Reliable determination of helical symmetry is an open problem in cryo-EM reconstruction, and it has been observed that the error landscape over helical symmetry parameters contains many false minima [3]. Furthermore, validation of helical symmetry parameters poses an even greater challenge, as standard methods of validation (including FSC) often are not sufficient when comparing two reconstructions with differing helical symmetry. For these reasons, the tools and workflows presented on this page are not intended to immediately provide unambiguous solutions to, and/or validate, the helical symmetry parameters. Rather, these tools are primarily intended to help guide the initial exploration of candidate symmetry parameters, and to enable refining these candidate solutions during a helical refinement. Though the success of these methods is often dataset-dependent, they may be useful as part of the process of determining helical symmetry.

Where possible, validation of imposed helical symmetry parameters should be done by cross-referencing with other methods (e.g. prior knowledge, known similar structures, Fourier-Bessel analysis, etc.). In addition, density maps with imposed symmetry should always be inspected for expected structural motifs and secondary structure. Often, imposing incorrect symmetry parameters can still produce refined maps with relatively high claimed resolutions, but structural motifs may appear highly distorted, may be "smeared" out along the azimuth, or may be absent altogether. Finally, FSC plots should also always be inspected. High amplitude oscillations in the FSC curves are typical in early iterations when the resolution is still fairly low, but if they persist into the final iterations of the refinement, that may indicate that the wrong symmetry was imposed.

## Using asymmetric reconstructions and symmetry search utility job to explore symmetry parameters

### Using helical refinement

One way to do an asymmetric reconstruction is to launch an asymmetric helical refinement. This can be done by creating a Helical Refinement (BETA) job and leaving the twist and rise parameters empty, which means that no symmetry will be applied during reconstruction. For such refinements, the initial density generated can impact the results. By default, helical refinement will generate an initial density by using the in-plane rotation (i.e. tilt) estimates of the filament picks, but with scrambled azimuth rotation angles. Parameters that strongly determine how successful this method is include the following:

• Initial lowpass resolution (A): This controls the resolution used in the first iteration.

• Number of images for initial density generation: This controls the number of images used in the random density initialization.

• GSFSC Split Resolution (A): This controls the resolution beyond which each half-map is considered independent. For smaller filaments, ideal values may lie lower (in the range of 12-15 Å) relative to the default of 20 Å.

The GSFSC split resolution is quite important for smaller filaments, and when running helical refinements without a volume input. If you notice strong oscillation in the FSC curves, especially at lower resolutions, you may want to set the split resolution to a lower number (e.g. 12-15 Å). With higher initial lowpass resolutions and lower numbers of images used for initial density generation, there is more potential for the initial density to introduce bias into the refinement, potentially causing the refinement to converge to an incorrect model. On the other hand, with lower initial lowpass resolutions and higher numbers of images used for initial density generation, the initial model may lack enough detail to reliably align particle images. While symmetric refinements are largely insensitive to these parameters (the imposition of correct symmetry overcomes most of these pitfalls), asymmetric refinements may require some experimentation to find the set of ideal parameters for initial model generation.

### Using ab-initio reconstruction

Ab-initio reconstruction can sometimes be used to directly reconstruct a helical assembly from the particles. For some datasets, all projection views appear so similar that the reconstruction assigns them all to very similar poses, and hence suffers from strong preferred orientation. However, for other datasets in which views have more diversity, ab-initio reconstruction can work well for generating an initial model.

Some parameters that may increase the effectiveness of ab-initio reconstruction under these conditions are:

• Initial/Final minibatch size: Increasing these to ≥400 and ≥1000, respectively, may increase the diversity of views used in each step of the SGD optimization (only if the dataset has enough diversity!)

• Initial/Maximum resolution (Angstroms): Decreasing these to ≤20 and ≤10, respectively, will force the use of higher frequency information during optimization.

• Initial structure lowpass (Fourier radius): Increasing this dramatically (i.e. from the default of 7 to between 30 and 60) can significantly impact the results, as it allows the use of higher-frequency information right from the start (which is often important for learning helical symmetry).
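To get a feel for what a given Fourier radius means in real-space terms, the standard relation between a Fourier shell index and resolution can be used: resolution = (box size × pixel size) / radius. The sketch below assumes the radius is counted in Fourier-space voxels of the box used during reconstruction; the function name, the 128-pixel box, and the 2.0 Å pixel size are hypothetical values chosen for illustration, so substitute the values that apply to your own job.

```python
def fourier_radius_to_resolution(radius_px, box_size_px, pixel_size_A):
    """Resolution in Å corresponding to a Fourier shell radius in voxels."""
    if radius_px <= 0:
        raise ValueError("Fourier radius must be positive")
    return box_size_px * pixel_size_A / radius_px


# Hypothetical box of 128 pixels at 2.0 Å/pixel.
for radius in (7, 30, 60):
    resolution = fourier_radius_to_resolution(radius, 128, 2.0)
    print(radius, round(resolution, 1))
# 7 36.6
# 30 8.5
# 60 4.3
```

Under these assumed values, raising the radius from 7 to between 30 and 60 moves the starting lowpass from tens of Ångströms to well below 10 Å, which is consistent with the note above about using higher-frequency information from the start.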

Reconstructions should ideally have orientation distributions that cover all or most azimuth angles. Alternatively, if there are large gaps in coverage along the azimuth angle even after tweaking the above parameters, you may find more success in using an asymmetric helical refinement as detailed above.

### Exploring the symmetry parameters in a reconstruction

Once a refinement or ab-initio reconstruction has completed with a satisfactory orientation distribution, it can be passed to a Symmetry search utility (BETA) job, which will search a predefined range of symmetry parameters and report the set of candidate symmetry pairs. Typically, the search ranges should be determined by inspection of both the refined map and the 2D class averages, as these can often reveal which values of helical pitch are plausible. For more information and tips on using the search utility in your workflows, please refer to the linked job page, as well as the EMPIAR-10031 case study.

The output of a Symmetry search utility will include a list of candidate symmetry parameters, both printed out to the streamlog, and also written to a .cs file in the job directory. They will be ranked by increasing mean squared error score, where lower scoring values are considered more plausible. The twist and rise values printed to the streamlog can be used as initial estimates for a helical refinement.

### Using Helical Refinement (BETA) to refine symmetry parameters

Once a pair of symmetry parameters is chosen for refinement, it can be passed to a helical refinement job through the Helical twist estimate and Helical rise estimate parameters. Since these values are often not precise enough to bring the structure to very high resolution, it is recommended to enable refinement of the symmetry parameters by setting the Resolution to begin local searches of helical symmetry parameter to between 5 and 8 Å.

## Citations

[1] Doryen Bubeck. Making the most of symmetry. 2019. [ http://www.cryst.bbk.ac.uk/embo2019/ppts/EMBO_2019.pdf ]

[2] Shaoda He, Sjors H.W. Scheres. Helical reconstruction in RELION. Journal of Structural Biology, Volume 198, Issue 3, 2017, Pages 163-176, ISSN 1047-8477. [ https://doi.org/10.1016/j.jsb.2017.02.003 ]

[3] Edward Egelman. Ambiguities in helical reconstruction. eLife 3:e04969, 2014. [ https://doi.org/10.7554/eLife.04969.001 ]