Job: Reference Based Auto Select 3D (BETA)
Select 3D volumes based on their similarity to a reference.
Reference Based Auto Select 3D uses an existing 3D reference to select volumes which are similar to the reference, and reject volumes which are dissimilar. If particles are included with the job, particles will be assigned to a selected or excluded output according to the status of the volume they belong to.
Similarity of the input volumes to the reference volume is assessed by calculating two correlations: an average FSC and the Pearson correlation coefficient. In both cases, a higher number means the volumes are more similar, with the value for identical volumes being 1.0.
The average FSC calculated by Reference Based Auto Select 3D is the average correlation between the reference volume and the input volume over all resolution shells, thereby yielding a single value. Note that this is different from the GSFSC used to assess resolution in most refinement jobs, which is calculated between independent half-maps of the same refinement.
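To make the single-value average FSC concrete, here is a minimal NumPy sketch. It is an illustration, not the job's actual implementation: the function name `average_fsc`, the integer shell binning, the cutoff at Nyquist, and the unweighted mean over shells are all assumptions made for this example.

```python
import numpy as np

def average_fsc(vol_a, vol_b):
    """Mean Fourier shell correlation between two cubic volumes of equal size.

    Hypothetical helper for illustration; shell binning and the unweighted
    mean over shells are assumptions, not the job's documented behaviour.
    """
    n = vol_a.shape[0]
    # Centre the zero-frequency voxel at index n // 2 along every axis.
    fa = np.fft.fftshift(np.fft.fftn(vol_a))
    fb = np.fft.fftshift(np.fft.fftn(vol_b))

    # Integer shell index: rounded distance of each voxel from zero frequency.
    coords = np.indices(vol_a.shape) - n // 2
    shell = np.rint(np.sqrt((coords ** 2).sum(axis=0))).astype(int).ravel()

    n_shells = n // 2            # assumption: ignore shells beyond Nyquist
    keep = shell < n_shells

    # Per-shell sums of the cross term and of each volume's power.
    cross = np.bincount(shell[keep],
                        weights=(fa * np.conj(fb)).real.ravel()[keep],
                        minlength=n_shells)
    power_a = np.bincount(shell[keep],
                          weights=(np.abs(fa) ** 2).ravel()[keep],
                          minlength=n_shells)
    power_b = np.bincount(shell[keep],
                          weights=(np.abs(fb) ** 2).ravel()[keep],
                          minlength=n_shells)

    denom = np.sqrt(power_a * power_b)
    # Shells with no power contribute 0 rather than dividing by zero.
    fsc = np.divide(cross, denom, out=np.zeros_like(cross), where=denom > 0)
    return float(fsc.mean())
```

Under these assumptions, a volume compared with itself gives a value of 1.0 (up to floating-point error), matching the statement that identical volumes score 1.0.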
The Pearson correlation coefficient measures the similarity in real space of each voxel of the input and reference volumes, and is calculated over the entire 3D box.
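The real-space Pearson correlation can be sketched in a few lines of NumPy. This is an illustration of the standard formula applied over every voxel in the box, assuming the two volumes are already aligned and have the same shape; the function name `pearson_correlation` is hypothetical.

```python
import numpy as np

def pearson_correlation(vol_a, vol_b):
    """Pearson correlation coefficient over all voxels of two equal-shape volumes."""
    # Subtract each volume's mean so the result ignores constant offsets.
    a = vol_a.ravel() - vol_a.mean()
    b = vol_b.ravel() - vol_b.mean()
    # Normalising by both norms makes the result independent of overall scale.
    return float(np.dot(a, b) / np.sqrt(np.dot(a, a) * np.dot(b, b)))
```

Because the coefficient is invariant to offset and positive scaling, two volumes that differ only in grey-level scaling still score 1.0.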
This input is optional. If provided, particles which belong to a volume selected by Reference Based Auto Select 3D will be selected, while particles belonging to a rejected volume will be excluded. Particle poses will not be changed in the output of this job.
Note that particles should come from the same job as the input volumes.
The volume to which the input volumes will be compared. This volume will be lowpass filtered before the Volumes are aligned and compared to it. The reference volume does not need to come from the same dataset as the input volumes.
Reference Based Auto Select 3D can select volumes in two ways: by resolution only, or by comparing to a reference. The mode is selected by this parameter.
This parameter is the only parameter required by Reference Based Auto Select 3D when using the Select by Resolution Only select mode.
All of the input volumes' FSC resolutions are compared to that of the best input volume. For each volume, if the reciprocal of its FSC resolution divided by the reciprocal of the best FSC resolution is greater than the Resolution threshold, the volume is selected.
For example, consider a job with the Resolution threshold set to 0.75 and input volumes with FSC resolutions of 2.8, 3.2, 5, and 8 Å. The best resolution is 2.8 Å, so that volume has a ratio of 1.0. The reciprocal ratios of the other volumes are 0.875, 0.560, and 0.350, respectively. Only the first and second volumes have ratios greater than 0.75, so those volumes are selected and the others are rejected.
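The arithmetic in this example can be reproduced with a short Python sketch. The function name `select_by_resolution` is hypothetical, and the strict "greater than" comparison follows the wording of the example rather than any confirmed implementation detail.

```python
def select_by_resolution(resolutions, threshold):
    """Return (ratios, selected flags) for a list of FSC resolutions in Å."""
    best = min(resolutions)  # best resolution = lowest numeric value
    # (1 / r) / (1 / best) simplifies to best / r.
    ratios = [best / r for r in resolutions]
    # Assumption: a volume is kept when its ratio is strictly above the threshold.
    selected = [ratio > threshold for ratio in ratios]
    return ratios, selected


ratios, selected = select_by_resolution([2.8, 3.2, 5.0, 8.0], threshold=0.75)
# Ratios are approximately 1.0, 0.875, 0.56 and 0.35.
# selected is [True, True, False, False]
```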
Classes with an FSC resolution worse (i.e., higher numeric value) than this number are excluded regardless of the correlations with the reference volume.
Volumes with a volume-to-reference FSC worse than the FSC threshold and volumes with a Pearson correlation coefficient worse than the Pearson threshold are excluded. Note that a volume is excluded if either one of its correlation coefficients is worse than the respective threshold.
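The exclusion rule amounts to a logical AND over the two scores: a volume is kept only when both meet their thresholds. A minimal sketch follows; the function names, the example scores, and the treatment of a score exactly equal to its threshold (kept here) are assumptions made for illustration.

```python
def passes_thresholds(avg_fsc, pearson, fsc_threshold, pearson_threshold):
    """True only if neither score is worse (lower) than its threshold."""
    # Assumption: a score exactly equal to the threshold counts as passing.
    return avg_fsc >= fsc_threshold and pearson >= pearson_threshold


def partition_volumes(scores, fsc_threshold, pearson_threshold):
    """Split {name: (avg_fsc, pearson)} into selected and excluded name lists."""
    selected, excluded = [], []
    for name, (avg_fsc, pearson) in scores.items():
        if passes_thresholds(avg_fsc, pearson, fsc_threshold, pearson_threshold):
            selected.append(name)
        else:
            excluded.append(name)
    return selected, excluded


# Hypothetical scores: vol_b fails only the Pearson test, vol_c fails only the FSC test.
scores = {"vol_a": (0.9, 0.8), "vol_b": (0.9, 0.3), "vol_c": (0.4, 0.8)}
selected, excluded = partition_volumes(scores, fsc_threshold=0.5, pearson_threshold=0.5)
# selected is ["vol_a"]; excluded is ["vol_b", "vol_c"]
```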
Before volumes are aligned and correlations are calculated, the reference volume is lowpass filtered. If this parameter is left blank, the reference will be filtered to the best (i.e., lowest numeric value) GSFSC resolution of the input Volumes. If this parameter is set, the reference volume will be filtered to this resolution instead. Note that if this parameter is set, the Average FSC value for a given input volume is calculated only up to this resolution.
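The choice of filter resolution described above can be sketched as follows. The function names are hypothetical. The conversion from a resolution in Å to a Fourier shell index uses the standard relation between box size, pixel size, and spatial frequency; whether the job truncates the average FSC at exactly this shell is an assumption.

```python
import math

def choose_filter_resolution(gsfsc_resolutions, user_resolution=None):
    """Use the user-supplied resolution if set, otherwise the best GSFSC resolution."""
    if user_resolution is not None:
        return user_resolution
    return min(gsfsc_resolutions)  # best = lowest numeric value in Å


def max_shell_index(box_size, pixel_size, resolution):
    """Highest Fourier shell at or below the given resolution.

    Shell k corresponds to a resolution of box_size * pixel_size / k Å.
    """
    return math.floor(box_size * pixel_size / resolution)


filter_res = choose_filter_resolution([2.8, 3.2, 5.0])       # parameter left blank
fixed_res = choose_filter_resolution([2.8, 3.2, 5.0], 6.0)   # parameter set to 6 Å
```

Fixing the parameter keeps the cutoff shell the same from run to run, which is why the correlation scores stay comparable when the input volumes change.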
Particles belonging to selected volumes are in this output. Note that even though the input volumes are aligned to the reference during the Reference Based Auto Select 3D job, the particles’ poses are unchanged. This output is not present if particles were not provided.
Particles belonging to excluded volumes are in this output. This output is not present if particles were not provided.
In addition to slices through the aligned input volumes, Reference Based Auto Select 3D produces a plot of each volume’s FSC and Pearson correlation coefficient, the thresholds, and whether each volume was accepted or rejected.
Reference Based Auto Select 3D is typically run manually once per sample type to determine useful parameters for future automated use.
Volumes which will be compared with the reference. Note that volumes will be aligned to the reference before comparison. Note that this is a input.
Although, in general, good volumes should have better correlation scores than bad volumes at any resolution level, the relative numeric value of the correlation scores depends on the filter resolution. Thus, for unsupervised selection of volumes (for instance, as part of a ), we recommend that this parameter is set to a specific value while searching for the appropriate threshold values, and then is later set to the same value in the workflow.
Volumes for which both correlation scores were better than the respective threshold are output here. Note that this is a output and includes a series result containing a of all volumes.
Volumes for which either correlation score was worse than the respective threshold are output here. Note that this is a output and includes a series result containing a of all volumes.
Once useful thresholds for a given target have been determined, Reference Based Auto Select 3D can play an important role in automated processing using . For instance, this job can be included in an automated pipeline after jobs which generate several 3D volumes of unknown quality (such as , , or ) and before a downstream or . In this way, the Workflow can proceed from particle and volume curation to a high-quality consensus refinement without any user intervention, providing a degree of automation for known samples.
Since Reference Based Auto Select 3D outputs a number of selected volumes that is unknown at the time of building the job, it is not possible to know which of the output volumes to use for a downstream refinement when setting up an automated workflow. In this case, a single, average volume for refinement can be created by connecting the selected volumes to a job and setting Number of super classes to 1.