Job: Remove Duplicate Particles

Remove duplicate particles.

Description

Removes duplicate particle picks from a given input set of particles. Passing picks from different pickers (e.g. template picker and blob picker) through this job allows them to be combined safely. The job can also incorporate alignment shifts, which makes it useful for removing duplicates after 2D classification and before launching into ab-initio reconstruction or refinement. Remove Duplicate Particles may also be used to undo a previous symmetry expansion.

Input

  • Particles

    • location required

    • blob, pick_stats, alignments2D, alignments3D are optional

  • Micrographs (optional)

    • micrograph_blob (optional)

Parameters

  • Minimum separation distance (A)

    • The desired minimum distance between particle centers (default 20 Å)

  • Micrograph pixel size (A)

    • If micrographs are not input to the job, then the micrograph pixel size in Angstroms must be set here

  • Shift key

    • This is used only if alignments2D or alignments3D are connected. In that case, the particle center coordinates are determined by applying the additional shift offset from either alignments2D or alignments3D, according to the shift key. Set this parameter to none to prevent shifts from being used when determining particle center coordinates; the particle centers are then read from location only.

  • Score field

    • This controls the field used to decide which particle is kept from each set of duplicates. In all cases, the particle with the more favourable score is kept and the other identified duplicates are rejected. By default, the ncc_score from particle picking is used, but other scores may be chosen instead. These include the particle's agreement with the reference (error) or the particle's agreement with the reference after accounting for per-particle scales (error_min). This parameter can also be set to none, in which case the particle to keep from each set of duplicates is chosen randomly.

  • Remove duplicates entirely

    • Activate this to reject all particles amongst the identified duplicates, rather than keeping one from each set of duplicates.
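The way these parameters interact can be illustrated with a minimal sketch. This is not the job's actual implementation: the function name, its arguments, the greedy best-score-first strategy, and the assumption that a higher score is more favourable are all hypothetical choices made for illustration (an error-like score would need to be negated first). Shifts from alignments are deliberately left out, so centers come from pick locations only, as with Shift key set to none.

```python
import math
import random


def remove_duplicates(centers_px, pixel_size_A, scores=None,
                      min_sep_A=20.0, remove_entirely=False, seed=0):
    """Return (kept, rejected) index lists for the particles of ONE micrograph.

    centers_px   : list of (x, y) particle centers in micrograph pixels
    pixel_size_A : micrograph pixel size in Angstroms per pixel
    scores       : one score per particle, higher assumed better; None = random
    """
    n = len(centers_px)
    # Convert to Angstroms so the separation threshold has physical units.
    centers_A = [(x * pixel_size_A, y * pixel_size_A) for x, y in centers_px]

    if scores is None:
        # "Score field = none": pick the survivor of each duplicate set randomly.
        rng = random.Random(seed)
        scores = [rng.random() for _ in range(n)]

    def too_close(i, j):
        (xi, yi), (xj, yj) = centers_A[i], centers_A[j]
        return math.hypot(xi - xj, yi - yj) < min_sep_A

    # neighbours[i] lists every other particle closer than min_sep_A to i.
    neighbours = [[j for j in range(n) if j != i and too_close(i, j)]
                  for i in range(n)]

    if remove_entirely:
        # Reject every particle that has at least one duplicate.
        rejected = [bool(nb) for nb in neighbours]
    else:
        # Greedy pass: visit best scores first; each kept particle
        # rejects all of its too-close neighbours.
        rejected = [False] * n
        for i in sorted(range(n), key=lambda k: scores[k], reverse=True):
            if not rejected[i]:
                for j in neighbours[i]:
                    rejected[j] = True

    kept = [i for i in range(n) if not rejected[i]]
    return kept, [i for i in range(n) if rejected[i]]


# Three picks at 1 A/pixel; the first two are only 10 A apart.
picks = [(0, 0), (10, 0), (100, 0)]
print(remove_duplicates(picks, 1.0, scores=[0.5, 0.9, 0.1]))
```

In this sketch the two return values play the roles of the particles_kept and particles_rejected outputs. The brute-force pairwise comparison is quadratic in the number of picks per micrograph; a spatial index would be a natural replacement for large pick sets.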

Output

  • particles_kept: Particles kept after duplicates are removed

  • particles_rejected: Particles identified as duplicates

Example Plots

If micrographs are connected, the job will plot out an example micrograph with the kept and rejected coordinates. Shown below are example plots of duplicate removal on a template-picked dataset, with a high minimum separation distance of 100 Å.

Common Use Cases

Combining particles from two different particle pickers

Remove Duplicate Particles may be used to combine the results from two different particle pickers, for example, the template picker and blob picker. This is important in order to prevent duplication of particles in subsequent classifications and refinements, which can cause overfitting and misestimation of resolution. In this case, input the source micrographs and the picked particles from each picker, and optionally specify the desired Minimum separation distance and Score field. If particles have been previously aligned via 2D classification or 3D refinement, then the Shift key may also be specified.
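As a rough illustration of this use case, the sketch below merges two pick lists and drops near-duplicates micrograph by micrograph. It is a hypothetical stand-in, not the job's code: the tuple layout, the function name, and the assumptions that coordinates are already in Angstroms, that picks are only compared within the same micrograph, and that a higher score is more favourable are all illustrative.

```python
import math
from collections import defaultdict


def combine_pickers(picks_a, picks_b, min_sep_A=20.0):
    """Merge two pick lists and drop near-duplicates within each micrograph.

    Each pick is a (micrograph_id, x_A, y_A, score) tuple, with coordinates
    in Angstroms and a higher score assumed to be more favourable.
    """
    # Group picks by micrograph so picks on different micrographs
    # are never compared with each other.
    by_mic = defaultdict(list)
    for pick in list(picks_a) + list(picks_b):
        by_mic[pick[0]].append(pick)

    kept = []
    for picks in by_mic.values():
        accepted = []
        # Visit best-scoring picks first so they win over their duplicates.
        for pick in sorted(picks, key=lambda p: p[3], reverse=True):
            _, x, y, _ = pick
            if all(math.hypot(x - ax, y - ay) >= min_sep_A
                   for _, ax, ay, _ in accepted):
                accepted.append(pick)
        kept.extend(accepted)
    return kept


template_picks = [("mic1", 100.0, 100.0, 0.9), ("mic2", 100.0, 100.0, 0.4)]
blob_picks = [("mic1", 105.0, 100.0, 0.5), ("mic1", 300.0, 300.0, 0.7)]
for pick in combine_pickers(template_picks, blob_picks):
    print(pick)
```

Here the blob pick at (105, 100) on mic1 is dropped because it lies 5 Å from a higher-scoring template pick, while the pick at the same coordinates on mic2 survives because it belongs to a different micrograph.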

Undoing symmetry expansion

Particles that have previously been symmetry expanded can have the expansion effectively "undone" by passing them through this job. Specifically, this job can take the full expanded particle stack and select only one particle from each set of symmetry-expanded copies. This may be desired when, for example, the symmetry-expanded particles have been used for downstream processing (e.g. 3D Variability Display in the cluster or intermediates modes) and a subset of the expanded particle stack has been selected for further symmetry-enforced refinements.

For undoing symmetry expansion, the Minimum separation distance should be set to 0 to indicate that only particles with precisely the same coordinates should be identified as duplicates. The Shift key should also be set to none to ensure that particle coordinates are obtained from their locations, rather than from 2D or 3D alignments. Finally, the Score field can optionally be specified.
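With a separation of 0 and no shifts applied, duplicate detection reduces to grouping particles that share exactly the same location. A minimal sketch of that idea follows; the dictionary keys ("mic", "x", "y", "score", "sym_idx"), the function name, and the higher-is-better score convention are hypothetical and chosen only for illustration.

```python
def undo_symmetry_expansion(particles):
    """Keep one particle per unique (micrograph, x, y) location.

    particles: list of dicts with the hypothetical keys "mic", "x", "y"
    and "score"; the copy with the highest score in each group is kept.
    """
    best = {}
    for p in particles:
        # Exact coordinate match: symmetry-expanded copies share a location.
        key = (p["mic"], p["x"], p["y"])
        if key not in best or p["score"] > best[key]["score"]:
            best[key] = p
    return list(best.values())


# A selected subset of an expanded stack: two copies of one particle
# and a single copy of another.
subset = [
    {"mic": "m1", "x": 10.0, "y": 20.0, "score": 0.1, "sym_idx": 0},
    {"mic": "m1", "x": 10.0, "y": 20.0, "score": 0.3, "sym_idx": 1},
    {"mic": "m1", "x": 50.0, "y": 60.0, "score": 0.2, "sym_idx": 2},
]
for p in undo_symmetry_expansion(subset):
    print(p["x"], p["y"], p["sym_idx"])
```

Grouping by exact coordinates only works if the locations are untouched by alignment shifts, which is why the Shift key is set to none for this use case.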

Common Next Steps

  • Job: 2D Classification

  • Job: Ab-Initio Reconstruction

  • Job: Homogeneous Refinement (NEW)
