# Job: Remove Duplicate Particles

## Description

Removes duplicate particle picks from an input set of particles. Passing picks from different pickers (e.g. template picker and blob picker) through this job allows them to be combined safely. The job can also incorporate alignment shifts, which makes it useful for removing duplicates after [2D classification](/processing-data/all-job-types-in-cryosparc/particle-curation/job-2d-classification.md) and before [ab-initio reconstruction](/processing-data/all-job-types-in-cryosparc/3d-reconstruction/job-ab-initio-reconstruction.md) or [refinement](/processing-data/all-job-types-in-cryosparc/3d-refinement.md). Remove Duplicate Particles may also be used to undo a previous [symmetry expansion](/processing-data/all-job-types-in-cryosparc/utilities/job-symmetry-expansion.md).

## **Input**

* Particles
  * `location` required
  * `blob`, `pick_stats`, `alignments2D`, `alignments3D` are optional
* Micrographs (optional)
  * `micrograph_blob` (optional)

## **Parameters**

* `Minimum separation distance (A)`
  * The desired minimum distance between particle centers (default 20 Å)
* `Micrograph pixel size (A)`
  * If micrographs are not input to the job, then the micrograph pixel size in Angstroms must be set here
* `Shift key`
  * This is used only if `alignments2D` or `alignments3D` are connected, in which case, the particle center coordinates will be determined by applying the additional shift offset (either from `alignments2D` or `alignments3D` according to the shift key). This parameter can be set as `none` to prevent shifts from being used when determining particle center coordinates; in this case, the particle centers will be read from `location` only.
* `Score field`
  * This controls the field used to decide which particle to keep from each set of duplicates. In all cases, the particle with the more favourable score is kept, and the other identified duplicates are rejected. By default, the `ncc_score` from particle picking is used; however, other scores may be used. These include the particle's agreement with the reference (`error`) or the particle's agreement with the reference after accounting for per-particle scales (`error_min`). This parameter can also be set to `none`, in which case the particle to keep from each set of duplicates is chosen randomly.
* `Remove duplicates entirely`
  * Activate this to reject all particles amongst the identified duplicates, rather than keeping one from each set of duplicates.
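
The interaction between these parameters can be illustrated with a minimal sketch. It is not CryoSPARC's implementation: the dictionary keys (`mic`, `x_A`, `y_A`), the function name, the `higher_is_better` flag, and the greedy strategy are all assumptions made for illustration. It also assumes particle centers have already been computed (with or without shifts) and converted to Å.

```python
import math
import random

def remove_duplicates(particles, min_sep_A=20.0, score_field="ncc_score",
                      higher_is_better=True, remove_entirely=False, seed=0):
    """Split particles into (kept, rejected) using a minimum separation distance.

    Each particle is a dict with hypothetical keys 'mic' (micrograph id),
    'x_A' and 'y_A' (center in Angstroms), plus an optional score entry.
    """
    def too_close(p, q):
        # Only particles on the same micrograph can be duplicates.
        # '<=' is an assumption so that min_sep_A=0 still matches identical centers.
        return (p["mic"] == q["mic"]
                and math.hypot(p["x_A"] - q["x_A"], p["y_A"] - q["y_A"]) <= min_sep_A)

    if remove_entirely:
        # Reject every particle that has at least one neighbour within range.
        kept, rejected = [], []
        for i, p in enumerate(particles):
            has_neighbour = any(too_close(p, q)
                                for j, q in enumerate(particles) if j != i)
            (rejected if has_neighbour else kept).append(p)
        return kept, rejected

    if score_field is None:
        # No score: visit particles in a random order.
        order = list(particles)
        random.Random(seed).shuffle(order)
    else:
        # Visit the most favourable scores first so they win over their duplicates.
        order = sorted(particles, key=lambda p: p[score_field],
                       reverse=higher_is_better)

    kept, rejected = [], []
    for p in order:
        if any(too_close(p, q) for q in kept):
            rejected.append(p)
        else:
            kept.append(p)
    return kept, rejected
```

This brute-force comparison is quadratic in the number of particles, which is fine for illustration; a spatial index would be the usual choice for large datasets.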

## **Output**

* `particles_kept`: Particles kept after duplicates are removed
* `particles_rejected`: Particles identified as duplicates

## Example Plots

If micrographs are connected, the job plots an example micrograph showing the kept and rejected coordinates. Shown below are example plots of duplicate removal on a template-picked dataset, with a high minimum separation distance of 100 Å.

![On the left are kept particles; on the right are removed particles (too close to other particles).](/files/-MNf6kE2nsXtFxhnxY2m)

## Common Use Cases

### Combining Particles from two different Particle Pickers

Remove Duplicate Particles may be used to combine the results from two different particle pickers, for example, the template picker and blob picker. This prevents the same particle from appearing more than once in subsequent classifications and refinements, which can cause overfitting and misestimation of resolution. In this case, input the source micrographs and the picked particles from each picker, and optionally specify the desired `Minimum separation distance` and `Score field`. If particles have been previously aligned via 2D classification or 3D refinement, then the `Shift key` may also be specified.
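
To see why picks from two pickers overlap, the following sketch converts pixel coordinates to Å and lists the pairs that fall within the minimum separation distance. The tuple layout, the `picker` label, and both function names are hypothetical and chosen only for this example; the job itself performs this comparison internally.

```python
import math

def to_angstrom(picks_px, pixel_size_A, picker):
    """Convert (mic, x_px, y_px, score) tuples to dicts with centers in Angstroms."""
    return [{"mic": m, "x_A": x * pixel_size_A, "y_A": y * pixel_size_A,
             "score": s, "picker": picker}
            for m, x, y, s in picks_px]

def cross_picker_duplicates(a, b, min_sep_A):
    """Return (p, q) pairs, p from a and q from b, that lie on the same
    micrograph and whose centers are within min_sep_A of each other."""
    return [(p, q) for p in a for q in b
            if p["mic"] == q["mic"]
            and math.hypot(p["x_A"] - q["x_A"], p["y_A"] - q["y_A"]) <= min_sep_A]

# Two pickers find the same particle a few pixels apart on micrograph 1.
template = to_angstrom([(1, 100, 100, 0.8), (1, 400, 400, 0.6)], 1.0, "template")
blob = to_angstrom([(1, 105, 102, 0.4), (1, 700, 50, 0.5)], 1.0, "blob")
pairs = cross_picker_duplicates(template, blob, min_sep_A=20.0)
```

Here `pairs` contains the single near-coincident pick; one particle from each such pair would be rejected by the job.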

### Undoing symmetry expansion

Particles that have been previously [symmetry expanded](/processing-data/all-job-types-in-cryosparc/utilities/job-symmetry-expansion.md) can have the symmetry expansion effectively "undone" by passing them through this job. Specifically, this job can take the full expanded particle stack and select only one particle from each set of symmetry-expanded copies. This may be desired when the symmetry-expanded particles have been used for downstream processing (e.g. [3D Variability Display](/processing-data/all-job-types-in-cryosparc/variability/job-3d-variability-display.md) in the `cluster` or `intermediates` modes) and a subset of the expanded particle stack has been selected for further symmetry-enforced refinements.

To undo symmetry expansion, set the `Minimum separation distance` to `0` so that only particles with precisely the same coordinates are identified as duplicates. The `Shift key` should also be set to `none` to ensure that particle coordinates are obtained from `location`, rather than from 2D or 3D alignments. Finally, the `Score field` can optionally be specified.
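
With a separation of `0` and no shifts, duplicate detection reduces to grouping particles by identical coordinates. A minimal sketch of that idea follows; the keys `mic`, `x_frac`, and `y_frac` and the function name are hypothetical, and when no score is given the sketch keeps the first copy for reproducibility, whereas the job chooses randomly.

```python
from collections import defaultdict

def undo_symmetry_expansion(particles, score_field=None):
    """Keep one particle per unique (micrograph, x, y) location.

    Symmetry-expanded copies share the same pick location, so grouping on the
    exact location collapses each set of copies back to a single particle.
    """
    groups = defaultdict(list)
    for p in particles:
        groups[(p["mic"], p["x_frac"], p["y_frac"])].append(p)

    kept, rejected = [], []
    for copies in groups.values():
        if score_field is None:
            best = copies[0]  # assumption: first copy instead of a random one
        else:
            best = max(copies, key=lambda p: p[score_field])
        kept.append(best)
        rejected.extend(p for p in copies if p is not best)
    return kept, rejected
```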

## Common Next Steps

* Job: 2D Classification
* Job: Ab-Initio Reconstruction
* Job: Homogeneous Refinement (NEW)

