Job: Deep Picker Train and Job: Deep Picker Inference

Deep picking job types.

1. Deep Picker Train

To pick particles with the deep picker, a model must first be trained using the Deep Picker Train job. This job has the following inputs and outputs:

Inputs

  • Particle Picks

  • Micrographs

  • Deep Picker Model (Optional)

Outputs

  • Deep Picker Model

  • Micrographs

Parameters

The Deep Picker Train job features various parameters. The training parameters are detailed below:

  • Particle Diameter (A)

    • Diameter of particles in Angstroms

  • Initial Learning Rate and Final Learning Rate

    • The values that determine how much the model weights are updated at each step. Higher values let training approach an optimum faster but may prevent the model from reaching the optimum itself, potentially resulting in worse final accuracy. The learning rate is split into two values because deep picker training has been found to benefit from learning rate decay. Training begins with the value of the initial learning rate parameter and changes steadily with each epoch until it reaches the value of the final learning rate parameter at the final epoch.

  • Minibatch Size

    • The number of examples that are used within each batch during training. Lower values may improve model accuracy at the cost of increased training time. The learning rate will have to be tuned based on the minibatch size.

  • Number of Epochs

    • The number of passes through the entire dataset that training performs. A higher number of epochs leads to longer training times.

  • Validation Set Fraction

    • The fraction of the dataset to use for validation. For example, a value of 0.2 will use 80% of the input micrographs for training and the remaining 20% for validation. The validation loss and accuracy will be provided at each epoch. It is highly recommended to use a fraction greater than 0.

  • Test Set Fraction

    • The fraction of the dataset to use for testing. The test loss and accuracy will only be provided at the end of training as a means of confirming that the model did not overfit.

  • Shape of Split Micrographs

    • The deep picker will split input micrographs into patches to be input into the model. The value input into this parameter will determine the shape of these patches. For example, if the default value of 256 is used, then the shape of the patches will be 256x256 pixels.
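
The learning rate decay and the dataset fractions described above can be illustrated with a short sketch. This is not the CryoSPARC implementation: the linear shape of the decay, the random shuffling, the rounding of set sizes, and the function names `learning_rate_schedule` and `split_micrographs` are all assumptions made for illustration.

```python
import random


def learning_rate_schedule(initial_lr, final_lr, num_epochs):
    """Return one learning rate per epoch.

    Assumption: the rate moves linearly from initial_lr (first epoch)
    to final_lr (last epoch). The actual decay curve may differ.
    """
    if num_epochs < 1:
        raise ValueError("num_epochs must be at least 1")
    if num_epochs == 1:
        return [initial_lr]
    step = (final_lr - initial_lr) / (num_epochs - 1)
    return [initial_lr + step * epoch for epoch in range(num_epochs)]


def split_micrographs(micrographs, validation_fraction, test_fraction, seed=0):
    """Split micrographs into training, validation, and test sets.

    Assumption: micrographs are shuffled, set sizes are rounded to the
    nearest integer, and everything left over is used for training.
    """
    if validation_fraction < 0 or test_fraction < 0:
        raise ValueError("fractions must be non-negative")
    if validation_fraction + test_fraction >= 1:
        raise ValueError("fractions must leave some micrographs for training")
    shuffled = list(micrographs)
    random.Random(seed).shuffle(shuffled)
    n_val = round(len(shuffled) * validation_fraction)
    n_test = round(len(shuffled) * test_fraction)
    validation = shuffled[:n_val]
    test = shuffled[n_val:n_val + n_test]
    training = shuffled[n_val + n_test:]
    return training, validation, test


# Example: 10 epochs decaying from 0.01 to 0.001, and a 70/20/10 split.
schedule = learning_rate_schedule(0.01, 0.001, 10)
names = [f"micrograph_{i:03d}" for i in range(100)]
training, validation, test = split_micrographs(names, 0.2, 0.1)
```

With a validation fraction of 0.2 and a test fraction of 0, the same function reproduces the 80%/20% split used as the example in the Validation Set Fraction description.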

The preprocessing parameters are detailed below:

  • Number of Parallel Threads

    • Number of threads to distribute preprocessing over. This parameter decreases the preprocessing time by a factor approximately equal to the input value. It is recommended to set this value to at least 4, as preprocessing is often the bottleneck in the job's run time. Values less than 2 will default to a single thread.

  • Desired Pixels per Angstrom

    • The pixels per Angstrom to normalize the input micrographs to before training.

  • Degree of Lowpass Filtering

    • The degree by which to lowpass filter the micrographs before training. Lower values will result in stronger lowpass filtering. Values must be greater than 0.

  • Use Denoised Micrographs

    • Determines whether to use denoised or original micrographs if denoised micrographs are input into the training job. This parameter has no impact if the input micrographs were not denoised.
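
As a rough illustration of two of the preprocessing parameters above, the sketch below clamps the thread count and computes the size a micrograph would have after being rescaled to the desired pixels per Angstrom. It is not the CryoSPARC implementation. The helper names are hypothetical, and it assumes that the scale factor is the micrograph's pixel size (in Angstroms per pixel) multiplied by the desired pixels per Angstrom.

```python
from concurrent.futures import ThreadPoolExecutor


def effective_threads(requested):
    """Values less than 2 fall back to a single thread."""
    return requested if requested >= 2 else 1


def rescaled_shape(shape, pixel_size_A, desired_pixels_per_A):
    """Shape of a micrograph after rescaling (assumed formula).

    pixel_size_A is the input pixel size in Angstroms per pixel, so
    pixel_size_A * desired_pixels_per_A is the resampling factor.
    """
    scale = pixel_size_A * desired_pixels_per_A
    return tuple(max(1, round(dim * scale)) for dim in shape)


def preprocess_all(shapes, pixel_size_A, desired_pixels_per_A, requested_threads):
    """Distribute the per-micrograph work over a pool of threads."""
    workers = effective_threads(requested_threads)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(
            pool.map(
                lambda s: rescaled_shape(s, pixel_size_A, desired_pixels_per_A),
                shapes,
            )
        )


# Example: 1.0 A/pixel micrographs rescaled to 0.25 pixels per Angstrom.
result = preprocess_all([(4096, 4096), (2048, 2048)], 1.0, 0.25, 4)
```

Here the per-micrograph work is only a shape calculation. In a real job it would be the filtering and resampling of each micrograph, which is where spreading the work over several threads pays off.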

Interpreting Training Results

Once training with the Deep Picker Train job is complete, the job outputs two plots showing performance on the training and validation sets at each epoch. The first plot presents the training and validation losses, while the second plot presents the training and validation accuracies. The x-axis of both plots is the epoch, and the y-axis is the quantity being plotted (loss or accuracy). Successfully trained models will have losses that decrease over time and accuracies that increase over time. Note that losses and accuracies may temporarily move in an undesirable direction and then correct themselves with more training.

The validation loss and accuracy should ideally trail slightly behind the training loss and accuracy or have a negligible difference.
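
The same checks can be made numerically if the per-epoch losses are copied out of the job log. The sketch below is only an illustration: the function name `summarize_training` and the returned fields are hypothetical, and epochs are counted from 0.

```python
def summarize_training(train_losses, val_losses):
    """Summarize per-epoch loss curves.

    Returns the epoch with the lowest validation loss, the gap between
    validation and training loss at the final epoch, and how many epochs
    have passed since the best validation loss.
    """
    if not train_losses or len(train_losses) != len(val_losses):
        raise ValueError("need two non-empty loss lists of equal length")
    best_epoch = min(range(len(val_losses)), key=val_losses.__getitem__)
    return {
        "best_epoch": best_epoch,
        "final_gap": val_losses[-1] - train_losses[-1],
        "epochs_since_best": len(val_losses) - 1 - best_epoch,
    }


# Example: validation loss bottoms out at epoch 2 and then rises while
# the training loss keeps falling, which suggests overfitting.
summary = summarize_training(
    [0.9, 0.6, 0.4, 0.3, 0.2],
    [1.0, 0.7, 0.5, 0.55, 0.6],
)
```

A small final gap with the best epoch at or near the end matches the ideal behaviour described above. A growing gap with the best epoch well in the past suggests the model has begun to overfit.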

Below is an example of plots from a well-performing deep picking model.

Recovering from Training Failure

If the Deep Picker Train job fails during training, the job will still output a Deep Picker Model. This is the model from the epoch with the lowest validation loss prior to the failure. Training can be resumed by passing this output as the Deep Picker Model input of another Deep Picker Train job and turning on the "resume training" parameter. The job will then continue training from the point that was saved. If the parameter is off, the job will begin training anew, using the input model's parameters as an initialization.
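
The recovery behaviour described above follows a common checkpointing pattern, sketched below. This is not the CryoSPARC implementation: the `Checkpoint` class, the helper names, and the choice to resume at the epoch after the saved one are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass
class Checkpoint:
    weights: list      # stand-in for the model parameters
    epoch: int         # epoch at which the checkpoint was saved
    val_loss: float    # validation loss at that epoch


def keep_best(current_best, candidate):
    """Keep whichever checkpoint has the lowest validation loss."""
    if current_best is None or candidate.val_loss < current_best.val_loss:
        return candidate
    return current_best


def starting_state(checkpoint, resume_training):
    """Return (initial weights, first epoch to run) for a new training job.

    With resume_training on, training continues after the saved epoch.
    With it off, the weights are only an initialization and training
    starts again from epoch 0.
    """
    start_epoch = checkpoint.epoch + 1 if resume_training else 0
    return checkpoint.weights, start_epoch


# Example: the job fails after epoch 3; epoch 2 had the lowest validation loss.
best = None
for epoch, val_loss in enumerate([1.0, 0.7, 0.5, 0.55]):
    best = keep_best(best, Checkpoint([float(epoch)], epoch, val_loss))

resumed = starting_state(best, resume_training=True)
restarted = starting_state(best, resume_training=False)
```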

2. Deep Picker Inference

Once the Deep Picker Train job has been used to train a deep particle picking model, the model can be used to pick particles from micrographs with the Deep Picker Inference job. This job has the following inputs and outputs:

Inputs

  • Deep Picker Model

  • Micrographs

Outputs

  • Particle Picks

  • Micrographs

Parameters

The Deep Picker Inference job features various parameters. The parameters are detailed below:

  • Use Pretrained Model

    • Use the pretrained model included with CryoSPARC for inference. If a model is input and this parameter is selected, the job will use the pretrained model instead of the input model.

  • Inference Pixel Threshold

    • Minimum number of pixels required for an output to be considered a particle.

  • Specific Device to Use

    • Index of the device to use for inference. -1 will force CPU usage.

  • Number of Parallel Threads

    • Number of threads to distribute preprocessing over. This parameter decreases the preprocessing time by a factor approximately equal to the input value. It is recommended to set this value to at least 4, as preprocessing is often the bottleneck in the job's run time. Values less than 2 will default to a single thread.

  • Show Plots

    • Show one input micrograph along with its corresponding model outputs and particle locations.
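
One way to picture the Inference Pixel Threshold parameter is as a size filter on connected regions of the model output. The sketch below is an assumption about how such a filter could work, not the CryoSPARC implementation: it treats the model output as a binary mask, groups pixels with 4-connectivity, and reports the centroid of every region with at least `min_pixels` pixels. The function name `pick_particles` is hypothetical.

```python
from collections import deque


def pick_particles(mask, min_pixels):
    """Return (row, column) centroids of regions with >= min_pixels pixels.

    mask is a list of rows containing 0/1 values (assumed model output).
    """
    rows = len(mask)
    cols = len(mask[0]) if mask else 0
    seen = [[False] * cols for _ in range(rows)]
    picks = []
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            # Breadth-first search collects one connected region.
            queue = deque([(r, c)])
            seen[r][c] = True
            region = []
            while queue:
                y, x = queue.popleft()
                region.append((y, x))
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < rows and 0 <= nx < cols
                            and mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            # Regions that are too small are not counted as particles.
            if len(region) >= min_pixels:
                cy = sum(y for y, _ in region) / len(region)
                cx = sum(x for _, x in region) / len(region)
                picks.append((cy, cx))
    return picks


# Example: a 4-pixel region and an isolated single pixel.
mask = [
    [1, 1, 0, 0, 0],
    [1, 1, 0, 0, 1],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
]
picks = pick_particles(mask, min_pixels=2)
```

Raising the threshold discards small, isolated responses that are more likely to be noise, at the risk of also discarding small or partially detected particles.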

Interpreting Results from Deep Picker Inference

The particle picks from the Deep Picker Inference job can be viewed and thresholded using the Inspect Particle Picks job. This job treats particle picks from Deep Picker Inference differently from other picks: it enables a user to apply a threshold based on the deep picker model's confidence rather than on power score. To do so, vary the power score threshold in the Inspect Particle Picks job. For these picks, this number is a percentage indicating how confident the model is in its prediction.
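
Conceptually, the thresholding keeps only the picks whose confidence is at or above the chosen value. The sketch below illustrates this on plain Python data. The pick structure and the field name `confidence` are hypothetical and do not reflect how CryoSPARC stores picks.

```python
def filter_picks(picks, min_confidence_percent):
    """Keep picks whose confidence (in percent) meets the threshold."""
    return [p for p in picks if p["confidence"] >= min_confidence_percent]


# Example picks with model confidence expressed as a percentage.
all_picks = [
    {"x": 120, "y": 340, "confidence": 97.5},
    {"x": 410, "y": 95, "confidence": 62.0},
    {"x": 233, "y": 512, "confidence": 81.3},
]
kept = filter_picks(all_picks, 80)
```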

All particles output by Deep Picker Inference must be processed using the Extract from Micrographs job in CryoSPARC. This updates the CTF information within the particle picks and makes the picks compatible with other CryoSPARC jobs such as Ab-Initio Reconstruction.
