Job: Topaz Denoise
The Topaz Denoise job is available in CryoSPARC via a wrapper around Topaz.
The Topaz Denoise job can be used to remove noise from micrographs using the model provided with Topaz. The job can also train a new model, which can then be used for denoising. This job has the following inputs and outputs:
- Denoise Model (input)
- Training Micrographs (input)
- Denoised Micrographs (output)
- Topaz Denoise Model (output)
These inputs and outputs determine which kind of model is used to denoise the micrographs. How the inputs and outputs affect the model choice is detailed in the Specifying Model section.
The key parameters are detailed below:
- General Settings
- Path to Topaz Executable
- The absolute path to the Topaz executable that will run the denoise job.
- Number of Plots to Show
- The number of side-by-side micrograph comparisons to show at the end of the job.
- Number of Parallel Threads
- The number of threads to run the denoising job on. This parameter is used only when the provided Topaz model is used rather than a self-trained model. It decreases preprocessing time by a factor approximately equal to the input value. Values less than 2 will default to a single thread. These threads can also be distributed amongst GPUs, the number of which is set with the Number of GPUs for Parallel Threads parameter.
- Number of GPUs for Parallel Threads
- The number of GPUs to distribute parallel threads over. The specific GPUs to use can be set using the CryoSPARC scheduler when queuing the job.
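The thread/GPU fan-out described above can be sketched as follows. This is an illustrative assumption, not the actual Topaz implementation: `denoise_one` is a placeholder for a call into the denoiser pinned to a particular GPU, and the round-robin assignment is one plausible distribution strategy.

```python
from concurrent.futures import ThreadPoolExecutor

def denoise_one(micrograph, gpu_id):
    # Placeholder for a denoising call pinned to `gpu_id`
    # (illustrative only; not the Topaz API).
    return f"{micrograph}@gpu{gpu_id}"

def denoise_parallel(micrographs, num_threads, num_gpus):
    # Distribute micrographs over worker threads, assigning
    # GPUs round-robin across the inputs.
    num_threads = max(num_threads, 1)  # values < 2 fall back to one thread
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        return list(pool.map(
            lambda iv: denoise_one(iv[1], iv[0] % max(num_gpus, 1)),
            enumerate(micrographs),
        ))

results = denoise_parallel(["mic_a", "mic_b", "mic_c"],
                           num_threads=2, num_gpus=2)
```

Because `map` preserves input order, the denoised outputs come back in the same order as the input micrographs regardless of which thread finished first.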
- Denoising Parameters
- Normalize Micrographs
- Specify whether to normalize the micrographs prior to denoising.
- Shape of Split Micrographs
- The shape of the micrographs after they have been split into patches. Each split micrograph will have shape (x, x), where x is the input value.
- Padding around Each Split Micrograph
- The padding to set around each split micrograph.
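The normalization and patch-splitting steps above can be sketched as follows. The reflect-mode padding and the patch layout are assumptions for illustration; Topaz's own preprocessing may differ in detail.

```python
import numpy as np

def normalize(mic):
    # Standardize the micrograph to zero mean and unit standard deviation.
    return (mic - mic.mean()) / mic.std()

def split_with_padding(mic, patch_size, pad):
    # Pad the whole micrograph, then cut square patches of
    # (patch_size, patch_size) plus `pad` pixels of context on each side.
    padded = np.pad(mic, pad, mode="reflect")
    patches = []
    for i in range(0, mic.shape[0], patch_size):
        for j in range(0, mic.shape[1], patch_size):
            patches.append(padded[i:i + patch_size + 2 * pad,
                                  j:j + patch_size + 2 * pad])
    return patches

mic = np.arange(64, dtype=np.float64).reshape(8, 8)
patches = split_with_padding(normalize(mic), patch_size=4, pad=2)
```

An 8x8 micrograph split with patch size 4 and padding 2 yields four overlapping 8x8 patches; after denoising, only the central 4x4 of each patch would be stitched back together.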
- Training Parameters
- Learning Rate
- The value that determines how quickly the training approaches an optimum. Higher values cause training to approach an optimum faster but may prevent the model from reaching the optimum itself, potentially resulting in worse final accuracy.
- Minibatch Size
- The number of examples that are used within each batch during training. Lower values will improve model accuracy at the cost of significantly increased training time.
- Number of Epochs
- The number of iterations through the entire dataset that the training performs. A higher number of epochs will naturally lead to longer training times. The number of epochs does not have to be optimized, as the train and cross-validation jobs will automatically output the model from the epoch with the highest precision.
- Number of Updates per Epoch
- The number of updates that occur in each epoch. Increasing this value increases the amount of training performed in each epoch in exchange for slower training speed.
- Crop Size
- The size of each micrograph after random cropping is performed during data augmentation.
- Number of Loading Threads
- The number of threads to use for loading data during training.
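A minimal sketch of how the minibatch size and random-crop augmentation interact during training. The function names and cropping strategy are illustrative assumptions, not Topaz's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_crop(mic, crop_size):
    # Data augmentation: take a random crop_size x crop_size window
    # from the micrograph.
    top = rng.integers(0, mic.shape[0] - crop_size + 1)
    left = rng.integers(0, mic.shape[1] - crop_size + 1)
    return mic[top:top + crop_size, left:left + crop_size]

def minibatches(examples, batch_size):
    # Yield successive minibatches; the final batch may be smaller.
    for start in range(0, len(examples), batch_size):
        yield examples[start:start + batch_size]

mics = [np.zeros((16, 16)) for _ in range(10)]
batches = list(minibatches([random_crop(m, 8) for m in mics], batch_size=4))
```

With 10 cropped examples and a minibatch size of 4, each pass over the data performs three updates, the last on a partial batch of 2.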
- Pretrained Parameters
- Model Architecture
- U-Net (unet) is a convolutional neural network architecture that convolves and downsamples the input to a bottleneck shape, then deconvolves it back up, concatenating the outputs of the corresponding convolution layers during the deconvolutions.
- U-Net Small (unet-small) is the same as U-Net but with fewer layers.
- FCNN (fcnn) stands for fully convolutional neural network, a standard neural network architecture used in many computer vision tasks.
- Affine (affine) applies an affine transformation by a single convolution.
- Prior to Topaz version 0.2.3, only the L2 model architecture was available.
- Compute Settings
- Use CPU for Training
- Specify whether to only use CPU for training.
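To make the affine architecture described above concrete, the sketch below applies a single convolution plus a bias in plain NumPy. The 3x3 mean kernel is an illustrative choice for smoothing, not the kernel Topaz actually learns.

```python
import numpy as np

def affine_denoise(mic, kernel, bias):
    # The affine architecture is a single convolution plus bias:
    # each output pixel is a fixed linear combination of its
    # neighborhood in the input (valid convolution, no padding).
    k = kernel.shape[0]
    h, w = mic.shape[0] - k + 1, mic.shape[1] - k + 1
    out = np.empty((h, w))
    for i in range(h):
        for j in range(w):
            out[i, j] = np.sum(mic[i:i + k, j:j + k] * kernel) + bias
    return out

# A 3x3 mean filter is one such affine map: it smooths the input.
kernel = np.full((3, 3), 1.0 / 9.0)
out = affine_denoise(np.ones((5, 5)), kernel, bias=0.0)
```

In contrast to the U-Net variants, this map has no nonlinearities or skip connections, which is why it is the simplest architecture on the list.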
To denoise micrographs using the Topaz Denoise job, users have the option of using the provided pretrained model or training a model for immediate and future use. The user must therefore select which model to use from three general categories. These categories are:
1. Provided pretrained model
2. Model to be trained by the user
3. Model previously trained by the user
Which approach is used depends on the job inputs and the build parameters. However, the Topaz Denoise job requires the micrographs that will be denoised to be connected to the `micrographs` input slot regardless of model specification. The job inputs required to select each category are summarized in the table below and detailed further below the table.

| Model category | `training_micrographs` input slot | `denoise_model` input slot |
| --- | --- | --- |
| Provided pretrained model | Empty | Empty |
| Model to be trained by user | Imported movies (not pre-processed) | Empty |
| Model previously trained by user | Empty | `topaz_denoise_model` output from a previous Topaz Denoise job |
To use the provided pretrained model, the `training_micrographs` and `denoise_model` input slots must be left empty.
To train a model for immediate and future use, imported movies that were not pre-processed must be connected to the `training_micrographs` input slot, and the `denoise_model` input slot must be empty.
When the job is complete, it will output the trained model through the `topaz_denoise_model` output, allowing the trained model to be used in other Topaz Denoise jobs. How to use this output is described in the Using model previously trained by user section below.
To use a previously trained model, pass the `topaz_denoise_model` output from the Topaz Denoise job that trained the model into the `denoise_model` input slot. The `training_micrographs` input slot must remain empty.
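The slot rules above can be summarized as a small selection function. This is a sketch of the decision logic only; the argument names and returned labels are illustrative, not part of the CryoSPARC API.

```python
def select_model_category(training_micrographs, denoise_model):
    # Mirrors the input-slot rules: a connected denoise_model takes
    # precedence (training_micrographs must be empty in that case),
    # then connected training micrographs trigger training, and
    # empty slots fall back to the provided pretrained model.
    if denoise_model is not None:
        return "previously trained model"
    if training_micrographs is not None:
        return "train a new model"
    return "provided pretrained model"
```

For example, leaving both slots empty selects the provided pretrained model, while connecting only imported movies to the training slot selects training.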
When the Topaz Denoise job is complete, it will output micrograph comparisons, the number of which depends on the Number of Plots to Show build parameter. Each comparison features two micrographs on the same row: the micrograph on the left is the original micrograph prior to denoising, and the micrograph on the right is its denoised version. This side-by-side comparison informs the user of the effect of the denoising.
Side-by-Side Micrograph Denoising Comparison
When the Topaz Denoise job is used to train a model, a plot of the training and validation losses will also be shown. Both loss curves should descend over time. If the training loss is decreasing while the validation loss is increasing, the model has overfit and the training parameters must be tuned. The simplest way to resolve overfitting is to reduce the learning rate.
Topaz Denoising Loss Plot
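The epoch selection and overfitting check described above can be sketched as follows. The minimum-validation-loss criterion and the comparison window are assumptions for illustration, not CryoSPARC's exact implementation.

```python
def best_epoch(val_losses):
    # Keep the model from the epoch with the best validation
    # performance, i.e. the minimum validation loss.
    return min(range(len(val_losses)), key=val_losses.__getitem__)

def looks_overfit(train_losses, val_losses, window=3):
    # Heuristic: training loss still falling while validation loss
    # rises over the last `window` epochs suggests overfitting.
    return (train_losses[-1] < train_losses[-window]
            and val_losses[-1] > val_losses[-window])

train = [1.0, 0.8, 0.6, 0.5, 0.4, 0.35]
val   = [1.1, 0.9, 0.7, 0.65, 0.7, 0.8]
```

In this example the validation loss bottoms out at epoch 3 and then climbs while the training loss keeps falling, which is exactly the divergence pattern that calls for a lower learning rate.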