Job: Exposure Group Utilities
Exposure group utilities for combining or splitting exposure or particle datasets.
Combine or split your exposure or particle datasets into exposure groups.
Splitting may be done based on filename or beam shift (refer to )
Particles
  ctf (required)
  location (optional)
  blob (optional)
Exposures
  ctf (required)
  blob [movie/micrograph] (optional)
  mscope_params (optional)
Input Dataset [particle/exposure] Combined (default)
Input Dataset [particle/exposure] Split by Exposure Groups (optional)
Input Selection: Specifies which input group to use.
Action: Specifies which mode to use.
  info_only: Prints a table of exposure group statistics for the dataset.
  combine&set: Combines multiple exposure inputs and assigns them to a single exposure group.
  split: Uses a file path field to split the dataset into unique exposure groups.
  test_token: When splitting a dataset into exposure groups, lets you test your "token creation" method.
  cluster&split: Clusters exposures (movies or micrographs) into exposure groups based on their mscope_params/beam_shift values, via K-means or agglomerative clustering. Note: the movies or micrographs must have been imported in CryoSPARC v4.4.0+ with valid beam shift metadata.
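To make the idea of an exposure group statistics table concrete, here is a minimal sketch that counts how many items fall into each exposure group. The function name, the column labels, and the example IDs are assumptions for illustration; the columns the job actually prints are not described here.

```python
from collections import Counter


def exposure_group_table(group_ids):
    """Return text lines listing the number of items in each exposure group."""
    counts = Counter(group_ids)
    lines = [f"{'group_id':>8}  {'count':>6}"]
    for gid in sorted(counts):
        lines.append(f"{gid:>8}  {counts[gid]:>6}")
    return lines


# Hypothetical exposure group IDs, one per particle or exposure.
example_ids = [0, 0, 1, 2, 2, 2]
for line in exposure_group_table(example_ids):
    print(line)
```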
Token Creation Strategy: The strategy used to create the tokens that split the dataset into unique exposure groups.
  string_slice: Uses character positions to slice the file path into unique tokens.
  string_split: Uses a separator to split the file path into groups, one of which can be selected.
  regular_expression: Uses Python's re.search() to evaluate a regular expression against each filename, creating subgroups, one of which can be selected to create unique tokens. As of v4.0.2, all file paths that do not match the provided regular expression are assigned to a separate exposure group ID.
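The three strategies can be illustrated with a few lines of plain Python. This is a sketch of the general idea, not the job's implementation: the helper names and the example filename are hypothetical, and the group index follows Python's own conventions (zero-based list indexing, and group 0 meaning the whole match for re.search()), which may differ from how the job numbers groups.

```python
import re


def token_by_slice(path, start, length):
    """string_slice: take `length` characters beginning at character `start`."""
    return path[start:start + length]


def token_by_split(path, separator, group_index):
    """string_split: split on `separator` and keep one of the resulting pieces."""
    return path.split(separator)[group_index]


def token_by_regex(path, pattern, group_index):
    """regular_expression: run re.search and keep one group; None if no match."""
    match = re.search(pattern, path)
    return match.group(group_index) if match else None


# Hypothetical filename; the naming scheme is invented for illustration.
name = "grid1_sq07_hole12_shot3.tif"
print(token_by_slice(name, 6, 4))             # sq07
print(token_by_split(name, "_", 3))           # shot3.tif
print(token_by_regex(name, r"shot(\d+)", 1))  # 3
```

Trying a strategy on a handful of filenames like this before running the job serves the same purpose as the test_token action: it shows which token each file would receive.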
Combine Strategy: Specifies how to handle conflicting CTF values found within an exposure group.
  fail: Throws an exception if inconsistent CTF values are found.
  take median: Overwrites the CTF values of each exposure group with the median of the CTF values across that exposure group.
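The difference between the two strategies can be sketched for a single CTF value within one exposure group. The function name is hypothetical, and which CTF fields the job compares is not stated here, so the numbers below are placeholders.

```python
from statistics import median


def resolve_group_value(values, strategy="fail"):
    """Return one value for the whole group, following the chosen strategy."""
    values = list(values)
    if not values:
        raise ValueError("no values supplied for this exposure group")
    if len(set(values)) == 1:
        return values[0]  # already consistent, nothing to resolve
    if strategy == "fail":
        raise ValueError(f"inconsistent values in exposure group: {sorted(set(values))}")
    if strategy == "take median":
        return median(values)
    raise ValueError(f"unknown strategy: {strategy!r}")


# Hypothetical per-item values for one CTF field within a single group.
print(resolve_group_value([2.7, 2.7, 2.7]))                  # consistent input
print(resolve_group_value([2.6, 2.7, 2.8], "take median"))   # median replaces all
```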
Set Exposure Group Value: Used only in mode combine&set. Indicates the exposure group ID to set for this dataset.
Split Outputs by Exposure Group: Specifies whether to output a separate dataset for each individual exposure group.
split mode:
Field to use to split Dataset: Used only in mode split. Indicates which file path field to use to create the unique tokens that split the dataset into different exposure groups.
Starting Exposure Group ID: Used only in mode split. Indicates the starting ID from which each exposure group increments.
Start Slice Index: Used only in mode split/string_slice. Indicates the number of characters from the Index Position at which to start the slice.
Number of characters to Consider: Used only in mode split/string_slice. Indicates the number of characters, counted from the start position, to slice out of each file path to create the tokens.
Index Position: Used only in mode split/string_slice. Indicates which position of the file path to index from.
File path separator: Used only in mode split/string_split. Indicates the separator used to split the file path into individual groups.
Split Group Index: Used only in modes split/string_split or split/regular_expression. Indicates which group to select when splitting the filename into groups.
Regular Expression String: Used only in mode split/regular_expression. Indicates the regular expression to evaluate against each file path using Python's re.search(). As of v4.0.2, all file paths that do not match the provided regular expression are assigned to a separate exposure group ID.
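Once each file path has a token, the tokens have to be turned into exposure group IDs. The sketch below shows one plausible way to do this, under stated assumptions: IDs are handed out in first-seen order starting from the Starting Exposure Group ID, and paths with no token (None, for example after a failed regular expression match) share one extra ID placed after all the others. The function name and both of these ordering choices are illustrative, not confirmed behaviour of the job.

```python
def assign_group_ids(tokens, starting_id=0):
    """Map each distinct token to an integer ID, counting up from starting_id."""
    token_to_id = {}
    next_id = starting_id
    for token in tokens:
        if token is not None and token not in token_to_id:
            token_to_id[token] = next_id
            next_id += 1
    # Assumption: unmatched paths share a single ID after all matched tokens.
    unmatched_id = next_id
    return [unmatched_id if t is None else token_to_id[t] for t in tokens]


# Hypothetical tokens; None marks a path that produced no token.
tokens = ["1", "2", None, "1", "3"]
print(assign_group_ids(tokens, starting_id=10))  # [10, 11, 13, 10, 12]
```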
cluster&split mode:
Number of clusters: Controls the number of clusters to create when clustering exposures into groups based on beam shift.
Clustering method: Controls the clustering method used to group exposures based on beam shift. By default, agglomerative (hierarchical) clustering is used; K-means is also available.
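As a rough illustration of agglomerative clustering on beam shift values, the sketch below starts with every exposure in its own cluster and repeatedly merges the two clusters whose centroids are closest until the requested number of clusters remains. This is a simple centroid-linkage variant written in pure Python; the linkage criterion the job uses is not stated here, and the function names and beam shift coordinates are hypothetical.

```python
from math import dist


def centroid(points):
    """Mean position of a list of (x, y) points."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)


def agglomerative_labels(points, n_clusters):
    """Merge the two clusters with the closest centroids until n_clusters remain."""
    if n_clusters < 1:
        raise ValueError("n_clusters must be at least 1")
    clusters = [[i] for i in range(len(points))]  # each cluster holds point indices
    while len(clusters) > n_clusters:
        best = None  # (distance, index a, index b) of the closest pair so far
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                ca = centroid([points[i] for i in clusters[a]])
                cb = centroid([points[i] for i in clusters[b]])
                d = dist(ca, cb)
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a].extend(clusters[b])
        del clusters[b]
    labels = [0] * len(points)
    for label, members in enumerate(clusters):
        for i in members:
            labels[i] = label
    return labels


# Hypothetical beam shift (x, y) values for six exposures, forming three groups.
beam_shifts = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 4.9), (-5.0, 5.0), (-4.9, 5.1)]
print(agglomerative_labels(beam_shifts, 3))
```

Each label can then serve as an exposure group ID. Plotting the beam shift values first is a practical way to choose the number of clusters, since the count should match the number of distinct beam shift positions used during collection.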
Correspond particles to exposures and enforce consistency of exposure group IDs: Activate this parameter to override the exposure group IDs in the particles input with those from the connected exposure inputs. If this is activated, Input Selection must be set to exposures, and both particles and exposures must be connected.
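Conceptually, enforcing consistency means looking up each particle's parent exposure and copying that exposure's group ID onto the particle. The sketch below shows this with plain dictionaries. The field names (uid, exposure_uid, group_id) and the function name are hypothetical stand-ins, not the actual dataset fields.

```python
def enforce_group_ids(particles, exposures):
    """Return copies of the particles carrying their parent exposure's group ID."""
    exposure_group = {e["uid"]: e["group_id"] for e in exposures}
    updated = []
    for p in particles:
        new_p = dict(p)  # leave the input records untouched
        # Raises KeyError if a particle refers to an exposure that is not connected.
        new_p["group_id"] = exposure_group[p["exposure_uid"]]
        updated.append(new_p)
    return updated


# Hypothetical records: two exposures and three particles picked from them.
exposures = [{"uid": 101, "group_id": 4}, {"uid": 102, "group_id": 7}]
particles = [
    {"uid": 1, "exposure_uid": 101, "group_id": 0},
    {"uid": 2, "exposure_uid": 102, "group_id": 0},
    {"uid": 3, "exposure_uid": 101, "group_id": 0},
]
print([p["group_id"] for p in enforce_group_ids(particles, exposures)])  # [4, 7, 4]
```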