Guide: Performance Benchmarking (v4.3+)

This guide covers the benchmarking tool in CryoSPARC for measuring the performance of a worker’s filesystem, CPUs and GPUs. Available in CryoSPARC v4.3.0+.


Last updated 1 year ago

Overview

After installing CryoSPARC and verifying the instance is working correctly (see Guide: Installation Testing with cryosparcm test), use the Performance Benchmarking job to measure the performance of your system and compare it against reference benchmarks provided by Structura and your own past benchmarks.

The “Benchmark” job is available in the CryoSPARC job builder and can be run on any worker lane connected to your CryoSPARC instance.

The “Benchmark” job ensures the benchmark data exists in the right location (downloading it if it doesn’t), then runs the three benchmark tests (CPU, Filesystem and GPU) serially, as specified.

Benchmark Data

The benchmark data (17GB, compressed) must be downloaded and extracted into a location accessible by the job in order to run the benchmarks. As a convenience, the Benchmark job does this automatically when the required data does not exist in the project directory. The benchmark data package can also be downloaded manually from Structura’s cloud storage; once downloaded and extracted, specify the absolute path to the folder in the “Benchmark Data Directory” parameter.

The benchmark data package contains movies, particles and volumes required for each of the tests. An abridged directory listing can be seen below:

.
├── class2D_test
│   └── maps.mrc
├── gpu_engine_test
│   ├── abinit_particles.cs
│   ├── abinit_volume.mrc
│   └── J586
│       └── extract
│           ├── 001411154804159785773_14sep05c_c_00003gr_00014sq_00005hl_00003es.frames_patch_aligned_doseweighted_particles.mrc
│           ├── (...)_14sep05c_c_00003gr_00014sq_00006hl_00003es.(...).mrc
│           ├── (...)_14sep05c_00024sq_00003hl_00005es.(...).mrc
│           ├── (...)_14sep05c_c_00003gr_00014sq_00007hl_00005es.(...).mrc
│           ├── (...)_14sep05c_c_00003gr_00014sq_00006hl_00005es.(...).mrc
│           ├── (...)_14sep05c_c_00003gr_00014sq_00005hl_00002es.(...).mrc
│           ├── (...)_14sep05c_c_00003gr_00014sq_00009hl_00004es.(...).mrc
│           ├── (...)_14sep05c_c_00003gr_00014sq_00011hl_00003es.(...).mrc
│           ├── (...)_14sep05c_00024sq_00004hl_00002es.(...).mrc
│           ├── (...)_14sep05c_c_00003gr_00014sq_00011hl_00002es.(...).mrc
│           ├── (...)_14sep05c_c_00003gr_00014sq_00008hl_00005es.(...).mrc
│           ├── (...)_14sep05c_c_00003gr_00014sq_00004hl_00004es.(...).mrc
│           ├── (...)_14sep05c_c_00003gr_00014sq_00010hl_00002es.(...).mrc
│           ├── (...)_14sep05c_c_00003gr_00014sq_00007hl_00004es.(...).mrc
│           ├── (...)_14sep05c_00024sq_00006hl_00003es.(...).mrc
│           ├── (...)_14sep05c_c_00003gr_00014sq_00005hl_00005es.(...).mrc
│           ├── (...)_14sep05c_00024sq_00003hl_00002es.(...).mrc
│           ├── (...)_14sep05c_c_00003gr_00014sq_00006hl_00002es.(...).mrc
│           ├── (...)_14sep05c_c_00003gr_00014sq_00002hl_00005es.(...).mrc
│           └── (...)_14sep05c_c_00003gr_00014sq_00011hl_00004es.(...).mrc
├── gpu_fsc_test
│   ├── half_map_A.mrc
│   └── half_map_B.mrc
├── movies
│   ├── eer
│   │   ├── FoilHole_2669035_Data_2668380_2668382_20200703_235726_Fractions.mrc.eer
│   │   ├── FoilHole_2669035_Data_2668383_2668385_20200703_235738_Fractions.mrc.eer
│   │   └── FoilHole_2669035_Data_2671097_2671099_20200703_235716_Fractions.mrc.eer
│   ├── mrc
│   │   ├── 17jul30a_b_00007gr_00002sq_v01_00002hl16_00002edhiii.frames.mrc
│   │   ├── 17jul30a_b_00007gr_00002sq_v01_00002hl16_00004edhiii.frames.mrc
│   │   └── 17jul30a_b_00014gr_00001sq_v01_00002hl16_00005edhiii.frames.mrc
│   └── tiff
│       ├── FoilHole_21044295_Data_21043958_21043960_20210422_050646_fractions.tiff
│       ├── FoilHole_21044296_Data_21043958_21043960_20210422_050719_fractions.tiff
│       └── FoilHole_21044297_Data_21043958_21043960_20210422_051105_fractions.tiff
└── picking_test
    ├── 014750136049583239150_14sep05c_00024sq_00004hl_00002es.frames_background.mrc
    ├── 014750136049583239150_14sep05c_00024sq_00004hl_00002es.frames_patch_aligned_ctf_spline.npy
    ├── 014750136049583239150_14sep05c_00024sq_00004hl_00002es.frames_patch_aligned_doseweighted.mrc
    ├── exposure_dataset_for_picking.cs
    ├── picking_templates.mrc
    └── templates_dataset_for_picking.cs
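When pointing the “Benchmark Data Directory” parameter at a manually extracted copy of the data, it can help to sanity-check the layout first. The following is a hypothetical helper (not part of CryoSPARC); the top-level folder names are taken from the listing above.

```python
# Hypothetical helper: verify a manually extracted benchmark data
# directory before entering it in the "Benchmark Data Directory"
# parameter. Subdirectory names come from the abridged listing above.
from pathlib import Path

EXPECTED_SUBDIRS = [
    "class2D_test",
    "gpu_engine_test",
    "gpu_fsc_test",
    "movies",
    "picking_test",
]

def check_benchmark_dir(root):
    """Return the list of expected subdirectories missing under `root`."""
    root = Path(root)
    return [d for d in EXPECTED_SUBDIRS if not (root / d).is_dir()]

missing = check_benchmark_dir("/data/cryosparc_benchmark_data")
if missing:
    print("Missing subdirectories:", ", ".join(missing))
else:
    print("Benchmark data directory looks complete.")
```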

Particle Data Source

Particles previously processed in CryoSPARC from a subset of movies in EMPIAR-10025 (T20S Proteasome at 2.8 Å Resolution).

Movie Data Sources

TIFF: 3x K3 super-resolution (11520, 8184), 70 frames, 1.16GB each, from EMPIAR-10721 (single particle analysis of 65 kDa human haemoglobin using a 200 kV Talos Arctica).

MRC: 3x K2 (3710, 3838), 44 frames, 1.2GB each, from EMPIAR-10249 (horse liver alcohol dehydrogenase movies obtained using a Talos Arctica operating at 200 kV equipped with a K2).

EER: 3x Falcon 4 (4096, 4096), 48 frames, 500MB each, from EMPIAR-10612 (high-resolution SARS-CoV-2 ORF3a dimer in an MSP1E3D1 lipid nanodisc).

Sharing Data with Structura Biotechnology (Optional)

In each job, there is a parameter (“Share benchmark data with Structura Biotechnology”, disabled by default) that allows uploading of benchmark data to Structura’s servers. The data sent includes timings and hardware information, but does not include any user-identifiable information.

An example of the data uploaded can be seen below:

{
    "type" : "gpu",
    "timings" : {
        "fsc_spherical" : {
            "put_mapr_on_gpu" : 0.5858473777771,
            ...
        },
        "fsc_loose" : {
            "put_mapr_on_gpu" : 0.677032470703125,
            ...
        },
        ...
    },
    "test_info" : {
        "cryosparc_version" : "v4.1.0",
        "created_at" : 1666985248.02303,
        "instance_information" : {
            "platform_node" : "server_hostname",
            "platform_release" : "4.15.0-142-generic",
            "platform_version" : "#146~16.04.1-Ubuntu SMP Tue Apr 13 09:27:15 UTC 2021",
            "platform_architecture" : "x86_64",
            "cpu_model" : "Intel(R) Xeon(R) CPU E5-1630 v4 @ 3.70GHz",
            "physical_cores" : 4,
            "total_memory" : "62.80GB",
            "available_memory" : "52.17GB",
            "used_memory" : "9.81GB",
            "ofd_soft_limit" : 1048576,
            "ofd_hard_limit" : 1048576
        },
        "job_params" : {
            "benchmark_data_dir" : null,
            "gpu_num_gpus" : 1,
            "send_data" : true,
            "test_random" : true,
            "test_sequential" : true,
            "use_all_gpus" : false,
            "use_ssd" : true
        },
        "gpu_name" : "NVIDIA GeForce GTX 1080 Ti",
        "gpu_bus_id" : "0000:02:00.0"
    }
}

Structura will use this data to maintain aggregate statistics about CryoSPARC performance in the wild and to help focus our optimization efforts on the jobs and code paths with the most benefit to users. Users who do upload benchmark data should not expect any direct response from Structura.
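Most of the hardware fields in the payload above can be gathered with the Python standard library. The sketch below is an assumption about how such data could be collected, not CryoSPARC’s actual (internal) collection code; it is Unix-only because of the resource module.

```python
# A sketch (not CryoSPARC's internal code) of gathering the kind of
# hardware fields shown in the "instance_information" payload above.
import os
import platform
import resource

def instance_information():
    # Open-file-descriptor limits, matching ofd_soft_limit / ofd_hard_limit
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    return {
        "platform_node": platform.node(),
        "platform_release": platform.release(),
        "platform_version": platform.version(),
        "platform_architecture": platform.machine(),
        "physical_cores": os.cpu_count(),  # note: logical cores; a true physical count needs psutil
        "ofd_soft_limit": soft,
        "ofd_hard_limit": hard,
    }

print(instance_information())
```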

Filesystem Benchmark

The filesystem benchmark runs a sequential read test for movies, and both sequential and random read tests for particles, simulating real CryoSPARC workflows to benchmark the filesystem where the benchmark data resides.

Turn off the “Use SSD for Tests” parameter to disable the caching system when performing the particle read tests. The job will then report the time it takes to read the particles in sequential and random patterns from the project directory instead of from a local cache device.

A Note About Filesystem Caching

Filesystem results can be skewed by the operating system’s page cache (see the Appendix). To drop the page cache before a run, use the following command, which first instructs the kernel to write dirty pages to disk (sync), then drops the page cache:

sudo bash -c 'sync; echo 1 > /proc/sys/vm/drop_caches'

Note that there still may be other caches in play if your data is hosted on other machines (e.g., a storage cluster’s cache).

Sequential Read Test - Movies

To benchmark sequential reading, which is relevant in the early stages of data processing, three different types of movies (TIFF, EER and MRC) are read and timed. The test reports the averages of the total I/O time taken. This measures the performance of the storage volume on which the movies are located. To benchmark a storage volume that is different from the project directory, copy the benchmark data to the new location and specify it in the “Benchmark Data Directory” parameter. For more information on the sources of each of the movies, see Benchmark Data above.

For TIFF and EER movies, only the time it takes for the system to read the data into memory is recorded. Decompression (which is always performed when reading TIFF and EER movies) is timed but not recorded here; see the CPU Benchmark.
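The sequential movie test boils down to reading each file fully into memory while timing only the I/O. A simplified sketch (assumed logic, not CryoSPARC’s actual code):

```python
# Simplified sketch of a sequential read test: read each file fully
# into memory, time only the I/O, and report throughput in MB/s.
import time

def sequential_read_rate(paths, chunk_size=16 * 1024 * 1024):
    total_bytes, total_time = 0, 0.0
    for path in paths:
        start = time.perf_counter()
        with open(path, "rb") as f:
            while chunk := f.read(chunk_size):
                total_bytes += len(chunk)
        total_time += time.perf_counter() - start
    return total_bytes / total_time / 1e6  # MB/s

# Example (hypothetical path):
# from pathlib import Path
# movies = sorted(Path("/data/benchmark_data/movies/tiff").glob("*.tiff"))
# print(f"{sequential_read_rate(movies):.1f} MB/s")
```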

Sequential Read Test - Particles

To benchmark sequential particle reads, which are relevant during some parts of particle processing, a small particle stack (50,000 particles with shape (256, 256) across 500 files) is randomly generated using numpy.random.randn, written to the project directory, cached onto the cache device (if available and enabled), then read back into memory in a sequential pattern. The time it takes the system to cache the particles (particle_cache_time), the time to read the particles sequentially (particle_sequential_read_time) and the rate at which the particles are read (particle_sequential_read_rate) are recorded.

Random Read Test - Particles

To benchmark random reads, which are relevant during most parts of particle processing, the same particle stack created during the sequential read test is used, but this time the particles are read in a random pattern. The time it takes the system to read the particles randomly (particle_random_read_time) and the rate at which the particles are read (particle_random_read_rate) are recorded.
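A toy version of the two particle read patterns can be written in a few lines: generate a small random stack with numpy.random.randn as described above, then time reading it back in sequential vs. shuffled order. This is an illustrative sketch, not CryoSPARC’s test code, and the stack here is far smaller than the real one.

```python
# Toy sketch of the particle read tests: generate a random stack,
# write it out, then time sequential vs. random-order reads.
import time
import tempfile
import numpy as np

def particle_read_times(n_particles=500, box=64):
    stack = np.random.randn(n_particles, box, box).astype(np.float32)
    with tempfile.NamedTemporaryFile(suffix=".npy", delete=False) as f:
        np.save(f, stack)
        path = f.name

    mm = np.load(path, mmap_mode="r")  # memory-mapped, so each access reads from storage/cache
    times = {}
    for name, order in [
        ("sequential", np.arange(n_particles)),
        ("random", np.random.permutation(n_particles)),
    ]:
        start = time.perf_counter()
        for i in order:
            _ = np.asarray(mm[i])  # force the read of one particle image
        times[name] = time.perf_counter() - start
    return times

print(particle_read_times())
```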

CPU Benchmark

The CPU benchmark reads the same TIFF and EER movies as the Filesystem test (see Sequential Read Test - Movies), but instead reports the time it takes to decompress the movies, which is heavily dependent on CPU and memory performance.

Note that in order to measure decompression time, the CryoSPARC environment variable CRYOSPARC_TIFF_IO_SHM must be set and turned on (which it is by default; see Environment Variables). This variable tells the IO system to first copy the contents of TIFF and EER files to /dev/shm (a temporary file storage system backed by RAM) before decompressing them, allowing the system to distinguish IO time from decompression time. Note that this setting also increases performance on some networked file systems.

Decompression time is averaged from three runs of different movies and reported as tiff and eer in the CPU tab of the Benchmark viewer.
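The reason the RAM-backed copy makes decompression separately measurable can be illustrated with synthetic data: once the compressed bytes already sit in memory, the remaining cost is pure CPU. This sketch uses zlib on synthetic bytes as a stand-in for TIFF/EER decompression, which is an assumption for illustration only.

```python
# Illustration of separating decompression time from I/O: with the
# compressed bytes already in memory, timing zlib.decompress measures
# CPU work only. zlib here is a stand-in for TIFF/EER codecs.
import time
import zlib

raw = bytes(range(256)) * 40000          # ~10 MB of compressible synthetic data
compressed = zlib.compress(raw, level=6)

start = time.perf_counter()
restored = zlib.decompress(compressed)   # CPU-bound: no disk I/O involved
decompress_time = time.perf_counter() - start

assert restored == raw
print(f"decompressed {len(raw) / 1e6:.1f} MB in {decompress_time * 1e3:.1f} ms")
```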

GPU Benchmark

The GPU benchmark executes a collection of functions from CryoSPARC jobs on each of the worker’s GPUs (unless the “Number of GPUs to benchmark” parameter is specified, in which case only the specified number of GPUs are benchmarked), and times them. The tests include:

  1. FSC Calculations using different masks:

    • Spherical Mask (fsc_spherical)

    • Loose Mask (fsc_loose)

    • Tight Mask (fsc_tight)

    • Noise Sub Mask (fsc_noisesub)

  2. Non-Uniform Refinement’s core algorithm (matched_cv_filter_estimation)

  3. Particle picking’s core algorithm (picking)

  4. CryoSPARC’s core alignment and reconstruction algorithm (“Engine”), tested with various parameter combinations:

    • Test A (disk_single_linear10_max_C1): particles in cache, 1 CPU thread, trilinear kernel, C1 symmetry, pose maximization

    • Test B (disk_single_linear20_max_C1): particles in cache, 1 CPU thread, tricubic kernel, C1 symmetry, pose maximization

    • Test C (disk_multi_linear10_max_C1): particles in cache, 2 CPU threads, trilinear kernel, C1 symmetry, pose maximization

    • Test D (disk_multi_linear20_max_C1): particles in cache, 2 CPU threads, tricubic kernel, C1 symmetry, pose maximization

    • Test E (memory_single_linear10_max_C1): particles in memory, 1 CPU thread, trilinear kernel, C1 symmetry, pose maximization

    • Test F (memory_single_linear10_marg_C1): particles in memory, 1 CPU thread, trilinear kernel, C1 symmetry, pose marginalization

    • Test G (memory_single_linear10_max_D7): particles in memory, 1 CPU thread, trilinear kernel, D7 symmetry, pose maximization

    • Test H (memory_single_linear20_max_C1): particles in memory, 1 CPU thread, tricubic kernel, C1 symmetry, pose maximization

    • Test I (memory_single_linear20_marg_C1): particles in memory, 1 CPU thread, tricubic kernel, C1 symmetry, pose marginalization

As of CryoSPARC v4.4+, memory_multi_* (particles in memory + multithreaded) tests have been removed.

Non-Uniform Refinement’s core algorithm

The core algorithm used in Non-Uniform Refinement performs multiple data transfers to and from the GPU, while performing hundreds of GPU-accelerated FFTs. This test stresses the memory performance of the GPU and PCIe bandwidth of the CPU, and is limited by the performance of a single CPU core.

CryoSPARC’s core reconstruction algorithm

The different parameter combinations specified for the core reconstruction algorithm test code paths used by various CryoSPARC jobs, including Homogeneous Refinement, Non-Uniform Refinement, 3D Classification, and more. The test name (e.g., memory_single_linear20_marg_C1) is composed of the parameters used to perform the test:

<particle location>_<number of CPU threads>_<interpolation kernel>_<pose assignment method>_<symmetry operator>
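This naming scheme can be split mechanically, which is handy when reading raw benchmark JSON. The parser below is a hypothetical helper, not part of CryoSPARC; the field names are shorthand for the components listed here.

```python
# Hypothetical parser for the benchmark test-name format above,
# e.g. "memory_single_linear20_marg_C1".
FIELDS = ("location", "threads", "kernel", "pose", "symmetry")

def parse_test_name(name):
    parts = name.split("_")
    if len(parts) != 5:
        raise ValueError(f"unexpected test name: {name}")
    return dict(zip(FIELDS, parts))

print(parse_test_name("memory_single_linear20_marg_C1"))
# {'location': 'memory', 'threads': 'single', 'kernel': 'linear20',
#  'pose': 'marg', 'symmetry': 'C1'}
```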

  • Particle Location:

    • Particles can either be stored on the cache device (SSD) if caching is enabled, or read into memory. When particles are in memory, IO time becomes negligible.

  • Number of CPU Threads:

    • CryoSPARC’s core algorithm can be run with one or two threads. Most of the time, it’s run with two threads so that particle IO and GPU computation are performed concurrently. Note that in tests using 2 CPU threads, some timing numbers are inaccurate due to the concurrency of the computation, which is why only the overall time is reported. In these cases, it’s best to compare the sub-timings from the corresponding single-threaded test.

  • Interpolation Kernel:

    • CryoSPARC’s refinement and classification algorithms use two main interpolation kernels: trilinear (linear10) and tricubic (linear20) to interpolate values of the 3D density in Fourier space. Interpolation is necessary when rotating and projecting the 3D density, which is used in the orientation search step in most refinement/classification/variability jobs.

      Trilinear interpolation is significantly less computationally expensive than tricubic interpolation, requiring only 8 array accesses (vs. 64) of the underlying 3D density. Trilinear interpolation is also hardware-accelerated on NVIDIA GPUs through CUDA, whereas tricubic interpolation is not.

  • Pose Assignment Method:

    • Non-Uniform Refinement supports either pose “maximization” or “marginalization” during the reconstruction of the 3D density from the particle images. Maximization means that each particle is assigned a single 3D pose and shift during reconstruction. Alternatively, marginalization allows each particle to be assigned multiple 3D poses and shifts, each being weighted by their relative likelihoods under the image formation model. Maximization is usually sufficient, but for small particles or noisy datasets, marginalization helps to account for uncertainty in estimating the poses. When reconstructing the 3D density, maximization only has to insert each particle image into the 3D reconstruction once; marginalization is more computationally expensive because it requires inserting each image into the reconstruction multiple times.

  • Symmetry Operator:

    • CryoSPARC’s core reconstruction algorithm supports many symmetry operators, but C1 and D7 were chosen for these benchmarks as a way to turn the code paths that enable symmetry “off” and “on”, respectively.
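The cost difference between the two interpolation kernels described above can be made concrete with a minimal reference implementation of trilinear interpolation: the value at a fractional 3D coordinate is a weighted sum of the 8 surrounding grid points, whereas a tricubic kernel touches 64. This sketch is for illustration only and operates on a real-space array rather than CryoSPARC’s Fourier-space density.

```python
# Reference sketch of trilinear ("linear10"-style) interpolation:
# exactly 8 array accesses per interpolated value.
import numpy as np

def trilinear(volume, x, y, z):
    x0, y0, z0 = int(np.floor(x)), int(np.floor(y)), int(np.floor(z))
    fx, fy, fz = x - x0, y - y0, z - z0
    value = 0.0
    for dx in (0, 1):
        for dy in (0, 1):
            for dz in (0, 1):
                weight = ((fx if dx else 1 - fx)
                          * (fy if dy else 1 - fy)
                          * (fz if dz else 1 - fz))
                value += weight * volume[x0 + dx, y0 + dy, z0 + dz]
    return value

vol = np.arange(27, dtype=np.float64).reshape(3, 3, 3)
print(trilinear(vol, 0.5, 0.5, 0.5))  # → 6.5 (average of the 8 cube corners)
```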

Interpreting Results Using The Benchmark Viewer

To view and compare previous Benchmark results and reference benchmarks provided by Structura, navigate to the “Benchmarks” tab inside the “Manage” panel.

Under each sub-tab (CPU, File System, GPU, Extensive Validation), there will be reference benchmarks provided by Structura which can be used as a comparison against benchmarks run on the current instance.

To compare multiple references, select them from the table using the checkboxes and click the “Compare” button on the top right side of the screen.

In the comparison view, each benchmark is a column, and its timings are listed as rows. The overall time the benchmark took is listed under the “Time” sub-column (A), and each timing’s share of the overall time relative to the other timings is shown as a percentage in the “Pct” sub-column.

When a benchmark (column) is selected, it becomes the base “Reference” (B1) for the “Speedup” columns (B2), which are available for all other benchmarks in the comparison view. The “Speedup” is calculated as

Speedup = Reference (seconds) / Current (seconds),

which makes it easy to see how much faster or slower a timing is in comparison to the reference.
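The Speedup arithmetic in a few lines, with hypothetical timing values (a speedup above 1.0 means the current benchmark is faster than the reference):

```python
# Speedup = reference time / current time, per timing.
# Timing names and values here are hypothetical examples.
reference_timings = {"fsc_spherical": 12.4, "picking": 30.1}  # seconds
current_timings = {"fsc_spherical": 6.2, "picking": 45.15}    # seconds

speedups = {
    name: reference_timings[name] / current_timings[name]
    for name in reference_timings
}
print(speedups)  # fsc_spherical ran 2x faster, picking ran slower
```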

When a timing is hovered over, its details will be displayed in the “Benchmark Details” section on the right side of the page (C).

To view more detailed timings (available for the GPU benchmark only), click on the “+” button to expand a row (D). These sub-timings are the low-level functions that get called inside of CryoSPARC’s core reconstruction algorithm. The “Tags” column (E) indicates what hardware component each function’s speed is most dependent on.

For example, for the setup_scales sub-timing, the relevant component tags are “PCIe Latency/Bandwidth Speed” and “GPU/CPU Memory Allocation” because the function allocates space for a float32 array in CPU and GPU memory, fills it with data in CPU memory, then downloads the contents of the array from CPU memory to its corresponding location in GPU memory. When a “download” happens, it occurs over the PCIe lanes that connect the CPU to the GPU, where the link speed (determined by, e.g., PCIe Gen. 3 on most GPUs and PCIe Gen. 4 on NVIDIA Ampere and Ada architectures) matters most in determining how fast this happens.

Component Tags

Each tag indicates the hardware component that is the most likely bottleneck for the tagged function:

  • CPU Performance: single-core clock speed

  • GPU Performance: float32 performance and memory bandwidth

  • PCIe Latency/Bandwidth Speed: PCIe generation (3, 4) and number of lanes per slot (x8, x16)

  • GPU/CPU Memory Allocation: general CPU/GPU performance

  • Input/Output Speed: random read speeds of the storage device where particle images are located

Raw Data

At the end of the benchmark job, results are saved as a JSON and CSV in the job directory. The exact path of the files can be seen at the end of each test in the job’s Event Log.

Writing benchmark data to /bulk9/data/dev_projects/CS-peformance-benchmark/J73/J73_fs_benchmark_data.json

Writing benchmark data to /bulk9/data/dev_projects/CS-peformance-benchmark/J73/J73_fs_benchmark_data.csv

To view the original Benchmark job that a benchmark was created from, right click on the column header and select “Show job in sidebar”. The JSON and CSV results can also be downloaded from this context menu.
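The raw JSON files can also be loaded programmatically for custom comparisons outside the Benchmark viewer. The sketch below assumes the file-name pattern seen in the event-log lines above (e.g., J73_fs_benchmark_data.json); adjust the glob and paths for your own job directories.

```python
# Sketch: load a Benchmark job's raw JSON results from its job
# directory, based on the *_benchmark_data.json naming seen above.
import json
from pathlib import Path

def load_benchmark_json(job_dir):
    """Load the first *_benchmark_data.json found in a job directory."""
    matches = sorted(Path(job_dir).glob("*_benchmark_data.json"))
    if not matches:
        raise FileNotFoundError(f"no benchmark JSON in {job_dir}")
    return json.loads(matches[0].read_text())

# Example (hypothetical path):
# data = load_benchmark_json("/projects/CS-mybench/J73")
# print(data["type"], data["test_info"]["cryosparc_version"])
```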

Performance Benchmarking Entire Jobs with the Extensive Validation Job

The Extensive Workflow job is now called the Extensive Validation job (v4.3.0+).

The Extensive Validation job creates and queues other jobs in a pre-defined workflow. Workflows are available for the EMPIAR-10025 and EMPIAR-10305 datasets, which are downloaded when the job is run. If you run the Extensive Validation job in “Benchmark” mode, each job defined in the workflow runs in sequence. This allows you to compare the overall performance of each job in the Benchmark UI, along with the CPU, Filesystem, and GPU performance benchmarks.

First, create an “Extensive Validation” job and select “Benchmark” as the value for the “Run Mode” parameter.

In “Benchmark Mode”, jobs that support multi-GPU parallelization (such as Patch Motion Correction, Patch CTF Estimation, and 2D Classification) can be allocated multiple GPUs. To allocate multiple GPUs, specify a number greater than 1 for the “Number of GPUs to use” parameter field, and either select a lane or specify the exact GPUs using the “Run on specific GPUs” tab in the Resource Selection panel.

For more information on the jobs that are launched by the Extensive Validation job in benchmark mode, see Guide: Verify CryoSPARC Installation with the Extensive Validation Job (v4.3+).

Appendix

Drop the page cache

On Linux, the “page cache” is an area of unused memory that is used to store data the OS reads, for later rapid retrieval. For example, when you read a 1GB file twice, the second access of the file will be faster, since the file blocks come directly from the cache in memory instead of the hard disk or SSD. The OS automatically frees up data stored in the page cache as more memory is requested by other applications. The Benchmark job attempts to drop the files it uses from the page cache by using the posix_fadvise function to declare that the files “will not be accessed in the near future” (POSIX_FADV_DONTNEED). Doing so allows subsequent runs of the Benchmark job to be reproducible (meaning that the numbers reported by the job won’t be skewed by faster read times) without having to manually drop the page cache.

To drop the page cache manually, first instruct the kernel to write dirty pages to disk (sync), then drop the page cache (echo 1 > /proc/sys/vm/drop_caches):

sudo bash -c 'sync; echo 1 > /proc/sys/vm/drop_caches'
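The POSIX_FADV_DONTNEED technique described above is also available from Python via the standard-library binding os.posix_fadvise (Linux/Unix only). A minimal sketch:

```python
# Minimal sketch of the POSIX_FADV_DONTNEED technique: hint the kernel
# that a file's cached pages can be evicted from the page cache.
import os

def drop_file_from_page_cache(path):
    fd = os.open(path, os.O_RDONLY)
    try:
        os.fsync(fd)  # flush any dirty pages for this file first
        os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_DONTNEED)  # length 0 = whole file
    finally:
        os.close(fd)
```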

