CryoSPARC Cluster Integration Script Examples

Examples of cluster_info.json and cluster_script.sh scripts for various cluster workload managers

CryoSPARC can integrate with cluster scheduler systems. This page contains examples of integration setups.

Because cluster scheduler systems and their configurations vary widely, the examples here will need to be adapted to your own specific use case.

For information on what each variable means, see Connect a Cluster to CryoSPARC.
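
As a quick reminder of how these files are used (that page covers the details), cluster_info.json and cluster_script.sh are saved together in one directory on the master node and registered as a scheduler lane with cryosparcm; a minimal sketch:

# run as the cryosparc user on the master node, from the directory that
# contains cluster_info.json and cluster_script.sh
cryosparcm cluster connect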

GPU Resource Management

When CryoSPARC submits a job to a cluster, the number of GPUs requested by the user is substituted into the submission script, but CryoSPARC does not know which specific devices the job will be allocated; each job therefore simply tries to use GPU device numbers starting at zero. For example, a 2-GPU job submitted to a cluster will try to use GPUs [0, 1]. It is the responsibility of the cluster system to allocate the requested GPU resources to each CryoSPARC job and to insulate those allocations from interference by other jobs. The SLURM scheduler, for example, can accomplish this by combining Generic Resource (GRES) management with Linux cgroup controls.
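
As an illustration of the cluster-side piece, the following SLURM configuration excerpts show one common way to combine GRES tracking with cgroup device constraints. These files are administered outside of CryoSPARC, and the node name, device paths, and resource counts below are placeholders; your site's configuration will differ.

# slurm.conf (excerpt): track GPUs as a generic resource and enforce allocations with cgroups
GresTypes=gpu
ProctrackType=proctrack/cgroup
TaskPlugin=task/cgroup
NodeName=gpunode01 Gres=gpu:4 CPUs=32 RealMemory=256000 State=UNKNOWN

# gres.conf on gpunode01: map the "gpu" resource to the physical devices
NodeName=gpunode01 Name=gpu File=/dev/nvidia[0-3]

# cgroup.conf: hide devices that were not allocated to the job
ConstrainDevices=yes

With device constraints in place, each job sees only the GPUs it was allocated, and they are typically presented to CUDA starting from device index zero, which matches the assumption CryoSPARC makes above.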

A SLURM Example

cluster_info.json
{
    "name": "slurm-lane1",
    "worker_bin_path": "/path/to/cryosparc_worker/bin/cryosparcw",
    "send_cmd_tpl": "{{ command }}",
    "qsub_cmd_tpl": "/opt/slurm/bin/sbatch {{ script_path_abs }}",
    "qstat_cmd_tpl": "/opt/slurm/bin/squeue -j {{ cluster_job_id }}",
    "qdel_cmd_tpl": "/opt/slurm/bin/scancel {{ cluster_job_id }}",
    "qinfo_cmd_tpl": "/opt/slurm/bin/sinfo"
}
cluster_script.sh
#!/usr/bin/env bash

#SBATCH --job-name cryosparc_{{ project_uid }}_{{ job_uid }}
#SBATCH --cpus-per-task={{ num_cpu }}
#SBATCH --gres=gpu:{{ num_gpu }}
#SBATCH --mem={{ ram_gb|int }}G
#SBATCH --comment="created by {{ cryosparc_username }}"
#SBATCH --output={{ job_dir_abs }}/{{ project_uid }}_{{ job_uid }}_slurm.out
#SBATCH --error={{ job_dir_abs }}/{{ project_uid }}_{{ job_uid }}_slurm.err

{{ run_cmd }}
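
To confirm that the GPUs this lane requests via --gres are actually advertised by the scheduler, the GRES column can be added to the sinfo output; a small sketch reusing the install path from the example above:

# list each node set together with its generic resources (GRES);
# GPU nodes should report something like gpu:4
/opt/slurm/bin/sinfo -o "%N %G"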

A PBS Example

cluster_info.json
{
    "name" : "pbscluster",
    "worker_bin_path" : "/path/to/cryosparc_worker/bin/cryosparcw",
    "cache_path" : "/path/to/local/SSD/on/cluster/nodes",
    "send_cmd_tpl" : "ssh loginnode {{ command }}",
    "qsub_cmd_tpl" : "qsub {{ script_path_abs }}",
    "qstat_cmd_tpl" : "qstat -as {{ cluster_job_id }}",
    "qdel_cmd_tpl" : "qdel {{ cluster_job_id }}",
    "qinfo_cmd_tpl" : "qstat -q"
}
cluster_script.sh
#!/bin/bash

#PBS -N cryosparc_{{ project_uid }}_{{ job_uid }}
#PBS -l select=1:ncpus={{ num_cpu }}:ngpus={{ num_gpu }}:mem={{ (ram_gb*1000)|int }}mb:gputype=P100
#PBS -o {{ job_dir_abs }}/cluster.out
#PBS -e {{ job_dir_abs }}/cluster.err

{{ run_cmd }}
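
Because this lane wraps every scheduler command in ssh loginnode (see send_cmd_tpl above), the CryoSPARC master must be able to run those commands on the login node without a password prompt; a quick sanity check, where loginnode is the placeholder hostname from the example:

# run as the cryosparc user on the master node; this should print the PBS
# queue listing without prompting for a password (mirrors qinfo_cmd_tpl)
ssh -o BatchMode=yes loginnode "qstat -q"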

A Gridengine Example

cluster_info.json
{
    "name" : "ugecluster",
    "worker_bin_path" : "/u/cryosparcuser/cryosparc/cryosparc_worker/bin/cryosparcw",
    "cache_path" : "/scratch/cryosparc_cache",
    "send_cmd_tpl" : "{{ command }}",
    "qsub_cmd_tpl" : "qsub {{ script_path_abs }}",
    "qstat_cmd_tpl" : "qstat -j {{ cluster_job_id }}",
    "qdel_cmd_tpl" : "qdel {{ cluster_job_id }}",
    "qinfo_cmd_tpl" : "qstat -q default.q"
}
cluster_script.sh
#!/bin/bash

## What follows is a simple UGE script:
## Job Name
#$ -N cryosparc_{{ project_uid }}_{{ job_uid }}

## Number of CPU slots (always request a single slot and oversubscribe, since the GPU request below is a per-slot value)
##$ -pe smp {{ num_cpu }}
#$ -pe smp 1

## Memory per CPU core
#$ -l m_mem_free={{ ram_gb|int }}G

## Number of GPUs 
#$ -l gpu_card={{ num_gpu }}

## Time limit 4 days
#$ -l h_rt=345600

## STDOUT/STDERR
#$ -o {{ job_dir_abs }}/cluster.out
#$ -e {{ job_dir_abs }}/cluster.err
#$ -j y

## Number of threads
export OMP_NUM_THREADS={{ num_cpu }}

echo "HOSTNAME: $HOSTNAME"

{{ run_cmd }}
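
The gpu_card and m_mem_free resources requested above are complexes defined by the cluster administrator and may be named differently at your site; one way to check what your scheduler defines:

# list the scheduler's complex (resource) definitions and look for the
# attributes used in the script above
qconf -sc | grep -E "gpu_card|m_mem_free"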
