CryoSPARC Guide

CryoSPARC Cluster Integration Script Examples

Examples of cluster_info.json and cluster_script.sh scripts for various cluster workload managers
CryoSPARC can integrate with cluster scheduler systems. This page contains examples of integration setups.
Due to the many variations of cluster scheduler systems and their configurations, the examples here will need to be modified for your own specific use case.
For information on what each variable means, see Connect a Cluster to CryoSPARC.

GPU Resource Management

When CryoSPARC launches a job to the cluster, the number of GPUs requested by the user is used in the submission script. CryoSPARC does not know which specific devices the job will be allocated, so each job simply tries to use GPUs with device numbers starting at zero. For example, a 2-GPU job will try to use GPUs [0, 1] when submitted to a cluster. It is the responsibility of the cluster system to correctly allocate the requested GPU resources to CryoSPARC jobs while insulating those allocated resources from interference by other jobs. For this purpose, the SLURM scheduler, for example, can combine Generic Resource (GRES) management with Linux cgroup controls.
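One way to confirm that the scheduler exposes the expected number of GPUs is to add a small sanity check to the submission script before the job command. The sketch below is illustrative and not part of CryoSPARC: the function name count_visible_gpus is hypothetical, and it assumes the scheduler exports CUDA_VISIBLE_DEVICES as a comma-separated list, which depends on how GPU resources are configured on your cluster.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of CryoSPARC): count the entries in a
# comma-separated device list such as the value of CUDA_VISIBLE_DEVICES.
count_visible_gpus() {
    local devs="${1-}"
    local -a entries=()
    # An empty or unset list means no devices are exposed.
    if [[ -z "$devs" ]]; then
        echo 0
        return 0
    fi
    # Split the list on commas into an array and print its length.
    IFS=',' read -r -a entries <<< "$devs"
    echo "${#entries[@]}"
}

# Report what the scheduler exposed to this job.
echo "CUDA_VISIBLE_DEVICES=${CUDA_VISIBLE_DEVICES-<unset>}"
echo "visible GPU count: $(count_visible_gpus "${CUDA_VISIBLE_DEVICES-}")"
```

In a submission script template, the printed count could be compared with the rendered value of {{ num_gpu }} so that a mismatch shows up in the job log before the job starts.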
If your cluster does not properly allocate GPU resources, you may try to partially work around this using the following bash code snippet just before {{run_cmd}} in your cluster submission script template:
available_devs=""
for devidx in $(seq 0 15);
do
    if [[ -z $(nvidia-smi -i $devidx --query-compute-apps=pid --format=csv,noheader) ]] ; then
        if [[ -z "$available_devs" ]] ; then
            available_devs=$devidx
        else
            available_devs=$available_devs,$devidx
        fi
    fi
done
export CUDA_VISIBLE_DEVICES=$available_devs
This loop checks which GPUs are currently idle (that is, have no running compute process) and makes those devices available to the newly spawned job. This method is no substitute for properly configured cluster resource allocation and is therefore not recommended.
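Because the loop above exports every idle device it finds, it can be useful to fail early when fewer idle GPUs are available than the job needs. The following is a minimal sketch under that assumption. The helper select_devices is hypothetical and not something CryoSPARC provides; in a real template the requested count would come from {{ num_gpu }}, and the device list from the loop's available_devs variable.

```shell
#!/usr/bin/env bash
# Hypothetical helper: keep the first N entries of a comma-separated
# device list, or fail if the list holds fewer than N entries.
select_devices() {
    local available="$1" wanted="$2"
    local -a devs=()
    if [[ -n "$available" ]]; then
        IFS=',' read -r -a devs <<< "$available"
    fi
    if (( ${#devs[@]} < wanted )); then
        echo "only ${#devs[@]} idle GPU(s) found, ${wanted} requested" >&2
        return 1
    fi
    # Join the first N entries with commas.
    local IFS=','
    echo "${devs[*]:0:wanted}"
}

# Example with a made-up list of idle devices and a 2-GPU request.
if selected=$(select_devices "0,2,3" 2); then
    export CUDA_VISIBLE_DEVICES="$selected"
    echo "CUDA_VISIBLE_DEVICES=$CUDA_VISIBLE_DEVICES"
fi
```

Like the loop itself, this check cannot prevent two jobs that start at the same moment from picking the same idle device, which is one reason scheduler-level allocation remains preferable.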

SLURM

Example A

cluster_info.json
{
"name" : "slurmcluster",
"worker_bin_path" : "/path/to/cryosparc_worker/bin/cryosparcw",
"cache_path" : "/path/to/local/SSD/on/cluster/nodes",
"send_cmd_tpl" : "ssh loginnode {{ command }}",
"qsub_cmd_tpl" : "sbatch {{ script_path_abs }}",
"qstat_cmd_tpl" : "squeue -j {{ cluster_job_id }}",
"qdel_cmd_tpl" : "scancel {{ cluster_job_id }}",
"qinfo_cmd_tpl" : "sinfo",
"transfer_cmd_tpl" : "scp {{ src_path }} loginnode:{{ dest_path }}"
}
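Before connecting a cluster, it may help to check that a cluster_info.json file contains the fields you expect. The sketch below is a hypothetical pre-flight check, not a CryoSPARC tool. Its key list is simply the set of keys that appear in every cluster_info.json example on this page, not an authoritative schema, and it uses a plain-text search instead of a JSON parser.

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight check: print the keys missing from a cluster_info.json.
# The key list is taken from the examples on this page (an assumption),
# and the match is a text search, so it does not validate JSON syntax.
missing_cluster_keys() {
    local file="$1" key
    local -a missing=()
    for key in name worker_bin_path cache_path send_cmd_tpl \
               qsub_cmd_tpl qstat_cmd_tpl qdel_cmd_tpl qinfo_cmd_tpl; do
        # Match the quoted key followed by optional whitespace and a colon.
        if ! grep -Eq "\"${key}\"[[:space:]]*:" "$file"; then
            missing+=("$key")
        fi
    done
    # Print the missing keys separated by spaces (empty line if none).
    echo "${missing[*]}"
}

# Example: write a deliberately incomplete file and list what is absent.
tmpfile=$(mktemp)
printf '{\n"name" : "slurmcluster",\n"qsub_cmd_tpl" : "sbatch {{ script_path_abs }}"\n}\n' > "$tmpfile"
echo "missing keys: $(missing_cluster_keys "$tmpfile")"
rm -f "$tmpfile"
```

Because this check does not parse the file, syntax problems such as a missing comma between entries go undetected; running the file through a JSON parser (for example python3 -m json.tool) covers that case.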
cluster_script.sh
#!/usr/bin/env bash
#### cryoSPARC cluster submission script template for SLURM
## Available variables:
## {{ run_cmd }} - the complete command string to run the job
## {{ num_cpu }} - the number of CPUs needed
## {{ num_gpu }} - the number of GPUs needed.
## Note: the code will use this many GPUs starting from dev id 0
## the cluster scheduler or this script have the responsibility
## of setting CUDA_VISIBLE_DEVICES so that the job code ends up
## using the correct cluster-allocated GPUs.
## {{ ram_gb }} - the amount of RAM needed in GB
## {{ job_dir_abs }} - absolute path to the job directory
## {{ project_dir_abs }} - absolute path to the project dir
## {{ job_log_path_abs }} - absolute path to the log file for the job
## {{ worker_bin_path }} - absolute path to the cryosparc worker command
## {{ run_args }} - arguments to be passed to cryosparcw run
## {{ project_uid }} - uid of the project
## {{ job_uid }} - uid of the job
## {{ job_creator }} - name of the user that created the job (may contain spaces)
## {{ cryosparc_username }} - cryosparc username of the user that created the job (usually an email)
## {{ job_type }} - CryoSPARC job type
##
## What follows is a simple SLURM script:

#SBATCH --job-name cryosparc_{{ project_uid }}_{{ job_uid }}
#SBATCH -n {{ num_cpu }}
#SBATCH --gres=gpu:{{ num_gpu }}
#SBATCH --partition=gpu
#SBATCH --mem={{ (ram_gb*1000)|int }}MB
#SBATCH --output={{ job_log_path_abs }}
#SBATCH --error={{ job_log_path_abs }}

{{ run_cmd }}

Example B

cluster_info.json
{
"qdel_cmd_tpl": "scancel {{ cluster_job_id }}",
"worker_bin_path": "/home/cryosparcuser/cryosparc_worker/bin/cryosparcw",
"title": "debug_cluster",
"cache_path": "/ssd/tmp",
"qinfo_cmd_tpl": "sinfo --format='%.8N %.6D %.10P %.6T %.14C %.5c %.6z %.7m %.7G %.9d %20E'",
"qsub_cmd_tpl": "sbatch {{ script_path_abs }}",
"qstat_cmd_tpl": "squeue -j {{ cluster_job_id }}",
"cache_quota_mb": null,
"send_cmd_tpl": "{{ command }}",
"cache_reserve_mb": 10000,
"name": "debug_cluster"
}
cluster_script.sh
#!/bin/bash
#SBATCH --job-name=cryosparc_{{ project_uid }}_{{ job_uid }}
#SBATCH --partition=debug
#SBATCH --output={{ job_log_path_abs }}
#SBATCH --error={{ job_log_path_abs }}
#SBATCH --nodes=1
#SBATCH --mem={{ (ram_gb*1000)|int }}M
#SBATCH --ntasks-per-node=1
#SBATCH --cpus-per-task={{ num_cpu }}
#SBATCH --gres=gpu:{{ num_gpu }}
#SBATCH --gres-flags=enforce-binding
srun {{ run_cmd }}

Example C

cluster_info.json
{
"qdel_cmd_tpl": "scancel {{ cluster_job_id }}",
"worker_bin_path": "/home/cryosparcuser/cryosparc_worker/bin/cryosparcw",
"title": "test",
"cache_path": "",
"qinfo_cmd_tpl": "sinfo",
"qsub_cmd_tpl": "sbatch {{ script_path_abs }}",
"qstat_cmd_tpl": "squeue -j {{ cluster_job_id }}",
"send_cmd_tpl": "{{ command }}",
"name": "test"
}
cluster_script.sh
#!/bin/bash
#SBATCH --job-name=cryosparc_{{ project_uid }}_{{ job_uid }}
#SBATCH --output={{ job_log_path_abs }}
#SBATCH --error={{ job_log_path_abs }}
#SBATCH --ntasks={{ num_cpu }}
#SBATCH --mem={{ (ram_gb*1000)|int }}M
#SBATCH --cpus-per-task=1
#SBATCH --gres=gpu:{{ num_gpu }}
#SBATCH --gres-flags=enforce-binding
srun {{ run_cmd }}

Example D

cluster_script.sh
#!/bin/bash
#SBATCH --partition=gpu
#SBATCH --nodes=1
#SBATCH --ntasks={{ num_cpu }}
#SBATCH --gres=gpu:{{ num_gpu }}
#SBATCH --time=48:00:00
#SBATCH --mem={{ (ram_gb)|int }}GB
#SBATCH --exclusive
#SBATCH --job-name cspark_{{ project_uid }}_{{ job_uid }}
#SBATCH --output={{ job_log_path_abs }}
#SBATCH --error={{ job_log_path_abs }}
{{ run_cmd }}

Example E

cluster_script.sh
#!/bin/bash
#SBATCH --job-name=cryosparc_{{ project_uid }}_{{ job_uid }}
#SBATCH --partition=q2
#SBATCH --output={{ job_log_path_abs }}
#SBATCH --error={{ job_log_path_abs }}
{%- if num_gpu == 0 %}
#SBATCH --ntasks={{ num_cpu }}
#SBATCH --cpus-per-task=1
#SBATCH --threads-per-core=1
{%- else %}
#SBATCH --nodes=1
#SBATCH --ntasks-per-node={{ num_cpu }}
#SBATCH --cpus-per-task=1
#SBATCH --threads-per-core=1
#SBATCH --gres=gpu:{{ num_gpu }}
#SBATCH --gres-flags=enforce-binding
{%- endif %}
{{ run_cmd }}

Example F

cluster_script.sh
{%- macro _min(a, b) -%}
{%- if a <= b %}{{a}}{% else %}{{b}}{% endif -%}
{%- endmacro -%}
#SBATCH --job-name=cryosparc_{{ project_uid }}_{{ job_uid }}
#SBATCH --output={{ job_log_path_abs }}
#SBATCH --error={{ job_log_path_abs }}
#SBATCH --cpus-per-task=1
#SBATCH --threads-per-core=1
#SBATCH --partition=gpu
#SBATCH --exclusive
#SBATCH --mem=100000
{%- if num_gpu == 0 %}
# Use CPU cluster
#SBATCH --constraint=mc
#SBATCH --ntasks={{ num_cpu }}

PBS

Example A

cluster_info.json
{
"name" : "pbscluster",
"worker_bin_path" : "/path/to/cryosparc_worker/bin/cryosparcw",
"cache_path" : "/path/to/local/SSD/on/cluster/nodes",
"send_cmd_tpl" : "ssh loginnode {{ command }}",
"qsub_cmd_tpl" : "qsub {{ script_path_abs }}",
"qstat_cmd_tpl" : "qstat -as {{ cluster_job_id }}",
"qdel_cmd_tpl" : "qdel {{ cluster_job_id }}",
"qinfo_cmd_tpl" : "qstat -q",
"transfer_cmd_tpl" : "scp {{ src_path }} loginnode:{{ dest_path }}"
}
cluster_script.sh
#!/bin/bash
#### cryoSPARC cluster submission script template for PBS
## Available variables:
## {{ run_cmd }} - the complete command string to run the job
## {{ num_cpu }} - the number of CPUs needed
## {{ num_gpu }} - the number of GPUs needed.
## Note: the code will use this many GPUs starting from dev id 0
## the cluster scheduler or this script have the responsibility
## of setting CUDA_VISIBLE_DEVICES so that the job code ends up
## using the correct cluster-allocated GPUs.
## {{ ram_gb }} - the amount of RAM needed in GB
## {{ job_dir_abs }} - absolute path to the job directory
## {{ project_dir_abs }} - absolute path to the project dir
## {{ job_log_path_abs }} - absolute path to the log file for the job
## {{ worker_bin_path }} - absolute path to the cryosparc worker command
## {{ run_args }} - arguments to be passed to cryosparcw run
## {{ project_uid }} - uid of the project
## {{ job_uid }} - uid of the job
## {{ job_creator }} - name of the user that created the job (may contain spaces)
## {{ cryosparc_username }} - cryosparc username of the user that created the job (usually an email)
## {{ job_type }} - CryoSPARC job type
##
## What follows is a simple PBS script:
#PBS -N cryosparc_{{ project_uid }}_{{ job_uid }}
#PBS -l select=1:ncpus={{ num_cpu }}:ngpus={{ num_gpu }}:mem={{ (ram_gb*1000)|int }}mb:gputype=P100
#PBS -o {{ job_log_path_abs }}
#PBS -e {{ job_log_path_abs }}
{{ run_cmd }}

UGE

Example A

cluster_info.json
{
"name" : "ugecluster",
"worker_bin_path" : "/u/cryosparcuser/cryosparc/cryosparc_worker/bin/cryosparcw",
"cache_path" : "/scratch/cryosparc_cache",
"send_cmd_tpl" : "{{ command }}",
"qsub_cmd_tpl" : "qsub {{ script_path_abs }}",
"qstat_cmd_tpl" : "qstat -j {{ cluster_job_id }}",
"qdel_cmd_tpl" : "qdel {{ cluster_job_id }}",
"qinfo_cmd_tpl" : "qstat -q default.q",
"transfer_cmd_tpl" : "scp {{ src_path }} uoft:{{ dest_path }}"
}
cluster_script.sh
#!/bin/bash
## What follows is a simple UGE script:
## Job Name
#$ -N cryosparc_{{ project_uid }}_{{ job_uid }}
## Number of CPUs (select 1 CPU always, and oversubscribe as GPU is per core value)
##$ -pe smp {{ num_cpu }}
#$ -pe smp 1
## Memory per CPU core
#$ -l m_mem_free={{ (ram_gb)|int }}G
## Number of GPUs
#$ -l gpu_card={{ num_gpu }}
## Time limit 4 days
#$ -l h_rt=345600
## STDOUT/STDERR
#$ -o {{ job_log_path_abs }}
#$ -e {{ job_log_path_abs }}
#$ -j y
## Number of threads
export OMP_NUM_THREADS={{ num_cpu }}
echo "HOSTNAME: $HOSTNAME"
{{ run_cmd }}