ssh <cryosparcuser>@<cryosparc_server>
<cryosparcuser> = cryosparcuser
The username of the account that will install and run cryoSPARC.
<cryosparc_server> = uoft
The hostname of the server where cryoSPARC will be installed.
cd <install_path>
<install_path> = /home/cryosparcuser/cryosparc
This will be the root directory where all cryoSPARC code and dependencies will be installed.
export LICENSE_ID="<license_id>"
<license_id> = 682437fb-d6ae-47b8-870b-b530c587da94
This is the License ID issued to you, which you would have received in an email.
A unique License ID is required for each individual cryoSPARC master instance.
curl -L https://get.cryosparc.com/download/master-latest/$LICENSE_ID -o cryosparc_master.tar.gz
curl -L https://get.cryosparc.com/download/worker-latest/$LICENSE_ID -o cryosparc_worker.tar.gz
tar -xf cryosparc_master.tar.gz cryosparc_master
tar -xf cryosparc_worker.tar.gz cryosparc_worker
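To confirm both archives downloaded and extracted cleanly before proceeding, a quick sanity check (a minimal sketch, run from <install_path>):
# both directories should exist and each should contain an install.sh script
ls cryosparc_master/install.sh cryosparc_worker/install.sh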
Follow these instructions to install the master package on the master node. If you are following the Single Workstation setup, once you run the installation command, cryoSPARC will be ready to use.
If you are installing cryoSPARC on a single workstation, you can use the Single Workstation instructions below that simplify installation to a single command. Otherwise, use the Master Node Only instructions here, and then continue with the further steps to install on worker nodes separately.
cd cryosparc_master
./install.sh --standalone \
             --license $LICENSE_ID \
             --worker_path <worker path> \
             --cudapath <cuda path> \
             --ssdpath <ssd path> \
             --initial_email <user email> \
             --initial_password <user password> \
             --initial_username "<login username>" \
             --initial_firstname "<given name>" \
             --initial_lastname "<surname>" \
             [--port <port_number>]
./install.sh --standalone \
             --license $LICENSE_ID \
             --worker_path /u/cryosparc_user/cryosparc/cryosparc_worker \
             --cudapath /usr/local/cuda \
             --ssdpath /scratch/cryosparc_cache \
             --initial_email "someone@structura.bio" \
             --initial_password "Password123" \
             --initial_username "username" \
             --initial_firstname "FirstName" \
             --initial_lastname "LastName" \
             --port 61000
<worker_path> = /home/cryosparc_user/software/cryosparc/cryosparc_worker
the full path to the worker directory, which was downloaded and extracted in step c)
To get the full path, cd into the cryosparc_worker directory and run the command: pwd -P
<cuda_path> = /usr/local/cuda
path to the CUDA installation directory on the worker node (CUDA Requirements)
This path should not be the cuda/bin directory, but the cuda directory that contains both bin and lib64 subdirectories
<ssd_path> = /scratch/cryosparc_cache
path on the worker node to a writable directory residing on the local SSD (For more information on SSD usage in cryoSPARC, see X)
this is optional; if omitted, specify the --nossd option to indicate that the worker node does not have an SSD
<initial_email> = someone@structura.bio
login email address for first cryoSPARC webapp account
this will become an admin account in the user interface
<initial_username> = "FirstName LastName"
login username of the initial admin account to be created
ensure the name is quoted
<initial_firstname> = "FirstName"
given name of the initial admin account to be created
ensure the name is quoted
<initial_lastname> = "LastName"
surname of the initial admin account to be created
ensure the name is quoted
<user_password> = Password123
temporary password that will be created for the <user_email> account
<port_number> = 39000
The base port number for this cryoSPARC instance. Do not install the cryoSPARC master on the same machine multiple times with the same port number; this can cause database errors. 39000 is the default and will be used if you don't specify this option.
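For example, you could confirm that nothing is already listening in the port range before installing (a rough sketch; cryoSPARC uses a small block of consecutive ports starting at the base port, assumed here to be 39000):
for p in $(seq 39000 39010); do ss -tln | grep -q ":$p " && echo "port $p is already in use"; done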
cd cryosparc_master
./install.sh --license $LICENSE_ID \
             --hostname <master_hostname> \
             --dbpath <db_path> \
             --port <port_number> \
             [--insecure] \
             [--allowroot] \
             [--yes]
./install.sh --license $LICENSE_ID \
             --hostname cryoem.cryosparcserver.edu \
             --dbpath /u/cryosparcuser/cryosparc/cryosparc_database \
             --port 45000
./bin/cryosparcm start
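Before creating the first user account, you can confirm that the master processes came up (a quick check, run from the cryosparc_master directory; the output lists each process and whether it is running):
./bin/cryosparcm status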
cryosparcm createuser --email "<user email>" \
                      --password "<user password>" \
                      --username "<login username>" \
                      --firstname "<given name>" \
                      --lastname "<surname>"
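For reference, a filled-in example using the same placeholder values as the standalone example above (substitute your own details):
cryosparcm createuser --email "someone@structura.bio" \
                      --password "Password123" \
                      --username "username" \
                      --firstname "FirstName" \
                      --lastname "LastName"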
--license $LICENSE_ID
the LICENSE_ID variable exported in step a)
<master_hostname> = your.cryosparc.hostname.com
the hostname of the server where the master is to be installed
<port_number> = 39000
The base port number for this cryoSPARC instance. Do not install the cryoSPARC master on the same machine multiple times with the same port number; this can cause database errors. 39000 is the default and will be used if you don't specify this option.
<db_path> = /u/cryosparcuser/cryosparc/cryosparc_database
the absolute path to a folder where the cryoSPARC database is to be installed. Ensure this location is readable and writable. The folder will be created if it doesn't already exist
<initial_email> = someone@structura.bio
login email address for first cryoSPARC webapp account
this will become an admin account in the user interface
<initial_username> = "FirstName LastName"
login username of the initial admin account to be created
ensure the name is quoted
<initial_firstname> = "FirstName"
given name of the initial admin account to be created
ensure the name is quoted
<initial_lastname> = "LastName"
surname of the initial admin account to be created
ensure the name is quoted
<user_password> = Password123
temporary password that will be created for the <user_email> account
--insecure
[optional] specify this option to ignore SSL certificate errors when connecting to HTTPS endpoints. This is useful if you are behind an enterprise network using SSL injection.
--allowroot
[optional] by default, running the installer with root privileges will fail. Specify this option to force cryoSPARC to be installed as the root user.
--yes
[optional] do not ask for any user input confirmations
source ~/.bashrc
The cryoSPARC database holds metadata, images, plots and logs of jobs that have run. If the database becomes corrupt or is lost due to user error or filesystem issues, work that was done in projects can be lost. This can be recovered in two ways:
In a new/fresh cryoSPARC instance, import the project directories of projects from the original instance. This should resurrect all jobs, outputs, inputs, metadata, etc. Please note: user and instance level settings and configuration will be lost.
Maintain a backup (daily or weekly recommended) of the cryoSPARC database. You can use the backup functionality documented here as well as a cron job or other recurring scheduler to run the backup command.
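For example, a crontab entry along the following lines could run the backup on a weekly schedule (a sketch only; the install path and backup directory are placeholders, and the backup target should ideally live on a different filesystem than the live database):
# back up the cryoSPARC database every Sunday at 02:00
0 2 * * 0 /home/cryosparcuser/cryosparc/cryosparc_master/bin/cryosparcm backup --dir=/backups/cryosparc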
After completing the above, navigate your browser to http://<workstation_hostname>:39000 to access the cryoSPARC user interface.
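If the page does not load, you can first check that the webapp answers locally on the master node (a sketch assuming the default base port 39000):
curl -sI http://localhost:39000 | head -n 1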
If you were following the Single Workstation install above, your installation is now complete.
Log onto the worker node (or a cluster worker node if installing on a cluster) as the cryosparcuser, and run the installation command.
Installing cryosparc_worker requires an NVIDIA GPU and CUDA Toolkit version ≥9.2 and ≤10.2 (CUDA 11 is not supported).
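Before installing, it can help to confirm that the GPU driver and CUDA Toolkit are visible on the worker node (a quick sketch; the CUDA path shown is only an example):
nvidia-smi
/usr/local/cuda/bin/nvcc --version   # should report a toolkit version between 9.2 and 10.2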
cd cryosparc_worker
./install.sh --license $LICENSE_ID \
             --cudapath <cuda_path> \
             [--yes]
--license $LICENSE_ID
The LICENSE_ID variable exported in step a)
<cuda_path> = /usr/local/cuda
Path to the CUDA installation directory on the worker node
Note: this path should not be the cuda/bin subdirectory, but the CUDA directory that contains both bin and lib64 subdirectories
--yes
[optional] do not ask for any user input confirmations
Ensure the cryoSPARC master is running when connecting a worker node. On the worker node itself, run the connection command:
cd cryosparc_worker
./bin/cryosparcw connect --worker <worker_hostname> \
                         --master <master_hostname> \
                         --port <port_num> \
                         --ssdpath <ssd_path> \
                         [--update] \
                         [--sshstr <custom_ssh_string>] \
                         [--nogpu] \
                         [--gpus <0,1,2,3>] \
                         [--nossd] \
                         [--ssdquota <ssd_quota_mb>] \
                         [--ssdreserve <ssd_reserve_mb>] \
                         [--lane <lane_name>] \
                         [--newlane]
When updating configurations for a worker node that is already connected to your cryoSPARC instance, supply the --update argument as well as the argument you are trying to update.
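For reference, a filled-in example using the placeholder hostnames and SSD cache path from this guide (the lane name here is an assumption; adjust the options to your setup):
cd cryosparc_worker
./bin/cryosparcw connect --worker worker.cryosparc.hostname.com \
                         --master your.cryosparc.hostname.com \
                         --port 39000 \
                         --ssdpath /scratch/cryosparc_cache \
                         --lane default \
                         --newlane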
<worker_hostname> = worker.cryosparc.hostname.com
the hostname of the worker node to connect to your cryoSPARC instance
<master_hostname> = your.cryosparc.hostname.com
the hostname of the server where the master is installed
<port_number> = 39000
The base port number for the cryoSPARC instance you are connecting the worker node to
--ssdpath <ssd_path>
[optional] path to directory on local ssd
use --nossd to connect a worker node without an SSD
--update
[optional] used to notify the command that a configuration is being updated
--sshstr <custom_ssh_string>
[optional] custom ssh connection string like user@hostname
--nogpu
[optional] connect worker with no GPUs
--gpus 0,1,2,3
[optional] enable specific GPU devices only
For advanced configuration, run the gpulist command:
$ bin/cryosparcw gpulist
Detected 4 CUDA devices.

   id     pci-bus          name
   ---------------------------------------------------------------
    0     0000:42:00.0     Quadro GV100
    1     0000:43:00.0     Quadro GV100
    2     0000:0B:00.0     Quadro RTX 5000
    3     0000:0A:00.0     GeForce GTX 1080 Ti
   ---------------------------------------------------------------
This will list the available GPUs on the worker node, and their corresponding numbers. Use this list to decide which GPUs you wish to enable using the --gpus flag, or leave this flag out to enable all GPUs.
--nossd
[optional] connect a worker node with no SSD
--ssdquota <ssd_quota_mb>
[optional] quota of how much SSD space to use (MB)
--ssdreserve <ssd_reserve_mb>
[optional] minimum free space to leave on SSD (MB)
--lane <lane_name>
[optional] name of lane to add worker to
--newlane
[optional] force creation of a new lane if the lane specified by --lane does not exist
Once the cryosparc_worker package is installed, the cluster must be registered with the master process, including providing template job submission commands and scripts that the master process will use to submit jobs to the cluster scheduler.
To register the cluster, you will need to provide cryoSPARC with two files: cluster_info.json and cluster_script.sh. The first file (cluster_info.json) contains template strings used to construct cluster commands (like qsub, qstat, qdel etc., or their equivalents for your system), and the second file (cluster_script.sh) contains a template string to construct appropriate cluster submission scripts for your system. The jinja2 template engine is used to generate cluster submission/monitoring commands as well as submission scripts for each job.
The following fields are required to be defined as template strings in the configuration of a cluster. Examples for PBS are given here, but you can use any command required for your particular cluster scheduler.
cluster_info.json

name : "cluster1"
# A unique name for the cluster to be connected (multiple clusters can be connected)

worker_bin_path : "/path/to/cryosparc_worker/bin/cryosparcw"
# Path on cluster nodes to the cryosparcw entry point for the worker process

cache_path : "/path/to/local/SSD/on/cluster/nodes"
# Path on cluster nodes that is a writable location on local SSD on each cluster node. This might be /scratch or similar. This path MUST be the same on all cluster nodes. Note that the installer does not check that this path exists, so make sure it does and is writable. If you plan to use the cluster nodes without SSD, you can leave this blank.

send_cmd_tpl : "ssh loginnode {{ command }}"
# Used to send a command to be executed by a cluster node (in case the cryosparc master is not able to directly use cluster commands). If your cryosparc master node is able to directly use cluster commands (like qsub etc.) then this string can be just "{{ command }}"

qsub_cmd_tpl : "qsub {{ script_path_abs }}"
# The exact command used to submit a job to the cluster, where the job is defined in the cluster script located at {{ script_path_abs }}. This string can also use any of the variables defined below that are available inside the cluster script (num_gpu, num_cpu, etc.)

qstat_cmd_tpl : "qstat -as {{ cluster_job_id }}"
# Cluster command that will report back the status of the cluster job with id {{ cluster_job_id }}

qdel_cmd_tpl : "qdel {{ cluster_job_id }}"
# Cluster command that will kill and remove {{ cluster_job_id }} from the queue

qinfo_cmd_tpl : "qstat -q"
# General cluster information command

transfer_cmd_tpl : "scp {{ src_path }} loginnode:{{ dest_path }}"
# Command that can be used to transfer a file {{ src_path }} on the cryosparc master node to {{ dest_path }} on the cluster nodes. This is used when the master node is remotely updating a cluster worker installation. This is optional - if it is incorrect or omitted, you can manually update the cluster worker installation.
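Because cryosparcm cluster connect reads cluster_info.json as plain JSON (which does not allow comments), the actual file contains only the key/value pairs. A minimal sketch for a PBS-style scheduler, using placeholder paths and the example commands above:
{
    "name" : "cluster1",
    "worker_bin_path" : "/path/to/cryosparc_worker/bin/cryosparcw",
    "cache_path" : "/path/to/local/SSD/on/cluster/nodes",
    "send_cmd_tpl" : "{{ command }}",
    "qsub_cmd_tpl" : "qsub {{ script_path_abs }}",
    "qstat_cmd_tpl" : "qstat -as {{ cluster_job_id }}",
    "qdel_cmd_tpl" : "qdel {{ cluster_job_id }}",
    "qinfo_cmd_tpl" : "qstat -q",
    "transfer_cmd_tpl" : "scp {{ src_path }} loginnode:{{ dest_path }}"
}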
Along with the above commands, a complete cluster configuration requires a template cluster submission script. The script should be able to send jobs into your cluster scheduler queue marking them with the appropriate hardware requirements. The cryoSPARC internal scheduler will take care of submitting jobs as their inputs become ready. The following variables are available to be used within a cluster submission script template. Examples of templates, for use as a starting point, can be generated with the commands explained below.
cluster_script.sh

{{ script_path_abs }}    # the absolute path to the generated submission script
{{ run_cmd }}            # the complete command-line string to run the job
{{ num_cpu }}            # the number of CPUs needed
{{ num_gpu }}            # the number of GPUs needed
{{ ram_gb }}             # the amount of RAM needed in GB
{{ job_dir_abs }}        # absolute path to the job directory
{{ project_dir_abs }}    # absolute path to the project dir
{{ job_log_path_abs }}   # absolute path to the log file for the job
{{ worker_bin_path }}    # absolute path to the cryosparc worker command
{{ run_args }}           # arguments to be passed to cryosparcw run
{{ project_uid }}        # uid of the project
{{ job_uid }}            # uid of the job
{{ job_creator }}        # name of the user that created the job (may contain spaces)
{{ cryosparc_username }} # cryosparc username of the user that created the job (usually an email)
Note: The cryoSPARC scheduler does not assume control over GPU allocation when spawning jobs on a cluster. The number of GPUs required is provided as a template variable, but either your submission script or your cluster scheduler itself is responsible for assigning GPU device indices to each job spawned. The actual cryoSPARC worker processes that use one or more GPUs on a cluster will simply begin using device 0, then 1, then 2, etc. Therefore, the simplest way to get GPUs correctly allocated is to ensure that your cluster scheduler or submission script sets the CUDA_VISIBLE_DEVICES environment variable, so that device 0 is always the first GPU that the particular spawned job should use. The example script for pbs clusters (generated as below) shows how to check which GPUs are available at runtime and automatically select the next available device.
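As a starting point, a minimal SLURM-style submission script template might look like the following (a sketch only, not the pbs example that cryosparcm cluster example generates). The SBATCH directives and resource mapping are assumptions to adapt to your site; with --gres=gpu, SLURM typically sets CUDA_VISIBLE_DEVICES for each job, consistent with the note above:
#!/usr/bin/env bash
#SBATCH --job-name=cryosparc_{{ project_uid }}_{{ job_uid }}
#SBATCH --ntasks={{ num_cpu }}
#SBATCH --cpus-per-task=1
#SBATCH --gres=gpu:{{ num_gpu }}
#SBATCH --mem={{ (ram_gb*1000)|int }}M
#SBATCH --output={{ job_log_path_abs }}
#SBATCH --error={{ job_log_path_abs }}

# run the cryoSPARC job command assembled by the master
{{ run_cmd }}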
To actually create or set a configuration for a cluster in cryoSPARC, use the following commands. The example, dump, and connect commands read two files from the current working directory: cluster_info.json and cluster_script.sh.
cryosparcm cluster example <cluster_type>
# dumps out config and script template files to current working directory
# examples are available for pbs and slurm schedulers but others should be very similar

cryosparcm cluster dump <name>
# dumps out existing config and script to current working directory

cryosparcm cluster connect
# connects new or updates existing cluster configuration, reading cluster_info.json and cluster_script.sh from the current directory, using the name from cluster_info.json

cryosparcm cluster remove <name>
# removes a cluster configuration from the scheduler
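Putting it together, a typical registration workflow might look like this (a sketch; the scheduler type is an example):
cryosparcm cluster example slurm      # or pbs; writes template files to the current directory
# edit cluster_info.json and cluster_script.sh to match your site
cryosparcm cluster connect            # registers (or updates) the cluster configuration as a scheduler lane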