Downloading and Installing CryoSPARC

Downloading and installing the cryosparc_master and cryosparc_worker packages.

Overview

Once you've reviewed the CryoSPARC Installation Prerequisites and have decided on an architecture suitable for your situation, the installation process can be distilled to a few main steps which are further detailed below:

Download the cryosparc_master package
Download the cryosparc_worker package
Install the cryosparc_master package on the master node
Start cryoSPARC for the first time
Create the first administrator user
[Optional but recommended] Set up recurring backup of the CryoSPARC database
If you chose to install CryoSPARC via the single-workstation method, at this point, you are finished installing CryoSPARC. Otherwise, continue.
Log onto a worker node and install the cryosparc_worker package (installing the cryosparc_worker package requires an Nvidia GPU and Nvidia Driver 520.61.05 or newer)
Depending on whether you're installing CryoSPARC on a cluster or on a standalone worker machine, connect the worker node to CryoSPARC via cryosparcm cluster connect or cryosparcw connect
CryoSPARC is now fully installed. You may also choose to add more standalone worker machines at any time by using the cryosparcw connect utility.

You will need at least 15 GB of space to download and install the cryosparc_master and cryosparc_workerpackages

Prepare for Installation

Log into the workstation where you would like to install and run CryoSPARC

ssh <cryosparcuser>@<cryosparc_server>

<cryosparcuser> = cryosparcuser

The username of the account that cryoSPARC is to be installed by.

<cryosparc_server> = uoft

The hostname of the server where cryoSPARC will be installed on.

Select a suitable directory for installation

cryosparc_master/and/or cryosparc_worker/directories will be created inside the selected directory during a subsequent step. Select the directory carefully because the cryosparc_master/and cryosparc_worker/ directories cannot be moved following installation.

The absolute path to the installation directory must not exceed 83 characters (after dereferencing sybolic links in the path, if applicable). Suppose one selected and, if necessary, created the /sw/cryosparc/directory for this purpose:

# enter the selected directory
cd /sw/cryosparc
# confirm the max path length is not exceeded
echo -n $(pwd -P) | wc -c

Download and Extract the `cryosparc_master` and `cryosparc_worker` Packages Into Your Installation Directory

Define your License ID as an environment variable:

LICENSE_ID="your-unique-license-id"

Your unique CryoSPARC license ID

should be included in an e-mail we sent to you
should look similar to the example 682437fb-d6ae-47b8-870b-b530c587da94
can be used for the installation and operation of a single CryoSPARC master instance and all CryoSPARC workers associated with that single master instance

CryoSPARC download or installation commands that include or refer to $LICENSE_ID will fail if the LICENSE_ID variable is not defined in the current shell's environment.

Use `curl` to download the two files into tarball archives:

curl -L https://get.cryosparc.com/download/master-latest/$LICENSE_ID -o cryosparc_master.tar.gz
curl -L https://get.cryosparc.com/download/worker-latest/$LICENSE_ID -o cryosparc_worker.tar.gz

Extract the downloaded files:

tar -xf cryosparc_master.tar.gz cryosparc_master
tar -xf cryosparc_worker.tar.gz cryosparc_worker

Note: After extracting the worker package, you may see a second folder called cryosparc2_worker (note the 2) containing a single version file. This is here for backward compatibility when upgrading from older versions of CryoSPARC and is not applicable for new installations.

You may safely delete the cryosparc2_worker directory.

Install The `cryosparc_master` Package

Follow these instructions to install the master package on the master node. If you are following the Single Workstation setup, once you run the installation command, CryoSPARC will be ready to use.

After installation and startup, the cryosparc_master software exposes a number of network ports for connections from other computers, such as CryoSPARC worker nodes and computers from which users browse the CryoSPARC user interface. You must ensure that these ports cannot be accessed directly from the internet. The CryoSPARC guide contains suggestions on implementing access to the user interface.

If you are installing CryoSPARC on a single workstation, you can use the Single Workstation instructions below that simplify installation to a single command. Otherwise, use the Master Node Only instructions here, and then continue with the further steps to install on worker nodes separately.

Select and monitor the filesystem that hosts the $CRYOSPARC_DB_PATH directory carefully. The CryoSPARC instance will fail if the filesystem is allowed to fill up completely. Recovery from such a failure may be tedious, particularly in the absence of a recent database backup. The directory can be specified during installation with the "--dbpath" parameter of the install.sh command. Otherwise, the default is a directory named cryosparc_database inside the directory where the software tar packages were unpacked.

Single Workstation CryoSPARC Installation

cd cryosparc_master

./install.sh    --standalone \ 
                --license $LICENSE_ID \ 
                --worker_path <worker path> \ 
                --ssdpath <ssd path> \ 
                --initial_email <user email> \
                --initial_username "<login username>" \
                --initial_firstname "<given name>" \
                --initial_lastname "<surname>" \
                [--port <port_number>] \
                [--initial_password <user password>]

Example command execution

./install.sh    --standalone \
                --license $LICENSE_ID \
                --worker_path /u/cryosparc_user/cryosparc/cryosparc_worker \
                --ssdpath /scratch/cryosparc_cache \
                --initial_email "[email protected]" \
                --initial_password "Password123" \
                --initial_username "username" \
                --initial_firstname "FirstName" \
                --initial_lastname "LastName" \
                --port 61000

Glossary/Reference

<worker_path> = /home/cryosparc_user/software/cryosparc/cryosparc_worker

the full path to the worker directory, which was downloaded and extracted in step c)
To get the full path, cd into the cryosparc_worker directory and run the command: pwd -P

<ssd_path> = /scratch/cryosparc_cache

path on the worker node to a writable directory residing on the local SSD (For more information on SSD usage in cryoSPARC, see X)
this is optional, and if omitted, specify the --nossd option to indicate that the worker node does not have an SSD

<initial_email> = [email protected]

login email address for first CryoSPARC webapp account
this will become an admin account in the user interface

<initial_username> = "FirstName LastName"

login username of the initial admin account to be created
ensure the name is quoted

<initial_firstname> = "FirstName"

given name of the initial admin account to be created
ensure the name is quoted

<initial_lastname> = "LastName"

surname of the initial admin account to be created
ensure the name is quoted

<initial_password> = Password123

temporary password that will be created for the <user_email> account
Note that if this argument is not specified, a silent input prompt will be provided

<port_number> = 61000

The base port number for this CryoSPARC instance. Do not install cryosparc master on the same machine multiple times with the same port number - this can cause database errors. Choose the base port number carefully to
- avoid conflicts with other applications
- avoid conflicts with other CryoSPARC instances that may be running on the same computer

Versions of CryoSPARC prior to v4.4.0 also require

--cudapath <cuda path>

where <cuda path> corresponds to the CUDA installation directory, such as

/opt/cuda-11.8, that contains the bin/ and lib64/ subdirectories on the worker node, not the cuda-11.8/bin/ subdirectory.

Master Node CryoSPARC Installation

cd cryosparc_master

./install.sh --license $LICENSE_ID \
             --hostname <master_hostname> \
             --dbpath <db_path> \
             --port <port_number> \
             [--insecure] \
             [--allowroot] \
             [--yes]

Example command execution

./install.sh --license $LICENSE_ID \
             --hostname cryoem.cryosparcserver.edu \
             --dbpath /u/cryosparcuser/cryosparc/cryosparc_database \
             --port 61000

Start CryoSPARC

./bin/cryosparcm start

Create the first user

cryosparcm createuser --email "<user email>" \
                      --password "<user password>" \
                      --username "<login username>" \
                      --firstname "<given name>" \
                      --lastname "<surname>"

Glossary/Reference

--license $LICENSE_ID

the LICENCE_ID variable exported in step a)

<master_hostname> = your.cryosparc.hostname.com

the hostname of the server where the master is to be installed on

<port_number> = 61000

The base port number for this CryoSPARC instance. Do not install cryosparc master on the same machine multiple times with the same port number - this can cause database errors. Choose the base port number carefully to
- avoid conflicts with other applications
- avoid conflicts with other CryoSPARC instances that may be running on the same node

<db_path> = /u/cryosparcuser/cryosparc/cryosparc_database

the absolute path to a folder where the CryoSPARC database is to be installed. Ensure this location is in a readable and writeable location. The folder will be created if it doesn't already exist

<initial_email> = [email protected]

login email address for first CryoSPARC webapp account
this will become an admin account in the user interface

<initial_username> = "FirstName LastName"

login username of the initial admin account to be created
ensure the name is quoted

<initial_firstname> = "FirstName"

given name of the initial admin account to be created
ensure the name is quoted

<initial_lastname> = "LastName"

surname of the initial admin account to be created
ensure the name is quoted

<user_password> = Password123

temporary password that will be created for the <user_email> account

--insecure

[optional] specify this option to ignore SSL certificate errors when connecting to HTTPS endpoints. This is useful if you are behind an enterprise network using SSL injection.

--allowroot

[optional] if you run the installation function with root privileges, it will fail. Specify this option to force CryoSPARC to be installed as the root user.

--yes

[optional] do not ask for any user input confirmations

[Optional] Re-load your `bashrc`. This will allow you to run the `cryosparcm` management script from anywhere in the system:

source ~/.bashrc

[Strongly recommended] Set up recurring backups of the CryoSPARC database

The CryoSPARC database holds metadata, images, plots and logs of jobs that have run. If the database becomes corrupt or lost due to to user error or filesystem issues, work that was done in projects can be lost. This can be recovered in two ways:

In a new/fresh CryoSPARC instance, import the project directories of projects from the original instance. This should resurrect all jobs, outputs, inputs, metadata, etc. Please note: user and instance level settings and configuration will be lost.
Maintain a backup (daily or weekly recommended) of the CryoSPARC database. You can use the backup functionality documented here as well as a cron job or other recurring scheduler to run the backup command.

Access the User Interface

After completing the above, navigate your browser to http://<workstation_hostname>:<base_port_number> to access the CryoSPARC user interface.

Accessing the CryoSPARC User Interface

If you were following the Single Workstation install above, your installation is now complete.

Install The `cryosparc_worker` Package

Log onto the worker node (or a cluster worker node if installing on a cluster) as the cryosparcuser, and run the installation command.

Installing cryosparc_worker requires a Nvidia GPU and Nvidia Driver version 520.61.05 or newer

GPU Worker Node CryoSPARC Installation

cd cryosparc_worker

./install.sh --license $LICENSE_ID [--yes]

Worker Installation Glossary/Reference

--license $LICENSE_ID

The LICENCE_ID exported in the "Export your License ID as an environment variable" step

--yes

[optional] do not ask for any user input confirmations

Versions of CryoSPARC prior to v4.4.0 also require

--cudapath <cuda path>

such as

./install.sh --license $LICENSE_ID --cudapath /opt/cuda-11.8

Path to the CUDA installation directory on the worker node
Note: this path should not be the cuda/bin/ sub directory, but the CUDA directory that contains both bin/ and lib64/ subdirectories

Connecting A Worker Node

Connect A Managed Worker to CryoSPARC

Ensure cryoSPARC is running when connecting a worker node. Log into the worker node and run the connection function (substitute the placeholders with your own configuration parameters; details below):

cd cryosparc_worker

./bin/cryosparcw connect --worker <worker_hostname> \
                         --master <master_hostname> \
                         --port <port_num> \
                         --ssdpath <ssd_path> \
                         [--update] \
                         [--sshstr <custom_ssh_string> ] \
                         [--nogpu] \
                         [--gpus <0,1,2,3> ] \
                         [--nossd] \
                         [--ssdquota <ssd_quota_mb> ] \
                         [--ssdreserve <ssd_reserve_mb> ] \
                         [--lane <lane_name> ] \
                         [--newlane]

Worker Connection Glossary/Reference

--worker <worker_hostname> = worker.cryosparc.hostname.com
# [required] hostname of the worker node to connect to your 
# cryoSPARC instance

--master <master_hostname> = your.cryosparc.hostname.com
# [required] hostname of the server where the master is 
# installed on

--port <port_number> = 61000
# [required] base port number for the cryoSPARC instance you 
# are connecting the worker node to

--ssdpath <ssd_path>
# [optional] path to directory on local SSD
# replace this with --nossd to connect a worker node without an SSD

--update
# [optional] use when updating and existing configuration

--sshstr <custom_ssh_string>
# [optional] custom SSH connection string such as user@hostname 

--cpus 4
# [optional] enable this number of CPU cores 

--nogpu
# [optional] connect worker with no GPUs 

--gpus 0,1,2,3
# [optional] enable specific GPU devices only 

# For advanced configuration, run the gpulist command:
# $ bin/cryosparcw gpulist

# Detected 4 CUDA devices.

#   id           pci-bus  name
#   ---------------------------------------------------------------
#       0      0000:42:00.0  Quadro GV100
#       1      0000:43:00.0  Quadro GV100
#       2      0000:0B:00.0  Quadro RTX 5000
#       3      0000:0A:00.0  GeForce GTX 1080 Ti
#   ---------------------------------------------------------------
# This will list the available GPUs on the worker node, and their 
# corresponding numbers. Use this list to decide which GPUs you wish 
# to enable using the --gpus flag, or leave this flag out to enable all GPUs.

--nossd
# [optional] connect a worker node with no SSD 

--ssdquota <ssd_quota_mb>
# [optional] quota of how much SSD space to use (MB) 

--ssdreserve <ssd_reserve_mb>
# [optional] minimum free space to leave on SSD (MB) 

--lane <lane_name>
# [optional] name of lane to add worker to 

--newlane
# [optional] force creation of a new lane if the lane specified by --lane 
# does not exist

Update a Managed Worker Configuration

To update an existing managed worker configuration, use cryosparcw connect with the --update flag and the field you would like to update.

For example:

cd cryosparc_worker
./bin/cryosparcw connect --worker <worker_hostname> \
                         --master <master_hostname> \
                         --port <port_num> \
                         --update
                         --ssdquota 500000

Connect a Cluster to CryoSPARC

Once the cryosparc_worker package is installed, the cluster must be registered in the CryoSPARC database. Registration is performed with the command

cryosparcm cluster connect

and requires the presence in the current working directory of

the cluster information file cluster_info.json, which contains template strings used to construct cluster commands (e.g., qsub, qstat, qdel etc., or their equivalents for your system)
the cluster script template file cluster_script.sh, which contains a jinja2 template string corresponding to a cluster job script.

CryoSPARC Cluster Integration Script Examples

Creation and/or modification of the cluster_info.json and cluster_script.sh files is only one step in the configuration of a cluster lane for CryoSPARC jobs. To become effective, information in these files needs to be registered in the CryoSPARC database.

For the submission and managment of CryoSPARC cluster jobs, cluster information and script templates are read from the database, not from the cluster_info.json and cluster_script.sh files. The jinja2 template engine renders the actual cluster management commands as well as the submission scripts for each CryoSPARC cluster job.

Variables available in `cluster_info.json`

name: string, required

Unique name for the cluster to be connected (multiple clusters can be connected).
This will be the name of the lane users will see when queuing jobs in CryoSPARC.

worker_bin_path: string, required

Absolute path on cluster nodes to the cryosparcw script.

cache_path: string, optional

Absolute path on cluster nodes that is a writable location on the local SSD on each cluster node. This might be /scratch or similar.
This path must be the same on all cluster nodes. See cryosparcm test workers for instructions on how to verify that CryoSPARC is able to successfully write to the cache path.
If you plan to use the cluster nodes without an SSD, you can omit this field.

cache_reserve_mb:, integer, optional

The size (in MB) to initially reserve for the cache on the SSD. This value is 10GB by default, which means CryoSPARC will always leave at least 10GB of free space on the SSD.

cache_quota_mb: integer, optional

The maximum size (in MB) to use for the cache on the SSD.

send_cmd_tpl: string, required

Used to send a cluster management command to be executed by a cluster node (in case the cryosparc master is not able to directly use cluster management commands).
If your cryosparc master node is able to directly use cluster management commands (e.g., qsub, etc.,) then this string can be just "{{ command }}".

qsub_cmd_tpl: string, required

The cluster management command used to submit a job to the cluster, where the job is defined in the cluster script located at {{ script_path_abs }}.
This string can also use any of the variables defined in cluster_script.sh that are available when the job is scheduled (e.g., cryosparc_username, project_uid, etc.,).

qstat_cmd_tpl: string, required

The cluster management command that will report back the status of cluster job with its ID {{ cluster_job_id }}.

qdel_cmd_tpl: string, required

The cluster management command that will kill and remove the cluster job (using {{ cluster_job_id }}) from the queue.

qinfo_cmd_tpl: string, required

The cluster management command to retrieve general cluster information.

Along with the configuration variables above, a complete cluster configuration requires a template cluster submission script that needs to be prepared in a file named cluster_script.sh. The script must send jobs into your cluster scheduler queue and mark them with the appropriate hardware requirements. The CryoSPARC internal scheduler submits jobs with this script as their inputs become ready. The following variables are available for use used within a cluster submission script template. When starting out, example templates may be generated with the commands explained below.

The Basic Set of Variables Available in `cluster_script.sh`

{{ script_path_abs }}    # absolute path to the generated submission script
{{ run_cmd }}            # complete command-line string to run the job
{{ num_cpu }}            # number of CPUs needed
{{ num_gpu }}            # number of GPUs needed.
{{ ram_gb }}             # amount of RAM needed in GB
{{ job_dir_abs }}        # absolute path to the job directory
{{ project_dir_abs }}    # absolute path to the project dir
{{ job_log_path_abs }}   # absolute path to the log file for the job
{{ worker_bin_path }}    # absolute path to the cryosparc worker command
{{ run_args }}           # arguments to be passed to cryosparcw run
{{ project_uid }}        # uid of the project
{{ job_uid }}            # uid of the job
{{ job_creator }}        # name of the user that created the job (may contain spaces)
{{ cryosparc_username }} # cryosparc username of the user that created the job (usually an email)
{{ job_type }}           # CryoSPARC job type

The CryoSPARC scheduler does not control GPU allocation when spawning jobs on a cluster. The number of GPUs required is provided as a template variable. Either your submission script or your cluster scheduler is responsible for assigning GPU device indices to each job spawned based on the provided variable. The CryoSPARC worker processes that use one or more GPUs on a cluster simply use device 0, then 1, then 2, etc.

The cluster script template can be further customized with additional variables, as described in

Guide: Configuring Custom Variables for Cluster Job Submission Scripts

Register the Cluster Lane in the CryoSPARC Database

The cryosparcm cluster connect command activates a new or modified cluster lane configuration by uploading information from the files cluster_info.json and cluster_script.sh to the CryoSPARC database.

Thecryosparcm cluster connect command attempts reading cluster_info.json and cluster_script.sh from the current working directory.

The cryosparcm cluster connect command will overwrite an existing database record if the existing record's name matches the "name" value inside cluster_info.json.

cryosparcm cluster example <cluster_type>
# dumps out config and script template files to current working directory
# examples are available for pbs and slurm schedulers but others should 
# be very similar

cryosparcm cluster dump <name>
# dumps out existing config and script to current working directory

cryosparcm cluster connect
# connects new or updates existing cluster configuration, 
# reading cluster_info.json and cluster_script.sh from the current directory, 
# using the name from cluster_info.json

cryosparcm cluster remove <name>
# removes a cluster configuration from the scheduler

Test the Registered Cluster Lane

The success of the cluster lane's registration may be tested by queuing a CryoSPARC job to the cluster. For example, one may for this purpose:

clone an existing job that was previously completed while the CryoSPARC instance ran on the current CryoSPARC version and queue the clone
create an Extensive Validation job and queue it

to the newly connected cluster lane

Update a Cluster Lane Configuration

To update an existing cluster integration, call the cryosparcm cluster connect command with the updated cluster_info.json and cluster_script.sh in the current working directory.

Note that the name field from cluster_info.json must be the same in the cluster configuration to update

If you don't already have the cluster_info.json and cluster_script.sh files in your current working directory, you can get them by running the command cryosparcm cluster dump <name>

Add Additional Cluster Lanes

Additional lanes for a cluster can be added to increase options for submission parameters when submitting jobs to the cluster. The following steps can be followed to create a new lane from an existing cluster configuration:

Prerequisites:

a folder containing the cluster setup files - cluster_folder
cluster configuration files - cluster_folder/cluster_info.json and cluster_folder/cluster_script.sh

Steps:

create a new directory cluster_folder_new_lane parallel to cluster_folder
copy the cluster configuration files cluster_folder/cluster_info.json and cluster_folder/cluster_script.sh to cluster_folder_new_lane
change the name field in cluster_folder_new_lane/cluster_info.json to the desired name of the new lane
edit cluster_folder_new_lane/cluster_info.json and cluster_folder_new_lane/cluster_script.sh with any lane configuration changes
set cluster_folder_new_lane as the current working directory
connect the new cluster lane with cryosparcm cluster connect (this must be run with cluster_folder_new_lane as the working directory)

Note: The folder names cluster_folder and cluster_folder_new_lane can be changed to more descriptive names if desired, such as the name of the lane.

PreviousObtaining A License ID NextCryoSPARC Cluster Integration Script Examples

Last updated 3 months ago

Overview

Prepare for Installation

Log into the workstation where you would like to install and run CryoSPARC

Select a suitable directory for installation

Download and Extract the cryosparc_master and cryosparc_worker Packages Into Your Installation Directory

Define your License ID as an environment variable:

Use curl to download the two files into tarball archives:

Extract the downloaded files:

Install The cryosparc_master Package

Single Workstation CryoSPARC Installation

Example command execution

Glossary/Reference

Master Node CryoSPARC Installation

Example command execution

Start CryoSPARC

Create the first user

Glossary/Reference

[Optional] Re-load your bashrc. This will allow you to run the cryosparcm management script from anywhere in the system:

[Strongly recommended] Set up recurring backups of the CryoSPARC database

Access the User Interface

Install The cryosparc_worker Package

GPU Worker Node CryoSPARC Installation

Worker Installation Glossary/Reference

Connecting A Worker Node

Connect A Managed Worker to CryoSPARC

Worker Connection Glossary/Reference

Update a Managed Worker Configuration

Connect a Cluster to CryoSPARC

Variables available in cluster_info.json

The Basic Set of Variables Available in cluster_script.sh

Register the Cluster Lane in the CryoSPARC Database

Test the Registered Cluster Lane

Update a Cluster Lane Configuration

Add Additional Cluster Lanes

Download and Extract the `cryosparc_master` and `cryosparc_worker` Packages Into Your Installation Directory

Use `curl` to download the two files into tarball archives:

Install The `cryosparc_master` Package

[Optional] Re-load your `bashrc`. This will allow you to run the `cryosparcm` management script from anywhere in the system:

Install The `cryosparc_worker` Package

Variables available in `cluster_info.json`

The Basic Set of Variables Available in `cluster_script.sh`