Downloading and Installing CryoSPARC
Downloading and installing the cryosparc_master and cryosparc_worker packages.
Overview
Once you've reviewed the CryoSPARC Installation Prerequisites and have decided on an architecture suitable for your situation, the installation process can be distilled to a few main steps which are further detailed below:
Download the cryosparc_master package
Download the cryosparc_worker package
Install the cryosparc_master package on the master node
Start CryoSPARC for the first time
Create the first administrator user
[Optional but recommended] Set up recurring backup of the CryoSPARC database
If you chose to install CryoSPARC via the single-workstation method, at this point, you are finished installing CryoSPARC. Otherwise, continue.
Log onto a worker node and install the cryosparc_worker package (installing the cryosparc_worker package requires an Nvidia GPU and Nvidia Driver 520.61.05 or newer)
Depending on whether you're installing CryoSPARC on a cluster or on a standalone worker machine, connect the worker node to CryoSPARC via cryosparcm cluster connect or cryosparcw connect
CryoSPARC is now fully installed. You may also choose to add more standalone worker machines at any time by using the cryosparcw connect utility.
You will need at least 15 GB of space to download and install the cryosparc_master and cryosparc_worker packages.
Prepare for Installation
Log into the workstation where you would like to install and run CryoSPARC
<cryosparcuser> = cryosparcuser
The username of the account that will install and run CryoSPARC.
<cryosparc_server> = uoft
The hostname of the server where CryoSPARC will be installed.
Determine where you'd like to install CryoSPARC
<install_path> = /home/cryosparcuser/cryosparc
This will be the root directory where all CryoSPARC code and dependencies will be installed.
Download And Extract The cryosparc_master And cryosparc_worker Packages Into Your Installation Directory
Export your License ID as an environment variable:
<license_id> = 682437fb-d6ae-47b8-870b-b530c587da94
This is the License ID issued to you, which you would have received in an email.
A unique license key is required for each individual CryoSPARC master instance.
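A minimal sketch of the export, using the example <license_id> value from above (substitute your own license ID):

```shell
# Make the license ID available to the download and install steps in this shell session
export LICENSE_ID="682437fb-d6ae-47b8-870b-b530c587da94"
```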
Use curl to download the two files into tarball archives:
Extract the downloaded files:
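Assuming $LICENSE_ID is exported as above, the download and extraction steps together might look like the following sketch (the get.cryosparc.com URLs follow the pattern used by the official distribution; confirm them against your download instructions):

```shell
# Download the master and worker tarballs using your license ID
curl -L https://get.cryosparc.com/download/master-latest/$LICENSE_ID -o cryosparc_master.tar.gz
curl -L https://get.cryosparc.com/download/worker-latest/$LICENSE_ID -o cryosparc_worker.tar.gz

# Extract both packages into the installation directory
tar -xf cryosparc_master.tar.gz cryosparc_master
tar -xf cryosparc_worker.tar.gz cryosparc_worker
```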
Note: After extracting the worker package, you may see a second folder called cryosparc2_worker (note the 2) containing a single version file. This is here for backward compatibility when upgrading from older versions of CryoSPARC and is not applicable for new installations. You may safely delete the cryosparc2_worker directory.
Install The cryosparc_master Package
Follow these instructions to install the master package on the master node. If you are following the Single Workstation setup, once you run the installation command, CryoSPARC will be ready to use.
After installation and startup, the cryosparc_master
software exposes a number of network ports for connections from other computers, such as CryoSPARC worker nodes and computers from which users browse the CryoSPARC user interface. You must ensure that these ports cannot be accessed directly from the internet. The CryoSPARC guide contains suggestions on implementing access to the user interface.
If you are installing CryoSPARC on a single workstation, you can use the Single Workstation instructions below that simplify installation to a single command. Otherwise, use the Master Node Only instructions here, and then continue with the further steps to install on worker nodes separately.
Select and monitor the filesystem that hosts the $CRYOSPARC_DB_PATH
directory carefully. The CryoSPARC instance will fail if the filesystem is allowed to fill up completely. Recovery from such a failure may be tedious, particularly in the absence of a recent database backup. The directory can be specified during installation with the "--dbpath"
parameter of the install.sh
command. Otherwise, the default is a directory named cryosparc_database
inside the directory where the software tar packages were unpacked.
Single Workstation CryoSPARC Installation
Example command execution
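A sketch of the standalone installation command, using the placeholder values defined in the glossary below (flag names follow the install.sh conventions; verify against the help output of the install.sh in your downloaded package):

```shell
# Run from inside the extracted cryosparc_master directory
cd <install_path>/cryosparc_master
./install.sh --standalone \
             --license $LICENSE_ID \
             --worker_path <worker_path> \
             --ssdpath <ssd_path> \
             --initial_email <initial_email> \
             --initial_password <initial_password> \
             --initial_username "<initial_username>" \
             --initial_firstname "<initial_firstname>" \
             --initial_lastname "<initial_lastname>" \
             --port <port_number>
```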
Glossary/Reference
<worker_path> = /home/cryosparc_user/software/cryosparc/cryosparc_worker
The full path to the worker directory, which was downloaded and extracted in step c). To get the full path, cd into the cryosparc_worker directory and run the command: pwd -P
<ssd_path> = /scratch/cryosparc_cache
Path on the worker node to a writable directory residing on the local SSD (for more information on SSD usage in CryoSPARC, see X). This is optional; if omitted, specify the --nossd option to indicate that the worker node does not have an SSD.
<initial_email> = someone@structura.bio
login email address for first CryoSPARC webapp account
this will become an admin account in the user interface
<initial_username> = "FirstName LastName"
login username of the initial admin account to be created
ensure the name is quoted
<initial_firstname> = "FirstName"
given name of the initial admin account to be created
ensure the name is quoted
<initial_lastname> = "LastName"
surname of the initial admin account to be created
ensure the name is quoted
<initial_password> = Password123
Temporary password that will be created for the <initial_email> account. Note that if this argument is not specified, a silent input prompt will be provided.
<port_number> = 39000
The base port number for this CryoSPARC instance. Do not install cryosparc master on the same machine multiple times with the same port number - this can cause database errors. 39000 is the default, which will be used if you don't specify this option.
Versions of CryoSPARC prior to v4.4.0 also require --cudapath <cuda path>, where <cuda path> corresponds to the CUDA installation directory, such as /opt/cuda-11.8, that contains the bin/ and lib64/ subdirectories on the worker node, not the cuda-11.8/bin/ subdirectory.
[Optional] Re-load your bashrc. This will allow you to run the cryosparcm management script from anywhere in the system:
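For example:

```shell
# Re-read your shell startup file so the updated PATH takes effect in this session
source ~/.bashrc
```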
[Strongly recommended] Set up recurring backups of the CryoSPARC database
The CryoSPARC database holds metadata, images, plots and logs of jobs that have run. If the database becomes corrupt or lost due to user error or filesystem issues, work that was done in projects can be lost. This can be recovered in two ways:
In a new/fresh CryoSPARC instance, import the project directories of projects from the original instance. This should resurrect all jobs, outputs, inputs, metadata, etc. Please note: user and instance level settings and configuration will be lost.
Maintain a backup (daily or weekly recommended) of the CryoSPARC database. You can use the backup functionality documented here as well as a cron job or other recurring scheduler to run the backup command.
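For instance, a weekly crontab entry might look like the following sketch (the paths and schedule are illustrative; cryosparcm backup accepts a --dir option for the backup destination):

```shell
# m h dom mon dow  command
# Back up the CryoSPARC database every Sunday at 2:00 AM (illustrative paths)
0 2 * * 0 /home/cryosparcuser/cryosparc/cryosparc_master/bin/cryosparcm backup --dir=/backups/cryosparc
```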
Access the User Interface
After completing the above, navigate your browser to http://<workstation_hostname>:39000
to access the CryoSPARC user interface.
If you were following the Single Workstation install above, your installation is now complete.
Install The cryosparc_worker Package
Log onto the worker node (or a cluster worker node if installing on a cluster) as the cryosparcuser, and run the installation command.
Installing cryosparc_worker requires an Nvidia GPU and Nvidia Driver version 520.61.05 or newer.
GPU Worker Node CryoSPARC Installation
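A sketch of the worker installation command, run from inside the extracted cryosparc_worker directory (verify flags against the install.sh in your package):

```shell
cd <install_path>/cryosparc_worker
./install.sh --license $LICENSE_ID
```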
Worker Installation Glossary/Reference
--license $LICENSE_ID
The LICENSE_ID exported in the "Export your License ID as an environment variable" step
--yes
[optional] Do not ask for any user input confirmations
Versions of CryoSPARC prior to v4.4.0 also require --cudapath <cuda path>, the path to the CUDA installation directory on the worker node, such as /opt/cuda-11.8. Note: this path should not be the cuda/bin/ subdirectory, but the CUDA directory that contains both the bin/ and lib64/ subdirectories.
Connecting A Worker Node
Connect A Managed Worker to CryoSPARC
Ensure CryoSPARC is running when connecting a worker node. Log into the worker node and run the connection function (substitute the placeholders with your own configuration parameters; details below):
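A sketch of the connection command; <worker_hostname> and <master_hostname> are illustrative placeholders for your node hostnames, and --nossd can replace --ssdpath on nodes without an SSD:

```shell
cd <install_path>/cryosparc_worker
./bin/cryosparcw connect \
    --worker <worker_hostname> \
    --master <master_hostname> \
    --port <port_number> \
    --ssdpath <ssd_path>
```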
Worker Connection Glossary/Reference
Update a Managed Worker Configuration
To update an existing managed worker configuration, use cryosparcw connect with the --update flag and the field you would like to update.
For example:
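A sketch of an update that changes a worker's SSD cache quota (the hostnames are placeholders and the --ssdquota value is illustrative):

```shell
cd <install_path>/cryosparc_worker
./bin/cryosparcw connect \
    --worker <worker_hostname> \
    --master <master_hostname> \
    --port <port_number> \
    --update \
    --ssdquota 500000
```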
Connect a Cluster to CryoSPARC
Once the cryosparc_worker
package is installed, the cluster must be registered with the master process. This requires a template for job submission commands and scripts that the master process will use to submit jobs to the cluster scheduler.
To register the cluster, provide CryoSPARC with the following two files and call the cryosparcm cluster connect
command:
cluster_info.json
cluster_script.sh
The first file (cluster_info.json
) contains template strings used to construct cluster commands (e.g., qsub
, qstat
, qdel
etc., or their equivalents for your system). The second file (cluster_script.sh
) contains a template string to construct appropriate cluster submission scripts for your system. The jinja2
template engine is used to generate cluster submission/monitoring commands as well as submission scripts for each job.
Create the files
The following fields are required to be defined as template strings in the configuration of a cluster. Examples for SLURM are given; use any command required for your particular cluster scheduler. Note that parameters listed as "optional" can be omitted or included with their value as null.
See the following page for examples of these files: CryoSPARC Cluster Integration Script Examples
Variables available in cluster_info.json
name: string, required
Unique name for the cluster to be connected (multiple clusters can be connected).

worker_bin_path: string, required
Absolute path on cluster nodes to the cryosparcw script.

cache_path: string, optional
Absolute path on cluster nodes to a writable location on the local SSD of each cluster node. This might be /scratch or similar. This path must be the same on all cluster nodes. See cryosparcm test workers for instructions on how to verify that CryoSPARC is able to successfully write to the cache path. If you plan to use the cluster nodes without an SSD, you can omit this field.

cache_reserve_mb: integer, optional
The size (in MB) to initially reserve for the cache on the SSD. The default is 10,000 MB (10 GB), which means CryoSPARC will always leave at least 10 GB of free space on the SSD.

cache_quota_mb: integer, optional
The maximum size (in MB) to use for the cache on the SSD.

send_cmd_tpl: string, required
Used to send a cluster management command to be executed by a cluster node (in case the CryoSPARC master is not able to directly use cluster management commands). If your CryoSPARC master node is able to directly use cluster management commands (e.g., qsub), this string can be just "{{ command }}".

qsub_cmd_tpl: string, required
The cluster management command used to submit a job to the cluster, where the job is defined in the cluster script located at {{ script_path_abs }}. This string can also use any of the variables defined in cluster_script.sh that are available when the job is scheduled (e.g., cryosparc_username, project_uid, etc.).

qstat_cmd_tpl: string, required
The cluster management command that reports back the status of the cluster job with ID {{ cluster_job_id }}.

qdel_cmd_tpl: string, required
The cluster management command that kills and removes the cluster job (using {{ cluster_job_id }}) from the queue.

qinfo_cmd_tpl: string, required
The cluster management command to retrieve general cluster information.
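Putting the required fields together, a minimal cluster_info.json for a SLURM scheduler might look like the following sketch (the cluster name, worker path, and cache path are illustrative and must be adapted to your site):

```json
{
    "name": "slurmcluster",
    "worker_bin_path": "/path/to/cryosparc_worker/bin/cryosparcw",
    "cache_path": "/scratch/cryosparc_cache",
    "send_cmd_tpl": "{{ command }}",
    "qsub_cmd_tpl": "sbatch {{ script_path_abs }}",
    "qstat_cmd_tpl": "squeue -j {{ cluster_job_id }}",
    "qdel_cmd_tpl": "scancel {{ cluster_job_id }}",
    "qinfo_cmd_tpl": "sinfo"
}
```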
Along with the above commands, a complete cluster configuration requires a template cluster submission script. The script must send jobs into your cluster scheduler queue and mark them with the appropriate hardware requirements. The CryoSPARC internal scheduler submits jobs with this script as their inputs become ready. The following variables are available for use within a cluster submission script template. When starting out, example templates may be generated with the commands explained below.
Note: The CryoSPARC scheduler does not assume control over GPU allocation when spawning jobs on a cluster. The number of GPUs required is provided as a template variable. Either your submission script or your cluster scheduler is responsible for assigning GPU device indices to each job spawned based on the provided variable. The CryoSPARC worker processes that use one or more GPUs on a cluster simply use device 0, then 1, then 2, etc.
Therefore, the simplest way to correctly allocate GPUs is to set the CUDA_VISIBLE_DEVICES environment variable in your cluster scheduler or submission script. Then device 0 is always the first GPU that a running job must use. The example script for pbs clusters (generated as below) shows how to check which GPUs are available at runtime and automatically select the next available device.
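A minimal SLURM cluster_script.sh template might look like the following sketch; the {{ ... }} names are jinja2 template variables (such as num_cpu, num_gpu, ram_gb, and run_cmd), and their exact availability should be checked against the generated example templates:

```shell
#!/usr/bin/env bash
# Sketch of a SLURM submission script template for CryoSPARC cluster jobs
#SBATCH --job-name=cryosparc_{{ project_uid }}_{{ job_uid }}
#SBATCH --ntasks={{ num_cpu }}
#SBATCH --gres=gpu:{{ num_gpu }}
#SBATCH --mem={{ (ram_gb*1000)|int }}M
#SBATCH --output={{ job_log_path_abs }}

{{ run_cmd }}
```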
Load the scripts and register the integration
To create or set a configuration for a cluster in CryoSPARC, use the following commands.
The command cryosparcm cluster connect reads cluster_info.json and cluster_script.sh from the current working directory.
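For example, from the directory containing the two files (cryosparcm cluster example can generate starting templates for common schedulers; the slurm argument here is illustrative):

```shell
# Optionally generate example templates to start from (e.g., for SLURM)
cryosparcm cluster example slurm

# After editing cluster_info.json and cluster_script.sh in the current directory:
cryosparcm cluster connect
```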
For more information on validating a cluster integration, see Guide: Cluster Integration Validation
Update a Cluster Configuration
To update an existing cluster integration, call the cryosparcm cluster connect
commands with the updated cluster_info.json
and cluster_script.sh
in the current working directory.
Note that the name field in cluster_info.json must match the name of the cluster configuration you wish to update.
If you don't already have the cluster_info.json
and cluster_script.sh
files in your current working directory, you can get them by running the command cryosparcm cluster dump <name>
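A typical update sequence might therefore look like this sketch:

```shell
# Retrieve the current configuration files for the cluster named <name>
cryosparcm cluster dump <name>

# Edit cluster_info.json and/or cluster_script.sh, keeping the "name" field
# unchanged, then re-register from the same directory:
cryosparcm cluster connect
```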
Adding Additional Cluster Lanes
Additional lanes for a cluster can be added to increase options for submission parameters when submitting jobs to the cluster. The following steps can be followed to create a new lane from an existing cluster configuration:
Prerequisites:
a folder containing the cluster setup files - cluster_folder
cluster configuration files - cluster_folder/cluster_info.json and cluster_folder/cluster_script.sh
Steps:
create a new directory cluster_folder_new_lane parallel to cluster_folder
copy the cluster configuration files cluster_folder/cluster_info.json and cluster_folder/cluster_script.sh to cluster_folder_new_lane
change the name field in cluster_folder_new_lane/cluster_info.json to the desired name of the new lane
edit cluster_folder_new_lane/cluster_info.json and cluster_folder_new_lane/cluster_script.sh with any lane configuration changes
set cluster_folder_new_lane as the current working directory
connect the new cluster lane with cryosparcm cluster connect (this must be run with cluster_folder_new_lane as the working directory)
restart CryoSPARC with cryosparcm restart
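The steps above can be sketched as a shell sequence (folder names as in the prerequisites; the edit step is described in comments rather than scripted):

```shell
# Create a sibling directory for the new lane and copy the existing configuration
mkdir cluster_folder_new_lane
cp cluster_folder/cluster_info.json cluster_folder/cluster_script.sh cluster_folder_new_lane/

# Edit cluster_folder_new_lane/cluster_info.json: change the "name" field to the
# new lane's name, and make any other lane-specific changes to both files.

# Register the new lane from inside the new directory, then restart CryoSPARC
cd cluster_folder_new_lane
cryosparcm cluster connect
cryosparcm restart
```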
Note: The folder names cluster_folder and cluster_folder_new_lane can be changed to more descriptive names if desired, such as the name of the lane.