Troubleshooting

Overview of common issues and advice on how to resolve them

Unless otherwise noted:

  • Log in to the workstation or remote node where cryosparc2_master is installed.

  • Use the same non-root UNIX user account that runs the cryoSPARC process and was used to install cryoSPARC.

  • Run all commands on this page in a terminal running bash.

Common Issues

Cannot download or install cryoSPARC

Problems with the installation steps are indicated by some of the following error messages:

"Couldn't connect to host"
"Could not resolve host"
{"success": false}
"tar: This does not look like a tar archive"
"Version mismatch! Worker and master versions are not the same. Please update."
"An unexpected error has occurred."

Steps

  • Print the License ID stored in the LICENSE_ID variable of your current shell:

echo $LICENSE_ID

Ensure the output exactly matches the cryoSPARC License ID you were issued over email.

  • Check your machine's connection to cryoSPARC's license servers at get.cryosparc.com with this curl command:

curl https://get.cryosparc.com/checklicenseexists/$LICENSE_ID

You should see the message {"success": true}. If instead you see {"success": false}, your license is not valid, so please check it has been entered correctly.

If you see an error message like "Couldn't connect to host" or "Could not resolve host", check your Internet connection and firewall, or ask your IT department to whitelist the get.cryosparc.com license server domain.
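The license check above can be wrapped in a small helper. This is a minimal sketch, not part of cryoSPARC: the function names interpret_license_response and check_license are hypothetical, and it assumes curl is installed and LICENSE_ID is set in the current shell. It relies only on the {"success": true} and {"success": false} responses described above.

```shell
# Hypothetical helper: map the license server's response to a hint.
# Assumes the response contains "success" followed by true or false,
# as described in the steps above.
interpret_license_response() {
  case "$1" in
    *'"success"'*true*)  echo "license found" ;;
    *'"success"'*false*) echo "license not valid: check that it was entered correctly" ;;
    *)                   echo "unexpected response: check Internet connection and firewall" ;;
  esac
}

# Hypothetical wrapper around the curl command shown above.
check_license() {
  if [ -z "$LICENSE_ID" ]; then
    echo "LICENSE_ID is not set in this shell"
    return 1
  fi
  # -sS hides the progress meter but still prints connection errors;
  # --max-time stops curl from hanging behind a blocking firewall.
  response=$(curl -sS --max-time 15 "https://get.cryosparc.com/checklicenseexists/$LICENSE_ID") || {
    echo "could not reach get.cryosparc.com: check Internet connection and firewall"
    return 1
  }
  interpret_license_response "$response"
}
```

After pasting these definitions into a bash session, run check_license to perform the check.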

Cannot update cryoSPARC

Steps

CryoSPARC does not start or encounters error on startup

This can happen following a fresh install or recent update.

Steps

  • In a command line, run cryosparcm status

  • Check that the output looks like this:

----------------------------------------------------------------------------
CryoSPARC System master node installed at
/home/cryosparcuser/cryosparc2/cryosparc2_master
----------------------------------------------------------------------------
cryosparcm process status:
app RUNNING pid 13480, uptime 16 days, 3:01:21
app_dev STOPPED Not started
command_core RUNNING pid 12495, uptime 16 days, 3:01:22
command_proxy RUNNING pid 17832, uptime 16 days, 3:01:24
command_rtp RUNNING pid 18361, uptime 16 days, 3:01:23
command_vis RUNNING pid 19585, uptime 16 days, 3:01:19
database RUNNING pid 15879, uptime 16 days, 3:01:30
watchdog_dev STOPPED Not started
webapp RUNNING pid 19805, uptime 16 days, 3:01:18
webapp_dev STOPPED Not started
----------------------------------------------------------------------------
global config variables:
export CRYOSPARC_LICENSE_ID="xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"
export CRYOSPARC_MASTER_HOSTNAME="localhost"
export CRYOSPARC_DB_PATH="/home/cryosparcuser/cryosparc2/cryosparc2_database"
export CRYOSPARC_BASE_PORT=39000
export CRYOSPARC_DEVELOP=false
export CRYOSPARC_INSECURE=false
export CRYOSPARC_CLICK_WRAP=true

  • Check that all items under "cryosparcm process status" that do not end in _dev are RUNNING. If any are not, run cryosparcm restart

  • If any non-_dev components still have a status other than RUNNING (such as STOPPED or EXITED) after the restart, check their logs for errors. For example, this command shows the log for the database process:

cryosparcm log database

Any error messages here could indicate specific configuration issues and may require re-installing cryoSPARC.
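To spot stopped components quickly, the status output can be filtered. This is a minimal sketch, assuming the output has the same layout as the example above (a "cryosparcm process status:" line, one process per line, then a dashed line). The function name list_not_running is hypothetical.

```shell
# Hypothetical helper: read `cryosparcm status` output on stdin and print
# every non-_dev process whose state is not RUNNING.
list_not_running() {
  awk '
    /^cryosparcm process status:/ { in_block = 1; next }  # start of the process table
    /^-+$/                        { in_block = 0 }        # a dashed line ends the table
    in_block && NF >= 2 && $1 !~ /_dev$/ && $2 != "RUNNING" { print $1, $2 }
  '
}

# Usage (on the master node):
#   cryosparcm status | list_not_running
```

Each printed line names a process and its state; pass the process name to cryosparcm log to inspect it.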

If at any point you see "No command 'cryosparcm' found" or "command not found: cryosparcm":

  • Check that you are on the master node or workstation where cryosparc2_master is installed

  • Run echo $PATH and check that it contains <installation directory>/cryosparc2_master/bin

$ echo $PATH
/home/cryosparcuser/cryosparc2/cryosparc2_master/bin:/usr/local/cuda-10.1/bin:/usr/local/cuda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
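Reading a long PATH by eye is error-prone, so the check can be scripted. This is a minimal sketch. The function name path_has_dir is hypothetical, and the directory in the usage comment is an example that must be replaced with your own installation directory.

```shell
# Hypothetical helper: succeed if directory $1 is an entry of the
# colon-separated list $2 (for example "$PATH").
path_has_dir() {
  case ":$2:" in
    *":$1:"*) return 0 ;;
    *)        return 1 ;;
  esac
}

# Example with a hypothetical installation directory:
#   path_has_dir "$HOME/cryosparc2/cryosparc2_master/bin" "$PATH" \
#     || export PATH="$HOME/cryosparc2/cryosparc2_master/bin:$PATH"
```

Wrapping both strings in colons makes the match exact, so a parent directory of an entry is not mistaken for the entry itself.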

License error or license not found

Follow the steps in this section when you see error messages that look like this:

"License is invalid."
"License signature invalid."
"Could not find local license file. Please re-establish your connection to the license servers."
"Local license file is expired. Please re-establish your connection to the license servers."
"Token is invalid. Another cryoSPARC instance is running with the same license ID."

Steps

  • Ensure you entered your license correctly during the installation step.

  • Check your Internet connection.

  • Check your machine's connection to cryoSPARC's license servers at get.cryosparc.com with this curl command (substitute <license> with your unique license ID):

curl https://get.cryosparc.com/checklicenseexists/<license>

You should see the message {"success": true}

If instead you see {"success": false}, your license is not valid so please check it has been entered correctly.

If you see an error message like "Couldn't connect to host" or "Could not resolve host", check your Internet connection and firewall, or ask your IT department to whitelist the get.cryosparc.com license server domain.

  • If you see a license ID conflict such as this:

"Another cryoSPARC instance is running with the same license ID."

Check that no ghost processes are running on any node. First stop the running cryoSPARC instance:

cryosparcm stop

For every machine running cryoSPARC (including workers), list all processes associated with cryoSPARC with the following command:

ps -ax | grep cryosparc

If you see any processes that appear associated with cryoSPARC, kill them with the command kill <PID>, where <PID> is the process ID number at the beginning of each line.

For example, to kill the first python process in this process list:

PID TTY STAT TIME COMMAND
12495 ? Sl 76:06 python -c import cryosparc2_command.command_core as serv; serv.start(port=39002)
15879 ? Sl 379:49 mongod --dbpath /home/cryosparcuser/cryosparc2/cryosparc2_database --port 39001 --replSet meteor
17832 ? S 0:02 python -c import cryosparc2_command.command_proxy as serv; serv.start(port=39004)
18361 ? Sl 288:40 python -c import cryosparc2_command.command_rtp as serv; serv.start(port=39005)
19585 ? Sl 46:34 python -c import cryosparc2_command.command_vis as serv; serv.start(port=39003)

Run

kill 12495

Alternatively, if you have the killall program installed, kill all python processes owned by the cryoSPARC user (including any that are unrelated to cryoSPARC) with one of these commands:

killall -u $(whoami) python
# or
sudo killall -u <cryosparcuser> python

If the current Linux user runs cryoSPARC, use the first command. Otherwise use the second command, substituting <cryosparcuser> with the username that runs cryoSPARC.

Then start cryoSPARC on the master node or workstation:

cryosparcm start
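The ps -ax | grep cryosparc command usually lists the grep process itself as well. This minimal sketch extracts only the relevant PIDs so they can be reviewed before anything is killed. The function name cryosparc_pids is hypothetical, and it assumes the PID is the first column, as in the process list above.

```shell
# Hypothetical helper: read `ps -ax` output on stdin and print the PID of
# every line that mentions cryosparc, skipping the grep command itself.
cryosparc_pids() {
  awk '$1 ~ /^[0-9]+$/ && /cryosparc/ && !/grep/ { print $1 }'
}

# Usage: review the list before killing anything.
#   ps -ax | cryosparc_pids
```

The sketch prints PIDs only. Pass them to kill once you have confirmed that they belong to leftover cryoSPARC processes.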

Cannot queue or run job

Follow these steps when the cryoSPARC web interface is up and running normally and jobs may be created but do not run. These error messages may indicate this issue:

"list index out of range"
"Could not resolve hostname ... Name or service not known"

A job that never changes from Queued or Launched status may also indicate this.

Steps

  • Ensure at least one worker is connected to the master. See the Installation page for details.

Visit the /resources page to see what lanes are available

  • Check that all non-development cryoSPARC processes are running with the cryosparcm status command

  • (For master/worker setups) check that SSH is configured between the master and worker machines.

  • Check the command_core log to find any application error messages

cryosparcm log command_core

(press Control + C on the keyboard to exit when finished)

  • Refresh job types and reload cryoSPARC:

cryosparcm cli "refresh_job_types()"
cryosparcm cli "reload()"

  • Force-reinstall master and worker dependencies. This can help when a worker was not correctly installed.

On the workstation or master node run:

cryosparcm forcedeps

On worker nodes run:

cryosparcw forcedeps

  • Restart cryoSPARC:

cryosparcm restart

Then clear the job and re-run it.
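For master/worker setups, the SSH check in the steps above can be scripted. This is a minimal sketch, assuming the master must reach each worker over SSH without an interactive password prompt. The function name check_worker_ssh and the hostnames in the usage comment are hypothetical.

```shell
# Hypothetical helper: report whether the current user can SSH to a worker
# without a password prompt. The hostname is supplied by the caller.
check_worker_ssh() {
  # BatchMode=yes makes ssh fail instead of prompting for a password;
  # ConnectTimeout limits how long an unreachable host is waited for.
  if ssh -o BatchMode=yes -o ConnectTimeout=5 "$1" true; then
    echo "$1: OK"
  else
    echo "$1: SSH failed"
    return 1
  fi
}

# Usage from the master node (worker1 and worker2 are placeholder hostnames):
#   for host in worker1 worker2; do check_worker_ssh "$host"; done
```

Run it as the same user account that runs cryoSPARC, since SSH keys are configured per user.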

Job runs but ends unexpectedly with status "Failed"

When a job fails, its job card in the interface turns red and the bottom of the job log includes an error message with the text "Traceback (most recent call last)".

Common failure reasons include:

  • Invalid or unspecified input slots

  • Invalid or unspecified required parameters, including file/folder paths

  • Incorrectly set up GPU (e.g., running a job on a node without enough GPUs or CUDA drivers not installed)

  • Another process taking up memory on a GPU

  • Cache not set up correctly for a worker

[Figure: Example of failed jobs in a cryoSPARC workspace]

[Figure: Example of an error log entry at the bottom of a Failed job]

Common job failure error messages:

"AssertionError: Child process with PID ... has terminated unexpectedly!"

Common error messages that indicate incorrectly configured GPU drivers:

"cuInit failed: unknown error"
"no CUDA-capable device is detected"
"cuMemHostAlloc failed: OS call failed or operation not supported on this OS"
"cuCtxCreate failed: invalid device ordinal"
"kernel.cu ... error: identifier "__shfl_down_sync" is undefined"

Common error messages that indicate not enough GPU memory:

"cuMemAlloc failed: out of memory"
"cuArrayCreate failed: out of memory"
"cufftAllocFailed"

Steps

  • Ensure a CUDA version in the range ≥9.2, ≤10.2 is installed and running on the workstation or each worker node (CUDA 11 is not supported)

  • Check the GPU configuration on the workstation or node where the job runs. Log in to that machine and navigate to the cryoSPARC worker installation directory. Run the cryosparcw gpulist command:

cd /path/to/cryosparc2_worker
bin/cryosparcw gpulist

  • Run nvidia-smi to check the GPU and CUDA driver status.

    • Check that the CUDA version is within the supported range. cryoSPARC supports CUDA versions ≥9.2, ≤10.2 (CUDA 11 is not supported)

    • Check that no other processes are using GPU memory. cryoSPARC-related processes appear with the process name "python"

[Figure: Example output of the nvidia-smi command, showing CUDA 10.2 and a cryoSPARC python process using ~2GB on GPU 0]

If you don't recognize the processes using GPU memory, run kill <PID>, substituting <PID> with the value under the Processes PID column

  • Check the Job log: Select the Job card in the cryoSPARC interface and press the Spacebar on your keyboard to see the log. Scroll down to the bottom and look for the failure reasons in red

  • Clear the job and select the "Building" status badge on the job card to enter the Job Builder

  • If the job failed with a GPU-related error and multiple GPUs are available, try running the job on a different GPU. Press Queue, switch to the "Run on specific GPU" Queue type and select one or more GPUs

  • If the job failed with "AssertionError: Non-optional inputs from the following input groups and their slots are not connected", expand any input groups connected to the job. Missing required slots appear in red

  • Check the job parameters. To learn about setting specific parameters, hover over or touch the parameter names in the Job Builder to see a description of what they do

[Figure: On-hover description of the "Negative Stain Data" parameter for the "Import Movies" job]

Find the target job type in this guide's Job Reference for more detailed descriptions of expected input slots and parameters.

  • Reduce the box size of extracted particles. Some jobs need to fit thousands of particles in GPU memory at a time, and larger box sizes can exceed GPU memory limits. Either re-extract with a smaller box size or downsample the particles with the Downsample Particles job

  • Refresh job types and reload cryoSPARC:

cryosparcm cli "refresh_job_types()"
cryosparcm cli "reload()"

  • Look for extended error information with the cryosparcm joblog command (press Control + C on the keyboard to exit when finished)

  • On occasion, a job fails due to an error in the cryoSPARC code (a bug). When a bug is discovered, the cryoSPARC team promptly patches the issue and makes the fix available in the next update.

If you find a new bug, see the Additional Help section for advice.
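The CUDA version range from the steps above (≥9.2, ≤10.2) can be checked with a small function. This is a minimal sketch. The function name cuda_version_supported is hypothetical, and the usage comment assumes that nvidia-smi prints a "CUDA Version: X.Y" field in its header. Note that nvidia-smi reports the CUDA version supported by the driver, while nvcc --version reports the installed toolkit version.

```shell
# Hypothetical helper: succeed if a CUDA version string such as "10.2"
# falls in the supported range stated above (>= 9.2 and <= 10.2).
cuda_version_supported() {
  awk -v v="$1" 'BEGIN {
    n = split(v, part, ".")
    if (n < 2 || part[1] !~ /^[0-9]+$/ || part[2] !~ /^[0-9]+$/) exit 1
    num = part[1] * 100 + part[2]          # 10.2 -> 1002, 9.2 -> 902
    exit !(num >= 902 && num <= 1002)
  }'
}

# Usage, assuming nvidia-smi prints "CUDA Version: X.Y" in its header:
#   version=$(nvidia-smi | sed -n 's/.*CUDA Version: *\([0-9.]*\).*/\1/p')
#   cuda_version_supported "$version" || echo "unsupported CUDA version: $version"
```

Only the major and minor components are compared, so a patch suffix such as 10.2.89 is accepted.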

Job stuck or taking a very long time

Cryo-EM datasets are large and can take a long time to process with sub-optimal hardware or parameters. Here are some facilities that cryoSPARC provides for increasing speed and performance.

  • Connect workers with SSD cache enabled. This speeds up processing for extracted particles during 2D Classification, ab-initio reconstruction, refinement and more. Ensure the "Cache particle images on SSD" parameter is enabled under "Compute settings" for the target particle-processing job

  • Some jobs (motion correction, CTF estimation, 2D classification) can run on multiple GPUs. If your hardware supports it, increase the number of GPUs to parallelize over.

[Figure: 2D Classification jobs support particle SSD caching and parallelizing over multiple GPUs.]

  • Extracted particles with large box sizes (relative to their pixel size) take a long time to process. Consider Fourier-cropping (or "binning") the extracted particle blobs with the Downsample Particles job

  • Minimize the number of processes using system resources on the workstation or worker nodes

  • Check for zombie processes on worker machines. The procedure is similar to the steps for "Another cryoSPARC instance is running with the same license ID" in the License error or license not found section

  • Cancel the job, clear it and re-queue it

Additional Help

You can get additional help through the cryoSPARC Discussion Forum where users post questions, tips and bug reports not dealt with above, or by contacting us directly.

Discussion Forum

Search for keywords related to the specific issue on the cryoSPARC Discussion Forum. If no related discussions exist, please create a new post.

Contact Us

If assistance is still required, get in touch with the cryoSPARC team directly via the Questions and Support page.