Troubleshooting

Overview of common issues and advice on how to resolve them.

Unless otherwise noted:

  • Log in to the workstation or remote node where cryosparc_master is installed.

  • Use the same non-root UNIX user account that runs the CryoSPARC process and was used to install CryoSPARC.

  • Run all commands on this page in a terminal running bash

In v4.0+, you can download error reporting information from within the application. For more details, see: Guide: Download Error Reports

Common Issues

Cannot download or install CryoSPARC

Problems with the installation steps are indicated by some of the following error messages:

"Couldn't connect to host"
"Could not resolve host"
{"success": false}
"tar: This does not look like a tar archive"
"Version mismatch! Worker and master versions are not the same. Please update."
"An unexpected error has occurred."

Steps

  1. Check that your LICENSE_ID environment variable is set correctly with this command:

    echo $LICENSE_ID

    Ensure the output exactly matches the CryoSPARC License ID issued to you over email.

  2. Check your machine's connection to CryoSPARC's license servers at get.cryosparc.com with this curl command:

    curl https://get.cryosparc.com/checklicenseexists/$LICENSE_ID

    You should see the message {"success": true}. If instead you see {"success": false}, your license is not valid, so please check it has been entered correctly. If you see an error message like "Couldn't connect to host" or "Could not resolve host" check your Internet connection, firewall or ensure your IT department has the get.cryosparc.com license server domain whitelisted.
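The response check in step 2 can be scripted. The following is a minimal sketch, not part of CryoSPARC: the function name classify_license_response is hypothetical, and it relies only on the {"success": true} and {"success": false} response bodies described above.

```shell
# Hypothetical helper (not part of CryoSPARC): classify the body returned
# by the checklicenseexists endpoint described above.
classify_license_response() {
  local body
  # Remove all whitespace so '{"success": true}' and '{"success":true}' match alike.
  body=$(printf '%s' "$1" | tr -d '[:space:]')
  case "$body" in
    *'"success":true'*)  echo "valid" ;;
    *'"success":false'*) echo "invalid" ;;
    *)                   echo "unknown" ;;
  esac
}

# Example usage (requires network access and LICENSE_ID to be set):
#   response=$(curl -sS "https://get.cryosparc.com/checklicenseexists/$LICENSE_ID")
#   classify_license_response "$response"
```

An "unknown" result usually means curl printed an error instead of a response body, which points to a connection problem rather than a license problem.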

Cannot update CryoSPARC

Steps

CryoSPARC does not start or encounters error on startup

This can happen following a fresh install or recent update.

Steps

  1. In a command line, run cryosparcm status

  2. Check that the output looks like this

    ----------------------------------------------------------------------------
    CryoSPARC System master node installed at
    /home/cryosparcuser/cryosparc/cryosparc_master
    Current CryoSPARC version: v4.0.0
    ----------------------------------------------------------------------------
    
    CryoSPARC process status:
    
    app                              RUNNING   pid 1223898, uptime 0:51:41
    app_api                          RUNNING   pid 1224512, uptime 0:51:39
    app_api_dev                      STOPPED   Not started
    app_legacy                       STOPPED   Not started
    app_legacy_dev                   STOPPED   Not started
    command_core                     RUNNING   pid 1218914, uptime 0:51:56
    command_rtp                      RUNNING   pid 1221639, uptime 0:51:48
    command_vis                      RUNNING   pid 1220983, uptime 0:51:49
    database                         RUNNING   pid 1217182, uptime 0:52:00
    
    ----------------------------------------------------------------------------
    License is valid
    ----------------------------------------------------------------------------
    
    global config variables:
    
    export CRYOSPARC_LICENSE_ID="xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"
    export CRYOSPARC_MASTER_HOSTNAME="localhost"
    export CRYOSPARC_DB_PATH="/home/cryosparcuser/cryosparc/cryosparc_database"
    export CRYOSPARC_BASE_PORT=39000
    export CRYOSPARC_INSECURE=false
    export CRYOSPARC_CLICK_WRAP=true
  3. Check that all items under "CryoSPARC process status" that do not end in _dev or _legacy are RUNNING. If any are not, run cryosparcm restart

  4. If any non-_dev/non-_legacy components still have a status other than RUNNING (such as STOPPED or EXITED), check their log for errors. For example, this command checks for errors on the database process:

    cryosparcm log database

    (Press Ctrl + C, then q, to stop logging)

  5. If the web interface is inaccessible, check firewall settings to ensure CryoSPARC's base port number (default 39000) is exposed for network access

Any error messages here could indicate specific configuration issues and may require re-installing CryoSPARC.
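To spot stopped components quickly, the status output can be filtered. This is a minimal sketch under the assumption that the process table keeps the "name STATE ..." layout shown in the sample output above. The helper name list_not_running is hypothetical.

```shell
# Hypothetical helper: read `cryosparcm status` output on stdin and print the
# name of every process that is not RUNNING, skipping names that end in
# _dev or _legacy. A line counts as a process entry when its first field is
# a lowercase name and its second field is an all-uppercase state.
list_not_running() {
  awk '$1 ~ /^[a-z_]+$/ && $2 ~ /^[A-Z]+$/ && $1 !~ /(_dev|_legacy)$/ && $2 != "RUNNING" { print $1 }'
}

# Example usage on the master node:
#   cryosparcm status | list_not_running
```

Empty output suggests that all required components are running. Any name that is printed is a candidate for the log check in step 4.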

If at any point you see No command 'cryosparcm' found or command not found: cryosparcm:

  1. Check that you are on the master node or workstation where cryosparc_master is installed

  2. Run echo $PATH and check that it contains <installation directory>/cryosparc_master/bin

    $ echo $PATH
    /home/cryosparcuser/cryosparc/cryosparc_master/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
  3. Reinstall CryoSPARC if the above did not restore the cryosparcm command
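The PATH check in step 2 can also be done without reading the output by eye. A minimal sketch follows. The helper name path_contains is hypothetical, and the directory in the usage comment is the one from the sample output, which may differ on your system.

```shell
# Hypothetical helper: succeed if the given directory is an entry in $PATH.
# Wrapping both sides in colons avoids partial matches such as /usr/bin
# matching /usr/bin/extra.
path_contains() {
  case ":$PATH:" in
    *":$1:"*) return 0 ;;
    *)        return 1 ;;
  esac
}

# Example usage (the directory is an assumption taken from the sample output):
#   path_contains /home/cryosparcuser/cryosparc/cryosparc_master/bin \
#     && echo "cryosparcm directory is on PATH" \
#     || echo "cryosparcm directory is missing from PATH"
```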

'User not found' error when attempting to log in

This error message occurs if the email address field does not match any existing users in your CryoSPARC instance. Use the CryoSPARC command-line to verify the details of your user account and change the email address or password if needed.

  1. Run the following command in your terminal: cryosparcm listusers

  2. If an email address is incorrect (e.g., misspelled or with an extra space at the beginning or end), modify it in the database. Run the following commands:

    • Log into the MongoDB shell: cryosparcm mongo

    • Once in the MongoDB shell, enter the following (replace the incorrect/correct email):

    db.users.update(
      { 'emails.0.address': 'incorrect@domain.eud' },
      { $set: { 'emails.0.address': 'correct@domain.edu' } }
    )
    • Exit the MongoDB shell with exit

  3. If you don't remember your password, reset it with the following command (replace with your email address and new password):

    cryosparcm resetpassword --email "<email address>" --password "<new password>"

Incomplete CryoSPARC shutdown

An incomplete shutdown of CryoSPARC is likely to interfere with subsequent attempts to start CryoSPARC and/or CryoSPARC software updates. Incomplete shutdowns can occur for various reasons, including, but not limited to:

  • unclean shutdown of the computer that runs cryosparc_master processes

  • failed coordination of services by cryosparc_master's supervisord process

Follow this sequence to ensure a complete shutdown of CryoSPARC:

1. Basic shutdown

For CryoSPARC instances that were not configured as a systemd service, run the command

cryosparcm stop

Do not use

cryosparcm stop

for CryoSPARC instances that are controlled by systemd. For such instances, use the appropriate systemctl stop command.

2. Find and, if necessary, terminate "zombie" processes

Confirm that the basic shutdown did not "leave behind" any CryoSPARC-related processes. If the basic shutdown was successful, a suitable ps command should not show any processes for the CryoSPARC instance in question, but processes may be shown if

  • a glitch occurred during the basic shutdown or

  • the computer hosts multiple CryoSPARC instances.

To illustrate what kind of processes one might encounter, here is an example command and its output for a running CryoSPARC v4.4 instance:

$ ps -weo pid,ppid,start,cmd | grep -e cryosparc -e mongo | grep -v grep
 347992       1 16:08:45 python /u/user1/cryosparc/cryosparc_master/deps/anaconda/envs/cryosparc_master_env/bin/supervisord -c /u/user1/cryosparc/cryosparc_master/supervisord.conf
 348133  347992 16:08:51 mongod --auth --dbpath /u/user1/cryosparc/cryosparc_db --port 61571 --oplogSize 64 --replSet meteor --wiredTigerCacheSizeGB 4 --bind_ip_all
 348259  347992 16:08:57 python -c import cryosparc_command.command_core as serv; serv.start(port=61572)
 348325  347992 16:09:04 python -c import cryosparc_command.command_vis as serv; serv.start(port=61573)
 348345  347992 16:09:06 python -c import cryosparc_command.command_rtp as serv; serv.start(port=61575)
 348446  347992 16:09:13 /u/user1/cryosparc/cryosparc_master/cryosparc_app/nodejs/bin/node ./bundle/main.js

This is a simple example. More complex configurations, such as a host with multiple active CryoSPARC instances, may require different ps options and/or grep patterns.

The ps output may include processes that belong to non-CryoSPARC applications or to CryoSPARC instances other than the CryoSPARC instance that you wish to shut down. Parent process identifiers and port numbers in the listed commands can help in attributing processes to a common parent supervisord process. Carefully confirm the purpose and identity of any process before termination.

For the example above, it should be sufficient to kill the supervisord process using the process identifier shown by the ps command

kill 347992

and wait a few seconds for the supervisord process' children to be terminated automatically.

Never use the kill -9 option for mongod processes.

Finally, using another ps command with suitable options, reconfirm that all relevant processes have in fact been terminated.
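Attributing processes to a parent supervisord process can be scripted. The sketch below assumes the column order of the ps command shown above (pid, ppid, start, cmd). The helper name children_of is hypothetical.

```shell
# Hypothetical helper: read "pid ppid ..." lines on stdin (the column order
# produced by `ps -weo pid,ppid,start,cmd`) and print the PID of every
# process whose parent is the given PID.
children_of() {
  awk -v parent="$1" '$2 == parent { print $1 }'
}

# Example usage, with 347992 being the supervisord PID from the sample output.
# Record the child PIDs BEFORE stopping supervisord: a child that survives is
# re-parented and can no longer be found through its original parent PID.
#   children=$(ps -weo pid,ppid,start,cmd | children_of 347992)
#   kill 347992; sleep 5
#   for pid in $children; do
#     kill -0 "$pid" 2>/dev/null && echo "still running: $pid"
#   done
```

Here kill -0 sends no signal. It only tests whether the process still exists and can be signalled by the current user.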

3. Only under certain circumstances, delete "orphaned" socket files

An intact CryoSPARC instance manages the creation and deletion of socket files for mongod and supervisord, like

/tmp/mongodb-39001.sock
/tmp/cryosparc-supervisor-263957c4ac4e8da90abc3d163e3c073c.sock

Filenames differ between CryoSPARC instances, for example based on $CRYOSPARC_DB_PORT.

Socket files should be deleted only under specific circumstances, subject to precautions given below.

Never delete socket files before confirming that associated processes have been terminated as described in the previous step.

The computer may store socket files that belong to non-CryoSPARC applications or to CryoSPARC instances other than the CryoSPARC instance that you wish to shut down. Such socket files may have names similar to the files you wish to delete. Carefully confirm the purpose and identity of each file before any deletion.
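As a starting point for that confirmation, candidate files can be listed without touching them. This is a minimal sketch. The helper name list_candidate_sockets is hypothetical, and the two name patterns are assumptions generalized from the example file names above.

```shell
# Hypothetical helper: list files in a directory (default /tmp) whose names
# follow the two example patterns shown above. It only prints names;
# nothing is deleted.
list_candidate_sockets() {
  local dir="${1:-/tmp}" f
  for f in "$dir"/mongodb-*.sock "$dir"/cryosparc-supervisor-*.sock; do
    # An unmatched glob is left as a literal string, so test for existence.
    [ -e "$f" ] && echo "$f"
  done
  return 0
}

# Example usage:
#   list_candidate_sockets /tmp
```

A listed file is only a candidate. It may belong to another CryoSPARC instance or application, so the precautions above still apply before any deletion.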

License error or license not found

Follow the steps in this section when you see error messages that look like this:

"License is invalid."
"License signature invalid."
"Could not find local license file. Please re-establish your connection to the license servers."
"Local license file is expired. Please re-establish your connection to the license servers."
"Token is invalid. Another CryoSPARC instance is running with the same license ID."

Steps

Run cryosparcm licensestatus. This should result in "License is valid". If you see this error:

ServerError: Authentication failed

Your license ID is not configured correctly. Check the CRYOSPARC_LICENSE_ID entry in cryosparc_master/config.sh.

If you see this error:

WARNING: Could NOT verify active license

The warning includes a list of checks, the last of which indicates the failure. Depending on which check failed, do one of the following:

  • Ensure you entered the license correctly during the installation step.

  • Check your Internet connection

  • Check your machine's connection to CryoSPARC's license servers at get.cryosparc.com with this curl command (substitute <license> with your unique license ID):

curl https://get.cryosparc.com/checklicenseexists/<license>

Look for the message {"success": true}

If instead you see {"success": false}, your license is not valid so please check it has been entered correctly.

If you see an error message like "Couldn't connect to host" or "Could not resolve host" check your Internet connection, firewall or ensure your IT department has the get.cryosparc.com license server domain whitelisted.
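When the curl check runs inside a script, its exit status separates connection problems from license problems. The sketch below uses curl's documented exit codes 6 (could not resolve host) and 7 (failed to connect to host). The helper name explain_curl_exit and the variable license_id are hypothetical.

```shell
# Hypothetical helper: translate a curl exit status into a hint.
# Exit codes 6 and 7 are documented in the curl manual.
explain_curl_exit() {
  case "$1" in
    0) echo "request completed; inspect the response body" ;;
    6) echo "could not resolve host; check DNS and Internet connection" ;;
    7) echo "could not connect to host; check firewall settings" ;;
    *) echo "curl failed with exit code $1; see the curl manual" ;;
  esac
}

# Example usage (license_id is a placeholder variable holding your license ID):
#   curl -sS "https://get.cryosparc.com/checklicenseexists/$license_id"
#   explain_curl_exit $?
```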

If you see a license ID conflict such as

"Another cryoSPARC instance is running with the same license ID."

Follow the complete shutdown procedure before running

cryosparcm start

Cannot queue or run job

Follow these steps when the CryoSPARC web interface is up and running normally and jobs may be created but do not run. These error messages may indicate this issue:

"list index out of range"
"Could not resolve hostname ... Name or service not known"

A job that never changes from Queued or Launched status may also indicate this.

Steps

  1. Ensure at least one worker is connected to the master. See the Installation page for details. Visit Manage > Resources to see what lanes are available

  2. Check that all non-development CryoSPARC processes are running with the cryosparcm status command

  3. (For master/worker setups) check that SSH is configured between the master and worker machines.

  4. Check the command_core log to find any application error messages

    cryosparcm log command_core

    (press Control + C on the keyboard followed by q to exit when finished)

  5. If applicable, check that the cluster submission script is correct

  6. Refresh job types and reload CryoSPARC:

    cryosparcm cli "refresh_job_types()"
    cryosparcm cli "reload()"
  7. Force-reinstall master and worker dependencies. This can help when a worker was not correctly installed. On the workstation or master node run:

    cryosparcm forcedeps

    On worker nodes run:

    cryosparcw forcedeps
  8. Restart CryoSPARC:

    cryosparcm restart
  9. Clear the job and re-run it.

Job stuck in launched status

This indicates that CryoSPARC started the job process but the job encountered an internal error immediately after.

Steps

Check the job log for errors, either in the web interface or from the command line with

cryosparcm joblog PX JY

Substitute PX and JY with the project and job IDs, respectively.

Typically the errors here occur when the worker process cannot connect back to the master. Ensure there is a stable network connection between all machines involved. Ensure CryoSPARC was installed correctly and re-install if necessary.

Job runs but ends unexpectedly with status "Failed"

When a job fails, its job card in the interface turns red and the bottom of the job log includes an error message with the text Traceback (most recent call last).

Common failure reasons include:

  • Invalid or unspecified input slots

  • Invalid or unspecified required parameters, including file/folder paths

  • Incorrectly set up GPU (e.g., running a job on a node without enough GPUs or CUDA drivers not installed)

  • Another process taking up memory on a GPU

  • Cache not set up correctly for a worker

  • Lost connection to cryosparc_master

Common job failure error messages:

"AssertionError: Child process with PID ... has terminated unexpectedly!"
Job is unresponsive - no heartbeat received in 30 seconds.

Common error messages that indicate incorrectly configured GPU drivers:

"cuInit failed: unknown error"
"no CUDA-capable device is detected"
"cuMemHostAlloc failed: OS call failed or operation not supported on this OS"
"cuCtxCreate failed: invalid device ordinal"
kernel.cu ... error: identifier "__shfl_down_sync" is undefined

Common error messages that indicate not enough GPU memory:

"cuMemAlloc failed: out of memory"
"cuArrayCreate failed: out of memory"
"cufftAllocFailed"

Steps

  1. Ensure a supported version of the CUDA toolkit is installed and running on the workstation or each worker node

  2. Check the GPU configuration on the workstation or node where the job runs. Log into that machine and navigate to the CryoSPARC installation directory. Run the cryosparcw gpulist command:

    cd /path/to/cryosparc_worker
    bin/cryosparcw gpulist
  3. Run nvidia-smi to check that no other processes are using GPU memory. CryoSPARC-related processes appear with process name "python"

    If you don't recognize the processes using GPU memory, run kill <PID>, substituting <PID> with the value under the Processes PID column

  4. Clear the job and select the "Build" status badge on the job card to enter the Job Builder

  5. Check the job parameters: To learn about setting specific parameters, hover over or touch the parameter names in the Job Builder to see a description of what they do

  6. Find the target job type in this guide's Job Reference for more detailed descriptions of expected input slots and parameters: All Job Types in CryoSPARC.

  7. Reduce the box size of extracted particles. Some jobs need to fit thousands of particles in GPU memory at a time, and larger box sizes exceed GPU memory limits. Either re-extract with a smaller box size or downsample with the Downsample Particles job.

  8. Refresh job types and reload CryoSPARC:

    cryosparcm cli "refresh_job_types()"
    cryosparcm cli "reload()"
  9. Look for extended error information with the cryosparcm joblog command (press Ctrl + C on the keyboard to exit when finished)

  10. Check the network connection from the worker machine to the master

  11. On occasion, a job fails due to an error in the CryoSPARC code (bug). The CryoSPARC team regularly releases updates and patches with bug fixes. Check for and install the latest update or patch. If you find a new bug, see the Additional Help section for advice.
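Two of the checks above lend themselves to small helpers. The sketch below is illustrative only and both function names are hypothetical. classify_job_error matches a line against the GPU message groups listed in this section. non_python_gpu_processes filters nvidia-smi CSV output on the assumption, taken from step 3, that CryoSPARC processes appear with the name "python".

```shell
# Hypothetical helper: map one error line to the GPU message groups listed
# in this section. Anything unrecognized is reported as "other".
classify_job_error() {
  case "$1" in
    *"out of memory"*|*cufftAllocFailed*)
      echo "gpu-memory" ;;
    *"cuInit failed"*|*"no CUDA-capable device"*|*"cuMemHostAlloc failed"*|*"invalid device ordinal"*|*__shfl_down_sync*)
      echo "gpu-driver" ;;
    *)
      echo "other" ;;
  esac
}

# Hypothetical helper: read "pid, process_name, used_memory" CSV lines on
# stdin and print those whose executable name does not start with "python".
non_python_gpu_processes() {
  awk -F', *' '{ n = split($2, parts, "/"); if (parts[n] !~ /^python/) print }'
}

# Example usage on the worker node:
#   classify_job_error "cuMemAlloc failed: out of memory"
#   nvidia-smi --query-compute-apps=pid,process_name,used_memory \
#     --format=csv,noheader | non_python_gpu_processes
```

A process named "python" is not necessarily a CryoSPARC process, so treat the filter as a first pass and confirm the owner of each PID before terminating anything.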

Job stuck or taking a very long time

Due to their large sizes, cryo-EM datasets can take a long time to process with sub-optimal hardware or parameters. Here are some facilities that CryoSPARC provides for increasing speed/performance.

  • Connect workers with SSD cache enabled. This speeds up processing for extracted particles during 2D Classification, ab-initio reconstruction, refinement and more. Ensure the "Cache particle images on SSD" parameter is enabled under "Compute settings" for the target particle-processing job

  • Some jobs (motion correction, CTF estimation, 2D classification) can run on multiple GPUs. If your hardware supports it, increase the number of GPUs to parallelize over

  • Extracted particles with large box sizes (relative to their pixel size) take a long time to process. Consider Fourier-cropping (or "binning") the extracted particle blobs with the Downsample Particles job

  • Minimize the number of processes using system resources on the workstation or worker nodes

  • Check for zombie processes on worker machines. The process is similar to step 2 (Find and, if necessary, terminate "zombie" processes) of the Incomplete CryoSPARC shutdown section

  • Cancel the job, clear and re-queue
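To see why box size matters, a rough size estimate helps. The sketch below assumes 4 bytes per pixel (32-bit floating point values) and ignores any file or processing overhead, so the numbers are only indicative. The helper name particle_stack_bytes is hypothetical.

```shell
# Rough estimate only: bytes needed for n particle images of box x box
# pixels, ASSUMING 4 bytes per pixel and no overhead.
particle_stack_bytes() {
  local box="$1" n="$2"
  echo $(( box * box * 4 * n ))
}

particle_stack_bytes 512 1000   # prints 1048576000
particle_stack_bytes 256 1000   # prints 262144000
```

Under these assumptions, halving the box size from 512 to 256 pixels reduces the data volume by a factor of four.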

GPU Issues

cudaErrorInsufficientDriver or CUDA_ERROR_UNSUPPORTED_PTX_VERSION

A job fails with errors similar to the following when the Nvidia driver is out-of-date or incompatible with the target GPU or CUDA Toolkit Version that ships with CryoSPARC:

File "/u/cryosparc/cryosparc_worker/cryosparc_compute/jobs/runcommon.py", line 1711, in run_with_except_hook 
    run_old(*args, **kw) 
File "cryosparc_worker/cryosparc_compute/engine/cuda_core.py", line 129, in cryosparc_compute.engine.cuda_core.GPUThread.run 
File "cryosparc_worker/cryosparc_compute/engine/cuda_core.py", line 130, in cryosparc_compute.engine.cuda_core.GPUThread.run 
File "cryosparc_worker/cryosparc_compute/engine/engine.py", line 997, in cryosparc_compute.engine.engine.process.work 
File "cryosparc_worker/cryosparc_compute/engine/engine.py", line 106, in cryosparc_compute.engine.engine.EngineThread.load_image_data_gpu 
File "cryosparc_worker/cryosparc_compute/engine/gfourier.py", line 33, in cryosparc_compute.engine.gfourier.fft2_on_gpu_inplace 
File "/u/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.7/site-packages/skcuda/fft.py", line 102, in __init__ 
    capability = misc.get_compute_capability(misc.get_current_device()) 
File "/u/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.7/site-packages/skcuda/misc.py", line 254, in get_current_device 
    return drv.Device(cuda.cudaGetDevice()) 
File "/u/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.7/site-packages/skcuda/cudart.py", line 767, in cudaGetDevice 
    cudaCheckStatus(status) 
File "/u/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.7/site-packages/skcuda/cudart.py", line 565, in cudaCheckStatus 
    raise e 
skcuda.cudart.cudaErrorInsufficientDriver

Traceback (most recent call last):
    driver.cuLinkAddData(self.handle, input_ptx, ptx, len(ptx),
  File "/home/cryosparcuser/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.8/site-packages/numba/cuda/cudadrv/driver.py", line 352, in safe_cuda_api_call
    return self._check_cuda_python_error(fname, libfn(*args))
  File "/home/cryosparcuser/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.8/site-packages/numba/cuda/cudadrv/driver.py", line 412, in _check_cuda_python_error
    raise CudaAPIError(retcode, msg)
numba.cuda.cudadrv.driver.CudaAPIError: [CUresult.CUDA_ERROR_UNSUPPORTED_PTX_VERSION] Call to cuLinkAddData results in CUDA_ERROR_UNSUPPORTED_PTX_VERSION

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "cryosparc_master/cryosparc_compute/run.py", line 96, in cryosparc_master.cryosparc_compute.run.main
  File "/home/cryosparcuser/cryosparc_worker/cryosparc_compute/jobs/instance_testing/run.py", line 174, in run_gpu_job
    func = mod.get_function("add")
  File "/home/cryosparcuser/cryosparc_worker/cryosparc_compute/gpu/compiler.py", line 256, in get_function
    cufunc = self.get_module().get_function(name)
  File "/home/cryosparcuser/cryosparc_worker/cryosparc_compute/gpu/compiler.py", line 212, in get_module
    linker.add_cu(s, k)
  File "/home/cryosparcuser/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.8/site-packages/numba/cuda/cudadrv/driver.py", line 3022, in add_cu
    self.add_ptx(program.ptx, ptx_name)
  File "/home/cryosparcuser/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.8/site-packages/numba/cuda/cudadrv/driver.py", line 3010, in add_ptx
    raise LinkerError("%s\n%s" % (e, self.error_log))
numba.cuda.cudadrv.driver.LinkerError: [CUresult.CUDA_ERROR_UNSUPPORTED_PTX_VERSION] Call to cuLinkAddData results in CUDA_ERROR_UNSUPPORTED_PTX_VERSION
ptxas application ptx input, line 9; fatal   : Unsupported .version 7.8; current version is '7.3'

To fix this, update the Nvidia driver to at least the minimum driver version noted in Installation Prerequisites. Please follow instructions specific to the worker's Linux distribution to install the Nvidia driver. The latest Nvidia driver is available to download on Nvidia's website.
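The installed driver version can be compared against the required minimum with a short script. This is a minimal sketch that relies on GNU sort -V for version ordering. The helper name version_at_least and the variable MIN_DRIVER are hypothetical, and the minimum itself must be taken from the installation prerequisites.

```shell
# Hypothetical helper: succeed if version $1 is greater than or equal to
# version $2, using the version-aware ordering of GNU `sort -V`.
version_at_least() {
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Example usage on the worker node. MIN_DRIVER is a placeholder; set it to
# the minimum version stated in the installation prerequisites.
#   installed=$(nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n 1)
#   version_at_least "$installed" "$MIN_DRIVER" || echo "driver $installed is too old"
```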

Related Discussion Forum post where a user encountered this error.

undefined symbol: _ZSt28__throw_bad_array_new_lengthv

A job may fail with the following error when running GPU jobs (Patch Motion Correction, 2D Classification) with CryoSPARC v3.3.2 or v3.4.0 on Ubuntu 22+.

Traceback (most recent call last):
  File "cryosparc_worker/cryosparc_compute/run.py", line 72, in cryosparc_compute.run.main
  File "/home/cryosparc/cryosparc/cryosparc_worker/cryosparc_compute/jobs/jobregister.py", line 371, in get_run_function
    runmod = importlib.import_module(".."+modname, __name__)
  File "/home/cryosparc/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.7/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1006, in _gcd_import
  File "<frozen importlib._bootstrap>", line 983, in _find_and_load
  File "<frozen importlib._bootstrap>", line 967, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 677, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 1050, in exec_module
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
  File "cryosparc_worker/cryosparc_compute/jobs/class2D/run.py", line 13, in init cryosparc_compute.jobs.class2D.run
  File "/home/cryosparc/cryosparc/cryosparc_worker/cryosparc_compute/engine/__init__.py", line 8, in <module>
    from .engine import *  # noqa
  File "cryosparc_worker/cryosparc_compute/engine/engine.py", line 9, in init cryosparc_compute.engine.engine
  File "cryosparc_worker/cryosparc_compute/engine/cuda_core.py", line 4, in init cryosparc_compute.engine.cuda_core
  File "/home/cryosparc/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.7/site-packages/pycuda/driver.py", line 62, in <module>
    from pycuda._driver import *  # noqa
ImportError: /home/cryosparc/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.7/site-packages/pycuda/_driver.cpython-37m-x86_64-linux-gnu.so: undefined symbol: _ZSt28__throw_bad_array_new_lengthv

To fix this, set the environment variables CFLAGS="-static-libstdc++" and CXXFLAGS="-static-libstdc++" before recompiling the PyCUDA module with cryosparcw newcuda:

cd /path/to/cryosparc_worker
export CFLAGS="-static-libstdc++"
export CXXFLAGS="-static-libstdc++"
bin/cryosparcw newcuda ~/cryosparc/cuda-11.8.0

SSD Cache Issues

In some circumstances, jobs with "Cache particle images on SSD" enabled fail to complete and report one of the following errors in the job log:

SSD cache needs additional xx B but drive can only be filled up to yy B

SSD cache needs additional xx B but drive has yy B free. CryoSPARC can only free a maximum of zz B. This may indicate that programs other than CryoSPARC are using the SSD.

Cannot allocate space on the SSD to cache /path/to/particles.mrc; other non-CryoSPARC processes are using the cache.

Cannot finish transfer; other programs may be accessing the SSD.

FileNotFoundError: [Errno 2] No such file or directory: '…/…/store-v2/...' → '/scratch/instance_cryosparc:39001/links/...'

You may also observe some of the following symptoms:

  • Slower than expected SSD copy speeds

  • Jobs that use the SSD cache take much longer than they should

  • Jobs spend a very long time waiting on cache files locked by other jobs or waiting for additional space to free up

These issues may occur in long-active CryoSPARC instances with many projects and files on the SSD, and many jobs or other non-CryoSPARC processes running simultaneously. There are several strategies to address these or reduce their likelihood.

Option 1: Check SSD health

SSDs naturally degrade over time, the likelihood of failure increasing with heavy usage. Use a tool such as smartctl to check the SSD. If enough errors have accumulated, the SSD may have to be replaced.

CryoSPARC automatically removes files from the SSD that have not been accessed in a while (more than 30 days by default) each time the SSD cache system runs. If the SSD is very heavily used in particle processing jobs or by other external tools, leaving more free space available may extend its lifetime. This is possible with one or both of these strategies:

  • Reconnect the worker with the --ssdreserve flag set (default 10GB or 10000MB) to ensure CryoSPARC always leaves the given amount of free space on the SSD (will clean out old files to stay above the threshold)

  • Set the CRYOSPARC_SSD_CACHE_LIFETIME_DAYS environment variable in cryosparc_master/config.sh to clean up unused files on the SSD more frequently. The default value is 30 days

Option 2: Ensure no other programs are using the CryoSPARC SSD cache path

CryoSPARC assumes that it has exclusive access to the SSD cache path. If other programs are accessing the cache path at the same time as CryoSPARC, the cache step may fail with a file system-related or OS-level error.

Sharing a cache device with other applications, including other CryoSPARC instances, is not recommended:

  • Performance may suffer due to shared bandwidth usage.

  • The given CryoSPARC instance's caching may be disrupted because the CryoSPARC instance does not prevent use of free space within its own quota by other applications or CryoSPARC instances.

Option 3: Increase or Reduce the SSD Quota

The SSD Quota is the maximum amount of SSD space that CryoSPARC jobs use on a connected worker. CryoSPARC will never use more than this amount of SSD space. A job that requires caching will delete enough older cache files to ensure the quota is not exceeded.

  • If the quota is significantly lower than the total size of the SSD and jobs frequently wait for cache space to free up, consider increasing the quota.

  • If other programs regularly share the SSD with CryoSPARC and jobs fail because the cache drive runs out of space, consider decreasing the quota to reduce the likelihood of this.

Change the quota by re-running the cryosparcw connect command with --ssdquota <amount in MB> and --update arguments.

Option 4: Increase or Reduce the SSD Reserve

The SSD Reserve is the minimum amount of free space that CryoSPARC leaves on the SSD, even if CryoSPARC has not reached its SSD quota. It accounts for potential use of the cache device by other applications. CryoSPARC jobs will delete, if possible, unused files from the cache until this much space is free again, and will not copy files to the SSD until at least this amount of space is free. This setting is intended to preserve SSD health.

  • If CryoSPARC reports slower cache write speeds than expected, or jobs that use the SSD cache take longer than they should, consider increasing the reserve.

  • If jobs frequently wait for cache space to free up, consider decreasing the reserve.

Change the reserve by re-running the cryosparcw connect command with --ssdreserve <amount in MB> (default 10000) and --update arguments.
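Before changing the reserve, it can help to see how the current free space compares to it. The sketch below uses POSIX df output and converts 1024-byte blocks to mebibytes. The helper names free_mb and below_reserve are hypothetical, and /scratch is only the example cache path used on this page.

```shell
# Hypothetical helper: print the free space, in MiB (rounded down), of the
# file system that holds the given path. `df -Pk` prints one line per file
# system with the available 1024-byte blocks in column 4.
free_mb() {
  df -Pk "$1" | awk 'NR == 2 { print int($4 / 1024) }'
}

# Hypothetical helper: succeed if free space is below the reserve (both in MB).
below_reserve() {
  [ "$1" -lt "$2" ]
}

# Example usage; /scratch is an example path and 10000 is the default reserve:
#   below_reserve "$(free_mb /scratch)" 10000 \
#     && echo "free space is below the SSD reserve"
```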

Option 5: Change the SSD cache locking strategy

Applies to SSD caches that run on network file systems such as NFS, GPFS, BeeGFS or Lustre. Note that we strongly recommend local SSD caches for best performance.

By default, CryoSPARC jobs use a file-system level POSIX lock to ensure mutual exclusion between multiple jobs that access the SSD cache simultaneously. These locks can be unreliable on network file systems such as NFS, GPFS, BeeGFS or Lustre. This may lead to errors during the SSD caching step, particularly during copy or symbolic link operations. Error messages with the following format are key symptoms of this:

FileNotFoundError: [Errno 2] No such file or directory: '…/…/store-v2/...' → '/scratch/instance_cryosparc:39001/links/...'

Add the following line to cryosparc_worker/config.sh to use the CryoSPARC master to broker cache access instead:

export CRYOSPARC_CACHE_LOCK_STRATEGY="master"
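To find out whether a cache path sits on a network file system, the file system type can be inspected. This is a minimal sketch. The helper name is_network_fs_type is hypothetical, and the type strings it matches are assumptions about how these file systems are reported on your system, so adjust the list as needed.

```shell
# Hypothetical helper: succeed if a file system type name looks like one of
# the network file systems named above. The exact type strings are an
# assumption and may differ between systems.
is_network_fs_type() {
  case "$1" in
    nfs*|gpfs|beegfs|lustre) return 0 ;;
    *)                       return 1 ;;
  esac
}

# Example usage with GNU df, where -T adds a file system type column;
# /scratch is only an example cache path:
#   fstype=$(df -PT /scratch | awk 'NR == 2 { print $2 }')
#   is_network_fs_type "$fstype" && echo "cache path is on a network file system"
```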

Option 6: Fix database inconsistencies (CryoSPARC ≤4.4)

These instructions apply to CryoSPARC versions v4.4 or older. v4.5+ uses a new cache system that does not require the database.

cryosparc_master uses its MongoDB database to coordinate SSD caching between multiple workers running in parallel. If a job fails unexpectedly during the SSD caching step, this could lead to database inconsistencies which prevent other jobs from proceeding.

To address these, try the following steps:

  1. Ensure no jobs are running in CryoSPARC

  2. In a terminal, enter cryosparcm mongo to enter the interactive database prompt

  3. Enter the following command to check how many records are in an inconsistent state:

    db.cache_files.find({status: {$nin: ['hit', 'miss']}}).length()
  4. If the result is not 0 (zero), enter the following command to fix them

    db.cache_files.updateMany({status: {$nin: ['hit', 'miss']}}, {$set: {status: 'miss'}})
  5. Exit from the database prompt with Ctrl + D

  6. Try re-running the problematic jobs

Option 7: Fully reset the SSD cache system

Fully reset the caching system with the following steps:

  1. Ensure no jobs are running in CryoSPARC

  2. For each connected worker machine:

    • Navigate to the SSD cache directory containing CryoSPARC's cache files (e.g., /scratch/). This path was configured at installation time

    • Look for a directory named instance_<master hostname>:<master port + 1> e.g., instance_localhost:39001

    • Delete this directory and all its contents
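The directory name from the steps above can be derived from the master hostname and base port. The sketch below follows the instance_<master hostname>:<master port + 1> pattern. The helper name cache_instance_dir is hypothetical.

```shell
# Hypothetical helper: build the cache directory name described above,
# instance_<master hostname>:<master port + 1>.
cache_instance_dir() {
  local host="$1" base_port="$2"
  echo "instance_${host}:$(( base_port + 1 ))"
}

cache_instance_dir localhost 39000   # prints instance_localhost:39001
```

Printing the name first and checking that the directory exists under the cache path is a cautious way to confirm the target before deleting anything.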

The following additional reset instructions are required for CryoSPARC v4.4 or older.

  1. In a terminal, enter cryosparcm mongo to enter the interactive database prompt

  2. Enter the following command to clear out the cache records

    db.cache_files.deleteMany({})
  3. Exit from the database prompt with Ctrl + D

  4. Try re-running the problematic jobs

Option 8: Disable SSD cache for the job

If the issue persists after trying any of the above options, consider disabling the "Cache particle images on SSD" parameter for the affected job.

User Interface Error Logging

If you encounter a problem with the user interface in your web browser (e.g., one or more elements of a page are not loading), you can use the following steps to obtain debugging information.

  1. Open the browser console

    • In Chrome, Firefox, Edge, and Safari this can be done by right clicking the page to open the browser context menu and selecting the Inspect option (Inspect Element in Safari).

    • This will open up a “DevTools” panel used for inspecting and debugging in the browser. This panel includes a number of tabs at the top used to display different views. When opened using the context menu, the current view will be the Elements tab. Click on the Console tab directly beside the Elements tab in order to view the web console. This is where errors, warnings, and general information about the page’s JavaScript code can be observed.

    • In order to keep the console clean in production, we disable our development logs. Enable these logs by pasting the following command into the browser console and then pressing the Enter key on your keyboard:

      window.localStorage.setItem('cryosparc_debug', true);

      If the command was submitted correctly, undefined will be logged below it.

    • Now reload the page and all of the development console logs and errors will be visible.

  2. Save Console Output

    Please save console output as a .log file, including the type of browser (Chrome, Firefox, Edge, Safari, etc.) in the file name. The filename should be formatted as console_{browser}.log, e.g., console_chrome.log. Before saving the file, try to reproduce the issue you encountered.

    • Chrome or Edge: Right click anywhere in the console panel to open the context menu and select the Save As... option. This will allow you to save the entire output as a .log file.

    • Firefox: Right click on a console message to open the context menu and select the Save all Messages to File option. This will allow you to save the entire output as a .txt file by default (or optionally as a .log file).

    • Safari: Click and drag the cursor over all items in the console output so that the items are all highlighted blue. You can then right click on any of the highlighted items to open the context menu and select the Save Selected option to save the entire output as a .txt file by default (or optionally as a .log file).

  3. Save Network Output

    Navigate to the Network tab in the DevTools by selecting it from the tabs in the top bar of the panel. If the Network tab is not shown then it is likely hidden in the overflow menu (this appears when there is not enough space to display all of the tab options in the DevTools). Click the overflow menu button represented by two right chevrons (>>) and select the Network option.

    Please include the type of web browser (Chrome, Firefox, Edge, or Safari) in the name of the .har file you are saving. The filename should be formatted as network_{browser}.har, e.g., network_chrome.har.

    • Chrome or Edge: Right click on any of the items in the network request table and then select Save all as HAR with context from the context menu.

      Before saving, make sure that the record button at the top left of the panel is active: it appears as a red circle when active and as a grey circle when inactive. The Preserve log checkbox must also be selected. Then reload the page so that the network panel is populated with all requests made and received by the browser, and try to reproduce the issue you encountered.

    • Firefox: Right click on any of the items in the network request table and then select Save All As HAR from the context menu.

    • Safari: Right click on any of the items in the network request table and then select Export HAR from the context menu.
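
To keep saved files consistent with the naming convention described in steps 2 and 3, a small bash helper along the following lines could generate the expected names. This is a minimal sketch: the function name and its arguments are hypothetical, and the browser name is simply lowercased rather than checked against a fixed list.

```shell
#!/usr/bin/env bash
# Sketch: build report file names that follow the convention in this guide.
# Assumption: the function name and its arguments are illustrative only.

# Usage: report_filename <console|network> <browser>
report_filename() {
    local kind="$1" browser
    # Lowercase the browser name so that "Chrome" becomes "chrome".
    browser="$(printf '%s' "$2" | tr '[:upper:]' '[:lower:]')"
    if [ -z "$browser" ]; then
        echo "Browser name is required" >&2
        return 1
    fi
    case "$kind" in
        console) printf 'console_%s.log\n' "$browser" ;;
        network) printf 'network_%s.har\n' "$browser" ;;
        *) echo "Unknown report type: $kind" >&2; return 1 ;;
    esac
}

# Example: print the expected name for a network capture saved from Chrome.
report_filename network Chrome
```

A file that the browser saved under a default name (for example, a .txt console export from Firefox) can then be renamed with mv to the name printed by report_filename console Firefox.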

Additional Help

For topics not covered above, get additional help through the CryoSPARC Discussion Forum.

If no related discussions exist, please create a new post. Review our Troubleshooting Guidelines for items to include in your post.
