Guide: Migrating your CryoSPARC Instance
A guide to moving CryoSPARC from one location to another.
There may come a time when you want to move your CryoSPARC instance from one location to another. This may be between folders, different network storage locations, or even different host machines entirely. There are four main areas we will focus on:
The paths of any raw particle, micrograph, or movie data imported into CryoSPARC
All CryoSPARC project directories
The CryoSPARC database and its (new) location
The identities/hostnames of compute nodes or the master node and the CryoSPARC binaries
Each of these four areas can be handled in isolation; combined, they amount to a full migration of your CryoSPARC instance.
We will use a combination of the shell and an interactive Python session to complete this migration. You will need access to the master node on which the CryoSPARC system is hosted.
We also recommend creating a database backup before starting anything. More details are at: Setup, Configuration and Management
When raw data is imported into a CryoSPARC project, the data is not copied into the project directory; instead, symlinks are created inside the import job directories pointing to the original data files. Read for more details.
If you're moving data that you used an "Import Particles", "Import Micrographs" or "Import Movies" job to bring into CryoSPARC, you will need to repair these jobs. When CryoSPARC imports these three types of data, it creates symlinks to each file inside the job's imported directory. These symlinks may become broken if the original path to the file no longer exists. You can check the status of the symlinks by running ls -l inside the imported directory of the job.
Note: The "Import Templates" and "Import Volumes" jobs copy the specified files directly into the job directory.
Start up an interactive Python session (for example, with cryosparcm icli)
Use the cli to find all the symlinks for an entire project or a single job.
where:
link_path is the path to the symlink file
link_target is the file the symlink points to
exists indicates if the target file exists
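For intuition, a report with these three fields can be approximated using only the Python standard library. This is a minimal sketch and not CryoSPARC's implementation: the function name list_symlinks is hypothetical, and only the field names link_path, link_target and exists come from this guide.

```python
import os


def list_symlinks(directory):
    """Return one record per symlink found under `directory` (hypothetical helper)."""
    records = []
    for root, dirs, files in os.walk(directory):
        # Symlinks to directories are listed in `dirs`; all other symlinks,
        # including broken ones, are listed in `files`.
        for name in dirs + files:
            link_path = os.path.join(root, name)
            if not os.path.islink(link_path):
                continue
            records.append({
                "link_path": link_path,
                "link_target": os.readlink(link_path),
                # os.path.exists follows the link, so it is False for a broken link.
                "exists": os.path.exists(link_path),
            })
    return records
```

Running such a helper on a job's imported directory and keeping the records where exists is False gives the links that need repair.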
Use the command cli.job_import_replace_symlinks(project_uid, job_uid, prefix_cut, prefix_new), where prefix_cut is the beginning of the link you'd like to cut (e.g. /data/EMPIAR) and prefix_new is what you'd like to replace it with (e.g. /data). This function loops through every file inside the job directory, finds all symlinks, and modifies only those that start with prefix_cut. The function returns the number of links it modified. Below, it is used in a loop to modify all jobs across all projects at once.
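The behaviour described above can be sketched in plain Python. The first function is a standalone illustration of the prefix swap on a directory of symlinks; it is not CryoSPARC's code, and the name retarget_symlinks is hypothetical. The second shows one way the loop might look: it assumes cli is the object available in the interactive session and that jobs is a list of job documents carrying project_uid and uid keys (both key names are assumptions here).

```python
import os


def retarget_symlinks(job_dir, prefix_cut, prefix_new):
    """Standalone illustration of the prefix swap; returns the number of links changed."""
    modified = 0
    for root, dirs, files in os.walk(job_dir):
        for name in dirs + files:
            link_path = os.path.join(root, name)
            if not os.path.islink(link_path):
                continue
            target = os.readlink(link_path)
            if not target.startswith(prefix_cut):
                continue  # leave unrelated links untouched
            new_target = prefix_new + target[len(prefix_cut):]
            # Recreate the link so that it points at the new location.
            os.unlink(link_path)
            os.symlink(new_target, link_path)
            modified += 1
    return modified


def repair_import_jobs(cli, jobs, prefix_cut, prefix_new):
    """Call cli.job_import_replace_symlinks once per job and sum the counts.

    `jobs` is assumed to be an iterable of dicts with 'project_uid' and 'uid' keys.
    """
    total = 0
    for job in jobs:
        total += cli.job_import_replace_symlinks(
            job["project_uid"], job["uid"], prefix_cut, prefix_new
        )
    return total
```

Note that the sketch matches on a plain string prefix, so /data/EMPIAR would also match a target under /data/EMPIAR2; including a trailing slash in both prefixes avoids that.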
If you're moving the projects and their jobs, you will need to point CryoSPARC to the new directory where the projects reside. Jobs inside CryoSPARC are referenced by their location relative to their project directory. This allows a user to specify a new location for the project directory only, rather than for each job.
cryosparcm cli "update_project('PXX', {'project_dir' : '/new/abs/path/PXX'})"
where 'PXX' is the project UID and '/new/abs/path/PXX' is the new directory.
Start up an interactive Python session (for example, with cryosparcm icli)
Execute a MongoDB query to list all project directories
Use the command update_project(project_uid, attrs, operation='$set'), where attrs is a dictionary whose keys correspond to the fields in the project document to update. In the following example, update_project is used in a loop to modify all project directory paths.
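One way such a loop might look is sketched below. It assumes that cli is the object available in the interactive session, that projects is the list of project documents returned by the query in the previous step, and that each document has uid and project_dir fields (the uid field name is an assumption). The helper name move_project_dirs and the old_root and new_root arguments are hypothetical.

```python
def move_project_dirs(cli, projects, old_root, new_root):
    """Point each project under old_root at the same relative path under new_root."""
    updated = []
    for project in projects:
        old_dir = project["project_dir"]
        if not old_dir.startswith(old_root):
            continue  # project lives elsewhere; leave it alone
        new_dir = new_root + old_dir[len(old_root):]
        # Same call as the single-project command shown earlier in this section.
        cli.update_project(project["uid"], {"project_dir": new_dir})
        updated.append((project["uid"], new_dir))
    return updated
```

Returning the list of changes makes it easy to print and review what was updated before restarting any jobs.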
The CryoSPARC database doesn't necessarily have to be in the same location as the CryoSPARC binaries. To move the database, specify the new location in the cryosparc_master/config.sh file.
A copy of the database should only be used with CryoSPARC project directories whose contents have not changed after the database copy has started. Otherwise, CryoSPARC may malfunction and the affected project directory may get corrupted.
Navigate to the cryosparc_master directory
Modify config.sh to contain the new directory path to the database
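Editing config.sh by hand is usually simplest. If you prefer to script the change, the sketch below rewrites a single export line in the file's text. It assumes the database path is set by a line of the form export CRYOSPARC_DB_PATH="..."; check your own config.sh for the exact variable name before relying on this. The helper name set_export is hypothetical.

```python
import re


def set_export(config_text, name, value):
    """Replace, or append if missing, an `export NAME="value"` line in shell config text."""
    new_line = f'export {name}="{value}"'
    pattern = re.compile(rf"^export\s+{re.escape(name)}=.*$", re.MULTILINE)
    if pattern.search(config_text):
        # A function replacement avoids backslash interpretation in the new value.
        return pattern.sub(lambda _match: new_line, config_text)
    separator = "" if (not config_text or config_text.endswith("\n")) else "\n"
    return config_text + separator + new_line + "\n"
```

A typical use would read config.sh into a string, call set_export(text, "CRYOSPARC_DB_PATH", "/new/db/path") with your own path, and write the result back after keeping a copy of the original file.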
If the CryoSPARC instance is located on a shared storage layer and you want to host it on another machine (e.g. server1:39000 → server2:39000), you just need to specify the new hostname to the CryoSPARC master. Read further if the new machine doesn't have access to the same file system.
NOTE: Skip this step if you're using a shared filesystem (e.g. a remote storage server that is hosted on all machines). This means that the CryoSPARC binaries, database and project directories are already accessible on the new machine.
First, follow all previous parts of this guide in order. Read this entire guide thoroughly before starting anything. You will need your old instance started and working, so make sure you still have access to it. Then, follow the install guide to install CryoSPARC normally using the migrated database path. After this step, you are done.
Navigate to the cryosparc_master directory
Modify config.sh to list the new master node
On the new machine, start CryoSPARC. Ensure you start CryoSPARC as the same user.
The information in this section applies to CryoSPARC ≤v3.3. For instructions on moving CryoSPARC project directories in v4.0+, see