Guide: Reduce Database Size (v4.3+)

A guide on reducing the size of large CryoSPARC databases using methods provided by MongoDB.

CryoSPARC uses MongoDB to store records and image data. As a CryoSPARC instance grows, the database files can become quite large and affect the performance of the instance. This guide describes two options for reducing the size of a large database: compacting it in place, or backing it up and restoring it into a new directory.

Option 1: Compact the database

After clearing or deleting significant amounts of data in CryoSPARC, the MongoDB database can be compacted to potentially reduce the size of the database files. The cryosparcm compact command runs MongoDB’s compact administration command on collections in the database. Be aware that the result of MongoDB’s compact varies, and it is not guaranteed to reduce the size of any files in the database.

Please refer to https://www.mongodb.com/docs/v3.6/reference/command/compact for more details on the behaviour of MongoDB’s compact command.

Compact database steps

  1. Allow all running jobs on the instance to finish, then confirm that no jobs are running on the instance

  2. If possible, create a backup of the database with cryosparcm backup (this step can be skipped if there is not enough free disk space to create a backup)

  3. Restart CryoSPARC with cryosparcm restart

  4. Run cryosparcm compact
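The steps above can be sketched as a small shell function. This is a minimal sketch, not an official script: the function name compact_database and the SKIP_BACKUP and CRYOSPARCM variables are hypothetical names introduced here for illustration, and step 1 (no running jobs) still has to be confirmed by hand before calling it.

```shell
#!/bin/sh
# Command used to control CryoSPARC. Overriding it (for example with
# CRYOSPARCM=echo) gives a dry run that only prints the subcommands.
CRYOSPARCM=${CRYOSPARCM:-cryosparcm}

# Hypothetical helper covering steps 2 to 4 of the compact procedure.
compact_database() {
    # Step 2: create a backup unless SKIP_BACKUP=1 was set, which is only
    # appropriate when there is not enough disk space for a backup.
    if [ "${SKIP_BACKUP:-0}" != "1" ]; then
        "$CRYOSPARCM" backup || return 1
    fi
    # Step 3: restart CryoSPARC.
    "$CRYOSPARCM" restart || return 1
    # Step 4: run the compaction.
    "$CRYOSPARCM" compact
}
```

Calling compact_database after sourcing this file runs the three commands in order and stops at the first one that fails.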

The cryosparcm backup command writes all entries and saved files in the database to a single backup file. The size of this backup file is a useful reference for the minimum amount of space the database should occupy.

After running cryosparcm compact, compare the size of the database files with the size of a backup created immediately before or after compaction. If the two sizes are similar, it is likely that no further space can be reclaimed from the database. If the backup is significantly smaller than the database files, cryosparcm restore can be attempted to more forcefully reduce database size.
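The size comparison described above can be sketched with du. The helper names size_kib and compare_sizes are hypothetical, and the default threshold of 150 percent is an arbitrary assumption, since this guide does not define how large a "significant" difference is.

```shell
#!/bin/sh
# Size of a file or directory in KiB, as reported by du.
size_kib() {
    du -sk "$1" | awk '{print $1}'
}

# Print "similar" when the database size is at most THRESHOLD percent of
# the backup size, and "differs" otherwise.
#   $1: database size in KiB
#   $2: backup size in KiB
#   $3: threshold in percent (assumed default: 150)
compare_sizes() {
    db_kib=$1
    backup_kib=$2
    threshold=${3:-150}
    if [ "$backup_kib" -le 0 ]; then
        echo "backup size must be positive" >&2
        return 2
    fi
    # Integer percentage of the database size relative to the backup size.
    if [ $((db_kib * 100 / backup_kib)) -le "$threshold" ]; then
        echo similar
    else
        echo differs
    fi
}
```

A typical call would be compare_sizes "$(size_kib <path_to_database_directory>)" "$(size_kib <path_to_backup_file>)", with both paths filled in for the instance at hand.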

Option 2: Using backup and restore to reduce database size

If cryosparcm compact does not reduce the database size, and a database backup is significantly smaller than the database files after compaction, cryosparcm restore can be used to rewrite the database files and hopefully create a database with a size closer to that of the backup.

Note: Creating a database backup and restoring from that backup will require free disk space to store both the backup and the restored database. A worst-case estimate of the required free space would be double the size of the current database files. Make sure enough space is available before beginning this process.
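The worst-case estimate above (free space of at least double the current database size) can be checked before starting. This is a minimal sketch; the helper names free_kib, size_kib and has_room_for_restore are hypothetical, and it assumes the backup and the restored database are written to the same filesystem.

```shell
#!/bin/sh
# Free space in KiB on the filesystem that holds the given path.
# df -P forces the single-line POSIX output format; column 4 is "Available".
free_kib() {
    df -Pk "$1" | awk 'NR==2 {print $4}'
}

# Size of a file or directory in KiB, as reported by du.
size_kib() {
    du -sk "$1" | awk '{print $1}'
}

# Succeed when the free space is at least twice the database size.
#   $1: database size in KiB
#   $2: free space in KiB
has_room_for_restore() {
    db_kib=$1
    free=$2
    [ "$free" -ge $((db_kib * 2)) ]
}
```

For example, has_room_for_restore "$(size_kib <path_to_database_directory>)" "$(free_kib <path_to_target_filesystem>)" returns success only when the worst-case requirement is met.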

Backup and restore steps

  1. Allow all running jobs on the instance to finish, then confirm that no jobs are running on the instance

  2. Create a backup of the database with cryosparcm backup

  3. Stop CryoSPARC with cryosparcm stop

  4. Create a new (empty) directory for the restored database, e.g. mkdir cryosparc_database_from_backup

  5. Change the CRYOSPARC_DB_PATH environment variable in cryosparc_master/config.sh to the path of the new (empty) database directory. (This can be changed back to the original database directory if anything goes wrong in the next steps.)

  6. Run cryosparcm restore --file=<path_to_backup_file>. This will restore the backup to the new empty directory that was created and pointed to above.

  7. Start CryoSPARC with cryosparcm start and verify the restoration was successful by checking that existing projects and jobs were restored correctly.
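Steps 3 to 7 above can be sketched as follows. This is an illustration under stated assumptions rather than an official script: the function names set_db_path and restore_into_new_dir and the CRYOSPARCM variable are hypothetical, the backup from step 2 is assumed to exist already, and config.sh is assumed to contain a line that starts with export CRYOSPARC_DB_PATH=. The sketch creates and checks the new directory before stopping CryoSPARC so that a bad path is caught while the instance is still running.

```shell
#!/bin/sh
# Command used to control CryoSPARC; CRYOSPARCM=echo gives a dry run.
CRYOSPARCM=${CRYOSPARCM:-cryosparcm}

# Point CRYOSPARC_DB_PATH in config.sh at a new directory.
# The previous file is kept as config.sh.bak so the change can be undone.
set_db_path() {
    config=$1
    new_path=$2
    # Assumption: the variable is set on a line beginning with "export".
    grep -q '^export CRYOSPARC_DB_PATH=' "$config" || return 1
    cp "$config" "$config.bak" || return 1
    awk -v p="$new_path" '
        /^export CRYOSPARC_DB_PATH=/ {
            print "export CRYOSPARC_DB_PATH=\"" p "\""
            next
        }
        { print }
    ' "$config.bak" > "$config"
}

# Restore a backup file into a new, empty database directory.
#   $1: path to the backup file   $2: new database directory
#   $3: path to cryosparc_master/config.sh
restore_into_new_dir() {
    backup_file=$1
    new_db_dir=$2
    config=$3
    # Step 4, done first: create the directory and refuse to reuse a
    # directory that already has content.
    mkdir -p "$new_db_dir" || return 1
    if [ -n "$(ls -A "$new_db_dir")" ]; then
        echo "$new_db_dir is not empty" >&2
        return 1
    fi
    "$CRYOSPARCM" stop || return 1                             # step 3
    set_db_path "$config" "$new_db_dir" || return 1            # step 5
    "$CRYOSPARCM" restore --file="$backup_file" || return 1    # step 6
    "$CRYOSPARCM" start                                        # step 7
}
```

If anything fails after step 5, copying config.sh.bak back over config.sh returns the instance to the original database directory. Checking that projects and jobs were restored correctly remains a manual step.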

After restoring

After a successful database restoration, the old database folder and database backup file can be deleted. The new database directory can also be renamed to its original name, but remember to edit the CRYOSPARC_DB_PATH environment variable in cryosparc_master/config.sh to match.
