Guide: CryoSPARC Live Session Data Management

How to manage the data created by your CryoSPARC Live Sessions via the user interface and data management API.

The images in this section depict CryoSPARC ≤v3.3. For CryoSPARC v4.0+, please see: Managing Data

Overview

Live Session Data Management tools are available in CryoSPARC v3.0.0+.

To access the Session Management page, click on the "Manage Data" button at the top of the Browse Sessions Page.

Metadata Available

You will be greeted with a table that shows an overview of all Projects and Live Sessions including:

  • Project and Session UID

  • Project or Session title

  • Status of the Session (e.g., Running, Paused, Marked Completed)

  • File sizes of the following data categories (Projects show the total across all Sessions within that project)

    • Raw import data

    • Motion corrected micrographs

    • Exposure thumbnails

    • Extracted particles

    • Metadata

  • Date and time the Project or Session was created

  • Date and time the session was last Paused (if applicable)

  • Date and time the session was Marked Completed (if applicable)

  • Project actions:

    • Refresh statistics for all sessions within a Project (update file sizes)

    • Download a list of all Project statistics (all files linked to each session, organized into groups)

  • Session actions:

    • Refresh statistics for a particular Session (update file sizes)

    • Navigate to the Live Session interface

    • Additional actions that can be performed on a category of data within a session are described below

Upon first visit to the Data Management page, you will need to click the "Refresh Project Stats" or "Refresh Session Stats" action to populate the file sizes of each category.

You can only perform session actions on sessions that are in Completed status. To mark a session as completed, navigate to the session, and click the Mark Completed button in the Session Information sub-tab under the Configuration tab.

Actions

Project Actions

You can download a list of file paths (JSON format) of all sessions within a project (organized into categories) by clicking on the 'Download Project Stats' button in the 'Action' column.

Session Actions

You can perform actions on all five categories in completed sessions by clicking a cell in the table:

  1. Download a list of file paths (JSON format) for a particular category (such as motion corrected micrographs) via the browser for use in an external archiving utility

    • Available for all categories except for thumbnails

  2. Mark a category as 'archived', 'archiving', or 'active'

    • Available for all categories except for thumbnails

  3. Mark a category as 'deleted' (for when you have used an external tool to delete all files associated with that category)

    • Available for all categories except for thumbnails

  4. Delete data from a particular category (CryoSPARC will delete all files associated with the category)

    • Available for all categories except for raw data

    • A confirmation dialog will be presented before any data is deleted.

Interface Features

Right-clicking any cell other than file size data will present three options:

  1. Refresh table data

  2. Expand all projects (show all Sessions across all Projects)

  3. Collapse all Projects (show only Project totals)

State Change Hooks

As of v3.3, CryoSPARC Live Data Management supports executing a script whenever a datatype's state changes.

To use this feature, add the following environment variables to cryosparc_master/config.sh:

export CRYOSPARC_LIVE_DATA_MANAGEMENT_SCRIPT_ENABLE=true
export CRYOSPARC_LIVE_DATA_MANAGEMENT_SCRIPT_PATH=/abs/path/to/script.sh

Then, restart CryoSPARC (cryosparcm stop && cryosparcm start).
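Before restarting, it can help to sanity-check the path you are about to put in CRYOSPARC_LIVE_DATA_MANAGEMENT_SCRIPT_PATH. The following is a minimal sketch, not part of CryoSPARC: check_hook_script is a hypothetical helper name, and the requirement that the script be executable is an assumption (this guide only states that the path is absolute).

```shell
# Hypothetical helper (not part of CryoSPARC): sanity-check a hook script path.
check_hook_script() {
    script_path=$1
    case "$script_path" in
        /*) ;;  # absolute path, as in the config.sh example
        *)  echo "not an absolute path: $script_path"; return 1 ;;
    esac
    if [ ! -f "$script_path" ]; then
        echo "no such file: $script_path"
        return 1
    fi
    # Assumption: the hook needs the executable bit set to be run.
    if [ ! -x "$script_path" ]; then
        echo "not executable: $script_path"
        return 1
    fi
    echo "ok: $script_path"
}
```

For example, check_hook_script /abs/path/to/script.sh prints a one-line verdict and returns non-zero if any check fails.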

The trigger will execute the script that you've specified and pass the following arguments: project_uid, session_uid, datatype, status

For example, if the "Mark as Archiving" button is clicked for micrographs in P45 S1, the script will be triggered with the arguments 'P45', 'S1', 'micrographs', 'archiving'.
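To see how these arguments arrive, a throwaway hook that only prints them can be run by hand. The sketch below writes a hypothetical script to a temporary file and invokes it with the example values above. It does not contact CryoSPARC, and it assumes the four values are passed positionally in the order listed.

```shell
# Write a throwaway hook (hypothetical) that only prints its four arguments.
hook=$(mktemp)
cat > "$hook" <<'EOF'
#!/bin/sh
echo "project_uid=$1 session_uid=$2 datatype=$3 status=$4"
EOF

# Simulate the call made for "Mark as Archiving" on micrographs in P45 S1.
output=$(sh "$hook" P45 S1 micrographs archiving)
echo "$output"
rm -f "$hook"
```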

Your script can look something like this (the following script reacts to "archiving" state changes, does some work, then updates the status of the datatype to "archived" once completed):

#!/bin/sh

# set path to cryosparcm (adjust to your own installation)
cryosparcm=/fast5/userhome/sarulthasan/software/cryosparc/cryosparc_master/bin/cryosparcm
# get variables
project_uid=$1
session_uid=$2
datatype=$3
status=$4

echo "Running data management script on $CRYOSPARC_MASTER_HOSTNAME:$CRYOSPARC_BASE_PORT"
echo "cryosparcm: $cryosparcm";
echo ""
echo "project_uid: $project_uid";
echo "session_uid: $session_uid";
echo "datatype: $datatype";
echo "status: $status";
echo ""

datatype_size=$(${cryosparcm} rtpcli "get_datatype_size(project_uid = '$project_uid', session_uid = '$session_uid', datatype = '$datatype')")
datatype_filepaths_json=$(${cryosparcm} rtpcli "get_datatype_file_paths(project_uid = '$project_uid', session_uid = '$session_uid', datatype = '$datatype')")

echo "Total size of $datatype datatype in $project_uid $session_uid is $datatype_size bytes"
echo ""
# echo "All $datatype filepaths: "
# echo $datatype_filepaths_json

if [ "$status" = 'archiving' ]; then

    # do something with the filepaths here

    # update the status for this data type
    echo "Changing data management state for $datatype in $project_uid $session_uid to 'archived'"
    ${cryosparcm} rtpcli "change_session_data_management_state(project_uid = '$project_uid', session_uid = '$session_uid', datatype = '$datatype', status = 'archived')"
    RESULT=$?
    if [ $RESULT -eq 0 ]; then
        echo "SUCCESS changed data management state for $datatype in $project_uid $session_uid to 'archived'"
    else
        echo "FAILED changing data management state for $datatype in $project_uid $session_uid to 'archived'. Exiting"
        exit 1;
    fi
fi
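The script above reacts only to "archiving". According to the API description in this guide, change_session_data_management_state calls the script again once it completes, so a hook that sets "archived" will itself be invoked with that status. One way to make this explicit is to dispatch on the status argument. The sketch below is illustrative only: handle_state_change is a hypothetical function name, the echoed actions are placeholders, and the status strings are the states listed under Session Actions.

```shell
# Hypothetical sketch: choose an action from the status argument ($4 in the hook).
handle_state_change() {
    project_uid=$1; session_uid=$2; datatype=$3; status=$4
    case "$status" in
        archiving)
            # Placeholder for real archiving work.
            echo "start archiving $datatype for $project_uid $session_uid" ;;
        archived|deleted|active)
            # Nothing to do; avoids repeating work when the hook is re-triggered.
            echo "no work needed for $datatype ($status)" ;;
        *)
            echo "unrecognized status '$status'" >&2
            return 1 ;;
    esac
}
```

Inside a hook script this would be called as handle_state_change "$@".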

CryoSPARC Live Data Management API

Here is the rest of the CryoSPARC Live Data Management API that is available via cryosparcm rtpcli or cryosparcm icli:

  • get_datatype_file_paths(project_uid, session_uid, datatype)

    • Get all the file paths associated with a specific datatype inside a session as a json dictionary.

  • delete_live_datatype(project_uid, session_uid, datatype, filepaths_to_delete=None, user_id=None, asynchronous=True)

    • Delete a specific datatype inside a session. Alternatively, provide a json dict of file paths to delete.

  • get_datatype_size(project_uid, session_uid, datatype)

    • Get the total size of a datatype inside a session in bytes.

  • update_session_datatype_sizes(project_uid, session_uid)

    • Updates the session's 'data_management' top-level key with the current size of each datatype. Additionally returns the entire size of the session's datatypes.

  • update_all_sessions_datatype_sizes(project_uid)

    • Loops through each session in the project and updates all datatype sizes in each session document.

  • get_data_management_stats(project_uid)

    • Returns a json formatted dictionary that includes the data_management dictionary of all sessions in the project.

  • change_session_data_management_state(project_uid, session_uid, datatype, status)

    • Modify the data management status for a specific datatype inside a session. Note that this function will call trigger_live_data_management_script once completed

  • trigger_live_data_management_script(project_uid, session_uid, datatype, status=None, script_location=None, script_log_path_abs=None)

    • Execute the specified script using the parameters sent. Will log all stdout & stderr into a file. Will only execute if environment variable CRYOSPARC_LIVE_DATA_MANAGEMENT_SCRIPT_ENABLE is set in cryosparc_master/config.sh
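As a rough illustration of calling these functions from a shell, the sketch below wraps one of them. It assumes cryosparcm is on PATH (override the CRYOSPARCM variable otherwise) and passes keyword arguments through rtpcli in the same style as the example hook script in this guide. rtp_call and datatype_size are hypothetical helper names, not part of CryoSPARC.

```shell
# Assumption: cryosparcm is on PATH; set CRYOSPARCM to a full path otherwise.
CRYOSPARCM=${CRYOSPARCM:-cryosparcm}

# Build the call string for an API function that takes
# project_uid, session_uid and datatype.
rtp_call() {
    func=$1; project_uid=$2; session_uid=$3; datatype=$4
    printf "%s(project_uid='%s', session_uid='%s', datatype='%s')" \
        "$func" "$project_uid" "$session_uid" "$datatype"
}

# Print the size in bytes of one datatype in one session.
datatype_size() {
    "$CRYOSPARCM" rtpcli "$(rtp_call get_datatype_size "$1" "$2" "$3")"
}
```

For example, datatype_size P45 S1 micrographs would run get_datatype_size for micrographs in P45 S1. The same rtp_call helper can build calls to get_datatype_file_paths, which takes the same three arguments.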

Enabling Access to the Tool

All CryoSPARC users are able to:

  1. View the data management table and refresh file sizes via the 'Action' column → 'Refresh Project Stats'/'Refresh Session Stats'

  2. Download the JSON list of files for a category within a session or across a project

To mark a category as active/archiving/archived/deleted, or to delete data, users must have access enabled in the admin panel (in the main CryoSPARC web application).

Admin users can click the button in the 'Live Data Management' column to toggle a user's ability to modify Live session data.
