Accessing Elm and Copying Data#

| Source | Tool | Description | Links |
|---|---|---|---|
| Local NAS | rclone | Synology and QNAP NAS devices would need enough free space equal to at least the size of the packaged file being created | rclone Download |
| Oak | elm_archive | Using the elm_archive utility, we've streamlined the archival process of data from Sherlock or Oak to Elm storage using Globus | Archiving Data to Elm using elm_archive |
| Sherlock | elm_archive | Using the elm_archive utility, we've streamlined the archival process of data from Sherlock or Oak to Elm storage using Globus | Archiving Data to Elm using elm_archive |
| Local disk | rclone or mc | MinIO Client (mc) or rclone can be installed on Linux, macOS, or Windows | rclone Download; MinIO Client (mc) Download |
| SCG | rclone, mc, or Globus | MinIO Client (mc) or rclone can be installed on Linux, macOS, or Windows; Globus can be used on multiple platforms | rclone Download; MinIO Client (mc) Download; Globus endpoint |
| Google Drive | Globus | Globus can be used on multiple platforms | Globus endpoint |
| Google Cloud Platform | Globus | Globus can be used on multiple platforms | Globus endpoint |

Once you've gained access to your new Elm bucket, refer to the recommendations below for example workflows that suit your needs:

Sherlock to Elm#

Transferring data from Sherlock to Elm can be done by running elm_archive, which is a utility designed to streamline the archival process of data from Sherlock to Elm storage using Globus.

Archiving Data to Elm using elm_archive

Notes for elm_archive:

Issue: Job times out after 2 days

To get around this, you can request a longer allocation:

$ salloc -p [partition] -t 7-0  # 7 days
$ elm_archive transfer -p [partition] -n 16 -t 7-0 /scratch/groups/[group]/[folder] [group]/[archive-path]

Local NAS to Elm (via Sherlock)#

If your source data is not already prepared as tar archives, it's possible to send it to Sherlock $SCRATCH space first, before using the elm_archive tool to pack and transfer it to Elm.

We recommend that you limit transfers to 10-25 TB at a time.

`elm_archive` needs working space in your $SCRATCH space to pack and split your source data into `tar` files before sending them to Elm.
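Before staging a batch, it can help to confirm that the folder stays within the recommended size. The following is a minimal sketch, not part of elm_archive: `check_batch_size` is a hypothetical helper name, and the 25 TiB default is an assumption taken from the upper end of the 10-25 TB guidance above.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of elm_archive): check a staged folder against
# a batch size limit before starting a transfer.
# Usage: check_batch_size <folder> [limit_in_TiB]
check_batch_size() {
    local folder="$1"
    local limit_tib="${2:-25}"   # assumption: upper end of the 10-25 TB guidance
    local size_kib limit_kib

    [ -d "$folder" ] || { echo "not a directory: $folder" >&2; return 2; }

    # du -sk reports the folder's total disk usage in 1024-byte units
    size_kib=$(du -sk "$folder" | awk '{print $1}')
    # 1 TiB = 1024^3 KiB
    limit_kib=$(( limit_tib * 1024 * 1024 * 1024 ))

    if [ "$size_kib" -le "$limit_kib" ]; then
        echo "OK: $folder uses ${size_kib} KiB (limit ${limit_kib} KiB)"
    else
        echo "TOO LARGE: $folder uses ${size_kib} KiB (limit ${limit_kib} KiB)"
        return 1
    fi
}
```

For example, `check_batch_size /scratch/groups/[group]/[folder] 25` would report whether that folder fits in a single batch. On large directory trees `du` can take a while to finish.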

This is a multi-step process that you repeat for each batch of data:

Step 1: NAS to Sherlock SCRATCH

SSH into NAS, then run:

# rsync -rtlvhP --stats /path/to/backup/ [username]@dtn.sherlock.stanford.edu:/scratch/groups/[group]/

Step 2: Sherlock SCRATCH to Elm

SSH into Sherlock, then run:

$ elm_archive transfer -p [partition] -n 16 -t 7-0 /scratch/groups/[group]/[folder] [bucket-name]/[archive-path]

elm_archive will run as a background job, so if you lose connection or need to log out, the job will continue to run. Once the transfer completes, you will receive an email confirmation.

To check the status of your job, run the following command on Sherlock:

$ squeue -u [username]

Step 3: Start Next NAS to Sherlock Transfer

While Step 2 is running, copy the next group of data from the NAS to Sherlock scratch.

Step 4: Clean Up Sherlock Scratch

When elm_archive completes, delete files from scratch:

$ pwd  # CONFIRM YOU ARE IN YOUR SCRATCH DIRECTORY
$ rm -rf [folder]  # BE EXTREMELY CAREFUL WITH THIS COMMAND!
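Because a mistyped path with rm -rf is unrecoverable, one option is to wrap the deletion in a guard that refuses anything outside your scratch directory. This is only a sketch: `safe_scratch_rm` is a hypothetical function name, and the scratch root you pass in is whatever your own scratch path is.

```shell
#!/usr/bin/env bash
# Hypothetical guard: delete a folder only if it resolves to a path strictly
# inside the given scratch root.
# Usage: safe_scratch_rm <scratch_root> <folder_to_delete>
safe_scratch_rm() {
    local root target
    # Resolve both paths to physical absolute paths (symlinks followed)
    root=$(cd "$1" 2>/dev/null && pwd -P) || { echo "bad root: $1" >&2; return 1; }
    target=$(cd "$2" 2>/dev/null && pwd -P) || { echo "no such folder: $2" >&2; return 1; }

    case "$target" in
        # Match only paths below the root, never the root itself
        "$root"/?*) rm -rf -- "$target" ;;
        *) echo "refusing: $target is not inside $root" >&2; return 1 ;;
    esac
}
```

A call such as `safe_scratch_rm /scratch/groups/[group] /scratch/groups/[group]/[folder]` deletes the folder, while a path that resolves outside the root, or the root itself, is rejected.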

Local NAS to Elm#

Elm enforces a strict limit of 2,000 objects per TiB.

It is imperative that you do not transfer large quantities of small files (under 250 MB) without first packing your data into a tar or similar archive. Data moved improperly into Elm will need to be deleted and re-uploaded.
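To see what the object limit means in practice and whether a directory needs packing, a quick check along these lines may help. It is a sketch under two assumptions: the limit is read as an average across your data, and 250 MB is taken as 250,000,000 bytes. `count_small_files` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Average object size implied by 2,000 objects per TiB (1 TiB = 1024^4 bytes)
min_avg_bytes=$(( 1024 ** 4 / 2000 ))
echo "Minimum average object size: ${min_avg_bytes} bytes"   # 549755813 bytes, about 550 MB

# Hypothetical helper: count regular files smaller than 250 MB under a directory.
# Assumption: 250 MB = 250,000,000 bytes; the "c" suffix makes find compare bytes.
count_small_files() {
    find "$1" -type f -size -250000000c | wc -l | tr -d ' '
}
```

If `count_small_files /path/to/backup/` returns more than a handful of files, the data is a candidate for packing into tar archives before upload.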

Use rclone

Step 1: Get Elm Access Keys

  1. Go to campus.elm.stanford.edu:9001

  2. Log in with your SUNet ID (click "Stanford OpenID Connect")

  3. Click "Access Keys" on the left sidebar

  4. Click "Create access key"
  5. Name it: rclone-nas-to-elm (or similar descriptive name)
  6. Important: Download the credentials immediately

  7. Save both the Access Key and Secret Key (take a screenshot or save the download)

Step 2: Download and Install rclone

QNAP and Synology NAS devices run a Linux-based operating system, so you should use the Linux version of the rclone download.

  • Download one of the Linux ARM versions of the rclone executables from the rclone downloads page

  • Extract the rclone executable on the NAS device

  • Run rclone config to set up the remote

Follow these prompts:

| Prompt | Value |
|---|---|
| New remote | n |
| name | elm |
| Storage | choose the number for s3 |
| provider | choose the number for Minio |
| env_auth | 1 (false): enter credentials manually |
| access_key_id | paste your Access Key from Step 1 |
| secret_access_key | paste your Secret Key from Step 1 (won't display as you type) |
| region | press Enter (leave blank) |
| endpoint | https://campus.elm.stanford.edu:9000 |
| (remaining basic options) | press Enter to accept defaults |
| Edit advanced config? | y |
| (options until chunk_size) | press Enter to accept defaults |
| chunk_size | 5Gi (mandatory for Elm) |
| (remaining advanced options) | press Enter to accept defaults |
| Edit advanced config? (second prompt) | n |
| Keep this remote? | y |
| (quit) | q |
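For reference, the answers above should produce a remote definition roughly like the one sketched below. This is an illustration only, assuming rclone's standard INI-style config format: the file name `rclone-elm-example.conf` and the `ELM_RCLONE_CONF` variable are hypothetical, the key values are placeholders, and the file rclone actually writes may contain additional entries.

```shell
#!/usr/bin/env bash
# Write an example remote definition to a separate file
# (deliberately NOT your real rclone config).
conf="${ELM_RCLONE_CONF:-./rclone-elm-example.conf}"   # hypothetical path

cat > "$conf" <<'EOF'
[elm]
type = s3
provider = Minio
env_auth = false
access_key_id = YOUR_ACCESS_KEY
secret_access_key = YOUR_SECRET_KEY
endpoint = https://campus.elm.stanford.edu:9000
chunk_size = 5Gi
EOF

# The file holds credentials, so restrict it to the owner
chmod 600 "$conf"
echo "Wrote example config to $conf"
```

Running `rclone config file` prints the location of the config file rclone is actually using, so you can compare it against this sketch and confirm that chunk_size is set to 5Gi.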

Step 3: Run rclone

  • Test your connection to Elm by running the following "ls" command
# rclone ls elm:[bucket-name]
  • Run copy from NAS to Elm
# rclone copy /path/to/backup/ elm:[bucket-name]/[archive-path] -P --transfers 10

Local Disk to Elm#

Prerequisite: An active Sherlock account

This can be accomplished in a couple of ways:

Way 1: Copy Local Drive Data to Sherlock SCRATCH

Prepare and stage your data on the local drive.

Before packing, confirm that the drive has free space equal to at least the size of the tar archive being created.

Check current free space:

  • Linux/macOS: Open Terminal and run df -h. Look for "Avail" or "Available" for your target drive (e.g., / or /dev/sda1).

  • Windows: Open File Explorer, go to "This PC," right-click the drive, and select "Properties" to see used/free space.

Estimate total data size:

  • Linux/macOS (Best for Directories): du -sh /path/to/drive (e.g., du -sh /) shows the total size of everything on the drive.

  • Windows: Use Storage settings (Settings > System > Storage) or a tool like WinDirStat/TreeSize Free to find the size of folders.

Simulate the tar archive size:

  • For Linux/macOS (Recommended): Run a command to see the size without writing the file:
# tar -cf - /path/to/drive | wc -c

This streams the tar output to wc -c, which counts the bytes without writing the archive to disk, giving you the expected archive size to compare with your free space.

  • For Windows (Manual Estimate): Use the total size from the "Estimate total data size" step above. A tar archive will be roughly the same size as the source data.
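On Linux or macOS, the free-space check and the archive-size estimate can be combined into one comparison. The sketch below assumes a POSIX-style df and tar; `tar_fits` is a hypothetical helper name, not an existing tool.

```shell
#!/usr/bin/env bash
# Hypothetical helper: would a tar archive of <source_dir> fit in the free
# space of the filesystem holding <dest_dir>?
# Usage: tar_fits <source_dir> <dest_dir>
tar_fits() {
    local src="$1" dest="$2" need_bytes avail_kib

    # Stream the archive to wc -c to count its bytes without writing it to disk
    need_bytes=$(tar -cf - -C "$(dirname "$src")" "$(basename "$src")" 2>/dev/null \
        | wc -c | tr -d ' ')

    # df -Pk: POSIX output format in 1024-byte units; column 4 is the available space
    avail_kib=$(df -Pk "$dest" | awk 'NR==2 {print $4}')

    echo "archive needs ${need_bytes} bytes; ${avail_kib} KiB available"
    [ "$need_bytes" -le $(( avail_kib * 1024 )) ]
}
```

The function returns success when the archive would fit, so it can gate the packing step, for example `tar_fits /path/to/data /path/to/staging && echo "enough space"`. Note that it reads all of the source data once, which can be slow for large drives.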

Open a terminal session and run:

# rsync -rtlvhP --stats --chmod=Du=rwx,Fu=rw /volume1/[source]/[folder] [username]@dtn.sherlock.stanford.edu:/scratch/groups/[group]/

Way 2: Run rclone to copy data from the local disk directly to Elm

Prerequisite: rclone setup guide

If you have a bunch of small files, we recommend that you pack up your files into tar archives. This is critical to get the best performance out of the tape technology Elm uses to store your data.
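One way to pack a directory of small files before uploading is sketched below. `pack_dir` is a hypothetical helper name; it stores paths relative to the parent directory and then lists the archive as a basic integrity check. The output archive should be written outside the directory being packed.

```shell
#!/usr/bin/env bash
# Hypothetical helper: pack one directory into a tar archive with relative
# paths, then confirm the archive can be read back.
# Usage: pack_dir <source_dir> <output.tar>
pack_dir() {
    local src="$1" out="$2"

    # -C switches to the parent directory so the archive stores "name/..."
    # instead of the full absolute path
    tar -cf "$out" -C "$(dirname "$src")" "$(basename "$src")" || return 1

    # Listing the archive fails if it is truncated or unreadable
    tar -tf "$out" > /dev/null
}
```

For example, `pack_dir /path/to/source/[folder] /path/to/staging/[folder].tar` produces a single object to upload in place of many small files.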

Preview a sync by doing a safe test with the --dry-run option:

# rclone sync /path/to/source remote:destination --dry-run

Oak to Elm#

Transferring data from Oak to Elm can be done by running elm_archive, which is a utility designed to streamline the archival process of data from Oak to Elm storage using Globus.

Archiving Data to Elm using elm_archive

Notes for elm_archive:

Issue: Job times out after 2 days

To get around this, you can request a longer allocation:

$ salloc -p [partition] -t 7-0  # 7 days
$ elm_archive transfer -p [partition] -n 16 -t 7-0 /scratch/groups/[group]/[folder] [group]/[archive-path]

SCG to Elm#

Elm is not mounted on SCG. However, you can access Elm through its S3 API using many of the same tools, such as rclone, that you would use with common cloud storage vendors. You can also use the Globus connector.

SCG Data Movement

Google Drive to Elm#

Move the data from Google Drive to $SCRATCH on Sherlock or onto your Oak space using Globus. Instructions on how to connect to the Stanford Google Drive collection are on the Stanford Google Drive Globus collection page.

Once your data is on Sherlock or Oak, organize it into your preferred folder structure and create a tar archive:

$ tar -cvf archive_name.tar /path/to/archive

Then use elm_archive (which is on Sherlock) to copy the data to Elm:

Archiving Data to Elm using elm_archive

Stanford Elm Storage (project: campus)

Google Cloud to Elm#

Use Globus to move your data to $SCRATCH on Sherlock or to Oak:

Google Cloud Globus connector

We recommend that you limit transfers to 10-25 TB at a time.

`elm_archive` needs working space in your $SCRATCH space to pack and split your source data into `tar` files before sending them to Elm.

Once your data is on Sherlock or Oak, organize it into your preferred folder structure and create a tar archive:

$ tar -cvf archive_name.tar file_name

Example:

$ tar -cvf ImagingArchiveFebruary2025.tar /scratch/users/vnwong/imaging_data_from_Feb

Archiving Data to Elm using elm_archive