Psst, remember the cheatsheet!

Moving data#

Local computer <-> supercomputer#

Puhti Web Interface#

Graphical data transfer tools on local computer#

  • For example: FileZilla, WinSCP and CyberDuck

  • For medium amounts of data, < 1 TB.

  • Easy drag-and-drop for moving files, but installation is required.

  • WinSCP is slower than the others.

  • CSC Docs: Graphical data transfer tools

[Image: FileZilla]

Command line tools on local computer#

  • For any amount of data; practically required if the data size is > 1 TB.

  • Requires knowing the commands.

scp#

  • The most common Linux tool for copying files between computers.

  • scp also works in Windows PowerShell.

  • CSC Docs: scp

# One file:
scp /path/to/a_file cscusername@puhti.csc.fi:/scratch/project_200xxxx/data_dir

# One folder:
scp -r /path/to/directory cscusername@puhti.csc.fi:/scratch/project_200xxxx/directory 

rsync#

  • Best for big data transfers: skips files that already exist unchanged at the destination and can resume a transfer that was interrupted.

  • Can help avoid accidental overwrites, for example with --ignore-existing or a --dry-run preview.

  • Available on Linux, Mac and Windows Subsystem for Linux (WSL).

  • Windows PowerShell does not have rsync. MobaXterm has rsync, but it removes the write permissions of copied files.

  • CSC Docs: rsync

# One file:
rsync --info=progress2 -a /path/to/a_file cscusername@puhti.csc.fi:/scratch/project_200xxxx/data_dir

# One folder:
rsync --info=progress2 -a /path/to/directory cscusername@puhti.csc.fi:/scratch/project_200xxxx/directory
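  • --info=progress2 shows the overall percentage done and the estimated time left; -a (archive mode) copies directories recursively and preserves permissions and timestamps.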
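
The resume behaviour described above can be tried safely before any data moves. The following is a minimal sketch, not a CSC-provided script: the `run` and `upload` helpers, the `DRY_RUN` variable and the `data_dir` target are hypothetical names, and the username and project number are the same placeholders as in the examples above. With `DRY_RUN=1` (the default here) the script only prints the rsync command; `--partial` is a standard rsync option that keeps partially transferred files, so an interrupted copy can continue from them.

```shell
#!/usr/bin/env bash
# Sketch: wrap rsync so the command can be inspected before it is run.
# All names and values below are placeholders / assumptions.

CSC_USER="cscusername"          # placeholder CSC username
PROJECT="project_200xxxx"       # placeholder project number
REMOTE="${CSC_USER}@puhti.csc.fi:/scratch/${PROJECT}"

# Print the command when DRY_RUN=1 (the default), run it otherwise.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# $1 = local file or directory, $2 = target name under the project's scratch
upload() {
    run rsync --info=progress2 -a --partial "$1" "${REMOTE}/$2"
}

upload /path/to/directory data_dir
```

Setting `DRY_RUN=0` in the environment makes the same script execute the printed command instead of only showing it.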

Firewall limitations

Some organizations, for example research institutes whose IT services are provided by Valtori, have stricter firewall rules and need to use a proxy to connect to CSC. Ask your IT service or other Puhti users in your organization for additional guidelines.

External data services -> supercomputer#

  • When downloading from external services, try to download directly to CSC, not via your local computer.

  • Check what APIs/tools the service supports:

    • Standard APIs: OGC APIs, STAC

    • Custom service APIs

    • ftp, rsync

    • wget/curl if HTTP URLs are available

wget#

# One file:
wget http://wwwd3.ymparisto.fi/d3/gis_data/spesific/syvyyskayra.zip 

# One folder (-r: recursive, -nc: skip files that already exist locally):
wget -r -nc --cut-dirs=2 ftp://ftp.aineistot.metsaan.fi/Metsamaski/Maakunta/
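
Where a recursive download ends up locally can be surprising. As a rough illustration (a model of the documented behaviour of wget's `--cut-dirs` option, not wget's own code), the hypothetical helper `local_path` below computes the path under which a remote file would be saved. The file name `file.zip` is made up for the example.

```shell
#!/usr/bin/env bash
# Sketch: model how `wget -r --cut-dirs=N` maps a remote URL to a local path.
# local_path and file.zip are hypothetical, for illustration only.

# $1 = URL, $2 = number of leading remote directories to drop
local_path() {
    rest=${1#*://}        # strip the scheme (ftp://, http://)
    host=${rest%%/*}      # host name; wget -r creates a directory for it
    path=${rest#*/}       # remote path without the leading slash
    i=0
    while [ "$i" -lt "$2" ]; do
        case $path in
            */*) path=${path#*/} ;;   # drop one leading directory
            *) break ;;               # only the file name is left
        esac
        i=$((i + 1))
    done
    printf '%s/%s\n' "$host" "$path"
}

local_path ftp://ftp.aineistot.metsaan.fi/Metsamaski/Maakunta/file.zip 2
# prints: ftp.aineistot.metsaan.fi/file.zip
```

Adding `-nH` (`--no-host-directories`) to the wget command would also drop the leading host name directory.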