Where to store files in CSC’s computing environment?
In this tutorial you will:
- Familiarize yourself with personal and project-specific disk areas and their quotas on CSC supercomputers.
- Learn how to share your files, such as software installations and data, with other project members on CSC supercomputers.
💬 Each user of CSC supercomputers (Puhti and Mahti) has access to different disk areas (or directories) for managing their data. Each disk area has its own specific purpose.
💬 Active data files needed for computational simulations and analyses should be stored and shared in directories under /scratch, while any software installations and binaries should be shared under the /projappl directory.
Identify your personal and project-specific directories on Puhti and Mahti supercomputers
- First, log in to Puhti using SSH (or by opening a login node shell in the Puhti web interface):
    ssh <username>@puhti.csc.fi  # replace <username> with your CSC username, e.g. myname@puhti.csc.fi
- Get an overview of your projects and directories by running the following commands on the login node:
    csc-projects
    csc-workspaces
- Inspect the output information summarizing your directories and their current quotas.
- Visit your project’s /scratch directory and list its contents:
    cd /scratch/<project>/  # replace <project> with your CSC project, e.g. project_2001234
    ls
- Visit your project’s /projappl directory and list its contents:
    cd /projappl/<project>/  # replace <project> with your CSC project, e.g. project_2001234
    ls
💬 These directories can be briefly summarized as follows:
- User-specific directory (i.e. your personal home folder)
  - Your home directory (path stored in the environment variable $HOME)
    - The default directory when you log in to Puhti/Mahti
    - You can store configuration files and other minor data for personal use
- Project-specific directories
  - The project’s /scratch and /projappl directories
    - Each project has its own /scratch disk space where most computational tasks are performed. The /scratch area is a temporary space not intended for long-term data storage! Please move inactive data to e.g. Allas.
    - The /projappl directory, on the other hand, is mainly for storing and sharing compiled applications, libraries etc. with other members of the project.
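To see how much of a disk area a directory tree is using, standard tools such as du and find are usually enough. The sketch below is illustrative only: it assumes that quotas count both the total size and the number of files, and it uses a temporary directory (the hypothetical variable target_dir) as a stand-in for a real path such as /scratch/<project>/$USER so that it runs anywhere.

```shell
#!/usr/bin/env bash
# Sketch: measure the size and file count of a directory tree.
# A temporary directory stands in for a real project directory.
set -euo pipefail

target_dir="$(mktemp -d)"   # hypothetical; on Puhti/Mahti point this at your own directory
mkdir -p "$target_dir/results"
printf 'ACGT\n' > "$target_dir/results/a.txt"
printf 'TGCA\n' > "$target_dir/results/b.txt"

# Total size of the tree in human-readable units
du -sh "$target_dir"

# Number of regular files in the tree (tr strips padding that some wc versions add)
file_count="$(find "$target_dir" -type f | wc -l | tr -d ' ')"
echo "files: $file_count"
```

Comparing these numbers with the quota summary printed by csc-workspaces gives a rough idea of which subdirectories take up the most space or files.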
Sharing binaries and data files
💬 Data transfer between two supercomputers can be done e.g. with rsync.
Download the example files
☝🏻 In this example you will download data from Allas object storage, but keep in mind that one should avoid using Allas to do data transfer between Puhti and Mahti.
- Move to your home folder:
    cd
  💡 If you know the files are large, you should consider downloading them directly to /scratch.
- Download an example program package (ggplot2_3.3.3_Rprogramme.tar.gz) and a data file (Merged.fasta) from the Allas object storage:
    wget https://a3s.fi/CSC_training/shared_files.tar.gz
    tar -xavf shared_files.tar.gz
    cd shared_files
Let’s assume that
- Merged.fasta is a data file intended for computational use
- ggplot2_3.3.3_Rprogramme.tar.gz is a software tool needed for the analysis.
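Before unpacking a downloaded archive, it can be useful to list its contents first, for example to check which files it will create. The following is a minimal sketch, not part of the original exercise: it builds a small stand-in archive in a temporary directory (work_dir is a hypothetical name) instead of downloading the real one, and then lists it with tar -t.

```shell
#!/usr/bin/env bash
# Sketch: list the contents of a .tar.gz archive without extracting it.
set -euo pipefail

work_dir="$(mktemp -d)"     # hypothetical scratch location for the demo
cd "$work_dir"

# Build a tiny stand-in archive so the example is self-contained
mkdir shared_files
printf '>seq1\nACGT\n' > shared_files/Merged.fasta
tar -czf shared_files.tar.gz shared_files

# -t lists, -z handles gzip compression, -f names the archive file
listing="$(tar -tzf shared_files.tar.gz)"
echo "$listing"

# Show the size of the archive itself
ls -lh shared_files.tar.gz
```

The same tar -tzf call should work on the downloaded shared_files.tar.gz, since it is a gzip-compressed tar archive.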
Move the files to Puhti /scratch and /projappl
- Create folders with your username (using the environment variable $USER) in your project directories under /scratch and /projappl on Puhti:
    mkdir -p /projappl/<project>/$USER  # replace <project> with your CSC project, e.g. project_2001234
    mkdir -p /scratch/<project>/$USER   # replace <project> with your CSC project, e.g. project_2001234
- Copy your ggplot2_3.3.3_Rprogramme.tar.gz file to the /projappl directory:
    cp ggplot2_3.3.3_Rprogramme.tar.gz /projappl/<project>/$USER/  # replace <project> with your CSC project, e.g. project_2001234
- Copy the Merged.fasta file to the /scratch directory:
    cp Merged.fasta /scratch/<project>/$USER/  # replace <project> with your CSC project, e.g. project_2001234
- Note that all new files and directories are also fully accessible to other members of the project (including read and write permissions).
- Set read-only permissions for your project members for the file Merged.fasta:
    cd /scratch/<project>/$USER/  # replace <project> with your CSC project, e.g. project_2001234
    chmod g-w Merged.fasta  # g-w "subtracts" write permission from the users belonging to our group (g), i.e. our project
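The effect of chmod g-w can be checked by looking at the permission string before and after the change. The sketch below does this on a throwaway file in a temporary directory (demo_dir and demo_file are hypothetical names). It first sets group read and write explicitly so that the starting state does not depend on your umask.

```shell
#!/usr/bin/env bash
# Sketch: verify that chmod g-w removes only the group's write permission.
set -euo pipefail

demo_dir="$(mktemp -d)"                 # hypothetical stand-in for /scratch/<project>/$USER
demo_file="$demo_dir/Merged.fasta"
printf '>seq1\nACGT\n' > "$demo_file"

# Start from a known state: owner and group may read and write, others have no access
chmod u=rw,g=rw,o= "$demo_file"
before="$(ls -l "$demo_file" | cut -c1-10)"

# Remove write permission from the group only
chmod g-w "$demo_file"
after="$(ls -l "$demo_file" | cut -c1-10)"

echo "before: $before"   # -rw-rw----
echo "after:  $after"    # -rw-r-----
```

Characters 5 to 7 of the permission string describe the group: rw- becomes r--, while the owner's permissions stay untouched.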
Copying files from Puhti to Mahti (optional)
- Change to the folder where you have the example files
- Copy the Merged.fasta file from Puhti to the /scratch directory on Mahti:
    rsync -P Merged.fasta <username>@mahti.csc.fi:/scratch/<project>/$USER/  # replace <username> with your CSC username and <project> with your CSC project, e.g. project_2001234
- Copy the ggplot2_3.3.3_Rprogramme.tar.gz file from Puhti to the /projappl directory on Mahti:
    rsync -P ggplot2_3.3.3_Rprogramme.tar.gz <username>@mahti.csc.fi:/projappl/<project>/$USER/  # replace <username> with your CSC username and <project> with your CSC project, e.g. project_2001234
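After a transfer, one way to confirm that a copy matches the original is to compare checksums. The sketch below illustrates the idea locally: a plain cp between two temporary directories stands in for the rsync transfer, and src_dir and dst_dir are hypothetical names. It assumes the sha256sum tool from GNU coreutils is available. For a real transfer you would run sha256sum on each machine and compare the printed values.

```shell
#!/usr/bin/env bash
# Sketch: confirm that a copied file is identical to the original via SHA-256.
set -euo pipefail

src_dir="$(mktemp -d)"    # hypothetical stand-in for the directory on Puhti
dst_dir="$(mktemp -d)"    # hypothetical stand-in for the directory on Mahti
printf '>seq1\nACGT\n' > "$src_dir/Merged.fasta"

# Local copy used here in place of the rsync transfer
cp "$src_dir/Merged.fasta" "$dst_dir/Merged.fasta"

# Keep only the hash, which is the first field of the sha256sum output
src_sum="$(sha256sum "$src_dir/Merged.fasta" | cut -d' ' -f1)"
dst_sum="$(sha256sum "$dst_dir/Merged.fasta" | cut -d' ' -f1)"

if [ "$src_sum" = "$dst_sum" ]; then
    echo "checksums match"
else
    echo "checksum mismatch" >&2
    exit 1
fi
```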
More information
💡 Hint: You can use your folder under /scratch for the rest of the tutorials. You can save the path using an alias (with cd or echo) or somewhere in your notes.
💡 It is sometimes required to export the paths of the /scratch or /projappl directories as environment variables (these last until you log out). This can be done with the following commands:
export PROJAPPL=/projappl/<project>/ # replace <project> with your CSC project, e.g. project_2001234
export SCRATCH=/scratch/<project>/ # replace <project> with your CSC project, e.g. project_2001234
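As a small illustration of how such variables can be used, the sketch below builds the paths from a single project variable and derives a personal subdirectory path from them. The names project, user_name and my_scratch are hypothetical helpers, and project_2001234 is only the example ID used in this tutorial. The script just prints the paths and does not require the directories to exist.

```shell
#!/usr/bin/env bash
# Sketch: define the project paths once and reuse them.
set -euo pipefail

project="project_2001234"             # example ID; replace with your own CSC project
user_name="${USER:-$(id -un)}"        # fall back to id if $USER is not set

export PROJAPPL="/projappl/${project}"
export SCRATCH="/scratch/${project}"

# Personal subdirectory under the project's scratch area
my_scratch="${SCRATCH}/${user_name}"

echo "PROJAPPL=$PROJAPPL"
echo "SCRATCH=$SCRATCH"
echo "my_scratch=$my_scratch"
```

Once the variables are exported in a login shell, a command such as cd "$SCRATCH" takes you to the project's scratch directory for the rest of that session.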