In this tutorial you will:
- Familiarize yourself with personal and project-specific disk areas and their quotas on CSC supercomputers.
- Learn how to share your files, such as software installations and data, with other project members on CSC supercomputers.
💬 Each user of CSC supercomputers (Puhti and Mahti) has access to different disk areas (or directories) for managing their data. Each disk area has its own specific purpose.
💬 Active data files needed for computational simulations and analyses should be stored and shared in directories under /scratch, while any software installations and binaries should be shared under the /projappl directory.
Log in to Puhti with SSH:
ssh <username>@puhti.csc.fi # replace <username> with your CSC username, e.g. myname@puhti.csc.fi
Get an overview of your projects and their disk areas:
csc-projects
csc-workspaces
Change to the /scratch directory of your project and list its contents:
cd /scratch/<project>/ # replace <project> with your CSC project, e.g. project_2001234
ls
Change to the /projappl directory of your project and list its contents:
cd /projappl/<project>/ # replace <project> with your CSC project, e.g. project_2001234
ls
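To get a rough idea of how much space and how many files a directory holds, generic tools such as du and find can be used. The sketch below is a minimal illustration under stated assumptions: the function name dir_summary is made up for this example, and a temporary directory stands in for a real project directory. For your actual quota usage, rely on the csc-workspaces command shown above.

```bash
# Hypothetical helper: print the file count and total size of a directory.
dir_summary() {
    dir=$1
    # Count regular files below the directory.
    n_files=$(find "$dir" -type f | wc -l | tr -d ' ')
    # Total size in kilobytes (first tab-separated field of du -sk).
    size_kb=$(du -sk "$dir" | cut -f1)
    echo "$dir: $n_files files, $size_kb KB"
}

# Stand-in for a real directory such as /scratch/<project>/
demo_dir=$(mktemp -d)
touch "$demo_dir/a.txt" "$demo_dir/b.txt" "$demo_dir/c.txt"

summary=$(dir_summary "$demo_dir")
echo "$summary"
```

On a large directory tree both du and find have to visit every file, so this can take a while on a parallel file system.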
💬 These directories can be briefly summarized as follows:
- Each user has a personal home directory ($HOME).
- Each project has its own /scratch and /projappl directories.
- /scratch is the disk space where most computational tasks are performed. The /scratch area is a temporary space not intended for long-term data storage! Please move inactive data to e.g. Allas.
- The /projappl directory, on the other hand, is mainly for storing and sharing compiled applications, libraries etc. with other members of the project.
💬 Data transfer between two supercomputers can be done e.g. with rsync.
☝🏻 In this example you will download data from Allas object storage, but keep in mind that one should avoid using Allas to do data transfer between Puhti and Mahti.
Go to your home directory:
cd
💡 If you know the files are large, you should consider downloading them directly to /scratch.
Download an archive containing a software tool (ggplot2_3.3.3_Rprogramme.tar.gz) and a data file (Merged.fasta) from the Allas object storage, then extract it:
wget https://a3s.fi/CSC_training/shared_files.tar.gz
tar -xavf shared_files.tar.gz
cd shared_files
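Before unpacking an archive, it can be useful to list its contents, for instance to see how many files it will create. The sketch below builds a small stand-in archive in a temporary directory (the names demo_files and demo_files.tar.gz are invented for this example; it does not touch shared_files.tar.gz) and then lists and extracts it with standard tar options.

```bash
work_dir=$(mktemp -d)
cd "$work_dir"

# Build a small stand-in archive
mkdir demo_files
echo ">seq1" > demo_files/example.fasta
echo "notes" > demo_files/readme.txt
tar -czf demo_files.tar.gz demo_files
rm -r demo_files

# List the contents without extracting (-t)
listing=$(tar -tzf demo_files.tar.gz)
echo "$listing"

# Extract (-x) and move into the unpacked directory
tar -xzf demo_files.tar.gz
cd demo_files
ls
```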
Let’s assume that:
- Merged.fasta is a data file intended for computational use
- ggplot2_3.3.3_Rprogramme.tar.gz is a software tool needed for the analysis.
The following steps place these files under /scratch and /projappl.
Create a directory named after your username ($USER) in your project directories under /scratch and /projappl on Puhti:
mkdir -p /projappl/<project>/$USER # replace <project> with your CSC project, e.g. project_2001234
mkdir -p /scratch/<project>/$USER # replace <project> with your CSC project, e.g. project_2001234
Copy the ggplot2_3.3.3_Rprogramme.tar.gz file to the /projappl directory:
cp ggplot2_3.3.3_Rprogramme.tar.gz /projappl/<project>/$USER/ # replace <project> with your CSC project, e.g. project_2001234
Copy the Merged.fasta file to the /scratch directory:
cp Merged.fasta /scratch/<project>/$USER/ # replace <project> with your CSC project, e.g. project_2001234
Make Merged.fasta read-only for the other members of your project:
cd /scratch/<project>/$USER/ # replace <project> with your CSC project, e.g. project_2001234
chmod g-w Merged.fasta # g-w means that we "subtract" write permissions for users belonging to our group (g), i.e. our project
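The effect of chmod g-w can be checked from the permission string printed by ls -l. The sketch below is a minimal illustration on a temporary file (demo.fasta is a made-up name standing in for Merged.fasta). It first sets a known starting mode so that the result does not depend on your umask.

```bash
work_dir=$(mktemp -d)
demo_file="$work_dir/demo.fasta"
echo ">seq1" > "$demo_file"

# Known starting point: owner and group can read and write, others can read
chmod 664 "$demo_file"
before=$(ls -l "$demo_file" | cut -c1-10)

# Remove write permission from the group
chmod g-w "$demo_file"
after=$(ls -l "$demo_file" | cut -c1-10)

echo "before: $before"   # -rw-rw-r--
echo "after:  $after"    # -rw-r--r--
```

In the permission string, characters 2 to 4 describe the owner, 5 to 7 the group and 8 to 10 everyone else. After the change the group keeps r but loses w.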
Copy the Merged.fasta file from Puhti to the /scratch directory on Mahti:
rsync -P Merged.fasta <username>@mahti.csc.fi:/scratch/<project>/$USER/ # replace <username> with your CSC username and <project> with your CSC project, e.g. project_2001234
Copy the ggplot2_3.3.3_Rprogramme.tar.gz file from Puhti to the /projappl directory on Mahti:
rsync -P ggplot2_3.3.3_Rprogramme.tar.gz <username>@mahti.csc.fi:/projappl/<project>/$USER/ # replace <username> with your CSC username and <project> with your CSC project, e.g. project_2001234
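After a transfer it is often worth confirming that the copy matches the original. One common approach, sketched below, is to compare checksums computed on both ends. This is an illustration only: cp to a second temporary directory stands in for the rsync transfer to Mahti, the file name demo.fasta is made up, and cksum is used simply because it is part of POSIX. On the real systems the second checksum would be computed on Mahti after logging in there.

```bash
src_dir=$(mktemp -d)   # stand-in for the directory on Puhti
dst_dir=$(mktemp -d)   # stand-in for the directory on Mahti
printf '>seq1\nACGT\n' > "$src_dir/demo.fasta"

# Stand-in for the transfer step
cp "$src_dir/demo.fasta" "$dst_dir/demo.fasta"

# Reading from standard input makes cksum print only the checksum and
# the size, without a file name, so the two results compare directly.
src_sum=$(cksum < "$src_dir/demo.fasta")
dst_sum=$(cksum < "$dst_dir/demo.fasta")

if [ "$src_sum" = "$dst_sum" ]; then
    result="match"
else
    result="MISMATCH"
fi
echo "checksum $result"
```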
💡 Hint: You can use your folder under /scratch for the rest of the tutorials. You can save the path using an alias (with cd or echo) or somewhere in your notes.
💡 It is sometimes necessary to export the paths of the /scratch or /projappl directories as environment variables (these last until you log out). This can be done with the following commands:
export PROJAPPL=/projappl/<project>/ # replace <project> with your CSC project, e.g. project_2001234
export SCRATCH=/scratch/<project>/ # replace <project> with your CSC project, e.g. project_2001234
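Once exported, such variables can be used to build paths in later commands and scripts. The sketch below is a minimal illustration: a temporary directory stands in for the real /scratch project path, and the subdirectory name results is invented for the example. The ${SCRATCH:?...} form makes the shell stop with an error message if the variable is unset or empty, which guards against accidentally writing to the wrong place.

```bash
# Stand-in for the real export of the /scratch project path
SCRATCH=$(mktemp -d)
export SCRATCH

# Fall back to `id -un` in case USER is not set in this environment
user_name=${USER:-$(id -un)}

# Build a path from the variable; abort if SCRATCH is unset or empty
results_dir="${SCRATCH:?SCRATCH is not set}/$user_name/results"
mkdir -p "$results_dir"
echo "results go to: $results_dir"

# Exported variables are also visible to child processes
child_view=$(sh -c 'echo "$SCRATCH"')
echo "child process sees: $child_view"
```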