Upon completion of this tutorial, you will be familiar with the disk areas best suited to I/O-intensive workloads, i.e. workloads with frequent read and write operations.
💬 You may sometimes need to process a large number of small files, which can put a heavy input/output load on the shared file system used in CSC’s computing environment.
💬 In order to facilitate such heavy I/O operations, CSC provides fast local disk areas on the login and compute nodes (excluding Mahti CPU nodes).
💡 The local disk area on the login nodes is meant for lightweight pre-processing of data and I/O-intensive tasks such as software compilation. Actual computations should be submitted to the batch queue from the
💡 The local disk areas on the login nodes are meant for temporary use and are cleaned often, so make sure to move important data to
/projappl once you no longer need the fast disk. Note that the local disk is specific to a particular node, i.e. you cannot access the local disk of
cd $TMPDIR
wget https://a3s.fi/CSC_training/Individual_files.tar.gz
tar -xavf Individual_files.tar.gz
cd Individual_files
find . -name 'individual.fasta*' | xargs cat >> Merged.fasta
find . -name 'individual.fasta*' | xargs rm
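The merge step above can also be written so that it copes with unusual file names and checks the result before deleting anything. The following is a minimal sketch under stated assumptions, not the tutorial's own method: instead of the downloaded archive it creates five dummy FASTA files in a throwaway directory (the directory and the file contents are made up for illustration).

```shell
#!/bin/bash
set -euo pipefail

# Hypothetical demo data: a throwaway directory with five tiny FASTA files.
workdir=$(mktemp -d)
cd "$workdir"
for i in 1 2 3 4 5; do
    printf '>seq%s\nACGT\n' "$i" > "individual.fasta.$i"
done

# Count the input files first so the merge can be checked afterwards.
n_files=$(find . -name 'individual.fasta*' | wc -l | tr -d ' ')

# -print0 and -0 keep file names containing spaces or newlines intact.
find . -name 'individual.fasta*' -print0 | xargs -0 cat >> Merged.fasta

# Each dummy file holds exactly one header line, so the counts should match.
n_records=$(grep -c '^>' Merged.fasta)

# Remove the small files only if the merged file looks complete.
if [ "$n_records" -eq "$n_files" ]; then
    find . -name 'individual.fasta*' -print0 | xargs -0 rm
fi
```

The record-count check only makes sense here because every demo file contains a single sequence; with real data a different sanity check (for example comparing total line counts) may be more appropriate.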
☝🏻 If you intend to perform heavy computing tasks using a large number of small files, you have to use the fast local disk areas on the compute nodes instead of the login nodes. The compute nodes are accessed either interactively or using batch jobs.
echo $LOCAL_SCRATCH
echo $TMPDIR
$LOCAL_SCRATCH in your batch job scripts to access the local storage on that node (only on Puhti).
/scratch area before analysis
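As an illustration of how $LOCAL_SCRATCH might be used inside a batch job script, here is a hedged sketch. The `#SBATCH` values (partition name, time limit, and the `--gres=nvme:10` request for node-local disk) are assumptions to be checked against the CSC documentation, and `jobdir`, `results_dir` and `RESULTS_DIR` are hypothetical names introduced only for this example. The script falls back to $TMPDIR or /tmp so that it also runs outside a batch job.

```shell
#!/bin/bash
#SBATCH --account=<project>   # replace <project> with your CSC project
#SBATCH --partition=small     # assumed partition name
#SBATCH --time=00:10:00       # assumed time limit
#SBATCH --gres=nvme:10        # assumed: request 10 GB of node-local disk

# Use the node-local disk if available, otherwise fall back to a temp area.
workdir="${LOCAL_SCRATCH:-${TMPDIR:-/tmp}}"

# Hypothetical per-job working directory on the local disk.
jobdir="$workdir/demo_${SLURM_JOB_ID:-$$}"
mkdir -p "$jobdir"
cd "$jobdir"

# Placeholder for the I/O-heavy work that would run on the local disk.
printf '>seq1\nACGT\n' > result.fasta

# The local disk is node-specific, so copy results to a shared area before
# the job ends. RESULTS_DIR is hypothetical; on CSC systems it would
# typically point somewhere under /scratch/<project>/$USER.
results_dir="${RESULTS_DIR:-$jobdir/results}"
mkdir -p "$results_dir"
cp result.fasta "$results_dir/"
```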
💭 Remember: the commands
csc-workspaces reveal information about your projects.
$USER) under a project-specific directory on the
/scratch disk (or skip this step if you already created the folder in a previous tutorial).
mkdir -p /scratch/<project>/$USER/ # replace <project> with your CSC project, e.g. project_2001234
Merged.fasta file) from the fast disk to
mv Merged.fasta /scratch/<project>/$USER
/scratch area and can start performing actual analysis using batch job scripts.
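Because the local disk and /scratch are different file systems, moving a file between them amounts to a copy followed by a delete. If you would like to confirm that the copy is intact before the original is removed, something along these lines could be used. This is only a sketch: `src_dir` and `dest_dir` are hypothetical temporary directories standing in for the fast local disk and for /scratch/<project>/$USER, and the file content is dummy data.

```shell
#!/bin/bash
set -euo pipefail

# Hypothetical stand-ins for the local disk and the /scratch folder.
src_dir=$(mktemp -d)
dest_dir=$(mktemp -d)
printf '>seq1\nACGT\n' > "$src_dir/Merged.fasta"

# Copy first, compare byte for byte, and delete the source only on success.
cp "$src_dir/Merged.fasta" "$dest_dir/"
if cmp -s "$src_dir/Merged.fasta" "$dest_dir/Merged.fasta"; then
    rm "$src_dir/Merged.fasta"
    echo "Moved and verified: $dest_dir/Merged.fasta"
else
    echo "Copy differs from source; keeping the original" >&2
    exit 1
fi
```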
💡 Hint: You can use your folder under
/scratch for the rest of the tutorials. You can save the path using an alias (with
echo) or somewhere in your notes.
💡 It is sometimes required to export the paths of the
/projappl directories in environment variables (these remain set until you log out). This can be done with the following commands:
export PROJAPPL=/projappl/<project>/ # replace <project> with your CSC project, e.g. project_2001234
export SCRATCH=/scratch/<project>/ # replace <project> with your CSC project, e.g. project_2001234
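Once exported, the variables can stand in for the full paths in later commands. A small sketch of that idea follows; the `project` variable is a hypothetical helper, and its value is just the example project ID from the comments above, so replace it with your own.

```shell
#!/bin/bash
# Hypothetical helper variable holding the example project ID; use your own.
project="project_2001234"

export PROJAPPL="/projappl/$project"
export SCRATCH="/scratch/$project"

# Fall back to `id -un` in environments where $USER is not set.
user_name="${USER:-$(id -un)}"

# The variables now replace the full paths in later commands.
echo "Personal scratch folder: $SCRATCH/$user_name"
echo "Project application area: $PROJAPPL"
```

Keeping the project ID in one variable means only a single line needs to change when you switch projects.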