Supercomputer setup#

Typical physical parts of a supercomputer:

  • Login nodes

  • Compute nodes

  • Storage

  • High-speed networks between these

Login nodes#

  • Login nodes are used for moving data and scripts, editing scripts, and starting jobs.

  • When you log in to CSC’s supercomputers, you enter one of the login nodes of the computer (see the connection example below).

  • There are only a few login nodes and they are shared by all users, so they are not intended for heavy computing.
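
For example, you connect to Puhti from your own machine with SSH; `cscusername` below is a placeholder for your own CSC username:

```bash
# Open an SSH connection to one of Puhti's login nodes.
# Replace "cscusername" with your CSC username.
ssh cscusername@puhti.csc.fi
```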

Compute nodes#

  • Heavy computing should be done on compute nodes.

  • Each node has memory, which is used for storing information about the current task.

  • Compute nodes can be classified based on the types of processors they have:

    • CPU nodes have only CPUs (central processing unit).

    • GPU nodes have both GPUs (graphical processing unit) and CPUs. GPUs are widely used for deep learning.

    • Each CPU has multiple cores, which are the basic computing resource. There are 40 cores per node in Puhti and 128 per node in Mahti and LUMI CPU nodes.

    • Whether your task benefits from a GPU depends on the software used. Most GIS tools cannot use GPUs.

    • GPU resources are more expensive, so in general the software should run at least 3x faster on a GPU for GPU nodes to be a reasonable choice.

  • When using compute nodes, the compute resources have to be requested in advance. You must specify e.g. the number of GPUs, nodes or CPU cores and the amount of memory (see the example batch script below).

  • See CSC Docs for the specifics of Puhti, Mahti and LUMI compute nodes.
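
As a minimal sketch, resources on CSC supercomputers are requested through Slurm batch scripts. The project name, partition and script name below are placeholders to adapt:

```bash
#!/bin/bash
#SBATCH --account=project_200xxxx  # billing project (placeholder)
#SBATCH --partition=small          # queue to submit to; available partitions vary per machine
#SBATCH --time=00:15:00            # maximum run time (hh:mm:ss)
#SBATCH --ntasks=1                 # number of tasks (processes)
#SBATCH --cpus-per-task=4          # CPU cores reserved for the task
#SBATCH --mem=8G                   # memory reserved for the job

# Run the analysis; my_analysis.py is a hypothetical script
srun python my_analysis.py
```

The script would then be submitted from a login node, e.g. `sbatch my_job.sh`.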

Storage#

  • Disk refers to all storage that can be accessed as a file system. This is generally storage that can hold data permanently, i.e. data is still there even if the computer has been restarted.

  • CSC supercomputers use Lustre as their parallel distributed file system.

Puhti disk areas#

| Name | Access | Path | Cleaning | Capacity | Number of files | Use |
| --- | --- | --- | --- | --- | --- | --- |
| home | Personal | /users/cscusername | No | 10 GiB | 100 000 files | personal settings and files |
| projappl | Project | /projappl/project_200xxxx | No | 50 GiB | 100 000 files | installation files |
| scratch | Project | /scratch/project_200xxxx | 180 days | 1 TiB | 1 000 000 files | main working area |

  • scratch space can be extended, but the extension consumes billing units.
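
A quick way to see your own disk areas and their current usage on Puhti is the `csc-workspaces` command, run on a login node:

```bash
# List the home, projappl and scratch areas you have access to,
# together with their quotas and current usage
csc-workspaces
```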

Temporary fast disks#

  • Some nodes might also have local disk space for temporary use.

  • CSC Docs: Login node local tmp: $TMPDIR can be used e.g. for compiling; it is cleaned frequently.

  • CSC Docs: NVMe - $LOCAL_SCRATCH in batch jobs:

    • NVMe is accessible only during your job allocation (including any interactive jobs).

    • You must copy data in and out during your batch job.

    • If your job reads or writes lots of small files, using the local disk can give a 10x performance boost (see the sketch below).
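
A minimal sketch of staging data through the node-local NVMe disk in a Puhti batch job, assuming the `--gres=nvme:<GB>` request syntax; the project name, archive names and tool are placeholders:

```bash
#!/bin/bash
#SBATCH --account=project_200xxxx  # billing project (placeholder)
#SBATCH --partition=small
#SBATCH --time=01:00:00
#SBATCH --gres=nvme:100            # request 100 GB of node-local NVMe disk

# Copy input data to the fast local disk at the start of the job
tar xzf /scratch/project_200xxxx/input_data.tar.gz -C $LOCAL_SCRATCH

# Run the tool against the local copy (my_tool.py is a placeholder)
srun python my_tool.py --input $LOCAL_SCRATCH/input_data --output $LOCAL_SCRATCH/results

# Copy results back to scratch before the job ends;
# $LOCAL_SCRATCH is wiped when the allocation ends
tar czf /scratch/project_200xxxx/results.tar.gz -C $LOCAL_SCRATCH results
```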

Avoid unnecessary reading and writing

Avoid unnecessary reads and writes of data to improve I/O performance:

  • Read and write in big chunks and avoid reading/writing lots of small files
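
For example, thousands of small files can be packed into a single archive before they are moved or stored:

```bash
# Pack a directory with many small files into one archive ...
tar czf results.tar.gz results/
# ... and later unpack it where the files are needed
tar xzf results.tar.gz
```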

LUMI disk areas#

  • LUMI has a similar set of home, project and scratch disk areas; see the LUMI documentation for details.

Login node etiquette

Which of the following tasks would be suitable to run on a login node?

  1. `python join_dataframes.py`

  2. `make`

  3. `create_directories.sh`

  4. `qgis`

  5. `tar -xzf mytool.tar.gz`