Material for the CSC Computing Environment course

1. Prerequisites (accounts, connecting, command-line basics)

1.1 Slides: Accounts and projects

1.2 Video: Accounts and projects

1.3 Slides: Connecting to the CSC supercomputers

1.4 Video: Connecting to the CSC supercomputers

1.5 Video tutorial: Connecting to Puhti

1.6 Tutorials and exercises

  1. Tutorial - Start with getting a CSC account and project (essential)

  2. Tutorial - Login to Puhti with a browser or SSH (essential)

  3. Tutorial - Basic Linux commands

  4. Tutorial - Basic file editing

  5. Advanced tutorial - Using SSH keys to authenticate connection

  6. Advanced tutorial - Run RStudio/Jupyter Notebooks on Puhti using SSH tunneling. This requires setting up SSH keys. The Puhti web interface is an easier option and the recommended one.

2. Introduction to the HPC environment

2.1 Slides

2.2 Video: HPC environment

2.3 Video: CSC’s datacenter in Kajaani

3. Disk areas

3.1 Slides

3.2 Video: Disk areas

3.3 Tutorials and exercises

  1. Tutorial - Main disk areas in CSC’s computing environment (essential)

  2. Tutorial - Finding out where you have a lot of data (essential)

  3. Tutorial - Fast disk areas in CSC’s computing environment

  4. Exercise - I/O intensive computing tasks (advanced)

4. Module system

4.1 Slides

4.2 Video: Modules and preinstalled software

4.3 Tutorials and exercises

  1. Tutorial - Modules in Puhti (essential)

  2. Advanced tutorial - Biosoftware in Puhti

5. Batch queue system and interactive use

5.1 Slides

5.2 Video: Batch jobs

5.3 Tutorials and exercises

  1. Tutorial - Serial batch jobs (essential)

  2. Tutorial - Parallel batch jobs

  3. Tutorial - Interactive batch jobs

  4. Exercise - Retrieving data from bio data repositories (advanced)

  5. Exercise - Serial, array and parallel jobs with R + contour calculation from a DEM with the raster package (GIS)

  6. Exercise - Serial, array and parallel jobs with Python + NDVI calculation with the rasterio package (GIS)

6. Batch job resource usage

6.1 Slides

6.2 Video: Resource usage

6.3 Tutorials and exercises

  1. Tutorial - sacct and seff, resource usage (essential)

  2. Exercise - Find your past job resource usage

7. Allas and where to keep your data

7.1 Slides

7.2 Video: Allas object storage

7.3 Video: Using Allas

7.4 Tutorials and exercises

  1. Tutorial - File transfer with Allas (essential)

  2. Tutorial - File backup with Allas

  3. Tutorial - Allas in batch jobs

  4. Advanced tutorial - Using Allas (bio-data example)

8. Installing your own software

8.1 Slides

8.2 Video: Installing your own software

8.3 Tutorials and exercises

  1. Tutorial - Installing binary applications (essential)

  2. Tutorial - Installing a simple C code from source

  3. Tutorial - Installing R applications and libraries

  4. Tutorial - Installing Python applications and libraries (essential)

  5. Tutorial - Installing Perl applications and libraries

  6. Tutorial - Installing Java applications

  7. Exercise - Installing your own C, C++, or Fortran programs

9. Containers and Apptainer/Singularity

9.1 Slides

9.2 Video: Containers

9.3 Tutorials and exercises

  1. Tutorial - Running containerized applications

  2. Tutorial - Introduction to Apptainer (essential)

  3. Tutorial - Apptainer introduction continued

  4. Exercise - Replicating a Conda environment

  5. Exercise - Creating Apptainer containers

  6. Advanced tutorial - How to get containers

10. How to speed up jobs

10.1 Slides

10.2 Tutorials and exercises

  1. Tutorial - Processing many files with HyperQueue

  2. Advanced tutorial - Gaussian with sbatch-hq

Information

This project has received funding from the European High-Performance Computing Joint Undertaking (JU) under grant agreement No 951732.

All materials (c) 2020-2023 by CSC – IT Center for Science Ltd.
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License, http://creativecommons.org/licenses/by-sa/4.0/