Material for the CSC Computing Environment course
- The material is organized by topics of increasing complexity
- Feel free to skip ahead if you already know the basics
- In each topic, first read the slides / watch the video
- Complete the tutorial(s) to make sure you’ve got the steps right
- Try out the exercises to verify your new skills
- If you get stuck, consult Docs CSC linked in the slides and the tutorials
- If the documentation does not provide a sufficient answer, please contact support by email at servicedesk@csc.fi or by filling in the contact form at https://research.csc.fi/support
- Press and hold Ctrl/Cmd and click to open links in a new window or tab
- Left-click the slides to enable navigation with the arrow keys
- Use the Back button of your browser, the Backspace key on your keyboard, or an external link to return to the main menu
- A video with study tips
1. Prerequisites (accounts, connecting, command-line basics)
1.6 Tutorials and exercises
- Tutorial - Start with getting a CSC account and project (essential)
- Tutorial - Login to Puhti with a browser or SSH (essential)
- Tutorial - Basic Linux commands
- Tutorial - Basic file editing
- Advanced tutorial - Using SSH keys to authenticate connection
- Advanced tutorial - Run RStudio/Jupyter Notebooks on Puhti using SSH tunneling. Requires setting up SSH keys. An easier (and recommended) option is to use the Puhti web interface.
2. Introduction to the HPC environment
3. Disk areas
3.3 Tutorials and exercises
- Tutorial - Main disk areas in CSC’s computing environment (essential)
- Tutorial - Finding out where you have a lot of data (essential)
- Tutorial - Fast disk areas in CSC’s computing environment
- Exercise - I/O intensive computing tasks (advanced)
4. Module system
4.5 Tutorials and exercises
- Tutorial - Modules in Puhti (essential)
- Advanced tutorial - Biosoftware in Puhti
5. Batch queue system and interactive use
5.3 Tutorials and exercises
- Tutorial - Serial batch jobs (essential)
- Tutorial - Parallel batch jobs
- Tutorial - Interactive batch jobs
- Exercise - Retrieving data from bio data repositories (advanced)
- Exercise - Serial, array and parallel jobs with R + contours calculation from DEM with a raster package (GIS)
- Exercise - Serial, array and parallel jobs with Python + NDVI calculation with the rasterio package (GIS)
6. Batch job resource usage
6.3 Tutorials and exercises
- Tutorial - sacct and seff, resource usage (essential)
- Exercise - Find your past job resource usage
7. Allas and where to keep your data
7.4 Tutorials and exercises
- Tutorial - File transfer with Allas (essential)
- Tutorial - File backup with Allas
- Tutorial - Allas in batch jobs
- Advanced tutorial - Using Allas (bio-data example)
8. Installing your own software
8.3 Tutorials and exercises
- Tutorial - Installing binary applications (essential)
- Tutorial - Installing a simple C code from source
- Tutorial - Installing R applications and libraries
- Tutorial - Installing Python applications and libraries (essential)
- Tutorial - Installing Perl applications and libraries
- Tutorial - Installing Java applications
- Exercise - Installing your own C, C++, or Fortran programs
9. Containers and Apptainer/Singularity
9.3 Tutorials and exercises
- Tutorial - Running containerized applications
- Tutorial - Introduction to Apptainer (essential)
- Tutorial - Apptainer introduction continued
- Exercise - Replicating a Conda environment
- Exercise - Creating Apptainer containers
- Advanced tutorial - How to get containers
10. How to speed up jobs
10.2 Tutorials and exercises
- Tutorial - Processing many files with HyperQueue
- Advanced tutorial - Gaussian with sbatch-hq
This project has received funding from the European High-Performance Computing Joint Undertaking (JU) under grant agreement No 951732.