Python

💬 To run Python applications, first load a suitable Python module. CSC provides several Python environments, each focused on a different application area, e.g. data science and bio-/geoinformatics.

💡 For details, please see the Python page in Docs CSC.

💭 By selecting a suitable Python environment to start with, you’ll minimize the need to install additional packages.

☝🏻 Note that Conda environments should be containerized according to our usage policy. See the Tykky container wrapper to accomplish this easily!

Installing Python packages

💬 To install simple packages, it is usually enough to use pip, for example:

pip install --user <package name>   # Or pip3 to ensure use of Python 3

☝🏻 Remember to include --user. By default, pip tries to install to the system Python installation path, which will not work.

🗯 For more complex installations, you should create a containerized environment.

💡 See the Python documentation pages for each Python environment, as there might be some environment-specific instructions.

Example: Installing a simple package with pip

💬 Let’s install a library called coverage.

  1. Start by loading a Python module and checking whether the library is already installed:
module load python-data
python -c "import coverage"

☝🏻 The error message indicates that the library is not available:

Traceback (most recent call last):
  File "<string>", line 1, in <module>
ModuleNotFoundError: No module named 'coverage'
  2. Install the missing library:
pip3 install --user coverage    # This may take a while - don't worry!
  3. Re-test whether the library is available:
python -c "import coverage"

💡 This time there’s no error message, indicating that the import was successful.

  4. User libraries are installed by default under $HOME/.local. To change the installation folder:
export PYTHONUSERBASE=/path/to/your/preferred/installdir
  5. To uninstall:
pip3 uninstall coverage
  6. Type y to confirm.
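💡 The import test and the effect of PYTHONUSERBASE in the steps above can also be checked from Python. This is only a sketch: is_installed and user_base_with are hypothetical helper names, and /tmp/example-userbase is a placeholder path, not a recommended location.

```python
import importlib.util
import os
import subprocess
import sys


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module can be located, without importing it."""
    return importlib.util.find_spec(module_name) is not None


def user_base_with(pythonuserbase: str) -> str:
    """Ask a fresh interpreter for its user base when PYTHONUSERBASE is set."""
    env = dict(os.environ, PYTHONUSERBASE=pythonuserbase)
    result = subprocess.run(
        [sys.executable, "-c", "import site; print(site.getuserbase())"],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


# Same question as `python -c "import coverage"`, but without a traceback.
print("coverage installed:", is_installed("coverage"))

# Placeholder directory; it does not need to exist for this check.
print("user base override:", user_base_with("/tmp/example-userbase"))
```

Note that the variable has to be set both when pip installs the package and later when Python is run, otherwise the interpreter looks for user libraries in the default location again.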

☝🏻 Note, if the package you installed also contains executable files these may not work. This is because the Python modules provided by CSC are containerized and the user-installed binaries will refer to an inaccessible Python path inside the container. For workaround instructions, see our Python documentation or install your own environment from scratch inside a container as outlined in the following example.

Example: Containerizing a Conda environment with Tykky

💬 Let’s create a containerized Conda environment using the Tykky wrapper.

  1. Create a folder under your project’s /projappl directory for the installation, e.g.:
mkdir -p /projappl/<project>/$USER/tykky-env    # replace <project> with your CSC project, e.g. project_2001234
  2. Create an env.yml environment file defining the packages to be installed. Using, for example, nano, copy and paste the following contents into the file:
channels:
  - conda-forge
dependencies:
  - python=3.10.8
  - scipy
  - pandas
  - nglview
  3. Purge your current module environment and load the Tykky module:
module purge
module load tykky
  4. Create and containerize the Conda environment using the conda-containerize command:
conda-containerize new --prefix /projappl/<project>/$USER/tykky-env env.yml    # replace <project> with your CSC project, e.g. project_2001234

☝🏻 This process can take several minutes, so be patient.

  5. As instructed by Tykky, add the path to the installation bin directory to your $PATH:
export PATH="/projappl/<project>/$USER/tykky-env/bin:$PATH"    # replace <project> with your CSC project, e.g. project_2001234

💡 Adding this to your $PATH allows you to call Python and all other executables installed by Conda in the same way as if you had activated a non-containerized Conda environment.
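💡 Why prepending to $PATH is enough can be illustrated with a small sketch. It builds two throwaway bin directories (hypothetical stand-ins for the Tykky installation and a system location), puts a fake python executable in each, and uses shutil.which to show that the first directory on the search path wins. The example assumes a Linux-like system where executables have no file extension.

```python
import os
import shutil
import stat
import tempfile
from pathlib import Path


def make_fake_tool(directory: Path, name: str) -> Path:
    """Create a small executable file called `name` inside `directory`."""
    tool = directory / name
    tool.write_text("#!/bin/sh\n")
    tool.chmod(tool.stat().st_mode | stat.S_IXUSR)
    return tool


def first_match(command: str, search_path: str):
    """Return the first executable named `command` found on `search_path`."""
    return shutil.which(command, path=search_path)


with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical directories, standing in for <env>/bin and a system bin.
    env_bin = Path(tmp) / "tykky-env" / "bin"
    system_bin = Path(tmp) / "system" / "bin"
    for directory in (env_bin, system_bin):
        directory.mkdir(parents=True)
        make_fake_tool(directory, "python")

    # Putting env_bin first mirrors: export PATH="<env>/bin:$PATH"
    search_path = os.pathsep.join([str(env_bin), str(system_bin)])
    resolved = first_match("python", search_path)
    print("resolved:", resolved)
    resolved_from_env = resolved is not None and Path(resolved).parent == env_bin
```

In a real session, calling shutil.which("python") after the export should return a path under your installation's bin directory.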

💭 The above Conda installation would create more than 40k files if installed directly on the parallel file system. Containerizing the environment with Tykky decreases this to fewer than 200, thus avoiding Lustre performance issues.
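💡 If you want to check how many files an installation directory actually contains, a short script is enough. This is a minimal sketch; count_files is a hypothetical helper name, and the directory to inspect is taken from the first command-line argument (defaulting to the current directory).

```python
import os
import sys
from pathlib import Path


def count_files(root: Path) -> int:
    """Count all non-directory entries below `root`."""
    total = 0
    for _dirpath, _dirnames, filenames in os.walk(root):
        total += len(filenames)
    return total


if __name__ == "__main__":
    # Directory to inspect: first argument, or the current directory if omitted.
    target = Path(sys.argv[1]) if len(sys.argv) > 1 else Path.cwd()
    print(f"{target}: {count_files(target)} files")
```

Point it at your installation directory to compare the count with the figures mentioned above.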

💬 To modify an existing Tykky-based Conda environment, use the update keyword of conda-containerize together with the --post-install option, which specifies a bash script containing the commands that update the installation. See more details in Docs CSC.