A short introduction to installing Python modules on HSUper

Installing Python Modules

Contemporary research pipelines often employ Python, either for data analysis or for data generation (e.g., simulation) itself. On HSUper, Python interpreters and the corresponding packages can be installed via Spack or via miniforge3; the latter is the most suitable option for typical users. This article briefly describes how to configure miniforge3 on HSUper and walks through an example installation of typical deep learning libraries into a user-defined environment.

Please note that Python and pip are already installed on each node, but they are only available via version-specific aliases, e.g., python3.6 and pip3.6. Keep this distinction in mind so that you do not confuse the system-wide installation with the one provided by miniforge3.
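
To see how these names resolve in your current shell, a small check along the following lines may help. This is only a sketch: the helper name list_python_commands is made up for this example, and the list of names is an assumption based on the aliases mentioned above.

```shell
# Hypothetical helper: show how common Python-related command names resolve.
list_python_commands() {
    for name in python python3 python3.6 pip pip3.6; do
        # command -v prints the path (or alias definition) if the name resolves
        target="$(command -v "$name" 2>/dev/null)" || target="not found"
        echo "$name -> $target"
    done
}

list_python_commands
```

Running the same check again after activating a conda environment should show python and pip pointing into that environment instead.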

Initial Configuration of miniforge3 on HSUper

miniforge3 is a management system for your scientific Python stack. It is available via the module system and the initial setup can be performed via

module load miniforge3
conda init

You need to execute the conda init command only once. To restore your shell configuration to its original state, you can run conda init --reverse --all.

Note that you need to close and re-open your SSH connection to HSUper in order for the changes to take effect.

Using conda on HSUper

The initial configuration adds conda permanently to your environment. However, to use the conda command you still need to load the miniforge3 module.

module load miniforge3

Since conda relies on a specific Python version and the python alias, attempting to use the conda command without first loading the miniforge3 module will result in an error message.
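
A quick way to tell whether conda is usable in the current shell is to run a harmless command and inspect its exit status. The following is a minimal sketch; the helper name check_conda is hypothetical and not part of conda or HSUper.

```shell
# Hypothetical helper: report whether the conda command works in this shell.
check_conda() {
    # conda --version only prints the version; a non-zero exit status means
    # that conda is missing or cannot start (e.g., module not loaded)
    if version="$(conda --version 2>/dev/null)"; then
        echo "conda is usable: $version"
    else
        echo "conda is not usable: try 'module load miniforge3' first"
    fi
}

check_conda
```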

Creating New Environments

Typically, conda users define a separate conda environment for each project. Such an environment contains a dedicated Python interpreter and all the libraries the project needs. Environments do not interfere with each other (i.e., you can install different versions of the same library into different environments without causing any problems).

As an example, consider a project in which you perform image classification using the PyTorch library. Assume further that your specific workflow is only compatible with the (slightly outdated) Python v3.11. At the start of the project, you then define a new conda environment named img_class_project with the correct Python version by pasting the following command into your terminal:

conda create -n "img_class_project" python=3.11

When creating a new conda environment, you can optionally specify which Python version to use upfront, avoiding the need to install it separately later.

Installing Python with conda currently breaks the conda command inside the affected environment. A workaround is to run all installations from an environment that has no Python version installed by conda, using conda install <package> -n <myenv>. This was necessary when using miniforge3 version 24.3.0.

After confirming the installation, you can switch into your newly created environment by using

conda activate img_class_project

As of miniforge3 version 24.3.0, running the conda command will exit with an error if Python in the active environment was installed using conda install.
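
Because the conda command itself may fail inside such an environment, the activation can be checked without calling conda at all. The sketch below relies on the CONDA_DEFAULT_ENV variable, which conda activate sets to the name of the active environment; the helper name show_active_env is hypothetical.

```shell
# Hypothetical helper: show the active conda environment without running conda.
show_active_env() {
    # CONDA_DEFAULT_ENV is set by "conda activate"; it is unset otherwise
    echo "Active conda environment: ${CONDA_DEFAULT_ENV:-none}"
    # After activation, python should resolve to a path inside the environment
    echo "Python on PATH: $(command -v python || echo 'not found')"
}

show_active_env
```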

Note that you can deactivate your currently active environment with the command

conda deactivate

Installing Dependencies into an Environment

Let us return to the image classification project from above and switch into the newly created environment as described before. Typically, users want to install packages using pip. With miniforge3 version 24.3.0, the conda solver already installs pip whenever a Python version is specified for the environment or Python is installed manually. If pip is not yet present in your environment, execute the following command to install it into your currently active environment:

conda install pip

Alternatively, you can install a package into an environment other than the currently active one by naming the target environment with -n:

conda install pip -n "img_class_project"

The conda solver installs pip along with Python, which (with miniforge3 version 24.3.0) breaks the conda command inside that environment [ModuleNotFoundError: No module named ‘conda’]. To avoid this issue, run conda commands from outside the environment in which Python was installed using conda, and pass the target environment with -n.

You can then install any pip-compatible package, e.g.

pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121

Note again that the package will only be installed into your currently active environment. Other environments remain unaffected.
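
To confirm that the installation landed in the intended environment, the package can be queried from the shell. The following is a minimal sketch: the helper name check_torch and its optional interpreter argument are made up for this example. On a node without a GPU, the CUDA check would be expected to report False even if the installation is fine.

```shell
# Hypothetical helper: report whether torch is importable by a given interpreter.
# Usage: check_torch            (uses "python" from the active environment)
#        check_torch python3    (uses another interpreter)
check_torch() {
    "${1:-python}" - <<'EOF'
import importlib.util

# find_spec returns None if the package is not installed for this interpreter
if importlib.util.find_spec("torch") is None:
    print("torch: not installed in this environment")
else:
    import torch
    print("torch:", torch.__version__)
    print("CUDA available:", torch.cuda.is_available())
EOF
}
```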

Using conda and SLURM

Using conda together with the job scheduler SLURM on HSUper is straightforward. However, users must distinguish between regular jobs (i.e., those using jobscripts) and interactive jobs.

Using Jobscripts

Users can use their regular jobscripts as described in the documentation, e.g.:

#!/bin/bash
#SBATCH --job-name=conda_tutorial
#SBATCH --partition=small_gpu
#SBATCH --nodes=1
#SBATCH --time 1-00:00:00
#SBATCH --gpus=1

module load miniforge3
python main.py

SLURM executes the job in the conda environment that is active at the time of job submission, i.e., when sbatch is called.
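
Since the environment is inherited from the submitting shell, it is easy to submit a job from the wrong environment by accident. A jobscript can guard against this before any computation starts. The following sketch assumes that CONDA_DEFAULT_ENV (set by conda activate) is passed on to the job together with the rest of the submission environment; the helper name require_env is hypothetical.

```shell
# Hypothetical helper: succeed only if the expected conda environment is active.
require_env() {
    expected="$1"
    if [ "${CONDA_DEFAULT_ENV:-}" != "$expected" ]; then
        echo "Expected conda environment '$expected', found '${CONDA_DEFAULT_ENV:-none}'" >&2
        return 1
    fi
    echo "Running in conda environment '$expected'"
}

# Intended use inside a jobscript, before "python main.py":
#   require_env img_class_project || exit 1
```

With this guard in place, a job submitted from the wrong environment stops immediately with a clear message instead of failing later on a missing package.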

Using Interactive Jobs

Unlike with jobscripts, SLURM does not transfer the current conda environment into an interactive job. Users must therefore activate the desired environment again after connecting via SSH to the allocated interactive node. Consider the following example:

salloc --time=10:00 --partition=dev
#...
# e.g. ssh node002, if this node was allocated for the interactive job
module load miniforge3
conda activate img_class_project
# execute your python commands e.g., python main.py