
HSUper provides different node types and therefore also different partitions in which compute jobs can be run. Below you find a list of the available partitions and their restrictions.

Partitions

Partition Name | Nodes per Job | Wall-clock Limit | Concurrent Jobs Limit | Nodes   | Remarks
dev            | 1-2           | 1h               | 1                     | 1-571   | For testing purposes only, max. two queued jobs
small          | 1-5           | 72h              | -                     | 3-571   | Regular nodes, exclusive node reservation
small_shared   | 1-5           | 72h              | -                     | 3-571   | Same settings as small, but node resources are shared by default
small_fat      | 1-5           | 24h              | -                     | 572-576 | Fat-memory nodes, exclusive node reservation
small_gpu      | 1-5           | 24h              | -                     | gpu 1-5 | Up to two GPUs can be allocated per job
medium         | 6-256         | 24h              | -                     | 3-571   | Regular nodes, exclusive node reservation
medium-s       | 6-32          | 24h              | 5                     | 3-571   | Regular nodes, exclusive node reservation
medium-m       | 33-64         | 24h              | 3                     | 3-571   | Regular nodes, exclusive node reservation
medium-l       | 65-256        | 24h              | 1                     | 3-571   | Regular nodes, exclusive node reservation
large          | >256          | 24h              | -                     | 3-571   | Regular nodes, exclusive node reservation; available to selected users only
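
A partition is selected with the --partition option in your batch script. The following is a minimal sketch; the job name, node count, runtime and executable are placeholders and must be adjusted to the limits of the chosen partition.

    #!/bin/bash
    #SBATCH --job-name=example          # placeholder job name
    #SBATCH --partition=small           # partition from the table above
    #SBATCH --nodes=2                   # must fit the partition's nodes-per-job range
    #SBATCH --time=12:00:00             # must stay below the partition's wall-clock limit

    srun ./my_application               # placeholder executable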

Please note: We are currently working on a fairer job partition scheme. The idea is to limit the medium partition by the number of concurrent jobs; this limit is currently set to 70 and may be reduced to 30 for the small partition. The current medium partition would then be retired or limited to one concurrent job.

Concurrent Jobs

The Concurrent Jobs Limit column defines how many jobs a single user can run concurrently. In general, at most 1000 jobs can be submitted per user; this limit prevents very slow job scheduling.
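
To see how close you are to these limits, you can count your own pending and running jobs. The line below is a generic Slurm one-liner, not specific to HSUper:

    # number of your pending and running jobs
    squeue -u "$USER" -h | wc -l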

You may use the Quality of Service preempt (e.g. #SBATCH --qos=preempt) to be allowed to run up to 1000 jobs at the same time. However, as soon as a job with a higher priority than yours is queued, your job will be cancelled within 30 seconds. Make sure to handle signals to create checkpoints, or create checkpoints periodically, so that you can resume from the last state once your job is scheduled again. Jobs cancelled due to preemption are automatically requeued.
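
Below is a sketch of a preemptible job that reacts to a termination signal. The --signal option and the checkpoint function are illustrative assumptions, not HSUper-specific settings: replace the checkpoint logic with whatever your application needs, and note that the exact signal delivered on preemption may be site-specific.

    #!/bin/bash
    #SBATCH --partition=small
    #SBATCH --qos=preempt               # up to 1000 concurrent jobs, but preemptible
    #SBATCH --signal=B:TERM@30          # ask Slurm to signal the batch shell ~30 s before the job ends

    checkpoint() {
        # placeholder: replace with your application's checkpoint command
        touch checkpoint_written
        exit 0
    }
    trap checkpoint TERM

    # run the application in the background so the batch shell can receive the signal
    srun ./my_application &
    wait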