Centaurus / GPU User Notes

The Centaurus Slurm partition is an HPC resource dedicated to supporting student work on class assignments. Centaurus is available to students enrolled in a designated set of courses.

ACCESS TO THE EDUCATIONAL CLUSTER

Before logging into the educational cluster, please ensure that you have set up Duo (Setup Duo). In addition, if you are on campus, please connect to the eduroam wireless network; otherwise, connect to the campus VPN (Setup VPN).

The Centaurus and GPU partitions can be accessed via SSH to “hpc-student.uncc.edu.” Your credentials are your NinerNET username and password; please do not include “@uncc.edu” in your NinerNET username. Once logged in, you will be on the Educational Cluster interactive/submit host, which should be used for tasks such as transferring data via SCP or SFTP and for code development.
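For example, logging in and copying a file from a personal machine might look like the following, where “mystudent” is a placeholder for your own NinerNET username:

$ ssh mystudent@hpc-student.uncc.edu
$ scp myprogram.c mystudent@hpc-student.uncc.edu:~/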

From this node, a user can submit jobs requesting the following resources:

  • General Compute Nodes (12 nodes with 16 cores/node = 192 cores total)
  • Large Memory Compute Node (1 node with 16 cores and 768GB RAM)
  • GPU Compute Nodes:
    • 8 nodes with 16 cores and 2 GPUs/node
    • 1 node with 8 cores and 8 GPUs

Jobs should always be submitted to the “Centaurus” partition for CPU jobs and to the “GPU” partition for GPU jobs, unless directed otherwise by your instructor.
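The partition names and current node availability can be confirmed with Slurm’s standard sinfo command (output varies with cluster state):

$ sinfo                    # list all partitions and node states
$ sinfo --partition=GPU    # show only the GPU partition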

NFS STORAGE

Each student is given a default storage quota of 150 GB for their home directory located at /users/. This volume is backed up nightly. Users can check their current quota usage using the command “urcquota”.
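For example (urcquota is the site-specific command named above, so its exact output is not shown here; du is a generic alternative for seeing how much space your files occupy):

$ urcquota
$ du -sh ~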

Each class also has a shared folder located at /projects/class/, which instructors may use to share information or data with class members.

ACCESSING SOFTWARE

Centaurus uses environment modules to set up the user environment to use specific software packages. Additional details on modules can be found here.
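Typical module commands look like the following; openmpi is used as an example here because it appears later in these notes, and “module avail” shows what is actually installed:

$ module avail             # list available software modules
$ module load openmpi      # add OpenMPI to your environment
$ module list              # show currently loaded modules
$ module unload openmpi    # remove it again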

SUBMITTING COMPUTE JOBS

Centaurus uses the Slurm scheduler to manage access to the computational resources. To submit a job to the scheduler, users must prepare a “submit script”. At its simplest, a submit script (my_script.sh) would look like this:

#!/bin/bash
/users//myprogram

And would be submitted to the cluster as follows:

sbatch --job-name=myjob --partition=Centaurus --time=00:01:00 my_script.sh

Or, instead of specifying the Slurm directives on the command line, you can put them in the script like this:

#!/bin/bash
#SBATCH --job-name=myjob
#SBATCH --partition=Centaurus
#SBATCH --time=00:01:00
/users//myprogram

Which would simplify your sbatch command to this:

sbatch my_script.sh

Submit scripts may also load any needed environment modules and set additional parameters specifying details of the desired execution environment (e.g., number of required processes, memory size, GPU access, etc.).
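For example, a slightly fuller submit script might look like the following; the memory and task values and the module name are illustrative choices rather than cluster requirements, and the program path follows the placeholder form used in the examples above:

#!/bin/bash
#SBATCH --job-name=myjob
#SBATCH --partition=Centaurus
#SBATCH --ntasks=1
#SBATCH --mem=4G
#SBATCH --time=00:05:00
# Load any environment modules the program needs, then run it
module load openmpi
/users//myprogram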

PARALLEL PROCESSING WITH OPENMPI

Slurm supports parallel processing via message passing (MPI). To access OpenMPI, load the desired module and compile your program; e.g.,

$ module load openmpi
$ mpicc -o myprogram myprogram.c

And include a request for multiple processes in the submit script:

#!/bin/bash
#SBATCH --job-name="MyMPIJob"
#SBATCH --partition=Centaurus
#SBATCH --nodes=4
#SBATCH --ntasks-per-node=4
#SBATCH --time=00:01:00
module load openmpi/4.0.3
srun --mpi=pmix_v3 /users//myprogram

and submit with sbatch:

$ sbatch my_script.sh

The Slurm options may also be set on the sbatch command line as follows:

$ sbatch --job-name=MyMPIJob --partition=Centaurus --nodes=4 --ntasks-per-node=4 my_script.sh

In this example, the resource request is for 4 cores (or processes) on each of 4 compute nodes for a total of 16 processes.
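After submitting, standard Slurm commands can be used to monitor or cancel jobs; the job ID below is a placeholder for the ID that sbatch prints:

$ squeue -u $USER       # show your queued and running jobs
$ scancel 12345         # cancel a job by its job ID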

SUBMITTING GPU JOBS

The educational cluster has a GPU partition that can be used for GPU computing jobs. Here is a simple example of a submit script that will queue up and run a GPU compute job:

#!/bin/bash
#SBATCH --job-name=MyGPUJob
#SBATCH --partition=GPU
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=1
#SBATCH --gres=gpu:1
#SBATCH --time=00:01:00
nvidia-smi

The above job will request one core and one GPU on a single GPU compute node.
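To run your own GPU program instead of nvidia-smi, the script would typically load a CUDA environment module and invoke the compiled binary. The module name (“cuda”) and program path below are assumptions for illustration; check “module avail” for the actual module name on the cluster:

#!/bin/bash
#SBATCH --job-name=MyGPUJob
#SBATCH --partition=GPU
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=1
#SBATCH --gres=gpu:1
#SBATCH --time=00:10:00
# The module name and program path below are placeholders
module load cuda
/users//my_gpu_program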