
Installation

Setup

This guide walks you through installing and setting up the Cedana checkpoint/restore plugin for SLURM.

Installing Cedana

Prerequisites

Cedana depends on a few packages that must be installed on each node. Install them with your distribution's package manager:

Using dnf/yum (Fedora/CentOS)

yum install -y libnet-devel protobuf-c-devel libnl3-devel libbsd-devel libcap-devel libseccomp-devel gpgme-devel nftables-devel

Using apt (Debian/Ubuntu)

apt-get install -y libnet-dev libprotobuf-c-dev libnl-3-dev libbsd-dev libcap-dev libseccomp-dev libgpgme11-dev libnftables1
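When the same setup script has to run on a mixed fleet, it can help to pick the right command automatically. The following is a minimal sketch, not part of Cedana: the function names `prereq_command` and `detect_package_manager` are hypothetical, the package lists are copied from the two commands above, and it assumes `dnf` accepts the same package names as `yum`. It only prints the command, so you can review it before running it as root.

```shell
#!/bin/sh
# Sketch: choose the prerequisite install command for the available package manager.
# prereq_command and detect_package_manager are illustrative helpers, not Cedana tooling.

prereq_command() {
    # $1: package manager name (dnf, yum, or apt-get)
    case "$1" in
        dnf|yum)
            echo "$1 install -y libnet-devel protobuf-c-devel libnl3-devel libbsd-devel libcap-devel libseccomp-devel gpgme-devel nftables-devel"
            ;;
        apt-get)
            echo "apt-get install -y libnet-dev libprotobuf-c-dev libnl-3-dev libbsd-dev libcap-dev libseccomp-dev libgpgme11-dev libnftables1"
            ;;
        *)
            echo "unsupported package manager: $1" >&2
            return 1
            ;;
    esac
}

detect_package_manager() {
    # Print the first supported package manager found on PATH.
    for pm in dnf yum apt-get; do
        if command -v "$pm" >/dev/null 2>&1; then
            echo "$pm"
            return 0
        fi
    done
    return 1
}

# Dry run: print the command instead of executing it.
if pm=$(detect_package_manager); then
    prereq_command "$pm"
else
    echo "no supported package manager found" >&2
fi
```

To actually install, pipe the printed command to a root shell, or copy it into your provisioning tool.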

Getting Cedana

You can either download the latest published release or build from source. To install the release binary, run the following on each SLURM node:

curl -L -o cedana.tar.gz https://github.com/cedana/cedana/releases/download/v0.9.280/cedana-amd64.tar.gz
tar -xzvf cedana.tar.gz
chmod +x cedana
mv cedana /usr/local/bin/cedana
rm cedana.tar.gz
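After the steps above, you may want to confirm that the binary landed on the `PATH` before continuing. This is a small sketch using only standard shell built-ins. The helper name `check_binary` is hypothetical, and it makes no assumptions about Cedana's own command-line flags.

```shell
#!/bin/sh
# Sketch: confirm that a binary is reachable on PATH.
# check_binary is an illustrative helper, not part of Cedana.

check_binary() {
    # $1: name of the binary to look up
    path=$(command -v "$1" 2>/dev/null) || {
        echo "$1: not found on PATH" >&2
        return 1
    }
    echo "$1: found at $path"
}

check_binary cedana || echo "cedana is not installed yet; run the download steps first" >&2
```

Running this on every node (for example through your cluster's parallel shell) gives a quick view of which nodes still need the binary.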

See the Cedana daemon documentation repo for information on building from source.

Installing Plugins

On each SLURM node, install the SLURM plugin with

For Cedana to manage GPU workloads, install the GPU plugin using:

However, you will first need to set up the Cedana configuration on the node (TBD).

We ship our own modified version of the CRIU binary, which is required for any checkpoint/restore in userspace.

You can directly start the daemon with:

If you're a systemd user, you may also install it as a service (if built from source):

Try make help to see all available targets.

Health check the daemon

Run a health check to confirm that the daemon fully supports the system and is ready to accept requests. See health checks for more information.

Enable and Start the Service

After installation completes, enable the Cedana service so that it starts automatically at boot, then start it to activate it immediately.

Run the following commands:
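On a systemd host, enabling and starting a service typically takes two `systemctl` calls. The sketch below prints them rather than running them. The unit name `cedana.service` is an assumption, not confirmed by this guide, and `service_commands` is a hypothetical helper; check the name of the unit that your installation actually created before running anything.

```shell
#!/bin/sh
# Sketch: print the systemctl commands that enable and start a unit.
# ASSUMPTION: the unit is called cedana.service; override SERVICE_NAME if yours differs.
SERVICE_NAME="${SERVICE_NAME:-cedana.service}"

service_commands() {
    # $1: systemd unit name
    echo "systemctl enable $1"   # start automatically at boot
    echo "systemctl start $1"    # start immediately
}

# Dry run: review the output, then run the commands as root.
service_commands "$SERVICE_NAME"
```

Afterwards, `systemctl status` with the same unit name shows whether the service is active.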

Once these steps are completed on all nodes, the Cedana plugin will be installed and running in your cluster.

Installing the SLURM plugin

To build the Cedana SLURM plugin, first build the Docker builder image:

The builder image handles building the SLURM plugin with the necessary dependencies.

To generate the plugin file, run the following in the project root:

To install the plugin, run the following script to update the SPANK config file plugstack.conf and install the SLURM plugin:
