Installation

Setup

This guide walks you through installing and setting up the Cedana checkpoint/restore plugin for SLURM.

Installing Cedana

Prerequisites

Cedana depends on a few packages that must be installed on each node. Install them with your distribution's package manager:

Using dnf/yum (Fedora/CentOS)

yum install -y libnet-devel protobuf-c-devel libnl3-devel libbsd-devel libcap-devel libseccomp-devel gpgme-devel nftables-devel

Using apt (Debian/Ubuntu)

apt-get install -y libnet-dev libprotobuf-c-dev libnl-3-dev libbsd-dev libcap-dev libseccomp-dev libgpgme11-dev libnftables1
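
If you provision several distributions with one script, the two package lists above can be selected automatically. The following is a minimal sketch, not part of Cedana: the helper names detect_pkg_manager and dep_install_cmd are hypothetical, and the sketch assumes dnf accepts the same package names as yum. It prints the install command rather than running it, so you can review it first.

```shell
#!/bin/sh
# Hypothetical helper: report which supported package manager is on PATH.
detect_pkg_manager() {
    if command -v dnf >/dev/null 2>&1; then
        echo dnf
    elif command -v yum >/dev/null 2>&1; then
        echo yum
    elif command -v apt-get >/dev/null 2>&1; then
        echo apt-get
    else
        echo unknown
    fi
}

# Hypothetical helper: print (do not run) the dependency install command
# for the given package manager, using the package lists from this guide.
dep_install_cmd() {
    case "$1" in
        dnf|yum)
            echo "$1 install -y libnet-devel protobuf-c-devel libnl3-devel libbsd-devel libcap-devel libseccomp-devel gpgme-devel nftables-devel"
            ;;
        apt-get)
            echo "apt-get install -y libnet-dev libprotobuf-c-dev libnl-3-dev libbsd-dev libcap-dev libseccomp-dev libgpgme11-dev libnftables1"
            ;;
        *)
            echo "unsupported package manager: $1" >&2
            return 1
            ;;
    esac
}

# Example: show the command for this machine.
#   dep_install_cmd "$(detect_pkg_manager)"
```

Printing the command keeps the sketch free of side effects. Once the output looks right, run it as root.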

Getting Cedana

You can either download a published release or build from source. To download and install the release binary:

curl -L -o cedana.tar.gz https://github.com/cedana/cedana/releases/download/v0.9.245/cedana-amd64.tar.gz
tar -xzvf cedana.tar.gz
chmod +x cedana
mv cedana /usr/local/bin/cedana
rm cedana.tar.gz
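
The download steps above can be wrapped in a function so that the version is a parameter and a failed download stops the install. This is a sketch under assumptions: cedana_release_url and install_cedana are hypothetical names, the URL pattern is inferred from the single release URL shown above, and only the amd64 archive is confirmed by this guide.

```shell
#!/bin/sh
# Hypothetical helper: build the release URL from a version tag and an
# architecture. Only "amd64" appears in this guide; other values are guesses.
cedana_release_url() {
    version="$1"
    arch="${2:-amd64}"
    echo "https://github.com/cedana/cedana/releases/download/${version}/cedana-${arch}.tar.gz"
}

# Hypothetical helper: download, unpack, and install the binary.
# Works in a temporary directory and stops at the first failing step.
install_cedana() {
    url="$(cedana_release_url "$1" "${2:-amd64}")"
    workdir="$(mktemp -d)" || return 1
    (
        cd "$workdir" &&
        curl -fL -o cedana.tar.gz "$url" &&   # -f: fail on HTTP errors
        tar -xzf cedana.tar.gz &&
        chmod +x cedana &&
        sudo mv cedana /usr/local/bin/cedana
    )
    status=$?
    rm -rf "$workdir"
    return "$status"
}

# Example:
#   install_cedana v0.9.245
```

Using curl -f means a missing release produces an error instead of saving an HTML error page as the archive.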

See our daemon repo for info on building from source.

Installing Plugins

The plugins you need depend on your cluster. We ship plugins for different containerization frameworks (Singularity soon!). For example, if you'd like Cedana to manage GPU workloads:

sudo cedana plugin install criu gpu

We ship our own modified version of the CRIU binary, which is required for any checkpoint/restore in userspace.

You can directly start the daemon with:

sudo cedana daemon start

If you use systemd and built Cedana from source, you can instead install the daemon as a service:

make install-systemd

Try make help to see all available targets.

Health check the daemon

The daemon can be health checked to ensure it fully supports the system and is ready to accept requests. See health checks for more information.


Enable and Start the Service

Once installation is complete, enable the Cedana service so that it starts automatically at boot, then start it so that it is active immediately.

Run the following commands:

# Enable the service to start on boot
sudo systemctl enable cedana

# Start the service now
sudo systemctl start cedana
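
To confirm the service actually came up, you can poll systemd for a few seconds after starting it. The sketch below assumes the unit is named cedana, as in the commands above. wait_until is a hypothetical helper, not a Cedana or systemd command.

```shell
#!/bin/sh
# Hypothetical helper: run a command up to ATTEMPTS times (at least once),
# sleeping DELAY seconds between tries. Returns 0 as soon as it succeeds.
wait_until() {
    attempts="$1"
    delay="$2"
    shift 2
    i=1
    while :; do
        "$@" && return 0
        [ "$i" -ge "$attempts" ] && return 1
        i=$((i + 1))
        sleep "$delay"
    done
}

# Example (requires systemd):
#   wait_until 10 1 systemctl is-active --quiet cedana \
#       && echo "cedana is running" \
#       || echo "cedana did not start" >&2
#   systemctl is-enabled cedana   # prints "enabled" if it will start on boot
```

Polling is useful because systemctl start can return before the daemon is ready to accept requests.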

After you complete these steps on every node, the Cedana plugin is installed and running across your cluster.
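
Repeating the steps on every node by hand is error-prone, so a small loop can help. The following sketch assumes passwordless SSH and sudo on each node, and takes node names from SLURM's sinfo. run_on_nodes and the RUNNER variable are hypothetical, introduced only for this illustration.

```shell
#!/bin/sh
# Hypothetical helper: read node names on stdin (one per line), drop
# duplicates, and run the given command on each node via $RUNNER.
# RUNNER defaults to ssh; set RUNNER=echo for a dry run.
run_on_nodes() {
    runner="${RUNNER:-ssh}"
    sort -u | while IFS= read -r node; do
        [ -n "$node" ] || continue
        # </dev/null stops ssh from consuming the rest of the node list.
        $runner "$node" "$@" </dev/null || echo "failed on $node" >&2
    done
}

# Example: enable and start the service on every node SLURM knows about.
#   sinfo -N -h -o '%N' | run_on_nodes sudo systemctl enable --now cedana
# Dry run that only prints what would be executed:
#   sinfo -N -h -o '%N' | RUNNER=echo run_on_nodes sudo systemctl enable --now cedana
```

The sort -u step matters because sinfo -N lists a node once per partition it belongs to.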
