SLURM Setup
This guide will walk you through the installation and setup of the Cedana checkpoint/restore plugin for SLURM. These steps must be performed on all SLURM controller and compute nodes in your cluster.
Installing our plugin
Automated Installation
First, you need to download and run the installation script. This script will install the Cedana agent and all necessary dependencies for the plugin to function correctly.
For now, you can either build the daemon from source or use the released binaries.
Prerequisites
Since Cedana depends on CRIU, you will need to ensure its dependencies are installed.
Using apt (Debian/Ubuntu)
apt-get install -y libnet-dev libprotobuf-c-dev libnl-3-dev libbsd-dev libcap-dev libseccomp-dev libgpgme11-dev libnftables1
Using dnf/yum (Fedora/CentOS)
yum install -y libnet-devel protobuf-c-devel libnl3-devel libbsd-devel libcap-devel libseccomp-devel gpgme-devel nftables-devel
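On a cluster that mixes distributions, a provisioning script could pick the package list by package manager. The following is a minimal sketch, not part of Cedana: the helper names `criu_build_deps` and `detect_pkg_manager` are hypothetical, and the package names are the ones given in this section.

```shell
#!/bin/sh
# Hypothetical helper: print the CRIU build dependencies for a package manager.
criu_build_deps() {
    case "$1" in
        apt-get)
            echo "libnet-dev libprotobuf-c-dev libnl-3-dev libbsd-dev libcap-dev libseccomp-dev libgpgme11-dev libnftables1"
            ;;
        dnf|yum)
            echo "libnet-devel protobuf-c-devel libnl3-devel libbsd-devel libcap-devel libseccomp-devel gpgme-devel nftables-devel"
            ;;
        *)
            echo "unsupported package manager: $1" >&2
            return 1
            ;;
    esac
}

# Hypothetical helper: print the first supported package manager found on PATH.
detect_pkg_manager() {
    for pm in apt-get dnf yum; do
        if command -v "$pm" >/dev/null 2>&1; then
            echo "$pm"
            return 0
        fi
    done
    return 1
}

# Usage (as root; the package list is intentionally left unquoted so it splits into words):
#   pm=$(detect_pkg_manager) && "$pm" install -y $(criu_build_deps "$pm")
```

Keeping the lists in one function means the same script can run unchanged on every controller and compute node.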
Build from source
Build
make cedana
Install
make install
Build and install (with all plugins)
make all
Try make help to see all available targets.
Download from releases
Download the latest release from the releases page. The example below pins v0.9.245 for amd64; substitute the version you want to install.
curl -L -o cedana.tar.gz https://github.com/cedana/cedana/releases/download/v0.9.245/cedana-amd64.tar.gz
tar -xzvf cedana.tar.gz
chmod +x cedana
mv cedana /usr/local/bin/cedana
rm cedana.tar.gz
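To repeat the download on many nodes, the steps above can be wrapped in a small script with the version as a parameter. This is a sketch under stated assumptions: the function names are hypothetical, the URL pattern is inferred from the single amd64 link above, and the `arm64` asset name is an assumption that should be checked against the releases page.

```shell
#!/bin/sh
# Release pinned in this guide; override CEDANA_VERSION to install a different one.
CEDANA_VERSION="${CEDANA_VERSION:-v0.9.245}"

# Map a machine name (default: `uname -m`) to the suffix used in the asset name.
# Only the amd64 asset appears in this guide; arm64 is an assumption.
cedana_arch() {
    machine="${1:-$(uname -m)}"
    case "$machine" in
        x86_64|amd64) echo "amd64" ;;
        aarch64|arm64) echo "arm64" ;;   # assumed asset name
        *) echo "unsupported architecture: $machine" >&2; return 1 ;;
    esac
}

# Build the download URL for a given version and architecture.
cedana_release_url() {
    echo "https://github.com/cedana/cedana/releases/download/$1/cedana-$2.tar.gz"
}

# Download, unpack, and install the binary to /usr/local/bin (run as root).
install_cedana() {
    arch=$(cedana_arch) || return 1
    tmp=$(mktemp -d) || return 1
    curl -fL -o "$tmp/cedana.tar.gz" "$(cedana_release_url "$CEDANA_VERSION" "$arch")" &&
        tar -xzf "$tmp/cedana.tar.gz" -C "$tmp" &&
        install -m 0755 "$tmp/cedana" /usr/local/bin/cedana
    status=$?
    rm -rf "$tmp"   # clean up the archive and extracted files either way
    return $status
}

# Usage:
#   CEDANA_VERSION=v0.9.245 install_cedana
```

Passing `-f` to curl makes a missing release fail immediately instead of saving an error page as the archive.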
Install CRIU
To install a plugin from the online registry, you need to be authenticated. See plugins for more information.
A modified version of CRIU is shipped as a plugin for Cedana, so you don't need to install it separately. You can simply do:
sudo cedana plugin install criu
This version of CRIU is not a requirement for Cedana, but it is recommended for certain features, such as the checkpoint/restore streamer.
To install CRIU independently, see the CRIU installation guide.
Start the daemon
The daemon requires root privileges for checkpoint/restore operations. Check the CLI reference for all options.
You can directly start the daemon with:
sudo cedana daemon start
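In a node setup script it can help to fail early when the binary is missing and to use sudo only when needed. A minimal sketch follows; the wrapper functions are hypothetical, and the only Cedana command used is the `cedana daemon start` shown above.

```shell
#!/bin/sh
# Hypothetical pre-flight helpers for starting the daemon from a setup script.

# Succeeds if the named command is available on PATH.
have_cmd() { command -v "$1" >/dev/null 2>&1; }

# Succeeds if the current user is root.
is_root() { [ "$(id -u)" -eq 0 ]; }

start_cedana_daemon() {
    if ! have_cmd cedana; then
        echo "cedana not found on PATH; install it first" >&2
        return 127
    fi
    if is_root; then
        cedana daemon start
    else
        # The daemon needs root privileges for checkpoint/restore.
        sudo cedana daemon start
    fi
}

# Usage:
#   start_cedana_daemon
```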
If you use systemd, you can also install the daemon as a service (when building from source):
make install-systemd
Try make help to see all available targets.
Health check the daemon
The daemon can be health checked to ensure it fully supports the system and is ready to accept requests. See health checks for more information.
Enable and Start the Service
After the installation is complete, you need to enable the Cedana service to ensure it starts automatically on system boot. Then, start the service to activate it immediately.
Run the following commands:
# Enable the service to start on boot
sudo systemctl enable cedana
# Start the service now
sudo systemctl start cedana
Once these steps are completed on all nodes, the Cedana plugin will be installed and running in your cluster.
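Because the service must be running on every node, the enable and start steps can be scripted over SSH. The sketch below is one possible approach, not a Cedana tool: `run_on_node` and `rollout` are hypothetical names, it assumes passwordless SSH and sudo on each node, and it combines the two `systemctl` commands above into `systemctl enable --now`.

```shell
#!/bin/sh
# Hypothetical rollout helper: enable, start, and verify the cedana service on each node.
# Set DRY_RUN=1 to print the commands instead of running them.
REMOTE_CMD='sudo systemctl enable --now cedana && systemctl is-active cedana'

run_on_node() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "ssh $1 $REMOTE_CMD"
    else
        ssh "$1" "$REMOTE_CMD"
    fi
}

# Run on every node given as an argument; return non-zero if any node failed.
rollout() {
    failed=0
    for node in "$@"; do
        run_on_node "$node" || { echo "failed on $node" >&2; failed=1; }
    done
    return $failed
}

# Usage, taking the compute node list from SLURM (nodes in several
# partitions are listed more than once, hence sort -u):
#   rollout $(sinfo -N -h -o '%N' | sort -u)
# The controller node is not in that list and must be handled separately.
```

Running with `DRY_RUN=1` first shows exactly which commands would be sent to which hosts.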