
# Installation

{% hint style="success" %}
To use Cedana in SLURM, you need to be registered with us! Reach out to [founders@cedana.ai](mailto:founders@cedana.ai) to get set up with an organization.
{% endhint %}

{% hint style="info" %}
You can also deploy fully self-hosted, with zero limitations on where you can store your checkpoints! Check out [configuration](https://docs.cedana.ai/daemon/get-started/configuration). If you have any questions, please reach out to us at <founders@cedana.ai>.
{% endhint %}

You can install Cedana on a SLURM node in three ways:

1. [Install from web (recommended)](#install-from-web-recommended)
2. [Install using Cedana](#install-using-cedana)
3. [Build from source](#build-from-source)

{% hint style="warning" %}
These steps must be performed on **at least one SLURM controller and one compute node**. Compute nodes without Cedana continue to function normally, but without checkpoint/restore capability.
{% endhint %}

{% hint style="warning" %}
After it completes successfully, the installer restarts the `slurmctld` and `slurmd` daemons on the controller and compute nodes. This may briefly interrupt job submission.
{% endhint %}

## Install from web (recommended)

The web installer will automatically install the latest stable version of Cedana and all plugins required for SLURM support with sane defaults.

### Install

{% hint style="success" %}
Check [Authentication](/get-started/authentication.md) for more details on how to get an authentication token.
{% endhint %}

```sh
export CEDANA_URL=https://myorg.cedana.ai/v1
export CEDANA_AUTH_TOKEN=your_auth_token
export CEDANA_CLUSTER_ID=your_cluster_id

curl -fsSL "${CEDANA_URL}/install/slurm" -H "Authorization: Bearer ${CEDANA_AUTH_TOKEN}" | sudo -E bash -s -- --node-role <node-role>
```

* Register a new cluster through the [Cedana Dashboard](https://ui.cedana.com/slurm/clusters).
* Use the `?version=x.y.z` query parameter to install a specific version.
* Use `?build=alpha&version=feat/my-branch` to install an alpha build from a branch.
* Use `--node-role controller` on controller nodes, `--node-role worker` on worker nodes, and `--node-role login` on login (submission) nodes.
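
The query parameters above can be assembled with a small shell helper. This is only a sketch: `installer_url` is a hypothetical name, not part of Cedana, and it assumes `CEDANA_URL` is exported as in the install command.

```sh
# Hypothetical helper (not part of Cedana): build the installer URL.
#   $1 = optional version (x.y.z, or a branch name for alpha builds)
#   $2 = optional build channel (for example: alpha)
installer_url() {
  url="${CEDANA_URL}/install/slurm"
  sep='?'
  if [ -n "${2:-}" ]; then
    url="${url}${sep}build=${2}"
    sep='&'
  fi
  if [ -n "${1:-}" ]; then
    url="${url}${sep}version=${1}"
  fi
  printf '%s\n' "$url"
}
```

For example, `installer_url x.y.z` appends `?version=x.y.z`, and `installer_url feat/my-branch alpha` appends `?build=alpha&version=feat/my-branch`. The resulting URL can be passed to the same `curl ... | sudo -E bash -s -- --node-role <node-role>` pipeline shown above.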

### Configure

To change the configuration, follow the instructions in [Cedana Daemon configuration](https://docs.cedana.ai/daemon/get-started/configuration).

After you have made changes to the configuration, simply run the installer again to update and restart Cedana on the node.

```sh
export CEDANA_URL=https://myorg.cedana.ai/v1
export CEDANA_AUTH_TOKEN=your_auth_token

curl -fsSL "${CEDANA_URL}/install/slurm" -H "Authorization: Bearer ${CEDANA_AUTH_TOKEN}" | sudo -E bash -s -- --node-role <node-role>
```

And you're all set! Check out [Manual Checkpoint/Restore](/cedana-slurm/cr.md) to test it out. The sections below cover the alternative methods for installing Cedana SLURM.

## Install using Cedana

If you already have Cedana installed, you can install SLURM support directly through it.

### Install

First, install Cedana by following instructions on [Cedana Daemon installation](https://docs.cedana.ai/daemon/get-started/installation).

{% hint style="success" %}
Check [Authentication](/get-started/authentication.md) for more details on how to get an authentication token.
{% endhint %}

Then, install the `slurm` plugin and run the setup:

```sh
export CEDANA_URL=https://myorg.cedana.ai/v1
export CEDANA_AUTH_TOKEN=your_auth_token
export CEDANA_CLUSTER_ID=your_cluster_id

sudo cedana plugin install slurm
sudo cedana slurm setup --node-role <node-role>
```

* Register a new cluster through the [Cedana Dashboard](https://ui.cedana.com/slurm/clusters).
* Use `--node-role controller` on controller nodes, `--node-role worker` on worker nodes, and `--node-role login` on login (submission) nodes.
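
When the setup is scripted across many nodes, it can help to reject an unknown role before anything is changed. The following is a minimal sketch; `valid_node_role` is a hypothetical wrapper, not a Cedana command, and it only knows the three roles listed above.

```sh
# Hypothetical wrapper (not part of Cedana): accept only the documented roles.
valid_node_role() {
  case "${1:-}" in
    controller|worker|login) return 0 ;;
    *) echo "unknown node role: ${1:-}" >&2; return 1 ;;
  esac
}

# Usage (commented out so the snippet has no side effects):
# valid_node_role "$ROLE" && sudo cedana slurm setup --node-role "$ROLE"
```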

This should set up everything required. If you prefer to set things up manually, follow the [next section](#install-manual).

### Install (manual)

For deployments that require installing the plugin files manually, you can download the files directly.

First, install Cedana by following instructions on [Cedana Daemon installation](https://docs.cedana.ai/daemon/get-started/installation).

{% hint style="success" %}
Check [Authentication](/get-started/authentication.md) for more details on how to get an authentication token.
{% endhint %}

To get the Cedana SLURM plugin:

```sh
export CEDANA_URL=https://myorg.cedana.ai/v1
export CEDANA_AUTH_TOKEN=your_auth_token
export CEDANA_CLUSTER_ID=your_cluster_id

sudo cedana version --init-config
sudo cedana plugin remove slurm slurm/wlm
sudo cedana plugin install slurm slurm/wlm
```

For `alpha` builds:

```sh
export CEDANA_URL=https://myorg.cedana.ai/v1
export CEDANA_AUTH_TOKEN=your_auth_token
export CEDANA_CLUSTER_ID=your_cluster_id
export CEDANA_PLUGINS_BUILDS=alpha

sudo cedana version --init-config
sudo cedana plugin remove slurm slurm/wlm
sudo cedana plugin install slurm@main slurm/wlm@main-slurm-25-11-5-1
```

This will download the `cedana-slurm` binary to `/usr/local/bin` and the SLURM plugin files to `/usr/local/lib`. Remember to replace `slurm-25-11-5-1` above with the SLURM version your cluster is running.
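
A tag of this shape can be derived from the version that SLURM reports. This sketch rests on an assumption inferred from the example tag only: that the tag is the SLURM version with dots replaced by dashes, followed by a release number (the trailing `-1`). `slurm_version_tag` is a hypothetical helper, so check the result against the builds actually available to you.

```sh
# Hypothetical helper: turn a SLURM version into a tag like slurm-25-11-5-1.
#   $1 = SLURM version, for example 25.11.5
#   $2 = release number (assumed to default to 1, as in the example tag)
slurm_version_tag() {
  version="$1"
  release="${2:-1}"
  printf 'slurm-%s-%s\n' "$(printf '%s' "$version" | tr '.' '-')" "$release"
}

# Usage on a node with SLURM installed (`sinfo --version` prints "slurm <version>"):
# slurm_version_tag "$(sinfo --version | awk '{print $2}')"
```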

To install the files, copy them to the required directories:

```sh
# install to the worker nodes (slurmd), controller nodes (slurmctld), and the database node (slurmdbd)
sudo install /usr/local/bin/cedana-slurm <binary-directory>/cedana-slurm
sudo install /usr/local/lib/cli_filter_cedana.so <slurm-plugin-directory>/cli_filter_cedana.so
sudo install /usr/local/lib/job_submit_cedana.so <slurm-plugin-directory>/job_submit_cedana.so
sudo install /usr/local/lib/task_cedana.so <slurm-plugin-directory>/task_cedana.so
sudo install /usr/local/lib/spank_cedana.so <slurm-plugin-directory>/spank_cedana.so
```
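
If you are unsure what `<slurm-plugin-directory>` should be, the running configuration reports it as `PluginDir`. The helper below is a sketch with a hypothetical name, `slurm_plugin_dir`; it reads `scontrol show config` output on standard input and assumes the usual `Key = value` layout.

```sh
# Hypothetical helper: print the PluginDir value from `scontrol show config`
# output supplied on stdin.
slurm_plugin_dir() {
  awk -F' *= *' '$1 == "PluginDir" { print $2; exit }'
}

# Usage on a live cluster:
# scontrol show config | slurm_plugin_dir
```

Note that `PluginDir` may hold several colon-separated directories; in that case, pick the one where the existing SLURM plugins are installed.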

Update `/etc/slurm/plugstack.conf` to include `spank_cedana.so`:

```diff
+required <slurm-plugin-directory>/spank_cedana.so
```
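
When this step is automated, the entry should be added only once, even if the script runs again. A minimal sketch, assuming the entry is exactly the `required` line shown above; `add_spank_entry` is a hypothetical helper name.

```sh
# Hypothetical helper: append the SPANK entry only if it is not already present.
#   $1 = path to plugstack.conf
#   $2 = SLURM plugin directory
add_spank_entry() {
  line="required $2/spank_cedana.so"
  grep -qxF "$line" "$1" 2>/dev/null || printf '%s\n' "$line" >> "$1"
}

# Usage (needs write access to the file, for example as root):
# add_spank_entry /etc/slurm/plugstack.conf <slurm-plugin-directory>
```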

Update the `/etc/slurm/slurm.conf` to include the plugins:

```diff
-TaskPlugin=task/affinity,task/cgroup
+TaskPlugin=task/affinity,task/cgroup,task/cedana
+CliFilterPlugins=cli_filter/cedana
+JobSubmitPlugins=job_submit/cedana
```
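
Before restarting the daemons, you may want to confirm that all three settings are present. The check below is a sketch: `check_cedana_plugins` is a hypothetical helper, and it assumes each setting is written as `Key=value` at the start of a line with the exact capitalization shown above.

```sh
# Hypothetical check: report Cedana plugin settings missing from a slurm.conf.
#   $1 = path to slurm.conf
# Returns 0 when all three settings are found, 1 otherwise.
check_cedana_plugins() {
  missing=0
  for pattern in \
    '^TaskPlugin=.*task/cedana' \
    '^CliFilterPlugins=.*cli_filter/cedana' \
    '^JobSubmitPlugins=.*job_submit/cedana'
  do
    if ! grep -Eq "$pattern" "$1"; then
      echo "missing: $pattern" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Usage:
# check_cedana_plugins /etc/slurm/slurm.conf && echo "slurm.conf looks complete"
```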

Restart `slurmctld` and `slurmd`:

```sh
sudo systemctl restart slurmctld
sudo systemctl restart slurmd
```

On the database node (`slurmdbd`), start `cedana-slurm`:

```sh
sudo cedana-slurm daemon
```

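Or, if you are using systemd, create the service file, then reload systemd and start the service:

```sh
export LOG_PATH=/var/log/cedana-slurm.log
export SERVICE_FILE=/etc/systemd/system/cedana-slurm.service
export APP_PATH=/usr/local/bin/cedana-slurm

# Write the unit file (sudo is needed to write under /etc/systemd/system)
cat <<EOF | sudo tee "$SERVICE_FILE" >/dev/null
[Unit]
Description=Cedana Daemon

[Service]
ExecStart=$APP_PATH daemon start
User=root
Group=root
Restart=no
StandardError=append:$LOG_PATH
StandardOutput=append:$LOG_PATH

[Install]
WantedBy=multi-user.target
EOF

# Pick up the new unit, then enable it at boot and start it now
sudo systemctl daemon-reload
sudo systemctl enable --now cedana-slurm
```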

### Configure

For changes in Cedana configuration, follow instructions on [Cedana Daemon configuration](https://docs.cedana.ai/daemon/get-started/configuration).

#### Privileged mode (root)

In privileged mode, checkpoint and restore run as root. Privileged mode requires no additional configuration.

#### Unprivileged mode (user)

In unprivileged mode, checkpoint and restore run as the job's user: the UID that owns the SLURM job performs both operations. This is useful when root's privileges are restricted for security reasons. For example, NFS exports with `root_squash` require unprivileged mode.

To enable unprivileged mode, set `Slurm.Unprivileged` to `true` in the [Cedana Daemon configuration](https://docs.cedana.ai/daemon/get-started/configuration). Alternatively, set the environment variable below and merge it into the configuration:

```sh
export CEDANA_URL=https://myorg.cedana.ai/v1
export CEDANA_AUTH_TOKEN=your_auth_token
export CEDANA_SLURM_UNPRIVILEGED=1

sudo cedana version --merge-config
```

In addition, the `cedana-slurm`, `cedana`, and `criu` binaries must have the required capabilities for users to perform checkpoint and restore.

```sh
# setcap must be run as root
sudo setcap CAP_SYS_PTRACE,CAP_DAC_READ_SEARCH,CAP_CHECKPOINT_RESTORE+eip /usr/local/bin/criu
sudo setcap CAP_SYS_PTRACE,CAP_DAC_READ_SEARCH,CAP_CHECKPOINT_RESTORE+eip /usr/local/bin/cedana
sudo setcap CAP_SYS_PTRACE,CAP_DAC_READ_SEARCH,CAP_CHECKPOINT_RESTORE+eip /usr/local/bin/cedana-slurm
```
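
To confirm the capabilities were applied, inspect each binary with `getcap`. The helper below is a sketch with a hypothetical name, `has_cr_caps`; it only checks that the three capability names appear in a line of `getcap` output and does not verify the `eip` flags.

```sh
# Hypothetical check: does a line of `getcap` output name all three capabilities?
#   $1 = output of `getcap <binary>`
has_cr_caps() {
  for cap in cap_sys_ptrace cap_dac_read_search cap_checkpoint_restore; do
    printf '%s\n' "${1:-}" | grep -qi "$cap" || return 1
  done
}

# Usage:
# has_cr_caps "$(getcap /usr/local/bin/criu)" || echo "criu is missing capabilities"
```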

## Build from source

Check `make help` for available build targets.

Build all binaries:

```sh
make all
```

By default, the binaries are built using the `cedana/cedana-slurm:build` Docker image.

These binaries do not work on their own. You need to install Cedana to use them.

First, install Cedana by following instructions on [Cedana Daemon installation](https://docs.cedana.ai/daemon/get-started/installation). Then, install the `slurm` plugin after changing into the build directory:

```sh
cd build
sudo cedana plugin install slurm
sudo cedana slurm setup --node-role <node-role>
```

{% hint style="info" %}
You need to be in the `build` directory for the `cedana slurm setup` command to work, as it needs to find the binaries you just built.
{% endhint %}

{% hint style="success" %}
Check [Authentication](/get-started/authentication.md) for more details on how to get an authentication token.
{% endhint %}

You're all set up! Let's checkpoint some workloads. Continue to [Checkpoint/restore](/cedana-slurm/cr.md) to get started.

## Uninstall

To remove Cedana SLURM completely, run the following on all nodes:

```sh
sudo cedana slurm destroy
```

