Checkpoint/restore basics

The Cedana daemon is designed to checkpoint/restore processes as well as containers.

Checkpoint

To checkpoint:

cedana dump <type> ...

Where <type> can be process, containerd, runc, job, etc. See features for all plugins that support checkpointing.

For example, to checkpoint a process:

cedana dump process <PID> --dir /tmp

The --dir flag specifies the parent directory where the checkpoint will be stored. If it is not provided, the checkpoint is stored in the default checkpoint directory from the configuration, or in /tmp if none is set. You may also pass a --name flag to give the checkpoint file a custom name.
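As a minimal sketch of the two flags described above, the following shell helper wraps the dump command with both --dir and --name. The function name checkpoint_process and the values in the example call (PID 1234, the name my-checkpoint) are hypothetical placeholders, not part of the Cedana CLI.

```shell
# Hypothetical helper: checkpoint a process into a chosen directory
# under a custom name. Only the flags described above are used.
checkpoint_process() {
    pid="$1"    # PID of the process to checkpoint
    dir="$2"    # parent directory for the checkpoint (--dir)
    name="$3"   # custom checkpoint name (--name)
    cedana dump process "$pid" --dir "$dir" --name "$name"
}

# Example call (placeholder values):
#   checkpoint_process 1234 /tmp my-checkpoint
```

Omitting --dir and --name from the wrapped command would fall back to the defaults described above.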

See CLI reference for all available options for process checkpoint.

Restore

Using daemon

cedana restore <type> ...

Where <type> can be process, containerd, runc, job, etc. See features for all plugins that support restoring.

For example, to restore a process:

cedana restore process --path <path-to-dump>

Note that for restore the flag is called --path rather than --dir (as in dump). This is because it can point either to a compressed file or, if the checkpoint is not compressed, to a directory.
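Because --path accepts either form, a small wrapper can treat both cases the same way. The sketch below assumes the checkpoint lives on the local filesystem. The function name restore_process and the existence check are illustrative additions, not Cedana behavior.

```shell
# Hypothetical helper: restore a process from a checkpoint path.
# The path may be a directory (uncompressed checkpoint) or a single
# file (compressed checkpoint); both are passed through --path.
restore_process() {
    path="$1"
    # Fail early if nothing exists at the given local path.
    if [ ! -e "$path" ]; then
        echo "checkpoint path not found: $path" >&2
        return 1
    fi
    cedana restore process --path "$path"
}

# Example call (placeholder path):
#   restore_process /tmp/my-checkpoint
```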

Without daemon

It's also possible to restore a process directly as a child of the current shell, without the daemon. This is useful when you want to interact with the restored process directly, for example when restoring a shell process.

See CLI reference for all available options for process restore.

Managed checkpoint/restore

As explained in managed process/container, a job can be of any type, and thus can be checkpointed and restored using the cedana dump job and cedana restore job subcommands.

The cedana dump job and cedana restore job subcommands have the same options as their non-managed counterparts, but with sensible defaults. For example, the --path flag is not required for cedana restore job, as the checkpoint path is stored in the job metadata.
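The managed flow can be sketched as two thin wrappers. This assumes that dump job and restore job take the job ID as a positional argument, in the same way as cedana job checkpoints <job_id>; that argument form is an assumption here, so check the CLI reference before relying on it. The function names and the job ID in the example calls are hypothetical.

```shell
# Assumption: `dump job` and `restore job` accept the job ID as a
# positional argument. Verify against the CLI reference.

# Checkpoint a managed job, relying on the default options.
checkpoint_job() {
    job_id="$1"
    cedana dump job "$job_id"
}

# Restore a managed job. No --path is passed, since the checkpoint
# path is stored in the job metadata.
restore_job() {
    job_id="$1"
    cedana restore job "$job_id"
}

# Example calls (placeholder job ID):
#   checkpoint_job my-job
#   restore_job my-job
```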

If you run cedana job list after checkpointing a job, the output shows the latest checkpoint time and size.

To view all checkpoints for a job, use cedana job checkpoints <job_id>.

Compression

The cedana dump subcommand supports a --compression flag to specify the compression algorithm to use. For example, a dump that combines --dir /tmp and --name xyz with gzip compression creates a compressed checkpoint file at /tmp/xyz.tar.gz. The --name flag is optional; if it is not provided, the daemon chooses a unique name based on some metadata.

When restoring, the daemon automatically detects the compression algorithm used and decompresses the file. Simply pass the path to the compressed file with the --path flag.

Supported values for --compression are none, tar, gzip, lz4, zlib.

You may also specify the default compression algorithm in the configuration.
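To tie the compression options in this section together, here is a hedged sketch of a helper that validates the algorithm against the supported values listed above before dumping. The function name checkpoint_compressed is hypothetical, and the /tmp directory and xyz name are reused from the example path in this section.

```shell
# Hypothetical helper: checkpoint a process with an explicit
# compression algorithm, rejecting values not listed as supported.
checkpoint_compressed() {
    pid="$1"     # PID of the process to checkpoint
    algo="$2"    # one of: none, tar, gzip, lz4, zlib
    case "$algo" in
        none|tar|gzip|lz4|zlib) ;;
        *)
            echo "unsupported compression: $algo" >&2
            return 1
            ;;
    esac
    cedana dump process "$pid" --dir /tmp --name xyz --compression "$algo"
}

# Example call (placeholder PID):
#   checkpoint_compressed 1234 gzip
```

Assuming the gzip case produces the /tmp/xyz.tar.gz file mentioned above, that path can then be passed unchanged to cedana restore process --path.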

Remote storage

Cedana supports checkpointing to and restoring from remote storage through storage plugins. See the storage plugin guides for details on specific remote storage backends.
