Managed process/container
The Cedana daemon is designed to manage the entire lifecycle of a process/container, including checkpoint/restore, in the larger Cedana system.
Managed processes/containers are those that are spawned using cedana run (see the CLI reference). This command creates a managed job, which can be checkpointed and restored using the cedana dump job and cedana restore job subcommands.
By default, jobs are stored in a local DB (in /tmp). You may set db.path in the configuration to change the path so that the DB persists across restarts. If you're authenticated, you may instead set db.remote to true in the configuration to use a remote DB at your specified Cedana endpoint.
To run a new managed job, use cedana run <type>, where <type> can be process, containerd, runc, etc. See the feature matrix for all plugins that support running managed jobs.
For example, a new managed process uses the process type. The --jid flag is optional; if it is not provided, a random job ID is generated.
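As a minimal sketch of running a managed process with an explicit job ID: the job ID my-job and the sleep 300 workload are hypothetical placeholders, and placing --jid before the workload command is an assumption about argument order, so check the CLI reference for the exact syntax. The snippet prints the command and executes it only where the cedana CLI is installed.

```shell
# Hypothetical job ID; omit --jid entirely to get a randomly generated one.
JID="my-job"

# Assumption: flags go before the workload command (here, a placeholder sleep).
set -- cedana run process --jid "$JID" sleep 300
PREVIEW="$*"
echo "$PREVIEW"

# Execute only where the cedana CLI is installed (the daemon must be running).
if command -v cedana >/dev/null 2>&1; then
  "$@"
fi
```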
It's also possible to start managing an existing process/container. Here too, <type> can be process, containerd, runc, etc. See the feature matrix for all plugins that support managing existing jobs.
For example, to manage an existing process:
The cedana job command has many subcommands, such as list, kill, delete, etc. Check the CLI reference for all available subcommands.
cedana ps is a shorthand for cedana job list.
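The subcommands named above might be used as in the following sketch. The job ID my-job is a hypothetical placeholder, and passing a job ID as the argument to kill and delete is an assumption, so confirm the arguments in the CLI reference. The small run helper is not part of Cedana; it only prints each command and executes it where the cedana CLI is installed.

```shell
# Hypothetical job ID; use one shown by the job listing.
JID="my-job"

# Helper (not part of Cedana): print a command, then run it only if
# the cedana CLI is available on this machine.
run() {
  echo "$*"
  if command -v cedana >/dev/null 2>&1; then
    "$@"
  fi
}

run cedana job list           # list managed jobs
run cedana ps                 # shorthand for the listing above
run cedana job kill "$JID"    # assumption: kill takes a job ID
run cedana job delete "$JID"  # assumption: delete takes a job ID
```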
To attach to the I/O of a job, use the --attach flag when you run it. This attaches the job's standard input, output, and error streams to the terminal, and also reports the job's exit status. Press Ctrl+C to detach; the job will continue running in the background.
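A hedged sketch of an attached run follows. The workload ./my_script.sh is a hypothetical placeholder, and reading the job's exit status from the shell's $? assumes the CLI exits with that status, which the text above suggests but does not spell out.

```shell
# Hypothetical workload; substitute your own command.
set -- cedana run process --attach ./my_script.sh
PREVIEW="$*"
echo "$PREVIEW"

# Execute only where the cedana CLI is installed.
if command -v cedana >/dev/null 2>&1; then
  "$@"
  STATUS=$?
  # Assumption: the CLI's exit code mirrors the job's exit status.
  echo "exit status: $STATUS"
fi
```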
If you list the jobs, you will see that the job is attachable:
To attach to the job again, use the cedana job attach subcommand, or the generic cedana attach if you want to attach using the job's PID.
If you want to make a job attachable, but not attach to it immediately, you can use the --attachable flag.
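The attach options described above could be combined roughly as follows. The job ID, PID, and workload are hypothetical placeholders, and passing the job ID as the argument to cedana job attach is an assumption. The two attach commands are alternatives, shown one after the other only for comparison.

```shell
JID="my-job"   # hypothetical job ID
PID=12345      # hypothetical PID of the job's process

# Helper (not part of Cedana): print a command, then run it only if
# the cedana CLI is available on this machine.
run() {
  echo "$*"
  if command -v cedana >/dev/null 2>&1; then
    "$@"
  fi
}

# Start the job with attachable I/O, without attaching right away.
run cedana run process --attachable --jid "$JID" ./my_script.sh

# Later, attach by job ID (assumption: job attach takes the job ID) ...
run cedana job attach "$JID"

# ... or attach by PID with the generic subcommand.
run cedana attach "$PID"
```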
By default, a job's stdout/stderr are stored in the /var/log/ directory. If you run cedana job list, you will see the path to the log file.
If the job has attachable I/O, it will appear as such:
Once the daemon has started managing a job, it can be checkpointed and restored using the cedana dump job and cedana restore job subcommands. See managed checkpoint/restore basics for more information.
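As a rough sketch of that flow, the function below checkpoints a job and then restores it. The function name checkpoint_then_restore and the job ID my-job are hypothetical, and passing the job ID as the only argument to each subcommand is an assumption; the managed checkpoint/restore basics guide covers the actual options.

```shell
# Hypothetical helper: dump a managed job, then restore it.
# Assumption: both subcommands accept the job ID as their argument.
checkpoint_then_restore() {
  jid="$1"
  for action in dump restore; do
    echo "cedana $action job $jid"
    # Execute only where the cedana CLI is installed; stop on failure.
    if command -v cedana >/dev/null 2>&1; then
      cedana "$action" job "$jid" || return 1
    fi
  done
}

checkpoint_then_restore "my-job"
```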
GPU C/R support is only available for managed jobs. Check out the checkpoint/restore with GPUs guide for more information.