Checkpoint/Restore Service API

Getting started with checkpointing, migrating and restoring containers and pods in Kubernetes.

NOTE: When running Cedana against an EKS cluster, ensure that cgroup v2 is available on your AMI and used by your container runtime. Checkpoint/restore will not work otherwise.

Authentication

Create an account at https://auth.cedana.com/ui/registration. With your username and password, you can obtain a session token in two steps. First, start a login flow and store the returned action URL:

actionUrl=$(curl -s -X GET -H "Accept: application/json" \
'https://auth.cedana.com/self-service/login/api' | jq -r '.ui.action')

This returns a UI action flow URL, which you can use to authenticate and grab a token:

curl -s -X POST -H "Accept: application/json" -H "Content-Type: application/json" \
-d '{"identifier": "your-email", "password": "your-password", "method": "password"}' \
"$actionUrl" | jq

The token in the response is valid for 720 hours (30 days) and can be used to authenticate all requests to our services.
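The two requests above can be chained in a small script. This is a minimal sketch rather than an official client: the function name cedana_login is hypothetical, and it assumes the login response carries the token in a top-level session_token field, which the text above does not confirm. It needs curl and jq.

```shell
#!/usr/bin/env bash
# Sketch: run both login requests and print the session token.
# ASSUMPTION: the login response has a top-level "session_token" field.

cedana_login() {
  local email=$1 password=$2
  local action_url payload

  # Step 1: start a login flow and read the action URL from the response.
  action_url=$(curl -s -X GET -H "Accept: application/json" \
    'https://auth.cedana.com/self-service/login/api' | jq -r '.ui.action')

  # Let jq build the body so special characters in the password are escaped.
  payload=$(jq -n --arg id "$email" --arg pw "$password" \
    '{identifier: $id, password: $pw, method: "password"}')

  # Step 2: submit the credentials to the action URL and extract the token.
  curl -s -X POST -H "Accept: application/json" -H "Content-Type: application/json" \
    -d "$payload" "$action_url" | jq -r '.session_token'
}

# Usage:
#   token=$(cedana_login "your-email" "your-password")
```

Building the request body with jq instead of string interpolation keeps the JSON valid even when the password contains quotes or backslashes.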

Bootstrapping

To get started, create a service account that Cedana can use to deploy the Cedana binary onto your instances:

kubectl -n kube-system create serviceaccount <service-account-name>

Next, create a cluster role binding that grants the service account the cluster-admin role:

kubectl create clusterrolebinding <binding-name> --clusterrole=cluster-admin --serviceaccount=kube-system:<service-account-name>

We now need to obtain an auth token. Start by applying the following secret:

apiVersion: v1
kind: Secret
metadata:
  name: <kubeconfig-sa-token-name>
  namespace: kube-system
  annotations:
    kubernetes.io/service-account.name: <service-account-name>
type: kubernetes.io/service-account-token
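The manifest can also be generated and applied without a separate file. A minimal sketch, in which the function name sa_token_secret and the example names in the usage comment are hypothetical:

```shell
#!/usr/bin/env bash
# Sketch: print the service-account token Secret manifest for the given names.

sa_token_secret() {
  local secret_name=$1 service_account=$2
  cat <<EOF
apiVersion: v1
kind: Secret
metadata:
  name: ${secret_name}
  namespace: kube-system
  annotations:
    kubernetes.io/service-account.name: ${service_account}
type: kubernetes.io/service-account-token
EOF
}

# Usage (names are placeholders):
#   sa_token_secret cedana-sa-token cedana-sa | kubectl apply -f -
```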

We can now obtain the data needed to call Cedana's Kubernetes attach endpoint. The following command returns the service account token:

kubectl describe secrets <kubeconfig-sa-token-name> -n kube-system

We also need the certificate authority (CA) certificate, which is stored base64-encoded in the same secret:

kubectl get secret <kubeconfig-sa-token-name> -n kube-system -o jsonpath='{.data.ca\.crt}'
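For scripting, both values can be read from the secret with jsonpath, which avoids picking the token out of the kubectl describe output by hand. The sketch below uses hypothetical function names. It decodes the token, to match what kubectl describe prints, and leaves the certificate base64-encoded, to match the command above.

```shell
#!/usr/bin/env bash
# Sketch: read the token and CA certificate from the service-account secret.

secret_field() {
  # $1 = secret name, $2 = key under .data, with dots escaped (for example 'ca\.crt')
  kubectl get secret "$1" -n kube-system -o "jsonpath={.data.$2}"
}

sa_token() {
  # Secret data is stored base64-encoded, so decode it to get the raw token.
  secret_field "$1" token | base64 --decode
}

sa_ca_cert() {
  # Returned as stored in the secret, that is, still base64-encoded.
  secret_field "$1" 'ca\.crt'
}

# Usage:
#   token=$(sa_token <kubeconfig-sa-token-name>)
#   cert=$(sa_ca_cert <kubeconfig-sa-token-name>)
```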

We can now call the attach endpoint to deploy Cedana to your Kubernetes cluster:

curl -X POST \
-H "Content-Type: application/json" \
-H "Authorization: Bearer YOUR_ACCESS_TOKEN" \
-d '{
"server": "your_cluster_server_url",
"token": "your_token_value",
"cert": "your_cert_value"
}' https://sandbox.cedana.ai/kubernetes/attach
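The same request can be parameterized so the values collected earlier are passed in as arguments. This is a sketch with hypothetical function names. It assumes that "server" is the cluster's API server URL, which the usage comment reads from the current kubeconfig context.

```shell
#!/usr/bin/env bash
# Sketch: build the attach payload and send it to the attach endpoint.

attach_payload() {
  # $1 = server URL, $2 = service account token, $3 = CA certificate
  # Values are inserted verbatim; URLs, JWTs and base64 text contain no
  # double quotes or backslashes, so no JSON escaping is needed here.
  printf '{"server": "%s", "token": "%s", "cert": "%s"}' "$1" "$2" "$3"
}

cedana_attach() {
  # $1 = Cedana session token, $2..$4 = server, token, cert
  curl -X POST \
    -H "Content-Type: application/json" \
    -H "Authorization: Bearer $1" \
    -d "$(attach_payload "$2" "$3" "$4")" \
    https://sandbox.cedana.ai/kubernetes/attach
}

# Usage (ASSUMPTION: "server" is the API server URL of the current context):
#   server=$(kubectl config view --minify -o jsonpath='{.clusters[0].cluster.server}')
#   cedana_attach "$YOUR_ACCESS_TOKEN" "$server" "$token" "$cert"
```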

Once Cedana is attached, a CustomResourceDefinition called Cedana and a Kubernetes operator called Cedana_Controller are deployed to your Kubernetes cluster. You can now create an instance of the Cedana resource to checkpoint and restore containers in your cluster. The Cedana_Controller pod also contains a REST service with the following endpoints:

Checkpoint/Restore - REST Service

The Cedana REST service provides an API for checkpointing and restoring containers in your Kubernetes cluster. It runs concurrently with the Cedana controller. The curl commands below illustrate the schema of the API. All of them use the in-cluster IP of the Cedana controller pod. To checkpoint and restore from outside the cluster, you would have to expose the pod through a Kubernetes Service with an external IP address.

List Containers in Namespace

GET /list/:namespace

List containers in a specific namespace by querying Kubernetes pods with specific labels.

Response

  • Returns a JSON array listing the containers in the specified namespace.
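A request against this endpoint follows the same pattern as the other curl commands in this section. A minimal sketch with a hypothetical function name. The shape of the array elements is not documented here, so the output is printed as returned.

```shell
#!/usr/bin/env bash
# Sketch: list containers in a namespace via the controller's REST service.

cedana_list() {
  local controller_ip=$1 namespace=$2
  curl -s "http://${controller_ip}/list/${namespace}"
}

# Usage:
#   cedana_list <CONTROLLER_CLUSTER_IP> default
```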

Checkpoint

Initiate a checkpoint for a container:

curl -X POST -H "Content-Type: application/json" -d '{
"sandbox_id": "sandbox_id",
"container_name": "container_name",
"namespace": "namespace"
}' http://<CONTROLLER_CLUSTER_IP>/checkpoint

Arguments:

  • sandbox_id: Identifier for the sandbox.
  • container_name: Name of the container to checkpoint.
  • namespace: Namespace in which the container resides.

Response:

  • checkpoint_id: A UUID associated with the checkpoint. Pass it to the restore endpoint to restore from this checkpoint.
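Because the returned identifier is needed later for restore, it helps to capture it when the checkpoint is taken. The sketch below uses a hypothetical function name and assumes the response body is a JSON object with a top-level checkpoint_id field. It needs curl and jq.

```shell
#!/usr/bin/env bash
# Sketch: trigger a checkpoint and print the returned checkpoint ID.
# ASSUMPTION: the response is a JSON object with a "checkpoint_id" field.

cedana_checkpoint() {
  local controller_ip=$1 sandbox_id=$2 container_name=$3 namespace=$4
  local payload

  # Build the request body with jq so the values are escaped correctly.
  payload=$(jq -n --arg s "$sandbox_id" --arg c "$container_name" --arg n "$namespace" \
    '{sandbox_id: $s, container_name: $c, namespace: $n}')

  curl -s -X POST -H "Content-Type: application/json" -d "$payload" \
    "http://${controller_ip}/checkpoint" | jq -r '.checkpoint_id'
}

# Usage:
#   checkpoint_id=$(cedana_checkpoint <CONTROLLER_CLUSTER_IP> sandbox_id container_name namespace)
```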

Restore

Restore a container from a checkpoint:

curl -X POST -H "Content-Type: application/json" -d '{
"sandbox_id": "sandbox_id",
"container_name": "container_name",
"namespace": "namespace",
"checkpoint_id": "checkpoint_id"
}' http://<CONTROLLER_CLUSTER_IP>/restore

Arguments:

  • sandbox_id: Identifier for the sandbox.
  • container_name: Name of the container to restore.
  • namespace: Namespace in which the container resides.
  • checkpoint_id: Identifier for the checkpoint to restore.

Response:

  • Status Code: 200 OK
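Since success is signalled only by the status code, a script should check it explicitly. A minimal sketch with a hypothetical function name, using curl's -w '%{http_code}' option to read the status:

```shell
#!/usr/bin/env bash
# Sketch: restore from a checkpoint and fail unless the service answers 200.

cedana_restore() {
  local controller_ip=$1 sandbox_id=$2 container_name=$3 namespace=$4 checkpoint_id=$5
  local payload status

  # Values are inserted verbatim; this assumes they contain no double
  # quotes or backslashes (true for Kubernetes names and UUIDs).
  payload=$(printf '{"sandbox_id": "%s", "container_name": "%s", "namespace": "%s", "checkpoint_id": "%s"}' \
    "$sandbox_id" "$container_name" "$namespace" "$checkpoint_id")

  # Discard the body and capture only the HTTP status code.
  status=$(curl -s -o /dev/null -w '%{http_code}' -X POST \
    -H "Content-Type: application/json" -d "$payload" \
    "http://${controller_ip}/restore")

  if [ "$status" = "200" ]; then
    echo "restore accepted"
  else
    echo "restore failed with HTTP status ${status}" >&2
    return 1
  fi
}

# Usage:
#   cedana_restore <CONTROLLER_CLUSTER_IP> sandbox_id container_name namespace checkpoint_id
```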

Checkpoint/Restore - CRD

With Cedana, you can checkpoint a container, a set of containers, or an entire pod, and then restore into a container or into an existing pod.

To perform any Cedana operation against pods or containers in the cluster, you apply a resource defined by the Cedana CustomResourceDefinition, the schema of which is shown below:

apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  annotations:
    controller-gen.kubebuilder.io/version: v0.13.0
  name: cedanas.core.cedana.ai
spec:
  group: core.cedana.ai
  names:
    kind: Cedana
    listKind: CedanaList
    plural: cedanas
    singular: cedana
  scope: Namespaced
  versions:
  - name: v1
    schema:
      openAPIV3Schema:
        description: Cedana is the Schema for the cedanas API
        properties:
          apiVersion:
            description: 'APIVersion defines the versioned schema of this representation
              of an object. Servers should convert recognized schemas to the latest
              internal value, and may reject unrecognized values. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#resources'
            type: string
          kind:
            description: 'Kind is a string value representing the REST resource this
              object represents. Servers may infer this from the endpoint the client
              submits requests to. Cannot be updated. In CamelCase. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds'
            type: string
          metadata:
            type: object
          spec:
            description: CedanaSpec defines the desired state of Cedana
            properties:
              containerName:
                type: string
              method:
                enum:
                - Checkpoint
                - Restore
                - Pending
                - Failed
                - Done
                type: string
              sandboxName:
                type: string
            type: object
          status:
            description: CedanaStatus defines the observed state of Cedana
            properties:
              bundlePath:
                type: string
              checkpointDone:
                description: 'INSERT ADDITIONAL STATUS FIELD - define observed state
                  of cluster Important: Run "make" to regenerate code after modifying
                  this file'
                type: boolean
              lastCheckpointedTime:
                format: date-time
                type: string
              restoreDone:
                type: boolean
            type: object
        type: object
    served: true
    storage: true
    subresources:
      status: {}

Manual Checkpoint & Restore

To manually checkpoint, create a Cedana resource that sets the method and specifies the sandboxName, which you can get from kubectl get pods. An example resource is shown below:

---
apiVersion: core.cedana.ai/v1
kind: Cedana
metadata:
  name: <resource-name>
spec:
  method: Checkpoint
  sandboxName: some-sandbox-id-from-kubectl-get-pods

Then apply this with kubectl apply -f file.yaml to conduct a checkpoint.

To manually restore, set method to Restore instead.
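The CRD schema declares status fields (checkpointDone, restoreDone, lastCheckpointedTime) that can be read back with kubectl. The sketch below polls checkpointDone on a Cedana resource. The function name is hypothetical, and it assumes the controller sets this field to true once the checkpoint finishes, which the schema alone does not guarantee.

```shell
#!/usr/bin/env bash
# Sketch: wait until a Cedana resource reports status.checkpointDone = true.
# ASSUMPTION: the controller sets checkpointDone when the checkpoint completes.

cedana_wait_checkpoint() {
  local name=$1 namespace=$2 attempts=${3:-30} delay=${4:-2}
  local i=0 finished

  while [ "$i" -lt "$attempts" ]; do
    finished=$(kubectl get cedana "$name" -n "$namespace" \
      -o 'jsonpath={.status.checkpointDone}')
    if [ "$finished" = "true" ]; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done

  echo "checkpoint not reported done after ${attempts} attempts" >&2
  return 1
}

# Usage:
#   kubectl apply -f file.yaml
#   cedana_wait_checkpoint <resource-name> <namespace>
```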