KubeRay

KubeRay is a Kubernetes operator that deploys and manages Ray applications, including distributed training and inference jobs, through custom resources such as RayCluster, RayJob, and RayService.

Prerequisites

Before proceeding, make sure you have a running Kubernetes cluster and that kubectl and Helm are installed and configured to reach it.

Install the KubeRay Operator

Add the KubeRay Helm repository and install the operator:

helm repo add kuberay https://ray-project.github.io/kuberay-helm/
helm repo update
helm install kuberay-operator kuberay/kuberay-operator \
  --namespace kuberay-system \
  --create-namespace

Expected output:

NAME: kuberay-operator
LAST DEPLOYED: Fri May  8 15:42:32 2026
NAMESPACE: kuberay-system
STATUS: deployed
REVISION: 1
DESCRIPTION: Install complete
TEST SUITE: None

Verify the operator pod is running and the Ray CRDs were installed:

kubectl get pods -n kuberay-system

Expected output:

NAME                                READY   STATUS    RESTARTS   AGE
kuberay-operator-667ddd4b8f-dhm2j   1/1     Running   0          5m18s

Then list the Ray CRDs:

kubectl get crd | grep ray.io

Expected output:

rayclusters.ray.io    2026-05-08T19:40:24Z
raycronjobs.ray.io    2026-05-08T19:40:24Z
rayjobs.ray.io        2026-05-08T19:40:25Z
rayservices.ray.io    2026-05-08T19:40:26Z
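
The CRD check can also be scripted. The sketch below assumes the output of kubectl get crd has already been captured as text. The helper missing_ray_crds is hypothetical and not part of KubeRay, and the expected set is simply the four names shown in the output above, which may differ between KubeRay versions.

```python
# Names taken from the expected output above (an assumption; adjust for your KubeRay version).
EXPECTED_CRDS = frozenset({
    "rayclusters.ray.io",
    "raycronjobs.ray.io",
    "rayjobs.ray.io",
    "rayservices.ray.io",
})


def missing_ray_crds(kubectl_output: str, expected=EXPECTED_CRDS) -> set:
    """Return the expected CRD names that do not appear in the output."""
    found = set()
    for line in kubectl_output.splitlines():
        fields = line.split()
        # The first column of `kubectl get crd` is the CRD name.
        if fields and fields[0].endswith(".ray.io"):
            found.add(fields[0])
    return set(expected) - found


sample = """\
rayclusters.ray.io    2026-05-08T19:40:24Z
raycronjobs.ray.io    2026-05-08T19:40:24Z
rayjobs.ray.io        2026-05-08T19:40:25Z
rayservices.ray.io    2026-05-08T19:40:26Z
"""
print(missing_ray_crds(sample))  # prints set() when nothing is missing
```

In practice the text could come from subprocess.run(["kubectl", "get", "crd"], capture_output=True, text=True).stdout.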

Create a RayCluster

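Save the following manifest to a file, for example raycluster.yaml, and apply it with kubectl apply -f raycluster.yaml. It defines a RayCluster with one head node and one worker node: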

apiVersion: ray.io/v1
kind: RayCluster
metadata:
  name: test-raycluster
spec:
  rayVersion: "2.9.0"

  headGroupSpec:
    serviceType: ClusterIP
    rayStartParams:
      dashboard-host: "0.0.0.0"

    template:
      spec:
        containers:
        - name: ray-head
          image: rayproject/ray:2.9.0
          resources:
            requests:
              cpu: "500m"
              memory: "1Gi"

  workerGroupSpecs:
  - groupName: workers
    replicas: 1
    minReplicas: 1
    maxReplicas: 1

    template:
      spec:
        containers:
        - name: ray-worker
          image: rayproject/ray:2.9.0
          resources:
            requests:
              cpu: "500m"
              memory: "1Gi"

Watch the cluster come up:

kubectl get raycluster -w

Example output shortly after creation, before the AVAILABLE WORKERS and STATUS columns are populated:

NAME              DESIRED WORKERS   AVAILABLE WORKERS   CPUS   MEMORY   GPUS   STATUS   AGE
test-raycluster   1                                     1      2Gi      0               7s
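
The CPUS and MEMORY columns appear consistent with summing the container resource requests in the manifest: 500m + 500m = 1 CPU and 1Gi + 1Gi = 2Gi. The following is a minimal sketch of that arithmetic, not the operator's actual implementation. The helpers parse_cpu and parse_memory_gi are hypothetical and handle only the quantity suffixes needed here.

```python
from fractions import Fraction


def parse_cpu(quantity: str) -> Fraction:
    """Convert a CPU quantity such as "500m" or "2" to whole CPUs."""
    if quantity.endswith("m"):
        return Fraction(int(quantity[:-1]), 1000)
    return Fraction(quantity)


def parse_memory_gi(quantity: str) -> Fraction:
    """Convert a memory quantity with a Gi or Mi suffix to GiB."""
    if quantity.endswith("Gi"):
        return Fraction(quantity[:-2])
    if quantity.endswith("Mi"):
        return Fraction(quantity[:-2]) / 1024
    raise ValueError(f"unsupported memory suffix in {quantity!r}")


# (cpu request, memory request, replicas) for each group in the manifest.
groups = [
    ("500m", "1Gi", 1),  # head
    ("500m", "1Gi", 1),  # workers
]

total_cpu = sum(parse_cpu(cpu) * n for cpu, _, n in groups)
total_mem = sum(parse_memory_gi(mem) * n for _, mem, n in groups)
print(f"CPUS={total_cpu} MEMORY={total_mem}Gi")  # CPUS=1 MEMORY=2Gi
```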

Verify the Cluster

Once the head pod is running, look up its name:

kubectl get pods | grep head

Expected output:

test-raycluster-head-f8thn   1/1     Running   0          104s
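
Because the pod name suffix is generated, it differs on every cluster. A small sketch of how the name could be picked out of captured kubectl get pods output follows. The helper ready_head_pods is hypothetical, and matching on "-head-" in the name is an assumption based on the pod name shown above.

```python
def ready_head_pods(pod_lines: str) -> list:
    """Return names of head pods that are Running with all containers ready."""
    names = []
    for line in pod_lines.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue
        name, ready, status = fields[:3]
        # READY has the form "ready/total", for example "1/1".
        ready_count, _, total = ready.partition("/")
        if "-head-" in name and status == "Running" and ready_count == total:
            names.append(name)
    return names


sample = "test-raycluster-head-f8thn   1/1     Running   0          104s"
print(ready_head_pods(sample))  # ['test-raycluster-head-f8thn']
```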

Connect to the head pod, substituting the pod name from your own cluster, and query the cluster resources:

kubectl exec -it test-raycluster-head-f8thn -- python -c "
import ray
ray.init(address='auto')
print(ray.cluster_resources())
"

Expected output:

2026-05-08 12:51:07,269 INFO worker.py:1405 -- Using address 127.0.0.1:6379 set in the environment variable RAY_ADDRESS
2026-05-08 12:51:07,269 INFO worker.py:1540 -- Connecting to existing Ray cluster at address: 10.100.247.245:6379...
2026-05-08 12:51:07,281 INFO worker.py:1715 -- Connected to Ray cluster. View the dashboard at http://10.100.247.245:8265
{'memory': 21275978548.0, 'object_store_memory': 9805600358.0, 'node:10.100.30.224': 1.0, 'CPU': 2.0, 'node:__internal_head__': 1.0, 'node:10.100.247.245': 1.0}

The output shows two node entries and 2 CPUs in total across the head and worker nodes, which confirms that both nodes joined the cluster and Ray is ready to accept workloads.
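
To check these values in a script rather than by eye, the dictionary returned by ray.cluster_resources() can be summarized. The sketch below runs on the sample dictionary from the output above, so it does not need a live cluster. The helper summarize_resources is hypothetical, and treating node:__internal_head__ as a marker rather than a separate node is an assumption based on that output.

```python
def summarize_resources(resources: dict) -> dict:
    """Count node entries and total CPUs in a cluster_resources() dict."""
    node_keys = [
        key for key in resources
        # Skip the head marker so that it is not counted as an extra node.
        if key.startswith("node:") and key != "node:__internal_head__"
    ]
    return {"nodes": len(node_keys), "cpus": resources.get("CPU", 0.0)}


# Sample copied from the expected output above.
sample = {
    "memory": 21275978548.0,
    "object_store_memory": 9805600358.0,
    "node:10.100.30.224": 1.0,
    "CPU": 2.0,
    "node:__internal_head__": 1.0,
    "node:10.100.247.245": 1.0,
}

summary = summarize_resources(sample)
print(summary["nodes"], summary["cpus"])  # 2 2.0
```

On a live cluster, the same function could be called with the result of ray.cluster_resources() after ray.init(address='auto').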