KubeRay
KubeRay is a Kubernetes operator that deploys and manages Ray applications, including distributed training and inference jobs, on Kubernetes through custom resources such as RayCluster, RayJob, and RayService.
Prerequisites
Before proceeding, make sure you have a running Kubernetes cluster. See one of the following guides depending on your setup:
Install the KubeRay Operator
Add the KubeRay Helm repository and install the operator:
helm repo add kuberay https://ray-project.github.io/kuberay-helm/
helm repo update
helm install kuberay-operator kuberay/kuberay-operator \
  --namespace kuberay-system \
  --create-namespace
Expected output:
NAME: kuberay-operator
LAST DEPLOYED: Fri May 8 15:42:32 2026
NAMESPACE: kuberay-system
STATUS: deployed
REVISION: 1
DESCRIPTION: Install complete
TEST SUITE: None
Verify the operator pod is running and the Ray CRDs were installed:
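The verification step can be sketched as follows. This is a minimal example rather than the guide's original commands: the helper name missing_ray_crds is hypothetical, and the list of expected CRDs is taken from the expected output in this section.

```shell
# CRD names taken from this guide's expected output.
EXPECTED_RAY_CRDS="rayclusters.ray.io raycronjobs.ray.io rayjobs.ray.io rayservices.ray.io"

# Hypothetical helper: reads `kubectl get crd` output on stdin and prints
# every expected Ray CRD that is absent. Returns 0 if none are missing.
missing_ray_crds() {
  installed=$(awk '{print $1}')   # first column holds the CRD name
  status=0
  for crd in $EXPECTED_RAY_CRDS; do
    if ! printf '%s\n' "$installed" | grep -qxF "$crd"; then
      echo "missing: $crd"
      status=1
    fi
  done
  return $status
}

# Usage (requires access to the cluster):
#   kubectl get pods --namespace kuberay-system   # operator pod should be Running
#   kubectl get crd | missing_ray_crds            # prints nothing when all CRDs exist
```

If the helper prints nothing and exits with status 0, all four CRDs are registered.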
Expected output for the CRD check:
rayclusters.ray.io 2026-05-08T19:40:24Z
raycronjobs.ray.io 2026-05-08T19:40:24Z
rayjobs.ray.io 2026-05-08T19:40:25Z
rayservices.ray.io 2026-05-08T19:40:26Z
Create a RayCluster
Create a RayCluster with one head node and one worker node:
apiVersion: ray.io/v1
kind: RayCluster
metadata:
  name: test-raycluster
spec:
  rayVersion: "2.9.0"
  headGroupSpec:
    serviceType: ClusterIP
    rayStartParams:
      dashboard-host: "0.0.0.0"
    template:
      spec:
        containers:
          - name: ray-head
            image: rayproject/ray:2.9.0
            resources:
              requests:
                cpu: "500m"
                memory: "1Gi"
  workerGroupSpecs:
    - groupName: workers
      replicas: 1
      minReplicas: 1
      maxReplicas: 1
      template:
        spec:
          containers:
            - name: ray-worker
              image: rayproject/ray:2.9.0
              resources:
                requests:
                  cpu: "500m"
                  memory: "1Gi"
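One way to submit the manifest is sketched below. It assumes the YAML above was saved as raycluster.yaml, which is a filename chosen for illustration, and the function name create_raycluster is likewise hypothetical.

```shell
# Hypothetical helper: applies the manifest, then confirms the RayCluster
# resource exists. The default filename raycluster.yaml is an assumption.
create_raycluster() {
  manifest="${1:-raycluster.yaml}"
  kubectl apply -f "$manifest" &&
    kubectl get raycluster test-raycluster
}

# Usage (requires access to the cluster):
#   create_raycluster raycluster.yaml
```

No namespace flag is passed, so the cluster is created in the current kubectl context's namespace.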
Watch the cluster come up:
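A possible way to wait for the pods is sketched here. It relies on the ray.io/cluster label that KubeRay attaches to the pods it creates. The function name wait_for_raycluster and the 300s default timeout are assumptions made for this example.

```shell
# Hypothetical helper: blocks until every pod of the named RayCluster is
# Ready (or the timeout expires), then lists the pods.
wait_for_raycluster() {
  cluster="${1:-test-raycluster}"
  timeout="${2:-300s}"            # assumed default, adjust as needed
  kubectl wait pod \
    --selector "ray.io/cluster=${cluster}" \
    --for=condition=Ready \
    --timeout="${timeout}" &&
    kubectl get pods --selector "ray.io/cluster=${cluster}"
}

# Usage (requires access to the cluster):
#   wait_for_raycluster test-raycluster 300s
# For a live view instead of blocking:
#   kubectl get pods --selector ray.io/cluster=test-raycluster --watch
```

Depending on the kubectl version, kubectl wait may return an error right away if the pods have not been created yet; rerunning it a few seconds later is usually enough.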
Expected output once ready:
Verify the Cluster
Once the head pod is running, connect to it and query the cluster resources. Replace test-raycluster-head-f8thn with the name of your own head pod:
kubectl exec -it test-raycluster-head-f8thn -- python -c "
import ray
ray.init(address='auto')
print(ray.cluster_resources())
"
Expected output:
2026-05-08 12:51:07,269 INFO worker.py:1405 -- Using address 127.0.0.1:6379 set in the environment variable RAY_ADDRESS
2026-05-08 12:51:07,269 INFO worker.py:1540 -- Connecting to existing Ray cluster at address: 10.100.247.245:6379...
2026-05-08 12:51:07,281 INFO worker.py:1715 -- Connected to Ray cluster. View the dashboard at http://10.100.247.245:8265
{'memory': 21275978548.0, 'object_store_memory': 9805600358.0, 'node:10.100.30.224': 1.0, 'CPU': 2.0, 'node:__internal_head__': 1.0, 'node:10.100.247.245': 1.0}
The output lists 2 CPUs and two node addresses (10.100.247.245 for the head and 10.100.30.224 for the worker), which confirms that both nodes joined the cluster and that Ray is ready to accept workloads.
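The head pod name used in the exec command (test-raycluster-head-f8thn) is specific to one deployment. The sketch below looks the name up from the ray.io/cluster and ray.io/node-type labels that KubeRay sets on its pods. The function name head_pod_name is hypothetical.

```shell
# Hypothetical helper: prints the name of the head pod for a RayCluster.
# kubectl reports an error if no head pod matches the selector.
head_pod_name() {
  cluster="${1:-test-raycluster}"
  kubectl get pods \
    --selector "ray.io/cluster=${cluster},ray.io/node-type=head" \
    --output jsonpath='{.items[0].metadata.name}'
}

# Usage (requires access to the cluster):
#   kubectl exec "$(head_pod_name test-raycluster)" -- \
#     python -c "import ray; ray.init(address='auto'); print(ray.cluster_resources())"
```

Resolving the name this way avoids copying it by hand each time the head pod is recreated.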