Gang Scheduling
Gang scheduling ensures that all pods in a workload are scheduled together: either all pods start at once or none start at all. This matters for distributed workloads such as multi-GPU training jobs, where partial scheduling would waste resources and stall progress. This page uses Kueue, a Kubernetes-native job queueing controller, to provide the all-or-nothing semantics. Kueue installs as a workload inside the cluster and isn't tied to anything Breqwatr-specific.
Prerequisites
Before proceeding, make sure you have a running Kubernetes cluster. See one of the following guides depending on your setup:
Install Kueue
Install Kueue using Helm:
helm install kueue oci://registry.k8s.io/kueue/charts/kueue \
  --version=0.17.2 \
  --namespace kueue-system \
  --create-namespace \
  --wait --timeout 300s
Verify Kueue is running by listing the pods in the kueue-system namespace; the Kueue controller manager pod should reach the Running state.
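A minimal sketch of that check, assuming kubectl is pointed at the cluster and the chart was installed into kueue-system as above. The helper name wait_for_kueue is hypothetical. It waits for every Deployment in the namespace instead of naming one, since the exact Deployment name depends on the Helm release, and it reuses the 300s timeout from the install command.

```shell
# Hypothetical helper: block until every Deployment in kueue-system reports
# the Available condition, then list the pods in that namespace.
# Assumes kubectl is configured for the target cluster.
wait_for_kueue() {
  kubectl wait deployment --all \
    --namespace kueue-system \
    --for=condition=Available \
    --timeout=300s &&
  kubectl get pods --namespace kueue-system
}

# Usage: wait_for_kueue
```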
Configure Queues and Resource Flavors
Create a ClusterQueue, ResourceFlavor, and LocalQueue. The ClusterQueue defines the cluster-wide resource quotas, the ResourceFlavor describes the available hardware, and the LocalQueue is the namespace-scoped entry point that workloads reference:
apiVersion: kueue.x-k8s.io/v1beta2
kind: ClusterQueue
metadata:
  name: gpu-cluster-queue
spec:
  namespaceSelector: {}
  resourceGroups:
  - coveredResources: ["cpu", "memory"]
    flavors:
    - name: default-flavor
      resources:
      - name: cpu
        nominalQuota: "10"
      - name: memory
        nominalQuota: 10Gi
---
apiVersion: kueue.x-k8s.io/v1beta2
kind: ResourceFlavor
metadata:
  name: default-flavor
---
apiVersion: kueue.x-k8s.io/v1beta2
kind: LocalQueue
metadata:
  name: gpu-local-queue
  namespace: default
spec:
  clusterQueue: gpu-cluster-queue
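Verify that all three resources were created: the ClusterQueue gpu-cluster-queue, the ResourceFlavor default-flavor, and the LocalQueue gpu-local-queue in the default namespace should each be listed when queried.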
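The apply and verification steps can be sketched as follows, assuming the manifest above was saved to a file such as queues.yaml (a hypothetical filename). The helper name is also hypothetical. ClusterQueue and ResourceFlavor are cluster-scoped, so only the LocalQueue lookup needs a namespace.

```shell
# Hypothetical helper: apply the manifest given as the first argument
# (for example queues.yaml), then fetch each object by the names used on
# this page. Stops at the first command that fails.
apply_and_check_queues() {
  kubectl apply -f "$1" &&
  kubectl get clusterqueue gpu-cluster-queue &&
  kubectl get resourceflavor default-flavor &&
  kubectl get localqueue gpu-local-queue --namespace default
}

# Usage: apply_and_check_queues queues.yaml
```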
Submit a Gang-Scheduled Job
Submit a test job with parallelism: 2 and completions: 2. Kueue will admit both pods together or hold them both until capacity is available. The job references the LocalQueue via the kueue.x-k8s.io/queue-name label:
apiVersion: batch/v1
kind: Job
metadata:
  name: gang-schedule-test
  namespace: default
  labels:
    kueue.x-k8s.io/queue-name: gpu-local-queue
spec:
  completions: 2
  parallelism: 2
  template:
    spec:
      containers:
      - name: worker
        image: busybox
        command: ["sh", "-c", "echo Gang scheduled worker && sleep 30"]
        resources:
          requests:
            cpu: "100m"
            memory: "128Mi"
      restartPolicy: Never
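As a rough worked check of why this job fits, the sketch below totals the job's requests (parallelism times the per-pod request) and compares them with the nominal quota from the ClusterQueue defined earlier. The numbers are copied from the two manifests. The function name fits_quota is made up for illustration, and the check ignores quota already held by other workloads, which Kueue's real admission would take into account.

```shell
# Hypothetical helper: does `pods` x `per_pod` fit within `quota`?
# All three arguments must be integers expressed in the same unit.
fits_quota() {
  pods=$1 per_pod=$2 quota=$3
  total=$((pods * per_pod))
  if [ "$total" -le "$quota" ]; then
    echo "fits: $total of $quota"
  else
    echo "exceeds: $total of $quota"
  fi
}

# CPU in millicores: 2 pods x 100m against a quota of 10 CPUs (10000m).
fits_quota 2 100 10000     # prints "fits: 200 of 10000"
# Memory in Mi: 2 pods x 128Mi against a quota of 10Gi (10240Mi).
fits_quota 2 128 10240     # prints "fits: 256 of 10240"
```

If parallelism were raised to 200, the CPU total would be 20000m, which exceeds the 10000m quota. In that case the job would be expected to stay suspended as a whole instead of starting only some of its pods.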
Verify both pods were scheduled and completed together:
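A command along these lines lists the job's pods. It selects on the batch.kubernetes.io/job-name label that the Job controller sets on each pod; the helper name job_pods is hypothetical.

```shell
# Hypothetical helper: list the pods that belong to the Job named in the
# first argument, in the default namespace, selected by the job-name label.
job_pods() {
  kubectl get pods --namespace default \
    --selector "batch.kubernetes.io/job-name=$1"
}

# Usage: job_pods gang-schedule-test
```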
Expected output:
NAME                       READY   STATUS      RESTARTS   AGE
gang-schedule-test-8sb7w   0/1     Completed   0          97s
gang-schedule-test-fhrxt   0/1     Completed   0          97s
Inspect the job events to see Kueue's scheduling flow:
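One way to do this is to describe the Job, since kubectl describe prints the object's recent events at the end of its output. The helper name describe_job is hypothetical.

```shell
# Hypothetical helper: describe the Job named in the first argument in the
# default namespace; the Events section appears at the end of the output.
describe_job() {
  kubectl describe job "$1" --namespace default
}

# Usage: describe_job gang-schedule-test
```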
Expected output:
Name:             gang-schedule-test
Namespace:        default
Selector:         batch.kubernetes.io/controller-uid=2780b857-a84a-4d71-958f-bd78979c604e
Labels:           kueue.x-k8s.io/queue-name=gpu-local-queue
Annotations:      <none>
Parallelism:      2
Completions:      2
Completion Mode:  NonIndexed
Suspend:          false
Backoff Limit:    6
Start Time:       Tue, 05 May 2026 16:14:09 -0400
Completed At:     Tue, 05 May 2026 16:14:47 -0400
Duration:         38s
Pods Statuses:    0 Active (0 Ready) / 2 Succeeded / 0 Failed
Pod Template:
  Labels:       batch.kubernetes.io/controller-uid=2780b857-a84a-4d71-958f-bd78979c604e
                batch.kubernetes.io/job-name=gang-schedule-test
                controller-uid=2780b857-a84a-4d71-958f-bd78979c604e
                job-name=gang-schedule-test
                kueue.x-k8s.io/cluster-queue-name=gpu-cluster-queue
                kueue.x-k8s.io/local-queue-name=gpu-local-queue
                kueue.x-k8s.io/podset=main
  Annotations:  kueue.x-k8s.io/workload: job-gang-schedule-test-33e93
  Containers:
   worker:
    Image:      busybox
    Port:       <none>
    Host Port:  <none>
    Command:
      sh
      -c
      echo Gang scheduled worker && sleep 30
    Requests:
      cpu:         100m
      memory:      128Mi
    Environment:   <none>
    Mounts:        <none>
  Volumes:         <none>
  Node-Selectors:  <none>
  Tolerations:     <none>
Events:
  Type    Reason            Age                From                        Message
  ----    ------            ----               ----                        -------
  Normal  Suspended         2m5s               job-controller              Job suspended
  Normal  CreatedWorkload   2m5s               batch/job-kueue-controller  Created Workload: default/job-gang-schedule-test-33e93
  Normal  Started           2m5s               batch/job-kueue-controller  Admitted by clusterQueue gpu-cluster-queue
  Normal  SuccessfulCreate  2m5s               job-controller              Created pod: gang-schedule-test-fhrxt
  Normal  SuccessfulCreate  2m5s               job-controller              Created pod: gang-schedule-test-8sb7w
  Normal  Resumed           2m5s               job-controller              Job resumed
  Normal  Completed         87s                job-controller              Job completed
  Normal  FinishedWorkload  87s (x3 over 87s)  batch/job-kueue-controller  Workload 'default/job-gang-schedule-test-33e93' is declared finished
The events confirm the gang scheduling behaviour: the job started out suspended, Kueue created a Workload for it and admitted it as a whole through gpu-cluster-queue, both pods were created together as the job resumed, and the job completed successfully.