Introduction
In this repo, we will demostrate how to use quota allocation with Kueue with preemption.
Overview
In this example, there are 2 teams that work in their own namespace:
- Team A and B belongs to the same cohort
- Both teams share a quota
- Team A has access to GPU while team B does not
- Team A has higher priority and can prempt others
Kueue Configuration
There are 2 ResourceFlavor
that manages the CPU/Memory and GPU resources. The GPU ResourceFlavor
tolerates nodes that have been tainted.
Both teams have their invididual cluster queue that is associated with their respective namespace.
Name | CPU | Memory (GB) | GPU |
---|---|---|---|
Team A cq | 0 | 0 | 4 |
Team B cq | 0 | 0 | 0 |
Shared cq | 10 | 64 | 0 |
A local queue is defined in their namespace to associate the cluster queue. E.g.
apiVersion: kueue.x-k8s.io/v1beta1
kind: LocalQueue
metadata:
name: local-queue
namespace: team-a
spec:
clusterQueue: team-a-cq
When a Ray cluster is defined, it is submitted to the local queue with the associated priority.
apiVersion: ray.io/v1
kind: RayCluster
metadata:
labels:
kueue.x-k8s.io/queue-name: local-queue
kueue.x-k8s.io/priority-class: dev-priority
Ray cluster configuration
The shared quota is only up to 10 CPU for both teams.
Name | CPU | Memory (GB) | GPU |
---|---|---|---|
Team A | 10 | 24 | 4 |
Team B | 6 | 16 | 0 |
Premption
Team A cluster queue has preemption defined that can borrowWithinCohort
of a lower priority which Team B belongs to.
apiVersion: kueue.x-k8s.io/v1beta1
kind: ClusterQueue
metadata:
name: team-a-cq
spec:
preemption:
reclaimWithinCohort: Any
borrowWithinCohort:
policy: LowerPriority
maxPriorityThreshold: 100
withinClusterQueue: Never
Team A will preempt team B because it has insufficient resources to run.
Setting Up the Demo
-
Install OpenShift AI Operator
-
Ensure there is at least 1 worker node that has a 4 GPUs. On AWS, this can be a p3.8xlarge instance.
-
Taint the GPU node
oc adm taint nodes <gpu-node> nvidia.com/gpu=Exists:NoSchedule
-
Git clone the repo
git clone https://github.com/opendatahub-io-contrib/ai-on-openshift cd ai-on-openshift/docs/odh-rhoai/kueue-preemption
-
Run the makefile target to setup the example. This will setup 2 namespaces: team-a and team-b.
make setup-kueue-examples
To teardown the example, you can use:
make teardown-kueue-preemption
Warning
The setup script will delete all clusterqueues and resourceflavors in the cluster.
Running the example
-
Create a ray cluster for team B. Wait for the cluster to be running.
oc create -f team-b-ray-cluster-dev.yaml
$ oc get rayclusters -A NAMESPACE NAME DESIRED WORKERS AVAILABLE WORKERS CPUS MEMORY GPUS STATUS AGE team-b raycluster-dev 2 2 6 16G 0 ready 70s $ oc get po -n team-b NAME READY STATUS RESTARTS AGE raycluster-dev-head-zwfd8 2/2 Running 0 45s raycluster-dev-worker-small-group-test-4c85h 1/1 Running 0 43s raycluster-dev-worker-small-group-test-5k9j5 1/1 Running 0 43s
-
Create a Ray cluster for team A.
oc create -f team-a-ray-cluster-prod.yaml
-
Observe team B cluster is suspended and team A cluster is running because of preemption. This may take a few seconds to happen.
$ oc get rayclusters -A NAMESPACE NAME DESIRED WORKERS AVAILABLE WORKERS CPUS MEMORY GPUS STATUS AGE team-a raycluster-prod 2 2 10 24G 4 ready 75s team-b raycluster-dev 2 6 16G 0 suspended 3m46s