Configure Time Window for Exclusive GPU Access
This guide explains how to configure the exclusive access time window (TQ)
for the Kiwi Scheduler.
The Kiwi Scheduler implements exclusive access to prevent thrashing when the
combined working set sizes (GPU memory) of the applications exceed the
physical GPU memory capacity. Each application that needs to do GPU work gets
exclusive access to the GPU for TQ seconds at a time. If the GPU bursts of two
or more applications overlap, Kiwi assigns exclusive access to the GPU in a
round-robin manner.
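The rotation described above can be illustrated with a small shell sketch. It is not Kiwi's implementation: the application names, the number of windows, and the assumption that every application always has GPU work pending are hypothetical, chosen only to show how exclusive windows of TQ seconds alternate.

```shell
#!/bin/sh
# Illustrative only: simulate round-robin exclusive access windows.
# Assumption (hypothetical): every application always has GPU work pending.

# Usage: round_robin_schedule TQ WINDOWS APP...
round_robin_schedule() {
    tq=$1; windows=$2; shift 2
    napps=$#
    start=0
    i=0
    while [ "$i" -lt "$windows" ]; do
        # Pick the application whose turn it is (positional parameter idx).
        idx=$(( i % napps + 1 ))
        eval "app=\${$idx}"
        end=$(( start + tq ))
        echo "t=${start}s-${end}s: ${app} has exclusive GPU access"
        start=$end
        i=$(( i + 1 ))
    done
}

# Three hypothetical applications sharing one GPU with the default TQ of 30.
round_robin_schedule 30 4 app-a app-b app-c
```

With three applications and four windows, the fourth window goes back to the first application, which by then has waited two full time quanta for its next turn.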
Note
A large TQ, for example 100 seconds, sacrifices interactivity in favor of
maximum throughput: other applications wait longer for their turn, but Kiwi
switches between applications less often. Each switch is costly because Kiwi
must fetch the incoming application’s data to the GPU and evict the previous
application’s data.
Important
By default, the Kiwi Scheduler’s time quantum (TQ) is 30 seconds. We do not recommend setting it to 5 seconds or less, because each application would then spend most of its exclusive access window waiting for its data to become resident on the GPU.
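To make the trade-off concrete, the following sketch estimates how much of each exclusive window remains for actual GPU work once the application's data has been brought in. The 4-second swap cost is a hypothetical figure used for illustration only; the real cost depends on the workload and hardware, and nothing here is measured from Kiwi.

```shell
#!/bin/sh
# Illustrative only: estimate the share of each window left for GPU work.
# The swap cost passed in is a hypothetical, workload-dependent number.

# Usage: useful_percent TQ SWAP_SECONDS
useful_percent() {
    tq=$1; swap=$2
    if [ "$swap" -ge "$tq" ]; then
        # The whole window is spent moving data.
        echo 0
    else
        # Integer percentage, rounded down.
        echo $(( (tq - swap) * 100 / tq ))
    fi
}

for q in 5 30 100; do
    echo "TQ=${q}s: about $(useful_percent "$q" 4)% of the window is left for GPU work"
done
```

Under the assumed 4-second cost, this reports 20% for a 5-second TQ, 86% for the default 30 seconds, and 96% for 100 seconds, which is why very small values are discouraged.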
Important
Changes to the configuration of a Kiwi Scheduler instance affect only that particular instance and are not persisted across Pod restarts.
To make the changes persistent and apply them to all Kiwi Scheduler instances, you must redeploy Kiwi with the desired configuration.
What You’ll Need
- An existing Kiwi deployment on your Kubernetes cluster.
Procedure
1. Find the node for which you wish to change the time window.
Fast Forward
If you already know the node for which you wish to change the time window, specify its name and skip the rest of this step:
    root@rok-tools:~# KIWI_NODENAME=<NODE_NAME>
Replace <NODE_NAME> with the name of the desired node, for example:
    root@rok-tools:~# KIWI_NODENAME=ip-192-168-109-143.eu-central-1.compute.internal
Proceed to step 2.
Specify the name of the application Pod for which you want to change the time window:
    root@rok-tools:~# KIWI_POD_NAME=<POD_NAME>
Replace <POD_NAME> with the name of the Pod you want to configure, for example:
    root@rok-tools:~# KIWI_POD_NAME=kiwi-pod
Specify the namespace of the application Pod:
    root@rok-tools:~# KIWI_POD_NAMESPACE=<POD_NAMESPACE>
Replace <POD_NAMESPACE> with the namespace of the Pod you want to configure, for example:
    root@rok-tools:~# KIWI_POD_NAMESPACE=kiwi-pod-namespace
Find the node where the application Pod is running:
    root@rok-tools:~# KIWI_NODENAME=$(kubectl get pod ${KIWI_POD_NAME?} \
    >     -n ${KIWI_POD_NAMESPACE?} -o json \
    >     | jq -r '.spec.nodeName') \
    >     && echo ${KIWI_NODENAME?}
    ip-192-168-109-143.eu-central-1.compute.internal
2. Get the Kiwi Scheduler’s Pod name for the specified node:
    root@rok-tools:~# KIWI_SCHEDULER_POD_NAME=$(kubectl get pod \
    >     -n kiwi-system -l name=kiwi-scheduler -o json \
    >     | jq -r '.items[] | select(.spec.nodeName == "'$KIWI_NODENAME'") | .metadata.name') \
    >     && echo ${KIWI_SCHEDULER_POD_NAME?}
    kiwi-scheduler-4pk55
3. Change the Kiwi Scheduler’s TQ:
    root@rok-tools:~# kubectl exec -it ${KIWI_SCHEDULER_POD_NAME?} \
    >     -n kiwi-system -- kiwictl -T <NEW_TQ>
Replace <NEW_TQ> with the desired time quantum for the Kiwi Scheduler, for example:
    root@rok-tools:~# kubectl exec -it ${KIWI_SCHEDULER_POD_NAME?} \
    >     -n kiwi-system -- kiwictl -T 20
    [INFO] Successfully set the scheduler TQ to 20 seconds.
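Before running the command above, it can help to sanity-check the value you are about to pass. The helper below is a minimal sketch, not part of Kiwi: the function name check_tq is hypothetical, and it assumes TQ is given as a whole number of seconds. It rejects values that are not positive integers and warns about values of 5 seconds or less, which this guide advises against.

```shell
#!/bin/sh
# Illustrative only: sanity-check a TQ value before passing it to kiwictl.
# Assumption: TQ is given as a whole number of seconds.

# Usage: check_tq VALUE
# Returns 1 for values that are not positive integers; warns (but still
# returns 0) for values of 5 seconds or less.
check_tq() {
    case $1 in
        ''|*[!0-9]*)
            echo "error: TQ must be a positive integer, got '$1'" >&2
            return 1
            ;;
    esac
    if [ "$1" -eq 0 ]; then
        echo "error: TQ must be greater than zero" >&2
        return 1
    fi
    if [ "$1" -le 5 ]; then
        echo "warning: a TQ of 5 seconds or less is not recommended" >&2
    fi
    return 0
}

check_tq 20 && echo "TQ 20 passes the check"
```

One possible use is to chain the check with the documented command, for example check_tq 20 && kubectl exec -it ${KIWI_SCHEDULER_POD_NAME?} -n kiwi-system -- kiwictl -T 20, so that the scheduler is only contacted when the value is well formed.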
Verify
Inspect the Kiwi Scheduler logs and verify that TQ matches the value you set:
    root@rok-tools:~# kubectl logs ${KIWI_SCHEDULER_POD_NAME?} -n kiwi-system
    [INFO] New TQ = 20
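If the scheduler logs are long, the check can be narrowed to the most recent TQ change. The sketch below assumes that every change is logged in the "[INFO] New TQ = <seconds>" form shown above; the function name latest_tq is hypothetical, and the sample input stands in for real scheduler logs.

```shell
#!/bin/sh
# Illustrative only: extract the most recent TQ value from scheduler logs.
# Assumes the log line format "[INFO] New TQ = <seconds>" shown in this guide.

# Usage: latest_tq < logfile   (or pipe log output into it)
latest_tq() {
    # Print the number from every matching line, then keep the last one.
    sed -n 's/.*New TQ = \([0-9][0-9]*\).*/\1/p' | tail -n 1
}

# Sample input standing in for real scheduler logs.
printf '%s\n' '[INFO] New TQ = 30' '[INFO] New TQ = 20' | latest_tq
```

Against a live cluster, the same function could read from kubectl logs ${KIWI_SCHEDULER_POD_NAME?} -n kiwi-system piped into latest_tq, and the printed number compared with the value you set.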
What’s Next
Check out the rest of the Kiwi operations you can perform.