Kubeflow
- Kubeflow Charmers | bundle
- Cloud
Channel | Revision | Published |
---|---|---|
latest/candidate | 294 | 24 Jan 2022 |
latest/beta | 430 | 30 Aug 2024 |
latest/edge | 423 | 26 Jul 2024 |
1.10/stable | 436 | 07 Apr 2025 |
1.10/candidate | 434 | 02 Apr 2025 |
1.10/beta | 433 | 24 Mar 2025 |
1.9/stable | 432 | 03 Dec 2024 |
1.9/beta | 420 | 19 Jul 2024 |
1.9/edge | 431 | 03 Dec 2024 |
1.8/stable | 414 | 22 Nov 2023 |
1.8/beta | 411 | 22 Nov 2023 |
1.8/edge | 413 | 22 Nov 2023 |
1.7/stable | 409 | 27 Oct 2023 |
1.7/beta | 408 | 27 Oct 2023 |
1.7/edge | 407 | 27 Oct 2023 |
juju deploy kubeflow --channel 1.10/stable
This guide discusses Kubernetes (K8s) scheduling patterns for Charmed Kubeflow (CKF) workloads.
Scheduling the Pods of CKF workloads on K8s nodes with specialised hardware requires specific configuration, which varies with the use case and the working environment.
The most common scheduling patterns are the following:
- Schedule on GPU nodes.
- Schedule on a specific node pool.
- Schedule on Tainted nodes.
Schedule on GPU nodes
In most production scenarios, Pods are scheduled on GPU nodes using one or a combination of the following methods:
- Requesting GPUs through the Pod's resource requests and limits.
- Configuring Tolerations so that Pods can be scheduled on Tainted GPU nodes.
- Configuring Affinities so that Pods are scheduled on nodes with specialised hardware.
See Use NVIDIA GPUs for more details on how to leverage NVIDIA GPU resources in your CKF deployment.
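As a minimal sketch of the first method, the snippet below builds a Pod manifest that requests a GPU through its container resource limits. The manifest is written as a Python dictionary and printed as JSON, a format kubectl accepts alongside YAML. The resource name nvidia.com/gpu assumes the NVIDIA device plugin is running in the cluster; the function name, Pod name and image are hypothetical placeholders, not part of CKF.

```python
import json


def gpu_pod(name: str, image: str, gpus: int = 1) -> dict:
    """Build a Pod manifest whose single container asks for `gpus` GPUs."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "containers": [
                {
                    "name": "main",
                    "image": image,
                    "resources": {
                        # GPUs are extended resources and are set under limits.
                        # "nvidia.com/gpu" assumes the NVIDIA device plugin.
                        "limits": {"nvidia.com/gpu": gpus},
                    },
                }
            ],
        },
    }


if __name__ == "__main__":
    # Hypothetical name and image, for illustration only.
    manifest = gpu_pod("gpu-example", "example.com/trainer:latest")
    print(json.dumps(manifest, indent=2))
```

With this limit in place, the scheduler only considers nodes that advertise at least the requested number of GPUs.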
Schedule on a specific node pool
Configuring resources in the workload Pod allows Kubernetes to schedule it on a node with the required hardware. However, there may be additional scheduling requirements beyond hardware needs.
For example, a workload might require GPU resources and also need to run on a development node rather than a production one, within a specific availability zone or data center.
This is achieved by configuring the nodeSelector or node Affinities of the underlying workload Pod, which restrict the set of nodes the Pod can be scheduled on.
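The two mechanisms can be sketched as follows, again using Python dictionaries that mirror the Pod spec. The helper names and the pool label are hypothetical; topology.kubernetes.io/zone is the well-known Kubernetes zone label, but the zone value shown is only an example and depends on the cluster.

```python
import json


def pin_to_pool(pod_spec: dict, labels: dict) -> dict:
    """Return a copy of pod_spec with a nodeSelector requiring every label."""
    spec = dict(pod_spec)
    spec["nodeSelector"] = dict(labels)
    return spec


def require_zone(pod_spec: dict, zones: list) -> dict:
    """Return a copy of pod_spec with a required node Affinity on the zone."""
    spec = dict(pod_spec)
    spec["affinity"] = {
        "nodeAffinity": {
            # Hard requirement: the Pod stays Pending if no node matches.
            "requiredDuringSchedulingIgnoredDuringExecution": {
                "nodeSelectorTerms": [
                    {
                        "matchExpressions": [
                            {
                                "key": "topology.kubernetes.io/zone",
                                "operator": "In",
                                "values": list(zones),
                            }
                        ]
                    }
                ]
            }
        }
    }
    return spec


if __name__ == "__main__":
    base = {"containers": [{"name": "main", "image": "example.com/trainer:latest"}]}
    # "pool": "dev" is a hypothetical node label; "eu-west-1a" is an example zone.
    spec = require_zone(pin_to_pool(base, {"pool": "dev"}), ["eu-west-1a"])
    print(json.dumps(spec, indent=2))
```

A nodeSelector is the simpler choice when an exact label match is enough. Node Affinities allow richer expressions, such as matching any one of several zones.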
Schedule on Tainted nodes
Nodes with specialised hardware, such as GPUs, are expensive to run. As a result, a common pattern is to place them in autoscaling node pools, so that they are scaled down when not in use.
To support this setup, administrators often apply Taints to these nodes, ensuring that only Pods configured with the appropriate Tolerations can be scheduled on them. See K8s use cases for more details.
In this scenario, CKF workload Pods must also be configured with the necessary Tolerations to be scheduled on the specialised nodes.
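A Toleration that matches such a Taint might look like the sketch below. The Taint key and value (dedicated=gpu) are hypothetical and must match whatever the administrator applied, for example with kubectl taint nodes <node-name> dedicated=gpu:NoSchedule. The helper name is also illustrative.

```python
import json


def tolerate(pod_spec: dict, key: str, value: str, effect: str = "NoSchedule") -> dict:
    """Return a copy of pod_spec with a Toleration for the Taint key=value:effect."""
    spec = dict(pod_spec)
    tolerations = list(spec.get("tolerations", []))
    tolerations.append(
        {
            "key": key,
            "operator": "Equal",  # the Taint value must match exactly
            "value": value,
            "effect": effect,
        }
    )
    spec["tolerations"] = tolerations
    return spec


if __name__ == "__main__":
    base = {"containers": [{"name": "main", "image": "example.com/trainer:latest"}]}
    # Hypothetical Taint key and value; use the ones set on your nodes.
    print(json.dumps(tolerate(base, "dedicated", "gpu"), indent=2))
```

A Toleration only permits scheduling on the Tainted nodes; it does not force the Pod onto them. To guarantee placement, it is usually combined with a nodeSelector or node Affinity.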
See also
- Learn how to configure advanced scheduling for specific use cases.