Kubeflow

  • Kubeflow Charmers | bundle
  • Cloud
Channel           Revision  Published
latest/candidate  294       24 Jan 2022
latest/beta       430       30 Aug 2024
latest/edge       423       26 Jul 2024
1.9/stable        432       03 Dec 2024
1.9/beta          420       19 Jul 2024
1.9/edge          431       03 Dec 2024
1.8/stable        414       22 Nov 2023
1.8/beta          411       22 Nov 2023
1.8/edge          413       22 Nov 2023
1.7/stable        409       27 Oct 2023
1.7/beta          408       27 Oct 2023
1.7/edge          407       27 Oct 2023
juju deploy kubeflow --channel 1.9/stable

This guide describes how to install Charmed Kubeflow (CKF) on NVIDIA DGX hardware. DGX systems are purpose-built hardware for enterprise AI use cases, featuring NVIDIA Tensor Core GPUs.

Requirements

  • An NVIDIA DGX hardware setup with the BIOS settings and bootloader configured, and with no NVIDIA drivers preinstalled.
  • kubectl.

Install MicroK8s

Install MicroK8s and enable required add-ons as follows:

sudo snap install microk8s --classic --channel 1.22
 
sudo microk8s enable dns:10.229.32.21 storage ingress registry rbac helm3 metallb:10.64.140.43-10.64.140.49,192.168.0.105-192.168.0.111
 
sudo usermod -a -G microk8s ubuntu
sudo chown -f -R ubuntu ~/.kube
newgrp microk8s

Edit /var/snap/microk8s/current/args/containerd-template.toml by adding the following Docker Hub registry credentials, replacing the username and password with your own:

[plugins."io.containerd.grpc.v1.cri".registry.configs]

[plugins."io.containerd.grpc.v1.cri".registry.configs."registry-1.docker.io".auth]
username = "afrikha"
password = "<>"
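The manual edit above could also be scripted. The following is a minimal sketch, not an official MicroK8s tool: `append_registry_auth` is a hypothetical helper that appends the same two tables to a file you name, using a username and password you pass in. It assumes the template does not already define the `registry.configs` table elsewhere, and that the credentials contain no double quotes or backslashes.

```shell
#!/usr/bin/env bash
# Sketch: append Docker Hub credentials to a containerd template file.
# append_registry_auth is a hypothetical helper, not part of MicroK8s.
# Usage: append_registry_auth <template-file> <username> <password>
append_registry_auth() {
    local template="$1" user="$2" pass="$3"

    # Do nothing if the auth table is already present, so reruns are safe.
    if grep -qF '"registry-1.docker.io".auth]' "$template" 2>/dev/null; then
        return 0
    fi

    # Append the same two tables shown in the guide, with the given credentials.
    {
        printf '\n[plugins."io.containerd.grpc.v1.cri".registry.configs]\n\n'
        printf '[plugins."io.containerd.grpc.v1.cri".registry.configs."registry-1.docker.io".auth]\n'
        printf 'username = "%s"\n' "$user"
        printf 'password = "%s"\n' "$pass"
    } >>"$template"
}
```

Because the template file is owned by root, the function would need to run from a root shell, with the template path from this guide as the first argument.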

Finally, restart MicroK8s:

microk8s.stop
microk8s.start
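After a restart, the cluster can take a moment to come back. As a hedged sketch, the hypothetical helper below blocks until MicroK8s reports that it is ready and then lists the nodes, which is also a convenient way to find the node name used later in this guide. It relies only on `microk8s status --wait-ready` and `microk8s kubectl get nodes`.

```shell
#!/usr/bin/env bash
# Sketch: block until MicroK8s is ready again, then list the nodes.
# wait_for_microk8s is a hypothetical helper name.
wait_for_microk8s() {
    # `microk8s status --wait-ready` returns once the cluster reports ready.
    microk8s status --wait-ready >/dev/null || return 1
    # Print the nodes so the node name and Ready status are visible.
    microk8s kubectl get nodes
}
```

Calling `wait_for_microk8s` before enabling further add-ons avoids running them against a cluster that is still starting.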

Enable GPU add-on

Install the required GPU operator as follows:

sudo microk8s.enable gpu
mkdir -p ~/.kube
microk8s config > ~/.kube/config

Check the GPU count for MicroK8s:

kubectl get nodes --show-labels | grep gpu.count
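The `grep` above prints the whole label line, which can be long. As a sketch, the hypothetical helper below reads the same `kubectl get nodes --show-labels` output on standard input and prints only the numeric value of the `nvidia.com/gpu.count` label. It assumes the labels are comma-separated, as `--show-labels` prints them, and it reports only the first matching node.

```shell
#!/usr/bin/env bash
# Sketch: print the value of the nvidia.com/gpu.count label.
# gpu_count_from_labels is a hypothetical helper name.
# Reads `kubectl get nodes --show-labels` output on stdin.
# Returns 1 and prints nothing if the label is absent.
gpu_count_from_labels() {
    local count
    # Split the comma-separated labels onto separate lines, then keep
    # the numeric value of the first gpu.count label found.
    count="$(tr ',' '\n' \
        | sed -n 's|^.*nvidia\.com/gpu\.count=\([0-9][0-9]*\)$|\1|p' \
        | head -n 1)"
    [ -n "$count" ] && printf '%s\n' "$count"
}
```

A typical invocation would be `kubectl get nodes --show-labels | gpu_count_from_labels`. Saving the result before configuring MIG makes the later comparison easier.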

Configure MIG

Configure Multi-Instance GPU (MIG) devices by running the following command, replacing blanka with the name of your node:

kubectl label nodes blanka nvidia.com/mig.config=all-1g.5gb --overwrite

Check the GPU count again to confirm it has increased:

kubectl get nodes --show-labels | grep gpu.count

If no nodes appear in the command output above, uninstall all GPU drivers from the K8s nodes and reinstall MicroK8s.
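The expected GPU count after configuring MIG can be estimated in advance. The sketch below assumes A100 40GB GPUs, where the all-1g.5gb profile splits each physical GPU into 7 instances, and assumes that the reported gpu.count reflects MIG devices once the profile is applied. Both function names are hypothetical, and the per-GPU figure should be adjusted for other hardware or profiles.

```shell
#!/usr/bin/env bash
# Sketch: estimate the GPU count expected after applying a MIG profile.
# Both function names are hypothetical helpers.

# Usage: expected_mig_count <physical-gpus> [instances-per-gpu]
# The default of 7 assumes A100 40GB GPUs with the all-1g.5gb profile.
expected_mig_count() {
    local physical="$1" per_gpu="${2:-7}"
    echo $(( physical * per_gpu ))
}

# Usage: mig_count_matches <physical-gpus> <reported-count> [instances-per-gpu]
# Succeeds when the reported count equals the expected MIG device count.
mig_count_matches() {
    local expected
    expected="$(expected_mig_count "$1" "${3:-7}")"
    [ "$2" -eq "$expected" ]
}
```

For a system with 8 physical GPUs, `expected_mig_count 8` gives the count to look for in the label, and `mig_count_matches 8 "$reported"` succeeds only when the reported value matches it.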

Deploy CKF

To deploy CKF, follow the instructions in General installation.

Explore some examples

CKF can be run on different types of DGX hardware: