Upgrade Kubeflow¶
This section describes how to upgrade Kubeflow. If you have not deployed Kubeflow in your cluster, you can safely skip this section.
What You’ll Need¶
- An upgraded management environment.
- An existing Kubeflow 1.4 deployment.
- Your clone of the Arrikto GitOps repository.
- Arrikto manifests for EKF version 1.5.3.
Procedure¶
Go to your GitOps repository inside your rok-tools management environment:
root@rok-tools:~# cd ~/ops/deployments
Upgrade your Spark Operator installation:
root@rok-tools:~/ops/deployments# rok-deploy --apply \
> kubeflow/manifests/contrib/spark/spark-operator/overlays/deploy \
> --force --force-kinds Deployment
Upgrade your Kubeflow installation:
root@rok-tools:~/ops/deployments# rok-deploy --apply install/kubeflow
Configure the Argo workflows executor if necessary. Choose one of the following options, based on your cloud provider:
You can skip this step since EKF already uses the correct executor.
Follow the Configure Argo Workflow Executor guide to set the executor to PNS.
Then, come back to this guide and follow the rest of the procedure.
You can skip this step since EKF already uses the correct executor.
Enable Istio sidecar injection for existing inference services:
root@rok-tools:~/ops/deployments# rok-serving-upgrade
Verify¶
Verify that the Dex pod is up and running. Check the pod status and verify that the STATUS field is Running and the READY field is 2/2:
root@rok-tools:~# kubectl -n auth get pods
NAME                   READY   STATUS    RESTARTS   AGE
dex-57c98bb9bb-l466d   2/2     Running   3          1m
Verify that the pods in the cert-manager namespace are up and running. Check the pod status and verify that the STATUS field is Running and the READY field is 1/1 for all Pods:
root@rok-tools:~# kubectl -n cert-manager get pods
NAME                                       READY   STATUS    RESTARTS   AGE
cert-manager-6d86476c77-qwgnj              1/1     Running   0          1m
cert-manager-cainjector-5b9cd446fd-kl9gg   1/1     Running   0          1m
cert-manager-webhook-64d967c45-jmxcz       1/1     Running   0          1m
Verify that the pods in the istio-system namespace are up and running. Check the pod status and verify that the STATUS field is Running and the READY field is 1/1 for all Pods:
root@rok-tools:~# kubectl -n istio-system get pods
NAME                                    READY   STATUS    RESTARTS   AGE
authservice-0                           1/1     Running   0          1m
cluster-local-gateway-b76ff5885-2rjg5   1/1     Running   0          1m
istio-ingressgateway-57f58bf544-x45kw   1/1     Running   0          1m
istiod-68f6c899f5-wzjfm                 1/1     Running   0          1m
Verify that the pods in the knative-monitoring namespace are up and running. Check the pod status and verify that the STATUS field is Running and the READY field is n/n for all Pods:
root@rok-tools:~# kubectl -n knative-monitoring get pods
NAME                                  READY   STATUS    RESTARTS   AGE
grafana-6695587d6f-ktf86              1/1     Running   0          1m
kube-state-metrics-79ddb7fc64-w7s5m   1/1     Running   0          1m
node-exporter-xlj2v                   2/2     Running   0          1m
node-exporter-zfjh5                   2/2     Running   0          1m
prometheus-system-0                   1/1     Running   0          1m
prometheus-system-1                   1/1     Running   0          1m
Verify that the pods in the knative-serving namespace are up and running. Check the pod status and verify that the STATUS field is Running and the READY field is 2/2 for all Pods:
root@rok-tools:~# kubectl -n knative-serving get pods
NAME                                READY   STATUS    RESTARTS   AGE
activator-5d6754bc67-qb2ct          2/2     Running   0          1m
autoscaler-6dd6dbbb84-zgwkf         2/2     Running   0          1m
controller-687f6c6995-27fkw         2/2     Running   0          1m
istio-webhook-8d4f5fbfb-tg6h4       2/2     Running   0          1m
networking-istio-785675596f-nnqbr   2/2     Running   0          1m
webhook-6d776d968c-gmnbz            2/2     Running   0          1m
Verify that the pods in the kubeflow namespace are up and running. Check the pod status and verify that the STATUS field is Running and the READY field is n/n for all Pods:
root@rok-tools:~# kubectl -n kubeflow get pods
NAME                                                         READY   STATUS    RESTARTS   AGE
admission-webhook-deployment-5d4cf6bbdb-jszsw                2/2     Running   0          1m
centraldashboard-fd8774874-56587                             2/2     Running   0          1m
jupyter-web-app-deployment-7987d45c7d-5gwss                  2/2     Running   0          1m
katib-controller-54f895f874-g29bx                            2/2     Running   2          1m
katib-db-manager-6f5d8f5945-wmmnb                            2/2     Running   1          1m
katib-mysql-857bfdb7f9-w5zj8                                 2/2     Running   0          1m
katib-ui-696fc69ddc-jkk2x                                    2/2     Running   2          1m
kfp-cache-d96f57c8b-5cjht                                    3/3     Running   4          1m
kfserving-controller-manager-0                               3/3     Running   1          1m
kubeflow-reception-9c67996fc-46djf                           2/2     Running   1          1m
metadata-db-d48d67699-89fg9                                  2/2     Running   0          1m
metadata-envoy-deployment-775b466c45-4gbkx                   1/1     Running   0          1m
metadata-grpc-deployment-5c975cb96d-tq5vr                    2/2     Running   4          1m
minio-7c9b6578cd-7f2tb                                       2/2     Running   0          1m
ml-pipeline-7867b5b879-dgmnj                                 2/2     Running   0          1m
ml-pipeline-persistenceagent-8495768cbb-vpfjt                2/2     Running   0          1m
ml-pipeline-scheduledworkflow-7f58d84f9f-4pf7d               2/2     Running   0          1m
ml-pipeline-ui-678cb55d6f-z9spc                              2/2     Running   0          1m
ml-pipeline-viewer-crd-57768dc6c6-wtxjm                      2/2     Running   1          1m
ml-pipeline-visualizationserver-68498d6df6-ms74w             2/2     Running   0          1m
models-web-app-748f8776df-zrc66                              2/2     Running   0          1m
mpi-operator-f658c675b-6jrln                                 1/1     Running   0          1m
mxnet-operator-6594fb56b-q68pp                               1/1     Running   0          1m
mysql-55d57856d7-bzvgd                                       2/2     Running   0          1m
notebook-controller-deployment-6cf9974cd9-2p9mj              2/2     Running   1          1m
profiles-deployment-64cf74dfd4-b6dx2                         3/3     Running   1          1m
pvcviewer-controller-controller-manager-6dd55d9dfd-m5j8s     3/3     Running   1          1m
pytorch-operator-74788b9d8c-prdsb                            2/2     Running   0          1m
spark-operatorsparkoperator-5775c699bb-4xgn2                 2/2     Running   0          1m
tensorboard-controller-controller-manager-7f766c8676-8g6fq   3/3     Running   2          1m
tensorboards-web-app-deployment-6b4dfd598c-r9xgk             1/1     Running   0          1m
tf-job-operator-d8b96567b-qj48v                              2/2     Running   1          1m
volumes-web-app-deployment-7b58b4c478-btfmw                  2/2     Running   0          1m
workflow-controller-76579565dd-8f6vw                         2/2     Running   1          1m
xgboost-operator-deployment-7dcff8bf85-t9hvr                 2/2     Running   1          1m
Verify that there are no inference services with Istio sidecar injection disabled, that is, the following command produces no output:
root@rok-tools:~# kubectl get isvc -A -o json \
> | jq -r '.items[] | select(.metadata.annotations["sidecar.istio.io/inject"]=="false") | .metadata.namespace, .metadata.name' \
> | paste - -
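The per-namespace pod checks in this section can also be scripted rather than read by eye. The following is a minimal sketch and not part of the EKF tooling: `pods_not_ready` is a hypothetical helper name, and it assumes the default tabular output of `kubectl get pods` (NAME, READY, STATUS as the first three columns) and that `awk` is available in your management environment.

```shell
# Hypothetical helper (not shipped with EKF): reads the default tabular
# output of `kubectl get pods` on stdin and prints the name of every pod
# whose STATUS is not Running or whose READY count is not n/n.
pods_not_ready() {
    awk 'NR > 1 {
        # $1 = NAME, $2 = READY (for example 2/2), $3 = STATUS
        split($2, ready, "/")
        if ($3 != "Running" || ready[1] != ready[2]) print $1
    }'
}

# Example usage (assumption: run where kubectl can reach the cluster):
#   for ns in auth cert-manager istio-system knative-monitoring \
#             knative-serving kubeflow; do
#       kubectl -n "$ns" get pods | pods_not_ready
#   done
```

No output means every listed pod is Running and fully ready. Note that this sketch also reports pods in other states, such as Completed, so treat any name it prints as a prompt to inspect that pod rather than as proof of a failed upgrade.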
Summary¶
You have successfully upgraded Kubeflow.