# Install kubeflow on Ubuntu 20.04, using microk8s

## Installation

Following the [installation steps](https://ubuntu.com/kubeflow/install)

- Machine requirements
  - 16 GB of RAM and 50 GB of free disk space.
- Install microk8s

  ```bash
  # sudo snap install microk8s --classic
  sudo snap install microk8s --classic --channel=1.20/stable
  ```

  ```
  microk8s.enable dns storage dashboard
  ```

  ```
  microk8s.enable kubeflow
  ```

## Access kubeflow dashboard

```
ssh -D9999 <user>@<master_node_public_ip>
```

In your host operating system, go to **Settings** > **Network** > **Network Proxy**, and enable **SOCKS proxy** pointing to: 127.0.0.1:9999.

## Add new worker node

At the master node, run this command:

```
microk8s add-node
```

It prints the API url with a token, like this:

```
Join node with: microk8s join ip-172-31-20-243:25000/DDOkUupkmaBezNnMheTBqFYHLWINGDbf

If the node you are adding is not reachable through the default interface you can use one of the following:
 microk8s join 10.1.84.0:25000/DDOkUupkmaBezNnMheTBqFYHLWINGDbf
 microk8s join 10.22.254.77:25000/DDOkUupkmaBezNnMheTBqFYHLWINGDbf
```

And at the worker node, run this command:

```
microk8s join ip-172-31-20-243:25000/DDOkUupkmaBezNnMheTBqFYHLWINGDbf
```

Remember to add the worker node host and ip to `/etc/hosts`

# Troubleshooting

- Kubeflow launches successfully, but the kubeflow dashboard login UI is blank
  - 404 Not found
  - this is most likely an auth service error
  - solution: **restart oidc-gatekeeper**: delete the oidc-gatekeeper pod and let it restart on its own
- Notebook servers page is blank
  - check your virtual services

  ```bash
  kubectl get virtualservices -A
  ```

  You need virtual services like the ones below to have the whole functionality of kubeflow.

  ```
  NAMESPACE   NAME                 GATEWAYS       HOSTS   AGE
  kubeflow    dex-auth             ["kubeflow"]   ["*"]   20h
  kubeflow    katib-ui             ["kubeflow"]   ["*"]   20h
  kubeflow    kubeflow-dashboard   ["kubeflow"]   ["*"]   20h
  kubeflow    jupyter-web          ["kubeflow"]   ["*"]   20h
  kubeflow    argo-ui              ["kubeflow"]   ["*"]   20h
  ```

  - If any of them is missing, create/patch the virtualservices.
  [reference](https://github.com/weaveworks/mlops-profile/blob/master/base/jupyter-web-app.yaml)
    - The original `jupyter-web.yaml` may have wrong settings. Compare it with the following:

  ```yaml
  apiVersion: networking.istio.io/v1beta1
  kind: VirtualService
  metadata:
    generation: 1
    name: jupyter-web
    namespace: kubeflow
  spec:
    gateways:
    - kubeflow
    hosts:
    - '*'
    http:
    - headers:
        request:
          add:
            x-forwarded-prefix: /jupyter
      match:
      - uri:
          prefix: /jupyter/
      rewrite:
        uri: /
      route:
      - destination:
          host: jupyter-web.kubeflow.svc.cluster.local
          port:
            number: 5000
  ```

  - To create the virtualservice: `kubectl create -f jupyter-web.yaml`
  - If you already have it, try to modify it using patch

  ```
  kubectl patch virtualservices jupyter-web -n kubeflow --patch "$(cat patch-jupyter-web.yaml)" --type=merge
  ```

  - Other yaml files that I patched

  ```yaml
  # metadata.yaml
  apiVersion: networking.istio.io/v1beta1
  kind: VirtualService
  metadata:
    generation: 1
    name: metadata-ui
    namespace: kubeflow
  spec:
    gateways:
    - kubeflow
    hosts:
    - '*'
    http:
    - match:
      - uri:
          prefix: /metadata/
      rewrite:
        uri: /metadata/
      route:
      - destination:
          host: metadata-ui.kubeflow.svc.cluster.local
          port:
            number: 3000
  # kubectl create -f metadata.yaml
  ```

  ```yaml
  # patch-pipeline.yaml
  spec:
    gateways:
    - kubeflow
    hosts:
    - '*'
    http:
    - match:
      - uri:
          prefix: /pipeline
      rewrite:
        uri: /pipeline
      route:
      - destination:
          host: pipelines-ui.kubeflow.svc.cluster.local
          port:
            number: 3000
      timeout: 300s
  # kubectl patch virtualservices pipelines-api -n kubeflow --patch "$(cat patch-pipeline.yaml)" --type=merge
  ```

- Failing to connect to a jupyter notebook (pressing the CONNECT button results in "not a valid page")
  - The `jupyter-controller` is what generates a new jupyter server.
  - Set the `env` of the `jupyter-controller` container so that it runs the jupyter container the correct way:
    - `USE_ISTIO: true`
    - `ISTIO_GATEWAY: {the gateway you use}`

  It would look something like this:

  ```yaml
  containers:
    - name: jupyter-controller
      image: >-
        registry.jujucharms.com/kubeflow-charmers/jupyter-controller/oci-image@sha256:6490f737000bd1d2520ac4b8cbde2b09749cdb291b1967ddda95d05131db49db
      command:
        - /manager
      env:
        - name: USE_CULLING
          value: 'true'
        - name: ENABLE_CULLING
          value: 'true'
        - name: USE_ISTIO
          value: 'true'
        - name: ISTIO_GATEWAY
          value: kubeflow
  ```

- Reason: Unprocessable Entity

  ```bash
  [2021-01-21 02:49:41,076] ERROR in app: Exception on /api/namespaces/admin/poddefaults [GET]
  Traceback (most recent call last):
    File "/app/kubeflow_jupyter/common/auth.py", line 64, in is_authorized
      obj = api.create_subject_access_review(sar)
    File "/usr/local/lib/python3.6/dist-packages/kubernetes/client/apis/authorization_v1_api.py", line 390, in create_subject_access_review
      (data) = self.create_subject_access_review_with_http_info(body, **kwargs)
    File "/usr/local/lib/python3.6/dist-packages/kubernetes/client/apis/authorization_v1_api.py", line 475, in create_subject_access_review_with_http_info
      collection_formats=collection_formats)
    File "/usr/local/lib/python3.6/dist-packages/kubernetes/client/api_client.py", line 321, in call_api
      _return_http_data_only, collection_formats, _preload_content, _request_timeout)
    File "/usr/local/lib/python3.6/dist-packages/kubernetes/client/api_client.py", line 155, in __call_api
      _request_timeout=_request_timeout)
    File "/usr/local/lib/python3.6/dist-packages/kubernetes/client/api_client.py", line 364, in request
      body=body)
    File "/usr/local/lib/python3.6/dist-packages/kubernetes/client/rest.py", line 266, in POST
      body=body)
    File "/usr/local/lib/python3.6/dist-packages/kubernetes/client/rest.py", line 222, in request
      raise ApiException(http_resp=r)
  kubernetes.client.rest.ApiException: (422)
  Reason: Unprocessable Entity
  HTTP response headers: HTTPHeaderDict({'Cache-Control': 'no-cache, private', 'Content-Type': 'application/json', 'X-Kubernetes-Pf-Flowschema-Uid': '27dd0635-eb66-4833-8291-628238525dbf', 'X-Kubernetes-Pf-Prioritylevel-Uid': 'c0415b08-1eaa-4f97-82c5-82c1a7d9f8bc', 'Date': 'Thu, 21 Jan 2021 02:49:41 GMT', 'Content-Length': '1786'})
  HTTP response body: {"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":".authorization.k8s.io \"\" is invalid: metadata: Invalid value: v1.ObjectMeta{Name:\"\", GenerateName:\"\", Namespace:\"\", SelfLink:\"\", UID:\"\", ResourceVersion:\"\", Generation:0, CreationTimestamp:v1.Time{Time:time.Time{wall:0x0, ext:0, loc:(*time.Location)(nil)}}, DeletionTimestamp:(*v1.Time)(nil), DeletionGracePeriodSeconds:(*int64)(nil), Labels:map[string]string{\"app.juju.is/created-by\":\"jupyter-web\"}, Annotations:map[string]string(nil), OwnerReferences:[]v1.OwnerReference(nil), Finalizers:[]string(nil), ClusterName:\"\", ManagedFields:[]v1.ManagedFieldsEntry{v1.ManagedFieldsEntry{Manager:\"Swagger-Codegen\", Operation:\"Update\", APIVersion:\"authorization.k8s.io/v1\", Time:(*v1.Time)(0xc01a531480), FieldsType:\"FieldsV1\", FieldsV1:(*v1.FieldsV1)(0xc01a5314a0)}}}: must be empty","reason":"Invalid","details":{"group":"authorization.k8s.io","causes":[{"reason":"FieldValueInvalid","message":"Invalid value: v1.ObjectMeta{Name:\"\", GenerateName:\"\", Namespace:\"\", SelfLink:\"\", UID:\"\", ResourceVersion:\"\", Generation:0, CreationTimestamp:v1.Time{Time:time.Time{wall:0x0, ext:0, loc:(*time.Location)(nil)}}, DeletionTimestamp:(*v1.Time)(nil), DeletionGracePeriodSeconds:(*int64)(nil), Labels:map[string]string{\"app.juju.is/created-by\":\"jupyter-web\"}, Annotations:map[string]string(nil), OwnerReferences:[]v1.OwnerReference(nil), Finalizers:[]string(nil), ClusterName:\"\", ManagedFields:[]v1.ManagedFieldsEntry{v1.ManagedFieldsEntry{Manager:\"Swagger-Codegen\", Operation:\"Update\", APIVersion:\"authorization.k8s.io/v1\", Time:(*v1.Time)(0xc01a531480), FieldsType:\"FieldsV1\", FieldsV1:(*v1.FieldsV1)(0xc01a5314a0)}}}: must be empty","field":"metadata"}]},"code":422}

  During handling of the above exception, another exception occurred:

  Traceback (most recent call last):
    File "/usr/local/lib/python3.6/dist-packages/flask/app.py", line 2292, in wsgi_app
      response = self.full_dispatch_request()
    File "/usr/local/lib/python3.6/dist-packages/flask/app.py", line 1815, in full_dispatch_request
      rv = self.handle_user_exception(e)
    File "/usr/local/lib/python3.6/dist-packages/flask/app.py", line 1718, in handle_user_exception
      reraise(exc_type, exc_value, tb)
    File "/usr/local/lib/python3.6/dist-packages/flask/_compat.py", line 35, in reraise
      raise value
    File "/usr/local/lib/python3.6/dist-packages/flask/app.py", line 1813, in full_dispatch_request
      rv = self.dispatch_request()
    File "/usr/local/lib/python3.6/dist-packages/flask/app.py", line 1799, in dispatch_request
      return self.view_functions[rule.endpoint](**req.view_args)
    File "/app/kubeflow_jupyter/common/base_app.py", line 49, in get_poddefaults
      data = api.list_poddefaults(namespace=namespace)
    File "/app/kubeflow_jupyter/common/auth.py", line 91, in runner
      if is_authorized(user, verb, namespace, group, version, resource):
    File "/app/kubeflow_jupyter/common/auth.py", line 68, in is_authorized
      sar, utils.parse_error(e))
  AttributeError: module 'kubeflow_jupyter.common.utils' has no attribute 'parse_error'
  127.0.0.1 - - [21/Jan/2021 02:49:41] "GET /api/namespaces/admin/poddefaults HTTP/1.1" 500 -
  ```

  - Using the latest juju bundle fixes this problem:
    - `microk8s enable kubeflow --bundle=cs:kubeflow-245`
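
For the "Notebook servers page is blank" case, comparing the `kubectl get virtualservices` output against the expected list by eye is easy to get wrong. The helper below is only a sketch: `missing_virtualservices` and the `required` variable are hypothetical names introduced here, and the list of required names is simply copied from the table in that item (your bundle may need others, such as `metadata-ui`).

```bash
# Names copied from the virtualservices table in this document; adjust as needed.
required="dex-auth katib-ui kubeflow-dashboard jupyter-web argo-ui"

# missing_virtualservices (hypothetical helper): reads one VirtualService name
# per line on stdin and prints every required name that is not present.
missing_virtualservices() {
  present="$(cat)"
  for name in $required; do
    # -x matches the whole line, -F treats the name as a fixed string
    if ! printf '%s\n' "$present" | grep -qxF -- "$name"; then
      echo "$name"
    fi
  done
}

# Usage against a live cluster (needs kubectl access to the kubeflow namespace):
#   kubectl get virtualservices -n kubeflow \
#     -o jsonpath='{range .items[*]}{.metadata.name}{"\n"}{end}' | missing_virtualservices
```

Empty output means all of the listed virtual services exist. Each name that is printed is one to create or patch as described in that item.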