# GKE Autopilot

It is known that Elastic Agent does not work out of the box on Google GKE Autopilot.

## Explore and confirm the cause

### Cannot deploy DaemonSets in the `kube-system` namespace

```console
Error from server (Forbidden): error when creating "elastic-agent-managed-kubernetes.yaml": daemonsets.apps is forbidden: User "christos.markou@elastic.co" cannot create resource "daemonsets" in API group "apps" in the namespace "kube-system": GKEAutopilot authz: the namespace "kube-system" is managed and the request's verb "create" is denied
```
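The denial comes from the Autopilot authorization layer (note the `GKEAutopilot authz` prefix), not from an RBAC rule, so it can be confirmed before applying any manifest. A minimal check, assuming `kubectl` points at the Autopilot cluster; the expected answers are shown as comments:

```console
kubectl auth can-i create daemonsets -n kube-system  # expected: no ("kube-system" is managed)
kubectl auth can-i create daemonsets -n default      # expected: yes
```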
### Cannot access some hostPaths

```console
Error from server ([denied by autogke-disallow-hostnamespaces]|[denied by autogke-no-write-mode-hostpath]): error when creating "elastic-agent-managed-kubernetes.yaml": admission webhook "gkepolicy.common-webhooks.networking.gke.io" denied the request: GKE Policy Controller rejected the request because it violates one or more policies: {"[denied by autogke-disallow-hostnamespaces]":["enabling hostPID is not allowed in Autopilot. Requested by user: 'christos.markou@elastic.co', groups: 'system:authenticated'.","enabling hostNetwork is not allowed in Autopilot. Requested by user: 'christos.markou@elastic.co', groups: 'system:authenticated'."],"[denied by autogke-no-write-mode-hostpath]":["hostPath volume proc used in container elastic-agent uses path /proc which is not allowed in Autopilot. Allowed path prefixes for hostPath volumes are: [/var/log/]. Requested by user: 'christos.markou@elastic.co', groups: 'system:authenticated'.","hostPath volume cgroup used in container elastic-agent uses path /sys/fs/cgroup which is not allowed in Autopilot. Allowed path prefixes for hostPath volumes are: [/var/log/]. Requested by user: 'christos.markou@elastic.co', groups: 'system:authenticated'.","hostPath volume varlibdockercontainers used in container elastic-agent uses path /var/lib/docker/containers which is not allowed in Autopilot. Allowed path prefixes for hostPath volumes are: [/var/log/]. Requested by user: 'christos.markou@elastic.co', groups: 'system:authenticated'.","hostPath volume etc-full used in container elastic-agent uses path /etc which is not allowed in Autopilot. Allowed path prefixes for hostPath volumes are: [/var/log/]. Requested by user: 'christos.markou@elastic.co', groups: 'system:authenticated'.","hostPath volume var-lib used in container elastic-agent uses path /var/lib which is not allowed in Autopilot. Allowed path prefixes for hostPath volumes are: [/var/log/]. Requested by user: 'christos.markou@elastic.co', groups: 'system:authenticated'.","hostPath volume etc-mid used in container elastic-agent uses path /etc/machine-id which is not allowed in Autopilot. Allowed path prefixes for hostPath volumes are: [/var/log/]. Requested by user: 'christos.markou@elastic.co', groups: 'system:authenticated'."]}
```

### Cannot use `hostNetwork`

```console
Error from server ([denied by autogke-disallow-hostnamespaces]): error when creating "elastic-agent-managed-kubernetes.yaml": admission webhook "gkepolicy.common-webhooks.networking.gke.io" denied the request: GKE Policy Controller rejected the request because it violates one or more policies: {"[denied by autogke-disallow-hostnamespaces]":["enabling hostNetwork is not allowed in Autopilot. Requested by user: 'christos.markou@elastic.co', groups: 'system:authenticated'.","enabling hostPID is not allowed in Autopilot. Requested by user: 'christos.markou@elastic.co', groups: 'system:authenticated'."]}
```

### `kube-state-metrics` default manifests cannot be deployed

```console
clusterrolebinding.rbac.authorization.k8s.io/kube-state-metrics created
clusterrole.rbac.authorization.k8s.io/kube-state-metrics created
Error from server (Forbidden): error when creating "../../kube-state-metrics/examples/standard/deployment.yaml": deployments.apps is forbidden: User "christos.markou@elastic.co" cannot create resource "deployments" in API group "apps" in the namespace "kube-system": GKEAutopilot authz: the namespace "kube-system" is managed and the request's verb "create" is denied
Error from server (Forbidden): error when creating "../../kube-state-metrics/examples/standard/service-account.yaml": serviceaccounts is forbidden: User "christos.markou@elastic.co" cannot create resource "serviceaccounts" in API group "" in the namespace "kube-system": GKEAutopilot authz: the namespace "kube-system" is managed and the request's verb "create" is denied
Error from server (Forbidden): error when creating "../../kube-state-metrics/examples/standard/service.yaml": services is forbidden: User "christos.markou@elastic.co" cannot create resource "services" in API group "" in the namespace "kube-system": GKEAutopilot authz: the namespace "kube-system" is managed and the request's verb "create" is denied
```

### Exceeds local storage

```
Container elastic-agent exceeded its local ephemeral storage limit "100Mi".
```

## Explore workarounds

1. Run in a namespace other than `kube-system`.
2. Remove the hostPath mounts related to security and cloud posture. Autopilot is restrictive enough that users might not need extra security visibility at this level.
3. Remove `hostNetwork`. TODO: verify what is missing.
4. Avoid using the `system` package to monitor the underlying nodes. Use GKE's own monitoring along with the GCP Elastic module to achieve this.
5. Set proper ephemeral-storage requests and limits.

### Cannot deploy DaemonSets in the `kube-system` namespace

Solution: change the namespace, as shown in the sketch below.
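A minimal sketch of the namespace workaround, assuming the stock `elastic-agent-managed-kubernetes.yaml` manifest and `default` as the target namespace (any unmanaged namespace works); the same rewrite applies to the `kube-state-metrics` example manifests:

```console
# Rewrite every namespaced resource from the managed "kube-system" to "default",
# then apply the result.
sed 's/namespace: kube-system/namespace: default/' elastic-agent-managed-kubernetes.yaml \
  | kubectl apply -f -
```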
## Make users aware of any confirmed limitations for GKE Autopilot mode during onboarding

1. The `orchestrator.cluster.name` field is not populated, so the aggregations and visualizations based on this field do not work. This can be fixed by adding the field manually with the following processor under the `processors` field of the integration:

   ```yml
   - add_fields:
       target: orchestrator.cluster
       fields:
         name: obs-cnative-elastic-agent-copilot
         url: https://34.79.103.184
   ```

## Working manifest

<details>
<summary>elastic-agent-managed-kubernetes.yml</summary>

```yml
# For more information https://www.elastic.co/guide/en/fleet/current/running-on-kubernetes-managed-by-fleet.html
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: elastic-agent
  # namespace: kube-system
  labels:
    app: elastic-agent
spec:
  selector:
    matchLabels:
      app: elastic-agent
  template:
    metadata:
      labels:
        app: elastic-agent
    spec:
      # Tolerations are needed to run Elastic Agent on Kubernetes control-plane nodes.
      # Agents running on control-plane nodes collect metrics from the control plane components (scheduler, controller manager) of Kubernetes
      tolerations:
        - key: node-role.kubernetes.io/control-plane
          effect: NoSchedule
        - key: node-role.kubernetes.io/master
          effect: NoSchedule
      serviceAccountName: elastic-agent
      #hostNetwork: true
      # 'hostPID: true' enables the Elastic Security integration to observe all process exec events on the host.
      # Sharing the host process ID namespace gives visibility of all processes running on the same host.
      #hostPID: true
      #dnsPolicy: ClusterFirstWithHostNet
      containers:
        - name: elastic-agent
          image: docker.elastic.co/beats/elastic-agent:8.5.0-SNAPSHOT
          env:
            # Set to 1 for enrollment into Fleet server. If not set, Elastic Agent is run in standalone mode
            - name: FLEET_ENROLL
              value: "1"
            # Set to true to communicate with Fleet with either insecure HTTP or unverified HTTPS
            - name: FLEET_INSECURE
              value: "true"
            # Fleet Server URL to enroll the Elastic Agent into
            # FLEET_URL can be found in Kibana, go to Management > Fleet > Settings
            - name: FLEET_URL
              value: "https://my-deployment-11cbbe.fleet.europe-west1.gcp.cloud.es.io"
            # Elasticsearch API key used to enroll Elastic Agents in Fleet (https://www.elastic.co/guide/en/fleet/current/fleet-enrollment-tokens.html#fleet-enrollment-tokens)
            # If FLEET_ENROLLMENT_TOKEN is empty then KIBANA_HOST, KIBANA_FLEET_USERNAME, KIBANA_FLEET_PASSWORD are needed
            - name: FLEET_ENROLLMENT_TOKEN
              value: "xxxxxxxxxxx=="
            - name: KIBANA_HOST
              value: "http://kibana:5601"
            # The basic authentication username used to connect to Kibana and retrieve a service_token to enable Fleet
            - name: KIBANA_FLEET_USERNAME
              value: "elastic"
            # The basic authentication password used to connect to Kibana and retrieve a service_token to enable Fleet
            - name: KIBANA_FLEET_PASSWORD
              value: "xxxxxxxxxxxx"
            - name: NODE_NAME
              valueFrom:
                fieldRef:
                  fieldPath: spec.nodeName
            - name: POD_NAME
              valueFrom:
                fieldRef:
                  fieldPath: metadata.name
          securityContext:
            runAsUser: 0
          resources:
            limits:
              memory: 500Mi
              ephemeral-storage: "4Gi"
            requests:
              cpu: 200m
              memory: 300Mi
              ephemeral-storage: "2Gi"
          volumeMounts:
            - name: varlog
              mountPath: /var/log
              readOnly: true
      volumes:
        - name: varlog
          hostPath:
            path: /var/log
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: elastic-agent
subjects:
  - kind: ServiceAccount
    name: elastic-agent
    namespace: default
roleRef:
  kind: ClusterRole
  name: elastic-agent
  apiGroup: rbac.authorization.k8s.io
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  #namespace: kube-system
  name: elastic-agent
subjects:
  - kind: ServiceAccount
    name: elastic-agent
    namespace: default
roleRef:
  kind: Role
  name: elastic-agent
  apiGroup: rbac.authorization.k8s.io
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: elastic-agent-kubeadm-config
  #namespace: kube-system
subjects:
  - kind: ServiceAccount
    name: elastic-agent
    namespace: default
roleRef:
  kind: Role
  name: elastic-agent-kubeadm-config
  apiGroup: rbac.authorization.k8s.io
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: elastic-agent
  labels:
    k8s-app: elastic-agent
rules:
  - apiGroups: [""]
    resources:
      - nodes
      - namespaces
      - events
      - pods
      - services
      - configmaps
      # Needed for cloudbeat
      - serviceaccounts
      - persistentvolumes
      - persistentvolumeclaims
    verbs: ["get", "list", "watch"]
  # Enable this rule only if planning to use kubernetes_secrets provider
  #- apiGroups: [""]
  #  resources:
  #    - secrets
  #  verbs: ["get"]
  - apiGroups: ["extensions"]
    resources:
      - replicasets
    verbs: ["get", "list", "watch"]
  - apiGroups: ["apps"]
    resources:
      - statefulsets
      - deployments
      - replicasets
      - daemonsets
    verbs: ["get", "list", "watch"]
  - apiGroups:
      - ""
    resources:
      - nodes/stats
    verbs:
      - get
  - apiGroups: ["batch"]
    resources:
      - jobs
      - cronjobs
    verbs: ["get", "list", "watch"]
  # Needed for apiserver
  - nonResourceURLs:
      - "/metrics"
    verbs:
      - get
  # Needed for cloudbeat
  - apiGroups: ["rbac.authorization.k8s.io"]
    resources:
      - clusterrolebindings
      - clusterroles
      - rolebindings
      - roles
    verbs: ["get", "list", "watch"]
  # Needed for cloudbeat
  - apiGroups: ["policy"]
    resources:
      - podsecuritypolicies
    verbs: ["get", "list", "watch"]
  - apiGroups: ["storage.k8s.io"]
    resources:
      - storageclasses
    verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: elastic-agent
  # Should be the namespace where elastic-agent is running
  #namespace: kube-system
  labels:
    k8s-app: elastic-agent
rules:
  - apiGroups:
      - coordination.k8s.io
    resources:
      - leases
    verbs: ["get", "create", "update"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: elastic-agent-kubeadm-config
  #namespace: kube-system
  labels:
    k8s-app: elastic-agent
rules:
  - apiGroups: [""]
    resources:
      - configmaps
    resourceNames:
      - kubeadm-config
    verbs: ["get"]
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: elastic-agent
  #namespace: kube-system
  labels:
    k8s-app: elastic-agent
---
```

</details>
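To sanity-check the result, the manifest can be applied and the agent pods inspected; the commands below assume the `default` namespace used throughout:

```console
kubectl apply -f elastic-agent-managed-kubernetes.yml
kubectl get pods -n default -l app=elastic-agent
# Tail one agent pod's logs to confirm Fleet enrollment succeeded
kubectl logs -n default daemonset/elastic-agent --tail=50
```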
["rbac.authorization.k8s.io"] resources: - clusterrolebindings - clusterroles - rolebindings - roles verbs: ["get", "list", "watch"] # Needed for cloudbeat - apiGroups: ["policy"] resources: - podsecuritypolicies verbs: ["get", "list", "watch"] - apiGroups: [ "storage.k8s.io" ] resources: - storageclasses verbs: [ "get", "list", "watch" ] --- apiVersion: rbac.authorization.k8s.io/v1 kind: Role metadata: name: elastic-agent # Should be the namespace where elastic-agent is running #namespace: kube-system labels: k8s-app: elastic-agent rules: - apiGroups: - coordination.k8s.io resources: - leases verbs: ["get", "create", "update"] --- apiVersion: rbac.authorization.k8s.io/v1 kind: Role metadata: name: elastic-agent-kubeadm-config #namespace: kube-system labels: k8s-app: elastic-agent rules: - apiGroups: [""] resources: - configmaps resourceNames: - kubeadm-config verbs: ["get"] --- apiVersion: v1 kind: ServiceAccount metadata: name: elastic-agent #namespace: kube-system labels: k8s-app: elastic-agent --- ``` </details> ## Summary With the above manifest and the manual addition of processor for adding the orchestrator fields, the OOTB dashboards work as expected and the onboarding is smooth as in Standard GKE.