Blog Image: https://unsplash.com/photos/aQYgUYwnCsM

# OpenShift Logging and Kubernetes Auditing - The Mighty Duo

In today's fast-paced digital landscape, containerization has become the norm, and Kubernetes has emerged as the _de facto_ standard for container orchestration. However, with the increasing complexity of Kubernetes deployments, it has become more critical than ever to monitor and secure those environments.

Kubernetes auditing is a powerful tool that provides visibility into the activities of your Kubernetes cluster, allowing you to detect and investigate suspicious activity, identify potential security breaches, and maintain compliance with regulatory requirements.

In this blog post, we will explore the fundamentals of Kubernetes auditing, its benefits, and how you can leverage it to enhance the security of your OpenShift deployments. On top of that, we will leverage OpenShift Logging to send alerts based on audit events.

The information in this blog post has been extracted mainly from the [OpenShift official docs](https://docs.openshift.com/container-platform/4.12/security/audit-log-view.html) and from the [Kubernetes auditing community docs](https://kubernetes.io/docs/tasks/debug/debug-cluster/audit/).

## OpenShift Environments

OpenShift ships with an opinionated configuration for auditing. This configuration has been thoughtfully designed by our engineers to cover most of our customers' use cases. Still, some tweaks can be made to this configuration to better handle specific use cases.

## Introduction to Kubernetes Auditing

Kubernetes auditing provides a security-relevant, chronological set of records documenting the sequence of actions in a cluster. The cluster audits the activities generated by users, by applications that use the Kubernetes API, and by the control plane itself.
<sup>[source](https://kubernetes.io/docs/tasks/debug/debug-cluster/audit/)</sup>

This information allows cluster administrators to answer the following questions: <sup>[source](https://kubernetes.io/docs/tasks/debug/debug-cluster/audit/)</sup>

* What happened and when did it happen?
* Who initiated it?
* On which resource did it happen?
* Where was it observed?
* From where was it initiated?

Collecting this information has an impact on the memory/CPU consumption of the Kubernetes API server. That consumption depends on the audit configuration: the more verbose it is, the higher the consumption.

### Auditing Stages

<sup>[source](https://kubernetes.io/docs/tasks/debug/debug-cluster/audit/)</sup>

* `RequestReceived`: The stage for events generated as soon as the audit handler receives the request, and before it is delegated down the handler chain.
* `ResponseStarted`: Once the response headers are sent, but before the response body is sent. This stage is only generated for long-running requests (e.g. watch).
* `ResponseComplete`: The response body has been completed and no more bytes will be sent.
* `Panic`: Events generated when a panic occurred.

### Auditing Policy

<sup>[source](https://kubernetes.io/docs/tasks/debug/debug-cluster/audit/)</sup>

Each rule in an audit policy logs matching events at one of the following levels:

* `None`: Don't log events that match this rule.
* `Metadata`: Log request metadata (requesting user, timestamp, resource, verb, etc.) but not request or response body.
* `Request`: Log event metadata and request body but not response body. This does not apply to non-resource requests.
* `RequestResponse`: Log event metadata, request and response bodies. This does not apply to non-resource requests.

## OpenShift Auditing Configuration

<sup>[source](https://docs.openshift.com/container-platform/4.12/security/audit-log-view.html)</sup>

The default audit policy logs only metadata for read and write requests.
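To see how the stages and levels above fit together, here is a minimal sketch of an upstream-style Kubernetes audit policy file. This is illustrative only: on OpenShift you do not manage this file directly (the platform renders the effective policy from the `APIServer` resource), and the file path used here is an arbitrary assumption.

```shell
# Hedged sketch of an upstream-style audit policy combining omitStages and
# per-rule levels. On vanilla Kubernetes this file would be passed to the
# kube-apiserver via --audit-policy-file; OpenShift manages this for you.
cat > /tmp/audit-policy.yaml <<'EOF'
apiVersion: audit.k8s.io/v1
kind: Policy
# Skip the noisy RequestReceived stage entirely
omitStages:
  - "RequestReceived"
rules:
  # Sensitive resources: log metadata only, never bodies
  - level: Metadata
    resources:
      - group: ""
        resources: ["secrets"]
  # Writes to pods: log the request body too
  - level: Request
    verbs: ["create", "update", "patch", "delete"]
    resources:
      - group: ""
        resources: ["pods"]
  # Everything else: metadata only
  - level: Metadata
EOF

# Quick sanity check: three rules defined
grep -c "level:" /tmp/audit-policy.yaml
```

Rules are evaluated top to bottom and the first match wins, which is why the broad `Metadata` catch-all goes last.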
We can make changes to this default configuration. OpenShift ships with four pre-defined audit profiles we can use:

* `Default`: Logs metadata for read and write requests (get, list, create, update, patch).
* `WriteRequestBodies`: Same as `Default`, but it also logs request bodies for every write request (create, update, patch).
* `AllRequestBodies`: Same as `WriteRequestBodies`, but it also logs the read requests (get, list).
* `None`: Disables logging.

We should keep in mind that there are some sensitive resources, like Secrets, OAuthClients or Routes, for which request bodies will never be logged.

The configuration is stored in the `APIServer` custom resource, under the `.spec.audit.profile` parameter. We can define rules for different users/groups; for example, we could log request bodies for regular users but not log them for admin users.

## Commands that should be audited

In order to maintain the security and integrity of Kubernetes environments, there are some commands that should be audited:

* `oc rsh`
* `oc exec`
* `oc attach`
* `oc debug`

These commands provide users with powerful capabilities to interact with running containers in the cluster, such as running arbitrary commands, attaching to running processes, and debugging applications or nodes. However, if used improperly, these commands can introduce significant security risks, such as unauthorized access, data leakage, or disruption of the cluster's stability.

Therefore, auditing these commands helps to ensure that users are only using them for legitimate purposes and following the least privilege principle. Additionally, auditing can detect any unusual or suspicious activities that may indicate a security breach or a violation of the organization's policies. Overall, auditing these commands is essential to maintain the security and compliance of Kubernetes environments.

### How oc rsh /exec / debug / attach can be blocked via RBAC?
* `oc rsh` and `oc exec` require permissions to `get` the pods in a given namespace and to `create` against the `pods/exec` resource in the core API group. If the user is missing either of these two permissions, the execution will be blocked.
* `oc debug` requires permissions to `get` the pods in a given namespace (or nodes at cluster level) and to `update`/`create` against the `pods` resource in the core API group, and finally to `create` against the `pods/attach` resource in the core API group.
* `oc attach` requires permission to `get` the pods in a given namespace and to `create` against the `pods/attach` resource in the core API group.

Below are some examples of blocked calls for these commands:

~~~sh
$ oc rsh test-74f5d5857f-snzsz
Error from server (Forbidden): pods "test-74f5d5857f-snzsz" is forbidden: User "developer3" cannot create resource "pods/exec" in API group "" in the namespace "test-audit"

$ oc exec -ti test-74f5d5857f-snzsz -- /bin/sh
Error from server (Forbidden): pods "test-74f5d5857f-snzsz" is forbidden: User "developer3" cannot create resource "pods/exec" in API group "" in the namespace "test-audit"

$ oc debug pod/test-74f5d5857f-snzsz
Error from server (Forbidden): pods is forbidden: User "developer3" cannot create resource "pods" in API group "" in the namespace "test-audit"

$ oc debug node/openshift-master-0
Error from server (Forbidden): nodes "openshift-master-0" is forbidden: User "developer3" cannot get resource "nodes" in API group "" at the cluster scope

$ oc attach test-74f5d5857f-snzsz
If you don't see a command prompt, try pressing enter.
Error from server (Forbidden): pods "test-74f5d5857f-snzsz" is forbidden: User "developer3" cannot create resource "pods/attach" in API group "" in the namespace "test-audit"
~~~

### What do we see in the Kubernetes Audit logs?

#### Example of `oc rsh / exec`

> **INFO**: In the logs below we won't see any `RequestReceived` event, since those are omitted in the default configuration.
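Before dissecting individual events, it helps to know where to get them: on OpenShift the API server audit logs can be retrieved per control-plane node with `oc adm node-logs`. Since that needs a live cluster, the sketch below shows the cluster command as a comment and runs the same kind of filtering against a saved sample line (an abridged version of the events shown in this post; the auditIDs are shortened for readability).

```shell
# On a live cluster, the kube-apiserver audit logs can be pulled with:
#   oc adm node-logs --role=master --path=kube-apiserver/audit.log
#
# For illustration, we filter a saved sample instead. Audit logs are one
# JSON event per line, so plain text tools work well as a first pass:
cat > /tmp/audit-sample.log <<'EOF'
{"kind":"Event","auditID":"a946069a","stage":"ResponseStarted","requestURI":"/api/v1/namespaces/test-audit/pods/test-74f5d5857f-82ql8/exec?command=%2Fbin%2Fsh","verb":"create","user":{"username":"system:admin"}}
{"kind":"Event","auditID":"0d3f1a2b","stage":"ResponseComplete","requestURI":"/api/v1/namespaces/test-audit/pods/test-74f5d5857f-82ql8","verb":"get","user":{"username":"system:admin"}}
EOF

# Keep only exec subresource requests, dropping the unrelated GET:
grep '/exec?' /tmp/audit-sample.log
```

The same `'/exec?'` substring is what we will later use in LogQL queries against the audit stream.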
If we look at the `auditID` of both the started and finished command audit logs, we will see that the ID is the same for both of them. We can tell when the command started and when it finished by looking at the `stage`. Finally, in the `requestURI` we can see the pod, the container, and what command was used as entrypoint.

* Command started:

~~~json
{"kind":"Event","apiVersion":"audit.k8s.io/v1","level":"Metadata","auditID":"a946069a-9832-4bc9-bd20-a957fcd61e69","stage":"ResponseStarted","requestURI":"/api/v1/namespaces/test-audit/pods/test-74f5d5857f-82ql8/exec?command=%2Fbin%2Fsh\u0026command=-c\u0026command=TERM%3D%22xterm-256color%22+%2Fbin%2Fsh\u0026container=trbsht\u0026stdin=true\u0026stdout=true\u0026tty=true","verb":"create","user":{"username":"system:admin","groups":["system:masters","system:authenticated"]},"sourceIPs":["10.39.193.118"],"userAgent":"oc/4.11.0 (linux/amd64) kubernetes/5e53738","objectRef":{"resource":"pods","namespace":"test-audit","name":"test-74f5d5857f-82ql8","apiVersion":"v1","subresource":"exec"},"responseStatus":{"metadata":{},"code":101},"requestReceivedTimestamp":"2022-12-22T11:43:16.025549Z","stageTimestamp":"2022-12-22T11:43:16.046947Z","annotations":{"authorization.k8s.io/decision":"allow","authorization.k8s.io/reason":""}}
~~~

* Command finished:

~~~json
{"kind":"Event","apiVersion":"audit.k8s.io/v1","level":"Metadata","auditID":"a946069a-9832-4bc9-bd20-a957fcd61e69","stage":"ResponseComplete","requestURI":"/api/v1/namespaces/test-audit/pods/test-74f5d5857f-82ql8/exec?command=%2Fbin%2Fsh\u0026command=-c\u0026command=TERM%3D%22xterm-256color%22+%2Fbin%2Fsh\u0026container=trbsht\u0026stdin=true\u0026stdout=true\u0026tty=true","verb":"create","user":{"username":"system:admin","groups":["system:masters","system:authenticated"]},"sourceIPs":["10.39.193.118"],"userAgent":"oc/4.11.0 (linux/amd64) kubernetes/5e53738","objectRef":{"resource":"pods","namespace":"test-audit","name":"test-74f5d5857f-82ql8","apiVersion":"v1","subresource":"exec"},"responseStatus":{"metadata":{},"code":101},"requestReceivedTimestamp":"2022-12-22T11:43:16.025549Z","stageTimestamp":"2022-12-22T11:43:21.856096Z","annotations":{"apiserver.latency.k8s.io/etcd":"2.747066ms","apiserver.latency.k8s.io/total":"5.830546721s","authorization.k8s.io/decision":"allow","authorization.k8s.io/reason":""}}
~~~

#### Example of `oc attach`

If we look at the `auditID` of both the started and finished command audit logs, we will see that the ID is the same for both of them. We can tell when the command started and when it finished by looking at the `stage`. Finally, in the `requestURI` we can see which container in a pod the user attached to.

* Command started:

~~~json
{"kind":"Event","apiVersion":"audit.k8s.io/v1","level":"Metadata","auditID":"1fb198e5-168f-47ed-ae7e-1ae0b95dcecf","stage":"ResponseStarted","requestURI":"/api/v1/namespaces/test-audit/pods/test-74f5d5857f-snzsz/attach?container=trbsht\u0026stderr=true\u0026stdout=true","verb":"create","user":{"username":"system:admin","groups":["system:masters","system:authenticated"]},"sourceIPs":["10.39.194.44"],"userAgent":"oc/4.11.0 (linux/amd64) kubernetes/5e53738","objectRef":{"resource":"pods","namespace":"test-audit","name":"test-74f5d5857f-snzsz","apiVersion":"v1","subresource":"attach"},"responseStatus":{"metadata":{},"code":101},"requestReceivedTimestamp":"2023-01-11T14:23:10.276425Z","stageTimestamp":"2023-01-11T14:23:10.292822Z","annotations":{"authorization.k8s.io/decision":"allow","authorization.k8s.io/reason":""}}
~~~

* Command finished:

~~~json
{"kind":"Event","apiVersion":"audit.k8s.io/v1","level":"Metadata","auditID":"1fb198e5-168f-47ed-ae7e-1ae0b95dcecf","stage":"ResponseComplete","requestURI":"/api/v1/namespaces/test-audit/pods/test-74f5d5857f-snzsz/attach?container=trbsht\u0026stderr=true\u0026stdout=true","verb":"create","user":{"username":"system:admin","groups":["system:masters","system:authenticated"]},"sourceIPs":["10.39.194.44"],"userAgent":"oc/4.11.0 (linux/amd64) kubernetes/5e53738","objectRef":{"resource":"pods","namespace":"test-audit","name":"test-74f5d5857f-snzsz","apiVersion":"v1","subresource":"attach"},"responseStatus":{"metadata":{},"code":101},"requestReceivedTimestamp":"2023-01-11T14:23:10.276425Z","stageTimestamp":"2023-01-11T14:23:14.075215Z","annotations":{"apiserver.latency.k8s.io/etcd":"2.187636ms","apiserver.latency.k8s.io/total":"3.798790765s","authorization.k8s.io/decision":"allow","authorization.k8s.io/reason":""}}
~~~

#### Example of `oc debug pod`

If we look at the `auditID` of both the started and finished command audit logs, we will see that the ID is the same for both of them. We can tell when the command started and when it finished by looking at the `stage`. Finally, in the `requestURI` we can see which pod is being debugged.
* Command started:

~~~json
{"kind":"Event","apiVersion":"audit.k8s.io/v1","level":"Metadata","auditID":"e4d2c08d-ca58-4722-ba39-d26dd6a68cc1","stage":"ResponseComplete","requestURI":"/api/v1/namespaces/test-audit/pods","verb":"create","user":{"username":"system:admin","groups":["system:masters","system:authenticated"]},"sourceIPs":["10.39.194.44"],"userAgent":"oc/4.11.0 (linux/amd64) kubernetes/5e53738","objectRef":{"resource":"pods","namespace":"test-audit","name":"test-74f5d5857f-snzsz-debug","apiVersion":"v1"},"responseStatus":{"metadata":{},"code":201},"requestReceivedTimestamp":"2023-01-11T13:59:26.219666Z","stageTimestamp":"2023-01-11T13:59:26.237960Z","annotations":{"<OMITTED>"}}
{"kind":"Event","apiVersion":"audit.k8s.io/v1","level":"Metadata","auditID":"d016bf04-aa32-4e04-90fa-cb262561d955","stage":"ResponseStarted","requestURI":"/api/v1/namespaces/test-audit/pods/test-74f5d5857f-snzsz-debug/attach?container=trbsht\u0026stdin=true\u0026stdout=true\u0026tty=true","verb":"create","user":{"username":"system:admin","groups":["system:masters","system:authenticated"]},"sourceIPs":["10.39.194.44"],"userAgent":"oc/4.11.0 (linux/amd64) kubernetes/5e53738","objectRef":{"resource":"pods","namespace":"test-audit","name":"test-74f5d5857f-snzsz-debug","apiVersion":"v1","subresource":"attach"},"responseStatus":{"metadata":{},"code":101},"requestReceivedTimestamp":"2023-01-11T13:59:30.556395Z","stageTimestamp":"2023-01-11T13:59:30.579673Z","annotations":{"authorization.k8s.io/decision":"allow","authorization.k8s.io/reason":""}}
~~~

* Command finished:

~~~json
{"kind":"Event","apiVersion":"audit.k8s.io/v1","level":"Metadata","auditID":"d016bf04-aa32-4e04-90fa-cb262561d955","stage":"ResponseComplete","requestURI":"/api/v1/namespaces/test-audit/pods/test-74f5d5857f-snzsz-debug/attach?container=trbsht\u0026stdin=true\u0026stdout=true\u0026tty=true","verb":"create","user":{"username":"system:admin","groups":["system:masters","system:authenticated"]},"sourceIPs":["10.39.194.44"],"userAgent":"oc/4.11.0 (linux/amd64) kubernetes/5e53738","objectRef":{"resource":"pods","namespace":"test-audit","name":"test-74f5d5857f-snzsz-debug","apiVersion":"v1","subresource":"attach"},"responseStatus":{"metadata":{},"code":101},"requestReceivedTimestamp":"2023-01-11T13:59:30.556395Z","stageTimestamp":"2023-01-11T13:59:35.110094Z","annotations":{"apiserver.latency.k8s.io/etcd":"2.18315ms","apiserver.latency.k8s.io/total":"4.55369928s","authorization.k8s.io/decision":"allow","authorization.k8s.io/reason":""}}
{"kind":"Event","apiVersion":"audit.k8s.io/v1","level":"Metadata","auditID":"b91b5044-e16f-42d0-84ea-f791c94e3957","stage":"ResponseComplete","requestURI":"/api/v1/namespaces/test-audit/pods/test-74f5d5857f-snzsz-debug","verb":"delete","user":{"username":"system:admin","groups":["system:masters","system:authenticated"]},"sourceIPs":["10.39.194.44"],"userAgent":"oc/4.11.0 (linux/amd64) kubernetes/5e53738","objectRef":{"resource":"pods","namespace":"test-audit","name":"test-74f5d5857f-snzsz-debug","apiVersion":"v1"},"responseStatus":{"metadata":{},"code":200},"requestReceivedTimestamp":"2023-01-11T13:59:35.276029Z","stageTimestamp":"2023-01-11T13:59:35.295104Z","annotations":{"authorization.k8s.io/decision":"allow","authorization.k8s.io/reason":""}}
~~~

#### Example of `oc debug node`

If we look at the `auditID` of the started and finished command audit logs, we will see that there are three IDs; one of them is the same for both the started and finished commands. We can tell when the command started and when it finished by looking at the `stage`.
Finally, in the `requestURI` we can see which node is being debugged. The first event creates the debug pod for accessing the node; the second event attaches to this pod. Once the debug session finishes, we have two events: one that detaches from the pod, and another one that deletes the debug pod.

* Command started:

~~~json
{"kind":"Event","apiVersion":"audit.k8s.io/v1","level":"Metadata","auditID":"79b81446-0e5a-4a7f-9ced-f56c282b3bce","stage":"ResponseComplete","requestURI":"/api/v1/namespaces/test-audit/pods","verb":"create","user":{"username":"system:admin","groups":["system:masters","system:authenticated"]},"sourceIPs":["10.39.194.44"],"userAgent":"oc/4.11.0 (linux/amd64) kubernetes/5e53738","objectRef":{"resource":"pods","namespace":"test-audit","name":"openshift-master-1-debug","apiVersion":"v1"},"responseStatus":{"metadata":{},"code":201},"requestReceivedTimestamp":"2023-01-11T14:31:24.182887Z","stageTimestamp":"2023-01-11T14:31:24.204870Z","annotations":{<OMITTED>}}
{"kind":"Event","apiVersion":"audit.k8s.io/v1","level":"Metadata","auditID":"83b1e926-1568-46bf-be03-028f7031b7bb","stage":"ResponseStarted","requestURI":"/api/v1/namespaces/test-audit/pods/openshift-master-1-debug/attach?container=container-00\u0026stdin=true\u0026stdout=true\u0026tty=true","verb":"create","user":{"username":"system:admin","groups":["system:masters","system:authenticated"]},"sourceIPs":["10.39.194.44"],"userAgent":"oc/4.11.0 (linux/amd64) kubernetes/5e53738","objectRef":{"resource":"pods","namespace":"test-audit","name":"openshift-master-1-debug","apiVersion":"v1","subresource":"attach"},"responseStatus":{"metadata":{},"code":101},"requestReceivedTimestamp":"2023-01-11T14:31:38.032804Z","stageTimestamp":"2023-01-11T14:31:38.056639Z","annotations":{"authorization.k8s.io/decision":"allow","authorization.k8s.io/reason":""}}
~~~

* Command finished:

~~~json
{"kind":"Event","apiVersion":"audit.k8s.io/v1","level":"Metadata","auditID":"83b1e926-1568-46bf-be03-028f7031b7bb","stage":"ResponseComplete","requestURI":"/api/v1/namespaces/test-audit/pods/openshift-master-1-debug/attach?container=container-00\u0026stdin=true\u0026stdout=true\u0026tty=true","verb":"create","user":{"username":"system:admin","groups":["system:masters","system:authenticated"]},"sourceIPs":["10.39.194.44"],"userAgent":"oc/4.11.0 (linux/amd64) kubernetes/5e53738","objectRef":{"resource":"pods","namespace":"test-audit","name":"openshift-master-1-debug","apiVersion":"v1","subresource":"attach"},"responseStatus":{"metadata":{},"code":101},"requestReceivedTimestamp":"2023-01-11T14:31:38.032804Z","stageTimestamp":"2023-01-11T14:31:44.772604Z","annotations":{"apiserver.latency.k8s.io/etcd":"1.543679ms","apiserver.latency.k8s.io/total":"6.739800833s","authorization.k8s.io/decision":"allow","authorization.k8s.io/reason":""}}
{"kind":"Event","apiVersion":"audit.k8s.io/v1","level":"Metadata","auditID":"d5b0ef45-af27-4bc9-8545-51825a6ab424","stage":"ResponseComplete","requestURI":"/api/v1/namespaces/test-audit/pods/openshift-master-1-debug","verb":"delete","user":{"username":"system:admin","groups":["system:masters","system:authenticated"]},"sourceIPs":["10.39.194.44"],"userAgent":"oc/4.11.0 (linux/amd64) kubernetes/5e53738","objectRef":{"resource":"pods","namespace":"test-audit","name":"openshift-master-1-debug","apiVersion":"v1"},"responseStatus":{"metadata":{},"code":200},"requestReceivedTimestamp":"2023-01-11T14:31:44.936277Z","stageTimestamp":"2023-01-11T14:31:44.945775Z","annotations":{"authorization.k8s.io/decision":"allow","authorization.k8s.io/reason":""}}
~~~

## Kubernetes Auditing Questions

**Q**: When a user runs the "oc rsh" or "oc exec" commands, what do we see?

**A**: The command executed will be logged, when it started and when it finished. Commands typed inside a shell of the container won’t be logged.
**Q**: When a user runs the "oc debug node" command, what do we see?

**A**: The pod creation will be logged. The pod created for "oc debug node" commands is named `<node-name>-debug`, e.g. `openshift-master-0-debug`. Commands typed by the user won’t be logged.

**Q**: Can we audit SSH logins into the nodes?

**A**: Yes, but that information is logged at the OS level, for example in `/var/log/audit/audit.log`.

**Q**: Does OpenShift Logging save Kubernetes Audit and OS Audit logs by default?

**A**: Yes. More info [here](https://docs.openshift.com/container-platform/4.12/logging/cluster-logging.html). Important: Audit logs are not stored in the OCP logstore instance by default; [we recommend sending them to a secure external location](https://docs.openshift.com/container-platform/4.12/logging/cluster-logging-external.html#cluster-logging-external). If you still want to get them in the default logstore, you can read [this doc](https://docs.openshift.com/container-platform/4.12/logging/config/cluster-logging-log-store.html#cluster-logging-elasticsearch-audit_cluster-logging-store).

**Q**: Can we block the execution of rsh / exec / debug / attach?

**A**: Yes. We can leverage RBAC to block rsh / exec / debug / attach, as seen [here](#How-oc-rsh-exec--debug--attach-can-be-blocked-via-RBAC).

**Q**: Can we know when someone accesses one of the nodes?

**A**: SSH logins are logged in `/var/log/audit/audit.log`. "oc debug node" accesses are logged in the Kubernetes audit log. Debugged pods will get the `<podName>-debug` name; debugged nodes will get the `<nodeName>-debug` name. Keep in mind that users with access to create privileged pods could craft pods that mimic `oc debug node` behavior with a different name.

**Q**: Can we know what actions a user did when accessing a node?

**A**: Commands [can be audited](https://access.redhat.com/solutions/49257) or [logged](https://access.redhat.com/solutions/20707). Examples [here](#AuditD-rules).
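As a taste of the auditd approach referenced in the answer above, the sketch below writes standard auditd rules that record every `execve()` call on a node. The rule syntax is stock auditd; the file name and key are illustrative assumptions, and on RHCOS nodes such rules would typically be delivered via a MachineConfig rather than written by hand.

```shell
# Hedged sketch: auditd rules capturing every command execution on a node,
# tagged with a searchable key (names here are illustrative).
cat > /tmp/75-audit-exec.rules <<'EOF'
-a always,exit -F arch=b64 -S execve -k command-exec
-a always,exit -F arch=b32 -S execve -k command-exec
EOF

# On the node, this file would live under /etc/audit/rules.d/ and be loaded
# with `augenrules --load`; matching events can then be queried with:
#   ausearch -k command-exec

# Sanity check: one rule per syscall architecture
grep -c 'execve' /tmp/75-audit-exec.rules
```

Both the 64-bit and 32-bit syscall ABIs are covered, since a process can invoke either regardless of the kernel's own bitness.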
**Q**: Can we know when a debug pod gets created?

**A**: Yes. If we configure the audit level to include request bodies, we will see something like this in the audit log:

~~~json
{"kind":"Event","apiVersion":"audit.k8s.io/v1","level":"RequestResponse","auditID":"d0390c4b-4e0b-4f98-a5d1-c6a4de4e7647","stage":"ResponseComplete","requestURI":"/api/v1/namespaces/exec-rsh-audit/pods/openshift-master-1-debug","verb":"get",<OMITTED_OUTPUT>,"nodeName":"openshift-master-1",<OMITTED_OUTPUT>}}
~~~

**Q**: Can we know on which node a pod is located when we detect an rsh / exec / attach against it?

**A**: No. The Kubernetes audit log doesn’t provide that information.

## Processing Audit logs

Once we have our audit systems properly configured and our audit logs being generated, processing them automatically will help maintain the security and compliance of containerized applications, and ensure that organizations can quickly detect and respond to any security threats that arise. [OpenShift Logging](https://docs.openshift.com/container-platform/4.13/logging/cluster-logging.html) can help with that in two ways:

* We can consume audit logs from the cluster
* We can forward audit logs to an external location

The v5.7 release of OpenShift Logging comes with new features that moved to GA, like the Vector collector and the Loki logstore. We can leverage Loki not only to store our logs, but also to alert us when audit events happen in our cluster. In this section we will see how to consume logs from the UI, as well as how to create alerts out of these logs.

At the moment, Vector has some limitations in terms of supported backends for log forwarding; check them out [here](https://docs.openshift.com/container-platform/4.13/logging/cluster-logging-loki.html#collector-features).

### Consuming logs from the UI

One of the new features when using the Loki stack is being able to consume logs directly from the OpenShift Web Console.
You can access the Log console under `Observe -> Logs` once you have deployed the Cluster Logging operator and enabled the logs console plugin.

![Loki UI](https://i.imgur.com/uePbd4h.png)

The image above is what you see when you access the Logs UI within the OpenShift Web Console. You reach this UI by clicking on `1`, and you can filter the type of logs you want to see (`application`, `infrastructure`, or `audit`) by clicking on `2`.

Once on this UI we can search for specific logs. For example, let's filter by `oc rsh` or `oc exec` executions:

![Loki UI Filter Logs](https://i.imgur.com/DCbSxbp.png)

In the images above you can see how we can filter logs by using the built-in filters (`1`) or by manually writing the query (`2`). Manual queries use the `LogQL` format; you can read more about it [here](https://grafana.com/docs/loki/latest/logql/).

For this specific use case, filtering `oc rsh` or `oc exec` command executions, we have used the filter `{ log_type="audit" } |~ "/exec?" | json`. This filter searches for the "/exec?" string in the audit log content. We then tell the filter to extract all JSON properties as labels, if the log line is a valid JSON document, by piping the output to the `json` parser.

We can also expand the log messages. This is the view for an expanded message:

![Log entry expanded](https://i.imgur.com/FPpjMwA.png)

As you can see, we get the different labels presented.

### Creating alerts for audit logs

We have seen how we can query audit logs in the previous section; in this section we will configure alerts for `oc rsh/oc exec` and for `oc debug node`.
#### Alerting when detecting `oc rsh/exec` and `oc debug`

> **NOTE**: This section assumes that you have OpenShift Logging v5.7 deployed on OpenShift v4.13 with Loki used as the log store, and that the OpenShift Alertmanager is configured to send alerts somewhere (in our case to Slack; you can check the Alertmanager config used in this post [here](https://gist.githubusercontent.com/mvazquezc/09671f6d04c70bf1fbc02e74124a4e5d/raw/10ea65a279f0256b4e8f0e5ce3f1df4d756129fb/AlertManager.yaml)).

We will be creating an `AlertingRule` that covers these two cases. The idea is that we get a notification on Slack every time we detect that a user ran `oc exec/rsh` or `oc debug` in our cluster. We want the message to tell us who did it, where they did it, and when they did it. This is our `AlertingRule`:

> **NOTE**: We will check `oc rsh/exec` and `oc debug` for the last two hours. You can edit the time window to whatever fits you best.

> **IMPORTANT**: `AlertingRules` targeting `tenantID: audit` must be created in the `openshift-logging` namespace.

~~~yaml
apiVersion: loki.grafana.com/v1
kind: AlertingRule
metadata:
  name: audit-alerts
  namespace: openshift-logging
  labels:
    openshift.io/cluster-monitoring: "true"
spec:
  tenantID: "audit"
  groups:
    - name: audit-rules-group
      interval: 20s
      rules:
        - alert: OcRshExecCommandDetected
          expr: |
            sum by (objectRef_namespace, objectRef_name, user_username) (count_over_time({log_type="audit"} |~ "/exec?" | json [2h])) > 0
          for: 10s
          labels:
            severity: warning
            description: "Detected oc rsh / exec command on the cluster"
          annotations:
            summary: "Detected oc rsh / exec command execution"
            description: "Detected oc rsh / exec command on the cluster"
            message: "Detected oc rsh / exec on namespace {{ $labels.objectRef_namespace }}, rsh / exec was run against pod {{ $labels.objectRef_name }} by user {{ $labels.user_username }}."
            name: OcRshExecCommandDetected
        - alert: OcDebugCommandDetected
          expr: |
            sum by (objectRef_namespace, objectRef_name, user_username) (count_over_time({log_type="audit"} |~ "-debug/attach" | json [2h])) > 0
          for: 10s
          labels:
            severity: warning
            description: "Detected oc debug command on the cluster"
          annotations:
            summary: "Detected oc debug command execution"
            description: "Detected oc debug command on the cluster"
            message: "Detected oc debug on namespace {{ $labels.objectRef_namespace }}, the debug pod is named {{ $labels.objectRef_name }} and was requested by user {{ $labels.user_username }}."
            name: OcDebugCommandDetected
~~~

#### Alerting on `oc rsh/exec` and `oc debug`

##### `oc rsh/exec`

1. In the Web Console we will get the alert in the Alerting section:

![oc rsh alert on web console](https://hackmd.io/_uploads/ryXAtgKB2.png)

2. We can click on the alert to get to the details view:

![oc rsh alert details on web console](https://hackmd.io/_uploads/rkVuqlYSh.png)

3. We also got notified on Slack:

![oc rsh alert on slack](https://hackmd.io/_uploads/ry499gFB3.png)

##### `oc debug`

1. In the Web Console we will get the alert in the Alerting section:

![oc debug alert on web console](https://hackmd.io/_uploads/SyO25xKS2.png)

2. We can click on the alert to get to the details view:

![oc debug alert details on web console](https://hackmd.io/_uploads/rJT3qlKBh.png)

3. We also got notified on Slack:

![oc debug alert on slack](https://hackmd.io/_uploads/SyBaceKS3.png)

## Closing Thoughts

In conclusion, the integration of OpenShift Auditing and OpenShift Logging offers a powerful solution for organizations seeking improved security and operational visibility in their OpenShift environment. This post covers a basic use case where you analyze logs generated in the same cluster; more advanced deployments may require different approaches.
For example, Edge deployments using SNO (Single Node OpenShift) have constrained CPU budgets; a potential approach there would be to export the logs to an external analysis tool, which would reduce the CPU requirements on the SNO instances.

By incorporating OpenShift Logging on top of the auditing capabilities of OpenShift, audit logs become easily searchable and analyzable, facilitating the detection of patterns, anomalies, and potential security breaches. Additionally, Loki's alerting capabilities enable proactive monitoring and prompt response to emerging threats. This combined solution not only strengthens security and compliance, but also provides valuable insights for troubleshooting, capacity planning, and performance optimization.