---
tags: Best Practice, Policies
---

# Aqua Policy Guidance :green_apple:

**Table of Contents**

[TOC]

## First deploy aqua-management console

> Components.

* Aqua Console is deployed in highly available mode
    * High availability is achieved through orchestration tools, or on standalone Docker with redundant nodes in a cluster configuration for the server and redundant gateways.
* PostgreSQL availability and data persistence are in place.
* Aqua Enforcers are deployed in Audit-only mode for visibility into the operational environment

## Integration with the environment

> External tools

* An external authentication system such as AD/LDAP or SAML is configured to grant users access and map roles to permissions within the Aqua UI.
* SIEM integration is configured to route event information from Aqua Console to existing security event systems; pre-built dashboards (e.g., the Aqua Splunkbase app) are added for additional views if they exist

## Monitoring

> Database

- [ ] Watch resource usage
    - [ ] Make sure resources (CPU, IOPS) don't get pegged at the max limit for extended periods
    - [ ] Watch for a low number of active processes or queries combined with a very high number of wait-state or pending queries
    - [ ] Watch for many long-running processes

> Aqua Console

- [ ] In orchestrated deployments, health monitoring comes out of the box via liveness probes
- [ ] Standalone Docker needs a load balancer:
    - [ ] Health checks are performed by the LB, which does not necessarily report the results anywhere; if the LB cannot report changes in health status, monitor externally
- [ ] External uptime monitors
- [ ] This overlaps with SIEM integration, but at a minimum review administrative events
- [ ] Tune any monitoring alerts to include only relevant data

> Timing

- [ ] Retry requirement?
Depends on the uptime guarantee

## Registries and image scanning

````
• Registries are on-boarded to Aqua Console
• Method of image ingest to Aqua is determined and configured or planned:
    • Web-hook call from registry to Aqua upon new image push
    • Automatic nightly pull of images (with scoping rules to include the most relevant, and possibly cleanup to keep only the most recent images)
    • CI/CD is a method of ingest as well, but not as useful for initial inventory; it may help for registries which do not support web-hooks
    • Aqua Console can ingest new images via the API
• Retention policy is determined (e.g., keep last X tags of an image, or only ingest specific tags). This can work in conjunction with the `Block unregistered images` run-time control to ensure out-of-date images are not allowed to run in the environment when they have already been retired.
• Depending on size of registry and number of new images each day, additional Scanning containers may be deployed for greater throughput
    • Location and configuration may be optimised for environment-specific considerations
````

# Aqua Policies - Getting Started :banana:

## Image Assurance

Image Assurance policies:

- [ ] A default image assurance policy can be configured in a non-blocking mode (images are not marked as disallowed, but are audited).
- [ ] This informs about the current state of the images but should not be the end state; it is used to take an inventory of the current state.
- [ ] The current state of the registry, and the scope of work needed to bring images into compliance with the desired policy, may inform the actual initial security policy.

## There are a few general approaches we see here:

The desired policy is set immediately; this is useful if the organisation is early in the process of adopting containers.

`The idea is to start secure and maintain the state over time`

> The downside is this is potentially disruptive.
If there are a lot of images which violate the image assurance policies, it is a lot of effort to address at once. Development timelines may be impacted and pushback can come from the users.

1. Alternatively, a graduated approach can be taken. First address the most critical concerns in the initial security policy, and increase the strength of controls over time.
   1. This is useful in environments where a lot of images already exist which require remediation and where there is too much work to perform at once to bring them all to the desired level.
   1. Over time, security policy can be strengthened as the most critical issues are addressed, without too much intrusion into developer workflow.
   1. For example, the policy may initially look only at images which run as the root user, but over time expand to include higher severity CVEs, then less severe CVEs, a specific blacklist of packages with high numbers of CVEs, etc. The timeline depends on factors such as how quickly issues are addressed, buy-in to making the change, and the communication and visibility given to teams.
   1. Multiple image assurance policies can be layered for visibility into the end goal; not all policies need to mark an image as 'disallowed', and some may simply audit. This gives visibility so teams continue to progress towards the end goal without having to wait for policy enforcement to act.

After inventory is taken, the organisation may analyse the results to see where they can get the most impact for their effort:

1. If most vulnerabilities result from base images, a secure base image program may be adopted.
   1. Decisions are made, with developer buy-in, about which images will serve as the golden 'base images'.
   1. These are maintained at a more stringent security level, but someone must own them, and an acceptable policy must be determined.
   1. If developers build from these images they spend less time securing images, as they only need to remediate what they themselves add to the image.
1.
   If particular old OS packages and images are a significant detriment to the organisation's security posture, specific versions of common libraries may be blacklisted.
   1. At times you will find a particular package that does not violate a maximum severity policy but contributes quite a lot of lower severity vulnerabilities.
   1. Some criteria might need to be determined for when it is acceptable to blacklist a package (e.g., a fix version is available and has been available for X amount of time).
1. An exception process is created to manage exceptions to the policies.
   1. Vulnerabilities can be acknowledged for specific images, or globally for all images.
   1. Acknowledgements can be done manually or via the API, and include the requester and the justification.
   1. Acknowledgements are logged in Aqua console and sent to SIEM.

## Implement CI / CD Integration for image scanning

- [ ] This can provide visibility directly to developers within their build process
- [ ] CI/CD integration does not depend on the image assurance policy already being in place, but having it in place is useful.
- [ ] Once the image assurance policy is in place, a decision can be made on whether to perform CI/CD scanning in audit-only mode or enforcement mode
- [ ] Audit-only mode will not fail builds for images which violate 'blocking' image assurance policies
    - [ ] A report is created in the build artifacts for the developer to review, but the build proceeds without impediment.
    - [ ] An audit message is created in Aqua console; it can be routed to SIEM, and alerts can be generated from there to stakeholders
    - [ ] Developers are less likely to review builds that pass than builds that fail.
- [ ] Enforcement mode for image assurance can be configured at the CI/CD integration point
    - [ ] Both the image assurance policy in Aqua console and the plugin or scanner parameters must agree that the CI/CD job should fail before the build is failed.
- [ ] When an image violates the policy, the build fails
    - [ ] Developers are forced to review the results and remediate or request an exception before a build with the image will pass
    - [ ] If the build fails prior to the push of the image to the registry, the image never reaches a place from which it could be deployed in the environment
- [ ] Plugins for different CI/CD tools are available for configuration, or scanner-cli can be integrated manually into the build process; the ideal placement for scanning is after the image is built and before it is pushed to the registry
- [ ] The CI/CD can also be used as a notification point to register an image with Aqua Console instead of relying upon a web-hook notification or the nightly image pull from the registry
- [ ] Re-scan of existing images on a periodic schedule can be configured to determine whether any previously unknown vulnerabilities have been detected.

## Runtime Controls and Policies

## Runtime Policy Structure

> The runtime policies should be layered, with a strong Default policy applied to most if not all containers. Additional policies can then be layered on top, targeted at specific container subsets or groupings such as namespaces. This approach allows more granularity to be applied where it is required, while also ensuring that containers have a baseline policy they must adhere to and additional policies for specific use-cases.

## Default Policy

All policy types include a Default that is applied to the global run-time profile. The global policy is meant to provide coverage to most if not all containers and provide a baseline security standard. Out of the box, we include a default policy with 2 controls enabled: volume blacklist and drift prevention.
See below for more details:

## Global Run-Time Profile

- [ ] The default run-time profile requires Aqua Enforcer to be deployed
    - [ ] This should initially be set in audit-only mode while impact is determined
    - [ ] If Aqua Enforcer is in audit mode, or the run-time policy is in audit mode, only audit messages will be generated (with the exception of a few engine-level controls, such as application of specific seccomp profiles, which can only be turned on or off and cannot be audited or enforced separately)
    - [ ] Audit events from the enforcers are reviewed to determine what the impact of the control will be once set to enforcement mode
    - [ ] Any DETECT level run-time audit event would be blocked if both the Aqua Enforcer and the runtime profile were in enforcement mode.
    - [ ] Audit events can be sent to SIEM and on to relevant stakeholders; roles can also allow users to log in to Aqua console directly to see impacts.
- [ ] **Volume Blacklisting** is set by default, to ensure that the specified volumes are not mountable by any container. This control prevents modification of the files via a volume mount. This is particularly an issue if the image is configured to run as root.
- [ ] **Drift prevention** is enabled by default and works in conjunction with the initial image scan to ensure that the container can only run executables that were previously identified. This prevents an image from being marked compliant at scan time and then bypassing that check in the container runtime by downloading malicious code.
- [ ] Image assurance enforcement is enabled
    - [ ] Enforcement of image assurance is typically enabled prior to enabling enforcement via the Aqua Enforcers
    - [ ] Any images that are marked disallowed as a result of violating an Image Assurance policy (which has the control to disallow images enabled) will not be allowed to run on hosts which have Aqua Enforcer in enforcement mode
    - [ ] If CI/CD integration is in place, developers will have insight directly into their builds; if this is enforced, they fail at build rather than deploy time.
    - [ ] Run-time events detected by Aqua Enforcer for disallowed images will be sent to Aqua console and the SIEM integration for alerting
    - [ ] Review of the audit events from Aqua Enforcer in audit mode will show any containers being created from images that are disallowed
        - [ ] If CI/CD integration is in full enforcement and prevents push to the registry, containers from disallowed images should not be seen often in the audit events
        - [ ] Periodic nightly scanning which marks an image disallowed due to previously unknown vulnerabilities may be an exception to this.
- [ ] The 'Block unregistered images' control can point out gaps in the image ingest process; this might include unknown registries or CI/CD processes.
    - [ ] With the Enforcer in audit-only mode these still run but are audited.
    - [ ] These unknown images should be known to the Aqua Console before turning on full enforcement mode for the Aqua Enforcer

## Example Expanded Default Policy

(Audit is the OOTB configuration; once thoroughly tested, this should be enforced at the first opportunity.)

![](https://i.imgur.com/8I2xmrI.png)

## Prepare to move Aqua Enforcer from Audit mode to Enforcement mode

- [x] A careful review of the audit events in the Aqua console is essential to a smooth transition.
- [x] The global run-time policy should not be generating `DETECT` events; these will turn to denials with enforcement mode.
- [ ] Action should be taken for `DETECT` level run-time policy events.
    - [ ] Remediation of the offending behavior (e.g., the team no longer runs the image as root user)
    - [ ] Revision of the policy (if events for particular controls are widespread)
    - [ ] Application of a specific run-time profile, excluding the offending controls, to images for which the behavior is acceptable
- [ ] Containers should no longer be created from images unknown to Aqua.
    - [ ] These will be denied in enforcement mode.
    - [ ] These can indicate an unnoticed gap in the image ingest process or in policy awareness by teams.
- [ ] Containers from images that have been disallowed by image assurance policy should not be running.
    - [ ] This may indicate a gap in the CI/CD integration, where an image is passed to the next phase despite violations.
    - [ ] Ensure that owners of the images are aware of the impact of their images violating the image assurance policies prior to enforcement.
    - [ ] Ensure there is an exception process for any of the image assurance violations, as an outlet to request remediation. Stakeholders should be aware of this process.
- [ ] Review the SIEM integration and ensure that there is a proactive alerting policy which can route audit events to appropriate stakeholders.
    - [ ] You may wish to send alerts from the SIEM directly to stakeholders while still in audit mode and confirm receipt prior to enforcement.
- [ ] The SIEM may not have information about who owns a particular image. Some of this may be determinable, but it depends largely on how images are organised and what metadata is available.
- [ ] Some organisations use separate registries for different teams, or different paths on the same registry to segment teams, in which case the image name can be used as a key to determine who should be notified
- [ ] The Aqua API can be a useful tool here for pulling data such as labels from the images; many organisations require their teams to add labels to their images indicating an owner, and this data can often be pulled from the registry or from Aqua itself.
- [ ] A process to identify ownership of unknown images may be key; reviewing the image's metadata in the Aqua UI, or on the registry, can assist.
- [ ] Ensure that the Enforcer has run in Audit mode with the current policies through at least one entire rotation of the development cycle (build, push, test, deploy, iterations upon this for upgrade, etc.). This can weed out any timing issues if the ingest of the image to the Aqua console is not in line with the deployment cycle. A longer duration in Audit mode is better than a shorter one, at least initially, if smoothness of transition is a priority.
    - [ ] For example, if an image is created and then immediately deployed, but is not added to Aqua until the nightly pull process, this may cause a timing problem.
    - [ ] If the 'block unregistered images' control is enabled, that would generate an Audit message because the image has been deployed before Aqua becomes aware of it. In full enforce mode, this would be a denial and the container from the image would not be allowed to run.
- [ ] Use a testbed environment to turn on enforcement and validate expectations from the image assurance and run-time controls which have been enabled; this will reduce potential surprises.

## Enforcement

> Move CI/CD integration from audit to enforcement mode

> Before enabling enforcement on the Aqua Enforcer, Image Assurance should be in enforcement mode if you plan to utilise controls to block unknown images or disallow images which have violated the image assurance policy.
> While scanning from the registry should already mark the images as disallowed, enforcement should also be set at the CI/CD level before moving the Aqua Enforcer to enforcement, so that developers are never unaware that an image they have just created will not be allowed to run.

It is typically less impactful to deny the image at build time than to deny it at deploy time.

Ensure the advice in the previous sections about CI/CD and Image Assurance has been reviewed prior to setting Aqua Enforcer into enforcement mode.

---

> Move Aqua Enforcer from audit to enforcement mode

Modification of the Aqua Enforcer from Audit to Enforce mode can be performed via the UI or Terraform.

---

> For wider deployment, it may be useful to turn on enforcement in small batches; if some events were overlooked or there is some other unforeseen impact, this will limit the radius of effect

---

> Careful review of the audit events, together with communication with teams, should ensure immediate awareness of any impacts (expected or unexpected).

---

> Enable the Aqua Enforcer's global enforcement mode prior to enabling enforcement of the run-time profile. When no negative impact is seen from this, enable the enforcement mode on the global or default run-time profile.

---

> Be aware of the audit/enforce settings and where to locate them in the various policies; this will allow you to quickly switch a particular policy back from Enforce to Audit mode if the need arises

## Improve security posture over time with stronger runtime controls

> * Enable image-specific run-time profiling; this is a deeper integration which can be hugely valuable.
>     * The audit or enforce mode of a runtime profile can be set separately from the global audit or enforce mode of the Aqua Enforcer. However, the run-time profile will only enforce if both the Aqua Enforcer enforcement mode and the runtime profile enforcement mode are set. If either is in audit mode, the runtime profile will perform in audit mode.
> * There is an automatic run-time profiler which can be used, but it is recommended to use it only in Audit mode initially, as the profile is generated based upon initially observed activity.
> * The ideal approach to enable image-specific run-time profiling is to explicitly enable run-time profiling during an application's TEST phase.
>     * This should capture the expected behavior by running the application through typical operation and use cases.
>     * The application should not need overly exhaustive tests in order to use this; as long as the executables, volume mounts, and other resources in the runtime profile are touched during this test phase, it should be sufficient to generate a highly accurate run-time profile of the application.
>     * An audit-only mode should be enabled here as well initially, until confidence in the process is established.
>     * The process:
>         * Before the test phase starts, a single REST API call to Aqua Console can be made to enable run-time profiling for the image, or the image can be launched with an additional environment variable to enable the profiling.
>         * The image runs through its typical test phase, launching and using its typical resources such as programs, files, volume mounts, etc. These are captured by the runtime profile.
>         * At the end of the test phase, another REST API call is sent to Aqua Console to end the run-time profiling and apply the run-time profile to the image.
>         * If the image needs to run multiple times, the run-time profiling can be performed in an 'append' mode to add to an existing profile by providing the name of a previously generated profile (or the same environment variable as used previously).
>         * Once the run-time profile is generated and applied to the image, run the image and confirm there are no violations by reviewing the audit events generated (or checking them programmatically via the Aqua API).
>         * Once confidence is established, the run-time profile can be switched from Audit to Enforce mode.
> * When adding run-time profiling to new projects, ensure that review of the audit logs is performed over time; there may be events the runtime profile does not cover because they were never exercised in a test.
>     * For example, a binary which is only executed once a week and was not included in the tests may show up in the audit logs as a DETECT level event due to the runtime profile.
>     * If items in the audit events are not added to the run-time profile before it is switched to Enforce mode, these will turn to BLOCK events and be denied once enforcement mode for the runtime profile is enabled.
> * Enhance the image assurance and run-time controls over time.
>     * The layered image assurance policies described above can provide a mechanism for incremental improvement to security.
>     * Review of the audit events from the CI/CD process can help estimate the impact of the current policy.
>     * When the impact of one enforced image assurance policy tapers off, it may be safe to plan moving the next policy layer from audit to enforce mode.
>     * The global or default run-time profile provides good baseline protection, but consider expanding adoption of image-specific run-time profiling.
> * It is best to integrate Secrets after the enforcer has entered enforcement mode.
>     * In audit mode, secrets can be revealed despite a policy violation. However, access will be audited.
>     * Much of the policy here can be automated through the use of Service membership assignment rules and appropriate configuration of the Secret backend with automatic label assignment
>
> Services look at metadata for a deployment, such as a Kubernetes deployment or docker image attribute, to determine which container belongs in a service.

---
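
The audit/enforce rules described in this guide can be easy to misread, so here is a minimal Python sketch of them. It is only a model of the behavior described above, not Aqua's implementation or API: the names `Mode`, `effective_runtime_mode`, `runtime_event_outcome` and `ci_build_fails` are hypothetical, and the `ALLOW` label is an assumption (only `DETECT` and `BLOCK` appear in this guide). It also ignores the engine-level controls, such as seccomp profiles, that can only be turned on or off.

```python
from enum import Enum


class Mode(Enum):
    """Hypothetical stand-in for the audit/enforce setting of an Enforcer or policy."""
    AUDIT = "audit"
    ENFORCE = "enforce"


def effective_runtime_mode(enforcer: Mode, profile: Mode) -> Mode:
    """A run-time profile enforces only when BOTH the Aqua Enforcer and the
    profile are in enforce mode; if either is in audit mode, it audits."""
    if enforcer is Mode.ENFORCE and profile is Mode.ENFORCE:
        return Mode.ENFORCE
    return Mode.AUDIT


def runtime_event_outcome(enforcer: Mode, profile: Mode, violates_policy: bool) -> str:
    """Outcome for a single run-time event under this simplified model.

    A violation is a DETECT event in audit mode and becomes a BLOCK once
    both layers enforce. "ALLOW" is an illustrative label, not an Aqua term.
    """
    if not violates_policy:
        return "ALLOW"
    if effective_runtime_mode(enforcer, profile) is Mode.ENFORCE:
        return "BLOCK"
    return "DETECT"


def ci_build_fails(policy_disallows_image: bool, scanner_set_to_fail: bool) -> bool:
    """The CI/CD job fails only when the image assurance policy disallows the
    image AND the plugin or scanner parameters are set to fail the build."""
    return policy_disallows_image and scanner_set_to_fail


if __name__ == "__main__":
    # Print the four enforcer/profile combinations for a violating event.
    for enforcer in Mode:
        for profile in Mode:
            outcome = runtime_event_outcome(enforcer, profile, violates_policy=True)
            print(f"enforcer={enforcer.value:8} profile={profile.value:8} -> {outcome}")
```

Seen this way, the recommended rollout order (Enforcer first, then the run-time profile) works because the first switch alone leaves every violation as a `DETECT` event, and switching either layer back to audit mode is enough to stop denials.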