The purpose of negative testing is to ensure that Ratify can gracefully handle invalid input or unexpected user behavior. It also ensures Ratify can provide useful error logs for users when handling invalid inputs or unexpected server responses. Negative test cases can also help to improve the overall quality of Ratify.
This document summarizes necessary negative test cases for Ratify.
OS/Arch: Linux
No. | Category | Description | Result |
---|---|---|---|
TC-1 | Registry | Users provide a mismatched registry credential in the Ratify store CRD (authProvider) | pass |
TC-2 | Registry | Registry connection timeout | pass |
TC-3 | Registry | Interact with insecure registry without the right flag (useHTTP) | pass |
TC-4 | Registry | Verify an image signature without read permission to the registry | pass |
TC-5 | Verifier | Trust Policy is not set | pass |
TC-6 | Verifier | registryScope is not set or is misconfigured in the trust policy | pass |
TC-7 | Verifier | Trust policy matches incorrectly and verification fails due to a wrong certificate | pass |
TC-8 | Verifier | The image is signed by an untrusted identity, or trustedIdentities is misconfigured | pass |
TC-9 | Cert Store | Certificate expired | pass |
TC-10 | Cert Store | Certificate revoked | pass |
TC-11 | Cert Store | Ratify can't access AKV | pass |
TC-12 | Cert Store | An invalid certificate is provided in the cert store CRD (inline) | failed |
TC-13 | Cert Store | Certificate format/type is not supported by Ratify | pass |
TC-14 | Policy | policyPath does not exist | pass |
TC-15 | Policy | Syntax problems or misconfiguration in the Rego policy | pass |
TC-16 | Policy | Syntax problems or misconfiguration in the Config policy | failed |
TC-17 | HA | A Kubernetes node hosting Ratify instances crashes | pass |
TC-18 | HA | Redis pod crashes or suffers an outage | pass |
TC-19 | HA | Gatekeeper instance crashes or is disconnected | N/A |
TC-20 | HA | Dapr sidecar is not available | pass |
TC-21 | HA | Dapr is not properly configured on the cluster | pass |
Gather the Ratify logs for each test case.
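For TC-1, a mismatched credential can be introduced through the store CR's authProvider. A minimal sketch, assuming the oras store with the azureWorkloadIdentity auth provider; the clientID is a deliberately wrong placeholder:

```bash
# Hypothetical repro for TC-1: the clientID below does not correspond to an
# identity with pull access to the registry, so auth should fail.
kubectl apply -f - <<EOF
apiVersion: config.ratify.deislabs.io/v1beta1
kind: Store
metadata:
  name: store-oras
spec:
  name: oras
  parameters:
    authProvider:
      name: azureWorkloadIdentity
      clientID: 00000000-0000-0000-0000-000000000000   # identity without pull access
EOF
```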
If registryScope is not set in the chart's ConfigMap, the Ratify deployment fails.
If users apply a CR with a misconfigured registryScope, the image verification fails.
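For reference, a minimal sketch of a notation verifier CR whose registryScopes does not cover the image under test. This assumes Ratify's v1beta1 Verifier CRD shape; the registry, namespace, and cert store names are placeholders:

```bash
# Hypothetical example: the trust policy is scoped to a registry that does not
# match the subject image, so verification cannot succeed.
kubectl apply -f - <<EOF
apiVersion: config.ratify.deislabs.io/v1beta1
kind: Verifier
metadata:
  name: verifier-notation
spec:
  name: notation
  artifactTypes: application/vnd.cncf.notary.signature
  parameters:
    verificationCertStores:
      certs:
        - gatekeeper-system/certstore-akv   # placeholder cert store
    trustPolicyDoc:
      version: "1.0"
      trustPolicies:
        - name: default
          registryScopes:
            - wrongregistry.example.com/other-repo   # does not match the image under test
          signatureVerification:
            level: strict
          trustStores:
            - ca:certs
          trustedIdentities:
            - "*"
EOF
```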
The image verification fails:
```
time=2023-09-22T08:58:54.288013215Z level=error msg=Reconciler error CertificateStore=gatekeeper-system/certstore-akv controller=certificatestore controllerGroup=config.ratify.deislabs.io controllerKind=CertificateStore error=Error fetching certificates in store certstore-akv with azurekeyvault provider, error: failed to get secret objectName:wabbit-network-io, objectVersion:, error: keyvault.BaseClient#GetSecret: Failure responding to request: StatusCode=404 -- Original Error: autorest/azure: Service returned an error. Status=404 Code="SecretNotFound" Message="A secret with (name/id) wabbit-network-io was not found in this key vault. If you recently deleted this secret you may be able to recover it using the correct recovery command. For help resolving this issue, please see https://go.microsoft.com/fwlink/?linkid=2125182" name=certstore-akv namespace=gatekeeper-system reconcileID=28eac175-4f3b-4353-88df-df5bdc3ae6de
```
The inline certificate is invalid and produces an error, but the Ratify logs do not mention the root cause.
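To reproduce TC-12, an inline cert store CR can carry a malformed PEM body. A minimal sketch, assuming Ratify's v1beta1 CertificateStore CRD with the inline provider; the certificate value is an intentionally invalid placeholder:

```bash
# Hypothetical repro for TC-12: the PEM body below is not a real certificate,
# so parsing the inline value should fail.
kubectl apply -f - <<EOF
apiVersion: config.ratify.deislabs.io/v1beta1
kind: CertificateStore
metadata:
  name: certstore-inline
spec:
  provider: inline
  parameters:
    value: |
      -----BEGIN CERTIFICATE-----
      bm90LWEtcmVhbC1jZXJ0aWZpY2F0ZQ==
      -----END CERTIFICATE-----
EOF
```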
The config policy was misconfigured, but the Ratify logs do not point out the root cause.
SSH into one of the nodes that has a Ratify replica scheduled on it and manually reboot the node. Immediately execute pod creation commands.
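A sketch of those steps with kubectl; the label selector, SSH user, and image are placeholders that depend on the chart and cluster in use:

```bash
# Find a node hosting a Ratify replica (label selector may differ by chart version).
NODE=$(kubectl get pods -n gatekeeper-system -l app=ratify \
  -o jsonpath='{.items[0].spec.nodeName}')

# Manually reboot that node (SSH user and hostname resolution are environment-specific).
ssh azureuser@"$NODE" 'sudo reboot' &

# Immediately trigger admission by creating a pod.
kubectl run negtest --image=registry.example.com/app:v1
```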
There is a short window before the node is known to be degraded; during it, the service continues to route requests to the dead pods. Once the node health is marked degraded, requests all appear to route to the healthy instance.
The Redis StatefulSets (master + replica) were drained to 0.
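A sketch of the drain, assuming a Bitnami-style Redis install in the gatekeeper-system namespace; the StatefulSet names are placeholders:

```bash
# Drain both Redis StatefulSets so the cache backend disappears entirely.
kubectl scale statefulset redis-master -n gatekeeper-system --replicas=0
kubectl scale statefulset redis-replicas -n gatekeeper-system --replicas=0
```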
Error from `kubectl run` output:
Ratify Logs:
Audit also begins to fail with a timeout of 4.9 seconds.
The original `mutation exceeded 1.95s` error does not immediately point to the root cause. Scanning through the logs, you eventually see `context deadline exceeded` errors for cache writes, which points to the underlying cache being unavailable.
N/A
Scale the dapr-sidecar-injector deployment to 0, scale the Ratify deployment to 0, then scale the Ratify deployment back up to 2. Now the daprd sidecar containers will not be present on the pods.
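The same sequence expressed with kubectl; namespaces and deployment names assume default Dapr and Ratify installs:

```bash
# With the injector gone, recreated Ratify pods come up without a daprd sidecar.
kubectl scale deploy dapr-sidecar-injector -n dapr-system --replicas=0
kubectl scale deploy ratify -n gatekeeper-system --replicas=0
kubectl scale deploy ratify -n gatekeeper-system --replicas=2
```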
The Ratify pod fails to start if no daprd sidecar is present. This is due to a cache initialization failure, which is currently a blocking error.
Logs:
Force a Dapr installation misconfiguration by not applying the Redis StateStore CR.
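For contrast, a sketch of the Dapr Redis state store Component that this test deliberately omits. The component name, namespace, Redis host, and secret reference are placeholders; Ratify must be configured with the same component name:

```bash
kubectl apply -f - <<EOF
apiVersion: dapr.io/v1alpha1
kind: Component
metadata:
  name: dapr-redis          # placeholder; must match the cache name Ratify expects
  namespace: gatekeeper-system
spec:
  type: state.redis
  version: v1
  metadata:
    - name: redisHost
      value: redis-master.gatekeeper-system.svc.cluster.local:6379
    - name: redisPassword
      secretKeyRef:
        name: redis
        key: redis-password
EOF
```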
Users will not see any errors when applying pods or deployments. Performance will be degraded and may lead to more timeouts.
From the logs, cache errors: