# Notary Security Topics
###### tags: `notary`
[Notary Issue #82](https://github.com/notaryproject/notaryproject/issues/82)
## Timestamp Server
Time Stamping Authorities (TSAs) defined by [RFC3161](https://tools.ietf.org/html/rfc3161) provide a signed timestamp for a signature in order to prove that the signature was generated during the validity period of a certificate.
### Scenarios
With a TSA, a signature can be considered valid even if the signing certificate has expired. This technique is widely used by [Authenticode](https://docs.microsoft.com/windows/win32/seccrypto/time-stamping-authenticode-signatures) with [SignTool](https://docs.microsoft.com/windows/win32/seccrypto/signtool), [NuGet](https://docs.microsoft.com/nuget/create-packages/sign-a-package), [Adobe Acrobat](https://helpx.adobe.com/acrobat/using/certificate-based-signatures.html#add_a_timestamp_to_certificate_based_signatures), and many other industry products.
In the world of artifacts, including container images, the scenarios are:
- Developers sign their artifacts or images with certificates
- The certificates MAY have a configurable expiry time.
- Developers publish their artifacts or images and at a later date stop maintaining them
- Content consumers SHOULD be able to verify signatures until they expire and use the artifacts or images
- Attackers with compromised keys try to sign artifacts or images with timestamps before the key compromise event.
### Pros and Cons
There are many public TSA servers [available](https://gist.github.com/Manouchehri/fd754e402d98430243455713efada710) on the Internet. The advantages of public timestamp servers are obvious:
- Public
- Free
However, those public TSAs also come with disadvantages:
- Require Internet access for signing
- It is not really a disadvantage since public TSAs are online services. Devices in the air-gapped environment SHALL access the Internet for timestamp signing services.
- Out of the control of the signer
- External dependency
- Availability is not assured. No SLA on signing.
- Not all TSAs are available in all regions.
- Some regions may have high latency.
- The certificates of TSAs MAY be revoked at any time without notice.
- [The removal of trust of VeriSign broke .NET 5+ NuGet.](https://github.com/dotnet/announcements/issues/180) Thus Microsoft had to disable the package verification with a new release to unblock customers.
Implications of not using a signed timestamp for a signature:
- In the absence of an additional timestamp signature, the signature is only considered valid until key expiry. This may limit the use of short-lived keys.
- In case of key compromise, it is not possible to revoke only the signatures made after a known time of compromise, because an attacker can create signatures with any signing time (for example, by changing the local clock).
### Requirements and Recommendation
The following requirements and recommendations use TSAs as a method to prolong the validity of the original artifact signature. However, they do not prevent an attacker from signing artifacts and declaring an arbitrary signing date after a key compromise.
1. An artifact signature SHOULD be time stamped with a timestamp signature.
2. When time stamping an artifact signature, the following process SHOULD be applied:
1. The artifact signature SHOULD be hashed and sent to TSA for time stamping.
2. After timestamp signature retrieval, the artifact signature and the timestamp signature SHOULD be signed again by the artifact signer as a countersignature.
3. Clients MUST verify the artifact signature first.
4. If the artifact signature is expired and the timestamp signature with the countersignature exist, clients MAY verify those signatures and use the timestamp to verify the artifact signature again.
5. Timestamp signers MUST have a valid certificate with the `id-kp-timeStamping` purpose
6. Publishers MAY publish a list of TSAs they will use
- The TSA list MUST NOT be defined in the signature
The above is what we have agreed in the discussion meeting.
:::info
:bulb: **Previous** requirements and recommendation before the discussion meeting:
1. An artifact signature SHOULD be time stamped with exactly one timestamp signature
- If a publisher does not time stamp its artifact signature, a warning MUST be prompted.
2. An artifact signature MUST be time stamped without the signing key for the artifact signature.
3. It SHOULD be possible to strip a timestamp signature from an artifact signature.
4. Timestamp signers MUST have a valid certificate with the `id-kp-timeStamping` purpose
5. Clients SHOULD be configurable to different security levels
- **Default Security**
- Clients MUST verify the timestamp signature and then verify the artifact signature
- An artifact signature without a timestamp signature MUST NOT be considered valid
- **Limited Security**
- Clients MUST verify the timestamp signature and then verify the artifact signature if the timestamp signature is present
- Clients SHOULD skip verifying the timestamp signature if a timestamp signature is absent
- A warning MUST be prompted in this security level
6. Publishers MAY publish a list of TSAs
- The TSA list MUST NOT be defined in the signature
Note: #1-3 allow anyone to replace the timestamp signature for an artifact signature within the validity of the artifact signature. They also reduce the impact of the unavailability of TSAs.
:::
## Signature Expiry
Signatures generally expire when the certificate of the signing key expires, which is not always desirable since different signed data stays relevant for different lengths of time.
This section focuses on artifact signatures. Thus:
- The signed data of the signature is the hash of the artifact.
- The signed data of the certificate is the public key / verification key of the publisher.
Generally speaking, artifacts have a shorter lifetime than the public key of the publisher. Furthermore, artifact publishers MAY publish artifacts with different life-cycles. Some artifacts are in preview and have short-term support. Others are generally available (GA) and have long-term support (LTS). Therefore, configurable signature expiry is desirable for individual artifacts.
An added benefit is defense against [freeze attacks](https://doi.org/10.1145/1455770.1455841). Since the signature expires in a short time although the signing key can live longer, verifiers have to obtain the latest signature to verify, which implies that verifiers have access to the latest content.
> **Freeze Attack** Similar to a replay attack, a freeze attack works by providing metadata that is not current. However, in a freeze attack, the attacker freezes the information a client sees at the current point in time to prevent the client from seeing updates, rather than providing the client older versions than the client has already seen. As with replay attacks, the attacker’s goal is ultimately to compromise a client who has vulnerable versions of packages installed. A freeze attack may be used to prevent updates in addition to having an installed package be out of date.
### Scenarios
With explicit signature expiry in addition to the key expiry, publishers are able to control the life-cycle of various published artifacts at a fine-grained level.
Artifacts can be categorized into
- Nightly built artifacts
- Preview artifacts
- Instances are `alpine/helm:3.5.0-rc.1`, `python:3.10-rc` ...
- Current artifacts
- Instances are `ubuntu:20.10`, `ubuntu:21.04` ...
- Artifacts with long-term support (LTS)
- Instances are `ubuntu:20.04`, `mcr.microsoft.com/dotnet/sdk:3.1` ...
If the artifact signature expires, it implies that the artifact is no longer supported by the publisher. On reaching the signature expiry, artifact consumers have the following options:
- Find artifacts with valid signatures
- Update to a newer version
- Fall back to an older version
- This is possible: `ubuntu:19.10` reached the end of life on July 17, 2020, but users can still fall back to `ubuntu:18.04` since it is supported till April 2028.
- Ask the publisher for a new signature with extended expiry
- Stop using artifacts and optionally find alternatives
- Re-sign the artifact using the consumer's own keys with caution
- The resulting signature may not be trusted by others
- Suppress the signature verification (insecure)
Here is an example of how signature expiry helps both publishers and deployers maintain a healthy ecosystem.
1. A publisher wants to release an artifact `net-monitor:v1`. Thus it publishes a release candidate `net-monitor:v1-rc` on `2021-04-02T08:00:00Z` with signature expiry on `2021-05-01T00:00:00Z`.
2. A deployer picks up `net-monitor:v1-rc` on `2021-04-10T12:34:56Z`, and verifies its signature regularly. Since the current time `2021-04-10T12:34:56Z` is before the signature expiry `2021-05-01T00:00:00Z`, the signature is valid and the artifact is accepted.
3. The publisher later fixes bugs in `net-monitor:v1-rc`, including security vulnerability fixes, and releases `net-monitor:v1` on `2021-04-15T08:00:00Z` with signature expiry on `2022-05-01T00:00:00Z`.
4. After 1 month, the deployer continues to deploy `net-monitor:v1-rc` on `2021-05-10T12:34:56Z`. Since the current time `2021-05-10T12:34:56Z` is after the signature expiry `2021-05-01T00:00:00Z`, the signature is invalid. Thus the deployment fails and an alert is sent to the deployer admin.
5. The deployer admin updates the artifact version from `v1-rc` to `v1` either manually or automatically.
6. The deployer picks up `net-monitor:v1` on `2021-05-11T12:34:56Z`, and verifies its signature regularly. Since the current time `2021-05-11T12:34:56Z` is before the signature expiry `2022-05-01T00:00:00Z`, the signature is valid and the artifact is accepted.
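The expiry comparisons in steps 2, 4, and 6 reduce to a single time check. The sketch below replays them with the dates from the example. The function names `signature_is_current` and `parse` are hypothetical, and the cryptographic verification of the signature is assumed to have succeeded already.

```python
from datetime import datetime


def signature_is_current(now: datetime, signature_expiry: datetime) -> bool:
    """Return True while the current time is before the signature expiry."""
    return now < signature_expiry


def parse(ts: str) -> datetime:
    # The example uses UTC timestamps with a trailing "Z".
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))


rc_expiry = parse("2021-05-01T00:00:00Z")  # expiry of net-monitor:v1-rc
v1_expiry = parse("2022-05-01T00:00:00Z")  # expiry of net-monitor:v1

print(signature_is_current(parse("2021-04-10T12:34:56Z"), rc_expiry))  # step 2: True
print(signature_is_current(parse("2021-05-10T12:34:56Z"), rc_expiry))  # step 4: False
print(signature_is_current(parse("2021-05-11T12:34:56Z"), v1_expiry))  # step 6: True
```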
Without signature expiry, the deployer may stick with `net-monitor:v1-rc` till the signing key expires, which could be several years later.
1. A publisher wants to release an artifact `net-monitor:v1`. Thus it publishes a release candidate `net-monitor:v1-rc` on `2021-04-02T08:00:00Z`.
2. A deployer picks up `net-monitor:v1-rc` on `2021-04-10T12:34:56Z`, verifies its signature regularly, and accepts the artifact on its validity.
3. The publisher later fixes bugs in `net-monitor:v1-rc`, including security vulnerability fixes, and releases `net-monitor:v1` on `2021-04-15T08:00:00Z`.
4. After 1 month, the deployer continues to deploy `net-monitor:v1-rc` on `2021-05-10T12:34:56Z`. Since the signature for `net-monitor:v1-rc` is still valid, the deployer proceeds without alerts.
At this point, the deployer keeps using a version without the bug fixes. The publisher may not be aware that `net-monitor:v1-rc` is still in use even after several years. Since using the release candidate `v1-rc` for a long time is harmful to the software ecosystem, all players enter a no-win situation.
### Pros and Cons
Pros:
- Signatures can have a shorter expiry time than the signing key
- The life-cycle of the artifacts is controlled by the publishers at a fine-grained level
- Freeze attacks are defended against
- Artifact support timeline is indicated by the signature expiry
- The deployers can choose the most suitable version in advance
- Healthier software ecosystem
- Publishers can safely retire older artifact releases
Cons:
- Publishers have to sign the artifact more often
### Requirements and Recommendation
The artifact signature SHOULD have a signature expiry. If the signature expiry is not present, it is implied by the key expiry.
An example of the signature expiry for different artifact categories could be:
| Artifact Category | Recommended Expiry |
| ----------------- | ------------------ |
| Nightly built | 1 week |
| Preview | 1 - 3 months |
| Current | 9 - 12 months |
| LTS | 3 years |
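
As a rough illustration of how a signing tool could apply the table above, the sketch below maps each category to the upper bound of its recommended range. The dictionary, the function name, and the approximation of months and years as 30 and 365 days are assumptions made for this example only.

```python
from datetime import datetime, timedelta, timezone

# Upper bounds taken from the table above. Months and years are
# approximated as 30 and 365 days, which is an assumption of this sketch.
RECOMMENDED_MAX_LIFETIME = {
    "nightly": timedelta(weeks=1),
    "preview": timedelta(days=3 * 30),
    "current": timedelta(days=12 * 30),
    "lts": timedelta(days=3 * 365),
}


def signature_expiry(category: str, signed_at: datetime) -> datetime:
    """Compute a signature expiry for an artifact category (hypothetical helper)."""
    return signed_at + RECOMMENDED_MAX_LIFETIME[category]


signed_at = datetime(2021, 4, 2, 8, 0, tzinfo=timezone.utc)
print(signature_expiry("nightly", signed_at).isoformat())
# 2021-04-09T08:00:00+00:00
```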
## Key Expiry
In public key cryptography, a key pair consisting of a private key (a.k.a. *signing key*) and a public key (a.k.a. *verification key*) is generated, and later the public key is certified by a root. When certified, the public key is embedded in the certificate.
In common scenarios, the certificate validity period is the upper bound of the signature expiry. With the help of a TSA, the signature validity period can be extended. Meanwhile, the signature SHOULD be timestamped by a TSA so that a signer cannot sign artifacts with an expired certificate.
Moreover, a new key pair SHOULD be generated before generating a new certificate in the real world, and thus the key expiry is considered to be equivalent to certificate expiry. Using static keys is discouraged since their cryptographic guarantees weaken over time.
### Approaches
Developers can have keys with different expiry times.
An expiry time of 1 year or more is generally considered **long term**, and an expiry time of 1 day or less is generally considered **short term**. Other expiry times fall into long term or short term depending on the scenario.
In some scenarios, the developers can use self-signed certificates or even deploy using the public key directly without a third party. When the public key is used directly, it has no expiry and is valid indefinitely.
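
The thresholds above can be written down as a small classifier. This is only a sketch: the function name and the label for the middle range are hypothetical, and a year is approximated as 365 days.

```python
from datetime import timedelta


def classify_key_lifetime(lifetime: timedelta) -> str:
    """Classify a key lifetime using the thresholds described above."""
    if lifetime >= timedelta(days=365):  # 1 year or more
        return "long term"
    if lifetime <= timedelta(days=1):    # 1 day or less
        return "short term"
    # Anything in between depends on the scenario.
    return "scenario dependent"


print(classify_key_lifetime(timedelta(minutes=10)))  # short term
print(classify_key_lifetime(timedelta(days=730)))    # long term
```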
#### Long Term
Generally speaking, a longer expiry leaves the private key exposed for a longer time, which implies a larger blast radius. In other words, more artifacts are impacted when a key with a longer lifetime is rescinded.
Since the key has long term expiry and usually sits higher in the key hierarchy, the key SHOULD be secured in [HSMs](https://en.wikipedia.org/wiki/Hardware_security_module).
#### Short Term
Since the key has a short expiry, in days or even in minutes, the signer has to generate key pairs and get the public key certified by a root more often.
Overall, the threat model shifts from the actual developer to the parent signing key and the TSA. In addition, the parent signing key is exposed longer due to the large number of certification requests, and thus the risk of its compromise is increased.
Nevertheless, developers benefit from the short term expiry:
- Lower risks of developer signing key compromise due to narrower attack surfaces
- Less key management efforts from the developer side
### Recommendation
Based on the scenarios, it is recommended to:
- Use long term keys for organizations
- Use short term keys for individual developers
Community developers can choose a common trusted public CA to authenticate themselves and certify their short-term ephemeral keys.
## Example E2E Scenario
Here is an E2E example for CI with the combination of signature expiry, short-lived signing keys, and time stamp authorities (TSA).
### Signing
1. A publisher generates a public / private key pair for artifact signing.
2. The publisher authenticates itself to the root and gets the public key certified by the root with short expiry.
3. The publisher hands over the private key and the associated certificate to the CI machine.
4. CI builds an artifact and signs it using the private key with expiry.
- The associated certificate is stored alongside the signature or embedded in the signature.
5. CI sends the hash of the artifact signature to a TSA for timestamping.
6. CI verifies the received timestamp signature.
7. CI combines the artifact signature and the timestamp signature as a signature bundle.
8. CI signs the signature bundle using the private key.
9. CI pushes the artifact and all 3 signatures to the registry.
- Although there are 3 signatures, they can be combined into 1 signature file.
10. If there are more artifacts to release, go to step 4.
11. If the short-lived certificate has expired, start the whole process over.
- Depending on the scenario, the CI can choose to dispose of the signing key after signing a certain number of artifacts within its validity period.
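
The data flow of steps 4 to 9 can be sketched as follows. This is not how a real implementation would sign: HMAC-SHA256 stands in for both the publisher's asymmetric signature and the RFC 3161 timestamp token so that the example runs with the standard library only. All function names, field names, and the JSON layout are hypothetical, and step 6 (verifying the received timestamp) is omitted.

```python
import hashlib
import hmac
import json


def sign(key: bytes, payload: bytes) -> bytes:
    # ASSUMPTION: HMAC is a stand-in for a real asymmetric signature.
    return hmac.new(key, payload, hashlib.sha256).digest()


def fake_tsa_timestamp(tsa_key: bytes, imprint: bytes, time: str) -> dict:
    """Hypothetical TSA: binds a hash (the message imprint) to a time."""
    token = sign(tsa_key, imprint + time.encode())
    return {"time": time, "imprint": imprint.hex(), "token": token.hex()}


def sign_artifact(signing_key: bytes, tsa_key: bytes, artifact: bytes,
                  expiry: str, tsa_time: str) -> dict:
    digest = hashlib.sha256(artifact).hexdigest()
    # Step 4: sign the artifact digest together with the signature expiry.
    payload = json.dumps({"digest": digest, "expiry": expiry},
                         sort_keys=True).encode()
    artifact_sig = sign(signing_key, payload)
    # Step 5: only the hash of the artifact signature is sent to the TSA.
    imprint = hashlib.sha256(artifact_sig).digest()
    timestamp = fake_tsa_timestamp(tsa_key, imprint, tsa_time)
    # Steps 7-8: bundle both signatures and countersign the bundle.
    bundle = json.dumps({"artifact_sig": artifact_sig.hex(),
                         "timestamp": timestamp}, sort_keys=True).encode()
    bundle_sig = sign(signing_key, bundle)
    # Step 9: the three signatures can travel as one signature file.
    return {"payload": payload.decode(),
            "artifact_sig": artifact_sig.hex(),
            "timestamp": timestamp,
            "bundle_sig": bundle_sig.hex()}


result = sign_artifact(b"publisher-key", b"tsa-key", b"artifact bytes",
                       "2021-05-01T00:00:00Z", "2021-04-02T08:00:00Z")
print(sorted(result))
# ['artifact_sig', 'bundle_sig', 'payload', 'timestamp']
```

Because the bundle signature covers both the artifact signature and the timestamp, the timestamp cannot be swapped without access to the signing key.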
### Configure Trusted roots and Trust Policy
1. The deployer configures the trusted roots, i.e., the public keys of trusted publishers
2. The deployer configures the trust policy
- Enables signature verification
- Indicates if signature expiry should be validated or not
### Verification
1. A deployer queries the registry for the existence of the target artifact by digest.
- If the digest is unknown, the deployer asks the registry to resolve the digest by a tag.
2. The deployer queries the registry for all signatures associated with the target artifact.
3. The deployer fetches and verifies the artifact signature against the artifact digest.
4. If the artifact signature is valid, go to step 10.
5. If the artifact signature is invalid because of signature expiry or certificate expiry, go to step 6. Otherwise, reject the artifact.
6. If the timestamp signature and bundle signature do not exist, reject the artifact.
7. Verify the bundle signature. Reject the artifact if invalid.
8. Verify the timestamp signature. Reject the artifact if invalid.
9. Verify the signature expiry and certificate expiry of the artifact signature against the time stamp in the timestamp signature. Reject the artifact if the signature or the certificate had already expired at that time.
10. Pull the artifact by digest.
11. Verify the artifact against the artifact digest.
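
The decision logic of steps 3 to 9 can be sketched as a single function. The `SignatureInfo` type and its fields are hypothetical: they hold the outcomes of cryptographic checks that are assumed to be performed elsewhere, so only the branching is shown.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class SignatureInfo:
    """Results of cryptographic checks, assumed to be computed elsewhere."""
    artifact_sig_ok: bool                    # matches the artifact digest
    signature_expiry: datetime
    certificate_expiry: datetime
    bundle_sig_ok: Optional[bool] = None     # None: no bundle signature
    timestamp_sig_ok: Optional[bool] = None  # None: no timestamp signature
    timestamp: Optional[datetime] = None     # time asserted by the TSA


def decide(info: SignatureInfo, now: datetime) -> str:
    # Steps 3-5: any failure other than expiry is rejected outright.
    if not info.artifact_sig_ok:
        return "reject: invalid artifact signature"
    deadline = min(info.signature_expiry, info.certificate_expiry)
    if now < deadline:
        return "accept"
    # Step 6: an expired signature needs both extra signatures.
    if (info.bundle_sig_ok is None or info.timestamp_sig_ok is None
            or info.timestamp is None):
        return "reject: expired without timestamp"
    # Step 7
    if not info.bundle_sig_ok:
        return "reject: invalid bundle signature"
    # Step 8
    if not info.timestamp_sig_ok:
        return "reject: invalid timestamp signature"
    # Step 9: re-check both expiries against the TSA time.
    if info.timestamp >= deadline:
        return "reject: expired at timestamp"
    return "accept"
```

An `accept` result corresponds to continuing with step 10, pulling the artifact by digest.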
### Security Assertion
#### What happens if timestamp root is compromised?
A compromised timestamp root can lead to the following:
- **Denial of Service**: No artifacts can be signed if its signing process requires TSA, especially for short-lived keys.
- **Certificate Revocation**: Trust in a compromised timestamp root will be removed. When that occurs, all previous time stamped signatures are invalidated. Deployers may refuse to pull and deploy previous artifacts.
- **Signing with arbitrary time stamp**: This behavior by itself does no harm. However, if the adversary also compromises any signing key, the adversary is able to sign arbitrary artifacts and output valid signatures.
# Rescinding Signatures
## Context
Content addressable registries allow Deployers to pull an artifact based on artifact digest or a tag (which references a digest). With signed artifacts, Deployers can specify the artifact publishers they trust using trust policies.
**Why is Revocation required?**
Signed artifacts attest to the integrity (artifact was unmodified after it was published) and authenticity (artifact was published by the party claiming to be the Publisher), but make no guarantees about the quality of the content in the artifacts themselves. There can be multiple conditions in which a Publisher no longer considers the artifact fit for consumption. Though a Deployer can have their own mechanism (e.g. image scanner) to check artifacts, the Publisher is in the best position to make assertions about the artifacts they distribute.
When a Publisher no longer considers an artifact as trusted, a Publisher could make the artifact unavailable by deleting it and the tag associated with it.
1. A Deployer would no longer be able to build new software using the artifact.
2. If a Deployer already pulled and used the artifact in the past, future pulls will fail, and cause an outage for the Deployer.
3. If a Deployer pulled the artifact, pushed it to another repository, and used the artifact from the downstream repository, they would be unaware of the revocation event. This may mean the Deployer continues to operate with a degraded security posture.
Scenarios 2 and 3 are not ideal.
Another option would be deleting the signature associated with the image. This addresses Scenario 2, but not scenario 3.
**How do Deployers use this information?**
Using trust policies, Deployers can specify whether they want execution of the artifact to be allowed or blocked when an artifact is revoked. A common way to react to revocation would be to assess the impact by evaluating the details of the revocation, and to roll forward to a new version that mitigates it. Deployers can also block artifact execution on revocation, but this will cause an outage. It's preferable for Deployers to monitor, alarm and evaluate revocations, rather than block execution.
## Terminology
- **Node** - The Deployer owned infrastructure (physical or virtual machine) on which the artifact is pulled and executed.
- **Orchestrator** - The Deployer owned software that manages aspects like deployment, container placement, and scaling of a node cluster. E.g. Kubernetes.
- **Node Agent** - A daemon on the Node that Orchestrator communicates with to perform actions on the Node (manage container lifecycle, health checks etc.) e.g. Kubernetes kubelet, ECS Agent.
- **Trust Store** - List of certificates/public keys trusted by the Deployer. A trust store is used for signature validation.
- **Trust Policy** - Configuration that controls aspects of signature validation e.g. enable/disable signature validation, require timestamp signature, etc.
- **Air Gapped Network** - A network which is completely isolated from the public internet. Customers operating services in an air gapped network can only rely on infrastructure and services available within this network. For container based services, it is assumed that the Deployer only has access to the registry and other services local to the air gapped network, and the registry operator does not have access to the public internet. There are two ways in which customers build software that runs in an air gapped network.
- The artifacts are authored, signed, distributed and consumed within the air gapped network.
- The artifacts are authored and signed outside the air gapped network then "imported" and distributed inside the air gapped network for consumption through a one way data transfer.
- **Constrained Network** - The infrastructure which consumes artifacts (orchestrators and nodes) may not have access to an external network (internet or air gapped network) based on the network configuration defined by the Deployer. Again we assume that the infrastructure has access to the registry from which it pulls artifacts (e.g. a Kubernetes cluster where the control plane and nodes have no access to the public internet).
## Scenarios
1. A Publisher discovers a vulnerability in a previously signed artifact and revokes the artifact. Whether the revoked artifact is subsequently used by a Deployer is a policy decision for the Deployer.
2. A Publisher's signing key was compromised, and the Publisher wants to indicate that any signed artifacts associated with the signing key are revoked.
3. Cross registry usage - a Deployer pulls an artifact from a public repository, validates it and pushes it to an internal repository and uses the artifact. Subsequently the Publisher revokes the artifact, and the Deployer discovers that the artifact was revoked.
4. Air gapped network - An artifact authored outside the air gapped network is revoked by the Publisher, and the Deployer discovers that the artifact was revoked.
## Requirements
- A Publisher MUST have a mechanism to indicate that a signed artifact is no longer trusted.
- A Publisher MUST have a mechanism to indicate that multiple signed artifacts associated with a signing key are no longer trusted. For blast radius control, a Publisher can OPTIONALLY indicate the time after which the signing key should be considered untrusted.
- The Deployer MUST be able to enable revocation checks and failure mode (error, warn) when revocation checks fail.
- The Deployer SHOULD have mechanisms to discover this information from an air gapped or constrained network.
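
The third requirement (enabling revocation checks and choosing a failure mode) could be modeled as shown below. This is a minimal sketch: the enum, the function name, and the return shape are hypothetical and not part of any Notary specification.

```python
from enum import Enum
from typing import List, Tuple


class FailureMode(Enum):
    ERROR = "error"  # block the artifact
    WARN = "warn"    # allow the artifact but surface a warning


def handle_revocation_check(enabled: bool, check_failed: bool,
                            mode: FailureMode) -> Tuple[bool, List[str]]:
    """Return (allow, warnings) for one artifact (hypothetical helper)."""
    if not enabled or not check_failed:
        return True, []
    message = "revocation check failed"
    if mode is FailureMode.ERROR:
        return False, [message]
    return True, [message]


print(handle_revocation_check(True, True, FailureMode.WARN))
# (True, ['revocation check failed'])
```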
## Comparison
### Allowlist in Repository
- Allowlists explicitly list the trusted artifacts. This can be a list of digests that the publisher considers trusted for consumption.
- The allowlist can be signed, providing protection against a compromised registry.
**Pros**
- Anything not in the list is implicitly untrusted, no separate revocation mechanism is required. A Publisher removes a digest from the list to indicate that it's no longer trusted.
- If signed allowlists are used, the artifacts themselves may not need to be signed. The downside of this approach is that it does not allow movement of specific artifacts to another registry without moving the complete signed allowlist along with it.
**Cons**
- The allowlist needs to be updated for every artifact being pushed to the repository.
- In scenarios where artifacts move across multiple registries, when a Publisher revokes an artifact in the source repository, downstream repositories are not automatically updated without additional sync mechanism to propagate the allowlist.
- May need to maintain a large allowlist which may be an overhead to distribute.
### Denylist in Repository
- Denylists explicitly list artifacts which are untrusted.
- The denylist can be signed, providing protection against a compromised registry.
**Pros**
- Updates to denylist are less frequent, unlike updates to allowlist for every update of an artifact
**Cons**
- In scenarios where artifacts move across multiple registries, when a Publisher revokes an artifact in the source repository, downstream repositories are not automatically updated without additional sync mechanism to propagate the denylist.
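
The difference between the two list types shows up for a digest that appears in neither list. A minimal sketch, with placeholder digests and hypothetical function names; verifying the signature on the list itself is out of scope here:

```python
from typing import AbstractSet


def trusted_by_allowlist(digest: str, allowlist: AbstractSet[str]) -> bool:
    # Anything not listed is implicitly untrusted.
    return digest in allowlist


def trusted_by_denylist(digest: str, denylist: AbstractSet[str]) -> bool:
    # Anything not listed is implicitly trusted.
    return digest not in denylist


# Placeholder digests for illustration only.
listed = "sha256:" + "a" * 64
unlisted = "sha256:" + "b" * 64

print(trusted_by_allowlist(unlisted, {listed}))  # False
print(trusted_by_denylist(unlisted, {listed}))   # True
```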
### CRL and OCSP
These are related mechanisms used by PKI to provide revocation information for certificates issued by a CA.
#### CRL
- A Certificate Revocation List (CRL) is a signed deny list which contains the list of revoked certificates. The CRL is signed by the CA that issued the certificate.
- A CRL is published by the issuing CA and periodically refreshed. For certificates issued by public CAs, CRLs that contain signing certificates (end-entity certificates) are refreshed once every 7 days by the issuing CA, and CRLs that contain Subordinate CA certificates at least once every 12 months or within 24 hours of a revocation event. Clients that perform signature validation cache CRLs and refresh them before they expire.
- As part of signature validation, clients validate the revocation status of each certificate within the certificate chain of a signature.
- For each revoked certificate, the CRL contains the serial number, revocation date and reason.
- CRLs only support certificate revocation, not single artifact/signature revocation.
- For code signing certificates, as timestamping can be used to extend signature expiry, revoked certificates are maintained in the CRL for a longer time (minimum 10 years post expiry of certificate).
- A certificate contains the CRL endpoint where its status can be checked, so no additional mechanism is required to discover the CRL endpoint.
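
The per-certificate check described above can be sketched as follows. Two simplifications are assumptions of this example and not statements from the text: all CRLs are merged into one dictionary keyed by serial number, and a signature whose trusted signing time precedes the revocation date is treated as unaffected. The type and function names are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Dict, List, Optional


@dataclass(frozen=True)
class CrlEntry:
    serial_number: int
    revocation_date: datetime
    reason: str


def first_revoked(chain_serials: List[int], crl: Dict[int, CrlEntry],
                  signing_time: Optional[datetime] = None) -> Optional[CrlEntry]:
    """Return the first CRL entry that affects the chain, or None."""
    for serial in chain_serials:
        entry = crl.get(serial)
        if entry is None:
            continue
        # ASSUMPTION: a trusted signing time before the revocation date
        # means the signature predates the revocation.
        if signing_time is not None and signing_time < entry.revocation_date:
            continue
        return entry
    return None


crl = {42: CrlEntry(42, datetime(2021, 6, 1, tzinfo=timezone.utc), "keyCompromise")}
print(first_revoked([7, 42], crl).reason)  # keyCompromise
print(first_revoked([7, 42], crl, datetime(2021, 5, 1, tzinfo=timezone.utc)))  # None
```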
#### OCSP
- Online Certificate Status Protocol (OCSP) allows querying the revocation status of a certificate on an as-needed basis rather than downloading and refreshing a CRL periodically.
- It avoids some of the downsides associated with CRLs, such as downloading and parsing large CRL files. Typical OCSP responses can be ~2.5 KB whereas large CRLs can be tens of MBs.
**Pros**
- Relies on centralized mechanism to get revocation status of a certificate. Revocation status can be fetched even when artifacts move across multiple registries.
- Relies on existing mechanisms that are used by Publishers that code sign software they publicly distribute.
**Cons**
- CRL for code signing certificates can get large over time as revoked code signing certificates are not pruned at their expiry. Systems consuming the CRL need to download and parse large files which can add latency to signature validation. CRL sharding is used by public CAs to address this downside, where multiple CRL URLs are used and maximum size of individual CRL is controlled.
- CRL and related mechanisms do not support artifact level revocation.
- Requires Publishers to use certificates issued through a CA (e.g. a public CA) that maintains publicly accessible CRL/OCSP endpoints. If an internal organization's CA is used, the organization will require the CRL/OCSP endpoints to be public, and highly available, which can add operational burden.
- CRL and OCSP use HTTP endpoints and can be vulnerable to man-in-the-middle attacks in which the attacker can cause DoS or replay old responses.
- OCSP checks cause privacy concerns for some users (browsing behavior can be disclosed)
- CRL and OCSP endpoints for the issuing CA may not be accessible from an air gapped network or constrained network.
## Recommendations
### Use CRL/OCSP
For cross registry scenarios, allowlists and denylists stored in a registry require either the registry operator or the Deployer to periodically refresh the lists from the upstream registry.
- If registry operators implement the sync,
- It relies on features which are not in OCI spec (repository level metadata)
- It introduces additional overhead and not all registry operators may implement this feature.
- If Deployer implements the sync
- Customers need to periodically pull metadata from upstream repository and push to downstream repositories
- There is no current trigger for this workflow; customers generally pull an image from the upstream registry once (snapshot) and do not have a need to sync afterwards.
- Customers may implement this process inconsistently or skip it altogether as it's an additional process to maintain.
In comparison, CRL/OCSP rely on centralized endpoints and do not need an additional sync mechanism. CRL/OCSP endpoints are discoverable as they are present in certificate metadata.
### CRL/OCSP based revocation check in Notary V2 tooling
- Notary V2 tooling performs signature validation based on the local trust store and trust policy. External tooling (e.g. a Node agent) can set up the trust store and policy before a signature validation occurs. Deployers can enable/disable revocation checks through the trust policy.
- When revocation check is enabled the validation component will fetch CRLs or make OCSP requests as part of signature validation.
- Deployers can specify alternate CRL endpoints in the trust policy.
- If unexpired, cached CRLs and OCSP responses are available locally, the validation component will use the cached version.
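
The caching and alternate endpoint behavior in the last two bullets can be sketched as follows. The `Crl` type, the `load_crl` function, and the injected `fetch` callable are hypothetical; `next_update` mirrors the CRL field of the same purpose, and signature verification of the fetched CRL is assumed to happen inside `fetch`.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Callable, Dict, FrozenSet, Optional


@dataclass(frozen=True)
class Crl:
    next_update: datetime            # the CRL is stale after this time
    revoked_serials: FrozenSet[int]


def load_crl(endpoint: str, cache: Dict[str, Crl],
             fetch: Callable[[str], Crl], now: datetime,
             alternates: Optional[Dict[str, str]] = None) -> Crl:
    """Return a CRL, preferring an unexpired cached copy."""
    # The trust policy may map a CRL endpoint to an alternate one.
    url = (alternates or {}).get(endpoint, endpoint)
    cached = cache.get(url)
    if cached is not None and now < cached.next_update:
        return cached
    fresh = fetch(url)  # network access is injected so the sketch stays testable
    cache[url] = fresh
    return fresh
```

Injecting `fetch` keeps the sketch independent of any HTTP library; an air gapped deployment would point `alternates` at an internal mirror.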
### Artifact level revocation
- A new spec for Artifact Revocation List (ARL) and Online Artifact Status Protocol (OASP) to support artifact revocation must be provided. This spec will be derived from existing standards for CRL/OCSP but is not a standard itself. It should support simple implementations (e.g. a publicly hosted endpoint with a signed ARL).
- In content addressable storage like registries, the digest of the artifact (e.g. image manifest, blob) is its unique identifier. The same artifact may have multiple signatures associated with it. ARL/OASP must use the artifact digest when referring to an artifact instead of its signature.
- Publishers that want to support artifact level revocation must host an endpoint that supports the spec.
- ARL/OASP must be signed (similar to CRL) by a party that is included in the trust store, have an expiry, and need to be periodically re-signed.
- Publishers must include the ARL/OASP endpoint in signed artifacts, so that signature validation can discover these endpoints.
- The trust policy will contain additional options to enable/disable artifact level revocation checks.
- Cloud service providers can implement these specs as an added service for customers.
### Revocation checks on local container
- Developers must be able to configure revocation checks on local containers by configuring local trust policy
### Revocation check at deployment and runtime
- Deployers must be able to configure the trust store (trusted certificates/public keys) and trust policies in artifact execution services supporting them (e.g. orchestration frameworks like Kubernetes, AKS, ECS, EKS)
- Customers must be able to enable revocation checks in supported execution services
- E.g. a Kubernetes admission controller can perform revocation checks for deployments to the cluster, or a Kubernetes component can perform periodic revocation checks.
### Revocation checks for air gapped environment/constrained networks
- Deployers must be able to provide alternate CRL endpoints through the trust policy. CRLs are signed, and specifying alternate endpoints only presents a risk when a root is compromised. Overriding a root CRL endpoint is not recommended.
- Deployers must set up a mechanism by which CRLs are aggregated and replicated within an air gapped environment/constrained network to an accessible endpoint periodically, within the expiry period for CRLs.
## Additional Considerations
### Revocation checks at orchestrator vs node
- Deployers should rely on execution service/orchestrator level revocation checks instead of Node level revocation check at runtime
- Revocation checks include calls to multiple external endpoints which may not be available (CRL/OCSP outage) and cause revocation checks to fail.
- Revocation checks may rely on public endpoints that may not be accessible from Nodes.
- Revocation checks at each artifact run may add latency, as Nodes are ephemeral and CRL/OCSP responses may need to be fetched more frequently. A revocation check for a single signature can involve multiple CRL/OCSP calls, as the check is performed on each certificate in the signature's certificate chain.
- Revocation checks at each run may be overkill if the Deployer only wants warnings/notifications, rather than having the artifact run fail on revocation check failures.
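
The considerations above can be sketched as a per-chain check with an enforcement switch. All names here (`Status`, `check_chain`, `admit`, and the `lookup` callable standing in for a CRL/OCSP client) are hypothetical; the sketch only shows that one signature leads to one lookup per certificate, and that an unreachable endpoint can either block the run or only produce a warning.

```python
from enum import Enum
from typing import Callable, Iterable, List, Tuple


class Status(Enum):
    GOOD = "good"
    REVOKED = "revoked"
    UNKNOWN = "unknown"  # endpoint unreachable, so no answer was obtained


def check_chain(chain: Iterable[str],
                lookup: Callable[[str], Status]) -> List[Tuple[str, Status]]:
    """Run one revocation lookup per certificate in the chain.

    `lookup` is a stand-in for a CRL/OCSP client; network failures
    (OSError and its subclasses) are recorded as UNKNOWN.
    """
    results = []
    for cert_id in chain:
        try:
            status = lookup(cert_id)
        except OSError:
            status = Status.UNKNOWN
        results.append((cert_id, status))
    return results


def admit(results: List[Tuple[str, Status]], enforce: bool) -> bool:
    """Decide whether the artifact may run.

    With enforce=True any non-GOOD result blocks the run; with
    enforce=False the caller is expected to log or notify instead.
    """
    clean = all(status is Status.GOOD for _, status in results)
    return clean or not enforce
```

Running this once at the orchestrator, instead of on every Node at every artifact run, avoids repeating the per-certificate lookups each time an ephemeral Node starts.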
### Air Gapped/Constrained Networks
- Deployers can re-sign artifacts with a PKI managed entirely within an air-gapped network.
- Vendors managing constrained networks can provide CRL aggregation and replication capabilities.
### Transparency log for Revocation
Transparency logs used only for revocation of public artifacts could be another mechanism that is supported in the future.
# Trust Policy management
### Context
Deployers who consume signed artifacts from a registry require that only artifacts from trusted parties are deployed and executed. The trusted parties are specified in some form of a trust policy against which the signed artifacts are validated. The trust policy can include trusted root and leaf keys/certificates which represent the Publisher identities the Deployer trusts. This section covers where the trust policies should be stored, and how they are distributed/updated.
### Scenarios
* A Deployer consumes artifacts from a third party Publisher and wants to configure the trusted publishers in a policy. The Deployer is not the same user/organization as Publisher and wants to independently define the trusted publishers.
* A Publisher's repository may contain a mix of artifacts, published by different teams using different keys. The Deployer wants to trust only artifacts signed by specific keys.
* E.g. a group may include the MySQL image from Docker Hub and the MySQL Helm chart from another group in the same registry/repo. These are signed by different entities, and it is valid to place them in the same repository.
* Some scenarios where the trusted publishers are updated
* A Deployer updates their dependencies and no longer consumes artifacts from a particular Publisher.
* A public root associated with a leaf key a Publisher uses to sign artifacts is compromised.
* An internal key used by an organization is due to expire. A new key is created and rotated in before the existing key expires.
* An internal key used by an organization is not stored securely and the private key is lost. A new key pair is created to continue operations.
* A cryptographic algorithm is deprecated, or a compliance standard requires a Publisher to upgrade their key strength.
### Storage and distribution of trust policies from registry
*Pros*
* Trust policies need not be stored in another location
* Allows in band/automated distribution and update of trust policies to consumers.
*Cons*
* Automated update of trust policies can be disruptive to Deployers that consume artifacts. Deployers should be able to control when trust policies are updated.
* Only allows Publishers to define the trust policy. It works well when Publisher and Deployer are the same user/organization.
* This model does not allow Deployers to independently define trust policies. Furthermore, it may be required that Deployer Admins define trusted publishers, and Deployer Operators only specify/configure which artifacts (from repositories) are deployed.
### Recommendation
* Trust policies should not be stored in the registry. Multiple Deployers can consume artifacts from the same repository, and may need to define trust policies independently.
* Trust policies should be configured and distributed/updated out of band, independently of artifact updates from the registry.
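
To illustrate a Deployer-owned policy, here is a minimal sketch of a trust policy that lives outside the registry and scopes trusted signers to repositories. The `TrustPolicy` class, the glob-style repository patterns, and the fingerprint strings are assumptions for illustration only and do not describe a defined policy format.

```python
from dataclasses import dataclass, field
from fnmatch import fnmatchcase


@dataclass
class TrustPolicy:
    """Hypothetical Deployer-side policy, stored outside the registry.

    `scopes` maps a repository pattern to the set of signer key or
    certificate fingerprints the Deployer trusts for that pattern.
    """
    scopes: dict = field(default_factory=dict)

    def is_trusted(self, repository: str, signer_fingerprint: str) -> bool:
        """Trust a signer only if some matching scope lists its fingerprint."""
        for pattern, trusted in self.scopes.items():
            if fnmatchcase(repository, pattern) and signer_fingerprint in trusted:
                return True
        return False


# Example: one repository holds artifacts signed by different entities,
# and the Deployer trusts only the two signers listed here.
policy = TrustPolicy({
    "registry.example.com/apps/*": {"fp-mysql-image", "fp-mysql-chart"},
})
```

Since each Deployer holds its own policy object, two Deployers consuming the same repository can trust different signers, and each decides when its policy changes.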
# Older Requirement Topics
## Signature allow list/deny list
### Context
- Artifact Publishers and Consumers need mechanisms to indicate that an artifact version is trusted/untrusted explicitly or implicitly. This is in addition to indicating which roots or Publishers are trusted/untrusted.
### Scenarios
- A Publisher vends a new version of an artifact, and wants to indicate the new version as trusted in addition to older versions already published.
- A Publisher discovers a vulnerability in an older version of an artifact and wants to explicitly indicate that version as untrusted.
- A Consumer builds an artifact (which may include third party dependencies), and wants to indicate the artifact bundle, including dependencies, as trusted.
- A Consumer no longer wants to trust a deprecated version of a dependency, and wants to indicate that for artifacts already in circulation.
### Comparison
* **Code signing** is a mechanism for Publishers to explicitly indicate trust in an artifact. Consumers implicitly trust the artifact when certain conditions are satisfied (the trust policy matches, and the signature is valid and unrevoked).
* An explicit allowlist is not required
* A denylist is additionally required to indicate which artifacts are no longer trusted
* **Allowlist** explicitly specifies the list of trusted artifacts.
* Pros
* Anything not in the list is implicitly untrusted, no separate revocation mechanism is required
* If a signed allowlist is used, the artifacts themselves may not need to be signed.
* Cons
* The allowlist needs to be updated for every version of an artifact that is published
* The allowlist may grow large, which adds overhead when distributing it
* **Denylist**
* For Consumers, denylist allows explicitly indicating that an artifact or dependency is untrusted
* For Publishers, this allows communicating to Consumers that specific versions of an artifact are untrusted.
* **Centralized/local public/private lists**
* The list can be local to a repository, which requires updating the list in each repository and keeping track of all repositories where the artifact needs to be published/revoked.
* The list can be centralized
* A centralized denylist (e.g. a CRL maintained by public CAs) can be used. The endpoint information is included in the signature, and the signature verification step checks against this list.
* Customers can define network topology with restricted network access, where these endpoints may not be accessible for hosts where signature verification occurs.
* These endpoints may not be available in air-gapped environments.
* Public lists (e.g. transparency logs) may not be suitable for enterprise customers who don't want their artifact updates to be disclosed publicly.
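
The difference between the models compared above can be sketched as two decision functions. This is a deliberately simplified illustration: the function names are hypothetical, artifacts are identified by digest strings, and `signature_valid` stands in for the full signature and trust policy validation.

```python
def allowlist_decision(digest: str, allowlist: set) -> bool:
    """Allowlist model: anything not listed is implicitly untrusted,
    so no separate revocation mechanism is needed."""
    return digest in allowlist


def signing_decision(digest: str, signature_valid: bool, denylist: set) -> bool:
    """Code signing model: a valid signature implies trust, and a
    denylist is needed to withdraw trust from specific artifacts."""
    return signature_valid and digest not in denylist
```

Under the allowlist model every new version requires a list update before it is trusted, whereas under code signing a new version is trusted once it is validly signed, and only withdrawals require a list update.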
## Transparent Root Key Auto-Rotation
### Context
- Root keys form the basis of hierarchical trust systems, where intermediate and leaf keys chain back to a root key, and leaf keys are used to generate signatures. Consumers of signed artifacts associate a baseline level of trust with signatures that chain to roots they trust. If a root key is no longer trusted, all signed artifacts chaining back to that root are no longer trusted. The conditions that render a root key untrusted are disclosed to the customer through some other channel (CVE, internal security audit, etc.).
### Scenarios
- A public root associated with a key a Publisher uses to sign artifacts is compromised. Access to the root key is compromised, rather than the root key itself being stolen, allowing an attacker to create new trusted intermediate and leaf keys.
- An internal root used by an organization is due to expire. A new root is created and rotated before the existing root expires.
- An internal root used by an organization is not stored securely and the private key is lost. A new root is created to continue operations.
- A cryptographic algorithm is deprecated, or a compliance standard requires a customer to upgrade their key strength. The customer wants to safely rotate a root key without rendering existing signatures invalid and disrupting operations.
Safe rotation of keys requires an overlap period during which both the new and old root keys are trusted. This is not always possible, for example when a root key is compromised.
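
The overlap period can be sketched with per-root trust windows. The `TrustedRoot` class, its field names, and the dates below are hypothetical and only illustrate the timing; they do not describe a defined trust store format.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Iterable, Optional, Set


@dataclass(frozen=True)
class TrustedRoot:
    """Hypothetical trust store entry with an explicit trust window."""
    name: str
    trusted_from: datetime
    trusted_until: Optional[datetime]  # None means no planned retirement


def trusted_roots_at(roots: Iterable[TrustedRoot], when: datetime) -> Set[str]:
    """Return the names of the roots that are trusted at time `when`."""
    return {
        r.name
        for r in roots
        if r.trusted_from <= when
        and (r.trusted_until is None or when < r.trusted_until)
    }


def utc(year: int, month: int, day: int) -> datetime:
    return datetime(year, month, day, tzinfo=timezone.utc)


# Planned rotation: the new root becomes trusted three months before
# the old root is retired, so both are accepted during the overlap.
roots = [
    TrustedRoot("old-root", utc(2020, 1, 1), utc(2024, 7, 1)),
    TrustedRoot("new-root", utc(2024, 4, 1), None),
]
```

In the compromise case the old root's `trusted_until` would be set to the time the compromise is discovered, which leaves no overlap window and is why rotation cannot be made transparent in that scenario.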