---
status: proposed
title: Trusted Artifacts in Workspaces
creation-date: '2023-07-27'
last-updated: '2023-09-05'
authors:
  - '@afrittoli'
collaborators:
  - '@pritidesai'
  - '@jerop'
---

# TEP-0139: Trusted Artifacts in Workspaces

---

<!-- toc -->
- [Summary](#summary)
- [Background](#background)
- [Motivation](#motivation)
  - [Security](#security)
  - [Usability](#usability)
  - [Goals](#goals)
  - [Non-Goals](#non-goals)
  - [Requirements](#requirements)
- [Proposal](#proposal)
  - [API](#api)
  - [Execution](#execution)
  - [Non-Falsifiability of Results](#non-falsifiability-of-results)
  - [Example](#example)
- [Notes and Caveats](#notes-and-caveats)
- [Future Work](#future-work)
  - [Reusable Steps](#reusable-steps)
  - [Provenance Generation](#provenance-generation)
  - [Extend Schema](#extend-schema)
- [Design Evaluation](#design-evaluation)
  - [Reusability](#reusability)
  - [Simplicity](#simplicity)
  - [Flexibility](#flexibility)
  - [Conformance](#conformance)
  - [User Experience](#user-experience)
  - [Performance](#performance)
<!-- /toc -->

## Summary

The goal of this TEP is to extend the chain of trust for provenance produced by Tekton based on `TaskRuns` and `PipelineRuns`. It accomplishes this goal by enabling consumer `Tasks` to trust `Artifacts` on a `Workspace` from producer `Tasks`, by verifying them against hashes stored as non-falsifiable `Results`.

## Background

The Tekton Data Interface working group has been active for about 10 months now; it has identified a number of different problems to solve and proposed a number of different solutions. The number of different issues discussed, and their sometimes conflicting requirements, means that only a small fraction of the proposed solutions has actually been implemented in Tekton.

This proposal is an attempt to take one of the problems identified, describe it in a way that is as self-contained as possible, and provide a simple solution to it. The solution proposed does not need to address all the requirements and constraints of adjacent problems, but it should at least not make it harder for them to be addressed in the future.

## Motivation

The Tekton runtime model maps the execution of a `Task` (i.e. a `TaskRun`) to a Kubernetes `Pod`, and the execution of a `Pipeline` (i.e. a `PipelineRun`) to a collection of `Pods`. `Tasks` in a `Pipeline` share data through the `Workspace` abstraction, which can be bound to a `Persistent Volume` in Kubernetes.

### Security

Because of the design of `Persistent Volumes`, a downstream `TaskRun` has no way of knowing whether the content of a `Workspace` it receives as input has been tampered with. For example, if the source code on the `Workspace` is changed between a git clone `Task` and a container build `Task`, there is no longer a guarantee that the build used the git reference that was checked out.

SLSA v0.1 L3 requires provenance to identify the source code used for builds: "provenance must authenticate the repository that stored the source code used in the build". We need to ensure the integrity of `Artifacts` in `Workspaces` to meet these requirements and secure software supply chains.

### Usability

The current solution for generating provenance that identifies source code is a suffix-based type hinting, where `Results` must have `-ARTIFACT_INPUTS` or `-ARTIFACT_OUTPUTS` suffixes. This presents challenges:

- It encodes the `Artifact` concept into `Result` names; intertwining concepts makes the API complex.
- Users always have to use the suffixes in `Result` names for Tekton Chains to generate provenance.
- Users have to wire `Results` correctly between `Tasks` and `Pipelines` for Tekton Chains to generate provenance; it doesn't "just work".
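For illustration, a minimal sketch of the current convention (the `Result` names below are hypothetical, and the `uri`/`digest` object shape follows what Tekton Chains documents for type hinting; treat the details as illustrative):

```yaml
# Hypothetical Results using the suffix-based type hinting: Tekton Chains
# only recognizes these as artifacts because of the name suffixes.
results:
  - name: source-ARTIFACT_INPUTS
    type: object
    description: The source code used by the build
    properties:
      uri:
        type: string
      digest:
        type: string
  - name: image-ARTIFACT_OUTPUTS
    type: object
    description: The image produced by the build
    properties:
      uri:
        type: string
      digest:
        type: string
```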
It is critical that the proposed solution is easy to use and "just works", so that user workloads are secure by default.

### Goals

- Enable producer `Tasks` to declare artifacts they produce to a `Workspace`, and consumer `Tasks` to declare artifacts they consume from a `Workspace`.
- Enable consumer `Tasks` to trust artifacts on a `Workspace` from producer `Tasks`.

### Non-Goals

- This proposal does not address the integrity of `Artifacts` stored outside `Workspaces`. However, the proposed solution can be extended to upload, download and verify `Artifacts` from other storage, e.g. object storage and OCI registries.
- This proposal does not address downloading `Artifacts` as inputs to a `Pipeline` or uploading `Artifacts` as outputs of a `Pipeline`. This proposal focuses on passing `Artifacts` within a `Pipeline` and sets the foundation for this work to be explored in the future.

### Requirements

The solution should support the following combinations:

- One producer `Task`, one consumer `Task`.
- One producer `Task`, N consumer `Tasks`, including with a write-once, read-many storage class.
- Many producer `Tasks`, one consumer `Task`.
- Many producer `Tasks`, many consumer `Tasks`, including with a write-once, read-many storage class.
- Fail validation if `Workspaces` (static/runtime) are not fit for `Artifacts`.
- Fail execution if hash validation fails, and surface the error to the `TaskRun` / `PipelineRun` failure reason.

## Proposal

### API

Add `Input.Artifacts` and `Output.Artifacts` types with a fixed schema of three properties, `path`, `hash` and `type`:

- `path`: the path from which files are uploaded and to which files are downloaded.
- `hash`: the hash of the produced files, as computed by the injected `Steps`.
- `type`: `"file"` or `"directory"`.

This will be implemented using object `Parameters` and `Results` for inputs and outputs respectively.

```yaml
# Interface for users to declare input and output artifacts with an inbuilt schema
inputs:
  artifacts:
    - name: foo
      description: abcd
outputs:
  artifacts:
    - name: bar
      description: 1234

# Implementation: object type Params and Results with properties from the schema
params:
  - name: foo
    type: object
    description: abcd
    properties:
      path:
        type: string
      hash:
        type: string
      type:
        type: string
results:
  - name: bar
    type: object
    description: 1234
    properties:
      path:
        type: string
      hash:
        type: string
      type:
        type: string
```

Extend `Workspaces` to indicate whether they are used to store `Artifacts`. This is done through a new field that defaults to `false`, but users can set it to `true`. If set to `true`, Tekton will validate that the `Workspace` is backed by a `Persistent Volume`.

```yaml
spec:
  workspaces:
    - name: artifactStorage
      artifacts: true
```
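For instance (a sketch, assuming the `artifacts: true` declaration above), the following runtime binding would fail validation, because an `emptyDir` is not backed by a `Persistent Volume`:

```yaml
# Hypothetical PipelineRun workspace binding that would be rejected:
# a Workspace declared with "artifacts: true" must be backed by a
# Persistent Volume, not an emptyDir.
workspaces:
  - name: artifactStorage
    emptyDir: {}
```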
Extend variable expansion to add `.data.path` for `Artifacts`. The `.data.path` will be backed by an `EmptyDir` `Volume`, mounted at `/tekton/artifacts/` in the `Pod` by Tekton.

```yaml
steps:
  - name: produce-file
    image: bash:latest
    script: |
      #!/usr/bin/env bash
      date +%s | tee "$(outputs.artifacts.aFileArtifact.data.path)/foo.txt"
  - name: produce-folder
    image: bash:latest
    script: |
      #!/usr/bin/env bash
      mkdir -p "$(outputs.artifacts.aFolderArtifact.data.path)/aFolder"
      date +%s | tee "$(outputs.artifacts.aFolderArtifact.data.path)/aFolder/a.txt"
```

```yaml
steps:
  - name: consume-file
    image: bash:latest
    script: |
      #!/usr/bin/env bash
      echo "File content"
      cat "$(inputs.artifacts.aFileArtifact.data.path)"
  - name: consume-folder
    image: bash:latest
    script: |
      #!/usr/bin/env bash
      echo "Folder content"
      find "$(inputs.artifacts.aFolderArtifact.data.path)" -type f
```

### Execution

A producing `Task` writes files to a path backed by an `EmptyDir` `Volume`. After its execution, Tekton injects a `Step` to compute the hash and copy the files from the `EmptyDir` `Volume` to the `Persistent Volume`.

A consuming `Task` reads files from a path backed by an `EmptyDir` `Volume`. Before its execution, Tekton injects a `Step` to copy the files from the `Persistent Volume` to the `EmptyDir` `Volume` and to verify the hash, ensuring that the files being consumed are the ones that were produced.
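As a minimal sketch of the consuming side (the step name, artifact name and paths below are illustrative assumptions, not the final implementation), the key point is that the copy happens before the verification, so the verified data can no longer be modified through the shared volume:

```yaml
# Illustrative injected download step; "aFileArtifact" and all paths are
# assumptions for this sketch.
- name: download-and-verify
  image: bash:latest
  script: |
    #!/usr/bin/env bash
    set -e
    HASH="$(inputs.artifacts.aFileArtifact.hash)"
    # 1. Copy from the shared Persistent Volume to the Pod-local emptyDir.
    cp "$(workspaces.artifactStorage.path)/.tekton/artifacts/${HASH}" "/tekton/artifacts/${HASH}"
    # 2. Only then verify the hash against the producer's non-falsifiable Result.
    echo "${HASH}  /tekton/artifacts/${HASH}" | md5sum -c
```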
### Non-Falsifiability of Results

This proposal requires `Results` to be non-falsifiable, as proposed in TEP-0089, so that Tekton can rely on the hashes to validate the artifacts in a `Workspace`. As such, we need to complete the implementation of SPIRE support in Tekton Pipelines.

### Example

This `PipelineRun` demonstrates the passing of trusted artifacts between `Tasks` through a `Workspace`.

```yaml
apiVersion: tekton.dev/v1
kind: PipelineRun
metadata:
  generateName: trusted-artifacts-example
spec:
  workspaces:
    - name: artifactStorage
      volumeClaimTemplate:
        spec:
          accessModes:
            - ReadWriteOnce
          resources:
            requests:
              storage: 1Gi
  pipelineSpec:
    workspaces:
      - name: artifactStorage
        artifacts: true
    tasks:
      - name: producer
        taskSpec:
          outputs:
            artifacts:
              - name: aFileArtifact
                description: An artifact file
              - name: aFolderArtifact
                description: An artifact folder
          steps:
            - name: produce-file
              image: bash:latest
              script: |
                #!/usr/bin/env bash
                date +%s | tee "$(outputs.artifacts.aFileArtifact.data.path)/afile.txt"
            - name: produce-folder
              image: bash:latest
              script: |
                #!/usr/bin/env bash
                A_FOLDER_PATH="$(outputs.artifacts.aFolderArtifact.data.path)/afolder"
                mkdir "$A_FOLDER_PATH"
                date +%s | tee "${A_FOLDER_PATH}/a.txt"
                date +%s | tee "${A_FOLDER_PATH}/b.txt"
                date +%s | tee "${A_FOLDER_PATH}/c.txt"
      - name: consumer
        params:
          - name: aFileArtifact
            value: $(tasks.producer.outputs.artifacts.aFileArtifact)
          - name: aFolderArtifact
            value: $(tasks.producer.outputs.artifacts.aFolderArtifact)
        taskSpec:
          inputs:
            artifacts:
              - name: aFileArtifact
                description: An artifact file
              - name: aFolderArtifact
                description: An artifact folder
          steps:
            - name: consume-file
              image: bash:latest
              script: |
                #!/usr/bin/env bash
                echo "File content"
                cat "$(inputs.artifacts.aFileArtifact.data.path)"
            - name: consume-folder
              image: bash:latest
              script: |
                #!/usr/bin/env bash
                echo "Folder content"
                find "$(inputs.artifacts.aFolderArtifact.data.path)" -type f
```

In practice, this is what happens to the above `PipelineRun` with the `Steps` injected at execution time:

```yaml
apiVersion: tekton.dev/v1
kind: PipelineRun
metadata:
  generateName: trusted-artifacts-example
spec:
  workspaces:
    - name: artifactStorage
      volumeClaimTemplate:
        spec:
          accessModes:
            - ReadWriteOnce
          resources:
            requests:
              storage: 1Gi
  pipelineSpec:
    workspaces:
      - name: artifactStorage
        artifacts: true
    tasks:
      - name: producer
        taskSpec:
          outputs:
            artifacts:
              - name: aFileArtifact
                description: An artifact file
              - name: aFolderArtifact
                description: An artifact folder
          steps:
            - name: produce-file
              image: bash:latest
              script: |
                #!/usr/bin/env bash
                date +%s | tee "$(outputs.artifacts.aFileArtifact.data.path)/afile.txt"
            - name: produce-folder
              image: bash:latest
              script: |
                #!/usr/bin/env bash
                A_FOLDER_PATH="$(outputs.artifacts.aFolderArtifact.data.path)/afolder"
                mkdir "$A_FOLDER_PATH"
                date +%s | tee "${A_FOLDER_PATH}/a.txt"
                date +%s | tee "${A_FOLDER_PATH}/b.txt"
                date +%s | tee "${A_FOLDER_PATH}/c.txt"
            - name: upload-file
              image: bash:latest
              script: |
                #!/usr/bin/env bash
                set -ex
                ARTIFACT_ROOT="/tekton/artifacts"
                A_FILE_PATH=afile.txt
                A_FILE_HASH=$(md5sum "${ARTIFACT_ROOT}/${A_FILE_PATH}" | awk '{ print $1 }')
                TARGET_PATH="$(workspaces.artifactStorage.path)/.tekton/artifacts"
                mkdir -p "$TARGET_PATH"
                cp "${ARTIFACT_ROOT}/${A_FILE_PATH}" "${TARGET_PATH}/${A_FILE_HASH}"
                cat <<EOF | tee "$(outputs.artifacts.aFileArtifact.path)"
                {
                  "path": "${A_FILE_PATH}",
                  "hash": "${A_FILE_HASH}",
                  "type": "file"
                }
                EOF
            - name: upload-folder
              image: bash:latest
              script: |
                #!/usr/bin/env bash
                set -ex
                ARTIFACT_ROOT="/tekton/artifacts"
                A_FOLDER_PATH=afolder
                # Archive with paths relative to the artifact root, so the
                # consuming side can unpack it to its own artifact root
                tar zcf "${ARTIFACT_ROOT}/${A_FOLDER_PATH}.tgz" -C "${ARTIFACT_ROOT}" "${A_FOLDER_PATH}"
                A_FOLDER_HASH=$(md5sum "${ARTIFACT_ROOT}/${A_FOLDER_PATH}.tgz" | awk '{ print $1 }')
                TARGET_PATH="$(workspaces.artifactStorage.path)/.tekton/artifacts"
                mkdir -p "$TARGET_PATH"
                cp "${ARTIFACT_ROOT}/${A_FOLDER_PATH}.tgz" "${TARGET_PATH}/${A_FOLDER_HASH}.tgz"
                cat <<EOF | tee "$(outputs.artifacts.aFolderArtifact.path)"
                {
                  "path": "${A_FOLDER_PATH}",
                  "hash": "${A_FOLDER_HASH}",
                  "type": "directory"
                }
                EOF
      - name: consumer
        params:
          - name: aFileArtifact
            value: $(tasks.producer.outputs.artifacts.aFileArtifact)
          - name: aFolderArtifact
            value: $(tasks.producer.outputs.artifacts.aFolderArtifact)
        taskSpec:
          inputs:
            artifacts:
              - name: aFileArtifact
                description: An artifact file
              - name: aFolderArtifact
                description: An artifact folder
          steps:
            - name: download-verify-file
              image: bash:latest
              script: |
                #!/usr/bin/env bash
                set -e
                # Download the file
                ARTIFACTS_ROOT="$(workspaces.artifactStorage.path)/.tekton/artifacts"
                ARTIFACT="${ARTIFACTS_ROOT}/$(inputs.artifacts.aFileArtifact.hash)"
                TARGET_ROOT="/tekton/artifacts"
                TARGET_ARTIFACT="${TARGET_ROOT}/$(inputs.artifacts.aFileArtifact.hash)"
                cp "$ARTIFACT" "$TARGET_ARTIFACT"
                # Check the md5sum
                ret=0
                echo "$(inputs.artifacts.aFileArtifact.hash)  ${TARGET_ARTIFACT}" | md5sum -c || ret=$?
                if [[ $ret -ne 0 ]]; then
                  >&2 echo "Want $(inputs.artifacts.aFileArtifact.hash), got $(md5sum "${TARGET_ARTIFACT}")"
                  exit 1
                fi
            - name: download-verify-folder
              image: bash:latest
              script: |
                #!/usr/bin/env bash
                set -e
                # Download the folder archive
                ARTIFACTS_ROOT="$(workspaces.artifactStorage.path)/.tekton/artifacts"
                ARTIFACT="${ARTIFACTS_ROOT}/$(inputs.artifacts.aFolderArtifact.hash).tgz"
                TARGET_ROOT="/tekton/artifacts"
                TARGET_ARTIFACT="${TARGET_ROOT}/$(inputs.artifacts.aFolderArtifact.hash).tgz"
                cp "$ARTIFACT" "$TARGET_ARTIFACT"
                # Check the md5sum
                ret=0
                echo "$(inputs.artifacts.aFolderArtifact.hash)  ${TARGET_ARTIFACT}" | md5sum -c || ret=$?
                if [[ $ret -ne 0 ]]; then
                  >&2 echo "Want $(inputs.artifacts.aFolderArtifact.hash), got $(md5sum "${TARGET_ARTIFACT}")"
                  exit 1
                fi
                # Unpack the verified archive so the consuming steps find the folder content
                tar zxf "${TARGET_ARTIFACT}" -C "${TARGET_ROOT}"
            - name: consume-file
              image: bash:latest
              script: |
                #!/usr/bin/env bash
                echo "File content"
                cat "$(inputs.artifacts.aFileArtifact.data.path)"
            - name: consume-folder
              image: bash:latest
              script: |
                #!/usr/bin/env bash
                echo "Folder content"
                find "$(inputs.artifacts.aFolderArtifact.data.path)" -type f
```
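For illustration, the JSON written by the injected `upload-file` step would surface in the producer's `TaskRun` status as an object `Result` roughly like the following (the hash value is fabricated for the example, and the exact status layout should be treated as a sketch):

```yaml
# Illustrative fragment of the producer TaskRun status; the object Result
# carries the fixed {path, hash, type} schema.
status:
  results:
    - name: aFileArtifact
      type: object
      value:
        path: afile.txt
        hash: 1b8e3b95b1e573d0b3c2bbb0c2de6ac7
        type: file
```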
## Notes and Caveats

Some questions raised during the initial presentation:

* Q: Can we restrict access to the persistent volumes to the injected steps? Maybe using TEP-0029?
* A: We could mount the workspace to injected steps instead of relying on propagated workspaces. This would not prevent users from mounting the workspace to other steps / sidecars though, unless we add validation to prevent that. However, that would mean that a consumer could not use the artifact workspace to produce another artifact, which would be problematic.
* Q: Using an `emptyDir` secures the data, but may be less performant than writing directly.
* A: On the producing side, we need to let users write to an `emptyDir`, and the injected step will calculate the hash and then copy the data to the workspace. On the consuming side, we need to copy data to an `emptyDir` and then verify the checksum, or else we cannot be sure that the data has not been compromised after the checksum verification.
* Q: Could the controller be the one that has access to write the files to the artifact storage?
* A: Using the Tekton controller to transfer data for all `TaskRuns` would turn it into an I/O bottleneck. We could conceive of a Tekton-managed service that artifacts are uploaded to and downloaded from, but for this proposal I wanted to rely on the existing workspace as a baby step forward. Once that is in place, we can introduce different kinds of backends. The beauty of it is that we can switch the implementation behind the scenes with no impact on the user interface, and thus no impact on existing tasks and pipelines.
* Q: If an `Artifact` needs to be consumed by multiple `Tasks`, do we need a lock?
* A: We don't need a lock, but we need to copy the artifact to the `Pod` local disk (`emptyDir`) first, then verify the checksum, and then hand off control to the user.
* Q: What about the flexibility of the (injected) steps?
* A: The implementation for workspaces (this TEP) won't be flexible. In the future we will introduce support for other kinds of backends, and perhaps user-defined ones, which means we may need to give users a way to define what the upload/download steps look like. I purposefully wanted to steer away from that complexity in this proposal.
* Q: Is the path/hash to be used for provenance generation?
* A: This proposal is only meant for tasks to securely share artifacts with each other. Provenance generation is interested in input and output artifacts instead. That said, this proposal is designed so that it may be used and extended for input and output artifacts as well, in which case the artifact metadata will become relevant from a provenance point of view.
* Q: Do we want to support several verification policies, like we do for trusted resources?
* A: Not in the initial version, where we will only fail when the hash doesn't match.

## Future Work

### Reusable Steps

We can introduce `Step` CRDs to enable reusable units of work in Tekton to execute in the same environment with a shared file system (a `Pod` in Kubernetes). We can build on the above proposal to enable users to define the injected `Steps` used to transfer artifacts to/from local disks and to verify artifacts before they are operated on; see the sketch below. This will provide greater flexibility than the injected `Steps` defined by Tekton.
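Purely as an illustration of the idea (nothing below exists today: the `Step` kind, its fields and the parameter names are all hypothetical), a user-defined transfer step might look like:

```yaml
# Hypothetical Step CRD: a user-defined replacement for the Tekton-injected
# upload step. None of this API exists yet.
apiVersion: tekton.dev/v1alpha1
kind: Step
metadata:
  name: upload-artifact
spec:
  params:
    - name: source   # local path of the produced files (hypothetical)
    - name: target   # backend location to upload to (hypothetical)
  image: bash:latest
  script: |
    #!/usr/bin/env bash
    set -e
    # Compute the hash and copy the artifact to the chosen backend location.
    HASH=$(md5sum "$(params.source)" | awk '{ print $1 }')
    cp "$(params.source)" "$(params.target)/${HASH}"
```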
### Provenance Generation

This proposal focuses on sharing artifacts between `Tasks` in a `Pipeline` via a `Workspace`. We can extend this feature to declare inputs and outputs of a `Pipeline` for which provenance needs to be generated by Tekton Chains. This will be explored in future work.

### Extend Schema

The schema requires `path`, `hash` and `type`. Users may need additional properties in the schemas of specific `Artifacts`. We can explore supporting this in future work.

## Design Evaluation

### Reusability

Adopting trusted artifacts would require users to make changes to their `Tasks` and `Pipelines`, albeit minimal ones.

### Simplicity

The proposed functionality relies as much as possible on existing Tekton features.

### Flexibility

The proposed functionality relies on workspaces and `PVCs`; however, it could easily be extended to support additional storage backends.

In terms of flexibility of adoption in pipelines, no assumptions are made about the `Tasks` and `Pipelines` where this is used.

The artifact schema could be extended in the future, or it could support custom fields specified by users, in the same way they do today for object parameters and results, to allow users to attach additional metadata to their artifacts.

### Conformance

TBD

### User Experience

The API surface change is minimal and consistent with the API that users are familiar with today.

### Performance

Injected steps would impact the execution of `TaskRuns` and `PipelineRuns`; however, the impact should be minimal:

- A single producer and consumer step can be used to handle multiple artifacts, to avoid the overhead of one container per artifact.
- Steps shall be injected only where needed.
- The ability to use `workspaces` means that minimal extra data I/O is required:
  - tar/untar folders for hashing purposes
  - copy data on the consuming side to avoid dirty reads