# OKD Shipwright Multi-Arch Builds
Authors:
- [@Adarsh-jaiss](https://github.com/Adarsh-jaiss)
## Overview
The [multi-arch-native-build-strategy.yaml](https://github.com/okd-project/okd-payload-pipeline/blob/80cd93b299495e805608179b5af4d47024d255cb/shipwright/multi-arch-native-build-strategy.yaml) file defines a ClusterBuildStrategy for multi-architecture native builds using [Buildah](https://buildah.io). Here is a plain-language summary of the build strategy spec:
- The volumes section defines a volume named "oci-archive-storage" with an emptyDir.
- The buildSteps section contains a list of steps to be executed during the build process.
- The first step is named "prepare-build". It uses the "quay.io/fedora/fedora:latest" image and sets the working directory to the value of the "shp-source-root" parameter. It also specifies resource requests and limits for CPU and memory. The command is set to "/bin/bash" and the args section contains a bash script that parses parameters, verifies that the context directory and Dockerfile exist, creates a registries configuration file, downloads kubectl, creates a build job for each architecture, and uploads assets to the build pods.
- The second step is named "wait-manifests-complete". It uses the "quay.io/fedora/fedora:latest" image and sets the working directory to "/tmp". It mounts the "oci-archive-storage" volume. The command is set to "/bin/bash" and the args section contains a bash script that waits for the build jobs to complete, downloads the built images, and lists the files in the "oci-archive-storage" directory.
- The third step is named "package-manifest-list-and-push". It uses the "quay.io/containers/buildah:v1.28.0" image and sets the working directory to "/var/oci-archive-storage". It mounts the "oci-archive-storage" volume. The command is set to "/bin/bash" and the args section contains a bash script that creates a manifest list for the built images and pushes it to the specified image repository.
- The parameters section defines various parameters that can be passed to the build strategy, such as architectures, build arguments, image names, registries, resource requests and limits.
## Args deep dive
1. **prepare-build step:**
- The args section of this step contains the arguments that will be passed to the `/bin/bash` command.
- The arguments are specified using the `-c` flag, which allows you to provide a command or script as a string.
- The script inside the args section is a Bash script that performs various tasks related to preparing the build environment.
- The script starts with the shebang `#!/bin/bash` to indicate that it should be executed using the Bash shell.
- The script sets up error handling (`set -Eueo pipefail`) and a trap to handle termination signals and errors (`trap 'CHILDREN=$(jobs -p); if test -n "${CHILDREN}"; then kill ${CHILDREN} && wait; fi' TERM ERR`).
- The script then parses the command-line arguments using a while loop and assigns values to variables based on the arguments.
- After parsing the arguments, the script performs some validations and checks the existence of the context directory and Dockerfile.
- It creates a registries configuration file based on the provided arguments.
- If a runtime-stage-from image is specified, it replaces the value in the last FROM instruction in the Dockerfile.
- It downloads kubectl and sets up some variables related to the task run and Kubernetes environment.
- Next, it creates a build job for each architecture specified in the architectures array.
- Inside each build job, it sets up the necessary environment, mounts volumes, and executes a Bash script to perform the actual build steps.
- The build script inside the build job creates a temporary directory, waits for assets to be copied, builds the image using buildah, and stores the image as an oci-archive.
- Finally, it waits for the build pods to start and uploads the assets to them.
2. **wait-manifests-complete step:**
- The args section of this step contains the arguments that will be passed to the /bin/bash command.
- Similar to the previous step, the script inside the args section is a Bash script that performs tasks related to waiting for the build jobs to complete and downloading the built images.
- The script starts with the shebang `#!/bin/bash` and sets up error handling and a trap for termination signals.
- It parses the command-line arguments using a while loop and assigns values to variables based on the arguments.
- It downloads kubectl and sets up some variables related to the task run and Kubernetes environment.
- Next, it waits for each build job to complete and downloads the built images using `kubectl cp`.
- It also waits for the build pods to be ready and streams the logs of the build containers.
- Finally, it lists the files in the /var/oci-archive-storage directory.
3. **package-manifest-list-and-push step:**
- The args section of this step contains the arguments that will be passed to the /bin/bash command.
- The script inside the args section is a Bash script that performs tasks related to creating a manifest list and pushing the built images.
- The script starts with the shebang `#!/bin/bash` and sets up error handling.
- It parses the command-line arguments using a while loop and assigns values to variables based on the arguments.
- It retrieves the image name from the `--image` argument and sets up a variable for insecure registries.
- The script then creates a manifest list using `buildah manifest create` and adds the OCI archives to the manifest list using `buildah manifest add`.
- Finally, it pushes the manifest list and the individual images to a container registry using `buildah manifest push`.
- These args sections contain the commands and arguments that will be executed within each build step. They define the behavior and actions performed during the build process.
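
To make the `-c` mechanism concrete, here is a minimal sketch of how arguments that follow the script string become positional parameters inside it. The inline script and the sample image reference are assumptions for illustration and are not copied from the strategy.

```bash
#!/usr/bin/env bash
# With `bash -c SCRIPT NAME ARGS...`, NAME is assigned to $0 and the
# remaining ARGS become $1, $2, ... inside SCRIPT.
# Here "--" fills the $0 slot; the option values are illustrative only.
output=$(bash -c 'echo "count=$# first=$1 second=$2"' -- --image registry.example.com/app:latest)
echo "${output}"
```

This is why each step's script can loop over `$1`, `$2`, and so on, even though it is supplied as a single string.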
## Code-level deep dive
### Build step
1. **Error Handling**
- The script follows the usual Bash practice of failing on errors: `set -euo pipefail`.
- The script also uses `set -E` and `trap` to terminate child jobs when it receives a termination signal (`TERM`) or when a command fails (`ERR`).
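
The error-handling preamble can be sketched as follows. The `set` and `trap` lines mirror the ones quoted earlier in this note; the small pipeline at the end is an added demonstration, not part of the strategy.

```bash
#!/usr/bin/env bash
# -E: the ERR trap is inherited by functions and subshells
# -e: exit on the first failing command
# -u: treat unset variables as errors
# -o pipefail: a pipeline fails if any of its stages fails
set -Eueo pipefail

# On TERM or ERR, kill any background jobs that are still running.
trap 'CHILDREN=$(jobs -p); if test -n "${CHILDREN}"; then kill ${CHILDREN} && wait; fi' TERM ERR

# Demonstration: without pipefail this pipeline would report success.
pipeline_status=0
false | true || pipeline_status=$?
echo "pipeline status with pipefail: ${pipeline_status}"
```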
2. **Parsing parameters from the arguments**
- It initializes several variables to empty strings. These variables will hold the values of different command-line arguments.
- It declares two arrays, architectures and buildArgs, which will hold the values of the `--architectures` and `--build-args` arguments, respectively.
- It enters a while loop that continues as long as there are command-line arguments to process (`$# -gt 0`).
- Inside the loop, it takes the first argument (`$1`), assigns it to the variable `arg`, and then removes it from the list of arguments (`shift`).
- It then checks the value of arg against a series of known command-line options (e.g., `--context`, `--dockerfile`, `--image`, etc.). If arg matches one of these options, it assigns the next command-line argument to the appropriate variable and removes it from the list of arguments.
- Some options, like `--architectures`, `--build-args`, `--build-contexts`, `--registries-block`, `--registries-insecure`, and `--registries-search`, require multiple values. For these options, it sets the status variable to a special value (e.g., parse_architectures) and then, in subsequent iterations of the loop, adds the following arguments to the appropriate array or string until it encounters another option.
- If arg starts with -- but doesn't match any known options, or if it encounters an argument when status is not set to a known value, it prints an error message and exits the script with a non-zero status code, indicating an error.
- The loop continues until all command-line arguments have been processed. At the end, the variables and arrays contain the values of the command-line arguments, ready to be used in the rest of the script.
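
A reduced sketch of this parsing loop is shown below. It handles only three of the options and uses a `case` statement; the function name `parse_args` and the use of `return` instead of `exit` are assumptions made so the sketch can be exercised on its own.

```bash
#!/usr/bin/env bash
set -euo pipefail

parse_args() {
  context=""
  image=""
  architectures=()
  status=""

  while [[ $# -gt 0 ]]; do
    arg="$1"
    shift
    case "${arg}" in
      --context)
        context="$1"; shift   # single-value option: consume the next argument
        status=""
        ;;
      --image)
        image="$1"; shift
        status=""
        ;;
      --architectures)
        status="parse_architectures"   # following values go into the array
        ;;
      --*)
        echo "ERROR: unknown option ${arg}" >&2
        return 1
        ;;
      *)
        if [[ "${status}" == "parse_architectures" ]]; then
          architectures+=("${arg}")
        else
          echo "ERROR: unexpected argument ${arg}" >&2
          return 1
        fi
        ;;
    esac
  done
}

parse_args --context . --architectures amd64 arm64 --image registry.example.com/app:latest
echo "context=${context} image=${image} architectures=${architectures[*]}"
```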
3. **Verifying the Existence of Context directory and Dockerfile**
- The first `if` statement checks if the directory specified by the variable `${context}` exists. The `-d` flag in the test `[ ! -d "${context}" ]` checks for a directory existence. The `!` negates the test, so the code inside the if statement executes if the directory does not exist.
- If the directory does not exist, it prints an error message, writes **"ContextDirNotFound"** to the file specified by `$(results.shp-error-reason.path)`, writes an error message to the file specified by `$(results.shp-error-message.path)`, and then exits the script with a status of 1, indicating an error.
- If the directory does exist, it changes the current working directory to `${context}` using the `cd` command.
- The second `if` statement checks if the file specified by the variable `${dockerfile}` exists. The `-f` flag in the test `[ ! -f "${dockerfile}" ]` checks for a file's existence.
- If the file does not exist, it prints an error message, writes **"DockerfileNotFound"** to the file specified by `$(results.shp-error-reason.path)`, writes an error message to the file specified by `$(results.shp-error-message.path)`, and then exits the script with a status of 1, indicating an error.
- If the file does exist, the script continues to the next command.
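
The two checks can be sketched like this. In the strategy, the result paths are substituted before the script runs; here they are replaced by temporary files, and the function name `verify_inputs` and the exact error wording are assumptions.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Stand-ins for $(results.shp-error-reason.path) and
# $(results.shp-error-message.path); temporary files are an assumption.
error_reason_file="$(mktemp)"
error_message_file="$(mktemp)"

verify_inputs() {
  local context="$1" dockerfile="$2"
  if [ ! -d "${context}" ]; then
    echo "ERROR: context directory '${context}' does not exist" >&2
    printf '%s' "ContextDirNotFound" > "${error_reason_file}"
    printf '%s' "The context directory '${context}' does not exist." > "${error_message_file}"
    return 1   # the real script exits with status 1 here
  fi
  cd "${context}"
  if [ ! -f "${dockerfile}" ]; then
    echo "ERROR: Dockerfile '${dockerfile}' does not exist" >&2
    printf '%s' "DockerfileNotFound" > "${error_reason_file}"
    printf '%s' "The Dockerfile '${dockerfile}' does not exist." > "${error_message_file}"
    return 1
  fi
}

# Exercise both paths in subshells so `cd` does not leak into the caller.
workdir="$(mktemp -d)"
touch "${workdir}/Dockerfile"
(verify_inputs "${workdir}" Dockerfile) && echo "inputs verified"

missing_status=0
(verify_inputs "${workdir}/missing" Dockerfile) 2>/dev/null || missing_status=$?
echo "reason: $(cat "${error_reason_file}")"
```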
4. **Creating registry configs based on provided args**
- It creates a new file at `/tmp/registries.conf` using the `touch` command.
- It checks if the `registriesSearch`, `registriesInsecure`, and `registriesBlock` variables are not empty. If they are not, it appends a section to the `registries.conf` file for each one. The `cat <<EOF >>/tmp/registries.conf` command is a **"here document"** that appends the text between `EOF` markers to the file. The `${variable::-2}` syntax removes the last two characters from the variable's value.
- It checks if the `runtime_stage_from_image` variable is not empty. If it is not, it uses several commands to extract the name of the base image from the Dockerfile and adds a new build argument that replaces this base image with the value of `runtime_stage_from_image`.
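
The here document and the `${variable::-2}` trimming can be sketched as follows. This assumes each value is accumulated as `'value', ` (leaving a trailing comma and space to strip) and that the file uses the `[registries.search]` layout; neither detail is confirmed by this note, and a temporary file stands in for `/tmp/registries.conf`.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Assumption: values are accumulated with a trailing ", " separator.
registriesSearch=""
for registry in docker.io quay.io; do
  registriesSearch+="'${registry}', "
done

conf_file="$(mktemp)"   # the strategy writes to /tmp/registries.conf

if [ -n "${registriesSearch}" ]; then
  # ${registriesSearch::-2} drops the final ", " before closing the list.
  cat <<EOF >>"${conf_file}"
[registries.search]
registries = [${registriesSearch::-2}]

EOF
fi

cat "${conf_file}"
```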
5. **Downloading kubectl**
- It downloads the `kubectl` command-line tool for interacting with Kubernetes. It first determines the version of Kubernetes and the architecture of the current machine, then downloads the appropriate version of kubectl from Google's storage service and makes it executable.
- It sets several variables related to the current Kubernetes pod and task run. The `$(</var/run/secrets/kubernetes.io/serviceaccount/namespace)` command reads the namespace from a file, and the `$(kubectl get pod/"${task_run_pod}" -o jsonpath='{.metadata.ownerReferences[0].uid}')` command uses kubectl to get the UID of the owner of the current pod.
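
The sketch below illustrates two of the pieces described above without touching the network: mapping the machine architecture to a download path, and reading a file with `$(<file)`. The helper names, the URL layout, and the sample version are assumptions; the strategy's actual download logic may differ.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Map `uname -m` output to the architecture names used in release paths.
kubectl_arch() {
  case "$1" in
    x86_64) echo "amd64" ;;
    aarch64) echo "arm64" ;;
    *) echo "$1" ;;   # e.g. ppc64le and s390x are passed through unchanged
  esac
}

# Assumed URL layout on Google's storage service.
kubectl_url() {
  local version="$1" arch="$2"
  echo "https://storage.googleapis.com/kubernetes-release/release/${version}/bin/linux/${arch}/kubectl"
}

arch="$(kubectl_arch "$(uname -m)")"
echo "would download: $(kubectl_url v1.27.3 "${arch}")"

# $(<file) reads a file's contents without spawning `cat`; a temporary
# file stands in for the service account namespace file.
namespace_file="$(mktemp)"
printf '%s' "demo-namespace" > "${namespace_file}"
namespace="$(<"${namespace_file}")"
echo "namespace=${namespace}"
```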
6. **Creating build jobs** (Kubernetes Jobs that build the container image on each architecture)
- The script first loops over an array of architectures. For each architecture, it creates a Kubernetes job using a heredoc (the `<<EOF ... EOF` block). This job is configured to build a container image using Buildah, a tool for building OCI-compliant container images.
- The job is configured with various parameters such as the architecture, the image to be built, the Dockerfile to use, and resource requests and limits. It also sets up a pipe at `/tmp/pipe` which is used to synchronize the build process.
- After creating the jobs, the script sets up two arrays to keep track of the process IDs of the jobs that are waiting for failure and success conditions respectively.
- The script then enters a loop where it waits for either a failure or success condition to be met. If a failure condition is met, it exits with an error. If a success condition is met, it removes the corresponding process ID from the success array.
- Once all jobs have started successfully, the script uploads the necessary assets to the build pods. This is done by creating a tarball of the current directory and copying it to the pod, and also copying a registries.conf file to the pod.
- The script then waits for all the asset uploads to complete. If any of them fail, it exits with an error.
- Finally, the script prints a message indicating that all assets have been uploaded and the build process can continue.
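
One way a named pipe can synchronize the upload and the build is sketched below. How the strategy actually reads and writes `/tmp/pipe` is not shown in this note, so the message content and the reader/writer roles are assumptions.

```bash
#!/usr/bin/env bash
set -euo pipefail

pipe="$(mktemp -u)"   # the strategy uses /tmp/pipe; a unique path avoids clashes
mkfifo "${pipe}"
builder_log="$(mktemp)"

# Build side: block until something is written to the pipe, then continue.
(
  read -r signal < "${pipe}"
  echo "received '${signal}', starting build" > "${builder_log}"
) &
builder_pid=$!

# Upload side: once the assets are in place, write to the pipe to release
# the waiting build.
echo "assets-ready" > "${pipe}"
wait "${builder_pid}"

cat "${builder_log}"
rm -f "${pipe}"
```

Reading from a FIFO blocks until a writer appears, which is what lets the build container idle until its sources have arrived.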
### Wait step (`wait-manifests-complete`)
- The step is named `wait-manifests-complete` and uses the `quay.io/fedora/fedora:latest` image. It works in the `/tmp` directory and mounts a volume at `/var/oci-archive-storage`.
- The step requests 50m CPU and 16Mi memory, and limits memory usage to 256Mi.
- The step runs a Bash script. The script starts by setting some Bash options for error handling and traps any errors or termination signals to kill all child processes.
- As in the previous step, the script declares an array `architectures` and a boolean `inArchitectures`. It then processes its arguments, adding any arguments after `--architectures` to the `architectures` array until it encounters another argument starting with `--`.
- The script downloads `kubectl`, matching the version of the Kubernetes server it's running on. It saves `kubectl` to `/tmp/bin/` and adds this directory to the `PATH`.
- The script gets the name of the current Pod and constructs a job name from it. It also declares arrays `success_pids` and `failure_pids` to keep track of background processes, and a variable `finished_job`.
- The script defines a function `download_images` that triggers an image download in a specified Pod and copies the downloaded image to `/var/oci-archive-storage`.
- The script then loops over the architectures array. For each `architecture`, it waits for a job to complete or fail, downloads images, and waits for a Pod to become ready, logging its output. The process IDs of these background tasks are added to `success_pids`.
- The script enters a loop where it waits for any of the background tasks to finish. If a task fails, it exits with an error. If a task succeeds, it removes its process ID from `success_pids`. The loop continues until all tasks have succeeded.
- Finally, the script prints a success message and lists the contents of `/var/oci-archive-storage`.
- The arguments to the script are `--architectures` followed by the elements of `params.architectures[*]`. This means that the script will process architectures specified in the `params.architectures` array.
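
The background-task bookkeeping can be sketched in a simplified form. Here `fake_build` is a hypothetical stand-in for "wait for the job and download its image", and the sketch waits for each PID in turn rather than reacting to whichever task finishes first, so it is a reduced variant of the loop described above.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical stand-in for the per-architecture work; fails for "broken".
fake_build() {
  local arch="$1"
  sleep 0.1
  [ "${arch}" != "broken" ]
}

wait_for_all() {
  local -a success_pids=()
  local arch pid
  # Start one background task per architecture and remember its PID.
  for arch in "$@"; do
    fake_build "${arch}" &
    success_pids+=("$!")
  done
  # `wait PID` returns the exit status of that task.
  for pid in "${success_pids[@]}"; do
    if ! wait "${pid}"; then
      echo "a background task failed" >&2
      return 1
    fi
  done
  echo "all tasks finished"
}

wait_for_all amd64 arm64
```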
### Build and push step
- This step uses the Buildah tool to create a manifest list, add the built images to it, and push it.
- The step is named `package-manifest-list-and-push` and uses the `v1.28.0` version of the Buildah image from Quay.io. It works in the `/var/oci-archive-storage` directory and mounts a volume at the same path.
- The step requests 50m CPU and 16Mi memory, and limits memory usage to 256Mi.
- The step runs a Bash script. The script starts by setting some Bash options for error handling and lists the contents of the current directory.
- The script declares some variables and enters a loop where it processes its arguments. It recognizes `--image` and `--registries-insecure` arguments. If it encounters an unrecognized argument, it exits with an error.
- If the `--image` argument is passed, the script sets the image variable to the next argument and clears the status variable.
- If the `--registries-insecure` argument is passed, the script sets the `status` variable to `parse_registries_insecure`. If the `status` variable is `parse_registries_insecure` and the next argument is not another option, the script adds the argument to the `registriesInsecure` variable and checks if the `image` variable starts with the argument. If it does, it sets `tlsVerify` to `false`.
- The script gets the base name of the `image` variable and creates a manifest list with this name using Buildah.
- The script then loops over all `.tar.gz` files in the current directory that start with `image-`. For each file, it adds a manifest to the manifest list using Buildah. If a file does not exist, it exits with an error.
- Finally, the script pushes the manifest list to a container registry using Buildah. It passes the `tlsVerify` variable to the `--tls-verify` option.
- The arguments to the script are `--image` followed by the value of `params.shp-output-image`, and `--registries-insecure` followed by the elements of `params.registries-insecure[*]`.
- The `parameters` section defines parameters that can be passed to the build strategy. These include the architectures to build the image for, the build arguments for the Dockerfile, the names of the images to replace in the FROM instructions, additional build contexts, registries to block or treat as insecure, registries to search for images, and the CPU and memory requests and limits for the build.
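
The manifest-list logic of this step can be sketched as a dry run that prints the Buildah commands instead of executing them. The helper names, the `oci-archive:` references, and the `--all` flag are assumptions about how the strategy invokes Buildah; the archive file names are hypothetical.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Prints "false" when the image reference starts with an insecure registry.
tls_verify_for() {
  local image="$1"; shift
  local registry
  for registry in "$@"; do
    if [[ "${image}" == "${registry}"* ]]; then
      echo "false"
      return 0
    fi
  done
  echo "true"
}

# Prints the buildah commands for the given image and archives (dry run).
plan_manifest_push() {
  local image="$1" tls_verify="$2"; shift 2
  local name archive
  name="$(basename "${image}")"
  echo "buildah manifest create ${name}"
  for archive in "$@"; do
    echo "buildah manifest add ${name} oci-archive:${archive}"
  done
  echo "buildah manifest push --all --tls-verify=${tls_verify} ${name} docker://${image}"
}

image="localhost:5000/team/app:latest"
tls_verify="$(tls_verify_for "${image}" localhost:5000)"
plan_manifest_push "${image}" "${tls_verify}" image-amd64.tar.gz image-arm64.tar.gz
```

Printing the commands first makes it easy to check the prefix match and the derived manifest name before pointing the step at a real registry.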