# Sourcing Bundle content from Git repositories

# Goal

Support unpacking content from a git repository to the storage backend via the Bundle API. The git repository would contain a manifests directory composed of static Kubernetes YAML files (a plain+v0 bundle directory).

## Secondary Goals

* Ensure the design scales to support other content sources
* Ensure the design is simple to understand and use
* Ensure the design can eventually incorporate additional features, such as sourcing from private git repositories

# Design Considerations

* Union type source field vs URI approach (agreed to be union type for now)
* Should the git-based provisioner be a new, separate controller from the existing plain bundle controller? (No)
* Use a sidecar git image to pull down contents (for example https://hub.docker.com/r/bitnami/git/) or encode git logic into an unpacker?
* See the flux source-controller Git CRD for an example of a fully featured git unpacker: https://github.com/fluxcd/source-controller/blob/a3cbe6ee46f0454cf0974a042e5ee1a5c4bb1983/api/v1beta2/gitrepository_types.go#L49

## Proposed API Design

```yaml
kind: Bundle
metadata:
  name: plumbus-operator.v0.9.3
spec:
  provisionerClassName: core.rukpak.io/plain
  source:
    type: git # required, one-of
    git:
      repository: https://github.com/operator-framework/combo # which repo (required)
      directory: /deploy/manifests # where in the repo are the manifests (default root) (optional)
      reference: # required, one-of, exactly one
        branch: dev # which branch has the required manifests (optional)
        tag: 0.1.0 # tag reference (optional)
        commit: eea6a040b321cb35ef394c15a006ae0901f2f8b0 # (optional)
      secret: my-gh-secret # secret with auth information (optional)
```

The existing plain Bundle controller should know how to pull down and unpack a git repository. The primary component involved in unpacking is the unpacker binary. It is separate from the controller binary, and it reads a directory and writes the JSON-encoded contents to stdout.
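
The `source` union and the "exactly one" rule on `reference` in the YAML above could be modeled in Go roughly as follows. This is only a sketch: the type names (`BundleSource`, `GitSource`, `GitRef`), the `Validate` method, and the placement of `secret` inside the git struct are assumptions made for illustration, not the actual rukpak API.

```go
package main

import (
	"errors"
	"fmt"
)

// SourceType names which member of the source union is populated.
// Hypothetical type; only the "git" member from the proposal is modeled.
type SourceType string

const SourceTypeGit SourceType = "git"

// GitRef mirrors spec.source.git.reference. Exactly one field should be set.
type GitRef struct {
	Branch string `json:"branch,omitempty"`
	Tag    string `json:"tag,omitempty"`
	Commit string `json:"commit,omitempty"`
}

// GitSource mirrors spec.source.git. Placing Secret here is an assumption;
// the proposal leaves its location as an open question.
type GitSource struct {
	Repository string `json:"repository"`
	Directory  string `json:"directory,omitempty"`
	Ref        GitRef `json:"reference"`
	Secret     string `json:"secret,omitempty"`
}

// BundleSource is the union: Type selects which member must be non-nil.
type BundleSource struct {
	Type SourceType `json:"type"`
	Git  *GitSource `json:"git,omitempty"`
}

// Validate checks the required fields and the exactly-one reference rule.
func (s BundleSource) Validate() error {
	if s.Type != SourceTypeGit {
		return fmt.Errorf("unsupported source type %q", s.Type)
	}
	if s.Git == nil {
		return errors.New("source.git is required when type is git")
	}
	if s.Git.Repository == "" {
		return errors.New("source.git.repository is required")
	}
	set := 0
	for _, v := range []string{s.Git.Ref.Branch, s.Git.Ref.Tag, s.Git.Ref.Commit} {
		if v != "" {
			set++
		}
	}
	if set != 1 {
		return fmt.Errorf("source.git.reference: exactly one of branch, tag, commit must be set, got %d", set)
	}
	return nil
}

func main() {
	src := BundleSource{
		Type: SourceTypeGit,
		Git: &GitSource{
			Repository: "https://github.com/operator-framework/combo",
			Directory:  "/deploy/manifests",
			Ref:        GitRef{Tag: "0.1.0"},
		},
	}
	fmt.Println(src.Validate()) // only tag is set, so validation passes

	src.Git.Ref.Branch = "dev"
	fmt.Println(src.Validate()) // branch and tag are both set, so validation fails
}
```

A union with an explicit `type` discriminator keeps each source's knobs in its own struct, so a future source type would add a new member without touching the git fields.
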
The following steps can occur:

0. The user creates a Bundle with `source.git` set on the spec
1. The Bundle controller creates a pod that uses the same init container, which installs the unpacker binary onto the `/util` volume
2. The primary container image can be something like `bitnami/git:latest` or any image that has git installed
3. The primary container should have the init container volume mounted so the unpacker is available
4. Execute a command that clones the given `spec.source.git.repository`, checks out the indicated `spec.source.git.reference`, and runs the unpacker binary against the `spec.source.git.directory`
5. Once that container exits successfully, the rest of the unpacking process can occur (read pod logs, persist to storage)

## Open Questions

* Auth (support private repositories) - top level or part of git struct?
* Manifest locations (not in /manifests?)
* Support alternate git providers (not GitHub? Is this an issue?)
* Additional knobs (submodules, etc)
* Tim: what happens when new changes are checked into the configured git.directory repository path?
* Tim: for the plain provisioner, what happens when the configured git.directory contains nested directories?
* Tim: what happens if the configured git.directory is a submodule?

## Alternate Design: Git URI

```yaml
kind: Bundle
metadata:
  name: plumbus-operator.v0.9.3
spec:
  provisionerClassName: core.rukpak.io/plain
  source:
    type: generic # required, one-of
    uri: git://github.com/torvalds/linux.git
```

The URI-based approach leads to a generic source type (`source.type: generic`) that supports content from different sources based on the URI provided. This keeps the spec of the Bundle smaller and more agnostic to what each content source requires. However, there are many details that users may want to provide when configuring their content source, and it's not clear whether a URI can capture all the information required.
A URI-based scheme would potentially still require additional spec fields to capture that information (such as which directory contains the static manifests, git-specific options, etc). If the scope of the `generic` source type had to grow to support all knobs from all providers, it would be less valuable. If the `generic` source type existed only to support simple URI-based content sources, with no additional inputs, it would be more valuable. It risks duplicating support for content sources (should a user use the generic type or the git type for a git bundle?), but it would work well when a user simply wants to provide a URI and pull that content onto the cluster.