# Meta
[meta]: #meta
- Name: Export image in OCI format into layers
- Start Date: 2022-01-12
- Author(s): [jjbustamante](https://github.com/jjbustamante/)
- Status: Draft <!-- Acceptable values: Draft, Approved, On Hold, Superseded -->
- RFC Pull Request: (leave blank)
- CNB Pull Request: (leave blank)
- CNB Issue: (leave blank)
- Supersedes: (put "N/A" unless this replaces an existing RFC, then link to that RFC)

# Summary
[summary]: #summary

Allow the Lifecycle to export the output image in OCI format and save it to the file system, opening the possibility of deprecating the use of the Daemon in the future.

See requirement(s):
- [lifecycle #424](https://github.com/buildpacks/lifecycle/issues/423)

# Definitions
[definitions]: #definitions

- A [Platform](https://buildpacks.io/docs/concepts/components/platform/) uses a lifecycle, buildpacks (packaged in a builder), and application source code to produce an OCI image.
- A [Lifecycle](https://buildpacks.io/docs/concepts/components/lifecycle/) orchestrates buildpack execution, then assembles the resulting artifacts into a final app image.
- A Daemon is a service, popularized by Docker, for downloading container images, and executing and managing containers from those images.
- A Registry is a long-running service used for storing and retrieving container images.
- An [OCI Image Layout](https://github.com/opencontainers/image-spec/blob/main/image-layout.md) is the directory structure for OCI content-addressable blobs and location-addressable references.
- An [image index](https://github.com/opencontainers/image-spec/blob/main/image-index.md) is a higher-level manifest which points to a list of manifests and descriptors.
- An [Image manifest](https://github.com/opencontainers/image-spec/blob/main/manifest.md) provides a configuration and set of layers for a single container image for a specific architecture and operating system.
- A [config](https://github.com/opencontainers/image-spec/blob/main/descriptor.md) is a property that references a configuration object for a container, by digest. It must support the media type `application/vnd.oci.image.config.v1+json`.

Additionally, to document this RFC we use the [C4 model](https://c4model.com), an "abstraction-first" approach to diagramming software architecture, based upon abstractions that reflect how software architects and developers think about and build software.

<!-- Make a list of the definitions that may be useful for those reviewing. Include phrases and words that buildpack authors or other interested parties may not be familiar with. -->

# Motivation
[motivation]: #motivation

<!-- Why should we do this? -->

As the following landscape diagram shows, the lifecycle currently requires access to a Daemon or a Registry to do its job.

![](https://i.imgur.com/lsCuY8h.png)

This design makes it harder to maintain the current capabilities or to extend them, because the two storage formats differ and those differences increase the complexity of keeping the same functionality for images saved in a Daemon and images saved in a Registry. For example, the OCI image specification defines a manifest file, a concept that does not exist in the Daemon. Likewise, if we add [annotations](https://github.com/buildpacks/rfcs/pull/196), we cannot guarantee the same behavior for images saved in the Daemon.

<!--
Or to evolve lifecycle with new features, for example the following features:
- [Buildkit integration](https://github.com/EricHripko/cnbp)
- [Cosign integration](https://github.com/buildpacks/rfcs/pull/195)
- [OCI annotations](https://github.com/buildpacks/rfcs/pull/196)

Are pending to be implemented because some issues related with the Daemon.
-->

The main goal is to deprecate the use of the Daemon in the lifecycle, embrace the [OCI specification](https://github.com/opencontainers/image-spec), and move the complexity of interacting with the Daemon into the **Platform** component.

<!-- What use cases does it support? -->

Some of the use cases this feature can support are:
- TBD
- TBD

<!-- What is the expected outcome? -->

The expected outcome is that the image generated by the lifecycle is saved in [OCI image layout](https://github.com/opencontainers/image-spec/blob/main/image-layout.md) format at a path configured by the user.

# What it is
[what-it-is]: #what-it-is

<!-- This provides a high level overview of the feature.

- Define any new terminology.
- Define the target persona: buildpack author, buildpack user, platform operator, platform implementor, and/or project contributor.
-->

The general idea is to produce an [OCI image layout](https://github.com/opencontainers/image-spec/blob/main/image-layout.md) and save it in a file system accessible from the lifecycle execution.

The proposal targets the Platform implementor because it delegates the following responsibilities to them:
- Pull the required dependencies (the runtime image, for example) and pass them to the lifecycle
- Save the resulting image to the daemon

Let's see the updated landscape diagram after implementing the proposal.

![](https://i.imgur.com/7xPpyMZ.png)

The integration between the *Lifecycle* and the *Daemon* is gone, and the *Platform* component now has more responsibilities: it no longer just forwards requests to the lifecycle.

**Note** In this proposal the *Lifecycle* only interacts with filesystem storage, which avoids the complexity of determining whether an image is in a Registry or on the filesystem.

# How it Works
[how-it-works]: #how-it-works

As we saw in the previous landscape diagram, the proposal involves changing the *Platform* and the *Lifecycle*. Let's look at both in detail.
## Platform

This component takes on the responsibility of interacting with the Daemon to make the *Developer* experience easier. Let's suppose it prepares the images using a tool similar to [skopeo](https://github.com/containers/skopeo). The process looks like this:

```shell
# Copy the run image from a daemon and save it locally
> skopeo copy docker-daemon:alpine:latest oci:alpine:latest
Getting image source signatures
Copying blob 8d3ac3489996 done
Copying config d539cd357a done
Writing manifest to image destination
Storing signatures

> ls
alpine

# The structure in oci layout format
> tree .
.
└── alpine/
    ├── blobs/
    │   └── sha256/
    │       ├── 03..
    │       ├── 69..
    │       └── a0..
    ├── index.json
    └── oci-layout
```

<!--
Let's check the format of some important files

#### Index file

```json
{
  "schemaVersion": 2,
  "manifests": [
    {
      "mediaType": "application/vnd.oci.image.manifest.v1+json",
      "digest": "sha256:03014f0323753134bf6399ffbe26dcd75e89c6a7429adfab392d64706649f07b",
      "size": 348
    }
  ]
}
```

#### Manifest file

```json
{
  "schemaVersion": 2,
  "config": {
    "mediaType": "application/vnd.oci.image.config.v1+json",
    "digest": "sha256:696d33ca1510966c426bdcc0daf05f75990d68c4eb820f615edccf7b971935e7",
    "size": 585
  },
  "layers": [
    {
      "mediaType": "application/vnd.oci.image.layer.v1.tar+gzip",
      "digest": "sha256:a0d0a0d46f8b52473982a3c466318f479767577551a53ffc9074c9fa7035982e",
      "size": 2814446
    }
  ]
}
```

#### Config file

```json
{
  "created": "2021-08-27T17:19:45.758611523Z",
  "architecture": "amd64",
  "os": "linux",
  "config": {
    "Env": [
      "PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin"
    ],
    "Cmd": [
      "/bin/sh"
    ]
  },
  "rootfs": {
    "type": "layers",
    "diff_ids": [
      "sha256:e2eb06d8af8218cfec8210147357a68b7e13f7c485b991c288c2d01dc228bb68"
    ]
  },
  "history": [
    {
      "created": "2021-08-27T17:19:45.553092363Z",
      "created_by": "/bin/sh -c #(nop) ADD file:aad4290d27580cc1a094ffaf98c3ca2fc5d699fe695dfb8e6e9fac20f1129450 in / "
    },
    {
      "created": "2021-08-27T17:19:45.758611523Z",
      "created_by": "/bin/sh
-c #(nop) CMD [\"/bin/sh\"]",
      "empty_layer": true
    }
  ]
}
```
-->

This information will be sent to the *Lifecycle* during the invocation.

Once the output from the *Lifecycle* is received, the *Platform* saves it into the Daemon. Let's suppose the output image is saved in a folder named `oci-dir`:

```shell
.
└── oci-dir/
    ├── blobs/
    │   └── sha256/
    │       ├── 01..
    │       ├── 55..
    │       └── 86..
    ├── index.json
    └── oci-layout
```

The *Platform* will push that image into a Daemon. An example of that process, using a tool similar to [skopeo](https://github.com/containers/skopeo), looks as follows:

```shell
> skopeo copy oci:oci-dir:latest docker-daemon:my-oci-app:latest
Getting image source signatures
Copying blob e4ca327ec0e7 done
Copying blob 55fae2d3d3bc done
Copying blob 6faf60e5f26d done
Copying blob e6c238050bf1 done
Copying blob baa9584706b2 done
Copying blob 963a25eb7bff done
Copying blob fc094da8ae34 done
Copying blob 9e1b79ed05fb done
Copying config 01b31028ae done
Writing manifest to image destination
Storing signatures

> docker images | grep my-oci-app
my-oci-app   latest   01b31028ae5e   N/A   300MB
```

## Lifecycle

<!-- This is the technical portion of the RFC, where you explain the design in sufficient detail.

The section should return to the examples given in the previous section, and explain more fully how the detailed proposal makes those examples work. -->

The lifecycle phases affected by this new behavior are [Analyze](https://buildpacks.io/docs/concepts/components/lifecycle/analyze/), [Export](https://buildpacks.io/docs/concepts/components/lifecycle/export/) and [Create](https://buildpacks.io/docs/concepts/components/lifecycle/create/).

The following new input is proposed for these phases:

| Input | Environment Variable | Default Value | Description |
|-------------------|-----------------------|--------------------------|----------------------|
| `<layout>` | `CNB_LAYOUT_DIR` | "" | The root directory where all the OCI images will be located, including the output image. A non-empty value for this environment variable enables the feature. |

As we saw before, the *Platform* needs to provide the *Lifecycle* with a store containing all the images needed to build the application. The proposed structure for this store can be summarized as follows:

```shell
.
└── <root>/
    ├── <image-name-1>/
    │   ├── blobs/
    │   │   ├── <manifest-blob> (2)
    │   │   ├── <config-blob> (3)
    │   │   └── <layer-blob> (4)
    │   ├── oci-layout
    │   └── index.json (1)
    └── <image-name-2>/
        ├── blobs/
        │   ├── <manifest-blob>
        │   ├── <config-blob>
        │   └── <layer-blob>
        ├── oci-layout
        └── index.json
```

* The `<root>` folder will be passed to the *Lifecycle* using the new `-layout` flag or the `CNB_LAYOUT_DIR` environment variable.
* The *Lifecycle* will attempt to load the `<run-image>` or `<previous-image>` based on the `-layout` flag or `CNB_LAYOUT_DIR`, combined with the name of the image.

Let's see some examples of the expected behavior of the *Lifecycle*. For simplicity, let's suppose the root store path was set using the environment variable.
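The lookup rules described in this section can be sketched in Go. This is only an illustration under stated assumptions, not the lifecycle's implementation: the names `layoutRef`, `resolve` and `verify` are hypothetical, and the sketch assumes a tag or digest follows the first `:` after the last `/` of the image name, as in the examples of this section.

```go
package main

import (
	"encoding/json"
	"fmt"
	"path/filepath"
	"strings"
)

// layoutRef is a hypothetical helper type: where an image lives under the
// layout root and which constraint (tag or digest) must hold for it.
type layoutRef struct {
	Dir    string // <root>/<image-name>
	Tag    string // expected org.opencontainers.image.ref.name, if any
	Digest string // expected manifest digest, if any
}

// resolve maps an image name to its directory under the layout root.
func resolve(root, image string) layoutRef {
	name, suffix := image, ""
	// Only look for ':' after the last '/', so that a registry port such as
	// localhost:5000/app is not mistaken for a tag.
	slash := strings.LastIndex(image, "/")
	if i := strings.Index(image[slash+1:], ":"); i >= 0 {
		name, suffix = image[:slash+1+i], image[slash+1+i+1:]
	}
	ref := layoutRef{Dir: filepath.Join(root, name)}
	if strings.HasPrefix(suffix, "sha256:") {
		ref.Digest = suffix
	} else {
		ref.Tag = suffix
	}
	return ref
}

// index mirrors only the fields of index.json that this sketch needs.
type index struct {
	Manifests []struct {
		Digest      string            `json:"digest"`
		Annotations map[string]string `json:"annotations"`
	} `json:"manifests"`
}

// verify checks the tag or digest constraint against raw index.json content.
func verify(indexJSON []byte, ref layoutRef) error {
	var idx index
	if err := json.Unmarshal(indexJSON, &idx); err != nil {
		return fmt.Errorf("parse index.json: %w", err)
	}
	if ref.Tag == "" && ref.Digest == "" {
		return nil // nothing to enforce
	}
	for _, m := range idx.Manifests {
		if ref.Digest != "" && m.Digest == ref.Digest {
			return nil
		}
		if ref.Tag != "" && m.Annotations["org.opencontainers.image.ref.name"] == ref.Tag {
			return nil
		}
	}
	return fmt.Errorf("no manifest in %s matches the requested reference", ref.Dir)
}

func main() {
	for _, image := range []string{"some-image", "gcr.io/my-org/some-image:0.0.1"} {
		fmt.Printf("%+v\n", resolve("oci", image))
	}
}
```

Keeping the directory lookup separate from the tag or digest check means the check can run on the `index.json` bytes alone, without touching the blobs.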
```shell
export CNB_LAYOUT_DIR=oci

# phase is one of {analyzer|exporter|creator}
> cnb/lifecycle/$phase [-run-image|-previous-image]
```

| Image name provided | Expected behavior |
| -------- | -------- |
| `some-image` | Load from the `$CNB_LAYOUT_DIR/some-image` directory |
| `some-image:0.0.1` | Load from the `$CNB_LAYOUT_DIR/some-image` directory and enforce that the `org.opencontainers.image.ref.name` annotation saved in `$CNB_LAYOUT_DIR/some-image/index.json` is equal to `0.0.1` |
| `some-image:sha256:03...b` | Load from the `$CNB_LAYOUT_DIR/some-image` directory and enforce that `manifest.digest` is equal to `sha256:03...b` |
| `gcr.io/my-org/some-image` | Load from the `$CNB_LAYOUT_DIR/gcr.io/my-org/some-image` directory |
| `gcr.io/my-org/some-image:0.0.1` | Load from the `$CNB_LAYOUT_DIR/gcr.io/my-org/some-image` directory and enforce that the `org.opencontainers.image.ref.name` annotation saved in `$CNB_LAYOUT_DIR/gcr.io/my-org/some-image/index.json` is equal to `0.0.1` |
| `gcr.io/my-org/some-image:sha256:03...b` | Load from the `$CNB_LAYOUT_DIR/gcr.io/my-org/some-image` directory and enforce that `manifest.digest` is equal to `sha256:03...b` |

Once the *Lifecycle* has loaded the input image references, it keeps its current behavior. During the export phase, it saves the output in [OCI image layout](https://github.com/opencontainers/image-spec/blob/main/image-layout.md) format at the same store path defined by the user.

### Proof of concept

As part of this RFC we built a small PoC on [Lifecycle](https://github.com/buildpacks/lifecycle/pull/793) and [Pack](https://github.com/buildpacks/pack/pull/1314) that allows the user to export the image to their local machine.

```shell
> ./out/pack build oci-example --lifecycle-image \
    lifecycle-layout --builder cnbs/sample-builder:bionic \
    --verbose \
    --path /Users/jbustamante/workspace/buildpack.io/samples/apps/java-maven \
    --tag latest \
    --oci-dir .
```

The following GIF shows the output image exported to the filesystem in OCI layout format.

![](https://i.imgur.com/MgHolnW.gif)

<!-- Mention the annotation proposed by Sam at https://github.com/samj1912/rfcs/blob/annotations/text/0000-annotations.md -->

<!--
- Explaining the feature largely in terms of examples.
- If applicable, provide sample error messages, deprecation warnings, or migration guidance.
- If applicable, describe the differences between teaching this to existing users and new users.
-->

<!--
@@@ Juan's thoughts @@@@

What did we do in the PoC?

- New layout dir called /image inside the layout directory
- Environment variables
  - CNB_USE_LAYOUT // set to false by default
  - CNB_LAYOUT_DIR // by default goes into $LAYERS_DIR/image
- When CNB_USE_LAYOUT is set to true
  - Analyzed.toml MUST include OCI annotations

User's goal:

As a Developer I want to pull my source code from my SCM and run pack build .... and I expect to have a docker image loaded into my daemon and then run that image with docker run ...
-->

# Drawbacks
[drawbacks]: #drawbacks

<!-- Why should we *not* do this? -->

- A major drawback is look-ups. All known and anticipated images must be available on disk, since there is no registry or daemon to do further look-ups against during the build process.
- Exploding the images onto disk could affect performance and be very costly.

# Alternatives
[alternatives]: #alternatives

## Lifecycle hybrid mode approach

In this solution the *Lifecycle* is able to determine whether an image can be loaded from filesystem storage or from a Registry.
It is similar to the original solution proposed, but it could simplify the task of the *Platform*.

![](https://i.imgur.com/0ROGR4E.png)

### Drawbacks

* Exploding the images onto disk could affect performance and be very costly.

## Lifecycle registry only approach

In this solution the *Lifecycle* ONLY interacts with a registry to pull and push the required images. We forget about the OCI layout format, and the responsibility for interacting with the daemon is still moved to the *Platform*.

![](https://i.imgur.com/P0NxdMw.png)

### Drawbacks

*

## Lifecycle daemon wrapper approach

This solution is a variant of the registry only approach, but instead of delegating all the responsibility to the *Platform*, a new system called *Wrapper Registry* is created. This component is responsible for exposing a Registry API and also for synchronizing the data from this registry into the daemon.

The high level idea is summarized in the following landscape diagram.

![](https://i.imgur.com/dpbDSpg.png)

If we zoom in on the *Wrapper Registry* component:

![](https://i.imgur.com/qilyDoe.png)

The *Daemon Sync* component must handle the synchronization between the data saved in the ephemeral registry and the daemon.

As for implementation, there is a suggestion to consider [lazy image distribution](https://github.com/containerd/stargz-snapshotter) to try to improve the performance of the solution.

### Drawbacks

## Lifecycle pluggable architecture approach

Based on the PoC results, the hard work needed to enable exporting images to OCI layout format was implementing the [Image interface](https://github.com/buildpacks/imgutil/blob/main/image.go) in imgutil.

The idea is to use [Go plugins](https://pkg.go.dev/plugin) and turn the implementation of this interface into an external module that is injected at runtime into the *Lifecycle*.
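A rough sketch of what loading such a module might look like with the standard `plugin` package (`plugin.Open` and `Lookup`) follows. It is an assumption-laden illustration, not a proposed API: the `Image` interface is a trimmed-down stand-in for `imgutil.Image`, and the symbol name `NewImage`, the `ImageFactory` type and the file name `layout-plugin.so` are all hypothetical.

```go
package main

import (
	"errors"
	"fmt"
	"plugin"
)

// Image is a hypothetical, trimmed-down stand-in for imgutil.Image.
// In practice the host and the plugin would both import the interface from
// a shared package so that the type assertion below can succeed.
type Image interface {
	Name() string
}

// ImageFactory is the hypothetical function a plugin exports as "NewImage".
type ImageFactory = func(name string) (Image, error)

// asFactory checks that a looked-up plugin symbol has the expected signature.
func asFactory(sym plugin.Symbol) (ImageFactory, error) {
	factory, ok := sym.(ImageFactory)
	if !ok {
		return nil, errors.New("plugin symbol NewImage has an unexpected type")
	}
	return factory, nil
}

// loadFactory opens a compiled plugin (.so) and resolves its factory function.
func loadFactory(path string) (ImageFactory, error) {
	p, err := plugin.Open(path)
	if err != nil {
		return nil, fmt.Errorf("open plugin %q: %w", path, err)
	}
	sym, err := p.Lookup("NewImage")
	if err != nil {
		return nil, fmt.Errorf("lookup NewImage: %w", err)
	}
	return asFactory(sym)
}

// fakeImage lets the sketch run without a real plugin on disk.
type fakeImage string

func (f fakeImage) Name() string { return string(f) }

func main() {
	// Without a compiled plugin, loading fails and the sketch falls back to
	// an in-process factory with the same signature.
	factory, err := loadFactory("layout-plugin.so")
	if err != nil {
		fmt.Println("falling back to built-in factory:", err)
		factory, _ = asFactory(func(name string) (Image, error) {
			return fakeImage(name), nil
		})
	}
	img, _ := factory("my-app")
	fmt.Println(img.Name())
}
```

A plugin would be built separately with `go build -buildmode=plugin`; the signature check in `asFactory` is what keeps a mismatched plugin from being called with the wrong types.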
The high level idea is summarized in the following landscape diagram.

![](https://i.imgur.com/oqfVxia.png)

The *Plugin System* is external to the lifecycle because its code resides outside the *Lifecycle* implementation, and it could probably be designed as a new ecosystem, like the Buildpacks.

To make the point clear, let's take the following example [code](https://github.com/buildpacks/lifecycle/blob/4ebc4456001e540792e9eef04706864ff1faeeb4/cmd/lifecycle/analyzer.go#L235) from the Analyzer in the Lifecycle:

```go
// localOrRemote returns an image backed by the Daemon or by a Registry,
// depending on the useDaemon flag.
func (aa analyzeArgs) localOrRemote(fromImage string) (imgutil.Image, error) {
	if aa.useDaemon {
		return local.NewImage(
			fromImage,
			aa.docker,
			local.FromBaseImage(fromImage),
		)
	}

	return remote.NewImage(
		fromImage,
		aa.keychain,
		remote.FromBaseImage(fromImage),
	)
}
```

This code is a [Factory Method](https://en.wikipedia.org/wiki/Factory_method_pattern) responsible for creating an `imgutil.Image` instance. If we replaced that logic with some kind of `Plugin Engine` that delegates the instantiation of the interface to an external module, the complexity would move out of the *Lifecycle* and into those plugins.

A more detailed interaction is shown in the following container diagram of the *Lifecycle*.

![](https://i.imgur.com/TUl5hAR.png)

The *Platform* will be responsible for injecting the plugin into the *Lifecycle* executables at runtime, and those plugins will do the job of interacting with the sources of the images.

The same abstraction can be applied in other sections of the core workflow and phases to expose interfaces that can be categorized as **pluggable**, which could help the community extend lifecycle behavior easily.

### Drawbacks

* Go plugins are not supported on Windows.

## Lifecycle export ONLY layers above base image

Is there an alternative here for lifecycle exporting an OCI layout _just_ for the layers above the base image? Could the platform be responsible for outputting the base image + lifecycle output?
### Drawbacks

<!--
- What other designs have been considered?
- Why is this proposal the best?
- What is the impact of not doing this?
-->

# Prior Art
[prior-art]: #prior-art

<!-- Discuss prior art, both the good and bad. -->

# Unresolved Questions
[unresolved-questions]: #unresolved-questions

<!--
- What parts of the design do you expect to be resolved before this gets merged?
- What parts of the design do you expect to be resolved through implementation of the feature?
- What related issues do you consider out of scope for this RFC that could be addressed in the future independently of the solution that comes out of this RFC?
-->

# Spec. Changes (OPTIONAL)
[spec-changes]: #spec-changes

<!--
Does this RFC entail any proposed changes to the core specifications or extensions? If so, please document changes here.
Examples of a spec. change might be new lifecycle flags, new `buildpack.toml` fields, new fields in the buildpackage label, etc.
This section is not intended to be binding, but as discussion of an RFC unfolds, if spec changes are necessary, they should be documented here.
-->

![](https://i.imgur.com/yXsLK6N.png)

![](https://i.imgur.com/rWElkCw.png)