# KubeCon 2019 EU Notes: OCI Artifact Registries and CNAB Registry Support
While these meetings were initially split across two 1-hour blocks, the majority of people attended both and the conversations generally spanned the two hours. In the first hour we focused on the more common single-artifact case, like Helm and Singularity. In the second hour we moved into the multi-artifact case that CNAB supports.
## Attendees:
- Chris Crone - CNAB developer tooling Engineering manager
- Simon Ferquel - Docker tooling in Paris - CNAB maintainer, Docker App & Docker Desktop
- George from Docker Tooling (Developer Solutions Group)
- Kohel ? - researcher for registries - speeding up container distribution
- Daniel Jiang & Steven Ren from VMware, representing Harbor
- Jimmy Zelinskie - Quay, OCI maintainer. Worked with Antoine for App Registry
- Michael Brown - IBM, OCI maintainer
- Joey Schorr - Quay - co-founder & tech lead
- Interests in coming up with solutions that scale. Any solution must work from thousands to billions
- Dirk Herrmann - Red Hat - PM for Quay
- Kenneth Brooks - JPMorgan Chase
- End user - having to stand up various registries and artifact registries
- Would really like to manage less of these
- Radu Matei - Microsoft, working on CNAB
- Matt Fisher - Microsoft working on CNAB
- Carolyn Van Slyck - Microsoft working on Porter
- Ralph Squillace - Microsoft, PM on CNAB
- Simon Davies - Microsoft, CNAB and Marketplace
- Gareth Rushgrove - Snyk - formerly worked on CNAB while at Docker
- Maya Even-Shani - Twistlock
## Notes
Having a full range of views from registry owners (Quay, ACR), maintainers of OCI, Artifact Owners (CNAB & Helm), Security Scanners (Snyk & Twistlock) and end users (Ken from JP Morgan) provided for a vibrant and full range of conversations.
- Steve presented an overview of our goals for OCI Artifact Registries
- Customers like JP Morgan can work with fewer systems they must maintain
- Clouds like Azure, Google, AWS and products like Quay & Harbor and others would need to maintain fewer codebases, leveraging the investments to provide a broader set of artifacts
- The content-addressable way to pull and run images can be applied to new artifact types (Helm, Singularity, CNAB, OPA and others, possibly Build Packs)
- Not looking to replace the existing systems like NPM and NuGet - if it ain't broke, don’t fix it.
- Provides for an offering for artifact owners to decide.
- Believes the registry API and the core capability is a commodity that we can collaborate upon, with value added above the core registry capabilities
- Goal to enable any and all artifact types with minimal, possibly no code changes.
- Wrapping up distribution 1.0 this year. Looking for the minimal changes required to ship 1.0. Other things, like the search & catalog APIs may be 2.0.
### PowerPoint decks presented
- [OCIArtifactRegistries-QuickIntro.pptx - Specific Content to this meeting](https://github.com/SteveLasker/Presentations/blob/master/ACR/OCIArtifactRegistries-QuickIntro.pptx)
- [OCIArtifactRegistries.pptx - Cloud Native Rejekts Longer Version](https://github.com/SteveLasker/Presentations/blob/master/ACR/OCIArtifactRegistries.pptx)
### Discussion
- Some discussion whether everything in the registry was a docker image, possibly built from scratch
- Chris: Are we just building yet another blob store?
- Daniel: Would like Harbor to have various artifact support. Already have it on the backlog, this proposal gives a common solution
- Discussed the blob store is fronted by a REST api for discovery, a common addressable pattern for identifying the artifacts, and a multi-layer blob storage system that can be generally applied.
- The blobs don't need to be docker images or docker layers. It's really up to the artifact author and tooling around that artifact to decide
- The blob layers don't even need to be ordinal. It's up to the artifact owner.
- Gareth - OPA stores rego files individually in the blob. They may have another layer type as well.
- Steve - Singularity stores a SIF layer. Helm stores charts in a tar, but a meta file in a json. It's really up to the owner. Stephen Levine from Pivotal, working on Build Packs suggested they may store docker layers in their build pack artifact, but the layers are staged for re-targeting manifests to be patched docker images.
- Joey - Quay has been supporting various types in App Registry and would really like to generalize on a common approach that can scale to millions. If it doesn't scale to millions, it's not a good solution.
- Joey - images that don't have tags are in a recycle bin status. Any image that doesn't have a tag isn't valid. You can't pull an image by digest if it doesn't have a tag
- Maya, Gareth, Steve, Joey - discussed the value of security scanners knowing the format of the artifact so they know how to process and scan the various types. Being able to scan fewer products allows them to provide higher value
### How to identify the type:
Discussion on why we're not changing the `manifest.mediaType`, but rather using the `manifest.config.mediaType`
- The schema of OCI Manifest fits the majority of artifacts. All artifacts are capable of being represented by 1 or more blobs, and an optional config. If we change the manifest.mediaType, registries would need to know different schemas, making it harder for them to onboard new types.
- Steve and Jimmy shared that a previous proposal to add manifest.artifactType was considered too impactful, and that we had agreed to use manifest.config.mediaType instead
- Group supported the config object, with optionally providing content in the config
- Joey - Quay parses the config on push, indexing information for docker images
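As a minimal sketch of the approach discussed above, the manifest below keeps the standard OCI image manifest `mediaType` and carries the artifact type in `config.mediaType`. The chart media types, digests and sizes are illustrative placeholders, and `artifact_type` is a hypothetical helper, not part of any registry or client API.

```python
OCI_MANIFEST = "application/vnd.oci.image.manifest.v1+json"


def artifact_type(manifest: dict) -> str:
    """Return the artifact type, read from manifest.config.mediaType."""
    return manifest["config"]["mediaType"]


# Hypothetical chart artifact: media types, digests and sizes are placeholders.
manifest = {
    "schemaVersion": 2,
    # The manifest schema itself is unchanged, so registries parse it as usual.
    "mediaType": OCI_MANIFEST,
    "config": {
        # The artifact is identified here rather than in manifest.mediaType.
        "mediaType": "application/vnd.example.chart.config.v1+json",
        "digest": "sha256:" + "a" * 64,
        "size": 117,
    },
    "layers": [
        {
            "mediaType": "application/vnd.example.chart.content.v1.tar",
            "digest": "sha256:" + "b" * 64,
            "size": 2048,
        }
    ],
}

print(artifact_type(manifest))
```

Because only the config descriptor differs between artifact types, a registry that already understands the manifest schema can store the artifact without learning a new schema.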
## How a registry supports new types
- Steve: we'd really like this to be super easy. It shouldn't be limited to the big types that can justify the investment of each registry. Smaller types, or types that are under incubation should be able to make a PR to a registry.
- Chris: expressed concern about registries having to get a json file to support new types
- Steve, Joey, Gareth: knowing the type is important for the scanning companies to know how to scan them, and for the registries to provide optional value on the type. We really don't want to be an unknown blob store.
- Joey - Quay supports on prem. On prem customers should be able to reconfigure Quay to support whatever they want. Ken (JP Morgan) affirmed
- Jimmy, Steve - we may want to have a defined "unknown" type for general use, but if we can make it easy to onboard new types, this shouldn’t be needed, and it could be abused
### Discussion on json file
Provides the minimal information required to represent an artifact.
- manifest.config.mediaType value to uniquely identify it
- Layer.mediaTypes
- The name of the artifact, as displayed to humans
- A logo, in svg format
- An optional schema for the config object
- Version
- schemas and ease of config
- Should these be per registry, per repo?
- Agreed this is a later discussion, possibly up to the registry to decide
- Do we need a registry api to import this?
- Agreed, maybe, but not yet. If we can agree on a minimal json payload, we can later decide how easy we want/need to make this for users to onboard new types.
- Deferred discussion on a common place to maintain these definitions
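The minimal payload listed above could look something like the following sketch. No field names were agreed in the meeting, so every key in `definition` (`configMediaType`, `layerMediaTypes`, `title`, `logo`, `configSchema`, `version`) is an assumption, as are the example media types and the `check_manifest` helper.

```python
# Hypothetical artifact type definition; all field names are assumptions.
definition = {
    "title": "Example Chart",  # name displayed to humans
    "version": "0.1.0",
    "configMediaType": "application/vnd.example.chart.config.v1+json",
    "layerMediaTypes": ["application/vnd.example.chart.content.v1.tar"],
    "logo": "<svg xmlns='http://www.w3.org/2000/svg'/>",
    "configSchema": None,  # optional schema for the config object
}


def check_manifest(manifest: dict, definition: dict) -> list:
    """Return a list of problems; an empty list means the manifest matches."""
    problems = []
    config_type = manifest.get("config", {}).get("mediaType")
    if config_type != definition["configMediaType"]:
        problems.append(f"unexpected config.mediaType: {config_type}")
    allowed = set(definition["layerMediaTypes"])
    for layer in manifest.get("layers", []):
        layer_type = layer.get("mediaType")
        if layer_type not in allowed:
            problems.append(f"unexpected layer mediaType: {layer_type}")
    return problems


good = {
    "config": {"mediaType": "application/vnd.example.chart.config.v1+json"},
    "layers": [{"mediaType": "application/vnd.example.chart.content.v1.tar"}],
}
bad = {
    "config": {"mediaType": "application/vnd.example.chart.config.v1+json"},
    "layers": [{"mediaType": "application/vnd.example.other.v1"}],
}
print(check_manifest(good, definition))
print(check_manifest(bad, definition))
```

A registry could load definitions like this per registry or per repo, which is the question the group deferred.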
## Cross Artifact Type Client Tooling Support
With the question and concern over using different types, we had a quick discussion on what the behavior should be.
- Simon (docker) - concern about docker client pulling a manifest that's not a docker image
- Jimmy: the spec says tooling should ignore types they don't understand
- Gareth - confirmed pulling the OPA artifact from ACR with the docker client. It failed properly, as it recognized that the artifact had types it didn't support.
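The behavior described above, where a client fails cleanly on a type it does not understand, can be sketched as a check on `config.mediaType` before any layer is fetched. This is not how the docker client is implemented; `ensure_supported` and `UnsupportedArtifactType` are hypothetical names, and the OPA media type is a placeholder.

```python
class UnsupportedArtifactType(Exception):
    """Raised when a client pulls a manifest whose type it does not understand."""


# Config media types an image-oriented client would plausibly accept.
SUPPORTED_CONFIG_TYPES = frozenset({
    "application/vnd.docker.container.image.v1+json",
    "application/vnd.oci.image.config.v1+json",
})


def ensure_supported(manifest: dict, supported=SUPPORTED_CONFIG_TYPES) -> str:
    """Return the config media type, or raise before any layer is downloaded."""
    config_type = manifest.get("config", {}).get("mediaType")
    if config_type not in supported:
        raise UnsupportedArtifactType(f"cannot handle artifact type: {config_type}")
    return config_type


image = {"config": {"mediaType": "application/vnd.oci.image.config.v1+json"}}
policy = {"config": {"mediaType": "application/vnd.example.opa.config.v1+json"}}

print(ensure_supported(image))
try:
    ensure_supported(policy)
except UnsupportedArtifactType as err:
    print(err)
```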
## Index Conversations
We first dove into details, debating the current design and limitations.
- Should Image be used for things other than images?
- Joey: expressed concern that a single repo shouldn't represent more than one type.
- Gareth or Chris: some repos have multiple things, and it's their choice. They get represented as different tags.
- Steve: had previously agreed, and worked to separate the various types for Microsoft images. Dotnet had runtimes, sdks, dependencies for all the supported platforms in one repo. It was difficult and split up under mcr.
- Joey - RBAC on the repo can be permission based. The user may only have access to the bundle, not the individual images. Having the option for different RBAC roles on different repos is a benefit some customers may need.
- After realizing this may be a registry choice, and registries may likely get forced into supporting different types, we put this topic aside.
- We agreed to back up and discuss requirements
- Radu
- Need to store different things in a bundle
- "Thin" bundles do not need to include the actual artifacts. They reference artifacts in various locations.
- Need the bundle and the individual artifacts to be signed
- Need the artifacts to be in the same registry, or reference artifacts in other registries
- Want the same manifest type for local or remote references as they're the same type, just location is different
- Steve: We're already seeing images be pulled from different locations. The wordpress image is sourced from docker hub. The microsoft images are found on docker hub, but pulled from mcr.microsoft.com/productgroup/image
- Radu: a customer shouldn't have to push all the images (artifacts), just to push a bundle definition
- Foreign artifact conversation
- Steve and others - Windows and RHEL already utilize foreign layer support for legal software distribution reasons.
- Should extend this to foreign artifacts as well
- General discussion on the concern over garbage collection. If we know these are foreign, registries can do the right thing for gc
- Jimmy: OCI Index and OCI Manifest schemas don't need to change. They can be used as is
- Joey - OCI Index is a problem in that it can mean different things on different machines - the multi-arch problem
- Steve - agree this can be a problem for some, and "friends shouldn't let friends build against :latest". But, there is value for running an image, and the machine knowing it can pull a windows, linux or arm image. And, this cat is out of the bag.
- Joey - large orgs will build the components of a CNAB with different teams. Keeping them separate allows team flexibility.
- Kenneth - JP Morgan sets policy on different artifacts
- Joey, Radu, … side discussion on having a remote flag on the Index.manifest to state the manifest is remote. This allows registries to manage garbage collection.
- Index vs. Manifest to reference other sub types
- We agree CNAB needs to reference multiple artifact types. Today, this is a single invocation image that contains many of the types it deploys - such as Helm or ARM (Azure Resource Manager) templates. But it also references the collection of images it would deploy.
- If we didn't use Index, could the CNAB bundle that references the different images be a manifest?
- How would it reference the images? Would the bundle be in the manifest.config?
- How would a registry do garbage collection? Registries would need to parse the config, for this specific artifact type.
- If Artifact Registries are to easily support new types, with minimal to no registry code changes, how does this meet our goals?
- Why not use Index?
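The garbage collection argument for Index can be sketched as a mark-and-sweep over generic descriptors: the registry follows `manifests`, `config` and `layers` without parsing any artifact-specific config. The `remote` key stands in for the remote flag mentioned in the side discussion; it is an assumption, not an existing OCI field, and the short digest strings are placeholders.

```python
def reachable(store: dict, tagged_roots: list) -> set:
    """Mark phase: collect every local digest reachable from tagged roots."""
    seen = set()
    stack = list(tagged_roots)
    while stack:
        digest = stack.pop()
        if digest in seen or digest not in store:
            continue
        seen.add(digest)
        doc = store[digest]
        for entry in doc.get("manifests", []):  # index -> manifests
            if entry.get("remote"):  # hypothetical flag: content lives elsewhere
                continue
            stack.append(entry["digest"])
        if doc.get("config"):  # manifest -> config blob
            stack.append(doc["config"]["digest"])
        for layer in doc.get("layers", []):  # manifest -> layer blobs
            stack.append(layer["digest"])
    return seen


def sweep(store: dict, tagged_roots: list) -> list:
    """Sweep phase: digests that nothing tagged refers to."""
    return sorted(set(store) - reachable(store, tagged_roots))


# Placeholder digests; plain blobs are stored as empty dicts.
store = {
    "idx": {"manifests": [{"digest": "m1"}, {"digest": "m2", "remote": True}]},
    "m1": {"config": {"digest": "c1"}, "layers": [{"digest": "l1"}]},
    "c1": {},
    "l1": {},
    "orphan": {},
}
print(sweep(store, ["idx"]))
```

If the bundle instead listed its images inside `manifest.config`, the mark phase would need type-specific parsing, which is the concern raised in the questions above.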
## Updating an index conversation
- ?: Concern: if an index can contain multiple artifacts, then each artifact must know how to update an index
- Steve, Joey: if an index is an artifact type, then only that artifact tooling would need to update the index. Users can push new manifests to the same repo, but it doesn't mean the index must be updated. That's a choice of the index owning tool.
### How to identify an index is a specific artifact type
With agreement on how manifests are uniquely identified, that tooling should only support the types it understands, and that index is better for registry garbage collection, we discussed how an index would be typed.
- Joey - Index either needs to be typed, or not used. Too difficult to understand how to update the index if it doesn't have an owner type
- If an index was typed, each client would ignore the types it doesn't understand.
- Similar to the manifest conversation, we didn't want to change the index.mediaType as the index schema works as is
- Docker client and other tooling: If an index is pulled, and it can't find a docker image manifest, for the platform it supports, it already knows how to fail. This same logic would be used for non docker/oci indexes
- Should we add artifactType or another means to index to identify its type?
- CNAB could potentially use an index.config for storing bundle.json content.
- If `config.mediaType` works for manifest, why not use this same object for Index? The `config.json` file would be optional, just as it is with manifest. But the type would be uniquely named
After walking the various scenarios and alternatives, we had general consensus that adding `index.config.mediaType` was the proposal
- Mike Brown: agree, we should rev the index schema to 2.1
- Jimmy: also agree
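A sketch of what the proposed `index.config.mediaType` might look like follows. The `config` object on an index is the proposal from this meeting, not part of the current OCI index schema, and the CNAB media type, digests and sizes are placeholders. `index_type` is a hypothetical helper.

```python
OCI_INDEX = "application/vnd.oci.image.index.v1+json"
OCI_MANIFEST = "application/vnd.oci.image.manifest.v1+json"


def index_type(index: dict):
    """Return the proposed index.config.mediaType, or None for an untyped index."""
    config = index.get("config")
    return config.get("mediaType") if config else None


# Proposed shape: a typed index owned by one artifact's tooling.
cnab_index = {
    "schemaVersion": 2,
    "mediaType": OCI_INDEX,  # unchanged, as with manifests
    "config": {  # proposed addition; the config content itself stays optional
        "mediaType": "application/vnd.example.cnab.config.v1+json",
        "digest": "sha256:" + "c" * 64,
        "size": 512,
    },
    "manifests": [
        {"mediaType": OCI_MANIFEST, "digest": "sha256:" + "d" * 64, "size": 700},
    ],
}

# Today's multi-arch index has no config, so it reports no owner type.
multi_arch_index = {"schemaVersion": 2, "mediaType": OCI_INDEX, "manifests": []}

print(index_type(cnab_index))
print(index_type(multi_arch_index))
```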
## Consensus Amongst Attendees
We realize not all vested parties could attend; even those at KubeCon had conflicts. Below is a list of things we, as a representative group, agreed upon. The next step is to summarize for the larger group on the dev alias and on the next call, making the following proposals and engaging a larger discussion.
Steve will make a PR on distribution to start comments and feedback.
- `manifest.config.mediaType` is the means to identify the type of artifact - the distribution spec doesn't change its requirements, rather explains the usage of types, how registries would validate these, and how authors would name their types.
- `manifest.config` file is optional. An artifact author could send a null pointer, but still define the `config.mediaType` in the manifest. Distribution spec clarifies usage pattern.
- config schemas can be optionally parsed by registries to provide value, based on the type. Docker and OCI Images declare which platform they support. It's a big enough type that registries will want to invest
- Using Index for multiple artifacts enables registries to manage garbage collection
- Adding index.config would be a great consistent model - rev the schema to 2.1. This is the larger, more impactful change
- For the immediate case, to unblock CNAB, we may use an annotation, but would like to quickly move forward with a config object
- Index manifests can be foreign, or externally referenceable. We need more discussion as we ran out of time. Joey, Jimmy, Radu and a few others had some ideas already in mind.
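For the interim annotation approach mentioned above, tooling could resolve an index's type by preferring a config object when one exists and falling back to an annotation otherwise. The annotation key `org.example.artifact.type` and the CNAB media type are made up for illustration; no key was chosen in the meeting.

```python
# Hypothetical annotation key; the meeting did not settle on a name.
TYPE_ANNOTATION = "org.example.artifact.type"


def resolve_index_type(index: dict):
    """Prefer the proposed index.config.mediaType, else the interim annotation."""
    config = index.get("config")
    if config and config.get("mediaType"):
        return config["mediaType"]
    return index.get("annotations", {}).get(TYPE_ANNOTATION)


# Interim form: the type travels in the index's existing annotations map.
interim = {
    "schemaVersion": 2,
    "manifests": [],
    "annotations": {TYPE_ANNOTATION: "application/vnd.example.cnab.config.v1+json"},
}

# Later form: the same type carried by a config object.
with_config = {
    "schemaVersion": 2,
    "manifests": [],
    "config": {"mediaType": "application/vnd.example.cnab.config.v1+json"},
}

print(resolve_index_type(interim))
print(resolve_index_type(with_config))
```

Checking both places would let tooling move from the annotation to a config object without breaking indexes that were pushed earlier.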
## CNAB Notes from Radu
https://hackmd.io/s/HyX_1znTV