# OCI Artifact Search Requirements
**Note:** This doc is a continuation of [Registries To Become Cloud Native Artifact Stores](https://hackmd.io/Jk2XCLP2S2y8AfdXJdRLrw)
As registries support multiple artifact types, a search/catalog API that supports filtering on the artifact type will be needed.
The Docker v1 registry spec supported Docker Search. While some vendors, such as Quay.io, implemented the v1 search API, the majority of vendors require the v2 registry API, which dropped search.
We believe revisiting the search API will support client CLIs that span registries, such as `helm search`, `duffle search` (CNAB), `docker search`, and other evolving artifact types.
By supporting a common search API across all registries, users could consistently use these new artifact CLIs across all registries.
## Use Cases
Search is a generic capability used across several different use cases.
- [Artifact Tool Specific Searches](#Tool-Specific-Searches)
- [Registry Specific Searches](#Registry-Specific-Search)
- [Registry Tools Searches](#Registry-Tool-Search---Scanners)
### Tool Specific Searches
Helm, CNAB, Docker, Terraform and other tools will need to query their specific artifact types across various registries.
**Example:** The Helm CLI would need to query a registry for charts that match a specific name. The result should return only Helm artifacts.
```
helm search demo42.azurecr.io hello-world
Results
--------------------------------------
samples/hello-world
marketing/products/hello-world-sample
dev/prototypes/sample-hello-world
```
Version specific searches:
```
helm search --versions demo42.azurecr.io samples/hello-world
Results
--------------------------------------
samples/hello-world 1.0
samples/hello-world 1.1
samples/hello-world 1.2
```
### Registry Specific Search
Users want to query registries for the artifacts that match a specific name, or list the artifacts within a given path. In this case, the results contain multiple artifact types.
Today, registries have created unique client APIs and server APIs. Until we have a generic registry client, it's expected that registries will have vendor-specific APIs. However, having common server-side registry APIs expands the possibility for common tooling across registries.
A registry search API would include:
- repo listings
- tag/version listings
- limit by artifact type
- query by date range, such as what's changed/added since a given timestamp
- as results may be paged, sorting the results by name and/or version with ascending and descending options
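To make the list above concrete, here is a minimal sketch of how a client might bundle those options into a single request. Nothing here is a proposed API: the `SearchQuery` class and every parameter name (`name`, `type`, `since`, `sort`, `order`, `n`) are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional
from urllib.parse import urlencode


@dataclass
class SearchQuery:
    """Hypothetical search request; every field name is an assumption."""
    name: Optional[str] = None            # name or path to match
    artifact_type: Optional[str] = None   # limit results to one artifact type
    changed_since: Optional[str] = None   # ISO 8601 timestamp lower bound
    sort_by: str = "name"                 # "name" or "version"
    descending: bool = False
    page_size: int = 100

    def to_query_string(self) -> str:
        params = {
            "sort": self.sort_by,
            "order": "desc" if self.descending else "asc",
            "n": self.page_size,
        }
        if self.name is not None:
            params["name"] = self.name
        if self.artifact_type is not None:
            params["type"] = self.artifact_type
        if self.changed_since is not None:
            params["since"] = self.changed_since
        # Sort keys so the same query always yields the same string.
        return urlencode(sorted(params.items()))


query = SearchQuery(name="hello-world", artifact_type="helm")
print(query.to_query_string())
```

Keeping every option in one query object would let each artifact CLI build the same request regardless of which registry it targets.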
#### Existing examples
**Azure Container Registry list repo example:**
Without a common search/catalog API, cloud vendors have had to implement vendor specific experiences:
```
az acr repository list -n demo42
Name
-----------------------------
samples/demo42/queueworker
samples/demo42/quotes-api
samples/demo42/web
samples/demo42/deploy/chart
samples/demo42/deploy/cnab
samples/demo42/deploy/arm
```
**Azure Container Registry list tags example, w/ *future* type added:**
```
az acr repository show-tags -n demo42 --repository samples/demo42/deploy/chart
Result Type
-------------
1.0 chart
1.1 chart
1.1.1 chart
2.0 chart
3.0 chart
```
A repo could contain multiple artifact types:
```
az acr repository show-tags -n demo42 --repository samples/demo42/deploy
Result Type
------------ ----------------
helm-1.0 chart
helm-1.1 chart
helm-1.1.1 chart
cnab-1.0 cnab
arm-1.0 arm
cft-1.0 cloud-formation
```
Rather than each registry vendor having to offer unique APIs, the goal would be to offer a common API.
### Registry Tool Search - Scanners
Vendors and the community have attempted to build tools atop registries.
- [Image Layers](https://imagelayers.io/)
- [Dive – A tool for exploring each layer in a Docker image | Hacker News](https://news.ycombinator.com/item?id=18528423)
- [Analyze And Explore The Contents Of Docker Images](https://www.ostechnix.com/how-to-analyze-and-explore-the-contents-of-docker-images/)
- [10+ top open-source tools for Docker security | TechBeacon](https://techbeacon.com/security/10-top-open-source-tools-docker-security)
Without a common search/catalog API, these tools must work with individual images.
Among the most common registry tools are image scanners like [Aqua](https://aquasec.com), [Twistlock](https://twistlock.com), [Neuvector](https://neuvector.com/) and [Clair](https://github.com/coreos/clair).
While the scanning tools protect runtime nodes, they all pre-scan registries to understand image vulnerabilities before they're run.
Scanners evaluate images in registries with a combination of a search/catalog API and events.
These vulnerability scanners need the following:
- list all repos and tags for the initial scan evaluation
- get paged results as they may contain thousands of images
- periodically list all new and updated images and tags, to keep the scan results for a registry up to date
- register for events to scan images as they arrive, possibly using [The Container Quarantine Pattern](https://github.com/AzureCR/QuarantinePattern-Spec/)
- filter, or at least understand the different artifact types
- as new [CVEs](https://cve.mitre.org/) are found, re-scan the registry
Today, scanners assume all artifacts in a registry are container images. As registries store new artifact types, scanners will either need to know how to scan these new artifacts, or at least filter the results to the artifacts they support.
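As a rough illustration of that filtering step, the sketch below splits a catalog listing into artifacts a scanner supports and artifacts it should skip. The `mediaType` field on each result and the `split_by_support` helper are assumptions; no registry API returns this exact shape today.

```python
# Media types this hypothetical scanner knows how to analyze.
SCANNABLE_MEDIA_TYPES = {
    "application/vnd.oci.image.index.v1+json",
    "application/vnd.oci.image.manifest.v1+json",
}


def split_by_support(artifacts):
    """Return (scannable, skipped) lists based on each artifact's media type."""
    scannable, skipped = [], []
    for artifact in artifacts:
        if artifact["mediaType"] in SCANNABLE_MEDIA_TYPES:
            scannable.append(artifact)
        else:
            skipped.append(artifact)
    return scannable, skipped


# Illustrative catalog entries; the result shape is an assumption.
catalog = [
    {"name": "samples/demo42/web",
     "mediaType": "application/vnd.oci.image.manifest.v1+json"},
    {"name": "samples/demo42/deploy/chart",
     "mediaType": "application/vnd.cncf.helm.index.v3+json"},
]

scannable, skipped = split_by_support(catalog)
print([a["name"] for a in scannable])
print([a["name"] for a in skipped])
```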
## Artifact Types
A registry must know the types it hosts for it to provide meaningful search results.
Artifact types will be internally identified by an expanded set of [OCI Media Types](https://github.com/SteveLasker/RegistryArtifactTypes/blob/master/mediaTypes.md).
However, displaying `application/vnd.cncf.helm.index.v3+json` does not make for a good user experience. To provide clean user experiences, a list of artifact types, each with a short description and info on the artifact tooling, will be maintained: [Media Type Short Names](https://github.com/SteveLasker/RegistryArtifactTypes/blob/master/mediaTypes.md#media-type-short-names)
| Media Type | Display Name | Info |
|-|-|-|
|`application/vnd.oci.image.index.v1+json` | OCI Image | [Docker](https://www.docker.com/products/docker-desktop) *|
|`application/vnd.oci.image.manifest.v1+json` | OCI Image | [Docker](https://www.docker.com/products/docker-desktop) * |
|`application/vnd.cncf.helm.index.v3+json` | Helm | [Helm](https://helm.sh) |
|`application/vnd.oci.cnab.index.v1+json` | CNAB | [Duffle](https://cnab.io), [Docker-application](https://www.docker.com/search/node?keys=docker-application)|
\* most registry providers automatically convert `oci.image` manifests to the format requested by the client.
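A client could apply the table above with a simple lookup. The sketch below copies the media types and display names from the table; the `display_name` helper and its fallback of echoing an unknown media type unchanged are assumptions.

```python
# Display names copied from the table above.
DISPLAY_NAMES = {
    "application/vnd.oci.image.index.v1+json": "OCI Image",
    "application/vnd.oci.image.manifest.v1+json": "OCI Image",
    "application/vnd.cncf.helm.index.v3+json": "Helm",
    "application/vnd.oci.cnab.index.v1+json": "CNAB",
}


def display_name(media_type: str) -> str:
    """Return the short name for a media type, or the media type itself
    when it is not in the maintained list (fallback is an assumption)."""
    return DISPLAY_NAMES.get(media_type, media_type)


print(display_name("application/vnd.cncf.helm.index.v3+json"))
print(display_name("application/vnd.example.unknown.v1+json"))
```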
## Registry Search Requirements
### Listing repos
### Listing artifacts
### Listing versions
### Filtering by artifacts
### Filtering by date ranges
Search queries may specify date ranges, enabling the return of artifacts that have been created or changed since a given date and time.
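A minimal sketch of the "changed since" filter follows, assuming each result carries a timezone-aware `updated` timestamp. That field name, the `changed_since` helper, and the sample dates are all illustrative.

```python
from datetime import datetime, timezone


def changed_since(artifacts, since):
    """Keep artifacts created or changed at or after `since`."""
    return [a for a in artifacts if a["updated"] >= since]


# Sample data; names follow the examples in this doc, dates are made up.
artifacts = [
    {"name": "samples/hello-world:1.0",
     "updated": datetime(2019, 1, 10, tzinfo=timezone.utc)},
    {"name": "samples/hello-world:1.1",
     "updated": datetime(2019, 2, 20, tzinfo=timezone.utc)},
    {"name": "samples/hello-world:1.2",
     "updated": datetime(2019, 3, 5, tzinfo=timezone.utc)},
]

cutoff = datetime(2019, 2, 1, tzinfo=timezone.utc)
recent = changed_since(artifacts, cutoff)
print([a["name"] for a in recent])
```

Using timezone-aware values avoids ambiguity when clients and registries sit in different time zones.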
### Paging
Results may be paged, so that clients can retrieve a full list of artifacts.
The default page size is 100, with the ability to change the page size.
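One way to page is with a marker: the client passes the last name it received and gets the next page back. The sketch below is loosely modeled on the `n` and `last` parameters of the v2 registry catalog listing; the `page` helper, its in-memory input, and its return shape are assumptions.

```python
def page(names, page_size=100, last=None):
    """Return (items, next_marker) for one page of sorted names.

    `last` is the final name from the previous page; `next_marker` is
    None when there are no more results.
    """
    ordered = sorted(names)
    if last is not None:
        ordered = [n for n in ordered if n > last]
    items = ordered[:page_size]
    next_marker = items[-1] if len(ordered) > page_size else None
    return items, next_marker


# Walk every page of an illustrative 250-repo registry.
repos = [f"repo{i:03d}" for i in range(250)]
marker, pages = None, []
while True:
    items, marker = page(repos, last=marker)
    pages.append(items)
    if marker is None:
        break

print([len(p) for p in pages])
```

A marker keeps paging stable when repos are added between requests, which an offset-based scheme would not.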
### Sorting
As results may be paged, being able to sort provides the ability to get the top n results, based on a given sort order. Sorting includes ascending and descending.
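The "top n" idea can be sketched as below, assuming tags are purely numeric dotted versions like the `1.0`, `1.1.1` examples earlier in this doc. The `version_key` and `top_n` helpers are hypothetical, and tags such as `helm-1.0` would need a different key.

```python
def version_key(tag):
    """Turn '1.1.1' into (1, 1, 1) so versions compare numerically."""
    return tuple(int(part) for part in tag.split("."))


def top_n(tags, n, descending=False):
    """Return the first n tags in the requested version order."""
    return sorted(tags, key=version_key, reverse=descending)[:n]


tags = ["1.1", "3.0", "1.0", "2.0", "1.1.1"]
print(top_n(tags, 2, descending=True))
print(top_n(tags, 3))
```

Sorting numerically rather than as strings matters once versions reach two digits, where a plain string sort would place `10.0` before `2.0`.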
### Role Based Access Control
Search results shall be limited to the artifacts the user has read access to. The user may be a person or a service account.
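As a sketch of that rule, the filter below assumes read access is granted per repository path prefix. That permission model, along with the `can_read` and `filter_results` helpers, is an assumption; real registries may scope access differently.

```python
def can_read(repo, readable_prefixes):
    """True if `repo` equals, or sits under, a path the caller can read."""
    for prefix in readable_prefixes:
        prefix = prefix.rstrip("/")
        # Require a path boundary so "samples" does not match "samples-private".
        if repo == prefix or repo.startswith(prefix + "/"):
            return True
    return False


def filter_results(results, readable_prefixes):
    """Drop search results the caller has no read access to."""
    return [r for r in results if can_read(r, readable_prefixes)]


results = [
    "samples/hello-world",
    "marketing/products/hello-world-sample",
    "dev/prototypes/sample-hello-world",
]
visible = filter_results(results, ["samples", "dev/prototypes"])
print(visible)
```

Filtering on the server, before paging and sorting are applied, keeps page sizes consistent and avoids leaking the existence of repos the caller cannot read.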
## Possible Technologies & Solutions
Stephen Day suggested: [GraphQL](https://graphql.org/)