# Kuberpak Prototype Findings
## Overview
During the 4.11 release, we spent some time prototyping a solution to the "OLM Dance(TM)" Service Delivery Epic. The prototype attempted to replace the InstallPlan API with the new rukpak APIs when an opt-in mechanism has been configured.
The main issue with the existing InstallPlan API is that OLM does not continually reconcile it, and errors on an InstallPlan are terminal. This behavior provides no way to pivot from a failed InstallPlan to another one; instead, the operator install is stuck in a failed state. This prevents upgrading to a newer version after a failed installation (the "fail-forward" ability).
We found that integrating the new v2 APIs with the existing OLM APIs creates scoping conflicts that lead to potential short-term tradeoffs with the new APIs. The existing resolution primitives, Subscriptions and CSVs, are namespace-scoped, whereas the new Bundle and BundleInstance APIs are cluster-scoped. Tying them together in the context of resolution leads to a natural conflict when attempting to ensure consistency between old and new representations of installed content on-cluster.
Our recommendation is to solve the OLM Dance using the existing APIs without having to make any substantial changes to the rukpak APIs that were outlined in the e2e strawman.
## Summary
Originally, this process started with trying to build out and map the Kuberpak controller, APIs, and packages directly onto OLM. After some time, we attempted a simplification: building a new custom controller that maps the Kuberpak APIs onto existing OLM resources.
Through this process, we found the existing resolution and solver interfaces difficult to reason about, which in turn made those packages difficult to consume outside of the OLM codebase.
This is for a number of reasons. Firstly, we need a way of identifying what's driving CSV subscriptions. How does one map a Subscription to a `BundleInstance`? One solution presented was to add an annotation at the `BundleInstance` layer. Another was to create a Subscription key that maps to the `BundleInstance`.
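The annotation approach can be sketched roughly as follows. This is an illustration only: the annotation key `example.io/subscription`, the pared-down `BundleInstance` struct, and the `instancesFor` helper are all hypothetical and do not come from OLM or rukpak.

```go
package main

import "fmt"

// ownerAnnotation is a hypothetical key; neither OLM nor rukpak defines it.
const ownerAnnotation = "example.io/subscription"

// BundleInstance is a simplified stand-in for the cluster-scoped rukpak type.
type BundleInstance struct {
	Name        string
	Annotations map[string]string
}

// subscriptionKey formats a namespaced Subscription reference as "namespace/name".
// Encoding the namespace is needed because the BundleInstance itself has none.
func subscriptionKey(namespace, name string) string {
	return namespace + "/" + name
}

// instancesFor returns the names of the BundleInstances whose annotation
// points back at the given Subscription.
func instancesFor(namespace, name string, instances []BundleInstance) []string {
	key := subscriptionKey(namespace, name)
	var owned []string
	for _, bi := range instances {
		// Reading from a nil Annotations map yields "", which never equals key.
		if bi.Annotations[ownerAnnotation] == key {
			owned = append(owned, bi.Name)
		}
	}
	return owned
}

func main() {
	instances := []BundleInstance{
		{Name: "etcd-v0.9.4", Annotations: map[string]string{ownerAnnotation: "operators/etcd"}},
		{Name: "prometheus-v0.47.0", Annotations: map[string]string{ownerAnnotation: "monitoring/prometheus"}},
		{Name: "unmanaged"},
	}
	fmt.Println(instancesFor("operators", "etcd", instances))
}
```

Because the annotation value has to carry the Subscription's namespace, the sketch also makes the scoping mismatch visible: the cluster-scoped object must encode namespaced information by hand.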
The E2E strawman outlines a label selector for this issue. A question that arises is: if I have a label selector, how do I, as a user, reason about which bundle is being selected?
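To make the auditing concern concrete, here is a minimal sketch of equality-based label matching over a set of bundles. The `Bundle` struct, the `package` label, and the `candidates` helper are assumptions made for illustration; they are not the strawman's actual types or selection logic.

```go
package main

import (
	"fmt"
	"sort"
)

// Bundle is a simplified, hypothetical stand-in for the rukpak Bundle type.
type Bundle struct {
	Name   string
	Labels map[string]string
}

// matches reports whether every key/value pair in selector is present in labels.
// In this sketch an empty selector matches every bundle.
func matches(selector, labels map[string]string) bool {
	for k, v := range selector {
		if got, ok := labels[k]; !ok || got != v {
			return false
		}
	}
	return true
}

// candidates returns the sorted names of all bundles the selector matches.
// More than one result means the selector alone does not identify a single bundle.
func candidates(selector map[string]string, bundles []Bundle) []string {
	var names []string
	for _, b := range bundles {
		if matches(selector, b.Labels) {
			names = append(names, b.Name)
		}
	}
	sort.Strings(names)
	return names
}

func main() {
	bundles := []Bundle{
		{Name: "etcd-v0.9.4", Labels: map[string]string{"package": "etcd"}},
		{Name: "etcd-v0.9.2", Labels: map[string]string{"package": "etcd"}},
		{Name: "prometheus-v0.47.0", Labels: map[string]string{"package": "prometheus"}},
	}
	fmt.Println(candidates(map[string]string{"package": "etcd"}, bundles))
}
```

In this example the selector matches two bundles, so a user reading only the selector cannot tell which one is in effect without some additional rule or status reporting.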
## Open Questions
- What kind of granularity do we need to expose for configuring an opt-in mechanism?
- Once per-cluster using the OLMConfig API, or per-Subscription to control this behavior throughout the cluster?
- What is the ideal mapping between the existing APIs and the new APIs in a scoped world?
- Are BundleInstances 1:1 with OperatorGroups/Namespaces? Does the BundleInstance API need to be namespace-scoped?
- What would a new resolver look like that needs to accommodate both the new and existing APIs?
- What are the inputs/outputs for this new resolver?
- How do we audit dependency resolution? Do we need some sort of parent bundle instance?
- What's the UX around having label-selectors as a first-class concept in the BundleInstance API?
- Is it easier to reason about pivoting decisions, or which Bundle is currently "active", if we just have a `spec.BundleName` field, as outlined in Joe's kuberpak repository?
- If a label selector is desirable, do we need to expose a way to audit pivoting decisions? How can a user derive which Bundle is active?
- What's the UX around Bundle and BundleInstance deletion?
- What happens if a user deletes an "internal" API?
- Do we need to expose GC around Bundle/BundleInstances?
## Outstanding Work
- Update the resolver to consider Bundle sources and filter out CSVs that belong to a BundleInstance
- Add the opt-in mechanism that delegates work between the control planes
- Add CRD preflight checking to the kuberpak provisioner
- Investigate integrating the Input API
- Investigate adding approval
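As a rough illustration of the first item above, the filtering step could look something like the following. How a CSV would actually record that it belongs to a BundleInstance is not settled in these findings; the owner-reference convention, the `OwnerRef` and `CSV` structs, and the `filterUnmanaged` helper are all assumptions made for this sketch.

```go
package main

import "fmt"

// OwnerRef is a pared-down stand-in for a Kubernetes owner reference.
type OwnerRef struct {
	Kind string
	Name string
}

// CSV is a simplified, hypothetical view of a ClusterServiceVersion.
type CSV struct {
	Namespace string
	Name      string
	Owners    []OwnerRef
}

// ownedByBundleInstance reports whether any owner of the CSV has Kind "BundleInstance".
// Using owner references for this link is an assumption, not an established convention.
func ownedByBundleInstance(csv CSV) bool {
	for _, o := range csv.Owners {
		if o.Kind == "BundleInstance" {
			return true
		}
	}
	return false
}

// filterUnmanaged returns only the CSVs that no BundleInstance owns,
// i.e. the ones the existing resolver would still be responsible for.
func filterUnmanaged(csvs []CSV) []CSV {
	var out []CSV
	for _, csv := range csvs {
		if !ownedByBundleInstance(csv) {
			out = append(out, csv)
		}
	}
	return out
}

func main() {
	csvs := []CSV{
		{Namespace: "operators", Name: "etcd.v0.9.4", Owners: []OwnerRef{{Kind: "BundleInstance", Name: "etcd"}}},
		{Namespace: "operators", Name: "prometheus.v0.47.0"},
	}
	for _, csv := range filterUnmanaged(csvs) {
		fmt.Println(csv.Namespace + "/" + csv.Name)
	}
}
```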
## Notes/Needs Home
- The current catalog-operator codebase wasn't as extensible as we'd like and we wanted to avoid doing any large refactoring effort
- Any opt-in field on the Subscription annotation that delegated work to separate control planes felt inadequate
- Note that it took a while to build up the requisite types to use the resolver/solver packages outside of OLM