# Hack The Garden 11/2023 - Topics
## Votings
See all proposals [here](#All-Proposals).
### Discussion Topics (probably no hacking)
- Air-gapped clusters (7 votes)
- IPv6 in Gardener: current state and next steps
- ControllerDeployment with `type!=helm` (2 votes)
- small 23ke demo: how STACKIT uses the flux-based gardener-installer
### Initial Assignment / First Topics
- 🚧 Control Plane Migration w/o Downtime (8 votes)
    - Kai K.
    - Johannes S.
    - (Michael R.)
- 🚧 Stop vendoring in gardener/gardener and extensions (8 votes)
    - Andreas F.
    - Lukas F.
- 🚧 Continue with [gardener-node-agent](https://github.com/gardener/gardener/issues/8023) (7 votes)
    - Stefan M.
    - Oliver G.
    - [Write leases from the node-agent](https://github.com/gardener/gardener/pull/8767)
    - [Refactor health checks in go](https://github.com/gardener/gardener/pull/8786)
- 🚧 Extend support for more cloud providers in [ACL extension](https://github.com/stackitcloud/gardener-extension-acl) (and make it aware of HA control planes and `ExposureClass`es) (7 votes)
    - Markus W.
    - Konstantinos A.
    - Max G.
    - https://github.com/stackitcloud/gardener-extension-acl/pull/35
    - (bad bug we found while implementing): https://github.com/stackitcloud/gardener-extension-acl/pull/31
- ✅ ARM support in provider-openstack (6 votes)
    - Robin S.
    - Tim E.
    - Konstantinos A.
    - https://github.com/gardener/gardener-extension-provider-openstack/pull/690
- ⏸️ Enhance `gardener-operator` to deploy `gardenlet` via dedicated CRD (5 votes)
    - Rafael F.
    - Gerrit S.
### Fast Track
- ✅ Add `./hack/update-skaffold-deps.sh`
    - https://github.com/gardener/gardener/pull/8766
- 🚧 The openstack CSI driver was enhanced to not require the infrastructure credentials anymore
    - https://github.com/kubernetes/cloud-provider-openstack/issues/1020
### More Topics To Pick Up
- 🙅‍♂️ Extensibility for `gardener-operator` (reuse provider extensions for `BackupBucket`, DNS records, etc.) (6 votes)
- 🙅‍♂️ Start working on [GEP-19: Observability Stack - Migrating To prometheus-operator](https://github.com/gardener/gardener/blob/master/docs/proposals/19-migrating-observability-stack-to-operators.md) (5 votes)
- 🙅‍♂️ Distribute requests from one TLS/HTTP2 connection across API servers (5 votes)
- 🙅‍♂️ [Expose Extension Provider Status in Garden Cluster](https://github.com/gardener/gardener/issues/3873) (4 votes)
- 🙅‍♂️ Drop `extensions.gardener.cloud/v1alpha1.Cluster` API for better scalability (3 votes)
- 🙅‍♂️ [ETCD Encryption for Custom Resources](https://github.com/gardener/gardener/issues/4606) (3 votes)
- 🚧 Rework [shoot-flux extension](https://github.com/23technologies/gardener-extension-shoot-flux) (3 votes)
    - Robin S.
    - Kai K.
    - Tim E.
    - https://github.com/stackitcloud/gardener-extension-shoot-flux
- 🙅‍♂️ Enhance extensions library (“library v2”, e.g., move CSI controller deployments into `gardener/gardener` instead of duplicating it into each provider extension) (3 votes)
- 🙅‍♂️ More observability for shoot operations (3 votes)
- 🚧 Audit logging implementation (3 votes)
    - Michael R.
    - Gerrit S.
    - https://github.com/metal-stack/gardener-extension-audit
- 🙅‍♂️ Watch `ManagedResource`s in Shoot Care Controller (2 votes)
- 🙅‍♂️ ClusterAPI Provider for Gardener (1 vote)
- 🙅‍♂️ Publish `g/g/pkg/apis` as dedicated repo/module `g/api` (not voted)
## All Proposals
### Core
- air-gapped clusters
    - FI-TS just started designing what this could look like
    - how to handle registry and DNS
    - design is ongoing, but might be interesting to discuss
- Continue with gardener-node-agent 🕵️ (GNA 🧬)
    - Orchestrated rolling updates (e.g. kubelet updates)
    - Removing Bash scripts!
- Remove vendoring from `gardener` project and subsequent usage in e.g. provider extensions
    - Adopt an approach similar to the one we are using in `onmetal`
    - Fix fetching of hack scripts when `gardener` is not vendored in your project
- `ControllerDeployment` with `type!=helm`, e.g. `type=manifests`, `type=kustomize`, `type=flux`
    - would be useful for deploying stuff to all seeds that are not actually extensions
    - rather invest into shoot-flux?
- ETCD Encryption For Custom Resources (Fast Track 😉 (very fast))
- Control Plane Migration w/o downtime
    - istio multi-cluster service mesh
    - rolling seeds upgrade -> better than previous approach to canary deployments
- Enhance `gardener-operator` to deploy `gardenlet` via dedicated CRD
- Watch `ManagedResource`s in Shoot Care Controller
- Extensibility for `gardener-operator` (reuse provider extensions for `BackupBucket`, DNS records, etc.)
- ClusterAPI Provider for Gardener
    - It could also be an investigation proposal.
    - Or even investigating ClusterAPI provider for gardener/autoscaler.
- Publish `g/g/pkg/apis` as dedicated repo/module `g/api`
    - similar to https://github.com/kubernetes/api
    - repo/module should only contain gardener's API types
    - very slim set of dependencies: `k8s.io/api`, `k8s.io/apimachinery`, protobuf
    - similar to k/k's staging mechanism: https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/staging.md
    - reuse kube's publishing-bot: https://github.com/kubernetes/publishing-bot
    - ref https://github.com/gardener/gardener/issues/2871, https://github.com/gardener/gardener/issues/5978
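
For context on the `ControllerDeployment` with `type!=helm` proposal above, this is roughly what the existing `type: helm` variant looks like. It is a minimal sketch, assuming the `core.gardener.cloud/v1beta1` API as it stood around the time of this hackathon; the object name, chart payload, and values are placeholders, not taken from a real installation.

```yaml
apiVersion: core.gardener.cloud/v1beta1
kind: ControllerDeployment
metadata:
  name: example-deployment        # placeholder name
# The proposal is about allowing other values here (e.g. manifests, kustomize, flux).
# The providerConfig schema for such types is not defined in this note.
type: helm
providerConfig:
  chart: <base64-encoded-gzipped-chart-archive>   # placeholder, not a real chart
  values:
    replicaCount: 1               # placeholder chart value
```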
### Extensions
- ARM support in provider-openstack
    - support for different arch OS images
- rework https://github.com/23technologies/gardener-extension-shoot-flux
    - prevent running into GitHub rate limits for checks like `https://api.github.com/repos/fluxcd/flux2/releases/latest`
    - use `providerConfig` instead of `ConfigMap`
    - allow configuring a set of flux resources, e.g. `GitRepository` that contains more flux resources
    - improve SSH key usage
        - put SSH key into secret in garden
        - reference secret in `Shoot.spec.resources` and `providerConfig`
        - extension controller brings SSH key to shoot into `flux-system`
        - drop hacky access to garden cluster
    - ~~could be maintained in gardener-community org~~ -> maintained in stackitcloud org for now, can be moved later
    - would be useful for deploying stuff to all managed seeds that are not actually extensions
- [x] Golang CI - Introduce linter
- [x] Ko
- [x] Update GitHub Actions
- [x] Local setup
- [x] Add renovate
- [x] Drop vendor folder
- [x] API validation
- [x] API doc generation
- [x] Re-implement extension controller
- [ ] support sops decryption in kustomize (not right now, open issue for later), e.g.:
```yaml
apiVersion: flux.extensions.gardener.cloud/v1alpha1
kind: FluxConfig
kustomization:
  template:
    apiVersion: kustomize.toolkit.fluxcd.io/v1
    kind: Kustomization
    spec:
      path: clusters/production
      decryption:
        provider: sops
        secretRef:
          name: gpg
  secretResourceName: flux-gpg-secret
```
- [x] resolve TODOs, write unit tests
- [x] Eliminate `constants` package
- [ ] test Control Plane Migration, implement missing actuator methods if necessary: `Migrate`, `Restore`
- [ ] flux is re-bootstrapped on control plane migration, because gardener does not copy conditions of `Extension` objects during migration
- [x] rework extension entrypoint, update chart
- [ ] add additional mode that also reconciles objects after bootstrapping
- [ ] this mode probably should locally cache flux install manifests, to avoid GitHub API limits
- [x] flux cannot be enabled when creating clusters (flow dependencies in new g/g version)
- [x] solve problem that `reconcile=AfterKubeAPIServer` happens before workers with https://github.com/gardener/gardener/pull/8232
- [x] consider adding `reconcile=AfterWorker`
- [ ] Docs
- [ ] breaking release notes for v1
- [ ] image-automation-controller, image-reflector-controller are not installed/maintained by extension
- Enhance extensions library (“library v2”, e.g., move CSI controller deployments into `gardener/gardener` instead of duplicating it into each provider extension)
- Merge `machine-controller-manager-provider-*` repos into `gardener-extension-provider-*` repos
- [Expose Extension Provider Status in Garden Cluster](https://github.com/gardener/gardener/issues/3873)
- Drop `extensions.gardener.cloud/v1alpha1.Cluster` API for better scalability
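
The SSH key items in the shoot-flux rework above (secret in the garden, referenced from `Shoot.spec.resources` and `providerConfig`) could look roughly like the following `Shoot` fragment. This is a sketch under assumptions: the `source` section with `template` and `secretResourceName` mirrors the `kustomization` example in this note and may not match the final API, and all names, the namespace, and the repository URL are hypothetical. Other required `Shoot` fields are omitted.

```yaml
apiVersion: core.gardener.cloud/v1beta1
kind: Shoot
metadata:
  name: example                 # hypothetical shoot name
  namespace: garden-dev         # hypothetical project namespace
spec:
  resources:
  - name: flux-ssh-secret       # resource name that providerConfig refers to
    resourceRef:
      apiVersion: v1
      kind: Secret
      name: flux-ssh-key        # hypothetical Secret in the project namespace of the garden cluster
  extensions:
  - type: shoot-flux
    providerConfig:
      apiVersion: flux.extensions.gardener.cloud/v1alpha1
      kind: FluxConfig
      source:                   # assumed layout, analogous to `kustomization` above
        template:
          apiVersion: source.toolkit.fluxcd.io/v1
          kind: GitRepository
          spec:
            url: ssh://git@github.com/example/fleet   # placeholder repository
            ref:
              branch: main
        secretResourceName: flux-ssh-secret           # matches spec.resources[].name
```

With a reference like this, the extension controller would copy the referenced secret into `flux-system` in the shoot, so it no longer needs direct access to the garden cluster.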
### Networking
- IPv6 (discussion only, next steps are unclear) 👺
- distribute requests from one TLS/HTTP2 connection across API servers
    - currently, all requests of one client/controller go over a single TLS/HTTP2 connection to a single API server
    - due to gardener's architecture using SNI passthrough from gateway to API servers
    - fair distribution of requests would simplify autoscaling
- Extend support for more cloud providers in [ACL extension](https://github.com/stackitcloud/gardener-extension-acl) (and make it aware of `ExposureClass`es)
    - also support multiple AZs
### Observability
- more observability for shoot operations
    - flow metrics and traces with OpenTelemetry?
    - cluster and machine creation times
    - collect shoot creation time metrics from e2e tests
- start working on [GEP-19: Observability Stack - Migrating To prometheus-operator](https://github.com/gardener/gardener/blob/master/docs/proposals/19-migrating-observability-stack-to-operators.md)
### Security
- [Shrink garden cluster permissions for extensions](https://github.com/gardener/gardener/blob/master/docs/extensions/garden-api-access.md)
- audit logging implementation
### Other
- small 23ke/gardener-installer demo
- caching in prow
    - local image registry cache
    - go build/test results
### Gaming
🏓 Play Table Tennis 🏓
🎱 Billiards is also there 🎱