# Hack The Garden 2024-05 - Topics
Also see https://github.com/gardener-community/hackathon.
## Participants (17)
- Valentin K.
- Gerrit S.
- Johannes S.
- Rafael F.
- Martin W.
- Andreas F.
- Oliver G.
- Thomas B.
- Konstantinos A.
- Michael R.
- Tim E.
- Maximilian G.
- Marcel B.
- Johann G.
- Lukas H.
- Tim U.
- Stefan M.
## Votings
See all proposals [here](#All-Proposals).
### Discussion Topics (probably no hacking)
- ❌ extension for managing NVIDIA GPU drivers on shoot clusters (5 votes)
- Stefan M.
- Tim E.
- Andreas F.
- Johann G.
- requests/limits for kube-apiserver/etcd (to avoid overloading a node)
- "what's the next big thing?"
- ✅ move community-maintained repositories and extensions into one org: gardener-community
- candidates:
- https://github.com/stackitcloud/gardener-extension-acl
- https://github.com/stackitcloud/gardener-extension-shoot-flux
- https://github.com/metal-stack/gardener-extension-backup-s3
- https://github.com/metal-stack/gardener-extension-dns-powerdns
- https://github.com/metal-stack/gardener-extension-audit
- https://github.com/gardener-attic/gardener-extension-provider-kubevirt (unmaintained)
- https://github.com/gardener-attic/machine-controller-manager-provider-kubevirt (unmaintained)
- https://github.com/gardener-attic/gardener-extension-provider-vsphere (unmaintained)
- https://github.com/gardener-attic/machine-controller-manager-provider-vsphere (unmaintained)
- https://github.com/23technologies/gardener-extension-provider-ionos
- https://github.com/23technologies/gardener-extension-provider-hcloud
- https://github.com/23technologies/gardener-extension-runtime-kata
- https://github.com/schrodit/gardener-extension-provider-dns-cloudflare (-> should be renamed to gardener-extension-dns-cloudflare)
- https://github.com/knightXun/gardener-extension-provider-tencentcloud
- ask 23T folks (Jan Lohage, Lothar Bach) and Tim Schrodi: @timebertt
- later on, we might consider harmonizing CI flows
- ✅ discuss architecture of ACL extension in combination with proxy protocol on istio LoadBalancer
### Initial Assignment / First Topics
- 🚧 [[operator] Extensions for garden cluster via `gardener-operator`](https://github.com/gardener/gardener/issues/9635) (9 votes)
- Rafael F.
- Konstantinos A.
- Valentin K.
- 🚧 rewrite [vpn2](https://github.com/gardener/vpn2) in Go (8 votes)
- Martin W.
- Maximilian G.
- Michael R.
- Lukas H.
- 🚧 [node-agent] Type-safe configuration of worker node settings such as DNS, containerd, NTP, etc. (6 votes)
- Gerrit S.
- Tim U.
- Thomas B.
- ✅ [provider-local] Use `gardener-operator` for local development setup (5 votes)
- Oliver G.
- Robert V.
- ✅ Add alternative to base64-encoded Helm chart in `ControllerDeployment` (e.g. OCI Helm release reference) (5 votes)
- Tim E.
- Marcel B.
- ✅ extension to expose the Kubernetes API server only in a Tailscale VPN, similar to the ACL extension but much more secure and flexible (3 votes)
We found a simple solution that does not require writing a Gardener extension. It is described in [this document](https://github.com/majst01/tailscale-for-gardener).
A final destination for this document still has to be decided.
- Stefan M.
- Andreas F.
- Johannes S.
https://gardener.cloud/docs/guides/administer-shoots/tailscale/
- ✅ lift restrictions in choosing IPv4 seed/shoot networks by switching to a pure IPv6 tunnel (https://github.com/gardener/gardener/pull/9597#discussion_r1572358733)
- Stefan M.
- Andreas F.
- Johannes S.
https://github.com/gardener/vpn2/pull/83
### Fast Track
- [extensions] Introduce object selectors for extension webhooks (3 votes)
- restructure e2e tests into ordered `It`s (3 votes)
- [gardenlet] Put Plutono dashboards into a central namespace to not duplicate them in each shoot namespace
- ✅ Make Skaffold rebuild binaries in case `go embed`s have changed
- Marcel B.
- https://github.com/gardener/gardener/pull/9778
- Make `provider-local` patch `minReadySeconds` of GAPI/KAPI to make rollouts faster.
- Make `shoot-care` controller watch `ManagedResource`s
- Switch GitHub action in https://github.com/gardener/machine-controller-manager-provider-local to ko-based build, add `git describe` image tags, and configure renovate to bump references in g/g
- ✅ alternatively, move into g/g
- https://github.com/gardener/gardener/pull/9782
- ✅ Update Masterminds/semver to v3 in g/mcm: https://github.com/gardener/machine-controller-manager/pull/917
### More Topics To Pick Up
- ✅ [resource-manager] Explore compression of `ManagedResource` secrets (4 votes)
- Rafael F.
- Martin W.
- 🚧 [node-agent] Individual credentials per node + authorizer (4 votes)
- Gardener scale-out tests (3 votes)
- Gardener SLIs: cluster creation/deletion times, machine creation times (3 votes)
- Oliver G.
- ✅ [gardenlet] Self-upgrades based on information in garden cluster (3 votes)
- Rafael F.
- Oliver G.
- Stefan M.
- [x] Add API types to `gardener-apiserver`
- [x] Make `ManagedSeed` controller reusable for this use case
- [x] Add validation to prevent creation of `Gardenlet` resources for `ManagedSeed`s
- [x] Validate that `Gardenlet`s are only in `garden` namespace
- [ ] Write a guide that can be followed when a `gardenlet` needs to be deployed to a new unmanaged soil/seed cluster
- [[extensions] Expose Extension Provider Status in Garden Cluster](https://github.com/gardener/gardener/issues/3873) (2 votes)
- ✅ Rework apiserver-proxy to drop PROXY protocol for supporting ACL extension use case behind a PROXY protocol LoadBalancer
- Johannes S.
- Tim E.
- [x] test new configuration with ACL extension and LB with proxy protocol
- [x] drop envoy filter with `allow_requests_without_proxy_protocol`, replace with explicit configuration option for operators: https://github.com/gardener/gardener/pull/9844
- [ ] drop `direct_remote_ip` option in ACL extension, always use `remote_ip` (-> simplify API)
- [ ] consider using dedicated header instead of reusing `Reversed-VPN` header
- [x] check which parts of the envoy config are actually needed (e.g., HTTP protocol options)
- [ ] only keep node egress IP on VPN and (legacy) apiserver-proxy path, don't add allowed CIDRs
- [ ] docs
- [ ] g/g: explain new architecture of apiserver-proxy
- [ ] ACL extension: explain how to verify ACL configurations on all paths with example commands, etc. (mainly for operators/developers)
- [ ] later: drop old apiserver-proxy port in g/g
- [ ] later: drop apiserver-proxy port handling in ACL extension
- [ ] later: drop `DENY` option from ACL API
- ✅ flux extension: finish v1: reconcile mode, secrets management/sops
- Andreas F.
- Marcel B.
- [x] implement syncMode
- [x] implement additional secrets
- [ ] cleanup, tests
- [x] update makefile / gardener
- [ ] verify migration works
- [ ] later: admission webhook for validation
- ✅ [GEP-21](https://github.com/gardener/gardener/issues/7051): IPv6 e2e tests in prow (2 votes)
- Add more backends to gardener-extension-audit - BYOB (bring your own backend :D ) (1 vote)
- [yake](https://yake.cloud): add support for gardener-operator (1 vote)
- Investigate how to integrate the [descheduler](https://github.com/kubernetes-sigs/descheduler) in Gardener for compacting seed nodes/fixing inter-pod anti affinity (0 votes)
- 🚧 [provider-local] Resolve discrepancy between local setup and cloud setups regarding the VPN connection: [proposal](https://github.com/gardener/gardener/issues/9604#issuecomment-2063177128)
- ✅ [vpn] Investigate possible solutions for problems of HA-VPN with [cilium 1.15](https://docs.cilium.io/en/v1.15/operations/performance/scalability/identity-relevant-labels/) due to usage of `statefulset.kubernetes.io/pod-name` in services/network policies to address individual `vpn-seed-server` pods
- Johannes S.
- https://github.com/gardener/gardener/commit/756c727e9412d108bb12c980a3acae9f6be2214f
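The `ManagedResource` secret compression topic listed above can be illustrated with a minimal sketch. It uses gzip only because it ships with the Go standard library; these notes do not state which algorithm was explored. The helper names `compress` and `decompress` and the manifest payload are hypothetical and not taken from gardener-resource-manager.

```go
package main

import (
	"bytes"
	"compress/gzip"
	"fmt"
	"io"
	"strings"
)

// compress gzips the payload (hypothetical helper, for illustration only).
func compress(data []byte) ([]byte, error) {
	var buf bytes.Buffer
	zw := gzip.NewWriter(&buf)
	if _, err := zw.Write(data); err != nil {
		return nil, err
	}
	// Close flushes pending data and writes the gzip footer.
	if err := zw.Close(); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// decompress reverses compress.
func decompress(data []byte) ([]byte, error) {
	zr, err := gzip.NewReader(bytes.NewReader(data))
	if err != nil {
		return nil, err
	}
	defer zr.Close()
	return io.ReadAll(zr)
}

func main() {
	// Made-up, repetitive manifest standing in for the rendered objects
	// stored in a ManagedResource secret.
	manifest := []byte(strings.Repeat(
		"apiVersion: v1\nkind: ConfigMap\nmetadata:\n  name: example\n---\n", 200))

	packed, err := compress(manifest)
	if err != nil {
		panic(err)
	}
	unpacked, err := decompress(packed)
	if err != nil {
		panic(err)
	}
	fmt.Printf("raw=%d bytes, gzip=%d bytes, round-trip ok=%t\n",
		len(manifest), len(packed), bytes.Equal(manifest, unpacked))
}
```

Rendered Kubernetes manifests tend to be repetitive, which is why compression is attractive for large secrets. The achievable ratio depends on the real payload and the chosen algorithm.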
## All Proposals
Add your proposal in the following sections 🚀
### Core
- Gardener scale-out tests
- load on garden: run hollow gardenlets similar to kubemark’s hollow nodes
- [[operator] Extensions for garden cluster via `gardener-operator`](https://github.com/gardener/gardener/issues/9635)
- [x] Define API
- Write controller(s?)
- [x] Deploy `Controller{Registration,Deployment}` into virtual garden cluster
- [x] Deploy extensions into garden runtime cluster (to make them usable for `Garden` controller)
- [x] Logs for the runtime-config controller
- [x] Delete `ManagedResource`s and uninstall Helm charts that are no longer used in `Garden`
- [x] Watch the `ManagedResource`s and attach them to the `Extension`'s Status
- [x] Implement the same as `extension.spec.deployment.extension` for `extension.spec.deployment.admission`
- Adapt `Garden` controller to make use of extensions
- [x] Create (relevant?) `extensions.gardener.cloud` CRDs
- [x] Prevent `gardenlet` from managing CRDs that are shared with garden cluster (in case its `Seed` is the `Garden` at the same time)
- [x] Check if relevant extensions have been deployed (similar to the check in the `Shoot` controller)
- [x] Manage `extensions.gardener.cloud/v1alpha1.BackupBucket` resource
- Question: Do we also need to create a `BackupEntry` resource?
- [x] Manage `extensions.gardener.cloud/v1alpha1.DNSRecord` resources
- For each domain in `.spec.runtimeCluster.ingress.domains`: Create a wildcard record `*.<domain>`
- For each domain in `.spec.virtualCluster.dns.domains`: Create `api.<domain>` and `gardener.<domain>`
- [x] Extend `garden-reference` controller to put finalizer on DNS secrets
- [x] Default `ExtensionSpec` for well-known extensions (something like `images.yaml` is needed)
- [ ] Validate `Extension` resources
- [ ] Tests
- [ ] Admission
- [x] Prevent `gardenlet` from deploying an extension a second time in case the garden runtime cluster gets registered as `Seed`
- [x] Introduce `extension-required` reconciler in GOP (similar to `controllerinstallation-required`) to make `Garden` deletion work (extensions shall only be deleted when no extension object exists anymore)
- [[node-agent] Support patching `containerd` config file](https://github.com/gardener/gardener/issues/8929)
- [node-agent] Type-safe configuration of worker node settings such as DNS, containerd, NTP, etc.
- [provider-local] Use `gardener-operator` for local development setup
- [gardenlet] Self-upgrades based on information in garden cluster
- [resource-manager] Explore compression of `ManagedResource` secrets
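The `DNSRecord` naming rules from the `Garden` controller item above (a wildcard record per runtime ingress domain, plus `api.` and `gardener.` records per virtual cluster domain) can be sketched as follows. This is an illustration only: the function `dnsRecordNames` is hypothetical, and the domains are assumed to be plain strings, which may differ from the actual `Garden` API types.

```go
package main

import "fmt"

// dnsRecordNames derives the record names described in the notes.
// Hypothetical helper; it is not part of gardener-operator.
func dnsRecordNames(ingressDomains, virtualClusterDomains []string) []string {
	var names []string
	// .spec.runtimeCluster.ingress.domains: one wildcard record per domain.
	for _, d := range ingressDomains {
		names = append(names, "*."+d)
	}
	// .spec.virtualCluster.dns.domains: api.<domain> and gardener.<domain>.
	for _, d := range virtualClusterDomains {
		names = append(names, "api."+d, "gardener."+d)
	}
	return names
}

func main() {
	// Example domains are placeholders.
	names := dnsRecordNames(
		[]string{"ingress.runtime.example.com"},
		[]string{"garden.example.com"},
	)
	for _, n := range names {
		fmt.Println(n)
	}
}
```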
### Extensions
- extension for managing NVIDIA GPU drivers on shoot clusters
- extension to expose the Kubernetes API server only in a Tailscale VPN, similar to the ACL extension but much more secure and flexible
- Add alternative to base64-encoded Helm chart in `ControllerDeployment` (e.g. OCI Helm release reference)
```yaml
apiVersion: core.gardener.cloud/v1
kind: ControllerDeployment
metadata:
  name: provider-local
  annotations:
    migration.controllerdeployment.gardener.cloud/type: custom
    migration.controllerdeployment.gardener.cloud/providerConfig: {}
helm:
  rawChart: # tar | gzip | base64
  ociRepository: {} # step 2, cannot be set together with rawChart
  values: {}
kustomize: # step 3
  ociRepository: {}
  patches: []
  images: []
# ...
```
- https://github.com/gardener/gardener/pull/9771
- todo:
- [x] add API reference docs for `core.gardener.cloud/v1`
- [ ] remove `runtime.NewMultiGroupVersioner()`
- [ ] check if version override for bastion is needed
- [x] check if `APIService` versionPriority should be different between `v1beta1` and `v1` -> kube-apiserver automagically uses v1 if existing for a resource
- [x] switch `ControllerDeployment` controller to `v1`
- [x] switch `ControllerDeployment` docs to `v1`
- [x] round-trip conversion tests
- [x] validation for new ControllerDeployment fields
- [x] add RBAC/SeedAuthorizer for `core.gardener.cloud/v1`
- [x] ~~consider limiting size of rawChart in validation~~
- [ ] how can we cache artifacts?
- [ ] auth for private OCI registries? Probably via an additional field in `ControllerDeployment`
- [x] implement helm push to OCI for provider-local chart
- [x] strip optional oci:// prefix from url (not tested)
- [ ] consider implementing `kustomize` deployment type
- Add more backends to [gardener-extension-audit](https://github.com/metal-stack/gardener-extension-audit) - BYOB (bring your own backend :D )
- [extensions] Introduce object selectors for extension webhooks
- [[extensions] Expose Extension Provider Status in Garden Cluster](https://github.com/gardener/gardener/issues/3873)
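Two rules from the `ControllerDeployment` proposal above lend themselves to a short sketch: `rawChart` and `ociRepository` cannot be set together, and an optional `oci://` prefix is stripped from the reference. The type `HelmSource` and the function `normalize` are hypothetical simplifications for illustration; they are not the actual `core.gardener.cloud/v1` types or the validation code from the linked pull request.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// HelmSource is a simplified, hypothetical stand-in for the `helm` section.
type HelmSource struct {
	RawChart string // base64-encoded, gzipped tar archive of the chart
	OCIRef   string // OCI reference, optionally prefixed with oci://
}

// normalize rejects conflicting sources and strips the optional oci:// prefix.
func normalize(h HelmSource) (HelmSource, error) {
	if h.RawChart != "" && h.OCIRef != "" {
		return h, errors.New("rawChart and ociRepository must not be set together")
	}
	// TrimPrefix is a no-op when the prefix is absent.
	h.OCIRef = strings.TrimPrefix(h.OCIRef, "oci://")
	return h, nil
}

func main() {
	// Registry and chart name are placeholders.
	h, err := normalize(HelmSource{OCIRef: "oci://registry.example.com/charts/provider-local:v1.0.0"})
	if err != nil {
		panic(err)
	}
	fmt.Println(h.OCIRef)

	if _, err := normalize(HelmSource{RawChart: "H4sI...", OCIRef: "registry.example.com/x"}); err != nil {
		fmt.Println("rejected:", err)
	}
}
```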
### Networking
- rewrite [vpn2](https://github.com/gardener/vpn2) in Go
- also, move to g/g?
- Maybe also tailscale/headscale based
- [x] [GEP-21](https://github.com/gardener/gardener/issues/7051): IPv6 e2e tests in prow
- consider using https://github.com/aojea/nat64
### Observability
- Gardener SLIs: cluster creation/deletion times, machine creation times
### Security
- [node-agent] Individual credentials per node + authorizer
### Other
- [yake](https://yake.cloud): add support for gardener-operator
- restructure e2e tests into ordered `It`s
- Investigate how to integrate the [descheduler](https://github.com/kubernetes-sigs/descheduler) in Gardener for compacting seed nodes/fixing inter-pod anti affinity