---
tags: Agenda
---
# Design Meetings
[TOC]
## Meeting Links
Can be found here https://wiki.openstack.org/wiki/Airship#Get_in_Touch
## Administrative
### [Recordings](https://hackmd.io/CvuF8MzmR9KPqyAePnilPQ)
### Old Etherpad https://etherpad.openstack.org/p/Airship_OpenDesignDiscussions
### Design Needed - Issues List
| Priority | Issues List |
| - | - |
| Critical | https://github.com/airshipit/airshipctl/issues?q=is%3Aopen+is%3Aissue+milestone%3Av2.0++label%3Apriority%2Fcritical+label%3A%22design+needed%22+|
| Medium | https://github.com/airshipit/airshipctl/issues?q=is%3Aopen+is%3Aissue+milestone%3Av2.0++label%3Apriority%2Fmedium+label%3A%22design+needed%22+ |
| Low | https://github.com/airshipit/airshipctl/issues?q=is%3Aopen+is%3Aissue+milestone%3Av2.0++label%3Apriority%2Flow+label%3A%22design+needed%22+ |
## Thursday Apr 15, 2021
### Airship 2.0 Troubleshooting Guide Continued - Andrew K
* WIP patchset - https://review.opendev.org/c/airship/docs/+/786062
* What do we want to target for v2.1? This will be evolutionary.
* Collaborators needed
* Related question: do we log what phase is being executed when it starts/ends?
### FQDN resolution - Andrii O.
* Some software requires FQDN resolution from the hostname. /etc/hosts and the systemd-resolved config need to be populated accordingly during deployment.
* Arvinder:
* I see three possible approaches to doing this:
1. host-config operator: this is best when updates need to happen dynamically on existing Kubernetes nodes.
2. pre/post kubeadm hooks in KubeadmConfigSpec: you can write custom scripts that, for example, patch /etc/hosts accordingly. For example, you can use KubeadmConfigSpec.Files to create one or more files in, say, the /etc/hosts.d directory and then during PreKubeadmCommand run something along the lines of `cat /etc/hosts.d/*.conf > /etc/hosts` (see the sketch after the warning below).
3. Metal3DataTemplate: the above KubeadmConfig approach applies the same configuration across all nodes in a KCP or MD. Metal3DataTemplate provides more flexibility by allowing some of the configuration to be node specific. https://github.com/metal3-io/metal3-docs/blob/master/design/metadata-handling.md
:::warning
We are already using #3 for networking etc.; it's unclear to me that this helps with the original question (FQDN/systemd changes etc.).
:::
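For reference, a minimal sketch of option 2, assuming a CAPI v1alpha3 KubeadmControlPlane; the file path, hostname, and IP are placeholders rather than a reviewed design:
```yaml
apiVersion: controlplane.cluster.x-k8s.io/v1alpha3
kind: KubeadmControlPlane
metadata:
  name: cluster-controlplane
spec:
  kubeadmConfigSpec:
    files:
      # drop a fragment containing the FQDN entries for this deployment
      - path: /etc/hosts.d/10-fqdn.conf
        owner: root:root
        permissions: "0644"
        content: |
          10.23.24.10 node01.example.domain node01
    preKubeadmCommands:
      # assemble /etc/hosts from the fragments before kubeadm runs
      - cat /etc/hosts.d/*.conf >> /etc/hosts
```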
## Tuesday Apr 13, 2021
### Hostconfig-operator integration (Sreejith P)
While integrating HCO with Treasuremap, we found that we need to annotate the secret onto nodes and we also need to have a specific label. What would be the best way to annotate nodes? Also, would it be best to add a mechanism to override the default labels in HCO via manifests?
- Metal3 Feature for Label Cascading: https://github.com/airshipit/airshipctl/issues/377
### Replacing into multiple targets (Reddy / Matt)
There are use cases where we need to ReplacementTransform the same source data into multiple target paths -- e.g. replacing an IP address into many network policy rules. It would be helpful for the RT to support this natively. Some options:
1) add a `targets` list as an alternative to `target`
* Let's open an issue to support this. Improve the transformer to support Target as a slice: https://github.com/airshipit/airshipctl/blob/7998615a7b5847c367c29874641c8422157ebb52/pkg/document/plugin/replacement/transformer.go#L77 (a hypothetical sketch of the proposed shape follows below)
2) allow for global substring replacements across a document, in addition to the path-based substring replacement we have
ISSUE: Open this for a future priority: specify a pattern that does not have a specific target.
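A purely hypothetical sketch of what option 1 might look like — the `targets` field does not exist today, and the surrounding field names only approximate the current ReplacementTransformer config:
```yaml
apiVersion: airshipit.org/v1alpha1
kind: ReplacementTransformer
metadata:
  name: networkpolicy-replacements
replacements:
  - source:
      objref:
        kind: VariableCatalogue
        name: networking
      fieldref: spec.kubernetes.apiserverIP
    # proposed: a list of targets instead of a single `target`
    targets:
      - objref:
          kind: NetworkPolicy
          name: allow-apiserver-ingress
        fieldrefs: ["spec.ingress[0].from[0].ipBlock.cidr"]
      - objref:
          kind: NetworkPolicy
          name: allow-apiserver-egress
        fieldrefs: ["spec.egress[0].to[0].ipBlock.cidr"]
```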
### PTG later this month
* Thurs, Apr 22 1300 UTC (8:00 AM CDT) - 1700 UTC (12:00 PM CDT)
* Fri, Apr 23 1300 UTC (8:00 AM CDT) - 1700 UTC (12:00 PM CDT)
* Open agenda: https://etherpad.opendev.org/p/xena-ptg-airship
* Registration (free): https://april2021-ptg.eventbrite.com
### Need to discuss document pull command behavior (Kozhukalov)
Related issues
* https://github.com/airshipit/airshipctl/issues/416
* https://github.com/airshipit/airshipctl/issues/417
Patch is ready for review (probably needs rebasing)
* https://review.opendev.org/c/airship/airshipctl/+/767571
Patch implements two things
* Default target path is ${HOME}/.airship/<manifest_name> (i.e. ${HOME}/.airship/default)
* If target path is not defined in the config for a manifest, then the current directory is used to pull the manifests repo
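For context, a rough sketch of the relevant manifest entry in the airshipctl config (field names are from memory and may not match the current schema exactly; `airshipctl config get-manifest` is authoritative):
```yaml
manifests:
  default:
    # today's patch defaults this to ${HOME}/.airship/default when unset;
    # the second bullet above proposes falling back to the current directory instead
    targetPath: /home/user/.airship/default
    repositories:
      primary:
        url: https://opendev.org/airship/airshipctl
        checkout:
          branch: master
```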
## Thursday Apr 8, 2021
### Discuss the document-validation solution for airshipctl (Ruslan A.)
https://hackmd.io/t2mxDiB3TdGXI8B6gDtA-Q
### Discuss "dex-aio" Implementation Short-/Long-term (Sidney S.)
Approach to discuss described in https://hackmd.io/bdPFHBBSQy-IrpPe1U9itg
### Align on approach to troubleshooting guide (Andrew K)
Use the life cycle states as a high level framework:
- Initialize State
- Prepare State
- Physical Validation
- Bootstrap State
- Target Cluster Lifecycle
- Target Workload Lifecycle
- Sub-clusters (separate state or combine with Target?)
- Day 2 Operations
Proposed approach would be to ~~list the phases/steps within each higher level lifecycle state & then~~ reference the relevant troubleshooting areas listed below within the life cycle states. Generally speaking, here's what you need to look at within a phase (based on executor, do x, y, z).
Troubleshooting areas:
- Manifests:
  - Documents,
  - Kustomize,
  - Kustomize-KRM-Plugins
    - Our troubleshooting for Replacement, SOPS,
- Running phases: How to debug a failed phase, where to start, which logs to read > Focus from the phase perspective.
  - Identify Phase
  - Understand Phase.yaml
  - Identify Executor
  - Understand Executor.yaml
  - Given the executor type:
    - Different guidance
    - is it generic
      - which one,
      - sops, gpg, kubeval, image builder, etc.
    - is it k8s
    - is it clusterctl
- Cluster-API & Kubernetes Deployment: grouped together as the k8s deployment is done by Cluster-API
- Proxy settings
- Networking: is this too broad/complex/specific to individual use cases?
- Helm Charts: Helm Operator & Helm Chart Collator debugging
- Image Builder: base image generation debugging, ISO/QCOW generation & application debugging
- Host Config Operator (may be a future topic)
- Sub-Clusters?
- Services/Applications
  - i.e. CEPH
  - DEX
  - LMA stuff
  - ... Point to their documentation
Assuming our other documentation will provide details on what each phase does. Would it make sense to incorporate troubleshooting into the deployment guide so you have a one stop shop or keep it separate so it's not cluttering up the deployment guide?
We created an issue for this quite a while back [#328](https://github.com/airshipit/airshipctl/issues/328). It references this TM v1 debug report script as a potential starting point. Is this still valid? https://github.com/airshipit/treasuremap/blob/master/tools/gate/debug-report.sh
Next Steps:
- Create WIP Patchset with documentation framework
- Identify SMEs for each troubleshooting area who can help contribute
- Try to get a basic framework & some initial content by the EOM for v2.1 & continue to build on it
## Tuesday Apr 6, 2021
### Discuss the document-validation solution for airshipctl (Ruslan A.)
Review the followings commits
https://review.opendev.org/q/topic:%22add-validation-phases%22+(status:open)
### Discuss "dex-aio" certifcate generated by Cert-Manager (Sidney S.)
Approach to discuss described in https://hackmd.io/bdPFHBBSQy-IrpPe1U9itg
## Thursday Apr 1, 2021
### Discuss the rook-ceph cluster implementation (Vladimir S.)
Review the initial commit https://review.opendev.org/c/airship/treasuremap/+/784184,
Discuss rook-ceph components which should be deployed by default,
Discuss the further downstream/WHARF work
- Set failure domain to host by default
### Place for scripts such as waiting, that are currently in tools/deployment (KKalynovskyi)
We have a new pattern for waiting and adding new scripts to gate/deployments: https://review.opendev.org/c/airship/airshipctl/+/782520
As an example, we placed the script here, but it is now test-site specific; we need a place that can be shared between every site:
https://github.com/airshipit/airshipctl/tree/master/manifests/site/test-site/phases/helpers
## Tuesday Mar 30, 2021
### Exploratory - Discuss Load Balancer (Sidney S)
Propose to create sub-issues of https://github.com/airshipit/treasuremap/issues/84 for each operator and have one person working on each sub-issue. Of course, the team needs to coordinate the common effort, e.g., a common design.
Also, have a more detailed description of how this load balancer is going to be used:
* Is this load-balancer going to be used when a service of type "LoadBalancer" is deployed?
* Is a GCP load balancer being used only with a GCP target cluster or with any other provider's cluster, e.g., Baremetal?
Some additional questions in terms of design:
* Is it expected to add a new airshipctl phase for it?
* Design using Docker container or any other proposal?
CAPI working group on LoadBalancer type:
* https://docs.google.com/document/d/1wJrtd3hgVrUnZsdHDXQLXmZE3cbXVB5KChqmNusBGpE/edit
### Discuss Network Catalogue Lists (Jess E)
Currently the network catalogue has a list of networks and a list of vlans. It is somewhat tricky to modify these as it requires accessing them by index. It would be preferable to access them by id instead. Either we should use dictionaries or have some sort of lookup function to more easily access these items.
Example network catalogue:
```yaml
networks:
  - id: network_a
    type: ipv4
    link: bond0.123
    # ip_address:
    netmask: 255.255.255.128
  - id: network_b
    type: ipv4
    link: bond0.456
    # ip_address: <from host-catalogue>
    netmask: 255.255.255.128
```
Example using patchesjson6902:
```yaml
# remove network_a
- op: remove
  path: "whatever/networks/0"
# modify network_b now at index 0
- op: replace
  path: "whatever/networks/0/link"
  value: bond0.123
```
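A sketch of the dictionary-keyed alternative discussed above (illustrative only; adopting it would also require updating the replacement/patch rules that consume the catalogue):
```yaml
networks:
  network_a:
    type: ipv4
    link: bond0.123
    netmask: 255.255.255.128
  network_b:
    type: ipv4
    link: bond0.456
    netmask: 255.255.255.128
```
A patch could then address entries by id, e.g. `path: "whatever/networks/network_b/link"`, instead of by positional index.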
### ViNO - VM Infrastructure Bridge
* Discuss the end-to-end flow for how the VM Infra Bridge is created and used.
* Should it be created on all bare metal hosts prior to applying ViNO CR?
* Currently ViNO CR specifies bridge interface name via vmBridge entry. Node labeler uses this value to label nodes with IP address. How should vino-builder use this to create libvirt network?
## Thursday Mar 25, 2021
### CRI & CGroup driver (Vladimir Sigunov/Craig Anderson)
Need to discuss the approach for [Airshipctl #456](https://github.com/airshipit/airshipctl/issues/456)
TODO: Open an issue in images regarding this.
### Airshipctl Usability Concerns
- Why do we get SOPS errors when doing a phase list
- Group 0: FAILED
FBC7B9E2A4F9289AC0C1D4843D16CEE4A27381B4: FAILED
- | could not decrypt data key with PGP key:
| golang.org/x/crypto/openpgp error: Could not load secring:
| open /tmp/secring.gpg: no such file or directory; GPG binary
| error: exit status 2
I understand we can follow instructions from Treasuremap, but that makes using airshipctl a bit of a mess.
Given that the flow a user might follow is simply:
* clone airshipctl
* build airshipctl
* install airshipctl
* airshipctl config init
**ISSUE (v2.1)** - Better integration between airship config and SOPS expectations for keys.
Needs more discussion
* airshipctl config get-manifest
* Update config Manifest for Target Path to meet my local filesystem
**ISSUE v2.1** with using `airshipctl config set-manifest <NAME> target-path=...`
Using airshipctl config get-manifest DOES NOT return the Manifest Name
* airshipctl document pull
* airshipctl phase list
BOMBED!!&#^$%$&
- config init should include airshipctl in the manifest, under the target path, given the default resource relationship. **ISSUE v2.1** fixed this
- This stuff should be only visible with a debug like option [ This should be fixed already ... ]
(base) vpn-135-210-4-4:bin rp2723$ airshipctl phase render initinfra-target
gpg: keybox '/tmp/pubring.kbx' created
gpg: /tmp/trustdb.gpg: trustdb created
gpg: key 3D16CEE4A27381B4: public key "SOPS Functional Tests Key 1 (https://github.com/mozilla/sops/) <secops@mozilla.com>" imported
gpg: key 3D16CEE4A27381B4: secret key imported
gpg: key D8720D957C3D3074: public key "SOPS Functional Tests Key 2 (https://github.com/mozilla/sops/) <secops@mozilla.com>" imported
gpg: key D8720D957C3D3074: secret key imported
gpg: key 3D16CEE4A27381B4: "SOPS Functional Tests Key 1 (https://github.com/mozilla/sops/) <secops@mozilla.com>" not changed
gpg: key D8720D957C3D3074: "SOPS Functional Tests Key 2 (https://github.com/mozilla/sops/) <secops@mozilla.com>" not changed
gpg: key 19F9B5DAEA91FF86: public key "SOPS Functional Tests Key 3 (https://github.com/mozilla/sops/) <secops@mozilla.com>" imported
gpg: Total number processed: 5
gpg: imported: 3
gpg: unchanged: 2
gpg: secret keys read: 2
gpg: secret keys imported: 2
gpg: keybox '/tmp/pubring.kbx' created
gpg: /tmp/trustdb.gpg: trustdb created
### Discuss - Implementation of Rook-Ceph Deployment (Vladimir S)
Review rook-ceph cluster deployment: (reviews pending) https://review.opendev.org/c/airship/charts/+/780590
Take a peek at this https://review.opendev.org/c/airship/treasuremap/+/783051
* Are we going to use a helm chart to deploy rook CRs, or utilize raw CR documents for now?
### Plan for this
“On June 30 2021, Quay.io will move to Red Hat Single Sign-On Services. If you haven't done so already, please create a PERSONAL (not corporate) Red Hat account and attach it to your Quay.io account, following these instructions. If you create a corporate account you WILL lose access to your Quay personal namespace” **ISSUE**: Make sure this doesn't impact Airship repo publishing.
### About phase list
* Phase should include only the appropriate phases for the type
* Problem might be the type definitions...
ISSUE: How do we make phase list more useful? It's essentially a catalogue list of phases the document set knows about.
This has no context for a user?
* This should be an issue or behavior from the Plan command
* Order/Relationship/Dependency/etc
NAMESPACE RESOURCE
Phase/clusterctl-init-ephemeral
Phase/clusterctl-init-target
Phase/clusterctl-move
Phase/controlplane-ephemeral
Phase/controlplane-target
Phase/ephemeral-az-cleanup
Phase/ephemeral-az-genesis
Phase/ephemeral-gcp-cleanup
Phase/ephemeral-gcp-genesis
Phase/ephemeral-os-cleanup
Phase/ephemeral-os-genesis
Phase/initinfra-ephemeral
Phase/initinfra-networking-ephemeral
Phase/initinfra-networking-target
Phase/initinfra-target
Phase/iso-build-image
Phase/iso-cloud-init-data
Phase/lma-configs
Phase/lma-infra
Phase/lma-stack
Phase/remotedirect-ephemeral
Phase/secret-generate
Phase/secret-reencrypt
Phase/secret-show
Phase/workers-classification
Phase/workers-target
Phase/workload-target
## Tuesday Mar 23, 2021
### Confirm image tagging approach + some additional release tagging questions (Andrew K) (Continued)
When discussing [Images #3](https://github.com/airshipit/images/issues/3), Sai pointed out that currently the krm functions are also getting tagged as latest vs. a specific version.
Sean has been working on the tagging/release approach. There are two issues out there currently to address:
• [#419](https://github.com/airshipit/airshipctl/issues/419) - Covers versioning the templater & replacement-transformer.
• [#354](https://github.com/airshipit/airshipctl/issues/354) - Overall release tagging approach
• Sops looks like it’s already version tagged.
Couple of questions:
1) Do we need to add cloud-init to 419 or create a new issue to version tag cloud-init?
* Added to release automation here: https://review.opendev.org/c/airship/airshipctl/+/780875
* Added to version pinning here: https://review.opendev.org/c/airship/airshipctl/+/767179
2) Do we do anything with the Makefiles for the krm functions in https://github.com/airshipit/airshipctl/tree/master/krm-functions? Each Makefile for cloud-init, replacement-transformer & templater points to the “latest” DOCKER_IMAGE_TAG
https://github.com/airshipit/airshipctl/blob/master/krm-functions/templater/Makefile#L6-L13
* that is the default tag, it can be overridden, which we do for both git SHA (Zuul post job) and semver tags (github action)
3) Besides answering 1) & 2) above, is there anything else we need to address? Sean has some additional questions in #354 that need review: https://github.com/airshipit/airshipctl/issues/354#issuecomment-801436275
### Discuss Treasuremap Branching Strategy (Matt F)
Related to [#50](https://github.com/airshipit/treasuremap/issues/50), discuss path forward for branch renaming in Treasuremap. Previous notes recommended moving v2 -> master, renaming master to perhaps v1.9, and leaving v1.8 as is.
* Is this still the consensus?
* When will the change take place? (and who will do it?)
* Any other related items that need to be communicated on the mailing list before the change occurs?
## Thursday Mar 18, 2021
### CONTINUED: Static schema validation of airshipctl documents (Ruslan Aliev)
This is how we do validation today: https://github.com/airshipit/airshipctl/blob/master/tools/document/validate_site_docs.sh
The topic is related to issue [#19](https://github.com/airshipit/airshipctl/issues/19). Proposed solution: a KRM function which can validate input documents via Kubeval.
Validation of custom resources is possible by extracting openAPIV3Schema from needed CRDs and converting them to JSON. Using this approach, each phase (which has a documentEntryPoint) could be validated by creating a new one with the same documentEntryPoint, but pointing a new Executor (KRM based on GenericContainer).
:::warning
*airshipctl phase validate [NAME]*
validate will:
- 1st Phase validate document (Phase, Execution)
- Generic container kubeval for the phase itself
Document like PhaseValidationConfig
...
A config here specifies the CRDs, such as https://review.opendev.org/c/airship/airshipctl/+/780681/3/manifests/site/test-site/phases/phase-patch.yaml#27 :
crdList:
- function/airshipctl-schemas/versions-catalogue.yaml
- function/airshipctl-schemas/network-catalogue.yaml
- https://raw.githubusercontent.com/tigera/operator/release-v1.13/config/crd/bases/operator.tigera.io_installations.yaml
- function/capi/v0.3.7/crd/bases/cluster.x-k8s.io_clusters.yaml
- function/cacpk/v0.3.7/crd/bases/controlplane.cluster.x-k8s.io_kubeadmcontrolplanes.yaml
- function/capm3/v0.4.0/crd/bases/infrastructure.cluster.x-k8s.io_metal3clusters.yaml
- function/capm3/v0.4.0/crd/bases/infrastructure.cluster.x-k8s.io_metal3machinetemplates.yaml
- global/crd/baremetal-operator/metal3.io_baremetalhosts_crd.yaml
- function/cabpk/v0.3.7/crd/bases/bootstrap.cluster.x-k8s.io_kubeadmconfigtemplates.yaml
- function/capi/v0.3.7/crd/bases/cluster.x-k8s.io_machinedeployments.yaml
- function/hwcc/crd/bases/metal3.io_hardwareclassifications.yaml
- function/flux/helm-controller/upstream/crd/bases/helm.toolkit.fluxcd.io_helmreleases.yaml
- function/flux/source-controller/upstream/crd/bases/source.toolkit.fluxcd.io_helmrepositories.yaml
:::
Finally, we can define a PhasePlan containing validation phases; launching it allows us to validate all documents related to a particular site.
PS to discuss - https://review.opendev.org/c/airship/airshipctl/+/780681/
Proposed Path Forward:
1. a `phaseplan validate` command that walks the phases in a plan
2. `phaseplan validate` would discover most schemas from the document set
3. a superset of "extra" CRD pointers (not in the doc set) are shared across the validated plan
4. circle back after `phaseplan validate` and figure out how we want to do `phase validate` which is actually trickier
### EndpointCatalogue needed for Dex? (Matt Fuller)
This is related to issue [#317](https://github.com/airshipit/airshipctl/issues/317). Previous investigation didn't indicate a strong need for an EndpointCatalogue as replacements were already covered by existing Network and VersionsCatalogues. However, a discussion last week about the Dex function in treasuremap seemed to indicate an EndpointCatalogue may be needed after all. Is this slated for future work, or should this be part of 2.0 release?
Path forward:
1. Close #317
2. Create a new issue in treasuremap for airship/charts format endpoints
### Confirm image tagging approach + some additional release tagging questions (Andrew K)
When discussing [Images #3](https://github.com/airshipit/images/issues/3), Sai pointed out that currently the krm functions are also getting tagged as latest vs. a specific version.
Sean has been working on the tagging/release approach. There are two issues out there currently to address:
• [#419](https://github.com/airshipit/airshipctl/issues/419) - Covers versioning the templater & replacement-transformer.
• [#354](https://github.com/airshipit/airshipctl/issues/354) - Overall release tagging approach
• Sops looks like it’s already version tagged.
Couple of questions:
1) Do we need to add cloud-init to 419 or create a new issue to version tag cloud-init?
* 419 is about pinning to versions; we could add cloud-init to it, but first we need to be publishing git SHA and/or semver versions to pin to. It looks like we are attempting to push to a non-existent repo for cloud-init, which is causing the other images in that repo to not be published as well: https://zuul.opendev.org/t/openstack/builds?job_name=airship-airshipctl-publish-image
2) Do we do anything with the Makefiles for the krm functions in https://github.com/airshipit/airshipctl/tree/master/krm-functions? Each Makefile for cloud-init, replacement-transformer & templater points to the “latest” DOCKER_IMAGE_TAG
https://github.com/airshipit/airshipctl/blob/master/krm-functions/templater/Makefile#L6-L13
* that is the default tag, it can be overridden. We are pushing git SHA and semver tags for replacement-transformer and templater. Not sure about cloud-init.
3) Besides answering 1) & 2) above, is there anything else we need to address? Sean has some additional questions in #354 that need review.
TODOs:
1. cloud-init repo exists, and we can add it to #419 alongside our other krm fns
## Tuesday Mar 16, 2021
### airshipctl secret decrypt not working (Sreejith)
airshipctl secret decrypt is giving the message that it's not implemented, as shown below.
airshipctl secret decrypt --src secrets.yaml --dst decryptes_secrets.yaml
not implemented: secret encryption/decryption
Do we have plans to implement this for the 2.0 release?
Comment from Alexey O.:
what is the use-case?
JFYI we have https://review.opendev.org/c/airship/airshipctl/+/780670 that allows you to do `airshipctl phase run secret-show` to see all generated secrets decrypted on the screen. Does it cover your use-case?
ISSUE: Discuss/design a better user experience for understanding how to decrypt a particular secret artifact. Post v2. **Created [#489](https://github.com/airshipit/airshipctl/issues/489)**
### Static schema validation of airshipctl documents (Ruslan Aliev)
[Continued here](https://hackmd.io/QiEksO4fRk-MnBjwBFaAkQ#CONTINUED-Static-schema-validation-of-airshipctl-documents-Ruslan-Aliev)
## Tuesday Mar 9, 2021
Topics left over from last week.
### Update on Host Config Operator periodic checks (Andrew/Sirisha)
Discussion notes here https://hackmd.io/QCSjN1NWQ1qLdPcX8C7_sg
### Host Config Operator to provide day 2 storage clean up capabilities (Sreejith/Vladimir)
https://github.com/airshipit/hostconfig-operator/issues/3
We have Porthole for performing some of the day 2 operations. Is hostconfig-operator supposed to do all of Porthole's functionality, or is it supposed to do activities like monitoring the Ceph cluster, rebalancing, clearing unused pools, etc.?
### Keepalived VRRP for Service VIPs (Manoj Alva)
Clarification around the following items.
- Placement of the KubeadmControlPlane manifest under function/k8scontrol directory in treasuremap
- keepalived installation and service start will be done using a shell script passed as preKubeadmCommands instead of baking it into the ISO image (a rough sketch follows below).
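A minimal sketch of that second bullet, assuming a CAPI v1alpha3 KubeadmControlPlane; the keepalived config values, interface name, and apt-based install are placeholders rather than the reviewed design:
```yaml
apiVersion: controlplane.cluster.x-k8s.io/v1alpha3
kind: KubeadmControlPlane
metadata:
  name: cluster-controlplane
spec:
  kubeadmConfigSpec:
    files:
      - path: /etc/keepalived/keepalived.conf
        owner: root:root
        permissions: "0644"
        content: |
          vrrp_instance VIP_1 {
            state BACKUP
            interface ens3
            virtual_router_id 51
            priority 100
            virtual_ipaddress {
              10.23.25.102/24
            }
          }
    preKubeadmCommands:
      # install and start keepalived before kubeadm brings up the control plane
      - apt-get update && apt-get install -y keepalived
      - systemctl enable --now keepalived
```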
#### Note:
- ISSUE: Extend the VRRP service implementation to comply with security considerations via appropriate policies (??), given the ingress is now on the OAM IP
- ISSUE: Test the performance of the VRRP service implementation given the ingress is now on the OAM IP
## Thursday Mar 4, 2021
### What are our targeted phases? (Matt)
This has been a bit of an evolving target - what should people do today?
* How will we batch up Infra, Operators, Operator Config, LMA Workload...
* We have some extra phase(s) today out of necessity that we plan to remove
* For the discussion https://hackmd.io/7vQOSYADSVessB9zZ0P-Kw
* Move the discussion notes around changing metadata and enhancing phases to the HackMD note above. Both issues below refer to those topics.
**ISSUE** : Future optimization to try to allow for reuse of phases uniquely.
Possible solution: change metadata to include an attribute indicating that paths should include the cluster name.
**ISSUE**: Future decision of how to better integrate extended wait logic into executor or airshipctl in general. Which will allow us to simplify the number of phases as well. Might mean adding a concept of steps to a Phase?
## Tuesday Mar 2, 2021
### Keepalived VRRP for Service VIPs (Anand/Andrew K)
Discuss keepalived VRRP design for K8s Service VIPs of the undercloud cluster.
* Upstream this should be the same we do downstream (for undercloud k8s api/services)
Discuss the HA for Ironic "PROVISIONING_IP".
- Should airshipctl handle the implementation via an Ingress ... Operator?
- Maybe develop an operator similar to SIP that creates LBs for tenants.
- Or just use ingress charts to keep the implementation simple.
https://github.com/airshipit/treasuremap/issues/94
Path Forward:
* Anand to follow up w/ Sai on how to handle ironic ingress vip
* Follow the RDM9 approach of configuring keepalived via the kubeadmcontrolplane resource, and baking keepalived into the base image
* Add replacement of keepalived variables (VIPs, vlans, etc) from the Networking Catalogue into the KubeadmControlPlane
* Note: the Kubernetes community best practice for API load balancers is to manage them outside of kubernetes itself. This is why Airship 2 diverged from the Airship 1 / OSH approach of configuring a containerized keepalived in the Ingress chart.
### ViNO mac address management (Matt)
How should ViNO manage mac addresses for VMs it creates?
Assumptions:
* libvirt (via vino-builder) needs to know mac addresses
* metal3 BMH networking secret (via vino controller) needs to know mac addresses
* nothing else needs to know the mac addresses
Potential approaches:
* Add mac addresses into ViNO CR input -- shift responsibility to the user (seems bad)
* ViNO controller allocates/calculates macs (similar to its IPAM), and puts them into the vino-builder config and the BMH Secret
* vino-builder allocates/calculates macs and passes it back to vino controller to put in the BMH Secret
Path Forward:
* Populate mac addresses with a dummy value for now
* Look into how robust libvirt mac generation is, and whether there's any risk of collisions
* Look into how vino builder could pass mac addresses back to the vino controller
* As a Plan B, vino controller could generate/track mac addresses similarly to what it does for IP addresses
### Treasuremap Issue template
* Appears to have reverted from the recent template changes
* We were going to update to reflect the type direction
Path Forward:
* The change was made to treasuremap sometime in the last two weeks (the .github folder has disappeared). Andrew to look through git history so we can figure out who/why.
## Thursday Feb 25, 2021
### Injecting SSH authorized keys into Vino VMs (sean)
What is the upstream solution for getting SSH authorized keys injected into the vino VMs, so that we can access them from the SIP jump container. Does anyone know what functionality exists in CAPI or metal3 for injecting SSH authorized keys (or arbitrary file content) into the hosts? Or does it already do that by default? If not, I see metal3 BMH allows specifying ironic metadata/userdata, would that be a good approach?
Cluster API bootstrap provider Kubeadm (CABPK) has direct support for adding ssh authorized keys at node creation time:
https://github.com/kubernetes-sigs/cluster-api/blob/master/docs/book/src/tasks/kubeadm-bootstrap.md#additional-features
This is the path forward; ViNO will have no role in key injection. New keys will require redeployment of the vBMH as appropriate.
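For reference, a minimal sketch of how the CABPK feature linked above exposes this through the bootstrap config's `users` field (user name and key are placeholders):
```yaml
apiVersion: bootstrap.cluster.x-k8s.io/v1alpha3
kind: KubeadmConfigTemplate
metadata:
  name: worker-bootstrap
spec:
  template:
    spec:
      users:
        - name: deployer                      # placeholder user
          sudo: "ALL=(ALL) NOPASSWD:ALL"
          sshAuthorizedKeys:
            - "ssh-rsa AAAA... operator@jump" # placeholder public key
```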
### Catalogues & Replacement as interface to functions (Matt/Sidney)
This patchset proposes an idea for using a catalogue-based approach to configuring a function (I think) -- putting all of the function's tunables in one place. A little like a chart's overrides.yaml. Let's talk through the use case and implementation of this.
* This is a little different than our use cases for catalogues so far
* Is the extra layer of abstraction something we want?
* If so, where do we put the moving pieces (catalogue, transformer config...)?
https://review.opendev.org/c/airship/treasuremap/+/776528
## Tuesday Feb 23, 2021
### Review first pass of Treasure Map function > type mapping
https://docs.google.com/spreadsheets/d/1-sq2j-JzD9Jv2D6FTimzD6eqa45_iRP8BYNlAw2StPI/edit#gid=0
### Discuss Host Config Operator checks & validations
From the 2/18 design call, continue discussion on how to periodically check & remediate deny list, permissions & limits violations via Host Config Operator.
https://github.com/airshipit/hostconfig-operator/issues/10
### Sequencing CRD establishment & CR delivery (matt)
We discussed an issue a few weeks ago around the fact that the tigera-operator itself creates its CRDs (as opposed to airshipctl delivering the CRDs along with the operator), which leads to a race condition: we have to wait for the operator to do its thing before moving on to the next phase that delivers CRs.
We've hit the same issue with our LMA operators. If this is going to be a common problem, we may want a generic mechanism to handle it -- something like "wait for CRD [X] to be established before moving on", where [X] is driven by metadata.
I think we talked before about a shorter-term "CRD wait as its own new phase, using generic container" approach, and maybe a longer-term "CRD wait as a wait condition on the existing phase". Either way, the logic could be based on something like this (but containerized):
https://review.opendev.org/c/airship/airshipctl/+/769617/12/tools/deployment/26_deploy_metal3_capi_ephemeral_node.sh
(sean) CRD wait is already supported by kstatus:
https://github.com/kubernetes-sigs/cli-utils/blob/535f781a3c1b1d66b06a93f59e7a03c03af81477/pkg/kstatus/status/core.go#L559
so the issue is just that the operators are creating their own CRDs rather than kustomize. Do the operators provide status conditions which indicate when their CRDs are established?
tigera operator (no): https://github.com/tigera/operator/issues/1055
lma operators (???): ???
github issue updated with this discussion: https://github.com/airshipit/airshipctl/issues/443
## Thursday Feb 18, 2021
### Functions & Treasuremap Types
*If this belongs in Flight Plan then we can defer to that meeting*
We need to review the manifest functions being developed in Treasuremap to determine whether they should be included in the airship-core or treasuremap types. We've discussed the approach for identifying [Jan 13 Flight Plan Call](https://hackmd.io/93_0K4AAR9izrEpuDMa5Rw#Jan-13-2021), but haven't done the review yet.
:::warning
* Goal keep airship core simple and functional
* From LMA only logging
* Dex ..
* HostConfig
* Helm Operator (*)
* Helm Chart Collator
* Ingress (VIP based... works anywhere without BGP expectations)
* Rook and Ceph:
* Some basic configuration?
* Default behaviours of rook, nothing prescriptive
* Rest of functions will go into NC type
* Airship core functions +
* LMA complete stack (dashboards, prometheus, etc)
* Rook and ceph, prescriptive configuration for NC type
Moving forward, new functions will follow these procedures to determine where they go:
* How do we tag/identify what functions go with what types? Labels, new issues?
* Create label for each type for identification purposes.
* When creating the issue for the function, include which type(s) in which the function should be included.
* The developer who is working the function creation issue should be responsible for including it in the types that are specified in the issue.
* We are making the assumption all other types inherit airship-core, and will also inherit the functions associated with airship-core.
* When a new type is defined, part of the issue to create the type should include which functions are part of the type.
:::
### Image-builder: blacklist package (Sreejith)
Divingbell supports blacklisting packages. Do we need this in image-builder?
## Tuesday Feb 16, 2021
* [Upgrade docker/containerd](https://github.com/airshipit/hostconfig-operator/issues/2) - hostconfig-operator - Issue #2 - (Sirisha)
* Design document and detailed analysis of the scenarios executed to upgrade docker/containerd can be found here - https://hackmd.io/1wzoYuNeSzuEp7XvxIOr8Q
* Should we support downgrading packages?
* PS to address upgrading docker/containerd.io is here - https://review.opendev.org/c/airship/hostconfig-operator/+/773389
* [Emergency package installation/update](https://github.com/airshipit/hostconfig-operator/issues/4) - hostconfig-operator - Issue #4 - (Sirisha)
* Design document - https://hackmd.io/z2eKJPPDTAeEh0dYfnCOig
## Thursday Feb 11, 2021
* (Constantine Kalynovskyi) - Kubeconfig phase integration,
[related issue](https://github.com/airshipit/airshipctl/issues/460)
Open question(s):
* how will dynamic config work if we're going to delete the ephemeral/bootstrap cluster (need to switch to filesystem? how to store the kubeconfig? probably it should be encrypted)? 'Dynamic' mode will work really well for tenant clusters though...
* For how long will the kubeconfig generated by Cluster API for the target cluster be valid? What if the CA is compromised and we need to rotate it? What if we need a new kubeconfig for any other reason? Are we going to put the new kubeconfig into the cluster?
* (Ahmad Mahmoudi) - Undercloud cluster lifecycle updates and how they impact tenant sub-cluster workloads
Open questions:
* Need to review this flow for day2 undercloud lifecycle changes and sub-cluster impacts
* For each undercloud BMH node to be updated, perform the following steps:
* If the undercloud BMH node is a worker node, perform the following steps. For control plane BMH nodes, skip this step and move to the next step:
* Identify the impacted tenant sub-cluster vBMH nodes running on the BMH node being updated (node labels?), and drain the impacted sub-cluster vBMH nodes. Once the sub-cluster vBMH nodes are drained, remove the vBMH nodes from the impacted sub-clusters.
* **Q: What tool drives this, airshipctl phase?** This step is not needed. To start with, we extend the healthcheck timeout to allow for re-deployment of the BMH during upgrade.
* When the impacted vBMH is deleted, ViNO starts the sub-cluster's `inactive` vBMH, SIP sets the sub-cluster vBMH labels, and CAPI reconciliation joins the vBMH to the sub-cluster.
* **Q: Check with ViNO, SIP and CAPI, if/how the `inactive` vBMH is started?**
* **Q: Is the assumption that each sub-cluster is deployed with a spare vBMH node marked `inactive` to be used for lifecycle updates correct?**
* No!! vBMHs are labeled beforehand. The pool of vBMHs labeled by SIP is done beforehand.
* ViNO doesn't do anything here, other than when the host comes back up after re-deployment.
* Drain the BMH node, wait until all node resources are vacated, delete the BMH node from the undercloud cluster, shut down the BMH node, re-deploy the node and join the cluster.
* **Q: Does CAPI resiliency drive this step or do we need airshipctl for this?** CAPI will do all this for identified BMH(s): drain, delete from cluster, redeploy and join the cluster.
* **Open Question: For the worker nodes, how can we group the nodes to be redeployed? Serially will take a long time; per-rack might bring down too many BMHs. This needs to be tested and assessed.**
* [Upgrade docker/containerd](https://github.com/airshipit/hostconfig-operator/issues/2) - hostconfig-operator - Issue #2 - (Sirisha) - Moving it to 16th Feb
* Design document and detailed analysis of the scenarios executed to upgrade docker/containerd can be found here - https://hackmd.io/1wzoYuNeSzuEp7XvxIOr8Q
* Should we support downgrading packages?
* PS to address upgrading docker/containerd.io is here - https://review.opendev.org/c/airship/hostconfig-operator/+/773389
---
## Tuesday Feb 9, 2021
* (Manoj Alva) [Apply failsafe North - South network policies via Calico chart #32(Treasuremap)](https://github.com/airshipit/treasuremap/issues/32)
* Is there agreement on the default parameters?
* *At a minimum, this would mean disabling Calico's default failsafe parameters by setting FailsafeInboundHostPorts and FailsafeOutboundHostPorts to "none".*
* What needs to be defined when replacing the default failsafe rules?
* Does this mean we define GlobalNetworkPolicy resource as described in reference sample (https://docs.projectcalico.org/reference/host-endpoints/connectivity)
* Failsafe Default rules (https://docs.projectcalico.org/reference/host-endpoints/failsafe)
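For context, a hedged sketch of what disabling the failsafe ports could look like via Calico's FelixConfiguration (field casing and the empty-list form should be verified against the Calico version in use):
```yaml
apiVersion: projectcalico.org/v3
kind: FelixConfiguration
metadata:
  name: default
spec:
  # an empty list is intended to disable the default failsafe rules entirely
  failsafeInboundHostPorts: []
  failsafeOutboundHostPorts: []
```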
### [Upgrade docker/containerd](https://github.com/airshipit/hostconfig-operator/issues/2) - hostconfig-operator - Issue #2 - (Sirisha)
* Detailed analysis of the scenarios executed to upgrade docker/containerd can be found here - https://hackmd.io/1wzoYuNeSzuEp7XvxIOr8Q
* PS to address upgrading docker/containerd.io is here - https://review.opendev.org/c/airship/hostconfig-operator/+/773389
* Once we upgrade docker/containerd it requires a service restart, which affects the Kubernetes pods and applications running on that node
* And if the upgrade is on the k8s node where the hostconfig-operator pod is running, then there is downtime in the Kubernetes cluster as etcd leader election can take time.
* After the k8s cluster is up, the hostconfig-operator pod on the same node can become leader, or another replica running on a different node can become leader
* Once the leader hostconfig-operator pod comes up it re-executes all the HostConfig CRs in the cluster.
* So all configuration roles in the hostconfig-operator have to first check if the configuration exists and only execute if it doesn't, or there should be no effect even if the configuration re-executes (i.e. they must be idempotent)
* No two CRs should have the same configuration defined for the same node, e.g. sysctl configuration defined in two CRs for the same node. In this case it is not clear which configuration executes first.
* Are these consequences expected, or should we drop support for upgrading docker/containerd?
* Both upgrade/downgrade can happen with the above PS; looking for pointers on how to restrict to upgrade only?
### [Emergency package installation/update](https://github.com/airshipit/hostconfig-operator/issues/4) - hostconfig-operator - Issue #4 - (Sirisha)
* Do we need to support package installation/update through apt/yum or do we have to support any other procedures like
* wget
* executing script
* installing .deb/.rpm package supplied - how will it be supplied?
* pip package installation
* Upgrading a package will not be in scope as it conflicts with [Issue #2](https://github.com/airshipit/hostconfig-operator/issues/2)
* If the package/binary already exists we would throw an error, as it would be part of upgrade.
* What do we need ...
* What packages do we support
* Based on what Airship 1 supports via MiniMirror, that is your starting possible target
* Derive from that list what you need to support
* apt ...? for the most part
* Should look at Divingbell apt management:
* https://github.com/airshipit/divingbell/blob/master/divingbell/templates/bin/_apt.sh.tpl
* Do we want to whitelist the packages that we support via these HostConfig operations? In other words, we want to make sure things like containerd or kubelet are not upgraded through this mechanism. Blacklist of packages...
The work (a hypothetical CR sketch follows this list):
* CR:
* This is how I deliver the list of packages
* This is how I inform what I want to do with them
* Playbook
* Role/Task for package consumption and applying them
* What error conditions?
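A purely hypothetical sketch of what such a CR might carry — the `packages` section and its fields do not exist today and are exactly what this design needs to define; the group/version and `host_groups` layout only approximately follow the existing hostconfig-operator CRs:
```yaml
apiVersion: hostconfig.airshipit.org/v1alpha1   # assumed group/version
kind: HostConfig
metadata:
  name: emergency-package-install
spec:
  host_groups:
    - name: kubernetes.io/role
      values:
        - worker
  config:
    packages:               # hypothetical section for this issue
      - name: tcpdump
        state: present
      - name: strace
        state: present
```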
## Thursday Feb 4, 2021
* SIP is going to enable operators to power-cycle virtual machines from its jump-host service.
1. Should the jump pod power-cycle virtual machines using Redfish or libvirt?
2. What do we need to do to restrict access to which VMs an operator can cycle? i.e. we can limit which virtual machines they see inside the jump pod, but do we need a more robust approach to locking down access to the other VMs?
:::info
Discussion Image https://go.gliffy.com/go/publish/13446596
* Redfish, because we cannot block libvirt policy-wise
* SIP will create pod NetworkPolicies to allow ssh and http/https access to VMs and the appropriate services/ports
* SIP will craft an appropriate config file that the DMTF Redfish tooling can use as input, perhaps for the list of valid Redfish targets?
* Is it a single config with all targets, a cluster VM endpoints document?
* vm name | fqdn | ip | http url | ssh ..|
* What is the vm name?
* It's the BMH name... properly correlated.
* BMH name is a complex name that includes rack, server, number in rack, vm number, etc.
* BMH :: libvirt :: ... <- everything is the same name
* A wrapper script can use this to call things...
* ....
:::
## Tuesday Feb 2, 2021
## Thursday Jan 28, 2021
### How to host krm function containers out of private repos?
* We will have many many references like the following, across multiple repos
```yaml
config.kubernetes.io/function: |-
  container:
    image: quay.io/airshipit/replacement-transformer:latest
```
* At render time, we may not have access to quay, and need to use e.g. artifactory
* We could serve it out of a local docker cache, configured to pull from artifactory but serve images up with quay tags? (Not sure if this is possible)
* We could do a Big Sed (s/quay.io/my-artifactory/), either on disk or dynamically as part of `airshipctl phase run`
* Any other options?
* Patching the downstream git repos how we want them.
* Docker Registry/Caching: need to understand if this gives us an option.
* Future possibility: https://github.com/GoogleContainerTools/kpt/issues/1204
This patchset introduces a solution for the versioning issue: https://review.opendev.org/c/airship/airshipctl/+/767179
New ISSUE for this discussion: https://github.com/airshipit/airshipctl/issues/457
### Secret rotation UX (Matt)
* This script demonstrates how to invoke `airshipctl cluster rotate-sa-token`
* https://github.com/airshipit/airshipctl/blob/master/tools/deployment/provider_common/42_rotate_sa_token.sh#L21-L39
* Note how much bash needs to occur first to extract the name of the secret(s) to rotate
* Can we enhance the command to do that work?
Need to verify if ASPR/Security Requirements require SAs to be rotated.
Even if we find we do need to, we will create an issue for Post v2 scope...
* Perhaps an operator in the cluster managing these internally. A CR tells it what to manage, and policies like frequency, etc.
### Secret decrypt discussion continued from Tuesday
https://review.opendev.org/c/airship/airshipctl/+/772467 see the flag TOLERATE_DECRYPTION_FAILURES=true in https://review.opendev.org/c/airship/airshipctl/+/772467/2/manifests/site/test-site/target/generator/results/decrypt-secrets/configurable-decryption.yaml#10
https://github.com/GoogleContainerTools/kpt-functions-catalog/pull/153
https://github.com/airshipit/airshipctl/issues/453
## Tuesday Jan 26, 2021
### Walk through the Secrets only using phases approach (Alexey)
* Explain where we are at?
* Can we deprecate the secrets command?
* Any gaps in scope :
* ISSUES [#453](https://github.com/airshipit/airshipctl/issues/453)
* Decrypt/phase render will fail if the user doesn't have access to the keys.
* The behaviour should be that it simply renders encrypted.
* Having access to the keys implies privileges.
* ISSUES for documenting *how to secrets* in airship
* Patchsets related to this:
* https://review.opendev.org/q/topic:%22generator_sops_encrypter%22+(status:open%20OR%20status:merged)
### Synchronize Labels between BareMetalHosts and Kubernetes Nodes (Arvinder)
* The design doc ready to merge. Waiting on lazy consensus by end of this week: https://github.com/metal3-io/metal3-docs/pull/149
* The PR for the feature is also under review: https://github.com/metal3-io/cluster-api-provider-metal3/pull/152
* Demo
## Thursday Jan 21, 2021
### Some small items
* Should document pull have clean options
* To clean target path
* or simpler document clean-ups such as `git clean -f -d .` does.
* LEAVE IT AS IS. Managing TargetPath manually.
* document pull - should validate that the Manifest configuration between the URL and the expected authentication mechanism is accurate
* ISSUE (fix/enhancement ..)
* Bug: calling document pull twice seems to have an issue. [ISSUE]
* config init [ISSUE]
* defaults to treasuremap but pointing to master; should it be branch v2? Or is this tied to the release management discussion for the future?
* Message about TargetPath default ...
### Issues needed design
* Implement log collection in 3rd party gates for deployment success/failure tracking #449
* Two options
* Jenkins on the WWT
* Nexus Server up on same WWT (**Direction**)
* Some capacity calculation to determine policies
* Will store both success and failures
### Walk through Baremetal Inventory Approach (Kostiantyn)
https://review.opendev.org/c/airship/airshipctl/+/771083
## Tuesday Jan 19, 2021
Canceled
## Thursday Jan 14, 2021
### Plan run: Parallel vs Sequential phase execution (Dmitry)
Let's make a final decision regarding the phase execution strategy within a phase group. Initially we discussed that phases within a group are executed sequentially and groups are executed in parallel. This may lead to a bit of inconvenience in case we need a dependency between groups (the user has to create a separate plan). For example:
``` yaml
---
apiVersion: airshipit.org/v1alpha1
kind: PhasePlan
metadata:
  name: phasePlan
description: "Default phase plan"
phaseGroups:
  - name: group1
    phases:
      - name: initinfra-ephemeral
      - name: initinfra-networking-ephemeral
      - name: clusterctl-init-ephemeral
      - name: controlplane-ephemeral
      - name: initinfra-target
      - name: initinfra-networking-target
      - name: workers-target
      - name: workers-classification
      - name: workload-target
---
apiVersion: airshipit.org/v1alpha1
kind: PhasePlan
metadata:
  name: phasePlan2
description: "Another phase plan"
phaseGroups:
  - name: group-which-depends-on-group1
    phases:
      - name: openstack-deployment
```
An alternative approach is to execute groups sequentially and phases within a group in parallel (i.e. the opposite logic). This eventually leads to another inconvenience: a significant number of groups with a single phase, and once we add wait tasks it becomes worse. For example:
``` yaml
---
apiVersion: airshipit.org/v1alpha1
kind: PhasePlan
metadata:
  name: phasePlan
description: "Default phase plan"
phaseGroups:
  - name: group0
    phases:
      - name: initinfra-ephemeral
  - name: group1
    phases:
      - name: wait-initinfra-ephemeral
  - name: group2
    phases:
      - name: initinfra-networking-ephemeral
  - name: group3
    phases:
      - name: clusterctl-init-ephemeral
  - name: group4
    phases:
      - name: controlplane-ephemeral
  - name: group5
    phases:
      - name: initinfra-target
  - name: group6
    phases:
      - name: initinfra-networking-target
  - name: group7
    phases:
      - name: workers-target
  - name: group8
    phases:
      - name: workers-classification
  - name: group9
    phases:
      - name: workload-target
```
Ongoing v2 Audit effort https://docs.google.com/spreadsheets/d/1YnVC_yQr7m-TUDaLz0xGndkoVIlO2emIwoGirbPBz-8/edit#gid=1852982768
## Tuesday Jan 12, 2021
### `kpt pkg sync` proposal (Sean)
See https://github.com/airshipit/airshipctl/issues/430#issuecomment-756872360
We agree we like using the kpt files for driving release management of upstream functions/provenance.
Will use treasuremap CI/CD to drive updates and tagging as needed to ensure no drifting occurs.
Ongoing v2 Audit effort https://docs.google.com/spreadsheets/d/1YnVC_yQr7m-TUDaLz0xGndkoVIlO2emIwoGirbPBz-8/edit#gid=1852982768
## Thursday Jan 7, 2021
## Tuesday Jan 5, 2021
### Quick revisit of Variable Catalog Open issues
[#363](https://github.com/airshipit/airshipctl/issues/363) - Define CRD/Schema approach
[#317](https://github.com/airshipit/airshipctl/issues/317) - Define Endpoints catalog
* Should we go ahead & break #317 into multiple issues as Matt suggests in the comments?
* New issue for OpenStack-Helm (see below)
* Update 317 with schema example & item to reconstruct URL from piece parts
* If we do, would there be multiple endpoint catalog CRDs or just one?
* Should we break #363 into multiple issues, i.e. one issue per catalog CRD?
* In https://github.com/airshipit/airshipctl/tree/master/manifests/function/airshipctl-base-catalogues we currently have networking, versions & environment variables (assuming optional as it's part of the template). Confirm endpoints is the only outstanding catalog (didn't see any other issues).
endpoint format
```yaml=
apiVersion: airshipit.org/v1alpha1
kind: VariableCatalogue
metadata:
  name: endpoint_generic
  labels:
    airshipit.org/deploy-k8s: "false"
spec:
  <the-endpoint-name>:
    namespace: TBD if we need this here..
    fqdn: name.iam-sw.DOMAIN
    path: /v3
    protocol: "https"
    port: 443
```
:::warning
Openstack will need its own endpoint catalogue that mimics the existing format 100% to avoid changes to helm toolkit.
Example catalogue:
https://raw.githubusercontent.com/airshipit/treasuremap/master/site/seaworthy/software/config/endpoints.yaml
:::
## Thursday Dec 17, 2020
### Continue the airshipctl cluster put kubeconfig?
.... https://hackmd.io/U_8VYjK0Qoe24Us8IMQpcQ
### Work the details for the Image Builder CowImage
Where does the data that delivers host and hardware configuration live?
### SIP & ViNO interactions
- ViNO adds a `tenant-cluster=x` label to BMHs: does this label mean that a particular BMH can only be scheduled to that subcluster? Does ViNO apply this label to each BMH object it creates?
- No, ViNO does not add the tenant-cluster or subcluster label; that is done by SIP. ViNO simply adds labels such as Rack, Server, Flavor to the VM-associated BMH.
- Where should we define the ViNO bridge interfaces? Current ViNO CR example:
```yaml
nodes: # Change this to vBMH or nodes?
  - name: master
    networkInterfaces:
      - name: mobility-gn
        type: sriov-bond
        network: mobility-gn
        mtu: 9100
        options: # these key value options look like they belong to HOST node instead they are defined on VM level
          pf: [enp29s0f0,enp219s1f1]
          vlan: 100
          bond_mode: 802.3ad
          bond_xmit_hash_policy: layer3+4
          bond_miimon: 100
```
## Tuesday Dec 15, 2020
### Airship MultiProvider Cohesiveness
- Discuss current differences between BM and cloud providers in terms of lifecycle experiences and expectations
- Goal is that it should be the same... Will try to identify the gaps
* A couple things in progress which will help with this goal:
- `airshipctl phaseplan run` will make the selection of phases
to run be driven by the PhasePlan manifest itself
- This will solve for isobuilder
- standing up a kind cluster for everything other than BM
- Serves the same purpose as ephemeral cluster
- Related to bootstrap containers, but I don't think that solves it
- The ability to ingest dynamically-generated kubeconfigs into
the document set will make bare metal and public cloud kubeconfig
representations look more similar
- Any provider-specific actions, like extracting the kubeconfig, can be formed into generic container executor phases
- Same goes for any "custom waiting" that needs to loop
- verify_hwcc_profiles.sh -- testing-only script
- Path forward: form into a generic executor container
Thought:
* Cloud providers provide a "get credentials" mechanism; we have `airshipctl cluster get kubeconfig <cluster>`
A convenient way to get the kubeconfig... however it retrieves from the live cluster.
Should or can we integrate with the public cloud notion... Can we use this?
After we create the cloud provider cluster, which stores the kubeconfig in the management cluster,
we can use the cloud provider to store <data> into the provider...
...
Then we can rely on the cloud persistence infrastructure to have the kubeconfig structure...
The issue is the persistence on the filesystem...
**KNOWN ISSUES or DIVERGENCES**
* kubeconfig management:
We are programmatically specifying the kubeconfig for bare metal, vs. cloud providers where the kubeconfig is generated.
Discussion was to try to merge the kubeconfigs into a single entity.
Airshipctl will not be responsible for persisting beyond the filesystem.
Airshipctl will just make sure that the kubeconfig in manifests, or explicitly stated outside, is updated.
* hwcc: not needed for cloud providers other than Metal3 since it's tied to the BMO/Ironic mechanisms.
LONG TERM: Issue - define a Phase Post Execution notion, still TBD...
## Thursday Dec 10, 2020
### CI/CD Portability Philosophy
* A self-contained CI/CD philosophy has been proposed by an airshipctl PTL who shall remain nameless.
* In other words, put as much of the CI/CD into makefile targets as possible, so that there is as little reliance on the specific CI/CD platform used (e.g., Zuul vs Jenkins)
* Example: https://review.opendev.org/c/airship/airshipctl/+/765830
* Not everyone in airshipctl community seems aware of this approach, so this agenda item in part is meant to socialize it.
* Arijit: One negative of this approach is losing the debug console: https://zuul.opendev.org/t/openstack/build/1a23ef90c08343e3af30f6b65536fe1c/console Is this acceptable, or is there another workaround available to make it work?
* Arijit: In future we are planning to test different CAPI providers. Is the preferred approach for new targets in the makefile for different providers, or to use same target with argument of different site names?
GAPS/OUTCOME:
* The future roadmap will take advantage of this move to a makefile-driven approach, i.e. introducing airship-in-a-pod and the introduction of plan run.
* Will introduce an accumulation of logs. Will lose some detail in terms of where the deployment fails.
* This of course helps facilitate a common experience between the developers and the CI systems.
* Potential issue: some capabilities available with Zuul might be impacted or not available in Jenkins, i.e. Depends-On.
### Treasuremap Branching
Currently, we have these Treasuremap branches:
* master (Airship 1, with sloop, skiff, foundry etc)
* v1.8-prime branch (latest Airship 1, but only cruiser type functional)
* v2.0 branch (Airship 2)
Ideally, v2 content would be in the main branch for the Airship 2.0 release:
* One simple option: simply rename v2.0 to master, and master to v1.7, and leave v1.8-prime as-is (perhaps rename to v1.8)
* Folks definitely use the v1.8 branch. Does anyone actually use master today?
Let's create an ISSUE for Treasuremap branching so we can discuss this.
A plan for branch realignment. Tie to the v2 release/milestone.
### VINO & SIP
FYI: ViNO and SIP have been added as Airship projects.
Let's take the opportunity to review the plans for integration into Treasuremap.
Do we need VINO & SIP integration in TM for Av2.0, or can it defer to v2.1?
## Tuesday Dec 8, 2020
### Understanding Treasuremap
* Its modules and structure
* How Treasuremap is used with airshipctl
### clusterctl rollout
* https://github.com/kubernetes-sigs/cluster-api/issues/3439
* `clusterctl rollout` command provides users with a convenient and consistent means through which to rollout an update, monitor the status of an update, rollback to a specific update and view the history of past updates.
* Currently focused on `MachineDeployment` with support for `KubeadmControlPlane` coming later.
* Long-term goal is to add the command under `kubectl rollout`.
ISSUE: Integrate with ***airshipctl cluster rollout***.
## Thursday Dec 3, 2020
### Secret Generation
* A design incorporating Alexey's KRM Template-based secret gen idea:
* https://hackmd.io/NW5fySw1QQ2Ex8wMKF9WUg?both
* How to scope the Template(s) themselves?
* Since it's 100% data driven, a single `function/secretgenerator`.
* How to scope the ReplacementTransformer rules for secret gen input/spec:
* `VariableCatalogue(s)` -> `secret gen Template.values`
* This is specific to particular functions, so maybe `function/<name>/secretgeneration`?
* The encrypted/generated secrets themselves must be scoped to site-level.
* How to scope the ReplacementTransformer rules for outputted secrets?
* `kind: Secret data.password` -> `HelmRelease.values.password (for example)`
* Perhaps add this into `function/<name>/replacements`
* Possible Airship documents lifecycle options based on generation/encryption example
* https://docs.google.com/document/d/1CUYMGsEQ9ZYez0mSdG3DyyhPiA8_nRXQf-aFP5WufGc/edit#heading=h.oy2ce573d5m
Following discussion of https://github.com/airshipit/airshipctl/issues/419
```yaml
apiVersion: airshipit.org/v1alpha1
kind: ReplacementTransformer
metadata:
  name: ......
  annotations:
    config.kubernetes.io/function: |-
      container:
        image: quay.io/airshipit/replacement-transformer:latest
```
Need a mechanism to specify the version for the plugin image once, and not in every ReplacementTransformer artifact.
## Tuesday Dec 1, 2020
### PXE boot issue with new ironic python agent
The current ironic images have an issue where the ironic-inspector ramdisk uses a different network interface than the one it was PXE booted from -- resulting in it not being able to report status back to Ironic, which is listening on the PXE network. This has manifested in multiple bare metal deployments that have multiple NICs.
Sai & Konstantine found a workaround -- downgrading the ipa package in the ironic image.
* Do we need to create an issue w/ Ironic for this
* How can we bake this into / use our custom Ironic image
* Let's use this as an opportunity to align / chart a course w.r.t. ironic image overall
* We have an image, but so far it's very light & we're not using it in our deployment manifests: https://github.com/airshipit/images/tree/master/ironic
* We should bake all scripts into the image rather than overriding them in, e.g. https://github.com/airshipit/airshipctl/tree/master/manifests/function/baremetal-operator/entrypoint
* An "isolinux.bin file is not found" ironic misconfiguration error has been reported - maybe address this at the same time. https://github.com/airshipit/airshipctl/issues/420
### Secret stringData & last-applied-configuration
We are using `stringData` as a human-friendly way to author Secret data, with the understanding that Kubernetes will change it into a b64-enc `data` (which it does).
However, the kubectl library also adds a `kubectl.kubernetes.io/last-applied-configuration` annotation, which unfortunately includes the cleartext `stringData`, which rather defeats the point.
Is there some trick we can employ to change this behavior, or do we need to switch from `stringData` to `data`? The ReplacementTransformer has b64-enc capabilities that should make this easier to deal with than it was.
Some discussion here : https://github.com/kubernetes/kubernetes/issues/23564#issuecomment-517931384
Let's open an issue to update/enhance/use the ReplacementTransformer or another encoding transformer plugin to encode the data from the `stringData` field into the `data` field prior to applying against Kubernetes. Issue created: https://github.com/airshipit/airshipctl/issues/424
* from Alexey Odinokov to Everyone: https://github.com/airshipit/airshipctl/blob/master/pkg/api/v1alpha1/replacement_plugin_types.go#L25
* from Matt McEuen to Everyone: https://github.com/airshipit/airshipctl/blob/master/pkg/document/plugin/replacement/transformer.go#L201-L228
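For illustration, here is the same (made-up) Secret authored with `stringData` versus pre-encoded `data`; `czNjcjN0` is simply base64 of `s3cr3t`:
```yaml
# Authored form: readable, but the cleartext ends up inside the
# kubectl.kubernetes.io/last-applied-configuration annotation after apply
apiVersion: v1
kind: Secret
metadata:
  name: example-credentials
type: Opaque
stringData:
  password: s3cr3t
---
# Encoded form: what an encoding transformer would emit before apply
apiVersion: v1
kind: Secret
metadata:
  name: example-credentials
type: Opaque
data:
  password: czNjcjN0
```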
### Align phase list-related work
* This one (in progress) is for `airshipctl phase list <planName>` proper:
* https://github.com/airshipit/airshipctl/issues/358
* This one is for a `airshipctl plan describe <planName>`
* https://github.com/airshipit/airshipctl/issues/394
* Seems to include `phase list` functionality
* Should we simply output the plan description in `phase list`?
* `airshipctl plan list`:
* https://github.com/airshipit/airshipctl/issues/385
* Do we have an issue for listing out phase plans?
* Could do via `phase list` (no `planName`: implicit "all plans")
* Could do via `plan list`
* Do we need a way to list all phases for all plans available to the document set [ ISSUE ]
`airshipctl phase list `
Would:
* Find all documents with kind: Phase.
_________
*airshipctl phase list|get*
All phases in the document set
: Tabular form
phasenameA | phase fields clustername |...
phasenameB | phase fields clustername |...
*airshipctl phase list|get --plan <planname>*
Only phases of the plan <planname>
: Tabular form
phasenameA | phase fields clustername |...
phasenameB | phase fields clustername |...
*airshipctl phase list|get phaseA*
phasenameA | phase fields clustername |...
*airshipctl phase list|get phaseA -o yaml*
Phase artifacts
*airshipctl plan list|get*
: Tabular form
planA | fields clustername | description...
planB | fields clustername | description...
*airshipctl plan list|get planA*
planA | fields clustername |...
*airshipctl plan list|get planA -o yaml*
For the plan description field, we can make sure the description is shortened for the tabular form; if there are special characters such as CR or LF, we can truncate at that point.
VOTE :
| Command | Votes |
| - | - |
| Single command GET | Sudeep,|
| Single command LIST | Matt, Dmitry , Rodolfo, Drew, Srini Muly, Bijaya|
| Multiple Commands GET and LIST | Sean |
## Thursday Nov 19, 2020
### Revisit SOPS encryption subcommand
The GPG team pointed us to documentation that the SOPS library we're using for encryption is not intended for external use.
https://lists.gnupg.org/pipermail/gnupg-users/2020-November/064320.html
https://godoc.org/go.mozilla.org/sops/v3
Let's confirm whether we're comfortable using that library, and if we need to pivot to using the sops binary, what the best integration approach is.
**Alternative**
Implement encryption as a krm-function (as we have for Templater and ReplacementTransformer here https://github.com/airshipit/airshipctl/tree/master/krm-functions) and use generic container executor to run it as a phase or as a subcommand
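A rough sketch of that alternative, reusing the `config.kubernetes.io/function` container pattern already used for Templater/ReplacementTransformer; the kind name and image reference below are purely illustrative assumptions:
```yaml
# Hypothetical krm-function wrapper around the sops binary; kind and image are
# illustrative, not an agreed design.
apiVersion: airshipit.org/v1alpha1
kind: SopsTransformer
metadata:
  name: decrypt-site-secrets
  annotations:
    config.kubernetes.io/function: |-
      container:
        image: quay.io/airshipit/sops:latest
```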
### What is airshipctl cluster status ?
Follow discussion here https://hackmd.io/eiVvMiCwR5KsZSKGWjxlfQ
## Tuesday Nov 17, 2020
### Review list of functions for refactoring analysis
https://hackmd.io/cxcxX6ypQbCNifjJbDJnrg
## Thursday Nov 12, 2020
Follow up to Matt McEuen's question on removing taint from script https://github.com/airshipit/airshipctl/blob/master/tools/deployment/31_deploy_initinfra_target_node.sh#L22-L26
Testing found that ironic, helm-controller, source-controller, the CAPI components, cert-manager, etc. were not coming up. To fix ironic, helm-controller, source-controller, the CAPI components, etc., we can add the toleration in the YAML or patch them with kustomize. But since cert-manager is brought up by clusterctl, we will have to add cert-manager as a function and then deploy it via the init-infra phase.
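As a rough illustration of the kustomize option (deployment name and namespace are placeholders, not the actual component manifests):
```yaml
# Hypothetical strategic-merge patch adding a control-plane toleration and
# nodeSelector; name/namespace stand in for the real component.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: capi-controller-manager
  namespace: capi-system
spec:
  template:
    spec:
      nodeSelector:
        node-role.kubernetes.io/master: ""
      tolerations:
        - key: node-role.kubernetes.io/master
          effect: NoSchedule
```
A patch like this would be referenced from the function's kustomization.yaml rather than applied via kubectl in the deployment scripts.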
WIP Commit: https://review.opendev.org#/c/762186/
From Chat:
https://github.com/kubernetes-sigs/kustomize/issues/3095
New Issues :
1. Add a function for cert-manager, to give more control over its deployment config (like tolerations), rather than relying on clusterctl to manage it for us: https://github.com/airshipit/airshipctl/issues/408
2. Add patches to all infrastructure components to apply tolerations for master nodes, to allow them to run on master nodes. Remove the kubectl toleration application from our deployment scripts. https://github.com/airshipit/airshipctl/issues/406
3. Add patches to all infrastructure components to apply nodeSelectors for master nodes, to constrain them away from worker nodes. https://github.com/airshipit/airshipctl/issues/407
4. New issue to refactor functions to follow the pattern below.
a. If a function uses no copied upstream base, leave out `[function]/upstream/`
b. If a function is *only* an upstream base, put it under `[function]/upstream/` and have a thin passthrough `[function]/kustomization.yaml`
```plantuml
@startsalt
{
{T
/
+ manifests/
++ function/
+++ capi/
++++ vx.y.z/
+++++ upstream/ | Put the upstream here
+++++ replacements/
+++++ patches1..n.yaml | Airship specific patches
+++++ kustomization.yaml | Apply our patches on top of upstream
}
}
@endsalt
```
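As a sketch of what the `kustomization.yaml` in that layout might contain (file names as in the diagram; this is illustrative rather than a finalized convention):
```yaml
# manifests/function/capi/vx.y.z/kustomization.yaml (illustrative):
# pull in the copied upstream base and apply Airship-specific patches on top
resources:
  - upstream
patchesStrategicMerge:
  - patches1.yaml
```
For case (b) above, the thin passthrough would contain only the `resources: [upstream]` entry.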
### What is airshipctl cluster status ?
Continue discussion here https://hackmd.io/eiVvMiCwR5KsZSKGWjxlfQ
New Issues :
* Update the executor definition to introduce a status interface: https://github.com/airshipit/airshipctl/issues/409
* Update different executors to implement status interface [EPIC]: https://github.com/airshipit/airshipctl/issues/410
* Add a new Phase status subcommand: https://github.com/airshipit/airshipctl/issues/411
* Add a new PhasePlan status subcommand: https://github.com/airshipit/airshipctl/issues/412
## Tuesday Nov 10, 2020
Please review design proposal for BMH -> Node label synchronization in metal-3: https://github.com/metal3-io/metal3-docs/pull/149
### Alternative for sops+gpg - airshipctl secret ...
* https://github.com/mozilla/sops
* SOPS seems like the best alternative, ..
* SOPS issue https://github.com/mozilla/sops/issues/767
    * Do we wait for this to be solved?
    * Implement a non-GPG client for SOPS: *tree.GenerateDataKeyWithKeyServices([]keyservice.KeyServiceClient{keySvc})
* Utilize the SOPS container from airshipctl.
* Is there a personal vault approach?
* Would it make sense to see if that can be integrated with airshipctl
* Standing a personal vault in a container?
### What is airshipctl cluster status ?
Follow discussion here https://hackmd.io/eiVvMiCwR5KsZSKGWjxlfQ
## Thursday Nov 5, 2020
### Not Assigned, Not Research, Not Ready for Review issues needing Design
https://github.com/airshipit/airshipctl/issues?q=is%3Aopen+is%3Aissue+label%3A%22design+needed%22+no%3Aassignee+-label%3Aresearch+-label%3A%22ready+for+review%22+
### Ability to support multiple phaseplans within the same manifest context. [#385](https://github.com/airshipit/airshipctl/issues/385)
~~Do we simply have a phasePlan entry in metadata?
https://github.com/airshipit/airshipctl/blob/master/manifests/metadata.yaml
```yaml=
phase:
  path: manifests/phases
  docEntryPointPrefix: manifests/site/test-site
```
~~
airshipctl phase list|describe|run
Phase Plans...
***airshipctl plan list|describe|run***
This would behave the same way as the phase commands, but act upon the PhasePlan documents instead.
Need new issues for the airshipctl phase describe and run commands.
What should the "well curated" Phase Plans be?
How many phase plans do we need to define in treasuremap?
How do we optimize phase utilization while reducing duplication impacting CI/CD?
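A minimal sketch of what one such PhasePlan document could look like; the field names and phase names below are assumptions for illustration, not a settled schema:
```yaml
apiVersion: airshipit.org/v1alpha1
kind: PhasePlan
metadata:
  name: deploy-gating            # illustrative plan name
description: "Phase plan used by the deployment gates"
phases:
  - name: initinfra-ephemeral    # illustrative phase references
  - name: controlplane-ephemeral
  - name: workers-target
```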
### Investigate post-beta release approach [#354 ](https://github.com/airshipit/airshipctl/issues/354)
* Release Management Lifecycle
* Issue: create the dummy or entry-point GitHub Actions release.yaml
* see https://github.com/fluxcd/flux2/blob/main/.github/workflows/release.yaml
## Tuesday Oct 27, 2020
### GPG Sops integration issues with gate
* https://review.opendev.org/#/c/758392
* https://review.opendev.org/#/c/758707/
Issues :
* Redesign the implementation of secret decrypt/encrypt
* Fix gating with current design / create mock tests to bypass gpg issues.
* Example patchsets that have failed because of gpg/sops.
* Issue: https://github.com/airshipit/airshipctl/issues/378
### Patchset failing:
* https://review.opendev.org/#/c/759763/
From chat:
* https://github.com/ProtonMail/gopenpgp
* https://console.cloud.google.com/gcr/images/kpt-functions/GLOBAL/sops?gcrImageListsize=30
* https://github.com/GoogleContainerTools/kpt-functions-catalog/blob/master/functions/ts/src/sops.ts
## Thursday Oct 22, 2020
### Host Config Operator Design
Discuss updates needed for Host Config Operator.
Code: https://github.com/airshipit/hostconfig-operator
Issues: https://github.com/airshipit/hostconfig-operator/issues
* Should we leverage [#1](https://github.com/airshipit/hostconfig-operator/issues/1) to encompass the design changes or create a new issue?
### Container execution from phases
Issues:
* https://github.com/airshipit/airshipctl/issues/369
* https://github.com/airshipit/airshipctl/issues/202
Options: [doc](https://docs.google.com/document/d/1qiao8ApYCavndwhVGuSE4588vAJLXyWmvfXQSOuxUJE/edit#heading=h.qdvgn5twc4uw)
The [demo](https://github.com/aodinokov/noctl-airship-poc) of the first approach and its [screencast](https://drive.google.com/file/d/1f4ZRZqce6NkVY1xvOjeaWwApKkUKGBoc/view?usp=sharing)
### Bootstrap Container execution from phases (Sidney)
Discussion about two design proposals for this feature.
Description of designs can be found [here](https://hackmd.io/Ah1CRbxETLCsUMhdncHERw?view)
### Airship PTG Discussion / Reminder
* Next Wed/Thurs, 8-12 central
* Focus will be on Post-v2 scope, but all design/community topics welcome
* Please add anything to discuss to the agenda:
* https://etherpad.opendev.org/p/wallaby-ptg-airship
* Wed 8-9 will be new dev onboarding
* Overall PTG info & agenda (don't forget to register for free)
* http://ptg.openstack.org/ptg.html
## Tuesday Oct 20, 2020
### HWCC Profile Evaluation (Rajat, Ashu, Matt)
HWCC Profiles are evaluated by the HWCC at the time the profiles are applied, but are only re-checked when the profile resource changes, not when the node status changes. I.e., the node resources must be in Ready state prior to applying profiles.
In the CS below, that is accommodated by separating out worker provisioning into separate phases.
Do we want to enhance the HWCC to re-evaluate profiles when node statuses change?
* https://review.opendev.org/#/c/748421/9/tools/deployment/34_deploy_worker_node.sh@37
### Finish Review of Image Builder/Host Config spreadsheet?
https://docs.google.com/spreadsheets/d/1BQRadxOOzvRq8C6j3fe4xQ1f4AjSCfAyWG73U15pIho/edit#gid=0
In reviewing can we determine if the item is already supported by the selected solution or if we need an issue to add support?
### Bootstrap Container/Ephemeral Cluster with Phase Run (Sidney)
* Discuss the design for decoupling airshipctl command from provider's bootstrap container
* How to support command concurrency
* How to improve command usability
## Thursday Oct 15, 2020
### Review of Image Builder/Host Config spreadsheet?
https://docs.google.com/spreadsheets/d/1BQRadxOOzvRq8C6j3fe4xQ1f4AjSCfAyWG73U15pIho/edit#gid=0
In reviewing can we determine if the item is already supported by the selected solution or if we need an issue to add support?
### Discussion about the Labeling of Hosts Day 2
Proposal:
* notion of labeling the BMH/Provider Host and cascading upwards to the appropriate nodes
* for a set of machines (MachineDeployment/MachineSets), update the labels; not available at the moment.
* Currently only supported via redeployment.
* Security issue with kubelet labeling nodes.
* Generic solution : New Generic Labeling Operator
* Running on management cluster
* Can access any workload cluster it manages.
* Has kubeconfig/secret for each cluster.
* Should eventually be a component of CAPI operator
* Would work for multiple providers.
* Where do the labels belong / cascade up from, i.e. what is the source for the operator?
* Should support both:
* BMH's
* MachineDeployment Sets..
* Upstream discussions in CAPI: https://github.com/kubernetes-sigs/cluster-api/issues/493
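For illustration, the labels such an operator would cascade could simply live on the BareMetalHost (host name, labels, and BMC details below are made up):
```yaml
# Illustrative BareMetalHost whose metadata.labels would be cascaded to the Node
apiVersion: metal3.io/v1alpha1
kind: BareMetalHost
metadata:
  name: node01
  labels:
    rack: r07
    hardware-profile: high-memory
spec:
  online: true
  bootMACAddress: "00:1a:2b:3c:4d:5e"
  bmc:
    address: redfish+https://10.23.25.1:8000/redfish/v1/Systems/node01
    credentialsName: node01-bmc-secret
```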
### Pros & Cons of go<->shell
For the container bootstrap work discussed last week, there are pros & cons around using go, CLI tools, or both to do things like:
* parsing YAML input (go has an advantage)
* program logic & maintainability (go has an advantage)
* interfacing with public clouds (provider CLI tools have an advantage)
The patchsets are currently set up so that a go program is the entrypoint to the containers, and it exec's out to run the commands. Using go libraries for these CLIs may be an option; is it worth it? Other thoughts?
Changes:
* https://review.opendev.org/#/c/752298
* https://review.opendev.org/#/c/748537
### Discussion about Ironic and progress with Redfish
We discussed earlier bridging the current gaps within Ironic for Redfish compliance.
Here's a document in progress
https://hackmd.io/YDLMNIq2SCexddXHbXwosQ
### CAPI related proposal we might need to follow
* [Cluster API Bootstrap Reporting ](https://docs.google.com/document/d/1FVRxo9toKSUmvKIUFFzPFhnFrfdR9s7S6Bl4shovNlg/edit#heading=h.3mwmvwsf4jyi)
* [Management cluster operator](
https://docs.google.com/document/d/1ZsusF5c9pYxseuaKxTpctI5aUDqzl0sdCW4xxDbLm3k/edit#heading=h.wdfs5v5gumb8)
## Tuesday Oct 13, 2020
### Remember PTG Wed Oct 28, 29
* Open Infra PTG Oct 26-30: http://ptg.openstack.org/ptg.html
* Airship agenda (please add topics): https://etherpad.opendev.org/p/wallaby-ptg-airship
* Airship sessions in Cactus room 13-17 UTC (8-12 US Central), Wednesday & Thursday
* First hour of Wednesday session will be New Developer Onboarding
* Don't forget to register (free)! https://october2020ptg.eventbrite.com
### Airshipctl persisting data
**Kubeconfig** for the target cluster. The problem is especially visible during public cloud deployments (capz, capg), but in BMH deployments as well:
- **BMH deployments**: We deploy the target cluster via the ephemeral cluster; where do we store the kubeconfig for the target cluster? You can get a secret from the ephemeral cluster with the target cluster kubeconfig, however once the ephemeral cluster is gone, the kubeconfig is gone with it. Is it a manual job to commit it into the document model, or should we require external storage to save it, and teach airshipctl to use it if needed? In the current scenario we predefine the certificates and keys, and we **DO** know the IP address of the server, but that requires that the user **MUST** generate certs beforehand, which is manual labor.
- **Public clouds and CAPD**: Even when we can predefine the certs and keys, we can't predict the IP address of the API server, which is part of the kubeconfig, so we get it from the parent ephemeral cluster (KIND); but once you tear it down, you don't have the IP, and airshipctl has no means to connect to it. Of course, we do have the ability to supply our own kubeconfig via a flag, and airshipctl will use it and connect. But the kubeconfig is not persisted anywhere: should we simply rely on the user and expect them to commit the kubeconfig to the manifest? Should that be automated, or should this be a general bootstrap phase that ends with the kubeconfig being committed to persistent storage automatically so it can be reused later?
discussion at https://hackmd.io/qC7PZYxWSaqC3PZVp4d4iw
### Deduplication of network definitions #315
Look at https://review.opendev.org/#/c/749611/ for implementation details
### Airship session in 2020 virtual Open Infrastructure Summit.
https://www.airshipit.org/blog/airship-featured-at-virtual-open-infrastructure-summit-in-october/
## Thursday Oct 8, 2020
### Discuss bootstrap approach for Azure, GCP etc
Defined in https://review.opendev.org/#/c/737864
And https://review.opendev.org/#/c/748537/
Questions:
* airshipctl and public clouds: priority (do we need this right now?) and level of support. If it's needed ASAP, can we have it properly architected first to comply with our current architecture? (see some big design concerns below)
* Why do we need a separate sub-command instead of new executor?
Some documentation https://storage.gra.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_420/755398/20/check/openstack-tox-docs/4204487/docs/phases.html
* Why do we need to extend the config module with bootstrap container options(we moved them to document model a while ago)?
* Why do we need a new container interface for bootstrapping a cluster? Can't we use 'Kubernetes in Docker' a.k.a. 'kind' as an ephemeral cluster?
Proposal:
1. Use KIND cluster instead of ephemeral node for cloud providers
2. Implement executor (and phase) for cloud providers
3. Use clusterctl executor as is against KIND cluster (no additional code required)
### Regarding Image Builder, have we considered existing tooling, in particular https://www.packer.io ?
- It can be coupled with things like ansible to build custom images.
- They also support cloud specific image generation. If we ever went down the route of deploying images customized for a particular cloud, say azure or gcp, packer has cloud specific plugins.
### Airshipctl persisting data
**Kubeconfig** for the target cluster. The problem is especially visible during public cloud deployments (capz, capg), but in BMH deployments as well:
- **BMH deployments**: We deploy the target cluster via the ephemeral cluster; where do we store the kubeconfig for the target cluster? You can get a secret from the ephemeral cluster with the target cluster kubeconfig, however once the ephemeral cluster is gone, the kubeconfig is gone with it. Is it a manual job to commit it into the document model, or should we require external storage to save it, and teach airshipctl to use it if needed? In the current scenario we predefine the certificates and keys, and we **DO** know the IP address of the server, but that requires that the user **MUST** generate certs beforehand, which is manual labor.
- **Public clouds and CAPD**: Even when we can predefine the certs and keys, we can't predict the IP address of the API server, which is part of the kubeconfig, so we get it from the parent ephemeral cluster (KIND); but once you tear it down, you don't have the IP, and airshipctl has no means to connect to it. Of course, we do have the ability to supply our own kubeconfig via a flag, and airshipctl will use it and connect. But the kubeconfig is not persisted anywhere: should we simply rely on the user and expect them to commit the kubeconfig to the manifest? Should that be automated, or should this be a general bootstrap phase that ends with the kubeconfig being committed to persistent storage automatically so it can be reused later?
## Tuesday Oct 6, 2020
### Generation & rotation of site-level secrets (Sirisha)
[airshipctl secrets generate command design doc ](https://hackmd.io/oXLb7IMGRtGUBpewo7spUA?view)
* Secret catalog structure - one or more than one?
* Incorporating generated secrets into phase kustomizations
* Generating secrets other than Secrets
* Design doc: https://hackmd.io/oXLb7IMGRtGUBpewo7spUA?view
Secrets generate discussion notes / details: https://hackmd.io/Mx3X6U6lQ8GLEW4ISgbDoA
## Thursday Oct 1, 2020
### [Image Builder Declarative Discussion](https://hackmd.io/6CgeJKqVQJ6vpT2DC5mx6A)
### Generation & rotation of site-level secrets (Sirisha)
Secret catalog structure - one or more than one?
Incorporating generated secrets into phase kustomizations
Generating secrets other than Secrets
### Airshipui - issue grooming & next steps (Andy Schiefelbein)
* Phase render & apply steps (https://github.com/airshipit/airshipui/issues/37) is the most actionable issue as of right now
* Needing use cases for what a user wants to see as a landing page
* Needing use cases for wanted user interactions
* https://github.com/airshipit/airshipctl/issues/359 exposing hosts for use with baremetal
* https://review.opendev.org/#/c/755380/ A first attempt at exposing the info
### HostConfig Operator Discussion Continuation
## Tuesday Sept 29, 2020
*Airship Tuesday Design Meeting-20200929 1305-1*
***Password***: dPEJMDw6
***Recording***: https://attcorp.webex.com/recordingservice/sites/attcorp/recording/playback/24f6248b93164a4b9077c3b2b94285b4
### Using **relative paths as entry points for Phases**
Impact of that: what about empty site-level phases, do they need to be explicitly created?
Idea of an airshipctl phase run --inherited (meaning get the phase from the type).
airshipctl to generate phase documents implicitly from the type associated with the site.
Options that have been discussed in Slack:
```yaml
manifests:
  dummy_manifest:
    primaryRepositoryName: primary
    repositories:
      primary:
        checkout:
          branch: ${AIRSHIP_CONFIG_PRIMARY_REPO_BRANCH}
          force: false
          remoteRef: ""
          tag: ""
        url: ${AIRSHIP_CONFIG_PRIMARY_REPO_URL}
    metadataPath: manifests/metadata.yaml
    targetPath: ${AIRSHIP_CONFIG_MANIFEST_DIRECTORY}
    subPath:
```
```yaml
apiVersion: airshipit.org/v1alpha1
kind: Phase
metadata:
  name: bootstrap
config:
  executorRef:
    apiVersion: airshipit.org/v1alpha1
    kind: ImageConfiguration
    name: isogen
  documentEntryPoint: airshipctl/manifests/site/test-site/ephemeral/bootstrap
```
How do I implicitly figure out the site-specific entry point path:
**Option I**
- Combination of TargetPath + subPath in the manifest document implies the location of the site
- Con of this approach: an implicit expectation on the site author for where the site phase documents live.
Example, in airship config:
``subPath: manifests/site/test-site``
``targetPath: /home/matt/airshipctl``
And then in the Phase resources:
``documentEntryPoint: target/initinfra``
**Option II**
Introduce a new value in the metadata.yaml
Airship Config --> Manifest --> metadataPath --> metadata.yaml has entry for location of phase or site entry point.
Repos are cloned to:
targetPath+<PhaseRepositoryName>
Example, in metadata.yaml:
If metadata.yaml is defined at the type level, this might look like :
``documentEntryPointRoot: manifests/type/mytype/myphases``
If metadata.yaml is defined at the site level, then:
``documentEntryPointRoot: manifests/site/test-site``
Code will use TargetPath + <PhaseRepositoryName> + documentEntryPointRoot to identify entryPoint.
And then in the Phase resources:
``documentEntryPoint: target/initinfra``
Example, in metadata.yaml:
``phaseBase: ../treasuremap/manifests/type/cruiser``
And then in the Phase resources:
``documentEntryPoint: target/initinfra``
**Option III**
Use PrimaryRepositoryName, which is already part of the manifest, to indicate which repository holds phases and prepend its name to the entrypoint, example: primaryRepositoryName: primary
example result:
``primaryRepositoryName: primary``
``DocumentEntryPoint: manifests/site/test-site``
URL: /root/airshipctl
dirName = airshipctl
``kustomizeRoot: airshipctl/manifests/site/test-site``
Outcome:
- Change the Airship Config Manifest: replace PrimaryRepositoryName with PhaseRepositoryName
- Introduce a new value in metadata.yaml called documentEntryPointRoot that will identify the kustomize entrypoint for the site where the phases can be found. **Phase run** will use *TargetPath + <PhaseRepositoryName> + documentEntryPointRoot* to identify the kustomization entryPoint.
- Need to move Phases and Phase Plans under a ***type*** in the airshipctl manifests.
- The assumption is that metadata.yaml is usually defined at the type level and comes as part of that repo. But since we can define it in the airship config manifest, it can be customized at the site level when appropriate.
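Putting the outcome together, a hedged sketch of how the pieces could fit; placing documentEntryPointRoot under `phase:` mirrors the existing metadata.yaml structure and is an assumption, as are the names:
```yaml
# metadata.yaml (sketch of the proposed new value)
phase:
  path: manifests/phases
  documentEntryPointRoot: manifests/site/test-site
---
# Phase document (illustrative): documentEntryPoint is now resolved relative to
# targetPath + <PhaseRepositoryName> + documentEntryPointRoot
apiVersion: airshipit.org/v1alpha1
kind: Phase
metadata:
  name: initinfra
config:
  documentEntryPoint: target/initinfra
```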
### Discussion of Firmware config extension and update on configuration moulds - Noor
I talked with Richard about this, and have some things to discuss
* **Timeframe for configuration moulds** is at best 2021
* Finish ironic implementation, still working on specification
* gopher cloud
* Update metal3
How do we close the RedFish gaps in the current Bios/Firmware/Raid implementation:
- Ironic PTG about this RedFish point.