---
tags: Agenda
---

# Design Meetings

[TOC]

## Meeting Links

Can be found here: https://wiki.openstack.org/wiki/Airship#Get_in_Touch

## Administrative

### [Recordings](https://hackmd.io/CvuF8MzmR9KPqyAePnilPQ)

### Old Etherpad

https://etherpad.openstack.org/p/Airship_OpenDesignDiscussions

### Design Needed - Issues List

| Priority | Issues List |
| - | - |
| Critical | https://github.com/airshipit/airshipctl/issues?q=is%3Aopen+is%3Aissue+milestone%3Av2.0++label%3Apriority%2Fcritical+label%3A%22design+needed%22+ |
| Medium | https://github.com/airshipit/airshipctl/issues?q=is%3Aopen+is%3Aissue+milestone%3Av2.0++label%3Apriority%2Fmedium+label%3A%22design+needed%22+ |
| Low | https://github.com/airshipit/airshipctl/issues?q=is%3Aopen+is%3Aissue+milestone%3Av2.0++label%3Apriority%2Flow+label%3A%22design+needed%22+ |

## Thursday Apr 15, 2021

### Airship 2.0 Troubleshooting Guide Continued - Andrew K

* WIP patchset: https://review.opendev.org/c/airship/docs/+/786062
* What do we want to target for v2.1? This will be evolutionary.
* Collaborators needed
* Related question: do we log which phase is being executed when it starts/ends?

### FQDN resolution - Andrii O.

* Some software requires FQDN resolution from the hostname. /etc/hosts and the systemd-resolved config need to be populated accordingly during deployment.
* Arvinder: I see three possible approaches to doing this:
  1. host-config operator: this is best when updates need to happen dynamically on existing Kubernetes nodes.
  2. pre/post kubeadm hooks in KubeadmConfigSpec: you can write custom scripts that, for example, patch /etc/hosts accordingly. For example, you can use KubeadmConfigSpec.Files to create one or more files in, say, the /etc/hosts.d directory and then during PreKubeadmCommand run something along the lines of `cat /etc/hosts.d/*.conf > /etc/hosts`.
  3. Metal3DataTemplate: the KubeadmConfig approach above applies the same configuration across all nodes in a KCP or MD. Metal3DataTemplate provides more flexibility by allowing some of the configuration to be node specific. https://github.com/metal3-io/metal3-docs/blob/master/design/metadata-handling.md

:::warning
We are already using #3 for networking etc.; it's unclear to me that this helps with the original question (FQDN/systemd changes etc.).
:::

## Tuesday Apr 13, 2021

### Hostconfig-operator integration (Sreejith P)

While integrating HCO with Treasuremap, we found that we need to annotate the secret onto nodes, and we also need to have a specific label. What would be the best way to annotate nodes? Also, would it be best to add a mechanism to override the default labels in HCO via manifests?

- Metal3 feature for label cascading: https://github.com/airshipit/airshipctl/issues/377

### Replacing into multiple targets (Reddy / Matt)

There are use cases where we need to ReplacementTransform the same source data into multiple target paths -- e.g. replacing an IP address into many network policy rules. It would be helpful for the RT to support this natively. Some options:

1. Add a `targets` list as an alternative to `target`.
   * Let's open an ISSUE to support this. Improve the transformer to support Target as a slice: https://github.com/airshipit/airshipctl/blob/7998615a7b5847c367c29874641c8422157ebb52/pkg/document/plugin/replacement/transformer.go#L77
2. Allow for global substring replacements across a document, in addition to the path-based substring replacement we have.
   * ISSUE: open this for a future priority -- specify a pattern that does not have a specific target.
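As a strawman for option 1, the `targets` form might look like the following (hypothetical -- today's transformer only accepts a single `target`, and the object names and field paths here are illustrative):

```yaml
apiVersion: airshipit.org/v1alpha1
kind: ReplacementTransformer
metadata:
  name: ip-into-network-policies
replacements:
  - source:
      objref: {kind: VariableCatalogue, name: networking}
      fieldref: spec.someServiceIP          # illustrative source path
    targets:                                # proposed: a list where `target` sits today
      - objref: {kind: NetworkPolicy, name: policy-a}
        fieldrefs: ["spec.egress[0].to[0].ipBlock.cidr"]
      - objref: {kind: NetworkPolicy, name: policy-b}
        fieldrefs: ["spec.egress[0].to[0].ipBlock.cidr"]
```

A single-element `targets` list would stay backward compatible with the existing single-`target` behavior.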
### PTG later this month

* Thurs, Apr 22 1300 UTC (8:00 AM CDT) - 1700 UTC (12:00 PM CDT)
* Fri, Apr 23 1300 UTC (8:00 AM CDT) - 1700 UTC (12:00 PM CDT)
* Open agenda: https://etherpad.opendev.org/p/xena-ptg-airship
* Registration (free): https://april2021-ptg.eventbrite.com

### Need to discuss document pull command behavior (Kozhukalov)

Related issues:

* https://github.com/airshipit/airshipctl/issues/416
* https://github.com/airshipit/airshipctl/issues/417

Patch is ready for review (probably needs rebasing):

* https://review.opendev.org/c/airship/airshipctl/+/767571

The patch implements two things:

* The default target path is ${HOME}/.airship/<manifest_name> (i.e. ${HOME}/.airship/default).
* If the target path is not defined in the config for a manifest, then the current directory is used to pull the manifests repo.

## Thursday Apr 8, 2021

### Discuss the document-validation solution for airshipctl (Ruslan A.)

https://hackmd.io/t2mxDiB3TdGXI8B6gDtA-Q

### Discuss "dex-aio" Implementation Short-/Long-term (Sidney S.)

Approach to discuss described in https://hackmd.io/bdPFHBBSQy-IrpPe1U9itg

### Align on approach to troubleshooting guide (Andrew K)

Use the lifecycle states as a high-level framework:

- Initialize State
- Prepare State
- Physical Validation
- Bootstrap State
- Target Cluster Lifecycle
- Target Workload Lifecycle
- Sub-clusters (separate state or combine with Target?)
- Day 2 Operations

Proposed approach would be to ~~list the phases/steps within each higher-level lifecycle state & then~~ reference the relevant troubleshooting areas listed below within the lifecycle states. Generally speaking, here's what you need to look at within a phase (based on the executor, do x, y, z).

Troubleshooting areas:

- Manifests:
  - Documents
  - Kustomize
  - Kustomize KRM plugins
  - Our troubleshooting for Replacement, Sops
- Running phases: how to debug a failed phase, where to start, which logs to read
  > Focus from the phase perspective.
  - Identify Phase
  - Understand Phase.yaml
  - Identify Executor
  - Understand Executor.yaml
  - Given the executor type, different guidance:
    - is it generic? which one: sops, gpg, kubeval, image builder, etc.
    - is it k8s?
    - is it clusterctl?
- Cluster API & Kubernetes deployment: grouped together, as the k8s deployment is done by Cluster API
  - Proxy settings
- Networking: is this too broad/complex/specific to individual use cases?
- Helm Charts: Helm Operator & Helm Chart Collator debugging
- Image Builder: base image generation debugging, ISO/QCOW generation & application debugging
- Host Config Operator (may be a future topic)
- Sub-clusters?
- Services/Applications, i.e. CEPH, DEX, LMA stuff, ... point to their documentation

Assuming our other documentation will provide details on what each phase does: would it make sense to incorporate troubleshooting into the deployment guide so you have a one-stop shop, or keep it separate so it's not cluttering up the deployment guide?

We created an issue for this quite a while back: [#328](https://github.com/airshipit/airshipctl/issues/328). It references this TM v1 debug report script as a potential starting point. Is this still valid? https://github.com/airshipit/treasuremap/blob/master/tools/gate/debug-report.sh

Next steps:

- Create a WIP patchset with the documentation framework
- Identify SMEs for each troubleshooting area who can help contribute
- Try to get a basic framework & some initial content by the EOM for v2.1 & continue to build on it

## Tuesday Apr 6, 2021

### Discuss the document-validation solution for airshipctl (Ruslan A.)
Review the following commits: https://review.opendev.org/q/topic:%22add-validation-phases%22+(status:open)

### Discuss "dex-aio" certificate generated by Cert-Manager (Sidney S.)

Approach to discuss described in https://hackmd.io/bdPFHBBSQy-IrpPe1U9itg

## Thursday Apr 1, 2021

### Discuss the rook-ceph cluster implementation (Vladimir S.)

* Review the initial commit: https://review.opendev.org/c/airship/treasuremap/+/784184
* Discuss the rook-ceph components which should be deployed by default
* Discuss the further downstream/WHARF work
  - Set failure domain to host by default

### Place for scripts such as waiting that are currently in tools/deployment (KKalynovskyi)

We have a new pattern of waiting and adding new scripts to gate/deployments: https://review.opendev.org/c/airship/airshipctl/+/782520. As an example, we placed a script here, but right now it's test-site specific; we need a place that can be shared between every site: https://github.com/airshipit/airshipctl/tree/master/manifests/site/test-site/phases/helpers

## Tuesday Mar 30, 2021

### Exploratory - Discuss Load Balancer (Sidney S)

Propose to create sub-issues for https://github.com/airshipit/treasuremap/issues/84 for each operator, with one person working on each sub-issue. Of course, the team needs to coordinate the common effort, e.g., a common design. Also, have a more detailed description of how this load balancer is going to be used:

* Is this load balancer going to be used when a service of type "LoadBalancer" is deployed?
* Is a GCP load balancer used only with a GCP target cluster, or with any other provider's cluster, e.g., bare metal?

Some additional questions in terms of design:

* Is it expected to add a new airshipctl phase for it?
* Design using a Docker container, or any other proposal?

CAPI working group on the LoadBalancer type:

* https://docs.google.com/document/d/1wJrtd3hgVrUnZsdHDXQLXmZE3cbXVB5KChqmNusBGpE/edit

### Discuss Network Catalogue Lists (Jess E)

Currently the network catalogue has a list of networks and a list of vlans. It is somewhat tricky to modify these, as it requires accessing them by index. It would be preferable to access them by id instead. Either we should use dictionaries or have some sort of lookup function to more easily access these items (a dictionary-keyed sketch follows the ViNO topic below).

Example network catalogue:

```
networks:
  - id: network_a
    type: ipv4
    link: bond0.123
    # ip_address:
    netmask: 255.255.255.128
  - id: network_b
    type: ipv4
    link: bond0.456
    # ip_address: <from host-catalogue>
    netmask: 255.255.255.128
```

Example using patchesjson6902:

```
# remove network_a
- op: remove
  path: "whatever/networks/0"
# modify network_b now at index 0
- op: replace
  path: "whatever/networks/0/link"
  value: bond0.123
```

### ViNO - VM Infrastructure Bridge

* Discuss the end-to-end flow for how the VM infra bridge is created and used.
* Should it be created on all bare metal hosts prior to applying the ViNO CR?
* Currently the ViNO CR specifies the bridge interface name via the vmBridge entry. The node labeler uses this value to label nodes with an IP address. How should vino-builder use this to create the libvirt network?
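Following up on the Network Catalogue Lists topic above, a minimal sketch of the dictionary-keyed alternative (hypothetical schema; today's catalogue is list-based):

```yaml
networks:
  network_a:            # keyed by id rather than by list position
    type: ipv4
    link: bond0.123
    netmask: 255.255.255.128
  network_b:
    type: ipv4
    link: bond0.456
    netmask: 255.255.255.128
```

With the id in the path, a patch such as `path: "whatever/networks/network_b/link"` keeps working no matter how many networks are added or removed, since it no longer depends on positional indices.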
## Thursday Mar 25, 2021

### CRI & CGroup driver (Vladimir Sigunov / Craig Anderson)

Need to discuss the approach for [Airshipctl #456](https://github.com/airshipit/airshipctl/issues/456).

TODO: open an issue in images regarding this.

### Airshipctl Usability Concerns

Why do we get SOPS errors when doing a phase list?

```
Group 0: FAILED
  FBC7B9E2A4F9289AC0C1D4843D16CEE4A27381B4: FAILED
    - | could not decrypt data key with PGP key:
      | golang.org/x/crypto/openpgp error: Could not load secring:
      | open /tmp/secring.gpg: no such file or directory; GPG binary
      | error: exit status 2
```

I understand we can follow the instructions from Treasuremap, but that makes using airshipctl a bit of a mess, given that the flow a user might follow is simply:

* clone airshipctl
* build airshipctl
* install airshipctl
* `airshipctl config init`
  * **ISSUE (v2.1)**: better integration between airship config and sops expectations for keys. Needs more discussion.
* `airshipctl config get-manifest`
* Update the config manifest for Target Path to match my local filesystem
  * **ISSUE (v2.1)** with using `airshipctl config set-manifest <NAME> target-path=...`: `airshipctl config get-manifest` DOES NOT return the manifest name.
* `airshipctl document pull`
* `airshipctl phase list` -- BOMBED!! &#^$%$&

Other concerns:

* config init should include airshipctl in the manifest, and under the target path, given the default resource relationship. **ISSUE (v2.1)**: fixed this.
* This stuff should only be visible with a debug-like option [this should be fixed already...]:

```
(base) vpn-135-210-4-4:bin rp2723$ airshipctl phase render initinfra-target
gpg: keybox '/tmp/pubring.kbx' created
gpg: /tmp/trustdb.gpg: trustdb created
gpg: key 3D16CEE4A27381B4: public key "SOPS Functional Tests Key 1 (https://github.com/mozilla/sops/) <secops@mozilla.com>" imported
gpg: key 3D16CEE4A27381B4: secret key imported
gpg: key D8720D957C3D3074: public key "SOPS Functional Tests Key 2 (https://github.com/mozilla/sops/) <secops@mozilla.com>" imported
gpg: key D8720D957C3D3074: secret key imported
gpg: key 3D16CEE4A27381B4: "SOPS Functional Tests Key 1 (https://github.com/mozilla/sops/) <secops@mozilla.com>" not changed
gpg: key D8720D957C3D3074: "SOPS Functional Tests Key 2 (https://github.com/mozilla/sops/) <secops@mozilla.com>" not changed
gpg: key 19F9B5DAEA91FF86: public key "SOPS Functional Tests Key 3 (https://github.com/mozilla/sops/) <secops@mozilla.com>" imported
gpg: Total number processed: 5
gpg:               imported: 3
gpg:              unchanged: 2
gpg:       secret keys read: 2
gpg:   secret keys imported: 2
gpg: keybox '/tmp/pubring.kbx' created
gpg: /tmp/trustdb.gpg: trustdb created
```

### Discuss - Implementation of Rook-Ceph Deployment (Vladimir S)

Review the rook-ceph cluster deployment (reviews pending): https://review.opendev.org/c/airship/charts/+/780590

Take a peek at this: https://review.opendev.org/c/airship/treasuremap/+/783051

* Are we going to use a helm chart to deploy rook CRs, or utilize raw CR documents for now?

### Plan for this

> "On June 30 2021, Quay.io will move to Red Hat Single Sign-On Services. If you haven't done so already, please create a PERSONAL (not corporate) Red Hat account and attach it to your Quay.io account, following these instructions. If you create a corporate account you WILL lose access to your Quay personal namespace"

**ISSUE**: Make sure this doesn't impact airship repo publishing.

### About phase list

* Phase list should include only the appropriate phases for the type.
* The problem might be the type definitions...

ISSUE: How do we make phase list more useful? It's essentially a catalogue list
of phases the document set knows about. This has no context for a user.

* This should be an issue or behavior from the Plan command
* Order/Relationship/Dependency/etc.

```
NAMESPACE   RESOURCE
            Phase/clusterctl-init-ephemeral
            Phase/clusterctl-init-target
            Phase/clusterctl-move
            Phase/controlplane-ephemeral
            Phase/controlplane-target
            Phase/ephemeral-az-cleanup
            Phase/ephemeral-az-genesis
            Phase/ephemeral-gcp-cleanup
            Phase/ephemeral-gcp-genesis
            Phase/ephemeral-os-cleanup
            Phase/ephemeral-os-genesis
            Phase/initinfra-ephemeral
            Phase/initinfra-networking-ephemeral
            Phase/initinfra-networking-target
            Phase/initinfra-target
            Phase/iso-build-image
            Phase/iso-cloud-init-data
            Phase/lma-configs
            Phase/lma-infra
            Phase/lma-stack
            Phase/remotedirect-ephemeral
            Phase/secret-generate
            Phase/secret-reencrypt
            Phase/secret-show
            Phase/workers-classification
            Phase/workers-target
            Phase/workload-target
```

## Tuesday Mar 23, 2021

### Confirm image tagging approach + some additional release tagging questions (Andrew K) (Continued)

When discussing [Images #3](https://github.com/airshipit/images/issues/3), Sai pointed out that currently the krm functions are also getting tagged as latest vs. a specific version. Sean has been working on the tagging/release approach. There are two issues out there currently to address:

* [#419](https://github.com/airshipit/airshipctl/issues/419) - Covers versioning the templater & replacement-transformer.
* [#354](https://github.com/airshipit/airshipctl/issues/354) - Overall release tagging approach
* Sops looks like it's already version tagged.

A couple of questions:

1) Do we need to add cloud-init to #419, or create a new issue to version tag cloud-init?
   * Added to release automation here: https://review.opendev.org/c/airship/airshipctl/+/780875
   * Added to version pinning here: https://review.opendev.org/c/airship/airshipctl/+/767179
2) Do we do anything with the Makefiles for the krm functions in https://github.com/airshipit/airshipctl/tree/master/krm-functions? Each Makefile for cloud-init, replacement-transformer & templater points to the "latest" DOCKER_IMAGE_TAG: https://github.com/airshipit/airshipctl/blob/master/krm-functions/templater/Makefile#L6-L13
   * That is the default tag; it can be overridden, which we do for both git SHA (Zuul post job) and semver tags (GitHub action).
3) Besides answering 1) & 2) above, is there anything else we need to address? Sean has some additional questions in #354 that need review: https://github.com/airshipit/airshipctl/issues/354#issuecomment-801436275

### Discuss Treasuremap Branching Strategy (Matt F)

Related to [#50](https://github.com/airshipit/treasuremap/issues/50), discuss the path forward for branch renaming in Treasuremap. Previous notes recommended moving v2 -> master, renaming master to perhaps v1.9, and leaving v1.8 as is.

* Is this still the consensus?
* When will the change take place? (and who will do it?)
* Any other related items that need to be communicated on the mailing list before the change occurs?

## Thursday Mar 18, 2021

### CONTINUED: Static schema validation of airshipctl documents (Ruslan Aliev)

This is how we do validation today: https://github.com/airshipit/airshipctl/blob/master/tools/document/validate_site_docs.sh

The topic is related to issue [#19](https://github.com/airshipit/airshipctl/issues/19).

Proposed solution: a KRM function which can validate input documents via Kubeval. Validation of custom resources is possible by extracting openAPIV3Schema from the needed CRDs and converting them to JSON.
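For the extraction step, a rough sketch of pulling the schema out of an `apiextensions.k8s.io/v1` CRD with PyYAML (file names are illustrative; kubeval additionally expects its schemas laid out under a specific naming convention, which is not shown here):

```python
import json
import yaml  # PyYAML

# Read a CRD manifest and dump one JSON schema per served version.
with open("cluster.x-k8s.io_clusters.yaml") as src:  # illustrative path
    crd = yaml.safe_load(src)

for version in crd["spec"]["versions"]:
    schema = version["schema"]["openAPIV3Schema"]
    out = "{}-{}.json".format(crd["spec"]["names"]["singular"], version["name"])
    with open(out, "w") as dst:
        json.dump(schema, dst, indent=2)
```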
Using this approach, each phase (which has a documentEntryPoint) could be validated by creating a new phase with the same documentEntryPoint, but pointing to a new Executor (a KRM function based on GenericContainer).

:::warning
*airshipctl phase validate [NAME]* would:

- first validate the phase documents themselves (Phase, Execution)
- then run generic-container kubeval for the phase

A document like PhaseValidationConfig... a config here that specifies the CRDs, such as https://review.opendev.org/c/airship/airshipctl/+/780681/3/manifests/site/test-site/phases/phase-patch.yaml#27:

```
crdList:
  - function/airshipctl-schemas/versions-catalogue.yaml
  - function/airshipctl-schemas/network-catalogue.yaml
  - https://raw.githubusercontent.com/tigera/operator/release-v1.13/config/crd/bases/operator.tigera.io_installations.yaml
  - function/capi/v0.3.7/crd/bases/cluster.x-k8s.io_clusters.yaml
  - function/cacpk/v0.3.7/crd/bases/controlplane.cluster.x-k8s.io_kubeadmcontrolplanes.yaml
  - function/capm3/v0.4.0/crd/bases/infrastructure.cluster.x-k8s.io_metal3clusters.yaml
  - function/capm3/v0.4.0/crd/bases/infrastructure.cluster.x-k8s.io_metal3machinetemplates.yaml
  - global/crd/baremetal-operator/metal3.io_baremetalhosts_crd.yaml
  - function/cabpk/v0.3.7/crd/bases/bootstrap.cluster.x-k8s.io_kubeadmconfigtemplates.yaml
  - function/capi/v0.3.7/crd/bases/cluster.x-k8s.io_machinedeployments.yaml
  - function/hwcc/crd/bases/metal3.io_hardwareclassifications.yaml
  - function/flux/helm-controller/upstream/crd/bases/helm.toolkit.fluxcd.io_helmreleases.yaml
  - function/flux/source-controller/upstream/crd/bases/source.toolkit.fluxcd.io_helmrepositories.yaml
```
:::

Finally, we can define a PhasePlan containing validation phases; by launching it we can validate all documents related to a particular site.

PS to discuss: https://review.opendev.org/c/airship/airshipctl/+/780681/

Proposed path forward:

1. A `phaseplan validate` command that walks the phases in a plan.
2. `phaseplan validate` would discover most schemas from the document set.
3. A superset of "extra" CRD pointers (not in the doc set) is shared across the validated plan.
4. Circle back after `phaseplan validate` and figure out how we want to do `phase validate`, which is actually trickier.

### EndpointCatalogue needed for Dex? (Matt Fuller)

This is related to issue [#317](https://github.com/airshipit/airshipctl/issues/317). Previous investigation didn't indicate a strong need for an EndpointCatalogue, as replacements were already covered by the existing Network and VersionsCatalogues. However, a discussion last week about the Dex function in Treasuremap seemed to indicate an EndpointCatalogue may be needed after all.

Is this slated for future work, or should this be part of the 2.0 release?

Path forward:

1. Close #317
2. Create a new issue in treasuremap for airship/charts format endpoints

### Confirm image tagging approach + some additional release tagging questions (Andrew K)

When discussing [Images #3](https://github.com/airshipit/images/issues/3), Sai pointed out that currently the krm functions are also getting tagged as latest vs. a specific version. Sean has been working on the tagging/release approach. There are two issues out there currently to address:

* [#419](https://github.com/airshipit/airshipctl/issues/419) - Covers versioning the templater & replacement-transformer.
* [#354](https://github.com/airshipit/airshipctl/issues/354) - Overall release tagging approach
* Sops looks like it's already version tagged.
A couple of questions:

1) Do we need to add cloud-init to #419, or create a new issue to version tag cloud-init?
   * #419 is about pinning to versions; we could add cloud-init to it, but first we need to be publishing git SHA and/or semver versions to pin to. It looks like we are attempting to push to a non-existent repo for cloud-init, which is causing the other images in that repo to not be published as well: https://zuul.opendev.org/t/openstack/builds?job_name=airship-airshipctl-publish-image
2) Do we do anything with the Makefiles for the krm functions in https://github.com/airshipit/airshipctl/tree/master/krm-functions? Each Makefile for cloud-init, replacement-transformer & templater points to the "latest" DOCKER_IMAGE_TAG: https://github.com/airshipit/airshipctl/blob/master/krm-functions/templater/Makefile#L6-L13
   * That is the default tag; it can be overridden. We are pushing git SHA and semver tags for replacement-transformer and templater; not sure about cloud-init.
3) Besides answering 1) & 2) above, is there anything else we need to address? Sean has some additional questions in #354 that need review.

TODOs:

1. The cloud-init repo exists, and we can add it to #419 alongside our other krm fns.

## Tuesday Mar 16, 2021

### airshipctl secret decrypt not working (Sreejith)

airshipctl secret decrypt gives the message that it is not implemented, as shown below:

```
airshipctl secret decrypt --src secrets.yaml --dst decrypted_secrets.yaml
not implemented: secret encryption/decryption
```

Do we have plans to implement this for the 2.0 release?

Comment from Alexey O.: what is the use case? JFYI, we have https://review.opendev.org/c/airship/airshipctl/+/780670 that allows you to do `airshipctl phase run secret-show` to see all generated secrets decrypted on the screen. Does it cover your use case?

ISSUE: Discuss/design a better user experience for understanding how to decrypt a particular secret artifact. Post v2. **Created [#489](https://github.com/airshipit/airshipctl/issues/489)**

### Static schema validation of airshipctl documents (Ruslan Aliev)

[Continued here](https://hackmd.io/QiEksO4fRk-MnBjwBFaAkQ#CONTINUED-Static-schema-validation-of-airshipctl-documents-Ruslan-Aliev)

## Tuesday Mar 9, 2021

Topics left over from last week.

### Update on Host Config Operator periodic checks (Andrew/Sirisha)

Discussion notes here: https://hackmd.io/QCSjN1NWQ1qLdPcX8C7_sg

### Host Config Operator to provide day 2 storage clean up capabilities (Sreejith/Vladimir)

https://github.com/airshipit/hostconfig-operator/issues/3

We have Porthole for performing some of the day 2 operations. Is hostconfig-operator supposed to take on all of Porthole's functionality, or is it supposed to do activities like monitoring the Ceph cluster and doing rebalancing, clearing unused pools, etc.?

### Keepalived VRRP for Service VIPs (Manoj Alva)

Clarification around the following items:

- Placement of the KubeadmControlPlane manifest under the function/k8scontrol directory in Treasuremap
- keepalived installation and service start will be done using a shell script passed as preKubeadmCommands, instead of baking it into the ISO image.

#### Note:

- ISSUE: Extend the VRRP service implementation to comply with security considerations via appropriate policies (??), given the ingress is now on the OAM IP
- ISSUE: Test performance of the VRRP service implementation given the ingress is now on the OAM IP

## Thursday Mar 4, 2021

### What are our targeted phases? (Matt)

This has been a bit of an evolving target - what should people do today?
* How will we batch up Infra, Operators, Operator Config, LMA Workload...?
* We have some extra phase(s) today out of necessity that we plan to remove.
* For the discussion: https://hackmd.io/7vQOSYADSVessB9zZ0P-Kw
* Move the discussion notes around changing metadata and enhancing phases to the hackmd note above. Both issues below refer to those topics.

**ISSUE**: Future optimization to try to allow for unique reuse of phases. A possible solution is to change metadata to include an attribute to indicate that paths should include the cluster name.

**ISSUE**: Future decision on how to better integrate extended wait logic into the executor, or airshipctl in general, which will allow us to simplify the number of phases as well. Might mean adding a concept of steps to a Phase?

## Tuesday Mar 2, 2021

### Keepalived VRRP for Service VIPs (Anand/Andrew K)

Discuss the keepalived VRRP design for K8s Service VIPs of the undercloud cluster.

* Upstream this should be the same as we do downstream (for undercloud k8s api/services)

Discuss the HA for the Ironic "PROVISIONING_IP":

- Should airshipctl handle the implementation via an Ingress... Operator?
- Maybe develop an operator similar to SIP that creates LBs for tenants.
- Or just use ingress charts to keep the implementation simple.

https://github.com/airshipit/treasuremap/issues/94

Path forward:

* Anand to follow up w/ Sai on how to handle the ironic ingress VIP
* Follow the RDM9 approach of configuring keepalived via the KubeadmControlPlane resource, and baking keepalived into the base image
* Add replacement of keepalived variables (VIPs, vlans, etc.) from the Networking Catalogue into the KubeadmControlPlane
* Note: the Kubernetes community best practice for API load balancers is to manage them outside of Kubernetes itself. This is why Airship 2 diverged from the Airship 1 / OSH approach of configuring a containerized keepalived in the Ingress chart.

### ViNO mac address management (Matt)

How should ViNO manage MAC addresses for the VMs it creates?

Assumptions:

* libvirt (via vino-builder) needs to know the MAC addresses
* the metal3 BMH networking secret (via the vino controller) needs to know the MAC addresses
* nothing else needs to know the MAC addresses

Potential approaches:

* Add MAC addresses into the ViNO CR input -- shifts responsibility to the user (seems bad)
* The ViNO controller allocates/calculates MACs (similar to its IPAM), and puts them into the vino-builder config and the BMH Secret
* vino-builder allocates/calculates MACs and passes them back to the vino controller to put in the BMH Secret

Path forward:

* Populate MAC addresses with a dummy value for now
* Look into how robust libvirt MAC generation is, and whether there's any risk of collisions
* Look into how vino-builder could pass MAC addresses back to the vino controller
* As a Plan B, the vino controller could generate/track MAC addresses similarly to what it does for IP addresses

### Treasuremap Issue template

* Appears to have reverted from the recent template changes
* We were going to update it to reflect the type direction

Path forward:

* The change was made to treasuremap sometime in the last two weeks (the .github folder has disappeared). Andrew to look through git history so we can figure out who/why.

## Thursday Feb 25, 2021

### Injecting SSH authorized keys into ViNO VMs (Sean)

What is the upstream solution for getting SSH authorized keys injected into the ViNO VMs, so that we can access them from the SIP jump container?
Does anyone know what functionality exists in CAPI or metal3 for injecting SSH authorized keys (or arbitrary file content) into the hosts? Or does it already do that by default? If not, I see the metal3 BMH allows specifying ironic metadata/userdata; would that be a good approach?

Cluster API bootstrap provider Kubeadm (CABPK) has direct support for adding SSH authorized keys at node creation time: https://github.com/kubernetes-sigs/cluster-api/blob/master/docs/book/src/tasks/kubeadm-bootstrap.md#additional-features

This is the path forward; ViNO will have no role in key injection. New keys will require redeployment of the vBMH as appropriate.

### Catalogues & Replacement as interface to functions (Matt/Sidney)

This patchset proposes an idea for using a catalogue-based approach to configuring a function (I think) -- putting all of the function's tunables in one place, a little like a chart's overrides.yaml. Let's talk through the use case and implementation of this.

* This is a little different than our use cases for catalogues so far
* Is the extra layer of abstraction something we want?
* If so, where do we put the moving pieces (catalogue, transformer config...)?

https://review.opendev.org/c/airship/treasuremap/+/776528

## Tuesday Feb 23, 2021

### Review first pass of Treasure Map function > type mapping

https://docs.google.com/spreadsheets/d/1-sq2j-JzD9Jv2D6FTimzD6eqa45_iRP8BYNlAw2StPI/edit#gid=0

### Discuss Host Config Operator checks & validations

From the 2/18 design call, continue the discussion on how to periodically check & remediate deny list, permissions & limits violations via Host Config Operator.

https://github.com/airshipit/hostconfig-operator/issues/10

### Sequencing CRD establishment & CR delivery (Matt)

We discussed an issue a few weeks ago around the fact that the tigera-operator itself creates its CRDs (as opposed to airshipctl delivering the CRDs along with the operator), which leads to a race condition: we have to wait for the operator to do its thing before moving on to the next phase that delivers CRs. We've hit the same issue with our LMA operators.

If this is going to be a common problem, we may want a generic mechanism to handle it -- something like "wait for CRD [X] to be established before moving on", where [X] is driven by metadata.

I think we talked before about a shorter-term "CRD wait as its own new phase, using generic container" approach, and maybe a longer-term "CRD wait as a wait condition on the existing phase". Either way, the logic could be based on something like this (but containerized): https://review.opendev.org/c/airship/airshipctl/+/769617/12/tools/deployment/26_deploy_metal3_capi_ephemeral_node.sh

(Sean) CRD wait is already supported by kstatus: https://github.com/kubernetes-sigs/cli-utils/blob/535f781a3c1b1d66b06a93f59e7a03c03af81477/pkg/kstatus/status/core.go#L559 -- so the issue is just that the operators are creating their own CRDs rather than kustomize. Do the operators provide status conditions which indicate when their CRDs are established?

* tigera operator (no): https://github.com/tigera/operator/issues/1055
* lma operators (???): ???

GitHub issue updated with this discussion: https://github.com/airshipit/airshipctl/issues/443

## Thursday Feb 18, 2021

### Functions & Treasuremap Types

*If this belongs in Flight Plan then we can defer to that meeting.*

We need to review the manifest functions being developed in Treasuremap to determine whether they should be included in the airship-core or treasuremap types.
We've discussed the approach for identifying them ([Jan 13 Flight Plan Call](https://hackmd.io/93_0K4AAR9izrEpuDMa5Rw#Jan-13-2021)), but haven't done the review yet.

:::warning
* Goal: keep airship-core simple and functional
  * From LMA, only logging
  * Dex
  * HostConfig
  * Helm Operator (*)
  * Helm Chart Collator
  * Ingress (VIP based... works anywhere without BGP expectations)
  * Rook and Ceph:
    * Some basic configuration?
    * Default behaviours of rook, nothing prescriptive
* The rest of the functions will go into the NC type:
  * Airship core functions, plus:
  * LMA complete stack (dashboards, prometheus, etc.)
  * Rook and Ceph, prescriptive configuration for the NC type
* Moving forward, new functions will follow this procedure to determine where they go:
  * How do we tag/identify which functions go with which types? Labels, new issues?
    * Create a label for each type for identification purposes.
    * When creating the issue for the function, include which type(s) the function should be included in.
    * The developer who is working the function creation issue should be responsible for including it in the types that are specified in the issue.
  * We are making the assumption that all other types inherit airship-core, and will also inherit the functions associated with airship-core.
  * When a new type is defined, part of the issue to create the type should include which functions are part of the type.
:::

### Image-builder: blacklist package (Sreejith)

Divingbell supports blacklisting packages. Do we need this in image-builder?

## Tuesday Feb 16, 2021

* [Upgrade docker/containerd](https://github.com/airshipit/hostconfig-operator/issues/2) - hostconfig-operator - Issue #2 - (Sirisha)
  * Design document and detailed analysis of the scenarios executed to upgrade docker/containerd can be found here: https://hackmd.io/1wzoYuNeSzuEp7XvxIOr8Q
  * Should we support downgrading packages?
  * PS to address upgrading docker/containerd.io is here: https://review.opendev.org/c/airship/hostconfig-operator/+/773389
* [Emergency package installation/update](https://github.com/airshipit/hostconfig-operator/issues/4) - hostconfig-operator - Issue #4 - (Sirisha)
  * Design document: https://hackmd.io/z2eKJPPDTAeEh0dYfnCOig

## Thursday Feb 11, 2021

* (Constantine Kalynovskyi) - Kubeconfig phase integration, [related issue](https://github.com/airshipit/airshipctl/issues/460)
  * Open question(s):
    * How will dynamic config work if we're going to delete the ephemeral/bootstrap cluster (need to switch to the filesystem? how to store the kubeconfig? probably it should be encrypted)? 'Dynamic' mode will work really well for tenant clusters, though...
    * For how long will the kubeconfig generated by Cluster API for the target cluster be valid? What if the CA is compromised and we need to rotate it? What if we need a new kubeconfig for any other reason? Are we going to put a new kubeconfig into the cluster?
* (Ahmad Mahmoudi) - Undercloud cluster lifecycle updates and how they impact tenant sub-cluster workloads
  * Open questions:
    * Need to review this flow for day 2 undercloud lifecycle changes and sub-cluster impacts
    * For each undercloud BMH node to be updated, perform the following steps:
      * If the undercloud BMH node is a worker node, perform the following steps (for control plane BMH nodes, skip this step and move to the next step):
        * Identify the impacted tenant sub-cluster vBMH nodes running on the BMH node being updated (node labels?), and drain the impacted sub-cluster vBMH nodes. Once the sub-cluster vBMH nodes are drained, remove the vBMH nodes from the impacted sub-clusters.
          * **Q: What tool drives this, airshipctl phase?** This step is not needed. To start with, we extend the healthcheck timeout to a value that allows for re-deployment of the BMH during upgrade.
      * When the impacted vBMH is deleted, ViNO starts the sub-cluster's `inactive` vBMH, SIP sets the sub-cluster vBMH labels, and CAPI reconciliation joins the vBMH to the sub-cluster.
        * **Q: Check with ViNO, SIP and CAPI: if/how is the `inactive` vBMH started?**
        * **Q: Is the assumption correct that each sub-cluster is deployed with a spare vBMH node labeled `inactive`, to be used for lifecycle updates?**
          * No!! vBMHs are labeled beforehand; the pool of vBMHs labeled by SIP is prepared beforehand.
          * ViNO doesn't do anything here, other than when the host comes back up after re-deployment.
      * Drain the BMH node, wait until all node resources are vacated, delete the BMH node from the undercloud cluster, shut down the BMH node, re-deploy the node, and join the cluster.
        * **Q: Does CAPI resiliency drive this step, or do we need an airshipctl phase for this?** CAPI will do all of this for the identified BMH(s): drain, delete from the cluster, redeploy, and join the cluster.
      * **Open question: for the worker nodes, how can we group the nodes to be redeployed? Serially will take a long time; per rack might bring down too many BMHs. This needs to be tested and assessed.**
* [Upgrade docker/containerd](https://github.com/airshipit/hostconfig-operator/issues/2) - hostconfig-operator - Issue #2 - (Sirisha) - moving it to Feb 16
  * Design document and detailed analysis of the scenarios executed to upgrade docker/containerd can be found here: https://hackmd.io/1wzoYuNeSzuEp7XvxIOr8Q
  * Should we support downgrading packages?
  * PS to address upgrading docker/containerd.io is here: https://review.opendev.org/c/airship/hostconfig-operator/+/773389

---

## Tuesday Feb 9, 2021

* (Manoj Alva) [Apply failsafe North-South network policies via Calico chart #32 (Treasuremap)](https://github.com/airshipit/treasuremap/issues/32)
  * Is there agreement on the default parameters?
    * *At a minimum this would mean disabling Calico's default failsafe parameters by setting FailsafeInboundHostPorts and FailsafeOutboundHostPorts to "none".*
  * What needs to be defined when replacing the default failsafe rules?
    * Does this mean we define a GlobalNetworkPolicy resource as described in the reference sample? (https://docs.projectcalico.org/reference/host-endpoints/connectivity)
    * Failsafe default rules: https://docs.projectcalico.org/reference/host-endpoints/failsafe

### [Upgrade docker/containerd](https://github.com/airshipit/hostconfig-operator/issues/2) - hostconfig-operator - Issue #2 - (Sirisha)

* Detailed analysis of the scenarios executed to upgrade docker/containerd can be found here: https://hackmd.io/1wzoYuNeSzuEp7XvxIOr8Q
* PS to address upgrading docker/containerd.io is here: https://review.opendev.org/c/airship/hostconfig-operator/+/773389
* Once we upgrade docker/containerd, it requires a service restart, which affects the Kubernetes pods and applications running on that node.
* If the upgrade is on the k8s node where the hostconfig-operator pod is running, then there is downtime in the Kubernetes cluster, as etcd leader election can take time.
* After the k8s cluster is up, the hostconfig-operator pod on the same node can become leader, or another replica running on a different node can become leader.
* Once the leader hostconfig-operator pod comes up, it re-executes all the HostConfig CRs in the cluster.
* So all configuration roles in the hostconfig-operator have to first check whether the configuration exists, and only execute if it doesn't -- or there should be no effect even if the configuration re-executes.
* No two CRs should have the same configuration defined for the same node, e.g. sysctl configuration defined in two CRs for the same node. In this case it's not clear which configuration executes first.
* Are these consequences expected, or should we drop the support for upgrading docker/containerd?
* Both upgrade and downgrade can happen with the above PS; looking for pointers on how to restrict it to upgrade only.

### [Emergency package installation/update](https://github.com/airshipit/hostconfig-operator/issues/4) - hostconfig-operator - Issue #4 - (Sirisha)

* Do we need to support package installation/update through apt/yum, or do we have to support other procedures, like:
  * wget
  * executing a script
  * installing a supplied .deb/.rpm package -- how will it be supplied?
  * pip package installation
* Upgrading packages will not be in scope, as it conflicts with [Issue #2](https://github.com/airshipit/hostconfig-operator/issues/2)
  * If the package/binary already exists, we would throw an error, as that would be part of upgrade.
* What do we need...
  * What packages do we support?
    * Based on what Airship 1 supports via MiniMirror -- that is your starting possible target.
    * Derive from that list what you need to support.
    * apt...? for the most part
  * Should look at Divingbell apt management:
    * https://github.com/airshipit/divingbell/blob/master/divingbell/templates/bin/_apt.sh.tpl
  * Do we want to whitelist the packages that we support via these HostConfig operations? In other words, we want to make sure things like containerd or kubelet are not upgraded through this mechanism. A blacklist of packages...
* The work:
  * CR:
    * This is how I deliver the list of packages
    * This is how I inform what I want to do with them
  * Playbook:
    * Role/task for package consumption and applying them
    * What error conditions?

## Thursday Feb 4, 2021

* SIP is going to enable operators to power-cycle virtual machines from its jump-host service.
  1. Should the jump pod power-cycle virtual machines using Redfish or libvirt?
  2. What do we need to do to restrict access to which VMs an operator can cycle? I.e. we can limit which virtual machines they see inside the jump pod, but do we need a more robust approach to locking down access to the other VMs?

:::info
Discussion image: https://go.gliffy.com/go/publish/13446596

* Redfish, because we cannot block libvirt policy-wise
* SIP will create pod network policies to allow ssh and http/s access to VMs and the appropriate services/ports
* SIP will craft an appropriate config file that the DMTF Redfish tooling can use as input, perhaps for the list of valid Redfish targets
  * Is it a single config with all targets -- a cluster VM endpoints document?
    * vm name | fqdn | ip | http url | ssh... |
  * What is the vm name?
    * It's the BMH name, properly correlated.
    * The BMH name is a complex name that includes rack, server, number in rack, vm number, etc.
    * BMH :: libvirt :: ... <- everything is the same name
* A wrapper script can use this to call things...
* ...
:::

## Tuesday Feb 2, 2021

## Thursday Jan 28, 2021

### How to host krm function containers out of private repos?

* We will have many references like the following, across multiple repos:

  ```
  config.kubernetes.io/function: |-
    container:
      image: quay.io/airshipit/replacement-transformer:latest
  ```

* At render time, we may not have access to quay, and need to use e.g.
  artifactory.
* We could serve it out of a local docker cache, configured to pull from artifactory but serve images up with quay tags? (Not sure if this is possible.)
* We could do a Big Sed (s/quay.io/my-artifactory/), either on disk or dynamically as part of `airshipctl phase run`.
* Any other options?
  * Patching the downstream git repos how we want them.
  * Docker registry/caching -- need to understand whether this gives us an option.
  * Future possibility: https://github.com/GoogleContainerTools/kpt/issues/1204

This patchset introduces a solution for the versioning issue: https://review.opendev.org/c/airship/airshipctl/+/767179

New ISSUE for this discussion: https://github.com/airshipit/airshipctl/issues/457

### Secret rotation UX (Matt)

* This script demonstrates how to invoke `airshipctl cluster rotate-sa-token`:
  * https://github.com/airshipit/airshipctl/blob/master/tools/deployment/provider_common/42_rotate_sa_token.sh#L21-L39
* Note how much bash needs to occur first to extract the name of the secret(s) to rotate.
* Can we enhance the command to do that work?

Need to verify whether ASPR/security requirements require SAs to be rotated. Even if we find we do need to, we will create an issue for post-v2 scope...

* Perhaps an operator in the cluster managing these internally: a CR tells it what to manage, and policies like frequency, etc.

### Secret decrypt discussion continued from Tuesday

https://review.opendev.org/c/airship/airshipctl/+/772467

See the flag TOLERATE_DECRYPTION_FAILURES=true in https://review.opendev.org/c/airship/airshipctl/+/772467/2/manifests/site/test-site/target/generator/results/decrypt-secrets/configurable-decryption.yaml#10

https://github.com/GoogleContainerTools/kpt-functions-catalog/pull/153

https://github.com/airshipit/airshipctl/issues/453

## Tuesday Jan 26, 2021

### Walk through the Secrets only using phases approach (Alexey)

* Explain where we are at.
* Can we deprecate the secrets command?
* Any gaps in scope:
  * ISSUE [#453](https://github.com/airshipit/airshipctl/issues/453)
    * Decrypt: phase render will fail if the user doesn't have access to the keys.
    * The behaviour should be that it simply renders encrypted.
    * Having access to the keys implies privileges.
  * ISSUES for documenting *how to do secrets* in Airship
* Patchsets related to this:
  * https://review.opendev.org/q/topic:%22generator_sops_encrypter%22+(status:open%20OR%20status:merged)

### Synchronize Labels between BareMetalHosts and Kubernetes Nodes (Arvinder)

* The design doc is ready to merge. Waiting on lazy consensus by the end of this week: https://github.com/metal3-io/metal3-docs/pull/149
* The PR for the feature is also under review: https://github.com/metal3-io/cluster-api-provider-metal3/pull/152
* Demo

## Thursday Jan 21, 2021

### Some small items

* Should document pull have clean options?
  * To clean the target path
  * or simpler document clean-ups, such as `git clean -f -d .` does.
  * LEAVE IT AS IS. Manage TargetPath manually.
* document pull should validate that the manifest configuration is accurate, between the URL and the expected authentication mechanism
  * ISSUE (fix/enhancement...)
* Bug: calling document pull twice seems to have an issue. [ISSUE]
* config init [ISSUE]
  * Defaults to treasuremap but pointing to master; should it be branch v2? Or is this tied to the release management discussion for the future?
  * Message about TargetPath default...
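For reference, the knobs discussed in the config init item above live in the Manifest section of the airshipctl config (~/.airship/config). An illustrative excerpt (field names and values here are assumptions from the v2-era config layout; verify against `airshipctl config get-manifest` output for the build in use):

```yaml
manifests:
  default:
    repositories:
      primary:
        url: https://opendev.org/airship/treasuremap
        checkout:
          branch: v2.0                             # rather than master, per the question above
    targetPath: /home/user/.airship/default        # where document pull lands
    metadataPath: manifests/site/test-site/metadata.yaml
```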
### Issues needing design

* Implement log collection in 3rd-party gates for deployment success/failure tracking (#449)
  * Two options:
    * Jenkins on the WWT
    * Nexus server up on the same WWT (**direction**)
  * Some capacity calculation to determine policies
  * Will store both successes and failures

### Walk through Baremetal Inventory Approach (Kostiantyn)

https://review.opendev.org/c/airship/airshipctl/+/771083

## Tuesday Jan 19, 2021

Canceled

## Thursday Jan 14, 2021

### Plan run: Parallel vs Sequential phase execution (Dmitry)

Let's make a final decision regarding the phase execution strategy within a phase group. Initially we discussed that phases within a group are executed sequentially and groups are executed in parallel. This may lead to a bit of inconvenience in case we need a dependency between groups (the user has to create a separate plan). For example:

```yaml
---
apiVersion: airshipit.org/v1alpha1
kind: PhasePlan
metadata:
  name: phasePlan
description: "Default phase plan"
phaseGroups:
  - name: group1
    phases:
      - name: initinfra-ephemeral
      - name: initinfra-networking-ephemeral
      - name: clusterctl-init-ephemeral
      - name: controlplane-ephemeral
      - name: initinfra-target
      - name: initinfra-networking-target
      - name: workers-target
      - name: workers-classification
      - name: workload-target
---
apiVersion: airshipit.org/v1alpha1
kind: PhasePlan
metadata:
  name: phasePlan2
description: "Another phase plan"
phaseGroups:
  - name: group-which-depends-on-group1
    phases:
      - name: openstack-deployment
```

The alternative approach is to execute groups sequentially and phases within a group in parallel (i.e. the opposite logic). This eventually leads to another inconvenience: a significant number of groups with a single phase, and once we add wait tasks it becomes worse. For example:

```yaml
---
apiVersion: airshipit.org/v1alpha1
kind: PhasePlan
metadata:
  name: phasePlan
description: "Default phase plan"
phaseGroups:
  - name: group0
    phases:
      - name: initinfra-ephemeral
  - name: group1
    phases:
      - name: wait-initinfra-ephemeral
  - name: group2
    phases:
      - name: initinfra-networking-ephemeral
  - name: group3
    phases:
      - name: clusterctl-init-ephemeral
  - name: group4
    phases:
      - name: controlplane-ephemeral
  - name: group5
    phases:
      - name: initinfra-target
  - name: group6
    phases:
      - name: initinfra-networking-target
  - name: group7
    phases:
      - name: workers-target
  - name: group8
    phases:
      - name: workers-classification
  - name: group9
    phases:
      - name: workload-target
```

Ongoing v2 audit effort: https://docs.google.com/spreadsheets/d/1YnVC_yQr7m-TUDaLz0xGndkoVIlO2emIwoGirbPBz-8/edit#gid=1852982768

## Tuesday Jan 12, 2021

### `kpt pkg sync` proposal (Sean)

See https://github.com/airshipit/airshipctl/issues/430#issuecomment-756872360

We agree we like using the kpt files for driving release management of upstream functions/provenance. We will use Treasuremap CI/CD to drive updates, and tagging as needed to ensure no drifting occurs.

Ongoing v2 audit effort: https://docs.google.com/spreadsheets/d/1YnVC_yQr7m-TUDaLz0xGndkoVIlO2emIwoGirbPBz-8/edit#gid=1852982768

## Thursday Jan 7, 2021

## Tuesday Jan 5, 2021

### Quick revisit of Variable Catalog

Open issues:

* [#363](https://github.com/airshipit/airshipctl/issues/363) - Define CRD/Schema approach
* [#317](https://github.com/airshipit/airshipctl/issues/317) - Define Endpoints catalog

Questions:

* Should we go ahead & break #317 into multiple issues as Matt suggests in the comments?
  * New issue for OpenStack-Helm (see below)
  * Update #317 with a schema example & an item to reconstruct the URL from piece parts
  * If we do, would there be multiple endpoint catalog CRDs or just one?
* Should we break #363 into multiple issues, i.e. one issue per catalog CRD?
* In https://github.com/airshipit/airshipctl/tree/master/manifests/function/airshipctl-base-catalogues we currently have networking, versions & environment variables (assuming optional, as it's part of the template). Confirm endpoints is the only outstanding catalog (didn't see any other issues).

Endpoint format:

```yaml=
apiVersion: airshipit.org/v1alpha1
kind: VariableCatalogue
metadata:
  name: endpoint_generic
  labels:
    airshipit.org/deploy-k8s: "false"
spec:
  <the-endpoint-name>:
    namespace: TBD if we need this here..
    fqdn: name.iam-sw.DOMAIN
    path: /v3
    protocol: "https"
    port: 443
```

:::warning
OpenStack will need its own endpoint catalogue that mimics the existing format 100% to avoid changes to helm-toolkit. Example catalogue: https://raw.githubusercontent.com/airshipit/treasuremap/master/site/seaworthy/software/config/endpoints.yaml
:::

## Thursday Dec 17, 2020

### Continue the airshipctl cluster put kubeconfig? ...

https://hackmd.io/U_8VYjK0Qoe24Us8IMQpcQ

### Work the details for the Image Builder CowImage

Where does the data that delivers the host and hardware configuration live?

### SIP & ViNO interactions

- ViNO adds a `tenant-cluster=x` label to BMHs: does this label mean that a particular BMH can only be scheduled to that subcluster? Does ViNO apply this label to each BMH object it creates?
  - No. ViNO does not add the tenant-cluster or subcluster label; that is done by SIP. ViNO simply adds labels such as rack, server, and flavor to the VM-associated BMH.
- Where should we define the ViNO bridge interfaces? Current ViNO CR example:

```
nodes: # Change this to vBMH or nodes?
  - name: master
    networkInterfaces:
      - name: mobility-gn
        type: sriov-bond
        network: mobility-gn
        mtu: 9100
        options: # these key/value options look like they belong to the HOST node; instead they are defined at the VM level
          pf: [enp29s0f0, enp219s1f1]
          vlan: 100
          bond_mode: 802.3ad
          bond_xmit_hash_policy: layer3+4
          bond_miimon: 100
```

## Tuesday Dec 15, 2020

### Airship MultiProvider Cohesiveness

- Discuss current differences between BM and cloud providers in terms of lifecycle experiences and expectations.
- The goal is that it should be the same... will try to identify the gaps.

A couple of things in progress will help with this goal:

- `airshipctl phaseplan run` will make the selection of phases to run be driven by the PhasePlan manifest itself
  - This will solve for isobuilder
- Standing up a kind cluster for everything other than BM
  - Serves the same purpose as the ephemeral cluster
  - Related to bootstrap containers, but I don't think that solves it
- The ability to ingest dynamically-generated kubeconfigs into the document set will make bare metal and public cloud kubeconfig representations look more similar
- Any provider-specific actions, like extracting the kubeconfig, can be formed into generic container executor phases
  - Same goes for any "custom waiting" that needs to loop
- verify_hwcc_profiles.sh -- a testing-only script
  - Path forward: form into a generic executor container

Thought:

* Cloud providers provide a "get credentials"; we have `airshipctl cluster get kubeconfig <cluster>`.
* A convenient way to get the kubeconfig... however, it retrieves it from the live cluster.

Should or can we integrate with the public cloud notion... can we use this?
After we create the cloud provider cluster, which stores the kubeconfig in the management cluster, we can use the cloud provider to store <data> into the provider... Then we can rely on the cloud persistence infrastructure to hold the kubeconfig structure... The issue is the persistence on the filesystem...

**KNOWN ISSUES or DIVERGENCES**

* kubeconfig management: we are programmatically specifying the kubeconfig for bare metal, vs. cloud providers where the kubeconfig is generated. The discussion was to try to merge the kubeconfig into a single entity. Airshipctl will not be responsible for persisting beyond the filesystem; airshipctl will just make sure that the kubeconfig in manifests, or explicitly stated outside, is updated.
* hwcc: not needed for cloud providers other than metal3, since it's tied to the BMO/Ironic mechanisms.

LONG TERM: Issue - Define a Phase Post Execution notion, still TBD...

## Thursday Dec 10, 2020

### CI/CD Portability Philosophy

* A self-contained CI/CD philosophy has been proposed by an airshipctl PTL who shall remain nameless.
* In other words, put as much of the CI/CD into makefile targets as possible, so that there is as little reliance as possible on the specific CI/CD platform used (e.g., Zuul vs Jenkins).
* Example: https://review.opendev.org/c/airship/airshipctl/+/765830
* Not everyone in the airshipctl community seems aware of this approach, so this agenda item in part is meant to socialize it.
* Arijit: One negative of this approach is losing the debug console: https://zuul.opendev.org/t/openstack/build/1a23ef90c08343e3af30f6b65536fe1c/console -- is this acceptable, or is there another workaround available to make it work?
* Arijit: In the future we are planning to test different CAPI providers. Is the preferred approach new targets in the makefile for different providers, or the same target with an argument for different site names?

GAPS/OUTCOME:

* The future roadmap will take advantage of this move to a makefile-driven approach, i.e. introducing airship-in-a-pod and the introduction of plan run.
* Will introduce an accumulation of logs. Will lose some detail in terms of where the deployment fails.
* This of course helps facilitate a common experience between the developers and the CI systems.
* Potential issue: some capabilities available with Zuul might be impacted, or not available in Jenkins -- i.e., it depends.

### Treasuremap Branching

Currently, we have these Treasuremap branches:

* master (Airship 1, with sloop, skiff, foundry etc.)
* v1.8-prime branch (latest Airship 1, but only the cruiser type functional)
* v2.0 branch (Airship 2)

Ideally, v2 content would be in the main branch for the Airship 2.0 release:

* One simple option: simply rename v2.0 to master, and master to v1.7, and leave v1.8-prime as-is (perhaps rename to v1.8).
* Folks definitely use the v1.8 branch. Does anyone actually use master today?

Let's create an ISSUE for Treasuremap branching so we can discuss this -- a plan for branch realignment, tied to the v2 release/milestone.

### VINO & SIP

FYI: ViNO and SIP have been added as Airship projects. Let's take the opportunity to review the plans for integration into Treasuremap.

Do we need ViNO & SIP integration in TM for Av2.0, or can it defer to v2.1?
## Tuesday Dec 8, 2020

### Understanding Treasuremap
* Its modules and structure
* How Treasuremap is used with airshipctl

### clusterctl rollout
* https://github.com/kubernetes-sigs/cluster-api/issues/3439
* The `clusterctl rollout` command provides users with a convenient and consistent means through which to roll out an update, monitor the status of an update, roll back to a specific update, and view the history of past updates.
* Currently focused on `MachineDeployment`, with support for `KubeadmControlPlane` coming later.
* Long-term goal is to add the command under `kubectl rollout`.

ISSUE: Integrate with ***airshipctl cluster rollout***.

## Thursday Dec 3, 2020

### Secret Generation
* A design incorporating Alexey's KRM Template-based secret gen idea:
  * https://hackmd.io/NW5fySw1QQ2Ex8wMKF9WUg?both
* How to scope the Template(s) themselves?
  * Since it's 100% data driven, a single `function/secretgenerator`.
* How to scope the ReplacementTransformer rules for secret gen input/spec:
  * `VariableCatalogue(s)` -> `secret gen Template.values`
  * This is specific to particular functions, so maybe `function/<name>/secretgeneration`?
  * The encrypted/generated secrets themselves must be scoped to site level.
* How to scope the ReplacementTransformer rules for outputted secrets?
  * `kind: Secret data.password` -> `HelmRelease.values.password` (for example)
  * Perhaps add this into `function/<name>/replacements`
* Possible Airship document lifecycle options based on generation/encryption example
  * https://docs.google.com/document/d/1CUYMGsEQ9ZYez0mSdG3DyyhPiA8_nRXQf-aFP5WufGc/edit#heading=h.oy2ce573d5m

Following discussion of https://github.com/airshipit/airshipctl/issues/419
```yaml
apiVersion: airshipit.org/v1alpha1
kind: ReplacementTransformer
metadata:
  name: ......
  annotations:
    config.kubernetes.io/function: |-
      container:
        image: quay.io/airshipit/replacement-transformer:latest
```
Need a mechanism to specify the version for the plugin image once, and not in every ReplacementTransformer artifact.

## Tuesday Dec 1, 2020

### PXE boot issue with new ironic python agent
The current ironic images have an issue where the ironic-inspector ramdisk is using a different network interface than the one it was PXE booted from -- resulting in not being able to report status back to Ironic, which is listening on the PXE network. This has manifested in multiple bare metal deployments that have multiple NICs.
Sai & Konstantine found a workaround -- downgrading the ipa package in the ironic image.
* Do we need to create an issue w/ Ironic for this
* How can we bake this into / use our custom Ironic image
* Let's use this as an opportunity to align / chart a course w.r.t. the ironic image overall
  * We have an image, but so far it's very light & we're not using it in our deployment manifests: https://github.com/airshipit/images/tree/master/ironic
  * We should bake all scripts into the image rather than overriding them in, e.g. https://github.com/airshipit/airshipctl/tree/master/manifests/function/baremetal-operator/entrypoint
* An "isolinux.bin file is not found" ironic misconfiguration error has been reported - maybe address this at the same time. https://github.com/airshipit/airshipctl/issues/420

### Secret stringData & last-applied-configuration
We are using `stringData` as a human-friendly way to author Secret data, with the understanding that Kubernetes will change it into a b64-enc `data` (which it does).
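For concreteness, the two equivalent authoring forms (hypothetical secret; `czNjcjN0` is base64 of `s3cr3t`):

```yaml
# Human-friendly authoring form:
apiVersion: v1
kind: Secret
metadata:
  name: example-credentials
stringData:
  password: s3cr3t
---
# What the API server actually persists:
apiVersion: v1
kind: Secret
metadata:
  name: example-credentials
data:
  password: czNjcjN0
```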
However, the kubectl library also adds a `kubectl.kubernetes.io/last-applied-configuration` annotation, which unfortunately includes the cleartext `stringData`, which kind of defeats the point.
Is there some trick we can employ to change this behavior, or do we need to switch from `stringData` to `data`? The ReplacementTransformer has b64-enc capabilities that should make this easier to deal with than it was.
Some discussion here: https://github.com/kubernetes/kubernetes/issues/23564#issuecomment-517931384

Let's open an issue to update/enhance/use the ReplacementTransformer, or another encoding transformer plugin, to encode the data from the stringData to the data field prior to applying against Kubernetes.
Issue created: https://github.com/airshipit/airshipctl/issues/424
* from Alexey Odinokov to Everyone: https://github.com/airshipit/airshipctl/blob/master/pkg/api/v1alpha1/replacement_plugin_types.go#L25
* from Matt McEuen to Everyone: https://github.com/airshipit/airshipctl/blob/master/pkg/document/plugin/replacement/transformer.go#L201-L228

### Align phase list-related work
* This one (in progress) is for `airshipctl phase list <planName>` proper:
  * https://github.com/airshipit/airshipctl/issues/358
* This one is for an `airshipctl plan describe <planName>`:
  * https://github.com/airshipit/airshipctl/issues/394
  * Seems to include `phase list` functionality
  * Should we simply output the plan description in `phase list`?
* `airshipctl plan list`:
  * https://github.com/airshipit/airshipctl/issues/385
* Do we have an issue for listing out phase plans?
  * Could do via `phase list` (no `planName`: implicit "all plans")
  * Could do via `plan list`
* Do we need a way to list all phases for all plans available to the document set

[ISSUE] `airshipctl phase list` would:
* Find all documents with kind: Phase.

_________

*airshipctl phase list|get* -- all phases in the document set, tabular form:

    phasenameA | phase fields | clustername | ...
    phasenameB | phase fields | clustername | ...

*airshipctl phase list|get --plan <planname>* -- only phases of the plan <planname>, tabular form:

    phasenameA | phase fields | clustername | ...
    phasenameB | phase fields | clustername | ...

*airshipctl phase list|get phaseA*:

    phasenameA | phase fields | clustername | ...

*airshipctl phase list|get phaseA -o yaml* -- the phase artifacts

*airshipctl plan list|get* -- tabular form:

    planA | fields | clustername | description...
    planB | fields | clustername | description...

*airshipctl plan list|get planA*:

    planA | fields | clustername | ...

*airshipctl plan list|get planA -o yaml*

For the plan description field, we can make sure that the description is shortened for the tabular form; if there are special characters such as CR or LF, we can truncate at that point.

VOTE:

| Command | Votes |
| - | - |
| Single command GET | Sudeep |
| Single command LIST | Matt, Dmitry, Rodolfo, Drew, Srini Muly, Bijaya |
| Multiple commands GET and LIST | Sean |

## Thursday Nov 19, 2020

### Revisit SOPS encryption subcommand
The GPG team pointed us to documentation that the SOPS library we're using for encryption is not intended for external use.
https://lists.gnupg.org/pipermail/gnupg-users/2020-November/064320.html
https://godoc.org/go.mozilla.org/sops/v3
Let's confirm whether we're comfortable using that library, and if we need to pivot to using the sops binary, what the best integration approach is.
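If we do pivot to the binary, one integration shape would mirror the annotation-driven krm-function pattern shown above for the ReplacementTransformer. This is purely a hypothetical sketch: the kind and image below do not exist and are named only for illustration.

```yaml
# Hypothetical encryption plugin document, invoked as a kustomize
# function via the same annotation mechanism as ReplacementTransformer.
apiVersion: airshipit.org/v1alpha1
kind: SopsTransformer            # assumed kind, not an existing API
metadata:
  name: encrypt-site-secrets
  annotations:
    config.kubernetes.io/function: |-
      container:
        image: quay.io/airshipit/sops:latest   # assumed image wrapping the sops binary
```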
**Alternative**: Implement encryption as a krm-function (as we have for Templater and ReplacementTransformer here: https://github.com/airshipit/airshipctl/tree/master/krm-functions) and use the generic container executor to run it as a phase or as a subcommand.

### What is airshipctl cluster status ?
Follow discussion here https://hackmd.io/eiVvMiCwR5KsZSKGWjxlfQ

## Tuesday Nov 17, 2020

### Review list of functions for refactoring analysis
https://hackmd.io/cxcxX6ypQbCNifjJbDJnrg

## Thursday Nov 12, 2020
Follow-up to Matt McEuen's question on removing the taint from the script https://github.com/airshipit/airshipctl/blob/master/tools/deployment/31_deploy_initinfra_target_node.sh#L22-L26

On testing, we found that ironic, helm-controller, source-controller, the CAPI components, cert-manager, etc. were not coming up. For fixing ironic, helm-controller, source-controller, the CAPI components, etc., we can add the toleration in the yaml or patch them with kustomize. But for cert-manager, since it's brought up by clusterctl, we will have to add cert-manager as a function and then deploy it via the init-infra phase.

WIP Commit: https://review.opendev.org/#/c/762186/
From Chat: https://github.com/kubernetes-sigs/kustomize/issues/3095

New Issues:
1. Add a function for cert-manager, to give more control over its deployment config (like tolerations), rather than relying on clusterctl to manage it for us: https://github.com/airshipit/airshipctl/issues/408
2. Add patches to all infrastructure components to apply tolerations for master nodes, to allow them to run on master nodes. Remove the kubectl toleration application from our deployment scripts. https://github.com/airshipit/airshipctl/issues/406
3. Add patches to all infrastructure components to apply nodeSelectors for master nodes, to constrain them away from worker nodes. https://github.com/airshipit/airshipctl/issues/407
4. New issue to refactor functions to follow the pattern below.
   a. If a function uses no copied upstream base, leave out `[function]/upstream/`
   b. If a function is *only* an upstream base, put it under `[function]/upstream/` and have a thin passthrough `[function]/kustomization.yaml`

```plantuml
@startsalt
{
{T
/
+ manifests/
++ function/
+++ capi/
++++ vx.y.z/
+++++ upstream/ | Put the upstream here
+++++ replacements/
+++++ patches1..n.yaml | Airship specific patches
+++++ kustomization.yaml | Apply our patches on top of upstream
}
}
@endsalt
```

### What is airshipctl cluster status ?
Continue discussion here https://hackmd.io/eiVvMiCwR5KsZSKGWjxlfQ

New Issues:
* Update the executor definition to introduce a status interface: https://github.com/airshipit/airshipctl/issues/409
* Update the different executors to implement the status interface [EPIC]: https://github.com/airshipit/airshipctl/issues/410
* Add a new Phase status subcommand: https://github.com/airshipit/airshipctl/issues/411
* Add a new PhasePlan status subcommand: https://github.com/airshipit/airshipctl/issues/412

## Tuesday Nov 10, 2020
Please review the design proposal for BMH -> Node label synchronization in metal3: https://github.com/metal3-io/metal3-docs/pull/149

### Alternative for sops+gpg - airshipctl secret ...
* https://github.com/mozilla/sops
* SOPS seems like the best alternative, ..
* SOPS issue https://github.com/mozilla/sops/issues/767
  * Do we wait for this to be solved?
  * Implement a non-GPG client for SOPS: `tree.GenerateDataKeyWithKeyServices([]keyservice.KeyServiceClient{keySvc})`
  * Utilize the SOPS container from airshipctl.
* Is there a personal vault approach?
  * Would it make sense to see if that can be integrated with airshipctl?
  * Standing up a personal vault in a container?

### What is airshipctl cluster status ?
Follow discussion here https://hackmd.io/eiVvMiCwR5KsZSKGWjxlfQ

## Thursday Nov 5, 2020

### Not Assigned, Not Research, Not Ready for Review issues needing Design
https://github.com/airshipit/airshipctl/issues?q=is%3Aopen+is%3Aissue+label%3A%22design+needed%22+no%3Aassignee+-label%3Aresearch+-label%3A%22ready+for+review%22+

### Ability to support multiple phaseplans within the same manifest context. [#385](https://github.com/airshipit/airshipctl/issues/385)
~~Do we simply have a phasePlan entry in metadata? https://github.com/airshipit/airshipctl/blob/master/manifests/metadata.yaml
```yaml=
phase:
  path: manifests/phases
  docEntryPointPrefix: manifests/site/test-site
```
~~

airshipctl phase list|describe|run

Phase Plans... ***airshipctl plan list|describe|run***
This would behave the same way as the phase commands, but act upon the phaseplan documents instead.
Need new issues for the airshipctl phase describe and run commands.

What should the "well-curated" Phase Plans be? How many phaseplans do we need to define in treasuremap? How do we optimize phase utilization while reducing duplication impacting CI/CD?

### Investigate post-beta release approach [#354](https://github.com/airshipit/airshipctl/issues/354)
* Release Management Lifecycle
* Issue: create the dummy / entry-point GitHub Actions release.yaml
  * see https://github.com/fluxcd/flux2/blob/main/.github/workflows/release.yaml

## Tuesday Oct 27, 2020

### GPG Sops integration issues with gate
* https://review.opendev.org/#/c/758392
* https://review.opendev.org/#/c/758707/

Issues:
* Redesign the implementation of secret decrypt/encrypt
* Fix gating with the current design / create mock tests to bypass GPG issues.
* Example PSs that have failed because of gpg sops.
* Issue: https://github.com/airshipit/airshipctl/issues/378

### Patchset failing:
* https://review.opendev.org/#/c/759763/

From chat:
* https://github.com/ProtonMail/gopenpgp
* https://console.cloud.google.com/gcr/images/kpt-functions/GLOBAL/sops?gcrImageListsize=30
* https://github.com/GoogleContainerTools/kpt-functions-catalog/blob/master/functions/ts/src/sops.ts

## Thursday Oct 22, 2020

### Host Config Operator Design
Discuss updates needed for the Host Config Operator.
Code: https://github.com/airshipit/hostconfig-operator
Issues: https://github.com/airshipit/hostconfig-operator/issues
* Should we leverage [#1](https://github.com/airshipit/hostconfig-operator/issues/1) to encompass the design changes, or create a new issue?

### Container execution from phases
Issues:
* https://github.com/airshipit/airshipctl/issues/369
* https://github.com/airshipit/airshipctl/issues/202

Options: [doc](https://docs.google.com/document/d/1qiao8ApYCavndwhVGuSE4588vAJLXyWmvfXQSOuxUJE/edit#heading=h.qdvgn5twc4uw)
The [demo](https://github.com/aodinokov/noctl-airship-poc) of the first approach and its [screencast](https://drive.google.com/file/d/1f4ZRZqce6NkVY1xvOjeaWwApKkUKGBoc/view?usp=sharing)

### Bootstrap Container execution from phases (Sidney)
Discussion about two design proposals for this feature.
Description of designs can be found [here](https://hackmd.io/Ah1CRbxETLCsUMhdncHERw?view)
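For orientation, one rough shape such phase-driven container execution could take. Everything below is an assumption for illustration (the proposals above were still being compared at the time); the kind name echoes the generic container executor direction, but the fields should not be read as a settled API.

```yaml
# Hedged sketch: a phase executor document that runs an arbitrary container.
apiVersion: airshipit.org/v1alpha1
kind: GenericContainer              # assumed kind
metadata:
  name: bootstrap-cloud-provider    # hypothetical name
spec:
  image: quay.io/airshipit/toolbox:latest   # assumed image
  type: krm                                 # assumed container contract
```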
### Airship PTG Discussion / Reminder
* Next Wed/Thurs, 8-12 Central
* Focus will be on post-v2 scope, but all design/community topics welcome
* Please add anything to discuss to the agenda:
  * https://etherpad.opendev.org/p/wallaby-ptg-airship
* Wed 8-9 will be new dev onboarding
* Overall PTG info & agenda (don't forget to register for free)
  * http://ptg.openstack.org/ptg.html

## Tuesday Oct 20, 2020

### HWCC Profile Evaluation (Rajat, Ashu, Matt)
HWCC Profiles are evaluated by the HWCC at the time the profiles are applied, but are only re-checked when the profile resource changes, not when the node status changes. I.e., the node resources must be in Ready state prior to applying profiles. In the CS below that is accommodated by separating out worker provisioning into separate phases.
Do we want to enhance the HWCC to re-evaluate profiles when node statuses change?
* https://review.opendev.org/#/c/748421/9/tools/deployment/34_deploy_worker_node.sh@37

### Finish Review of Image Builder/Host Config spreadsheet?
https://docs.google.com/spreadsheets/d/1BQRadxOOzvRq8C6j3fe4xQ1f4AjSCfAyWG73U15pIho/edit#gid=0
In reviewing, can we determine if the item is already supported by the selected solution, or if we need an issue to add support?

### Bootstrap Container/Ephemeral Cluster with Phase Run (Sidney)
* Discuss the design for decoupling the airshipctl command from the provider's bootstrap container
* How to support command concurrency
* How to improve command usability

## Thursday Oct 15, 2020

### Review of Image Builder/Host Config spreadsheet?
https://docs.google.com/spreadsheets/d/1BQRadxOOzvRq8C6j3fe4xQ1f4AjSCfAyWG73U15pIho/edit#gid=0
In reviewing, can we determine if the item is already supported by the selected solution, or if we need an issue to add support?

### Discussion about the Labeling of Hosts Day 2
Proposal:
* Notion of labeling a BMH/provider host and cascading upwards to the appropriate nodes
* For a set of machines (MachineDeployment/MachineSets), update the labels; not available at the moment, though.
  * Currently only supported via redeployment.
* Security issue with kubelet labeling nodes.
* Generic solution: a new generic labeling operator
  * Running on the management cluster
  * Can access any workload cluster it manages.
  * Has a kubeconfig/secret for each cluster.
  * Should eventually be a component of the CAPI operator
  * Would work for multiple providers.
* Where do the labels belong / cascade up from... or serve as the source for the operator
  * Should support both:
    * BMHs
    * MachineDeployments/MachineSets
* Upstream discussions in CAPI: https://github.com/kubernetes-sigs/cluster-api/issues/493

### Pros & Cons of go<->shell
For the container bootstrap work discussed last week, there are pros & cons around using go, CLI tools, or both to do things like:
* parsing YAML input (go has an advantage)
* program logic & maintainability (go has an advantage)
* interfacing with public clouds (provider CLI tools have an advantage)

The patchsets are currently set up so that a go program is the entrypoint to the containers, and it exec's out to run the commands. Using go libraries for these CLIs may be an option; is it worth it? Other thoughts?
Changes:
* https://review.opendev.org/#/c/752298
* https://review.opendev.org/#/c/748537

### Discussion about Ironic and progress with Redfish
We discussed earlier bridging the current gaps within Ironic for Redfish compliance.
Here's a document in progress: https://hackmd.io/YDLMNIq2SCexddXHbXwosQ
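For context on where Redfish shows up in our documents today: metal3 selects its Redfish driver from the scheme of the BareMetalHost BMC address. A minimal illustrative example (all values made up):

```yaml
apiVersion: metal3.io/v1alpha1
kind: BareMetalHost
metadata:
  name: node01                          # hypothetical host
spec:
  online: true
  bootMACAddress: "52:54:00:00:00:01"
  bmc:
    # redfish+https is one of the metal3-supported address schemes
    address: redfish+https://10.23.25.1:8443/redfish/v1/Systems/1
    credentialsName: node01-bmc-secret  # assumed Secret holding BMC credentials
```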
### CAPI related proposals we might need to follow
* [Cluster API Bootstrap Reporting](https://docs.google.com/document/d/1FVRxo9toKSUmvKIUFFzPFhnFrfdR9s7S6Bl4shovNlg/edit#heading=h.3mwmvwsf4jyi)
* [Management cluster operator](https://docs.google.com/document/d/1ZsusF5c9pYxseuaKxTpctI5aUDqzl0sdCW4xxDbLm3k/edit#heading=h.wdfs5v5gumb8)

## Tuesday Oct 13, 2020

### Remember PTG Wed Oct 28, 29
* Open Infra PTG Oct 26-30: http://ptg.openstack.org/ptg.html
* Airship agenda (please add topics): https://etherpad.opendev.org/p/wallaby-ptg-airship
* Airship sessions in Cactus room 13-17 UTC (8-12 US Central), Wednesday & Thursday
* First hour of the Wednesday session will be New Developer Onboarding
* Don't forget to register (free)! https://october2020ptg.eventbrite.com

### Airshipctl persisting data
**Kubeconfig** for the target cluster. The problem is especially visible during public cloud deployments (capz, capg), but in BMH deployments as well:
- **BMH deployments**: We deploy the target cluster via the ephemeral cluster; where do we store the kubeconfig for the target cluster? You can get a secret from the ephemeral cluster with the target cluster kubeconfig, but once the ephemeral cluster is gone, the kubeconfig is gone with it. Is it a manual job to commit it into the document model, or should we require external storage to save it, and teach airshipctl to use it if needed? In the current scenario we predefine the certificates and keys, and we **DO** know the IP address of the server, but that requires that the user **MUST** generate certs beforehand, which is manual labor.
- **Public clouds and CAPD**: Even when we can predefine the certs and keys, we can't predict the IP address of the API server, which is part of the kubeconfig, so we get it from the parent ephemeral cluster (KIND); but once you tear it down, you don't have the IP, and airshipctl has no means to connect to it. Of course, we do have the ability to supply our own kubeconfig via a flag, and airshipctl will use it and connect with it. But the kubeconfig is not persisted anywhere; should we simply rely on the user and expect them to commit the kubeconfig to the manifest? Should that be automated, or should this be a general bootstrap phase which ends in the kubeconfig being committed to persistent storage automatically so it can be reused later?

Discussion at https://hackmd.io/qC7PZYxWSaqC3PZVp4d4iw

### Deduplication of network definitions #315
Look at https://review.opendev.org/#/c/749611/ for implementation details

### Airship session in 2020 virtual Open Infrastructure Summit
https://www.airshipit.org/blog/airship-featured-at-virtual-open-infrastructure-summit-in-october/

## Thursday Oct 8, 2020

### Discuss bootstrap approach for Azure, GCP etc
Defined in https://review.opendev.org/#/c/737864
And https://review.opendev.org/#/c/748537/

Questions:
* airshipctl and public clouds: priority (do we need this right now?) and level of support. If it's needed ASAP, can we have it properly architected first to comply with our current architecture? (see some big design concerns below)
* Why do we need a separate sub-command instead of a new executor? Some documentation: https://storage.gra.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_420/755398/20/check/openstack-tox-docs/4204487/docs/phases.html
* Why do we need to extend the config module with bootstrap container options (we moved them to the document model a while ago)?
* Why do we need a new container interface for bootstrapping a cluster?
* Can we not use 'Kubernetes in Docker', a.k.a. 'kind', as an ephemeral cluster? (see the config sketch below)
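For reference, the ephemeral-cluster role here needs nothing more than a stock kind cluster; a minimal config using the standard kind v1alpha4 API:

```yaml
# Create with: kind create cluster --config ephemeral.yaml
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
  - role: control-plane   # a single control-plane node is enough to host CAPI
```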
Proposal:
1. Use a KIND cluster instead of the ephemeral node for cloud providers
2. Implement an executor (and phase) for cloud providers
3. Use the clusterctl executor as-is against the KIND cluster (no additional code required)

### Regarding Image Builder
Have we considered existing tooling, in particular https://www.packer.io ?
- It can be coupled with things like Ansible to build custom images.
- They also support cloud-specific image generation. If we ever went down the route of deploying images customized for a particular cloud, say Azure or GCP, Packer has cloud-specific plugins.

### Airshipctl persisting data
**Kubeconfig** for the target cluster. The problem is especially visible during public cloud deployments (capz, capg), but in BMH deployments as well:
- **BMH deployments**: We deploy the target cluster via the ephemeral cluster; where do we store the kubeconfig for the target cluster? You can get a secret from the ephemeral cluster with the target cluster kubeconfig, but once the ephemeral cluster is gone, the kubeconfig is gone with it. Is it a manual job to commit it into the document model, or should we require external storage to save it, and teach airshipctl to use it if needed? In the current scenario we predefine the certificates and keys, and we **DO** know the IP address of the server, but that requires that the user **MUST** generate certs beforehand, which is manual labor.
- **Public clouds and CAPD**: Even when we can predefine the certs and keys, we can't predict the IP address of the API server, which is part of the kubeconfig, so we get it from the parent ephemeral cluster (KIND); but once you tear it down, you don't have the IP, and airshipctl has no means to connect to it. Of course, we do have the ability to supply our own kubeconfig via a flag, and airshipctl will use it and connect with it. But the kubeconfig is not persisted anywhere; should we simply rely on the user and expect them to commit the kubeconfig to the manifest? Should that be automated, or should this be a general bootstrap phase which ends in the kubeconfig being committed to persistent storage automatically so it can be reused later?

## Tuesday Oct 6, 2020

### Generation & rotation of site-level secrets (Sirisha)
[airshipctl secrets generate command design doc](https://hackmd.io/oXLb7IMGRtGUBpewo7spUA?view)
* Secret catalog structure - one or more than one?
* Incorporating generated secrets into phase kustomizations
* Generating secrets other than Secrets
* Design doc: https://hackmd.io/oXLb7IMGRtGUBpewo7spUA?view

Secrets generate discussion notes/details: https://hackmd.io/Mx3X6U6lQ8GLEW4ISgbDoA

## Thursday Oct 1, 2020

### [Image Builder Declarative Discussion](https://hackmd.io/6CgeJKqVQJ6vpT2DC5mx6A)

### Generation & rotation of site-level secrets (Sirisha)
Secret catalog structure - one or more than one?
Incorporating generated secrets into phase kustomizations
Generating secrets other than Secrets
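As a concrete anchor for this recurring item, here is a minimal sketch of the KRM Template-based generation idea referenced in the Dec 3 notes above. It assumes the airshipctl Templater krm-function and sprig-style template functions; the document name and values are illustrative, not a settled design.

```yaml
# Hedged sketch: a Templater document that emits a Secret with a
# generated password at render time.
apiVersion: airshipit.org/v1alpha1
kind: Templater
metadata:
  name: secret-generator-sketch   # hypothetical name
values:
  secretName: admin-password      # illustrative input value
template: |
  apiVersion: v1
  kind: Secret
  metadata:
    name: {{ .secretName }}
  stringData:
    password: {{ randAlphaNum 24 | quote }}
```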
### Airshipui - issue grooming & next steps (Andy Schiefelbein)
* Phase render & apply steps (https://github.com/airshipit/airshipui/issues/37) is the most actionable issue as of right now
* Need use cases for what a user wants to see as a landing page
* Need use cases for wanted user interactions
* https://github.com/airshipit/airshipctl/issues/359 exposing hosts for use with bare metal
* https://review.opendev.org/#/c/755380/ A first attempt at exposing the info

### HostConfig Operator Discussion Continuation

## Tuesday Sept 29, 2020
*Airship Tuesday Design Meeting-20200929 1305-1*
***Password***: dPEJMDw6
***Recording***: https://attcorp.webex.com/recordingservice/sites/attcorp/recording/playback/24f6248b93164a4b9077c3b2b94285b4

### Using **relative paths as entry points for Phases**
Impact of that: what about empty site-level phases? Do they need to be explicitly created?
Idea of an `airshipctl phase run --inherited` (meaning: get the phase from the type); airshipctl would generate phase documents implicitly from the type associated with the site.

Options that have been discussed in Slack:
```yaml
manifests:
  dummy_manifest:
    primaryRepositoryName: primary
    repositories:
      primary:
        checkout:
          branch: ${AIRSHIP_CONFIG_PRIMARY_REPO_BRANCH}
          force: false
          remoteRef: ""
          tag: ""
        url: ${AIRSHIP_CONFIG_PRIMARY_REPO_URL}
    metadataPath: manifests/metadata.yaml
    targetPath: ${AIRSHIP_CONFIG_MANIFEST_DIRECTORY}
    subPath:
---
apiVersion: airshipit.org/v1alpha1
kind: Phase
metadata:
  name: bootstrap
config:
  executorRef:
    apiVersion: airshipit.org/v1alpha1
    kind: ImageConfiguration
    name: isogen
  documentEntryPoint: airshipctl/manifests/site/test-site/ephemeral/bootstrap
```

How do I implicitly figure out the site-specific entrypoint path?

**Option I** - A combination of targetPath + subPath in the manifest document implies the location of the site
- A con of this approach is an implicit expectation on the site author for where the site phase documents live.

Example, in airship config:
``subPath: manifests/site/test-site``
``targetPath: /home/matt/airshipctl``
And then in the Phase resources:
``documentEntryPoint: target/initinfra``

**Option II** - Introduce a new value in metadata.yaml
Airship Config --> Manifest --> metadataPath --> metadata.yaml has an entry for the location of the phase or site entry point.
Repos are cloned to: targetPath + <PhaseRepositoryName>

Example, in metadata.yaml:
If metadata.yaml is defined at the type level, this might look like:
``documentEntryPointRoot: manifests/type/mytype/myphases``
If metadata.yaml is defined at the site level, then:
``documentEntryPointRoot: manifests/site/test-site``
Code will use TargetPath + <PhaseRepositoryName> + documentEntryPointRoot to identify the entryPoint.
And then in the Phase resources:
``documentEntryPoint: target/initinfra``

Example, in metadata.yaml:
``phaseBase: ../treasuremap/manifests/type/cruiser``
And then in the Phase resources:
``documentEntryPoint: target/initinfra``

**Option III** - Use PrimaryRepositoryName, which is already part of the manifest, to indicate which repository holds phases, and prepend its name to the entrypoint. Example:
``primaryRepositoryName: primary``
``DocumentEntryPoint: manifests/site/test-site``

Example result:
URL: /root/airshipctl
dirName = airshipctl
``kustomizeRoot: airshipctl/manifests/site/test-site``

Outcome:
- Change the Airship Config Manifest: replace PrimaryRepositoryName with PhaseRepositoryName
- Introduce a new value in metadata.yaml called documentEntryPointRoot that will identify the kustomize entrypoint for the site where the phases can be found. **Phase run** will use *TargetPath + <PhaseRepositoryName> + documentEntryPointRoot* to identify the kustomization entryPoint.
- Need to move Phases and Phase Plans under a ***type*** in the airshipctl manifests.
- The assumption is that metadata.yaml is usually defined at the type level and comes as part of that repo. But since we can define it in the airship config manifest, it can be customized at the site level when appropriate.

### Discussion of Firmware config extension and update on configuration moulds - Noor
I talked with Richard about this, and have some things to discuss:
* **Timeframe for configuration moulds** is at best 2021
  * Finish the Ironic implementation; still working on the specification
  * gophercloud
  * Update metal3
* How do we close the Redfish gaps in the current BIOS/Firmware/RAID implementation?
  - There will be an Ironic PTG discussion about this Redfish point.
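To make the firmware target concrete: the direction under discussion would eventually surface declarative BIOS/firmware settings on the BareMetalHost. A hedged illustration only; the `firmware` block below follows the metal3 design work that was in flight at the time, and the field names should be treated as provisional:

```yaml
apiVersion: metal3.io/v1alpha1
kind: BareMetalHost
metadata:
  name: node01                            # hypothetical host
spec:
  firmware:
    # Example settings from the metal3 firmware-config proposal:
    virtualizationEnabled: true
    sriovEnabled: true
    simultaneousMultithreadingEnabled: false
```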
