TripleO CI Community Meeting Notes 2022

tags: Meeting

Team meeting pinglist

arxcruz, rlandy, marios, ysandeep, bhagyashris, svyas, soniya29, pojadhav, akahat, chandankumar, frenzy_friday, anbanerj, dviroel, dasm

History: https://hackmd.io/IhMCTNMBSF6xtqiEd9Z0Kw

Gmeet: meet.google.com/igc-nxwj-gws

Gerrit ids

2022-12-20 Community Call

Attendees: pojadhav, chandan, marios o/, ysandeep

Agenda:

2022-12-13 Community Call

Attendees: Tengu, pojadhav, dasm, ananya, chandan, marios, jm1, ralfieri, ysandeep, bhagyashris, soniya29, akahat, apevec

Agenda:

2022-12-06 Community Call

Attendees: chandan, dasm, ananya, bhagyashris, jm1, rlandy, akahat, pojadhav, dviroel, marios

Agenda:

2022-11-29 Community Call

Attendees: Tengu, akahat, dasm, bhagyashris, rlandy, chandankumar, arxcruz, jm1, marios

Agenda:

2022-11-22 Community Call

Attendees: Tengu, ysandeep, pojadhav, marios, dasm, rlandy, jm1

Agenda:

  • [Tengu] Caching in CI
    • Why caching?
      • avoid the kind of errors we saw lately with broken DNS and ansible-galaxy downloads (among other things)
      • less bandwidth
      • better performance
    • DNS
      • Why dnsmasq instead of unbound?
        • the unbound implementation in NetworkManager is deprecated (per man NetworkManager.conf on fc-35; the latest man page no longer mentions unbound)
      • What's the current state (unbound)?
      • How to "correctly" set things up (with dnsmasq)
        • install dnsmasq package
        • configure /etc/NetworkManager/NetworkManager.conf
          • in the [main] section, set dns=dnsmasq and rc-manager=file
        • configure dnsmasq options in /etc/NetworkManager/dnsmasq.d/ci-config
          • for better caching (max-cache-ttl, min-cache-ttl, cache-size, no-negcache, dnssec/proxy-dnssec for instance)
          • for better security (interface lo)
          • for custom hosts (we may even do this to replace the /etc/hosts at some point (I'm dreaming, I know))
          • add some other DNS forwarders in case the "official" ones are dead (downtime mitigation)
        • nmcli g reload
      • Good part is: there's (probably?) no need to do anything in freeIPA scenarios
      • Alternatives
        • unbound (with dns=unbound in nm conf) - support in nm is ending
        • unbound (without networkmanager support, meaning rc-manager unmanaged) - but we may face issues with the actual DNS forwarder, or need hacks to inject the nameservers
        • systemd-resolved (with dns=systemd-resolved in nm conf): not available in cs8, not installed by default in cs9 (but available)
        • don't use any DNS caching, which leaves us exposed whenever the infra has DNS problems
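
The dnsmasq setup steps above could be sketched roughly as follows. This is only an illustration: the helper name `render_dnsmasq_conf`, the default values and the forwarder address are assumptions made for the example (none of them were agreed in the call), and only dnsmasq options named in the notes are used.

```python
# Sketch only: every value below is a placeholder, not a tuned recommendation.

# Content for the [main] section of /etc/NetworkManager/NetworkManager.conf,
# as listed in the notes.
NM_MAIN_SECTION = "[main]\ndns=dnsmasq\nrc-manager=file\n"


def render_dnsmasq_conf(cache_size=1000, min_ttl=60, max_ttl=3600,
                        extra_forwarders=()):
    """Return text for a drop-in such as /etc/NetworkManager/dnsmasq.d/ci-config.

    The function name and defaults are hypothetical; the option names
    are regular dnsmasq options mentioned in the notes.
    """
    lines = [
        "interface=lo",                # only listen on loopback
        f"cache-size={cache_size}",    # number of cached entries
        f"min-cache-ttl={min_ttl}",    # lower bound for cached TTLs (seconds)
        f"max-cache-ttl={max_ttl}",    # upper bound for cached TTLs (seconds)
        "no-negcache",                 # do not cache negative answers
    ]
    # Extra upstream forwarders, used as downtime mitigation.
    lines.extend(f"server={addr}" for addr in extra_forwarders)
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(NM_MAIN_SECTION)
    # 192.0.2.53 is a documentation-only address standing in for a real forwarder.
    print(render_dnsmasq_conf(extra_forwarders=["192.0.2.53"]))
```

After writing both files, `nmcli g reload` (as noted above) would make NetworkManager pick up the change.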

2022-11-15 Community Call

Attendees: chandan, rlandy, jm1, akahat, Tengu, rcastillo, marios, pojadhav, bhagyashris

Agenda:

  • Next Gen meetup info

  • [Tengu] Avoid future ansible sh*tstorm due to deprecations

    • basically catch deprecation warnings in a log, and get people to analyse them
    • Action: Tengu to provide something in the collect-log
  • [jm1]: How to reset our cockpit(s)?! (optional)
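
For Tengu's deprecation-warning item above, a minimal sketch of what "catching" could look like. It assumes the job log contains Ansible's usual `[DEPRECATION WARNING]: ...` lines; the function name `collect_deprecations` is hypothetical, and this is not what was actually added to the log collection.

```python
import re
from collections import Counter

# Ansible prefixes deprecation messages with this marker in its output.
DEPRECATION_RE = re.compile(r"\[DEPRECATION WARNING\]:\s*(.+)")


def collect_deprecations(log_text):
    """Count each distinct deprecation message found in a log.

    Only the line carrying the marker is captured; if Ansible wraps a
    long message over several lines, the continuation is ignored.
    """
    counts = Counter()
    for line in log_text.splitlines():
        match = DEPRECATION_RE.search(line)
        if match:
            counts[match.group(1).strip()] += 1
    return counts


if __name__ == "__main__":
    import sys

    # Usage: python collect_deprecations.py < ansible.log
    for message, count in collect_deprecations(sys.stdin.read()).most_common():
        print(f"{count:4d}  {message}")
```

Sorting by count puts the noisiest deprecations first, which may help decide what to analyse first.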


2022-11-08 Community Call

Attendees: marios, akahat, jm1, arxcruz, soniya29, ananya,

Agenda:

  • Change the community call time as well as scrum time

2022-11-01 Community Call

Attendees: arxcruz, ysandeep, dviroel, akahat, dasm

Agenda:


2022-10-25 Community Call

Attendees: dasm, marios, arxcruz, dviroel,

Agenda:

  • [dasm]: INFRA CS9 changes:
    • Problem statement:
      • Current infra deployment code doesn't work for clouds other than infra-tripleo
      • Code cannot be tested before being merged without disrupting the existing infra
    • Proposed resolution:
      • Allow deployment to a different environment for testing purposes, by making the deployment code modular
    • Plan of attack:
      • Decoupling base role into separate sub-roles (manual movement to new location)
      • Decouple "deployment" for a cloud from "service" configuration.
      • Create modular playbooks, indicating order of execution to achieve fully operational environment:
        • Configuration steps could be skipped if not needed
    • Objections:
      • The code is not tested. A "staging" tenant can be used:
        https://rhos-d.infra.prod.upshift.rdu2.redhat.com/dashboard/project/
        - ci-scripts/infra-setup/tenant_vars/infra-tripleo/ → change according to incockpit psi stuff
        - ci-scripts/infra-setup/tenant_vars/infra-tripleo/servers.yml → keep only incockpit
        - source the psi rc file
        - ansible-playbook ci-scripts/infra-setup/provision-all.yml
        

2022-10-11 Community Call

Attendees: dasm, marios, pojadhav, arxcruz, ysandeep, soniya29, chandan, bhagyashris, jm1,

Agenda:

  • [ysandeep, marios, jm1] DLRN monitoring to send alerts when it's stuck?!

    • [dasm] How often does it happen?
      • [dasm] According to ysandeep it happened 3-4 times in the last year.
    • [dasm] What does "stuck" mean?
      • DLRN stops and needs to be restarted by RDO team.
      • need follow-up with the RDO team - they have some monitoring (sandeep to ping dpawlik, amoralej or apevec to find out)
        • we are just concerned with this particular case when building stops completely
  • reminder about https://etherpad.opendev.org/p/antelope_ptg_tripleoci we need to complete this week


2022-10-04 Community Call

Attendees: artom, jparker, ananya, jm1, arxcruz, marios, rlandy, pojadhav, dviroel,

Agenda:

2022-09-27 Community Call

Attendees: Tengu, pojadhav, ananya, chandan, jm1, marios, dasm, arxcruz, soniya29, ysandeep

Agenda:


2022-09-20 Community Call

Attendees: ananya, dasm, marios, rlandy, akahat, arxcruz, jm1,

Agenda:

  • [jm1] Announcement of Ruck Rover Robot (RRR) aka Promotion Enforcer
  • What do we want to do with scrum?
    • too many stories
    • better enforcement of time limits
      • by whoever is running the call
    • tech discussions need to be taken offline
    • focus on kanban, less on scrum (dariusz, want to write down your thoughts here?)
      • [dasm] Scrum focuses on planning sessions:
        • Entire team agrees on the number of points assigned (points reflect how long a particular task will take to finish, not its difficulty)
        • Scope of sprint cannot be altered - it means there is no place for incoming tasks/interrupts
      • [dasm] Kanban focuses on one person per task, which means no more hoarding tasks for one engineer. If there is someone else who can work on the task it can be grabbed and worked on
      • [dasm] Extra column on the Board, which represents "Blocked". It gives visual cues on what's really not feasible to be done.
        • +1 like that
      • [dasm] Engineers are responsible for moving cards, there is no need for separate facilitator of meetings.
    • three line status?
      • what I'm doing
      • what's next
      • blockers/risks for delivery
    • breaking down tasks
      • adding definition of done for tasks
      • [dasm] yes, we need to have more granular tasks, which represent what needs to be achieved.
    • Book

2022-09-13 Community Call

Attendees: marios, ananya, pojadhav, dasm, chandan, doug, arxcruz, rlandy, jm1,

Agenda:


2022-09-06 Community Call

Attendees: ananya, pojadhav, rlandy, arxcruz, akahat, marios, jm1, dasm, chandan

Agenda:

  • [dasm]: Infrastructure changes for TripleO CI: Toolbox, Cockpit, Promoter
    • Update C7 VM to CS9 VM
      • Ansible code updates
        • Some updates were done manually, and not reflected in the code
    • Move downstream baremetal workload to VM
    • Monitoring for tools
      • To be decided
    • Future plans: migrate tools towards openshift environment
  • [marios, jm1]: Jira card for updating links, notes and warnings for our infra, e.g.
    • current hostnames of cockpits, promoters etc.
    • link to our infra-setup repo
    • warning about ansible-pull running in the background aka DONT DO ANY MANUAL CHANGES TO INFRA UNLESS YOU KNOW WHAT YOU ARE DOING

2022-08-30 Community Call

Attendees: dasm, pojadhav, chandan, jm1, akahat, rlandy, bhagyashris, arxcruz,

Agenda:


2022-08-23 Community Call

Attendees: arxcruz, pojadhav, jm1, akahat, bhagyashris, rlandy, ananya, dasm, dviroel

Agenda:

  • New learnings: [Amol] Golang
    • akahat and arxcruz are interested in using golang in more non-critical places for our team. Especially around "promoted" area, as well as "skiplist".
    • jm1 and dviroel mentioned that the team works mainly with OpenStack, where all tooling is written in Python, hence we should not introduce new dependencies.
    • dasm mentioned that the skiplist code required some tender loving care before it could be compiled and used again. The code was not widely used across the team, hence there was no interest in keeping it up-to-date.
    • rlandy is open to new code, as long as it serves the team's interests. golang might be the right thing, given the team's involvement in openshift. R, C or Rust not really.
    • rlandy suggested we might look for the right place to use new golang code. It might be Operator involvement, with new tools.
    • arxcruz and akahat were asked if they're going to take care of supporting old code, doing reviews and keeping it up-to-date. Both confirmed interest.
    • ysandeep and dviroel mentioned that currently the main code in Operator CI is written in bash (bash scripts) and there is not much room for new tools, especially ones written in golang.
      • doug: The tool is written in Go, but steps/jobs/config are bash and yaml files (what we are using so far)
    • The meeting ended without consensus. This topic needs to be discussed again.

2022-08-16 Community Call

Attendees: ananya, marios, arxcruz, akahat, chandan, bhagyashris,

Agenda:

  • periodic mixed rhel job current-tripleo vs tripleo-ci-testing
  • Network update in downstream:
    • Because of metadata issue.

2022-08-09 Community Call

Attendees: dasm, jm1, Tengu, pojadhav, bhagyashris, arxcruz, chandan, rlandy

Agenda:

  • (Tengu) nftables status: we're almost there!

    • Thank you Sandeep for the hard work
    • Leading to some changes in order to get better logging, better rules and ensure things will be working in the future
    • Logging was already useful in order to spot firewall issues
    • nftables will make TripleO more secure thanks to the way rules are now implemented! (INPUT chain policy, dedicated chains and so on)
  • (Tengu) enable auditd service in a (periodic?) job to ensure default proposed ruleset is valid (hint: for now, it's NOT)

  • (jm1) How we do reviews and how we can improve them (continued)

2022-07-26 Community Call

Attendees: dasm, chandan, ananya, arxcruz, bhagyashris, akahat, jm1, rlandy, marios,

Agenda:

  • (dasm) OVB reparenting:
  • (dasm) skiplist
  • (dasm) tooling to simplify our lives - ideas

2022-07-19 Community Call

Attendees: marios, Cédric(Tengu), ysandeep, soniya29, bhagyashris, jm1,

Agenda:


2022-07-12 Community Call

Attendees: dasm, jm1, ananya, marios, abregman

Agenda:


2022-07-05 Community Call

Attendees: ananya, pojadhav, jm1, dasm, arxcruz, rlandy, marios, dviroel, akahat

Agenda:

2022-06-28 Community Call

Attendees: arxcruz, chandan, rlandy, dasm, ananya, marios, abregman, bhagyashris, akahat, fultonj, dviroel

Agenda:

2022-06-21 Community Call

Attendees: ibernal, ananya, pojadhav, akahat, dviroel, marios, fultonj, jm1, arxcruz, jgilaber

Agenda:


2022-06-14 Community Call

Attendees: dviroel, Chandan, ananya, akahat, rlandy, arxcruz, marios, jgilaber, fultonj, dasm

Agenda:


2022-06-07 Community Call

Attendees: ibernal, marios, ysandeep, abregman, jsilvahe, eliadcohen, fultonj

Agenda:


2022-05-31 Community Call

Attendees: Tengu, rlandy, rcastillo, pojadhav, marios, arxcruz, dasm, dviroel

Agenda:


2022-05-24 Community Call

Attendees: chandan, marios, pojadhav, arxcruz, dasm, bhagyashris, akahat

Agenda:


2022-05-17 Community Call

Attendees: pojadhav, chandan, rlandy, ysandeep, arxcruz, ananya, marios, bhagyashris, jm1, dviroel

Agenda:


2022-05-10 Community Call

Attendees:

Agenda:


2022-05-03 Community Call

Attendees:

Agenda:

  • [jm1] Ansible OpenStack collection: Zuul CI jobs and Grafana Dashboard (postponed to next week because of a public holiday in India)

2022-04-26 Community Call

Attendees: chandan, rlandy, ananya, ibernal, abregman, ysandeep, bhagyashris, akahat, dasm, jm1, dviroel

Agenda:


2022-04-19 Community Call

Attendees: chandan, ibernal, marios, arxcruz, dasm, pojadhav

Agenda:

  • [abregman] on behalf of the Ironic team: can we have a separate view on the component pipeline dashboard (Grafana) for DFG purposes, so they can customize it the way they want it to look? https://github.com/rdo-infra/ci-config/tree/master/ci-scripts/infra-setup/roles/rrcockpit
    • main question is what is missing/what they want to see
      • They don't want to see all the data, related to all components. Just the status of one specific component
  • CI meetings on Wednesday / Thursday? (or do you want to ask Ronelle etc. first, Bhagyashri?)
    • Retro on Thursday

2022-04-12 Community Call

Attendees: dasm, rlandy, marios, ananya, arxcruz, bhagyashris, pojadhav, ibernal, jm1, abregman, jpodivin

Agenda:


2022-03-29 Community Call

Attendees: dasm, pojadhav, rlandy, bhagyashris, rcastillo, jm1

Agenda:

  • [jm1] Zuul in-depth session: Job variants and branches (will be held in future)

  • [jm1] issues like we had yesterday with tripleo ansible currently cannot be discovered because tripleo jobs on ansible podman collection do not redeploy containers. Shall we add a job for github.com/containers/ansible-podman-collections which redeploys overcloud, e.g. runs overcloud deploy twice or runs standalone twice?

  • Jakob

    • Suggestion: We create one more job that tests redeploy
      • We will brainstorm about doing redeploy in vanilla standalone.
  • Sandeep:

    • I discussed c8 wallaby coverage in the checkpoint call today; the suggestion was to keep one job which really tests the overcloud (and not only the undercloud) as the minimum set for c8 wallaby coverage. People on the call suggested that at least the vanilla standalone job should run for both c8 and c9 for wallaby in check/gate.
  • naming stable1 line

    • wallaby cs9: stable1-cs9 & stable1: https://review.rdoproject.org/r/c/config/+/41041

      • +1: +1(marios), +1 (bhagyashris), +1(dviroel)
      • -1:
    • wallaby cs8: stable1-cs8 & stable1 :

      • +1: dasm, rlandy, ysandeep, rcastillo
      • -1:
    • 3rd option? stable1-cs8 & stable1-cs9?

      • +1:
      • -1:
  • [dasm] retrospective/planning meetings time


2022-03-22 Community Call

Attendees: ananya, jpodivin

Agenda:


2022-03-15 Community Call

Attendees: dasm, marios, pojadhav, arxcruz, jm1, ananya, dviroel

Agenda:


2022-03-08 Community Call

Attendees: jm1, chandan, rlandy, pojadhav, ananya, ysandeep, Tengu, rcastillo, bhagyashris, jistr

Agenda:


2022-03-01 Community Call

Attendees: pojadhav, ananya, jm1

Agenda:

  • [TripleO] centos9 jobs only for master, centos 8 & 9 for wallaby
    • http://lists.openstack.org/pipermail/openstack-discuss/2022-February/027403.html

    • Upgrades requirements - are we supporting upgrading to Wallaby on 8? For example, the coming undercloud-upgrade-ffu job will be train 8 to wallaby 8. In which case we definitely need to keep at least some subset of 8 jobs (and can't entertain the removal of 8 from Wallaby completely).

    • Are we importing from Wallaby 8 or Wallaby 9? Currently it is 8 but this will soon switch.

    • For the wallaby c8 'subset of jobs' e.g. multinode, vanilla standalone (no scenarios? some subset of them?), undercloud-ffu, minor update.


2022-02-22 Community Call

Attendees: Chandan, pojadhav, ysandeep, marios, ananya, arxcruz, jm1, rcastillo, akahat

Agenda:


2022-02-15 Community Call

Attendees: ysandeep, dasm, dviroel, chandan, ananya, marios, arxcruz, jm1, akahat, pojadhav

Agenda:


2022-02-08 Community Call

Attendees: bhagyashris, rlandy, pojadhav

Agenda:


2022-02-01 Community Call

Attendees: ysandeep, bhagyashris, ananya, dviroel, alfrgarc, jm1, arxcruz, akahat, rcastillo

Agenda:


2022-01-25 Community Call

Attendees: Tengu, ananya, rlandy, arxcruz, ysandeep, marios, alfrgarc, dviroel, jm1, rcastillo, akahat, bhagyashris

Agenda:


2022-01-18 Community Call

Attendees: dviroel, rlandy, ananya, marios, akahat, rcastillo, jm1, dasm, bhagyashris

Agenda:

  • osp director operator intro
  • compose pinning work - next steps
    • Add CentOS 8
    • Test backwards
    • Add more jobs

2022-01-11 Community Call

Attendees: bhagyashris, chandan, Tengu, dasm, rlandy, marios, akahat, ananya, dviroel, jm1

Agenda:

  • ansible-lint usage and its value
  • osp director operator intro
  • (Tengu) container building containers, container building OC images
    • One VM able to build things for multiple releases right away?

2022-01-04 Community Call

Attendees: rlandy, marios, arxcruz, dviroel, dasm, ananya

Agenda:


2021-12-21 Community Call

Attendees:

Agenda:

  • Plan to scale down c8 testing
    • master moving c9 (DF/prod chain)
      • ok to nuke c8 altogether on master?
      • downside - backports will be tested for the first time on c8 when we hit wallaby
      • possibility: run provider and (standalone?) containers-multinode (intermediate) - only reason we would do this is for backports. If backports are of concern, let's not do this. There is no product value for master on c8.
    • wallaby c8/c9 split (upgrades/prod chain)
      • how much c8 do we need? - consider c7 right now, container build, standalone, scenarios and containers-multinode
      • train on 8 -> wallaby on 9
        • so do we do the OS upgrade on T or on W, i.e. do we need to support 8->8 upgrades from T->W?
    • upgrade w->? what is the upgrade path?
      • target the line to support upgrade path
    • backports (minimal c8?)
    • rhos-17?
      • 16.2 -> 17 upgrade
  • Thursday - propose patch to run c9 on all tripleo repos
    • patch 1 - moves c9 from branchful to all templates ( as non-voting)
    • patch 2 - makes voting
    • patch 3 - reduces c8
  • Templates
    • keep one template for c8 and c9
      • let's try that first
      • once posted, also try posting something that will limit c8 on master (to see what it looks like)
    • create specific c9 templates
      • would need to create the new templates in tripleo-ci
      • then go wire them up in all the repos

2021-12-14 Community Call

Attendees: chandan, rlandy, ananya, marios, dviroel, ysandeep, bhagyashris

Agenda:


2021-12-07 Community Call

Attendees: dviroel, marios, ananya, rlandy, bhagyashris

Agenda:


2021-11-30 Community Call

Attendees: ysandeep, matbu, pojadhav, marios, arxcruz, dviroel, bhagyashris

Agenda:


2021-11-23 Community Call

Attendees: ysandeep, rlandy, marios, fultonj, pojadhav, fmount, gfidente, matbu

Agenda:

2021-11-16 Community Call

Attendees: ysandeep, dviroel, rlandy, pojadhav

Agenda:


2021-11-09 Community Call

Attendees: chandan, marios, ysandeep, ananya, dviroel, matbu (VF), pojadhav

Agenda:


2021-11-02 Community Call

Attendees: akahat, chandan, ananya, marios, rlandy, pojadhav, jfrancoa (Upgrades), holser (Upgrades)

Agenda:

