# Akri Developer Meeting Notes
###### tags: `Meeting Notes`
**Note**: We've migrated older meeting notes to a separate page because we reached the maximum length. To read previous meeting notes from February 2021 to November 2023, click [here](https://hackmd.io/@akri/rJED-1EB6).
Akri Developer calls take place on the first Tuesday of every month from 8 AM to 9 AM PST on [Zoom](https://zoom.us/j/6894895379).
Anyone is welcome to join this call and participate. Please abide by Akri's [Code of Conduct](https://github.com/project-akri/akri/blob/main/CODE_OF_CONDUCT.md). The meetings are recorded and published afterwards on the [Deis Labs YouTube Channel](https://www.youtube.com/channel/UC90VZDjT8C7ca7zcuFi6oEQ).
If you would like to propose an agenda item, add a comment to the Agenda section under the specific date.
For any other questions or comments, message Yu Jin Kim or Kate Goldenring in the [Kubernetes Slack Channel](https://kubernetes.slack.com/messages/akri).
# May 7th
## Assignments
| Assignment | |
| ------------ | --- |
| Moderator | |
| Notes | |
| Issue Triage | |
## Attendees (please add name and company)
## Announcements
## Agenda
- Submit talk for KubeCon NA?
- Timeline for Akri v1.0 (end of November?)
## Discussion
## Action items
## Assignments for Next Meeting
| Assignment | |
| ------------ | --- |
| Moderator | |
| Notes | |
| Issue Triage | |
# April 9th
## Assignments
| Assignment | |
| ------------ | --- |
| Moderator | Yu Jin Kim |
| Notes | Kate Goldenring |
| Issue Triage | |
## Attendees (please add name and company)
- Nicolas Belouin (SUSE)
- Kate Goldenring (Fermyon)
- Yu Jin Kim (Microsoft)
- Ethan Chang (Microsoft)
## Announcements
- Akri v0.12.20 is live!
## Agenda
- KubeCon EU 2024
- Yu Jin and Nicolas had a talk at Kubernetes on EDGE Day
- Talk went well and demo worked (after a small network hiccup)
- 100+ people
- Good feedback after the talk -- talking about it at the booth
- SUSE gave keynote at Kubernetes on EDGE and talked about the tiny edge and Akri's role
- Akri booth: originally on wait list and got off it for Thursday evening. Fun to have the dynamic demo
- Used Akri twitter to promote. Not a great booth spot for visibility
- Nicolas, Yu Jin, and Reese had joint session at Microsoft booth. 3 options: open source, SUSE edge 3.0, or Azure IoT operations
- Nicolas: a little disappointed that people didn't want to try printing something after seeing the demo
- Nicolas had a discussion with DRA maintainers (Patrick and Alexander). They are moving it in the wrong direction. Nicolas is trying to get more involved. They are more focused on the autoscaler, and want to integrate the device as being a part of the node. They are focused on the GPU story. They are linking the device to the node with a resource that looks like the Akri Instance CRD but with one node rather than a list. Capacity is also an issue: 1 or infinity. Nvidia is also interested in network based devices. Without tying resources to nodes, it adds latency and complexity to the scheduler. And AI workloads may want to filter more on a specific GPU. Currently have NameResources which has a list of devices for a node. Idea is that the scheduler doesn't have to ask anyone to schedule a job. After some conversation, Patrick said maybe we could change the node field to be a node selector. DRA maintainers asked if they could reference our Rust DRA implementation.
- Kate: All this being said, are we going to merge or keep working on all the DRA changes?
- Nicolas: let's finish all the refactoring work: finishing agent refactor, splitting up CRDs, arbitrary resource creation, splitting out repository
- Nicolas: But overall I think we will eventually use DRA
- Yu Jin: After the Microsoft booth session, people asked how Akri is different from KubeEdge. They have protocol mappers from edge devices for a variety of protocols. Not sure if their device bridging operator is independent and could be applied to other clusters
- Community engagements (metrics and ideas)
- Ethan has been looking at our release metrics and has some ideas on how we could improve our visibility
- Generates ecosystem visibility reports every month
- issue board visits
- unique clones
- unique repo visitors
- visit numbers of Akri home site
- do we have google analytics numbers?
- or would porting to netlify give us those numbers
- YouTube video views
- Unique downloads
- agent or controller? - would be ideal to track both
- do we need to be wary about whether these are used in workflows?
- Do we want to record maintainer contributions?
- devstats through CNCF does this: https://akri.devstats.cncf.io/d/8/dashboards?orgId=1
- Want to update some of the demo links to the site. How to do that?
- Add PR to https://github.com/project-akri/akri-docs
- Issues
- Many are created by the maintainers. Can we create a label to track the ones that are not created by maintainers (say "ecosystem")?
- Kate: is this something you see in other projects?
- Ethan: Akri seems unique in that it has a lot of issues created by maintainers
- Kate: I have seen this pattern in other projects such as Spin and SpinKube. It is a great way for maintainers to keep track of todos.
- Ethan: Looking for a way to make sure we are tracking user issues
- Nicolas: we have the `bug` label that specifies something as needing to be fixed vs `enhancement` which is lower priority.
- Kate: Response to issues also needs to be improved - probably would be resolved by recruiting more active contributors
## Discussion
## Action items
## Assignments for Next Meeting
| Assignment | |
| ------------ | --- |
| Moderator | |
| Notes | |
| Issue Triage | |
# March 5th
## Assignments
| Assignment | |
| ------------ | --- |
| Moderator | Yu Jin |
| Notes | |
| Issue Triage | |
## Attendees (please add name and company)
- Kate Goldenring (Fermyon)
- Johnson Shih (Microsoft)
- Lior Lustgarten (Microsoft)
- Nicolas Belouin (SUSE)
- Yu Jin Kim (Microsoft)
## Announcements
## Agenda
- KubeCon EU talk and booth
- Talk at KubeCon Edge day
- One day at Kiosk booth
- SUSE Microsoft joint session at Microsoft booth
- Marketing with Twitter/X, LinkedIn to promote talks, booth
- Kate will tweet all the things
- Cut release soon? Status of DRA?
## Discussion
- Bug with deallocating resources with DRA mode
- kubelet is crashing
- has started talking to the DRA folks about it
- Cutting a release?
- agent refactor could maybe go in soon
- it would be worth cutting a release to add this (adds CRI) but we would not have a feature set
- should cut a release before KubeCon since we have things like updates of dependencies
- Rust security issues came up -- we should make sure these are not affecting us and are resolved
- update dependabot
- DRA: don't think we can land DRA anytime soon, but all the things related to refactoring agent and controller / splitting config / refining API can be done without DRA being active
- may be able to merge DRA later to continue testing but DRA should not be default yet
- version matrix of which versions of akri support which version of kubernetes
- we should aim to cut the release by **Thursday the 14th**
- in that time, if someone can update dependabot that would be great, but otherwise we already have other dependency updates in
- We use actix, tonic (axum), and a third web server. Let's choose one. Also, we can use the kube-rs crate for our webhook. https://github.com/project-akri/akri/issues/375
- Lior to try and tackle this before cutting release
- example: https://github.com/kube-rs/kube/blob/main/examples/admission_controller.rs#L24
- in case of cert rotation, what is a common approach to address this (currently webhook reads a cert once and uses it forever)?
- is there a way to automate if secret is rotated, pod restarts?
- potential approach: whenever request comes in from server, webhook always reads current cert from storage if it is a volume mount
- if kube-rs has utils for cert rotation we should investigate
- Johnson to open an issue on supporting cert rotations
- how do we grow community and maintainership?
- Community growth
- flashier demos (AI)
- more conferences (Scale)
- CNCF Webinars
- Collaborate with other CNCF projects
- List who is adopting Akri
- akri gathering for a weekend to meet deadlines and have a hackathon day together
- Get us listed on Kube RS adoption: https://kube.rs/adopters/
- Get us listed under [device plugin interface](https://kubernetes.io/docs/concepts/extend-kubernetes/compute-storage-net/device-plugins/)
- (later) get us listed under DRA
- https://github.com/project-akri/akri/issues/688
- Maintainership
- KPIs - X bugs a season
- Once we get to 1.0, we will have more things like performance testing. If more stable, then more people interested in maintaining
- triage rotation - i.e. one person tackles responding to issues for a week then rotate?
- Kate to maybe help set up spin app for triage rotation
- folks should be able to mark times off
- promote/remind in Akri Slack good first issues so community can take them on
- Should we switch to using OTEL
- Kate: switch to OTEL since a lot of other people/projects use OTEL
- Johnson to create an issue to track this
- could be a good community member ask since there are other experts out there on telemetry
- moving DH out of the main repo
- udev is moved out and could potentially move the rest out but need to resolve versioning
- this would make it easier for others to contribute to akri
- close to supporting Windows (update socket usage?) for Akri
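
On the webhook cert rotation question above, the "always read the current cert from storage" idea could be sketched roughly as follows. This is a minimal, std-only Rust sketch, not Akri's webhook code: `CertSource`, `open`, `refresh`, and `pem` are hypothetical names, and it assumes the certificate is a plain file on a mounted volume. Wiring the refreshed bytes into a TLS server is deliberately left out.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Hypothetical helper (not part of Akri or kube-rs): keeps the last
/// certificate bytes read from a mounted file and re-reads them on demand.
pub struct CertSource {
    path: PathBuf,
    cached: Vec<u8>,
}

impl CertSource {
    /// Reads the certificate once at startup.
    pub fn open(path: impl AsRef<Path>) -> io::Result<Self> {
        let path = path.as_ref().to_path_buf();
        let cached = fs::read(&path)?;
        Ok(Self { path, cached })
    }

    /// Re-reads the file and returns `true` if its contents changed
    /// since the last read. Intended to be called per incoming request.
    pub fn refresh(&mut self) -> io::Result<bool> {
        let latest = fs::read(&self.path)?;
        let changed = latest != self.cached;
        if changed {
            self.cached = latest;
        }
        Ok(changed)
    }

    /// Returns the most recently read certificate bytes.
    pub fn pem(&self) -> &[u8] {
        &self.cached
    }
}
```

Comparing bytes rather than timestamps keeps the check simple. A real webhook would still need to rebuild its TLS configuration whenever `refresh` returns `true`.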
## Action items
- Cut release on March 14th
- Johnson put up issue on webhook cert rotation
- Lior look at updating webhook https://github.com/project-akri/akri/issues/375
- Kate to investigate app for triage rotation
## Assignments for Next Meeting
| Assignment | |
| ------------ | --- |
| Moderator | |
| Notes | |
| Issue Triage | |
# February 6th
## Assignments
| Assignment | |
| ------------ | --- |
| Moderator | Lior |
| Notes | Yu Jin |
| Issue Triage | Nicolas |
## Attendees (please add name and company)
- Andrew Gracey (SUSE)
- Yu Jin Kim (Microsoft)
- Nicolas Belouin (SUSE)
- Lior Lustgarten (Microsoft)
## Announcements
## Agenda
- Discuss past weekly PR review meetings and progress
## Discussion
- Weekly PR discussions
- trying to add more comments and discuss so everyone can review PRs even without prior knowledge
- once a week has been a good cadence (Lior)
- if there are many pending PRs we can review more frequently, but the PR owner would need to be present
- feedback received in the sessions was good (Nicolas)
- would be great to have these merged before kubecon
- might not be realistic to cut a v1.0 release by kubecon at this point
- may have all the features we want in API 1.0 as PRs (maybe not all merged)
- security review, benchmark testing, etc. need to be done before v1.0 is cut
- for KubeCon we are on the waitlist for a booth in Project Pavilion
- Nicolas preparing Akri demo for KubeCon
- if we don't have an Akri booth, we can showcase in the SUSE booth
- maybe Microsoft and SUSE can do something together if there is space for Edge/IoT things
- Yu Jin to follow up with Ralph
- Issue triage
- #685: multiple configuration fails on rbac.enabled=false
- might want to split documentation into dev-testing install vs actual usage
- quickstart that uses helm install with configuration embedded or how to write configuration for more complex use cases
- #107 (docs): missing trademark
- Kate started a PR as a quick fix
- gitbook doesn't support footer or header if you're not in the enterprise subscription
- we are stuck with the quick fix for now until we migrate to something more conventional
- CNCF advises projects to use Hugo
- we should create another issue to track migration for the long term
- #686: security alert
- may need to upgrade fork of h2 to have a fixed version
- can we have some kind of dependabot for this thing to propose upgrades before we have security notifications?
- wait for next dependency upgrade
- dependencies on licenses - should check with CNCF
- we should go through the items marked as investigating and see whether there has been any progress or any progress can be made
## Action items
- Yu Jin to create an issue to track docs migration for the long term
## Assignments for Next Meeting
| Assignment | |
| ------------ | --- |
| Moderator | |
| Notes | |
| Issue Triage | |
# January 9th
## Assignments
| Assignment | |
| ------------ | --- |
| Moderator | Kate |
| Notes | Yu Jin |
| Issue Triage | |
## Attendees (please add name and company)
- Yu Jin Kim (Microsoft)
- Johnson Shih (Microsoft)
- Nicolas Belouin (SUSE)
- Lior Lustgarten (Microsoft)
- Kate Goldenring (Fermyon)
## Announcements
- Paris Kubernetes on Edge Day CFP for Akri and DRA accepted!
## Agenda
## Discussion
- Main events CFP approval still pending
- our agenda should move towards getting Akri into good shape for the CFP
- prep for DRA (don't think we can make any release with DRA by then)
- breaking out config and other proposals: we should try to release some of these features
- need to think about when to start deprecating device plugin system
- in KubeCon NA would be cool to talk about what DRA did for us
- want to support backwards compatibility for device plugin for a while
- PR for rework of agents is functionally done, just need to test
- timeline/roadmap towards Kubecon Paris
- which features are must haves and nice to haves?
- must have: Agent refactor PR
- must have: configuration split
- nice to have: full DRA implementation for demo
- not needed: status for resources
- not needed: arbitrary scheduling of resources
- nice to have: discovery handlers pulled out, but we can take a deeper look at release/build pipelines and discuss
- besides the agent refactor work, our work should not be affected by the reorganizing of the discovery handler into separate repo
- weekly/biweekly cadence to review things?
- 1 hour window a week to meet
- **Tuesday 8am PST** for weekly review meeting
- from there move to biweekly if necessary
- before we meet, let's establish in slack what we want to cover in the meeting
- first, we can discuss refactor of agent
- WASM discovery handlers
- we should still look into this at some point
- Issue triage / open PRs
- close autoupdate dependencies PR to see if bot will kick off a new one
- update tonic and prost: opc ua has created a new release, so this should be ready to update and merge
- refactor agent: we should review and cover in next weekly review meeting
- update docs PRs with labels
- akri DH unable to detect IP camera: no response so we can close
- lorawan support: move to backlog
- need to investigate the licensing for lorawan and whether it is permissible within CNCF
- add instructions for akri instance ownerreference when requesting akri resources: doesn't interfere with controller's owner refs
- pods with unready containers exist on node, can't clean slots yet - investigating
- containerd.socket security concern / best practice: check agent refactor work - investigating new approach
- agent-registration.sock and udev.sock socket files world readable - investigating
- next weekly review meeting
- we can do ad hoc at first but we should have a summary / key takeaways from the review meeting
- in the future we can take notes or record if necessary
## Action items
- Ad hoc code review of agent refactor meeting on Tues, Jan 16th 8-9AM PT
## Assignments for Next Meeting
| Assignment | |
| ------------ | --- |
| Moderator | |
| Notes | |
| Issue Triage | |
# December 5th
## Assignments
| Assignment | |
| ------------ | --- |
| Moderator | Nicolas |
| Notes | Yu Jin |
| Issue Triage | Lior |
## Attendees (please add name and company)
- Andrew Gracey (SUSE)
- Kate Goldenring (Fermyon)
- Nicolas Belouin (SUSE)
- Lior Lustgarten (Microsoft)
- Yu Jin Kim (Microsoft)
## Announcements
- Archived older meeting notes [here](https://hackmd.io/@akri/rJED-1EB6)
- Akri in [Azure IoT Operations Public Preview](https://learn.microsoft.com/en-us/azure/iot-operations/manage-devices-assets/overview-akri)!
## Agenda
- Migrating discovery handlers and samples out of main repo
- [PR](https://github.com/project-akri/udev-discovery-handler/pull/1) to move udev DH
- Review proposals?
- Issue triage
## Discussion
- Akri in Azure IoT Operations Public Preview
- OPC UA is a private discovery handler
- planning on leaning on community and partners for more discovery handlers
- [PR](https://github.com/project-akri/udev-discovery-handler/pull/1) to move udev DH
- pending, hoping to get back into this soon
- ordering of moving DH? do we want to prioritize which ones move out in what order
- debug echo will stay for debugging
- allow agent to be able to pull WASM discovery handlers
- what does WASI support? async might be difficult
- for now pull them out and keep embedding as crates
- reverse compatibility?
- need to be particular about interface/runtime we choose to support
- WASI over WASIX and work with Bytecode Alliance to get a fuller feature set
- containerd shim will also define runtime we use (if we want DH to be pod)
- could also do embedded binaries (should we offer the option?)
- if user doesn't have containerd shim, use same deployment manifest to add runtime class and change (with helm chart)
- [Proposals](https://github.com/project-akri/akri-docs/pulls)
- we should add `proposals` tag in the PR
- external device inventory: Johnson's picking this up, will ask him to update the PR with any changes he is making to Leo's initial proposal
- arbitrary broker resource type: changes API (splitting resources)
- status field: also waiting for resource splitting
- MQTT DH: will wait for current DH to be out of main repo
- currently Nicolas has reference implementation
- if we accept as is, we can move the repo to Akri
- DRA mode: first need to split config
- this one's likely the first to be implemented
- maybe we can make split config proposal
- need to decide on naming (likely discovery configuration)
- first split config, get merged then get DRA one merged
- Setting goals for getting Akri to Incubating project
- great idea, we should finish all the changes we started and get to stable API state then apply for Incubating
- opportunity to bring in lots of contributors but right now there are so many changes that it is difficult for new users to touch code
- we can start thinking about goals soon
- Issue triage
- [#659](https://github.com/project-akri/akri/issues/659): ask if Johnson's comment resolved issue, if so we can close
- [#662](https://github.com/project-akri/akri/issues/662): controller pod restarts under normal conditions
- wanted to bubble up errors into restarts
- could consider not bubbling up errors if this is bad practice
- had this request previously from IIoT team
- new kube controller with API changes will hide all the restart errors - can close on this issue
- [#674](https://github.com/project-akri/akri/issues/674): good first issue to point new contributors to
- moving away from tarpaulin in favor of rust-based coverage tools
- add to backlog
- [#677](https://github.com/project-akri/akri/issues/677): lorawan support
- once we move all DH into their repos, this will be up to someone to build
- maybe they have one that's not OSS that they want to use
- need to add docs where all the DH and images live and instructions on how to add them to helm chart
- we probably don't want to add support for any DH that we cannot test
- clarify whether they are asking for it or offering to provide it. How can we support them?
- [Docs #35](https://github.com/project-akri/akri-docs/issues/35): goes along with requesting instances page and adding owner reference (good first issue)
- would this not mess with the controller?
- need to find a way to ensure they don't overlap
- should try it before documenting
- [Docs #83](https://github.com/project-akri/akri-docs/issues/83): something in gitbook configuration
- someone with admin rights on gitbook will have to change them
- [Docs #84](https://github.com/project-akri/akri-docs/issues/84): good first issue
- [Docs #103](https://github.com/project-akri/akri-docs/issues/103): we can fix and redirect
- can we do this or need to put in CNCF ticket?
- [#678](https://github.com/project-akri/akri/issues/678): will likely be a breaking change on OPC UA DH
- good first issue
- would be good to get someone else's eyes on it with more expertise on OPC UA
- [#681](https://github.com/project-akri/akri/issues/681): this may be expected behavior but not desired behavior
- slot reconciliation is not something we're planning to change with DRA
- should look into better ways of claiming the slots
- need some clarification - output doesn't make things clear
## Action items
## Assignments for Next Meeting
| Assignment | |
| ------------ | --- |
| Moderator | |
| Notes | |
| Issue Triage | |