Archived Akri Developer Meeting Notes (February 2021 - November 2023)

tags: Meeting Notes

Akri Developer calls take place every first Tuesday of the month at 8AM to 9AM PST on Zoom.

Anyone is welcome to join this call and participate. Please abide by Akri's Code of Conduct. The meetings are recorded and published afterwards on the Deis Labs YouTube Channel.

If you would like to propose an agenda item, add a comment to the Agenda section under the specific date.

For any other questions or comments, message Yu Jin Kim or Kate Goldenring in the Kubernetes Slack Channel.

November 14th

Assignments

Assignment
Moderator Yu Jin
Notes Kate
Issue Triage Lior

Attendees (please add name and company)

  • Andrew Gracey (SUSE)
  • Nicolas Belouin (SUSE)
  • Johnson Shih (Microsoft)
  • Kate Goldenring (Fermyon)
  • Lior Lustgarten (Microsoft)
  • Yu Jin Kim (Microsoft)

Announcements

Agenda

  • Kubecon NA 2023 Recap
  • CFPs for Kubecon EU 2024
  • Discuss boundaries of Akri project vs 3rd parties
  • Review proposals

Discussion

  • Kubecon NA 2023 Recap
    • not a great booth location
    • a few developers came by and were interested
    • mentioned that aiming for Akri 1.0 next year
    • Discussions with Microsoft and Fermyon around deploying Wasm brokers
  • CFPs for Kubecon EU 2024
    • CFP for Akri 1.0: https://hackmd.io/@akri/H1Nu9jYxa
      • explain what Akri is, then get to what is new with DRA
    • Submit to edge day in addition to KubeCon
    • For DRA, would be nice in parallel to reach out to intel folks who implemented the KEP
    • Kate: We should be wary of recruiting community at the same time we are reconstructing Akri
    • Nicolas: On community stability, we should have a list of good first issues so we can lead contributors to things in our road map, so they do not need to search for what to do
  • Discuss boundaries of Akri project vs 3rd parties
    • Where should the discovery handlers live?
    • Andrew: Seems weird to have a few DHs maintained
    • Nicolas: Start with a section in the docs that lists the external discovery handlers and inventory providers. This could be a first step
    • Kate: is the roadblock to pulling out the DHs the actions, versions, etc.?
    • Nicolas: it seems the issue is more that they are embedded.
    • Kate: could we make sure they can be pulled in as crates?
    • Nicolas: they don't need to be published as crates
    • Nicolas: pulling out can handle licensing issues (like with OPC UA)
  • How much work to move everything out
    • Nicolas: Once the build PR is merged, we should be able to pull out the samples easily. For the DHs, we should be able to build them by copying the workflow
    • Nicolas: We could publish discovery-utils as a crate
      • can't make public crate for anything we use H2 patch for
      • Discovery Handlers are fine because they aren't talking to the K8s Go kubelet manager
  • Review proposals
    • Would having more ad hoc meetings for bigger proposals be helpful?
    • Nicolas: yes, if it is on an ad hoc basis
    • Andrew: How do we know enough people are on the same page to merge?
    • Kate: Maybe large changes require 2 approvals
    • Nicolas: large or breaking (the API). Also we should do 2 approvals for a proposal
    • Kate: Could add that to the PR template as a checklist
    • Nicolas: And put it in our contributing docs
    • Kate: Where are we on arbitrary workload deployment?
    • Nicolas: It is grouped into DRA proposal
    • Should we separate the config changes (discovery and deploy config) from the DRA proposal?
      • but then multiple breaking changes, but only 1 breaking release
      • release before we merge any of these changes and then release once they are all in

Action items

Assignments for Next Meeting

Assignment
Moderator
Notes
Issue Triage

October 3rd

Assignments

Assignment
Moderator Lior
Notes Kate
Issue Triage Nicolas

Attendees (please add name and company)

  • Kate Goldenring (Fermyon)
  • Nicolas Belouin (SUSE)
  • Johnson Shih (Microsoft)
  • Andrew Gracey (SUSE)
  • Yu Jin Kim (Microsoft)

Announcements

  • v0.12.9 release is live!
  • Sandbox project status renewed

Agenda

  • KubeCon EU 2024 CFP

Discussion

  • Yu Jin: Release

    • config level resources, authenticating onvif cameras, secrets and more
    • docs: we still have one PR pending and then we can make a release for it
  • KubeCon EU 2024 CFP

    • Nicolas: Road to v1 - edge related track in main event
      • Get proposals that have been proposed implemented
      • Raising awareness about Akri. What to expect from Akri V1, what is it and what is new in V1
    • Andrew: Developing for non-k8s devices in a k8s model (title to be improved)
      • Basic idea is that updates to an akric will re-run a Job, which can be used as an automated CD system
  • V1

    • threat model update resolved
    • instrumentation
    • performance testing
    • current proposals
    • What do we rename as broker?
      • Descriptive: "device workload"
      • Abstract: "spore"
    • Remove samples from project
    • Move discovery handlers into separate projects
      • what about "akri-full"
      • can we move the discovery handlers so that they are crates that can be pulled in by Akri
      • can we make Akri full just a Pod with multiple containers
        • maybe we can measure the size and performance difference?
        • it is not currently well tested
      • make discovery handlers embeddable via crates
    • Move brokers into separate projects
    • Publish discovery-handler-utils on crates.io
    • Decide on support policy for patching releases
  • Work to consider for v1

    • Agent integration with device registry: https://github.com/project-akri/akri-docs/pull/45/files
    • Do we want to consider breaking the discovery handler interface at some point (1 DH per cluster for network devices or push instead of stream)
      • Discovery handlers could be cron jobs
      • OPC discovery being constant sets up potential vulnerabilities
      • Configure discovery interval
      • Some protocols listen to events, would be good to configure for that too
      • Maybe allow for multiple DH contracts rather than force a constant interaction. Extend the registration behavior so that it reports what kind of DH it will be, or extend the agent to receive updates with a new API (push). In the stream-with-updates example, it is up to the DH to maintain state (could the payload get too large?).
  • Nicolas: Working on splitting instances and configs and abstract resource scheduling

  • Versioning

    • Current system is annoying (extra step) and error prone (for concurrent PRs)
    • -rc requires you know the next number
    • Should discovery handlers have the same release number and process
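
To make the "multiple DH contracts" idea from the discussion above more concrete, here is a minimal Rust sketch. It is not Akri's API: `DiscoveryMode`, `DeviceDelta`, `diff_devices`, and `is_due` are hypothetical names invented for illustration. It assumes a discovery handler declares at registration whether it streams, is polled at a configurable interval, or pushes updates, and that a pushing handler keeps the previous device set so it only sends what changed, which keeps payloads small.

```rust
use std::collections::HashSet;
use std::time::Duration;

/// Hypothetical: the kind of contract a discovery handler reports at registration.
pub enum DiscoveryMode {
    /// Current behavior: a constant stream between agent and handler.
    Stream,
    /// The agent triggers discovery at a configurable interval.
    Poll { interval: Duration },
    /// The handler pushes updates to the agent when devices change.
    Push,
}

impl DiscoveryMode {
    /// Whether the agent should trigger a new discovery pass after `elapsed` time.
    pub fn is_due(&self, elapsed: Duration) -> bool {
        match self {
            DiscoveryMode::Poll { interval } => elapsed >= *interval,
            // Stream and push handlers report on their own; the agent never polls them.
            DiscoveryMode::Stream | DiscoveryMode::Push => false,
        }
    }
}

/// Hypothetical: the changes between two discovery passes.
#[derive(Debug, PartialEq, Eq)]
pub struct DeviceDelta {
    pub added: Vec<String>,
    pub removed: Vec<String>,
}

/// Compares the previous and current sets of device ids so that only the
/// difference needs to be sent, instead of the full device list every time.
pub fn diff_devices(previous: &HashSet<String>, current: &HashSet<String>) -> DeviceDelta {
    let mut added: Vec<String> = current.difference(previous).cloned().collect();
    let mut removed: Vec<String> = previous.difference(current).cloned().collect();
    // Sort for deterministic output; HashSet iteration order is unspecified.
    added.sort();
    removed.sort();
    DeviceDelta { added, removed }
}
```

In this sketch the handler owns the state (`previous`), which matches the note that state maintenance would fall to the DH in an update-based contract.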

Action items

  • Yu Jin: add rendered threat diagram and source to GitHub
  • Kate: look at Akri full being separated into crates that can be added to akri core
  • Create a V1 project
  • Kate: Move current project to new cross org project

Assignments for Next Meeting

Assignment
Moderator
Notes
Issue Triage

September 5th

Assignments

Assignment
Moderator Nicolas Belouin
Notes Yu Jin Kim
Issue Triage

Attendees (please add name and company)

  • Yu Jin Kim - Microsoft
  • Nicolas Belouin - SUSE
  • Johnson Shih - Microsoft
  • Kate Goldenring - Fermyon

Announcements

Agenda

  • Upcoming release bug bash
  • Build system issue (cross outdated images)
  • Review proposals
  • Issue triage

Discussion

  • Bug bash document
    • Async bug bashing, deadline around Sept 22/25
  • Build system issue
    • raised issue in tonic and prost
    • we use cross to build the rust artifacts - uses docker images based on 18.04 Ubuntu
    • were planning release to update 20.04, but delayed
    • can we get rid of cross and build another way?
    • suggested this on slack and potential solution may be OBS or buildx
    • could not build using this for armv7
    • need to update dependencies and related cross images
    • make a decision on what to do for armv7 especially, do we care about this? it is deprecated
    • took 10-15 min to build armv7 with OBS, but not integrated in Github Actions so not as straightforward as buildx (took 2 hours to build)
    • buildx also more straightforward for building locally
    • we don't have recent enough protobuf compiler
  • Proposals
    • integrate external device inventory proposal draft
      • quite old, we should check if it's still being worked on
    • arbitrary broker resource type
      • generic way of describing resources in broker spec to deploy any resource type as a broker
      • Shifu allows exposing devices through a REST API
      • currently we only have placeholder but need more variables to expand, but CL resource may help
      • all this implementation would take place in the controller; how useful do people find the controller today?
      • if we switch to DRA, would controller still be just as useful?
      • Nicolas says controller is useful because there is nowhere else in Kubernetes where we can assign resources like this
      • with DRA, two would eventually be combined since they are a lot of work
      • prereq: our configuration is deployment and discovery config - should we unlink those to make things less complicated?
      • in traditional Kubernetes there is a configuration for each operator, right now we have one for two operators
    • DRA proposal
      • instead of resources linked to node, ResourceClaim gets handled by some kind of controller that is different from the current one
      • there is deallocation with DRA
      • we can handle things like CL support much easier
      • need to create resource kinds: BrokerTemplate, DiscoveryConfiguration, PropertyFilter
      • need a new controller and rename current one
      • change agent behavior - one resource plugin for Akri or per discovery handler
      • proposed timeline is to make API changes under Akri v1
      • do we want to have reverse compatibility with device plugin?
      • once DRA stabilized and supported for all Kubernetes versions, we can deprecate the device plugin model
      • in parallel with this, get involved with SIG and Intel folks that started this KEP and have Akri included as a reference implementation
      • KEP aiming to go to beta in December
      • breaking up the configuration would help this as well and lead us on a path to DRA
    • MQTT discovery handler
    • status field in akri resources
  • Issue triage
    • maybe we should migrate project board so we can see issues org-wide
    • need to update CLA
    • investigating version numbering management
    • dependencies license check - there are two that are using licenses not approved by CNCF so may need to ask for approval of these (opcua crate and mock_instant crate)
      • OPCUA is not part of Akri core, mock_instant can probably be replaced with in-house
    • need to upgrade dependencies
    • out of memory error with microk8s and udev video broker on raspberry pi
    • supporting annotations in discovery handler and configuration - backlog item

Action items

  • create issue to track updating build system
  • create cross repo project board

Assignments for Next Meeting

Assignment
Moderator Yu Jin
Notes Kate
Issue Triage Nicolas

August 1st

Assignments

Assignment
Moderator Yu Jin
Notes
Issue Triage

Attendees (please add name and company)

  • Yu Jin Kim - Microsoft
  • Nicolas Belouin - SUSE
  • Andrew Gracey - SUSE
  • Johnson Shih - Microsoft
  • Lior Lustgarten - Microsoft

Announcements

Agenda

  • Version numbering and PRs process
  • Discuss upcoming release (bug bash, pending PRs, etc.)
  • Updated threat model
  • Review Nicolas's proposal?
  • Issue triage

Discussion

  • Can we find a better way around version numbering when merging PRs?
    • something that indicates whether it should increase major or minor number (maybe a tag?)
    • some kind of automation is required
    • do we want to have a different version for every commit or daily build and get version number each day?
    • conventional commit
    • if it's dev release indicate this and commits from last tag
    • take time stamp when PR is merged and have build assets take timestamps?
    • we should not increase version number in PRs anymore
    • do we want different versioning scheme for dev vs main?
      • probably not; it would be difficult to maintain
    • anything merged under main gets a version and increments
    • for dev version - tell it which version/snap we're using, embed commit number into binary (can we do this in rust?)
    • we could go look at similar projects to see how they deal with version numbering
  • PRs for next release
    • e2e suites - Johnson to take a look and review
    • configuration device plugin
      • with current behavior creating slots, when resources get freed, have a finalizer on the pod?
      • we do have a reconciler to rollback to available resources
      • Nicolas to take a look at the PR
      • major concern is to decide the behavior
    • udev test suite is draft for now, has to wait on e2e PR
    • authenticated ONVIF discovery - discussion on proposal PR
  • issue triage
    • #634: should we define expected key name schema?
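
As an illustration of the version numbering ideas discussed above (conventional commits, no manual version bumps in PRs, and embedding the commit into the binary), here is a minimal Rust sketch. It is an assumption-laden example rather than a decided design: the function names, the `AKRI_GIT_COMMIT` environment variable, and the `-dev.N+hash` format are all hypothetical. It only inspects commit subjects, so a `BREAKING CHANGE:` footer would need extra handling.

```rust
/// Which part of a semantic version a merged change should increment.
/// Variants are ordered so that the largest bump compares greatest.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum Bump {
    Patch,
    Minor,
    Major,
}

/// Classifies one commit subject using Conventional Commits prefixes:
/// "feat!: ..." or "fix(agent)!: ..." is Major, "feat: ..." is Minor,
/// anything else (including subjects without a prefix) is Patch.
pub fn bump_for_subject(subject: &str) -> Bump {
    let header = match subject.split_once(':') {
        Some((header, _description)) => header.trim(),
        None => return Bump::Patch,
    };
    if header.ends_with('!') {
        return Bump::Major;
    }
    // Drop an optional "(scope)" to get the bare commit type.
    let kind = header.split('(').next().unwrap_or(header);
    if kind == "feat" {
        Bump::Minor
    } else {
        Bump::Patch
    }
}

/// Computes the next (major, minor, patch) version from the commits merged
/// since the last release. The largest bump among the commits wins.
pub fn next_version(current: (u64, u64, u64), subjects: &[&str]) -> (u64, u64, u64) {
    let (major, minor, patch) = current;
    match subjects.iter().map(|s| bump_for_subject(s)).max() {
        Some(Bump::Major) => (major + 1, 0, 0),
        Some(Bump::Minor) => (major, minor + 1, 0),
        Some(Bump::Patch) => (major, minor, patch + 1),
        None => current,
    }
}

/// Reads a commit hash baked in at compile time. `AKRI_GIT_COMMIT` is a
/// hypothetical variable that CI would set before running `cargo build`.
pub fn embedded_commit() -> Option<&'static str> {
    option_env!("AKRI_GIT_COMMIT")
}

/// Formats a dev build version from the last release tag, the number of
/// commits since that tag, and the commit hash (if known).
pub fn dev_version(release: &str, commits_since_tag: u32, commit: Option<&str>) -> String {
    match commit {
        Some(hash) if commits_since_tag > 0 => format!("{release}-dev.{commits_since_tag}+{hash}"),
        _ => release.to_string(),
    }
}
```

The `option_env!` macro is part of the Rust standard library and is evaluated at compile time, so this answers the "can we do this in rust?" question without a build script. A binary could print `dev_version("0.12.9", 3, embedded_commit())` in its `--version` output.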

Action items

  • make an issue to track version numbering issue
  • bug bash for release
  • next meeting, review updated threat model and new proposals

Assignments for Next Meeting

Assignment
Moderator
Notes
Issue Triage

June 6th

Assignments

Assignment
Moderator Yu Jin
Notes
Issue Triage Nicolas

Attendees (please add name and company) (5)

  • Yu Jin Kim (Microsoft)
  • Nicolas Belouin (SUSE)
  • Andrew Gracey (SUSE)
  • Johnson Shih (Microsoft)
  • Lior Lustgarten (Microsoft)

Announcements

Agenda

  • CNCF TOC Annual Review?
  • Plans for next release (aim for this month)
    • PRs to get in
    • Bug bash
  • Issue triage

Discussion

  • TOC Annual Review:
    • I think we should talk to Kate and others that were on this project for the previous year to write this annual review and get a PR submitted
    • Also, Nicolas and Johnson are not official CNCF maintainers yet? We should check on this
  • Release aim for June?
    • PR#565: will make a change today with option to decide configuration level resource behavior
      • per instance or per device usage (default is per instance)
      • we can get this into June release
    • PR#567: change is ready but would like to talk about tmpfs approach we need to investigate
      • we can get this into a later release because needs more research
      • can we put this in draft?
    • PR#593: updating dependencies
    • PR#594: merged so #593 can be merged for next release
      • default to squashing and merging
    • PR#607: needs to be versioned before merge
    • PR#612: fixes a bug - currently when we modify configuration and with multiple configurations, it will freeze the other ones so this fixes that
      • let's get this in the next release
    • PR#615: draft PR to reduce crashing of controller in case of error when creating broker
    • Bug bash: async bug bashing - main scenario will be for configuration-level resource support
    • we will get a release train going to version and merge everything
    • we should make sure to have a documentation release
      • configuration-level resource will be the main one
      • fixed an issue where multiple instances are used by a workload by suffixing the instance id to the environment variable; we should update the documentation for this
      • full fledged github release instead of just a tag
    • what are the criteria for us to no longer be in pre-release?
      • we should review threat model and security again
      • there is a core Akri threat model, but there is also a threat model for each protocol
      • made when DH was embedded
      • being clear about the separation of threat models of core (which is already tightly scoped) and DH would have their own models
      • which protocols should we look at? maybe just for core and specify threat model of core against DH and have one for every DH we include (OPC UA, ONVIF and udev)
      • document this process so others can do threat modeling of their own DH
      • issue with testing E2E with the DH we include, we should test with the examples we include
      • make a distinction between production-ready core components vs discovery handlers
        • splitting up the repo and separating DH out (issue#489)
  • Issue triage
    • #603: can be remedied with the configuration-level resource
      • it would only solve the scheduling workload problem
      • need to investigate the other half of the issue
    • #608: we don't have any .NET developers on the project anymore so will be moved to backlog
    • #613: comes back to splitting up the repo/workspace issue
      • we should see how exactly we would want to split it and have discussions about it
    • #614: TOC annual review

Action items

  • Follow up with Kate about TOC Annual Review
  • Follow up with Kate or Edrick about CNCF maintainership
  • Start release train and bug bash!
  • Discussion for next meeting: KPIs for next KubeCon, 5-6 issues we want to work on by the next KubeCon

Assignments for Next Meeting

Assignment
Moderator
Notes
Issue Triage

May 2nd

Assignments

Assignment
Moderator Kate Goldenring
Notes Yu Jin Kim
Issue Triage

Attendees (please add name and company) (5)

  • Yu Jin Kim - Microsoft
  • Nicolas Belouin - SUSE
  • Johnson Shih - Microsoft
  • Harrison Tin - Microsoft
  • Kate Goldenring - Fermyon

Announcements

Agenda

  • Release retrospective
  • Discuss feature prioritization for next release
  • Discuss Configuration level resources
  • Discuss secret management proposal and PR

Discussion

  • KubeCon

    • Akri talk (Kubernetes on Edge Day) recording
    • 114 registered attendees, great turnout
    • Akri booth - many folks that stopped by were those already in the IoT/Edge space
    • traffic in Project Pavilion was great, but maybe need a demo to bring more people to Akri booth
    • would be worth pursuing a booth for KubeCon NA 2023
  • Release retrospectives

    • went smoothly, saving breaking changes for next release
    • got in a lot of bug fixes into this release
    • we should add more E2E tests (don't need to run on every PR, but we could kick off manually) to test things like the udev, OPC UA or ONVIF discovery handlers
    • we should try to merge PRs more often even if there is no release upcoming
  • Features for next release

    • ONVIF updates to use secrets to discover cameras with authentication
    • Configuration level resources
    • Secret management
    • Add node selectors to discovery handler helm charts
    • Bug: udev that Nicolas is investigating
  • How should we go about the proposals process?

    • Nicolas: proposals have been difficult in the past for me, having a separate PR for it is tricky. Include proposals in the triage process.
    • Kate: could we create statuses for proposals: WIP, ready for review, addressing feedback, decided
    • Johnson: we could also move completed proposals to an archive or adding a "status" field as implemented. Final state is implemented or rejected
    • Kate: we should create a proposal about writing proposals (issues)
  • Configuration-level resources

    • Johnson: need to have discussion around implementation - pick up some changes from Kate's branch
    • PR will be ready to review/commit in around two weeks
    • need to address cleanup when configuration is deleted - currently relies on discovery handler to be disconnected - there should be a more elegant way to clean up channels/sockets, a dedicated location for cleanup (currently missing in PR)
    • channel from DH to agent will be closed which causes resources to be released
    • Kate: make a separate PR for changes in discovery operator - would be great to keep scope of each PR smaller since this is a big PR
  • Secrets management PR

    • need to review proposal before we review PR, PR is an implementation of the proposal
  • Code owners / maintainer

    • if you would like to become a code owner, please add a PR to the CODEOWNERS page and we can discuss
    • Nicolas: full-time on Akri, would like to become maintainer
    • beyond being added to CODEOWNER page, need to be added to CNCF maintainers list to be given access to maintainer privileges
    • Kate will reach out to previous maintainers and see if they would like to be moved to emeritus
    • Kate: biyearly or so, let's check in and see if we want to move people to emeritus, that way it's not a heavy commitment to become a maintainer
  • Issue triage

    • need to move examples out to a separate repository to keep things organized
    • #591:
      • Nicolas: will try to see if e2e tests can be run locally before adding to github actions
      • Kate: maybe we can have a script that will initialize this locally for you
      • keep in backlog for now, good first issue
    • #589:
      • Kate: had issues on interop between gRPC go and gRPC tonic
      • Nicolas: made a PR on upgrading tonic and prost (PR#593)
      • Kate: need to update containers, maybe update example for ONVIF
    • #592:
      • can be closed as resolved
      • Kate: maybe we need documentation on details of helm charts
      • there is documentation on this in "Customizing an Akri Installation"
    • #597:
      • Johnson: filtering uses context string - passing for the filtering is incorrect
      • Kate: seems like a quick fix, assigned to Johnson
    • #582:
      • in progress based on linked PR
    • #587:
      • Nicolas has made PR on this

Action items

  • we should create a proposal about writing proposals (issues) - Kate (at least create issue)
  • reach out to previous maintainers about emeritus status - Kate
  • hackmd or governance about being added to CNCF as a maintainer - Kate
  • need to cut release for docs - Harrison

Assignments for Next Meeting

Assignment
Moderator Yu Jin
Notes
Issue Triage Nicolas

April 4th

Assignments

Assignment
Moderator Yu Jin Kim
Notes Harrison Tin
Issue Triage Joseph Knierman

Attendees (please add name and company) (6)

  • Yu Jin Kim - Microsoft
  • Kate Goldenring - Fermyon
  • Harrison Tin - Microsoft
  • Nicolas Belouin - SUSE
  • Joseph Knierman - Microsoft
  • Johnson Shih - Microsoft

Announcements

  • Akri booth at KubeCon EU 2023 (Amsterdam)!
  • Talk at Kubernetes on Edge Day at KubeCon EU
  • Past Wasm I/O talk about Akri on AKS Edge Essentials

Agenda

  • Plan for next release
    • Review pending PRs?
    • Target date for release?
  • Go over bug bash document and discuss scenarios to add
  • Adding new code owners/reviewers?

Discussion

  • KubeCon
    • Booth has a small HDMI cord
    • Show a demo of secret management
  • Release
    • Aiming next Thursday/Friday
    • PR
      • #564
        • Ready to merge
        • nice to have in next release.
        • Testing: keyboard touchpad seen by udev as 5 or 6 dev nodes
      • #561 and #560
        • Github Actions not kicked off
        • need more investigation.
        • Hope to get into the release
        • Joseph can look into the github action issue
      • #573 and #574
        • Preparation for release
      • #570
        • Ready to merge
        • wait for others to merge first to avoid version conflicts
      • #568
        • couldn't reproduce issue
        • no harm merging it.
        • Ping the person in the issue thread about the fix
      • #565
        • Configuration level change
        • ready to merge after testing
        • Kate can also review.
        • Make sure other people are on board with it, might also want to doc this
      • #562
        • Good to go, merged during meeting.
      • #556
        • Already approved, version patch and good to go
      • #554
        • Check in progress
    • Merging PR
      • Have someone be the conductor and maintain the merging flow
      • Yu Jin can do the merge action and tell other PRs to update version
    • Bug Bash
      • Documentation walkthrough would be helpful. Point to github issue at the end
      • option 1: zoom meeting on Friday/next Monday
      • option 2: put doc in slack, work asynchronously (taking this option)
    • Adding new Code Owners
      • Distribute the workload to review PRs
      • Go through criteria
      • Check with the current code owners to see if they still want to be maintainers or would prefer to be moved to emeritus maintainer status
    • Issue Triage
      • #572
        • Release date - sometime this month
        • in progress
      • #571
        • Unmaintained dependency
        • in akri-shared
        • investigating
        • warp removed buf_redux, so see if next release still has it
      • #569
        • use kube-rs webhook instead of ours
        • linking #375, might be able to tackle together

Action items

Assignments for Next Meeting

Assignment
Moderator
Notes
Issue Triage

March 7th

Assignments

Assignment
Moderator Yu Jin Kim
Notes Joseph Knierman
Issue Triage Harrison Tin

Attendees (please add name and company) (6)

  • Yu Jin Kim - Microsoft
  • Kate Goldenring - Fermyon
  • Harrison Tin - Microsoft
  • Joseph Knierman - Microsoft
  • Nicolas Belouin - SUSE

Announcements

Agenda

  • Aim for a new release soon?
  • Issue triage

Discussion

  • Release
    • Yu Jin: Want to upstream some changes by the end of month. Maybe aim for release by April.
    • Yu Jin: Any issues that we want to prioritize? OKRs for KubeCon + other features we have been working on
      • #490: Nicolas has made a PR to resolve this, maybe want to align with Johnson on his proposal for #492
      • #491: will check with Adithya to see if there's progress on this
      • #492: Johnson has proposals on accomplishing this
      • Authentication work: secrets management
      • PR #565: configuration level device plugin
    • Kate: Michael from Slack may be running into this issue: https://github.com/project-akri/akri/issues/145
      • Andrew: or might just be an issue with the brokers being too locked down
      • Kate: We should add this to the bug bash, validating what happens when brokers are less permissive
    • Yu Jin: plan for having a bugbash at the start of April or end of March
    • Kate: aim to start the bug bash during the next Akri sync. Also request additional testing scenarios for this bug bash.
  • Nicolas: talking through the PR for having multiple devices of the same tree
    • Kate: this relates to how we add device-specific information to Akri, this might relate to Johnson's PR https://github.com/project-akri/akri/pull/561
    • Nicolas: might not be related since that involves having a separate Akri instance in the same pod
    • relates since it is using environment variables to track device information
    • Nicolas preferred a way of having multiple rules to get devices in the same group
  • Nicolas: possible to have the meeting earlier to accommodate folks in European time zones?
    • Move meeting to be 1 hour earlier
    • Kate: will setup an action item to move meeting/get thoughts on this
  • Kate: Next community meeting have a demo ready, could be a demo of Nicolas' PR that was mentioned in this meeting
    • Yu Jin: demo for the authentication work
  • Issue Triage
    • Issue 181
      • place on the backlog for now
    • Issue 566
      • This is a documentation issue.
      • Rename the discovery handler template from "custom" to a more specific name
      • In Bug Bash check this documentation to ensure it is still valid
    • Issue 563
      • Harrison: Ask to configure the delay that the agent will call the discovery handler
      • Kate: needs some additional information for this bug, maybe making this configurable will solve the issue
    • Issue 558
      • Harrison: this could be expected behavior and not be a bug
      • The current way works since we do not have to clean up the handler.
      • Kate: If the agent restarts we end up deleting devices that already existed
      • Harrison: leave this issue as investigating
    • Issue 557
      • might be related to when the agent cannot reach the control plane
      • Moved to investigating
    • Issue 551
      • A fix for this is being worked on by Adithya, bug will be assigned to them for now
    • Issue 550
      • Kate: could be an issue with how to find/match properties for a device
      • Nicolas: This might be the same physical device causing the issue
      • Kate: Nicolas' fix would solve this since it groups the devices together
    • Issue 555:
      • Assigned to Harrison
      • Default to the IP address provided rather than the URL if anything else is passed.

Action items

  • Setup plan for the next bugbash
  • for bug bash check that lowered permissions for brokers will still have the correct runtime
  • Update slack meeting to be 1 hour earlier

Assignments for Next Meeting

Assignment
Moderator
Notes
Issue Triage

February 7th

Assignments

Assignment
Moderator Yu Jin Kim
Notes Adithya J
Issue Triage Harrison Tin

Attendees (please add name and company)

  • Yu Jin Kim - Microsoft
  • Adithya Jayachandran - Microsoft
  • Harrison Tin - Microsoft
  • Andrew Gracey - SUSE
  • Joseph Knierman - Microsoft
  • Shan Desai - Emerson Automation Solutions
  • Kate Goldenring - Fermyon

Announcements

Agenda

  • Discussion about breaking up the repo into Akri Core, Discovery Handlers, and samples
  • Update on Website going down over the holidays
  • Adithya and Yu Jin to share proposal for secrets management (PR)
  • Akri CFP for Kubernetes on the Edge Day

Discussion

  • Akri CFP for Kubernetes on the Edge Day

    • Submitted!
  • How should we go about moving everything separately?

    • Should we move non-prod scenarios?
    • One repo for all samples?
      • Easier to go and see all samples
    • Discovery Handlers should be each in its own repo. Might be specific to a certain OS and architecture.
      • Most scenarios only need onvif but not OPCUA so we wouldn't need to add all that extra code there
      • Need to update GitHub Actions flow for the containers to pull from all repos
      • Ideally do this in a day (Friday/hackday)
    • existing issue: https://github.com/project-akri/akri/issues/489
  • Update on Website going down over the holidays

  • Adithya and Yu Jin to share proposal for secrets management

    • Andrew: this method provides a lot of flexibility for Discovery Handler implementers
    • Kate: Take a look at drogue and dapr
    • Kate: What's the motivation to add extra non secret parameters?
      • Adi: Was used in some protocols for extra details
      • Kate: If this isn't needed in discovery we should remove these to avoid confusion between discoveryDetails and discoveryProperties
    • Kate: How does this compare to the previous device registry implementation?
      • This passes all information to the discovery handlers themselves versus the device registry implementation has the discovery handler look up for any information it needs
      • Andrew: That would be more of a production scenario but for a demo this seems to be a better demo story
  • Kate: GitHub issues that seem to be prod blocking need to be resolved

    • Andrew: SUSE hired a new hire for full-time Akri work.
    • Adi and Kate pair programmed last week. Maybe we can do this to onboard new people.
  • Discussion on previous items:

    • Issue 492: Harrison working with Kate on this. Uniquely naming the device specific hash is the issue.
    • Issue 526: OPCUA demo changed to use the PLC Server
      • Shan: Emerson Automation Solutions has physical systems we can use to demo this as an end-to-end Akri project
    • Issue 521: udev now uses syspath to open up more scenarios
    • Issue 491: Adi to investigate why dockerslim is so small and if we can reduce to that, need to prioritize this
      • Will work on it today and reach out to Kate & Andrew
    • Issue 490: udev has more specific properties
      • Kate: More reason to push out the discovery handlers into their own repos
      • Andrew: WASM could be the right way to go about it
      • Kate: WASM for brokers instead of discovery handlers
    • Issue 379: checked in
  • Ran out of time for Issue Triage

Action items

  • Discuss how we can separate the repos
  • Finish up existing items

Assignments for Next Meeting

Assignment
Moderator
Notes
Issue Triage

January 3rd

Assignments

Assignment
Moderator Kate Goldenring
Notes Yu Jin Kim
Issue Triage Adithya J

Attendees (please add name and company)

  • Yu Jin Kim - Microsoft
  • Kate Goldenring - Fermyon
  • Adithya Jayachandran - Microsoft
  • Harrison Tin - Microsoft

Announcements

  • Happy new year!
  • SUSE has full time engineering position open to work on Akri - Andrew Gracey

Agenda

  • Open Discussion
  • Issue triage

Discussion

  • What are the desires for 2023 for Akri
    • [Andrew] seems like all customer requests are captured in tickets
  • How do we keep this community first?
    • [Kate] Proposals before code
  • Should we open GitHub Discussion
  • Status update on GitHub website
    • We had CNCF take over the Akri domain
    • Nameservers from CNCF were not fully set up
      • They use netlify to host and manage their sites, Akri does not have one yet
    • Immediate short term fix: have the CNCF point to another nameserver Adithya temporarily made
  • Credentials for ONVIF cameras
    • We would need hardware-based unique identifiers
    • Need some sort of device registry that DH can reach out to
    • [Kate] Investigation on MAC addresses - normally don't need authentication to get MAC addresses
    • Should we reach out to someone from the ONVIF foundation through Slack and the community?
    • [Adithya] We are working on getting a demo with Kubernetes CSI secrets store driver
    • [Adithya] devices in configuration
    • [Andrew] Concerned about creds in config. Want to use the same config across factory locations. Could use config maps ([Kate] or separate CRD)
    • [Andrew] add a step before deploying broker that gets credentials for the broker for the device
    • [Kate] do we need creds for discovery? Can be useful for getting extra device information.
    • [Adithya] Dapr has a lot to offer, and we would be using just a small subset of it.
    • [Kate] We should look at Drogue Cloud
    • [Andrew] Use side car for auth. You can pop in your sidecar instead of our sidecar but the same function interface regardless. gRPC interface like Discovery Handler

Action items

  • Kate: open GitHub action discussions
  • Yu Jin and Adithya: share architecture/plan for credentials demo next meeting
  • Adithya: update on docs.akri.sh

Assignments for Next Meeting

Assignment
Moderator
Notes
Issue Triage

December 6th

Assignments

Assignment
Moderator Yu Jin
Notes Adithya
Issue Triage Harrison

Attendees (please add name and company)

  • Kate Goldenring - Fermyon
  • Adithya Jayachandran - Microsoft
  • Joseph Knierman - Microsoft
  • Harrison Tin - Microsoft
  • Yu Jin Kim - Microsoft

Announcements

  • Akri KubeCon CFP submitted 🎉
    • Discussing managing secrets within Akri
    • Take a look at Drogue, accomplishes very similar goals
      • Potentially see if we can work with them
    • Make sure we don't pass around data and enable each path to pull securely.
  • Issue triage

Discussion

Action items

  • Kate: create issue about "What to do about udev DH breaking change minor or patch version bump"

Assignments for Next Meeting

Assignment
Moderator
Notes
Issue Triage

November 1st

Assignments

Assignment
Moderator Adi
Notes Kate
Issue Triage Joseph

Announcements

Discussion

Assignments for Next Meeting

Assignment
Moderator Yu Jin
Notes Adithya
Issue Triage Harrison

October 4th

Assignments

Assignment
Moderator Kate
Notes Joseph
Issue Triage Adi

Announcements

Agenda

Discussion

Release

  • Trying for a release by end of week (10/7/22)
  • Before the release is cut we should be checking in the current PRs for helm, opcua security fix, and the rust toolchain

External device inventory demo - Leo

  • Akri agent polls external device inventory for device info and creds
  • These creds are helpful for ONVIF in particular
  • Agent queries a specific device with the discovery details as the payload
  • Demoing using happytime-onvif-server; this would be a cool e2e demo! Also using the node-onvif package
  • OT environments may have OPC UA certificate publishers, so could have them publish new certificate to device inventory and automatically update the instance
  • Bug: discovering 4 "instances" for one camera
    • Suggestion: use UUID and/or XAddrs for camera id
    • TODO: make issue
  • Next Steps
    • Secure the plain text passwords being passed around (keyvault maybe)
    • Continue work in the PRs to further develop the design
  • One customer with a production scenario already lined up for this feature
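
The duplicate-instance bug above (four "instances" for one camera) and the suggested fix can be sketched as follows. This is a minimal illustration, assuming each discovery reply exposes a `uuid` and a list of `xaddrs`; the `DiscoveryResponse` type, its field names, and the fallback rule are hypothetical and are not Akri's actual data structures.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DiscoveryResponse:
    # Hypothetical shape of one discovery reply; not Akri's real type.
    uuid: str      # endpoint identifier reported by the camera, may be empty
    xaddrs: tuple  # service URLs reported by the camera


def camera_id(resp):
    """Prefer the UUID; fall back to the sorted XAddrs when it is missing."""
    if resp.uuid:
        return resp.uuid.lower()
    return "|".join(sorted(resp.xaddrs))


def dedupe(responses):
    """Collapse replies that share an id, merging the addresses they reported."""
    cameras = {}
    for resp in responses:
        cameras.setdefault(camera_id(resp), set()).update(resp.xaddrs)
    return {cid: sorted(addrs) for cid, addrs in cameras.items()}
```

Keying on the UUID first means a camera that answers once per network interface still maps to a single id, while the XAddrs fallback covers replies that carry no identifier.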

Issue Triage

  • Skipped this meeting

Assignments for Next Meeting

Assignment
Moderator
Notes
Issue Triage

September 8th

Assignments

Assignment
Moderator Kate
Notes Adi
Issue Triage Adi

Announcements

Agenda

  • Slimming down Akri's test matrix to remove EOL versions per this comment
    • TODO: Adi remove before 1.21 and add all to latest
  • Akri's next release - Adi
    • We have mostly quality of life updates so we will not add a minor revision bump, keep the patch bumps as is.
    • #478 and #487 are the big items that we should test.
    • @michaelzhang114 mentioned we can do a small bugbash.
      • TODO: Adi on test matrix Update so we can unblock 3 pending PRs
  • General Discussion
    • Still need to move samples out of Akri core (we are getting flagged by dependabot on this)
      • TODO: Kate make an examples/sample repo
    • Slimming down container size efforts, we are looking to change the base container image for an immediate fix.
    • #492 should be a focus for the next release
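
The test matrix slimming mentioned above can be illustrated with a small sketch. It assumes "remove before 1.21" means dropping Kubernetes versions older than 1.21; the version list and the `slim_matrix` helper are hypothetical and do not reflect Akri's actual workflow files.

```python
def parse(version):
    """Turn '1.21' into (1, 21) so versions compare numerically, not as text."""
    return tuple(int(part) for part in version.split("."))


def slim_matrix(versions, minimum="1.21"):
    """Keep only the versions at or above the minimum supported one."""
    floor = parse(minimum)
    return [v for v in versions if parse(v) >= floor]


# Hypothetical matrix; '1.9' shows why plain string comparison would be wrong.
print(slim_matrix(["1.9", "1.16", "1.19", "1.21", "1.24"]))
```

Comparing parsed tuples rather than strings matters here because `"1.9" >= "1.21"` is true as text even though 1.9 is the older release.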

Discussion

Assignments for Next Meeting

Assignment
Moderator
Notes
Issue Triage

August 2nd

Assignments

Assignment
Moderator Michael
Notes Joseph
Issue Triage Brian

Announcements

Agenda

  • Discuss the next Akri release
  • Issue triage

Discussion

  • Akri Release
    • No specific criteria for cutting new release
    • Prepare a 0.x minor release for this month
    • Setup a bug bash for new release
    • Add a document specifically outlining production readiness for each release. It would likely be based off current Akri roadmap.
  • General Topics
    • Add a specific document outlining the responsibilities of each role inside the Akri repo. Helps to acknowledge work that has already been done.
  • Future kubecons
    • Determine if we want to have a booth present for North American Kubecons
    • Determine if we would want to present something new during an ancillary talk/demo
  • Issue Triage

Assignments for Next Meeting

Assignment
Moderator
Notes
Issue Triage

July 5th

Assignments

Assignment
Moderator Edrick
Notes Michael
Issue Triage Brian

Announcements

Agenda

Discussion

  • Udev Discovery Handler enhancement: discovering multiple devnodes for one device
    • Resolved this issue on GitHub.
    • Right now we define Discovery Handler as a single dev node
    • Next step could be a proposal for this
    • Udev specific. The only Udev filter is Udev rules. Want to say e.g. video0 AND video1
    • TODO: add an issue and capture the conversation from Slack (modify Udev DH for multiple nodes)
  • Moving Discovery Handlers and samples out of Akri repo
    • OPCUA broker is at a different version from OPCUA client, which has a security vulnerability
    • Our sample has a security alert, not Akri
    • TODO: create an issue to move the samples folder into an examples repo
  • Long term tracking engagement
    • Brian found a stars-over-time tool
    • Unsure how to do repo views (Insights -> Traffic only shows one month)
    • Need another metric on when our workflows ran
  • Security / dependabot alerts
    • High should be resolved in Akri core immediately
  • Issue triage

Assignments for Next Meeting

Assignment
Moderator
Notes
Issue Triage

June 7th

Assignments

Assignment
Moderator Kate
Notes Roaa
Issue Triage Brian

Announcements

Agenda

  • KubeCon Recap - Edrick
  • SUSE's strategy around Akri - Andrew Gracey
  • Security and compliance with Akri discussion
  • Add an agenda item by commenting here
  • Issue triage
  • Open Discussion

Discussion

KubeCon Photo Scroll

SUSE strategy around Akri

  • Want to integrate Akri into their edge K8s offering which includes
    • Zero touch onboarding and zero provisioning of nodes: install os and then just turn it on and cluster is
    • high confidence OTA upgrades - OS workloads and everything on up
  • Akri is a part of their vision for the future: everything should be declarative: nodes can group themselves into a cluster, pull in connected devices
  • Demo at SUSECON: https://github.com/SUSE-Rancher-Community/edge-demo-keynote22
    • Using fleet for initial bootstrapping of akri
    • Doing a GRUB change update with an OCI image

Brainstorm - vendor Discovery Handlers, security, and IoT scenarios

  • Partner with device builders for discovery handlers
  • Leverage existing experts for security and vendor agnostic solution: industry fusion foundation
  • Security: cert manager style workflow
  • IoT Scenarios
    • retail
      • Connected systems like printers - serial bus: Serial devices are old and too specific to have one general discovery handler
      • Automotive: SOAFEE (started by Arm, AWS is involved) - want to use container orchestration in automotive - a lot of info through CAN bus
      • Industry/oil: I2C. SUSE is potentially creating an I2C robot booth demo with Akri for an embedded conference

Assignments for Next Meeting

Assignment
Moderator
Notes
Issue Triage

May 3rd

Assignments

Assignment
Moderator Edrick
Notes Roaa
Issue Triage Kate

Announcements

Agenda

  • Discuss enabling Akri to discover authenticated devices which requires passing credentials to Discovery Handlers. This enables discovering IP cameras that require authentication and ensures that Akri is only creating Kubernetes Resources for trusted devices. - Kate
  • Bluetooth Discovery Handler investigation - Jake Gallow
  • Decluttering Akri's main repository moving out samples and tools (OPC UA cert generator)
  • Add an agenda item by commenting here
  • Issue triage
  • Open Discussion

Discussion

  • Announcements: See you at kubecon!

  • Discussions around discovery handler authentication

    • Currently broker is responsible for authenticating with device before using it. No way to report back that device is not trusted.
    • Need a way to pass credentials to discovery handlers. Is this configured via the Akri Configuration?
    • Add more details on how the creds can reach discovery handler
    • Document the proposal in akri docs while in parallel starting on a poc.
  • Bluetooth discovery handler investigation

    • Overview of bluetooth by Jake
    • Discussion around publicly accessible info (like MAC address) vs info that needs pairing - More drilling required to fully understand that.
    • Next steps, document the proposal in akri and discuss more next meetings.
  • Decluttering Akri main repo:

    • OPCUA cert generator as an example
    • Other things can be moved to a separate repo: discovery handler, broker sample, sample apps.
  • Triage

Assignments for Next Meeting

Assignment
Moderator
Notes
Issue Triage

April 5th

Assignments

Assignment
Moderator Roaa
Notes Vince
Issue Triage Kate

Announcements

  • KubeCon booth and virtual project office hours (announced March 9th)

Agenda

  • Issue triage
  • Open discussion
  • Add an agenda item by commenting here

Discussion

Issues to prioritize for next release:

Note:
Today we will focus on triaging issues

KubeCon announcement

KubeCon EU is in May. We get to have our own booth and virtual office hour (announced March 9th). We will have 30 minute sessions and anyone can hop in.

Triage issues

We triaged these issues:

We also looked at stale issues:

We also looked at a few closed issues:

We also looked at open issues:

Roadmap

We also discussed the roadmap docs

  • New broker deployment strategies should be updated now that we have job broker.
  • Updated "Protocols we would love to be contributed"
  • Sort the list of protocols by priority
    • Zeroconf
    • Bluetooth

Open discussion

  • We had a big feature in the last release: jobs.
  • We should think about big features for the next release and prioritize them.
    • Modify agent to reduce frequency of Pods getting unexpected admission error: https://github.com/project-akri/akri/issues/450
    • When a camera gets updated or scoped to a different name, we don't update it because we are using IP/MAC. We should check the identity property.
    • Create new set of builds with different k8s versions for back-compat
    • Zeroconf (MDNS based devices with discovery on top of it)

Assignments for Next Meeting

Assignment
Moderator
Notes
Issue Triage

March 1st

Assignments

Assignment
Moderator Roaa
Notes Edrick
Issue Triage Brian

Announcements

Agenda

  • Release features walk through - Kate
  • Bug Bash recap - Kate
  • Open discussion
  • Add an agenda item by commenting here

Discussion

CNCF Webinar

  • First CNCF live webinar, one of the benefits of being a sandbox project
    • right after this developer call
  • Latest release has Kubernetes Job support, opens up more scenarios
  • All recorded [Link will be added here]

Akri New Release

  • Breaking change
    • You can now deploy either Kubernetes Jobs or Pods
  • crictl is embedded in the agent container
    • device plugin interface doesn't support deallocate
    • ensure that every pod that is running is still running
    • Roaa is tracking the deallocate problem with SIG Node

Validating on Latest Kubernetes

  • GitHub runners need to be updated with containerd
  • Is expected to work on the latest Kubernetes but it hasn't been validated on vanilla Kubernetes

Community Bug Bash

  • Great way to get started with Akri
  • The scenario is still whole
  • Was testing the three new features
  • We got some issues and Kubernetes support
  • Side note, Katacoda (https://www.katacoda.com/) is great
  • Other communities do this too
    • sometimes they take a long time
    • ours were nice and concise

Jonathan from Goliath

Summary from Jonathan
Shared my exploration for using Akri for IoT device resource sharing for test/test automation. Here's some of the links I shared:

The two devices I held up:

Additional notes

  • exploring device testing; in the world of IoT, there is no common CI/CD
  • how to share devices, especially in remote working
  • customers want to test a device
  • test eng writes scripts, waterfall testing
  • most of them have a USB interface that has the same device ID
    • managing this is difficult
    • LAVA from Linaro is one tool
    • labgrid DSL
    • mostly custom scripts in Python/Perl
  • anything in the cloud native world
  • Akri maps the model
  • site manager is a NUC (i3), dedicated container that is running the shell scripts/toolchain/CI scripts
  • solves a lot of classic problems
  • provide fundamentals for automated architecture
  • Two devices
    • both of different update stack
    • one has debug port and one doesn't
    • have to write it in their DSL or bash script
  • Currently there's less focus on protocols
  • Assumption is that none of these devices have software on them
    • when you're testing, there's nothing running on it, need a bootstrapping process to get a stack on it
  • How do we share these devices as resources
  • If you can't identify based on udev, you basically identify based on plug-in order
    • some devices have a unique USB ID
    • or a JTAG/debugger in between
  • FIT IOT-Lab
    • need to run lots of test on lots of devices
  • Shared devices
    • container associated with a particular device
    • webapp, I would like this device in X location
    • get the container that has all the tools associated with it
  • Bootstrapping scenario
    • multiple ways to flash firmware (>3)
    • is there a concept of bootstrapping process?
      • if there is a well defined process?
  • USB isn't only protocol
    • large scale deployments, USB becomes a bottleneck
    • ethernet has bigger bandwidth

Assignments for Next Meeting

Assignment
Moderator Kate
Notes Brian
Issue Triage Roaa

February 1st

Assignments

Assignment
Moderator Edrick
Notes Roaa
Issue Triage Brian

Announcements

  • Support for deploying Kubernetes Jobs as brokers in the development Helm chart!

Agenda

  • Demo of deploying Jobs to discovered devices with Akri
  • Discuss creating a community Jobs Bug Bash
  • Suggest agenda items by leaving a comment here!

Discussion

  • Announcements: Developer helm chart for brokers as jobs merged and ready for usage.
  • Demo:
    • Current documentation explains how to use main chart or a dev helm chart, set discovery handler to deploy and specify what configuration to create. Following the same steps, now akri also has jobs as an option for a broker.
    • Akri already supports brokers as non-terminating pods, such as a pod continuously fetching the temperature. Now Akri has added support for specifying a Job (terminating pod) for cases where the user wants to run a job once (like an upgrade scenario or maintenance job)
    • Demo using mock ip camera and doing an upgrade.
  • Bug bash discussion:
    • Brainstorm scenarios, suggestion to open discussion on slack to get more scenarios including discovery protocols in progress.
    • Suggestion of using a K8s playground (e.g. Katacoda) for bug bash for a quick start.
    • Documentation of using jobs is in PR, not merged yet since helm chart changes are only in the dev helm chart. We can still point to it for bug bash.
    • @Kate to share more details on the slack channel.
  • Next release:
    • Target mid Feb, pending the bug bash and merging the jobs as brokers PR.

Assignments for Next Meeting

Assignment
Moderator
Notes
Issue Triage

January 4th

Assignments

Assignment
Moderator Kate
Notes Roaa
Issue Triage Brian

Announcements

  • PR up for supporting deploying Jobs to discovered devices

Agenda

  • Birds of a feather discussion
  • Suggest agenda items by leaving a comment here!

Discussion

  • Happy new year! 🎉
  • PR with using terminating pods as brokers is active
    • It is a breaking change, but extends functionality and allows using jobs features. More details in the PR.
    • Right now only supports one job per device, may change in the future.
  • Looking at demos on Akri's website. Currently there's a demo for a couple of supported discovery handler protocols, we are missing one for onvif. Contributions and demo ideas are welcomed!
  • Issues triage

Assignments for Next Meeting

Assignment
Moderator
Notes
Issue Triage

December 7th

Assignments

Assignment
Moderator Kate
Notes Brian
Issue Triage Roaa

Announcements

Agenda

  1. Instead of PR title [FLAGS], can now add same version and build dependency containers labels to PRs via commenting to (not) execute certain workloads [PR] - Vincent
  2. Review proposal for supporting the deployment of Jobs as brokers in Akri - Kate Goldenring
  3. Review proposal for supporting multiple deployment strategies in the Controller - Kate Goldenring
  4. Comment here to add an agenda item

Discussion

  • Announcement regarding Kubecon and a call for ideas about Akri presentations. Have discussed the possibility of an IoT panel. Searching for questions, participants, etc.

  • PR Workflow improvement! Previously, text-based flags in PR/commit title were used to control action/workflow (SAME_VERSION, etc). Intent is to use github-native concept (labels) to control this. Labels can be added using comments in PRs.

  • Deployment expansion!! How do we expand Akri's scenarios beyond the microservice or "daemon set" style (essentially using nodes to gain high-availability) deployment?

    • Proposal one: pods that finish/terminate without being restarted
      • https://github.com/project-akri/akri-docs/pull/17
      • Leverage Kubernetes jobs concept
      • Requires changes to configuration CRD: additional property jobspec
      • Does broker become confusing? Is it still an applicable idea for Akri? Is it too restrictive?
      • Demo!!
    • Proposal two: deploying outside the daemon-set
      • https://github.com/project-akri/akri-docs/pull/18
      • Possible scenarios:
        • achieve high availability with pods rather than nodes
        • brokers that require more than one configuration
        • deploying to a configuration rather than an instance
      • Breaking changes:
        • Configuration-level resource requesting
        • Breaking Configuration into 2 CRDs (discovery and usage)
      • Potential issue:
        • Pending pod creation

Assignments for Next Meeting

Assignment
Moderator Brian
Notes Edrick Wong
Issue Triage Kate

November 2nd

Assignments

Assignment
Moderator @Roaa
Notes Edrick Wong
Issue Triage @bfjelds

Announcements

  1. New project-akri organization
  2. Akri's 1 year anniversary was October 20, 2021. Celebratory blog
  3. Add other agenda items by leaving a comment here!

Agenda

  1. Discuss Akri's Governance Doc. Feedback on criteria for moving up the contribution ladder? - Edrick Wong
  2. Birds of a feather discussion.

Discussion

  • KubeCon

    • Kate and Rods presented Akri on Krustlet [link added later]
    • Very exciting, WASM is secure by default and size reduction is significant
    • WASM still new, some features are missing, such as sockets
  • Akri organization

    • Akri is CNCF Sandbox now
    • Everything is open source (including the landscape), so everything is a PR
    • We had to move to an independent org as part of the move
    • Paid for and supported by CNCF
    • The main repos are all moved there
    • We have access to Snyk now, which does container scanning for vulnerabilities
      • will put in a workflow to do vulnerability and license scanning
    • We have a DCO, which checks every commit is signed
      • if you are running into this issue, click on the error and there are one-liners to fix it
      • our update bot needs to be updated
    • Have access to service desk and community page in the CNCF
    • Can have maintainer sessions at future KubeCons
  • Happy Birthday Akri!

    • link above
  • Governance Doc

    • What governance means in general: what legal entity owns the project and how the project is governed and run
      • Publicly defined roles, how to move from role to role, and what to do when you're retiring from the project
    • Inspired by opengovernance.dev, and Porter
    • Our code of conduct also has been updated

Assignments for Next Meeting

Assignment
Moderator Kate Goldenring
Notes @bfjelds
Issue Triage @Roaa

October 5th (recording)

Assignments

Assignment
Moderator Edrick Wong
Notes @Roaa
Issue Triage Kate Goldenring

Announcements

  1. Akri was accepted into the CNCF as a Sandbox Project!!
  2. KubeCon NA WASM Day Talk on October 12 1:45pm - 2:15pm PT called Living on the Edge: Using IoT Devices on Kubernetes WebAssembly Applications
  3. New Akri release v0.6.19
  4. Will be moving to new Akri organization: https://github.com/project-akri

Agenda

  1. Changes that come with Akri as a CNCF Sandbox Project - Edrick Wong
  2. Grace Hopper Celebration Open Source Day outcomes and learnings - @Roaa
  3. New Akri release - Kate Goldenring
  4. Akri MQTT POC demo and Web UI discussion (context) - @vinay
  5. Add other agenda items by leaving a comment here!

Discussion

  • Akri is now a sandbox project in CNCF. What this means for Akri is that there may be upcoming changes in docs, Zoom links, etc. Those will be communicated when they happen. Expect minor changes.
  • Grace hopper open source day recap. Contribution from OSD in PR.
  • Akri release:
    • Onvif improvements
    • Tokio version update
    • Akri code is moving to a different organization as part of the CNCF migration.
    • Open discussion on docs release and akri release synchronization.
    • MQTT demo and discussion:
      • Scenario is around using akri for asset management.
      • Currently using a mock MQTT broker from https://github.com/eclipse/mosquitto
      • Discussion around using a pod watcher vs watching Akri Instances instead.
  • Issue triage

Assignments for Next Meeting

Assignment
Moderator @Roaa
Notes Edrick Wong
Issue Triage @bfjelds

September 7th


Assignments

Assignment
Moderator Kate Goldenring
Notes @bfjelds
Issue Triage @Roaa

Announcements

  1. KubeCon NA WASM Day Talk on October 12 1:45pm - 2:15pm PT called Living on the Edge: Using IoT Devices on Kubernetes WebAssembly Applications
  2. OSD Day at Grace Hopper Celebration
  3. Updated dependencies to latest versions. Now using Tokio version 1.0.
  4. Docs site launched!

Agenda

  1. Issues and PR Bot for marking stale and closing - Roaa
  2. Add other agenda items by leaving a comment here!

Discussion

  • Quick updates on: WASM, Grace Hopper, Dependency updates, docs!
  • Staying fresh with our new Issues and PR bot driven by activity (comments, etc)
    • Lack of activity for 45 days will lead to stale label
    • Lack of activity on stale item will lead to closure
    • This allows us to keep our issues limited to things that are actively being worked on or researched.
  • Triage:
    • Frequent changes to Configuration resources causing instability (may be related to recent deps update): fix in progress
    • ONVIF duplicate instances detected: good first issue, put on backlog
    • udev version update needed: put in backlog
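
The stale-bot policy described above can be sketched as a small decision function. The 45-day threshold comes from the notes; the closure window (`CLOSE_AFTER`), the `next_action` helper, and the action names are assumptions for illustration and do not describe the actual bot configuration.

```python
from datetime import datetime, timedelta

STALE_AFTER = timedelta(days=45)  # from the notes
CLOSE_AFTER = timedelta(days=15)  # assumption: the notes give no closure period


def next_action(now, last_activity, stale_since=None):
    """Decide what the bot should do with one issue or PR.

    last_activity: time of the most recent comment or update.
    stale_since:   when the stale label was applied, or None if not stale.
    """
    if stale_since is None:
        return "mark-stale" if now - last_activity >= STALE_AFTER else "none"
    if last_activity > stale_since:
        # Someone responded after the label was added, so the item is fresh again.
        return "remove-stale"
    return "close" if now - stale_since >= CLOSE_AFTER else "none"
```

Treating any activity after the label as a reason to remove it keeps the bot from closing items that someone has picked back up.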

Assignments for Next Meeting

Assignment
Moderator Edrick Wong
Notes @Roaa
Issue Triage Kate Goldenring

August 3rd (recording)

Assignments

Assignment
Moderator Edrick Wong
Notes Vince Nguyen
Issue Triage @bfjelds

Announcements

  1. OSD Day at Grace Hopper Celebration

Agenda

  1. Akri on Krustlet demo: demo of optimizing an Akri Discovery Handler to run as a WebAssembly module on krustlet. - Rods
  2. ONVIF updates- Kate Goldenring
  3. Documentation repository (proposals location?) - Kate Goldenring
  4. Issues clean up - should we have a stale issue bot? - Roaa
  5. Broker deployment strategies discussion: Where and how many Pods should be deployed to discovered devices? - Kate Goldenring
  6. Add other agenda items by leaving a comment here!

Discussion

Akri and Krustlet

Trying to port Akri components to krustlet node in order to reduce Akri footprint even further, enable wasm modules to use devices, and join the wasm community

  1. First aimed to port Akri DebugEcho (for testing) Discovery Handler to run as a WASM module
  2. WASI constraints: only supports single-threaded environment, limited networking
  3. Due to constraints (no sockets support in WASI), created a proxy for the Discovery Handler to talk to the Agent

Takeaways

  1. Decrease in DebugEcho discovery handler file size from 12M to 172K
  2. Start up time decrease from 4s to 3s
  3. Can use Akri devices on krustlet

Next steps

  1. Port more discovery handlers (that discover real devices) to Wasi
  2. Port Agent (might need to wait for Wasi improvements since multi-threaded)

Amazing demo from Rodrigo! [See meeting recording]
Try out the demo here.

ONVIF updates

  1. Optimized discovery handler so only makes calls to device endpoint for filtering once. Reduced CPU limit from 1300m to 24m k8s CPU units.
  2. Optimized sample broker, reducing size from 2GB to 800 MB. Still very large but only for demo purposes.
  3. Removed errors due to not being able to connect to IPv6 cameras
  4. Did IPv6 support investigation. See discussion
  5. Did authenticated camera discovery investigation. See discussion

Doc and repository

We are looking to move Akri docs to their own doc repo. It is easier to edit and format. Using a GitHub repo, we can track changes and people can add comments and feedback more easily. The proposal docs are more technical, though. Should technical proposal docs be part of the GitHub repo or HackMD? Idea: we will have a repo with 2 folders: 1 for user docs and 1 for technical proposal docs.

Issues clean up

Some Akri issues have been around for a while. A lot of projects have bots to mark issues stale or close them out. Idea: we will add a bot to label old issues as "stale" and add comment to the issue. The bot would also bump and email people so bug filer can reactivate if they like.

Broker deployment strategies discussion

Triage

Execute workloads based on labels instead of flags in PR titles - PR on the way.
Improvements to the "Implementing a new Discovery Handler" Doc - Added "documentation" tag.
Switching from MIT to Apache-2.0 License - Moved to backlog.
Cluster setup flow for microk8s points to 1.19 - Investigating.
RUSTSEC-2021-0073: Conversion from prost_types::Timestamp to SystemTime can cause an overflow and panic - We are blocked on this because we have to increase the tokio version and the versions of a bunch of other dependencies. Moved to backlog.
Akri architecture for IoT protocols with standalone devices - Moved to investigating.

Assignments for Next Meeting

Assignment
Moderator
Notes
Issue Triage

July 6 (recording)


Assignments

Assignment
Moderator @bfjelds
Notes @Roaa
Issue Triage Kate Goldenring

Announcements

  1. OSD Day at GHC

Agenda

  1. Akri + WASM - Rodrigo
  2. MQTT proposal discussion - Shan Desai
  3. GitHub Workflows improvements (Docker Buildx) - Shan Desai
  4. CoAP discovery handler development - Jiayi Hu
  5. Add other agenda items by leaving a comment here!

Discussion

Akri + WASM

MQTT proposal:

  • MQTT comes with the assumption that there are other publishers publishing special topics and that those topics are predefined. Application assumes by subscribing to a specific topic, it will get information back.
  • Akri's most prevailing benefit to MQTT is the dynamic deployment and scaling based on MQTT broker resources.
  • Handling devices going offline - An MQTT client can disconnect from a broker due to network connectivity or other reasons. In those scenarios, the proposal is for the Akri MQTT discovery handler to leverage the last will and testament concept to stop advertising such topics, so that Akri's controller can take down the applications using them.
  • Defining MQTT Akri resources: Is a single device a resource? A topic? - Current proposal is to include general topics as an Akri resource. Akri can leverage the MQTT wildcard option to group topics under general high-level topics (e.g. temperature, etc.)
  • Interesting followups
    • Can apps requesting Akri resources target specific topics, or only general ones?
    • When requesting a wild card topic as a resource, how can Akri make sure it deploys the app to the node including the publisher for that topic? Is deploying to all nodes a good strategy?
  • @ShanDesai to update the proposal with points discussed on GitHub and in meeting. More discussion offline can follow.
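
The wildcard grouping discussed above can be sketched with a small matcher for MQTT topic filters, where `+` matches exactly one topic level and `#` matches the parent level and everything below it. The `topic_matches` and `group_topics` helpers and the sample topics are hypothetical; this is not the proposed discovery handler, only an illustration of how general topics could collect specific ones.

```python
def topic_matches(topic_filter, topic):
    """Return True if an MQTT topic filter (with + and # wildcards) matches a topic."""
    f_levels = topic_filter.split("/")
    t_levels = topic.split("/")
    # Topics starting with '$' are not matched by a leading wildcard.
    if topic.startswith("$") and f_levels[0] in ("+", "#"):
        return False
    for i, level in enumerate(f_levels):
        if level == "#":
            # Multi-level wildcard is only valid as the last level of the filter.
            return i == len(f_levels) - 1
        if i >= len(t_levels):
            return False
        if level != "+" and level != t_levels[i]:
            return False
    return len(f_levels) == len(t_levels)


def group_topics(filters, topics):
    """Group concrete topics under each general (wildcard) filter."""
    return {f: [t for t in topics if topic_matches(f, t)] for f in filters}
```

For example, a filter such as `factory/+/temperature` would collect the temperature topics of every line under `factory` as one candidate resource, while `factory/#` would collect every topic beneath it.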

CoAP:

  • Discuss different challenges:
    • Ideally, the agent and discovery handler run only on a single node or a limited number of nodes. Can this be achieved by node selectors? Other techniques?
    • How can akri better incorporate the needs for pub/sub protocols and their architecture? @Giovani to open an issue to discuss further.

Assignments for Next Meeting

Assignment
Moderator Edrick Wong
Notes Kate Goldenring
Issue Triage @bfjelds

June 1 (recording)

Assignments

Assignment
Moderator @Roaa
Notes @bfjelds
Issue Triage Kate Goldenring

Announcements

  1. Akri Blog contribute by writing a post!

Agenda

  1. KubeCon EU take aways - Kate Goldenring
  2. Review a few good first issues that are available - @bfjelds
  3. Donating Akri to a foundation - Edrick Wong
  4. Discussion of MQTT Protocol + Eclipse Sparkplug B Integration with Akri - @ShanDesai
  5. Add more items by leaving a comment here

Discussion

  • Akri blog introduction

KubeCon EU takeaways

  • Presented an intro to Akri (117 live viewers, 230 views as of a week later, 100 more views as of now) and a breakout session.
  • Presentation contributed to gains in Slack members, GitHub repo views, and GitHub clones(!).
  • Q/A session reflected user interest and areas for improvement
    • questions about ARM, etc
    • language-agnostic support for extensibility
    • scalability?
    • stress testing?
    • device authentication?
    • IoT protocols:
      • MQTT
    • roadmap
      • CNCF sandbox
  • Submitted for KubeCon North America

Good first items

Donating Akri to a foundation - Edrick Wong

  • Success correlated to users, contributors, device support

  • Donation is the best path towards increasing Akri's success but where??

    • CNCF seemed natural to us
    • Akri built with the intent of being K8s native
  • Steps: #1 sandboxing

  • Other thoughts?

    • Other sandboxed projects? https://www.cncf.io/sandbox-projects/
    • Where to donate depends on what problem you are solving. For example, if it is edge compute, target edge compute in CNCF (look at other similarly targeted projects like OpenYurt).
  • Projects considered every 2 months (next in June)

Discussion of MQTT Protocol + Eclipse Sparkplug B Integration with Akri

  • MQTT

    • Publish/subscribe (topics: xxx/yyy/zzz), lightweight (vs Kafka, etc.), proactive
    • Good for low power devices, perfect for IoT
    • Many implementations; payloads can use JSON, XML, etc.
    • Sessions can establish connection
    • Last Will & Testament mechanisms allow a last gasp communication
  • Akri & MQTT using Sparkplug B (sparkplug.eclipse.org)

    • Has standardized specification/implementation for payloads and topic syntax
    • Start with basic implementation and add configuration as users dictate need
  • Interesting questions:

    • Where should the monitoring broker pod run? On an Akri node? Somewhere in the network?
    • How does discovery work? Is it needed or redundant?
    • MQTT v5 has new concepts (request/response), is v5 preferable?
  • Monitoring pod (in architecture diagram) vs MQTT broker? Monitoring pod was borrowed from OPCUA concept. MQTT broker is an MQTT concept (not an Akri broker). MQTT brokers have been stress-tested for 100k devices.

  • Most common devices: env sensor off-the-shelf (temp, humid, pressure) for prototyping

  • MQTT apps: time-series data from accelerometer on boat, industrial setting sensor analysis
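
As an illustration of the standardized Sparkplug B topic syntax mentioned above, the following is a minimal sketch of a topic parser. It assumes the `spBv1.0/<group_id>/<message_type>/<edge_node_id>[/<device_id>]` namespace layout from the Sparkplug specification and deliberately skips `STATE` topics. The names `SparkplugTopic`, `parse_sparkplug_topic`, and `is_offline_notice` are hypothetical and do not come from Akri or the proposal.

```python
from dataclasses import dataclass
from typing import Optional

NAMESPACE = "spBv1.0"
# Node- and device-level message types. STATE messages use a different
# topic layout and are intentionally not handled in this sketch.
MESSAGE_TYPES = {
    "NBIRTH", "NDEATH", "DBIRTH", "DDEATH",
    "NDATA", "DDATA", "NCMD", "DCMD",
}


@dataclass(frozen=True)
class SparkplugTopic:
    group_id: str
    message_type: str
    edge_node_id: str
    device_id: Optional[str] = None  # absent for node-level messages


def parse_sparkplug_topic(topic: str) -> Optional[SparkplugTopic]:
    """Parse a Sparkplug B topic, returning None if it does not fit the layout."""
    parts = topic.split("/")
    if len(parts) not in (4, 5) or parts[0] != NAMESPACE:
        return None
    if parts[2] not in MESSAGE_TYPES:
        return None
    device_id = parts[4] if len(parts) == 5 else None
    return SparkplugTopic(parts[1], parts[2], parts[3], device_id)


def is_offline_notice(parsed: SparkplugTopic) -> bool:
    """NDEATH/DDEATH signal that a node or device went offline."""
    return parsed.message_type in ("NDEATH", "DDEATH")


if __name__ == "__main__":
    print(parse_sparkplug_topic("spBv1.0/plant1/DDATA/edge7/temp-sensor"))
```

In Sparkplug B, the NDEATH message is registered as the edge node's MQTT Will, so a monitoring component could use a check like `is_offline_notice` to react to the Last Will & Testament mechanism described above. How Akri would consume these messages is still an open design question.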

Assignments for Next Meeting

Assignment
Moderator @bfjelds
Notes @Roaa
Issue Triage Kate Goldenring

May 4 (recording)

Assignments

Assignment
Moderator Edrick Wong
Notes @roaa
Issue Triage @bfjelds

Announcements

  1. KubeCon Edge Day lightning talk just happened! KubeCon EU Session on Akri in just two days, on Thursday at 3:25 AM PST.

Agenda

  1. Big v0.6.5 release overview - Kate Goldenring
  2. Ideal future release cycle and what should go into the next release? - Birds of a feather moderated by @bfjelds
  3. Documentation site - Edrick Wong

Discussion

KubeCon Akri Talk:

  • Talk on Thursday. Session will cover DHs and extensibility model.
  • Recording will be posted May 14th on the CNCF website.

New Release

  • Full release notes on GitHub
  • New gRPC interface for discovery handlers
  • Support for full (with embedded discovery handlers) and slim agents.
  • Discovery handlers are organized into a library and folder to promote code sharing.
  • Better documentation on discovery handler & broker development
  • A webhook validating Akri's configurations.
  • Support for monitoring using Prometheus, exposing number of instances/brokers.

Release cycles:

  • Last release had lots of features. Team will be moving to more frequent and smaller releases.
  • Moving to a monthly cadence going forward.

Next release:

Team is looking for feedback and feature requests. Some of the discussed features:

  • Improving ONVIF: IPv6, authentication scenarios. Open question: how common are IP cameras that require authentication?
  • Support more platforms: KubeEdge, kind, etc.
  • Handling stateful brokers/broker caching: Akri assumes all brokers are stateless, but in some scenarios the broker is stateful. Ideas discussed: letting a broker require a configurable grace period when a device goes offline, and assigning a new device to the broker when its device goes offline. Jiayi Hu to create an issue to continue the discussion.
  • MQTT
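
The two stateful-broker ideas above (a configurable grace period and reassigning a broker to a new device) could be combined roughly as follows. This is a hypothetical sketch only: `BrokerState`, `next_action`, and the action names are invented for illustration and do not reflect how Akri's controller works or any agreed design.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class BrokerState:
    device_id: str
    # Timestamp (seconds) when the device was first seen offline;
    # None while the device is online.
    offline_since: Optional[float] = None


def next_action(
    state: BrokerState,
    now: float,
    grace_period_secs: float,
    spare_device_id: Optional[str] = None,
) -> str:
    """Decide what to do with a stateful broker (hypothetical policy).

    Returns "keep", "reassign", or "terminate".
    """
    if state.offline_since is None:
        return "keep"  # device is online
    if now - state.offline_since < grace_period_secs:
        return "keep"  # still inside the grace period
    if spare_device_id is not None:
        return "reassign"  # hand the broker a replacement device
    return "terminate"


if __name__ == "__main__":
    broker = BrokerState(device_id="camera-1", offline_since=100.0)
    print(next_action(broker, now=130.0, grace_period_secs=60.0))
```

The grace period here is a single number per broker; a real design would need to decide where that value is configured and who tracks the offline timestamp.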

Akri team to update the roadmap and document the strategy for how the team decides what is picked for the next release (e.g., most up-voted issues).

Website:

  • Website is on track to be included in the next release, based on GitBook.
  • Decision is to include it in the current Akri repo, at least in the short term, since it is based on the docs in the repo.

Assignments for Next Meeting

Assignment
Moderator @Roaa
Notes @bfjelds
Issue Triage Kate Goldenring

April 6 (recording)

Assignments

Assignment
Moderator Kate Goldenring
Notes @bfjelds
Issue Triage @Roaa

Announcements

  1. KubeCon EU and KubeCon Edge Day 2021 proposals accepted! More information about the event here.

Agenda

  1. New Discovery Handler Model in main - Kate Goldenring
  2. New default Agent without Discovery Handlers - Kate Goldenring
  3. More descriptive Helm chart values - Kate Goldenring
  4. Community outreach, where we could use some hands - Edrick Wong
  5. Modularizing repo structure (should we have separate repositories for Akri core components, a Discovery Handlers, brokers, proto files, etc?) - @bfjelds
  6. Documentation site - Edrick Wong

Discussion

New Discovery Handler

  • Discovery handlers can live outside of the Agent
  • You can now write discovery handlers in Go
  • If interested in writing a Discovery Handler in Rust, we have a template to get started
  • Slim agent is the default; discovery handlers are deployed separately

Community outreach

  • Focus has been on K8s community
  • Include other non-K8s community members (devops, etc.). Call for ideas:
    • github actions & workflows (e2e tests good area for improvement)
  • Create page for help wanted (issue? dedicated page? dedicated page that points to issues? check out other projects)

Repo Modularization

  • Several options
    • split apart core repo
    • breaking out swagger for our API
    • separate place for our gRPC
    • docs?
  • Would having more repos make it confusing for developers who are getting started?
    • Having a documentation site might be useful as the first step if we had several repos
  • Can we serve up gRPC and swagger with GH Pages like we do with Helm?
  • Start an Issue on this
  • Examples
  • Feedback:
    • +1 on proto
    • a scaffold project like Akri Proto + lang*

Documentation

  • Standalone docs page using GitBook; GitBook will link to the docs folder in the repo
  • Once docs folder and gitbooks are sync'd, no real effort needed to keep in sync
  • Intent is for this to provide "single source of truth"
    • Especially important if repos are modularized
    • Could provide various flows to ease different user roles (akri dev, DH dev, etc)
  • Should docs be separated out into their own repo? (related to Repo Modularization)

Other

  • Security audit action, other learnings could be really cool for an Akri blog

Assignments for Next Meeting

Assignment
Moderator Edrick Wong
Notes @roaa
Issue Triage @bfjelds

March 2 (recording)

Assignments

Assignment
Moderator Edrick Wong
Notes Kate Goldenring
Issue Triage @bfjelds

Announcements

  1. KubeCon EU 2021 proposal accepted! More information about the event here.

Agenda

  1. Discovery Handler Design and Demo - Kate Goldenring
  2. Dependency Update - @Roaa
  3. Webhooks - @bfjelds
  4. Deallocate - @Roaa
  5. CoAP protocol - Jiayi Hu
  6. How to engage more with community - Edrick Wong

Discussion

Discovery Handler

  • Protocols are decoupled from the Akri Agent now
    • Do not need to modify Akri Agent to add new protocols
  • Template to expedite protocol creation

Dependency Update

  • Streamline dependency updates via a GitHub Action: has update command and verify command.
  • If succeeds, a PR is created with updates
  • Monthly basis
  • GitHub Action is generic, so it can be used for other languages. Feel free to use it in your other projects!
  • Also, now performing security audit checks on dependencies

Webhooks

  • For validating Configurations that are passed to K8s and then Akri
  • Couldn't put PodSpec in base of CRD: would tie us to specific version of K8s
  • Catch errors (such as improper indentation) earlier

Deallocate

  • Working to get deallocate functionality added to next version of Kubernetes

CoAP

Internet of Production - largest Manufacturing systems/digitalization research group in Germany

  • Setting up a K8s cluster to manage (?) machine connectivity
  • Large variety of machinery (sewing, welding, etc) that want to connect to one K8s cluster
  • How to connect all these devices that "speak" different protocols?
  • Looking at ways to aggregate machine data (fluentd, kafka, etc)
  • Don't want to "mess" with the machinery and have to install anything on it
  • Want device discovery, the ability to change the aggregation method (e.g., pull data every minute vs every week), and not to have to modify devices/machinery.

Bird of a Feather

  • @Shan's similar scenario as IoP
    • Research scientist - working with robotic arms and industrial machines (not high risk setting)
  • @Jiayi researching "The Continuum of Computing": bringing cloud, edge and fog together
    • want edge to have same advantages as cloud: HA, etc
    • Same issue as Moritz: how to update code on devices without bringing them down
      • devices collect data that is used in ML workloads
      • update code on devices using WASM (lower mem requirements than compiler)
  • Start sharing ideas and designs on HackMD

Assignments for Next Meeting

Assignment
Moderator Kate Goldenring
Notes @bfjelds
Issue Triage @roaa

February 2 (recording)

Assignments

Assignment
Moderator Edrick Wong
Notes @bfjelds
Issue Triage Kate Goldenring

Announcements

We have an official logo now!

Akri logo and wordmark

If you'd like to see the styleguide and other artwork, check out the GitHub folder

v0.1.5 Release

More information at https://github.com/deislabs/akri/releases/tag/v0.1.5

Discovery Handlers as Pods

More information at https://github.com/deislabs/akri/issues/198

Agenda

Discussion

Notes

  • Introductions, setting agenda, reminder of recording

  • Announcement of official logo, introduction to Akri name, process of logo creation

  • Announcement of OPCUA (v0.1.5), description of OPCUA implementation, and work to come

  • Discussion of new discovery handler extension method

    1. gRPC communication
    2. streaming response, allows for flexibility per handler
    3. concern: push-based communication
    4. still requires an agent for each node, this remains an area for improvement
  • Triage of issues, introduction of roadmap page

    1. Documentation of build process w/ HackMD page containing start at docs
    2. Represent CRD versioning in accessible way for Helm
    3. Update dependencies like kube and tokio ( ~0.2 -> ~1.* ), they are old
    4. Agent crash when API server not available
      1. Kubernetes guidance: 5 minutes before full-on takedown of Agent
      2. Document what our decision is
      3. Cache?
      4. How does this impact slot usage, etc?
      5. Standby mode?
      6. Check out Krustlet and mimic
    5. Overly verbose, unstandardized logging. Can the logging level be set in the Helm chart?
      1. Document a standardized approach
      2. Guidelines for INFO vs TRACE, etc
    6. Helm linting needs to pass args to be effective, maybe create exhaustive set of lint options
  • Feedback, suggestions, concerns, etc

    1. What is the broader goal/vision/use case? (including from MSFT perspective) Would be clearer if this was found in the documentation, maybe HackMD.
    2. Other deislabs projects, such as Krustlet, have some blog info that was very helpful. Akri could benefit from something similar.
    3. Why HackMD? Other people simply use github (single tool is easier)
      1. Thinking was: this was not code, so separate from github
      2. Google Docs vs HackMD: HackMD didn't require an account
      3. In the end, we are just trying this out, open to migrating to other platforms
      4. Finding responses is difficult on HackMD, notification system may be insufficient
  • Intention for handling Community meeting, agenda

    1. All notes in HackMD
    2. To suggest Agenda item, anyone can add a comment for new agenda item
    3. Ways to find the notes: aka.ms/akri/meeting-notes, searching "akri hackmd", pinned in Slack. Should we add a badge to GitHub?

Assignments for Next Meeting

Assignment
Moderator Edrick Wong
Notes Kate Goldenring
Issue Triage @bfjelds