# "Focused and efficient triage" design meeting proposal
## Links
* [pnkfelix's previous notes about triage](https://hackmd.io/dvgegmdgQVSMbC4rcjG6EA?view) -- lots of good stuff here
* [Zulip topic for meeting pre-discussion](https://rust-lang.zulipchat.com/#narrow/stream/185694-t-compiler.2Fwg-meta/topic/focused.20and.20efficient.20triage.20compiler-team.23247)
* [Zulip topic for meeting](https://rust-lang.zulipchat.com/#narrow/stream/131828-t-compiler/topic/design.20meeting.202020-02-28)
## Proposal
* create pre-triage working group
    * primary goal is *not* to assure overall quality but to identify critical release blockers and make sure they are on a path to being fixed
    * process nominations to determine if something is "critical"
        * critical bugs are "next release blockers"
        * we should write down guidelines for this, but not as part of this meeting (see notes below)
    * review critical bug status and identify those that are not making progress
    * prepare weekly summary to guide compiler team triage meeting:
        * critical issues that are not "on track" to being fixed before they hit stable
        * most important issue(s) that need discussion by team to resolve disagreements
        * "future compatibility warnings" and other longer term progressions that should be moving towards completion
    * could be that this is part of release team wg-triage, I propose we don't decide this here but let those groups decide
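The weekly summary could be assembled fairly mechanically once issues carry the relevant labels. The following is a minimal sketch, not an existing tool: the `Issue` record, its `on_track` flag, and the `future-compat` label name are hypothetical, while `P-critical` and `I-nominated` are used as described later in this document.

```python
from dataclasses import dataclass, field


@dataclass
class Issue:
    """Hypothetical issue record; not tied to any real tracker API."""
    number: int
    title: str
    labels: set = field(default_factory=set)
    on_track: bool = True  # assumption: set by whoever last reviewed the issue


def weekly_summary(issues):
    """Group issue numbers into the three agenda buckets named in the proposal."""
    summary = {"critical_not_on_track": [], "needs_discussion": [], "deferred": []}
    for issue in issues:
        if "P-critical" in issue.labels and not issue.on_track:
            summary["critical_not_on_track"].append(issue.number)
        if "I-nominated" in issue.labels:
            summary["needs_discussion"].append(issue.number)
        if "future-compat" in issue.labels:  # hypothetical label name
            summary["deferred"].append(issue.number)
    return summary
```

How "on track" is judged is left to the working group; the sketch only assumes that someone records the answer.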
## Summary of needs and proposals
* Monitor and identify "critical bugs" that are not making progress
    * **Today:** this is done by pnkfelix in pre-triage meeting
    * **Proposed:** create new compiler triage working group to focus on this and prepare a report for the triage meeting
* For critical bugs not making progress, find someone to fix
    * **Today:** pnkfelix attempts to guilt-trip at beginning of meeting, often with little effect, sometimes self-assigns
        * or we ping ad-hoc groups of folks
    * **Proposed:** defer for this meeting, but we do need to resolve this eventually
        * (additional) ICE-breaker groups or working groups can make "ad-hoc groups" less ad-hoc
        * maybe identify "maintenance czar" for areas of the compiler and find ways to ensure they have time to ensure stuff is getting fixed?
* Making general quality improvements and enhancements (i.e., "fixing non-critical bugs")
    * **Today:** this is done by contributors or working groups
        * e.g., the async-await group does a regular triage of all things tagged A-AsyncAwait, NLL used to do similar
        * mostly today we focus on new reports and do not do comprehensive "bug scrubs" of older bugs
    * **Proposed:** defer for this meeting, this "works reasonably well" today
        * but we will likely want to consolidate on conventions for groups doing triage
* Ensuring deferred things are picked up again (e.g., future compatibility warnings)
    * **Today:** nobody does this "project wide" that we know of
    * **Proposed:** let's discuss; either the release team wg-triage or compiler triage can periodically revisit such issues and re-evaluate their status
        * e.g., for future compatibility warnings --
            * can we advance to a hard error yet?
            * should we ping folks and re-run crater?
* Processing new issues, ensuring labels are up to date, identifying bugs that are out of date or have been fixed
    * **Today:** this is currently done by the release team wg-triage
        * but number of issues without prioritization is very large (2200 T-compiler issues, was )
    * **Proposed:** no change but release team wg-triage and compiler team triage should obviously coordinate
        * but there is also a good question: is this really the same group?
## Brainstorming and questions
These are topics we discussed while planning for this meeting. They may or may not be good things to dive into during the meeting. Ultimately, the working group should decide for itself.
### How do people bring things to the working group's attention?
If something seems "obviously critical", people can tag it as `P-critical` (see below). If it is unclear, use `I-nominated`, as today, to bring it to the group's attention.
However, as we already have problems where the "intent" of a nomination is unclear, we may wish to consider replacing `I-nominated` with more specific `N-*` labels that identify the reason the issue was nominated:
* `N-critical` -- nominated as a potential critical issue
* `N-compiler` -- nominated for discussion by compiler team
* `N-lang` etc
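One way to picture the effect of the `N-*` labels: the reason for a nomination becomes machine-readable, whereas a bare `I-nominated` records no intent. A minimal sketch; the function name and the `"unspecified"` marker are illustrative only, and the `N-*` labels are the ones proposed here, not labels that exist today.

```python
def nomination_reasons(labels):
    """Return the nomination reasons encoded in an issue's labels.

    A proposed `N-<reason>` label yields `<reason>`; a bare `I-nominated`
    yields "unspecified" because it does not say why the issue was nominated.
    """
    reasons = sorted(label[len("N-"):] for label in labels if label.startswith("N-"))
    if not reasons and "I-nominated" in labels:
        reasons = ["unspecified"]
    return reasons
```

For example, `nomination_reasons({"N-critical", "T-compiler"})` gives `["critical"]`, so the triage group can route the issue without first guessing the nominator's intent.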
### How does the group track critical issues and when to revisit them?
The existing `P-*` labels are fairly confusing. `P-high` in particular was meant to be "critical" issues but has wound up being used for more things and has become somewhat inactionable.
Proposal:
* Introduce `P-critical` to specifically identify (potential) blockers for next release.
* (Gradually?) replace other `P` labels with `Hz` labels that identify frequency to revisit:
    * `Hz-daily` -- for the most critical bugs
    * `Hz-weekly` -- typical critical bug
    * (Maybe) `Hz-release` -- revisit after release
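The point of the `Hz` labels is that "is this issue due for another look?" becomes a simple date comparison. A rough sketch of that check follows; the function name is hypothetical, and treating `Hz-release` as a 42-day interval is an assumption based on Rust's six-week release cycle rather than anything decided in this proposal.

```python
from datetime import date, timedelta

# Revisit intervals per proposed label. The 42-day value for Hz-release is an
# assumption (one six-week release cycle), not part of the proposal.
REVISIT_INTERVAL = {
    "Hz-daily": timedelta(days=1),
    "Hz-weekly": timedelta(days=7),
    "Hz-release": timedelta(days=42),
}


def is_due(labels, last_reviewed, today):
    """True if any Hz label on the issue says it should be revisited by `today`."""
    intervals = [REVISIT_INTERVAL[l] for l in labels if l in REVISIT_INTERVAL]
    if not intervals:
        return False  # no Hz label: the issue is not scheduled for revisiting
    # If several Hz labels are present, the most frequent one wins.
    return today - last_reviewed >= min(intervals)
```

An issue tagged `Hz-weekly` and last reviewed on 2020-02-21 would be due on 2020-02-28, while one reviewed the previous day would not.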
### How do other working groups track priorities?
The async-await working group has been making an active effort to triage and track priorities. The Hz- labels could also be used for this purpose, perhaps with "non-time-based" scopes. For example:
* `Hz-active` -- part of current "sprint", will tag the first N "medium"-like issues as active, then gradually fix those
* `Hz-deferred` -- issues to revisit after sprint is over, or perhaps after enough time has elapsed
* `Hz-never` -- issues not scheduled to be revisited (i.e., `P-low`)
### What is a "critical" bug?
We should develop criteria for critical bugs. Here are some notes.
Critical bugs, at the broadest level, are those that will actively affect substantial portions of the ecosystem or harm production or other existing users.
Examples of things we typically judge to be "critical" bugs:
* Regressions where code that used to compile no longer does
    * Mitigating conditions that may lower priority:
        * If the code should never have compiled in the first place (but if the regression affects a large number of crates, this may indicate that we need a warning period)
        * If the code in question is theoretical and considered unlikely to exist in the wild, or if it only exists in small, unmaintained packages that are not widely used
        * Once a regression has hit stable for a release or two, we typically lower the priority as well, as by that time
* Regressions where code still compiles but does something different than it used to do (dynamic semantics have changed)
    * Mitigating conditions that may lower priority:
        * If the code relies on behavior that is explicitly unspecified (e.g. the `Vec` docs state that the order in which it drops its elements is subject to change)
* Feature-gated features accessible without a feature gate
    * Mitigating conditions that may lower priority:
        * If the pattern is VERY unlikely
* Soundness holes where common code that should not compile actually does
    * Mitigating conditions that may lower priority:
        * Soundness holes that are difficult to trigger
        * Soundness holes that have been around for a very long time may be critical, but typically require
* Diagnostic regressions where the diagnostic is very common and the situation very confusing
* ICEs for common scenarios or code patterns
    * Mitigating conditions that may lower priority:
        * If the code that triggers the ICE also triggers compilation errors, and those errors are emitted before the ICE
        * If the code in question makes use of unstable features, particularly if the ICE requires a feature gate
Some unknowns:
* What about toolstate errors?
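To make the shape of these criteria concrete, here is a small sketch that treats the example categories as a checklist and the mitigating conditions as free-form notes. The category names are hypothetical shorthand invented for this illustration, and the output is only a starting point for discussion, not a proposed policy.

```python
# Category names are hypothetical shorthand for the example kinds listed above.
CRITICAL_CANDIDATES = {
    "compile-regression",     # code that used to compile no longer does
    "behavior-regression",    # code compiles but dynamic semantics changed
    "feature-gate-leak",      # gated feature usable without the gate
    "soundness-hole",         # common code that should not compile does
    "diagnostic-regression",  # very common, very confusing diagnostic
    "ice",                    # ICE on a common scenario or code pattern
}


def triage_hint(category, mitigations=()):
    """Suggest a starting point for discussion; a human makes the final call."""
    if category not in CRITICAL_CANDIDATES:
        return "not a critical candidate"
    if mitigations:
        return "critical candidate; may be lowered: " + ", ".join(mitigations)
    return "critical candidate"
```

Keeping mitigations as notes rather than as automatic downgrades reflects the wording above: they *may* lower priority, and that judgment stays with the group.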
## Typical workflow
* Issue gets filed
* Release team applies labels
    * area labels, team labels
    * If bisection, mcve needed:
        * tag with needs-bisection, needs-mcve
        * cc "Cleanup crew"
    * In some cases:
        * directly prioritize or send to the right place if it seems clear
        * otherwise, nominate for compiler team meeting
* Release team nominates for compiler team to further process
    * also cc folks to bisect
        * (usually ICE), usually includes needs-bisection and needs-mcve
* Compiler team triage group analyzes and figures out which things apply
    * Critical bugs:
        * Tag with P-critical
    * Needs team discussion:
    * Delegation:
        * Should we cc
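The release team's first pass in this workflow can be sketched as a small function. This is an illustration under stated assumptions, not a description of existing tooling: the function and parameter names are hypothetical, area and team labels are left out, and nomination is represented by `I-nominated` as it is used today.

```python
def first_pass_labels(needs_bisection=False, needs_mcve=False, destination_clear=False):
    """Labels and cc targets a first pass might produce, following the steps above.

    Returns a `(labels, cc)` pair of lists.
    """
    labels, cc = [], []
    if needs_bisection:
        labels.append("needs-bisection")
    if needs_mcve:
        labels.append("needs-mcve")
    if needs_bisection or needs_mcve:
        cc.append("Cleanup crew")
    if not destination_clear:
        # Not obvious where the issue belongs: hand it to the compiler team meeting.
        labels.append("I-nominated")
    return labels, cc
```

The later steps (tagging `P-critical`, team discussion, delegation) are deliberately not modeled, since the notes above leave them open.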
## "Mission statement" for proposed group
Compiler team wg-triage has the high-level goal of:
* processing 'nominations' and routing bugs to folks who can fix them
* identifying *critical* bugs and monitoring them to ensure they are making progress
* identifying the agenda for compiler team triage meetings
    * critical issues that are not making progress
    * issues where bugs are nominated for needing wider discussion
        * ideally, crystallize
* tracking deferred things and ensuring they are picked up again
    * future compatibility warnings
* anything else?
---
### Other notes
Here is a bunch of stuff we didn't finish processing.
* How can we characterize "critical" bugs?
    * e.g., regressions
        * what about impact on crates, can we characterize how much that matters?
        * but what about stable-to-stable regressions, those too?
    * All ICEs is probably too broad, but perhaps some subset of ICEs qualifies?
* What is the role of the P-medium, P-high, etc labels?
* Should we have a standard way to "delegate" bugs to a working group? (like the async-await triage)
* What parts of triage can be delegated to a working group, rather than using the "whole team meeting"?
* Are there other sorts of triage beyond "preventing catastrophic bugs" we should consider, perhaps through the WG at a different cadence? E.g.,
    * tracking "future compatibility warnings" and ensuring we move those through the process
    * periodic "bug scrubs" or reviewing the set of random, medium priority bugs and trying to close, delegate, take action, etc
## Assumptions
More specifically, we would like feedback on the following assumptions:
* Goal of compiler team triage should be to avoid shipping catastrophic bugs
    * i.e., things that are really embarrassing or which will affect a lot of users
* Goal of compiler team triage is NOT to ensure "overall quality" of the compiler
    * that goal is maintained in a distributed way through reviews
    * but also by the working groups dedicated to a particular feature
* We should focus triage meeting time on "here is a critical bug that is not being fixed"
## Questions to answer
Presuming we sort of agree with the above hypotheses, some questions we might try to answer:
* Do we need this many levels?
* Can we narrow down the criteria for the various levels?
* Can we justify how folks will use these labels in concrete ways?