# Maintenance
- **Title:** Discussing maintenance and triage
- **Estimate:** 1 meeting
- **Type:** Non-technical
# Summary
- (Check at beginning: are there topics we should cover today that are not on this list?)
- We are having a hard time keeping up with P-high
- We never review P-medium bugs
- Our labeling system is too crude and not well documented
- Can more things be automated?
- Can we do a better job at getting help in bisection etc?
- How should unassigned issues get assigned?
- Developer engagement during triage meeting
- We are having trouble keeping up with reviews
# Motivation
It seems like we are having a hard time keeping up with maintenance, see above.
Also: We want the triage meeting to be a good use of everyone's time. If it's just a boring slog through a laundry list of issues, with only 30 seconds to devote to each one, fewer developers are going to be willing to attend (and fewer still will pay attention to the discussion).
* Arguably the desire to make better use of time was a motivation for the current structure, where the meeting is divided into "triage" vs "working group checkin" sections. But the triage, which was meant to be reduced to 30min via "pre-triage effort", seems to spill over, taking up 45-55min on a fairly regular basis.
# Details
## Keeping up with P-high
In theory, each week the triage meeting organizer looks at each P-high bug, and if there hasn't been any activity in the past week, they ping the assignee. If no one is assigned, then the organizer is supposed to find someone to assign it to.
* In practice, just the effort of going through the unprioritized nominated issues and assigning priorities to them is taking up a lot of pre-triage time. pnkfelix hasn't done the exercise of visiting each P-high bug in a long while.
* Instead, pnkfelix has been providing the statistics (we have N open P-high bugs, M of which are unassigned), rather than clog up the group meeting time with rote traversal of the P-high bugs.
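
The "in theory" weekly check-in described above could be sketched roughly as follows. This is only an illustration, not an existing tool: the issue records, field names (`number`, `assignee`, `last_activity`) and the function name are hypothetical, and it assumes that an unassigned bug always needs an assignee regardless of recent activity.

```python
from datetime import datetime, timedelta

def weekly_checkin(issues, now, stale_after=timedelta(days=7)):
    """Split P-high issues into (assignees to ping, issues needing an assignee).

    `issues` is a list of dicts with hypothetical keys
    "number", "assignee" (str or None) and "last_activity" (datetime).
    """
    to_ping, to_assign = [], []
    for issue in issues:
        if issue["assignee"] is None:
            # Nobody to ping: the organizer has to find someone.
            to_assign.append(issue["number"])
        elif now - issue["last_activity"] >= stale_after:
            # Assigned, but no activity in the past week: ping the assignee.
            to_ping.append((issue["number"], issue["assignee"]))
    return to_ping, to_assign

# Made-up example data (issue numbers and names are placeholders).
now = datetime(2019, 7, 1)
p_high = [
    {"number": 1, "assignee": "alice", "last_activity": datetime(2019, 6, 20)},
    {"number": 2, "assignee": "bob", "last_activity": datetime(2019, 6, 29)},
    {"number": 3, "assignee": None, "last_activity": datetime(2019, 6, 30)},
]
to_ping, to_assign = weekly_checkin(p_high, now)
```

The two output lists would also yield the statistics mentioned above (N open P-high bugs, M of them unassigned) without a manual traversal.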
**pnkfelix's main question:** Should all of the P-high bugs actually be P-high? Should we be expecting, in general, to be seeking updates on 30-50 issues each week?
**Centril's question:** Should we have more dedicated groups for triage and meetings for that? More dedicated bisectors and minimizers?
## We never review P-medium bugs
As of this writing, we have 4,857 open issues in the Rust project; 1,569 are tagged T-compiler, so let's focus on those.
* 37 are P-high
* 129 are P-medium
* 38 are P-low
There is no set plan for when we revisit the P-medium or the P-low issues (to evaluate progress, double-check status, and potentially reassign). There's not even a target frequency for such revisiting.
Perhaps even more worrying is that the vast vast majority of the bugs (1,365) are not prioritized at all.
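
As a quick sanity check on the counts above, the unprioritized figure is just the T-compiler total minus the three prioritized buckets. The variable names below are only for illustration.

```python
# Counts quoted in this section.
t_compiler_open = 1569
prioritized = {"P-high": 37, "P-medium": 129, "P-low": 38}

# Everything without a P-* label is unprioritized.
unprioritized = t_compiler_open - sum(prioritized.values())
share = unprioritized / t_compiler_open

print(unprioritized)    # 1365
print(f"{share:.0%}")   # 87%
```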
## Labeling system is too crude and not well documented
### Labeling in general
Right now, there are several distinct GitHub lists that we (should/might) visit during (pre)triage: P-high issues, I-nominated issues, stable-to-beta regressions, stable-to-nightly regressions, stable-to-stable regressions, unprioritized issues.
* That is too many lists, IMO.
* But how to avoid the proliferation of such lists, while still prioritizing?
* GitHub does not support OR'ing labels in searches; otherwise one might at least combine them all into one list for the main traversal.
* But perhaps we could make a tool to produce the OR'ed result, perhaps also coarsely sorting it so that e.g. I-nominated AND P-high comes first, then other I-nominated, then other P-high.
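
A minimal sketch of what such an OR'ing tool might do once the issues and their labels have been fetched. It works on in-memory data only; how the labels are fetched is left out, and the function name and example issue numbers are made up. Only the two labels from the sorting example are handled by default; the regression lists could be added through the `wanted` parameter and would sort into the last group.

```python
def merged_triage_list(issues, wanted=("I-nominated", "P-high")):
    """Return issue numbers carrying ANY of the `wanted` labels, coarsely sorted.

    `issues` maps an issue number to the set of its label strings.
    Order: I-nominated AND P-high first, then other I-nominated,
    then everything else that matched; ties are broken by issue number.
    """
    wanted = set(wanted)

    def rank(labels):
        if {"I-nominated", "P-high"} <= labels:
            return 0
        if "I-nominated" in labels:
            return 1
        return 2

    selected = [(num, labels) for num, labels in issues.items() if labels & wanted]
    selected.sort(key=lambda item: (rank(item[1]), item[0]))
    return [num for num, _ in selected]

# Made-up example: issue numbers are placeholders.
example = {
    10: {"P-high"},
    11: {"I-nominated"},
    12: {"I-nominated", "P-high"},
    13: {"P-medium"},
    9: {"I-nominated", "T-compiler"},
}
print(merged_triage_list(example))  # [12, 9, 11, 10]
```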
### Nomination labeling
Right now people nominate issues via the I-nominated label to raise attention to them among the relevant teams.
Sometimes the nomination is an implicit request for prioritization. (That is at least what I assume when I see that the nominated tag was added, but there is no associated comment describing what the intent of nomination was.)
Other times there is a specific question/debate that the nominator wants resolved by the team.
In any case, this system works okay (apart from the aforementioned proliferation of issue lists). The main thing that sometimes irks me is that the targeted team for the nomination is meant to be inferred from whatever T-team labels are on the issue.
* Would there be any potential benefit to having specific I-nominated-lang, I-nominated-libs, I-nominated-compiler? Or would this just cause increased list proliferation?
### Priority labels
It's not clear what the priority labels (P-high, P-medium, P-low) are actually supposed to mean, apart from providing a coarse order on priority. There isn't a set semantic meaning for them, at least not across the project.
* Furthermore, discussion in T-compiler meeting from 2019-06-27 led centril/pnkfelix/estebank to muse that you *can't* have a set semantic meaning across the project.
* Namely, some issues might be P-high for WG-diagnostics, but only P-medium at best for T-compiler
* (maybe this implies that once an issue is assigned to a WG, you should *remove* its T-compiler label? Would that make sense, or just complicate searches?)
Anyway, pnkfelix can only speak on what he does on behalf of the compiler team:
Seemingly, under current practice for the compiler team:
* P-high is *supposed* to mean that the bug gets checked in on at every weekly triage meeting, to try to ensure progress (and reassign the bug to someone new if there is something blocking the current assignee). The previous section outlined where that goes wrong.
* P-low means "this is going to be ignored until someone complains or it gets fixed by accident"
* P-medium means "this is something we do not want to ignore, but the organizer also doesn't want to think about it every week."
So: what should priority labels **mean**?
pnkfelix off-the-cuff proposal: Maybe instead of "P-high, P-medium, P-low", we should directly encode the intended visit frequency for each bug: Hz-weekly, Hz-monthly, Hz-yearly... (We would also need some way to slice up the sets that are visited, perhaps via modular arithmetic on the issue number, so that we could make headway each week on a deterministically predictable subset of the Hz-monthly/Hz-yearly bugs.)
* (We would of course still need to figure out *how* we're going to go about actually doing the aforementioned visits. Is it the duty of the triage organizer?)
* This system also naturally extends to e.g. Hz-daily or even Hz-hourly for the really urgent bugs that we want to see immediately addressed.
* Likewise Hz-biweekly, Hz-quarterly etc for finer-grain distinctions between the three above.
* The main "advantage" here is that it makes the expectations concrete, in the name of the label itself. Something can have high priority for the project, but that doesn't mean we're going to talk about it at every weekly triage meeting.
* (Alternatively, we could keep the current P-high/medium/low, and just *state* that they correspond to weekly/monthly/yearly visit rates.)
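
To make the "modular arithmetic on issue#" idea concrete, here is one possible sketch. It assumes a month is treated as 4 weeks and a year as 52 weeks, and that `week` is a running week counter; the function name and example issue numbers are made up. Under these assumptions every issue comes up exactly once per period, and the subset due in any given week is predictable in advance.

```python
# Assumed visit periods, in weeks, for the proposed labels.
PERIOD_WEEKS = {"Hz-weekly": 1, "Hz-monthly": 4, "Hz-yearly": 52}

def due_this_week(issues, week):
    """Return the issue numbers that should be visited in the given week.

    `issues` is an iterable of (issue_number, frequency_label) pairs.
    An issue is due when its number and the week counter agree
    modulo the label's period.
    """
    due = []
    for number, label in issues:
        period = PERIOD_WEEKS[label]
        if number % period == week % period:
            due.append(number)
    return due

# Made-up example: issue numbers are placeholders.
example = [
    (100, "Hz-weekly"),
    (101, "Hz-monthly"),
    (102, "Hz-monthly"),
    (103, "Hz-yearly"),
]
print(due_this_week(example, week=5))   # [100, 101]
print(due_this_week(example, week=51))  # [100, 103]
```

If, hypothetically, today's 129 P-medium issues all became Hz-monthly and their numbers were spread evenly, this would put roughly 32 of them on each week's list, which hints at why the question of who actually performs the visits matters.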
----
Another related issue is that right now, bugs in code that relies on `#![feature(gates)]` tend to be given low to medium priority, because:
* feature-gated things are often known to be in flux and so it's not worth spending the whole T-compiler team's time trying to address them.
* feature-gated things only affect the subset of our user base that is willing to use nightly and thus presumably put up with (work around or contribute fixes for) bugs in the compiler.
However, there is an important exception that we are not dealing with terribly well: feature-gated things that are actually on the short-list for near-term stabilization.
* Right now, there is not much of a safe-guard against pnkfelix assigning P-medium to such bugs (apart from Centril noticing that pnkfelix has done so and making an alarmed comment).
* So, what can we do to ensure that we correctly prioritize such things? Should we just, as part of (pre)triage, review the set of features on the aforementioned short-list, so that it is on our mind during issue prioritization?
(niko does not have a concrete proposal yet. I had hoped we might try to develop one, but maybe the meeting should wait until we have more details to present.)
(Centril's concrete proposal: Add a label for things that need feature gates and cannot happen on stable; T-release also has decided to add F-* labels per feature gate)
------
Centril believes there's another important exception:
- Soundness holes.
We have a lot of them, some exposed on stable, and in Centril's view they do not get resolved in a timely fashion.
## Can more things be automated?
The first step might be just to finish documenting what the current processes even are.
* pnkfelix has been trying to jot down the pre-triage process over on https://github.com/rust-lang/rust/issues/54818
* (We want *something* at that github issue, just to easily allow links from the Zulip topic for each meeting. But arguably a lot of that documentation should not live on #54818 anymore, but rather in some document on https://github.com/rust-lang/compiler-team/).
Relevant Zulip quote from niko on the matter of documentation here:
> we have some kind of "heuristics" we apply but I don't think we've ever written them down (similar to the one you mentioned, i.e., does it affect stable code, etc). Similarly, we seem to value preventing new regressions higher than fixing old ones (rightly or wrongly, hard to say, but there's a logic to it:)
## Can we do a better job at getting help in bisection etc?
What are the problems with rust-bisect?
* Can newcomers use it readily enough (as an easy way to assist the project)?
* Could we make it unroll the rollups (if only via local rustc builds)?
* Relatedly, can we help people help us with more minimization of ICEs and other bugs? // Centril
## How to assign the unassigned
Can we provide guidance for assigning the unassigned issues, beyond just waiting for volunteers?
pnkfelix is hesitant to adopt a system like "round-robin assignment" that he's seen used elsewhere, since the Rust project is largely supported from volunteer effort, and so it is probably a bad idea to assume anything about how volunteers can contribute.
proposal: if a bug has a topic area (e.g. WG-traits), and has not gotten an assignee via asynchronous "work-stealing", then the meeting organizer assigns to the head of the relevant working group (with intention that they delegate it either during or before that WG's next meeting).
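
The proposal above amounts to a small fallback rule, sketched here under some assumptions: the record fields, the function name and the lead names are hypothetical placeholders, and an issue with no working-group label simply stays unassigned (i.e. still waits for a volunteer).

```python
def propose_assignee(issue, wg_leads):
    """Return who the organizer would assign, or None to keep waiting.

    `issue` is a dict with hypothetical keys "assignee" (str or None)
    and "labels" (list of label strings); `wg_leads` maps a WG label
    to the head of that working group.
    """
    if issue["assignee"] is not None:
        # Someone already picked it up via asynchronous "work-stealing".
        return issue["assignee"]
    for label in issue["labels"]:
        if label in wg_leads:
            # Hand it to the WG head, who is expected to delegate it.
            return wg_leads[label]
    return None

# Placeholder names, not real people.
leads = {"WG-traits": "traits-lead", "WG-diagnostics": "diagnostics-lead"}
bug = {"assignee": None, "labels": ["T-compiler", "WG-traits"]}
print(propose_assignee(bug, leads))  # traits-lead
```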
## Developer engagement during triage meeting itself
Do we want more engagement during the triage meeting itself?
Context: pnkfelix sometimes feels like he's just talking into a black hole during the triage meeting. There isn't always much discussion that follows the items that arise.
* This is to be expected, especially if the meeting topics are being delivered on-the-fly, and so people (including pnkfelix) need time to read the linked issues and catch up with the comment threads there.
* Of course if the issue owner is present at the meeting, one might expect them to provide a summary of the issue's progression and its current status.
So, you'd *think* pnkfelix is asking everyone to chime in during the meeting. But at the same time, such chiming in risks devolving into deep discussion of small details of bugs.
* Is that okay?
* pnkfelix often reacts to it by saying "we don't have time for this, we just need to get the bug assigned" (or its status updated, etc).
* But the issue isn't, or shouldn't be, "we don't have time for this". The issue **is**: Is this a good use of the time slot we've allotted for *synchronous* communication between the T-compiler team members?
## We are having trouble keeping up with review queue
* Would cycling reviewers help?
* Does the review queue itself need to be reviewed weekly as part of triage?
* Niko points out that some PRs don't get reviewed because they are blocked on design discussion that is meant to happen at the Friday meeting(s).
# Challenges / Key design questions
Challenge: Do we have enough experienced T-compiler members to tackle the set of bugs?
* Developers are often enthusiastic about scratching their own itch: either a feature they desire, or a bug that is blocking their own work. And this preference presumably is even more important for a volunteer work force like ours.
* So are we even focusing our effort in the right place? Or should we be trying to figure out how to motivate new compiler developers, and/or how to provide compensation for people as a way to side-step the "itch-scratching" issue...