# T-Testing-DevEx

## Links

* Meeting: https://meet.jit.si/t-testing-devex-team
* Calendar: https://rust-lang.github.io/calendar/testing-devex.ics

## Potential topics

- What all do people want to do?
  - Trying to capture things with json output
  - Invite rstest, gtest-rust, catch-rust, expect-test/insta people?
  - Do it more informally?
- [doctests](https://github.com/rust-lang/testing-devex-team/issues/5)
  - Invite Guillaume?
- [coverage](https://github.com/rust-lang/testing-devex-team/issues/4)
- Testing problem domains
  - Database
  - CLIs
  - Linux kernel
  - Embedded
  - GUI
  - TUI
  - End-to-end tests when there is an external system it plugs into
    - Lunatic, database plugins, etc

## 2025-07-29

Attendees: Ed, Scott, Caleb, Weihang

Agenda

- https://github.com/rust-lang/testing-devex-team/issues/10#issuecomment-3062999640
  - weihang: do the envs have flags?
    - epage: yes
  - weihang: what about other harnesses, communicating this out to them
    - epage: libtest-mimic does not have them
  - weihang: what about documenting the expected API for custom test harnesses
    - epage: haven't done so yet, expect we will as we move forward with that part of the project
  - scott: someone will yell but fine with it
    - epage: with how soft we deprecated `--nocapture`, hard for someone to scream
- https://github.com/rust-lang/rust/issues/142859
  - we're good with this
  - endorse it for T-libs-api to sign off
  - epage: will take care of this

## 2025-04-22

Attendees: Ed, Scott, Caleb

Agenda

- All hands
  - Ed and Scott
  - Topics: doctests, consolidated doc tests and MSRV
- https://rust-lang.zulipchat.com/#narrow/channel/404371-t-testing-devex/topic/crate.20ownership/with/513725057
- Check on transition out of experiment
  - Either move out or experiment

## 2025-03-25

- Should `cargo test` / test executor handle signals
  - https://rust-lang.zulipchat.com/#narrow/channel/404371-t-testing-devex/topic/meeting.202025-03-24

## 2024-11-19

Attendees: epage, weihanglo, muscraft

- FCP started on
  https://github.com/rust-lang/testing-devex-team/issues/9
- https://github.com/rust-lang/rust/issues/133073
  - epage: propose `--nocapture` -> `--no-capture`
  - epage: propose `--test-threads` -> `--jobs`
  - muscraft: `--nocap` doesn't make sense
  - muscraft: no problem as long as it's done properly
  - weihang: `cargo test -j 2 -- -j2` could be weird but fine
  - muscraft: parallel frontend has a `-Z` that uses threads?
    - muscraft: will they stabilize that name? which should we be consistent with?
  - muscraft: should each area of threading be individually controlled or under one `cargo --jobs`
  - weihang: threads for linker is generally a problem
  - epage: so `--jobs`?
  - consensus: defer `--jobs` to see what rustc does

## 2024-11-05

Attendees: epage, weihanglo, muscraft

### Agenda

- https://github.com/rust-lang/testing-devex-team/issues/9#issuecomment-2414022196
  - They want it for their patch bazel rules
  - They can't process stdout, only a file, even if a tool is doing that processing
  - As precedent, cargo previously rejected this kind of teeing
  - Their proposal is a breaking change
  - Restricting this to just `--format json` and `--format junit` feels too special
  - "Improper" tests mixing output with the json on stdout is understandable but `--logfile` would likely not work for other cases like Cargo
    - A solution for processing json output will need to be looked at when we get there
  - In general, this puts extra expectations on custom test harnesses in what they need to support
  - Maybe deprecate it?
    - In `--help` and print a message for at least pretty format
  - Seems like the focus should be on addressing Bazel's side of the issue
  - Hold FCP for the above

## 2024-09-24

Attendees: epage, weihanglo, caleb

### Agenda

doctest

* Idea: `rustdoc` emits source from doc comments, and let `cargo` execute them.
* https://github.com/rust-lang/testing-devex-team/issues/5#issuecomment-2372410448

rustconf

## 2024-09-11

Attendees: epage, muscraft, arlosi, weihanglo, rain, thomcc, David Barsky

### Agenda

JSON output

- Amazon wants feature parity with existing output
- what nextest wants to consume from it
  - start time / end time
    - Infer from when the message arrives?
      - But that could get off
    - Anything would work, including time from process start
  - Why it was skipped
    - Always enumerate all tests, maybe some static skips
    - Ignores because of slow, failing
    - Skip because of missing VM
    - nextest has an enum of 5 reasons
    - When to use description vs enum?
    - Tests not in default filter
      - Then show message to tell people how more can be run
    - CLI and non-default set
      - Do you override non-default state
    - Is CLI de-selected a Skip or Ignore?
    - nextest is really two separate enums?
    - Separate out cause (machine readable string) vs description (human readable)
      - User doesn't get control over cause, only description
    - Maybe an internally tagged enum
    - This is getting complicated, maybe we should not have this in MVP
  - Timeouts
    - Do this through bi-directional design?
      - Except jobserver?
      - rust-analyzer is fine with bidirectional
      - custom libtest problem
        - compile times
        - easy for third-party implementations
    - Process-per-test is bidirectional in terms of killing tests
    - When you hit ctrl-z it pauses timers and stops tests
      - If you want to do this with single process
    - Should timeout be runner or harness?
    - Killing threads is unsound? but libtest has unstable support for it
  - Retries for flaky tests
    - Test harness?
      - panic=abort would be dead
    - Do we want the harness to retry in the process per test or have discovery report attributes up and then the runner tells the harness not to retry
      - Ed: hoping runners and harnesses to be single digit
    - One of the problems is if the first try corrupts process state
      - But also for external fixtures
      - How much state do you try to retry?
    - Retrying again within the same process can hide a bug
    - Specialize the json message
      - Add an `attempt` field
        - Maybe not there if on the first run
      - If two messages with same name, is that sufficient?
  - Stress tests
    - Different mode of running multiple times
    - Needs to separate from `attempt` tracking
    - Generally run these stress tests in parallel
      - In harness, you can force to serialize if needed
      - But generally this is to find problems by running in parallel (e.g. a common port or file)
    - So needs a "run id" to distinguish messages from parallel runs
      - Can we use that for retries?
- what nextest wants to produce json output
- schema evolution
  - json output vs schema stabilization

doctests

## 2024-07-16

### Attendance

Weihang, Ed, Scott, David, Caleb

### Agenda

- Short-cut to stable json?
  - David: Be like rustdoc and version it, breaking it?
  - Ed: We want custom test harnesses, so this is ecosystem wide
  - Ed: Having Cargo support "last 3" would be impossible for MSRV
  - David: What if we just do this during a "preview" window?
- Schema evolution
  - https://github.com/rust-lang/libtest-next/issues/71
  - David: what LSP does is good
    - Note after the fact: Dynamic Registration is bad: https://matklad.github.io/2023/10/12/lsp-could-have-been-better.html#Dynamic-Registration
    - Some extended discussion on why it sucks (nested, complicated RPC mechanism) https://github.com/rust-lang/rust-analyzer/pull/5516#issuecomment-757534063
  - Ed: versioning of extensions?
  - David: nah, just support both and then drop it.
    Few people broken
  - Ed: meh, in the LSP model it's up to the plugin what their compatibility needs are
  - Ed: if we had runnable dev-dependencies, then versioning wouldn't be an issue
  - David: example: https://github.com/rust-lang/rust-analyzer/pull/17246
  - David: client: https://github.com/facebook/buck2/blob/main/integrations/rust-project/src/main.rs#L139-L152

## 2024-07-09

### Attendance

Weihang, Ed, Scott

### Agenda

- epage: status report on "global registration"
  - Mara's work on "externally implementable items" has raised concerns on visibility, backwards compatibility, and multiple-instances.
  - This concern has been extended to "global registration"
  - Different ideas are floating around for how to handle this, like excluding inter-crate registration with some other extensions
  - My feedback was that disallowing registration of items from other crates is ok as it leaves the door open for the future, so long as they support a way to manually wire in a registration item from a dependency
  - Similarly, on https://internals.rust-lang.org/t/blazing-fast-unlinking/21073/54, people are pushing for cargo to do dylibs by default to cut link times out of edit/build/test cycles which could be exclusive to having inter-crate registration in a meaningful way
  - Maybe useful for build script, independent of everything else
    - However, blocks reusing target builds for build scripts

## 2024-05-21

### Attendance

Weihang, Ed, Scott, Caleb

### Agenda

- Scott: https://github.com/rust-lang/rust/issues/123365
  - See also https://rust-lang.zulipchat.com/#narrow/stream/404371-t-testing-devex/topic/meeting.202024-05-21/near/439874279

logfile format discrepancy relative to --format

does addressing this constitute a bug fix or a breaking change

need to understand current usage as feature is behind a stable flag, and whether file is leveraged by humans retroactively or parsed/used (e.g. IDE/editor) then would constitute a breaking change

1.
   what is intent (historical context around addition of feature)
2. how's it being used today

https://github.com/rust-lang/rust/issues/57147

current format appears focused on simplicity, and thus conceivably parsing `test_status test_name`

could standard unix command line utilities be leveraged to send stdout content to a file

no, would cause mixed output with `cargo test` (Ed can you expand/correct this?)

options moving forward:

* deprecate the logfile feature
* add a new flag to control the format used for log file output

revisited potentially doing something sooner rather than later by adding that flag with the default maintaining the current value to avoid breaking changes while also providing the option to change the file format

consensus that while still a potential option, we want to hold off on that to honor t-libs-api desire to remove existing unstable options vs adding new ones

the file feature was originally added in:
https://github.com/rust-lang/rust/commit/5cc050b265509c19717e11e12dd785d8c73f5b11
https://github.com/rust-lang/rust/pull/2127

https://github.com/rust-lang/rust/pull/82350#discussion_r579732071

team reviewed and discussed origins and potential intent:

Looks like this was a short term fix that slipped through into Rust 1.0 and when documenting it later, Eric was hesitant because this was bespoke and unclear if anyone used it. However, this was intended as a programmatic output, so it's hard to say no one uses it.

Should we deprecate, telling people we will not expand this feature and recommend people move away from it?

- This also implies we wouldn't recommend custom test harnesses to implement `--logfile` as part of the minimum interface
  - e.g.
    junit output is something that people might want tee'ed out of `cargo test` (granted, that might be hampered by "what version")
  - Likely something to revisit inside of `cargo test` when we move reporting to `cargo test`
  - They can always use custom test harnesses to add whatever features they want

Deprecation would not eliminate the possibility of supporting future use cases (even if those happened to also involve file output)

Next Steps: caleb to draft update message for team to review, which will then be posted back on the PR

## 2024-05-14

### Attendance

Weihang, Ed, Scott, thomcc

### Agenda

- epage: RustNL
  - custom test harnesses
    - [stdout capture](https://github.com/rust-lang/testing-devex-team/issues/8), see issue for updates
    - [test registration](https://github.com/rust-lang/testing-devex-team/issues/3), see issue for updates
      - As a short term workaround, Divan has found static constructors work best
  - [json output updates](https://github.com/rust-lang/libtest-next/issues/71)
  - Got rustdoc folks in touch with Edition folks so it's tracked
- thomcc: in discussion with cargo-tarpaulin / cargo-nextest about what they need from doctests, preparing thoughts for a future meeting on how to meet their needs

## 2024-03-26

### Attendance

Weihang, Ed

### Agenda

doctests

- Crazy idea: https://github.com/rust-lang/testing-devex-team/issues/5#issuecomment-2021482312

## 2024-03-05

### Attendance

Scott, Weihang, Caleb, Ed

### Agenda

## 2024-02-20

### Attendance

Ed, Weihang

### Agenda

- json design
  - https://github.com/rust-lang/libtest-next/blob/main/DESIGN.md#json-format
  - attaching source location
    - Maybe its own event so you can have multiple (trybuild) or decide to do it during discovery (fast, no errors) or during execution (slow, errors)
  - test discovery
    - some can be upfront but some can't be until later (pytest style fixtures)
  - will need to figure out the UI and json format around this which is why experimenting with an actual implementation can help

## 2024-01-30

### Attendance

### Agenda

- Thom: Deep dive: benchmarking

## 2024-01-16

### Attendance

Ed, Scott, Thom, Weihang

### Agenda

- epage: Experiment home
  - libs does this
  - rust-lang preferred to ensure people have permissions going forward
    - publish permission is a concern
  - https://github.com/orgs/rust-lang/teams/testing-devex
  - epage: reach out to infra
- Deep dive topics?
  - What all do people want to do?
    - Trying to capture things with json output
    - Invite rstest, gtest-rust, catch-rust, expect-test/insta people?
    - Do it more informally?
  - doctests
    - Invite Guillaume?
  - coverage
  - Testing problem domains
    - Database
    - CLIs
    - Linux kernel
    - Embedded
    - GUI
    - TUI
    - End-to-end tests when there is an external system it plugs into
      - Lunatic, database plugins, etc
- Deep dive: benchmarking
  - Scheduled for next week
  - Particular topic: why is it bad to have a default analysis within cargo

Action items

- epage: reach out to infra about repo: https://rust-lang.zulipchat.com/#narrow/stream/242791-t-infra/topic/Transfering.20a.20repo.20to.20rust-lang/near/420849469, https://github.com/rust-lang/libtest-next
- epage: create eRFC: https://github.com/rust-lang/rfcs/pull/3558
- scott: add ICS link to agenda: see top
- epage: record current json format motivations: tracking in https://github.com/rust-lang/libtest-next/issues/74

## 2024-01-09

### Attendance

Ed, Scott, Thom, Weihang, Caleb

### Agenda

- Scott: Calendar invite generated from [rust-lang/calendar](https://github.com/rust-lang/calendar)

Scott: New team calendar generating option available project wide
https://github.com/rust-lang/calendar
https://rust-lang.zulipchat.com/#narrow/stream/217588-project-leads-.28public.29/topic/team.20calendars

Scott: example setting up for Cargo: https://github.com/rust-lang/calendar/pull/9

Scott: email would be needed for organizer, outstanding question to Eric regarding which to use

Scott: other outstanding question would be duration (30 vs 60 minutes)

- thomcc: should be 60 minutes
- Ed: 60
  should generally be fine, may occasionally be instances of needing to end early
- 60 minutes is fine for all

caleb: any objections?

- Scott: wants the process improved
- thomcc: wants a team @-ing to ensure we all attend

Scott: owning getting it merged

- epage: json output
  - https://internals.rust-lang.org/t/path-for-stabilizing-libtests-json-output/20163
  - https://github.com/epage/pytest-rs/blob/main/crates/libtest2-harness/src/notify/mod.rs#L27

Ed: Recap of prior meetings, and suggestion to focus on json stabilization as an initial priority.

Ed: eRFC vs experiment in separate repo, main objective is maximizing input, and iterating on design for json output.

Ed: question about getting updates into unstable output in libtest to make available for cargo

Thom: libs unlikely to have any concerns with changes to relative anything unstable

Ed: question about libs' preference for plan/nature of design

Thom: depends on size and scope, if large and significant then potentially an RFC

Ed: changes to json structure would be impactful, particularly to direct consumers and their respective users

Ed: discussions elsewhere about feature availability in IDE, when features are dependent on unstable features

Scott: Rust Rover has many conditionally functional features

Weihang: Could we use versioning?

Ed: Potentially, and this is part of the details of the evolution plan

Ed: part of prior experimentation is shifting more to cargo test from libtest

Ed: custom test harnesses can utilize json output from libtest to produce various other formats

Ed: how can we best go about experimenting to stress the json structure as part of increasing confidence that the structure will support the various needs and use cases

Scott: All for shifting more to cargo test

Scott: there's precedent for rustc emitting things for cargo/others to pick up

Thom: question if cargo can utilize the json output.

Ed: clarifying question as to whether the ask was for something underneath cargo that isn't libtest?

Ed: libs api wants to shift additional functionality away from libtest to custom test harnesses

Thom: benchmarks may not fit in this model. benchmarks being in libtest has been somewhat problematic

Ed: still want json relationship even with benchmarks as cargo will still need to consume

Ed: cargo criterion and cargo nextest would still exist as alternatives to cargo test and cargo bench and offer their own feature rich interface

Scott: (I missed this sorry)

Ed: cargo test leaks temp directories

Scott: how does sqlx do cleanup, can we look at that

Thom: cleanup is more tricky

Ed: metrics we want to be able to collect during tests, benchmarks have similar needs (e.g. icount, binary size).

Thom: that information isn't trivial. benchmarks are non IID, many statistics will produce bad results

Thom: unsure if cargo will ever be the best place for all of these

Caleb: clarifying question regarding this topic for today's meeting

Ed: intent was to have a catch up, process check-in, issue w/fcp or eRFC in the rfc repo. question on process

Thom: suspect the only thing libs api will care about is any exposed apis

Weihang: should we discuss doctests with rustdoc team for input on json output

Ed: think this is part of determining the process

Thom: eRFC makes sense, offers something to point people to. have we discussed with any external parties like nextest team?

Ed: actively discussing with Rain

Ed: any concerns with eRFC

Scott: question around difference between RFC and eRFC

Ed: eRFC are more rare but coming back around.
somewhat like an MCP, recent example is cargo script

- example: https://github.com/rust-lang/rfcs/pull/3424

Scott: announces intent without any binding promises

Thom: no objections

Ed: perhaps helpful to have 1:1 with Thom to dive in to benchmarks

Ed: going into cargo criterion and incorporating into experiments

Thom: issues with distributed slice approach

Ed: discussed this with libs api team, answer was "no, we want less in libtest"

Thom: lld has documentation staking claim to being able to rip out. static constructors work everywhere except wasm

Ed: end goal is json output, and using custom harness to achieve that so that should be next focus area

Ed: good overview of distributed slice

Distributed slice: https://github.com/dtolnay/linkme#distributed-slice

Thom: good idea, but implementations aren't necessarily there and doesn't always work (e.g. on Windows)

Ed: there was a [previous eRFC](https://github.com/rust-lang/rfcs/pull/2318) for custom test harnesses. was an extension of rustc's special relationship with libtest as opposed to language level, was an overly restricted design. if harness = false then everything comes from the library

Thom: not a fan of that rfc either, and was part of what led to libtest becoming somewhat unmaintained

Decision: eRFC

Action Item: Ed, proceed with eRFC once internals conversation resolves and have deeper discussion with Thom on benches and statistics

- epage: Future topics?
  - How much do we want to plan further ahead while things slowly move forward with json?

## 2023-12-19

### Attendance

Ed, Scott, Weihang, Caleb

### Agenda

* Agenda consent
* Discuss future meeting logistics
  * forum (async & zulip, sync & jitsi/zoom/etc.)
  * note taker role (today, ongoing)
  * schedule
    - https://lettucemeet.com/l/23Ma7
  * notes (hackmd, team repo, etc.)
* Discuss high level goals and/or initial focus areas
  * epage: some prior thoughts at https://epage.github.io/blog/2023/06/iterating-on-test/
  * epage: backlog: https://github.com/orgs/rust-lang/projects/39

### Minutes

#### Meeting Logistics

Ed: Propose sync, will be more conducive to initial activities

Weihang: what does async meeting look like?

- Just over text, like zulip
- Still mostly sync

Caleb: any objections to call?

Decision made: jitsi-type synchronous meeting

#### Note taking

Caleb: wants a record

- Recording means people might act different / not be as open
- So capture minutes

Decision made: keep minutes

Caleb: facilitator and note taker doesn't work well

Scott: Cargo is more of a group effort, not sure what a primary note taker role would look like

Ed: Cargo: Facilitator tends to drive notes, others assist

- Helpful when we step in and help out when the note taker needs to speak, so role overlap doesn't matter as much

Ed: Cargo team captures notes within the agenda, small but effective practice in Cargo team's experience

Caleb: location

Scott: Captured in hackmd and then copied out, don't have time to work on it though

Ed: Real time co-editing the most important part. Still need to address the question of publishing and hackmd's length limit.

Ed: Experimenting with role of "historian", going back through content and summarizing + publishing. Not likely to be feasible for devex in immediate future

Weihang: Hackmd has publish to github feature but seems broken

Weihang: hackmd has a github sync feature

- https://hackmd.io/c/tutorials/%2Fs%2Flink-with-github

Decision: hackmd, maybe copy out as needed and do more community outreach as we feel we need

#### Meeting Times

Caleb: *recorded in zulip for timezone adjustments*

Decision: Tuesday afternoon

Ed: calendar invite?

- Make it easier to work around time logistics, make sure people have right links

Decision: Caleb will send invites with personal email to our github emails

Next meeting on Jan 2nd or 9th

- Scott: no idea what I'm doing that far ahead to know if we should cancel

#### Vision, Goals, Roadmap or Low-hanging fruit

Ed: Unlikely there are any low-hanging items

Ed: libs-api inclined to freeze libtest, in favor of broader proposal

Ed: Stabilizing json output would probably be the closest such candidate. Doing this would potentially create some low-hanging items and unblock work

Ed: Custom test harness would/should move in parallel, would require language extension. Close relationship with json output and mutually inform each other

Scott: Agree, similarity to diagnostics

Caleb: what's permissible to evolve with json output without violating stability guarantee

Ed: libtest existing json output is unstable, so could be fully evolved but libs-api may want a transition plan to minimize impact

Ed: evolution

- versioning (both in asking for a version and in reporting version)
- add new fields, new message types
- experiment with what we need to get idea of what will need to support

Weihang: How to integrate with doctest is another big topic. Can't run doctests along with other suites, ergonomics issue as it has to be executed separately. Performance issues

Weihang: Ed shared idea about integrating with rustdoc and parallel opportunities

Ed: haven't spent as much time digging in on the doctest, but agree as a future topic of investigation

- once we have json, have cargo speak to itself in those terms so it looks to the user like libtest
- could then explore actually using libtest for doctests and treating this as normal tests

Caleb: sounds like our focus can be on json out, something to explore further
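
Illustration (added after the fact, not something discussed in the meeting): a minimal sketch of the evolution ideas Ed listed above, namely asking for a version, reporting a version, and tolerating new message types. Every name here (`Event`, `negotiate`, `classify`, `render_started`) and the field names `version`, `event`, and `name` are hypothetical and do not reflect libtest's actual json schema.

```rust
/// Hypothetical event kinds for illustration; this is not libtest's actual schema.
#[derive(Debug, PartialEq)]
enum Event {
    Started { name: String },
    Finished { name: String, passed: bool },
    /// A message type added by a newer harness that this consumer predates.
    Unknown { kind: String },
}

/// "Asking for a version": the runner requests a format version and the
/// harness answers with the newest one it can emit that is not newer than the
/// request, or `None` if the request is older than anything it still supports.
fn negotiate(requested: u32, oldest: u32, newest: u32) -> Option<u32> {
    if requested < oldest {
        None
    } else {
        Some(requested.min(newest))
    }
}

/// Map a message's kind tag to an event, keeping unrecognized kinds instead of
/// failing, so that adding a new message type stays an additive change.
fn classify(kind: &str, name: &str, passed: bool) -> Event {
    match kind {
        "started" => Event::Started { name: name.to_string() },
        "finished" => Event::Finished { name: name.to_string(), passed },
        other => Event::Unknown { kind: other.to_string() },
    }
}

/// "Reporting version": every emitted line carries the negotiated version.
/// `{:?}` stands in for JSON string escaping, which only holds for simple names.
fn render_started(version: u32, name: &str) -> String {
    format!(r#"{{"version":{},"event":"started","name":{:?}}}"#, version, name)
}
```

With this shape, a runner asking for version 3 from a harness that supports 1 through 2 would be answered with 2, and a consumer that sees an unfamiliar kind such as `retry` keeps it as `Event::Unknown` rather than erroring.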