Compiler Team Planning for Rust All Hands 2019

Information about the All Hands itself

  • February 4-8 in Berlin
  • The event is for team members + select others
  • If you've not received an invitation and are interested in coming, feel free to reach out to @nikomatsakis

P1 problems

  • compilation time
    • better compilation time investigation tools/analysis
    • this should work in cooperation with cargo, so we can see the whole story
  • rls, completions
  • “too hard to do anything”: technical debt
    • the trait system in particular, and to a lesser extent the type checker, is straining to support newer/desired features (e.g., GATs, normalization bugs, coercions)
  • “too hard to find people to do things”: organizational debt
    • hard to learn, monolithic architecture
    • poorly documented
    • long compilation times, memory requirements

Shorter list

  • Radical new ideas
    • pre-compiling things from crates.io and downloading compiled artifacts
    • compiling for "all configurations" for rustdoc (thread)
  • Technical areas where design is needed
    • End-to-end queries
    • MIR 2.0 and MIR optimizations
    • Parallel execution (rayon fork)
    • Trait solving / chalk integration
    • NLL / Polonius ?
    • RLS
    • rustdoc?
    • Incremental compilation
    • Diagnostics, teach, etc
    • Telemetry
  • Technical areas where bus factor is small and/or ripe for extraction into libraries
    • Name resolution
    • Macro expansion
    • Type checker
    • Debuginfo
  • Interesting things to learn about that are not rustc (yet?)
    • cranelift
    • rust-analyzer
    • salsa? (related to incremental compilation)
    • GLL and the wg-grammar work
  • Other areas for discussion
    • rustc-guide
    • Bors queue and turnaround time: can we do anything to improve it? (probably more of an infra topic)
    • Spreading maintenance and review burden
    • intermediate steps between newbie and full member
    • mapping out who the experts are, and in which areas
    • const eval
    • future RFC process

Possible discussion topics

We are currently brainstorming. Feel free to add thoughts of your own.

  • MIR 2.0 (@nikomatsakis)
    • something that supports efficient updates
    • deaggregation (let x = (a, b); -> let x_0 = a; let x_1 = b;)
      • NOTE(eddyb) I'm actually not sure whether this is SROA (current example, added by @oli-obk at my suggestion), or removing Rvalue::Aggregate (next item). @nikomatsakis, can you clarify?
    • remove Rvalue::Aggregate (let x = (a, b); -> x.0 = a; x.1 = b;)
    • “flat places” (Place not being recursive anymore, but having a (potentially empty) stack of projections) (https://github.com/rust-lang/rust/issues/52708)
    • turn function call, drop and assert terminators into statements (cc https://github.com/rust-lang/rust/issues/39685)
    • general infrastructure (e.g. aliasing analysis) for optimisations and sharing of the facts/knowledge
    • perform type inference (coercions & methods) on MIR (using terminators for coercion sites; can try a constraint-based system)
      • I think a separate "typecheck constraint IR" would be a better choice for type-checking. Won't even need to know much about the CFG (only "is reachable").
    • extended basic blocks (https://github.com/rust-lang/rust/issues/39685)
    • remove regions (or make parametric over region type), encoding+enforcing region erasure (https://github.com/rust-lang/rust/pull/56638)
  • Long-term plan for RLS-compiler integration
  • Better RLS integration, technical aspects
    • related: Extend incr. comp. model to support partial updates (i.e. update caches even if compilation fails) - a.k.a "revisions"
    • Idea: sketch out queries needed to model rls-analysis, shop around
  • Bors queue pain, RLS breakage: can we manage things better?
    • Chasing down breakage in dependent projects: does this really happen?
    • Long and unpredictable merge times are very frustrating
      • Maybe a different bors prioritization scheme would help?
    • Can be particularly hard to juggle longer lived projects landing in stages
  • Breaking compiler into independent pieces:
    • Things like miri, polonius, chalk — can/should we get more ambitious?
    • How should we set up the repository for this?
  • Improving “raw” compilation performance (see also this post):
    • End-to-end queries
    • Multicrate compilation session
    • MIR-only rlibs
    • Polymorphization (that is: try to merge monomorphizations of a function if they basically result in the same machine code)
    • Erasing regions from types
    • MIR-level optimizations (inlining, copy propagation)
    • Parallel queries
    • move some queries from item- to module-level for reduced overhead (see e.g.: https://github.com/rust-lang/rust/pull/51487)
    • reduce data dependencies of expensive queries (e.g. MIR) on flaky information (e.g. spans) (e.g. https://github.com/rust-lang/rust/issues/47389)
  • Technical debt candidates
    • type checker
    • diagnostics extensions cluttering regular code (e.g., suggestions, or easier-to-understand diagnostics, that add large amounts of code to logic otherwise unrelated to diagnostics)
    • query/dep-node definition macros
  • Improving the contributor experience
    • enable RLS?
    • debugger?
    • test suite improvements/compiletest
      • ∃ code coverage tools for rustc?
  • Diagnostics
    • can we better isolate diagnostic text from the code?
    • --teach mode
  • Telemetry
    • Can we get (opt-in) automatic feedback about how rustc is used?
  • Support PGO (profile-guided optimization)
    • some support already exists in the compiler (but has been broken since the last LLVM upgrade)
    • improves runtime performance (mw might work on this because Firefox wants to have it)
    • we could also PGO the compiler based on data from the bootstrap and/or the test suite
    • Related: Use section/symbol ordering files for compiling rustc (https://github.com/rust-lang/rust/issues/50655)
  • Debugger and how to make Tom less lonely
  • const eval
    • related to future RFC process
    • loads going on in https://github.com/rust-rfcs/const-eval/
    • seemingly many unrelated features
      • control flow
      • unsafe code
        • raw pointers
        • transmutation
      • const safety
      • loops
      • traits with const methods
      • impl const Trait for Type
  • future RFC process
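To make the two tuple examples under MIR 2.0 easier to compare (the distinction raised in the NOTE), here is a rough source-level analogy. It is only a sketch: the function names are hypothetical, and surface Rust is used to imitate what would really be MIR statements.

```rust
/// Original form: the pair is built as one aggregate value.
fn aggregate(a: i32, b: i32) -> i32 {
    let x = (a, b);
    x.0 + x.1
}

/// "Deaggregation" as written in the list (SROA-like): the tuple local
/// disappears and is replaced by one scalar local per field.
fn scalar_replaced(a: i32, b: i32) -> i32 {
    let x_0 = a;
    let x_1 = b;
    x_0 + x_1
}

/// "Remove Rvalue::Aggregate" as written in the list: `x` is still a tuple,
/// but it is written field by field instead of in one aggregate assignment.
/// Assumption for illustration: surface Rust requires `x` to be initialized
/// before its fields are assigned, hence the placeholder zeros.
fn fieldwise(a: i32, b: i32) -> i32 {
    let mut x = (0, 0);
    x.0 = a;
    x.1 = b;
    x.0 + x.1
}
```

All three functions compute the same result; they differ only in whether the tuple exists as a local and how it gets initialized.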
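The “flat places” item under MIR 2.0 can be pictured with a small model. The types below are hypothetical, heavily simplified stand-ins (not rustc's actual definitions); they only show the difference between a recursive place and a base local with a stack of projections.

```rust
// Hypothetical, simplified projection elements.
#[derive(Debug, Clone, PartialEq)]
enum Projection {
    Deref,
    Field(usize),
}

// Recursive form: every projection wraps the place it projects from.
#[derive(Debug, Clone, PartialEq)]
enum RecPlace {
    Local(usize),
    Projection(Box<RecPlace>, Projection),
}

// Flat form: a base local plus a (potentially empty) stack of projections,
// stored in application order (innermost first).
#[derive(Debug, Clone, PartialEq)]
struct FlatPlace {
    local: usize,
    projections: Vec<Projection>,
}

// Convert the recursive form into the flat form by walking from the
// outermost projection down to the base local.
fn flatten(place: &RecPlace) -> FlatPlace {
    let mut projections = Vec::new();
    let mut cur = place;
    loop {
        match cur {
            RecPlace::Local(local) => {
                // Projections were collected outermost-first; reverse them.
                projections.reverse();
                return FlatPlace { local: *local, projections };
            }
            RecPlace::Projection(base, elem) => {
                projections.push(elem.clone());
                cur = &**base;
            }
        }
    }
}
```

In this model, a place like `(*_1).0` becomes local `1` with the projection stack `[Deref, Field(0)]`, which can be inspected or extended without traversing nested boxes.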
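For the polymorphization item, the pattern below is the manual, source-level version of the same idea: keep the generic part tiny and share one non-generic body. It is an illustration with hypothetical names, not a description of how the compiler would implement the merging automatically.

```rust
// Generic entry point: one copy is instantiated per `T`, but each copy only
// converts its argument and forwards to the shared helper.
fn describe_len<T: AsRef<[u8]>>(input: T) -> String {
    // Non-generic helper: compiled once, no matter how many `T`s are used.
    fn inner(bytes: &[u8]) -> String {
        format!("{} bytes", bytes.len())
    }
    inner(input.as_ref())
}
```

Calling `describe_len` with a `&str`, a `String`, and a `Vec<u8>` instantiates three thin wrappers that all reuse the single `inner` body; polymorphization as described in the list would aim for a similar effect without the code being restructured by hand.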

Table of possible projects with some analysis

| Project | Implementation Cost | Compile Time Improvements | Runtime Perf Gains | Code Quality Gains | Blocked On |
|---|---|---|---|---|---|
| MIR 2.0 | ? | medium to high | ? | ? | |
| incremental save-analysis | medium | high | none | medium | |
| end-to-end queries | high | low (general) to medium (RLS) | none | ? (depends) | architecture decisions |
| multi-crate compilation | high | unknown | medium (~LTO lite) | ? (depends) | architecture decisions |
| MIR-only rlibs | medium | medium | medium (~LTO lite) | low | architecture decisions, parallel queries |
| polymorphization | medium | medium (?) | maybe some | low | |
| erase regions from types | medium to high | medium (?) | none | high | NLL taking over completely |
| MIR-inlining | medium to high | medium (?) | maybe some | none | MIR 2.0 (?) |
| MIR-copy propagation | medium (?) | maybe some | maybe some | none | MIR 2.0 (?) |
| Parallel Queries | medium (? - by now) | high | none | none | |
| module-level queries | low | low | none | none | |
| query data-dep cleanup | high | medium to high (incr. only) | none | none | architecture decisions |
| PGO | low to medium | maybe some (indirectly) | medium | none | |
| Chalk | high | maybe some (?) | none | high | Chalk being full featured? |
| Polonius | high | ? | none | high (?) | Polonius being finished? |
| Salsa | high | ? | none | high | architecture decisions |
| Type-check refactoring | high (?) | maybe some (?) | none | high (?) | erase regions from types |
| Revisions (incr. comp.) | medium to high | low to high (incr. only) | none | none | architecture decisions, Salsa (?) |
| Separate Diagnostics | medium to high | none | none | medium to high | |
| Clean up defining queries | low to medium | none | none | medium | |