# rust-analyzer/rustc library-ification plans
- we should move towards a rust-analyzer-like architecture, because it works
- moving one component at a time is more effective than slowly moving the monolith as a whole (unclear if true: introducing library boundaries has costs)
# General idea
- Create shared libraries usable by both rustc and rust-analyzer
- Publish these libraries on crates.io
- Emphasis on selecting good boundaries, defining the right external interface, and documenting proper behavior
# Top questions we'd like to discuss/answer today
- Are we on board with pursuing "library-ification"?
- In particular, we plan to start making changes to rustc itself.
- What is the first target of library-ification?
- name resolution
- trait system
- "type system" (type unification / representation / inference logic)
- Or are there other good candidates?
- As time allows, consider some of the technical details of the above targets, particularly what kind of "boundaries" we can draw.
# What makes a good "library" for extraction?
# How to manage libraries?
- Monorepo vs many repos?
- A monorepo makes it easier to coordinate changes and to audit the effects of changes to crates like `chalk` that also interact with other bits of code.
- If we use multiple repos, we might want to carefully design the boundaries to collect together code that must be audited together.
- But multiple repos allow us to avoid centralized bors queues and may help collect issues in a more focused fashion.
- On the other hand, it would require us to make better use of org-wide searches and the like.
- Making sure that the build is simple (ideally, vanilla `cargo test`)
- Making sure bits are testable in isolation:
- integration tests defeat library-ification
- unit tests need more maintenance and are somewhat less reliable
- carefully thought out, robust API boundaries are a must
- are we ready to commit to those?
- Avoid polyrepo unless there is a clearly definable boundary
- unit tests should be based only on interfaces considered highly stable
- chalk unit tests are (mostly) a subset of Rust syntax, though not in all cases (should be improved)
- Particularly when extracting code from rustc, it makes sense to start as a library within the rust repo
- following example of lexer
# Initial candidates
## Lexer and Parser
- @matklad has extracted the lexer.
- begin exploring work on extracting a shared parser
## Macro expansion and name resolution
- high priority for rust-analyzer: unlocks errors & fixes with no false positives
- also seems hard to extract, as the boundary is large
- no work has started; the current heuristic-ish code in rust-analyzer is a bit of a mess
## Traits and type manipulation
### Current status in rustc
We've integrated chalk into rustc behind the `-Zchalk` flag, but only in a very narrow capacity, and it's not extensively tested. In rustc, the only shared code is the "logic engine" (the `chalk-engine` crate). This means that rustc re-implements the rules for converting from Rust impl/trait declarations into lower-level logic rules. It also means that rustc implements a [trait](https://docs.rs/chalk-engine/0.9.0/chalk_engine/context/trait.Context.html) defining, for example, the representation of types, the [method for unifying them](https://docs.rs/chalk-engine/0.9.0/chalk_engine/context/trait.UnificationOps.html#tymethod.unify_parameters), and so forth. Basically a lot of the interesting stuff.
Already *this* integration exposed some interesting problems. The most notable is that chalk's current evaluation model relies on being able to work part-way through a problem and stop, and this was incompatible with the old `'tcx`/`'gcx` lifetime scheme. This required us to refactor chalk greatly to introduce canonicalization whenever it "paused" processing. Now that we've removed that scheme, in fact, the `Context` traits for chalk-engine can be simplified (and the entire engine made correspondingly more efficient).
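The canonicalization idea above can be illustrated with a toy sketch. Everything here (the `Ty` enum, `canonicalize`) is made up for illustration and is not chalk's actual code: before "pausing", each distinct inference variable in a goal is replaced by a numbered canonical variable, so the saved state no longer borrows the inference context.

```rust
use std::collections::HashMap;

// Toy type representation: a real one would be interned and richer.
#[derive(Clone, PartialEq, Eq, Debug)]
enum Ty {
    Int,
    Infer(u32),     // inference variable owned by some inference context
    Canonical(u32), // context-independent placeholder
}

/// Replace each distinct inference variable with a canonical variable,
/// numbering them in order of first appearance. Returns the rewritten
/// types plus the number of canonical variables introduced.
fn canonicalize(tys: &[Ty]) -> (Vec<Ty>, usize) {
    let mut map: HashMap<u32, u32> = HashMap::new();
    let out = tys
        .iter()
        .map(|ty| match ty {
            Ty::Infer(v) => {
                let next = map.len() as u32;
                let c = *map.entry(*v).or_insert(next);
                Ty::Canonical(c)
            }
            other => other.clone(),
        })
        .collect();
    (out, map.len())
}

fn main() {
    // ?7 and ?3 become canonical vars 0 and 1; repeats share a number.
    let goal = [Ty::Infer(7), Ty::Int, Ty::Infer(3), Ty::Infer(7)];
    let (canonical, vars) = canonicalize(&goal);
    assert_eq!(
        canonical,
        [Ty::Canonical(0), Ty::Int, Ty::Canonical(1), Ty::Canonical(0)]
    );
    assert_eq!(vars, 2);
}
```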
### Current status in rust-analyzer
rust-analyzer went further and integrated the chalk "solver" (the `chalk-solve` crate). Currently, this means that rust-analyzer implements a [trait](https://github.com/rust-lang/chalk/blob/b4a6b655578ee35b1b3f6b8579636269cf3b0b1a/chalk-solve/src/lib.rs#L17) that answers questions about the Rust program. For example, ["give me information about this impl"](https://github.com/rust-lang/chalk/blob/b4a6b655578ee35b1b3f6b8579636269cf3b0b1a/chalk-solve/src/lib.rs#L31-L32) or ["what are the impls for this trait"](https://github.com/rust-lang/chalk/blob/b4a6b655578ee35b1b3f6b8579636269cf3b0b1a/chalk-solve/src/lib.rs#L34-L39).
Implementing this trait currently works by having rust-analyzer convert from its internal data structures to the data structures exposed by the chalk-rust-ir crate. These data structures are similar-ish to rustc's HIR, but very minimal, capturing only what is needed to do trait processing (method bodies, for example, are excluded). They are also pretty terrible, as they were designed more as a proof-of-concept test harness, and could definitely use some care and attention.
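A rough sketch of the *shape* of this interface, assuming illustrative names (`ProgramDatabase`, `ImplDatum`, `ToyDatabase`) rather than chalk-solve's real API: the solver pulls facts about the program from the host (rustc or rust-analyzer) on demand, instead of being handed a whole lowered program up front.

```rust
use std::collections::HashMap;

#[derive(Clone, Copy, PartialEq, Eq, Hash, Debug)]
struct TraitId(u32);

#[derive(Clone, Copy, PartialEq, Eq, Hash, PartialOrd, Ord, Debug)]
struct ImplId(u32);

// Minimal impl metadata; a real version would carry generics,
// where-clauses, associated type values, etc.
#[derive(Clone, Debug)]
struct ImplDatum {
    trait_id: TraitId,
    self_ty: String, // stand-in for a real type representation
}

/// The host answers the solver's questions about the program.
trait ProgramDatabase {
    /// "give me information about this impl"
    fn impl_datum(&self, id: ImplId) -> ImplDatum;
    /// "what are the impls for this trait"
    fn impls_for_trait(&self, trait_id: TraitId) -> Vec<ImplId>;
}

struct ToyDatabase {
    impls: HashMap<ImplId, ImplDatum>,
}

impl ProgramDatabase for ToyDatabase {
    fn impl_datum(&self, id: ImplId) -> ImplDatum {
        self.impls[&id].clone()
    }
    fn impls_for_trait(&self, trait_id: TraitId) -> Vec<ImplId> {
        let mut ids: Vec<ImplId> = self
            .impls
            .iter()
            .filter(|(_, datum)| datum.trait_id == trait_id)
            .map(|(&id, _)| id)
            .collect();
        ids.sort();
        ids
    }
}

fn main() {
    let mut impls = HashMap::new();
    impls.insert(ImplId(0), ImplDatum { trait_id: TraitId(0), self_ty: "u32".into() });
    impls.insert(ImplId(1), ImplDatum { trait_id: TraitId(1), self_ty: "u32".into() });
    impls.insert(ImplId(2), ImplDatum { trait_id: TraitId(0), self_ty: "String".into() });
    let db = ToyDatabase { impls };
    assert_eq!(db.impls_for_trait(TraitId(0)), vec![ImplId(0), ImplId(2)]);
    assert_eq!(db.impl_datum(ImplId(1)).self_ty, "u32");
}
```

The key design point is the direction of the dependency: the host implements the trait, so the solver stays ignorant of rustc's or rust-analyzer's internal representations.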
This integration *too* has exposed some interesting questions. For example, the way chalk handles lazy normalization is [producing more ambiguity than we'd like](https://github.com/rust-lang/chalk/issues/234), and we'd like to refine [the interface to the rest of the compiler](https://github.com/rust-lang/chalk/issues/241) to be more "targeted", for better IDE compatibility.
### Current thoughts
- Current boundary of `chalk-engine` in rustc feels too small
- Example: inputs are not rust programs but an internal IR
- Rust-analyzer boundary feels better, but there is still overlap
- Both rust-analyzer and chalk have representations of types and their own unification logic
- Possible boundary:
- The "type constraint solver system", meaning that it would
- ask the surrounding environment for details of traits/impls
- lowered internally to Chalk's representation
- export a representation of types
- export a type inference / unification table (also used internally during trait solving)
- similar to `rustc::infer` today
- be able to solve "goals" expressed in terms of rust terms
- lowered internally to Chalk's representation
- in rustc, this boundary is roughly
- some portion of `rustc_typeck::collect`
- Notably **excluded** is the "type checker", meaning the code which generates the constraints
- that is, we do not include `rustc_typeck::check`
- this is because this code is intimately tied to the HIR
- there *are* two-way interactions, but we hope to use chalk's "suggestions" to manage those
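The "type inference / unification table" piece of the proposed boundary can be sketched in miniature. This is a toy, assuming a made-up `Ty` enum and `UnificationTable` type; the real export would be closer to `rustc::infer`'s machinery, with interned types and snapshot/rollback support.

```rust
// Toy type representation with inference variables.
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
enum Ty {
    Int,
    Bool,
    Var(usize), // inference variable
}

struct UnificationTable {
    // values[v] is what variable v is bound to; an unbound variable
    // points at itself.
    values: Vec<Ty>,
}

impl UnificationTable {
    fn new() -> Self {
        UnificationTable { values: Vec::new() }
    }

    fn new_var(&mut self) -> Ty {
        let v = self.values.len();
        self.values.push(Ty::Var(v)); // initially unbound
        Ty::Var(v)
    }

    /// Follow variable bindings until reaching a concrete type or an
    /// unbound variable.
    fn resolve(&self, mut ty: Ty) -> Ty {
        while let Ty::Var(v) = ty {
            if self.values[v] == ty {
                return ty; // unbound
            }
            ty = self.values[v];
        }
        ty
    }

    /// Unify two types, recording variable bindings; returns false on
    /// a type mismatch.
    fn unify(&mut self, a: Ty, b: Ty) -> bool {
        let (a, b) = (self.resolve(a), self.resolve(b));
        match (a, b) {
            (Ty::Var(v), other) | (other, Ty::Var(v)) => {
                self.values[v] = other;
                true
            }
            (x, y) => x == y,
        }
    }
}

fn main() {
    let mut table = UnificationTable::new();
    let t0 = table.new_var();
    let t1 = table.new_var();
    assert!(table.unify(t0, t1));        // ?0 := ?1
    assert!(table.unify(t1, Ty::Int));   // ?1 := Int
    assert_eq!(table.resolve(t0), Ty::Int);
    assert!(!table.unify(t0, Ty::Bool)); // Int vs Bool: mismatch
}
```

The point of putting this *inside* the boundary is visible even in the toy: trait solving needs to bind and resolve the same variables the type checker created, so the table and the solver have to share one implementation.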
- Why this boundary?
- Can be expressed largely in terms of Rust programs
- Unification, variable instantiation, and subtyping are pretty tightly tied to trait solving
- especially when you consider associated types
- Similar logic is presently duplicated 3 times:
- chalk has a copy
- rustc has a copy
- rust-analyzer has a copy
- Useful on its own
- e.g., the "core part" of many RFCs, like specialization and GATs, can be expressed entirely in terms of this subset
- (obviously there are surrounding questions of syntax)
- But there are some challenging questions
- What should our type representation look like?
- chalk has a more minimal approach, rustc a more maximal one
- I think chalk's is closer to correct, but it's also incomplete at present
- Do we want an interning/arena setup like rustc has?
- Not obviously a good fit for a longer-lived process like rust-analyzer
- How to integrate chalk's "persistent tables" with the incremental query system?
- Also arises with current chalk integration
- Also arises with MIR optimizations -- you want to be able to mutate the MIR for a function in place over time
- The "MIR box" idea that [simulacrum and I discussed](https://rust-lang.zulipchat.com/#narrow/stream/187679-t-compiler.2Fwg-parallel-rustc/topic/mir.20stealing) might be a good fit
- What happens with const generics and evaluation?
- I don't think we want "miri" to be part of this box
- We probably *will* want to share the definition of a "value"
- Probably something we can push into the interface with surrounding code
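On the interning question above: rustc arena-allocates and interns types so that equal types are pointer-equal and comparisons are cheap, but the arena's lifetime is awkward for a long-lived process. A minimal sketch of the alternative, using `Rc` plus a hash map instead of arena lifetimes (all names here are illustrative):

```rust
use std::collections::HashMap;
use std::rc::Rc;

// Toy type kinds; `Ref` shows that interned types can nest.
#[derive(Clone, PartialEq, Eq, Hash, Debug)]
enum TyKind {
    Int,
    Ref(Ty),
}

// An interned type is just a shared pointer to its kind.
type Ty = Rc<TyKind>;

#[derive(Default)]
struct Interner {
    map: HashMap<TyKind, Ty>,
}

impl Interner {
    /// Return the unique `Ty` for this kind, allocating it on first use.
    fn intern(&mut self, kind: TyKind) -> Ty {
        if let Some(ty) = self.map.get(&kind) {
            return ty.clone();
        }
        let ty: Ty = Rc::new(kind.clone());
        self.map.insert(kind, ty.clone());
        ty
    }
}

fn main() {
    let mut interner = Interner::default();
    let int1 = interner.intern(TyKind::Int);
    let int2 = interner.intern(TyKind::Int);
    // Interned types are pointer-equal, so equality is a pointer compare.
    assert!(Rc::ptr_eq(&int1, &int2));
    let r1 = interner.intern(TyKind::Ref(int1.clone()));
    let r2 = interner.intern(TyKind::Ref(int2));
    assert!(Rc::ptr_eq(&r1, &r2));
}
```

This trades rustc-style lifetime-tagged zero-cost references for reference counting, which is roughly the shape of trade-off a shared library would have to pick (or abstract over via an interner trait).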