# (draft) Polonius roadmap details
## TL;DR
This document lays out a roadmap to get Polonius on stable by Rust 2024. It identifies some high-level milestones and summarizes the key goals.
## Background on polonius
Like chalk, "polonius" refers to several things. It is a new formulation of the borrow check, and also the name of a specific project that implemented that analysis using datalog. Our current plan does not make use of that datalog-based implementation; instead, it focuses on reimplementing polonius within rustc.
The motivating example for polonius is the so-called ["Problem Case #3: conditional control flow across functions"](https://github.com/rust-lang/rfcs/blob/master/text/2094-nll.md#problem-case-3-conditional-control-flow-across-functions): returning a reference out of a function from one arm of a conditional.
```rust=
use std::collections::HashMap;
use std::hash::Hash;

fn get_default<'r, K: Hash + Eq + Copy, V: Default>(
    map: &'r mut HashMap<K, V>,
    key: K,
) -> &'r mut V {
    match map.get_mut(&key) { // -------------+ 'r
        Some(value) => value,              // |
        None => {                          // |
            map.insert(key, V::default()); // |
            //  ^~~~~~ ERROR               // |
            map.get_mut(&key).unwrap()     // |
        }                                  // |
    }                                      // |
}                                          // v
```
Returning the mutable reference `value` in the `Some` path requires the mutable loan on `map` to live until the end of the function. This prevents mutation in the `None` path even though no mutable loan on `map` would exist there in the first place.
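As a side note (not part of the polonius work itself): the usual way to satisfy today's borrow checker here is to restructure the code so that the conditional reborrow disappears, for example via `HashMap`'s entry API:

```rust
use std::collections::HashMap;
use std::hash::Hash;

// Same signature as above; the entry API does the "look up or insert" in one
// step, so no second mutable borrow of `map` is needed.
fn get_default<'r, K: Hash + Eq + Copy, V: Default>(
    map: &'r mut HashMap<K, V>,
    key: K,
) -> &'r mut V {
    map.entry(key).or_insert_with(V::default)
}
```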
Fixing this borrowck issue requires more precision about flow-sensitivity. It also hints at limitations in our modeling of lifetimes, which appear more clearly in cases with only slightly more complicated control flow, like issue [#47680](https://github.com/rust-lang/rust/issues/47680):
```rust=
struct Thing;

impl Thing {
    fn maybe_next(&mut self) -> Option<&mut Self> { None }
}

fn main() {
    let mut temp = &mut Thing;
    loop {
        match temp.maybe_next() {
            Some(v) => { temp = v; }
            None => { }
        }
    }
}
```
Here, the loan created on iteration N _may be used_ in the `None` path on iteration N+1, even though the variable borrowed at that later iteration is not the same.
Issues like NLL problem case #3, #47680, and others were therefore deferred from NLL and left as future work: [polonius](http://smallcultfollowing.com/babysteps/blog/2018/04/27/an-alias-based-formulation-of-the-borrow-checker/).
The key ideas are (see the sketch after this list):
- switching from a model of _lifetimes_ as sets of points in the CFG (with _outlives_ relationships), to a model of _origins_ as sets of loans (with _subset_ relationships).
- computing and tracking the subset relationships at each point in the CFG (whereas the existing borrowck computes a single subtype relation).
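As a rough illustration of this model (hand-written annotations, not rustc output), consider a tiny function: each borrow expression creates a loan, the origin of a reference is the set of loans it may come from, and an outlives constraint `'a: 'b` becomes the subset constraint "the loans of `'a` flow into `'b`":

```rust
// Loan names (L0) are made up for illustration.
fn first<'a>(v: &'a mut Vec<i32>) -> &'a mut i32 {
    let r = &mut v[0]; // creates loan L0 (mutable borrow of `v[0]`);
                       // the origin of `r` is the set {L0}
    r                  // returning `r` as `&'a mut i32` requires
                       // origin(r) ⊆ 'a, so 'a contains L0 as well
}
```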
## Milestones
This is a rough roadmap where we have the most visibility on the first steps of the chain:
- each step has unknowns that will determine what needs to be done in the later steps
- similarly, who will do the tasks is sometimes up in the air: we don't have the bandwidth to mentor many people unfamiliar with the topic, so only a few people can be involved right now.
- therefore we're talking about milestones for the longer-term roadmap, and concrete tasks for the shorter term. As more data and results become available, visibility will improve, as will the "parallelism" of tasks.
- the further a step is in the future, the less certainty and visibility it has.
Here are the roadmap's milestones, with tentative ranges of possible completion:
```mermaid
flowchart TD
HR["
1. Factoring out higher-ranked
errors from the main path
(End of June?) Santiago?
"]
IN["
2. Location-insensitive
loans-in-scope prototype
(End of June) lqd
"]
LI["
3. Location <i>in</i>sensitive pass on nightly
(end of June?)
"]
VF["
4. Verify full test suite passes with location insensitive
Compiler in P2.0 mode accepts the same programs as with NLL
(Aug)
"]
P2["5. Location sensitive pass on nightly (Nov 15)"]
STABLE["6. Location sensitive pass stable (end of 2024)"]
HR --> LI
IN --> LI
LI --> VF
VF --> P2
P2 --> STABLE
```
The first two steps are actionable and being worked on to differing degrees:
### 1. Factoring out higher-ranked concerns from the main path
Today the trait solver produces higher-ranked outlives constraints and the borrow checker solves them. In the future, we would like the trait solver to be responsible for solving these higher-ranked constraints itself, so that it only produces constraints on regions in the root universe (which the borrow checker can easily process). This would allow us to solve implication predicates like `for<T> { if (T: 'a, 'a: 'b) { T: 'b } }`; if the trait solver doesn't take responsibility for those, then it would have to capture the full environment and pass it to the borrow checker, which would effectively reproduce the same trait-solving logic again.
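As an illustration (this example is not from the roadmap), here is the kind of program whose higher-ranked bound currently turns into placeholder constraints that the borrow checker ends up solving and reporting:

```rust
trait Trait<'a> {}
impl<'a> Trait<'a> for &'a i32 {}

// The `for<'x>` bound is higher-ranked: proving it introduces a placeholder
// region for `'x` in a higher universe.
fn wants_any_lifetime<T: for<'x> Trait<'x>>(_t: T) {}

fn caller<'r>(r: &'r i32) {
    // ERROR: `&'r i32` implements `Trait<'x>` only for `'x = 'r`, so the
    // constraints relating `'r` to the placeholder for `'x` cannot be
    // satisfied (reported today as, e.g., "implementation of `Trait` is not
    // general enough").
    wants_any_lifetime(r);
}
```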
In the shorter term, we can refactor the borrow checker to separate out the higher-ranked processing from the ordinary processing. Right now, the values for all regions (regardless of universe) are computed in a single pass over the graph; this code must be careful to account for constraints like `'a: 'b` where `universe('a) = Un` and `'b` contains a placeholder from a higher universe `Um` (with `m > n`). Such constraints are translated to `'a: 'static`.
The goal would be to "preprocess" the outlives constraints in a kind of "leak check": determine whether a placeholder `P` in `Um` must (transitively) be outlived by an inference variable `x` (`x: P`) in some smaller universe `Un` and, if so, constrain `x` to be `'static`; similarly, if a universal region `x` is required to outlive `P`, then this must be an error (`x` cannot be known to outlive the placeholder).
This "polonius leak check" code could then be removed once the trait solver takes responsibility for solving these constraints.
This is most likely a "nice to have" task, as we can probably keep the older code and work around the higher-ranked constraints. The current plan as of the RustNL workshops last week is for @lcnr to mentor @spastorino to do this task.
*time frame: next 3 months*
### 2. Location-insensitive loans in scope
Of the two key differences between polonius and the existing borrow check (regions as "sets of loans", and computing subtyping relations at each point in the CFG), this step is aimed at resolving the *first* difference but not the second. We therefore call it "location *in*sensitive loans in scope" (because subtyping is done once, not per location): NLL with the polonius model.
Note that other aspects of the existing borrow checker are still flow-sensitive.
In this step, we will rewrite the borrow checker so that its data structures store sets of loans. We believe this will be equivalent to the existing borrow check in terms of the errors that get reported (this will be useful to verify via crater).
This change paves the way for adding location sensitivity (sets of loans are also a better foundation for future improvements to the borrow checker such as safe internal references).
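To make the intended shape concrete, here is a very rough sketch (hypothetical names and types, not rustc's actual data structures) of "origins as sets of loans, computed once per function body":

```rust
use std::collections::{BTreeMap, BTreeSet};

// Hypothetical indices standing in for rustc's interned IDs.
type Origin = u32;
type Loan = u32;

/// One set of loans per origin for the whole function body,
/// i.e. no per-CFG-point indexing yet (location-insensitive).
#[derive(Default)]
struct LocationInsensitiveLoans {
    loans_of: BTreeMap<Origin, BTreeSet<Loan>>,
}

impl LocationInsensitiveLoans {
    /// A borrow expression introduces a fresh loan into the origin of the
    /// reference it produces.
    fn introduce(&mut self, origin: Origin, loan: Loan) {
        self.loans_of.entry(origin).or_default().insert(loan);
    }

    /// An outlives constraint `'sub: 'sup` becomes a subset constraint: the
    /// loans of `sub` flow into `sup`. A real implementation would propagate
    /// along all constraints to a fixed point.
    fn apply_subset(&mut self, sub: Origin, sup: Origin) {
        let loans = self.loans_of.get(&sub).cloned().unwrap_or_default();
        self.loans_of.entry(sup).or_default().extend(loans);
    }

    /// Without a location index, "is this loan in scope?" can only be asked
    /// per origin, not per CFG point.
    fn origin_contains(&self, origin: Origin, loan: Loan) -> bool {
        self.loans_of.get(&origin).is_some_and(|s| s.contains(&loan))
    }
}

fn main() {
    let mut state = LocationInsensitiveLoans::default();
    state.introduce(0, 0);    // e.g. `&mut v[0]` creates loan 0 in origin 0
    state.apply_subset(0, 1); // e.g. returning the reference: origin 0 ⊆ 'a (origin 1)
    assert!(state.origin_contains(1, 0));
}
```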
This is what @lqd is currently working on.
*time frame: next 3 months*
### 3. Location *in*sensitive pass on nightly
Together, the two preceding tasks form a milestone that we can test more thoroughly on the full test suite and on crater, to discover possible issues.
It's possible that the analysis is too slow at this point. Some of the existing UI tests heavily exercise the current datalog polonius fact generation and analysis (both in CFG size and in the size of the subset graph). Real-world data from the prior polonius experiments shows that this also happens in published crates, which would make running on crater impossible if performance is not improved at this point.
We would then have a better understanding of possible issues, e.g. from unexpected crater regressions. Each of these would need to be investigated, categorized by where the issue comes from, and then fixed.
We'd also be less constrained on resources here: the work can be parallelized between different people, and tasks like error detection, categorization, and minimization don't necessarily require a huge amount of expertise.
Comparisons with a-mir-formality can also be done here (possibly using the fuzzer that is planned).
*time frame: next 3 months - next 6 months*
### 4. Location insensitive verified
At this stage we should have a location-insensitive analysis that is equivalent to today's NLL. We can then look at its interaction with diagnostics and expressiveness, which _should_ also be the same as today (depending on the amount of polish), and compare its efficiency if need be.
Diagnostics and error reporting may currently depend on internal details of constraints and region values, and since they will be changed by the previous steps, it's possible that we'd need to do some refactorings there as well.
Then the (harder-to-do-efficiently) work to incorporate location-sensitivity can start.
*time frame: next 6 months*
### 5. Location-sensitive pass on nightly
By this milestone, some version of the location-sensitive analysis is done, though still inefficient. Efficiency work, which could include necessary theoretical improvements to the model, in concert with validation in a-mir-formality, can then start.
If the experience with NLL in edition 2018 is any indication, a large amount of work and polish will also be needed to handle diagnostic differences and to ensure that errors and notes are clear enough.
*time frame: next 6 months - next 12 months*
### 6. Location sensitive enabled
The location-sensitive pass is (hopefully) efficient enough, tested in practice, somewhat formally verified, and can be enabled in edition 2024.
*time frame: next 18 months*