--- tags: "2021-10-15 sprint" --- # Polonius 2021-10-15 ## Big picture discussion Two roles of polonius * Provide a reference impl that is suitable for inclusion in reference etc * Provide a production impl that roughly fills the "current borrow checker" shape * As close to reference as possible Do we want to stick to datalog? Does it scale to real-world situations? Maybe not, but we'll figure that out. ## Richer language We might want a richer set of values, e.g. to be able to have structured paths. At some point we may need a more expressive language than Datalog, or to switch to a different paradigm (e.g. top-down solver). - Niko: For now, our problems have all been expressible in Datalog. In the long term, we may need a closer connection with Chalk for HRTBs. ## Fact generation scope How much of the borrow-checker is part of "fact generation" versus specified in Polonius? Should we try to write explicit logical rules for everything? - Example: `loan_invalidated` is a result of some simple rules around references (shared ones invalidate unique ones, etc.). Should these rules be specified in datalog? - Niko: Yes, in the long-term, but not important for now. ## Polonius.next! [Purely forward propagation](https://hackmd.io/KpK0ICQOQLSPxIcNIlhkJQ) --- ## Cleanup Things To Do - ~~Clean up and PR changes to tests done during the live call, also add them as `cargo test` tests (@Domenic Quirl)~~ ✓ - ~~write up answers to the questions in `.next` HackMD~~ ✓ - Visualize `subset` relation as a subgraph in graphviz output - somehow visualize fact generation over time (for things like loops)? - IIRC soufflé can debug/profile tuple generation (w/ provenance ?) so we may be able to use some of its output and visualize it (their own profiler visualized the number of tuples generated per round, but I don't know if we can see the actual tuples -- we probably can) --- ### Origins Example 1 ```rust let mut x = 42; let mut y = 22; let mut temp: &'temp mut u32 = &'L_x mut x; // &'L_x mut x // clear_origin('L_x) // introduce_subset('L_x, 'temp) let p: &'p mut u32 = &mut *temp; // &'L_*temp mut *temp == &mut x: double mut borrow of `x` // clear_origin('L_*temp) // introduce_subset('L_*temp, 'p) // introduce_subset('temp, 'L_*temp) // because of the `*` in the place // subset('L_x, 'temp) // computable based on subsets introduced thus far // subset('temp, 'L_*temp) // computable based on subsets introduced thus far // subset('L_*temp, 'p) // computable based on subsets introduced thus far // subset('L_x, 'p) // computable based on subsets introduced thus far x = 11; // Invalidates both borrows // invalidate_origin('L_x) // subset('L_x, 'p) // computable based on subsets introduced thus far temp = &mut y; // No use of `*temp` while referencing `x` (good). // But clears 'L_*temp??? // origin_invalidated('L_x) // origin_invalidated('p) -- // clear_origin('*L_temp) // clear_origin('temp) // due to [this rule][r0] *p = 33; // Should be ERROR, but invalidation of 'L_*temp is not propagated // to here because it is cleared above. // origin_invalidated('p) -- ``` [Relevant example][ex] \[r0\]: https://github.com/nikomatsakis/polonius.next/blob/d91f1608b4f0b5c63183c9aa5c34a0d89bf4d2b1/src/polonius.dl#L97-L102 [ex]: https://github.com/nikomatsakis/polonius.next/blob/d91f1608b4f0b5c63183c9aa5c34a0d89bf4d2b1/tests/killing-and-murder/program.txt ### Loop self-invalidation example --- ```rust fn nondet() -> bool { todo!() } fn main() { let mut x = 42; let mut v: Vec<&'v mut i32> = vec![]; while nondet() { let tmp = &'L_x mut x; // Multiple copies of 'L v.push(tmp); } } ``` ```graphviz digraph G { rankdir = "TD" node [ shape = "rectangle" ] a [ label = <<table border="0"> <tr><td>a: x = 22</td></tr> <tr><td>-------------------</td></tr> <tr><td>invalidate_origin('L_x)</td></tr> </table>> ] b [ label = <<table border="0"> <tr><td>b: v = vec![]</td></tr> <tr><td>-------------------</td></tr> <tr><td>invalidate_origin('L_v)</td></tr> <tr><td>-------------------</td></tr> <tr><td>origin_invalidated('L_x)</td></tr> </table>> ] a -> b c [ label = <<table border="0"> <tr><td>c: p = &amp;'L_x mut x</td></tr> <tr><td>-------------------</td></tr> <tr><td>invalidate_origin('L_x)</td></tr> <tr><td>clear_origin('L_x)</td></tr> <tr><td>clear_origin('p)</td></tr> <tr><td>introduce_subset('L_x, 'p)</td></tr> <tr><td>-------------------</td></tr> <tr><td>origin_invalidated('L_v)</td></tr> <tr><td>subset('p, 'tmp1)</td></tr> <tr><td>subset('L_x, 'p)</td></tr> <tr><td>subset('L_v, 'tmp0)</td></tr> <tr><td>subset('tmp1, 'v)</td></tr> <tr><td>subset('v, 'tmp1)</td></tr> <tr><td>subset('p, 'v)</td></tr> <tr><td>subset('L_x, 'v)</td></tr> </table>> ] b -> c e -> c d [ label = <<table border="0"> <tr><td>d: v_tmp = &amp;'L_v mut v</td></tr> <tr><td>-------------------</td></tr> <tr><td>access_origin('v)</td></tr> <tr><td>invalidate_origin('L_v)</td></tr> <!-- XXX --> <tr><td>clear_origin('L_v)</td></tr> <tr><td>clear_origin('tmp0)</td></tr> <tr><td>clear_origin('tmp1)</td></tr> <tr><td>introduce_subset('L_v, 'tmp0)</td></tr> <tr><td>introduce_subset('tmp1, 'v)</td></tr> <tr><td>introduce_subset('v, 'tmp1)</td></tr> <tr><td>-------------------</td></tr> <tr><td>subset('L_v, 'tmp0)</td></tr> <tr><td>subset('tmp1, 'v)</td></tr> <tr><td>subset('v, 'tmp1)</td></tr> <tr><td>subset('L_x, 'p)</td></tr> <tr><td>origin_invalidated('v)</td></tr> <tr><td>origin_invalidated('L_v)</td></tr> <tr><td>origin_invalidated('tmp0)</td></tr> </table>> ] c -> d e [ label = <<table border="0"> <tr><td>e: Vec::push(v_tmp, p)</td></tr> <tr><td>-------------------</td></tr> <tr><td>access_origin('p)</td></tr> <tr><td>access_origin('tmp0)</td></tr> <tr><td>access_origin('tmp1)</td></tr> <tr><td>introduce_subset('p, 'tmp1)</td></tr> <tr><td>-------------------</td></tr> <tr><td>subset('L_x, 'p)</td></tr> <tr><td>subset('L_v, 'tmp0)</td></tr> <tr><td>subset('tmp1, 'v)</td></tr> <tr><td>subset('v, 'tmp1)</td></tr> </table>> ] d -> e } ``` Problem: 'L_x gets both invalidated and cleared in block C, but 'v is not accessed until block D. Solution: Compute the `origin_invalidated` facts that will propagate to the next block *before* applying the effect of the clear (erasing the `subset` relation). ## Diamond CFG ```rust let mut x = 4; let mut p = &22; if maybe() { p = &x; } else { x = 0; // invalidate x } dbg!(p); // access p (or *p?) ``` <details> <summary>Facts</summary> ``` let x: i32; let y: i32; let p: &'p i32; bb0: { x = 4; p = &'L_y y; goto bb1, bb2; } bb1: { p = &'L_x x; goto bb3; } bb2: { x = 0; goto bb3; } bb3: { } ``` ``` // let x: i32; // let y: i32 = 22; // let p: &'p i32; bb0[0]: { invalidate_origin('L_x) goto bb0[1] } bb0[1]: { clear_origin('p) clear_origin('L_y) introduce_subset('L_y, 'p) goto bb1[0] bb2[0] } bb1[0]: { clear_origin('p) clear_origin('L_x) introduce_subset('L_x, 'p) goto bb3[0] } bb2[0]: { invalidate_origin('L_x) goto bb3[0] } bb3[0]: { } ``` </details> ## 47680 ```rust struct Thing; impl Thing { fn maybe_next(&mut self) -> Option<&mut Self> { None } } fn main() { let mut temp = &mut Thing; loop { match temp.maybe_next() { Some(v) => { // This path: // // * Borrow of temp being stored into temp // and thus persisting temp = v; } None => { // v is not live here } } } } ```