MIR Formality Roadmap

# MIR Formality Roadmap ## "Spike"-based approach A "spike" is a sample program. The goal is to build out the end-to-end system for a given example. At each point, we do the "roughly" the minimum work needed to get the spike to work -- adding auxiliary tests around that functionality -- and then turn to the next spike to motivate follow-on work. We've chosen to use open soundness bugs as our spikes. ### Spike 0: Issue [#25860] [#25860]: https://github.com/rust-lang/rust/issues/25860 ```rust static UNIT: &'static &'static () = &&(); fn foo<'a, 'b, T>(_: &'a &'b (), v: &'b T) -> &'a T { v } fn bad<'a, T>(x: &'a T) -> &'static T { let f: fn(_, &'a T) -> &'static T = foo; f(UNIT, x) } fn main() { let value = { let data = 22; bad(&data) }; let _ = *value; } ``` ```mermaid flowchart TD AutoTraits["Model auto traits"] Static["Model statics"] WfCheck AutoTraits --> Static ``` Steps to complete: * Add statics to decls * Model "auto traits" (`rust:Send`) * Add "ok" goals for statics: * Type of static must be well-formed * Type of static must implement `rust:Send` * Thing to test: static using out of scope names should be an error :) * Add enough MIR to model function bodies, static initializer * Add type-checker for MIR * Expect static, `foo`, and `main` to be well-typed * Expect `bad` not to be well-typed ### Spike 1: Issue [#25860] ## Yearly goal A complete model of safe Rust from "MIR backwards" Some subset of (maybe even all of) the following: * Type checker for MIR * Borrow check * Safety checks (e.g., detecting unsafe code) * Well-formedness checks for declarations * Operational semantics for safe Rust, probably similar to miri ## Milestones * Two questions I wanted answers to: * Interface between polonius and the trait checker * Reconcile the "everything is coinductive" rules with implied bounds * (needs an RFC) * Places where the design is not clear * Implied bounds * if you have `fn foo<'a, 'b>(x: &'a &'b u32)`, that function gets to assume that `'b: 'a` * if you have a where-clause like this `for<'a> u32: SomeTrait<'a>`, this is considered "legal" even if `u32` doesn't implement `SomeTrait` at all * Soundness questions around dyn* * Const generics / const-eval * what does "const evaluatable" mean and when can you use it? * broad question: * `const` bounds, etc etc * Full spike through the system of some sample program or various programs :) ## End to end example program ### Point ```rust struct Point { x: u32, y: u32 } fn main() { let p = Point { x: 22, y: 44 }; let x = p.x + p.y; } ``` Covers: * Struct declarations Sample MIR: ``` bb0: { StorageLive(_1); // scope 0 at test.rs:7:9: 7:10 _1 = Point { x: const 22_u32, y: const 44_u32 }; // scope 0 at test.rs:7:13: 7:35 FakeRead(ForLet(None), _1); // scope 0 at test.rs:7:9: 7:10 StorageLive(_2); // scope 1 at test.rs:8:9: 8:10 StorageLive(_3); // scope 1 at test.rs:8:13: 8:16 _3 = (_1.0: u32); // scope 1 at test.rs:8:13: 8:16 StorageLive(_4); // scope 1 at test.rs:8:19: 8:22 _4 = (_1.1: u32); // scope 1 at test.rs:8:19: 8:22 _5 = CheckedAdd(_3, _4); // scope 1 at test.rs:8:13: 8:22 assert(!move (_5.1: bool), "attempt to compute `{} + {}`, which would overflow", move _3, move _4) -> [success: bb1, unwind: bb2]; /> } bb1: { _2 = move (_5.0: u32); // scope 1 at test.rs:8:13: 8:22 StorageDead(_4); // scope 1 at test.rs:8:21: 8:22 StorageDead(_3); // scope 1 at test.rs:8:21: 8:22 FakeRead(ForLet(None), _2); // scope 1 at test.rs:8:9: 8:10 _0 = const (); // scope 0 at test.rs:6:11: 9:2 StorageDead(_2); // scope 1 at test.rs:9:1: 9:2 StorageDead(_1); // scope 0 at test.rs:9:1: 9:2 return; // scope 0 at test.rs:9:2: 9:2 } bb2 (cleanup): { resume; // scope 0 at test.rs:6:1: 9:2 } ``` ## Famous bugs ### 25860 ```rust static UNIT: &'static &'static () = &&(); fn foo<'a, 'b, T>(_: &'a &'b (), v: &'b T) -> &'a T { v } fn bad<'a, T>(x: &'a T) -> &'static T { let f: fn(_, &'a T) -> &'static T = foo; f(UNIT, x) } fn main() { let value = { let data = 22; bad(&data) }; let _ = *value; } ``` This compiles today ([playground](https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=d0f42fb6c31d2a5c432946966765cf04)) but (a) should not and (b) will "go wrong" in the operational semantics. ## Complete roadmap for 25860 test * To type check the bodies of each function * Enough of the MIR type checker to do that * Determine the implied bounds and wf check for each function * (kind of have that today) * For statics, to check their types and the type of their initializer * also a MIR type check thing ## Further things we could check * The type of a static must be Send + Sync *