2021-02-17: Increasing trust in rust compiler

--- tags: design-meeting --- # 2021-02-17: Increasing trust in rust compiler ## Goal Expore the answers to the questions: If we were to charter a group to explore "improving trust" in the Rust compiler... * What would be the first deliverables? * What components would we want to "specify" and why/how? * What are the *audience* for those deliverables? * How can we help folks follow along who need to understand the details? * What teams and/or folks would we want involved? ## Agenda * Discussion of goal-setting proposal circulated before * Q&A * Industry specific details * Discussion of needs around safety critical * Potential next steps ## Attending ## Notes * Introduction to attendants * Walking through the document: * Core focus: basing analysis on MIR, that's where the research projects are focused * Having access to traits, borrow checker is also important * Might also need a "contract language" in order to do verification * contracts like this can be really useful for language verification and fuzzing * Example of verification that is done using MIR? * MIR semantics are much simpler than Rust. Therefore, creating formal semantics from MIR makes sense, because it's so much easier. * Contract language is "in scope" "around" a lot of the things, but isn't the primary goal for this proposal * What kinds of properties might people want to prove? * e.g., proving specific postconditions? Or proving functional correctness? * Answer: not there yet, need specification for MIR first * Example question: can we possibly have undefined behavior? * Other extreme: is this bit-for-bit identical to the specification? * That requires a lot more human effort in practice to make a proof. * How stable does MIR have to be etc? * Strict stability: we specify MIR and it never changes. * Semver stability: we specify MIR but we tell you when it changes. * The latter is easier and good enough. * "We think it would be a boost to the language if MIR-as-core-spec was something we were handling" * We're looking for the "normative definition" that we can do compiler assurance on * Starting from the Rust source language is much more complex than trying to assure from MIR on down * MIR is a good middle ground * eventually want a contract between front end and MIR * and MIR and the back end * "More than willing to invest time/engineering to keeping up with MIR, but how to know that we're able to do it" * want to avoid subtle changes of semantics * Primarily interested in consuming MIR that was generated from the Rust compiler... * Is there also interest in producing MIR that was not produced by the Rust compiler? * Answer: Consuming output is the more important goal. * Being able to say "this version of the compiler produces MIR 12" and then having tools being able to verify that the tool expects MIR 12 too * Also want to connect Rust front end to different (non-LLVM-based) backends, MIR might be a useful intermediary there * Serialized MIR might be one way to go about this * Josh: We probably don't want to be able to take arbitrary MIR yet, because portions of MIR not generated by rustc are likely to be less well-exercised, and that's a surface area we may not want to expose * Use case: compiling down from Coq etc would be easier if it targeted MIR * But we might want to do "MIR-to-MIR" transformations * e.g., adding "ghost state" * Other concern: * There could be UB in Rust surface syntax that doesn't surface in MIR * This is worth keeping an eye on * How can we keep lang team and group in sync? * Having a clear roadmap etc * This story is not just about MIR, but also parts of the type system * What other static analysis results would you need access to? * e.g., Borrow checker / region inference? * Answer: it depends on who you talk to and what their needs are * Some folks want a "MIR that is to be borrow checked" * Does this mean "people might want to write their own borrow checker", or that you want to consume results of the Rust borrow checker? * Answer: we do want analysis results, also for performance reasons; being able to rule out potential aliases thanks to compiler analysis can dramaticaly improve the scalability * Others want "MIR" post-borrow-checking, post-monomorphization * Specifications: * How should we think about specifying the borrow checker? * Consider how many times we have changed it, e.g. NLL * Versioning may be part of the answer * Stacked borrows also aimed to define a definition of UB that is kind of maximally flexible * with the borrow checker as a conservative approximation on it * Next steps? * One outcome: find the group that will continue this conversation *