# Stable MIR Design
*Note: The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in [RFC 2119](https://www.ietf.org/rfc/rfc2119.txt).*
## Introduction
Many static analysis tools rely on the internals of the Rust compiler to efficiently analyze programs written in Rust. Most tools analyze the mid-level IR (MIR) together with rustc's type system. The MIR is a much simpler language than Rust, but much richer than LLVM's IR.
To do that today, these tools use rustc's internal APIs to extend the compiler. These mechanisms only work with nightly toolchain distributions, and the APIs change very often, which makes maintenance very costly.
Another issue is that the semantics of the MIR are not well documented, which can affect the results of these analyses.
The goal of the Stable MIR project is to provide a stable interface to the Rust compiler that allows tool developers to build sophisticated analyses at a reduced maintenance cost, without compromising the compiler's development speed.
### Tenets
1. Enable tool developers to implement sophisticated analyses with low maintenance cost.
2. Do not compromise the development and innovation speed of the Rust compiler.
3. Once stable (v1.0+), releases must follow [semantic versioning](https://semver.org/).
4. Backward compatible changes are preferred, not required.
5. The Rust compiler should expose an interface to Stable MIR versions released 3 months prior to its release. It may support multiple Stable MIR versions at the same time to achieve this.
6. The SMIR should be performant, with minimal memory and processing overhead when compared to using internal MIR.
7. Unstable APIs should be developed under feature gates and they shall be excluded from the semantic versioning.
## Project Milestones
This is a big endeavor for both compiler and tool developers. We anticipate that tool developers will need to make changes to their code to transition from internal APIs to stable ones. Whenever possible, the new APIs should mimic the current APIs to reduce the transition burden to tool developers.
For compiler developers, once an API has been incorporated into the Stable MIR, it shall be maintained following the tenets and the [Stable MIR API guidelines](#api-guidelines).
We will follow a gradual transition to the new model.
### Minimum Viable Product (Stable MIR v0.1)
The first milestone will be landing a version of stable MIR that allows a MIR consumer (e.g. Kani, Miri, or a mini-miri demo) to switch its use of MIR data structures completely over to it (even if it still uses other unstable APIs). We envision the following requirements for this release:
* Provide an API to emit the body of a function solely with stable MIR data structures.
* Provide opaque objects which store enough information for users to leverage other parts of the compiler, e.g. to extract the type of an Operand.
For this release, there will be no changes to how tools load the rustc library or find the functions/statics they want to get the stable MIR for. Also, only nightly will be supported, even just to access the stable MIR data structures.
We will only support MIR's ["runtime-optimized"](https://github.com/rust-lang/rust/blob/50d3ba5bcbf5c7e13d4ce068d3339710701dd603/compiler/rustc_middle/src/mir/syntax.rs#L101) phase.
### API Coverage and Stabilization (v0.2+)
Subsequent releases of the Stable MIR will focus on increasing the API coverage to include other parts of the compiler that are required to semantically analyze the program as well as APIs used to provide a similar UX to the compiler.
* Provide an API to visit all items (static variables, constants, and functions) of every crate compiled (current crate + dependencies).
* Support emitting stable MIR of analysis-MIR.
### Long term goal (v1.0+)
The first stable release (v1.0) shall only be done once the SMIR has sufficient API coverage and the APIs are stable.
The following changes are orthogonal to the stabilization effort but shall be implemented to improve the overall experience, and they are in the scope of this project:
* Completely decouple tools from rustc's internal APIs, including invoking rustc's driver, error reporting, and type queries.
* Support different MIR dialects.
* The interface can be used with all toolchain channels (stable, beta, nightly).
* Provide a server / client architecture that provides a more robust and interactive experience.
* The Stable MIR can be used as an input to the compiler.
* Possible future extension: share the MIR data structures with the stable MIR data structures via https://github.com/rust-lang/compiler-team/issues/233
## MVP Design
The `stable-mir` project will follow a similar approach to `proc-macro2`. Its implementation will be broken down into two main crates:
* `stable_mir`: Public crate, to be published on crates.io, which will contain the stable data structure as well as proxy APIs to make calls to the compiler.
* `rustc_smir`: The compiler crate that will translate from internal MIR to SMIR. This crate will also implement APIs that will be invoked by `stable-mir` to query the compiler for more information.
This will help tools communicate with the Rust compiler via stable APIs. Tools will depend on the `stable_mir` crate, which will invoke the compiler using APIs defined in `rustc_smir`, i.e.:
```
┌───────────────────────────────┐           ┌───────────────────────────────┐
│  External Tool   ┌──────────┐ │           │ ┌──────────┐   Rust Compiler  │
│                  │          │ │           │ │          │                  │
│                  │stable_mir│ │           │ │rustc_smir│                  │
│                  │          │ ├──────────►│ │          │                  │
│                  │          │ │◄──────────┤ │          │                  │
│                  │          │ │           │ │          │                  │
│                  │          │ │           │ │          │                  │
│                  └──────────┘ │           │ └──────────┘                  │
└───────────────────────────────┘           └───────────────────────────────┘
```
A compile time check will be added to trigger an error if the `stable_mir` is not compatible with the current compiler.
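One way such a check could look is a constant assertion that fails the build when the versions do not line up. This is only a sketch: the constants `SUPPORTED_VERSIONS` and `THIS_VERSION` and the function `is_supported` are hypothetical names, and the rule used here (exact match on a `(major, minor)` pair) is an assumption, not something this document specifies.

```rust
/// Hypothetical list of `stable_mir` versions the compiler side can serve.
/// In a real setup this would be exported by the compiler crate.
pub const SUPPORTED_VERSIONS: &[(u32, u32)] = &[(0, 1), (0, 2)];

/// Hypothetical version of the `stable_mir` crate being built.
pub const THIS_VERSION: (u32, u32) = (0, 1);

/// Returns true if `wanted` appears in `supported`.
/// Written with a `while` loop so it can run at compile time.
pub const fn is_supported(supported: &[(u32, u32)], wanted: (u32, u32)) -> bool {
    let mut i = 0;
    while i < supported.len() {
        if supported[i].0 == wanted.0 && supported[i].1 == wanted.1 {
            return true;
        }
        i += 1;
    }
    false
}

// Evaluated during compilation: an unsupported version becomes a build error.
const _: () = assert!(
    is_supported(SUPPORTED_VERSIONS, THIS_VERSION),
    "stable_mir is not compatible with the current compiler"
);
```

Changing `THIS_VERSION` to a pair that is not in the list would turn the mismatch into a compilation error rather than a runtime failure.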
### Multiple version support
In order to support multiple versions of `stable_mir`, the `rustc_smir` will contain modules that can be mapped to different versions of `stable_mir`.
For example, `stable_mir` v0.1 will invoke APIs from `rustc_smir::version_0_1::*`, while v0.2 will invoke `rustc_smir::version_0_2::*`.
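The module layout might be sketched roughly as follows. Only the module names `version_0_1` and `version_0_2` come from the example above; the `Body` types, the `translate` functions, and `block_count` are invented for illustration and are not the actual `rustc_smir` API.

```rust
// Stand-in for the compiler-side crate (illustrative only).
pub mod rustc_smir {
    /// Entry points that `stable_mir` v0.1 would call.
    pub mod version_0_1 {
        #[derive(Clone, Debug)]
        pub struct Body {
            pub blocks: Vec<String>,
        }

        /// Pretend translation from internal MIR (here: block names) to SMIR.
        pub fn translate(internal: &[&str]) -> Body {
            Body { blocks: internal.iter().map(|b| b.to_string()).collect() }
        }
    }

    /// Entry points that `stable_mir` v0.2 would call. This module can
    /// change shape without affecting users of `version_0_1`.
    pub mod version_0_2 {
        #[derive(Clone, Debug)]
        pub struct Body {
            pub blocks: Vec<String>,
            /// Field added in this hypothetical later version.
            pub arg_count: usize,
        }

        pub fn translate(internal: &[&str], arg_count: usize) -> Body {
            Body {
                blocks: internal.iter().map(|b| b.to_string()).collect(),
                arg_count,
            }
        }
    }
}

/// What a proxy function in `stable_mir` v0.1 could look like: it is bound
/// to exactly one versioned module of the compiler-side crate.
pub fn block_count(internal: &[&str]) -> usize {
    rustc_smir::version_0_1::translate(internal).blocks.len()
}
```

Keeping each version in its own module lets the compiler serve an older `stable_mir` release while a newer one evolves next to it.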
What about forward compatibility? See [Open Questions section](#Open-Questions).
### MIR / SMIR interoperability
Until we have enough coverage, users will still rely on internal APIs to initialize the Stable MIR constructs and to retrieve further information that hasn't been covered by stable APIs. E.g.: Getting the type of an operand.
There are a few ways that this can be implemented, and we still need to assess them before settling into one. A few options are:
1. Some Stable MIR APIs will still handle internal compiler data structures, such as `TyCtxt` and `Ty`. These types would be re-exported using an `unstable` or `internal` module.
2. Stable MIR constructs can be converted to internal ones.
3. Expose internal compiler APIs that support Stable MIR constructs.
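As an illustration of option 2 only, a conversion trait could let a tool step from a stable construct back to an internal one when it needs an API that has no stable counterpart yet. Every name in this sketch (`InternalLocal`, `StableLocal`, `ToInternal`, `internal_debug_name`) is hypothetical and stands in for real compiler types.

```rust
/// Stand-in for an internal compiler construct (illustrative only).
#[derive(Clone, Debug, PartialEq)]
pub struct InternalLocal {
    pub index: u32,
}

/// Stand-in for the corresponding stable construct.
#[derive(Clone, Debug, PartialEq)]
pub struct StableLocal(pub usize);

/// Hypothetical conversion from a stable construct to its internal form.
pub trait ToInternal {
    type Internal;
    fn to_internal(&self) -> Self::Internal;
}

impl ToInternal for StableLocal {
    type Internal = InternalLocal;

    fn to_internal(&self) -> InternalLocal {
        InternalLocal { index: self.0 as u32 }
    }
}

/// Pretend internal API that has not been covered by stable APIs yet.
pub fn internal_debug_name(local: &InternalLocal) -> String {
    format!("_{}", local.index)
}
```

A tool would then call `internal_debug_name(&stable_local.to_internal())` until a stable equivalent exists. The other two options trade this explicit conversion for either re-exported internal types or new compiler-side entry points.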
## API Guidelines
### MIR Datatypes
* No interning; everything uses `Box<T>` instead of `&'tcx T`.
* Types are opaque handles (`struct Ty(String)`) that, for now, only offer `Display` impls and no other API.
* We'll quickly iterate to `struct Ty(usize)`, for which the API keeps a map to look up the corresponding `Ty<'tcx>` and then query it for information (e.g. size, layout, ...).
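The index-based handle could work along these lines. This is a minimal sketch: `InternalTy` is a stand-in for the compiler's `Ty<'tcx>`, and `Tables`, `intern`, `size_of`, and `name` are hypothetical names chosen for illustration.

```rust
/// Stand-in for the compiler's internal `Ty<'tcx>` (illustrative only).
#[derive(Clone, Debug, PartialEq)]
pub struct InternalTy {
    pub name: String,
    pub size_bytes: u64,
}

/// Opaque stable handle: just an index, with no compiler data inside.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Ty(usize);

/// Hypothetical side table that maps handles back to internal types.
#[derive(Default)]
pub struct Tables {
    types: Vec<InternalTy>,
}

impl Tables {
    /// Returns the handle for `ty`, reusing an existing entry if present.
    pub fn intern(&mut self, ty: InternalTy) -> Ty {
        if let Some(idx) = self.types.iter().position(|t| *t == ty) {
            return Ty(idx);
        }
        self.types.push(ty);
        Ty(self.types.len() - 1)
    }

    /// Queries information by looking the handle up in the table.
    pub fn size_of(&self, ty: Ty) -> u64 {
        self.types[ty.0].size_bytes
    }

    pub fn name(&self, ty: Ty) -> &str {
        &self.types[ty.0].name
    }
}
```

Because the handle carries no lifetime or compiler pointer, stable data structures that embed it stay plain owned values, and every query goes through the table.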
### Type Datatypes
* Re-use `rustc_type_ir` for our types. It is independent of `TyCtxt` but still exposes the entirety of `TyKind` and similar.
# Appendix
## Open Questions
1. Should MIR dialects coverage be a requirement for v1.0?
2. Should we support forward compatibility? I.e.: Using a new version of `stable-mir` with an older version of the compiler? Do we have a use case in mind?
3. What traits should Stable MIR types implement? For now, we've been adding `Clone` and `Debug`. We should come up with a guideline for when we can add more traits, e.g.:
* Hash
* Copy
* PartialEq / Eq
* PartialOrd / Ord
* Serialize / Deserialize (serde): This one could be guarded by a feature.
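To make the question concrete, here is how a single type might look if the cheap traits were derived unconditionally and serde support sat behind a feature. The type `BasicBlockIdx` and the Cargo feature name `serde` are assumptions made for this sketch; which traits to actually commit to is still the open question.

```rust
/// Hypothetical stable type used only to illustrate the trait question.
/// `Clone` and `Debug` are what has been added so far; the remaining
/// derives are candidates from the list above.
#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord, Hash)]
// Only expands to the serde derives when a Cargo feature named `serde`
// (an assumption of this sketch) is enabled, so the default build has
// no dependency on serde.
#[cfg_attr(feature = "serde", derive(serde::Serialize, serde::Deserialize))]
pub struct BasicBlockIdx(pub usize);
```

One consideration for a guideline: every derived trait becomes part of the public contract, so removing it later would be a breaking change under semantic versioning.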
## FAQ
## Requirements
The table below contains a more detailed list of requirements and in which version of `stable-mir` they should be met. For the target version, we use the following values:
* **v0.1:** represents the minimum viable product.
* **v0.2+:** represents subsequent releases of `stable-mir` that increase the API coverage.
* **v1.0+:** represents the long term goal.
| # | Requirements | Description | Target Version* |
| --| ------------ | ----------- | --------------- |
| 0 | Re-use `rustc_smir` | Remove all unstable dependencies of `rustc_smir` | v0.1 |
| 1 | MIR Datatypes | Provide API that correspond to the current [MIR Datatypes](https://rustc-dev-guide.rust-lang.org/mir/index.html#mir-data-types). | v0.1 |
| 2 | Support API | Provide API to retrieve code location, attributes, debug information. | v0.2+ |
| 3 | Message API | Provide API to generate user friendly and JSON messages. | v0.2+ |
| 4 | Type API | Provide API to retrieve type information (including layout) | v0.2+ |
| 5 | Semantically versioned API | New versions of `stable-mir` follow semantic versioning (https://semver.org/). | v1.0 |
| 6 | MIR Dialects | Support multiple MIR dialects. | v1.0+ |
| 8 | Stable channel | `stable-mir` can be used with all toolchain channels including stable. | v1.0+ |
| 9 | Server/Client | The `stable-mir` APIs are implemented over IPC. Tools no longer need to link against rust compiler library. | v1.0+ |
| 10 | Load SMIR | The compiler is able to load SMIR definitions and compile them to the target platform. | v1.0+ |