StableMIR - Release and Stability Proposal

StableMIR is being developed to become the public interface of the Rust compiler to analysis tools that can be developed outside of the Rust main repository.

It is intended to be more stable than the internal APIs, and to follow semantic versioning. For that, the goal is to start publishing a stable_mir crate on crates.io, which can be explicitly selected by tool developers.

This document proposes what the first releases will look like, as well as how development will be done in the Rust compiler in between version releases.

Context

In the first development phase of StableMIR, we focused on adding enough coverage for static analysis tools to interpret a Rust program.

For that, we added two crates to the Rust compiler: stable_mir and rustc_smir. The first is a thin shell that exposes the public APIs, while the second implements the interface between the public APIs and the compiler's internal APIs, including translation. Because of that, the rustc_smir crate depends on the stable_mir crate.

This dependency order makes it harder for the user to select which stable_mir crate to depend on, since the supported version would have to be hard coded in the compiler's rustc_smir crate.

Goal

Our goal is to publish stable_mir on crates.io, and to have the compiler's rustc_smir crate implement an interface known to stable_mir that provides the communication with rustc. We would also like to reduce friction for rustc developers.

Proposal

We would like to propose a change to the crate architecture that inverts the dependency between stable_mir and rustc_smir. That is, the stable_mir crate should depend on the rustc_smir crate, not the other way around. The main advantages are:

This would enable a single stable_mir crate to be compatible with different compiler versions, by using conditional compilation that checks the compiler nightly version.
The rustc_smir would be completely agnostic of stable_mir, which would allow different stable_mir versions to interface with the same compiler version.
The stable_mir crate hosted in the compiler can be kept up to date with changes to the compiler MIR. This crate would become the base of a new major release.

For that to happen, we will need to make the rustc_smir interface based only on internal APIs. The rustc_smir crate would keep an implementation similar to today's CompilerInterface, i.e., the same functions and logic, but their inputs and outputs would all be internal rustc constructs. There would be no CompilerInterface trait.

We would then move the translation between internal rustc and stable constructs to a private module^[1] inside the stable_mir crate. This translation would incorporate conditional compilation to handle divergence between rustc versions. The facade of this internal module would be a proxy, CompilerInterface, which would expose the same set of functions implemented in the rustc_smir crate, but would be responsible for the translation between stable and internal constructs.

See "Examples" for more details on how this would work.
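As a rough, self-contained illustration of the inverted dependency, the sketch below mocks both sides with plain Rust modules. The `internal` module stands in for rustc_smir and only handles internal types; the `stable` module stands in for stable_mir and owns the translation. All types, fields, and the `intern` and `translate` helpers are hypothetical simplifications, not the real crate APIs.

```rust
// Stand-in for the compiler-side crate (rustc_smir in this proposal).
// It only speaks in "internal" types and knows nothing about stable ones.
pub mod internal {
    #[derive(Clone, Copy, Debug, PartialEq)]
    pub enum TyKind {
        Bool,
        Int { bits: u32 },
    }

    #[derive(Clone, Copy, Debug, PartialEq)]
    pub struct Ty(pub TyKind);

    pub struct CompilerContext;

    impl CompilerContext {
        // Input and output are both internal constructs.
        pub fn ty_kind(&self, ty: Ty) -> TyKind {
            ty.0
        }
    }
}

// Stand-in for the published crate (stable_mir in this proposal).
// It depends on `internal`, never the other way around.
pub mod stable {
    use super::internal;
    use std::cell::RefCell;

    /// Stable handle: an index into the proxy's table of internal types.
    #[derive(Clone, Copy, Debug, PartialEq)]
    pub struct Ty(pub usize);

    #[derive(Clone, Copy, Debug, PartialEq)]
    pub enum TyKind {
        Bool,
        Int(u32),
    }

    pub struct Tables {
        pub cx: internal::CompilerContext,
        pub types: Vec<internal::Ty>,
    }

    /// The proxy: same function names as the compiler side, stable types outside.
    pub struct CompilerInterface(pub RefCell<Tables>);

    impl CompilerInterface {
        pub fn new() -> Self {
            let tables = Tables { cx: internal::CompilerContext, types: Vec::new() };
            CompilerInterface(RefCell::new(tables))
        }

        /// Caches an internal type and hands out a stable handle for it.
        pub fn intern(&self, ty: internal::Ty) -> Ty {
            let mut tables = self.0.borrow_mut();
            tables.types.push(ty);
            Ty(tables.types.len() - 1)
        }

        pub fn ty_kind(&self, ty: Ty) -> TyKind {
            let tables = self.0.borrow();
            // Query the compiler side, then translate the answer.
            let internal_kind = tables.cx.ty_kind(tables.types[ty.0]);
            translate(internal_kind)
        }
    }

    // The only place that needs to know the shape of both representations.
    fn translate(kind: internal::TyKind) -> TyKind {
        match kind {
            internal::TyKind::Bool => TyKind::Bool,
            internal::TyKind::Int { bits } => TyKind::Int(bits),
        }
    }
}
```

In this mock, a rename inside `internal::CompilerContext` only touches the proxy's call site, while a change to `internal::TyKind` only touches `translate`.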

Release Process

We will need to keep two stable_mir crates:

One that will live in its own repository, and that will be the base of any minor update. This crate will be compatible with multiple versions of the compiler. We will use conditional compilation based on the compiler version to do that. See "Changes to Variants" for more details.
One that will stay in the compiler, which will be kept up to date with the compiler and will serve as the basis for the next major release of stable_mir. This stable_mir crate has no compatibility or stability guarantees.

Whenever a change is made to the Rust compiler that breaks the current published version of stable_mir, the StableMIR group will have to publish a new version (minor or major) that is compatible with the new compiler:

Minor release

If the change can be easily addressed by adding some conditional compilation, the team should prefer doing that.

The stable_mir crate will have a build.rs file that checks the compiler nightly version (which is the date it was published). Depending on the date, we will set a cfg, nightly_feature, which will be used inside stable_mir to check for feature availability.
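A minimal sketch of what such a build script could do is shown below. It assumes the date can be read from the parenthesized part of `rustc --version` output (for example `rustc 1.81.0-nightly (0123abcde 2024-06-24)`, where the hash is made up); that date is the commit date, which may differ by a day from the nightly's publication date. The `NIGHTLY_FEATURES` table, its cutoff date, and the function names are hypothetical.

```rust
use std::process::Command;

/// Compiler date as (year, month, day); tuples compare lexicographically.
pub type Date = (u32, u32, u32);

/// Hypothetical table: feature name and the first compiler date that needs it.
pub const NIGHTLY_FEATURES: &[(&str, Date)] = &[("DummyTyKind", (2024, 7, 1))];

/// Extracts the date from `rustc --version` output, if present.
pub fn parse_compiler_date(version: &str) -> Option<Date> {
    // Take the text between the last '(' and the closing ')'.
    let inside = version.rsplit_once('(')?.1.trim_end().strip_suffix(')')?;
    // The date is the last whitespace-separated token: "YYYY-MM-DD".
    let date = inside.split_whitespace().last()?;
    let mut parts = date.split('-').map(|p| p.parse::<u32>().ok());
    let parsed = (parts.next()??, parts.next()??, parts.next()??);
    if parts.next().is_some() {
        return None; // more than three components: not a date
    }
    Some(parsed)
}

/// Builds the Cargo directives that enable each applicable feature cfg.
pub fn cfg_directives(compiler_date: Date) -> Vec<String> {
    NIGHTLY_FEATURES
        .iter()
        .filter(|(_, since)| compiler_date >= *since)
        .map(|(name, _)| format!("cargo:rustc-cfg=nightly_feature=\"{}\"", name))
        .collect()
}

/// What `main` in build.rs would call: query the compiler and print directives.
pub fn emit_nightly_features() {
    // Cargo sets RUSTC for build scripts; fall back to `rustc` on the PATH.
    let rustc = std::env::var("RUSTC").unwrap_or_else(|_| "rustc".to_string());
    let output = Command::new(rustc)
        .arg("--version")
        .output()
        .expect("failed to run rustc");
    let version = String::from_utf8_lossy(&output.stdout);
    if let Some(date) = parse_compiler_date(&version) {
        for directive in cfg_directives(date) {
            println!("{}", directive);
        }
    }
}
```

Code in the crate can then be guarded with `#[cfg(nightly_feature = "DummyTyKind")]`.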

The stable_mir crate now needs to be implemented for versions with and without the breaking change.

See "Changes to Variants" for an example.

Major release

In case of a significant change, or if too many nightly_feature values are being tracked, the StableMIR group may choose to deprecate the current major version and release a new one.

In order to release a new major version, we will copy the compiler's stable_mir crate into the project stable_mir. We will create a release note documenting the major changes, and publish it to crates.io.

New major versions may only be compatible with newer compiler versions. A compilation error should be triggered if an unsupported version is used, suggesting that users upgrade to the minimum required version.

In order to deprecate a major version, we recommend creating a minor version that will trigger a compilation error for unsupported versions of the compiler. This error should suggest that users migrate to a newer version of stable_mir.
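One way to produce these errors is a version gate in the build script, since a build script that panics fails the build and shows its message. The sketch below only covers the decision logic; `SupportedRange`, `check_compiler`, and the message wording are hypothetical.

```rust
/// Compiler date as (year, month, day); tuples compare lexicographically.
pub type Date = (u32, u32, u32);

/// Hypothetical bounds for one major version of the published crate.
pub struct SupportedRange {
    /// Oldest compiler this release works with.
    pub min: Date,
    /// Set only by the final minor release of a deprecated major version.
    pub max: Option<Date>,
}

fn fmt_date(d: Date) -> String {
    format!("{}-{:02}-{:02}", d.0, d.1, d.2)
}

/// Returns an error message for the user if the compiler is out of range.
pub fn check_compiler(date: Date, range: &SupportedRange) -> Result<(), String> {
    if date < range.min {
        return Err(format!(
            "this stable_mir release requires a nightly compiler from {} or later; \
             please upgrade the compiler",
            fmt_date(range.min)
        ));
    }
    match range.max {
        Some(max) if date > max => Err(format!(
            "compilers newer than {} are not supported by this major version; \
             please migrate to a newer version of stable_mir",
            fmt_date(max)
        )),
        _ => Ok(()),
    }
}
```

In build.rs, an `Err(msg)` result would be turned into `panic!("{}", msg)` so that Cargo aborts the build with that message.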

Latest

Note that tools could still be developed on top of the compiler version of the crate, but they would have to face constant breakage. We still expect rustc developers to maintain the compiler crate (2). They won't, however, need to worry about breaking changes, delaying renames, or things like that.

Examples

Ty::kind() implementation

The basic user flow for retrieving the kind of a type today is implemented as follows (some details were omitted for simplicity):

That is, the conversion between stable_mir and MIR components is done inside the Rust compiler. Thus, the compiler must be aware of the stable_mir version. The implementation is the following:

// stable_mir/src/ty.rs
impl Ty {
    pub fn kind(&self) -> TyKind {
        with(|context| context.ty_kind(*self))
    }
}

// stable_mir/src/compiler_interface.rs
pub trait Context {
    fn ty_kind(&self, ty: Ty) -> TyKind;
}

// rustc_smir/src/context.rs
impl<'tcx> Context for TablesWrapper<'tcx> {
    fn ty_kind(&self, ty: stable_mir::ty::Ty) -> TyKind {
        let mut tables = self.0.borrow_mut();
        tables.types[ty].kind().stable(&mut *tables)
    }
}

// rustc_smir/src/convert.ty
impl<'tcx> Stable<'tcx> for ty::TyKind<'tcx> {
    type T = stable_mir::ty::TyKind;
    fn stable(&self, tables: &mut Tables<'_>) -> Self::T {
        // implementation
    }
}

With the new proposal, the conversion will live inside the stable_mir crate, and the rustc_smir crate inside the compiler will not be aware of it.

For the same use case as above, this will be the new sequence flow to retrieve the kind of a type.

Note that we would basically be splitting the current impl Context in two: stable_mir::CompilerInterface and rustc_smir::CompilerContext. The one living inside rustc_smir would be able to query the Rust compiler and do any information processing, while the proxy living inside stable_mir would only be responsible for translating internal constructs to stable ones and for caching results, such as the def_id map.

Here is how it would be implemented:

// stable_mir/src/ty.rs
impl Ty {
    pub fn kind(&self) -> TyKind {
        with(|context| context.ty_kind(*self))
    }
}

// stable_mir/src/compiler_interface.rs (new struct)
impl<'tcx> CompilerInterface<'tcx> {
    fn ty_kind(&self, ty: stable_mir::ty::Ty) -> stable_mir::ty::TyKind {
        let mut tables = self.0.borrow_mut();
        // Ask rustc_smir for the internal kind, then translate it to the stable one.
        let internal_kind = tables.cx.ty_kind(tables.types[ty]);
        internal_kind.stable(&mut *tables)
    }
}

// rustc_smir/src/context.rs
impl<'tcx> Context for TablesWrapper<'tcx> {
    fn ty_kind(&self, ty: rustc_middle::ty::Ty) -> rustc_middle::ty::TyKind {
        ty.kind()
    }
}

// stable_mir/src/convert.ty
impl<'tcx> Stable<'tcx> for ty::TyKind<'tcx> {
    type T = stable_mir::ty::TyKind;
    fn stable(&self, tables: &mut Tables<'_>) -> Self::T {
        // implementation
    }
}

Function rename

In the case where an internal compiler function, such as Ty::kind, is renamed, changes would only need to occur in the rustc_smir module. Like today, that would not affect the end user or the stable_mir crate.

Changes to variants

Changes would need to be made in the conversion inside convert.ty, and we would bump the stable_mir crate minor version to support newer versions of the compiler.

Let's say a new TyKind variant, TyKind::Dummy, was added. We could add a function try_kind() to the existing version of StableMIR:

type UnsupportedTyKind = Opaque;

// stable_mir/src/ty.rs
impl Ty {
    pub fn try_kind(&self) -> Result<TyKind, UnsupportedTyKind> {
        with(|context| context.try_ty_kind(*self))
    }
}

// stable_mir/src/compiler_interface.rs (new struct)
impl<'tcx> CompilerInterface<'tcx> {
    fn try_ty_kind(&self, ty: stable_mir::ty::Ty) -> Result<stable_mir::ty::TyKind, Opaque> {
        let mut tables = self.0.borrow_mut();
        let internal_kind = tables.cx.ty_kind(tables.types[ty]);

        // Newer compilers only: the new variant has no stable counterpart yet.
        // The block is needed because attributes are not allowed directly on `if`.
        #[cfg(nightly_feature = "DummyTyKind")]
        {
            if matches!(internal_kind, TyKind::Dummy(..)) {
                return Err(opaque(internal_kind));
            }
        }

        // Every other kind translates as before, on old and new compilers alike.
        Ok(internal_kind.stable(&mut *tables))
    }
}

The conversion would now become:

// stable_mir/src/convert.ty
impl<'tcx> Stable<'tcx> for ty::TyKind<'tcx> {
    type T = stable_mir::ty::TyKind;
    fn stable(&self, tables: &mut Tables<'_>) -> Self::T {
        // implementation
        match self {
            // .. all other variants
            #[cfg(nightly_feature = "DummyTyKind")]
            TyKind::Dummy(..) => {
                unimplemented!("New dummy type not implemented. Use `try_kind` instead")
            }
        }
    }
}

The crate's build.rs would then add "DummyTyKind" to nightly_feature based on the nightly compiler version.

Users that would like to use new versions of the compiler would just need to update the minor version they are using. Users of older compiler versions would still be able to update the minor version.
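From a tool's point of view, the fallible accessor could be consumed roughly as in the mock below. It uses plain enums in place of the real compiler and StableMIR types, and leaves out the cfg gating so that it runs on its own; `InternalTyKind`, `try_stable`, and `describe` are hypothetical names.

```rust
/// Mock of the compiler's kind enum after the new variant landed.
#[derive(Debug, PartialEq)]
pub enum InternalTyKind {
    Bool,
    Int,
    Dummy,
}

/// Mock of the published stable enum, which predates `Dummy`.
#[derive(Debug, PartialEq)]
pub enum TyKind {
    Bool,
    Int,
}

/// Opaque description of something the stable API cannot represent.
#[derive(Debug, PartialEq)]
pub struct Opaque(pub String);

/// Fallible translation: unknown variants become `Err` instead of panicking.
pub fn try_stable(kind: &InternalTyKind) -> Result<TyKind, Opaque> {
    match kind {
        InternalTyKind::Bool => Ok(TyKind::Bool),
        InternalTyKind::Int => Ok(TyKind::Int),
        InternalTyKind::Dummy => Err(Opaque(format!("{:?}", kind))),
    }
}

/// A tool can keep working and degrade gracefully on unsupported kinds.
pub fn describe(kind: &InternalTyKind) -> String {
    match try_stable(kind) {
        Ok(stable) => format!("supported: {:?}", stable),
        Err(Opaque(repr)) => format!("unsupported: {}", repr),
    }
}
```

A tool that only calls the infallible accessor would hit the `unimplemented!` path on a new compiler, so the fallible variant is the safer default for code that must tolerate compiler upgrades.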

New StableMIR functions

Adding new StableMIR functionality would not break older versions of StableMIR or users of older compiler versions.

If changes are needed on the compiler side, such as adding a new method to CompilerContext, the new feature should be guarded using the nightly_feature mechanism mentioned above. In this case, the new feature will only be available for users of the new compiler.

Conclusion

This proposal will allow the published stable_mir crate to be compatible with different versions of the Rust compiler. It will also reduce the current friction with rustc developers, since they would no longer need to worry about backward compatibility.

The main downsides of the proposed solution are the upfront cost of refactoring our existing crates, and the extra maintenance needed to keep the current major version compatible with newer versions of the compiler. This maintenance cost will be borne by the StableMIR group.

However, the StableMIR group will always have the option to publish a new major version using the compiler's stable_mir crate. This can be done whenever maintaining backward compatibility with newer compiler changes becomes too costly.

That is, we would be able to provide better support for multiple versions of the compiler when possible, while keeping the basis for a new major version in place.

See the appendix for alternatives considered so far.

Appendix

Alternatives

Here are a few alternatives that we have explored so far.

1. Multiple versions tracked within the compiler

This was the initial version proposed. The compiler would still have two crates, stable_mir and rustc_smir. The stable_mir crate would represent the latest unreleased version of stable_mir, while rustc_smir would be aware of the different stable_mir versions.

This approach is described here: https://hackmd.io/XhnYHKKuR6-LChhobvlT-g#MVP-Design

The main drawback of this approach would be the overhead to Rust compiler developers, since they would need to be aware of all stable_mir versions that are supported by the compiler at the time they are making changes to rustc.

2. StableMIR ABI

Described in more detail here: https://hackmd.io/WXdHKkVAQMaEdk4xLxv8Tg

In this alternative we would end up with at least two versions of the stable_mir crate, where the one published to crates.io has to be able to translate from the stable_mir included in the compiler. The more stable_mir crates are published, the more translation layers there would be.

Updates to the stable_mir crate would always need to be made together with a compiler upgrade, which would force us to constantly release new major versions of stable_mir, even if there was no change to the public APIs. I.e., without bumping the major version, users running cargo update could end up with a version of stable_mir that is not compatible with the current compiler.

3. Strict versioning

The compiler tracks a single StableMIR version, like option 2. However, we won't support any proxy.

The build script of stable_mir will use this option to validate whether the current version is supported, and if not, it will fail with an error telling users to upgrade to the supported version.

Not every Rust compiler would support a published StableMIR.

This alternative would have minimal impact on rustc development, but it would greatly impact the usability of stable_mir.

^[1] We would still expose internal and stable methods as unstable methods for helping users bypass stable-mir.

Felix S Klock II

2024/06/24 14:20:35

The stable_mir crate hosted in the compiler

Just so I'm clear: Part of the vision here then is to have a `stable_mir` crate available on crates.io, and simultaneously have a `stable_mir` crate that is provided by rustc itself. Does this sound correct? If that is correct, then I assume the idea is that the rustc provided `stable_mir` is effectively the "unstable/nightly" version of stable_mir that is under development and slated for eventual deployment to crates.io. Is that also correct?

2024/06/24 14:21:43

Oh, you in fact addressed this down below in "Release Process"

Celina G. Val

2024/06/24 21:52:55

Yes, you are spot on!

2024/06/24 14:24:09

We still expect rustc developers to keep the compiler crate (2)

the "compiler crate" here denotes the rustc-hosted `stable_mir` crate, right? Or does it refer to `rustc_smir`?

2024/06/24 21:53:47

Yes. We expect the rustc developers to maintain `rustc_smir` and the compiler hosted `stable_mir`.

2024/06/24 14:31:24

we will need to make the rustc_smir interface based only on internal APIs.

Under the old design, `rustc_smir` carried a lot of weight; e.g. it was in charge of the translation between internal and external APIs. You say the new plan is to have translation now be handled by `stable_mir`. What *is* your vision for what role `rustc_smir` itself is meant to have under the new system? Your example shows it providing the `impl Context for TablesWrapper`. But I'm still trying to understand: Can the `stable_mir` crate be expected to have dependencies on arbitrary internal details of rustc (that are conditionally included based on rustc-version-introspection)? Or is there some limitation on what internal details `stable_mir` can depend upon? (Is `rustc_smir` intended to be a way to gather all the code that the Rust Compiler Team is expected to maintain as their informal support procedure for StableMIR?)

2024/06/24 21:58:08

I think you defined it well: "`rustc_smir` intended to be a way to gather all the code that the Rust Compiler Team is expected to maintain as their informal support procedure for StableMIR". I.e.: `rustc_smir` processes all the data, such as calls to monomorphization, error handling, type checking, and it aggregates it in a way that `stable_mir` only needs to provide the translation.

2024/06/24 14:38:31

g. While the proxy living inside stable_mir would only be responsible for translating internal to stable constructs, as well as caching any result, such as the def_id m

I guess this is a follow-up for my earlier question about what the role is for `rustc_smir` under the new regime. You state here that `stable_mir` would only be responsible for translating internal to stable components. So that means they would still need to *know* the structure of those internal components (i.e., they would still depend on those unstable representations). Looking at the example you've written, I'm inferring that the crucial difference is that `rustc_smir` is allowed to invoke `rustc_middle::ty::Ty::kind()` directly, while the methods in `stable_mir` all are meant to make calls into `rustc_smir` methods, never directly to `rustc_middle`. Does that sound accurate?

2024/06/24 21:58:39

Yes!! 100%

2024/06/24 22:01:58

We are trying to minimize the amount of internal compiler logic `stable_mir` depends on. So it should really only be looking at the data directly stored and exposed by an ADT.