Links

Meeting proposal: https://github.com/rust-lang/compiler-team/issues/257
Project: https://github.com/bjorn3/rustc_codegen_cranelift

Agenda

Announcements
- Major Change Proposal (should have been on Thursday agenda): Use git subtree instead of git submodule for external deps: https://github.com/rust-lang/compiler-team/issues/266
Cranelift

Summary

Learn more about Cranelift
Talk about how to support Cranelift work

Motivation

The current LLVM backend for rustc is very good at optimizing functions. However, it is not the fastest at compiling, even with optimizations disabled. When compiling in release mode this is not a big problem, but during development it can be annoying. Cranelift has the potential to improve compilation time, as it is designed for fast compilation rather than for producing heavily optimized code like LLVM. Over the past ~1.5 years I have been working on a Cranelift-based codegen backend for rustc (rustc_codegen_cranelift, or cg_clif for short). It is currently complete enough to compile many programs. While there are cases where LLVM is faster, Cranelift is already faster than LLVM in many cases.

Details

rustc_codegen_cranelift is currently at a point where it may make sense to talk about bringing support in tree. We should decide whether we want this and, if so, how. Do we want to use a submodule, like miri and clippy, or do we want to merge it in tree?

We may want to talk about the key differences between LLVM and Cranelift and in particular about why Cranelift can be more suitable for debug builds.

I would also like to talk about how to integrate the JIT support of cg_clif with cargo. This would need a way to compile all dependencies into one or more dylibs for cg_clif to load. It would also need a way for cargo run to actually request the JIT mode instead of compiling the executable and then running it.

Challenges

There are several things that cg_clif doesn't support very well, or doesn't support at all:

Inline assembly is completely missing. (wasmtime#1041)
SIMD is partially implemented. Some LLVM intrinsics are emulated, but there are still a lot missing. (cg_clif#171)
- Target feature detection on x86 doesn't work because inline assembly support is missing. Several crates, like c2-chacha, only work when a certain target feature is available. That feature is supported by most modern CPUs, but is not required by x86_64-unknown-linux-gnu and similar targets. This means that such crates have to be patched to support this combination.
  - Question: Can we fix feature detection by moving them to intrinsics? (Potentially better for Miri as well.) // Centril
    - cg_clif has code to emulate/codegen an abort on certain known pieces of inline asm. If Cranelift added a cpuid instruction, it would be possible to use that when a cpuid inline asm is used. It would also be possible to define a function with the bytes forming a cpuid + ret and then call that function instead when cpuid inline asm is requested. // bjorn3
ABI compatibility with C and preferably cg_llvm. (cg_clif#10)
- Necessary to compile proc macros using cg_clif.
- Necessary for the objc crate on macOS.
Because of missing optimizations in Cranelift, the sysroot rlibs are much bigger than with cg_llvm. This means that the linker has to do much more work. (cg_clif#762)
- https://github.com/bjorn3/rustc_codegen_cranelift/issues/878#issuecomment-598439263
- Using lld may help with this.
  - Edit(2020-03-14): lld helps a bit, but it is still much slower in many cases.
  - Edit(2020-03-15): with firefox and vscode closed, most regressions get much smaller and some even disappear completely: https://github.com/bjorn3/rustc_codegen_cranelift/issues/878#issuecomment-599271294
Unsized locals are not supported, because Cranelift doesn't have alloca support. (cg_clif#15, wasmtime#1105)
- I had to write code to codegen <Box<F> as FnOnce()>::call_once in a way that doesn't require alloca because of this.
There are failing tests in the rustc test suite. Some are LLVM specific, while others are not yet implemented features or in some cases (likely) miscompilations. (cg_clif#381)
Stack unwinding is not yet supported by Cranelift.
- There is work being done to create a .eh_frame section for backtraces.
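
For the target feature item above, one way a crate could be patched is to select its SIMD path at run time and keep a portable fallback. The sketch below is only an illustration, not how c2-chacha is actually structured: `pick_backend` and the labels it returns are hypothetical. It relies on the standard library's `is_x86_feature_detected!` macro, and whether that macro works under cg_clif depends on the cpuid handling discussed above.

```rust
/// Hypothetical helper: names the implementation a crate would dispatch to.
fn pick_backend() -> &'static str {
    // The detection macro only exists on x86 targets, so gate it with cfg.
    #[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
    {
        if std::arch::is_x86_feature_detected!("avx2") {
            return "avx2";
        }
        if std::arch::is_x86_feature_detected!("sse2") {
            return "sse2";
        }
    }
    // Portable path: always available, needs no target feature.
    "portable"
}
```

Calling `pick_backend()` once at startup and dispatching on the result means the portable path is still available when no SIMD feature is detected.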

Key design questions

Should we support cg_clif in tree?
- If so, use a submodule or merge it into rust-lang/rust?
- See also https://github.com/rust-lang/compiler-team/issues/257
How to handle unimplemented features?
- If there was cg_llvm ABI compatibility it would be possible to transparently fall back to cg_llvm for unimplemented features.
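
The fallback idea could look roughly like the sketch below. Everything in it is hypothetical: the `Feature` and `Backend` enums, the support table in `clif_supports`, and the assumption that the choice is made once per compilation unit. It only shows the decision logic, not how rustc would actually switch backends.

```rust
/// Hypothetical features a compilation unit might need.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Feature {
    InlineAsm,
    UnsizedLocals,
    PlainCode,
}

/// The two codegen backends under discussion.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Backend {
    Cranelift,
    Llvm,
}

/// Assumed support table: inline asm and unsized locals are the
/// unsupported features taken from the Challenges list.
fn clif_supports(feature: Feature) -> bool {
    !matches!(feature, Feature::InlineAsm | Feature::UnsizedLocals)
}

/// Use Cranelift when every required feature is supported,
/// otherwise fall back to LLVM.
fn choose_backend(required: &[Feature]) -> Backend {
    if required.iter().all(|&f| clif_supports(f)) {
        Backend::Cranelift
    } else {
        Backend::Llvm
    }
}
```

For example, `choose_backend(&[Feature::PlainCode, Feature::InlineAsm])` selects `Backend::Llvm`, while a unit that needs only `Feature::PlainCode` stays on `Backend::Cranelift`. Mixing the two backends in one program is only sound with the ABI compatibility mentioned above.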

Random other questions =)

What is the maintenance story? (nikomatsakis)
- Who beyond bjorn3 is familiar? If there are bugs, how will they be prioritized?
dylib vs other options (nikomatsakis)
- As I understand it, we currently load cranelift dynamically and only if it's being used. Do we wish to continue with this strategy? Will it be distributed via rustup etc.?
When do we start gating UI tests on cranelift? Should we have some compiletest flag for that? (centril)
- I'm wondering if having multiple backends is going to put pressure on us to go back to a ui/run-pass split in the test suite. It seems silly to re-run the front-end tests when one is focusing on backend compatibility issues (pnkfelix).
  - We can filter out run-pass tests; I had a PR for that ready (https://github.com/rust-lang/rust/pull/61719) (centril)
  - True, the meta-data does exist in the test files. (Just need to update compiletest accordingly…) (pnkfelix)
    - See PR above for "update accordingly" ^– (centril)
Should we start perftesting the cranelift backend? (centril)