Title: Cranelift backend for rustc
Estimate: 1 meeting
Type: Technical
Meeting proposal: https://github.com/rust-lang/compiler-team/issues/257
Project: https://github.com/bjorn3/rustc_codegen_cranelift
The current LLVM backend for rustc is very good at optimizing functions. However, it is not the fastest at compiling, even with optimizations disabled. When compiling in release mode this is not a big problem, but during development it can be annoying. Cranelift has the potential to improve compilation time, as it is designed for fast compilation rather than for producing heavily optimized code like LLVM. Over the course of the past ~1.5 years I have been working on a Cranelift-based codegen backend for rustc (rustc_codegen_cranelift, or cg_clif for short). It is currently complete enough to compile many programs. While there are cases where LLVM is faster, Cranelift is already faster than LLVM in many cases.
rustc_codegen_cranelift is currently at a point where it may make sense to talk about bringing support in tree. We should decide whether we want this and, if so, how. Do we want to use a submodule, like miri and clippy, or do we want to merge it in tree?
We may want to talk about the key differences between LLVM and Cranelift and in particular about why Cranelift can be more suitable for debug builds.
I would also like to talk about how to integrate the JIT support of cg_clif with cargo. This would need a way to compile all dependencies into one or more dylibs for cg_clif to load. It would also need a way for cargo run to actually request the JIT mode instead of compiling the executable and then running it.
There are several things that cg_clif doesn't support very well, or doesn't support at all:
Crates such as c2-chacha that only work when a certain target feature is supported. That feature is supported on most modern CPUs, but is not required by x86_64-unknown-linux-gnu and similar targets. This means that they have to be patched to support this combo.
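As a rough illustration of what such a patch could look like, the sketch below gates an accelerated path on a statically enabled target feature and falls back to portable code otherwise. This is not taken from c2-chacha or any other real crate: the function names are hypothetical, and AVX2 is only an assumed stand-in for "a feature most modern CPUs have but the baseline target does not require".

```rust
/// Portable fallback: a plain scalar loop that is valid on every target.
/// (Hypothetical helper, for illustration only.)
fn sum_portable(data: &[u32]) -> u32 {
    data.iter().fold(0u32, |acc, &x| acc.wrapping_add(x))
}

/// Accelerated path, compiled only when AVX2 is statically enabled,
/// e.g. with `-C target-feature=+avx2`. AVX2 is an assumed example feature.
#[cfg(all(target_arch = "x86_64", target_feature = "avx2"))]
fn sum_u32(data: &[u32]) -> u32 {
    use std::arch::x86_64::{
        __m256i, _mm256_add_epi32, _mm256_loadu_si256, _mm256_setzero_si256,
        _mm256_storeu_si256,
    };

    let mut chunks = data.chunks_exact(8);
    let mut lanes = [0u32; 8];
    // SAFETY: AVX2 is statically enabled by the cfg above, and every load and
    // store covers exactly eight u32 values (32 bytes) of valid memory.
    unsafe {
        let mut acc = _mm256_setzero_si256();
        for chunk in &mut chunks {
            let v = _mm256_loadu_si256(chunk.as_ptr() as *const __m256i);
            // Lane-wise 32-bit addition; overflow wraps, like the fallback.
            acc = _mm256_add_epi32(acc, v);
        }
        _mm256_storeu_si256(lanes.as_mut_ptr() as *mut __m256i, acc);
    }
    // Fold the eight lane totals together with the leftover elements.
    let tail = sum_portable(chunks.remainder());
    lanes.iter().fold(tail, |acc, &x| acc.wrapping_add(x))
}

/// Baseline build (no AVX2 guaranteed by the target): use the portable code,
/// so the crate still compiles and works on the plain target.
#[cfg(not(all(target_arch = "x86_64", target_feature = "avx2")))]
fn sum_u32(data: &[u32]) -> u32 {
    sum_portable(data)
}
```

With a fallback like this, callers use `sum_u32` unchanged, and a backend that does not enable or support the feature simply gets the portable path.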
cg_clif has code to emulate/codegen an abort on certain known pieces of inline asm. If Cranelift added a cpuid instruction, it would be possible to use that when a cpuid inline asm is used. It would also be possible to define a function with the bytes forming a cpuid + ret and then call that function instead when cpuid inline asm is requested. // bjorn3
objc crate on macOS.
lld may help with this.
lld helps a bit, but it is still much slower in many cases.
alloca support. (cg_clif#15, wasmtime#1105)
<Box<F> as FnOnce()>::call_once in a way that doesn't require alloca because of this.
.eh_frame section for backtraces.
What is the maintenance story? (nikomatsakis)
dylib vs other options (nikomatsakis)
When do we start gating UI tests on cranelift? Should we have some compiletest flag for that? (centril)
ui/run-pass split in the test suite. It seems silly to re-run the front-end tests when one is focusing on backend compatibility issues (pnkfelix).
compiletest accordingly…) (pnkfelix)
Should we start perftesting the cranelift backend? (centril)