
bootstrapping rustc_private tools from first principles

A rustc tool is a program that links against the compiler's own crates, which requires #![feature(rustc_private)].
This works because the compiling rustc has libdir = $sysroot/lib/rustlib/$target/lib in its crate search path and picks up the rustc crates from there.
In dist, the compiler crates only end up there when the rustc-dev component is installed, as it contains all the compiler rlibs.
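To make the search path concrete, here is a minimal sketch that assembles the libdir from a sysroot and a target triple. The function name `target_libdir` is made up for illustration and is not a bootstrap or rustc API; only the path layout comes from the text above.

```rust
use std::path::{Path, PathBuf};

/// Builds `$sysroot/lib/rustlib/$target/lib`, the directory rustc searches
/// for the compiler rlibs. `target_libdir` is a hypothetical helper name.
fn target_libdir(sysroot: &Path, target: &str) -> PathBuf {
    sysroot
        .join("lib")
        .join("rustlib")
        .join(target)
        .join("lib")
}
```

Calling it with an illustrative sysroot such as `/opt/rust` and the target `x86_64-unknown-linux-gnu` gives `/opt/rust/lib/rustlib/x86_64-unknown-linux-gnu/lib`.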

In dist, this is super simple. rustc has its own rlibs (or, when cross-compiling, those of its friend for the target) in $libdir and just picks them up from there when you write extern crate rustc_driver.

Bootstrap is harder. The problem here is stage 1.
When a tool links against a librustc, it is critical that the ABI of the tool and the ABI of that librustc match.
This is the case for stage2/dist compiling a tool against itself, as the ABI stage2 was compiled with is the same ABI that stage2 emits.
So stage2 can link tools against itself just fine.

Stage1 cannot. Stage1 was compiled by stage0 with the previous ABI, but the code it emits uses the current ABI.
So when we want to link a tool against the stage1 librustc, we need to use the stage0 compiler to compile the tool.

When compiling against stage2, we can use either the stage1 compiler or the stage2 compiler. Both are fine.
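The ABI argument can be written down as a tiny model. This is a sketch under a simplifying assumption: there are only two ABIs, the one stage0 emits and the one every compiler built from the current source emits. All names here (`Abi`, `emitted_abi`, `librustc_abi`, `can_link`) are invented for illustration and do not exist in bootstrap.

```rust
/// The two ABIs that matter in this simplified model.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Abi {
    /// What the downloaded stage0 compiler emits.
    Previous,
    /// What any compiler built from the current source emits.
    Current,
}

/// ABI of the code that the stage{n} compiler emits.
fn emitted_abi(stage: u32) -> Abi {
    if stage == 0 {
        Abi::Previous
    } else {
        Abi::Current
    }
}

/// ABI of the stage{n} librustc rlibs, which were emitted by stage{n-1}.
/// Returns None for stage0, which is downloaded rather than built here.
fn librustc_abi(stage: u32) -> Option<Abi> {
    stage.checked_sub(1).map(emitted_abi)
}

/// Can a tool compiled by `compiler_stage` link against the librustc of
/// `librustc_stage`? Only if both were emitted with the same ABI.
fn can_link(compiler_stage: u32, librustc_stage: u32) -> bool {
    librustc_abi(librustc_stage) == Some(emitted_abi(compiler_stage))
}
```

In this model `can_link(0, 1)` holds while `can_link(1, 1)` does not, and both `can_link(1, 2)` and `can_link(2, 2)` hold, which matches the cases described above.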

This leaves us with two options for compiling tools linked against >=stage2.
We can either use stage{n-1}, or stage{n}.

  • Using stage{n-1} means that it's consistent with stage1.
    This will be referred to as "forward compilation".
  • Using stage{n} means that it's consistent with dist.
    This will be referred to as "self compilation".
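The two options differ only in which compiler stage builds the tool. A minimal sketch, with the hypothetical names `Strategy` and `build_compiler_stage`, where the argument is the stage of the librustc the tool links against:

```rust
/// The two ways of picking the compiler that builds a tool.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Strategy {
    /// Build with stage{n-1}, consistent with how stage1 must be handled.
    Forward,
    /// Build with stage{n}, consistent with dist.
    SelfCompilation,
}

/// Stage of the compiler that builds a tool linked against the librustc of
/// `linked_stage`, or None if the strategy cannot produce such a tool.
fn build_compiler_stage(strategy: Strategy, linked_stage: u32) -> Option<u32> {
    match strategy {
        // There is no compiler before stage0, so stage0 itself is excluded.
        Strategy::Forward => linked_stage.checked_sub(1),
        // Stage1 cannot link tools against itself (ABI mismatch), so self
        // compilation only works from stage2 onwards.
        Strategy::SelfCompilation => {
            if linked_stage >= 2 {
                Some(linked_stage)
            } else {
                None
            }
        }
    }
}
```

Forward compilation covers stage1 as well, which is the consistency argument made in the first bullet.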

Another open question is how exactly we get the rlibs for the forward compilation into the proper sysroot.
The sysroot of the stage1 compiler is stage1, and that is where it looks for crates.
This means that the rlibs of the stage2 compiler need to end up in the stage1 sysroot.

Forward compilation, in short:

  • stage1 builds stage2 rustc
  • the compiled rlibs get copied into the stage1 sysroot
  • stage1 builds the tool
  • the compiled tool gets copied into the stage2 sysroot

Note: this works naturally with cross-compilation!

Copying the tool into the stage2 sysroot might look wrong, but it's right: the librustc_driver.so in $sysroot/lib that the tool links against at runtime is in the stage2 sysroot, and it is the equivalent of the rlibs in the stage2 libdir.

When using forward compilation, --stage n means that the tool will be compiled with the stage{n} compiler and end up in the stage{n+1} sysroot.
This is derived from --stage 0 having to mean "compile with stage0 rustc and link against stage1 rustc".
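The forward compilation steps and the meaning of --stage n can be summarized in one small table-like function. This is a sketch of the scheme as described here, not of bootstrap's actual data structures; `ForwardBuild` and `forward_build` are hypothetical names.

```rust
/// Where everything comes from and goes to for a forward-compiled tool.
#[derive(Debug, PartialEq, Eq)]
struct ForwardBuild {
    /// Stage of the compiler that compiles the tool.
    build_compiler: u32,
    /// Stage of the rustc whose rlibs the tool links against.
    linked_rustc: u32,
    /// Sysroot the rlibs must be copied into so the build compiler finds them.
    rlib_sysroot: u32,
    /// Sysroot the finished tool is copied into.
    tool_sysroot: u32,
}

/// Interprets `--stage n` under forward compilation.
fn forward_build(stage: u32) -> ForwardBuild {
    ForwardBuild {
        build_compiler: stage,
        linked_rustc: stage + 1,
        // The rlibs take a step back into the build compiler's sysroot.
        rlib_sysroot: stage,
        // The tool lives next to the compiler it is linked against.
        tool_sysroot: stage + 1,
    }
}
```

For --stage 1 this gives a tool built by stage1, linked against stage2, with the stage2 rlibs copied into the stage1 sysroot and the tool copied into the stage2 sysroot.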

Forward compilation gets a bit weirder when we reach stage 2.
--stage 0 means link against stage1, --stage 1 means link against stage2, and --stage 2? It means link against stage3 of course!
Now, we generally don't actually have a stage3, but that's ok.

yeah this sounds cool but how exactly does this work today???

does it?

This section assumes --stage n when it uses n.
First, bootstrap ensures the stage{n+1} compiler exists.

It then does a cargo build of the tool with the stage{n} compiler.

librustc rlibs are copied in the RustcLink step.
The rlibs are copied from stage1-rustc (which is where stage2 rustc lives) into the stage1 sysroot. This is the "back step" that lets the stage1 compiler pick them up when compiling a tool.

That part makes sense. So for any n, a --stage n tool is linked against the stage{n+1} compiler.
This means it should end up in the stage{n+1} compiler's sysroot.

Haha, no. After building a --stage n tool, we throw it into the stage{n} compiler's sysroot.
This is wrong, as it's now in a different sysroot than the compiler it is linked against.
It was separated from its best friend :(.
As an aside, that also means that even if it had a fancy $ORIGIN/../lib rpath, it would simply not work: the rpath would resolve to the stage{n} lib directory, not the stage{n+1} one the tool needs.
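The rpath problem is easy to see by resolving the path by hand. The sketch below assumes a layout where each sysroot is a directory named stage{n} with bin and lib subdirectories; the helper names and the `mytool` binary are made up for illustration.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical layout: the stage{n} sysroot is `<build_dir>/stage{n}`.
fn sysroot(build_dir: &Path, stage: u32) -> PathBuf {
    build_dir.join(format!("stage{}", stage))
}

/// Resolves an `$ORIGIN/../lib` rpath for the binary at `tool_path`.
/// `$ORIGIN` is the directory that contains the binary.
fn origin_relative_libdir(tool_path: &Path) -> Option<PathBuf> {
    let origin = tool_path.parent()?;
    Some(origin.parent()?.join("lib"))
}

/// True if a `--stage n` tool placed in the stage{n} sysroot would find the
/// lib directory of the compiler it was linked against (stage{n+1}).
fn rpath_finds_linked_compiler(build_dir: &Path, stage: u32) -> bool {
    let tool = sysroot(build_dir, stage).join("bin").join("mytool");
    origin_relative_libdir(&tool) == Some(sysroot(build_dir, stage + 1).join("lib"))
}
```

With the tool in the stage{n} sysroot, `rpath_finds_linked_compiler` is false for every n, because the rpath always lands in stage{n}/lib.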

ok and what about rustdoc? isn't it good?

I'm glad you asked! The good news is that rustdoc is completely different code.
yeah.

The bad news is that it's completely different code.

So. Remember the part about --stage n meaning that the stage{n} compiler builds it? Yeah no.
Here, it means that stage{n-1} builds rustdoc.

So --stage 0 means that we just get rustdoc fresh out of the downloaded stage0 sysroot, easy!

For any later stage, it builds rustdoc as usual.

It's copied into the stage{n} sysroot, which is actually correct here, as that is the sysroot of the compiler it's linked against.
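For comparison with the tool case, the rustdoc rules can be sketched the same way. Again, `RustdocSource` and `rustdoc_for_stage` are hypothetical names that only encode the behaviour described in this section.

```rust
/// Where a `--stage n` rustdoc comes from.
#[derive(Debug, PartialEq, Eq)]
enum RustdocSource {
    /// Taken as-is from the downloaded stage0 sysroot.
    Downloaded,
    /// Built by `build_compiler` and copied into the `sysroot` stage.
    Built { build_compiler: u32, sysroot: u32 },
}

/// Interprets `--stage n` for rustdoc: stage{n-1} builds it and it ends up
/// in the stage{n} sysroot, except for stage 0 where nothing is built.
fn rustdoc_for_stage(stage: u32) -> RustdocSource {
    match stage {
        0 => RustdocSource::Downloaded,
        n => RustdocSource::Built {
            build_compiler: n - 1,
            sysroot: n,
        },
    }
}
```

Note the off-by-one relative to other tools: here --stage 2 means built by stage1 and linked against stage2, whereas for a forward-compiled tool it means built by stage2 and linked against stage3.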

End of the story.