Summary

The crates that form Rust's standard library are managed differently from how Cargo handles other crate dependencies. They are shipped as precompiled artifacts, and because a single set of artifacts is expected to serve every use case of a particular target, this causes issues for certain users. This Project Goal aims to propose a design for building standard library crates locally, similarly to other dependencies.

In 2019 an initial implementation of build-std was merged, an RFC for std-aware Cargo was closed and the std-aware Cargo working group was formed. This project goal aims to produce a continuation of the closed RFC using the issues listed in the working group as a guide.

This document will link to issues there for additional context. Low issue numbers below 100 are links to the Working Group issues, issues in the order of #10,000 are generally Cargo issues, and issue numbers in the order of #100,000 are Rust issues.

Motivation

In today's Rust environment, core and std are shipped as precompiled objects. This was done for a number of reasons such as faster compile times and a more consistent experience for users of these dependencies. This design has served the bulk of users fairly well. However there are a number of less common uses of Rust that are not well served by this approach. Examples include:

  • Supporting tier 3 targets, for which builds are not distributed, and custom (".json") targets. These targets often require building at least core through Cargo or from the rust-lang/rust tree.
  • Users who would like to make different optimizations to core or std, such as opt-level = 's', with panic = "abort". This is common for embedded users, who would like smaller binaries. Performance or security optimisations can also be achieved through codegen flags like -Ctarget-cpu or -Ctarget-feature.
  • Supporting compilation options that affect the ABI or otherwise require full coverage of the binary, such as sanitizers (CFI in particular) or target-features like softfloat/SVE. Users changing flags to be classed as Target Modifiers will need to rebuild all required standard library crates in order to avoid errors. Standard library build support has been cited as a blocker for stabilising flags like -Zbranch-protection.
  • Modifying core or std through the use of feature flags (#4). In particular this may be useful to reduce binary size for embedded users through features such as backtrace or optimize_for_size. Swapping the panic_unwind feature for the panic_immediate_abort feature is required to build with panic = "abort" on a target that is unwind by default, and enabling profiler is required for PGO (#68).

Many of these use cases can be addressed by a minimal method for building standard library crates with custom Rust flags. This is evidenced by the current -Zbuild-std usage, despite the fact that it is not suitable for stabilisation at this point.

Prior Art

2015 RFC

In 2015, an RFC written by Ericson2314 was opened. It proposed making Cargo aware of standard library dependencies by allowing the user to specify them in Cargo.toml and build them as regular dependencies. This included unique syntax and behaviour for Cargo to specify that a dependency belongs to the standard library and to handle backwards compatibility with crates that do not specify their standard library dependencies. After much discussion the RFC was closed; a lack of time from the author and the development of Xargo were cited as reasons.

Xargo

In 2016, Xargo was created as an unofficial experiment to rebuild the sysroot before building user crates (as opposed to building std crates as regular dependencies). The project was successful and proved useful to users at the time.

A request to merge similar functionality into Cargo was opened in 2018 and led to the closure of the previous RFC. The approach is largely easier to implement than the previous proposal, but concerns were raised over building the sysroot separately, including the ongoing maintenance cost of building standard library crates differently to user ones and the build-time parallelism left on the table when considering the user's dependency graph too. The concept of the sysroot as a whole (over explicit dependencies via --extern) was also questioned.

The request proposed a user interface involving adding fields to the .cargo/config file to explicitly specify bootstrapping stages, though this didn't gain much support and was considered brittle and user-unfriendly.

2019 RFC

In 2019, an RFC written by James Munns was published. The proposal was much more comprehensive than previous approaches, going into detail on the interface and functionality of the new features, but was light on implementation details required for Cargo. It proposed inferring whether to build standard library crates from the Cargo profile and specified a method to declare crate features in the user's Cargo.toml. It proposed allowing the user to provide their own custom source versions of standard library crates by declaring a [patch.sysroot] section of the Cargo.toml. It also proposed stabilising a Target Specification Format ("JSON targets") - an idea that has since been shown to be difficult.

The proposal was closed shortly after it became clear the work was larger than originally thought, though no large concerns were raised over the design.

wg-cargo-std-aware & MVP

When the 2019 RFC was closed, the cargo-std-aware working group was created with a comprehensive issue tracker documenting various considerations for build-std. Shortly after, an MVP for -Zbuild-std was merged that formed the basis of the implementation still used today.

The implementation involves manually specifying whether to build standard library crates via the -Zbuild-std flag (and optionally changing the default std + panic_unwind crate set by configuring a value for the flag). There is also a -Zbuild-std-features=... flag for overriding the default crate features (this interface makes no attempt at stability). The rust-src component is used for distributing the standard library. Cargo resolves the standard library separately and then injects it into the user's unit-graph, allowing for build-time parallelism and shared artifacts between the two resolves - a design which is arguably not a long-term solution. Modifications to the standard library are not supported, either through source code modifications (while there's a way to do this, rebuilds won't be triggered) or through [patch] in the Cargo.toml.
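As a rough illustration of the interface described above, the following sketch assembles the argument list for a nightly Cargo invocation. The helper name `build_std_args` and the target triple in the usage note are illustrative choices, not part of Cargo; the `+nightly` argument assumes `cargo` is the rustup proxy; and only the `-Zbuild-std` and `-Zbuild-std-features` flags themselves come from the text.

```rust
/// Hypothetical helper: builds the arguments for `cargo` when using the
/// unstable -Zbuild-std interface. Empty `crates` emits the bare flag so
/// that Cargo's default crate set applies; empty `features` emits no
/// feature flag at all.
fn build_std_args(target: &str, crates: &[&str], features: &[&str]) -> Vec<String> {
    // `+nightly` selects a nightly toolchain via rustup (assumption: cargo
    // is invoked through the rustup proxy).
    let mut args = vec!["+nightly".to_string(), "build".to_string()];
    // -Zbuild-std is used together with an explicit --target.
    args.push(format!("--target={}", target));
    if crates.is_empty() {
        args.push("-Zbuild-std".to_string());
    } else {
        args.push(format!("-Zbuild-std={}", crates.join(",")));
    }
    if !features.is_empty() {
        args.push(format!("-Zbuild-std-features={}", features.join(",")));
    }
    args
}
```

For example, `build_std_args("thumbv7em-none-eabihf", &["core", "alloc"], &["optimize_for_size"])` yields the arguments for building only core and alloc with a size-oriented feature enabled.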

The total scope of the issues in the working group repository is very large and upon revisiting it a few years later I've found it very difficult to individually solve issues without an overarching plan for the feature - hence the creation of this Project Goal.

Suggested scope

This section is subject to change depending on what is discovered.

The Working Group issue tracker is a comprehensive list of all facets of the problem discovered so far - many of which are tagged as a "Stabilization Blocker" or as "Plan Before Stabilization". The tracker should be used as the full list of ideal requirements, but in the interest of progress I would like to reduce the scope for this initial stabilisation push. Even if a feature is out of scope at this stage I would like to ensure that there's a likely path for implementing the feature in the future.

The features below are suggested to be left out of scope at this stage, given their potential use cases balanced against their inherent complexity.

  • Non-critical features that don't currently work with a precompiled standard library but might be expected to work with a regular Cargo dependency built from source, such as cargo vendor
    • I'm not aware of any strong use cases that have emerged for features like this, but they seem like a likely candidate to build in after stabilisation. I'm interested in hearing cases for individual features that may fall under this category.
  • Modifying the source of the standard library crates.
    • While this seems a small step from building std crates locally, the consequences seem fairly large and involve project team consensus. I don't want to distract from the core use cases outlined above.
    • From a technical standpoint, this currently works (and is indeed very difficult to prevent), but Cargo will not trigger rebuilds on source code changes without a clean first.
  • Patching standard library dependencies
    • I believe the main use case for this may be fixing up the standard library for a tier 3 or custom target.
    • It is proposed to be out of scope for similar reasons to the above, but the implementation and UI considerations for this feature are much larger.

Large open questions

All of these questions are core to the idea of build-std and affect all use cases, with the arguable exception of the final question regarding std Cargo features. They are not a complete list of problems to solve - rather, user-facing questions to answer that will guide further decisions. The main goal of this section is to outline the problems rather than try to solve them.

Is the standard library special?

There is a clear desire for the standard library to behave more like a regular crate dependency - doing so would reduce Cargo's maintenance overhead significantly and will likely reduce user friction with the feature. Past designs and implementations have tangled with the reality that the standard library behaves nothing like a regular crate dependency right now and changing that is rather hard. Deciding whether the standard library should remain special has a large impact on Cargo's internal implementation as well as an effect on the UX (#43) of the feature.

The standard library is currently special for a number of reasons:

  • It lives in the global sysroot and rustc looks for it there by default.
    • This is partially solved by the work ehuss did on --extern=noprelude: (#49), which allows for explicitly specifying artifacts for the standard library.
    • Rust will still look in the global sysroot if it needs a crate that has not been passed with an --extern flag which leads to confusing "duplicate item" errors at times (#71, among many others). This can be solved with a mechanism to disable this sysroot lookup (#31), such as a new rustc flag.
  • The standard library and some of its crates.io dependencies require nightly features.
    • This is valid in general because the standard library version always matches the compiler version and standard library dependency versions are tracked in its Cargo.lock file.
    • This seems really hard to change but is easy to mitigate. Handling this problem on stable toolchains is covered by the question below.
  • The standard library must be built with the dependency versions in its lockfile.
    • This is because dependencies may use nightly features and that changing dependency versions may invalidate careful testing performed on Rust releases.
    • In addition, any crates.io dependency may push a new broken minor version which, if pulled into the build by Cargo's default "greedy" version resolution strategy, could cause breakage on a large scale.
    • This has a significant cost for Cargo's resolution which is explained in more detail below.
  • The user specifies their dependency on standard library crates in source code
    • This should ideally be a Cargo feature rather than directives like #![no_std], #![no_core] or extern crate alloc which cannot be read by Cargo.

A future where standard library crates are not special would be really nice, but it seems most likely that they will have to be somewhat special, if less so than today.

An alternative to a Cargo solution, implemented in rustup and embracing the fact that standard library crates are special, has been explored but hit a roadblock on a niche interaction with artifact deps (#9096).

How does a stable compiler build standard library crates with nightly features?

Non-Cargo users like Rust for Linux want an approved method for doing this too. Currently this works via RUSTC_BOOTSTRAP which is a satisfactory solution if blessed by the compiler team for these purposes. It's valid for a rustc user to enable nightly features on a stable compiler when the crate sources and compiler agree on the set of nightly features available - as is the case with build-std where the standard library sources (and the dependencies it specifies) match the version of the compiler in use.

A solution must:

  • Be applicable to standard library dependencies too, which may also use nightly features. Having rustc check the crate names against a pre-approved list of standard library crates is not acceptable.
  • Not be open to abuse. Some users, once learning about this feature, may use it incorrectly and potentially expose this to the ecosystem. A new stable compiler flag may be considered inappropriate for this reason.

If standard library dependencies were vendored, one possible solution is an undocumented file that could be included with the crate sources and detected by rustc. Alternatively we could check if crate sources are in the sysroot (and thus for use only by a single toolchain).

The standard library is also currently required to be built with the unstable "-Zforce-unstable-if-unmarked" rustc flag.

Source vs Unit auditing

A very robust solution should not only validate the crate source involved, but also the compiler flags. E.g., even if the "correct" source code corresponding to the Rust compiler being used is to be built, it may be possible to cajole Cargo into compiling with peculiar flags that result in behavior that shouldn't be deemed stable.

To address this problem, we could consider auditing not the standard library source code in the lockfile (pre-resolution), but the standard library compilation steps in the Unit Graph (post-resolution). The chance of something sneaking in between the unit graph and the actual invocation of rustc and other tools is far, far lower.
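A minimal sketch of what a post-resolution audit could look like is given below. The `Unit` struct is a heavily simplified, hypothetical stand-in for an entry in Cargo's unit graph, and the allow-list of flags is an assumption made purely for illustration; nothing here reflects an existing Cargo or rustc API.

```rust
/// Hypothetical, simplified compilation step from a unit graph.
struct Unit {
    crate_name: String,
    /// Whether this unit belongs to the standard library build.
    is_std: bool,
    /// Flags that would be passed to rustc for this unit.
    rustc_flags: Vec<String>,
}

/// Returns (crate name, flag) pairs for every standard library unit that
/// carries a flag outside `allowed`. User crates are not audited.
fn audit_units<'a>(units: &'a [Unit], allowed: &[&str]) -> Vec<(&'a str, &'a str)> {
    let mut violations = Vec::new();
    for unit in units.iter().filter(|u| u.is_std) {
        for flag in &unit.rustc_flags {
            if !allowed.contains(&flag.as_str()) {
                violations.push((unit.crate_name.as_str(), flag.as_str()));
            }
        }
    }
    violations
}
```

Because the check runs on the final list of compilation steps, it would see the flags exactly as they are about to be handed to rustc, regardless of which profile, config file or environment variable introduced them.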

How are standard library crates distributed? #11

Currently this is the rust-src rustup component, which packages the library/ and src/llvm-project/ directories. Continuing to do so may make it harder to change the structure of rust-lang/rust, as seen with the build-std breakage from #128534, though this can be mitigated with proper rust-lang/rust build-std testing (#85).

Concern should be given to dependencies of the standard library on crates.io - what happens if a version relied on by a version of Rust is pulled? Vendoring the standard library dependencies would help mitigate any fallout from this.

This question also has an impact on how Cargo loads std crates and resolves them to generate a Unit Graph (a collection of things to build). Other options to rust-src include distributing .crate tarballs for standard library crates.

The solution to this question relates to the solution to the previous question, in that tracking what the source code is reduces the need to track where it came from. For example, if rustc comes hard-coded with a whitelist of crate hashes, Cargo is relieved from caring as much whether the source code was provided by rustup, because it will check the crate hashes against rustc's whitelist either way.
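The hash-checking idea can be sketched as follows. This is an illustration under stated assumptions: `Allowlist` and `source_hash` are hypothetical names, and `DefaultHasher` is used only as a stand-in that is available in the standard library. It is not cryptographic and its output is not stable across Rust releases, so a real design would need a proper content hash.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashSet;
use std::hash::{Hash, Hasher};

/// Stand-in content hash (assumption: a real scheme would use a
/// cryptographic hash with a stable, specified output).
fn source_hash(source: &[u8]) -> u64 {
    let mut hasher = DefaultHasher::new();
    source.hash(&mut hasher);
    hasher.finish()
}

/// Hypothetical list of approved crate source hashes shipped with a toolchain.
struct Allowlist {
    hashes: HashSet<u64>,
}

impl Allowlist {
    /// Records the hash of every approved source blob.
    fn from_sources(sources: &[&[u8]]) -> Self {
        Allowlist {
            hashes: sources.iter().map(|s| source_hash(s)).collect(),
        }
    }

    /// True if `source` hashes to an approved value, wherever it came from.
    fn permits(&self, source: &[u8]) -> bool {
        self.hashes.contains(&source_hash(source))
    }
}
```

The point of the sketch is that `permits` never asks where the bytes were obtained: sources delivered by rustup, a vendored directory or a registry would all be accepted or rejected on content alone.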

How does the user enable building crates from source? #43

Previous solutions to this problem attempted to specify standard library dependencies in the user's Cargo.toml file (#5, RFC #2663) but ran into numerous problems. My current aim for this RFC is to explore other approaches.

Some aspects to balance in this decision include:

  • Making build-std as transparent to the user as possible as many situations where it is required are not currently obvious to end users, though there is definitely room for improvement in this through initiatives such as Target Modifiers which I will expand on below.
    • In situations where the user must or should use build-std suitable error messages must be implemented, many necessarily in rustc, directing them to rebuild std crates.
  • Enabling build-std comes with a compile-time cost, significantly worsening one of Rust's most common complaints. In particular this is an issue when many user-specified dependencies depend on std, blocking compilation on the building of that one crate.
    • Quite often users will iterate on a build configuration, causing many rebuilds of std, but once this build configuration is decided built std artifacts can be reused in future builds.
    • Release builds are less sensitive to build times and more sensitive to runtime performance than debug builds.

The current build-std flag implementation is not suitable for stabilisation as it exposes standard library crate names directly. The library team have not committed to these names being stable, and some members actively want to change this area: there are discussions around moving the standard library into a single crate with conditional compilation. At minimum this flag requires some stable set of arguments to form an abstraction over the standard library crate structure.

To explore this area further, let's consider making build-std as transparent as possible to the user while only focusing on build times to the point where they do not hurt the ease of use this brings.

Doing so involves Cargo making the decision on whether to build std crates itself, what crates to build and what crate features to build them with. Such a solution might be possible at this point but requires more design work to ensure this is feasible.

The Target Modifiers RFC proposes marking flags that must be set for all crates for correct usage as target modifiers, ensuring that rustc errors if flags are improperly mixed. If Cargo could gain awareness of these flags then it could automatically enable build-std if the user changes these flags. When changing other flags, which do not break linking, Cargo could also force rebuilding std for release builds and not rebuild during debug builds (which would involve tolerating differences in fingerprints) for build time reasons.
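The decision described in the previous paragraph can be written down as a small function. This is only a sketch of the policy as stated here; the type and function names are hypothetical, and how Cargo would actually learn which flags are target modifiers is left open.

```rust
/// Cargo profile kinds relevant to the decision (simplified).
#[derive(Debug, Clone, Copy, PartialEq)]
enum Profile {
    Dev,
    Release,
}

/// Where the standard library artifacts come from.
#[derive(Debug, PartialEq)]
enum StdSource {
    Prebuilt,
    BuildFromSource,
}

/// How the user's flags differ from those the pre-built std was built with.
struct FlagChanges {
    /// A flag classed as a target modifier differs.
    target_modifier: bool,
    /// Some other codegen flag, which does not break linking, differs.
    other_codegen: bool,
}

fn choose_std_source(changes: &FlagChanges, profile: Profile) -> StdSource {
    if changes.target_modifier {
        // rustc would reject the mix, so rebuilding is mandatory.
        return StdSource::BuildFromSource;
    }
    if changes.other_codegen && profile == Profile::Release {
        // Release builds favour runtime performance over build time.
        return StdSource::BuildFromSource;
    }
    // Dev builds tolerate the fingerprint difference to save build time.
    StdSource::Prebuilt
}
```

Under this policy a dev build with only `other_codegen` set keeps the pre-built artifacts, while the same change in a release build triggers a rebuild.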

If Cargo is to determine when to rebuild standard library crates, it must also determine the set of crates to build. I believe it should be possible to determine something close to the desired set of standard library crates automatically by looking at the Cargo profile, codegen flags supplied and whether the target supports std, though this also needs more work. For example:

  • The Cargo test profile implies building test
  • Whether the target supports std implies whether to build std
  • The panic strategy field in the Cargo profile together with the target's default panic strategy implies whether to build panic_abort or panic_unwind

This must of course all happen over a stable interface with rustc - that is, not via a flag like -Zprint=target-spec-json.

It may be hard to determine whether alloc is required without parsing source code but building alloc is generally fairly quick compared to core or std.
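The inference rules listed above could look roughly like this. The sketch assumes that the inputs in `BuildContext` (a hypothetical struct) have already been obtained over some stable interface, and it always includes alloc on the assumption, suggested above, that building it is cheap enough not to need a precise answer.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum PanicStrategy {
    Unwind,
    Abort,
}

/// Hypothetical inputs Cargo would need in order to pick the crate set.
struct BuildContext {
    target_supports_std: bool,
    is_test_profile: bool,
    /// The `panic` setting from the Cargo profile, if the user set one.
    profile_panic: Option<PanicStrategy>,
    /// The target's default panic strategy.
    target_default_panic: PanicStrategy,
}

fn infer_std_crates(ctx: &BuildContext) -> Vec<&'static str> {
    // Assumption: alloc is always built rather than detected from source.
    let mut crates = vec!["core", "alloc"];
    if !ctx.target_supports_std {
        return crates;
    }
    crates.push("std");
    // The profile setting wins over the target default.
    let panic = ctx.profile_panic.unwrap_or(ctx.target_default_panic);
    crates.push(match panic {
        PanicStrategy::Unwind => "panic_unwind",
        PanicStrategy::Abort => "panic_abort",
    });
    if ctx.is_test_profile {
        crates.push("test");
    }
    crates
}
```

A no_std target therefore resolves to core and alloc only, while a std target built with `panic = "abort"` in its profile picks panic_abort even if the target unwinds by default.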

@Ericson2314 is of the opinion that this is a tarpit that we should leave out of scope. Deciding whether the cached builds in the sysroot are sufficient for the task is, in principle, just as hard as deciding whether a global, multi-project Cargo cache's build is applicable. A global build cache is a highly-desirable feature, but also a hard one.

In the short term, we can leave sysroot vs from-scratch building an explicit choice in the user's hands, just as -Zbuild-std is today. In the much longer term, once we have the global build cache, we may decide to deprecate the sysroot stdlib mechanism altogether, instead letting rustup's pre-built standard library components be merely a means to prepopulate the global cache.

To recap, in this view, the status quo, and the ideal "unified cache", are two stable points in the design space, and middle-grounds are unstable design points whose benefits are not worth the costs.

How does the user modify the crate features? #4

Some maintainers are already in support of stabilising some subset of standard library features, listed in the sysroot Cargo.toml. The use cases for this currently include:

  • Disabling backtrace or enabling optimize_for_size for code size improvements.
    • The latter is very new and not currently particularly useful.
  • Linking in an alternate panic strategy with panic_unwind and panic_immediate_abort
  • Enabling profiler_builtins for PGO

More standard library features may be introduced in the future. These current use cases touch a variety of users from embedded software (which cares about code size) to application-level software (which may care about runtime performance). This has led to me exploring this feature for inclusion in the project goal's RFC despite the fact that it is arguably not necessary for some common use cases for build-std.

Default features stability issues

The deficiencies of the current design of "default features" make it ill-suited to the standard library. Niko previously raised a concern that the act of disabling the default features when supplying a set of features to enable makes it a breaking change to move items in the standard library behind a feature flag, which I presume to be a tough commitment for the library team to adhere to. This means the current mechanism, build-std-features, isn't currently suitable, as providing any arguments to the command line option disables the default set of features.

This use case has been raised before (#3126) more generally for Cargo features too.

Also, a "trick" to get around this problem is to keep opting out of default features in the standard library unstable, until if/when we have a better design. This confines the breakage to unstable-using code, making such breakage permissible.

Utility of automatically resolving standard-library features

Currently build-std-features are set globally. There is the question of whether regular crates should be able to depend on standard library crates' features. As discussed below, currently standard library and "userspace" crates are resolved separately, explicitly disabling this, but we could also revisit this and resolve them together, allowing this. Is doing so worth the effort?

  • For the features the standard library has today, the answer appears to be "No".

    It seems unnecessary because currently existing std features only make sense to use with knowledge about the target platform or codegen options used at build time. This makes libraries poorly placed to make decisions about the features used.

  • But for features that may exist in the future, the answer could well be "Yes".

    For example, alloc currently has an ad-hoc CFG to disable panicking on OOM failure. This was used by Rust for Linux (until they forked alloc for other reasons, but the differences may be reconcilable), and is also used by the Windows kernel. In an ideal world, we would have driver and other low-level OS code on crates.io that would be usable across multiple OSes; such crates would want to opt out of panicking on OOM in alloc via their Cargo dependencies.

    (See https://rust-lang.zulipchat.com/#narrow/channel/219381-t-libs/topic/no_global_oom_handling.20removal for some wide-ranging recent discussion on this topic.)

Further questions

There are multiple questions to answer as part of this problem:

  • What is the mechanism for distinguishing stable from unstable standard library features?

    • It could in theory be as simple as prepending "unstable-" to any unstable ones and inserting relevant checks on feature names in Cargo. This comes with a small implementation and maintenance cost. However, the feature names would still be present on stable/beta (as unstable features will be required for building some targets) - is it necessary to hide these in the rust-src component to avoid non-Cargo users using them on stable?
    • There is a more general ask for stable/unstable features (#10881) in Cargo, which of course would be a preferable long term solution. This feature is very unlikely to use prefixes and more likely to introduce an explicit field on features. It is up to the Cargo team to decide whether to support a solution specific to the standard library in the meantime.
  • Are the library team ok with the idea of stabilising some subset of crate features?

    • Individual features are of course subject to their own individual decisions, but a team commitment to the principle is needed sooner.
    • I currently believe the commitment should include ensuring that when an item is in the default feature set, it should always be available in the default feature set. Moving items behind a new feature gate means that the new feature should be part of the default set.
  • How are crate features exposed to the user from Cargo?

This last question is very open ended right now. As mentioned above opting out of the default features isn't valid behaviour so any user-exposed features might have to be tweaks to or from this baseline, more akin to -Ctarget-feature behaviour. Some more thought is required in determining exactly what solution would be the most flexible for the standard library moving forward.

Some features might not need to be exposed to the user from Cargo at all. For example, Cargo could set panic_unwind or panic_immediate_abort depending on the profile value (see also #29).
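To make the "tweaks to or from a baseline" idea concrete, the sketch below applies `+name` and `-name` adjustments to a default feature set, in the style of -Ctarget-feature, and rejects features carrying the "unstable-" prefix unless unstable features are allowed. The function name, the error strings and the prefix convention are all assumptions taken from the options discussed above, not a settled design.

```rust
use std::collections::BTreeSet;

/// Applies "+feature" / "-feature" tweaks on top of `defaults`.
/// Features named "unstable-..." are rejected unless `allow_unstable` is
/// set (hypothetical convention, mirroring the prefix idea above).
fn resolve_features(
    defaults: &[&str],
    tweaks: &[&str],
    allow_unstable: bool,
) -> Result<BTreeSet<String>, String> {
    // Start from the full default set so that supplying a tweak never
    // silently drops the defaults.
    let mut enabled: BTreeSet<String> = defaults.iter().map(|f| f.to_string()).collect();
    for tweak in tweaks {
        let (add, name) = if let Some(name) = tweak.strip_prefix('+') {
            (true, name)
        } else if let Some(name) = tweak.strip_prefix('-') {
            (false, name)
        } else {
            return Err(format!("expected +feature or -feature, got `{}`", tweak));
        };
        if name.starts_with("unstable-") && !allow_unstable {
            return Err(format!("feature `{}` is unstable", name));
        }
        if add {
            enabled.insert(name.to_string());
        } else {
            enabled.remove(name);
        }
    }
    Ok(enabled)
}
```

With defaults of backtrace and panic_unwind, the tweaks `-backtrace` and `+optimize_for_size` leave optimize_for_size and panic_unwind enabled. Because every request is relative to the defaults, moving an item behind a new default-on feature would not break users who only add or remove unrelated features.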

Low-level requirements

This section documents requirements found during investigations that don't clearly fit into one of the user-facing questions above; many are related to implementation details.

  • This feature must be in Cargo.
    • @davidtwco has explored the idea of using rustup to build and manage sysroots, embracing the idea that std and related crates are special in Rust.
    • The approach seemed very appealing but requires Cargo to defer back to rustup to choose a toolchain for building artifact dependencies with a different std artifact which does not seem desirable for Cargo.
  • The standard library must always be built with the same dependency versions as its source Cargo.lock file.
    • This is because some dependencies use nightly features when part of the standard library.
  • The compiler should not search the sysroot for standard library dependencies when build-std is enabled to avoid confusing linkage errors.

Unified vs separate resolution #64

Today, -Zbuild-std works by separately resolving a build plan for the standard library and "userspace" crates, and then combining those build plans together. An alternative would be to instead jointly resolve all crates in a single resolution.

A unified resolution is necessary for having userspace crates' deps influence how the standard library is compiled, such as depending upon specific standard library crates and specific standard library crates' features. The utility of this is discussed above.

Performing a single resolution is easy; Cargo's default mode of operation (without today's -Zbuild-std) is to perform a single resolution. The only part that needs work is to assemble the inputs for a single, unified resolution. This means assembling a single lockfile. Such a lockfile is currently impossible to build as the standard library dependency versions cannot be changed from its own lockfile and Cargo does not allow conflicting dependencies with the same major version in the same lockfile in the case the user wants a different version of a std dependency. Relaxing this second restriction in general is not valid because it breaks producing a resolve from vendored dependencies in the absence of a lockfile as the resolver cannot distinguish a particular version to choose.

Relaxing this requirement in a way specific to the standard library may be possible, as the standard library does not export any types from its dependencies and hence cannot leak "conflicting" versions of the same type outside of the standard library. This possibility of "different" but indistinguishable type definitions existing at the same point in Rust code is the reason for the original restriction in Cargo's resolver.

As of https://github.com/rust-lang/rust/pull/135501, the standard library is starting to use public and private deps.
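The lockfile restriction discussed in this section can be illustrated with a small conflict check. In this sketch a lockfile is reduced to a list of (name, version) pairs, which is a simplification; the helper names are hypothetical and the versions in the usage note are made up. Compatibility follows Cargo's semver rule that the leftmost non-zero version component decides whether two versions are interchangeable.

```rust
use std::collections::HashMap;

type Version = (u64, u64, u64);

/// Two versions are semver-compatible when their leftmost non-zero
/// component (and everything to its left) matches.
fn compat_key(version: Version) -> Version {
    match version {
        (0, 0, patch) => (0, 0, patch),
        (0, minor, _) => (0, minor, 0),
        (major, _, _) => (major, 0, 0),
    }
}

/// Returns the names of crates that the standard library's lockfile and the
/// user's lockfile pin to different but semver-compatible versions, which
/// is the case a single unified lockfile cannot currently express.
fn find_conflicts<'a>(
    std_lock: &[(&'a str, Version)],
    user_lock: &[(&'a str, Version)],
) -> Vec<&'a str> {
    let mut pinned: HashMap<(&str, Version), Version> = HashMap::new();
    for &(name, version) in std_lock {
        pinned.insert((name, compat_key(version)), version);
    }
    let mut conflicts = Vec::new();
    for &(name, version) in user_lock {
        if let Some(&std_version) = pinned.get(&(name, compat_key(version))) {
            if std_version != version {
                conflicts.push(name);
            }
        }
    }
    conflicts
}
```

If the standard library pins libc at 0.2.150 and the user's graph resolves libc to 0.2.155, `find_conflicts` reports libc, whereas a user dependency on a semver-incompatible libc release would not conflict because both versions may already coexist in one lockfile.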