Projects for incremental compilation in rustc

Document the query system

The current query system has become a complicated mess, spread across rustc_middle, rustc_query_system and rustc_macros. All of this code is interdependent, with large parametrization traits providing a subtle back-and-forth between the crates.

A lot of its concepts are still insufficiently documented:

anonymous, eval-always, normal queries;
the dep-graph algorithm;
the handling of cycles;
behaviour of the parallel compiler.

There is a need for both a documentation effort and a simplification of the implementation, to reduce the spaghettiness of the code.

Incremental AST

See here

Index-based HIR

However, the savings tend to be unpredictable. When one part of the HIR gets invalidated, the consequences of this invalidation are not precisely controlled.

A way forward is to transition the HIR from a pointer-based tree to an index-based tree. In a pointer-based tree, modifying a node changes the hashes of all the nodes that point to it, recursively. In an index-based tree, a parent only records the index of its child, so its hash is unaffected by the change and the invalidation stops there.

This is already what happens with the Item/ItemId and the Body/BodyId distinctions. However, such distinctions can be made more widespread, in order to reduce this kind of spurious invalidation.

This should be applied in priority to:

attributes #79519;
identifiers;
spans #72015.
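The difference between the two tree shapes can be illustrated with a small standalone sketch. It does not use any rustc type: `PtrNode`, `IdxNode`, `NodeId` and the helper functions are hypothetical names, and `DefaultHasher` stands in for the stable hasher. It only shows that editing a leaf changes the hash of a pointer-based parent, while an index-based parent keeps its hash.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Pointer-based: a parent owns its children, so hashing the parent
// also hashes every descendant.
#[derive(Hash)]
struct PtrNode {
    name: &'static str,
    children: Vec<Box<PtrNode>>,
}

// Index-based: a parent only stores the ids of its children.
#[derive(Hash, Clone, Copy, PartialEq, Eq, Debug)]
struct NodeId(u32);

#[derive(Hash)]
struct IdxNode {
    name: &'static str,
    children: Vec<NodeId>,
}

fn hash_of<T: Hash>(value: &T) -> u64 {
    let mut hasher = DefaultHasher::new();
    value.hash(&mut hasher);
    hasher.finish()
}

/// Hash of the root of a pointer-based tree with a single leaf.
fn ptr_root_hash(leaf: &'static str) -> u64 {
    let root = PtrNode {
        name: "root",
        children: vec![Box::new(PtrNode { name: leaf, children: vec![] })],
    };
    hash_of(&root)
}

/// (root hash, leaf hash) of the same tree stored as a flat table.
fn idx_hashes(leaf: &'static str) -> (u64, u64) {
    let table = vec![
        IdxNode { name: "root", children: vec![NodeId(1)] },
        IdxNode { name: leaf, children: vec![] },
    ];
    (hash_of(&table[0]), hash_of(&table[1]))
}

fn main() {
    // Pointer-based: renaming the leaf changes the root's hash.
    assert_ne!(ptr_root_hash("a"), ptr_root_hash("b"));

    // Index-based: only the leaf's own hash changes.
    let (root_a, leaf_a) = idx_hashes("a");
    let (root_b, leaf_b) = idx_hashes("b");
    assert_eq!(root_a, root_b);
    assert_ne!(leaf_a, leaf_b);
}
```

In this toy model, a query that only reads the root would not need to be re-executed after the leaf changes, which is the kind of containment the list above is aiming for.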

As an extra, the full HIR could be converted to use indices instead of pointers. This could improve performance (through better cache locality) in code which traverses the HIR but does not need to do so in order. Since the indices do not need to be 8 bytes long, there could be some memory savings too.
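As a rough illustration of the possible memory savings, the sketch below compares a node holding two references with one holding two 32-bit indices. `Expr`, `ExprId`, `PtrBinary` and `IdxBinary` are hypothetical types, not the actual HIR definitions, and the pointer size depends on the target (8 bytes on 64-bit platforms).

```rust
use std::mem::size_of;
use std::num::NonZeroU32;

/// Placeholder for a HIR expression.
struct Expr {
    kind: u8,
}

/// Hypothetical 32-bit index. `NonZeroU32` leaves a niche, so
/// `Option<ExprId>` is no larger than `ExprId` itself.
#[repr(transparent)]
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
struct ExprId(NonZeroU32);

/// Pointer-based binary node: two references.
struct PtrBinary<'hir> {
    lhs: &'hir Expr,
    rhs: &'hir Expr,
}

/// Index-based binary node: two 32-bit indices.
struct IdxBinary {
    lhs: ExprId,
    rhs: ExprId,
}

fn main() {
    assert_eq!(size_of::<ExprId>(), 4);
    assert_eq!(size_of::<Option<ExprId>>(), 4);
    assert_eq!(size_of::<IdxBinary>(), 8);
    // Two pointers: 16 bytes on a 64-bit target.
    assert_eq!(size_of::<PtrBinary<'static>>(), 2 * size_of::<usize>());
    println!(
        "pointer-based: {} bytes, index-based: {} bytes",
        size_of::<PtrBinary<'static>>(),
        size_of::<IdxBinary>()
    );
}
```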

Stable spans

See #84373 and its extension #84762.

Drop the HIR once MIR is built

The HIR stays in memory during the full MIR checking and optimisation phase, but is not needed any more at that point. We could investigate freeing the HIR memory once the mir_built queries have been called. One difficulty is that the same HIR owner will yield multiple MIR bodies (e.g. closures), so its HIR can only be dropped once all of them have been built.

Steps:

steal the HIR indexes (hir_owner and hir_owner_nodes) when producing MIR in mir_built;
stop creating &'tcx references to HIR nodes in rustc_middle::hir::map;
make sure there are no pointers to the HIR stored anywhere;
HARD: allocate the HIR in one arena for each HIR owner and make sure there are no leaks;
steal and drop those HIR arenas when constructing MIR.
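The "steal and drop" idea in the steps above can be sketched in isolation. The `Steal` type below is a simplified stand-in written for this example, not the wrapper rustc uses, and `OwnerHir`, `MirBody` and `build_all_mir` are hypothetical. The sketch assumes that all MIR bodies of one owner (the item and its closures) are built together, so that the owner's HIR can be dropped right afterwards.

```rust
use std::cell::RefCell;

/// Simplified stand-in for a value that can be taken exactly once.
struct Steal<T> {
    value: RefCell<Option<T>>,
}

impl<T> Steal<T> {
    fn new(value: T) -> Self {
        Steal { value: RefCell::new(Some(value)) }
    }

    fn is_stolen(&self) -> bool {
        self.value.borrow().is_none()
    }

    /// Takes the value out; panics if it was already stolen.
    fn steal(&self) -> T {
        self.value.borrow_mut().take().expect("value already stolen")
    }
}

/// Hypothetical per-owner HIR storage (one "arena" per HIR owner).
struct OwnerHir {
    nodes: Vec<String>,
}

/// Hypothetical MIR body.
#[derive(Debug, PartialEq)]
struct MirBody {
    name: String,
}

/// Builds every MIR body for one owner, then releases the owner's HIR.
fn build_all_mir(owner: &Steal<OwnerHir>) -> Vec<MirBody> {
    let hir = owner.steal();
    let bodies: Vec<MirBody> = hir
        .nodes
        .iter()
        .map(|node| MirBody { name: format!("mir({})", node) })
        .collect();
    drop(hir); // the HIR memory for this owner is freed here
    bodies
}

fn main() {
    let owner = Steal::new(OwnerHir {
        nodes: vec!["foo".to_string(), "foo::{closure#0}".to_string()],
    });
    let bodies = build_all_mir(&owner);
    assert_eq!(bodies.len(), 2);
    assert!(owner.is_stolen());
}
```

Any later attempt to read the stolen HIR panics in this sketch, which is why the earlier steps insist that no reference or pointer to the HIR survives past MIR construction.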

Fixed-point MIR optimisations

MIR optimisations are for now limited to run only once. If this limitation were lifted, we could end up wanting to inline partially-optimised MIR from one body to another.

This will require subtle dependency tracking, akin to what oli-obk has been doing in #68828.

A more comprehensive framework could be devised, drawing inspiration from how the trait selection and evaluation caches work.

#81668 may help.
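To make the fixed-point idea concrete, here is a minimal sketch on a toy instruction list. `Stmt`, `Pass`, the two passes and `run_to_fixed_point` are all hypothetical and unrelated to the real MIR pass manager. The sketch covers a single body only and ignores the cross-body inlining and dependency tracking discussed above.

```rust
/// Toy statement type standing in for MIR statements.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Stmt {
    Nop,
    Copy { dst: u32, src: u32 },
}

/// A pass mutates the body and reports whether it changed anything.
type Pass = fn(&mut Vec<Stmt>) -> bool;

/// Deletes all `Nop` statements.
fn remove_nops(body: &mut Vec<Stmt>) -> bool {
    let before = body.len();
    body.retain(|stmt| *stmt != Stmt::Nop);
    body.len() != before
}

/// Turns `x = x` into `Nop`.
fn simplify_self_copies(body: &mut Vec<Stmt>) -> bool {
    let mut changed = false;
    for stmt in body.iter_mut() {
        if let Stmt::Copy { dst, src } = *stmt {
            if dst == src {
                *stmt = Stmt::Nop;
                changed = true;
            }
        }
    }
    changed
}

/// Runs all passes repeatedly until a full round changes nothing,
/// or `max_rounds` is reached. Returns the number of rounds run.
fn run_to_fixed_point(body: &mut Vec<Stmt>, passes: &[Pass], max_rounds: usize) -> usize {
    for round in 1..=max_rounds {
        let mut changed = false;
        for &pass in passes {
            changed |= pass(body);
        }
        if !changed {
            return round;
        }
    }
    max_rounds
}

fn main() {
    let passes: [Pass; 2] = [remove_nops, simplify_self_copies];
    let mut body = vec![
        Stmt::Copy { dst: 1, src: 1 },
        Stmt::Copy { dst: 2, src: 1 },
    ];
    let rounds = run_to_fixed_point(&mut body, &passes, 10);
    // A single round would leave a `Nop` behind; the fixed point removes it.
    assert_eq!(body, vec![Stmt::Copy { dst: 2, src: 1 }]);
    assert_eq!(rounds, 3);
}
```

With this pass ordering, one round leaves a `Nop` in the body because `remove_nops` runs before the `Nop` is created; iterating to a fixed point cleans it up. The `max_rounds` bound is a guard against passes that keep undoing each other.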