Dataflow-based const validation required a more generic dataflow framework with support for arbitrary transfer functions. This new framework was implemented in rust-lang/rust#64566. As a result, there are presently two separate frameworks for dataflow analysis that don't share any code. This design document lays out a plan for merging the two frameworks into one.
I have implemented a prototype that accomplishes this, but there may be concerns that I've overlooked. I will explain whether/how the prototype addresses each concern that I have identified so far.
A dataflow analysis is defined by the lattice, which establishes the universe of possible state vectors; the transfer function for each statement; and the direction in which statements are processed (forward or backward). At the moment, only forward analyses are supported, and the dataflow state vector must be a bitset (equivalently, the lattice must be a powerset). A new unified framework should make it easy to remove these restrictions in the future: currently, backward dataflow is done by hand, and more powerful const propagation will, I think, require a more complex lattice.
There are a few optimizations that could be applied to dataflow analyses, and a unified framework should enable as many of them as possible. For example, we could skip caching block transfer functions for dataflow analyses on acyclic MIR, since we never have to apply a block transfer function more than once. Some more complex optimizations are described in #44285.
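To make the block-transfer-function caching concrete, here is a toy, self-contained sketch (not rustc code): statement effects for a gen-kill problem are modeled as `(gens, kills)` bitmask pairs over a `u64` state vector, and a whole block's effects are coalesced into a single cached transfer function. The `Effect` type and its field names are hypothetical, invented for this illustration.

```rust
// Toy model of gen-kill transfer functions over a u64 state vector.
// (The real framework uses BitSet<Idx>; this is a simplified stand-in.)
#[derive(Clone, Copy)]
struct Effect {
    gens: u64,
    kills: u64,
}

impl Effect {
    /// Applies this transfer function to a state vector.
    fn apply(self, state: u64) -> u64 {
        (state & !self.kills) | self.gens
    }

    /// Composes `self` followed by `next` into a single transfer function.
    /// Bits gen'd by `self` but killed by `next` must not survive.
    fn then(self, next: Effect) -> Effect {
        Effect {
            gens: (self.gens & !next.kills) | next.gens,
            kills: self.kills | next.kills,
        }
    }
}

/// Coalesces all statement effects in a block into one cached block effect.
fn block_effect(stmts: &[Effect]) -> Effect {
    stmts
        .iter()
        .copied()
        .fold(Effect { gens: 0, kills: 0 }, |acc, e| acc.then(e))
}
```

Applying the coalesced block effect yields the same state as applying each statement effect in sequence, which is what makes the cache sound; on acyclic MIR the cache simply never pays for itself.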
Because we have restricted the dataflow lattice to a powerset for now, the only difference between gen-kill problems and generic ones is that transfer functions for the former are restricted to two operations ("gen" and "kill"). The prototype splits the interface for dataflow analyses into three separate traits. A base trait, `AnalysisDomain`, defines the index type used as the lattice element as well as the size of the state vector for a given `mir::Body`.
```rust
trait AnalysisDomain<'tcx>: BottomValue {
    type Idx;

    fn bits_per_block(&self, body: &Body<'tcx>) -> usize;

    fn initialize_start_block(&self, body: &Body<'tcx>, state: &mut BitSet<Self::Idx>);
}
```
The generic dataflow interface defines the transfer functions for a given dataflow problem. Notice that there are two effects for each statement or terminator. Upon entry to a statement (or terminator), the "before" effect is applied, and upon exit the unprefixed effect is applied. This "two-phase" effect system is needed for the dataflow analysis done by the borrow checker, and is unchanged from the original framework.
```rust
pub trait Analysis<'tcx>: AnalysisDomain<'tcx> {
    fn apply_statement_effect(
        &self,
        state: &mut BitSet<Self::Idx>,
        statement: &mir::Statement<'tcx>,
        location: Location,
    );

    fn apply_before_statement_effect(
        &self,
        _state: &mut BitSet<Self::Idx>,
        _statement: &mir::Statement<'tcx>,
        _location: Location,
    ) {}

    fn apply_terminator_effect(
        &self,
        state: &mut BitSet<Self::Idx>,
        terminator: &mir::Terminator<'tcx>,
        location: Location,
    );

    fn apply_before_terminator_effect(
        &self,
        _state: &mut BitSet<Self::Idx>,
        _terminator: &mir::Terminator<'tcx>,
        _location: Location,
    ) {}

    fn apply_call_return_effect(
        &self,
        state: &mut BitSet<Self::Idx>,
        block: BasicBlock,
        func: &mir::Operand<'tcx>,
        args: &[mir::Operand<'tcx>],
        return_place: &mir::Place<'tcx>,
    );
}
```
Finally, the gen-kill interface is almost identical to the generic one. However, a `GenKillAnalysis` is not given direct access to the state vector; it can only access it through the `GenKill` interface. `GenKill` abstracts the "gen" and "kill" operations so that they can either be recorded while coalescing block transfer functions or applied directly to a `BitSet`. When arbitrary lattices for generic analyses are allowed in the future, the super-trait will gain an additional restriction, e.g., `GenKillAnalysis<'tcx>: Analysis<'tcx, Lattice = BitSet<Self::Idx>>`.
```rust
pub trait GenKillAnalysis<'tcx>: Analysis<'tcx> {
    fn statement_effect(
        &self,
        trans: &mut impl GenKill<Self::Idx>,
        statement: &mir::Statement<'tcx>,
        location: Location,
    );

    fn before_statement_effect(
        &self,
        _trans: &mut impl GenKill<Self::Idx>,
        _statement: &mir::Statement<'tcx>,
        _location: Location,
    ) {}

    fn terminator_effect(
        &self,
        trans: &mut impl GenKill<Self::Idx>,
        terminator: &mir::Terminator<'tcx>,
        location: Location,
    );

    fn before_terminator_effect(
        &self,
        _trans: &mut impl GenKill<Self::Idx>,
        _terminator: &mir::Terminator<'tcx>,
        _location: Location,
    ) {}

    fn call_return_effect(
        &self,
        trans: &mut impl GenKill<Self::Idx>,
        block: BasicBlock,
        func: &mir::Operand<'tcx>,
        args: &[mir::Operand<'tcx>],
        return_place: &mir::Place<'tcx>,
    );
}
```
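The key idea behind `GenKill` can be illustrated in isolation. The sketch below is a simplified, self-contained analogue (not the real rustc types): a `u64` bitmask stands in for `BitSet<Idx>`, and the method names `gen_bit`/`kill_bit` are hypothetical. One implementer applies effects directly to the state vector; the other records them so a block transfer function can be cached.

```rust
// Simplified stand-in for the GenKill abstraction over a u64 state vector.
trait GenKill {
    fn gen_bit(&mut self, bit: u32);
    fn kill_bit(&mut self, bit: u32);
}

// Direct application: the state vector itself implements the interface.
impl GenKill for u64 {
    fn gen_bit(&mut self, bit: u32) {
        *self |= 1 << bit;
    }
    fn kill_bit(&mut self, bit: u32) {
        *self &= !(1 << bit);
    }
}

// Recording: accumulates a cached block transfer function instead.
#[derive(Default)]
struct Transfer {
    gens: u64,
    kills: u64,
}

impl Transfer {
    fn apply(&self, state: u64) -> u64 {
        (state & !self.kills) | self.gens
    }
}

impl GenKill for Transfer {
    fn gen_bit(&mut self, bit: u32) {
        self.gens |= 1 << bit;
        self.kills &= !(1 << bit);
    }
    fn kill_bit(&mut self, bit: u32) {
        self.kills |= 1 << bit;
        self.gens &= !(1 << bit);
    }
}

// A transfer function written once against the abstract interface works
// identically in both modes.
fn statement_effect(trans: &mut impl GenKill) {
    trans.gen_bit(3);
    trans.kill_bit(1);
}
```

Because the analysis only ever talks to the `GenKill` interface, the engine is free to choose per call site whether to mutate the state in place or to build up a cached transfer function.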
By making `Analysis` a super-trait of `GenKillAnalysis`, it becomes easy to implement the results cursor and visitor for all analyses. We add a blanket `impl<A: GenKillAnalysis> Analysis for A {}` to avoid code duplication. This precludes any additional specialized analysis traits, but I don't know of any that would be useful. Another downside of this approach is that a `GenKillAnalysis` can be passed to the `Engine::new_generic` method, which will execute that analysis without caching block transfer functions.
Alternatively, the two traits could be kept separate, with a defaulted `apply_whole_block_effect` method added to `Analysis`. We would use an adapter (which also holds the cached transfer functions) to wrap a `GenKillAnalysis` and implement `Analysis`. However, the adapter must be passed into the results cursor, which prolongs the time that the cached transfer functions are resident in memory. We would have to free the cache once fixpoint is reached via another method on `Analysis`.
Currently, the API for dataflow results consumers looks something like this:
```rust
let engine = Engine::new_generic(MyGenericAnalysis::new(), /* body, etc. */);
// -or-
// let engine = Engine::new_gen_kill(MyGenKillAnalysis::new(), ...);

let results = engine.iterate_to_fixpoint();
let mut cursor = ResultsCursor::new(body, results);

for statement_index in 0..body.basic_blocks()[START_BLOCK].statements.len() {
    cursor.seek_before(Location { block: START_BLOCK, statement_index });
    println!("state: {:?}", cursor.get());
}
```
TODO: finish
The prototype gen-kill dataflow framework provides a few different APIs for inspecting the results of a dataflow analysis. There are two major paradigms: a "pull"-style `ResultsCursor` and a (to be implemented) "push"-style `ResultsVisitor`. "Push"-style APIs are more easily optimized (this is why `Iterator` implementations specialize `try_fold`), but "pull"-style APIs can be more efficient when we don't actually need to inspect the state in every block (as when doing const validation).
The current `ResultsCursor` is very fine-grained to allow debugging of two-phase dataflow analyses: it lets you observe either the "before" effect or the full effect of every statement. This functionality is not very useful outside that context, since most consumers of dataflow results only need to observe the "before" effect for each statement. If the new framework causes a perf regression, we may consider renaming the current cursor to `ResultsCursorFine` and implementing a separate, less flexible one.
While results cursors are great for consumers of a dataflow analysis, they make some optimizations more difficult. To operate efficiently, they require that the full dataflow state at entry to each block is stored.
Unifying the graphviz output for both interfaces is actually not trivial, because we would like to print the block transfer functions for gen-kill problems (as is done in the original framework). Generic dataflow problems, however, do not allow transfer functions to be coalesced, so for them we can only print the diff as each statement is processed. The prototype uses another `rustc_mir` argument, `borrowck_graphviz_format`, to control how the state is rendered (sometimes it is nice to view state diffs for gen-kill problems as well). You can see an example of the diff-style output below, as well as the output from the original graphviz debugger on the same MIR.
Original: (graphviz output image omitted)
My plan is to replace the generic analysis in `librustc_mir/dataflow/generic` with the unified version, then migrate the various implementers of `BitDenotation` over one by one. After this, the "generic" implementation will become the canonical one.
The unified dataflow framework described above is not intrinsically less performant than the current setup, but the increased complexity may result in missed optimizations. It will be important to check for performance regressions as migration occurs.
Once a unified framework is merged, I plan on writing a section in the rustc guide. I believe that there's a lecture video where the current `BitDenotation` is discussed, but it would be helpful to have some written content for new contributors.