Salsa fixed point sync

# Salsa fixed point sync ## General sync * https://github.com/salsa-rs/salsa/pull/527 * Benchmark perf regression -- what's happening * easing the ordering (6-9%) ## Dashmap Three uses of dashmap * tracked functions (memoized results) * interned values * tracked structs (get to a unique identity) "validate function" -- have the ingredient index + key but still have to lookup the entry. with a slab setup we could avoid that lookup tracked functions: ```rust= #[salsa::input] struct Input { } #[salsa::tracked] fn foo(x: &dyn Db, input: Input) -> { } ``` Micha's question: is it possible to avoid the hashmap lookups when validating derived tracked structs in: https://github.com/salsa-rs/salsa/blob/86f06d8485edd3ad829a186eadb9fee91dbcf32c/src/function/maybe_changed_after.rs#L195-L225 by directly storing the pointer into the Slab page instead of the `DependencyKeyIndex`? - Going to mostly read-only after program startup. some mutation at the start, so a fast, read/lockfree-optimized data structure would be appropriate here. Options: 1. (today) global map from input key 2. sparse vector from input key (kind of the same...?) 3. store a hashmap of what functions track an input on the input itself. this small hashmap; allows for some DICE-style eager invalidation with (ideally) a high degree of memory locality a. This might be the best option. ## Fixed point sync ```rust x = 0; flag = True while flag: x = x or 1 flag = check_flag() ``` Abstract interpretation on Python code, more or less. ### Option A * Execute a query F * Execute a query G * Execute a query H * Execute a query F becomes * Execute a query F (cycle-participant) * Execute a query G (cycle-participant) * Execute a query H (cycle-participant) * Execute a query F -- return the current value (bottom) * yields F0 * execute query F again, with F0 as current value * Execute a query G (cycle-participant) * Execute a query H (cycle-participant) * Execute a query F -- return F0 * yields F1 -- we store LUB(F0, F1) * execute query F again, with F0 as current value * Execute a query G (cycle-participant) * Execute a query H (cycle-participant) * Execute a query F -- return F0 multiple threads would be annoying, it'd be slightly nicer if we knew F G and H are all on the same thread ### Option B ```rust query_f(db) .then(async { query_g(db).then { query_h(db). } }) async query_h(db) { let x = query_f(); let y = independent_work(); join(, y).await; } ``` ## DICE -- incremental Partially eager PDF: https://github.com/facebook/buck2/blob/main/dice/dice/docs/DiceIncrementalityAlgorithms.pdf - DavidB's understanding of Salsa today: incrementing a global revision count requires walking *all* deps, even if a changed value does not impact a dependent value. - Modulo durabilities: if we see that a query only depends on inputs with durability {Medium, High}, we only invalidate the input if a Medium/High dependency is invalidated - Possibly consider a bloom filter! - DICE is designed around the idea/assumption that most changes will be pretty late in a topologically sorted dependency graph, as most of the build can be cached/unchanged since people are rebuilding a binary/library with few/little dependencies. - DavidB: https://lord.io/spreadsheets/ helped me understand the different tradeoffs. Asymptotic complexities of different incremental compuation systems: - Dice: `O(changed_subset)` recomputation; (near) `O(invalidated_set)` graph traversals - Adapton: `O(changed_subset)` recomputation; `O(invalidated_set)` traversals - Salsa: `O(changed_subset)` recomputation; `O(mountain)` traversals if lazy - `mountain` is defined, by the Buck2 folks, as "the set of nodes required for a given computation request" - Salsa is less asymptoticly efficient than alternatives so that it can be more lazy: an IDE might not care! - In other cases, however, we do: https://github.com/salsa-rs/salsa/issues/41