# Persistent caching
## Summary
Save the database state to disk and reload it cheaply.
## 2025-06-11
High-level strategy:
* https://github.com/nikomatsakis/salsa/tree/serialization
* mark the tracked structs as persistable (serializable may be useful independently...)
* and list out the things you may serialize when creating the database
* eagerly add those ingredients so we know the indices don't change
* panic somewhere or other if an ingredient that may be serialized is not in that list
* storing/loading the data is then relatively straightforward, except for:
* how do you manage inputs?
* how do you remove intermediate nodes you don't want to serialize?
* etc
* this must be explored
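A minimal sketch of the "list the persistable things up front" idea, assuming nothing about Salsa's real internals. `PersistableRegistry`, `IngredientIndex`, and the placeholder types are hypothetical names for illustration only. Indices come from registration order, so the same list yields the same indices on every run, and asking for a type that was never listed panics.

```rust
use std::any::{type_name, TypeId};
use std::collections::HashMap;

/// Hypothetical stand-in for an ingredient index (not Salsa's real type).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct IngredientIndex(pub u32);

/// Hypothetical registry, filled once when the database is created.
#[derive(Default)]
pub struct PersistableRegistry {
    indices: HashMap<TypeId, IngredientIndex>,
}

impl PersistableRegistry {
    /// Eagerly register `T`. The index is the registration position, so it
    /// stays stable as long as the list and its order do not change.
    pub fn register<T: 'static>(mut self) -> Self {
        let next = IngredientIndex(self.indices.len() as u32);
        self.indices.entry(TypeId::of::<T>()).or_insert(next);
        self
    }

    /// Return the index for `T` if it was listed.
    pub fn get<T: 'static>(&self) -> Option<IngredientIndex> {
        self.indices.get(&TypeId::of::<T>()).copied()
    }

    /// Like `get`, but panics when `T` was never listed as persistable.
    pub fn index_of<T: 'static>(&self) -> IngredientIndex {
        match self.get::<T>() {
            Some(index) => index,
            None => panic!("`{}` is not in the persistable list", type_name::<T>()),
        }
    }
}

// Placeholder types standing in for persistable salsa structs.
pub struct Crate;
pub struct DefMap;
pub struct Unlisted;
```

With `PersistableRegistry::default().register::<Crate>().register::<DefMap>()`, `Crate` gets index 0 and `DefMap` gets index 1; `index_of::<Unlisted>()` would panic.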
## 2025-01-06
Goal:
* `db.serialize_to(...)`
* `let db = RustAnalyzerDb::deserialize_from(...)`
* given some kind of source (e.g., a path) where previous state was serialized
* it will deserialize a minimum amount of work and lazily deserialize as existing items are accessed
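One way to picture "deserialize the minimum, then lazily on access" is a table whose slots keep their raw bytes until first read. This is only a sketch: `LazyTable`, `Slot`, and the UTF-8 string payload are made up for illustration and say nothing about the eventual on-disk format.

```rust
use std::collections::HashMap;

/// A slot is either still in serialized form or already decoded.
enum Slot {
    Serialized(Vec<u8>),
    Loaded(String),
}

/// Hypothetical lazily loaded table; values are plain strings here.
pub struct LazyTable {
    slots: HashMap<u32, Slot>,
    decoded: usize,
}

impl LazyTable {
    /// Cheap "deserialize": keep the raw bytes and decode nothing yet.
    pub fn deserialize_from(raw: Vec<(u32, Vec<u8>)>) -> Self {
        let slots = raw
            .into_iter()
            .map(|(id, bytes)| (id, Slot::Serialized(bytes)))
            .collect();
        LazyTable { slots, decoded: 0 }
    }

    /// Decode the entry for `id` on first access and cache the result.
    /// Returns `None` for unknown ids or bytes that are not valid UTF-8.
    pub fn get(&mut self, id: u32) -> Option<&str> {
        let slot = self.slots.get_mut(&id)?;
        if let Slot::Serialized(bytes) = &mut *slot {
            let value = std::str::from_utf8(bytes.as_slice()).ok()?.to_owned();
            *slot = Slot::Loaded(value);
            self.decoded += 1;
        }
        match slot {
            Slot::Loaded(value) => Some(value.as_str()),
            Slot::Serialized(_) => None,
        }
    }

    /// Number of entries decoded so far.
    pub fn decoded_count(&self) -> usize {
        self.decoded
    }
}
```

Reading the same id twice decodes it once; ids that are never read stay as raw bytes.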
In rust-analyzer, we populate the defmap from here: https://github.com/rust-lang/rust-analyzer/blob/bfb81275fb746dadb7664831d7d7611fd72cc955/crates/ide-db/src/prime_caches.rs#L67-L72
With Salsa 3.0, this is basically what we have:
* an input `crate_graph` that references other inputs (`Crate`)
* a tracked function `crate_def_map(CrateId) -> DefMap`
* a tracked struct [`DefMap`](https://github.com/rust-lang/rust-analyzer/blob/979e3b54f70f6f231c117a5d628b98106e5c7d31/crates/hir-def/src/nameres.rs#L105-L137)
* the only "external" things it uses are salsa structs (inputs, interned, tracked)
```rust
// Sketch of the proposed `serialize` option; bodies and fields are elided.
#[salsa::tracked(serialize)]
fn crate_def_map(db: &dyn crate::Db, krate: Crate) -> DefMap {
    todo!() // body elided in these notes
}

#[salsa::tracked(serialize)]
struct DefMap {
    ... // anything in here has to be serializable
}

#[salsa::input(serialize)]
struct Crate {
    ...
}
```
* API idea
* You tag the types as `serialize` when they are declared, which generates the `Serialize` impl (as above)
* You can "serialize" given a set of starting input roots
* and a set of types LS (serializable jars, e.g., salsa structs, tracked functions, etc.) that may be serialized. We need it because deserialization has to do a `type_of(serializedType)`.
* all (de)serializable elements need to be Salsa structs (or impl `salsa::Update`)
* transitive tracked fns *also* need to be called.
* this list needs to be given to the database at the time of the database's creation.
* assertion failure if any type needs to be serialized that is not in LS
* You "deserialize" by giving that same set LS and you get back the set of roots
* Database creation
* When a salsa structure tagged as serialize is added to the database:
* serializable ingredients
* might want a static way to find the ones that are unexpected
* Serialization
* User provides a set of salsa inputs that are the "roots" of serialization
* Salsa will serialize
* Let `SalsaStructsToBeSerialized` be the roots of serialization
* Until fixed point is reached:
* Extract an id ID from SalsaStructsToBeSerialized
* Serialize ID, including attached memos
* for everything serializable, serialize the memo
* which will serialize the return type of the memo (which can be a tracked/interned struct)
* Serializing a salsa struct
*
* Deserialization
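The "until fixed point is reached" loop in the serialization notes above can be sketched as a plain worklist. Everything here is a stand-in: `Id`, `Memo`, and `serialize_from_roots` are hypothetical, memo values are strings, and "serializing" just collects them in visiting order.

```rust
use std::collections::{BTreeMap, BTreeSet, VecDeque};

/// Hypothetical id of a salsa struct (input, interned, or tracked).
pub type Id = u32;

/// Hypothetical memo: a cached value that may mention other salsa structs.
pub struct Memo {
    pub value: String,
    pub references: Vec<Id>,
}

/// Walk from the roots until no new ids appear. Returns each visited id
/// together with the values of its attached memos, in visiting order.
pub fn serialize_from_roots(
    roots: &[Id],
    memos: &BTreeMap<Id, Vec<Memo>>,
) -> Vec<(Id, Vec<String>)> {
    let mut pending: VecDeque<Id> = roots.iter().copied().collect();
    let mut seen: BTreeSet<Id> = BTreeSet::new();
    let mut out = Vec::new();

    while let Some(id) = pending.pop_front() {
        // Skip ids that were already serialized.
        if !seen.insert(id) {
            continue;
        }
        let mut values = Vec::new();
        for memo in memos.get(&id).into_iter().flatten() {
            values.push(memo.value.clone());
            // Structs mentioned by the memo become new work items.
            pending.extend(memo.references.iter().copied());
        }
        out.push((id, values));
    }
    out
}
```

Ids that are not reachable from the roots are never visited, which is one possible answer to "what gets left out", though the notes leave that question open.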
## Open question
What traits would we need and can we make them optional?
Some state in rust-analyzer might not be serializable. We'd probably only want to (de)serialize [DefMap](https://github.com/rust-lang/rust-analyzer/blob/979e3b54f70f6f231c117a5d628b98106e5c7d31/crates/hir-def/src/nameres.rs#L105-L137).
- In rust-analyzer, (de)serialization is mostly the exception.
- For `DefMap`, `CrateId` would likely be the ID.
- `DefDatabase::crate_def_map` would need to be a tracked function.
- `CrateId` would need to be (de)serializable.
- [`CrateData`](https://github.com/rust-lang/rust-analyzer/blob/9aa42935947024090d423b0cec801aee59132f5e/crates/base-db/src/input.rs#L276-L294) would need to be a `#[salsa::input]`.
## 2024-10-09
How would this work?
Three key parts:
* View map
* not serializable, populated lazily upon read/write usage.
* Ingredients -- general metadata
* Has some mutable state (LRU), but it can be removed. For functions, there's a free list.
* Might need to be (de)serialized as well; serves as the "schema".
* Table -- data for each entity
* The data itself!
We're pretty sure there needs to be a mechanism to list all the (de)serialized tracked functions and structs: this makes the persistent state contract an explicit API boundary.
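If the ingredient metadata does end up serving as the "schema", loading could start by comparing the stored list against the declared one. A rough sketch under that assumption; `Snapshot`, `LoadError`, and `check_schema` are hypothetical names, and the view map is deliberately absent because it is repopulated lazily.

```rust
/// Hypothetical on-disk snapshot: ingredient names act as the schema and
/// the table holds the raw per-entity data.
pub struct Snapshot {
    pub schema: Vec<String>,
    pub table: Vec<(u32, Vec<u8>)>,
}

#[derive(Debug, PartialEq, Eq)]
pub enum LoadError {
    SchemaMismatch {
        expected: Vec<String>,
        found: Vec<String>,
    },
}

/// Accept the snapshot only if its schema matches the declared list,
/// including order, since order determines the ingredient indices.
pub fn check_schema(snapshot: &Snapshot, declared: &[&str]) -> Result<(), LoadError> {
    let expected: Vec<String> = declared.iter().map(|name| name.to_string()).collect();
    if snapshot.schema == expected {
        Ok(())
    } else {
        Err(LoadError::SchemaMismatch {
            expected,
            found: snapshot.schema.clone(),
        })
    }
}
```

A mismatch would most likely mean discarding the cache and recomputing, but that policy is not decided in these notes.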