# Rustdoc JSON Cross-Crate `Id` resolution
NOTE: Right now, these notes are mainly targeted at me, and assume you have context.
These were 50% dumping from my brain, and 50% meeting notes from the all hands.
## Big Non-controversial things
- Add `version` to `rustdoc_json_types::ExternCrate`
- Turns out we don't have the information to do this I think.
- Add rustdoc output paths to `cargo --message-format=json`
- <https://github.com/rust-lang/cargo/issues/15558>
## Problem
> This needs its own issue at some point, but for now [the zulip chat](https://rust-lang.zulipchat.com/#narrow/stream/266220-rustdoc/topic/Rustdoc.20JSON.3A.20Include.20All.20Foreign.20Items.3F) is the best resource.
>
> The problem we want to solve is that if I have an `Id` from one crate's JSON, but the `Id` is for a foreign type, then it won't be present in `index`, only `paths`. If you want to see the `Item`, you need to find the JSON for the crate that this `Id` is from. The problem is that in this JSON you need to use a different `Id`. To find the `Id` in the crate the item is local to, you need to find the path for the item from `paths`, and then look that up in the JSON for the crate the item is from.
>
> This is cumbersome, slow, and unreliable [^unreliable].
>
> Ideally you should be able to just use the `Id` from one JSON in another. I'm not sure if this is possible, or if we'll need some translation scheme, potentially splitting `Id`s into two fields (crate id, and item id). How exactly this will work needs further design work.
[^unreliable]: Especially when the `path` isn't present in the public docs.
-- [Rustdoc-JSON 2023 roadmap](https://github.com/rust-lang/rust/issues/106697), ~~which still isn't implemented lol~~
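
To make the lookup dance above concrete, here is a minimal sketch of what a consumer has to do today. The types are simplified stand-ins for the ones in `rustdoc_json_types` (only the fields needed here), and `resolve_foreign` is a hypothetical helper, not an existing API. It also assumes the caller has already found the right JSON document for the defining crate, which is the hard part the rest of these notes are about.

```rust
use std::collections::HashMap;

/// Simplified stand-in for `rustdoc_json_types::Id`.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub struct Id(pub u32);

/// Simplified stand-in for an entry in `paths`.
#[derive(Debug, Clone, PartialEq)]
pub struct ItemSummary {
    /// 0 means "local to the crate this document describes".
    pub crate_id: u32,
    /// Fully qualified path, e.g. ["dep", "Thing"].
    pub path: Vec<String>,
}

/// Simplified stand-in for an entry in `index`.
#[derive(Debug, Clone, PartialEq)]
pub struct Item {
    pub name: String,
}

/// Simplified stand-in for one crate's JSON document.
#[derive(Debug, Default)]
pub struct Krate {
    pub index: HashMap<Id, Item>,
    pub paths: HashMap<Id, ItemSummary>,
}

/// Hypothetical helper: translate an `Id` from `user`'s JSON into the
/// matching `Id` and `Item` in `defining`'s JSON by going through the path.
pub fn resolve_foreign<'a>(
    user: &Krate,
    defining: &'a Krate,
    id: Id,
) -> Option<(Id, &'a Item)> {
    // Step 1: in the using crate the foreign item only exists in `paths`.
    let summary = user.paths.get(&id)?;
    // Step 2: scan the defining crate's `paths` for a local item with the
    // same path. This is the slow part, and it fails when the path isn't
    // present in the defining crate's public docs.
    let (local_id, _) = defining
        .paths
        .iter()
        .find(|(_, s)| s.crate_id == 0 && s.path == summary.path)?;
    // Step 3: only now can we look at the actual `Item`.
    let item = defining.index.get(local_id)?;
    Some((*local_id, item))
}
```

Note that the same item has two unrelated `Id`s here (one per document), and the path is the only thing tying them together.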
## Solution
- `Id` should be split into 2 parts, crate id & item id
- Resolve crate id to find json document with info
- Needs cargo cooperation to deal with versions. CC `--orchestrator-id`
- Alternative option from Urgau, map crate id -> rlib
- Then callers zip this information with cargo json output to find exact details of crate that gave that rlib.
- `cargo build --message-format json` prints enough info I think, has paths to `.rmeta` for crates that aren't built.
- Can use the full `"package_id"` from this output: `cargo +nightly rustdoc -p registry+https://github.com/rust-lang/crates.io-index#libc@0.2.172 --output-format json -Z unstable-options`
- Cargo people think that the rmeta path is stable enough to be the same,
- `-C extra-filename` is the hash used for the rmeta path.
- https://doc.rust-lang.org/nightly/nightly-rustc/cargo/core/compiler/fingerprint/index.html
- `-C metadata` isn't the same as `-C extra-filename` anymore: see https://doc.rust-lang.org/nightly/nightly-rustc/cargo/core/compiler/fingerprint/index.html
- Can cargo ever map from a metadata/extra-filename back to something more useful?
- Maybe?? But it'd be a heuristic, and this shouldn't be in cargo cli's.
- You need to run build-scripts to get a build graph, to get this stuff.
- Also, it might not exist anymore in the current build graph, but it could exist in the target dir.
- A question: Can we use package-id in rustdoc-json output:
- NO: package-id uniquely identifies a package, but not a unit in a build-graph: https://doc.rust-lang.org/cargo/reference/pkgid-spec.html
- Also: requires invasive changes to rustc.
- Alternative from jyn: Use `-C metadata`
- Problem: cargo passes different `-C metadata` to rustc and check in json
- Not sure why? Jacob Finkelman thinks it's a reasonable
- https://github.com/rust-lang/cargo/blob/47c911e9e6f6461f90ce19142031fe16876a3b95/src/cargo/core/compiler/build_runner/compilation_files.rs#L615
- Needs a plumbing command to map metadata -> packageid
- Or build everything with `(package, version)`, then iterate and find document with that metadata hash.
- Can rustdoc emit metadata as well as build stuff?
- Might be good in general
- This way, you build your docs for deps at the same time as you build their metadata for leaf crate.
- Falls over due to `cfg(doc)`
- Look up the item id in the `$.index` of the json document for that crate
- Needs rustc cooperation to know what that ID will be.
- We can't """just""" use `DefId`/`HirId` because rustdoc-json gives more things IDs than they do: https://github.com/rust-lang/rust/blob/master/src/librustdoc/json/ids.rs
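
A rough sketch of the two-step resolution described in the list above: first resolve the crate part to a JSON document, then look up the item part in that document's index. Everything here is hypothetical (`CrateKey`, `ItemId`, `SplitId`, `DocSet` are made-up names), and what the crate part actually contains (a `-C metadata` hash, an rlib/rmeta path, something else) is exactly the open design question, so it is modeled as an opaque string.

```rust
use std::collections::HashMap;

/// Hypothetical crate part of an id. Opaque on purpose: it could be a
/// `-C metadata` hash, an rmeta path, or whatever gets agreed on.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub struct CrateKey(pub String);

/// Hypothetical item part of an id, meaningful only within one crate.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub struct ItemId(pub u32);

/// Hypothetical split id: usable from any crate's JSON.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub struct SplitId {
    pub krate: CrateKey,
    pub item: ItemId,
}

/// Stand-in for one crate's JSON document; items are just names here.
#[derive(Debug, Default)]
pub struct Document {
    pub index: HashMap<ItemId, String>,
}

/// All the JSON documents a tool has loaded, keyed by crate.
#[derive(Debug, Default)]
pub struct DocSet {
    docs: HashMap<CrateKey, Document>,
}

impl DocSet {
    pub fn add(&mut self, key: CrateKey, doc: Document) {
        self.docs.insert(key, doc);
    }

    /// Step 1: crate part -> document. Step 2: item part -> entry in `index`.
    pub fn resolve(&self, id: &SplitId) -> Option<&str> {
        let doc = self.docs.get(&id.krate)?;
        doc.index.get(&id.item).map(String::as_str)
    }
}
```

The point of keying on something finer than the crate name is that two versions of the same crate in one build graph end up as two separate documents, and the same `ItemId` can mean different things in each.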
From Predrag: CSC already does the nameres, so only solving the crate resolution problem is still super useful for them.
Another thought: However we link CCI for HTML should also work for json!
- NOPE: The way HTML does it at the moment is kinda sad, it mangles links between `time@1` and `time@2`
- Cargo doesn't have a way to determine set of features of a dependency yet.
- Maybe "manifest v2 metadata plumbing" will solve this.
- GSOC student working on it.
## Docs.rs stuff
- We should add stuff to `ExternCrate`:
- Version
- Can't know this stuff well :'(
- Provide both package name and crate name
- Rustdoc has to know this, for html_root_url
- Need to pass `-Zrustdoc-map` when running `cargo rustdoc`
- Memory usage: Can we borrow string from the input?
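
For illustration, a sketch of what an extended `ExternCrate` could look like. Only `name` and `html_root_url` exist on the real type today, as far as I know; `package_name`, `version`, the lifetime parameter, and the `label` helper are all assumptions made up for this sketch. `Cow<'a, str>` is one way to let a deserializer borrow strings from the input buffer and only allocate when it has to.

```rust
use std::borrow::Cow;

/// Hypothetical extension of `rustdoc_json_types::ExternCrate`.
#[derive(Debug, Clone, PartialEq)]
pub struct ExternCrate<'a> {
    /// Crate name as it appears in paths, e.g. `my_dep`.
    pub name: Cow<'a, str>,
    /// Hypothetical: the cargo package name, e.g. `my-dep`.
    pub package_name: Option<Cow<'a, str>>,
    /// Hypothetical: optional, because rustdoc can't always know it.
    pub version: Option<Cow<'a, str>>,
    /// Base URL of the crate's hosted docs, if known.
    pub html_root_url: Option<Cow<'a, str>>,
}

impl ExternCrate<'_> {
    /// Hypothetical helper: best human-readable label we can build.
    /// Falls back to the crate name when the package name is unknown.
    pub fn label(&self) -> String {
        match (&self.package_name, &self.version) {
            (Some(pkg), Some(ver)) => format!("{pkg}@{ver}"),
            (Some(pkg), None) => pkg.to_string(),
            _ => self.name.to_string(),
        }
    }
}
```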
## Other stuff from rustdoc <-> cargo
- Cargo wants a `.d` file from rustdoc, and maybe some more things so they rebuild correctly. Talk to Jacob Finkelman about it.
- Also `--emit=filenames`
- Apparently this exists, guiomme and epage working on it!
- https://doc.rust-lang.org/nightly/cargo/reference/unstable.html#rustdoc-depinfo
- Cargo should use https://rust-lang.github.io/rfcs/3662-mergeable-rustdoc-cross-crate-info.html
- Fuchsia might have a patch to cargo.
- Rustdoc should add `-C metadata` to rustdoc file name
- Cargo `--message-format=json` should print the output json path with `-w json`
- Also maybe needs `--emit=filenames`
- If cbor becomes a thing across project, rustdoc-json
## Other random perf stuff
- Bincode output for rustdoc-json
- Put rustdoc-json in rustc perf.
## Issues/Links
- [`--orchestrator-id` MCP](https://github.com/rust-lang/compiler-team/issues/635)
- Possibly a bad idea says Urgau
- Adds a new tracked input to rustc, will change crate hashes
- [[rustdoc-json] `paths` is inconsistent and questionably useful](https://github.com/rust-lang/rust/issues/93522)
- [[rustdoc-json] Partially remove `paths` and introduce `external_index`](https://github.com/rust-lang/rust/pull/103085)
- [Rustdoc-Json: some item IDs are missing in `paths` field.](https://github.com/rust-lang/rust/issues/101687)
- [[rustdoc-json] absolute position of item is hard to get](https://github.com/rust-lang/rust/issues/93524)