## How to migrate from `git subtree` to `josh-proxy`
"subrepo" refers to the repo that reflects just a folder of rustc, e.g. the Miri or rust-analyzer repository.
- Note down your current subrepo head commit as LAST_SUBTREE_COMMIT.
- Do one last subtree sync from the subrepo to rustc so that LAST_SUBTREE_COMMIT exists in rustc history.
- Now do *not* do any more subtree syncs in either direction! Josh would have to figure out how to sync those and that usually fails.
- Construct your josh filter. For rust-analyzer it is
`:rev(LAST_SUBTREE_COMMIT:prefix=src/tools/rust-analyzer):/src/tools/rust-analyzer`. This tells josh how to extract the subrepo history from the rustc history: generally only the part inside `src/tools/rust-analyzer` matters, but for everything before LAST_SUBTREE_COMMIT, it needs to be treated as-if it was inside `src/tools/rust-analyzer`.
This reflects a fundamental difference in how subrepo commits get reflected in the rustc repo: with `git subtree`, they are copied identically, even preserving their git hash, so their tree contains just your subrepo at its root. With josh, the commits look exactly as-if the same change was made inside the rustc repo, i.e. your subrepo is at its usual place in the rustc folder hierarchy and the rest of the rustc repo also exists in these commits (but is of course left unchanged). With josh, looking at the rustc history, you can't tell which changes were made directly in Rust vs in the subrepo.
- Set up the scripts to do rustc-push and rustc-pull via josh semi-automatically, e.g. by copying them from Miri or rust-analyzer. Make sure the commit that adds them also contains an empty file in the crate root called `rust-version`.
- Do the first rustc-pull. You need to pull from a rustc commit that contains LAST_SUBTREE_COMMIT. This will update the `rust-version` file and potentially add a *ton* of merge commits. (In Miri we didn't get many merge commits, probably because we didn't actually successfully use `git subtree` for more than a single sync; in RA, we got around 1500 merge commits.) The overall diff should be only whatever changed in rustc that has not been synced to the subrepo yet. `git rev-list HEAD --max-parents=0 --count` should say that there is only a single root commit (unless your project already had multiple roots before josh entered the picture).
Specifically, a merge commit is created for each rustc merge commit where your subrepo differs between the two parents of that merge commit. josh considers those merge commits to "matter" for the subrepo history, and reflects them in the subrepo. So if someone creates a PR for rustc, and between them forking off of master and their PR landing in master a subrepo change lands (either via a sync or via a change that directly happens in rustc), then the merge commit of that PR will become visible in your subrepo. (Only the merge commit becomes visible, no other part of the PR.) You can look at [the rustup PRs in Miri](https://github.com/rust-lang/miri/pulls?q=is%3Apr+is%3Aclosed+rustup) to see how many of these merge commits we get in practice.
This reflects the deeper philosophy of josh that the "main" repo is rustc itself, and the subrepo is "just" a projection to a particular folder (as defined by the filter).
- Now you can merge your subrepo master (in case it changed since LAST_SUBTREE_COMMIT), and do a rustc-push to ensure everything works. I recommend playing around with this a bit, i.e. doing changes in the pushed branch and pulling them again and doing changes in the subrepo and pushing those.
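
The filter construction and the two sanity checks from the steps above can be sketched as a few shell helpers. The function names (`josh_filter`, `contains_commit`, `root_count`) are hypothetical and are not part of josh or of the Miri/rust-analyzer scripts; the git commands themselves are standard. The sketch assumes LAST_SUBTREE_COMMIT is given as a full commit hash.

```shell
#!/bin/sh
# Hypothetical helpers; none of these names come from josh or the existing sync scripts.

# Print the josh filter for a subrepo living at path $2 inside rustc,
# where $1 is LAST_SUBTREE_COMMIT (assumed to be a full commit hash).
josh_filter() {
    printf ':rev(%s:prefix=%s):/%s\n' "$1" "$2" "$2"
}

# Succeed if commit $1 (LAST_SUBTREE_COMMIT) is reachable from rustc commit $2.
# Run this inside a rustc checkout before the first rustc-pull.
contains_commit() {
    git merge-base --is-ancestor "$1" "$2"
}

# Print the number of root commits reachable from HEAD.
# Run this inside the subrepo after the first rustc-pull.
root_count() {
    git rev-list HEAD --max-parents=0 --count
}
```

For example, `josh_filter "$LAST_SUBTREE_COMMIT" src/tools/rust-analyzer` prints the rust-analyzer filter shown above with the placeholder replaced by the actual hash.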
In steady state:
- The `rust-version` file always keeps track of the last rustc commit that got merged into the subrepo (i.e., the last rustc-pull). This is useful for rustc-push to behave in a more predictable way, and it can also be useful for the subrepo's CI -- it can download that version of rustc to obtain a rustc that is guaranteed to be in sync with this repo.
- When doing rustc-pull and there are conflicts, just resolve them in the subrepo as part of the merge commit. Do *not* rebase.
- When doing rustc-push and there are conflicts, likewise do *not* rebase. In Miri what we usually do is abort the rustc-push, do a rustc-pull instead and resolve the conflicts there, and then do another rustc-push which should no longer have any conflicts.
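
As a rough illustration of how CI or a sync script might consume `rust-version`, here is a minimal sketch. The helper name `read_rust_version` is hypothetical, and the sketch assumes the file holds nothing but a single full 40-character lowercase hex commit hash; adjust the check if your scripts store it differently.

```shell
#!/bin/sh
# Hypothetical helper: print the rustc commit recorded in a rust-version file.
# Assumes the file holds a single full 40-character hex commit hash.
read_rust_version() {
    file="${1:-rust-version}"
    # Strip all whitespace, including the trailing newline.
    commit=$(tr -d '[:space:]' < "$file") || return 1
    # Reject anything that is not exactly 40 lowercase hex digits.
    if ! printf '%s' "$commit" | grep -Eq '^[0-9a-f]{40}$'; then
        echo "unexpected content in $file" >&2
        return 1
    fi
    printf '%s\n' "$commit"
}
```

Called without arguments, it reads `rust-version` from the current directory. Failing loudly on an empty or malformed file is a deliberate choice here, since the file starts out empty before the first rustc-pull.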