# Coupling Algorithm
### Problem:
We describe the state of the borrow checker with a graph whose vertices are capabilities and whose edges `X -> Y` describe how to exchange capabilities `X` for capabilities `Y`, either abstractly (with a magic wand) or concretely (with flow-dependent instructions for the PCS, stored in a _reborrowing DAG_ at each point).
At a join point, we will have two graphs. What is the graph which describes the combined state afterwards?
- The combined graph must still be able to decide whether any capability is unambiguously usable, i.e. in the PCS.
- The combined graph must still be able to explain how it exchanges capabilities.
### Idea 1: Store both graphs, under a flag
This idea works for simple join points, but I'm not aware of any strategy to reach a fixed point at loop heads. This was the original idea for the reborrowing DAG.
### Idea 2: Take the union of both graphs
This is what Polonius does. It can do so because the edges in its subset graph are fully abstract and are not linear.
For example, suppose one subset graph contains `'a <: 'b` and the other contains `'a <: 'c`. Their union contains both subsets, both of which are eliminated when `'a` expires.
We might hope to represent an edge in the subset graph as either a magic wand or a subgraph of some concrete reborrowing DAG. In a naive implementation, this means we would have to apply wands `a -* b` and `a -* c` at the same time, which will not work. For one, we cannot consume `a` twice. For another, applying either wand implies we reobtain `b` and `c` in the PCS, which may not be the case, as these capabilities may still be blocked by some other lifetimes. It is also not immediate in this interpretation which subsets should become magic wands (subsets are transitively closed, and the subset whose wand we might want to apply depends on the expiry order), and in this approach subset cycles are meaningless.
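
The loss of linearity can be seen in a small sketch. The set-of-pairs model and the names `join_by_union` and `expire` are assumptions made for illustration; they are not how Polonius stores its subset graph.

```python
# Toy model (an assumption, not Polonius' representation): a subset graph is
# a set of (sub, sup) pairs, so ("'a", "'b") stands for 'a <: 'b.

def join_by_union(g1, g2):
    """Idea 2: the joined graph is the plain union of both edge sets."""
    return g1 | g2

def expire(graph, lifetime):
    """Drop every edge leaving `lifetime`; return (remaining, removed)."""
    removed = {edge for edge in graph if edge[0] == lifetime}
    return graph - removed, removed

then_graph = {("'a", "'b")}
else_graph = {("'a", "'c")}
joined = join_by_union(then_graph, else_graph)

remaining, removed = expire(joined, "'a")
# Read as wands `a -* b` and `a -* c`, every removed edge wants to consume `a`.
consumers_of_a = len(removed)
print(sorted(removed), consumers_of_a)
```

For abstract subsets, removing two edges at once is harmless. Read as wands, the same step asks for `a` to be consumed once per removed edge, which is the linearity problem described above.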
### Idea 2.5: Flow-dependent edges
The reason idea 2 loses linearity at join points is that the union loses information about control flow. One possibility for avoiding this is to track path conditions on every edge.
However, in this approach it is more challenging to understand what "expiring an edge" means for the PCS, and the concrete annotations which explain it. We have a choice about how to interpret expiring a conditional magic wand:
1. The conditional magic wands are applied eagerly, consuming and producing path-dependent capabilities in the PCS. Now the PCS is path-dependent, and must solve path conditions to determine if a capability is usable.
2. The conditional magic wands are applied lazily, only after all path conditions on their output are met. Now the capabilities in the PCS are not path-dependent, but the graph must remember which edges are lazily expired, and solve path conditions to determine when to apply them.
The current PCS implementation for Prusti plans to follow strategy 2, using future features of Viper to do the accounting about lazily applied edges. This is completely sufficient to come up with a core proof in Viper, but it means we cannot fully explain the PCS at an earlier stage.
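
One possible reading of strategy 2 is sketched below. `CondWand`, `LazyGraph` and the string path-condition labels are hypothetical, and "all path conditions on the output are met" is approximated as "no unexpired wand still targets that output"; the Prusti implementation may account for this differently.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class CondWand:
    cond: str  # hypothetical path-condition label, e.g. "then" or "else"
    src: str   # capability consumed by the wand
    dst: str   # capability produced by the wand

class LazyGraph:
    def __init__(self, wands):
        self.pending = set(wands)  # wands that have not expired yet
        self.expired = set()       # expired wands that are not applied yet

    def expire(self, wand):
        """Record the expiry, then apply whatever has become applicable."""
        self.pending.remove(wand)
        self.expired.add(wand)
        return self._try_apply(wand.dst)

    def _try_apply(self, dst):
        # Lazy rule (an approximation): wait while any pending wand targets dst.
        if any(w.dst == dst for w in self.pending):
            return []
        ready = sorted((w for w in self.expired if w.dst == dst),
                       key=lambda w: w.cond)
        self.expired -= set(ready)
        return ready
```

With one wand into `x` per branch, expiring the first returns nothing and expiring the second returns both, so `x` never appears in the PCS under a path condition.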
### Idea 3: Coupling
Coupling is very similar to idea 2.5. However, it uses the fact that Polonius does _not_ store flow-dependent constraints to figure out when to apply the lazily expired wands (resp. subgraphs) without solving path conditions.
Concretely, at each point the coupling graph will instruct a reborrowing DAG to apply some subgraph of edges. Each edge (in the DAG) can still be under an arbitrarily precise path condition, and that won't change the point in the program where that edge is applied.
The key is the _coupling algorithm_: given two graphs as described in idea 2.5, the coupling algorithm will find collections of lifetimes `A` and `B` where every lifetime in `A` must expire before any lifetime in `B` is used. By grouping all of the edges in `A` together (from the graph of idea 2.5), we know that it is sound to apply all of their annotations exactly when the last lifetime in `A` expires.
In my view, the coupling graph is the most precise common approximation of the two input graphs that still respects the expiry rules of Polonius.
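
The grouping rule can be sketched as a small state machine. `CoupledGroup` and the annotation strings are hypothetical names used only to illustrate "apply everything when the last lifetime in `A` expires".

```python
class CoupledGroup:
    """A collection of lifetimes `A` together with the annotations on its edges."""

    def __init__(self, lifetimes, annotations):
        self.waiting = set(lifetimes)        # lifetimes in A that are still live
        self.annotations = list(annotations)  # instructions stored on the edges

    def expire(self, lifetime):
        """Return the annotations to emit at this point (usually none)."""
        self.waiting.discard(lifetime)
        if self.waiting:
            return []  # some lifetime in A is still live, so keep waiting
        emitted, self.annotations = self.annotations, []
        return emitted

group = CoupledGroup({"'a", "'b"}, ["apply wand 1", "apply wand 2"])
first = group.expire("'a")   # nothing is emitted yet
second = group.expire("'b")  # the last lifetime in A expires here
```

No path condition is consulted: the emission point depends only on which lifetimes have expired.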
#### Maximal Sharing/The Reborrowing DAG
Coupling could in principle be skipped at simple join points and computed only at loop heads. However, computing coupling everywhere allows for a _maximally shared_ representation of the reborrowing DAG. In the past, we discussed how we might want to represent joins of reborrowing DAGs (as per idea 1) without duplicating equal stems. In the [feature-complete coupling](https://hackmd.io/@be6mqvt0QwS_i18WLiSLBA/rkuyv9Dd3/edit) document I explain how we can achieve this with the coupling graph, along with sharing further up the reborrowing DAG.
#### Legal Expiries
By "expire" in the coupling graph, I mean removing an edge from the coupling graph and exchanging some capabilities in the PCS.
By "expire" in the reborrowing DAG, I mean removing some set of edges in the reborrowing DAG, and interpreting their removal as inserting some instructions (stored on the edges) into the program.
The two have to remain in sync: the explanation given by the reborrowing DAG should match the change in the PCS given by the coupling graphs. This also means that we will never ask the reborrowing DAG to expire edges finer than a single origin: to that end, all edges in the reborrowing DAG must be labelled with the coupling graph lifetime (or coupled edge identifier, as we will see later) they are waiting on.
An expiry of a set of lifetimes in the coupling graph is legal if there is some sequence of edges in the reborrowing DAG, starting at the leaves under arbitrary path conditions, which contains all edges associated to a lifetime in the set.
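
A minimal sketch of this check, under simplifying assumptions that are mine rather than the document's: edges are plain `(src, dst, lifetime)` triples instead of hyperedges, path conditions are ignored, and the sequence is read as containing exactly the edges labelled with the set. An edge can be removed once no remaining edge produces its source, i.e. once its source is a leaf.

```python
def is_legal_expiry(edges, lifetimes):
    """edges: iterable of (src, dst, lifetime); lifetimes: the set to expire."""
    remaining = set(edges)
    todo = {e for e in remaining if e[2] in lifetimes}
    while todo:
        produced = {dst for (_, dst, _) in remaining}
        # An edge is ready when its source is a leaf of the remaining DAG.
        ready = {e for e in todo if e[0] not in produced}
        if not ready:
            return False  # every labelled edge is blocked by an edge outside the set
        remaining -= ready
        todo -= ready
    return True

# x2 reborrows x1 under 'b, and x1 reborrows x0 under 'a.
dag = {("x1", "x0", "'a"), ("x2", "x1", "'b")}
```

Here `{'b}` and `{'a, 'b}` are legal, while `{'a}` alone is not, because the `'b` edge still blocks `x1`.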
#### Coupling Graphs
A coupling graph is similar to the graphs in idea 2.5. Here I will explain the simple version of the graph which only supports non-iterated mutable borrows and no extra lifetime constraints (function calls, borrows in structs).
As in idea 2.5, each vertex in the coupling graph is a lifetime, which for this subset of Rust also corresponds to an exclusive resource. Each hyperedge in the coupling graph corresponds to a subgraph of the reborrowing DAG, whose start and end vertices match the start and end vertices in the coupling graph.
The coupling and reborrowing graphs must remain in sync. The interface between the reborrowing DAG and coupling graph is regulated by *introduction*, *elimination*, and *coupling* instructions. First, the analysis which calculates the coupling graph will emit a sequence of these instructions at each program point. Next, the analysis which calculates the reborrowing DAG will interpret the MIR at each program point using these instructions (the reborrowing DAG may be tuned to match the kind of information needed by the backend: for example by changing the specific repacking strategy, or with extra instructions to preserve values for purification, or by abstracting at fewer or more join points).
Example introduction instructions include *issue a borrow* or *move a place*.
Example elimination instructions include *consume a place from the PCS* or *apply the subgraph associated to lifetime 'a*. We will discuss coupling instructions later.
Every vertex in the coupling graph is either *live*, if its lifetime is live, or *tagged* with the location where its lifetime died. When a lifetime is live, its resource corresponds to a resource that is either currently in the PCS, or which could return to the PCS when unblocked. The resources in the coupling graph which are live and unblocked are also in the PCS. When a lifetime is tagged, its corresponding resource is also tagged in the reborrowing DAG, and it no longer corresponds to a resource that could return to the PCS.
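
The vertex states can be written down as a small record. The `Vertex` class, its field names and the `"bb3[2]"` location string are illustrative assumptions.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Vertex:
    lifetime: str
    tagged_at: Optional[str] = None  # location where the lifetime died, if any
    blocked: bool = False            # True while another live lifetime blocks it

    @property
    def live(self) -> bool:
        return self.tagged_at is None

    def in_pcs(self) -> bool:
        # Live and unblocked resources are exactly the ones in the PCS.
        return self.live and not self.blocked

    def kill(self, location: str) -> None:
        # A tagged vertex can no longer return to the PCS.
        self.tagged_at = location

v = Vertex("'a", blocked=True)
v.blocked = False   # the blocking lifetime expired; 'a is back in the PCS
was_in_pcs = v.in_pcs()
v.kill("bb3[2]")    # hypothetical MIR location label
```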
The coupling graph includes a special kind of hyperedge, called a *coupled edge*. A coupled edge has an *identifier*, and there may be many coupled edges with the same identifier in the graph. Each coupled edge with the same identifier has the same set of destination vertices. Each coupled edge identifier corresponds to a sequence of _coupling instructions_. Coupling instructions include any elimination instruction, `freeze` and `unfreeze` instructions, and can depend on the control flow of the join point where they were created.
Finally, coupled edges must be able to be reified in the reborrowing DAG. The reborrowing DAG chooses to represent each coupled edge identifier as either *translucent* or *opaque*. If an identifier is *translucent*, the DAG does not have to make any change aside from marking that the subgraph corresponding to the coupled edge now depends on the coupled edge identifier, rather than the lifetimes it depended upon before. If an identifier is *opaque*, the DAG must package all of the annotations of the respective subgraph into a single magic wand.
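
The two representations might look as follows. The edge tuples `(src, dst, waits_on, annotation)`, the function `reify` and the wand tuple layout are all assumptions for the sake of a small example, not the reborrowing DAG's actual data structures.

```python
def reify(edges, ident, opaque):
    """edges: list of (src, dst, waits_on, annotation) in the coupled subgraph."""
    if not opaque:
        # Translucent: keep every edge, but make it wait on the identifier.
        return [(src, dst, ident, note) for (src, dst, _, note) in edges]
    # Opaque: collapse the subgraph into a single wand from its outer sources
    # to its outer destinations, carrying all annotations in order.
    srcs = {e[0] for e in edges}
    dsts = {e[1] for e in edges}
    wand = (frozenset(srcs - dsts), frozenset(dsts - srcs), ident,
            [e[3] for e in edges])
    return [wand]

subgraph = [("x2", "x1", "'b", "unpack x1"), ("x1", "x0", "'a", "unpack x0")]
translucent = reify(subgraph, "c0", opaque=False)
packaged = reify(subgraph, "c0", opaque=True)
```

In the translucent case the internal vertex `x1` stays visible; in the opaque case only the interface `x2 -* x0` remains.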
#### The Coupling Algorithm
The algorithm is a greedy traversal of the two graphs. Let `G1` and `G2` be two coupling graphs.
1. Find the sets of leaves `L1` and `L2` for both graphs.
2. By Polonius liveness, there must exist a vertex which is either a root or a live origin, and which is reachable from `L1` in `G1` and `L2` in `G2` by at most one live hyperedge. Find some candidate for this vertex, call it `r`.
3. On each branch, combine the hyperpath from `Li` to `r` as you would Hoare triples. This yields a single Hoare triple whose precondition is contained in `Li` and postcondition contains `r`. Denote the sequence of edges (lifetimes) on this hyperpath as `e_(i, 1) ... e_(i, m_i)` and let `sig_i = { pre(i) } expire(e_(i, 1)) ... expire(e_(i, m_i)) { post(i) }`.
4. By Polonius rules, any invalidation of `r` means any lifetime blocking `r` in any branch is expired. Let `L'` be the union of all `Li`. In each branch, for every element `s` of `L'` that is not in `Li`, either unfreeze `s` if it is frozen (prepending `unfreeze(s)` to `sig_i` and exchanging `|s|` for `s` in `Li`), or expire `s` up to a live origin (prepending the sequence of expiries to `sig_i` and updating `Li`).
5. Now all branches definitely consume the same set of resources, and definitely all emit `r`. Each branch may emit other resources `s` as well; for each of these, append a `freeze(s)` instruction to `sig_i` and replace `s` with `|s|` in `Li`.
6. Emit a new coupled edge identifier `c`. The coupling instructions are `if (branch = 1) then sig_1 ... `. For each branch, add a coupled edge with identifier `c`, whose source and destination vertices are the unfrozen part of `sig_i`. The unfrozen vertices are the same in all branches, so this is well-defined.
7. For each source vertex `v`, add a coupled edge from `v` to all destination vertices, with identifier `c`.
8. Remove all edges in every branch which have been coupled, updating the leaf set `Li`. Repeat until all graphs have been traversed.
See the coupling examples in [feature complete coupling](https://hackmd.io/@be6mqvt0QwS_i18WLiSLBA/rkuyv9Dd3/edit).
The flow-dependent changes in capabilities are given by examining the "frozen" capabilities in each `Li`: frozen access to a place in `Li` means that we have access to the place in branch `i` but not all branches. This information can be removed, if a backend (like Prusti) can remember these simple constraints itself. For backends that do not store a version of the program state, freezes and unfreezes might be necessary to remember where information came from.
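
Steps 4 to 6 of the algorithm can be sketched for a single round. This is a rough illustration under several assumptions: capabilities are strings, a frozen capability `s` is written `"|s|"`, the per-branch expiry sequence from step 3 is given as a list of strings, `L'` is taken over the unfrozen leaves only, and the "expire up to a live origin" case of step 4 is left out. The function and dictionary keys are hypothetical.

```python
def unfrozen(leaves):
    return {s for s in leaves if not s.startswith("|")}

def couple_step(branches, r, ident):
    """branches: name -> {"leaves": set, "ops": list, "post": set}."""
    # Step 4: every branch must consume the same unfrozen leaves.
    wanted = set()
    for branch in branches.values():
        wanted |= unfrozen(branch["leaves"])
    instructions = {}
    for name, branch in branches.items():
        leaves, ops = set(branch["leaves"]), list(branch["ops"])
        for s in sorted(wanted - unfrozen(leaves)):
            if f"|{s}|" not in leaves:
                raise NotImplementedError("expiring up to a live origin is omitted")
            ops.insert(0, f"unfreeze({s})")
            leaves = (leaves - {f"|{s}|"}) | {s}
        # Step 5: anything emitted besides r is frozen in this branch.
        for s in sorted(branch["post"] - {r}):
            ops.append(f"freeze({s})")
        instructions[name] = ops
    # Step 6: one identifier, with branch-dependent coupling instructions.
    return {"id": ident, "sources": frozenset(wanted),
            "dests": frozenset({r}), "instructions": instructions}

edge = couple_step(
    {"then": {"leaves": {"x", "y"}, "ops": ["expire('a)"], "post": {"r"}},
     "else": {"leaves": {"x", "|y|"}, "ops": ["expire('b)"], "post": {"r", "z"}}},
    r="r", ident="c0")
```

In this example the `else` branch unfreezes `y` before its expiry and freezes `z` afterwards, so both branches expose the same interface from `{x, y}` to `{r}`.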