Markus de Medeiros
    • Create new note
    • Create a note from template
      • Sharing URL Link copied
      • /edit
      • View mode
        • Edit mode
        • View mode
        • Book mode
        • Slide mode
        Edit mode View mode Book mode Slide mode
      • Customize slides
      • Note Permission
      • Read
        • Only me
        • Signed-in users
        • Everyone
        Only me Signed-in users Everyone
      • Write
        • Only me
        • Signed-in users
        • Everyone
        Only me Signed-in users Everyone
      • Engagement control Commenting, Suggest edit, Emoji Reply
    • Invite by email
      Invitee

      This note has no invitees

    • Publish Note

      Share your work with the world Congratulations! 🎉 Your note is out in the world Publish Note

      Your note will be visible on your profile and discoverable by anyone.
      Your note is now live.
      This note is visible on your profile and discoverable online.
      Everyone on the web can find and read all notes of this public team.
      See published notes
      Unpublish note
      Please check the box to agree to the Community Guidelines.
      View profile
    • Commenting
      Permission
      Disabled Forbidden Owners Signed-in users Everyone
    • Enable
    • Permission
      • Forbidden
      • Owners
      • Signed-in users
      • Everyone
    • Suggest edit
      Permission
      Disabled Forbidden Owners Signed-in users Everyone
    • Enable
    • Permission
      • Forbidden
      • Owners
      • Signed-in users
    • Emoji Reply
    • Enable
    • Versions and GitHub Sync
    • Note settings
    • Note Insights New
    • Engagement control
    • Make a copy
    • Transfer ownership
    • Delete this note
    • Save as template
    • Insert from template
    • Import from
      • Dropbox
      • Google Drive
      • Gist
      • Clipboard
    • Export to
      • Dropbox
      • Google Drive
      • Gist
    • Download
      • Markdown
      • HTML
      • Raw HTML
Menu Note settings Note Insights Versions and GitHub Sync Sharing URL Create Help
Create Create new note Create a note from template
Menu
Options
Engagement control Make a copy Transfer ownership Delete this note
Import from
Dropbox Google Drive Gist Clipboard
Export to
Dropbox Google Drive Gist
Download
Markdown HTML Raw HTML
Back
Sharing URL Link copied
/edit
View mode
  • Edit mode
  • View mode
  • Book mode
  • Slide mode
Edit mode View mode Book mode Slide mode
Customize slides
Note Permission
Read
Only me
  • Only me
  • Signed-in users
  • Everyone
Only me Signed-in users Everyone
Write
Only me
  • Only me
  • Signed-in users
  • Everyone
Only me Signed-in users Everyone
Engagement control Commenting, Suggest edit, Emoji Reply
  • Invite by email
    Invitee

    This note has no invitees

  • Publish Note

    Share your work with the world Congratulations! 🎉 Your note is out in the world Publish Note

    Your note will be visible on your profile and discoverable by anyone.
    Your note is now live.
    This note is visible on your profile and discoverable online.
    Everyone on the web can find and read all notes of this public team.
    See published notes
    Unpublish note
    Please check the box to agree to the Community Guidelines.
    View profile
    Engagement control
    Commenting
    Permission
    Disabled Forbidden Owners Signed-in users Everyone
    Enable
    Permission
    • Forbidden
    • Owners
    • Signed-in users
    • Everyone
    Suggest edit
    Permission
    Disabled Forbidden Owners Signed-in users Everyone
    Enable
    Permission
    • Forbidden
    • Owners
    • Signed-in users
    Emoji Reply
    Enable
    Import from Dropbox Google Drive Gist Clipboard
       Owned this note    Owned this note      
    Published Linked with GitHub
    • Any changes
      Be notified of any changes
    • Mention me
      Be notified of mention me
    • Unsubscribe
    # Polonius Information ### Questions for Niko II - Two related technical problems that are blockers: - Rules around function calls - Are they written down anywhere? - What origins are generated - What do they mean - What facts about function signatures can we infer if we know the function body respects the universal origins? - Rules around lifetimes and structs - Are they written down anywhere? - What origins are generated - What do they mean - Both seem to use type information, somehow, and should be modular, somehow. - So our abstract resources can explain both of the origins. - Explain Alex's strange struct example - Concept of "everything blocked by a lifetime" as a resource - What in a struct blocks it's borrows from expiring? - Does variance play a role? - Persistent annotations for "equal" - Immediate versus Persistent subset relationships - Are function annotations persistent? - Broadly, is the "persistent/instantenous" dichotomy the same as the "type system/dataflow" dichotomy? We want to model the latter and treat the former as a black box. - Updates from last time: - We have a decent idea of how shared borrows can work - Alex's idea of an abstract resource for structs and function calls - Separation between deep and shallow resources in the model. ### Questions for Niko - Characterization of origins: - Characterization of subset_base: - Function calls: - Borrows inside structs: - Any other features of Rust which generate odd facts? From the thread in Rust Polonius channel > Forgive me if this is the wrong place to ask this-- I'm working on an interface between Prusti and Polonius and I have some questions about how the compiler generates input facts: > > I understand that some origins are "issuing origins" (corresponding to a loan issue). It seems to me that all assignments also create an origin. > > Are there any other situations which cause the compiler to use a new origin? > We can characterize which origins are "issuing origins" by reading the loan_issued_at fact. Is there any easy way to characterize the other kinds of origins? > The same for subsets. Is there anywhere in the compiler or Polonius that allows us to determine why a subset between two origins is enforced? > Any input, or a pointer to a relevant place in the compiler, would be greatly appreciated. I've tried to dig around in the compiler but haven't found any definitive answer yet. - Translations of Polonius facts - Stability of the fact generation in the compiler? ### Sources: Video of Niko: - [Polonius@Rust Verification Workshop 2021](https://www.youtube.com/watch?v=H54VDCuT0J0&t=10s) - [Polonius Walkthrough](https://www.youtube.com/watch?v=i5KdU0ieb_A&t=1536s) - [Polonius WG: subsets](https://www.youtube.com/watch?v=mAUGvNgZYtw) - [Polonius WG: initialization and moves](https://www.youtube.com/watch?v=ilv9V-328HI) - [dropbox paper](https://paper.dropbox.com/doc/uHumUOGAVtU4anFDNkWEI) - [Region Inference](https://www.youtube.com/watch?v=DqojDh9kFdI) ### Compiler files: ``` compiler/rustc_borrowck/src/constraint_generation.rs compiler/rustc_borrowck/src/dataflow.rs compiler/rustc_borrowck/src/facts.rs compiler/rustc_borrowck/src/invalidations.rs ::{generate_invalidates} compiler/rustc_borrowck/src/nll.rs ::{populate_polonius_move_facts} compiler/rustc_mir_dataflow/src/impls/mod.rs ::{MaybeUninitializedPlaces, EverInitializedPlaces} ``` ### Misc - [Niko's borrowck implementation in a-mir-formality (Written Oct 10, 2022)](https://github.com/nikomatsakis/a-mir-formality/pull/111/files#diff-ddc31fe20682fec9db8e5efcc4cefc56b2c16c287d5ab3bc14ca95bd841328c2) ## Borrows - A **borrow** is a MIR statement which assigns to a borrow-typed local. ``` x = &mut y x = y (where y is borrow-typed) (footnote 1) ``` borrows are given unique **borrow ID**'s corresponding to their MIR statement. - In the above examples, we call - ``x`` the **borrowed-to** place, and - ``y`` the **borrowed-from** place. Sometimes, these are called the **borrowed place** and **assigned place** respectively. We investigate the case where borrows inside structs are not allowed, so ``y`` is always a top-level local in our examples (but this need not always be the case). - Several often confused notions of borrow death are different: - A borrow is **killed** when it's borrowed-to place is overwritten. - A borrow is **invalidated** when it's borrowed-from place is accessed (footnote 2) - A borrow is **expired** when Polonius concludes that ownership should be transferred from the borrowed-to place to the borrowed-from place. - A borrow is considered **live** for at least as long as it's borrowed-to place is live. That is, for as long as the borrowed-from place is possibly used in the future. - Borrow facts are realized in Polonius facts as - killed: ``loan_killed_at`` - invalidated: ``loan_invalidated_at`` - expired (on an edge) when a loan is live at the start of the edge and not live at the end of the edge. - Rust code **borrow-checks** exactly when for some configuration of loan expiries: - no live loans are invalidated - no additional constraints are imposed on user-suppled lifetimes ## Subsets and Origins - An **origin** is a set of loans. - All newly created borrows have an **issuing origin** which contains only themself at the time of their instantiation. - Issuing origins do not necessarily propagate. Frequently, an issuing origin is set to be a subset of another origin. - Colloqually, a *lifetime* is a contiguous subset of the control flow graph. - Under non-lexical lifetimes, several seemingly conflicting borrows can coexist, provided none of them cause any invalidations at the wrong time. - Borrows can be forced to outlive each other by the **subsets** or **outlives** relation. - Subsets are also called the **flows-into** relation. If a borrow ``bwx`` possibly flows into another borrow ``bwy``, then ``bwx`` must outlive ``bwy``. Correspondingly, the set of possible loans with origin ``bwx`` are a subset of the possible loans for ``bwy``. - See 10:00 of Polonius WG on subsets for subset propogation with nested references. - At method boundaries, subset facts are given as ``known_placeholder_subset``s between ``universal_origins`` - Not all subset relations are derived from control flow: the type system can emit subsets from explicit user annotations, or derived (eg. loans in structs must outlive the struct). - Lifetime constraints in a function body, both user supplied and derived by the type system, are given in the ``subset_base`` fact. - The terminology for universal origins is actively changing. - ** Open question:** What are the possible reasons an origin is created? - Aside from issuing and universal origins, it seems origins are created on every assignment. Is this true? - ** Open question:** What are the possible reasons a subset is created? - ** Open question:** What are the permissions-focused invariants of an origin? - If my suspicion about origins being created on assignments is true, then - Since Ref is a RValue (in MIR terms), all non-universal origins correspond to an assignment - So all non-universal origins have a fixed LHS. ###### Summary of Polonius Facts: In this table I use the shorthand - ``p``: point - ``o``: origin - ``l``: loan - ``v``: variable (ie. base local) - ``t``: path (variable with projections, ie. place) | Kind | Input Fact | Interpretation | MIR/generation | | ---- | ---- | -------------- | ---------------| | Loans/MIR | ``loan_issued_at(o, l, p)`` | At ``p``, origin ``o`` includes loan ``l``. | Only observed for ``_=&_;`` statements. Origin ``o`` is always new. | | Loans/MIR | ``cfg_edge(p1, p2)`` | ``p1`` directly precedes ``p2``. | ``p1`` is ``Start`` iff ``p2`` is ``Mid`` | | Loans/MIR | ``loan_killed_at(l, p)`` | At ``p`` loan ``l`` no longer invalidates any origin. | Generated even when ``l`` is out of scope. Generated when a prefix of ``l``'s LHS is assigned to. | | Loans/MIR | ``subset_base(o1, o2, p)`` | At ``p``, all loans of ``o1`` also invalidate ``o2`` | **several sources, unknown generation.** | | Loans/MIR | ``loan_invalidated_at(p, l)`` | Loan ``l`` cannot be in any live origin at ``p``. | | | Loans/Signature | ``universal_region(o)`` | | Origins ``'_#0r`` and ``'_#1r`` are always universal | | Loans/Signature | ``known_placeholder_subset(o1, o2)`` | Subset between universal origins | | Loans/Signature | ``placeholder(o, l)`` | Universal origin ``o`` includes loan ``l`` | | | Moves/Liveness | ``var_used_at(v, p)`` | ``v`` used at ``p`` (except for a drop) | | | Moves/Liveness |``var_defined_at(v, p)`` | ``v`` written to at ``p`` | | | Moves/Liveness | ``var_dropped_at(v, p)`` | ``v`` dropped at ``p`` | | | Moves/Liveness | ``use_of_var_derefs_origin(v, o)`` | ? | ? | | Moves/Liveness | ``drop_of_var_derefs_origin(v, o)`` | ? | ? | | Moves/Initialization | ``child_path(t1, t2)`` | ``t1.f = t2`` for some projection ``f`` | Unpack of ``t1`` includes ``t2``. | | Moves/Initialization | ``path_is_var(t, v)`` | ``t = v.f1.f2. ... `` for some projections. | | | Moves/Initialization | ``path_assigned_at_base(t, p)`` | ``t`` is assigned at ``p`` | | | Moves/Initialization | ``path_moved_at_base(t, p)`` | ``t`` is moved at ``p`` | | | Moves/Initialization | ``path_accessed_at_base(t, p)`` | ``t`` is accessed at ``p`` | | | Kind | Output Fact | Interpretation | | | ---- | ----------- | -------------- | -- | | Errors | ``errors(?)`` | | | | Errors | ``subset_errors(?)`` | | | | Errors | ``move_errors(?)`` | | | | Loans | ``loan_live_at(?)`` | | | | Loans | ``origin_contains_loan_at(?)`` | | | | Loans | ``origin_live_on_entry(?)`` | | | | Loans | ``loan_invalidated_at(?)`` | | | | Loans | ``subset(?)`` | | | | Initialization | ``var_live_on_entry(?)`` | | | | Initialization | ``var_drop_live_on_entry(?)`` | | drop-live means "dropped in the future"| | Initialization | ``path_maybe_initialized_on_exit(?)`` | | | | Initialization | ``path_maybe_uninitialized_on_exit(?)`` | | | | Initialization | ``var_maybe_partly_uninitialized_on_exit(?)`` | | | | | ``known_contains(?)`` | | Transitive closure of ``known_placeholder_subset`` | | Approximate | ``origin_contains_loan_anywhere(?)`` | | | | Approximate | ``subset_anywhere(?)`` | | | ## Polonius Execution: ###### Summary of the full borrow checking process: 1. Compute Initialization Data 1.1. Compute the transitive closure of path accesses 1.2. Compute move errors Step 1 computes the output facts: - ``var_maybe_partly_initialized_on_exit`` - ``move_errors`` 2. Compute Liveness 2.1. Compute live origins 2.2. Make universal regions live 3. Compute Loan Analysis 3.1. Compute ``known_contains`` 3.2. Compute ``known_placeholder_subset`` 3.3. Naive borrow checking #### 1.1: Transitive closure of paths ``` Input facts for this step: - child_path (t1, t2) - path_assigned_at_base(t, p) - path_moved_at_base(t, p) - path_accessed_at_base(t, p) - path_is_var(t, v) Output facts for this step: - ancestor_path(t1, t2) - path_moved_at(t, p) - path_assigned_at(t, p) - path_accessed_at(t, p) - path_begins_with_var(t, v) // 1.1.1. (Setup) Promote inputs to iteration facts ancestor_path(Parent, Child) :- child_path(Child, Parent). path_moved_at(Path, Point) :- path_moved_at_base(Path, Point). path_assigned_at(Path, Point) :- path_assigned_at_base(Path, Point). path_accessed_at(Path, Point) :- path_accessed_at_base(Path, Point). path_begins_with_var(Path, Var) :- path_is_var(Path, Var). // 1.1.2. (Iterate) Compute transitive closure. ancestor_path(Grandparent, Child) :- ancestor_path(Parent, Child), child_path(Parent, Grandparent). path_moved_at(Child, Point) :- path_moved_at(Parent, Point), ancestor_path(Parent, Child). path_assigned_at(Child, point) :- path_assigned_at(Parent, point), ancestor_path(Parent, Child). path_accessed_at(Child, point) :- path_accessed_at(Parent, point), ancestor_path(Parent, Child). path_begins_with_var(Child, Var) :- path_begins_with_var(Parent, Var) ancestor_path(Parent, Child). ``` #### 1.2: Move Errors ``` Input facts for this step: - all 1.1 outputs - cfg_edge(p1, p2) Output facts for this step - var_maybe_partly_initialized_on_exit(v, p) - path_maybe_initialized_on_exit(t, p) - path_maybe_uninitialized_on_exit(t, p) - move_error(t, p) // 1.1.1 (Setup) Promote input facts path_maybe_initialized_on_exit(path, point) :- path_assigned_at(path, point). path_maybe_uninitialized_on_exit(path, point) :- path_moved_at(path, point). // 1.1.2 (Iterate) // Propagate initialization unless moved path_maybe_initialized_on_exit(path, point2) :- path_maybe_initialized_on_exit(path, point1), cfg_edge(point1, point2), !path_moved_at(path, point2). // Propagate uninitialization unless initialized path_maybe_uninitialized_on_exit(path, point2) :- path_maybe_uninitialized_on_exit(path, point1), cfg_edge(point1, point2) !path_assigned_at(path, point2). // Variable is partly initialized if any child is var_maybe_partly_initialized_on_exit(var, point) :- path_maybe_initialized_on_exit(path, point), path_begins_with_var(path, var). // Accesses of maybe uninitialized paths are errors move_error(Path, TargetNode) :- path_maybe_uninitialized_on_exit(Path, SourceNode), cfg_edge(SourceNode, TargetNode), path_accessed_at(Path, TargetNode). ``` #### 2.1: Live Origins ``` Input facts for this step: - var_used_at(v, p) - var_defined_at(v, p) - var_dropped_at(v, p) - use_of_var_derefs_origin(v, o) - drop_of_var_derefs_origin(v, o) - cfg_edge(p1, p2) - var_maybe_partly_initialized_on_exit(v, p) Output facts for this step: - var_live_on_entry(v, p) - var_drop_live_on_entry(v, p) - origin_live_on_entry(o, p) - var_used_at(v, p) // 2.1.1 (Setup) var_live_on_entry(var, point) :- var_used_at(var, point). var_maybe_partly_initialized_on_entry(var, point2) :- var_maybe_partly_initialized_on_exit(var, point1), cfg_edge(point1, point2). var_drop_live_on_entry(var, point) :- var_dropped_at(var, point), var_maybe_partly_initialized_on_entry(var, point). // 2.1.2 (Iterate) // Origins are live if // - a drop-live variable derefs the origin on drop // - a live variable derefs the origin on use origin_live_on_entry(origin, point) :- var_drop_live_on_entry(var, point), drop_of_var_derefs_origin(var, origin). origin_live_on_entry(origin, point) :- var_live_on_entry(var, point), use_of_var_derefs_origin(var, origin). // Variables are live if live in the future and not overwritten var_live_on_entry(var, point1) :- var_live_on_entry(var, point2), cfg_edge(point1, point2), !var_defined(var, point1). // Variables are drop-live if drop-live in the future, // not overwritten, and partly intialized. var_drop_live_on_entry(Var, SourceNode) :- var_drop_live_on_entry(Var, TargetNode), cfg_edge(SourceNode, TargetNode), !var_defined_at(Var, SourceNode), var_maybe_partly_initialized_on_exit(Var, SourceNode). ``` #### 2.2: Make universal regions live This is a helper function which makes universal origins live at all program points. #### 3.1: Compute ``known_contains`` Simple iteration to compute the transitive closure of ``known_placeholder_subset`` as ``known_contains`` #### 3.2: Compute ``known_placeholder_subset`` Also a transitive closure? #### 3.3: Naive Borrow checking ``` Input facts for this step: - origin_live_on_entry(o, p) - loan_invalidated_ad(l, p) - subset_base(o1, o2, p) - loan_issued_at(o, l, p) - loan_killed_at(l, p) - known_contains(o, l) - placeholder_origin(o) - placeholder_loan(l, o) - known_placeholder_subset(o1, o2) - cfg_edge(p1, p2) Output facts for this step: - subset(o1, o2, p) - origin_contains_loan_on_entry(o, l, p) - loan_live_at(l, p) - errors() // 3.3.1 (Setup) // Rule 1: subset_base is promoted to subsets subset(Origin1, Origin2, Point) :- subset_base(Origin1, Origin2, Point). // Rule 4: loans are contained in their issuing origins origin_contains_loan_on_entry(Origin, Loan, Point) :- loan_issued_at(Origin, Loan, Point). // 3.3.2 (Iteration) // Rule 2: Transitively close subset constraints subset(Origin1, Origin3, Point) :- subset(Origin1, Origin2, Point), subset(Origin2, Origin3, Point). // Rule 3: Propagate sbset constraints between live origins subset(Origin1, Origin2, Point2) :- subset(Origin1, Origin2, Point1), cfg_edge(Point1, Point2), origin_live_on_entry(Origin1, Point2), origin_live_on_entry(Origin2, Point2). // Rule 5: Apply subset constraints between origins origin_contains_loan_on_entry(Origin2, Loan, Point) :- origin_contains_loan_on_entry(Origin1, Loan, Point), subset(Origin1, Origin2, Point). // Rule 6: Propagate loans when: // - the loan propagates into a live origin // - the loan is not killed origin_contains_loan_on_entry(Origin, Loan, Point2) :- origin_contains_loan_on_entry(Origin, Loan, Point1), !loan_killed_at(Loan, Point1), cfg_edge(Point1, Point2), origin_live_on_entry(Origin, Point2). // Rule 7: Loans are live when in live origins loan_live_at(Loan, Point) :- origin_contains_loan_on_entry(Origin, Loan, Point), origin_live_on_entry(Origin, Point). // Rule 8: No live origin can contain an invalidated loan errors(Loan, Point) :- loan_invalidated_at(Loan, Point), loan_live_at(Loan, Point). // Rule 9: No extra subsets are allowed for universal origins subset_error(Origin1, Origin2, Point) :- subset(Origin1, Origin2, Point), placeholder_origin(Origin1), placeholder_origin(Origin2), !known_placeholder_subset(Origin1, Origin2). ``` ## Fact Generation Input facts are not goverened by Polonius, but by the rust compiler. This section outlines the situations when Polonoius facts are generated. - ``cfg_edge``: Generated - between the end of one MIR statement and the start of the next (external edges) - between the start and end of a single MIR statement (internal edges) Our semantics equips each internal edge with a Hoare triple which the free PCS must satisfy, and each external edge with some collection of annotations. - ``loan_invalidated_at``: Generated when > A borrowed-from place is accessed in some way which invalidates terms of a loan. - Loan invalidations are generated even before a loan is possibly instantiated. - In borrow-checked code, ``loan_invalidated_at`` asserts that no live origin contains a given loan. - ``loan_killed_at``: Generated when > A prefix of the borrowed from place is assigned to - Invalidated loans are not tracked by Polonius after their point of invalidation as they can no longer cause borrow check errors. - In our interpretation, killed loans are tagged. - ``origin_live_at``: Computed by initialization checker. - When an origin may be used in the future, all of the loans it contains must not be invalidated. ##### Subset Generation ``compiler/rustc_borrowck/src/type_check::add_reborrow_constraint`` ### Dataflow The compiler uses three analyses: 1. ``Borrows`` 2. ``MaybeUninitializedPlaces`` 3. ``EverInitializedPlaces`` Analyses 2 and 3 are performed over the set of MovePaths (the finite set of paths, not places, which are possibly accessed throughout the program). The helper function ``move_path_closest_to`` can be used to apply these analyses on more granular places. - Current Polonius has dataflow integrated into it's model. However due to performance issues, it is expected that they will revert to the compiler analyses. ## Observed Behaviors of the Borrow Checker Possibly these can be shown to be consequences of the rules as laid out above. ### 1. Lexical boundaries and loops - Loans created in loops by ``x = &mut y`` do not outlive the loop. - Loans which are overwritten by a borrow like ``x = &mut y`` are expired before the loop head. ### 2. Coupled Loans - At a loop head, subset relations from the future can force two origins to be equal. ### 3. Eager Expiry - Origins which are not live are not propagated. - It may not be sound to add user-provided assertions about variables (including borrows) which are not live. ## Footnotes 1: This possibly does not contain all borrows, for example functions call terminators may be a borrow to the local ``_0``. More investigation is needed, but restricting our attention to these two statements should be fine for now. 2: *accessed* means "read or written to" for mutable borrows, and "written to" for shared borrows. # Connection to Prusti ## Problem Track the status of loans in a data structure such that 1. Loan expiry emits the correct annotations for a Viper proof 2. The free PCS at each program point soundly represents the places that the Rust compiler knows it has access to. Motivation for the current approach: - Many of the features we seek to implement involve adding a notion of relationships between borrows to the reborrowing DAG. We will try to define the relationship between Polonius's approximate facts about borrow relations and our concrete DAG of annotations, and to read off facts such as loop invariants and borrows at method boundaries. ## Coupling Graph: Separation of Concerns The classic approach involves a *reborrowing DAG*, a data structure whose purpose is to store the annotations associated with the expiry of a loan. This is sufficient when the relationships between borrows are pureley syntactic, but with non-lexical lifetimes there are more complicated relationships between borrows that need to be taken into account. The reborrowing DAG in its current form (a tree of hypergraphs) is not particularly well-suited to reason about relationships between loans. Furthermore even in an ideal reborrowing DAG this information is irrelevant to the annotations ultimately produced. The proposed method is to separate the reborrowing DAG from another data structure, the *coupling graph*, whose responsibility is exclusively to translate compiler information into invariants of the Reborrowing DAG. ## Design We propose two data structures: a *Reborrowing DAG* (similar to what is implemented) and a *Coupling Graph*. - The reborrowing DAG is responsible for - Soundly generating annotations for loan creations, expiries, and magic wands. - Do so efficiently (in particular, the reborrowing DAG is free to optimize away aspects formerly required to deduce loan expiries) - The coupling graph (CDG) is responsible to: - Interface with the free PCS: - Determine what permissions are required by changes to the reborrowing DAG at any program point - Respond to queries from the reborrowing DAG: - Determine the sets of live loans at each program point - Determine the set of loan expiries at a program point, and any order whereby they must expire - Deduce the signatures of magic wands at loop heads - Infer the opque edges from the method signature, and if their structure changes throughout execution. - Act as an interface to Polonius: - faithfully represent relationships between borrows in a general, feature-agnostic way - for example: working with *all* subset relations between borrows, even for features not currently supported by the encoding. - Be lightweight, and robust against changes to Polonius or the compiler. Essentially, to be a stable API between the compiler and the reborrowing DAG. ###### Summary: | | Reborrowing DAG | Coupling Graph | |-| --------------- | -------------- | | Conditional? | Guarded by dynamic flags | No, static | | Granularity | As precise as Viper requires | As approximate as Polonius provides | | Repacking state | Explicit edges | Implicit in edge LHS/RHS | | Can emit annotations | Yes | No* | | Free to optimize structure | Yes | No | | Can infer invariants | No | Yes*** | | Readable from compiler facts | No | Yes** | \* at least, not without violating separation of concerns. \*\* A few aspects are *directly* readable, but there is some state to flow through the CFG (Fortunately, only in one pass... no additional dataflow analysis is needed) \*\*\* Some. WIP. ## CDG Implementation: Data structure *The computation of the CDG is still a work in progress, and aspects of the design will remain unclear until implemented. Consider this less of a plan and more of a high-level log for the implementation progress.* The CDG is a directed hypergraph where - Nodes are (possibly tagged, TBD) places - Edges are annotated with a collection of coupled loans - Edges are uniquely determined by their LHS and RHS... the CDG is not a multigraph An edge in the coupling CDG (with LHS, RHS, loans) should be interpreted as > If the reborrowing DAG is allowed to expire "loans", it can *unconditionally* unwind edges to consume the LHS and regain the RHS. ## CDG Implementation: Computation (Current) (wip) [Examples of the system (hackmd)](https://hackmd.io/@be6mqvt0QwS_i18WLiSLBA/SyU1thY9j/edit) - Tracks aliasing at the level of borrows: coupled edges in the coupling graph are aliasing constraints - A edge can be expired when all its aliases are eliminated from the coupling graph - When an edge is eliminated from the coupling graph, it consumes its left hand side from the free PCS - a kind of drop-liveness for consumed places. - Join: A kind of idempotence is needed to reflect the Polonius approximation. - A magic wand for a coupled borrow can be represented as several edges in the CDG (for expiry purposes). Not being able to do this is a flaw of the old model - Examples of magic wand inference ``` #1: [_2] -> ... bw1 -> [_4] [_1] -> ... bw2 -> [_3] #2: [_1] -> ... bw1 -> [_4] [_2] -> ... bw2 -> [_3] coupled at the join to get the magic wand #1: [_1, _2] -cb1(bw1, bw2)-> [_3, _4] #2: [_1, _2] -cb1(bw1, bw2)-> [_3, _4] when #1 is no longer live - consume _1 and _2 from the free PCS #2: [_1, _2] -cb1(bw1, bw2)-> [_3, _4] expire-live places {_1, _2} when #2 is no longer live - do not consume any more places - expire cb1 - emit _3 and _4 to the free PCS. ``` ``` #1: [_7] -> ... bw1 -> [_1] [_7] -> ... bw1, bw2 -> [_7@L] (???) coupled at the join to get the magic wand #1: [_7] -cb1(bw1, bw2)-> [_1] What magic wand do we want???? ``` ## CDG Implementation: Computation (Depricated) *The computation of the CDG is still a work in progress, and aspects of the design will remain unclear until implemented. Consider this less of a plan and more of a high-level log for the implementation progress.* #### Iteration - Each CDG is computed from one or more prior CDG's. We iterate over the control flow graph in order of priority: 1. Control flow points with one predecessor which is fully computed 2. Control flow points with several predecessors which are fully computed 3. Control flow points with any number of predecessors which are not fully computed. doing so ensures we compute the CDG of a loop head in and only in case (3), and that we have as much information as possible before computing the CDG of any point. #### Coupling/SCC At each point, the *SCC* is the strongly connected components graph of the origin_contains_loan_at fact. The nodes of the SCC are sets of borrows, and the edges represent subsets between them. An easy way to think about this construction is > 1. Take the transitive closure of the origin_contains_loan_at fact > 2. Quotient by complete subgraphs Informally, a collection of loans is *coupled* when the borrow checker no longer distinguises their points of expiry. At each point we can determine the loans which must be coupled by computing the coarsest partition of the live loans such that each class is either contained in or disjoint from each node in the SCC. I call this the *coupling parition*. An important invariant: Polonius propagates loans forward via the origin_contains_loan_at by subsets between origins, and loans are only created in one spot. As a result (and excluding details about loan creation or expiry) the coupling partition **only gets coarser** through forward control flow. ###### Example Directly after a reborrow the SCC might look like ``` [bw0, bw1] --> [bw0] ``` In this case, the coarsest parition of nodes is ``{{bw0}, {bw1}}``, so the two borrows are not yet coupled. However, once polonius simplifies the SCC to ``` [bw0, bw1] ``` (for example, if the clique of origins whose loans are ``[bw0]`` are no longer in a live variable) then the coarsest partition is ``{bw0, bw1}`` and the two borrows are coupled. **wip in this section:** - The step of coupling loans together needs to happen before applying constraints to the CDG imposed by other edges in the SCC. - I'm still working on what this looks like, but since the **coupling parititon** is a partition of loans, it could be something like 1. Couple loans together (see affine path section) 2. Ensure the family of coupled loans in (LHS-RHS) occurs before the RHS family - It's still unclear if that's what Polonius is actually doing though. An edge in the SCC may imply a statement about connectedness between coupled borrows (eg (LHS-RHS) --(repack)--> RHS as a linear path). - I'm pretty sure it's impotant to make sure the CDG is constrained by all SCC edges #### Hyperpaths, Affine Paths, Linear Paths The regular definition of a hyperpath between nodes ``a`` and ``b`` is > A sequence of edges A_1...A_n such that > - LHS(A_1) contains a > - RHS(A_n) contains b > - RHS(A_i) intersect LHS(A_(i+1)) is nonempty This is the notion of a hyperpath used by the Reborrowing DAG's current unwinding algorithm, and is probably sound for that use case. However, it is insufficient for finding paths in the coupling graph if nodes are interpreted as linear resources. ###### Example Consider a hypergraph on nodes ``1``..``4`` with the following edges ``` A: {1,2} --> {3} B: {1,3} --> {4} ``` The sequence ``A,B`` is a path from ``1`` to ``4``, but makes no sense as a sequence of loan expiries since ``1`` is reused. In the graph with edges ``` C: {1} --> {2} D: {2,3} --> {4} - Hyperpath between the node 1 and the node 4 - Hyperpath between the node 3 and the node 4 - Affine path between the set of nodes {1, 3} and the node {4} - Not an affine path between the set of nodes {1} and {4} ``` the sequence ``C,D`` is also a path from ``1`` to ``4`` but also requires an additional permission (``3``) to expire a loan in the middle. --- In response, I use a construction I haven't seen a name for in wider hypergraph theory: - Let ``A`` and ``B`` be sets of nodes in a hypergraph ``G``. - A single edge ``e`` is an **affine path** from ``LHS(e)`` to ``RHS(e)``. - If ``H'`` is a H-path from ``A`` to ``B`` and ``e`` is an edge, ``H' + e`` is an **affine path** from ``LHS(e) U (A - RHS(e))`` to ``B U (RHS(e) - A)``. - If, for all ``a`` in ``A`` and ``b`` in ``B``, an affine path ``H`` contains a path from ``a`` to ``b``, we say ``H`` is **connected**. - If ``B U {LHS(e) | e in H} = A U {RHS(e) | e in H}`` as multisets, we say ``H`` is a **linear path**. A linear path in the CDG represents a collection of edges which can be abstracted into a single hyperedge (between sets of exclusive resources). #### Moves Our current formulation of the CDG treats origins as sets in the mathematical sense, so that two origins which contain the same collection of borrows are indistinguishable. This perspective has some advantages: it has yet to appear insufficient for collecting the coupling information needed by the SCC and is simple to compute. However, the mathematical model is not fully refletive of Polonius' understanding of the equality of origins: even if two origins have the same loans then they are nevertheless distinct. This manifests as an edge case for borrow assignments. In an assignment ```rust= let x = &mut y; ``` an _issuing origin_ (say ``'_#2r``) is created for the Rval ``&mut y`` containing only the new loan (say ``bw0``), an empty origin (say ``'_#3r``) is created for the assignement, and a subset ``'_#2r <: '_#3r`` is created to constrain the two to be equal. From the SCC perspective, this corresponds to one component ``{bw0} = '_#2r = '_#3r``. The edge case comes in when moves constrain other origins to be equal. Suppose for example ``_1`` and ``_2`` are places which are borrow-typed, and we are computing ``_2 = _3`` which generates a subset relation ``'_#3r <: '_#2r``. It is possible that this new subset relation does not change the SCC and so is not reflected in the construction above. This is a problem because at a CDG level we have to change the left-hand side of the edge borrowing into ``_3`` to track ``_2`` instad. Currently, we solve this at a syntactic level by detecting borrow moves and mutating the CDG. One invariant is that moved places should always be able to be fully packed: for example the following code (with a move on line 3) does not compile. ``` rust= let mut s0 = S {x: 0, y: 0}; let bx = &mut s0.x; let s1 = s0; let bx1 = bx; ``` Due to moves also being kills, it suffices to search through the entire tree for non-killed permissions and mutate their LHS to the new moved place. I suspect a more robust solution is out there though: if it is possible to determine the reason a compiler emitted a given origin or subset fact, we can change the CDG computation away from using mathematical sets and interpret the datalog facts as given instead. I am working to understand Polonius fact generation ([thread in the rust-lang zulip](https://rust-lang.zulipchat.com/#narrow/stream/186049-t-compiler.2Fwg-polonius/topic/Understanding.20Polonius.20input.20facts/near/318337227)) and will explore this further. #### Kills The CDG tracks tagged permissions. Tagged permissions are an affine resource which do not correspond to actual locations in memory. In our characterization of permissions: | | YES write | NO write | | ------------ | ------------- | ----------------- | | **YES read** | exclusive (e) | shared (s) | | **NO read** | uninit (u) | tagged (te/ts/tu) | Much like moves, when a place is killed we replace all nodes in the CDG which refer to that place with its tagged variant. Due to eagerness in Polonius, a tagged place will always be an intermediate node in the CDG. Note that, while only one non-tagged permission to a place can exist there may be many tagged versions of the same place. For straight-line code, it suffices to tag each place with the MIR location at which it was killed. For code with loops, we recall that the compiler never refers to code more than one loop iteration backwards. It's not clear to me what the situation with tagging and loop invariants should look like yet. #### wip in this section: - The intent of linear paths is to formalize the procedure of coupling borrows together in the CDG. - The story about affine paths and shared borrows is less clear to me, but I think there's something there. - I have a basic algorithm for finding linear paths, it is brute-force and I am almost certain a better algorithm exists. - Generally, we care that an affine path exists that contining some collection of loans in any order. This algorithm needs thought too, because my algorithm right now is especially stupid. - Nothing in the linear path precludes cycles, though a cycle in the CDG probably shouldn't happen in the presence of tagging (it should be a DAG). I need to think more about this. - Linear paths where we replace the union with a separating union do preclude cycles, note that B is unioned with the left-hand sides and A is unioned with the right-hand sides. - I'm not fond of the fact that an affine path is not a path, I'm sure there's a better name out there somewhere. # WIP's ``` compiler/rustc_borrowck/src/lib.rs::do_mir_borrowck compiler/rustc_borrowck/src/nll.rs ``` `` compiler/rustc_borrowck/src/constraints/mod.rs::impl OutlivesContraintSet::compute_sccs `` Rewriting inside reborrowing DAG edges: - Function boundaries might need complex edges - Nested wands - Semi-transparent edges in the reborrowing DAG: Wands which tell us which inputs block which outputs - nesting edges... how would that even work? - How to rewrite an edge so - Reading changes to the CDG before/after a call ###### Weird questions - Tagged permissions === drop-live places analogy. - Can the coupling graph be computed in datalog?

    Import from clipboard

    Paste your markdown or webpage here...

    Advanced permission required

    Your current role can only read. Ask the system administrator to acquire write and comment permission.

    This team is disabled

    Sorry, this team is disabled. You can't edit this note.

    This note is locked

    Sorry, only owner can edit this note.

    Reach the limit

    Sorry, you've reached the max length this note can be.
    Please reduce the content or divide it to more notes, thank you!

    Import from Gist

    Import from Snippet

    or

    Export to Snippet

    Are you sure?

    Do you really want to delete this note?
    All users will lose their connection.

    Create a note from template

    Create a note from template

    Oops...
    This template has been removed or transferred.
    Upgrade
    All
    • All
    • Team
    No template.

    Create a template

    Upgrade

    Delete template

    Do you really want to delete this template?
    Turn this template into a regular note and keep its content, versions, and comments.

    This page need refresh

    You have an incompatible client version.
    Refresh to update.
    New version available!
    See releases notes here
    Refresh to enjoy new features.
    Your user state has changed.
    Refresh to load new user state.

    Sign in

    Forgot password

    or

    By clicking below, you agree to our terms of service.

    Sign in via Facebook Sign in via Twitter Sign in via GitHub Sign in via Dropbox Sign in with Wallet
    Wallet ( )
    Connect another wallet

    New to HackMD? Sign up

    Help

    • English
    • 中文
    • Français
    • Deutsch
    • 日本語
    • Español
    • Català
    • Ελληνικά
    • Português
    • italiano
    • Türkçe
    • Русский
    • Nederlands
    • hrvatski jezik
    • język polski
    • Українська
    • हिन्दी
    • svenska
    • Esperanto
    • dansk

    Documents

    Help & Tutorial

    How to use Book mode

    Slide Example

    API Docs

    Edit in VSCode

    Install browser extension

    Contacts

    Feedback

    Discord

    Send us email

    Resources

    Releases

    Pricing

    Blog

    Policy

    Terms

    Privacy

    Cheatsheet

    Syntax Example Reference
    # Header Header 基本排版
    - Unordered List
    • Unordered List
    1. Ordered List
    1. Ordered List
    - [ ] Todo List
    • Todo List
    > Blockquote
    Blockquote
    **Bold font** Bold font
    *Italics font* Italics font
    ~~Strikethrough~~ Strikethrough
    19^th^ 19th
    H~2~O H2O
    ++Inserted text++ Inserted text
    ==Marked text== Marked text
    [link text](https:// "title") Link
    ![image alt](https:// "title") Image
    `Code` Code 在筆記中貼入程式碼
    ```javascript
    var i = 0;
    ```
    var i = 0;
    :smile: :smile: Emoji list
    {%youtube youtube_id %} Externals
    $L^aT_eX$ LaTeX
    :::info
    This is a alert area.
    :::

    This is a alert area.

    Versions and GitHub Sync
    Get Full History Access

    • Edit version name
    • Delete

    revision author avatar     named on  

    More Less

    Note content is identical to the latest version.
    Compare
      Choose a version
      No search result
      Version not found
    Sign in to link this note to GitHub
    Learn more
    This note is not linked with GitHub
     

    Feedback

    Submission failed, please try again

    Thanks for your support.

    On a scale of 0-10, how likely is it that you would recommend HackMD to your friends, family or business associates?

    Please give us some advice and help us improve HackMD.

     

    Thanks for your feedback

    Remove version name

    Do you want to remove this version name and description?

    Transfer ownership

    Transfer to
      Warning: is a public team. If you transfer note to this team, everyone on the web can find and read this note.

        Link with GitHub

        Please authorize HackMD on GitHub
        • Please sign in to GitHub and install the HackMD app on your GitHub repo.
        • HackMD links with GitHub through a GitHub App. You can choose which repo to install our App.
        Learn more  Sign in to GitHub

        Push the note to GitHub Push to GitHub Pull a file from GitHub

          Authorize again
         

        Choose which file to push to

        Select repo
        Refresh Authorize more repos
        Select branch
        Select file
        Select branch
        Choose version(s) to push
        • Save a new version and push
        • Choose from existing versions
        Include title and tags
        Available push count

        Pull from GitHub

         
        File from GitHub
        File from HackMD

        GitHub Link Settings

        File linked

        Linked by
        File path
        Last synced branch
        Available push count

        Danger Zone

        Unlink
        You will no longer receive notification when GitHub file changes after unlink.

        Syncing

        Push failed

        Push successfully