# Common definitions ## Syntax and contexts The *context* of a syntax tree node uniquely identifies where in the source it appears. This will be a particular sort of *sub-node*; e.g. the tail expression of a block or the n^th^ argument to a function call. Let's define the possible sub-nodes first: ```haskell -- We leave the parent contexts of body expressions abstract. module ScopingComparison (FnCtxt ConstCtxt : Set) where open import Data.Maybe using (Maybe; just; nothing) open import Data.Nat using (ℕ) open import Function using (_∘_; _|>_; case_of_) open import Relation.Binary.PropositionalEquality using (_≡_; refl) data NodeKind : Set where Stmt : NodeKind Expr : NodeKind FnDecl : NodeKind ConstDecl : NodeKind Toplevel : NodeKind -- For our purposes, patterns are either extending patterns or not. data Pat : Set where pat-ext : Pat pat-non : Pat -- In addition to having its own kind, each sub-node is the child of a -- particular kind of parent node. data _SubnodeOf_ : NodeKind → NodeKind → Set where -- A function. node-fn : (fn : FnCtxt) → FnDecl SubnodeOf Toplevel -- A constant. node-const : (const : ConstCtxt) → ConstDecl SubnodeOf Toplevel -- The nᵗʰ statement in a block: ⦃…●…⦄ : (n : ℕ) → Stmt SubnodeOf Expr -- The body expression of a function: |…|● : Expr SubnodeOf FnDecl -- The body expression of a constant item: const…=●⨾ : Expr SubnodeOf ConstDecl -- The initializer expression of a `let` statement: let…=●⨾ : (pat : Pat) → Expr SubnodeOf Stmt -- The expression of an expression statement: ●⨾ : Expr SubnodeOf Stmt -- The tail expression of a block: ⦃…⨾●⦄ : Expr SubnodeOf Expr -- The nᵗʰ argument expression to a call: …⦅…,●,…⦆ : (n : ℕ) → Expr SubnodeOf Expr -- The nᵗʰ operand expression to an array constructor: ⟦…,●,…⟧ : (n : ℕ) → Expr SubnodeOf Expr -- The indexed array operand of an indexing operator expression: ●⟦…⟧ : Expr SubnodeOf Expr -- The operand of a borrow expression: &● : Expr SubnodeOf Expr ``` Then, we can define contexts inductively: there's a top-level context, and for any context and any node that can go in that context, there's a context for each of its sub-nodes. ```haskell infixl 5 _▷_ data Context : NodeKind → Set where ctxt-top : Context Toplevel _▷_ : ∀ {k₁ k₂} → Context k₁ → k₂ SubnodeOf k₁ → Context k₂ ``` This gives us infix syntax for constructing contexts. For example, to represent the context of `const X: &T = f(&●);`, we would write ```haskell exampleContext : ConstCtxt → Context Expr exampleContext const = ctxt-top ▷ node-const const ▷ const…=●⨾ ▷ …⦅…,●,…⦆ 0 ▷ &● ``` where the `const` parameter identifies the constant item. Notably, this formulation of contexts only includes direct ancestors; no siblings or parents' siblings, etc., except for `let` initializers' contexts including the `let` statement's pattern, since that affect lifetime extension. For another example, let's define a helper function to get the context of the block that a statement is in: ```haskell blockOf : Context Stmt → Context Expr blockOf (e-block ▷ ⦃…●…⦄ _) = e-block ``` :::info For scoping, it would be desirable to parameterize contexts over the AST, or introduce a well-formedness relation, so that only contexts for actual AST nodes can be represented as scopes. I've opted for a simpler representation to keep the focus on definitions of temporary scoping. ::: ## Scopes A scope is the syntactic context in which a temporary or variable lives; when execution leaves that context, it's dropped. If this corresponds to the top-level context, then the temporary or variable is never dropped. ```haskell record Scope : Set where constructor scope field {kind} : NodeKind ctxt : Context kind ``` :::info Technically, scopes in rustc are finer-grained than this. For example, each block expression node contains a block node with its own associated scope, and each `let` statement introduces a "remainder scope" spanning the rest of its block, serving as the scope of bindings (and extended temporaries) of the subsequent `let` statement. For simplicity, I've chosen not to model these details. ::: ### Aside: scope hierarchy This representation of scopes makes their hierarchy easy to express. Other than the top-level scope, each scope has a direct parent; this forms a binary relation. Then, the reflexive transitive closure of that relation gives us an ancestry relation (counting each node as an ancestor of itself). ```haskell module Hierarchy where open import Relation.Binary.Construct.Closure.ReflexiveTransitive using (Star) data _≻_ : Scope → Scope → Set where parent≻child : ∀ x {k} (n : k SubnodeOf Scope.kind x) → x ≻ scope (Scope.ctxt x ▷ n) _≻*_ : Scope → Scope → Set _≻*_ = Star _≻_ ``` It's out of the scope of this document, but this lets us state a crucial property of temporary scoping: the temporary scope of an expression context is an ancestor of that expression. As a corollary, this lets us informally work around the issue of being able to represent scopes that don't correspond to real AST nodes: if we assume any expression we're scoping is real, then any ancestor of it is also real, so its temporary scope is real. More formally, if we had a well-formedness relation relating contexts to the AST, then we could prove that the ancestors of a well-formed context are well-formed. ## The enclosing temporary scope The temporary scope of non-extended temporaries is the closest ancestor temporary scope boundary, as defined by [destructors.scope.temporary.enclosing](https://doc.rust-lang.org/nightly/reference/destructors.html#r-destructors.scope.temporary.enclosing): ```haskell enclosingTempScope : Context Expr → Scope -- Body expressions are temporary scope boundaries. enclosingTempScope e-body@(_ ▷ |…|●) = scope e-body enclosingTempScope e-body@(_ ▷ const…=●⨾) = scope e-body -- Statements are temporary scope boundaries. enclosingTempScope (s-let ▷ let…=●⨾ _) = scope s-let enclosingTempScope (s-expr ▷ ●⨾) = scope s-expr -- Block tails are temporary scope boundaries enclosingTempScope e-tail@(_ ▷ ⦃…⨾●⦄) = scope e-tail -- Our other contexts are not temporary scope boundaries. enclosingTempScope (e-call ▷ …⦅…,●,…⦆ _) = enclosingTempScope e-call enclosingTempScope (e-array ▷ ⟦…,●,…⟧ _) = enclosingTempScope e-array enclosingTempScope (e-indexing ▷ ●⟦…⟧) = enclosingTempScope e-indexing enclosingTempScope (e-ref ▷ &●) = enclosingTempScope e-ref ``` # Stable Rust First, we need an auxiliary definition: [*extending contexts*](https://doc.rust-lang.org/nightly/reference/destructors.html#r-destructors.scope.lifetime-extension.exprs.extending). The operand of an extending borrow expression has an extended temporary scope. ```haskell -- Is a context extending? If so, what would it extend a borrowed temporary to? extendingScope : Context Expr → Maybe Scope -- `let` statements extend temporaries to the end of the block. extendingScope (s-let ▷ let…=●⨾ _) = just (scope (blockOf s-let)) -- Constant items extend temporaries to the end of the program. extendingScope (_ ▷ const…=●⨾) = just (scope ctxt-top) -- Block tails, array expression operands, and borrow expression operands are -- extending if their parents are, and extend to the same scope. extendingScope (e-block ▷ ⦃…⨾●⦄) = extendingScope e-block extendingScope (e-array ▷ ⟦…,●,…⟧ _) = extendingScope e-array extendingScope (e-ref ▷ &●) = extendingScope e-ref -- These expressions aren't extending. extendingScope (_ ▷ |…|●) = nothing extendingScope (_ ▷ ●⨾) = nothing extendingScope (_ ▷ …⦅…,●,…⦆ _) = nothing extendingScope (_ ▷ ●⟦…⟧) = nothing ``` There's one other thing to consider in the definition of temporary scopes as well: [destructors.scope.lifetime-extension.sub-expressions](https://doc.rust-lang.org/nightly/reference/destructors.html#r-destructors.scope.lifetime-extension.sub-expressions), or as I'll call it, *the sub-expressions rule*. I've chosen to interpret this as defining a grammar of *sub-expression contexts*. The temporary scope of a sub-expression context is the same as its parent's temporary scope. ```haskell tempScope-stable : Context Expr → Scope -- `let ref x = ●;` has an extended temporary scope. tempScope-stable (s-let ▷ let…=●⨾ pat-ext) = scope (blockOf s-let) -- The sub-expressions rule: `●⟦…⟧` has the same temporary scope as its parent, -- and `&●` does as well in "sub-expression contexts". tempScope-stable (e-indexing ▷ ●⟦…⟧) = tempScope-stable e-indexing tempScope-stable (e-ref@(_ ▷ ●⟦…⟧) ▷ &●) = tempScope-stable e-ref tempScope-stable (e-inner-ref@(_ ▷ &●) ▷ &●) = tempScope-stable e-inner-ref -- The scope of `&●` in other contexts depends on whether its parent is -- extending. If so, its temporary scope is extended by `extendingScope`. -- Otherwise, its temporary scope is its enclosing temporary scope. {-# CATCHALL #-} tempScope-stable e-operand@(e-ref ▷ &●) = case extendingScope e-ref of λ where (just x-extended) → x-extended nothing → enclosingTempScope e-operand -- Our other contexts don't have extended temporary scopes. -- Their temporary scopes are their enclosing temporary scopes. {-# CATCHALL #-} tempScope-stable e-other = enclosingTempScope e-other ``` ## Alternate formulation: expanded sub-expressions rule We can reinterpret the sub-expressions rule as applying more broadly and uniformly: the temporary scope of `e ▷ &●` for non-extending `e` is the temporary scope of `e`: ```haskell tempScope-stable′ : Context Expr → Scope -- `let ref x = ●;` has an extended temporary scope. tempScope-stable′ (s-let ▷ let…=●⨾ pat-ext) = scope (blockOf s-let) -- `●[…]` has the same temporary scope as its parent, per the sub-expr rule. tempScope-stable′ (e-indexing ▷ ●⟦…⟧) = tempScope-stable′ e-indexing -- The scope of `&●` depends on whether the borrow is extending. -- If it's extending, its temporary scope is extended by `extendingScope`. -- Otherwise, its temporary scope is the same as its parent's, per the -- expanded sub-expressions rule. tempScope-stable′ (e-ref ▷ &●) = case extendingScope e-ref of λ where (just x-extended) → x-extended nothing → tempScope-stable′ e-ref -- Our other contexts don't have extended temporary scopes. -- Their temporary scopes are their enclosing temporary scopes. {-# CATCHALL #-} tempScope-stable′ e-other = enclosingTempScope e-other ``` This is equivalent to the definition of `tempScope-stable` above: `∀ e → tempScope-stable e ≡ tempScope-stable′ e`. # Compiler PR #146098 For [rust-lang/rust#146098](https://github.com/rust-lang/rust/pull/146098), we replace the partial definition of extending expressions (`extendingScope`) with a total function that determines the scope of borrow expressions' operands in any non-"sub-expression" context. This is what I called the "extended scope" in [rust-lang/reference#2051](https://github.com/rust-lang/reference/pull/2051), though I'd like a better name for it. ```haskell -- `refScope-146098 e` is the scope of `e ▷ &●` when `e` is not a -- sub-expression context. refScope-146098 : Context Expr → Scope -- `let x = &●;` has an extended temporary scope. refScope-146098 (s-let ▷ let…=●⨾ _) = scope (blockOf s-let) -- `const … = &●;` has an extended temporary scope. refScope-146098 (_ ▷ const…=●⨾) = scope ctxt-top -- Extending subexpressions preserve `&●`'s temporary scope. refScope-146098 (e-block ▷ ⦃…⨾●⦄) = refScope-146098 e-block refScope-146098 (e-array ▷ ⟦…,●,…⟧ _) = refScope-146098 e-array refScope-146098 (e-ref-outer ▷ &●) = refScope-146098 e-ref-outer -- In other contexts, the temp scope of `&●` is the enclosing temp scope. {-# CATCHALL #-} refScope-146098 e-other = enclosingTempScope e-other ``` We can then define temporary scoping with this new rule analogously to how we originally did for stable Rust: ```haskell tempScope-146098 : Context Expr → Scope -- `let ref x = ●;` has an extended temporary scope. tempScope-146098 (s-let ▷ let…=●⨾ pat-ext) = scope (blockOf s-let) -- The sub-expressions rule: `●⟦…⟧` has the same temporary scope as its parent, -- and `&●` does as well in "sub-expression contexts". tempScope-146098 (e-indexing ▷ ●⟦…⟧) = tempScope-146098 e-indexing tempScope-146098 (e-ref@(_ ▷ ●⟦…⟧) ▷ &●) = tempScope-146098 e-ref tempScope-146098 (e-inner-ref@(_ ▷ &●) ▷ &●) = tempScope-146098 e-inner-ref -- The scope of `&●` in other contexts is given by `refScope-146098`. {-# CATCHALL #-} tempScope-146098 (e-ref ▷ &●) = refScope-146098 e-ref -- Our other contexts don't have extended temporary scopes. -- Their temporary scopes are their enclosing temporary scopes. {-# CATCHALL #-} tempScope-146098 e-other = enclosingTempScope e-other ``` This still splits temporary scoping into two steps: first, apply the sub-expressions rule; second, if you end up in an `&●` you take the `refScope-146098`, and otherwise you take the `enclosingTempScope`. The separation is important: it means PR #146098 only extends temporaries past statement/`const` boundaries when stable Rust does. # The expanded sub-expressions rule In [this commit to the Reference PR](https://github.com/rust-lang/reference/pull/2051/commits/6536b3be7957261bfe91567af8737454b6c34d40), I added an example based on an alternative interpretation of the sub-expressions rule: that the temporary scope of `&●` in a non-extending context is the temporary scope of its parent. We can get this by defining `tempScope-refPr` and `refScope-refPr` through mutual recursion: ```haskell tempScope-refPr : Context Expr → Scope refScope-refPr : Context Expr → Scope -- `let ref x = ●;` has an extended temporary scope. tempScope-refPr (s-let ▷ let…=●⨾ pat-ext) = scope (blockOf s-let) -- `●[…]` has the same temporary scope as its parent, per the expanded -- sub-expressions rule. tempScope-refPr (e-indexing ▷ ●⟦…⟧) = tempScope-refPr e-indexing -- The temporary scope of `&●` is given by `refScope-refPr`. tempScope-refPr (e-ref ▷ &●) = refScope-refPr e-ref -- Our other contexts don't have extended temporary scopes. -- Their temporary scopes are their enclosing temporary scopes. {-# CATCHALL #-} tempScope-refPr e-other = enclosingTempScope e-other -- `let x = &●;` has an extended temporary scope. refScope-refPr (s-let ▷ let…=●⨾ _) = scope (blockOf s-let) -- `const … = &●;` has an extended temporary scope. refScope-refPr (_ ▷ const…=●⨾) = scope ctxt-top -- Extending subexpressions preserve `&●`'s temporary scope. refScope-refPr (e-block ▷ ⦃…⨾●⦄) = refScope-refPr e-block refScope-refPr (e-array ▷ ⟦…,●,…⟧ _) = refScope-refPr e-array refScope-refPr (e-ref-outer ▷ &●) = refScope-refPr e-ref-outer -- `&●` outside of an extending subexpression has the same temporary scope as -- its parent, per the expanded sub-expressions rule. {-# CATCHALL #-} refScope-refPr e-ref-other = tempScope-refPr e-ref-other ``` Similar to stable Rust, the scoping rules for `e ▷ &●` no longer depend on whether `e` is a sub-expression context. This does however mean that it allows more programs than PR #146098 does by extending some temporaries past statement boundaries. Consider the expression context `let x = &[&●][0];`: ```haskell let_=&⟦&●⟧⟦0⟧⨾ : Context Stmt → Context Expr let_=&⟦&●⟧⟦0⟧⨾ s = s ▷ let…=●⨾ pat-non ▷ &● ▷ ●⟦…⟧ ▷ ⟦…,●,…⟧ 0 ▷ &● ``` In stable Rust and under PR #146098, the context's temporary scope is not extended, meaning any later use of `x` would result in a borrow-checking error: ```haskell scope-let_=&⟦&●⟧⟦0⟧⨾-stable : ∀ s-let → tempScope-stable (s-let |> let_=&⟦&●⟧⟦0⟧⨾) ≡ scope s-let scope-let_=&⟦&●⟧⟦0⟧⨾-stable _ = refl scope-let_=&⟦&●⟧⟦0⟧⨾-146098 : ∀ s-let → tempScope-146098 (s-let |> let_=&⟦&●⟧⟦0⟧⨾) ≡ scope s-let scope-let_=&⟦&●⟧⟦0⟧⨾-146098 _ = refl ``` However, with the expanded sub-expressions rule, this is extended to the end of the block: ```haskell scope-let_=&⟦&●⟧⟦0⟧⨾-refPr : ∀ s-let → tempScope-refPr (s-let |> let_=&⟦&●⟧⟦0⟧⨾) ≡ scope (blockOf s-let) scope-let_=&⟦&●⟧⟦0⟧⨾-refPr _ = refl ``` ## Alternate formulation: single function These rules can also be expressed using a single function: ```haskell tempScope-refPr′ : Context Expr → Scope -- `let ref x = ●;` and `let _ = &●;` have extended temporary scopes. tempScope-refPr′ (s-let ▷ let…=●⨾ pat-ext) = scope (blockOf s-let) tempScope-refPr′ (s-let ▷ let…=●⨾ _ ▷ &●) = scope (blockOf s-let) -- `const … = &●;` has an extended temporary scope. tempScope-refPr′ (_ ▷ const…=●⨾ ▷ &●) = scope ctxt-top -- `●[…]` has the same temporary scope as its parent, per the expanded -- sub-expressions rule. tempScope-refPr′ (e-indexing ▷ ●⟦…⟧) = tempScope-refPr′ e-indexing -- Extending subexpressions preserve `&●`'s temporary scope. tempScope-refPr′ (e-block ▷ ⦃…⨾●⦄ ▷ &●) = tempScope-refPr′ (e-block ▷ &●) tempScope-refPr′ (e-array ▷ ⟦…,●,…⟧ _ ▷ &●) = tempScope-refPr′ (e-array ▷ &●) tempScope-refPr′ (e-ref-outer ▷ &● ▷ &●) = tempScope-refPr′ (e-ref-outer ▷ &●) -- `&●` outside of an extending subexpression has the same temporary scope as -- its parent, per the expanded sub-expressions rule. {-# CATCHALL #-} tempScope-refPr′ (e-ref-other ▷ &●) = tempScope-refPr′ e-ref-other -- Our other contexts don't have extended temporary scopes. -- Their temporary scopes are their enclosing temporary scopes. {-# CATCHALL #-} tempScope-refPr′ e-other = enclosingTempScope e-other ``` This offers a more visual interpretation of extending subexpressions: to find the temporary scope of `&●` within an extending subexpression, imagine moving the `&●` up as though it was in its parent's place. `tempScope-146098` can also be expressed in this way, but it still requires two function definitions or some extra bit of state to encode whether the sub-expressions rule has been applied already.