RFC 2229: 2020 Status

A somewhat functional implementation of the RFC is available behind the feature gate capture_disjoint_fields.

Proposed restrictions to the RFC

This section lists restrictions on how precisely a closure captures the paths mentioned within it.

1. Move closures don't capture Derefs

Motivation

  • Move closures are often detached from the stack frame, e.g. to return a value or start a thread.
    • Therefore we move things that are on the stack, and a dereferenced value may not be available on the enclosing stack frame. For example:
    fn f<'a>(x: &'a mut NonCopyType) {
        let c = move || {
            move_value(*x);
            //         ^Error: Can't move *x as it's behind a &mut reference.
        };
    }
    
  • Box performance
    • One of the motivations for using a Box is to be able to move the data cheaply by moving the Box itself.
    fn f(tuple: Box<([String; 4096], String)>) {
        let c = move || {
            move_value(tuple.0); // This is really (*tuple).0
        };
    }

    • Notice that the size of tuple is 8 bytes (a pointer on a 64-bit target), whereas tuple.0 is large.
    • We would also be moving the captured data twice: once to create the closure (i.e. to capture it) and again when the closure is executed.
    • Therefore moving tuple.0 is much more expensive.
    • Dropping the Deref means that tuple is captured completely and only 8 bytes need to be moved.
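
To put rough numbers on the Box argument above, here is a minimal sketch that compares the two capture sizes. The helper names are made up for illustration; only std::mem::size_of is a real API, and the 8-byte figure assumes a 64-bit target.

```rust
use std::mem::size_of;

/// The tuple type from the example above.
type Tuple = ([String; 4096], String);

/// Bytes moved if the closure captures the whole `Box` (just the pointer).
/// Hypothetical helper, not part of the implementation.
fn bytes_moved_capturing_box() -> usize {
    size_of::<Box<Tuple>>()
}

/// Bytes moved if the closure captured `(*tuple).0` by value instead.
/// Hypothetical helper, not part of the implementation.
fn bytes_moved_capturing_field() -> usize {
    size_of::<[String; 4096]>()
}
```

Calling both helpers shows a pointer-sized capture for the Box against 4096 times the size of a String for the field, and that larger amount would be moved twice.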

2. By value captures don't capture Derefs

Motivation

  • Allows us to maintain symmetry with move closures, where all paths are captured by value.
    fn f<'a>(x: &'a mut NonCopyType) {
        let c = || {
            move_value(*x);
            //         ^Error: Can't move *x as it's behind a &mut reference.
        };
    }
    

3. Don't capture anything more precise on top of raw pointers

Motivation

  • We won't know if dereferencing a raw pointer is safe until we execute the code.
struct X(i32, i32);

fn main() {
    let x = X(42, 42);
    let px: *const X = &x;
    
    let c = || {
        let val = unsafe { (*px).1 } ;
        println!("{}", val)
    };
    c();
}

Here, if we captured (*px).1, we would be dereferencing a raw pointer outside of an unsafe block when building the closure. Therefore we stop capturing anything more precise once we see a raw pointer.
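
One way to observe this restriction from user code is to look at the size of the closure. This is only a sketch: the function name is hypothetical, and it relies on the closure storing nothing but its capture, which holds in practice but is not a documented layout guarantee.

```rust
use std::mem::size_of_val;

struct X(i32, i32);

/// Returns (size of the closure in bytes, value read through the raw pointer).
/// Hypothetical helper for illustration only.
fn raw_ptr_capture() -> (usize, i32) {
    let x = X(42, 7);
    let px: *const X = &x;
    // The closure captures the pointer `px` itself, never the place `(*px).1`,
    // so the dereference only happens inside the `unsafe` block when it runs.
    let c = move || unsafe { (*px).1 };
    (size_of_val(&c), c())
}
```

If the closure had captured (*px).1 directly, it would hold an i32 rather than a pointer-sized value.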

4. Repr packed

[TBD]

#[repr(packed)]
struct Foo { x: u8, y: u8 }

fn f(foo: Foo) {
    let c = || {
        let z: &u8 = unsafe { &foo.x };
    
    };
    // cannot capture `foo.x` at the point of closure *creation*, must just capture `foo`
    // In your code:
    // You'll see a place like `[foo, x]`
    // if you have a field projection where the field is defined in a `#[repr(packed)]` struct, you need to capture the struct itself
}

Related issues: #33
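
Since this section is still TBD, the following is only a sketch of why a field of a #[repr(packed)] struct is awkward to capture on its own. The Packed type and both helpers are assumptions made for illustration; a u32 field is used because, unlike the u8 fields above, it can actually end up misaligned.

```rust
use std::mem::{align_of, size_of};

/// Hypothetical packed struct: `b` sits at offset 1, so it is not 4-byte aligned.
#[repr(packed)]
struct Packed {
    a: u8,
    b: u32,
}

/// Returns (size, alignment) of `Packed`.
fn packed_layout() -> (usize, usize) {
    (size_of::<Packed>(), align_of::<Packed>())
}

/// Reads `p.b` inside a closure by copying it out by value.
fn read_b_in_closure(p: Packed) -> u32 {
    let c = || {
        // Copy the field out. A reference to `p.b` could be misaligned, which is
        // why the capture has to stop at `p` instead of borrowing the field.
        let b = p.b;
        b
    };
    c()
}
```

The layout helper reports a size of 5 and an alignment of 1, which confirms that b is not naturally aligned.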

5. Reborrowing captures for move closures (possible extension)

  • Possible extension to the rules for move closures that makes them behave a bit more like regular closures:
    • Sometimes capture by reference, if the place is only used by reference and it is a "reborrow" (deref of a reference)

To understand the motivation, consider this example:

struct S { x: String, y: String }

fn x(mut s: &mut S) {
    let c = move || {
        s.x.truncate(0);
    };
    // Under rule 1, this would move `s` into the closure (because the capture stops at the dereference).
    // under rule 5, this would mutably reference `s.x` (because the capture changes to a mutable borrow as we pass through the deref of an `&mut` reference).

    let d = move || {
        s.x.len();
    };
    // Under rule 1, this would move `s` into the closure (because the capture stops at the dereference).
    // under rule 5, this would share reference `s.x` (because the capture changes to a borrow as we pass through the deref of an `&` reference).
}
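
The effect that rule 5 aims for can already be written by hand with an explicit reborrow. The sketch below is not the proposed implementation, just the manual equivalent; the function name is hypothetical.

```rust
struct S {
    x: String,
    y: String,
}

/// Truncates `s.x` through a move closure while `s.y` stays usable.
/// Returns the combined length of both fields afterwards.
fn truncate_x_keep_y(s: &mut S) -> usize {
    // Manual reborrow: only `s.x` is mutably borrowed, and that borrow is
    // what gets moved into the closure.
    let sx = &mut s.x;
    let mut c = move || sx.truncate(0);

    // `s.y` is still accessible because the closure does not own `s`.
    let y_len = s.y.len();
    c();
    y_len + s.x.len()
}
```

Under rule 1 alone the closure would take s as a whole, so the read of s.y between creating and calling the closure would not be possible without the explicit reborrow.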

Open bugs/migration concerns

Place mentioned in the closure isn't entirely captured

This set of problems arises because a place mentioned within the closure is either only partly used, or each of its fields is used separately.

The issue shows up in the compiler when we start building MIR: the initial passes run before the analysis of what is actually read is complete, and they expect the place to be available in its entirety.

This can take two (known) forms:

1. Patterns

let tup = ("A", "B", "C");
let c = || { let (a, b, c) = tup; }; // tup[0], tup[1], tup[2] captured
let c = || { let (_, b, c) = tup; }; // tup[1], tup[2] captured
let c = || { let (_, _, _) = tup; }; // nothing is captured

Here, when we start building MIR, we expect tup to be present; only once we analyse the patterns might we realise that some of the places available in tup will not be read (e.g. tup[0] in the second closure).

The first closure is particularly interesting because we do capture tup entirely, just not in the way the MIR build expects.

Related issues: #24, #27

2. Struct update syntax/ Functional Record Update syntax

This is similar to (1): the MIR build expects the place mentioned within the closure to be completely captured.

struct X {
    a : String,
    b : String,
}

impl X {
    fn from_other_X(o: Self) -> Self {
        let c = || {
            X { a: "".into(), ..o } // Only `o.b` is captured
        };
        c()
    }
}

Here we expect o to be completely available even though only o.b is used.

Note: We have a workaround for such issues that allows us to compile code at the cost of losing some precision. Check this commit.
The workaround essentially captures the place that is mentioned within the closure, and the precision of the capture stops at the precision of the place that is mentioned in patterns or in the struct update syntax.

Related issues: #32

Struct implements a Trait but subfield doesn't (Migration)

Consider example:

use std::panic;

fn assert_panics<F>(f: F) where F: FnOnce() {
    let w = panic::AssertUnwindSafe(f);
    let result = panic::catch_unwind(move || {
        w.0()
    });
    if let Ok(..) = result {
        panic!("diverging function returned");
    }
}

Wrapping f in AssertUnwindSafe ensures that f can be used with panic::catch_unwind. However, using w.0 in the closure makes us capture only w.0, i.e. f itself, which can't be used with panic::catch_unwind, essentially rendering the wrapper redundant.
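
A sketch of how affected code could keep the wrapper in the capture, using the same let x = x; idea as the migration workaround. The function name is hypothetical and this is not necessarily the fix the migration will suggest.

```rust
use std::panic;

/// Returns true if calling `f` panics. Hypothetical helper for illustration.
fn panics_when_called<F>(f: F) -> bool
where
    F: FnOnce(),
{
    let w = panic::AssertUnwindSafe(f);
    let result = panic::catch_unwind(move || {
        // Mention `w` as a whole so the closure captures the wrapper,
        // not just the inner `w.0`.
        let w = w;
        (w.0)()
    });
    result.is_err()
}
```

Because the closure owns the AssertUnwindSafe value, it stays usable with catch_unwind whether captures are per variable or per field.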

Related issues: #28, #29

Base of an index projection is always captured

Related issues: #26

Perf results

Capture analysis

  1. Results with just the capture analysis. Related PR: #78762

THIR/MIR lowering

  1. Results without fixes to handle mutability properly and without any restriction and/or workaround. Related PR: #79553. Note that the feature capture_disjoint_fields isn't enabled by default here.

After applying the restrictions and the workaround discussed above, perf tests were done here. Related PR #79696

The PR enables capture_disjoint_fields by default

  1. Result after applying restrictions (1) (3)
  2. Result after applying restrictions (1) (2) (3)
  3. Perf difference between without (2) and with (2). Note that some PRs were merged between perf runs (2) and (3), so we might not want to read too much into the increased bootstrap times.

All the increases in instruction counts are in incremental compilation cases. I speculate this is because we added closure_min_captures to TypeckResults without removing closure_captures and upvar_capture_map, i.e. strictly increasing the work that needs to be done for stable hashing of TypeckResults.

Removing Trailing derefs

  1. Perf after removing trailing derefs along with restrictions (1) (2) (3)
  2. Perf difference between with and without trailing derefs

We see that removing trailing derefs (i.e. less information written to TypeckResults) results in minor savings in incr-full cases, which supports my earlier speculation.

That aside, removing trailing derefs doesn't make any significant performance difference.

Migration Plan

In some cases, closures will capture more precise places than they did before. This affects the runtime semantics because the destructors can run at different times. In the older version, the destructor for the base variable x would run when the closure was dropped (because the closure owned x). In the newer version, if the closure only captures x.y, then the destructor for the field y will run when the closure is dropped, but the destructor for the remainder of x will run when x is dropped in the creator's scope.

Workaround: introduce let x = x; into the closure.
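
A minimal sketch of the workaround, with a made-up Noisy type that records when it is dropped. Mentioning x as a whole makes the closure own all of x, so both fields are dropped together with the closure.

```rust
use std::cell::RefCell;
use std::rc::Rc;

type Log = Rc<RefCell<Vec<&'static str>>>;

/// Hypothetical type with an observable destructor: it logs its own drop.
struct Noisy {
    name: &'static str,
    log: Log,
}

impl Drop for Noisy {
    fn drop(&mut self) {
        self.log.borrow_mut().push(self.name);
    }
}

/// Returns the events recorded while a closure using the workaround is dropped.
fn drop_order_with_workaround() -> Vec<&'static str> {
    let log: Log = Rc::new(RefCell::new(Vec::new()));
    {
        let x = (
            Noisy { name: "x.0", log: log.clone() },
            Noisy { name: "x.1", log: log.clone() },
        );
        let c = || {
            let x = x; // workaround: capture all of `x`, not just `x.0`
            drop(x.0);
        };
        drop(c); // the closure owns `x`, so both fields are dropped here
        log.borrow_mut().push("closure dropped");
    }
    log.borrow_mut().push("scope exited");
    let events = log.borrow().to_vec();
    events
}
```

Both destructors run before "closure dropped" is logged, which is the old ordering. Without the let x = x; line, a precise capture of x.0 would leave x.1 to be dropped at the end of the inner scope.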

Observation:

  • Few destructors actually have side effects in practice
    • Most just free memory
  • Most users don't really care when the destructor runs, as long as it runs, and/or they would probably prefer the memory to be freed earlier, which it may be (or may not be) with the more precise capture.
    • Hypothesis: so long as the destructors only free memory, we should not try to preserve their precise ordering through automated migration.
      • We could have an opt-in maybe, but we have no mechanism for that right now
  • We should detect if the "no longer moved" content has a type that may have a "significant destructor"
    • How do you define a significant destructor?
      • Probably: annotate destructors in the stdlib with #[rustc_insignificant_dtor] or something like that
      • Everything else is assumed significant
    • Walk the type of each no-longer-captured place to see whether it may include a dtor
  • What you need to do after capture analysis:
    • Clone the "captured variable data" before adding in the "fake reads" of mentioned upvars
    • Add fake reads of mentioned upvars, rerun capture analysis
    • Now we can diff the results
  • Analysis:
    • Givens:
      • minimum captured place
    • Returns true if:
      • the local variable may have a significant destructor in its type
      • option: and that destructor is found somewhere outside the captured place
    • Related:
      • needs_drop: true if any part of this type may define a destructor; start with this
      • then make needs_drop understand significant destructors
      • then make a variant that takes in a slice of projections and kind of tracks which "part" of the projection it is in
  • We will have a list of local variables that used to be captured and they may have significant dtors
    • let x = x; inserted into the body
      • if the closure doesn't have a block already, we have to add {let x = x; at the front
        • and } at the end
      • else insert let x = x; at the start of the block
    • move[x, y] || this would be an easier transition for us to make and it would be cleaner for users
      • { let x = &22; move || x }
      • { let x = x.clone(); move || x } where x: Rc<u32>
    • make a lint with a suggestion
  • PR series:
    • Just add the lint but using needs_drop and with no suggestions
    • Make lint more targeted
    • Add suggestions
struct Foo { }
impl Drop for Foo { fn drop(&mut self) { /* significant side effect */ } }

fn main() {
    let x = (String::new(), String::new());
    let c = || drop(x.0); // <-- changes semantics, but who cares
    // change in semantics is:
    // * `x.1` is dropped when the block exits, not when `c` is dropped

    // option:
    let x = (Foo { }, String::new());
    let c = || drop(x.0); // <-- changes semantics, but who cares
    // change in semantics is:
    // * `x.1` is dropped when the block exits, not when `c` is dropped

    let x = (String::new(), Foo { });
    let c = || drop(x.0); // <-- changes semantics, we maybe care
    // change in semantics is:
    // * `x.1` is dropped when the block exits, not when `c` is dropped,
    //   and it has a significant destructor
}
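
The first step of the PR series (a lint based on needs_drop alone) can be approximated in user code with std::mem::needs_drop. This is only an analogy: the compiler uses its own internal query, and the helper and type names below are hypothetical.

```rust
use std::mem::needs_drop;

/// Hypothetical stand-in for a type with a "significant" destructor.
struct Foo;

impl Drop for Foo {
    fn drop(&mut self) {
        // Imagine an observable side effect here (flushing, unlocking, logging).
    }
}

/// First approximation: treat every type that may run any destructor
/// as significant.
fn may_have_dtor<T>() -> bool {
    needs_drop::<T>()
}
```

This check also flags (String, String), whose destructors only free memory. That is the imprecision the later steps address by teaching needs_drop about insignificant destructors.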

Diagnostics Plan

Borrow checker diagnostics

let mut p = Point { x: 10, y: 10 };

let c = || {
    p.x += 10;
//  ^^^ mutable borrow of `p` happens here
    println!("{:?}",  p);
//                    ^ `p` is captured since it's used here.
};

If both the CaptureKind span and the CapturePath span are the same, we emit only one note, similar to what we do today.

let mut p = Point { x: 10, y: 10 };

let c = || {
    p = Point { ... };
//  ^ mutable borrow of `p` happens here
};

Diagnostics around special restrictions

[TBD]

Aside

Debugging support

Annotating a closure with rustc_capture_analysis will print out the results of the capture analysis to stderr.

#![feature(rustc_attrs)]

fn main() {
    let foo = [1, 2, 3];
    let c = #[rustc_capture_analysis] || match foo { _ => () };
}

Extension to improve fn coercion

Once we fix the known issues with precise captures, we can coerce closures that mention upvars but never use them into fn pointers.
Related issues: #23
