# RFC 2229: 2020 Status
There is a somewhat functional implementation of the RFC available behind the feature gate `capture_disjoint_fields`.
## Proposed restrictions to the RFC
This section lists restrictions on how precisely a closure captures the paths mentioned within it.
### 1. Move closures don't capture Derefs
#### Motivation
- Move closures are often [detached] from the stack frame, e.g. to return a value or start a thread.
- Therefore we move things that are on the stack, and a dereferenced value may not be available on the enclosing stack frame. For example:
```rust
fn f<'a>(x: &'a mut NonCopyType) {
let c = move || {
move_value(*x);
// ^Error: Can't move *x as it's behind a &mut reference.
};
}
```
- Box performance
- One of the motivations for using a box is to be able to cheaply move the data by moving the box itself.
```rust
fn f() {
let tuple: Box<([String; 4096], String)>;
let c = move || {
move_value(tuple.0); // This is really (*tuple).0
};
}
```
- Notice that the size of `tuple` is 8 bytes (a single pointer on a 64-bit target), whereas the size of `tuple.0` is large.
- We would also be moving the captured value twice: once when the closure is created (the capture itself), and again when the closure is executed.
- Therefore moving `tuple.0` is much more expensive.
- Dropping `Deref` would mean that `tuple` is captured completely and only 8 bytes need to be moved.
[detached]: https://github.com/rust-lang/rfcs/pull/2229#issuecomment-410835070
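The size argument above can be checked directly with `std::mem::size_of`. This is only a sketch of the arithmetic: the `Payload` alias and the two helper functions are names made up for illustration, and the concrete byte counts depend on the target.

```rust
use std::mem::size_of;

/// The boxed payload type from the example above (alias is illustrative).
type Payload = ([String; 4096], String);

/// Bytes moved when the `Box` itself is moved: a single pointer.
fn box_size() -> usize {
    size_of::<Box<Payload>>()
}

/// Bytes moved when the first tuple field is moved out of the box.
fn first_field_size() -> usize {
    size_of::<[String; 4096]>()
}

fn main() {
    println!("moving the box:   {} bytes", box_size());
    println!("moving `tuple.0`: {} bytes", first_field_size());
}
```

Since a move closure would pay this cost both at capture time and again when it runs, capturing the box as a whole is the cheaper choice here.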
### 2. By value captures don't capture Derefs
#### Motivation
- Allows us to maintain symmetry with move closures where all paths are captured by value.
```rust
fn f<'a>(x: &'a mut NonCopyType) {
let c = || {
move_value(*x);
// ^Error: Can't move *x as it's behind a &mut reference.
};
}
```
### 3. Don't capture anything more precise on top of Raw ptrs
#### Motivation
- We won't know if dereferencing a raw pointer is safe until we execute the code.
```rust
struct X(i32, i32);
fn main() {
let x = X(42, 42);
let px: *const X = &x;
let c = || {
let val = unsafe { (*px).1 } ;
println!("{}", val)
};
c();
}
```
Here, if we captured `(*px).1`, we would be dereferencing a raw pointer outside of an unsafe block when building the closure. Therefore we stop capturing anything more precise once we see a raw pointer.
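One way to observe the effect of this restriction is to look at what a `move` closure stores. The sketch below assumes the restriction is in effect and uses a made-up helper, `capture_through_raw_pointer`; the closure ends up holding the pointer itself, and the dereference only happens inside the `unsafe` block when the closure runs.

```rust
use std::mem::{size_of, size_of_val};

#[allow(dead_code)]
struct X(i32, i32);

/// Returns the size of the closure's captured state and the value it reads.
/// (Hypothetical helper, for illustration only.)
fn capture_through_raw_pointer() -> (usize, i32) {
    let x = X(42, 7);
    let px: *const X = &x;
    // `move` stores the capture by value inside the closure.
    let c = move || {
        // SAFETY: `px` points at `x`, which is alive for the whole function.
        unsafe { (*px).1 }
    };
    let captured_bytes = size_of_val(&c);
    (captured_bytes, c())
}

fn main() {
    let (captured_bytes, val) = capture_through_raw_pointer();
    println!("closure holds {} bytes, read {}", captured_bytes, val);
    // The closure stores the raw pointer, not the `i32` behind it.
    assert_eq!(captured_bytes, size_of::<*const X>());
}
```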
### 4. Repr packed
[TBD]
```rust
#[repr(packed)]
struct Foo { x: u8, y: u8 }
fn f(foo: Foo) {
let c = || {
let z: &u8 = unsafe { &foo.x };
};
// We cannot capture `foo.x` at the point of closure *creation*; we must capture just `foo`.
// In the implementation:
// You'll see a place like `[foo, x]`.
// If you have a field projection where the field is defined in a `#[repr(packed)]` struct, you need to capture the struct itself.
}
```
Related issues: [#33](https://github.com/rust-lang/project-rfc-2229/issues/33)
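Since this section is still TBD, the following is only a sketch of the likely rationale, stated as an assumption: a packed struct has alignment 1, so a reference to one of its fields may be misaligned, and capturing the field by reference at closure creation would create exactly such a reference. Copying a field out by value is fine. The field types and the helper `read_fields_in_closure` are chosen for illustration and differ from the example above.

```rust
use std::mem::{align_of, size_of};

#[repr(packed)]
struct Foo {
    x: u8,
    y: u32,
}

/// Reads both fields through a closure by copying them out, never by reference.
/// (Hypothetical helper, for illustration only.)
fn read_fields_in_closure(foo: Foo) -> (u8, u32) {
    let c = move || {
        // Copying `Copy` fields out of a packed struct is fine. Writing
        // `&foo.y` here would be rejected by current compilers because the
        // reference could be misaligned.
        (foo.x, foo.y)
    };
    c()
}

fn main() {
    // Packing removes padding, so `y` sits at offset 1 and the struct
    // only has alignment 1.
    println!("size = {}, align = {}", size_of::<Foo>(), align_of::<Foo>());
    let (x, y) = read_fields_in_closure(Foo { x: 1, y: 0xDEAD_BEEF });
    println!("x = {}, y = {:#x}", x, y);
}
```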
### 5. Reborrowing captures for move closures (possible extension)
* Possible extension to the rules for move closures that makes them behave a bit more like regular closures:
* Sometimes capture by reference, if the place is only used by reference and it is a "reborrow" (deref of a reference)
* May have complex lifetime interactions and is hard to explain
* samsartor [proposed this on the original RFC](https://github.com/rust-lang/rfcs/pull/2229#issuecomment-411910790)
To understand the motivation, consider this example:
```rust=
struct S {
x: String,
y: String
}
fn x(mut s: &mut S) {
let c = move || {
s.x.truncate(0);
};
// Under rule 1, this would move `s` into the closure (because the capture stops at the dereference).
// Under rule 5, this would mutably borrow `s.x` (because the capture changes to a mutable borrow as we pass through the deref of an `&mut` reference).
let d = move || {
s.x.len();
};
// Under rule 1, this would move `s` into the closure (because the capture stops at the dereference).
// Under rule 5, this would take a shared borrow of `s.x` (because the place is only read, so the capture changes to a shared borrow as we pass through the deref of the reference).
}
```
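For comparison, the effect that rule 5 would produce automatically can be written by hand today with an explicit reborrow. This is a sketch of that manual pattern, not of the proposed implementation; `clear_x` is a made-up name.

```rust
struct S {
    x: String,
    y: String,
}

/// Clears `s.x` from inside a `move` closure without giving up `s`.
/// (Hypothetical helper, for illustration only.)
fn clear_x(s: &mut S) {
    // Reborrow only the field; the `move` closure then owns this
    // `&mut String` rather than the whole `&mut S`.
    let x = &mut s.x;
    let mut c = move || x.truncate(0);
    // `s.y` is still usable while `c` is alive, since only `s.x` is borrowed.
    s.y.push('!');
    c();
}

fn main() {
    let mut s = S { x: "hello".to_string(), y: "world".to_string() };
    clear_x(&mut s);
    println!("x = {:?}, y = {:?}", s.x, s.y);
}
```

The lifetime of the reborrow is explicit here, which sidesteps the "complex lifetime interactions" concern, at the cost of extra boilerplate at every such closure.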
## Open bugs/migration concerns
### Place mentioned in the closure isn't entirely captured
This set of problems arises because a place mentioned within the closure is either only partly used, or each of its fields is used separately.
The issue shows up in the compiler when we start building MIR: the initial passes run before the analysis of what is actually read is complete, and they expect the place to be available in its entirety.
This can take two (known) forms:
#### 1. Patterns
```rust=
let tup = ("A", "B", "C");
let c = || { let (a, b, c) = tup; }; // tup[0], tup[1], tup[2] captured
let c = || { let (_, b, c) = tup; }; // tup[1], tup[2] captured
let c = || { let (_, _, _) = tup; }; // nothing is captured
```
Here, when we start building MIR, we expect `tup` to be present. Only once we analyse the patterns might we realise that some of the places available in `tup` are never read (e.g. `tup[0]` on line 2).
The example on line 1 is particularly interesting because we do capture `tup` entirely, just not in the form the MIR build expects.
Related issues: [#24](https://github.com/rust-lang/project-rfc-2229/issues/24), [#27](https://github.com/rust-lang/project-rfc-2229/issues/27)
#### 2. Struct update syntax / Functional Record Update syntax
This is similar to (1): the MIR build expects the place mentioned within the closure to be completely captured.
```rust
struct X {
a : String,
b : String,
}
impl X {
fn from_other_X(o: Self) -> Self {
let c = || {
X { a: "".into(), ..o } // Only `o.b` is captured
};
c()
}
}
```
Here we expect `o` to be completely available even though only `o.b` is used.
**Note:** We have a workaround for such issues that allows us to compile the code at the cost of losing some precision. Check this [commit].
The workaround captures the place as it is mentioned within the closure: the precision of the capture stops at the precision of the place mentioned in the pattern or in the struct update syntax.
[commit]: https://github.com/sexxi-goose/rust/commit/8c64f01a21f200cbc727c1fee27dca629523190c#diff-7e389c2ca4db213dcaee698d12a364e7e39f77179a0da1c6578e1ffc9187a9e2
Related issues: [#32](https://github.com/rust-lang/project-rfc-2229/issues/32)
### Struct implements a Trait but subfield doesn't (Migration)
Consider this example:
```rust
use std::panic;

fn assert_panics<F>(f: F) where F: FnOnce() {
let w = panic::AssertUnwindSafe(f);
let result = panic::catch_unwind(move || {
w.0()
});
if let Ok(..) = result {
panic!("diverging function returned");
}
}
```
Wrapping `f` in `AssertUnwindSafe` ensures that `f` can be used with `panic::catch_unwind`. However, using `w.0` in the closure makes us capture `w.0` (that is, `f` itself), which can't be used with `panic::catch_unwind`, essentially rendering the wrapper redundant.
Related issues: [#28](https://github.com/rust-lang/project-rfc-2229/issues/28), [#29](https://github.com/rust-lang/project-rfc-2229/issues/29)
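One way to keep such code compiling under precise capture is the `let x = x;` workaround: mentioning the wrapper as a whole inside the closure forces the whole wrapper to be captured. The following is a sketch of that idea applied to the example above; `panics` is a made-up variant of `assert_panics` that returns a `bool` so it is easier to exercise.

```rust
use std::panic;

/// Returns `true` if `f` panicked. (Hypothetical helper, for illustration only.)
fn panics<F>(f: F) -> bool
where
    F: FnOnce(),
{
    let w = panic::AssertUnwindSafe(f);
    let result = panic::catch_unwind(move || {
        // Mentioning `w` as a whole makes the closure capture the wrapper,
        // not just the field `w.0`.
        let w = w;
        (w.0)()
    });
    result.is_err()
}

fn main() {
    println!("{}", panics(|| panic!("boom")));
    println!("{}", panics(|| ()));
}
```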
### Base of an index projection is always captured
Related issues: [#26](https://github.com/rust-lang/project-rfc-2229/issues/26)
## Perf results
### Capture analysis
1. [Results with just the capture analysis](https://perf.rust-lang.org/compare.html?start=9d78d1d02761b906038ba4d54c5f3427f920f5fb&end=4ef9e8c67e8251f2b7d6a6657c2b8cdf2c13b4d2). Related PR: [#78762](https://github.com/rust-lang/rust/pull/78762)
### THIR/MIR lowering
1. [Results without fixes to handle mutability properly and without any restriction and/or workaround](https://perf.rust-lang.org/compare.html?start=28b86e0860f0593b85cda6c2c7b03ae8a582962f&end=f76f72250e36c9ad4eb3bd3f08770228cc07da6c). Related PR: [#79553](https://github.com/rust-lang/rust/pull/79553). Note that the feature `capture_disjoint_fields` isn't enabled by default here.
After applying the restrictions and the workaround discussed above, perf tests were run in the related PR [#79696](https://github.com/rust-lang/rust/pull/79696).
The PR enables `capture_disjoint_fields` by default.
2. [Result after applying restrictions (1) (3)](https://perf.rust-lang.org/compare.html?start=15eaa0020b79ad9a9a0c486d1abd00b29c6c5ae2&end=d5c47437deb5a77afed53e0bba2a050fbeaf974c)
3. [Result after applying restrictions (1) (2) (3)](https://perf.rust-lang.org/compare.html?start=0f6f2d681b39c5f95459cd09cb936b6ceb27cd82&end=aee064ad665521e03371fbb810437a813fbdd365)
4. [Perf difference between without (2) and with (2)](https://perf.rust-lang.org/compare.html?start=d5c47437deb5a77afed53e0bba2a050fbeaf974c&end=aee064ad665521e03371fbb810437a813fbdd365). Note that some PRs were merged between perf runs (2) and (3), so we might not want to read too much into the difference or the increased bootstrap times.
All the increases in instruction counts are in incremental compilation cases. I _speculate_ this is because we added `closure_min_captures` to `TypeckResults` without removing `closure_captures` and `upvar_capture_map`, i.e. strictly increasing the work needed to stable-hash `TypeckResults`.
### Removing Trailing derefs
1. [Perf after removing trailing derefs along with restrictions (1) (2) (3)](https://perf.rust-lang.org/compare.html?start=0f6f2d681b39c5f95459cd09cb936b6ceb27cd82&end=80f976ebf93cfeeb1f52bbd43eedb03b4568b8bc)
2. [Perf difference between with and without trailing derefs](https://perf.rust-lang.org/compare.html?start=aee064ad665521e03371fbb810437a813fbdd365&end=80f976ebf93cfeeb1f52bbd43eedb03b4568b8bc&stat=instructions%3Au)
We see that removing trailing derefs (less information written to `TypeckResults`) results in minor savings in `incr-full` cases, which lends some support to my earlier speculation.
That aside, removing trailing derefs doesn't make any significant performance difference.
## Migration Plan
In some cases, closures will capture more precise places than they did before. This affects the runtime semantics because the destructors can run at different times. In the older version, the destructor for the base variable `x` would run when the closure was dropped (because the closure owned `x`). In the newer version, if the closure only captures `x.y`, then the destructor for the field `y` will run when the closure is dropped, but the destructor for the remainder of `x` will run when `x` is dropped in the creator's scope.
Workaround: introduce `let x = x;` into the closure.
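The effect of the workaround can be made visible by logging destructor calls. This is a minimal sketch, with `Noisy`, `Log`, and `drop_order_with_workaround` being names made up for illustration: with `let x = x;` in the body, the closure owns all of `x`, so both fields are dropped when the closure is dropped.

```rust
use std::cell::RefCell;
use std::rc::Rc;

type Log = Rc<RefCell<Vec<&'static str>>>;

/// Records its name in a shared log when dropped. (Illustrative type.)
struct Noisy {
    name: &'static str,
    log: Log,
}

impl Drop for Noisy {
    fn drop(&mut self) {
        self.log.borrow_mut().push(self.name);
    }
}

/// Returns the order of events when the closure uses the workaround.
fn drop_order_with_workaround() -> Vec<&'static str> {
    let log: Log = Rc::new(RefCell::new(Vec::new()));
    {
        let x = (
            Noisy { name: "x.0", log: log.clone() },
            Noisy { name: "x.1", log: log.clone() },
        );
        let c = || {
            let x = x; // capture all of `x`, not only `x.0`
            drop(x.0);
        };
        drop(c); // never called: dropping it drops everything it captured
        log.borrow_mut().push("closure dropped");
    }
    log.borrow_mut().push("scope exited");
    let events = log.borrow().clone();
    events
}

fn main() {
    println!("{:?}", drop_order_with_workaround());
}
```

Without the `let x = x;` line, a closure that captures only `x.0` would leave `x.1` to be dropped when the enclosing block exits, which is the change in semantics described above.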
Observation:
* Few destructors actually have side effects in practice
* Most just free memory
* Most users don't really care when the destructor runs, as long as it runs, and/or they would probably prefer the memory to be freed earlier, which it may be (or may not be...) with the more precise capture.
* Hypothesis: so long as the destructors only free memory, we should not try to preserve their precise ordering through automated migration.
* We *could* have an opt-in *maybe*, but we have no mechanism for that right now
* We should detect if the "no longer moved" content has a type that may have a "significant destructor"
* How do you define a significant destructor?
* Probably: annotate destructors in the stdlib with `#[rustc_insignificant_dtor]` or something like that
* Everything else is assumed significant
* Walk the type of each no-longer-captured place to see whether it may include a dtor
* What you need to do after capture analysis:
* Clone the "captured variable data" before adding in the "fake reads" of mentioned upvars
* Add fake reads of mentioned upvars, rerun capture analysis
* Now we can diff the results
* Analysis:
* Givens:
* minimum captured place
* Returns true if:
* the local variable may have a significant destructor in its type
* option: and that destructor is found somewhere outside the captured place
* Related:
* [`needs_drop`] if any part of this type may define a destructor, start with this
* then make needs_drop understand significant destructors
* then make a variant that takes in a slice of projections and kind of tracks which "part" of the projection it is in
* We will have a list of local variables that used to be captured and they may have significant dtors
* `let x = x;` inserted into the body
* if the closure doesn't have a block already, we have to add `{let x = x; ` at the front
* and `}` at the end
* else insert `let x = x;` at the start of the block
* `move[x, y] || ` -- this would be an easier transition for us to make and it would be cleaner for users
* `{ let x = &22; move || x }`
* `{ let x = x.clone(); move || x }` where `x: Rc<u32>`
* make a lint with a suggestion
* PR series:
* Just add the lint but using `needs_drop` and with no suggestions
* Make lint more targeted
* Add suggestions
[`needs_drop`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/ty/struct.TyS.html#method.needs_drop
```rust=
struct Foo { }
impl Drop for Foo { fn drop(&mut self) { } }
fn main() {
let x = (String::new(), String::new());
let c = || drop(x.0); // <-- changes semantics, but who cares
// change in semantics is:
// * `x.1` is dropped when the block exits, not when `c` is dropped
// option:
let x = (Foo { }, String::new());
let c = || drop(x.0); // <-- changes semantics, but who cares
// change in semantics is:
// * `x.1` is dropped when the block exits, not when `c` is dropped
let x = (String::new(), Foo { });
let c = || drop(x.0); // <-- changes semantics, we maybe care
// change in semantics is:
// * `x.1` is dropped when the block exits, not when `c` is dropped,
// and it has a significant destructor
}
```
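The three cases above can be expressed as a small check over a toy model of types. This is a sketch of the "significant destructor outside the captured place" option only; `Ty`, `may_have_significant_dtor`, and `significant_dtor_outside` are hypothetical names and do not correspond to compiler APIs. It assumes `String` counts as insignificant and `Foo` as significant.

```rust
/// Hypothetical, simplified stand-in for a type's structure.
enum Ty {
    /// A type without fields; the flag says whether its destructor is significant.
    Leaf { significant_dtor: bool },
    /// A tuple or struct, with one entry per field.
    Fields(Vec<Ty>),
}

/// True if any part of `ty` may run a significant destructor.
fn may_have_significant_dtor(ty: &Ty) -> bool {
    match ty {
        Ty::Leaf { significant_dtor } => *significant_dtor,
        Ty::Fields(fields) => fields.iter().any(may_have_significant_dtor),
    }
}

/// True if a significant destructor lives outside the captured place, where
/// `captured` lists the field indices leading from the variable to that place.
fn significant_dtor_outside(ty: &Ty, captured: &[usize]) -> bool {
    match (ty, captured.split_first()) {
        // Empty path: the whole value is captured, nothing is left outside.
        (_, None) => false,
        // Check the siblings at this level, then descend along the path.
        (Ty::Fields(fields), Some((&idx, rest))) => fields.iter().enumerate().any(|(i, f)| {
            if i == idx {
                significant_dtor_outside(f, rest)
            } else {
                may_have_significant_dtor(f)
            }
        }),
        // A projection into a leaf cannot happen in this toy model.
        (Ty::Leaf { .. }, Some(_)) => false,
    }
}

fn main() {
    let string = || Ty::Leaf { significant_dtor: false }; // assumed insignificant
    let foo = || Ty::Leaf { significant_dtor: true }; // assumed significant
    // The closure captures `x.0` in each case, i.e. the path `[0]`.
    let cases = [
        ("(String, String)", Ty::Fields(vec![string(), string()])),
        ("(Foo, String)", Ty::Fields(vec![foo(), string()])),
        ("(String, Foo)", Ty::Fields(vec![string(), foo()])),
    ];
    for (name, ty) in &cases {
        println!("{}: needs migration = {}", name, significant_dtor_outside(ty, &[0]));
    }
}
```

Only the last case reports a significant destructor outside the captured place, matching the "we maybe care" annotation.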
## Diagnostics Plan
### Borrow checker diagnostics
```rust
let mut p = Point { x: 10, y: 10 };
let c = || {
p.x += 10;
// ^^^ mutable borrow of `p` happens here
println!("{:?}", p);
// ^ `p` is captured since it's used here.
};
```
When the `CaptureKind` span and the `CapturePath` span are the same, we emit only one note, similar to how we do it today.
```rust
let mut p = Point { x: 10, y: 10 };
let c = || {
p = Point { ... } ;
// ^ mutable borrow of `p` happens here
};
```
### Diagnostics around special restrictions
[TBD]
## Aside
### Debugging support
Annotating a closure with `rustc_capture_analysis` will print out the results of the capture analysis to stderr.
```rust
#![feature(rustc_attrs)]
fn main() {
let foo = [1, 2, 3];
let c = #[rustc_capture_analysis] || match foo { _ => () };
}
```
### Extension to improve fn coercion
Once we fix the known issues with precise captures we can coerce functions that mention upvars but never use them.
Related issues: [#23](https://github.com/rust-lang/project-rfc-2229/issues/23)
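For context, a closure that captures nothing already coerces to a function pointer. The sketch below shows only that baseline; the extension would presumably let closures that merely mention a variable without using it qualify as well. `apply_twice` and `add_three_twice` are names made up for illustration.

```rust
/// Applies a plain function pointer twice.
fn apply_twice(f: fn(i32) -> i32, v: i32) -> i32 {
    f(f(v))
}

/// Passes a non-capturing closure where a function pointer is expected.
fn add_three_twice(v: i32) -> i32 {
    // This closure uses nothing from its environment, so it coerces to
    // `fn(i32) -> i32`. A closure that captured a local would not.
    apply_twice(|n| n + 3, v)
}

fn main() {
    println!("{}", add_three_twice(1));
}
```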