There is a somewhat functional implementation of the RFC available behind the feature gate capture_disjoint_fields.
This section lists restrictions on how precisely a closure captures the paths mentioned within it.
fn f<'a>(x: &'a mut NonCopyType) {
let c = move || {
move_value(*x);
// ^Error: Can't move *x as it's behind a &mut reference.
};
}
fn f() {
let tuple: Box<([String; 4096], String)>;
let c = move || {
move_value(tuple.0); // This is really (*tuple).0
};
}
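For a sense of scale, the sizes involved in the example above can be checked with std::mem::size_of. This is only an illustrative sketch: the type alias Tuple and the two helper function names are made up for this example and are not part of the RFC or the compiler.

```rust
use std::mem::size_of;

/// The boxed tuple type from the example above (alias name is hypothetical).
type Tuple = Box<([String; 4096], String)>;

/// Bytes that have to be moved when the closure captures the `Box` itself.
fn box_capture_size() -> usize {
    // A `Box` of a sized type is a single pointer.
    size_of::<Tuple>()
}

/// Bytes that would have to be moved if the closure captured `tuple.0` by value.
fn field_capture_size() -> usize {
    // An array of 4096 `String`s stored inline.
    size_of::<[String; 4096]>()
}
```

On a 64-bit target box_capture_size() is 8, while field_capture_size() is 4096 times the size of a String.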
sizeof(tuple) = 8, whereas the size of tuple.0 is large, so moving tuple.0 is much more expensive.
Stopping the capture at the Deref would mean that tuple is captured completely and only 8 bytes need to be moved.
fn f<'a>(x: &'a mut NonCopyType) {
let c = || {
move_value(*x);
// ^Error: Can't move *x as it's behind a &mut reference.
};
}
struct X(i32, i32);
fn main() {
let x = X(42, 42);
let px: *const X = &x;
let c = || {
let val = unsafe { (*px).1 } ;
println!("{}", val)
};
c();
}
Here, if we capture (*px).1, then when building the closure we will be dereferencing a raw pointer outside of an unsafe block. Therefore we stop capturing anything more precise once we see a raw pointer.
[TBD]
#[repr(packed)]
struct Foo { x: u8, y: u8 }
fn main(foo: Foo) {
let c = || {
let z: &u8 = unsafe { &foo.x };
};
// cannot capture `foo.x` at the point of closure *creation*, must just capture `foo`
// In your code:
// You'll see a place like `[foo, x]`
// if you have a field projection where the field is defined in a `#[repr(packed)]` struct, you need to capture the struct itself
}
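A minimal runnable variant of the packed case, as a sketch only: the struct Packet and the function read_fields are hypothetical names chosen for illustration. The closure copies the fields out by value, so no reference to a potentially unaligned field is created in user code; by the rule above, the closure would capture the whole struct rather than the individual fields. The capture itself is not observable from this code.

```rust
/// A packed struct: `len` may sit at an unaligned offset.
#[repr(packed)]
struct Packet {
    tag: u8,
    len: u32,
}

/// Reads both fields from inside a closure.
fn read_fields(p: Packet) -> (u8, u32) {
    let c = || {
        // Fields are copied out by value; no `&p.len` is ever formed here.
        (p.tag, p.len)
    };
    c()
}
```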
Related issues: #33
To understand the motivation, consider this example:
struct S {
x: String,
y: String
}
fn x(mut s: &mut S) {
let c = move || {
s.x.truncate(0);
};
// Under rule 1, this would move `s` into the closure (because the capture stops at the dereference).
// Under rule 5, this would mutably borrow `s.x` (because the capture changes to a mutable borrow as we pass through the deref of an `&mut` reference).
let d = move || {
s.x.len();
};
// Under rule 1, this would move `s` into the closure (because the capture stops at the dereference).
// Under rule 5, this would take a shared reference to `s.x` (because the capture changes to a borrow as we pass through the deref of an `&` reference).
}
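The first closure above can be turned into a small runnable sketch. The function name clear_x is hypothetical. Whichever rule applies, the caller observes the same result; the rules differ only in what the closure holds (the reference s itself, or a mutable borrow of s.x), which this code does not attempt to observe.

```rust
struct S {
    x: String,
    y: String,
}

/// Empties `s.x` through a `move` closure that reaches the field via `&mut S`.
fn clear_x(s: &mut S) {
    // The closure mutates through the reference, so it must be bound as `mut`.
    let mut c = move || {
        s.x.truncate(0);
    };
    c();
}
```

After calling clear_x(&mut value), value.x is empty and value.y is untouched.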
This set of problems arises because a place mentioned within the closure is either only partly used, or each of its fields is used separately.
The issue shows up in the compiler when we start building MIR: the initial passes run before the analysis of what is actually read is complete, and they expect the place to be available in its entirety.
This can take two (known) forms:
let tup = ("A", "B", "C");
let c = || { let (a, b, c) = tup; }; // tup[0], tup[1], tup[2] captured
let c = || { let (_, b, c) = tup; }; // tup[1], tup[2] captured
let c = || { let (_, _, _) = tup; }; // nothing is captured
Here, when we start building MIR we expect tup to be present, and only once we analyse the patterns might we realise that we won't be reading some of the places available in tup (e.g. tup[0] on line 2).
The example on line 1 is particularly interesting because we do capture tup entirely, just not in the way the MIR build expects.
This is similar to (1): the MIR build expects the Place mentioned within the closure to be completely captured.
struct X {
a : String,
b : String,
}
impl X {
fn from_other_X(o: Self) -> Self {
let c = || {
X { a: "".into(), ..o } // Only `o.b` is captured
};
c()
}
}
Here we expect o to be completely available even though only o.b is used.
Note: We have a workaround for such issues that allows us to compile code at the cost of losing some precision. Check this commit.
The workaround essentially captures the place that is mentioned within the closure, and the precision of the capture stops at the precision of the place mentioned in the pattern or in the struct update syntax.
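Both forms can be exercised in ordinary code. The sketch below only shows the two shapes that trigger the workaround (a pattern with wildcards and struct update syntax); it does not observe what the closure captures, and the function names replace_a and join_last_two are hypothetical.

```rust
struct X {
    a: String,
    b: String,
}

/// Struct update syntax inside a closure: only `o.b` is actually read.
fn replace_a(o: X) -> X {
    let c = || X { a: String::new(), ..o };
    c()
}

/// Wildcard pattern inside a closure: the first element is never read.
fn join_last_two(tup: (&'static str, &'static str, &'static str)) -> String {
    let f = || {
        let (_, b, c) = tup;
        format!("{}{}", b, c)
    };
    f()
}
```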
Related issues: #32
Consider this example:
use std::panic;
fn assert_panics<F>(f: F) where F: FnOnce() {
let w = panic::AssertUnwindSafe(f);
let result = panic::catch_unwind(move || {
w.0()
});
if let Ok(..) = result {
panic!("diverging function returned");
}
}
Wrapping f in AssertUnwindSafe ensures that f can be used with panic::catch_unwind. However, using w.0 in the closure makes us capture f itself, which can't be used with panic::catch_unwind, essentially rendering the wrapper redundant.
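One way to keep the wrapper effective is to mention the wrapper as a whole inside the closure, so that the wrapper, not just its field, is what gets captured. The sketch below assumes that rebinding the variable (let w = w;) is enough to force this; the function name call_panics is hypothetical, and it returns a bool instead of panicking so the result is easy to check.

```rust
use std::panic;

/// Returns true if calling `f` panics.
fn call_panics<F>(f: F) -> bool
where
    F: FnOnce(),
{
    let w = panic::AssertUnwindSafe(f);
    let result = panic::catch_unwind(move || {
        // Rebinding mentions `w` as a whole, so the closure owns the
        // `AssertUnwindSafe` wrapper rather than only the inner `w.0`.
        let w = w;
        (w.0)()
    });
    result.is_err()
}
```

For example, call_panics(|| panic!("boom")) returns true and call_panics(|| {}) returns false.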
Related issues: #26
capture_disjoint_fields
After applying the restrictions and the workaround discussed above, perf tests were done here. Related PR #79696.
The PR enables capture_disjoint_fields by default.
All the increases in instruction counts are in the incremental compile cases. I speculate this is because we added closure_min_captures to TypeckResults without removing closure_captures and upvar_capture_map, i.e. strictly increasing the work that needs to be done for stable hashing TypeckResults.
We see that removing trailing derefs (less information written to TypeckResults) results in minor savings in the incr-full cases, which somewhat supports my earlier speculation.
That aside, removing trailing derefs doesn't make any significant performance difference.
In some cases, closures will capture more precise places than they did before. This affects the runtime semantics because the destructors can run at different times. In the older version, the destructor for the base variable x would run when the closure was dropped (because the closure owned x). In the newer version, if the closure only captures x.y, then the destructor for the field y will run when the closure is dropped, but the destructor for the remainder of x will run when x is dropped in the creator's scope.
Workaround: introduce let x = x; into the closure.
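The effect of the workaround on drop order can be made visible with a type that records when it is dropped. This is a sketch under stated assumptions: the Noisy type, the Log alias and the function name are hypothetical helpers for illustration. With let x = x; inside the closure, the closure owns all of x, so both fields are dropped when the closure is dropped, before the enclosing block exits.

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// Shared event log (hypothetical helper).
type Log = Rc<RefCell<Vec<&'static str>>>;

/// Records its name in the log when dropped.
struct Noisy {
    name: &'static str,
    log: Log,
}

impl Drop for Noisy {
    fn drop(&mut self) {
        self.log.borrow_mut().push(self.name);
    }
}

fn drop_order_with_workaround() -> Vec<&'static str> {
    let log: Log = Rc::new(RefCell::new(Vec::new()));
    {
        let x = (
            Noisy { name: "x.0", log: Rc::clone(&log) },
            Noisy { name: "x.1", log: Rc::clone(&log) },
        );
        let c = || {
            // Workaround: mention `x` as a whole so all of it is captured.
            let x = x;
            drop(x.0);
        };
        // The closure is never called; dropping it drops everything it owns.
        drop(c);
        log.borrow_mut().push("closure dropped");
    }
    log.borrow_mut().push("block exited");
    let events = log.borrow().clone();
    events
}
```

The recorded order is x.0, x.1, closure dropped, block exited. Without the rebinding, a closure that captures only x.0 would leave x.1 to be dropped when the block exits, as described above.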
Observation:
- #[rustc_insignificant_dtor] or something like that
- needs_drop: if any part of this type may define a destructor, start with this
- let x = x; inserted into the body
- {let x = x; at the front
- } at the end
- let x = x; at the start of the block
- move[x, y] || (this would be an easier transition for us to make and it would be cleaner for users)
- { let x = &22; move || x }
- { let x = x.clone(); move || x } where x: Rc<u32>
- needs_drop and with no suggestions
struct Foo { }
impl Drop for Foo { fn drop(&mut self) {} }
fn main() {
let x = (String::new(), String::new());
let c = || drop(x.0); // <-- changes semantics, but who cares
// change in semantics is:
// * `x.1` is dropped when the block exits, not when `c` is dropped
// option:
let x = (Foo { }, String::new());
let c = || drop(x.0); // <-- changes semantics, but who cares
// change in semantics is:
// * `x.1` is dropped when the block exits, not when `c` is dropped
let x = (String::new(), Foo { });
let c = || drop(x.0); // <-- changes semantics, we maybe care
// change in semantics is:
// * `x.1` is dropped when the block exits, not when `c` is dropped,
// and it has a significant destructor
}
let mut p = Point { x: 10, y: 10 };
let c = || {
p.x += 10;
// ^^^ mutable borrow of `p` happens here
println!("{:?}", p);
// ^ `p` is captured since it's used here.
};
If the CaptureKind span and the CapturePath span are the same, we only emit one note, similar to how we do it today.
let mut p = Point { x: 10, y: 10 };
let c = || {
p = Point { ... } ;
// ^ mutable borrow of `p` happens here
};
[TBD]
Annotating a closure with rustc_capture_analysis will print out the results of the capture analysis to stderr.
#![feature(rustc_attrs)]
fn main() {
let foo = [1, 2, 3];
let c = #[rustc_capture_analysis] || match foo { _ => () };
}
Once we fix the known issues with precise captures we can coerce functions that mention upvars but never use them.
Related issues: #23