# temporary lifetimes draft RFC
:::danger
This draft is deprecated. See [v2 draft](https://hackmd.io/2x6WhpuiTvqDq2MUpRLCeg)
:::
# Contents
- Feature Name: `new_temp_lifetime`
- Start Date: (fill me in with today's date, YYYY-MM-DD)
- RFC PR: [rust-lang/rfcs#0000](https://github.com/rust-lang/rfcs/pull/0000)
- Rust Issue: [rust-lang/rust#0000](https://github.com/rust-lang/rust/issues/0000)
- [Hackmd with meeting notes](https://hackmd.io/-l7-r7GiSWu2HJrPOI74BA?edit)
# Summary
[summary]: #summary
Motivation:
* Temporary rules today are a common source of bugs
* No way to write macros that introduce temporaries that work in all positions
Mechanism:
* In Rust 2024, adjust temporary rules to remove known footguns
* Introduce `super let` to allow users to explicitly create values, references to which will "escape" the block
* Find a way to ensure the resulting scope, especially from macros, is still easily understood.
# Motivation
[motivation]: #motivation
Why are we doing this? What use cases does it support? What is the expected outcome?
## Rust 2021 footguns
* Accidental mutex deadlock is the most prominent example. See [Listing 4](#Listing-4).
## Rust 2021 counterintuitive results
* in Rust 2021, the [tail expression of a block is considered special](#Listing-7), so temporaries in that position outlive the block itself
* intention was that `{E}` and `E` are equivalent
* but this gives rise to bad errors
* Tail-expr-drops-early example
## Rust 2021 annoying cases
You can do this
```rust=
let y;
let x = if ... { y = value(); &y } else { ... };
```
but that is surprising, and people rarely think of it. (The escape-from-if example shows how it works with `super let`.)
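As a minimal, self-contained sketch of the deferred-initialization pattern above: the names `value`, `pick` and `fallback` are hypothetical and only serve the illustration. Declaring `y` before `x` is what lets the borrow escape the `if` block.

```rust
/// Hypothetical stand-in for the `value()` call in the snippet above.
fn value() -> String {
    String::from("computed")
}

/// Returns the length of either a freshly computed string or a fallback.
fn pick(flag: bool, fallback: &str) -> usize {
    // `y` is declared (but not initialized) before `x`, so it outlives `x`.
    let y;
    let x: &str = if flag {
        // `y` is only initialized on this branch; the borrow escapes the block.
        y = value();
        y.as_str()
    } else {
        fallback
    };
    x.len()
}
```

Calling `pick(true, "no")` goes through the deferred initialization, while `pick(false, "no")` never initializes `y` at all.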
## Rust 2021 macro limitations
### `pin!` macro
In Rust 2021, `pin!($value)` simply expands to `pin::Pin::<&mut _> { pointer: &mut { $value } }`. However, the subtle treatment of temporary lifetime extension in Rust 2021 makes the behavior rather unpredictable.
As explained in the [`pin!` documentation](https://github.com/rust-lang/rust/blob/48d81005434718922c3daaa79c952a1d4ecf7224/library/core/src/pin.rs#L1184-L1233), `pin!` would naturally expand to a call to `Pin::new_unchecked`. However, this alternative definition fails a simple use case like the following.
```rust=
let mut schaden = pin!(async { "Äh?" });
```
This expands to
```rust=
let mut schaden = Pin::new_unchecked(&mut async { "Äh?" });
```
Under Rust 2021, this fails the borrow check because the future `async { "Äh?" }` is dropped at the end of the statement, at the semicolon. The macro was intended to expand to calls to publicly defined associated functions on `Pin`, in the way that `let x = format_args!(...);` does, but this is prevented by the temporary lifetime rules of the current edition.
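For reference, here is a small sketch showing that the use case itself works with the standard library's `pin!` macro as it exists today. The `NoopWake` type and the `poll_once` function are hypothetical helpers, introduced only so that the pinned future can be polled without an async runtime.

```rust
use std::future::Future;
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Hypothetical helper: a waker that does nothing when woken.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Pins a future on the stack with `pin!` and polls it once.
fn poll_once() -> Poll<&'static str> {
    // The future lives as long as `schaden`, which is what `pin!` must guarantee.
    let mut schaden = pin!(async { "Äh?" });
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let result = schaden.as_mut().poll(&mut cx);
    result
}
```

The async block has no await points, so a single poll is enough to complete it.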
# Guide-level explanation
[Guide-level-explanation]: #guide-level-explanation
In this section, we would like to develop an intuition of temporary lifetimes by showing examples in Rust Edition 2021 and in this proposal. To be completely clear about which rules the examples follow, code snippets begin with `// edition: 2021` to signify that Rust Edition 2021 is used for the code, while `#![feature(new_temp_lifetime)]` signifies that this proposal is being followed.
## Highlights of proposed changes
We start our RFC guide with highlights of the changes that this RFC will bring to Rust Edition 2024, compared to Edition 2021. However, this assumes familiarity with the lifetime rules around temporary values. We highly recommend reading the [Appendix](#Appendix) for better context.
### Shorter lifetime assigned to temporary values in `if let`s and `match`es
As a direct answer to the pitfalls mentioned earlier, we will make temporaries in `if let` initializers and `match` scrutinee expressions expire sooner, before evaluation descends into the `if let` body block or the `match` arm bodies. The exception is where it is "syntactically clear", through the use of a borrow `&` or a dereference `*`, that a particular expression needs to survive into the `if let` body or the `match` arms.
Let us look at a positive case first.
###### Before
```rust=
// edition: 2021
if let Some(code) = some_lock.lock().clone_code() {
compute(code); // This compiles but lock is still held
}
```
###### After
```rust=
#![feature(new_temp_lifetime)]
if let Some(code) = some_lock.lock().clone_code() {
// ^~~~~~~~~~~~~~~~ ^
// This value is dropped... here
compute(code); // This compiles, and the lock has already been released.
}
```
Now let us look at a negative case, where the accidental lock acquisition is "detected" and reported.
###### Before
```rust=
// edition: 2021
if let Some(value) = some_lock.lock().borrow_value() {
compute(*value); // This compiles but lock is still held
some_lock.lock().do_something(); // ... so this enters deadlock.
}
```
###### After
```rust=
#![feature(new_temp_lifetime)]
if let Some(value) = some_lock.lock().borrow_value() {
// ^~~~~~~~~~~~~~~~ ^
// This value is dropped... here
compute(*value); // This does not compile because the lock guard has been
// dropped but the borrow `value` needs to outlive
// the guard.
some_lock.lock().do_something(); // This just works.
}
```
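The Rust 2021 behaviour described above can be observed without risking a deadlock by probing the mutex with `try_lock`. The sketch below is only an illustration: the function names and the `Mutex<Option<u32>>` payload are assumptions, and `clone()` stands in for the `clone_code()` call of the earlier snippet.

```rust
use std::sync::Mutex;

/// Reports whether the lock is still held inside the `if let` body.
fn held_inside_if_let(shared: &Mutex<Option<u32>>) -> bool {
    if let Some(_code) = shared.lock().unwrap().clone() {
        // The guard temporary from the initializer is still alive here,
        // so a second attempt to lock on the same thread cannot succeed.
        let held = shared.try_lock().is_err();
        held
    } else {
        false
    }
}

/// Same logic, but the guard is dropped at the end of the `let` statement.
fn released_after_let(shared: &Mutex<Option<u32>>) -> bool {
    let code = shared.lock().unwrap().clone();
    if let Some(_code) = code {
        // The guard was dropped at the `;` above, so the lock is free.
        let free = shared.try_lock().is_ok();
        free
    } else {
        false
    }
}
```

The second function is the manual workaround that Rust 2021 users have to write to get the behaviour this proposal would make the default.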
### More consistent temporary lifetime in function return position.
It comes as a surprise that temporaries in the tail expression of a function body are dropped later than all the local variable bindings in that body.
###### Before
```rust=
// edition: 2021
fn main() {
let mutex = std::sync::Mutex::new(());
*mutex.lock().unwrap() // `mutex` does not live long enough
}
```
###### After
Under the proposed rules, however, temporaries at the function and block result location have a definite lifetime such that they expire before the local declarations are dropped.
```rust=
#![feature(new_temp_lifetime)]
fn main() {
let mutex = std::sync::Mutex::new(());
*mutex.lock().unwrap() // It just works
}
```
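Until such rules apply, the usual workaround is to move the tail expression into a `let` statement, so that the guard temporary is dropped at the semicolon, before `mutex` itself. A minimal sketch, with the hypothetical function `read_counter`:

```rust
use std::sync::Mutex;

/// Reads a value out of a local mutex without keeping the guard
/// alive past the local variable it borrows from.
fn read_counter() -> u32 {
    let mutex = Mutex::new(41u32);
    // The guard returned by `lock()` is a temporary of this statement,
    // so it is dropped at the `;`, while `mutex` is still alive.
    let value = *mutex.lock().unwrap() + 1;
    value
}
```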
### Finer and more sensible management of temporary lifetimes
We also recognize the need for better control of temporary lifetimes, especially for macro authors. In response, we propose a new flavour of variable binding, `super let`, which extends the lifetime of bound values to a suitable scope.
###### Before
Under Rust 2021, it is not easy for an expression-like macro, which basically can appear in any part of a main expression, to control the temporary values to last as long as the result of the expanded expression. In the following example, the expanded macro intends to use a variable binding to hold the temporary values, except it is impossible to make one since it has to be external to the macro invocation site.
```rust=
// edition: 2021
{
// The intention of a macro may have been so...
let _tmp; // ... but this `_tmp` declaration is impossible.
// The following is the expansion of `some_macro!`.
let x = 'macro_in_an_expression_location {
_tmp = macro_expanded_expression1();
&_tmp.compute().then_borrowed()
};
}
```
###### After
Under our proposed `super let`, this is now possible.
```rust=
#![feature(new_temp_lifetime)]
{
// The intention of a macro may have been so...
// The following is the expansion of `some_macro!`.
let x = 'macro_in_an_expression_location {
super let tmp = macro_expanded_expression1();
// ^~~ so this temporary value
&tmp.compute().then_borrowed()
};
// ... lives as long as `x`
}
```
## Principles of this RFC
Under Rust 2021 rules, we have a robust framework for managing temporaries. The issue with the Rust 2021 rules is that they are too simple to handle more complex cases in which the intention to extend a lifetime is made clear, and that they are also inappropriately applied in cases of `if` and `match` expressions or statements.
It would be helpful to identify the program points at which temporaries should be kept for deriving further results rather than expiring, and vice versa. Conceptually, a temporary should be kept if its sub-expression will eventually be stored in a local variable binding, in a pattern-matching binding, or indirectly via borrows, as determined purely by the expression syntax. Rust 2021 handles a few limited cases like the one in [Listing 3](#Listing-3): the mutable borrow `&mut` on Line 7 is going to be stored in a local variable `b`, and for that borrow to be valid, the subexpression behind the borrow must live at least as long as `b`. Here, we trust that the intention is clear and that the convenience is desirable.
Furthermore, we will extend the identification of sub-expressions for extension deeper into the syntax tree. Take [Listing 6](#Listing-6) as our example which is reproduced here.
```rust=
#![feature(new_temp_lifetime)]
fn conditionally(flag: bool) {
let x = if flag {
&mut construct_zero()
// ^~~~~~~~~~~~~~~~
} else {
&mut construct_one()
// ^~~~~~~~~~~~~~~
};
*x = construct(2);
}
```
Since the `if` expression between Line 3 and 9 will be stored into the variable `x`, so eventually will the results of the two blocks of the `if` on Line 3-6 and 6-9. Since `x` is alive beyond the semicolon on Line 9, so should be `construct_zero()` and `construct_one()`, which allows the borrows to *escape* the blocks and the temporaries to live as long as `x` for the sake of convenience.
The `if` expression is not the only construct to which the desirable extension rules should apply. Here are some similar constructs that also convey signals for extension, which we should recognise and apply extension for.
```rust=
#![feature(new_temp_lifetime)]
let x = {
do_something();
&mut 0 //~ this result is eventually stored in `x`
};
let x = match f() {
Some(_) => {
&mut 0 //~ this result is eventually stored in `x`
}
_ => {
&mut 1 //~ here too
}
};
let x = {
// nesting ...
do_something();
{
// blocks ...
do_something();
{
// works!
do_something();
&mut 0
}
}
};
```
On the contrary, if a sub-expression is at a position that is typically not involved in variable bindings, or is more involved in control flow than in variable declaration, there is rarely a need to hold onto the temporaries arising from evaluating that sub-expression. In fact, examples like [Listing 4](#Listing-4), [Listing 5](#Listing-5) and [Listing 7](#Listing-7) demonstrate the hazards and quirks that appear when temporaries stay alive unnecessarily long. Listings 4 and 5 involve the control-flow constructs `if` and `match` with pattern matching.
```rust=
#![feature(new_temp_lifetime)]
if let Some(x) = f().g().h() {
// ^ ^~~~~~~~~~~
// x is used
// but intermediate results f() and f().g()
// may or may not escape into x,
// so we should drop them before entering this block
}
match f().g().h() {
Some(x) => {
// similarly, f() and f().g() may or may not escape
// into x, so we should just drop them
}
}
```
Although there are variable bindings, only the value being pattern-matched against is relevant, and temporaries at the initializer positions in these cases are not *intended* to escape into the consequent blocks or match arms. Meanwhile, consequent blocks and match arms in most cases perform more computation. Keeping the temporaries from the pattern matching alive inside these code blocks is not particularly economical in terms of resource use, and is even hazardous when locks are involved.
In doing so, we have to recognise that there is a legitimate need for extension despite favouring shorter lifetimes. Let us examine the following Rust 2021 code snippet.
```rust=
// edition: 2021
{
match compute().as_ref() {
x => {
compute_with(x);
}
}
}
```
If this code is interpreted under the new rules in this RFC, however, it fails borrow checking: `compute()` is not bound to a local variable, so it expires while the borrow `compute().as_ref()` is still being pattern-matched. Lifting `compute()` into a variable allows the value to live beyond the `match` statement. The minimal example that mirrors the Rust 2021 semantics is the following.
```rust=
#![feature(new_temp_lifetime)]
{
let scrutinee = compute();
match scrutinee.as_ref() {
x => {
compute_with(x);
}
}
drop(scrutinee);
}
```
We need to take into account that there can be a need to preserve the Rust 2021 temporary lifetime assignment. We must remark that it is only by coincidence that the same semantics can be recovered in this case. To keep the code compact without manual drops, and to allow the classical semantics to be recovered in a limited sense, we propose a new kind of variable binding, `super let`.
```rust=
#![feature(new_temp_lifetime)]
{
match { super let scrutinee = compute(); scrutinee.as_ref() } {
x => {
compute_with(x);
}
}
// The storage of scrutinee expires here.
}
```
The `compute()` value does not expire at the end of the block immediately surrounding the `super let` declaration on Line 3; instead, it expires at the end of the `match` statement on Line 7. The lifetime assigned to `scrutinee` in this set-up extends to the point where the parent `match` expression is used as a result. Since the `match` here is just a statement, its result is discarded at the end of it, and so is `scrutinee`.
Along the same line of thought, `super let` in the following example extends the lifetime to the outermost block.
```rust=
#![feature(new_temp_lifetime)]
{
let x = { super let y = compute(); &y };
// this is just equivalent to ...
let x = &compute();
}
```
The `super let y = compute();` on Line 3 picks up that it is inside an expression that will get stored into a local variable, so the scope of the value `compute()` should be extended to the block containing `x` in order to outlive the borrow `&y`.
In essence, the introduction of `super let` should allow us to re-write a Rust 2021 expression `E` into another expression `{ E' }` surrounded by a block, which is completely equivalent in semantics under the new temporary lifetime rules, by surrounding some sub-expressions of `E` with `super let`s to obtain `E'`, wherever necessary.
## The proposed rules under cases
When we consider whether an expression or one of its subexpressions should have its lifetime extended, there are only four cases to analyse.
1. The expression is bound to a local variable.
2. The expression is being pattern-matched against as a predicate in control-flow constructs like `if` and `match`.
3. The expression is bound to a `super let` variable.
4. The expression is at any other location of a Rust program.
We will now look at these cases.
### Local variable binding
The case for local variable binding is typically like the following.
```rust=
{
let x = compute();
}
```
A variable binding is always in a block, in which the variable is alive throughout. The result of the initializer `compute()` will be stored into a place for `x`.
```rust=
#![feature(new_temp_lifetime)]
{
let x = &compute();
}
```
Putting a borrow in front of `compute()` makes the borrow escape the usual lifetime of `compute()`. To "fix it", we look up the scope of the storage and set the temporary lifetime of `compute()` to that. The scope of the storage `x` is the enclosing block around it. Therefore, we extend the lifetime to that block. Now the temporary outlives the borrow `x`.
This will get more interesting. Suppose we construct a structure like so.
```rust=
#![feature(new_temp_lifetime)]
{
let x = MyStruct {
field: &compute(),
};
}
```
Again, we can recognise that the structure is going to be stored in `x` and `field` is a part of the result to be stored. Following the same intuition, we determine that `&compute()` forms a part of the result for storage and `compute()` qualifies for extension into the enclosing block.
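This particular extension, through a borrow in a struct field of a `let` initializer, already exists in today's Rust, so its effect can be observed by logging drops. In the sketch below, the `Noisy` type, its `log` field and the function `drop_order` are hypothetical helpers made up for the illustration.

```rust
use std::cell::RefCell;

/// Hypothetical helper: records its name in `log` when dropped.
struct Noisy<'a> {
    name: &'static str,
    log: &'a RefCell<Vec<&'static str>>,
}

impl Drop for Noisy<'_> {
    fn drop(&mut self) {
        self.log.borrow_mut().push(self.name);
    }
}

struct MyStruct<'a, 'b> {
    field: &'a Noisy<'b>,
}

/// Returns the order in which events and drops were observed.
fn drop_order() -> Vec<&'static str> {
    let log = RefCell::new(Vec::new());
    {
        // Not extended: this temporary is dropped at the end of the statement.
        let len = Noisy { name: "plain temporary", log: &log }.name.len();
        log.borrow_mut().push("after first let");
        // Extended: the borrowed temporary lives until the end of the block.
        let x = MyStruct { field: &Noisy { name: "extended temporary", log: &log } };
        log.borrow_mut().push("after second let");
        assert!(len > 0 && !x.field.name.is_empty());
    }
    log.into_inner()
}
```

The extended temporary is logged last, after both `let` statements have completed, while the plain temporary is logged before the first marker.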
Leveling up the game, let us analyse this case.
```rust=
#![feature(new_temp_lifetime)]
{
let x = MyStruct {
field: match flag() {
Some(0) => &zero(),
Some(1) => &one(),
_ => {
let s = &log_string();
report(s);
&none()
}
}
};
}
```
The intuition is to find out the subexpressions in the initializer that are *simply connected* to the eventual stored results in `x`. Here is a chain of deduction.
1. `MyStruct { .. }` eventually gets stored in `x`, whose scope is the block on Line 2-14.
2. `match flag() { .. }` eventually gets stored in `MyStruct { .. }`.
3. `&zero()`, `&one()` and `&none()` eventually get stored in `MyStruct { .. }` since they are at the result position of the `match` expression.
4. `zero()`, `one()` and `none()` eventually form the results, so they need to be extended to the scope on Line 2-14.
Here is a separate case for the variable binding `s`.
1. `s` is not at a result producing location in the block on Line 7-11, so its scope is just that inner enclosing block.
1. `&log_string()` eventually gets stored in `s`, whose scope is the block on Line 7-11.
1. `log_string()` eventually forms the result, so it needs to be extended to the scope on Line 7-11.
A good way to pick out the subexpressions that participate in result construction, under the criterion set by this RFC, is to check whether the expression is connected to the main initializer via only structure or tuple construction, borrows or dereferences, and block tails.
### `if let` and `match`
Let's first start with `if let`.
```rust=
if let Some(x) = &shared.lock().field {
// shared.lock() is dropped here
}
```
Following our discussion, `&shared.lock().field` should see the temporaries dropped before the pattern match against `Some(x)` takes place.
```rust=
if let Some(x) = {
&(&&&shared.lock().triple_ref()).field
} {
// shared.lock(), &&&shared.lock().triple_ref() are dropped here
}
```
This example involves three consecutive borrow operations, each creating a temporary. However, we shall stick to the decision to drop the temporaries right before the pattern matching, even though `triple_ref()` is considered result-producing in order for the triple borrows to be valid. In this case, the lifetime of the storage for `triple_ref()` is made no longer than the overall initializer.
Following the same intuition, rewriting `if let` into `match` will not change the scoping of `shared.lock()`.
```rust=
#![feature(new_temp_lifetime)]
// error: shared.lock() does not live long enough
// but this is desirable under this proposal
match &shared.lock().field {
Some(x) => {}
None => {}
}
```
### The `super let` variable declaration
Let us start with composing `super let` with `let` in this example.
```rust=
#![feature(new_temp_lifetime)]
{
let x = {
super let mut y = compute();
&mut y
};
*x += 1;
}
```
We find that `y` is declared in a block that is participating in the result for storage into `x`. Unlike a classical `let`, `y` will be considered a participant in result construction, so `y` is assigned a lifetime as long as that of `x`, and so is its initializer `compute()`.
Suppose we have the following `match` statement.
```rust=
#![feature(new_temp_lifetime)]
{
match {
super let x = compute();
x.some_field()
}
{
Some(y) => *y += 1,
_ => {}
}
// The storage for `x` expires after the match and before `do_something()`.
do_something();
}
```
Normally, according to the previous rule for expressions under pattern matching, all temporaries and variables would expire at the end of the block on Line 10. Via `super let`, `x` is assigned a lifetime that lasts until the result of the `match` is used. This time, the `match` is not participating in constructing a result, so that lifetime ends at the end of this `match` statement.
### The rest of the cases
Expressions in this category are certainly a part of a statement in a block or a function body. Their temporaries are dropped at the end of that statement.
```rust=
#![feature(new_temp_lifetime)]
do_something_with(shared.lock().field);
```
As in Rust 2021, `shared.lock()` will be dropped at the end of the statement, at the semicolon.
When expressions in the other categories are encountered, the choice of a scope for storing results or a scope for intermediates may be overridden, but the temporaries will not out-live the outermost statement.
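Since this rule matches today's behaviour, it can be checked directly. The following sketch assumes a hypothetical `Shared` struct and a hypothetical `do_something_with` function, and uses `try_lock` to confirm that the guard is gone once the statement has ended.

```rust
use std::sync::Mutex;

/// Hypothetical payload protected by the mutex.
struct Shared {
    field: u32,
}

/// Hypothetical consumer of the field value.
fn do_something_with(value: u32) -> u32 {
    value * 2
}

/// Returns the computed value and whether the lock is free afterwards.
fn guard_released_after_statement(shared: &Mutex<Shared>) -> (u32, bool) {
    // The guard is an intermediate temporary of this statement only.
    let doubled = do_something_with(shared.lock().unwrap().field);
    // By now the guard has been dropped, so the lock can be taken again.
    let free = shared.try_lock().is_ok();
    (doubled, free)
}
```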
```rust=
#![feature(new_temp_lifetime)]
do_something_with({
super let guard = shared.lock();
&guard.field // no extension on `guard` effectively
});
```
# Reference-level explanation
[Reference-level-explanation]: #reference-level-explanation
In this RFC, we create a model for the lifetimes of temporary values that arise from evaluating Rust expressions and statements, referred to as *temporary lifetimes* henceforth. We would like to build a clear mental framework to reason about temporary lifetimes.
## Tenets
The RFC should adopt these principles in its approach to designing the rules.
- **When in doubt, shorter scopes are more reliable.**
- We have witnessed a few examples that could have benefited from shorter scopes. In those deadlocking examples, assigning a longer lifetime leads to issues that are hard to detect at run-time. With a shorter lifetime, however, such traps surface as borrow check errors.
- **Larger scopes are more convenient.**
- Larger scopes are in general more convenient and give fewer borrow checker errors. Therefore, we should try to assign a longer lifetime when it is clear that the user intended it. Examples like `let x = &compute();` show a pattern in which temporaries need to outlive the binding in order for the binding to make sense. In other cases, however, we avoid extension if it would rely on information other than syntactic features, or if any fine-grained lifetime analysis would be required.
- **Scopes should be predictable and stable.**
- Assigning scopes should not depend on type-checking or fine-grained analysis of expressions. The proposed rules could be complex, but they avoid fine-grained analysis and, instead, work largely by analysing the syntactic structure of expressions. For instance, we have a set of rules for `$expr` in `let $pat = $expr;`, but the rules do not depend on fine-grained details of the `$pat` pattern.
- **It should always be possible to rewrite any Rust 2021 expression `E` with an enclosing block `{ ... }` so that it behaves the same under our proposed rules.**
- To achieve this, we need to introduce some way inside a block to create values with the same scope as temporaries would have in `E` under Rust Edition 2021. Our proposed `super let` achieves this. As an example, `{super let value = temp(); &value}` works in exactly the same way as `&temp()` under Rust 2021.
## Glossary and background
A temporary lifetime is assigned to each Rust expression and its subexpressions. Let's take the following expression as an example.
```rust=
function_call(match input {
Some(arg) => arg,
None => &make_default_arg(),
}).field_projection
```
For clarity and easy reference, we will break this expression down and give each subexpression a name.
```rust=
'call: {
'callee1: { function_call }
(
'call_argument: {
match 'scrutinee: { input }
{
Some(arg) => 'arm1: { arg },
None => 'arm2: {
'borrow: {
&
'referee: {
'callee2: { make_default_arg }()
}
}
}
}
}
}
)
}.field_projection
```
While this expression is evaluated according to a pre-defined order, a corresponding scope is pushed onto a stack for each subexpression encountered. We call this stack the scope stack. When a subexpression yields its result and the result is stored in a temporary MIR place, the scope of that subexpression is popped off the scope stack. At this point, the popped scope serves as a book-keeping device that records which temporary MIR places need to be freed and which destructors need to be called.
In most cases, the intuitive behaviour would be that popping a scope cleans up the temporary place of the corresponding subexpression. For instance, when the example expression at the beginning of this section is evaluated, the result of the `'referee` expression would be dropped as soon as it yields. However, this introduces a problem: `'borrow` would attempt to borrow the `'referee` result, and this fails the borrow check.
Therefore, the answer to this problem, as of Rust Edition 2021, is to introduce a system of terminating scopes and to delay the clean-up until a stop-gap `Destructor` scope. Such terminating scopes usually surround blocks; statements; function, arm and loop bodies; atoms of boolean expressions; and others. Using the same example above, `'callee1` and `'call_argument` will out-live `'call` and even the field projection, reaching up the expression tree to the lowest terminating scope. The exception is `'referee`, which will out-live `'borrow` but still drop at the end of `'arm2`, which matches what a typical Rust user targeting Edition 2021 has expected thus far.
However, some intricacies and "foot-guns" have become known in recent years. As of Rust 2021, there are implicit rules for intermediate values, including those involved in local variable initializers. The first example that frequently surprises us is the use of values that have side-effects upon destruction. A prime example is the `MutexGuard`.
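The book-keeping described above can be pictured with a toy model. The sketch below is not how the compiler implements it; the `Scope` and `ScopeStack` types and the function `simulate_arm2` are hypothetical, and temporaries are represented by plain strings. It only illustrates the idea that a temporary is registered with the nearest enclosing terminating scope and cleaned up when that scope is popped.

```rust
/// Toy model of one entry on the scope stack.
struct Scope {
    name: &'static str,
    terminating: bool,
    /// Temporaries whose clean-up is delayed until this scope is popped.
    pending: Vec<&'static str>,
}

#[derive(Default)]
struct ScopeStack {
    scopes: Vec<Scope>,
    /// Order in which temporaries were cleaned up.
    dropped: Vec<&'static str>,
}

impl ScopeStack {
    fn push(&mut self, name: &'static str, terminating: bool) {
        self.scopes.push(Scope { name, terminating, pending: Vec::new() });
    }

    /// Registers a temporary with the innermost terminating scope.
    fn add_temporary(&mut self, temp: &'static str) {
        let owner = self
            .scopes
            .iter_mut()
            .rev()
            .find(|scope| scope.terminating)
            .expect("a terminating scope must enclose every temporary");
        owner.pending.push(temp);
    }

    /// Pops the innermost scope and cleans up what it owns, newest first.
    fn pop(&mut self) -> &'static str {
        let scope = self.scopes.pop().expect("scope stack underflow");
        self.dropped.extend(scope.pending.into_iter().rev());
        scope.name
    }
}

/// Replays the `'arm2` part of the example; returns the clean-ups seen
/// before and after `'arm2` itself is popped.
fn simulate_arm2() -> (Vec<&'static str>, Vec<&'static str>) {
    let mut stack = ScopeStack::default();
    stack.push("'arm2", true);
    stack.push("'borrow", false);
    stack.push("'referee", false);
    stack.add_temporary("make_default_arg()");
    stack.pop(); // 'referee yields; nothing is cleaned up yet
    stack.pop(); // 'borrow yields; the referee is still alive
    let before = stack.dropped.clone();
    stack.pop(); // 'arm2 terminates; the temporary is cleaned up
    (before, stack.dropped)
}
```

In this model the result of `'referee` survives `'borrow` and is only cleaned up when `'arm2` ends, which is the Rust 2021 outcome described above.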
```rust=
fn mutex_example(mutex: &Mutex<A>) {
if let Some(value) = &mutex.lock().unwrap().status {
*mutex.lock().unwrap().enqueue_work();
}
}
```
Under Rust 2021, this will lead to a deadlock. `mutex.lock().unwrap()` is a lock guard that only gets released at the end of the `if let` statement.
To address these issues, we would like to propose a new framework for assigning temporary lifetimes, in order to strike a balance between reliability, safety and convenience. We will introduce the concept of enclosing blocks of expressions or statements; expression categories; result and intermediate scopes; and, finally, rules to assign temporary lifetime to expressions.
### Enclosing blocks
Blocks, as in `{ <expr> }`, are the traditional instrument for controlling lifetimes in Rust Edition 2021. We intend to continue to use them for the same purpose. We are particularly interested in the enclosing block of an expression, which we define as the nearest block that contains the expression in question as its subexpression.
*(B)* There are several cases in which blocks will be used, even implicitly.
- *(B1)* The explicit instantiation of a block `{ ... }` by a pair of curly brackets is considered a block.
- *(B2)* Function bodies are blocks, as required syntactically.
- *(B3)* `match` arm bodies, `if` consequents and alternatives are surrounded by an elided block. For instance, the `x + 1` expression is still treated as if it is written as `{ x + 1 }`.
- *(B4)* Statements other than variable declarations via `let`, including those ending with `;` as well as `if` and `match` expressions *other than those at the tail location of a block*, are implicitly surrounded by a block `{ .. }`.
```rust=
match get() {
Some(x) => x + 1,
// ^~~~~ interpreted as { x + 1 }
None => {
...
}
}
```
- Blocks containing at least one statement have a series of nested hidden blocks. After each statement that is not the tail expression, a hidden block is introduced to surround the rest of the main block. For instance, a block in source like the following ...
```rust=
{
a;
b;
c;
d
}
```
... is read as ...
```rust=
'a: {
a; // the enclosing block of `a` is 'a
'b: {
b; // the enclosing block of `b` is 'b
'c: {
c; // the enclosing block of `c` is 'c
d // ! the enclosing block of `d` is 'c
}
}
}
```
**Before proceeding, the terminology in the coming sections is put forth by this RFC, as opposed to the earlier text that describes Rust across all editions so far.**
### Expression categories
We sort all expressions, depending on their location, into three categories.
- *(S)* An expression is a *scrutinee*, when
- *(S1)* it appears in the scrutinee location of a `match` expression, or
- *(S2)* it appears as the condition of an `if` expression;
- *(I)* An expression is *intermediate*, when its result is expected to be consumed soon and preferably cleaned up at the earliest opportunity. All expressions are considered intermediate unless another category applies;
- *(RP)* An expression is *result-producing*, when its result is expected to be persisted for storing in a variable or an abstract data type, or getting borrowed. Specifically,
- *(RP1)* An initializing expression of a local variable declaration statement;
- e.g. `compute()` subexpression in `{ let x = compute(); }` is result-producing
- *(RP2)* The expression behind a borrow or a de-reference;
- e.g. both `*make()` and `make()` in `&*make()` are result-producing
- *(RP3)* All fields of an initializing structure, as well as its structure base;
- e.g. `a`, `&b` and `c()` subexpressions in `Struct { a, x: &b, ..c() }` are result-producing; additionally `b` is also result-producing only due to the previous rule
- *(RP4)* The tail expression of a block, like `x / 4.0` in the following example
```rust=
{
let x = compute_pi();
x / 4.0
}
```
### Resulting and intermediate scopes
We will also assign to each expression its resulting and intermediate scope.
- A resulting scope is the scope or lifetime in which an expression and its subexpressions will preferably store their results.
- An intermediate scope is the scope or lifetime at the end of which the temporary values arising from evaluating this expression should be dropped.
Intuitively, a resulting scope captures the upper bound and is more convenient due to its longer lifetime, while an intermediate scope captures the lower bound. In this proposal, we additionally prefer shorter resulting scopes, so that if a result may or may not be used within one scope, we choose a resulting scope shorter than that scope.
Given the resulting and intermediate scopes of a parent expression, these two kinds of scope can be assigned to each of its subexpressions. Suppose that we are inspecting an expression *`E`* and an immediate subexpression *`S`*, and that we know the result scope *`ResultScope(E)`* and the intermediate scope *`IntermediateScope(E)`* of *`E`*. Depending on the category of *`S`*, we propose the following assignment to the result scope *`ResultScope(S)`* and intermediate scope *`IntermediateScope(S)`* of *`S`*.
1. *`S`* is an intermediate subexpression:
- *`ResultScope(S) = IntermediateScope(S) ≜ IntermediateScope(E)`*
- *`S`* should be cleaned up as soon as it yields result
2. *`S`* is a scrutinee subexpression:
- *`ResultScope(S) ≜ IntermediateScope(E)`*
- The *`S`* result might be used in the match arms or the `if let` consequent, but it *might not* outlive the expression *`E`*. For this reason, we choose to limit it to just *`IntermediateScope(E)`*. It is also necessary since *`S`* will be pattern-matched against and further borrowed or consumed while evaluating other subexpressions of *`E`*.
- *`IntermediateScope(S) ≜ S`*
- Unlike intermediate subexpressions, which drop their temporary values at the same intermediate scope of their parents, intermediate values arising from evaluating *`S`* should drop at the end of *`S`*, or at the point right before pattern matching. The rationale is that `match` arms and `if let` consequents can perform heavy work after the pattern matching; and dropping them early will avoid the subtlety and common temporary lifetime bugs as mentioned in the foot-gun section. For reliability, we opt for a shorter lifetime for them.
3. *`S`* is a result-producing subexpression:
- *`ResultScope(S) ≜ ResultScope(E)`*
- The result of *`S`* will need to live as long as *`E`*.
- *`IntermediateScope(S) ≜ E`* if *`E`* is the enclosing block of *`S`*; or else *`IntermediateScope(S) ≜ IntermediateScope(E)`*
- Specifically, the case when *`E`* is the enclosing block of *`S`* is that *`S`* is a local variable initializer.
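The three assignment rules above can be transcribed almost literally into a small function. This is only a sketch of the rules as stated: the `Category`, `Scope` and `Scopes` types, the string labels for scopes, and the boolean parameter `e_encloses_s` are hypothetical modelling choices, not part of the proposal.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Category {
    Intermediate,
    Scrutinee,
    ResultProducing,
}

/// A scope, named after the expression or block that delimits it.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct Scope(&'static str);

/// The pair (ResultScope, IntermediateScope) of one expression.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct Scopes {
    result: Scope,
    intermediate: Scope,
}

/// Computes the scopes of a subexpression `s` from those of its parent `e`.
/// `e_encloses_s` is true when `e` is the enclosing block of `s`.
fn assign_scopes(
    e: Scope,
    e_scopes: Scopes,
    s: Scope,
    category: Category,
    e_encloses_s: bool,
) -> Scopes {
    match category {
        // Rule 1: both scopes collapse to the parent's intermediate scope.
        Category::Intermediate => Scopes {
            result: e_scopes.intermediate,
            intermediate: e_scopes.intermediate,
        },
        // Rule 2: the result is bounded by the parent's intermediate scope,
        // and intermediates are dropped at the end of `s` itself.
        Category::Scrutinee => Scopes {
            result: e_scopes.intermediate,
            intermediate: s,
        },
        // Rule 3: the result inherits the parent's result scope.
        Category::ResultProducing => Scopes {
            result: e_scopes.result,
            intermediate: if e_encloses_s { e } else { e_scopes.intermediate },
        },
    }
}
```

For example, with a parent whose result scope is a function body and whose intermediate scope is a statement, a scrutinee subexpression gets the statement as its result scope and itself as its intermediate scope.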
### Assigning temporary lifetimes
We can finally spell out the new rules with the preceding glossary. An expression is assigned its resulting scope as its lifetime when it is result-producing. Otherwise, it is assigned its intermediate scope.
Here are some examples of the assignment.
```rust=
use std::sync::{Mutex, Arc};
fn example_a() {
let a = &make_value(); // (a)
println!("{a}")
}
```
In `example_a`, `make_value()` on *(a)* is at a result-producing location. So it is assigned a lifetime equal to the resulting scope of its parent `&make_value()`, which is an initializer expression. This in turn implies that this lifetime is the whole function body, because initializers are result-producing and, in this example, the function body is the enclosing block of this initializer.
```rust=
// edition: 2024
fn example_b(counter: Arc<Mutex<Option<usize>>>) {
match counter.lock().unwrap().take() { // (a)
Some(n) => {
*counter.lock().unwrap() = Some(n + 1);
work();
}
None => {}
}
}
```
`example_b` is a prime example of a deadlocking trap in Rust 2021. Under the new temporary lifetime rules, the temporary `counter.lock().unwrap()` of type `MutexGuard<'_, Option<usize>>` is dropped before the discriminant of the `Option<usize>` returned by `take` is inspected, so locking the mutex again inside the match arm no longer deadlocks.
<details>
<summary>Disclaimer</summary>
This example is purely for demonstration purposes. Do not ever use this as a thread-safe counter, because it does not prevent races between taking the value and storing it back. Use atomics or mutate the counter through the `MutexGuard` directly for this purpose.
</details>
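For comparison, the drop point that the new rules would assign to the guard can be written out by hand today. This is only a sketch of the intended behaviour, not part of the proposal: `bump` is a hypothetical name, and the scrutinee is hoisted into a `let` so that the guard is dropped at the end of that statement, before the `match` starts.

```rust
use std::sync::{Arc, Mutex};

/// Hypothetical hand-written equivalent of `example_b` under the proposed rules.
fn bump(counter: &Arc<Mutex<Option<usize>>>) {
    // The guard is a temporary of this `let` statement and is dropped at the `;`.
    let taken = counter.lock().unwrap().take();
    match taken {
        Some(n) => {
            // No guard is held at this point, so locking again cannot deadlock.
            *counter.lock().unwrap() = Some(n + 1);
        }
        None => {}
    }
}
```

The same caveat as in the disclaimer applies: another thread may observe `None` between the two lock acquisitions.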
```rust=
// edition: 2024
fn example_c(mutex: &Mutex<usize>) -> usize {
*mutex.lock().unwrap()
}
```
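In Rust 2021, the lock guard in the tail expression of `example_c` lives until the end of the function body, and the same tail expression does not compile at all when the mutex is a local variable instead of a parameter (see [Listing 7](#Listing-7)). Under the new rules, the subexpression `mutex.lock().unwrap()` is assigned the intermediate scope, which surrounds the tail expression `*mutex.lock().unwrap()`. This means that the lock guard is dropped right after the dereference.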
### Controlled lifetime extension via `super let`
With the classic `let`, certain Rust programs can no longer be written with the same behaviour under our new proposal. For instance, the following example may no longer compile when it is interpreted under the new rules.
```rust=
// edition: 2021
macro_rules! m {
($v:expr) => { &$v.as_ref() }
}
match m!(temporary_value()) {
...
}
```
The reason is that `temporary_value()` would be dropped before the pattern matching, while the borrow returned by its `as_ref()` is still being matched against. There is no way to rewrite the macro `m!` with the classic `let` to recover the same behaviour.
Our proposal for a remedy is to introduce `super let` variable bindings to annotate an expression, so that the temporary lifetime can be declaratively extended.
```rust=
#![feature(new_temp_lifetime)]
{
super let $pat = $init;
}
```
*(SB)* The semantics is that the initializer `$init` is assigned the lifetime of the **resulting scope of the enclosing block** of the `super let` statement. Intuitively, it prolongs the lifetime of the storage *and* the value to be as long as the storage lifetime of the enclosing block, but not further.
Let us take the `m!` macro as an example, transcribing it with our `super let` as demonstration.
```rust=
#![feature(new_temp_lifetime)]
macro_rules! m {
($v:expr) => {
{
super let tmp = $v;
&tmp.call_and_borrow()
}
}
}
...
fn main() {
match m!(temporary_value()) {
// match arms ...
}
some_other_work();
}
```
This time the `match` expression gets expanded to...
```rust=
match
{ super let tmp = temporary_value(); &tmp.call_and_borrow() }
{
// match arms ...
}
```
Note that there is an implicit block surrounding the `match` by *(B4)*. The resulting scope of this `match` is the statement surrounding the expression. Applying rule *(SB)* for `super let` at this site, `tmp` is therefore alive throughout the entire `match` statement. The lifetime of `temporary_value()` does not extend beyond the `match`, so it is no longer alive when `some_other_work()` is evaluated.
# Drawbacks
[drawbacks]: #drawbacks
The proposed rules are a departure from the rules adopted in Rust 2021. They would require new lints, and migration tooling may have to be developed to help rewrite existing code for the new behaviour.
# Rationale and alternatives
[rationale-and-alternatives]: #rationale-and-alternatives
- Why is this design the best in the space of possible designs?
- What other designs have been considered and what is the rationale for not choosing them?
- What is the impact of not doing this?
- If this is a language proposal, could this be done in a library or macro instead? Does the proposed change make Rust code easier or harder to read, understand, and maintain?
# Prior art
[prior-art]: #prior-art
There has been an earlier attempt with RFC 66. However, a trial implementation showed that it places a great burden on the compiler to resolve types prematurely, and that it introduces inconsistencies where generics are concerned. This experiment therefore motivated the principle of relying solely on the syntactic structure of a Rust program to assign temporary lifetimes, as opposed to relying on types as suggested by RFC 66.
# Unresolved questions
[unresolved-questions]: #unresolved-questions
- We would like to further examine the use case of this proposal in macro development in Rust.
- We would also like to investigate the impact of the new rules on the existing Rust ecosystem and work out the migration guide and tooling needed for a smooth transition when this proposal is implemented in Rust Edition 2024.
# Future possibilities
[future-possibilities]: #future-possibilities
# Appendix
[edition-2021-walkthrough]: #edition-2021-walkthrough
## Rust Edition 2021 Temporary Lifetime Tutorial
###### Listing 1. The basics.
```rust=
// edition: 2021
use std::rc::Rc;
struct A {
x: Option<u8>,
}
impl A {
fn f(&self) -> &Option<u8> {
&self.x
}
}
fn demo(rc: &mut Rc<A>) {
rc.f().clone();
*rc = Rc::new(A { x: None });
}
```
We focus on the first statement in the `demo` body in this example. In order to successfully evaluate the statement, the intermediate results need to be held onto. Namely
1. `rc` as an immutable borrow needs to be held long enough, so that ...
2. `rc.f(): &Option<u8>` can be evaluated, which needs to be held long enough, so that ...
3. `rc.f().clone(): Option<u8>` can be evaluated.
Additionally, there is another location in this expression that creates a *hidden* intermediate result. The method `f` is associated with `impl A`, but the receiver expression `rc` is a reference to an `Rc<A>`. For this call to happen, the `Rc` must be dereferenced in order to obtain an intermediate value `&A` as the receiver of the method call. This is the auto-ref and auto-deref mechanism of the Rust language as of Edition 2021, by which a suitable receiver type is automatically computed to match the method signature.
In total, three intermediate values are generated during the evaluation, all of which are cleaned up at the end of the statement. Rust 2021 picks this point in the program for the clean-up because intermediate values, unless explicitly persisted in a local variable binding, are of no use for evaluating the rest of the program. Moreover, holding onto the immutable borrow of `rc` any longer would cause the later assignment on line 13 to fail the borrow checker. Following this intuition, the end of a statement is a good place to drop all the temporaries.
###### [Listing 2.](https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=30e3deefb2dabc619c35a58a92618cea)
```rust=
// edition: 2021
struct A;
fn a() -> A {
A
}
impl Drop for A {
fn drop(&mut self) {
println!("drop!")
}
}
impl A {
fn call(&mut self) -> bool {
println!("call!");
true
}
}
fn main() {
if a().call() {
println!("in")
}
println!("Hello, world!");
}
```
We will focus on the `if` statement on line 18. Right before testing the boolean condition `a().call()`, one intermediate value `a()` of type `A` is constructed; and by auto-ref another intermediate value of type `&mut A` is constructed, as if it was written as `&mut a()`. Since `a()` is only used for the eventual evaluation of a boolean value, intuitively one should discard it right before proceeding into the consequent block. This is indeed the case in Rust 2021, as its output is the following
```=
call!
drop!
in
Hello, world!
```
It may surprise readers that the following Rust 2021 program compiles.
###### [Listing 3.](https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=29abb1bb19996971299fa4393cefb186)
```rust=
// edition: 2021
use std::sync::{Arc, Mutex};
struct A {
b: u8,
}
fn increment(a: Arc<Mutex<A>>) {
let b = &mut a.lock().unwrap().b;
*b += 1;
}
```
On line 7, `a.lock().unwrap()` creates a few intermediate values: an auto-deref'ed `&Mutex<A>`, a `MutexGuard<'_, A>`, and an auto-deref'ed `&mut A`, before a mutable borrow of the field `b` is stored into the variable binding `b`. One would expect the guard `a.lock().unwrap()` to be dropped at the end of the statement, yet the borrow derived from it is still used on line 8.
The merit of admitting this Rust program is clear, however. Otherwise one would be forced to write this code for the equivalent semantics.
```rust=
// edition: 2021 modulo automatic extension
let mut guard = a.lock().unwrap();
let b = &mut guard.b;
*b += 1;
```
Certainly, enlarging the lifetime of the said temporaries allows a shorter Rust program and provides *convenience* and better developer experience. This is a feature of Rust 2021 that many users may have been unknowingly leveraging. In essence, intermediate values at variable binding initialization will receive a longer lifetime, up to the surrounding block. The precise wording of the rules for extension can be found at the end of this appendix.
By mixing the lifetime extension at initializer and pattern matching constructs such as `if let` and `match`, we would arrive at some unfortunately unintuitive and complex cases.
###### [Listing 4.](https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=66ae5c7f35a159af310164e824489431)
```rust=
// edition: 2021
use std::sync::Mutex;
struct A {
b: Option<u8>,
}
fn increment(a: &Mutex<A>) {
if let Some(b) = &mut a.lock().unwrap().b {
*b += 1;
if *b == u8::MAX {
a.lock().unwrap().b = None;
}
}
}
fn main() {
let a = Mutex::new(A { b: Some(u8::MAX - 1) });
increment(&a);
}
```
This code makes no progress because it deadlocks on line 10, which is not obvious at first glance. Carefully applying the rules to the `if let` on line 7, one can see that a guard is implicitly held across the whole `if let` consequent block, from line 7 to line 12. To make matters worse, this deadlock can only be detected at run time.
Rewriting into `match` in this way does not mitigate the issue either.
###### [Listing 5.](https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=7b0ecdd24e07c9d1157171f9c92218f2)
```rust=
// edition: 2021
fn increment(a: &Mutex<A>) {
match a.lock().unwrap().b {
Some(u8::MAX) => {
a.lock().unwrap().b = None;
}
Some(b) => {
a.lock().unwrap().b = Some(b + 1);
}
_ => {}
}
}
```
This time the guard on line 3 is only dropped at the end of the `match` statement on line 11. So the deadlock still happens.
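The held guard can be observed without actually deadlocking by probing the mutex with `try_lock` from inside an arm. This is a minimal sketch: `guard_held_in_arm` is a hypothetical helper, and `A` mirrors the struct from Listing 4.

```rust
use std::sync::Mutex;

struct A {
    b: Option<u8>,
}

/// Hypothetical probe: reports whether the mutex is still locked inside the arm.
fn guard_held_in_arm(a: &Mutex<A>) -> bool {
    match a.lock().unwrap().b {
        // The scrutinee's guard is still alive here, so `try_lock` fails
        // instead of blocking forever as `lock` would.
        Some(_) => a.try_lock().is_err(),
        None => false,
    }
}
```

Replacing `try_lock` with `lock` in the arm reproduces the deadlock of Listing 5.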
While the intuitive approach to managing temporaries and lifetime extension in Rust 2021 leads to accidental and undesirable consequences, it is also insufficient for other use cases.
###### [Listing 6.](https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=ed1b95dd0f0a65d4434bf181bd47bdc2)
```rust=
// edition: 2021
struct DataStructure(u8);
fn construct(x: u8) -> DataStructure {
DataStructure(x)
}
fn construct_zero() -> DataStructure {
construct(0)
}
fn construct_one() -> DataStructure {
construct(1)
}
fn conditionally(flag: bool) {
let x = if flag {
&mut construct_zero()
} else {
&mut construct_one()
};
*x = construct(2);
}
```
This is a simplified version of a general pattern in which one constructs values conditionally, with temporaries written in anticipation of the extension rules. However, this fails the borrow checker in Rust 2021 because the extension rules do not cover this case. As a work-around, it has to be written as follows.
```rust=
// edition: 2021
fn conditionally(flag: bool) {
let mut y;
let x = if flag {
y = construct_zero();
&mut y
} else {
y = construct_one();
&mut y
};
*x = construct(2);
}
```
There are quirks involving the temporaries at the block tail position and function return location.
###### [Listing 7.](https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=263d3223c283449cb1561756a46de214)
```rust=
// edition: 2021
use std::sync::Mutex;
fn main() {
let x = Mutex::new(());
*x.lock().unwrap()
}
```
On line 5, the guard is granted the lifetime of the whole block in Rust 2021, because it is at the function return location instead of inside a statement. For this reason, the mutex `x` itself is dropped first, failing the borrow checker.
It is also inconsistent in the sense that it still does not compile when the tail expression is surrounded by a block.
```rust=
// edition: 2021
use std::sync::Mutex;
fn main() {
let x = Mutex::new(());
{
*x.lock().unwrap()
}
}
```
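The usual way around this in Rust 2021 is to move the dereference into its own statement, so that the guard is dropped at the end of that statement while `x` is still alive. A minimal sketch, with a hypothetical function name and a `usize` payload instead of `()` so that the result is visible:

```rust
use std::sync::Mutex;

/// Hypothetical variant of Listing 7 that compiles in Rust 2021.
fn read_local_mutex() -> usize {
    let x = Mutex::new(42);
    // The guard is a temporary of this `let` statement and is dropped at the `;`,
    // that is, before `x` goes out of scope.
    let value = *x.lock().unwrap();
    value
}
```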
## Rust Edition 2021 temporary value extension rule for variable bindings.
Suppose `let $pat... = $expr` where `$pat` has at least one borrowing binding via `ref` or `ref mut`. Then `$expr` is recursively examined, and the lifetimes of its temporaries are extended to the nearest surrounding block, as follows.
- If `$expr` is `&mut $subexpr`, `& $subexpr` or `* $subexpr`, `$subexpr` gets extended and further examined for extension.
- If `$expr` is a structure constructor `Struct { .. }`, the fields *except* the base expression will be further examined individually.
- If `$expr` is a tuple constructor `($subexpr..)`, each coordinate `$subexpr` will be further examined individually.
- If `$expr` is an array constructor `[$subexpr..]`, each element `$subexpr` will be further examined individually.
- If `$expr` is a field projection `$subexpr.$field` or a tuple coordinate projection `$subexpr.$index` like `x.0` or `x.1`, `$subexpr` will be further examined for extension.
- Otherwise, no extension is made for `$expr`.
Unfortunately, enum tuple variants `Enum::Variant(..)` and tuple structs are not considered structure constructors, and their field initializers will not be extended in any case.
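The difference between extended and non-extended positions can be made visible with a drop log. The sketch below is only an illustration: `Noisy`, `name_len` and `extension_trace` are hypothetical names, and it covers three positions, namely a borrow directly in the initializer, a borrow inside a tuple constructor, and a borrow passed as a function argument, which falls under the "otherwise" rule.

```rust
use std::cell::RefCell;

/// Hypothetical value that pushes its name onto a shared log when dropped.
struct Noisy<'a>(&'static str, &'a RefCell<Vec<&'static str>>);

impl Drop for Noisy<'_> {
    fn drop(&mut self) {
        self.1.borrow_mut().push(self.0);
    }
}

fn name_len(n: &Noisy<'_>) -> usize {
    n.0.len()
}

fn extension_trace() -> Vec<&'static str> {
    let log = RefCell::new(Vec::new());
    {
        // `& $subexpr`: the temporary is extended to the surrounding block.
        let a = &Noisy("borrow", &log);
        // Tuple constructor: each coordinate is examined, so this is extended too.
        let b = (&Noisy("tuple", &log), 0u8);
        // Function argument: not extended, dropped at the end of this statement.
        let n = name_len(&Noisy("call", &log));
        log.borrow_mut().push("end of block");
        // `a` and `b` are still usable here only because of the extension.
        assert_eq!((a.0, (b.0).0, n), ("borrow", "tuple", 4));
    } // extended temporaries are dropped here, most recent first
    log.into_inner()
}
```

Only the function-argument temporary is dropped before the end of the block; the two extended temporaries are dropped afterwards in reverse order of creation.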