owned this note
owned this note
Published
Linked with GitHub
# Ding Xiang Fei / nikomatsakis
## 2024-04-04
* Temporary lifetimes
* RFC needs to be approved
* https://github.com/rust-lang/rfcs/pull/3606/
* Implementation work
* `super let`
* change lifetimes for temporaries in tail expressions <-- edition dependent
* migration?
* plus mara's PR
* Other things
* lint
* eager drop
Breakage 1:
```rust
fn why_would_you_do_this() -> bool {
let mut x = None;
// Make a temporary `RefCell` and put a `Ref` that borrows it in `x`.
x.replace(RefCell::new(123).borrow()).is_some()
}
```
the correct migration is fairly nontrivial
```rust
fn why_would_you_do_this() -> bool {
let tmp;
let mut x = None;
// Make a temporary `RefCell` and put a `Ref` that borrows it in `x`.
x.replace({ tmp = RefCell::new(123).borrow(); tmp }).is_some()
}
```
Niko's opinion: this is too complex and we should do a crater run to see what patterns (if any) actually occur in the wild.
Breakage 1b:
```rust
fn why_would_you_do_this() -> usize {
*foo(
{
let tmp = vec![44];
Some(tmp.len()).as_ref() // Option<&u32>
},
&(22+44),
)
}
fn foo<'a, T>(
x: Option<&'a T>,
y: &'a T,
) -> &'a T {
x.unwrap_or(y)
}
fn main() { }
```
```rust
fn why_would_you_do_this() -> usize {
*foo(
{
let tmp = vec![44];
let tmp1 = Some(tmp.len());
tmp1.as_ref() // Rust 2024 equivalent
},
&(22+44),
)
}
fn foo<'a, T>(
x: Option<&'a T>,
y: &'a T,
) -> &'a T {
x.unwrap_or(y)
}
fn main() { }
```
Breakage 2:
```rust
fn f(m: &Mutex<i32>) -> i32 {
let _x = PanicOnDrop;
*m.lock().unwrap()
}
```
```rust
fn f(m: &Mutex<i32>) -> i32 {
let _x = PanicOnDrop;
super let g = m.lock().unwrap();
*g
}
```
```rust
fn f(m: &Mutex<i32>) -> i32 {
let _x = PanicOnDrop;
{super let tmp = m.lock().unwrap(); *tmp}
}
// ...equivalent to...
fn f(m: &Mutex<i32>) -> i32 {
let _x = PanicOnDrop;
let tmp = m.lock().unwrap();
*tmp
}
```
LINTS
* Breakage 1: (stops compiling)
* tail expression of a block creates a temporary, a reference to which is stored into a local with a destructor (and the type is not declared `#[may_dangle]`)
* in Rust 2021, this is ok, because local is dropped first
* in Rust 2024, this is an error, because temporary is dropped first
* Breakage 2: (changes semantics)
* tail expression of a block creates a temporary which references a let ?
* in Rust 2021, destructor runs *after* the lets in the block are dropped
* in Rust 2024, destructor runs *before* the lets in the block are dropped
* and destructor of the temporary is significant!
* Maybe to be done with clippy but we gotta check (Ding)
* Anti-breakage 3: (non-issue)
* tail expression of a block creates a temporary that holds a reference to a local
* in Rust 2021, this does not compile, because reference is dropped first
* in Rust 2024, it does
OTHER LINTS
* Likely Bug 1: Match scrutinee creates a temporary but no reference to that temporary escapes
* `match foo.lock().unwrap().is_something() { true => ... }` (also `if let`)
* If a scrutinee creates a temporary...
* ...and the type of that scrutinee has no references... (overapproximated)
* ...then warn.
* ...and the type of that scrutinee does not contain a lifetime `'x` that outlives that temporary (better)
* ...then warn.
```rust
use std::sync::Mutex;
fn main() {
let p = 22;
let m: &Mutex<&u32> = &Mutex::new(&p);
match *m.lock().unwrap() {
i => { // type_of(i) = &u32
// still holding the lock here, but why?
}
};
}
```
```
⣿
Standard Error
Compiling playground v0.0.1 (/playground)
warning: unused variable: `i`
--> src/main.rs:8:5
|
8 | i => { // type_of(i) = &u32
| ^ help: if this is intentional, prefix it with an underscore: `_i`
|
= note: `#[warn(unused_variables)]` on by default
warning: `playground` (bin "playground") generated 1 warning (run `cargo fix --bin "playground"` to apply 1 suggestion)
Finished dev [unoptimized + debuginfo] target(s) in 0.42s
Standard Output
Result
// WARNING: This output format is intended for human consumers only
// and is subject to change without notice. Knock yourself out.
fn main() -> () {
let mut _0: ();
let _1: u32;
let _2: &std::sync::Mutex<&u32>;
let _3: std::sync::Mutex<&u32>;
let _4: &u32;
let mut _5: &&u32;
let mut _6: &std::sync::MutexGuard<'_, &u32>;
let _7: std::sync::MutexGuard<'_, &u32>;
let mut _8: std::result::Result<std::sync::MutexGuard<'_, &u32>, std::sync::PoisonError<std::sync::MutexGuard<'_, &u32>>>;
let _10: ();
scope 1 {
debug p => const 22_u32;
scope 2 {
debug m => _2;
let _9: &u32;
scope 3 {
debug i => _9;
}
}
}
bb0: {
_1 = const 22_u32;
_4 = &_1;
_3 = Mutex::<&u32>::new(_4) -> [return: bb1, unwind continue];
}
bb1: {
_2 = &_3;
_8 = Mutex::<&u32>::lock(_2) -> [return: bb2, unwind continue];
}
bb2: {
_7 = Result::<MutexGuard<'_, &u32>, PoisonError<MutexGuard<'_, &u32>>>::unwrap(move _8) -> [return: bb3, unwind continue];
// SOMEHOW: know that _7 is a temporary
}
bb3: {
_6 = &'a _7;
_5 = <MutexGuard<'_, &u32> as Deref>::deref(move _6) -> [return: bb4, unwind: bb7];
}
bb4: {
_9 = (*_5);
// SOMEHOW: know that match arm starts here
// and see that:
// * no reference to `_7` is live here
// * `_7` is a temporary with a significant destructor
// and issue a warning.
//
// BORROW CHECKER CONCEPT THAT IS RELEVANT:
// * X is "live" at a point P if X may be used by something other than a DROP
// * X is "drop live" at a point P if X may be used by a DROP
//
// CONDITION WE WANT TO LINT ON IS:
//
// * For each lifetime `'a` that was part of a borrow of the temporary (e.g., `_7`):
// * `'a` is not live (or drop live) on entry to the match arm
//. * then warn
_10 = do_this() -> [return: bb5, unwind: bb7];
}
bb5: {
drop(_7) -> [return: bb6, unwind continue];
}
bb6: {
return;
}
bb7 (cleanup): {
drop(_7) -> [return: bb8, unwind terminate(cleanup)];
}
bb8 (cleanup): {
resume;
}
}
fn do_this() -> () {
let mut _0: ();
bb0: {
return;
}
}
```
## 2022-12-15
some questions to revisit:
- why is the mpsc test failing? (Xiang is investigating)
- come up with some tests where we DON'T expect extension, comprehensive tests around dtor execution (nikomatsakis: think about)
- what is the syntactic pattern for method receivers -- look at adjustments or should we use `&self` declarations?
- if there are multiple "Extensible" arguments, must all of them be of an extensible pattern? e.g., `E& = call(..E&1...E&2...)`
## 2022-10-06
```rust!
fn get_name<'a>(ch: &'a Character) -> &'a Name {
// ^^^^
&ch.name
}
fn get_name(op: for<'a> fn(&'a Character) -> &'a Name) {
// ^^^^^^^
// same type as this:
//
// for<'b> fn(&'b Character) -> &'b Name
// ^ |
// +--------+ ReBound
//
// fn(&'c Character) -> &'c Name
// ^
// ReFree and friends, e.g., ReStatic
op(...);
// Binder<FnSig>
}
```
[FnPtr](https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/ty/enum.TyKind.html#variant.FnPtr)
[`liberate_late_bound_regions`](https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/ty/context/struct.TyCtxt.html#method.liberate_late_bound_regions)
```rust
fn get_name<'a>(ch: for<'b> fn(&'a Character, &'b str)) -> &'a Name {
// ^^^^
&ch.name
}
```
fn sig for `get_name` will be
```
for<'a> { // Binder<FnSig>
FnSig {
inputs: [
for<'b> { // Binder<Ty>
fn(&^a Character, &^b str) /// Ty
// -- --
// both ReBound, but referring to different binders
}
]
output: &^a Name // ^a == ReBound
}
}
```
then we invoke `liberate_late_bound_regions` will peel away one level of binder
```
FnSig {
inputs: [
for<'b> { // Binder<Ty>
fn(&!a Character, &^b str) /// Ty
// -- -- ReBound
// ReFree
}
]
output: &!a Name // !a == ReFree
}
```
### early vs late bound
```rust!
fn foo<T>(t: T) { }
fn main() {
let f: fn(?T) = foo::<?T>;
f(22_u32);
f(44_i32); // error
foo(22_u32);
foo(44_i32); // OK
}
```
```rust!
fn foo<'a, T>(t: &'a T) { }
fn main() {
let f: for<'a> fn(&'a ?T) = foo::<?T>;
f(&22_u32);
f(&44_u32);
}
```
```rust!
fn foo<'a, T>(t: &'a T)
where
T: SomeTrait<'a>,
{
}
fn main() {
let f: fn(&'?a ?T) = foo::<'?a, ?T>;
// register obligation: ?T: SomeTrait<'?a>
f(&22_u32);
f(&44_u32);
}
```
See `compiler/rustc_hir_analysis/src/check/mod.rs: typeck_with_fallback -> liberate_late_bound_regions`
## 2022-09-13
Use `parameters_for` from `compiler/rustc_typeck/src/constrained_generic_params.rs, rustc_typeck::constrained_generic_params` to test if region variables are constrained.
We shall start with a version that ignores `where` clauses and run an experiment based on this idea first.
## 2022-07-15
Investigate `break_for_else`.
`break_for_else` builds `IfThenScope::else_drops`.
`build_exit_tree` builds MIR to drop temps recorded in `else_drops`.
## 2022-07-12
PR #98574 is pending
## 2022-07-05
Add async tests
```rust
async fn foo(x: Option<bool>) {
let Some(_) = x else {
let r = Rc::new(());
bar().await
};
}
async fn bar() -> ! {
panic!()
}
fn is_send<T: Send>(_: T) { }
fn main() {
is_send(foo(Some(true))); // ERROR not send
}
```
```rust
async fn foo(x: Option<bool>) {
let Some(_) = x else {
bar(Rc::new()).await
};
}
async fn bar<T>(_: T) -> ! {
panic!()
}
fn is_send<T: Send>(_: T) { }
fn main() {
is_send(foo(Some(true))); // ERROR not send
}
```
```rust
async fn foo(x: Option<bool>) {
let Some(_) = x else {
(Rc::new(), bar().await);
return;
};
}
async fn bar<T>(_: T) -> ! {
panic!()
}
fn is_send<T: Send>(_: T) { }
fn main() {
is_send(foo(Some(true))); // ERROR not send
}
```
```rust
async fn foo(x: Option<bool>) {
let Some(_) = x else {
bar().await;
Rc::new();
return;
};
}
async fn bar<T>(_: T) {
}
fn is_send<T: Send>(_: T) { }
fn main() {
is_send(foo(Some(true))); // OK
}
```
## 2022-07
`find_capture_matching_projections` at `compiler/rustc_mir_build/src/build/expr/as_place.rs`
## 2022-06-21
### The happy path
- Investigate `rustc_mir_builder::Builder::storage_live_binding`
- takes a `LocalVarId` and constructs a place
- used by `expr_into_pattern`
- `rustc_mir_builder::Builder::declare_binding` pushes `Local` to `Builder::local_decls`
- There are places for borrowing in guards, but they are not relevant
- We actually want to create locals in the pattern
- We actually want to bind them to temporary locals and move them into the right place
- `Builder::bind_and_guard_matched_candidate`
- `Builder::bind_matched_candidate_for_arm_body`
- `Builder::storage_live_binding` *
### Questions to resolve
- What should `opt_match_place` be for use in `declare_bindings`?
- It is used in `rustc_borrowck::diagnostics::move_errors::MirBorrowckCtxt::append_to_grouped_errors`
## 2022-06-16
### Trouble for the `LocalVarId` idea
<details>
We wanted to store in THIR a `LocalVarId => LocalVarInfo` mapping, say `local_var_defs`, while mirroring HIR into THIR. However, (up-)variable references to function arguments are resolved much later.
```
compiler/rustc_mir_build/src/build/mod.rs: Builder::args_and_body, called by
compiler/rustc_mir_build/src/build/mod.rs: construct_fn, called by
compiler/rustc_mir_build/src/build/mod.rs: mir_build, ...
```
Ideally, we would build the `local_var_defs` mapping, we fill it whenever we encounter a pattern binding, or a `let-else` statement. However, we would not be able to account for variable references to function arguments unless we move some analysis done in `mir_build` and `args_and_body` into `thir_body` when the subject item is a function or closure.
</details>
Let us "desugar" in MIR instead.
### MIR match tree
What we want to do is...
* create [`Candidate`] for the let-else
* one will be from the pattern we have
* the other will be an `_` pattern
* `has_guard` false
* `match_pairs` uses a `PatKind::Wild`, should be easy to synthesize
* look at `Candidate::new`, most everything else is initialized to `vec![]`/None
* invoke `lower_match_tree`
* this will modify the `Candidate` structures in place
* once it is done
* `subcandidates` should be empty
* `ascriptions` -- ignore for now, might be relevant, we might want to put it in there, or we might enforce it in the let lowering? something
* in particular the
* `bindings` fields will store where to find the values for all the bindings in the let
* the `pre_binding_block` will be the place that code jumps to if the let *matches*
* we don't have a guard, so the "otherwise" block is irrelevant
* also `next_candidate_pre_binding_block` is not relevant, I don't think
* then we want to
* for the `_` arm (that we created), we execute the "else" block
* this must diverge, so we don't need to connect its "end block" to anything
* for the "main" arm, we want to invoke
* `this.declare_bindings` -- [like this](https://github.com/rust-lang/rust/blob/a72e4308dfe75c45c44162c43d4f98d32661c8b0/compiler/rustc_mir_build/src/build/matches/mod.rs#L355-L372)
* call `bind_pattern`---
* `guard` should be `None`
* `fake_borrow_temps` we get from `lower_match_tree`
* `arm_scope` is the scope of the enclosing block (or the suffix thereof)
* `match_scope` -- either None or the scope of the let?
* store variables into the bindings (XXX find method)
* take the current "end block" and use it to generate the remaining statements
## 2022-06-09
* Completed https://github.com/rust-lang/rust/pull/97913/ to use `LocalVarId` instead of raw `HirId`, two places where HirIds leaks:
* matching against closure capture information
* looking up the local variable scopes
* Our goal:
* be able to introduce local variables for the new patterns we are creating during the desugaring
```rust
// from
let $pattern = $expr else $else_expr;
// ^^^^^^^^^ these variables are in the surrounding block scope
// to
let $pattern = match $expr {
$pattern' => (...),
_ => $else_expr
};
```
the problem is:
* `pattern'` needs to replace each binding in `$pattern` with a fresh variable that can be reference in the `(...)` code
### possible solutions
#### push localvarid out of thir
seems complicated
#### keep LocalVarId specific to thir but extend to have more possible kinds of local-var-ids
conceptually:
```rust
enum LocalVarId {
HirId(HirId),
Synthetic(SyntheticVarId)
}
enum SyntheticVArInfo {
LetElseTemporary(statement, orig: LocalVarId),
}
struct Thir {
synthetics: IndexVec<SyntheticVarId, SyntheticVarInfo>
}
```
need to map from `SyntheticId` to ... enough information to answer:
* closure capture stuff
* easy -- synthetic things are never captured by closures
* region scope tree lookups
* always the same as the let-else statement
introduce helper functions:
```rust=
fn lookup_in_closure_cpaure(lvar: LocalVArId, ...) -> Option<> {
match lvar {
HirId(h) => capture.get(h),
LetElseTemporary(_) => None,
}
}
fn region_scope_info(lvar: LocalVArId, ...) -> Scope {
match lvar {
HirId(h) => region-tree.get(h),
LetElseTemporary(_) => scope-of-the-statement,
}
}
```
### hacks for size
goal: sizeof(HirId) == sizeof(LocalVarId)
we can make LocalVarId be something likje:
```rust=
pub struct LocalVarId {
pub owner: LocalDefId,
pub local_id: u32,
}
```
where if Local_id < number-of-hir-ids, it is a hir-id
but if local_id >= number-of-hir-ids, it is a synthetic id (and we can derive the index by subtracting the max)
only question is: how to find number of hir-ids? (it's somewhere)
## Older
### Bindings -> HirId <- Res <- AST nodes
- Bindings gets HirIds in `rustc_ast_lowering`
- Paths resolved to HirIds in `LateResolutionVisitor`
- `rustc_resolve::late::LateResolutionVisitor::resolve_pattern_inner` -> `rustc_resolve::Resolver::record_partial_res`
### Next up
The following have references to `HirId`s.
- `rustc_middle::thir::PatKind::Binding`
- `rustc_middle::thir::ExprKind::{Closure, VarRef, UpvarRef}`
Use of `HirId` in the context of variables while building MIRs:
- Query against region scope tree for `var_scope`s
### Scratch Pad
```
let T(a, b) = f() else {
...
}; use_here(a, b);
=>
let (a$2, b$3) = match f() {
T(a$0, b$1) => (a, b),
_ => ...
}; use_here(a$2, b$3);
```
```
let T(a0, b0) = f() else {
...
}; use_here(a0, b0);
// Option 1: lower to this
//
// In THIR lowering, keep a map from (a0 -> a1, b0 -> b1)
// and do the renumbering as we go.
//
// Side note: it might be nice to introduce a newtype'd HirId
// (let's call it `LocalVarId(HirId)`) within THIR so that we can
// cleanly identify each point where a `HirId` is being used as a
// `LocalVarId` and ensure that it is remapped.
//
// (And at some point I would like to stop using `HirId`
// for this purpose.)
let (a1, b1) = match f() {
T(a0, b0) => (a0, b0),
_ => ...
}; use_here(a1, b1);
// Option 2: use the "replacement" hir-ids in the pattern
//
// seems hard-ish to me, feels like it will lead to some ICE somewhere
let (a0, b0) = match f() {
T(a1, b1) => (a1, b1),
_ => ...
}; use_here(a1, b1);
// Option 3: modify name resolution to use the replacement id's
//
// seems grody, non-local
let T(a0, b0) = f() else {
...
};
use_here(a1, b1);
```
# Scratch
In `rustc_mir_build`
```
src/thir/cx/block.rs: Cx::mirror_stmts ->
src/thir/cx/mod.rs: Cx::pattern_from_mir ->
src/thir/pattern/mod.rs: pat_from_mir ->
src/thir/pattern/mod.rs: PatCtxt::lower_pattern (!) ->
src/thir/pattern/mod.rs: PatCtxt::lower_pattern_unadjusted *
```
Other outgoing calls of `src/thir/pattern/mod.rs: PatCtxt::lower_pattern`
- `src/thir/pattern/check_match.rs`: `check_match` query
- which can be subbed with its own local var registry
- self recursion
- `src/thir/pattern/mod.rs: PatCtxt::lower_pattern_unadjusted`
- `src/thir/pattern/mod.rs: PatCtxt::lower_patterns`
- `src/thir/pattern/mod.rs: PatCtxt::lower_tuple_subpats`
- `src/thir/pattern/mod.rs: PatCtxt::lower_opt_pattern`
After THIR mirroring, a second round of registering new bindings would be performed in
```
src/build/mod.rs: Builder::args_and_body ->
src/build/mod.rs: construct_fn ->
src/build/mod.rs: mir_build ...
```
`args_and_body` is to build bindings from the function argument list.