---
title: "Self-borrowing generator examples"
author: tmandry, TC
date: 2025-01-27
url: https://hackmd.io/DTSOVR4QRLyvaU1HQiQZvg
---
# Self-borrowing generator examples
[Zulip thread](https://rust-lang.zulipchat.com/#narrow/channel/481571-t-lang.2Fgen/topic/When.20do.20generators.20need.20to.20hold.20a.20borrow.20during.20a.20yield.3F/with/496511831)
## Example: iter_set
This works...
```rust
gen fn iter_set<T>(xs: &HashSet<T>) -> &T {
    for x in xs {
        yield x;
    }
}

gen fn iter_set_owned<T>(xs: HashSet<T>) -> T {
    for x in xs {
        yield x;
    }
}
```
but this does not...
```rust
gen fn iter_set_rc<T: Clone>(xs: Rc<RefCell<HashSet<T>>>) -> T {
    for x in xs.borrow().iter() {
    //       ^^^^^^^^^^^ ERROR borrow may still be in use
        yield x.clone(); // during this yield
    }
}
```
Remedy: Have the caller borrow out of the `RefCell` and pass the resulting reference to `iter_set` instead.
- Advantage: `T` does not need to be cloned.
- Disadvantage: The generator borrows from the caller's `Ref` guard, so it cannot be returned from a function that holds only the `Rc<RefCell<..>>`.
[Playground](https://play.rust-lang.org/?version=nightly&mode=debug&edition=2024&gist=32ff18ed0a6c604baf4df5685f312584)
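
The remedy can be sketched on stable Rust, assuming an ordinary `impl Iterator` as a stand-in for the `gen fn` (the generator version needs nightly). The helper `sum_shared` is a hypothetical caller, not part of the original example. The point is only that the `Ref` guard lives in the caller's frame, so the iterator borrows from the caller rather than from itself.

```rust
use std::cell::RefCell;
use std::collections::HashSet;
use std::rc::Rc;

/// Stable stand-in for the `iter_set` generator: it borrows the set and
/// hands out references, so `T` never needs to be cloned.
fn iter_set<T>(xs: &HashSet<T>) -> impl Iterator<Item = &T> {
    xs.iter()
}

/// Hypothetical caller: it, not the iterator, holds the `RefCell` guard.
fn sum_shared(shared: &Rc<RefCell<HashSet<u32>>>) -> u32 {
    // The `Ref<HashSet<u32>>` guard lives in this stack frame.
    let guard = shared.borrow();
    // The iterator only borrows from `guard`, so it cannot outlive this call.
    let total: u32 = iter_set(&*guard).sum();
    total
}
```

Because the iterator is tied to `guard`, it has to be consumed inside `sum_shared`. Returning it would require it to own the guard, which is the self-borrow this example is about.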
## Example: vals_for_keys
Here we need to borrow a generator-owned object while iterating through part of it.
```rust
/// Yields the set of values for each key listed in `keys`.
gen fn vals_for_keys<K, T>(
    map: HashMap<K, Vec<T>>,
    keys: Vec<K>,
) -> T
where
    K: Eq + Hash,
    T: Clone,
{
    for k in keys {
        if let Some(values) = map.get(&k) {
        //                    ^^^ ERROR borrow may still be in use
            for val in values {
                yield val.clone(); // during this yield
            }
        }
    }
}
```
This also does not work in the shared ownership (handle) case. Versions of this example that do work:
- Accepting a `&HashMap<K, Vec<T>>`
    - Advantage: `T` does not need to be cloned and can be handed out by reference
    - Disadvantage: Cannot return this generator from another function that owns the containers
- Destructively accessing the owned HashMap via `map.remove(&k)`
    - Advantage: `T` does not need to be cloned and can be handed out by value
    - Disadvantage: In this example, cannot support repeated keys
[Playground](https://play.rust-lang.org/?version=nightly&mode=debug&edition=2024&gist=612276051e6b95bf74b6ba86b7ed6959)
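
Both working versions can be approximated on stable Rust with iterator adapters standing in for the `gen fn`. This is a rough sketch, and the names `vals_for_keys_ref` and `vals_for_keys_owned` are made up for illustration. Neither version needs `T: Clone`.

```rust
use std::collections::HashMap;
use std::hash::Hash;

/// Borrowing variant: the map is owned elsewhere, values come out as `&T`.
fn vals_for_keys_ref<'a, K, T>(
    map: &'a HashMap<K, Vec<T>>,
    keys: Vec<K>,
) -> impl Iterator<Item = &'a T>
where
    K: Eq + Hash,
{
    // `map.get` returns a reference tied to `'a`, not to the iterator itself.
    keys.into_iter().filter_map(move |k| map.get(&k)).flatten()
}

/// Destructive variant: each `Vec<T>` is moved out of the owned map.
fn vals_for_keys_owned<K, T>(
    mut map: HashMap<K, Vec<T>>,
    keys: Vec<K>,
) -> impl Iterator<Item = T>
where
    K: Eq + Hash,
{
    // `remove` hands back an owned `Vec<T>`, so no borrow of `map` survives
    // between items. A key that repeats finds nothing the second time.
    keys.into_iter().filter_map(move |k| map.remove(&k)).flatten()
}
```

With keys `["b", "a", "b"]`, the borrowing variant visits the values for `"b"` twice, while the destructive variant visits them only once, which is the repeated-keys limitation noted above.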
## Example: Operating on values from lending traits
```rust
pub trait Lend {
    type Item<'a> where Self: 'a;
    fn lend(&mut self) -> Option<Self::Item<'_>>;
}

pub trait Txn {
    type Status;
    fn prepare(&mut self) -> Self::Status;
    fn execute(&mut self) -> Self::Status;
}

pub gen fn stream_txns<I, T>(mut xs: I) -> T
where
    I: for<'a> Lend<Item<'a>: Txn<Status = T>>,
{
    while let Some(mut txn) = xs.lend() {
        //~^ ERROR borrow may still be in use when `gen` block yields
        yield txn.prepare();
        yield txn.execute();
    }
}
```
Here, the generator captures a lending iterator by value. When we call `xs.lend()`, we get back a value that points into that owned lending iterator, creating a self-reference. Because we need to interleave our operations on `txn` with yielding values, there's no way to avoid holding this self-reference across a yield point.
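
To make the borrow concrete, here is a small sketch that compiles on stable Rust. `Counters`, `SlotTxn`, and `drive_txns` are hypothetical names invented for illustration. `drive_txns` is a push-style stand-in for the generator: it performs the same interleaving but hands each status to a callback instead of yielding, so the lent `txn` never has to be stored across a suspension point. The `Item` bound is written as a separate where-clause here; that is a choice made for this sketch, not the form used above.

```rust
pub trait Lend {
    type Item<'a> where Self: 'a;
    fn lend(&mut self) -> Option<Self::Item<'_>>;
}

pub trait Txn {
    type Status;
    fn prepare(&mut self) -> Self::Status;
    fn execute(&mut self) -> Self::Status;
}

/// Hypothetical lender: owns the slots and lends them out one at a time.
pub struct Counters {
    slots: Vec<u32>,
    next: usize,
}

impl Counters {
    pub fn new(slots: Vec<u32>) -> Self {
        Self { slots, next: 0 }
    }
}

/// The lent item points into the `Counters` it came from.
pub struct SlotTxn<'a> {
    slot: &'a mut u32,
}

impl Lend for Counters {
    type Item<'a> = SlotTxn<'a> where Self: 'a;
    fn lend(&mut self) -> Option<Self::Item<'_>> {
        let slot = self.slots.get_mut(self.next)?;
        self.next += 1;
        Some(SlotTxn { slot })
    }
}

impl Txn for SlotTxn<'_> {
    type Status = u32;
    fn prepare(&mut self) -> u32 {
        *self.slot += 1;
        *self.slot
    }
    fn execute(&mut self) -> u32 {
        *self.slot *= 10;
        *self.slot
    }
}

/// Push-style stand-in for `stream_txns`: statuses go to `sink`, not `yield`.
pub fn drive_txns<I, T>(mut xs: I, mut sink: impl FnMut(T))
where
    I: Lend,
    for<'a> I::Item<'a>: Txn<Status = T>,
{
    // `txn` borrows from `xs`, which this function owns. That is fine here
    // because nothing suspends while the borrow is live.
    while let Some(mut txn) = xs.lend() {
        sink(txn.prepare());
        sink(txn.execute());
    }
}
```

The trade-off is that the caller loses control over pacing: the callback runs to completion for every status, which is what a generator would let the consumer avoid.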
## Example: Working with capturing opaque types
```rust
pub fn f<T>(v: &T) -> impl FnMut() -> Option<u8> + use<'_, T> {
    move || { let _v = v; todo!() }
}

pub gen fn g<T>(v: T) -> u8 {
    loop {
        let mut h = f(&v); //~ Captures reference.
        //~^ ERROR borrow may still be in use when `gen` block yields
        while let Some(x) = h() { yield x };
    }
}
```
Here, the value of type `T` is moved into the generator. When we call `f(&v)`, we get back a `FnMut` type that captures that local reference to the value owned by the generator, creating a self-reference. Since we interleave calls to that `FnMut` with yields, there's no way to avoid holding this self-reference across a yield point.

Suppose that `f` took `T` by value rather than by reference. Then we could make this work, but that would have two problems: 1) it's not a rewrite local to `g`, since it changes the signature of `f`, and 2) we'd need to require that `T: Clone` and then pay the cost of cloning the value, for no good reason, on each iteration of the outer loop.

Conversely, `g` could conceivably take `T` by reference, but then we'd end up with a non-`'static` and opaque generator type that itself captures a reference, potentially recreating this same problem somewhere further downstream.
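
The by-value alternative might look roughly like the following on stable Rust. Everything here is an assumption made for illustration: `f_owned` and `g_cloning` are hypothetical names, the `AsRef<[u8]>` bound and the byte-at-a-time body stand in for the `todo!()` in `f`, and `std::iter::from_fn` stands in for the `gen` body.

```rust
use std::iter;

/// Hypothetical by-value variant of `f`: it owns `v`, hands out its bytes
/// one at a time, and then returns `None`.
pub fn f_owned<T: AsRef<[u8]>>(v: T) -> impl FnMut() -> Option<u8> {
    let mut idx = 0;
    move || {
        let byte = v.as_ref().get(idx).copied()?;
        idx += 1;
        Some(byte)
    }
}

/// Stand-in for `g`: no self-reference, at the price of `T: Clone`.
pub fn g_cloning<T>(v: T) -> impl Iterator<Item = u8>
where
    T: Clone + AsRef<[u8]>,
{
    let mut h = f_owned(v.clone());
    iter::from_fn(move || {
        loop {
            if let Some(x) = h() {
                return Some(x);
            }
            // One pass of the outer loop: rebuild `h` from a fresh clone.
            // Like the original `loop`, this spins forever if `v` is empty.
            h = f_owned(v.clone());
        }
    })
}
```

Each time the inner closure is exhausted, the whole value is cloned again, which is the cost described above.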
## Example: Producer/consumer patterns
```rust
trait Hash {
    fn update(&mut self, data: &[u8]);
    fn digest(&mut self) -> [u8; 32];
}

struct S<'cx, CX>(&'cx CX);

impl<'cx, CX> Hash for S<'cx, CX> {
    fn update(&mut self, _data: &[u8]) { todo!() }
    fn digest(&mut self) -> [u8; 32] { todo!() }
}

pub fn make_hash<CX>(ctx: &CX) -> impl Hash + use<'_, CX> {
    S(ctx)
}

enum StreamState {
    Length(usize),
    Digest([u8; 32]),
}

pub gen fn stream_hash<I, CX>(xs: I, ctx: CX) -> StreamState
where
    I: IntoIterator<Item = Vec<u8>>,
{
    let mut hash = make_hash(&ctx);
    //~^ ERROR borrow may still be in use when `gen` block yields
    let mut xs = xs.into_iter();
    while let Some(x) = xs.next() {
        hash.update(&x);
        yield StreamState::Length(x.len());
    }
    yield StreamState::Digest(hash.digest());
}
```
Here, `make_hash` returns a value that captures a reference to `ctx`, which the generator owns, creating a self-reference. Since we need to interleave operations on this producer/consumer type with yields, we must hold this self-reference across a yield point.
## Example: Turning a borrowing iterator into an owning one
```rust
gen fn chars(s: String) -> char {
    for c in s.chars() {
        //~^ ERROR: borrow may still be in use when `gen` fn body yields
        yield c;
    }
}
```
[Playground link](https://play.rust-lang.org/?version=nightly&mode=debug&edition=2021&code=%23%21%5Bfeature%28gen_blocks%29%5D%0A%0Agen+fn+chars%28s%3A+String%29+-%3E+char+%7B%0A++++for+c+in+s.chars%28%29+%7B%0A++++++++%2F%2F~%5E+ERROR%3A+borrow+may+still+be+in+use+when+%60gen%60+fn+body+yields%0A++++++++yield+c%3B%0A++++%7D%0A%7D%0A%0Afn+main%28%29+%7B%7D%0A)
This particular case could instead be written as:
```rust
pub gen fn chars(s: String) -> char {
    let mut idx = 0;
    loop {
        let c = match s[idx..].chars().next() {
            Some(x) => x,
            None => return,
        };
        idx += c.len_utf8();
        yield c;
    }
}
```
[Playground link](https://play.rust-lang.org/?version=nightly&mode=debug&edition=2021&code=%23%21%5Bfeature%28gen_blocks%29%5D%0A%0Apub+gen+fn+chars%28s%3A+String%29+-%3E+char+%7B%0A++++let+mut+idx+%3D+0%3B%0A++++loop+%7B%0A++++++++let+c+%3D+match+s%5Bidx..%5D.chars%28%29.next%28%29+%7B%0A++++++++++++Some%28x%29+%3D%3E+x%2C%0A++++++++++++None+%3D%3E+return%2C%0A++++++++%7D%3B%0A++++++++idx+%2B%3D+c.len_utf8%28%29%3B%0A++++++++yield+c%3B%0A++++%7D%0A%7D%0A%0Afn+main%28%29+%7B%7D%0A)
But that of course takes much more thought and care, and it loses much of the benefit of using generators in the first place.
With types that are more complicated or more opaque than `String`, this kind of rewrite may not always be available.
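
For comparison, the same index-tracking idea can be written on stable Rust as a hand-rolled iterator. `OwnedChars` is a hypothetical name used only for this sketch. It stores a byte offset instead of a `Chars<'_>`, so the struct holds no borrow of its own `String`, at the cost of re-slicing on every call and keeping the offset on a character boundary by hand.

```rust
/// Hypothetical hand-written equivalent of the index-based `chars` rewrite.
pub struct OwnedChars {
    s: String,
    /// Byte offset of the next character; always on a `char` boundary.
    idx: usize,
}

impl OwnedChars {
    pub fn new(s: String) -> Self {
        Self { s, idx: 0 }
    }
}

impl Iterator for OwnedChars {
    type Item = char;

    fn next(&mut self) -> Option<char> {
        // Re-borrow the string on every call; nothing is held in between.
        let c = self.s[self.idx..].chars().next()?;
        // Advance by the encoded width so `idx` stays on a boundary.
        self.idx += c.len_utf8();
        Some(c)
    }
}
```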

---
As a kind of horrifying aside, since strings are heap allocated, we could also rewrite this particular example as:
```rust
pub gen fn chars(s: String) -> char {
    // SAFETY: We own the string, and we ensure that `xs` is dropped
    // before the string buffer that it references. We're doing this
    // since the borrow checker isn't otherwise smart enough to work
    // out that `Chars<'_>` isn't actually referencing memory that's
    // part of this generator since the buffers for strings are heap
    // allocated.
    //
    // Note that we're making unwarranted assumptions about how
    // `str::chars` works. In particular, if it were to hold a
    // reference to the string length, which is part of the data owned
    // by the generator, then this would be UB. As it happens, it
    // doesn't. Internally, it uses `slice::iter` which holds only a
    // pointer to the buffer and a pointer to one past the last
    // element of the string. But it's not really fair of us to rely
    // on that.
    //
    // And, of course, this wouldn't work at all with a
    // `SmallVec`-style string.
    let xs: core::str::Chars<'static> =
        unsafe { core::mem::transmute(s.chars()) };
    for c in xs {
        yield c;
    }
}
```
If we don't support self-referential generators, we might tempt people into writing this sort of thing.