Design meeting 2024-01-17: Simple postfix macros

---
title: "Design meeting 2024-01-17: Simple postfix macros"
date: 2024-01-17
tags: ["T-lang", "design-meeting", "minutes"]
discussion: https://rust-lang.zulipchat.com/#narrow/stream/410673-t-lang.2Fmeetings
url: https://hackmd.io/zsu_30qRRyezzDr6xIQpkA
---

# Design meeting 2024-01-17: Simple postfix macros

[RFC 2442](https://github.com/joshtriplett/rfcs/blob/simple-postfix-macros/text/2442-simple-postfix-macros.md) proposes "simple postfix macros". This was proposed several years ago, but I've updated it since then, and incorporated every bit of feedback from the GitHub thread.

This document gives a brief overview of the proposal and rationale, and then goes into open questions the team needs to decide.

## The proposal

Allow defining and invoking postfix macros, so that users can write and read expressions involving macros from left to right, rather than having to repeatedly go back to the beginning:

```rust
macro_rules! postfix {
    ($self:self) => { ... };
    ($self:self, $arg:expr) => { ... };
}

expr.postfix!() // invokes the first rule
expr.postfix!(another_expr) // invokes the second rule
```

`$self` gets evaluated *before* the macro invocation, so the macro can't reinterpret the entire previous expression or skip evaluating it. This preserves the property that code *not* appearing *inside* a macro invocation (e.g. not `macro!(code)`) gets interpreted in the obvious and expected way, and in particular will be evaluated exactly once, not zero times or multiple times. (This also avoids the "garden path sentence" problem of humans having to go back and reinterpret the previous code after seeing a macro.)

`stringify!($self)` does show the whole expression as a string, which allows for debug/assert macros.

The invocation uses autoref on self, just like method dispatch and closures do: it figures out whether it needs the expression to be `&self`, `&mut self`, or `self`, based on the body. This is the same convenience users expect for methods.
For instance, users can call `some_struct.field.write_all(b"...")`, and `write_all` will receive `&mut some_struct.field`, so it does not consume the field; similarly, users will be able to call `some_struct.field.write!("...")` and it will not consume the field.

## Use cases

Some notable potential use cases include:

- `expr.log_err!(...)?` to conveniently log an error without having to do `expr.map_err(|e| { error!(...); e })`
- `expr.unwrap_or! { ... }`, which allows the `expr` to do control flow like `continue` or `break` or `return`.
- `expr.matches!(pattern)` or `expr.is!(pattern)`, which allows experimenting with infix match syntax.
- `expr.field.dbg!().method().dbg!().method2()`, which makes it easy to inspect the steps in a chain.
- `expr.writeln!("...")`, which makes the invocation feel like a method rather than `writeln!(expr, "...")`

This also allows experimentation with language design in general: this lets library crates show us the potential for broader kinds of syntax, and popular demonstrations of intuitive/ergonomic solutions may influence future language design.

## Two points of discussion

### Non-type-based dispatch

One point of *extensive* discussion in the GitHub thread was whether anything using method syntax should *always* be dispatched based on the type/traits of the receiver.

Type-based macro dispatch is a research project, and would likely be a far-future thing, if we even wanted it at all.

This proposal for simple postfix macros is intentionally *not* dispatched based on type; it's effectively a method on any possible type, like a blanket impl.

I propose that, whether or not we *might* want type-based macro dispatch in the future, we will still want *non-type-based* postfix macros. Consider things like `.matches!`/`.is!`, which would apply to *any* type that can be pattern-matched, for example.

The proposal in the RFC is forward-compatible with adding type-based or trait-based dispatch in the future, *if we want that*.
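As a rough illustration of what "non-type-based" means, the existing prefix `matches!` macro from the standard library already behaves this way on stable Rust: one macro applies to any value that can be pattern-matched, with no dispatch on the receiver's type. The sketch below uses only that existing prefix macro; the `Shape` type and the helper function names are hypothetical, and the postfix form `expr.matches!(pattern)` from the RFC is not used because it does not exist in the language today.

```rust
// Hypothetical example type; not part of the RFC.
#[allow(dead_code)]
enum Shape {
    Circle(f64),
    Square(f64),
}

// The same `matches!` macro works on an Option, a user-defined enum,
// and a char. Nothing is resolved through the type of the first argument.
fn is_big_some(n: Option<u32>) -> bool {
    matches!(n, Some(k) if k > 2)
}

fn is_circle(s: &Shape) -> bool {
    matches!(s, Shape::Circle(_))
}

fn is_lowercase_ascii(c: char) -> bool {
    matches!(c, 'a'..='z')
}

fn main() {
    assert!(is_big_some(Some(3)));
    assert!(!is_big_some(None));
    assert!(is_circle(&Shape::Circle(1.0)));
    assert!(!is_circle(&Shape::Square(2.0)));
    assert!(is_lowercase_ascii('x'));
}
```

A postfix `.matches!` as proposed would presumably keep this property, with the first argument moved into receiver position.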

### autoref

The RFC has the compiler use the same machinery for postfix macros that it already uses for closures and method dispatch. Today, you can write a closure that calls `something.method()`, and the compiler will figure out how to capture `something` based on what `method` needs.

The autoref mechanism

The RFC, as a placeholder, writes the desugaring as if we had a `k#autoref` construct to invoke this machinery. However, this RFC does *not* propose exposing that syntax; we can consider that as potential future work if we want it.

---

# Discussion

People: TC, Josh, scottmcm, eholk, Lukas Wirth, pnkfelix, tmandry, nikomatsakis, waffle, Nadri

Minutes: TC

## Effect of Eager evaluation on expansion

pnkfelix: (this may overlap with scottmcm's Q above about place-forwarding)

pnkfelix: Does achieving eager evaluation of `$self` affect expansion in some way? I'm trying to understand how you will get this effect (which I interpret as doing some sort of early `let`-binding of `$self` so that you have a variable to hold the value it yields) while also allowing `stringify!($self)` to expand to the receiver expression.

JT: Yes, we fully evaluate it and end up with a place.

pnkfelix: How does `stringify` work then?

JT: It gives you the exact string of the receiver. It doesn't indicate that it wasn't eagerly evaluated?

TC: So it's special-cased.

pnkfelix: That sounds bad.

Josh: pnkfelix, why does that seem bad? It's just the string returned by `stringify!($self)`, and it makes things like `.dbg!()` or `.log!()` or `.assert!()` useful.

scottmcm: I don't know how we'd forward the place.

JT: The desugaring in the RFC is not sufficient to describe the concept of place forwarding. Are you asking how to write out this forwarding in surface syntax, or are you asking whether it's possible at all? The semantics seem clear to me, but we definitely don't have a surface syntax for it.

scottmcm: We definitely don't have a surface-level Rust type that does this. That's what scares me.
If we have to add something to the type system to do this, that's a much more complicated change.

JT: Fortunately, it's a macro, so there's nothing for the type system to do here.

scottmcm: But it produces a place that needs to be type checked.

JT: We could not define a syntax for it and it would still work.

scottmcm: I'm not so worried about the syntax here.

NM: There are two constraints here. One is that this proposal is trying to evaluate the self type once. The other is that we want it to autoref/autoderef.

JT: It's also about the garden path problem (reinterpreting a whole expression/chain after getting to the end of it).

NM: But how important is the first one?

JT: I wouldn't want to do it without that constraint.

pnkfelix: If the macro tries to do inspection of the self argument, what does it get?

NM: That's a detailed question. I see what Josh is saying, but on the other hand, the `!` means that macros can surprise you.

NM: We have the exclamation point for a reason, to flag this kind of thing.

tmandry: +1. I wonder whether this will be a real issue in practice.

Josh: Normally, macros wrap around the code they reinterpret. I don't think the bang at the end of a long chained expression is sufficient warning. This RFC was intentionally giving up that capability, as you can do useful things without it and without scaring people with arbitrary reinterpretations.

NM: It answers my question, but I'm not sure I agree. I'm not sure it'd be such an issue in practice, and it's bringing a lot of complexity in the RFC.

pnkfelix: Question one. You've brought up that we don't have to worry about the surface syntax. Are you talking about what a human writes, or are you talking about the object syntax of what the RHS, recursively expanded, will produce?

JT: We have builtin macros. But what I'm suggesting is that the choice of the surface syntax we expand to is somewhat arbitrary. We don't have to expose it as stable surface syntax.

pnkfelix: We'd have to write a specification for what the language means. So this matters to me, that we'd have to write this out.

JT: We have ways to pass a place to something already in the language, since we can pass a place to a closure. (And we have no way to "name" that mechanism today, as far as I know.)

pnkfelix: Question two. This focus on the fact that evaluation is eager. That also means that expansion must be eager. The expansion of the receiver happens before you expand the postfix thing.

pnkfelix: i.e. `foo!().postfix!()`, when does the expansion of `foo!` happen relative to the expansion of `postfix!`?

nikomatsakis: The doc doesn't specify the "desugaring", is it correct that it would look like:

```rust
macro_rules! foo {
    ($self, ...) => { $body }
}

<expr>.foo!()
```

This becomes:

```rust
match /* tokens of expr */ {
    k#autoref expr => {
        ($body with $self replaced with `expr`)
    }
}
```

So you can do this, as with a closure:

```rust
macro_rules! foo {
    ($self, ...) => {
        $self.as_ref()
        $self.as_mut()
    }
}
```

But you can't do a conditional move:

```rust
macro_rules! foo {
    ($self, ...) => {
        if (false) {
            drop($self);
        }
    }
}

let v = vec![];
v.foo!(); // v is moved here
```

NM: Why do we want the place thing?

JT: The simplest case is this:

```rust
macro_rules! postfix_assign {
    ($self:self, $e:expr) => { $self = $e }
}
```

NM:

```rust
macro_rules! increment {
    ($self) => {
        *self.foo() += 1;
    }
}

trait Foo {
    fn foo(&mut self) -> &mut Self;
}
```

JT: One answer to the question is that you don't get to do the place thing. But it may be limiting, and we know that closures are able to do this.

NM: I understand now. Yes, you could make that work then. You kind of want a closure that only captures self and everything else is transparent.

Josh: With closures it looks like this:

```rust
let mut x = 42;
let mut assign = |y| { x = y };
assign(10);
println!("{}", x); // this works and prints 10
```

The closure captures something as a place and uses it as a place.
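
The closure analogy also covers field places such as `x.0`, which is the shape of receiver that the place-forwarding question is about. The following is a minimal sketch on stable Rust using a plain closure; it is not the RFC's desugaring, and the function name `assign_field_through_closure` is hypothetical.

```rust
// Writes to the place `x.0` from inside a closure and returns the tuple,
// showing that the original value (not a copy) was modified.
fn assign_field_through_closure() -> (i32, i32) {
    let mut x = (1, 2);
    {
        // In the 2021 edition the closure captures just the place `x.0`
        // by mutable reference; earlier editions capture all of `x`.
        // Either way, the assignment writes through to the original.
        let mut assign = |v| {
            x.0 = v;
        };
        assign(4);
    }
    x
}

fn main() {
    let x = assign_field_through_closure();
    assert_eq!(x, (4, 2));
    println!("{:?}", x);
}
```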

NM: What would it do if the thing you called it on was not a place? `vec![].postfix_assign!(42);`

JT: It would expand, it wouldn't be a place, so you wouldn't be able to assign to it.

pnkfelix: If you put a `vec![]` into:

```rust
macro_rules! foo {
    ($self, ...) => { $body }
}

<expr>.foo!()
```

Then you wouldn't necessarily get the error.

NM: Let's maybe say, for the sake of argument, that we could do what JT is proposing...

```rust
let place expr = <expr> in { ... }
```

tmandry: I'm not sure I like the place thing.

NM: It seems like we're going for a worse-is-better design. But on the other hand, this doesn't feel as simple as it could be to me.

JT: Either we allow you to do what you can with a place (and give an expansion that allows that), or you can't do that, e.g. you couldn't do assignments? I don't think we have the option to choose to do it later.

NM: We could also choose to do it as though it were inside the parentheses. Or we do something with potentially some limitations to early evaluate the `Self`. I do think we could launch with a narrower set of uses and then expand.

JT: Could we launch without a place and then expand?

NM: Maybe.

pnkfelix: I.e., we'd only allow the self in value contexts.

tmandry: My feeling is that if we allow that we should just allow arbitrary evaluation.

JT: Late evaluation seems too surprising to me.

TC: Perhaps we should explore use cases, look at them, and see how much we like them, both good and bad?

NM: E.g., the logging case, you may want to conditionalize it.

JT: It's reasonable to me that you'd have to write it prefix in that case.

TC: The conditional is what you want here though. It'd be annoying to have to rework to prefix in that case.

NM: Are the use cases above exhaustive?

JT: They're representative ones I collected. People submitted more. Perhaps the most popular one was `unwrap`.

tmandry: In terms of the text, it could be more clear that it's talking about eager evaluation of the expression.

eholk: Coming from other languages, the idea that self would be evaluated eagerly would be really surprising. I feel like this hasn't been an issue with `.await` or `?`. How concerned are we really that people will do stupid things with this power? Because that seems like a much simpler design. And in practice, I don't feel like people will do stupid things here.

pnkfelix: I was expecting to allow people to write DSLs (on the receiver expression) with this. (But once I understood that was not the goal, I accepted that.)

JT: Most of the time, people probably won't do surprising things, at least *on purpose*... But as with all macros, there may commonly be issues with duplicate evaluation.

tmandry: One simple example that comes to mind:

```rust
println!("foo").repeat!(5);
```

tmandry: I think the place thing is what increases the complexity, and I may be OK with it without that.

JT: I'm getting the feeling that the biggest point of contention is the question of whether we want to support places. The one other question I'd like to answer is whether doing non-type-based dispatch is fine. I'd like confirmation there.

NM: I'm fine with that.

pnkfelix: +1. I'm not sure how it would work anyway. NM was probably the expert on that.

scottmcm/tmandry: +1.

NM: I'm not convinced that the misuse potential is greater for self than for other macros. E.g. duplicate evaluation or unsafe scoping. These things are equally surprising for any macros.

tmandry: Doing it differently here could set expectations that could mislead people for normal macros or other argument positions.

NM: I'd rather have a general mechanism for solving these problems for all macros.

JT: I can appreciate the orthogonality of treating these separately.

NM: Where tmandry and I (and TC + eholk) perhaps differ is that we're not as convinced that we need to prohibit inspecting the self type.

tmandry: I see the arguments JT is making.

NM: It just seems the RFC could be much simpler if we just allowed this.
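
The duplicate-evaluation hazard mentioned in this discussion can be reproduced with ordinary prefix macros on stable Rust. The sketch below is only an illustration: the macro names `double_naive` and `double_once` and the helper `next` are hypothetical. It contrasts a macro that pastes its argument twice with one that binds the argument once through `match`, which is a common way to get evaluate-exactly-once behavior for a macro argument.

```rust
use std::cell::Cell;

// Pastes `$x` twice, so the argument expression is evaluated twice.
macro_rules! double_naive {
    ($x:expr) => {
        $x + $x
    };
}

// Binds the argument once via `match`, then reuses the binding.
macro_rules! double_once {
    ($x:expr) => {
        match $x {
            v => v + v,
        }
    };
}

// Hypothetical helper with a visible side effect: it bumps a call counter.
fn next(counter: &Cell<u32>) -> u32 {
    counter.set(counter.get() + 1);
    counter.get()
}

// Returns how many times `next` ran when passed through `double_naive!`.
fn calls_made_by_naive() -> u32 {
    let c = Cell::new(0);
    let _ = double_naive!(next(&c));
    c.get()
}

// Returns how many times `next` ran when passed through `double_once!`.
fn calls_made_by_once() -> u32 {
    let c = Cell::new(0);
    let _ = double_once!(next(&c));
    c.get()
}

fn main() {
    assert_eq!(calls_made_by_naive(), 2);
    assert_eq!(calls_made_by_once(), 1);
}
```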

(The meeting ended here.)

## Scott's usual statements about postfix

scottmcm: I think that postfix in general is better *only* for things that take values. That's why I like `.await` but not `{ ... }.async`, why `x.break` would be fine but `{ ... }.loop` is not, etc.

This is the general "do you need a warning about it up-front?" thing: you should be thinking about the *code* in there differently, rather than thinking about the value. This is also why we said `unsafe` blocks are lexical, not dataflow-aware.

## Conditionally moving out

scottmcm: There's no *type* in rust (surface rust at least) that the temporary in the `match` can have that makes this do what we want, I think:

```rust
macro_rules! maybe_move {
    ($p:self) => {
        if foo() { return Some($p) }
    }
}

foo.b.maybe_move!();
... keep using `foo.b` here because it wasn't moved ...
```

## Place-forwarding

scottmcm: If the receiver is used multiple times, how does it handle it? Can I get place-level access to something, and how is that forwarded?

I'm having trouble coming up with a perfect example, but how about something like this:

```rust
macro_rules! assign {
    ($p:self, $e:expr) => ({ $p = $e; })
}

x.0.assign!(4);
```

That's not something where "do what autoref does" can explain what should happen, and if `self` is a temporary, then it's useless.

josh: Can you clarify the primary thing your example is trying to demonstrate? Let's discuss this one after people are done reading.

scottmcm: the temporary can't be `T`, `&mut T`, nor `&T` and still have those *tokens* in the expansion do what I want. Specifically, the type of the temporary mentioned in

> the compiler will effectively create a temporary binding (as though with match) for the value of $self

josh: I see what you mean; at a *token* level, this can't just desugar to `match $x { autoref __x => ... }`, because you couldn't write `__x = y` and have the right thing happen.
If it's `&mut` you have to write `*` and if it's a move then it's not referencing the original. So you're right, it's not sufficient to describe this as desugaring to a match-with-autoref.

Clarifying something: do you think the correct behavior is obvious here and it just doesn't match the RFC's desugaring as written, or is the behavior unclear? I'm expecting this to have the same behavior as:

```rust
let mut x = 42;
let mut assign = |y| { x = y };
assign(10);
println!("{}", x); // this works and prints 10
```

scottmcm: I think that it's exposing a gap in the "it's like autoref" argument, because with autoref we know what it should be, but there's no autoref for assignment.

josh: Would removing the concrete "autoref" desugaring and just explaining that it uses the same machinery as closures be sufficient here?

## Eager expansion

tmandry: Correct me if I'm wrong, but I believe this eager expansion behavior of `$self:self` is different from the behavior of other macro args (annoyingly, in my experience). Should we consider as a future possibility some extension that would cause any macro arg to be evaluated eagerly?

josh: We absolutely could, and that seems potentially useful, though it's also possible to emulate it with `match $arg { __arg => ... }`.

## In-place dbg

nikomatsakis: This example is interesting:

> `expr.field.dbg!().method().dbg!().method2()`, which makes it easy to inspect the steps in a chain.

How would that `dbg!` work... I presume it would borrow the self and then move the value as the result, which means that if `method()` were `fn(&self)`, you'd have subtle differences.

```rust
macro_rules! dbg {
    ($self:self) => {
        {
            eprintln!("{}: {}", stringify!($self), $self);
            $self
        }
    }
}
```

`stringify!` aside, this could be done in a non-macro AFAICT.

scottmcm: relatedly, what does the `stringify!` get? The value-moving behaviour I understand, but not what will be output. Is it `expr.field.dbg!().method().dbg!(): 40`?

josh: The first `dbg!()` will get `expr.field`, the second one will get `expr.field.dbg!().method()`. (They're just strings, the receiver is still fully evaluated.)

## Use cases for late evaluation

TC: Are there any appealing known use cases for late evaluation of the `self` type that we'd be ruling out? (In these cases, of course people could fall back to using a prefix macro, but perhaps that would seem awkward given the presence of postfix macros.)

josh: It'd rule out things like `.loop!()` or `.while!()`, for instance, but I think it's reasonable to rule those things out.

Tyler, Niko: Not completely convinced we have to ban late evaluation.

Tyler: Fine with moving forward with something that does ban late evaluation, at first.

## Auto-ref

nikomatsakis: I vaguely remember, Josh, you and me discussing the auto-ref idea, but I'm kind of confused about how it's expected to work here. Are there limits to how you can use `self` fragments, and what happens if you use it more than once? (Presumably you can't.)

```rust
macro_rules! method_macro {
    ($p:self, $x: expr) => {
        $p.method($x);
        $p.method_mut($x);
    }
}

struct SomeType;

impl SomeType {
    fn method(&self, arg: u32);
    fn method_mut(&mut self, arg: u32);
}

SomeType.method_macro!(22) // ?
```

TC: +1, I was writing out this same question.

tmandry: I'm trying to understand why this is necessary in the first place if `thing.foo!()` can desugar directly to `thing.some_method()`... okay, I think I get it now.

## Is it allowed for a macro to be both postfix and "normal"?

waffle: is this allowed?

```rust
macro_rules! mac {
    ($self:self) => { "self" };
    () => { "empty" };
    ($x:expr) => { "expr" };
}

assert_eq!(1.mac!(), "self");
assert_eq!(mac!(), "empty");
assert_eq!(mac!(1), "expr");
```

Josh: Yes, and I mentioned that in the RFC. One macro can have both postfix and non-postfix rules.

## Expansion Order of nested macros

Lukas: `foo!().bar()` and `1.bar().baz()`, is the `$self` "receiver" eagerly expanded and then passed to the postfix macro transcriber? This can be observed by proc-macros.

Lukas: What is the output of the following?

```rust
macro_rules! stringify_self {
    ($self:self) => { stringify!($self) }
}
macro_rules! foo {
    ($expr:expr) => {$expr};
}
foo!(42).stringify_self!()
```

Lukas: will a proc macro that runs [TokenStream::expand_expr](https://doc.rust-lang.org/proc_macro/struct.TokenStream.html#method.expand_expr) (unstable feature) on its input see that `$self` var? (Assuming that feature gets expanded to allow more than just literal expansions.) If it does, the expansion of postfix macros is not allowed to change once proc-macros can observe it on stable like this.

## Do we have consensus that non-type-based dispatch is OK?

josh, Scott, Tyler, Niko, Felix: yes
