Extended dot operator RFC

Feature Name: extended_dot_operator
Start Date: 2019-01-27
RFC PR: —
Rust Issue: —

Summary

The extended dot operator provides a more flexible way of invoking associated methods, functions, macros, and even control flow constructs. In some cases it makes it possible to avoid extra bindings, parentheses, identifiers, and unnecessary mut state.

let text = String::new() .[
    {file_a}.read_to_string(&mut this)?,
    push_str("\n ------ \n"),
    {file_b}.read_to_string(&mut this)?,
];

Motivation

This syntax would be a simple, uniform, and powerful substitute for many syntax constructs available in other programming languages. Those constructs have more or less the same purpose and a similar structure, but in their original forms they would presumably never be added to Rust because of an unsuitable ratio of complexity to usefulness to verbosity.

The following use cases are supported:

1. Deferring prefix operators

let not_empty = get().some_collection().[!this.is_empty()];
let value = something.[*this.method_returns_ref()].continue_chain();

This syntax moves a prefix operator into the method call chain, right next to the method whose return value the operator applies to. In some cases this simplifies scoping and improves readability.
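For comparison, here is a minimal sketch of how the first example is written in current Rust. The function some_collection is a hypothetical stand-in for get().some_collection(), and the binding name this is only chosen to mirror the proposal.

```rust
// Hypothetical stand-in for `get().some_collection()` in the example above.
fn some_collection() -> Vec<u32> {
    vec![1, 2, 3]
}

// Today the `!` is written in front of the whole chain, far from `is_empty`.
fn not_empty_prefix() -> bool {
    !some_collection().is_empty()
}

// Alternatively, a temporary binding keeps the operator next to the call it
// applies to; this is roughly what `.[!this.is_empty()]` would abbreviate.
fn not_empty_binding() -> bool {
    let this = some_collection();
    !this.is_empty()
}

fn main() {
    println!("{} {}", not_empty_prefix(), not_empty_binding());
}
```

Both functions compute the same value; the difference is only in where the operator sits relative to the chain.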

2. Pipeline operator

let deserialized: DataType =
    Path::new("path/to/file.json")
        .[File::open(&this)].expect("file not found")
        .[serde_json::from_reader(this)].expect("error while reading json");

This syntax looks completely different from the familiar |> operator, but it plays well with move/borrow semantics, has cleaner scoping and precedence, and is more flexible overall.
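The closest approximation in current Rust is a closure-based combinator. The sketch below assumes a hypothetical Pipe extension trait; it is not part of the standard library or of this RFC, and parse_len and pipeline are illustrative names.

```rust
// Hypothetical extension trait, shown only for comparison with `.[f(this)]`.
trait Pipe: Sized {
    // Pass `self` by value to `f` and return whatever `f` returns.
    fn pipe<R, F: FnOnce(Self) -> R>(self, f: F) -> R {
        f(self)
    }
}

// Blanket implementation so every sized type gets `.pipe(...)`.
impl<T> Pipe for T {}

// Illustrative free function that would otherwise break the method chain.
fn parse_len(text: &str) -> Result<usize, std::num::ParseIntError> {
    text.trim().parse::<usize>()
}

fn pipeline(input: &str) -> usize {
    input
        .pipe(parse_len)
        .expect("not a number")
        .pipe(|n| n * 2)
}

fn main() {
    println!("{}", pipeline("21"));
}
```

Because the steps here are closures, a ? or return written inside one of them would leave only the closure and not the enclosing function, which is the kind of limitation the bracket syntax is meant to avoid.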

3. Overriding results

let sorted_vec = iter
    .collect::<Vec<_>>()
    .[sort(),];

This syntax drops a method's return value and substitutes the previous value in the method call chain. This makes it possible to interact fluently with APIs that don't support method chaining for various reasons.
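In current Rust the same effect needs a one-off mut binding, as in this minimal sketch. The function name sorted_vec and the i32 element type are assumptions made for illustration.

```rust
// `sort` returns `()`, so it cannot continue a method chain today.
fn sorted_vec(iter: impl Iterator<Item = i32>) -> Vec<i32> {
    let mut this = iter.collect::<Vec<_>>();
    this.sort(); // the `()` result is discarded
    this // the previous value in the chain is handed back
}

fn main() {
    println!("{:?}", sorted_vec(vec![3, 1, 2].into_iter()));
}
```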

4. Method cascade

consume(&HashMap::new() .[
    insert("key1", val1),
    insert("key2", val2),
]);

This syntax allows values to be initialized without one-off mut bindings. It also saves writing an initialization macro or implementing the builder pattern where either would be used very rarely.
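For comparison, current Rust can express the cascade inline with a block expression, at the cost of the one-off mut binding. In this sketch consume is a hypothetical consumer, and the integer values replace val1 and val2 from the example.

```rust
use std::collections::HashMap;

// Hypothetical consumer standing in for `consume` in the example above.
fn consume(map: &HashMap<&str, i32>) -> i32 {
    map.values().sum()
}

fn total() -> i32 {
    // The block builds the map, and its last expression yields it, so the
    // `mut` binding does not leak into the surrounding scope.
    consume(&{
        let mut this = HashMap::new();
        this.insert("key1", 1);
        this.insert("key2", 2);
        this
    })
}

fn main() {
    println!("{}", total());
}
```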

5. Chain derivation

let sf = create_surface() .[
    draw_circle(ci_dimens).draw_rectangle(rect_dimens).finish()?,
    draw_something_custom(&this).finish()?,
];

This syntax allows errors to be handled in a separate chain. Some DSLs could adopt it instead of macros to integrate better with IDE autocompletion, with the guarantee that no hidden magic is involved.

6. Postfix macros

let x = long().method().[dbg!(this)].chain();

This syntax is the same as for the pipeline operator. Because it is explicit, it requires no changes on the macro declaration side, so every existing macro can be applied in postfix position.

7. Tuple reorganization

let (a, b) = ("c", "b", "a", "x").[(this.2, this.1)];

This syntax allows a tuple to be reorganized on the fly without introducing many temporary bindings during destructuring. It also helps to simplify tuples with a complex pattern before they are destructured.
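The two ways to get the same result in current Rust are sketched below. The function names are illustrative only.

```rust
// Option 1: destructure with placeholders for the unused fields.
fn reorder_by_pattern() -> (&'static str, &'static str) {
    let (_, b, a, _) = ("c", "b", "a", "x");
    (a, b)
}

// Option 2: a temporary binding plus tuple indexing, which is what
// `.[(this.2, this.1)]` would abbreviate.
fn reorder_by_binding() -> (&'static str, &'static str) {
    let this = ("c", "b", "a", "x");
    let (a, b) = (this.2, this.1);
    (a, b)
}

fn main() {
    println!("{:?} {:?}", reorder_by_pattern(), reorder_by_binding());
}
```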

Guide-level explanation

Description

The extended dot construct adds the ability to compose chainable and non-chainable expressions with the . operator. It can be seen as a combinator that drags its argument through a series of actions. Unlike a regular combinator, it does not use closures, has a distinguishable syntax, and properly handles all early-returning syntax constructs.

The grammar for it could be expressed as:

extended_dot_expr : expr "." "[" [ expr [ "," expr ]* ]? ","? "]"

It is based on brackets that are placed right after the . operator and take a comma-separated list of special expressions.

The expressions inside the brackets are called actions
The value placed before the . operator is called the receiver
The scope inside the brackets is called the extended dot scope

The rules of extended dot scope

1. Every action uses the receiver: it is always implicitly available inside an action under the alias this, which has the same properties as any regular mutable binding.

receiver .[
    action1(&this),
    action2(&mut this),
    action3(this),
];

receiver .[
    if action(this) {
        success()
    } else {
        error()
    },
];

2. An action can use the receiver implicitly: when an action begins with a number, a simple identifier, or a function call that does not take this as a parameter, this. is prepended to the action, producing tuple indexing, property access, or a method call respectively.

This implies:

it's an ergonomic improvement because in most cases an explicit this is not needed
external functions and bindings are shadowed by associated items and cannot be accessed
the chain always remains connected and its context cannot be switched to something external
a compile-time warning is produced when an action begins with this.

receiver .[
    method1(),          // The same as `this.method1()`
    this.method2(),     // Warning: explicit `this` is unnecessary
];

3. The explicit receiver becomes unavailable in braces nested inside the extended dot scope: even the braces of control flow constructs count, although in nested brackets, parentheses, etc. everything works as expected.

This implies:

it's hard to grow code unnecessarily in the horizontal direction, which would make it more intricate
overuse of the extended dot construct immediately becomes impractical
a compile-time error is produced when this occurs inside nested braces
a temporary binding or a function is the cleaner solution when this constraint is reached

receiver .[
    if this.method() {
        success()
    } else {
        error()
    },
];

receiver .[{
    this.method()      // Error: use of `this` in nested scope
}];

receiver .[
    if this.method() {
        this.success() // Error: use of `this` in nested scope
    } else {
        this.error()   // Error: use of `this` in nested scope
    }
];

4. The presence of a trailing comma determines the return value: without a trailing comma the result of the last action is returned, but with a trailing comma that result is dropped and the receiver is returned instead.

receiver .[
    action()           // Result of `this.action()` is returned
];

receiver .[
    action(),          // `this` value is returned instead
];

Additional examples

// Current Rust (notice that methods are invoked in different order)
let ag1 = apply_common_properties(ArgGroup::with_name("group1"))
    .multiple(true);
let ag2 = apply_common_properties(ArgGroup::with_name("group2"))
    .multiple(false);

// With this RFC (now all methods run sequentially)
let ag1 = ArgGroup::with_name("group1")
    .[apply_common_properties(this)]
    .multiple(true);
let ag2 = ArgGroup::with_name("group2")
    .[apply_common_properties(this)]
    .multiple(false);

// Current Rust (notice that `descr` is available in whole scope)
let var = nvim.get_var(key).map(|v| v.to_string())
    .or_else(|e| {
        let descr = e.to_string();
        if descr == format!("1 - Key '{}' not found", key)
        || descr == format!("1 - Key not found: {}", key) {
            Ok(String::from(default))
        } else {
            Err(e)
        }
    })?;

// With this RFC (now we can omit it entirely)
let var = nvim.get_var(key).map(|v| v.to_string())
    .or_else(|e| e.to_string() .[
        if this == format!("1 - Key '{}' not found", key)
        || this == format!("1 - Key not found: {}", key) {
            Ok(String::from(default))
        } else {
            Err(e)
        }
    ])?;

// Current Rust (notice that `msg` is available in whole scope)
let msg = Message::new("event");
dbg!(&msg);
notify_send_msg(&msg);
let resp = send_msg(&msg);

// With this RFC (now we see only essential things)
let resp = Message::new("event") .[
    dbg!(&this),
    notify_send_msg(&this),
    send_msg(&this)
];

// Current Rust (notice that `mut nvim_args` available in whole scope)
let mut nvim_args = String::new();
nvim_args.push_str("--cmd 'set shortmess+=I' ");
nvim_args.push_str("--listen ");
nvim_args.push_str(&nvim_child_listen_address.to_string_lossy());

// With this RFC (now it's immutable)
let nvim_args = String::new() .[
    push_str("--cmd 'set shortmess+=I' "),
    push_str("--listen "),
    push_str(&nvim_child_listen_address.to_string_lossy()),
];

// Current Rust (notice that mutable state and actions look the same)
fn main() {
    druid_win_shell::init();

    let mut file_menu = Menu::new();
    file_menu.add_item(COMMAND_EXIT, "E&xit");
    file_menu.add_item(COMMAND_OPEN, "O&pen");
    let mut menubar = Menu::new();
    menubar.add_dropdown(file_menu, "&File");

    let mut run_loop = win_main::RunLoop::new();
    let mut builder = WindowBuilder::new();
    let mut state = UiState::new();
    let foo1 = FooWidget.ui(&mut state);
    let foo1 = Padding::uniform(10.0).ui(foo1, &mut state);
    let foo2 = FooWidget.ui(&mut state);
    let foo2 = Padding::uniform(10.0).ui(foo2, &mut state);
    let button = Button::new("Press me").ui(&mut state);
    let buttonp = Padding::uniform(10.0).ui(button, &mut state);
    let button2 = Button::new("Don't press me").ui(&mut state);
    let button2p = Padding::uniform(10.0).ui(button2, &mut state);
    let root = Row::new().ui(&[foo1, foo2, buttonp, button2p], &mut state);
    state.set_root(root);
    state.add_listener(button, move |_: &mut bool, mut ctx| {
        println!("click");
        ctx.poke(button2, &mut "You clicked it!".to_string());
    });
    state.add_listener(button2, move |_: &mut bool, mut ctx| {
        ctx.poke(button2, &mut "Naughty naughty".to_string());
    });
    state.set_command_listener(|cmd, mut ctx| match cmd {
        COMMAND_EXIT => ctx.close(),
        COMMAND_OPEN => {
            let options = FileDialogOptions::default();
            let result = ctx.file_dialog(FileDialogType::Open, options);
            println!("result = {:?}", result);
        }
        _ => println!("unexpected command {}", cmd),
    });
    builder.set_handler(Box::new(UiMain::new(state)));
    builder.set_title("Hello example");
    builder.set_menu(menubar);
    let window = builder.build().unwrap();
    window.show();
    run_loop.run();
}

// With this RFC (now different things are separated)
fn main() {
    druid_win_shell::init();

    let file_menu = Menu::new() .[
        add_item(COMMAND_EXIT, "E&xit"),
        add_item(COMMAND_OPEN, "O&pen"),
    ];
    let menubar = Menu::new() .[
        add_dropdown(file_menu, "&File"),
    ];

    let mut run_loop = win_main::RunLoop::new();
    let mut builder = WindowBuilder::new();
    let mut state = UiState::new();
    let foo1 = FooWidget.ui(&mut state);
    let foo1 = Padding::uniform(10.0).ui(foo1, &mut state);
    let foo2 = FooWidget.ui(&mut state);
    let foo2 = Padding::uniform(10.0).ui(foo2, &mut state);
    let button = Button::new("Press me").ui(&mut state);
    let buttonp = Padding::uniform(10.0).ui(button, &mut state);
    let button2 = Button::new("Don't press me").ui(&mut state);
    let button2p = Padding::uniform(10.0).ui(button2, &mut state);
    let root = Row::new().ui(&[foo1, foo2, buttonp, button2p], &mut state);
    state .[
        set_root(root),
        add_listener(button, move |_: &mut bool, mut ctx| {
            println!("click");
            ctx.poke(button2, &mut "You clicked it!".to_string());
        }),
        add_listener(button2, move |_: &mut bool, mut ctx| {
            ctx.poke(button2, &mut "Naughty naughty".to_string());
        }),
        set_command_listener(|cmd, mut ctx| match cmd {
            COMMAND_EXIT => ctx.close(),
            COMMAND_OPEN => {
                let options = FileDialogOptions::default();
                let result = ctx.file_dialog(FileDialogType::Open, options);
                println!("result = {:?}", result);
            }
            _ => println!("unexpected command {}", cmd),
        }),
    ];
    let window = builder .[
        set_handler(Box::new(UiMain::new(state))),
        set_title("Hello example"),
        set_menu(menubar),
        build().unwrap()
    ];
    window.show();
    run_loop.run();
}

Reference-level explanation

Desugaring

Capturing receiver value:

*here and below, _id stands for a uniquely named identifier.

 a.b.[
 ];
     ⇒
 {
     let mut _id = a.b
 };

Inserting provided actions:

*a warning should be produced when a single action is provided without a trailing comma.

 a.b.[
     c(),
     d,
     e(this)
 ];
     ⇒
 {
     let mut _id = a.b
     ;{ c()     }
     ;{ d       }
     ;{ e(this) }
 };

Prepending implicit this to actions:

*prepending should be skipped for language constructs.
*prepending should be skipped for methods that already take this as a parameter.
*prepending should be skipped for actions that already begin with this.
*a warning should be produced when an action already begins with this.

 a.b.[
     0,
     c(),
     d,
     this.0,
     this.c(),
     this.d,
     e(this),
     !f()
 ];
     ⇒
 {
     let mut _id = a.b
     ;{ _id.0    }
     ;{ _id.c()  }
     ;{ _id.d    }
     ;{ this.0   }
     ;{ this.c() }
     ;{ this.d   }
     ;{ e(this)  }
     ;{ !f()     }
 };

Replacing this with _id inside of actions:

*replacing should be skipped for this placed inside of braces.

 a.b.[
     this.0,
     this.c(),
     this.d,
     e(this),
     !this,
     {this},
     unsafe { this }
 ];
     ⇒
 {
     let mut _id = a.b
     ;{ _id.0           }
     ;{ _id.c()         }
     ;{ _id.d           }
     ;{ e(_id)          }
     ;{ !_id            }
     ;{ {this}          }
     ;{ unsafe { this } }
 };

Ensuring that _id is used inside of all actions:

*an error should be produced when an action does not use _id

 a.b.[
     ,
     (),
     {x()},
 ];
     ⇒
 {
     let mut _id = a.b
     ;{     ERROR }
     ;{ ()  ERROR }
     ;{ x() ERROR }
 };

Overriding return value:

*when a trailing comma is provided, the receiver should be returned instead of the last action's result

 a.b.[
     c(),
 ];
     ⇒
 {
     let mut _id = a.b
     ;{ _id.c() }
     _id
 };
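
The desugaring steps above can be approximated in current Rust with a declarative macro. The following is only a sketch under stated assumptions: the macro name dot! and the "receiver => name;" header are hypothetical, macro hygiene forces the receiver name to be passed in explicitly, and neither implicit this. prepending nor the nested-brace check is modelled.

```rust
// Hypothetical macro approximating the desugaring; not part of this RFC.
macro_rules! dot {
    // Last action without a trailing comma: its result is the value.
    (@actions $this:ident; $last:expr) => {
        $last
    };
    // Last action with a trailing comma: drop its result, return the receiver.
    (@actions $this:ident; $last:expr,) => {{
        let _ = $last;
        $this
    }};
    // More actions follow: run the first one, then recurse on the rest.
    (@actions $this:ident; $head:expr, $($rest:tt)+) => {{
        let _ = $head;
        dot!(@actions $this; $($rest)+)
    }};
    // Entry point: bind the receiver once, like `let mut _id = a.b`.
    ($recv:expr => $this:ident; $($rest:tt)+) => {{
        #[allow(unused_mut)]
        let mut $this = $recv;
        dot!(@actions $this; $($rest)+)
    }};
}

// Trailing comma: the receiver (the String) is returned.
fn joined() -> String {
    dot!(String::new() => this;
        this.push_str("a"),
        this.push_str("b"),
    )
}

// No trailing comma: the result of the last action is returned.
fn joined_len() -> usize {
    dot!(String::new() => this;
        this.push_str("a"),
        this.push_str("bc"),
        this.len()
    )
}

fn main() {
    println!("{} {}", joined(), joined_len());
}
```

Since the macro expands inline rather than through closures, a ? inside an action would still return from the enclosing function, which matches the early-return behaviour described for the proposed syntax.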

Warnings

warning: unnecessary extended dot scope
 --> src/main.rs:0:0
  |
0 |     x.[y()].z()
  |       ^^^^^ help: remove these brackets
  |
  = note: #[warn(unused_extended_dot_scope)] on by default

warning: unnecessary extended dot receiver
 --> src/main.rs:0:0
  |
0 |     x.[this.y()].z()
  |        ^^^^^ help: remove explicit receiver
  |
  = note: #[warn(unused_extended_dot_receiver)] on by default

Errors

error[E0000]: use of `this` inside of braces nested in extended dot scope
  --> src/main.rs:0:0
   |
0  |     x.[{this.y()}]
   |        ^ braces in extended dot scope open here
0  |
0  |     x.[{this.y()}]
   |         ^^^^ `this` is used here
   |
   = note: using `this` inside of braces nested in extended dot scope is prohibited because it allows writing imperative and unnecessarily intricate code. Consider introducing a temporary binding or moving the action expression into a separate function instead

error[E0000]: use of `this` outside of extended dot scope
 --> src/main.rs:0:0
  |
0 |     this.z()
  |     ^^^^ `this` is used here
  |
  = note: `this` could be used only inside of extended dot scope

error[E0000]: side-effect in extended dot scope
 --> src/main.rs:0:0
  |
0 |     x.[].y()
  |       ^ side-effect applied here
  |
  = note: all extended dot actions should use the receiver value explicitly or implicitly because the chain would break otherwise.

Drawbacks

This further complicates the language
This requires introducing yet another language keyword
This allows writing non-self-documenting and more compressed code
The trailing comma has an impact on control flow

Rationale and alternatives

This RFC probably provides the most compact and easiest-to-use syntax possible for this functionality. It also resolves many questions about introducing postfix macros, introducing unified method call syntax, and providing combinators that transform self.

Without it, some features could be implemented in different and less uniform ways, and some features most likely wouldn't be implemented at all.

Alternatives:

always use an explicit this receiver
use a differently named receiver
use only methods, without allowing calls to external functions
use braces or parentheses instead of brackets
use a regular scope instead of a list of comma-separated expressions
use a postfix macro that provides similar functionality
use a regular or procedural macro that provides similar functionality
copy a lot of syntax constructs from other languages
introduce nothing and stick to the current syntax

Prior art

In its current form this syntax has never been used in any programming language. Kotlin provides the same functionality through scope functions such as run, with, let, apply, and also, which can be used with any Kotlin object. Overall, the experience with them has not been very good because of generic names that do not describe intent well, the difficulty of telling them apart from domain-specific functions, and the difficulty of spotting where an implicit or explicit receiver comes from. In Kotlin, however, they are easy to understand once the rest of its idioms are known, they look consistent, and disciplined programmers would still write proper code.

This RFC takes all of the encountered problems into account, and the proposed syntax is not affected by them, at the price of some sacrificed functionality, a bit more complexity, and a divergent syntax.

Unresolved questions

Should we implement any alternative syntax?
Should we implement it as a macro instead?
Should we have any different code style for it?

Future possibilities

It looks like nothing more could be invented here