Pattern matching updates

20 April 2021

Note:
This proposal was formerly being authored and championed by Kat Marchán, before they left TC39.

A new group of champions has taken the proposal back up with a new direction.


Champions:

Mark Cohen, Tab Atkins-Bittner, Jordan Harband, Yulia Startsev, Daniel Rosenwasser, Jack Works, Ross Kirsling

Note:
Thank you to all the champions for their hard work and thoughtful contributions!

Also thank you to Kat, for all the hard work they did before this group took up the proposal.


This is an update, not a request for stage advancement

Note:
As a champions group, while we are presenting what we think the best version of this construct is, we're not married to particular semantics, syntax, or spellings. We're not seeking advancement of the examples we present, we're just showing you where we are.


Priorities

Note:
Let's step through the priorities that we had in mind while designing this proposal.


Priority: pattern matching

  • Many ways to match values in JS
  • No way to match patterns
  • Construct contains more than just patterns
  • We will prioritize the ergonomics of patterns where conflicts arise

Note:
This might be "obvious", but it's worth stating explicitly.

This proposal is a whole conditional logic construct, more than just patterns.

Where we have to make a trade-off decision involving ergonomics, we will prioritize the use cases of patterns.


Priority: subsumption of switch

  • Easily googleable; zero syntactic overlap
  • Reduce reasons to reach for switch
  • Preserve the good parts of switch

Note:
We feel that any syntactic overlap with switch will produce confusion, and significantly hinder the discoverability and googleability of pattern matching.

After this proposal lands, we'd like there to no longer be much of a reason to reach for switch. switch is a frequent source of confusion for developers and bugs in production.

switch is pretty ergonomic for working with tagged unions; we'd like to ensure pattern matching is equally or more ergonomic for those use cases.


Priority: be better than switch

  • No more footguns
  • New capabilities

Note:
switch has many footguns. The big one is that fall-through is opt-out; forgetting a break statement is easy to do but hard to debug. Omitting curly braces in case clauses puts every declaration into one scope shared by the whole switch body, which is usually surprising.

It's difficult to work with untagged unions with switch. We'd like untagged unions to also be ergonomic with pattern matching.
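
Both switch footguns named above can be reproduced in today's JavaScript. This is a minimal sketch; the function names and values are invented for illustration.

```javascript
// Footgun 1: fall-through is opt-out. The missing `break` after
// case "b" lets execution continue into case "c".
function classify(letter) {
  const seen = [];
  switch (letter) {
    case "a":
      seen.push("a");
      break;
    case "b":
      seen.push("b");
      // forgot `break` here
    case "c":
      seen.push("c");
      break;
  }
  return seen;
}

console.log(classify("b")); // an array holding "b" and "c"

// Footgun 2: without curly braces, all case clauses share one block scope,
// so a `let` declared under one case is visible (but uninitialized) in another.
function describe(kind) {
  switch (kind) {
    case "one":
      let label = "first";
      return label;
    case "two":
      try {
        return label; // same binding as above, still in its temporal dead zone
      } catch (err) {
        return err.name;
      }
  }
}

console.log(describe("two")); // ReferenceError
```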


Priority: expression semantics

  • Pattern matching construct should be usable as an expression
    • return match { ... }
    • let foo = match { ... }
    • etc

Note:
In languages that have robust pattern matching, using the construct as an expression is intuitive and concise. We believe this can be achieved in JS.


Priority: exhaustiveness and ordering

  • Fall-through and "no match" should be opt-in, not opt-out
  • Execution order should never be surprising

Note:
This is mostly about matching (get it?) what we believe to be developers' expectations.

If the developer wants two cases to share logic (what we know as "fall-through" from switch), they should specify it explicitly. A parser error reminding you to do so is less harmful than silently accepting buggy code.

If the developer wants to ignore certain possible cases, they should specify that explicitly. A development-time error is less costly than a production-time error from something further down the stack.

Matches should always be checked in the order they're written, from top to bottom.


Priority: user-extensibility

  • Userland objects and classes should be able to encapsulate their own matching semantics

Note:
This grew out of our thinking around regular expressions. Surely, it would make sense to use regexes as patterns; and surely, if the regex has named capture groups, it would make sense to have those available as bindings.

We (TC39) can treat this as a magic special case, or we can provide a generic standard by which developers can integrate userland objects with this language construct. We (champions) believe providing that generic standard would be a boon to developer ergonomics, especially for libraries and SDKs.


Questions before we see some syntax?


    match (res) {
      when ({ status: 200, body, ...rest }) {
        handleData(body, rest);
      }
      when ({ status: 301 | 304, destination: url }) {
        handleRedirect(url);
      }
      when ({ status: 404 }) { retry(req) }
      else { throwSomething() }
    }

Note:
Code is making an HTTP request

Whole construct is a match construct

We have four match clauses

Each clause contains a pattern (except else)

Patterns that use object or array destructuring yield bindings

There are other ways to get bindings, which we will discuss later
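
For comparison, here is roughly what the four clauses look like when written by hand in current JavaScript. This is a sketch under assumptions: handleData, handleRedirect, and retry are stubbed so the code runs standalone, handleResponse is an invented wrapper, and the first branch only tests status (whether a missing body should fail the match is left aside).

```javascript
// Stub handlers (hypothetical) that record which branch ran.
const handleData = (body, rest) => ({ kind: "data", body, rest });
const handleRedirect = (url) => ({ kind: "redirect", url });
const retry = (req) => ({ kind: "retry", req });

function handleResponse(res, req) {
  // when ({ status: 200, body, ...rest })
  if (res.status === 200) {
    const { status, body, ...rest } = res; // rest excludes status and body
    return handleData(body, rest);
  }
  // when ({ status: 301 | 304, destination: url })
  if (res.status === 301 || res.status === 304) {
    const { destination: url } = res; // rename destination to url
    return handleRedirect(url);
  }
  // when ({ status: 404 })
  if (res.status === 404) {
    return retry(req);
  }
  // else
  throw new Error(`unhandled status ${res.status}`);
}

console.log(handleResponse({ status: 301, destination: "/new" }, null).url); // /new
```

The test and the destructuring are written separately in each branch; the pattern syntax folds the two into one step.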


    match (res) {
//  match (matchable) {

Note:
The thing being "matched on" is the matchable


    when ({ status: 200, body, ...rest }) {
//  when (pattern) { ... }
//  ───────↓────── ───↓───
//        LHS        RHS (sugar for do-expression)
//  ───────────↓──────────
//          clause

Note:
A clause consists of the when keyword, a pattern inside parentheses, and a do-expression

This pattern uses object destructuring syntax, which should Just Work

On top of the existing object destructuring syntax, you can also have patterns on the RHS of the colon

Patterns can introduce bindings; this one introduces body and rest (status is only tested against 200, not bound)


    when ({ status: 301 | 304, destination: url }) {
//  ↳ pipe is logical OR
//  ↳ `url` is an irrefutable match, functions as renaming

Note:
This pattern contains a pipe, which is the logical OR pattern combinator

Patterns can be nested!

destination: url is effectively a rename. url is an irrefutable match: it matches any value for destination and binds that value to the name url.

In general, bare variable names are irrefutable matches.


    else { ... }
//  ↳ cannot coexist with top-level irrefutable
//     match, e.g. `when (foo)`

Note:
else is a special fallback clause which matches anything. This is analogous to default in switch statements.

A top-level irrefutable match is also a fallback clause. It should be an early error to have multiple fallback clauses, or to have any clauses after the fallback clause.


match (command) {
  when (["go", ("N" | "E" | "W" | "S") as dir]) { ... }
  when (["take", item]) { ... }
  else { ... }
}

Note:
Code is a text-based adventure game

Here we see array destructuring, which works basically as expected

Here we see the as keyword, which can introduce intermediary bindings


match (res) {
  if (isEmpty(res)) { ... }
  when ({ numPages, data }) if (numPages > 1) { ... }
  when ({ numPages, data }) if (numPages === 1) { ... }
  else { ... }
}

Note:
Code is fetching from a paginated endpoint

Here we see guards, which provide additional conditional logic where patterns aren't expressive enough.


match (res) {
  if (isEmpty(res)) { ... }
  when ({ data: [ page ] }) { ... }
  when ({ data: [ frontPage, ...pages ] }) { ... }
  else { ... }
}

Note:
This is another way to write the previous code sample without a guard, and without checking the page count.

First when clause matches if data has exactly one element

Second when clause matches if data has at least one element. It gives the first page its own binding, say for presentational purposes.

This also shows off recursive nesting! Patterns can contain patterns.
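
The length checks implied by the two array patterns can be spelled out in current JavaScript. A minimal sketch, assuming data is a plain array (Array.isArray is a simplification here) and leaving out the isEmpty guard; summarize is an invented name.

```javascript
function summarize(res) {
  const data = res?.data;

  // when ({ data: [ page ] }): exactly one element
  if (Array.isArray(data) && data.length === 1) {
    const [page] = data;
    return { single: page };
  }

  // when ({ data: [ frontPage, ...pages ] }): at least one element
  if (Array.isArray(data) && data.length >= 1) {
    const [frontPage, ...pages] = data;
    return { frontPage, remaining: pages.length };
  }

  // else
  return null;
}

console.log(summarize({ data: ["p1", "p2", "p3"] }).remaining); // 2
```

Order matters: the one-element case has to come first, because the second check would also accept it.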


match (arithmeticStr) {
  when (/(?<left>\d+) \+ (?<right>\d+)/) { ... }
}

Note:
Code is a very bad arithmetic expression parser

Regexes are patterns, with the semantics you'd expect

Named capture groups should be able to introduce bindings to the RHS

Unless we want regexes to be a magic special case, we have to provide a protocol

Regex named capture groups will likely remain a smaller special case, in that they're the only thing that can introduce bindings by themselves. It's an open question whether bare regex literals will require an as or not.
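
Today, the same regex gives access to the named groups through the groups property of the match result. A minimal sketch; parseAddition is an invented name, and returning undefined for "no clause matched" is an assumption of this example.

```javascript
function parseAddition(arithmeticStr) {
  const result = /(?<left>\d+) \+ (?<right>\d+)/.exec(arithmeticStr);
  if (result === null) {
    return undefined; // nothing matched
  }
  // These are the names the clause body would see as bindings.
  const { left, right } = result.groups;
  return Number(left) + Number(right);
}

console.log(parseAddition("1 + 2")); // 3
```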


const LF = 0x0a;
const CR = 0x0d;
match (token) {
  when (^LF | ^CR) { ... }
}

Note:
Code is a lexer of some kind

Here we see the pin operator, which is an escape-hatch from irrefutable matches.

Without the pin operator, LF and CR would be irrefutable matches that introduce a binding that shadows the two constants.

With the pin operator, LF and CR are evaluated, and since they evaluate to primitives, matching is performed against the stored constants.
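
The shadowing problem already exists in current JavaScript wherever an identifier sits in a binding position. A minimal sketch; bindsAnything and isLineBreak are invented names, and using === for the comparison is an assumption of this example.

```javascript
const LF = 0x0a;
const CR = 0x0d;

// A bare identifier in a binding position always creates a new binding.
// The parameter named LF shadows the outer constant and accepts any value,
// which is how an unpinned LF pattern would behave.
function bindsAnything(LF) {
  return LF;
}

// Comparing against the evaluated constants is what the pinned form asks for.
function isLineBreak(token) {
  return token === LF || token === CR;
}

console.log(bindsAnything(0x41)); // 65
console.log(isLineBreak(0x41)); // false
console.log(isLineBreak(0x0d)); // true
```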


class Name {
  static [Symbol.matcher](matchable) {
    const pieces = matchable.split(" ");
    if (pieces.length === 2) {
      return pieces;
    }
  }
}

match ("Tab Atkins-Bittner") {
  when (^Name with [ first, last ]) if (last.includes('-')) { ... }
  when (^Name with [ first, last ]) { ... }
}

Note:
Code declares the matcher protocol on an imaginary class, implementing a very bad name parser, then matches hyphenated last names separately from non-hyphenated ones.

Here we see the other use of the pin operator, which is to invoke the matcher protocol.

This operator will probably need to immediately precede an identifier or a parenthesized expression.

We also see the with keyword, which is used to pattern-match the value returned by the matcher protocol.

This operator is probably the thing we're least happy with, as a champions group. This turns out to be a hard problem to solve. Prior art is a bit of a mixed bag; this is Elixir's approach. We're very open to other spellings and other ideas.
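
The protocol can be emulated in userland to get a feel for it. This is a sketch under assumptions: Symbol.matcher does not exist today, so a local stand-in symbol is used; the hook is static because it is looked up on the class itself; "return undefined for no match" is inferred from the slide, and describeName is an invented helper.

```javascript
// Stand-in for the proposed Symbol.matcher (assumption: not a real built-in).
const matcher = Symbol("matcher");

class Name {
  // Looked up on the class, so it is declared static in this sketch.
  static [matcher](matchable) {
    if (typeof matchable !== "string") return undefined;
    const pieces = matchable.split(" ");
    if (pieces.length === 2) return pieces;
    return undefined; // no match
  }
}

// Roughly what the two clauses would do: call the hook, then destructure
// its result (the `with` part), then apply the guard.
function describeName(input) {
  const result = Name[matcher](input);
  if (result !== undefined) {
    const [first, last] = result;
    if (last.includes("-")) return `${first} (hyphenated: ${last})`;
    return `${first} ${last}`;
  }
  return "not a name";
}

console.log(describeName("Tab Atkins-Bittner")); // Tab (hyphenated: Atkins-Bittner)
```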


Lightning round: add-ons

Note:
Finally, before we go to the queue, let's run through some potential add-ons.


async match

async match (await auth()) {
  when ({ user: ^(await getUser()) }) { ... }
  else { await getError() }
}

Note:
This is the add-on that we're most jazzed about

Should be fairly simple from spec and implementation perspectives

Allows await anywhere inside the construct, and the whole expression produces a Promise


& combinator

match (getFromDB()) {
  when (^FancyError) { ... }
  when (^AggregateError & { errors: [ ^TypeError, ...rest ] }) { ... }
  when (^AggregateError) { ... }
}

Note:
The OR combinator (|) that we saw earlier tries patterns until one succeeds; the AND combinator (&) tries patterns until one fails

Semantics are still unclear

Allows for more expressive match clauses without having to reach for guards
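
Written by hand today, the middle clause is a chain of && checks that stops at the first failure. A sketch under assumptions: pinned classes are treated as instanceof checks here, which the slides do not specify; FancyError and classifyError are invented for the example.

```javascript
class FancyError extends Error {}

function classifyError(err) {
  // when (^FancyError)
  if (err instanceof FancyError) return "fancy";

  // when (^AggregateError & { errors: [ ^TypeError, ...rest ] })
  // Each operand is checked in turn; the first failure ends the attempt.
  if (
    err instanceof AggregateError &&
    Array.isArray(err.errors) &&
    err.errors.length >= 1 &&
    err.errors[0] instanceof TypeError
  ) {
    return "aggregate starting with TypeError";
  }

  // when (^AggregateError)
  if (err instanceof AggregateError) return "aggregate";

  return "unmatched";
}

console.log(classifyError(new AggregateError([new Error("x")]))); // aggregate
```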


Nil pattern

match (someArr) {
  when ([_, _, someVal]) { ... }
}

Note:
Most languages that have structural pattern matching have the concept of a "nil matcher", which fills a hole in a data structure without creating a binding.

In JS, the primary use-case would be skipping elements in arrays. This is already covered in destructuring by simply leaving the slot between the commas empty, with no identifier of any kind.

With that in mind, and given how contentious this add-on is, we would only pursue it if we saw strong support for it.
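
The existing destructuring form mentioned above looks like this; someArr holds made-up values.

```javascript
const someArr = ["a", "b", "c"];

// Two empty slots skip the first two elements without creating bindings.
const [, , someVal] = someArr;

console.log(someVal); // c
```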


catch guards

try {
  doSomething();
} catch match (err) {
  if (err instanceof RangeError) { ... }
  when (/^abc$/) { ... }
  // default: else { throw err; }
}

Note:
This is hopefully a pretty simple one; it's just sugar for catch (err) { match (err) { } }.

There would also be a slight change in semantics, which is that on a non-exhaustive match, we re-throw the caught error, rather than generating a new error.
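
The desugaring and the re-throw behavior can be approximated in current JavaScript. A minimal sketch; run is an invented wrapper, and the typeof check before the regex test is an assumption about how a regex pattern would treat non-string values.

```javascript
function run(doSomething) {
  try {
    doSomething();
    return "ok";
  } catch (err) {
    // if (err instanceof RangeError) { ... }
    if (err instanceof RangeError) return "range";
    // when (/^abc$/) { ... }
    if (typeof err === "string" && /^abc$/.test(err)) return "abc";
    // implicit default: re-throw the same error instead of creating a new one
    throw err;
  }
}

console.log(run(() => { throw new RangeError("out of range"); })); // range
```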


Questions?


Thank you!
