<style type="text/css">
* {
text-align: left;
}
</style>
# Pattern matching updates
## 20 April 2021
Note:
This proposal was formerly authored and championed by Kat Marchán, before they left TC39.
A new group of champions has taken the proposal back up with a new direction.
---
## Champions:
Mark Cohen, Tab Atkins-Bittner, Jordan Harband, Yulia Startsev, Daniel Rosenwasser, Jack Works, Ross Kirsling
Note:
Thank you to all the champions for their hard work and thoughtful contributions!
Also thank you to Kat, for all the hard work they did before this group took up the proposal.
---
## This is an update, not a request for stage advancement
Note:
As a champions group, while we are presenting what we think the best version of this construct is, we're not married to particular semantics, syntax, or spellings. We're not seeking advancement of the examples we present, we're just showing you where we are.
---
## Priorities
Note:
Let's step through the priorities that we had in mind while designing this proposal.
---
### Priority: *pattern* matching
- Many ways to match values in JS
- No way to match patterns
- Construct contains more than just patterns
- We will prioritize the ergonomics of patterns where conflicts arise
Note:
This might be "obvious", but it's worth stating explicitly.
This proposal is a whole conditional logic construct, more than just patterns.
Where we have to make a trade-off decision involving ergonomics, we will prioritize the use cases of patterns.
---
### Priority: subsumption of `switch`
- Easily googleable; zero syntactic overlap
- Reduce reasons to reach for `switch`
- Preserve the good parts of `switch`
Note:
We feel that any syntactic overlap with `switch` will produce confusion, and significantly hinder the discoverability and googleability of pattern matching.
After this proposal lands, we'd like there to no longer be much of a reason to reach for `switch`. `switch` is a frequent source of confusion for developers and bugs in production.
`switch` is pretty ergonomic for working with tagged unions; we'd like to ensure pattern matching is equally or more ergonomic for those use cases.
---
### Priority: be better than `switch`
- No more footguns
- New capabilities
Note:
`switch` has many footguns. The big one is that fall-through is opt-out; forgetting a `break` statement is easy to do but hard to debug. Omitting curly braces in a `case` clause scopes its declarations to the entire `switch` block, which is usually surprising.
It's difficult to work with untagged unions with `switch`. We'd like untagged unions to also be ergonomic with pattern matching.
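Both footguns can be reproduced in a few lines of today's JavaScript. This is only an illustrative sketch; `describeStatus` and `scoped` are made-up names, not part of the proposal.

```javascript
// Footgun 1: fall-through is the default, so a forgotten `break`
// silently runs the next case as well.
function describeStatus(status) {
  const log = [];
  switch (status) {
    case 200:
      log.push("ok");
      // missing `break` here
    case 404:
      log.push("not found");
      break;
    default:
      log.push("unknown");
  }
  return log;
}

// Footgun 2: every case shares one block scope, so a `let` declared in
// one case is visible (but uninitialized) in the others.
function scoped(kind) {
  switch (kind) {
    case "a":
      let label = "first";
      return label;
    case "b":
      try {
        return label; // same binding as above, still in its temporal dead zone
      } catch (e) {
        return e.name;
      }
  }
}

console.log(describeStatus(200)); // [ 'ok', 'not found' ]
console.log(scoped("b")); // ReferenceError
```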
---
### Priority: expression semantics
- Pattern matching construct should be usable as an expression
- `return match { ... }`
- `let foo = match { ... }`
- etc
Note:
In languages that have robust pattern matching, using the construct as an expression is intuitive and concise. We believe this can be achieved in JS.
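For contrast, this is roughly what it takes today to get a value out of `switch`, since it is a statement. A sketch only; `sizeLabel` is a hypothetical helper.

```javascript
// `switch` cannot appear on the right-hand side of `=`, so the usual
// workaround is to wrap it in an immediately-invoked function.
function sizeLabel(n) {
  const label = (() => {
    switch (true) {
      case n === 0:
        return "empty";
      case n === 1:
        return "single";
      default:
        return "many";
    }
  })();
  return `size: ${label}`;
}

console.log(sizeLabel(1)); // size: single
```

With expression semantics, the wrapper function goes away and the construct sits directly after `=` or `return`.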
---
### Priority: exhaustiveness and ordering
- Fall-through and "no match" should be opt-in, not opt-out
- Execution order should never be surprising
Note:
This is mostly about matching (get it?) what we believe to be developers' expectations.
If the developer wants two cases to share logic (what we know as "fall-through" from `switch`), they should specify it explicitly. A parser error reminding you to do so is less harmful than silently accepting buggy code.
If the developer wants to ignore certain possible cases, they should specify that explicitly. A development-time error is less costly than a production-time error from something further down the stack.
Matches should always be checked in the order they're written, from top to bottom.
---
### Priority: user-extensibility
- Userland objects and classes should be able to encapsulate their own matching semantics
Note:
This grew out of our thinking around regular expressions. Surely, it would make sense to use regexes as patterns; and surely, if the regex has named capture groups, it would make sense to have those available as bindings.
We (TC39) can treat this as a magic special case, or we can provide a generic standard by which developers can integrate userland objects with this language construct. We (champions) believe providing that generic standard would be a boon to developer ergonomics, especially for libraries and SDKs.
---
## Questions before we see some syntax?
---
```javascript
match (res) {
  when ({ status: 200, body, ...rest }) {
    handleData(body, rest);
  }
  when ({ status: 301 | 304, destination: url }) {
    handleRedirect(url);
  }
  when ({ status: 404 }) { retry(req) }
  else { throwSomething() }
}
```
Note:
Code is making an HTTP request
Whole construct is a **match construct**
We have four **match clauses**
Each clause contains a pattern (except else)
Patterns that use object or array destructuring yield bindings
There are other ways to get bindings, which we will discuss later
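For comparison, a rough hand-written equivalent in today's JavaScript. This is a sketch, not the proposal's specified semantics: it assumes `res` is an object, it does not check that `body` or `destination` are present, and it returns tagged values instead of calling the slide's handlers. `handleResponse` is a hypothetical name.

```javascript
function handleResponse(res) {
  // when ({ status: 200, body, ...rest })
  if (res.status === 200) {
    // `status` is tested, not bound; pull it out so `rest` excludes it
    const { status: _ignored, body, ...rest } = res;
    return { kind: "data", body, rest };
  }
  // when ({ status: 301 | 304, destination: url })
  if (res.status === 301 || res.status === 304) {
    const { destination: url } = res; // rename, as in destructuring
    return { kind: "redirect", url };
  }
  // when ({ status: 404 })
  if (res.status === 404) {
    return { kind: "retry" };
  }
  // else
  throw new Error(`unhandled status ${res.status}`);
}

console.log(handleResponse({ status: 301, destination: "/new" }));
// { kind: 'redirect', url: '/new' }
```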
---
```javascript
match (res) {
// match (matchable) {
```
Note:
The thing being "matched on" is the **matchable**
---
```javascript
when ({ status: 200, body, ...rest }) {
// when (pattern) { ... }
// ───────↓────── ───↓───
//       LHS        RHS (sugar for do-expression)
// ───────────↓──────────
//          clause
```
Note:
A **clause** consists of the `when` keyword, a pattern inside parentheses, and a do-expression
This pattern uses object destructuring syntax, which should Just Work
On top of the existing object destructuring syntax, you can also have patterns on the RHS of the colon
Patterns can introduce bindings; this one introduces `body` and `rest` (`status: 200` only tests the value and binds nothing)
---
```javascript
when ({ status: 301 | 304, destination: url }) {
// ↳ pipe is logical OR
// ↳ `url` is an irrefutable match, functions as renaming
```
Note:
This pattern contains a pipe, which is the logical OR pattern combinator
Patterns can be nested!
`destination: url` is effectively a rename. `url` is an **irrefutable match**: it matches any value for `destination` and binds that value to the name `url`.
In general, bare variable names are irrefutable matches.
---
```javascript
else { ... }
// ↳ cannot coexist with top-level irrefutable
// match, e.g. `when (foo)`
```
Note:
`else` is a special **fallback clause** which matches anything. This is analogous to `default` in `switch` statements.
A top-level irrefutable match is **also a fallback clause**. It should be an early error to have multiple fallback clauses, or to have any clauses after the fallback clause.
---
```javascript
match (command) {
  when (["go", ("N" | "E" | "W" | "S") as dir]) { ... }
  when (["take", item]) { ... }
  else { ... }
}
```
Note:
Code is a text-based adventure game
Here we see array destructuring, which works basically as expected
Here we see the `as` keyword, which can introduce intermediary bindings
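A hand-written sketch of the same logic in today's JavaScript, to show what the array patterns and `as` save you from writing. It assumes an array pattern without a rest element requires an exact length; `interpret` and the returned objects are made up for illustration.

```javascript
function interpret(command) {
  if (!Array.isArray(command)) return { action: "unknown" };

  // when (["go", ("N" | "E" | "W" | "S") as dir])
  if (
    command.length === 2 &&
    command[0] === "go" &&
    ["N", "E", "W", "S"].includes(command[1])
  ) {
    const dir = command[1]; // what `as dir` would bind
    return { action: "go", dir };
  }

  // when (["take", item])
  if (command.length === 2 && command[0] === "take") {
    const [, item] = command;
    return { action: "take", item };
  }

  // else
  return { action: "unknown" };
}

console.log(interpret(["go", "N"])); // { action: 'go', dir: 'N' }
```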
---
```javascript
match (res) {
  if (isEmpty(res)) { ... }
  when ({ numPages, data }) if (numPages > 1) { ... }
  when ({ numPages, data }) if (numPages === 1) { ... }
  else { ... }
}
```
Note:
Code is fetching from a paginated endpoint
Here we see **guards**, which provide additional conditional logic where patterns aren't expressive enough.
---
```javascript
match (res) {
  if (isEmpty(res)) { ... }
  when ({ data: [ page ] }) { ... }
  when ({ data: [ frontPage, ...pages ] }) { ... }
  else { ... }
}
```
Note:
This is another way to write the previous code sample, without guards on the `when` clauses and without checking the page count.
First `when` clause matches if `data` has **exactly one** element
Second `when` clause matches if `data` has **at least one** element; because the single-element case is already handled above, it only ever sees two or more. It gives the first page its own binding, say for presentational purposes.
This also shows off recursive nesting! Patterns can contain patterns.
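The length checks that those two array patterns imply can be spelled out by hand. A sketch in today's JavaScript; `summarize` is a hypothetical name and `isEmpty` here is a stand-in for whatever the slide's helper does.

```javascript
// Stand-in for the slide's `isEmpty`; an assumption, not part of the proposal.
const isEmpty = (res) =>
  !res || !Array.isArray(res.data) || res.data.length === 0;

function summarize(res) {
  // if (isEmpty(res))
  if (isEmpty(res)) return "empty";

  // when ({ data: [ page ] }): exactly one element
  if (res.data.length === 1) {
    const [page] = res.data;
    return `one page: ${page}`;
  }

  // when ({ data: [ frontPage, ...pages ] }): at least one element
  if (res.data.length >= 1) {
    const [frontPage, ...pages] = res.data;
    return `${frontPage} plus ${pages.length} more`;
  }

  // else
  return "unmatched";
}

console.log(summarize({ data: ["a", "b", "c"] })); // a plus 2 more
```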
---
```javascript
match (arithmeticStr) {
  when (/(?<left>\d+) \+ (?<right>\d+)/) { ... }
}
```
Note:
Code is a very bad arithmetic expression parser
Regexes are patterns, with the semantics you'd expect
Named capture groups should be able to introduce bindings to the RHS
Unless we want regexes to be a magic special case, we have to provide a protocol
Regex named capture groups will likely remain a small special case, in that they're the only thing that can introduce bindings by themselves. It's an open question whether bare regex literals will require an `as` or not.
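What the clause would replace, written with today's `RegExp.prototype.exec` and named groups. A sketch; `add` is a hypothetical name and the addition is just a plausible body for the elided RHS.

```javascript
function add(arithmeticStr) {
  const m = /(?<left>\d+) \+ (?<right>\d+)/.exec(arithmeticStr);
  if (m === null) return undefined; // no clause matched

  // The named capture groups are what the pattern would expose as bindings.
  const { left, right } = m.groups;
  return Number(left) + Number(right);
}

console.log(add("1 + 2")); // 3
```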
---
```javascript
const LF = 0x0a;
const CR = 0x0d;
match (token) {
  when (^LF | ^CR) { ... }
}
```
Note:
Code is a lexer of some kind
Here we see the **pin operator**, which is an escape-hatch from irrefutable matches.
Without the pin operator, `LF` and `CR` would be irrefutable matches that introduce a binding that shadows the two constants.
With the pin operator, `LF` and `CR` are evaluated, and since they evaluate to primitives, matching is performed against the stored constants.
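The difference between the two readings can be mimicked in today's JavaScript. A sketch that assumes primitives are compared with `===`; `unpinned` and `isLineBreak` are made-up names.

```javascript
const LF = 0x0a;
const CR = 0x0d;

// Without the pin: `LF` behaves like a fresh binding that accepts anything
// and shadows the outer constant.
function unpinned(token) {
  const LF = token; // shadows the outer LF
  return LF;
}

// With the pin: the outer constants are evaluated and compared by value.
function isLineBreak(token) {
  return token === LF || token === CR;
}

console.log(unpinned(0x41)); // 65, "matched" even though it is not a line feed
console.log(isLineBreak(0x41)); // false
console.log(isLineBreak(0x0a)); // true
```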
---
```javascript
class Name {
  // static, because `^Name` pins the class itself rather than an instance
  static [Symbol.matcher](matchable) {
    const pieces = matchable.split(" ");
    if (pieces.length === 2) {
      return pieces;
    }
  }
}
match ("Tab Atkins-Bittner") {
  when (^Name with [ first, last ]) if (last.includes('-')) { ... }
  when (^Name with [ first, last ]) { ... }
}
```
Note:
Code is a declaration of the matcher protocol on an imaginary class; implements a very bad name parser. Then matching hyphenated last names separately from non-hyphenated.
Here we see the other use of the pin operator, which is to invoke the matcher protocol.
This operator will probably need to immediately precede an identifier or a parenthesized expression.
We also see the `with` keyword, which is used to pattern-match the value returned by the matcher protocol.
This operator is probably the thing we're least happy with, as a champions group. This turns out to be a hard problem to solve. Prior art is a bit of a mixed bag; this is Elixir's approach. We're very open to other spellings and other ideas.
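The protocol can be simulated today with an ordinary symbol. A sketch under stated assumptions: `Symbol.matcher` does not exist yet, so a local `matcher` symbol stands in for it; the method is placed on the class itself because `^Name` pins the class; and returning `undefined` is taken to mean "no match". `classify` is a hypothetical name.

```javascript
const matcher = Symbol("matcher"); // stand-in for the proposed Symbol.matcher

class Name {
  static [matcher](matchable) {
    if (typeof matchable !== "string") return undefined;
    const pieces = matchable.split(" ");
    if (pieces.length === 2) {
      return pieces;
    }
    // falls through to `undefined`, treated here as a failed match
  }
}

function classify(value) {
  // `^Name with [ first, last ]`: call the matcher, then destructure its result
  const result = Name[matcher](value);
  if (result !== undefined) {
    const [first, last] = result;
    if (last.includes("-")) return `hyphenated: ${last}`; // guarded clause
    return `plain: ${last}`;
  }
  return "no match";
}

console.log(classify("Tab Atkins-Bittner")); // hyphenated: Atkins-Bittner
```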
---
## Lightning round: add-ons
Note:
Finally, before we go to the queue, let's run through some potential add-ons.
---
### `async match`
```javascript
async match (await auth()) {
  when ({ user: ^(await getUser()) }) { ... }
  else { await getError() }
}
```
Note:
This is the add-on that we're most jazzed about
Should be fairly simple from spec and implementation perspectives
Allows `await` anywhere inside the construct, and the whole expression produces a `Promise`
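Roughly the shape this would take today: an async function whose body does the comparisons by hand. A sketch only; `auth`, `getUser`, and `getError` are stubbed with made-up values, and `checkAuth` is a hypothetical name.

```javascript
// Hypothetical stand-ins for the slide's async helpers.
const auth = async () => ({ user: "tab" });
const getUser = async () => "tab";
const getError = async () => "unauthorized";

async function checkAuth() {
  const matchable = await auth();
  // when ({ user: ^(await getUser()) })
  if (matchable.user === (await getUser())) {
    return "welcome";
  }
  // else { await getError() }
  return await getError();
}

// The whole thing produces a Promise, like `async match` would.
checkAuth().then((result) => console.log(result)); // welcome
```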
---
### `&` combinator
```javascript
match (getFromDB()) {
  when (^FancyError) { ... }
  when (^AggregateError & { errors: [ ^TypeError, ...rest ] }) { ... }
  when (^AggregateError) { ... }
}
```
Note:
The OR combinator (`|`) that we saw earlier tries patterns until one succeeds; this tries patterns until one fails
Semantics are still unclear
Allows for more expressive match clauses without having to reach for guards
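One way to picture the two combinators is as short-circuiting functions over predicates. This sketch is not the proposal's semantics, which are still unsettled: it assumes a pinned class matches via `instanceof`, and `or`, `and`, `isA`, `firstErrorIs`, and `classifyError` are all made-up helpers.

```javascript
// `|`: try patterns until one succeeds. `&`: try patterns until one fails.
const or = (...patterns) => (value) => patterns.some((p) => p(value));
const and = (...patterns) => (value) => patterns.every((p) => p(value));

// Assumption: a pinned class like ^TypeError behaves like an instanceof check.
const isA = (Ctor) => (value) => value instanceof Ctor;
const firstErrorIs = (Ctor) => (value) =>
  Array.isArray(value.errors) && value.errors[0] instanceof Ctor;

function classifyError(err) {
  // when (^AggregateError & { errors: [ ^TypeError, ...rest ] })
  if (and(isA(AggregateError), firstErrorIs(TypeError))(err)) {
    return "aggregate, TypeError first";
  }
  // when (^AggregateError)
  if (isA(AggregateError)(err)) return "aggregate";
  return "other";
}

console.log(classifyError(new AggregateError([new TypeError("x")])));
// aggregate, TypeError first
```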
---
### Nil pattern
```javascript
match (someArr) {
  when ([_, _, someVal]) { ... }
}
```
Note:
Most languages that have structural pattern matching have the concept of a "nil matcher", which fills a hole in a data structure without creating a binding.
In JS, the primary use-case would be skipping spaces in arrays. This is already covered in destructuring by simply omitting an identifier of any kind in between the commas.
With that in mind, and given how contentious this feature is, we would only pursue it if we saw strong support for it.
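For reference, this is the existing destructuring form with elided elements; `third` is just an illustrative name.

```javascript
function third(someArr) {
  // Two empty slots skip the first two elements without binding them.
  const [, , someVal] = someArr;
  return someVal;
}

console.log(third(["a", "b", "c"])); // c
```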
---
### `catch` guards
```javascript
try {
  doSomething();
} catch match (err) {
  if (err instanceof RangeError) { ... }
  when (/^abc$/) { ... }
  // default: else { throw err; }
}
```
Note:
This is hopefully a pretty simple one; it's just sugar for `catch (err) { match (err) { } }`.
There would also be a slight change in semantics, which is that on a non-exhaustive match, we re-throw the caught error, rather than generating a new error.
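Spelled out in today's JavaScript, the desugaring plus the re-throw looks roughly like this. A sketch; `run` is a hypothetical wrapper, and the regex clause is assumed to apply only to string values.

```javascript
function run(doSomething) {
  try {
    doSomething();
    return "ok";
  } catch (err) {
    // if (err instanceof RangeError)
    if (err instanceof RangeError) return "range";
    // when (/^abc$/)
    if (typeof err === "string" && /^abc$/.test(err)) return "abc";
    // implicit else: re-throw the caught error itself, not a new one
    throw err;
  }
}

console.log(run(() => { throw new RangeError("too big"); })); // range
```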
---
## Questions?
---
## Thank you!