# Pattern matching WG
Members: [@raph-amiard](https://github.com/raph-amiard), [@clairedross](https://github.com/clairedross), [@roldak](https://github.com/roldak), [@yakobowski](https://github.com/yakobowski)
## Overview
## Syntax
Have a “moving target” syntax for our pattern matching proposal. Current proposal:
### Syntax for matching composite types
Our proposal uses the syntax for aggregates to match composite values, extended with `<>` to represent the `others` (which will match any value).
```ada
case Arr is
when (1, 2, 3) => null;
when (others => 12) => null;
when (1 | 2 => 4, 3 .. 8 => <>) => null;
when <> => null;
end case;
```
You can also match records, including variant records.
```ada
type Opt (Has_Value : Boolean) is record
case Has_Value is
when True =>
Val : Int;
when others => null;
end case;
end record;
I : Opt := ...;
case I is
when (Has_Value => True, Val => 2) => null;
when (Has_Value => False) => null;
end case;
```
### Syntax for bindings values
To bind a value, the simple form is:
```
pattern `as` binding_name [`:` subtype_indication]
```
```ada
case I is
when (Has_Value => True, <> as Val : Integer) => null;
when (Has_Value => False) => null;
end case;
```
We would like to have a short form for the “others” pattern (expressed as `<>`)
when binding a name:
```
<binding_name>
```
```ada
case I is
when (Has_Value => True, <Val>) => null;
when (Has_Value => False) => null;
end case;
```
### Syntax for dereferencing accesses
You can match an access value with either:
```
`null` | `(` `all` `=>` pattern `)`
```
```ada
function Add (A, B : Int_Access) return Integer is
begin
case (A, B) is
when ((all => <A>), (all => <B>)) => A + B
when ((all => <A>), null) or (null, (all => <A>)) => A
when (null, null) => 0
end case;
end Add;
```
Note: not sure about the all syntax, but we don’t have a better idea that is readable and composable at the moment. We tried playing with “.all” but it didn’t work out well. One idea would be to have `all` as a prefix operator:
```ada
function Add (A, B : Int_Access) return Integer is
begin
case (A, B) is
when (all <A>, all <B>) => A + B
when (all <A>, null) or (null, all <A>) => A
when (null, null) => 0
end case;
end Add;
```
### Syntax for matching tagged types
Basically using qualified expressions:
```ada
function Add (A, B : Maybe (Integer)) return Integer is
begin
case Exp is
when Bin_Op'(<L>, <Op>, <R>) => ...
when Call'(<Callee>, (Some, <Args>)) => ...
when Call'(<Callee>, None) => ...
end case;
end Add;
```
Additionally, would be nice to be able to use just the type name, as done for subtypes currently:
```ada
function Add (A, B : Maybe (Integer)) return Integer is
begin
case Exp is
when Bin_Op => ...
when Call => ...
when Un_Op as U => ...
end case;
end Add;
```
### Syntax for matching single discriminant record ?
When matching a maybe type, or any type that has a valid value that is only represented by one discriminant value, we have to use the long form `(Disc => Value)`. In the case of the `Maybe` type, this is very heavy, as evidenced in the example below:
```ada
function Add (A, B : Maybe (Integer)) return Integer is
begin
case (A, B) is
when ((Some, <A>), (Some, <B>)) => A + B
when ((Some, <A>), (Kind => None)) or ((Kind => None), (Some, <A>)) => A
when ((Kind => None), (Kind => None)) => 0
end case;
end Add;
```
**NOTE**: Actually in the example above in can be worked around by the fact that the only remaining value is None in every case we use it, so we can just use `<>`
```ada
function Add (A, B : Maybe (Integer)) return Integer is
begin
case (A, B) is
when ((Some, <A>), (Some, <B>)) => A + B
when ((Some, <A>), <>) or (<>, (Some, <A>)) => A
when (<>, <>) => 0
end case;
end Add;
```
This alleviates the concern somehow, since you can probably expect that the empty alternative in discriminated records will always be the last match. However this will mess up with people trying to enforce complete coverage by manually matching every case.
Ideas so far
Romain proposed allowing to put the discriminant first, before the aggregate, which would make it look like a ML constructor, or a function call in Ada. Claire said that in this case having this syntax for everything:
```ada
when (Some(<A>), None) or (None, Some(<A>)) => A
```
The problem with that syntax is that it’s highly ambiguous in the regular
aggregate case.
It looks like a function call, but would be very hard to implement in terms of
function calls If several discriminated records use the same enum discriminant
Another possibility that arose would be using subtypes:
```ada
generic
type T is private;
package Opt is
type Opt_Kind is (Kind_None, Some);
type Maybe (Kind : Opt_Kind) is record
case Kind is
when Some => Val : T;
when Kind_None => null;
end case;
end record;
subtype None is Maybe (Kind_None);
end Opt;
X : Int_Option := Some'(Value => 2);
function Add (A, B : Maybe (Integer)) return Integer is
begin
case (A, B) is
when ((Some, <A>), (Some, <B>)) => A + B
when ((Some, <A>), None) | (None, (Some, <A>)) => A
when (None, None) => 0
end case;
end Add;
function Add (A, B : Maybe (Integer)) return Integer is
begin
case (A, B) is
when ((Some, let A), (Some, let B)) => A + B
when ((Some, let A), None) | (None, (Some, let A)) => A
when (None, None) => 0
end case;
end Add;
```
**Decision: Using a subtype for None seems like a good enough solution that can be implemented with existing language facilities.**
## Semantics
### Semantics of bindings values
#### If we consider that a variable binding is like a rename:
This causes potential safety issues, which already exist and are why you cannot rename a component of a mutable discriminated record.
Is the user allowed to mutate the renamed component ?
#### If we consider that a variable binding is like a copy:
Consider that the copy is “constant” by default
#### Do we want to allow both ?
Rename semantics is desirable because then you can mutate matched components
But then you cannot use pattern matching on mutable discriminated records, which is a huge cons.
Is adding the complexity of two kind of bindings worth it ?
Maybe have `as` for copy-like bindings and `renames` for rename bindings ? Answer: That’s horrible because this would imply "pattern `as` binding_name" but "binding_name `renames` pattern"
### Sequential vs partitioned semantics for branch matching
Claire & Raphael, while writing the original AI, have found a paradigm for
matching that allows saving some of the original philosophy behind the Ada case
statement’s rules, while allowing some expressiveness. The idea is that each
pattern sub-rule must go from specific to more general. The order of branches
doesn’t influence execution (cannot, really, because either branches are non
overlapping, either the order in which branches are written is the only
possible one).
The rules in short are: Matches and sub matches should generally be ordered
from less general to more general. When that is not the case, and you go from a
general match to a specific match, and that generates overlapping cases, you
need to make sure that those cases are also explicitly handled.
The question is whether we should explore this idea or dig into the ML family
pattern matching direction, where order of branches matter, overlap can happen,
and only dead branches are discovered.
Claire mentioned live that the Ada-like fashion is still more safe, while less
expressive.
The rationale is similar to the one used for contract cases, where we want only
one case to apply at each time for readability. If you have a case with -100 ..
0 and another with 0 .. 100, it is interesting to make sure that the programmer
has considered the case of 0 and decided what behavior he wanted here. And
assuming that the pattern matching statement is big, it would be helpful to
know when we have a constrained pattern that we necessarily end up here for all
matched values:
```ada
case X : Integer is
when Positive => ...;
when Negative => ...;
-- if this one is far from the others, it would be nice to be sure when
-- reading it that we will enter this alternative each time we have -1,
-- 0 or 1.
when -1 .. 1 => ...;
end case;
```
Raph: Romain has pointed out that if the -1 .. 1 range above is treated like 3
different branches, the dead branch warning would catch this. If we consider it
this way, the true difference between linear matching a la OCaml and what we
are proposing is that in OCaml you will only get a warning when the compiler is
sure that there is a dead branch (which cannot be verified in all cases because
of guards - maybe amongst other features), whereas in Ada, you are forbidden to
write anything that can generate such uncertainty, and hence the order *must
not* matter.
In other words, if you remove cases that cannot be statically evaluated (such
as branches) and make the dead branch detector mandatory, you basically have
the same system that we are proposing (I think).
Claire: As discussed live, either I don’t understand or don’t agree with the
above. If two cases are overlapping, and none are dead, the dead branch
detector won’t complain, as in the case below if you remove (1,1).
```ada
case P : Point is
when (1, 1) =>
when (1, <>) =>
when (<>, 1) =>
end case;
```
Boris looked at ~100 complex pattern-matching in the code of Why3, trying to
find examples where this semantics would be unacceptably more verbose, and did
not find any. The increase in verbosity was always acceptable.
**Decision: Try to implement the explicit overlap model. See if it’s
implementable, and ergonomic.**
### Semantics of matched values & completeness
Matching tagged types would anyway imply matching potentially infinite number
of cases.
Also, it is felt amongst members of the WG that the restrictions on real types
is kind of artificial anyway.
**So we propose completely removing any restrictions on the types that can be
matched, and instead replacing by a completeness check that will try to detect
uncovered cases depending on the type being matched.**
## Open questions