owned this note
owned this note
Published
Linked with GitHub
---
id: polymorphic variants
title: Polymorphic Variants
description:
> Learn about OCaml's polymorphic variants
category : "Introduction"
level: "beginner"
prerequisites: "variants"
---
<!--
Sources:
https://caml.inria.fr/pub/papers/garrigue-polymorphic_variants-ml98.pdf
https://blog.shaynefletcher.org/2017/03/polymorphic-variants-subtyping-and.html
https://discuss.ocaml.org/t/new-draft-tutorial-on-polymorphic-variants/13485
https://discuss.ocaml.org/t/is-there-any-kind-of-guidline-about-when-to-use-polymorphic-variants/11006
https://blog.klipse.tech/ocaml/2018/03/16/ocaml-polymorphic-types.html
This lesson is about 800 lines long (about 1100 with line length limeted to 85 columns). This makes this lesson on the longer side when compared to other lessons on OCaml.org. Therefore, this is the first of 2 lessons on polymorphic variants.
This lesson (lesson 1) introduces the concepts behind polymorphic variants (such as row polymorphism and structural typing), then discusses common syntactic structures of polymorphic variants. It teaches these concepts in a bottom-up direction. It is my subjective belief (held lightly) that introducing polymorphic variants in a top-down direction leads to more complexity and confusion.
Lesson 2, which is forthcoming, introduces common usecases for polymorphic variants through real-world examples.
Any feedback a review is willing to provide is greatly appreciated. The author is particularly interested in ensuring accuracy and validity of examples and consistency in the language with OCaml.org's other materials, but all feedback is welcome.
-->
## Introduction
**Note**: In this lesson, we will use the term "ordinary variant" to refer to the variants we are already familiar with, which are typically simply called "variants". This is to make it clear when we refer to ordinary variants or polymorphic variants.
Polymorphic variants provide a flexible and extensible alternative to ordinary variants in OCaml. They extend the concept of variants by removing the need for explicit type declarations and allowing a greater degree of composability across different parts of a program while maintaining the type safety of the program.
Before polymorphic variants, OCaml only supported ordinary variants, which required explicit type declarations that are fixed: once a variant type is declared, it cannot be extended without modifying the variant type's declaration.
However, as developers worked on larger and more modular systems, they encountered situations where extensibility and code reuse were limited by ordinary variants. This created a demand for extensible variants, variant types that could be shared across modules and extended without modifying fixed type declarations.
To address these challenges, polymorphic variants were introduced to OCaml in version 3.0 in 2000 by Jacques Garrigue. If inclined, you are encouraged to read [this paper](https://caml.inria.fr/pub/papers/garrigue-polymorphic_variants-ml98.pdf) from 1998 introducing the motivations and implementation strategy for polymorphic variants in OCaml.
<!-- When available, mention lesson 2 on use-cases by example here -->
While polymorphic variants have benefits in certain scenarios, a downside is that they can require the developer to keep more context in mind, which can lead code that is more difficult to reason about and maintain. For these reasons, it is recommended that polymorphic variants be used judiciously. It is recommended (if only when first using polymorphic variants) that developers initiallly use ordinary variants, and as a codebase grows and a need for flexibility outweighs the higher complexity, the codebase be refactored to employ polymorphic variants stategically.
<!-- Is it worth mentioning that exceptions have the a restricted for of extensionality when compared to types? -->
## Prerequisites
Functions and values
Variants
Objects (optional)
...
## Foundations
Before we introduce patterns utilizing polymorphic variants, we will discuss several foundational topics that will provide context for how polymorphic variants can be used and for what purpose.
**Note**: To simplify this lesson, we will frequently use polymorphic variants that do not carry payloads. Keep in mind that polymorphic variants are able to carry payloads in the same way as with ordinary variants.
### On Constructors Vs Tags
An important syntactic distinction between ordinary variants and polymorphic variants is that ordinary variants define **constructors** for typed values while polymorphic variants define **tags** for typed values.
**Note**: Tags are technically constructors for a polymorphic variant and can be referred to as such, but for clarity we will use the term "tag" to refer to constructors of polymorphic variants and "constructor" to refer to constructors of ordinary variants. The distinction is that constructors (of ordinary variants) construct a *predeterminted* type, and tags (of polymorphic variants) construct an adhoc type that is *indeterminate* until applied to a specific value. This means that the exact type of a polymorphic variant remains open and is determined by the operations and contexts in which it is employed.
To make this concrete, lets first declare an ordinary variant type:
``` ocaml
# type x_variant =
| A (* constructor *)
| B (* constructor *)
| C;; (* constructor *)
type x_variant = A | B | C
```
We can also name polymorphic variants in a similar way with a type abbreviation:
``` ocaml
# type x_poly_variant = [
| `A (* tag *)
| `B (* tag *)
| `C (* tag *)
];;
type x_poly_variant = [ `A | `B | `C ]
```
Tags are defined like constructors except they include a backtick (\`) and don't require capitalization, however it is idiomatic to capitalize them as well (the tag `` `A `` is preferred, the tag `` `a `` is discouraged).
Above, we used polymorphic variant tags in a type abbreviation, but this is not a requirement like it is with ordinary variants.
For example, we can assign a variable to a tag with a `let` expression:
``` ocaml
# let a = `D;;
val a : [> `D ] = `D
```
But it is not required that we define a tag with a `let` expression either!
### On Polymorphic Variant Tags and Type Expressions
We can make reference to a tag in polymorphic variant type annotations and pattern matching expressions without pre-declaring or pre-defining tags at all:
``` ocaml
# let handle_tag_1 (x : [> `E | `F]) = (* an explicit type expression annotation *)
match x with
| `E -> "Got E"
| `F -> "Got F"
| _ -> "Something else";;
val handle_tag_1 : [> `E | `F ] -> string = <fun>
# handle_tag_1 `E;;
- : string = "Got E"
# handle_tag_1 `G;;
- : string = "Something else"
```
**Side Note**: The symbol `>` is called an lower bound, meaning that the argument `x` in `handle_tag_1` declares that it can handle the tag `` `E ``, `` `F ``, `` [ `E | `F ] ``, and potentially other tags as well. This policy is supported by the catch-all branch of the match expression.
**Syntax of Polymorphic Variant Tag Expressions**:
Polymorphic variant expressions are sets of tags within square brackets (`[`, `]`), separated by a type union operator (`|`). This notation represents a union of tags that comprise a single polymorphic variant type expression.
These type expressions also carry extensibility information, indicating whether the type is
open (the `>` operator), closed (the `<` operator), or exact (no operator). The precise meaning behind the extensibility information (open, closed, and exact) will be thoroughly discussed later in this lesson.
### On Explicit Vs Inferred Annotations
When working with polymorphic variants, it is common to avoid explicitly annotating our values unless we require specific behavior that cannot be inferred. By allowing types to be inferred by the compiler, our codebase is able to be extended and easily refactored.
For example, the previous `handle_tag_1` function does not require an explicit type annotation on its argument:
``` ocaml
# let handle_tag_1 x =
match x with
| `E -> "Got E"
| `F -> "Got F"
| _ -> "Something else";;
val handle_tag_1 : [> `E | F ] -> string = <fun>
# handle_tag_1 `E;;
- : string = "Got E"
# handle_tag_1 `G;;
- : string = "Something else"
```
Without an explict annotation, the open polymorphic variant type is automatically inferred by the compiler. If we alter the behavior of our function, its signature will automatically be inferred without our having to update an explicit annotation:
``` ocaml
# let handle_tag_2 x =
match x with
| `E -> "Got E"
| `F -> "Got F"
val handle_tag_2 : [< `E | F] -> string = <fun>
# handle_tag_2 `E;;
- : string = "Got E"
# handle_tag_2 `G;;
Error: This expression has type [> `G ] but an expression was expected of type
[< `E | `F ]
The second variant type does not allow tag(s) `G
```
**Side Note**: The symbol `<` is called a upper bound, meaning that the argument `x` in `handle_tag_2` declares that it can handle the tag `` `E ``, `` `F ``, ``[ `E | `F ] ``, and none others. This policy is supported by the lack of a catch-all branch in the match expression.
While depending on the compiler to implicitly infer type annotations on polymorphic variants can lead to greater flexibility, if the use of polymorphic variants becomes sufficiently complex it can lead to code that is difficult to reason about, leading to bugs. Debugging code with polymorphic variants can be challenging because the error messages emitted by the compiler can be difficult to interpret. If we find ourselves in this situation, one strategy to deal with it is to slowly introduce explicit annotations into our code until we find the cause of bug. Once the bug is identified and fixed, our explicit annotations can be removed or kept in place depending on the scenario.
### On Runtime Memory Costs
While polymorphic variants offer flexibility and composability over ordinary variants, they come with a runtime memory cost that is useful to understand, especially in performance-sensitive programs that make sufficiently extensive use of polymorphic variants.
**Note**: To better understand OCaml's memory representation in general, you can learn about it in [this lesson](https://ocaml.org/docs/memory-representation).
Consider an ordinary variant with a constructor carrying a tuple of two values:
```ocaml
# type t = Pair of int * int
let v = Pair (1, 2);;
type t = Pair of int * int
val v : t = Pair (1, 2)
```
At runtime, the value `Pair (1, 2)` occupies 3 words in memory:
1. **1 word** for the block header, which encodes both garbage-collector metadata and
the constructor tag in a compact form.
2. **2 words** for the payload: `1` and `2`.
<!-- Please very this -->
This totals *3 words*.
Now, consider a polymorphic variant that carries the same pair:
```ocaml
let v = `Pair (1, 2)
```
At runtime, this value occupies 6 words:
1. **1 word** for the tag `` `Pair ``, stored as a full-word hash; it is not packed into the header.
2. **1 word** for a pointer to the payload.
3. The payload `(1, 2)` is itself a tuple, which occupies:
- 1 word for its block header,
- 1 word the value `1`.
- 1 word the value `2`.
This totals *6 words*.
<!-- Please very this -->
These additional runtime memory requirements are greater for polymorphic variants than ordinary variants, however in practice it is not common for runtime efficiency to be noticeably affected unless polymorphic variants are use extensively.
### On Row Polymorphism and Structural Typing
Row polymorphism is a type system feature that lets us parameterize types over an unspecified "row" of fields or tags. In OCaml, object types and polymorphic variant types display row polymorphism. In some other languages, row polymorphism may also apply to extensible records, but this is not the case in OCaml.
Structural typing, on the other hand, determines type compatibility based on the shape or structure of a type, regardless of its name. In OCaml, tuples, function types, object types, and polymorphic variants are structural.
The two concepts, row polymorphism and structural typing, are orthogonal to one another. "Orthogonal" in this context means that row polymorphism and structural typing are independent features of the type system. They operate on different aspects and can be applied separately without interfering with one another. Structural typing is about determining type compatibility based solely on the structure (such as the fields of an object or the tags of a variant), whereas row polymorphism is a mechanism for handling extensible collections of fields or tags through parametric polymorphism. Because they are orthogonal, we can use row polymorphism with both structurally typed and nominally typed constructs without one affecting the behavior of the other.
In the following subsection, we introduce row polymorphism and structural typing with practical examples to illustrate how these concepts work together in OCaml.
#### On Parametric Polymorphism and Row Polymorphism
"Polymorphism" is a term we were introduced to previously in the form of parametric polymorphism. We see this in ordinary variants when we declare constructors that take as parameters variable data types. An `option` is an example of a variant that contains parametrically polymorphic variables:
``` ocaml
# type 'a option =
| None
| Some of 'a;;
type 'a option = None | Some of 'a
# let process_option_1 opt =
match opt with
| None -> "got none"
| Some x -> "got Some";;
val process_option_1 : int option/2 -> string = <fun>
# process_option_1 (Some 42);;
- : string = "got Some"
# process_option_1 (Some "Hello");;
- : string = "got Some"
```
This kind of polymorphism allows an `option` to wrap a variety of data types carried by the `Some` constructor.
Likewise, polymorphic variants also exhibit parametric polymorphism. For example we can construct an `option` that uses polymorphic variants as well:
``` ocaml
# let process_option_2 opt =
match opt with
| `None -> "got None"
| `Some x -> "got Some";;
val process_2_option : [< `None | `Some of 'a ] -> string = <fun>
# process_option_2 (`Some 42);;
- : string = "got Some"
# process_option_2 (`Some "Hello");;
- : string = "got Some"
```
**Question**: If both ordinary and polymorphic variants can carry variable values known as parametric polymorphism, what does the "*polymorphic*" in "*polymorphic variant*" mean?
**Answer**: the "polymorphic" in "polymorphic variant" refers to row polymorphism, which is a form of parametric polymorphism that requires a free variable representing the extra, unspecified components. Therefore, polymorphic variants exhibit two instances of parametric polymorphism, one for representing different types of payload and the other for representing variable rows of variant tags.
**Row Polymorphism in Objects**
Object types in OCaml are part of its object-oriented features and support row polymorphism:
```ocaml
# let greet obj = print_endline ("Hello " ^ obj#name);;
val greet : < name : string; .. > -> unit = <fun>
```
The type `< name : string; .. >` means: "This function works on any object that has a `name` method returning a string, regardless of what other methods it might have". Row polymorphism refers to the ability of a type system to remain flexible regarding the fields or methods that an object might have. In this context, the ellipsis (`..`) in `< name : string; .. >` represents an open-ended row variable, meaning that the object can include any additional methods beyond `name`. This allows functions like `greet` to operate on any object that has at least a `name` method, regardless of what other methods it provides.
Let’s put it to a test:
``` ocaml
# let o1 = object method name = "Alice" end;;
val o1 : < name : string > = <obj>
# let o2 = object method name = "Bob" method age = 40 end;;
val o2 : < age : int; name : string > = <obj>
# greet o1;;
Hello Alice
# greet o2;;
Hello Bob
```
Other methods are optional, and this structure is extensible. This is row polymorphism at work.
**Row Polymorphism in Polymorphic Variants**
Polymorphic variants behave similarly to objects regarding row polymorphism, but instead of methods, they work with tags.
Let’s compare this to a function that handles error tags:
```ocaml
# let handle_error x =
match x with
| `Not_found -> "Not found"
| `Permission_denied -> "Permission denied"
| _ -> "Some unknown error";;
val handle_error : [> `Not_found | `Permission_denied ] -> string = <fun>
```
Here, the type `[> \`Not_found | \`Permission_denied ]` contains two explicit "rows". The `>` symbol indicates an lower bound of an open polymorphic variant, allowing the function to be polymorphic over rows. It can accept any value that includes at least the given tags.
In OCaml, the polymorphism of a polymorphic variant type is influenced by its bounds:
**Open (`[> ...]`)**: An open polymorphic variant has a lower bound. These denote that the type can include additional tags beyond those specified, allowing for flexible extension. This openness enables polymorphic behavior. Functions accepting such types can handle variants with at least the specified tags but are not limited to them.
**Closed (`[< ...]`)** A closed polymorphic variant has an upper bound. These restrict the type to only the specified tags, disallowing any others. It ensures type safety by preventing unexpected tags by limiting row polymorphism and specifying an upper bound of permitted tags.
**Exact (`[ ... ]`)**: An exact polymorphic variant has neither an upper nor a lower bound and only accepts values witht he exact polymorphic variant type specified. This ensures type safety by limiting row polymorphism and disallowing any fewer rows.
Therefore, to achieve polymorphic behavior with polymorphic variants in OCaml, employing open polymorphic variants is essential. Closed or exact polymorphic variants, while providing strict type constraints, allow us to restrict row polymorphism.
Let’s try calling our `handle_error` function with varying tags:
``` ocaml
# handle_error `Not_found;;
- : string = "Not found"
# handle_error `Permission_denied;;
- : string = "Permission denied"
# handle_error `Another_error;;
- : string = "Some other error"
```
This works because OCaml allows the function to handle any superset of the expected tags by using row polymorphism to keep the function's argument type extensible.
#### On Nominal Typing Vs Structural Typing
Nominal typing is where type compatibility and equality are determined by explicit type declaration names (hence, it is "nominal"). This approach emphasizes type identities and namespaces.
In contrast, structural typing bases type compatibility on the actual structure or shape of the types. If two types have the same components or behavior, they are treated as equivalent regardless of how they are named or where they are defined. OCaml’s polymorphic variants and object types are examples of structural typing, allowing for more flexible code reuse and subtyping.
**Nominal Typing with Records**
OCaml's records are nominally typed, meaning that even if two records have the same fields with identical types, they are considered different types if they have different names. For instance:
```ocaml
# type point2d_typ = { x: float; y: float }
type coordinates_typ = { x: float; y: float };;
type point2d_typ = { x : float; y : float; }
type coordinates_typ = { x : float; y : float; }
```
Even though `point2d` and `coordinates` have the same structure, OCaml treats them as distinct types because records are nominal.
If we use them in contexts in which they are not explicitly allowed, an exception will be raised:
```ocaml
# let print_point2d (pt: point2d_typ) =
Printf.printf "(%f, %f)\n" pt.x pt.y;;
val print_point2d : point2d_typ -> unit = <fun>
# let point2d : point2d_typ = {x=1.; y=2.}
let coordinate : coordinates_typ = {x=1.; y=2.};;
val point2d : point2d_typ = {x = 1.; y = 2.}
val coordinate : coordinates_typ = {x = 1.; y = 2.}
# print_point2d point2d;;
(1.000000, 2.000000)
- : unit = ()
# print_point2d coordinate;;
Error: This expression has type coordinates_typ
but an expression was expected of type point2d_typ
```
**Nominal Typing with Ordinary Variants**
Next, let’s declare two different ordinary variants with constructors that coincidentally share the same names to demonstrate nominal typing:
```ocaml
# type cardinal = North | East | South | West
type compass = North | East | South | West;;
type cardinal = North | East | South | West
type compass = North | East | South | West
```
Now, let’s compare the values they construct:
``` ocaml
# (North : cardinal) = (North : cardinal);;
- : bool = true
# (North : compass) = (North : compass);;
- : bool = true
# (North : cardinal) = (North : compass);;
Error: This expression has type compass but an expression was expected of type
cardinal
```
As we expected, values constructed with the same type and constructore are **equal**, while values with the same constructor but different types are **not equal**.
**Structural Typing with Tuple Types**
Tuples with the same structure can be used interchangably:
``` ocaml
# let tuple1 : int * string = (42, "hello")
let tuple2 : int * string = (42, "hello");;
val tuple1 : int * string = (42, "hello")
val tuple2 : int * string = (42, "hello")
# let print_tuple (p : int * string): unit =
let (i, s) = p in
Printf.printf "Tuple contains: %d and %s\n" i s;;
val print_tuple : int * string -> unit = <fun>
# print_tuple tuple1;
print_tuple tuple2;;
Tuple contains: 42 and hello
Tuple contains: 42 and hello
- : unit = ()
```
Here, even though `tuple1` and `tuple2` are defined separately, they share the same structure (`int * string`), so they are treated as the same type. The function `print_tuple` can accept either tuple without any extra conversion. This demonstrates how tuples are compared by their structure rather than by any declared name.
**Structural Typing with Function Types (Arrow Types)**
```ocaml
# let increment = fun x -> x + 1;;
val increment : int -> int = <fun>
# let double = fun x -> x * 2;;
val double : int -> int = <fun>
# let apply_twice (f : int -> int) x = f (f x);;
val apply_twice : (int -> int) -> int -> int = <fun>
# apply_twice increment 3;;
- : int = 5
# apply_twice double 3;;
- : int = 12
```
In this example, the functions `increment` and `double` are defined independently but share the same structure for their first argument (`int -> int`) . Because function types are structural, the higher-order function `apply_twice` can accept either of them without any explicit type coercion.
**Structural Typing with Polymorphic Variants**
Next, let’s set up a similar scenario we did with ordinary variants using polymorphic variants:
```ocaml
# let cardinal_north = `North;;
val cardinal_north : [> `North ] = `North
# let compass_north = `North;;
val compass_north : [> `North ] = `North
```
and now lets compare them:
```ocaml
# cardinal_north = compass_north;;
- : bool = true
```
Even though we defined `cardinal_north` and `compass_north` with the same tag names, they are nominally not equal. Instead, they are **structural**, meaning that equality depends on structure, not declaration location and name.
Because polymorphic variants are not necessarily tied to fixed types, we can define adhoc tags and compare them directly:
``` ocaml
# `North = `North;;
- : bool = true ```
```
## Arguments with Bounded Polymorphic Variants
Polymorphic variants specify constraints that can be open, closed, or exact. Depending on how the set of possible tags are constrained, these constraints affect how values interact under subtyping.
The notation for the bounds of a polymorphic type expression are:
- open = `[> ... ]`
- closed = `[< ... ]`
- exact = `[ ... ]`
**Note**: The default type of a polymorphic variant tag is open, meaning that the type of ```A `` is interpreted as `` [> `A] ``. This is called refinement and will be discussed in greater detail in the next section.
To demonstrate the behavior of these constraints, we will use the following functions, which accept arguments with different type constraints and return a simple string:
``` ocaml
# let process_open (x: [> `A | `B ]) = "Accepted" (* open bound *)
let process_closed (x: [< `A | `B ]) = "Accepted" (* closed bound *)
let process_exact (x: [ `A | `B ]) = "Accepted";; (* exact bound *)
val process_open : [> `A | `B ] -> string = <fun>
val process_closed : [< `A | `B ] -> string = <fun>
val process_exact : [ `A | `B ] -> string = <fun>
```
**Open Polymorphic Variant Types in Action**
First, lets observe the behavior of `process_open` by calling it with different polymorphic variant tags:
``` ocaml
# process_open `A;;
- : string = "Accepted"
# process_open `B;;
- : string = "Accepted"
# process_open `C;;
- : string = "Accepted"
```
The type of `process_open` is `` [> `A | `B] ``, meaning that it is open to inputs of either `` `A ``, `` `B ``, `` [> `A | `B ] ``, or other tags. Therefore, it is intuitive that `process_open` accepts `` `A `` and `` `B ``, and `` `C ``.
**Closed Polymorphic Variant Types in Action**
Next, lets observe the behavior of `process_closed` on the same inputs:
``` ocaml
# process_closed `A;;
- : string = "Accepted"
# process_closed `B;;
- : string = "Accepted"
# process_closed `C;;
Error: This expression has type [> `C ] but an expression was expected of type
[< `A | `B ]
The second variant type does not allow tag(s) `C
```
The type of `process_closed` is `` [< `A | `B ] ``, meaning that it is closed to inputs of either `` `A ``, `` `B ``, `` [ `A | `B ] `` and no others. Therefore, it is intuitive that `process_closed` accepts `` `A `` and `` `B ``, and that it raises an exception when passed `` `C ``.
**Exact Polymorphic Variant Types in Action**
Next, lets observe the behavior of `process_exact` on the same set of inputs:
``` ocaml
# process_exact `A;;
- : string = "Accepted."
# process_exact `B;;
- : string = "Accepted"
# process_exact `C;;
Error: This expression has type [> `C ] but an expression was expected of type
[ `A | `B ]
The second variant type does not allow tag(s) `C
```
The type of `process_exact` (`` [ `A | `B ] ``) is exact, meaning that it is highly particular about the *exact* type it accepts. In other words, it doesn't accept any more or any less than `` [ `A | `B ] ``, and requires the argument to have the exact type.
In the above examples, `process_closed` and `process_exact` share the same behavior for similar reasons.
**Open Polymorphic Variants Revisited**
To add a layer of depth to our exploration, lets call the same functions again but with explicit type constraints with a common input tag `\`A`:
``` ocaml
# process_open (`A: [> `A]);; (* open bound (default) *)
- : string = "Accepted"
# process_open (`A: [< `A]);; (* closed bound *)
Error: This expression has type [ `A ] but an expression was expected of type
[> `A | `B ]
The first variant type does not allow tag(s) `B
# process_open (`A: [`A]);; (* exact bound *)
Error: This expression has type [ `A ] but an expression was expected of type
[> `A | `B ]
The first variant type does not allow tag(s) `B
```
As expected, `process_open` accepts an open polymorphic variant, as we saw previously. This is because we are attempting to fit a flexible constraint into a function accepting flexible constraint, meaning neither policy is violated.
When we attempt to pass an argument that has a closed or exact polymorphic variant type, exceptions are raised. This indicates that even though the function's argument type constraint includes an allowed tag, if the tag's type has a closed or exact bound there is a mismatch.
When we pass a closed or exact type to a function that expects an open type, we are attempting to fit a rigid constraint into a function with a flexible constraint. The rigid constraint doesn’t have the flexibility to promise that it can accommodate additional tags if needed. This mismatch arises because the function presumably accommodates more tags, but the closed or exact types passed into the function explicitly rule out the possibility that they can be handled the same way other tags may be handled.
**Closed Polymorphic Variants Revisited**
Next, lets observe the behavior of `process_closed` on the same arguments:
``` ocaml
# process_closed (`A: [> `A]);; (* open bound (default) *)
- : string = "Accepted"
# process_closed (`A: [<`A]);; (* closed bound *)
- : string = "Accepted"
# process_closed (`A: [`A]);; (* exact bound *)
- : string = "Accepted"
```
As expected, `process_closed` accepts an explicitly open bound, as we saw previously. The flexible input specifies that it can be handled by a function that can treat it like other possible types and the function specifies that it will not treat it like any other types than possibly `` `B ``, therefore there is no conflict in expectations.
When we pass closed or exact types to a function that accepts a closed argument type, the inputs are accepted because the rigid type constraints match. In effect, the function specifies that it cannot process any other tags and the arguments likewise specify that they cannot be handled in the same way other tags might, meaning there is no mismatch.
**Exact Polymorphic Variants Revisited**
Next, lets observe the behavior of `process_exact` on the same arguments:
``` ocaml
# process_exact (`A: [> `A]);; (* open bound (default) *)
- : string = "Accepted"
# process_exact (`A: [<`A]);; (* closed bound *)
Error: This expression has type [ `A ] but an expression was expected of type
[ `A | `B ]
The first variant type does not allow tag(s) `B
# process_exact (`A: [`A]);; (* exact bound *)
Error: This expression has type [ `A ] but an expression was expected of type
[ `A | `B ]
The first variant type does not allow tag(s) `B
```
As expected, `process_exact` accepts an explicitly open type, as we saw previously and for the same reason as in the previous example.
When we attempt to pass closed or exact types to a function that accepts only exact types, exceptions are raised. The error tells us that the exact type `` [ `A | `B ] `` is expected. If we want to ensure our input is accepted, we can alter our value by converting it to an acceptable type:
``` ocaml
# process_exact (`A: [< `A | `B]);;
- : string = "Accepted"
# process_exact (`A: [`A | `B]);;
- : string = "Accepted"
```
**Note**: This is an example of upcasting a subtype that structurally contains `` `A `` to a supertype that contains `` `A `` and `` `B `` by adding an explicit type annotation.
Now, the types match because both inputs `` `A `` specify that they can be handled by a function that may treat it like `` `B ``.
## Refinement of Polymorphic Variants
In the above example of `process_exact` (`` `A: [< `A]) ``, you may have noticed something unexpected. The exception raised reads: `` Error: This expression has type [ `A ] but an expression was expected of type [ `A | `B ] ``. But we didn't pass in `` [ `A ] ``, a type with an exact bound. We passed in `` [< `A ] ``, a type with a closed bound. What is the cause of this discrepancy?
This behavior demonstrates an important aspect of OCaml's type system called type refinement. This refinement process helps OCaml provide more precise type information and catch potential type mismatches. It is a part of OCaml's type inference system, which aims to determine the most specific type possible given the available information.
We can recreate this situation by assigning a variable to a tag with the same explicit type annotation:
**Exact Type Refinement**
``` ocaml
# let a = (`A: [< `A]);;
val a : [ `A ] = `A
```
In this case, the compiler refines the type of `` [< `A] `` to `` [ `A ] ``. This refinement occurs because the closed bound `` [< `A] `` specifies that the type can include at most the tag `` `A ``. However, in this specific context, where `` `A `` is the only tag present, the compiler infers that the type must be exactly `` [ `A ] ``. The same refinement occurs when we pass `` (`A: [< `A]) `` to `process_exact`.
**Default Open Type Refinement**
As noted earlier, the default bounding for a tag is open:
``` ocaml
# let b = `A;;
val b : [> `A ] = `A
```
**Open Type Extension Refinement**
``` ocaml
# let d = (`C: [> `A | `B ]);;
val d : [> `A | `B | `C ] = `C
```
In this example, the explicit type annotation `[> \`A | \`B ]` specifies an open bound that includes at least the tags `\`A`, `\`B`, and `[ \`A | \`B ]`. However, the value being assigned is `\`C`, which is not part of the explicitly annotated type. Nevertheless, OCaml's type system refines the type to include `\`C`, resulting in `[> `A | `B | `C ]` while maintaining openness (`>`)
**Closed Type With Open Type Refinement**
``` ocaml
# let c = (`A: [< `A | `B]);;
val c : [< `A | `B > `A ] = `A
```
This example demonstrates a more complex type refinement. Here, we're explicitly annotating `\`A` with a closed type that allows for either `\`A` or `\`B`. However, since we're only using `\`A`, the compiler refines this type to `[< \`A | \`B > \`A ]`. This refined type has two parts:
1. The first part is closed, represented by `[< ... ]`.
2. It must include `\`A`, represented by `[> `\`A` ]`
This refinement provides a more precise type that captures both the constraints we explicitly set and the actual usage. It's a middle ground between the fully open `[> \`A]` and the exact `[\`A]`, reflecting that while `\`B` is allowed, it's not necessarily present.
## Narrowing and Widening with Polymorphic Variants
As we saw, polymorphic variants can be constrained by bounds that either narrow or widen the set of tags they accept. Understanding these concepts is key to designing functions that interact well via subtyping.
**Combining Narrowing and Widening**
A common pattern is to have a function that narrows its input and widens its output. This ensures that only a particular subset of tags are accepted while its output is widened to maximize interoperability with other functions that may be expecting a larger set.
Imagine a function that takes an input that it trusts to be within a narrow set, processes it, and then returns a result that’s upcast to a more open type:
```ocaml
# let process_and_return x =
match x with
| `A -> `A
| `B -> `B;;
val process_and_return : [< `A | `B ] -> [> `A | `B ] = <fun>
```
Our function, `process_and_return` has a policy that it can only handle `x` if it is either type ``\`A``, ``\`B``, ``[ \`A | \`B ]`` and no others. This makes sense, since the match expression doesn't handle any other cases. If the match statement was to handle a wildcard case with, for example `| _ as a -> a`, the argument's type would necessarily have to be expanded, and since we cannot determine what other tags these would be the refined type would be ``[> \`A | \`B ]``.
The output of `process_and_return`, may be more surprising. Why would the compiler refine the output to an open type? If we think back to our earlier examples where open polymorphic variants were applied to functions that narrow the set of possible inputs, arguments with closed bounds were frequently rejected unless they strictly met the argument type's criteria. Therefore, if the output of a function is open, it enhances composability with other functions by passing the responsibility of rejecting compatiblity to the next function's argument type rather than the value's type.
## Passing Input Tags to Output
During pattern matching, it can be desirable to capture the tag for use in the branch arm rather than only capturing its payload. In this case, we can alias the pattern matched using the keyword `as`.
For example:
``` ocaml
# let process_and_pass_through x =
match x with
| `A as a -> a
| `B as b -> b
| _ as c -> c;;
val process_and_pass_through : ([> `A | `B ] as 'a) -> 'a = <fun>
```
The type signature of `process_and_pass_through` (`[ ... ] as 'a`) indicates that an alias `'a` is used to represent the matched tag and payload.
## Declaring a Named Polymorphic Variant Type
Polymorphic Variants can be abbreviated with the `type` keyword:
``` ocaml
# type typ_1 = [ `A | `B | `C ];;
type typ_1 = [ `A | `B | `C ]
```
Notice that the type is specifed with as exact. Indeed, it is not possible to declare a polymorphic variant expression with a lower or upper bound:
``` ocaml
# type typ_2 = [> `A | `B | `C ];;
Error: A type variable is unbound in this type declaration.
In type [> `A | `B | `C ] as 'a the variable 'a is unbound
# type typ_3 = [< `A | `B | `C ];;
Error: A type variable is unbound in this type declaration.
In type [< `A | `B | `C ] as 'a the variable 'a is unbound
```
When declaring polymorphic variant types in OCaml, using exact bounds (e.g., `[ \`A | \`B | \`C ]`) fully specifies the type without introducing free type variables. This explicitness ensures that the type is concrete and complete, allowing the compiler to understand precisely which tags are included.
In contrast, attempting to declare a polymorphic variant with an lower bound (`[> ...]`) or upper bound (`[< ...]`) within a type declaration introduces an implicit free type variable, which is called `'a` in the above errors. This variable represents the potential for additional unspecified variants (in the case of open bounds) or restricts the type to a subset of variants (in the case of upper bounds). Without explicitly binding this free variable, the compiler cannot fully determine the type's structure, leading to unbound type variable errors during type checking.
## Extending Named Polymorphic Variants
A benefit of named polymorphic variant types is that they can be referenced in the declarations of other named polymorphic variant types. This allows us to extend sets of named polymorphic variants types by other polymorphic variants.
This extensibility enables us to define modular and composable type declarations, facilitating code reuse and maintainability.
This feature is demonstrated in the next section.
### Including Named Polymorphic Variant Types in Pattern Matches
Another benefit of named polymorphic variant types is that they can be expanded during pattern matching using a hash symbol (`#`):
Imagine we have a set of mouse events grouped in a abbreviated polymorphic variant type:
```ocaml
# type mouse_event = [ `Click of int * int | `Hover of int * int ];;
type mouse_event = [ `Click of int * int | `Hover of int * int ]
```
We also have other events in our application:
```ocaml
# type event = [
| mouse_event
| `KeyPress of char
| `Resize of int * int
];;
type event =
[ `Click of int * int
| `Hover of int * int
| `KeyPress of char
| `Resize of int * int ]
```
In the above named polymorphic variant type, we extend `mouse_event` with additional tags.
Now suppose we want to handle all events without having to list all mouse events separately in a match case:
```ocaml
# let handle_event (ev : event) =
match ev with
| #mouse_event as me ->
(match me with
| `Click (x, y) -> Printf.sprintf "Mouse clicked at (%d, %d)" x y
| `Hover (x, y) -> Printf.sprintf "Mouse hovered at (%d, %d)" x y)
| `KeyPress c -> Printf.sprintf "Key pressed: %c" c
| `Resize (w, h) -> Printf.sprintf "Window resized to (%d, %d)" w h;;
val handle_event : event -> string = <fun>
# handle_event (`Click (25, 75));;
- : string = "Mouse clicked at (25, 75)"
```
In this function, the pattern `#mouse_event as me` expands to match any variant value that is part of the `mouse_event` type without the need to write out both cases explicitly.
This approach is an example of behavior that can be implemented with ordinary variants, but the flexibility of polymorphic variants provides an improved developer experience by reducing the boilerplate code otherwise required.
**Note**: Pattern-matching with `#` is not a dynamic type check. The `#mouse-event` pattern acts like a macro by providing shortcut to avoid writing all the corresponding patterns.
## Conclusion
In this lesson, we introduced the notation and concepts underpinning polymorphic variants in OCaml, beginning with a clear distinction between ordinary and polymorphic variants. We saw how polymorphic variants provide a flexible, extensible alternative to ordinary variants by allowing tags to be defined and reused without the need for rigid type declarations. This flexibility is achieved through row polymorphism and structural typing, where types are defined by their structure rather than by their names.
Building on this foundation, we explored the concept of type refinement in polymorphic variants, illustrating how the use of open, closed, and exact polymorphic vaiant types shape the behavior of functions that accept these types. By demonstrating narrowing (restricting inputs) and widening (upcasting outputs), we learned how OCaml's type system precisely manages the set of acceptable tags. Finally, we saw how abbreviated types can be combined with other abbreviations and how they can be expanded during pattern-matching.
<!-- Link to next lesson on use-cases for polymorphic variants (TBD) -->
## Special Thanks**
A special thanks is due to the following members of the [OCaml community message board](https://discuss.ocaml.org/) for their generous feedback. The precision and clarity of this lesson has been significantly improved thanks to their efforts.
- [kopecs](https://discuss.ocaml.org/u/kopecs)
- [cjr](https://discuss.ocaml.org/u/cjr)
- [octachron](https://discuss.ocaml.org/u/octachron)
- [mobilink](https://discuss.ocaml.org/u/mobileink)