# Result types in Temper

This discusses how migrating to non-first-class *Result* types for Temper's failure handling would help preserve information important to developers and to target-language programs while preserving Temper's ability to support a wide variety of languages.

Contrast with [just use null proposal](https://hackmd.io/v0QD2ZdsQaSnxpbc1y846w).

## Goals

Programming languages need to support *failure idioms*: switching from happy path code to alternate paths when it's difficult to provide a high level of service.

### Different prevailing idioms

Temper needs to **support different languages' failure idioms**:

- *Global error numbers* as in C, Perl, and Bash
- *Exceptions* as in `try`/`catch`
- *Sum types* like *Result* and *Option* where some variants indicate a useful result and others indicate no useful result.
- *Extra outputs* like Go's convention of returning `(resultOrNil, errorOrNil)` where the first is usable iff the second is `nil`.

| Language | Exceptions | Global vars | Sum types | Extra returns | Classification |
| ---- | ---- | ---- | ---- | ---- | ---- |
| C | | ✓ | | | Implicit, intermixed |
| C++ | ✓[^goo-cpp] | ✓ | ✓ | | Implicit, intermixed |
| C# | ✓ | | | ✓ | Implicit, out of band |
| F# | ✓ | | ✓ | | Explicit, in band |
| Go | | | | ✓ | Explicit, out of band |
| Java | ✓ | | | | Explicit[^ch-ex], out of band |
| JS | ✓ | | | | Implicit, out of band |
| Lua | ✓ | | | ✓[^via-pcall] | Implicit, out of band |
| OCaml | ✓ | | ✓ | | Explicit, in band |
| PHP | ✓ | ✓ | | | Implicit, out of band |
| Ruby | ✓ | ✓ | | | Implicit, out of band |
| Rust | | | ✓ | | Explicit, in band |
| Swift | ✓ | | ✓ | | Implicit, out of band |

[^goo-cpp]: C++ allows turning off exceptions (`-fno-exceptions`) because they were introduced relatively late in the language's lifetime.
[Google's C++ style guide](https://google.github.io/styleguide/cppguide.html#Exceptions) says "We do not use C++ exceptions … On their face, the benefits of using exceptions outweigh the costs, especially in new projects. … Because most existing C++ code at Google is not prepared to deal with exceptions, it is comparatively difficult to adopt new code that generates exceptions."

[^ch-ex]: Java has [checked exceptions](https://docs.oracle.com/javase/tutorial/essential/exceptions/catchOrDeclare.html) but they are a compiler fiction, so they are bypassable by [reflection](https://docs.oracle.com/javase/8/docs/api/java/lang/Class.html#newInstance--), and most other JVM languages have opted to treat all exceptions as unchecked ([Clojure](https://clojureverse.org/t/error-handling-in-clojure/1877/9), [Groovy](https://docs.groovy-lang.org/latest/html/documentation/#_exception_declaration), [Kotlin](https://kotlinlang.org/docs/reference/exceptions.html#checked-exceptions), [Scala](https://contributors.scala-lang.org/t/pre-sip-checked-exceptions/4044)). The trend in language design is squarely against checked exceptions.

[^via-pcall]: via [*pcall*](https://www.lua.org/pil/8.4.html)

Some languages have more than one of these, in which case backend implementors need to pick an approach.

- OCaml and F# have exceptions but sum types are more idiomatic.
- Perl and Ruby (and PHP?) have exceptions and global error variables but the latter are widely considered design warts.

It's possible for some backends to generate code to work in multiple scenarios. C++ inherits `errno` conventions from C, and it has exceptions when `-fno-exceptions` is not passed to the compiler, but `std::expected` (C++23) and `std::variant` (C++17) are widely available and probably preferred in modern code. A C++ backend could use exceptions when `#ifdef __EXCEPTIONS` and use macros for return types that produce `std::expected` when it's not defined.
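To make the idioms listed in this section concrete, here is a minimal Rust sketch of one fallible operation (integer division) exposed through three of them: a sum type, a Go-style extra output, and a C-style global error number. Exceptions are left out because Rust has no direct equivalent. All names here (`DivFailure`, `div_sum_type`, `div_extra_output`, `div_errno`, `last_error`) are hypothetical and chosen for illustration; none of them are Temper or backend APIs.

```rust
use std::cell::Cell;

/// Hypothetical failure type for this sketch; not a Temper or std type.
#[derive(Debug, PartialEq)]
pub struct DivFailure;

/// Sum-type idiom: the value is either a usable quotient or a failure.
pub fn div_sum_type(y: i32, z: i32) -> Result<i32, DivFailure> {
    // `checked_div` yields `None` for division by zero and for overflow.
    y.checked_div(z).ok_or(DivFailure)
}

/// Extra-output idiom, modeled on Go's `(result, err)` convention:
/// the first element is usable only when the second is `None`.
pub fn div_extra_output(y: i32, z: i32) -> (i32, Option<DivFailure>) {
    match div_sum_type(y, z) {
        Ok(q) => (q, None),
        Err(e) => (0, Some(e)),
    }
}

thread_local! {
    // Stand-in for a C-style global error number; 0 means "no error".
    static LAST_ERROR: Cell<i32> = Cell::new(0);
}

/// Global-error-number idiom: the return value alone cannot signal
/// failure, so callers must consult `last_error()` afterwards.
pub fn div_errno(y: i32, z: i32) -> i32 {
    let (q, err) = div_extra_output(y, z);
    LAST_ERROR.with(|e| e.set(if err.is_some() { 1 } else { 0 }));
    q
}

/// Reads the error number set by the most recent `div_errno` call.
pub fn last_error() -> i32 {
    LAST_ERROR.with(|e| e.get())
}
```

Only the sum-type version forces the caller to look at the failure before touching the result. The other two rely on convention, which is part of why the table classifies them differently.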
### Preserving failure metadata

Temper needs to **preserve failure metadata** when a problem originates in native language code and is properly handled in native language code.

Failure metadata includes values stored with *sum types* or exceptions, stack trace information attached to exceptions, and the specific values stored in global error number variables or in extra outputs.

For example:

1. Some Java program calls a Java library generated from Temper, passing a Java lambda expression.
2. Temper invokes the Java lambda.
3. The lambda raises an exception specific to the project.
4. The outer caller (1) gets an exception back from its call to Temper (2).
5. The exception it receives should match that raised in (3).

Preserving failure metadata helps non-Temper developers diagnose problems because exceptions often contain stack traces. In this case, the stack trace should point at the lambda.

Preserving failure metadata also helps target language programs: custom exception types, and info stored in *Result* types, can be part of contracts between parts of the same program.

### Lighter weight failure

It'd be nice if Temper could substitute **lightweight failure signalling** when failure is not exceptional, and the substitution only affects implementation details or the author decides that it is idiomatic in context.

For example, when `collection.indexOf(wanted)` fails to find `wanted` it could return -1 instead of allocating and raising an exception. If `indexOf` were defined as a private implementation detail (all callers are in the same compilation unit), then that could be done automatically, or its author could opt in via a translation annotation.

## The state of Temper today

Currently, Temper has a *sum* type approach. Like *Option* types, you either get *Some* result or *None* indicating no usable result.

```ts
x = y / z; // CAN FAIL IF Z IS 0

// That translates to code that checks like the below.
var fail: Boolean;
x = hs(fail, y / z);
if (fail) {
  bubble(); // JUMP TO RECOVERY CODE
} else {
  // x is usable
}
```

The `hs` builtin performs an operation (`y / z` in this case), traps failure and sets its first argument, `fail`, to false when the result is usable, or true when the operation failed and the result is not usable. We automatically produce these extra checking instructions so that the program does not observe a variable whose content is not usable.

Unfortunately, we've lost all the metadata. And our model assumes that functions that can bubble can return a failure sentinel. This may not hold for important languages.

In an exception raising language, `bubble()` can throw, but it doesn't know what to throw.

We can correct this by having `fail` hold either `null` or the non-null metadata.

```ts
x = y / z; // CAN FAIL IF Z IS 0

// That COULD translate like:
var reason: Reason | Null;
x = hsNew(reason, y / z);
if (reason != null) {
  bubble(reason); // JUMP TO RECOVERY CODE
} else {
  // x is usable
}
```

That looks very similar to Go's extra outputs, but it is easy to translate in other ways:

For a sum type language: *hsNew* simply unpacks a *Result* type, returning the result, and we use a type switch thus:

```ts
x = hsNew(reason, operation);
if (reason != null) { bubble(reason); }
use(x);

// Since control never continues past bubble(), that is equivalent to
if (reason != null) {
  bubble(reason);
} else {
  use(x);
}

// After sum-type transform
let xResult = operation;
match (xResult) {
  is Failure(let reason) -> do { bubble(reason); };
  is Success(let x) -> do { use(x); }
}
```

For an exception language, we preserve the control flow structure and so can erase the checks entirely.

For a global error number language, `hsNew` reads the global error number, probably resets it, and stores it or something corresponding to it in `reason`.

## An ambiguity remains

Unfortunately, we can't just turn our existing `T | Bubble` into a result type.
```ts
let callF<OK extends AnyValue, BAD extends Bubble>(
  f: Fn() : OK | BAD
): OK | BAD {
  let x = f();
  console.log("f passed");
  x
}
```

The semantics of *callF* are as follows:

1. call its argument
2. if it failed, propagate its failure
3. if it succeeded, log "f passed", and return its result

With the failure checks inserted, that becomes:

```ts
let callF<OK extends AnyValue, BAD extends Bubble>(
  f: Fn() : OK | BAD
): OK | BAD {
  let reason;
  let x = hsNew(reason, f());
  if (reason != null) {
    bubble(reason);
  } else {
    console.log("f passed");
    x
  }
}
```

Let's consider how a call to *callF* is translated to a sum-type language two ways:

1. *f* could fail, so *\<BAD\>* binds to a type.
2. *f* is known to not fail, so *\<BAD\>* binds to *Never*.

### 1. F could fail

Below we translate a type union like *OK | BAD* to a Rust *Result* type.

```ts
// TEMPER
let callF<OK extends AnyValue, BAD extends Bubble>(
  f: Fn() : OK | BAD
): OK | BAD { ... }

let x: Int | Badness = callF<Int, Badness>(fn: Int | Badness {
  shouldFail ? new Badness() : 42
});

// Rust
fn callF<OK, BAD>(
  f: fn () -> Result<OK, BAD>
) -> Result<OK, BAD> { ... }

let x: Result<i32, Badness> = callF::<i32, Badness>(
  || { if shouldFail { Err(Badness {}) } else { Ok(42) } }
)
```

### 2. F does not fail

```ts
// TEMPER
let callF<OK extends AnyValue, BAD extends Bubble>(
  f: Fn() : OK | BAD
): OK | BAD { ... }

let x: Int = callF<Int, Never>(fn: Int { 42 });

// Rust
fn callF<OK, BAD>(
  f: fn () -> Result<OK, BAD>
) -> Result<OK, BAD> { ... }

let x: i32/*⚠️*/ = callF::<i32, !>(|| /*⚠️*/42)
```

## Solution 1: Autoboxing & unboxing

In the Rust translation above, there are two problems:

- *callF* returns a *Result* but the result of *callF* is assigned to an *i32*
- The closure passed to *callF* returns an *i32* but *callF* expects a *Result*

The *TmpLTranslator* could adapt. It could auto-unbox the result from *callF* from a *Result\<i32, !\>* to an *i32* by recognizing the type mismatch.
```rust
#![feature(never_type)]

pub fn unbox<T>(r: Result<T, !>) -> T {
    match r {
        Ok(x) => x,
        Err(x) => x
    }
}
```

This code depends on Rust recognizing that the `Err(x) => x` arm is type-safe because Rust has an (experimental) bottom type.

The other adaptation is harder though, because we have to wrap a function in another function. Note that below, the return type can't be `fn () -> Result<T, !>` because a capturing closure cannot be coerced to a function pointer type, only to a type implementing the *Fn* trait, and there's a need for a `move` to handle lifetime expectations.

```rust
pub fn auto_boxing_fn<T>(
    f: fn () -> T
) -> impl Fn () -> Result<T, !> {
    move || Ok(f())
}
```

## Solution 2: Use result types in Temper

Alternatively, we could use *Result* types in Temper while maintaining, within Temper, the feel of a language that doesn't require explicit boxing/unboxing.

1. Functions that can fail under any parameterization should return a *Result* type.
2. Temper automatically unboxes failure results using inserted failure checks.
3. Temper automatically unwraps success results.
4. Temper provides shorthand syntax for specifying result types.

This requires a different approach for each of the cases above.

- For a *sum type* language, the result type matches closely.
- For an *exception* language, no extra work. A *Result*'s error type parameter just specifies the super-types of any exceptions thrown.
- For a *multiple return value language*, we benefit because we don't have to synthesize extra parameters similar to boxing and unboxing in solution 1 above.
- For a *global error number* language, we at least know whether we should be clearing the error state.

A notable feature of this approach is that Temper does not allow creation of instances of nested result types (unless the wrapped value is received from outside as a strict super-type of *Result*): *Result\<Result\<T, U\>\>*.

The rest of this article assumes and expands on this approach.

## Making Result types easy to write.
```ts
Result<Int, Bubble> // 19 keystrokes
Int | Bubble        // 12 keystrokes
Int~                // 4 keystrokes
```

Temper will probably end up using a postfix `?` to mean "or null". Temper could adopt another character like `!` or `~` in postfix/infix position to mean "can bubble."

For example:

```ts
// returns an integer
let f(): Int {...}

// returns an integer or null
let f(): Int? {...}

// returns an integer or bubbles
let f(): Int~ {...}

// returns an integer, or the error
// is a BadInput
let f(): Int~BadInput {...}
```

Postfix `~` (😖) specifies that some generic Temper *Bubble* type is emitted that communicates *NoUsableResult* but not why. We could expand to infix `~` to allow specifying a tighter bound on the error type.

We should maintain the invariant that *T~E* is a sub-type of *T~* but not of *T*. That may be tricky if we have to pick a type for `bubble()` calls, for example if *T~* is effectively *T~Null*. We might have to do that anyway so that `orelse` can trap all failures.

But regardless of whether we type *E*, we allow assignment of *T~E*s to *T*s.

```ts
let callF<T, E>(
  f: Fn(): T~E
): T~E { ... }

let x: Int = callF {
  if (thereIsAProblem()) {
    bubble()
  } else {
    42
  }
}
```

That code works because we auto-unbox results and insert the `if (reason != null)` instructions.

The other side of the auto-boxing is partially eliminated by type inference.

```ts
let x = callF { 42 };
```

Here, the inferred return type of `{ 42 }` is *Int~Never* because the block lambda's type is inferred based on both the code's happy paths and the callee's signature.

This doesn't solve the problem where a Temper dev wants to pass a non-bubbly function where a bubbly function is expected, but existing language choices provide a recovery path.

```ts
let f(): Int { 42 }
callF(f)      // TYPE MISMATCH Int <!: Int~
callF { f() } // OK
```

Temper developers will have to accept some type errors, but there is a clear path to fix them, and the explicit wrapper relieves backends of the need to do generic function wrapping.
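As a rough illustration of why the explicit wrapper helps a sum-type backend, the sketch below shows how `callF { f() }` might look once translated to Rust. It assumes stable Rust, with `std::convert::Infallible` standing in for the nightly-only `!` type used in the earlier *unbox* example. The names `call_f`, `unbox`, `f`, and `demo` are illustrative and are not actual backend output.

```rust
use std::convert::Infallible;

/// Rough analogue of the article's `callF`: call `f`, propagate its
/// failure, otherwise log and return the success value.
pub fn call_f<T, E, F>(f: F) -> Result<T, E>
where
    F: Fn() -> Result<T, E>,
{
    let x = f()?;
    println!("f passed");
    Ok(x)
}

/// Unboxes a result whose error type has no values.
pub fn unbox<T>(r: Result<T, Infallible>) -> T {
    match r {
        Ok(x) => x,
        // `Infallible` is an empty enum, so this arm can never run.
        Err(never) => match never {},
    }
}

/// A non-bubbly function, like `let f(): Int { 42 }` above.
pub fn f() -> i32 {
    42
}

/// Analogue of `callF { f() }`: the wrapping closure is written at the
/// call site, so no generic function-wrapping helper is required.
pub fn demo() -> i32 {
    unbox(call_f(|| Ok::<i32, Infallible>(f())))
}
```

Because the closure is spelled out where the call happens, the backend only has to emit an `Ok(...)` around the body. It never needs something like the `auto_boxing_fn` adapter from Solution 1.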
## Debugging

Temper's `console` should integrate well with target language debugging facilities. That means we need some way to access the failure result.

```ts
mightBubble() orelse(e) do {
  console.log("Might bubble did", e)
}
```

Here the `(e)` (or `(let e)`) declares a variable that is bound to the bubbled value or *Null* (see notes about *T~E* subtyping and type-non-matching exceptions raised in target language code above).

## Opaque bubbles

Exception languages often have detailed type hierarchies of exception types. These have similarities, but also differences.

### Python exception hierarchy

```mermaid
flowchart
  Exception
  ZeroDivisionError[ZeroDivisionError①]
  FileNotFoundError[FileNotFoundError②]
  EOFError[EOFError③]
  RecursionError[RecursionError④]
  IndexError[IndexError⑤]
  AttributeError[AttributeError⑥]
  ZeroDivisionError --> ArithmeticError --> Exception --> BaseException --> object
  FileNotFoundError --> OSError --> Exception
  EOFError --> Exception
  RecursionError --> RuntimeError --> Exception
  IndexError --> LookupError --> Exception
  AttributeError --> Exception
```

### Java exception hierarchy

```mermaid
flowchart
  Exception
  ArithmeticException[ArithmeticException①]
  FileNotFoundException[FileNotFoundException②]
  EOFException[EOFException③]
  StackOverflowError[StackOverflowError④]
  ArrayIndexOutOfBoundsException[ArrayIndexOutOfBoundsException⑤]
  NullPointerException[NullPointerException⑥]
  ArithmeticException --> RuntimeException --> Exception --> Throwable
  FileNotFoundException --> IOException --> Exception
  EOFException --> IOException
  StackOverflowError --> VirtualMachineError --> Error --> Throwable
  ArrayIndexOutOfBoundsException --> IndexOutOfBoundsException --> RuntimeException
  NullPointerException --> RuntimeException
```

In Python, *FileNotFound* and *EndOfFile* are almost disjoint (the former is an *OSError*), but in Java, they have a common super-type: *IOException*.
Python allows distinguishing *DivisionByZero* from other kinds of arithmetic problems (overflow) but Java does not.

There's no way to connect Temper bubble values' types to backend language types while preserving the ability to consistently answer *is-a* questions.

JavaScript uses the same error type for a failure to parse JSON and regular expressions, meaning that a JavaScript backend that delegates work needs to proactively remap exceptions or no backend can distinguish between malformed JSON and a malformed regex:

```sh
$ node
Welcome to Node.js v21.7.1.
Type ".help" for more information.
> JSON.parse('[')
Uncaught SyntaxError: Unexpected end of JSON input
> new RegExp('[')
Uncaught SyntaxError: Invalid regular expression: /[/: Unterminated character class
>
```

I.e., it would be hellishly difficult to use exception type information to dispatch to different *catch* blocks à la Java's:

```java
try {
} catch (ArrayIndexOutOfBoundsException e) {
} catch (IndexOutOfBoundsException e) {
}
```

Temper should treat bubble values as mostly opaque: we pass them through so that debugging facilities like `console` have all the information they need to display the kinds of diagnostics that developers in that language are accustomed to.

## Possible to allow capturing whole `~`

Sometimes one function may want a whole result object. We could implement a builtin (*takeResult*) that acts like *hs* but does not unpack.

```temper
let getsResult(f: Fn(): Int~) {
  let r: Int~ = takeResult(f());
  match (r) {
    is OkResult -> ...;
    is BubbleResult -> ...;
  }
}
```

Unfortunately, this would require allowing manipulating bubbles as values, which opens a whole can of worms; consider how little follows from `obj.propName === undefined` because `undefined` is a first-class value in JavaScript.
Alternatively, when translating to *exception* and *extra output* languages, we could do tail-call optimizations: when a function that can bubble directly returns the result of a bubbly operation with a compatible return type, detect that and do not bother unpacking.

## Unresolved issues: multiple error types in checked exception languages

Preserving fine-grained type information for bubbled values is useful for documentation but also for translating to languages like Java that have checked exceptions.

Languages with checked exceptions often allow for multiple exception types. The Java below could fail to read the file (it does not exist or the current user does not have read access) or the content read is not a well-formed decimal string.

```java
static int readANumberFromAFile(
  File f
) throws IOException, NumberFormatException {
  ...
}
```

This could be represented in Temper using an `|` type: this function bubbles this way *or* that way.

Expanding use of allowable `|` types requires more complexity in the type solver, but as long as we keep bubbled values opaque, we are not allowing more casting based on `|` types.

## Roadmap

There are a number of concrete steps to get the Temper toolchain working with *Result*s.

1. Define some types in *WellKnownType*
   - *interface Result\<OK, ERR\>*, not an *AnyValue* sub-type.
   - */\*sealed\*/ class Ok\<T\> extends Result\<T, Never\>*
   - */\*sealed\*/ class Err\<T\> extends Result\<Never, T\>*
2. Define *BuiltinFuns* to pack and unpack results. Do not expose these in *BuiltinEnvironment*
3. Change *TProblem* so that its value content can store a log entry (for internal interpreter use) or a bubble *Value\<\*>* and the *Fail* result wrapper similarly.
4. Change builtin *bubble* function to take an optional argument that it attaches to a fail result.
5. Define an alternative to *hs*, *HandlerScopeFn.kt*, called *hs2* below that requires its first argument to be a left-name for a *Result* instead of a boolean. Its semantics are:
   i. evaluate its second argument.
   ii. If it produces *NotYet* return that
   iii. else if it produces a *Value* return an *Ok* wrapper and store it in the environment record for the left name
   iv. else the result is a *Fail* result. Proceed as below:
   v. If there is a log entry, convert it to a string and store that string value in the environment record for the left name
   vi. else if there is a failure value, store it in the environment record for the left name
   vii. else there is no associated value, so store the string value *"internal interpreter failure"* in the environment record for the left name
   viii. Regardless, if any of the attempts to store in the environment record above yielded a *NotYet* or a *Fail* value, panic()
6. Rework *MagicSecurityDust* to produce *hs2* instructions instead.

   ```ts
   // Input
   mightFail()

   // Output before
   var fail: Boolean;
   hs(fail, mightFail());
   if (fail) { bubble() }

   // Output after
   var bubbled: Result;
   hs2(bubbled, mightFail());
   if (isErr(bubbled)) {
     bubble(unpackErr(bubbled));
   }

   // -------------------

   // Input
   x = mightFail();
   use(x);

   // Output before
   var fail;
   x = hs(fail, mightFail());
   if (fail) {
     bubble();
   }
   use(x);

   // Output after
   var bubbled: Result;
   hs2(bubbled, mightFail());
   if (isErr(bubbled)) {
     bubble(unpackErr(bubbled));
   }
   let x = unpackOk(bubbled);
   use(x);
   ```

7. Rework *Typer* to recognize *hs2*. There is currently special case handling for *hs* which should serve as a template.
8. Add *TyperTest* cases to make sure that when a block lambda only succeeds with type *T*, we can still infer *T~Never* when its callee requires a result type.
9. Fix up lots of test cases.
10. Add `~` as an infix operator with a min arity of 1 with a precedence that is the same as `|` and that is right-associative so that `T ~ E1 | E2` is equivalent to `T ~ (E1 | E2)` and `Fn: T ~ E1` is equivalent to `Fn: (T ~ E1)` and `Fn: Fn: T ~ E1` is equivalent to `Fn: (Fn: (T ~ E1))`
11. In *TypeFuns.kt* implement `~` given one or two arguments to expect its arguments to be *Value(ReifiedType, TType)* values and return a *ReifiedType* for a *Result* type
12. Maybe tweak *toPseudoCode* to simplify result types to `~` syntax
13. Work through *TmpLBackendTest* cases to make sure we can translate exceptions properly. We will probably need a new *enum BubbleBranchStrategy* member: *ConnectResultToSumType*.
14. Clean up old *hs* and rename *hs2* to *hs* or something better.
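The *hs2* semantics in step 5 of the roadmap can be modeled as a small decision function. The Rust sketch below is only a model and makes several assumptions that are not in the text above: interpreter values are represented as plain `String`s, the environment record is replaced by the function's return value, failure reasons are wrapped in `Err` (which is what `isErr`/`unpackErr` in step 6 suggest), and the panic of step viii is omitted because nothing here can fail to store. `Outcome` and this `hs2` are hypothetical names, not the Kotlin implementation.

```rust
/// Hypothetical stand-in for what evaluating hs2's second argument can
/// produce in the interpreter.
pub enum Outcome {
    /// Evaluation cannot complete yet (step ii).
    NotYet,
    /// Evaluation produced a usable value (step iii).
    Value(String),
    /// Evaluation failed, possibly with a log entry or a bubbled value
    /// attached (steps iv to vii).
    Fail {
        log_entry: Option<String>,
        failure_value: Option<String>,
    },
}

/// Returns `None` for `NotYet` (nothing is stored), otherwise the result
/// that would be stored under the left name.
pub fn hs2(outcome: Outcome) -> Option<Result<String, String>> {
    match outcome {
        Outcome::NotYet => None,
        Outcome::Value(v) => Some(Ok(v)),
        Outcome::Fail { log_entry, failure_value } => {
            // Prefer the log entry, then the bubbled value, then a
            // fixed fallback message.
            let reason = log_entry
                .or(failure_value)
                .unwrap_or_else(|| "internal interpreter failure".to_string());
            Some(Err(reason))
        }
    }
}
```

The ordering inside the `Fail` arm mirrors steps v to vii: a log entry wins over a bubbled value, and the fallback string is used only when neither is present.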