# Discarded Pipeline Input and Type Checking

## The problem

Some commands discard their pipeline input as part of their normal operation. This can cause unexpected issues during pipeline input type checking. I discovered this while developing run-time pipeline input type checking, but it affects the type system more generally: issues that are currently run-time only will cause problems later down the line if we improve type inference.

The fundamental problem is that **we shouldn't enforce type checking on pipeline input if that input is going to be ignored.** This is especially applicable for "value-generating" commands like `echo` or `open`.

Let's look at a problematic example with `echo`:

```nushell
1..10 | each { echo "hello" }
```

This looks fine, and indeed it runs fine (and I think most would agree it _should_ run fine). There's a sneaky issue here, though, if you analyze the pipeline input/output types: `echo` only supports `nothing` input, but it's (implicitly) being passed `int` pipeline input. If we write this out (much) more explicitly, the parser will catch on to this:

```nushell
1..10 | each {|e|
  let e: int = $e
  $e | echo "hello"
}
# => Error: nu::parser::input_type_mismatch
# =>
# =>   × Command does not support int input.
# =>    ╭─[entry #18:3:8]
# =>  2 │   let e: int = $e
# =>  3 │   $e | echo "hello"
# =>    ·        ──┬─
# =>    ·          ╰── command doesn't support int input
# =>  4 │ }
# =>    ╰────
```

However, it doesn't actually matter what type of pipeline input is passed to `echo`, since it will discard that input, **but the parser doesn't know that**. If the parser could recognize that `echo` is implicitly being passed an `int` in the `each { echo "hello" }` example (via improved type inference), it would also raise a type mismatch error in that case.

Commands currently have no way to indicate that they might discard their pipeline input, which causes undesirable type checking behavior, as in the above example.

Because of our limited type inference, this issue has been lurking quietly in the background, but it has begun to rear its head with the introduction of run-time type checking.

## Conditionally discarded pipeline input

When implementing run-time type checking, examples such as piping into `echo` led me to a simple rule: if a command's only pipeline input type is `nothing`, then don't enforce pipeline input type checking. This seemed like an adequate solution (despite diverging from the parse-time type checker, as shown above), but it misses a critical case: conditionally discarded pipeline input.

Here's an example of this: the `open` command can read the filename to open from pipeline input, _or_ it can read the filename as a positional argument and ignore the pipeline input. This means that its input/output types look like this<sup>1</sup>:

```
╭───┬─────────┬────────╮
│ # │  input  │ output │
├───┼─────────┼────────┤
│ 0 │ nothing │ any    │
│ 1 │ string  │ any    │
╰───┴─────────┴────────╯
```

If `open` is passed a value which is not a string, then it's correct to raise a type error:

```nushell
ls | where type == file | each {|e|
  let e: record = $e
  $e | open
}
# => Error: nu::parser::input_type_mismatch
# =>
# =>   × Command does not support record input.
# =>    ╭─[entry #5:3:8]
# =>  2 │   let e: record = $e
# =>  3 │   $e | open
# =>    ·        ──┬─
# =>    ·          ╰── command doesn't support record input
# =>  4 │ }
# =>    ╰────
```

But if `open` is also passed a filename as a positional argument, then it's _not_ correct to raise a type error, since the input is discarded. We still raise one, though:

```nushell
ls | where type == file | each {|e|
  let e: record = $e
  $e | open $e.name
}
# => Error: nu::parser::input_type_mismatch
# =>
# =>   × Command does not support record input.
# =>    ╭─[entry #6:3:8]
# =>  2 │   let e: record = $e
# =>  3 │   $e | open $e.name
# =>    ·        ──┬─
# =>    ·          ╰── command doesn't support record input
# =>  4 │ }
# =>    ╰────
```

In the implicit form of this, no parse-time error is raised, since the parser doesn't know a record is being passed to `open`:

```nushell
ls | where type == file | each {|e| open $e.name }
```

However, the initial implementation of run-time type checking _did_ raise a type error here, because at run-time we _do_ know that a record is being passed to `open`. This led me to implement a temporary workaround ([#14922](https://github.com/nushell/nushell/pull/14922)) which adds `any` as a pipeline input type to `open` (and other commands which conditionally discard pipeline input), bypassing parse-time and run-time pipeline input type checking.

Without a way for commands to indicate that they might discard pipeline input, we can't confidently type check pipeline input for these commands.

<sup>1</sup> Technically, after [#14922](https://github.com/nushell/nushell/pull/14922) the `any` pipeline input type was added to `open` for reasons explained later in this section, so these examples won't work in current Nushell, but this was the behavior before #14922.

## Possible solutions

1. Disallow conditionally discarded pipeline input, and make `nothing` pipeline input mean that input is discarded.

   This is probably the most internally consistent solution, and it doesn't require massive changes to the type system. It's unfortunate that commands like `open` would have to choose between a positional argument and pipeline input.

   It would be a little weird if we still passed pipeline input values to commands with a `nothing` input type in this case.
   This could potentially be problematic for commands which behave differently depending on the presence of a pipeline input value, such as `generate`:

   ```nushell
   def five-fib []: nothing -> list<int> {
     generate {|fib=[0,1]| {out: $fib.0, next: [$fib.1, ($fib.0 + $fib.1)]} } | first 5
   }

   five-fib
   # => [0, 1, 1, 2, 3]

   # oops...
   [[0 1]] | five-fib
   # => [0]
   ```

   Maybe we swap out pipeline input with `PipelineData::Empty` when running a command with `nothing` pipeline input? This might be somewhat surprising, but is probably better than the above behavior.

2. Allow commands to indicate that they (conditionally) discard pipeline input

   This is similar to (1), but with a more explicit mechanism for commands to indicate they might discard pipeline input. This is what I was going for with the `empty` "type" in [#15165](https://github.com/nushell/nushell/pull/15165), but that PR doesn't solve the problem very elegantly, and still delegates type checking to run-time command code (which is especially problematic for custom commands, where that's not trivial).

   We could potentially implement this in a way that would allow us to still type check commands which conditionally discard pipeline input, but I'm not sure exactly what that would look like (especially for custom commands). Maybe a new field in `Signature::input_output_types`?

3. Combination of (1) and (2)

   Disallow conditionally discarded pipeline input, and add a way for commands to indicate that they discard pipeline input. This would make swapping pipeline input with `PipelineData::Empty` less surprising. For example, a `discard` (🚲🛖) pipeline input "type" (but _only_ for input; it doesn't make sense to discard the output):

   ```nushell
   def five-fib []: discard -> list<int> {
     generate {|fib=[0,1]| {out: $fib.0, next: [$fib.1, ($fib.0 + $fib.1)]} } | first 5
   }

   five-fib
   # => [0, 1, 1, 2, 3]

   # passes type checking, but is discarded
   [[0 1]] | five-fib
   # => [0, 1, 1, 2, 3]
   ```

4. secret fourth thing?
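
To make option (3) a little more concrete, here is a rough sketch in Rust of how the check could behave. This is a standalone toy model, not Nushell's actual code: `Type`, `InputType`, `InputType::Discard`, and `prepare_input` are all hypothetical names, and `PipelineData` here only borrows the name of the real type to mirror the `PipelineData::Empty` swap described above. Values are represented by their type alone to keep the sketch short.

```rust
// Hypothetical, simplified model. None of this is Nushell's real API.

/// A tiny stand-in for the value types involved in the examples above.
#[derive(Debug, Clone, PartialEq)]
enum Type {
    Nothing,
    Int,
    String,
    Record,
    Any,
}

/// What a signature declares about its pipeline input (hypothetical).
#[derive(Debug, Clone, PartialEq)]
enum InputType {
    /// Input of this type is consumed by the command.
    Accepts(Type),
    /// Input of any type is allowed, and is dropped before the command runs.
    Discard,
}

/// Stand-in for pipeline data; a value is modeled by its type only.
#[derive(Debug, Clone, PartialEq)]
enum PipelineData {
    Empty,
    Value(Type),
}

/// Check the incoming pipeline data against the declared input types and
/// return the data the command should actually receive.
fn prepare_input(
    declared: &[InputType],
    input: PipelineData,
) -> Result<PipelineData, String> {
    // A `Discard` entry means the command never looks at its input:
    // skip the type check entirely and hand the command empty input.
    if declared.contains(&InputType::Discard) {
        return Ok(PipelineData::Empty);
    }

    // Otherwise, enforce the declared input types as usual.
    let actual = match &input {
        PipelineData::Empty => Type::Nothing,
        PipelineData::Value(ty) => ty.clone(),
    };
    let accepted = declared.iter().any(|d| match d {
        InputType::Accepts(Type::Any) => true,
        InputType::Accepts(ty) => *ty == actual,
        InputType::Discard => false, // handled by the early return above
    });

    if accepted {
        Ok(input)
    } else {
        Err(format!("command does not support {:?} input", actual))
    }
}
```

In this model, a command declared with `[InputType::Discard]` accepts `int` input and receives `PipelineData::Empty`, while a command declared with only `Accepts(Type::Nothing)` still rejects it. The sketch deliberately leaves out conditional discard (the `open` case), since option (3) disallows it. Supporting that would need the check to know about the arguments as well, which is the open question in option (2).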