Some commands discard their pipeline input as part of their normal operation. This can cause some unexpected issues during pipeline input type checking. I discovered this issue while developing run-time pipeline input type checking, but this affects the type system more generally – issues that are currently run-time only will cause us issues later down the line if we improve type inference.
The fundamental problem is that we shouldn't enforce type checking on pipeline input if that input is going to be ignored. This is especially applicable for "value-generating" commands like `echo` or `open`.
Let's look at a problematic example with `echo`:
This looks fine, and indeed it runs fine (and I think most would agree it should run fine). There's a sneaky issue here, though, if you analyze the pipeline input/output types: `echo` only supports `nothing` input, but it's (implicitly) being passed `int` pipeline input.
If we write this out (much) more explicitly, the parser will catch on to this:
However, it doesn't actually matter what the type of the pipeline input passed to `echo` is, since `echo` will discard that input, but the parser doesn't know that. If the parser could recognize that `echo` is implicitly being passed an `int` in the `each { echo "hello" }` example (via improved type inference), it would also raise a type mismatch error in that case.
Commands currently have no way to indicate that they might discard their pipeline input, which causes undesirable type checking behavior, like the above example. Because of our limited type inference, this issue has been lurking quietly in the background, but it has begun to rear its head with the introduction of run-time type checking.
When implementing run-time type checking, examples such as piping into `echo` led me to a simple rule: if a command's only pipeline input type is `nothing`, then don't enforce pipeline input type checking. This seemed like an adequate solution (despite diverging from the parse-time type checker, as shown above), but it misses a critical case: conditionally discarded pipeline input.
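As a rough illustration of that rule, here is a toy Rust model. The `Type` enum and both functions are hypothetical names made up for this sketch and are not Nushell's actual implementation; they only show the shape of the rule and where it stops helping.

```rust
// Toy model only: `Type`, `should_check_input`, and `check_input` are
// hypothetical names for this sketch, not Nushell's real types or functions.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Type {
    Nothing,
    Int,
    String,
}

/// The simple rule: skip the check when `nothing` is the command's only
/// declared pipeline input type.
fn should_check_input(declared_inputs: &[Type]) -> bool {
    !matches!(declared_inputs, [Type::Nothing])
}

/// Returns `Ok(())` if `actual` is acceptable pipeline input under the rule.
fn check_input(declared_inputs: &[Type], actual: Type) -> Result<(), String> {
    if !should_check_input(declared_inputs) || declared_inputs.contains(&actual) {
        Ok(())
    } else {
        Err(format!(
            "type mismatch: got {:?}, expected one of {:?}",
            actual, declared_inputs
        ))
    }
}
```

In this model, a command declaring only `Type::Nothing` accepts an `Int` input without complaint, while a command declaring both `Type::Nothing` and `Type::String` is still checked, which is where conditionally discarded input becomes a problem.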
Here's an example of this: the `open` command can read the filename to open from pipeline input, or it can read the filename as a positional argument and ignore the pipeline input. This means that its input/output types look like this (see footnote 1):
If `open` is passed a value which is not a string, then it's correct to raise a type error:
But if `open` is also passed a positional filename argument, then it's not correct to raise a type error, since the input is discarded. We still raise one anyway:
In the implicit form of this, no parse-time error is raised, since the parser doesn't know a record is being passed to `open`:
However, the initial implementation of run-time type checking did raise a type error here, because at run-time we do know that a record is being passed to `open`. This led me to implement a temporary workaround (#14922), which adds `any` as a pipeline input type to `open` (and other commands which conditionally discard pipeline input), bypassing parse-time and run-time pipeline input type checking.
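The effect of the workaround can be sketched with a toy Rust model. Everything here is an assumption for illustration: the `Type` enum, the `input_accepted` helper, and the two constant lists (which approximate the `open` input types described above, before and after #14922) are not taken from Nushell's source.

```rust
// Toy model only: hypothetical names, not Nushell's real signature machinery.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Type {
    Nothing,
    String,
    Record,
    Any,
}

/// Assumed input types for `open` before the workaround.
const OPEN_INPUTS_BEFORE: &[Type] = &[Type::Nothing, Type::String];

/// Assumed input types for `open` after the workaround adds `any`.
const OPEN_INPUTS_AFTER: &[Type] = &[Type::Nothing, Type::String, Type::Any];

/// Naive check: the input must match a declared type, and `Any` matches
/// every input type.
fn input_accepted(declared: &[Type], actual: Type) -> bool {
    declared.iter().any(|t| *t == Type::Any || *t == actual)
}
```

With the "before" list, a `Record` input is rejected even when a positional filename would have caused it to be discarded. With the "after" list, the same input is accepted, but so is every other input, so the check no longer catches a non-string input when no positional argument is given.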
Without a way for commands to indicate that they might discard pipeline input, we can't confidently type check pipeline input for these commands.
(Footnote 1: Technically, after #14922 the `any` pipeline input type was added to `open` for reasons explained later in this section, so these examples won't work in current Nushell, but this was the behavior before #14922.)
1. Disallow conditionally discarded pipeline input, and make `nothing` pipeline input mean that input is discarded.
This is probably the most internally consistent solution, and it doesn't require massive changes to the type system. It's unfortunate that commands like `open` would have to choose between a positional argument and pipeline input.
It would be a little weird if we still passed pipeline input values to commands with a `nothing` input type in this case. This could potentially be problematic for commands which behave differently depending on the presence of a pipeline input value, such as `generate`:
Maybe we swap out pipeline input with `PipelineData::Empty` when running a command with `nothing` pipeline input? This might be somewhat surprising, but is probably better than the above behavior.
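A minimal sketch of that swap, as a toy Rust model. `ToyPipelineData` is a simplified stand-in invented for this example and is not `nu_protocol`'s `PipelineData`; `input_for_command` and `toy_command` are likewise hypothetical.

```rust
// Toy model only: `ToyPipelineData` is a simplified stand-in, not the real
// `PipelineData` type, and both functions are hypothetical.
#[derive(Debug, Clone, PartialEq)]
enum ToyPipelineData {
    Empty,
    Value(i64),
}

#[derive(Debug, Clone, Copy, PartialEq)]
enum Type {
    Nothing,
    Int,
}

/// Decide what the command actually receives: if `nothing` is its only
/// declared input type, the incoming value is dropped and replaced by `Empty`.
fn input_for_command(declared: &[Type], input: ToyPipelineData) -> ToyPipelineData {
    if matches!(declared, [Type::Nothing]) {
        ToyPipelineData::Empty
    } else {
        input
    }
}

/// A toy command whose behavior depends on whether input is present.
fn toy_command(input: &ToyPipelineData) -> &'static str {
    match input {
        ToyPipelineData::Empty => "no input",
        ToyPipelineData::Value(_) => "got input",
    }
}
```

Under this model, a command that declares only `nothing` always takes the "no input" branch, regardless of what was piped into it, so its behavior no longer depends on its position in a pipeline.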
2. Allow commands to indicate that they (conditionally) discard pipeline input
This is similar to (1), but with a more explicit mechanism for commands to indicate they might discard pipeline input. This is what I was going for with the `empty` "type" in #15165, but that PR doesn't solve the problem very elegantly, and still delegates type checking to run-time command code (which is especially problematic for custom commands, where that's not trivial).
We could potentially implement this in a way that would allow us to still type check commands which conditionally discard pipeline input, but I'm not sure exactly what that would look like (especially for custom commands). Maybe a new field in `Signature::input_output_types`?
3. Combination of (1) and (2)
Disallow conditionally discarded pipeline input, and add a way for commands to indicate that they discard pipeline input. This would make swapping pipeline input with `PipelineData::Empty` less surprising. For example, a `discard` (🚲🛖) pipeline input "type" (but only input, it doesn't make sense to discard the output):
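One way to picture how such a `discard` input type might behave is the toy Rust model below. The `InputType` and `ToyPipelineData` enums and the `prepare_input` function are hypothetical and exist only for this sketch. It assumes that a `discard` input accepts anything and is replaced by an empty input before the command runs, while other input types are checked as usual.

```rust
// Toy model only: hypothetical names, not a proposal for the real API.
#[derive(Debug, Clone, Copy, PartialEq)]
enum InputType {
    Discard,
    String,
}

#[derive(Debug, Clone, PartialEq)]
enum ToyPipelineData {
    Empty,
    Str(String),
    Int(i64),
}

/// Type check the incoming data against the declared input type and return
/// what the command would actually receive.
fn prepare_input(
    declared: InputType,
    input: ToyPipelineData,
) -> Result<ToyPipelineData, String> {
    match (declared, input) {
        // `discard`: any input is accepted, then dropped before the command runs.
        (InputType::Discard, _) => Ok(ToyPipelineData::Empty),
        // `string`: a string input is passed through unchanged.
        (InputType::String, s @ ToyPipelineData::Str(_)) => Ok(s),
        // Anything else is a type mismatch.
        (declared, input) => Err(format!(
            "type mismatch: {:?} does not accept {:?}",
            declared, input
        )),
    }
}
```

Because the discard is explicit in the declared type, the type checker and the command agree on what happens to the input, and the swap to an empty input is visible in the signature rather than being a side effect of a `nothing` type.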
4. Secret fourth thing?