
Types and Overloading

Types

Each command has an input type and an output type, both of which default to any. Only data coming from a pipe counts as input, i.e., command arguments are not input.

One way to think about overloads is as object-oriented programming, roughly in the style of Julia's methods: adding an overload == defining a method for a type.

To define a command's types, we need syntax for them, and we need it in two places:

  • definition
  • call

The syntax also allows us to do more comprehensive type checking.

Definition-site type annotations

Command signature:

def foo [in: int] -> string { ls | get $in | get name }

Closure signature:

{|in: int, -> string| ls | get $in | get name }

Call-site type inference

3 | foo  # 3 is a static integer => statically inferred

# ... same as ...

def generate-int [] { # no output type means 'any'
    3
}  
generate-int | foo # dynamically checks for the type and sees integer => pass

# ... same as ...

def foo [in: string] -> string { ... }  # overloaded foo
<int>foo -h  # force Nushell to use a specific overload

(Help messages need special treatment, which is discussed later.)

Static vs. dynamic

We need:

  • static type checking
    • Checking types / deciding overloads at type-check time (after parsing, before eval)
  • dynamic type checking
    • Checking types / deciding overloads during evaluation

We could add a --strict flag to the nu binary that enforces static types everywhere: anything that would otherwise fall back to dynamic type checking would throw an error. (Is this how it works in TypeScript?)

With type annotations, we can make static typing more powerful.
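The two phases could share one lookup. Below is a minimal Rust sketch under simplifying assumptions: `Ty`, `Resolution`, `resolve_static`, `resolve_dynamic`, and `resolve_strict` are hypothetical names invented for illustration, not existing Nushell APIs, and an overload is reduced to its declared input type.

```rust
/// Simplified, hypothetical stand-in for the type enum; only what the sketch needs.
#[derive(Debug, Clone, PartialEq)]
enum Ty {
    Int,
    String,
    List,
    Any,
}

/// Outcome of trying to pick an overload at type-check time.
#[derive(Debug, PartialEq)]
enum Resolution {
    /// Input type is known: the overload index is decided before eval.
    Static(usize),
    /// Input type is `any`: the decision is postponed to evaluation.
    Deferred,
    /// Input type is known and no overload accepts it.
    Mismatch,
}

/// `overloads` holds the declared input type of each overload of one command.
fn resolve_static(overloads: &[Ty], input: &Ty) -> Resolution {
    if *input == Ty::Any {
        return Resolution::Deferred;
    }
    match overloads.iter().position(|t| t == input || *t == Ty::Any) {
        Some(i) => Resolution::Static(i),
        None => Resolution::Mismatch,
    }
}

/// Dynamic check: the same lookup, using the runtime type of the actual value.
fn resolve_dynamic(overloads: &[Ty], runtime: &Ty) -> Option<usize> {
    overloads.iter().position(|t| t == runtime || *t == Ty::Any)
}

/// Assumed `--strict` behaviour: having to defer to runtime is itself an error.
fn resolve_strict(overloads: &[Ty], input: &Ty) -> Result<usize, &'static str> {
    match resolve_static(overloads, input) {
        Resolution::Static(i) => Ok(i),
        Resolution::Deferred => Err("input type is not statically known"),
        Resolution::Mismatch => Err("no overload matches the input type"),
    }
}
```

With overloads `[Ty::Int, Ty::String]`, an `int` input resolves statically to index 0, an `any` input is deferred, and strict mode turns that deferral into an error.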

Q: We would need to check all exit paths for cases like this one, where the if has no else branch and the command can therefore also return nothing:

def foo [] -> any<string, nothing> {
    if (ls | length) > 4 {
        "yes"
    }
}
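One possible way to do the exit-path check, sketched in Rust. The tiny `Expr` tree, `exit_types`, and `check_output` are hypothetical and only model the shape of the example above (a literal inside an `if` without `else`); a real checker would have to walk the actual AST.

```rust
use std::collections::BTreeSet;

/// Hypothetical, cut-down type enum for this sketch.
#[derive(Debug, Clone, PartialEq, Eq, PartialOrd, Ord)]
enum Ty {
    String,
    Nothing,
}

/// Hypothetical expression tree: just enough to model the body of `foo`.
enum Expr {
    Lit(Ty),
    If {
        then_branch: Box<Expr>,
        else_branch: Option<Box<Expr>>,
    },
}

/// Collect every type the expression can evaluate to, one per exit path.
fn exit_types(expr: &Expr) -> BTreeSet<Ty> {
    match expr {
        Expr::Lit(t) => BTreeSet::from([t.clone()]),
        Expr::If { then_branch, else_branch } => {
            let mut out = exit_types(then_branch);
            match else_branch {
                Some(e) => out.extend(exit_types(e)),
                // A missing `else` means the expression can yield nothing.
                None => {
                    out.insert(Ty::Nothing);
                }
            }
            out
        }
    }
}

/// The body type-checks if every exit path is allowed by the declared output,
/// where `declared` lists the alternatives of e.g. any<string, nothing>.
fn check_output(body: &Expr, declared: &[Ty]) -> bool {
    exit_types(body).iter().all(|t| declared.contains(t))
}
```

Under these assumptions the body of `foo` has the exit types `{string, nothing}`, so it passes against `any<string, nothing>` but would fail against a plain `string` annotation.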

Overloading

Overloading == allowing several commands to share a name as long as their input types differ. The signature and output type must be the same for all overloads. Technically, we could also allow overloads to differ in signature and output type, but that would complicate things.
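A rough Rust sketch of that rule as a registration check. `Ty`, `Overload`, and `OverloadSet` are made-up names; the signature is left out for brevity, and replacing an existing overload with the same input type (instead of rejecting it) is an assumption, not something this note decides.

```rust
/// Hypothetical, cut-down type enum for this sketch.
#[derive(Debug, Clone, PartialEq)]
enum Ty {
    Int,
    String,
}

/// One overload of a command, reduced to its input and output types.
#[derive(Debug, Clone)]
struct Overload {
    input: Ty,
    output: Ty,
}

/// All overloads registered under one command name.
#[derive(Debug, Default)]
struct OverloadSet {
    overloads: Vec<Overload>,
}

impl OverloadSet {
    /// Register an overload, enforcing "same output type for all overloads".
    fn add(&mut self, new: Overload) -> Result<(), String> {
        // Compare against overloads for *other* input types only.
        let conflict = self
            .overloads
            .iter()
            .find(|o| o.input != new.input && o.output != new.output);
        if let Some(other) = conflict {
            return Err(format!(
                "output type {:?} does not match {:?} of the {:?} overload",
                new.output, other.output, other.input
            ));
        }
        // Assumption: redefining the same input type replaces the old overload.
        self.overloads.retain(|o| o.input != new.input);
        self.overloads.push(new);
        Ok(())
    }
}
```

Adding `int -> string` and then `string -> string` succeeds, while a later `string -> int` is rejected because it disagrees with the `int` overload's output type.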

Examples

def foo [in: int] -> string {
    $in
}
# Error! The body returns the int input, which does not match the declared 'string' return type
def foo [in: int] -> string {
    ls | get $in | get name
}

def foo [in: string] -> string {
    ls | where name == $in | get 0 | get name
}

3 | foo  # statically infers the first overload

def generate-number [generator: block] {  # return type implicit 'any'
    do $generator
}

generate-number { 2 } | foo  # dynamically infers 'int' type

generate-number { [1 2 3] } | foo 
# Error! Type 'list' does not match expected 'int' or 'string'

Types of Types

  • Any
    • allow combination of types: any<int, string> or anyof<int, string>
    • optional type: option<int> as a shorthand for any<int, nothing>
  • simple types
    • String
    • Bool
    • Duration
    • CellPath (?)
    • Date
    • Filesize
    • Number
      • Int
      • Float
    • Nothing
    • Error
    • Binary
  • compound types
    • Record
      • record<string, filesize>
    • Block (blocks should be typeable):
      • <int>block<string>: a block with "int" input type and "string" output type
    • iterator types
      • Range
        • range<float>, range<date>
      • List
        • list<int> requires collecting the list or taking the annotation on good faith
      • ListStream
        • liststream<int> requires collecting the list stream or taking the annotation on good faith
      • (ExternalStream?)
      • Table
        • table<string, filesize> requires collecting the table or taking the annotation on good faith
      • Idea: The above could have an Iter super-type: iter<string> could accept both list<string> and liststream<string>
  • Custom (???)
  • Signature (???)
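To make the relationships in this list concrete, here is a small Rust sketch of an "is this value type accepted by that annotation" check covering any<...>, option<...>, and the proposed iter<...> super-type. Everything in it (`Ty`, `option`, `accepts`) is hypothetical and far smaller than the real type enum.

```rust
/// Hypothetical type enum covering only the cases discussed above.
#[derive(Debug, Clone, PartialEq)]
enum Ty {
    Int,
    String,
    Nothing,
    Any,
    /// any<int, string>: one of several alternatives.
    AnyOf(Vec<Ty>),
    List(Box<Ty>),
    ListStream(Box<Ty>),
    /// Proposed super-type of the iterator types.
    Iter(Box<Ty>),
}

/// option<T> as shorthand for any<T, nothing>.
fn option(inner: Ty) -> Ty {
    Ty::AnyOf(vec![inner, Ty::Nothing])
}

/// Does a value of type `actual` satisfy the annotation `expected`?
fn accepts(expected: &Ty, actual: &Ty) -> bool {
    match (expected, actual) {
        (Ty::Any, _) => true,
        (Ty::AnyOf(alts), _) => alts.iter().any(|alt| accepts(alt, actual)),
        // iter<T> accepts list<T> and liststream<T>; list and liststream
        // only accept themselves, element type checked recursively.
        (Ty::Iter(e), Ty::List(a))
        | (Ty::Iter(e), Ty::ListStream(a))
        | (Ty::Iter(e), Ty::Iter(a))
        | (Ty::List(e), Ty::List(a))
        | (Ty::ListStream(e), Ty::ListStream(a)) => accepts(e, a),
        _ => expected == actual,
    }
}
```

Under these rules iter<string> accepts both list<string> and liststream<string>, while list<string> does not accept liststream<string>, and option<int> accepts int and nothing but not string.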

Help!

We need proper help messages when -h is called on a command that has overloads, possibly as structured output.

Q: How to handle different help messages for different overloads?

Additional Usage

hide <int>foo      # hide only the integer overload of 'foo'
use spam <int>foo  # bring only the 'int' overload of 'foo' from 'spam' module
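Both `hide` and `use` (and the call-site form `<int>foo -h`) would need to split the `<type>` prefix from the command name. A possible Rust sketch, where `split_qualified` is a hypothetical helper and the prefix syntax is only the one proposed in this note:

```rust
/// Split an overload-qualified name such as `<int>foo` into its parts.
/// Returns (Some("int"), "foo") for `<int>foo` and (None, "foo") for `foo`.
/// Nested angle brackets (e.g. `<list<int>>foo`) are matched by depth.
fn split_qualified(name: &str) -> Result<(Option<&str>, &str), String> {
    if !name.starts_with('<') {
        return Ok((None, name));
    }
    let mut depth = 0usize;
    for (i, c) in name.char_indices() {
        match c {
            '<' => depth += 1,
            '>' => {
                // depth is at least 1 here because the name starts with '<'
                // and we return as soon as it drops back to 0.
                depth -= 1;
                if depth == 0 {
                    let ty = &name[1..i];
                    let cmd = &name[i + 1..];
                    if ty.is_empty() || cmd.is_empty() {
                        return Err(format!("malformed overload selector: {}", name));
                    }
                    return Ok((Some(ty), cmd));
                }
            }
            _ => {}
        }
    }
    Err(format!("unclosed '<' in overload selector: {}", name))
}
```

The returned type string would still have to be parsed into a type, and the command name looked up, before `hide` or `use` could act on a single overload.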

Pattern Matching / Destructuring

Matching on a type

It would be cool to be able to do

let x, y = [1 2]

Examples

N/A

The Grand Idea

Use overloads for internal commands. Many commands, like each, have a huge match inside checking for the input PipelineData and Value variants. These could be overloads as well.

Unite pipeline data and value types. Requires $in (and other variables?) to not collect streams, probably.

???

  • Syntax shapes vs. types: No longer necessary to keep shapes in the new parser?
  • Mapping "iterator" types (stream, list, table, range, etc.) ?
    • We can't iterate over all elements and type-check them, but it would be nice to have list<int>, etc.
      • Could the solution be an "array" type that guarantees all members are of the same type? List would then be fully dynamic. Would be applicable to tables as well: arrays of records.
    • Could we have a generic iter type that would allow all of the iterables (stream, list, etc.)?
  • How to handle help messages of overloads?
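For the iterator question, the trade-off between collecting and streaming can be sketched in Rust as two checks: an eager one that needs the whole list, and a lazy one that wraps a stream and checks each element only when it is pulled. `Value`, `Ty`, `type_of`, `check_list`, and `checked_stream` are simplified, hypothetical stand-ins.

```rust
/// Hypothetical, cut-down value and type enums for this sketch.
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Int(i64),
    String(String),
}

#[derive(Debug, Clone, Copy, PartialEq)]
enum Ty {
    Int,
    String,
}

fn type_of(v: &Value) -> Ty {
    match v {
        Value::Int(_) => Ty::Int,
        Value::String(_) => Ty::String,
    }
}

/// Eager check: walks a fully collected list and reports the first mismatch.
fn check_list(vals: &[Value], elem: Ty) -> Result<(), String> {
    match vals.iter().position(|v| type_of(v) != elem) {
        None => Ok(()),
        Some(i) => Err(format!(
            "element {} is {:?}, expected {:?}",
            i,
            type_of(&vals[i]),
            elem
        )),
    }
}

/// Lazy check: nothing is collected; each element is checked as it is pulled,
/// so a mismatch only surfaces when the consumer reaches it.
fn checked_stream<'a>(
    stream: impl Iterator<Item = Value> + 'a,
    elem: Ty,
) -> impl Iterator<Item = Result<Value, String>> + 'a {
    stream.enumerate().map(move |(i, v)| {
        if type_of(&v) == elem {
            Ok(v)
        } else {
            Err(format!("element {} is {:?}, expected {:?}", i, type_of(&v), elem))
        }
    })
}
```

The lazy variant keeps streaming intact but moves the error to evaluation time, which is the dynamic-checking side of the trade-off; an "array" type as floated above would instead establish the guarantee once, when the array is built.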

Reference

pub enum Type {
    Int,
    Float,
    Range,
    Bool,
    String,
    Block,
    CellPath,
    Duration,
    Date,
    Filesize,
    List(Box<Type>),
    Number,
    Nothing,
    Record(Vec<(String, Type)>),
    Table(Vec<(String, Type)>),
    ListStream,
    Any,
    Error,
    Binary,
    Custom(String),
    Signature,
}
pub enum Value {
    Bool {
        val: bool,
        span: Span,
    },
    Int {
        val: i64,
        span: Span,
    },
    Float {
        val: f64,
        span: Span,
    },
    Filesize {
        val: i64,
        span: Span,
    },
    Duration {
        val: i64,
        span: Span,
    },
    Date {
        val: DateTime<FixedOffset>,
        span: Span,
    },
    Range {
        val: Box<Range>,
        span: Span,
    },
    String {
        val: String,
        span: Span,
    },
    Record {
        cols: Vec<String>,
        vals: Vec<Value>,
        span: Span,
    },
    List {
        vals: Vec<Value>,
        span: Span,
    },
    Block {
        val: BlockId,
        captures: HashMap<VarId, Value>,
        span: Span,
    },
    Nothing {
        span: Span,
    },
    Error {
        error: ShellError,
    },
    Binary {
        val: Vec<u8>,
        span: Span,
    },
    CellPath {
        val: CellPath,
        span: Span,
    },
    CustomValue {
        val: Box<dyn CustomValue>,
        span: Span,
    },
}
pub enum PipelineData {
    Value(Value, Option<PipelineMetadata>),
    ListStream(ListStream, Option<PipelineMetadata>),
    ExternalStream {
        stdout: Option<RawStream>,
        stderr: Option<RawStream>,
        exit_code: Option<ListStream>,
        span: Span,
        metadata: Option<PipelineMetadata>,
    },
}
pub struct ListStream {
    pub stream: Box<dyn Iterator<Item = Value> + Send + 'static>,
    pub ctrlc: Option<Arc<AtomicBool>>,
}
pub struct RawStream {
    pub stream: Box<dyn Iterator<Item = Result<Vec<u8>, ShellError>> + Send + 'static>,
    pub leftover: Vec<u8>,
    pub ctrlc: Option<Arc<AtomicBool>>,
    pub is_binary: bool,
    pub span: Span,
}

pub fn parse_type(_working_set: &StateWorkingSet, bytes: &[u8]) -> Type {
    match bytes {
        b"int" => Type::Int,
        b"float" => Type::Float,
        b"range" => Type::Range,
        b"bool" => Type::Bool,
        b"string" => Type::String,
        b"block" => Type::Block,
        b"duration" => Type::Duration,
        b"date" => Type::Date,
        b"filesize" => Type::Filesize,
        b"number" => Type::Number,
        b"table" => Type::Table(vec![]), //FIXME
        b"error" => Type::Error,
        b"binary" => Type::Binary,

        _ => Type::Any,
    }
}
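The parser above only knows flat names, so annotations like list<int> would need a recursive step. A minimal sketch of one way to add it, using a cut-down hypothetical `Ty` enum and a free function `parse_ty` instead of the real `Type` and `parse_type` (which also takes a working set):

```rust
/// Hypothetical, cut-down type enum; the real one has many more variants.
#[derive(Debug, Clone, PartialEq)]
enum Ty {
    Int,
    Float,
    String,
    List(Box<Ty>),
    Any,
}

/// Parse a type name, recursing into `list<...>` parameters.
/// Unknown names fall back to `Any`, mirroring the function above.
fn parse_ty(bytes: &[u8]) -> Ty {
    match bytes {
        b"int" => Ty::Int,
        b"float" => Ty::Float,
        b"string" => Ty::String,
        // A bare `list` is treated as a list of anything.
        b"list" => Ty::List(Box::new(Ty::Any)),
        _ => {
            // `list<inner>`: strip the wrapper and parse what is inside.
            let inner = bytes
                .strip_prefix(b"list<")
                .and_then(|rest| rest.strip_suffix(b">"));
            match inner {
                Some(inner) => Ty::List(Box::new(parse_ty(inner))),
                None => Ty::Any,
            }
        }
    }
}
```

Multi-parameter forms such as record<string, filesize> or any<int, string> would additionally need comma splitting that respects nesting, which this sketch leaves out.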
