Nushell core team meeting 2023-01-18

Attendees

  • Darren
  • Reilly
  • JT
  • Michael
  • Andres
  • Stefan
  • Keith
  • Jakub

Agenda

PRs

  • add dedicated const in pipeline, const builtin var errors https://github.com/nushell/nushell/pull/7784
  • convert SyntaxShape::Table into the corresponding Type https://github.com/nushell/nushell/pull/7781
  • Remove deprecated numbered flag from five commands https://github.com/nushell/nushell/pull/7777
  • Fixes shell crashing because alias name is shorter than alias command and there are pipes present. (Fixes Issue 7754) https://github.com/nushell/nushell/pull/7756
  • str length, str substring, str index-of and split chars now use graphemes instead of UTF-8 bytes https://github.com/nushell/nushell/pull/7752 - It seems like we've reached a tipping point where we mostly agree that a --grapheme(-g) flag should be used but the further discussion suggested by Leon is to make the default bytes vs grapheme configurable. How do you feel about that part?
    • Reilly: would vote against configuration. we have too much of it already, adds another combination of config to test
  • Reilly's LazyRecord PR https://github.com/nushell/nushell/pull/7619
    • Reilly: I'm OK if we decide that we don't want another Value variant. IMO this was a successful experiment and a nice performance improvement, but it's OK if we decide that this isn't the right long-term approach.
  • The rest of the older PRs probably need another look; we should ping the OPs to see whether they still plan on working on them or whether we should close them.

External PRs

Discussed Topics

Config system discussion

Darren's comparison of different possible paths
https://hackmd.io/NfBoTWUeQhOTXoeKra437A

Kubouch's suggestions
https://hackmd.io/@nucore/BJpziLNsj

Current problem: two separate config files, env.nu and config.nu
-> could we bridge the difference or make it easier to understand?

Current configs grow really large and get messy (error messages are hard)
-> Pro splitting into more atomic parts

Also handling of fall-back default config if user deletes config or if config is broken
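A minimal sketch of the fall-back idea, assuming the default config is embedded in the binary as a string. The names load_config_or_default and DEFAULT_CONFIG are hypothetical and do not describe nushell's actual loader. It only covers the "file is missing or unreadable" case; a config that exists but fails to parse would need separate handling.

```rust
use std::fs;
use std::path::Path;

/// Hypothetical built-in default, standing in for a shipped default config.
const DEFAULT_CONFIG: &str = "# built-in default config (placeholder)\n";

/// Returns the config text and a flag telling whether the fall-back was used.
/// Any read error (missing file, permissions, invalid UTF-8) triggers the fall-back.
fn load_config_or_default(path: &Path) -> (String, bool) {
    match fs::read_to_string(path) {
        Ok(text) => (text, false),
        Err(_) => (DEFAULT_CONFIG.to_string(), true),
    }
}

fn main() {
    // A path that is assumed not to exist on this machine.
    let (text, used_default) = load_config_or_default(Path::new("/no/such/dir/config.nu"));
    println!("used default: {used_default}, {} bytes of config", text.len());
}
```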

Have easily understandable logic for how configuration is loaded, with a good default experience, without needing a PhD to configure nushell

Declarative config (toml, nuon etc.)
Scripted config (much greater flexibility but some challenges and order matters much more plus cost of evaluation)

We want to be able to break the config into easily digestible chunks (e.g. keybindings.nu, hooks.nu, themes/my-theme-dark.nu)

Easy orderable parts and hard to order parts (NU_LIB_DIRS, ENV_CONVERSIONS)

As mentioned last meeting: Path/PATH conversions for a bunch of variables (currently special sauce)
Jakub: could this just be a command that gets invoked based on a closure

Curveball: How strict do we want to keep the parse eval split

JT: can we start breaking up config.nu into smaller scripts keyboard.nu, theme.nu etc.

Important point by Michael: How do we ship the default or example config and make sure users can upgrade seamlessly?

Requirement: Set up a good default!

Currently, if we try to source a nonexistent file, we error

Suggestion by Jakub to have a module level mechanism to have export symbols you could query

Keep custom.nu (user defined) as small as possible for folks starting out

Unicode and encodings

https://github.com/nushell/nushell/pull/7752

Breaking change that would change the semantics of indices into strings

Currently we closely map Rust's &str/String:

  • Everything is encoded as UTF-8
  • Indexing is by byte offset (slices), which gives O(1) constant-time operations
  • Iterator operates over codepoints/scalars from the Unicode encoding definition
    • diacritics + base character can be two codepoints or more
    • Emojis can be a wild composition of multiple codepoints
    • split chars command in nushell
    • Operation is cheap as UTF-8 encoding says how many bytes a codepoint uses.
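The difference between bytes and codepoints described above can be seen with the Rust standard library alone. This is an illustrative sketch; the helper name byte_and_scalar_counts is made up for the example. Counting graphemes is not shown because std has no grapheme support and it would need an external crate.

```rust
/// Returns (UTF-8 byte length, number of Unicode scalar values) for a string.
fn byte_and_scalar_counts(s: &str) -> (usize, usize) {
    (s.len(), s.chars().count())
}

fn main() {
    // "e" followed by U+0301 COMBINING ACUTE ACCENT: displayed as a single "é",
    // but made of two scalar values and three UTF-8 bytes.
    let decomposed = "e\u{301}";
    let (bytes, scalars) = byte_and_scalar_counts(decomposed);
    println!("bytes = {bytes}, scalars = {scalars}");

    // Byte slicing is O(1), but the offset must fall on a char boundary.
    // Offset 2 is in the middle of the combining accent, so slicing there would panic.
    assert!(decomposed.is_char_boundary(1));
    assert!(!decomposed.is_char_boundary(2));
    println!("first scalar: {}", &decomposed[..1]);
}
```

A grapheme-aware count would report one unit for this string, which is why the choice of unit changes what an index into a string means.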

Graphemes are the unit relevant to typesetting or language understanding.

Risk of breaking changes high as different semantics might be expected for different situations! (e.g. \r\n is it one "character" or two?)

Decision: let's first start supporting graphemes at all and revisit breaking changes at a later date

Note:

We are making strong assumptions by using UTF-8 strings in some places (e.g. file/path names can have different restrictions and be just bytes)

We currently paper over those difficulties by heavily using from_utf8_lossy that replaces bytes that are not valid UTF-8 with a replacement (so bit twiddling string ops are safe)
This glosses over invalid byte-level indexing in nushell with those pesky question mark characters
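For reference, this is roughly what the lossy conversion does to bytes that are not valid UTF-8. The wrapper function lossy is only for the example; String::from_utf8_lossy is the standard library call mentioned above.

```rust
/// Converts arbitrary bytes to a String, replacing invalid UTF-8 sequences
/// with U+FFFD REPLACEMENT CHARACTER (the "question mark" glyph).
fn lossy(bytes: &[u8]) -> String {
    String::from_utf8_lossy(bytes).into_owned()
}

fn main() {
    // 0xFF can never appear in valid UTF-8, e.g. in a file name that is just bytes.
    let raw = [b'a', 0xFF, b'b'];
    let text = lossy(&raw);
    println!("{text}");

    // The single invalid byte became one replacement character (3 bytes in UTF-8),
    // so the original bytes can no longer be recovered from the string.
    assert_eq!(text, "a\u{FFFD}b");
    assert_eq!(text.len(), 5);
}
```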

LazyRecord and friends

Reilly implemented LazyRecord
Goal: accelerate stuff like the materialized sys or $nu record

Has to clone the engine state at a particular time

Laziness is cool!

Idea: lazy table for stuff like ps or ls with lazy loaded column

Impl details:

new variant on Value

matching for the record manipulating commands necessary
(methodification of Value should be a goal!)
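A toy sketch of the laziness idea, not the implementation from the PR: each column stores a closure and computes its value only on first access, then caches it. All type and method names here are hypothetical, values are plain strings instead of Value, and the engine state clone is left out.

```rust
use std::cell::{Cell, OnceCell};
use std::collections::BTreeMap;
use std::rc::Rc;

/// One lazily computed column: a closure plus a cache for its result.
struct LazyColumn {
    compute: Box<dyn Fn() -> String>,
    cached: OnceCell<String>,
}

/// Hypothetical stand-in for a lazily evaluated record.
struct LazyRecord {
    columns: BTreeMap<&'static str, LazyColumn>,
}

impl LazyRecord {
    fn new() -> Self {
        LazyRecord { columns: BTreeMap::new() }
    }

    /// Registers a column without evaluating it.
    fn insert(&mut self, name: &'static str, compute: impl Fn() -> String + 'static) {
        let column = LazyColumn { compute: Box::new(compute), cached: OnceCell::new() };
        self.columns.insert(name, column);
    }

    /// Evaluates the column on first access and reuses the cached value afterwards.
    fn get(&self, name: &str) -> Option<&str> {
        let column = self.columns.get(name)?;
        Some(column.cached.get_or_init(|| (column.compute)()).as_str())
    }
}

fn main() {
    // Count how often the "expensive" closure actually runs.
    let calls = Rc::new(Cell::new(0));
    let counter = Rc::clone(&calls);

    let mut record = LazyRecord::new();
    record.insert("host", move || {
        counter.set(counter.get() + 1);
        String::from("expensive-value")
    });

    assert_eq!(calls.get(), 0); // nothing computed at construction time
    assert_eq!(record.get("host"), Some("expensive-value"));
    assert_eq!(record.get("host"), Some("expensive-value"));
    assert_eq!(calls.get(), 1); // computed exactly once
    assert_eq!(record.get("missing"), None);
}
```

Columns that are never read cost nothing, which is the effect the notes describe for records like $nu.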

Sentiment: let's land and look for problems and opportunities
$nu gets much faster and thus nushell is starting up quicker!!!

Q: are we missing a match arm somewhere that currently hits the default fall-through? (PITA to track down)

Relationship to the metadata PR discussed in the last meeting
We consider the lazy record to be the experiment to prove out laziness and more semantic extension of Value
We need to do some refactoring before we can easily continue to work on metadata or removal of spans for Value.
