Try   HackMD

IPSL - InterPlanetary graph Selection Language

A mostly turring incomplete language for graph selection with lisp syntax, duck typing and wasm extensions.

Killer features (design goals):

  • Extremely low development bootstrap costs
    • single pass compiler (this include lexing, parsing and compiling, all in one single pass)
    • Lisp syntax (the easiest thing in the world to parse)
    • Pay (implement) only what you use approach
  • Duck typing, QUACK!
  • Comparable
  • Pausable and Resumable execution (with support for state serialisation and deserialisation)
  • Decomposable traversal
  • Turring incomplete control flow which allows for concurrent explorations
  • Extensible features with WASM modules (TODO)
  • Optin IPLD requirement (TODO)
    • Selector supper set
  • Compact binary representation (TODO)

Syntax - Lisp

Tokens are accumulated until broken at which point a new token is made.

If a break happen but the current token is empty no token is emited.

Thoses codepoints (ascii whitespaces) break tokens, 0x20 and 0x9 to 0xc (including).

Code is a stream of unicode codepoints. Which unicode encoding to use is left as an implementation details (/ must be managed out of band).

Literals

Numbers

If a token starts with 0 to 9 it is a literal and the resulting value is of type number.

If the token is made of a single 0, then it is the value zero and number parsing stops here.

Else if it starts with a number from 1 to 9 the whole token is considered as the number body and the base is 10.

For all bases the alphabet is: 0123456789abcdefghijklmnopqrstuvwxyz (case insensitive).

If it starts with a 0 alternative base mode is activated.

  • If the second rune is x or X then everything past the second rune is the number body and the base is 16, if you find nothing this is a syntax error.
  • If the second rune is o or O then everything past the second rune is the number body and the base is 8, if you find nothing this is a syntax error.
  • Else accumulate digits from 0 to 9 until you reach a b or B boundry rune. You parse everything you accumuled so far as base 10 and this is the base that will be used, it must be 2 <= base <= 36 else that a syntax error. Then everything past the boundry rune is the number body, if you find nothing this is a syntax error.

In number bodies the _ character is skipped by the parser (so 12__37 must be read as the number 1237)

The number body is parsed as a RTL number of whatever base was specified.

Examples
Literal Value
1234 1234
0xfF 255
0O10 8
03b20 6

Strings

The " character starts a string, while parsing a string other parsing rules are ignored.

Strings follow the interpreted_string_lit golang spec. The parser is free to parse the string as whatever unicode encoding it uses, however the emited string value must be utf8 encoded. If the string is not valid utf8 then this is a syntax error.

If the string does not match the interpreted_string_lit spec

CIDs

If a token begin with a $ everything past the first rune is parsed following the Human Readable CIDs spec and the resulting value is of type CID.

Nodes

Nodes build a tree like structure, each node can be a collection of tokens and child nodes.

Parsed tokens and nodes are added in order to the node that currently on top of the node stack.

When parsing is finished the node stack must be empty else that a syntax error (that means there are unmatched nodes).

When closed this scope node is emited into the parent.

() Value Nodes

The ( character starts a value node, pusing a new value node on the node stack, it also breaks tokens before and after.

A ) pop a node from the node stack and close it, if the poped node isn't a value node the syntax is invalid.

[] Scope Nodes

The [ character starts a scope node, pusing a new scope node on the node stack, it also breaks tokens before and after.

A ] pop a node from the node stack and close it, if the poped node isn't a scope node the syntax is invalid.

When closed this scope node is emited into the parent.

{} Comment Nodes

The { character starts a comment node, pusing a new comment node on the node stack, it also breaks tokens before and after.

A } pop a node from the node stack and close it.

While inside a comment node the parser ignores other parsing rules until it find a } rune, a comment node does not emit anything into the parent node.

Decorators

Decorator are tokens that will modify some flag of the following node, if they are followed by Generic Token or comment node this is a syntax error.
The Nodes Decorators are !?§%@# and can be used by the type system.

Decorators breaks tokens.

Generic tokens

If a remaining tokens contain values outside of this list: abcdefghijklmnopqrstuvwxyz0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ_- this is a syntax error. Else they are added as tokens (they will be used later by the type system).

Type system

None

None is a value some types can have, it indicates that this feature is not supported.

Decorators

§ match

§ can decorate literals and will transform them into an equivalent (match literal) node.

§ can also be used to decorate conditional nodes, then it transform them on a matcher node that will apply the condition to the matching result.

Other decorators

Other decorators are reserved for future usage, using them now is a type error.

Builtins Function

Builtins MUST be supported by all runners that implement IPSL.

Generics

pick

[T] <- (...[T])

pick is a choice node, it accepts a variable amount of arguments which all have the same type, the runner then pick one of them that is not a none type. If all are none then it returns none.

Implementations may lazily and omit type checks of nodes that it knows it is not gonna pick, so for example in this case:

(pick 123 "testing")

If the node decide to pick 123, it can return 123 even tho "testing" should be a type error.

The runner is free to implement whatever picking strategy it wants.
The simplest strategy would be to always pick the first non none element, however the runner is free to implement ones it like more.
For example, if tasked to pick a scope, scopes that does not require downloading extra data (such as a builtin scope, or a wasm scope that is already cached) are probably preferable.

load-builtin-scope

scope <- (matcher[string])

load-builtin-scope will execute the matcher on all available builtin scopes and return one of them that match. If none match it return a none scope.

Examples

Fetch some files in a unixfs directory

[unixfs (pick (load-builtin-scope (wildcard "/unixfs/v1.*")) (load-wasm-scope $Qmfoo {Qmfoo is https://example.com/Foo/rust-unixfs-ipsl-scope}))
    (depth-limit 10 (all
        (unixfs.name (wildcard "file1*") (unixfs.everything))
        (unixfs.name §"file2" (unixfs.file-range 0xFF1024 0x2FF0555 (unixfs.everything)))))]
traversal := ipsl.NewScope(unixfs.Scope, ipsl.NewDepthLimit(10, ipsl.NewAll(
    unixfs.Name(match.Wildcard("file1*"), unixfs.Everything)),
    unixfs.Name(match.Eq("file2"), unixfs.Range(0xFF1024, 0x2FF0555, unixfs.Everything))))
(unixfs folder $Qmbar)
+ -> file1*
|    + -> *
+ -> file2
     + -> 0xFF1024 / 0x2FF0555
          + -> *