# IPSL - InterPlanetary graph Selection Language
A mostly turring incomplete language for graph selection with lisp syntax, duck typing and wasm extensions.
Killer features (design goals):
- Extremely low development bootstrap costs
- single pass compiler (this include lexing, parsing and compiling, all in one single pass)
- Lisp syntax (the easiest thing in the world to parse)
- Pay (implement) only what you use approach
- Duck typing, **QUACK**!
- Comparable
- Pausable and Resumable execution (with support for state serialisation and deserialisation)
- Decomposable traversal
- Turring incomplete control flow which allows for concurrent explorations
- Extensible features with WASM modules (*TODO*)
- Optin IPLD requirement (*TODO*)
- Selector supper set
- Compact binary representation (*TODO*)
## Syntax - Lisp
Tokens are accumulated until broken at which point a new token is made.
If a break happen but the current token is empty no token is emited.
Thoses codepoints (ascii whitespaces) break tokens, `0x20` and `0x9` to `0xc` (including).
Code is a stream of unicode codepoints. Which unicode encoding to use is left as an implementation details (/ must be managed out of band).
### Literals
#### Numbers
If a token starts with `0` to `9` it is a literal and the resulting value is of type number.
If the token is made of a single `0`, then it is the value zero and number parsing stops here.
Else if it starts with a number from `1` to `9` the whole token is considered as the number body and the base is 10.
For all bases the alphabet is: `0123456789abcdefghijklmnopqrstuvwxyz` (case insensitive).
If it starts with a 0 alternative base mode is activated.
- If the second rune is `x` or `X` then everything past the second rune is the number body and the base is 16, if you find nothing this is a syntax error.
- If the second rune is `o` or `O` then everything past the second rune is the number body and the base is 8, if you find nothing this is a syntax error.
- Else accumulate digits from `0` to `9` until you reach a `b` or `B` boundry rune. You parse everything you accumuled so far as base 10 and this is the base that will be used, it must be `2 <= base <= 36` else that a syntax error. Then everything past the boundry rune is the number body, if you find nothing this is a syntax error.
In number bodies the `_` character is skipped by the parser (so `12__37` must be read as the number `1237`)
The number body is parsed as a RTL number of whatever base was specified.
##### Examples
| Literal | Value |
|:-:|:-:|
| `1234` | `1234` |
| `0xfF` | `255` |
| `0O10` | `8` |
| `03b20` | `6` |
### Strings
The `"` character starts a string, while parsing a string other parsing rules are ignored.
Strings follow the [`interpreted_string_lit` golang spec](https://go.dev/ref/spec#String_literals). The parser is free to parse the string as whatever unicode encoding it uses, however the emited string value must be utf8 encoded. If the string is not valid utf8 then this is a syntax error.
If the string does not match the `interpreted_string_lit` spec
### CIDs
If a token begin with a `$` everything past the first rune is parsed following the [Human Readable CIDs spec](https://github.com/multiformats/cid#human-readable-cids) and the resulting value is of type CID.
### Nodes
Nodes build a tree like structure, each node can be a collection of tokens and child nodes.
Parsed tokens and nodes are added in order to the node that currently on top of the node stack.
When parsing is finished the node stack must be empty else that a syntax error (that means there are unmatched nodes).
When closed this scope node is emited into the parent.
#### `()` Value Nodes
The `(` character starts a value node, pusing a new value node on the node stack, it also breaks tokens before and after.
A `)` pop a node from the node stack and close it, if the poped node isn't a value node the syntax is invalid.
#### `[]` Scope Nodes
The `[` character starts a scope node, pusing a new scope node on the node stack, it also breaks tokens before and after.
A `]` pop a node from the node stack and close it, if the poped node isn't a scope node the syntax is invalid.
When closed this scope node is emited into the parent.
#### `{}` Comment Nodes
The `{` character starts a comment node, pusing a new comment node on the node stack, it also breaks tokens before and after.
A `}` pop a node from the node stack and close it.
While inside a comment node the parser ignores other parsing rules until it find a `}` rune, a comment node does not emit anything into the parent node.
### Decorators
Decorator are tokens that will modify some flag of the following node, if they are followed by Generic Token or comment node this is a syntax error.
The Nodes Decorators are `!?§%@#` and can be used by the type system.
Decorators breaks tokens.
### Generic tokens
If a remaining tokens contain values outside of this list: `abcdefghijklmnopqrstuvwxyz0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ_-` this is a syntax error. Else they are added as tokens (they will be used later by the type system).
## Type system
### None
None is a value some types can have, it indicates that this feature is not supported.
###
### Decorators
#### `§` match
`§` can decorate literals and will transform them into an equivalent `(match literal)` node.
`§` can also be used to decorate conditional nodes, then it transform them on a matcher node that will apply the condition to the matching result.
#### Other decorators
Other decorators are reserved for future usage, using them now is a type error.
## Builtins Function
Builtins MUST be supported by all runners that implement IPSL.
### Generics
#### `pick`
`[T] <- (...[T])`
`pick` is a choice node, it accepts a variable amount of arguments which all have the same type, the runner then pick one of them that is not a none type. If all are none then it returns none.
Implementations may lazily and omit type checks of nodes that it knows it is not gonna pick, so for example in this case:
```
(pick 123 "testing")
```
If the node decide to pick `123`, it can return `123` even tho `"testing"` should be a type error.
The runner is free to implement whatever picking strategy it wants.
The simplest strategy would be to always pick the first non none element, however the runner is free to implement ones it like more.
For example, if tasked to pick a scope, scopes that does not require downloading extra data (such as a builtin scope, or a wasm scope that is already cached) are probably preferable.
### Scope Related
#### `load-builtin-scope`
`scope <- (matcher[string])`
`load-builtin-scope` will execute the matcher on all available builtin scopes and return one of them that match. If none match it return a none scope.
## Examples
### Fetch some files in a unixfs directory
```
[unixfs (pick (load-builtin-scope (wildcard "/unixfs/v1.*")) (load-wasm-scope $Qmfoo {Qmfoo is https://example.com/Foo/rust-unixfs-ipsl-scope}))
(depth-limit 10 (all
(unixfs.name (wildcard "file1*") (unixfs.everything))
(unixfs.name §"file2" (unixfs.file-range 0xFF1024 0x2FF0555 (unixfs.everything)))))]
```
```go
traversal := ipsl.NewScope(unixfs.Scope, ipsl.NewDepthLimit(10, ipsl.NewAll(
unixfs.Name(match.Wildcard("file1*"), unixfs.Everything)),
unixfs.Name(match.Eq("file2"), unixfs.Range(0xFF1024, 0x2FF0555, unixfs.Everything))))
```
```
(unixfs folder $Qmbar)
+ -> file1*
| + -> *
+ -> file2
+ -> 0xFF1024 / 0x2FF0555
+ -> *
```