# Japt v2
✔ - implemented
## Methods
[**See separate document**](https://hackmd.io/XRb-CU1iT1isNvYwfd9N2Q)
## Variables ✔
We want:
- 10 ✔
- 11?
- 12
- 13?
- 14?
- 15
- 16
- 32
- 64
- 100
- 1000
- -1
- pi
- e
- golden ratio
- new Date()
- uppercase alphabet `”Ạ`
- lowercase alphabet `”ạ`
- digits 0-9 `”ḍ`
- ASCII table `”ṃ`
- empty string
- newline
- space
- 0 (for counter if needed; could also increment automatically each time it's called)
- empty array
- array of arguments (v2 will use arguments by default instead of parsing input)
- first argument
- second argument
- third argument
- fourth argument?
- raw input
- next char in input (to support infinite input through command line)
## Regex ✔
### Groupings
| JS | Japt | Description |
| -------- | -------- | -------- |
| `/a/` | `«a»` | ✔ Ends of regex
| N/A | `»a` | ✔ Single-class regex
| `(a)` | `“a”` | Capturing group
| `(?:a)` | `“a„` | Non-capturing group
| `(?=a)` | `‟a”` | Positive lookahead
| `(?!a)` | `‟a„` | Negative lookahead
| `(?<=a)` | `«a”` | Positive lookbehind (ES2018)
| `(?<!a)` | `«a„` | Negative lookbehind (ES2018)
| `[a-z]` | `₍a₋z₎` | ✔ Any of these chars
| `[^a-z]` | `⁽a₋z₎` | None of these chars
| `(?:ab)` | `‼ab` | Two items, non-capturing
| `(?:abc)`| `…abc` | Three items, non-capturing
### Modifiers
| JS | Japt | Description |
| -------- | -------- | -------- |
| `a+` | `a⁺` | One or more, greedy
| `a+?` | `a₊` | One or more, non-greedy
| `a*` | `a⁻` | Zero or more, greedy
| `a*?` | `a₋` | Zero or more, non-greedy
| `a?` | `a¿` | Zero or one
| `a{27}` | `a²⁷` | Exactly X matches
| `a{3,}` | `a³⁻` | X or more matches, greedy
| `a{0,27}` | `a⁻²⁷` | Up to Y matches, greedy
| `a{3,27}` | `a³⁻²⁷` | X to Y matches, greedy
| `a{3,}?` | `a³₋` | X or more matches, non-greedy
| `a{0,27}?` | `a₋²⁷` | Up to Y matches, non-greedy
| `a{3,27}?` | `a³₋²⁷` | X to Y matches, non-greedy
| `a|b` | `a∨b` | Either `a` or `b`
| `(?=.*a)(?=.*b)` | `a∧b` | Both `a` and `b` ahead
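
The quantifier rows above can be sketched as a small translation step. This is only an illustration, not the actual v2 parser: the name `translateQuantifiers` is hypothetical, and the sketch assumes the input contains no `₍…₎` character classes (where `₋` means a range instead) and ignores the `∨`/`∧` rows.

```javascript
// Superscript digits in numeric order, so the index of a char is its value.
const SUPER = "⁰¹²³⁴⁵⁶⁷⁸⁹";
const DIGITS = `[${SUPER}]+`;

// Either "min? separator max?", a bare count, or a single-char quantifier.
const QUANTIFIER = new RegExp(
  `(${DIGITS})?([⁻₋])(${DIGITS})?|(${DIGITS})|[⁺₊¿]`,
  "g"
);

// Convert a run of superscript digits such as "²⁷" to the string "27".
const toNumber = (s) => [...s].map((c) => SUPER.indexOf(c)).join("");

// Hypothetical helper: rewrite Japt-style quantifiers as JS quantifiers.
function translateQuantifiers(japt) {
  return japt.replace(QUANTIFIER, (match, min, sep, max, exact) => {
    if (exact !== undefined) return `{${toNumber(exact)}}`;
    if (sep === undefined) return { "⁺": "+", "₊": "+?", "¿": "?" }[match];
    const lazy = sep === "₋" ? "?" : ""; // subscript minus = non-greedy
    if (min === undefined && max === undefined) return "*" + lazy;
    const lo = min === undefined ? "0" : toNumber(min);
    const hi = max === undefined ? "" : toNumber(max);
    return `{${lo},${hi}}` + lazy;
  });
}

console.log(translateQuantifiers("a³₋²⁷")); // a{3,27}?
console.log(translateQuantifiers("a⁻²⁷")); // a{0,27}
console.log(translateQuantifiers("a₊"));   // a+?
```

A real implementation would need to track whether it is inside a class or a string before treating `₋` or a superscript digit as a quantifier.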
### Special classes
| v1 | v2 | Description
| -- | -- | -----------
| `.` | `ẹ` | Non-newline
| `[^]` | `ẓ` | Anything
| `%1`-`%9` | `₁`-`₉` | Capturing groups
| `^` | `≤`? | Start of input
| `$` | `≥`? | End of input
| `%b` | `ḅ` | Word boundary
| `%B` | `Ḅ` | Non-boundary
| N/A | `Ḃ` | Start of word (`\b(?=\w)`)
| N/A | `ḃ` | End of word (`\b(?!\w)`?)
| `%w` | `ẉ` | `A-Za-z0-9`
| `%A` | `Ạ` | `A-Z`
| `%a` | `ạ` | `a-z`
| `%l` | `ḷ` | `A-Za-z`
| `%d` | `ḍ` | `0-9`
| `%s` | `ṣ` | whitespace
| `%n` | `¶` | ✔ newline
| `%t` | `ṭ` | ✔ tab
| `%p` | `ṃ` | printable ASCII
| `%q` | `ṇ` | printable + newline
| `%v` | `ṿ` | `AEIOUaeiou`
| `%y` | `ỵ` | `AEIOUYaeiouy`
| `%c` | `ẏ` | `BCDF...wxyz`
| N/A | `ċ` | `BCDF...vwxz`
Note: any of these (except `Ạ`/`ạ` and the `B`s) can be changed to "everything but this" by using the uppercase letter.
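`Ḃ` can be fully simulated with `(?<!\w)(?=\w)`, and `ḃ` with `(?<=\w)(?!\w)`. Both of these forms require lookbehinds, which were only added in ES2018 and are currently only supported in Chrome. The `\b`-based forms in the table match at the same positions without lookbehinds: a `\b` followed by `(?=\w)` already implies that the preceding character is a non-word character or the start of input, and likewise for `\b(?!\w)`.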
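As a quick check of the word-start and word-end patterns, the sketch below compares the `\b`-based forms from the table with the lookbehind forms on one sample string. The helper name `matchPositions` and the sample text are made up for illustration, and the lookbehind patterns need an engine with ES2018 support.

```javascript
// Return the index of every (zero-width) match of `source` in `text`.
function matchPositions(source, text) {
  return [...text.matchAll(new RegExp(source, "g"))].map((m) => m.index);
}

const sample = "foo bar-baz";

// Start of word: \b form versus lookbehind form.
const startBoundary = matchPositions("\\b(?=\\w)", sample);
const startLookbehind = matchPositions("(?<!\\w)(?=\\w)", sample);

// End of word: \b form versus lookbehind form.
const endBoundary = matchPositions("\\b(?!\\w)", sample);
const endLookbehind = matchPositions("(?<=\\w)(?!\\w)", sample);

console.log(startBoundary, startLookbehind); // [ 0, 4, 8 ] twice
console.log(endBoundary, endLookbehind);     // [ 3, 7, 11 ] twice
```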
The `u` flag should be automatically added, or the regex should be modified to act like it is enabled. `ẹ` will be equivalent to `(?:[\uD800-\uDBFF][\uDC00-\uDFFF]|.)`.
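To illustrate the `ẹ` expansion, the sketch below compares a plain `.`, a `.` with the `u` flag, and the surrogate-pair pattern without the `u` flag. The constant name `DOT_AS_IF_UNICODE` and the sample string are only illustrative.

```javascript
// The expansion quoted above: a surrogate pair, or any single non-newline char.
const DOT_AS_IF_UNICODE = "(?:[\\uD800-\\uDBFF][\\uDC00-\\uDFFF]|.)";

const text = "a😀b"; // the emoji is one code point but two UTF-16 code units

const plain = text.match(/./g);    // splits the emoji into two halves
const unicode = text.match(/./gu); // treats the emoji as one character
const emulated = text.match(new RegExp(DOT_AS_IF_UNICODE, "g"));

console.log(plain.length);   // 4
console.log(unicode.length); // 3
console.log(emulated);       // [ 'a', '😀', 'b' ]
```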
### Flags
| JS | Japt | Description |
| -- | ---- | ----------- |
| `g` | `Ġ`? | Match globally?
| `i` | `İ` | Ignore case
| `m` | `Ṁ` | Match `^` and `$` at newlines
| `y` | `Ṡ` | Sticky: match only from the end of the previous match
## Strings
### Types
- ✔ Char: `”a`?
- 2-char: `‼ab`
- 3-char: `…abc`
- ✔ Regular: `“a”`
- Compressed: TBD
### Special features
| Syntax | Feature | Notes / Thoughts |
| ------ | ------- | ---------------- |
| `‹U›` | interpolation |
| `₍abc₎` | group |
| `‼ab` | 2-group |
| `…abc` | 3-group |
| `a²⁷` | repeat previous item | item = char or group
| `a⁺b⁻c` | repeat forever | `abb...bbc`
| `a⁺z` | repeat each forever | `...aaazzz...`
| `a⁽bc⁾d` | start/end of string | for defining indices outside string
| `a₋z` | character range |
| `a₋c₋z` | skipping range? | `ace...uwy`
| `z₋a` | reversed range |
| `a²₋z` | repeating range | `aabb...zz`
| `a₋²z` | repeated range | `abc...zabc...z`
| `ẉ` | `A₋Za₋z0₋9` | alphanumerics
| `ụ` | `A₋Z` | uppercase alphabet
| `ḷ` | `a₋z` | lowercase alphabet
| `ạ` | `A₋Za₋z` | both alphabets
| `ḍ` | `0₋9` | digits
| `ṣ` | <code>¶ṭ </code> | newline, tab, space
| `¶` | ✔ newline
| `ṭ` | ✔ tab
| `ṃ` | <code> ₋~</code> | printable ASCII
| `ṇ` | <code> ₋~¶</code> | printable plus newline
| `ṿ` | `AEIOUaeiou` | vowels
| `ẏ` | `BCDF...wxyz` | consonants + y
| `ỵ` | `AEIOUYaeiouy` | vowels + y
| `ċ` | `BCDF...vwxz` | consonants
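
The range rows in the table above (`a₋z`, `z₋a`, `a₋c₋z`) could expand roughly as follows. This is a sketch under assumptions, not the planned implementation: `expandRange` is a hypothetical helper, and the skipping range is assumed to take its step from the distance between the first two characters, which matches the `ace...uwy` example.

```javascript
// Hypothetical helper: expand a character range into a string.
// `second` is optional; when given, the step is the distance from `start` to it.
function expandRange(start, end, second) {
  const from = start.codePointAt(0);
  const to = end.codePointAt(0);
  const step =
    second === undefined ? 1 : Math.max(1, Math.abs(second.codePointAt(0) - from));
  const dir = to >= from ? 1 : -1; // reversed ranges walk downwards
  let out = "";
  for (let c = from; dir > 0 ? c <= to : c >= to; c += dir * step) {
    out += String.fromCodePoint(c);
  }
  return out;
}

console.log(expandRange("a", "e"));      // abcde   (a₋e)
console.log(expandRange("e", "a"));      // edcba   (e₋a)
console.log(expandRange("a", "z", "c")); // acegikmoqsuwy (a₋c₋z)

// Repeating range (a²₋c) and repeated range (a₋²c), built on the same helper.
console.log([...expandRange("a", "c")].map((ch) => ch.repeat(2)).join("")); // aabbcc
console.log(expandRange("a", "c").repeat(2)); // abcabc
```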
## Code page ✔
Currently assigned code points (could change at any time):
```
x0 x1 x2 x3 x4 x5 x6 x7 x8 x9 xA xB xC xD xE xF
0x ₀ ₁ ₂ ₃ ₄ ₅ ₆ ₇ ₈ ₉ ₊ ₋ ₍ ₎ ¼ ¾
1x ⁰ ¹ ² ³ ⁴ ⁵ ⁶ ⁷ ⁸ ⁹ ⁺ ⁻ ⁽ ⁾ ½ ⅟
2x ! " # $ % & ' ( ) * + , - . / (0x20 is literal space)
3x 0 1 2 3 4 5 6 7 8 9 : ; < = > ?
4x @ A B C D E F G H I J K L M N O
5x P Q R S T U V W X Y Z [ \ ] ^ _
6x ` a b c d e f g h i j k l m n o
7x p q r s t u v w x y z { | } ~ ¶ (0x7F is also newline)
8x Ạ Ḅ Ḍ Ẹ Ḥ Ị Ḳ Ḷ Ṃ Ṇ Ọ Ṛ Ṣ Ṭ Ụ Ṿ
9x Ẉ Ỵ Ẓ Ȧ Ḃ Ċ Ḋ Ė Ḟ Ġ Ḣ İ Ŀ Ṁ Ṅ Ȯ
Ax Ṗ Ṙ Ṡ Ṫ Ẇ Ẋ Ẏ Ż ạ ḅ ḍ ẹ ḥ ị ḳ ḷ
Bx ṃ ṇ ọ ṛ ṣ ṭ ụ ṿ ẉ ỵ ẓ ȧ ḃ ċ ḋ ė (0xB5 is also tab)
Cx ḟ ġ ḣ ı ŀ ṁ ṅ ȯ ṗ ṙ ṡ ṫ ẇ ẋ ẏ ż
Dx à á â æ è é ê ì í î ò ó ô ù ú û
Ex ≈ ≠ ≡ ≢ ≤ ≥ ∧ ∨
Fx ‹ › « » “ ‟ ” „
```
Chars to assign:
```
¿‼…
```
Assigned, but currently without a non-regex use:
```
\`¿
```
## Shortcuts
| Char | Expression |
| ---- | ---------- |
| `⁺` `⁻` | unary `+` `-`
| `₊` `₋` | `++` `--`
| `‹` `›`? | `<<` `>>`
| `≤` `≥` | `<=` `>=`
| `≈` `≠` | `==` `!=`
| `≡` `≢` | `===` `!==`
| `⁽` `⁾` | `((` <code>) </code>
| `₍` `₎` | `(((` `))`?
| `∧` `∨` | `&&` `||`
| `₀` ... `₆` | `g0` ... `g6`
| `₇` `₈` `₉` | `g⁻3` `g⁻2` `g⁻1`
| `⁰` ... `⁹` | `p0` ... `p9`
| `¼` `½` `¾` | `.25` `.5` `.75`
| `⅟` | `1/`
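
A minimal sketch of how the operator and fraction rows of this table might be expanded, assuming a plain character-by-character substitution outside of strings and regexes. The name `expandShortcuts` is hypothetical, and the rows still marked with `?`, the bracket rows, and the `g`/`p` index rows are deliberately left out.

```javascript
// Shortcut characters and their JS expansions, taken from the table above.
const SHORTCUTS = new Map([
  ["⁺", "+"], ["⁻", "-"],
  ["₊", "++"], ["₋", "--"],
  ["≤", "<="], ["≥", ">="],
  ["≈", "=="], ["≠", "!="],
  ["≡", "==="], ["≢", "!=="],
  ["∧", "&&"], ["∨", "||"],
  ["¼", ".25"], ["½", ".5"], ["¾", ".75"],
  ["⅟", "1/"],
]);

// Hypothetical helper: replace each shortcut char, leave everything else alone.
function expandShortcuts(code) {
  return [...code]
    .map((ch) => (SHORTCUTS.has(ch) ? SHORTCUTS.get(ch) : ch))
    .join("");
}

console.log(expandShortcuts("U≤5∧V≢½")); // U<=5&&V!==.5
console.log(expandShortcuts("⅟4"));      // 1/4
```

Because the same characters also act as quantifiers and range markers in regexes and strings, a real transpiler would have to apply this only in expression context.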
## Miscellaneous
- ?