# Hoon for Regular Programmers
## Introduction
I had a lot of trouble learning Hoon, and I blame it on the lack of tutorials tailored to my mode of learning, and the great deal of obfuscation that was put into the non-essential parts of Urbit. I wrote this tutorial for myself, to learn. But if you're anything like me, you might enjoy it as well.
It's going to be dense, terse, bottom-up. I'll be explaining things by comparing to mainstream programming languages and trying to avoid forward-references (explaining things by things that were not yet introduced). I also reject most of the silly names that Hooners use as I consider them ... unproductive LARPing.
You get what you pay for, and I'm just learning myself, so if anything is wrong, just leave a comment at https://hackmd.io/IYd4RkpBQVqQTehmJeoxRw or contact me elsewhere.
I'm not a native speaker, and I have a weird sense of humor. You've been warned.
The whole thing is in the Public Domain, so feel free to do whatever with it. If you're interested in helping or something, let me know. I'm ~napter-matnep on Urbit, or @dpc_pw on Twitter.
### Prerequisites
I assume you're a regular programmer and you know one or more mainstream programming languages well, like JS, Python, Java, Go, D, Rust, C++, Ruby, whatever.
I assume you can install, configure and run an Urbit instance to the point where you have a working Dojo - Urbit's built-in REPL/console. If you haven't already, [go and figure it out](https://urbit.org/using/install/) first.
No need for UrbitID yet. "Fake Zod" or a self-generated identity ("a comet") will do.
### Note for Hoon tutorial writers
Please ignore if you're just trying to learn it.
Some thoughts about what I think are good ideas when teaching Hoon:
* 1:1 mappings or analogs to other programming languages help a lot.
* Everything in dojo first, before messing with files, mounting, generators, etc.
* Skip irrelevant Urbitism (like how to pronounce stuff) and focus on teaching the language
* If you can't explain something with previously explained primitives and analogs to other PLs, it should not yet be introduced.
* The relationship between "type" and "data" is/was not well explained, and it turned out that I was confused about it a lot.
## Data in Hoon
Everything in Urbit is stored in the form of just one data structure / data type called Noun. It **does not** mean you can't use different types. It only means that under the hood everything is actually a Noun.
Nouns are really simple. A Noun can be one of two things. It can be an Atom - a number (any "unsigned integer" or rather an unsigned `BigNum`). Examples of Atoms:
* `1`
* `10000000000000000000000`
Or it can be a Cell - a pair of two Nouns. The syntax for creating a Cell is `[a b]` where both `a` and `b` can be any Noun. The order is important: `[a b]` is different from `[b a]`.
As you might have already noticed, this definition is recursive. So one can have arbitrarily deep nested Nouns.
* `[1 [ 2 3 ] ]`
* `[ [ 1 2 ] [ 3 4 ] ]`
* `[ [ 1 2 ] [ 3 [ 4 [ 5 6 ] ] ] ]`
For simplicity you can skip the inner-brackets in any `[ a [ b c ] ]` expression, reducing it to just `[ a b c ]`.
So `[ 1 [ 2 [ 3 [ 4 [ 5 6 ] ] ] ] ] ` can be written as just `[ 1 2 3 4 5 6 ]`
A Noun can be viewed as a binary tree of numbers, and `[ 1 2 3 4 5 ]` can be considered a special case - a list. Try reading the [Noun Addresses section](https://urbit.org/docs/tutorials/hoon/nouns/#noun-addresses) if this is not clear.
**I can't stress it enough**: everything in Urbit is a tree of numbers. Your entire Urbit state is (conceptually) stored as one big tree of numbers. Every binary code, source code, "function", "message", "name", etc. Everything is a Noun under the hood.
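To make the definition concrete, here is a minimal Python sketch of a Noun. It assumes we model an Atom as a non-negative `int` and a Cell as a 2-tuple. The helper names `is_noun` and `cell` are made up for this note; nothing here is Hoon or an Urbit API.

```python
def is_noun(x):
    """A Noun is an Atom (non-negative int) or a Cell (a pair of Nouns)."""
    if isinstance(x, int) and not isinstance(x, bool):
        return x >= 0
    return (
        isinstance(x, tuple)
        and len(x) == 2
        and is_noun(x[0])
        and is_noun(x[1])
    )


def cell(*items):
    """Build a right-nested Cell, mimicking how [a b c] means [a [b c]]."""
    if len(items) < 2:
        raise ValueError("a cell needs at least two items")
    if len(items) == 2:
        return (items[0], items[1])
    return (items[0], cell(*items[1:]))


flat = cell(1, 2, 3, 4, 5, 6)
nested = (1, (2, (3, (4, (5, 6)))))
print(flat == nested)  # True: both spellings build the same tree
```

The point of `cell` is only to show that dropping the inner brackets is a shorthand, not a different data structure.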
## The Subject
In a mainstream programming language, when you see something like:
```
... // some code before
c = a + b
... // some code after
```
you immediately wonder: "What are `a`, `b`, `c`? Where do they come from? Where are they stored?". They could be arguments to the current function, or local variables, potentially in one of many nested scopes in the current function. Or class fields, or global variables. You probably use IntelliJ or another IDE, because where these variables come from can be hard to find sometimes.
In Hoon none of these concepts exist. There are no scopes, function arguments, local or global variables. Or actually ... they are all unified into one and only one thing: The Subject. The Subject is an implicit value that is always present when any expression executes. This value is a Noun, and whatever data the code is using must come from the subject.
You can think about the subject as an argument to the code. Not argument to function. To any expression (code). Every expression is always running with some subject present (or as Hooners say it: "against the subject").
Time to try out the Dojo thing. Type in `.` and press Enter.
```
> .
[ [ our=~napter-matnep
now=~2020.6.3..05.28.53..c4dc
eny=0vlk.if0q1.p52dd.9o2aq.87utg.ghlqp.54sqt.egiet.i9tlf.rlksf.lklt9.89q8v.2j4ot.0e2j1.aja9f.n22k9.m0e79.b4eks.in40k.sg8he.e88pq
]
<19.xls 31.eyd 9.umo 36.oqd 93.zfd 247.vdb 51.qyl 129.xvv 41.mac 1.ane $141>
]
```
Wow. What the hell just happened here?! Did it break?
Don't be scared. Your Urbit is OK. At least so far, heh.
`.` is a Hoon expression that returns ("evaluates" or "reduces" or "produces") "the current subject". What Dojo printed is a low-level encoding of it. The `.` expression is somewhat equivalent to:
```
anything foo(anything a) {
    return a;
}
```
or
```
foo(a) {
    return a;
}
```
in other programming languages, but the `a` argument is passed implicitly.
When you typed in `.` and pressed enter the following happened:
* Dojo took the expression you typed in, parsed it and compiled it to a binary code (Nock).
* Dojo ran the compiled code "against some subject" ("with some argument").
* Your code evaluated (returned) the same thing that was passed to it - the subject.
* What you see in Dojo is what this expression returned ("produced"), which is what the Dojo passed to your expression, serialized and decorated for the human to look at.
As you can see, whatever Dojo passed to this expression has the form of a nested tree, with `[` and `]` around the things inside it. A Noun. Couldn't be anything else. Everything is a Noun in Urbit.
## Hoon syntax basics
Hoon's syntax works differently than mainstream imperative languages you're familiar with.
You're probably used to building blocks of code that start and end with `{` and `}`, or `begin` and `end`, or something like that. In between you'd put `doThis`, `doThat`, and eventually `return foo`. That's imperative programming: a list of steps for the computer to perform.
In Hoon, like in many other functional programming languages, you can only use expressions. If you're not familiar with what expression means: An expression is just code that evaluates to (returns) something. It's a bit like you had to write everything as one `return ...`. There are no `begin`s, `end`s, lists of steps, and so on. You have to make the `...` do everything by composing (combining) other expressions.
Because of that the syntax can be much different. It can be simplified to just `<operation> <argument_1>...<argument_n>`. Each `<operation>` takes a fixed number of arguments. Each argument can potentially be another operation, recursively.
**The below code is not real Hoon yet** - just an example.
Imagine that you have two operations. `inc` and `add` . `inc` takes one argument - a number it will increment, and `add` takes two: numbers to add together.
This code (again: just a pseudo-Hoon!):
```
inc .
```
would be an expression that increments an argument passed to it. It would be like `return (1 + x)` in an imperative language.
```
inc inc .
```
would compile to a function that adds 2 to an argument passed to it. Like `return (1 + (1 + x))`.
```
inc add inc . inc .
```
would compile to a function that is basically: `return (1 + (1 + x) + (1 + x))`. If you have trouble with this one, let's add some parentheses surrounding the arguments:
```
inc (add (inc (.)) (inc (.)))
```
The important point is: the parser knows where each argument begins and ends, by knowing how many arguments each operator needs. That's why there's no need for `{ ... }` or `begin` and `end`.
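The same idea can be sketched in Python: a parser that needs no brackets because it knows how many arguments each operation takes. The `ARITY` table and the `parse`, `evaluate` and `run` helpers are hypothetical names for this illustration, and the language they handle is the pseudo-Hoon from above, not real Hoon.

```python
# Hypothetical pseudo-Hoon operations and how many arguments each one takes.
ARITY = {"inc": 1, "add": 2, ".": 0}


def parse(tokens):
    """Consume one complete expression from the front of `tokens`."""
    op = tokens.pop(0)
    args = tuple(parse(tokens) for _ in range(ARITY[op]))
    return (op,) + args


def evaluate(expr, subject):
    """Evaluate a parsed expression against the given subject."""
    op, *args = expr
    if op == ".":
        return subject
    if op == "inc":
        return evaluate(args[0], subject) + 1
    if op == "add":
        return evaluate(args[0], subject) + evaluate(args[1], subject)
    raise ValueError(f"unknown operation: {op}")


def run(source, subject):
    """Parse `source` as exactly one expression and run it with `subject`."""
    tokens = source.split()
    expr = parse(tokens)
    if tokens:
        raise ValueError("leftover tokens after a complete expression")
    return evaluate(expr, subject)


print(run("inc add inc . inc .", 2))  # 1 + ((1 + 2) + (1 + 2)) = 7
```

Note that `parse` never looks for a closing delimiter. It simply stops once the operation has received all of its arguments.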
Now, let's try it with some real Hoon in the Dojo.
## Running simple expressions in the Dojo
Any expression typed in the Dojo will be evaluated against the weird subject we printed before with the `.` expression. Because we don't know how to work with it yet, we will have to use a trick:
```
=>(subject expression)
```
to run an arbitrary expression with an arbitrary subject.
Let's try `=>(2 .+(.))`. In the Dojo it should look like this:
```
> =>(2 .+(.))
3
```
`=>` and `.+` are operators. Hooners call them "runes". Giving basic operators proper names would be too simple, so Hoon uses funny-looking two-character symbols instead. Just roll with it for now, maybe one day you'll learn to love it, who knows. It actually makes it easy to tell them from other things in the code.
`=>` rune takes the first argument and runs it "against the current subject" (meaning: with the subject as an argument), then takes the result of that as a new subject, and runs the second argument against it. The result of the last expression is the result of the whole `=>` rune expression.
We use `=>` with a constant `2`, to ignore the subject passed by the Dojo and turn it into a number of our choosing: `2`. Then `.+` is an increment rune. `.+(.)` is like `inc .` or `inc(.)` from our previous pseudo-Hoon. It just adds one to the subject and evaluates to (returns) it.
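A rough Python analog of what `=>(2 .+(.))` does, assuming we model every expression as a function from subject to product. The names `const`, `dot`, `inc` and `with_subject` are invented for this sketch.

```python
def const(value):
    """An expression that ignores its subject and produces `value`, like `2`."""
    return lambda subject: value


def dot(subject):
    """Like `.`: produce the subject unchanged."""
    return subject


def inc(expr):
    """Like `.+(expr)`: evaluate `expr` against the subject, then add one."""
    return lambda subject: expr(subject) + 1


def with_subject(first, second):
    """Like `=>(first second)`: the product of `first` is the subject of `second`."""
    return lambda subject: second(first(subject))


program = with_subject(const(2), inc(dot))
print(program("whatever the Dojo passes in"))  # 3
```

Whatever subject the outer caller supplies is thrown away by `const(2)`, which is exactly the trick used to escape the Dojo's default subject.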
`=>( ... )` is the "wide-form" of calling the `=>` rune. It's used if you want to keep the whole thing on one line. Otherwise you'd have to use the so-called "tall-form", where instead of `(` and `)` you separate the rune from the arguments with a "gap" - a newline or two or more spaces. I know - a bit weird again. Just remember that one space is different from two or more spaces or a newline.
All of these are equivalent:
```
> =>(2 .+(.))
3
> =>
2
.+(.)
3
> =>  2
.+
.
3
> =>  2  .+  .
3
```
Note: Be mindful of spaces vs gaps (multiple spaces)!
Go ahead, try changing `2` into something else, and play a bit with it.
Tip: If you have trouble figuring out what certain expressions do, practice converting them from `inc inc add . .` to `inc(inc(add(., .)))` form in your head. It really is that simple - the parentheses are just not used.
## Conditionals
Let's do something like:
```
if (x == 2) {
    return 3
} else {
    return 4
}
```
Here it is in the Dojo, with different subjects (arguments):
```
> =>(1 ?:(=(. 2) 3 4))
4
> =>(2 ?:(=(. 2) 3 4))
3
> =>(3 ?:(=(. 2) 3 4))
4
```
`?:` is the if operator. It is followed by the condition, then the expression to evaluate if the condition was `true`, then the expression to evaluate if the condition was `false`.
`=(. 2)` is the irregular wide-form of `.=` - an equality operator. It evaluates to true if `.` (the subject) is equal to `2`.
Let's do the tall-form as well:
```
> =>  2
?:  =(. 2)
3
4
3
```
Tall vs wide-form might seem weird at first, and there are other tutorials that explain it in more detail. Maybe check the ["Tall and wide forms" section of Urbit documentation](https://urbit.org/docs/tutorials/hoon/hoon-syntax/#tall-and-wide-forms).
## Comments
You can add comments in Hoon after `::`, even when typing in the Dojo.
```
> =>  2      :: change subject to just number 2
?:  =(. 2)   :: compare to 2
3            :: if true return 3, else
4
3
```
They will become more useful when we start putting our code inside files.
## Addressing parts (fragments) of Nouns
Do you remember that everything in Hoon is stored as Noun? Very important.
Since Nouns are so ubiquitous, Hoon has a lot of syntax dedicated to operating on Nouns, and many higher-level concepts are just Nouns of a specific structure ("shape").
Oftentimes, it's necessary to refer to a part of the subject. Let's say you have a pair ("cell") `[ 1 2 ]`, and you'd like to get the first, or the second value inside it.
In the Dojo:
```
> =>  [1 2]  +1
[1 2]
> =>  [1 2]  +2
1
> =>  [1 2]  +3
2
```
Again, the `=>` operator changes the current subject from whatever the Dojo passed to the result of the next expression, `[1 2]`, and then evaluates the following expression against that as the subject.
So what are `+1`, `+2`, `+3`? They are address syntax to retrieve parts of the subject. `+1` is the whole subject, `+2` is the first (left) child, and `+3` is the second (right) one.
You probably wonder: "can I do +4"? Yes, you can! Let's try.
```
> =>  [[1 2] [3 4]]  +1
[[1 2] 3 4]
> =>  [[1 2] [3 4]]  +2
[1 2]
> =>  [[1 2] [3 4]]  +3
[3 4]
> =>  [[1 2] [3 4]]  +4
1
> =>  [[1 2] [3 4]]  +5
2
> =>  [[1 2] [3 4]]  +6
3
> =>  [[1 2] [3 4]]  +7
4
```
I hope you can figure out the rest. If you have trouble, don't hesitate to look into other tutorials or ask for help. They usually describe it in more detail.
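The numbering rule behind the outputs above is: address `1` is the whole tree, and for any address `n`, `2n` is its left child and `2n + 1` is its right child. Here is a small Python sketch of it, assuming Cells are modeled as 2-tuples and Atoms as `int`s. The function name `slot` is my own choice for this note.

```python
def slot(address, noun):
    """Return the part of `noun` at `address` (1 = whole, 2 = left, 3 = right, ...)."""
    if address < 1:
        raise ValueError("addresses start at 1")
    if address == 1:
        return noun
    parent = slot(address // 2, noun)
    if not isinstance(parent, tuple):
        raise ValueError("cannot address inside an atom")
    # Even address -> left child (index 0), odd address -> right child (index 1).
    return parent[address % 2]


tree = ((1, 2), (3, 4))
for address in range(1, 8):
    print(address, slot(address, tree))
```

Running it walks the same tree as the Dojo session above, from `+1` to `+7`.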
## Objects (cores)
Objects ("cores") in Hoon are somewhat similar to objects in other programming languages. They are such a low level part of Hoon, that it's hard to explain almost anything without them. Hoon uses them for everything: namespacing, state management, loops, and more.
An object ("core") is stored as just a pair of `[code data]`.
`code` part is a list of compiled methods. `data` is just stuff like variables and such. The exact structure of the object (especially the `data` part) varies, so one could say there are many types of objects in Hoon.
There are many operators that can create an object, and they all start with `|`.
## Closures (one-method objects)
Let's start with the simplest core I could find: a closure. If you're not familiar with closures - they are basically functions with some data "captured" from their environment.
In Hoon a closure is just an object with one method named `$`. Simple. The name is weird, but who cares - an arbitrary symbol that means "anonymous function name", I guess. The data captured by the closure is just the data of the object.
To create a closure, we can use `|.` operator:
```
> =>  1  |.  3
<1.slf @ud>
```
As usual, we set a fixed subject first with `=>`, then `|.` creates a closure with the subject as data, and the next expression as the body of the `$` method.
In the above case, the result of the whole expression is `[ <methods> 1 ]`. The output we're seeing is Hoon/Dojo pretty-printing it for us.
Hoon knows that the returned value is an object. `1.slf` means that there is one method in the code section, and the checksum of the code section is `slf`. `@ud` describes the type (and not the value) stored in the data section. `@ud` means unsigned decimal. The value of the whole data section is actually `1` - the subject the `|.` is run against.
To investigate properties of the closure in more detail, let's learn a second closure operator: "create and evaluate closure": `|-`.
```
> =>  1  |-  3
3
```
The `|-` operator is exactly like `|.`, but right after creating the closure, it also evaluates the `$` method of the object that was just created. Because of that the result of the whole expression is the result of the expression after the `|-` operator: `3`.
It's important to note that when methods on objects are evaluated, their code is evaluated with the object itself (both code and data) as the subject. Similarly to how it would work in a language like Java:
```Java
foo.doSomething();
```
The code inside the `doSomething` method, would have access to all the fields and other methods inside its class.
Let's double check what subject the method is evaluated against.
```hoon
> =>  1  |-  .
<1.xaz @ud>
```
Ha! It's the same object we get from `|.`:
```hoon
> =>  1  |.  .
<1.xaz @ud>
```
The subject in the method works pretty much the same as `this`/`self` would work in other programming languages.
Since we know exactly how the object is structured, we can look at the parts of it from inside its methods:
```hoon
> =>  99  |-  +3
99
```
The data is exactly how we would expect it to be!
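A toy Python model of the two closure operators, assuming an object is the pair `(code, data)` and that the method receives the whole object as its subject (which is what the `.` experiment above suggests). The names `make_closure`, `call` and `make_and_call` are hypothetical; this is not how Hoon is implemented.

```python
def make_closure(body):
    """Like `|.`: given a subject, build the pair (code, data) with the subject as data."""
    return lambda subject: (body, subject)


def call(core):
    """Evaluate the single method, passing the whole object as its subject."""
    code, _data = core
    return code(core)


def make_and_call(body):
    """Like `|-`: build the closure, then immediately evaluate its method."""
    return lambda subject: call(make_closure(body)(subject))


def whole(core):
    """Plays the role of `.` inside the method."""
    return core


def data_part(core):
    """Plays the role of `+3` inside the method."""
    return core[1]


print(make_and_call(data_part)(99))  # 99
```

With this model, `call(make_closure(whole)(1))` hands back the very same object it was called on, mirroring how `|- .` and `|. .` print the same thing.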
I'm curious: what does the code look like?
```
> =>  99  |-  +2
[0 2]
```
If you're confused, let me explain. We made our closure return the `code` part of the `[code data]` using `+2`. `code` is supposed to contain the binary code of the compiled methods. And as you can see ... it's just `[0 2]`. I told you that **everything** in Urbit is a tree of numbers, even binary code. `[0 2]` is a Nock (Urbit's binary code) for "return value at address 2". Which is exactly what our method is doing!
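To show that such a tree of numbers really can be "run", here is a deliberately tiny Python interpreter for three Nock-style instructions, as I understand them: `[0 b]` (value at address `b`), `[1 b]` (constant `b`) and `[4 b]` (evaluate `b`, then increment). Everything else in Nock is left out, the helper names are my own, and the claim that `.+(.)` compiles to `[4 0 1]` is my assumption, not something shown earlier in this document.

```python
def slot(address, noun):
    """Tree addressing: 1 is the whole noun, 2n the left and 2n+1 the right child of n."""
    if address == 1:
        return noun
    parent = slot(address // 2, noun)
    return parent[address % 2]


def nock(subject, formula):
    """Run a formula (a pair of opcode and argument) against a subject."""
    op, arg = formula
    if op == 0:  # [0 b]: the part of the subject at address b
        return slot(arg, subject)
    if op == 1:  # [1 b]: the constant b, ignoring the subject
        return arg
    if op == 4:  # [4 b]: run formula b, then add one
        return nock(subject, arg) + 1
    raise NotImplementedError(f"opcode {op} is not part of this sketch")


core = ((0, 2), 99)          # [code data] from the `=> 99 |- +2` example
print(nock(core, core[0]))   # (0, 2): the method returns its own code
print(nock(2, (4, (0, 1))))  # 3, assuming .+(.) compiles to [4 0 1]
```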
Let's try something else:
```
=>  99  |-  $
```
Have you pressed enter? Do you hear your CPU fan spinning already? Tricked ya, haha! I made you cause an infinite recursion. Don't worry. You can just press Ctrl+C to interrupt it.
Do you remember, that closure is just an object with one method, named `$`? We've made `$` call `$`.
## Static Typing
So far we've been only playing with numbers, so we did not care much about the types. But Hoon is (mostly) statically-typed.
For each expression Hoon tracks the type of the subject it is evaluated against, and is able to infer the type of the product (returned data).
It's a bit tricky to inspect the types, because they are not part of the data itself, and only something that Hoon keeps track of internally. Luckily, Dojo has a special operator `?` that allows inspecting the type of a given expression. Note: this operator is not a Hoon operator. It's a Dojo feature.
```
> ? 1
@ud
1
```
`@ud` is the type and `1` is the product (returned value). `@` means an atom - a number, as opposed to a pair. `ud` are additional qualifiers: "unsigned" and "decimal".
That's all you need to understand right now: even though there are no `float`s, `unsigned long`s etc. in the code - at every step Hoon keeps track of the type and even uses it for things like pretty-printing.
## Names: identifiers and labels
Hoon allows you to assign a name to the type of any part of your data. In Hoon-speak they're called "faces".
```
> =>  ^=  foo  1  .
foo=1
```
`^=` operator takes a name and an expression as arguments. It produces (returns) the product of the expression, but with the type that now contains a new name for it. The result is the same as if we did just `=> 1 .`, but the type is different.
```
> =>  1  .
1
> =>  ^=  foo  1  .
foo=1
> ? =>  1  .
@ud
1
> ? =>  ^=  foo  1  .
foo/@ud
foo=1
```
Notice how `foo/` is displayed in the type, and `foo=` in the value. Pretty-printing magic.
What is super-interesting is ... names can be applied multiple times!
```
> ? =>  ^=  bar  ^=  foo  1  .
bar/foo/@ud
bar=foo=1
```
When Hoon spots a name in the expression it will do a depth-first search in the subject and evaluate to the first match it finds.
```
> =>  ^=  foo  1  foo
1
```
This is a much more convenient way to refer to parts of the subject than their raw addresses.
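A toy Python model of the "first match in a depth-first search" idea. It assumes named values are wrapped in a made-up `Named` class and Cells are 2-tuples. It also assumes that a value whose name did not match is not searched inside; whether Hoon behaves exactly like that is one of the name-resolution details this document leaves to the official docs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Named:
    """A value carrying a name, standing in for something like foo=1."""
    name: str
    value: object


NOT_FOUND = object()


def lookup(name, subject):
    """Depth-first, left-to-right search for the first value labelled `name`."""
    if isinstance(subject, Named):
        # Assumption of this toy: a non-matching name is not searched inside.
        return subject.value if subject.name == name else NOT_FOUND
    if isinstance(subject, tuple):
        for child in subject:
            found = lookup(name, child)
            if found is not NOT_FOUND:
                return found
    return NOT_FOUND


subject = (Named("a", 1), (Named("b", 2), Named("a", 3)))
print(lookup("a", subject))  # 1, the leftmost match wins
print(lookup("b", subject))  # 2
```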
Note: There are more details about Hoon's name resolution that I will not discuss right now.
## Mutating data
All data in Hoon is immutable. One can't modify any value, but it is possible to make copies.
It is possible to take parts of the subject and make a mutated copy of it, replacing things with new values.
The basic syntax is `address-in-the-subject(address-in-the-selected-part new-value, ...)`.
You'll find more information in the next section.
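In the meantime, the "mutated copy" idea itself can be sketched in Python, again assuming Cells are 2-tuples and using the same tree addresses as before. The name `edit` is mine, and this only models replacing one part by address, not Hoon's actual mutation syntax.

```python
def edit(address, new_value, noun):
    """Return a copy of `noun` with the part at `address` replaced by `new_value`."""
    if address < 1:
        raise ValueError("addresses start at 1")
    # Binary digits after the leading 1 spell the path: '0' = left, '1' = right.
    path = bin(address)[3:]
    return _edit_path(path, new_value, noun)


def _edit_path(path, new_value, noun):
    if not path:
        return new_value
    if not isinstance(noun, tuple):
        raise ValueError("cannot address inside an atom")
    left, right = noun
    if path[0] == "0":
        return (_edit_path(path[1:], new_value, left), right)
    return (left, _edit_path(path[1:], new_value, right))


original = ((1, 2), (3, 4))
changed = edit(6, 9, original)
print(changed)   # ((1, 2), (9, 4))
print(original)  # ((1, 2), (3, 4)), untouched
```

Only the cells along the path to the edited address are rebuilt; the untouched `(1, 2)` subtree is shared between the old and new value.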
## Further reading about the subject
There are a lot of little details about all of:
* addressing parts of the subject,
* name resolution,
* mutation syntax
that would be very laborious to describe. Fortunately there is already a page describing it all in the Urbit documentation, and I think by now you should be ready to read it without getting confused.
Please go and at least glance through [The Subject and Its Legs](https://urbit.org/docs/tutorials/hoon/the-subject-and-its-legs/) page.
## Working with source files
We've been trying to avoid it so far, but as our code examples get larger, it becomes easier to edit them in files.
You can make your Urbit mirror its internal file system by mounting it to your host's file system.
```
> |mount %
>=
```
`>=` means we're good.
To commit your changes back to Urbit use:
```
> |commit %home
>=
```
We're going to create a generator, which is Dojo's version of a "shell script".
Create a `<ship-name>/home/gen/foo.hoon` file and put inside:
```
|=  n=@
n
```
Then commit in Dojo:
```
> |commit %home
>=
+ /~napter-matnep/home/21/gen/foo/hoon
```
Changes have been synced!
```
> +foo 1
1
> +foo 2
2
```
Success. The new generator works.
## Functions
For the generator to work, its source code must evaluate to a function, which in Hoon is just another type of object (called a "gate" if you're curious).
A function is very much like a closure produced with `|-`, but as you can see it also takes an argument. `n=@` means that the argument to this function is named `n` and its type is an atom (a number).
Let's change our `foo.hoon` generator to:
```
|=  n=@
.
```
The result of this function is the whole subject (`.`) that method `$` will be run against. This will allow us to investigate how the function is really built in Urbit.
Now commit, and run:
```
> |commit %home
>=
: /~napter-matnep/home/23/gen/foo/hoon
> +foo 1
<1.kmo {n/@ <19.xls 31.eyd 9.umo 36.oqd 93.zfd 247.vdb 51.qyl 129.xvv 41.mac 1.ane $141>}>
```
What do we have here? The code part of this object is very similar to what we have seen with `|-`. One method (the `1.` part of `1.kmo`). But the data is different.
The data section is stored as `[ argument environment ]`. The whole object is `[ code [ argument environment ] ]`.
Let's peek closer with:
```
|=  n=@
+3
```
Save, commit, run:
```
> +foo 99
[n=99 <19.xls 31.eyd 9.umo 36.oqd 93.zfd 247.vdb 51.qyl 129.xvv 41.mac 1.ane $141>]
```
You might be able to tell already that the environment is just another code section of some kind. A lot of methods. Turns out Dojo conveniently passes the whole Hoon standard library to generators as the function environment.
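A toy Python model of this layout, assuming a function is the nested pair `(code, (argument, environment))` and that calling it means "make a copy with the argument replaced, then evaluate the method with that copy as the subject". That calling convention is my understanding rather than something demonstrated above, and all names here (`call_gate`, `return_n`, `return_data`) are hypothetical.

```python
def call_gate(gate, argument):
    """Call a function object: swap in the argument, then run the code on the copy."""
    code, (_old_argument, environment) = gate
    fresh = (code, (argument, environment))
    return code(fresh)


def return_n(core):
    """Plays the role of `n`: the argument sits at the head of the data section."""
    return core[1][0]


def return_data(core):
    """Plays the role of `+3`: the whole data section [argument environment]."""
    return core[1]


environment = "stand-in for the standard library"
print(call_gate((return_n, (0, environment)), 99))     # 99
print(call_gate((return_data, (0, environment)), 99))  # (99, 'stand-in for the standard library')
```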
## Loops
Let's say we would like to count numbers between some `n` and 100 in Hoon.
Something like:
```
sum = 0;
for (i = n; i < 100; i = i + 1) {
    sum = sum + 1
}
return sum
```
Like many other functional programming languages, Hoon uses recursion for loops. So let's turn the loop into a recursive function first:
```
function numBetween(start, end) {
    if (start >= end) {
        return 0
    }
    return 1 + numBetween(start + 1, end)
}
numBetween(n, 100)
```
If you can't wrap your head around the concept of recursion, then you're not going to have a great time learning an FP language like Hoon.
If you can, it probably seems rather simple: instead of looping, we add 1 to the result of counting all other numbers.
Let's write it as a generator:
```
|=  n=@
?:  =(100 n)
0
.+($(n .+(n)))
```
Save, commit, and test:
```
> +foo 77
23
```
Yay. Let's walk through the code step by step.
```
|=  n=@
```
creates a function taking one argument. The rest of the code will be the body of that function (the body of its only anonymous method named `$`).
```
?:  =(100 n)
0
```
This is "if operator" with "100 equals n" condition. The `n` will be looked up in the subject, which is our function. If the condition was true, the `?:` will evaluate to `0`. If the condition was false...
```
.+($(n .+(n)))
```
... the result of the function will be one plus `$(n .+(n))`. `$` is the name of the inner method of our function. `$( ... )` is a noun mutation operator. The whole `$(n .+(n))` means "evaluate method $, but with `n` in the subject changed to one plus the previous n", which accomplishes the recursion we needed.
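The same recursion in a toy Python model, assuming (as a sketch only) that a function is the pair `(code, (argument, environment))` and that `$(n .+(n))` means "build a copy of the object with `n` replaced, then run its method again". The names `count_body` and `call_gate` are invented for this illustration.

```python
def count_body(core):
    """Body of the counting function: the subject is the whole object."""
    code, (n, environment) = core
    if n == 100:                              # ?:  =(100 n)
        return 0
    next_core = (code, (n + 1, environment))  # the subject with n replaced by n + 1
    return 1 + code(next_core)                # .+($(n .+(n)))


def call_gate(gate, argument):
    """Call a function object with `argument` swapped into its data section."""
    code, (_old_argument, environment) = gate
    return code((code, (argument, environment)))


gate = (count_body, (0, None))
print(call_gate(gate, 77))  # 23
```

Nothing is mutated along the way: every recursive step works on a fresh copy of the object, which is why immutability and "loops" can coexist.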
## Further reading
This document is an ongoing work, and I might add more content, as I keep learning Hoon myself.
I hope that all the existing content was a useful introduction, and will allow you to jump straight into [Urbit's official Hoon tutorial](https://urbit.org/docs/tutorials/hoon/) and not get confused too easily. It certainly helped me.