# organization principles of secure scuttlebutt backend
###### tags: scuttlebutt
## design patterns
not all of these principles are intentional! but if you want to understand the code, I believe you should try to understand the social/cultural situation in which it was created. Sometimes accidents have a lot of staying power, or even turn out to be good ideas. If you are looking for the reason why the code is like it is, you should include those reasons without immediate judgement.
### legacy: good idea at the time
sometimes something seemed like it would be needed going forward,
but turned out to not be that useful. New code probably doesn't do it this way anymore
so this can make old code look weird.
example: https://github.com/ssbc/ssb-server/blob/07d63a3abed4e45118459592b21fbfb878929e0b/plugins/master.js
there was a previous plugin format that didn't need {init, manifest,name...} and master used this.
also the master thing allows you to configure a remote key that has full access to the ssb-server.
for example, set `master:your_id` in your pub's config and you can now connect to your pub remotely,
without using ssh. This seemed like a good idea at the time, but didn't really get used.
### legacy: get it working
sometimes it wasn't clear what the best way of doing something was, so we just picked something
quickly, so we could get on with more important things. This meant the effects of that decision
stayed with us later. In some cases this can be removed, but in others it still remains.
examples:
* using JSON for message encoding. Originally, we had tried a binary format, but soon realized that encoding message content with a flexible format was very useful because we could have general purpose database indexes. At that time we used msg-pack, but after an early rewrite switched to JSON, since everyone knows it, it would be more accessible, and we didn't want to debug binary stuff.
* pub messages. when accepting an invite, you need to remember the address of the pub so you can connect to it next time you run ssb. So when accepting an invite, a pub message is created just so that you'll remember the address. https://github.com/ssbc/ssb-gossip/blob/541911c4cb8938489e3914e629fec418d4121c77/init.js#L16-L27 hopefully this will be superseded by https://github.com/ssbc/ssb-device-address
* invite codes were themselves a get-it-working short cut: https://github.com/ssbc/ssb-invite/ hopefully to be superseded by https://github.com/ssbc/ssb-peer-invites
### personalities
Different code was written by different people who think differently. If that code works it can stick around a long time, and that may explain why it's like it is. There has historically been a split between people who took responsibility for the front end and those who took responsibility for the back end, as well as other projects that are part of the ecosystem such as git-ssb. This was not planned but it happened that way. These things naturally exerted influence on each other, and together resulted in how ssb is now.
examples:
* muxrpc cli validation, written by Paul Frazee (who was the first person to join the project, and wrote patchwork 1 and 2, but now works on beaker browser) https://github.com/ssbc/ssb-server/blame/master/lib/validators.js
* patchcore api and directory structure organized into `{topic}/{type}/{module}` https://github.com/ssbc/patchcore#directory-structure
- largely ahdinosaur's contribution, but baked into code by matt / piet / mix
* cel writes quite long files in git-ssb https://git-ssb.celehner.com/%25q5d5Du%2B9WkaSdjc8aJPZm%2BjMrqgo0tmfR%2BRcX5ZZ6H4%3D.sha256/blob/3852c8c0341ee5a8a6170245a58ef56c3e574058/index.js he also does most of his computing on a raspberry pi, so he is forced to optimize for very little cpu and memory, and so writes static html apps.
* matt chose that ssb-backlinks creates [an index with a file name containing your ssb id](https://github.com/ssbc/ssb-backlinks/blob/master/index.js#L22) (ssb-backlinks started as part of patchwork@3 which matt wrote)
* dominic is obsessed with modularity and so attempts to decouple things wherever possible. all sbot methods can be "hooked" via https://www.npmjs.com/package/hoox and this is used to implement some features, such as preventing blocked peers from replicating the feed that blocked them: https://github.com/ssbc/ssb-friends/blob/master/index.js#L46-L78
### well thought out, good ideas
there is also stuff that is well thought out good ideas! ideas that were slowly arrived at after experience.
(the personality based informed stuff is also great, but the focus of that section was on how personality informs the code)
Usually, this stuff tries to be less coupled to ssb itself, so when you see something that doesn't have an ssb- prefix and is then wrapped into an ssb- module, that's something that's taken a while to get there, and that I am _probably_ reasonably happy with. (though there are some instances that are very old now)
This stuff is also quite likely _complicated_ but it's about things where there is a strong _right answer_ that should hopefully only be solved once, and solved right.
* epidemic-broadcast-trees (wrapped to ssb-ebt)
* [dynamic-dijkstra](https://github.com/dominictarr/dynamic-dijkstra) and [layered-graph](https://github.com/ssbc/layered-graph) wrapped into [ssb-friends](https://github.com/ssbc/ssb-friends)
* [gossip-query](https://github.com/dominictarr/gossip-query) wrapped into [ssb-ooo](https://github.com/ssbc/ssb-ooo) (though this has a bit of just-get-it-working)
* [flumedb](https://github.com/flumedb) was written to refactor ssb-db, and became a core principle.
* [secret-handshake](https://github.com/auditdrivencrypto/secret-handshake) after evaluating all the secure channel implementations available, and finding none that were both good and simple, I realized I was now qualified to implement one.
* [secret-stack](https://github.com/ssbc/secret-stack) an instance of one of the older ones.
* [multiserver](https://github.com/ssbc/multiserver) decouples addresses and protocols, so that it is possible to upgrade protocols. (we haven't actually upgraded a protocol yet, however)
### synchronizing logs
ssb embodies the idea of synchronizing logs in several places. The original insight that led to ssb was that append-only logs were easy to synchronize but also quite useful. An append-only log is easy to replicate because if you replicate it in order (oldest to newest) you can just say
"I'm good up to tuesday" and if they have any messages from after tuesday, they send them.
If that transmission fails part way, next time you just say "I'm good to tuesday afternoon"
and they send what's after that. If you did not replicate messages in order, you'd have to provide
a list of message ids that you do have, or ranges you have, both more complexity and overhead.
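
The "I'm good up to tuesday" exchange can be sketched roughly like this. This is a toy model, not the ssb wire protocol: `createLog`, `since` and `sync` are hypothetical names, and a plain sequence number stands in for "tuesday".

```javascript
// Toy in-order log replication. All names here are made up for illustration.
function createLog () {
  const messages = [] // messages[i] has sequence number i + 1
  return {
    latest: () => messages.length,
    append: (msg) => { messages.push(msg); return messages.length },
    // everything after the sequence number the other side already has
    since: (seq) => messages.slice(seq)
  }
}

// The receiver says how far it has got; the sender sends the rest, in order.
// `limit` simulates a connection that drops part way through.
function sync (sender, receiver, limit = Infinity) {
  const missing = sender.since(receiver.latest())
  missing.slice(0, limit).forEach((msg) => receiver.append(msg))
  return receiver.latest()
}

const alice = createLog()
const days = ['mon', 'tue', 'wed', 'thu']
days.forEach((m) => alice.append(m))
const bob = createLog()

const afterFailure = sync(alice, bob, 2) // connection fails after 2 messages
const afterRetry = sync(alice, bob) // resuming only needs a single number
console.log(afterFailure, afterRetry)
```

The point of the sketch is that the receiver's whole replication state is one number, so resuming after a failure costs nothing extra.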
#### flumedb
flumedb stores the main data in an append-only log (each record is identified by its byte offset in the log file). Then this data is synchronized into _views_ - reductions / interpretations / lookups that have been derived from the core log. Many plugins add views. If a view is corrupted or changed, the view state is deterministically rebuilt from the log. This means the view must be entirely determined by the content of records in the log.
thread about history/rationale of flume `%pYmFr6d0QwLP+YG0VNoo75PP7eYNZ1Y8C2MC9IjF5aw=.sha256`
example:
* ssb-db uses flumeview-reduce to track the latest message from each feed in your log https://github.com/ssbc/ssb-db/blob/master/indexes/last.js#L6-L14
* ssb-links indexes which messages point to each other, using flumeview-query/links https://github.com/ssbc/ssb-backlinks/blob/master/index.js#L21-L24
note: to register a flume view, you use `_flumeUse(name, view)`. This was a get-it-working thing: I used the _ prefix because I intended to replace it with something better, but that hasn't happened yet.
see also: plugins with non-log state
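
A minimal sketch of the log-plus-views idea, assuming an in-memory array instead of a file. This is not flumedb's API: `createLog`, `use` and `rebuild` are hypothetical, and the reducer loosely mirrors the "latest message from each feed" example.

```javascript
// Toy log with derived views. Hypothetical names, not flumedb's actual API.
function createLog () {
  const records = []
  const views = []
  return {
    append (value) {
      records.push(value)
      views.forEach((view) => view.update(value))
    },
    // a view is fully defined by a pure reducer and an initial state
    use (reduce, initial) {
      const view = {
        state: initial,
        update (value) { view.state = reduce(view.state, value) },
        // replay the whole log from the start
        rebuild () {
          view.state = records.reduce((state, value) => reduce(state, value), initial)
        }
      }
      view.rebuild() // catch up on records appended before the view existed
      views.push(view)
      return view
    }
  }
}

const log = createLog()
log.append({ author: '@alice', sequence: 1 })
const last = log.use(
  (state, msg) => ({ ...state, [msg.author]: msg.sequence }),
  {}
)
log.append({ author: '@bob', sequence: 1 })
log.append({ author: '@alice', sequence: 2 })

const live = last.state
last.state = { corrupted: true } // simulate a damaged view
last.rebuild() // deterministic: same log in, same state out
console.log(last.state)
```

Because the reducer only looks at its arguments, replaying the log always lands on the same state, which is what makes throwing a view away safe.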
#### replication
the basic idea of scuttlebutt is to replicate append only logs. when two peers connect, they
request feeds from each other, validate messages received and store them in their flumelog locally.
(which triggers view building, and ui updates etc)
examples:
* replication originally worked by calling `createHistoryStream({id, seq})` for every feed you wanted to replicate. When thousands of people came to ssb this became too much. https://github.com/ssbc/ssb-replicate/ "legacy replication" is very simple.
* [ssb-ebt](https://github.com/ssbc/ssb-ebt/) replaced that and includes several optimizations.
#### plugins with non-log state
sometimes a module needs to remember something, but it didn't seem appropriate to represent that data by writing messages to the user's feed. usually this is stored as either a json file or a leveldb instance.
* ssb-invite has a [leveldb instance that stores invite codes](https://github.com/ssbc/ssb-invite/blob/master/index.js#L52-L55) (also note, it uses sublevel because that used to be part of ssb-db, an instance of seemed-like-a-good-idea-at-the-time) (counter example: [ssb-peer-invites](https://github.com/ssbc/ssb-peer-invites) is fully on-log)
* ssb-blobs has a [leveldb instance to remember which blobs need to be pushed out to pubs](https://github.com/ssbc/ssb-blobs/blob/master/index.js#L33)
* ssb-gossip stores info about peers it has tried to connect to (such as whether they errored, what the latency was) in a [json file](https://github.com/ssbc/ssb-gossip/blob/master/index.js#L86).
* ssb-identities creates additional [secret keys under the ~/.ssb/identities directory](https://github.com/ssbc/ssb-identities/blob/master/index.js#L27-L34)
* ssb-unread which has an off-log db
* ssb-about which just provides faster server-side queries and leans on ssb-backlinks for views
## separating HOW from WHAT
Complex plugins (such as replication) should focus on _HOW_ they do what they need to do, and not _WHAT_ they need to do. For example, the replication module shouldn't have opinions on what feeds are to be replicated. Instead it has an api (`replicate.request(id, toReplicate)`) that another module
can call when it wants a feed replicated.
This idea was first introduced for `ssb-blobs`. The first version of ssb-blobs scanned messages for blobs, and tried to replicate every blob. Later, the ssb-blobs protocol was rewritten, so ssb-blobs only requested blobs that the application had asked for (e.g. when it tried to render an image).
examples:
* ssb-blobs replicates blobs when the application calls blobs.get(id)
* ssb-replicate and ssb-ebt replicate a feed if something (e.g. ssb-friends) calls replicate.request(feed, true)
* ssb-gossip will connect a peer if something calls gossip.add(address) (at some point in the future)
note: this idea was fully formed by the time of ssb-ebt, but the other uses may still have some legacy quirks, although ssb-blobs was a clean refactor.
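
A rough sketch of the split, assuming a `request(id, toReplicate)` shape like the one mentioned above. Everything else (`createReplicator`, `feedsToRequest`, `followPolicy`) is hypothetical and only there to show which side owns which decision.

```javascript
// HOW: a replication mechanism with no opinion about which feeds matter.
function createReplicator () {
  const wanted = new Set()
  return {
    request (id, toReplicate = true) {
      if (toReplicate) wanted.add(id)
      else wanted.delete(id)
    },
    // what would be asked of a peer when a connection opens
    feedsToRequest: () => Array.from(wanted).sort()
  }
}

// WHAT: a policy module that decides, and only tells the replicator.
function followPolicy (replicator) {
  return {
    follow: (id) => replicator.request(id, true),
    block: (id) => replicator.request(id, false)
  }
}

const replicate = createReplicator()
const policy = followPolicy(replicate)
policy.follow('@alice')
policy.follow('@bob')
policy.block('@bob')
console.log(replicate.feedsToRequest())
```

Swapping the policy (say, for one based on invites) would not touch the replicator at all, which is the property this section is after.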
## incomplete intentions
In some places, a new feature or api was introduced with the intention that it could be expanded and used for other things. In some cases it has not been used for those things _yet_. (not to say that they won't be - but just to document that these are plans that haven't happened yet)
* master plugin - to allow remote control of a pub, didn't really get used, forgot that it was even there until [mix complained about it](https://github.com/ssbc/ssb-server/issues/629) (which incidentally was the impetus for this document).
* multiserver allows multiple addresses at once, which is intended to provide ability to upgrade protocols (for example, new, better version of shs). So far, new transports have been added, but not a protocol. (progress towards this is happening though)
* the friends rewrite, with its createLayer method, allows multiple representations of relations between feeds. This is progressing: it is used by [peer-invites](https://github.com/ssbc/ssb-peer-invites) but it's also intended to support same-as (incomplete)
* ssb-db has [maps](https://github.com/ssbc/ssb-db/blob/dd961e4fbaf5321a314266adbcd32ac22ec3d413/minimal.js#L85-L103) and [addMap](https://github.com/ssbc/ssb-db/blob/dd961e4fbaf5321a314266adbcd32ac22ec3d413/minimal.js#L231-L233) method, intended to support both an experiment for off-chain content, and to facilitate private groups / private message rewrite.
* ssb-links was intended to replace ssb-db.links but we never got around to removing links.
## try to separate IO from state transitions
more recently, I (dominic) started to realize that testability was a huge problem, and that generally IO made that more difficult, because usually IO happens in parallel and thus you need to test the different orderings, which means a lot more tests, too many to hand write. So, instead I try to avoid this as much as possible. Write in an event processing style: `eventHandler(state, event) => new_state`
(this was inspired by react, but I consider the particular way react does it to be unnecessarily verbose). It also means test data can be serialized and thus more easily ported to another language.
Then IO is "glued on". (It would be even better to find a way to compose these together without the glue, so that the state stays abstracted, but I haven't figured that out yet)
* [epidemic-broadcast-trees/events](https://github.com/dominictarr/epidemic-broadcast-trees/blob/master/events.js) each event is handled with state passed in and returned, so it's easy to write tests that check what should happen when a particular event is received in a particular state.
* [ssb-validate](https://github.com/ssbc/ssb-validate/blob/master/index.js) is also written with state and input separated.
* [secret-handshake/crypto](https://github.com/auditdrivencrypto/secret-handshake/blob/master/crypto.js) the construction and validation of each message type is separated from the IO flow of the handshake, again so that it's easier to test the handling of valid and invalid packets, etc.
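
As a sketch of the `eventHandler(state, event) => new_state` style: the handler below is a pure function, and the IO is a thin loop glued on afterwards. The event types, the `outbox` field and `run` are all made up for this example and are not taken from any of the modules above.

```javascript
// Pure state transition: no sockets, timers or disk access in here.
// Hypothetical peer that tracks the latest sequence number it has for a feed.
function eventHandler (state, event) {
  switch (event.type) {
    case 'connect':
      // tell the other side how far we have got
      return { ...state, connected: true, outbox: [{ type: 'have', seq: state.seq }] }
    case 'message':
      // only accept the next message in order; ignore anything else
      if (event.seq !== state.seq + 1) return { ...state, outbox: [] }
      return { ...state, seq: event.seq, outbox: [] }
    case 'disconnect':
      return { ...state, connected: false, outbox: [] }
    default:
      return { ...state, outbox: [] }
  }
}

// The glue: feed events in, flush whatever the handler put in `outbox`.
function run (initial, events, send) {
  return events.reduce((state, event) => {
    const next = eventHandler(state, event)
    next.outbox.forEach(send)
    return next
  }, initial)
}

const sent = []
const final = run(
  { seq: 0, connected: false, outbox: [] },
  [
    { type: 'connect' },
    { type: 'message', seq: 1 },
    { type: 'message', seq: 3 }, // out of order, ignored
    { type: 'message', seq: 2 }
  ],
  (packet) => sent.push(packet)
)
console.log(final, sent)
```

Since states and events are plain JSON-shaped data, a test case is just an array of events plus the expected final state, which is what makes it easy to port to another language.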
## plugins, communication, and interactions between plugins
A core idea is your "ssb identity": a cryptographic key stored in your `~/.ssb/secret` file.
It is used both to sign messages, and to authenticate connections via [secret-handshake](https://github.com/auditdrivencrypto/secret-handshake).
A protocol like ssb needs many features, and for these features to grow and change over time. To make it easier to add features, we first made an rpc protocol, [muxrpc](http://github.com/ssbc/muxrpc) (a standard way to call methods on peers) and then wrote the protocol in terms of that. (due to prior experience, the protocol combines both multiplexing streams "mux" and remote procedure calls "rpc". the main thing both of these need is [framing](http://github.com/ssbc/packet-stream-codec). previously it seemed to me that mux and rpc could be separate layers, but then rpc gets framed twice, so I realized they are better combined)
[secret-stack](http://github.com/ssbc/secret-stack) creates peers that can connect and identify with secret-handshake, and then speak muxrpc. (Originally I wanted to call this module "illuminati" because groups of peers meeting and giving secret handshakes, but the name was already taken on npm)
we are now ready to talk about plugins.
### use(plugin)
secret-stack provides a "use" method. to make a working secret-stack application, you combine a bunch of plugins. Each plugin has a name, and can expose new methods under that name. (exception: ssb-db doesn't have a name, and this allows it to expose methods on the top level - all other plugins have a name though). There are several ways plugins can interact.
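
To make the naming rule concrete, here is a toy plugin host. It is not secret-stack's implementation: `createStack` and `create` are hypothetical, and the manifest part of the plugin format is left out. It only shows named plugins being namespaced while an unnamed one lands on the top level.

```javascript
// Toy plugin host. Hypothetical, only mimics the shape described above.
function createStack () {
  const plugins = []
  const stack = {
    use (plugin) {
      plugins.push(plugin)
      return stack // allow chaining
    },
    create (config) {
      const api = {}
      plugins.forEach((plugin) => {
        const methods = plugin.init(api, config)
        if (plugin.name) api[plugin.name] = methods // exposed under its name
        else Object.assign(api, methods) // no name: exposed on the top level
      })
      return api
    }
  }
  return stack
}

// an unnamed plugin, standing in for the role ssb-db plays
const db = {
  init: () => {
    const log = []
    return { publish: (msg) => log.push(msg), count: () => log.length }
  }
}

// a named plugin that calls a top level method
const greeter = {
  name: 'greeter',
  init: (api) => ({
    hello: (who) => {
      api.publish({ type: 'hello', who })
      return 'hello ' + who
    }
  })
}

const app = createStack().use(db).use(greeter).create({})
const greeting = app.greeter.hello('bob')
console.log(greeting, app.count())
```

Note that in this sketch plugin order matters: `greeter` can only call `publish` because `db` was added first.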
#### calling method on another plugin
The simplest is to just call a method on another plugin
examples:
* [ssb-friends calls replicate.request to ask that friend's feeds be replicated](https://github.com/ssbc/ssb-friends/blob/master/index.js#L100)
* [ssb-server/plugins/local calls gossip.add so that peers connect others on local network](https://github.com/ssbc/ssb-server/blob/f587a0fcb84e066fa5ea09938d45ab5c466738b4/plugins/local.js#L61)
#### emitting an event that another plugin listens for
* ssb-ebt emits a ["replicate:fallback"](https://github.com/ssbc/ssb-ebt/blob/master/index.js#L130) event if it wasn't able to use ebt replication. [legacy replication listens for that](https://github.com/ssbc/ssb-replicate/blob/master/legacy.js#L299) (this is an instance of get-it-working)
#### hooking a method
it's possible for one plugin to "hook" a method from another plugin. With a hook you can run code before, after or around the original method: code that modifies the input or the output, or even avoids calling the original method in certain situations.
examples:
* authorizing invite codes, ssb-peer-invites https://github.com/ssbc/ssb-peer-invites/blob/master/index.js#L147-L161
* authorizing invite codes, ssb-invite https://github.com/ssbc/ssb-invite/blob/master/index.js#L58-L73
* giving configured remote peers _full access_, ssb-server/plugins/master https://github.com/ssbc/ssb-server/blob/07d63a3abed4e45118459592b21fbfb878929e0b/plugins/master.js#L6-L10
* preventing a blocked peer from calling createHistoryStream for someone who blocked them https://github.com/ssbc/ssb-friends/blob/master/index.js#L46-L78
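
A sketch of what hooking looks like, using a hand-rolled `hookable` helper rather than the real hoox package. The hook signature `(fn, args)` and the use of `this.id` for the calling peer are assumptions made for this example, and the "stream" is just an array.

```javascript
// Toy hookable function. Hypothetical helper, written for this example only.
function hookable (fn) {
  let current = fn
  function hooked (...args) {
    return current.apply(this, args)
  }
  // wrapper receives the previous function and the call's arguments
  hooked.hook = function (wrapper) {
    const previous = current
    current = function (...args) {
      return wrapper.call(this, previous, args)
    }
    return hooked
  }
  return hooked
}

const blocks = { '@alice': ['@mallory'] } // alice has blocked mallory

// stand-in for a method that returns a feed's messages
const createHistoryStream = hookable((opts) => [
  'msg1 from ' + opts.id,
  'msg2 from ' + opts.id
])

// "around" hook: skip the original entirely when the caller is blocked
createHistoryStream.hook(function (fn, args) {
  const opts = args[0]
  const blocked = (blocks[opts.id] || []).includes(this.id)
  if (blocked) return []
  return fn.apply(this, args)
})

const forBob = createHistoryStream.call({ id: '@bob' }, { id: '@alice' })
const forMallory = createHistoryStream.call({ id: '@mallory' }, { id: '@alice' })
console.log(forBob, forMallory)
```

The hooked method's own code never learns about blocking, which is the decoupling that hooks are meant to buy.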
### friends.createLayer
the ssb-friends plugin sets up a special case for interactions between plugins.
As it's focused on social network applications, it's valuable to represent the "closeness" (aka, "friendship") between feeds. ssb-friends represents this as a graph of `{<follower>: {<followed>: <closeness>}}`.
But there are also multiple ways that closeness may be represented, so this is expressed as "layers"
basically, each layer is another graph, and then these graphs are merged into one, and on that merged graph the distances are calculated. (and used to decide who to replicate, etc)
examples:
* ssb-peer-invites uses a friends layer [to represent peer-invites as friendships](https://github.com/ssbc/ssb-peer-invites/blob/master/index.js#L114-L115)
* ssb-friends uses a layer to [handle ordinary contact messages](https://github.com/ssbc/ssb-friends/blob/master/contacts.js#L7-L37)
* this method was also intended to implement ["same-as"](https://github.com/ssbc/ssb-same-as)
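
The layer idea can be sketched as below. Two simplifications are swapped in and should be read as assumptions: a later layer simply overrides an earlier one for the same edge, and distance is a plain breadth-first hop count rather than the weighted traversal that dynamic-dijkstra does. `mergeLayers` and `hops` are hypothetical names.

```javascript
// Each layer is a graph of {follower: {followed: closeness}}.
// Assumption: for the same edge, a later layer overrides an earlier one.
function mergeLayers (layers) {
  const merged = {}
  layers.forEach((layer) => {
    Object.keys(layer).forEach((from) => {
      merged[from] = Object.assign(merged[from] || {}, layer[from])
    })
  })
  return merged
}

// Breadth-first hop count from `start`, following only positive edges.
function hops (graph, start) {
  const dist = { [start]: 0 }
  const queue = [start]
  while (queue.length) {
    const from = queue.shift()
    Object.keys(graph[from] || {}).forEach((to) => {
      if (graph[from][to] > 0 && dist[to] === undefined) {
        dist[to] = dist[from] + 1
        queue.push(to)
      }
    })
  }
  return dist
}

// one layer from ordinary contact messages, one from peer invites
const contacts = { '@alice': { '@bob': 1 }, '@bob': { '@carol': 1 } }
const peerInvites = { '@carol': { '@dave': 1 } }

const distances = hops(mergeLayers([contacts, peerInvites]), '@alice')
console.log(distances)
```

Here `@dave` is only reachable because the peer-invites layer was merged in, while the code computing distances never needs to know which layer an edge came from.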