# 1000 Validators Effort (Aka Validatorship Invites/Airdrop)
Polkadot should scale linearly in the number of validators, at least once parachains work, including direct UDP connections between validators. We also improve security if we diversify our validator operator pool. We therefore want W3F to invite fresh-but-good validator operators into the community and nominate their nodes, but this risks W3F being slashed.
It's impossible to fully mitigate slashing risk while staking unknown people (see the slashing section below). We could vet new validators whom we stake, and we should do some of this, but vetting becomes time-intensive.
Instead, we should vet validator operators using their past public activity on peer-to-peer networks, which for diversity should ideally not be crypto-currency networks.
There are three enormous peer-to-peer networks that fit these criteria: Tor, I2P, and BitTorrent. We'd presumably start new operators on a test network, or Kusama, and then graduate them to Kusama and Polkadot.
## Tor
Tor provides a near-perfect resource here. Tor actively avoids node-hosting infrastructure homogeneity, à la EC2, etc., so Tor operators bring hosting diversity and thus security. Tor spends considerable effort teaching operators basic operational security for peer-to-peer nodes. Tor even educates operators about interactions with law enforcement. Also, Tor operators can be solicited via Tor's mailing lists.
Tor publishes a "consensus" of all Tor relays, including their long-term keys and other details, which permits unbiased path selection by Tor clients, among other features. Tor even keeps historical records of these consensuses. In other words, Tor itself acts as a strong character witness, because a long-term Tor node represents a serious sustained altruistic act by the operator.
We should favor nodes with good flags set, like Exit, Guard, etc., and reject nodes with bad flags set. If desired, we can champion geographic diversity by favoring nodes outside Europe and the U.S., à la https://tormap.void.gr/
Is this easy? Yes, fairly:
We first download historical records from metrics.torproject.org and/or archive.torproject.org, likely via rsync://metrics.torproject.org. We then write a parsing script that takes a list of applicants by ed25519 master public key, extracts their data from Tor's history, and prioritizes them based upon our criteria.
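A minimal sketch of such a parsing script might look like the following. It assumes descriptors were already fetched into a local directory, and that each archived server descriptor carries a `master-key-ed25519` line (unpadded base64), per Tor's dir-spec; the ranking criterion here, descriptor count as a proxy for relay longevity, is an illustrative assumption.

```python
# Hypothetical sketch: rank applicants by their relays' history in
# archived Tor server descriptors.  Assumes each descriptor contains a
# "master-key-ed25519" line (base64), per Tor's dir-spec; the ranking
# criterion (appearance count as a longevity proxy) is illustrative.
import os

def scan_descriptors(archive_dir, applicant_keys):
    """Count archived descriptor appearances per applicant master key."""
    seen = {key: 0 for key in applicant_keys}
    for root, _dirs, files in os.walk(archive_dir):
        for name in files:
            path = os.path.join(root, name)
            with open(path, encoding="utf-8", errors="replace") as f:
                for line in f:
                    if line.startswith("master-key-ed25519 "):
                        key = line.split()[1]
                        if key in seen:
                            seen[key] += 1
    # More appearances in the archive suggests a longer-lived relay.
    return sorted(applicant_keys, key=lambda k: seen[k], reverse=True)
```

Real criteria would also weigh flags and geography, per the preceding paragraphs.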
Any tor relay has a data directory that contains either an ed25519_master_id_secret_key, or maybe ed25519_master_id_secret_key_encrypted, or else both ed25519_signing_secret_key (?) and ed25519_signing_cert, and perhaps places the master keys elsewhere.
We need a tool that converts either from a substrate ed25519 controller secret key to tor's ed25519_signing_secret_key, or else from tor's ed25519_signing_secret_key to a substrate ed25519 controller secret key. After we write this tool, a script to certify it using the tor master key looks vaguely like:
```
TorDD=/var/lib/tor # Put your data directory here
# We need tor to certify our new key using its master id key.
# As https://trac.torproject.org/projects/tor/ticket/17127 was not
# resolved, we do this by creating an ephemeral tor data directory
# in which we can create a new signing key for tor to certify.
# See https://trac.torproject.org/projects/tor/wiki/doc/TorRelaySecurity/OfflineKeys
# if you need to make this work with offline tor master id keys.
WORK=/tmp/TorEphDD
mkdir -p "$WORK"
cd "$WORK" || exit 1
F1="$TorDD/ed25519_master_id_secret_key"
F2="$TorDD/ed25519_master_id_secret_key_encrypted"
if [ -f "$F1" ]; then
ln -s "$F1"
elif [ -f "$F2" ]; then
ln -s "$F2"
else
echo "Neither $F1 nor $F2 exists." >&2
exit 1
fi
# TODO: Input a substrate ed25519 controller secret key passphrase
# and construct the file ed25519_signing_secret_key expected by tor.
missing_code
# We need --SigningKeyLifetime '0 days' because our fresh substrate
# controller key always gets an expired certificate for tor itself.
# We expect this reads the existing ed25519_signing_secret_key and
# creates a new updated ed25519_signing_cert.
tor --keygen --DataDirectory . --SigningKeyLifetime '0 days'
# Now submit the new ed25519_signing_cert as an application for nomination
curl -X POST --header "Content-Type:text/xml;charset=UTF-8" --data-binary @ed25519_signing_cert https://invitetor.web3.foundation
# TODO: If ed25519_signing_secret_key was not created above, then
# actually the above tor command created one, so we could convert
# it into a format usable by substrate.
rm ed25519_signing_secret_key
```
If desired, we could actually convert from sr25519 keys here, since this code only operates ephemerally and need not be portable across ristretto implementations, but doing so would anger the daleks.
We should ask if there is any way to embed additional data in this certificate, and figure out if we'd want to do so. At our end, applications should consist of a certificate on a signing key by a master key. We evaluate the application based upon the master key's record, but should reject signing keys that appear in the live tor network.
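At our end, pulling the master key out of a submitted cert might look like this sketch. It assumes the ed25519 certificate layout from tor's cert-spec: a fixed header (version, cert type, expiration, key type, the 32-byte certified key, an extension count), then extensions, then a 64-byte signature, with the signing master key carried in a type 0x04 "signed-with-ed25519-key" extension; the 32-byte on-disk tag prefix handling is also an assumption, and signature verification is omitted.

```python
# Sketch (layout per tor's cert-spec, assumptions flagged): parse an
# ed25519_signing_cert blob and extract both the certified signing key
# and the master key that signed it.  No signature verification here.

def extract_keys(blob):
    # Assumption: tor's on-disk key files start with a 32-byte ASCII
    # tag such as "== ed25519v1-cert: type4 =="; strip it if present.
    if blob[:2] == b"==":
        blob = blob[32:]
    assert blob[0] == 1                # certificate format version
    certified_key = blob[7:39]         # 32-byte key being certified
    n_extensions = blob[39]
    pos = 40
    master_key = None
    for _ in range(n_extensions):
        ext_len = int.from_bytes(blob[pos:pos + 2], "big")
        ext_type = blob[pos + 2]       # blob[pos+3] holds ext flags
        ext_data = blob[pos + 4:pos + 4 + ext_len]
        if ext_type == 0x04:           # signed-with-ed25519-key
            master_key = ext_data      # 32-byte master public key
        pos += 4 + ext_len
    return certified_key, master_key
```

The extracted master key is what we look up in Tor's historical records; the certified key is what we reject if it appears in the live network.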
## I2P
I2P sounds similar to Tor in principle. Yet, I2P nodes learn about their network via gossip, not via a consensus like Tor. We'd need to discover I2P nodes by running I2P nodes ourselves.
Also, I2P never engaged with the academic community for the purpose of studying metrics, never raised funds for building metrics infrastructure, and probably never collected historical data.
## Bittorrent
We need not include trackers themselves since too few trackers exist. If we can learn the torrents, then we can survey trackers for current stats from seeders. Again, historical stats would not exist, but this could be easier than I2P if the trackers tell us their torrents. Assuming trackers never tell us about their torrents, we'd need a list of interesting torrents, which sounds labor-intensive.
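Surveying a tracker could use BitTorrent's conventional "scrape" endpoint: for HTTP trackers, substituting "scrape" for "announce" in the last path component of the announce URL conventionally yields per-torrent seeder counts. A sketch of that derivation (trackers are not obliged to support scrape, and the URLs below are placeholders):

```python
# Sketch of the conventional BitTorrent scrape-URL derivation: an HTTP
# tracker whose announce URL's last path component begins with
# "announce" usually serves per-torrent stats at the same path with
# "scrape" substituted.  Support for this is conventional, not required.
from urllib.parse import urlsplit, urlunsplit

def announce_to_scrape(announce_url):
    parts = urlsplit(announce_url)
    head, _, tail = parts.path.rpartition("/")
    if not tail.startswith("announce"):
        raise ValueError("tracker does not follow the scrape convention")
    path = head + "/" + tail.replace("announce", "scrape", 1)
    return urlunsplit((parts.scheme, parts.netloc, path,
                       parts.query, parts.fragment))
```

Fetching the resulting URL with urlencoded `info_hash` parameters returns a bencoded dictionary of seeder and leecher counts per torrent.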
## Slashing
Is there anything that meaningfully obstructs validators from intentionally being slashed? tl;dr no.
We'll avoid slashing for most honest mistakes by doing [substrate#7398](https://github.com/paritytech/substrate/issues/7398) aka ["slashing reform"](https://hackmd.io/FY4MCfZCSZGkV-Ulo3VSRg) ([w3f#530](https://github.com/w3f/research-internal/issues/530)), which we require for bridges anyway. Yet, these "back certs" do not reduce slashing levels, do not avoid honest timeout mistakes [polkadot#1656](https://github.com/paritytech/polkadot/pull/1656), and do not mitigate intentional slashes.
### Correlation
We cannot meaningfully reduce "uncorrelated" slashes for candidate invalidity: anytime an (invalid) parachain candidate gets backed, an adversary soon learns its approval checkers. At this point, if the adversary dislikes their odds against those approval checkers, then they need not reveal their own approval assignments for checking that candidate, and they can simply abandon their backing checkers to being slashed. Asymptotically, an adversary with unlimited stake can make infinite profit by running this attack on Polkadot. We need real adversaries to run out of stake before this attack succeeds.
Roughly speaking, if the backing group were slashed only a proportion 1/k, then slashing them 100% exhausts their stake k times faster. In other words, slashing 100% asks adversaries to commit k times more stake, which means k times more checkers _all the time_, and so slashing 100% supports k times as many parachains.
It follows that a few validators wishing to be slashed 100% could always simply wait until they've enough validators assigned to the same validator group, and then produce a block on whatever parachain they like. If we've k groups, need 3 backers, and the adversary has n validators, then this happens with odds p = {n \choose 3} k^{-2} in any given era, so their expected waiting time is 1/p = k^2 / {n \choose 3} eras. We cannot improve this exponent of 2 much by asking for more backers, because backers contribute much less than approval checkers.
We thus cannot mitigate against validators intentionally getting themselves slashed 100% and reporting the slash themselves.
### Rewards
It's worse: a validator who slashes themselves could walk away with reporting rewards taken from their nominators' stake.
We should reduce rewards for reporting slashing of course, à la https://github.com/w3f/research-security-issues/issues/42. Ideally, rewards would be less than minimum self stake, but that's impossible. We've two respectable options however:
- Reporting rewards could become a lower bound on self stake plus some multiple of commission, so a validator with low self stake must have a high commission instead.
- Reporting rewards could become "more manual", either by asking the council to tip the reporters from treasury, or else by locking the rewards for long enough that the council can manually reduce the rewards if appropriate.
## ssh-airdrop
- Fetch ssh public keys via [`https://github.com/<username>.keys`](https://stackoverflow.com/questions/16158158/what-is-the-public-url-for-the-github-public-keys)
- ssh key parsing
  - [`ssh-keygen -p -m PEM -f ~/.ssh/id_rsa` etc](https://stackoverflow.com/a/56300901)
- https://rubygems.org/gems/openssl-additions
- https://github.com/pwnedkeys/openssl-additions/blob/master/lib/openssl/pkey.rb
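The fetch-and-parse step above could be sketched as follows; GitHub serves a user's keys in standard authorized_keys format, one `<type> <base64-body>` line per key (the split into a separate parsing helper is our own choice, for testability):

```python
# Sketch: fetch a GitHub user's public ssh keys and split each line
# into its key type and base64 body.  GitHub serves them at
# https://github.com/<username>.keys in authorized_keys format.
from urllib.request import urlopen

def parse_authorized_keys(text):
    """Split authorized_keys-style text into (type, base64-body) pairs."""
    keys = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 2:
            keys.append((fields[0], fields[1]))
    return keys

def fetch_github_keys(username):
    with urlopen(f"https://github.com/{username}.keys") as resp:
        return parse_authorized_keys(resp.read().decode())
```

The base64 body then feeds whatever key-format conversion the links above describe.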