| ... | ... |
| ------------ | --------------------------- |
| Title | Bitcoin Contacts |
| Status | Draft |
| Authors | Edward Pratt, Johns Beharry |
| Comments-URI | ... |
| Created | |
| Type | Informational |
# Contacts, Labelling, and Coin Control
This is a research document to collate information on contacts, UTXO labelling and coin control strategies within the bitcoin framework.
At the moment, only a small group of users (tech-savvy, privacy-focused cypherpunks) are getting the full benefit of what bitcoin payments have to offer with regard to privacy. More often than not, bitcoin is described as a "pseudonymous" payment mechanism. This is because information about users can be revealed over time to observers of the network (e.g. through address reuse, large change outputs, etc.). We'll get to more of this detail later, but for now just be aware that bitcoin is only as private as you make it.
So the goal of our research is to highlight and understand these privacy leaks, and to work out best practices that strengthen anonymity in payments -- making privacy in bitcoin payments something that is the default and accessible to less technical people.
For any information on definitions and terms, please visit the [Bitcoin.Design Glossary](https://bitcoin.design/guide/glossary/).
## Bitcoin's UTXO Model
In order to get a good understanding of contacts, labelling, and coin control within bitcoin payments, one must first understand the nature of Bitcoin's accounting model - Unspent Transaction Outputs (UTXOs).
Every bitcoin transaction (From -> To) has inputs, the coins being spent, and outputs, the new "coins" that the transaction creates: the payment itself plus any change returned to the sender. These newly created (unspent) coins do not have to equate to whole units of bitcoin (i.e. 1 BTC). Instead, they are often fractions of a coin. The best way to think of it is, "as change is created from transactions, it creates new and unique coins to be spent". Each input of a transaction has to be an unspent output of a previous transaction, and so on. Each UTXO is therefore the most recent link in a chain that can be traced all the way back to the block rewards (coinbase transactions) in which its value was first minted.
In layman's terms, if I go shopping for groceries and the total comes to $0.75, but I only have a $1 note, I will pay with this and receive $0.25 change. This change ($0.25) is the unspent transaction output (UTXO) which returns to my wallet, and will be seen as a unique "coin" that can be used in future transactions.
Each bitcoin wallet will display a balance, or total amount of funds, to the wallet's owner. A lot of the time, what wallet owners don't see are the different coins (UTXOs) that make up their total balance. More often than not, a wallet's balance is composed of multiple UTXOs (coins) whose values add up to that total.
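The grocery example can be sketched in a few lines of Python. This is a minimal illustration rather than real transaction code: the `UTXO` class, the `spend` function and the transaction ids are hypothetical names, values are in cents instead of satoshis, and network fees are ignored.

```python
from __future__ import annotations
from dataclasses import dataclass

@dataclass(frozen=True)
class UTXO:
    txid: str    # transaction that created this output
    index: int   # position of the output within that transaction
    value: int   # amount in cents here; satoshis in bitcoin

def spend(coin: UTXO, payment: int, new_txid: str) -> list[UTXO]:
    """Consume `coin` entirely and return the outputs of the new transaction."""
    if payment > coin.value:
        raise ValueError("coin too small for this payment")
    outputs = [UTXO(new_txid, 0, payment)]         # goes to the payee
    change = coin.value - payment
    if change > 0:
        outputs.append(UTXO(new_txid, 1, change))  # returns to the payer
    return outputs

note = UTXO("tx_prev", 0, 100)  # the $1 note, in cents
to_shop, to_me = spend(note, 75, "tx_groceries")
print(to_shop.value, to_me.value)  # 75 25
```

Note that the original coin is never "topped down": it is consumed whole, and two brand new coins take its place.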
WALLET BALANCE: 6.8 BTC
Contains 8 UTXOs : 0.2 BTC | 0.8 BTC | 4.6 BTC | 0.004 BTC | 0.0182 BTC | 0.128 BTC | 0.0012 BTC | 1.0486 BTC
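As a small sketch of how the balance above is derived, the snippet below sums the eight example coins. It works in satoshis (integers) to avoid floating-point rounding; the variable names are illustrative only.

```python
from decimal import Decimal

SATS_PER_BTC = 100_000_000

# The eight example coins from above, written as strings so Decimal stays exact.
utxos_btc = ["0.2", "0.8", "4.6", "0.004", "0.0182", "0.128", "0.0012", "1.0486"]

# Convert each coin to an integer number of satoshis, then add them up.
utxos_sats = [int(Decimal(v) * SATS_PER_BTC) for v in utxos_btc]
balance_sats = sum(utxos_sats)

print(len(utxos_sats), "coins,", balance_sats, "satoshis")
```

The total is 680,000,000 satoshis, i.e. the 6.8 BTC shown as the wallet balance, even though no single coin holds that amount.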
## Problems with Privacy
Despite its various benefits, Bitcoin's UTXO model (if not used effectively) can lead to privacy loss for both payers and payees. Tactics such as CoinJoins, unique address generators, and more can be used to curb this loss over time. However, in a nutshell, here's why Bitcoin's UTXO model can lead to privacy loss and transaction tracking.
### Unique UTXOs & Contact Mixing
> *"Each UTXO is a unique snowflake with a public transaction history. For example, when Alice sends a coin to Bob, then Bob does not just have any random UTXO, but he has specifically the coin that Alice has sent him. When Bob sends this coin to Charlie, then Charlie can check the history of the coin and see the transaction from Alice to Bob. But due to the pseudonymity of Bitcoin, he does not necessarily find out that Alice is involved."*
> *"There is a chain of digital signatures all the way from the coinbase reward to the current UTXO. This transaction history can reveal sensitive information about the spending patterns of individuals. The receiver of a coin can look back into the transaction history of the sender. And the sender can see the future spending of the receiver."*
> **[Wasabi Docs](https://docs.wasabiwallet.io/why-wasabi/Coins.html#problem)**
Say that part of one of my coins ends up in the wallet of a friend (from a payment). They then have the ability to retrace that coin's footsteps all the way back to the block reward it originated from. Say Alice had used one of her coins to buy Bob a birthday present, and then later used the change from that transaction to pay Bob back for the pizza he bought her last week. Bob, in theory, could trace the coin back until he discovered the transaction in which Alice bought him a present (perhaps he knows the exact price of what he asked for). This is obviously a fairly harmless example, but it does show how our transactions can be traced.
Furthermore, coin consolidation (or merging) can lead to an even further loss of privacy. This happens when two or more coins are selected as a transaction's inputs and merged into a single output (for example, change). Because all inputs of a transaction are usually assumed to belong to the same owner, an observer can link the histories of every coin that was merged.
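To make the Alice and Bob example concrete, here is a minimal sketch of how a recipient could walk a coin's history backwards. The `parents` mapping and the transaction names are hypothetical stand-ins for data that a block explorer or full node would provide; real transactions reference a previous transaction id plus an output index.

```python
from __future__ import annotations

# Hypothetical, simplified chain data: each transaction lists the
# transactions whose outputs it spends.
parents = {
    "tx_pizza": ["tx_present"],   # Alice pays Bob back using her change
    "tx_present": ["tx_salary"],  # Alice buys Bob's present
    "tx_salary": [],              # where Alice's coin came from
}

def ancestry(txid: str) -> list[str]:
    """Return every transaction reachable by following inputs backwards."""
    seen: list[str] = []
    stack = [txid]
    while stack:
        current = stack.pop()
        for parent in parents.get(current, []):
            if parent not in seen:
                seen.append(parent)
                stack.append(parent)
    return seen

# Bob only received tx_pizza, yet he can discover the present purchase.
print(ancestry("tx_pizza"))  # ['tx_present', 'tx_salary']
```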
### Address Reuse & Wallet Clustering
If an address is used more than once to receive a bitcoin transaction (or change outputs from outgoing transactions), the address begins to collect multiple coins to spend under the management of one private key. It becomes very easy to find each UTXO held by the address, and therefore the address's total balance.
What's more, if one output of an outgoing transaction pays to a reused address, then it is more than likely that this output is the payment destination. This is because most wallets automatically generate a new change address for each transaction, whereas the payment destination is selected manually by the wallet owner.
> The address reuse would happen because the human user reused an address out of ignorance or apathy - **[Bitcoin Wiki](https://en.bitcoin.it/Privacy#Address_reuse)**
Chain analysis companies recognise that change outputs are usually sent to newly generated addresses, so any reused address is likely to be a payment destination and may serve as a point of reference for UTXO clusters.
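The heuristic described above can be sketched as follows. This is a deliberately naive illustration, not how any particular chain analysis tool works; the function name and the example addresses are hypothetical.

```python
from __future__ import annotations

def guess_payment_outputs(outputs: list[str], seen_before: set[str]) -> list[str]:
    """Guess which outputs are payments: those sent to an already-seen address.

    Outputs to never-before-seen addresses are treated as likely change.
    """
    return [address for address in outputs if address in seen_before]

# Addresses the observer has already seen on chain (hypothetical).
seen = {"addr_shop"}

# A new transaction with two outputs: one reused address, one fresh address.
tx_outputs = ["addr_shop", "addr_fresh"]

print(guess_payment_outputs(tx_outputs, seen))  # ['addr_shop']
```

If neither output address had been seen before, this heuristic would return nothing, which is exactly why avoiding address reuse helps.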
#### Dusting Attacks
A dusting attack is when an actor (usually a security/chain analysis firm, or a malicious actor) sends a barely noticeable amount of bitcoin (perhaps a few hundred or a few thousand satoshis) to an address. As mentioned before, UTXO spending histories link addresses and transaction timelines, so once an address holds dust, the attacker has a marker they can follow. If this dust is then moved (perhaps unknowingly used as a transaction input by the address owner), it is spent alongside the owner's other coins, and the attacker (the sender of the dust) can link those coins and addresses together, furthering their attempt to deanonymise wallet owners and their spending habits. This knowledge can then be used in subsequent phishing attacks, or even to blackmail or extort wallet owners. Research labs, government agencies and other companies also use dusting attacks in an attempt to deanonymise blockchain networks such as Bitcoin.
The solution to these dusting attacks is for wallet owners neither to spend the dust in their wallets nor to move it between addresses. As long as the dust stays where it is, the wallet owner can avoid contaminating the rest of their funds, and hence protect their privacy too.
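One way a wallet might apply this advice is to flag very small incoming coins and exclude them from coin selection. The sketch below assumes a cut-off of 1,000 satoshis; that threshold, the coin ids and the function names are all illustrative assumptions rather than an established standard.

```python
from __future__ import annotations

DUST_THRESHOLD_SATS = 1_000  # assumed cut-off; pick a value that suits the wallet

# Hypothetical wallet contents: coin id -> value in satoshis.
wallet = {"coin_a": 250_000, "coin_b": 546, "coin_c": 80_000}

def find_dust(utxos: dict[str, int]) -> set[str]:
    """Return the ids of coins small enough to be treated as possible dust."""
    return {coin for coin, value in utxos.items() if value <= DUST_THRESHOLD_SATS}

def spendable(utxos: dict[str, int], frozen: set[str]) -> dict[str, int]:
    """Coins that automatic coin selection is allowed to use as inputs."""
    return {coin: value for coin, value in utxos.items() if coin not in frozen}

frozen = find_dust(wallet)
print(sorted(spendable(wallet, frozen)))  # ['coin_a', 'coin_c']
```

The flagged coin stays in the wallet untouched, so it never appears as an input next to the owner's other coins.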
### Labelling & Coin Control
Coin control is the ability for users to manually select which coins they wish to use as inputs when funding a bitcoin transaction. This is often seen as a more advanced feature in bitcoin wallets as it gives users greater control over optimising for privacy during transactions.
Coin control is made a lot easier when paired with a process known as UTXO/coin labelling. This is similar to attaching a reference or name to your transactions, so you know exactly who you sent which coins to (and perhaps what for). When creating bitcoin transactions, some wallets ask you to label the recipient's address (e.g. Bob), so that when looking back at your UTXOs, their transaction histories are fairly clear at a glance. Once labelled, bitcoin wallets can inform the owner of which coins have been where and used by whom.
Users can then manually select or prioritise transaction inputs based on what information has already been revealed, and what could be revealed in the future. Payees can only see the histories of the UTXOs they have been exposed to, along with the addresses linked to those UTXOs by association. So, assuming users are not reusing addresses, reusing coins that are already associated with a contact/label can minimise the risk of exposing further addresses and balances to payees.
#### The Importance of Labels
When wallet owners receive coins (either in the form of an incoming payment, or a change output from an outgoing payment) they are able to label these coins, specifying "who" they are from. Labels not only provide context to a wallet's transaction history, but are also essential in identifying the origin of coins - making it easier to avoid cross-contamination of payment, balance, or address information when inputs are selected for future transactions.
Groups of coins with the same label are often referred to as "clusters". For the purpose of privacy, it is generally best practice to avoid mixing coins from different clusters as transaction inputs. If this happens, the addresses (or labels) associated with those coins become linked and the anonymity of payments begins to degrade. More on this in the Privacy in Bitcoin Transactions chapter.
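A minimal sketch of cluster-aware coin selection is shown below: it funds a payment from a single cluster only, and gives up rather than mixing clusters. The labels, amounts and the `select_from_one_cluster` function are hypothetical; a real wallet would also need to account for fees and let the user override the choice.

```python
from __future__ import annotations
from collections import defaultdict

# Hypothetical labelled coins: (label, value in satoshis).
coins = [("Bob", 50_000), ("Bob", 30_000), ("Exchange", 200_000), ("Carol", 40_000)]

def select_from_one_cluster(coins: list[tuple[str, int]], target: int):
    """Pick inputs from a single cluster that can cover `target`, or None."""
    clusters: dict[str, list[int]] = defaultdict(list)
    for label, value in coins:
        clusters[label].append(value)

    for label, values in clusters.items():
        if sum(values) < target:
            continue  # this cluster alone cannot fund the payment
        chosen, total = [], 0
        for value in sorted(values, reverse=True):
            chosen.append(value)
            total += value
            if total >= target:
                return label, chosen
    return None  # funding would require mixing clusters

print(select_from_one_cluster(coins, 70_000))   # ('Bob', [50000, 30000])
print(select_from_one_cluster(coins, 250_000))  # None
```

Returning `None` instead of silently combining clusters gives the wallet a chance to warn the user that the payment would link otherwise separate contacts.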
### Minimising Change Outputs
As we have talked about already, each bitcoin transaction contains both inputs and outputs. Inputs are either an automatic (algorithmic) or manual selection of UTXOs from the wallet. Like many of our day-to-day transactions with cash, there is often change involved. These coins are known as "change outputs", and are simply UTXOs returned to the sender's wallet (often to a newly generated address).
If I receive a bitcoin payment from someone, I can analyse the transaction data and often tell which outputs were change and which were a payment to me. Once I know the change output address, I can see the sensitive information (e.g. balance) of that address, as well as any other UTXOs associated with it.
For example, if I receive a payment of 0.5 BTC ($25,000) for my Christmas bonus, and then use this UTXO to fund a transaction at my friend's coffee shop, they are able to analyse the transaction data and see that I funded the payment with a coin worth $25,000. They might then ask a bunch of questions which are frankly none of their business. So, in order to reveal less balance data, users can select smaller coins for these types of transactions, or perhaps use mixers/CoinJoins to create private (anonset) coins for spending.
Therefore, by minimising the change outputs of a transaction, we reveal as little balance information as possible to our payees. Some automatic coin selection algorithms (Branch & Bound, or Blackjack) aim to create as little change as possible in order to make payments inherently more private. Some even attempt to create an "exact match", which means there are 0 change outputs (i.e. the transaction is funded with the exact amount required to fulfil the payment request).
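The "exact match" idea can be illustrated with a small depth-first search that prunes branches which cannot succeed. This is a simplified sketch in the spirit of Branch & Bound, not the implementation used by any particular wallet; the `exact_match` function and its `tolerance` parameter are assumptions made for this example, and fees are ignored.

```python
from __future__ import annotations

def exact_match(coins: list[int], target: int, tolerance: int = 0) -> list[int] | None:
    """Find coins summing to within [target, target + tolerance], or None."""
    coins = sorted(coins, reverse=True)

    # remaining[i] = total value of coins[i:], used to prune hopeless branches.
    remaining = [0] * (len(coins) + 1)
    for i in range(len(coins) - 1, -1, -1):
        remaining[i] = remaining[i + 1] + coins[i]

    def search(i: int, total: int, chosen: list[int]) -> list[int] | None:
        if target <= total <= target + tolerance:
            return chosen                      # no change output needed
        if total > target + tolerance:
            return None                        # overshot: bound this branch
        if i == len(coins) or total + remaining[i] < target:
            return None                        # cannot reach the target
        # Branch: first try including coins[i], then try skipping it.
        return (search(i + 1, total + coins[i], chosen + [coins[i]])
                or search(i + 1, total, chosen))

    return search(0, 0, [])

# Values in satoshis: pay exactly 1 BTC from four coins.
print(exact_match([20_000_000, 80_000_000, 460_000_000, 400_000], 100_000_000))
# [80000000, 20000000]
```

When no exact combination exists the function returns `None`, and a wallet would fall back to a selection that produces change.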
### Avoiding Address Reuse
BIP 44 (Multi-Account Hierarchy for Deterministic Wallets) is now an industry-standard Bitcoin Improvement Proposal. Building on BIP 32, it uses a master secret (private key) to generate a tree structure of "child" private keys and addresses. This is a deterministic scheme because the same parent secret will always generate the same child private keys. This allows wallets to generate a new (child) address each time you wish to receive an incoming transaction, whilst still maintaining the umbrella security of one private (parent) key that can be used to calculate each child private key. Once an address has been used (coins have been received by it), it will no longer show in the wallet UI under the "Receive" tab. Instead, a new child address will have been generated to avoid address reuse.
Furthermore, BIP 44 wallets use an alternative key derivation method called "hardened derivation" for the upper levels of the tree (purpose, coin type and account). With normal derivation, anyone who holds a parent extended public key together with one of its child private keys can calculate the parent private key. Hardened derivation removes this possibility, increasing the security of parent/child private keys.
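To show what the tree looks like in practice, the sketch below builds BIP 44 derivation paths of the form `m / purpose' / coin_type' / account' / change / address_index` and converts them to the numeric indices used by BIP 32, where an apostrophe marks a hardened level (index plus 2^31). It only handles path strings and performs no actual key derivation; the helper names are illustrative.

```python
from __future__ import annotations

HARDENED = 0x80000000  # BIP 32: indices >= 2**31 use hardened derivation

def bip44_path(account: int, change: int, index: int, coin_type: int = 0) -> str:
    """Build a BIP 44 path; coin_type 0 is Bitcoin, change is 0 (receive) or 1."""
    return f"m/44'/{coin_type}'/{account}'/{change}/{index}"

def path_to_indices(path: str) -> list[int]:
    """Convert a path string into the list of child indices it describes."""
    indices = []
    for part in path.split("/")[1:]:  # skip the leading "m"
        if part.endswith("'"):
            indices.append(int(part[:-1]) + HARDENED)
        else:
            indices.append(int(part))
    return indices

# The first three receive addresses of account 0: a fresh path per payment.
for i in range(3):
    print(bip44_path(account=0, change=0, index=i))

print(path_to_indices("m/44'/0'/0'/0/0"))
# [2147483692, 2147483648, 2147483648, 0, 0]
```

Incrementing the final index is what lets a wallet hand out a new receive address each time, while change addresses follow the same pattern on the `change = 1` branch.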
### CoinJoins & Mixers
As mentioned before, each UTXO has a public transaction history which can be analysed all the way back to its original coinbase reward. This inadvertently links senders and receivers, inputs and outputs.
In order to obfuscate this link between the inputs and outputs of bitcoin transactions, some wallets and third party services use CoinJoins or Coin Mixers to reinstate a sense of anonymity.
CoinJoins are collaborative transactions that combine inputs from many users and create a number of equal-value outputs. The inputs are usually non-private, and can therefore be linked to each coin's previous transaction history. However, because the outputs are intentionally made to be of equal value, an observer cannot tell which input funded which output. The size of this group of indistinguishable outputs is called the anonymity set, or anonset: the larger it is (e.g. 100), the less chance an observer has of guessing the correct transaction history of any one output.
When a user now sends one of these anonset coins (the output of a CoinJoin), the receiver cannot clearly see the transaction history of the UTXO before the CoinJoin occurred. Furthermore, if the receiver decides to do a CoinJoin using the same UTXO, the previous sender cannot survey the future spending patterns/transactions of that UTXO. Observers (senders, receivers, or others) can only hazard a guess at the correct input/output link. The chance of them guessing this link correctly, and therefore linking the UTXO histories, is 1/anonset, e.g. 1/100 or 1%. CoinJoins are very useful tools for obscuring UTXO histories and re-anonymising coins, but they are not foolproof and can in theory be unravelled. However, as UTXOs go through more and more CoinJoins (perhaps for increased privacy), the chance of association diminishes rapidly.
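The arithmetic behind "diminishes rapidly" can be sketched as below. It rests on an idealised assumption that is not guaranteed in practice: each CoinJoin round is treated as independent, and the observer has no other information (timing, amounts, later merging of outputs) to narrow the guess. The `link_probability` function is a hypothetical helper for illustration.

```python
from __future__ import annotations
from fractions import Fraction
from math import prod

def link_probability(anonsets: list[int]) -> Fraction:
    """Chance of guessing the right link through every round, assuming
    independent rounds and a uniform 1/anonset guess per round."""
    return prod((Fraction(1, n) for n in anonsets), start=Fraction(1))

print(link_probability([100]))           # 1/100
print(link_probability([100, 100]))      # 1/10000
print(float(link_probability([100, 50, 50])))
```

Under these assumptions, two rounds with an anonset of 100 each take the odds from 1% to 0.01%. Real-world figures will be less favourable whenever the rounds or the surrounding transactions leak additional information.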