---
title: Mastering Ethereum
---

## Mastering Ethereum Notes

### Chapter 1: What is Ethereum?

- A general-purpose blockchain
- It is a distributed state machine. But instead of tracking only the state of currency ownership, Ethereum tracks the state transitions of a general-purpose data store, i.e., a store that can hold any data expressible as a key–value tuple.
- Two of the critical differences from most general-purpose computers are that Ethereum state changes are governed by the rules of consensus and the state is distributed globally.
- Components of Ethereum
    - A peer-to-peer network
    - Consensus rules
    - Transactions (messages across the network)
    - State machine (the `Ethereum Virtual Machine`, a stack-based VM that executes bytecode)
    - Data structures (for storing the state and transactions in a systematic fashion using a `Merkle Patricia Tree`)
    - Consensus algorithm (PoW named `Ethash` in Eth1; PoS, codenamed `Casper`, in Eth2)
    - Economic security
    - Clients (Ethereum has several interoperable client implementations)
- Ethereum is Turing complete, meaning it can compute any algorithm that can be computed by any Turing machine.
- Hence, Ethereum combines the general-purpose computing architecture of a stored-program computer with a decentralized blockchain, thereby creating a distributed, single-state world computer. Ethereum programs run everywhere, yet produce a common state that is secured by the rules of consensus.
- The fact that Ethereum is Turing complete means that a program of any complexity can be executed on it, which leads to security and resource management problems.
- Hence, Ethereum introduces `Gas`, a metric that measures the cost of each operation on Ethereum. This limits resource usage and helps solve many of these problems. More on gas in later chapters.
- Along the way, DApps (decentralized apps) turned out to be one of the major use cases of a world computer.
- DApps, in simpler terms, are a set of contracts plus a web interface built on Ethereum's infrastructure.
- This further led to `Web3`, meaning the third version of the web, which represents a new vision and focus for web apps: from centrally owned and managed apps to those built on decentralized protocols.

### Chapter 2: Ethereum Basics

- Ethereum's native currency is `Ether`, also identified as `ETH`.
    - Smallest unit: `wei`, which is 10^(-18) ETH.
    - Ethereum is the system, Ether is the currency.
- Wallet
    - A wallet is a software application that allows you to use an Ethereum account (a gateway to Ethereum). It holds your keys (discussed in detail later) and can create and broadcast transactions to the Ethereum network on your behalf.
    - Your account (your private keys) gives you access to your funds and smart contracts.
- EVM
    - The Ethereum Virtual Machine is a global singleton, meaning that it operates as if it were a global, single-instance computer, running everywhere. Each node on the Ethereum network runs a local copy of the EVM to validate contract execution, while the Ethereum blockchain records the changing state of this world computer as it processes transactions and smart contracts. More on the EVM in later chapters.
- Contracts and Externally Owned Accounts (EOAs)
    - EOAs are accounts owned by users (like wallet accounts).
    - Contract accounts refer to contracts deployed on the Ethereum network, containing code.
    - EOAs have private keys, while contract accounts don't.
    - Contract accounts have code, while EOAs don't.
    - Both have Ethereum addresses, which act as identifiers. Both can send and receive ether.
    - However, when a transaction destination is a contract address, it causes that contract to run in the EVM, using the transaction, and the transaction's data, as its input. In addition to ether, transactions can contain data indicating which specific function in the contract to run and what parameters to pass to that function.
In this way, transactions can call functions within contracts.
    - Note: As contracts don't have private keys, they can't initiate a transaction (but in response to a transaction, they can call other contracts).
- Contracts
    - When a contract is created, a transaction with the zero address (0x00) as its destination and data containing EVM bytecode is sent across the network.
    - Any time someone sends a transaction to a contract address, it causes the contract to run in the EVM, with the transaction as its input.
    - Transactions sent to contract addresses may have ether or data or both. If they contain ether, it is "deposited" to the contract balance. If they contain data, the data can specify a named function in the contract and call it, passing arguments to the function.
    - Calling a contract function (through a transaction) creates an `IN` transaction (in Etherscan's terms), while the calls and transfers the contract makes in response are shown under `Internal Transactions`.

### Chapter 3: Ethereum Clients

- In simple terms, an Ethereum client is a software application that implements the Ethereum specification and communicates over the peer-to-peer network with other Ethereum clients.
- There exist a variety of Ethereum-based networks that largely conform to the formal specification defined in the Ethereum Yellow Paper, but which may or may not interoperate with each other.
- The types of Ethereum clients are listed below.
    - Full Client / Node
        - Provides all functionality but requires significant resources and bandwidth (more on advantages and disadvantages [here](https://github.com/ethereumbook/ethereumbook/blob/develop/03clients.asciidoc#full-node-advantages-and-disadvantages)).
    - Remote Clients
        - Similar to wallets: lightweight clients that help broadcast transactions to the network (they don't validate block headers or transactions).
    - Local Blockchain
        - Single-instance private blockchain (e.g.
Ganache)
    - Light nodes
- JSON-RPC Interface
    - Ethereum clients offer an API and a set of RPC (Remote Procedure Call) commands, which are JSON encoded.
    - Refer to the JSON-RPC API specification.

### Chapter 4: Cryptography

- Keys and Addresses
    - Ownership of funds and contracts by EOAs is established through digital private keys, Ethereum addresses, and digital signatures.
    - In fact, account addresses are derived directly from private keys: a private key uniquely determines a single Ethereum address, also known as an account. However, private keys are never transmitted to or stored on Ethereum.
    - EOAs have a key pair (public and private), where the public key behaves as the identifier while the private key is the password that gives ownership of the account.
    - Contract accounts are not backed by private keys.
    - For digital signatures generated with private keys, the inverse (recovering the key from the signature) is very hard to calculate.
    - The account holder signs the transaction with their private key, creating a digital signature, which is sent over the network along with the transaction for verification. Knowing only the digital signature and the transaction details, nodes are able to verify (using elliptic curve cryptography) the validity of the transaction.
    - Elliptic curve mathematics means that anyone can verify that a transaction is valid, by checking that the digital signature matches the transaction details and the Ethereum address to which access is being requested.
    - The verification doesn't involve the private key at all; that remains private.
    - Ethereum uses the Keccak-256 cryptographic hash function in many places.
    - Ethereum account addresses are hexadecimal identifiers formed from the last 20 bytes of the Keccak-256 hash of the public key.
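The address derivation described above can be sketched in plain Python. Python's `hashlib.sha3_256` uses the later NIST SHA-3 padding rather than the original Keccak-256 padding, so this sketch carries its own compact (slow, illustrative) Keccak-256. The helper names `keccak256` and `address_from_public_key` are hypothetical, and the example public key is assumed to be the secp256k1 generator point, i.e. the public key belonging to the private key 1. Deriving a public key from a private key is not shown.

```python
MASK64 = (1 << 64) - 1


def _rol64(value, shift):
    """Rotate a 64-bit lane left by `shift` bits."""
    shift %= 64
    return ((value << shift) | (value >> (64 - shift))) & MASK64


def _keccak_f1600(state):
    """Apply the 24-round Keccak-f[1600] permutation to a 200-byte state."""
    lanes = [[int.from_bytes(state[8 * (x + 5 * y):8 * (x + 5 * y) + 8], "little")
              for y in range(5)] for x in range(5)]
    lfsr = 1
    for _ in range(24):
        # theta: mix each column parity into its neighbouring columns
        c = [lanes[x][0] ^ lanes[x][1] ^ lanes[x][2] ^ lanes[x][3] ^ lanes[x][4]
             for x in range(5)]
        d = [c[(x + 4) % 5] ^ _rol64(c[(x + 1) % 5], 1) for x in range(5)]
        lanes = [[lanes[x][y] ^ d[x] for y in range(5)] for x in range(5)]
        # rho and pi: rotate lanes and move them to new positions
        x, y = 1, 0
        current = lanes[x][y]
        for t in range(24):
            x, y = y, (2 * x + 3 * y) % 5
            current, lanes[x][y] = lanes[x][y], _rol64(current, (t + 1) * (t + 2) // 2)
        # chi: non-linear step along each row
        for y in range(5):
            row = [lanes[x][y] for x in range(5)]
            for x in range(5):
                lanes[x][y] = row[x] ^ ((~row[(x + 1) % 5]) & row[(x + 2) % 5])
        # iota: XOR in the round constant, generated by an 8-bit LFSR
        for j in range(7):
            lfsr = ((lfsr << 1) ^ ((lfsr >> 7) * 0x71)) % 256
            if lfsr & 2:
                lanes[0][0] ^= 1 << ((1 << j) - 1)
    out = bytearray(200)
    for x in range(5):
        for y in range(5):
            out[8 * (x + 5 * y):8 * (x + 5 * y) + 8] = lanes[x][y].to_bytes(8, "little")
    return out


def keccak256(data: bytes) -> bytes:
    """Original Keccak-256: 136-byte rate, 0x01 ... 0x80 padding."""
    rate = 136
    padded = bytearray(data) + bytes(rate - len(data) % rate)
    padded[len(data)] ^= 0x01   # SHA-3 would use 0x06 here
    padded[-1] ^= 0x80
    state = bytearray(200)
    for offset in range(0, len(padded), rate):
        for i in range(rate):
            state[i] ^= padded[offset + i]
        state = _keccak_f1600(state)
    return bytes(state[:32])


def address_from_public_key(x: int, y: int) -> str:
    """Hash the 64-byte public key (x || y) and keep the last 20 bytes."""
    public_key = x.to_bytes(32, "big") + y.to_bytes(32, "big")
    return "0x" + keccak256(public_key)[-20:].hex()


# Assumed example input: the secp256k1 generator point.
GX = 0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798
GY = 0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8
print(address_from_public_key(GX, GY))
```

The result is a lowercase 40-hex-digit address; real wallets would typically use an audited library for both the elliptic curve step and the hash rather than hand-rolled code like this.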
- Additional notes on symmetric and asymmetric cryptography (taken from [here](https://hackernoon.com/asymmetric-cryptography-in-blockchains-d1a4c1654a71))
    - Symmetric Cryptography
        - A simple form of cryptography which uses a single 'key' to encrypt and decrypt data.
        - The sender encrypts the data with a 'key', which can be any random string, and sends it to the receiver. The receiver needs to have the same 'key' in order to decrypt it and get the original message.
        - Drawback: the key needs to be shared with everyone who needs access to the data.
    - Asymmetric Cryptography
        - Slightly more complex; it solves the drawbacks of symmetric cryptography.
        - Uses 'key pairs' instead of a single shared key to perform encryption and decryption.
        - Key pairs: public key (think of a user name) and private key (think of a password).
        - Data is tied to the public key and can be publicly available, but can only be authorised through the private key.
        - The sender encrypts the data with the 'public key' of the receiver (denoting that the data is being shared with a specific person) and the receiver can decrypt the data using their 'private key'.
        - Hence, those who are supposed to get access to the data are explicitly given access.
        - It gives security plus verifiability (see below).
    - Digital Signatures
        - Digital signatures are incorruptible and easily verifiable.
        - Example: Alice wants to send a message to Bob, and Bob wants to verify that the message was really sent by Alice. In this case, Alice puts the message in a box and locks the box with her private key. Bob can verify the message using Alice's public key.
        - "If you encrypt ("lock") something with your private key, anyone can decrypt it ("unlock"), but this serves as proof you encrypted it: it's "digitally signed" by you." (Panayotis Vryonis)
        - Example 2: Alice wants to send an encrypted message to Bob, and Bob wants verifiability as well as security. Hence, Alice first locks the message using her own private key, and then locks it with Bob's public key.
Bob will first have to decrypt with his private key (breaking the outer lock) and can then verify using Alice's public key (breaking the inner lock).
    - Usage in Blockchain
        - Accounts work the same way: each account has a key pair (public and private keys).
        - Example: a blockchain transaction is signed with one's private key but can be openly verified by each node using the public key.

### Chapter 5: Wallets

- A wallet, at a high level, is a software application that serves as the primary user interface to Ethereum.
- It controls access to the user's money, manages keys and addresses, tracks balances, and helps in creating and signing transactions.
- A wallet doesn't contain any of the user's funds; it only holds the keys.
- Types of wallet
    - Nondeterministic Wallet
        - Each key is independently generated from a different random number.
    - Deterministic Wallet
        - All keys are derived from a single master key called the `seed`.
        - All keys are related to each other and can be generated again if the seed is known, using one of various derivation mechanisms. For convenience, the seed is encoded as a set of English words, also known as `mnemonic code words`.
    - Hierarchical Deterministic Wallets (BIP-32/BIP-44)
        - HD wallets contain keys derived in a tree structure, such that a parent key can derive a sequence of child keys, each of which can derive a sequence of grandchild keys, and so on.
        - The first advantage is that the tree structure can be used to express additional organizational meaning.
        - The second advantage of HD wallets is that users can create a sequence of public keys without having access to the corresponding private keys.

### Chapter 6: Transactions

- Transactions are signed messages originated by an externally owned account (EOA), transmitted by the Ethereum network, and recorded on the Ethereum blockchain.
- They cause a change in state, or cause a contract to execute in the EVM. Every state change is caused by a transaction.
- A transaction is a serialized binary message that contains the following fields:
    - Nonce: A sequence number, issued by the originating EOA, used to prevent message replay
    - Gas price: The amount of ether (in wei) that the originator is willing to pay for each unit of gas
    - Gas limit: The maximum amount of gas the originator is willing to buy for this transaction
    - Recipient: The destination Ethereum address
    - Value: The amount of ether (in wei) to send to the destination
    - Data: The variable-length binary data payload
    - v,r,s: The three components of an ECDSA digital signature of the originating EOA
- The transaction message's structure is serialized using the Recursive Length Prefix (RLP) encoding scheme, which was created specifically for simple, byte-perfect data serialization in Ethereum. All numbers in Ethereum are encoded as big-endian integers, of lengths that are multiples of 8 bits.
- The field labels (to, gas limit, etc.) are not part of the serialized tx data, which contains only the RLP-encoded field values. Additional info such as the from address, block number, and tx hash is derivable from the tx itself or from the blockchain, but is not part of the tx message itself.
- Importance of each field
    - Nonce
        - According to the Ethereum Yellow Paper, the definition reads:

          > nonce: A scalar value equal to the number of transactions sent from this address or, in the case of accounts with associated code, the number of contract-creations made by this account.

        - The nonce is in the context of the sending / originating address.
        - It counts the number of confirmed transactions that have originated from that address.
        - Two instructive use cases, presented as scenarios, are described [here](https://github.com/ethereumbook/ethereumbook/blob/develop/06transactions.asciidoc#the-transaction-nonce).
        - This field is important in an `account-based` protocol, in contrast to Bitcoin's `UTXO (Unspent Transaction Output)` mechanism.
        - The network processes transactions sequentially based on nonce. If they arrive out of order (e.g. a 1st tx with nonce 0 and a 2nd tx with nonce 2), the network will process the 1st tx, but will assume that the tx with the missing nonce (1 in this case) is delayed and that the tx with nonce 2 was received out of order. Only once the gap is filled will the network consider the other txs held in the mempool.
        - In a sequence of txs, if one of them fails to get included, all later txs in the sequence get stuck in the mempool. To get things moving again, you have to submit a new valid tx with the missing nonce; once it is validated, the later ones in the sequence can be validated too. There is no way to `recall` or `undo` a transaction.
        - It's very important for the application generating txs to handle this situation, which is difficult in a concurrent environment.
        - Some light on concurrency (only the definition and issues are mentioned)
            - Concurrency is when you have simultaneous computation by multiple independent systems. These can be in the same program (e.g., multithreading), on the same CPU (e.g., multiprocessing), or on different computers (i.e., distributed systems). Ethereum, by definition, is a system that allows concurrency of operations (nodes, clients, DApps) but enforces a singleton state through consensus.
            - A scenario for handling withdrawals through a hot wallet in an exchange is shown [here](https://github.com/ethereumbook/ethereumbook/blob/develop/06transactions.asciidoc#concurrency-transaction-origination-and-nonces).
    - Gas
        - Gas is the fuel of Ethereum.
        - It's not ether; it's a separate virtual currency which has its own exchange rate against ether.
        - Necessity of gas: Ethereum uses gas to control the amount of resources that a transaction can use, since it will be processed on thousands of computers around the world.
The open-ended (Turing-complete) computation model requires some form of metering in order to avoid denial-of-service attacks or inadvertently resource-devouring transactions.
        - `gasPrice`: the price the originator is willing to pay per unit of gas, measured in wei per gas unit. For example, a gasPrice value of 3 gwei means that the user is willing to pay up to 3 gwei for 1 gas unit.
            - [ETH Gas Station](https://ethgasstation.info/) shows info and metrics about gas for ETH mainnet.
            - The higher the gasPrice, the higher the priority of the transaction and the better its chances of being confirmed quickly (the network also allows a gasPrice of 0).
        - `gasLimit`: the maximum number of gas units the user is willing to buy to complete the transaction.
            - A simple transfer operation requires 21,000 gas units.
        - Hence, the total gas fee the user pays is `currentGasPrice * gasUnitsConsumed`.
        - For a contract interaction, the gas units consumed can be estimated, but can't be determined exactly in advance. This is because the contract can contain conditions that lead to different execution paths, and therefore to different amounts of computation. Think of a car journey: the fuel is the gas and the car is the transaction, and the fuel burned depends on the route actually taken.
        - Hence, the limit shown while placing a transaction is not the exact amount one will pay. Gas is paid for based on the actual computation used while executing the transaction.
    - Transaction Recipient
        - A 20-byte `to` address. Can be an EOA or a contract address.
        - This field is not validated by the network, so any invalid value in this field will burn the ether sent with the tx.
    - Transaction value and data
        - The main `payload` of a transaction is contained in 2 fields: value and data. All 4 combinations of their presence are valid.
            - Only value: Payment
            - Only data: Invocation
            - Value and data: Both payment and invocation
            - None: Waste of gas!
(But still possible)
    - Transmitting Value to EOAs and Contracts
        - When you construct an Ethereum transaction that contains a value, it is the equivalent of a payment. Such transactions behave differently depending on whether the destination address is a contract or not.
        - For an EOA: the network will record a state change, updating the balance associated with that account.
        - For a contract: the EVM will execute the contract and attempt the following steps
            - Check whether the data names a function to invoke
            - If yes, invoke that function
            - Else call the fallback function (which needs to be payable)
            - If no such function is found, the tx will be reverted.
            - If found, the logic inside it is executed and the balance of the contract increases.
    - Transmitting a Data Payload to an EOA or Contract
        - When data is sent, the tx is most likely addressed to a contract account.
        - It can also be addressed to an EOA, but the network ignores the data in this case. The wallet may still choose to interpret it.
        - In the general case, the data is interpreted as a contract function invocation: the named function is called and any encoded arguments are passed to it.
        - The data payload sent to an ABI-compatible contract (which you can assume all contracts are) is a hex-serialized encoding of:
            - A function selector: The first 4 bytes of the Keccak-256 hash of the function's prototype. This allows the contract to unambiguously identify which function you wish to invoke.
            - The function arguments: The function's arguments, encoded according to the rules for the various elementary types defined in the ABI specification.
        - A good example is given [here](https://github.com/ethereumbook/ethereumbook/blob/develop/06transactions.asciidoc#transmitting-a-data-payload-to-an-eoa-or-contract).
- Special Transaction: Contract Creation
    - To create a contract, the tx is sent to a special address known as the `zero address`, i.e. 0x00, which represents neither an EOA nor a contract.
    - While the zero address is intended only for contract creation, it sometimes receives payments from various addresses.
    - Any ether sent to this address will be burnt.
    - A contract creation transaction need only contain a data payload that contains the compiled bytecode which will create the contract.
    - To initialize the contract with some balance, an ether amount can be set in the `value` field.
    - Once mined into a block, the receipt of this transaction contains the address of the contract created.
- Digital Signatures
    - The digital signature algorithm used in Ethereum is the Elliptic Curve Digital Signature Algorithm (ECDSA).
    - A digital signature serves three purposes in Ethereum. First, the signature proves that the owner of the private key, who is by implication the owner of an Ethereum account, has authorized the spending of ether, or execution of a contract. Secondly, it guarantees non-repudiation: the proof of authorization is undeniable. Thirdly, the signature proves that the transaction data has not been and cannot be modified by anyone after the transaction has been signed.
    - More details [here](https://github.com/ethereumbook/ethereumbook/blob/develop/06transactions.asciidoc#how-digital-signatures-work).
- Transaction Signing in Practice
    - To produce a valid transaction, the originator must digitally sign the message, using the Elliptic Curve Digital Signature Algorithm. When we say "sign the transaction" we actually mean "sign the Keccak-256 hash of the RLP-serialized transaction data." The signature is applied to the hash of the transaction data, not the transaction itself.
    - To sign a transaction in Ethereum, the originator must:
        - Create a transaction data structure, containing nine fields: nonce, gasPrice, gasLimit, to, value, data, chainID, 0, 0.
        - Produce an RLP-encoded serialized message of the transaction data structure.
        - Compute the Keccak-256 hash of this serialized message.
        - Compute the ECDSA signature, signing the hash with the originating EOA's private key.
        - Append the ECDSA signature's computed v, r, and s values to the transaction.
    - The special signature variable v indicates two things: the chain ID and the recovery identifier that helps the ECDSArecover function check the signature.
- Transaction Propagation
    - The Ethereum network uses a "flood routing" protocol. Each Ethereum client acts as a node in a peer-to-peer (P2P) network, which (ideally) forms a mesh network. No network node is special: they all act as equal peers.
    - After validation, the tx propagates from the origin node to its neighbours (on average about 13 neighbour nodes).
    - This continues until all nodes have validated it and stored a copy locally. Hence it gets flooded across the network.
    - Mining nodes pick transactions up from the mempool, and the transactions are eventually added to the blockchain.
    - Once mined into a block, transactions also modify the state of the Ethereum singleton, either by modifying the balance of an account or by invoking contracts that change their internal state. These changes are recorded alongside the transaction, in the form of a transaction receipt, which may also include events.
    - A transaction that has completed its journey from creation through signing by an EOA, propagation, and finally mining has changed the state of the singleton and left an indelible mark on the blockchain.
- Multisig Transactions
    - An account that requires more than one signer to authorize a transaction.
    - This is not possible with a basic EOA, but is possible through a `multisig wallet contract`. Example contract [here](https://solidity-by-example.org/app/multi-sig-wallet/).
    - More about it [here](https://github.com/ethereumbook/ethereumbook/blob/develop/06transactions.asciidoc#multiple-signature-multisig-transactions).

### Chapter 13: The Ethereum Virtual Machine (EVM)

- The EVM is the heart of the Ethereum protocol; it handles contract deployment and execution.
- The EVM is a quasi–Turing-complete state machine; "quasi" because all execution processes are limited to a finite number of computational steps by the amount of gas available for any given smart contract execution.
- As such, the halting problem is "solved" (all program executions will halt) and the situation where execution might (accidentally or maliciously) run forever, thus bringing the Ethereum platform to a halt in its entirety, is avoided.
- The EVM has a stack-based architecture, storing all in-memory values on a stack.
- It has several addressable data components:
    - An immutable program code ROM, loaded with the bytecode of the smart contract to be executed
    - A volatile memory, with every location explicitly initialized to zero
    - A permanent storage that is part of the Ethereum state, also zero-initialized
- Comparison with existing technology
    - The term `Virtual Machine` usually refers to the virtualization of a real computer or of an entire OS; these provide a software abstraction, respectively, of actual hardware, and of system calls and other kernel functionality.
    - The EVM operates in a much more limited domain: it is just a computation engine, and as such provides an abstraction of just computation and storage, similar to the Java Virtual Machine (JVM) specification.
    - It has a runtime environment that is agnostic of the underlying host OS or hardware (enabling compatibility across a wide variety of systems).
    - The EVM has no scheduling capability, because execution ordering is organized externally to it.
    - Ethereum clients run through verified block transactions to determine which smart contracts need executing and in which order. In this sense, the Ethereum world computer is single-threaded, like JavaScript. Neither does the EVM have any "system interface" handling or "hardware support"; there is no physical machine to interface with. The Ethereum world computer is completely virtual.
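The stack-based, gas-limited execution described above can be illustrated with a toy interpreter. This is only a sketch: the opcode names, their gas costs, and the `run` function are hypothetical and far simpler than the real EVM instruction set. It shows how charging gas for every step guarantees that even an endless loop halts.

```python
class OutOfGas(Exception):
    """Raised when the next operation costs more gas than remains."""


# Hypothetical opcodes and gas costs, for illustration only (not real EVM values).
GAS_COST = {"PUSH": 3, "ADD": 3, "MUL": 5, "JUMP": 8, "STOP": 0}


def run(program, gas_limit):
    """Execute a list of (opcode, argument) pairs; return (stack, gas_left)."""
    stack, pc, gas = [], 0, gas_limit
    while pc < len(program):
        op, arg = program[pc]
        cost = GAS_COST[op]
        if cost > gas:
            raise OutOfGas(f"needed {cost} gas at pc={pc}, only {gas} left")
        gas -= cost  # gas is charged before the operation takes effect
        if op == "PUSH":
            stack.append(arg)
        elif op == "ADD":
            stack.append(stack.pop() + stack.pop())
        elif op == "MUL":
            stack.append(stack.pop() * stack.pop())
        elif op == "JUMP":
            pc = arg  # unconditional jump to an instruction index
            continue
        elif op == "STOP":
            break
        pc += 1
    return stack, gas


# (2 + 3) * 4, evaluated entirely on the stack.
arithmetic = [("PUSH", 2), ("PUSH", 3), ("ADD", None),
              ("PUSH", 4), ("MUL", None), ("STOP", None)]
print(run(arithmetic, gas_limit=100))

# A program that jumps to itself forever still halts once the gas is used up.
endless_loop = [("JUMP", 0)]
try:
    run(endless_loop, gas_limit=100)
except OutOfGas as error:
    print("halted:", error)
```

Charging before executing mirrors the idea that an operation only runs if it has been paid for; the caller decides how much work is allowed by choosing `gas_limit`.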
- The EVM Instruction Set (Bytecode Operations)
    - Arithmetic and bitwise logic operations
    - Execution context inquiries
    - Stack, memory, and storage access
    - Control flow operations
    - Logging, calling, and other operators
    - Apart from this, the EVM also has access to account information (address, balance) and block information (block number, current gas price).
    - The opcodes and their specifications are listed [here](https://github.com/ethereumbook/ethereumbook/blob/develop/13evm.asciidoc#the-evm-instruction-set-bytecode-operations).
- The Ethereum State
    - At a higher level, Ethereum has a `world state`, which is a mapping of Ethereum addresses (160-bit / 20-byte values) to accounts.
    - At a lower level, each Ethereum address maps to an account having a `balance`, a `nonce` (with a different meaning for each type of account), `account storage`, and `program code`. The last 2 fields are used only by contract accounts.
- Execution
    - When a tx results in contract execution, an EVM is instantiated with all input params (related to the block and the tx). The ROM is loaded with the contract code, storage is loaded from the account storage, the memory is set to all zeros, and the other environment variables are set.
    - The gas supplied is tracked after each operation. If at any point execution runs out of gas, it is halted (with an Out of Gas exception). No changes are made to the world state, except that the sender's nonce is incremented and the gas fee is still paid.
    - If execution completes, state changes (like balance updates, account storage, contract creation, etc.) are made to the world state.
    - The VM instance for a particular transaction can be thought of as working on a sandboxed copy of the world state. Changes are made in that copy and are then applied to the real world state.
    - Because a smart contract can itself effectively initiate transactions, code execution is a recursive process. A contract can call other contracts, with each call resulting in another EVM being instantiated around the new target of the call.
Each instantiation has its sandbox world state initialized from the sandbox of the EVM at the level above.
    - Each instantiation is also given a specified amount of gas for its gas supply (not exceeding the amount of gas remaining in the level above, of course), and so may itself halt with an exception due to being given too little gas to complete its execution. Again, in such cases, the sandbox state is discarded, and execution returns to the EVM at the level above.

### Chapter 14: Consensus

- Consensus simply refers to synchronizing state in a distributed system such that all participants in the system agree on it.
- Consensus algorithms are the mechanism used to reconcile security and decentralization in such systems.
- It means producing a system of strict rules without rulers, so that no one is in charge of the system. Instead, power and responsibility are distributed among all participants.
- 2 major types of consensus.
    - Proof of Work (PoW)
        - Introduced as a blockchain consensus mechanism by the creator of Bitcoin.
        - Mining, which is often described as a way to earn or create new currency, is at its core a consensus model, as it is used to operate and secure the blockchain.
        - The reward is the incentive for keeping the system secure.
        - The mechanism (specific to Bitcoin)
            - All mining nodes try to solve a problem and have to pay the cost of the energy required to participate. If they do not follow the rules and earn the reward, they lose the funds they have already spent on electricity to mine; that loss is the punishment. This careful balance of risk and reward drives participants to stay honest out of self-interest.
            - The problem they try to solve: once a block is mined, miners pick a new set of transactions from the mempool, bundle them together into a candidate block, and try to compute its hash.
Some inputs are fixed, such as the transactions (not the transactions themselves, just their hash), while others vary, such as the timestamp (Unix) and the nonce (a counter), and changing these produces a new hash. The network sets a threshold, and the hash of the new block must be less than that value. Hence, miners race to calculate such a hash, and the moment one of them finds it, that miner appends the block and sends it across the network. The others verify the block (by checking multiple things such as its validity, the hash, etc.) and agree on it, which finally leads to consensus among all the nodes.
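The hash race described above can be sketched as a toy miner. This is a simplified illustration rather than Bitcoin's actual header format: the `block_hash`, `mine`, and `verify` helpers, the text-based header layout, the single SHA-256 pass, and the easy `TARGET` value are all assumptions chosen so that the example runs quickly.

```python
import hashlib


def block_hash(prev_hash: str, tx_root: str, timestamp: int, nonce: int) -> int:
    """Hash a toy block header and return the digest as an integer."""
    header = f"{prev_hash}|{tx_root}|{timestamp}|{nonce}".encode()
    return int.from_bytes(hashlib.sha256(header).digest(), "big")


def mine(prev_hash: str, tx_root: str, timestamp: int, target: int) -> int:
    """Increment the nonce until the header hash falls below the target."""
    nonce = 0
    while block_hash(prev_hash, tx_root, timestamp, nonce) >= target:
        nonce += 1
    return nonce


def verify(prev_hash: str, tx_root: str, timestamp: int, nonce: int, target: int) -> bool:
    """Checking a block takes one hash, however long the search took."""
    return block_hash(prev_hash, tx_root, timestamp, nonce) < target


# Assumed toy parameters: roughly 1 in 4096 hashes falls below this target.
TARGET = 1 << 244
PREV_HASH = "00" * 32
TIMESTAMP = 1700000000
# Only a hash of the transactions goes into the header, not the transactions themselves.
TX_ROOT = hashlib.sha256(b"alice->bob:1;bob->carol:2").hexdigest()

nonce = mine(PREV_HASH, TX_ROOT, TIMESTAMP, TARGET)
print("nonce:", nonce, "valid:", verify(PREV_HASH, TX_ROOT, TIMESTAMP, nonce, TARGET))
```

Lowering `TARGET` makes a valid nonce rarer and mining proportionally more expensive, while verification stays a single hash; that asymmetry is what lets every other node cheaply check the winner's work.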