--- tags: Projects --- # Chain :::info **Released:** February 6th, 11:59pm EST **Due:** February 26th, 11:59pm EST ::: ## Introduction Chain is an example storage system (a blockchain) for a cryptocurrency. Chain is modeled after [Bitcoin's storage system](https://en.bitcoin.it/wiki/Bitcoin_Core_0.11_(ch_2):_Data_Storage), though heavily simplified. The goal of this project is to give you a taste of the many working components of a blockchain. <!-- :::warning **You must fill out this [partner form](https://forms.gle/qhVUZNuCYvsQcXhX6) ASAP**. Reminder: you can work with whoever you want on this project, so your actual partner isn't all that important. ::: --> ### Components Chain consists of several main components. Here's a quick overview: 1. **BlockChain** - The main type of this project, **BlockChain** is a blockchain that stores and validates **Blocks**. It manages and updates its main chain based on the Blocks it receives. Watch out for forks! 2. **BlockInfoDatabase** - The **BlockInfoDatabase** is a wrapper for a [LevelDB](https://en.wikipedia.org/wiki/LevelDB), storing information about each Block it receives in the form of a **BlockRecord**. In addition, each BlockRecord contains storage information for an **UndoBlock**, which provides additional information to revert a Block, should a fork occur. 3. **ChainWriter** - The **ChainWriter** takes care of all I/O for the BlockChain. It writes and reads Blocks and UndoBlocks to Disk. 4. **CoinDatabase** - **CoinDatabase** stores all UTXO information. It contains a cache of **Coins** and a LevelDB of **CoinRecords**. A Coin is a wrapper for a UTXO, and a CoinRecord is a record of all the UTXO created by a Transaction. Validation needs to be as quick as possible, which is why the CoinDatabase contains an in-memory cache in addition to a persistent database. Eventually the cache becomes too large, at which point the CoinDatabase must flush its cache to its database. ### Gear-Up! :::success Here's a **[gear-up](https://youtu.be/-Ae34Qkti30)** video to get you started! ::: ## Assignment For this assignment, you will implement the functions listed below. We recommend that you tackle them in the following order: 1. `StoreBlockRecord` 2. `GetBlockRecord` 3. `WriteBlock` 4. `WriteUndoBlock` 5. `StoreBlock` 6. `UndoCoins` 7. `HandeBlock` To see how difficult we think each function is, check out our [grading breakdown](##Grading). ### (1) blockinfodatabase.go #### Protocol Buffers: * Protocol buffers (protobufs) are Google's "language-neutral, platform-neutral, extensible mechanism for serializing structured data" ([docs](https://developers.google.com/protocol-buffers)). In other words, protobufs are a friendly, efficient way to exchange data. * Protobufs are similar to how json is used for transmitting data across web applications. Read a comparison between the two [here](https://stackoverflow.com/questions/52409579/protocol-buffer-vs-json-when-to-choose-one-over-another). [^1]: [Protocol Buffer Docs](https://developers.google.com/protocol-buffers) #### Why do we have local structs if we're using protobufs? Good question. Normally, when using protobufs, you would probably NOT create redundant local structs. After all, if you have a well-defined message format, why would you rewrite the same format? The answers is that these redundant local structs are easier to work with. That means less headaches for you. #### BlockRecords: * BlockRecords contain information on where a Block and its UndoBlock are stored on disk. Its structure is as follows: * Header - the block's header * Height - the block's height * NumberOfTransactions is the number of Transactions in the Block. * BlockFile is the name of the file where the Block is stored. * BlockStartOffset is the starting offset of the Block within the BlockFile. * BlockEndOffset is the ending offset of the Block within the BlockFile. * UndoFile is the name of the file where the UndoBlock is stored. * UndoStartOffset is the starting offset of the UndoBlock within the UndoFile. * UndoEndOffset is the ending offset of the UndoBlock within the UndoFile. #### Implement: ```go! // StoreBlockRecord stores a block record in the block info database. func (blockInfoDB *BlockInfoDatabase) StoreBlockRecord(hash string, blockRecord *BlockRecord) ``` As suggested by this function's name, the purpose of `StoreBlockRecord` is to store block records in the BlockInfoDatabase. In order to do this, we'll need to 1) encode the `BlockRecord` as a protobuf, 2) convert the protobuf to the correct format and type (`byte[]`) so that it can be inserted into the database, and 3), put the block record into the database. ##### Helpful functions: * `func EncodeBlockRecord(br *BlockRecord) *pro.BlockRecord` * proto's `func Marshal(m Message) ([]byte, error)`. See documentation [here](https://pkg.go.dev/github.com/golang/protobuf/proto#Marshal). * leveldb's `func (db *DB) Put(key, value []byte, wo *opt.WriteOptions) error`. See documentation [here](https://pkg.go.dev/github.com/syndtr/goleveldb/leveldb#pkg-functions/). #### Implement: ```go! // GetBlockRecord returns a BlockRecord from the BlockInfoDatabase given // the relevant block's hash. func (blockInfoDB *BlockInfoDatabase) GetBlockRecord(hash string) *BlockRecord ``` This function retrieves block records from the database, and returns the block record. In order to do this, we'll basically do the reverse of what we did in `StoreBlockRecord`. Specifically, we'll 1) retrieve the block record from the database, 2) Convert the `byte[]` returned by the database to a protobuf, and 3) convert the protobuf back into a `BlockRecord`. ##### Helpful functions: * `func DecodeBlockRecord(pbr *pro.BlockRecord) *BlockRecord` * proto's `func Unmarshal(b []byte, m Message) error`. See documentation [here](https://pkg.go.dev/github.com/golang/protobuf/proto#Unmarshal). * leveldb's `func (db *DB) Get(key []byte, ro *opt.ReadOptions) (value []byte, err error)`. See documentation [here](https://pkg.go.dev/github.com/syndtr/goleveldb/leveldb#pkg-functions/) and be sure to account for the corresponding I/O errors. --- ### (2) chainwriter.go #### So what's an UndoBlock? - An UndoBlock is an object that contains information necessary for reverting a specific block. Thus, each Block has an accompanying UndoBlock. - When handling Blocks, forks will occasionally supplant the current, main chain. When this happens, our CoinDatabase (akin to Bitcoin's UTXO Set) is out of date, and we need to update it to reflect the fork. - You may be asking yourself: **why do I need to update the CoinDatabase?** When a fork occurs, the forked blocks (almost always) contain a different set of transactions than the current blocks. This means that both the inputs used and the outputs created by the forked blocks are different than what our CoinDatabase currently reflects. - UndoBlocks help us revert spent Coins, (re)making them valid inputs for future transactions. #### Implement: ```go! // WriteBlock writes a serialized Block to Disk and returns // a FileInfo for storage information. func (cw *ChainWriter) WriteBlock(serializedBlock []byte) *FileInfo ``` ```go! // WriteUndoBlock writes a serialized UndoBlock to Disk and returns // a FileInfo for storage information. func (cw *ChainWriter) WriteUndoBlock(serializedUndoBlock []byte) *FileInfo ``` **Note:** These two functions are nearly identical. Think about which ChainWriter fields you're modifying. #### Hints: - Before writing a block to a file, make sure to check that doing so will not cause the file to be larger than the maximum allowable file size. If your Block/UndoBlock is too large to store in the current file, you'll have to update where you're writing to! - In order to update the file you're writing to, you should increment the `cw.CurrentBlockFileNumber` by one. - You should use the following naming convention for your files: ```dataDirectory/fileName_fileNumber.<file extension>``` - Make sure that you're updating relevant `ChainWriter` fields as appropriate ##### Helpful functions: * `writeToDisk(fileName, serializedBlock)` --- ### (3) coindatabase.go: #### Cache vs Database * To improve efficency, a cache (stored in RAM) is used for storing and reading newer coins before putting them in the database since reading from the database (stored on disk) is [slow in comparison (milli vs nano)](https://stackoverflow.com/questions/1371400/how-much-faster-is-the-memory-usually-than-the-disk#:~:text=Random%20Access%20Memory%20(RAM)%20takes,speed%20is%20measured%20in%20milliseconds.). * However, a cache cannot hold as much data as the database. So once our cache is full, we empty the data from the cache into the db (called **flushing**). With an empty cache, our Cache can now accept new Coins. * Our flushing mechanism is NOT efficient. You can read more about cache replacement policies [here](https://en.wikipedia.org/wiki/Cache_replacement_policies). * **Note:** when you store a block, you should store its UTXOs in **both** the DB and the Cache. The main reason for the cache is for quick validation and UTXO updates. #### Understanding Reverting a Block * Brief visual of reverting a block [here](https://youtu.be/mQEGhmI5epE) :) #### Implement: ```go! // StoreBlock handles storing a newly minted Block. It: // (1) removes spent TransactionOutputs // (2) stores new TransactionOutputs as Coins in the mainCache // (3) stores CoinRecords for the Transactions in the db. // We recommend you write a helper function for each subtask. func (coinDB *CoinDatabase) StoreBlock(transactions []*block.Transaction) ``` #### Hints - When removing spent TransactionOutputs, if a Coin is in the mainCache, we should simply mark the Coin as spent. This will cause the Coin to be removed from its CoinRecord in the database next time the mainCache is flushed. If a coin is not in the cache, we should directly remove the Coin from its CoinRecord in the database. ##### Helpful functions: * `makeCoinLocator(txi *block.TransactionInput)` * `coinDB.removeCoinFromDB(txHash string, cl CoinLocator)` * `coinDB.FlushMainCache()` * `coinDB.createCoinRecord(tx *block.Transaction, active bool)` * `coinDB.putRecordInDB(txHash string, cr *CoinRecord)` ```go! // UndoCoins handles reverting a Block. It: // (1) erases the Coins created by a Block and // (2) marks the Coins used to create those Transactions as unspent. func (coinDB *CoinDatabase) UndoCoins(blocks []*block.Block, undoBlocks []*chainwriter.UndoBlock) ``` ##### Helpful functions: * `coinDB.removeCoinFromRecord(cr *CoinRecord, outputIndex uint32)` * levelDB's `Delete(key []byte, wo *opt.WriteOptions)` * `coinDB.getCoinRecordFromDB(txHash string)` * `coinDB.addCoinsToRecord(cr *CoinRecord, ub chainwriter.UndoBlock)` * `coinDB.putRecordInDB(txHash string, cr *CoinRecord)` --- ### (4) blockchain.go #### Implement ```go! // HandleBlock handles a new Block. At a high level, it: // (1) If valid, appends the Block onto the active chain. // (2) Stores the Block and resulting Undoblock to Disk. // (3) Stores the BlockRecord in the BlockInfoDatabase. // (4) Handles a fork, if necessary. // (5) Updates the BlockChain's fields. func (bc *BlockChain) HandleBlock(block *block.Block) ``` ##### Helpful functions: * There are several helpful functions in `blockchain.go` that we've crafted to make your life easier. Make sure you understand what each one is doing! * Remember, you will be using functions declared in other Chain packages. Make sure to **navigate the code base** to find out what might be useful for you. #### Additional notes: * In our implementation, we store the unsafe hashes in a queue, where `unsafeHashes[0]` is the least recent hash. This means that when updating `unsafeHashes`, we pop from the front (but should we always pop?) and push to the back of the slice. The helper function `getForkedBlocks` handles `unsafeHashes` as such. If you decide to keep track of unsafe hashes differently, `getForkedBlocks` will not behave as expected. In that case, feel free to update `getForkedBlocks`! * We use `unsafeHashes` and `maxHashes = 6` as a heuristic for how many blocks back we need to check to find the origin of a fork. While in real life there is an infinitesimal chance that a Bitcoin fork requires a node to search back beyond 6 blocks, it's still possible! Such an event would (unfortunately) break our implementation. * This function (along with `UndoCoins()`) is somewhat complicated and long. If you're stuck, collaborate with your peers or come to TA hours! --- ## Testing This assignment is autograded, and you are able to run our test suite as many times as you like on Gradesceope. In addition, you should write your own tests when things aren't working! We have provided several helper functions in `test/utils.go`. To test your project with the tests you've written, `cd` into `test` and run `go test`. This will let you know which tests are failing, and why. It will likely be more convenient to **run individual tests**, which you can do using the GoLand UI. :::warning **Note:** Running the autograder directly on the stencil will cause a system panic, since the application relies on the dereferencing of things like block records (which will be null if you have yet to implement anything). We have configured it such that you will still see the error message in cases like this, but it may not be as useful as those that were written by the staff. ::: ## Install 1. Clone the [stencil repo](https://classroom.github.com/a/J71MnJVR). 2. In your local Chain directory, run `go get ./...` to install the project's dependencies. 3. Get after it! Good luck :smiley: ## HandIn 1. Log into [Gradescope](https://www.gradescope.com/). 2. Submit your code using **GitHub**. ## Grading This assignment is out of **140 points**. Your grade will be determined by passing the following test functions: ### BlockChain | Test Function | Points | |:-------------------------- |:------:| | `TestHandleAppendingBlock` | 5 | | `TestHandleInvalidBlock` | 5 | | `TestHandle50Blocks` | 5 | | `TestHandleForkingBlock` | 10 | | `TestHandle2Forks` | 10 | | **Total** | **35** | ### BlockInfoDatabase | Test Function | Points | |:------------------------- |:------:| | `TestGetSameRecord` | 5 | | `TestGetDifferentRecords` | 5 | | **Total** | **10** | ### ChainWriter | Test Function | Points | |:------------------- |:------:| | `TestReadBlock` | 5 | | `TestReadUndoBlock` | 5 | | `TestRead100Blocks` | 5 | |`TestStoreOrphanBlock`|5| |`TestStoreBlock`| 5| |`TestWriteBlock`|5| |`TestWriteBlockSwitchesFiles`|5| |`TestWriteUndoBlock`|5| |`TestWriteUndoBlockSwitchesFiles`|5| | **Total** | **45** | ### CoinDatabase | Test Function | Points | |:-------------------------- |:------:| |`TestGetCoin`|5| |`TestUndoCoins`|5| |`TestUpdateSpentCoinsInCache`|5| |`TestUpdateSpentCoinsNotInCache`|5| |`TestValidateValidBlock`|5| |`TestValidateInvalidBlock`|5| |`TestCacheFlushedWhenFull`|10| |`TestCoinsStoredInCacheAndDB`|10| | **Total** | **50** |