# Contract Costs & Why Our Design Needs Work Full-Stack StarkNet: * **[Part 1]** 🚧 [Getting Started in Cairo & Deploying with Nile](https://hackmd.io/@sambarnes/BJvGs0JpK) * **[Part 2]** 🐍 [Contract Interaction with starknet.py](https://hackmd.io/@sambarnes/H1Fx7OMaF) * **[Part 3]** 👥 [StarkNet Account Abstraction & Using Standard Contracts](https://hackmd.io/@sambarnes/rkGekNvAY) * **[Part 4]** 💽 [Local Devnet & Starknet.py's Account Capabilities](https://hackmd.io/@sambarnes/By7kitOCt) * **[Part 5]** 🎨 [StarkNet Frontends w/ Cairopal & Argent X](https://hackmd.io/@sambarnes/HydPlH9CY) * **[Notes]** 💰 [Contract Costs & Why Our Design Needs Work](https://hackmd.io/@sambarnes/SkxMZHhRK) (***you are here***) *Completed code on GitHub [here](https://github.com/sambarnes/fullstack-starknet)*. > **UPDATE**: Since the time of writing, there has been official guidance and structure developed for [fees on StarkNet](https://docs.starknet.io/docs/Fees/fee-mechanism/). Defer to that page for any inconsistencies found here. We'll keep this one brief. Honestly, before even continuing here, just go read [RoboTeddy's post](https://hackmd.io/@RoboTeddy/BJZFu56wF#Maxim-Computation-is-cheap-Writes-are-expensive) on the topic. No point in regurgitating something already informative yet concise. Some of it may end up outdated as the [fee discussion](https://community.starknet.io/t/fees-in-starknet-alpha/286/29) develops furthers, though it's definitely a great set of guidelines to shoot for regardless. We'll know more closer to the v0.8.0 release (expected in March). Since you're pretty early to the ecosytem, it's likely a good idea to read some of the fee discussion linked above too. Just to get a sense of the current stakeholders and where their heads are at with this pretty impactful topic. In this post, we'll quickly identify the most expensive part of our application, and try to think of how we might be able to mitigate that. ## Computation is Cheap, Writes are Expensive After going through the post and understanding -- generally -- where our expected costs are going to be, you may be thinking of something in our project that now feels a little suspect. The maxim `Computation is Cheap, Writes are Expensive` raises some red flags. The primary use case of the application, committing a hash to be stored on chain for later auditing, is likely going to be mad expensive. Writing to a new storage slot every time that `attest_state()` is called? 🤦 Instead, maybe we should think about challenging the fundamental purpose of storing the hash itself on chain... > **T1:** Car takes hash of diagnostics (H1), just after stopping at a stop sign > ... > **T5:** Car gets pulled over for "running a stop sign", then wants to prove integrity of locally saved diagnostics that dispute the ticket So we store `(Vehicle ID, T1) = H1` in the smart contract, and later read it back to prove: * at time `T1`, the vehicle attested to state `H1` * our local diagnostics data for that window hashes to `H1` * thus, our diagnotics data has not been modified after the fact The primary thing we're worried about is ***the fact that*** at time `T1`, the car committed some state `H1`. We don't necessarily care to store that hash on chain. Instead, we only care that if we present some untrusted hash `H1_proposed` to the contract, it can return true/false depending on if that hash was committed at `T1`. How would we store enough information to indicate that one fact, without needing to store so much as all the actual time & hash pairs? Perhaps a bloom filter, [as suggested in the previously linked post](https://hackmd.io/@RoboTeddy/BJZFu56wF#Use-extra-computation-to-avoid-writes). ![An example bloom filter](https://i.imgur.com/Xvlm7DH.png) > An example of a [Bloom filter](), representing the set {x, y, z} . The colored arrows show the positions in the bit array that each set element is mapped to. The element w is not in the set {x, y, z} , because it hashes to one bit-array position containing 0. For this figure, m = 18 and k = 3. So even though many writes still take place, there should be a non-negligible overlap in storage slots being updated. If multiple bits in the same felt of storage get updated, within the same batch rolled up to L1, there will be a "Batching Rebate" to discount those writes proportionally. An excerpt from the article we read at the beginning explains it futher: > If a particular storage slot is written to n times within a single StarkNet batch (~1-4 hours), each write costs only 1/nth as much. This is true even if the writes to the storage slot were caused by different tx, calls from other contracts, etc. ## Implementing a bloom filter As a cairo dev, you'll often find that a lot of the tools you try to reach for... just don't exist yet. I came across RoboTeddy's suggestion for a bloom filter, but couldn't find an implementation online when looking to refactor our contract. So I did my best to hack one together 😅 First lets jog our memory with a [python reference implementation](https://www.geeksforgeeks.org/bloom-filters-introduction-and-python-implementation/) on geeksforgeeks. After playing with that in python for a minute, the general approach should feel intuitive. Just deterministically make K hashes of the item, and flip those bits in an array. > 💡 Tip: You'll likely start to run into some of the warts/complexities of cairo development in this one. Among others, [revoked references](https://www.cairo-lang.org/docs/how_cairo_works/consts.html#revoked-references) come to mind. Specifically, they come up when there is branching logic. These can typically be handled by either reorganizing/duplicating some branches, or by making more use of local variables. To start, we'll import some functions that *might* be helpful in our implementation. The `TRUE` and `FALSE` constants are just aliases for felts `1` and `0`, respectively. ```python= %lang starknet from starkware.cairo.common.cairo_builtins import HashBuiltin from starkware.cairo.common.hash import hash2 from starkware.cairo.common.math import assert_le, split_felt, unsigned_div_rem from starkware.cairo.common.uint256 import Uint256, uint256_unsigned_div_rem from contracts.utils.constants import FALSE, TRUE ``` Then, we'll stub out the parameters for the data structure: ```python # TODO: make these parameters configurable later # Probability of false positives # p = 0.0001 # Number of items in the filter const N = 5000 # Number of bits in the filter const SIZE = 95841 # Number of hash functions to perform const K = 13 ``` A user should be able to call `bloom_add(42424242)` to add an element to the set. Similarly, they should able to call `bloom_check(42424242)` to be check whether we've **probably** seen the item (`res=1`) or **definitely** have not seen it (`res=0`). A testcase demonstrating the interface: ```python= import os from random import shuffle import pytest from starkware.starknet.testing.starknet import Starknet CONTRACT_FILE = os.path.join("contracts", "bloom.cairo") @pytest.mark.asyncio async def test_bloom_filter(): starknet = await Starknet.empty() contract = await starknet.deploy(source=CONTRACT_FILE) # Add 0-24 to the filter items_present, items_absent = list(range(25)), list(range(25, 35)) for item in items_present: await contract.bloom_add(item=item).invoke() # Assert anything 0-24 exists and 25-34 does not exist test_items = items_present + items_absent shuffle(test_items) for item in items_absent: execution_info = await contract.bloom_check(item=item).call() (exists,) = execution_info.result if exists == 1: # We've probably seen the item assert item in items_present, "False positive" else: # We've definitely not seen the item assert item in items_absent, "False negative" ``` Let's start by focussing just on the `bloom_add` function. To start, we'll keep track of the current total in the filter, and make sure it stays below the rated capacity. ```python # The actual filter. # HACK: One felt = one bit. Optimize later. @storage_var func bit_array(index : Uint256) -> (res : felt): end # Total items added to the filter @storage_var func total_items() -> (res : felt): end # Recursively flip bits for all indices [H(item, i) for i in range(K)] func _add{syscall_ptr : felt*, pedersen_ptr : HashBuiltin*, range_check_ptr}( item : felt, hash_count : felt): # TODO return () end # Add an item to the bloom filter, reverting if no remaining space @external func bloom_add{syscall_ptr : felt*, pedersen_ptr : HashBuiltin*, range_check_ptr}(item : felt): alloc_locals let (local current_total) = total_items.read() assert_le(current_total + 1, N) total_items.write(current_total + 1) _add(item, K) return () end ``` Diving deeper into the primary recursive loop, we're essentially going to start with `_add(item, K)` and keep calling `_add(item, K-1)` until we get to `K=0`. At each step, we'll hash the `item` and set the `bit_array[index=hash]` to `TRUE`. ```python # Recursively flip bits for all indices [H(item, i) for i in range(K)] func _add{syscall_ptr : felt*, pedersen_ptr : HashBuiltin*, range_check_ptr}( item : felt, hash_count : felt): alloc_locals if hash_count == 0: return () end # Deterministically seed the hash with our recursive counter hash_count # TODO: calculate digest = H(item, hash_count) bit_array.write(index=digest, value=TRUE) _add(item, hash_count - 1) return () end ``` > ✨ Exercise: Implement the TODO above using some of the functions we imported at the beginning. You can try to accomplish it with only felts. However, after hitting a few restrictions, you may quickly find that some operations will need Uint256. *(answer hidden below)* :::spoiler ```python= # Actually hash item & the seed let (h1) = hash2{hash_ptr=pedersen_ptr}(item, hash_count) # Convert our inputs into Uint256, to avoid felt math limitations let (h1_high, h1_low) = split_felt(h1) let h1_uint256 = Uint256(low=h1_low, high=h1_high) let (size_high, size_low) = split_felt(SIZE) let size_uint256 = Uint256(low=size_low, high=size_high) # digest = H( item , seed ) % SIZE let (_, digest) = uint256_unsigned_div_rem(h1_uint256, size_uint256) ``` ::: --- Running the test with our `bloom_check` section commented out, we should be able to successfully add items. > ✨ Exercise: Re-using a lot of the same logic, try implementing `bloom_check()` to get our tests to pass *(answer hidden below)* :::spoiler ```python= # Recursively check that bits are flipped at all indices [H(item, i) for i in range(K)] func _check{syscall_ptr : felt*, pedersen_ptr : HashBuiltin*, range_check_ptr}( item : felt, hash_count : felt) -> (res : felt): alloc_locals if hash_count == 0: return (res=TRUE) # Item probably exists end # Seed with the unique hash iteration being performed let (h1) = hash2{hash_ptr=pedersen_ptr}(item, hash_count) let (h1_high, h1_low) = split_felt(h1) let h1_uint256 = Uint256(low=h1_low, high=h1_high) let (size_high, size_low) = split_felt(SIZE) let size_uint256 = Uint256(low=size_low, high=size_high) # digest = H( item , seed ) % SIZE let (_, digest) = uint256_unsigned_div_rem(h1_uint256, size_uint256) let (local is_flipped) = bit_array.read(index=digest) if is_flipped == FALSE: return (FALSE) # Item definitely does not exist end let (res) = _check(item, hash_count - 1) return (res) end # Check for the existence of an item in the bloom filter @view func bloom_check{syscall_ptr : felt*, pedersen_ptr : HashBuiltin*, range_check_ptr}( item : felt) -> (exists : felt): let (exists) = _check(item, K) return (exists) end ``` ::: --- As you probably figured out, this really isn't that efficient. We're using a *full felt* of storage for each bit in the bit array. Can you think of a way to condense this? Somehow squeezing multiple bits into a single storage slot? I gave it a shot here: https://github.com/sambarnes/cairo-bloom. *Definitely* still needs a ton of work to be production ready. There are likely plenty more clever tricks that could be applied. If you manage to pack even more bits into a single felt or make improvements in some other way, feel free to make a PR! ## Measuring Contract Calls One last thing. In order to get a sense of your contract cost, it helps to put some numbers to it. Luckily it's been right in front of us this whole time: ```python= @pytest.mark.asyncio async def test_bloom_filter(): starknet = await Starknet.empty() contract = await starknet.deploy(source=CONTRACT_FILE) # Add 0-24 to the filter items_present, items_absent = list(range(25)), list(range(25, 35)) for item in items_present: # # Add a breakpoint so we can inspect our contract call # execution_info = await contract.bloom_add(item=item).invoke() breakpoint() ``` When you run the tests, we should drop into a REPL: ``` (venv) sam@sam:~/dev/eth/starknet/cairo-bloom$ make test pytest tests/ ==================== test session starts ===================== platform linux -- Python 3.7.12, pytest-7.0.0, pluggy-1.0.0 rootdir: /home/sam/dev/eth/starknet/cairo-bloom plugins: typeguard-2.13.3, asyncio-0.17.2, web3-5.27.0 asyncio: mode=legacy collected 1 item tests/test_bloom.py >>>>>>>>>> PDB set_trace (IO-capturing turned off) >>>>>>>>>>> > /home/sam/dev/eth/starknet/cairo-bloom/tests/test_bloom.py(18)test_bloom_filter() (Pdb) print(a.call_info.cairo_usage) ExecutionResources(n_steps=11978, builtin_instance_counter={'pedersen_builtin': 65, 'range_check_builtin': 1210, 'bitwise_builtin': 13, 'output_builtin': 0, 'ecdsa_builtin': 0, 'ec_op_builtin': 0}, n_memory_holes=305) ``` Specifically, we can see the number of steps -- `11978`. Comparatively, when I switch to the our original implementation from this post: `5474`. Around half the length. As expected, we added cheap computation in order to reduce expensive writes. ## Conclusion We're not going to go about re-implementing everything with a bloom filter now. Even still, try to think about how our design might have to change: * Can we reuse this filter for multiple cars? * Perhaps a new contract must be deployed for each registered car, including a new empty bloom filter? * Registration "renewal" required with a new contract? * What tradeoffs can be made if you [adjust other parameters](https://hur.st/bloomfilter/) of the filter? * Would a recursive rollup (i.e. StarkNet on StarkNet) make more sense for this application? --- That wraps up the series for now! We've introduced most major parts of a typical StarkNet Dapp -- accounts, cairo development, testnet deployment, working with a devnet, and interacting from a frontend with Argent X. Hopefully you've found this somewhat helpful in getting started :) Feel free to make comments/PRs if you hit roadblocks or think things could be presented in a better way ✌