# EVM to Solana Developers Guide

This guide is aimed at Solidity/EVM devs who might want to explore Solana. It covers the major architectural differences and common coding patterns you might find helpful. It abstracts over lower-level implementation details and nuances and focuses on practical developer mental models.

If this guide helps you or you think it's valuable, consider buying me a Red Bull: **6W4XEiEr5LGnvCEftTHAzeQnw3QPtZzd7AftFPVWkVpD**

If you have questions about this, you can join the conversation on the Twitter post about it [here](https://x.com/spacemandev/status/1880197535673000197).

Or come hang out in the 76Devs Solana developers Discord and harass me in voice chat: https://discord.gg/76devs

## Smart Contracts vs Programs

EVM calls executable code "Smart Contracts" and Solana calls executable code "Programs".

<div style="display:flex; flex-direction: row; width: 50%">
<img src="https://hackmd.io/_uploads/Byqkz5vPyg.png">
<img src="https://hackmd.io/_uploads/HJSeM5PDJe.png">
</div>

EVM's approach is to bundle a smart contract's executable code and its storage together under one address. Solana stores these separately in two different buckets: one holds the executable code (a "program"), and the others hold the data blobs for that program (confusingly, both kinds are called "accounts" -- see below).

## EVM Accounts vs Solana Accounts

On EVM you have

- Externally Owned Accounts (EOA) - which are controlled by anyone with the private key to the account
- Contract Accounts - a smart contract deployed to the network (and its associated storage)
    - This type of account can grow to be up to 2^256 - 1 storage slots big (32 bytes per storage slot)

On Solana, you have

- Accounts -- which are just files that store a SOL balance + some data. This data can be executable code or just a storage blob of _non-executable_ data. These accounts can be up to 10 MB in size.
  A key thing to differentiate here is that an account is either marked "executable" or it isn't: it CANNOT hold both executable code and program storage (unlike an EVM contract account).
- Another key thing to note is that _every_ account is owned by a program: either the System Program _or_ some other program. Edits can only be made to an account by the program that owns it. So when you transfer SOL, you call the System Program and request that it move SOL from your account to the recipient's account.

## Sequential vs Parallel Execution

This separation of executable code from the data blobs it accesses is one of the major reasons Solana can process transactions so much faster. Let's walk through an example with a set of token transfers.

Assume we have four people (Alice, Bob, Steve and Samantha). Alice wants to send Bob 20 $SHELLS and Steve wants to send Samantha 15 $SHELLS.

In EVM land, all of this happens in the same "account" (a contract + storage account). Alice and Steve would each call the transfer() function on the SHELLS contract. If both of those transfers come in at the same time, EVM nodes will order them so that they happen one after another. This is because they both call into the same "account" and as such could potentially modify the same storage. The validator does _not_ know ahead of time which storage belongs to Alice, Bob, Steve or Samantha. For all it knows, Alice and Steve might be spending from the same balance.

In Solana land, there are five _different_ accounts: the _executable_ program account, and balance accounts for Alice, Bob, Steve, and Samantha. When the two transactions come in, the program can be run in parallel, because each transaction marks the specific accounts it wants to modify as writable, letting the validator know that execution _cannot_ modify any data it didn't request as writable. The Alice -> Bob transaction requests Alice's and Bob's accounts as writable, and the Steve -> Samantha transaction requests Steve's and Samantha's accounts as writable.
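
To make the "data marking" idea concrete, here is a toy sketch of how declared writable accounts let a scheduler decide what can run side by side. The `Tx` class and `schedule` function are hypothetical names made up for this example, and this is not how the actual validator scheduler is implemented; it only groups transactions into batches where nobody writes an account that someone else in the same batch touches.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tx:
    """A transaction that declares its account access up front (toy model)."""
    name: str
    writable: frozenset
    readonly: frozenset = frozenset()


def conflicts(a: Tx, b: Tx) -> bool:
    """Two transactions conflict if either writes an account the other touches."""
    a_all = a.writable | a.readonly
    b_all = b.writable | b.readonly
    return bool(a.writable & b_all) or bool(b.writable & a_all)


def schedule(txs):
    """Group transactions into batches; each batch could run in parallel."""
    batches = []
    for tx in txs:
        # Place tx after the last batch that holds a conflicting transaction,
        # so conflicting transactions keep their arrival order.
        slot = 0
        for i, batch in enumerate(batches):
            if any(conflicts(tx, other) for other in batch):
                slot = i + 1
        if slot == len(batches):
            batches.append([])
        batches[slot].append(tx)
    return batches


program = frozenset({"shells_program"})  # read-only for every transfer
txs = [
    Tx("alice->bob", frozenset({"alice", "bob"}), program),
    Tx("steve->samantha", frozenset({"steve", "samantha"}), program),
    Tx("bob->steve", frozenset({"bob", "steve"}), program),
]
for number, batch in enumerate(schedule(txs)):
    print(number, [tx.name for tx in batch])
```

The first two transfers share only the read-only program account, so they land in the same batch. The third one writes Bob's and Steve's accounts, so it has to wait for both.
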

The validator runs the program in _parallel_, because there is no danger of transaction A modifying the same memory as transaction B.

By doing this kind of "data marking", Solana is able to safely execute hundreds of thousands of transactions in parallel without worrying about memory conflicts. Only transactions that modify the _same_ accounts have to be run sequentially.

## Solana PDAs

In EVM land, all the contract code + storage for an app is referenced by the same address, which is quite manageable.

In Solana land, a loader program owns your dapp's program account, and your program in turn owns all the various data accounts that it creates. If these data accounts could only be managed by private/public key pairs, then the program would need to track all the private keys on chain -- presenting a huge security problem, as that would allow anyone to make changes to any account.

This is where PDAs -- program derived addresses -- come in. Before we cover what those are, we need a quick primer on Ed25519, the curve which Solana uses for its keypairs. You can imagine a huge graph with (x,y) coordinates. An (x,y) coordinate that results in a point on the curve has a private key and a public key. Any other (x,y) coordinate that is _not_ on the curve results in just a public key with no associated private key.

![Ed25519](https://hackmd.io/_uploads/BkBzmqvPkl.png)
*Image from https://solana.com/docs/core/pda*

There's a lot more to this cryptographically, but basically, we can use some set of seeds + owning_program_id to come up with an (x,y) such that it is *not* on the curve. A public key that does *not* have a corresponding private key, and that was derived with its owning program id as one of the inputs, is called a "Program Derived Address".
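
Here is a rough sketch of that derivation in plain Python, following the algorithm described in the Solana docs: hash the seeds, a one-byte "bump", the program id and a fixed marker string, and keep lowering the bump until the hash is not a valid curve point. The program id and seeds below are made up for illustration, and edge cases may be handled differently by the real libraries, so treat this as a mental model rather than something to ship.

```python
import hashlib

P = 2**255 - 19                             # field prime used by Ed25519
D = (-121665 * pow(121666, P - 2, P)) % P   # Edwards curve constant d


def is_on_curve(candidate: bytes) -> bool:
    """Return True if 32 bytes decode to a point on the Ed25519 curve."""
    # The low 255 bits encode y; the top bit only selects the sign of x.
    y = (int.from_bytes(candidate, "little") & ((1 << 255) - 1)) % P
    # Curve equation -x^2 + y^2 = 1 + d*x^2*y^2, solved for x^2.
    u = (y * y - 1) % P
    v = (D * y * y + 1) % P
    x_squared = u * pow(v, P - 2, P) % P
    # Euler's criterion: x exists iff x^2 is zero or a quadratic residue mod P.
    return x_squared == 0 or pow(x_squared, (P - 1) // 2, P) == 1


def find_program_address(seeds, program_id: bytes):
    """Try bump seeds from 255 downward until the hash falls off the curve."""
    for bump in range(255, -1, -1):
        preimage = (
            b"".join(seeds) + bytes([bump]) + program_id + b"ProgramDerivedAddress"
        )
        address = hashlib.sha256(preimage).digest()
        if not is_on_curve(address):
            return address, bump
    raise ValueError("no off-curve address found for these seeds")


# Hypothetical 32-byte program id and seeds, for illustration only.
PROGRAM_ID = bytes(range(32))
address, bump = find_program_address([b"balance", b"alice"], PROGRAM_ID)
print(address.hex(), bump)
```

Because the result depends only on the seeds and the program id, anyone can recompute the same address later without storing it anywhere.
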

You can learn more about PDAs [here](https://solana.com/docs/core/pda), but basically this means that data accounts *don't need* to have private keys associated with them, and can be derived and managed by the program that created them, making managing state a *lot* easier. PDAs also allow you to *sign* as the program, but more on that later.

## Storage Cost

On EVM, storing data uses the `SSTORE` opcode, which charges gas to store a 32-byte word (~20k gas for a fresh slot on ETH last I checked, but this varies across EVM chains and after forks and such). When you set this data to 0 (effectively deleting it), it refunds part of the initial storage cost -- but not all (75% historically, and less since EIP-3529).

On Solana, you pay a deposit when creating an account: a base amount (about 0.00089 SOL as of writing) plus a per-byte storage cost (about 0.000007 SOL per byte, so a 32-byte account comes to 0.0011136 SOL at the time of writing). If you *close* this account and free up the space, you actually get the *full* deposit back.

## Storage Patterns

You can emulate Solana storage patterns in EVM by using the [Diamond Storage](https://www.quicknode.com/guides/ethereum-development/smart-contracts/the-diamond-standard-eip-2535-explained-part-1) standard. Basically you'd have an EVM smart contract that acts as a storage contract, and then smart contracts that are executable-only and write to that storage contract.

Another common pattern in EVM development is using the `mapping` data type to store (key, value) data. This works in EVM because the contract has 2^256 - 1 storage slots (read: a *large* max storage capacity). In Solana, each account is limited to 10 MB *max* size, so instead a common pattern is to make a *lot* of accounts, often small in size. A PDA account, for example, is often treated as a KV store, with the key being the set of seeds used to derive the PDA, and the value being whatever you want to store in the account.
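
As a rough illustration of the "lots of small accounts" pattern, the toy model below treats each account as one KV entry keyed by its seeds, and charges a refundable deposit sized to the data. `ToyAccountStore` is a made-up class, the address function is a simplified stand-in for real PDA derivation, and the rent constants (128 bytes of overhead, 6,960 lamports per byte) are the values in effect at the time of writing, which could change.

```python
import hashlib

ACCOUNT_OVERHEAD_BYTES = 128   # metadata counted on top of the data length
LAMPORTS_PER_BYTE = 6_960      # assumed rent-exempt rate at time of writing
LAMPORTS_PER_SOL = 1_000_000_000


def rent_exempt_minimum(data_len: int) -> int:
    """Deposit (in lamports) needed to create an account of data_len bytes."""
    return (ACCOUNT_OVERHEAD_BYTES + data_len) * LAMPORTS_PER_BYTE


class ToyAccountStore:
    """Many small accounts, each addressed by (seeds, program id)."""

    def __init__(self, program_id: bytes):
        self.program_id = program_id
        self.accounts = {}  # address -> (data, deposit in lamports)

    def address(self, seeds) -> bytes:
        # Simplified stand-in for PDA derivation (no bump search, no curve check).
        return hashlib.sha256(b"".join(seeds) + self.program_id).digest()

    def create(self, seeds, data: bytes) -> int:
        """Store data under the seeds and return the deposit charged."""
        deposit = rent_exempt_minimum(len(data))
        self.accounts[self.address(seeds)] = (data, deposit)
        return deposit

    def read(self, seeds) -> bytes:
        """Look the account up again from nothing but its seeds."""
        return self.accounts[self.address(seeds)][0]

    def close(self, seeds) -> int:
        """Delete the account and return the full deposit as a refund."""
        _, deposit = self.accounts.pop(self.address(seeds))
        return deposit


store = ToyAccountStore(program_id=bytes(32))  # hypothetical program id
paid = store.create([b"username", b"alice"], b"spacewalker")
print(paid / LAMPORTS_PER_SOL, store.read([b"username", b"alice"]))
print(store.close([b"username", b"alice"]) == paid)
```

The seeds play the role of the mapping key: the client never stores the address, it just re-derives it whenever it needs the value.
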

## Upgrade Patterns

In EVM, all smart contracts are immutable by default, but developers often get around this by using proxy contracts and pointing them at new implementations of the upgraded contract.

On Solana, if you hold the private key for the program's upgrade authority, you can upgrade the executable account at any time *unless* you mark it as immutable, which stops the loader from letting anyone change it in the future.

## Contract Interfaces vs Reusable Programs

In EVM, because the executable code and the storage of the dapp both live in the same bucket, developers have to redeploy the same executable code over and over again to get their own buckets. An example of this is the ERC20 contract, which is redeployed *every time* someone launches a token, even though all the new tokens run the *same* code.

By contrast, on Solana, because the executable code and the storage are *separate*, most developers *do not need* to write their own Rust programs. They can make use of *existing* Solana programs (like the SPL Token Program), and just create new data accounts: a token mint account, plus balance accounts for the people holding their token.

By reusing existing deployed programs, developers also get to piggyback on the audit status of those programs and avoid having to manage deployment pipelines for on-chain code, which saves a *ton* of $$$.

Reusing existing code means that most Solana developers are actually *not* writing Rust code; they are just using JavaScript (or whatever language they prefer) to call common existing programs.

Some critics point out that this means that Solana program development is "centralized". This is not the case. There's nothing stopping you from writing your own token program that is different from the existing ones. The challenge is getting wallets to support and display your new token standard (which is the *same* problem you face when writing new standards in EVM development too).
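
If it helps, here is the "one program, many tokens" idea as a tiny toy model. `ToyTokenProgram` is a made-up class and not the SPL Token Program's real interface, and `PEARLS` is a hypothetical second token; the point is only that launching a new token adds data, not code.

```python
class ToyTokenProgram:
    """One shared program; each token is only data that it manages."""

    def __init__(self):
        self.mints = {}     # mint name -> total supply
        self.balances = {}  # (mint name, owner) -> balance

    def create_mint(self, mint: str) -> None:
        """Launching a token only adds a new data entry."""
        if mint in self.mints:
            raise ValueError("mint already exists")
        self.mints[mint] = 0

    def mint_to(self, mint: str, owner: str, amount: int) -> None:
        self.mints[mint] += amount
        key = (mint, owner)
        self.balances[key] = self.balances.get(key, 0) + amount

    def transfer(self, mint: str, src: str, dst: str, amount: int) -> None:
        if self.balances.get((mint, src), 0) < amount:
            raise ValueError("insufficient funds")
        self.balances[(mint, src)] -= amount
        self.balances[(mint, dst)] = self.balances.get((mint, dst), 0) + amount


program = ToyTokenProgram()     # deployed (and audited) once
program.create_mint("SHELLS")   # two different tokens...
program.create_mint("PEARLS")   # ...served by the same code
program.mint_to("SHELLS", "alice", 100)
program.mint_to("PEARLS", "steve", 50)
program.transfer("SHELLS", "alice", "bob", 20)
print(program.balances)
```

In the EVM version of this picture, `SHELLS` and `PEARLS` would each be a separate deployment of the same contract code.
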

## Composability (Delegate Call vs Cross Program Invocation ("CPI"))

In EVM, when Contract A wants to read from or write to Contract B, it makes an external call during its execution (a `call`, or a `delegatecall` if it wants to run Contract B's code against its own storage) and invokes functions on Contract B to do something. Contract B can in turn call Contract C during its execution, and so on and so on. This results in some *serious* vulnerabilities, as it's hard for anyone to know which execution path will change what memory and call which contracts, but it does allow for fairly easy composability. One thing to note is that reading data is often limited to what the called contract's functions return, as you can't read the *raw* data in Contract B/C/etc.

By contrast, in Solana, when you call Program A (its executable account), you have to specify ahead of time all the accounts you will use in the transaction, and which ones you want to be writable and which ones read-only. You might read data from an account owned by Program B and another owned by Program C and have no need to call either. Only when you want to *write* data do you have to call those programs, as only the program which owns a data account can write to it. Calling into a program from another program is called a "Cross Program Invocation".

One major current limitation of the Solana runtime is that you're limited to a depth of 4 for CPIs, which seriously hinders write composability. This is being addressed in runtime v2, but that won't be live for a while yet.

## Transaction vs Instructions

Calling an EVM smart contract takes one transaction, and that's the *only* thing you can do in that transaction.

By contrast, on Solana a call to a program is an "instruction", and you can bundle multiple instructions together in one transaction (up to the transaction size limit of 1232 bytes\*). You have to be careful though, because if one instruction in the transaction fails, all instructions revert, so a transaction has to go through entirely or not at all.
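
The all-or-nothing behavior can be sketched with a toy model like the one below. The function names and the dict-of-balances state are invented for this example and have nothing to do with the real runtime; it just shows that a later failing instruction throws away the changes made by earlier ones.

```python
import copy


class InstructionError(Exception):
    """Raised by a toy instruction to signal failure."""


def transfer(src: str, dst: str, amount: int):
    """Build an instruction that moves `amount` between two balances."""
    def instruction(state: dict) -> None:
        if state.get(src, 0) < amount:
            raise InstructionError(f"{src} cannot pay {amount}")
        state[src] -= amount
        state[dst] = state.get(dst, 0) + amount
    return instruction


def execute_transaction(state: dict, instructions) -> tuple:
    """Run every instruction on a copy; commit only if all of them succeed."""
    working = copy.deepcopy(state)
    try:
        for instruction in instructions:
            instruction(working)
    except InstructionError:
        return state, False  # discard every change, including earlier successes
    return working, True


balances = {"alice": 20, "bob": 0, "steve": 5}
ok_state, ok = execute_transaction(balances, [transfer("alice", "bob", 20)])
bad_state, bad = execute_transaction(
    balances, [transfer("alice", "bob", 20), transfer("steve", "bob", 15)]
)
print(ok, ok_state)
print(bad, bad_state)
```

In the second transaction Alice's transfer would have worked on its own, but Steve's fails, so Bob ends up with nothing from either.
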

\*There is currently talk of increasing the transaction size limit, as 1232 bytes is *very* tiny, so stay tuned for updates on that.

## Msg.Sender vs Signers

Every EVM transaction has exactly one signer: the account that signed and paid for the transaction. Inside a contract, `msg.sender` is the immediate caller (the signer on the first call, then the calling contract as you go from contract to contract), while `tx.origin` stays the same the whole way through.

In Solana, for every instruction you have a list of accounts that you're going to be working with, which by default are non-signers and read-only. You can mark any of these accounts as "signer" and/or "writable". A signer account means you'll attach a signature from this account to the transaction, proving the account's key holder signed off on this instruction. A writable account means that it may be modified during execution.

Additionally, the one or more instructions are then packed into a transaction, which has a fee payer that pays the transaction fee (base fee + priority fees) for the transaction.

This model lets you do a bunch of cool things:

1. You can have *multiple* signers on an instruction, so you can have multiple users sign, or user + backend, or whatever you want to do. This makes multisig implementations super easy.
2. You can separate the transaction fee payer from the other signers, allowing you to pay fees on behalf of the users who call your instructions, if you so choose.
3. The account creation fee payer is listed *separately* from the transaction fee payer, so it can be the same as the transaction fee payer or it can be someone else. You can also have multiple payers who pay for different parts of the execution / account creation / etc.

## Gas vs Compute Budget

EVM has a concept of "gas". Every opcode your code executes costs some amount of gas, which you pay for in ETH, so more complex code costs more. This often leads to "code golf" behavior in Solidity, where developers resort to pure assembly, hacks, and unreadable code to get their gas costs as low as possible.

On Solana, transactions get a default budget of 200,000 compute units (CU) per instruction for a base fee of 5,000 lamports (0.000005 SOL) per signature. Executing their code costs compute units, but developers pay the same 5k lamports regardless of whether they use 15,000 CU or 195,000 CU.

Note: these compute units are *separate* from storage costs, which we discussed above.

Additionally, on Solana, developers can optionally request up to 1.4 million compute units for more complex transactions. They still pay the same 5k lamport base fee, no matter how much compute they request.

HOWEVER, validators are incentivized to include transactions that use *fewer* compute units, because this means they can fit more transactions in a block, which means more $ for them overall. So generally, it's good practice to request *only* the number of compute units that you specifically need for your instructions (which you can get by simulating the transaction), even if it's less than 200k.

And if requesting more than 200k, or if Solana is congested, developers can additionally attach a "priority" fee, priced in micro-lamports per CU requested, on top of the base fee to help incentivize validators to include their transaction in a block sooner.

## Built In Functions vs Syscalls

In EVM, certain functions that run natively on the validator to save you gas are called "built in functions" (also known as precompiles), and in Solana they are called "syscalls". Same thing though.

## Solidity View Functions vs Account Reads

When you want to *read* data from chain, Solidity has these nice "view" functions, which let you fetch state without modifying it and run for free (no gas costs) when called off-chain. If you have different data structures, view functions are a good way to separate the way you *store* the data from how you want it to be read. For example, you might have a mapping(address => string) for usernames and a mapping(address => uint256) for balances as two data structures in your EVM code.
Then your view function might return a struct that has both the username and balance together in one package. These view functions are basically "data processing" functions that run on the RPC node and return parsed data.

Unfortunately, in Solana, this is more complex. The RPC only returns the raw data blob of the account you ask for, and if you want a join of two data accounts or anything like that, you have to do the work yourself on the client. There are APIs like DAS that make this simpler for NFTs and tokens, but they aren't available on every RPC provider.

## Logs

On EVM, you can store data in logs and have that be a secondary, cheaper storage location for client-side, read-only data (it can't be accessed on chain).

On Solana, you do have logs, but no one really stores anything in them, because very few validators keep logs past 30-90 days. If you want to store data in logs, you instead use a return-data-cpi hack, using a CPI that returns data (it could be to your own program, could be to the noop program, etc). The return data will actually be stored in logs and *not* pruned by validators. It's stupidly hacky, and I hate it, but that's what it is.

## Getting Started

On EVM, I usually taught Solidity through the Remix IDE, which is a fantastic tool. The closest thing to that on Solana is [Solpg](https://beta.solpg.io). We don't have an SVM in WASM yet, so you can't run the code in the browser like you can with Remix, but it's a work in progress.

For writing the code itself, you can write what's called "native" Solana, or use a framework like [Anchor](https://www.anchor-lang.com) or [Steel](https://github.com/regolith-labs/steel) or [Pinocchio](https://github.com/febo/pinocchio). These frameworks usually provide nice Rust macros that make the boilerplate less irritating.

Generally you'd write your Rust code, then use a tool like Anchor or [Codama](https://github.com/codama-idl/codama) to generate an IDL, which is a JSON description of your program.
From that you can generate client code to make calling your program easier.