sxysun
# EVM Parallelization Refinement Notes

### Introduction

In this note, we present two example algorithms that use formal methods to improve EVM parallelization by refining the state witness oracles.

### Formal Methods

One way to facilitate better refinement is using formal methods (a type of static analysis) to infer the shared-location information $\kappa$. If executing $txn_i$ and $txn_j$ in either order using the sequential EVM $\delta$ generates the same Ethereum state, we say they are *parallelizable*; otherwise, they are *conflicting*. If our analysis gives no false negatives on the *conflicting* property, it is *sound*. If it gives no false positives on the *conflicting* property, it is *precise*. So we have two choices here. We could give a sound list, overestimating the storage accesses: we guarantee no conflicts but perform more unnecessary sequential computation. Or we could give a precise list, underestimating the storage accesses: we risk ending up with inconsistent states from $\delta$ and $\delta_p$, but the time saved by speculative execution might be worth a few re-computations (that's why it's the [art](https://www.di.ens.fr/~cousot/AI/IntroAbsInt.html) of balancing soundness and preciseness). If you think more about the ideas of *soundness* and *preciseness*, you can easily come up with cases where two transactions are parallelizable but it is not obvious to see.
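To make the trade-off concrete, here is a minimal sketch (our illustration, not part of the note's design) of the conflict check enabled by a sound $\kappa$:

```python
def parallelizable(kappa_i, kappa_j):
    """Sketch (hypothetical helper): given a *sound* over-approximation
    kappa of each transaction's storage accesses, disjoint sets imply the
    two transactions touch no common location, so they can safely run in
    parallel."""
    return kappa_i.isdisjoint(kappa_j)
```

Because a sound $\kappa$ only ever adds spurious locations, imprecision here can only cost parallelism (flipping the answer from `True` to `False`), never correctness.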
For example, consider a DeFi protocol that only allows people who hold its native token to claim an airdrop of another token:

```solidity
function claim_airdrop() external {
    require(!claimed[msg.sender]);
    require(IERC20(native_token).balanceOf(msg.sender) > 0);
    claimed[msg.sender] = true;
    IERC20(airdrop_token).mint(msg.sender, 400 ether);
    emit AirdropClaimed(msg.sender, IERC20(airdrop_token).symbol(), 400 ether);
}
```

Now suppose Alice and Bob both already have a balance of `100` native tokens, and there are two transactions $txn_1$ and $txn_2$: in $txn_1$ Alice calls the `claim_airdrop` function, and in $txn_2$ Bob calls `IERC20(native_token).transfer`, transferring all of his 100 native tokens to Alice. Placing $txn_2$ before $txn_1$ makes $txn_1$ read a different native token balance from storage for Alice, but this read doesn't affect $txn_1$'s execution (control flow) and has no impact on the final state change. And this is just a very simple case; you get the idea: there could be other, more complex function intertwinings that make it difficult to deduce the *parallelizability* of $txn_1$ and $txn_2$ without actually executing them. In fact, it is [impossible](https://en.wikipedia.org/wiki/Rice%27s_theorem) to be sound and precise at the same time with only static reasoning. *Although it is a fantastic idea to build quick pattern matchers for those special cases and use them as optimizations for our main static analysis engine*, for now we leave that for future work and focus on building a "good enough" static analyzer for deriving the storage access information $\kappa$.

### Approach

We define our algorithm using [abstract interpretation](https://cs.nyu.edu/~pcousot/). In short, abstract interpretation replaces the concrete value of a variable (like `uint256 18000000`) with an abstract value during the program "interpretation."
This abstract value is usually defined as an element of a [complete lattice](https://en.wikipedia.org/wiki/Complete_lattice). In short, a complete lattice $(L, \sqsubseteq, \sqcup, \sqcap, \bot, \top)$ satisfies that $L$ is a [partially ordered set](https://en.wikipedia.org/wiki/Partially_ordered_set) under the relation $\sqsubseteq$, and every subset of $L$ has both a supremum (a least upper bound under $\sqsubseteq$) and an infimum (a greatest lower bound under $\sqsubseteq$). The operator giving the supremum is called join ($\sqcup$) and the operator giving the infimum is called meet ($\sqcap$). Finally, we have $\bot = \sqcap L$, the infimum of the set of all elements of $L$, and $\top = \sqcup L$, the supremum of the set of all elements of $L$. Why use such a fancy mathematical structure, you ask? Well, it helps us formulate the relation between abstract interpretation and concrete execution as a [Galois connection](https://en.wikipedia.org/wiki/Galois_connection) and prove [some properties](https://matt.might.net/papers/vanhorn2010abstract.pdf) of it, for example, that with [well-defined](https://en.wikipedia.org/wiki/Monotonic_function) transition functions, our interpretation [always terminates](https://en.wikipedia.org/wiki/Kleene_fixed-point_theorem). For interested wrinkle-brain theorists out there, you can dig into abstract interpretation by starting from [Patrick Cousot's tutorial](https://cs.nyu.edu/~pcousot/summerschools/ICTAC-2019/Cousot-tutorial.pdf) (if you didn't understand or think it's too early for symbols and Greek letters, don't worry! We will present concrete examples at the end of this section).
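As a concrete instance (our example, chosen for familiarity): the powerset of a finite set is a complete lattice under $\subseteq$, with union as join, intersection as meet, $\emptyset$ as $\bot$, and the full set as $\top$:

```python
# Powerset lattice over S = {1, 2, 3}: the order is subset inclusion.
S = frozenset({1, 2, 3})   # ⊤: the supremum of the whole lattice
BOT = frozenset()          # ⊥: the infimum of the whole lattice

def leq(a, b):             # ⊑ is ⊆
    return a <= b

def join(a, b):            # ⊔ is ∪ (least upper bound)
    return a | b

def meet(a, b):            # ⊓ is ∩ (greatest lower bound)
    return a & b
```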
### Assumption

We assume that our EVM has a [back-tracking](https://github.com/ledgerwatch/erigon/wiki/EVM-with-abstract-interpretation-and-backtracking) capability. In other words, it can take and keep a snapshot of the EVM state whenever we reach an opcode that changes the program counter `pc`, so that later, when we finish the execution, we can "go back in time" to the previous EVM state and choose a different path. For example, when we encounter `JUMPI`, our EVM can take a snapshot of its state, proceed without jumping (just incrementing `pc`), and then, when it finishes execution, come back and pick up the snapshotted state to execute with a jump (changing `pc` to the jump destination).

### Algorithms

We present two algorithms utilizing abstract interpretation: one for general transactions and another for special-case transactions like ERC20 token transfers or Uniswap add-liquidity calls. We can run the algorithms beforehand to get state witness summaries and determine at runtime which summaries to use based on dynamic [profiling](https://en.wikipedia.org/wiki/Profiling_(computer_programming)).

#### Coarse Analysis

For general transactions, we perform abstract interpretation on them after the client receives them; thus we are performing Just-In-Time (JIT) abstract interpretation. Obviously, this requires the abstract interpretation to be fast, so we use a coarse domain (i.e., a complete lattice that has relatively fewer elements and a simpler structure under its $\sqsubseteq$ relation) for our abstract values. Since the goal is for the overall amortized time of parallel coarse analysis plus execution to be much less than the time sequential execution takes, we deliberately choose a coarse domain that includes [havoc](https://research.cs.wisc.edu/wpis/papers/vmcai16-invited.pdf) operations. Basically, havoc just means that we deliberately overestimate the abstract value of some variable.
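Havocing can be pictured as a tiny sketch (the `havoc` helper and dictionary environment are our illustrative choices, with `"⊤"` standing for an unknown value):

```python
TOP = "⊤"  # abstract "unknown": covers every possible concrete value

def havoc(abs_env, var):
    """Forget everything about var. This is always sound: whatever the
    real value is, it is certainly covered by ⊤; we only lose precision."""
    out = dict(abs_env)
    out[var] = TOP
    return out
```

For instance, `havoc({"x": 5}, "x")` yields `{"x": "⊤"}`: the analysis no longer tracks `x`, and anything computed from `x` will be unknown too.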
The reason we do this is that by setting the abstract value of a variable to $\top$ (unknown) we save analysis time (think of $\top$ as meaning the value could be anything), because performing join/meet with an unknown is fast (constant time, since $\forall v,\ v \sqcap \top = v,\ v \sqcup \top = \top$). And since $\top$ is an overestimation, every variable that depends (whether control-flow dependent or data-flow dependent) on a variable with abstract value $\top$ also gets $\top$ as its abstract value, making operations on it O(1) as well (a kind of domino effect). The specific reason we use this is that when encountering `SLOAD` (recall this is the most expensive runtime opcode), we want to cut its execution time to *very low*, so we won't really read storage, as that would just turn "abstract interpretation" into an actual "concrete interpretation." So whenever we hit `SLOAD`, we just put a $\top$ value on top of the stack and go on with our analysis. According to [previous performance analysis](https://docs.google.com/presentation/d/11DqpZP1NA24OvJe3l9yaAPavQk7luGIE5oxIe07IayA/edit#slide=id.p), `SLOAD` takes roughly 33% of total EVM instruction emulation overhead, so cutting this alone could *directly* cut interpretation time by 33%, not accounting for the domino effect. We define the complete lattice $(L_c, \sqsubseteq, \sqcup, \sqcap, \bot, \top)$ used by our coarse analysis below:

- $(L_c, \sqsubseteq)$: $\bot \sqsubseteq 0, 1, 2, 3, \ldots, (2^{256}-1) \sqsubseteq \top$
- $\top$: Unknown value
- $\bot$: Undefined
- $\sqcap$: $\forall v,\ v \sqcap \bot = \bot,\ v \sqcap \top = v$ (note $\sqcap$ is commutative)
- $\sqcup$: $\forall v,\ v \sqcup \top = \top,\ v \sqcup \bot = v$

Next, we present a sound (conflict is impossible) naive algorithm for parallel EVM.

1. For each $txn_i$ in $\bar{t}$, run the following in parallel:
    1.1.
Initialize a set $l_i^p = \emptyset$, which will contain all storage access traces in a certain execution path $p$ (an [execution path](https://en.wikipedia.org/wiki/Symbolic_execution) is a sequence of EVM opcode executions starting from the contract function selector and ending at `REVERT` or `RETURN`). Also initialize a set $e_i = \emptyset$, which will contain the storage access trace of $txn_i$ across all paths.
    1.2. Run abstract interpretation using the transition functions listed in the table below. Whenever we encounter `SLOAD` or `SSTORE` and see a value $v$ on top of the stack, we do $l_i^p = l_i^p \cup \{ v \}$ (recall that the topmost element of the stack is the storage key for `SSTORE` and `SLOAD`). Whenever we encounter `JUMPI`, we perform a backtrack, creating two execution paths $p_1$ and $p_2$, each with its storage trace set $l_i^{p_1}$ and $l_i^{p_2}$ initialized to $l_i^p$. Whenever we finish a path $p$, we do $e_i = e_i \cup l_i^p$.
2. Check $e_i$ against other transactions: if $e_i \cap e_j = \emptyset$, execute $txn_i$ and $txn_j$ in parallel; otherwise execute them in the order of $\bar{t}$.

Note that we introduce $\top$ on the top of the stack *after* we interpret `SLOAD` abstractly. This should not be confused with us seeing the value $v$ at the current stack top and adding it to the set $l_i$ (the set of accessed storage locations) right *before* our interpretation of `SLOAD` and `SSTORE`. We now walk through how we would abstractly interpret some representative EVM opcodes in the following table. Whenever we see an EVM opcode at `pc` and "stack input" on the EVM stack, we perform "operation" and get "stack output." During our abstract interpretation (as shown in the following table), we only *introduce* $\top$ when we interpret `SLOAD`, but we allow $\top$ to be *propagated* when we interpret other opcodes like `ADD`, `SHA3`, etc.
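The $\top$-introduction, $\top$-propagation, and key-collection rules can be sketched as a tiny abstract interpreter (our simplification: straight-line code, a handful of opcodes, and no `JUMPI` backtracking; `coarse_trace` is a hypothetical name):

```python
TOP = "⊤"

def coarse_trace(ops):
    """Abstractly interpret a straight-line opcode list, returning the
    collected storage-access set l_i and the final abstract stack.
    SLOAD records the key and pushes ⊤ (we never touch real storage);
    SSTORE records the key; ADD propagates ⊤; PUSH pushes a constant."""
    stack, l_i = [], set()
    for op, *args in ops:
        if op == "PUSH":
            stack.append(args[0])
        elif op in ("SLOAD", "SSTORE"):
            key = stack.pop()
            l_i.add(key)                 # record the accessed slot
            if op == "SLOAD":
                stack.append(TOP)        # havoc the loaded value
            else:
                stack.pop()              # SSTORE also consumes the value
        elif op == "ADD":
            a, b = stack.pop(), stack.pop()
            stack.append(TOP if TOP in (a, b) else (a + b) % 2**256)
    return l_i, stack
```

Interpreting `PUSH 1; SLOAD; PUSH 2; ADD` yields `l_i = {1}` with a single `⊤` left on the stack: the loaded value is unknown, and the addition that consumed it is unknown too.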
<!-- Since we are overestimating the value from `SLOAD`, what we get in the end is an underestimation of the actual storage accesses. So we might run into some races and need to recompute. -->
<!-- 3. if our parallel execution in step 2 run generates a different state from sequential execution $\delta$, then re-run the conflicting transactions in order of $\bar{t}$. -->
<!-- initialize initial state $X_i = \overline{\top} \in L_c$, also $F(X_i)$ until reaches a fixedpoint. During the execution of $F$ -->

```
opcode        |stack input           |stack output           |operation
--------------+----------------------+-----------------------+----------------------
SLOAD         |key|...               |⊤|...                  |push ⊤, add key to l_i
SSTORE        |key|value|...         |...                    |nop, add key to l_i
MLOAD         |offset|...            |value(⊤)|...           |if offset = ⊤ then push ⊤ else push memory[offset:offset+32]
ADD           |a|b|...               |a+b(⊤)|...             |if a = ⊤ or b = ⊤ then push ⊤ else push a+b
SHA3          |offset|length|...     |hash(⊤)|...            |if offset = ⊤ or length = ⊤ then push ⊤ else push hash
BALANCE       |addr|...              |addr.balance(⊤)|...    |if addr = ⊤ then push ⊤ else push address(addr).balance
JUMPI         |dest|cond|...         |...                    |if cond = ⊤ then backtrack $pc else $pc = dest; if dest = ⊤ then return with warning
JUMP          |dest|...              |...                    |if dest = ⊤ then return with warning else $pc = dest
PUSH1         |...                   |uint8|...              |pushes a 1-byte value onto the stack
DUP1          |value|...             |value|value|...        |duplicates value, whose domain is L_c
CALLDATALOAD  |i|...                 |msg.data[i:i+32]|...   |if i = ⊤ then push ⊤ else loads a uint256 from message data
SELFDESTRUCT  |addr|...              |...                    |if addr = ⊤ then just destroy the contract else destroy the contract and send all funds to addr
RETURN        |offset|length|...     |...                    |return l_i, the collected storage accesses
REVERT        |offset|length|...     |...                    |return l_i, the collected storage accesses
...
```

To deal with inter-procedural calls, we perform the operation shown in the following table:

```
opcode:       CALL
stack input:  |gas|addr|value|argsOffset|argsLength|retOffset|retLength|...
stack output: |success|...
operation:    if addr = ⊤ then return with ⊤ added to l_i else proceed
```

For other inter-procedural opcodes like `DELEGATECALL` and `STATICCALL`, execution is analogous to that of `CALL`.

#### Precise Analysis

For transactions that interact with popular contracts (e.g., a Uni v3 pool) we pre-perform a static analysis with a relatively refined domain on the contract. Specifically, we call the functions in those contracts *special case functions*, denoted by $f_i$. Our goal in this step is to generate a $\kappa$ that is very close to $\kappa_{perfect}$, without much consideration for the speed of our analysis. We define the complete lattice $(L_p, \sqsubseteq, \sqcup, \sqcap, \bot, \top)$ used by our precise analysis below:

- $(L_p, \sqsubseteq)$: $\bot \sqsubseteq$ `exp` $\sqsubseteq \top$. `exp`'s formal definition in [BNF](https://en.wikipedia.org/wiki/Backus%E2%80%93Naur_form) is given below. Note that hash functions / `SLOAD` / `MLOAD` / `CALLDATALOAD` / ... are treated as [uninterpreted functions](https://www21.in.tum.de/teaching/sar/SS20/6.pdf) (e.g., `sha3(a + sha3(0x1efff321 + b))`) so that later we can use [congruence closure](http://people.eecs.berkeley.edu/~necula/autded/lecture12-congclos.pdf) for automated reasoning.
- $\top$: Unknown value
- $\bot$: Undefined
- $\sqcap$: $\forall v,\ v \sqcap \bot = \bot,\ v \sqcap \top = v$ (note $\sqcap$ is commutative)
- $\sqcup$: $\forall v,\ v \sqcup \top = \top,\ v \sqcup \bot = v$

```
exp      ::= literal | symbol | function
literal  ::= 1 | 2 | ... | 2^256 - 1
symbol   ::= a | b | c | ...
function ::= sha3(exp, exp) | sload(exp) | mload(exp) | calldataload(exp)
```

Next, we present a sound naive algorithm for $\delta_p$.

1. For each special case function $f_i$, run the following before real-time execution:
    1.1. Initialize a set $l_i^p = \emptyset$ which will contain all storage access traces in a certain execution path $p$.
Also initialize a set $c_i^p = \emptyset$ used for tracking the environment constraints we collect on path $p$, and a set $e_i = \emptyset$ whose elements will be tuples of the form $(l_i^p, c_i^p)$.
    1.2. Run abstract interpretation using the transition rules defined below. Whenever we encounter `SLOAD` or `SSTORE` and see a value $v$ on top of the stack, we do $l_i^p = l_i^p \cup \{ v \}$ (recall that the topmost element of the stack is the storage key for `SSTORE` and `SLOAD`). Whenever we encounter `JUMPI`, we perform a backtrack, creating two execution paths $p_1$ and $p_2$, each with its storage trace set $l_i^{p_1}$ and $l_i^{p_2}$ initialized to $l_i^p$. Whenever we finish a path $p$, we do $e_i = e_i \cup \{(l_i^p, c_i^p)\}$.
2. Upon receiving $\bar{t}$ and $s_{k-1}$ (the Ethereum state of the previous block, i.e., block $k-1$) in real time:
    2.1. Set each symbolic value in $c_i^p$ equal to its actual concrete value, then solve $\bigwedge c_i^p \wedge \forall v,\ sstore(sload(v)) = v$ (basically, this just means we AND all the conditions in $c_i^p$ together, plus the additional axiom $\forall v,\ sstore(sload(v)) = v$, which saves us from actually loading the concrete value of $sload(v)$ from storage). If the solver returns true, insert the set of concrete values into $l_i^p$ and let $l_i = l_i^p$.
    2.2. Check $l_i$ against other transactions: if $l_i \cap l_j = \emptyset$, execute $txn_i$ and $txn_j$ in parallel; otherwise execute them in the order of $\bar{t}$.

The intuition for step 2.1 is that since we've collected the satisfiability condition of each path $p$, $c_i^p$, together with the storage access trace of that path, $l_i^p$, we want to see which path $p$ the real transaction will take by checking whether $c_i^p$ is satisfied once the concrete values are inserted. If it is, then the real transaction will indeed take this path $p$ and end up with the storage access trace set $l_i^p$.
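Step 2.1 can be sketched as follows (a strong simplification: path constraints are modeled as plain Python predicates over the concrete environment rather than SMT formulas, and the $\forall v,\ sstore(sload(v)) = v$ axiom is omitted; all names are illustrative):

```python
def feasible_accesses(paths, env):
    """Each offline-analyzed path is a (constraints, accesses) pair.
    Plugging the real transaction's concrete values into the constraints
    selects the path(s) the transaction will actually take; their access
    sets become l_i."""
    l_i = set()
    for constraints, accesses in paths:
        if all(c(env) for c in constraints):
            l_i |= accesses
    return l_i

# Illustrative path summaries for an ERC20 _transfer-style function:
transfer_paths = [
    ([lambda e: e["sender"] == 0], set()),                 # revert path
    ([lambda e: e["sender"] != 0,
      lambda e: e["recipient"] == 0], set()),              # revert path
    ([lambda e: e["sender"] != 0,
      lambda e: e["recipient"] != 0],
     {"bal[sender]", "bal[recipient]"}),                   # happy path
]
```

With `env = {"sender": 7, "recipient": 9}` only the happy path's constraints hold, so `l_i` becomes the two balance slots; a zero sender selects a revert path with an empty access set.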
We walk through how our analysis generates $\kappa$ for a simple ERC20 contract template taken from [OpenZeppelin](https://github.com/OpenZeppelin/openzeppelin-contracts).

<!-- 4. if our parallel execution in step 3 run generates a different state from sequential execution $\delta$, then re-run the conflicting transactions in order of $\overline{o}$. -->
<!-- We walk through a simple transaction `0x05f6c23e709db140421dc4890bd39592f03fcfb9fad56817f1f31d0ffa519d48` which calls Uniswap V2 router to exchange 400M of some shitcoin to Ether. For now, we consider a simple ERC20 token contract. ![](https://i.imgur.com/xjqEtwd.png) -->
<!-- But we do need some degree of impreciseness, as Usually, static analysis aims for soundness. However, we cannot aim for soundness in our analysis, because then we would generate a huge list of storage accesses such that it is not feasible to check for conflicts in real-time. An example is the Ethereum 2.0 staking contract, or the MasterChef contract in Pancakeswap. They all contain a long list of addresses that the contract might access. As a result, our analysis might seem ad-hoc at first. -->

```solidity=
function _transfer(
    address sender,
    address recipient,
    uint256 amount
) internal virtual {
    require(sender != address(0));
    require(recipient != address(0));
    uint256 senderBalance = _balances[sender];
    require(senderBalance >= amount);
    unchecked {
        _balances[sender] = senderBalance - amount;
    }
    _balances[recipient] += amount;
    emit Transfer(sender, recipient, amount);
}
```

The source code, up to line 9, roughly translates[^30] to the following bytecode (with `sender` and `recipient` represented as symbolic values), with our analysis steps written as comments:

[^30]: The bytecode shared here has major differences from what Solidity outputs. For example, a comparison of address `sender` with zero is done by bit-vector operations. Here we deliberately "hand-compiled" the code for simplicity of demonstration.

```
... loading parameters ...
PUSH sender     // stack: sender | ...
PUSH 0          // stack: 0 | sender | ...
NEQ             // fork two paths with c_i = {sender = 0} and c_i = {sender != 0} respectively, stack: 1 or 0 | ...
PUSH CONT1      // stack: cont1 | 1 or 0
JUMPI           // based on which fork we are on, jump to cont1 or not, stack: | ...
REVERT          // return l_i = ∅ and c_i = {sender = 0}
JUMPDEST CONT1:
PUSH recipient  // stack: recipient | ...
PUSH 0          // stack: 0 | recipient | ...
NEQ             // fork two paths with c_i = {recipient = 0} and c_i = {recipient != 0} respectively, stack: 1 or 0 | ...
PUSH CONT2      // stack: cont2 | 1 or 0
JUMPI           // based on which fork we are on, jump to cont2 or not, stack: | ...
REVERT          // return l_i = ∅ and c_i = {sender != 0, recipient = 0}
JUMPDEST CONT2:
PUSH 0x3        // stack: 0x3 | ...
SHA3            // stack: 0x1bf0b26eb2090599dd68cbb42c86a674cb07ab7adc103ad3ccdf521bb79056b9 | ...
PUSH sender     // stack: sender | 0x1bf0b26eb2090599dd68cbb42c86a674cb07ab7adc103ad3ccdf521bb79056b9 | ...
ADD             // stack: sender + 0x1bf0b26eb2090599dd68cbb42c86a674cb07ab7adc103ad3ccdf521bb79056b9 | ...
SHA3            // stack: sha3(sender + 0x1bf0b26eb2090599dd68cbb42c86a674cb07ab7adc103ad3ccdf521bb79056b9) | ...
SLOAD           // stack: sload(sha3(sender + 0x1bf0b26eb2090599dd68cbb42c86a674cb07ab7adc103ad3ccdf521bb79056b9)) | ..., l_i = {sha3(sender + 0x1bf0b26eb2090599dd68cbb42c86a674cb07ab7adc103ad3ccdf521bb79056b9)}
...
```

In short, we use the idea of forking from symbolic execution by introducing an environment constraint set $c_i^p$ and symbolic values for every unknown input. Whenever we encounter an opcode concerning external input/output, we treat it as if it were an uninterpreted function. The collected $c_i^p$ is finally used to solve for $l_i^p$. Note that we could also perform unification on symbolic constraints (analogous to [summary-based symbolic evaluation](https://homes.cs.washington.edu/~emina/doc/solar.ase20.pdf)) before real-time transactions come in, to save reasoning time.
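The uninterpreted-function treatment above can be sketched as follows (our illustration: terms are immutable syntax trees that are compared structurally and never evaluated):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class App:
    """An application of an uninterpreted function to argument terms."""
    fn: str
    args: tuple

def sha3(x):
    return App("sha3", (x,))

def add(a, b):
    return App("add", (a, b))

# Storage key of _balances[sender] for a mapping at slot 0x3, mirroring
# the trace above: sha3(sender + sha3(0x3)).
def balance_key(sender):
    return sha3(add(sender, sha3(0x3)))
```

Structural equality then yields cheap disjointness facts: `balance_key("alice")` and `balance_key("bob")` are distinct terms, so two transfers touching different accounts access different slots (assuming the hash is collision-free).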
Of course, the actual compiled bytecode might have parts that symbolic execution simply cannot solve. For example, the Solidity compiler compiles the comparison with a zero address not to numeric operations but to a bit-vector operation, and bit operations are hard for SMT solvers. We can combat this by either [transpiling](https://www.microsoft.com/en-us/research/uploads/prod/2021/09/3477132.3483564.pdf) the bytecode language to another, more formal-methods-friendly language, or using __[concolic execution](https://www.cs.cmu.edu/~aldrich/courses/17-355-19sp/notes/notes15-concolic-testing.pdf)__ so that we only have symbolic values in path conditions but not in the potential $\kappa$ (stack values are then concrete and no longer symbolic; this makes $\kappa$ potentially unsound but accelerates the analysis a lot). Of course, we still suffer from [path explosion](https://ieeexplore.ieee.org/document/5692223). But since we are not doing the analysis on-the-fly, we can always retreat to a coarser method and still get a strictly better result than if we used no refinement for parallelization.

#### Pushing the boundary

There are many long-tail optimizations[^14] that we can apply to our algorithms:

* Separate read and write operations in $l_i$: currently, if $txn_i$ and $txn_j$ both only read and never write, we don't parallelize them while we should.
* Commutativity of the write operation: commutative write operations could make storage accesses to the same location parallelizable. [Previous work](https://arxiv.org/abs/1802.08748) showed that this is possible even while retaining precision.
* A more refined $l_i^p$ set: currently $l_i^p$ is one big set that doesn't distinguish reads from writes and doesn't account for the location of those reads/writes. There are several optimizations we could make here, such as providing a more refined $l_i^p$ structure and then eliminating the cases we talked about in the section _Parallelization algorithms_, etc.
* Pattern matchers: since our analysis result on $l_i^p$ only guarantees that there are no false negatives, and we don't want to make our analysis too complex, we could build simple pattern matchers that eliminate special cases, making us more precise while keeping soundness.
* Abstract interpretation tweaks: we could tweak the design of $L_c$ to use [other domains](https://www.dsi.unive.it/~avp/domains.pdf), or use other [ordering strategies](https://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.89.8183&rep=rep1&type=pdf) to try to accelerate the process. Or we could even tweak the transition function $F$; for example, we could probabilistically make `MLOAD` opaque (pushing $\top$ every time) and see how that affects precision and performance.

[^14]: It would also be interesting to study how the preciseness of our analysis (e.g., tuning the domains) impacts our overall performance. Basically, this asks: if you were to plot a graph where the x-axis is the abstraction level (how much information $\kappa$ gives) and the y-axis is the performance boost to parallel execution (the amortized execution time of $\delta_p$), what would it look like? It would also be interesting to see the conflict rate of $\bar{t}$ under $\kappa$'s information.
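The first refinement, separating reads from writes in $l_i$, can be sketched as (hypothetical helper):

```python
def conflicts(rw_i, rw_j):
    """rw = (reads, writes). Two transactions conflict only when one
    writes a location the other reads or writes; overlapping *reads*
    alone are harmless and stay parallelizable."""
    (ri, wi), (rj, wj) = rw_i, rw_j
    return bool(wi & (rj | wj)) or bool(wj & ri)
```

Under this refinement, two transactions that both read slot `1` but write nothing do not conflict, whereas a write to a slot the other transaction reads still does.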
