New indexing algorithm in KAPI

# New indexing algorithm in KAPI The solution presented in this document is a simplified indexing algorithm that addresses the following issues: - Handling blockchain state reorganizations is not addressed. - There are too many unnecessary queries to nodes. - Applications built using KAPI face several problems related to the fact that KAPI does not explicitly indicate which data has changed, requiring a complete data reload from KAPI. This diagram shows the new indexing algorithm: > The symbol "*" marks the actions that I will explain after the diagram ```mermaid flowchart TD New_Tick(new app tick) --> is_blockhash_changed{is block hash changed?} is_blockhash_changed -->|No| end_without_changes[end without changes] is_blockhash_changed -->|Yes| is_eth_node_lagging{is the ether node lagging behind?} is_eth_node_lagging --> |No| operators_were_changed{have the operators been changed? *} is_eth_node_lagging --> |Yes| end_with_error{log warn, alert, end} operators_were_changed --> |Yes| change_operators_state[change operators state] operators_were_changed --> |No| skip_change_operators_state[skip change operators state] change_operators_state --> keys_were_changed{have the keys been changed? *2} skip_change_operators_state --> keys_were_changed{have the keys been changed? *} keys_were_changed --> |No| end_with_new_safe_state[record the last safe state, exit] keys_were_changed --> |Yes| find_new_left_edge[find new left edge *] find_new_left_edge --> end_with_new_safe_state_and_keys[record the last safe state, keys, exit] ``` ## New terminology ### Left edge The left edge is what I call the reporting point of the requested keys ### Last safe block state This is the basis for dealing with reorganization. It represents the necessary minimum data to understand the state of the stable (finalized) branch of the blockchain. It is used to obtain a stable data range when querying. Since the key download algorithm works on the basis of comparing old and new data, we cannot be sure that our old data has not undergone reorganization. Also, we cannot query every time data from a finalized block because the local state may lag behind that block. Also, the finalized block may already be missing from the node. As a solution to the problem, I came up with the following algorithm: - At each iteration, we write the necessary data (number of keys) relative to the finalized block (not the block hash itself, as it may disappear from the node) - As the left boundary of the requested data interval we take this data, if it is not available (cold start) we put 0 to make sure that we have the most up-to-date state ### Have the operators been changed? In order to understand if the state of the operators has changed we still use events, since *nonce* does not help us to determine if the operators' data has changed. But the current algorithm does not take into account possible reorganizations. Modified algorithm: - get the range of events taking into account the last safe block (if it is not more than 100) - cache the array of event block hashes - use this array in the next iterations to check if the state has changed - if the states are not equal, return true so we take into account reorganizations and do not make unnecessary changes ### Have the keys been changed? Here the algorithm is the same, only we look at the keys ## Changing the database schema In addition to the current changes, I want to add two fields to operators and one field to ElMetaEntity ### Operators - `keysIdempotentKey` - uid indicating a change to the operator keys - `oparatorIdempotentKey` - uid indicating the change of operator data ### ElMetaEntity - `idempotentKey` - uid indicating that any data has been changed In the current implementation of the current module, this key will be the hash of the block on which the last changes were committed. I chose the name abstractly on purpose, because in the future we can load keys not only from the ether blockchain, but also from other networks, as well as from files where idempotentKey will already be a hash sum, for example. ## Conclusion With minimal changes to the source code, we fix reorganization bugs, improve performance, and lay the foundation for the future