###### tags: `Cosmos SDK` # SDK Architecture Review - 2021-08-13 The **SDK Architecture Review** meeting is a bi-weekly meeting focused on architecture discussions of ADRs and features currently in development (any issues tagged with the label "Status: Needs Architecture Review"). ## Agenda - [x] Working Group Check In (for groups in DESIGN phase) - https://github.com/cosmos/cosmos-sdk/issues/9058 - [x] Tendermint v0.35 Key Migration - https://github.com/cosmos/cosmos-sdk/issues/9894 - [x] Deprecate x/params - https://github.com/cosmos/cosmos-sdk/discussions/9913 - [x] Decide about future of sdk.Dec - https://github.com/cosmos/cosmos-sdk/issues/7773 - [x] Event History / Event Hooks - https://github.com/cosmos/cosmos-sdk/pull/9305 - https://github.com/cosmos/cosmos-sdk/discussions/9656 - [x] ADR 044: Guidelines for Updating Protobuf Definitions - https://github.com/cosmos/cosmos-sdk/issues/9477 - https://github.com/cosmos/cosmos-sdk/pull/9613 - [ ] Breaking Module proto into API (Root), and internal. ## Notes ### Working Group Check In **Protobuf golang v2 migration and codegen** - No updates **Transaction Improvements** - had first meeting **SMT Migration** - Still evaluating two paths forward: - feature branch and pushing things singularly in master - direct commits of all changes into master - leaning towards the latter - Made a decision to continue support IAVL in parallel to a new SMT based database, and provide a smooth upgrade path ### Tendermint v0.35 Key Migration - Sam from Tendermint on the call - key names for data that tendermint stores changes (all prefixes change) - In order to have an in-place upgrade, a specific migration script needs to run - Migration script is just a go function that would need to be run, but there's also a CLI entrypoint in the tendermint binary - When users with existing nodes are upgrading to v0.35, they will need to run this migration script first - Main path will mostly be user hand-holding, but likely we should add some automation too - Aaron: I had a conversation about some other things that need to happen (per request of the AIB team) - Sometimes the app.toml changes and needs migrating as well - My proposal for how we handle this convention, is that for the binary that cosmovisor is managing, there would be a pre-upgrade command with a pre-upgrade handler which does any of these optional migrations - Sam: The upgrade script is written to be retryable - Having said that, i can't say that every error is recoverable, but as a general pattern that makes sense - One of the things that is a potential challenge is that if you're runing tendermint as a separate binary, then "i've given you a way to run this script", but we may need to support different pre-flight scripts as well - Aaron: Robert has proposed cosmovisor being able to manage multiple binaries. If enough people are running tendermint outside, then perhaps this is a use case to be supported - We had a call with you and Bez and would suggest that we have another call with you and bez to scope this out - Ira: Is there a place where we can join in on that process? Our operations group has a lot of feedback & concerns with that. - Robert: I think what Ira is asking is to add him to the next conversation that we're having on the topic. There is another use case of tendermint being used in applications which is not cosmos chains. - Aaron: Who can take the lead on scheduling this? Robert? - Robert: Next week i'll cordinate with the appropriate folks - Cory: At a high-level, we are folding this into the cosmovisor working group. Is that how folks are seeing this? Is that where the conversation will continue? - Aaron: Yes- this is more generally about upgrade handling & cosmovisor - ### Deprecation of x/params - Aaron: I think we should deprecate the use of params as a specific set of governance parameters managed through the x/params module (not remove x/params entirely right away) - Background is that the amino serializatino of params is quite complicated - On the osmosis chain, they changed the minimum deposit amount to a non-funcitonal denom. At this point gov became unusable and they needed a hard fork for that - It is possible that if you change certain parameters at the wrong time, it could cause issues in the running of consensus - We've tried to bring in validation of certain params prior to them getting updated via governance but that also seems difficult - Taking a step back, everything that is done in params can be done in a custom governance handlers. We're now actually transitioning to moving away entirely from custom gov handlers, but instead use standard Msg processes - Bez: the only UX issue i see, is that module developers would have to do slightly more work to set up parameters - Now they'll have to define parameters, msg's and handlers for those msg's. - Dev: Writing a Msg in the SDK is a lot of bloat, relative to params, where you only define your validate method. I do fully agree with the view that we can look at this as a problem of access control & default access controls. But i'm not sure that a custom message is really what's best here. - For chain developers there's a lot of overhead to writing a tiny Msg - For explorers you're not getting the same thing for all param updates. - Aaron: Currently in the way params work is that its more or less arbitrary JSON, and a block explorer doesn't really know what the schema of that is. - Aaron: Is there not a lot of overhead in the keytable that currently needs to be setup? - Dev: It's pretty straightforward, and keytable happens in the backend. - Bez: Another benefit is that param initialization uses raw JSON, but with this model we'd now be able to use protobuf definitions. - Dev: I do agree that there are some strange behaviors params - Dev: It should be possible to make param validation a stateful thing, and execute some code on upgrading a parameter - Bez: I just think the Msg approach is much more flexible (hooks & side-effects would be covered by this) - Aaron: We have our meetings on tuesdays- there are actually two submeetings there (Dev is on the invite). - Dev: I can comment on the issue today. ### Decide on future of SDK.Dec - Aaron: I've moved this up in priority. We do want to change the underlying implementation of whats going on in the decimal package. There was a lot of back & forth on this already. In Regen we have used apd package with decimal128 precision. - It's now coming up at the point where the group module code is moving into the SDK, but there's a dependency of the decimal type we're using. - A number of folks have spoken in favor of decimal128 - Dev had encouraged some error tracking (which we haven't had a chance to do) - What is the path forward to come into alignemnt on what we proceed? - We actually have done more in depth testing of the apd implementation on our end - Dev: What are the goals? I remember one of the goals is to reduce the edge cases where things don't work. Floats don't have commutivity over addition. - Usually you don't run into these unless you're doing stress testing - The second framing-- we're also not treating precision in the right way we should be. In physics you quote your results to 3 sig-figs, but you still need to store much higher precision. - We shouldn't be treating the variable we put into state, as the variable we're treating in any intermediate steps. - If we explicitly chopped off what's written into state, then we could get more consistent answers - Aaron: So you're arguing for doing calculations with a higher precision than your result. A starting point question i have, is why not start with a GDA specification for decimals? - We have an implementation not based on a standard in the SDK at the moment - Do you have concrete criticism of GDA itself? Bc otherwise we can have a precision conversation within the context of a migration to GDA - Dev: The issue i have with GDA is that its float not dec. I've thought this was base10 floating point arithmetic. - Aaron: Under the hood, it takes a big integer, and an exponent. and aligning the exponent to match. - Dev: I believe this is the definition of floating point arithmetic is. - Dev: What i view as ideal for the setting we're in, is having a fixed decimal (similar to what we have currently). - Dev: None of the identified errors get solved with float though? - Aaron: The order of operations issue that we dealt with is solved by Flat - Dev: But that issue is still there in float, and becomes increasingly a problem when you compound more often. It may be true that float is better when these numbers are small, but i dont see how this help things when the numbers are bigger. - Aaron: And so the question should be about what is more accurate more of the time. - Dev: In error analysis you track this as ULP (unit in the last place), and how many times the last unit was wrong. - Aaron: Is there anyone else here who has context for decimal arithmetic & the specifications? - Dev: We could try and bring in someone from error analysis. I can try to ask someone from berkeley who used to do this. - Aaron: I think thats not a bad idea to bring in an expert, and write up an ADR saying we got xyz expert opinion. - Ryan: So for next steps Dev & Aaron will coordinate? - Dev: Sounds like its something for me first, to understand what our use case is. ### Event History / Event Hooks - Aaron: Currently events are not part of the state machine behavior - The proposal i've put out is to have event listeners similar to what we have in a number of places in the SDK - Everytime there's an event emitted, there's a hook function which provides a side effect and also potentially returns an error - First let's move to typed events, and then add in functionality to make events state machine breaking - There are some details for how this relates to tendermint - The alternate proposal is called 'abci event history' - The event manager would collect all events emitted in a block, and some code would read the history of events and then do something on that history of events - However- this makes events state machine breaking without providing strongly typed definition of events - Dev: I really like the proposal. One question- when i have a failed message? (let's say some IBC transfer just failed) Do failed messages get any events? - Aaron: I think its like the way we have atomic commits (it either fails or succeeds completely). - In terms of process for something like this- I think we should put this as an epic in the unprioritized Epic backlog. There are a number pieces that reequire a fair amount of work. - Cory: Does this fit within the context of any of the existing working groups? ### ADR044 - Update protobuf guidelines - Amaury: This PR was merged in draft mode, and so i'm not looking for any more reviews on this - The outstanding question is whether we plan on bumping version as part of v0.44 - Jim Larson: I did a writeup on compatible protobuf changes for hte cell expression language, and so i'd like to understand perhaps participate - Amaury: Could you send me a github or discord handle? ### Breaking protobuf into root packages - Robert: The idea is to split the packages by API & how things are used - mainly to not expose data access layer to the user - Usually how we store the data should not break the API - What's better an issue or discussion? - Robert: We've also been discussing this in the NFT pull request. ## Follow Up - [ ] Add Sam Kleinman and member of Provenance (Ira Miller) to Cosmovisor working group and schedule next meeting @robert-zaremba - [ ] Add event hooks as unprioritized epic - [ ] Add Jim Larson to protobuf v1 working group