---
title: Simple JSONRPC requests in Rust (or is it?)
tags: Rust, JSONRPC, serde
---

This is a write-up of a dive into the rabbit hole of JSONRPC requests serialized with [serde].

[serde]: https://serde.rs/

## Motivation

We were trying to query a SnarkOS node and ran into the [issue] that a numeric JSON-RPC `Id` was no longer supported. While attempting to fix this seemingly simple issue, we discovered the world of the different JSON-RPC libraries and serde.

[issue]: https://github.com/AleoHQ/snarkOS/issues/1369

## The original problem

The [line that failed] is simply where the data is deserialized into a `Request` type from the [json_rpc_types] crate:

```rust
let req: jrt::Request<Params> = match serde_json::from_slice(&data) { ... }
```

The failure happens when the request `id` is numeric:

```sh
'{"jsonrpc":"2.0","method":"latestblockheight","params":[],"id":1}'
# as opposed to
'{"jsonrpc":"2.0","method":"latestblockheight","params":[],"id":"1"}'
```

and the error message is:

```sh
thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: Error("invalid type: map, expected string or number"`
```

This is caused by [the bug] that shows up when [serde_json]'s `arbitrary_precision` feature is enabled. That bug has in turn been attributed to an [internal buffering] issue in serde.

[internal buffering]: https://github.com/serde-rs/serde/issues/1183
[the bug]: https://github.com/serde-rs/json/issues/505
[line that failed]: https://github.com/AleoHQ/snarkOS/blob/b15ae1fd7c7afdc1bbc2ae561116f14ce4aacd50/src/rpc/rpc.rs#L144D
[json_rpc_types]: https://github.com/DoumanAsh/json-rpc-types/blob/master/src/id.rs
[serde_json]: https://docs.serde.rs/serde_json/

## The Basics

> A data structure that knows how to serialize and deserialize itself is one that implements Serde's Serialize and Deserialize traits (or uses Serde's derive attribute to automatically generate implementations at compile time). This avoids any overhead of reflection or runtime type information.

### The `Deserialize` Trait

An implementation of the `Deserialize` trait looks like this:

```rust
impl<'de> Deserialize<'de> for Input {
    fn deserialize<D>(deserializer: D) -> Result<Self, <D as serde::Deserializer<'de>>::Error>
    where
        D: serde::Deserializer<'de>,
    {
        // Schematic only: `deserialize_*` stands for whichever hint method
        // matches the type being deserialized. For a struct, for example,
        // it would be `deserializer.deserialize_struct(...)`, see
        // https://docs.serde.rs/serde/trait.Deserializer.html#tymethod.deserialize_struct
        deserializer.deserialize_*(_args, OurVisitor)
    }
}
```

> The deserializer is responsible for mapping the input data into Serde's data model by invoking exactly one of the methods on the Visitor that it receives.

> The Deserializer methods are called by a Deserialize impl as a hint to indicate what Serde data model type the Deserialize type expects to see in the input.

There are many deserializer methods. Each `deserialize_*` method gives a hint as to what is about to be deserialized. If the data format is self-describing, like JSON, the hint is not so important. In that case the `Deserialize` impl only needs to call `deserializer.deserialize_any()`, and `deserialize_any` then inspects the input and dispatches to the method for the specific type, like this:

```rust
// impl Deserializer
fn deserialize_any<V>(self, visitor: V) -> Result<V::Value>
where
    V: Visitor<'de>,
{
    // Look at the next character to decide which JSON type follows.
    match self.peek_char()? {
        'n' => self.deserialize_unit(visitor),
        't' | 'f' => self.deserialize_bool(visitor),
        '"' => self.deserialize_str(visitor),
        '0'..='9' => self.deserialize_u64(visitor),
        '-' => self.deserialize_i64(visitor),
        '[' => self.deserialize_seq(visitor),
        '{' => self.deserialize_map(visitor),
        _ => Err(Error::Syntax),
    }
}
```

However, data formats that are not self-describing, such as Bincode, rely on the hint to know exactly what data to expect and which visitor method to call.

Now, what is `OurVisitor`?

### The `Visitor` Trait

It is a struct that implements the `Visitor` trait, which provides the `visit_*` methods used to deserialize.
The `Deserializer` "drives" the visit by deciding which method of the visitor it will use. In other words, the deserializer picks one of the visitor's methods; each method expects a specific input type and returns the visitor's `Value` type.

Here is an example of a visitor; [see Visitor] for more details.

[see visitor]: https://docs.serde.rs/serde/de/trait.Visitor.html

```rust
use std::fmt;

use serde::de::{self, Visitor};

struct I32Visitor;

impl<'de> Visitor<'de> for I32Visitor {
    // This is the type to be returned by this Visitor
    type Value = i32;

    // This text is used in the error message when none of the visit_*
    // methods can handle the input, for example when the input is a string
    fn expecting(&self, formatter: &mut fmt::Formatter) -> fmt::Result {
        formatter.write_str("an integer between -2^31 and 2^31")
    }

    fn visit_i8<E>(self, value: i8) -> Result<Self::Value, E>
    where
        E: de::Error,
    {
        Ok(i32::from(value))
    }

    // Similar for other methods:
    //   - visit_i32
    //   - visit_i64
    //   - visit_i16
    //   - visit_u8
    //   - visit_u16
    //   - visit_u32
    //   - visit_u64
}
```

Having said all this, the `#[derive(Deserialize)]` macro usually handles all of the above for custom types (structs, enums, etc.).

## Back to the original problem

The struct that we need to deserialize into is:

```rust
#[derive(Clone, Debug, PartialEq, Deserialize, Serialize)]
#[serde(deny_unknown_fields)]
pub struct Request<P, T=StrBuf> {
    ///A String specifying the version of the JSON-RPC protocol.
    pub jsonrpc: Version,

    ///A String containing the name of the method to be invoked
    ///
    ///By default is static buffer of 32 bytes.
    pub method: T,

    #[serde(skip_serializing_if = "Option::is_none")]
    ///A Structured value that holds the parameter values to be used during the invocation of the method
    pub params: Option<P>,

    #[serde(skip_serializing_if = "Option::is_none")]
    ///An identifier established by the Client.
    ///
    ///If not present, request is notification to which
    ///there should be no response.
    pub id: Option<Id>,
}

#[derive(Debug, PartialEq, Clone, Hash, Eq)]
pub enum Id {
    /// Numeric id
    Num(u64),
    /// String id, maximum 36 characters which works for UUID
    Str(StrBuf),
}
```

`Id` implements its own `Deserialize`, because it uses [StrBuf], a stack-based string for `no_std` environments, which does not come with a `Deserialize` implementation.

[strbuf]: https://docs.rs/str-buf/latest/str_buf/struct.StrBuf.html

The [suggested workaround] for the original [issue] with deserializing a numeric `Id` was to first deserialize the bytes into the `serde_json::Value` type. This workaround was then used in a [PR](https://github.com/AleoHQ/snarkOS/pull/1539) to fix the [issue].

[suggested workaround]: https://github.com/serde-rs/json/issues/505#issuecomment-489265253

All is well, and now we get a new error message ( :tada: ) from the previous [line that failed]:

```sh
invalid type: string \"2.0\", expected a borrowed string
```

This error is no longer about the `Id` type but about the [Version] type. This error message has been [discussed](https://github.com/serde-rs/serde/issues/1413) before (of course). But let me try to explain it and combine explanations from discussions with J and L.

[Version]: https://github.com/DoumanAsh/json-rpc-types/blob/master/src/version.rs

## New error - Expected borrowed string?

To deserialize into `Version`, this was implemented:

```rust
impl<'a> Deserialize<'a> for Version {
    fn deserialize<D: Deserializer<'a>>(des: D) -> Result<Self, D::Error> {
        // This line caused the issue: it asks for a string slice that
        // borrows from the input for the whole lifetime 'a
        let text: &'a str = Deserialize::deserialize(des)?;
        match text {
            "2.0" => Ok(Version::V2),
            _ => Err(serde::de::Error::custom("Invalid version. Allowed: 2.0")),
        }
    }
}
```

Now, recall that we have implemented the workaround, so the request body is already a `serde_json::Value`. The deserializer is therefore working on the `serde_json::Value::String(string)` variant.
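
Before looking at the cause, it may help to see the mechanism in miniature. The following is a toy model in plain `std` Rust, not serde's actual code: `ToyVisitor`, `drive_str`, `drive_owned`, `unescape` and `BorrowedStr` are hypothetical names invented for this sketch. It assumes, in line with the descriptions in this post, that a deserializer offers a borrowed slice only when it can point straight into the input, and otherwise offers a short-lived `&str` into a string it built itself. Treating an owned `Value::String` as the second case is an assumption of the model.

```rust
use std::borrow::Cow;

/// Toy stand-in for serde's `Visitor` (hypothetical, not the real trait):
/// only the two string callbacks that matter for this error.
trait ToyVisitor<'de>: Sized {
    type Value;

    /// What this visitor expects, used in the error message.
    fn expecting() -> &'static str;

    /// Receives a string that is only valid for the duration of the call.
    fn visit_str(self, v: &str) -> Result<Self::Value, String> {
        Err(format!("invalid type: string {:?}, expected {}", v, Self::expecting()))
    }

    /// Receives a slice pointing into the input; falls back to `visit_str`.
    fn visit_borrowed_str(self, v: &'de str) -> Result<Self::Value, String> {
        self.visit_str(v)
    }
}

/// Resolves `\n`, `\"` and `\\`. Copies only if an escape is present.
fn unescape(raw: &str) -> Cow<'_, str> {
    if !raw.contains('\\') {
        return Cow::Borrowed(raw);
    }
    let mut out = String::with_capacity(raw.len());
    let mut chars = raw.chars();
    while let Some(c) = chars.next() {
        if c != '\\' {
            out.push(c);
            continue;
        }
        match chars.next() {
            Some('n') => out.push('\n'),
            Some(other) => out.push(other), // covers \" and \\
            None => {}
        }
    }
    Cow::Owned(out)
}

/// Toy "deserializer" over borrowed input: it decides which method to call.
fn drive_str<'de, V: ToyVisitor<'de>>(raw: &'de str, visitor: V) -> Result<V::Value, String> {
    match unescape(raw) {
        Cow::Borrowed(s) => visitor.visit_borrowed_str(s),
        Cow::Owned(s) => visitor.visit_str(&s),
    }
}

/// Toy "deserializer" over an owned string: there is no input to borrow from.
fn drive_owned<'de, V: ToyVisitor<'de>>(owned: String, visitor: V) -> Result<V::Value, String> {
    visitor.visit_str(&owned)
}

/// Models asking for `&'de str`: only a borrowed slice is acceptable.
struct BorrowedStr;

impl<'de> ToyVisitor<'de> for BorrowedStr {
    type Value = &'de str;

    fn expecting() -> &'static str {
        "a borrowed string"
    }

    fn visit_borrowed_str(self, v: &'de str) -> Result<&'de str, String> {
        Ok(v)
    }
}

#[derive(Debug, PartialEq)]
enum Version {
    V2,
}

/// Models a visitor that only inspects the string, so any `&str` will do.
struct VersionVisitor;

impl<'de> ToyVisitor<'de> for VersionVisitor {
    type Value = Version;

    fn expecting() -> &'static str {
        "the string \"2.0\""
    }

    fn visit_str(self, v: &str) -> Result<Version, String> {
        match v {
            "2.0" => Ok(Version::V2),
            _ => Err("Invalid version. Allowed: 2.0".to_string()),
        }
    }
}
```

In this model, `drive_str("2.0", BorrowedStr)` succeeds because the slice can be borrowed from the input, while `drive_owned("2.0".to_string(), BorrowedStr)` fails with a "expected a borrowed string" message even though the text contains no escapes. A visitor that implements `visit_str`, such as `VersionVisitor`, accepts both paths.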

The reason deserializing from the JSON string caused the panic is that the string input contains an escape sequence (introduced by the metacharacter `\`) that needs to be unescaped, i.e. the JSON representation of the string is not the same as what it should be in memory. This [article] explains it in great detail.

This means that deserializing the result into `text: &'a str` will not work: the serde_json deserializer has to produce a new `String`, because it needs ownership to rewrite the input with the escape sequences resolved. A `&'a str` can only be produced when the deserializer is able to hand out a slice that borrows directly from the input.

[Here](https://docs.serde.rs/src/serde_json/de.rs.html#1492) is the implementation that shows how the deserializer works. It [parses the str] and copies it in order to do the unescaping.

The usual solution is to deserialize into `Cow<'a, str>`. However, the `json_rpc_types` crate is `no_std`.

[article]: https://d3lm.medium.com/rust-beware-of-escape-sequences-85ec90e9e243
[parses the str]: https://github.com/serde-rs/json/blob/master/src/read.rs#L337

### Solution

We [implemented] the `Deserialize` trait by giving the deserializer the hint that it should expect a `str`, and by passing it a visitor with a `visit_str()` method.

```rust
impl<'a> Deserialize<'a> for Version {
    fn deserialize<D: Deserializer<'a>>(des: D) -> Result<Self, D::Error> {
        // Hint that a string is expected and let the visitor inspect it
        des.deserialize_str(VersionVisitor)
    }
}

struct VersionVisitor;

impl<'a> Visitor<'a> for VersionVisitor {
    type Value = Version;

    #[inline]
    fn expecting(&self, formatter: &mut fmt::Formatter) -> fmt::Result {
        formatter.write_str("Identifier must be a string")
    }

    // Accepts a `&str` of any lifetime, so a copied string is fine too
    #[inline]
    fn visit_str<E: Error>(self, v: &str) -> Result<Self::Value, E> {
        match v {
            "2.0" => Ok(Version::V2),
            _ => Err(serde::de::Error::custom("Invalid version. Allowed: 2.0")),
        }
    }
}
```

**But WHY???** Why does this work? It looks like it is still trying to visit a `str`, when we have just said that the deserializer makes a copy when it [parses the str].

It works because the `visit_str()` call happens [after the string has been parsed and copied](https://github.com/serde-rs/json/blob/master/src/de.rs#L1519). At that point the visitor only receives a short-lived borrow of the new, already unescaped string, and `visit_str()` does not require that borrow to live as long as the input.

[implemented]: https://github.com/DoumanAsh/json-rpc-types/pull/2

What remains is **WHY** this new error was even there, i.e. why does `serde_json::from_value(Value)` add these escape sequences, turning `"2.0"` into `\"2.0\"`?

_That, my friends, is for another time._