tags: SimplyCode

# How to achieve simplicity

A common piece of advice for developers is to keep it simple. Create simple solutions. Write simple code. But what does this actually mean? What is simple? Is it:

- small?
- easy to understand?
- unsurprising or even boring?

And what is the reason behind striving for simplicity? How can we measure if code is simple enough?

Let's take a look at some existing software and see how they compare on simplicity, and what advantages that simplicity offers.

## SOAP, XML-RPC, REST

These days (2023) when connecting to a remote API, REST is the most common style you encounter. It is the clear winner of the three. But which of these is the simplest?

SOAP stands for Simple Object Access Protocol. So clearly it is simple, right? Just joking, of course not. SOAP was designed mostly by Microsoft. It is based on XML and XML-RPC. The most important difference with XML-RPC is that it adds encoding of custom classes of data. So you can define a class BirthDate in your C# code and export it in a SOAP service as the type BirthDate. For a client to automatically parse this class, they invented WSDL or Web Service Description Language, which is of course also XML.

Here is an example SOAP call:

```
POST /Quotation HTTP/1.0
Host: www.xyz.org
Content-Type: text/xml; charset=utf-8

<?xml version="1.0"?>
<SOAP-ENV:Envelope
  xmlns:SOAP-ENV="http://www.w3.org/2001/12/soap-envelope"
  SOAP-ENV:encodingStyle="http://www.w3.org/2001/12/soap-encoding">
  <SOAP-ENV:Body xmlns:m="http://www.xyz.org/quotations">
    <m:GetQuotation>
      <m:QuotationsName>Microsoft</m:QuotationsName>
    </m:GetQuotation>
  </SOAP-ENV:Body>
</SOAP-ENV:Envelope>
```

XML-RPC in contrast only supports a few basic data types. It is still XML though. It has no WSDL, so you need to read the documentation and code your client yourself.
And here is an example of an XML-RPC call:

```
POST /Quotation HTTP/1.0
Host: www.xyz.org
Content-Type: text/xml; charset=utf-8

<?xml version="1.0"?>
<methodCall>
  <methodName>GetQuotation</methodName>
  <params>
    <param>
      <value><string>microsoft</string></value>
    </param>
  </params>
</methodCall>
```

Clearly SOAP is easier: the client can be generated from the WSDL, so you need to write less code. So is it simpler? No, it is not.

Here is the problem: code generation does not result in simpler code. For a long time, SOAP libraries in Java and .NET had subtle differences, which meant that sometimes you had to dig deep into the generated code to find interoperability problems and solve them. So the high-level abstraction was leaky: you had to know what was going on under the hood to solve some problems.

In a truly simple system, there is no code under the hood. This means that you may have to do more work yourself, but there is no magic, everything is exposed. So XML-RPC is simpler.

Then how about REST? The first problem here is the definition of REST. There is a fairly well understood official definition, and then there is what is actually built 99% of the time. REST today is mostly JSON over HTTP.

JSON is clearly simpler than XML. There are just six types of value in JSON (string, number, boolean, null, array and object), vs. infinity in XML - using DTDs (Document Type Definitions). There are no namespaces or aliases in JSON. No imports, no custom tags.

However, JSON is also incapable of defining custom data types. You are limited to those six. There have been many attempts at encoding more information into JSON. One of the better known is JSON-LD, which adds type information with a special '@type' key. But to parse this correctly, you take a big step up in complexity again.

So REST (JSON over HTTP) is actually comparable to XML-RPC, from the viewpoint of a client. Where REST wins is on the server implementation. A REST server is just a web server.
You have a URL, you do a GET or POST (or DELETE or PUT) request to it, with some query parameters or a JSON body, and the server does something and returns a result or error as JSON.

Here is an example REST call:

```
GET /Quotation/microsoft HTTP/1.0
Host: www.xyz.org
Accept: application/json
```

A clear advantage of REST is that it adds almost no ceremony or syntax on top of the basic HTTP server layer. And JSON is extremely simple as well.

But the advantages go beyond that. One reason REST is so simple compared to XML-RPC or SOAP is that it is much better aligned with the underlying platform: HTTP.

REST uses HTTP verbs to implement a CRUD API (Create-Read-Update-Delete). It uses GET for information requests. This makes these requests cacheable. Not just client side, but in a proxy or server-side as well. Responses to POST requests are almost never cached in practice, so SOAP and XML-RPC, which send every call as a POST, have a disadvantage out of the gate.

REST uses the URL path structure to send information to the server. Instead of a separate parameter in a POST body or URL query string, you can just add it in the path. If done right, this makes the API simpler to inspect, and simple URLs are easier to share.

However, this simplicity also means that REST is in fact less expressive than SOAP. The JSON can usually only be parsed correctly by reading the documentation and writing very situational client code.

Enter [OpenAPI](https://www.openapis.org/) (which used to be called Swagger). Because WSDL was a good idea, right? So now we implement automatic client code generation not based on the actual server code, but on a hand-written YAML specification. What could go wrong?

Clearly there is a need for a more capable system than just JSON-over-HTTP. But OpenAPI generates client code disconnected from the actual server implementation. That is two strikes:

- code generation is never a good idea
- documentation always lies

## Code generation is never a good idea

The problem is not code generation per se. Compilers generate code.
The problem is what happens when you use that code. If the API changes, and the OpenAPI specification changes, then it is logical to assume you can just run the code generator again and get a new client library that just works. However, your code was written to call the old client library, not the newly generated one. So you need to find the differences between the old and new client, and refactor your own code. What are the differences though? Can you easily tell? What happens if you forget one call somewhere?

With enough changes it becomes more efficient to just write your own client with only the parts of the API you actually use. This gives you hand-written code that you know, and that can be designed to better fit your use case anyway.

Take for example Google. It has an enormous REST API, with all kinds of services. [It is well documented](https://developers.google.com/workspace). Suppose you have a PHP application that you want to connect to a Google service. You could write your own REST API HTTP client, hand-coded against the specific service you want to use. I've done that, and it took me about a day and about 300 lines of PHP code.

Or you can use Google's own PHP client library, which you can install with just one line in your composer.json file. And then you get 27,000 lines of PHP code added to your project. Worse, the PHP client code does not work identically to the REST API documentation. And the documentation is limited to a single README file and autogenerated class documentation.

As Rich Hickey says: [Simple is not always Easy](https://www.youtube.com/watch?v=SxdOUGdseq4)

## Documentation always lies

Code should be its own documentation. Because the code is always true. Documentation, even inline comments, has a life of its own. When written, the documentation is probably true. But code changes, and documentation often does not. So to find out how a system works, use the source. Or the unit tests, if they exist.

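
To make the idea of tests as documentation concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: `parse_quotation` is a hypothetical client helper, and the `name` and `price` fields are invented, not taken from any real quotation API. Each test name reads as a statement about how the helper behaves.

```python
import json
import unittest


def parse_quotation(body):
    """Hypothetical client helper: decode the JSON body of a quotation response."""
    data = json.loads(body)
    # The name is normalised to lower case, the price is converted to a float.
    return {"name": str(data["name"]).lower(), "price": float(data["price"])}


class ParseQuotationTest(unittest.TestCase):
    """Each test documents one use case and can be verified by running it."""

    def test_name_is_lowercased(self):
        result = parse_quotation('{"name": "Microsoft", "price": 1}')
        self.assertEqual(result["name"], "microsoft")

    def test_price_given_as_string_is_converted_to_float(self):
        result = parse_quotation('{"name": "microsoft", "price": "12.5"}')
        self.assertEqual(result["price"], 12.5)

    def test_missing_price_is_an_error(self):
        with self.assertRaises(KeyError):
            parse_quotation('{"name": "microsoft"}')
```

Running `python -m unittest` on a file containing this code executes all three tests. If the helper changes and a test is not updated, the test fails, which is exactly what a comment or README cannot do.
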
Unit tests are better documentation than the code, because you can check them by running them. And they should specify how the system is supposed to work. Good unit tests give you a cookbook of use cases.

Even online resources like Stack Overflow are not a good source of documentation. Very often you will find older answers to your question, only to find out that the system has changed. The recommended answer no longer works. Sometimes it is clear for which version the answer is supposed to work, but often you only have the date of the question and answers to guide you.

Now take the OpenAPI specification. This is usually a YAML file that is handwritten by someone. So when the API changes, the OpenAPI specification is kept up to date, right? Well, if the organization has a lot of focus on client libraries and developer experience, then probably yes. But the person writing the specification is usually not the developer implementing new features. So unless the specification is unit tested... well, as I said, documentation always lies.
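
What could "unit testing the specification" look like? Here is a minimal sketch in Python, under loud assumptions: `QUOTATION_SPEC` is a simplified stand-in for the hand-written YAML file and is not real OpenAPI syntax, `get_quotation` is a hypothetical server handler, and the field names are invented. The point is only that a test compares what the specification promises with what the code actually returns.

```python
# Hand-written description of the response, standing in for the YAML file.
# This format is a simplification invented for this sketch, not real OpenAPI.
QUOTATION_SPEC = {
    "name": str,
    "price": float,
    "currency": str,
}


def get_quotation(name):
    """Hypothetical server handler that the specification claims to describe."""
    return {"name": name.lower(), "price": 12.5, "currency": "USD"}


def spec_violations(spec, response):
    """Return a list of differences between the spec and an actual response."""
    problems = []
    for field, expected_type in spec.items():
        if field not in response:
            problems.append(f"missing field: {field}")
        elif not isinstance(response[field], expected_type):
            problems.append(f"wrong type for field: {field}")
    for field in response:
        if field not in spec:
            problems.append(f"undocumented field: {field}")
    return problems


def test_spec_matches_implementation():
    # Fails as soon as the handler and the specification drift apart.
    assert spec_violations(QUOTATION_SPEC, get_quotation("Microsoft")) == []
```

If a developer adds a field to the handler and forgets the specification, `spec_violations` reports it as undocumented and the test fails. A real setup would load the actual specification file and call the real server code, but the principle is the same: the specification only stays true if something executable checks it.
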