# F. Asynchronism

![](https://i.imgur.com/8lIanxd.png)

Asynchronous workflows help reduce request times for expensive operations that would otherwise be performed inline. They can also help by doing time-consuming work in advance, such as periodic aggregation of data.

## Message queues

Message queues receive, hold, and deliver messages. If an operation is too slow to perform inline, you can use a message queue with the following workflow:

* An application publishes a job to the queue, then notifies the user of job status
* A worker picks up the job from the queue, processes it, then signals the job is complete

The user is not blocked and the job is processed in the background. During this time, the client might optionally do a small amount of processing to make it seem like the task has completed. For example, if posting a tweet, the tweet could be instantly posted to your timeline, but it could take some time before your tweet is actually delivered to all of your followers.

* Redis is useful as a simple message broker, but messages can be lost.
* RabbitMQ is popular but requires you to adapt to the AMQP protocol and manage your own nodes.
* Amazon SQS is hosted but can have high latency, and messages may be delivered twice.

### RabbitMQ

[RabbitMQ](https://jack-vanlightly.com/blog/2017/12/4/rabbitmq-vs-kafka-part-1-messaging-topologies) is a message broker that implements the Advanced Message Queuing Protocol (AMQP). It provides client libraries for major programming languages.

**AMQP** (Advanced Message Queuing Protocol) is a messaging protocol that enables conforming client applications to communicate with conforming messaging middleware brokers.

**Messaging brokers** receive messages from publishers (applications that publish them, also known as producers) and route them to consumers (applications that process them).
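The publish-then-process workflow from the Message queues section, with a broker sitting between the publisher and the consumer, can be sketched in a single process. This is only an illustration under stated assumptions: Python's `queue.Queue` stands in for a real broker such as RabbitMQ, and the names `publish`, `worker`, `job_status`, and `run_demo` are hypothetical.

```python
import queue
import threading

# In-process stand-in for a broker queue (assumption: a real system would
# use RabbitMQ, SQS, Redis, etc. instead of queue.Queue).
jobs = queue.Queue()
job_status = {}   # hypothetical status store: job_id -> "queued" | "done"
results = {}      # hypothetical result store: job_id -> output
lock = threading.Lock()


def publish(job_id, payload):
    """Application side: enqueue the job and return without waiting."""
    with lock:
        job_status[job_id] = "queued"
    jobs.put((job_id, payload))
    return "queued"  # the user is told the job was accepted, not finished


def worker():
    """Worker side: take jobs off the queue, process them, mark them done."""
    while True:
        item = jobs.get()
        if item is None:          # sentinel: no more work
            jobs.task_done()
            break
        job_id, payload = item
        output = payload.upper()  # stand-in for the slow operation
        with lock:
            results[job_id] = output
            job_status[job_id] = "done"
        jobs.task_done()


def run_demo():
    t = threading.Thread(target=worker, daemon=True)
    t.start()
    publish("job-1", "resize image")
    publish("job-2", "send email")
    jobs.put(None)   # tell the worker to stop after the queued jobs
    jobs.join()      # wait until every queued item has been processed
    t.join()
    return dict(job_status), dict(results)


if __name__ == "__main__":
    print(run_demo())
```

The point of the sketch is that `publish` returns as soon as the job is enqueued, while the slow work happens on the worker thread; a real deployment would replace the in-memory queue and dictionaries with a broker and a durable status store.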
#### Types of exchanges

![](https://i.imgur.com/x2oV3xM.png)

* **Direct**: The message is routed to the queues whose binding key exactly matches the routing key of the message. For example, if the queue is bound to the exchange with the binding key `pdfprocess`, a message published to the exchange with the routing key `pdfprocess` is routed to that queue.
* **Fanout**: A fanout exchange routes messages to all of the queues bound to it.
* **Topic**: The topic exchange does a wildcard match between the routing key and the routing pattern specified in the binding.
* **Headers**: Headers exchanges use the message header attributes for routing.

![](https://i.imgur.com/7jWEyDA.png)

#### Guarantees

RabbitMQ offers **"at most once delivery"** and **"at least once delivery"** but not "exactly once delivery" guarantees. The linked series takes a deeper look at message delivery guarantees in its Part 4.

Messages are **delivered in order** of their arrival to the queue (that is the definition of a queue, after all). This does not guarantee that message processing completes in that same order when you have competing consumers. This is no fault of RabbitMQ but a fundamental reality of processing an ordered set of messages in parallel. This problem can be resolved by using the Consistent Hashing Exchange.

RabbitMQ supports both PUSH and PULL based consuming, but performs better with **PUSH**.

### ActiveMQ

ActiveMQ makes use of the Java Message Service (JMS) API, which defines a standard for software to use in creating, sending, and receiving messages.

ActiveMQ sends messages between client applications: **producers**, which create messages and submit them for delivery, and **consumers**, which receive and process messages.
The ActiveMQ broker routes each message through one of two types of destinations:

* a queue, where it awaits delivery to a single consumer (in a messaging domain called **point-to-point**), or
* a topic, to be delivered to multiple consumers that are subscribed to that topic (in a messaging domain called publish/subscribe, or **“pub/sub”**)

![](https://i.imgur.com/Nc09Qpr.png)

#### Availability

ActiveMQ supports:

* Consumer clusters (competing consumers)
* A cluster of stand-alone brokers (with failover) or a network of brokers (if a broker has publishers but no consumers, it can forward a message to another broker)
* Master-slave replication

#### Amazon MQ

Amazon MQ is a managed message broker service for Apache ActiveMQ that makes it easy to set up and operate message brokers in the cloud. Message brokers allow different software systems, often using different programming languages and running on different platforms, to communicate and exchange information. Amazon MQ reduces your operational load by managing the provisioning, setup, and maintenance of ActiveMQ, a popular open-source message broker.

### AWS SQS

Amazon Simple Queue Service (Amazon SQS) offers a secure, durable, and available hosted queue that lets you integrate and decouple distributed software systems and components. Amazon SQS offers common constructs such as dead-letter queues and cost allocation tags.

#### Message Order and Delivery Guarantee

To achieve high levels of scalability and redundancy, SQS relaxes some of the guarantees of a traditional queuing system. On rare occasions messages may be delivered out of order and more than once, but they will get delivered and no message will be lost. Applications sensitive to duplicated or out-of-order processing need to implement logic to cover these scenarios.
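One common way to cover the duplicate-delivery case is to make the consumer idempotent by remembering which message IDs it has already handled. The following is a minimal sketch, not an SQS client: `IdempotentConsumer` and its in-memory `_seen` set are hypothetical, and a real application would keep the processed IDs in a durable store. It does not address out-of-order delivery, which typically needs sequence numbers or versioned updates on top.

```python
class IdempotentConsumer:
    """Skips messages whose ID has already been processed.

    Assumption: every message carries a stable, unique ID, and the set of
    processed IDs lives in memory (a real system would persist it).
    """

    def __init__(self, handler):
        self._handler = handler
        self._seen = set()

    def handle(self, message_id, body):
        """Process the message once; return False for a duplicate."""
        if message_id in self._seen:
            return False
        self._handler(body)
        # Record the ID only after the handler succeeds, so a failed
        # attempt can be retried on redelivery.
        self._seen.add(message_id)
        return True


if __name__ == "__main__":
    processed = []
    consumer = IdempotentConsumer(processed.append)
    # The third delivery repeats message "m1", as an at-least-once queue may do.
    for message_id, body in [("m1", "charge card"), ("m2", "send receipt"), ("m1", "charge card")]:
        consumer.handle(message_id, body)
    print(processed)
```

With this in place the side effect for `m1` happens once even though the message is delivered twice.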
#### Scalability and Performance

SQS does not return from a SendMessage request until the message has been successfully stored, and as a result it has a request-response latency of around 20 ms. At first glance this may suggest that it cannot handle more than a few hundred messages per second. However, when dealing with a distributed queue like SQS, one has to distinguish between latency and throughput. SQS scales horizontally. By using multiple threads it is possible to increase message throughput almost indefinitely.

### Kafka

Apache [Kafka](https://jack-vanlightly.com/blog/2017/12/4/rabbitmq-vs-kafka-part-1-messaging-topologies) isn’t an implementation of a message broker. Instead, it’s a **distributed streaming platform**.

Kafka is a distributed, replicated commit log.

* **Distributed** because Kafka is deployed as a cluster of nodes, for both fault tolerance and scale.
* **Replicated** because messages are usually replicated across multiple nodes (servers).
* **Commit Log** because messages are stored in partitioned, append-only logs, which are called Topics. This concept of a log is the principal killer feature of Kafka.

Kafka doesn’t implement the notion of a queue. Instead, Kafka stores collections of records in categories called **topics**. For each topic, Kafka maintains a partitioned log of messages. Each partition is an ordered, immutable sequence of records, where messages are continually appended.

Kafka appends messages to these partitions as they arrive. By default, messages without a key are spread uniformly across partitions (round-robin in older clients). Producers can modify this behavior to create logical streams of messages. For example, in a multitenant application, we might want to create logical message streams according to every message’s tenant ID. Making sure all messages from the same logical stream map to the same partition guarantees their delivery in order to consumers.
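The partitioning idea above can be sketched with a toy in-memory log. This is an illustration only and does not use a Kafka client: `choose_partition` and `append` are hypothetical helpers, and `zlib.crc32` stands in for the hash a real producer would use.

```python
import zlib
from itertools import count

NUM_PARTITIONS = 4
_round_robin = count()  # shared counter for messages that have no key


def choose_partition(key, num_partitions=NUM_PARTITIONS):
    """Pick a partition: hash the key if present, otherwise rotate.

    Assumption: crc32 is only a stand-in for a real partitioner's hash.
    """
    if key is None:
        return next(_round_robin) % num_partitions
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def append(log, key, value):
    """Append value to the partition chosen for key; return (partition, offset)."""
    partition = choose_partition(key, len(log))
    log[partition].append(value)
    return partition, len(log[partition]) - 1


if __name__ == "__main__":
    log = [[] for _ in range(NUM_PARTITIONS)]
    # Messages keyed by tenant ID: each tenant's stream stays in one partition.
    for tenant, event in [("tenant-a", "a1"), ("tenant-b", "b1"), ("tenant-a", "a2")]:
        print(tenant, event, append(log, tenant, event))
```

Because the partition depends only on the key, every message for `tenant-a` lands in the same partition, in the order it was produced, which is what preserves per-tenant ordering for consumers.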
![](https://i.imgur.com/4kBFg9B.png)

Each consumer tracks where it is in the log: it has a pointer to the last message consumed, and this pointer is called the offset. Consumers maintain this offset via the client libraries, and depending on the version of Kafka the offset is stored either in ZooKeeper or in Kafka itself.

**ZooKeeper** is a distributed consensus technology used by many distributed systems for things like leader election. Kafka relies on ZooKeeper for managing the state of the cluster (newer releases can also run without it, using KRaft).

Kafka does not support competing consumers on a single partition; Kafka's unit of parallelism is the partition itself. Within a consumer group, each partition is read by at most one consumer, so if two consumers read from the same partition (as members of different groups), both of them receive all messages. The implication is that **you need at least as many partitions as the most scaled out consumer**.

Kafka is a **PULL**-based messaging system.

## Comparisons

### Kafka vs RabbitMQ

**Use Kafka if you need:**

* Time travel/durable/commit log
* Many consumers for the same message
* High throughput (millions of messages per second)
* Stream processing
* Replicability
* High availability
* Message order

**Use RabbitMQ if you need:**

* Flexible routing
* Priority queue
* A standard protocol message queue