Difference between Message Queues and Pub-Sub – RabbitMQ vs Kafka

Message queue vs publish-subscribe

Messaging queues

  • A message queue is typically designed for one-to-one (point-to-point) communication, also known as the producer-consumer pattern.
  • A message queue receives incoming messages and ensures that each message for a given topic or channel is delivered to and processed by only one consumer. The aim is that each message is processed once, although in practice many brokers redeliver a message that was not acknowledged, so consumers may still see duplicates.
  • To ensure that a message is processed by only one consumer, each message is deleted from the queue once a consumer has received and processed it.
  • A message queue can support high rates of consumption by adding multiple consumers for each topic, but only one consumer will receive each message on the topic.
  • All the consumers do the same thing, which makes it more like an imperative paradigm.
  • Which consumer receives which message is determined by the implementation of the message queue.
  • A message queue keeps queued messages in published order, even in the face of requeues or channel closure.
  • You can use message groups to partition your traffic and preserve ordering within each group.
  • It’s worth taking a look at the Wikipedia page: https://en.wikipedia.org/wiki/Message_queue
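The point-to-point behaviour described above can be illustrated with a small in-memory sketch. Everything here (`ToyQueue`, `drain`, the worker names, and the round-robin dispatch) is hypothetical and only meant to show that each message ends up with a single consumer; a real broker chooses its own dispatch policy.

```python
from collections import deque


class ToyQueue:
    """Toy point-to-point queue: each message is handed to one consumer only."""

    def __init__(self):
        self._messages = deque()

    def publish(self, message):
        self._messages.append(message)

    def receive(self):
        # Removing the message on receipt keeps it from reaching a second consumer.
        return self._messages.popleft() if self._messages else None


def drain(queue, consumer_names):
    """Hand out messages round-robin (an assumption, not a broker's actual policy)."""
    delivered = {name: [] for name in consumer_names}
    turn = 0
    while True:
        message = queue.receive()
        if message is None:
            break
        name = consumer_names[turn % len(consumer_names)]
        delivered[name].append(message)
        turn += 1
    return delivered


queue = ToyQueue()
for i in range(6):
    queue.publish(f"job-{i}")

result = drain(queue, ["worker-a", "worker-b"])
```

Adding a third name to the list spreads the same six jobs over three workers, which is the sense in which a queue scales consumption by adding consumers.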

Publish-subscribe system

  • Pub-sub ensures at-least-once processing.
  • You can have any number of subscribers (including zero) listening to the same messages (topics).
  • Subscribers can do different things, which makes it more like a reactive paradigm.
  • A message is removed from the topic only once all of its subscribers have consumed it. (A retention policy can keep messages for longer.)
  • Pub-sub does not guarantee message ordering across partitions.
  • Within a partition, each consumer receives messages in the exact order in which the messaging system received them.
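As a rough illustration of the fan-out behaviour above, here is a minimal in-memory sketch. `ToyTopicBroker`, the topic names, and the two subscribers are all made up for this example; it ignores partitions, retention, and delivery guarantees entirely.

```python
from collections import defaultdict


class ToyTopicBroker:
    """Toy pub-sub broker: every subscriber of a topic sees every message."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, callback):
        self._subscribers[topic].append(callback)

    def publish(self, topic, message):
        # Each subscriber gets the message; with zero subscribers nothing happens.
        for callback in self._subscribers[topic]:
            callback(message)
        return len(self._subscribers[topic])


broker = ToyTopicBroker()

# Two subscribers doing different things with the same messages.
audit_log = []
totals = {"sum": 0}


def add_to_total(message):
    totals["sum"] += message["amount"]


broker.subscribe("orders", audit_log.append)
broker.subscribe("orders", add_to_total)

broker.publish("orders", {"id": 1, "amount": 30})
broker.publish("orders", {"id": 2, "amount": 12})

# Publishing to a topic nobody listens to is still valid.
unheard = broker.publish("refunds", {"id": 3, "amount": 5})
```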

Implementations

AMQP (Advanced Message Queuing Protocol) is a standard protocol that many message queue implementations follow; you can read about its objectives and mission at https://www.amqp.org/about/what.

AMQP is a messaging protocol that originated in the financial industry. Its core metaphors are Messages, Exchanges, and Queues. It was created in 2003 by John O’Hara at JPMorgan Chase in London.

RabbitMQ, ActiveMQ, IBM MQ, StormMQ, and Amazon SQS are common implementations of a message queue.

Apache Kafka, PubNub, MQTT, Google Pub/Sub (managed), and Amazon Kinesis (managed) are implementations of the publish-subscribe pattern.

MQTT is a machine-to-machine (M2M)/“Internet of Things” connectivity protocol. It was designed as an extremely lightweight publish/subscribe messaging transport.

Kafka was designed as a log-structured distributed datastore. It has features that make it suitable for use as part of a messaging system, but it can also accommodate other use cases, such as stream processing and data pipelines. It originated at LinkedIn, and its core metaphors are Messages, Topics, and Partitions.

Difference between RabbitMQ and Kafka

By design the two differ in some ways and overlap in others. One is a message queue and the other is pub-sub, but both can be configured as a queue (one to one), fan-out (one to many), or pub-sub (many to many). So let’s look at the use cases and figure out when to use which.

Use Kafka if you need to deliver a huge number of events/messages (around 50k per second) in partitioned order, ‘at least once’, to a mix of online and batch consumers and, most importantly, you’re OK with your consumers managing the state of their “cursor” on the Kafka topic.

Kafka’s main superpower is that it is less like a queue system and more like a circular buffer that scales with the disk capacity of your cluster, which lets you re-read messages.
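To make the “cursor” idea concrete, here is a minimal sketch of an append-only log whose consumers each keep their own offset. `ToyLog` and `ToyConsumer` are hypothetical names and not part of any Kafka client; the sketch also leaves out retention, so old records are never dropped.

```python
class ToyLog:
    """Toy append-only log: reading never removes records."""

    def __init__(self):
        self._records = []

    def append(self, record):
        self._records.append(record)
        return len(self._records) - 1  # offset of the record just written

    def read(self, offset, max_records=10):
        return self._records[offset:offset + max_records]


class ToyConsumer:
    """The consumer, not the log, remembers how far it has read."""

    def __init__(self, log):
        self._log = log
        self.offset = 0

    def poll(self, max_records=10):
        batch = self._log.read(self.offset, max_records)
        self.offset += len(batch)
        return batch

    def seek(self, offset):
        # Moving the cursor backward is all it takes to re-read old records.
        self.offset = offset


log = ToyLog()
for i in range(5):
    log.append(f"event-{i}")

consumer = ToyConsumer(log)
first_pass = consumer.poll()

consumer.seek(2)
replay = consumer.poll()

# A consumer that joins later starts from its own cursor, independently.
late_consumer = ToyConsumer(log)
late_batch = late_consumer.poll(max_records=2)
```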

Use RabbitMQ if you have messages (around 20k per second per queue) that need to be routed in complex ways to consumers, you want per-message delivery guarantees, you need one or more features of protocols like AMQP 0.9.1, 1.0, MQTT, or STOMP, and you want the broker to manage the state of which consumer has been delivered which message.

Functional Differences

  • RabbitMQ tracks message state (acknowledged, unacknowledged, rejected, etc.), whereas Kafka leaves this responsibility to consumers.
  • RabbitMQ pushes messages to the consumer; in Kafka, a consumer fetches messages from a Kafka partition.
  • RabbitMQ preserves message order per queue; Kafka preserves it per partition.
  • In Kafka you can re-consume old messages simply by moving the consumer’s offset backward; in RabbitMQ, consumed messages are deleted from the queue once successfully acknowledged.
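The broker-side bookkeeping in the first and last bullets can be sketched as follows. `ToyAckQueue` and its methods are hypothetical and do not mirror any RabbitMQ client API; putting a rejected message back at the head of the queue is an assumption made to keep the original order.

```python
from collections import deque


class ToyAckQueue:
    """Toy broker-side state: ready messages plus delivered-but-unacked ones."""

    def __init__(self):
        self._ready = deque()
        self._unacked = {}
        self._next_tag = 1

    def publish(self, body):
        self._ready.append(body)

    def deliver(self):
        if not self._ready:
            return None
        body = self._ready.popleft()
        tag = self._next_tag
        self._next_tag += 1
        self._unacked[tag] = body  # the broker remembers what is in flight
        return tag, body

    def ack(self, tag):
        # A successful ack is what finally deletes the message.
        del self._unacked[tag]

    def reject(self, tag, requeue=True):
        body = self._unacked.pop(tag)
        if requeue:
            self._ready.appendleft(body)  # assumption: return to the head

    def depth(self):
        return len(self._ready), len(self._unacked)


q = ToyAckQueue()
q.publish("a")
q.publish("b")

tag, body = q.deliver()
q.reject(tag)                    # processing failed, hand it back

tag, body_again = q.deliver()    # the same message comes around again
q.ack(tag)

tag_b, body_b = q.deliver()
before_ack = q.depth()
q.ack(tag_b)
after_ack = q.depth()
```

Once a message is acked here there is no way to read it again, which is the contrast with rewinding an offset in Kafka.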

Primary use case difference

  • Apache Kafka: building applications that process and re-process streamed data on disk, such as event processing, data processing, and replay.
  • RabbitMQ: processing high-throughput, reliable background jobs, plus communication and integration within and between applications.

Scaling differences

  • Kafka scales horizontally, whereas RabbitMQ mostly scales vertically.
  • Kafka uses disk storage and partitions to scale throughput, allowing more producers to generate more messages.
  • RabbitMQ tracks all message state and is responsible for pushing messages to consumers, so it needs more memory and scales well on a cluster with plenty of memory available.
  • Scaling a Kafka cluster is easier than scaling a RabbitMQ cluster.
  • RabbitMQ handles its messages largely in memory and thus uses a large cluster, whereas Kafka leverages sequential disk I/O and requires less hardware.
  • You can scale consumers of a queue simply by deploying another consumer instance, and both consumers can consume from the same queue. To scale consumers of a Kafka topic, however, you need to add another partition, and the global order of messages is lost (remember, Kafka keeps message order per partition).
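The per-partition ordering mentioned in the last bullet can be illustrated with a small sketch of key-based partitioning. The function names, the sample events, and the use of CRC32 as the hash are all assumptions for illustration; a real client uses its own partitioner.

```python
import zlib


def partition_for(key, num_partitions):
    """Pick a partition from a stable hash of the key (CRC32 is a stand-in)."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def route(events, num_partitions):
    """Append each (key, value) event to the partition its key maps to."""
    partitions = [[] for _ in range(num_partitions)]
    for key, value in events:
        partitions[partition_for(key, num_partitions)].append((key, value))
    return partitions


events = [
    ("user-1", "login"),
    ("user-2", "login"),
    ("user-1", "add-to-cart"),
    ("user-2", "logout"),
    ("user-1", "checkout"),
]

partitions = route(events, num_partitions=3)

# All events for one key share a partition, so their relative order survives.
user1_events = [value for part in partitions for key, value in part if key == "user-1"]
```

Events for different keys may land in different partitions, so nothing ties the order of "user-1" events to the order of "user-2" events once they are read back.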

 
