Concepts

The Need

REST APIs are synchronous and follow a client-server model (point-to-point).

Async API: put the message in a third party (queue) and move on (“fire and forget”)

  • deferred processing: place message in queue and process it later
  • fault-tolerance: even if a service goes down we still have previously sent messages in MQ
  • decoupling: services aren’t time-bound w.r.t. each other
  • load balancing: delegate load to multiple consumers
  • data-streaming: large volumes of data exchange between services
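
The fire-and-forget and deferred-processing ideas above can be sketched with Python's standard library. This is only an in-process illustration: `queue.Queue` stands in for the external MQ, and the names `producer`, `consumer`, and `processed` are made up for the example.

```python
import queue
import threading

mq = queue.Queue()   # stands in for the third-party message queue
processed = []       # message ids recorded by the consumer

def producer(n):
    """Enqueue n messages and return immediately (fire and forget)."""
    for i in range(n):
        mq.put({"id": i, "body": f"order-{i}"})

def consumer():
    """Process messages until a None sentinel arrives."""
    while True:
        msg = mq.get()
        if msg is None:
            mq.task_done()
            break
        processed.append(msg["id"])
        mq.task_done()

producer(3)                      # no consumer is running yet; messages just wait
worker = threading.Thread(target=consumer)
worker.start()                   # deferred processing starts later
mq.put(None)                     # sentinel: tell the consumer to stop
mq.join()                        # wait until every message is marked done
worker.join()
```

The producer returns before any message is handled, and the messages sent earlier are still there when the consumer finally starts.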

Queues can hold task messages as well as data messages.

Event-driven Architecture: put a task in the queue, send it to services and trigger processing in them

Streaming Data Pipelines: huge volume of data (~100K requests/sec) can be sent between services

Prod-Con Model (MQ)

Simple MQ: a single FIFO pipeline. Examples: IBM WebSphere MQ, RabbitMQ, Apache ActiveMQ, etc.

Messages in an MQ can be delivered in FIFO order or in priority order (higher-priority messages are processed first, regardless of arrival order).
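
A small sketch of the two delivery orders, using only the standard library. The message names and the convention that priority `0` is the most urgent are assumptions made for the example; real MQ products configure priorities in their own ways.

```python
import heapq
from collections import deque

# (priority, body) pairs in arrival order; 0 is treated as the highest priority
arrivals = [(1, "low-1"), (0, "high-1"), (1, "low-2")]

# FIFO queue: delivery order equals arrival order
fifo = deque(body for _, body in arrivals)
fifo_order = [fifo.popleft() for _ in range(len(fifo))]

# Priority queue: the arrival sequence number breaks ties between equal priorities
heap = []
for seq, (priority, body) in enumerate(arrivals):
    heapq.heappush(heap, (priority, seq, body))
priority_order = [heapq.heappop(heap)[2] for _ in range(len(heap))]
```

Here `fifo_order` keeps `low-1` first, while `priority_order` moves `high-1` to the front.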

  • one-to-one; one producer, one consumer
  • the MQ system deletes a message after it is consumed; it can also be configured to delete the message only on consumer ack (RabbitMQ has this config)

Message deletion can be turned off on most MQ platforms, but the general idea of an MQ is remove-on-consume.
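
Delete-on-ack can be illustrated with a toy in-memory queue. `AckQueue` and its methods are hypothetical names for this sketch and do not mirror the RabbitMQ client API; the point is only that a delivered message stays recoverable until the consumer acknowledges it.

```python
from collections import deque

class AckQueue:
    """Toy queue: a message is deleted only when the consumer acks it."""

    def __init__(self):
        self._ready = deque()   # messages waiting for delivery
        self._unacked = {}      # delivery tag -> message delivered but not yet acked
        self._next_tag = 0

    def publish(self, body):
        self._ready.append(body)

    def receive(self):
        """Deliver the next message, keeping a copy until it is acked."""
        if not self._ready:
            return None
        body = self._ready.popleft()
        tag = self._next_tag
        self._next_tag += 1
        self._unacked[tag] = body
        return tag, body

    def ack(self, tag):
        """The consumer confirms processing; only now is the message gone."""
        del self._unacked[tag]

    def requeue_unacked(self):
        """Simulate a consumer crash: put in-flight messages back at the front."""
        for tag in sorted(self._unacked, reverse=True):
            self._ready.appendleft(self._unacked.pop(tag))

q = AckQueue()
q.publish("m1")
q.publish("m2")
tag, body = q.receive()      # m1 is delivered but not acked
q.requeue_unacked()          # consumer died before acking
tag2, body2 = q.receive()    # m1 is delivered again
q.ack(tag2)
tag3, body3 = q.receive()    # now m2
q.ack(tag3)
remaining = q.receive()      # queue is empty
```

With plain remove-on-consume, `m1` would have been lost when the consumer crashed; with ack-based deletion it is redelivered.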

Disadvantages:

  • slower than Kafka, although latency is still low
  • throughput is not as high as Kafka

Pub-Sub Model (Kafka)

The publisher puts messages in a central system and the subscriber(s) consume them (broadcasting). Kafka acts as a distributed commit log.

  • one-to-many; one producer, many consumers
  • message is not deleted from Topic after consumption by consumers (see Kafka notes)
  • mostly unordered: Kafka preserves order only within a partition, not across the partitions of a topic
  • scalability: add multiple consumers to consume messages faster
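
The "log that is not deleted on consumption" idea can be sketched as an append-only list plus one read offset per subscriber. This is a deliberate simplification with made-up names (`Topic`, `poll`, the subscriber labels); real Kafka tracks offsets per consumer group and per partition.

```python
class Topic:
    """Toy append-only log with an independent read offset per subscriber."""

    def __init__(self):
        self._log = []        # messages are appended and never removed on read
        self._offsets = {}    # subscriber name -> index of the next message to read

    def publish(self, message):
        self._log.append(message)
        return len(self._log) - 1   # offset of the new message

    def poll(self, subscriber, max_messages=10):
        """Return the next batch for this subscriber and advance its offset."""
        start = self._offsets.get(subscriber, 0)
        batch = self._log[start:start + max_messages]
        self._offsets[subscriber] = start + len(batch)
        return batch

    def size(self):
        return len(self._log)

topic = Topic()
for event in ["signup", "login", "purchase"]:
    topic.publish(event)

billing = topic.poll("billing")       # reads all three messages
analytics = topic.poll("analytics")   # reads the same three messages independently
billing_again = topic.poll("billing") # nothing new for billing
```

Both subscribers see every message, and the log still holds all three entries afterwards.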

Disadvantages:

  • not for mission-critical synchronous systems where an ack is required
  • not for non-idempotent tasks like financial transactions, because a single message can be delivered and processed more than once (at-least-once delivery)
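
One common way to cope with duplicate delivery is to make the consumer idempotent by remembering which message ids it has already applied. The sketch below assumes each message carries a unique `id`; the field names and the in-memory `seen_ids` set are illustrative only (a real system would persist them).

```python
balances = {"alice": 100}
seen_ids = set()   # ids of messages that were already applied

def apply_debit(message):
    """Apply a debit once; return False if the message is a duplicate."""
    if message["id"] in seen_ids:
        return False   # already processed, so skip the side effect
    balances[message["account"]] -= message["amount"]
    seen_ids.add(message["id"])
    return True

msg = {"id": "tx-1", "account": "alice", "amount": 30}
first = apply_debit(msg)    # applied
second = apply_debit(msg)   # the same message delivered again: ignored
```

Without the `seen_ids` check, the redelivered message would debit the account twice.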

Push vs Pull Semantics

Push is good when we have many producers.

Push is good when we want producers to send data asap.

Pull is good when we want consumers to control the rate of consumption.

A consumer making requests to the broker (pull) needs far less additional configuration and runs into fewer firewall issues than a broker calling out to a bunch of consumers (push).
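
The difference in who controls the rate can be shown with two toy brokers. `PushBroker` and `PullBroker` are hypothetical classes written for this sketch, not any real broker's API.

```python
from collections import deque

class PushBroker:
    """The broker calls every subscriber as soon as a message is published."""

    def __init__(self):
        self._callbacks = []

    def subscribe(self, callback):
        self._callbacks.append(callback)

    def publish(self, message):
        for callback in self._callbacks:
            callback(message)   # the consumer has no say in the timing

class PullBroker:
    """The broker only stores messages; consumers fetch them when ready."""

    def __init__(self):
        self._pending = deque()

    def publish(self, message):
        self._pending.append(message)

    def poll(self, max_messages):
        """Hand over at most max_messages, as requested by the consumer."""
        batch = []
        while self._pending and len(batch) < max_messages:
            batch.append(self._pending.popleft())
        return batch

pushed = []
push_broker = PushBroker()
push_broker.subscribe(pushed.append)
for i in range(5):
    push_broker.publish(i)            # all 5 arrive at the consumer immediately

pull_broker = PullBroker()
for i in range(5):
    pull_broker.publish(i)
first_batch = pull_broker.poll(max_messages=2)    # consumer takes only what it can handle
second_batch = pull_broker.poll(max_messages=2)
```

In the push case the consumer receives everything at the producer's pace; in the pull case it drains the backlog two messages at a time.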

References