Why Rubyists Should Care About Messaging (A High Level Intro)

Messaging in the context of application architecture (grandly referred to as message oriented middleware on Wikipedia) is similar to messaging in the real world. If you want to ask your colleague to do something, you’ll send them a message of some sort. If your app needs to ask another app to do something, it can do the same: send a message to another app or process asking it to run a command or send an e-mail, for example.

Note: This is a guest post by Jakub Stastny, a member of the RabbitMQ team. Further info can be found at the footer of this post.

There are many reasons for using messaging in your applications. It can help you:

  • improve response times by doing some tasks asynchronously
  • reduce complexity by decoupling and isolating applications
  • build smaller apps that are easier to develop, debug, test, and scale
  • build multiple apps that each use the most suitable language or framework versus one big monolithic app
  • get robustness and reliability through message queue persistence
  • potentially get zero-downtime redeploys
  • distribute tasks across machines based on load

For the purpose of this article I’m going to use the word “messaging” only for sending messages over some kind of messaging protocol such as AMQP or STOMP. There are some messaging systems which work only within one language, such as JMS for Java, and I’m not going to touch on these.

Messaging Architecture

Most messaging software is implemented as a message broker, a daemon that connects producers with consumers. Producers send messages and consumers process them. In Web development, a producer is usually a frontend that generates tasks based on user actions, whereas a consumer is usually a backend that executes those tasks. Examples of message brokers are RabbitMQ and ActiveMQ. However, a broker isn’t strictly required. For example, ZeroMQ provides only a socket-like API (this white paper explains more about broker vs brokerless systems).
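To make the producer/broker/consumer roles concrete, here is a minimal sketch. It is a toy: Ruby's built-in Queue stands in for the broker inside a single process, whereas a real broker is a separate daemon reached over a protocol such as AMQP. The method name run_toy_pipeline and the :shutdown sentinel are my own assumptions for illustration.

```ruby
# Toy illustration only: Ruby's built-in Queue plays the role of the broker.
# A real broker (RabbitMQ, ActiveMQ) is a separate daemon reached over the network.

# Hypothetical helper: runs one producer and one consumer and returns
# what the consumer processed.
def run_toy_pipeline(tasks)
  broker    = Queue.new   # stand-in for the broker's queue
  processed = []

  # Consumer ("backend"): blocks on pop until a message arrives.
  consumer = Thread.new do
    while (message = broker.pop) != :shutdown
      processed << "done: #{message}"
    end
  end

  # Producer ("frontend"): pushes tasks and moves on without waiting for results.
  tasks.each { |task| broker << task }
  broker << :shutdown     # sentinel so the consumer thread can exit

  consumer.join
  processed
end

p run_toy_pipeline(["send welcome e-mail", "resize avatar"])
```

The point to notice is that the producer never calls the consumer directly; the two only share the queue.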

Basic schema of how messaging works.

A broker isn’t just dumb storage for tasks; it can do a lot more. An important feature is advanced routing, which gives you the power to route one message to one or multiple queues based on configuration or even based on some pattern in the messages. In the case of AMQP, the part of the broker that takes care of the routing is called an exchange.
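The routing idea can be sketched in a few lines of plain Ruby. This is a toy model, not the API of any real client library: the class ToyExchange and its methods are hypothetical, and queues are just arrays. It models the three common AMQP exchange types (direct, fanout and topic), where in a topic pattern "*" stands for exactly one word and "#" for zero or more words.

```ruby
# Toy model of exchange routing; not the API of any real client library.
class ToyExchange
  # type is :direct, :fanout or :topic
  def initialize(type)
    @type     = type
    @bindings = []   # pairs of [binding_key, queue]
  end

  def bind(queue, key = "")
    @bindings << [key, queue]
    self
  end

  # Copies the message into every queue whose binding matches the routing key.
  def publish(message, routing_key = "")
    @bindings.each do |binding_key, queue|
      queue << message if match?(binding_key, routing_key)
    end
  end

  private

  def match?(binding_key, routing_key)
    case @type
    when :fanout then true
    when :direct then binding_key == routing_key
    when :topic  then topic_match?(binding_key.split("."), routing_key.split("."))
    else raise ArgumentError, "unknown exchange type: #{@type}"
    end
  end

  # "*" stands for exactly one word, "#" for zero or more words.
  def topic_match?(pattern, words)
    return words.empty? if pattern.empty?
    head, *rest = pattern
    case head
    when "#" then (0..words.length).any? { |n| topic_match?(rest, words.drop(n)) }
    when "*" then !words.empty? && topic_match?(rest, words.drop(1))
    else words.first == head && topic_match?(rest, words.drop(1))
    end
  end
end

emails, images, audit = [], [], []
exchange = ToyExchange.new(:topic)
exchange.bind(emails, "tasks.email.*").bind(images, "tasks.image.*").bind(audit, "tasks.#")

exchange.publish("welcome jane", "tasks.email.welcome")   # lands in emails and audit
exchange.publish("resize 42.png", "tasks.image.resize")   # lands in images and audit
```

The publisher only names a routing key; which queues receive a copy is decided entirely by the bindings.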

Schema of how AMQP works.

How Can You Benefit from Using Messaging?

Reliability & Robustness

Now you might wonder: “Isn’t a background thread enough?” What if the application crashes? Most messaging brokers support some form of persistence, so even if the server is restarted, no data is lost [1]. Messaging protocols often support ‘acknowledgements’ too, which means that a task is considered finished only once the client sends confirmation that everything went OK.

[1] It might be tricky in case the broker is killed, but there’s usually a solution for that as well. For example, AMQP supports transactions, and RabbitMQ provides publisher confirms, a fast, asynchronous way to be notified that a message was published successfully. Persistent messages are confirmed when all queues have either delivered the message and received an acknowledgement, or persisted the message.
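Here is a rough sketch of how acknowledgements protect a task from a crashing consumer. The ToyAckQueue class is hypothetical and runs in memory only; a real broker does this bookkeeping for you and would typically requeue unacknowledged messages when the consumer's connection goes away.

```ruby
# Toy model of acknowledgements; not a real broker or client API.
class ToyAckQueue
  def initialize
    @ready    = []   # messages waiting for delivery
    @unacked  = {}   # delivery tag => message handed out but not yet confirmed
    @next_tag = 0
  end

  def publish(message)
    @ready << message
  end

  # Hands out the next message together with a delivery tag, or nil if idle.
  def deliver
    return nil if @ready.empty?
    tag = (@next_tag += 1)
    @unacked[tag] = @ready.shift
    [tag, @unacked[tag]]
  end

  # The consumer confirms success: only now is the message forgotten.
  def ack(tag)
    @unacked.delete(tag)
  end

  # The consumer died or gave up: the message goes back to the front of the queue.
  def requeue(tag)
    message = @unacked.delete(tag)
    @ready.unshift(message) if message
  end

  def pending
    @ready.size + @unacked.size
  end
end

queue = ToyAckQueue.new
queue.publish("send invoice 17")

tag, message = queue.deliver
begin
  raise "SMTP server unreachable"   # simulated crash while processing
rescue RuntimeError
  queue.requeue(tag)                # nothing was acknowledged, so nothing is lost
end

tag, message = queue.deliver        # the same message is delivered again
queue.ack(tag)                      # this time the work succeeded
```

A plain background thread has no equivalent of the requeue step, which is the difference the paragraph above is getting at.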

Decoupling

With a message based infrastructure, different parts of your app can easily communicate with each other, making it simple to decouple your app into a few smaller ones. I believe this is really crucial, because it makes the design much better, makes a lot of stuff simpler and gives a natural progression for scaling.

If your apps are separated, you don’t have to write everything in one language, so you can choose the right tool for the job. You can connect your new apps in Ruby with your legacy apps in, let’s say, PHP. You don’t have to rewrite the whole ecosystem of apps, and specific problems can be solved using Java, Erlang or C if you need better performance or scalability. This isolation can also make it easier for different people to work on different apps, as long as the messaging scheme is agreed upon. (Hello, outsourcing!)

The pain of deploying large apps can also, in many cases, be reduced. Designed correctly, a heavily decoupled system made of several parts is less likely to come crashing down like a house of cards. Instead, you might have a few component apps dying with only a cosmetic effect on the larger app overall.

And a bonus: because the apps are isolated, you can easily see the input and output of them, therefore it’s easy to inspect and debug them.

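#!/usr/bin/env ruby
# encoding: utf-8

require "amqp"

EventMachine.run do
  # Connect to the broker using the client's default settings.
  AMQP.connect do |connection|
    channel  = AMQP::Channel.new(connection)
    # An empty name asks the broker to generate a unique queue name.
    queue    = channel.queue("", auto_delete: true)
    # The exchange with an empty name is the default direct exchange.
    exchange = channel.direct("")

    # Use the queue name as the routing key so the message lands in our queue.
    exchange.publish "Hello RubyInside readers!", key: queue.name
  end
end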

This is output from the com.rabbitmq.tools.Tracer tool of rabbitmq-java-client, showing how easily we can inspect what the code above sends. Another tool you can use for this purpose is Wireshark.

Now imagine everyone starts to use your application. You’ve suddenly become rich, you buy beers for everyone and you hire the top people of your community. But then what happens? Oh no, an angry unicorn!

Right, it’s time to scale. But how? The frontend is fairly simple, but there’s a lot of stuff going on on the backend: sending e-mails, processing images, running some tasks … you can add new instances, but it won’t reduce complexity, and it won’t be very efficient as different parts of the app have different performance requirements.

So instead you can use decoupling and split your app into multiple separate services. Such applications are much easier to scale and, because they’re small, also far easier to test. The isolation also makes it easy to rewrite individual parts: if the load gets so heavy that the critical bits need to be in Java or C, you can rewrite just those.

Scaling by adding new instances.

Scaling by adding new task consumers.
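Scaling by adding task consumers can be sketched with threads sharing one queue. Again this is an in-process toy under my own assumptions (the method run_workers and the task names are hypothetical); with a real broker the workers would be separate processes, possibly on different machines, all consuming from the same queue.

```ruby
# Toy sketch: several workers share one queue, so each task is handled exactly once.
def run_workers(tasks, worker_count)
  queue   = Queue.new
  results = Queue.new   # thread-safe collection of [worker_id, task] pairs

  tasks.each { |task| queue << task }
  queue.close           # no more tasks; pop returns nil once the queue is drained

  workers = Array.new(worker_count) do |id|
    Thread.new do
      while (task = queue.pop)
        results << [id, task]   # a real worker would do the actual job here
      end
    end
  end
  workers.each(&:join)

  Array.new(results.size) { results.pop }
end

tasks   = (1..10).map { |n| "thumbnail-#{n}" }
handled = run_workers(tasks, 3)
handled.each { |worker_id, task| puts "worker #{worker_id} handled #{task}" }
```

Going from three workers to thirty is a change to one argument; the producer side doesn't need to know.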

Faster Response Times

Most Web apps nowadays are too synchronous. If you upload an image, you might sit there waiting for the thumbnails to be made. It’s slow and puts a lot of load on the frontend of a Web app. With a message based architecture, the frontend could instead publish a message saying “Please resize image XY for me” (well, in a slightly more technical way ;-)) and leave it be. The same applies to many other situations: sending e-mails, following other users etc.

If it can’t get an instant response and deliver that to the user, put it into a message and pop it on a queue to be done later. Most larger sites and services have to do this, so if you can bake it into your smaller app, you’ll give yourself a longer runway.
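What that "slightly more technical" message might look like is sketched below. The JSON field names and the handler are hypothetical, not a standard format, and the outbox array stands in for publishing to a broker.

```ruby
require "json"

# Hypothetical message format; the field names are illustrative, not a standard.
def build_resize_task(image_id, sizes)
  JSON.generate("task" => "resize_image", "image_id" => image_id, "sizes" => sizes)
end

# Hypothetical upload handler: enqueue the work, then answer immediately.
def handle_upload(image_id, outbox)
  outbox << build_resize_task(image_id, ["64x64", "256x256"])
  "Upload accepted, thumbnails are on their way"
end

outbox = []   # stand-in for publishing to a broker
puts handle_upload(42, outbox)
puts outbox.first
```

The handler's response time no longer depends on how long resizing takes, only on how long it takes to hand over one small message.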

Avoiding Unnecessary Downtimes

Another advantage of messaging is that, if designed properly, you could experience no downtime when redeploying backend services. Consider the scenario with uploading images I mentioned previously. If the communication were synchronous and the “scaler” were down, any request to the service would fail because it couldn’t respond. With messaging, you don’t have to care. You’d just publish the task and once the service is back online, it will “catch up” and process all of the images.

Communicating with a service over HTTP. If the service goes down, the frontend won’t work.

Communicating with a service over a messaging broker. If the service goes down, the frontend can still work, because once the service goes online, it’ll catch up on the messages which have been sent before.
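The "catch up" behaviour can be sketched with a toy broker that keeps messages while no consumer is attached. ToyBroker and its methods are hypothetical and live in memory only; a real broker would also persist the backlog.

```ruby
# Toy broker: keeps messages until a consumer is attached. Not a real client API.
class ToyBroker
  def initialize
    @backlog  = []
    @consumer = nil
  end

  def publish(message)
    if @consumer
      @consumer.call(message)
    else
      @backlog << message   # service is down: keep the message for later
    end
  end

  # Called when the backend service (re)starts; drains the backlog first.
  def subscribe(&consumer)
    @consumer = consumer
    @consumer.call(@backlog.shift) until @backlog.empty?
  end

  # Called when the backend service goes away, e.g. during a redeploy.
  def unsubscribe
    @consumer = nil
  end
end

broker    = ToyBroker.new
processed = []

broker.subscribe { |message| processed << message }
broker.publish("image-1")
broker.unsubscribe                                   # redeploy starts
broker.publish("image-2")                            # the frontend keeps publishing
broker.publish("image-3")
broker.subscribe { |message| processed << message }  # redeploy done, backlog is drained
```

From the frontend's point of view, publish behaves the same whether or not the backend is currently running.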

But I Can Just Use HTTP, Right?

In the Ruby community HTTP is very popular but often overused. Whether it fits always depends on your use case. For example, if you need more advanced routing like 1:n or n:m, HTTP has little to offer. If you need asynchronous functionality or loosely coupled components, as in the pub/sub pattern, HTTP usually isn’t a good choice either.

Downsides of Messaging

On the other hand, messaging infrastructures have their own downsides. First of all, there are the innate downsides of going distributed at all, such as increased reliance on the network and on systems administrators (networks can and do go down, even within a single machine). Then there’s a fair amount of code needed to handle reconnection, such as redeclaring non-durable queues and exchanges, and you also have to accumulate messages until the network connection is up again (though a good broker will deal with much of this).

One of the most important features of the upcoming AMQP 0.8 gem is significantly improved error handling, so these problems aren’t fatal, but you should keep them in mind when deciding whether the architecture suits your application. In most cases it’s a fair price for the advantages of a messaging-based architecture, but in some cases you might find it better to just use a synchronous approach.
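To give a feel for the reconnection bookkeeping mentioned above, here is a rough sketch of the client side. Everything in it is an assumption for illustration: ReconnectingClient is a hypothetical class, not the amqp gem's API, and instead of talking to a network it records in a log what would have gone over the wire.

```ruby
# Hypothetical client-side wrapper; the names are illustrative, not the amqp gem's API.
class ReconnectingClient
  attr_reader :log

  def initialize
    @connected    = false
    @declarations = []   # non-durable queues to recreate after every reconnect
    @outbox       = []   # messages accumulated while the connection is down
    @log          = []   # what would have gone over the wire, in order
  end

  def declare_queue(name)
    @declarations << name
    @log << "declare #{name}" if @connected
  end

  def publish(message)
    if @connected
      @log << "publish #{message}"
    else
      @outbox << message
    end
  end

  # Call from the client's "connection lost" callback.
  def connection_lost
    @connected = false
  end

  # Call from the client's "connection established" callback.
  def connection_established
    @connected = true
    @declarations.each { |name| @log << "declare #{name}" }   # redeclare first
    @log << "publish #{@outbox.shift}" until @outbox.empty?   # then flush the backlog
  end
end

client = ReconnectingClient.new
client.declare_queue("thumbnails")
client.publish("image-1")        # still offline: buffered
client.connection_established
client.connection_lost
client.publish("image-2")        # buffered again
client.connection_established
puts client.log
```

The ordering matters: queues are redeclared before buffered messages are flushed, otherwise the messages might be routed nowhere.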

Routing via HTTP with sending request/response for each client/message.

Routing via a messaging broker: you send a message to the broker and it takes care of the rest.

My Presentation

I recently gave a talk giving an introduction to messaging, along similar lines to this article. The slides are embedded below for your reference:

I’d like to thank Michael Klishin for his suggestions about improving this post.

This post was by Jakub Stastny. Jakub is a Ruby contractor currently working for the RabbitMQ team of VMware with the mission to make Ruby developers more aware of messaging. He created the Rango framework, the first Ruby framework with template inheritance, and has contributed to many well-known projects such as RubyGems, RSpec and Merb. He has a blog, 101ideas.cz, where he writes about IT and self-development stuff, and he tweets as @botanicus.