ZeroMQ: A Messaging Library That Makes Complex Communication Easy
I have been playing with ZeroMQ and I want to share it with you. First of all, it is not a complete messaging system like RabbitMQ or ActiveMQ. A full-fledged messaging system gives you an out-of-the-box experience. ZeroMQ is not such a system at all; it is a simple messaging library that you use programmatically. It basically gives you a souped-up socket interface that lets you quickly build your own messaging system. This is very cool because you no longer need to install and configure a complex monolithic broker or run it on a dedicated machine.
ZeroMQ gives you sockets that carry whole messages across various transports: in-process, inter-process, TCP, and multicast. You can connect sockets N-to-N with patterns like fan-out, pub-sub, task distribution, and request-reply. It is fast enough to be the fabric for clustered products. Its asynchronous I/O model gives you scalable multicore applications, built as asynchronous message-processing tasks. It has a score of language APIs and runs on most operating systems, which makes it a great fit for heterogeneous environments. Specifically, some benefits of using ZeroMQ are:
- It handles I/O asynchronously, in background threads. These communicate with application threads using lock-free data structures, so ØMQ applications need no locks, semaphores, or other wait states.
- Components can come and go dynamically and ØMQ will automatically reconnect. This means you can start components in any order. You can create “service-oriented architectures” (SOAs) where services can join and leave the network at any time.
- It queues messages automatically when needed. It does this intelligently, pushing messages as close as possible to the receiver before queuing them.
- It has ways of dealing with over-full queues (called “high water mark”). When a queue is full, ØMQ automatically blocks senders, or throws away messages, depending on the kind of messaging you are doing (the so-called “pattern”).
- It lets your applications talk to each other over arbitrary transports: TCP, multicast, in-process, inter-process. You don’t need to change your code to use a different transport.
- It handles slow/blocked readers safely, using different strategies that depend on the messaging pattern.
- It lets you route messages using a variety of patterns such as request-reply and publish-subscribe. These patterns are how you create the topology, the structure of your network.
- It lets you place pattern-extending “devices” (small brokers) in the network when you need to reduce the complexity of interconnecting many pieces.
- It delivers whole messages exactly as they were sent, using a simple framing on the wire. If you write a 10k message, you will receive a 10k message.
- It does not impose any format on messages. They are blobs ranging from zero bytes to gigabytes in size. When you want to represent structured data, you choose some other product on top, such as Google’s Protocol Buffers, XDR, and others.
- It handles network errors intelligently. Sometimes it retries, sometimes it tells you an operation failed.
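To make the "whole messages" point above concrete, here is a minimal sketch of length-prefixed framing using only the Python standard library. It is not ZeroMQ's actual wire protocol, and the helper names (`send_msg`, `recv_exact`, `recv_msg`, `demo`) are made up for illustration. It only shows the general idea of why a 10k message comes out the other end as one 10k message instead of an arbitrary chunk of a byte stream.

```python
import socket
import struct
import threading


def send_msg(sock, payload):
    """Send one message as a 4-byte big-endian length followed by the body."""
    sock.sendall(struct.pack(">I", len(payload)) + payload)


def recv_exact(sock, n):
    """Read exactly n bytes; a single recv() may return fewer than asked for."""
    chunks = []
    while n > 0:
        chunk = sock.recv(n)
        if not chunk:
            raise ConnectionError("peer closed the connection mid-message")
        chunks.append(chunk)
        n -= len(chunk)
    return b"".join(chunks)


def recv_msg(sock):
    """Receive one whole message: read the length prefix, then the body."""
    (length,) = struct.unpack(">I", recv_exact(sock, 4))
    return recv_exact(sock, length)


def demo():
    """Send three messages over a local socket pair and read them back whole."""
    left, right = socket.socketpair()
    messages = [b"x" * 10_000, b"", b"hello"]

    def send_all():
        for message in messages:
            send_msg(left, message)

    # Send from a second thread so a small socket buffer cannot stall the demo.
    sender = threading.Thread(target=send_all)
    sender.start()
    received = [recv_msg(right) for _ in messages]
    sender.join()
    left.close()
    right.close()
    return received


if __name__ == "__main__":
    print([len(message) for message in demo()])
```

With a plain stream socket you have to write this bookkeeping yourself. With ZeroMQ the library does the framing for you, so a receive call hands back exactly what the matching send call was given.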
A quick note about running ZeroMQ on Node.js: I tried to install ZeroMQ with npm, and I had to make a few changes to the way the ZeroMQ library was compiled on my Mac OS X (10.6) machine. You may have to change the warning level and install other libraries to get it working. Please contact me if you need help installing it!
ZeroMQ sockets are not the same as the sockets you know. When working with ZeroMQ, think of them as magic communication hubs. Instead of thinking about low-level sockets and handshaking, think about your message topology and flow. For example, a single ZeroMQ socket can bind to several ports and transports at the same time, and your code simply receives the messages as if there were only one port and transport. ZeroMQ also abstracts messaging patterns, so if you want a socket to act as a PUB/SUB or REQ/REP node, you just create it with the matching socket type. Finally, ZeroMQ can interconnect different transports and messaging patterns, so if you need to bridge INPROC to TCP, you can. Basically, ZeroMQ lets you design your architecture around how your systems need to communicate with each other, without worrying about how the bytes actually get there.
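As a rough illustration of the "one socket, several endpoints" idea, here is a small sketch using pyzmq, the Python binding (an assumption on my part; install it with `pip install pyzmq`). The endpoint name `inproc://hub` and the function name are made up for the example. One PULL socket is bound to both an in-process endpoint and a TCP port, and the receiving code reads from it without knowing which transport a message arrived on.

```python
import zmq


def collect_from_two_endpoints():
    """Bind one PULL socket to two transports and read from both."""
    ctx = zmq.Context()
    receiver = ctx.socket(zmq.PULL)
    local = ctx.socket(zmq.PUSH)
    remote = ctx.socket(zmq.PUSH)
    try:
        # One socket, two endpoints: an in-process one and a TCP one.
        receiver.bind("inproc://hub")  # hypothetical endpoint name
        port = receiver.bind_to_random_port("tcp://127.0.0.1")
        receiver.setsockopt(zmq.RCVTIMEO, 5000)  # raise instead of hanging forever

        local.connect("inproc://hub")
        remote.connect(f"tcp://127.0.0.1:{port}")

        local.send(b"via inproc")
        remote.send(b"via tcp")

        # Arrival order is not guaranteed across transports, so sort for display.
        return sorted(receiver.recv() for _ in range(2))
    finally:
        for sock in (local, remote, receiver):
            sock.close(linger=0)
        ctx.term()


if __name__ == "__main__":
    print(collect_from_two_endpoints())
```

The receiving loop is a plain `recv()` on a single socket. The transport details live entirely in the endpoint strings passed to `bind` and `connect`.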
ZeroMQ handles all the complex work of making messaging stable, fast, and reliable, which lets you focus on the fun part: the architecture. The first abstraction in ZeroMQ is the transport layer. When designing your architecture, you can choose from four transports:
- INPROC: an in-process communication model
- IPC: an inter-process communication model
- MULTICAST: multicast via PGM, possibly encapsulated in UDP
- TCP: a network-based transport
The TCP transport is often the best choice; it is fast and robust. However, when there is no need to cross the machine boundary, look at the IPC or INPROC transport to lower latency even further. The MULTICAST transport can be interesting in special cases.
After you have picked your transport, you need to decide how to wire up the message senders and receivers. Notice that I did not call them clients and servers. That is because ZeroMQ can pass messages between processes on the same box or over the network without a central dedicated server. ZeroMQ is a low-level library that you link into your code and that lets you build complex messaging systems. You have to decide which code acts as the server and which as the client, and think about which parts of the architecture are stable and which are elastic. This is part of what makes ZeroMQ so cool: you as a programmer get to build your own messaging system, but you do not need to understand all the complexity behind making a messaging system robust and mature.
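Here is a minimal sketch of both points, again assuming pyzmq. The stable side binds a REP socket, the elastic side connects a REQ socket, and the only thing that changes between INPROC and TCP is the endpoint string. The function names, the `inproc://echo` endpoint, and the uppercase "service" are all invented for the example.

```python
import socket
import threading

import zmq


def serve_one(ctx, endpoint, ready):
    """Stable side: bind a REP socket, answer one request, then exit."""
    rep = ctx.socket(zmq.REP)
    rep.bind(endpoint)
    ready.set()
    rep.send(rep.recv().upper())  # toy service: reply with the text uppercased
    rep.close()


def round_trip(endpoint, payload):
    """Elastic side: connect a REQ socket to the endpoint and ask once."""
    ctx = zmq.Context()
    ready = threading.Event()
    server = threading.Thread(target=serve_one, args=(ctx, endpoint, ready))
    server.start()
    ready.wait()  # binding first keeps inproc happy on older libzmq versions
    req = ctx.socket(zmq.REQ)
    req.connect(endpoint)
    req.send(payload)
    reply = req.recv()
    req.close()
    server.join()
    ctx.term()
    return reply


def free_tcp_endpoint():
    """Ask the OS for a currently unused local port (small race, fine for a demo)."""
    with socket.socket() as probe:
        probe.bind(("127.0.0.1", 0))
        return f"tcp://127.0.0.1:{probe.getsockname()[1]}"


if __name__ == "__main__":
    # Same code path, two transports: only the endpoint string differs.
    print(round_trip("inproc://echo", b"hello"))
    print(round_trip(free_tcp_endpoint(), b"hello"))
```

A common rule of thumb is that the long-lived part of the system calls `bind` and the parts that come and go call `connect`. That is a design choice for your architecture, not something the library enforces.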
ZeroMQ gives you a lot of options for connecting message queues together. In a later post, I plan to cover some of the different types of queues, but there is already a lot of really good documentation at www.zeromq.org. The key question to walk away with is what message flow you need. Do you want to send messages out to many workers for processing? Do you want to fan in all the results so you know when all the work is done? Do you want a pipeline for serial processing, or something else? You can achieve many different designs by picking the right messaging pattern. It is amazingly easy to set up a ZeroMQ queue that already has the logic and transport to support your requirements, so you can just start sending messages and let ZeroMQ do all the magic of delivering and receiving the data.
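The fan-out and fan-in flow described above can be sketched with PUSH and PULL sockets. This again assumes pyzmq. The endpoint names, the squaring "work", and the `threading.Event` shutdown are choices made for this example, not anything ZeroMQ prescribes. One PUSH socket hands tasks to several workers, and one PULL socket collects the results, so the caller knows the work is done once it has counted one result per task.

```python
import threading

import zmq

TASKS = "inproc://tasks"      # hypothetical endpoint names
RESULTS = "inproc://results"


def worker(ctx, stop):
    """Pull tasks, square them, and push the results until asked to stop."""
    tasks = ctx.socket(zmq.PULL)
    results = ctx.socket(zmq.PUSH)
    tasks.connect(TASKS)
    results.connect(RESULTS)
    tasks.setsockopt(zmq.RCVTIMEO, 100)  # wake up regularly to check `stop`
    while not stop.is_set():
        try:
            number = int(tasks.recv())
        except zmq.Again:
            continue  # no task arrived within 100 ms
        results.send(str(number * number).encode())
    tasks.close(linger=0)
    results.close(linger=0)


def square_all(numbers, n_workers=3):
    """Fan tasks out to workers, fan the results back in, return them sorted."""
    ctx = zmq.Context()
    fan_out = ctx.socket(zmq.PUSH)
    fan_in = ctx.socket(zmq.PULL)
    fan_out.bind(TASKS)    # bind before the workers connect
    fan_in.bind(RESULTS)

    stop = threading.Event()
    workers = [
        threading.Thread(target=worker, args=(ctx, stop))
        for _ in range(n_workers)
    ]
    for thread in workers:
        thread.start()

    for number in numbers:
        fan_out.send(str(number).encode())  # distributed among connected workers

    # One result per task means all the work is done.
    squares = sorted(int(fan_in.recv()) for _ in numbers)

    stop.set()
    for thread in workers:
        thread.join()
    fan_out.close(linger=0)
    fan_in.close(linger=0)
    ctx.term()
    return squares


if __name__ == "__main__":
    print(square_all([3, 1, 2, 4]))
```

Which worker handles which task is up to ZeroMQ, so results can come back in any order. The sketch sorts them only to make the output predictable.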
I have been playing with ZeroMQ to build a workflow system and a distributed processing system like Hadoop, and I am loving it. ZeroMQ is really cool technology and has simplified my life a lot. Check it out! When I have more time, I will post some code examples.