In my last post, I talked about how we can dramatically improve our application architectures by making workflow explicit. When we pull the flow of the application apart and create a higher level “workflow” object (often called a ‘controller’ or ‘mediator’), we can separate the process from the implementation details. This gives us a lot of advantages in reading and understanding the code, as well as advantages in modifying and maintaining the code.
A Long Running, Distributed Workflow
In an interview I did with Jimmy Bogard (part of the RabbitMQ For Developers package), he describes a long running, distributed workflow and calls it a “saga” or “process manager”. The general idea is to have a single object manage a process that may take a significant amount of time. This is often described as a “long running transaction” – but not in the sense of a database transaction. It’s more like a fast food order as a “transaction” – paying for something and having it delivered to you – as Jimmy describes in our interview:
Imagine going somewhere like McDonald’s and ordering something there. You go and order something and that kicks off a process behind the scenes. It represents the overall transaction of you getting your order.
Getting the food is the transaction, you’re not leaving the restaurant until you get that. You don’t stand at the counter and just sit there waiting while all the things are done before the next person goes. That would be a synchronous transaction. Instead it is this more long running transaction that has a number of independent pieces that are all communicating via messaging.
The cashier will message to the cook to say order up and it has a little piece of paper that has your order on it. That’s the message, what you’re ordering. And when they are done with it it all comes back to a single … at least at McDonald’s it comes back to a tray, and individual people finish up with stuff. That tray effectively is the state of what is done and what’s not done. In fact, if you go look at how they do it, every time a cashier or one of the people who has something to do with your order does something, they actually look at your receipt on the tray to see what’s finished.
In this example, the order ticket that goes to each station of the restaurant is the message. The food tray where your order is placed, before it gets handed to you, is the state of the order. This is all one large, long running transaction – a “saga” – where multiple messages are sent out and things may happen sequentially or in parallel to complete your order.
But, how would you represent that in code? What would it look like to have an object that coordinates messages in order to complete an order, like this?
Modeling The Food Order Workflow
To start out simple, take the Backbone.js code from the previous post on workflow and model a workflow that does the following:
- Receive payment for the order
- Request a burger
- Request french fries
- Request a drink
- Present the order to the customer
The code to handle this in a Backbone.js application may look something like this, at a high level:
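A minimal sketch of what such a workflow object could look like follows. It uses Node's built-in EventEmitter in place of Backbone.Events so that it runs on its own. The `Register` and `Station` classes, the `make()` method, and the "paid", "burgerup" and "fryup" event names are assumptions made up for illustration; only the overall shape (payment first, then items gathered on a tray) comes from the description in this post.

```javascript
const { EventEmitter } = require("events");

// The tray is the state of the order: it knows which items have arrived.
class Tray {
  constructor(expectedItems) {
    this.expected = expectedItems;
    this.items = {};
  }
  add(name, item) {
    this.items[name] = item;
  }
  isComplete() {
    return this.expected.every((name) => name in this.items);
  }
}

// The high level workflow: take payment, request each item, deliver the tray.
class FoodOrder extends EventEmitter {
  constructor({ register, burgerStation, fryStation, drinkStation }) {
    super();
    this.register = register;
    this.burgerStation = burgerStation;
    this.fryStation = fryStation;
    this.drinkStation = drinkStation;
    this.tray = new Tray(["burger", "fries", "drink"]);
  }

  start(payment) {
    // Nothing is made until the payment has been received.
    this.register.once("paid", () => this.requestItems());
    this.register.receivePayment(payment);
  }

  requestItems() {
    // All three requests go out together; none waits on another.
    this.burgerStation.once("burgerup", (burger) => this.addToTray("burger", burger));
    this.fryStation.once("fryup", (fries) => this.addToTray("fries", fries));
    this.drinkStation.once("drinkup", (drink) => this.addToTray("drink", drink));
    this.burgerStation.make();
    this.fryStation.make();
    this.drinkStation.make();
  }

  addToTray(name, item) {
    this.tray.add(name, item);
    // Check the tray after every item to see whether the order is done.
    if (this.tray.isComplete()) {
      this.emit("delivered", this.tray.items);
    }
  }
}

// Hypothetical in-memory collaborators, just enough to run the sketch.
class Register extends EventEmitter {
  receivePayment(payment) {
    this.emit("paid", payment);
  }
}
class Station extends EventEmitter {
  constructor(eventName, item) {
    super();
    this.eventName = eventName;
    this.item = item;
  }
  make() {
    this.emit(this.eventName, this.item);
  }
}
```

Wiring it up is a matter of passing the collaborators in, listening for "delivered" and calling `start()` with the payment.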
In this code, you can see the payment being received before any of the food is made and gathered. Once all of the food is ready and on the tray, it is handed to the customer. The tray is acting as the state of the overall workflow. It knows what items have been added to it. After each item is added, the state of the order is checked by examining the tray’s contents. Once all of the items are on the tray, the order is done and it is delivered to the customer.
Be sure to note that the order is not processed sequentially. You don’t make the burger before making the fries, or the fries before making the drink. To do that would adversely affect the efficiency of the restaurant. Instead, each of the items for the order is allowed to be made in parallel to the other items. They are placed on the tray when they are done, and the tray is checked each time to see whether or not the order is complete.
If you have a process that needs to run sequentially, the code can easily be changed to work this way. The previous workflow post shows code that needs to work this way, for example.
Lastly, this code has a lot of assumptions in it such as only having one burger, fries and drink. A more realistic example would require more abstraction to represent any number of line items on the order.
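One possible direction for that abstraction, sketched under my own assumptions, is a tray that is built from the order's line items and counts down the quantity still outstanding for each. The `LineItemTray` name and the `{ sku, quantity }` shape are hypothetical.

```javascript
// Tracks any number of line items, each with a quantity,
// instead of assuming exactly one burger, fries and drink.
class LineItemTray {
  // lineItems: [{ sku: "burger", quantity: 2 }, ...] (hypothetical shape)
  constructor(lineItems) {
    this.remaining = new Map(
      lineItems.map(({ sku, quantity }) => [sku, quantity])
    );
    this.items = [];
  }

  add(sku, item) {
    const left = this.remaining.get(sku);
    // Rejects items that were never ordered or are already fulfilled.
    if (!left) {
      throw new Error(`Unexpected item for this order: ${sku}`);
    }
    this.remaining.set(sku, left - 1);
    this.items.push(item);
  }

  isComplete() {
    return [...this.remaining.values()].every((count) => count === 0);
  }
}
```

The workflow object would still check `isComplete()` after every item, so the high level process does not change.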
Moving From In-Memory To Messages
Now that the food order workflow has been built, it’s time to change this up so that messaging is involved. The good news is, this takes very little change. In fact, it could be done with zero change at all to the high level FoodOrder object! The real change that is needed will be in the lower level components – the parts that do the work.
For example, you could easily build a “DrinkStation” object with RabbitMQ and Node.js:
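A rough sketch of that idea is below. To avoid guessing at any particular RabbitMQ library's API, the station takes a `requester` object with a single `request(queue, message, callback)` method. That interface, the "drink.request" queue name and the `make()` method are all assumptions for illustration; a real requester would publish to RabbitMQ and wait on a reply queue. Only the "drinkup" event name comes from this post.

```javascript
const { EventEmitter } = require("events");

class DrinkStation extends EventEmitter {
  // `requester` is a hypothetical request/response client.
  constructor(requester) {
    super();
    this.requester = requester;
  }

  make() {
    // Send the request, then wait for the drink data to come back.
    this.requester.request("drink.request", { item: "drink" }, (err, drink) => {
      if (err) {
        // Like any EventEmitter, this throws if nothing listens for "error".
        return this.emit("error", err);
      }
      // Tell the higher level workflow that the drink is ready for the tray.
      this.emit("drinkup", drink);
    });
  }
}

// In-memory stand-in so the sketch runs without a broker.
const fakeRequester = {
  request(queue, message, reply) {
    reply(null, { type: "cola", madeFor: message.item });
  },
};
```

The workflow object only sees `make()` and the "drinkup" event, so it does not need to know whether the drink was made in the same process or by another one.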
In this example, the drink station is handled by another process somewhere else. The code in question sends off a request for the drink to be made and then waits for the drink data to be returned. Once the drink comes back, it triggers the “drinkup” event so that the higher level workflow can be notified that the drink is now ready to be put on the tray.
As simple as this example is, there are some downsides. Running a request/response scenario like this requires a fairly fast drink station on the other side. If the response takes more than a few seconds, then the request/response pattern may be the wrong choice. It might be better to look at two-way messaging with multiple publishers and subscribers.
A Long Running Task
Using Request/Reply often seems like a good idea at first, but can be dangerous. Request/Reply does not guarantee the response will be handled by the requester, and typically requires a very fast response for it to be useful.
In the case of a drink station, burger station or fry station, it will probably take anywhere from 30 seconds to 3 minutes (or more) to finish making the item. This is probably too long for the Request/Reply to hang around waiting. What you need instead is bi-directional messaging with multiple publishers and subscribers.
In the case of the drink station object, there will be two child objects:
- Drink Request Sender
- Drink Response Handler
Off-hand, this looks like request/response, but it will be implemented with a more straightforward set of objects. One that can send a message and one that can receive a message.
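Here is one way the split could look, as a sketch rather than a finished implementation. The `publisher` and `subscriber` objects, their `publish(queue, message)` and `subscribe(queue, handler)` methods, the queue names and the `InMemoryBus` stand-in are all hypothetical; in a real system they would be backed by RabbitMQ.

```javascript
const { EventEmitter } = require("events");

// Sends the request and returns immediately; it never waits for a reply.
class DrinkRequestSender {
  constructor(publisher) {
    this.publisher = publisher;
  }
  send(orderId) {
    this.publisher.publish("drink.request", { orderId, item: "drink" });
  }
}

// Listens for finished drinks, whenever they happen to arrive.
class DrinkResponseHandler extends EventEmitter {
  constructor(subscriber) {
    super();
    subscriber.subscribe("drink.done", (message) => this.emit("drinkup", message));
  }
}

// The station the workflow talks to is now just the two children combined.
class DrinkStation extends EventEmitter {
  constructor(publisher, subscriber) {
    super();
    this.sender = new DrinkRequestSender(publisher);
    this.handler = new DrinkResponseHandler(subscriber);
    this.handler.on("drinkup", (message) => this.emit("drinkup", message));
  }
  make(orderId) {
    this.sender.send(orderId);
  }
}

// In-memory stand-in for a message broker so the sketch runs on its own.
class InMemoryBus {
  constructor() {
    this.handlers = {};
  }
  publish(queue, message) {
    (this.handlers[queue] || []).forEach((handler) => handler(message));
  }
  subscribe(queue, handler) {
    this.handlers[queue] = this.handlers[queue] || [];
    this.handlers[queue].push(handler);
  }
}
```

Because sending and receiving are independent, the sender does not hold anything open while the drink is being made, and the handler can pick up a finished drink even if it was requested before a restart.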
Now that the requester and handler are separated, the system will be more fault tolerant – there will be more room for processes crashing and starting up again, and messages will have a better chance of getting to their intended destination.
But again, this new code is not without its own share of potential problems.
Reconstructing Workflow Objects
When you have a long running process facilitated by messaging, you may not want to keep the process object around in memory all the time. If there are hundreds or thousands of these instances running, that could eat up a lot of memory. Additionally, you have no guarantee that the server won’t go down and come back up in between messages that are sent back and forth.
In my email course / ebook on RabbitMQ Patterns, I talk about the challenge of ensuring the response message is handled by the right object.
The easiest way to do this is to again use the Correlation ID of the message. By sending an ID with the original command, and returning it with each status event message, you can apply the message to the correct job. The correlation ID in a typical request / response scenario will likely be a random GUID or UUID. In the case of the job status events, however, the correlation ID should be the job’s unique ID. This makes it trivial to find the job to which the event message applies, and update the job accordingly.
If the original object that is managing the workflow is no longer in memory, you will have to reconstruct that object when a related message comes in. This is where the correlation ID that is mentioned above comes into play. The correlation ID should be examined when a message arrives, and the correct workflow object should be loaded into memory again. Once that has been done, the message can be processed by the object, the state can be saved and then the workflow object can be unloaded from memory once more.
To make this happen, the code will have to be adjusted significantly: the relationship between the message listener and the workflow object has to be inverted.
Setting Up A Workflow Manager
To facilitate reloading the workflow object as needed, the message listener can no longer be held by the workflow object directly. Instead, there will need to be a third party involved – one that will listen for the messages, load the correct workflow and tell the workflow object to process the result.
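A simplified sketch of such a manager is shown here. The `InMemoryOrderStore`, the `subscriber.subscribe(queue, handler)` interface, the "item.done" queue name and the message shape are assumptions made for illustration; a real system would persist to a database and subscribe through a RabbitMQ client. The FoodOrder in this sketch is reduced to plain, serializable state so that it can be saved and rebuilt.

```javascript
// Hypothetical persistence; state is serialized so nothing lives in memory
// between messages.
class InMemoryOrderStore {
  constructor() {
    this.saved = new Map();
  }
  save(id, state) {
    this.saved.set(id, JSON.stringify(state));
  }
  load(id) {
    const json = this.saved.get(id);
    if (!json) {
      throw new Error(`No order found for correlation ID ${id}`);
    }
    return JSON.parse(json);
  }
}

// The workflow object, rebuilt from saved state each time it is needed.
class FoodOrder {
  constructor(state) {
    this.state = state;
  }
  static create(id) {
    return new FoodOrder({
      id,
      expected: ["burger", "fries", "drink"],
      tray: {},
      complete: false,
    });
  }
  handleItemReady(name, item) {
    this.state.tray[name] = item;
    this.state.complete = this.state.expected.every(
      (expectedName) => expectedName in this.state.tray
    );
  }
}

// The third party: it owns the listener, not the workflow object.
class FoodOrderManager {
  constructor(store, subscriber) {
    this.store = store;
    subscriber.subscribe("item.done", (message) => this.handle(message));
  }

  handle({ correlationId, name, item }) {
    // The correlation ID is the order's own ID, so it finds the right workflow.
    const order = new FoodOrder(this.store.load(correlationId));
    order.handleItemReady(name, item);
    // Save the new state; the order object can then be garbage collected.
    this.store.save(correlationId, order.state);
    return order.state.complete;
  }
}
```

Each message is handled by loading, processing and saving, so the order can sit untouched between messages and a process restart only loses whatever was in flight at that moment.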
With the FoodOrderManager in place, the code is quickly becoming more robust and fault tolerant. Instead of requiring the drink station to very quickly fill a drink and send it back, it can take its time and do it right. If the main process crashes or needs to pause the order for a while, it can. The original FoodOrder can be unloaded from memory and minutes, hours or even days later it can be reconstructed and continue processing.
Seeing The Flow And The Aggregation Of Patterns
The workflow object is a useful pattern in itself, but it also involves a number of other patterns at various times. There’s the workflow manager object, for one, but there’s also the individual message senders and receivers. Each of these individual steps is likely composed of one or more chunks of code that aggregate into the step: sub-workflows, in effect. They may not be implemented as workflow-like objects, but they could be if needed.
However the steps are implemented, all of these patterns come together in the end to create the higher level workflow. Even with details omitted (error handling, missing object definitions, the need for a real RabbitMQ library, etc), the benefits of having a workflow in place for messaging are there. You get the high level process definition in the FoodOrder workflow, and you can easily modify or replace the details of individual steps as needed.
This is only the beginning of exploration into workflow and architecture with messaging systems like RabbitMQ. In reality, you’ll find many more situations and many more patterns that become quite useful and necessary. But the idea of having a high level workflow is an important stepping stone in moving forward and creating maintainable systems.
There are still other questions to ask and answer, as well. For one, what happens when the FoodOrder workflow completes and needs to notify the original requesting object or code that it is done? This can be handled with additional messages and communication patterns, of course… but these and other questions are going to have to wait for another time or another set of resources.
For more information on the ideas, the interviews mentioned and the details of workflow in messaging architectures, take a look at the following resources:
- Enterprise Integration Patterns – the book on messaging
- RabbitMQ Patterns – a collection of common patterns, applied to RabbitMQ
- NServiceBus Sagas – an introduction to the concept as built in to NServiceBus (for .NET); worth reading for any developer
- Jimmy Bogard’s Saga Pattern Wrap-up – a great series on sagas. You should read Jimmy’s blog, in general (.NET focused)
- ServiceBus and Architecture – an interview with Udi Dahan
- RabbitMQ For Developers – my collection of training materials for RabbitMQ, including the Jimmy Bogard and Udi Dahan interviews, and more
- RabbitMQ’s “How To” page – an extensive list of articles, books and resources around working with RabbitMQ
While this is a good place to start, there are still other great resources out there covering the concepts of workflow, long running processes and more.