Durability

Michael L Perry

Principal Consultant

In a system using the CQRS pattern, the client will typically send command messages to a queue. A handler will pull messages from that queue for asynchronous processing.

When we take this approach, we have to be sure not to violate the client’s trust. If they send us a command, and know that we have received it, they should expect that it will eventually be processed. Furthermore, if they send us a command, and don’t know whether we received it (because they got a network failure), then they should feel safe to re-send the command.

These are durability guarantees. We guarantee that the message was persisted. If the server reboots before it can handle the message, the message won’t be lost. The message is not removed from persistent storage until it is processed. And if the client sends a duplicate, the handler recognizes it and does not duplicate the results.

MSMQ command queue

If you are using MSMQ for the commands, you don’t get the durability guarantees for free. The default behavior of MSMQ is optimized for performance. Durability has a cost. To get it, use the following options:

Recoverable messages
Transactional queues
Private local queues

As C# developers, we have our own definition for public and private. These definitions do not apply to MSMQ. Private local queues are managed just at the machine level. A public queue is managed by Active Directory. Because we are using this queue as an inbox for a known command handler, private queues are the simpler option.

When the command handler first starts up, it needs to check whether the queue exists. If not, it should create it. Use the overload that takes an extra boolean parameter so that you can specify this to be a transactional queue.

_path = @".\private$\" + queueName;
if (!MessageQueue.Exists(_path))
{
    MessageQueue.Create(_path,
        transactional: true);
}

To send a message to a transactional queue, you have to be in a transaction. Create a TransactionScope to begin the transaction. Use the Send overload that takes a MessageQueueTransactionType so that you can tell MSMQ to participate in that transaction. And always remember to call Complete on your transaction scope when it’s done. If you forget any of these steps, you will find that your message is silently dropped, and never makes it to the queue.

We want to send recoverable messages to the queue. Recoverable messages are persisted. Non-recoverable messages are kept in memory. If the machine reboots before the handler removes them from the queue, non-recoverable messages are lost. Set the Recoverable property to “true” before sending.

using (var scope = new TransactionScope())
{
    using (var queue = new MessageQueue(_path))
    {
        queue.DefaultPropertiesToSend
            .Recoverable = true;
        queue.Send(message,
            MessageQueueTransactionType.Automatic);
    }
    scope.Complete();
}
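The snippet above sets Recoverable through DefaultPropertiesToSend, which applies when you pass a plain object to Send. If you build the Message yourself, those defaults are not applied, so the flag has to be set on the message. A minimal sketch of that variant follows, assuming the .NET Framework System.Messaging assembly; the CommandSender class and its Send method are hypothetical names used only for illustration.

```csharp
using System.Messaging;
using System.Transactions;

// Hypothetical helper, not part of System.Messaging.
public static class CommandSender
{
    public static void Send<T>(string path, T command)
    {
        using (var scope = new TransactionScope())
        using (var queue = new MessageQueue(path))
        // Mark this one message as recoverable so it is written to disk.
        using (var envelope = new Message(command) { Recoverable = true })
        {
            // Automatic enlists the send in the ambient TransactionScope.
            queue.Send(envelope, MessageQueueTransactionType.Automatic);

            // Without Complete, the transaction rolls back and nothing is queued.
            scope.Complete();
        }
    }
}
```

Either form should work; the per-message form just makes the durability setting visible at the call site.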

The receiver must also create a TransactionScope to receive from a transactional queue. It should enlist its database connection in this same transaction. That way the removal from the queue and the commit to the database are performed atomically. If the message is not processed, it is not removed from the queue.

Because we are using this transaction scope for database operations, be sure to set the isolation level to ReadCommitted. By default the isolation level is Serializable, which is usually far too paranoid. Again, use the MessageQueueTransactionType to enlist in the active transaction, and remember to call Complete.

using (var scope = new TransactionScope(
    TransactionScopeOption.RequiresNew,
    new TransactionOptions
    {
        IsolationLevel = IsolationLevel
            .ReadCommitted
    }))
{
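    using (var queue = new MessageQueue(_path))
    {
        // Tell the XML formatter which type to
        // deserialize; without it, reading Body throws.
        queue.Formatter = new XmlMessageFormatter(
            new[] { typeof(T) });
        var msmqMessage = queue.Receive(
            MessageQueueTransactionType.Automatic);
        message = (T)msmqMessage.Body;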

        HandleMessage(message);

        scope.Complete();
    }
}

By using MSMQ in this way, you can guarantee that the message is persisted, and that it is not removed from persistent storage until it is processed. But this does not guarantee that duplicates are recognized. For that, we need to follow some simple guidelines.

Duplicate messages

If the client sends a command, and a network exception occurs, then the client doesn’t know whether the message made it into the queue. It has no choice but to retry. It is up to the server to detect duplicates and handle them appropriately. To make sure we don’t duplicate, follow these guidelines:

Client-generated identifier
Check for the outcome
Nod and smile

To detect duplicates, the server needs a way to uniquely identify commands. Because the duplication originates on the client, the unique identifier must originate there as well. Include a Guid command ID with every message.

public class PlaceOrder
{
    public Guid OrderId { get; set; }
    ...
}
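What matters on the client is that the identifier is generated once, before the first attempt, and reused on every retry. The sketch below illustrates that under some assumptions: CommandClient and SendWithRetry are hypothetical names, the send delegate stands in for whatever transport call the client really makes, and a network failure is assumed to surface as an IOException.

```csharp
using System;
using System.IO;

public class PlaceOrder
{
    public Guid OrderId { get; set; }
}

// Hypothetical client-side helper, not part of any library.
public static class CommandClient
{
    // Returns the number of attempts it took to send the command.
    public static int SendWithRetry(
        PlaceOrder command, Action<PlaceOrder> send, int maxAttempts)
    {
        for (int attempt = 1; ; attempt++)
        {
            try
            {
                send(command);
                return attempt;
            }
            catch (IOException)
            {
                // We cannot tell whether the server received the command.
                if (attempt >= maxAttempts)
                    throw;
                // Retry with the same instance, and so the same OrderId.
            }
        }
    }
}
```

The caller assigns OrderId with Guid.NewGuid() when it creates the command, then hands the same object to SendWithRetry. Generating a new Guid inside the retry loop would defeat the purpose, because the server could no longer match the retry to the original.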

When the handler has processed the message – whether it was successful or not – it should store an outcome. Make sure that outcome includes the command identifier. Then, when the next message comes in, you can simply check for an outcome having the same ID.

You might be tempted to throw an exception when you detect a duplicate. Doing so only confuses the issue. The best thing to do is nod and smile.

The client has already sent the original message, gotten an error, and sent the duplicate. They should not get an error after having successfully sent a message. What would the client do in response to that exception? The answer is the same as if they received no exception: move on. So don’t throw an exception to the client. Just nod and smile.

It’s better to just check for duplicates in the handler, after the client has already moved on. If you find an outcome for this command, then you know that the message is a duplicate. Just short-circuit the handler.

if (_pickListService.GetPickLists(message.OrderId)
    .Any())
    return;
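Putting the outcome check and the outcome record together, a handler might look roughly like the following. This is only a sketch: the Outcome class, the in-memory dictionary (standing in for a database table), and the process delegate are all assumptions made for illustration. In a real handler the outcome would be written in the same transaction as the queue receive.

```csharp
using System;
using System.Collections.Generic;

public class PlaceOrder
{
    public Guid OrderId { get; set; }
}

// Hypothetical outcome record; a real one would carry more detail.
public class Outcome
{
    public Guid CommandId { get; set; }
    public bool Succeeded { get; set; }
}

public class PlaceOrderHandler
{
    // In-memory stand-in for a database table keyed by command ID.
    private readonly Dictionary<Guid, Outcome> _outcomes =
        new Dictionary<Guid, Outcome>();
    private readonly Action<PlaceOrder> _process;

    public PlaceOrderHandler(Action<PlaceOrder> process)
    {
        _process = process;
    }

    public void Handle(PlaceOrder message)
    {
        // An outcome already exists, so this is a duplicate: nod and smile.
        if (_outcomes.ContainsKey(message.OrderId))
            return;

        bool succeeded;
        try
        {
            _process(message);
            succeeded = true;
        }
        catch (Exception)
        {
            succeeded = false;
        }

        // Record the outcome whether processing succeeded or not.
        _outcomes[message.OrderId] = new Outcome
        {
            CommandId = message.OrderId,
            Succeeded = succeeded
        };
    }

    public Outcome FindOutcome(Guid commandId)
    {
        Outcome outcome;
        return _outcomes.TryGetValue(commandId, out outcome) ? outcome : null;
    }
}
```

Calling Handle twice with the same OrderId runs the processing delegate only once; the second call returns without side effects and without an exception.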

These guidelines will help you to create services that uphold the durability guarantees. If the client knows that you have received the message, then they know that you have already persisted it. If the power goes out, their command won’t be lost. Furthermore, they know that you won’t drop the message until it has been handled. And finally, if they have to send you a duplicate in order to know that you’ve received it, they can rest assured that you won’t duplicate the results.

Should I use this right away

Submitted by Breno (not verified) on Fri, 11/23/2012 - 06:55.

Hello Michael, I'm following your course on Pluralsight and that's how I got here. Nice to meet you.

I've got a simple question, hope you can help me out...

I'm about to write a simple e-commerce app, and I'd like to know if it sounds like a good idea to use MSMQ to process the orders. I'll probably deploy to Windows Azure and to a single machine. I mean, it is not physically a distributed system. Do the MSMQ benefits still apply to my scenario?

I'm not sure whether my app is going to be a success, but I keep thinking that I should build it as if it's going to be.

Yes

Submitted by Michael L Perry on Fri, 11/23/2012 - 12:04.

There are two good reasons for using these patterns right away. First, you tolerate failures without data loss. And second, you are more prepared for success.

Udi Dahan gave some tongue-in-cheek advice in a recent article (http://msdn.microsoft.com/en-us/magazine/cc663023.aspx). He said that you should take out logging. The reason is that if you call a web service to place an order, and that service fails, the log entry is the only record that you have of losing the order. So cover your tracks and discard the evidence.

His actual advice was to queue the request so that you can retry on failure. Even if the message is moved to the dead letter queue, you haven't lost the data. You can fix the software and replay it, or even process it manually. So start queueing today so you can handle failures.

The second reason to start today is just in case your application does well. If it takes off, you may find that one server is not enough to handle the load. Using these patterns gives you more opportunity to scale in the future. Retrofitting a scalable solution into a synchronous web service is difficult.

So get started now just in case you fail, and just in case you succeed.