Greg Young and Udi Dahan have documented a pattern that they found worked well in large-scale distributed systems. They called this pattern Command Query Responsibility Segregation, or CQRS. This pattern is based on the principle of Command Query Separation, or CQS, which is described in Bertrand Meyer’s Object-Oriented Software Construction. CQRS seeks to scale read-heavy components separately from write-heavy components. To achieve that scale, the responsibility for processing write-heavy commands is segregated from the responsibility for processing read-heavy queries.
Command Query Separation
The underlying principle of Command Query Separation tells us that a method should either change the state of the system or return results. A method should not do both. It is very difficult to reason about the behavior of a software system if it changes state whenever you observe it. The methods that change state are commands. The methods that return results are queries.
It is very easy for a user of the API to recognize commands and queries. Commands declare their return type as “void”. Queries have a non-void return type. This also makes it easy to spot the calls in code. The code will be doing something with the results of queries.
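As a minimal sketch of what this looks like in C#, consider the class below. ShoppingCart and its members are hypothetical names invented for illustration; they do not come from the article's code. Every method either returns void and changes state, or returns a value and changes nothing.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical example type, invented for illustration.
public class ShoppingCart
{
    private readonly List<decimal> _prices = new List<decimal>();

    // Command: changes state, returns void.
    public void AddItem(decimal price)
    {
        if (price < 0) throw new ArgumentOutOfRangeException(nameof(price));
        _prices.Add(price);
    }

    // Command: changes state, returns void.
    public void Clear()
    {
        _prices.Clear();
    }

    // Query: returns a result, leaves state untouched.
    public int GetItemCount()
    {
        return _prices.Count;
    }

    // Query: returns a result, leaves state untouched.
    public decimal GetTotal()
    {
        decimal total = 0m;
        foreach (decimal price in _prices)
        {
            total += price;
        }
        return total;
    }
}
```

A caller can invoke GetTotal() any number of times and always see the same answer until the next command runs.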
Have you ever seen code that ignores the return value of a method call? The caller probably made a mistake. It’s very difficult to see that mistake in the written code, since you can’t see that the method returns a value. If the method obeyed the principle of Command Query Separation, then it either wouldn’t have any effect on the system, or it wouldn’t have a return value to ignore. Either way, the caller wouldn’t be able to make that mistake.
When code follows the principle of Command Query Separation, callers are encouraged to organize their code in three sections. The first section gathers information by executing queries. The second section makes a decision based on that information. And the third section records the decision by calling a command.
There can be many calls to different objects to get the information the caller needs. These calls can occur in any order. Since the calls have no side effects, they can be easily reordered. New calls can be inserted with no change in correctness.
There is usually just one command issued at the end of this kind of sequence. Chains of commands open the possibility for corruption and inconsistencies due to partial successes. It is much easier to maintain code in which each command leaves the system in a consistent state. A subsequent sequence of operations can read that state and issue the next command.
This kind of structure is much easier to reason about. In the gather information section, we can rewrite the calls just as we would a mathematical equation. We can apply the commutative and associative properties, because the order in which we execute queries has no effect on their results. There is only one state change in the entire sequence, and that always occurs at the end.
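The gather, decide, record structure described in this section might be sketched as follows. Inventory, PriceList, and Purchasing are hypothetical types made up for this example, and the budget rule is an arbitrary stand-in for whatever decision the caller needs to make.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical types, invented for illustration.
public class Inventory
{
    private readonly Dictionary<string, int> _onHand = new Dictionary<string, int>();

    // Command: adds stock.
    public void Receive(string sku, int quantity)
    {
        _onHand[sku] = GetQuantityOnHand(sku) + quantity;
    }

    // Command: removes stock, or throws and leaves state unchanged.
    public void Reserve(string sku, int quantity)
    {
        int available = GetQuantityOnHand(sku);
        if (quantity > available) throw new InvalidOperationException("Insufficient stock.");
        _onHand[sku] = available - quantity;
    }

    // Query: no side effects.
    public int GetQuantityOnHand(string sku)
    {
        int quantity;
        return _onHand.TryGetValue(sku, out quantity) ? quantity : 0;
    }
}

public class PriceList
{
    private readonly Dictionary<string, decimal> _prices = new Dictionary<string, decimal>();

    // Command.
    public void SetPrice(string sku, decimal price)
    {
        _prices[sku] = price;
    }

    // Query: returns 0 for an unknown SKU.
    public decimal GetPrice(string sku)
    {
        decimal price;
        return _prices.TryGetValue(sku, out price) ? price : 0m;
    }
}

public static class Purchasing
{
    public static void ReserveWhatBudgetAllows(
        Inventory inventory, PriceList prices, string sku, decimal budget)
    {
        // 1. Gather: queries only, so these two lines can be swapped freely.
        decimal unitPrice = prices.GetPrice(sku);
        int onHand = inventory.GetQuantityOnHand(sku);

        // 2. Decide: pure computation on the gathered values.
        int affordable = unitPrice > 0 ? (int)Math.Floor(budget / unitPrice) : 0;
        int quantity = Math.Min(affordable, onHand);

        // 3. Record: a single command at the end of the sequence.
        if (quantity > 0)
        {
            inventory.Reserve(sku, quantity);
        }
    }
}
```

Because the only state change is the final Reserve call, a failure anywhere earlier in the method leaves the system exactly as it was.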
Command Query Responsibility Segregation
The CQRS pattern has a much different goal than its predecessor, the CQS principle. The goal of CQRS is to scale reads differently than writes.
Many of the systems that we build are read-heavy. We tend to read data much more often and in much higher quantity than we write it. You can actually calculate a ratio for any given system. Take a typical record. It will be written once. How many times will it be read? How many times will it be aggregated to produce a report? How many times will it be shared with the other users of the system? In most systems, it is not surprising to find that the average record is read hundreds or thousands of times. So the ratio of reads to writes is easily two to three orders of magnitude, in most business systems.
Given the read-heavy nature of business systems, when they come under load, it makes sense to scale up the number of resources dedicated to servicing reads. However, if those same resources are responsible for writes, then those resources are wasted. Worse, those resources can put heavier write load on the system, compete with the reads, and counteract the benefit of scaling up in the first place.
Where CQS separates commands from queries at the method level, CQRS separates them at the component level. Some components are responsible for commands – writes – and others are responsible for queries – reads. To scale out a read-heavy business system, you can allocate more resources just to the read components.
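One way to picture the component-level split is with two separate interfaces, each implemented by its own class. This is only a sketch under stated assumptions: IOrderCommands, IOrderQueries, OrderStore, OrderWriter, and OrderReader are hypothetical names, and the shared in-memory OrderStore stands in for whatever storage or messaging actually connects the write side to the read side in a real deployment.

```csharp
using System;
using System.Collections.Concurrent;

// Hypothetical write-side contract: commands only, all returning void.
public interface IOrderCommands
{
    void SubmitOrder(Guid orderId, int quantity);
}

// Hypothetical read-side contract: queries only, all returning values.
public interface IOrderQueries
{
    bool OrderExists(Guid orderId);
    int GetOrderedQuantity(Guid orderId);
}

// Stand-in for the storage shared by the two sides (an assumption of this sketch).
public class OrderStore
{
    public readonly ConcurrentDictionary<Guid, int> Quantities =
        new ConcurrentDictionary<Guid, int>();
}

// Write component.
public class OrderWriter : IOrderCommands
{
    private readonly OrderStore _store;

    public OrderWriter(OrderStore store)
    {
        _store = store;
    }

    public void SubmitOrder(Guid orderId, int quantity)
    {
        _store.Quantities[orderId] = quantity;
    }
}

// Read component: more instances can be added without adding writers.
public class OrderReader : IOrderQueries
{
    private readonly OrderStore _store;

    public OrderReader(OrderStore store)
    {
        _store = store;
    }

    public bool OrderExists(Guid orderId)
    {
        return _store.Quantities.ContainsKey(orderId);
    }

    public int GetOrderedQuantity(Guid orderId)
    {
        int quantity;
        return _store.Quantities.TryGetValue(orderId, out quantity) ? quantity : 0;
    }
}
```

With the responsibilities split this way, a host could run one OrderWriter alongside many OrderReader instances, which is the asymmetry that CQRS is after.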
CQRS in practice
Consider this WCF contract.
public interface IFulfillmentService
{
    Confirmation PlaceOrder(Order order);
}
This method does not obey the CQS principle. It is changing the state of the system and returning a result. Hidden inside of this result is a tracking number. This result is quite expensive. To process an order and get a tracking number, the fulfillment system needs to:
- Locate inventory
- Allocate that inventory to this order
- Determine the shipment method
- Allocate space on a pallet
- Obtain a tracking number from the carrier
Because the tracking number is returned from the method, all of these steps must be performed synchronously. The caller will have to wait while those steps are performed before they know that their order has even been submitted.
The first step in correcting this problem is to define two methods: a command and a query.
public interface IFulfillmentService
{
    void PlaceOrder(Order order);
    Confirmation CheckOrderStatus(Guid orderId);
}
Now the service can return immediately after storing the incoming order. It doesn’t have to go through all of the steps before letting the caller know that the order was received. The caller can come back later and check the status of the order by executing a query.
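To make the caller's side of this concrete, here is a minimal in-memory sketch of the two-method contract. It is not the PharmaNet implementation. The shapes of Order and Confirmation are assumptions (the article does not show them), as are the OrderStatus enum, the InMemoryFulfillmentService class, and the CompleteFulfillment hook, which merely stands in for the background processing. The WCF attributes are omitted so that the sketch stays self-contained.

```csharp
using System;
using System.Collections.Concurrent;

// Assumed shapes; the article does not define these types.
public enum OrderStatus { Unknown, Received, Shipped }

public class Order
{
    public Guid OrderId { get; set; }   // assumed: the caller chooses the ID
    public string Sku { get; set; }
    public int Quantity { get; set; }
}

public class Confirmation
{
    public OrderStatus Status { get; set; }
    public string TrackingNumber { get; set; }   // null until one is assigned
}

public interface IFulfillmentService
{
    void PlaceOrder(Order order);
    Confirmation CheckOrderStatus(Guid orderId);
}

// Hypothetical in-memory implementation, for illustration only.
public class InMemoryFulfillmentService : IFulfillmentService
{
    private readonly ConcurrentDictionary<Guid, Confirmation> _confirmations =
        new ConcurrentDictionary<Guid, Confirmation>();

    // Command: records the order and returns immediately.
    public void PlaceOrder(Order order)
    {
        if (order == null) throw new ArgumentNullException(nameof(order));
        // TryAdd keeps a repeated submission from resetting an existing status.
        _confirmations.TryAdd(
            order.OrderId, new Confirmation { Status = OrderStatus.Received });
    }

    // Query: reports the current status without changing anything.
    public Confirmation CheckOrderStatus(Guid orderId)
    {
        Confirmation confirmation;
        return _confirmations.TryGetValue(orderId, out confirmation)
            ? confirmation
            : new Confirmation { Status = OrderStatus.Unknown };
    }

    // Hypothetical hook standing in for the slow fulfillment steps.
    public void CompleteFulfillment(Guid orderId, string trackingNumber)
    {
        _confirmations[orderId] = new Confirmation
        {
            Status = OrderStatus.Shipped,
            TrackingNumber = trackingNumber
        };
    }
}
```

Since PlaceOrder returns void, the caller in this sketch generates the order ID up front and later passes the same ID to CheckOrderStatus. That is one possible design choice, not necessarily the one the original service makes.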
The next step will be to move the processing of that command to a different component. We will see how to do that in the next article.
For the complete example, please download the code from the PharmaNet repository on GitHub. Look at the Fulfillment solution for this service.