Wednesday, May 8, 2013

Asynchronous Command Handlers, Object Identity, and CQRS

I have been working on a (closed-source, unfortunately) project in my spare time and have been using the Command Query Responsibility Segregation (CQRS) pattern. In most places, it is pretty easy to use asynchronous command handlers, but one part tripped me up a little: when a user creates a new aggregate root via, say, a web page, how do we handle the asynchronous nature of the command handler?

At one point, I thought about making these creation-type command handlers synchronous so the user could be warned if the creation failed or if there were errors. I was not very happy with this because it meant I couldn't just place the command on a bus and move on. I couldn't offload that work somewhere else, and the user had to sit and wait for the whole thing to finish. Basically, an all-around uncomfortable situation.

Why?

So, what made me care about this? Why is it important to think about? Well, it has some implications. First off, it makes one consider how an object's identity is created. If we have the domain create the id, then we have to sit around and wait for it to complete. (Hence, my dilemma.) Overall, this could work, but I was not satisfied with it. After all, my goal was to allow the web application to do as little processing as possible, other than serving up pages and communicating with the service bus.

Where do we go from here? I was a bit stumped and almost gave in to the idea that I would have to accept this band-aid fix as a permanent solution. Luckily, after some thought, and after drawing some pictures, I realized I was looking at the problem from the wrong point of view. Once I shifted my thinking, the answer seemed pretty obvious, and I wasn't sure why I hadn't seen it in the first place!

The First Step

The first step to my clarity was deciding that the domain did not need to be in charge of creating an identity. Instead, why can't we pass in, say, a globally unique id (GUID) and tell the domain that we expect something to be created with this id? Now, we don't have to sit around and wait on some database to assign an id and filter it back to the user. So, as part of our command, we can create a new identity and pass it into the domain from the web server. Since the server already has the targeted identity, we can forward the user to a "Success!" page with the new identity as a hidden field. We can either set a timer to forward the user to the read-only model or provide a link on which the user can click.
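In code, this idea is just a command that carries a caller-supplied id. Here is a minimal sketch; the `CreateFooCommand` type, its fields, and the commented-out bus call are my own illustration, not from any particular framework:

```csharp
using System;

// Illustrative command type; the name and fields are assumptions.
public class CreateFooCommand
{
    public Guid Id { get; private set; }
    public string Name { get; private set; }

    public CreateFooCommand(Guid id, string name)
    {
        Id = id;
        Name = name;
    }
}

public static class Demo
{
    public static void Main()
    {
        // The web server, not the domain, picks the identity.
        var id = Guid.NewGuid();
        var command = new CreateFooCommand(id, "My new foo");

        // bus.Send(command);  // fire-and-forget; the handler runs later

        // Because we already know the id, we can immediately redirect
        // the user to a "Success!" page that references it.
        Console.WriteLine("New aggregate id: " + command.Id);
    }
}
```

The key point is that `Guid.NewGuid()` happens before the command ever touches the bus, so nothing downstream has to report an id back to the user.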

But, what if it fails?

What if the creation fails? Well? Who cares? What does that actually mean in the domain? For me, in my current project, it didn't matter. We can display an "Oops! We screwed up!" page with a link back to the creation page. We could go so far as to reload the creation page with the data filled in (since we have the command, after all). Even if a user tries to cheat the system by re-using an identity to create an aggregate, we can detect it (the aggregate cannot be created when it has already been created!) and show the user an error page.

Wait, create what was already created?

Let's say a malicious user wants to trick the system into recreating an aggregate in hopes of gaining access to it. Well, we have to be careful here. In my solution, aggregates are made by newing up the target type, loading the event stream from the backing store, re-running the events, and then calling some method on the aggregate. This includes "create" methods. The constructor doesn't actually put the object in a created state. So, instead, we have something like:

public abstract class Aggregate
{
    public void ReplayEvents(EventStream stream) { ... }
    protected void PlayEvent(Event target) { ... }
}

public class Foo : Aggregate
{
    private bool _created;

    public Foo()
    {
        // Just get the underlying code ready,
        // but don't set the state to created.
    }

    public void Create(FooCreateParams createParams)
    {
        // Validate the params and all that fun stuff
        // and, if all is well, fire a created event.
        // If this has already been created, throw an
        // exception or, maybe, fire an error event.

        if (_created) { /* Blow up! */ }

        PlayEvent(new CreatedEvent(createParams.Id)); // You get the gist.
    }

    private void PlayEvent(CreatedEvent target)
    {
        // React to the event here.
        _created = true;
    }
}

So, if the object has already been created, we don't want to mess with the state. Depending on your domain and the context of your application, you could fail silently, fire an error event, or even throw an exception. No matter what you do, though, if a user somehow messes with the system (or you happen to have an identity clash) and tries to execute a Create on an already-created object, we do not want to hint that the user stumbled upon a real id.
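To make the replay-then-create flow concrete, here is a condensed, self-contained sketch of the handler side. The in-memory event store, the simplified `Foo`, and all the names here are my own stand-ins, not the actual project's types:

```csharp
using System;
using System.Collections.Generic;

// Simplified stand-ins for the event and aggregate above.
public class CreatedEvent
{
    public Guid Id;
    public CreatedEvent(Guid id) { Id = id; }
}

public class Foo
{
    private bool _created;
    public List<object> UncommittedEvents = new List<object>();

    // Re-running a stream that contains a CreatedEvent flips the flag,
    // so a re-used id arrives at Create() already "created".
    public void ReplayEvents(IEnumerable<object> stream)
    {
        foreach (var e in stream)
            if (e is CreatedEvent) _created = true;
    }

    public void Create(Guid id)
    {
        if (_created)
            throw new InvalidOperationException("Already created."); // Blow up!
        _created = true;
        UncommittedEvents.Add(new CreatedEvent(id));
    }
}

public class CreateFooCommandHandler
{
    // A dictionary playing the role of the backing event store.
    private readonly Dictionary<Guid, List<object>> _store;

    public CreateFooCommandHandler(Dictionary<Guid, List<object>> store)
    {
        _store = store;
    }

    public void Handle(Guid id)
    {
        // New up the aggregate and replay whatever history exists for
        // this id; for a genuinely new id the stream is empty.
        var foo = new Foo();
        if (_store.ContainsKey(id))
            foo.ReplayEvents(_store[id]);

        foo.Create(id); // throws on a second attempt with the same id

        if (!_store.ContainsKey(id))
            _store[id] = new List<object>();
        _store[id].AddRange(foo.UncommittedEvents);
    }
}
```

Handling the same id twice makes the second `Create` blow up, which is exactly the guard that keeps a re-used identity from clobbering existing state.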

Conclusion

With a little bit of thought, we were able to pare a seemingly complex operation down to a pretty simple solution that lets us keep our asynchronous operations. Now we have a pretty clean set of steps: the user hits submit, we load the command on the bus with a new id, redirect the user to a "success" page with some way of referencing the new object, and then let the user move on from there. On error, regardless of why, we let the user know an error happened and provide them with enough information to decide how to move forward.

Tuesday, May 7, 2013

Refactoring? Or Rewriting?

I was recently asked in an interview what it meant to refactor code and what the prerequisite was to refactoring. I have to admit, I was a bit thrown off being asked what the "prerequisite" is, since it seems to me to be reflexive software development nowadays. To me, refactoring is an interesting concept in software development that a lot of other engineering-type professions don't get to leverage as freely as we do. But, I think it has become (incorrectly) synonymous with rewriting code. (Or maybe refactoring is instead used as a guise to rewrite.)

Refactoring is very simple and is akin to rewording the language in your code to express the same idea, or original intent, in a different statement. Rewriting your code, on the other hand, changes the intended behavior of the code: the opposite of refactoring. I have spent a lot of time refactoring some code on my current project lately, and I have run across the following code a lot.

public List<Foo> GetFoos(SomethingElse[] somethingElses)
{
    var retval = new List<Foo>();

    foreach(var somethingElse in somethingElses)
    {
        retval.Add(new Foo() { Bar = somethingElse.Bar });
    }
    }

    return retval;
}

So, this is a pretty trivial sample of code, but it is really easy to see what is going on and what the intent of the code is. Basically, it is mapping a collection of one type of object to a collection of another type. We create a list and then iterate through the original collection, appending a new instance of Foo, and then return it. Using LINQ, we can actually simplify this just a tad and even type less code to get the same effect. (Refactor it.)

using System.Linq;

public List<Foo> GetFoos(SomethingElse[] somethingElses)
{
    return somethingElses
        .Select(x => new Foo() { Bar = x.Bar })
        .ToList();
}

Again, this is a trivial example, but it shows how we can refactor our code, keep the original desired behavior intact, and use a bit of syntactic sugar to clean it up. (I prefer LINQ, personally, over foreach, especially in nested situations.) But, trivial or not, what confidence do we have that we did not silently introduce a bug into our system? What assurance do we have? Well, before refactoring our code, we should put barriers in place to help us reason about it. In the simplest sense, we should have a unit test in place before we make our changes to assert that we did not introduce a bug. Of course, if we're good TDDers, we would have this unit test in place already and have a nice regression test to fall back on. If not, it would behoove us to get one in place, quickly.

[Test]
public void given_an_array_of_something_elses_it_should_return_a_list_of_foos()
{
    var somethingElses = Enumerable.Range(0, 5)
        .Select(i => new SomethingElse() { Bar = i })
        .ToArray();

    // Let's assume GetFoos is defined as a static method on Baz
    var result = Baz.GetFoos(somethingElses);

    for (int i = somethingElses.Length - 1; i >= 0; --i)
    {
        Assert.That(result[i].Bar, Is.EqualTo(somethingElses.ElementAt(i).Bar));
    }
}

Well, that's refactoring, but what about rewriting? If we look at our simple method, what happens when we pass in a null array of SomethingElse? As it stands, our method doesn't care and will attempt to iterate over it anyway. This, of course, results in a null reference exception, and we have to track down how it happened. But, let's say we decide to change the behavior of this method: we will instead throw an exception if the array is null, because the precondition has not been met. Since we are changing the method's behavior, we are rewriting our code. One hint that this is a rewrite is the fact that we need to introduce a new unit test.
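Here is one sketch of what the rewritten method might look like, with the surrounding types repeated so it compiles on its own (the field shapes of Foo and SomethingElse are my assumption):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class SomethingElse { public int Bar { get; set; } }
public class Foo { public int Bar { get; set; } }

public static class Baz
{
    // The rewritten method: the same mapping as before, but with a new
    // precondition. A null array now fails fast with a descriptive
    // exception instead of a bare NullReferenceException mid-iteration.
    public static List<Foo> GetFoos(SomethingElse[] somethingElses)
    {
        if (somethingElses == null)
            throw new ArgumentNullException("somethingElses");

        return somethingElses
            .Select(x => new Foo() { Bar = x.Bar })
            .ToList();
    }
}
```

The null guard is the behavioral change; everything below it is unchanged from the refactored version.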

[Test]
public void given_a_null_array_it_should_throw_an_argument_null_exception()
{
    var exception = Assert.Throws<ArgumentNullException>(() => Baz.GetFoos(null));
    Assert.That(exception.ParamName, Is.EqualTo("somethingElses"));
}

I have used NUnit-style syntax in this post, but it should be fairly clear what is being tested.

Now, why is this important? Well, as we write our code, we tend to learn a lot about it. We see new ways to implement things that allow us to later extend the behavior of something without changing the original meaning. It also allows us to nicely execute test-driven development: "Red -> Green -> Refactor."