Wednesday, October 23, 2013

Zip with LINQ

I really love LINQ. In fact, LINQ, mixed with extension methods, is really enjoyable and makes it hard to leave C#. (Among other things.) At any rate, one of my favorite LINQ functions is Zip, and I got to use it over the last few days to solve a problem that would have been annoying otherwise. The basic idea of Zip is to take two collections (IEnumerables) and combine their elements pairwise with a selector function. Zip's signature is:

public static IEnumerable<TResult> Zip<T, U, TResult>(
    this IEnumerable<T> first,
    IEnumerable<U> second,
    Func<T, U, TResult> selector)

Sometimes extension methods can be hard to read. Basically, this method is invoked on an IEnumerable of some type T. The first argument you pass is the other IEnumerable (of some type U) that you want to "zip" together with the first. The final parameter, Func<T, U, TResult>, says it takes a method that takes two parameters (of type T and U, respectively) and returns a TResult. Since T and U are the element types of the two IEnumerables we're zipping together, one can think of it as a method that takes one item from each of these collections and returns something else. Anything, really.
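To make the type parameters concrete, here is a minimal sketch. The ZipBasics class, the Describe method, and the sample ages are made up for illustration; only Zip itself comes from LINQ. Here T is string, U is int, and TResult is string.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper class, used only to illustrate Zip's type parameters.
public static class ZipBasics
{
    // T is string, U is int, and TResult is string.
    public static List<string> Describe(IEnumerable<string> names, IEnumerable<int> ages)
    {
        return names
            .Zip(ages, (name, age) => string.Format("{0} is {1}", name, age))
            .ToList();
    }

    public static void Main()
    {
        var names = new[] { "Tom", "Dick", "Jane" };
        var ages = new[] { 31, 42, 27 }; // made-up sample data

        // Prints "Tom is 31", "Dick is 42", "Jane is 27", one per line.
        foreach (var line in Describe(names, ages))
        {
            Console.WriteLine(line);
        }
    }
}
```

The selector is the only place where the two element types meet, so the result type is whatever the lambda returns.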

As a web developer, I have created myriad collections of items that have an associated count with each item. I don't know how many times I wrote something like (using Razor syntax here):

@{ int counter = 1; }
@foreach (var elem in Elements)
{
    <div>@elem.SomeProp - @(counter++)</div>
}

Too easy

Sure, this is a pretty trivial example, but my views have sometimes become jam-packed with this sort of "logic." (You know, 'cause this is "view" stuff, right?) But, had I given it some thought, I could have created a better view model that had this information in it. (And, in turn, kept my view a bit cleaner.)

One of the things to note about Zip is that it will only join items up to the length of the shorter collection and nothing more. We can use this to our advantage in situations like our previous pretend view. Let's say we want to take a collection of strings and list them out in a console app. We could do something like:

public static void Main()
{
    var names = new[] { "Tom", "Dick", "Jane" };

    for (int i = 0; i < names.Length; ++i)
    {
        Console.WriteLine("{0}-{1}", i + 1, names[i]);
    }

    // or

    int counter = 1;
    foreach(var name in names)
    {
        Console.WriteLine("{0}-{1}", counter++, name);
    }
}

Again, this is trivial so it doesn't seem too painful. But let's look at how we can do it with Zip:

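public static void Main()
{
    var names = new[] { "Tom", "Dick", "Jane" };
    var counts = new[] { 1, 2, 3 };
    // Pair each name with its count; the count goes first to match the loops above.
    var strings = names.Zip(counts, (name, count) => string.Format("{0}-{1}", count, name));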

    foreach(var val in strings)
    {
        Console.WriteLine(val);
    }
}

A little more

So, we get the same results as above with a little more typing (in this trivial case). But since we know that Zip only combines items up to the length of the shorter collection, we can do some interesting things. Since counting items has been a theme for this post, I'll stick with it. Let's say we want to take a collection of strings of arbitrary length and get the same effect as the previous two examples. Well, we could use a for-loop in every spot, or we can use a functional approach and create a generic method that accomplishes this. The Enumerable static class has a helper function called Range. It returns an IEnumerable<int> that counts up from a starting value for as many items as you ask for. Since the sequence is generated lazily as we iterate it, it is never loaded into memory all at once, so we can do something like this:

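// Extension methods must be declared inside a static class.
public static IEnumerable<TResult> ZipWithCounter<T, TResult>(
    this IEnumerable<T> first,
    Func<T, int, TResult> selector)
{
    // Range is lazy, so Zip only pulls as many numbers as first has items.
    // The counter starts at 0 here.
    return first.Zip(Enumerable.Range(0, int.MaxValue), selector);
}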

// And its use:

var names = new[] { "Tom", "Dick", "Jane" };
var zipped = names.ZipWithCounter((name, i) => string.Format("{0} {1}", i, name));

(This is of course simplified and skimps on error checking for brevity.)
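For what it's worth, here is a rough sketch of what a version with the error checking put back in might look like. The CheckedZipExtensions class name and the optional start parameter are my own additions for illustration and are not part of LINQ.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical class name; any static class will do for an extension method.
public static class CheckedZipExtensions
{
    // The "start" parameter is an assumption added here so the counter can begin at 1.
    public static IEnumerable<TResult> ZipWithCounter<T, TResult>(
        this IEnumerable<T> first,
        Func<T, int, TResult> selector,
        int start = 1)
    {
        if (first == null) throw new ArgumentNullException("first");
        if (selector == null) throw new ArgumentNullException("selector");
        if (start < 0) throw new ArgumentOutOfRangeException("start");

        // Range is lazy, and Zip stops when "first" runs out, so only as many
        // numbers are generated as there are items. The count is chosen so
        // that start + count - 1 never exceeds int.MaxValue.
        return first.Zip(Enumerable.Range(start, int.MaxValue - start), selector);
    }

    public static void Main()
    {
        var names = new[] { "Tom", "Dick", "Jane" };

        // Prints "1-Tom", "2-Dick", "3-Jane", one per line.
        foreach (var line in names.ZipWithCounter((name, i) => string.Format("{0}-{1}", i, name)))
        {
            Console.WriteLine(line);
        }
    }
}
```

Because this method is not an iterator block, the argument checks run as soon as it is called rather than when the result is first enumerated.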

There are a lot of use cases for Zip that clean up how we join two collections together into some arbitrary third collection. It definitely follows more of a functional style of programming than using a looped counter and can make code much easier to read by centralizing repeated logic. Zip is one of those underutilized tools in the LINQ tool belt.
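Coming back to the Razor view from earlier, one way the counter could move out of the view and into the view model is sketched below. NumberedItem and ViewModelBuilder are hypothetical names invented for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical view model: each item carries its own display number.
public class NumberedItem
{
    public int Number { get; set; }
    public string Text { get; set; }
}

// Hypothetical builder that a controller might call before rendering the view.
public static class ViewModelBuilder
{
    public static List<NumberedItem> Number(IEnumerable<string> items)
    {
        // Zip stops at the end of "items", so the lazy Range is never exhausted.
        return items
            .Zip(Enumerable.Range(1, int.MaxValue),
                 (text, number) => new NumberedItem { Number = number, Text = text })
            .ToList();
    }

    public static void Main()
    {
        // Prints "Tom - 1", "Dick - 2", "Jane - 3", one per line.
        foreach (var item in Number(new[] { "Tom", "Dick", "Jane" }))
        {
            Console.WriteLine("{0} - {1}", item.Text, item.Number);
        }
    }
}
```

With something like this in place, the view only has to loop over the items and print Text and Number, with no counter variable of its own.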
