Friday, May 21, 2010

Developer Tunnel Vision Syndrome

I wanted my very first post to be on a rather happier note that celebrates the beauty of learning functional programming with F#, or touts a technical triumph or a breakthrough of mine. But the post is going to be more sombre and talk about Developer Tunnel Vision; a syndrome I have been suffering from over the past couple of days until the eventual breakthrough this morning.

I say Tunnel Vision, because developers often spend inordinate amounts of time working exclusively in one technology or one programming language (C# for argument’s sake) and forget to touch on some neglected skills like thinking in T-SQL and finding solutions using solely set algebra and set operations.

To give you a concrete example, I present the problem I had (with contrived data for demonstration purposes): given a list of values (say, velocities) with start/end time range for each velocity; in tabular form the input data looks like:

Velocity    StartTime               EndTime     
--------    -------------------     -------------------
         2009-11-24 05:45:43     2009-11-25 04:23:18
0           2009-11-25 04:23:18     2009-11-26 07:00:00
0           2009-11-26 07:00:00     2009-11-27 06:23:18
40          2009-11-27 06:23:18     2009-11-27 23:57:22
0           2009-11-27 23:57:22     2009-11-28 09:00:00
0           2009-11-28 09:00:00     2009-11-30 01:57:22
0           2009-11-30 01:57:22     2009-11-30 11:00:00
0           2009-11-30 11:00:00     2009-12-02 03:57:22
0           2009-12-02 03:57:22     2009-12-04 05:57:22
40          2009-12-04 05:57:22     2009-12-04 15:45:43

Using only plain set operations in T-SQL, collapse date ranges for matching velocities, to produce the following:

Velocity    StartTime               EndTime       
--------    -------------------     -------------------
40          2009-11-24 05:45:43     2009-11-25 04:23:18
0           2009-11-25 04:23:18     2009-11-27 06:23:18
40          2009-11-27 06:23:18     2009-11-27 23:57:22
0           2009-11-27 23:57:22     2009-12-04 05:57:22
40          2009-12-04 05:57:22     2009-12-04 15:45:43

Simple at the first glance! So I went forward equipped with what I considered a solid knowledge in SQL and set algebra and started working on the solution.

I was amazed at how difficult it was, because I recently had spent the good part of the past year or so thinking exclusively in Object Oriented way and I felt my mind struggled to rid itself of inheritance and polymorphism and to think in joins and unions. I eventually got to the answer, not the most elegant or succinct, but a good answer nonetheless (it is irrelevant to post the solution here).

This morning however, I recanted yesterday’s solution and came up with a 3-line C# LINQ query that produced exactly the desired result.

var collapsedRecords = records
                .Aggregate(new List<Record>(), groupingFunctor)
                .GroupBy(dco => dco.GroupId, (k, v) => v)
                .Select(v => new Record
                                     Velocity = v.First().Velocity,
                                     ST = v.Min(_ => _.ST),
                                     ET = v.Max(_ => _.ET)

Along with the following simple plumbing:

Func<List<Record>, Record, List<Record>> groupingFunctor =
                (list, record) =>
                        record.GroupId = !list.Any() ? 1 :
                                        ((list.Last().Velocity == record.Velocity)
                                                 ? list.Last().GroupId
                                                 : list.Last().GroupId + 1);
                        return list;


    //simple model for the records
    internal class Record
        public int Velocity { get; set; }
        public int GroupId { get; set; } //helper property
        public DateTime ST { get; set; }
        public DateTime ET { get; set; }

Now we can digress and start raving about the accumulator pattern in the LINQ Aggregate extension method, or we can get to the point of this post:

Most of us juggle more than one language and/or technology. And in order to stay sharp (no pun intended!) we must practice, practice and then practice some more lest time and negligence dulls the edge of our skills.

Finally, I leave you with a bit of Charles Darwin:
"I have steadily endeavored to keep my mind free so as to give up any hypothesis, however much beloved (and I cannot resist forming one on every subject), as soon as the facts are shown to be opposed to it."