Friday, April 10, 2015

DRY considered harmful

Increasingly, I've been coming to the view that the DRY (i.e. Don't Repeat Yourself) principle is far too easy to abuse/overuse, leaving you with piles... nay, nasty tangled rose thickets of complexity that quickly grow out of control. In fact, I'd go so far as to argue that our "cures" for this - templating, generics, class hierarchies + networks (i.e. basically most "OO") - only make things worse most of the time, especially when everyone starts lauding them as "silver bullets" when they see them.

Another probably controversial and related view I've arrived at in the past few years is that nested/chained event callbacks are evil. It only takes one or two run-ins with these to realise how much it sucks when a UI inexplicably starts running slow as molasses and has a penchant for double/triple redrawing itself with a very visible white/blue flash each time, only for you to find, after hours or days of blood-curdling debugging, that the callbacks have become unintentionally reentrant by calling each other repeatedly...
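As a rough sketch of how this kind of reentrancy sneaks in: everything below (the `Observable` class, the `clamp` and `redraw` listeners, and the `guarded` flag) is made up for illustration and not taken from any real UI toolkit. A listener that "fixes up" the value calls `set()` again from inside the notification pass, which starts a nested pass and triggers a second redraw. One common mitigation, assumed here, is a simple flag that suppresses nested passes.

```python
class Observable:
    """Hypothetical value holder that notifies its listeners on every set()."""

    def __init__(self, guarded):
        self.value = 0
        self.listeners = []
        self.guarded = guarded    # assumption: opt-in suppression of nested passes
        self._notifying = False   # True while a notification pass is running

    def set(self, value):
        self.value = value
        if self.guarded and self._notifying:
            # Called from inside a listener: keep the new value, but let the
            # pass that is already running deliver it instead of nesting.
            return
        self._notifying = True
        try:
            for listener in list(self.listeners):
                listener(self)
        finally:
            self._notifying = False


def count_redraws(guarded):
    """Return how many times the 'UI' redraws for a single set(150) call."""
    redraws = []
    model = Observable(guarded)

    def clamp(m):
        # Well-meaning listener that writes back to the model it listens to.
        if m.value > 100:
            m.set(100)

    def redraw(m):
        # Stand-in for an expensive repaint.
        redraws.append(m.value)

    model.listeners.extend([clamp, redraw])
    model.set(150)
    return len(redraws)
```

Under these assumptions, `count_redraws(False)` reports two redraws for one `set(150)`, while `count_redraws(True)` reports one. Note that the guard only behaves here because `clamp` is registered before `redraw`; it is a sketch of the problem rather than a general cure.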

For an all too familiar take on this, see: http://www.benkuhn.net/rha

Put simply, DRY applies a LOT less than you probably think it does! In fact, removing repetition can hurt readability more than just leaving all the repetition in there.

It isn't for no reason that humans are said to be great pattern matchers, and at some level, we may even like repetition. It took me far too long to realise this about music... easily memorable music (i.e. the stuff that people like actively listening to) is repetitive. Partly this was because back when I still did a lot of music stuff, I did it because I "had" to (for various reasons for another time) rather than out of much appreciation for anything in particular, but also because at the time, I mainly listened to music just before nodding off each night. For that reason, "less stimulating" music that just seemed to "randomly flow" from one thing to another was a VERY good thing! On a side note: Thomas Newman soundtracks are great for this (minus the one or two tracks with some or other horrid song with vocals by some clueless schmuck stuck in there). Recordings of certain types of classical music also have varying effects: personally, I tend towards being a bit more productive the day after listening to orchestral/violin arrangements - suites/medleys/etc. - of Bizet's Carmen (though I should note that I am extremely picky about the way that this is performed... over half of the ones out there just sound wrong... lifeless, stilted/lethargic, and sonically barren). Also, I strongly encourage everyone to ACTIVELY AVOID listening to Mahler if they don't want to be seriously depressed for the next week!

I haven't really put any concrete guidelines on when exactly DRY can be applied safely, since there are no real one size fits all measures that can be applied. Instead, it is a matter of balancing the following concerns (and perhaps a few others too):
1) Can you count the number of repetitions on one hand? Two? If they fit on one hand, more likely than not, changing them all by hand won't take much time, so removing the repetition is NOT worth it. Sometimes, even up to 10 times you shouldn't need to worry.
2) How likely is it that changes need to be made here? For prototype/research/startup code, the answer is probably "very"... and in drastic ways which nullify the repetition that used to exist (NOTE: There's a reason this sort of code is derided by some. But that's because it serves a different purpose from the code its critics are exposed to on a daily basis; that is, it is meant to be a testbed for experimentation rather than ossified/stable architectural infrastructure). For small tools being added within an existing framework though, or in a more locked down "formal design" environment, the opposite considerations will usually apply. The only exception I guess is when your boss/client is an egomaniac who changes their mind every other meeting...
3) How complex/long is the thing being repeated? But, more importantly, how much of this varies, has potential to change, or can be considered to be a "customisation point"? A long and complex thing which is repeated in a few places but only varies in 2 places is a pretty good candidate, as is a math routine that performs a bunch of operations in a fixed order but is called hundreds of times.
4) Does creating an abstraction here make any semantic sense? That is, would abstracting away the detailed mechanics help the reader focus on the domain concepts/logic being expressed, by making the structure more self-evident? Or does adding the abstraction end up creating a library/toolbox of these tiny little "reusable components" with awkward long names, that "may" be useful later when we "want" to "reuse code" (a very laughable assumption that has been proven wrong many times over)?
5) Is it really necessary to use abstractions here? For instance, is there some language feature that everyone should know which is able to convey this idea, instead of having it handled in whatever fashion the developer in question considers her "style of the day" (e.g. using while-loops for iterating through a collection when a for-loop or a foreach-loop captures the idea better with less room for error; or using a series of if-statements when a switch would make it clearer that we're testing for the different values that a particular control variable can take)? Another consideration is whether the verbosity in question is something that suitable developer tooling can take care of (e.g. Android developer tools plugins).
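To make concern 3 a bit more concrete, here is a minimal sketch of the "repeated thing that only varies in 2 places" case. The invoice records and every function name below are hypothetical, invented purely for illustration. The shared mechanics live in one function, and the two customisation points (which records to keep, and what to call them) are passed in.

```python
from typing import Callable, Iterable


def summarise(invoices: Iterable[dict],
              keep: Callable[[dict], bool],
              label: str) -> str:
    """Shared mechanics; `keep` and `label` are the only two things that vary."""
    selected = [inv for inv in invoices if keep(inv)]
    total = sum(inv["amount"] for inv in selected)
    return f"{label}: {len(selected)} invoice(s), total {total:.2f}"


def paid_summary(invoices: Iterable[dict]) -> str:
    # Thin wrapper so call sites read in domain terms, not lambdas.
    return summarise(invoices, lambda inv: inv["paid"], "Paid")


def unpaid_summary(invoices: Iterable[dict]) -> str:
    return summarise(invoices, lambda inv: not inv["paid"], "Unpaid")


# Hypothetical sample data: {"id": str, "amount": float, "paid": bool}
SAMPLE = [
    {"id": "a", "amount": 10.0, "paid": True},
    {"id": "b", "amount": 5.5, "paid": False},
    {"id": "c", "amount": 4.5, "paid": False},
]
```

If a third or fourth variation point turns up later (different totals, different formatting per caller), that is usually the sign that the two summaries were only apparently the same, and going back to two plain functions may read better.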

1 comment:

  1. Having recently come to the same conclusion, I anticipated your article title and googled it. (Google makes a great source of confirmation bias :-) )

    I'm not sure whether I'd entirely agree with this comment:
    "... I'd go so far as to argue that our "cures" for this - templating, generics, class hierarchies + networks (i.e. basically most "OO") - only make things worse most of the time ..."
    though I appreciate you've left a lot of room for manoeuvre.

    I think that where DRY often goes wrong is when we stray across what the domain-driven design community would call bounded contexts. I think it's okay for code to be _apparently_ duplicated between domains if it's expressing the same idea about different things, particularly if there's any chance that the idea might change for one of those domains but not the other.

    There is still a place for non-domain-specific implementation to be shared, particularly algorithms and so on, but that stuff should be used for implementation only.

    It's worth watching some of Kevlin Henney's talks, particularly "SOLID Deconstruction" and the "7 habits..." one.
