Repeat Yourself - sometimes it is a good thing
The DRY principle may be messing up your code
In this article, we discuss the use of the DRY (Don't Repeat Yourself) programming principle. We will examine a case where DRY should NOT be used. Then we will walk through a step-by-step refactoring that follows the DRY principle, showing what to do and what not to do.
* * *
The DRY principle
Of all the programming principles, for a combination of reasons, DRY is the most attractive to me. DRY is a common word that hints at what the principle is about. Also, DRY is an acronym for a love-at-first-sight phrase: Don't repeat yourself.
Maybe because it makes sense so easily (who wants double work?) and is easy to remember, we tend to overrate it: we use it when we should not, when it is not necessary, or in the wrong way.
* * *
Real life analogies
This section is here just in case DRY has become a religion to you.
Imagine a luxury building with 30 apartments. Each apartment has 4 bedrooms with exclusive bathrooms plus two social bathrooms, which means 6 bathrooms per apartment. The total number of bathrooms in the building is 180 (30 x 6), plus some bathrooms outside the apartments. How many water tanks does a building of this type have? Two (one in constant use plus one in reserve); not 30 or 180.
Around the world, civil engineers and architects use the DRY principle to define that each building will have only 2 water tanks. But they ignore the DRY principle when setting the number of bathrooms in a luxury apartment.
All cars come with 4 wheels, but only one spare tire. This seems reasonable. DRY is OK for spare tires. But if you radicalize the use of the DRY principle when designing a car, you will not produce a car, but a unicycle.
Tesla Motors decided to go against the DRY principle. They created the Model S AWD by adding a second motor to the Model S. Their engineers feared that increasing the car's weight could decrease its range. What happened? The range increased because the car became more economical. You can check at www.fueleconomy.gov.
* * *
Don't DRY it
Consider the (oversimplified) code for the mechanics of a 2.5D game like the following. Its environment is like a 100x100 checkerboard, which means 10,000 squares. On the squares, we place 2,000 trees and 5,000 walls, creating a maze. This maze will be populated by 33 monsters and an avatar representing the player. The avatar must kill the monsters by shooting arrows:
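A minimal sketch of such a model, assuming a flat array of squares plus two separate lists. The names squares, creatures, missiles, updateCreatures and updateMissiles are the ones used in the discussion; the object shapes, the ticks counter and the helper place are hypothetical.

```javascript
const SIZE = 100; // the board is SIZE x SIZE

// One entry per square, 10,000 in total. Each square knows what stands on it.
const squares = Array.from({ length: SIZE * SIZE }, () => ({
    obstacle: null, // "tree", "wall" or null
    creature: null, // monster or avatar standing here, if any
    missile: null   // arrow flying over this square, if any
}));

// Redundant (non-DRY) lists: the same objects, reachable without scanning the board.
const creatures = []; // at most 34: 33 monsters plus the avatar
const missiles = [];  // arrows currently in flight

// Hypothetical helper: registers a creature on its square AND in the list.
function place(creature, row, col) {
    creature.row = row;
    creature.col = col;
    squares[row * SIZE + col].creature = creature;
    creatures.push(creature);
}

function updateCreatures() {
    for (const creature of creatures) { creature.ticks += 1; } // stand-in for AI and movement
}

function updateMissiles() {
    for (const missile of missiles) { missile.ticks += 1; } // stand-in for flight
}

place({ kind: "avatar", ticks: 0 }, 0, 0);
place({ kind: "monster", ticks: 0 }, 50, 50);
updateCreatures(); // touches 2 objects, not 10,000 squares
```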
Following the DRY principle, we would say that the lists creatures and missiles are redundant with the list squares. We could delete those lists and read the creatures and missiles directly from squares, adjusting the functions updateCreatures and updateMissiles this way:
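A sketch of what the DRY variant could look like, under the assumption that each square stores its creature and missile directly. The counter squaresVisited is instrumentation added only to make the cost visible; the object shapes are hypothetical.

```javascript
const SIZE = 100;
const squares = Array.from({ length: SIZE * SIZE }, () => ({ creature: null, missile: null }));
squares[0].creature = { kind: "avatar", ticks: 0 };
squares[5050].creature = { kind: "monster", ticks: 0 };

let squaresVisited = 0; // instrumentation only, not part of the game

// No creatures list any more: every square must be inspected.
function updateCreatures() {
    for (const square of squares) {
        squaresVisited += 1;
        if (square.creature === null) { continue; }
        square.creature.ticks += 1; // stand-in for AI and movement
    }
}

// Same story for missiles.
function updateMissiles() {
    for (const square of squares) {
        squaresVisited += 1;
        if (square.missile === null) { continue; }
        square.missile.ticks += 1; // stand-in for flight
    }
}

updateCreatures();
console.log(squaresVisited); // 10000 squares inspected to update 2 creatures
```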
If we modify the code this way we will worsen the performance of our application. When it comes to a game, performance is critical.
Before, to update the creatures (or missiles) we needed to iterate over a list of no more than 34 elements (6 in the case of arrows). Now, we need to iterate over a list of 10,000 elements (squares) and ask 10,000 times whether there is a creature in the square.
This is one of those cases where we must ignore the DRY principle.
* * *
DRY it right
Consider a small web application: a page to monitor browser performance while painting a canvas. This code sample is about the part where the most recent information is stored in two lists with 40 items each.
Note: the code samples here are missing "use strict" at the beginning of each .js file.
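A minimal sketch of the code in question, assuming each list keeps the 40 most recent values with the newest one at the end. The names starts, durations, registerStart and registerDuration are the ones discussed in this section; the loop body is an assumption.

```javascript
// main.js
const starts = new Array(40).fill(0);
const durations = new Array(40).fill(0);

function registerStart(start) {
    // shift every value one position to the left, dropping the oldest
    for (let n = 0; n < 39; n++) { starts[n] = starts[n + 1]; }
    starts[39] = start;
}

function registerDuration(duration) {
    // same mechanics, different names
    for (let n = 0; n < 39; n++) { durations[n] = durations[n + 1]; }
    durations[39] = duration;
}
```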
The code above works well. It was written in a relaxed way, acceptable considering it is a small test application (1 file, fewer than 200 lines).
The only difference between the registerStart and registerDuration functions is the names. So we could think of using the DRY principle on them. In fact, DRY is not necessary here (two small functions in a small codebase). However, let's go ahead and see what happens:
In my opinion, the correct way to apply the DRY principle is to create auxiliary functions in a library: a specific module with the purpose of ONLY SERVING other modules. Let's do it.
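A sketch of this first extraction, shown as two files in one listing. The helper name placeTailInList and the parameter names n, list and tail come from the discussion; the rest is an assumption.

```javascript
// library.js
function placeTailInList(list, tail) {
    for (let n = 0; n < 39; n++) { list[n] = list[n + 1]; }
    list[39] = tail;
}

// main.js
const starts = new Array(40).fill(0);
const durations = new Array(40).fill(0);

function registerStart(start) { placeTailInList(starts, start); }

function registerDuration(duration) { placeTailInList(durations, duration); }
```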
The main.js module is great. Much clearer. We don't even need a comment to understand how a start or duration is registered. We know that the value will be placed at the end of the respective list. We can check each function in the blink of an eye!
In the library.js file, we got an excellent name for the helper function: placeTailInList. From the name alone we can guess its internal mechanics. placeTailInList uses only universal names like "n", "list" and "tail", which we are used to. Easy.
However, we've created a problem for the future: we will not be able to change the desired length of starts without also changing the length of durations. Let's fix this:
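One way this fix could look, as a sketch only: the helper receives a string flag naming the list and derives the limit from it. The parameter name kind and the layout of the branches are assumptions.

```javascript
// library.js
function placeTailInList(kind, list, tail) {
    let max = 0;
    if (kind === "starts") { max = 40; }    // each list now has its own limit
    if (kind === "durations") { max = 40; }
    // an unknown flag leaves max at 0 and goes unnoticed here
    for (let n = 0; n < max - 1; n++) { list[n] = list[n + 1]; }
    list[max - 1] = tail;
}
```

Knowledge about specific lists and the generic shifting mechanics now live in the same function body.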
Good news: we achieved the independence of the starts and durations lists. And we were able to eliminate arbitrary use of the numbers 40 and 39 in (bottom of) placeTailInList.
Bad news: we have summoned a horrendous creature with many frenetic hands: The Mess. There are other problems, but our focus must be to send The Mess away:
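A sketch of the split, assuming the string flag is kept for now. The worker name shiftAndAppend is hypothetical.

```javascript
// library.js

// BOSS: translates the string flag into plain arguments.
function placeTailInList(kind, list, tail) {
    let max = 0;
    if (kind === "starts") { max = 40; }
    if (kind === "durations") { max = 40; }
    shiftAndAppend(list, max, tail);
}

// WORKER: knows nothing about starts or durations.
function shiftAndAppend(list, max, tail) {
    for (let n = 0; n < max - 1; n++) { list[n] = list[n + 1]; }
    list[max - 1] = tail;
}
```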
By dividing placeTailInList into two functions (BOSS and WORKER), we were able to eliminate The Mess!! The boss function prepares the arguments to be consumed by the worker function which runs perfectly, IGNORING ITS CONTEXT.
We have an imperfection: we are using flags in the form of literal strings. It is very practical but it is easy to create a bug that neither the compiler nor the interpreter can identify. This technique is only acceptable for small programs.
We have a problem: the boss function is making assumptions about the maximum (40) length of each list. Making assumptions is prohibited for a library function. This is the full list of prohibitions for a library function:
- it cannot read global variables
- it cannot write global variables
- it cannot call functions from other modules
- it cannot make assumptions about the context
Why so many rules if we can break them all and the code runs smoothly? Because breaking these rules makes the code LESS MAINTAINABLE and more prone to bugs in future edits.
Furthermore, what if instead of 2 lists (starts and durations) we had 20 lists to process? Our boss function would not have a DRY appearance. Let's correct it:
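A sketch of a boss function driven by a lookup table, assuming the limits live in a global object named lengths (the name used in the discussion). The worker name shiftAndAppend and the error message are hypothetical.

```javascript
// main.js: the maximum length of each list, defined in one place
const lengths = { starts: 40, durations: 40 };

// library.js

// BOSS: works for any number of lists and rejects unknown flags.
function placeTailInList(kind, list, tail) {
    if (!Object.prototype.hasOwnProperty.call(lengths, kind)) {
        throw new Error("unknown list: " + kind); // checked on every call, at run time
    }
    shiftAndAppend(list, lengths[kind], tail); // reads the global variable lengths
}

// WORKER: unchanged, still ignorant of its context.
function shiftAndAppend(list, max, tail) {
    for (let n = 0; n < max - 1; n++) { list[n] = list[n + 1]; }
    list[max - 1] = tail;
}
```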
We improved the boss function (placeTailInList). It can now handle any number of lists without needing to be modified. And we are checking every string flag it receives. Also, we put the maximum length of each list (40) in just one place, with global access.
We have a problem. The boss function is reading a global variable (lengths) and this is prohibited for a library function.
We have another problem. The check of the string flag happens at run time. We are reducing the performance of the execution of the application to make a check that should be done at compile time. It's time for more fixing:
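A sketch of one way to get there, assuming the string flags are replaced by named constants in main.js and the limit is passed as an ordinary argument. The constant names MAX_STARTS and MAX_DURATIONS are hypothetical. A mistyped constant name surfaces as a ReferenceError or a linter warning instead of silently misbehaving like a mistyped string.

```javascript
// library.js: a single function that reads no globals and assumes nothing
function placeTailInList(list, max, tail) {
    for (let n = 0; n < max - 1; n++) { list[n] = list[n + 1]; }
    list[max - 1] = tail;
}

// main.js
const MAX_STARTS = 40;
const MAX_DURATIONS = 40;
const starts = new Array(MAX_STARTS).fill(0);
const durations = new Array(MAX_DURATIONS).fill(0);

function registerStart(start) { placeTailInList(starts, MAX_STARTS, start); }

function registerDuration(duration) { placeTailInList(durations, MAX_DURATIONS, duration); }
```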
Friend, we made it, finally! Our code is so good that we no longer need a boss function in library.js.
The code is now more readable, robust and maintainable than the original. It has fewer lines than... Oops! After all the refactorings our code now has 30% more lines than the original code. Wasn't it supposed to be DRY?
Hmm... we have one last card up our sleeve. The registerStart and registerDuration functions... each consists of a single line. We could eliminate them and put their content directly where they are called. This would make the code shorter. OK, let's try:
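A sketch of the inlined version, assuming the helper takes the list, its limit and the new value. The caller measureFrame and the constants MAX_STARTS and MAX_DURATIONS are hypothetical.

```javascript
// library.js
function placeTailInList(list, max, tail) {
    for (let n = 0; n < max - 1; n++) { list[n] = list[n + 1]; }
    list[max - 1] = tail;
}

// main.js
const MAX_STARTS = 40;
const MAX_DURATIONS = 40;
const starts = new Array(MAX_STARTS).fill(0);
const durations = new Array(MAX_DURATIONS).fill(0);

// Hypothetical caller. Before inlining it read as pure intent:
//     registerStart(start);
//     registerDuration(duration);
// After inlining, list names, limits and the helper's signature all leak in:
function measureFrame(start, duration) {
    placeTailInList(starts, MAX_STARTS, start);
    placeTailInList(durations, MAX_DURATIONS, duration);
}
```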
OH NO! We have summoned The Mess again. It seems that those 2 functions with only one line each had an important role: FILTER COMPLEXITY. Now we have thrown their complexity (internal details) directly on their callers...
We must refactor one step back!
* * *
Conclusion
The first big lesson of this article is "DRY should not be a concern". Having maintainable code is a concern. Having a performant application is a concern. Having a bug-free application is a concern. But DRY... it is not a concern. It is a technique that is sometimes good to use.
The second big lesson of this article is that if you are going to DRY, DRY it the RIGHT way. If you are not going to DRY the right way, don't even start. Or else you will be messing up your code.
* * *