Moisten up your tests a little 2

Posted by David Chelimsky Thu, 04 Jan 2007 22:36:00 GMT

DRY (Don’t Repeat Yourself) is a very useful guideline. As with all principles, however, there is a tendency to apply it as a golden rule without really thinking about why it is meaningful, and therefore not understanding when it is important to use and when it is not.

There is another principle that sometimes conflicts with DRY: Clean Code!

For example, look at these two specs and think about which is cleaner and easier to understand:


specify "user should supply name when registering" do
  @user = User.new(:name => "David")
  @registry.should_receive(:user_name).with("David")
  @user.register(@registry)
end


specify "user should supply name when registering" do
  @name = "David" 
  @user = User.new(:name => @name)
  @registry.should_receive(:user_name).with(@name)
  @user.register(@registry)
end

For me, the first one is simpler. No question. There’s one less line of code (that’s 25% in this case), I can clearly see the name and I don’t have to mentally decode the meaning of @name – nor do I need to look around to see where else it might be used. Remember, temporary variables are a code smell too.

The reason that I believe DRY to be important is that when you have duplicated code you run the risk of changing it in one place and forgetting to change it in another. If you do that, you’re likely to introduce bugs. That is the real risk – not the fact that you might have to type things more than once.

You could make the argument that the second example above manifests this risk. So in that case, you have to weigh clarity and DRYness and make a choice. I’ll usually choose clarity in tests.

There are other examples in which I really don’t think this risk is real. One is multiple calls to a no-arg method. This came up in an email thread on the rspec-users mailing list. The poster was looking for a way to avoid writing a line of code that logs in a user in many different rspec contexts.

The line to avoid was …


session[:logged_in] = true

... and I recommended including a module …


module LogIn
  def log_in
    session[:logged_in] = true
  end
end

context "Given a logged in user..." do
  include LogIn
  setup do
    log_in
  end
  ...
end

The poster responded that he was concerned that there is still duplication because you have to type the two statements in every context.

I see a difference between duplicated statements and duplicated calls to a no-arg method. While typing the call to log_in may require a few extra strokes, the meaning of it will never change in different contexts. So the risks associated w/ code duplication are mitigated. In fact, the real risk is that you might change its one and only meaning, in which case you’d have to look at each case and decide if the new meaning makes sense.

That problem wouldn’t go away if you could subclass contexts. In fact, it would be more sinister because the behaviour is implicit. At least if you look at log_in in each context you have some sense of what it means.

In summary, I believe strongly that DRY is a very important principle. But it’s not the only principle and it definitely isn’t implicitly the most important principle to apply in every situation. Sometimes a little moisture can be just the thing you need to clean up your code.

Comments

Leave a response

  1. Avatar
    Ravi Venkataraman 8 days later:

    The example of the name and the code samples is interesting.

    The first example has the magic string “David” twice. While the second has it only once. What if we mistyped the name in one of these places?

    One of the reasons to use temporary variables is that we may be using the value of the variable more than once in the same section of code. If so, there is a chance that we assign different values. In such cases temporary values are justified. Shying away from temporary variables simply because we have a dislike for them may cause problems.

    Another point to consider is maintenance. If the name has to be changed for some reason, say somebody’s name has some of the accented characters used in French, and we want to test whether accented characters are acceptable, don’t we have to change the first segment of code in two places, but only once for the second segment?

    In this case, I feel that maintenance will be easier with the second segment.

    Of course, in a decent sized method, you may have 15-20 lines intead of 3 and it balloons by five extra lines when you use temporary variables as needed. The ease of maintenance easily overrides the fact that I have five extra lines. It will actually lead to fewer bugs, and reduced possibility of bugs, too.

    For all these reasons, I would use the second code segment in such situations. If “David” were mentioned only once in the code, then I would consider the first code segment. But even here, I would ask the question, would it ever be necessary to test for a different name? If the answer was yes, then I would consider making the name as a parameter and passing it in to the method. That is done more easily with the second style when the temporary variable would just become an argument to the method.

  2. Avatar
    Tim 8 days later:

    In general, I agree with David, however I have to agree with Ravi concerning some code I wrote today. The test failed because I somehow managed to type a zero instead of a capital o in the “expected results” field. I am pretty careful to choose fonts that dot or slash the zero, but I still didn’t see it until I’d wasted time with less happy techniques. Silly things will get in your way. I converted to a local constant (final variable).

    So readability is important, and I like the way David writes code, but for a dunce like me I think that it’s important that everything be “singular and well-placed”. :-)

Comments