Refactoring Finds Dead Code 11

Posted by Bob Koss Thu, 01 Jan 2009 03:01:00 GMT

One of the many things that I just love about my job as a consultant/mentor is when I actually get to sit down with programmers and pair program with them. This doesn’t seem to happen nearly as often as I would like, so when two developers at a recent client site asked me if I could look at some legacy code to see if I could figure out how to get some tests around it, I jumped at the opportunity. We acquired a room equipped with a projector and a whiteboard. A laptop was connected to the projector and we were all able to comfortably view the code.

I visit a lot of different companies and see a lot of absolutely ghastly code. What I was looking at here wasn’t all that bad. Variable names were not chosen to limit keystrokes and method names appeared to be descriptive. This was good news, as I needed to understand how this was put together before I could offer help with a testing strategy.

As we walked through the code, I noticed that there were classes in the project directory ending in ‘Test’. This took me by surprise. Usually when I’m helping people with legacy code issues, there aren’t any tests. Here, there were tests in place and they actually passed. Very cool, but now my mission wasn’t clear to me as I thought my help was needed getting tests in place around legacy code.

The developers clarified that they wanted help in testing private methods. Ah ha, the plot thickens.

The question of testing private methods comes up frequently whether I’m teaching a class or consulting on a project. My first response is a question. “Is the private method being tested through the public interface to the class?” If that’s the case, then there’s nothing to worry about and I can steer the conversation away from testing private methods to testing behaviors of a class instead of trying to test individual methods. Note that a private method being tested through its public interface would be guaranteed if the class was developed TDD style where the test is written first, followed by one or more public methods to make the test pass, followed by one or more extract method refactorings, which would be the birth of the private methods. This is almost never the case. My client didn’t know how the code was developed, but by inspection they concluded that the parameters of the test were adequately exercising the private method.

It looked like my work was done here. But not so fast.

I have a policy that says that whenever I have code up in an editor, I have to try to leave it just a little bit better than when I found it. Since we had put the testing matter to rest and we still had some time left in the conference room before another meeting started, I suggested that we see if we could make some small improvements to the code we were examining.

As I said earlier, the code wasn’t horrible. The code smell going past my nostrils was Long Method and the cure was Extract Method.

The overall structure of the method we were examining was

    
    if( conditional_1 )
    {
        // do lots of complicated stuff
    }
    else if( conditional_2 )
    {
        // do even more complicated stuff
    }
    else
    {
        // do stuff so complicated nobody understood it
    }
    

where conditional_1 was some horribly convoluted expression involving lots of &&’s, ||’s, and parentheses. Same for condition_2, which also had a few ^’s thrown in for good luck. To understand what the method did, one would have to first understand the details of how the method did it.

I asked the developers if they could come up with a nice, descriptive method name that described what I’m calling condition_1 so that we could do an extract method refactoring and the code would look like:

    
    if( descriptive_name() )
    {
        // do lots of complicated stuff
    }
    // etc
    

Now there were less details to understand when trying to determine what this method did. If we were to stop here and check in the code, we could leave the session feeling good as the code is better than when we started. But we still had time before we had to abandon our conference room so I pressed on.

“Can you summarize what this code does as a descriptive method name,” I asked. The developers pondered a few moments and came up with what they felt was a good name. Excellent. We did the same procedure for the “else if” clause. When we finished that, one of the developers said something along the lines of, “That was the easy part, I have no idea what this [else] remaining code does.” I was going to pat everybody on the back and call it a day because the code had improved tremendously from when we started, but the developers seemed to have a “we can’t stop now” attitude. They studied the code, pondered, cursed, discussed some more, and then one of them said, “This code can never execute!”

I’d like to use the expression, “You could have heard a pin drop,” to describe the silence in the room, but since there were only three of us, the phrase looses its power. As it turns out, now that the if() and else if() conditionals were given descriptive names and people could grok them at a glance, it became obvious that the business rules didn’t permit the final else – the first two conditions were all that could exist and the most complicated code of all was never being called. This was downright comical!

I asked if the tests would still pass if we deleted the code and after a short examination of the tests, the developers weren’t as confident that the test parameters actually hit that area of code. There was a log() statement in that code and one of the developers was going to examine the production logs to see if the code ever executed.

So there you have it, refactor your code and the bad parts just go away!

Comments

Leave a response

  1. Avatar
    David L. Penton 1 day later:

    Even when someone exclaims that “this condition can never happen” it is still valid to check. Why? It ends up being an error condition – or let’s say a “known unknown”. That means that it is unknown what conditions may trigger that state. What if the business rules are incorrect? How would you know if you didn’t check for “that” condition?

  2. Avatar
    Steve Freeman 1 day later:

    I’ve been doing a lot of this kind of clean up recently and, as well as surplus code, it usually throws up all sorts of unclear logic. Usually we get about a couple of hours in and have to start going back to the business to ask them about all the new edge cases we’ve just found.

  3. Avatar
    André Faria Gomes 3 days later:

    Very nice post! I am trying to improve that in legacy code every day. Domain Driven Design techniques are helping a lot with that.

    Cheers!

  4. Avatar
    André Faria Gomes 3 days later:

    Very nice post! I am trying to improve that in legacy code every day. Domain Driven Design techniques are helping a lot with that.

    Cheers!

  5. Avatar
    Jebadiah Moore 4 days later:

    I see people do this a lot, and it’s always bugged me. Why would you create a whole new method for something you’re only going to do once? Sure, a good compiler will probably inline it, but it pollutes the namespace and besides, a lot of dynamic languages won’t inline it. Then you’re making your program slower when you could’ve just left a comment.

    This is ugly:
    if ( a&&b | c&&!(d+6) ){
        //stuff
    }
    
    But this seems even more complex:
    bool isFoobar(){
        return (a&&b | c&&!(d+6));
    }
    if ( isFoobar() ){
        //stuff
    }
    

    The best solution to me, unless you’re using the condition repeatedly, is

    if ( a&&b | c&&!(d+6) ){  //if isFoobar
        //stuff
    }
    

    And even if the condition is used repeatedly, unless you’re going to include it in some kind of API or you’re going to use it outside of the current module, it would probably be better to use a macro.

    This kind of pattern does come up a lot, though, even in proofs and such. Obviously it wouldn’t really make sense to add it to Java or C or anything, but it might be cool in a higher level language like Python or Ruby to have a feature like:

    if isFoobar where isFoobar = a&&b | c&&!(d+6):
        #stuff
    

    The where clause would simply insert the expression back into the if statement, but if would also create a (lazily-evaluated, preferably) reference to an automatically created closure which would re-evaluate the condition. That’d probably be more work than it was worth, though. (Maybe in Lisp).

  6. Avatar
    Jebadiah Moore 4 days later:

    Whoops, I just realized I only used one bar in the middle I doubt a bit-wise or is what would be needed :)

  7. Avatar
    yachris 5 days later:

    Great post. I’ve recently updated Eclipse using Yoxos’s automatic choose-the-plugins-you-want and they get the dependencies right service. This got me EclEmma, which shows code coverage with Green, Yellow and Red highlighting. It’s very cool, particularly in dealing with legacy code, to put together a set of tests for the ‘known’ cases and see exactly what code is covered. It would have helped here, early on.

    Jebadiah - evidently you work with actual professionals… I used to work with a guy who’d routinely write two to three hundred line methods, and swear that that was The Right Thing™ using arguments just like yours. Copy pasted code? “Sure! Why not? It gets the job done…!” Sigh… kinda glad he’s gone :)

  8. Avatar
    Paul 5 days later:

    Jebadiah : 1. you didn’t choose good names replace isFoobar with stuffIsNeeded. You’ll get:

    bool stuffIsNeeded(){
        return (a&&b | c&&!(d+6));
    }
    
    if ( stuffIsNeeded() ){
        //stuff
    }

    it may seems dummy, but I feel it is better…

    2. Once you’ve done that, you can test this ugly condition, without testing again and again the ‘stuff’ part. My own experience is that it’s pretty hard to test such a complex expression. Specially to be sure you when through all the possible paths…

    3. Then, you can refactor that ugly test and make it clearer. I would probably cut it in 2 parts around the | (even if it was a double one)... but well, that is up to you !

    My small experience also though me not to rely on comments: this is a classic pattern of what could happen in your case: step 1: your code
    if ( a&&b | c&&!(d+6) ){  //if isFoobar
        //stuff
    }
    step 2: some people don’t like comments at the end of the line, so
      //if isFoobar
    if ( a&&b | c&&!(d+6) ){
        //stuff
    }
  9. Avatar
    Paul 5 days later:

    sorry for the interruption….

    step 3: someone move the code, but forgot the comment:
    //if isFoobar
    //some other stuff, not related to Foo, bar, or anything

    Then you’re in the middle of nowhere, you’ve got a misleading comment in a code, and a piece of your logic lost (if you add choosen a real name, not isFoobar)...

  10. Avatar
    Ryan Breidenbach 6 days later:

    Unless I am 100% sure that the condition I am removing will never occur, I like leave trip wire just in case:

    if (foo())
      doFoo();
    else if (bar())
      doBar();
    else
      throw new IllegalStateException("Some helpful message.");
    

    This is especially true if there is minimal test coverage around the code or the code is significantly complex. Relying on eyeballing the code and my knowledge of the system is not enough. I would rather be sure and put it in the code.

  11. Avatar
    Slashene 5 months later:

    Jebadiah Moore, I think if your condition is not reused, you can do :

    bool isFoobar = a&&b | c&&!(d+6);

    if (isFoobar){ ...

    }

Comments