If-Methods 28

Posted by Michael Feathers Tue, 23 Sep 2008 23:26:00 GMT

I was working on some lexical analysis code this morning and I was stumped by a name. Or, rather, the lack of a good name. The code was a long series of if-statements, each of which tested for a lexical condition and then handled it:

if (Character.isLetter(ch) && identCharCount > MIN_IDENT_LEN) {
    currentToken.append(ch);
}

if (Character.isWhitespace(char ch) {
    tokens.add(new Token(tokenBuffer.toString(), TokenType.IDENT));
    currentToken = new StringBuilder();
}
The list of ifs went on and on. Each of them did something distinct and I wanted to break the method down into a set of meaningful sub-methods, but how? The typical pattern to use is to extract the bodies of the if-statements into separate methods, provided you can find good names for them. However, when I imagined what the code would look like after that refactoring, I didn’t really like it. It was still rather wordy:

if (Character.isLetter(ch) && identCharCount > MIN_IDENT_LEN) {
    appendToCurrentToken(ch);
}

if (Character.isWhitespace(char ch) {
    addToken(currentToken.toString(), TokenType.IDENT);
}
So, I thought that I’d extract each of the if-statements into its own method:

void onIdentChar(char ch) {
    if (Character.isLetter(ch) 
            && identCharCount > MIN_IDENT_LEN) {
        currentToken.append(ch);
    }    
}

void onWhitespace(char ch) {
    if (Character.isWhitespace(char ch) {
        tokens.add(new Token(tokenBuffer.toString(), 
                             TokenType.IDENT));
        currentToken = new StringBuilder();
    }
}
The calling code would certainly look cleaner:

    onIdentChar(ch);
    onWhitespace(ch);

The only problem is that the method names are a bit of a lie. When we use an “on” prefix we’re writing in event style. We expect that a method like onWhitespace is called only when we have whitespace. It shouldn’t have to check for it.

I’ve run into this situation before. How do you name a function which consists of a single if-block? The situation crops up frequently in legacy code. You want to break up a large method into several smaller ones and you’re left with a dilemma: do you extract the entire if-statement or just its body?

Here’s another example.


if (replyFlag == RE_SEND) {
    ...
    // code that sends a mail message
    ...
}

I once asked a friend what he’d name a method for that if-block if we extracted it and he said “sendEmailIfNeeded.” We both flinched and moved on to other things, but let’s think about that name a bit. Part of it talks about intention and part of it talks about implementation. Maybe we can pull the implementation bit out and concentrate on the condition:


if (needResend()) {
    ...
    // code that sends a mail message
    ...
}

Better. Now, how about extracting it like this?


ifNeedsResend();

It’s sort of like the event style I wrote about earlier. The method is a handler for a condition and the condition is checked in the method. Violation of the Single-Responsiblity Principle? Perhaps, but I do know that I’m looking at some code now that has been clarified by this idiom.


public List<Token> getTokens() {
    onStart();
    for (int n = 0; n < text.length(); n++) {
        char ch = text.charAt(n);

        ifLetter(ch);
        ifWhitespace(ch);
        ifLeftParen(ch);
        ifRightParen(ch);
        ifVerticalBar(ch);
        ifColon(ch);
        ifDigit(ch);
    }
    onFinish();
    return tokens;
}
What do you think?
Comments

Leave a response

  1. Avatar
    Dean Wampler about 1 hour later:

    My first reaction was “use a better language for parsing, like lex/yacc, antlr, etc.” ;)

    I’ve used the doFooIfBar() and related idioms before. Yea, it’s a bit of an SRP violation, but of course the principles are just that, principles, not laws. The clarity you got at the end is worth it, IMHO.

    One nit; it looks like only one if block will get used, so calling all the methods for all chars seems wasteful, or is that premature optimization and hence evil? It would certainly require the final getTokens to be a little more complicated.

  2. Avatar
    Micah Martin about 1 hour later:
    I’d consider turning each if block into a an object. Each object an instance of Rule implementing two methods: isSatisfied(char ch) and apply(char ch). Stick the Rule objects in a list/array and you’re down a loop.
    for(Rule rule in rules) {
         if(rule.isSatisfied(ch)
              rule.apply(ch);
    }
    
  3. Avatar
    Jeffrey Fredrick about 1 hour later:

    I’m thinking most people would have called it handleWhitespace(ch) w/out a second thought.

  4. Avatar
    Rick DeNatale about 2 hours later:

    I can’t see this as an improvement.

    When I see

    ifLetter(ch);

    I can only wonder

    if it's a letter then WHAT?

    How does this make the code clearer?

  5. Avatar
    Ola Ellnestam about 2 hours later:

    I agree with Micah here and would like to add some.

    I’ve tried programming without using If’s. Just to see how far you can take it. In fact I created a smaller application without any conditionals at all.

    It’s not something I can recommend for production but it was a fun and mind expanding experience.

    I ended up more than one time with solutions similar to Micah’s suggestion. The rules become very clear that way. The implementation as well.

    No SRP-violations and you sort of ‘configure’ in new behavior by putting new Rules into a list/map. Easily testable and very flexible.

  6. Avatar
    Dean Weber about 3 hours later:

    How about naming them….

    whenLetter(ch);

    .... etc. ?

    This would imply an action would be taken with the parameter when the condition is met. The ifLetter() does not necessarily connote an action will be taken.

  7. Avatar
    Phil about 3 hours later:

    I’m with Jeffery. We do this and we also address Deans desire for short-circuiting and add error handling on top of that by returning a boolean from our handleXxx() routines:

    if (!(handleLetter(ch) ||
            handleWhitespace(ch) ||
            handleLeftParen(ch) ...)) {
        throw new LexerException("Did not recognize character: " + ch...));
    }
    
  8. Avatar
    Quamrana about 3 hours later:

    I’m with Micah.

    Once you get a collection of objects, then iterating through them is easy. Micah is left with just one ‘if’ and you don’t even have to have that inside the for loop.

    Now the for loop is closed to modification of the collection – bingo, OCP!

  9. Avatar
    Samuel A. Falvo II about 3 hours later:

    When programming in Forth, this pattern occurs all the time. I demonstrate this in my “Over The Shoulder” video (see the torrent http://archive.forthworks.com/ots/ots-01.mpg?torrent .

    To summarize, Forth programmers can, and I argue should, abstract multi-path branches in a more linguistically natural mechanism. For example:

    : character ( ch—) eitherLetter orWhitespace orLeftParen …. lexerException ;

    Then, we define each choice as a guard on some action, through careful manipulation of the return stack. For example:

    : eitherLetter dup isLetter? if handle-it-here r> drop then ;

    Interestingly, reading about how Haskell compilers implements some of its optimizations, the use of “structured returns” isn’t novel—Haskell’s been optimizing like this for some time.

  10. Avatar
    Michael Feathers about 4 hours later:

    I like ‘handle’ and it works in this case, but I keep thinking about the generic problem of naming a method extracted around an if-block.

    The thing about ‘handle’ is illustrated by the other example: we could call the method ‘handleResend’, but I wonder if I’m alone in feeling that it sounds so noncommittal.. sort of like you say “handle this” to an object and it replies “Okeh, whatever..” :-)

    Rick: I think your objection is very interesting. Yes, if we say ifWhiteSpace() people are bound to say “then WHAT?” but I wonder.. why don’t we have the same reaction to onWhitespace? Why don’t we say “On whitespace WHAT?” I think it’s just because we’re used to the idiom.

    DeanW, Micah: the thing that is missing here is that this is an odd lexer with very weird indentation behavior. Doesn’t feel worth trying to find a tool that will handle it well (no pun intended). Also, a class for each of those methods would be fine, except that they fiddle with the state of the class. Not wanting to break it out yet.

    Samuel: Have to check that out. Thanks!

  11. Avatar
    Bruno Borges about 5 hours later:

    The Rules approach is the best I can think of. But I would improve it with anonymous inline classes (like callbacks).

    final char ch = ’ ’; final StringBuilder sb = new StringBuilder();

    List rules = new ArrayList(); rules.add(new Rule() { public boolean isSatisfied() { return ; } });

    public void apply() {
       // do something with sb and ch
    }

    // add several rules ...

    for(Rule rule in rules) { if(rule.isSatisfied()) rule.apply(); }

    —-

    Also, sometimes I just create local boolean variables as conditions:

    boolean isLetter = Character.isLetter(ch); // bunch of boolean variables

    if (isLetter) { // do something }

    if (anoterBooleanVariable) { // do one more thing }

    Cheers

  12. Avatar
    Cory FOy about 7 hours later:

    The thing I don’t like about handle is that you aren’t really handling anything. I would expect it to be a chain of responsibility, where the first match stops the execution.

    I thought also about what Micah mentions of turning each if into an object. But it’s the same thing – we don’t really need to go through each rule (or maybe we do).

    I almost want to say something like:

    public List<Token> getTokens() {
        onStart();
        for (int n = 0; n < text.length(); n++) {
            char ch = text.charAt(n);
            handleCharacter(ch);
        }
        onFinish();
        return tokens;
    }
    

    Or:

    CharacterHandler.createHandler(ch).handle(tokens);
    

    With the former just deferring the action to somewhere else, and the latter using polymorphism to determine the handler to create. Bonus points for the latter that you could dynamically wire up new handlers and conditions.

  13. Avatar
    S. M. Sohan about 13 hours later:

    What if you had an if-else if-else if…-else situation instead of just the ifs?

  14. Avatar
    Alf about 13 hours later:

    I see the chain of responsability here. Still, if that’s not doable, I’d go for handleIfSpace, handleIfLetter, etc. Just the ifSpace, ifLetter leaves me hanging on the implied but unanswered then-part of the if.

  15. Avatar
    Anthony Williams about 16 hours later:

    I agree with the previous comments that ifLetter, ifWhitespace need an associated action. If letter then what?

    I like the suggestion of a table-based loop, with predicates (isLetter, isWhitespace) and corresponding actions.

    As you originally mentioned, onFoo is an event idiom: the code raising the event calls onFoo, and somewhere there is a function that handles the event. In this case you still ask the question “on foo then what?”, but the answer is “do what the listener wants”. Hopefully the listener has a sensibly-named function. I don’t like the handlers for button click events to be called “onClick”, I prefer “addUser”, “deleteFile”, “closeWindow”, “saveDocument”. If your framework requires that the button handler is called “onClick” then a simple

    void onClick()
    {
        deleteSelectedFile();
    }

    works wonders for clarity.

    This doesn’t work with your ifLetter scheme since the whole point of these functions as you’ve got them is to perform the test and the action. handleLetter works a bit better, since it says “handle the case that this is a letter”. I’ve used handleFoo before, but “handle” is a bit generic.

  16. Avatar
    David Barker about 18 hours later:

    “Why don’t we say “On whitespace WHAT?” I think it’s just because we’re used to the idiom.”

    I don’t think this question is asked much, because it’s usually a framework that’s calling onEvent not your “own” code. So when people read though, they may see onEvent registered somewhere, and then read declarative code saying onEvent { do this; do that; }

  17. Avatar
    Mark Stijnman about 19 hours later:

    I liked Micah’s approach, but I think I’d prefer using a list of tuples with a boolean predicate and an action:

    
    parser = new parser();
    parser.addRule(isLetter, Append);
    parser.addRule(isWhitespace, NextToken);
    ...
    tokens = parser.parse(text);
    return tokens;
    

    or something similar. The “parse” function would, for each character in the text, simply iterate over all tuples, and whenever a predicate is true, perform the associated action.

    This solution allows you to capture both the intent of the predicate, as well as the intent of the action, and avoids coupling the two unnecessarily. I think it’s this coupling between predicate and action that is the source of all the unease with the proposed solutions so far.

    Michael, you were heading towards an event style notation, but shied away from it. What’s wrong with an event notation? You just needed to separate the event from the action, as in the above example.

    Ultimately, I suspect a Finite State Machine would be one of the most appropriate representations of the intent of this part of the code. You could use the example above as an intermediate step to refactor towards a full FSM implementation, by adding states and state transitions to the rules later. Of course, this application might not require a full FSM, but I can’t judge that based on the information given.

  18. Avatar
    Brandon Carlson 1 day later:

    I would agree with Cory seems chain o’ reponsibility would apply here. My only concern would be introducing the complexities of a pattern in a stable area of the code.

  19. Avatar
    Sebastian Kübeck 1 day later:

    Rick is perfectly right. The method doesn’t “if” but it generates a token from character if the character matches a certain pattern so I’d rather opt for:

    for (int n = 0; n < text.length(); n++) {
            char ch = text.charAt(n);
    
            scanLetter(ch);
            scanWhitespace(ch);
            ...
    }
    

    The general problem here is that the methods do two distinct things so the right name would be something like “scanLetterIfItsALetter” which is rather strange. In the long run, I’d go with Micah. You can extend the isSatisfied method with a context parameter that propagates state and/or apply weighted rules to make the mechanism smarter.

  20. Avatar
    dhui 2 days later:

    I think it’s a language flaw. If the language supports pattern matching, then there will be a beautiful result.

  21. Avatar
    Harald 13 days later:
    What about a new class TokenBuilder
    public class TokenBuilder {
        StringBuilder token ...
        List tokens ...
    
        public void appendIf(boolean condition, char c) {
            if (condition)
                token.append(c);
        }
    
        public void cutIf(boolean condition, TokenType type) {
            if (condition) {
                String content = token.toString();
                tokens.add(new Token(content,type));
                token = new StringBuilder();
            }
        }
    }
    
    The loop would become:
    ...
        TokenBuilder builder = ...
        for (int n = 0; n < text.length(); n++) {
            char ch = text.charAt(n);
            bool isLetter = ...
            builder.appendIf(isLetter,ch);
            bool isWhitespace = ...
            builder.cutIf(isWhitespace,TokenType.IDENT);
            ...
        }
    ...
    
  22. Avatar
    http://www.blacktowhiteiphone4.com over 2 years later:

    now supplying the latest white iphone 4 conversion kit to make your iphone 4 different from others. As a fashion fans like you will surely love it.

  23. Avatar
    accounting services over 2 years later:

    I agree with Micah here and would like to add some.

    I’ve tried programming without using If’s. Just to see how far you can take it. In fact I created a smaller application without any conditionals at all.accounting services

  24. Avatar
    okey oyunu oyna over 3 years later:

    useful informative article. Thanks

    internette görüntülü olarak okey oyunu oyna, gerçek kisilerle tanis, turnuva heyecanini yasa.

  25. Avatar
    louboutinchaussures over 3 years later:

    Many this the industry superb e-commerce, over the internet wanting, precisely what MBT footwear? I’ t convinced you choosed conduct. Plus the release with e-commerce team obtained a on-line shopping for. Always be with all this mbt lami boots and shoes plus boot footwear and even lots of people over the internet hunting Masai boots and shoes plus boot footwear could be a wonderful liking.Christian Louboutin Rouges chaud Make sure you execute could be a site into the stay, Christian Louboutin Bottessports book odds your person’ vertisements working out and also journal to look for and also MBT boots low-priced and also seek MBT boots out there. North Face Apex Bionic Kvinder JakkerType in getting some keyword phrase, similar to an example is definitely the following MBT Masai boots and shoes plus boot footwear ideal for lists pertaining to recommendations boots and shoes plus boot footwear and also web-site may might seem when while in front of any one, and then motivateNorth Face Denali Jakker Kvinder hoodie, to illustrate, a person’s sensible Web pages should do,

    canada goose will likely be MBT footwear providing could great final decision. Then you could well anticipation any one not any specs of your webpage acceptable this mbt baridi boots and shoes plus boot footwear in the united kindom normally often be subject material, you will get you wantcanada goose outlet.

  26. Avatar
    Ugg Italia over 3 years later:

    Ugg Italia , Step un paio di: Sentirsi quel contatto-(da sinistra di un individuo) Una volta sentito il contatto dal difensore pendente a sinistra, lato posteriore, mentre i dribbling di basket particolare utilizzando la tua offerta giusta, si renderà il vostro riposizionare: Spin, semplicemente piantando insieme spingendo utilizzando il piede sinistro ogni volta che rapidamente perno con l’arto inferiore destro, e modificare i dribbling a tutti i tuoi offer.As lasciato un perno singolo, mettere il ginocchio destro in modo che la parte posteriore sulla destra tricipiti / bicipiti può semplicemente un massaggio poco dal retro del difensore, che contribuirà a creare spazio a Voi relativi in più, il difensore, poi terminare la “rotazione” intorno a tutte le vostre part.As destra si lascia il backspin, l’avversario al momento essere l’alimentazione voi come si sta cercando proprio all’interno del baskeball canestro. (da destra) Se si potrebbe essere stato un dribbling di basket fino al lato più adatto, la cottura con il palmo della mano sinistra, attendere significato per contatto attraverso il difensore sul proprio terzino destro-side.While si sente un contatto, spin semplicemente piantando insieme spingendo utilizzando il piede destro ogni volta che rapidamente perno con l’arto inferiore sinistro, e modificare i dribbling a tutte le vostre destra offer.As uno spin individuale, messo il ginocchio sinistro in modo la schiena sul lato sinistro tricipiti / bicipiti può semplicemente un massaggio poco dal retro del difensore, che contribuirà a creare spazio a Voi relativi in più, il difensore, poi andare a girare intorno per il bordo sinistro.

  27. Avatar
    cheap gucci over 3 years later:

    o jordan release dates 2011 take care of Chen Wan-sheng of the burden falls on the shoulders of the old jordan 8 father. jordan 14 jordan retro 7 Manage space jam jordans outs…

  28. Avatar
    Office 2010 over 3 years later:

    This article is GREAT it can be EXCELLENT JOB and what a great tool!

Comments