Outliving The Great Variable Shortage

Posted by tottinger Tue, 27 Feb 2007 05:07:00 GMT

One of the more annoying problems in code, confounding readability and maintainability, frustrating test-writing, is that of the multidomain variable.

I suppose somebody forgot to clue me in to the Great Variable Shortage that is coming. I have seen people recycling variables to mean different things at different times in the program or different states of the containing object. I’ve witnessed magic negative values in variables that normally would contain a count (sometimes as indicators that there is no count, much as a NULL/nil/None would). I might be willing to tolerate this in programming languages where a null value is not present.

Yet I have seen some code spring from multidomain variables that made the code rather less than obvious. I think that it can be a much worse problem than tuple madness.

I have a rule that I stand by in OO and in database design, and that is that a variable should have a single, reasonable domain. It is either a flag or a counter. It is either an indicator or a measurement. It is never both, depending on the state of something else.

It is sort of like applying Curly’s law (Single Responsibility Principle) to variables. A variable should mean one thing, and one thing only. It should not mean one thing in one circumstance, and carry a different value from a different domain some other time. It should not mean two things at once. It must not be both a floor polish and a dessert topping. It should mean One Thing, and should mean it all of the time.
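A minimal sketch of the difference, in Python. The retry scenario, the `-1` sentinel, and every name here are hypothetical, invented only to illustrate the rule:

```python
from dataclasses import dataclass


# Multidomain: retries_left is a count, except that -1 secretly means
# "retrying is disabled". Every reader has to know the magic value.
def should_retry_multidomain(retries_left: int) -> bool:
    if retries_left == -1:  # a flag smuggled into a counter
        return False
    return retries_left > 0


# Single domain: one variable per fact.
@dataclass
class RetryPolicy:
    enabled: bool       # always a flag
    retries_left: int   # always a count

    def should_retry(self) -> bool:
        return self.enabled and self.retries_left > 0
```

The second version costs one extra field, and in exchange neither name ever needs a footnote.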

Surely I’ll take a shot from someone citing the value of a reduced footprint, and I won’t argue very long about the value of using only as much memory as you must. In C++ I was a happy advocate of bitfields. I haven’t been overly vocal about data-packing schemes, though I think they can be a reasonable choice for compressing many values into a smaller space. Still, I maintain that multipurpose variables are a bad idea and will damage the readability of any body of code.

I suggest, in the case of constrained footprint where there truly is a Great Variable Shortage (GVS) that if (and that’s a big if) the author absolutely MUST repurpose variables on the fly that it is the lot of that programmer to make sure that users of the class/struct never have to know that it is being done. Never. Including those writing unit tests. The class will have to keep data-packing and variable-repurposing as a dirty secret.
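One way to keep the secret, sketched in Python under assumed requirements: a hypothetical `SensorSample` packs a 12-bit reading and a validity flag into a single integer, but callers (and unit tests) only ever see two separate, single-domain facts. The class name, bit layout, and field names are all invented for illustration.

```python
class SensorSample:
    """Stores a 12-bit reading and a validity flag in one integer.

    The packing is a private detail; users see two distinct facts.
    """

    _READING_MASK = 0x0FFF  # low 12 bits hold the reading
    _VALID_BIT = 0x1000     # bit 12 holds the flag

    def __init__(self, reading: int, valid: bool) -> None:
        if not 0 <= reading <= self._READING_MASK:
            raise ValueError("reading must fit in 12 bits")
        self._packed = reading | (self._VALID_BIT if valid else 0)

    @property
    def reading(self) -> int:
        # Always a measurement, never a flag.
        return self._packed & self._READING_MASK

    @property
    def valid(self) -> bool:
        # Always a flag, never a measurement.
        return bool(self._packed & self._VALID_BIT)
```

If the footprint constraint ever goes away, `_packed` can be replaced by two plain attributes without touching any caller.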

Perhaps a strong statement or two in rule-form should be made here:

  1. A variable should have a SINGLE DOMAIN.
  2. One must NEVER GET CAUGHT repurposing a variable.

We don’t have to make do with the smallest number of variable names possible. Try to learn to live in plenty: use all the variables you want. For the sake of readability, consider having a single purpose for every variable not only at a given point in time, but for the entire program you’re writing.

Comments

  1. nullness about 2 hours later:

    “I might be willing to tolerate this in programming languages where a null value is not present.” You seem to imply that using a null value is an acceptable thing. How does null differ from your overall conclusion? Namely, null has a different purpose than the value normally bound to the variable name. IMO there should be no null. Use e.g. dataflow variables (single-store model), or a tuple (Boolean, A), or even better, a record (isSet: Boolean, value: A).

  2. Nullness about 10 hours later:

    If you squint at null just right. :-) It is acceptable partly because we are in the habit of accepting it. At least “no answer exists” is a fact about the counter.

    But imagine a variable “width” in an “image” class, and that a positive number is really the width of an image, but a negative number is the error code received when a loader routine attempted to read the image from a file. I think that’s a whole new level of “bad”.

    In the pysweeper example (a game I still can’t win, by the way) there is a tuple which has two multi-domain values. One is either the presence of a mine or else the number of adjacent mines, or a value of 2 meaning that it is neither a mine nor has the number of mines been counted. The second half of the two-tuple tells whether the cell has been opened or not also whether it has been flagged. Having a two-tuple acting effectively as a four-tuple is confusing. It took me a while to work out the meanings of the various values from the code. Once I surrounded the two-tuple with named methods (so that the code using the tuples is not comparing to magic numbers) it was easy enough to switch to a four-tuple with distinct facts. Of course, I picked the old Pysweeper program because it is hard to read. :)
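To illustrate the "four-tuple with distinct facts" idea, here is a minimal Python sketch. The field names and the helper function are guesses made for illustration; they are not taken from the actual Pysweeper source.

```python
from typing import NamedTuple


class Cell(NamedTuple):
    """One board cell, with each element holding exactly one kind of fact."""
    has_mine: bool        # always a flag
    adjacent_mines: int   # always a count
    is_opened: bool       # always a flag
    is_flagged: bool      # always a flag


def can_auto_expand(cell: Cell) -> bool:
    """An opened, mine-free cell with no neighbouring mines can be expanded."""
    return cell.is_opened and not cell.has_mine and cell.adjacent_mines == 0
```

With named fields, code that uses a cell reads as questions about the cell rather than comparisons against magic numbers.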

    The mixture of a two-tuple (no meaningful names for elements) and multi-domain variables made Pysweeper a tasty puzzle to solve in my spare time. The same kind of thing in a production program would have made me quite angry.

    While I agree with you that the typical null implementation is multi-domained (again, squinting just right) I personally can’t even make a fair comparison between the minor annoyance of a “magic null value” and the horror of repurposed variables.

    Thanks for bringing it up, though. We should sniff at our traditions from time to time, in hopes something much better can be created.

  3. YAChris about 11 hours later:

    We don’t have to make due

    Wearing my “language nazi” hat here, but it’s “make do”. Please, let’s not make this another “lose/loose” flagrant internet misspelling.

  4. John 1 day later:

    A variable should always have a variable from its SINGLE DOMAIN.

    s/have a variable/have a value/?

    By the way – agreed.

  5. Tim 1 day later:

    Correction made. I don’t have anything semi-humorous to add to the mistake. Many eyes make all blogs shallow?

    Thank you for the catch.

  6. djoubert.co.uk 5 days later:

    David,

    Well, I totally agree with you. It can become very frustrating in Perl to check whether the user has actually given input, as ‘0’, 0 and ‘’ all evaluate to false.

    PHP solves it nicely with a literal check: $value === false will only return true if $value has explicitly been set to false.

  7. Jason 5 days later:

    This page is nearly impossible to read due to the font size.

  8. Tim 5 days later:

    Re: null, zero, false, empty, etc:

    I was raised in C. In C it was considered very poor form to compare anything to “true” (1), because there are so many ‘true’ values, and relatively few ‘false’ values. We would even go so far as to define a macro (if you’ll forgive my reaching into ancient memory):
         #define IS_TRUE(x)   (!(!(x)))

    In C++ we ultimately got a boolean type that could be compared to true or false, but I was already trained not to compare v. true.

    Does it bother me? Yeah, but I forgot that it bothered me, having not messed with it in a while.

    By the way, Python gives you an unusually wide range of ‘false’ values, and I am always a little iffy about using it. I try to be careful and precise. It’s a funny game. After all, how hard is it (really) to create a boolean type with two possible values? You wouldn’t think it would take years of language evolution to get to this point.
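A short Python sketch of what "careful and precise" can look like. The `describe` function and its messages are hypothetical; the list of false values reflects standard Python truthiness rules.

```python
# All of these evaluate as false in a boolean context.
falsy_values = [None, False, 0, 0.0, "", [], (), {}, set()]
assert not any(falsy_values)


def describe(count):
    """Distinguish 'no answer' from 'the answer is zero' explicitly."""
    # A bare `if not count:` would lump None and 0 together.
    if count is None:
        return "not counted"
    if count == 0:
        return "counted, none found"
    return f"counted {count}"
```

Testing with `is None` keeps "no value" separate from a legitimate zero, which a plain truthiness test cannot do.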

  9. mgsloan 5 days later:

    Take a look at the Hindley-Milner type system. It’s got your requirement of having a specific domain down, yet disagrees with your judgement against multiple kinds. I suppose that repurposing and being able to take on multiple forms may not be exactly the same thing, yet one type, one variable, may hold entirely different data. And we’re talking about one of the most exactingly defined and implemented type systems.

    Basically every type’s domain is a sum and product of other types. You seem to agree with the product part, yet disagree with the sum part. Then again, this is likely an attack more on dynamically typed languages, in which case functional programming is fairly well behind you.

    One thing that languages often lack is a concise, well specified generics system. In Haskell, it’s called type parameters. This allows for constructing new types by composing types together. For example, the Maybe T type-constructor has a domain of Just T and Nothing, basically adding one value to T’s domain to construct Maybe T. This very nicely encapsulates the concept of Null used in other languages.
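Python has no built-in Maybe, but the idea can be roughly approximated. The sketch below is an analogue, not Haskell's actual type: `Just`, `Nothing`, `find_index`, and `describe_result` are illustrative names, and nothing here is enforced by Python at runtime the way a Hindley-Milner checker would enforce it.

```python
from dataclasses import dataclass
from typing import Generic, TypeVar, Union

T = TypeVar("T")


@dataclass(frozen=True)
class Just(Generic[T]):
    value: T  # the wrapped value, always from T's own domain


@dataclass(frozen=True)
class Nothing:
    pass      # the single extra value added to T's domain


Maybe = Union[Just[T], Nothing]


def find_index(items: list, target) -> "Maybe[int]":
    """Return Just(index) if target is present, otherwise Nothing()."""
    for i, item in enumerate(items):
        if item == target:
            return Just(i)
    return Nothing()


def describe_result(result) -> str:
    # The caller must unwrap explicitly; an index is never a sentinel.
    if isinstance(result, Just):
        return f"found at {result.value}"
    return "not found"
```

The index variable stays a pure index; "absent" lives in the wrapper rather than in a magic value such as -1.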

    Tim writes: Thanks for the H/M type system pointer. I’m digging in. And no, I tend not to attack any dynamically typed languages, as I have a strong preference for strongly, dynamically-typed languages like Python. I should get to know more of them.

  10. nullness 5 days later:

    Agreed with mgsloan. Another language that has those properties is Scala. Maybe is equivalent to Option[T], which is either Some[T](value) or None. You can then use pattern matching to branch based on the value. Scala also has a novel component construction mechanism with abstract type members, traits, explicit self-type annotations and mixin composition ;).

  11. JJM 5 days later:

    I’ve recently come to like it when empty arrays, tuples and strings evaluate to false. Most of the time it doesn’t matter whether a function was passed an empty array, an empty tuple or a null/none/nil value.

    This is especially true in horrendous languages like PHP which have functions that can return multiple types of results, like NULL when NULL was passed into the function, False when it can’t handle the inputs and an int if the function was successful….

  12. Tony Morris 6 days later:

    “A variable should have a SINGLE DOMAIN.”—Well, the very notion of a “single domain” is a projected perspective. Nevertheless, the correction is, “A variable should not exist”, since change is an illusion provided by a dimensional axis affectionately referred to as ‘time’.

    Quite a few red herrings in this post (blub?) which begs the question, ever used a pure functional language? No variables there; bring on the Great Variable Shortage!!

    Tim writes: Indeed, I have not tried pure functional languages. Most of my (limited) functional exposure comes from Python’s functional features. I am interested in learning to work that way, when it makes it up my “next thing to learn” list. Thanks!

  13. George Dinwiddie 8 days later:

    WRT: “I suggest, in the case of constrained footprint where there truly is a Great Variable Shortage (GVS) that if (and that’s a big if) the author absolutely MUST repurpose variables on the fly that it is the lot of that programmer to make sure that users of the class/struct never have to know that it is being done. Never. Including those writing unit tests. The class will have to keep data-packing and variable-repurposing as a dirty secret.”

    Actually, the compiler should do the repurposing for you, and you shouldn’t worry about the number of names in the symbol table. It’s been a long time since I last looked at compiler output, but compilers (since the late 1980’s, anyway) are very clever about figuring out the lifetimes of variables and re-using the space. I found back then that compilers could optimize much better than I could, and I could luxuriate in readable code.

  14. Tim 8 days later:

    Well, it is convenient in Python that {} and [] are also false values, but hardly satisfactory to me. It lacks a delicious explicitness. Let’s put it to bed, because it was hardly my main point.

    My main point is that the magic values should not change the meaning of the variable. This is the thing I’ve been seeing that led to the rant, and on that I feel very confident.

    If there are going to be variables, each should be a fact and the semantic of the fact should be stable and reflected in the name. I think that “eitherNameOrDateOfBirthForNonResidents” would make a particularly crappy variable name, and a sign of sloppy thinking.
