What I Don't Like About Python

Posted by tottinger Wed, 18 Jul 2007 02:41:00 GMT

I love Python.

I love the simplicity, the nice data types, the strong dynamic typing, the significant indentation, the runtime flexibility, the list comps, the generators, the way that I can get work done in Python. I love the built-in help. I dig the Tao of Python.

There are still a few bits that seem artificial and clumsy. I was told that you don’t really know a language until there are at least five things you hate about it. I have more than five, but these will do for now:

    • dir is not a very good name for the built-in that dumps the contents of an object. I don’t like the Ruby alternative, where methods and members are listed separately, but I just don’t like the naming here. A change would be possible. It would have to be grandfathered in, but it could happen.
    • Double underscores annoy me. I would like to see some other way of denoting operators and other magic functions. I suppose one could grandfather in a decorator-based trick for defining operators. It would be a real compatibility breaker, but I really feel that __getitem__ is probably not the best naming our guys could come up with.
    • __name__ has always felt like a hack to me. Try explaining it to a novice programmer, or worse, to someone who programs in Perl or Ruby or Java for a living. Comparing the magic variable __name__ to the magic constant “__main__” feels doubly hackish. I would rather have a convention such as naming a method “main”, or maybe decorating with @script_main now that we have decorators.
    • Blocks give me Ruby envy, or Smalltalk envy. I like local functions. Love them. I tolerate lambda. But I really, really would like to see a more Ruby-esque iterator setup where we pass a callable to the collection, and the callable can be defined free-form inline. Python doesn’t really do that, and so it’s less of a language lab than I might like.
    • Properties are unattractive, partly because blocks are absent. I don’t really want to define a storage attribute (with double underscores, most likely), then two named functions, and THEN declare a property; see the sketch after this list. That seems like so much work for such a simple situation. It is something I will only do if all other methods fail me, or if all the other methods amount to overriding __setattr__ and __getattr__.
    • Lack of recognition just kills me. This is a wonderful little language with great libraries and tremendous capability. Google, EVE Online, NASA, a great many scientific efforts, a bunch of web sites, and a lot of Linux installers and configuration tools use Python. A number of very nice distributed version control tools are written in it. But still people seem to go deaf if anyone mentions Python, as if we’d mentioned JCL or something. I don’t get that.
    • There is a very ugly mutual-import (circular import) design bug. I spent time on it once and was very unhappy.
    • Self-loathing takes over in my functions. I wish that I could refer to member variables without saying “self.” first. Testing would be easier if I could just import all my fixture variables into my local namespace (or better, have it done for me automagically). Typing “self.” doesn’t kill me, but it doesn’t help me either. Yes, I have an abbreviation in my editor, but it still bugs me.
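
To show what the property gripe means in practice, here is the ceremony for one trivial read-write attribute. This is a minimal sketch; Kettle and all its names are mine, not from any real code:

class Kettle(object):
    def __init__(self):
        self.__temperature = 20       # the double-underscored storage attribute

    def get_temperature(self):        # one named function to read it...
        return self.__temperature

    def set_temperature(self, value): # ...a second one to guard writes...
        if value > 100:
            raise ValueError("kettle boils over at 100C")
        self.__temperature = value

    # ...and THEN, at last, the property declaration itself.
    temperature = property(get_temperature, set_temperature)

Four names plus a class-body incantation, all so that kettle.temperature = 99 reads nicely at the call site.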

Don’t put me down as a complainer. I really, really love working in Python. There’s a lot more love than hate here.

PS: Is it uncommonly sensible that “not” is spelled “not”, “and” is spelled “and” and “or” is spelled “or”? I think so. && I’m ! just kidding || something.

Dependency Broadcaster

Posted by tottinger Tue, 17 Jul 2007 01:34:00 GMT

I’m not sure if there is already a code smell name for this situation. The idea is rather like “Large Class” or “God Class” but isn’t really related to behaviour. It’s just a matter of dependency.

Michael Feathers refers to “horrible include dependencies”, and that’s the right idea.

So what if you have a class file that includes (or forward-declares) a few hundred other classes, and that class is used by almost every other class in the system?

This is perhaps what comes from writing a very class-rich architectural layer as a single class.

Mainly the idea is that the class takes on a lot of ugly dependencies and spreads them evenly over the application. I would call that “broadcasting”, although the correct agrarian term is “manure spreader.” I will stick with the less evocative “dependency broadcaster.”

Use VIM to do TUT Unit Testing

Posted by tottinger Wed, 11 Jul 2007 14:47:00 GMT

I’m not crazy about TUT, but a customer is using it on a project and wanted to get a little assist from VIM. I don’t blame him. It’s a pain to keep track of your test numbers, and there’s always more help we’d like from our test frameworks.

So I scribbled up a little VIM script to handle some of the light housekeeping. It’s not a wonderful script, and I’m betting the readers can give me some pointers (I’ve only ever written a few other vimscripts). I’m sure that I could do better if I reflected and refactored, but I haven’t yet.

Maybe you know a good way to unit test vim scripts?

My script hijacks your F5 key. You can change that easily enough. Maybe I should have mapped it to \nt for “new test” or something. Anyway, enjoy and comment please:


function! NewTutTest()
    let s:testNumber = 0
    let s:newNumber = 0
    " Seek a higher number
    let s:list = getline("1","$")
    for s:line in s:list
        if s:line =~ 'test<'
            let s:newNumber =  0 + matchstr(s:line, '\d\+')
            if s:newNumber > s:testNumber
                let s:testNumber = s:newNumber
            endif
        endif
    endfor

    "Increment (tests are 1..n)
    let s:testNumber = s:testNumber + 1

    "Output the test values
    let s:line = line(".")
    let s:indent = repeat(" ", &shiftwidth)
    let result = append(s:line, s:indent . "template<>")
    let s:line = s:line + 1
    let result = append(s:line, s:indent . "template<>")
    let s:line = s:line + 1
    let result = append(s:line, s:indent . "void object::test<" . s:testNumber . ">()")
    let s:line = s:line + 1
    let result = append(s:line, s:indent . "{" )
    let s:line = s:line + 1
    let result = append(s:line, repeat(s:indent,2) )
    let s:line = s:line + 1
    let result = append(s:line, s:indent . "}" )
    call cursor(s:line, (&shiftwidth * 2))

endfunction

" Paste a version of this line into .vimrc to assign a keystroke (in this case F5)
" Comment-out this version if you do
" --------------------------------
map <F5> :call NewTutTest()<CR>Aset_test_name(" 

" Some helpful macros for command mode. Press \ten and it opens a new line
" and types ensure(" so you can fill in the string and the parameters.
" Overall, not too shabby.
map \tn oensure(" 
map \te oensure_equals(" 
map \td oensure_distance(" 
map \tf ofail(" 

" Abbreviations to help you in insert mode. Type the 
" two-letter name, followed by a quote.
ab tn ensure(
ab te ensure_equals(
ab td ensure_distance(
ab tf fail(

Revisit: The common subgroups

Posted by tottinger Tue, 03 Jul 2007 15:43:00 GMT

In cleaning up the code, I simplified the algorithm a little and improved performance considerably. Amazing how that works, how simpler equals faster for so much code. Adding simple data structures, local explanatory functions, and the like often makes code much faster.

What I’m hoping is that I will use this in a few different and useful ways.

The first use is to look for interfaces where a concrete class is being used from many other classes. You need to add an interface, but don’t know who needs which part. The goal is to figure out a relatively small number of interfaces that satisfy a number of clients in a module.

The second use would be to look for common clumps of parameters when I’m working in a large code base where the average number of arguments per function call does not remotely approach one. I suspect that there are clumps of similarly-named variables being passed around, and that these are likely “missed classes”. Sometimes these are obvious, but it would be good to see them in a nice list spit out from a nice tool.
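
To make the idea concrete, here is a made-up illustration (none of this is the customer’s code) of the kind of clump the tool should flag:

# The same loose variables travel together through many signatures...
def draw_line(start_x, start_y, end_x, end_y, color, width):
    pass

def clip_line(start_x, start_y, end_x, end_y, bounds):
    pass

# ...which is usually a missed class asking to be extracted:
class Line(object):
    def __init__(self, start_x, start_y, end_x, end_y):
        self.start = (start_x, start_y)
        self.end = (end_x, end_y)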

So this is a hopeful start on a series of useful tools.

Code follows.

import sys

def find_groups(input):
    """ 
    Exhaustively searches for grouping of items in a map, such
    that an input map like this:
        "first":[1, 2, 3, 4],
        "second":[1,2,3,5,6],
        "third":[1,2,5,6]
    will result in:
        [1,2,3]: ["first","second"]
        [1,2]: ["first","second","third"]
        [5,6]: ["second","third"]

    Note that the returned dict maps tuples to sets, not lists to
    lists as given above. Also, being a dict, the results are
    effectively unordered.
    """ 
    def tupleize(data):
        "Convert a set or frozenset or list to a tuple with predictable order" 
        return tuple(sorted(set(data)))

    def append_values(mapping, key, *values):
        key = tupleize(key)
        old_value = mapping.get(key, [])
        new_value = list(old_value) + list(values)
        new_value = tupleize(new_value)
        mapping[key] = new_value
        return key, new_value

    result = {}
    previously_seen = {}
    for input_identity, signatures in input.iteritems():
        input_signatures = set(signatures)
        for signature_seen, identities_seen in previously_seen.iteritems():
            common_signatures = set(signature_seen).intersection(input_signatures)
            if len(common_signatures) > 1:
                known_users = list(identities_seen) + [input_identity]
                append_values(result, common_signatures, *known_users)
        append_values(previously_seen, signatures, input_identity)
    return prune(result)

def prune(subsets):
    # Keep only groupings with more than one item shared by more than one user.
    filtered = {}
    for key, value in subsets.iteritems():
        if (len(key) > 1) and (len(value) > 1):
            filtered[key] = set(value)
    return filtered

def display_groupings(groupings):
    "Silly helper function to print groupings" 
    keys = sorted(groupings.keys(), key=len)
    for key in keys:
        print "\n","-"*40
        for item in key:
            print item
        for item in sorted(groupings[key]):
            print "     ",item
        print
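
And a throwaway driver, in case you want to watch it run. The sample data is lifted straight from the docstring; wiring it up to real nm output is left out here:

if __name__ == "__main__":
    sample = {
        "first": [1, 2, 3, 4],
        "second": [1, 2, 3, 5, 6],
        "third": [1, 2, 5, 6],
    }
    display_groupings(find_groups(sample))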

Python Subgroup Detection and Optimization

Posted by tottinger Thu, 28 Jun 2007 04:23:00 GMT

I had a moderately interesting customer problem to work on. I got acquainted with a bit of legacy code that is seriously in need of some interface segregation. It’s an entirely concrete class, used from all over the code base. The question is how to segregate, and that depends on what methods are called from which programs. We ran ‘nm’ to extract the link table from our object files, saving me the trouble of parsing C++ (a scary thought). All that remained was for me to compare the method prototypes used by the object files and find the common sets.

Lacking better ideas, I decided to do this exhaustively in a brute-force kind of way. It is only a few hundred files, so it shouldn’t take too long. I was very wrong. It took a long time and eventually failed.

I had TDD-ed the code, so I had tests of correctness, and I relied on these as I added optimizations, but of course the performance problem occurred only under a real load.

I could have run a profiler on it (and probably should have) but instead I simply monitored my computer. I quickly saw that my time was going into memory allocation, which is also the reason it died after many minutes. When I have Python performance problems, this is usually the reason, and almost never “interpreter drag”.

My friend Norbert (a gentleman of many programming languages, including Ruby) suggested that I wasn’t interning my strings, and of course I was not. I switched to interning strings and noticed a little improvement, which meant the program ran longer before failing from memory problems. Well, the tests still passed, so I knew at least that the logic was still good, even if the algorithm was primitive and Occam was spinning in his grave.
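
For anyone who hasn’t met it: intern() is a Python 2 built-in that keeps a single shared copy of a string, so thousands of identical symbol names parsed out of nm stop costing separate allocations. A quick sketch (the symbol text is invented):

prefix = "Big::Fat::Class::"
a = prefix + "someMethod(int)"   # a symbol parsed from one nm file...
b = prefix + "someMethod(int)"   # ...and the same symbol parsed from another
print a == b, a is b             # True False: equal, but two allocations

a, b = intern(a), intern(b)
print a == b, a is b             # True True: one shared copy after interning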

Next I realized that I was dealing with a lot of small groups of strings, and in a leap of optimism/faith/stupidity I decided to intern groups of strings. That helped only a little: my program now ran for the better part of an hour (!!!) before failing. “More of the same” without measurement is hardly a good recipe for optimization.

This is when I realized that I shouldn’t be storing sets or frozensets, which are pretty heavyweight data structures. I had chosen them because I was really working with set intersections, but hadn’t counted the in-memory storage cost. I converted the data structure used by the algorithm and added a local function to make tuples out of the sorted sets.

I was very glad to have my tests to catch me when I had some typing mistake or sloppy conversion. My tests had to be edited, but the changes there were very lightweight, and caused me to abstract out some data comparisons that were (admittedly) repeated. It was all good.

When I ran the full program it completed so quickly that I was sure I’d broken it. It was running with sub-second time, including gathering data from various text files (nm output files). I did a few spot-checks, and determined that it was indeed doing the right thing (as far as I know).

The data is moderately interesting, and I will be able to pick out some useful interfaces. Better yet, I have a program that can pick out all the uses of my big, fat class and recommend interfaces to me. This is all good.

Lessons learned:
  • The tests gave me peace of mind as I worked. I would so hate to have done this “naked”.
  • Python’s speed is fine (even startling) if you aren’t doing something wasteful and heavy-handed, like storing hundreds or thousands of non-interned strings and heavyweight data structures.
  • I could have been done sooner if I’d measured with hotshot instead of guessing (see the sketch after this list). This is very clear to me, and I won’t ever again think I’m too clever or my problem too simple to measure.
  • Keep some ruby friends on hand. They come in handy.
  • I didn’t really need a cooler algorithm. Brute force is sometimes enough.
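
For the record, the measurement I should have started with is only a few lines of hotshot. A sketch, assuming the clumps module shown above and a name-to-symbols dict called data (both stand-ins for the real thing):

import hotshot
import hotshot.stats
import clumps

# Profile the real workload instead of guessing at the hot spots.
profiler = hotshot.Profile("clumps.prof")
profiler.runcall(clumps.find_groups, data)   # 'data' is your name-to-symbols map
profiler.close()

stats = hotshot.stats.load("clumps.prof")
stats.sort_stats("time", "calls")
stats.print_stats(20)                        # the top twenty suspects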

I want to build a new wrapper for this code to compare parameter lists, to help find unrecognized classes in these same programs. It shouldn’t be too hard, but it will be a larger data set so I will probably need some more optimization or a cooler algorithm later. I think I see some waste in it now, but I guess I’ll have to fix that after blogging.

Of course, I’m not done learning. I am sure there are a lot of ways to improve the core code. I also believe in other people, so I should make it available for criticism and suggestions.

Here it is:

Tests

(Which, embarrassingly, could use more refactoring)
import unittest
import clumps

def keySet(someMap):
    return set(someMap.keys())

class ClumpFinding(unittest.TestCase):
    def testNoGroupsForSingleItem(self):
        input = { "OneGroup": [1,2] }
        actual = clumps.find_groups(input)
        self.assertEquals({}, actual)

    def testNoOverlapMeansNoGroups(self):
        input = {
            "first": [1,3],
            "second": [2,4]
        }
        actual = clumps.find_groups(input)
        self.assertEquals({}, actual)

    def testIgnoresSingleMatches(self):
        input = {
            "first": [1,3],
            "second": [1,4]
        }
        actual = clumps.find_groups(input)
        self.assertEquals({}, actual)

    def testTwoInterfaceMatches(self):
        group = (1,2)
        names = ("first","second")
        input = dict( [(name,group) for name in names])

        actual = clumps.find_groups(input)

        self.assertEquals(1, len(actual))
        [key] = actual.keys()
        self.contentMatch(group, key)
        self.contentMatch(input.keys(), actual[key])

    def contentMatch(self, left,right):
        left,right = map( frozenset, [left,right])
        self.assertEquals(left,right)

    def testFindsThreeGroupsMatchingExactly(self):
        group = [1,3,8]
        names = "one","two","three" 
        input = dict( [(name,group) for name in names ] )

        actual = clumps.find_groups(input)

        self.assertEquals(1, len(actual))
        [clump_found] = actual.keys()
        self.contentMatch(group, clump_found)
        self.contentMatch(names, actual[clump_found])

    def testFindsPartialMatchInThreeGroups(self):
        input = {
            "a":[1,2,3,4,5],
            "b":[1,4,5,6,8],
            "c":[0,1,4,5]
        }
        target_group = frozenset([1,4,5])
        names = input.keys()

        actual = clumps.find_groups(input)

        [key] = actual.keys()
        self.contentMatch(target_group, key)
        self.contentMatch(names, actual[key])

    def testFindsMultipleMatches(self):
        input = {
            "a":[1,2,3,4,5],
            "b":[1,4,5,6,8],
            "c":[0,1,4,5],
            "d":[1,2,3],
            "e":[1,3]
        }

        actual = clumps.find_groups(input)
        keys = actual.keys()

        self.assertEqual(3, len(actual))

        grouping = (1,2,3)
        referents = set(["a","d"])
        self.assert_(grouping in keys, "expect %s in %s" % (grouping,keys) )
        self.assertEqual(referents, actual[grouping])

        grouping = (1,4,5)
        referents = set(["a","b","c"])
        self.assert_(grouping in keys)
        self.assertEqual(referents, actual[grouping])

        grouping = (1,3)
        referents = set(["a","d","e"])
        self.assert_(grouping in keys)
        self.assertEqual(referents, actual[grouping])

if __name__ == "__main__":
    unittest.main()

The Code

import sys

def find_groups(named_groups):
    """ 
    Exhaustively searches for grouping of items in a map, such 
    that an input map like this:
          "first":[1, 2, 3, 4],
          "second":[1,2,3,5,6],
          "third":[1,2,5,6]
    will result in:
        [1,2,3]: ["first","second"]
        [1,2]: ["first","second","third"]
        [5,6]: ["second","third"]

    Note that the returned dict maps tuples to sets, not lists to
    lists as given above. Also, being a dict, the results are
    effectively unordered.
    """ 
    def tupleize(data):
        "Convert a set or frozenset or list to a tuple with predictable order" 
        return tuple(sorted(list(data)))

    result = {}
    for name, methods_called in named_groups.iteritems():
        methods_group = frozenset(methods_called)
        methods_tuple = tupleize(methods_group)
        for stored_interface in result.keys():
            key_set = frozenset(stored_interface)
            common_methods = tupleize(key_set.intersection(methods_group))
            if common_methods:
                entry_as_list = list(result.get(common_methods,[]))
                entry_as_list.append(name)
                entry_as_list.extend( result[stored_interface] )
                result[common_methods] = tupleize(entry_as_list)

        # The stored entry may already be a tuple (via tupleize above), so
        # copy it to a list before appending, then store it back.
        full_interface_entry = list(result.get(methods_tuple, []))
        if name not in full_interface_entry:
            full_interface_entry.append(name)
        result[methods_tuple] = full_interface_entry
    return prune(result)

def prune(subsets):
    # Apology: I'm betting I can do this in a functional way.
    # Keep only groupings with more than one item shared by more than one user.
    filtered = {}
    for key, value in subsets.iteritems():
        if (len(key) > 1) and (len(value) > 1):
            filtered[key] = set(value)
    return filtered

def display_groupings(groupings):
    "Silly helper function to print groupings" 
    keys = sorted(groupings.keys(), key=len)
    for key in keys:
        print "\n","-"*40
        for item in key:
            print item
        for item in sorted(groupings[key]):
            print "     ",item
        print

The Things That Pass For Simple I Can't Understand

Posted by tottinger Thu, 14 Jun 2007 05:19:00 GMT

(with apologies to Steely Dan for the nearly-lyrical title)

I’ve noticed that “Do the simplest thing that might possibly work” gets universal agreement in principle, and great divergence in practice.

Is a set of variables related by name prefix simpler than a named class containing the same variables? In the “fewest number of the fewest things” sense, I see a group of variables as a lot more to manage than a class. I can’t understand why some people pass the same group of loose variables to a number of methods. Isn’t a seven-argument method “complex”?
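
As a made-up illustration (all names mine): which of these is “simpler”?

# Seven loose variables related by name prefix, passed together everywhere...
def describe(rect_x, rect_y, rect_width, rect_height,
             border_color, border_width, border_style):
    print rect_x, rect_y, rect_width, rect_height
    print border_color, border_width, border_style

# ...or the same data gathered into the classes it was asking to become?
class Rect(object):
    def __init__(self, x, y, width, height):
        self.x, self.y, self.width, self.height = x, y, width, height

class Border(object):
    def __init__(self, color, width, style):
        self.color, self.width, self.style = color, width, style

def describe_shapes(rect, border):
    print rect.x, rect.y, rect.width, rect.height
    print border.color, border.width, border.style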

Is a custom exception “fancy” or “trivial”? I’ve had that discussion recently. Some feel one way, some the other. Is it more complex to throw a standard exception type and then try to figure out what it means elsewhere?
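
Again a sketch, with invented names: is the custom exception here really fancier than raising a standard type and letting every caller guess what it meant?

# A custom exception names the failure once, where it happens...
class InsufficientFunds(Exception):
    pass

def withdraw(balance, amount):
    if amount > balance:
        raise InsufficientFunds("short by %d" % (amount - balance))
    return balance - amount

# ...versus a standard type whose meaning each caller must reconstruct:
#     raise ValueError("amount exceeded balance")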

Is event-handling simpler than polling? I’ve heard this one too. I don’t know how other people see it. I think events have more complex plumbing, but that polling has a greater run-time complexity because of failure modes (periodicity issues, race conditions, etc). I think that it is simpler if you have fewer problems to look out for.

If you need a fixed set of java instances, is an enum simple or fancy? Here you have a newer language feature, but it exists to simplify the management of a fixed set of instances, so it leaves you less to deal with, no?

Is “more primitive” simpler? How about if you use arrays and integer indices into a string rather than a list of strings? Is that simpler or more complex? Is the Bowling Game “simple” because it uses a primitive array and a separate array counter instead of custom objects?

In C++, is it simpler to have a struct that contains two data members, or to use the pair<> template? Is it “simpler” that you have to refer to them as “first” and “second”? Or is that just more obscure? I think that “obvious” is simpler.

I am not sure how other people judge simple. I think “simple to use”, “less bookkeeping”, “harder to mess up”, “less setup to call”, and “fewer parameters” are simpler. Other people seem to have contrary thoughts.

Clearly simple involves more than “least thought” or “most primitive data components”. It should have something to do with the lack of effort in using something and the ease with which things can be used correctly. Well, clearly to me. And it should have to do with the least difficulty in reading the resulting code, and the least difficulty in changing it.

Unrelated to this, I find some tools that others find simple to be baffling, and some tools that are complex to others seem pretty easy to use to me. I suppose that’s a kind of “personal taste” but is there a more reliable rule to base “simple” on than personal choice? There should be if the rule is for us all to do the simplest thing that works, and to maintain the simplest design we can afford to build.

But I have unique viewpoints sometimes; some of the things that are simple to you may be all but inscrutable to me, and vice versa. I guess that happens.

Collateral Effort Revisited

Posted by tottinger Tue, 12 Jun 2007 01:36:00 GMT

One of the things I love about TDD is that it takes all the scaffolding and collateral effort for creating a class, and all indications of bad coupling, and makes them entirely visible. This is also one thing that makes it very hard to start TDD in a legacy system.

When you realize that it’s very hard to test a class out-of-context, the right answer is to start decoupling each class so it can be run out-of-context. When we are successful here, tests run very very fast and are easily self-verifying and isolated. It is a beautiful thing even if it follows many ugly hours of painstaking work.

The wrong answer, I’m convinced, is to build a mega-framework for testing that allows you to test in-context. This type of framework typically creates a totally realistic runtime environment (complete with database, configuration files, directories, etc). This approach lets you think that you can ignore your dependency nightmare. You really cannot. Mega-test-frameworks take such a long time to run that developers stop running all the tests all the time. This is a far worse problem than breaking dependencies, because it breaks the whole process. Mega-frameworks don’t solve the problem; they only defer the solution until the problem gets worse.

When you are faced with this kind of collateral effort, the answer is to work through it, not to sweep it under a rug. Working through it is work, indeed, but this is about doing the right thing, not doing the easiest thing.

The Myth of Learning Compression

Posted by tottinger Thu, 24 May 2007 18:56:00 GMT

There is an odd idea that learning is compressible, that any training course of any length can be squeezed into a very small time box and delivered to any large number of people without significant loss.

As an educator, and as a guy who carries a pretty big learning load at any time, I know that this is a myth. I can cover any topic much better if I am working with a handful of individuals over the course of a week or two. If you double the number of attendees, the interactivity is highly degraded, and discussions become time sinks. If you halve the time, interactivity is likewise degraded, and material that might have been covered must be skipped or skimmed.

I have a joke with my coworkers that the ideal environment for some large corporations would be a three-train roller coaster. The participants would buy a ticket, climb on board, sail past the materials at speeds in excess of 60 miles per hour, and would collect their certificate at the other end. As one train is unloading, another is loading, and another is in progress so that we could cover many hundreds or thousands of students per day. Of course, they would learn next to nothing and would not really participate at all.

But I respect my corporate customers, including the fact that they have much to do, and cannot easily afford a one-week work-stop for each dozen team members, all with consultant/trainer fees. Some large organizations would have us on-site for years in a row before all their developers would be trained in any technology. It’s just not reasonable, and they push for a more feasible schedule.

It’s a rock and a hard place, and people on both sides understand that it is so.

The good news is that compressibility is not a total myth. There are techniques and tools that can make the training more effective, so that less time is necessary in class. There is the idea of a jump-start, where a topic is introduced and discussed in a short time and students are given materials on which they can continue via self-training. The compressibility myth drives us to more dense and useful presentation. It is a valuable spur.

It also has the advantage of the time box. If you have only three days to teach someone how to TDD, you have to choose the three most important days-worth of material to teach. This kind of prioritization is the same thing that we demand of our product owners in a scrum, and is perfectly fair and reasonable.

Even though compressibility is a myth, it is a useful myth.

The only mythological part of the compression myth is the idea that it is lossless. It is lossy compression. But as is the case with audio and visual compression, compressed learning can be “good enough.” If that’s what we’re after, it’s all good.

Unit Tests Coverage: Less Is More

Posted by tottinger Mon, 07 May 2007 13:06:00 GMT

In TDD a unit test has to be very small to isolate failures. This does funny things to code coverage as a metric. Each test should have a very small area of effect, and so each unit test should have a negligible effect on the overall code coverage statistic. Bear with me here, and tell me where I’m missing something.

Say you have an existing (legacy) system with no coverage at all. Zero percent. If you start doing TDD today, the overall coverage percentage should barely change at all. If only the new code is test-driven, then the old code is not gaining coverage except where tests are necessary to ensure that the new code is being called. The low coverage per test is a good thing, because it shows that the unit tests have good isolation. In such a situation, code coverage is really telling you the ratio of new code to old code: test-drive 2,000 fully-covered new lines onto a 100,000-line untested system and the overall number creeps up to only about two percent. Again, this is so obvious and logical to me that I must be missing some cool subtleties.

If all the code was test-driven from the beginning, you should have a very high coverage number, and writing a new unit test before you add code should not impact that number. You only write enough code to pass the test, so again you aren’t getting much in the way of uncovered code. A drop in the number might indicate a problem with the test-to-production-code ratio. This seems pretty simple and logical, so I’m sure I’m missing some interesting corner cases.

OTOH, system tests and integration tests exercise paths through many components at once, and should have a more significant effect on coverage in a previously-untested system, though their job is to prove function points, not to raise the metrics. These non-unit tests are the thing that boosts your coverage of old code. That is also a good effect, because it is the goal of the system test to ensure that the parts work together.

I’m not saying that code coverage should be low, only that as we move incrementally, the unit tests we write should individually have negligible effect on our overall code coverage numbers… a thought that intrigues me.

CppUnit and Vim

Posted by tottinger Fri, 04 May 2007 02:54:00 GMT

I’m playing with ways to make CppUnit and Vim a bit more agile for my C++ teams.

I’ve been hearing about great lengths people go to, flipping back and forth between .h and .cpp for the tests, or writing perl scripts to coalesce them together. I’m too simple-minded for that. I figure nobody wants to #include my tests, so I don’t need a .h. I do it all in the .cpp file, all inline. It works fine. This way, everything is in one place. I don’t have to jump up to the header file to add a prototype and then down to the implementation to add the CPPUNIT_TEST line, and then back down to my code to enter it. I have no .h, so no prototype. Inline makes it a bit more like pyunit or junit.

I grokked the CompilerOutputter pretty quick. If you use the quickfix mode (and esp if you use :cw) you will definitely want the CompilerOutputter. I like the way it helps me move from testing error to testing error (err.. if there were ever more than one), just as it does with compile errors (errr. if they happen. They do). I set up my makefiles to run the tests, so there’s no additional step required. That’s handy. I also have the makefile run ctags to make my navigation nice and easy.

Still, adding and removing tests takes some effort. I figure VIM can take up the slack. I have macros to start up a new test file (F2), to add a test class (F11), to register a function (F12), and to delete a function and its CPPUNIT_TEST line (F9).

I also abbreviate most of the macros that I would otherwise have to type. I am thinking about remapping them, maybe to more memorable values, but they help as they are.

There are probably a lot more elegant ways to do this. I can’t wait to hear about them.

 " Add methods to help create CPPUNIT classes
function! NewCppUnitFile()
    call NewCppUnitClass()
    0r~/.cppunit_file_template.vim
endfunction

function! NewCppUnitClass() 
    r~/.cppunit_class_template.vim
    let name = input("Name your new test class: ")
    exec "%s/@@@NAMEHERE@@@/" . name . "/g" 
endfunction

" F2: start a brand-new test file from the templates
nmap <F2> :call NewCppUnitFile()<CR>
" F9: delete the method under the cursor and its CPPUNIT_TEST registration
nmap <F9>  ^f(b*ddN0d/{<CR>d%
" F11: append another test class at the end of the file
nmap <F11> G:call NewCppUnitClass()<CR>
" F12: yank the method name under the cursor and register it above CPPUNIT_TEST_SUITE_END
nmap <F12> ebmz"zyw?CPPUNIT_TEST_SUITE_END(<CR>OCPPUNIT_TEST(<C-R>z);<ESC>'z

" CPPUnit abbreviations
ab cas CPPUNIT_ASSERT
ab cam CPPUNIT_ASSERT_MESSAGE
ab cfa CPPUNIT_FAIL
ab caf CPPUNIT_FAIL
ab cae CPPUNIT_ASSERT_EQUAL
ab cat CPPUNIT_ASSERT_THROW
ab cde CPPUNIT_ASSERT_DOUBLES_EQUAL

The current rhythm is to open a new file and press F2. F2 calls NewCppUnitFile, which sucks in my file header template. It’s pretty bare, but it can be edited to taste. Mostly it helps me remember what to include.

    #include <cppunit/TestFixture.h>
    #include <cppunit/extensions/HelperMacros.h>

Then I give the name of my first test class. I get a starter class. I can then add it to the makefile (note to self: automate the makefile bit). The starter template I have includes a dummy test, so I can run it and see it fail. Failing means it’s being built and run.

    class @@@NAMEHERE@@@: public CPPUNIT_NS::TestFixture {
        CPPUNIT_TEST_SUITE(@@@NAMEHERE@@@);
        CPPUNIT_TEST(shouldFailAsEvidenceThisMethodIsRegistered);  // DELETEME?
        CPPUNIT_TEST_SUITE_END();

        public:
            void setUp() {
            }
            void tearDown() {
            }
        protected:
            void shouldFailAsEvidenceThisMethodIsRegistered() { // DELETEME?
                CPPUNIT_FAIL("This function is registered and can be deleted");
            }
    };
    CPPUNIT_TEST_SUITE_REGISTRATION(@@@NAMEHERE@@@);

Substitution is automagical: @@@NAMEHERE@@@ is the magic pattern, replaced with the class name.

A nice thing about macros is that they’re atomic. If I hate the name I gave my test class, I press ‘u’ for undo, and it vanishes. Back to an empty file. :-)

If the “:mak” works and I get the inevitable failure in the “shouldFail” method, I put my cursor on the “void shouldFail…” line and press F9. In the wink of an eye, the method and its CPPUNIT_TEST line both vanish. Again, macros are atomic, so if I delete the wrong thing, I just press ‘u’ for undo and both the function and its registration reappear.

No flipping between files, no scrolling up and down. We have software to do that now.

I add a method under “protected”. I put the cursor on the method name and press F12 to add the CPPUNIT_TEST for it above. I add some more text. When I get to the part where I need to assert, I use one of the abbreviations. If I’m in insert mode and type “cae(” I get “CPPUNIT_ASSERT_EQUAL(”. I don’t have to remember whether it ends in an S or not. I actually struggle with that, sad thing that I am.

I can press F11 to create more test classes. I like to keep a test class for every circumstance (setup) I am using. I don’t mind at all having multiple tests per .cpp file. I like to group the tests together in files. After all, quickfix knows where the code is. This is actually one of the reasons I like to keep using CppUnit. I’ll have to tell the other reasons someday.

Well, that’s how I do it. You don’t have to, but you could try it. If you find any errors, or glaring omissions, please mail them to me or comment here. In return, I will claim my actual scripts are perfect and blame it on the web master. ;-)
