C++ Algorithms, Boost and function currying

Posted by Brett Schuchert Sun, 13 Jun 2010 04:41:00 GMT

I’ve been experimenting with C++ using the Eclipse CDT and gcc 4.4. Since I’m a fan of boost, I’ve been using that as well. I finally got into a realistic use of boost::bind.

I converted this:
int Dice::total() const {
  int total = 0;

  for(const_iterator current = dice.begin();
      current != dice.end();
      ++current)
    total += (*current)->faceValue();

  return total;
}
Into this:
int Dice::total() const {
  return std::accumulate(
      dice.begin(),
      dice.end(),
      0,
      bind(std::plus<int>(), _1, bind(&Die::faceValue, _2))
  );
}
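
For anyone who wants to try the nested bind in isolation, here is a minimal sketch. It is not the Dice class from the exercise: Die below is a hypothetical stand-in, totalOf is a made-up free function, and std::bind and std::shared_ptr from C++11 are swapped in for their Boost equivalents so that it builds without Boost. The outer bind receives the running total as _1 and the current element as _2; the inner bind turns that element into its face value before std::plus adds the two.

```cpp
#include <functional>
#include <memory>
#include <numeric>
#include <vector>

// Hypothetical stand-in for the Die class used in the post.
class Die {
public:
  explicit Die(int value) : value_(value) {}
  int faceValue() const { return value_; }
private:
  int value_;
};

// Sums the face values of all dice without a hand-written loop.
int totalOf(const std::vector<std::shared_ptr<Die>>& dice) {
  using namespace std::placeholders;
  return std::accumulate(
      dice.begin(), dice.end(), 0,
      // _1: running total (int), _2: current element (shared_ptr<Die>).
      // The inner bind calls faceValue() on the element; plus adds the result.
      std::bind(std::plus<int>(), _1, std::bind(&Die::faceValue, _2)));
}
```

The inner bind works on a smart pointer because the member-function call is made through the pointer's dereference operator, so the container does not need to hold raw Die objects.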

To see how to go from the first version to the final version with lots of steps in between: http://schuchert.wikispaces.com/cpptraining.SummingAVector.

This is a first draft. I’ll be cleaning it up over the next few days. If you see typos, or if anything is not clear from the code, please let me know where. Also, if my interpretation of what boost is doing under the covers (there’s not much of that) is wrong, please correct me.

Thanks!

CppUTest Recent Experiences

Posted by Brett Schuchert Thu, 04 Feb 2010 19:42:00 GMT

Background

Mid last year I ported several exercises from Java to C++. At that time, I used CppUTest 1.x and Boost 1.38. Finally, half a year later, it was time to actually brush the dust off those examples and make sure they still work.

They didn’t. Bit rot. Or user error. Not sure which.

Bit rot: bits decay to the point where things start failing. Compiled programs do have a half-life.

User Error: maybe things were not checked in as clean as I remember. Though I suspect they were, I don’t really have any evidence to prove it, so I have to leave that option available.

To add to the mix, I decided to upgrade to CppUTest 2.x and to the latest version of the Boost library (1.41). I think that broke several things. But the fixes were simple, once I figured out what I needed to do.

The Fixes

What follows are the three things I needed to do to get CppUTest 2.0, Boost and those exercises playing nicely together.

Header File Include Order

First, I used to include the header file for the CppUTest test harness first. It seems logical, but it caused all sorts of problems with CppUTest 2. That header file includes a file that ultimately includes something that uses macros to redefine new and delete. This is done so the testing framework can do simple memory tracking, which lets you know if your unit tests contain memory leaks.

I like this feature. Sure, it’s simple and light-weight, but it covers a lot of ground for a little hassle. The hassle? Include that header file last, instead of first. Problem solved. Well, at least the code compiles without hundreds of errors.
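
To make the ordering problem concrete, here is a small sketch of the general technique. It is not CppUTest's actual header: the tracking overload, the counter trackedAllocations and the function makeTrackedInt are all made up for illustration. The point is that once new is a macro, every later header that declares operator new or uses placement new gets rewritten too, which is why the macro has to come after everything else.

```cpp
// Library headers come first. They may declare operator new or use
// placement new internally, and a `new` macro would rewrite those lines.
#include <cstddef>
#include <memory>
#include <new>
#include <string>

// Hypothetical tracking hook, standing in for a test framework's version.
static int trackedAllocations = 0;

void* operator new(std::size_t size, const char* /*file*/, int /*line*/) {
  ++trackedAllocations;
  return ::operator new(size);  // delegate to the ordinary allocator
}

// Matching placement delete; only called if a constructor throws.
void operator delete(void* p, const char* /*file*/, int /*line*/) noexcept {
  ::operator delete(p);
}

// The macro comes last, after every header has been parsed.
#define new new(__FILE__, __LINE__)

// From here on, a plain `new` expression goes through the tracking overload.
int* makeTrackedInt() { return new int(42); }

// Undo the macro so nothing further down is affected by it.
#undef new
```

Moving the #define above the #include lines is an easy way to reproduce the kind of error flood described above, because declarations inside the standard headers no longer parse.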

Boost Shared Pointer

Rather than hold pointers directly, I used the boost shared pointer class for a light-weight way to manage memory allocation. This is something I would do on a real project as well.

Somehow, the updated memory tracking in CppUTest 2.0 found something I had missed when using CppUTest 1.0.

I need to be able to control the date, so I have a simple date factory. By default, the date factory, when asked for the current date, returns the current date. Several unit tests want to simulate different dates. E.g., check out a book on one day, return it 14 days later. To do that, I manipulate the date factory (a form of dependency injection). This works fine, but by default the date factory is allocated using new.

When I replaced the existing date factory, I was not resetting it after the test. It turns out that this did not break anything because I was “lucky”. (Actually unlucky, I like things to fail fast.) CppUTest caught this as memory that was not deallocated correctly. How I ended up with dynamic allocation in the first place:

  • I want to replace behavior
  • To do so I used polymorphism
  • Polymorphism in C++ requires virtual methods (please don’t correct me by suggesting that overloading is polymorphism, that is an opinion with which I strongly disagree)
  • Methods are only virtually dispatched via references or pointers
  • References cannot be changed, so I must use a pointer if I want a substitutable factory, which I wanted
  • Pointers suggest dynamic memory allocation

To fix this, I updated the setup method to store the original date factory in an attribute and then I updated the teardown method to restore the original date factory from that attribute. That I missed this suggests that my test suite is not adequate. I did not fix that inadequacy, for no better reason than that I was porting existing tests, so I left the suite as is. For the context it will not cause a problem. Pragmatic or lazy? You decide.
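
The save-and-restore idea can be sketched roughly like this. Every name here (Date, DateFactory, FixedDateFactory, CheckoutFixture and their members) is hypothetical and the real exercise code may look quite different; the fixture is a plain struct rather than a CppUTest test group so that the sketch stands on its own.

```cpp
#include <ctime>
#include <memory>

struct Date { int year, month, day; };

// Hypothetical factory: by default it reports the real current date.
class DateFactory {
public:
  virtual ~DateFactory() {}
  virtual Date today() const {
    std::time_t now = std::time(nullptr);
    std::tm* t = std::localtime(&now);
    return Date{t->tm_year + 1900, t->tm_mon + 1, t->tm_mday};
  }
  static std::shared_ptr<DateFactory> instance() { return current(); }
  static void setInstance(std::shared_ptr<DateFactory> f) { current() = f; }
private:
  static std::shared_ptr<DateFactory>& current() {
    static std::shared_ptr<DateFactory> factory(new DateFactory);
    return factory;
  }
};

// Test double that always reports the same date.
class FixedDateFactory : public DateFactory {
public:
  explicit FixedDateFactory(Date d) : date_(d) {}
  Date today() const override { return date_; }
private:
  Date date_;
};

// Fixture: remember the original factory in setup, put it back in teardown.
struct CheckoutFixture {
  std::shared_ptr<DateFactory> originalFactory;
  void setup() {
    originalFactory = DateFactory::instance();
    DateFactory::setInstance(
        std::make_shared<FixedDateFactory>(Date{2010, 1, 1}));
  }
  void teardown() {
    DateFactory::setInstance(originalFactory);
    originalFactory.reset();
  }
};
```

Because the replacement factory is only owned through the shared pointer, restoring the original in teardown also releases the test double, so nothing allocated during the test is left behind.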

One Time Allocation

Here is a simple utility that uses Boost dates and regex:
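// Rearranges "M/D/YYYY" into "YYYY/M/D" (year first), then parses the result.
ptime DateUtil::dateFromString(const string &dateString) {
  boost::regex e("^(\\d{1,2})/(\\d{1,2})/(\\d{4})$");
  // sed-style back references: group 3 is the year, 1 the month, 2 the day
  string replace("\\3/\\1/\\2");
  string isoDate = boost::regex_replace(dateString, e, replace, boost::match_default | boost::format_sed);
  return ptime(date(from_string(isoDate)));
}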

Now this is somewhat simplistic code. So be it, it serves the purposes of the exercise. I can think of ways to fix this, but there’s an underlying issue that exists if you use the regex library from Boost.

When you use the library, it allocates (in this example) 10 blocks of memory. If you read the documentation (I did), it’s making space for its internal state machine for regex evaluation. This is done once and then kept around.

So what’s the problem? Well, when I run my tests, the first test that happens to exercise this block of code reports some memory allocation issues:

c:\projects\cppppp\dependencyinversionprinciple\dependencyinversionprinciple\patrongatewaytest.cpp:34: error: Failure in TEST(PatronGateway, AddAFew)
        Memory leak(s) found.
Leak size: 1120 Allocated at: <unknown> and line: 0. Type: "new" Content: "
?"
Leak size: 16 Allocated at: <unknown> and line: 0. Type: "new" Content: ?a"
Leak size: 20 Allocated at: <unknown> and line: 0. Type: "new" Content: "êà4"
Leak size: 52 Allocated at: <unknown> and line: 0. Type: "new" Content: ä4"
Leak size: 4096 Allocated at: <unknown> and line: 0. Type: "new" Content: ""
Leak size: 52 Allocated at: <unknown> and line: 0. Type: "new" Content: "êâ4"
Leak size: 20 Allocated at: <unknown> and line: 0. Type: "new" Content: ~4"
Leak size: 32 Allocated at: <unknown> and line: 0. Type: "new" Content: "?à4"
Leak size: 32 Allocated at: <unknown> and line: 0. Type: "new" Content: "h~4"
Leak size: 80 Allocated at: <unknown> and line: 0. Type: "new" Content: "êä4"
Total number of leaks:  10

This is a false positive. This is a one-time allocation and a side-effect of C++ memory allocation and static initialization.

There is a way to “fix” this. You use a command line option, -r, to tell the command line test runner to run the tests twice. If the allocation problem happens the first time but not the second time, then the tests are “OK”.

I didn’t want to do this.

  • The tests do take some time to run (30 seconds maybe, but still that doubles the time)
  • The output is ugly
  • It’s off topic for what the exercise is trying to accomplish
I tried a few different options but ultimately I went with simply calling that method before using the command line test runner. So I changed my main from:
#include <CppUTest/CommandLineTestRunner.h>

int main(int argc, char **argv) {
  return CommandLineTestRunner::RunAllTests(argc, argv);
}
To this:
#include "DateUtil.h"
#include <CppUTest/CommandLineTestRunner.h>

/** ************************************************************
    The boost regex library allocates several blocks of memory
    for its internal state machine. That memory is listed as a 
    memory leak in the first test that happens to use code that
    uses the boost regex library. To avoid having to run the
    tests twice using the -r option, we instead simply force
    this one-time allocation before starting test execution.
    *********************************************************** **/

void forceBoostRegexOneTimeAllocation() {
  DateUtil::dateFromString("1/1/1980");
}

int main(int argc, char **argv) {
  forceBoostRegexOneTimeAllocation();
  return CommandLineTestRunner::RunAllTests(argc, argv);
}

Since this one-time allocation happens before any of the tests run, it is no longer reported as a problem by CppUTest.
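
The effect is easy to reproduce without Boost or CppUTest. The sketch below is an illustration under assumptions, not how either library is implemented: lookupTable stands in for a library that builds internal state on first use and keeps it, and allocationsLeftBehindBy mimics a per-test leak check by comparing the number of live allocations before and after a test body. All names are made up.

```cpp
#include <cstddef>
#include <cstdlib>
#include <new>
#include <vector>

// Count live allocations by replacing the global allocation functions.
static long liveAllocations = 0;

void* operator new(std::size_t size) {
  void* p = std::malloc(size ? size : 1);
  if (!p) throw std::bad_alloc();
  ++liveAllocations;
  return p;
}

void operator delete(void* p) noexcept {
  if (p) { --liveAllocations; std::free(p); }
}

void operator delete(void* p, std::size_t) noexcept {
  if (p) { --liveAllocations; std::free(p); }
}

// Stand-in for a library that allocates once and never frees.
const std::vector<int>& lookupTable() {
  static const std::vector<int>* table = new std::vector<int>(256, 0);
  return *table;
}

// A "test" that happens to touch the library.
void testBodyUsingLibrary() { (void)lookupTable(); }

// Mimics a per-test leak check: allocations still live after the body ran.
long allocationsLeftBehindBy(void (*testBody)()) {
  long before = liveAllocations;
  testBody();
  return liveAllocations - before;
}
```

The first call to allocationsLeftBehindBy(testBodyUsingLibrary) reports a positive number and every later call reports zero. Calling lookupTable() once before any measurement, which is what the warm-up call in main does for the regex code, moves those allocations outside the measured window.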

Before I introduced this “fix”, I spent quite a bit of time verifying that each of the 10 allocations was made by one of the three lines dealing with regex code in my DateUtil class. I used a conditional breakpoint and looked at the stack trace. (I know, using the debugger is considered a code smell, but not all smells are bad.)

Conclusion

I still like CppUTest. I’ve used a few C++ unit testing tools but there are several I have not tried. I don’t have enough face-time with C++ for this to be an issue. I am not terribly comfortable with the sensitivity to include order. I’m not sure if that would scale.

I do appreciate the assistance with memory checking, though dealing with false positives can be a bit of a hassle. There was another technique, that of expressing the number of allocations. But in this case, that simply deferred the reporting of memory leaks to after test execution. In any case, I do like this. I’m not sure how well it would scale so it leaves me a bit uneasy.

If you happen to be using these tools, hope this helps. If not, and you are using C++, what can you say about your experiences with using this or other unit testing tools?

C++ shared_ptr and circular references, what's the practice?

Posted by Brett Schuchert Sat, 25 Apr 2009 07:30:00 GMT

I’m looking for comments on the practice of using shared pointers in C++. I’m not actively working on C++ projects these days and I wonder if you’d be willing to give your experience using shared pointers, if any.

I’m porting one of our classes to C++ from Java (it’s already in C#). So to remove memory issues, I decided to use boost::shared_ptr. It worked fine until I ran a few tests that resulted in a circular reference between objects.

Specifically:
  • A book may have a receipt (this is a poor design, that’s part of the exercise).
  • A receipt may have a book.

Both sides of the relationship are 0..1. After creating a receipt, I end up with a circular reference between Receipt and Book.

In the existing Java and C# implementations, there was no cleanup code in the test teardown to handle what happens when the receipt goes away. This was not a problem since C# and Java garbage collection algorithms easily handle this situation.

Shared pointers, however, do not handle this at all. They are good, sure, but not as good as a generation-scavenging garbage collector (or whatever algorithms are used these days – I know the JVM for 1.6 sometimes uses the stack for dynamic allocation based on JIT, so it’s much more sophisticated than a simple generation-scavenger, right?)
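
A minimal sketch of the leak, using std::shared_ptr as a stand-in for boost::shared_ptr and stripped-down, hypothetical Book and Receipt types that only count how many instances are alive:

```cpp
#include <memory>

static int liveObjects = 0;

struct Receipt;

struct Book {
  std::shared_ptr<Receipt> receipt;  // owning reference to the receipt
  Book() { ++liveObjects; }
  ~Book() { --liveObjects; }
};

struct Receipt {
  std::shared_ptr<Book> book;  // owning back reference: this closes the cycle
  Receipt() { ++liveObjects; }
  ~Receipt() { --liveObjects; }
};

// Returns how many objects are still alive after the last outside
// reference to the pair has gone out of scope.
int objectsLeakedByCycle() {
  int before = liveObjects;
  {
    std::shared_ptr<Book> book = std::make_shared<Book>();
    std::shared_ptr<Receipt> receipt = std::make_shared<Receipt>();
    book->receipt = receipt;
    receipt->book = book;
  }  // both locals are gone, but each object still holds the other
  return liveObjects - before;
}
```

Each object keeps the other's reference count at one, so neither destructor ever runs and the function reports two surviving objects.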

OK, so how to fix this problem? One thing I could do is manually break the circularity:
boost::shared_ptr<Receipt> r = ...;
CHECK(xxx, yyy);
r->setCopy(boost::shared_ptr<Book>());  // an empty shared_ptr clears the back reference

(I did not use these types like this. When I use templates, especially those in a namespace, I use typedefs and I even, gasp, use Hungarian-esque notation.)

That would work, though it is ugly. Also, it is error prone and will either require violating DRY or making an automatic variable a field.

I could have removed the back reference from the Receipt to the book. That’s OK, but is a redesign of a system deliberately written with problems (part of the assignment).

Maybe I could explicitly “return” the book, which could remove the receipt and the back-reference. That would make the test teardown a bit more complex (and sort of upgrade the test from a unit test to something closer to an integration test), but it makes some sense. The test validates borrowing a book, so to clean up, return the book.

Instead of any of these options, I decided to use a boost::weak_ptr on the Receipt side. (This is the “technology to the rescue” solution, thus my question: is this an OK approach?)

I did this since the lifetime of a book object is much longer than its receipt (you return your library books, right?). Also, the Receipt only exists on a book. But the book could exist indefinitely without a Receipt.

This fixed the problem right away. I got a clean run using CppUTest. All tests passed and no memory leaks.
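
Here is a rough sketch of why that works, with std::shared_ptr and std::weak_ptr standing in for the Boost versions and hypothetical, stripped-down Book and Receipt types that only count live instances:

```cpp
#include <memory>

static int liveObjects = 0;

struct Receipt;

struct Book {
  std::shared_ptr<Receipt> receipt;  // the book owns its receipt
  Book() { ++liveObjects; }
  ~Book() { --liveObjects; }
};

struct Receipt {
  std::weak_ptr<Book> book;  // non-owning back reference: no cycle
  Receipt() { ++liveObjects; }
  ~Receipt() { --liveObjects; }
};

// Returns how many objects are still alive after the last outside
// reference to the pair has gone out of scope.
int objectsLeftAfterScope() {
  int before = liveObjects;
  {
    std::shared_ptr<Book> book = std::make_shared<Book>();
    std::shared_ptr<Receipt> receipt = std::make_shared<Receipt>();
    book->receipt = receipt;
    receipt->book = book;  // does not raise the book's reference count
  }  // the book is destroyed, which in turn releases the receipt
  return liveObjects - before;
}
```

The weak back reference does not keep the book alive, so the book's count reaches zero when the local goes out of scope, its destructor releases the receipt, and nothing is left over.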

Once I had the test working, I experimented. Why? The use of a weak_ptr exposes some underlying details that I didn’t like exposing. For example, this line of code:
aReceipt->getBook()->getIsbn();

(Yes, violating Law of Demeter, get over it, the alternative would make a bloated API on the Book class.)

Became instead:
aReceipt->getBook().lock()->getIsbn();

The lock() method promotes a weak_ptr to a shared_ptr for the life of the expression. In this case, it’s a temporary in that line of code.
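
One thing to keep in mind is that lock() returns an empty shared_ptr when the object has already been destroyed, so chaining straight off it is only safe if the target is known to be alive. In this design that holds, since the book outlives its receipt. A defensive version might look like the sketch below, where std::weak_ptr stands in for boost::weak_ptr and both Book and the helper isbnOrEmpty are hypothetical:

```cpp
#include <memory>
#include <string>

// Hypothetical, stripped-down Book with just an ISBN.
struct Book {
  std::string isbn;
  explicit Book(const std::string& i) : isbn(i) {}
  const std::string& getIsbn() const { return isbn; }
};

// Returns the ISBN if the book is still alive, or an empty string otherwise.
std::string isbnOrEmpty(const std::weak_ptr<Book>& weakBook) {
  // The promoted shared_ptr keeps the Book alive for the body of the if.
  if (std::shared_ptr<Book> book = weakBook.lock()) {
    return book->getIsbn();
  }
  return std::string();
}
```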

This worked fine, but I decided to put that promotion into the Receipt class. So internally, the class stores weak_ptr, but when you ask the receipt for its book, it does the lock:
boost::shared_ptr<Book> getBook() {
    return book.lock();
}

On the one hand, anybody using the getBook() method is paying the price of the promotion. However, the weak_ptr doesn’t allow access to its payload without the promotion so it’s really required to be of any value. Or at least that’s my take on it.

Do you have different opinions?

Please keep in mind, this is example code we use in class to give students practice naming things like methods and variables and also practice cleaning up code by extracting methods and such.

Even so, what practice do you use, if any, when using shared_ptr? Do you use weak_ptr?

Thanks in advance for your comments. I’ll be reading and responding as they come up.

TDD with C++ and Boost

Posted by Brett Schuchert Tue, 24 Mar 2009 03:28:00 GMT

I’m going to give Boost another whirl. There appears to be a great installer for Visual Studio. It requires free registration. I’m on the fence about what to use first and was wondering if you have any suggestions.

I’m thinking about basic things initially:
  • Regex
  • Threading
  • Sockets

Maybe a multi-threaded server responding to messages via sockets, with client threads processing requests using regex?

If you have anything you’d like to see, reply to this blog and I’ll follow up with you.