Generic Java Agent Registry

Posted by Brett Schuchert Sun, 30 Dec 2007 04:44:00 GMT

I’m writing a Java Agent for the first time. Why? I’m interested in using a tool called ConTest from IBM. It was originally written during the JDK 1.3 days and now requires JDK 1.4. It instruments class files, looking for code that uses concurrency constructs such as synchronized blocks, and inserts code to monitor and replay that code in ways that are more likely to expose concurrency problems.

What’s the problem?

It was written in the days when we would use a pre-processing stage to instrument code and then execute tests. That is fine if I want to work at the command line and build with ant or maven. However, I want to work in an IDE that makes running unit tests easy (Eclipse, though they all do it now). If I have to remember to instrument my classes before running my tests, that invites human error. I don’t want that; I want to just run my unit tests and have my classes dynamically instrumented. (This is not speculation. It comes from working in a group of multiple teams, all of which were using aspects written in AspectJ and running unit tests in Eclipse. When we introduced dynamic instrumentation – pre JDK 1.5, and even working in WebSphere 5.x – it improved productivity.)

What about a plug-in? Sure, there’s one for Eclipse, but I’ve not been able to download it. It probably works fine – after a class is compiled, it gets instrumented – but if I can write a simple Java Agent, I can create a JVM configuration with a few parameters and, every time I run my tests, voilà: dynamic instrumentation with only a little one-time environment configuration. I also don’t have to wait for a plug-in update to continue using the tool.
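For the curious, here is a minimal sketch of the kind of agent I have in mind. The class name is mine, the transformer does nothing yet, and the real ConTest instrumentation is not shown; it only illustrates the registration mechanism (the JVM calls premain before main when started with -javaagent, provided the agent jar’s manifest names this class in its Premain-Class attribute).

import java.lang.instrument.ClassFileTransformer;
import java.lang.instrument.Instrumentation;
import java.security.ProtectionDomain;

public class GenericAgentRegistry {
  // Invoked by the JVM before main() when run with -javaagent:agent.jar
  public static void premain(String agentArgs, Instrumentation instrumentation) {
    instrumentation.addTransformer(new ClassFileTransformer() {
      public byte[] transform(ClassLoader loader, String className,
                              Class<?> classBeingRedefined,
                              ProtectionDomain protectionDomain,
                              byte[] classfileBuffer) {
        // Hand the bytes to the instrumenting library here; returning null
        // tells the JVM to load the class unchanged.
        return null;
      }
    });
  }
}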

Will my tests run slower? Probably, but until I know how much slower, I’m willing to risk it. (I’ve used dynamic instrumentation when running over 1,000 tests on a workspace with more than a million lines of code; it was fine.) If it’s an issue, I can imagine using a combination of annotations and the Java agent. Something like:

@TestInstrument
public class SomeClassThatUsesThreading {}

This would allow the Java Agent to only instrument some classes, rather than all classes. This could cause problems if I forget the annotation, but it’s an option if speed is an issue.
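A marker annotation like that is easy to define; here is a hedged sketch (the name @TestInstrument is just the hypothetical one from the example above):

import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

@Retention(RetentionPolicy.RUNTIME) // recorded in the class file, visible at run time
@Target(ElementType.TYPE)           // may only be applied to classes and interfaces
public @interface TestInstrument {
}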

I would not do that unless it was necessary. First, most of my tests would be testing code that is not thread-related. Those tests would not require instrumentation. The tests that do require instrumentation would live somewhere else, and I’d run them with a different frequency; I’d run them longer, with more configurations and iterations, to increase my chances of finding threading-related issues.

There’s another option: copy what ConTest is already doing using AOP. I tried, and I cannot select the correct point-cuts using traditional point-cut syntax – try selecting synchronized blocks and then every line within the block; that’s not a typical point-cut usage scenario. I considered a combination of hand instrumentation and AOP – it would work, but it makes the code ugly. I even considered using ASM or CGLIB directly, and at that point I knew it was more time than I wanted to spend when the developers of ConTest have years of experience instrumenting classes.

Anyway, I’m hoping the team working on this tool will publish an API soon so I can give that a try. They mentioned they would at some point.

If you want more information on writing a Java Agent (and the class that actually registers it), have a look at Brett’s Java Agent Notes.

Agile Snow Shoveling Plan

Posted by James Grenning Wed, 19 Dec 2007 20:51:00 GMT

My wife and I have evening plans. The driveway has a nice 10-inch layer of snow. To keep our friends from waiting, we must leave the house by 6:30. We have a deadline. Working backwards, I need to be in the shower at 6:00. My requirements are to plow off the whole driveway and to leave the house, showered and dressed, by 6:30.

It’s 5:00. I have one hour. With the kids gone and the Cub Cadet (with plow) still in need of repair, I wonder whether I can meet my requirements and finish the job by 5:55 so I can dust off, open a beer, and get in the shower by 6. The schedule looks tight. I could do more analysis, but starting the job will tell me a lot and help me get it done too.

I get my back-friendly shovel and get to work, shoveling behind the car we plan on taking. After fifteen minutes I have a realization: I am not going to make it. The plan is not doable. A quick estimate shows I have cleared about one eighth of the driveway and used one quarter of my time. Sound familiar? My back was not going to let me move more snow more quickly. Wishful thinking would mean not getting to the shower on time and possibly being blocked in the driveway. Thirty minutes from shower to out the door is a metric established long ago. I needed to adjust the plan.

I got a committee together to discuss our options. Oh wait a second, that was a different plan.

I cleared about one eighth of the drive in fifteen minutes. Using my velocity, it looks like a two-hour job. So it’s likely I will only have time to move half of the snow that’s preventing access to and from our house. I had better focus on the “critical path”. The other car would stay buried for another day, and the front walk would have to wait.

A good snow drift could have destroyed my plan, but surprisingly there were none on the critical path. Shoveling this snow is predictable, and my velocity was stable. I did not set any stretch goals, because I wanted to shovel another day. Keeping to my sustainable pace, I delivered the critical path by 5:55. The Heine tasted good, and we made the next milestone: dinner with our friends.

Business software is Messy and Ugly.

Posted by Uncle Bob Thu, 13 Dec 2007 15:41:00 GMT

I was at a client recently. They are a successful startup who have gone through a huge growth spurt. Their software grew rapidly, through a significant hack-and-slash program. Now they have a mess, and it is slowing them way down. Defects are high. Unintended consequences of change are high. Productivity is low.

I spent two days advising them how to adopt TDD and Clean Code techniques to improve their code-base and their situation. We discussed strategies for gradual clean up, and the notion that big refactoring projects and big redesign projects have a high risk of failure. We talked about ways to clean things up over time, while incrementally insinuating tests into the existing code base.

During the sessions they told me of a software manager who is famed for having said:

“There’s a clean way to do this, and a quick-and-dirty way to do this. I want you to do it the quick-and-dirty way.”

The attitude engendered by this statement has spread throughout the company and has become a significant part of their culture. If hack-and-slash is what management wants, then that’s what they get! I spent a long time with these folks countering that attitude and trying to engender an attitude of craftsmanship and professionalism.

The developers responded to my message with enthusiasm. They want to do a good job (of course!). They just didn’t know they were authorized to do good work. They thought they had to make messes. But I told them that the only way to get things done quickly, and keep getting things done quickly, is to create the cleanest code they can, to work as well as possible, and keep the quality very high. I told them that quick-and-dirty is an oxymoron. Dirty always means slow.

On the last day of my visit the infamous manager (now the CTO) stopped into our conference room. We talked over the issues. He was constantly trying to find a quick way out. He was manipulative and cajoling. “What if we did this?” or “What if we did that?” He’d set up straw man after straw man, trying to convince his folks that there was a time and place for good code, but this was not it.

I wanted to hit him.

Then he made the dumbest, most profoundly irresponsible statement I’ve (all too often) heard come out of a CTO’s mouth. He said:

“Business software is messy and ugly.”

No, it’s not! The rules can be complicated, arbitrary, and ad-hoc; but the code does not need to be messy and ugly. Indeed, the more arbitrary, complex, and ad-hoc the business rules are, the cleaner the code needs to be. You cannot manage the mess of the rules if they are contained by another mess! The only way to get a handle on the messy rules is to express them in the cleanest and clearest code you can.

In the end, he backed down. At least while I was there. But I have no doubt he’ll continue his manipulations. I hope the developers have the will to resist.

One of the developers asked the question point blank: “What do you do when your managers tell you to make a mess?” I responded: “You don’t take it. Behave like a doctor whose hospital administrator has just told him that hand-washing is too expensive, and he should stop doing it.”

Thinking about an Appendix.

Posted by Uncle Bob Fri, 07 Dec 2007 15:27:04 GMT

No, not of Clean Code (look for it in Spring). This Monday evening (12/3) I got a stomach ache.

The stomach pain was a low level annoyance that I was able to ignore for most of the evening. But it started to get pretty bad round midnight. By 2AM I threw up and then felt better. So I went back to bed and slept till morning.

Tuesday started well. I had some residual pain, but figured it was waning. I went to the office to get some work done there. By the time I got there the waning pain had waxed and I turned right around and went home to bed.

The pain continued to grow until 3PM, when I threw up again. This made me feel a little better, so I went back to bed. But by 6PM the pain had localized to my lower right quadrant and I had had enough. I asked my wife to take me to the immediate care facility.

They could tell by the grimace on my face that this was something to take care of quickly. They put me on a bed, and started an IV. They took blood and did an exam. Then they packed me into an ambulance and sent me to the ER for a CAT scan.

During the 20-minute ride the pain was getting really bad. I was giving it 8.5 on a scale of 1-10. I have intimate knowledge of every pothole on the road they took.

At the ER they put me in a room and gave me a dose of Morphine. Morphine is a very nice drug. It had the effect of filing the pain away in a convenient subdirectory where I could access it if I needed it, but was otherwise out of the way.

By 11PM the CAT scan was complete and the doctor came and said: “The good news is we know what’s wrong. The bad news is it’s your appendix. We just happen to have an OR ready. Do you want to do it now?” How could I refuse an offer like that!

They rolled me into the OR and told me they were going to give me the juice.
DISCONTINUITY
I was in the recovery room with two smiling pretty nurses urging me to awaken to the news that my appendix had been removed.
DISCONTINUITY
I was being wheeled into a hospital room. It was 3AM. My exhausted wife was waiting for me there. She kissed me goodbye and I went to sleep.

Wednesday was my birthday. I spent it sleeping for the most part. They fed me clear foods. The pain was not horrible, but I did accept a Vicodin around noon. By 4PM the surgeon came in, looked me over, and said: “You look pretty good. You should go home.” I went home gladly. On the way home my wife and I picked up a pizza.

Thursday was a bit sore, but not bad. I took a couple of Motrin and was “smart” about which muscles I used and when. I got a lot of work-reading done.

Today I woke up and the incision pain was like a mild sunburn. I can feel it, but I can ignore it. I’m not about to do 20 pushups, or run a marathon, but the day should be relatively normal.

Shoveling Code

Posted by tottinger Thu, 06 Dec 2007 18:56:00 GMT

I was ill when the first snow fell last weekend, and didn’t get out and scrape the drive and sidewalks. Sadly, neither did I bundle up my kids and send them out with shovels. As a result, the snow melted and refroze. When I left home on Sunday morning, clearing the snow was frustrated by the presence of hard, packed ice underneath it.

Now I have to be very careful when moving in the drive or sidewalk because of the black ice in some places and thick lumpy stuff in others. I could choose to leave it that way and just walk carefully and try to avoid the area, but I really don’t want it to be like this all winter.

Instead, I bundled up and spent several painful hours trying to clear not only the six to eight inches of new snow, but the ice under it. Clearing the fresh stuff is tough because the rough and lumpy ice is under it. The ice snags the shovel and hurts my elbows and shoulders. I use the metal edge of the shovel to plane off the ice so that it’s not quite so difficult, and the work gets a little easier.

Smooth, planed ice is not a solution. It is actually going to be more slippery and dangerous when the sun hits it and it melts a little. It will then refreeze to an even smoother and slicker sheen. I can’t ignore the problem, and I can’t smooth it over and leave it at that.

I invested in some complicated salt product to lower the melting point of the ice so I can scrape it away. It wasn’t enough, but now there are some safe areas in my driveway. I continue to scrape it and hope that I can get it all squared away so that my wife and kids have safe passage, though it is tedious and unpleasant work.

Yes, it’s my fault. I should have bundled up my sick self or my children and taken care of this when the first snow fell, back when it was easy.

Then it dawned on me that my driveway is a lot like source code. If I always take care of it when it’s relatively easy and not too polluted, it will remain easier to deal with in the longer term. The more unsafe and ugly it is, the more important it is to clean it up. It won’t do to smooth it out a little or to leave it be. It needs to be safe for me and my team.

And I guess that’s my parable for today.

Naming: A Word To The Wise

Posted by tottinger Wed, 28 Nov 2007 02:50:00 GMT

Please pronounce your variable names as words, even if they are merely abbreviations. The phonemes indicated by your spelling may correspond to an existing word with an entrenched meaning. It may be one that you do not intend to communicate.
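A small, made-up illustration of the point (the names are mine, not from any real code base):

class NamingExample {
  int thrdCnt;      // reads aloud as "third count," though it means "thread count"
  int threadCount;  // says exactly what it means when spoken
}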

Bugs Kill Productivity and Schedule Accuracy

Posted by Dean Wampler Sun, 25 Nov 2007 17:29:07 GMT

The title may seem obvious. Here is some hard data.

SD Times recently reported on a survey done by Forrester Research on the pernicious effects of bugs on development productivity. The developers and managers surveyed said they take an average of six days to resolve issues, and they spend an average of 3 out of every 10 hours working on various stages of troubleshooting, including initial investigation, documenting their findings, and resolving the problems.

The survey puts some data behind an intuitive sense (of gloom…) that many teams have about the problem.

For me, this is further evidence of why the practices of Extreme Programming (XP), especially Test-Driven Development (TDD), are so important. To a high degree, TDD ensures that the software is really working and that regressions are caught and fixed quickly. TDD also provides an unambiguous definition of “done”: do the automated tests pass? The feedback on which stories are really done tells the team its true velocity and hence how many story points will be finished by a given target date. Also, you spend far less unplanned and unpredictable time debugging problems.
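As a concrete, hypothetical illustration of that definition of “done” (JUnit 4 style, with the tiny class under test inlined for brevity; none of this comes from the survey or the article):

import org.junit.Test;
import static org.junit.Assert.assertEquals;

public class InvoiceTest {

  // The story is "done" when tests like this pass in the automated build.
  @Test
  public void totalsItsLineItems() {
    Invoice invoice = new Invoice();
    invoice.addLineItem(30.0);
    invoice.addLineItem(12.5);
    assertEquals(42.5, invoice.total(), 0.001);
  }

  // Production code the test drives, kept inline to make the sketch self-contained.
  static class Invoice {
    private double total;
    void addLineItem(double amount) { total += amount; }
    double total() { return total; }
  }
}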

Hence, productivity and schedule accuracy tend to be much better for XP teams. Over time, as the software grows in size and complexity, the automated test suite will keep delivering “dividends”, by continuing to catch regressions and by empowering the team to do the aggressive refactorings that are necessary to keep the software evolving to meet new requirements.

The SD Times article goes on to say that

... two-thirds of the responding managers indicated that a solution that reduced the time spent on resolving application problems would be of interest if it created significant efficiencies and improved quality.

The article concludes with a quote that automated testing is one solution. I agree. Just make sure it is the XP kind of automated testing.

Velocity Inflation Triggers Productivity Recession

Posted by Bob Koss Thu, 15 Nov 2007 13:59:00 GMT

A team’s velocity is a measure of its productivity. A higher velocity means more work is being completed each iteration and a team that shows an increasing velocity from iteration to iteration is being more and more productive. This isn’t necessarily a good thing.

In Extreme Programming (XP), a team’s velocity is defined as the number of story points completed in an iteration. We can track the velocity for each iteration and calculate the average velocity over several iterations. If we know how many story points comprise an application, we can even project when the application can be released. Velocity is a very useful measure of a team’s capabilities. I have found it to be a much more reliable (make that infinitely more reliable) number to use for project planning than asking developers for time estimates based on requirements.
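As a back-of-the-envelope sketch of that projection (the numbers are hypothetical, not taken from any team mentioned here):

public class ReleaseProjection {
  public static void main(String[] args) {
    int remainingStoryPoints = 120;  // points left in the release backlog
    double averageVelocity = 14.0;   // points completed per iteration, on average

    int iterationsLeft = (int) Math.ceil(remainingStoryPoints / averageVelocity);
    System.out.println("Roughly " + iterationsLeft + " iterations to release.");
    // With two-week iterations, that is about 18 weeks of work remaining.
  }
}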

Because velocity correlates to a team’s productivity, there is a tendency for managers to try to “goose” the velocity. The technical term for this is a stretch goal. Just like the coach of a sports team trying to get the team to play better, I can imagine project managers saying at a team meeting at an iteration boundary, “Come on team – you did 22 points last week, what do you say we try for 25 this week?” This same manager probably buys those motivational posters of soaring eagles and hangs them all around the team room.

The programming team takes pride in their achievements and they also like to see their velocity increasing. They will try to persuade the Customer to give them points for any bit of work they do. I can imagine a developer putting in a few minutes work to change the position of a GUI button on a screen and begging the Customer, “Can I get a point for that?”

XP teams can become obsessed with their velocity, but increasing velocity isn’t always a good thing. A team can inadvertently become so focused on getting points done that they sometimes forget the real goal – to deliver a quality product.

This became apparent to me recently when I visited a client to see how they were doing in their transition to XP. Their velocity had skyrocketed since I last saw them – and it was definitely not a good thing.

When Object Mentor does an XP transition with a client, we start with our XP Immersion course to get everybody on the same page about what our goals are. Ideally, we use two instructors: one trains the programmers in topics such as Test-Driven Development and Refactoring, and the other coaches the customer team on story writing, iteration planning, and release planning. We then bring the two groups together for a day and have the customer team drive the programming team through a classroom exercise so everybody can experience how the process works. The instructors then stay and work with the team for a few iterations, coaching them on how to use what they learned in class on their own project.

Such was the case with the client I mentioned earlier. I was working with the programmers and the other coach had the customer team. When it came time to estimate stories, the other coach suggested using an open-ended scale. Where I like to use a scale of 1-5, this team was using a 1-infinity scale. That’s fine, it doesn’t really matter which scale you use, as long as whatever scale you choose is consistent. Stories were estimated according to their relative complexities, iterations were planned, and the programmers started cranking out code. After a few iterations the team’s velocity settled to a value of about 14 points. The team was doing fine and it was time for me to move on and let them do it alone.

When I returned to do a process checkup, their velocity had climbed to 48 points. Wow. This new process must really be resonating with the team. I purposely timed my visit to occur on an iteration boundary, and we conducted a retrospective on how well the team was adhering to the XP practices. This turned out to be the bad news.

With a focus on getting more points done, the team had abandoned many of the practices. Programmers weren’t pairing, weren’t writing unit tests, and weren’t refactoring their code. The customers, trying to stay ahead of the programmers’ ever-increasing productivity, had abandoned writing automated acceptance tests in FitNesse and were now back to manual testing. I was heartbroken.

Beck teaches that one of the things teams have to do is to adapt the practices to the organization. Perhaps what they were doing was adapting. Adapting to the point that they weren’t even close to doing XP anymore, but adapting nonetheless. I wondered how that was working for them.

I observed the iteration planning meeting for the upcoming iteration and noticed that all of the user stories were bug fixes from the previous iteration. No new functionality was being added to the application; the team was essentially spinning its wheels. So even though the velocity indicated a very productive team, the actual productivity was essentially zero. There must be a lesson here.

Velocity is what it is. You must take great care when encouraging a team to increase its velocity, because you will always get whatever you’re trying to maximize. You have to be careful that you are maximizing the right thing. Want to maximize the number of lines of code produced? Well, you’re most likely going to get a lot of really crappy code. Want to maximize the number of unit tests that developers write? You’ll get a lot of them, but they’ll probably be worthless. Applying pressure to increase the velocity might make developers subconsciously inflate their estimates or, worse, abandon good development practices in order to get the numbers up.

You Don't Know What You Don't Know Until You Take the Next Step

Posted by Bob Koss Tue, 13 Nov 2007 16:11:00 GMT

I was teaching our brand new Principles, Patterns, and Practices course recently (https://objectmentor.com/omTraining/course_ood_java_programmers.html) and I was starting the section on The Single Responsibility Principle.

I had this UML class diagram projected on the screen:

Employee
+ calculatePay()
+ save()
+ printReport()

I asked the class, “How many responsibilities does this class have?” Those students who had the courage to answer a question out loud (sadly, a rare subset of students) all mumbled, “Three.” I guess that makes sense, three methods means three responsibilities. My friend and colleague Dean Wampler would call this a first-order answer (he’s a physicist and that’s how he talks ;-) ). The number will increase as we dig into details. I held one finger in the air and said, “It knows the business rules for how to calculate its pay.” I put up a second finger and said, “It knows how to save its fields to some persistence store.”

“I happen to know that you folks use JDBC to talk to your Oracle database.” Another finger for, “It knows the SQL to save its data.” Another finger for, “It knows how to establish a database connection.” My entire hand is held up for, “It knows how to map its fields to SQLStatement parameters.” I start working on my other hand with, “It knows the content of the report.” Another finger for, “It knows the formatting of the report.” If this example was a real class from this company’s code base I knew I’d be taking off my shoes and socks.
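To make those hidden responsibilities concrete, here is a hypothetical fleshing-out of the diagram; none of this code comes from the client, and the SQL and fields are invented:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.SQLException;

public class Employee {
  private String name;
  private double hourlyRate;
  private double hoursWorked;

  // Responsibility 1: the business rules for pay.
  public double calculatePay() {
    return hourlyRate * hoursWorked;
  }

  // Responsibilities 2 through 5: persistence, connections, SQL, field mapping.
  public void save() throws SQLException {
    Connection connection =
        DriverManager.getConnection("jdbc:oracle:thin:@host:1521:db");
    try {
      PreparedStatement statement = connection.prepareStatement(
          "INSERT INTO employee (name, hourly_rate, hours_worked) VALUES (?, ?, ?)");
      statement.setString(1, name);
      statement.setDouble(2, hourlyRate);
      statement.setDouble(3, hoursWorked);
      statement.executeUpdate();
    } finally {
      connection.close();
    }
  }

  // Responsibilities 6 and 7: report content and report formatting.
  public String printReport() {
    return "Employee: " + name + "\nPay: " + calculatePay() + "\n";
  }
}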

Not that my answer was any better than the students’; given the information at hand there can’t be a right answer, because I didn’t provide any context for my question. I found our different answers interesting, though. This particular company would have a design phase for a project in which UML diagrams would be produced and discussed. How can any reviewer know if a class diagram is “right”?

I have this belief (hang-up?) that you don’t really know what you don’t know, until you take the next step and actually use what you currently have. Looking at UML, we can’t really say that it’s going to work or that it satisfies the Principles of Object Oriented Design, until we take the next step, i.e., write some code and see whether it works or not. The devil is in the details and those devilish dirty details just can’t be seen in a picture.

Let’s take a step back. Before there is UML specifying a design, there must have been requirements stating a problem that the design addresses. Have we captured all of the requirements? Are they complete? Are they accurate? Are they unambiguous? How do you know? I believe that you don’t know, and worse, you don’t even know what you don’t know. You don’t know until you take that next step and try to design a solution. It’s only during design that phrases like, “What’s supposed to happen here,” or, “That doesn’t seem to have been addressed in the spec,” are heard. You don’t know what you don’t know until you take that next step.

It is very easy for everybody on a project to believe that they are doing good work and that the project is going according to plan. If we don’t know what we don’t know, it’s hard to know if we’re on the right track. During requirements gathering, business analysts can crank out user stories, use cases, functional requirements, or whatever artifacts the process du jour dictates they produce. Meetings can be scheduled and documents can be approved once every last detail has been captured, discussed to death, and revised to everybody’s liking. Unfortunately, until solutions for these requirements are designed, they are but a dream. There is no way to predict how long implementation will take so project plans are really interpretations of dreams.

The same danger exists during design. Architects can be cranking out UML class diagrams, sequence diagrams, and state transition diagrams. Abstractions are captured, Design Patterns are applied, and the size of the project documentation archive grows. Good work must be happening. But are the abstractions “right”? Can the code be made to do what our diagrams require? Is the design flexible, maintainable, extensible, testable (add a few more of your favorite -able’s)? You just don’t know.

The problem with Waterfall, or at least the problem with the way most companies implement it, is that there either isn’t a feedback mechanism or the feedback loop is way too long. Projects are divided into phases, and people feel that they aren’t allowed to take that crucial next step because the next step isn’t scheduled until next quarter on somebody’s PERT chart. If you don’t know what you don’t know until you use what you currently have, and the process doesn’t allow you to take the next step (more likely, somebody’s interpretation of the process doesn’t let you take the next step), then you don’t have a very efficient process in place.

In order to not delude myself that I am on the right track when in reality I’m heading down a blind alley, I would like to know the error of my ways as quickly as possible. Rapid feedback is a characteristic of all of the Agile methodologies. By learning what’s missing or wrong with what I’m currently doing, I can make corrections before too much time is wasted going in the wrong direction. A short feedback loop minimizes the amount of work that I must throw away and do again.

I’m currently working with a client who wants to adopt Extreme Programming (XP) as their development methodology. What makes this difficult is that they are enhancing legacy code and the project members are geographically distributed. The legacy code aspect means that we have to figure out how we’re going to write tests and hook them into the existing code. The fact that we’re not all sitting in the same room means that we have to have more written documentation. We don’t know the nature of the test points in the system, nor do we know what to document and how much detail to provide in documentation. We can’t rely mostly on verbal communication but we don’t want to go back to writing detailed functional specs either. There are many unknowns and what makes this relevant to this discussion is, we don’t even know all of the unknowns. Rapid feedback has to be our modus operandi.

An XP project is driven by User Stories developed by the Customer Team, composed of Business Analysts and QA people. A User Story, by definition, must be sized so that the developers can complete it within an iteration. I have a sense of how much work I personally can do in a two-week iteration, but I’m not the one doing the work, and I don’t know how much of an obstacle the existing code is going to be. The Customer Team could go off and blindly write stories, but that would lead to a high probability that we’d have to rewrite, rework, split, and join once the developers saw the stories and gave us their estimates. To minimize the amount of rework, I suggested that the Customer Team write about twenty or so stories and then meet with the developers to go over the stories and allow them to give estimates.

My plan to get a feedback loop going on user story development worked quite well. The Customer Team actually erred on the side of stories that were too small. The developers wanted to put estimates of 0.5 on some of the stories (I have my teams estimate on a scale of 1-5; I’ll write up a blog entry on the estimation process I’ve been using), so we combined a few of them and rewrote others to tackle a larger scope. We took the next step in our process, learned what we didn’t know, took corrective action, and moved forward.

Writing Customer Acceptance Tests didn’t go quite as smoothly, but it is yet another example of learning what didn’t work and making corrections. I advise my clients to write tests for their business logic and to specifically not write business tests through the user interface. Well guess where a lot of business logic resided – yep, in the user interface. We ran into the situation where the acceptance tests passed, but when the application was actually used through the user interface, it failed to satisfy the requirements. I’m very happy that we had a short feedback loop in place that revealed that our tests weren’t really testing anything interesting before the Customer Team had written too many FitNesse tests and the Development Team had written too many test fixtures.

Feedback is good. Rapid feedback is better. Seek it whenever you can. Learn what you don’t know, make corrections, and proceed.

Active Record vs Objects

Posted by Uncle Bob Fri, 02 Nov 2007 16:29:31 GMT

Active Record is a well known data persistence pattern. It has been adopted by Rails, Hibernate, and many other ORM tools. It has proven its usefulness over and over again. And yet I have a philosophical problem with it.

The Active Record pattern is a way to map database rows to objects. For example, let’s say we have an Employee object with name and address fields:
public class Employee extends ActiveRecord {
  private String name;
  private String address;
  ...
} 

We should be able to fetch a given employee from the database by using a call like:

Employee bob = Employee.findByName("Bob Martin");

We should also be able to modify that employee and save it as follows:

bob.setName("Robert C. Martin");
bob.save();

In short, every column of the Employee table becomes a field of the Employee class. There are static methods (or some magical reflection) on the ActiveRecord class that allow you to find instances. There are also methods that provide CRUD functions.

Even shorter: There is a 1:1 correspondence between tables and classes, columns and fields. (Or very nearly so).

It is this 1:1 correspondence that bothers me. Indeed, it bothers me about all ORM tools. Why? Because this mapping presumes that tables and objects are isomorphic.

The Difference between Objects and Data Structures

From the beginning of OO we learned that the data in an object should be hidden, and the public interface should be methods. In other words: objects export behavior, not data. An object has hidden data and exposed behavior.

Data structures, on the other hand, have exposed data, and no behavior. In languages like C++ and C# the struct keyword is used to describe a data structure with public fields. If there are any methods, they are typically navigational. They don’t contain business rules.

Thus, data structures and objects are diametrically opposed. They are virtual opposites. One exposes behavior and hides data, the other exposes data and has no behavior. But that’s not the only thing that is opposite about them.

Algorithms that deal with objects have the luxury of not needing to know the kind of object they are dealing with. The old example: shape.draw(); makes the point. The caller has no idea what kind of shape is being drawn. Indeed, if I add new types of shapes, the algorithms that call draw() are not aware of the change, and do not need to be rebuilt, retested, or redeployed. In short, algorithms that employ objects are immune to the addition of new types.

By the same token, if I add new methods to the shape class, then all derivatives of shape must be modified. So objects are not immune to the addition of new functions.
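For contrast with the data-structure version below, here is a hedged sketch of the object style just described (the names echo the classic example; the code itself is illustrative, not from the post):

import java.util.List;

interface Shape {
  void draw();
}

class Square implements Shape {
  public void draw() { /* render a square */ }
}

class Circle implements Shape {
  public void draw() { /* render a circle */ }
}

class Renderer {
  // Immune to new types: adding Triangle requires no change here.
  void drawAll(List<Shape> shapes) {
    for (Shape shape : shapes) {
      shape.draw();
    }
  }
}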

Now consider an algorithm that uses a data structure.

switch(s.type) {
  case SQUARE: Shape.drawSquare((Square)s); break;
  case CIRCLE: Shape.drawCircle((Circle)s); break;
}

We usually sneer at code like this because it is not OO. But that disparagement might be a bit over-confident. Consider what happens if we add a new set of functions, such as Shape.eraseXXX(). None of the existing code is affected. Indeed, it does not need to be recompiled, retested, or redeployed. Algorithms that use data structures are immune to the addition of new functions.

By the same token if I add a new type of shape, I must find every algorithm and add the new shape to the corresponding switch statement. So algorithms that employ data structures are not immune to the addition of new types.

Again, note the almost diametrical opposition. Objects and Data structures convey nearly opposite immunities and vulnerabilities.

Good designers use this opposition to construct systems that are appropriately immune to the various forces that impinge upon them. Those portions of the system that are likely to be subject to new types should be oriented around objects. On the other hand, any part of the system that is likely to need new functions ought to be oriented around data structures. Indeed, much of good design is about how to mix and match the different vulnerabilities and immunities of the different styles.

Active Record Confusion

The problem I have with Active Record is that it creates confusion about these two very different styles of programming. A database table is a data structure. It has exposed data and no behavior. But an Active Record appears to be an object. It has “hidden” data, and exposed behavior. I put the word “hidden” in quotes because the data is, in fact, not hidden. Almost all ActiveRecord derivatives export the database columns through accessors and mutators. Indeed, the Active Record is meant to be used like a data structure.

On the other hand, many people put business rule methods in their Active Record classes; which makes them appear to be objects. This leads to a dilemma. On which side of the line does the Active Record really fall? Is it an object? Or is it a data structure?

This dilemma is the basis for the oft-cited impedance mismatch between relational databases and object oriented languages. Tables are data structures, not classes. Objects are encapsulated behavior, not database rows.

At this point you might be saying: “So what Uncle Bob? Active Record works great. So what’s the problem if I mix data structures and objects?” Good question.

Missed Opportunity

The problem is that Active Records are data structures. Putting business rule methods in them doesn’t turn them into true objects. In the end, the algorithms that employ Active Records are vulnerable to changes in schema, and changes in type. They are not immune to changes in type, the way algorithms that use objects are.

You can prove this to yourself by realizing how difficult it is to implement a polymorphic hierarchy in a relational database. It’s not impossible, of course, but every trick for doing it is a hack. The end result is that few database schemas, and therefore few uses of Active Record, employ the kind of polymorphism that conveys immunity to changes in type.

So applications built around ActiveRecord are applications built around data structures. And applications that are built around data structures are procedural—they are not object oriented. The opportunity we miss when we structure our applications around Active Record is the opportunity to use object oriented design.

No, I haven’t gone off the deep end.

I am not recommending against the use of Active Record. As I said in the first part of this blog I think the pattern is very useful. What I am advocating is a separation between the application and Active Record.

Active Record belongs in the layer that separates the database from the application. It makes a very convenient halfway-house between the hard data structures of database tables, and the behavior exposing objects in the application.

Applications should be designed and structured around objects, not data structures. Those objects should expose business behaviors, and hide any vestige of the database. The fact that we have Employee tables in the database, does not mean that we must have Employee classes in the application proper. We may have Active Records that hold Employee rows in the database interface layer, but by the time that information gets to the application, it may be in very different kinds of objects.
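A minimal sketch of that separation, with hypothetical names (the record mirrors the table in the database layer, while the application sees only a behavior-exposing object built from it):

class EmployeeRecord {        // database layer: mirrors the employee table row
  String name;
  String address;
  double annualSalary;
}

interface Payable {           // what the application actually depends on
  double monthlyPay();
}

class SalariedEmployee implements Payable {
  private final double annualSalary;

  SalariedEmployee(double annualSalary) {
    this.annualSalary = annualSalary;
  }

  public double monthlyPay() {
    return annualSalary / 12.0;
  }
}

class EmployeeGateway {
  // The boundary translates rows into domain objects; no Active Record
  // types leak past this point.
  Payable toDomain(EmployeeRecord record) {
    return new SalariedEmployee(record.annualSalary);
  }
}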

Conclusion

So, in the end, I am not against the use of Active Record. I just don’t want Active Record to be the organizing principle of the application. It makes a fine transport mechanism between the database and the application; but I don’t want the application knowing about Active Records. I want the application oriented around objects that expose behavior and hide data. I generally want the application immune to type changes; and I want to structure the application so that new features can be added by adding new types. (See: The Open Closed Principle)
