DotNet Development Using Parallels Is No Longer Painful
Because of the travel involved with being an Object Mentor, my main computer is my laptop, a MacBook Pro. I have to be able to work wherever I happen to be, whether I’m at home, at Object Mentor Galactic Headquarters, or in a plane, train, or automobile. We’re allowed to buy whatever we want, and we have all independently settled on Macs running OS X. I’ve developed on lots of different platforms over the years, but I find that OS X is the most productive development environment that I’ve ever used.
Besides the travel, we have to be jacks of all trades. One week I’m programming in Java, the next week I’m teaching a design class, and the next I’m developing in .NET. I’m able to do it all on my Mac, even the .NET work, through the magic of virtualization. I choose to run Parallels so that I’m able to run Windows as a process under OS X. Windows crashing isn’t a problem anymore, in that it doesn’t bring my entire computer down. One process just dies (or I’m forced to kill it) and I just start it up again, and all the other running processes just keep right on working.
I’m a backup fanatic. Besides having bootable backups produced using SuperDuper, I just love the way TimeMachine is built into OS X, allowing me to recover versions of any file on my computer going back months. And it’s all automatic. TimeMachine just runs in the background, keeping me safe.
But the thing I just hate about Parallels + TimeMachine is that Parallels stores all of Windows as one big file. A big one. Huge! So whenever I’m working in Parallels, that one big file is being changed and is going to be backed up again the next time TimeMachine runs. Needless to say this takes up a lot of space on my TimeMachine drive, but it’s also a damn nuisance. About the time that TimeMachine finishes its backup, it’s time to start all over again, just because of the file size. Yet I dare not stop backing up. I’m in the camp that believes it’s not a question of if your computer will fail, but when.
Because Parallels creates a share with my home directory on the OS X side, I thought about just keeping my source code there. That actually works, until I try to build or do anything with the code other than edit it, and then I get a security violation of some sort.
While I’m developing in Windows, I do frequent commits to git, my new favorite source code control system. It’s easy to go back to a previous version of any file, something that I can’t say about ctrl-z in Visual Studio.
The light finally lit up over my bald head. Git is a distributed version control system. Distributed doesn’t have to be any farther away than the OS X side of my file system.
I set up a git repository in OS X for every project that I work on. Then I can push/pull from/to it, just like I would do if I was working on a team project hosted on github. Why don’t I just get a private github account? This is easier and I don’t have to be connected to the net to do push/pulls.
Here are the details in case anyone wants to try it.
In a Terminal window in OS X:
MBP:~ bob$ cd Archive/git
MBP:git bob$ mkdir PresenterAsStateMachine.git
MBP:git bob$ cd PresenterAsStateMachine.git
MBP:PresenterAsStateMachine.git bob$
MBP:PresenterAsStateMachine.git bob$ git --bare init
Initialized empty Git repository in /Users/bob/Archive/git/PresenterAsStateMachine.git/
MBP:PresenterAsStateMachine.git bob$
Then, in Windows, using the bash window that comes with git:
$ git init
Initialized empty Git repository in c:/Documents and Settings/Administrator/My Documents/Visual Studio 2008/Projects/PresenterAsStateMachine/.git/
$ git add .
$ git commit -a -m "Initial commit. Ready to add state machine."
[master (root-commit) f844369] Initial commit. Ready to add state machine.
16 files changed, 795 insertions(+), 0 deletions(-)
create mode 100644 .gitignore
create mode 100644 PresenterAsStateMachine.sln
create mode 100644 PresenterAsStateMachine.suo
create mode 100644 PresenterAsStateMachine/Form1.Designer.cs
create mode 100644 PresenterAsStateMachine/Form1.cs
create mode 100644 PresenterAsStateMachine/Form1.resx
create mode 100644 PresenterAsStateMachine/LoginPresenter.cs
create mode 100644 PresenterAsStateMachine/LoginPresenterTest.cs
create mode 100644 PresenterAsStateMachine/LoginView.cs
create mode 100644 PresenterAsStateMachine/PresenterAsStateMachine.csproj
create mode 100644 PresenterAsStateMachine/Program.cs
create mode 100644 PresenterAsStateMachine/Properties/AssemblyInfo.cs
create mode 100644 PresenterAsStateMachine/Properties/Resources.Designer.cs
create mode 100644 PresenterAsStateMachine/Properties/Resources.resx
create mode 100644 PresenterAsStateMachine/Properties/Settings.Designer.cs
create mode 100644 PresenterAsStateMachine/Properties/Settings.settings
Now I set up the git repository that I had previously created in OS X as the remote:
$ git remote add origin //.psf/Home/Archive/git/PresenterAsStateMachine.git/
$ git push origin master
Counting objects: 20, done.
Compressing objects: 100% (19/19), done.
Writing objects: 100% (20/20), 12.74 KiB, done.
Total 20 (delta 1), reused 0 (delta 0)
Unpacking objects: 100% (20/20), done.
To //.psf/Home/Archive/git/PresenterAsStateMachine.git/
* [new branch] master -> master
And now I can push and pull to my heart’s content, as well as add and commit to my local repository.
Now I configure TimeMachine to not back up the Parallels virtual machine directory. That directory still gets backed up daily by SuperDuper, but the only files that I want to be able to recover versions of are the source files that I’m working on – and this technique serves me well for that.
I Hate Cutesy Phrases
I’m in a bad mood today. It’s 12 degrees below zero outside. My reward for an hour on the elliptical trainer at the gym yesterday was gaining another 0.5 lb. on the scale this morning. And looking at the futures market, my ever dwindling retirement funds are going to sink even lower this morning. I decided to focus my frustrations towards an attack on that cutesy little phrase that we all learned in OO 101, “IS-A”.
There was a blog post making the rounds on Twitter earlier this week about the old “Is a Square a Rectangle?” problem. It’s a nice post and a quick read. Here’s the URL: http://21ccw.blogspot.com/2009/01/is-square-rectangle.html. The Square/Rectangle discussion was going on way back in the early 1990s on Usenet and it’s still applicable today. When I’m teaching, I’ll transition to the topic of the Liskov Substitution Principle by asking the class the question, “Is a square a rectangle or is a rectangle a square?” (sometimes I’ll substitute circle and ellipse for square and rectangle, just for my own amusement). I’ll ask for a show of hands, “How many people think that a square IS-A rectangle?” I’ll draw the UML for Square inheriting from Rectangle on the whiteboard as I start this topic. The reason I say this problem is still applicable today is that 99% of my students will raise their hands. The remaining 1% either don’t know, don’t care, or are enlightened and know that it’s a trick question. After 15 years, I find it hard to believe that everybody programming in (strongly typed) OO languages doesn’t know how to reason about this problem. On the other hand, I guess I should be happy, as the ignorance that exists in our industry provides a steady stream of business for us.
My classes usually chuckle when I explain how I tricked them by using that cutesy little phrase “IS-A” as I draw the UML for inheritance on the whiteboard. I explain that there isn’t enough information in the question to give an answer. Before we can say whether or not a Square should inherit from a Rectangle, we must first specify the behavior of a Rectangle. We must know what a Rectangle object can do and we must know the expectations of the clients of Rectangle. I ask the class, “Can a Rectangle have its width set independently of its height?” Everybody agrees that setWidth() and setHeight() should be methods of Rectangle. The Liskov Substitution Principle says that if a Square is to inherit from Rectangle, a square object must also respond to setWidth() and setHeight() requests. Square can certainly override those methods so that when a client calls setWidth(), the setWidth() method sets both the width and the height, maintaining its squareness. Ditto for setHeight(). The crux of the matter comes down to the expectations of the clients of the base class Rectangle. If they are interacting with a Square object through the Rectangle interface, wouldn’t they be surprised if they called setWidth() and both dimensions of the target object changed? Sometimes the answer could be yes, sometimes it could be no – it depends upon the context.
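To make the surprise concrete, here is a minimal sketch in Java. The classes and the client are my own invention for illustration, not taken from any course material:

```java
// A minimal sketch of the Square/Rectangle problem, with invented names.
class Rectangle {
    protected int width;
    protected int height;

    public void setWidth(int width)   { this.width = width; }
    public void setHeight(int height) { this.height = height; }
    public int area()                 { return width * height; }
}

class Square extends Rectangle {
    // Square preserves its squareness by changing both dimensions.
    @Override public void setWidth(int w)  { width = w; height = w; }
    @Override public void setHeight(int h) { width = h; height = h; }
}

class LspDemo {
    // A client written against Rectangle's contract.
    static int stretch(Rectangle r) {
        r.setWidth(5);
        r.setHeight(4);
        return r.area();    // a Rectangle client expects 5 * 4 = 20
    }

    public static void main(String[] args) {
        System.out.println(stretch(new Rectangle())); // prints 20
        System.out.println(stretch(new Square()));    // prints 16 -- surprise!
    }
}
```

The client did everything Rectangle’s interface allows, yet handing it a Square silently changed the answer. That is exactly the substitutability violation the trick question is probing for.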
We’ve come a long way from “IS-A”. Clearly, reasoning about an inheritance hierarchy is more involved than asking whether something “IS-A” something else, in some sense of the word. If we must use a cutesy phrase, a better one, and one that adheres to the Liskov Substitution Principle, would be “Is Substitutable For”.
Before I joined Object Mentor, I used to work with a guy who actually taught “IS-A”. He was well respected in the organization (not by me, I thought of him as the village idiot) and would tell people to go through the requirements document looking for the phrase “is-a” between two nouns. The nouns are classes and the “IS-A” association signals the use of inheritance. How can anyone with the slightest modicum of intelligence believe that this could work? He was actually telling people to base the architecture of their systems upon the linguistic skills of whoever wrote the requirements document. I’m glad I don’t work there anymore.
Don’t even get me started on “HAS-A”.
Refactoring Finds Dead Code
One of the many things that I just love about my job as a consultant/mentor is when I actually get to sit down with programmers and pair program with them. This doesn’t seem to happen nearly as often as I would like, so when two developers at a recent client site asked me if I could look at some legacy code to see if I could figure out how to get some tests around it, I jumped at the opportunity. We acquired a room equipped with a projector and a whiteboard. A laptop was connected to the projector and we were all able to comfortably view the code.
I visit a lot of different companies and see a lot of absolutely ghastly code. What I was looking at here wasn’t all that bad. Variable names were not chosen to limit keystrokes and method names appeared to be descriptive. This was good news, as I needed to understand how this was put together before I could offer help with a testing strategy.
As we walked through the code, I noticed that there were classes in the project directory ending in ‘Test’. This took me by surprise. Usually when I’m helping people with legacy code issues, there aren’t any tests. Here, there were tests in place and they actually passed. Very cool, but now my mission wasn’t clear to me as I thought my help was needed getting tests in place around legacy code.
The developers clarified that they wanted help in testing private methods. Ah ha, the plot thickens.
The question of testing private methods comes up frequently whether I’m teaching a class or consulting on a project. My first response is a question. “Is the private method being tested through the public interface to the class?” If that’s the case, then there’s nothing to worry about and I can steer the conversation away from testing private methods to testing behaviors of a class instead of trying to test individual methods. Note that a private method being tested through its public interface would be guaranteed if the class was developed TDD style where the test is written first, followed by one or more public methods to make the test pass, followed by one or more extract method refactorings, which would be the birth of the private methods. This is almost never the case. My client didn’t know how the code was developed, but by inspection they concluded that the parameters of the test were adequately exercising the private method.
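As an illustration (the invoice domain and all names here are mine, not the client’s code), a private method born from an Extract Method refactoring gets exercised every time the public behavior is tested:

```java
import java.util.List;

// Invented example: the private sum() was extracted from total(), so any
// test of the public total() also exercises the private method.
class InvoiceTotaler {
    private static final double TAX_RATE = 1.08;

    public double total(List<Double> lineItems) {
        return sum(lineItems) * TAX_RATE;
    }

    // Born from an Extract Method refactoring; never tested directly.
    private double sum(List<Double> items) {
        double t = 0.0;
        for (double item : items) {
            t += item;
        }
        return t;
    }
}
```

A test asserting that total(List.of(10.0, 20.0)) is approximately 32.4 covers sum() completely, with no need to break encapsulation.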
It looked like my work was done here. But not so fast.
I have a policy that says that whenever I have code up in an editor, I have to try to leave it just a little bit better than when I found it. Since we had put the testing matter to rest and we still had some time left in the conference room before another meeting started, I suggested that we see if we could make some small improvements to the code we were examining.
As I said earlier, the code wasn’t horrible. The code smell going past my nostrils was Long Method and the cure was Extract Method.
The overall structure of the method we were examining was
if( conditional_1 )
{
// do lots of complicated stuff
}
else if( conditional_2 )
{
// do even more complicated stuff
}
else
{
// do stuff so complicated nobody understood it
}
where conditional_1 was some horribly convoluted expression involving lots of &&’s, ||’s, and parentheses. Same for conditional_2, which also had a few ^’s thrown in for good luck. To understand what the method did, one would have to first understand the details of how the method did it.
I asked the developers if they could come up with a nice, descriptive method name for what I’m calling conditional_1 so that we could do an extract method refactoring and the code would look like:
if( descriptive_name() )
{
// do lots of complicated stuff
}
// etc
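Filled out with an invented shipping domain (none of these names come from the client’s code), the before and after of that refactoring might look like:

```java
// Hypothetical illustration of the Extract Method step described above.
class OrderRouter {
    final boolean rush;
    final boolean international;
    final double weight;

    OrderRouter(boolean rush, boolean international, double weight) {
        this.rush = rush;
        this.international = international;
        this.weight = weight;
    }

    // Before: the reader must decode the boolean soup to learn the intent.
    String routeBefore() {
        if ((rush && !international) || (weight < 1.0 && !international)) {
            return "domestic";
        } else if (international && (rush ^ (weight > 10.0))) {
            return "expressInternational";
        } else {
            return "freight";
        }
    }

    // After: each condition gets a descriptive name, so the method reads
    // as a statement of the business rule.
    String route() {
        if (qualifiesForDomesticShipping()) {
            return "domestic";
        } else if (qualifiesForExpressInternational()) {
            return "expressInternational";
        } else {
            return "freight";
        }
    }

    private boolean qualifiesForDomesticShipping() {
        return (rush && !international) || (weight < 1.0 && !international);
    }

    private boolean qualifiesForExpressInternational() {
        return international && (rush ^ (weight > 10.0));
    }
}
```

The behavior is identical in both versions; only the readability changes.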
Now there were fewer details to understand when trying to determine what this method did. If we were to stop here and check in the code, we could leave the session feeling good, as the code was better than when we started. But we still had time before we had to abandon our conference room, so I pressed on.
“Can you summarize what this code does as a descriptive method name,” I asked. The developers pondered a few moments and came up with what they felt was a good name. Excellent. We did the same procedure for the “else if” clause. When we finished that, one of the developers said something along the lines of, “That was the easy part, I have no idea what this [else] remaining code does.” I was going to pat everybody on the back and call it a day because the code had improved tremendously from when we started, but the developers seemed to have a “we can’t stop now” attitude. They studied the code, pondered, cursed, discussed some more, and then one of them said, “This code can never execute!”
I’d like to use the expression, “You could have heard a pin drop,” to describe the silence in the room, but since there were only three of us, the phrase loses its power. As it turns out, now that the if() and else if() conditionals were given descriptive names and people could grok them at a glance, it became obvious that the business rules didn’t permit the final else – the first two conditions were the only ones that could occur, and the most complicated code of all was never being called. This was downright comical!
I asked if the tests would still pass if we deleted the code and after a short examination of the tests, the developers weren’t as confident that the test parameters actually hit that area of code. There was a log() statement in that code and one of the developers was going to examine the production logs to see if the code ever executed.
So there you have it, refactor your code and the bad parts just go away!
Velocity Inflation Triggers Productivity Recession
A team’s velocity is a measure of its productivity. A higher velocity means more work is being completed each iteration and a team that shows an increasing velocity from iteration to iteration is being more and more productive. This isn’t necessarily a good thing.
In Extreme Programming (XP), a team’s velocity is defined as the number of story points completed in an iteration. We can track the velocity for each iteration and calculate the average velocity over several iterations. If we know how many story points comprise an application, we can even project when the application can be released. Velocity is a very useful measure of a team’s capabilities. I have found it to be a much more reliable (make that infinitely more reliable) number for project planning than asking developers for time estimates based on requirements.
Because velocity correlates to a team’s productivity, there is a tendency for managers to try to “goose” the velocity. The technical term for this is a stretch goal. Just like the coach of a sports team trying to get the team to play better, I can imagine project managers saying at a team meeting at an iteration boundary, “Come on team – you did 22 points last week, what do you say we try for 25 this week?” This same manager probably buys those motivational posters of eagles soaring and hangs them all around the team room.
The programming team takes pride in their achievements and they also like to see their velocity increasing. They will try to persuade the Customer to give them points for any bit of work they do. I can imagine a developer putting in a few minutes work to change the position of a GUI button on a screen and begging the Customer, “Can I get a point for that?”
XP teams can become obsessed with their velocity, but increasing velocity isn’t always a good thing. A team can inadvertently become so focused on getting points done that they sometimes forget the real goal – to deliver a quality product.
This became apparent to me recently when I visited a client to see how they were doing in their transition to XP. Their velocity had skyrocketed since I last saw them – and it was definitely not a good thing.
When Object Mentor does an XP transition with a client, we start with our XP Immersion course to get everybody on the same page about what our goals are. Ideally, we use two instructors, one to train the programmers in topics such as Test Driven Development and Refactoring, and the other coach teaches story writing, iteration planning, and release planning to the customer team. We then bring the two groups together for a day and have the customer team drive the programming team on a classroom exercise so everybody can experience how the process works. The instructors then stay and work with the team for a few iterations, coaching them on how to use what they learned in class on their own project.
Such was the case with the client I mentioned earlier. I was working with the programmers and the other coach had the customer team. When it came time to estimate stories, the other coach suggested using an open-ended scale. Where I like to use a scale of 1-5, this team was using a 1-infinity scale. That’s fine, it doesn’t really matter which scale you use, as long as whatever scale you choose is consistent. Stories were estimated according to their relative complexities, iterations were planned, and the programmers started cranking out code. After a few iterations the team’s velocity settled to a value of about 14 points. The team was doing fine and it was time for me to move on and let them do it alone.
When I returned to do a process checkup, their velocity had climbed to 48 points. Wow. This new process must really be resonating with the team. I purposely timed my visit to occur on an iteration boundary and we conducted a retrospective on how well the team was adhering to the XP practices. This turned out to be the bad news.
With a focus on getting more points done, it seemed that the team had abandoned many of the practices. Programmers weren’t pairing, weren’t writing unit tests, and weren’t refactoring their code. Customers, trying to stay ahead of the programmers’ ever-increasing productivity, had abandoned writing automated acceptance tests in FitNesse and were back to manual testing. I was heartbroken.
Beck teaches that one of the things teams have to do is to adapt the practices to the organization. Perhaps what they were doing was adapting. Adapting to the point that they weren’t even close to doing XP anymore, but adapting nonetheless. I wondered how that was working for them.
I observed the iteration planning meeting for the upcoming iteration and I noticed that all of the user stories were bug fixes from the previous iteration. No new functionality was being added to the application; the team was essentially spinning its wheels. So even though the velocity indicated a very productive team, the actual productivity was essentially zero. There must be a lesson here.
Velocity is what it is. You must take great care when encouraging a team to increase its velocity because you will always get whatever you’re trying to maximize. You have to be careful that you are maximizing the right thing. Want to maximize number of lines of code produced? Well, you’re most likely going to get a lot of really crappy code. Want to maximize the number of unit tests that developers write? You’ll get a lot of them, but they’ll probably be worthless. Applying pressure to increase the velocity might make developers subconsciously inflate their estimates or, worse, abandon good development practices in order to get the numbers up.
You Don't Know What You Don't Know Until You Take the Next Step
I was teaching our brand new Principles, Patterns, and Practices course recently (https://objectmentor.com/omTraining/course_ood_java_programmers.html) and I was starting the section on The Single Responsibility Principle.
I had this UML class diagram projected on the screen:
Employee
+ calculatePay()
+ save()
+ printReport()
I asked the class, “How many responsibilities does this class have?” Those students who had the courage to answer a question out loud (sadly, a rare subset of students) all mumbled, “Three.” I guess that makes sense, three methods means three responsibilities. My friend and colleague Dean Wampler would call this a first-order answer (he’s a physicist and that’s how he talks ;-) ). The number will increase as we dig into details. I held one finger in the air and said, “It knows the business rules for how to calculate its pay.” I put up a second finger and said, “It knows how to save its fields to some persistence store.”
“I happen to know that you folks use JDBC to talk to your Oracle database.” Another finger for, “It knows the SQL to save its data.” Another finger for, “It knows how to establish a database connection.” My entire hand is held up for, “It knows how to map its fields to SQLStatement parameters.” I start working on my other hand with, “It knows the content of the report.” Another finger for, “It knows the formatting of the report.” If this example was a real class from this company’s code base I knew I’d be taking off my shoes and socks.
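One conventional remedy is to give each responsibility a home of its own. The sketch below is mine, with invented names, and fakes the persistence store with an in-memory map:

```java
import java.util.HashMap;
import java.util.Map;

// Plain data holder; no behavior of its own.
class EmployeeData {
    String name;
    double hourlyRate;
    double hoursWorked;
}

// Responsibility 1: the business rules for calculating pay.
class PayCalculator {
    double calculatePay(EmployeeData e) {
        return e.hourlyRate * e.hoursWorked;
    }
}

// Responsibility 2: persistence (an in-memory stand-in for the database).
class EmployeeRepository {
    private final Map<String, EmployeeData> store = new HashMap<>();

    void save(EmployeeData e)      { store.put(e.name, e); }
    EmployeeData find(String name) { return store.get(name); }
}

// Responsibility 3: reporting.
class EmployeeReporter {
    String printReport(EmployeeData e) {
        return e.name + " worked " + e.hoursWorked + " hours";
    }
}
```

Now a change to the report format touches only EmployeeReporter, and a schema change touches only EmployeeRepository; neither can break the pay calculation.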
Not that my answer was any better than the students’ answer; given the information at hand there can’t be a right answer, because I didn’t provide any context for my question. I found our different answers interesting though. This particular company would have a design phase for a project where UML diagrams would be produced and discussed. How can any reviewer know if a class diagram is “right”?
I have this belief (hang-up?) that you don’t really know what you don’t know, until you take the next step and actually use what you currently have. Looking at UML, we can’t really say that it’s going to work or that it satisfies the Principles of Object Oriented Design, until we take the next step, i.e., write some code and see whether it works or not. The devil is in the details and those devilish dirty details just can’t be seen in a picture.
Let’s take a step back. Before there is UML specifying a design, there must have been requirements stating a problem that the design addresses. Have we captured all of the requirements? Are they complete? Are they accurate? Are they unambiguous? How do you know? I believe that you don’t know, and worse, you don’t even know what you don’t know. You don’t know until you take that next step and try to design a solution. It’s only during design that phrases like, “What’s supposed to happen here,” or, “That doesn’t seem to have been addressed in the spec,” are heard. You don’t know what you don’t know until you take that next step.
It is very easy for everybody on a project to believe that they are doing good work and that the project is going according to plan. If we don’t know what we don’t know, it’s hard to know if we’re on the right track. During requirements gathering, business analysts can crank out user stories, use cases, functional requirements, or whatever artifacts the process du jour dictates they produce. Meetings can be scheduled and documents can be approved once every last detail has been captured, discussed to death, and revised to everybody’s liking. Unfortunately, until solutions for these requirements are designed, they are but a dream. There is no way to predict how long implementation will take so project plans are really interpretations of dreams.
The same danger exists during design. Architects can be cranking out UML class diagrams, sequence diagrams, and state transition diagrams. Abstractions are captured, Design Patterns are applied, and the size of the project documentation archive grows. Good work must be happening. But are the abstractions “right”? Can the code be made to do what our diagrams require? Is the design flexible, maintainable, extensible, testable (add a few more of your favorite -able’s)? You just don’t know.
The problem with Waterfall, or at least the problem with the way most companies implement it, is that there either isn’t a feedback mechanism or the feedback loop is way too long. Projects are divided into phases and people feel that they aren’t allowed to take that crucial next step because the next step isn’t scheduled until next quarter on somebody’s PERT chart. If you don’t know what you don’t know until you use what you currently have, and the process doesn’t allow you to take the next step (more likely somebody’s interpretation of the process doesn’t let you take the next step), then we don’t have a very efficient process in place.
In order to not delude myself that I am on the right track when in reality I’m heading down a blind alley, I would like to know the error of my ways as quickly as possible. Rapid feedback is a characteristic of all of the Agile methodologies. By learning what’s missing or wrong with what I’m currently doing, I can make corrections before too much time is wasted going in the wrong direction. A short feedback loop minimizes the amount of work that I must throw away and do again.
I’m currently working with a client who wants to adopt Extreme Programming (XP) as their development methodology. What makes this difficult is that they are enhancing legacy code and the project members are geographically distributed. The legacy code aspect means that we have to figure out how we’re going to write tests and hook them into the existing code. The fact that we’re not all sitting in the same room means that we have to have more written documentation. We don’t know the nature of the test points in the system, nor do we know what to document and how much detail to provide in documentation. We can’t rely mostly on verbal communication but we don’t want to go back to writing detailed functional specs either. There are many unknowns and what makes this relevant to this discussion is, we don’t even know all of the unknowns. Rapid feedback has to be our modus operandi.
An XP project is driven by User Stories developed by the Customer Team, composed of Business Analysts and QA people. A User Story, by definition, must be sized so that the developers can complete it within an iteration. I have a sense how much work that I personally can do in a two week iteration, but I’m not the one doing the work and I don’t know how much of an obstacle the existing code is going to be. The Customer Team could go off and blindly write stories, but that would lead to a high probability that we’d have to rewrite, rework, split, and join once the developers saw the stories and gave us their estimates. To minimize the amount of rework, I suggested that the Customer Team write about twenty or so stories and then meet with the developers to go over the stories and allow them to give estimates.
My plan to get a feedback loop going on user story development worked quite well. The Customer Team actually erred on the side of stories that were too small. The developers wanted to put estimates of 0.5 on some of the stories (I have my teams estimate on a scale of 1-5. I’ll write up a blog entry on the estimation process I’ve been using.) so we combined a few of them and rewrote others to tackle a larger scope. We took the next step in our process, learned what we didn’t know, took corrective action, and moved forward.
Writing Customer Acceptance Tests didn’t go quite as smoothly, but it is yet another example of learning what didn’t work and making corrections. I advise my clients to write tests for their business logic and to specifically not write business tests through the user interface. Well guess where a lot of business logic resided – yep, in the user interface. We ran into the situation where the acceptance tests passed, but when the application was actually used through the user interface, it failed to satisfy the requirements. I’m very happy that we had a short feedback loop in place that revealed that our tests weren’t really testing anything interesting before the Customer Team had written too many FitNesse tests and the Development Team had written too many test fixtures.
Feedback is good. Rapid feedback is better. Seek it whenever you can. Learn what you don’t know, make corrections, and proceed.
Teaching an Old Dawg New Tricks
I learned something a few weeks ago that has saved me quite a bit of typing. I’m a pretty good typist but I still feel that saving a few seconds here and there pays compound interest when integrated over weeks, months, and a career. I’ve never seen anyone do this before and I get to work with a lot of programmers so I thought I’d share it here.
Joe “J.B.” Rainsberger was in our office a few weeks ago giving some internal training. Joe is the author of JUnit Recipes and if you don’t already own the book, buy two copies – it’s very good. Anyway, Joe was doing a demo for us and I watched how he created objects in Eclipse. If I wanted a new Counter object, for example, I would type:
Counter counter = new Counter();
But Joe would type:
new Counter();
and he then uses Eclipse’s Quick Fix (Ctrl+1), which offers to either create a local variable or a field in the class. That’s 2 keystrokes if Eclipse does it or 15 if I type it. That’s quite a return on investment in my book.
Try it and see if it makes you go just a little bit faster.
Size Matters
Contrary to what you may have heard or what you might like to hear, size really does matter. We programmers must take matters into our own hands and become masters of our domains. Unless we take action, things are just going to get bigger and bigger until we have a real mess on our hands.
I travel a lot and I get to visit a lot of different companies. No matter which industry a company is in or which programming language a team is using, there is one commonality in all of the code that I see – classes are just too damn big and methods are just too damn long. (What did you think I was talking about?)
Way back in the olden days when I had hair on my head, I studied Structured Design. This was where I learned the concept of cohesion. A software module with high cohesiveness was considered a good thing. As I transitioned to Object Oriented Design (still with a full head of hair), I learned Bertrand Meyer’s One Responsibility Rule and later Robert Martin’s Single Responsibility Principle. These latter two concepts restate and reinforce what Larry Constantine said back when Structured Design was in vogue – a module should do one thing and do it very well.
The trouble with this idea of a module (class or method) doing one thing is that it is subjective. What I consider one thing you might consider several things. For example, I might see a method as getting a Customer object out of a database, yet you see it as:
- establishing a connection to the database,
- forming the SQL,
- executing the SQL, and
- creating and returning a Customer object from the results of the SQL execution.
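To make those four steps concrete, here is a minimal sketch in Java. A HashMap stands in for the real database, and every name here (CustomerGateway, findCustomer, and so on) is invented for illustration:

```java
import java.util.HashMap;
import java.util.Map;

// The "get a Customer out of the database" method, split into the four
// smaller steps listed above. The Map is a stand-in for a real database.
class Customer {
    final String name;
    Customer(String name) { this.name = name; }
}

class CustomerGateway {
    private final Map<Integer, String> table = new HashMap<>();

    CustomerGateway() { table.put(42, "Alice"); } // canned data for the sketch

    // The "one thing" the caller sees:
    Customer findCustomer(int id) {
        Map<Integer, String> connection = establishConnection();
        String sql = formSql(id);
        String row = executeSql(connection, id, sql);
        return buildCustomer(row);
    }

    // ...which is really four smaller things, each short enough to name:
    private Map<Integer, String> establishConnection() { return table; }

    private String formSql(int id) {
        return "SELECT name FROM customers WHERE id = " + id;
    }

    private String executeSql(Map<Integer, String> conn, int id, String sql) {
        return conn.get(id); // a real gateway would run the SQL here
    }

    private Customer buildCustomer(String row) {
        return new Customer(row);
    }
}
```

Whether findCustomer does “one thing” or four depends on which level you stand at; the point is that each level has a name.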
A guideline that sometimes works when deciding if a module is doing “too much” is simply to describe what it does. If you find yourself using the word “and” in this description (or working really hard to phrase the description in such a way as to avoid using “and”), it might be doing too much. Of course you have to adhere to the spirit of the guideline. I can describe the National Air Traffic Control System as “Prevents Collisions” without using the word “and” once. It doesn’t follow that we can write the entire system as one class with one method – let’s call it main. Or, if you program in C#, call it Main.
Uncle Bob presents a different view of “responsibility” in his Agile Software Development book. He defines a responsibility as a reason to change. If a class has two reasons to change, it has two responsibilities, and it might be wise to split the class into its two pieces.
This notion of breaking a class into smaller and smaller pieces is exactly opposite to what I learned when I first started studying OO. Way back when I worried about bad-hair days, people believed that a class should encapsulate everything that concerned it. A Customer class would know the business rules of being a Customer as well as how to retrieve itself from the database and display its data. That’s a fine idea, provided the database schema, the display, and the business rules never change. If any one of those responsibilities changes, we are at a high risk of breaking other things that are coupled to it.
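One hypothetical way that split might look – three small classes, each with a single reason to change. All names and the hard-coded data here are invented for the example:

```java
// Hypothetical split of the all-knowing Customer class into three classes,
// each with exactly one reason to change.
class Customer {                       // changes when business rules change
    private final double balance;
    Customer(double balance) { this.balance = balance; }
    boolean isInGoodStanding() { return balance >= 0; }
    double balance() { return balance; }
}

class CustomerRepository {             // changes when the schema changes
    Customer load(int id) {
        // a real implementation would query the database here
        return new Customer(100.0);
    }
}

class CustomerView {                   // changes when the display changes
    String render(Customer c) {
        return "Balance: " + c.balance();
    }
}
```

Now a new database schema touches only CustomerRepository, a redesigned screen touches only CustomerView, and the business rules stay safely inside Customer.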
So what’s the real problem? What harm does it do if we go around proudly making our big modules even bigger? Well, I can think of a few problems:
1. Comprehensibility
The bigger a method (or class) is, the harder it is to understand without significant study and effort. I believe that as soon as I have to scroll a method in order to read it, I’m wasting valuable time because the method is doing more than my brain can hold. I often find myself scrolling back because I forgot what scrolled off the top of my window. I know I’m in a minority (considering all the code I’ve seen with massive classes and methods) but I like methods short and sweet.
2. Magnets for change
Massive classes with big methods do a lot – they have to, because there is a lot of code in them. Unfortunately, that also means there’s a lot that can go wrong in them. When we fix a bug in one of these huge modules, we have to change the code – and changing code often means the code becomes worse. When we have to add functionality, the hooks seem to be in these big classes, so they get even bigger, and once again the code deteriorates even more. It’s easier to add code to existing classes and methods than it is to create new classes. Some companies keep a heavy hand on their source code repositories, and developers would rather make existing classes bigger than deal with the bureaucracy of adding another module to the corporate repository.
3. Collisions
Because a lot of code is in each of the ever-growing modules, it stands to reason that different team members will be editing the code in these modules for different reasons. You know what that means come check-in time: the dreaded diff and merge. And because it’s so painful, developers put it off as long as possible, which only makes the problem worse.
4. Lack of reuse
The bigger a module is, the less likely it is that you will be able to reuse it in a different context. It does so much that it becomes specialized to the current context.
5. Comments are needed
Large methods can’t be named for their intent; that is, you can’t tell what a method does from its name because it does so much. Earlier this year I saw a multi-thousand-line method named ‘execute’. Yeah, it was obvious what it did – not. Developers tend to write comments to explain what a method does. We’ve all seen them – a comment announcing that the next 200 lines calculate a thingamabob, then another explaining the 450 lines after that. The problem with comments is that in large systems, worked on by numerous developers over a period of years, the code goes one way while the comments tend to stay as originally written. When the code says one thing and the comments say another, which will you believe? Yet more obscurity.
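One common remedy is to turn each commented section into a method whose name says what the comment said, so the name can never drift out of date the way a comment can. A small, hypothetical payroll sketch (the overtime rules are made up for the example):

```java
// Replacing "the next N lines calculate X" comments with
// intention-revealing method names. All names are invented.
class PayrollCalculator {
    // Before: one method with comments marking each section.
    double grossPayCommented(double hours, double rate) {
        // calculate base pay
        double base = Math.min(hours, 40) * rate;
        // calculate overtime pay
        double overtime = Math.max(hours - 40, 0) * rate * 1.5;
        return base + overtime;
    }

    // After: each commented section becomes a method named for its intent.
    double grossPay(double hours, double rate) {
        return basePay(hours, rate) + overtimePay(hours, rate);
    }

    private double basePay(double hours, double rate) {
        return Math.min(hours, 40) * rate;
    }

    private double overtimePay(double hours, double rate) {
        return Math.max(hours - 40, 0) * rate * 1.5;
    }
}
```

The two versions compute the same number, but in the second one the comments have become names the compiler checks on every call.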
6. Decreased code quality
Where do you think you’ll spot a glaring defect quicker, in a 6 line method or in the middle of a 300 line while loop with page-long if-else constructs in it?
7. Increased maintenance costs
Every time you visit a large module, you have to understand the piece that you’re going to work on. Often, that means you have to understand the entire module just to find the area where you are going to work. All of this takes time, and time is money. Ha – I seem to be on a roll with these sayings – “time is money”, “size matters”, hmmmm…. can “your check is in the mail” be far behind?
8. Harder performance profiling
When you are trying to locate performance bottlenecks, and whatever tool or timing mechanism you’re using tells you that the 8,000-line doItToIt() method is taking “too long”, how are you going to find where all the time is spent? It would be much easier to see that the 6-line calculateAmount() function was taking too long because it was hitting the database.
There are many things in life that I wish were bigger—lots bigger (hard disk, thumb drive, monitor, RAM, etc.)—but classes and methods should not be on that list. Code that can be understood at a glance, with good, meaningful, intention-revealing names, can go a long way toward keeping software costs down and making our lives as developers better. I know, it isn’t easy to spare the time. It’s much easier to use that time struggling to squeeze another clause into that already overgrown method; it’s much easier to use that time single-stepping through the vastly indented forest of if/else/for/while; it’s much easier to use that time poring through a tangled and twisted rat’s nest of code over and over just to work out what it might be doing. Oh yes, it’s MUCH easier.