Private vs Protected
Someone on comp.object recently asked why anyone would make a field private since privacy ruins extensibility.
I recently read an article on comp.object that asked the following question:
While I can see that the ‘private’ modifier has its uses, I’m puzzled as to why it’s advocated so much given that one of the strong points of OO is extensibility.
I responded with:
The Open-Closed Principle of OOD (See article) says that objects should be open for extension but closed for modification. In other words, you should be able to change what a module does without changing the module. Extensibility, in OO, is best achieved when you keep the code you are extending safe from modification.
How do you protect a module from the forces that would try to modify it? One technique is to keep the variables that module depends upon private. If a variable is not private, then it is open to be used in a way that the module that owns that variable does not intend. Indeed, using a variable in an unintended way is the only reason to make the variable public or protected. But when you use a variable in an unintended way you likely force modifications into the owner. If, on the other hand, all the variables of a module are private, then no modification can be caused through unintended usage.
Privacy does not preclude extensibility. You can create public or protected accessor methods that: 1) provide extenders access to certain variables, and 2) ensure that the extenders don’t use the variable in an unintended way.
For example, consider a variable v used by a module m, such that v should never be negative. If you make v public or protected, someone could set it to a negative number, breaking the code in m and possibly forcing modification to m. However, if v is private but accessible through getV and setV methods, and if the setV method throws an exception when you pass it a negative number, then m is safe, and extenders are forced to follow the rules that m expects.
To be fair, while I am a big proponent of keeping variables private, I have also come to rely much more on my unit tests to enforce the appropriate use of variables. When the code enjoys 90+% unit test coverage those tests will uncover and prevent variable misuse. This softens the need for the compiler to enforce privacy. This is not to say that you should not make your variables private; you should. It is to say that if you use TDD, the cost/benefit ratio changes, and you may find that you can soften access to some variables.
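To make the v and m example concrete, here is a minimal sketch in Java. The class names GuardedModule and ExtendedModule and the method rootOfV are hypothetical, invented only for illustration; the one thing taken from the text is that v is private and setV rejects negative values.

```java
// Hypothetical module "m" that owns a variable v which must never be negative.
class GuardedModule {
    private int v; // private: nothing outside this class can assign it directly

    GuardedModule(int initialV) {
        setV(initialV); // the constructor goes through the same guard
    }

    public int getV() {
        return v;
    }

    // The only write path to v. It is final so an extender cannot remove the check.
    public final void setV(int newV) {
        if (newV < 0) {
            throw new IllegalArgumentException("v must not be negative: " + newV);
        }
        this.v = newV;
    }

    // Stand-in for code inside m that relies on v >= 0.
    public double rootOfV() {
        return Math.sqrt(v);
    }
}

// An extender can still read and change v, but only within the rules m expects.
class ExtendedModule extends GuardedModule {
    ExtendedModule(int initialV) {
        super(initialV);
    }

    public void doubleV() {
        setV(getV() * 2);
    }
}
```

The extension adds behavior without touching GuardedModule, and an attempt such as setV(-1) fails at the call site instead of corrupting the module.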
Testing GUIs Part I: RoR.
Testing GUIs is one of the holy grails of Test Driven Development (TDD). Many teams who have adopted TDD for other parts of their projects have, for one reason or another, been unable to adequately test the GUI portion of their code.
In this series of articles I will show that GUI testing is a solved problem. Over the years the TDD community has produced and accumulated tools, frameworks, libraries, and techniques that allow any team to test their GUI code as fully as any other part of their code.
Testing GUIs Part I: Ruby on Rails
In the world of web development, no community has solved the problem of GUI testing better than the Ruby on Rails community. When you work on a rails project, testing the GUI is simply de rigueur. The rails framework provides all the necessary tools and access points for testing all aspects of the application, including the generation of HTML and the structure of the resulting web pages.
Web pages in rails are specified by .rhtml files that contain a mixture of HTML and ruby code similar to the way Java and HTML are mixed in .jsp files. The difference is that .rhtml files are translated at runtime rather than being compiled into servlets the way .jsp pages are. This makes it very easy for the rails environment to generate the HTML for a web page outside of the web container. Indeed, the web server does not need to be running.
This ease and portability of generating HTML means that the rails test framework merely needs to set up the variables needed by the ruby scriptlets within the .rhtml files, generate the HTML, and then parse that HTML into a form that the tests can query.
A typical example.
The tests query the HTML using an xpath-like syntax coupled with a suite of very powerful assertion functions. The best way to understand this is to see it. So here is a simple file named autocomplete_teacher.rhtml:
<ul class="autocomplete_list">
<% @autocompleted_teachers.each do |t| %>
<li class="autocomplete_item"><%= "#{create_name_adornment(t)} #{t.last_name}, #{t.first_name}"%></li>
<% end %>
</ul>
You don’t have to be a ruby programmer to understand this. All it is doing is building an HTML list. The Ruby scriptlet between the <% and %> tokens simply loops over each teacher, creating an <li> tag from an “adornment” and the first and last name. (The adornment happens to be the database id of the teacher in parentheses.)
A simple test for this .rhtml file is:
def test_autocomplete_teacher_finds_one_in_first_name
  post :autocomplete_teacher, :request=>{:teacher=>"B"}
  assert_template "autocomplete_teacher"
  assert_response :success
  assert_select "ul.autocomplete_list" do
    assert_select "li.autocomplete_item", :count => 1
    assert_select "li", "(1) Martin, Bob"
  end
end
- The post statement simply invokes the controller that would normally be invoked by a POST url of the form: POST /teachers/autocomplete_teacher with the teacher parameter set to "B".
- The first assertion makes sure that the controller rendered the autocomplete_teacher.rhtml template.
- The next makes sure that the controller returned success.
- The third is a compound assertion that starts by finding the <ul> tag with a class="autocomplete_list" attribute. (Notice the use of css syntax.)
  - Within this tag there should be an <li> tag with a class="autocomplete_item" attribute,
  - and containing the text (1) Martin, Bob.
It should not come as any surprise that this test runs in a test environment in which the database has been pre-loaded with very specific data. For example, this test database always has “Bob Martin” being the first row (id=1) in the Teacher table.
The assert_select function is very powerful, and allows you to query large and complex HTML documents with surgical precision. Although this example gives you just a glimpse of that power, you should be able to see that the rails testing scheme allows you to test that all the scriptlets in an .rhtml file are behaving correctly, and are correctly extracting data from the variables set by the controller.
An example using RSpec and Behavior Driven Design.
What follows is a more significant rails example that uses an alternate testing syntax known as Behavior Driven Design (BDD). The tool that accepts this syntax is called RSpec.
Imagine that we have a page that records telephone messages taken from teachers at different schools. Part of that page might have an .rhtml syntax that looks like this:
<h1>Message List</h1>
<table id="list">
  <tr class="list_header_row">
    <th class="list_header">Time</th>
    <th class="list_header">Caller</th>
    <th class="list_header">School</th>
    <th class="list_header">IEP</th>
  </tr>
  <%time_chooser = TimeChooser.new%>
  <% for message in @messages %>
    <%cell_class = cycle("list_content_even", "list_content_odd")%>
    <tr id="list_content_row">
      <td id="time" class="<%=cell_class%>"><%=h(time_chooser.format_time(message.time)) %></td>
      <td id="caller" class="<%=cell_class%>"><%=h person_name(message.caller) %></td>
      <td id="school" class="<%=cell_class%>"><%=h message.school.name %></td>
      <td id="iep" class="<%=cell_class%>"><%=h (message.iep ? "X" : "") %></td>
    </tr>
  <% end %>
</table>
Clearly each message has a time, caller, school, and some kind of boolean field named “IEP”.
We can test this .rhtml file with the following RSpec specification:
context "Given a request to render message/list with one message the page" do
  setup do
    m = mock "message"
    caller = mock "person",:null_object=>true
    school = mock "school"
    m.should_receive(:school).and_return(school)
    m.should_receive(:time).and_return(Time.parse("1/1/06"))
    m.should_receive(:caller).any_number_of_times.and_return(caller)
    m.should_receive(:iep).and_return(true)
    caller.should_receive(:first_name).and_return("Bob")
    caller.should_receive(:last_name).and_return("Martin")
    school.should_receive(:name).and_return("Jefferson")
    assigns[:messages]=[m]
    assigns[:message_pages] = mock "message_pages", :null_object=>true
    render 'message/list'
  end
  specify "should show the time" do
    response.should_have_tag :td, :content=>"12:00 AM 1/1", :attributes=>{:id=>"time"}
  end
  specify "should show caller first and last name" do
    response.should_have_tag :td, :content=>"Bob Martin", :attributes=>{:id=>"caller"}
  end
  specify "should show school name" do
    response.should_have_tag :td, :content=>"Jefferson", :attributes=>{:id=>"school"}
  end
  specify "should show the IEP field" do
    response.should_have_tag :td, :content=>"X",:attributes=>{:id=>"iep"}
  end
end
I’m not going to explain the setup function containing all that mock stuff you see at the start. Let me just say that the mocking facilities of RSpec are both powerful and convenient. Actually you shouldn’t have too much trouble understanding the setup if you try; but understanding it is not essential for this example. The interesting testing is in the specify blocks.
You shouldn’t have too much trouble reading the specify blocks. You can understand all of them if you understand the first. Here is what it does:
- The first spec ensures that <td id="time">12:00 AM 1/1</td> exists in the HTML document. This is not a string compare. Rather it is a semantic equivalence. Whitespace, other attributes, and other complications are ignored. This spec will pass as long as there is a td tag with the appropriate id and contents.
HTML Testing Discipline and Strategy
One of the reasons that GUI testing has been so problematic in the .jsp world is that the java scriptlets in those files often reach out into the overall application domain and touch code that ties them to the web container and the application server. For example, if you make a call from a .jsp page to a database gateway, or an entity bean, or some other structure that is tied to the database; then in order to test the .jsp you have to have the full enabling context running. Rails gets away with this because the enabling context is lightweight, portable, and disconnected from the web container, and the live database. Even so, rails applications are not always as decoupled as they should be.
In Rails, Java, or any other web context, the discipline should be to make sure that none of the scriptlets in the .jsp, .rhtml, etc. files know anything at all about the rest of the application. Rather, the controller code should load up data into simple objects and pass them to the scriptlets (typically in the attributes field of the HttpServletRequest object or its equivalent). The scriptlets can fiddle with the format of this data (e.g. date formats, money formats, etc.) but should not do any calculation, querying, or other business rule or database processing. Nor should the scriptlets navigate through the model objects or entities. Rather the controller should do all the navigating, gathering, and calculating and present the data to the scriptlets in a nice little package.
If you follow this simple design discipline, then your web pages can be generated completely outside of the web environment, and your tests can parse and inspect the html in a simple and friendly environment.
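As a rough illustration of this discipline, here is a sketch in Java, with plain string building standing in for a .jsp or .rhtml template. MessageRow and MessageListView are hypothetical names, not classes from any project mentioned here. The only point is that the view receives flat, ready-made values and does nothing but format them.

```java
import java.util.List;

// Hypothetical view data: plain values the controller has already gathered.
final class MessageRow {
    final String time;   // already formatted for display
    final String caller; // already assembled from first and last name
    final String school;
    final boolean iep;

    MessageRow(String time, String caller, String school, boolean iep) {
        this.time = time;
        this.caller = caller;
        this.school = school;
        this.iep = iep;
    }
}

// Hypothetical stand-in for a template: formatting only, no queries, no navigation.
final class MessageListView {
    static String render(List<MessageRow> rows) {
        StringBuilder html = new StringBuilder("<table id=\"list\">");
        for (MessageRow row : rows) {
            html.append("<tr>")
                .append(cell("time", row.time))
                .append(cell("caller", row.caller))
                .append(cell("school", row.school))
                .append(cell("iep", row.iep ? "X" : ""))
                .append("</tr>");
        }
        return html.append("</table>").toString();
    }

    private static String cell(String cssClass, String text) {
        return "<td class=\"" + cssClass + "\">" + escape(text) + "</td>";
    }

    // Minimal HTML escaping, enough for this sketch.
    private static String escape(String text) {
        return text.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }
}
```

Because render takes nothing but a list of simple objects, a unit test can build one MessageRow by hand, call render, and inspect the resulting HTML with no web container or database running.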
Conclusion
I’ll have more to say about RSpec in a future blog. BDD is an exciting twist on the syntax of testing, that has an effect far greater than the simple syntax shift would imply.
I hope this article has convinced you that the rails community has solved the problem of testing HTML generation. This solution can be easily extrapolated back to Java and .NET as future blogs in this series will show.
Clearly the problem of testing Javascript, and the ever more complex issues of Web2.0 and GTK are not addressed by this scheme. However, there are solutions for Javascript that we will investigate in future blogs in this series.
Finally, this technique does not test the integration and workflow of a whole application. Again, those are topics for later blogs.
I hope this kickoff blog has been informative. If you have a comment, question, or even a rant, please don’t hesitate to add a comment to this blog.
Money Format WTF
The reason the DailyWTF is so funny is that we all secretly identify with it. Here’s my latest WTF.
It happened last night at about 7pm. My wife wanted me to run to the store with her. I wanted to get my tests to pass so I could check in my code. I knew that if I left the code checked out until morning, one of my compatriots would wake up at 3am and change something, and I’d have to do a merge. I hate doing merges!
I needed to write the toString() method for my Money object. I had my tests already. Here they are:
public void testToString() throws Exception {
  assertEquals("$3.50", new Money(350).toString());
  assertEquals("$75.02", new Money(7502).toString());
  assertEquals("$0.01", new Money(1).toString());
  assertEquals("$0.00", new Money(0).toString());
}
I quickly wrote the function I knew would work:
public String toString() {
  return String.format("$%d.%02d", pennies/100, pennies%100);
}
What can I say. I’m an old C programmer. When the format method showed up in Java 5 I jumped for joy.
Even as I typed this code, something was nagging at the back of my brain. Something was telling me there was a better way. But then, I was interrupted by a huge disappointment.
It didn’t compile. Damn! I forgot I was writing in a Java 1.4 environment. No String.format!
What to do? What to do?
(Wife: “Bob, are you ready to leave yet? It’s getting late! The store is going to close!”)
uncleBob.changeMode(CODE_MONKEY); I wrote this:
public String toString() {
  int cents = pennies % 100;
  int dollars = pennies / 100;
  return "$" + dollars + "." + ((cents < 10) ? "0" : "") + cents;
}
The tests passed, and I checked in my code and went to the store with my lovely wife.
——-
This morning I woke up, finished reading a book on Quantum Mechanics, read a few blogs, and in general pursued my joyous life of study and work. But something was nagging at the back of my brain. Something told me to look in the Java Docs for NumberFormat.
(sigh). On one screen was the code my monkey brain had written last night. On the other screen was the JavaDoc for NumberFormat. (sigh).
So I sheepishly changed my code to:
public String toString() {
  NumberFormat nf = NumberFormat.getCurrencyInstance();
  return nf.format(pennies/100.0);
}
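One caveat worth noting about the NumberFormat version: getCurrencyInstance() with no argument uses the JVM's default locale, so tests that expect "$3.50" pass only where that default formats currency the US way. The sketch below assumes the amounts are meant to be US dollars and pins the locale explicitly. MoneySketch is a hypothetical stand-in for the Money class, with only toString filled in.

```java
import java.text.NumberFormat;
import java.util.Locale;

// Hypothetical stand-in for the Money class in the post; only toString is sketched.
class MoneySketch {
    private final int pennies;

    MoneySketch(int pennies) {
        this.pennies = pennies;
    }

    public String toString() {
        // Assumption: amounts are US dollars, so the locale is pinned here
        // instead of being taken from the JVM default.
        NumberFormat nf = NumberFormat.getCurrencyInstance(Locale.US);
        return nf.format(pennies / 100.0);
    }
}
```

With the locale fixed, the four assertions from testToString hold regardless of where the build machine happens to be configured.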
Of course I know Beck’s rule: Never let the sun set on bad code.
I really need to find that code monkey and kill it.
Web Death by Strings
Communication between web clients and servers is dominated by strings. This leads to complex and horrific problems of coupling, and fragility. Where are the rules?
I am in the enviable position of working on two web systems at the same time. One is a ruby-on-rails system for tracking substitute teachers. The other is a JEE system for managing the contents of a library. The point-counter-point of this happy coincidence has illuminated something that has tickled my subconscious for years. The world of Web programming is a world of pathological string manipulation.
Take, for instance, the library system I am working on. One of the pages in this system manages the books in the library by their ISBN, and by their copy ids. Let’s say we had 3 copies of ISBN 0131857258. The page would have a table row for the ISBN that contained a check box for each of the three copies. If the user checks the checkbox, the copy will be deleted from the library. Another checkbox in that row is named “Delete all”. When the user clicks that check box, all the other check boxes in that row are automatically checked, and all copies of that book are eliminated.
Now, think about this from an HTML point of view. How does the server know which copies should be deleted? That’s easy, the server builds the HTML for the page, so it simply gives a special name to each checkbox. When the form is submitted the names of the checked checkboxes are sent back to the server. So all the server has to do is to give each checkbox a name that identifies the copy it represents. We chose a syntax similar to: “delete_432”, which would be the name of the checkbox that represents the deletion of the copy whose id is 432.
Notice the string manipulation? We have encoded server side information in a string that is sent to the client, and we expect that information to come back to the server unchanged. While this makes perfect sense, any good software designer should feel a bit queasy about it. Depending on strings to encode information like this feels just a little bit reckless. It’s manageable, but it’s icky.
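To show what that encoding amounts to on the server, here is a minimal sketch in Java. DeleteCheckboxNames, nameFor, and copyIdsIn are hypothetical names, not the library project's actual code; the only detail taken from the text is the “delete_432” naming rule.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: the single place on the server that knows the
// "delete_<copyId>" naming rule.
class DeleteCheckboxNames {
    private static final String PREFIX = "delete_";

    // Name given to a checkbox when the server builds the page.
    static String nameFor(int copyId) {
        return PREFIX + copyId;
    }

    // Copy ids recovered from the parameter names of a submitted form.
    // Names that do not fit the rule exactly are skipped, not guessed at.
    static List<Integer> copyIdsIn(Iterable<String> parameterNames) {
        List<Integer> ids = new ArrayList<Integer>();
        for (String name : parameterNames) {
            if (!name.startsWith(PREFIX)) {
                continue;
            }
            String digits = name.substring(PREFIX.length());
            if (digits.matches("[0-9]{1,9}")) { // at most 9 digits, so it fits in an int
                ids.add(Integer.valueOf(digits));
            }
        }
        return ids;
    }
}
```

Keeping both directions of the encoding in one class at least confines the string convention to one spot, though the round trip through the client is still a convention and not a guarantee.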
Today that ickiness got a lot worse for me. Dean Wampler is working with me on the library project. He was working on the JavaScript to make the “delete all” checkbox work. Now copy ids are globally unique. No two copies, regardless of ISBN, share the same copy id. So when the ‘delete_nnn’ comes back to the server, the server does not need to know which ISBN the book belongs to. It just happily deletes copy ‘nnn’. However, Dean needed to get his client side JavaScript to set only those checkboxes that correspond to the ISBN of the ‘delete all’ button. The client does not know which copies correspond to which ISBNs. To solve this problem he changed the format of the checkbox name to ‘delete_ssss_nnnn’ where ssss is the ISBN, and nnnn is the copy id. This allowed him to write the JavaScript to look for all the delete buttons that corresponded to the appropriate ISBN.
Of course when he made that change, he broke my server code which was looking for ‘delete_nnnn’. Fortunately I had unit tests that detected the problem instantly. (I truly pity those poor programmers whose only means to stumble across errors like this is to deploy the system to test and work through the pages manually!) This would have been easy for me to repair on the server side; and I was tempted to do so, simply in the name of efficiency; but my conscience wouldn’t let me.
Why should a client-side JavaScript issue have any impact on the server code? Answer: It shouldn’t! This is software design 101. Don’t couple different domains!
So I talked it over with Dean and we quickly realized that he could change the JavaScript to use the ‘id’ attribute of the checkbox tag. The server would construct the page with the id’s set correctly, and the checkboxes would retain their normal name of ‘delete_nnn’.
There is a general rule here somewhere. It’s something like: use names to communicate with the server, and use ‘id’ attributes to communicate with the client. Or, rather, don’t break server code to make client side javascript work.
I’ve had similar string issues with the ‘Substitute’ system I’ve been working on in Rails. In this case I am using Ajax to allow users to type the names of substitute teachers and quickly pop up a list of possible teachers. So if you type “B” into the “Substitute” field, you quickly see a menu of all substitutes whose name begins with “B”. As you type more letters the list gets smaller. You can pick a name from the list when it’s convenient for you.
This works great, but has one gaping flaw. The server is looking these names up using SQL statements and is then populating the list in a convenient format. So, for example, it will put “Bob Martin” into the popup list, constructing the name from the first_name and last_name fields of the Substitute record. It is this constructed name that comes back to the server in the form when the submit button is pressed. But the constructed name is not the key of the Substitute record! So how does the server know which substitute has been selected? It could break apart the string “Bob Martin” into “Bob” and “Martin” and then do a query against first_name and last_name, but I hope you share my disgust with that solution! Not only is it inefficient, there are just loads of opportunities for error and fragility. (Just think of honorifics, suffixes, prefixes, middle names, etc.)
My solution, which I dislike almost as much, is to encode the id of the substitute along with the name. So the string that actually pops up in the menu is “(384) Bob Martin”. OK, OK, I know this is bad, and I intend to fix it once I learn how to get the JavaScript that pops up the menu to load a hidden field. But I don’t know how to do that yet, and I am aghast that I need to learn it! It seems to me that being able to couple a pretty name to an unambiguous ID is such a common thing to do that I should not have to resort to the deep mysticism of javascript to achieve it.
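For the curious, the id-in-the-string workaround boils down to something like the following. This is a sketch in Java, not the Rails code from the Substitute system, and AdornedName, adorn, and idOf are hypothetical names chosen for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for the "(384) Bob Martin" convention.
class AdornedName {
    // "(digits) name": the id in parentheses, whitespace, then the display name.
    private static final Pattern ADORNED =
        Pattern.compile("^\\((\\d{1,9})\\)\\s+(.+)$");

    // Builds the string shown in the popup menu.
    static String adorn(int id, String displayName) {
        return "(" + id + ") " + displayName;
    }

    // Recovers the record id from the submitted text, or -1 if no id is present.
    static int idOf(String submitted) {
        Matcher m = ADORNED.matcher(submitted.trim());
        return m.matches() ? Integer.parseInt(m.group(1)) : -1;
    }
}
```

The lookup uses only the id, so honorifics, suffixes, and middle names in the display part no longer matter. The user still sees the id in the menu, though, which is exactly the wart a hidden field would remove.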
Ah well, the web is hell. That’s all I can really say about this. Web programming is probably the worst programming environment I have ever worked in; and I’ve worked in a lot of programming environments. Not only is it flogged by commercial hype that tries to make it seem much more complicated than it is; but it’s so poorly conceived, and so sloppily put together that it is, frankly, embarrassing.
The Need For Speed
Yesterday was the day we felt the acceleration.
As you may know, we’ve been working on a largish web application to use as a running example in our new Principles, Patterns, and Practices course. The first couple of days of development were typically slow; and we all felt the pressure of getting “too little” done. But we stuck to our disciplines and wrote tests, and kept the code as clean as possible.
Working this way is not easy. We have a deadline that is very real; and there is a lot of money tied to it. So we all feel “the need for speed”. But all of us also know that Brian Marick is right when he says: “When it comes to software, it doesn’t pay to rush.” So we’ve nervously tolerated the pace and continued to practice TDD and continuous refactoring.
On the third day it started to pay off. The structure of our code is clean and simple enough that new features are starting to be able to take advantage of older features. We have enough tests (~90% coverage) to ensure that minor refactorings to facilitate reuse aren’t risky. And so we were able to twist the code a little here, and tweak it a little there, and within a single day triple the functionality of the code.
Let me make this more concrete. We have a set of stories. We’ve estimated those stories with point values. The sum of our velocity on Monday was close to zero. On Tuesday we got 11 points done. On Wednesday we got 26 points done. This acceleration is due to the fact that we were able to take advantage of the similarities in the features and craft the new features into place by making fine adjustments to the existing structure.
I don’t need to tell you that if we’d been practicing cut-and-paste programming we would not have had that option. Instead of a carefully crafted structure that allows new features to be easily melded, we’d have feature silos with much duplicated code and messy tangles.
I got tired and debugged!
My granddaughter slept over last night. She’s an early riser. She and I had breakfast at 6am. My wife got a new cell phone (TREO680) yesterday, and I helped her set it up. So I didn’t go to bed until late. In short, I didn’t get a lot of sleep.
A bunch of us are working on a new training example for object oriented design principles and patterns. It’s a large-ish web-based system that has lots of interesting lessons to learn. We’ll be presenting it as a series of exercises throughout the course. Anyway, we’ve all been working as a team, writing this software. It’s been a lot of fun!
Anyway, about 3pm I started to get really tired. I should have stopped. But I wanted to finish one last story! (You know the feeling.) So I pressed on.
I ran across a small dependency problem between two of the classes. So I started an elaborate refactoring to resolve the issue. An hour later I had to back out the whole refactoring because it didn’t solve the problem, or even come close! I don’t know what I was thinking. The real solution was much simpler and took just 10 minutes or so to implement. I should have stopped then, but, well, you know…
I needed to leave at 5pm. By 4:30 I was executing my unit tests. There was a problem. It didn’t make sense. You know the kind! I stared at the code for a long time, but my brain was mush. I couldn’t think. The lines of code swam before my eyes, but did not speak to me. I should have stopped then.
But no. Instead, I did something I barely ever do. Something I haven’t done in many months. I set a breakpoint! Egad! I was single-stepping through the code. And then I would do that horrible little dance where you step to a point in a module and realize you’ve gone a step or two too far. The variables don’t make sense. So you start over, and step back to just before where you were. Hideous! I should have stopped! But I kept at it. Over and over, breaking, stepping, breaking, stepping.
Debuggers feed you a torrent of information. Even when you are awake it’s easy to misread them. When tired, you see what you want to see, not what is really there. And nothing I saw made any sense. I finally got to the point where an if statement comparing two identical strings was failing! (Or so I thought.) I insisted that the stupid machine was lying. I rebuilt the application. I rebooted my machine. I redid the breakpoint, over and over. No change.
Now it was after 5pm. I really needed to go. I also really needed to fix this damned problem and check in the code. So I did the only thing that might make a bit of sense (other than stopping, which is what I should have done 2 hours before), I turned off the debugger and asked my pair-partner James for some help.
James had been busy helping one of the other guys get subversion working with Eclipse, so he’d been somewhat distracted while I was spinning myself into a debug rathole. He came right over and looked at my unit test. He said “Oh, shouldn’t that fravitz be a dorvitz?”
...
Duh. Yes, that fravitz should have been a dorvitz. It was obvious. It was simple. It was a 2 second change, and the tests all passed. (sigh).
So I, once again, reinforce my rule about debuggers. They are a horrible time-sink. When you find you must debug, it’s time to get help or go home.
