So just what does synchronized do? 56

Posted by Brett Schuchert Fri, 04 Jan 2008 03:27:00 GMT

Synopsis

Using synchronized turns a huge state space into a comparatively small one.

Normally, the light from a star radiates out in all directions. But what happens when a star collapses? There are several possibilities depending on the mass of the star. Our sun will turn into a red giant and then later turn into a white dwarf, giving out light from its accumulated heat for many years after living on Earth has become unbearable; mostly because of all the traffic.

If the mass of the star is around 10x our star, its destiny is just a bit different. It will begin to collapse. Along the way, it will probably have some temporary reprieves from its ultimate fate, to become a Neutron star, by consuming other heavier elements such as carbon and helium . However, the writing is on the wall. The center of star, already an extreme environment, becomes even more so; eventually, the pressure from the collapsing of material into the center of the star results in a massive explosion known as a nova or super nova – depending on the mass of the star. What is left is a neutron star. A neutron star is the heart of a pulsar, able to spin at amazing rates without pulling itself apart.

Another option is a black hole. A black hole is an often used metaphor for either the effort expended on support/fixing/updating a big ball of mud or what the development effort appears to be in most of the time. Either way, it is not considered a good thing if you’re anywhere in the vicinity.

I’m a pessimist by nature. I see a project in decline, sucking in other resources, like a black holes eating neighboring galaxies, snuffing the light out of stars, just like projects can suck the life force out of individuals.

However, energy can escape from a black hole, eventually leading to the demise of the black hole. Energy can escape along the axis of rotation in either direction. In a sense, if we see the expansion of the black hole as a bad thing, anything that diminishes a black hole is a good thing. Stephen Hawking theorized that at the event horizon of black holes, fluctuations in the space can cause the spontaneous creations of pairs of particles and their anti particles. This is another way it can appear that a black hole is losing energy since it gives off particles while the opposite particle goes into the black hole.

Now I’m off track. I wanted to talk about how synchronized give a similar effect as the massive gravity of black hole. But instead of curving space-time, they invert the possible state space of code in the presence of threads. Let’s take a look.

Here’s a simple class:
public class X {
   int value;

   public synchronized void incrementValue() {
        ++value;
    }
}
Let’s say, for the sake of argument, two threads are trying to execute the single method in this class. If the method is written like this:
   public synchronized void incrementValue() {
        ++value;
    }

Then there are only two orderings, Thread 1/Thread 2, or Thread 2/Thread 1. All of those paragraphs at the top were to make this analogy.

OK, so if we use synchronized that that’s the black hole. The number of possible paths through the code for N threads is N!. What’s the opposite? What happens if we take off the keyword and then run two threads through it? In how many ways can that single line of code get executed?

Are you ready for it?

3,432 different ways

Really? Well at least conceptually. In practice, the Just In Time compiler will probably turn the JVM instructions into native code, so the actual number of individual steps will probably be less. But there really are that many paths through the system. You might ask how?

That single line of code is represented in byte code as 7 lines. I could show you the disassembly, but really, it’s 7. You can convince yourself by having a look at the byte-code using a nice byte-code viewer plug-in for eclipse.

What happens if we change the int to a long? We still have 7 lines of byte-code, but with different instructions. We have a total of 4 long reads/writes. Each long read/write requires two individual 32-bit operations – or at lest the JVM can implement it that way. The actual number of atomic operations after the Just In Time compiler has had its way with your byte-code will probably be less, but if we stick to byte-codes and the Java memory model, the number 3,432 is now a whopping 705,432!

Let’s extend that just a bit more. What if you have several lines of code? Each line of Java results in many more lines of byte-code. What if we have have 3 lies of Java? We probably have something like 21 lines of byte-code. How many paths of execution would 21 lines of byte-code executed by two threads have? 538,257,874,440!

Where the first program ended up being a white dwarf, the version using longs was a nova. I’m not sure what to call the thee line Java method, maybe a hyper-nova?

In practice, the actual number of paths through a method will end up being smaller. And, most of the paths do not have any negative side-effects. The problem, however, is that there are a small number of hard to find paths of execution that do cause problems and finding them is like looking for a needle in a haystack.

Remember what adding the keyword synchronized did? If we use it on a three line method, it turns 538,257,874,440 into 2. It collapses the number of paths into 2 for two threads. And 2 is less than 538,257,874,440, for even vary large values of 2.

How did I derive at these magic numbers? Two ways.

I wanted to know how many possible orderings there were for 7 instructions and two threads. I knew it had to do with combinations and permutations but I just wasn’t smart enough to figure it out. Searching on Google didn’t do much for me either. So I decided to write a program to numerically derive the result. I tested my way into a program that would calculate the result.

The basic algorithm is:
  1. Determine all possible orderings of each of the unique “nodes” in a system. For example, if I have 2 threads and 2 instructions, I can generate a natural key (ugh!) to represent each of the different combinations of threads and instructions: T1S1 T1S2 T2S1 2S2.
  2. Produce all valid permutations of these four “nodes”.
  3. Remove any that have invalid orderings.
  4. Count the results.

Many of my first implementations for some of these steps were exercises in writing really inefficient code! But having those tests made it really easy to swap out better algorithms when my brain caught up with what I was doing.

What is an invalid ordering? The code generated by the single line using the pre increment operator has 7 steps. Those 7 steps are in a single sequence with no looping and no conditionals. That’s like saying A is always before B. So if we generate a list of orderings that end up including that combination, we’ve generated an ordering that cannot happen.

So you can describe your possible state space in a few steps:
  1. Register a node for each thread, step combination. 7 threads, 4 steps = 28 nodes.
  2. Register dependences between steps. For example, if I have seven steps and two threads executing the same sequence, I could describe this as: T1S1 -> T1S2 -> T1S3, etc. And for the second thread: T2S1 -> T2S2 -> T2>S3. Since the name is arbitrary, I just used letters, so in fact I had: a -> b -> c -> d -> e -> f -> g for one set of dependencies and h -> i -> j -> k -> l -> m -> o

In fact, here’s the test I used to describe that exact space:

    @Test
    public void twoThreadsAndSevenSteps() {
        calculator = new PathCalculator(14);
        createLine('a', 'b', 'c', 'd', 'e', 'f', 'g');
        createLine('h', 'i', 'j', 'k', 'l', 'm', 'n');
        start();
        int totalPaths = calculator.totalPaths();
        stop(2, 4, totalPaths);
    }

By the time I got to this point, I wrote some “tests” using JUnit as a harness to run my code and display timing information and a summary of the

Time:        4ms -- Threads:  3, Opcodes:  2, Total paths:      90
Time:        1ms -- Threads:  2, Opcodes:  3, Total paths:      20
Time:       54ms -- Threads:  4, Opcodes:  2, Total paths:    2520
Time:        3ms -- Threads:  2, Opcodes:  4, Total paths:      70
Time:       61ms -- Threads:  2, Opcodes:  5, Total paths:     252
Time:     1248ms -- Threads:  2, Opcodes:  6, Total paths:     924
Time:    12819ms -- Threads: 10, Opcodes:  1, Total paths: 3628800
I then used this information to fit a cure to my data and derive a formula for various combinations. For 2 threads, I came up with the following formula. If you let N be the number of steps:
Total Paths = 252 * 3.6667 N-5

The morning after, Uncle Bob sent an email, showing me his work on how he calculated.

If you let:
  1. T be the number of threads
  2. N be the number of steps
Total = (N*T)!   /   N! T

When I plugged numbers into his formula, they fit my formula. His formula generated an exact result as opposed to my estimate based on fitting a curve. Another thing his solution has going for it is the algorithm I came up with, I believe, is NP-complete. The net effect was I that while I was able to calculate result for 2 threads and 6 steps, 2 threads and 7 steps took a long time before running out of memory with a 2GB heap space. Bob’s formula gave me a so-called verifier (as did my formula – only Bob’s was better) and the algorithm seems to grow based on factorials and not polynomials.

What’s the point of all of this? The single, one-line program, is a demonstration of why going from one thread to two threads makes writing/debugging/testing programs so much harder. It also numerically demonstrates why the single responsibility principle is even more important when we start writing concurrent programs. The more you put in one place, the large the state space. Larger state spaces:

  1. Make it harder to find that one or few number of paths that can cause your program to die, under load, in production, while you are on vacation
  2. Make your efforts to test (which is a sampling technique anyway) for concurrency issues all the more difficult
  3. Make it harder to find what might actually have caused a failure once you’ve notice it.

In a later post, I’ll describe how we can improve our chances of covering this huge space of potential orderings.

If you’ve made it this far, wow! Congratulations!

"Big Balls of Mud" and Shanty Towns 8

Posted by Dean Wampler Sat, 27 Oct 2007 14:21:23 GMT

Last Thursday, the last day of OOPSLA 2007, Brian Foote gave a restrospective of Big Ball of Mud, which he and Joseph Yoder presented at the Fourth Conference on Patterns Languages of Programs (PLoP ‘97/EuroPLoP ‘97) and which was published as a paper in 1999.

Foote and Yoder argue that the dominant architecture of deployed application is a Big Ball of Mud, which they define thusly:

A Big Ball of Mud is a haphazardly structured, sprawling, sloppy, duct-tape-and-baling-wire, spaghetti-code jungle. These systems show unmistakable signs of unregulated growth, and repeated, expedient repair. Information is shared promiscuously among distant elements of the system, often to the point where nearly all the important information becomes global or duplicated. The overall structure of the system may never have been well defined. If it was, it may have eroded beyond recognition. Programmers with a shred of architectural sensibility shun these quagmires. Only those who are unconcerned about architecture, and, perhaps, are comfortable with the inertia of the day-to-day chore of patching the holes in these failing dikes, are content to work on such systems.

I had to leave mid-way through the talk to catch a plane, but before I left, he said something that caught my attention. He compared such applications to shanty towns, those ad hoc communities that spring up with no planning, no infrastructure, and reflect the bare minimum of resources and expertise available to their builders and inhabitants.

However, as I looked at his picture of a typical shanty town, I noticed that there are paths through the maze of ad hoc homes. There is some structure there. Then it occurred to me that, for all their problems, there is one interesting difference between typical shanty towns and many software applications; shanty towns are subject to frequent “testing” and “refactoring”. Within the extreme limitations of their architectures and the available resources, the inhabitants do what they can to fix “bugs” and adapt to new “requirements”.

Of course, I’m not saying that shanty towns are good. I’m just pointing out that they have a feedback loop that leads to modest improvements. In contrast, although we application developers have more resources at our disposal, we don’t subject our applications to the same scrutiny.

Why is this? A big part of the problem is that we forget just how complex software really is. How many points of variation exist within a home and its community? How many points of variation exist within a software application? Perhaps more than one per line of (uncommented) code?

We interact with the points of variation in our homes on a regular basis, directly or indirectly, and we make adjustments as needed (unless we’re lazy ;). In contrast, most of the corresponding points in software applications are deeply hidden and not evident when we use the applications.

You know where this is going; automated testing is the only way to subject the points of variation in applications to the same level of scrutiny and to find what needs fixing.

Not A Task, But An Approach 46

Posted by tottinger Fri, 03 Aug 2007 03:14:00 GMT

Transitions are tough. It seems that lately I’ve been getting a lot of contact from frustrated people who don’t really have a good handle on the “drive” part of Test Driven Development. A question heard frequently is: “I’ve almost completed the coding, can you help me write the TDD?”

It seems like Test Driven Development is taken backward, that the developers are driven to write tests. The practitioner winces, realizing that he again faces The Great Misunderstanding of TDD.

TDD stands for Test-Driven Development, which is not as clear as TFD (Test-First Development). If the consultant would strive to always say the word “first” in association with testing, most people would more surely grasp the idea. In fact, I’ve begun an experiment in which I will not say the word “test” without the word “first” in close approximation. I’ll let you know how that works out for me.

If the tests are providing nothing more than reassurance on the tail end of a coding phase, then the tests aren’t driving the development. They are like riders instead of drivers. Test-Ridden Development (TRD)[1] would be a better term for such a plan. Even though it is better to have those tail-end tests than to have no automated testing, it misses the point and could not be reasonably be called TDD.

An old mantra for TDD and BDD is “it’s not about testing”. The term BDD was invented largely to get the idea of “testing” out of the way. People tend to associate “test” as a release-preparation activity rather than an active approach to programming. BDD alleviates some of that cognitive dissonance.

In TDD, tests come first. Each unit test is written as it is needed by the programmer. Tests are just-in-time and are active in shaping the code. Acceptance Tests likewise tend to precede programming by some short span of time. [2]

Through months of repetition I have developed the mantra:
TDD isn’t a task. It is not something you do. It is an approach. It is how you write your programs.

I wonder if we shouldn’t resurrect the term Test-First Programming or Test-First Development for simple evocative power. Admittedly there are some who would see that as a phase ordering, but maybe enough people would get the right idea.

Brett Schuchert(with some trivial aid from your humble blogger) has worked up an acronym to help socialize the basic concepts which are somehow being lost in translation to the corporate workplace.

The teaser: Fast, Isolated, Repeatable, Self-validating, and Timely.

As a reader of this blog, you are probably very familiar with all of the terminology and concepts behind TDD. I beg of you, socialize the idea that testing comes first and drives the shape of the code. If we can just get this one simple idea spread into programming dens across our small spheres of influence, then we will have won a very great victory over Test-Ridden Development.

“And there was much rejoicing.”

1 Jeff Langr will refer to this TRD concept as “Test-After-Development”, which he follows with a chuckle and a twinkle, “which is a TAD too late.”

2 Of course, one still needs QC testing as well, however TDD is about driving development, not testing its quality post-facto.

Older posts: 1 2