Applications Should Use Several Languages

Posted by Dean Wampler Wed, 04 Jul 2007 16:38:31 GMT

Yesterday, I blogged about TDD in C++ and ended with a suggestion for the dilemma of needing optimal performance some of the time and optimal productivity the rest of the time. I suggested that you should use more than one language for your applications.

If you are developing web applications, you are already doing this, of course. Your web tier probably uses several “languages”, e.g., HTML, JavaScript, JSP/ASP, CSS, Java, etc.

However, most people use only one language for the business/mid tier. I think you should consider using several: a high-productivity language environment for most of your work, with the occasional critical functionality implemented in C or C++ to optimize performance, but only after actually measuring where the bottlenecks are located.

This approach is much too rare, but it has historical precedents. One of the most successful and long-lived software projects of all time is Emacs. It consists of a core C-based runtime with most of the functionality implemented in Emacs Lisp “components”. The relative ease of extending Emacs using Lisp has resulted in a rich assortment of support tools for various operating systems, languages, build tools, etc. Even modern IDEs and other graphical editors have not completely displaced Emacs.

Java has embraced the mixed-language philosophy somewhat reluctantly. JNI is the official and most commonly used API for invoking “native” code, but it is somewhat hard to use and few people actually use it. In contrast, for example, the Ruby world has always embraced this approach. Ruby has an easy-to-use API for invoking native C code, and good alternatives exist for invoking code in other languages. As a result, many of the third-party Ruby libraries (or gems) contain both Ruby and native C code. The latter is built on the fly when you install the gem. Hence, there are many high-performance Ruby applications. This is not a contradiction in terms, because the performance-critical sections run natively, even though interpreted Ruby is relatively slow.
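To make the idea concrete, here is a minimal sketch of a high-level language calling straight into native C code. It uses Python's standard ctypes module rather than Ruby's extension API, so it illustrates the general approach and not how gems are built. It assumes a POSIX system where the C library can be loaded as shown, and the wrapper name native_strlen is made up for this illustration.

```python
import ctypes
import ctypes.util

# Assumption: a POSIX system. find_library may return None, in which case
# CDLL(None) exposes the symbols already linked into the running process,
# which includes the C library.
libc = ctypes.CDLL(ctypes.util.find_library("c"))

# Declare the C signature: size_t strlen(const char *s);
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t


def native_strlen(text: str) -> int:
    """Return the UTF-8 byte length of text, as computed by C's strlen."""
    return libc.strlen(text.encode("utf-8"))


if __name__ == "__main__":
    print(native_strlen("mixed-language"))
```

Declaring argtypes and restype up front lets ctypes check and convert arguments at the boundary instead of guessing.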

Of course, you have to be judicious in how you use mixed-language programming. Crossing the language boundary is often somewhat heavyweight, so you should avoid doing such invocations inside tight loops, for example.
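As a rough illustration of the tight-loop point, the sketch below computes the same CRC-32 checksum two ways with Python's zlib module, which is implemented in C: once by calling into the native code for every byte, and once with a single call over the whole buffer. The function names and the sample payload are my own choices for the example, and the timing difference it prints includes Python's own loop overhead as well as the cost of crossing the boundary.

```python
import timeit
import zlib


def crc_per_byte(data: bytes) -> int:
    """Cross into native code once per byte (the pattern to avoid)."""
    crc = 0
    for i in range(len(data)):
        # Each call re-enters zlib's C code with a one-byte slice,
        # passing the running checksum along.
        crc = zlib.crc32(data[i:i + 1], crc)
    return crc


def crc_one_call(data: bytes) -> int:
    """Cross into native code once; the loop runs entirely in C."""
    return zlib.crc32(data)


if __name__ == "__main__":
    payload = bytes(range(256)) * 400  # roughly 100 KB of sample data
    # Both approaches must agree on the result.
    assert crc_per_byte(payload) == crc_one_call(payload)
    slow = timeit.timeit(lambda: crc_per_byte(payload), number=5)
    fast = timeit.timeit(lambda: crc_one_call(payload), number=5)
    print(f"per-byte calls: {slow:.4f}s, single call: {fast:.4f}s")
```

The design lesson is to hand the native side a whole batch of work per call, so the loop itself lives on the fast side of the boundary.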

So, I think the solution to the dilemma of needing high performance sometimes and high productivity the rest of the time is to pick the right tools for each circumstance and make them interoperate. Even constrained embedded devices like cell phones would be easier to implement if most of the code were written in a language like Ruby, Python, Smalltalk, or Java and performance-critical components were written in C or C++.

If I were starting such a greenfield project, I would assume that time-to-money is the top priority and write most of my code in Ruby (my personal current favorite), using TDD of course. I would profile it constantly, as part of the nightly build. When bottlenecks emerge, I would first determine if a refactoring is sufficient to fix them and if not, I would rewrite the critical sections in C. If the project were for an embedded device, I would also watch the resource usage carefully.
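The "measure first" step might look something like the following sketch, which uses Python's standard cProfile and pstats modules instead of a Ruby profiler. The workload functions (slow_checksum, cheap_setup, workload) are invented purely to give the profiler something to find, and a real nightly job would run the application's own test suite or a representative scenario.

```python
import cProfile
import io
import pstats


def slow_checksum(values):
    """Deliberately naive hot spot: quadratic work in pure Python."""
    total = 0
    for v in values:
        for w in values:
            total += (v * w) % 7
    return total


def cheap_setup(n):
    """Inexpensive helper, included so the report has a contrast."""
    return list(range(n))


def workload():
    """Hypothetical scenario standing in for real application code."""
    return slow_checksum(cheap_setup(300))


def profile_report(func, top=5):
    """Run func under cProfile and return the top entries as text."""
    profiler = cProfile.Profile()
    profiler.runcall(func)
    out = io.StringIO()
    stats = pstats.Stats(profiler, stream=out)
    stats.sort_stats("cumulative").print_stats(top)
    return out.getvalue()


if __name__ == "__main__":
    print(profile_report(workload))
```

Only the functions that dominate a report like this are candidates for refactoring first and, failing that, for a rewrite in C.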

For my embedded device, I would test from the beginning whether or not the overhead of the interpreter/VM and the overall performance are acceptable. I would also be sure that I have adequate tool support for the inevitable remote debugging and diagnostics I’ll have to do. If I made the wrong tool choices after all, I would know early on, when it’s still relatively painless to retool.

If you’re an IT or web-site developer, you have fewer performance limitations and more options. You might decide to make the cross-language boundary a cross-process boundary, e.g., by communicating through some sort of lightweight web services. This is one way to leverage legacy C/C++ code while developing new functionality in a more productive language.
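Here is a minimal sketch of that cross-process arrangement, using only Python's standard library. The tiny HTTP service stands in for a process wrapping legacy C/C++ code; the handler, the /sum path, and the JSON payload shape are all hypothetical choices for this example, not an established protocol. In a real deployment the service would run as its own process, and it is started in a background thread here only to keep the example self-contained.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class SumHandler(BaseHTTPRequestHandler):
    """Stand-in for a service wrapping legacy native code (hypothetical)."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        numbers = json.loads(self.rfile.read(length))
        body = json.dumps({"sum": sum(numbers)}).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo output quiet


def start_service():
    """Start the service on a free local port in a background thread."""
    server = HTTPServer(("127.0.0.1", 0), SumHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def remote_sum(port, numbers):
    """Client side: one HTTP round trip per batch of work."""
    request = urllib.request.Request(
        f"http://127.0.0.1:{port}/sum",
        data=json.dumps(numbers).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    # Bypass any proxy configured in the environment for this local call.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    with opener.open(request, timeout=5) as response:
        return json.loads(response.read())["sum"]


if __name__ == "__main__":
    server = start_service()
    try:
        print(remote_sum(server.server_address[1], [1, 2, 3]))
    finally:
        server.shutdown()
        server.server_close()
```

As with in-process calls, each round trip is relatively expensive, so the client sends a whole list per request and avoids one request per number.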

Comments

  1. about 13 hours later:

    I completely agree that far more applications should take advantage of several languages. I think that programming languages are only just becoming mature enough for this to be possible, and I bet we’ll see more interoperating apps soon.

    I’m starting to work on a 3d graphics project which will probably use C++ for the high performance loops, but either Scheme or Erlang for the application logic. I’m still doing a lot of research into what I should use, but something like that just seems logical.

  2. Torbjörn Kalin about 15 hours later:

    After reading the first part I was hoping for an answer to the question whether it’s worth using TDD with C++. I assume your answer is yes, but the reasoning would have been nice to see after reading about all the drawbacks.

    Regarding your point of using several languages, I totally agree.

  3. Dean Wampler about 19 hours later:

    Torbjörn,

    Sorry I didn’t make it clear; yes, I believe you should still TDD everything, no matter what language it’s written in! Even though TDD in C++ is slower than in other languages, the benefits it provides for improving the design and building up a suite of tests far outweigh the drawbacks.

  4. Nilanjan Raychaudhuri about 23 hours later:

    I totally agree with you about using multiple programming languages for developing applications. One argument I often hear against it is that developers are not comfortable with a multilingual environment, which isn’t true at all, because all the good developers I know work with, and love working with, multiple programming languages. Anyway, I think the JVM is slowly becoming a platform where multilingual apps will be a reality in the future.

    Deciding which language to use should be driven not only by performance criteria but also by the requirement at hand. If Java has a good crypto library and Ruby helps you to be productive, JRuby should be your language.

  5. YAChris 1 day later:

    with the occasional critical functionality implemented in C or C++ to optimize performance, but only after actually measuring where the bottlenecks are located.

    I cannot count the number of times I’ve been told over the years, “NO, I don’t need to measure, I know where it’s slow! Go away!” The worst part of it being, they’re occasionally right…

    I’ve never done the JNI thing with Java, but I have implemented code in C or C++ that was called from Perl, Tcl and Ruby.

    Not surprisingly, Ruby’s binding setup is wonderful to use. And its TDD support is wonderful too. Guess what my language of choice now is :-)

  6. 1 day later:

    Hi, Dean. Today I actually think that one does not really have to choose any more between performance and productivity. A lot of people tend to be very productive with languages like Java and C#, and performance-wise, with today’s JITs, they are almost never any worse than an average C or C++ version of comparable functionality. I find it’s way faster to get nice, fast code by writing it in Java, then profiling and optimizing it (usually only a little), than by taking on all the burden of a language like C++. Usually any JIT is much smarter than me anyway, and nowadays often even smarter than optimizing ahead-of-time compilers, because it has runtime information.

  7. Adam Sroka 1 day later:

    The value of heterogeneous language development has been long established in many industries. Particularly notable are telecom and games. In both, the speed of development and flexibility of “scripting” languages make them invaluable for high level code, but the complexity and time critical nature of certain sections of code (rendering for games, and network for telecom) make highly optimized native code a necessity. The trick is to isolate the native code in modules that can be called by script. Most languages make this quite easy. Java and .Net are the exceptions.

    On the other hand, there is a lot of business software out there that doesn’t need this level of performance. In fact, most of the business software that I have encountered would benefit a lot more from the simplicity and obviousness of homogeneous code written in e.g. Java or C# than from the theoretical performance gain of optimized code. Homogeneous code has a lot of advantages: All of the developers on the team can read it; it is easier to test; it is easier to change; the boundaries of “modules” tend to be fluid whereas with heterogeneous code boundaries can be difficult to change once they are established.

    Plus, native code does not guarantee performance gains. The knowledge necessary to optimize the most critical areas and gain the most is specialized and reasonably rare. A lot of teams may as well use a divining rod. Do the wrong thing and your code could actually perform much worse. This is particularly a problem in managed environments like Java or .Net where the cost of entry into non-managed (native) code marginalizes the gains.

  8. lucy 11 days later:

    Uncle Bob: I am an editor of Programmer Magazine of China. I have read this article, and like it very much. I think it will be popular with Programmer’s readers. Could you allow me to use this article in our magazine and share your viewpoint with our readers?

  9. lucy 14 days later:

    Dean Wampler: I am so sorry that I made a mistake. I had assumed this blog was Uncle Bob’s, just because I came here from Uncle Bob’s link. Please allow me to use your article in Programmer Magazine. Thank you!

  10. Samuel A. Falvo II 17 days later:

    When it comes to embedded devices in particular, this approach has been taken since the 70s with the use of the Forth programming language. Forth is “vari-level,” like Lisp, so you can effectively engineer a language of any desired abstraction level for your application. Hence, high- and low-level Forth code use the same language engine, but the code looks very, very different.

    But, even Forth isn’t as fast as machine language in tight loops, which is why most Forth programming environments include an integrated assembler (it’s just another Forth-written domain-specific language!) for the target platform you’re coding for.
