Tuesday, February 27, 2007
For the love of beer
Funny ad; see what it takes to make a beer ;). I'd never seen trebuchets being used to throw beer ingredients (esp. the girls, animals...). Cheers!
Executable Models
Imagine...
A workplace where your regular tasks involve analyzing and modeling the parts of a system based on stories (or requirements, whatever) and writing a few snippets of instructions. You have a world-class IDE with wonderful perspectives, editors and views, and a state-of-the-art execution infrastructure. The IDE allows quick execution of your models to test them and visual model debugging, while encouraging ergonomics that prevent heavy "keying and mousing"; it supports fine-grained model refactoring for painless design changes, and provides real-time collaboration for distributed brainstorming and discussions...
You pair-model it, you test-drive it, you version it at a fine-grained unit of control. You have a catalogue of blueprints (design + domain) which can be used statically as well as in the context of an existing system, and you have continuous integration of models which compiles, validates (tests) and builds the system. At the end of the day, you have a system which can be simulated in the development environment as well as deployed as production software....
Imagine an execution infrastructure able to interpret UML directly (relax, take the unambiguous parts of UML, or Mellor's subsets), or maybe one that emits a script which can be plugged into an existing JVM, or maybe a UML VM itself, with attach-on-demand and hot model replacement :) ....
I dream of working in such an environment as much as of building one.
Realize....
These imaginings may sound like philosophy, but they are not without basis: existing tools and technologies can be leveraged and bridged to provide this envisioned platform. It's hard and takes a lot of resources, but it's not impossible. Ok, it will not give quick returns and will have adoption problems, but those can be handled.
At first, Executable UML will seem like a buzzword; to me it sounded daunting, and it seemed like everyone wanted to change the world. But that change is for the good: how long do you want to work with something as ugly as XML or JSP, and other complexities which in no way add to customer requirements? They deserve to be generated, or they don't deserve to be in our codebase (I should say modelbase :) ) at all.
Executability of models is not something entirely new (apart from virtual machines, OOP etc.); people like Stephen Mellor and Marc Balcer have been working on it for quite some time now.
If you feel that support for such tooling is lacking, let me tell you: Eclipse has made that feeling a thing of the past. With Eclipse, we can leverage most of the Eclipse-related technologies to achieve it: ECF for collaboration; UML/EMF/GEF/GMF for modeling; the Debug framework and GEF for simulation and model debugging (in this case, we may need to change the abstraction semantics of stack frames etc.). Of course, building a VM or runtime is non-trivial, but it's worth trying alternative approaches like integrating with an existing runtime.
This is the time to improve software crafting (or engineering?); no one needs costly software, and current fashions make it expensive. Whether the productivity of existing methods is adequate is a different debate, but it can surely be improved.
Wednesday, February 21, 2007
When will 'Software Modeling' catch big time attention?
For the last year, I've been working on a modeling tool featuring support for the UML2 and BPMN 1.0 specifications. This tool is not envisioned as just "yet another modeling tool" that lets you draw basic elements (with strange visual manipulations) and generate structural code and documentation (which is not impressively useful!).
It's commonly known that modeling is good for visualizing and documenting a system. I've used UML heavily for documentation, and I found it useful for communicating software blueprints to those who don't understand the technology well.
Contemporary tools can do more than visualize. Typically, a modeling tool doesn't attempt full-blown generation of an application for a target platform, say J2EE; not that it's an insane attempt, it's simply too much work. Instead, they generate structural (skeletal) code from the model (say, classes from class diagrams); this is commonly known as forward engineering. Tools also support round-trip engineering, regenerating the code from the model while preserving manual customizations.
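To make "structural (skeletal) code" concrete, here is roughly what forward engineering a single class-diagram element tends to produce: attributes and accessors are generated, while operation bodies are left for the developer. The class and member names are invented for illustration, not output of any particular tool.

```java
// Hypothetical forward-engineering output for a "Customer" class from a
// class diagram: structure only, behavior left as a TODO for the developer.
public class Customer {
    private String name;
    private String email;

    public String getName() { return name; }
    public void setName(String name) { this.name = name; }

    public String getEmail() { return email; }
    public void setEmail(String email) { this.email = email; }

    /** Operation modeled in the diagram; body not generated. */
    public void placeOrder() {
        // TODO: implement business logic (preserved across round trips)
        throw new UnsupportedOperationException();
    }
}
```

Round-trip engineering then has to regenerate the skeleton while keeping whatever the developer wrote inside `placeOrder()` intact, which is exactly where the model/code synchronization pain comes from.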
How redundant! Why model and code at the same time, and why maintain both?
The tool I'm working on is supposed to generate a "typical application" end-to-end (heh, really?!). The tool also has a special language (a sort of BASIC dialect) that can be used to write business logic for the application which would otherwise require tedious micro-modeling (boy, you'd need incredible mouse capabilities for that level of modeling). This language is a customized implementation of the OMG(tm) Action Semantics Language (ASL) and Object Constraint Language (OCL); logic written in it is translated into platform-specific code during transformation and code generation.
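As a sketch of what such a translation looks like: an OCL invariant such as `context Account inv: balance >= 0` can be mapped to a runtime guard in the generated code. The mapping below is illustrative only, not this tool's actual output; the class and method names are assumptions.

```java
// Sketch: one plausible Java translation of the OCL invariant
// "context Account inv: balance >= 0" (illustrative mapping only).
public class Account {
    private long balance;

    public void deposit(long amount) {
        balance += amount;
        checkInvariants(); // generated call after each state change
    }

    public void withdraw(long amount) {
        balance -= amount;
        checkInvariants();
    }

    public long getBalance() { return balance; }

    // Generated from the OCL invariant: fail fast when it is violated.
    private void checkInvariants() {
        if (balance < 0) {
            throw new IllegalStateException("OCL invariant violated: balance >= 0");
        }
    }
}
```

The point is that the constraint lives once, in the model, and the guard code is regenerated on every transformation instead of being hand-maintained.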
Technically, this is enough for generating an entire application. Recently, tool vendors have been trying to avoid the ASL approach altogether and use supplemental UML models (like activity diagrams and state machines) to specify the behavioral aspects of a system; this approach has usability problems, although it is an interesting technical challenge (umm... given enough beer I'll even admit that I once wrote a state machine compiler, for money, to generate behavioral code...).
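The behavioral code such a state machine compiler emits often boils down to a switch over the current state dispatching on events. A minimal sketch, with a two-state door model whose states and events are invented purely for illustration:

```java
// Minimal sketch of the switch-style code a state machine compiler might
// emit for a two-state model (states and events invented for illustration).
public class Door {
    enum State { CLOSED, OPEN }
    enum Event { OPEN_CMD, CLOSE_CMD }

    private State state = State.CLOSED; // initial state from the model

    public void dispatch(Event e) {
        switch (state) {
            case CLOSED:
                if (e == Event.OPEN_CMD) state = State.OPEN; // transition
                break; // other events ignored in CLOSED
            case OPEN:
                if (e == Event.CLOSE_CMD) state = State.CLOSED;
                break;
        }
    }

    public State getState() { return state; }
}
```

Easy to generate, but you can see why modeling every guard, action and transition with a mouse gets tedious fast, which is the usability problem mentioned above.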
How bad, again: why would I want to learn UML plus a dumb ASL language, and still struggle with generated code to customize it?
I agree that the tool can generate "student project" like applications, but I remain skeptical about *serious* applications, which involve the complexities of security, performance and integration. It's not impossible to do that either; however, supporting multiple security/persistence/application frameworks and third-party interfaces for multiple platforms, in their entirety, would be a non-trivially huge undertaking.
Lately, I'm observing generative methods in regular development; these days, most applications are some reasonable combination of generated and hand-written code, and people manage both code and model. Generative technologies are gaining acceptance gradually, especially due to the support from Eclipse projects like EMF, GEF, GMF and the other Modeling subprojects; beyond rising community interest, hundreds of modeling tools are available and major players are investing in these technologies.
We have witnessed transitions in technology before: from hardware-specific assembly code to C, from C to C++/Java/RoR... Those transitions were fine-grained and gradual, but modeling is a paradigm shift in the way we see software development (it's not just about learning a new language, right?); it imposes a steep learning curve and requires understanding mammoth specifications. It's always difficult to adopt a different way of doing things.
With the current state of tooling and specifications, it is difficult to deliver on the "generate everything, write nothing" slogan, and it's a pain in the a**e to keep model and code in sync (tools are improving, though). I'm sure of one thing: generative development is going to be the next generation of software development; I'm just not sure how many more years it will take...
Friday, February 16, 2007
Escape Analysis in Mustang - II
EDIT: As commented by Brian, these Escape Analysis optimizations were dropped before final release and deferred to Java 7. The debug version of VM can no longer be found at http://download.java.net/download/jdk6/6u1/promoted/b03/binaries/jdk-6u1-ea-bin-b03-windows-i586-debug-19_jan_2007.jar
In my previous post, I mentioned that escape analysis is available in Mustang releases, with debug flags for it in the HotSpot(tm) VM.
I ran some micro-benchmarks yesterday; they are fairly trivial but good enough to identify the performance gains. Here is what I found:
import java.io.ByteArrayOutputStream;
import java.io.OutputStream;
import java.io.PrintStream;

public class TestEA {

    public static class MyPrintStream extends PrintStream {
        public MyPrintStream(OutputStream out) {
            super(out);
        }

        @Override
        public void println(int cnt) {
            /* empty on purpose: deep inline candidate */
        }
    }

    private static final int COUNT = 10000000;

    public static void main(String[] args) throws Exception {
        System.setOut(new MyPrintStream(new ByteArrayOutputStream()));
        for (int i = 0; i <= 10; i++)
            test();
    }

    private static void test() {
        int cnt = 0;
        Object objLock = new Object(); // locally allocated, never escapes
        long start = System.currentTimeMillis();
        for (int i = 0; i < COUNT; i++) {
            synchronized (objLock) {
                cnt++;
            }
            consume(cnt);
        }
        long end = System.currentTimeMillis();
        System.err.println(cnt + ", time=" + (end - start) / 1000.0);
    }

    private static void consume(int cnt) {
        System.out.println(cnt);
    }
}
The results
VMArgs: -server
10000000, time=0.907
10000000, time=0.938
10000000, time=0.969
10000000, time=0.906
10000000, time=0.906
With
VMArgs: -server -XX:+DoEscapeAnalysis
10000000, time=0.89
10000000, time=0.922
10000000, time=0.016
10000000, time=0.016
10000000, time=0.015
10000000, time=0.016
JVMOut:
28 JavaObject NoEscape [[ 54F]] 28 Allocate ....
40 LocalVar NoEscape [[ 28P]] 40 Proj ....
90 LocalVar NoEscape [[ 28P]] 90 Phi ....
213 JavaObject NoEscape [[]] 213 Allocate ....
225 LocalVar NoEscape [[ 213P]] 225 Proj ....
======== Connection graph for TestEscapeAnalysis::test ....
172 JavaObject NoEscape [[]] 172 Allocate ....
184 LocalVar NoEscape [[ 172P]] 184 Proj ....
I ran the loop in main several times because the server VM does deep inlining, and it might optimize away the execution entirely after noticing that cnt is not significantly used anywhere.
It can be seen that escape analysis successfully eliminated the synchronization on objLock (lock elision). As I posted earlier, synchronization has a significant impact on execution speed, so eliminating this heavy operation improved the speed substantially. Consider the effect of this on a highly concurrent web server (JAWS) handling hundreds of simultaneous requests. Of course, this happened because objLock is allocated locally: the VM identified that it can't be shared between multiple threads, so it's safe to remove the sync overhead.
I also tried making objLock a static field of the class and found that escape analysis had no positive impact on execution speed, as can be seen here:
VMArgs: -server -XX:+DoEscapeAnalysis (objLock as static field in above program)
10000000, time=0.922
10000000, time=0.922
10000000, time=0.906
10000000, time=0.891
10000000, time=0.921
As expected, with objLock being a shared field, the VM has no way of proving that the lock on it can be eliminated.
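For completeness, the static-field variant differs only in where objLock lives. This is my reconstruction of the change, not the verbatim code from the original run; the class name is invented:

```java
// Reconstruction of the static-field variant: objLock is globally reachable,
// so it escapes and the JVM cannot elide the synchronization on it.
public class TestEAStatic {
    private static final Object objLock = new Object(); // escapes: shared field
    private static final int COUNT = 10000000;

    static long test() {
        int cnt = 0;
        long start = System.currentTimeMillis();
        for (int i = 0; i < COUNT; i++) {
            synchronized (objLock) { // cannot be proven thread-local
                cnt++;
            }
        }
        long end = System.currentTimeMillis();
        System.err.println(cnt + ", time=" + (end - start) / 1000.0);
        return cnt;
    }

    public static void main(String[] args) {
        for (int i = 0; i < 5; i++)
            test();
    }
}
```

Another thread could lock objLock at any time, so the connection graph marks it as escaping and the sync overhead stays.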
Unfortunately, stack allocation seems to be absent in the current builds of Mustang (or there were no hints in the debug output as to when it was done; the VM output is also hard to decipher, and I found no explanation for it). Here is an excellent presentation on the new optimizations in the HotSpot(tm) VM, and for lock-elimination enhancements refer to this.
So there you go, one more optimization for the managed runtime. For those still under the illusion that natively compiled, statically optimized programs are the fastest: think again. Your program is static at runtime and can't reorganize itself; a managed runtime is metamorphic, it can adapt and substantially optimize itself at runtime. Man, that's what I call programming.