Thursday, September 10, 2009

First Jato release

So I somewhat promised to say something more on my project for the Google Summer of Code.

I participated for the first time, and it was really a great experience. This is the first time I've been paid to do what I would probably have done in either case (because I immensely enjoy it), namely to work on an interesting free software project. So this time it was Jato, a new implementation of the Java Virtual Machine.

Jato is written in C and aims to be a JIT-only compiler/VM. So before the summer started, it wouldn't run a lot of Java programs. It couldn't even run the trivial "Hello World", but more about that later. It could do a few things like (java.lang.)String operations, but that was mostly because it was riding piggyback on another VM, Jam VM, which would load classes and interpret a lot of the library code (i.e. most of the standard Java API implemented by GNU Classpath).

My own project was to replace the parts that were borrowed from Jam VM with completely new, shiny code. Well, in fact, I had already worked on a standalone library called cafebabe, which would parse the Java .class file and present it to C programs using a very thin layer of structs -- the idea was to do as little as possible with the actual data and just make it easily accessible, e.g. so cafebabe doesn't do any verification of the bytecode or make sure that a class is its own superclass.

So we linked this library with Jato, and that worked pretty well. (I believe strongly in modularisation; with the class loader in a separate library, there's no way to do stupid things like making the class loader itself depend on the rest of the VM for functioning properly.) There were problems, of course, the biggest being that after developing in my own private branch for a couple of weeks, the rest of the project with its three other GSoC student participants was moving ahead too quickly for me to keep up; I had to "fix" just about every new mainline commit, since most of the VM data structures (classes, methods, fields) now had different names, different fields, etc.

At around this point, we started to notice that Jam VM wasn't being used for nearly as little as just class loading. In fact, static initializers (the "<clinit>" methods) and their calls were all being interpreted and executed by Jam VM, not the JIT compiler of Jato. And of course, once we started using Jato for those methods which had previously been executed by Jam VM, we found tons of bugs everywhere: The core VM (mostly my code replacing that of Jam VM), including improperly implemented Java semantics, missing bytecode instruction handlers, and the compiler (instruction selection, liveness analysis, register allocation). But it was fun, and extremely educative.

Did you know that a simple System.out.println() will make a whopping 650550 Java-method calls? That's using GNU Classpath for the Java API.

So we had to implement a lot of stuff just to make that work. In fact, we have so much infrastructure that we can even run a few other programs. And thus it was decided that it was time to make a release. The version 0.0.1 announcement:

http://thread.gmane.org/gmane.comp.java.vm.jato.devel/1725

The "next big" missing feature is garbage collection. I expect that'll be version 0.0.2. In other words, if you're interested in Java, garbage collection, or just compilers in general, you're welcome to join us...

Our team consisted of:
  • Pekka Enberg; Original creator, project manager and mentor.
  • Tomek Grabiec; His original project proposal included implementing exception handling, but he ended up somwhat like Ingo Molnar of the Linux kernel; Tomek also implemented a generic radix tree, implemented the basic threading support (java.lang.Thread), rewrote large parts of the register allocator, wrote the VM side of the JNI interface from scratch, and also did a lot of other core-VM work. Tons of other things too, including a lot of debugging.
  • Arthur Huillet; Floating-point support, register allocator fixes, other missing bytecodes, also a lot of debugging. Various other things.
  • Eduard-Gabriel Munteanu; Ported Jato mostly to x86-64. A tough task, since the i386 parts were moving at close to light speed.

No comments: