Wednesday, March 10, 2010

Coroutines for Java: Status Update

I'm currently working on bringing Coroutines to the Hotspot Java VM (as part of the MLVM project). I'm happy to announce that I have pushed a version of my code to the mlvm mercurial repository. The patches are named "coro.patch".

If you want to test the new coroutine features you can either compile your own binaries (see the wiki for details) or use a recent openjdk build (from here) together with one of my precompiled binaries:


  • Linux x86 binaries:
    Place the desired binaries (debug or product) in the jre/lib/i386 folder

  • Windows x86 binaries:
    Place the desired binaries (debug or product) in the jre\bin folder

  • Java classes:
    These classes have to come before the boot classpath because they replace some classes in rt.jar. To do this, prepend them to the boot classpath:
    "-Xbootclasspath/p:target_folder/coroutine_classes.jar"


If everything is configured correctly you should be able to run the following example program (source):

This program produces the following output:
main start
Coroutine.run
main end


A coroutine is created simply by creating a new Coroutine instance. As with Thread, the code that will be executed within the coroutine can be defined either by overriding the run method or by passing a Runnable to the constructor. The Coroutine framework keeps all active coroutines in a doubly-linked ring, which defines an execution order for the coroutines and makes sure that no coroutine is "lost" (which is more important than you might think!).
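
To make the ring idea a bit more concrete, here is a minimal sketch of such a doubly-linked ring in plain Java. It is not taken from coro.patch: the names RingSketch, add, remove and advance are made up for illustration, and inserting new entries just before the current one (so that they run last) is an assumption as well.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical doubly-linked ring of coroutines; not the MLVM implementation. */
final class RingSketch<T> {
    /** One ring entry; prev/next always point to a node, possibly the entry itself. */
    static final class Node<T> {
        final T value;
        Node<T> prev, next;
        Node(T value) { this.value = value; this.prev = this; this.next = this; }
    }

    private Node<T> current; // the entry that is "running"; null while the ring is empty

    /** Links a new entry just before the current one, i.e. at the end of the round. */
    Node<T> add(T value) {
        Node<T> node = new Node<>(value);
        if (current == null) {
            current = node;
        } else {
            node.next = current;
            node.prev = current.prev;
            current.prev.next = node;
            current.prev = node;
        }
        return node;
    }

    /** Unlinks an entry in O(1); its neighbours stay connected, so no other entry is lost. */
    void remove(Node<T> node) {
        if (node.next == node) {
            // Single-element ring (or an already unlinked node): only clear if it is ours.
            if (current == node) current = null;
            return;
        }
        node.prev.next = node.next;
        node.next.prev = node.prev;
        if (current == node) current = node.next;
        node.prev = node;
        node.next = node;
    }

    /** Moves on to the successor, which is what a plain yield would switch to. */
    T advance() {
        if (current == null) throw new IllegalStateException("ring is empty");
        current = current.next;
        return current.value;
    }

    /** Returns the values in execution order, starting with the current entry. */
    List<T> order() {
        List<T> out = new ArrayList<>();
        if (current == null) return out;
        Node<T> n = current;
        do {
            out.add(n.value);
            n = n.next;
        } while (n != current);
        return out;
    }

    public static void main(String[] args) {
        RingSketch<String> ring = new RingSketch<>();
        ring.add("main");
        Node<String> a = ring.add("A");
        ring.add("B");
        System.out.println(ring.order());
        ring.advance();  // "A" is now current
        ring.remove(a);  // a finished coroutine drops out, its successor takes over
        System.out.println(ring.order());
    }
}
```

Because every entry always has a valid predecessor and successor, removing a finished coroutine never strands its neighbours, and a yield without an explicit target always has a well-defined next entry.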

These symmetric coroutines are great for things like periodic scheduling of agents. But there is another way of using coroutines: asymmetric coroutines.
These are good at inverting the way a method behaves: imagine a method that somehow generates a stream of values. Normally one would have to think about where to put these values and what to do with them. But using coroutines we can write the method as if we had an infinite buffer for our values.

An example I like to bring up for this is the SAX parser: using asymmetric coroutines a (callback-based) SAX parser can be turned into a generator that returns one value at a time (source):

The extended for loop here is short for:
while (!parser.isFinished()) {
    String element = coCall(parser);
    // ... loop body ...
}


The coroutine in this case extends CoGenerator. For other use cases there are two more classes: CoConsumer and CoExpression.
A CoConsumer receives a value whenever it is called but doesn't return anything, and a CoExpression both receives and returns a value.
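
The inversion from callbacks to a generator can be imitated on a stock JVM, which may help to see the control flow. The sketch below is not the coroutine API from the patch: GeneratorSketch and parse are hypothetical names, and a helper thread with a SynchronousQueue stands in for the coroutine. Each switch is therefore far more expensive than a real coroutine switch; only the shape of the code is the point here.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;
import java.util.concurrent.SynchronousQueue;
import java.util.function.Consumer;

/** Thread-backed stand-in for a generator coroutine (hypothetical; not the MLVM API). */
final class GeneratorSketch<T> implements Iterable<T> {
    private static final Object DONE = new Object(); // end-of-stream marker

    private final Consumer<Consumer<T>> producer;

    /** The producer is ordinary callback-style code that pushes non-null values to a handler. */
    GeneratorSketch(Consumer<Consumer<T>> producer) { this.producer = producer; }

    @Override
    public Iterator<T> iterator() {
        SynchronousQueue<Object> channel = new SynchronousQueue<>();
        Thread worker = new Thread(() -> {
            try {
                // Every callback blocks until the consumer asks for the next value.
                producer.accept(value -> put(channel, value));
            } finally {
                put(channel, DONE); // always signal the end, even if the producer failed
            }
        });
        worker.setDaemon(true); // an abandoned generator must not keep the JVM alive
        worker.start();

        return new Iterator<T>() {
            private Object next; // null means "nothing fetched yet"

            @Override
            public boolean hasNext() {
                if (next == null) next = take(channel);
                return next != DONE;
            }

            @Override
            @SuppressWarnings("unchecked")
            public T next() {
                if (!hasNext()) throw new NoSuchElementException();
                Object value = next;
                next = null;
                return (T) value;
            }
        };
    }

    private static void put(SynchronousQueue<Object> queue, Object value) {
        try {
            queue.put(value);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    private static Object take(SynchronousQueue<Object> queue) {
        try {
            return queue.take();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    /** Callback-style producer standing in for a SAX parser that reports element names. */
    static void parse(String document, Consumer<String> handler) {
        for (String token : document.split("\\s+")) {
            handler.accept(token);
        }
    }

    public static void main(String[] args) {
        GeneratorSketch<String> elements = new GeneratorSketch<>(h -> parse("html head body", h));
        for (String element : elements) {
            System.out.println(element);
        }
    }
}
```

The producer is written as if it could push values into an unbounded buffer, while the consumer pulls one value at a time with an ordinary for loop. With real coroutines the same hand-over happens on a single thread, without any locking.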


The interface of the Coroutine classes looks like this:


The performance of this implementation is quite good: a context switch takes ~15 ns on my machine, running Ubuntu x86.

Have fun!
cheers,
Lukas

5 comments:

  1. Do you think this will make it into JDK7? It would be an awesome feature to have access to on the standard platform.

    Also, how do coroutines interact with real threads? Can the same coroutine be called from different threads as long as they don't call it at the same time? Or are they tied to the stack of the thread they are on?

    Good work! I always like to see the JVM get even more awesome.

  2. JDK7: Very unlikely, to say the least. This is really just a first experiment on how coroutines could be implemented on a JVM. But it's encouraging to see that people are interested in this...

    threads: Right now a coroutine is strictly tied to a thread. Of course letting coroutines travel from thread to thread opens up interesting use cases, but it's quite complicated.
    And: Having coroutines tied to a thread means that all data structures of the coroutine framework can be accessed without locking!

  3. BTW: The MLVM project (also called "The Da Vinci Machine Project") is an incubator for many new language and VM features that make the Hotspot JVM more powerful:
    http://openjdk.java.net/projects/mlvm/

  4. The use case that immediately came to mind for me was Actors (Actor Model). It needs to have support for some kind of "green" threads to be implemented easily, completely and efficiently. So I was thinking about setting up an N on M green thread model where the green threads are coroutines that are run on one of M real threads (they would jump from thread to thread as needed to execute). This would give them both concurrency and lightweightness at the cost of ease of use (they would need to yield occasionally or they would hold onto their thread).

    Coroutines may not be the best tool for this but it could work. But as you point out, porting coroutines between threads could have hidden costs like the need for locking. Some of these could be avoided by having the coroutine bound to a thread but allowing it to be unbound and rebound (maybe only the currently bound thread could unbind it and then another thread could bind it to itself). I'm just thinking out loud. Do not take this as any kind of feature demand.

    MLVM: Yea I've been tracking it. I just ask because InDy and other things have made the jump into JDK7 recently.

  5. Being able to pass coroutines between threads was a design goal for my own implementation, which you might be interested in:

    http://oss.readytalk.com/avian/javadoc/avian/Continuations.html

    This implementation uses a simpler, less-sophisticated VM than HotSpot, but still performs very well. The source is available here if you're interested:

    http://oss.readytalk.com/avian/status.html

    I'm eager to get involved with MLVM myself once I have some free time to spend on it. I do a lot of work with non-blocking I/O, and having continuations in the JDK would save me the pain of writing and maintaining state machines and juggling callbacks.
