Thursday, February 11, 2010

Taking out the trash: Coroutines versus Garbage Collection

From the point of view of heap management a coroutine is just another object that contains pointers to (possibly lots of) other objects. On the other hand, coroutine data might be lying around on some coroutine stack outside the Java heap (instead of in an object inside the Java heap). So while designing a coroutine system one inevitably needs to think about how coroutines and garbage collection interact.

Ok, so where's the problem?
First of all, the HotSpot Java VM (like most VMs) handles stacks quite differently from ordinary objects.
  • The pointers on the stacks are always strong root pointers. So any object that is referenced within a stack frame will always stay alive (=strong), and these pointers are the starting point for every Garbage Collection run (=root).
  • The VM makes adjustments to the stackframes before and after Garbage Collection (called prologue and epilogue). These mainly take care of converting bytecode pointers to bytecode indices (and back to bytecode pointers) in interpreted frames.
  • The VM looks at stackframes for a number of reasons, for example to see if a given method is in use, etc.
So what about Coroutine stackframes? Treating them like any other object doesn't work because of these special cases. In fact, unless one wants to make radical changes to the compilers and the compiler infrastructure, the following condition must hold:
The VM at all times needs to be able to efficiently locate all currently running instances of all methods.
Ok, so where's the problem? Well, there isn't one for symmetric Coroutines (which are periodically scheduled). They will be scheduled again and again until they die and when they die they will be thrown away and that's it.
But asymmetric Coroutines (which explicitly call each other) might never reach their end. The programmer can simply set the last reference to an asymmetric Coroutine to null and assume that it will be collected. But the this pointer in the first stackframe of the Coroutine is still a root pointer, and thus the Coroutine and all the objects it references will never go away.
Of course they will be cleaned up as soon as the thread ends, but threads can (and most often will) live for a long time.

What can we do about it?
There are a few options that allow us to kill Coroutines before the thread ends:
Remove the special treatment for stacks
Removing the Garbage Collection prologue and epilogue seems possible. But having stackframes that are only reachable via the GC is hard: imagine that the compiler needs to deoptimize a method because some assumptions it made while compiling no longer hold (more than one implementation of an interface, etc.). The compiler needs to find all current activations of the method and patch them so that they will be deoptimized, and that requires a fast way to walk every stack.
Have the programmer close Coroutines when they are no longer needed
This will of course always work - and a close operation should probably be implemented in any case.
Let the references on Coroutine stacks be weak references
This looks very promising at first glance. The Coroutine is only weakly referenced by the stack, so as soon as it isn't referenced by the Java code any more it can go away.
But: This is too weak. As long as the Coroutine is alive none of the references on the stack should be weak!
Remove the Coroutine stack in the Coroutine's finalizer
Well this is basically a variation of the aforementioned options. It won't work because the Coroutine will always be referenced from its stack.
Introduce a weakly referenced object between the thread and the coroutine stacks
That will be the way to go. This proxy object is referenced by the Coroutine, and as soon as the Coroutine dies it will go away, and the stack along with it. If the Coroutine still exists it will keep the whole stack alive.
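To make the proxy idea a bit more concrete, here is a minimal sketch in plain Java. It is only a model of the reference structure, not VM code: StackProxy, StackRef, ThreadStacks and the stackName field (which stands in for the native stack memory) are all hypothetical names invented for this illustration. The thread side holds only weak references to the proxies, the Coroutine holds the strong one, and a ReferenceQueue tells the thread which stacks can be released.

```java
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.WeakReference;
import java.util.HashSet;
import java.util.Set;

// Illustrative model only: every name here is hypothetical, not VM code.
final class WeakProxyDemo {

    // The proxy object that sits between the thread and a coroutine stack.
    static final class StackProxy {
        final String stackName;
        StackProxy(String stackName) { this.stackName = stackName; }
    }

    // The Java-visible coroutine holds the only strong reference to its proxy.
    static final class Coroutine {
        final StackProxy proxy;
        Coroutine(StackProxy proxy) { this.proxy = proxy; }
    }

    // The thread's view of a stack: a weak reference to the proxy.
    // stackName stands in for the native stack memory that must be released.
    static final class StackRef extends WeakReference<StackProxy> {
        final String stackName;
        StackRef(StackProxy proxy, ReferenceQueue<StackProxy> queue) {
            super(proxy, queue);
            this.stackName = proxy.stackName; // a String copy, not the proxy itself
        }
    }

    // Per-thread bookkeeping of all coroutine stacks.
    static final class ThreadStacks {
        final ReferenceQueue<StackProxy> dead = new ReferenceQueue<>();
        final Set<StackRef> live = new HashSet<>();

        Coroutine create(String stackName) {
            StackProxy proxy = new StackProxy(stackName);
            live.add(new StackRef(proxy, dead));
            return new Coroutine(proxy);
        }

        // Drops every stack whose proxy has been cleared; returns how many.
        int reap() {
            int released = 0;
            Reference<? extends StackProxy> ref;
            while ((ref = dead.poll()) != null) {
                if (live.remove(ref)) {
                    released++;
                }
            }
            return released;
        }
    }

    public static void main(String[] args) {
        ThreadStacks stacks = new ThreadStacks();
        Coroutine kept = stacks.create("kept");
        stacks.create("dropped"); // no reference retained by the program
        System.gc();              // only a hint, so the numbers below may vary
        int released = stacks.reap();
        System.out.println("released " + released
                + ", still tracked " + stacks.live.size()
                + ", kept proxy " + kept.proxy.stackName);
    }
}
```

Because System.gc() is only a hint, the main method may or may not report the dropped stack as released on a given run. The point is the shape of the graph: nothing on the thread side keeps a Coroutine alive, while a live Coroutine keeps its proxy (and therefore its stack) alive.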

Collecting Coroutines
One question remains: can we just collect Coroutines? What if they still hold locks, etc.?
Right. Coroutines are closed by resuming them and immediately throwing a CoroutineExitException. This exception will (or might) propagate down the stack until it ends the Coroutine.
A Coroutine can only be collected if throwing the exception ends it without any side effects.
This can be determined by inspecting the Coroutine's stack:
  • Are there any locks (synchronized) that will be freed?
  • Will the CoroutineExitException be caught?
  • Is there a finally clause?
If none of these is the case then we're on the safe side.
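The three checks above can be sketched as a simple walk over the frames of a suspended Coroutine. This is a model under stated assumptions, not the VM's implementation: the Frame class and its boolean fields are hypothetical summaries of what a real VM would have to derive from monitor information and exception tables.

```java
import java.util.Arrays;
import java.util.List;

// Illustrative model only: Frame and its fields are hypothetical.
final class CollectCheck {

    static final class Frame {
        final String method;
        final boolean holdsMonitor; // frame is inside a synchronized region
        final boolean catchesExit;  // a handler would catch CoroutineExitException
        final boolean hasFinally;   // a finally block covers the suspend point

        Frame(String method, boolean holdsMonitor, boolean catchesExit, boolean hasFinally) {
            this.method = method;
            this.holdsMonitor = holdsMonitor;
            this.catchesExit = catchesExit;
            this.hasFinally = hasFinally;
        }
    }

    // True only if unwinding every frame would have no observable side effect.
    static boolean canCollectSilently(List<Frame> stack) {
        for (Frame frame : stack) {
            if (frame.holdsMonitor || frame.catchesExit || frame.hasFinally) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        List<Frame> plain = Arrays.asList(
                new Frame("producer", false, false, false),
                new Frame("step", false, false, false));
        List<Frame> locked = Arrays.asList(
                new Frame("producer", false, false, false),
                new Frame("criticalSection", true, false, false));
        System.out.println(canCollectSilently(plain));  // prints "true"
        System.out.println(canCollectSilently(locked)); // prints "false"
    }
}
```

A Coroutine that fails this check is not lost: it simply has to be closed explicitly, so that the exception really runs through its handlers and releases its locks.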
