For a couple of years now, Pharo has included support for Ephemerons, originally introduced with the Spur memory manager written by Eliot Miranda. For the upcoming Pharo 9.0 release, we have stressed the implementation (with several hundred thousand Ephemerons), made it compatible with the latest changes to the old space compaction algorithm, and made it a tiny bit more robust. In other words, from Pharo 9 on, Ephemerons will be part of the Pharo family for real, and we will work in Pharo 10 on nice standard library support for them. For now, the improvements are available only in the latest nightly build of the VM, waiting to be promoted.
Still, you may be scratching your head, wondering “what the **** are ephemerons?”. The rest of this post will give you a taste of them.
What are Ephemerons?
An ephemeron is a data structure that notifies us when an object is about to be garbage collected. It was invented by Barry Hayes and published at OOPSLA in 1997 in a paper named “Ephemerons: A New Finalization Mechanism”. This mechanism is particularly useful when working, for example, with external resources such as files or sockets.
To be concrete, imagine you open a file, which yields an object holding a reference to a system file descriptor. You read from it and write to it, and when you’re done, you close it. Closing the file closes the file descriptor and returns the resource to the OS. You really want your file to be closed, otherwise nasty stuff may happen, because your OS limits the number of files you can open.
However, applications do not always have such a straight and simple control flow. Let’s imagine the following, not necessarily realistic, arguably not well designed, but very illustrative case: you open a file, you pass it as an argument to some library, and… now the library owns a reference to your file. So maybe you don’t want to close it yet. And the library may not want to close it either, because you are the real owner of the file!
Another possibility is to let the file be, and make sure that when the object is no longer used and gets garbage collected, we close its file descriptor. An Ephemeron does exactly that! It lets us know when an object is about to be collected, and gives us the possibility to “finalize” it.
You can test it as follows (using the latest VM!):
    Object subclass: #MyAnnouncingFinalizer
        instanceVariableNames: ''
        classVariableNames: ''
        package: 'MyEphemeronTest'

    MyAnnouncingFinalizer >> finalize [
        self inform: 'gone!'
    ]

    obj := MyAnnouncingFinalizer new.
    e := Ephemeron new.
    e key: obj.
    obj := nil
You will see that after nilling the variable obj, the Ephemeron will react and send the finalize message to our object.
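Going back to the file scenario, the same pattern can be sketched for a resource that must be closed. This is only an illustration under assumptions: MyFileHolder, its stream: setter and the file name are hypothetical names made up for this example, and the Ephemeron is used exactly as in the snippet above (Ephemeron new and key:), which requires the latest VM.

```smalltalk
"Hypothetical holder class: it owns a stream and closes it when finalized."
Object subclass: #MyFileHolder
    instanceVariableNames: 'stream'
    classVariableNames: ''
    package: 'MyEphemeronTest'

MyFileHolder >> stream: aStream [
    stream := aStream
]

MyFileHolder >> finalize [
    "Close the stream once; nil it so a second finalize is harmless."
    stream ifNotNil: [
        stream close.
        stream := nil ]
]

"Playground part: the file name is an arbitrary example."
holder := MyFileHolder new.
holder stream: 'ephemeron-demo.txt' asFileReference writeStream.
e := Ephemeron new.
e key: holder.

"Dropping the last strong reference lets the ephemeron trigger finalize."
holder := nil
```

Neither the caller nor a library holding the file has to decide who closes it: the close happens once nobody references the holder anymore.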
What about weak objects?
Historically Pharo also supports weak objects, and another finalization mechanism for them.
A weak object is a special kind of object whose references are weak.
To put it informally, a weak reference is an object reference that is not taken seriously by the garbage collector. If the garbage collector finds that an object is only referenced by weak references, it will collect it and replace all those weak references with a “tombstone” (which tends to be nil in many implementations).
Historically, we have used the weak mechanism for finalization in Pharo, which can be used like this:
    obj := MyAnnouncingFinalizer new.
    weakArray := WeakArray new: 1.
    weakArray at: 1 put: obj.
    WeakRegistry default add: obj.
Here, the weak array object will have a weak reference to our object, and the obj reference in the playground will be a strong reference. As soon as we nil the playground reference, the object will be detected for finalization and its finalize method will be executed too. Moreover, if we check our weak array, we will see our tombstone there in place of the original object.
    obj := nil.
    weakArray at: 1.
Why not use this weak finalization instead of the ephemeron one?
The main explanation is performance. With the weak finalization process, every time the VM detects that an object needs to be finalized, it raises an event. Then, the weak finalization library iterates over all elements in the registry, looking for tombstones to find out which elements need to be finalized. This means that for each weak object, the weak finalization must do a full scan of all possible registered weaklings!
The ephemeron mechanism is more direct: when the VM detects that an ephemeron needs to be finalized, it pushes the ephemeron to a queue and raises an event. Then, the ephemeron finalization empties the queue and finalizes each queued ephemeron. No need to check all existing ephemerons.
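To get a feeling for the difference, here is a toy model you can run in a playground. It is not the VM's or the finalization library's actual code: plain collections stand in for the weak registry and for the ephemeron queue, and every variable name is made up for this illustration. It only counts how many entries each strategy has to look at when 3 objects out of 10000 die.

```smalltalk
"Toy model only: none of these names are real finalization APIs."
| registry dead weakScanVisits finalizationQueue ephemeronVisits |

"10000 registered entries; nil plays the role of the tombstone."
registry := Array new: 10000 withAll: #alive.
dead := #(17 4242 9001).

"Weak style: for each dying object, the whole registry is scanned."
weakScanVisits := 0.
dead do: [ :index |
    registry at: index put: nil.
    registry do: [ :each |
        "A real registry would check each entry for a tombstone here."
        weakScanVisits := weakScanVisits + 1 ] ].

"Ephemeron style: only the queued entries are ever touched."
finalizationQueue := OrderedCollection withAll: dead.
ephemeronVisits := 0.
[ finalizationQueue isEmpty ] whileFalse: [
    finalizationQueue removeFirst.
    ephemeronVisits := ephemeronVisits + 1 ].

"weakScanVisits is 30000, ephemeronVisits is 3."
{ weakScanVisits. ephemeronVisits }
```

In this model the weak approach costs the number of dying objects times the registry size, while the queue approach costs only the number of dying objects.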
A Weak Pharo Story, Memory Leaks and More
Of course, ephemerons are not only about efficiency. They also help avoid many nasty memory leaks. A couple of years ago, together with Pavel, we gave a presentation at ESUG about a very concrete memory leak caused by misuse of weak objects. It’s a fun story to tell with enough perspective, but it was not a fun bug to track down at the time 😛 .
And even more, a robust ephemeron implementation will help us remove all the (potentially buggy and inefficient) weak finalization code in Pharo 10!