A Taste of Ephemerons

For a couple of years now, Pharo includes support for Ephemerons, originally introduced with the Spur memory manager written by Eliot Miranda. For the upcoming Pharo 9.0 release, we have stressed the implementation (with several hundred thousands Ephemerons), make it compatible with latest changes done in the old space compaction algorithm, and made it a tiny bit more robust. In other words, from Pharo 9 and on, Ephemerons will be part of the Pharo family for real, and we will work on Pharo 10 to have a nice standard library support for it. For now, the improvements are available only in the latest night build of the VM, waiting to be promoted as stable.

Still, you would be scratching your head at “what the **** are ephemerons?”. The rest of this post will give a taste of them.

What are Ephemerons?

An ephemeron is a data structure that gives some notification when an object is garbage collected, invented by Barry Hayes and published in 1997 in OOPSLA in a paper named “Ephemerons: A New Finalization Mechanism”. This mechanism is particularly useful when working, for example, with external resources such as files or sockets.

To be concrete, imagine you open a file, which yields an object having a reference to a system’s file descriptor. You read and write from it, and when you’re done, you close it. Closing the file closes the file descriptor and returns the ressource to the OS. You really want your file to be closed, otherwise nasty stuff may happen, because your OS will limit the number of files you can open.

Sometimes however, applications do not always have such a straight and simple control flow. Let’s imagine the following, not necessarily realistic, arguably not well designed, but very illustrative case: Sometimes you open a file, you pass your file as argument to some library, and… now the library owns a reference to your file. So maybe you don’t want to close it yet. And the library may not want to close it either because you are the real owner of the file!

Another possibility is to let the file be. And make sure that when the object is not used anymore and garbage collected, we close its file descriptor. An Ephemeron does exactly that! It allows us to know when an object is collected, and gives us the possibility to “finalize” it.

You can test it doing (using the latest VM!):

Object subclass: #MyAnnouncingFinalizer
    instanceVariableNames: ''
    classVariableNames: ''
    package: 'MyEphemeronTest'

MyAnnouncingFinalizer >> finalize [
    self inform: 'gone!'
]

obj := MyAnnouncingFinalizer new.

e := Ephemeron new.
e key: obj.

obj := nil

You will see that after nilling the variable obj, the Ephemeron will react and send the finalize message to our MyAnnouncingFinalizer object.

What about weak objects?

Historically Pharo also supports weak objects, and another finalization mechanism for them.
A weak object is a special kind of object whose references are weak.
And to say it informally, a weak reference is an object reference that is not taken seriously by the garbage collector. If the garbage collector finds that an object is only referenced by weak references, it will collect it, and replace all those weak references by a “tombstone” (which tends to be nil in many implementations).

Historically, we have used the weak mechanism for finalization in Pharo, which can be used like this:

obj := MyAnnouncingFinalizer new.
weakArray := WeakArray new: 1.
weakArray at: 1 put: obj.
WeakRegistry default add: obj.

Here, the weak array object will have a weak reference to our object, and the obj reference in the playground will be a strong reference. As soon as we nil the playground reference, the object will be detected for finalization and it will execute the finalize method too. Moreover, if we check our weak array, we will see our tombstone there in place of the original object.

obj := nil.

weakArray at: 1.

Why not using this weak finalization instead of the ephemeron one?
The main explanation is performance. With the weak finalization process, every time the VM detects an object needs to be finalized, it raises an event. Then, the weak finalization library will iterate all elements in the registry checking what elements need to be finalized, by looking for the presence of tombstones. This means that for each weak object the weak finalization must do a full scan of all possible registered weaklings!

The ephemeron mechanism is more direct: when the VM detects an ephemeron needs to be finalized, it will push the ephemeron to a queue, and raise an event. Then, the ephemeron finalization will empty the queue and finalize them. No need to check all existing ephemerons.

A Weak Pharo Story, Memory Leaks and More

Of course, ephemerons are not only necessary for efficiency. They help also avoid many nasty memory leaks. A couple of years ago we did with Pavel a presentation in ESUG about a very concrete memory leak caused by mis-usage of weak objects. It’s a fun story to tell with enough perspective, but it was not a fun bug to track down at the time 😛 .

And even more, a robust ephemeron implementation will help us remove all the (potential buggy and inefficient) weak finalization code in Pharo 10!

Published by Guille Polito

Pharo dev. Researcher, engineer and father. > If it ain't tested, it does not exist.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

%d bloggers like this: