Object-centric breakpoints: a tutorial

The new Pharo debugger is shipped with object-centric breakpoints. An object-centric breakpoint is a breakpoint that applies to one specific object, instead of being active for all instances of that object’s class. We have two kinds of object-centric breakpoints:

  • The haltOnCall breakpoint: give it a method selector and a target object, and the system will halt whenever the target object executes the corresponding method.
  • The haltOnAccess breakpoint: for a target object, the system will stop whenever the state of that target object is read or written. You can specify which variable’s accesses trigger the breakpoint.


If you use the object-centric experimental image, you can skip this part and start the tutorial. Otherwise, follow the installation instructions at https://github.com/StevenCostiou/Pharo-Object-Centric-Tutorial.

The tutorial

Context: the OCDBox objects loop

In this example, we will use a class named OCDBox. It models a trivial object named box that holds objects. It has two instance variables, elements and name, and a few methods in its API. In particular, the method addElement: adds an object into the box.

We will use a test to practice object-centric breakpoints. In this test, we instantiate 100 boxes. We iterate over all these boxes to:

  • add an object to each box,
  • print the box on the Transcript.

When you execute the test, the Transcript shows you each box that is printed in the iteration loop. If you look at the code of the OCDBox class, you will see that:

  • adding an element to a box modifies its name,
  • the printing method uses the name to display the box in the Transcript.
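
To make the two observations above concrete, here is a minimal sketch of what such a class could look like. It is a hypothetical reconstruction written for illustration, not the tutorial's actual source: the package name, the initial name and the way the name is rebuilt are all assumptions.

```smalltalk
"Hypothetical reconstruction of OCDBox; the tutorial's real class may differ."
Object subclass: #OCDBox
    instanceVariableNames: 'elements name'
    classVariableNames: ''
    package: 'OCD-Sketch'

OCDBox >> initialize
    super initialize.
    elements := OrderedCollection new.
    name := 'box'

OCDBox >> name: aString
    name := aString

OCDBox >> addElement: anObject
    "Adding an element also rewrites the name: a write access to name."
    elements add: anObject.
    self name: 'box with ', elements size printString, ' element(s)'

OCDBox >> printOn: aStream
    "Printing uses the name: a read access to name."
    aStream nextPutAll: name
```

With a class shaped like this, every call to addElement: writes the name variable and every print reads it, which is what the breakpoints in the rest of the tutorial rely on.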

Object-centric breakpoints

In the following, we demonstrate the use-cases of the haltOnCall and haltOnAccess breakpoints. Each time, we start by demonstrating the breakpoint through a video, then we explain the use-case and how to identify situations where that object-centric breakpoint might help. All videos have sound.

Warming up with object-centric breakpoints

The following video shows how to install object-centric breakpoints through the inspector. Basically, we create two box objects b1 and b2, and we install various object-centric breakpoints on b2.

Installing object-centric breakpoints
  • We start by inspecting b2,
  • We select the name instance variable in the inspector, and install an object-centric breakpoint through the contextual menu:
    • a halt on read will stop the execution each time the name variable is read,
    • a halt on write will stop the execution each time the name variable is written to,
    • a halt on access will stop on both read and writes of name,
  • We select one of the methods of b2 in the inspector, and install a halt on call through the contextual menu:
    • each time this method is called, and the receiver is b2, then the execution halts

In our example, we install a halt on read breakpoint on the name instance variable of b2, and a halt on call breakpoint on the name: method of b2. For each case, the execution will halt only for methods reading the name instance variable in b2, or when b2 calls its name: method. The execution will not halt for b1, nor for any other object.

In the following, we apply our breakpoints on a debugging example. Instead of using the inspector, we install object-centric breakpoints directly from the debugger, as we would do in a real debugging session.

Removing object-centric breakpoints

Object-centric breakpoints are garbage collected along with objects: if your object disappears, then so does the breakpoint.

For existing objects, removing an object-centric breakpoint is as simple as inspecting the object (for example in the debugger), going into the breakpoint pane, then selecting the breakpoint and using the context menu to remove it.

Stopping when a specific object receives a particular message

The test iterates over a hundred box objects, and sends the addElement: message to each of them. In this exercise, we select one box object among all the boxes the test iterates over, and we install an object-centric breakpoint on the addElement: method of that object. Then, we proceed with the test and the addElement: method is called on each of the boxes. The execution only stops when the selected box executes the addElement: method.

Getting to the object of interest

The first step to use an object-centric breakpoint is to get to the object you want to debug.

For now, we do that by putting a first breakpoint into our code. When the execution stops, we navigate in the debugger to find objects of interest and install object-centric breakpoints.

In the following, we put a self halt. in the test, before iterating over the box objects. We execute the test and start from there.
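
As a rough sketch, the test with its initial halt could look like the method below. The class name OCDBoxTest and the selector testBoxes are hypothetical; only the boxes variable, the addElement: message and the crTrace printing come from the tutorial's description.

```smalltalk
"Hypothetical test method; the names OCDBoxTest and testBoxes are assumptions."
OCDBoxTest >> testBoxes
    | boxes |
    boxes := (1 to: 100) collect: [ :i | OCDBox new ].
    "Conventional breakpoint: the debugger opens here, before the loop,
     so that we can pick one box and install an object-centric breakpoint."
    self halt.
    boxes do: [ :box |
        box addElement: Object new.
        box crTrace ]
```

Placing the halt before the loop matters: all the boxes already exist, but none of them has received addElement: yet.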

Breaking when the box of interest receives the #addElement: message
  1. After adding a halt (see above), we execute the test:
    • The execution halts and opens a debugger,
    • we are able to select an object to debug it.
  2. We select one of the boxes in the boxes collection:
    • Double-click on the boxes variable in the bottom inspector,
    • choose a box in the items pane by double-clicking on the box.
  3. In the inspector that opens, go into the meta-side and select the addElement: method.
  4. Right-click on that method, and select halt on call in the menu:
    • The breakpoint is installed on this method, and scoped to our box object,
    • you can see the breakpoint in the breakpoint pane (if you look at other boxes, there is no breakpoint).
  5. Proceed the execution:
    • The test iterates over the boxes and sends the addElement: message to each box,
    • only the box that you instrumented breaks on this method.

Use-cases for the halt on call breakpoint

The typical use-case for this object-centric breakpoint is when you have many instances of the same class running in your program, while you are interested in debugging a method for one specific object among these instances. You want to answer the following question: “When is this particular object executing this particular method during the execution?”

Our box example illustrates this case. You need to debug a target method for a specific instance of OCDBox, while the test iterates over a hundred of them. Putting a breakpoint in the OCDBox class will stop the execution for every instance calling the target method.

Now imagine that you want to debug a display method in a graphical object. You will have many more instances sharing that display method. Using conventional breakpoints is tedious because the execution will stop every time one of the graphical objects uses that display method.

In contrast, you use the halt on call breakpoint to debug a specific method for a specific object because:

  • You want to avoid designing super-complex conditional breakpoints to filter your object — sometimes you don’t even have enough information to discriminate your object,
  • you do not know when (or if!) the object you want to debug will call the method, so you may not know where to insert a standard breakpoint in the code — and there might be a lot of call sites to that method so you do not know which one your object will go through,
  • you want to avoid stepping many times in the debugger before getting to the point where your object calls the method to debug — also you do not want to miss that point by extra-stepping by mistake!
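
For comparison, here is a small Playground sketch of the hand-written filtering that a halt on call breakpoint spares you. It uses plain OrderedCollection instances as stand-ins for boxes, and the index 42 is arbitrary; both are assumptions made to keep the snippet self-contained.

```smalltalk
"Hand-written filter: halt only when the receiver is the object of interest."
| boxes target |
boxes := (1 to: 100) collect: [ :i | OrderedCollection new ].
target := boxes at: 42.
boxes do: [ :each |
    "The identity test must be added by hand at every place of interest."
    each == target ifTrue: [ self halt ].
    each add: #element ]
```

This only works because we know the call site and already hold a reference to the target there. The object-centric breakpoint removes both requirements: it is attached to the object itself and triggers whatever the call site is.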

Breaking when the state of a specific object is written to

In this exercise, we reuse our test iterating over a hundred box objects. We select again a box among all the iterated boxes, but this time we want to stop when the name instance variable of that box is written to. To that end, we install an object-centric breakpoint on all write accesses to that variable in the selected object. Then, we proceed the test and the execution only breaks when the name instance variable of the selected box is modified.

Breaking when the name instance variable of our box object is written

As before, we execute the test and a debugger opens before the boxes are iterated in a loop. In this loop, each box is being printed one by one through the crTrace method.

We select one of the boxes in the boxes collection:

  • Double-click on the boxes variable in the bottom inspector,
  • choose a box in the items pane by double-clicking on the box.

In the inspector that opens, go into the raw view and select the name instance variable. Right-click on that variable, and select halt on write in the menu:

  • The breakpoint is installed on all write accesses to this variable, and scoped to our box object,
  • you can see the write accesses and the breakpoint installed on them in the breakpoint pane.

We proceed the execution and it breaks when the name variable of the selected box is modified. There is no halt when other boxes have their name variable modified.

Use-cases for the halt on access breakpoint

The typical use-case for this object-centric breakpoint is when you have many instances of the same class running in your program, while you want to know when the state of one specific object is modified. You want to answer the following question: “When is the state of this particular object modified during the execution?”

Our box example illustrates this case. Putting a breakpoint on every access to the name variable is long, tedious and error-prone (you may forget some). Watchpoints, Variable Breakpoints or Data breakpoints may help: those tools are able to stop execution when you access an instance variable defined in a class. However, they will stop the execution each time the name variable is modified in any instance of the OCDBox class (here, at each loop iteration!).

Imagine again that you are debugging a graphical object. You want to know when a given property of that object (i.e., an instance variable) is modified. You do not want all the graphical objects sharing that property to break the execution when it is modified.

Instead, you use the halt on access breakpoint to stop the execution of a program when a particular property of a specific object is modified:

  • You want to avoid designing super-complex conditional breakpoints to filter your object — sometimes you don’t even have enough information to discriminate your object,
  • you do not know when (or if!) the property will be modified, so you may not know where to insert a standard breakpoint in the code — there might be a lot of methods modifying that property, and many call sites to those methods so you do not know which one your object will go through,
  • you want to avoid stepping many times in the debugger before getting to the point where your object calls the method to debug — and you cannot guarantee that the point where you stopped in the execution will actually modify the variable!

Conclusion: Using object-centric breakpoints for real

We have seen and practiced two kinds of object-centric breakpoints, the halt on call and the halt on access breakpoints. We used a simple example to illustrate how to use those breakpoints. However, in reality it is much more complex to decide when and how to use such breakpoints. Let’s review a few pieces of advice:

  • First, investigate to know what you have at hand,
  • put a first breakpoint at a strategic place to find your object, or inspect the object directly if you have it,
  • then discriminate: what is the information you need to find?
    • do you need to know why an object seems to not execute a method properly? Use a halt on call breakpoint on this method!
    • do you need to know when or how an instance variable is modified in an object? Use a halt on write breakpoint on this variable!

Object-centric breakpoints are not a magical tool and, often, you may not find your problem directly. The execution may stop a few (or many!) times in the same place before you understand your problem. Still, object-centric breakpoints are a faster, easier way than conventional breakpoints to find the information you need from your execution.

There are also risks: similarly to conventional breakpoints, an object-centric breakpoint will halt the execution as many times as it is hit at run time. If you install any breakpoint in code that is executed very often, you may interrupt your program for good. For instance, if you put a breakpoint on the keystroke behavior of the code editor, you may not be able to write code anymore!

As a closing note, other breakpoints do exist: breakpoints that stop the execution each time a particular object receives any message, or when two specific objects interact together. These breakpoints are not implemented yet, but they are scheduled for the near future!

Testing UFFI Binding with Travis

Part of software system development is to write tests. More than writing tests, developers want to set up a CI that will automatically (at each commit, each day, …) check the status of the project tests. While it is easy to use Travis as a CI system thanks to the work of smalltalkCI, setting up Travis to test FFI bindings is more tedious.

In the following, I will present the configuration used to test Pharo-LibVLC. Pharo-LibVLC allows one to script VLC from Pharo, so one can play and control music or video.

Travis Set up

First of all, we need to set up the Travis CI. To do so, we create two configuration files: .smalltalk.ston and .travis.yml. Below, I present the default .smalltalk.ston for the VLC Pharo package. It allows one to execute the tests loaded by the VLC baseline. However, such tests might need an external library to work, for instance the VLC library.

SmalltalkCISpec {
  #loading : [
    SCIMetacelloLoadSpec {
      #baseline : 'VLC',
      #directory : 'src'
    }
  ]
}

Installing an external library

A project using FFI may need to install the corresponding library to work. Often, as an FFI binding project developer, you know the main library (in the case of VLC, it is libvlc). However, this library has dependencies (in the case of VLC, libvlccore). There are many ways to determine all the libraries you need on your system to use your FFI binding. A simple one is to run the ldd command on the main library.

Once the needed libraries are identified, we have to configure Travis to install those libraries and add them to the PATH. For the installation, the best way is to rely on the apt addon of Travis.

language: smalltalk
dist: bionic

os:
  - linux

before_install:
  - . ci/before_install.sh
  - export PATH=$PATH:/usr/lib

addons:
  apt:
    update: true
    packages:
      - libvlc-dev
      - libvlccore-dev
      - vlc

smalltalk:
  - Pharo64-8.0
  - Pharo64-9.0

matrix:
  fast_finish: true
  allow_failures:
    - smalltalk: Pharo64-9.0


Extra configuration (export display)

In addition to the libraries, an FFI project may need access to a screen. This is the case for VLC, which can spawn windows to display video. Travis comes with a solution to test GUIs. We then only need to export the DISPLAY variable with the value :99.0. For instance, with VLC, we add a script in the before_install step of Travis. It executes the file ci/before_install.sh, which sets up the display.

set -ev

# Setup display
export DISPLAY=:99.0


Use external resources with GitBridge

Finally, you may need external resources to test your FFI binding. For instance, in VLC, I need music and video files. I propose to add the needed resources to the GitHub repository and to use them for the tests. However, it is difficult to set up tests that work on your local filesystem, on other people’s filesystems, and with Travis.

Fortunately, Cyril Ferlicot has developed GitBridge. This project allows one to easily access resources present in a local git repository. After adding GitBridge to the VLC FFI project, we can get a FileReference to the folder res of the VLC FFI project by executing the method res of the bridge.
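
As an illustration, a bridge could be sketched as follows. The class name VLCBridge, the package name and the sample file name are hypothetical; the overall shape (a GitBridge subclass that registers itself at startup and derives folders from root) follows the usage described in the GitBridge documentation as far as I know, so check it against the version you load.

```smalltalk
"Hypothetical bridge class; class name and package are assumptions."
GitBridge subclass: #VLCBridge
    instanceVariableNames: ''
    classVariableNames: ''
    package: 'VLC-Tests'

VLCBridge class >> initialize
    "Register at image startup so the repository location is refreshed."
    SessionManager default registerSystemClassNamed: self name

VLCBridge class >> res
    "FileReference to the res folder at the root of the git repository."
    ^ self root / 'res'

"Usage, for example from a test (the file name is hypothetical):"
(VLCBridge res / 'sample.mp3') exists
```

Because the path is computed from the repository root, the same test can run on a local clone and on the CI machine without hard-coded paths.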

However, one last configuration step is needed to use GitBridge with Travis: executing a Pharo script. To do so, we change the .smalltalk.ston to execute a specific file.

SmalltalkCISpec {
  #loading : [
    SCIMetacelloLoadSpec {
      #baseline : 'VLC',
      #directory : 'src'
    }
  ],
  #preTesting : SCICustomScript {
    #path : 'res/ci/addRepoToIceberg.st'
  }
}


Finally, the addRepoToIceberg.st file contains the script that adds the git project to Iceberg.

(IceRepositoryCreator new 
    location: '.' asFileReference;
    subdirectory: 'src';
    createRepository) register


And now you are done!

About Blocks: Variables and Blocks

This is the second post of the block closure series. This blog post is extracted from “Deep Into Pharo” and from a new book in progress that will revisit chosen pieces of Pharo. We will start to play with Pharo to illustrate blocks and their interplay with variables.

Variables and blocks

A block can have its own temporary variables. Such variables are initialized during each block execution and are local to the block. We will see later how such variables are kept. Now the question we want to make clear is what is happening when a block refers to other (non-local) variables. A block will close over the external variables it uses. It means that even if the block is executed later in an environment that does not lexically contain the variables used by a block, the block will still have access to the variables during its execution. Later, we will present how local variables are implemented and stored using contexts.
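
The first point, that block temporaries are initialized at each execution, can be checked with a small Playground sketch. The variable names are only illustrative.

```smalltalk
| block |
block := [ | local |
    "local is nil at the start of every execution of the block"
    local := (local ifNil: [ 0 ]) + 1.
    local ].
block value. "-> 1"
block value. "-> 1 again: the previous value of local was not kept"
```

Contrast this with the non-local variables discussed next, whose values do survive between executions because they live outside the block.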

In Pharo, private variables (such as self, instance variables, method temporaries and arguments) are lexically scoped: an expression in a method can access the variables visible from that method, but the same expression put in another method or class cannot access the same variables because they are not in the scope of the expression (i.e., visible from the expression).

At runtime, the variables that a block can access are bound (get a value associated to them) in the context in which the block that contains them is defined, rather than the context in which the block is executed. It means that a block, when executed somewhere else, can access variables that were in its scope (visible to the block) when the block was created. Traditionally, the context in which a block is defined is named the block home context.

The block home context represents a particular point of execution (since this is a program execution that created the block in the first place), therefore this notion of block home context is represented by an object that represents program execution: a context object in Pharo. In essence, a context (called stack frame or activation record in other languages) represents information about the current execution step such as the context from which the current one is executed, the next byte code to be executed, and temporary variable values. A context is a Pharo execution stack element. This is important and we will come back later to this concept.

A block is created inside a context (an object that represents a point in the execution).

Some little experiments

Let’s experiment a bit to understand how variables are bound in a block.

Define a class named  BExp  (for BlockExperiment) as a subclass of  TestCase  so that we can define quickly some tests to make sure that our results are always correct.

TestCase subclass: #BExp
    instanceVariableNames: ''
    classVariableNames: ''
    package: 'BlockExperiment'

Experiment 1: Variable lookup.

A variable is looked up in the block definition context.

We define two methods: one that defines a variable t, sets it to 42 and creates a block [ t ], and one that defines a new variable with the same name and executes a block defined elsewhere.

BExp >> setVariableAndDefineBlock
    | t |
    t := 42.
    ^ self executeBlock: [ t ]

BExp >> executeBlock: aBlock
    | t |
    t := 33.
    ^ aBlock value

BExp new setVariableAndDefineBlock
>>> 42

The following test passes. Executing the  BExp new setVariableAndDefineBlock  expression returns 42 .

BExp >> testSetVariableAndDefineBlock
    self assert: self setVariableAndDefineBlock equals: 42

The value of the temporary variable t defined in the setVariableAndDefineBlock method is the one used, rather than the one defined inside the method executeBlock:, even if the block is executed during the execution of this method.

The variable t is looked up in the context of the block creation (the context created during the execution of the method setVariableAndDefineBlock) and not in the context of the block evaluation (the method executeBlock:).

Let’s look at it in detail. The following figure shows the execution of the expression  BExp new setVariableAndDefineBlock .

  • During the execution of method  setVariableAndDefineBlock , a variable  t  is defined and it is assigned 42. Then a block is created and this block refers to the method activation context – which holds temporary variables.
  • The method executeBlock: defines its own local variable t with the same name as the one in the block. However, this is not the variable that is used when the block is executed. While the method executeBlock: is running, the block is executed; when the expression t is evaluated, the non-local variable t is looked up in the home context of the block, i.e., the method context that created the block, and not in the context of the currently executing method.

Non-local variables are looked up in the  home context  of the block (i.e., the method context that  created  the block) and not the context executing the block.

 Experiment 2: Changing a variable value

Let’s continue our experiments. The method  setVariableAndChangingVariableBlock  shows that a non-local variable value can be changed during the evaluation of a block.

Executing  BExp new setVariableAndChangingVariableBlock  returns 2008, since 2008 is the last value of the variable  t .

BExp >> setVariableAndChangingVariableBlock
    | t |
    t := 42.
    ^ self executeBlock: [ t := 2008. t ]

The test verifies this behavior.

BExp >> testSetVariableAndChangingVariableBlock
    self assert: self setVariableAndChangingVariableBlock equals: 2008

Experiment 3: Accessing a shared non-local variable.

Different blocks can share a non-local variable and they can modify the value of this variable at different moments.

To see this, let us define a new method  accessingSharedVariables  as follows:

BExp >> accessingSharedVariables
    | t |
    ^ String streamContents: [ :st |
        t := 42.
        self executeBlock: [ st print: t. st cr. t := 99. st print: t. st cr ].
        self executeBlock: [ st print: t. st cr. t := 66. st print: t. st cr ].
        self executeBlock: [ st print: t. st cr ] ]

The following test shows the results:

BExp >> testAccessingSharedVariables
    self assert: self accessingSharedVariables equals: '42
99
99
66
66
'

BExp new accessingSharedVariables  will print 42, 99, 99, 66 and 66.

Here the two blocks  [ st print: t. st cr. t := 99. st print: t. st cr  ]  and  [ st print: t. st cr. t := 66. st print: t. st cr ]  access the same variable  t  and can modify it. During the first execution of the method  executeBlock:  its current value  42  is printed, then the value is changed and printed. A similar situation occurs with the second call. This example shows that blocks share the location where variables are stored and also that a block does not copy the value of a captured variable. It just refers to the location of the variables and several blocks can refer to the same location.
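
The same sharing can be observed directly in a Playground, without the test class. This is a minimal sketch with illustrative names: two blocks close over one variable, and a change made through one block is visible through the other.

```smalltalk
| counter increment read |
counter := 0.
increment := [ counter := counter + 1 ].
read := [ counter ].
increment value.
increment value.
read value "-> 2: both blocks refer to the same counter location"
```

If blocks copied the value of counter at creation time, read value would answer 0 instead.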

Experiment 4: Variable lookup is done at execution time.

The following example shows that the value of the variable is looked up at runtime and not copied during the block creation. First add the instance variable  block  to the class  BExp .

TestCase subclass: #BExp
    instanceVariableNames: 'block'
    classVariableNames: ''
    package: 'BlockExperiment'

Here the initial value of the variable t is 42. The block is created and stored into the instance variable block, but the value of t is changed to 69 before the block is executed. This last value (69) is the one that is effectively printed, because it is looked up at execution time in the home context of the block. Executing BExp new variableLookupIsDoneAtExecution returns '69', as confirmed by the test.

BExp >> variableLookupIsDoneAtExecution
    ^ String streamContents: [ :st |
        | t |
        t := 42.
        block := [ st print: t ].
        t := 69.
        self executeBlock: block ]

BExp >> testVariableLookupIsDoneAtExecution
    self assert: self variableLookupIsDoneAtExecution equals: '69'

Experiment 5: For method arguments

We can naturally expect that method arguments referred to by a block are bound in the context of the defining method.

Let’s illustrate this point. Define the following methods.

BExp >> testArg5: arg
    ^ String streamContents: [ :st |
        block := [ st << arg ].
        self executeBlockAndIgnoreArgument: 'zork' ]

BExp >> executeBlockAndIgnoreArgument: arg
    ^ block value

BExp >> testArg5
    self assert: (self testArg5: 'foo') equals: 'foo'

Executing BExp new testArg5: 'foo' returns 'foo' even if, in the method executeBlockAndIgnoreArgument:, the name arg is bound to another value ('zork').

The block execution looks for the value of arg in its definition context, which is the activation of testArg5: in which arg is bound to 'foo' by the execution of the method testArg5.

 Experiment 6: self binding

Now we may wonder if  self  is also captured in a block. It should be but let us test it.

To test we need another class.

Let’s simply define a new class and a couple of methods.

Add the instance variable  x  to the class  BExp  and define the  initialize  method as follows:

TestCase subclass: #BExp
    instanceVariableNames: 'block x'
    classVariableNames: ''
    package: 'BlockExperiment'

BExp >> initialize
    super initialize.
    x := 123.

Define another class named  BExp2 .

Object subclass: #BExp2
    instanceVariableNames: 'x'
    classVariableNames: ''
    package: 'BlockExperiment'

BExp2 >> initialize
    super initialize.
    x := 69.

BExp2 >> executeBlockInAnotherInstance6: aBlock
    ^ aBlock value

Then define the methods that will invoke methods defined in  BExp2 .

BExp >> selfIsCapturedToo
    ^ String streamContents: [ :st |
        self executeBlockInAnotherInstance6: [ st print: self; print: x ] ]

BExp >> executeBlockInAnotherInstance6: aBlock
    ^ BExp2 new executeBlockInAnotherInstance6: aBlock

Finally we write a test showing that it is indeed the instance of BExp that is bound to self, even if the block is executed in BExp2.

BExp >> testSelfIsCapturedToo
    self assert: self selfIsCapturedToo equals: 'BExp>>#testSelfIsCapturedToo123'

Now when we run this test, BExp>>#testSelfIsCapturedToo is printed, showing that a block captures self too: an instance of BExp2 executed the block, but the printed object (self) is the original BExp instance that was accessible at block creation time. The printed form is BExp>>#testSelfIsCapturedToo because BExp is a subclass of TestCase.


We have shown that blocks capture variables that are reached from the context in which the block was defined, and not from where they are executed. Blocks keep references to variable locations that can be shared between multiple blocks. In subsequent posts we will show how this is optimized by the Pharo compiler and VM to avoid expensive lookups.

[Pharo features] Advanced run-time reflection

In this series of blog-posts, we are explaining in more detail the Pharo features described in https://pharo.org/features.

The source code describes programs statically, but this perspective is insufficient if you want to fully understand the behavior of the computer system. The source code can, for example, describe a complex graph of data objects, although its concrete shape depends on input data and state of the application. To examine the real-time status of the system, you need to have tools that allow you to see the state of objects and relations between them. It is especially true if the program does not behave in an expected way, and you need to debug it. For modern development environments, to provide such tools for run-time analysis is an essential feature.

In Pharo, you can manipulate your program like any other object. In fact, programs are objects too. So to provide tools to examine your running program, Pharo needs to be able to inspect itself too. For this reason, this feature is named “reflection“. Pharo is a computer system that can examine and modify itself.

Pharo is truly strict in that and, on the level of the object memory, does not hide anything from itself. At the same time, it exposes all this information and these capabilities to programmers. So programmers can look inside every living object in the object memory, examine its state defined by instance variables, modify them and write their own tools to do the same. Of course, this feature is not limited to your own objects: all the tools, such as the browser, use reflection to introspect the objects that represent the program itself (and Pharo itself).


The standard Pharo tools for examination of objects are named “inspectors“. An inspector exposes the internal structure of objects and, usually, you use multiple of them at the same time, because, in most cases, you want to see more than one object at once. It allows multiple inspectors to be opened on the same object too and each one can show a different aspect of the object. The inspectors use so-called pagers for diving into the internal structure of the objects – to inspect objects that are referenced by instance variables of the examined object. We will talk about inspectors more deeply in a separate article of this series.

An Inspector inspecting itself.


Surprisingly, Pharo allows the opposite inspecting direction too. It is easy to know the objects that some object refers to, but finding the objects that reference a given object is hard, especially if it should be fast. It is an extremely useful feature, mainly when you are trying to find memory leaks.

Usually, a memory leak is the name for the situation where you allocate a piece of memory explicitly and accidentally never free it, while nobody holds a pointer to it anymore, so it cannot be reused. In extreme but not uncommon cases, it can exhaust the available memory, with all the nasty consequences. To manage memory, Pharo uses an automatic garbage collector, so it is safe from this kind of memory leak, but this does not mean that it is safe from wasting memory. The garbage collector reclaims only the objects that are not referenced by any other object, so if an object keeps a reference while it should not, the referenced object will not be garbage collected.

For detection and solution of such cases, Pharo offers several tools such as ReferenceFinder that finds the shortest path of references between two objects. If the starting point is a root object of the garbage collector, you can detect why your object is not garbage-collected as expected.

You first need to know that you have a memory leak before you try to solve it. Pharo has another reflective feature that helps here: it allows enumeration of the living instances of a given class. So in a situation where you expect that a class should not have any instance, you can easily check it. We will talk about this feature in more detail later too.

Practical demonstration

The short video we created to demonstrate Pharo reflection uses these techniques.

  • First, we ask the class SystemWindow to return some instance of itself using a message with the unsurprising name someInstance. We open an inspector on the resulting object by sending it the message inspect.
    • SystemWindow someInstance inspect
  • Then we look at the internals of this object. We can see, for example, the rectangle that forms boundaries of the window
  • We send a message pointersTo to this window. It returns a collection of all objects that have a direct reference to the window object and open another inspector on it. This collection has 73 items.
  • We choose one of them. A graphical object (morph) that is part of the window and is responsible for resizing of the window by mouse in one direction.
  • Then we look at its bounds, the Inspector dives to show an object that represents it. It is a rectangle that has two points named origin and corner. We click on the second one to see the internals of it. The point has two instance variables with coordinates x and y containing numbers.
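
The reflective messages used in these steps can also be tried on small objects in a Playground. The sketch below uses illustrative variable names and relies only on pointersTo, allInstances and someInstance; the number of referrers and instances you get depends on your image.

```smalltalk
| box holder referrers |
box := Object new.
holder := Array with: box.

"All objects holding a direct reference to box."
referrers := box pointersTo.
referrers identityIncludes: holder. "expected to answer true"

"Enumerate the living instances of a class."
SystemWindow allInstances size.
SystemWindow someInstance
```

Note that the referrers may include more than holder, since the Playground itself keeps references to the temporaries it evaluates.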


One basic rule of object-oriented programming declares that objects must be encapsulated. No one has the right to look at their internal structure or modify it. What we have demonstrated clearly violates this rule! Right?

Actually not. When the Inspector looks at the private content of an object or modifies its instance variables directly, it does so by sending special messages to the object. These messages are instVarNamed:, instVarNamed:put: and so on. The object itself can still determine how it reacts to such messages, or whether it implements them at all. It can, for example, return false information or pretend that its state was changed when it was not.
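A small sketch of these messages on a Point, chosen here only because its two instance variables x and y are easy to follow:

```smalltalk
| point |
point := 3 @ 4.

"Read an instance variable by name, without going through the accessor."
point instVarNamed: 'x'. "answers 3"

"Write an instance variable by name."
point instVarNamed: 'x' put: 10.
point x. "answers 10"
```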

It is just extremely convenient to implement such methods correctly for all objects, so that programmers have a correct and complete view of the state of the system and can study, understand and change it quickly.

For programmers, it is great to have the system fully exposed, but what about users? It may be dangerous to give them the power to examine and change everything they want, and the approach of Pharo may look like a recipe for disaster. But, in reality, Pharo isn’t different from any other computer system. You just do not let the users or attackers execute any code they create.

Ed: your new Emergency Debugger

“What is the nature of your debugging emergency?”


In this post, we’ll speak about debugging emergencies due to system error handling failures. We’ll cover the current tools available to revert code changes provoking such failures and introduce Ed, the new emergency debugger with improved revert capabilities.

What is a debugging emergency?

Sometimes when you execute code, you get a strange failure that opens that window and blocks your execution. This is the emergency evaluator invite.

It says a lot of things and it is scary, and it is common to just close it as fast as possible to get back to our Pharo image and forget about it.

Unfortunately, this means that there was a system error handling failure, it may happen again, and sometimes it totally breaks your image. This is a debugging emergency!

A debugging emergency is a system error handling failure

A system error handling failure happens when the debugger cannot be opened after an exception is raised. Most of the time, this happens because an error occurs within the debugger itself, which prevents it from working properly. Consequently, the debugger cannot open to debug its own error. This is an error-handling failure.

Because the system failed to handle the error, the whole system crashes. The system falls back to the emergency evaluator invite (see above) and waits for the user to decide what to do. We are in a debugging emergency state because tools are broken (here, the debugger) but nevertheless we need to recover from that error. At this point, the only tool still able to work is the emergency evaluator.

The only tool able to deal with an error-handling failure is the emergency evaluator

The emergency evaluator is a rudimentary tool with a very limited user interface. It is started from the emergency evaluator invite. It responds to only two commands:

  • revert: reverts the last modified method to the last known version of that method (in this screenshot, it just failed doing that… why?).
  • exit: quits the emergency evaluator, dropping the original error.

Any other command is treated as Pharo source code: it is compiled and executed.

We need better emergency handling because current tools are limited

Emergency tools are too limited: they do not provide enough information, that information is not explorable and on top of that, they are buggy!

The first limitation is the restricted information that is given by the emergency evaluator invite. It is a printed representation of the last 20 elements of the call stack. All other debugging information is discarded, and only that printed stack is kept. Because it is only printed, a lot of contextual information is lost. It is often hard to see in that stack where the execution crashed. The problem is that users have to rely on that information to decide what to do in the emergency evaluator!

Then comes the emergency evaluator. Its revert command is interesting. It allows us to recover from a fatal method change that provokes the error. But there are many problems. Among others, it is buggy and does not work all the time. Then, it only reverts the last modified method in the system. What if the problematic method is not the last modified one, but another one modified much before?

When reverting does not work, the only option left is to manually recompile the buggy method. This is tedious and error-prone: we have to remember the original class name, the method’s source code and rewrite it without any error. Although this is possible for simple methods, it does not scale. Finally, in case of error, the emergency evaluator will not give us any feedback and we’re left in the dark.

Ed: the Emergency Debugger

Ed is an enhanced emergency evaluator that keeps as many debugging features as possible before falling back to the emergency evaluator. Ed provides a structured and navigable call stack from which users decide their debugging actions.

Ed is designed as a minimal, last-resort debugger to handle error handling failures. The goal of Ed is to provide a debugging tool with the fewest possible system dependencies, to ensure that it can work even if almost everything else breaks.

Ed always opens on your domain code failure: it does not show you the debugger error. Your concern is to debug your code. However, it is possible to see a stack trace of the debugger failure, and to debug it with Ed.

Ed’s improved emergency stack view

Ed presents a navigable stack instead of a string representation of the stack. Instead of dropping the debug session when a system error handling failure occurs, Ed keeps it and uses it to provide us with contextual information about the failure. The stack that Ed presents is a stack of execution contexts extracted from the debug session.

The navigable stack helps us understand the error and where it originally happened:

  • We can see and navigate the call stack like in a standard debugger (top pane)
  • We can see the argument value (if any) printed between parentheses near each method in the call stack
  • We can see the source code of the method selected in the stack (middle pane), and see exactly what expression failed or is being executed (in red)

Ed’s method version selector

For a selected method in the call stack, Ed provides access to the different versions of that method. The source code of each version is displayable and navigable. Viewing methods’ versions helps in deciding which one is the method to revert. In addition, it does not matter if the right method was the last modified method in the system. It’s highly probable that this method is somewhere in the call stack, and we can navigate the call stack!

  • We can see and navigate the selected stack method’s versions using left/right arrows: bottom pane shows version 2/15 of the selected method
  • We can revert the method to the selected version by using the revert command
  • Once a method is reverted, the retry command allows you to retry opening a real debugger on your error, and the proceed command will attempt to continue execution from there.
  • Any method in the stack can be selected and reverted to a previous version


What else can Ed do?

Ed provides a prompt with basic commands. For now, Ed uses a simple debugging API with minimal commands. Remember: you got here because the debugger failed while opening on an error from your domain code. So Ed just gives you another chance to debug your error. Maybe you are interested in debugging the debugger failure, in which case Ed allows you to switch context to debug that failure. It is also possible to retry the opening of the standard debugger after reverting a method or to terminate all non-vital system processes to kill potential image-breaker running code.

  • Ed provides some help to know all available commands: type "h"
  • The set of commands is provided by the debugging API of Ed
  • Using the showDebuggerError command, Ed shows the debugger failure that led you here, and the debugDebugger command opens a new Ed on the debugger failure.

Can Ed fail?

Yes, it can. Any program can. In case Ed fails, the system falls back to the original emergency evaluator. For now, a failure in Ed should not block you. For instance, imagine that you get a debugger failure while debugging an exception in your domain code. First, Ed will open on your exception (not the debugger failure) so that you can debug your code. Second, if an exception is raised from within Ed and breaks it, the emergency evaluator will open on your exception again. Debugging the debugger is not going to be on your shoulders, and you can focus on your bugs!

Note that, in the future, this last part is something we are going to improve to bring more flexibility to developers. Settings will enable the debugging of debugger failures or disable them to try to always find a debugger that can debug your exceptions.

Additionally, the default emergency debugger can now be selected in the settings. You can choose whether you wish to experiment with Ed or to stay with the old emergency evaluator:

Ed is a prototype, but it moves fast!

Ed is still a prototype, and it needs better testing and experimentation. For instance, it still has system dependencies that can be removed, it has code that can be improved.

In the end, the goal is also to give users a read-eval-print-loop prompt with a strong debugging API. For now, Ed only has its minimal emergency API. However, it has been designed in a way that its API is extensible by additional tools, improving Ed’s potential capabilities. In the future, we want to explore remote debugging with Ed. For instance, Ed could act as a minimal debugging infrastructure that you can remotely connect to through a simple HTTP connection. And if your debugging needs grow, Ed could receive a bytecode upload containing new debugging tools and API. We will experiment and explore this in future articles about debugging IoT devices.

About Blocks: Basics 101

This blog post is the first, short entry of a series in which we will explore block closures. This post covers the basic elements, while the following ones will go over more semantic aspects using simple examples. This blog post is extracted from “Deep Into Pharo” and from a new book in preparation that will revisit chosen pieces of Pharo.

Lexically-scoped block closures, blocks in short, are a powerful and essential feature of Pharo. Without them it would be difficult to have such a small and compact syntax.
The use of blocks is key to get conditionals and loops as library messages and not hardcoded in the language syntax. This is why we can say that blocks work extremely well with the message passing syntax of Pharo.

What is a block?

A block is a lambda expression that captures (or closes over) its environment at creation-time. We will see later what it means exactly. For now, imagine a block as an anonymous function or method. A block is a piece of code whose execution is frozen and can be kicked in using messages. Blocks are defined by square brackets.

If you execute and print the result of the following code, you will not get 3, but a block.
Indeed, you did not ask for the block value, but just for the block itself, and you got it.

[ 1 + 2 ]
>>> [ 1 + 2 ]

A block is executed by sending the message value to it.
More precisely, blocks can be executed using value (when the block takes no argument), value: (when the block requires one argument), value:value: (for two arguments), value:value:value: (for three) and valueWithArguments: anArray (for more arguments).

These messages are the basic and historical API for block execution.

[ 1 + 2 ] value
>>> 3
[ :y | y + 2 ] value: 5
>>> 7
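The other members of the value family follow the same pattern. A short sketch with blocks made up for the illustration:

```smalltalk
"Two and three arguments."
[ :x :y | x + y ] value: 1 value: 2. "answers 3"
[ :x :y :z | x + y + z ] value: 1 value: 2 value: 3. "answers 6"

"Any number of arguments, passed as an array."
[ :a :b :c :d | a + b + c + d ] valueWithArguments: #(1 2 3 4). "answers 10"

"numArgs tells how many arguments a block expects."
[ :x :y | x + y ] numArgs. "answers 2"
```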

Some handy extensions

Beyond the value messages, Pharo includes some handy messages
such as cull: and friends to support the execution of blocks even
in the presence of more values than necessary.
cull: will raise an error if the receiver requires more arguments than provided.

The valueWithPossibleArgs: message is similar to cull: but takes
an array of parameters to pass to a block as argument.
If the block requires more arguments than provided, valueWithPossibleArgs:
will fill them with nil.

[ 1 + 2 ] cull: 5
>>> 3
[ 1 + 2 ] cull: 5 cull: 6
>>> 3
[ :a | 2 + a ] cull: 5
>>> 7
[ :a | 2 + a ] cull: 5 cull: 3
>>> 7
[ :a :b | 1 + a + b ] cull: 5 cull: 2
>>> 8
[ :a :b | 1 + a + b ] cull: 5
>>> error because the block needs 2 arguments.
[ :a :b | 1 + a + b ] valueWithPossibleArgs: #(5)
>>> error because 'b' is nil and '+' does not accept nil as a parameter.

Other messages

Some messages are useful to profile execution:

  • bench. Return how many times the receiver block can be executed in 5 seconds.
  • durationToRun. Answer the duration (instance of class Duration) taken to execute the receiver block.
  • timeToRun. Answer the number of milliseconds taken to execute this block.
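A minimal sketch of the profiling messages follows. No results are shown because they depend on the machine, and the exact kind of object answered (a number, a Duration or a printed report) may vary between Pharo versions.

```smalltalk
"How many times can the block be evaluated in the benchmark period?"
[ 100 factorial ] bench.

"How long does a single evaluation take?"
[ 1000 factorial ] timeToRun.
```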

Some messages are related to error handling; we will cover them when we present exceptions.

  • ensure: terminationBlock. Execute the termination block after evaluating the receiver, regardless of whether the receiver’s evaluation completes.
  • ifCurtailed: onErrorBlock. Execute the receiver, and, if the evaluation does not complete, execute the error block. If evaluation of the receiver finishes normally, the error block is not executed.
  • on: exception do: catchBlock. Execute the receiver. If an exception matching exception is raised, execute the catch block.
  • on: exception fork: catchBlock. Execute the receiver. If an exception matching exception is raised, fork a new process, which will handle the error. The original process will continue running as if the receiver evaluation finished and answered nil: an expression like [ self error: 'some error' ] on: Error fork: [ :ex | 123 ] will always answer nil to the original process. The context stack, starting from the context which sent this message to the receiver and up to the top of the stack, will be transferred to the forked process, with the catch block on top. Eventually, the catch block will be executed in the forked process.
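The first three messages can be tried directly in a Playground. This is only a sketch with made-up blocks; ZeroDivide is the exception raised by a division by zero.

```smalltalk
"on:do: answers the value of the catch block when the exception is caught."
[ 1 / 0 ] on: ZeroDivide do: [ :ex | 0 ]. "answers 0"

"ensure: runs the termination block and still answers the receiver's value."
[ 10 ] ensure: [ Transcript show: 'cleanup'; cr ]. "answers 10"

"ifCurtailed: runs its argument only when the receiver does not complete."
[ [ 1 / 0 ] ifCurtailed: [ Transcript show: 'curtailed'; cr ] ]
	on: ZeroDivide
	do: [ :ex | -1 ]. "answers -1 and shows 'curtailed' on the Transcript"
```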

Some messages are related to process scheduling. We list the most important ones. Since this Chapter is not about concurrent programming in Pharo, we will not go deep into them but you can read the book on Concurrent Programming in Pharo available at http://books.pharo.org.

  • fork. Create and schedule a Process evaluating the receiver.
  • forkAt: aPriority. Create and schedule a Process evaluating the receiver at the given priority. Answer the newly created process.
  • newProcess. Answer a Process evaluating the receiver. The process is not scheduled.
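A short sketch of the three scheduling messages. The blocks only write to the Transcript, and the order in which the lines appear depends on the scheduler.

```smalltalk
| process |
"fork creates the process and schedules it right away."
[ Transcript show: 'forked'; cr ] fork.

"forkAt: does the same at an explicit priority."
[ Transcript show: 'background'; cr ]
	forkAt: Processor userBackgroundPriority.

"newProcess only creates the process; resume schedules it."
process := [ Transcript show: 'resumed later'; cr ] newProcess.
process resume.
```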

In the next post we will present some small experiments to show how blocks capture variables.

On the edge of class rules

Pharo has one simple basic rule: everything is an object. But the objects themselves are not entities living in an abstract universe and do not drink the dew of lilies of the valley for breakfast. They exist in the object memory served by the virtual machine. The virtual machine defines an interface that strictly specifies what the objects have to look like and which rules they need to follow.

For objects, the number of rules is very small: basically just one. Pharo is a class-based object system that requires every object to have an assigned class. Every object is an instance of some class.

Classic Class rules

For classes, the set of rules is much larger. Classes are the source of the behaviour of objects, and they shape how the objects will look physically in memory. For this reason, every class needs to be an object with at least three instance variables. The placement and order of these variables are strictly given. These variables are:

  • superclass – can be nil for the class hierarchy roots
  • methodDict – a dictionary of method name symbols and associated compiled methods. It really needs to be a dictionary, and because this data structure is so essential for the virtual machine, the internal structure of it is strictly defined as well
  • format – an integer that encodes the kinds and numbers of variables

These three instance variables are defined in the class named Behavior that is the basic class of all classes in Pharo. If you try to add a new instance variable that will change the order of these three, the virtual machine crashes because it changes the expected layout of the classes in memory. Even if you try to define a superclass of Behavior and add a new instance variable there, it has the same fatal result because it modifies the memory layout as well. 
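These rules can be observed reflectively. The following sketch only sends accessor messages; results that may vary between images are left without an expected answer.

```smalltalk
"Every object has a class, and every class is itself an object."
3 class. "SmallInteger"
SmallInteger class class. "Metaclass"

"The superclass chain ends in nil at a hierarchy root."
ProtoObject superclass. "nil"

"The instance variables declared by Behavior."
Behavior instVarNames.

"The format integer and the method dictionary of a class."
Object format.
Object methodDict.
```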

Classes themselves are objects too, so they need to follow the same rules. Most of the classes have a name and are nicely placed in the Pharo class hierarchy. They are mentioned in the system dictionary (a namespace) and the system browser can work with them. But not all classes need to be like that. Pharo provides an option to create so-called anonymous classes that behave more like standard objects. You can inspect them, but the other development tools do not know about them. They have various uses, mostly in special and strange cases, but they may be handy especially during testing because such classes are not registered in the system, do not need to have a unique name and are very easy to discard. They are created using the message newAnonymousSubclass that you send to an existing class.

aClass := Object newAnonymousSubclass.
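As a small illustration of the testing use case, an anonymous class can receive methods and be instantiated like any other class. The selector answer is made up for this sketch.

```smalltalk
| aClass |
aClass := Object newAnonymousSubclass.

"Compile a method into the anonymous class."
aClass compile: 'answer ^ 42'.

"Its instances understand the new message."
aClass new answer. "answers 42"
```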

We said that objects are instances of classes and classes are objects. If you think about such rules while drinking something uplifting, you may ask yourself: Is it possible to have an object that is an instance of itself? Can an object be its own class? Does Pharo allow such an edge case? And if yes, is it of any use?

Funky Classes Objects

How to try it? The most straightforward way would be to create an object (e.g., as an instance of the class Object) and then assign itself as its class. But here you quickly hit two problems. First, as we said, the virtual machine has some restrictions on the memory layout of classes that you need to follow. The second problem is that Pharo objects have no direct way to change their class to another one. They do not understand any message like class: that would do this job.

Surprisingly, both of these problems are quite easily solvable! Let’s see how.

First, we will create an anonymous class. Unlike most anonymous classes, ours will have Class as its superclass (an anonymous metaclass :)). That means that such an anonymous class will define the same instance variables as Class: the three instance variables that make the virtual machine happy, plus some other variables that make the tools happy.

aClass := Class newAnonymousSubclass.

Such a class looks pretty standard in the Inspector.

Then we create an instance of such a class. 

anObject := aClass basicNew.

Inspecting this object does not offer a very satisfying view. All instance variables reference the default value, nil. Moreover, the print string is broken.

There is an easy fix of this shattering state. We will just copy all instance variables from the original anonymous class.

aClass allInstVarNames do: [ :instVarName |
  anObject instVarNamed: instVarName put: (aClass instVarNamed: instVarName) ].

 Now the object looks the same way as the anonymous class in the Inspector, so there is no need to put the picture here again. They look the same way, but they are not the same objects. And, we still haven’t reached our goal to create an object that is its own class.

(anObject == aClass)
>>> false.
(anObject class == anObject)
>>> false.

But we are close. Now, the object is an instance of our anonymous class and the object itself is already quite a suitable class.

(anObject class == aClass) 
>>> true.

So we may do the same step that Pharo does when it migrates objects to a new class, for example when you change the structure of the original class. We may let our anonymous class become the desired object.

aClass becomeForward: anObject.

That updates even the low-level pointer in the object memory that defines the class of our object. It was pointing to the anonymous class, which now becomes our object. The original anonymous class will cease to exist; we do not need it anymore.

(anObject class == anObject) 
>>> true.

Mission accomplished! We have created an object that is its own class.

Now, we can start to play with it. It is a class so we can add methods to it, which can be useful for creating mocks. 

anObject compile: 'ultimateAnswer ^ 42'.

And the object will understand it, as the only object in the object memory.

anObject ultimateAnswer 
>>> 42

So what?

Great, but is it useful? It can be. This technique is not something new. I have seen it in the Prototypes package by Russell Allen from 2005, which allowed a prototype-based approach in a class-based system. His original implementation cannot work in Pharo; however, the principle stays. Like JavaScript and unlike Self, it allows only a single target of delegation, here named superclass.

It is more an interesting oddity than something that you would use every day in your programs. But I can imagine it being useful for mocking, in some debugging scenarios or for modeling programs. Last but not least, it is a nice and amusing feature that demonstrates the flexibility of the object model of Pharo.

Here is the entire code:

aClass := Class newAnonymousSubclass.
anObject := aClass basicNew.
(aClass allInstVarNames) do: [ :instVarName |
    anObject instVarNamed: instVarName put: (aClass instVarNamed: instVarName) ].
aClass becomeForward: anObject.


We had some fun with classes, and we hope that you enjoyed it. If you want a more conceptual understanding of objects, classes and metaclasses, you should read the little book ‘A simple reflective object kernel’ http://books.pharo.org/booklet-ReflectiveCore/ . It is not about Pharo, but it shows in Pharo how to define a minimal reflective kernel, and you can learn a lot of fun aspects of OOP (like instantiation, allocation, lookup,…) in the process.

Implementing Indexes – Who uses all my memory

This is the second entry of the series about implementing full-text search indexes in Pharo. The first entry is Implementing Indexes – A Simple Index. You can start reading from this post, but there are things that we have presented in the previous one. For example, how to install the Trie implementation.

In the first entry, we saw how to create a Trie. Today we are going to analyze the drawbacks of this first implementation and how to improve it to get an acceptable ratio between used space and access time.

Let’s start with a simple example, let’s create a Trie with all the class names, so we can query them.

behaviors := CTTrie new. 
SystemNavigation default allBehaviorsDo: [ :aBehavior | 
       behaviors at: aBehavior name put: aBehavior name ].

In this example, we create a new Trie that indexes the names of all the behaviors in the system, and we store the name of each behavior as its value in the trie. This looks silly, but the reason will become clear later: we want to analyze the performance of the data structure, so we use the simplest payload we can have.

This simple index allows us, as we have already seen before, to find all the classes that start with a given prefix.

behaviors allValuesBeginningWith: 'T'. 
behaviors allValuesBeginningWith: 'Obj'. 
behaviors allValuesBeginningWith: 'Spt'.

Now, we are exactly where we have been before. Let’s now analyze the memory footprint of our trie and where we can improve it.

Analyzing a Graph of Objects

As we all know, the programs and data in Pharo are represented by graphs of objects. Each object has a series of references to other objects, and they collaborate to fulfill a mission. One possible way of looking at it is a static analysis. Let’s start with a class diagram.


This diagram does not help at all, as we only can see that the trie has a node, and the nodes have a value, a collection of children and a character. This does not help to see the memory occupied by the graph.

We need to have a dynamic view of the graph.

The first approach that we can try is to use the inspector, and use all the advantages of the Pharo live environment.

This is a really powerful tool to explore and to understand data structures and complex object graphs. However, we are facing a problem: our trie has 18.484 elements, so analyzing the size of the Trie with all its instances by hand is not easy. We need another tool first; afterwards we will come back to the inspector, because it is so powerful for navigating the graph at our fingertips.

For this, we are going to analyze the statistics of a given graph of objects. We want to know:

  • Number of instances
  • How much space do they occupy
  • And to see how these numbers are related.

A custom tool for this particular problem

A nice way of working that you see a lot in machinists and carpenters is that they build custom tools for very specific problems. Investing a little time in a small tool that serves a particular problem provides better results than trying to apply a tool that tries to solve all problems in the world.

So we are going to use a little tool, 3 classes, that I have implemented in an hour or so. This tool takes a root object and traverses the graph, collecting all the objects reachable from it, and then calculates some statistics.

You can get it from github.com/tesonep/spaceAndTime.

So we are going to do:

stats := GraphSpaceStatistics new
	rootObject: behaviors;
	yourself.

stats totalSizeInBytes.
stats totalInstances.
stats statisticsPerClass.	
stats statisticsPerClassCSV.

And get statistics about memory usage. With the last expression, we can generate a CSV of the results of the analysis. I got this table with the results:

Class name           # Instances   Memory (bytes)   Memory (%)
Array                    159.568        7.694.592       36.46%
CTTrieNode               159.568        5.106.176       24.20%
IdentityDictionary       159.568        3.829.632       18.15%
Association              159.567        3.829.608       18.15%
ByteString                 9.242          349.872        1.66%
ByteSymbol                 9.242          293.928        1.39%
CTTrie                         1               16        0.00%
UndefinedObject                1               16        0.00%
Character                     64                0        0.00%
SmallInteger                  90                0        0.00%
Total                                  21.103.840      100.00%

Nice data… but we need to extract some knowledge from it.

First, we can start by reducing the noise in the table. Let’s analyze why we have the last four lines, which do not affect the result. UndefinedObject is an easy one. This is the class of nil, and as we only have one instance of it, it is easy to identify. We can see that this single instance occupies 16 bytes, which is the smallest size an object can occupy in Pharo. The same happens with our sole instance of CTTrie.

Then we have Character and SmallInteger, which occupy 0 bytes even though we have a lot of instances. That looks wrong. But no, this is not an error. These objects are immediate. An immediate object is stored encoded in the reference. When you have an object with references only to immediate objects, the only occupied space is the space where the reference is. You can read more about it in Section 6.10 of Pharo by Example (Page 112).
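These sizes can be checked by asking the objects themselves. This is a sketch that assumes a 64-bit image, where the smallest heap object takes 16 bytes.

```smalltalk
"Immediate objects are encoded in the reference and occupy no heap space."
42 sizeInMemory. "0"
$a sizeInMemory. "0"
SmallInteger isImmediateClass. "true"

"Ordinary objects occupy at least the minimal object size."
nil sizeInMemory. "16 on a 64-bit image"
Object new sizeInMemory. "16 on a 64-bit image"
```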

Now, let’s see the classes that have more impact. If we look at the first 4 rows in the table, we can see the number of instances of Array, CTTrieNode, IdentityDictionary, and Association. We have the same number of instances of each of them; the only difference is that we have one less for Association. This is not a coincidence; there is something there. If we inspect the nodes in the Trie, we can see that each node has an instance of IdentityDictionary to hold its children. That explains the relationship between each node and each dictionary. But where do the instances of Array and Association come from?

Inspecting an IdentityDictionary
With the Inspector we understand how the IdentityDictionary is built

Doing a deeper analysis with the inspector, we can see that an instance of IdentityDictionary holds an array of associations. So we have an Array per IdentityDictionary and an Association per node, as each entry in the Array is a child node represented by an association.
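The same structure can be observed on a small dictionary built by hand. This is a sketch: the key and value are made up, and array is the name of the instance variable that holds the backing store in Pharo's hashed collections.

```smalltalk
| children |
children := IdentityDictionary new.
children at: $a put: 'child node for a'.

"The backing store of the dictionary is an Array..."
(children instVarNamed: 'array') class. "Array"

"...and each entry is stored as an Association of key and value."
children associations first class. "Association"
```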

If our theory is correct, where is the missing Association? We have 159.568 nodes but one Association less. The explanation comes when we inspect the CTTrie. The CTTrie instance has a direct reference to the root node. As the root is not contained in a dictionary, it is not referenced by an association, and that explains the missing one.

Finally, to end our analysis, we can look at the instances of ByteSymbol and ByteString. We have exactly 9.242 instances of each of them. Why? We need to remember that our trie contains exactly 18.484 entries and that we are storing the names of the classes there. But the question is why some are instances of ByteSymbol and others are instances of ByteString. Again, we analyze the graph with the inspector. We see that, for example, we have the following elements in the Trie: #Object and ‘Object class’. So the names of the classes are Symbols and the names of the metaclasses are Strings. And of course, we know that each class in the system has an associated metaclass.
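This difference is easy to check on a single class and its metaclass, here Object:

```smalltalk
"The name of a class is a Symbol."
Object name. "#Object"
Object name class. "ByteSymbol"

"The name of its metaclass is a String built from the class name."
Object class name. "'Object class'"
Object class name class. "ByteString"
```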

Can we do better?

One of the more interesting numbers, which we have avoided so far, is the ratio between nodes and stored values. As we can see, we have 18.484 values stored in the trie, but for those we need 159.568 node instances: a ratio of about 9 nodes per value (159.568 / 18.484 ≈ 8.6). Why do we have so many nodes?

If we inspect the nodes, we see that there are lots of long chains of nodes with a single child and without a node value. Again, we need to take some measurements to understand the problem. We are going to measure something that we will call the occupation rate of the trie. I couldn’t find out whether this has been studied before, but it is helpful for my problem and for future optimization. We define it like this:

occupationRate = (nodes - intermediateNodes) / nodes

The nodes value is the number of nodes in the trie, and the intermediateNodes value is the number of nodes that have only one child and do not contain a value. This rate gives us an idea of how much of the trie is holding information. In our case, the value is very small: 13%.
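The formula translates directly into a block. The numbers below are hypothetical and chosen only to give a rate of 13%; they are not the counts measured on the trie.

```smalltalk
| occupationRate |
occupationRate := [ :nodes :intermediateNodes |
	(nodes - intermediateNodes) / nodes ].

"A hypothetical trie with 100 nodes, 87 of them intermediate."
(occupationRate value: 100 value: 87) asFloat. "0.13"
```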

Thinking about this problem, we got to the root of it (or in this case the branches 🙂 ). The problem is how the trie is structured to store the data. Let’s see a small example with the keys “aaaaa” and “aaaab”. The resulting trie is something like the following picture


In this diagram, it is clear that all the colored nodes are internal nodes. They have only one child and they don’t hold a value. It is also clear that this structure should be compressed in a way that reduces those nodes to a single entity, with the desired result of reducing the 4 of them to a single one.

Another elephant in the room is the space occupied by the use of an IdentityDictionary to hold the children of a node. Around 72% of the memory impact is related to the decision of using an IdentityDictionary. Can it be done using a better strategy?


We are at a great moment in the series. It looks like we have spent a lot of time understanding the problem and still don’t have anything.

This is not true: we have a lot. We have a way of measuring the performance of our solution, we understand where these numbers come from, and every step that we take from now on can be validated against the baseline we have.

The first step in improving the performance of any solution is to measure. If we don’t measure, we cannot compare; if we cannot compare, we make modifications blindly. Optimizing without measuring is like developing without testing… and you know what we think about that.

A Floating GCC Optimization

A couple of months ago, while debugging the build of the VM, I found a problem with the optimizations of GCC 8.3 (I am not sure which other versions may show it). Basically, this GCC version does not compile one of our virtual machine’s functions well, because it unexpectedly removes what looks like valid code. The affected code in the VM was the following:

Spur64BitMemoryManager >> fetchLong32: fieldIndex ofFloatObject: oop
    "index by word size, and return a pointer as long as the word size"
    | bits |

    (self isImmediateFloat: oop) ifFalse:
        [^self fetchLong32: fieldIndex ofObject: oop].
    bits := self smallFloatBitsOf: oop.
    ^self
        cCode: [(self cCoerceSimple: (self addressOf: bits) to: #'int *') at: fieldIndex]
        inSmalltalk: [
            self flag: #endian.
            fieldIndex = 0
                ifTrue: [bits bitAnd: 16rFFFFFFFF]
                ifFalse: [bits >> 32]]

This method is used to extract the two 32-bit words that are stored in a Float. It handles both the case where the Float number is an immediate value and the case where it is not. The problem is in the part that handles the immediate value. In the Smalltalk simulator, it performs a #bitAnd: to extract the first Int32 and a bit shift (#>>) to extract the second one.

As you can see, in the last part of the method, there is a conditional that handles a special case differently in the Pharo simulated code and in the translated C code. The C code was generated from this expression:

(self cCoerceSimple: (self addressOf: bits) to: #'int *')      at: fieldIndex

And the generated C code was the following:

 ((int *)(&bits))[fieldIndex];

This accesses the memory of the bits variable through indirect memory access. Somehow, this code pattern was marking all uses of the variable bits as dead code, and so that code was being removed. Detecting the problem was not easy: I used the Pharo tests to find the failing function, and then, to understand how GCC generated the code, I had to debug its generation. After some digging, I understood that the problem is related to undefined behavior and to the aliasing of variables in the stack, which you can read more about in here and here. Along the way I learned how to debug GCC optimizations, and that is what this post is about. I hope somebody finds it interesting and useful.

Starting from a simple reproducible case

I first managed to create a simpler case showing the problem:

#include <stdio.h>

long __attribute__ ((noinline)) myFunc(long i, int index){
  long v;
  long x = i >> 3;
  v = x;
  /* reading a long through an int pointer: this is the problematic access */
  return ((int*)(&v))[index];
}

int main(){
   long i;
   int x;

   scanf("%ld", &i);
   scanf("%d", &x);

   printf("%ld", myFunc(i, x));
   return 0;
}

This small function takes a long value (a 64-bit integer), shifts it, and stores the result in a temporary variable (a variable that lives in the stack of the function). Finally, it performs the same indirect access that we have seen in the VM code showing the error.

So, to continue the investigation, let’s compile the function with different optimization levels: the higher the level, the more optimizations the compiler performs.

Understanding the Level 1 generated code

When using the optimization level 1 (-O1) it generates the following code:

    sarq $3, %rdi
    movq %rdi, -8(%rsp)
    movslq %esi, %rsi
    movslq -8(%rsp,%rsi,4), %rax

Before continuing, let’s analyze this piece of code. We see only the code of the function myFunc. To understand it, we need a little background on how parameters are passed in the System V x86_64 calling convention (in this case, on a 64-bit Linux). Basically, the register rdi is used for the first parameter, the register rsi is used for the second parameter, and the register rax holds the return value. The other important register here is rsp, which holds the address of the top of the execution stack.

However, using the optimization level 2 (-O2) generates a much more compact version:

    movslq %esi, %rsi
    movslq -8(%rsp,%rsi,4), %rax

As you can see, the more optimized version is losing the bit shift and the first mov. It keeps only the final access to the memory and the return.

Generating the snippets

The first snippet is generated using:

$ gcc -O1 prb.c -S -o prb.s -Wall

and the second using:

$ gcc -O2 prb.c -S -o prb.s -Wall

Debugging GCC steps

The GCC compiler is implemented as a pipeline of transformations. Each transformation takes the previous result and applies one of the steps of the process. When you configure different optimization levels and compilation options, different pipelines are executed.

The GCC pipeline has many different steps, so I focused on the tree optimizations. To allow debugging, GCC can dump each of the intermediate states of the compilation when given the right options. The intermediate transformations work with C-like trees. For example, to get the optimized version of the program, just before the assembly is generated, we have to run:

$ gcc -O2 prb.c -S -o prb.s -Wall -fdump-tree-optimized=/dev/stdout

This will generate:

myFunc (long int i, int index)
 long int v;
 long unsigned int _1;
 long unsigned int _2;
 int * _3;
 int _4;
 long int _8;

<bb 2> [local count: 1073741825]:
 _1 = (long unsigned int) index_7(D);
 _2 = _1 * 4;
 _3 = &v + _2;
 _4 = *_3;
 _8 = (long int) _4;
 v ={v} {CLOBBER};
 return _8;

This is the optimized SSA (static single assignment) version of the function. As you can see, in this version the code is already optimized, and our operations are missing. I expected to see something like:

myFunc (long int i, int index)
 long int x;
 long int v;
 long unsigned int _1;
 long unsigned int _2;
 int * _3;
 int _4;
 long int _10;

<bb 2> [local count: 1073741825]:
 x_6 = i_5(D) >> 3;
 v = x_6;
 _1 = (long unsigned int) index_9(D);
 _2 = _1 * 4;
 _3 = &v + _2;
 _4 = *_3;
 _10 = (long int) _4;
 v ={v} {CLOBBER};
 return _10;


In the first SSA version, we are lacking the shift operation:

x_6 = i_5(D) >> 3;
v = x_6;

So, we need to see which of the optimizations and transformations is losing our code. To see all the steps we should execute:

$ gcc -O2 prb.c -S -o prb.s -Wall -fdump-tree-all

This will generate a lot, really a lot, of files in the same directory, named prb.c.xxx.yyyy, where xxx is the number of the step and yyyy is the name of the step. Each file contains the result of applying that step, so I did a binary search over the files to find where my code was lost.

I found that the problem was in the first dead code elimination step (prb.c.041t.cddce1). GCC does not like that we are copying a stack variable and then accessing it directly as memory:

v = x;
return ((int*)(&v))[index];

So, basically, it was assuming that we were only using v and not x, so all the code modifying x was discarded (this optimization is called “dead store code deletion”). It tries to remove all the code that affects stack (temporary) variables that are never used.

Solving the problem

I fixed the problem by writing the code in a different way. Actually, it was only necessary to remove the special C case in the original code, and let the Slang C code translator do its thing.

Spur64BitMemoryManager >> fetchLong32: fieldIndex ofFloatObject: oop
    "index by word size, and return a pointer as long as the word size"
    | bits |

    (self isImmediateFloat: oop) ifFalse:
        [^self fetchLong32: fieldIndex ofObject: oop].
    bits := self smallFloatBitsOf: oop.
    ^ fieldIndex = 0
          ifTrue: [bits bitAnd: 16rFFFFFFFF]
          ifFalse: [bits >> 32]
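
To make the idea concrete in plain C, here is a minimal sketch of two ways to read the 32-bit halves of a 64-bit value without reading an object through a pointer of an incompatible type. This is not the code generated by Slang: the function names are hypothetical, and the sketch uses unsigned fixed-width types, unlike the VM code above.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Shift-and-mask version: mirrors the Smalltalk branch of the method.
   Index 0 is the low 32 bits and index 1 the high 32 bits, whatever
   the byte order of the platform. */
static uint32_t half_by_shift(uint64_t bits, int index) {
    assert(index == 0 || index == 1);
    return index == 0 ? (uint32_t)(bits & 0xFFFFFFFFu)
                      : (uint32_t)(bits >> 32);
}

/* memcpy version: copies the bytes into an array of the target type, so
   the 64-bit object is never read through an int pointer. Which half
   ends up at index 0 depends on the byte order of the platform. */
static uint32_t half_by_memcpy(uint64_t bits, int index) {
    uint32_t halves[2];
    assert(index == 0 || index == 1);
    memcpy(halves, &bits, sizeof halves);
    return halves[index];
}
```

The shift version is the closest to the fix above. The memcpy version keeps the memory-layout semantics of the original cast, which is why its result depends on endianness.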


I published a shorter version of this post on the Pharo mailing list when commenting on the bug.

I wanted to store this knowledge and share it before I eventually flush it from my mind. I like these things, but I do not debug the GCC optimization pipelines every day.

All in all, be careful about undefined behaviors!

Implementing an object-centric breakpoint with Reflectivity

In this post, we describe how to implement a breakpoint that affects only one specific object, using the Reflectivity framework.

What is an object-centric breakpoint?

Let’s take a simple example: imagine two Point instances p1 and p2.

p1 := 0@2.
p2 := 1@3.

Each of these points has two instance variables, x and y, that we can change by calling the setX:setY: method. Imagine that we have a bug related to point p1, and that we want to halt the execution when this object executes the setX:setY: method.

We definitely do not want to put a breakpoint directly in the setX:setY: method of class Point. Points are used all over the system: putting a breakpoint in class Point will halt whenever any point calls that method, and the image will freeze.

What we really need is a breakpoint that halts the execution only when setX:setY: is called on p1.

The halt-on-call breakpoint

This breakpoint is an operator called halt-on-call, defined by Jorge Ressia in his Object-Centric debugger. It halts whenever one specific object receives a given message. This is what we need! Ideally, we would like to use an API like this:

p1 haltOnCall: #setX:setY:

This installs a breakpoint that halts the method setX:setY: for the point p1 exclusively. By extension, all objects should benefit from that API, so that we can install object-centric breakpoints on any kind of object. Now let’s implement it.

Implementing the halt-on-call breakpoint API

Let’s define an interface for this operator in the Object class, to make it available for all objects. Let’s call it haltOnCall:. It takes a method selector as parameter, which defines the selector of the method to halt. We delegate the definition and the installation of the breakpoint instrumentation to another object named ObjectCentricInstrumenter:

Object >> haltOnCall: methodSelector
  ^ ObjectCentricInstrumenter new halt: methodSelector for: self

This interface installs a halt-on-call breakpoint on its receiver, and returns the object modeling that instrumentation. It is very important to keep a reference to that instrumenter object if we want to uninstall our breakpoint later. This would typically be handled by a higher level tool such as a real debugger.

This method is now our top-level debugging API, available for all objects in the system. Using this API, we can now ask any object to halt when it receives a particular message. Now, we have to create the ObjectCentricInstrumenter class and implement the halt:for: method that is used to instrument the receiver in the code above.

Implementing an object-centric instrumenter using Reflectivity

We use the Reflectivity framework as a support to implement object-centric breakpoints.

What is Reflectivity?

Reflectivity is a reflective framework shipped in the base Pharo distribution. It features annotation objects named Metalink that apply reflective operations at the sub-method level (i.e., at the level of sub-expressions of a method).

A metalink is an annotation of a method AST. It is an object that defines a message selector, a receiver named meta-object, and an optional list of arguments. At run time, when the code corresponding to the annotated AST is reached, the metalink is executed. The message corresponding to the selector is sent to the meta-object, with the previously computed argument list: the corresponding method is executed in the meta-object.

For example, adding logging to Point>>setX:setY: with Reflectivity goes as follows:

  1. Instantiate a metalink
  2. Define a meta-object, for example, a block:
    [Transcript show: 'Hello World']
  3. Define a message selector that will be sent to the block at run time, for example: value
  4. Attach the metalink to the AST node of a method, for example the method setX:setY: of Point

At run time, each time a point executes setX:setY:, the metalink first executes and sends value to the block object [Transcript show: 'Hello World'], resulting in the execution of the block. Then the execution of setX:setY: continues.

Now, back to our original problem, Reflectivity supports object-centric metalinks. This means we can actually scope a metalink to a single, specific object. We will use this feature to implement our object-centric instrumenter and define our object-centric breakpoint.

More details about Reflectivity are available in the latest Reflectivity paper.

Implementing the object-centric instrumenter

Now, we have to create the ObjectCentricInstrumenter class and implement the halt:for: method. This class has three instance variables:

  • targetObject: the target object affected by instrumentation
  • metalink: the instrumentation per se, that is, a metalink
  • methodNode: the AST node of the method we instrument

Object subclass: #ObjectCentricInstrumenter
  instanceVariableNames: 'targetObject metalink methodNode'
  classVariableNames: ''
  package: 'Your-Pharo-Package'

In this class, we have to define how we install the halt-on-call breakpoint on our object. This is done through the halt:for: method. This method takes two parameters: the message selector of the method that will halt and the target object that the breakpoint will affect.

ObjectCentricInstrumenter >> halt: methodSelector for: anObject
  targetObject := anObject.
  metalink := MetaLink new
    metaObject: #object;
    selector: #halt.
  targetObject link: metalink toMethodNamed: methodSelector

First, we store the target object (line 2). We need to keep a reference to that object to uninstall the breakpoint later. Then, we configure a metalink to send the #halt message to #object (lines 3-5). At run time, #object represents the receiver of the method that is currently executing. This instrumentation is equivalent to inserting a self halt instruction at the beginning of the instrumented method. Finally, we use Reflectivity’s object-centric API (line 6) to install the breakpoint metalink on the target method, but only for the target object.

When that is done, targetObject will halt whenever it receives the message methodSelector. All other objects from the system remain unaffected by that new breakpoint.

Using our object-centric breakpoint

Now that we have implemented our object-centric breakpoint, we can use it. We instrument our point p1 with an object-centric breakpoint on the setX:setY: method (line 3). We store the instrumenter object in the instrumenter variable so that we can reuse it later. Calling setX:setY: on p1 will now halt the system, while calling it on p2 or any other point will not halt.

p1 := 0@2.
p2 := 1@3.
instrumenter := p1 haltOnCall: #setX:setY:.
p1 setX: 4 setY: 2. "<- halt!"
p2 setX: 5 setY: 3. "<- no halt"

After debugging, we will probably need to uninstall our breakpoint. As we kept a reference to the instrumenter object, we can use it to change or to remove the instrumentation it defines.

Let’s first define an uninstall method in the ObjectCentricInstrumenter class. This method just calls the uninstall behavior of the metalink, removing all instrumentation from the target object.

ObjectCentricInstrumenter >> uninstall
  metalink uninstall

Our little example script now becomes:

p1 := 0@2.
p2 := 1@3.

instrumenter := p1 haltOnCall: #setX:setY:.
p1 setX: 4 setY: 2. "<- halt!"
p2 setX: 5 setY: 3. "<- no halt"

instrumenter uninstall.
p1 setX: 4 setY: 2. "<- no halt"
p2 setX: 5 setY: 3. "<- no halt"

More object-centric debugging tools!

We showed how to implement a breakpoint that halts when one single, specific object receives a particular message. And it was simple!

But why stop there? The Pharo reflective tools provide us with much more power! In our next blog posts, we’ll show how we can implement a breakpoint that halts when the state of one specific object is accessed.