Part II – The Return Journey
Welcome to the next step in the compiler journey.
As a simple recap, we were compiling
baz := 42 in a
bar, except that
baz is not declared.
We are currently in
OCSemanticWarning>>#defaultAction, the default action
of an uncaught Notification, that is ready to open a graphical menu by
The method is long; let us review it in small pieces.
openMenuIn: aBlock | alternatives labels actions lines caption choice name interval requestor |
A bunch of temporary variables.
"Turn off suggestions when in RubSmalltalkCommentMode This is a workaround, the plan is to not do this as part of the exception" requestor := compilationContext requestor. ((requestor class name = #RubEditingArea) and: [ requestor editingMode class name = #RubSmalltalkCommentMode]) ifTrue: [ ^UndeclaredVariable named: node name ].
These are some type checks. Type checks are usually bad. Those are
They prevent the menu thing if the requestor is a
RubEditingArea in a
RubSmalltalkCommentMode is used?
ClyRichTextClassCommentEditorToolMorph, which seems
never used in the system (dead class?)
is called only by
FileList, a basic file explorer, but it’s
not clear when or why the compiler is called by the file explorer, nor
why a “workaround” is needed here (quite deep) in the compiler
especially since it’s the only workaround of this type in the whole
source code for
It could be just dead code, so less problematic: a dead workaround is
a less technical dept than a live one. It also illustrates why type
checks can be bad; it reverses the responsibility:
here are not involved at all in the workaround, so code evolution
related to one of these two classes might likely miss the present
Moreover, such a workaround is fragile. The compiler should not care
about specific clients, and especially not care about their names, and
should behave equitably. E.g. imagine renaming classes or using
subclasses of the blacklisted ones, they could likely pass the check and
cause really subtle bugs.
Anyway, let us continue with the method:
interval := node sourceInterval. name := node name. alternatives := self possibleVariablesFor: name. labels := OrderedCollection new. actions := OrderedCollection new. lines := OrderedCollection new.
All those are the initialization of the temporary variables.
provides a list of existing names usable as a replacement (sorted from
the best match to the worst match). See
String>>#correctAgainst:continuedFrom: for the
details and the scoring system.
name first isLowercase ifTrue: [ labels add: 'Declare new temporary variable'. actions add: [ self declareTempAndPaste: name ]. labels add: 'Declare new instance variable'. actions add: [ self declareInstVar: name ] ]
The two first items of the menu. If the name looks like a temporary
or an instance variable, because it starts with a lowercase letter, then
maybe the programmer wants a new temporary or instance variable?
Note that there are two parallel lists, one of the labels (shown in
the menu) and the other of actions (here blocks), that is evaluated if
the user chose the corresponding label.
In the scenario, the option
Declare new temporary variable is selected, so
self declareTempAndPaste: name is eventually called. We
detail it in the next section. For now, we continue to read the
ifFalse: [ labels add: 'Leave variable undeclared'. actions add: [ self declareUndefined ]. lines add: labels size. labels add: 'Define new class'. actions add: [ [ self defineClass: name ] on: Abort do: [ self openMenuIn: aBlock ] ]. labels add: 'Declare new global'. actions add: [ self declareGlobal ]. compilationContext requestor isScripting ifFalse: [labels add: 'Declare new class variable'. actions add: [ self declareClassVar ]]. labels add: 'Define new trait'. actions add: [ [ self defineTrait: name ] on: Abort do: [ self openMenuIn: aBlock ] ] ].
For names that start with an uppercase, they look like global
variables, and that includes all the named classes, so the proposed
items in the menu are different. A first curiosity, there is the choice
to “leave variable undeclared” that is absent in the previous code
snippet. Another curiosity, defining a new class (or a new trait) opens
a new window, but if, for some reason, the entity creation fails or is
canceled, then a recursive call is used to open the same menu again.
lines add: labels size. alternatives do: [ :each | labels add: each. actions add: [ ^self substituteVariable: each atInterval: interval ] ]. lines add: labels size. labels add: 'Cancel'. caption := 'Unknown variable: ' , name , ' please correct, or cancel:'.
We have the addition of the possible variables (computed at the
beginning of the method), a cancel item, and the window title. The next
last two lines are the fun ones.
choice := aBlock value: labels value: lines value: caption. ^choice ifNotNil: [ self resume: (actions at: choice ifAbsent: [ compilationContext failBlock value ]) value ]
aBlockis the parameter of the method, it was more
than 50 lines ago, so we almost forgot about it. It is always
[:labels :lines :caption | UIManager default chooseFrom: labels lines: lines title: caption]
that just calls the UI and returns the index number of the selected item
starting at 1 (or 0 if the cancel button is used).
- The selector
value:value:value:is used to evaluate
the block with 3 supplied arguments (it is a Pharo thing, do not
^choice ifNotNil: ...returns nil if the choice is
nil (unlikely according to the API of
better safe than sorry). In the scenario, the first choice is selected
(declare new temporary variable). Therefore choice is 1, which is not
nil, so we look at the
self resume:cause the
is execution with the given value. Hopefully, a
object to bind to
baz(look back at the section
need to see the original
signalmethod invocation). Here,
the call to
resumefeels superfluous as the result of the
current method is used as the result of
is used as the value of the automatic
actions at: choicereturn the action (the block)
associated to the corresponding choice number. Ordered collections in
Pharo are 1-based; therefore 1 is the first block action. Here, the
[ self declareTempAndPaste: name ].
ifAbsent:is for what to do when there is no
corresponding action for the given choice. This happens when the user
chooses the cancel button (no action for 0) or chooses the cancel item
(no action for 3 in our scenario).
compilationContext failBlock valueis therefore
executed on a “cancel”. It evaluates the failBlock that, in the
scenario, comes from the
method and contains
[ ^ nil ](a non-local return).
Evaluating this failBlock cause the unwinding of many methods in the
call stack (something around 30 or 40 frames) and the return of the
ClassDescription>>#compilemethod with nil.
Note that there is a potential weakness here if the failBlock does
not perform a non-local return, then the result of the block evaluation
is used as the return of
openMenuIn:and eventually used as
Variableobject to bind
bazto. Callers of
the compiler might forget to do that and just provide
[nil], for instance (without a
valueevaluates the action block (since it exists in
the list), that has the responsibility to provide a
We selected “declare new temporary variable” in the menu, thus
executing this method. We’ll cover this large method (35 lines) piece by
declareTempAndPaste: name | insertion delta theTextString characterBeforeMark tempsMark newMethodNode |
Some temporary variables.
"Below we are getting the text that is actually seen in the morph. This is rather ugly. Maybe there is a better way to do this." theTextString := self requestor textMorph editor paragraph text.
Indeed, this is rather ugly. This leads to many
- Why is the text (source code) of the method
- Why does it assume that the requestor has a
- Why ask for something so deep? Demeter is likely rolling over in its
grave (it’s a joke on the Law of Demeter. Demeter is not dead
and is not even a person, it was a project named after the Greek goddess
- Why? We are still in a (deep) part of the compiler,
selfshould have a better way to get the source code
"We parse again the method displayed in the morph. The variable methodNode has the first version of the method, without temporary declarations. " newMethodNode := RBParser parseMethod: theTextString.
Let us take a breath.
We are doing a semantic analysis on an already parsed source code of
bar trying to get a variable to bind to
baz. And we parse the full source code again? Don’t we have
it? Just call
self node methodNode or something?
The hint might be “without temporary declarations” from the
comment. Does that mean we do not trust the current AST to be genuine?
Why? Maybe the previous interactive code error reparation changed the
current AST? Is this actually true in some possible scenarios? Is this
just leftover code?
Let us just continue… we must continue…
"We check if there is a declaration of temporary variables" tempsMark := newMethodNode body rightBar ifNil: [ self methodNode body start ].
It’s getting warm here, isn’t it?
newMethodNode body rightBargets the position (an
integer) of the closing
|character of the temporary
variable declaration syntax, or nil if there is no temporary variable
declared (like in the current scenario) an AST is useful for this task
since it knows which part of the source code is really a block of
temporary variable declarations.
self methodNode body startis the position (an integer)
of the beginning of the main body of the method, that position is
therefore used when there are no declarations of temporary
characterBeforeMark := theTextString at: tempsMark-1 ifAbsent: [$ ].
gets the character before the closing
| or before the
main body. The
ifAbsent might only occur if the source code
is empty, and the compiler let us progress until here because an empty
method is a syntax error, a method name is minimally needed (the name
(selector) and parameters are called the “method pattern” in Pharo
parlance). But better safe than sorry.
(theTextString at: tempsMark) = $| ifTrue: [ "Paste it before the second vertical bar" insertion := name, ' '. characterBeforeMark isSeparator ifFalse: [ insertion := ' ', insertion]. delta := 0. ] ifFalse: [
Some temporary variables are declared, and we want to add the new
variable after the last one. The code mainly manages spacing to avoid
concatenating the new variable and a previous one, or injecting
In our scenario, there is no temporary variable (yet), so the
ifFalse: part interests us more.
"No bars - insert some with CR, tab" insertion := '| ' , name , ' |',String cr. delta := 2. "the bar and CR" characterBeforeMark = Character tab ifTrue: [ insertion := insertion , String tab. delta := delta + 1. "the tab" ] ].
Here we prepare the text to insert in the source code and compute a
delta thing, we’ll discuss that later. The code tries to
care about preserving the indentation, if any.
tempsMark := tempsMark + (self substituteWord: insertion wordInterval: (tempsMark to: tempsMark-1) offset: 0) - delta.
Err… it’s getting cold here, isn’t it?
self substituteWord: insertion wordInterval: (tempsMark to: tempsMark-1) offset: 0
asks to insert the new string in the source code (because the interval
is empty, it is an insertion and not a replacement).
do that? By simply calling
#correctFrom:to:with: on the
requestor and doing some math, then doing more math to update
" we can not guess at this point where the tempvar should be stored, tempvars vs. tempvector therefore -> reparse" (ReparseAfterSourceEditing new newSource: self requestor text) signal
And it is the end of the method. I think it’s getting humid here,
The code is altered. We have no idea what really happened, there is a
new source code in town. The full AST might need to be rebuilt as there
are new potential AST nodes. The semantic analysis might need to be
redone, as the new temporary variable might conflict with other
variables declared further in the code. So at this point, it seems
better to just call it a day and run the compilation again.
It’s the point of the
that is a subclass of
Notification (we are now experts in
notifications and no afraid no more of them!).
There are still some questions about the behavior of the program and
some of its design decisions:
- The math thing about
tempsMarkupdate is completely unused. Possible leftover of
previously removed code.
- Where does the signal on
- Why the new source code should be passed around in the notification?
We did already update it in the requester.
- Why does the ugly (indeed)
self requestor textMorph editor paragraph textat the
beginning of the method exists, since apparently
self requestor textgive the same damn source code (while
not ideal, it is still better).
- What happens when the notification is resumed? The point of a
notification is to be resumable. Here it clearly appears that such an
endeavor is not supposed to happen.
- Why so much coupling?
- Why so little cohesion?
- And more specifically, why it is the job of
OCUndeclaredVariableWarningto perform this menu and string
based code reparation and hijack the requestor as if there were friends
in some abusive relationship? Shouldn’t a notification be just a means
of sending some kind of signal to the previous method in the call
There are a lot of symptoms of schizophrenia in the responsibilities
ReparseAfterSourceEditing go? I
(intentionally) skipped some steps between
Here is the source code of
parse | parser | [ parser := self createParser. ast := self semanticScope parseASTBy: parser. ast methodNode compilationContext: self compilationContext. self callParsePlugins. self doSemanticAnalysis ] on: ReparseAfterSourceEditing do: [:notification | self source: notification newSource. notification retry ]. ^ ast
The job of this method (as explained in a previous section) is to do
the frontend part of the compilation and produce a fully annotated and
analyzed AST of the method so that (virtual) machine code can be
generated. The content of the method is mostly straightforward.
What is interesting is the
on:do: method call used for
exception (and thus, notification) handling. When a
ReparseAfterSourceEditing is intercepted, we update the
source code and run the protected block from the beginning again (see
Some potential exits of this loop are:
- The source code is good enough and no more
- The source code is bad enough that either the
is invoked, or another unrecoverable exception occurs. Remember
OpalCompiler>>#compile, for instance.
- Someone is tired enough and terminates the process.
The first alternative is what happens in our scenario:
- A declaration of the temporary variable
bazis added to
the source code.
- The content editor window is updated to reflect that.
- The new code source is fully parsed and correctly analyzed and a
What a happy ending!