Undeclared Variable Reparation, An Epic Journey In a Compiler – Part III

Part III –
More fun with OCUndeclaredVariableWarning

Foo is a nice class, but could it become nicer if we
added an instance variable?

Let us click on the Foo tab of the Calypso window. We
get some code that declares the class.

Object subclass: #Foo
    instanceVariableNames: ''
    classVariableNames: ''
    package: ''

It is a legal Pharo expression that creates (or updates) a subclass
(named Foo) of the Object class (that is the
almost root of the class hierarchy). Creating or updating the
class (accept with Ctrl-S) simply evaluates the expression.
This is neat.

Unrelated note: there is a small check box Fluid in the
bottom right corner that switches to the modern fluid class
syntax.

Object << #Foo
    slots: {};
    package: ''

It is just a different syntax for almost the same stuff. It is still
neat, although a little less neat because while the syntax is better, it
is no more a sufficient expression to declare or update classes.

The thing is that to evaluate some random source code, we need to
compile it first. The evaluation of the class definition is done by
either
ClySystemEnvironment>>#defineNewClassFrom:notifying:startingFrom:
or
ClySystemEnvironment>>#defineNewFluidClassOrTraitFrom:notifying:startingFrom:
according to the fluid check box.

Both methods are really similar (and could be factorized), except
that one warned the developer in a comment about for now, a super
ugly patch
, so we shall look at the other one.

But let us play with the compiler a little without looking at the
code yet.

Object subclass: #Foo
    instanceVariableNames: +''
    classVariableNames: ''
    package: ''

I added a syntax error since + is a binary operator and
the left operand is missing. We evaluate (accept with Ctrl-S), and we
get the following.

class-err1

Object subclass: #Foo
    instanceVariableNames:  Variable or expression expected ->+''
    classVariableNames: ''
    package: ''

The -> way to present syntax error is very old school
(and I hate it) but that is not the point here. The point is that the
compilation process behaved in a sane and expected way:

  • OpalCompiler>>#compile is somewhat executed (cf
    the previous section for the content of the method);
  • it tries to parse, but a syntax error is detected, so an exception
    SyntaxErrorNotification is signaled;
  • OpalCompiler>>#compile catch the exception;
  • notify:at:in: is sent to the requestor (that injects
    the error message in the source code);
  • the failBlock is executed that terminates the method
    ClySystemEnvironment>>#defineNewClassFrom:notifying:startingFrom:.

Let us try with something else:

Object subclass: #Foo
    instanceVariableNames: baz
    classVariableNames: ''
    package: ''

The source code is an expression, and in expressions we can use
variables. Except that here baz is an undeclared variable,
so what happens when we evaluate?

The code is updated to:

class-err2

Object  Variable or expression expected ->subclass: #Foo
    instanceVariableNames: baz
    classVariableNames: ''
    package: ''

Wow. This is bad, and wrong, and bad again.

  • An error message is baldy placed and wrong and unrelated to
    baz.
  • There is no menu asking us what to do with the undeclared variable
    baz like we saw inside a method.
  • Since I work with a small screen, I also noticed an ephemeral
    notification popup box in the bottom left corner of the screen
    (poetically called a growl in Morphic) that stated the truthful
    information “Undeclared Variable in Class Definition”. Such
    growls are usually displayed by invoking the method
    Object>>#inform:.

ClySystemEnvironment>>#defineNewClassFrom:notifying:startingFrom:

Here is the code source of the method:

defineNewClassFrom: newClassDefinitionString notifying: aController startingFrom: oldClass

    "Precondition: newClassDefinitionString is not a fluid class"

    | newClass |
    [
    newClass := (self classCompilerFor: oldClass)
                    source: newClassDefinitionString;
                    requestor: aController;
                    failBlock: [ ^ nil ];
                    logged: true;
                    evaluate ]
        on: OCUndeclaredVariableWarning
        do: [ :ex | "we are only interested in class definitions"
            ex compilationContext noPattern ifFalse: [ ex pass ].
            "Undeclared Vars should not lead to the standard dialog to define them but instead should not accept"
            self inform: 'Undeclared Variable in Class Definition'.
            ^ nil ].

    ^ newClass isBehavior
          ifTrue: [ newClass ]
          ifFalse: [ nil ]

I won’t dive into all the details of this one. What is interesting is
the on: OCUndeclaredVariableWarning do: part that
intercepts the notification, thus preventing it from reaching the end of
the call stack, thus preventing it from executing its default action,
thus preventing it from displaying a menu, thus preventing the user to
repair the code having a semantic error.

Here we can see how it is possible to intercept the default error
reparation mechanism in case of undeclared variables in a specific
context where a temporary or an instance variable does not make much
sense.

What behavior do we get instead?

  • ex compilationContext noPattern ifFalse: [ ex pass ].
    if noPattern is false (double negation isn’t not bad) then
    process the notification anyway. Except that, here,
    noPattern isn’t unlikely to not be not true (nested
    negations are annoying, aren’t they?) because it is what distinguishes
    the compilation of an expression from the compilation of a method
    definition: a method definition starts with a name (and potential
    arguments) that is called the method pattern. But the first
    statement of the OpalCompiler>>#evaluate method that
    is called it to override noPattern with true, because one
    can only evaluate expressions, not method definitions.
  • self inform: 'Undeclared Variable in Class Definition'
    is responsible for the growl we get.

    You know what I hate? The -> error message
    insertions. You know what I hate more? Inconsistencies. Here, the source
    code error is reported as a (missable) popup with poor information,
    whereas all other errors are reported with ->.

Nevertheless, is the design legitimate? I’m not a fan of exceptions,
they make code comprehension harder and, in my humble opinion, should be
used with great reserve. Here there is also some breach of
encapsulation. But this is debatable. What is less debatable is that the
whole reparation of undeclared variables is bypassed completely,
including some legitimate needs.

For instance, we get no help if we try to evaluate
(Ctrl-S accept) Objectt subclass: #Bar, which
contains too much t in the identifier of the superclass
Object. All we get is an unhelpful growl and the same
wrongful -> error message insertion. As a matter of
fact, as Pharo users, we can bypass the accept (Ctrl-S) behavior and
just evaluate the code in place by selecting all the text (Ctrl-A) then
Doing It (Ctrl-D). The DoIt simply evaluates the
selected source code without (too much) hacking. So there is no
intercepting OCUndeclaredVariableWarning for instance, and
we get the menu “Unknown variable: Objectt please correct, or
cancel
” that presents Object (with a correct amount of
t) in the list of choice. We can select it and we get a
successful class definition and a new available class
Bar.

But why does the code signal a missable popup instead of a classic
text error insertion? Maybe because the
OCUndeclaredVariableWarning that is caught might not come
from the class definition syntax. When a random string is evaluated, it
can do a lot of things, like signaling exceptions. Unfortunately, a
broad error handling mechanism like exceptions has no way to distinguish
exceptions that come from the analysis of the code (syntactic and
semantic error) from the ones that come from the proper evaluation.

And that happens frequently. If you remove an instance variable
(attribute) from a class definition, then accept the new definition, the
system will recompile all the methods of this class. Methods that use
the removed instance variable will also be compiled and signal
OCUndeclaredVariableWarning. That exception will be caught
and growl “Undeclared Variable in Class Definition”. Note that
the message is misleading since the undeclared variable is not in the
class definition. So maybe it was not the original intention of the
growl and was just a random inconsistency.

Let us discuss the last statement of the method. It is a sanity
check. Because the initial class definition expression could have been
heavily edited by the programmer and replaced by anything else, the
method checks that the final result of the evaluation is a class-like
object. Otherwise, nil is returned.

ClyClassDefinitionEditorToolMorph>>#applyChangesAsClassDefinition

OK, we have an explanation for the absence of the menu, and an
explanation for the presence of the growl, but noting here is related to
the wrong -> syntax error insertion. Where does this one
come from?

First, there is no syntax error in the expression (it is a lie!). The
compiler manages to signal an OCUndeclaredVariableWarning
notification (the growl is the proof!) launched by the
OCASTSemanticAnalyzer, meaning that the parser can produce
an AST and not find any syntax error.

So, what is the deal?

  • We are at
    ClySystemEnvironment>>#defineNewClassFrom:notifying:startingFrom:,
  • that is called by
    ClySystemEnvironment>>#compileANewClassFrom:notifying:startingFrom:,
  • that is called by
    ClyFullBrowserMorph>>#compileANewClassFrom:notifying:startingFrom:,
  • that is called by
    ClyClassDefinitionEditorToolMorph>>#applyChangesAsClassDefinition,
  • that goes like this:
applyChangesAsClassDefinition

    | newClass oldText |
    oldText := self pendingText copy.
    newClass := browser
                    compileANewClassFrom: self pendingText asString
                    notifying: textMorph
                    startingFrom: editingClass.

    "This was indeed a class, however, there is a syntax error somewhere"
    textMorph text = oldText ifFalse: [ ^ true ].

    newClass ifNil: [ ^ false ].

    editingClass == newClass ifFalse: [ self removeFromBrowser ].
    browser selectClass: newClass.
    ^ true

We see the invocation of the compilation (the
newClass := browser compileANewClassFrom: thing). Because
it failed, newClass is nil (instead of a class).

What follows is interesting:
textMorph text = oldText ifFalse: [ ^ true ]. This states
that if the text in the code editor was changed during the compilation,
then there was an error. Interesting and sooo wrooong on sooo many
leveeels:

  • Detecting syntax error should not be done thanks to string
    comparison.
  • The method should not assume that error reporting to the user
    changed the source code (even if it is the old school way and is the
    active tradition of the present and previous millennium). For instance,
    instead of an ugly -> something might have preferred to
    display a growl (even if inconsistencies are bad, and I hate them, here
    the culprit is not the inconsistency).
  • Code change might be related to some code reparation that fixed an
    error, so exactly the opposite of an error.

ClyClassDefinitionEditorToolMorph>>#applyChanges

But wait, there is more, because newClass is nil, the
method returns false to its caller, that is
ClyClassDefinitionEditorToolMorph>>#applyChanges and
is defined by:

applyChanges

    | text |
    text := self pendingText copy.
    ^ self applyChangesAsClassDefinition or: [
          self pendingText: text.
          self applyChangesAsMethodDefinition ]

What the heck is that? I do not even understand what is the objective
of this thing! The defining class is
ClyClassDefinitionEditorToolMorph that is the widget whose
sole job as a text editor is to define new classes and to update
existing classes. And the Pharo way to do that is by evaluating
expressions that define or update classes. It seems to be an easy job
that even an unaware DoIt action can manage
successfully.

So what is this insane method doing:

  • saves the source code’s content (to avoid code change in the editor
    due to syntax error or code reparation);
  • tries to evaluate as a class definition (an expression);
  • if the result is false (and it is, in our case), then
    tries to compile the original pristine source code as a method
    definition.

OK.

Let’s just do that. Remember that the class definition we try to
process is:

Object subclass: #Foo
    instanceVariableNames: baz
    classVariableNames: ''
    package: ''

Let us try to parse this source code as a method definition instead
of as an expression.

  • A method starts with a method pattern that can be many
    things, but for simple unary methods, they are simple and plain
    identifiers. Do we have a simple and plain identifier? Yes, the token
    Object (RBIdentifierToken).
  • A method then have a body, with statements, that usually start with
    an expression. What follows is the token subclass:
    (RBKeywordToken) which is not the beginning or any correct
    expression. But the parser wants an expression right now! So it
    reports the error
    Variable or expression expected ->subclass:.

It’s the beauty of computer science. Whatever insane behavior we
might witness, there is always a rational explanation.

A Final Experiment

Can we bypass the bypass of OCUndeclaredVariableWarning
with the undefined variable baz? Let’s try the
DoIt way, it made wonder with the superfluous t of
Objectt some sections ago.

  • Select the full text (Ctrl-A);
  • Do It (Ctrl-D);
  • The menu “Unknown variable: baz please correct, or cancel
    appears and proposes: a new temporary variable, a new instance variable
    or to cancel;
  • Chose the “temporary variable”;
  • A debugger window appears: “Instance of ClyTextEditingMode did
    not understand #textMorph
    ”. What?
  • The error is caused by the line
    theTextString := self requestor textMorph editor paragraph text.
    of
    OCUndeclaredVariableWarning>>#declareTempAndPaste:.
    We discussed this (rather ugly) line in a previous section, stating that
    it is fragile. What a coincidence

Conclusion and Perspective

During this exploration of the
OCUndeclaredVariableWarning we discovered a lot of classes
and methods with a very variable quality of code and design. Obviously,
the present article focuses on the discussable parts that could be
improved, because it forces us to understand why things are bad, and how
they could be improved. It is also fun to see concrete effects of how
things can go bad when software design is not as tidy as it should
be.

Pharo is a wonderful dynamically typed programming language with
great features, abstractions and powerful semantics. And with great
power comes great responsibility.

An example could be the requestor thing. Adding a dependency between
UI and the compiler work is enough to raise some eyebrows (independently
of the programming language or its paradigm). But in Pharo this
dependency appears as an unwritten API (orality-based API?) with some
inconsistent or fragile hacks: notify:at:in:,
Object>>#bindingOf:,
requestor respondsTo: #interactive,
requestor textMorph,
requestor class name = #RubEditingArea, etc. This also
causes subtitle breakages when someone tries to fix things, breakages
that are often hard to catch because, for instance, they could be only
related to untested or rare UI interactions.

A lot of change is currently ongoing for Pharo 12 on the compiler. We
are in the early part of its development cycle, so it’s the best moment
to try large and disruptive hacks. At the time of publishing, most of
the design issues discussed here are already fixed. But they represent a
specific use case, and a lot of work is still needed. The full
meta-issue is available at
https://github.com/pharo-project/pharo/issues/12883.

Published by Jean Privat

Professor at Université du Québec à Montréal. I like programming languages, compilers, and cybersecurity.

%d bloggers like this: