This is a story about a real macro used by a number of programmers at the Institute for the Learning Sciences for years. It is an example of how not to design and implement a macro.
The creator of the macro was an adept Lisp programmer. He understood how macros worked, how to use backquote to define them, and so on.
The Problem
A demo of a fairly mature program crashed. Nothing major had been added to it, but a programmer had finished cleaning up its code. After a fair amount of bug hunting, the programmer found the change that broke the code. In essence, he had replaced
(defmacro my-wait-for (n) `(wait-for ,n))
with the seemingly more direct
(defun my-wait-for (n) (wait-for n))
But, with the new version, (my-wait-for 12) never waited! Why not?
The Cause of the Problem
The problem lay not with my-wait-for, but with the macro it called, wait-for. There were three ways to use it:
- (wait-for 12) - waits for 12 seconds
- (wait-for #'foo) - calls (foo) until it returns a non-NIL value
- (wait-for (baz x y)) - evaluates (baz x y) until it returns a non-NIL value
The macro version of my-wait-for expanded (my-wait-for 12) into (wait-for 12), which fits the first calling format.
The function version of my-wait-for, on the other hand, calls (wait-for n), with n = 12. This fits the third calling format, and is interpreted as "wait until the expression n is non-NIL." Since n is already non-NIL, no waiting occurs.
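The original source of wait-for is not shown here, so the following is only a hypothetical reconstruction of a macro with the three calling formats described above. The use of sleep and the 0.1 second polling interval are assumptions made for illustration. What matters is that the macro dispatches on the unevaluated argument form, so a literal 12 and a variable n take different branches.

```lisp
;; Hypothetical reconstruction; the real wait-for may have differed.
(defmacro wait-for (form)
  (cond ((numberp form)
         ;; Format 1: a literal number in the source means "sleep".
         `(sleep ,form))
        ((and (consp form) (eq (first form) 'function))
         ;; Format 2: #'foo reads as (function foo); poll by calling it.
         `(loop until (funcall ,form) do (sleep 0.1)))
        (t
         ;; Format 3: any other form, including a bare symbol,
         ;; is re-evaluated until it returns non-NIL.
         `(loop until ,form do (sleep 0.1)))))

;; The function version from the story: the macro sees the symbol N,
;; never the number 12.
(defun my-wait-for (n) (wait-for n))

(macroexpand-1 '(wait-for 12))  ; => (SLEEP 12)
(macroexpand-1 '(wait-for n))   ; => (LOOP UNTIL N DO (SLEEP 0.1))
```

Under this sketch, (my-wait-for 12) enters the loop with n already non-NIL and returns at once, which matches the behavior described above.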
The Cause of the Cause of the Problem
(wait-for 12) looks like a function call. It looks like it evaluates its argument. But it doesn't. It looks at the argument form to decide what to do.
wait-for is a macro because its designer violated the principle "one function to a function." He tried to do three waiting tasks with one interface.
Unfortunately, the first task, "wait for number seconds," shares a common calling format with the third task, "wait for expression to be non-NIL." The rule "do the first task only if the number is explicitly given" is unintuitive, easy to forget, and a pain to live with. For example, it makes it impossible to say
(wait-for *default-wait-period*)
The Fix
The fix is not to write a comment that says "use only literal numbers for wait times." Where would the comment go? If we put it on the definition, it won't be seen by someone modifying a call to wait-for. And it's pretty unlikely that programmers using wait-for will remember to put a comment on every call to wait-for that says "don't replace this number with a variable!"
Nor is it to change wait-for to evaluate its argument and, if the result is a number, do the first task, otherwise do the third. First, this rule counts on programmers knowing and remembering that, in wait-for, unlike elsewhere in Lisp, a number is not treated like other non-NIL values. Second, exactly what is the rule? If exp in (wait-for exp) returns nil the first few times, and then returns 3, is that a signal to stop waiting or a signal to wait three more seconds?
The appropriate fix is to have two waiting forms, e.g., (wait-for number), which is a function, and (wait-until expression), which is a macro. Each waiting form can then have simple unsurprising semantics.
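A minimal sketch of the two forms might look like the following. The implementation details are assumptions, not the author's code: waiting is done with sleep, the polling interval is an arbitrary 0.1 seconds, and the value given to *default-wait-period* is made up for the example.

```lisp
;; Sketch only; SLEEP-based waiting and the poll interval are assumptions.
(defun wait-for (seconds)
  "Wait for SECONDS seconds. An ordinary function, so SECONDS is evaluated."
  (check-type seconds (real 0))
  (sleep seconds))

(defmacro wait-until (expression)
  "Re-evaluate EXPRESSION until it returns non-NIL, then return that value."
  (let ((value (gensym "VALUE")))
    `(loop for ,value = ,expression
           until ,value
           do (sleep 0.1)
           finally (return ,value))))

;; Hypothetical value, chosen only to keep the example fast.
(defvar *default-wait-period* 0.2)

;; A variable now works, because wait-for evaluates its argument.
(wait-for *default-wait-period*)

;; A number is just another non-NIL value here: this returns 3 immediately.
(wait-until 3)
```

Because wait-for is a function, wrapping it in another function or passing it a variable behaves as expected. Only wait-until needs to be a macro, since it must re-evaluate its argument form.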