The DO iterative form in Lisp is very flexible, with a simple syntax and clean semantics. Many students however have trouble when they first begin trying to use it. The following method are intended to help you get from an iterative algorithm to a well-formed and efficient DO loop.
Why Do DO?
Here are some examples of simple loops, in DOLIST and DO form.
List length | |
---|---|
(let ((n 0)) (dolist (x lst n) (incf n)) |
(do ((ll lst (cdr ll)) (n 0 (1+ n))) ((null ll) n)) |
Copy list | |
(let ((copy 0)) (dolist (x lst (reverse copy)) (push x copy)) |
(do ((ll lst (cdr ll)) (copy nil (cons (car ll) copy))) ((null ll) (reverse copy))) |
Find first odd number | |
(dolist (x lst) (when (oddp x) (return x))) |
(do ((ll lst (cdr ll))) ((or (null ll) (oddp (car ll))) (car ll))) |
Collect odd numbers | |
(let ((odds nil)) (dolist (x lst odds) (when (oddp x) (push x odds)))) |
(do ((ll lst (cdr ll)) (odds nil (if (oddp (car ll)) (cons (car ll) odds) odds))) ((null ll) odds)) |
Clearly there is nothing hard to read or understand about the DOLIST versions. It's often shorter than the DO version. So why do I push you to become fluent with DO? Because these easy loops are not the point. You should never use DO or DOLIST for any of these cases. Lisp already has functions to do these simple cases. Mapping and sequence functions handle almost all the other cases involving lists.
So the question is what to use when you when there is an iteration situation that standard sequence looping functions can't handle. DOLIST or DOTIMES only work with proper lists. DOLIST can only step by a single CDR each iteration, and DOTIMES can only step by one.
DO has none of those limits. It will more likely produce short, robust, functional programming friendly solution than DOLIST or DOTIMES, with
- no buried assignments statements
- easy to see exit conditions
- easy to see list of variables that are being changed
- easy modified step clauses, because update is order-independent
The Structure of DO
The syntax of DO is
(DO ((var1 init1 incr1) ; the variable clauses (var2 init2 incr2) ...) (exit-test exit-exp exit-exp ...) ; the exit clause body-exp ; the body body-exp ...)
The semantics is
- Evaluate all the init forms in the variable clauses.
- In parallel, bind all the var's to their init values.
- Evaluate exit-test.
- If it's true,
- Evaluate the exit-exp's
- Exit the loop, returning the value of the last exit-exp.
- Otherwise, evaluate the body-exp's.
- Evaluate all the incr forms.
- In parallel, bind all the var's to the corresponding incr values
Turning an Iterative Algorithm into a DO
First, you need to identify the following elements of your algorithm.
Identify the steppers
A stepper is a variable that controls how long the loop runs. It might be a variable stepping through a list or a counter stepping from 0 up or N down. Each stepper has
- a start value, e.g., 100 or some list
- an end test, e.g., i = 0 or l is NIL
- some formula for "stepping" the variable, e.g., decrementing i or setting l to its cdr
Many loops have only one stepper, but they might have several, e.g., a loop that steps through a list and also has a counter.
What are the accumulators?
An accumulator is a variable or variables into which you collect your answer. Each accumulator has
- a start value, e.g., 0 or NIL
- some formula for "adding" a value, e.g., adding a number, or cons'ing on a new element
- optionally, some test that has to be true for the value to be added
Most loops have at most one accumulator, but you can have more, e.g., a loop that collects odd numbers into one list and even numbers into another.
Normally, test results, e.g., finding an item with a particular property, should not be thought of as accumulators. There's a better way to handle them.
What are the exit tests?
An exit test says when the loop should stop. There are two kinds of exit tests. First, there are tests for the end points of the steppers, e.g.,
- (NULL L) for list stepping
- (>= I end) for counters going up
- (<= I 0) for counters going down
You always have at least one stepper exit test. You rarely have more, even with two or more steppers, because usually just one stepper, e.g., the list stepper, is in charge.
Second, there may be exit tests for special conditions, e.g., finding an odd number. If you have this situation, then make the exit test an OR. The stepper exit test comes first, then the special test, e.g.,
(OR (NULL L) (ODDP (FIRST L)))
Use exit tests instead of a pseudo-accumulator holding the result of some test.
What is the return value?
Every loop returns something. Typically, it will be the value of some accumulator. If you have a loop with no accumulator, i.e., you have a loop that exits when some special test is true, then the return value will be determined by whether the test was ever true or not.
Make sure you understand what the return value should be if the loop executes 0 times.
Putting the pieces together
Make a DO variable clause for each stepper. The init should be the start value, and the incr should be whatever expressions "bumps" the stepper appropriately. We'll get to where to put the end expression shortly. Some examples:
- (L input-list (REST L)) for list stepping
- (I start (1+ I)) for counters going up
- (I start (1- I)) for counters going down
- etc.
Make a DO variable clause for each accumulator. The init should be the initial "empty" value. It should be the value you would return if the loop were to execute zero times. A common mistake here is forgetting about the empty loop case, but it's essential to building a clear and general DO.
The incr should be whatever expressions adds to the accumulator appropriately on every iteration. A common mistake here is writing an expression that returns the correct value sometimes but returns NIL in other cases, where, normally, NIL is not the right value.
Some examples:
- (RESULT NIL (CONS value RESULT) for collecting a list of values
- (SUM 0 (+ SUM value)) for accumulating a sum
- (RESULT NIL (IF test(CONS value RESULT) RESULT) for collecting a list of values satisfying a test
Now put your exit tests in the exit clause. If you have both stepper end tests and special end tests, then make the exit test an OR. Put the stepper exit test first, then the special test, e.g.,
(OR (NULL L) (ODDP (FIRST L)))
Finally, put in the exit expression in the exit clause. This is task-dependent, but for loops without a special exit test, the exit value is usually is one of the following forms:
- (NREVERSE RESULT) for a list accumulator
- SUM for a sum accumulator
For loops with a special exit test, the exit value is based on whether or not we reached the end of a stepper or the special condition. Typical return values with loops involving special exit tests, from simplest on, are:
- (NULL L) returns T when the stepper rather than the special exit test stopped the loop
- (NOT (NULL L)) returns T when the special exit test stopped the loop
- (IF (NULL L) val1 val2) returns val1 when the stepper stopped the loop and val2 when the special exit test stopped the loop
Note that the above method never generates a DO body. A DO body is only needed when your loop is not just accumulating a value or stopping on a special case. If instead your loop is being used to perform a side effect operation (printing, changing an array or table, ...), the side effect should go in the body, BUT it's probably the case a DOLIST or DOTIMES would be a better choice.