Simple Pattern Matching
Consider a simple pattern matcher, such as the one defined in pat-match.lisp. This is basically the unifier in Artificial Intelligence Programming, simplified to handle only patterns against constant forms, rather than patterns against patterns. Some example calls:
> (pat-match '(?x b ?y) '(a b c))
    (((?Y C) (?X A)))
    > (pat-match '(?x b ?x) '(a b c))
    NIL
    > (pat-match '(?x b ?x) '(a b a))
    (((?X A)))
        
        Issues
Pattern matching is the basis of a great deal of AI programs. Patterns are easy to write, compact, easily extended, and so on.
A serious AI program will have hundreds or thousands of patterns, and may apply a significant fraction of those patterns while solving a problem. As a result, time spent matching patterns may dominate the run-time of an AI program.
There are two approaches to reducing this cost:
- use good indexing techniques, such as discrimination trees, to reduce the number of patterns matched
- compile patterns to speed up the matching process
These approaches are independent. It's usually best to do the first one first. Then, if pattern matching is still the bottleneck in your application, even with the reduced number of patterns to be matched, compile them.
Patterns versus Code
To see the magnitude of the costs involved in pattern matching, consider these two functions. The first calls the pattern matcher:
(defun using-pat (input)
      (pat-match '(?x b ?x) input))
        
        The second produces the same results for any given input, but is coded quite differently:
(defun using-code (input)
      (and (consp input)
          (let ((blists (match-variable '?x (car input) '(nil))))
            (and (consp (cdr input))
                  (eql 'b (cadr input))
                  (consp (cddr input))
                  (null (cdddr input))
                  (match-variable '?x (caddr input) blists)))))
        
        using-pat is clearly more compact, readable, and
        maintainable.
using-code however can be significantly more
        efficient. Macintosh Common Lisp (version 3) gives the following
        results:
> (time (dotimes (i 100) (using-pat '(a b c))))
    (DOTIMES (I 100) (USING-PAT '(A B C))) took 164 milliseconds (0.164 seconds) to run.
    Of that, 9 milliseconds (0.009 seconds) were spent in The Cooperative Multitasking Experience.
    6,400 bytes of memory allocated.
    NIL
    > (time (dotimes (i 100) (using-code '(a b c))))
    (DOTIMES (I 100) (USING-CODE '(A B C))) took 13 milliseconds (0.013 seconds) to run.
    3,200 bytes of memory allocated.
    NIL
        
        For this simple 3-part pattern, using-code is
        over 10 times faster than using-pat!
Why is using-code so much faster?
Much of the work that pat-match does is looking
        at the pattern. Effectively, pattern matching involves these
        steps:
- check to see if the pattern is a variable
- if so, check to see if it can be bound to the input
- else, check to see if the pattern is an atom
- if so, check to see if it's equal to the input
- else, check to see if the pattern is a list
- if so, check to see if the input is a list
- if so, recursively apply the above steps to the
        carandcdrof the pattern and input
The hand-written code in using-code omits the
        run-time checks in Steps 1, 2, and 3. Every call to
        match-variable is known in eql advance.
        Furthermore, there's no recursive calls to pat-match
        because it's already known when the pattern is a list.
As a result, there are fewer conditional checks, fewer function calls, and the function calls are all explicit and compilable into machine code before the code is run.
Getting the best of both worlds
Looking at the run-times for using-pat and
        using-code, it's clear which code we'd rather run.
        On the other hand, looking at their definitions, it's also clear
        which code we'd rather write.
Since we clearly don't want to have to write code like that in
        using-code, we get the computer to do it for us.
The idea behind compiling patterns is simple. In most cases, the patterns in a knowledge base are static, i.e., known in advance. Therefore, we can
- take those patterns,
- generate Lisp functions that would produce the same binding lists that the patterns would generate when applied to an input,
- compile the functions, and
- call the functions instead of the pattern matcher
expand-pat.lisp
        defines a simple pattern compiler, capable of turning the
        patterns accepted by pat-match.lisp into
        the compilable Lisp code.
We'll first look at how expand-pat works, then at
        several ways in which it can be used.
How to Compile Patterns
Compiling a pattern turns out to be quite simple. It's very much like defining a macro expansion.
As with macros, we need to keep track of two types of variables:
- compile-time variables, whose values are known when the
        pattern is being compiled; we have one here: pat, the pattern being compiled.
- run-time variables, whose values won't be known until the
        expanded code is applied to some input; we have two here:
        inputandblists, bound to the input form and current list of binding lists, respectively.
Notice that pat-match had the same three
        variables. The difference is that pat is now a
        compile-time variable. It won't exist at run-time.
Now we need to figure out what Lisp code to generate for each kind of pattern elements. Lisp s-expression patterns have only three elements:
- atoms
- variables
- lists
The top-level compiler function, expand-pat,
        simply calls functions for each of these cases. ("One function to
        a function" -- the function of expand-pat is to
        select what other function to run.)
Variables are easy to compile. They just become calls to
        match-variable, e.g.,
> (expand-pat '?x)
    (match-variable '?x input blists)
        
        Atoms are equally easy to compile. They just become calls to
        eql that return blists if
        successful:
> (expand-pat 'a)
    (and (eql 'a input) blists)
        
        The only tricky bit is with patterns that are lists.
        Obviously, we can recursively expand the car and
        cdr of the pattern into Lisp code, but how to we
        combine the two pieces of code that result?
What we want is that, at run-time,
- the code from the carof the pattern will be executed withinputbound to thecarof the input, andblistsis bound to the current list of binding lists;
- the code from the cdrof the pattern will be executed withinputbound to thecdrof the input, andblistsis bound to the list of binding lists produced by the code for thecarof the pattern;
The following Lisp template achieves these two goals:
(let ((blists
            (let ((input (car input)))
              -- put pattern CAR code here --))
            (input (cdr input)))
        -- put pattern CDR code here --)
        
        This will evaluate the code from the two parts of the pattern
        in contexts with the desired values for input and
        blists.
Normally, it's a bad idea to reuse variables like this, but
        the generated code is not meant for human eyes, and doing it this
        way means the pattern expander can always count on
        input and blists being bound to the
        right things.
How to get compiled patterns into code
We can now see how to turn patterns into Lisp code. One big
        problem remains, however. How do we turn Lisp code into callable
        Lisp functions? In particular, expand-pat doesn't
        generate a function, it generates a list, e.g.,
> (expand-pat '?x)
    (MATCH-VARIABLE '?X INPUT BLISTS)
        
        It's not in function form, and input and
        blists free variables.
compile-pat does a little better:
> (compile-pat '?x)
    #'(LAMBDA (INPUT &OPTIONAL (BLISTS '(NIL)))
      (MATCH-VARIABLE '?X INPUT BLISTS))
        
        but this is still just a list, not a function.
It can be converted into a function using eval,
        e.g.,
> (eval (compile-pat '?x))
    #<Anonymous Function #xB22986>
        
        Unfortunately,
- evalis evil, and
- this won't work in many Lisp's that build stand-alone
        applications, because the first thing they delete is
        eval.
The solution is to look at where our patterns are being defined. For example, suppose we have a file of "if--then" rules, where we link patterns to code for a robot to execute if some input command matches the pattern, e.g.,
(defrule put-rule
      (put-in ?x ?y)
      => (move-hand-to ?x)
        (grasp)
        (move-hand-into ?y)
        (ungrasp))
        
        Without compiling, we'd have a robot command interpreter that would look something like this:
(defun run-robot (input)
      (dolist (rule *rules*)
        (let ((blists (pat-match (rule-test rule) input)))
          (unless (null blists)
            (do-actions (instantiate (rule-actions rule)
                                    blists))))))
        
        defrule would simply be
(defmacro defrule (name test => &rest actions)
      `(add-rule :test ',test :actions ',actions))
        
        To use a pattern compiler, we'd redefine defrule
        as follows:
(defmacro defrule (name test => &rest actions)
      `(add-rule :test ,(compile-pat test) :actions ',actions))
        
        Now, when our rule file is compiled, all the patterns become compiled Lisp functions.
Note: we also have to redefine the robot interpreter:
(defun run-robot (input)
      (dolist (rule *rules*)
        (let ((blists (funcall (rule-test rule) input)))
          (unless (null blists)
            (do-actions (instantiate (rule-actions rule)
                                    blists))))))
        
        Increasing maintainability of compiled patterns
One problem with compiling patterns is that rules become hard to read and trace. We no longer see anything that tells us what the rule is looking for, because the patterns are now anonymous functions.
Therefore, it's usually a good idea to modify our
        defrule to store both the pattern and the function.
        The former is for debugging, the latter is for efficiency.
(defmacro defrule (test => &rest actions)
      `(add-rule :test ,(compile-pat test)
                :pattern ',test
                :actions ',actions))