Challenges are typically two or three times longer and/or more difficult than average exercises. They involve the application of multiple skills and the ability to keep complex long code readable and maintainable. Students who come to this course with no prior Lisp or Scheme experience should do at least a few challenges. More experienced programmers should focus on challenges over simpler exercises.
make-best-change
Function name: make-best-change
This is a variation on Graham's Exercise 9-2, which defines a function make-change. Do that exercise first, and make sure you have an approved clean solution, before doing this challenge.
make-change
may fail to find the best answer for certain sets of coins, where
best means "uses the fewest coins." For example,
if there are no nickels, then (make-change 30 '(25 10 1))
will return 1 quarter and 5 pennies, which is 6 coins, but 3 dimes would
be better.
Also, if the set of coins doesn't include pennies, then there may not be an exact answer. The best answer would be the one that comes closest, without going over the total amount.
There are three common ways to attack this problem:
- Enumerate all possible coin combinations and pick the best.
- Gets a bit messy, generating the nested loops, and needs to use arrays or destructive list updates to reduce consing.
- Use
dynamic programming.
- Most commonly recommended. Dynamic programming is a technique well worth learning. But the resulting code is easier to write than to read.
- Solve recursively and memoize.
-
Simplest code to read and understand. Needs memoization to avoid exponential recalculation.
Do the last approach. It leads to the clearest easiest to debug code. If you never need to handle larger cases, you're done. If you do, first try adding memoization. That can be done without making the code less clear. If that's not efficient enough, then reimplement using dynamic programming.
Implement recursive best change
Define make-best-change
to have the same API as
make-change
. It takes an amount and an optional list of coins,
and returns the coin counts as multiple values. The difference is that it returns
a solution that comes closest to the amount as possible, without going over.
If there's more than such answer, it should return a solution that
uses the fewest coins.
Define make-best-change
to call a recursive helper best-counts
that takes an amount and list of coins, and returns a best solution as a list of
coin counts. The heart of the recursion is to realize that at most two
recursive calls are needed: one that uses the first coin, and one that doesn't. Return the better
of those two and you're done.
For example, given (best-counts 77 '(32 7 3))
, the answer is either
the value returned by adding a 32-cent piece to (best-counts 45 '(32 7 3))
or the value returned by (best-counts 77 '(7 3))
.
The conditional just needs a few branches for the base cases and code to return the better of the two recursive calls. Think carefully about the base cases. There are several. Missing any of them can lead to unnecessary recursions.
Write a version of this approach that passes all the test cases.
Optimizing with memoization
Once the code is working, compile and load it, and do the following to see how expensive your solution is:
(time (make-best-change 2325 '(50 23 17 9 2)))
time
is a Common Lisp function that evaluates code
and reports on how long it took to run.
The exact output is implementation-dependent.
Depending on your Lisp, processor, and your code, the example call may take somewhere between half a second to several seconds to run. Often, the problem with "try all subproblem" recursive algorithms is the repeated evaluation of the same subproblems in different branches of the recursion. When that is the case, memoization can make a very big difference. Memoization means keeping a hashtable of function calls and return values, so that once a call with a particular set of arguments is calculated, future calls with the same values is just returned directly from the table. For some problems, this can turn an exponential algorithm into a polynomial one.
Define the function (memoize function-name)
to replace the symbol-function definition of your recursive best change
function with a (lambda (&rest args) ...)
that stores the results
of calling the original code in a Lisp hash table, using args
as
the key. Repeated calls with the same arguments should just return the stored value.
See
the documentation
for how to create a table that handles list keys.
Lisp compilers differ in whether compiled code uses symbol-function
to get the current definition or goes directly to the code present when the code
was compiled. To be safe, put
(declaim (notinline best-counts))
at the top of your code file to tell the compiler to not optimize calls to
best-counts
.
Compile, load, and time make-best-change
with no memoization:
(time (run-tests make-best-change)) (time (make-best-change 2325 '(50 23 17 9 2)))
Then call (memoize 'best-counts)
and do the above timing tests
again.
Submit both the code for memoize
and the before and after timing results.
intersect-segments
Function name: intersect-segments
This is Exercise 9-4 in Graham. It is a surprisingly knotty problem, given how simple it is to state. There are a number of special cases to handle:
- Non-intersecting segments return nil.
- Intersecting non-parallel segments return 2 numbers, the x and y of the intersection point.
- Intersecting parallel segments return 4 numbers, the x and y of one end of the overlap, and the x and y of the other end of the overlap, even if those are the same point.
- Zero-length segments, i.e., start point equals end point, are considered parallel to everything.
- The infinite slope of a vertical segment is different than the undefined slope of aero-length segments.
There are many subcalculations that are reused that should not have to be calculated repeatedly. A major part of the challenge is organizing the code into short well-named modular subfunctions.
There are two common methods to solve the intersection problem for lines:
Use the slope and intercept method. The second doesn't handle parallel line setments. You end up needing slope and intercept for the special cases anyway.
Some tips for simpler code:
- There are only two top-level cases: parallel overlap and non-parallel point intersection.
- Use a different value for infinite slope (vertical segments) and undefined slope (zero-length segments)
- Give all segments an intercept.
- Use
nil
for zero-length segments. - Use the x-intercept for vertical segments.
- Use the y-intercept for the rest.
- Use
It is your call whether to define structures or classes to represent points, lines, and segments. Good solutions exist with and without these.
Renaming Variables
Function name: rename-variables
in the package sddr
Test cases: (run-tests rename-vars)
in the package sddr-tests
You will need to load sddr
(ql:quickload "sddr")
One of the required operations in the deductive retriever SDDR is renaming all the variables in a rule to avoid unintended name conflict. This has to be done consistently -- whatever new name you give a variable has to be used everywhere else that variable appears.
The simplest approach, used in SDDR and in Chapter 15 of Graham is to first go through the rule to collect all the unique variables, then make a list of pairs of old variables and new names, and finally go through the rule again and substitute.
A somewhat more challenging problem is to do this in one pass, i.e.,
maintain and use the list of replacements pairs as you go recursively through
the rule. Your job is to define rename-variables
to do this, two different ways:
- Using multiple values --
rename-variables
should return two values: the replaced rule and the list of replacement pairs. - Using continuations --
rename-variables
should return just the replaced rule, but it calls a helper that takes the rule plus two optional arguments: a list of replacement pairs, and a function that takes an updated rule part and an updated list of pairs.
The second part should become clearer after you do the first part. There's a close relationship
between returning multiple values and calling a function with those values. E.g.,
the Lisp function (floor m n)
returns two values: the quotient and remainder.
If Lisp didn't have multiple values, the same functionality could have been done by defining
floor
to take three arguments. The third would be a function that floor
calls, e.g., with funcall
, with the quotient and remainder. The default value for that third argument would be
(lambda (q r) q)
. Such a function is often called a continuation function
because it continues the operations of the code. A more common term is a callback function.
To make this a real challenge, both versions should CONS as little as possible. In particular, any part of a rule that has no variables should be used as is without copying. That means not only not building a list of pairs, but not making a copy of the rule.
The test cases for rename-vars
in
match.lisp
first check that your renaming produces a consistent renaming, then that it creates the minimal
new list structure. It does the latter by calling a function to see if a form contains any
variables anywhere in it. Your code should not do this! Such a check is expensive
because if you have a very deep tree with variables, the entire tree will be rescanned multiple times.
You need a cleverer way to figure out when you can avoid CONSing that looking ahead.
Word Tries
Test cases: trie-tests.lisp
Function names: add-word
, make-trie
, mapc-trie
, read-words
,
subtrie
, trie-count
, trie-word
The goal of this task is to make a very fast word lookup tool. The next two tasks apply this tool.
The trick is to pre-process the dictionary to make word lookup very fast, even faster than binary search on a sorted array, but, unlike a hashtable, capable of looking up words letter by letter, so that we can very quickly reject a sequence of letters that can't be a word, like "qp...".
A good data structure for this is a trie. A trie is a recursive tree data structure. The root node of the trie has a branch for every letter that starts at least one word in English. Each branch leads to a subtrie with branches for the second letters that can follow the first letter. Thus, under the subtrie for "q" will be branches for just those letters that can follow "q" in English.
A trie is easily built recursively. You start with an empty trie and insert words one by one from a word list. Inserting a word means going down the trie, letter by letter, following a branch if one already exists, or adding one if not. You add the word to the node reached by the last letter. A node can have at most one word, but a node can have both a word and subtries, e.g., "aft" reaches a node that has the word "aft" but also subtries leading to "after" and others.
API Design
A public API (Application Program Interface) for a complex structure like a trie needs to provide enough functionality to be useful, in a way that makes efficient robust client code easy to write and maintain, while leaving implementation details as hidden as possible, to allow for later improved re-implementations.
The following API is required by the testing code. If you think of a better API, especially after doing the Boggle Solver and Crossword Helper tasks, email me.
Define the following trie-related functions:
(make-trie)
: creates a new empty trie.(add-word string trie)
: adds a word, in lower case, to a trie node and returns the trie. The trie should be destructively modified.(subtrie trie char1 char2 ...)
: given a trie and zero or more characters, returns the root of the subtrie for that sequence of characters, if any, or nil.(trie-word trie)
: returns the word at a trie node if any, or nil.(trie-count trie)
: returns the number of words stored at or under this trie node.(mapc-trie fn trie)
: given a function and a trie, calls(fn char subtrie)
with the character and subtrie for every branch immediately under trie. Called for effect. Returns trie.(read-words file trie)
: Reads a file of words into trie. Returns trie. The file should contain one word per line.
Look at trie-tests.lisp to see specific examples.
Use a CLOS class or structure for trie nodes, with a print function that just shows the word in the trie node, if any, and how many words are at or under the trie. You don't want a trie to print its entire contents every time you look at it!
To run the test code, you'll need to download crosswd.txt (source the Moby Project) and put it in the same directory as your test code.
Compile your trie code before trying to load the entire dictionary!
Boggle Solver
Test cases: boggle-tests.lisp
Function name: solve-boggle
Boggle® is a multi-player word search game on a 4x4 or 5x5 board of randomly placed letter cubes. Rules are here.
Define a package boggle with two exported functions:
(solve-boggle board)
: returns a list of all legal Boggle® words in board(load-words pathname)
: loads the words in the file given into an internal unexported trie structure
board is a simple string representing the letters on a board, from left to right, top to bottom. The board can be any length that is a perfect square, e.g., "ases" or "edbtuhiaplieaoql". Internally, use a trie to hold the word list.
The list of words returned needs to be sorted from longest to shortest, words of equal length need to be sorted alphabetically, and words less than 3 letters removed.
Note: "q" in a board needs to be interpreted as "Qu". This means there are some rare English words that can't appear on a Boggle® board, but that's true for many words, because of the limited number of letters on the cubes.
This code should not be in the trie package, nor use any internal trie functions.
Crossword Helper
Test cases: trie-tests.lisp
Function name: pattern-words
A common helper for solving crossword puzzles is a tool that can say what words fit a pattern of known and unknown letters. An online example is One Across.
Define (pattern-words pattern trie)
to return a list of
words in the trie that fit the given pattern. A pattern is a string of letters and question
marks, e.g., "a?l?i??". Note that trie is explicitly given.
Note: The One Across site returns more than exact matches. It includes "close matches," just in case some of your known letters are wrong. Don't do this.]
The list of answers should be sorted.
This code should not be in the trie package, nor use any internal trie functions.
Joke Cleanup
No, this isn't about cleaning up dirty jokes. It's about cleaning up some very dirty code that generates jokes.
Download, compile, and load new-jokes.lisp. This is a riddle-generator created by Jess Johnson, and fixed to work in standard Common Lisp by Rainer Joswig.
Type
(generate).
Prepare to wait. Eventually some computer-generated riddles will appear.
As an AI application, this isn't bad. The output is comparable to other published attempts at generating jokes.
Now look at the code. Imagine trying to maintain a function
like answer-joke
or make-compound
. That should be
easy to imagine because that's your job!
Clean up this code! Make a copy of the code in the file clean-jokes.lisp and start cleaning. Ideally, you should be able to run
(critique-file "clean-jokes.lisp")
and get few if any complaints about function too long or anything else.
Along the way, you should find out if it really needs to be that slow. There's just a moderate amount of lexical data. Is the algorithm combinatorially explosive, or is there needless wasted computation?
There are two official subtasks. They can be done in either order.
Faster Riddles
You should never just randomly start cleaning up code. Have a purpose in mind. A first obvious one is to improve performance without losing correctness.
- Save the jokes produced by the current code for future reference. Either plain text for visual inspection, or change the code to generate a list you can test against programmatically.
- Identify where the bottlenecks are. Don't make the mistake of thinking the most complicated code is responsible. Find out where the time is really spent. Both Allegro and Lispworks have profilers built in, with documentation online.
- Come up with a faster approach, and implement. While implementing, refactor and clean up the code you need to change, to make your life and future maintenance easier. Don't worry about the rest of the code.
- Make sure you generate substantially the same riddles as before, if not more.
Note: working with a partner is encouraged. Each of you should clean up and submit different parts of the code, but work together figuring out how the code works and what to tackle. Let me know by email who you are working with.
What to submit: Submit the functions you changed significantly. Don't submit any of the original code, or code where you just adjusted some names or parameters. Document briefly what you changed and why in comments before your code, not intermixed.
One-at-a-Time Riddle Maker
The current code is all or none. It takes its lexical database and generates all possible riddles. More useful would be a tool that would let you easily specify constraints on a riddle, like "I want riddles involving pets" or "I want riddles with vegetables."
To do this, you need to break generate
up into smaller
functions, and you need to add some simple ontology capability to the database.
What to submit: Submit the functions you changed significantly, and the ontology code you added. Document briefly what you changed and why in comments before your code. Don't intermix code and comments.