EECS 311: Dynamic Programming Resources

Dynamic Programming

As many authors note, DP is an approach, not an algorithm. At a high level (so high as to be almost useless), dynamic programming (DP) can be viewed in two different ways:

The first view is helpful when you have a recursive but exponential solution, e.g., the standard recursive function for calculating Fibonacci numbers, and you want to create a more efficient solution.

Weiss discusses several examples of DP in Chapter 10, Section 10.3, but never actually defines what DP is.

Here are some readable online resources that describe DP solutions to various problems. I deliberately avoided the more mathematical sites:

Memoization

Memoization is a classic programming technique, easy to implement in some languages, like Lisp and Python, a little harder in strongly typed languages like C++. Memoization is partularly relevant to dynamic programming, because it's an alternative solution to the repeated subproblem problem (not a typo) that comes up when fn(n) requires combining the values for fn(k) where for some k < n. To avoid recalculating fn(k) over and over:

Some resources on this:

A Memoization Class

To make it simpler to define a memoized recursive function, I created some templated classes, in memoize.h. These were designed to it pretty simple to convert normal recursive definitions of unary and binary functions to a memoized ones. For example, here's the classic recursive Fibonacci function:

long fib( int n )
{
  return ( n == 0 || n == 1 ) ? 1 : fib( n - 1 ) + fib( n - 2 );
}

Here's the same function, memoized. You call it the same way, e.g., (fib 20).

class Fib : public UnaryMemoFunction<int, long>
{
protected:
  long call( int n )
  {
    return ( n == 0 || n == 1 ) ? 1 : memo( n - 1 ) + memo( n - 2 );
  }

} fib;

A subclass of UnaryMemoFunction must define call() to do the actual calculations. Any recursive calls should call memo() to cache the values returned.

The memoized function can be called just like the recursive function. In addition, fib.size() will return how many calls have been stored in the cache, and fib.clear() will empty the cache.

Here's an example of the binary recursive function for Pascal's Triangle (PDF):

long pascal( int r, int c )
{
  return ( r == 1 || c == 1 ) ? 1 : pascal( r, c - 1 ) + pascal( r - 1, c );
}

Here's the memoized version:

class Pascal : public BinaryMemoFunction<int, int, long>
{
protected:
  long call( int r, int c )
  {
  return ( r == 1 || c == 1 ) ? 1 : memo( r, c - 1 ) + memo( r - 1, c );
  }

} pascal;

While a memoized function will not compete with optimized hand-coded algorithms, there are still huge speedups possible with intensively recursive functions like these. Here are the timings on my desktop computer, using gcc 4.2, for fib(40) and pascal(20):

Fibonacci
40 = 165580141 2078 ticks (recursive) 0 ticks (memoized)


Pascal's Triangle
16,16 = 155117520 2328 ticks (recursive) 0 ticks (memoized)

Finally, here's a classic dynamic programming problem: Longest Common Subsequence. An example test case:

std::string x = "ABDCBABC";
std::string y = "ADBCDBAB";
CHECK_EQUAL( "ABDBAB", lcs( x, y ) );

"ABDBAB" is one of several solutions. Can you find the others?

The recursive code for this is:

std::string lcs( std::string x, std::string y )
{
  if ( x.empty() || y.empty() ) {
    return "";
  }
  else {
    char xc = lastChar( x );
    char yc = lastChar( y );
    return xc == yc ? lcs( prefix( x ), prefix( y ) ) + xc
     : longer( lcs( prefix(x ), y ), lcs( x, prefix( y ) ) );
  }
}

This uses some easily defined utility subfunctions for getting the last character in a string, the "prefix" of a string, which means all the characters except the last one, and the longer of two strings.

The memoized version is:

class LCS : public BinaryMemoFunction<std::string, std::string, std::string>
{
protected:
  std::string call( std::string x, std::string y )
  {
    if ( x.empty() || y.empty() ) {
      return "";
    }
    else {
      char xc = lastChar( x );
      char yc = lastChar( y );
      return xc == yc ? memo( prefix( x ), prefix( y ) ) + xc
       : longer( memo( prefix(x ), y ), memo( x, prefix( y ) ) );
    }
  }
} mlcs;

Running these two implementations on randomly generated DNA-like data yields the timings below. As can be seen, the problem becomes expensive very fast. There is also a very wide variation in running times for strings of the same length. This problem slows down so quickly that much more sophisticated algorithms are needed to handle the long strings that come up in real DNA.


Comments? comment icon Send mail to c-riesbeck@northwestern.edu.

Valid HTML 4.01 Strict