Optimizing recursion — 1 Link to heading

Recursion is a fundamental concept not only in computer science but also in mathematics. Recursion solves a problem by solving a smaller instance of the same problem. In theory, recursion is an elegant solution, but in practice, at least in computer science and software engineering, a recursive solution typically suffers a performance hit and risks stack overflow. Today, let’s take a look at how one can implement a recursive solution without suffering from these issues. We will write the code in Haskell.

Let’s look at a classic recursion problem: factorial. factorial n is simply the product of all the integers from 1 to n.

factorial :: Integer -> Integer
factorial n =
  if n <= 1
    then 1                      -- base case
    else factorial (n - 1) * n  -- recursive case

main :: IO ()
main = print (factorial 10)

The problems with a recursive implementation, as described above, are that it is slow and that it can cause a stack overflow.

  • It is slower than an iterative solution due to function call overhead. Every time a function is called, the computer needs to save the caller’s state onto the stack and set up a new stack frame. At return time, it also needs to tear down that frame and restore the previous state. That is a lot of work, and it adds up when the function is called thousands of times.
  • A more serious issue is that it may cause a stack overflow if the recursion depth exceeds the stack capacity, which aborts the program. In some applications, the recursion depth may well exceed a million, which is likely to crash.

Now, there is an optimization technique most compilers are capable of: converting certain kinds of recursive functions into iterative implementations. This lets developers write the solution in an elegant recursive form, while the compiler takes care of converting it into a not-so-elegant but efficient and safe iterative implementation, resolving all of the issues above. Unfortunately, this can only be done for a restricted class of recursive functions, called tail-recursive functions.
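To get a feel for what the compiler is aiming at, here is a rough sketch of the iterative shape of factorial, written by hand as a strict left fold over the numbers 1 to n (the name factorialIter is mine, not part of the article):

```haskell
import Data.List (foldl')

-- A sketch of the iterative equivalent: a loop that multiplies
-- a running product by each number from 1 to n, with no call stack growth.
factorialIter :: Integer -> Integer
factorialIter n = foldl' (*) 1 [1 .. n]

main :: IO ()
main = print (factorialIter 10)  -- 3628800
```

A strict left fold is essentially a loop with an accumulator, which is the same shape the tail-call optimization described below produces.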

Tail recursion Link to heading

Tail recursion is a specific kind of recursion where the recursive call is the last operation in the function. This means that the function doesn’t need to do anything after the recursive call returns, and as a result, the current function’s stack frame can be reused for the recursive call. This optimization is known as tail call optimization (TCO), and it can significantly reduce the stack space used by the function.

Now, let’s go back to our factorial implementation. Is it tail recursive? The last line is factorial (n - 1) * n, which calls the function recursively but also performs a multiplication after the result is obtained. The last operation is therefore the multiplication, not the recursive call, so this is not tail recursion.

How to Tail-Recurse? Link to heading

How do we then convert this to tail recursion? There is no generic way to make an arbitrary recursive call tail recursive. However, we can use clever tricks for certain operations, and the factorial function is a good example where such a trick applies.

Let’s first expand the factorial expression, say for n = 5.

factorial(5) = 1*2*3*4*5 = (((1*2)*3)*4)*5

Now, this by itself can’t be tail-call optimized because the first operation performed (the innermost parentheses) is 1*2. Here, 1 comes from factorial(1), the deepest recursive call we make (the base case). That is, we must step into the recursive calls until we reach the base case, then step back out one level at a time, performing a multiplication at each step. This is why we can’t free the stack frames within our recursion tree: work remains to be done after each recursive call returns.
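To make the pending work concrete, here is the expansion of the original definition for n = 5, written as comments above a runnable copy of that definition. Each line that still contains a pending multiplication corresponds to a stack frame that cannot be released:

```haskell
-- Expanding factorial 5 with the original definition:
--
--   factorial 5
--   = factorial 4 * 5                  -- frame must remember "* 5"
--   = (factorial 3 * 4) * 5            -- frames remember "* 4", "* 5"
--   = ((factorial 2 * 3) * 4) * 5
--   = (((factorial 1 * 2) * 3) * 4) * 5
--   = (((1 * 2) * 3) * 4) * 5          -- base case reached; now unwind
--   = 120
factorial :: Integer -> Integer
factorial n =
  if n <= 1
    then 1
    else factorial (n - 1) * n

main :: IO ()
main = print (factorial 5)  -- 120
```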

The clever trick is to reverse the order of computation, recalling that multiplication is an associative operation.

factorial(5) = 1*(2*(3*(4*5)))

Now we do the multiplication first and make the recursive call afterwards. Note that the multiplication by the base case’s 1 happens at the very end. This allows us to implement tail recursion by passing an accumulator along with each call.

tailRecursiveHelper :: Integer -> Integer -> Integer
tailRecursiveHelper n acc =
  if n <= 1
    then acc                                -- base case: return the accumulated product
    else tailRecursiveHelper (n - 1) (n * acc)

factorial :: Integer -> Integer
factorial n = tailRecursiveHelper n 1

main :: IO ()
main = print (factorial 10)

The last operation of the recursion is tailRecursiveHelper (n - 1) (n * acc), which is the recursive call itself. Hence, there is no need to save the current stack frame; it can simply be reused. Most compilers should be able to tail-optimize the above code into an iterative implementation.
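One Haskell-specific caveat worth noting (this is my addition, not from the article): because of lazy evaluation, the accumulator n * acc may pile up as unevaluated thunks even though the call is in tail position. GHC’s strictness analysis usually eliminates this at -O, but forcing the accumulator explicitly makes the function robust regardless of optimization level. A sketch using the BangPatterns extension (the primed name is mine):

```haskell
{-# LANGUAGE BangPatterns #-}

-- Strict-accumulator variant: the bang pattern forces acc at each call,
-- so no thunk chain builds up alongside the reused stack frame.
tailRecursiveHelper' :: Integer -> Integer -> Integer
tailRecursiveHelper' n !acc =
  if n <= 1
    then acc
    else tailRecursiveHelper' (n - 1) (n * acc)

factorial' :: Integer -> Integer
factorial' n = tailRecursiveHelper' n 1

main :: IO ()
main = print (factorial' 10)  -- 3628800
```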

This is all well and good, but the factorial example we used is perhaps too trivial. In fact, most compilers are already capable of transforming this particular non-tail-recursive function into a tail-recursive one on their own. In the next article, we will explore a slightly more complex recursive function that the compiler can’t tail-optimize without our explicit help, and we will compare performance before and after the optimization.