Computer science, programming, and whatnot.


April 10, 2016 back to posts

So, what is recursion1 anyways?

Mathematical Induction

Explaining recursion might be easier if we understand mathematical induction. Indunction is a proof--technique, which works somewhat like this2:

  1. Show that we can get from one state to the next state
  2. Show that we have an initial state
  3. We can now get to any state, after the initial state.

The concept can be visalized as climbing a staircase. If we are on an arbitrary step, we know how to get to the next step. We also know how to get to the 0th step, which (i guess) is the ground. From this, we conclude that we can get to step n, for any positive n.

What does this have to do with recursion? Well, recursion is kind of the opposite. With recursion, we need the following:

  1. The problem can be solved by first solving a smaller instance of the same problem
  2. When the input is small enough, it is trivial to solve
  3. We can now solve an instance of any size, simply by making the problem smaller enough times.

Recursion is heavily used in functional programming, so if you are familiar with, say a Lisp, or Haskell, this might be familiar.

For instance, we can find the length of a list using recursion: the length of a list is one more then the length of the same list, with the first element removed, and the length of an empty list is 0.

num list_length(List)
    if List == []
        return 0
    let head be first element
    let tail be rest of list
    return 1 + list_length(tail)

Or, in actual python:

def list_len(lst):
  if lst == []:
    return 0
  return 1 + list_len(lst[1:])

A Not So Straight Forward Recursion

Let's say we're trying to sort a list. We could to a lot of different things, like keeping a sorted list, and inserting one new element at a time, or trying to build a data structure, with which extracting a sorted list is trivial3. But we'll try something else. We can begin with the observation that if we split the list in two, and somehow manage to sort the two list separately, we can merge the lists, pretty easily: just take out the smallest element of the two in front of the lists, and push it at the end onto a new list. Then the new list will consist of all of the elements in sorted order, because we allways took the smaller element.

So great, we just found a way to sort a list... except we didn't, because in order for our algorithm to work, we need an additional algorithm to sort the two lists. But, do we really? The algorithm we just made is itself a sorting algorithm! What happends if we try to use the algorithm itself?

list merge_sort(List)
    split List at middle into A, B
    return merge(A, B)

Does this work? What happends if List is empty? Or contains one element? Actually, when the length of the list is less than two, the list is already sorted (if we can say that an empty list is sorted).

list merge_sort(List)
    if List.length < 2
        return List
    split List at middle into A, B
    return merge(A, B)

Ok, we are (perhaps) starting to build up confidence in our new, albeit weird, algorithm. But, this call to the function itself looks a little dangerous. How can we know if it will end?

We first observe that if the length of the list is less than two, we return the list, so the algorithm returns from that call. If the length of the list is larger or equal to two, we split the list at the middle, and call the parts A and B. But the length of both A and B is the half of List (or something very close to half). Neither of them can possibly be larger, or even of equal length. Hence, we know that the algorithm calls itself, but with a smaller input. Eventually, when the input it small enough --- containing less than two elements --- we return. Hence, no matter what size the input is, the algorithm always terminates!

Now, this result is kind of amazing. We have just constructed a sorting algorithm, kind of without even thinking about the problem of sorting! By assuming our own algorithm actually works, we gained confidence that the same algorithm works!

The merge step looks a bit scary, though4. Maybe we can try something similar, but without having to merge the lists.

Ok, what if we sort the list just a little bit, so that we can make the recursion work. For instance, we can take a element from the list and then split, or partition, the list in two, based on that one element, such that all elements in the first list are less than or equal to our selected element, and all elements in the last list are greater than it. At the end, we simply put them together, and put our special element, the pivot, in betweeen.

list sort(List)
    if List.length < 2
        return List
    select e from List
    make A such that a in A --> a < e
    make B such that b in B --> b > e
    return A + [e] + B

This might look more like sorting, because we partition the list into two, based on the pivot. But one would think there is still lots to do. There is not. This python function implements this algorithm, quick-sort.

def sort(lst):
  if len(lst) < 2:
    return lst
  pivot = lst.pop()
  A = [a for a in lst if a <= pivot]
  B = [b for b in lst if b <= pivot]
  return sort(A) + [pivot] + sort(B)

and this is the haskell version:

sort (x:xs) = l ++ [x] ++ g
            where l = sort [a | a<-xs, a <= x]
                  g = sort [b | b<-xs, b > x]
sort l = l

Recursion Gone Wrong

Recursion is an amazing tool, and sometimes it works out just right. However, there are multiple things that can go wrong5.

Make Sure It Ends

It is very easy to get burnt on this, either by looping forever6:

def fac(n):
    if n == 1:
        return 1
    return n * fac(n - 1)

# fac(-2) loops infinitely

or incorrectly handling, or even forgetting, the base case7, and result in a crash:

sort [x] = [x]
sort (x:xs) = l ++ [x] ++ g
            where l = sort [a | a<-xs, a <= x]
                  g = sort [b | b<-xs, b > x]
-- Non-exhaustive patterns:
-- the empty list case is not handled

As mentioned here1, making sure the recursion ends is tricky.

If we have multiple recursive calls it is also easy to do one thing over and over using recursion. The classical example here is the recursive fibonacci function8, which runs in exponential time. Try to run fib(35)9.

def fib(n):
    if n < 3:
        return 1
    return fib(n-1) + fib(n-2)

A solution to this is memoization10, where we save intermediate results, and look up values instead of recomputing them. Of course, in the case on calculating fibonacci numbers, this is still worse than an iterative solution, because we are using more memory. This is not to say that it is not possible to write a equally good implementation, while still using recursion; we just have to go upwards, instead of downwards:

def fib(n):
    def go(current, previous, a):
        if a == n:
            return current
        return go(current + previous, current, a + 1)
    if n < 3:
        return 1
    return go(1, 0, 1)

At this point, one could argue that this is basically a loop11, so there is no point in using recursion; we might be better off with simply writing

def fib(n):
    current = 1
    previous = 0
    for _ in range(n - 1):
      current, previous =  current + previous, current
    return current

Hopefully, we have gained a little insight in recursion --- both its magic and its dangers. While there might be tricky to get recursion right, actually getting it right is too much fun to miss out on.


  1. While trying to make the Recursion: See Recursion joke, the static site generator ended up crashing due to a never ending recursion so that the call stack became too large. 2

  2. This is not a rigid, and probably not a very good, intro to induction --- but that is fine, because this post is about recursion.

  3. This is my personal favorite.

  4. It's not, but it is easy to mess up the merge routine.

  5. These problems aren't unique to recursion --- both infinite loops and horrible running times are naturally possible without using recursion.

  6. You could argue that the factorial of a negative number is not well defined, so that calling fac(-2) does not make any sense. However, defining the factorial of a negative number is useful in many situations.

  7. This actually was the version I wrote when trying to write the correct version above.

  8. We will be using the same definition as on wikipedia, where fib(1) = fib(2) = 1.

  9. ... or fib(50), if you have a lot of time on your hands.

  10. Note that there is no r in memoization.

  11. The function is Tail--Recursive, so in many languages this would be transformed by the compiler to a loop. However, Python is not one of those languages.

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License