Divide and Conquer

This article is a tutorial on how to analyse Divide and Conquer algorithms.

Startegy

The idea of Divide and Conquer is to solve the problem recursively on smaller versions of the problem. Analyzing Divide and Conquer algorithms always include the following steps.

  1. Understand the algorithm and how the recursion works.

    Example: The algorithm divides the problem into five subproblems of half the size, recursively solving each subproblem, and then combining the solutions in linear time.

  2. From this formulation, derive the time complexity recurrence.

    Example: $T(n) = 5 T \left( \frac n 2 \right) + O(n)$

  3. Draw the recursion tree to understand the recursion structure.

  4. From the tree, sum up the overhead that you spend at each node to find the final complexity $T(n)$: $$ T(n) = \sum_{\text{level } k = 0}^{\text{# levels}} (\text{number of nodes at level }k) \times (\text{overhead per node at level }k) $$

Fast exponentiation

Naive algorithm

To compute $a^n$, repeat $n$ times multiplying by $a$.

def NaiveMult(a, n):
  if n = 1: return a
  else: return NaiveMult(a, n-1) * a

If $T(n)$ is the number of multiplication needed to compute $a^n$, here $T(n) = O(n)$.

Efficient algorithm

The idea is to square $a^{n/2}$ to get $a^n$ (one multiplication) and when $n$ is odd, evaluate $a^{n-1}$ the same way and then multiply it by $a$ (one multiplication).

def FastMult(a, n):
  if n = 1: return a
  if a is even:
    b = FastMult(a, n/2)
    return b * b
  return a * FastMult(a, n-1)

Here again, let $T(n)$ be the number of multiplications. For the time analysis, we suppose without loss of generality that $n$ is power of 2. At each step $n$ is divided by 2. $$ T(n) = T \left( \frac n 2 \right) + 1 $$ Drawing the recursion tree gives us $$ T(n) = \sum_{k=0}^{\log n} 1 = O(\log n) $$

Tower of Hanoi

The goal of the game is to move the tower to another rod. You can move only one disk at a time, and you can't place a disk on top of a smaller one.

Recursive algorithm

To move a pile of $n$ disk to another rod, first recursively move a pile of $n-1$ to the last rod, then move the largest disk, then move back the $n-1$ on top of the largest.

def Hanoi(n, source, destination, last):
    if n > 1:
        Hanoi(n-1, source, last, destination)
        MoveOne(source, destination)
        Hanoi(n-1, last, destination, source)

Let's analyze the time complexity. Let $T(n)$ be the number of moves. Then the code gives us $$ T(n) = 2T(n-1) + 1 $$ Drawing the recursion tree allow us to conclude $$ T(n) = \sum_{k=0}^{n-1} 2^k \times 1 = \cdots = O(2^n) $$

Comparison of algorithms

You are product manager at Facebook and three engineers from your team come up with these three algorithms to detect fake Facebook accounts in the list of $n = 1$ billion Facebook accounts.

  • Rakesh solves problems by dividing them into five subproblems of half the size, recursively solving each subproblem, and then combining the solutions in linear time.

  • Chris solves problems of size $n$ by recursively solving two subproblems of size $n-1$ and then combining the solutions in constant time.

  • Matus solves problems of size $n$ by dividing them into nine subproblems of size $n=3$, recursively solving each subproblem, and then combining the solutions in $O(n^2)$ time

Examples from Algorithms, Dasgupta, Papadimitriou, and Vazirani.

Time analysis

  • Rakesh: $$ \displaystyle T(n) = 5T \left( \frac n 2 \right) + O(n) \to \text{(draw the tree)} \to \text{(sum the nodes)} \to T(n) = O\left(n^{\log 5}\right)$$

  • Chris: $$ \displaystyle T(n) = 2T(n-1) + C \to \text{(draw the tree)} \to \text{(sum the nodes)} \to T(n) = O\left(2^n\right)$$

  • Matus: $$ \displaystyle T(n) = 9T \left( \frac n 3 \right) + O(n^2) \to \text{(draw the tree)} \to \text{(sum the nodes)} \to T(n) = O\left(n^2 \log n\right)$$

Finding a fixed point

Given a sorted array $a[1...n]$, design an algorithm that finds if there is an element such that $a[i] = i$.
From Algorithms, Dasgupta, Papadimitriou, and Vazirani.

Naive algorithm

The naive approach is just to check all elements. The time complexity is linear.

def NaiveFind(a[1..n]):
  for i = 1...n:
    if a[i] = i: return True
  return False

Efficient algorithm

We didn't use the fact it is sorted in the previous idea. We look at the middle element of the array. Because the array is sorted, if the element is strictly larger than its index, then we know for sure that if an element being equal to its index exists, then it has to be in the right side of the array. Same idea for the opposite situation.

def EfficientFind(a[1...n]):
  if a is empty: return False
  i = n/2
  if a[i] = i: return True
  if a[i] > i: return EfficientFind(a[1...i-1])
  return EfficientFind(a[i+1...n])

At every step, we get rid of half the array, so we can guess it's going to be a logarithmic algorithmic algorithm. Let's check. Define $T(n)$ the number of comparison needed to answer the question in the worst case. Then $$ T(n) \leq T(n/2) + 2 $$

Drawing the recursive tree allow us to find $$ T(n) \leq \sum_{k=1}^{\log n} 1 \times 2 = \cdots = O(\log n) $$

Practice problem: Master Theorem

Suppose you have a recurrence of the form $$ T(n) = aT \left( \frac n b \right) + O(n^d)$$ with $a$, $b$ and $d$ constants. Draw the recurrence tree (numbering levels from zero) and answer the following questions.

  1. How many nodes are they at level 2?

  2. ... at some level $k$?

  3. What is the size input at level 2?

  4. ... at some level $k$?

  5. How many levels are there in the tree?

  6. What is the time overhead at each node of level 2?

  7. ... at each node of level $k$?

  8. Apply the formula given in the introduction and derive $$ T(n) = O \left( n^d \sum_{k=0}^{\log_b n} \left( \frac{a}{b^d} \right)^k \right) $$

  9. Use the geometric series formula $$ q^0 + q^1 + q^2 + \cdots + q^{m} = \frac{q^{m+1} - 1}{q-1} \quad \text{if } q \neq 1$$ to derive $$ T(n) = \left\{ \begin{array}{ll} O \left( n^d \log n \right) & \text{if } \frac{a}{b^d} - 1 = 0\\ O \left( n^d \right) & \text{if } \frac{a}{b^d} -1 < 0\\ O \left( n^d \left( \frac{a}{b^d} \right)^{\log_b n} \right) & \text{if } \frac{a}{b^d} -1 > 0 \end{array} \right. $$ Be sure to have the three cases.

  10. Prove using logarithmic and exponential rules this small property $$ \left( \frac{a}{b^d} \right)^{\log_b n} = n^{\log_b \left( \frac{a}{b^d} \right) } $$

  11. Use this property to derive the final form $$ T(n) = \left\{ \begin{array}{ll} O \left( n^d \log n \right) & \text{if } b^d = a\\ O \left( n^d \right) & \text{if } b^d > a\\ O \left( n^{\log_b a} \right) & \text{if } b^d < a \end{array} \right. $$

Quentin Pleplé
June 2013