Lexlog: Theoretical Computer Science: Divide and Conquer

Here, we develop results for divide and conquer algorithms.

The divide and conquer strategy consists of dividing a problem into subproblems until these become trivial to solve, then combining the results iteratively (bottom-up) to provide the solution to the original problem. The paradigm is very efficient for certain kinds of problems where that design makes sense: sorting (quicksort, mergesort), searching (binary search), multiplying (karatsuba, strassen), etc. This strategy is naturally implemented using recursion: if the problem is trivial, solve it, else make recursive calls to solve subproblems and combine their return values in a specific way.

The time complexity (function describing the amount of time the algorithm will take to terminate depending on the size of its input) of such algorithms is going to depend on the number of recursive calls, the time each of these recursive calls takes and the time complexity of the function that combines the return values of the recursive calls. Hence, the time complexity is going to take the form of a recurrence relation: for the trivial cases, the time is fixed, for the nontrivial cases, the running time given an input of size n will depend on the time given an input of size n' < n. To obtain a closed form (depending only on the size of the input) for the time complexity, we use the method of repeated substitution.

Since the validity of the obtained closed form relies on the assumptions made on k, there are several more steps to take in order to prove the result for all inputs: since we are interested in the time behaviour, we will actually end up using that closed form to set a bound on the time complexity. We can prove that T(n) is a non-decreasing function inductively, then use the closed form to set a lower bound, upper bound and/or tight bound on T. For convenience:

Recall asymptotic complexity bounds (big O notation):

Upper asymptotic bound

Tight asymptotic bound

Lower asymptotic bound

A generalisation of deriving asymptotic bounds for the recurrence relations arising as time complexities of divide and conquer algorithms is the Master Theorem. Given a relation of the form T(n) = aT(n/b) + f(n) for n bigger than a given threshold and f(n) in \Theta(n^d), where a constitutes the number of recursive calls, b is the dividing factor and f(n) represents the time complexity of the dividing and recombining procedure, the master theorem will provide a tight bound for T as follows:

if a < b^d then T(n) in \Theta(n^d)
if a = b^d then T(n) in \Theta(log(n)*n^d)
if a > b^d then T(n) in \Theta(n^log_b(a))

Notice that the theorem assumes a relation with a constant b, i.e. the subproblems must be roughly the same size. There is another generalisation of this procedure for deriving bounds for the time complexity of divide and conquer recurrences, that can be used when the subproblems have different sizes, it is called the Akra-Bazzi method.

Lexlog: Theoretical Computer Science

Wednesday, 7 November 2012

Divide and Conquer

No comments:

Post a Comment

About Me