Recursion
Recursive Functions
Definition 1. A function is recursive if it is recursive.
Definition 2. A function is recursive if it refers to itself.
It is probably easier to see some examples.
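For instance, here is a minimal sketch (an added example, not in the original notes) of the second definition in action: the factorial function refers to itself on a smaller input until it reaches a base case.
def factorial(n):
    """
    compute n! = n * (n-1) * ... * 1 recursively
    """
    if n <= 1: # base case: nothing left to multiply
        return 1
    return n * factorial(n - 1) # the function refers to itself

factorial(5)
120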
Divide and Conquer
A common situation in which recursion is used is in divide and conquer algorithms.
The basic idea is to break up a problem into two (or more) smaller sub-problems, solve each of them, and combine the answers to get the final solution.
Finding the maximum element of a list
One way to find the maximum element of a list A is to find the maximum element in the first half of the list, find the maximum element in the second half of the list, and then take the maximum of these two values.
How do we find the maximum element of each half? We can use this function recursively to break the problem into smaller and smaller pieces.
At some point, the pieces will be trivial to solve. In that case, we don't recurse further; we just solve the problem explicitly. For this problem, we'll solve directly whenever the list has length 3 or less, using the built-in max function:
max(1,2)
2
def maximum(A):
    """
    find the maximum element of a list A recursively
    """
    # check if the list is short enough to do explicitly
    if len(A) < 4:
        return max(A)
    # else, we will use recursion to solve the problem
    n = len(A) // 2       # partition the list into two halves here
    max1 = maximum(A[:n]) # left half
    max2 = maximum(A[n:]) # right half
    return max(max1, max2)
A = list(range(100))
maximum(A)
99
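As a quick sanity check (an added example), maximum does not rely on the list being in order; it agrees with the built-in max on a shuffled list:
import random

B = list(range(100))
random.shuffle(B) # scramble the order
maximum(B) == max(B)
True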
Mergesort
Mergesort is an example of a sorting algorithm. The input is a list a, and the output is a list b, which has the same elements as a but appearing in sorted order.
Python has the built-in sorted function to sort iterables like lists:
a = [2,3,1]
sorted(a)
[1, 2, 3]
There are a variety of sorting algorithms which use different techniques.
Merge sort uses the observation that if two arrays a1 and a2 are already sorted, then it is easy to merge them into a single sorted array in a single loop:
def merge(a1, a2):
    """
    merge sorted lists a1 and a2 into a sorted list b
    """
    b = []
    i1 = 0
    i2 = 0
    # insert the smallest element into b
    while (i1 < len(a1) and i2 < len(a2)):
        if a1[i1] < a2[i2]:
            b.append(a1[i1])
            i1 = i1 + 1
        else:
            b.append(a2[i2])
            i2 = i2 + 1
    # at most one of the following while-loops will do anything
    while (i1 < len(a1)):
        b.append(a1[i1])
        i1 = i1 + 1
    while (i2 < len(a2)):
        b.append(a2[i2])
        i2 = i2 + 1
    return b
merge([1,3,4], [0,2,5])
[0, 1, 2, 3, 4, 5]
The divide-and-conquer strategy is to start with an input list a, sort the left and right halves, and then merge the two halves to sort a. We can employ recursion by using merge sort itself to sort each half. By definition, a list with 1 or no elements is already sorted.
def mergesort(a):
    """
    sort a using merge-sort algorithm
    """
    # if a has 1 or zero elements, it is already sorted
    if len(a) < 2:
        return a
    k = len(a) // 2
    L = a[:k] # left half
    R = a[k:] # right half
    # recurse to sort L and R
    L = mergesort(L)
    R = mergesort(R)
    # now merge L and R
    a = merge(L, R)
    return a
mergesort([6,5,4,3,2,1,0])
[0, 1, 2, 3, 4, 5, 6]
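As an added check, duplicate elements are handled correctly as well:
mergesort([3, 1, 3, 1, 2])
[1, 1, 2, 3, 3]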
Quicksort
Quicksort was named by SIAM editors as one of the top-10 algorithms of the 20th century. Like mergesort, it also uses a divide and conquer strategy.
Quicksort works by partitioning a list into two parts separated by a pivot: all elements less than the pivot are moved before it, and all elements greater than the pivot are moved after it. Each part is then sorted recursively.
def swap(a, i, j):
    """
    swap elements i and j in-place in list a
    """
    a[i], a[j] = a[j], a[i]
a = [1,2,3]
print(a)
swap(a, 0, 2)
print(a)
[1, 2, 3]
[3, 2, 1]
from numpy.random import randint

def partition(a, lo, hi):
    """
    choose a pivot in a[lo:hi+1] randomly
    swap all elements of a[lo:hi+1] less than the pivot value to appear before the pivot
    swap all elements of a[lo:hi+1] greater than the pivot value to appear after the pivot
    """
    pi = randint(lo, hi+1) # pivot index
    swap(a, pi, hi)        # put the pivot in the last position
    pivot = a[hi]
    i = lo # a[lo:i] will hold the elements seen so far that are less than the pivot
    for j in range(lo, hi+1):
        if a[j] < pivot:
            swap(a, i, j)
            i = i + 1 # grow the less-than-pivot region
    swap(a, i, hi) # put pivot in correct place
    return i
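To see what partition guarantees (an added example; the pivot is chosen randomly, so the exact arrangement varies, but the invariant always holds): everything before the returned index is smaller than the pivot, and everything after it is larger.
a = [3, 8, 2, 5, 1, 4, 7, 6]
i = partition(a, 0, len(a) - 1)
all(x < a[i] for x in a[:i]) and all(x > a[i] for x in a[i+1:])
True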
def quicksort(a, lo=0, hi=None):
    """
    perform quicksort algorithm on list a
    performs operations in-place
    """
    if hi is None:
        hi = len(a) - 1
    if lo < hi:
        i = partition(a, lo, hi)
        quicksort(a, lo, i-1) # recurse on the part below the pivot
        quicksort(a, i+1, hi) # recurse on the part above the pivot
    return a
quicksort([6,5,4,3,2,1,0])
[0, 1, 2, 3, 4, 5, 6]
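Note that quicksort, unlike our mergesort, modifies its argument in-place (a small added demonstration):
a = [6, 5, 4, 3, 2, 1, 0]
quicksort(a)
print(a) # the list itself has been sorted
[0, 1, 2, 3, 4, 5, 6]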
Binary Search
Binary search seeks to find where an element x appears in a sorted list a. The divide and conquer idea is to look at the element i = n//2 at the center of the list. If x == a[i], then we're done. Otherwise, we recurse into either the left-hand or right-hand half of the list, depending on whether x < a[i] or x > a[i]. We should also handle the case where x does not appear in the list at all; in this case, we return -1.
def binary_search(a, x, lo=0, hi=None):
    """
    find the position of x in a sorted list a.
    If x is not in a, return -1
    """
    if hi is None:
        hi = len(a) - 1
    if lo > hi:
        return -1 # element not in list
    i = (hi - lo) // 2 + lo
    if a[i] == x:
        return i
    elif a[i] < x:
        return binary_search(a, x, i+1, hi) # recurse on right half
    else:
        return binary_search(a, x, lo, i-1) # recurse on left half
binary_search([0,1,2,3,4,5], 3)
3
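And if x does not appear in the list, we get -1 as promised (an added example):
binary_search([0, 1, 2, 3, 4, 5], 7)
-1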
Analysis of Recursive Algorithms
Recursive algorithms are usually easy to implement, but they aren't always fast. You'll see an example in homework 0.
On the other hand, mergesort has optimal asymptotic complexity for a comparison-based sorting algorithm: \(O(n\log n)\). If quicksort is very unlucky with its choice of pivot, it can take \(O(n^2)\) time, but its expected runtime is \(O(n \log n)\). Quicksort also does all of its operations in-place, unlike mergesort, which uses auxiliary memory.
The Master Theorem
The master theorem helps us to obtain asymptotic time complexity for divide and conquer algorithms. The first step is to write down the time it takes to run a function on an input of size \(n\), \(T(n)\), in terms of the time it takes to run the subproblems:
\(T(n) = a T(n/b) + f(n)\)
where \(f(n)\) is the work done to stitch the subproblems together.
The critical exponent is defined as \(c = \log_b a\). The amount of work to complete all trivial subproblems is \(\Theta(n^c)\). There are three regimes:
1. \(f(n) = o(n^c)\). In this case, the work to do the subproblems dominates, and the time complexity is \(T(n) = \Theta(n^c)\).
2. \(f(n) = \Theta(n^c)\). In this case, the work to do the subproblems and the work to combine them are comparable. The complexity is \(T(n) = \Theta(n^c \log n)\).
3. \(f(n) = \omega(n^c)\). In this case, the work to combine subproblems dominates, and the complexity is \(T(n) = \Theta(f(n))\).
For example, in our maximum function, we split into two subproblems of size \(n/2\), so \(a = 2\), \(b = 2\), and \(c = 1\). \(f(n)\) is the time to take the max of two numbers, so it is \(O(1)\). This means we are in case 1 above, and the time complexity of maximum is \(\Theta(n^c) = \Theta(n)\).
Analysis of Binary Search
Let's apply the Master theorem to our binary_search function. It takes a constant amount of work to compute the center index of the array a between lo and hi, and to perform the comparison of a[i] with x. We then recurse into (at most) one subproblem of half the size. So in the expression above, \(a = 1\), \(b = 2\), and \(f(n)\) is \(O(1)\). \(c = \log_2 1 = 0\), so \(n^c\) is \(O(1)\) and thus \(f(n) = \Theta(n^c)\). We can conclude that the time complexity of binary_search is \(\Theta(n^c \log n) = \Theta(\log n)\).
Exercise
Apply the master theorem to obtain the complexity of merge sort.
Answer: merge sort splits the list into \(a = 2\) subproblems of size \(n/2\), so \(b = 2\) and \(c = 1\); merging takes \(f(n) = \Theta(n) = \Theta(n^c)\) work, so case 2 gives \(T(n) = \Theta(n\log n)\).
Exercise
Implement the correlation coefficient defined in this paper.
import numpy as np
def newCorr(X_fun, Y_fun):
    n = int(1e3)
    XY_array = [(X_fun(i), Y_fun(i)) for i in range(n)]
    XY_array_sorted = sorted(XY_array)
    def R(s, i):
        # count elements of s whose second coordinate is <= that of s[i-1]
        cnt = 0
        for element in s:
            if element[1] <= s[i-1][1]:
                cnt = cnt + 1
        return cnt
    def L(s, i):
        # count elements of s whose second coordinate is >= that of s[i-1]
        cnt = 0
        for element in s:
            if element[1] >= s[i-1][1]:
                cnt = cnt + 1
        return cnt
    # corr2 = 1 - 3 * np.sum([np.abs(R(XY_array_sorted,i+1) - R(XY_array_sorted,i)) for i in range(1,n)]) / (n**2 - 1)
    corr = 1 - (n/2) * np.sum([np.abs(R(XY_array_sorted, i+1) - R(XY_array_sorted, i)) for i in range(1, n)]) \
        / sum([L(XY_array_sorted, i) * (n - L(XY_array_sorted, i)) for i in range(1, n+1)])
    return corr
def fun1(i):
    return i

def fun2(i):
    return i**2

def fun3(i):
    return np.log(i**2 + 5*i + 3) / (i + 1)

def fun4(i):
    return np.random.rand()

print(newCorr(fun1, fun1)) # Y is a measurable function of X
print(newCorr(fun1, fun2)) # Y is a measurable function of X
print(newCorr(fun1, fun3)) # Y is a measurable function of X
print(newCorr(fun1, fun4)) # X and Y are independent
print(newCorr(fun4, fun4)) # X and Y are independent
0.997002997002997
0.997002997002997
0.9970030149309796
0.027864027864027863
0.023376023376023403
Tail Recursion
A function is tail recursive if the recursive call is the last operation it performs, so there is nothing left to do when the call returns. Some languages turn tail recursion into a loop automatically; the Python interpreter doesn't handle this, but you can do the transformation yourself!
Memoization
Memoization caches the results of function calls so that each distinct subproblem is only solved once. The classic example is computing Fibonacci numbers, where a naive recursion recomputes the same values over and over.
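A sketch of both versions (added, since the notes leave this as a TODO): the naive recursion takes exponential time because fib(n-2) is recomputed inside fib(n-1), while the memoized version, here using functools.lru_cache as the cache, computes each value once.
from functools import lru_cache

def fib(n):
    """
    naive recursive Fibonacci - exponential time
    """
    if n < 2:
        return n
    return fib(n - 1) + fib(n - 2)

@lru_cache(maxsize=None)
def fib_memo(n):
    """
    memoized Fibonacci - each value is computed only once
    """
    if n < 2:
        return n
    return fib_memo(n - 1) + fib_memo(n - 2)

fib(25) == fib_memo(25)
True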