Data Structures and Algorithms
Lecture 2 – 18th August 2004
There are three major models of machines, distinguished by how they represent data in memory:
- Pointer machines – these machines represent data using pointers only; no arrays are defined in these machines.
- RAM model – these machines represent data as arrays. All the lectures assume that we are working with these machines.
- PRAM model – this model is used in parallel computing, and is similar to the RAM model.
Asymptotic Complexity
Big-Oh
Definition: O(f(n)) is the class of functions g: N → R for which there exist constants c > 0 and n0 such that g(n) ≤ c*f(n) for all n ≥ n0.
Big-Oh gives an asymptotic upper bound on a function.
Example 1: 5*n + 10 = O(n)
Proof: 5*n + 10 ≤ c*n for all n ≥ n0 where c = 6, n0 = 10
Hence, 5*n + 10 = O(n)
Example 2: 2*n^2 + 10*n + 7 = O(n^2)
Proof: 2*n^2 + 10*n + 7 ≤ c*n^2 for all n ≥ n0 where c = 3, n0 = 17
Hence, 2*n^2 + 10*n + 7 = O(n^2)
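The constants chosen in the two proofs can be spot-checked numerically. The following is a minimal sketch, not a proof: it only tests a finite range of n. The helper name `bound_holds` and the upper limit of 10,000 are assumptions made for illustration.

```python
def bound_holds(g, f, c, n0, upto=10_000):
    """Return True if g(n) <= c * f(n) for every integer n with n0 <= n <= upto."""
    return all(g(n) <= c * f(n) for n in range(n0, upto + 1))


# Example 1: 5n + 10 <= 6n for n >= 10
ex1 = bound_holds(lambda n: 5 * n + 10, lambda n: n, c=6, n0=10)

# Example 2: 2n^2 + 10n + 7 <= 3n^2 for n >= 17
ex2 = bound_holds(lambda n: 2 * n**2 + 10 * n + 7, lambda n: n**2, c=3, n0=17)

# Starting Example 1 one step too early fails: 5*9 + 10 = 55 > 54 = 6*9
too_early = bound_holds(lambda n: 5 * n + 10, lambda n: n, c=6, n0=9)

print(ex1, ex2, too_early)
```

The third call shows why n0 matters: with c = 6 the inequality of Example 1 does not yet hold at n = 9.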
Big-Omega
Definition: Ω(f(n)) is the set of functions g: N → R for which there exist constants c > 0 and n0 such that g(n) ≥ c*f(n) for all n ≥ n0.
Big-Omega gives an asymptotic lower bound on a function.
Example 1: 5*n - 1000√n = Ω(n)
Proof: Let c = 1 (any positive constant less than 5 would work)
5*n - 1000√n ≥ c*n
5*n - 1000√n ≥ n
4*n ≥ 1000√n
n ≥ 250√n
√n ≥ 250
n ≥ 62500
Hence, 5*n - 1000√n ≥ c*n for all n ≥ n0 for c = 1, n0 = 62500
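The threshold n0 = 62500 can be confirmed with exact integer arithmetic. This is a hedged sketch: the function name `omega_bound_holds` is hypothetical, and the inequality is squared (valid because both sides are non-negative when c < 5) to avoid floating-point square roots.

```python
def omega_bound_holds(n, c=1):
    """Check 5n - 1000*sqrt(n) >= c*n exactly, for an integer n >= 0 and c < 5.

    Rearranged: (5 - c) * n >= 1000 * sqrt(n). Both sides are non-negative,
    so squaring preserves the inequality.
    """
    return ((5 - c) * n) ** 2 >= 1000**2 * n


# Smallest positive n for which the bound holds with c = 1
first_ok = next(n for n in range(1, 100_000) if omega_bound_holds(n))
print(first_ok)
```

With c = 1 the squared inequality reduces to 16*n ≥ 1000000, which matches the n ≥ 62500 obtained in the proof.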
Theta
Definition: If f(n) = O(g(n)) and f(n) = Ω(g(n)) then f(n) = Θ(g(n))
Sorting Algorithms
Insertion Sort
Example: If the input data is [9, 7, 2, 3], the output array is built up as follows:
Output data: []            // initially the output array is empty
Output data: [9]           // we insert the 1st input
Output data: [7, 9]        // the 2nd input is smaller than the 1st input, so the 1st input is moved and the 2nd input is inserted in its place
Output data: [2, 7, 9]
Output data: [2, 3, 7, 9]  // the completely sorted array
Pseudo Code:
procedure INSERTION_SORT(A, n)    // A is the input array with n elements
1. B = []                         // B is the output array with n elements; we initialize it to be empty
2. for i = 1 to n
3.     j = 1
4.     while (j ≤ i-1) and (B[j] < A[i])
5.         do j = j + 1
6.     for k = i-1 down to j
7.         do B[k+1] = B[k]
8.     B[j] = A[i]                // put A[i] in B[]
9. return B
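The pseudo code can be transcribed into Python almost line by line. This is a sketch under two assumptions that are not in the notes: indices are 0-based, and the output list is grown by one slot before each shift because a Python list has no preallocated cells.

```python
def insertion_sort(a):
    """Return a new sorted list, building the output b one element at a time."""
    b = []
    for i in range(len(a)):
        # Scan from the left for the first element of b that is not smaller than a[i].
        j = 0
        while j < i and b[j] < a[i]:
            j += 1
        # Make room: add one slot, then shift b[j..i-1] one place to the right.
        b.append(None)
        for k in range(i - 1, j - 1, -1):
            b[k + 1] = b[k]
        b[j] = a[i]
    return b


print(insertion_sort([9, 7, 2, 3]))
```

Calling it on the example input [9, 7, 2, 3] passes through the same intermediate outputs as listed in the example.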
Insertion Sort analysis
By looking at the pseudo code, we can see that line no.1 and line no.9 are executed once, while line no.2, 3 and 8 are executed n times.
Each of the other lines (no.4, 5, 6 and 7) runs at most 1+2+3+…+n = n(n+1)/2 times.
An upper bound on the total time for this pseudo code is therefore:
1 + n + n + n*(n+1)/2 + n*(n+1)/2 + n*(n+1)/2 + n*(n+1)/2 + n + 1
= 2 + 3*n + 2*n*(n+1)
= 2 + 3*n + 2*n^2 + 2*n
= 2*n^2 + 5*n + 2
Hence, total time ≤ 2*n^2 + 5*n + 2 = O(n^2)
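The worst case for this algorithm is when the data is presented in descending order, since every insertion then shifts all the elements already in B. The best case is when the data is already sorted in ascending order, since no element is ever shifted.
Even in the best case, however, this algorithm takes Ω(n^2): the while loop then scans past all i-1 elements of B before A[i] is placed. In general, the scan (line no.4 and 5) and the shift (line no.6 and 7) together always handle i-1 elements for the i-th input.
Hence, we can conclude that Insertion Sort, as written here, takes Θ(n^2) time.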
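The Θ(n^2) claim can be illustrated by counting the inner-loop steps of the INSERTION_SORT pseudo code. The sketch below is an instrumented variant, not part of the lecture; the name `count_inner_steps` is hypothetical, and Python's `list.insert` stands in for the explicit shift loop.

```python
def count_inner_steps(a):
    """Count scan steps plus shift steps performed while insertion-sorting a."""
    b = []
    steps = 0
    for i, x in enumerate(a):
        # Scan: one step for every element of b that is smaller than x.
        j = 0
        while j < i and b[j] < x:
            j += 1
            steps += 1
        # Shift: list.insert moves the i - j elements b[j..i-1] one place right.
        b.insert(j, x)
        steps += i - j
    return steps


n = 8
ascending = count_inner_steps(list(range(n)))
descending = count_inner_steps(list(range(n, 0, -1)))
print(ascending, descending, n * (n - 1) // 2)
```

Ascending input spends all of its steps scanning and descending input spends them all shifting, yet both totals equal n(n-1)/2.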
Merge Sort
Example:
Input array:                                    [1, 5, 3, 6, 4, 2, 7, 8]
Split the array into two equal parts:           [1, 5, 3, 6]  [4, 2, 7, 8]
Sort the parts:                                 [1, 3, 5, 6]  [2, 4, 7, 8]
Merge them together to form the output array:   [1, 2, 3, 4, 5, 6, 7, 8]
Pseudo code:
MERGE_SORT(A, n)                  // Input array A with n elements
    if n = 1 return A             // if only one element, then the array is sorted
    L ← A[1, 2, …, n/2]           // divide the array into left and right, with equal numbers of elements
    R ← A[1+n/2, 2+n/2, …, n]
    L’ ← MERGE_SORT(L, n/2)
    R’ ← MERGE_SORT(R, n/2)
    A’ ← MERGE(L’, R’)            // MERGE is a routine defined elsewhere
    return A’                     // A’ contains the sorted array
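A runnable sketch of the pseudo code follows. The notes leave MERGE undefined, so the `merge` function here is an assumption: a standard two-pointer merge of two sorted lists. The sketch also accepts lengths that are not powers of two by splitting at `len(a) // 2`, which the pseudo code does not address.

```python
def merge(left, right):
    """Merge two sorted lists into one sorted list in O(len(left) + len(right)) time."""
    out = []
    i = j = 0
    while i < len(left) and j < len(right):
        # Take the smaller front element; <= keeps equal elements in their original order.
        if left[i] <= right[j]:
            out.append(left[i])
            i += 1
        else:
            out.append(right[j])
            j += 1
    # At most one of the two lists still has elements left over.
    out.extend(left[i:])
    out.extend(right[j:])
    return out


def merge_sort(a):
    """Return a new sorted list using the split / sort / merge scheme."""
    if len(a) <= 1:
        return list(a)
    mid = len(a) // 2
    return merge(merge_sort(a[:mid]), merge_sort(a[mid:]))


print(merge_sort([1, 5, 3, 6, 4, 2, 7, 8]))
```

The call uses the input array from the example, so the top-level merge combines [1, 3, 5, 6] and [2, 4, 7, 8].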
Merge Sort analysis using the recursion tree
Recursion tree (each node shows the size of a subproblem; the MERGE algorithm takes that much time on it):
                    n                      MERGE takes n time
           n/2             n/2             n/2 time each
       n/4     n/4     n/4     n/4         n/4 time each
        .       .       .       .
        .       .       .       .
The number of levels of recursion is approx. log(n), or O(log(n)).
Adding up the time taken by the MERGE algorithm across each level of the tree, we get
Total time for MERGE at one level = n = 2*(n/2) = 4*(n/4) = … = O(n)
The total time is the number of recursion levels multiplied by the time taken to merge the sorted arrays at each level = O(log(n)) * O(n).
Hence, the total time taken for MERGE_SORT = O(n*log(n))
Merge Sort analysis by using Mathematical Induction
Let T(n) be the running time of Merge Sort with an input of size n
Then T(1) = O(1)
and T(n) ≤ c’*n + 2*T(n/2)
// c’ is some constant
// c’*n is the time taken for MERGE
We need to prove that T(n) ≤ c*n*log2(n) for all n ≥ 2 and some constant c, where c is chosen large enough that the bound holds at n = 2 (it cannot hold at n = 1, since log2(1) = 0)
Induction hypothesis: assume that T(n/2) ≤ c*(n/2)*log2(n/2) is true
Then T(n) ≤ c’*n + 2*T(n/2)
≤ c’*n + 2*c*(n/2)*log2(n/2)
= c’*n + c*n*(log2(n) – 1)
// log(a/b) = log a – log b; log2(2) = 1
= c’*n + c*n*log2(n) – c*n
≤ c*n*log2(n)
if c ≥ c’
Therefore, T(n) = O(n*log(n))
Hence, the claim holds by induction.
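The recurrence can also be evaluated directly and compared against the bound. This is an illustrative sketch only: it assumes T(1) = 1, c’ = 1, c = 2, and inputs that are powers of two, none of which are fixed by the notes.

```python
from math import log2


def t(n, c_prime=1):
    """Evaluate T(1) = 1 and T(n) = c'*n + 2*T(n/2) for n a power of two."""
    if n == 1:
        return 1
    return c_prime * n + 2 * t(n // 2, c_prime)


c = 2  # assumed constant for the bound c*n*log2(n), chosen so that c >= c'
checks = [(n, t(n), c * n * log2(n)) for n in (2**k for k in range(1, 16))]
bound_ok = all(value <= bound for _, value, bound in checks)
print(bound_ok)
```

Under these assumptions the recurrence solves to T(n) = n*log2(n) + n, which stays below 2*n*log2(n) for every n ≥ 2.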