Lecture 2 - Computer Science

Mathematical Preliminaries and Proof Techniques
In previous classes you have seen summations, logarithms, exponents,
combinations and permutations, proof by example, proof by
contradiction, and induction. Thus, for this lecture I will only
cover loop invariants and probability, which you may not have
had in prior computer science classes. (Also, I'll go over
enough information on graphs so you can start working on the
homework problem.)
Loop Invariant
Often, to prove that an algorithm works properly, we can
prove that a loop invariant holds. A loop invariant is a
statement that is true throughout the duration of the execution
of a loop. Consider the following example:
int factorial(int n) {
    int i=1, fact=1;
    while (i < n) {
        i = i+1;
        fact = fact*i;
    }
    return fact;
}
We will prove that at the beginning of every loop iteration,
fact = i!
We will use induction to prove this. Before the first loop
iteration starts, i = 1 and fact = 1 = 1!, so the base case holds.
We will assume that at the beginning of the kth iteration, where
k < n, fact = k!
We must prove that at the beginning of the (k+1)th iteration,
fact = (k+1)!
We know that at the beginning of the kth iteration, fact = k! and
i = k. On this iteration, the first line of code changes
the value of the variable i to k+1. Then, the second line changes
fact to k!*(k+1) = (k+1)! Finally, we return to the top of the
loop, proving that the value of fact at the beginning of the
(k+1)th iteration is indeed (k+1)! as desired.
The induction then shows that the value of fact at the end of
the loop will simply be n! since the loop ends when i = n.
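As a quick sanity check of this invariant, we can instrument the loop with an assertion that fact = i! holds at the top of every iteration. This is just a sketch: the class name, the helper method, and the use of Java asserts (enabled with java -ea) are my own choices for illustration.

```java
public class FactorialInvariant {
    // Returns n! while asserting the loop invariant fact == i!
    // at the top of every iteration.
    public static long factorial(int n) {
        long i = 1, fact = 1;
        while (i < n) {
            // The invariant: fact equals i! here, every time around.
            assert fact == slowFactorial(i) : "invariant violated at i=" + i;
            i = i + 1;
            fact = fact * i;
        }
        return fact;
    }

    // Straightforward reference computation of k!, used only to check the invariant.
    private static long slowFactorial(long k) {
        long f = 1;
        for (long j = 2; j <= k; j++) f *= j;
        return f;
    }
}
```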
I have taken a couple of "short-cuts" in the proof above.
Typically, rigorous proofs of correctness of algorithms are quite
tedious. I'll expect you to give me detailed proofs, but none
quite as detailed as the one above. To show the above
algorithm is correct in a homework assignment, I would not
require induction. Rather, it would be good enough to point out
that i increments each time and that fact is the product of each
positive integer up to n. (You should show that there's no
off-by-one error.)
For practice, try to find a loop invariant in the pseudocode
below, which determines the minimum value in an array A.
min = A[0];
for (i=1; i<A.length; i++) {
    if (A[i] < min)
        min = A[i];
}
return min;
Probability
The probability, or likelihood, of an event is defined as the total
number of successes (outcomes in which the event occurs)
divided by the size of the sample space (the total number of
possible outcomes). For example, the probability of rolling
an even number on a standard six-sided die is 3/6 = ½ because
there are six possible outcomes (1,2,3,4,5,6) of which three are
even. It is IMPORTANT to note that each of the outcomes in
the sample space MUST be equally probable for this definition
to be valid. For example, if rolling a 1 was 5 times more likely
than each of a 2, 3, 4, 5, or 6, then ½ would not be the answer
to the question above. We can denote the probability of an
event A occurring as p(A).
Some probability rules:
1) The sum of the probabilities of all possible outcomes is
always 1. (The outcomes must be disjoint.)
2) The probability of any event is between 0 and 1,
inclusive.
3) If two events A and B are disjoint, then the probability of
either event occurring is the sum of the probability of A
occurring and of B occurring. Symbolically: if
p(A∩B) = 0, then p(A∪B) = p(A)+p(B).
4) If two events are independent, meaning that one event
does not affect the probability of the other occurring, such
as two consecutive flips of a fair coin, then the probability
of both occurring is the product of the probability of each
occurring. Symbolically, p(A∩B) = p(A)p(B), for
independent events A and B.
5) A conditional probability, written p(A|B), is defined as
the probability of an event A occurring given that event B
has occurred: p(A|B) = p(A∩B)/p(B). In essence, the
numerator accounts for all outcomes where both A and B
occur, but the sample space is confined to the
outcomes where B occurs, which has probability p(B).
6) The inclusion-exclusion principle holds: p(A∪B) =
p(A)+p(B) - p(A∩B).
7) Bayes' Law for conditional probabilities is p(A|B) =
p(A)p(B|A)/p(B). This can be derived from the definition
of conditional probability applied both ways: p(A|B) =
p(A∩B)/p(B) and p(B|A) = p(B∩A)/p(A).
Random Variables
We can model events through random variables. A random
variable is one that takes on various values with various
probabilities. For example, a random variable X that models a
die roll is equal to 1 one-sixth of the time, to 2 one-sixth of the
time, etc. Given a random variable X, we define its expected
value as follows:
E(X) = Σ x·P(X=x), where the sum is over all possible values x
of the random variable X. For example, for an
unbiased die roll, the expectation is (1/6)(1+2+3+4+5+6) = 7/2.
You can think of this as the average value of a die roll
over a long string of die rolls. (The expected value of a random
variable is its average value.)
Expectation is linear: for random variables X and Y, we have
E(X+Y) = E(X)+E(Y). The proof of this is in the text.
Furthermore, if two random variables X and Y are
independent, we have E(XY) = E(X)E(Y).
A good example here is the expected sum of rolling two dice.
The expected value of rolling one die is 7/2, as stated above.
Thus, using the linearity of expectation and letting the random
variable X be the value of the first die roll and Y be the value
of the second die roll, we find:
E(X+Y) = E(X) + E(Y) = 7/2 + 7/2 = 7.
Also, since the two rolls are independent, we have
E(XY) = E(X)E(Y) = 49/4.
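Both claims can be verified by averaging directly over all 36 equally likely outcomes of the two rolls. A short sketch (the class name is my own):

```java
public class TwoDice {
    // E(X+Y): average the sum over all 36 equally likely outcomes.
    public static double expectedSum() {
        double total = 0;
        for (int x = 1; x <= 6; x++)
            for (int y = 1; y <= 6; y++)
                total += x + y;
        return total / 36.0;
    }

    // E(XY): average the product over all 36 equally likely outcomes.
    public static double expectedProduct() {
        double total = 0;
        for (int x = 1; x <= 6; x++)
            for (int y = 1; y <= 6; y++)
                total += x * y;
        return total / 36.0;
    }
}
```

Here expectedSum() returns 7.0 and expectedProduct() returns 12.25 = 49/4, matching linearity of expectation and the independence rule.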
Consider this example from algorithm analysis that involves
probability:
We have discussed the best and worst case analysis of a binary
search on n sorted items. Now, let's do the average case
analysis (assuming that the item is always in the array and, to
simplify things, that n = 2^k - 1 for some integer k):
Given an array of n items, the probability that the value is
found on the first comparison is 1/n, assuming that each array
element is equally likely to be searched for.
If the first comparison does not find it, then the probability of
finding the element on the second comparison is 1/floor(n/2),
since there are only floor(n/2) = 2^(k-1) - 1 elements left to
search. Continuing with this logic, the probability the element
is found on the third comparison is roughly 1/(n/2^2), etc. We
know that the element MUST BE found by approximately the
(log2 n)th comparison. (This is the maximum number of steps
in the binary search.)
Now, let X be a random variable equal to the number of
comparisons done in a binary search of an array with n sorted
items. We have to solve for the expectation of X in terms of n to
approximate the AVERAGE case running time of the binary
search. We have:
E(X) = 1·(2^0/n) + 2·(2^1/n) + 3·(2^2/n) + ... + k·(2^(k-1)/n)
= [1·2^0 + 2·2^1 + 3·2^2 + ... + k·2^(k-1)]/n
Now, this sum isn't anything we recognize, but consider the
following technique:
Let S = 1·2^0 + 2·2^1 + 3·2^2 + ... + k·2^(k-1)
Now, multiply this equation by 2:
2S = 1·2^1 + 2·2^2 + ... + (k-1)·2^(k-1) + k·2^k
Now, subtract the second equation from the first:
-S = 2^0 + 2^1 + 2^2 + ... + 2^(k-1) - k·2^k
Solve for S:
S = k·2^k - [2^0 + 2^1 + 2^2 + ... + 2^(k-1)]
The sum to be subtracted from k·2^k is a geometric series,
with first term 1 and common ratio 2. Using the
formula given in the book, we find:
S = k·2^k - (2^k - 1)/(2 - 1)
S = k·2^k - 2^k + 1
S = (k-1)·2^k + 1
Since n = 2^k - 1, we have 2^k = n+1 and k = log2(n+1), so
S = (n+1)[log2(n+1) - 1] + 1
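It is worth checking the closed form S = (k-1)·2^k + 1 numerically against the definition of S. A small sketch (class and method names are my own):

```java
public class SumIdentity {
    // Direct computation of S = 1·2^0 + 2·2^1 + ... + k·2^(k-1).
    public static long directSum(int k) {
        long s = 0;
        for (int j = 1; j <= k; j++)
            s += (long) j << (j - 1); // j · 2^(j-1)
        return s;
    }

    // Closed form derived above: S = (k-1)·2^k + 1.
    public static long closedForm(int k) {
        return (long) (k - 1) * (1L << k) + 1;
    }
}
```

For example, with k = 3 both give 1·1 + 2·2 + 3·4 = 17 = 2·8 + 1.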
Plugging back into our equation for E(X), we have
E(X) = S/n = ((n+1)[log2(n+1) - 1] + 1)/n = Θ(log2 n),
showing that the average case running time is proportional to
log2 n.
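To see this average materialize, the sketch below runs a binary search for every key of a sorted array with n = 15 = 2^4 - 1 elements and averages the probe counts. It counts one (three-way) comparison per loop iteration, matching the model above; the class is my own illustration.

```java
public class BinarySearchAverage {
    // Number of probes binary search makes to find key in the
    // sorted array a; key is assumed to be present.
    static int comparisons(int[] a, int key) {
        int lo = 0, hi = a.length - 1, count = 0;
        while (lo <= hi) {
            int mid = (lo + hi) / 2;
            count++; // one three-way comparison against a[mid]
            if (a[mid] == key) return count;
            if (a[mid] < key) lo = mid + 1;
            else hi = mid - 1;
        }
        return count; // unreachable when key is present
    }

    // Average probes over all n equally likely search keys.
    public static double averageComparisons(int n) {
        int[] a = new int[n];
        for (int i = 0; i < n; i++) a[i] = i;
        long total = 0;
        for (int key = 0; key < n; key++)
            total += comparisons(a, key);
        return (double) total / n;
    }
}
```

With n = 15 (so k = 4), S = (4-1)·16 + 1 = 49, and the measured average is 49/15 ≈ 3.27 probes.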
Sample Algorithm Development and Analysis
The Prefix Average problem is as follows:
Given an array of values, X[0..n-1], compute a second array
A with intermediate averages such that A[i] is the average
of X[0], X[1], …, X[i].
The straightforward algorithm is as follows:
For each array element A[i]:
Compute this value by adding X[0], X[1], …, X[i] in a loop,
then dividing by i+1.
In code we have:
public static int[] prefixave(int [] X) {
    int [] A = new int[X.length];
    // Loop to successively compute each average.
    for (int i=0; i<A.length; i++) {
        A[i] = 0;
        for (int j=0; j<=i; j++) // Sum X[0] to X[i].
            A[i] += X[j];
        A[i] = A[i]/(i+1); // Compute average from sum.
    }
    return A;
}
Hopefully it is evident that this algorithm will work. The
question is, how long will it take? Notice that the statements
A[i] = 0 and A[i] = A[i]/(i+1) both execute exactly n times.
The only question is how many times A[i] += X[j]
executes. When i=0, it executes once; when i=1, it executes
twice; …; finally, when i=n-1 (where n is the length of the
array), it executes n times. Thus, the number of times this
statement is executed is 1+2+3+…+n = n(n+1)/2 = O(n^2).
So, using all of this, the total number of simple statements is
O(n) + O(n^2). (The O(n) is for the two other loop-body
statements and for the outer for loop's increment and
comparison.) Using the Big-Oh rules, we find that this
algorithm is O(n^2).
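If you want to confirm the 1+2+…+n count empirically, you can replace the inner statement with a counter, as in this small sketch (class and method names are mine):

```java
public class OperationCount {
    // Counts how many times the inner statement A[i] += X[j]
    // executes for an input of length n; should equal n(n+1)/2.
    public static long innerCount(int n) {
        long count = 0;
        for (int i = 0; i < n; i++)
            for (int j = 0; j <= i; j++)
                count++; // stands in for A[i] += X[j]
        return count;
    }
}
```

For n = 10, the inner statement runs 10·11/2 = 55 times.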
But it seems as if we are doing too much work here. Can we
streamline this algorithm? Consider computing just the
prefix sums first, instead of the averages. We can compute each
running sum in an accumulator variable. Then we can simply
assign each element of A by dividing the running sum by
the number of terms added. From CS1, the efficient way to run
an accumulator variable is to initialize it to 0 and then simply
add subsequent terms from the array X into the accumulator
variable:
public static int[] prefixave2(int [] X) {
    int [] A = new int[X.length];
    int s = 0;
    // Loop to successively compute each average.
    for (int i=0; i<A.length; i++) {
        s += X[i];
        A[i] = s/(i+1); // Compute average from sum.
    }
    return A;
}
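As a quick check that the two versions agree, here is a small driver (my own sketch) that runs both on the same input. Note that with int arrays the averages are truncated integer averages; the sample input is chosen so they come out exact.

```java
import java.util.Arrays;

public class PrefixAverageDemo {
    // O(n^2) version: recompute each prefix sum from scratch.
    public static int[] prefixave(int[] X) {
        int[] A = new int[X.length];
        for (int i = 0; i < A.length; i++) {
            A[i] = 0;
            for (int j = 0; j <= i; j++)
                A[i] += X[j];
            A[i] = A[i] / (i + 1);
        }
        return A;
    }

    // O(n) version: carry the running sum in an accumulator.
    public static int[] prefixave2(int[] X) {
        int[] A = new int[X.length];
        int s = 0;
        for (int i = 0; i < A.length; i++) {
            s += X[i];
            A[i] = s / (i + 1);
        }
        return A;
    }

    public static void main(String[] args) {
        int[] X = {2, 4, 6, 8, 10};
        System.out.println(Arrays.equals(prefixave(X), prefixave2(X))); // prints true
        System.out.println(Arrays.toString(prefixave2(X))); // prints [2, 3, 4, 5, 6]
    }
}
```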
Why is this an O(n) algorithm?