(UNBF) on counting permutations, combinations etc.

Notes on Counting
Patricia A. Evans
Unlike simple loops (such as “from 1 to n”), some algorithms need to iterate through all possibilities for
a defined set of sequences, subsets, or more complex structures that can be based on these.
To analyse these algorithms, we need to know how many of these items there can be, in terms of the
input size. This requires some basic combinatorics: we count the possible
sequences/subsets/structures from the number of different things from which they can be composed
and the rules for composing them together.
Multiplication Rule
If we are selecting items for two different things, and there are n possibilities for the first thing and m
possibilities for the second thing, where all pairs of possibilities can happen, then there are n · m pairs of
possibilities.
We can apply this to determining the number of ordered pairs that can be made from the elements of
two sets, A and B:
|A × B| = |{(a, b) : a ∈ A and b ∈ B}| = |A| · |B|.
This rule can be applied repeatedly to determine the number of possible k-tuples for any k. If all the
sets are the same, then we can count the number of sequences of length k of items from A:
# of k-sequences = |A|^k
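As a quick check of the Multiplication Rule, the following sketch enumerates pairs and k-sequences by brute force and compares the counts with |A| · |B| and |A|^k. The sets `A` and `B` and the helper name `count_k_sequences` are illustrative assumptions, not part of the notes.

```python
from itertools import product

def count_k_sequences(items, k):
    """Count sequences of length k over `items` by enumerating all of them."""
    return sum(1 for _ in product(items, repeat=k))

A = ["a", "b", "c"]   # hypothetical example set
B = [0, 1]            # hypothetical example set

pairs = list(product(A, B))                     # all ordered pairs (a, b)
assert len(pairs) == len(A) * len(B)            # |A x B| = |A| * |B|
assert count_k_sequences(A, 4) == len(A) ** 4   # number of 4-sequences = |A|^4
```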
Permutations
Based on the Multiplication Rule, we can also consider sequences that do not repeat elements from the
set. If all elements of A appear in the sequence, then this is the number of arrangements or permutations
of A. We can still apply the Multiplication Rule; each selection of an item for a position in the sequence will
“use up” the item, reducing the number of possibilities for later selections by 1 (irrespective of which item
is picked).
# of permutations of A = |A| · (|A| − 1) · (|A| − 2) · ⋯ · 2 · 1
so if |A| = n, the number of permutations of A is n · (n − 1) · ⋯ · 2 · 1 = n!.
For example, the letters in the word “computer” can be arranged in 8! = 40320 different permutations.
Since our permutation has length n (we use all of the letters), these are complete permutations.
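The "use up one item per position" argument can be sketched directly as a loop over the number of choices left. The function name `count_permutations` is an assumption for illustration; the brute-force count with `itertools.permutations` is only there as a cross-check.

```python
from itertools import permutations
from math import factorial

def count_permutations(n):
    """Multiply the number of choices left at each position: n * (n-1) * ... * 1."""
    result = 1
    for choices_left in range(n, 0, -1):
        result *= choices_left
    return result

word = "computer"  # 8 distinct letters
enumerated = sum(1 for _ in permutations(word))  # brute-force count
# All three values printed are 40320.
print(count_permutations(len(word)), enumerated, factorial(len(word)))
```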
If we want a shorter sequence, we can stop picking items when we have all of the positions filled, producing
a k-permutation (for some k ≤ n). So with |A| = n,
# of k-permutations of A = P(n, k) = n · (n − 1) · ⋯ · (n − k + 1)
We can also express this result using factorials, as $P(n, k) = \frac{n!}{(n-k)!}$. Essentially, if we calculate the
number of complete permutations, we then need to divide by all of the rearrangements of the items that are
not in the k positions used by the k-permutation, since these are overcounting the k-permutations.
For “computer”, we can thus determine that this word has 8 · 7 = 56 2-permutations, 8 · 7 · 6 = 336
3-permutations, and so on.
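A minimal sketch of both forms of P(n, k): the falling product that stops after k positions, and the factorial quotient. The name `k_permutations` is hypothetical; `math.perm` (available from Python 3.8) is used only to cross-check.

```python
from math import factorial, perm

def k_permutations(n, k):
    """P(n, k) as the product n * (n-1) * ... * (n-k+1), assuming 0 <= k <= n."""
    result = 1
    for factor in range(n, n - k, -1):  # exactly k factors
        result *= factor
    return result

n = len("computer")
assert k_permutations(n, 2) == 56
assert k_permutations(n, 3) == 336
# The factorial form divides out the (n-k)! rearrangements of the unused items.
assert k_permutations(n, 3) == factorial(n) // factorial(n - 3) == perm(n, 3)
```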
For complete permutations, we can also handle repetition in the items themselves. To adjust our counting
for identical items, we can first consider the number of permutations for distinct items, and then divide by
the overcount caused by treating rearrangements of identical items as different.
So if the set of n items had one type of item that had r copies in the set (all counted as part of the n
items), then the number of complete permutations would be
\[ \frac{n!}{r!} \]
For example, the word “data” has
\[ \frac{4!}{2!} = \frac{4 \cdot 3 \cdot 2 \cdot 1}{2 \cdot 1} = 12 \]
permutations.
We can continue this process to divide by the number of ways to rearrange each group of items that
are identical. For example, “sequences” has 3 copies of the letter ‘e’ and 2 copies of the letter ‘s’, so it has
\[ \frac{9!}{3!\,2!} = \frac{9 \cdot 8 \cdot 7 \cdot 6 \cdot 5 \cdot 4 \cdot 3 \cdot 2 \cdot 1}{3 \cdot 2 \cdot 1 \cdot 2 \cdot 1} = 30240 \]
permutations.
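The divide-by-the-overcount rule for complete permutations can be sketched as follows, assuming the items are given as a string and using a hypothetical helper name `distinct_arrangements`.

```python
from collections import Counter
from math import factorial

def distinct_arrangements(word):
    """n! divided by r! for each group of r identical letters."""
    total = factorial(len(word))
    for copies in Counter(word).values():
        total //= factorial(copies)  # remove rearrangements within this group
    return total

assert distinct_arrangements("data") == 12
assert distinct_arrangements("sequences") == 30240
assert distinct_arrangements("computer") == factorial(8)  # no repeated letters
```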
Unfortunately we cannot do the same process for k-permutations, since the number of rearrangements
of identical items would depend on how many of them appeared in the permutation. To handle a situation
like this, you would need to consider the subsets of elements being chosen, and then work out in how many
ways each type of subset could be rearranged.
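For small inputs, one way to see the difficulty is simply to enumerate and deduplicate. This is a brute-force sketch rather than a counting formula, and the name `distinct_k_permutations` is an assumption.

```python
from itertools import permutations

def distinct_k_permutations(word, k):
    """Count distinct length-k arrangements by enumeration (small inputs only)."""
    return len(set(permutations(word, k)))

# "data": 6 ordered pairs of two different letters from {d, a, t}, plus "aa".
assert distinct_k_permutations("data", 2) == 7
# With k equal to the full length this agrees with 4!/2! = 12.
assert distinct_k_permutations("data", 4) == 12
```

The k = 2 count splits by subset type exactly as described above: pairs with two different letters can be ordered in 2 ways each, while the pair of two a's has only one ordering.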
Combinations
Reasoning about dividing by the number of rearrangements can also be applied to count subsets, where
the order of elements selected does not change the subset.
If we have n distinct items, and we count all k-permutations of those items (for a given size k), then the
number of times that each subset of size k is counted is equal to the number of its permutations, k!.
So the number of subsets of size k is equal to the number of k-combinations of n items, which is represented
by C(n, k) and, more commonly, by $\binom{n}{k}$.
By the above relationship between permutations and combinations, we have
\[ C(n, k) = \binom{n}{k} = \frac{P(n, k)}{k!} = \frac{n!}{(n-k)!\,k!} = \frac{n \cdot (n-1) \cdots (n-k+1)}{k \cdot (k-1) \cdots 1} \]
Note that, as long as you include the 1 factors, the number of factors in the numerator and denominator of
a combination will always be the same. Also, since the result must be an integer, all of the factors in the
denominator must cancel against some part of one or more factors in the numerator.
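For example, the number of 3-element subsets that can be formed from the letters in “computer” is
\[ \binom{8}{3} = \frac{8!}{5!\,3!} = \frac{8 \cdot 7 \cdot 6}{3 \cdot 2 \cdot 1} = 56 \]
The number of subsets (of “computer”) of size 4 is
\[ \binom{8}{4} = \frac{8!}{4!\,4!} = \frac{8 \cdot 7 \cdot 6 \cdot 5}{4 \cdot 3 \cdot 2 \cdot 1} = 70 \]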
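The product form can be evaluated one factor pair at a time, dividing as we go so that intermediate values stay small. The sketch below assumes 0 ≤ k ≤ n and uses a hypothetical name `choose`; the multiplication order is chosen so that every division is exact.

```python
from itertools import combinations
from math import comb

def choose(n, k):
    """C(n, k) from the product form, assuming 0 <= k <= n."""
    result = 1
    for i in range(1, k + 1):
        # After this step result equals C(n - k + i, i), so // is exact.
        result = result * (n - k + i) // i
    return result

assert choose(8, 3) == 56
assert choose(8, 4) == 70
# Cross-checks: the standard library, and direct enumeration of subsets.
assert choose(8, 3) == comb(8, 3) == sum(1 for _ in combinations("computer", 3))
```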
Selecting k items to be in the subset is the same as selecting n − k items to not be in the subset, so we
also have that
\[ \binom{n}{k} = \frac{n!}{(n-k)!\,k!} = \frac{n!}{k!\,(n-k)!} = \binom{n}{n-k} \]
Selecting all of the items can be done in only one way, which is also true for selecting none of the items.
Repeated elements are harder to handle, but the subsets can still be counted by breaking the subsets
into types based on how many of the repeated elements they contain. So if there is a group of r repeated
elements in the set of size n, and all other elements are distinct, then we can count the subsets by adding up
how many subsets contain none of the identical group, one of them, two of them, and so on up to min(k, r),
equalling
\[ \sum_{i=0}^{\min(k,r)} \binom{n-r}{k-i} \]
For each term of this sum, we count how many ways there are to select the (k − i) other subset items from
the (n − r) distinct other elements of the set.
For example, we can count the number of 3-combinations of “strings” as the sum of the 3-combinations
with no ’s’, with one ’s’, and with two ’s’s.
\[ \binom{5}{3} + \binom{5}{2} + \binom{5}{1} = \frac{5 \cdot 4}{2 \cdot 1} + \frac{5 \cdot 4}{2 \cdot 1} + \frac{5}{1} = 10 + 10 + 5 = 25 \]
Had all 7 letters been distinct, there would have been
\[ \binom{7}{3} = \frac{7 \cdot 6 \cdot 5}{3 \cdot 2 \cdot 1} = 35 \]
3-combinations.
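The sum over the number of repeated elements used can be sketched as below, for the case described above of a single group of r identical elements among n. The function name is hypothetical, and the brute-force check treats each subset as a sorted tuple of letters so that selections differing only in which copy of ‘s’ was used coincide.

```python
from itertools import combinations
from math import comb

def subsets_with_one_repeated_group(n, r, k):
    """Sum over i = 0..min(k, r) of C(n - r, k - i)."""
    return sum(comb(n - r, k - i) for i in range(min(k, r) + 1))

word = "strings"  # 7 letters, with 's' appearing twice
formula = subsets_with_one_repeated_group(len(word), 2, 3)
brute_force = len({tuple(sorted(c)) for c in combinations(word, 3)})
assert formula == brute_force == 25
```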
To determine for a particular situation whether permutations or combinations are involved, figure out
whether the roles or positions that items are being selected for are distinct. If they are distinct, such as
picking first, second, and third place winners, then count permutations; if they are not distinct, such as
picking a set of 3 unranked winners, then count combinations.
Pascal’s Formula
Combinations can also be calculated based on other combinations, by thinking about and breaking down
the subsets being counted.
Consider the last item, item n. It can either be in the subset or not; these two cases do not overlap and
together cover all possibilities.
If it is in the subset, then the other k − 1 items for the subset need to be chosen from the remaining n − 1
items, which can be done in $\binom{n-1}{k-1}$ ways.
If it is not in the subset, then all k items for the subset need to be chosen from the remaining n − 1 items,
which can be done in $\binom{n-1}{k}$ ways.
Putting these together gives us Pascal’s Formula:
\[ \binom{n}{k} = \binom{n-1}{k-1} + \binom{n-1}{k} \]
and this formula can be used to simplify expressions that look like the right-hand side, as well as providing
a recurrence that we can use repeatedly in an algorithm to compute $\binom{n}{k}$.
We can use this formula over and over again, starting from $\binom{0}{0}$ and using that $\binom{n}{0} = \binom{n}{n} = 1$, to calculate
any combination and build a table of combinations (known as Pascal’s Triangle).
Pascal’s Triangle:
n\k    0    1    2    3    4    5    6   ···
0      1
1      1    1
2      1    2    1
3      1    3    3    1
4      1    4    6    4    1
5      1    5   10   10    5    1
6      1    6   15   20   15    6    1
⋮
In this table, each number is the sum of the number directly above it and the number above and to the
left.
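The table can be generated row by row from that rule. A minimal sketch, with the hypothetical name `pascal_rows`:

```python
def pascal_rows(max_n):
    """Return rows 0..max_n of Pascal's Triangle as lists."""
    rows = [[1]]
    for n in range(1, max_n + 1):
        prev = rows[-1]
        # Interior entries: the number above-left plus the number directly above.
        interior = [prev[k - 1] + prev[k] for k in range(1, n)]
        rows.append([1] + interior + [1])
    return rows

for n, row in enumerate(pascal_rows(6)):
    print(n, row)
```

The last row printed is `6 [1, 6, 15, 20, 15, 6, 1]`.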
We can thus calculate the value of any specific combination simply by adding numbers for long enough
until the needed row of the table is produced. This may be quite a while if we are computing values by
hand, but if we use a computer we can use Pascal’s Formula to compute the value of combinations without
having problems with integer overflow for intermediate values.
Algorithm 1: Using Pascal’s Formula to Calculate Combinations
Input : number of items n, size of subset k
Output: number of size k subsets of a set of n elements
Let A[0..n] be an array of integers
A[0] ← 1
for i from 1 to n do
    A[i] ← 0
    for j from min(i, k) down to 1 do
        A[j] ← A[j] + A[j − 1]
    end
    A[0] ← 1
end
return A[k]
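A Python sketch of Algorithm 1 follows. It departs from the pseudocode in two labelled ways, both assumptions of this sketch: the array holds only entries 0..k and is zero-initialised up front, and the inner loop starts at min(i, k) so that only entries belonging to row i are touched. Python integers do not overflow, so the overflow point matters mainly in fixed-width languages.

```python
def combinations_pascal(n, k):
    """C(n, k) using only additions, updating one row of the triangle in place."""
    if k < 0 or k > n:
        return 0  # assumption of this sketch: no subsets of an impossible size
    row = [0] * (k + 1)  # row[j] holds C(i, j) after iteration i
    row[0] = 1
    for i in range(1, n + 1):
        # Go right to left so row[j - 1] is still the value from row i - 1.
        for j in range(min(i, k), 0, -1):
            row[j] += row[j - 1]
    return row[k]

assert combinations_pascal(8, 3) == 56
assert combinations_pascal(8, 4) == 70
```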
Finding Common Factors
We can also calculate combinations by computer in the same way that we calculate them by hand,
through dividing out common factors found in the numerator and denominator. Finding common factors is
also useful for reducing any ratio of two integers.
To find the greatest common divisor of two integers (which can be from the numerator and denominator
of some ratio), we can use a simple numerical fact:
For all integers a, b, c:
if c|a and c|b then c|(a − b)
Simply, if an integer is a factor of two other integers, then it must also be a factor of their difference.
We can use this principle repeatedly, as noted by Euclid, to find the greatest common divisor of two positive
integers. This algorithm works by subtracting the smaller number from the larger until one number is 0
(which will occur just after the numbers are equal), and then returns the nonzero number.
Algorithm 2: Euclid’s Algorithm for Finding the Greatest Common Divisor of Two Integers
Input : positive integers a and b
Output: greatest common divisor of a and b
while a > 0 and b > 0 do
    if a > b then
        a ← a − b
    else
        b ← b − a
    end
end
return a + b
For example, if we start with 36 and 50, we will get the sequence of pairs:
(36, 50), (36, 14), (22, 14), (8, 14), (8, 6), (2, 6), (2, 4), (2, 2), (2, 0)
and 105 and 42 will produce the sequence
(105, 42), (63, 42), (21, 42), (21, 21), (21, 0)
Since the algorithm as given above will make b zero if the numbers are equal, it could return a at the
end instead of a + b (which would allow for either being zero). We could also instead test for equality.
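The subtractive algorithm, together with one possible way of using it to cancel common factors in a combination, can be sketched as follows. Both function names are hypothetical, and the cancellation strategy (cancel each denominator factor against the numerator factors in turn) is an assumption of this sketch rather than a method spelled out in the notes.

```python
def gcd_subtractive(a, b):
    """Return (gcd, trace) for positive integers a and b, as in Algorithm 2."""
    trace = [(a, b)]
    while a > 0 and b > 0:
        if a > b:
            a -= b
        else:
            b -= a
        trace.append((a, b))
    return a + b, trace

def choose_by_cancelling(n, k):
    """C(n, k) by cancelling each denominator factor out of the numerator."""
    numerator = list(range(n - k + 1, n + 1))  # the k numerator factors
    for d in range(2, k + 1):                  # denominator factors 2..k
        remaining = d
        for idx in range(len(numerator)):
            g, _ = gcd_subtractive(numerator[idx], remaining)
            numerator[idx] //= g
            remaining //= g
            if remaining == 1:                 # this factor is fully cancelled
                break
    result = 1
    for factor in numerator:
        result *= factor
    return result

g, trace = gcd_subtractive(36, 50)
print(g, trace[-1])   # 2 (2, 0)
assert gcd_subtractive(105, 42)[0] == 21
assert choose_by_cancelling(8, 3) == 56
```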
The Binomial Theorem
Combinations are also useful when considering the expansion of powers in algebra. The most fundamental
of these relationships is called the Binomial Theorem, and gives the relationship between the expansions
of powers of a binomial (an expression with two terms) and the rows of Pascal’s Triangle.
\[ (a + b)^n = \sum_{k=0}^{n} \binom{n}{k} \cdot a^k b^{n-k} \]
This theorem can be proven by considering how the term $a^k b^{n-k}$ arises in the expansion of $(a + b)^n$: there
are n factors of (a + b), and exactly k of them must contribute an a to the result, with the remaining factors contributing
a b. Thus the number of ways that this can occur is $\binom{n}{k}$.
This theorem can be used in both directions, either enabling us to easily expand a power of a binomial,
particularly enabling us to skip to a particular term, or to collapse a sum that is the expanded form of a
binomial.
For example:
• In $(2x + 3)^{10}$, what is the coefficient of $x^6$?
$x^6$ occurs in the sum when k = 6, so we have a coefficient of $\binom{10}{6} \cdot 2^6 \cdot 3^4$.
• In $(x^2 + 3x)^9$, what is the coefficient of $x^{10}$?
$x^{10}$ occurs in the sum when k = 1, so we have a coefficient of $\binom{9}{1} \cdot 1^1 \cdot 3^8$.
• In $(x^3 + 2x)^8$, what is the coefficient of $x^{15}$?
$x^{15}$ does not occur in the sum (all exponents are even), so it has a coefficient of 0. Note that we can
determine that all exponents are even by working with the expansion:
\[ (x^3 + 2x)^8 = \sum_{k=0}^{8} \binom{8}{k} (x^3)^k (2x)^{8-k} = \sum_{k=0}^{8} \binom{8}{k} x^{3k}\, 2^{8-k} x^{8-k} = \sum_{k=0}^{8} \binom{8}{k} 2^{8-k} x^{2k+8} \]
• Find a closed form for
\[ \sum_{k=0}^{n} \binom{n}{k} 3^k \]
We can use the binomial theorem to determine that
\[ \sum_{k=0}^{n} \binom{n}{k} 3^k = (3 + 1)^n = 4^n \]
since the left-hand side is the binomial theorem summation for a = 3 and b = 1.
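The examples above can be cross-checked by expanding the powers mechanically. In this sketch a polynomial is a list of coefficients indexed by exponent; that representation and the helper names `poly_mul` and `poly_pow` are assumptions made for illustration.

```python
from math import comb

def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists (index = exponent)."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def poly_pow(p, n):
    """Raise a polynomial to the n-th power by repeated multiplication."""
    result = [1]
    for _ in range(n):
        result = poly_mul(result, p)
    return result

# (2x + 3)^10: coefficient of x^6
assert poly_pow([3, 2], 10)[6] == comb(10, 6) * 2**6 * 3**4
# (x^2 + 3x)^9: coefficient of x^10
assert poly_pow([0, 3, 1], 9)[10] == comb(9, 1) * 3**8
# (x^3 + 2x)^8: no odd powers appear, so x^15 has coefficient 0
assert poly_pow([0, 2, 0, 1], 8)[15] == 0
# Closed form: sum of C(n, k) * 3^k equals 4^n
n = 7
assert sum(comb(n, k) * 3**k for k in range(n + 1)) == 4**n
```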
This last type of use is likely to be the most applicable in the analysis of algorithms, since analysis may
give rise to a summation of terms that include combinations.
If the term consists of the combination only, then we have
\[ \sum_{k=0}^{n} \binom{n}{k} = (1 + 1)^n = 2^n \]
which can be viewed as counting all subsets of a set of n elements; the summation groups them by size, while
the binomial does not.
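The two views of the same count can be compared by enumeration. The sketch below groups the subsets of a small, arbitrarily chosen set by size; the name `subsets_by_size` and the example set are assumptions.

```python
from itertools import combinations
from math import comb

def subsets_by_size(items):
    """Return a list whose k-th entry is the number of size-k subsets of items."""
    return [sum(1 for _ in combinations(items, k)) for k in range(len(items) + 1)]

items = "abcde"  # hypothetical set of n = 5 distinct elements
counts = subsets_by_size(items)
print(counts, sum(counts))   # [1, 5, 10, 10, 5, 1] 32
assert counts == [comb(len(items), k) for k in range(len(items) + 1)]
assert sum(counts) == 2 ** len(items)
```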