
Algorithm Design Techniques

Divide and Conquer

Closest Pair Problem

Given a set P of n ≥ 2 points in 2D space, find the "closest" pair of points, where closest usually refers to Euclidean distance: for points $p_1 = (x_1, y_1)$ and $p_2 = (x_2, y_2)$, the distance is $\sqrt{(x_1 - x_2)^2 + (y_1 - y_2)^2}$.
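The distance calculation itself is a one-liner. A minimal C sketch, using the same Point struct that the data structures section below declares; Dist matches the helper name the pseudocode later calls, but this particular implementation is an assumption:

#include <math.h>

struct Point { int x; int y; };

/* Euclidean distance between two points (assumed implementation). */
float Dist(struct Point a, struct Point b)
{
    float dx = (float)(a.x - b.x);
    float dy = (float)(a.y - b.y);
    return sqrtf(dx * dx + dy * dy);
}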

Why would I want to do something like this?

Consider air or sea traffic control. We continually recalculate the speed, direction and position of vehicles in relation to one another to detect potential collisions.

How many pairs of points are there in a set containing n points?

n = 3, points {a, b, c}: P = {ab, ac, bc}, so |P| = 3.

n = 4, points {a, b, c, d}: P = {ab, ac, ad, bc, bd, cd}, so |P| = 6.

n = 5, points {a, b, c, d, e}: P = {ab, ac, ad, ae, bc, bd, be, cd, ce, de}, so |P| = 10.

In general, the number of pairs is $\binom{n}{2} = \frac{n(n-1)}{2} = \frac{n^2 - n}{2}$.

A brute force method would check all pairs in an exhaustive search at a cost of O(n²).
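A hedged sketch of that brute force search in C, reusing the Point struct and Dist helper sketched above. The ClosestPair struct matches the one declared in the data structures section below, and the BruteForce name matches the routine the recursive algorithm falls back to, but this particular body is an assumption:

struct ClosestPair { float dMin; struct Point p1; struct Point p2; };

/* Check every pair exhaustively: O(n^2) comparisons. Requires n >= 2. */
struct ClosestPair BruteForce(struct Point p[], int n)
{
    struct ClosestPair best;
    best.dMin = Dist(p[0], p[1]);
    best.p1 = p[0];
    best.p2 = p[1];
    for (int i = 0; i < n - 1; i++)
        for (int j = i + 1; j < n; j++) {
            float d = Dist(p[i], p[j]);
            if (d < best.dMin) {
                best.dMin = d;
                best.p1 = p[i];
                best.p2 = p[j];
            }
        }
    return best;
}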

Closest Pair Algorithm

Data Structures

struct Point
{
    int x;
    int y;
};

struct ClosestPair
{
    float dMin;
    Point p1;
    Point p2;
};

Point pX[], pY[], pXL[], pYL[], pXR[], pYR[], pYC[];
ClosestPair clPL, clPR, clP;
int n, middle, x, y;

Important preprocessing step

Before the first call to ClosestPair, pX and pY must be sorted: pX is sorted in ascending order by x-coordinate, then pX is copied to pY, and pY is sorted in ascending order by y-coordinate.
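As a concrete illustration, this preprocessing can be done with the standard library's qsort. A minimal sketch, assuming the Point struct above; the comparator and Preprocess names are illustrative, not from the original:

#include <stdlib.h>
#include <string.h>

/* Comparators for qsort: order points by x, or by y. */
int CompareByX(const void *a, const void *b)
{
    const struct Point *p = a, *q = b;
    return (p->x > q->x) - (p->x < q->x);
}

int CompareByY(const void *a, const void *b)
{
    const struct Point *p = a, *q = b;
    return (p->y > q->y) - (p->y < q->y);
}

/* Preprocessing: sort pX by x, copy it into pY, then sort pY by y. */
void Preprocess(struct Point pX[], struct Point pY[], int n)
{
    qsort(pX, n, sizeof(struct Point), CompareByX);
    memcpy(pY, pX, n * sizeof(struct Point));
    qsort(pY, n, sizeof(struct Point), CompareByY);
}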

General Idea

[Figure: the point set split by three vertical borders into six numbered regions, with example pairs that lie both on the left of a border, both on the right, or straddle it.]

In any pair of partitions, the closest pair lies entirely to the left of the border, entirely to the right of the border, or has one point on the left of the border and one point on the right of the border (i.e. it straddles the border).

Divide

Start by partitioning the points in "half" using the 1st partition, giving a left region and a right region.

Further partition the left and right "halves" using the 2nd and 3rd partitions, giving four smaller regions, two on each side.

Conquer / Combine

In the two regions on the left, calculate the distance of the closest pair in each.

Combine the two left regions: the candidate distance is the minimum of the two regions' closest-pair distances.

Determine whether any closer pairs straddle the border between the two left regions (one only needs to check points lying less than the candidate distance from the border).

The distance is now the minimum over the two regions and the border, and the closest pair in the left half has been found.

Repeat for the two regions on the right, as was done on the left, to find the closest pair in the right half.

Finally, combine the left and right halves in the same way. The resulting distance is the minimum for the whole set of points, and the closest pair has been found.

Algorithm

ClosestPair(pX, pY, n)
    if n <= 3
        clP = BruteForce(pX, n)
    else
        middle = (n - 1) / 2
        x = pX[middle].x
        y = pX[middle].y

        // copy the left half of pX (indices 0..middle) into pXL
        for i = 0 to middle
            pXL[i].x = pX[i].x
            pXL[i].y = pX[i].y

        // copy the right half of pX into pXR
        j = 0
        for i = middle + 1 to n - 1
            pXR[j].x = pX[i].x
            pXR[j].y = pX[i].y
            j++

        // split pY into pYL and pYR, preserving the order by y-coordinate
        j = 0
        k = 0
        for i = 0 to n - 1
            if pY[i].x < x or (pY[i].x == x and pY[i].y <= y and j <= middle)
                pYL[j].x = pY[i].x
                pYL[j].y = pY[i].y
                j++
            else
                pYR[k].x = pY[i].x
                pYR[k].y = pY[i].y
                k++

        // solve each half recursively and keep the closer of the two pairs
        clPL = ClosestPair(pXL, pYL, j)
        clPR = ClosestPair(pXR, pYR, k)
        clP = Min(clPL, clPR)

        // collect, in y-order, the points lying within clP.dMin of the border
        j = 0
        for i = 0 to n - 1
            if Abs(pY[i].x - x) <= clP.dMin
                pYC[j].x = pY[i].x
                pYC[j].y = pY[i].y
                j++
        k = j

        // check pairs that straddle the border; stop once the y-gap exceeds clP.dMin
        for i = 0 to k - 2
            for j = i + 1 to k - 1
                if Abs(pYC[j].y - pYC[i].y) > clP.dMin
                    break
                d = Dist(pYC[i], pYC[j])
                if d < clP.dMin
                    clP.dMin = d
                    clP.p1.x = pYC[i].x
                    clP.p1.y = pYC[i].y
                    clP.p2.x = pYC[j].x
                    clP.p2.y = pYC[j].y
    return clP

Complexity

Start with pX and pY. Both arrays need to be sorted in a pre-processing step. This can be done at a cost of O(n log n).

For each recursive step, pX and pY need to be split in "half", and this splitting happens O(log n) times. Copying values from pY into pYL and pYR is done sequentially, placing each element of pY into either pYL or pYR, at a cost of O(n) per level. Therefore, the total cost of all the recursive splitting of points is O(n log n).

For the merge step, where points are checked to determine whether any pairs straddle the border, the cost is actually O(n). The idea is that for each point within clP.dMin of the border, only a constant number of other points need to be checked.
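Putting these pieces together, the running time satisfies the standard divide-and-conquer recurrence. A short sketch of the arithmetic, assuming for simplicity that n is a power of two and that the work per level is bounded by cn:

\[
\begin{aligned}
T(n) &= 2\,T(n/2) + cn \\
     &= 4\,T(n/4) + 2cn \\
     &\;\;\vdots \\
     &= n\,T(1) + cn\log_2 n \;=\; O(n \log n).
\end{aligned}
\]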

The overall complexity of the closest pair algorithm is therefore O(n log n).

Selection Problem

Given an unsorted collection of n elements, select the Kth smallest.

One approach would be to sort the collection using one of the best sorting algorithms (which takes O(n log n) time) and then select the element in the Kth position.
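A minimal C sketch of this sort-then-index baseline; the SelectBySorting name and comparator are illustrative assumptions:

#include <stdlib.h>

/* Comparator for ascending integer order. */
static int CompareInt(const void *a, const void *b)
{
    int x = *(const int *)a, y = *(const int *)b;
    return (x > y) - (x < y);
}

/* O(n log n) baseline: sort the array in place, then take the Kth
   smallest element (K is 1-based). */
int SelectBySorting(int a[], int n, int K)
{
    qsort(a, n, sizeof(int), CompareInt);
    return a[K - 1];
}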

Can we achieve O(n) running time for any value of K?

An algorithm called Randomized Quickselect runs in O(n) time in the average case.

The algorithm works much like the Quicksort algorithm. The difference comes after the partition phase: we do not need to examine both partitions, but rather just one, because we know which one contains the Kth element.

Randomized Quickselect Algorithm

RandomizedQuickSelect(a, left, right, K)
    if left == right
        return a[left]
    pivot = RandomizedPartition(a, left, right)
    i = pivot - left + 1               // rank of the pivot within a[left..right]
    if K == i
        return a[pivot]
    else if K < i
        return RandomizedQuickSelect(a, left, pivot - 1, K)
    else
        return RandomizedQuickSelect(a, pivot + 1, right, K - i)

Additional Function:

RandomizedPartition(a, left, right)
    r = rand(left, right)              // pick a random index in [left, right]
    Exchange(a[right], a[r])           // move the random pivot to the end
    pivot = a[right]
    i = left - 1
    j = right
    while true
        while a[++i] < pivot
            // keep advancing i over elements smaller than the pivot
        while pivot < a[--j]
            // keep moving j back over elements larger than the pivot
        if i < j
            Swap(a[i], a[j])
        else
            break
    Exchange(a[i], a[right])           // place the pivot at its final position
    return i
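For concreteness, here is a runnable C sketch of randomized quickselect, exercised on the ten values from the worked example below. It mirrors the pseudocode above, except that the partition step uses the simpler Lomuto scheme (pivot swapped to the end, single forward scan) rather than the two-index scan shown; treat it as an illustrative assumption, not a reference implementation:

#include <stdio.h>
#include <stdlib.h>

static void Exchange(int *x, int *y) { int t = *x; *x = *y; *y = t; }

/* Partition a[left..right] around a randomly chosen pivot and return the
   pivot's final index (Lomuto scheme, used here for brevity). */
static int RandomizedPartition(int a[], int left, int right)
{
    int r = left + rand() % (right - left + 1);
    Exchange(&a[right], &a[r]);
    int pivot = a[right];
    int i = left - 1;
    for (int j = left; j < right; j++)
        if (a[j] < pivot)
            Exchange(&a[++i], &a[j]);
    Exchange(&a[i + 1], &a[right]);
    return i + 1;
}

/* Return the Kth smallest element of a[left..right] (K is 1-based). */
static int RandomizedQuickSelect(int a[], int left, int right, int K)
{
    if (left == right)
        return a[left];
    int pivot = RandomizedPartition(a, left, right);
    int i = pivot - left + 1;          /* rank of the pivot in this subarray */
    if (K == i)
        return a[pivot];
    else if (K < i)
        return RandomizedQuickSelect(a, left, pivot - 1, K);
    else
        return RandomizedQuickSelect(a, pivot + 1, right, K - i);
}

int main(void)
{
    int a[] = { 13, 81, 92, 43, 31, 65, 57, 26, 75, 0 };
    /* 5th smallest of the example values; prints 43 */
    printf("%d\n", RandomizedQuickSelect(a, 0, 9, 5));
    return 0;
}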

The worst case running time is O(n²) and occurs when we repeatedly partition around the largest or smallest elements.

Example

Randomized Quickselect: select the 5th smallest element of

13  81  92  43  31  65  57  26  75  0

Select pivot p = 65. After partitioning, 13, 0, 43, 31, 57 and 26 lie to the left of 65, and 92, 75 and 81 lie to the right. The pivot is the 7th smallest, and K = 5 < 7, so recurse on the left partition with K = 5.

Select pivot p = 26. Only 0 and 13 lie to its left, so the pivot is the 3rd smallest. K = 5 > 3, so recurse on the right partition (43, 31, 57) with K = 5 - 3 = 2.

Select pivot p = 57. Both 43 and 31 lie to its left, so the pivot is the 3rd smallest of this subarray. K = 2 < 3, so recurse on the left partition (43, 31) with K = 2.

Select pivot p = 43. Only 31 lies to its left, so 43 is the 2nd smallest of this subarray and K = 2: the recursion stops and returns 43.

After these partitions the array is

13  0  26  31  43  57  65  92  75  81

and the 5th smallest element is 43.
