Geometric Modeling Chapter 11

Stereology is at heart a geometric science, using the principles of geometric
probability to estimate measures of three-dimensional structures from measurements that can be accessed in lower dimensions using planes, lines and points as
probes. There are modern texts in geometrical probability (Kendall & Moran, 1986;
Matheron, 1975; Santalo, 1976). However, one of the first areas of study in this field
(long predating the use of the name “stereology”) was the calculation of the probabilities of intersection of various probes with objects of specified shape. This has
roots back to the Buffon needle problem (18th century) and also involves Bertrand’s
paradox, with its consideration of what a random probe really entails.
Quite a bit of work during the middle part of the twentieth century dealt with the
relationship between the size distribution of objects in 3D with the size distribution
of the linear or planar intercepts that a set of isotropic, uniform and random probes
would produce (see for example Wicksell, 1925; Cruz-Orive, 1976). These values are
needed for the unfolding of the size distribution of the intercepts to estimate the
size distribution of the 3D objects. As discussed elsewhere the trend in recent
decades has been away from this approach (Gundersen, 1986) because of the ill-conditioned nature of the mathematics used in unfolding (its sensitivity to small
variations in the measured size distribution producing much larger variations in the
calculated one), and because it makes some flawed assumptions about the nature of
the specimen. Basically it assumes that the shape of the 3D objects is known (and
has some relatively simple geometric form) and that all of the objects are the same
in shape. The most common shape used for unfolding is a sphere, because of the
mathematical simplicity that it yields. However, there are few real samples in which
all of the objects of interest are actually perfect spheres. Even small deviations from
this shape, especially deviations that are systematic with size, introduce substantial
bias into the final results.
Nevertheless it is useful to understand how the process for determining the
relationship between object shape and the size distribution of intercepts works. It
is unlikely that a working stereologist will need to develop such a set of data individually (and sets of such data have been published for many geometrical shapes
such as cylinders, ellipsoids, and polyhedra). But the basic method helps to educate
the mind into the relationship between three-dimensional objects and the intercepts
which planes and lines make with them, and this is an important step in developing an understanding and even an intuition for visualizing the three dimensional
objects that are revealed in traditional two dimensional microscopy.
This chapter reviews the tools used for making these calculations and estimations. There are two avenues by which to approach the determinations: integration of analytic expressions, and random sampling. The former will initially be more
familiar, as an outgrowth of normal analytic geometry and calculus, but the latter
is ultimately more powerful and useful for dealing with many of the problems
encountered with real three dimensional features.
Methods: Analytic and Sampling
As an introductory example, consider the problem of determining the area
of a circle of unit radius. Figure 11.1 shows the familiar analytical approach. The
circle is broken into strips of width dx, whose length, expressed as a function of
x, is
$$L = 2\sqrt{1 - x^{2}} \tag{11.1}$$
Then the area of the circle is determined by integration, giving
$$\text{Area} = 2\int_{-1}^{+1} \sqrt{1 - x^{2}}\, dx \tag{11.2}$$
which can be directly integrated to give
$$2\left\{\frac{\sin^{-1}(1)}{2} - \frac{\sin^{-1}(-1)}{2}\right\} = \frac{\pi}{2} - \left(-\frac{\pi}{2}\right) = \pi \tag{11.3}$$
The other approach to this problem is to draw a circle of unit radius inside
a square (of side 2, and area 4). Hang the paper on a tree, back off some suitable
distance, and shoot it with a shotgun. Figure 11.2 shows an example of typical
results that you might observe. Now count the holes that are inside the circle, and
those that are inside the square. The number of holes inside the circle divided by the
total number of holes should be π/4.
This is a sampling method. It carries the implicit assumption of a perfectly
random shotgun, in the sense that the probability of holes (points) appearing within
any particular unit area of the target (plane) is equal. This is in agreement with our
Figure 11.1. Drawing illustrating the integration of the area of a circle
Figure 11.2. The “shotgun” method of measuring the circle’s area.
whole idea of randomness. And, of course, the precision of the result depends on
counting statistics: we will have to shoot quite a lot of holes in the target and count
them to arrive at a good answer.
This is easier to do with a computer simulation than a real shotgun. Most
computer systems incorporate some type of random number generator, usually a
complex software routine that multiplies, adds and divides numbers to obtain different values that mimic true random numbers. It is not our purpose here to prescribe tests of how good these “pseudo random number” generators are. (Some are
quite good, producing millions or hundreds of millions of values that do not repeat
and pass elaborate tests for uniformity and unpredictability; others are barely good
enough for their intended purpose, usually controlling the motion of alien space
invaders on the computer screen.)
Assuming the availability of a random number function, and some high-level
computer language in which to work, a program to shoot the holes and count them
might look like the one in Listing 1 (see Appendix). This program (and the others
in the Appendix) can be translated quite straightforwardly into any dialect of Basic,
Fortran, Pascal, C, etc. It uses a function RND that returns a real number value in
the range 0 <= value < 1 with a uniform random distribution. This means that we
are in effect looking not at the entire target, but at one-quarter of it (the center is
at X = 0, Y = 0). Each generated point is checked, and counted if it is inside the
circle. The program prints out the area estimate periodically, in the example every
thousand points, and continues until stopped. A typical run of the program produced the example data shown in Figure 11.3. Other runs, using a different set of
random numbers, would be different, at least in the early stages. But given enough
trials, the value approaches the correct one.
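The Appendix listing is not reproduced here. As a rough illustration of the procedure just described, the following Python sketch shoots points into one quadrant of the target and counts the hits. It assumes that `random.random()` plays the role of RND; the function name `estimate_circle_area` and the `seed` argument are illustrative choices, not part of the original program.

```python
import random

def estimate_circle_area(n_points, seed=None):
    """Estimate the area of the unit circle from random points in one quadrant."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_points):
        x = rng.random()  # stands in for RND: 0 <= value < 1
        y = rng.random()
        if x * x + y * y < 1.0:  # the point falls inside the quarter circle
            hits += 1
    # hits / n_points estimates pi/4; the whole circle is four such quadrants
    return 4.0 * hits / n_points

# Print the estimate for increasing numbers of shots
for n in (1000, 10000, 100000):
    print(n, estimate_circle_area(n, seed=1))
```

Changing the seed gives a different sequence of estimates, which mirrors the run-to-run variation mentioned above.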
Because of the use of random numbers to measure probability, this sampling
approach is called the Monte-Carlo method (after the famous gambling casino). It
finds many applications in sampling of processes where the individual rules and
physical principles are well known, but the overall process is complex and the steps
Figure 11.3. Typical result of running the program to measure circle area.
are not easily combined in analytical expressions that can be summed or integrated,
such as scattering of electrons, photons or alpha-particles.
In the example here, it is easy to see that the method does eventually
approach the “right” answer, but that it takes a lot more work than the straightforward integration of the circle. However, given some other much more complex shape
bounded by lines that could individually be described by relationships in X and Y,
but for which the integral cannot so readily be evaluated, the Monte-Carlo approach
could be a useful alternative way to find the area. Measuring the area of the British
Isles for example would simply require hanging up the map and blasting away with
our shotgun. We will find this approach especially useful for three-dimensional
objects.
If the sampling pattern were not random, but instead the program had two
loops that varied X and Y through a regular pattern, for instance
FOR Y = 0 TO 1 STEP 0.02
FOR X = 0 TO 1 STEP 0.02
...
then roughly 2500 values would be sampled, with a good result. This systematic sampling is
in effect a numerical integration, if the number of steps is large enough (the step
size is small enough). But with systematic sampling, you have the same problems in
establishing the limits as for analytic integration. Furthermore, there is no valid
answer at all until you have completed the program, and no way to improve the
answer by running for a longer time (if you repeat it, you will get exactly the same
answer). The random sampling method produces a reasonable answer (whose accuracy can be estimated statistically) in a short time, and it will continue to improve
with running time.
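For comparison, a systematic version of the same measurement might look like the sketch below. It is only an illustration: it samples at the centers of a 50 × 50 grid of cells (an assumption made here to avoid accumulating floating-point steps of 0.02), and the name `grid_circle_area` is hypothetical.

```python
def grid_circle_area(steps):
    """Estimate the unit-circle area by sampling one quadrant on a regular grid."""
    hits = 0
    for j in range(steps):
        y = (j + 0.5) / steps  # cell-center sampling (an assumption of this sketch)
        for i in range(steps):
            x = (i + 0.5) / steps
            if x * x + y * y < 1.0:
                hits += 1
    return 4.0 * hits / (steps * steps)

print(grid_circle_area(50))   # 2500 sample points
print(grid_circle_area(50))   # repeating the run gives exactly the same value
```

The only way to refine this estimate is to choose a finer grid and start over, whereas the random version can simply keep running.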
Sphere Intercepts
The frequency distribution of the lengths of linear intercepts passing
through a sphere is also a problem that can be solved analytically, as indicated in
Figure 11.4. Figure 11.4a shows many lines passing vertically through the sphere.
Because of its symmetry, it is not necessary to rotate the sphere or consider many
directions. If the vertical lines correspond to shots fired at the target (the flat surface
beneath it) then the sampling condition is satisfied and we need only to determine
the intercept length of each line.
The length of each intercept line passing vertically through a sphere is
uniquely related to its distance from the center (Figure 11.4b). A cross section
(Figure 11.4c) shows that for a sphere of unit radius this length is $L = 2(1 - r^{2})^{1/2}$, and the number of lines at each radius
is proportional to the area of the circular strip shown on the plane, which is $2\pi r\,dr$. The result is a frequency curve for the intercept length L whose shape is just a
straight line:
$$d\eta = \frac{\pi}{2} L \, dL \tag{11.4}$$
The Monte-Carlo sampling approach to this is similar to that used before
for the area of a circle. Because of the symmetry of the sphere, we can work just in
one quadrant. X, Y values are generated by the RND function, and for each line
that hits the sphere, the length is calculated. These lengths are summed in an array
of 20 counters that become the frequency vs. length histogram. The program could
be written as shown in Listing 2 in the Appendix.
As before, a fairly large number of trials are required to obtain a good
estimate of the shape of the distribution curve. Figure 11.5 shows three results
from typical runs of the program, with 100, 1000 and 10,000 trials. The latter is
a fairly good fit to the expected straight line. The uncertainty in each bin is
directly predictable from the number of counts, as described in the chapter on
statistics.
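A minimal Python sketch of this sampling procedure is given below; it is not the Appendix listing. It assumes a sphere of unit radius, uses `random.random()` in place of RND, and bins each intercept by its length relative to the diameter. The names `sphere_intercept_histogram` and `n_bins` are illustrative.

```python
import math
import random

def sphere_intercept_histogram(n_trials, n_bins=20, seed=None):
    """Histogram of vertical intercept lengths through a unit sphere."""
    rng = random.Random(seed)
    counts = [0] * n_bins
    hits = 0
    for _ in range(n_trials):
        x = rng.random()
        y = rng.random()
        r2 = x * x + y * y
        if r2 >= 1.0:
            continue  # this vertical line misses the sphere
        length = 2.0 * math.sqrt(1.0 - r2)  # intercept length, 0 < length <= 2
        k = min(int(length / 2.0 * n_bins), n_bins - 1)  # bin by length / diameter
        counts[k] += 1
        hits += 1
    return counts, hits

counts, hits = sphere_intercept_histogram(10000, seed=2)
for k, c in enumerate(counts):
    print(f"{(k + 0.5) / 20:.3f}  {c / hits:.4f}")
```

Because the frequency is linear in L, the expected fraction of intercepts in bin k (counting from 0) of n equal bins is (2k + 1)/n², which gives a simple check on the output.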
Intercept Lengths in Other Bodies
It is possible to perform both the integration and Monte-Carlo operations
very simply for a sphere, because of the high degree of symmetry. For other shapes
this becomes more difficult. For example, the cube shown in Figure 11.6 has a radically different distribution of intercept lengths than the sphere. It is very unlikely
to find a short intercept for the sphere (as shown in the frequency histogram)
because the line must pass very close to the edge of the sphere. But for a cube, the
edges and corners provide numerous opportunities for short intercept lengths.
However, if we restricted the test lines to ones vertical in the drawing, no short intercepts would be observed (nor would any long ones; all vertical intercepts would have
exactly the same length). Unlike the case for the sphere, it is necessary to consider
all possible orientations of the lines with respect to the figure. This will require more
random numbers.
It is quite straightforward to determine a histogram of frequency versus
intercept length in a circle (2-dimensional) or sphere (3-dimensional) using the
Monte-Carlo approach, even though both these problems can also be solved
analytically. For other shapes, the Monte-Carlo method offers many practical
advantages, including that it is not difficult to program on even a small computer.
However, care is required so that the intersecting lines are properly randomized in
Figure 11.4. Intercept lengths in a sphere (a). The number of lines with the same length
is proportional to the area of a circular segment on the plane (b) and the length is a
function of the distance of the line from the center (c). (For color representation see the
attached CD-ROM.)
space with respect to the feature, and have uniform probability of passing through
all regions and in all directions.
In two dimensions, consider the case of determining intercept lines in a
square. The square may be specified as having a unit side length and corners at
(±0.5, ±0.5). Consider the following ways of specifying lines, and whether they will
meet the requirements for random sampling (all of the random numbers indicated
by RND below are considered to vary from 0 to 1, with uniform probability
density, as produced by most computer subroutines that generate pseudo-random
numbers):
1) generate one number Y = RND - 0.5, and use it to locate a horizontal
line across the square. This will uniformly sample space (if the square is
Figure 11.5. Probability curves for intercept lengths in a sphere generated by a Monte-Carlo program.
Figure 11.6. A cube with illustrative intercept lines. (For color representation see the
attached CD-ROM.)
subdivided into a checkerboard of smaller squares, and the number of lines
passing through each is counted, the numbers will be the same within
counting statistics). However, all of the intercept lengths through the
square will have length exactly 1.0 because the directions are not properly
randomized.
2) generate one number THETA = π RND, and use it to orient a line passing
through the origin (center of the square). This will uniformly sample orientations, but not positions. The counts in the sampling squares near the center
will be much higher than those near the periphery. Short intercept lengths
will not be generated, and this will consequently not produce a true frequency histogram.
3) generate three random numbers: X = RND - 0.5, Y = RND - 0.5, and
THETA = π RND. The line is defined as passing through the point X,Y at
angle THETA. It is less obvious that this also produces biased data, but it,
too, favors the sampling squares near the center at the expense of those near
the periphery and produces a biased set of intercept lengths. So do other
combinations such as using four random numbers for X1,Y1 and X2,Y2 (two
points to define the line).
To reveal the bias in these (and other) possible methods, it is instructive to
write a small program to perform the necessary steps. This is strongly recommended
to the practitioner, as the 2-dimensional geometry and simple shape of the square
make things much easier than some of the 3-dimensional problems that may be
encountered. Set up an array to count squares (e.g. 10 × 10) through which the lines
pass. Then use various methods to generate the lines, sum the counts, and also
construct a histogram of intercept lengths. It is also useful to keep track of the
histogram of line lengths within the circumscribed unit circle around the square,
since the shape of that distribution is known and departures will reflect some fault
in the randomness of the line generation routine.
When this is done for the methods described above, arrays like those shown
in Figures 11.7 and 11.8 indicate improper sampling. Both of these methods, and
the others mentioned, also produce quite wrong histograms of intercept length for
both the square and the circle.
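A small test program of the kind recommended above could be sketched as follows, here for the third method in the list (a uniform point in the square plus a uniform angle). This is an illustrative Python sketch, not the program used for the figures: the helper names are hypothetical, and the cells crossed by each line are found approximately by walking along the chord in small steps.

```python
import math
import random

def clip_line_to_square(px, py, theta, half=0.5):
    """End points of the chord cut from the square |x|,|y| <= half by the line
    through (px, py) with direction angle theta; None if the line misses."""
    dx, dy = math.cos(theta), math.sin(theta)
    t_lo, t_hi = -math.inf, math.inf
    for p, d in ((px, dx), (py, dy)):
        if abs(d) < 1e-12:
            if abs(p) > half:
                return None  # parallel to these two sides and outside them
            continue
        ta, tb = (-half - p) / d, (half - p) / d
        t_lo = max(t_lo, min(ta, tb))
        t_hi = min(t_hi, max(ta, tb))
    if t_lo >= t_hi:
        return None
    return (px + t_lo * dx, py + t_lo * dy), (px + t_hi * dx, py + t_hi * dy)

def cells_crossed(p0, p1, n=10, step=0.005):
    """Approximate set of (row, col) grid cells touched, found by walking the chord."""
    (x0, y0), (x1, y1) = p0, p1
    m = max(1, int(math.hypot(x1 - x0, y1 - y0) / step))
    cells = set()
    for i in range(m + 1):
        x = x0 + (x1 - x0) * i / m
        y = y0 + (y1 - y0) * i / m
        col = min(max(int((x + 0.5) * n), 0), n - 1)
        row = min(max(int((y + 0.5) * n), 0), n - 1)
        cells.add((row, col))
    return cells

def grid_counts_point_and_angle(n_lines, n=10, seed=None):
    """Cell counts for lines defined by a random point in the square and a random angle."""
    rng = random.Random(seed)
    grid = [[0] * n for _ in range(n)]
    for _ in range(n_lines):
        px, py = rng.random() - 0.5, rng.random() - 0.5
        theta = math.pi * rng.random()
        ends = clip_line_to_square(px, py, theta)
        if ends is None:
            continue
        for row, col in cells_crossed(ends[0], ends[1], n=n):
            grid[row][col] += 1
    return grid

for row in grid_counts_point_and_angle(2000, seed=3):
    print(" ".join(f"{c:4d}" for c in row))
```

Printing the array in this way makes the center weighting visible at a glance; swapping in a different line generator only requires changing the three lines that pick `px`, `py` and `theta`.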
A proper method for generating randomized lines is:
Generate R = 0.5 RND and THETA = π RND. This defines a
point within the unit circle, and a vector from the origin to that
point. Pass the line for intercept measurement through the
point, perpendicular to the vector. (Note that generating an
intercept line from two points on the circumscribed circle using
two angles THETA = π RND does not produce random lines,
although for a sphere it does.)
An equivalent way to consider the problem is to imagine that instead of the
square being fixed, and the lines oriented at all angles, the square is rotated inside
the circle (requiring one random number to specify the orientation angle). Then the
lines can all be parallel, just as for the earlier method of obtaining intercepts through
the circle. This also requires one random number (the position along the axis). With
these methods, or others which are equivalent once rotation of coordinates is carried
out, proper sampling of the grids is achieved.
112  110  122  143  147  163  121  114   97  103
123  170  163  181  195  175  185  153  138  124
134  191  228  242  216  220  188  189  176  157
159  181  213  259  260  255  255  247  175  130
147  172  201  253  285  287  273  254  200  166
156  175  211  259  288  305  286  258  224  151
139  183  230  252  283  265  271  270  209  172
127  165  230  252  261  230  244  193  189  160
123  158  166  190  209  207  208  179  141  125
115  126  118  127  145  159  160  132  124  118
Figure 11.7. Array of counts for 10 × 10 grid using four random numbers to generate
X1 = RND - 0.5,Y1 = RND - 0.5 and X2 = RND - 0.5,Y2 = RND - 0.5 points to define
the line. Note the center weighting. The graph shows the counts as an isometric display.
177  109   59   48   39   37   37   50   63  185
197  184  173  167  142  132  138  152  161  168
235  232  216  203  213  206  239  279  297  241
249  244  282  367  312  301  333  302  266  240
257  280  318  383  423  419  341  288  260  225
220  250  298  353  422  458  406  313  240  239
196  233  280  322  360  356  361  342  274  246
197  234  225  245  222  245  399  269  242  222
201  233  181  141  128  134  145  209  196  218
181   92   51   37   35   33   38   50  101  197
Figure 11.8. Array of counts for 10 × 10 grid using two random numbers to generate
points along left and right edges of square. In addition to center weighting note the
sparse counts along top and bottom edges. The graph shows the counts as an isometric display.
The result is a histogram of intercept lengths in the square as shown in Figure
11.9. This shows a flat shelf for short lengths, where the line passes through two
adjacent sides of the square. The peak corresponds to the length of the square’s
edge, and the probability then drops off rapidly to the maximum intercept (the diagonal, which is equal to the circle diameter).
The program used to generate the data for the histogram is shown in Listing
3 in the Appendix (it uses the first of the randomizing methods described above).
In this program, the array CT is defined (to build a 20 point histogram in this
example). DG is the radius of the circumscribed circle, or half the square’s diagonal; P2 is 2π. The random function is used to generate a point, from which the slope
(M) and intercept (B) of the line through this point perpendicular to the vector from
the origin are calculated (the equation of the line is y = Mx + B). The intersection of
this line with the left side of the square or, if it does not pass through that side, the
intersection with the top or bottom is determined as one end of the line, and the
process is repeated for the other end. The line length is obtained from the Pythagorean
theorem. For lines that intersect the square, the ratio of this length to the circle
diameter is converted to an integer that addresses the histogram bin to be incremented.
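The same steps can be sketched in Python. This is an illustration rather than a translation of the Appendix listing: it assumes the radius is drawn up to the half-diagonal of the unit square and the angle over the full range 0 to 2π (following the description of DG and P2 above), and it clips the line against the square with a parametric test instead of the slope-intercept form. All function and variable names are hypothetical.

```python
import math
import random

def clip_line_to_square(px, py, theta, half=0.5):
    """End points of the chord cut from the square |x|,|y| <= half by the line
    through (px, py) with direction angle theta; None if the line misses."""
    dx, dy = math.cos(theta), math.sin(theta)
    t_lo, t_hi = -math.inf, math.inf
    for p, d in ((px, dx), (py, dy)):
        if abs(d) < 1e-12:
            if abs(p) > half:
                return None  # parallel to these two sides and outside them
            continue
        ta, tb = (-half - p) / d, (half - p) / d
        t_lo = max(t_lo, min(ta, tb))
        t_hi = min(t_hi, max(ta, tb))
    if t_lo >= t_hi:
        return None
    return (px + t_lo * dx, py + t_lo * dy), (px + t_hi * dx, py + t_hi * dy)

def chord_histogram(n_lines, n_bins=20, seed=None):
    """Histogram of intercept lengths in the unit square for randomized lines."""
    rng = random.Random(seed)
    half_diag = math.sqrt(0.5)  # radius of the circle circumscribed about the square
    counts = [0] * n_bins
    total_length, hits = 0.0, 0
    for _ in range(n_lines):
        r = half_diag * rng.random()
        theta = 2.0 * math.pi * rng.random()
        fx, fy = r * math.cos(theta), r * math.sin(theta)  # end of the random vector
        # the test line passes through (fx, fy), perpendicular to that vector
        ends = clip_line_to_square(fx, fy, theta + math.pi / 2.0)
        if ends is None:
            continue  # the line crosses the circle but misses the square
        (xa, ya), (xb, yb) = ends
        length = math.hypot(xb - xa, yb - ya)
        k = min(int(length / (2.0 * half_diag) * n_bins), n_bins - 1)
        counts[k] += 1
        total_length += length
        hits += 1
    return counts, total_length / hits, hits / n_lines

counts, mean_length, hit_fraction = chord_histogram(20000, seed=4)
print(counts)
print(mean_length, hit_fraction)
```

Two quantities serve as checks on the randomization. For uniformly random lines the mean chord of a convex plane figure is π times its area divided by its perimeter (π/4 for the unit square), and the fraction of lines through the circle that also hit the square is the ratio of the two perimeters.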
Intercept Lengths in Three Dimensions
Progressing to three dimensions, things become more complicated, and there
are many more ways to bias the sampling. An extension of the methods described
Figure 11.9. Histogram of intercept lengths in a square, generated by Monte-Carlo
program.
above for the circle and square can be used for a sphere and cube, as follows (refer
to Figure 11.10 for the nomenclature used for a spherical coordinate system).
1) Generate a random vector from the origin. This requires a radius R which
is just the sphere radius times a random number, and two angles. Using the
nomenclature of Figure 11.10, the θ angle can be generated as 2π · RND,
but the Φ angle cannot be just (π/2) · RND. If this were done, the vectors
that pointed nearly vertically would be much denser in space than those that
were nearly horizontal, because for each vertical angle there would be the
same number of vectors, and the circumference of the “latitude” circles
decreases toward the pole. The proper uniform sampling of orientations
requires that Φ be generated as the angle whose sine is a random number
from 0 to 1 (producing angles from 0 to π/2 as desired). Through the point
at the end of the vector defined by these two angles and radius, pass a plane
perpendicular to the vector. Then within this plane, generate a random line
Figure 11.10. Spherical coordinates as described in the text. (For color representation
see the attached CD-ROM.)
by the method described above (angle and radius from the initial point to
define another point, and a line through that point perpendicular to the
vector from the initial point). This is less cumbersome than it sounds if
matrix arithmetic is used properly to deal with the vectors.
2) Locate two random points on the circumscribed sphere. As in the method
above, the points are defined by two angles, θ and Φ, which are generated as
θ = 2π RND, Φ = arc sin (RND). Connect these points with a line.
As before, alternate methods where the cube rotates while the line orientation stays fixed in space can be used instead (but the same method for determining
the tilt angle is required). Conceptually it may be helpful to visualize the process as
one of embedding the cube inside a sphere which is then rotated while the intersection lines pass through it vertically, as shown in Figure 11.11. Once the line has
been determined, it is straightforward to find the intersections with the cube faces
and obtain the intercept length. In the program shown as Listing 4 in the Appendix, this is done by setting up a matrix equation and solving it. Simpler methods
can be used for the cube, but this more general approach is desirable when we next
turn our attention to less easy shapes.
This program dimensions an array for the histogram and line and defines the
equations for each of the six faces on the cube (equations such as 1X + 0Y + 0Z =
0.5). Then a random point on the sphere is generated. For convenience, the elevation angle E used here is the complement of the angle Φ discussed above, and the
complicated arctan function is used to get the arc sine, which many languages do
not provide. The point X1, Y1, Z1 (and the second point X2, Y2, Z2) define the
intersections of the random line with the sphere.
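The effect of the arc sine rule is easy to check numerically. The Python sketch below generates points on a unit sphere in two ways: with an elevation angle whose sine is uniform, and with the tempting but incorrect uniform elevation angle. It assumes the sine is drawn from -1 to 1 so that both hemispheres are covered (the text describes one hemisphere, 0 to 1), and the function names are illustrative.

```python
import math
import random

def random_point_on_sphere(rng, radius=1.0):
    """Uniform point on the sphere: the sine of the elevation angle is uniform."""
    theta = 2.0 * math.pi * rng.random()        # azimuth
    elev = math.asin(2.0 * rng.random() - 1.0)  # assumption: cover both hemispheres
    c = math.cos(elev)
    return (radius * c * math.cos(theta),
            radius * c * math.sin(theta),
            radius * math.sin(elev))

def naive_point_on_sphere(rng, radius=1.0):
    """Biased version: a uniform elevation angle crowds points toward the poles."""
    theta = 2.0 * math.pi * rng.random()
    elev = math.pi * (rng.random() - 0.5)
    c = math.cos(elev)
    return (radius * c * math.cos(theta),
            radius * c * math.sin(theta),
            radius * math.sin(elev))

def polar_fraction(sampler, n, seed=None):
    """Fraction of points with |z| > 1/2; this is 0.5 for a uniform distribution."""
    rng = random.Random(seed)
    return sum(1 for _ in range(n) if abs(sampler(rng)[2]) > 0.5) / n

print(polar_fraction(random_point_on_sphere, 50000, seed=5))
print(polar_fraction(naive_point_on_sphere, 50000, seed=5))
```

The two polar caps beyond |z| = 1/2 hold exactly half of the sphere's surface, so a correct generator should place about half of its points there; the uniform-angle generator places about two-thirds.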
Figure 11.11. Linear intercepts in a cube can be determined by embedding the cube
in a sphere which is then oriented randomly and vertical lines passed through it. (For
color representation see the attached CD-ROM.)
Coefficients in the A matrix define this line. Each face is checked for intersections with the line, by placing the face coordinates into the matrix. PC is a counter
used to find two intersections of the line with faces. The determinant of the matrix
is evaluated first (if it is zero, skip to another face, since the current one is exactly parallel to the line); solving the matrix equation then gives the coordinates (X, Y, Z) of the intersection of the line with the face plane,
which are checked to see if the intersection lies within the face of the cube. For the
first such intersection, the coordinates are saved and the counter PC incremented.
When the second point is found, the distance between them (the intercept length)
is calculated and converted to an integer for the histogram address, and one count
is added. The process repeats for the selected number of intersection lines. Some
output of the data as a graph is needed to make the results accessible.
Several additional features can be added to this type of program to serve as
useful checks on whether gross errors are present in the various parts of the
program. First, the sphere intercept length can also be used to build a histogram
(this is just the distance between the points X1, Y1, Z1 and X2, Y2, Z2) which should
agree with the expected frequency histogram for a sphere. Also, by summing the
lengths of all the lines in both the sphere and cube, the volume fraction V_V and the
surface area per unit volume S_V of the cube in the sphere can be obtained and compared to the known correct values. The mean intercept length in the feature, λ, can
also be obtained, which should equal 4 V/S for the cube. It is also relatively simple
to add a counter for each face of the cube, and compare the number of intersections of lines with each face (if random orientations are correctly used, these values
should be the same within counting variability).
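Two of these checks can be illustrated with a short Python sketch. It is not the Appendix program: instead of the matrix solution it uses one of the simpler methods that work for a cube (clipping the line against the three pairs of parallel faces), and it assumes a unit cube centered at the origin with lines joining two random points on the circumscribed sphere. The names are hypothetical.

```python
import math
import random

def random_point_on_sphere(rng, radius):
    theta = 2.0 * math.pi * rng.random()
    elev = math.asin(2.0 * rng.random() - 1.0)  # sine of the elevation is uniform
    c = math.cos(elev)
    return (radius * c * math.cos(theta),
            radius * c * math.sin(theta),
            radius * math.sin(elev))

def cube_intercept(p1, p2, half=0.5):
    """Length of the part of segment p1-p2 inside the cube |x|,|y|,|z| <= half."""
    d = [b - a for a, b in zip(p1, p2)]
    t_lo, t_hi = 0.0, 1.0
    for a, da in zip(p1, d):
        if abs(da) < 1e-12:
            if abs(a) > half:
                return 0.0  # parallel to this pair of faces and outside them
            continue
        ta, tb = (-half - a) / da, (half - a) / da
        t_lo = max(t_lo, min(ta, tb))
        t_hi = min(t_hi, max(ta, tb))
    if t_lo >= t_hi:
        return 0.0
    return (t_hi - t_lo) * math.sqrt(sum(c * c for c in d))

def cube_in_sphere_stats(n_lines, seed=None):
    """Mean intercept length in the cube and volume fraction of cube in sphere."""
    rng = random.Random(seed)
    radius = math.sqrt(3.0) / 2.0  # circumscribed sphere of the unit cube
    sum_cube = sum_sphere = 0.0
    hits = 0
    for _ in range(n_lines):
        p1 = random_point_on_sphere(rng, radius)
        p2 = random_point_on_sphere(rng, radius)
        sum_sphere += math.dist(p1, p2)  # intercept length in the sphere
        length = cube_intercept(p1, p2)
        if length > 0.0:
            sum_cube += length
            hits += 1
    return sum_cube / hits, sum_cube / sum_sphere

mean_length, volume_fraction = cube_in_sphere_stats(50000, seed=6)
print(mean_length)      # compare with 4V/S = 2/3 for the unit cube
print(volume_fraction)  # compare with V_cube / V_sphere, about 0.368
```

If either value drifts away from its known target as the number of lines grows, the line generator or the intersection test is suspect.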
Before looking at the results from this program, it is worth pointing out that
it may be relatively easily extended to other shapes, even non-regular or concave
ones. As shown in Figure 11.12, for polyhedra the smaller the number of faces the
easier it is to produce short intercepts where the lines pass through adjacent faces.
The number of faces will vary, and so some of the dimensions will change, and it
may be worthwhile to directly compute the equations of each face plane from vertex
points. The most significant change is in the test to see if the intersection point of
the line with each face plane lies within the face. For the general polyhedron, the
faces are polygons. The simplest way to proceed is to divide the face into triangles
formed by two adjacent vertices and the intersection point, and sum the triangle
areas. If this value exceeds the known area of the face, then the point lies outside
the face (provision for finite numerical accuracy within the computer must be made
in the equality test). Many curved surfaces can also be described by analytical geometry, but complex objects (e.g. those defined by splines) require more difficult techniques to locate the intersection points. For non-convex objects some lines will create
more than one intercept.
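The triangle-area test for a polygonal face can be sketched as follows. This Python fragment is an illustration under stated assumptions: the face is convex, its vertices are listed in order around the boundary, and the test point already lies in the face plane. The tolerance `tol` is an arbitrary allowance for rounding error, and the function names are hypothetical.

```python
import math

def triangle_area(a, b, c):
    """Area of the triangle with 3-D vertices a, b, c (half the cross-product length)."""
    u = [b[i] - a[i] for i in range(3)]
    v = [c[i] - a[i] for i in range(3)]
    cx = u[1] * v[2] - u[2] * v[1]
    cy = u[2] * v[0] - u[0] * v[2]
    cz = u[0] * v[1] - u[1] * v[0]
    return 0.5 * math.sqrt(cx * cx + cy * cy + cz * cz)

def point_in_face(point, vertices, tol=1e-9):
    """True if a point in the face plane lies inside the convex polygon 'vertices'."""
    n = len(vertices)
    # triangles formed by the point and each pair of adjacent vertices
    fan = sum(triangle_area(point, vertices[i], vertices[(i + 1) % n])
              for i in range(n))
    # area of the face itself, from a fan of triangles about the first vertex
    face = sum(triangle_area(vertices[0], vertices[i], vertices[i + 1])
               for i in range(1, n - 1))
    return fan <= face * (1.0 + tol) + tol

square_face = [(0.5, -0.5, -0.5), (0.5, 0.5, -0.5), (0.5, 0.5, 0.5), (0.5, -0.5, 0.5)]
print(point_in_face((0.5, 0.0, 0.0), square_face))  # point at the face center
print(point_in_face((0.5, 0.7, 0.0), square_face))  # point beyond one edge
```

For a point inside the face the triangles tile the polygon exactly, so the two areas agree to within rounding; for a point outside, the triangles overlap and extend beyond the face, and their summed area is larger.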
Applying this program to a series of regular polyhedra produces the results
shown in Figure 11.13. The results are interesting in several respects. For instance,
note that whereas for the sphere, the most likely intercept length is the maximum
(equal to the sphere diameter), large values are quite unlikely for the polyhedra
because they require a perfect vertex-to-vertex alignment (and are impossible for the
tetrahedron because even this distance is less than the sphere diameter). Also, the
peaks present in the frequency histogram correspond to the distance between
Figure 11.12. Linear intercepts in other solids: a) tetrahedron; b) many-sided polyhedron; c) cylinder; d) torus; e) arbitrary non-convex shape (a banana). (For color representation see the attached CD-ROM.)
parallel faces (again, there are none for the tetrahedron), while the sloping but linear
shelves correspond to intersections through adjacent faces.
The mean intercept lengths in the various solids are obtained from the distributions. They are listed below as fractions of the diameter of the circumscribed
sphere.
Solid          Mean Intercept
Tetrahedron    0.232
Cube           0.393
Octahedron     0.399
Icosahedron    0.538
Sphere         0.667
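As an independent point of comparison for these Monte-Carlo values, the relation λ = 4V/S mentioned earlier can be evaluated directly. The Python sketch below does this using the standard closed-form volume, surface area and circumradius of each solid for edge length a = 1; these formulas are not given in the text and are supplied here as assumptions of the sketch.

```python
import math

def mean_intercept_fraction(volume, surface, circumradius):
    """4V/S expressed as a fraction of the circumscribed sphere's diameter."""
    return (4.0 * volume / surface) / (2.0 * circumradius)

a = 1.0  # edge length; the ratios do not depend on it
solids = {
    # name: (volume, surface area, circumradius)
    "Tetrahedron": (a**3 / (6.0 * math.sqrt(2.0)), math.sqrt(3.0) * a**2,
                    a * math.sqrt(6.0) / 4.0),
    "Cube": (a**3, 6.0 * a**2, a * math.sqrt(3.0) / 2.0),
    "Octahedron": (math.sqrt(2.0) / 3.0 * a**3, 2.0 * math.sqrt(3.0) * a**2,
                   a / math.sqrt(2.0)),
    "Icosahedron": (5.0 * (3.0 + math.sqrt(5.0)) / 12.0 * a**3,
                    5.0 * math.sqrt(3.0) * a**2,
                    a / 4.0 * math.sqrt(10.0 + 2.0 * math.sqrt(5.0))),
    "Sphere": (4.0 / 3.0 * math.pi, 4.0 * math.pi, 1.0),  # unit radius
}

fractions = {name: mean_intercept_fraction(*vsr) for name, vsr in solids.items()}
for name, value in fractions.items():
    print(f"{name:12s} {value:.3f}")
```

The computed ratios can be set beside the tabulated values to judge how closely a histogram built from a finite number of lines reproduces the expected means.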
Figure 11.13. Frequency histograms for intercept lengths through various regular polyhedra, compared to that for a sphere, generated by Monte-Carlo program using 50
intervals for length, and about 40,000 intercept lines.
Intersections of Planes with Objects
Not all distribution curves are for intercept lengths, of course. It is also possible to model using geometric probability (or to measure on real samples) the areas
of intersection profiles cut by planar sections through any shape feature. As usual,
this is easiest for a sphere. Figure 11.14a shows that any intersection of a plane with
a sphere produces a circle. Because of the symmetry, rotation is not needed and a
series of parallel planes (Figure 11.14b) can generate the family of circles whose
radii are calculable from the position of the plane, analogous to the case for the line
intercept.
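For the sphere this calculation is short enough to sketch in a few lines of Python. The fragment assumes a sphere of unit radius and a plane whose distance h from the center is uniformly distributed between 0 and 1, so the section is a circle of radius (1 - h²)^{1/2}; the function name is illustrative.

```python
import math
import random

def sphere_section_areas(n_planes, seed=None):
    """Areas of the circles cut from a unit sphere by uniformly positioned planes."""
    rng = random.Random(seed)
    areas = []
    for _ in range(n_planes):
        h = rng.random()                      # distance of the plane from the center
        areas.append(math.pi * (1.0 - h * h)) # circle of radius sqrt(1 - h^2)
    return areas

areas = sphere_section_areas(20000, seed=7)
counts = [0] * 10
for area in areas:
    k = min(int(area / math.pi * 10), 9)  # bin by area / maximum area
    counts[k] += 1
print(counts)
print(sum(areas) / len(areas) / math.pi)  # mean area relative to the maximum
```

The mean section area should approach two-thirds of the equatorial circle's area, which is the sphere's volume divided by its diameter, and the histogram rises steeply toward the largest areas.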
For a cube, the intersection can have 3, 4, 5 or 6 sides (Figure 11.15). It is
evident that the cube’s corners produce a large number of small area intersections,
and that there are ways to cut areas quite a bit larger than the most probable slice
(which is nearly equal to the area of a square face of the cube). Figure 11.16 shows
a plot (Hull & Houk, 1958) of the area of profiles cut through spheres and cubes.
Generating similar distributions for planar intersections with other shapes is a
straightforward if rather lengthy exercise in geometric probability. It also reveals that
humans are not good intuitive judges of intersections of planes with solids. Figure
11.17 shows a torus, familiar to many researchers in the form of a bagel. A few of the
less obvious cuts which produce sections unexpected to casual observers are shown.
Considering the difficulty in predicting the shape of section cuts from known objects,
it should not be surprising that the reverse process of imagining the unknown 3D
object that may have given rise to observed planar sections easily leads to errors.
For other shapes with even less symmetry, the analytical approach becomes
completely impractical. Even Monte-Carlo approaches are time-consuming,
because of the problems of calculating the intersection lines and points, and it may
take a very large number of trials to get enough counting statistics in the low-probability regions of the frequency histogram to adequately define it (particularly when
Figure 11.14. Intersection of a plane with a sphere in any orientation produces a circle
(a). A family of parallel planes (b) produces a distribution of circle sizes. (For color representation see the attached CD-ROM.)
the shape of the curve is to be used to deconvolute distributions from many different feature sizes).
Bertrand’s Paradox
There is a problem which has been mentioned briefly in the previous examples, but which often becomes important when dealing with ways to orient complex
shapes using random numbers. It is easy to bias (or, in mathematical terminology,
to constrain) the orientations so that the random numbers do not produce a random
sampling of the space inside the object. In a trivial example, if we only rotated the
cube around one axis, we would not see all intercept lengths; but if we rotate it uniformly around all three space angles, the proper results are obtained.
A classic example of the problem of (im)proper sampling is stated in a
famous paradox presented by Bertrand in 1889. The problem to be solved is stated
as follows: What is the probability that a random line passing through a circle will
have an intercept length greater than the side of the inscribed equilateral triangle in
the circle?
Figure 11.15. Planes intersecting a cube can produce intersections with: a) 3; b) 4; c)
5; d) 6 sides. (For color representation see the attached CD-ROM.)
Using the three drawings in Figure 11.18 (and working from left to right),
here are three arguments that can be presented to quickly arrive at an answer.
1. Since the circle has perfect symmetry, and any point on the periphery is the
same as any other, we will consider without loss of generality, lines that enter
the circle at one corner of the triangle, but at any angle. Since the total range
of the angle theta is from 0 to 180 degrees, and since the triangle subtends
an angle of 60 degrees (and any line lying within the shaded region clearly
has an intercept length greater than the length of the side of the triangle),
the probability that the intercept line length is greater than the side of the
triangle is just 60/180, or one-third.
2. Since each line passing through the circle produces a line segment which must
have a center point, which will lie within the circle, we can reduce the problem
to determining how many of the line segments have midpoints that lie within
the shaded circle inscribed in the triangle (these lines will have lengths longer
than the side of the triangle, while any line whose midpoint is outside the
circle will have a shorter length). Since the area of the small circle is just one-fourth the area of the large one, the probability that the intercept line length
is greater than the side of the triangle is one-fourth.
3. Since the circle is symmetric, we lose no generality in considering only vertical lines. Any vertical line which passes through the circle but passes outside
of the smaller circle inscribed in the triangle will have a length shorter than
the side of the triangle, so the probability that the intercept line length is
greater than the side of the triangle is the ratio of diameters of the circles,
or one-half.
Figure 11.16. Intercept area distributions for a sphere and cube (frequency plotted against area/max area). (For color representation see the attached CD-ROM.)
These three arguments are all, at least on the surface, plausible. But they
produce three different answers. Perhaps we should choose one-third, since it is in
the middle and hence “safer” than the two extreme answers!
No, for lines that are isotropic, uniform and random in the plane, the correct answer is
one-half. The other two arguments are invalid
because they impose constraints on the random lines. Lines spaced at equal angle
steps all passing through one point on the circle’s periphery do not uniformly sample
it, but are closer together at small and large angles. This under-represents the
number of lines that lie within the shaded triangle, and hence produces too low an
answer. Likewise, the argument based on midpoints lying within the shaded inner circle would
be correct if only one angular orientation were considered (it would then be equivalent to the third argument), but when all angles are considered, spreading the midpoints
uniformly over the area over-represents the lines that pass outside the shaded circle,
and again produces too low an answer. It can be very tricky to find subtle forms
of this kind of bias. They may arise in either the analytical integration or the random
sampling methods.
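The three arguments can be compared by simulation. The following is a minimal Python sketch, not one of the book's listings: the function names, the unit radius and the seed are illustrative assumptions. It relies on the fact that a chord is longer than the side of the inscribed triangle exactly when its distance from the center is less than half the radius.

```python
import math
import random


def chord_is_long(distance, radius=1.0):
    """True if a chord at this distance from the center exceeds the triangle side."""
    return distance < radius / 2.0


def bertrand_estimates(trials=200_000, seed=1):
    """Estimate the 'long chord' probability under the three sampling schemes."""
    rng = random.Random(seed)
    hits = {"endpoint_angle": 0, "midpoint_area": 0, "parallel_lines": 0}
    for _ in range(trials):
        # Argument 1: line through a fixed point on the periphery, with its
        # angle to the tangent uniform in (0, pi); distance from center = |cos|.
        angle = math.pi * rng.random()
        hits["endpoint_angle"] += chord_is_long(abs(math.cos(angle)))

        # Argument 2: chord midpoint uniform over the area of the circle,
        # so its radial position is sqrt(uniform).
        hits["midpoint_area"] += chord_is_long(math.sqrt(rng.random()))

        # Argument 3: parallel lines whose offset from the center is uniform.
        hits["parallel_lines"] += chord_is_long(rng.random())
    return {name: count / trials for name, count in hits.items()}


if __name__ == "__main__":
    for name, value in bertrand_estimates().items():
        print(f"{name}: {value:.3f}")
```

With a few hundred thousand trials the three estimates settle near one-third, one-fourth and one-half respectively, which makes the point that each argument is internally consistent and that the disagreement lies entirely in how the "random" line is generated.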
The method used to generate random orientation of lines in space for the
Monte-Carlo routines described above must avoid this problem. The trick is to use
θ = 2π · RND and Φ = arcsin(RND), using the terminology of Figure 11.10. This
avoids clustering many vectors around the Z axis, and gives uniform sampling.
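The effect of the arcsine can be seen in a short Python sketch. It assumes that the second angle is the elevation above the XY plane (the convention used in Listing 4, where Z = R sin E); the function names and the comparison statistic are illustrative choices, not taken from the text. For directions spread uniformly over the upper hemisphere the mean Z component is 0.5, whereas choosing the elevation uniformly in angle crowds vectors toward the Z axis and raises the mean to 2/π.

```python
import math
import random


def uniform_direction(rng):
    """Unit vector uniform over the upper hemisphere (sufficient for undirected lines)."""
    theta = 2.0 * math.pi * rng.random()   # azimuth about the Z axis
    phi = math.asin(rng.random())          # elevation, uniform in sin(phi)
    return (math.cos(phi) * math.cos(theta),
            math.cos(phi) * math.sin(theta),
            math.sin(phi))


def naive_direction(rng):
    """Biased alternative: elevation uniform in angle, which clusters near Z."""
    theta = 2.0 * math.pi * rng.random()
    phi = 0.5 * math.pi * rng.random()
    return (math.cos(phi) * math.cos(theta),
            math.cos(phi) * math.sin(theta),
            math.sin(phi))


def mean_z(sampler, trials=100_000, seed=3):
    """Average Z component of the sampled unit vectors."""
    rng = random.Random(seed)
    return sum(sampler(rng)[2] for _ in range(trials)) / trials


if __name__ == "__main__":
    print("arcsin sampling, mean Z:", round(mean_z(uniform_direction), 3))
    print("naive sampling,  mean Z:", round(mean_z(naive_direction), 3))
```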
The Buffon Needle Problem
A good example of the role of angles in geometrical probability is the classical problem known as Buffon’s Needle (Buffon, 1777): If a needle of length L
is dropped at random onto a grid of parallel lines with spacing S, what is the
probability that it crosses a line? This is another example of a problem easy enough
to be solved by either analytical means or random sampling. First, we will look at
the procedure of integration.
Figure 11.17. Some of the less familiar planar cuts through a bagel (torus), which
produce one or two convex or concave intersections. (For color representation see the
attached CD-ROM.)
Figure 11.19 shows the situation. Two variables, y and theta, are involved;
theta is the angle of the needle with respect to the line direction, and y is the
distance from the midpoint of the needle to the nearest line, so that y ranges from 0 to S/2. It is only necessary to consider
the angle range from 0 to π/2, because of symmetry. At any angle, the needle subtends a distance of L sin θ perpendicular to the lines, so if y is less than half this,
there will be an intersection. For a needle no longer than the line spacing (L ≤ S), the
probability of intersection at a given angle is therefore L sin θ / S. Hence, averaging
over all angles, the probability of intersection is, as was stated
before
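\[
\frac{\int_0^{\pi/2} \frac{L \sin\theta}{S}\, d\theta}{\int_0^{\pi/2} d\theta}
= \frac{\frac{L}{S}\left[-\cos\left(\frac{\pi}{2}\right) + \cos(0)\right]}{\left(\frac{\pi}{2}\right)}
= \frac{2L}{\pi S}
\tag{11.5}
\]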
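Equation 11.5 can be checked by random sampling for any needle length that does not exceed the line spacing. The Python sketch below is an illustration under that assumption, separate from the pseudocode of Listing 5; the function names, default trial count and seed are arbitrary choices made here.

```python
import math
import random


def buffon_crossing_fraction(needle_length, spacing, trials=200_000, seed=5):
    """Fraction of random drops that cross a line (assumes needle_length <= spacing)."""
    rng = random.Random(seed)
    crossings = 0
    for _ in range(trials):
        theta = 0.5 * math.pi * rng.random()   # angle to the line direction
        y = 0.5 * spacing * rng.random()       # midpoint distance to nearest line
        # The needle reaches half of L*sin(theta) to either side of its midpoint.
        if y < 0.5 * needle_length * math.sin(theta):
            crossings += 1
    return crossings / trials


def buffon_probability(needle_length, spacing):
    """Closed-form crossing probability from Equation 11.5."""
    return 2.0 * needle_length / (math.pi * spacing)


if __name__ == "__main__":
    for length, gap in [(1.0, 1.0), (5.0, 6.0)]:
        print(length, gap,
              round(buffon_crossing_fraction(length, gap), 4),
              round(buffon_probability(length, gap), 4))
```

The sampled fraction and the closed form should agree to within the statistical scatter of the simulation, which shrinks only as the square root of the number of trials.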
Figure 11.18. Illustrations for Bertrand’s Paradox.
Figure 11.19. Geometry for the Buffon needle problem.
A Monte-Carlo approach to solving this problem is shown as Listing 5 in
the Appendix. L and S have been arbitrarily set to 1, so the result for Number/Count
should just be π/2.
One particular sequence of running this program ten times, for 1000 trials
each, gave a mean and standard deviation of 1.5727 ± 0.0254, which is quite
respectable. There is a probably apocryphal story about an eastern European mathematician who claimed that he dropped a needle 5 cm long onto lines 6 cm apart,
and counted 226 crossing events in 355 trials. He used this to estimate π, as 2 ×
355/226 = 3.1415929 (correct through the first 6 decimal places).
We can evaluate the (im)probability of the truth in this story. Dropping
the needle one more time, regardless of the outcome, would have seriously degraded
the “accuracy” of the result! This tale may serve as a caution to always consider
the statistical precision of results generated by a sampling method such as Monte-Carlo simulation, as well as the statistics of measured data to which it is to be
compared.
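The fragility of the quoted estimate is easy to confirm with a few lines of arithmetic. This Python sketch simply applies the expression used in the story (twice the number of trials divided by the number of crossings) to the reported counts of 355 trials and 226 crossings, and then to the two possible outcomes of one additional drop; the function and variable names are illustrative.

```python
import math


def pi_estimate(trials, crossings):
    """Estimate as used in the story: 2 * trials / crossings."""
    return 2.0 * trials / crossings


reported = pi_estimate(355, 226)        # the counts quoted in the story
one_more_miss = pi_estimate(356, 226)   # one extra drop, no crossing
one_more_hit = pi_estimate(356, 227)    # one extra drop, with a crossing

for label, value in [("355 trials, 226 crossings", reported),
                     ("one more drop, no crossing", one_more_miss),
                     ("one more drop, a crossing", one_more_hit)]:
    print(f"{label}: {value:.7f} (error {abs(value - math.pi):.1e})")
```

Either outcome of the extra drop moves the error from a few parts in ten million to several parts in a thousand, which is far more typical of what a few hundred trials can deliver.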
Appendix
Listing 1: Random points in a unit square used to measure circle area.
Integer: Increment = 1000;
Integer: Number = 0;
Integer: Count = 0;
Repeat
{
   X = RND;                        // random point in the unit square
   Y = RND;
   IF (X*X + Y*Y) < 1              // point lies inside the quarter circle
      THEN Count = Count + 1;
   Number = Number + 1;
   // report the running estimate every Increment trials
   IF Float(Number/Increment) = Integer(Number/Increment)
      THEN PRINT Number, Float(4*Count/Number);
}
Until Stopped;
Listing 2: Intercept lengths in a sphere.
Integer Count[1..20] = 0;
Integer Number, Bin, i;
Float X, Y, rsquared, Icept;
Input ("Number of trials= ",Number);
for (i = 1; i <= Number; i++)
{
   X = RND;                           // random point in the unit square
   Y = RND;
   rsquared = X*X + Y*Y;
   if rsquared < 1 then               // the line passes through the sphere
   {
      Icept = sqrt (1 - rsquared);    // half the intercept length, unit radius
      Bin = Integer (20*Icept) + 1;
      Count[Bin] = Count[Bin] + 1;
   }
}
for (Bin = 1; Bin <= 20; Bin++)
   PRINT Bin, Count[Bin];
Listing 3: Intercept lengths in a square.
Integer CT[0..20] = 0;                // histogram; L below can range from 0 to 20
Integer N, NU, L;
Float M, B, TH, R, XL, YL, XR, YR, LEN;
Const DG = sqrt(2)/2;                 // half the diagonal of the unit square
Const P2 = 8*atan(1);                 // 2π
Input ("Number of Lines: ",NU);
for (N = 1; N <= NU; N++)
{
   R = DG * RND;                      /* two random numbers */
   TH = P2 * RND;
   M = -1/tan(TH);                    // slope of the line
   B = 0.5 + R * sin(TH) - M * (0.5 + R * cos(TH));   // value of Y at X = 0
   /* get the coordinates of the ends of the line */
   XL = 0;
   YL = B;
   if ((YL>1) or (YL<0)) then
   {
      if (YL>1)
         then YL=1
         else YL=0;
      XL=(YL-B)/M;
   }
   XR = 1;
   YR = M + B;
   if ((YR>1) or (YR<0)) then
   {
      if (YR>1)
         then YR=1
         else YR=0;
      XR=(YR-B)/M;
   }
   LEN = sqrt((XR-XL)*(XR-XL)+(YR-YL)*(YR-YL));
   if (LEN>0) then
   {
      L = integer(0.5+10*LEN/DG);
      CT[L] = CT[L]+1;
   }
}
Listing 4: Intercept lengths in a cube.
Float F[6,4] = 1,0,0,0.5,
               1,0,0,-0.5,
               0,1,0,0.5,
               0,1,0,-0.5,
               0,0,1,0.5,
               0,0,1,-0.5;         // the six faces of the cube
Float A[3,4];                      // used to calculate line/face intersection
Integer LX[0..50] = 0;             // for histogram; H below can range from 0 to 50
Const P2 = 8 * atan(1.0);          // 2π
Const R = sqrt(3.0)/2.0;           // radius of circumscribed sphere
Input ("Number of Lines:",NU);
for (N = 1; N <= NU; N++)
{
   TH = P2 * RND;
   E = -1 + 2 * RND;
   E = atan (E/sqrt(1 - E*E));
   X1 = R * cos(E) * cos(TH);
   Y1 = R * cos(E) * sin(TH);
   Z1 = R * sin(E);                // one random point on the sphere
   TH = P2 * RND;
   E = -1 + 2 * RND;
   E = atan (E/sqrt(1 - E*E));
   X2 = R * cos(E) * cos(TH);
   Y2 = R * cos(E) * sin(TH);
   Z2 = R * sin(E);                // second random point on sphere
   // row 1: plane through both points and the origin
   A[1,1]=Y1*Z2-Y2*Z1;
   A[1,2]=Z1*X2-Z2*X1;
   A[1,3]=X1*Y2-X2*Y1;
   A[1,4]=0;
   // row 2: plane through both points and the point (1,0,0)
   A[2,1]=Y1*Z2-Y2*Z1;
   A[2,2]=Z1*X2-Z2*X1+Z2-Z1;
   A[2,3]=X1*Y2-X2*Y1+Y1-Y2;
   A[2,4]=Y1*Z2-Y2*Z1;             // define the line in A matrix
   PC=0;
   for (J=1; J<=6; J++)            // check faces for intersection
   {
      for (K=1; K<=4; K++)
         A[3,K]=F[J,K];
      DE= A[1,1]*(A[2,2]*A[3,3]-A[2,3]*A[3,2])
         +A[1,2]*(A[2,3]*A[3,1]-A[2,1]*A[3,3])
         +A[1,3]*(A[2,1]*A[3,2]-A[2,2]*A[3,1]);
      if (DE<>0) then              // zero means parallel to the face
      {
         X= (A[1,4]*(A[2,2]*A[3,3]-A[2,3]*A[3,2])
            +A[1,2]*(A[2,3]*A[3,4]-A[2,4]*A[3,3])
            +A[1,3]*(A[2,4]*A[3,2]-A[2,2]*A[3,4]))/DE;
         Y= (A[1,1]*(A[2,4]*A[3,3]-A[2,3]*A[3,4])
            +A[1,4]*(A[2,3]*A[3,1]-A[2,1]*A[3,3])
            +A[1,3]*(A[2,1]*A[3,4]-A[2,4]*A[3,1]))/DE;
         Z= (A[1,1]*(A[2,2]*A[3,4]-A[2,4]*A[3,2])
            +A[1,2]*(A[2,4]*A[3,1]-A[2,1]*A[3,4])
            +A[1,4]*(A[2,1]*A[3,2]-A[2,2]*A[3,1]))/DE;
         // intersection point of line with face
         if ((abs(X)<=0.5)
            and (abs(Y)<=0.5)
            and (abs(Z)<=0.5))
         then                      // see if intersection is inside cube
         {
            if (PC=0) then
            {                      // first intersection: remember it
               PC=1;
               X0=X;
               Y0=Y;
               Z0=Z;
            }
            else
            {
               LE=sqrt((X-X0)*(X-X0)+(Y-Y0)*(Y-Y0)+
                       (Z-Z0)*(Z-Z0));
               // distance between points = length
               H=integer(25*LE/R);
               LX[H]=LX[H]+1;      // increment histogram
            }
         }
      } // if DE
   } // for J
} // for N
Listing 5: The Buffon needle problem.
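Integer Number, j, Count;
Float Y, Theta, Vert;
Const HalfPi = 2 * Atan(1);        // π/2
Input ("How many trials: ",Number);
Count = 0;                         // number of crossings
for (j = 1; j<=Number; j++)
{
   Y = RND;                        // midpoint position between lines at 0 and 1
   Theta = HalfPi * RND;           // angle of needle to the line direction
   Vert = Sin (Theta) / 2;         // half the extent perpendicular to the lines
   if ((Y - Vert < 0) OR (Y + Vert > 1))
      then Count++;
}
PRINT Float(Number/Count);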