Chapter 11

Geometric Modeling

Stereology is at heart a geometric science, using the principles of geometric probability to estimate measures of three-dimensional structures from measurements that can be accessed in lower dimensions using planes, lines and points as probes. There are modern texts in geometrical probability (Kendall & Moran, 1986; Matheron, 1975; Santalo, 1976). However, one of the first areas of study in this field (long predating the use of the name “stereology”) was the calculation of the probabilities of intersection of various probes with objects of specified shape. This has roots back to the Buffon needle problem (18th century) and also involves Bertrand’s paradox, with its consideration of what a random probe really entails.

Quite a bit of work during the middle part of the twentieth century dealt with the relationship between the size distribution of objects in 3D and the size distribution of the linear or planar intercepts that a set of isotropic, uniform and random probes would produce (see for example Wicksell, 1925; Cruz-Orive, 1976). These values are needed for the unfolding of the size distribution of the intercepts to estimate the size distribution of the 3D objects. As discussed elsewhere, the trend in recent decades has been away from this approach (Gundersen, 1986) because of the ill-conditioned nature of the mathematics used in unfolding (small variations in the measured size distribution produce much larger variations in the calculated one), and because it makes some flawed assumptions about the nature of the specimen. Basically it assumes that the shape of the 3D objects is known (and has some relatively simple geometric form) and that all of the objects are the same in shape. The most common shape used for unfolding is a sphere, because of the mathematical simplicity that it yields. However, there are few real samples in which all of the objects of interest are actually perfect spheres.
Even small deviations from this shape, especially deviations that are systematic with size, introduce substantial bias into the final results.

Nevertheless it is useful to understand how the process for determining the relationship between object shape and the size distribution of intercepts works. It is unlikely that a working stereologist will need to develop such a set of data individually (and sets of such data have been published for many geometrical shapes such as cylinders, ellipsoids, and polyhedra). But the basic method helps to educate the mind in the relationship between three-dimensional objects and the intercepts which planes and lines make with them, and this is an important step in developing an understanding and even an intuition for visualizing the three-dimensional objects that are revealed in traditional two-dimensional microscopy.

This chapter reviews the tools used for making these calculations and estimations. There are two avenues by which to approach the determinations: integration of analytic expressions, and random sampling. The former will initially be more familiar, as an outgrowth of normal analytic geometry and calculus, but the latter is ultimately more powerful and useful for dealing with many of the problems encountered with real three-dimensional features.

Methods: Analytic and Sampling

As an introductory example, consider the problem of determining the area of a circle of unit radius. Figure 11.1 shows the familiar analytical approach. The circle is broken into strips of width dx, whose length, expressed as a function of x, is

$$L = 2\sqrt{1 - x^2} \tag{11.1}$$

Then the area of the circle is determined by integration, giving

$$\text{Area} = 2\int_{-1}^{+1} \sqrt{1 - x^2}\,dx \tag{11.2}$$

which can be directly integrated to give

$$2\left\{\frac{\sin^{-1}(1)}{2} - \frac{\sin^{-1}(-1)}{2}\right\} = \frac{\pi}{2} - \left(-\frac{\pi}{2}\right) = \pi \tag{11.3}$$

The other approach to this problem is to draw a circle of unit radius inside a square (of side 2, and area 4).
Hang the paper on a tree, back off some suitable distance, and shoot it with a shotgun. Figure 11.2 shows an example of typical results that you might observe. Now count the holes that are inside the circle, and those that are inside the square. The number of holes inside the circle divided by the total number of holes inside the square should be π/4.

This is a sampling method. It carries the implicit assumption of a perfectly random shotgun, in the sense that the probability of holes (points) appearing within any particular unit area of the target (plane) is equal. This is in agreement with our whole idea of randomness. And, of course, the precision of the result depends on counting statistics: we will have to shoot quite a lot of holes in the target and count them to arrive at a good answer.

Figure 11.1. Drawing illustrating the integration of the area of a circle.

Figure 11.2. The “shotgun” method of measuring the circle’s area.

This is easier to do with a computer simulation than a real shotgun. Most computer systems incorporate some type of random number generator, usually a complex software routine that multiplies, adds and divides numbers to obtain different values that mimic true random numbers. It is not our purpose here to prescribe tests of how good these “pseudo random number” generators are. (Some are quite good, producing millions or hundreds of millions of values that do not repeat and pass elaborate tests for uniformity and unpredictability; others are barely good enough for their intended purpose, usually controlling the motion of alien space invaders on the computer screen.) Assuming the availability of a random number function, and some high-level computer language in which to work, a program to shoot the holes and count them might look like the one in Listing 1 (see Appendix). This program (and the others in the Appendix) can be translated quite straightforwardly into any dialect of Basic, Fortran, Pascal, C, etc.
It uses a function RND that returns a real number value in the range 0 <= value < 1 with a uniform random distribution. This means that we are in effect looking not at the entire target, but at one-quarter of it (the center is at X = 0, Y = 0). Each generated point is checked, and counted if it is inside the circle. The program prints out the area estimate periodically, in the example every thousand points, and continues until stopped.

A typical run of the program produced the example data shown in Figure 11.3. Other runs, using a different set of random numbers, would be different, at least in the early stages. But given enough trials, the value approaches the correct one.

Figure 11.3. Typical result of running the program to measure circle area.

Because of the use of random numbers to measure probability, this sampling approach is called the Monte-Carlo method (after the famous gambling casino). It finds many applications in sampling of processes where the individual rules and physical principles are well known, but the overall process is complex and the steps are not easily combined in analytical expressions that can be summed or integrated, such as scattering of electrons, photons or alpha-particles.

In the example here, it is easy to see that the method does eventually approach the “right” answer, but that it takes a lot more work than the straightforward integration of the circle. However, given some other much more complex shape bounded by lines that could individually be described by relationships in X and Y, but for which the integral cannot so readily be evaluated, the Monte-Carlo approach could be a useful alternative way to find the area. Measuring the area of the British Isles, for example, would simply require hanging up the map and blasting away with our shotgun. We will find this approach especially useful for three-dimensional objects.
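The quarter-target sampling described above can be sketched in a few lines. This is a minimal Python illustration, not the Appendix's Listing 1: Python's `random.Random.random()` stands in for RND, and the function name `estimate_circle_area` and the `seed` parameter are assumptions introduced here for repeatability.

```python
import random


def estimate_circle_area(n_points, seed=None):
    """Estimate the area of the unit circle by sampling one quadrant.

    Points (x, y) are drawn uniformly from the unit square, which covers
    one quarter of the 2 x 2 target with the circle centre at (0, 0).
    """
    rng = random.Random(seed)      # stand-in for the RND function
    hits = 0
    for _ in range(n_points):
        x = rng.random()           # 0 <= x < 1
        y = rng.random()           # 0 <= y < 1
        if x * x + y * y < 1.0:    # the point fell inside the circle
            hits += 1
    # hits / n_points estimates pi/4; the full circle is 4 quadrants.
    return 4.0 * hits / n_points


if __name__ == "__main__":
    # Print the estimate for increasing numbers of "shots".
    for n in (1000, 10000, 100000):
        print(n, estimate_circle_area(n, seed=1))
```

The estimate scatters around π with a standard error that shrinks only as the square root of the number of points, which is why many shots are needed for a good answer.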
If the sampling pattern were not random, but instead the program had two loops that varied X and Y through a regular pattern, for instance

    FOR Y = 0 TO 1 STEP 0.02
      FOR X = 0 TO 1 STEP 0.02
        ...

then about 2500 values would be sampled, with a good result. This systematic sampling is in effect a numerical integration, if the number of steps is large enough (the step size is small enough). But with systematic sampling, you have the same problems in establishing the limits as for analytic integration. Furthermore, there is no valid answer at all until you have completed the program, and no way to improve the answer by running for a longer time (if you repeat it, you will get exactly the same answer). The random sampling method produces a reasonable answer (whose accuracy can be estimated statistically) in a short time, and it will continue to improve with running time.

Sphere Intercepts

The frequency distribution of the lengths of linear intercepts passing through a sphere is also a problem that can be solved analytically, as indicated in Figure 11.4. Figure 11.4a shows many lines passing vertically through the sphere. Because of its symmetry, it is not necessary to rotate the sphere or consider many directions. If the vertical lines correspond to shots fired at the target (the flat surface beneath it) then the sampling condition is satisfied and we need only to determine the intercept length of each line. The length of each intercept line passing vertically through a sphere of unit radius is uniquely related to its distance r from the center (Figure 11.4b). A cross section (Figure 11.4c) shows that this length is 2(1 - r²)^(1/2), and the number of lines at each radius is proportional to the area of the circular strip shown on the plane, which is 2πr dr. The result is a frequency curve for the intercept length L whose shape is just the straight line that was shown before.
$$dh = \frac{\pi}{2} L\,dL \tag{11.4}$$

(This follows because r² = 1 - L²/4 for a sphere of unit radius, so that the strip area 2πr dr equals (π/2) L dL in magnitude.)

The Monte-Carlo sampling approach to this is similar to that used before for the area of a circle. Because of the symmetry of the sphere, we can work just in one quadrant. X, Y values are generated by the RND function, and for each line that hits the sphere, the length is calculated. These lengths are tallied in an array of 20 counters that become the frequency vs. length histogram. The program could be written as shown in Listing 2 in the Appendix.

As before, a fairly large number of trials is required to obtain a good estimate of the shape of the distribution curve. Figure 11.5 shows three results from typical runs of the program, with 100, 1000 and 10,000 trials. The last is a fairly good fit to the expected straight line. The uncertainty in each bin is directly predictable from the number of counts, as described in the chapter on statistics.

Intercept Lengths in Other Bodies

It is possible to perform both the integration and Monte-Carlo operations very simply for a sphere, because of the high degree of symmetry. For other shapes this becomes more difficult. For example, the cube shown in Figure 11.6 has a radically different distribution of intercept lengths than the sphere. It is very unlikely to find a short intercept for the sphere (as shown in the frequency histogram) because the line must pass very close to the edge of the sphere. But for a cube, the edges and corners provide numerous opportunities for short intercept lengths. However, if we restricted the test lines to ones vertical in the drawing, no short intercepts would be observed (nor would any long ones; all vertical intercepts would have exactly the same length). Unlike the case for the sphere, it is necessary to consider all possible orientations of the lines with respect to the figure. This will require more random numbers.
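The sphere-intercept sampling described in the Sphere Intercepts section (one quadrant, vertical lines, 20 counters) can be sketched as follows. This is an illustrative Python version under stated assumptions rather than a transcription of Listing 2: the sphere has unit radius, `random.Random.random()` plays the role of RND, and the function name and parameters are hypothetical.

```python
import math
import random


def sphere_intercept_histogram(n_lines, n_bins=20, seed=None):
    """Histogram of intercept lengths for vertical lines through a unit sphere.

    Lines are positioned uniformly over one quadrant of the plane beneath
    the sphere; only lines that hit the sphere are counted.
    """
    rng = random.Random(seed)
    counts = [0] * n_bins
    hits = 0
    while hits < n_lines:
        x = rng.random()
        y = rng.random()
        r2 = x * x + y * y                  # squared distance from the centre
        if r2 >= 1.0:
            continue                        # this line misses the sphere
        length = 2.0 * math.sqrt(1.0 - r2)  # intercept length, 0 < length <= 2
        # Scale by the diameter (2) to get a bin index from 0 to n_bins - 1.
        index = min(int(length / 2.0 * n_bins), n_bins - 1)
        counts[index] += 1
        hits += 1
    return counts


if __name__ == "__main__":
    for n in (100, 1000, 10000):
        print(n, sphere_intercept_histogram(n, seed=1))
```

With frequency proportional to L, the expected fraction of counts in bin k (numbered from 0) is (2k + 1)/n_bins², which gives a simple check of the straight-line shape.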
It is quite straightforward to determine a histogram of frequency versus intercept length in a circle (2-dimensional) or sphere (3-dimensional) using the Monte-Carlo approach, even though both these problems can also be solved analytically. For other shapes, the Monte-Carlo method offers many practical advantages, including that it is not difficult to program on even a small computer. However, care is required so that the intersecting lines are properly randomized in space with respect to the feature, and have uniform probability of passing through all regions and in all directions.

Figure 11.4. Intercept lengths in a sphere (a). The number of lines with the same length is proportional to the area of a circular segment on the plane (b) and the length is a function of the distance of the line from the center (c). (For color representation see the attached CD-ROM.)

Figure 11.5. Probability curves for intercept lengths in a sphere generated by a Monte-Carlo program.

Figure 11.6. A cube with illustrative intercept lines. (For color representation see the attached CD-ROM.)

In two dimensions, consider the case of determining intercept lines in a square. The square may be specified as having a unit side length and corners at (±0.5, ±0.5). Consider the following ways of specifying lines, and whether they will meet the requirements for random sampling (all of the random numbers indicated by RND below are considered to vary from 0 to 1, with uniform probability density, as produced by most computer subroutines that generate pseudo-random numbers):

1) Generate one number Y = RND - 0.5, and use it to locate a horizontal line across the square. This will uniformly sample space (if the square is subdivided into a checkerboard of smaller squares, and the number of lines passing through each is counted, the numbers will be the same within counting statistics).
However, all of the intercept lengths through the square will have length exactly 1.0 because the directions are not properly randomized.

2) Generate one number THETA = π · RND, and use it to orient a line passing through the origin (center of the square). This will uniformly sample orientations, but not positions. The counts in the sampling squares near the center will be much higher than those near the periphery. Short intercept lengths will not be generated, and this will consequently not produce a true frequency histogram.

3) Generate three random numbers: X = RND - 0.5, Y = RND - 0.5, and THETA = π · RND. The line is defined as passing through the point X,Y at orientation angle THETA. It is less obvious that this also produces biased data, but it, too, favors the sampling squares near the center at the expense of those near the periphery and produces a biased set of intercept lengths. So do other combinations such as using four random numbers for X1,Y1 and X2,Y2 (two points to define the line).

To reveal the bias in these (and other) possible methods, it is instructive to write a small program to perform the necessary steps. This is strongly recommended to the practitioner, as the 2-dimensional geometry and simple shape of the square make things much easier than some of the 3-dimensional problems that may be encountered. Set up an array to count squares (e.g. 10 × 10) through which the lines pass. Then use various methods to generate the lines, sum the counts, and also construct a histogram of intercept lengths. It is also useful to keep track of the histogram of line lengths within the circumscribed unit circle around the square, since the shape of that distribution is known and departures will reflect some fault in the randomness of the line generation routine. When this is done for the methods described above, arrays like those shown in Figures 11.7 and 11.8 indicate improper sampling.
Both of these methods, and the others mentioned, also produce quite wrong histograms of intercept length for both the square and the circle.

    112 123 134 159 147 156 139 127 123 115
    110 170 191 181 172 175 183 165 158 126
    122 163 228 213 201 211 230 230 166 118
    143 181 242 259 253 259 252 252 190 127
    147 195 216 260 285 288 283 261 209 145
    163 175 220 255 287 305 265 230 207 159
    121 185 188 255 273 286 271 244 208 160
    114 153 189 247 254 258 270 193 179 132
     97 138 176 175 200 224 209 189 141 124
    103 124 157 130 166 151 172 160 125 118

Figure 11.7. Array of counts for 10 × 10 grid using four random numbers to generate X1 = RND - 0.5, Y1 = RND - 0.5 and X2 = RND - 0.5, Y2 = RND - 0.5 points to define the line. Note the center weighting. The graph shows the counts as an isometric display.

A proper method for generating randomized lines is: Generate R = 0.5 · RND and THETA = π · RND. This defines a point within the circle of unit diameter, and a vector from the origin to that point. Pass the line for intercept measurement through the point, perpendicular to the vector. (Note that generating an intercept line from two points on the circumscribed circle using two angles THETA = π · RND does not produce random lines, although for a sphere it does.)

An equivalent way to consider the problem is to imagine that instead of the square being fixed, and the lines oriented at all angles, the square is rotated inside the circle (requiring one random number to specify the orientation angle). Then the lines can all be parallel, just as for the earlier method of obtaining intercepts through the circle. This also requires one random number (the position along the axis). With these methods, or others which are equivalent once rotation of coordinates is carried out, proper sampling of the grids is achieved.
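The checks recommended above (counting lines per grid square and building an intercept histogram) can be sketched for the perpendicular-to-a-random-vector method. This Python sketch is an illustration under assumptions, not the Appendix's Listing 3: R is scaled by the radius DG of the circle circumscribed about the unit-side square and THETA spans the full 2π, following the description given for that listing; the intercept is found by parametric clipping rather than by slope and intercept; and all function names are hypothetical.

```python
import math
import random

HALF = 0.5                    # the square has corners at (+-0.5, +-0.5)
DG = math.sqrt(2.0) / 2.0     # radius of the circumscribed circle


def clip_length(px, py, dx, dy, cx, cy, half):
    """Length of the line through (px, py) with unit direction (dx, dy)
    inside the axis-aligned square centred at (cx, cy); 0.0 if it misses."""
    t_lo, t_hi = -math.inf, math.inf
    for p, d, c in ((px, dx, cx), (py, dy, cy)):
        if abs(d) < 1e-12:                # parallel to this pair of sides
            if abs(p - c) > half:
                return 0.0
        else:
            t1 = (c - half - p) / d
            t2 = (c + half - p) / d
            t_lo = max(t_lo, min(t1, t2))
            t_hi = min(t_hi, max(t1, t2))
    return max(0.0, t_hi - t_lo)


def random_line(rng):
    """Point at the tip of a random vector, plus the perpendicular direction."""
    r = DG * rng.random()
    theta = 2.0 * math.pi * rng.random()
    return (r * math.cos(theta), r * math.sin(theta),
            -math.sin(theta), math.cos(theta))


def simulate(n_lines, n_bins=20, grid=10, seed=None):
    """Return intercept lengths, their histogram, and per-cell line counts."""
    rng = random.Random(seed)
    hist = [0] * n_bins
    cells = [[0] * grid for _ in range(grid)]
    lengths = []
    cell_half = HALF / grid
    for _ in range(n_lines):
        px, py, dx, dy = random_line(rng)
        length = clip_length(px, py, dx, dy, 0.0, 0.0, HALF)
        if length <= 0.0:
            continue                      # the line missed the square
        lengths.append(length)
        # Histogram address: length as a fraction of the circle diameter.
        hist[min(int(length / (2.0 * DG) * n_bins), n_bins - 1)] += 1
        for i in range(grid):
            for j in range(grid):
                cx = -HALF + (2 * j + 1) * cell_half
                cy = -HALF + (2 * i + 1) * cell_half
                if clip_length(px, py, dx, dy, cx, cy, cell_half) > 0.0:
                    cells[i][j] += 1
    return lengths, hist, cells


if __name__ == "__main__":
    lengths, hist, cells = simulate(5000, seed=1)
    print("mean intercept:", sum(lengths) / len(lengths))
    print("histogram:", hist)
    for row in cells:
        print(row)
```

If the sampling is unbiased, the cell counts should agree within counting statistics, and the mean intercept should approach π times the area divided by the perimeter, which is π/4 for the unit square.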
    177 197 235 249 257 220 196 197 201 181
    109 184 232 244 280 250 233 234 233  92
     59 173 216 282 318 298 280 225 181  51
     48 167 203 367 383 353 322 245 141  37
     39 142 213 312 423 422 360 222 128  35
     37 132 206 301 419 458 356 245 134  33
     37 138 239 333 341 406 361 399 145  38
     50 152 279 302 288 313 342 269 209  50
     63 161 297 266 260 240 274 242 196 101
    185 168 241 240 225 239 246 222 218 197

Figure 11.8. Array of counts for 10 × 10 grid using two random numbers to generate points along left and right edges of square. In addition to center weighting note the sparse counts along top and bottom edges. The graph shows the counts as an isometric display.

The result is a histogram of intercept lengths in the square as shown in Figure 11.9. This shows a flat shelf for short lengths, where the line passes through two adjacent sides of the square. The peak corresponds to the length of the square’s edge, and the probability then drops off rapidly to the maximum intercept (the diagonal, which is equal to the circle diameter).

The program used to generate the data for the histogram is shown in Listing 3 in the Appendix (it uses the first of the randomizing methods described above). In this program, the array CT is defined (to build a 20 point histogram in this example). DG is the radius of the circumscribed circle, or half the square’s diagonal; P2 is 2π. The random function is used to generate a point, from which the slope (M) and intercept (B) of the line through this point perpendicular to the vector from the origin are calculated (the equation of the line is y = Mx + B). The intersection of this line with the left side of the square (or, if it does not pass through that side, the intersection with the top or bottom) is determined as one end of the line, and the process is repeated for the other end.
The line length is obtained from the Pythagorean theorem. For lines that intersect the square, the ratio of this length to the circle diameter is converted to an integer, which is used as the address for building the histogram.

Figure 11.9. Histogram of intercept lengths in a square, generated by Monte-Carlo program.

Intercept Lengths in Three Dimensions

Progressing to three dimensions, things become more complicated, and there are many more ways to bias the sampling. An extension of the methods described above for the circle and square can be used for a sphere and cube, as follows (refer to Figure 11.10 for the nomenclature used for a spherical coordinate system).

Figure 11.10. Spherical coordinates as described in the text. (For color representation see the attached CD-ROM.)

1) Generate a random vector from the origin. This requires a radius R, which is just the radius of the circumscribed sphere times a random number, and two angles. Using the nomenclature of Figure 11.10, the θ angle can be generated as 2π · RND, but the Φ angle cannot be just (π/2) · RND. If this were done, the vectors that pointed nearly vertically would be much denser in space than those that were nearly horizontal, because for each vertical angle there would be the same number of vectors, and the circumference of the “latitude” circles decreases toward the pole. The proper uniform sampling of orientations requires that Φ be generated as the angle whose sine is a random number from 0 to 1 (producing angles from 0 to π/2 as desired). Through the point at the end of the vector defined by these two angles and radius, pass a plane perpendicular to the vector. Then within this plane, generate a random line by the method described above (angle and radius from the initial point to define another point, and a line through that point perpendicular to the vector from the initial point). This is less cumbersome than it sounds if matrix arithmetic is used to deal with the vectors.
2) Locate two random points on the circumscribed sphere. As in the method above, the points are defined by two angles, θ and Φ, which are generated as θ = 2π · RND, Φ = arcsin(RND). Connect these points with a line.

As before, alternate methods where the cube rotates while the line orientation stays fixed in space can be used instead (but the same method for determining the tilt angle is required). Conceptually it may be helpful to visualize the process as one of embedding the cube inside a sphere which is then rotated while the intersection lines pass through it vertically, as shown in Figure 11.11.

Figure 11.11. Linear intercepts in a cube can be determined by embedding the cube in a sphere which is then oriented randomly and vertical lines passed through it. (For color representation see the attached CD-ROM.)

Once the line has been determined, it is straightforward to find the intersections with the cube faces and obtain the intercept length. In the program shown as Listing 4 in the Appendix, this is done by setting up a matrix equation and solving it. Simpler methods can be used for the cube, but this more general approach is desirable when we next turn our attention to less easy shapes.

This program dimensions an array for the histogram and line and defines the equations for each of the six faces on the cube (equations such as 1X + 0Y + 0Z = 0.5). Then a random point on the sphere is generated. For convenience, the elevation angle E used here is the complement of the angle Φ discussed above, and the complicated arctan function is used to get the arc sine, which many languages do not provide. The point X1, Y1, Z1 (and the second point X2, Y2, Z2) define the intersections of the random line with the sphere. Coefficients in the A matrix define this line. Each face is checked for intersections with the line, by placing the face coordinates into the matrix. PC is a counter used to find two intersections of the line with faces.
The determinant of the matrix is evaluated first; if it is zero, the program skips to another face, since the current one is exactly parallel to the line. Otherwise the solution gives the coordinates (X, Y, Z) of the intersection of the line with the face plane, which are checked to see if the intersection lies within the face of the cube. For the first such intersection, the coordinates are saved and the counter PC incremented. When the second point is found, the distance between them (the intercept length) is calculated and converted to an integer for the histogram address, and one count is added. The process repeats for the selected number of intersection lines. Some output of the data as a graph is needed to make the results accessible.

Several additional features can be added to this type of program to serve as useful checks on whether gross errors are present in the various parts of the program. First, the sphere intercept length can also be used to build a histogram (this is just the distance between the points X1, Y1, Z1 and X2, Y2, Z2), which should agree with the expected frequency histogram for a sphere. Also, by summing the lengths of all the lines in both the sphere and cube, the volume fraction V_V and the surface area per unit volume S_V of the cube in the sphere can be obtained and compared to the known correct values. The mean intercept length λ in the feature can also be obtained, which should equal 4V/S for the cube. It is also relatively simple to add a counter for each face of the cube, and compare the number of intersections of lines with each face (if random orientations are correctly used, these values should be the same within counting variability).

Before looking at the results from this program, it is worth pointing out that it may be relatively easily extended to other shapes, even non-regular or concave ones. As shown in Figure 11.12, for polyhedra the smaller the number of faces the easier it is to produce short intercepts where the lines pass through adjacent faces.
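The two-points-on-the-sphere method and the consistency checks described above can be sketched in Python. This is an illustration under assumptions, not Listing 4: the circumscribed sphere has unit diameter; Φ is treated as the elevation above the equatorial plane and is drawn as arcsin(2·RND - 1) so that points cover the whole sphere rather than one hemisphere; the cube intercept is found by clipping against the three pairs of parallel faces instead of solving a matrix equation; and the function names are hypothetical.

```python
import math
import random

R_SPHERE = 0.5                           # circumscribed sphere, unit diameter
HALF_EDGE = R_SPHERE / math.sqrt(3.0)    # half the edge of the inscribed cube


def random_point_on_sphere(rng):
    """Uniform random point on the sphere surface."""
    theta = 2.0 * math.pi * rng.random()
    phi = math.asin(2.0 * rng.random() - 1.0)   # elevation, -pi/2 .. pi/2
    c = math.cos(phi)
    return (R_SPHERE * c * math.cos(theta),
            R_SPHERE * c * math.sin(theta),
            R_SPHERE * math.sin(phi))


def cube_intercept(p1, p2, half=HALF_EDGE):
    """Length of the chord p1-p2 that lies inside the axis-aligned cube."""
    t_lo, t_hi = 0.0, 1.0
    for a, b in zip(p1, p2):
        d = b - a
        if abs(d) < 1e-12:               # chord parallel to this face pair
            if abs(a) > half:
                return 0.0
        else:
            t1 = (-half - a) / d
            t2 = (half - a) / d
            t_lo = max(t_lo, min(t1, t2))
            t_hi = min(t_hi, max(t1, t2))
    if t_hi <= t_lo:
        return 0.0                       # the line misses the cube
    return (t_hi - t_lo) * math.dist(p1, p2)


def simulate(n_lines, seed=None):
    """Return the sphere chord lengths and the nonzero cube intercepts."""
    rng = random.Random(seed)
    sphere_lengths, cube_lengths = [], []
    for _ in range(n_lines):
        p1 = random_point_on_sphere(rng)
        p2 = random_point_on_sphere(rng)
        sphere_lengths.append(math.dist(p1, p2))
        length = cube_intercept(p1, p2)
        if length > 0.0:
            cube_lengths.append(length)
    return sphere_lengths, cube_lengths


if __name__ == "__main__":
    sphere_lengths, cube_lengths = simulate(40000, seed=1)
    print("mean sphere intercept:", sum(sphere_lengths) / len(sphere_lengths))
    print("mean cube intercept:  ", sum(cube_lengths) / len(cube_lengths))
    print("volume fraction V_V:  ", sum(cube_lengths) / sum(sphere_lengths))
```

The sphere mean should approach two-thirds of the diameter, the cube mean should approach 4V/S (about 0.385 of the sphere diameter for the inscribed cube), and the ratio of summed lengths estimates the volume fraction of the cube in the sphere.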
The number of faces will vary, and so some of the dimensions will change, and it may be worthwhile to directly compute the equations of each face plane from vertex points. The most significant change is in the test to see if the intersection point of the line with each face plane lies within the face. For the general polyhedron, the faces are polygons. The simplest way to proceed is to divide the face into triangles formed by two adjacent vertices and the intersection point, and sum the triangle areas. If this value exceeds the known area of the face, then the point lies outside the face (provision for finite numerical accuracy within the computer must be made in the equality test). Many curved surfaces can also be described by analytical geometry, but complex objects (e.g. those defined by splines) require more difficult techniques to locate the intersection points. For non-convex objects some lines will create more than one intercept.

Figure 11.12. Linear intercepts in other solids: a) tetrahedron; b) many-sided polyhedron; c) cylinder; d) torus; e) arbitrary non-convex shape (a banana). (For color representation see the attached CD-ROM.)

Applying this program to a series of regular polyhedra produces the results shown in Figure 11.13. The results are interesting in several respects. For instance, note that whereas for the sphere the most likely intercept length is the maximum (equal to the sphere diameter), large values are quite unlikely for the polyhedra because they require a perfect vertex-to-vertex alignment (and are impossible for the tetrahedron because even this distance is less than the sphere diameter). Also, the peaks present in the frequency histogram correspond to the distance between parallel faces (again, there are none for the tetrahedron), while the sloping but linear shelves correspond to intersections through adjacent faces. The mean intercept lengths in the various solids are obtained from the distributions.
They are listed below as fractions of the diameter of the circumscribed sphere.

    Solid          Mean Intercept
    Tetrahedron    0.232
    Cube           0.393
    Octahedron     0.399
    Icosahedron    0.538
    Sphere         0.667

Figure 11.13. Frequency histograms for intercept lengths through various regular polyhedra, compared to that for a sphere, generated by Monte-Carlo program using 50 intervals for length, and about 40,000 intercept lines.

Intersections of Planes with Objects

Not all distribution curves are for intercept lengths, of course. It is also possible to model using geometric probability (or to measure on real samples) the areas of intersection profiles cut by planar sections through any shape feature. As usual, this is easiest for a sphere. Figure 11.14a shows that any intersection of a plane with a sphere produces a circle. Because of the symmetry, rotation is not needed and a series of parallel planes (Figure 11.14b) can generate the family of circles whose radii are calculable from the position of the plane, analogous to the case for the line intercept.

For a cube, the intersection can have 3, 4, 5 or 6 sides (Figure 11.15). It is evident that the cube’s corners produce a large number of small area intersections, and that there are ways to cut areas quite a bit larger than the most probable slice (which is nearly equal to the area of a square face of the cube). Figure 11.16 shows a plot (Hull & Houk, 1958) of the area of profiles cut through spheres and cubes.

Generating similar distributions for planar intersections with other shapes is a straightforward if rather lengthy exercise in geometric probability. It also reveals that humans are not good intuitive judges of intersections of planes with solids. Figure 11.17 shows a torus, familiar to many researchers in the form of a bagel. A few of the less obvious cuts which produce sections unexpected to casual observers are shown.
Considering the difficulty in predicting the shape of section cuts from known objects, it should not be surprising that the reverse process of imagining the unknown 3D object that may have given rise to observed planar sections easily leads to errors.

For other shapes with even less symmetry, the analytical approach becomes completely impractical. Even Monte-Carlo approaches are time-consuming, because of the problems of calculating the intersection lines and points, and it may take a very large number of trials to get enough counting statistics in the low-probability regions of the frequency histogram to adequately define it (particularly when the shape of the curve is to be used to deconvolute distributions from many different feature sizes).

Figure 11.14. Intersection of a plane with a sphere in any orientation produces a circle (a). A family of parallel planes (b) produces a distribution of circle sizes. (For color representation see the attached CD-ROM.)

Bertrand’s Paradox

There is a problem which has been mentioned briefly in the previous examples, but which often becomes important when dealing with ways to orient complex shapes using random numbers. It is easy to bias (or, in mathematical terminology, to constrain) the orientations so that the random numbers do not produce a random sampling of the space inside the object. In a trivial example, if we only rotated the cube around one axis, we would not see all intercept lengths; but if we rotate it uniformly around all three space angles, the proper results are obtained.

A classic example of the problem of (im)proper sampling is stated in a famous paradox presented by Bertrand in 1889. The problem to be solved is stated as follows: What is the probability that a random line passing through a circle will have an intercept length greater than the side of the inscribed equilateral triangle in the circle?

Figure 11.15.
Planes intersecting a cube can produce intersections with: a) 3; b) 4; c) 5; d) 6 sides. (For color representation see the attached CD-ROM.)

Using the three drawings in Figure 11.18 (and working from left to right), here are three arguments that can be presented to quickly arrive at an answer.

1. Since the circle has perfect symmetry, and any point on the periphery is the same as any other, we will consider, without loss of generality, lines that enter the circle at one corner of the triangle, but at any angle. Since the total range of the angle $\theta$ is from 0 to 180 degrees, and since the triangle subtends an angle of 60 degrees (and any line lying within the shaded region clearly has an intercept length greater than the length of the side of the triangle), the probability that the intercept line length is greater than the side of the triangle is just 60/180, or one-third.

2. Since each line passing through the circle produces a line segment whose midpoint must lie within the circle, we can reduce the problem to determining how many of the line segments have midpoints that lie within the shaded circle inscribed in the triangle (these lines will have lengths longer than the side of the triangle, while any line whose midpoint is outside the inscribed circle will have a shorter length). Since the area of the small circle is just one-fourth the area of the large one, the probability that the intercept line length is greater than the side of the triangle is one-fourth.

3. Since the circle is symmetric, we lose no generality in considering only vertical lines. Any vertical line which passes through the circle but passes outside of the smaller circle inscribed in the triangle will have a length shorter than the side of the triangle, so the probability that the intercept line length is greater than the side of the triangle is the ratio of diameters of the circles, or one-half.

These three arguments are all, at least on the surface, plausible. But they produce three different answers. Perhaps we should choose one-third, since it is in the middle and hence “safer” than the two extreme answers! No, the correct answer is one-half. The other two arguments are invalid because they impose constraints on the random lines. Lines spaced at equal angle steps, all passing through one point on the circle’s periphery, do not uniformly sample it, but are closer together at small and large angles. This under-represents the number of lines that lie within the shaded triangle, and hence produces too low an answer. Likewise, counting the lines whose midpoints lie within the shaded inner circle would be correct if only one angular orientation were considered (it would then be equivalent to the third argument), but when all angles are considered it overcounts the lines that pass outside the shaded circle, and so again produces too low an answer.

It can be very tricky to find subtle forms of this kind of bias. They may arise in either the analytical integration or the random sampling methods. The method used to generate random orientations of lines in space for the Monte-Carlo routines described above must avoid this problem. The trick is to use $\theta = 2\pi \cdot \mathrm{RND}$ and $\Phi = \arcsin(\mathrm{RND})$, using the terminology of Figure 11.10. This avoids clustering many vectors around the Z axis, and gives uniform sampling.

Figure 11.16. Intercept area distributions for a sphere and cube (frequency plotted against area/max area). (For color representation see the attached CD-ROM.)

The Buffon Needle Problem

A good example of the role of angles in geometrical probability is the classical problem known as Buffon’s Needle (Buffon, 1777): If a needle of length L is dropped at random onto a grid of parallel lines with spacing S, what is the probability that it crosses a line? This is another example of a problem easy enough to be solved by either analytical means or random sampling.

Figure 11.17. Some of the less familiar planar cuts through a bagel (torus), which produce one or two convex or concave intersections. (For color representation see the attached CD-ROM.)

First, we will look at the procedure of integration. Figure 11.19 shows the situation. Two variables, y and $\theta$, are involved; $\theta$ is the angle of the needle with respect to the line direction, and y is the distance from the midpoint of the needle to the nearest line. It is only necessary to consider the angle range from 0 to $\pi/2$, because of symmetry. At any angle, the needle subtends a distance of $L \sin\theta$ perpendicular to the lines, so if y is less than half this, there will be an intersection. Because y is uniformly distributed between 0 and S/2, the probability of intersection at a given angle is $L \sin\theta / S$ (for L no greater than S). Hence, the probability of intersection is, as was stated before,

$$
P = \frac{\displaystyle\int_0^{\pi/2} \frac{L \sin\theta}{S}\, d\theta}{\displaystyle\int_0^{\pi/2} d\theta}
  = \frac{\dfrac{L}{S}\left[-\cos\left(\dfrac{\pi}{2}\right) + \cos(0)\right]}{\left(\dfrac{\pi}{2}\right)}
  = \frac{2L}{\pi S}
\tag{11.5}
$$

Figure 11.18. Illustrations for Bertrand’s Paradox.

Figure 11.19. Geometry for the Buffon needle problem.

A Monte-Carlo approach to solving this problem is shown as Listing 5 in the Appendix. L and S have been arbitrarily set to 1, so the result for Number/Count should just be $\pi/2$. One particular sequence of running this program ten times, for 1000 trials each, gave a mean and standard deviation of 1.5727 ± 0.0254, which is quite respectable.

There is a probably apocryphal story about an eastern European mathematician who claimed that he dropped a needle 5 cm long onto lines 6 cm apart, and counted 226 crossing events in 355 trials. He used this to estimate $\pi$ as 2 × 355/226 = 3.1415929 (correct through the first 6 decimal places). We can evaluate the (im)probability of the truth in this story. Dropping the needle one more time, regardless of the outcome, would have seriously degraded the “accuracy” of the result!
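The effect of one more throw is easy to check numerically. The short Python sketch below is only an illustration and is not one of the chapter's listings. It assumes the estimator exactly as the story states it (2 × trials / crossings, with 226 crossings in 355 trials), and the function name `pi_estimate` is hypothetical.

```python
import math


def pi_estimate(trials: int, crossings: int) -> float:
    """Estimator as written in the story: 2 * trials / crossings (assumption)."""
    return 2.0 * trials / crossings


# The reported experiment, followed by the two possible outcomes of a 356th throw.
reported = pi_estimate(355, 226)
one_more_hit = pi_estimate(356, 227)   # the extra needle crosses a line
one_more_miss = pi_estimate(356, 226)  # the extra needle misses

for label, value in [("reported", reported),
                     ("one more, crossing", one_more_hit),
                     ("one more, no crossing", one_more_miss)]:
    print(f"{label:22s} {value:.7f}   error = {abs(value - math.pi):.2e}")
```

With either outcome of the extra throw, the error grows from a few parts in ten million to several parts in a thousand, which suggests how carefully the stopping point would have had to be chosen.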
This tale may serve as a caution to always consider the statistical precision of results generated by a sampling method such as Monte-Carlo simulation, as well as the statistics of measured data to which it is to be compared.

Appendix

Listing 1: Random points in a unit square used to measure circle area.

    Integer: Increment = 1000;
    Integer: Number = 0;
    Integer: Count = 0;
    Repeat
    {
      X = RND;
      Y = RND;
      IF (X*X + Y*Y) < 1 THEN Count = Count + 1;   /* point lies inside the quarter circle */
      Number = Number + 1;
      /* report the running estimate every Increment points */
      IF Float(Number)/Increment = Integer(Number/Increment) THEN
        PRINT Number, 4*Float(Count)/Number;
    }
    Until Stopped;

Listing 2: Intercept lengths in a sphere.

    Integer Count[1..20] = 0;
    Integer Number, Bin, i;
    Float X, Y, rsquared, Icept;
    Input ("Number of trials= ", Number);
    for (i = 1; i <= Number; i++)
    {
      X = RND;
      Y = RND;
      rsquared = X*X + Y*Y;
      if rsquared < 1 then
      {
        Icept = sqrt(1 - rsquared);      /* half the intercept length, in units of the radius */
        Bin = Integer(20*Icept) + 1;
        Count[Bin] = Count[Bin] + 1;
      }
    }
    for (Bin = 1; Bin <= 20; Bin++)
      PRINT Bin, Count[Bin];

Listing 3: Intercept lengths in a square.

    Integer CT[20] = 0;
    Integer N, NU, L;
    Float M, B, TH, R, XL, YL, XR, YR, LEN;
    Const DG = sqrt(2)/2;    /* half the diagonal of the unit square */
    Const P2 = 8*atan(1);    /* 2 pi */
    Input ("Number of Lines: ", NU);
    for (N = 1; N <= NU; N++)
    {
      R = DG * RND;          /* two random numbers */
      TH = P2 * RND;
      M = -1/tan(TH);
      B = 0.5 + R * sin(TH) - M * (0.5 + R * cos(TH));
      /* get the coordinates of the ends of the line */
      XL = 0;
      YL = B;
      if ((YL>1) or (YL<0)) then
      {
        if (YL>1) then YL=1 else YL=0;
        XL = (YL-B)/M;
      }
      XR = 1;
      YR = M + B;
      if ((YR>1) or (YR<0)) then
      {
        if (YR>1) then YR=1 else YR=0;
        XR = (YR-B)/M;
      }
      LEN = sqrt((XR-XL)*(XR-XL)+(YR-YL)*(YR-YL));
      if (LEN>0) then
      {
        L = integer(0.5+10*LEN/DG);
        CT[L] = CT[L]+1;
      }
    }

Listing 4: Intercept lengths in a cube.
    Float F[6,4] = 1,0,0,0.5
                   1,0,0,-0.5
                   0,1,0,0.5
                   0,1,0,-0.5
                   0,0,1,0.5
                   0,0,1,-0.5;     // the six faces of the cube
    Float A[3,4];                  // used to calculate line/face intersection
    Integer LX[50];                // for histogram
    Const P2 = 8 * atan(1.0);      // 2 pi
    Const R = sqrt(3.0)/2.0;       // radius of circumscribed sphere
    Input ("Number of Lines:", NU);
    for (N = 1; N <= NU; N++)
    {
      TH = P2 * RND;
      E = -1 + 2 * RND;
      E = atan(E/sqrt(1 - E*E));
      X1 = R * cos(E) * cos(TH);
      Y1 = R * cos(E) * sin(TH);
      Z1 = R * sin(E);             // one random point on the sphere
      TH = P2 * RND;
      E = -1 + 2 * RND;
      E = atan(E/sqrt(1 - E*E));
      X2 = R * cos(E) * cos(TH);
      Y2 = R * cos(E) * sin(TH);
      Z2 = R * sin(E);             // second random point on sphere
      A[1,1] = Y1*Z2-Y2*Z1;
      A[1,2] = Z1*X2-Z2*X1;
      A[1,3] = X1*Y2-X2*Y1;
      A[1,4] = 0;
      A[2,1] = Y1*Z2-Y2*Z1;
      A[2,2] = Z1*X2-Z2*X1+Z2-Z1;
      A[2,3] = X1*Y2-X2*Y1+Y1-Y2;
      A[2,4] = Y1*Z2-Y2*Z1;        // define the line in A matrix
      PC = 0;
      for (J = 1; J <= 6; J++)     // check faces for intersection
      {
        for (K = 1; K <= 4; K++) A[3,K] = F[J,K];
        DE = A[1,1]*(A[2,2]*A[3,3]-A[2,3]*A[3,2])
            +A[1,2]*(A[2,3]*A[3,1]-A[2,1]*A[3,3])
            +A[1,3]*(A[2,1]*A[3,2]-A[2,2]*A[3,1]);
        if (DE<>0) then            // zero means parallel to the face
        {
          X = (A[1,4]*(A[2,2]*A[3,3]-A[2,3]*A[3,2])
              +A[1,2]*(A[2,3]*A[3,4]-A[2,4]*A[3,3])
              +A[1,3]*(A[2,4]*A[3,2]-A[2,2]*A[3,4]))/DE;
          Y = (A[1,1]*(A[2,4]*A[3,3]-A[2,3]*A[3,4])
              +A[1,4]*(A[2,3]*A[3,1]-A[2,1]*A[3,3])
              +A[1,3]*(A[2,1]*A[3,4]-A[2,4]*A[3,1]))/DE;
          Z = (A[1,1]*(A[2,2]*A[3,4]-A[2,4]*A[3,2])
              +A[1,2]*(A[2,4]*A[3,1]-A[2,1]*A[3,4])
              +A[1,4]*(A[2,1]*A[3,2]-A[2,2]*A[3,1]))/DE;
          // intersection point of line with face
          if ((abs(X)<=0.5) and (abs(Y)<=0.5) and (abs(Z)<=0.5)) then
          {                        // intersection is inside cube
            if (PC=0) then
            {
              PC = 1;              // first intersection: remember it
              X0 = X; Y0 = Y; Z0 = Z;
            }
            else
            {
              LE = sqrt((X-X0)*(X-X0)+(Y-Y0)*(Y-Y0)+(Z-Z0)*(Z-Z0));
              // distance between points = length
              H = integer(25*LE/R);
              LX[H] = LX[H]+1;     // increment histogram
            }
          }
        } // if DE
      } // for J
    } // for N

Listing 5: The Buffon needle problem.
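    Integer Number, j;
    Integer Count = 0;
    Float Y, Theta, Vert;
    Const HalfPi = 2 * Atan(1);    // pi/2
    Input ("How many trials: ", Number);
    for (j = 1; j <= Number; j++)
    {
      Y = RND;                     // position of the needle midpoint between two lines
      Theta = HalfPi * RND;        // angle of the needle to the line direction
      Vert = Sin(Theta) / 2;       // half the extent of the needle perpendicular to the lines
      if ((Y - Vert < 0) OR (Y + Vert > 1)) then Count++;
    }
    PRINT Float(Number)/Count;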
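
For readers who want to run the experiment, Listing 5 can be rendered in Python roughly as follows. This is a sketch under stated assumptions, not the program that produced the figures quoted in the chapter. The function name `buffon_ratio`, the seed value, and the choice of ten runs of 1000 trials (mirroring the run described in the text) are all illustrative. As in Listing 5, L and S are both set to 1.

```python
import math
import random
import statistics


def buffon_ratio(trials: int, rng: random.Random) -> float:
    """Return trials / crossings for a needle of length 1 on lines spaced 1 apart."""
    crossings = 0
    for _ in range(trials):
        y = rng.random()                       # needle midpoint between two lines
        theta = (math.pi / 2) * rng.random()   # angle to the line direction
        vert = math.sin(theta) / 2             # half the extent perpendicular to the lines
        if y - vert < 0 or y + vert > 1:
            crossings += 1
    return trials / crossings


rng = random.Random(12345)                     # arbitrary seed, for repeatability only
ratios = [buffon_ratio(1000, rng) for _ in range(10)]
mean_ratio = statistics.mean(ratios)
sd_ratio = statistics.stdev(ratios)
print(f"mean = {mean_ratio:.4f}, standard deviation = {sd_ratio:.4f}, "
      f"pi/2 = {math.pi / 2:.4f}")
```

The mean should fall near π/2, and the spread between runs gives a direct feel for the statistical precision that 1000 trials can deliver.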