Nick Wong
Professor Ramsey
WRIT 340
10 November 2013

Blurb: Learn how the graphics in movies find their way into medicine.

Bio: Nick is a junior studying Computer Science (Games). He hopes to become a successful game developer in the future.

Suggested Media: Interactive Computer Graphics, A Top-Down Approach with Shader-Based OpenGL, by Edward Angel; the article “Principles and Applications of Computer Graphics in Medicine” (cited as [4])

Abstract: The beautifully crafted effects of movies are the product not only of artists but also of mathematicians. Behind the scenes, what drives realistic simulations like those in movies is a lot of math, most notably linear algebra. Simply to display a 3-dimensional object on screen, its coordinates have to be transformed through several different coordinate systems. The end result is an object displayed on screen from the proper viewing angle. Further calculations on this result produce effects such as camera movement and animation. Medicine uses these same fundamental techniques to display 3-D objects. In addition, certain medical procedures, such as CT scans, can capture 3-D images of the specimens they scan. They do this by using voxels, which are a 3-D analog to the pixels in an image. These voxels are used to create a triangular mesh with the Marching Cubes algorithm, and the mesh is displayed using the same calculations as above. As technology gets more powerful, scanners will be able to take higher-resolution images of a specimen, allowing researchers to better understand organ structures and to diagnose diseases like cancer more efficiently.

Computer Graphics: From the Movie Theater to the Medical Center

Introduction

Whoosh! A superhero soars across the big screen. Crash! Glass flies as a skyscraper slowly topples over, its foundations groaning. Aaaah! A horde of zombies is coming!
Of course, none of these actually happened: they are just very realistic simulations. Although the history of graphics is long, the technology that powers these movie effects evolved over only a few decades. As each year passes, the power of computer graphics increases and uses outside of the entertainment industry become more prevalent. One field that uses computer graphics extensively is medicine, where they serve as a tool for visualization. The same techniques that create the beautifully rendered scenes shown in movies are used to help diagnose diseases and analyze animal tissues.

How it Works - A Very Basic Overview

The field of computer graphics is very extensive, comprising many different concepts, such as rendering, lighting, and animation. However, virtually all computer graphics boils down to a lot of math - especially linear algebra - and some physics. Rendering a plain object could be considered the most fundamental concept in computer graphics. Without these primitive objects, a graphic artist has no medium upon which to apply the amazing effects that make certain modern movies so realistic.

In most applications of computer graphics in entertainment, the graphic artist first creates a 3-D model. A 3-D model is simply a set of vertices in 3-dimensional space, which are then connected behind the scenes to form 2-dimensional polygons. The resulting polygons are then filled to create a solid surface instead of just a web of lines (fig. 1).

Figure 1. The mountains of Ohiopyle State Park: without filling on the left, with filling on the right. A close look shows that the picture on the left is made of little triangles. Source: original.

Polygons are usually triangles because having only 3 vertices guarantees that a polygon will be flat. Polygons that are not flat confuse the computer, as something that should have been a 2-dimensional entity is now 3-dimensional.
Graphics hardware is not designed to fill 3-dimensional polygons, so whatever happens next is undefined. The graphics hardware may draw the scene as it was envisioned, or it could create very visible faults in the scene. To avoid such ambiguity, graphic artists try their best to ensure that all polygons that make up a scene are triangles.

3-D models are some of the most fundamental entities used in computer graphics. For example, in video games, each element of the scene, ranging from buildings and plants to the character the player is controlling, is a 3-D model. In medicine, 3-D models may represent the internal structures of a human. All of them are made of many of the triangles discussed above. However, a lot of work goes into just displaying a model on screen.

Transforming a Model through 3-D Space

So now we have an awesome-looking 3-D model, say a tree, sitting on a hard drive, and we want it to be part of a forest scene. Getting a 3-D model to display on screen introduces the problem of coordinate systems. Every 3-D model exists in a coordinate system called object space [2]. This is a simple 3-D Cartesian coordinate system with 3 axes: an x-axis, a y-axis, and a z-axis (fig. 2). This article assumes a right-handed coordinate system, in which the z-axis points out of the screen. A left-handed coordinate system, by contrast, has the z-axis pointing into the screen.

Figure 2. The right-handed 3-D Cartesian coordinate system. The z-axis points toward the viewer. The x- and y-axes lie in the plane of the screen. The point at which all these axes intersect is the point (0, 0, 0), also known as the origin. Source: original.

The object is probably centered at the origin in object space because that makes it easier to work with individual vertices. However, when shown in the scene, the object most likely will not be at the origin: it will have its own specific location in the scene.
The new coordinate system that the object now needs to be a part of is called world space, which is also a simple 3-D Cartesian coordinate system [1]. This is where the linear algebra comes in. Transforming the object from object space into world space is a simple linear operation: the computer multiplies each vertex (which is just a set of 3 numbers that defines the location of a point in space) by a matrix called the world-transform matrix. To calculate the world-transform matrix, the graphic artist must supply 3 pieces of information: how far the object is from the origin (0, 0, 0), how much the object is rotated, and how much the object is scaled [2]. Given these data, the computer can calculate the world-transform matrix and place the object in the scene as the artist wanted.

Moving away from the forest to the hospital, consider a scenario where a doctor has a 3-D representation of a patient's innards. The model is positioned in the right place and is ready to be observed, except the doctor has no way to observe it. What is the point of such an object if a viewer cannot look around or explore it? How does one put a viewer into the representation, move it around, and study its intricacies when the representation is immobile? It turns out there is a special object, the camera, that has a separate coordinate system centered on it. The x-axis points toward the right, the y-axis points up, and the z-axis points in the direction opposite to the one the camera is viewing. This coordinate system is called camera space [1]. Calculating the matrix required to transform world coordinates into camera coordinates is a bit more involved. The first step is to treat the camera as the origin of its own 3-D Cartesian coordinate system and tell the computer which direction each axis mentioned above is pointing.
Given these pieces of information, the computer can then calculate a matrix similar to the world-transform matrix called the view-to-world matrix. It so happens that the matrix that converts world coordinates to camera coordinates, the view matrix, is the inverse of the view-to-world matrix [1]. The inverse of a matrix is the matrix that, when multiplied with the original, results in the identity matrix. An example of this operation is shown in fig. 3. Now that we have an additional coordinate system, the whole scene does not have to change for the audience to explore it. The audience can explore the scene by simply moving the camera around and recalculating the two camera matrices mentioned above.

Figure 3. The multiplication of an arbitrary non-singular matrix (left) with its inverse (middle). The result of the operation is the identity matrix (right). If the left matrix is the view-to-world matrix, then the middle matrix is the view matrix. The method to calculate matrix inverses and products can be found in a linear algebra textbook. Source: original.

This is a lot to take in, so to recap, 3 different coordinate systems have been used so far: object space, world space, and camera space. The 3-D model is created in object space. The scene exists in world space, so to place the 3-D model into the scene, each vertex in the model is multiplied by a world-transform matrix. Of course, the scene itself does not really change much, so to actually move around in the scene, each vertex present in the scene is multiplied by a view matrix. This operation places the vertices in camera space. The camera is like an eye: given a scene, whatever is in camera space is basically what the eye can see. Now that these coordinate systems are clearer, we can move on to the final step in displaying a 3-D object.
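
For readers who like to see the numbers, the object-to-world-to-camera chain can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how any particular engine does it: every function name here is made up for the example, the matrices are 4x4 (each vertex gets a fourth coordinate equal to 1 so that moving an object can also be written as a matrix multiplication), rotation is limited to the y-axis, and the camera matrix is inverted with a shortcut that only works when the camera is rotated and moved but never scaled.

```python
import math

def mat_mul(a, b):
    """Multiply two 4x4 matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def transform(m, vertex):
    """Apply a 4x4 matrix to a vertex (x, y, z), treated as (x, y, z, 1)."""
    col = (vertex[0], vertex[1], vertex[2], 1.0)
    out = [sum(m[i][k] * col[k] for k in range(4)) for i in range(4)]
    return tuple(out[:3])

def translation(tx, ty, tz):
    return [[1, 0, 0, tx], [0, 1, 0, ty], [0, 0, 1, tz], [0, 0, 0, 1]]

def rotation_y(angle):
    """Rotation about the y-axis by an angle in radians."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, 0, s, 0], [0, 1, 0, 0], [-s, 0, c, 0], [0, 0, 0, 1]]

def scaling(s):
    return [[s, 0, 0, 0], [0, s, 0, 0], [0, 0, s, 0], [0, 0, 0, 1]]

def world_transform(position, angle_y, scale):
    """Scale first, then rotate, then move the object into place."""
    return mat_mul(translation(*position),
                   mat_mul(rotation_y(angle_y), scaling(scale)))

def view_to_world(right, up, back, eye):
    """Columns hold the camera's axes and position in world space."""
    return [[right[0], up[0], back[0], eye[0]],
            [right[1], up[1], back[1], eye[1]],
            [right[2], up[2], back[2], eye[2]],
            [0, 0, 0, 1]]

def rigid_inverse(m):
    """Invert a matrix built only from a rotation and a translation.

    Assumption: no scaling. Then the inverse rotation is the transpose
    and the inverse translation is -(transpose * translation).
    """
    rt = [[m[j][i] for j in range(3)] for i in range(3)]
    t = [m[i][3] for i in range(3)]
    new_t = [-sum(rt[i][k] * t[k] for k in range(3)) for i in range(3)]
    return [rt[0] + [new_t[0]], rt[1] + [new_t[1]], rt[2] + [new_t[2]],
            [0, 0, 0, 1]]

# A tree vertex in object space, placed in the forest scene.
object_vertex = (1.0, 1.0, 0.0)
world = world_transform(position=(5.0, 0.0, -10.0),
                        angle_y=math.pi / 2, scale=2.0)
world_vertex = transform(world, object_vertex)

# A camera at (5, 0, 0) whose z-axis points away from what it views.
camera_to_world = view_to_world(right=(1, 0, 0), up=(0, 1, 0),
                                back=(0, 0, 1), eye=(5.0, 0.0, 0.0))
view = rigid_inverse(camera_to_world)
camera_vertex = transform(view, world_vertex)

print(world_vertex)   # approximately (5, 2, -12)
print(camera_vertex)  # approximately (0, 2, -12)
```

The negative z value in camera space means the vertex sits 12 units in front of the camera, which matches the convention above that the camera's z-axis points away from the viewing direction. Moving the camera only changes `camera_to_world` and `view`; the world-space vertex stays where it is.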
Transformation to a Screen

The final problem that a researcher encounters when trying to view a 3-D scene is that although the scene is properly positioned and viewable in 3-D space, the screen is a 2-D surface. To properly display a 3-dimensional scene on a 2-dimensional surface, one final transformation is required: the projection. The projection step encloses the view of the scene in a volume called the viewing volume. There are 2 different kinds of projections: orthographic and perspective. In an orthographic projection, the viewing volume is a rectangular prism. In a perspective projection, the viewing volume is a frustum: a pyramid with its top chopped off to form something like a plateau (fig. 4).

Figure 4. A frustum, the viewing volume for a perspective projection. The camera is situated at the small end of the frustum. Source: original.

The choice of projection is important because it causes the scene to be rendered quite differently. Fig. 5 gives a basic example of the difference.

Figure 5. Comparison between an orthographic projection (left) and a perspective projection (right). The object being drawn in both cases is a transparent cube. The face in the back is farther from the camera, but the orthographic projection ignores this fact. The perspective projection draws it as one would see it in real life, with the rear face smaller. Source: original.

The reason this occurs has to do with ratios. The cross section of each viewing volume is a rectangle. The ratio of the size of the object to the size of the cross section of the viewing volume determines how big the object appears on screen. For an orthographic projection, every cross section is the same size because the viewing volume is a rectangular prism. Since each face of a cube is the same size, the ratio of the size of the front face to the size of its viewing volume cross section is equal to the ratio of the size of the back face to the size of its viewing volume cross section (fig. 6).
This results in both faces being drawn at the same size, which is why an orthographic projection of the cube looks like a plain square.

Figure 6. A cube (red) in an orthographic viewing volume. The amount of space the front face takes up in its viewing volume cross section is shown on the left. The amount of space the back face takes up in its viewing volume cross section is shown on the right. They are equal. Note that the image may not be to scale. Source: original.

In a perspective projection, the cross section of the viewing volume grows as the distance from the camera increases. This means that the ratio of the size of an object to the size of the cross section decreases as the distance from the camera increases. This is why the situation in fig. 5 occurs. The front face and the back face of the cube are the same size, but the cross section of the frustum at the back face is larger than the one at the front face. This causes the back face to be drawn smaller (fig. 7).

Figure 7. A cube (red) in a perspective viewing volume. The amount of space the front face takes up in its viewing volume cross section is shown on the left. The amount of space the back face takes up in its viewing volume cross section is shown on the right. They are not equal. The back face takes up far less space. Note that the image may not be to scale. Source: original.

Projection results in coordinates in Normalized Device Coordinate (NDC) space. These coordinates are in the range of -1 to 1. They are then transformed into actual screen coordinates given the size of the screen [2].

There we have it! Those are all the fundamental calculations required to display a 3-dimensional object on a screen. They, unfortunately, are not trivial. Going forward, an artist can continue to enhance a scene by adding lighting calculations to illuminate the objects, perhaps as a sun would, or maybe as a lamppost would.
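
The projection and screen-mapping steps from this section can also be sketched in Python. Treat this as a rough illustration rather than a real graphics pipeline: the function names, the 90-degree field of view, the 800 by 600 screen, and the choice to put pixel (0, 0) at the top-left corner are all assumptions made for the example, and the depth value that real hardware also computes is left out.

```python
import math

def orthographic_ndc(point, half_width, half_height):
    """Orthographic projection: every cross section is the same size,
    so the distance from the camera (z) is ignored."""
    x, y, _ = point
    return (x / half_width, y / half_height)

def perspective_ndc(point, fov_y_deg, aspect):
    """Perspective projection for a camera looking down its -z axis.

    The frustum cross section grows with distance, so x and y are
    divided by the distance in front of the camera.
    """
    x, y, z = point
    f = 1.0 / math.tan(math.radians(fov_y_deg) / 2.0)
    distance = -z
    return (f / aspect * x / distance, f * y / distance)

def ndc_to_screen(ndc, width, height):
    """Map NDC values in [-1, 1] to pixels.

    Assumption: pixel (0, 0) is the top-left corner, so y is flipped.
    """
    nx, ny = ndc
    return ((nx + 1.0) * 0.5 * width, (1.0 - ny) * 0.5 * height)

# One corner of the cube's front face and the matching back-face corner,
# both already in camera space.
front_corner = (1.0, 1.0, -4.0)
back_corner = (1.0, 1.0, -6.0)

ortho_front = orthographic_ndc(front_corner, 4.0, 4.0)
ortho_back = orthographic_ndc(back_corner, 4.0, 4.0)
persp_front = perspective_ndc(front_corner, 90.0, 1.0)
persp_back = perspective_ndc(back_corner, 90.0, 1.0)

print(ortho_front, ortho_back)   # identical: the faces overlap exactly
print(persp_front, persp_back)   # the back corner lies closer to the center
print(ndc_to_screen(ortho_front, 800, 600))
```

With the orthographic projection the two corners land on the same spot, which is the plain square of fig. 5. With the perspective projection the back corner moves toward the center of the screen, so the back face is drawn smaller.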
The artist might also want to make the trees sway in the wind, which requires additional world-space calculations. Just like a painting, the possibilities in designing a scene are endless and exciting.

Use in Medical Research

A pixel is like a piece of a picture. The word is short for picture element, and a pixel is 2-dimensional. The 3-dimensional analog to this concept is called a voxel, or volume element. Game developers have considered voxels for a variety of functions, such as more flexible creation of terrain. The traditional way of creating terrain, the height map, just uses a picture in which dark areas are lower and light areas are higher in elevation. Fig. 1 was rendered with a height map. Because voxels are like little cubes, games will often use a method called Marching Cubes to smooth the resulting surface [3].

This exact same technology has seen use in medical visualization. Computed tomography (CT) scans, for instance, are a way to visualize a sample as a volume, rather than just a picture. A 3-D image of a sample gives far more information than a flat image. Structures that are physically difficult to access through surgery, such as the cochlea deep in the human ear, can easily be visualized with the 3-D image produced by a tomographic scan [4]. In medicine, doctors do not necessarily have to cut open a patient to analyze a possible tumor; using a scan to create the 3-D image is enough. Previously, however, this technology had limited usefulness for analyzing organisms other than humans or small structures, because the produced images lacked detail: the resolution was about 1 cubic millimeter. In the past decade, a new method called micro-CT has become available. The images produced by a micro-CT system are far more detailed, with voxels that can be microns in size (where 1 micron is equal to 1/1000 of a millimeter) [5].
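
A quick back-of-the-envelope calculation shows how much more data this means. The sketch below is purely illustrative: the 10 mm sample and the 10 micron voxel size are numbers picked for the example, not figures taken from the cited sources, and the function name is hypothetical.

```python
def voxel_count(sample_edge_mm, voxel_edge_mm):
    """Number of cubic voxels needed to fill a cubic sample."""
    per_edge = round(sample_edge_mm / voxel_edge_mm)
    return per_edge ** 3

# Assumed sample: a cube 10 mm on each side.
coarse = voxel_count(10.0, 1.0)    # 1 mm voxels
fine = voxel_count(10.0, 0.010)    # 10 micron voxels (0.010 mm)

print(coarse)          # 1000
print(fine)            # 1000000000
print(fine // coarse)  # 1000000
```

Shrinking the voxel edge by a factor of 100 multiplies the number of voxels by 100 cubed, or one million, which is why finer scans demand so much more from the graphics hardware that displays them.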
These high-resolution volumes can now be used to analyze small structures, like a human’s blood vessels. As this technology becomes more powerful, the visualizations it produces will only become more detailed. Eventually, such detailed visuals could revolutionize and streamline the methods doctors use to diagnose serious diseases, such as cancer, in humans. Research into human organs can also be greatly enhanced, as detailed models remove some of the need to have an actual sample to study.

This is a very exciting time for graphics. Not only is the entertainment industry embracing powerful hardware and techniques to create an almost hyperrealistic experience, but the medical and research community is also benefiting from the increase in detail that results from this growth. Analyses that were not possible before because of low resolutions are now used to visualize the smallest of structures, giving humans a deeper understanding of how those structures work. There may come a time when humans have disease under control, but this will never be achieved without the use of computer graphics.

References

[1] J. Gregory, Game Engine Architecture. Boca Raton, FL: Taylor and Francis Group, 2009.
[2] E. Angel, Interactive Computer Graphics, A Top-Down Approach with Shader-Based OpenGL. Boston, MA: Addison-Wesley, 2012.
[3] E. S. Lengyel, “Voxel-Based Terrain for Real-Time Virtual Simulations,” Ph.D. dissertation, Dept. Comp. Sci., Univ. California, Davis, 2010.
[4] F. P. Vidal, F. Bello, K. W. Brodlie, N. W. John, D. Gould, R. Phillips, and N. J. Avis, “Principles and Applications of Computer Graphics in Medicine,” Computer Graphics Forum, vol. 25, pp. 113-137, 2006.
[5] D. W. Holdsworth and M. M. Thornton, “Micro-CT in small animal and specimen imaging,” Trends in Biotechnology, vol. 20, pp. S34-S39, 1 August 2002.