Nick Wong

Professor Ramsey
WRIT 340
10 November 2013
Blurb: Learn how the graphics in movies find their way to uses in medicine.
Bio: Nick is a junior studying Computer Science (Games). He hopes to become a
successful game developer in the future.
Suggested Media: Interactive Computer Graphics, A Top-Down Approach with Shader-Based
OpenGL, by Edward Angel; the article “Principles and Applications of Computer Graphics in
Medicine” (cited as [4])
Abstract:
The beautifully crafted effects of movies are the product not only of artists but also of
mathematicians. Behind the scenes, realistic simulations like those in movies are driven
by a lot of math, most notably linear algebra. Just to display a 3-dimensional object on
screen, its coordinates have to be transformed through several different coordinate
systems. The end result is an object displayed on screen from the proper viewing angle.
Further calculations can build on this result to produce effects such as camera movement
and animation. Medicine uses these same fundamental techniques to display 3-D objects. In
addition, certain medical procedures, such as CT scans, can take 3-D images of the
specimens they scan. They do this by using voxels, which are a 3-D analog to pixels in an image. These
voxels are used to create a triangular mesh with the Marching Cubes algorithm and
displayed using the same calculations as above. As technology gets more powerful,
scanners will be able to take higher resolution images of a specimen, allowing researchers
to better understand organ structures and to diagnose diseases like cancer more
efficiently.
Computer Graphics: From the Movie Theater to the Medical Center
Introduction
Whoosh! A superhero soars across the big screen. Crash! Glass flies as a skyscraper
slowly topples over, its foundations groaning. Aaaah! A horde of zombies is coming! Of
course, none of these actually happened: they are just very realistic simulations. Although
the history of graphics is long, the evolution of the technology that powers these movie
effects occurred over only a few decades. As each year passes, the power of computer
graphics increases and uses outside of the entertainment industry become more prevalent.
One field that uses computer graphics extensively is medicine, where they serve as
visualization tools. The same techniques that create the beautifully rendered scenes shown
in movies are used to help diagnose diseases and analyze animal tissues.
How it Works - A Very Basic Overview
The field of computer graphics is very extensive, comprising many different
concepts, such as rendering, lighting, and animation. However, virtually all computer
graphics boils down to a lot of math - especially linear algebra - and some physics.
The process of rendering a plain object could be considered the most fundamental
concept in computer graphics. Without these primitive objects, a graphic artist has no
medium upon which to apply the amazing effects that make certain modern movies so
realistic. In most applications of computer graphics in entertainment, the graphic artist
first creates a 3-D model. A 3-D model is simply a set of vertices in 3-dimensional space,
which are then connected behind-the-scenes to form 2-dimensional polygons. The
resulting polygons are then filled to create a solid enclosure instead of just a web of lines
(fig. 1).
Figure 1. The mountains of Ohiopyle State Park: without filling on the left, with
filling on the right. Looking closely, the picture on the left is made of little triangles.
Source: original.
Polygons are usually triangles because having only 3 vertices guarantees that a
polygon will be flat. Polygons that are not flat confuse the computer, as something that
should have been a 2-dimensional entity is now 3-dimensional. Graphics hardware is not
designed to fill 3-dimensional polygons, so whatever happens next is
undefined. The graphics hardware may draw the scene as it was envisioned, or it could
create very visible faults in the scene. To avoid such ambiguity, graphic artists try their
best to ensure all polygons that make up a scene are triangles.
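The flatness argument can be checked numerically. The sketch below is only an illustration: the helper names (`is_flat` and friends) are hypothetical, and it assumes the first three vertices of a polygon are not collinear, so that they define a plane against which the remaining vertices are tested.

```python
def subtract(a, b):
    """Component-wise difference of two 3-D points."""
    return tuple(x - y for x, y in zip(a, b))


def cross(a, b):
    """Cross product of two 3-D vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dot(a, b):
    """Dot product of two 3-D vectors."""
    return sum(x * y for x, y in zip(a, b))


def is_flat(vertices, tolerance=1e-9):
    """Return True if every vertex lies in the plane of the first three.

    Assumes the first three vertices are not collinear.
    """
    if len(vertices) <= 3:
        return True  # three points always share a plane
    origin = vertices[0]
    # The normal of the plane through the first three vertices.
    normal = cross(subtract(vertices[1], origin), subtract(vertices[2], origin))
    # A vertex is in the plane when its offset is perpendicular to the normal.
    return all(abs(dot(normal, subtract(v, origin))) <= tolerance
               for v in vertices[3:])


triangle = [(0, 0, 0), (1, 0, 0), (0, 1, 0)]
flat_quad = [(0, 0, 0), (1, 0, 0), (1, 1, 0), (0, 1, 0)]
bent_quad = [(0, 0, 0), (1, 0, 0), (1, 1, 0.5), (0, 1, 0)]

print(is_flat(triangle), is_flat(flat_quad), is_flat(bent_quad))
```

A triangle passes the test no matter where its vertices are, while a four-sided polygon passes only if its fourth vertex happens to sit in the same plane as the other three.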
3-D models are some of the most fundamental entities used in computer graphics.
For example, in video games, each element of the scene, ranging from buildings and
plants to even the character the player is controlling, is a 3-D model. In medicine, a
model may represent the internal structures of a human body. All of these models are made
of many of the triangles discussed above. However, a lot of work goes into just displaying
a model on screen.
Transforming a Model through 3-D Space
So now we have an awesome-looking 3-D model, say a tree, sitting on a hard drive, and
we want to make it part of a forest scene. Displaying a 3-D model on screen introduces
the problem of coordinate systems. Every 3-D model exists in a
coordinate system called object space [2]. This coordinate system has 3 axes: an x-axis, a
y-axis, and a z-axis, a simple 3-D Cartesian coordinate system (fig. 2). This article
assumes a right-handed coordinate system, in which the z-axis points out of the screen. A
left-handed coordinate system, by contrast, has the z-axis pointing into the screen.
Figure 2. The right-handed 3-D Cartesian coordinate system.
The Z-axis points toward the viewer. The X and Y axes are
on the plane of the screen. The point at which all these axes intersect is the point (0, 0, 0),
also known as the origin. Source: original.
The object is probably centered at the origin in object space because it makes it
easier to work with individual vertices. However, when shown in the scene, the object
most likely will not be at the origin: it will have its own specific location in the scene.
The new coordinate system that the object needs to now be a part of is called world
space, which is also a simple 3-D Cartesian coordinate system [1]. This is where the
linear algebra comes in. Transforming the object from object space into world space is a
simple linear operation: the computer multiplies each vertex (which is just a set of 3
numbers that defines the location of a point in space) with a matrix called the
world-transform matrix. To calculate the world-transform matrix, the graphic artist must
know 3 pieces of information: how far the object is from the origin (0, 0, 0), how much the
object is rotated, and how much the object is scaled [2]. Knowing these data, the
computer can calculate the world-transform matrix and place the object in the scene as
the artist wanted it.
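As a rough illustration of the world transform, the sketch below builds the matrix from the three pieces of information and applies it to one vertex. It rests on several assumptions that are not in the text above: vertices are padded with a fourth coordinate of 1 (homogeneous coordinates) so that a single 4x4 matrix can also carry the translation, rotation is limited to a turn about the y-axis, scaling is uniform, and all function names are hypothetical.

```python
import math


def mat_mul(a, b):
    """Multiply two 4x4 matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def transform_point(m, point):
    """Apply a 4x4 matrix to a 3-D point, using 1 as the fourth coordinate."""
    v = (point[0], point[1], point[2], 1.0)
    return tuple(sum(m[i][k] * v[k] for k in range(4)) for i in range(3))


def translation(tx, ty, tz):
    return [[1.0, 0.0, 0.0, tx],
            [0.0, 1.0, 0.0, ty],
            [0.0, 0.0, 1.0, tz],
            [0.0, 0.0, 0.0, 1.0]]


def rotation_y(angle_degrees):
    """Rotation about the y-axis in a right-handed coordinate system."""
    c = math.cos(math.radians(angle_degrees))
    s = math.sin(math.radians(angle_degrees))
    return [[c, 0.0, s, 0.0],
            [0.0, 1.0, 0.0, 0.0],
            [-s, 0.0, c, 0.0],
            [0.0, 0.0, 0.0, 1.0]]


def scaling(factor):
    return [[factor, 0.0, 0.0, 0.0],
            [0.0, factor, 0.0, 0.0],
            [0.0, 0.0, factor, 0.0],
            [0.0, 0.0, 0.0, 1.0]]


def world_transform(position, yaw_degrees, scale):
    """Scale first, then rotate, then move the object to its position."""
    return mat_mul(translation(*position),
                   mat_mul(rotation_y(yaw_degrees), scaling(scale)))


# A tree placed at (10, 0, -5), turned 90 degrees and doubled in size.
tree_matrix = world_transform((10.0, 0.0, -5.0), 90.0, 2.0)
branch_tip_object_space = (1.0, 0.0, 0.0)
branch_tip_world_space = transform_point(tree_matrix, branch_tip_object_space)
print(branch_tip_world_space)  # approximately (10, 0, -7)
```

The order of the three matrices matters: scaling and rotating while the object is still centered at the origin, and translating last, keeps the object from swinging around the world origin.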
Moving away from the forest to the hospital, consider a scenario where a doctor has a
3-D representation of a patient's innards. The model is positioned in the right place and is
ready to be observed, except the doctor has no ability to observe it. What is the point of
such an object if some kind of viewer cannot look around or explore it? How does one
put a viewer into the representation, move it around and study its intricacies when the
representation is immobile? It turns out there is a special object, the camera, that has a
separate coordinate system centered on it. The x-axis points toward the right, the y-axis
points up, and the z-axis points opposite to the direction the camera is looking. This
coordinate system is called camera space [1]. The calculation of the matrix required to
transform world coordinates into camera coordinates is a bit more involved. The first step
is to treat the camera like the origin of its own 3-D Cartesian coordinate system and tell
the computer which direction each axis mentioned above is pointing. Given these pieces
of information, the computer can then calculate a matrix similar to the world-transform
matrix called the view-to-world matrix. It so happens that the matrix to convert world
coordinates to camera coordinates, the view matrix, is the inverse of the view-to-world
matrix [1]. The inverse of a matrix is the matrix that, when multiplied with the original
matrix, yields the identity matrix. An example of this operation is shown below in fig. 3. Now that we
have an additional coordinate system, the whole scene does not have to change for the
audience to explore it. The audience can explore the scene by simply moving the camera
around and recalculating the two camera matrices mentioned above.
Figure 3. The multiplication of an arbitrary non-singular matrix (left) with its inverse
(middle). The result of the operation is the identity matrix (right). If the left matrix is the
view-to-world matrix, then the middle matrix is the view matrix. The method to calculate
matrix inverses and products can be found in a linear algebra textbook. Source: original.
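The relationship between the two camera matrices can be sketched in a few lines. The example below makes assumptions beyond the text: it uses 4x4 matrices with a fourth vertex coordinate of 1, it requires the three camera axes to be perpendicular unit vectors (which allows the inverse to be written down directly instead of computed by a general inversion routine), and the function names and camera placement are made up for illustration.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def mat_mul(a, b):
    """Multiply two 4x4 matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def transform_point(m, point):
    """Apply a 4x4 matrix to a 3-D point, using 1 as the fourth coordinate."""
    v = (point[0], point[1], point[2], 1.0)
    return tuple(sum(m[i][k] * v[k] for k in range(4)) for i in range(3))


def view_to_world_matrix(right, up, back, eye):
    """Camera axes become the columns; the camera position is the last column."""
    return [[right[0], up[0], back[0], eye[0]],
            [right[1], up[1], back[1], eye[1]],
            [right[2], up[2], back[2], eye[2]],
            [0.0, 0.0, 0.0, 1.0]]


def view_matrix(right, up, back, eye):
    """Inverse of the view-to-world matrix, valid for perpendicular unit axes."""
    return [[right[0], right[1], right[2], -dot(right, eye)],
            [up[0], up[1], up[2], -dot(up, eye)],
            [back[0], back[1], back[2], -dot(back, eye)],
            [0.0, 0.0, 0.0, 1.0]]


# A camera at (5, 2, 0) looking toward the world origin along the negative x-axis.
right, up, back = (0.0, 0.0, -1.0), (0.0, 1.0, 0.0), (1.0, 0.0, 0.0)
eye = (5.0, 2.0, 0.0)

camera_to_world = view_to_world_matrix(right, up, back, eye)
world_to_camera = view_matrix(right, up, back, eye)

# Multiplying a matrix by its inverse gives the identity matrix.
product = mat_mul(camera_to_world, world_to_camera)

# The world origin sits 5 units in front of the camera (negative z) and 2 below it.
origin_in_camera_space = transform_point(world_to_camera, (0.0, 0.0, 0.0))
print(origin_in_camera_space)
```

Moving the camera only changes `eye` and the three axes, so exploring the scene means rebuilding these two small matrices rather than touching the scene itself.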
This is a lot to take in, so to recap, there are 3 different coordinate systems used so
far: object space, world space, and camera space. The 3-D model is created in object
space. The scene exists in world space, so to place the 3-D model into the scene, each
vertex in the model is multiplied with a world-transform matrix. Of course, the scene
does not really change much, so to actually move around in the scene, each vertex present
in the scene is multiplied with a view matrix. This operation places the vertices in camera
space. The camera is like an eye: given a scene, whatever is in camera space is basically
what the eye can see. Now that these coordinate systems are clearer, we can move on to
the final step in displaying a 3-D object.
Transformation to a Screen
The final problem that a researcher encounters when trying to view a 3-D scene is the
fact that although the scene is properly positioned and viewable in 3-D space, the screen
is a 2-D surface. To properly display a 3-dimensional scene onto the 2-dimensional
surface, one final transformation is required: the projection. The projection step encases
the view of the scene in a volume called the viewing volume. There are 2 different kinds
of projections: orthographic and perspective. In an orthographic projection, the viewing
volume is a rectangular prism. In a perspective projection, the viewing volume is a
frustum: a pyramid with its top chopped off to form something like a plateau (fig. 4).
Figure 4. A frustum, the viewing volume for a perspective projection. The camera is
situated at the small end of the frustum. Source: original.
The choice of projection is important because it causes the scene to be rendered quite
differently. Fig. 5 gives a basic example of the difference.
Figure 5. Comparison between an orthographic projection (left) and a perspective
projection (right). The object being drawn in both cases is a transparent cube. The face in
the back is farther from the camera, but the orthographic projection ignores this fact. The
perspective projection draws it like one would in real life with the rear face smaller.
Source: original.
Basically, the reason why this occurs has to do with ratios. The cross section of each
viewing volume is a rectangle. The ratio of the size of the object to the size of the cross
section of the viewing volume determines how big the object is. For an orthographic
projection, each cross section is the same size because the viewing volume is a
rectangular prism. Since each face of a cube is the same size, the ratio of the size of the
front face to the size of the viewing volume cross section is equal to the ratio of the size
of the back face to the size of viewing volume cross section (fig. 6). This results in both
faces being drawn as the same size, which is why an orthographic projection results in
something like a plain square.
Figure 6. A cube (red) in an orthographic viewing volume. The amount of space the front
face takes up in its viewing volume cross section is shown on the left. The amount of
space the back face takes up in its viewing volume cross section is shown on the right.
They are equal. Note that the image may not be to scale. Source: original.
In a perspective projection, the cross section of the viewing volume grows as the
distance from the camera increases. This means that the ratio of the size of an object to
the size of the cross section decreases as the distance from the camera increases. This is
why the situation in fig. 5 occurs. The front face and the back face of the cube are the
same size, but the cross section of the frustum at the back face is larger than the one
at the front face. This causes the back face to be drawn smaller (fig. 7).
Figure 7. A cube (red) in a perspective viewing volume. The amount of space the
front face takes up in its viewing volume cross section is shown on the left. The amount
of space the back face takes up in its viewing volume cross section is shown on the right.
They are not equal. The back face takes up far less space. Note that the image may not be
to scale. Source: original.
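The ratio argument for the two projections can be worked through with numbers. The sketch below is a simplification under stated assumptions: it compares only widths, the cube size, distances, and 90-degree field of view are invented for the example, and the function names are hypothetical.

```python
import math


def ortho_fraction(object_size, volume_width):
    """Share of the cross section an object fills in an orthographic volume.

    The cross section has the same width at every distance.
    """
    return object_size / volume_width


def perspective_fraction(object_size, distance, fov_degrees):
    """Share of the cross section an object fills in a frustum.

    The cross section widens in proportion to the distance from the camera.
    """
    half_angle = math.radians(fov_degrees) / 2.0
    cross_section_width = 2.0 * distance * math.tan(half_angle)
    return object_size / cross_section_width


cube_side = 2.0
front_distance, back_distance = 4.0, 6.0  # the back face is one cube side farther

ortho_front = ortho_fraction(cube_side, volume_width=8.0)
ortho_back = ortho_fraction(cube_side, volume_width=8.0)

persp_front = perspective_fraction(cube_side, front_distance, fov_degrees=90.0)
persp_back = perspective_fraction(cube_side, back_distance, fov_degrees=90.0)

print("orthographic:", ortho_front, ortho_back)   # equal shares
print("perspective: ", persp_front, persp_back)   # the back face gets a smaller share
```

With these numbers the front face fills about a quarter of its cross section in both projections, while the back face fills a quarter in the orthographic case and only about a sixth in the perspective case.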
Projection results in coordinates in Normalized Device Coordinate (NDC) space.
These coordinates are in the range of -1 to 1. They are then transformed into actual screen
coordinates given the size of the screen [2].
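The last step, from NDC space to the screen, is a simple rescaling. The sketch below assumes a screen whose pixel origin is the top-left corner with y increasing downward, which is common in windowing systems but is not the only convention; the function name and the 800 by 600 screen size are illustrative.

```python
def ndc_to_screen(x_ndc, y_ndc, screen_width, screen_height):
    """Map coordinates in the range -1 to 1 onto pixel coordinates.

    Assumes pixel (0, 0) is the top-left corner of the screen.
    """
    x_pixel = (x_ndc + 1.0) * 0.5 * screen_width
    # Flip y: +1 in NDC space is the top of the screen, which is pixel row 0.
    y_pixel = (1.0 - y_ndc) * 0.5 * screen_height
    return (x_pixel, y_pixel)


center = ndc_to_screen(0.0, 0.0, 800, 600)        # (400.0, 300.0)
top_left = ndc_to_screen(-1.0, 1.0, 800, 600)     # (0.0, 0.0)
bottom_right = ndc_to_screen(1.0, -1.0, 800, 600)  # (800.0, 600.0)
print(center, top_left, bottom_right)
```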
There we have it! Those are all the fundamental calculations required to display a
3-dimensional object on a screen. They, unfortunately, are not trivial. Going forward, an
artist can continue to enhance a scene by adding lighting calculations to illuminate the
objects, perhaps as a sun would, or maybe as a lamppost would. Or maybe the artist
wants to make the trees sway in the wind, which requires additional world space
calculations. Just like a painting, the possibilities in designing a scene are endless and
exciting.
Use in Medical Research
A pixel is like a piece of a picture. It literally stands for picture element and is known
to be 2-dimensional. The 3-dimensional analog to this concept is called a voxel, or
volumetric element. Game developers have considered voxels for a variety of functions,
such as the more flexible creation of terrain. The traditional way of creating terrain, the
height map, just uses a picture in which dark areas are lower and light areas are higher in
elevation. Fig. 1 was rendered with a height map. Because voxels are literally like little
cubes, games will often use a method called Marching Cubes to smooth them out [3]. This
same technology has seen use in medical visualizations. Computed tomography
(CT) scans, for instance, are a way to visualize a sample using a volume, rather than just
a picture. Having a 3-D image of a sample to be analyzed gives far more information than
a flat image. Structures that are physically difficult to access through surgery, such as the
cochlea deep in a human’s ear, can easily be visualized with the 3-D image produced by a
tomographic scan [4]. In medicine, doctors do not necessarily have to cut open a
patient to analyze a possible tumor; using a scan to create the 3-D image is enough.
Previously, however, this technology was of limited use for analyzing organisms other
than humans, or small structures, because the images it produced lacked detail: the
resolution was about 1 cubic millimeter. In the past decade, a new method called
micro-CT has become available. The images produced by a micro-CT system are far more
detailed, with voxels that can be only microns in size (where 1 micron is equal to 1/1000
of a millimeter) [5]. These high-resolution volumes can now be used to analyze small
structures, like a human’s blood vessels.
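To give a feel for how Marching Cubes reads a voxel volume, the sketch below covers only its first step: looking at the eight corners of each little cube and recording which ones are inside the object. It is not a full implementation. The triangle lookup table that turns each corner pattern into mesh triangles is omitted, a mathematical sphere stands in for scan data, and the corner ordering and all names are assumptions made for this example.

```python
def sphere_density(x, y, z, radius=3.0, center=4.0):
    """Stand-in for scan data: positive inside a sphere, negative outside."""
    distance_squared = (x - center) ** 2 + (y - center) ** 2 + (z - center) ** 2
    return radius ** 2 - distance_squared


# Offsets of the 8 corners of a cube from its lowest corner. The order is an
# arbitrary choice here; a full implementation must match its lookup table.
CORNER_OFFSETS = [(0, 0, 0), (1, 0, 0), (1, 1, 0), (0, 1, 0),
                  (0, 0, 1), (1, 0, 1), (1, 1, 1), (0, 1, 1)]


def cube_index(density, x, y, z, iso_level=0.0):
    """Pack the inside/outside state of the 8 corners into one 8-bit number."""
    index = 0
    for bit, (dx, dy, dz) in enumerate(CORNER_OFFSETS):
        if density(x + dx, y + dy, z + dz) > iso_level:
            index |= 1 << bit
    return index


def classify_cells(density, size):
    """Count cubes that are fully outside, fully inside, or cut by the surface."""
    counts = {"outside": 0, "inside": 0, "surface": 0}
    for x in range(size):
        for y in range(size):
            for z in range(size):
                index = cube_index(density, x, y, z)
                if index == 0:
                    counts["outside"] += 1
                elif index == 255:
                    counts["inside"] += 1
                else:
                    counts["surface"] += 1  # triangles would be generated here
    return counts


counts = classify_cells(sphere_density, size=8)
print(counts)
```

Only the cubes counted under "surface" would produce triangles, which is why the resulting mesh follows the boundary of the scanned object instead of filling its whole volume.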
As technology such as this becomes more powerful in the future, the visualizations
that can be produced from such technology will only become more detailed. Eventually,
such detailed visuals can revolutionize and streamline the methods doctors use to
diagnose serious diseases, such as cancer, in humans. Research into human organs can
also be greatly enhanced, as detailed models remove some of the need to have an actual
sample to study.
This is a very exciting time for graphics. Not only is the entertainment industry
embracing powerful hardware and techniques to create an almost hyper-realistic
experience, but the medical and research communities are also benefiting from the
increase in detail that results from this growth. Analyses that were not possible before
because of low resolutions are now used to visualize the smallest of structures, giving
humans a deeper understanding of how those structures work. There may come a time
when humans have disease under control, but this will never be achieved without the
use of computer graphics.
References
[1] J. Gregory, Game Engine Architecture. Boca Raton, FL: Taylor and Francis Group, 2009.
[2] E. Angel, Interactive Computer Graphics, A Top-Down Approach with Shader-Based OpenGL. Boston,
MA: Addison-Wesley, 2012.
[3] E. S. Lengyel, “Voxel-Based Terrain for Real-Time Virtual Simulations,” Ph.D. dissertation, Dept.
Comp. Sci., Univ. California, Davis, 2010.
[4] F. P. Vidal, F. Bello, K. W. Brodlie, N. W. John, D. Gould, R. Phillips, and N. J. Avis, “Principles and
Applications of Computer Graphics in Medicine,” Computer Graphics Forum, vol. 25, pp. 113-137,
2006.
[5] D. W. Holdsworth and M. M. Thornton, “Micro-CT in small animal and specimen imaging,” Trends in
Biotechnology, vol. 20, pp. S34-S39, 1 August 2002.