Digital images

Multimedia authoring uses digital images to a large extent, and
website implementation is no exception. You therefore need to know
how to treat and manipulate digital images.
This chapter builds directly on chapter one, and we will here deal
with properties like image resolution, formats of different image files,
and compression.
Getting it
Before you can manipulate an image, you'll have to get it into your
computer as an image file. The file can originate from a number of
sources, including but not limited to:
- scanning
- downloading from a digital still image camera
- downloading from a web page
We will not cover the specifics of any one of these methods, as they
depend on details that vary greatly between software packages. We
will, however, briefly mention a few things about getting images from
the web. Doing so is probably the easiest way to acquire a few images
to play with, if you want to try a feature in an image manipulation
software package like Adobe Photoshop, or if you just want to try out
a layout when building a web page in, for instance, Macromedia
Dreamweaver.
You can often get an image from a web page by right-clicking on it
(clicking on the image with the right mouse button) or, on a Mac,
control-clicking on it (clicking while holding down the ctrl key), and
then selecting a command from the pop-up menu that reads something
like "download image to disk" or "save image to disk" (the exact text
depends on the operating system you are using, and on its version).
You are then asked to pick the location you want to save the image to,
and the result is a file containing the image on your computer, ready
for your exploration.
Fundamental properties
Before we start, we should note that fundamentally, there are two
main ways of representing an image on a computer. The following
applies to the type known as bit-map, or raster, image. We will return to
the other type at the end of the chapter.
Since you have read chapter one, you already know that an image
comes from a scanner as a matrix, or a raster, of pixels. Each pixel is
given a color value, and if the image measures 800 pixels in the
x-direction and 600 in the y-direction, the spatial dimensions of the
image are 800x600. The color of each pixel is given with a certain
number of bits, as well, and it is often given as the third dimension of
the image, like 800x600x16, if our example image has 16 bits to hold
the color for each pixel.
The pixels are nothing more than bits and bytes, so we can easily
calculate how many bits the entire picture takes by multiplying the
three numbers. 800x600x16 yields 7,680,000 bits, or, if we divide by
8, 960,000 bytes. That is 960 kB, which is the size of a file holding
this image. More precisely, it is the size of the part of the file
holding the raw image data. An image file can contain more than just
the pixels. Often, there will be information about other properties,
such as resolution, in an image file, as well.
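The multiplication above can be sketched in a few lines of Python. The
function name raw_image_bytes is our own, chosen for illustration, and
the result covers only the raw pixel data, not any extra information
stored in the file.

```python
def raw_image_bytes(width, height, bits_per_pixel):
    """Return the size in bytes of the raw pixel data of an image."""
    total_bits = width * height * bits_per_pixel
    # Eight bits make one byte; round up to cover a partial final byte.
    return (total_bits + 7) // 8


size = raw_image_bytes(800, 600, 16)
print(size)         # 960000
print(size / 1000)  # 960.0 (kB, counting 1 kB as 1000 bytes)
```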
If you scan a picture, you tell the scanner how many times to sample
the image per inch. If you tell the scanner to sample at 300 ppi, the
scanner will take 300 samples for every inch the scanner head moves.
Each of those samples will become a pixel in the image.
[Ill: Picture scanned at 15, 25, and 72 ppi.]
In any given image viewing/manipulation application you can choose
how densely the pixels should be viewed on the screen, so the resolution
of the image does not necessarily say anything about how an image looks
on a computer. The resolution information is used for instance if you
want to print the image, and it tells the printer software how densely to
print the pixels. If you scan at 300 ppi, then print the same image at
300 dpi, you will get a print-out that is approximately the same spatial
size as the original image you scanned. If you scan at 300 ppi, then
print at 150 dpi, you in effect tell the print software to spread the
pixels out half as densely, and the result will be a print-out which is
twice the size in each dimension. The same number of pixels is printed
(namely all the pixels the image contains), but they are more thinly
spread.
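The relationship between pixel count, print resolution and printed size
can be sketched as follows. The 4 x 3 inch original is an assumed
example, not a figure from the text, and print_size_inches is a name
chosen for illustration.

```python
def print_size_inches(width_px, height_px, print_dpi):
    """Return the (width, height) in inches of an image printed at print_dpi."""
    return width_px / print_dpi, height_px / print_dpi


# Assumed example: a 4 x 3 inch original scanned at 300 ppi
# gives an image of 1200 x 900 pixels.
scan_ppi = 300
width_px, height_px = 4 * scan_ppi, 3 * scan_ppi

print(print_size_inches(width_px, height_px, 300))  # (4.0, 3.0)
print(print_size_inches(width_px, height_px, 150))  # (8.0, 6.0)
```

Halving the print resolution doubles each dimension, while the number
of pixels stays the same.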
In an image manipulation application like Adobe Photoshop you are
free to set a new resolution on any image. As described above, the
resolution of an image is only a directive included in the image file that
says something along the lines of "I was originally scanned at 300 ppi".
This directive is called an image tag, or a meta tag, because it is
contained within the image file itself, and says something about the
same image file. An image file usually contains many meta tags, and
they hold information such as creation date, color-space used etc.
Image file formats
If you want to exchange image files with other users, applications and
computers, you have to agree on how to pack the information inside the
file, otherwise the other application will not be able to use the file. How
do you arrange the bytes of pixels? Do you start with the lower, right
corner of the picture, or the upper left? What kind of meta tags should
you include, and where do you put the meta tags? In the beginning of
the file, or at the end? In which order?
Any file format, not only an image file format, has to specify all
such details carefully, so there can be no ambiguity as to how the
information you want to get at is packed within the file.
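To make these questions concrete, here is a minimal sketch of a
made-up file format. It is purely hypothetical and does not correspond
to any real image format: the "TOY1" marker, the header layout and the
function names are all our own assumptions. It puts a small header
first, then the pixels row by row, starting in the upper left corner.

```python
import struct

# Hypothetical header: 4 marker bytes, width, height, bits per pixel.
# "<" = little-endian, "4s" = 4 raw bytes, "H" = 16-bit, "B" = 8-bit unsigned.
HEADER = struct.Struct("<4sHHB")
MAGIC = b"TOY1"


def pack_image(width, height, pixels):
    """Pack 8-bit pixels, listed row by row from the upper left corner."""
    if len(pixels) != width * height:
        raise ValueError("pixel count does not match width x height")
    return HEADER.pack(MAGIC, width, height, 8) + bytes(pixels)


def unpack_image(data):
    """Read back (width, height, pixels) from bytes written by pack_image."""
    magic, width, height, bits = HEADER.unpack_from(data)
    if magic != MAGIC or bits != 8:
        raise ValueError("not a file in the toy format")
    pixels = list(data[HEADER.size:HEADER.size + width * height])
    return width, height, pixels


blob = pack_image(3, 2, [0, 1, 2, 3, 4, 5])
print(len(blob))           # 15: a 9-byte header plus 6 pixel bytes
print(unpack_image(blob))  # (3, 2, [0, 1, 2, 3, 4, 5])
```

A program that does not know this exact layout has no way of telling
which bytes are header and which are pixels, which is why the details
must be agreed on in advance.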
There are a large number of different file formats for images. The
reasons for that are manifold, but can be illustrated by detailing two
formats often used on the web: GIF and JPEG.
GIF – Graphics Interchange Format
Graphics Interchange Format, or GIF for short, is the first format
that was supported on the web, i.e. the format that the first web browser
could display. As has been told in previous chapters, the web was started
by academic researchers to be able to share documents detailing
methods, drafts and results from their research. They had a need for
including illustrations in those documents. The illustrations were most
often simple line drawings, or charts and schematics with straight lines
and few colors.
[Ill: Example GIF illustration]
They thus chose an image file format that suited their intended use.
The format would only need to support a limited number of colors, and
it should be a simple format to handle, so that as many types of
terminals and computers as possible could display it. GIF was the ideal
candidate. The GIF format specifies that any pixel in the image can use
a maximum of one byte to store the color. That means a GIF image has
an upper limit of 2^8, or 256, different colors. Using only one byte for the
color (or less - a GIF file can use between 1 and 8 bits for color per pixel)
helps reduce the file size and therefore the amount of time it takes to
download any one image. This was a critical parameter due to the
relatively low speed of the network.
As the web grew in popularity it started to be used for many other
purposes than scientific reporting. People started displaying other types
of images on the web, for instance scanned photos. The GIF format does
a very poor job of representing photos, because of its limited number of
colors. Another format was needed, with support for more colors while
still resulting in small files, and the choice fell on JPEG.
JPEG – Joint Photographic Experts Group (Format)
JPEG is very well suited for representing photographic material
because the format specifies that each pixel in the image has 24 bits, or
3 bytes, at its disposal for storing its color. That yields 2^24, or about 16.8 million,
different colors. The eye and brain will be satisfied that that is enough to
produce what usually is referred to as a photo-realistic representation.
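The color counts quoted for GIF and JPEG follow directly from the
number of bits available per pixel, as this small sketch shows. The
function name color_count is chosen for illustration.

```python
def color_count(bits_per_pixel):
    """Return how many distinct colors fit in the given number of bits."""
    return 2 ** bits_per_pixel


print(color_count(1))   # 2
print(color_count(8))   # 256
print(color_count(24))  # 16777216
```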
[Ill: Example JPEG image]
Both the GIF and the JPEG format have another important
ingredient: the way they compress.
Compression
Compression is an important part of many image file format
specifications because an image file is usually big in its raw or
uncompressed state. If you scan an image that is 8x5 inches at 300 ppi
and 32 bit color, you will end up with 8 x 5 x 300^2 x 32 = 115.2
million bits (about 115 Mb), which divided by 8 gives a file of 14.4
MB. That is a very big file, usually too big to transfer over a slow
network. To reduce the size of the file, compression is applied. A
compression scheme is a precise recipe for how to compress the data,
and for how to decompress the compressed data again to reconstruct the
original data.
If you have ever been given a Kinder Egg, you know the approach.
Inside the egg there is a small toy in a plastic container. The toy is
disassembled in the container, but included is an illustrated recipe on
how to assemble, or decompress, it into a usable toy. Once the toy is
assembled, it no longer fits into the container. It can however be
disassembled, or compressed, again and put into its container. With the
instructions, or the compression algorithm, it is always possible to
reassemble the toy into its usable form. This is an example of a
non-destructive, or lossless, compression algorithm, supposing no
pieces are lost or destroyed in the (dis)assembly: you always get the
same result when you assemble the toy according to the instructions.
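A lossless round trip can be demonstrated with zlib, a general-purpose
lossless compressor from Python's standard library. It is used here
only as a stand-in to show the principle; the pixel values are made up
for the example.

```python
import zlib

# One row of 8-bit pixels with long runs of identical values.
row = bytes([255] * 9 + [0] + [255] * 6)
image = row * 100  # 100 identical rows, 1600 bytes in total

compressed = zlib.compress(image)
restored = zlib.decompress(compressed)

print(len(image))                    # 1600
print(len(compressed) < len(image))  # True
print(restored == image)             # True: nothing was lost
```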
If a piece is lost or broken, the toy becomes unusable. The same does
not always hold true for computer data. It depends on what kind of data
is compressed, but when we discuss images, the ultimate 'goal' of an
image is often to be viewed and understood by a human. Does it then
really matter whether the pixel 523 down and 642 right is red or pink?
It might not, for the understanding of the whole picture, and
that is something lossy compression algorithms take advantage of. Such
compression schemes seek to keep the information in an image that is
important for the human viewer's brain's understanding of the image,
but will throw away some information that is not, in order to maximize
the compression efficiency.
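The principle of throwing information away can be shown with a toy
scheme that keeps only the upper four bits of each 8-bit value. This is
an assumption made for illustration and far cruder than any real image
format, but it shows that the restored data is close to the original
without being identical.

```python
def compress_lossy(pixels):
    """Keep only the upper 4 bits of each 8-bit value (codes 0..15)."""
    return [p >> 4 for p in pixels]


def decompress_lossy(codes):
    """Expand 4-bit codes back to 8 bits; the dropped detail is gone."""
    return [c * 17 for c in codes]  # 17 maps 0..15 onto 0..255


original = [200, 203, 198, 90, 91, 255]
restored = decompress_lossy(compress_lossy(original))

print(restored)              # [204, 204, 204, 85, 85, 255]
print(restored == original)  # False
```

The three slightly different values 200, 203 and 198 all come back as
204: half the bits were saved, at the cost of fine detail.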
GIF compression
The compression algorithm used in GIF is called LZW
(Lempel-Ziv-Welch). It is geared towards illustrations with simple
geometrical forms and few colors, because it replaces sequences of
pixels that repeat with short codes. The idea is easiest to see in a
simpler, related scheme called run-length encoding, which takes
advantage of identical pixels running along the horizontal length of
the image. Consider the following illustration, in which we have
enlarged a small portion of the pixels.
[Ill: GIF image w/magnification]
Concentrating on the enlarged portion of the image, we see that in
the uppermost row of pixels the first nine pixels are identical, as are
the last six, with a single different pixel in between, in the tenth
place. If we just store this image uncompressed, at one byte per pixel,
the sixteen pixels in the uppermost row would be stored as 9+1+6 = 16
bytes. Instead, a run-length scheme stores only one byte containing the
color of each run of identical pixels, and a number saying how many
pixels of that color run together before being broken by a pixel of a
different color. Each run then costs us only one byte to store the
color, and maybe one more byte to store the number of pixels that use
that color. Clearly a good saving. Depicting one byte as one square, we
could illustrate the concept like this, for the uppermost row of the
enlarged part of the picture:
[Ill: run-length encoding]
When there are many adjacent identical pixels, we save many bytes.
But whenever a single pixel of a given color stands isolated, we
actually use more space to store it with run-length encoding than in
the uncompressed version.
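The run-length idea described above can be sketched as follows. This
is a sketch of the principle only, not code that reads or writes real
GIF files. The function names are our own, and the colors of the
example row are assumed (white runs broken by one black pixel), since
the text only gives the run lengths.

```python
def rle_encode(pixels):
    """Encode a row of pixel values as a list of (count, value) pairs."""
    runs = []
    for value in pixels:
        if runs and runs[-1][1] == value:
            # Same color as the previous pixel: extend the current run.
            runs[-1] = (runs[-1][0] + 1, value)
        else:
            # Different color: start a new run of length one.
            runs.append((1, value))
    return runs


def rle_decode(runs):
    """Expand (count, value) pairs back into the original row."""
    pixels = []
    for count, value in runs:
        pixels.extend([value] * count)
    return pixels


WHITE, BLACK = 255, 0  # assumed colors for the example row
row = [WHITE] * 9 + [BLACK] + [WHITE] * 6  # 16 pixels

runs = rle_encode(row)
print(runs)                     # [(9, 255), (1, 0), (6, 255)]
print(len(row), 2 * len(runs))  # 16 6: bytes before and after, at 2 bytes per run
print(rle_decode(runs) == row)  # True

# Worst case: no two neighbours are equal, so the data doubles in size.
noisy = [10, 20, 30, 40]
print(2 * len(rle_encode(noisy)))  # 8
```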
From this you might guess why GIF is a poor choice for photographic
material. Not only does it have few colors, but the compression
algorithm is very bad for the type of images that hardly have two pixels
with exactly the same color next to each other, as is the case with
photographs.
[Ill: JPEG-compr. at 60%: 4.0 kB, GIF-compr.: 5.0 kB]
GIF compression is a lossless algorithm. Since the format can only
handle 256 colors, you might object: "If I save a photo I've scanned in
GIF format, it will discard a lot of color information and I will end
up with a bad looking image on the screen." True, but losslessness in
this respect is
not about converting from one format to the other. The important thing
is what happens if you compress an image, then decompress it for
viewing, then compress it again, and so on. The GIF format will
preserve the exact same information throughout any number of such
compress/decompress cycles, so you end up with the same data you
started out with in the first place. That is the definition of lossless
compression.
JPEG compression
JPEG compression was specifically designed to compress images
with many different colors and a lot of variation. It is not concerned with
preserving the data in a lossless state, but rather with optimizing how
the image looks to the eye, while at the same time compressing the file
as much as possible. The exact algorithm used is much more complex than
the GIF algorithm, and we will not go any further into it (for the
technically inclined, JPEG compression actually looks at the image in
its frequency domain, and makes use of the DCT, the discrete cosine
transform, a close relative of the Fourier transform).
Since JPEG uses a lossy compression algorithm, some data is
discarded, and some quality is inevitably lost. Once it is gone, it is gone
for good. Thus, repeatedly re-compressing the same image using JPEG
will result in poorer and poorer quality. For this reason, JPEG is not
used as a format when manipulating images in a workflow process, only
as a final output, for instance for the web.
A feature of the JPEG format is that the user, i.e. the person that
compresses the image, can choose how much the image should be
compressed. The exact steps to do so will vary between programs, but
the choice is usually presented as a slider, or a list, where you can choose
anything between 0% and 100% quality. The trade-off is file size. The
higher the quality, the bigger the file size, and therefore the longer time
needed to transfer an image over the net, for instance to be viewed in a
web browser. You are usually presented with a preview of the image as
you select different grades of quality, and it is up to you, as a
multimedia expert, to make a good trade-off between the perceived
visual quality of the image and its file size. Usually, a value between 40
and 60% is a good bet for web work.
[Ill: JPEG compression: 60% quality at 4.0 kB, 30% at 2.3 kB, and 5%
at 1.4 kB file size]
JPEG compression is not ideal for illustrations like charts, diagrams
and the like. JPEG was specifically introduced to complement GIF, and
for such types of images GIF is the best choice. If you look at the
JPEG images above, you might notice that as we decrease the quality,
which is the same as increasing the level of compression, visual
artifacts become apparent. These are especially noticeable around edges
like the beak of the parrot. We can illustrate this with the following:
[Ill: GIF-version of image: 1.2 kB. JPEG of the same at 5% quality, 1.3
kB, at 40% quality, 2.6 kB]
You can see that the middle image, compressed as a JPEG at 5% quality,
looks very bad, even though it is similar in size to the GIF version.
At 40% quality, which is what we will have to use to approach the
visual quality of the GIF version, the file has swelled to well over
twice the size of the GIF version. This might seem insignificant, but
if you have a web page with 10 images, it matters whether they each
weigh in at 80 kB when they could have been only 40 kB each.
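To put rough numbers on that difference, the sketch below estimates
the transfer time for the ten images. The 56 kbit/s link speed is our
own assumption (a typical dial-up modem), not a figure from the text,
and the calculation ignores latency and protocol overhead.

```python
def transfer_seconds(total_kilobytes, link_kilobits_per_second):
    """Return the ideal transfer time, ignoring latency and overhead."""
    return total_kilobytes * 8 / link_kilobits_per_second


LINK_KBPS = 56  # assumed modem speed, not a figure from the text

print(round(transfer_seconds(10 * 80, LINK_KBPS)))  # 114
print(round(transfer_seconds(10 * 40, LINK_KBPS)))  # 57
```

Under these assumptions, halving the file sizes saves the visitor
almost a minute of waiting.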
There are a number of other image file formats that are worth
mentioning for one reason or another. We will briefly discuss a few of
them.
The PNG format
The Portable Network Graphics format was developed as a new web
graphics format, primarily to supersede GIF, combining characteristics
of both GIF and JPEG: like GIF it compresses without loss, and like
JPEG it can store millions of colors. It is supported in all newer
browsers, and is well suited for most kinds of web images. It looks
set to become the most used web image format over the next couple of
years.
TIFF
TIFF is a format designed to preserve the quality of any image, and is
suitable for use throughout any workflow process. It supports millions
of colors, several advanced features, and a lossless compression
algorithm.
Vector image file formats
Up until now we have discussed formats where the image
information is stored in the file as a matrix of pixels. There is another
way of doing it, but it's not suitable in all situations.
Say, for instance that you want to draw a picture of a circle on a
computer. If you use a vector-capable application, it will let you draw the
circle, but instead of storing it as a number of pixels, it will store it as a
mathematical expression for a circle, and a set of coordinates to say
where in the image the circle is. In a way, instead of displaying and
storing graphical elements as pixels in a matrix, the elements are stored
as objects. This has a number of advantages. For one thing, objects can
overlap and move independently of each other. Another big advantage is
that the objects are resolution independent. If you zoom in (enlarge) a
pixel matrix image, you will start to see the individual pixels as bigger
and bigger squares. Mathematical expressions, on the other hand, by
nature have no resolution constraint. They will therefore display and
print smoothly at any size.
Their limitation lies in the fact that what is displayed must be
expressible in mathematical terms. A photo is not a suitable candidate.
Thus, vector images are usually illustrations like logos, stylized
drawings and the like.
If you want to use a vector image on the web, you usually have to
convert it to GIF first (Footnote: This is not always the case; sometimes
JPEG will be better even for a vector drawing, and sometimes you can
use another format altogether, like Flash). This process is called
rasterizing, because the objects in the vector image are converted into
a pixel raster, or matrix. Once an image is rasterized, the vector
information is lost, and the reverse process, turning a raster image
into a vector image, is an inaccurate process at best.
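Rasterizing can be sketched with the circle example from above. The
circle is stored as a center and a radius in a 0..1 coordinate space,
and the same description is rendered onto pixel grids of two different
sizes. The function name and the rule that a pixel is filled when its
center lies inside the circle are our own simplifying assumptions; real
applications use more refined methods.

```python
def rasterize_circle(cx, cy, radius, size):
    """Render a circle, given in unit-square coordinates, onto a size x size grid.

    A pixel is filled when its center lies inside the circle.
    """
    rows = []
    for y in range(size):
        row = ""
        for x in range(size):
            # Center of pixel (x, y), mapped back into the 0..1 coordinate space.
            px, py = (x + 0.5) / size, (y + 0.5) / size
            inside = (px - cx) ** 2 + (py - cy) ** 2 <= radius ** 2
            row += "#" if inside else "."
        rows.append(row)
    return rows


# The same vector description rendered at two resolutions.
circle = (0.5, 0.5, 0.4)
for size in (8, 16):
    print("\n".join(rasterize_circle(*circle, size)))
    print()
```

The vector description never changes; only the raster made from it
gets coarser or finer. Once only the grid of "#" and "." remains, the
center and radius can no longer be read off exactly.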
The arrow drawing that has been used as an example in this chapter
was made as a vector object in the application Freehand from
Macromedia. It was converted to a raster image format to be included in
this text, as such formats tend to be better supported in other
applications, i.e. those which are not specialized for handling vector
objects.