Digital images

Multimedia authoring relies heavily on digital images, and website implementation is no exception. You therefore need to know how to treat and manipulate digital images. This chapter builds directly on chapter one, and here we will deal with properties like image resolution, image file formats, and compression.

Getting it

Before you can manipulate an image, you have to get it into your computer as an image file. The file can originate from a number of sources, including but not limited to:
- scanning
- downloading from a digital still image camera
- downloading from a web page

We will not cover the specifics of any one of these methods, as they vary greatly between software packages. We will, however, briefly mention a few things about getting images from the web. Doing so is perhaps the easiest way to acquire a few images to play with, if you want to try a feature in an image manipulation package like Adobe Photoshop, or if you just want to try out a layout when building a web page in, for instance, Macromedia Dreamweaver.

You can often get an image from a web page by right-clicking on it (clicking on the image with the right mouse button) or, on a Mac, control-clicking on it (clicking while holding down the ctrl key), and selecting a command from the pop-up menu along the lines of "download image to disk" or "save image to disk" (the exact text depends on the operating system and version you are using). You are then asked to pick the location to save the image to, and the result is a file containing the image on your computer, ready for your exploration.

Fundamental properties

Before we start, we should note that there are two main ways of representing an image on a computer. The following applies to the type known as a bit-map, or raster, image. We will return to the other type at the end of the chapter.
Since you have read chapter one, you already know that an image comes from a scanner as a matrix, or raster, of pixels. Each pixel is given a color value, and if the image measures 800 pixels in the x-direction and 600 in the y-direction, the spatial dimensions of the image are 800x600. The color of each pixel is stored with a certain number of bits, which is often given as the third dimension of the image, as in 800x600x16 if our example image uses 16 bits to hold the color of each pixel.

The pixels are nothing more than bits and bytes, so we can easily calculate how many bits the entire picture takes by multiplying the three numbers. 800x600x16 yields 7 680 000 bits, or, if we divide by 8, 960 000 bytes. That is 960 kB, which is the size of the raw image data in a file holding this image. An image file can contain more than just the pixels. Often there will be information about other properties, such as resolution, in the file as well.

If you scan a picture, you tell the scanner how many times to sample the image per inch. If you tell the scanner to sample at 300 ppi, it will take 300 samples for every inch the scanner head moves. Each of those samples becomes a pixel in the image.

[Ill: Picture scanned at 15, 25, and 72 ppi.]

In any given image viewing or manipulation application you can choose how densely the pixels are shown on the screen, so the resolution of the image does not necessarily say anything about how the image looks on a computer. The resolution information is used, for instance, when you print the image: it tells the printer software how densely to print the pixels. If you scan at 300 ppi and then print the same image at 300 dpi, you will get a print-out that is approximately the same spatial size as the original image you scanned.
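
The arithmetic above can be sketched in a few lines of Python. This is only an illustration: the function names are made up for this example, kB is taken to mean 1000 bytes as in the text, and the 4-inch original is an assumed value.

```python
def raw_size_bytes(width_px, height_px, bits_per_pixel):
    """Size of the uncompressed pixel data, in bytes."""
    total_bits = width_px * height_px * bits_per_pixel
    return total_bits // 8


def print_size_inches(pixels, dpi):
    """Length of a printed side when `pixels` pixels are printed at `dpi`."""
    return pixels / dpi


size = raw_size_bytes(800, 600, 16)
print(size, "bytes =", size / 1000, "kB")   # 960000 bytes = 960.0 kB

# Assumed example: a 4-inch-wide original scanned at 300 ppi
# gives 4 * 300 = 1200 pixels across.
scanned_width_px = 4 * 300
print(print_size_inches(scanned_width_px, 300))  # 4.0, the original width
```

The print size depends only on the number of pixels and the print density, which is why the resolution stored in the file can be changed without touching the pixels themselves.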
If you scan at 300 ppi and then print at 150 dpi, you in effect tell the print software to spread the pixels out half as densely, and the result will be a print-out that is twice the size in each dimension. The same number of pixels is printed (namely all the pixels the image contains), but they are more thinly spread.

In an image manipulation application like Adobe Photoshop you are free to set a new resolution on any image. As described above, the resolution of an image is only a directive included in the image file that says something along the lines of "I was originally scanned at 300 ppi". This directive is called an image tag, or a meta tag, because it is contained within the image file itself and says something about that same file. An image file usually contains many meta tags, holding information such as the creation date, the color space used, and so on.

Image file formats

If you want to exchange image files with other users, applications and computers, you have to agree on how to pack the information inside the file; otherwise the other application will not be able to use the file. How do you arrange the bytes of pixels? Do you start with the lower right corner of the picture, or the upper left? What kind of meta tags should you include, and where do you put them? At the beginning of the file, or at the end? In which order? Any file format, not only an image file format, has to specify all such details carefully, so there can be no ambiguity as to how the information you want to get at is packed within the file.

There are a large number of different file formats for images. The reasons for that are manifold, but they can be illustrated by detailing two formats often used on the web: GIF and JPEG.

GIF – Graphics Interchange Format

Graphics Interchange Format, or GIF for short, is the first format that was supported on the web, i.e. the format that the first web browser could display.
As described in previous chapters, the web was started by academic researchers who wanted to share documents detailing methods, drafts and results from their research. They needed to include illustrations in those documents. The illustrations were most often simple line drawings, or charts and schematics with straight lines and few colors.

[Ill: Example GIF illustration]

They thus chose an image file format that suited their intended use. The format would only need to support a limited number of colors, and it should be simple to handle, so that as many types of terminals and computers as possible could display it. GIF was the ideal candidate.

The GIF format specifies that any pixel in the image can use a maximum of one byte to store the color. That means a GIF image has an upper limit of 2^8 = 256 different colors. Using only one byte for the color (or less: a GIF file can use between 1 and 8 bits of color per pixel) helps reduce the file size and therefore the time it takes to download an image. This was a critical parameter because of the relatively low speed of the network.

As the web grew in popularity it started to be used for many purposes other than scientific reporting. People started displaying other types of images on the web, for instance scanned photos. The GIF format does a very poor job of representing photos because of its limited number of colors. Another format was needed, with support for more colors while still resulting in small files, and the choice fell on JPEG.

JPEG – Joint Photographic Experts Group (Format)

JPEG is very well suited for representing photographic material because the format gives each pixel in the image 24 bits, or 3 bytes, for storing its color. That yields 2^24 ≈ 16.8 million different colors. The eye and brain will be satisfied that this is enough to produce what is usually referred to as a photo-realistic representation.
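
The relationship between bits per pixel and the number of available colors can be checked with a short Python sketch. The function name is chosen for this example only.

```python
def color_count(bits_per_pixel):
    """Number of distinct color values that fit in the given number of bits."""
    return 2 ** bits_per_pixel


# 1 and 8 bits are the GIF extremes mentioned above; 24 bits is JPEG.
for bits in (1, 8, 24):
    print(bits, "bits per pixel:", color_count(bits), "colors")
```

Each extra bit doubles the number of colors, which is why going from 8 to 24 bits takes the palette from 256 colors to well over sixteen million.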
[Ill: Example JPEG image]

Both the GIF and the JPEG formats have another important ingredient: the way they compress.

Compression

Compression is an important part of many image file format specifications because an image file is usually big in its raw, or uncompressed, state. If you scan an image that is 8x5 inches at 300 ppi and 32-bit color, you will end up with 8 x 5 x 300^2 x 32 ≈ 115 million bits, which divided by 8 is about 14.4 MB. That is a very big file, usually too big to transfer over a slow network. To reduce the size of the file, compression is applied in the form of a precise recipe for how to compress the data, and how to uncompress, or decompress, the compressed data to reconstruct the original again.

If you have ever been given a Kinder Egg, you know the approach. Inside the egg there is a small toy in a plastic container. The toy is disassembled in the container, but included is an illustrated recipe for how to assemble, or decompress, it into a usable toy. Once the toy is assembled, it no longer fits into the container. It can, however, be disassembled, or compressed, again and put back into its container. With the instructions, or the compression algorithm, it is always possible to reassemble the toy into its usable form. This is an example of a nondestructive, or lossless, compression algorithm, supposing no pieces are lost or destroyed in the (dis)assembly: you always get the same result when you assemble the toy according to the instructions.

If a piece is lost or broken, the toy becomes unusable. The same does not always hold true for computer data. It depends on what kind of data is compressed, but when we discuss images, the ultimate 'goal' of an image is often to be viewed and understood by a human. Does it then really matter whether pixel number 523 down and 642 right is red or pink? It might not, for the understanding of the whole picture, and that is something lossy compression algorithms take advantage of.
Such compression schemes seek to keep the information in an image that is important for the human viewer's understanding of the image, but throw away some information that is not, in order to maximize the compression efficiency.

GIF compression

The compression used in GIF is geared towards illustrations with simple geometrical forms and few colors. The actual algorithm is called LZW (Lempel-Ziv-Welch), but the principle it exploits is easiest to see in a simpler scheme called run-length encoding, which takes advantage of identical pixels running along the horizontal length of the image. Consider the following illustration, in which we have enlarged a small portion of the pixels.

[Ill: GIF image w/magnification]

Concentrating on the enlarged portion of the image, we see that in the uppermost row of pixels the first nine pixels are identical, as are the last six, with an in-between pixel occupying the tenth place. If we just stored this image uncompressed, the sixteen pixels in the uppermost row would be stored as 9+1+6 = 16 bytes. Instead, a run-length scheme specifies that we only store one byte containing the color of a run of identical pixels, and a number saying how many pixels of that color run together before being broken by a pixel of a different color. Each run would then cost us only one byte to store the color, and maybe one more byte to store the number of pixels that use it. Clearly a good saving. Depicting one byte as one square, we could illustrate the concept like this, for the uppermost row of the enlarged part of the picture:

[Ill: run-length encoding]

When there are many adjacent identical pixels, we save many bytes. But whenever there is only one isolated pixel of a given color, run-length encoding actually uses more space to store it than the uncompressed version does. From this you might guess why GIF is a poor choice for photographic material.
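
The run-length idea described above can be sketched in Python. This is an illustration of run-length encoding only, not the actual GIF encoder (real GIF files use the more elaborate LZW algorithm). The function names, the letters standing in for colors, and the assumption of one byte per count and one byte per color are all choices made for this example.

```python
def rle_encode(row):
    """Encode a row of pixel values as a list of (count, value) pairs."""
    runs = []
    for value in row:
        if runs and runs[-1][1] == value:
            # Same color as the previous pixel: extend the current run.
            runs[-1] = (runs[-1][0] + 1, value)
        else:
            # Different color: start a new run of length one.
            runs.append((1, value))
    return runs


def rle_decode(runs):
    """Rebuild the original row from (count, value) pairs."""
    row = []
    for count, value in runs:
        row.extend([value] * count)
    return row


# Nine identical pixels, one in-between pixel, six identical pixels.
row = ["A"] * 9 + ["B"] + ["C"] * 6
runs = rle_encode(row)
print(runs)                                    # [(9, 'A'), (1, 'B'), (6, 'C')]
print(len(row), "->", 2 * len(runs), "bytes")  # 16 -> 6 bytes

# A row where no two neighbors match grows instead of shrinking.
noisy = list("ABCDEFGH")
print(len(noisy), "->", 2 * len(rle_encode(noisy)), "bytes")  # 8 -> 16 bytes

# Decoding gives back exactly what went in: the scheme is lossless.
assert rle_decode(runs) == row
```

The second example shows the weakness noted above: when every pixel differs from its neighbor, each one becomes its own run, and the encoded row is twice as large as the original.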
Not only does GIF have few colors, but its compression algorithm is very poorly suited to images that hardly ever have two pixels of exactly the same color next to each other, as is the case with photographs.

[Ill: JPEG-compr. at 60%: 4.0 kB, GIF-compr.: 5.0 kB]

GIF compression is a lossless algorithm. Since GIF can only handle 256 colors, you might object that if you save a scanned photo in GIF format, a lot of color information is discarded and you end up with a bad-looking image on the screen. True, but losslessness in this respect is not about converting from one format to another. The important thing is what happens if you compress an image, then decompress it for viewing, then compress it again, and so on. The GIF format will preserve exactly the same information throughout any number of such compress/decompress cycles, so you end up with the same data you started out with. That is the definition of lossless compression.

JPEG compression

JPEG compression was specifically designed to compress images with many different colors and a lot of variation. It is not concerned with preserving the data in a lossless state, but rather with optimizing how the image looks to the eye while compressing the file as much as possible. The exact algorithm is much more complex than the GIF algorithm, and we will not go any further into it (for the technically inclined: JPEG compression looks at the image in its frequency domain, making use of the DCT, the discrete cosine transform, a close relative of the Fourier transform).

Since JPEG uses a lossy compression algorithm, some data is discarded, and some quality is inevitably lost. Once it is gone, it is gone for good. Thus, repeatedly re-compressing the same image using JPEG will result in poorer and poorer quality. For this reason, JPEG is not used as a working format when manipulating images in a workflow process, only as a final output format, for instance for the web.

A feature of the JPEG format is that the user, i.e.
the person who compresses the image, can choose how much the image should be compressed. The exact steps to do so vary between programs, but the choice is usually presented as a slider or a list, where you can choose anything between 0% and 100% quality. The trade-off is file size. The higher the quality, the bigger the file, and therefore the longer the time needed to transfer the image over the net, for instance to be viewed in a web browser. You are usually presented with a preview of the image as you select different grades of quality, and it is up to you, as a multimedia expert, to make a good trade-off between the perceived visual quality of the image and its file size. Usually, a value between 40% and 60% is a good bet for web work.

[Ill: JPEG compression: 60% quality at 4.0 kB, 30% at 2.3 kB, and 5% at 1.4 kB file size]

JPEG compression is not ideal for illustrations like charts, diagrams and the like. JPEG was specifically introduced to complement GIF, and for such types of images GIF is the better choice. If you look at the JPEG images above, you might notice that as we decrease the quality, which is the same as increasing the level of compression, visual artifacts become apparent. These are especially noticeable around edges like the beak of the parrot. We can illustrate this with the following:

[Ill: GIF-version of image: 1.2 kB. JPEG of the same at 5% quality, 1.3 kB, at 40% quality, 2.6 kB]

You can see that the middle image, compressed as a JPEG, looks very bad, even though it is similar in size to the GIF version. At 40% quality, which is what we have to use to approach the visual quality of the GIF version, the file has swelled to well over twice the size of the GIF version. This might seem insignificant, but if you have a web page with 10 images, it matters whether they each weigh in at 80 kB or only 40 kB.

There are a number of other image file formats that are worth mentioning for one reason or another.
We will briefly discuss a few of them.

The PNG format

The Portable Network Graphics format was developed as a new web graphics format, originally to supersede GIF, and it combines characteristics of both GIF and JPEG: its compression is lossless, and it supports millions of colors. It is supported in all newer browsers, and is well suited for most kinds of web images. It looks set to become the most used web image format over the next couple of years.

TIFF

TIFF is a format designed to preserve the quality of any image, and is suitable for use throughout any workflow process. It supports millions of colors, several advanced features, and a lossless compression algorithm.

Vector image file formats

Up until now we have discussed formats where the image information is stored in the file as a matrix of pixels. There is another way of doing it, but it is not suitable in all situations. Say, for instance, that you want to draw a picture of a circle on a computer. If you use a vector-capable application, it will let you draw the circle, but instead of storing it as a number of pixels, it will store it as a mathematical expression for a circle and a set of coordinates saying where in the image the circle is. In a way, instead of displaying and storing graphical elements as pixels in a matrix, the elements are stored as objects.

This has a number of advantages. For one thing, objects can overlap and move independently of each other. Another big advantage is that the objects are resolution independent. If you zoom in on (enlarge) a pixel matrix image, you will start to see the individual pixels as bigger and bigger squares. Mathematical expressions, on the other hand, by nature have no resolution constraint. They will therefore display and print smoothly at any size.

Their limitation lies in the fact that what is displayed must be expressible in mathematical terms. A photo is not a suitable candidate. Thus, vector images are usually illustrations like logos, stylized drawings and the like.
If you want to use a vector image on the web, you usually have to convert it to GIF first (Footnote: This is not always the case; sometimes JPEG will be better even for a vector drawing, and sometimes you can use another format altogether, like Flash). This process is called rasterizing, because the objects in the vector image are converted into a pixel raster, or matrix. Once an image is rasterized, the vector information is lost, and the reverse process, turning a raster image into a vector image, is inaccurate at best.

The arrow drawing that has been used as an example in this chapter was made as a vector object in the application Freehand from Macromedia. It was converted to a raster image format to be included in this text, as such formats tend to be better supported in other applications, i.e. those that are not specialized for handling vector objects.
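
The rasterizing step described above can be sketched in Python for the circle example. This is a minimal illustration under stated assumptions, not how any particular application does it: the function name, the use of unit coordinates (0.0 to 1.0) for the center and radius, and the text characters standing in for pixels are all choices made for this example.

```python
def rasterize_circle(cx, cy, radius, width, height):
    """Return rows of '#' and '.' for a filled circle given in unit coordinates.

    cx, cy and radius are fractions of the image size (0.0 to 1.0), so the
    same description can be rendered at any pixel size.
    """
    rows = []
    for y in range(height):
        row = ""
        for x in range(width):
            # Sample at the pixel center, mapped back to unit coordinates.
            px = (x + 0.5) / width
            py = (y + 0.5) / height
            inside = (px - cx) ** 2 + (py - cy) ** 2 <= radius ** 2
            row += "#" if inside else "."
        rows.append(row)
    return rows


circle = (0.5, 0.5, 0.4)   # the "vector" description: center and radius
for size in (8, 16):       # the same object rendered at two resolutions
    print("\n".join(rasterize_circle(*circle, size, size)))
    print()
```

The three numbers describing the circle stay the same however large the output is; only the rasterized result is tied to a pixel count, and the center and radius cannot be read back exactly from those pixels.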