Being digital

The computer is a digital machine, which fundamentally means it is capable of manipulating digits. Grasping what that implies is critical to understanding how a computer works. Not in an electrical or mechanical sense, of course, nor do we need to dissect the logic of a computer in great detail. What we seek is a sufficient understanding of a few key aspects, because to understand them is to understand the fundamental concepts that govern how a computer operates, which in turn has a profound effect on what you put into, and get out of, the computer.

To get started, it is helpful to ask ourselves about the world around us, especially about how we measure things, i.e. how we quantify them. This is important because the computer also needs to quantify anything we want to put into it. When we scan a picture, what we are doing is letting the computer measure, or quantify, some aspects of the picture, namely the color and the brightness at many points in the picture. In other instances there might be a more direct way of entering something familiar, like letters, into a computer, but it still needs to 'translate' everything into numbers. When you type letters on a keyboard into a word processor program, the letters are translated into digits. That is, again, because the computer is only capable of handling digits. Thus, for the letters you type to be handled by the computer, each letter is assigned a number internally. Now, let us return to the non-computerized world around us.

Our analogue world

The everyday world around us is a continuous world. What does that mean? For our purpose, since we are interested in counting, or quantifying, it means that measured numbers can in principle have any value. If I measure the angle of a slice of cake, that angle can in principle be any value between 0 and 360 degrees. And the value could in principle be measured with any degree of precision I wanted, as long as I measured accurately enough (and allowing for a sufficiently capable measuring instrument). Remember, this is just a thought experiment; we need not concern ourselves with the obstacles of practicality. The point is that in our everyday world, measurements can be as accurate as we care to make them. (Footnote: Some of you might protest, saying that if I for instance measure the length of an iron rod, the accuracy I can get is limited by the size of an iron atom: the rod is either a gazillion atoms in length, or a gazillion and one atoms, and no in-between values are allowed. True. However, that kind of atom-splitting does not invalidate the argument, because the accuracy in that case is still many, many times higher than what we will get in any computer, as we shall see.)

The computer as an inaccurate approximator

With the hype the computer in general receives these days, this heading might surprise some. But that is exactly what a computer is when it comes to representing the real world. Why? We know that a computer must represent any piece of information it wants to manipulate with one or more numbers. To do that, these numbers must be stored and moved around inside the computer. Thus, each number takes up some storage space inside the computer, and the more digits the number contains (i.e. the more accurate it is, or the bigger it is), the more storage is needed for it. The two biggest of these storage spaces are known as the hard drive and the memory (RAM) of the computer.
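As a small aside before we draw the consequence of this, the point above that each letter is assigned a number internally is easy to see for yourself. Here is a minimal sketch in Python (an illustrative choice of language, not something this chapter prescribes); the particular numbers follow the ASCII/Unicode convention used by most computers today:

```python
# Minimal sketch: the number a letter is assigned internally (here following
# the ASCII/Unicode convention used by most computers), and the mapping back.
for letter in "Hi":
    print(letter, ord(letter))   # prints: H 72, then i 105

print(chr(72), chr(105))         # from numbers back to letters: H i
```

That is all a key press amounts to inside the machine: a number, which in turn has to be stored somewhere.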
Since the computer does not have an infinite amount of storage space, it follows that it cannot handle, and therefore cannot represent, infinitely accurate representations of the real world. And since there is an upper limit to the accuracy, we might not be able to get the computer to represent what we put into it, be it a scanned image or a digitized video, with as much accuracy as we would like. We therefore inevitably lose some quality when we digitize a piece of analogue information. To see precisely why, and how, we have to step back for a moment and learn something about the way the computer manipulates numbers.

The 2-digit system

The above is a general argument, and we have so far avoided saying anything specific about the actual numbers the computer uses to represent things that are familiar to us, like a letter, a word or an image. The computer world is shot through with words, terms and methods that originate from the 2-digit number system that computers use, so it is worth knowing a thing or two about this system.

In the western world the prevalent number system is the 10-digit system. It uses the ten symbols 0 through 9, alone or put together, to represent any whole number. If we allow for a decimal point, it can also be used to represent non-whole numbers like 3.141592654 or 2.718281828459045. If you use this system, you might wonder at this 'constructed' explanation, since the numbers probably feel entirely natural to you. But the 10-digit system is in fact completely arbitrary, and its prevalence today has historical roots. You can make up a fully functional and valid number system using any number of symbols of your liking. One other thing to notice about the 10-digit system is that the placement of digits is significant: 123 is not the same as 321. (Footnote: That is smart, because it allows you to make very large numbers with the same base digits. Roman numerals do not have this virtue. For instance, V is 5 in Roman numerals, but 50 is L. This, together with the fact that the Romans had no symbol for zero, made Roman mathematics a pain in the neck.)

The 2-digit system used in computers today is such an alternate system. It uses only two different digits, 0 and 1. (Footnote: The reason for this is simply that transistors, which form the basis of computer hardware, act as switches, being either on or off. The two states are the natural basis of the two digits. It is also the reason you sometimes see 0 and 1 labeled off and on, or low and high (voltage), in other kinds of computer literature.)

So how do you count in the 2-digit system? The same way you count in the 10-digit system, of course! In the 10-digit system, when you want to count upwards, say, to 99, you start with two symbols, both set to zero (00). Then you take the right-most symbol and let it run through all its possible values from 0 to 9. Then you increase the left-most symbol to its next value and run through the right symbol's values again. This procedure you repeat until both symbols are at 9, that is, until you reach 99. If you want to count beyond 99 you need additional symbols.

To count in the 2-digit system we do exactly the same, only each symbol can take just two different values. Start with the right symbol and let it run through its possible values: 00, 01. Then increase the left symbol to its next value and run through the right symbol's values again: 10, 11. We have now hit the roof of what we can count with two symbols, or digits, in the 2-digit system.
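If you would like the computer to do this kind of counting for you, here is a minimal sketch in Python (again an illustrative choice; the function name count_in_binary is made up for this example) that runs through every pattern of a given number of binary digits, in exactly the order described above:

```python
# Minimal sketch: count upwards in the 2-digit system with a given number of
# digits, printing every pattern and keeping the leading zeros.
def count_in_binary(digits):
    for value in range(2 ** digits):        # there are 2**digits patterns in total
        print(format(value, "0{}b".format(digits)))

count_in_binary(2)   # prints 00, 01, 10, 11 and then stops: the roof with two digits
```

Running it with three digits instead of two is a handy way to check the table exercise that follows.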
As you can see, we have run through four different possibilities, which is the same as saying we can count to 4 (or, counting from zero, from 0 to 3) with two digits in the 2-digit system. To count beyond four, or to represent more than 4 different values of something, we need to add more symbols. The method for counting is the same, of course. I encourage you to try using 3 symbols in the 2-digit system and see how far you can count. With such tasks it can be helpful to set up a table representing the three symbols and their possible values, like this:

[Table: 3 bits logical table]

As you can see, you can count to 8 (or from 0 to 7) with three digits; alternatively, you can use three digits to represent 8 different values of some property, like color. More generally, each time we add one symbol to the number of symbols we have available in the 2-digit system, we double the number of values we can count to, or have represent something. In mathematical terms: if x is the number of symbols we have available, we can represent 2^x values.

Bits and bytes

Until now we have used the terms symbols or digits to talk about the 2-digit system. In computer jargon, it is common to refer to one such symbol as a bit. A bit can be thought of as a storage place in the computer that can hold one single symbol of the 2-digit system: 0 or 1. This is the smallest possible storage cell in a computer. A cell is exactly how a bit is usually depicted: as a square, either empty if the value of the symbol inside it is unknown or unimportant, or filled with a 0 or a 1, like this:

[illustration: bit]

We will from now on drop the talk about symbols and just refer to bits. A bit will mean one single digit of the 2-digit system, residing in a storage cell in a computer. A small b is used as an abbreviation for bit. One byte is eight bits. Several bits together, all of them available for our storage needs, are depicted as a strip of bits:

[Ill: byte, word]

The byte, abbreviated with a capital B, is the most commonly referred-to group of bits, often in tandem with size prefixes like k (kilo, thousand), M (mega, million) or G (giga, billion). Thus, 512 MB RAM means 512 million bytes of Random Access Memory, and a hard disk of 120 GB has room to store 120 billion bytes. A unit not quite that large, but one used a lot in this book, is the single byte. How many different values of something can you represent with one byte? According to the formula given above, the answer is 2^8, which is 256.

Quantifying the real world - example

We now return to the issue of how the computer represents real-world values, and again we use the example of scanning a photograph. To really get a good understanding of what is going on, we shall deconstruct the process in some detail.

Sampling

Before we detail what the scanner actually picks up, we must understand the concept of sampling. To sample means to taste a little bit, and that is exactly what happens. The scanner has a pick-up, a scanner head. When you place the picture in the scanner and press the scan button, the head moves over the picture and samples the color values from all over the picture. More specifically, the scanner head moves a little bit, samples a point in the picture, moves a little bit, samples, moves, samples... and so on, until the entire picture is covered.

[Ill: sampling of a picture]

The spacing between the samples (•) is usually much smaller than indicated in the illustration above, but the principle is the same.
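To make the idea a little more concrete, here is a minimal sketch in Python of sampling along a single line. It is emphatically not how a real scanner or its driver works; the brightness curve and all the numbers are made up purely for illustration:

```python
import math

# Minimal sketch (not how a real scanner or its driver works): sample a
# continuous "brightness along a line" at regular steps, the way a scanner
# head samples a picture at discrete points with unsampled gaps in-between.
def brightness(position_inches):
    # A made-up, perfectly continuous brightness curve standing in for the picture.
    return 0.5 + 0.5 * math.sin(position_inches * 2 * math.pi)

def sample_line(length_inches, samples_per_inch):
    step = 1.0 / samples_per_inch                  # how far the "head" moves each time
    count = int(length_inches * samples_per_inch)  # total number of sample points
    return [brightness(i * step) for i in range(count)]

samples = sample_line(length_inches=1.0, samples_per_inch=8)
print(len(samples))      # 8 samples; whatever lies between them is never recorded
```

Everything the sketch records is a short list of numbers taken at evenly spaced points; the picture in-between those points is simply not there.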
As you can see, the fact that the scanner head moves between every sample means that in-between every sample there is an area that is not sampled. That does not sound very smart: why can the sampling not happen continuously over the entire picture, so that we get a continuous representation of it? Because the computer has no way of doing so (disregarding the mechanical limitations of the scanner itself): it cannot handle continuity. The computer controls the movement of the scanner head, and to have it sample continuously would mean dividing the length the head moves into infinitely many, infinitely small steps. As we have seen, the computer cannot handle infinity, because every value the computer manipulates must be stored in bits and bytes inside it. The bigger the value, the more storage space it takes, and the computer does not have infinite storage space. Therefore, the scanner head must move some small distance between the point samples.

For each sample the color value of that point is recorded. That color value is also subject to limitations, for the same reasons. In the real world, color is a continuous quantity (we can safely ignore any quantum mechanical effects here...). If you give me any two points on a color scale, I can always find a value in-between those two, no matter how close together the two points are. In the computer, we only have a finite number of bits in which to store the color value resulting from a sample. Thus we only have as many different possible color values as the number of bits allows. If we have set aside 1 byte to store the color of each sample, we can only give the color of that sample one of the 256 values we can represent with that byte.

So we see that in scanning a picture, because of the fundamental limitation of the computer in its inability to handle infinity and continuity, we get a representation of the picture which is only an approximation, both spatially (the sampling is done at points, with unsampled spaces in-between) and in terms of color. These limitations result from fundamental characteristics of the computer, and they are therefore present in every kind of situation where we ask the computer to input, or sample, real-world phenomena.

Resolution

So the representation of a scanned picture, or a sampled sound, is imperfect in a computer. But how imperfect? To talk about the degree of closeness to the original, or the fidelity of the sampling, we talk about resolution. In a scan, we have established that the scanner head moves between every sample. (Footnote: In real-life scanners, the scanner head consists of many sample units in a row that spans one axis of the scanner. When scanning, this row samples the picture with all the sample units at once, the head moves a bit along the other axis, all the sample units sample again, and so on.) The spatial resolution of the scan says how much the scanner head moves between every sample. It is usually measured as the number of samples the scanner takes per inch, ppi (points per inch). The smaller the movement between samples, the closer together the sample points are, the more samples are taken per inch, and the better the approximation of the original picture. But at the same time the number of samples taken increases, more storage space is needed in the computer, and the heavier the work becomes for the computer to manipulate all the bytes of the picture afterwards.
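To get a feel for the numbers involved, here is a small worked example, again as a Python sketch; the 6 by 4 inch photo, the 3 bytes of color per sample and the two ppi values are assumptions chosen purely for illustration:

```python
# Minimal sketch with made-up numbers: how much storage a scanned photo needs.
def scan_size_bytes(width_inches, height_inches, ppi, bytes_per_sample):
    samples = (width_inches * ppi) * (height_inches * ppi)   # total sample points
    return samples * bytes_per_sample

# A 6 x 4 inch photo with 3 bytes of color per sample:
print(scan_size_bytes(6, 4, 300, 3))   # 6480000 bytes, roughly 6.5 MB
print(scan_size_bytes(6, 4, 600, 3))   # 25920000 bytes: doubling the ppi quadruples the size
```

Note how doubling the ppi quadruples the amount of data, since the number of samples grows along both axes at once.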
This is the core of the ever-present trade-off that you will meet again and again if you do creative work within multimedia. It is the same with the color resolution: the more bits that are set aside to store the color value of each sample, the better the approximation of the real color, but the more storage space is taken up, and the more work has to be done to manipulate the image on the computer. Usually, between 8 and 48 bits, equivalent to 1 to 6 bytes, are used to store the color per sample. You should check for yourself how many different color values can be represented with each of those two bit counts.

Conclusion

This chapter has laid the foundation for a better, and faster, understanding of a vast range of issues, from networks to multimedia. Remember that the examples discussed here are just that: examples. But the principles of resolution, sampling, and the digital characteristics of the computer are valid throughout the field.