Being digital

The computer is a digital machine, which fundamentally means that it manipulates digits. Grasping what that implies is critical to understanding how a computer works. Not in an electrical or mechanical sense, of course, nor do we need to dissect the logic of a computer in great detail. What we seek is a sufficient understanding of a few key aspects, because to understand them is to understand the fundamental concepts that govern how a computer operates, which in turn has a profound effect on what you put into, and get out of, the computer.
To get started, it helps to ask ourselves about the world around us, especially about how we measure things, i.e. how we quantify them. This is important because the computer also needs to quantify anything we want to put into it. When we scan a picture, we are letting the computer measure, or quantify, certain aspects of the picture, namely the color and brightness at many points in the picture.
In other instances there might be a more direct way of entering something familiar, like letters, into a computer, but it still needs to 'translate' everything into numbers. When you type letters on a keyboard into a word processing program, the letters are translated into digits. That is, again, because the computer is only capable of handling digits. Thus, for the letters you type to be handled by the computer, each letter is assigned a number internally.
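To make this concrete, here is a small sketch in Python (the language choice and the specific characters are just for illustration; Python's built-in ord and chr expose the Unicode code points that characters are mapped to):

    # Each character is stored as a number. ord() shows the number a
    # character is mapped to, and chr() turns a number back into a character.
    for letter in "Hi!":
        print(letter, "->", ord(letter))   # H -> 72, i -> 105, ! -> 33

    print(chr(72))                         # prints 'H' again

The exact numbering scheme a given program uses may differ, but the principle is always the same: every letter becomes a number before the computer can do anything with it.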
Now, let us return to the non-computerized world around us.
Our analogue world
The everyday world around us is a continuous world. What does that mean? For our purpose, since we are interested in counting, or quantifying, it means that measured values can in principle take on any value. If I measure the angle of a slice of cake, that angle can in principle be any value between 0 and 360 degrees. And the value could in principle be measured with any degree of precision I wanted, as long as I measured accurately enough (and allowing for a sufficiently capable measurement instrument). Remember this is just a thought experiment; we need not concern ourselves with the obstacles of practicality. The point is that in our everyday world, measurements can be as accurate as we care to make them. (Footnote: Some of you might protest, saying that if I for instance measure the length of an iron rod, the accuracy I can get is limited by the size of an iron atom: the rod is either a gazillion atoms in length, or a gazillion and one atoms, and no in-between values are allowed. True. However, that kind of atom-splitting does not invalidate the argument, because the accuracy in that case is still many, many times higher than what we will get in any computer, as we shall see.)
The computer as an inaccurate approximator
With the hype the computer generally receives these days, the heading might surprise some. But that is exactly what a computer is when it comes to representing the real world. Why? We know that a computer must represent any piece of information it wants to manipulate with one or more numbers. To do that, these numbers must be stored and moved around inside the computer. Thus, each number takes up some storage space, and the more digits the number contains (i.e. the more accurate it is, or the bigger it is), the more storage it needs. The two biggest of these storage spaces are known as the hard drive and the memory (RAM) of the computer. Since the computer does not have an infinite amount of storage space, it follows that it cannot handle, and therefore cannot represent, infinitely accurate representations of the real world. Since there is an upper limit to the accuracy, we might not be able to get the computer to represent what we put into it, be it a scanned image or a digitized video, with as much accuracy as we would like. We therefore inevitably lose some quality when we digitize a piece of analogue information. To see precisely why, and how, we have to step back for a moment and learn something about the way the computer manipulates numbers.
The 2-digit system
The above is a general argument, and we have so far avoided saying anything specific about the actual numbers the computer uses to represent things that are familiar to us, like a letter, a word or an image.
The computer world is shot through with words, terms and methods that originate from the 2-digit number system computers use. It is therefore worth knowing a thing or two about this system.
In the western world the prevalent number system is the 10-digit system. It uses the ten symbols 0 through 9, alone or put together, to represent any whole number. If we allow for a decimal mark, it can also be used to represent fractions like 3.141592654 or 2.718281828459045. If you use this system, you might wonder at this 'constructed' explanation, since the numbers probably feel entirely natural to you. But the 10-digit system is in fact completely arbitrary, and its prevalence today has historical roots. You can make up a fully functional and valid number system using any number of symbols you like.
One other thing to notice about the 10-digit system is that placement is significant: 123 is not the same as 321. (Footnote: That is smart, because it allows you to make very large numbers with the same base digits. Roman numerals do not have this virtue. For instance, V is 5 in Roman numerals, but 50 is L. This, together with the fact that the Romans had no symbol for zero, made Roman mathematics a pain in the neck.)
The 2-digit system used in computers today is such an alternative system. It uses only two different digits, 0 and 1. (Footnote: The reason for this is simply that transistors, which form the basis of computer hardware, act as switches, being either on or off. These two states are the natural basis of the two digits. It is also the reason you sometimes see 0 and 1 labeled off and on, or low and high (voltage), in other kinds of computer literature.)
So how do you count in the 2-digit system? The same way you count in the 10-digit system, of course! In the 10-digit system, when you want to count upwards, say, to 99, you start with two symbols, both at zero (00). Then you take the rightmost symbol and let it run through all its possible values from 0 to 9. Then you increase the leftmost symbol to its next value and run through the rightmost symbol's values again. This procedure you repeat until both symbols are at 9, that is, 99. If you want to count to more than 99 you need additional symbols.
To count in the 2-digit system we do exactly the same, only each symbol can take just two different values. Start with the right symbol and let it run through its possible values: 00, 01. Then increase the left symbol to its next value and run through the right symbol's values again: 10, 11. We have now hit the roof of what we can count with two symbols, or digits, in the 2-digit system. As you can see, we have run through four different possibilities, which is the same as saying we can count to 4 (or, more often, from 0 to 3) with two digits in the 2-digit system. To count beyond four, or to represent more than 4 different values of something, we need to add more symbols. The method for counting is the same, of course. I encourage you to try using 3 symbols in the 2-digit system and see how far you can count. With such tasks it can be helpful to set up a table representing the three symbols and their possible values, like this:
[Table: the eight possible combinations of three binary digits]

    Left symbol   Middle symbol   Right symbol
         0              0               0
         0              0               1
         0              1               0
         0              1               1
         1              0               0
         1              0               1
         1              1               0
         1              1               1
As you can see, you can count to 8 (or from 0 to 7) with three digits; alternatively, you can use three digits to represent 8 different values of some property, like color. More generally, each time we add one symbol to the number of symbols we have available in the 2-digit system, we double the number of values we can count, or have represent something. In mathematical terms: if x is the number of symbols we have available, we can represent 2^x values.
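A short Python sketch illustrates both points; running through 1 to 8 symbols is just an example:

    # Each added binary digit doubles the number of values we can represent:
    # x digits give 2**x distinct values.
    for x in range(1, 9):
        print(x, "digit(s):", 2 ** x, "values")

    # Listing every pattern of 3 binary digits reproduces the table above:
    for n in range(2 ** 3):
        print(format(n, "03b"))        # 000, 001, 010, ... 111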
Bits and bytes
Until now we have used the terms symbols or digits to talk about the 2-digit system. In computer jargon, it is common to refer to one such symbol as a bit. A bit can be thought of as a storage place in the computer that can hold one single symbol of the 2-digit system: 0 or 1. This is the smallest possible storage cell in a computer. A cell is exactly how a bit is usually depicted: as an unfilled square, either empty if the value of the symbol inside it is not known or unimportant, or filled with a 0 or 1, like this:
[illustration: bit]
We will from now on drop the talk about symbols and just refer to bits. A bit will mean one single digit of the 2-digit system, residing in a storage cell in a computer. A lowercase b is used as an abbreviation for bit.
One byte is eight bits. Several bits together, all of them available for our storage needs, are depicted as a strip of bits:
[Ill: byte, word]
The byte, abbreviated with a capital B, is the most commonly referred-to bit group, often in tandem with size abbreviations like k (kilo, thousand), M (mega, million) or G (giga, billion). Thus, 512 MB RAM means 512 million bytes of Random Access Memory, and a hard disk of 120 GB has room to store 120 billion bytes.
Not quite that large, but much used in this book, is the single byte. How many different values of something can you represent with one byte? According to the formula given above, the answer is 2^8, which is 256.
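If you want to verify this, and the storage sizes mentioned above, a small Python sketch will do; the decimal meanings of the prefixes follow the text above:

    # One byte is 8 bits, so it can hold 2**8 = 256 different values (0-255).
    print(2 ** 8)                       # 256

    # Using k = thousand, M = million and G = billion, as in the text:
    KILO, MEGA, GIGA = 10 ** 3, 10 ** 6, 10 ** 9
    print(512 * MEGA)                   # bytes in 512 MB of RAM
    print(120 * GIGA)                   # bytes on a 120 GB hard disk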
Quantifying the real world - example
We now return to the issue of how the computer represents real-world values, and again we use the example of scanning a photograph. To really get a good understanding of what is going on, we shall deconstruct the process in some detail.
Sampling
Before we detail what the scanner actually picks up, we must understand the concept of sampling. To sample means to taste a little bit, and that is exactly what happens. The scanner has a pick-up, a scanner head. When you place the picture in the scanner and press the scan button, the head moves over the picture and samples the color values from all over the picture. More specifically, the scanner head moves a little bit, samples a point in the picture, moves a little bit, samples, moves, samples... and so on, until the entire picture is covered.
[Ill: sampling of a picture]
The spacing between the samples (•) is usually much smaller than indicated in the illustration above, but the principle is the same. As you can see, the fact that the scanner head moves between every sample means that between every pair of samples there is an area that is not sampled. That does not sound very smart; why can the sampling not happen continuously over the entire picture, so that you get a continuous representation of it? Because the computer has no way to do so (disregarding the mechanical limitations of the scanner itself). It cannot handle continuity. The computer controls the movement of the scanner head. To have it sample continuously would mean dividing the length the head should move into infinitely many, infinitely small steps. As we have seen, the computer cannot handle infinity, because all values the computer manipulates must be stored in bits and bytes inside the computer. The bigger the value, the more storage space it takes, and the computer does not have infinitely much storage space.
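A minimal Python sketch of the idea; the brightness function and the step size are invented purely for illustration:

    import math

    def brightness(x):
        # Stand-in for the continuous picture along one scan line (0.0 to 1.0).
        return 0.5 + 0.5 * math.sin(2 * math.pi * x)

    step = 0.1                          # distance the head moves between samples
    samples = [brightness(i * step) for i in range(11)]
    print(samples)                      # 11 point samples; everything that lies
                                        # between two positions is never recorded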
Therefore: the scanner head must move some small distance between the point samples. For each sample the color value of that point is recorded. That color value is also subject to limitations, for the same reasons. In the real world, color is a continuous quantity (we can safely ignore any quantum mechanical effects here...). If you give me any two points on a color scale, I can always find a value in between those two, no matter how close together the two points are. In the computer, we only have a finite number of bits in which to store the color value resulting from a sample. Thus we only have as many different possible values for the color as the number of bits allows. If we have set aside 1 byte to store the color of each sample, we can only give the color of that sample one of the 256 possible values we can represent with that byte.
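A sketch of that last step, assuming one byte per sample and a color value between 0.0 and 1.0 (the function name quantize is just an illustrative choice):

    def quantize(value, bits=8):
        # Map a continuous value in [0.0, 1.0] onto one of 2**bits levels.
        levels = 2 ** bits              # 256 levels for one byte
        return min(int(value * levels), levels - 1)

    print(quantize(0.5))                # 128
    print(quantize(0.50001))            # 128 -- the tiny difference is lost
    print(quantize(0.6))                # 153

Two colors that differ by less than one level end up stored as exactly the same number.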
So we see that in scanning a picture, because of the computer's fundamental inability to handle infinity and continuity, we get a representation of the picture that is only an approximation, both spatially (the sampling is done at points with unsampled spaces in between) and in terms of color. These limitations result from fundamental characteristics of the computer, and they are therefore present in every situation where we ask the computer to input, or sample, real-world phenomena.
Resolution
So the representation of a scanned picture, or a sampled sound, is imperfect in a computer, but how imperfect? To describe the degree of closeness to the original, or the fidelity of the sampling, we talk about resolution.
In a scan, we have established that the scanner head moves between every sample. (Footnote: In real-life scanners, the scanner head consists of many sample units in a row spanning one axis of the scanner. When scanning, this row samples the picture with all the sample units at once, moves a bit along the other axis, samples with all the sample units again, and so on.) The spatial resolution of the scan says how much the scanner head moves between every sample. It is usually measured in the number of samples the scanner takes per inch, ppi (points per inch). The smaller the movement between samples, the closer together the sample points are, the more samples are taken per inch, and the better the approximation of the original picture. But at the same time the number of samples taken increases, more storage space is needed in the computer, and the heavier the work becomes for the computer to manipulate all the bytes of the picture afterwards. This is the core of the ever-present trade-off that you will meet again and again if you do creative work within multimedia. It is the same with the color resolution: the more bits that are set aside to store the color value of each sample, the better the approximation of the real color, but the more storage space is taken, and the more work has to be done to manipulate the image on the computer. Usually, between 8 and 48 bits, equivalent to 1 to 6 bytes, are used to store the color per sample. You should check for yourself how many different color values can be represented with those two numbers of bits.
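To get a feel for the numbers, here is a rough Python sketch; the photo size, the ppi values and the bit depths are example figures only:

    # Storage needed for a scan grows with both the spatial resolution (ppi)
    # and the color resolution (bits per sample).
    def scan_size_bytes(width_in, height_in, ppi, bits_per_sample):
        samples = (width_in * ppi) * (height_in * ppi)
        return samples * bits_per_sample // 8

    print(scan_size_bytes(6, 4, 300, 24))   # a 6x4 inch photo at 300 ppi
    print(scan_size_bytes(6, 4, 600, 24))   # doubling the ppi quadruples the size

    print(2 ** 8, 2 ** 48)                  # color values with 8 and 48 bits per sample

The last line also lets you check the question posed just above.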
Conclusion
This chapter has laid the foundation for a better, and faster, understanding of a vast range of issues, from networks to multimedia. Remember that the examples discussed here are just that, examples; the principles of resolution, sampling, and the digital characteristics of the computer are valid throughout the field.