Overview of Computer Science

CSC 101 — Summer 2011
Analog, Binary and Digital Concepts
Digitization
Lecture 4 — July 11, 2011
Announcements
• Writing Assignment #1 Due Today.
– Hand it to me after class if you haven’t already
– Make sure you have the electronic copy with you for
lab tomorrow
• Lab#1 is tomorrow (8am)
– Be sure to read the prelab tonight
Objectives
• Analog vs. digital information
• Binary encoding of information – bits and bytes
• Digitization
Processing Data
• For a device to process data, what three steps
are required?
– Input some data
– Process the data
(perform some planned operations on the data)
– Output the results
• A computer is any device that processes data
– Not necessarily only digital data
Analog Information
• Analog information is what we experience directly
– Sights, sounds, textures, smells, tastes, etc.
• Analog info is continuous and infinitely
variable
• Example: monitoring the outside temperature through the day using an analog thermometer
[Figure: continuous temperature curve; time axis (midnight, noon), temperature axis (50° to 80°)]
An Analog Computer
• A very simple analog computer is a
mechanical thermostat
– “Inputs”:
• Measured temperature
• Desired temperature (“setpoint”)
– Executes a simple program:
If temp > setpoint then AC.on
– “Output” is the action of turning the AC on or off
– temp and setpoint are both analog values
• Temperature causes a spring to stretch or shrink
• Setpoint is set by turning a dial
• Both of these are continuous, infinitely variable values
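The thermostat's one-line "program" can be sketched in Python. This is only an illustration: the function name `ac_should_run` and the sample readings are assumptions, and a digital program stores `temp` and `setpoint` as finite-precision numbers rather than truly continuous values.

```python
def ac_should_run(temp: float, setpoint: float) -> bool:
    """Return True when the measured temperature exceeds the setpoint."""
    return temp > setpoint


# Hypothetical readings in degrees Fahrenheit, compared against a 72-degree setpoint.
for temp in (68.0, 72.0, 75.5):
    state = "on" if ac_should_run(temp, setpoint=72.0) else "off"
    print(f"temp={temp} -> AC {state}")
```

The output is itself a single two-state value: the AC is either on or off.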
Digital Information
• Digital information is discrete
– Definite, distinct, precise
– Enumerable (countable)
– Finite
• Example: measuring temperature with a digital thermometer (display reads 56.5 °F)
Time        Temperature
12:00 AM    56.5°
12:30 AM    54.9°
1:00 AM     54.0°
1:30 AM     53.5°
2:00 AM     53.3°
2:30 AM     53.1°
3:00 AM     53.0°
…           …
Analog vs. Digital Information
• Advantages of digital information:
– Efficient storage and transfer
– Unlimited exact replication (every copy is identical to the original)
– Can be compressed
– Easily manipulated
• Editing, combining, etc.
• We don’t use many analog computers today
– Digital computers give us all the advantages of being
able to process digital information
Bits and Bytes
• Computers contain lots of on/off switches
– A relay, vacuum tube, or transistor acts like a switch – either on or off
– Let’s say a switch that is on represents the digit 1 and off represents 0
• Digital computers represent all data using only 1s and 0s
– Each of the billions of transistors in a computer is either on or off
• A single digit (1 or 0) is called a bit (binary digit)
• A bit is the smallest possible amount of information
– Like an ‘atom’ of data
• One bit provides only a minimum amount of data:
– 1 or 0; Yes or No; On or Off; Up or Down; Stop or Go …
any two-state value
– Anything beyond a simple two-state value requires more than one bit
Bits and Bytes
• A single light bulb is one bit of
information – on or off; yes or no
• The light gives the answer (yes or no), but you need to know the question
– “One if by land, two if by sea…”
Bits and Bytes
• A single light bulb is one bit of
information – on or off
• But a whole bunch of light bulbs, arranged in a proper pattern, can give lots of information (such as a scoreboard), even though each light is only on or off
Bits and Bytes
• A bit is the smallest possible amount of
information – yes/no, on/off, 0/1, etc.
• One bit doesn’t give us much information, but many bits together can give much more
– An image (maybe on a scoreboard)
– Words
– Sounds
– Numbers other than 0 or 1
• How can we represent numbers using bits?
Bits and Bytes
• One bit can represent only 2 things
– on or off, yes or no, 0 or 1
• Two bits can represent 4 things
– There are 4 different patterns: 00, 01, 10, 11
1-bit Binary    Decimal
0 (off)         0
1 (on)          1
2-bit Binary    Decimal
00              0
01              1
10              2
11              3
Bits and Bytes
• One bit can represent only 2 things
– on or off, yes or no, 0 or 1
• Two bits can represent 4 things
– There are 4 different patterns: 00, 01, 10, 11
• Eight bits can represent 256 things
– There are 256 different patterns possible with eight bits
• A group of 8 consecutive bits is called a byte
8-bit Binary    Decimal
00000000        0
00000001        1
00000010        2
00000011        3
00000100        4
00000101        5
…               …
11111110        254
11111111        255
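The pattern counts above follow from the fact that each extra bit doubles the number of patterns, so n bits give 2^n of them. A minimal sketch that lists the patterns and reads one as an ordinary (unsigned) number; the helper names `bit_patterns` and `to_decimal` are illustrative, not part of the lecture:

```python
from itertools import product


def bit_patterns(n_bits: int) -> list:
    """All 2**n_bits patterns of 0s and 1s, in counting order."""
    return ["".join(bits) for bits in product("01", repeat=n_bits)]


def to_decimal(pattern: str) -> int:
    """Interpret a string of 0s and 1s as an unsigned binary number."""
    value = 0
    for bit in pattern:
        # Shift the running value one binary place left, then add the new bit.
        value = value * 2 + int(bit)
    return value


for n in (1, 2, 8):
    print(n, "bits ->", len(bit_patterns(n)), "patterns")

print(bit_patterns(2))
print(to_decimal("11111110"))
```

Python's built-in `int("11111110", 2)` does the same conversion; the loop is spelled out here only to show the place-value idea.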
Bits and Bytes
• Bytes are usually grouped for convenience
– 1 typed character is (usually) 1 byte
– 1 KB (kilobyte) is about 1,000 bytes (actually 1,024 = 2^10)
• A single typed manuscript page is about 1,500 characters—about 1.5 KB
– 1 MB (megabyte) is about 1,000 KB, or a million bytes
– 1 GB (gigabyte) is about 1,000 MB, or a billion bytes
• The WFU T60 ThinkPad has 1 GB of RAM and a 100-GB hard disk
• 100 GB is about 100,000,000 typed pages
– 1 TB (terabyte) is about 1,000 GB, or a trillion bytes
• 1 TB of data, if on typed pages of paper would be a stack of paper 50 miles high
• The print collection of the Library of Congress is about 10 TB
Bits and Bytes
– 1 PB (petabyte) is about 1,000 TB (1,000,000,000,000,000 bytes)
• As typed pages: a stack of paper more than 6 times the diameter of the Earth, or about 1/5th the distance to the Moon!
• All material ever printed on paper is estimated to be about 200 petabytes
• Google processes many petabytes of data each day
(http://portal.acm.org/citation.cfm?doid=1327452.1327492)
– 1 EB (exabyte) is about 1,000 PB (1,000,000,000,000,000,000 bytes)
• All the words ever spoken by any human, ever, would be about 5 EB of text
– Next come the zettabyte, yottabyte, etc.
• Check out “How Much Data is That”: http://www.jamesshuggins.com/h/tek1/how_big.htm
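The unit ladder above can be sketched as a small conversion helper. This is an illustration only: the name `human_readable` is an assumption, and the `base` parameter switches between the round factor of 1,000 and the exact factor of 1,024 mentioned for the kilobyte.

```python
UNITS = ["bytes", "KB", "MB", "GB", "TB", "PB", "EB"]


def human_readable(n_bytes: int, base: int = 1000) -> str:
    """Express a byte count in the largest unit that keeps the number >= 1."""
    size = float(n_bytes)
    for unit in UNITS:
        # Stop at the first unit where the value drops below one step,
        # or at the last unit in the list.
        if size < base or unit == UNITS[-1]:
            return f"{size:.1f} {unit}"
        size /= base


CHARS_PER_PAGE = 1500  # from the slide: about 1,500 characters per typed page

print(human_readable(CHARS_PER_PAGE))
print(human_readable(2**10, base=1024))
print(100 * 1000**3 // CHARS_PER_PAGE, "typed pages fit in 100 GB")
```

With 1,500 one-byte characters per page, the last line gives roughly 67 million pages, the same order of magnitude as the round figure quoted on the slide.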
Origin of the Term Byte
• “…The term byte was coined by Werner Buchholz, a researcher at IBM, in 1956 during the early design phase for the
IBM Stretch computer (the company’s first supercomputer). It was a modification of the word bite that was intended to
avoid accidentally misspelling it as bit. …
“The movement toward an eight-bit byte began in late 1956. A major reason that eight was considered the optimal
number was that seven bits can define 128 characters (as against only 64 characters for six bits), which is sufficient for the
approximately 100 unique codes needed for the upper and lower case letters of the English alphabet as well as punctuation
marks and special characters, and the eighth bit could be used as a parity check (i.e., to confirm the accuracy of the other
bits).
“This size was later adopted by IBM's highly popular System/360 series of mainframe systems [1964] and this was a
key factor in its eventually becoming the industry-wide standard. …”
— From http://www.linfo.org/byte.html
• “Half of an eight-bit byte (four bits) is sometimes called (playfully) a nibble (sometimes spelled nybble) or more
formally a hex digit. The nibble is often called a semioctet in a networking or telecommunication context and also by some
standards organisations.
“The eight-bit byte is often called an octet in formal contexts such as industry standards, as well as in networking and
telecommunication. This is also the word used for the eight-bit quantity in many non-English languages, where the pun on
bite does not translate. …”
From http://www.wordiq.com/definition/Byte
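The parity check mentioned in the first quotation can be illustrated with a short sketch. Several choices here are assumptions rather than anything the quotation specifies: even parity, the parity bit placed last, and 7-bit ASCII for the character code.

```python
def add_even_parity(seven_bits: str) -> str:
    """Append a parity bit so the 8-bit result has an even number of 1s."""
    parity = seven_bits.count("1") % 2
    return seven_bits + str(parity)


def parity_ok(byte: str) -> bool:
    """True when the byte still has an even number of 1s."""
    return byte.count("1") % 2 == 0


code = format(ord("A"), "07b")   # 7-bit code for 'A' (ASCII assumed)
byte = add_even_parity(code)     # 8 bits: 7 data bits plus 1 parity bit
print(code, byte, parity_ok(byte))

corrupted = "0" + byte[1:]       # simulate an error: first bit changed from 1 to 0
print(corrupted, parity_ok(corrupted))
```

A single flipped bit changes the count of 1s from even to odd, so the check fails. Two flipped bits would cancel out and go unnoticed, which is why parity is only a basic safeguard.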
Etymology of Unit Prefixes
#    Prefix   Value    Origin
1.   Kilo     10^3     from Greek khilioi = 1000
2.   Mega     10^6     from Greek megas = great, e.g., Alexandros Megos (Alexander the Great)
3.   Giga     10^9     from Latin gigas = giant
4.   Tera     10^12    from Greek teras = monster
5.   Peta     10^15    from Greek pente = five, because it’s the fifth prefix… peNta – ‘N’ = peta
6.   Exa      10^18    from Greek hex = six, because it’s the sixth prefix… Hexa – ‘H’ = exa
7.   Zetta    10^21    the last letter of the Latin alphabet (similar to the Greek letter Zeta)
8.   Yotta    10^24    the penultimate letter of the Latin alphabet (similar to the Greek Iota)
The first prefix is number-derived; the second, third, and fourth are based on mythology.
The fifth and sixth are just that: fifth and sixth.
With the seventh, another fork has been taken: the seventh and eighth prefixes are
named with letters of the Latin alphabet, starting from the end. An informal
proposal continues backwards through the alphabet, but its names were never adopted
by the General Conference on Weights and Measures (Conférence Générale des Poids et
Mesures, CGPM); in 2022 the CGPM instead adopted ronna (10^27) and quetta (10^30).
The proposed, unofficial names are:
#     Prefix   Value
9.    Xona     10^27
10.   Weka     10^30
11.   Vunda    10^33
12.   Uda      10^36
13.   Treda    10^39
14.   Sorta    10^42
15.   Rinta    10^45
16.   Quexa    10^48
17.   Pepta    10^51
18.   Ocha     10^54
19.   Nena     10^57
20.   Minga    10^60
21.   Luna     10^63
Digital Information
• Digital computers process digital information
• Digital information is discrete; however, natural forms
of information are analog and continuous
• The process of converting information to a digital form is called digitization
• Both discrete and analog information may be digitized
– Information that is already discrete (numbers, text characters,
etc.) is easily represented in a digital form
– Analog information must be converted in some way
Digitizing Analog Information
• Text and numbers are discrete information
– Digitization is simply a matter of conversion from one discrete form to another
• Analog information is continuous (non-discrete)
– Must be transformed into a discrete form for digitizing
• Analog information is digitized in two steps:
1. Sampling: Discrete samples are chosen to represent the continuous data
2. Quantizing: Each sample is assigned a particular number
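The two steps can be sketched for a one-dimensional signal. Everything specific here is an assumption for illustration: the sine wave standing in for an analog signal, the helper names `sample` and `quantize`, and the choice of 8 samples and 8 levels.

```python
import math


def sample(signal, duration: float, n_samples: int) -> list:
    """Sampling: read the continuous signal at n_samples evenly spaced times."""
    return [signal(duration * i / n_samples) for i in range(n_samples)]


def quantize(value: float, low: float, high: float, levels: int) -> int:
    """Quantizing: map a value in [low, high] to an integer 0 .. levels - 1."""
    fraction = (value - low) / (high - low)
    return min(levels - 1, max(0, round(fraction * (levels - 1))))


wave = lambda t: math.sin(2 * math.pi * t)   # stand-in analog signal
samples = sample(wave, duration=1.0, n_samples=8)
codes = [quantize(v, -1.0, 1.0, levels=8) for v in samples]
print(codes)   # eight small integers, each storable in 3 bits
```

After both steps the continuous wave has become a short list of integers, which is exactly the kind of data a digital computer can store as bits.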
Digitizing Analog Information
• An example using an image
1. Sampling
– Choose discrete pixels, or “picture elements”
2. Quantizing
– Assign a number to each pixel
Digitizing Analog Information
• Sample: break up the data
into pixels
• Average the contents of
each pixel
• Quantize: assign a number
to represent the gray
level of each pixel
– (e.g. from 0 – 15,
where 0 = “black”
and 15 = “white”)
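The sample, average, and quantize steps above can be sketched as follows. This is a sketch under stated assumptions: the "analog" image is a hypothetical function `brightness(x, y)` returning a value from 0 (black) to 1 (white), and each pixel's average is estimated from a small grid of points inside it.

```python
def digitize_image(brightness, width: float, height: float,
                   cols: int, rows: int, levels: int = 16,
                   subsamples: int = 4) -> list:
    """Split an analog image into rows x cols pixels and quantize each one."""
    pixel_w, pixel_h = width / cols, height / rows
    image = []
    for r in range(rows):
        row = []
        for c in range(cols):
            # Average the contents of the pixel over a small grid of points.
            total = 0.0
            for i in range(subsamples):
                for j in range(subsamples):
                    x = (c + (i + 0.5) / subsamples) * pixel_w
                    y = (r + (j + 0.5) / subsamples) * pixel_h
                    total += brightness(x, y)
            average = total / subsamples**2
            # Quantize: map [0, 1] onto the integers 0 .. levels - 1.
            row.append(round(average * (levels - 1)))
        image.append(row)
    return image


# Example "analog" image: a left-to-right gradient from black to white.
gradient = lambda x, y: x / 8.0
for row in digitize_image(gradient, width=8.0, height=2.0, cols=8, rows=2):
    print(row)
```

With the default of 16 levels, every pixel becomes a number from 0 to 15, which fits in 4 bits.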
Digitizing Analog Information
• The quality of the digitized
image depends on
– Number/size of pixels
– Number of different levels
used in quantization
• The size of the data file
depends on the same factors
• Tradeoff between image
quality and file size
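The tradeoff can be made concrete with a rough size estimate. The sketch assumes an uncompressed image in which every pixel is stored with the same number of bits; the function name `image_size_bytes` and the example dimensions are illustrative.

```python
def image_size_bytes(cols: int, rows: int, levels: int) -> int:
    """Uncompressed size when every pixel uses one fixed-width code."""
    # Bits needed to tell `levels` different values apart (at least 1).
    bits_per_pixel = max(1, (levels - 1).bit_length())
    total_bits = cols * rows * bits_per_pixel
    return (total_bits + 7) // 8   # round up to whole bytes


for cols, rows, levels in [(16, 16, 2), (16, 16, 16), (640, 480, 16), (640, 480, 256)]:
    size = image_size_bytes(cols, rows, levels)
    print(f"{cols}x{rows}, {levels} levels -> {size} bytes")
```

Both more pixels and more quantization levels improve quality, and both make the file larger.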
Digitizing Analog Data
• Another example: temperature data
• Step 1: sampling
– How many samples do we need?
– Is once a day sufficient?
[Figure: temperature curve; time axis (midnight, noon), temperature axis (50° to 80°); one sample, 73.2°]
Digitizing Analog Data
• How about twice a day?
[Figure: temperature curve; time axis (midnight, noon), temperature axis (50° to 80°); two samples, 66.3° and 72.5°]
Digitizing Analog Data
• How about every two hours?
– More accurate representation
– But, still not complete
[Figure: temperature curve sampled every two hours; time axis (midnight, noon), temperature axis (50° to 80°)]
Digitizing Analog Data
• Adding more samples increases the fidelity
(accuracy) of the representation
– But, still not exactly identical to the analog data
– Still have the tradeoff between data quality and file
size
[Figure: temperature curve over the day; time axis (midnight, noon), temperature axis (50° to 80°)]
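How fidelity improves with more samples can be sketched numerically. The temperature curve below is a made-up smooth function, not the data from the slides, and joining the samples with straight lines is just one simple way to reconstruct the curve for comparison.

```python
import math


def temperature(hour: float) -> float:
    """Hypothetical 'analog' temperature curve in degrees F (an assumption)."""
    return 65.0 + 10.0 * math.sin(2 * math.pi * (hour - 9) / 24)


def max_error(n_samples: int, check_points: int = 1440) -> float:
    """Largest gap between the true curve and a straight-line reconstruction
    built from samples taken n_samples times per day."""
    step = 24 / n_samples
    samples = [temperature(i * step) for i in range(n_samples + 1)]
    worst = 0.0
    for k in range(check_points):
        hour = 24 * k / check_points
        i = min(int(hour / step), n_samples - 1)   # which pair of samples we are between
        frac = (hour - i * step) / step
        estimate = samples[i] + frac * (samples[i + 1] - samples[i])
        worst = max(worst, abs(estimate - temperature(hour)))
    return worst


for n in (1, 2, 12, 48, 288):
    print(f"{n:3d} samples/day -> max error {max_error(n):.3f} degrees")
```

The error shrinks as the sample count grows but never reaches exactly zero, while the amount of data to store grows in proportion to the number of samples.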