Computer Science 1000 Terminology II


Storage

a computer has two primary tasks
    store data
    operate on data
a processor's primary job is to operate on data
    math operations
    move operations
note that processors do have a very small amount of storage
how is the majority of data stored by the machine?

Storage

there are a variety of storage media available for computers:
    RAM
    hard drive
    removable media
these storage types are differentiated by:
    capacity
    price
    latency
first, we should determine what is being stored

Information Storage

ask a non-computer person what their computer stores
    programs/apps/games
    pictures
    songs/videos
    email
    documents (text)
what does it mean to store an object, like a piece of text, in a computer?
    in other words, how is it represented?

Information Storage

consider a notebook (for comparison)
how is a piece of text stored/represented?
    as a set of written symbols
    the set of available symbols depends on your language
    individual symbols can be combined into other objects (e.g. words, sentences)

Information Storage




in a computer, information is stored as a set of bits
a bit is short for binary digit
in simplest terms, a binary digit is either 0 or 1
hence, information stored by a computer is simply a set of 0s and 1s

Information Storage

how does the computer store other information?
    other information is encoded in binary
    the way that information is stored in binary depends on the information type

Information Storage

numbers



people typically use numbers in decimal format
    represented by digits 0-9
any decimal number can be represented in binary form
for example, here are the first 16 integers in binary:

Dec  Bin     Dec  Bin     Dec  Bin     Dec  Bin
0    0       4    100     8    1000    12   1100
1    1       5    101     9    1001    13   1101
2    10      6    110     10   1010    14   1110
3    11      7    111     11   1011    15   1111
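
The slides do not say how the conversion is done. One common method is repeated division by two, sketched below. The function name `to_binary` is made up for this illustration.

```python
def to_binary(n: int) -> str:
    """Return the binary digits of a non-negative integer (no padding)."""
    if n < 0:
        raise ValueError("this sketch only handles non-negative integers")
    if n == 0:
        return "0"
    digits = []
    while n > 0:
        digits.append(str(n % 2))  # the remainder is the next bit, right to left
        n //= 2                    # drop that bit and continue
    return "".join(reversed(digits))


# Print the 16 decimal values alongside their binary form.
for value in range(16):
    print(value, to_binary(value))
```

Python's built-in `format(value, "b")` gives the same digits. The loop is written out only to show the idea.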

Information Storage

numbers – notes

the entire number is typically coded in binary, not each individual digit
    e.g. 49 in binary is 110001, not 1001001
most numbers are stored as a fixed number of bits
    e.g. 32-bit numbers
    each number stored as a 32-bit sequence
    smaller numbers are padded on left with zeroes (like decimal)
    e.g. 14 (1110) as 32-bit number:
    00000000000000000000000000001110
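
A minimal sketch of fixed-width storage, assuming non-negative integers only (negative numbers are a later topic). The helper name `to_fixed_width` is hypothetical.

```python
def to_fixed_width(n: int, bits: int = 32) -> str:
    """Return n as a binary string padded on the left with zeroes to `bits` digits."""
    if n < 0 or n >= 2 ** bits:
        raise ValueError("value does not fit in the requested number of bits")
    return format(n, "b").zfill(bits)


print(to_fixed_width(14))     # 32 characters, ending in 1110
print(to_fixed_width(14, 8))  # 00001110
```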

Information Storage

text


each character in a piece of text has a binary encoding
e.g. ASCII: each character is typically stored as an 8-bit sequence (one byte)
    each character has a unique 8-bit sequence
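
As a rough illustration, the sketch below lists the 8-bit sequence for each character of a short string. ASCII codes run from 0 to 127, so the leading bit of each byte is 0. The function name `text_to_bits` is made up for this example.

```python
def text_to_bits(text: str) -> list[str]:
    """Return one 8-bit string per character, using the ASCII encoding."""
    return [format(byte, "08b") for byte in text.encode("ascii")]


print(text_to_bits("Hi"))  # ['01001000', '01101001']
```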

Information Storage

image
a digital picture is made up of pixels (tiny squares)
each pixel stored as its colour
each colour has a unique binary encoding
images will often indicate their colour depth
    e.g. 24-bit colour uses 24 bits per colour
    example (RGB): pure red:
    111111110000000000000000
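
A small sketch of the 24-bit RGB encoding, assuming 8 bits each for red, green and blue in that order (which matches the pure red example). The name `rgb_to_bits` is hypothetical.

```python
def rgb_to_bits(red: int, green: int, blue: int) -> str:
    """Encode a colour as 24 bits: 8 for red, then 8 for green, then 8 for blue."""
    for channel in (red, green, blue):
        if not 0 <= channel <= 255:
            raise ValueError("each channel must be between 0 and 255")
    return format(red, "08b") + format(green, "08b") + format(blue, "08b")


print(rgb_to_bits(255, 0, 0))  # 111111110000000000000000 (pure red)
```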

Information Storage

context
the previous representation of the colour red is also the binary representation of 16,711,680
so when we see that sequence, how do we determine what kind of data it is?
    it's up to a program to interpret the number
    often, the file type is used as a hint
    different programs will interpret the same sequence differently

example: kev.png
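
To make the point concrete, the sketch below reads the same 24 bits two ways: once as a single integer and once as three colour channels. The variable names are illustrative only.

```python
bits = "111111110000000000000000"

# Interpretation 1: one unsigned integer.
as_number = int(bits, 2)

# Interpretation 2: three 8-bit colour channels (red, green, blue).
raw = as_number.to_bytes(3, "big")
as_colour = tuple(raw)

print(as_number)  # 16711680
print(as_colour)  # (255, 0, 0)
```

The bits never change. Only the program's interpretation does.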

Information Storage

representations


the previous was a brief introduction to how information is encoded, to facilitate understanding of memory and storage
later in the semester, we will consider an entire chapter on how information is stored, with topics like:
    binary representation of negative numbers, and numbers with a decimal point (3.4)
    other text representations (e.g. Unicode)

Information Storage

units and prefixes

byte: 8 bits (typically)
    most storage is measured in bytes, rather than bits
    hence, a 100 byte file would contain 800 bits
bits and bytes are typically abbreviated as b and B
    hence, 80 B = 80 bytes = 640 b = 640 bits

Information Storage – Unit Prefixes
bits and bytes are often abbreviated using SI (metric) prefixes
for example:
    K (kilo) - e.g. kilobyte (KB)
    M (mega) - e.g. megabit (Mb)
    G (giga) - e.g. gigabyte (GB)
    T (tera) - e.g. terabyte (TB)

Information Storage – Units





it is not always clear what the multiplier is
when referring to main memory, we typically use powers of two
    hence, the prefix kilo means multiply by 2^10, and not 1000
    hence, 1 KB = 1024 bytes, 3 KB = 3072 bytes ...
other multipliers:
    mega: 2^20 = 1048576
    giga: 2^30 = 1073741824
when used in this context, known as binary prefixes
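
The binary multipliers can be checked with a few lines of Python. The table and function names below are made up for this sketch.

```python
# Binary prefixes: each step multiplies by 2**10 = 1024.
BINARY_PREFIXES = {"K": 2 ** 10, "M": 2 ** 20, "G": 2 ** 30}


def to_bytes_binary(amount: int, prefix: str) -> int:
    """Convert e.g. (3, 'K') to a byte count using binary prefixes."""
    return amount * BINARY_PREFIXES[prefix]


print(to_bytes_binary(3, "K"))  # 3072
print(to_bytes_binary(1, "M"))  # 1048576
print(to_bytes_binary(1, "G"))  # 1073741824
```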

Information Storage – Units


when referring to other storage types, we typically use powers of 10
    hence, the prefix kilo means multiply by 10^3, like you are used to
    mega: 10^6
    giga: 10^9
    hence, 500 GB = 500,000,000,000 bytes
when used in this context, known as decimal prefixes

Information Storage – Units

the industry is not consistent

when you buy a 4 GB USB key, Windows will often report it as smaller, as it assumes that 4 GB = 4 x 2^30 bytes, while the manufacturer means 4 x 10^9 bytes
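
A rough sketch of where the discrepancy comes from, assuming the manufacturer counts 1 GB as 10^9 bytes and the operating system divides by 2^30. The function name is hypothetical.

```python
def reported_binary_gb(advertised_gb: float) -> float:
    """Size shown by a system using 2**30 bytes per GB, for a drive sold in decimal GB."""
    total_bytes = advertised_gb * 10 ** 9  # assumption: manufacturer uses decimal prefixes
    return total_bytes / 2 ** 30           # assumption: the system uses binary prefixes


print(round(reported_binary_gb(4), 2))    # 3.73
print(round(reported_binary_gb(500), 2))  # 465.66
```

So a "4 GB" key shows up as roughly 3.73 GB, even though no storage is missing.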

Other Interesting Example
http://en.wikipedia.org/wiki/Binary_prefix

Storage Media
now that we know what is being stored, and how to define it, let's consider different ways to store it
types we will consider:
    volatile storage
    persistent storage


Volatile Storage



typically referred to as memory
defined as storage that requires a continuous power source to maintain its state
in other words, when its power source is disconnected, all memory is erased
    and you lose your data
your CPU cache discussed previously would be considered volatile memory
however, RAM is the primary volatile storage on most computers

RAM





Random Access Memory
also referred to as main memory
the location of your program and associated data when your program is running
example: consider a running web browser
stores:
    instructions (for your processor)
    images and text from the webpage
    things that you can't see (e.g. cookies, passwords)

RAM

the most defining feature of a system's main memory is its capacity
    the amount of information that it can store
modern consumer systems typically have 2-16 GB of RAM
    4-8 GB is very common
in 8 GB of RAM, you could store:
    ~4.2 million pages of text (~129 Encyclopaedia Britannica 2010 ed.)
    ~2000 songs
remember: for main memory, 1 GB = 2^30 bytes, not 10^9
http://pc.net/helpcenter/answers/how_much_text_in_one_megabyte
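
The estimates can be reproduced with a back-of-the-envelope calculation. The per-item sizes below (about 2 KB per page of plain text, about 4 MB per song) are assumptions chosen for this sketch, not figures stated in the slides.

```python
RAM_BYTES = 8 * 2 ** 30        # 8 GB of main memory, using binary prefixes

BYTES_PER_PAGE = 2 * 2 ** 10   # assumption: roughly 2 KB of plain text per page
BYTES_PER_SONG = 4 * 2 ** 20   # assumption: roughly 4 MB per song

pages = RAM_BYTES // BYTES_PER_PAGE
songs = RAM_BYTES // BYTES_PER_SONG

print(pages)  # 4194304 (about 4.2 million)
print(songs)  # 2048 (about 2000)
```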

CPU – RAM

Why does RAM capacity affect performance?



recall that RAM stores programs and data
hence, the bigger the RAM, the more programs and data it can store
this means:
    more programs can be loaded into memory at once*
    more data can be stored in main memory (important for large media items like movies)
    certain programs (e.g. newer games) have minimum memory requirements just to run
* this ignores a concept called virtual memory, discussed later

Why Random Access Memory?

named because any location on a RAM chip can be accessed in (nearly) the same amount of time
compare this to sequential access memory
    example: magnetic tape storage
    items directly under the reader can be accessed quite quickly
    feeding the tape to find other locations is extremely slow
    hence, RAM devices are typically much faster

Persistent Storage



sometimes referred to as non-volatile memory
defined as storage that maintains its state even when no power source is connected
in other words, state is maintained between power interruptions
    although there are other potential forms of data corruption
many types of persistent storage
    hard drives
    optical drives
    key drives

Hard Drive



also referred to as hard disk or simply disk
the primary source of persistent storage on modern machines
like RAM:
    can store programs, documents, images, videos, etc
unlike RAM:
    items in persistent storage are typically not in use
    they are loaded into RAM from your hard drive in order to be used

Hard Drive



like RAM, the most defining feature of a hard drive is its capacity
typical consumer hard drives range in size from 500 GB to 4 TB
consider 2 TB of hard disk space:
    ~1 billion pages of text (~30000 Encyclopaedia Britannica)
    ~500000 songs (mp3)
remember: for persistent storage, 1 GB = 10^9 bytes, not 2^30

Hard Drive vs RAM

RAM and hard drives store data in fundamentally different ways
    details beyond scope of the class
one of the ways in which they differ is price
by price, let's consider $/GB (to be fair)
note that certain things can affect this range (e.g. laptop RAM is usually more expensive than desktop RAM)


RAM – Example

8 GB = $50-$60, i.e. $6.25/GB - $7.50/GB

Hard Drive - Example

1 TB = $70-$75, i.e. $0.07/GB - $0.075/GB
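
The $/GB figures follow from dividing price by capacity. The sketch below redoes that arithmetic, treating 1 TB as 1000 GB as the slide does. The function name is illustrative.

```python
def price_per_gb(price_dollars: float, capacity_gb: float) -> float:
    """Return the cost of one gigabyte of storage in dollars."""
    return price_dollars / capacity_gb


ram_low, ram_high = price_per_gb(50, 8), price_per_gb(60, 8)
hdd_low, hdd_high = price_per_gb(70, 1000), price_per_gb(75, 1000)

print(ram_low, ram_high)            # 6.25 7.5
print(hdd_low, hdd_high)            # 0.07 0.075
print(round(ram_high / hdd_high))   # 100, i.e. RAM costs about 100 times more per GB
```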

Hard Drive vs RAM

persistence
    hard drives are persistent, no data is lost when power is disrupted
    RAM is volatile, loss of power = RAM is erased
capacity
    most consumer hard drives: 500GB – 2TB of HD
    most consumer RAM: 2GB – 16GB
price
    hard drives cost pennies per GB of storage
    RAM costs dollars (about 100 times more)
what is the advantage of RAM over an HD?

Hard Drive vs RAM

answer: speed!!
RAM is fast compared to HD
performance measured in two ways
    access time
    transfer rate

Storage – Access Time


time to retrieve a single random piece of data
for modern RAM:
    50 – 150 nanoseconds*
for modern hard drives:
    5 – 15 milliseconds*
hence, RAM is the clear winner
performance can vary depending on how data is accessed**
*http://www.webopedia.com/TERM/A/access_time.html
**http://queue.acm.org/detail.cfm?id=1563874
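
To get a feel for the gap, the sketch below puts both ranges in nanoseconds and compares them. The variable names are made up, and the ranges are the ones quoted above.

```python
NS_PER_MS = 1_000_000  # one millisecond is one million nanoseconds

ram_ns = (50, 150)                          # RAM access time range
hdd_ns = (5 * NS_PER_MS, 15 * NS_PER_MS)    # hard drive access time range

# Best case for the hard drive: fastest disk versus slowest RAM.
smallest_gap = min(hdd_ns) // max(ram_ns)
# Worst case for the hard drive: slowest disk versus fastest RAM.
largest_gap = max(hdd_ns) // min(ram_ns)

print(smallest_gap)  # 33333
print(largest_gap)   # 300000
```

Even in the hard drive's best case, RAM responds tens of thousands of times sooner.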

Storage – Transfer rate


how much data can be transferred in a second
for modern RAM:
    6-17 GB/s
for modern hard drives:
    50-120 MB/s*
again, RAM is the clear winner
better technologies (e.g. SSDs) improve HD performance, but still much slower than RAM
*http://www.storagereview.com/ssd_vs_hdd
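
As a rough illustration, the sketch below estimates how long moving a 1 GB file would take at the slow end of each range. It assumes decimal units (1 GB = 1000 MB) and a sustained rate. The names are hypothetical.

```python
def transfer_seconds(size_mb: float, rate_mb_per_s: float) -> float:
    """Time to move `size_mb` megabytes at a sustained rate, in seconds."""
    return size_mb / rate_mb_per_s


FILE_MB = 1000  # assumption: a 1 GB file, counted with decimal prefixes

hdd_slow = transfer_seconds(FILE_MB, 50)    # hard drive at 50 MB/s
ram_slow = transfer_seconds(FILE_MB, 6000)  # RAM at 6 GB/s = 6000 MB/s

print(hdd_slow)            # 20.0
print(round(ram_slow, 3))  # 0.167
```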

RAM vs. Hard Drive
in summary, RAM has the ability to access and transfer data much quicker
for running programs, it is critical that data latency be minimized
    otherwise, your processor would always be waiting
although more expensive and less spacious, RAM makes your current computer experience possible


Hard Drive – RPMs

one other common feature listed with typical hard drives is their RPMs
    common values: 5400, 7200, 10000
RPM stands for revolutions per minute
basically, more RPMs = better performance
to understand why, we must consider how hard drives are constructed

Hard Drive – Construction




data stored magnetically on platters, which are just smooth round surfaces
data is read/written by the head, which is at the end of the arm mechanism that you see
these platters spin, and the arm moves to a particular location and reads the data that passes under it
hence, the faster it spins, the faster that data can be accessed
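
One way to quantify this is average rotational latency. It is not stated in the slides, but a standard approximation is that the head waits, on average, half a revolution for the data to arrive. The function name is made up for this sketch.

```python
def average_rotational_latency_ms(rpm: float) -> float:
    """Average wait for data to rotate under the head: half a revolution, in ms."""
    ms_per_revolution = 60_000 / rpm  # 60,000 ms in a minute
    return ms_per_revolution / 2


for rpm in (5400, 7200, 10000):
    print(rpm, round(average_rotational_latency_ms(rpm), 2))
# 5400 5.56
# 7200 4.17
# 10000 3.0
```

This covers only the spin. The time for the arm to move (seek time) comes on top of it.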

Hard Drive – SSD
a newer technology than magnetic drives
no moving parts (quiet)
considerable performance improvement over magnetic hard drives
    throughput: 200-500 MB/s
considerably more expensive
    over $1/GB