Slides 3 - USC Upstate: Faculty

SCSC 311 Information Systems:
hardware and software
Chapter 3 Objectives

Numbering systems

Various data representation methods

The representation of nonnumeric data

Data structures
Data Representation and Processing
Capabilities required of any (mechanical,
electrical, or optical) information processor:

Recognizing external data and converting it to an
appropriate internal format
Storing and retrieving data internally
Transporting data among internal storage and
processing components
Manipulating data to produce desired results or
decisions
Goals of Computer Data Representation (1)

Compactness
The number of bits used to represent a numeric value
 The more compact a data representation format, the less expensive it is to
implement in computer hardware.
Q: Which format is more compact, binary or decimal?


Range
In a given data representation, the more bits used, the larger the range.
But a large numeric range has a cost …

Accuracy
Precision of representation increases with the number of data bits used
In some cases, quantities must be manipulated and stored as
approximations → a degree of error → compounded errors

Optimum coding method for each type of data or each type of
operation.

Because the number of bits is limited (fixed) in any hardware, we need to
find the optimal tradeoff between range and accuracy. How?
Goals of Computer Data Representation (2)

Ease of manipulation
Refers to machine efficiency when executing processor instructions
 The data representation format determines the complexity of the circuitry.
Discussion: the complexity of the circuitry when using decimal / binary
data representation.


Standardization
Ensures correct and efficient data transmission among computer
systems
 Various organizations have created standard data encoding
methods, which provide flexibility to combine hardware from
different vendors with minimal data communication problems
e.g. ASCII, Unicode

Q: Why do current electronic computers represent
data using binary format?
Ans:
 Binary numbers represented as electrical signals can be
reliably transported among computer system
components;
 Binary numbers represented as electrical signals can be
processed by two–state electrical devices that are
relatively easy to design and fabricate;
 Binary numbers correspond directly with values in
Boolean logic.
Automated Data Processing
Computers represent data electrically and process
it with electrical switches


Two-state (ON/OFF) electrical switches are well suited for binary
format (0/1)
Electrical switches → processing circuits → CPU
 Automated data processing combines electronics and
mathematics
A+B=C
Positional Numbering System

Positional Numbering System:
The symbol at each digit and the digit’s position together determine the digit’s value
The value of the entire string is the sum of the values of all
digits within the string.

Key terms: base / radix, radix point
Base / radix: the multiplier that describes the difference between
one position and the next.
Radix point: the point that separates the fractional part of a numeric value
from the whole part
Two common positional numbering systems:


Decimal Notation
 Uses 10 as its base
 10 possible values (0, 1, 2, … 9 ) per digit
Binary Notation
 Uses 2 as its base
 two possible values (0 or 1) per digit
Binary, Decimal Notations
Conversion: binary & decimal
E.g. 1: (101101.101)2 = ( )10
E.g. 2: (45)10 = ( )2
Ans: binary to decimal
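The two conversions above can be checked with a short sketch. The function names `binary_to_decimal` and `decimal_to_binary` are illustrative and do not come from the slides. The first sums digit x 2^position on both sides of the radix point; the second uses repeated division by 2.

```python
def binary_to_decimal(bits: str) -> float:
    """Sum digit * 2**position for every digit of a binary string."""
    whole, _, fraction = bits.partition(".")
    value = 0.0
    # Whole part: the rightmost digit has position 0, then 1, 2, ...
    for position, digit in enumerate(reversed(whole)):
        value += int(digit) * 2 ** position
    # Fractional part: the first digit after the radix point has position -1.
    for position, digit in enumerate(fraction, start=1):
        value += int(digit) * 2 ** -position
    return value


def decimal_to_binary(number: int) -> str:
    """Convert a nonnegative integer by repeated division by 2."""
    if number == 0:
        return "0"
    digits = []
    while number > 0:
        number, remainder = divmod(number, 2)
        digits.append(str(remainder))  # remainders come out low digit first
    return "".join(reversed(digits))


print(binary_to_decimal("101101.101"))  # 45.625
print(decimal_to_binary(45))            # 101101
```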
Other data formats: Hexadecimal & Octal
(self-study)

Hexadecimal
Uses 16 as its base
Often used to designate memory addresses
The primary advantage of hexadecimal notation, as
compared to binary notation, is its compactness.

Octal
Uses 8 as its base
Expresses large numeric values in:
One-third the length of corresponding binary notation
Four-thirds the length of corresponding hexadecimal
notation
Hexadecimal Notation (self-study)
Octal to Decimal (self-study)
(321)8 = ( )10
Position weights: 8^2, 8^1, 8^0
3 x 8^2 = 3 x 64 = 192
2 x 8^1 = 2 x 8 = 16
1 x 8^0 = 1 x 1 = 1
Sum: (209)10

Hexadecimal to Decimal (self-study)
(ABC)16 = ( )10
Position weights: 16^2, 16^1, 16^0
A x 16^2 = 10 x 256 = 2560
B x 16^1 = 11 x 16 = 176
C x 16^0 = 12 x 1 = 12
Sum: (2748)10
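The same positional rule works for any base. The sketch below is one way to write it, with illustrative helper names (`to_decimal`, `from_decimal`) that are not part of the course material. Printing one value in bases 2, 8, and 16 also shows how much shorter octal and hexadecimal strings are than binary.

```python
DIGITS = "0123456789ABCDEF"


def to_decimal(text: str, base: int) -> int:
    """Interpret a digit string in the given base (2 to 16)."""
    value = 0
    for symbol in text.upper():
        digit = DIGITS.index(symbol)
        if digit >= base:
            raise ValueError(f"{symbol!r} is not a valid base-{base} digit")
        # Shift what we have one position left, then add the new digit.
        value = value * base + digit
    return value


def from_decimal(number: int, base: int) -> str:
    """Write a nonnegative integer in the given base (2 to 16)."""
    if number == 0:
        return "0"
    symbols = []
    while number > 0:
        number, remainder = divmod(number, base)
        symbols.append(DIGITS[remainder])
    return "".join(reversed(symbols))


print(to_decimal("321", 8))    # 209
print(to_decimal("ABC", 16))   # 2748
for base in (2, 8, 16):
    print(base, from_decimal(2748, base))
```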
CPU Data Types

Primitive data types






Integer
Real number
Character
Boolean
Memory address
Representation format for each type balances:
compactness, accuracy, ease of
manipulation, and standardization
Integers

An integer is a whole number: a value that does
not have a fractional part

Most CPUs provide an unsigned integer data type
Stores positive integers as ordinary binary numbers

Other binary notations for signed integers:
Excess notation
Two’s complement notation
Excess Notation

Excess Notation can be used to represent
signed integers

Divides a range of binary numbers in half
lower half for negative values
 upper half for nonnegative values
(as shown in figure)


The leftmost bit represents the sign (1 for
nonnegative and 0 for negative values)
Excess Notation

To represent a specific integer value in excess
notation, you must know how many bits are to
be used.

Range: from -2^(n-1) to 2^(n-1) – 1
Exercise: In 8-bit excess notation,
e.g.1: (25)10 = ( )2
e.g.2: (-25)10 = ( )2
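A minimal sketch for checking the exercise, assuming the excess value (bias) is 2^(n-1), which matches the range and sign-bit rule stated above. The helper names `to_excess` and `from_excess` are illustrative.

```python
def to_excess(value: int, bits: int = 8) -> str:
    """Encode a signed integer in excess notation with bias 2**(bits - 1)."""
    bias = 2 ** (bits - 1)
    if not -bias <= value <= bias - 1:
        raise ValueError(f"{value} does not fit in {bits}-bit excess notation")
    # Adding the bias maps the signed range onto 0 .. 2**bits - 1.
    return format(value + bias, f"0{bits}b")


def from_excess(pattern: str) -> int:
    """Decode an excess-notation bit string back to a signed integer."""
    bias = 2 ** (len(pattern) - 1)
    return int(pattern, 2) - bias


print(to_excess(25))    # 10011001
print(to_excess(-25))   # 01100111
```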
Two’s Complement Notation
Two’s complement:



Nonnegative integer = ordinary binary
value
Negative integer = complement of the positive
binary value + 1
Range: from -2^(n-1) to 2^(n-1) – 1
Exercise: in 8-bit two’s complement
notation
e.g.1 (25)10 = ( )2
e.g.2 (-25)10 = ( )2
e.g.3 (0000 1111)2 = ( )10
e.g.4 (1111 1011)2 = ( )10
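The four exercises can be checked with a sketch like the following. It follows the "complement, then add 1" rule stated above; the function names are illustrative.

```python
def to_twos_complement(value: int, bits: int = 8) -> str:
    """Encode a signed integer as a two's complement bit string."""
    if not -(2 ** (bits - 1)) <= value <= 2 ** (bits - 1) - 1:
        raise ValueError(f"{value} does not fit in {bits} bits")
    if value >= 0:
        return format(value, f"0{bits}b")
    positive = format(-value, f"0{bits}b")
    # Flip every bit, then add 1 and keep only the low `bits` bits.
    complement = "".join("1" if bit == "0" else "0" for bit in positive)
    return format((int(complement, 2) + 1) % 2 ** bits, f"0{bits}b")


def from_twos_complement(pattern: str) -> int:
    """Decode a two's complement bit string (spaces are ignored)."""
    pattern = pattern.replace(" ", "")
    value = int(pattern, 2)
    # A leading 1 means the value is negative: subtract 2**n.
    if pattern[0] == "1":
        value -= 2 ** len(pattern)
    return value


print(to_twos_complement(25))             # 00011001
print(to_twos_complement(-25))            # 11100111
print(from_twos_complement("0000 1111"))  # 15
print(from_twos_complement("1111 1011"))  # -5
```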
Two’s Complement Notation
Q: Why is two’s complement common in CPU
design?
Ans: Two’s complement is awkward for people,
but it is highly compatible with digital electronic
circuitry

Only two logic circuits required to perform addition on
single-bit values
Adding two's complement numbers requires no special
processing if the operands have opposite signs: the sign of the
result is determined automatically.

Subtraction can be performed as addition of a
negative value
Two’s Complement Notation
e.g. 1: 0000 1111 + 1111 1011
e.g. 2: 0110 0100 - 0001 0110
e.g. 3: 0110 0100 + 0111 0011
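The three examples can be worked through with a small 8-bit simulation. This is a sketch, not a description of any particular CPU: the names `add`, `negate`, and `subtract` are illustrative, and overflow is flagged with the usual rule that two operands of the same sign produce a result of the opposite sign.

```python
BITS = 8
MASK = (1 << BITS) - 1       # 1111 1111, keeps only the low 8 bits
SIGN = 1 << (BITS - 1)       # 1000 0000, the sign bit


def add(a: int, b: int):
    """Add two 8-bit patterns; return (result pattern, overflow flag)."""
    result = (a + b) & MASK  # any carry out of the top bit is discarded
    overflow = (a & SIGN) == (b & SIGN) and (result & SIGN) != (a & SIGN)
    return result, overflow


def negate(a: int) -> int:
    """Two's complement negation: flip the bits and add 1."""
    return (~a + 1) & MASK


def subtract(a: int, b: int):
    """Subtraction performed as addition of the negated operand."""
    return add(a, negate(b))


for label, (result, overflow) in [
    ("e.g. 1", add(0b00001111, 0b11111011)),       # 15 + (-5)
    ("e.g. 2", subtract(0b01100100, 0b00010110)),  # 100 - 22
    ("e.g. 3", add(0b01100100, 0b01110011)),       # 100 + 115
]:
    print(label, format(result, "08b"), "overflow" if overflow else "ok")
```

In the third example the true sum, 215, is outside the 8-bit range of -128 to 127, so the stored pattern reads as a negative number and the overflow flag is set.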
Range and Overflow

Overflow




Occurs when the absolute value of a computational result
contains too many bits to fit into a fixed-width data
format
Range: -2^(n-1) to 2^(n-1) - 1
Treated as an error by the CPU

Avoiding overflow
Double precision data formats: combine two adjacent
fixed-length data items to hold a single value
e.g. long integer
Careful programming
Real Numbers


Real Numbers contain both whole and fractional
components
Require separation of components to be represented
within computer circuitry
 Fixed radix point notation
(simple but inflexible)

Floating point notation
(complex but flexible)
Floating Point Notation

Scientific notation: a x 10^b
exponent b is an integer,
mantissa a is any real number in the range of 1 to 10, excluding
10.
Floating point notation: similar to scientific notation,
except that 2 is the base
value = mantissa x 2^exponent

Trades numeric range for accuracy
Value can have many digits of precision for large or small
magnitudes, but not both simultaneously

Floating point numbers are less accurate and more
difficult to process than two’s complement format
IEEE 32-bit Floating Point Format
• one leading sign bit;
• 8-bit exponent coded in excess notation;
• 23-bit mantissa coded as an ordinary binary number.
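To see the three fields in practice, the sketch below splits a value into sign, exponent, and mantissa using Python's standard `struct` module. Two details go beyond the slide and come from the IEEE 754 single-precision format itself: the exponent bias is 127, and normalized values have an implied leading 1 in front of the stored mantissa bits. The function names are illustrative.

```python
import struct


def float32_fields(value: float):
    """Return (sign, exponent, mantissa) of a value stored as 32-bit float."""
    # Pack as a big-endian 32-bit float, then reread the same 4 bytes
    # as an unsigned 32-bit integer so the bits can be masked.
    (raw,) = struct.unpack(">I", struct.pack(">f", value))
    sign = raw >> 31                 # 1 bit
    exponent = (raw >> 23) & 0xFF    # 8 bits, stored with a bias of 127
    mantissa = raw & 0x7FFFFF        # 23 bits
    return sign, exponent, mantissa


def rebuild(sign: int, exponent: int, mantissa: int) -> float:
    """Recompute the value from its fields (normalized numbers only)."""
    return (-1) ** sign * (1 + mantissa / 2 ** 23) * 2.0 ** (exponent - 127)


print(float32_fields(1.0))    # (0, 127, 0)
print(float32_fields(-6.5))   # (1, 129, 5242880)
print(rebuild(1, 129, 0x500000))  # -6.5
```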
Range, Overflow, and Underflow

Range: limited by number of bits in a floating point string
& formats of mantissa and exponent fields
The number of bits in the mantissa → the number of significant
digits
The number of bits in the exponent → the number of possible bit
positions to the right / left of the radix point.

Overflow: occurs within the exponent
Large positive exponent → floating point number with large
absolute value

Underflow: occurs when absolute value of a negative
exponent is too large to fit within allocated bits
Examples …
Precision and Truncation

Precision
Accuracy is reduced as the number of digits available
to store the mantissa is reduced
more bits in the exponent part → a larger range

Truncation
Stores numeric value in the mantissa until available
bits are consumed; discards remaining bits
More values have non-terminating representations in
binary than decimal. E.g. (0.1)10 → (?)2
Causes an error or approximation

Problem: when truncated values are used as input to a
computation, approximations can be magnified.
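A short sketch of why (0.1)10 has no exact binary form. The helper `binary_fraction_digits` is illustrative: it produces binary digits by repeated doubling, using integer arithmetic so the demonstration itself is exact. The last lines show the resulting approximation surfacing in ordinary floating point arithmetic.

```python
def binary_fraction_digits(numerator: int, denominator: int, count: int) -> str:
    """First `count` binary digits after the radix point of a fraction < 1."""
    digits = []
    remainder = numerator
    for _ in range(count):
        remainder *= 2                       # shift one binary place left
        digits.append(str(remainder // denominator))
        remainder %= denominator             # keep only the fractional part
    return "".join(digits)


# 1/10 in binary: the group 0011 repeats forever after the first 0.
print("0." + binary_fraction_digits(1, 10, 12))   # 0.000110011001

# 5/8 terminates because the denominator is a power of 2.
print("0." + binary_fraction_digits(5, 8, 6))     # 0.101000

# The stored values are approximations, so the error shows up in arithmetic.
print(0.1 + 0.2 == 0.3)   # False
```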
Processing Complexity

Floating point formats


Although it is optimized for processing efficiency,
floating point notation requires complex processing
circuitry (which translates to a difference in speed)
Programmers should never use real numbers
when an integer will suffice (speed and
accuracy)
e.g. In monetary system,
Character Data

Character data are represented indirectly by defining a
table that assigns numeric values to individual
characters (alphabetic / numeric characters, punctuation
marks, special-purpose symbols)

Common Coding Methods

EBCDIC (Extended Binary Coded Decimal Interchange Code)
developed by IBM in the 1960s

ASCII (American Standard Code for Information Interchange)
 Subset of Unicode
 Defines a number of device control codes (CR, LF …)
 Some limitations
A partial list of ASCII and EBCDIC
(self-study)
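The table lookup can be tried directly in Python, whose standard codecs include `ascii` and `cp037`. Using `cp037` to stand in for EBCDIC is an assumption of this sketch, since EBCDIC exists in several code-page variants. The helper name `code_of` is illustrative.

```python
def code_of(character: str, encoding: str) -> int:
    """Return the numeric code that a one-byte encoding assigns to a character."""
    return character.encode(encoding)[0]


# The same character gets a different number in each coding table.
for character in "A", "a", "0":
    print(character, code_of(character, "ascii"), code_of(character, "cp037"))

# Device control codes: carriage return (CR) and line feed (LF) in ASCII.
print(code_of("\r", "ascii"), code_of("\n", "ascii"))   # 13 10
```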
Unicode (self-study)

ASCII Limitations
Insufficient range
Uses 7-bit code, providing 128 table entries (33 for
device control)
95 printable characters can be represented
An English-based coding method
Unicode


Assigns nonnegative integers to represent individual
printable characters (like ASCII)
Larger coding table than ASCII



Originally used a 16-bit code providing 65,536 table entries; later versions extend beyond 16 bits
Can represent written text from all modern languages
Widely supported in modern software
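A brief sketch of Unicode as a table of nonnegative integers, using Python's built-in `ord` and `chr`. The sample characters are arbitrary choices for illustration. The last character has a code point above 65,535, so it needs two 16-bit units when encoded as UTF-16.

```python
samples = ["A", "\u00e9", "\u20ac", "\U0001F600"]  # A, e-acute, euro sign, an emoji

for character in samples:
    code_point = ord(character)                  # table entry number
    utf16_units = len(character.encode("utf-16-be")) // 2
    print(f"U+{code_point:04X}", utf16_units)

# ASCII characters keep the same numbers in Unicode.
print(ord("A") == "A".encode("ascii")[0])   # True
print(chr(0x20AC) == "\u20ac")              # True
```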
Boolean and Memory Address

Boolean



Has only two data values—true and false
The most concise coding format; only a single bit is
required
Memory Address


Primary memory is a series of contiguous bytes of
storage
A memory address is the unique identifying number of
a memory byte
Memory Address
Two memory address models:
 Flat memory addresses
Memory bytes are identified by a series of nonnegative numbers
Minimizes the complexity of processor circuitry
 Segmented memory addresses
Uses multiple integers as memory addresses
e.g. segmented memory model in IBM PC
Pages are identified by sequential nonnegative integers;
Each byte in a page is identified by a sequential nonnegative
integer.
Therefore, each byte of memory has a two-part address: page
number and byte number in the page → a new data type is required.
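The two-part address can be sketched as a pair of integers. The page size below (65,536 bytes) and the names `PAGE_SIZE`, `to_paged`, and `to_flat` are assumptions for illustration only. The sketch models fixed-size, non-overlapping pages as described above and is not a faithful model of any specific processor's segmentation scheme.

```python
PAGE_SIZE = 65536   # hypothetical page size in bytes


def to_paged(flat_address: int, page_size: int = PAGE_SIZE):
    """Split a flat address into (page number, byte number within the page)."""
    return divmod(flat_address, page_size)


def to_flat(page: int, offset: int, page_size: int = PAGE_SIZE) -> int:
    """Combine a two-part address back into a single flat address."""
    if not 0 <= offset < page_size:
        raise ValueError("byte number must fall inside the page")
    return page * page_size + offset


print(to_paged(70000))     # (1, 4464)
print(to_flat(1, 4464))    # 70000
```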
Data Structures

Data structures are related groups of primitive
data elements organized for a type of common
processing


Are defined and manipulated within software
Commonly used data structures: array, linked list,
record, table, file, index, and object


Some data structures are supported by system software:
string, record, file
Other data structures are usually supported by
programming languages: array, indexed file, database
structures
Pointers and Address

Pointer: data element that contains the address
of another data element

Many data structures use pointers to link primitive data
components

Address: location of a data element within a storage
device
Array and List


List: A set of related data values
Array: An ordered list in which each element can
be referenced by an index to its position
e.g. A character array
Linked Lists (self-study)

Data structures that use pointers so list elements
can be scattered among nonsequential storage locations

Singly linked lists
Doubly linked lists
Circular linked lists
Etc.

Easier to expand or shrink than an array
A Linked List in RAM
(self-study)
e.g. Insert a new element in a Singly
Linked List (self-study)
Q:
How do you insert a new
element in an array (in
contiguous memory)?
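One way to compare the two insertions is sketched below. The class `Node` and the functions `insert_after` and `array_insert` are illustrative names. In the linked list only one pointer changes; in the array every element after the insertion point is shifted one slot to make room.

```python
class Node:
    """One element of a singly linked list: a value plus a pointer."""

    def __init__(self, value, next_node=None):
        self.value = value
        self.next = next_node


def insert_after(node: Node, value) -> None:
    """Link a new node in right after `node`; no other element moves."""
    node.next = Node(value, node.next)


def to_list(head: Node) -> list:
    """Follow the pointers from the head and collect the values."""
    values = []
    while head is not None:
        values.append(head.value)
        head = head.next
    return values


def array_insert(array: list, index: int, value) -> None:
    """Insert into a contiguous array by shifting later elements right."""
    array.append(None)                       # grow by one slot
    for i in range(len(array) - 1, index, -1):
        array[i] = array[i - 1]              # shift right, from the end
    array[index] = value


head = Node("A", Node("B", Node("D")))
insert_after(head.next, "C")                 # insert C after B
print(to_list(head))                         # ['A', 'B', 'C', 'D']

letters = ["A", "B", "D"]
array_insert(letters, 2, "C")
print(letters)                               # ['A', 'B', 'C', 'D']
```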
Records

Record


Data structures composed of other data structures or
primitive data elements
Used as a unit of input and output to files
File and Methods of Organizing Files

Files


Sequence of records on secondary storage
Two Methods of Organizing Files

Sequential File


Stores records in contiguous storage locations
Indexed File


An array of pointers to records
Efficient record insertion, deletion, and retrieval
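A minimal in-memory sketch of the indexed-file idea, with made-up sample records and illustrative names (`records`, `index`, `find`, `insert`). The records sit in arbitrary storage order, and a separate sorted index holds (key, position) pairs that act as the pointers. A real indexed file would store byte offsets on secondary storage rather than list positions.

```python
import bisect

# Hypothetical records, stored in arrival order rather than key order.
records = [
    {"id": 30, "name": "Cruz"},
    {"id": 10, "name": "Adams"},
    {"id": 20, "name": "Baker"},
]

# The index: a sorted array of (key, position of the record in storage).
index = sorted((record["id"], position) for position, record in enumerate(records))


def find(key: int):
    """Binary-search the index, then follow the pointer to the record."""
    i = bisect.bisect_left(index, (key, -1))
    if i < len(index) and index[i][0] == key:
        return records[index[i][1]]
    return None


def insert(record: dict) -> None:
    """Append the record to storage; only the index is kept in key order."""
    records.append(record)
    bisect.insort(index, (record["id"], len(records) - 1))


insert({"id": 15, "name": "Allen"})
print(find(20))                     # {'id': 20, 'name': 'Baker'}
print([key for key, _ in index])    # [10, 15, 20, 30]
```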
An Indexed File
Classes and Objects

Classes (self-study)
Data structures that contain:
traditional data elements (static part)
methods that manipulate data elements (dynamic part)
Related data items & methods that manipulate the data items

Objects
One instance, or variable, of the class
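A small sketch of the class / object distinction. The class `Account` and its fields are hypothetical examples, not taken from the slides. The class bundles data elements with the methods that manipulate them, and each object is a separate instance with its own data.

```python
class Account:
    """A class: data elements plus the methods that manipulate them."""

    def __init__(self, owner: str, balance_cents: int = 0):
        # Data elements (the static part).
        self.owner = owner
        self.balance_cents = balance_cents

    def deposit(self, cents: int) -> None:
        # A method (the dynamic part) that changes this object's data.
        self.balance_cents += cents


# Two objects: separate instances of the same class.
first = Account("Ada")
second = Account("Grace", 500)
first.deposit(250)

print(first.owner, first.balance_cents)     # Ada 250
print(second.owner, second.balance_cents)   # Grace 500
```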
Summary

Understanding data representation is key to
understanding hardware and software
technology

How data is represented and stored in computer
hardware (e.g. integer, floating, …)

How data types are used as building blocks to
create more complex data structures
(e.g., arrays, records, files, …)
Mock Quiz
True / false

A cluster is a group of similar or identical computers, connected by
a high speed network, that cooperate to provide services or
execute a common application.

A doubly linked list stores one pointer with each list element.
Multiple choice

The Babbage difference engine is an example of what type of
computing device?
a. Mechanical
b. Electrical
c. Optical
d. Quantum
Fill in blank
The contents of ____________________ can be accessed by the
CPU more quickly than the contents of primary storage.
Short answer:

What is the difference between an application program and a
systems program?

Convert the decimal number (-34)10 to octal and to 8-bit two’s
complement binary.