Chapter 3, Data Representation and Processing

Chapter 3, SA
Data Representation and Processing
Data and information processors must
be able to:
• Recognize external data and convert it to an
appropriate internal format
• Store and retrieve data internally
• Transport data among internal storage and
processing components
Binary Representation of Data
• Computers represent data using binary
numbers.
• Binary digits correspond directly to the two
values of Boolean logic (1 for true, 0 for false).
• Computers combine multiple digits to form
a single data value to represent large
numbers.
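The bullets above can be illustrated with a small sketch. The values 41 and 1000 are arbitrary examples chosen here, not figures from the slides.

```python
# Hypothetical example value; any non-negative integer works.
value = 41

# Write the value as a string of binary digits, padded to 8 bits (one byte).
bits = format(value, "08b")

# Rebuild the value by weighting each digit with its power of two.
rebuilt = sum(int(digit) << position
              for position, digit in enumerate(reversed(bits)))

# A number too large for one byte is stored by combining several bytes.
large = 1000
two_bytes = large.to_bytes(2, byteorder="big")

print(bits)                                    # 00101001
print(rebuilt)                                 # 41
print([format(b, "08b") for b in two_bytes])   # ['00000011', '11101000']
```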
Basic data types
• Integers – whole numbers
• Real numbers – numbers with fractional components,
stored in exponential representation
• Characters – encoded as ASCII or EBCDIC
• Boolean – true/false
• BLOB (Binary Large Object)
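As a rough illustration of the basic types listed above, the sketch below uses Python's built-in types. The sample values are made up, and `cp037` is only one of several EBCDIC code pages, picked here as an assumption.

```python
import math

whole = 42                               # integer: a whole number
real = 6.5                               # real number with a fractional part

# Exponential representation: real == mantissa * 2**exponent
mantissa, exponent = math.frexp(real)

# The same character has different codes in ASCII and EBCDIC.
ascii_code = "A".encode("ascii")[0]
ebcdic_code = "A".encode("cp037")[0]     # cp037: one EBCDIC code page

flag = whole > 10                        # Boolean: true/false
blob = bytes([0x89, 0x50, 0x4E, 0x47])   # BLOB: uninterpreted bytes

print(mantissa, exponent)                # 0.8125 3
print(ascii_code, ebcdic_code)           # 65 193
```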
Data structures
• Defined in software
• Arrays
• Lists
• Records
• Tables
• Files
• Indices
• Objects
Data Structures
A data structure is a related group of
primitive data elements that is organized
for some type of processing.
Data structures are defined and
manipulated within software.
Data Structures
Virtually all data structures make
extensive use of pointers and addresses.
Pointer – a data element that contains the
address of another data element.
Address – the location of some data
element within a storage device.
Arrays and Linked Lists
Linked List:
A linked list is a data structure that uses
pointers so list elements can be scattered
among nonsequential storage locations.
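A minimal sketch of the idea, assuming a Python list stands in for storage so that the list index plays the role of an address. The names `storage`, `head`, and `END` are illustrative only.

```python
# Simulated storage: the list index plays the role of an address.
storage = [None] * 6
END = -1  # assumed marker meaning "no next element"

# Elements are deliberately placed out of order.
# Each slot holds (value, pointer to the next element).
storage[4] = ("A", 1)
storage[1] = ("B", 5)
storage[5] = ("C", END)
head = 4  # address of the first element


def traverse(storage, head):
    """Follow the pointers from head and collect the values in list order."""
    values = []
    address = head
    while address != END:
        value, pointer = storage[address]
        values.append(value)
        address = pointer
    return values


print(traverse(storage, head))  # ['A', 'B', 'C']
```

The logical order (A, B, C) comes entirely from the pointers, not from where the elements sit in storage.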
Records and Files
• A record is a data structure
composed of other data structures or
primitive data elements.
• Records are used as a unit of input
and output to files or databases.
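One way to picture a record as a unit of input and output is sketched below. The `Employee` layout and the comma-separated line format are assumptions made for this example, and an in-memory stream stands in for a real file.

```python
from dataclasses import dataclass
import io


@dataclass
class Employee:
    """Hypothetical record made of two primitive data elements."""
    name: str
    dept: str


def write_records(stream, records):
    # Each record is written as one unit (one line).
    for record in records:
        stream.write(f"{record.name},{record.dept}\n")


def read_records(stream):
    # Each line read back becomes one record again.
    return [Employee(*line.rstrip("\n").split(",")) for line in stream]


stream = io.StringIO()  # stands in for a file on secondary storage
write_records(stream, [Employee("Ayers", "ACCT"), Employee("Buckley", "MGT")])
stream.seek(0)
loaded = read_records(stream)
print(loaded)
```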
File Organization
Physical arrangement of the records of a file on
secondary storage devices
• Sequential
• Linked List
• Indexed
• Hashed
Sequential File
Sequential file sorted in alphabetical order.
Sequential files are usually sorted in ID
sequence order to facilitate batch processing.
addr  name         dept
00    Ayers        ACCT
01    Buckley      MGT
02    Daley        ACCT
03    Dejoie       MGT
04    Kenderdine   MKT
05    Linn         FIN
06    Lusch        MKT
07    Price        MGT
08    Razook       MKT
09    Schwarzkopf  MGT
Sequential File Processing
[Diagram: Old Master + Transaction -> Process -> New Master]
Sequential files must be recopied from the point of
any insertion or deletion to the end of the file. They
are commonly used in batch processing where a
new master file will be generated each time the file
is updated.
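The batch update described above might be sketched as follows. The function name, the transaction format `(action, key, data)`, and the inserted name "Brown" are assumptions for illustration. Both inputs are assumed to be sorted by key.

```python
def update_master(old_master, transactions):
    """Merge sorted transactions into a sorted old master file.

    Returns a new master; the old master is never modified in place.
    """
    new_master = []
    i = 0
    for action, key, data in transactions:
        # Copy unchanged records that sort before this transaction's key.
        while i < len(old_master) and old_master[i][0] < key:
            new_master.append(old_master[i])
            i += 1
        if action == "insert":
            new_master.append((key, data))
        elif action == "delete":
            if i < len(old_master) and old_master[i][0] == key:
                i += 1  # skip the record so it is not copied
    # Everything after the last change is recopied as-is.
    new_master.extend(old_master[i:])
    return new_master


old_master = [("Ayers", "ACCT"), ("Buckley", "MGT"), ("Daley", "ACCT")]
transactions = [("insert", "Brown", "FIN"), ("delete", "Buckley", None)]
new_master = update_master(old_master, transactions)
print(new_master)  # [('Ayers', 'ACCT'), ('Brown', 'FIN'), ('Daley', 'ACCT')]
```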
Linked List
Linked list to sort data alphabetically within department.
An external reference must point to the start record (05).
addr  name         dept  pointer
00    Price        MGT   01
01    Schwarzkopf  MGT   02
02    Kenderdine   MKT   03
03    Lusch        MKT   08
04    Buckley      MGT   09
05    Ayers        ACCT  06
06    Daley        ACCT  07
07    Linn         FIN   04
08    Razook       MKT   ##
09    Dejoie       MGT   00
Linked List File Processing
The next record in a linked list is found at the address
stored in the current record. Records can be added at any
free location on the direct access storage device (DASD),
with pointers adjusted to include them. Deleted records
are not erased; instead, pointers are changed to skip them.
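A small sketch of this processing, using the sample linked-list file with a dictionary standing in for the DASD. The helper names, and placing a new record at address 10, are assumptions for illustration.

```python
END = None  # stands in for the "##" end-of-list marker

# address -> [name, dept, pointer to next record]
records = {
    0: ["Price", "MGT", 1],
    1: ["Schwarzkopf", "MGT", 2],
    2: ["Kenderdine", "MKT", 3],
    3: ["Lusch", "MKT", 8],
    4: ["Buckley", "MGT", 9],
    5: ["Ayers", "ACCT", 6],
    6: ["Daley", "ACCT", 7],
    7: ["Linn", "FIN", 4],
    8: ["Razook", "MKT", END],
    9: ["Dejoie", "MGT", 0],
}
start = 5  # external reference to the first record


def read_in_order(records, start):
    """Follow the stored pointers and return the names in list order."""
    names = []
    addr = start
    while addr is not END:
        name, _dept, pointer = records[addr]
        names.append(name)
        addr = pointer
    return names


def insert_after(records, prev_addr, new_addr, name, dept):
    # The new record takes over its predecessor's pointer,
    # and the predecessor now points at the new record.
    records[new_addr] = [name, dept, records[prev_addr][2]]
    records[prev_addr][2] = new_addr


def delete_after(records, prev_addr):
    # The deleted record is not erased; it is only bypassed.
    removed = records[prev_addr][2]
    records[prev_addr][2] = records[removed][2]
    return removed


insert_after(records, 1, 10, "Van Horn", "MGT")  # stored at a free address
removed = delete_after(records, 6)               # bypass Linn (address 7)
order = read_in_order(records, start)
print(order)
```

After these two changes, address 7 still holds the Linn record, but no pointer leads to it.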
Indexed File
(sequential index)
Index to access data by
department abbreviation.
Index
key   addr
ACCT  00
ACCT  01
FIN   02
MGT   00
MGT   01
MGT   04
MKT   03

Data File
addr  name         dept  name    dept
00    Price        MGT   Ayers   ACCT
01    Schwarzkopf  MGT   Daley   ACCT
02    Kenderdine   MKT   Linn    FIN
03    Lusch        MKT   Razook  MKT
04    Buckley      MGT   Dejoie  MGT
Indexed File Processing
[Diagram: Index, Index -> Data File]
When a record is inserted into or deleted from a file, the
data can be placed at any location in the data file. Each index
must also be updated to reflect the change. For a
simple sequential index, this may mean rewriting the
whole index for each insertion.
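The sketch below shows one possible reading of a sequential index on department, with two records per data-file address. The function names are hypothetical, and rebuilding the index from scratch is used here to mimic rewriting it after an insertion.

```python
# address -> records stored at that address (name, dept)
data_file = {
    0: [("Price", "MGT"), ("Ayers", "ACCT")],
    1: [("Schwarzkopf", "MGT"), ("Daley", "ACCT")],
    2: [("Kenderdine", "MKT"), ("Linn", "FIN")],
    3: [("Lusch", "MKT"), ("Razook", "MKT")],
    4: [("Buckley", "MGT"), ("Dejoie", "MGT")],
}


def build_index(data_file):
    """Return a sorted list of (dept, address) pairs, one per distinct pair."""
    entries = {(dept, addr)
               for addr, block in data_file.items()
               for _name, dept in block}
    return sorted(entries)


def find_by_dept(index, data_file, dept):
    """Scan the index for a department and read only the matching addresses."""
    names = []
    for key, addr in index:
        if key == dept:
            names.extend(name for name, d in data_file[addr] if d == dept)
    return names


index = build_index(data_file)
mkt_names = find_by_dept(index, data_file, "MKT")
print(mkt_names)  # ['Kenderdine', 'Lusch', 'Razook']

# Insertion: the data goes to any free address, then the index is rewritten.
data_file[5] = [("Van Horn", "MGT")]
index = build_index(data_file)
```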
Segmented Index
Index
addr  level  pointer  key          pointer  key      pointer
100   Root   101      Kenderdine   102      Razook   103
101   Node   200      Buckley      201      Dejoie   204
102   Node   203      Lusch        202
103   Node   205      Schwarzkopf  206
200   Leaf   00       Ayers        04       Buckley  201
201   Leaf   01       Daley        04       Dejoie   204
202   Leaf   00       Price        03       Razook   205
203   Leaf   02       Linn         03       Lusch    202
204   Leaf   02       Kenderdine                     203
205   Leaf   01       Schwarzkopf                    206
206   Leaf   05       Van Horn

Data
addr  name         dept  name    dept
00    Price        MGT   Ayers   ACCT
01    Schwarzkopf  MGT   Daley   ACCT
02    Kenderdine   MKT   Linn    FIN
03    Lusch        MKT   Razook  MKT
04    Buckley      MGT   Dejoie  MGT
05    Van Horn     MGT
Indexed File Processing
(segmented index)
[Diagram: Index -> Data File]
Data can be inserted or deleted at any location in the
data file. The index(es) must be updated for each
change, but only the affected segments need to be
rewritten.
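A lookup through a segmented index might work roughly as sketched below. The segment contents are one plausible reading of the sample index (root at address 100, nodes at 101 to 103, leaves at 200 to 206). The search rule, "follow the first pointer whose key is not less than the search key, otherwise the last pointer", is an assumption rather than something the slides state.

```python
# Root and node segments: (pointer, highest key under that pointer) pairs,
# plus a final pointer for all larger keys.
nodes = {
    100: ([(101, "Kenderdine"), (102, "Razook")], 103),
    101: ([(200, "Buckley"), (201, "Dejoie")], 204),
    102: ([(203, "Lusch")], 202),
    103: ([(205, "Schwarzkopf")], 206),
}

# Leaf segments: key -> address in the data file.
leaves = {
    200: {"Ayers": 0, "Buckley": 4},
    201: {"Daley": 1, "Dejoie": 4},
    202: {"Price": 0, "Razook": 3},
    203: {"Linn": 2, "Lusch": 3},
    204: {"Kenderdine": 2},
    205: {"Schwarzkopf": 1},
    206: {"Van Horn": 5},
}


def lookup(name, root=100):
    """Return (data address, index segments visited) for a name."""
    path = [root]
    segment = root
    while segment in nodes:
        entries, last_pointer = nodes[segment]
        segment = next((ptr for ptr, key in entries if name <= key),
                       last_pointer)
        path.append(segment)
    return leaves[segment].get(name), path


print(lookup("Linn"))      # (2, [100, 102, 203])
print(lookup("Van Horn"))  # (5, [100, 103, 206])
```

Each lookup touches only three segments, which is also why an update only needs to rewrite the segments on the affected path.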
Hashing
(Prime Number Remainder Algorithm)
Pick a prime number to define the size of the file space.
Divide the key by the prime number.
Store the record at the location given by the remainder.
Example: Key = 41, prime = 13.
41 / 13 = 3 remainder 2 (3 x 13 = 39; 41 - 39 = 2)
Location = 2
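The calculation can be checked with a short sketch. The names `PRIME`, `location`, and `file_space` are illustrative, and the stored record content is made up.

```python
PRIME = 13  # defines the file space: locations 0 through 12


def location(key):
    """Prime number remainder algorithm: the remainder is the address."""
    return key % PRIME


quotient, remainder = divmod(41, PRIME)
print(quotient, remainder)  # 3 2

# Store a (hypothetical) record at the computed location.
file_space = [None] * PRIME
file_space[location(41)] = "record with key 41"
print(file_space.index("record with key 41"))  # 2
```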
Hashed File Processing
[Table columns: addr, Key, Calculation, Contents]
Records and Files
• A sequence of records on secondary
storage is called a file.
• A sequence of records stored within main
memory is called a table.
• Sequential files suffer the same problems
as contiguous arrays when inserting and
deleting records.
• To eliminate this problem, linked lists and
indexed files are used.
Classes and Objects