Uploaded by Nyo Nyo Myint

13 Data representation

By NNM
13.1 User-defined data types
A data type based on an existing data type or other data types that have
been defined by a programmer.
(A) Non-composite data types
i. Enumerated data type
ii. Pointer data type
(B) Composite data types
i. Record
ii. Sets
iii. Classes
13.1 Activity
13.2 File organisation and access
File organization
Computers are used to access vast amounts of data and to present it as useful information.
Data of all types is stored as records in files. These files can be organized using different
methods.
• Serial file organization
• Sequential file organization
• Random file organization
File access
File access is the method used to physically find a record in the file. There are different
methods of file access. We will consider two of them:
• Sequential access
• Direct access
Hashing algorithms
13.2 Activity
13.3 Floating-point numbers, representation and manipulation
Floating-point number representation
• Converting binary floating-point numbers into denary
• Converting denary numbers into binary floating-point numbers
• Potential rounding errors and approximations
• Precision versus range
• Floating-point problems
• A non-composite data type can be defined without referencing
another data type
• Non-composite user-defined data types are usually used for a special
purpose.
• An enumerated data type is a non-composite data type that is defined by a given
list of all possible values, which has an implied order.
• It contains no references to other data types when it is defined.
• TYPE identifier = (value1, value2, …)
eg: TYPE Tmonth = (January, February, …, December)
• Note: January, February, … are not strings, so they are not enclosed in quotation marks.
• Variables can then be declared and used as follows:
DECLARE thismonth, nextmonth : Tmonth
thismonth ← November
nextmonth ← thismonth + 1 // nextmonth will be December
ACTIVITY 13A
Using pseudocode,
• declare an enumerated data type for the days of the week.
• Then declare two variables today and yesterday,
• assign a value of Wednesday to today, and
• write a suitable assignment statement for yesterday.
ACTIVITY 13B
• Using pseudocode for the enumerated data type for days of the week,
• declare a suitable pointer to use.
• Set your pointer to point at today.
• Remember, you will need to set up the pointer data type and the
pointer variable.
• A pointer data type is a non-composite data type that uses the memory
address of where the data is stored.
• It is used to reference a memory location.
• This data type needs to have information about the type of data that will be
stored in the memory location.
• ^ shows that the type being declared is a pointer
TYPE pointer = ^Typename
eg: TYPE Tmonthptr = ^Tmonth
• A pointer variable can then be declared and used as follows:
DECLARE mptr : Tmonthptr
mptr ← ^thismonth
currentmonth ← mptr^
• A data type that refers to any other data type in its type definition is a
composite data type.
Record
TYPE Typename
    DECLARE identifier : datatype
    DECLARE identifier : datatype
    .
    .
    DECLARE identifier : datatype
ENDTYPE
• eg: TYPE Tstudent
    DECLARE id : INTEGER
    DECLARE name : STRING
    DECLARE score : REAL
ENDTYPE
Python does not natively support record types. However, you can represent a record by
defining a class with attributes but no methods. The constructor method __init__ will be
called automatically when a new instance of the PlayerRecord class is created.
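As a minimal sketch of this idea, the class below mirrors the Tstudent record rather than PlayerRecord, whose fields are not listed here. The class name `StudentRecord` and the sample values are hypothetical.

```python
# Hypothetical record-style class: attributes only, apart from the constructor.
class StudentRecord:
    def __init__(self, student_id, name, score):
        self.id = student_id   # INTEGER in the pseudocode
        self.name = name       # STRING in the pseudocode
        self.score = score     # REAL in the pseudocode

# __init__ runs automatically when the instance is created.
student = StudentRecord(1, "Ana", 72.5)   # sample values, invented for illustration
student.score = 80.0                      # fields are accessed with dot notation
print(student.id, student.name, student.score)
```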
Sets
• A set is a collection of unordered elements on which set theory
operations such as intersection and union can be used.
• A set data type includes the type of data in the set.
• TYPE set_identifier = SET OF (basic data type)
• DEFINE identifier (value1, value2, …) : set_identifier
eg: TYPE Sletter = SET OF (CHAR)
DEFINE vowel ('a', 'e', 'i', 'o', 'u') : Sletter
x = {"apple", "melon", "strawberry"}
y = {"google", "microsoft", "apple"}
z = x.intersection(y)  # elements that are in both x and y
u = x.union(y)         # elements that are in x, in y, or in both
print(z)  # {'apple'}
print(u)  # the five distinct names; a set is unordered, so the print order may vary
Classes
• A class is a composite data type that includes variables of given data
types and methods.
• An object is defined from a given class; several objects can be
defined from the same class.
• To be continued in Chapter 20
ACTIVITY 13C
1. Explain, using examples, the difference between composite and non-composite data types.
2. Explain why programmers need to define user-defined data types.
Use examples to illustrate your answers.
3. Choose an appropriate data type for the following situations. Give the
reason for your choice in each case.
a) A fixed number of colours to choose from.
b) Data about each house that an estate agent has for sale.
c) The addresses of integer data held in main memory.
13A Answer Key
TYPE Tday = (Monday, Tuesday, Wednesday, Thursday, Friday, Saturday,
Sunday)
DECLARE today : Tday
DECLARE yesterday : Tday
today ← Wednesday
yesterday ← today - 1
13B Answer Key
TYPE TdayPointer = ^Tday
DECLARE dayPointer : TdayPointer
dayPointer ← ^today
13C Answer Key
1. A composite data type refers to other data types in its definition. A non-composite data type does not refer to other data types.
2. User-defined data types allow programs to be more readable for other
programmers. For example, using the days of the week as an
enumerated data type.
3. (a) An enumerated data type, as a list of colours can be provided with
meaningful names used for each colour in the list.
(b) A record structure that contains different types of data would be
used, so the data for each house can be used together in one
structure.
(c) A pointer data type, as this will reference the address/location of the
integer stored in main memory.
• The serial file organization method physically stores records of data in a
file, one after another, in the order they were added to the file.
• New records are appended to the end of the file.
• It is often used for temporary files storing transactions to be made to
more permanent files
• For example, storing customer meter readings for gas or electricity
before they are used to send the bills to all customers.
Records in the order they were added: Customer 6, Customer 5, Customer 3, Customer 1, Customer 4, Customer 2, and so on
• The sequential file organization method physically stores records of data
in a file, one after another, in a given order.
• The order is usually based on the key field of the records as this is a
unique identifier.
• For example, a file could be used by a supplier to store customer records
for gas or electricity in order to send regular bills to each customer. All
records are stored in ascending customer number order, where the
customer number is the key field that uniquely identifies each record.
Records in key field order: Customer 1, Customer 2, Customer 3, Customer 4, Customer 5, Customer 6, and so on
• The random file organization method physically stores records of data in
a file in any available position.
• The location of any record in the file is found by using a hashing
algorithm on the key field of a record
Records in any available position: Customer 5, Customer 3, Customer 1, Customer 2, Customer 6, Customer 4, and so on
Sequential access
A method of file access in which records are searched one after another from
the physical start of the file until the required record is found.
This method is used for serial and sequential files.
For a serial file, if a particular record is being searched for, every record needs to be checked until that record is found or the whole file has
been searched and that record has not been found.
For a sequential file, if a particular record is being searched for, every record needs to be checked until the record is found or the key field of
the current record being checked is greater than the key field of the record being searched for.
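The two searches described above can be sketched as follows. The sketch assumes each file is modelled as a Python list of key fields only; the function names and the sample keys are hypothetical.

```python
def serial_search(keys, wanted):
    """Check every record until found or the whole file has been searched."""
    records_read = 0
    for key in keys:
        records_read += 1
        if key == wanted:
            return True, records_read
    return False, records_read

def sequential_search(keys, wanted):
    """As above, but stop early once a key greater than the wanted key is read."""
    records_read = 0
    for key in keys:
        records_read += 1
        if key == wanted:
            return True, records_read
        if key > wanted:
            break  # keys are in ascending order, so the record cannot be further on
    return False, records_read

serial_file = [6, 5, 3, 1, 4, 2]       # order in which records were added
sequential_file = [1, 2, 3, 5, 6, 8]   # ascending key order (sample keys)

print(serial_search(serial_file, 7))          # (False, 6): whole file searched
print(sequential_search(sequential_file, 4))  # (False, 4): stops at key 5
```

The second call shows the benefit of the ordering: a missing record is detected without reading the rest of the file.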
Direct access
The direct access method can physically find a record in a file without other
records being physically read. Both sequential and random files can use direct
access. This allows specific records to be found more quickly than using
sequential access.
For a sequential file, an index of all the key fields is kept and used to look up the address of the file location where a given record is stored.
For large files, searching the index takes less time than searching the whole file.
For a random file, a hashing algorithm is used on the key field to calculate the address of the file location where a given record is
stored.
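Direct access to a sequential file through an index can be sketched like this. The sketch assumes the index is held as a Python dictionary mapping each key field to a file address, and that every record occupies 40 locations; both are illustrative assumptions.

```python
RECORD_SIZE = 40  # assumed size of each record, for illustration only

# Key fields of the records, in the order they are stored in the file.
keys_in_file = [1, 2, 3, 4, 5, 6]

# The index maps each key field to the address where that record starts.
index = {key: position * RECORD_SIZE for position, key in enumerate(keys_in_file)}

def find_address(key):
    """Look up the record's address in the index without reading other records."""
    return index.get(key)  # None if no record has this key

print(find_address(4))  # 120
print(find_address(9))  # None
```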
• A hashing algorithm is a mathematical formula used to perform a calculation on the key field of
the record; the result of the calculation gives the address where the record should be found.
• A simple hashing algorithm:
• Suppose a file has space for 2000 records.
• The key field can take any value between 1 and 9999.
• Calculate key MOD 2000.
• Address = start address + (key MOD 2000) × the size of the space allocated to each record.
• Eg where the start address is 0 and each record is stored in 2 locations:
• for key field value 3024, 3024 MOD 2000 = 1024, so the hashing algorithm gives address
2048 (0 + 2 × 1024).
• Unfortunately, storing another record whose key field gives the same remainder (for example 5024, as
5024 MOD 2000 is also 1024) would result in trying to use the same file location, and a collision would occur.
• There are two ways of dealing with this:
1. An open hash where the record is stored in the next free space.
2. A closed hash where an overflow area is set up and the record is stored in the next free
space in the overflow area.
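The simple hashing algorithm and the open hash described above can be sketched as follows. The constants come from the example (2000 record spaces, start address 0, 2 locations per record). The dictionary of used addresses, the function names and the wrap-around at the end of the file are assumptions made for this sketch.

```python
FILE_SPACE = 2000    # number of records the file has space for
START_ADDRESS = 0
RECORD_SIZE = 2      # locations allocated to each record

def hash_address(key):
    """Start address + (key MOD file space) * record size."""
    return START_ADDRESS + (key % FILE_SPACE) * RECORD_SIZE

def store_open_hash(used, key):
    """Store key at its hashed address, or at the next free record space."""
    end_address = START_ADDRESS + FILE_SPACE * RECORD_SIZE
    address = hash_address(key)
    for _ in range(FILE_SPACE):
        if address not in used:
            used[address] = key
            return address
        address += RECORD_SIZE          # collision: try the next record space
        if address >= end_address:      # assumed: wrap round to the start of the file
            address = START_ADDRESS
    raise RuntimeError("file is full")

used = {}                            # maps address -> key field stored there
print(store_open_hash(used, 3024))   # 2048
print(store_open_hash(used, 5024))   # 2050, because 2048 is already in use
```

A closed hash would differ only in where the colliding record goes: the search for free space would take place in a separate overflow area rather than in the main file.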
ACTIVITY 13D
• A file of records is stored at address 500.
• Each record takes up five locations and there is space for 1000 records.
• The key field for each record can take the value 1 to 9999.
• The hashing algorithm used to calculate the address of each record is the
remainder when the value of key field is divided by 1000 together with the
start address of the file and the size of the space allocated to each record.
• Calculate the address to store the record with key field 9354.
• If this location has already been used to store a record and an open hash is
used, what is the address of the next location to be checked?
EXTENSION ACTIVITY 13B
Write a program to
• find the ASCII value for each character in a name of up to 10
characters
• add the values together
• divide by 1000 and find the remainder
• multiply this value by 20 and add it to 2000
• display the result.
If this program simulates a hashing algorithm for a file, what is the start
address of the file and the size of each record?
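One possible version of this program is sketched below. It assumes the name is passed in as a string and that the function name `name_hash` and the sample name are free choices. If the formula is read in the same way as the earlier hashing example (start address + remainder × record size), the start address would be 2000 and each record would take 20 locations.

```python
def name_hash(name):
    """Simulate the hashing algorithm described in the activity."""
    total = sum(ord(ch) for ch in name[:10])  # ASCII values of up to 10 characters
    remainder = total % 1000                  # divide by 1000 and keep the remainder
    return 2000 + remainder * 20              # multiply by 20 and add 2000

# 'A' = 65, 'n' = 110, 'a' = 97, total 272, so 2000 + 272 * 20 = 7440
print(name_hash("Ana"))  # 7440
```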
ACTIVITY 13E
1. Explain, using examples, the difference between serial and sequential
files.
2. Explain the process of direct access to a record in a file using a hashing
algorithm.
3. Choose an appropriate file type for the following situations. Give the
reason for your choice in each case.
a) Borrowing books from a library.
b) Providing an annual tax statement for employees at the end of the
year.
c) Recording daily rainfall readings at a remote weather station to be
collected every month.
Activity 13D Answer
• 9354 ÷ 1000 is 9 remainder 354.
• With 5 locations for each record and the file starting at address 500,
• this record would be stored at address 2270 (500 + 354 × 5) and the
next four locations.
• If this location has already been used and an open hash is used, the next
location to be checked is address 2275, the start of the next record space.
Activity 13E Answer
1. Serial file: each new record is added to the end of the file, for example a log
of temperature readings taken at a weather station. Sequential file:
records are stored in a given order, usually based on the key field, for
example ascending order of employee number for a personnel file.
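2. A hashing algorithm is applied to the key field of the required record to calculate
the address where the record is stored; the record is then read from that address
without any other records being read.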
3. a) Random access as only one record is required at a time, low hit rate.
b) Sequential access as all the records need to be accessed, high hit
rate.
c) Serial access, as each record is added to the end of the file in
chronological order.
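• Using the scientific notation system in binary, we get:
• $M \times 2^{E}$
• M is the mantissa and E is the exponent. This is known as binary
floating-point representation.
• In denary, $0.31211 \times 10^{24}$ is a number in scientific notation with mantissa 0.31211 and exponent 24.
• $M \times 2^{E}$ is the binary floating-point equivalent.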
Worked examples: converting binary floating-point numbers into denary.
ACTIVITY 13F
• Convert these binary floating-point numbers into denary numbers
(the mantissa is 8 bits and the exponent is 8 bits in all cases).
Activity 13F Answer
• a) $39/64 \times 2^{5} = 19.5$
• b) $41/128 \times 2^{7} = 41$
• c) $7/8 \times 2^{-5} = 7/256$ (0.02734375)
• d) $15/64 \times 2^{-4} = 15/1024$ (0.0146484375)
• e) $7/8 \times 2^{3} = 7$
• f) $-13/16 \times 2^{2} = -3.25$
• g) $-3/32 \times 2^{4} = -1.5$
• h) $-5/8 \times 2^{5} = -20$
• i) $-5/8 \times 2^{-3} = -5/64$ (−0.078125)
• j) $-1/4 \times 2^{-6} = -1/256$ (−0.00390625)
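Conversions of this kind can be checked with a short program. The sketch below assumes that both the mantissa and the exponent are stored in two's complement and that the binary point follows the first bit of the mantissa. The bit patterns in the example calls are illustrative ones chosen to reproduce the values 19.5, −3.25 and 7/256; they are not taken from the activity.

```python
def twos_complement(bits):
    """Interpret a string of 0s and 1s as a two's complement integer."""
    value = int(bits, 2)
    if bits[0] == "1":            # leading 1 means the number is negative
        value -= 1 << len(bits)
    return value

def float_to_denary(mantissa_bits, exponent_bits):
    """Return mantissa * 2^exponent for the given bit patterns."""
    # Binary point after the first (sign) bit: divide by 2^(number of bits - 1).
    mantissa = twos_complement(mantissa_bits) / (1 << (len(mantissa_bits) - 1))
    exponent = twos_complement(exponent_bits)
    return mantissa * 2 ** exponent

print(float_to_denary("01001110", "00000101"))  # 19.5     (39/64 * 2^5)
print(float_to_denary("10011000", "00000010"))  # -3.25    (-13/16 * 2^2)
print(float_to_denary("01110000", "11111011"))  # 0.02734375 (7/8 * 2^-5)
```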