Solution to assignment 2

INF3703/202/01/2011
PRINCIPLES OF DATABASES
Tutorial letter 202
CONTENTS:
Solution to assignment 2
SCHOOL OF COMPUTING
1. Answers to Assignment 1
Question 1
1.1 What is normalization?
Answer to 1.1:
Normalization is the process for assigning attributes to entities. Properly
executed, the normalization process eliminates uncontrolled data redundancies,
thus eliminating the data anomalies and the data integrity problems that are
produced by such redundancies.
Normalization does not eliminate data redundancy; instead, it produces the
carefully controlled redundancy that lets us properly link database tables.
1.2 When is a table in 1NF?
Answer to 1.2:
A table is in 1NF when all the key attributes are defined (no repeating groups in
the table) and when all remaining attributes are dependent on the primary key.
However, a table in 1NF still may contain partial dependencies, i.e.,
dependencies based on only part of the primary key and/or transitive
dependencies that are based on a non-key attribute.
1.3 When is a table in 2NF?
Answer to 1.3:
A table is in 2NF when it is in 1NF and it includes no partial dependencies.
However, a table in 2NF may still have transitive dependencies, i.e.,
dependencies based on attributes that are not part of the primary key.
1.4 When is a table in 3NF?
Answer to 1.4:
A table is in 3NF when it is in 2NF and it contains no transitive dependencies.
1.5 When is a table in BCNF?
Answer to 1.5:
A table is in Boyce-Codd Normal Form (BCNF) when it is in 3NF and every
determinant in the table is a candidate key. For example, if the table is in 3NF
and it contains a nonprime attribute that determines a prime attribute, the BCNF
requirements are not met. This description clearly yields the following conclusions:
• If a table is in 3NF and it contains only one candidate key, 3NF and BCNF
are equivalent.
• BCNF can be violated only if the table contains more than one candidate
key. Putting it another way, there is no way that the BCNF requirement
can be violated if there is only one candidate key.
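The BCNF condition can also be checked mechanically. The following is a minimal Python sketch, not part of the prescribed solution: it computes attribute closures, lists the candidate keys, and reports every determinant that is not a superkey (the usual formal statement of "every determinant is a candidate key"). The relation R(A, B, C, D), its dependencies (A, B -> C, D and C -> B), and all function names are hypothetical illustration choices.

```python
from itertools import combinations


def closure(attrs, fds):
    """Return every attribute determined by attrs under the given FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return frozenset(result)


def candidate_keys(relation, fds):
    """Return all minimal attribute sets whose closure is the whole relation."""
    keys = []
    for size in range(1, len(relation) + 1):
        for combo in combinations(sorted(relation), size):
            candidate = frozenset(combo)
            if any(key <= candidate for key in keys):
                continue  # a superset of a known key is not minimal
            if closure(candidate, fds) == relation:
                keys.append(candidate)
    return keys


def bcnf_violations(relation, fds):
    """Return the non-trivial FDs whose determinant is not a superkey."""
    return [(lhs, rhs) for lhs, rhs in fds
            if not rhs <= lhs and closure(lhs, fds) != relation]


# Hypothetical relation: (A, B) -> (C, D) and C -> B.
R = frozenset("ABCD")
FDS = [(frozenset("AB"), frozenset("CD")), (frozenset("C"), frozenset("B"))]

print("candidate keys:", [sorted(k) for k in candidate_keys(R, FDS)])
print("BCNF violations:", [(sorted(l), sorted(r)) for l, r in bcnf_violations(R, FDS)])
```

In this hypothetical relation there are two candidate keys, (A, B) and (A, C), and the dependency C -> B is reported because C on its own is not a key. This agrees with the conclusion that a BCNF violation requires more than one candidate key.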
Question 2
2.1
Using the INVOICE table structure shown in table 1, write the relational
schema, draw its dependency diagram and identify all dependencies
(including all partial and transitive dependencies). You can assume that the
table does not contain repeating groups and that any invoice number may
reference more than one product. (Hint: This table uses a composite primary
key.)
Table 1
Attribute Name  Sample Value     Sample Value        Sample Value    Sample Value     Sample Value
INV_NUM         211347           211347              211347          211348           211349
PROD_NUM        AA-E3422QW       QD-300932X          RU-995748G      AA-E3422QW       GH-778345P
SALE_DATE       15-Jan-2006      15-Jan-2006         15-Jan-2006     15-Jan-2006      16-Jan-2006
PROD_LABEL      Rotary sander    0.25-in. drill bit  Band saw        Rotary sander    Power drill
VEND_CODE       211              211                 309             211              157
VEND_NAME       NeverFail, Inc.  NeverFail, Inc.     BeGood, Inc.    NeverFail, Inc.  ToughGo, Inc.
QUANT_SOLD      1                8                   1               2                1
PROD_PRICE      $49.95           $3.45               $39.99          $49.95           $87.75
Answer to 2.1:
Note: QUANT_SOLD = NUM_SOLD
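The dependency diagram itself is not reproduced in this letter, so the following Python sketch shows one way to label each dependency of the INVOICE table. The functional dependencies listed are assumptions inferred from the sample values in Table 1 (for example, vendor code 211 always appears with NeverFail, Inc.); they are not a transcription of the original diagram. The name classify is our own, and QUANT_SOLD is used in place of NUM_SOLD.

```python
PRIMARY_KEY = frozenset({"INV_NUM", "PROD_NUM"})

# Assumed dependencies (determinant -> dependent attributes), inferred from Table 1.
ASSUMED_FDS = [
    ({"INV_NUM", "PROD_NUM"}, ["QUANT_SOLD"]),
    ({"INV_NUM"}, ["SALE_DATE"]),
    ({"PROD_NUM"}, ["PROD_LABEL", "VEND_CODE", "PROD_PRICE"]),
    ({"VEND_CODE"}, ["VEND_NAME"]),
]


def classify(determinant, primary_key=PRIMARY_KEY):
    """Label a dependency by comparing its determinant with the primary key."""
    determinant = frozenset(determinant)
    if determinant == primary_key:
        return "full"          # depends on the whole composite key
    if determinant < primary_key:
        return "partial"       # depends on only part of the key
    if determinant.isdisjoint(primary_key):
        return "transitive"    # determined by a non-key attribute
    return "mixed"


for determinant, dependents in ASSUMED_FDS:
    print(sorted(determinant), "->", dependents, ":", classify(determinant))
```

Under these assumptions the relational schema is INVOICE (INV_NUM, PROD_NUM, SALE_DATE, PROD_LABEL, VEND_CODE, VEND_NAME, QUANT_SOLD, PROD_PRICE), with two partial dependencies (on INV_NUM and on PROD_NUM) and one transitive dependency (VEND_CODE -> VEND_NAME).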
2.2
Using the initial dependency diagram drawn in question 2.1, remove all partial
dependencies, draw the new dependency diagrams, and identify the normal
forms for each table structure you created.
Answer to 2.2:
2.3
Using the table structures you created in question 2.2, remove all transitive
dependencies and draw the new dependency diagrams. Also identify the
normal forms for each table structure you created.
Answer to 2.3:
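One common way to arrive at table structures without partial or transitive dependencies is to give every determinant its own table. The Python sketch below applies that idea to assumed functional dependencies of the INVOICE table; the dependencies are inferred from the Table 1 sample values and are not taken from the original diagrams, and the function name decompose is hypothetical.

```python
# Assumed dependencies of INVOICE: (determinant, dependent attributes).
ASSUMED_FDS = [
    (("INV_NUM", "PROD_NUM"), ("QUANT_SOLD",)),
    (("INV_NUM",), ("SALE_DATE",)),
    (("PROD_NUM",), ("PROD_LABEL", "VEND_CODE", "PROD_PRICE")),
    (("VEND_CODE",), ("VEND_NAME",)),
]


def decompose(fds):
    """Build one table per determinant; the determinant becomes that table's key."""
    tables = {}
    for key, dependents in fds:
        tables.setdefault(key, []).extend(dependents)
    return tables


for key, attributes in decompose(ASSUMED_FDS).items():
    print("key:", ", ".join(key), "| other attributes:", ", ".join(attributes))
```

Each resulting structure has its key as its only determinant, so under these assumptions none of the four tables contains a partial or a transitive dependency.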
Question 3
Some Tiny University staff employees are information technology (IT) personnel.
Some IT personnel provide technology support for academic programs. Some IT
personnel provide technology infrastructure support. Some IT personnel provide
technology support for academic programs and technology infrastructure support. IT
personnel are not professors. IT personnel are required to take periodic training to
retain their technical expertise. Tiny University tracks all IT personnel training by
date, type, and results (completed vs. not completed). Given that information, create
the complete ERD containing all primary keys, foreign keys, and main attributes.
Answer:
These are optional
Question 4
4.1
Explain the following statement: a transaction is a logical unit of work.
Answer to 4.1:
A transaction is a logical unit of work that must be entirely completed or
aborted; no intermediate states are accepted. In other words, a transaction,
composed of several database requests, is treated by the DBMS as a unit of
work in which all transaction steps must be fully completed if the transaction is
to be accepted by the DBMS.
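The all-or-nothing behaviour can be demonstrated with Python's built-in sqlite3 module. This is a minimal sketch only: the account table, its CHECK constraint, and the transfer function are hypothetical and are not an example from the textbook.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    " id INTEGER PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [(1, 100), (2, 50)])
conn.commit()


def transfer(conn, source, target, amount):
    """Run two updates as one unit of work; return True if they were committed."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?",
                (amount, target),
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?",
                (amount, source),
            )
        return True
    except sqlite3.IntegrityError:
        return False  # the CHECK constraint failed, so neither update was kept


def balances(conn):
    """Return {account id: balance} for all accounts."""
    return dict(conn.execute("SELECT id, balance FROM account ORDER BY id"))


print(transfer(conn, 1, 2, 30), balances(conn))
print(transfer(conn, 1, 2, 500), balances(conn))
```

The second transfer fails on its second update. Because both updates belong to the same transaction, the credit already applied to account 2 is rolled back as well, and the balances stay as they were after the first transfer.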
Acceptance of an incomplete transaction will yield an inconsistent database
state. To avoid such a state, the DBMS ensures that all of a transaction's
database operations are completed before they are committed to the
database. For example, a credit sale requires a minimum of three database
operations:
4.2
What is concurrency control?
Answer to 4.2:
Concurrency control is the activity of coordinating the simultaneous execution
of transactions in a multiprocessing or multi-user database management
system. The objective of concurrency control is to ensure the serializability of
transactions in a multi-user database management system. (The DBMS's
scheduler is in charge of maintaining concurrency control.) Because it helps to
guarantee data integrity and consistency in a database system, concurrency
control is one of the most critical activities performed by a DBMS. If
concurrency control is not maintained, three serious problems may be caused
by concurrent transaction execution: lost updates, uncommitted data, and
inconsistent retrievals.
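The lost update problem can be illustrated with a small simulation. The Python sketch below is not a DBMS scheduler; it simply replays a fixed sequence of read and write steps against one shared value. The transaction names, the starting quantity of 35, and the changes of +100 and -30 are hypothetical.

```python
def run(schedule, deltas, initial):
    """Replay (transaction, operation) steps against one shared value.

    'read' copies the shared value into the transaction's private workspace;
    'write' stores that private copy plus the transaction's change.
    """
    shared = initial
    workspace = {}
    for txn, op in schedule:
        if op == "read":
            workspace[txn] = shared
        else:  # "write"
            shared = workspace[txn] + deltas[txn]
    return shared


DELTAS = {"T1": 100, "T2": -30}  # T1 adds 100 units, T2 removes 30 units

SERIAL = [("T1", "read"), ("T1", "write"), ("T2", "read"), ("T2", "write")]
INTERLEAVED = [("T1", "read"), ("T2", "read"), ("T1", "write"), ("T2", "write")]

print("serial:", run(SERIAL, DELTAS, 35))
print("interleaved:", run(INTERLEAVED, DELTAS, 35))
```

The serial schedule ends with 105 units. In the interleaved schedule T2 reads the value before T1 has written, so T1's update is overwritten and the result is 5. A scheduler that enforces serializability would not allow the second schedule to produce that result.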
4.3
What are the advantages of a DDBMS?
Answer to 4.3:
• Data are located near the greatest demand site
• Faster access
• Faster data access
• Growth facilitation
4.4
What are the disadvantages of a DDBMS?
Answer to 4.4:
• Complexity of management and control
• Security
• Lack of standards
• Increased storage requirements
Question 5
5.1
Explain the difference between a distributed database and distributed
processing.
Answer to 5.1:
In distributed processing, a database’s logical processing is shared among
two or more physically independent sites that are connected through a
network. For example, the data input/output (I/O), data selection, and data
validation might be performed on one computer, and a report based on that
data might be created on another computer. A distributed database, on the
other hand, stores a logically related database over two or more physically
independent sites. The sites are connected via a computer network.
5.2
Describe the different types of database requests and transactions.
Answer to 5.2:
A database transaction is formed by one or more database requests. Each
database request is the equivalent of a single SQL statement. The basic
difference between a local transaction and a distributed transaction is that the
latter can update or request data from several remote sites on a network. In a
DDBMS, a database request and a database transaction can be of two types:
remote or distributed.
A remote request accesses data located at a single remote database
processor (or DP site). In other words, an SQL statement (or request) can
reference data at only one remote DP site.
A remote transaction, composed of several requests, accesses data at only
a single remote DP site.
A distributed transaction allows a transaction to reference several different
local or remote DP sites. Although each single request can reference only
one local or remote DP site, the complete transaction can reference multiple
DP sites because each request can reference a different site. Use Figure
12.12 to illustrate the distributed transaction.
A distributed request lets us reference data from several different DP sites.
Since each request can access data from more than one DP site, a
transaction can access several DP sites. The ability to execute a distributed
request requires fully distributed database processing because we must be
able to:
• Partition a database table into several fragments.
• Reference one or more of those fragments with only one request. In other
words, we must have fragmentation transparency.
The distributed request feature also allows a single request to reference a
physically partitioned table.
5.3
What is the objective of the query optimisation function?
Answer to 5.3:
The objective of query optimization functions is to minimize the total costs
associated with the execution of a database request. The costs associated with a
request are a function of:
• the access time (I/O) cost involved in accessing the physical data stored
on disk
• the communication cost associated with the transmission of data among
nodes in distributed database systems
• the CPU time cost.
It is difficult to separate communication and processing costs.
Query-optimization algorithms use different parameters, and the algorithms
assign different weight to each parameter. For example, some algorithms
minimize total time, others minimize the communication time, and still others
do not factor in the CPU time, considering it insignificant relative to the other
costs. Query optimization must provide distribution and replica transparency
in distributed database systems.
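The effect of weighting the cost components differently can be sketched in Python. The plan names, the cost figures, and the weight parameters below are all hypothetical; a real optimizer estimates these costs from database statistics rather than taking them as given.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Plan:
    name: str
    io_cost: float    # access time (I/O) cost
    comm_cost: float  # cost of transmitting data between nodes
    cpu_cost: float   # CPU time cost


def total_cost(plan, w_io=1.0, w_comm=1.0, w_cpu=1.0):
    """Weighted sum of the three cost components."""
    return w_io * plan.io_cost + w_comm * plan.comm_cost + w_cpu * plan.cpu_cost


def cheapest(plans, **weights):
    """Return the plan with the lowest weighted total cost."""
    return min(plans, key=lambda plan: total_cost(plan, **weights))


# Two hypothetical ways of answering the same distributed request.
PLANS = [
    Plan("ship_whole_table", io_cost=40, comm_cost=200, cpu_cost=10),
    Plan("filter_at_remote_site", io_cost=60, comm_cost=20, cpu_cost=200),
]

print(cheapest(PLANS).name)             # all three costs weighted equally
print(cheapest(PLANS, w_cpu=0.0).name)  # CPU time treated as insignificant
```

With equal weights the first plan is cheaper (250 against 280). When the CPU time is ignored, the second plan is chosen (80 against 240), which shows how the choice of parameters and weights can change the plan an algorithm selects.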
5.4
What is XML and why is it important?
Answer to 5.4:
Extensible Markup Language (XML) is a metalanguage used to represent and
manipulate data elements. XML is designed to facilitate the exchange of
structured documents, such as orders and invoices, over the internet.
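As an illustration, an invoice could be exchanged as a small XML document and read with Python's standard xml.etree.ElementTree module. The element and attribute names below are hypothetical (XML lets the parties define their own tags); the sample values are borrowed from Table 1.

```python
import xml.etree.ElementTree as ET
from decimal import Decimal

# Hypothetical invoice document; tag and attribute names are our own choice.
INVOICE_XML = """\
<invoice number="211347" date="2006-01-15">
  <line product="QD-300932X" quantity="8" price="3.45"/>
  <line product="RU-995748G" quantity="1" price="39.99"/>
</invoice>
"""


def invoice_total(xml_text):
    """Parse the invoice and return the sum of quantity * price over all lines."""
    root = ET.fromstring(xml_text)
    return sum(
        (Decimal(line.get("price")) * int(line.get("quantity"))
         for line in root.iter("line")),
        Decimal("0"),
    )


root = ET.fromstring(INVOICE_XML)
print("invoice", root.get("number"), "total", invoice_total(INVOICE_XML))
```

Because the structure travels with the data, the receiving system can extract the invoice number and the line items without knowing how the sender stores them internally.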
2.1
Extra exercises to complete
Question 1: Problem 7 on page 200 of textbook
Question 2: Problem 7 on page 234 of textbook
Question 3: Problems 11 and 12 on page 284 of textbook
Answer to Question 1:
Solution to Problem 7 on page 200.
Answer to Question 2:
Solution to Problem 7 on page 238
Answer to Question 3:
Solution to Problem 11 & 12 on page 284
Initial dependency diagram (attribute rows only; the connecting arrows are not reproduced here):
EMP_CODE  EMP_LNAME  EMP_EDUCATION  DEPT_CODE  DEPT_NAME  DEPT_MANAGER  (transitive dependencies marked)
EMP_DEPENDENTS  EMP_DOB  EMP_HIRE_DATE  EMP_TRAINING
JOB_TITLE  JOB_CLASS  EMP_BASE_SALARY  EMP_COMMISSION_RATE  (transitive dependencies marked)

Dependency diagrams after decomposition:
EMPLOYEE:       EMP_CODE  EMP_LNAME  DEPT_CODE  JOB_CLASS  EMP_DOB  EMP_HIRE_DATE
DEPARTMENT:     DEPT_CODE  DEPT_NAME  EMP_CODE
QUALIFICATION:  EMP_CODE  EDUC_CODE  QUAL_DATE
EDUCATION:      EDUC_CODE  EDUC_DESCRIPTION
DEPENDENT:      EMP_CODE  DEP_NUM  DEP_FNAME  DEP_TYPE
JOB:            JOB_CLASS  JOB_TITLE  JOB_BASE_SALARY
The relational schemas are written as:
EMPLOYEE (EMP_CODE, EMP_LNAME, DEPT_CODE, JOB_CLASS, EMP_DOB, EMP_HIREDATE)
DEPENDENT (EMP_CODE, DEP_NUM, DEP_FNAME, DEP_TYPE)
DEPARTMENT (DEPT_CODE, DEPT_NAME, EMP_CODE)
JOB (JOB_CLASS, JOB_TITLE, JOB_BASE_SALARY)
EDUCATION (EDUC_CODE, EDUC_DESCRIPTION)
QUALIFICATION (EMP_CODE, EDUC_CODE, QUAL_DATE_EARNED)
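These schemas could be implemented along the following lines. The sketch uses Python's built-in sqlite3 module and makes several assumptions that the letter does not state: the data types, the choice of primary keys (the key underlining did not survive in the text above, so the keys are inferred from the attribute names), the foreign key references, and the reading of EMP_CODE in DEPARTMENT as the department manager.

```python
import sqlite3

# Column names follow the relational schemas; types and keys are assumptions.
DDL = """
CREATE TABLE JOB (
    JOB_CLASS        TEXT PRIMARY KEY,
    JOB_TITLE        TEXT,
    JOB_BASE_SALARY  REAL
);
CREATE TABLE DEPARTMENT (
    DEPT_CODE  TEXT PRIMARY KEY,
    DEPT_NAME  TEXT,
    EMP_CODE   TEXT REFERENCES EMPLOYEE (EMP_CODE)  -- assumed: the manager
);
CREATE TABLE EMPLOYEE (
    EMP_CODE      TEXT PRIMARY KEY,
    EMP_LNAME     TEXT,
    DEPT_CODE     TEXT REFERENCES DEPARTMENT (DEPT_CODE),
    JOB_CLASS     TEXT REFERENCES JOB (JOB_CLASS),
    EMP_DOB       TEXT,
    EMP_HIREDATE  TEXT
);
CREATE TABLE DEPENDENT (
    EMP_CODE   TEXT REFERENCES EMPLOYEE (EMP_CODE),
    DEP_NUM    INTEGER,
    DEP_FNAME  TEXT,
    DEP_TYPE   TEXT,
    PRIMARY KEY (EMP_CODE, DEP_NUM)
);
CREATE TABLE EDUCATION (
    EDUC_CODE         TEXT PRIMARY KEY,
    EDUC_DESCRIPTION  TEXT
);
CREATE TABLE QUALIFICATION (
    EMP_CODE          TEXT REFERENCES EMPLOYEE (EMP_CODE),
    EDUC_CODE         TEXT REFERENCES EDUCATION (EDUC_CODE),
    QUAL_DATE_EARNED  TEXT,
    PRIMARY KEY (EMP_CODE, EDUC_CODE)
);
"""


def table_names(conn):
    """Return the names of all tables in the database."""
    rows = conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
    return {row[0] for row in rows}


conn = sqlite3.connect(":memory:")
conn.executescript(DDL)
print(sorted(table_names(conn)))
```

DEPENDENT and QUALIFICATION use composite primary keys because one employee may have several dependents and several qualifications. SQLite does not enforce foreign keys unless PRAGMA foreign_keys = ON is issued on the connection.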
UNISA 2011