INF3703 EXAM PACK 2013-2019 SOLUTIONS
Contents
MAY JUNE 2013............................................................................................................................................. 3
QUESTION 1 .............................................................................................................................................. 3
QUESTION 2 .............................................................................................................................................. 3
QUESTION 3 .............................................................................................................................................. 7
QUESTION 4 .............................................................................................................................................. 8
QUESTION 5 .............................................................................................................................................. 9
QUESTION 6 ............................................................................................................................................ 11
QUESTION 7 ............................................................................................................................................ 12
OCTOBER NOVEMBER 2014 ........................................................................................................................ 13
QUESTION 1 ............................................................................................................................................ 13
QUESTION 2 ............................................................................................................................................ 14
QUESTION 5 ............................................................................................................................................ 19
MAY JUNE 2016........................................................................................................................................... 20
SECTION A ............................................................................................................................................... 20
Question 1 ............................................................................................................................................... 20
Question 2 ............................................................................................................................................... 20
Question 3 ............................................................................................................................................... 22
Question 4 ............................................................................................................................................... 23
Question 5 ............................................................................................................................................... 24
OCTOBER NOVEMBER 2016 / MAY JUNE 2017 ........................................................................................ 26
Question 1 ............................................................................................................................................... 27
Question 2 ............................................................................................................................................... 27
Question 3 ............................................................................................................................................... 30
Question 4 ............................................................................................................................................... 31
Question 5 ............................................................................................................................................... 32
MAY JUNE 2017....................................................................................................................................... 35
Question 1 ............................................................................................................................................... 35
Question 2 ............................................................................................................................................... 36
Question 3 ............................................................................................................................................... 38
Question 4 ............................................................................................................................................... 40
Location Transparency ................................................................................................................................. 40
Fragmentation Transparency.......................................................................................................................... 41
Replication Transparency.............................................................................................................................. 41
Question 5 ............................................................................................................................................... 42
OCTOBER NOVEMBER 2017 ........................................................................................................................ 44
Question 1 ............................................................................................................................................... 44
Question 3 ............................................................................................................................................... 48
Question 4 ............................................................................................................................................... 49
Question 5 ............................................................................................................................................... 51
MAY JUNE 2018........................................................................................................................................... 53
Question 1 ............................................................................................................................................... 53
QUESTION 2 ............................................................................................................................................ 54
Question 3 ............................................................................................................................................... 56
Question 4 ............................................................................................................................................... 58
Question 5 ............................................................................................................................................... 60
Question 6 ............................................................................................................................................... 69
Question 7 ............................................................................................................................................... 72
INF3703 EXAM PACK MAY JUNE 2019 ........................................................................................................ 76
QUESTION 2 ................................................................................................................................................ 78
QUESTION 3 ................................................................................................................................................ 80
QUESTION 4 ................................................................................................................................................ 83
QUESTION 5 ................................................................................................................................................ 89
MAY JUNE 2013
QUESTION 1
1.1 F Security is a disadvantage of a DDBMS.
1.2 F All remote transactions or requests access data at a single site.
1.3 F The label stateless system indicates that the web server does not know the status of its
clients at any given time.
1.4 T
1.5 F An enterprise database does not provide all the support for expected future operations.
1.6 F A computerized DBMS that is not properly designed cannot generate the best solutions
required by managers.
1.7 F Atomicity states that all operations of a transaction must be successfully completed or the
transaction is aborted.
1.8 F Only if both operations are reads (read-read).
1.9 T
1.10 T
QUESTION 2
2.1
A database transaction is formed by one or more database requests. Each database request
is the equivalent of a single SQL statement. The basic difference between a local
transaction and a distributed transaction is that the latter can update or request data from
several remote sites on a network. In a DDBMS, a database request and a database
transaction can be of two types: remote or distributed.
A remote request accesses data located at a single remote database processor (or DP
site). In other words, an SQL statement (or request) can reference data at only one remote
DP site.
A remote transaction, composed of several requests, accesses data at only a single
remote DP site.
A distributed transaction allows a transaction to reference several different local or
remote DP sites. Although each single request can reference only one local or remote DP
site, the complete transaction can reference multiple DP sites because each request can
reference a different site. Use Figure 12.12 to illustrate the distributed transaction.
A distributed request lets us reference data from several different DP sites. Since each
request can access data from more than one DP site, a transaction can access several DP
sites. The ability to execute a distributed request requires fully distributed database
processing because we must be able to:
 Partition a database table into several fragments.
 Reference one or more of those fragments with only one request. In other words, we must
have fragmentation transparency.
The distributed request feature also allows a single request to reference a physically partitioned
table
2.2
A fully distributed database management system must perform all of the functions of a
centralized DBMS, as follows:
1. Receive an application’s (or an end user’s) request.
2. Validate, analyze, and decompose the request. The request might include mathematical and/or
logical operations such as the following: Select all customers with a balance greater than $1,000.
The request might require data from only a single table, or it might require access to several
tables.
3. Map the request’s logical-to-physical data components.
4. Decompose the request into several disk I/O operations.
5. Search for, locate, read, and validate the data.
6. Ensure database consistency, security, and integrity.
7. Validate the data for the conditions, if any, specified by the request.
8. Present the selected data in the required format.
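As a quick illustration of the kind of request mentioned in step 2, here is a minimal SQL sketch,
assuming a hypothetical CUSTOMER table with a CUS_BALANCE column:

-- Select all customers with a balance greater than $1,000
SELECT *
FROM   CUSTOMER
WHERE  CUS_BALANCE > 1000;

In a fully distributed DBMS the same statement might touch fragments stored at several sites, but
the end user writes it exactly as if the table were stored locally.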
2.3
Data fragmentation allows you to break a single object into two or more segments, or
fragments. The object might be a user’s database, a system database, or a table. Each fragment
can be stored at any site over a computer network.
Horizontal fragmentation refers to the division of a relation into subsets (fragments) of tuples
(rows). Each fragment is stored at a different node, and each fragment has unique rows.
However, the unique rows all have the same attributes (columns). In short, each fragment
represents the equivalent of a SELECT statement, with the WHERE clause on a single attribute.
Vertical fragmentation refers to the division of a relation into attribute (column) subsets. Each
subset (fragment) is stored at a different node, and each fragment has unique columns—with the
exception of the key column, which is common to all fragments. This is the equivalent of the
PROJECT statement in SQL.
Mixed fragmentation refers to a combination of horizontal and vertical strategies. In other
words, a table may be divided into several horizontal subsets (rows), each one having a subset of
the attributes (columns).
2.4
PTA_properties, PTA_server, PRETORIA
PROP_ID | PROP_NAME    | PROP_ADDRESS   | PROP_LOCATION | PROP_PROVINCE | RENTAL_PRICE
1001    | SEQUIOR      | 10 CEILIERS ST | PRETORIA      | GAUTENG       | 6 000.00
1002    | JABAVU       | 14 MESSY ST    | PRETORIA      | GAUTENG       | 5 600.00

CT_properties, CT_server, CAPE TOWN
PROP_ID | PROP_NAME    | PROP_ADDRESS   | PROP_LOCATION | PROP_PROVINCE | RENTAL_PRICE
1003    | MIKE'S       | 20 RHODES ST   | CAPE TOWN     | WESTERN CAPE  | 8 000.00
1004    | ANN'S        | 40 SARAH ST    | CAPE TOWN     | WESTERN CAPE  | 7 600.00

PE_properties, PE_server, PORT ELIZABETH
PROP_ID | PROP_NAME    | PROP_ADDRESS   | PROP_LOCATION | PROP_PROVINCE | RENTAL_PRICE
1005    | ELIBETH      | 201 MITCELL ST | P.E           | EASTERN CAPE  | 6 000.00
1006    | GINA'S       | 14 MESSY       | P.E           | EASTERN CAPE  | 5 600.00

DBN_properties, DBN_server, DURBAN
PROP_ID | PROP_NAME    | PROP_ADDRESS   | PROP_LOCATION | PROP_PROVINCE | RENTAL_PRICE
1007    | PRESIDENTIAL | 151 TINA ST    | DURBAN        | KZN           | 6 700.00
1008    | EMBASSY      | 141 MESSY ST   | DURBAN        | KZN           | 5 675.00
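A hedged SQL sketch of how the PTA fragment above could be derived from a single base table,
assuming an unfragmented PROPERTY table with the columns shown (CREATE TABLE ... AS SELECT
syntax, supported by several DBMSs; the mechanism for allocating the fragment to the PTA_server
site is DBMS-specific):

-- Horizontal fragment for the Pretoria site: a subset of rows, all columns
CREATE TABLE PTA_properties AS
SELECT PROP_ID, PROP_NAME, PROP_ADDRESS, PROP_LOCATION, PROP_PROVINCE, RENTAL_PRICE
FROM   PROPERTY
WHERE  PROP_LOCATION = 'PRETORIA';

The CT, PE, and DBN fragments would use the same SELECT with a different WHERE value, which is
the "SELECT statement with the WHERE clause on a single attribute" pattern described in 2.3.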
QUESTION 3
3.1
The basic ODBC architecture has three main components:
A high-level ODBC API through which application programs access ODBC functionality.
A driver manager that is in charge of managing all database connections.
An ODBC driver that communicates directly with the database.
3.2
OLE-DB is database middleware that adds object-oriented functionality for access to relational
and nonrelational data.
The OLE-DB model is better understood when you divide its functionality into two types of
objects:
Consumers are objects (applications or processes) that request and use data. The data consumers
request data by invoking the methods exposed by the data provider objects (public interface) and
passing the required parameters.
Providers are objects that manage the connection with a data source and provide data to the
consumers.
Providers are divided into two categories: data providers and service providers.
- Data providers provide data to other processes. Database vendors create data provider objects
that expose the functionality of the underlying data source (relational, object-oriented, text, and
so on).
Service providers provide additional functionality to consumers. The service provider is located
between the data provider and the consumer. The service provider requests data from the data
provider, transforms the data, and then provides the transformed data to the data consumer
3.3
SQL data services (SDS) refers to a new wave of Internet-based data management services that
provide relational data storage, access, and management to companies of any size without the
typically high costs of in-house hardware, software, infrastructure and personnel. In effect, this
type of service leverages the Internet to provide:
Highly reliable and scalable relational database for a fraction of the cost,
High level of failure tolerance because data are normally distributed and replicated among
multiple servers,
Dynamic and automatic load balancing,
Automated data backup and disaster recovery is included with the service,
Dynamic creation and allocation of database processes and storage.
QUESTION 4
4.1
Coordinating, monitoring, and allocating resources; resources include people and data.
Defining goals and formulating strategic plans.
Interacting with end users by providing data and information.
Enforcing policies, standards, and procedures.
Managing security, privacy, and integrity.
Ensuring that data can be fully recovered.
Ensuring that data are distributed appropriately.
4.2
A dimension is a structure that categorizes facts and measures in order to enable users to answer
business questions. Commonly used dimensions are people, products, sales, place and time. In
a data warehouse, dimensions provide structured labelling information to otherwise unordered
numeric measures.
4.3
Requirements elicitation is the practice of collecting the requirements of a system from users,
customers and other stakeholders. Requirements elicitation is a part of
the requirements engineering process, usually followed by analysis and specification of
the requirements.
A requirement is a statement about an intended product that specifies what it should do or how it
should perform.
Functional requirements: what the system should do.
Non-functional requirements: what constraints there are on the system and its development (for
example, that a word processor must run on different platforms).
Data requirements: capture the type, volatility, size/amount, persistence, and accuracy of the
required data.
Environmental requirements: (a) the context of use; (b) the social environment (e.g., collaboration
and coordination); (c) how good user support is likely to be; (d) what technologies it will run on.
User requirements: capture the characteristics of the intended user group.
Usability requirements: usability goals and the associated measures for a particular product (more
information in Chapter 6).
QUESTION 5
5.1
Atomicity requires that all operations (SQL requests) of a transaction be completed; if not, the
transaction is aborted. If a transaction T1 has four SQL requests, all four requests must be
successfully completed; otherwise, the entire transaction is aborted. In other words, a transaction
is treated as a single, indivisible, logical unit of work.
Consistency indicates the permanence of the database’s consistent state. A transaction takes a
database from one consistent state to another consistent state. When a transaction is completed,
the database must be in a consistent state; if any of the transaction parts violates an integrity
constraint, the entire transaction is aborted.
Isolation means that the data used during the execution of a transaction cannot be used by a
second transaction until the first one is completed. In other words, if a transaction T1 is being
executed and is using the data item X, that data item cannot be accessed by any other transaction
(T2 ... Tn) until T1 ends.
Durability ensures that once transaction changes are done (committed), they cannot be undone
or lost, even in the event of a system failure.
Serializability ensures that the schedule for the concurrent execution of the transactions yields
consistent results. This property is important in multiuser and distributed databases, where
multiple transactions are likely to be executed concurrently. Naturally, if only a single
transaction is executed, serializability is not an issue.
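A minimal SQL sketch of atomicity and durability, assuming a hypothetical ACCOUNT table (the
transaction-control keywords vary slightly between DBMSs, e.g., START TRANSACTION in MySQL):

-- Transfer 500 between two accounts as one indivisible unit of work
BEGIN TRANSACTION;
UPDATE ACCOUNT SET ACC_BALANCE = ACC_BALANCE - 500 WHERE ACC_NUM = 1001;
UPDATE ACCOUNT SET ACC_BALANCE = ACC_BALANCE + 500 WHERE ACC_NUM = 1002;
COMMIT;        -- both updates become permanent (durable) together
-- ROLLBACK;   -- issued instead of COMMIT, this would undo both updates

Either both updates are made permanent or neither is; a failure before the COMMIT leaves the
database in its previous consistent state.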
Lock granularity indicates the level of lock use. Locking can take place at the following levels:
database, table, page, row, or even field (attribute).
In a database-level lock, the entire database is locked, thus preventing the use of any tables in
the database by transaction T2 while transaction T1 is being executed.
In a table-level lock, the entire table is locked, preventing access to any row by transaction T2 while
transaction T1 is using the table. If a transaction requires access to several tables, each table may
be locked. However, two transactions can access the same database as long as they access
different tables.
In a page-level lock, the DBMS will lock an entire disk page. A disk page, or page, is the
equivalent of a disk block, which can be described as a directly addressable section of a disk. A
page has a fixed size, such as 4K, 8K, or 16K.
A row-level lock is much less restrictive than the locks discussed earlier. The DBMS allows
concurrent transactions to access different rows of the same table even when the rows are located
on the same page.
The field-level lock allows concurrent transactions to access the same row as long as they
require the use of different fields (attributes) within that row.
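A hedged illustration of two of these granularities using Oracle-style syntax (other DBMSs differ;
the INVOICE table and INV_NUMBER column are hypothetical). The first statement takes a
table-level lock; the second acquires row-level locks only on the rows it selects:

LOCK TABLE INVOICE IN EXCLUSIVE MODE;   -- table-level lock on the whole table

SELECT *
FROM   INVOICE
WHERE  INV_NUMBER = 1009
FOR UPDATE;                             -- row-level lock on the selected row(s) only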
QUESTION 6
6a
The star schema is a data-modeling technique used to map multidimensional decision support
data into a relational database.
6b
Facts are numeric measurements (values) that represent a specific business aspect or activity.
For example, sales figures are numeric measurements that represent product and/or service sales.
Facts commonly used in business data analysis are units, costs, prices, and revenues.
Dimensions are qualifying characteristics that provide additional perspectives to a given fact.
Recall that dimensions are of interest because decision support data are almost always viewed in
relation to other data. For instance, sales might be compared by product from region to region
and from one time period to the next.
Each dimension table contains attributes. Attributes are often used to search, filter, or classify
facts. Dimensions provide descriptive characteristics about the facts through their attributes.
Attributes within dimensions can be ordered in a well-defined attribute hierarchy. The attribute
hierarchy provides a top-down data organization that is used for two main purposes:
aggregation and drill-down/roll-up data analysis. For example, Figure 13.15 shows
how the location dimension attributes can be organized in a hierarchy by region, state, city, and
store.
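A small sketch of what querying such a star schema looks like in SQL, assuming a SALES fact table
joined to hypothetical LOCATION_DIM and TIME_DIM dimension tables. The GROUP BY clause walks the
attribute hierarchy; replacing REGION with CITY would be a drill-down, and removing YEAR would be
a roll-up:

SELECT   L.REGION, T.YEAR, SUM(S.SALES_AMOUNT) AS TOTAL_SALES
FROM     SALES S
         JOIN LOCATION_DIM L ON S.LOC_ID  = L.LOC_ID
         JOIN TIME_DIM     T ON S.TIME_ID = T.TIME_ID
GROUP BY L.REGION, T.YEAR;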
6C
Time, region, salesperson and product.
6D
6.2
The creation of a database that provides fast and accurate answers to data analysis queries is the
data warehouse design’s prime objective. Therefore, performance-enhancement actions might
target query speed through the facilitation of SQL code as well as through better semantic
representation of business dimensions. These four techniques are often used to optimize data
warehouse design:
Normalizing dimensional tables.
Maintaining multiple fact tables to represent different aggregation levels.
Denormalizing fact tables.
Partitioning and replicating tables.
QUESTION 7
7.1
Database performance tuning refers to a set of activities and procedures designed to reduce the
response time of the database system—that is, to ensure that an end-user query is processed by
the DBMS in the minimum amount of time.
Client-side tuning
SQL performance tuning is evaluated from the client perspective. Therefore, the goal is to
illustrate some common
practices used to write efficient SQL code. A few words of caution are appropriate:
Most current-generation relational DBMSs perform automatic query optimization at the server
end.
Most SQL performance optimization techniques are DBMS-specific and, therefore, are rarely
portable, even
across different versions of the same DBMS. Part of the reason for this behavior is the constant
advancement
in database technologies.
Server-side tuning
DBMS performance tuning includes tasks such as managing the DBMS processes in primary
memory (allocating
memory for caching purposes) and managing the structures in physical storage (allocating space
for the data files).
The performance of a typical DBMS is constrained by three main factors: CPU processing
power, available primary memory (RAM), and input/output (hard disk and network) throughput.
7.2
What happens at the DBMS server end when the client’s SQL statement is received? In simple
terms, the DBMS processes a query in three phases:
1. Parsing. The DBMS parses the SQL query and chooses the most efficient access/execution
plan.
2. Execution. The DBMS executes the SQL query using the chosen execution plan.
3. Fetching. The DBMS fetches the data and sends the result set back to the client.
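Most DBMSs let you inspect the access/execution plan chosen during the parsing phase. A hedged
example using Oracle-style syntax, with a hypothetical CUSTOMER table (MySQL and PostgreSQL use a
bare EXPLAIN instead):

EXPLAIN PLAN FOR
SELECT * FROM CUSTOMER WHERE CUS_BALANCE > 1000;

-- MySQL / PostgreSQL equivalent:
-- EXPLAIN SELECT * FROM CUSTOMER WHERE CUS_BALANCE > 1000;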
OCTOBER NOVEMBER 2014
QUESTION 1
1.1 T
1.2 T
1.3 F
1.4 F
1.5 T
1.6 T
1.7 F
1.8 F
1.9 T
1.10 F
SECTION B
QUESTION 2
2.1
Based on Microsoft’s Component Object Model (COM), OLE-DB is database middleware that
adds object-oriented functionality for access to relational and non-relational data. OLE-DB was
the first part of Microsoft’s strategy to provide a unified object-oriented framework for the
development of next-generation applications.
OLE-DB is composed of a series of COM objects that provide low-level database connectivity
for applications. Because OLE-DB is based on COM, the objects contain data and methods, also
known as the interface. The OLE-DB model is better understood when you divide its
functionality into two types of objects:
Consumers are objects (applications or processes) that request and use data. The data consumers
request data by invoking the methods exposed by the data provider objects (public interface) and
passing the required parameters.
Providers are objects that manage the connection with a data source and provide data to the
consumers. Providers are divided into two categories:
data providers and service providers.
Data providers provide data to other processes. Database vendors create data provider objects
that expose the functionality of the underlying data source (relational, object-oriented, text, and
so on).
Service providers provide additional functionality to consumers. The service provider is located
between the data provider and the consumer. The service provider requests data from the data
provider, transforms the data, and then provides the transformed data to the data consumer. In
other words, the service provider acts like a data consumer of the data provider and as a data
provider for the data consumer (end-user application). For example, a service provider could
offer cursor management services, transaction management services, query processing services,
and indexing services.
For the diagram see page 809 of the textbook.
2.2
The CAP theorem, also named Brewer's theorem after computer scientist Eric Brewer, states
that it is impossible for a distributed computer system to simultaneously provide all three of the
following guarantees:
 Consistency (every read receives the most recent write or an error)
 Availability (every request receives a response, without guarantee that it contains the
most recent version of the information)
 Partition tolerance (the system continues to operate despite arbitrary partitioning due to
network failures)
In other words, the CAP theorem states that in the presence of a network partition, one has to
choose between consistency and availability.
2.3
Query optimization is the central activity during the parsing phase in query processing. In this
phase, the DBMS must choose what indexes to use, how to perform join operations, which table
to use first, and so on. Each DBMS has its own algorithms for determining the most efficient
way to access the data. The query optimizer can operate in one of two modes:
A rule-based optimizer uses preset rules and points to determine the best approach to execute a
query. The rules assign a “fixed cost” to each SQL operation; the costs are then added to yield
the cost of the execution plan. For example, a full table scan has a set cost of 10, while a table
access by row ID has a set cost of 3.
A cost-based optimizer uses sophisticated algorithms based on the statistics about the objects
being accessed to determine the best approach to execute a query. In this case, the optimizer
process adds up the processing cost, the I/O costs, and the resource costs (RAM and temporary
space) to come up with the total cost of a given execution plan.
The optimizer’s objective is to find alternative ways to execute a query—to evaluate the “cost”
of each alternative and then to choose the one with the lowest cost.
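Cost-based optimization depends on up-to-date statistics about the tables and indexes being
accessed. A hedged sketch of how those statistics are refreshed, with DBMS-specific commands and a
hypothetical CUSTOMER table:

-- Oracle (older syntax; the DBMS_STATS package is the modern interface)
ANALYZE TABLE CUSTOMER COMPUTE STATISTICS;
-- PostgreSQL
ANALYZE CUSTOMER;
-- MySQL
ANALYZE TABLE CUSTOMER;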
QUESTION 3
3.1
Lock granularity indicates the level of lock use. Locking can take place at the following levels:
database, table, page, row, or even field (attribute).
3.2
In a database-level lock, the entire database is locked, thus preventing the use of any tables in
the database by transaction T2 while transaction T1 is being executed.
In a table-level lock, the entire table is locked, preventing access to any row by transaction T2 while
transaction T1 is using the table. If a transaction requires access to several tables, each table may
be locked. However, two transactions can access the same database as long as they access
different tables.
In a page-level lock, the DBMS will lock an entire disk page. A disk page, or page, is the
equivalent of a disk block, which can be described as a directly addressable section of a disk. A
page has a fixed size, such as 4K, 8K, or 16K.
A row-level lock is much less restrictive than the locks discussed earlier. The DBMS allows
concurrent transactions to access different rows of the same table even when the rows are located
on the same page.
The field-level lock allows concurrent transactions to access the same row as long as they
require the use of different fields (attributes) within that row.
3.3
A full backup of the database, or dump of the entire database. In this case, all database objects
are backed up in their entirety.
A differential backup of the database, in which only the last modifications to the database (when
compared with a previous full backup copy) are copied. In this case, only the objects that have
been updated since the last full backup are backed up.
A transaction log backup, which backs up only the transaction log operations that are not
reflected in a previous backup copy of the database. In this case, only the transaction log is
backed up; no other database objects are backed up. (For a complete explanation of the use of the
transaction log, see Chapter 10, Transaction Management and Concurrency Control.)
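A hedged sketch of the three backup types using SQL Server T-SQL syntax (other DBMSs typically use
their own utilities rather than SQL statements); the database name and file paths are placeholders:

BACKUP DATABASE SaleCo TO DISK = 'D:\backups\saleco_full.bak';                    -- full backup
BACKUP DATABASE SaleCo TO DISK = 'D:\backups\saleco_diff.bak' WITH DIFFERENTIAL;  -- differential backup
BACKUP LOG SaleCo TO DISK = 'D:\backups\saleco_log.trn';                          -- transaction log backup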
3.4
Data fragmentation allows you to break a single object into two or more segments, or fragments.
The object might be a user’s database, a system database, or a table. Each fragment can be stored
at any site over a computer network.
Horizontal fragmentation refers to the division of a relation into subsets (fragments) of tuples
(rows). Each fragment is stored at a different node, and each fragment has unique rows.
However, the unique rows all have the same attributes (columns). In short, each fragment
represents the equivalent of a SELECT statement, with the WHERE clause on a single attribute.
Vertical fragmentation refers to the division of a relation into attribute (column) subsets. Each
subset (fragment) is stored at a different node, and each fragment has unique columns—with the
exception of the key column, which is common to all fragments. This is the equivalent of the
PROJECT statement in SQL.
Mixed fragmentation refers to a combination of horizontal and vertical strategies. In other words,
a table may be divided into several horizontal subsets (rows), each one having a subset of the
attributes (columns).
3.5
The DDBMS must include at least the following components:
Computer workstations or remote devices (sites or nodes) that form the network system. The
distributed database system must be independent of the computer system hardware.
Network hardware and software components that reside in each workstation or device. The
network components allow all sites to interact and exchange data. Because the components—
computers, operating systems, network hardware, and so on—are likely to be supplied by
different vendors, it is best to ensure that distributed database functions can be run on multiple
platforms.
Communications media that carry the data from one node to another. The DDBMS must be
communications media-independent; that is, it must be able to support several types of
communications media.
The transaction processor (TP), which is the software component found in each computer or
device that requests data. The transaction processor receives and processes the application’s data
requests (remote and local). The TP is also known as the application processor (AP) or the
transaction manager (TM).
The data processor (DP), which is the software component residing on each computer or
device that stores and retrieves data located at the site. The DP is also known as the data manager
(DM). A data processor may even be a centralized DBMS.
QUESTION 5
5.1
Page 519 of the textbook or slides 19 onwards for chapter 10.
5.2
There are two classical approaches to database design:
Top-down design starts by identifying the data sets and then defines the data elements for each
of those sets. This process involves the identification of different entity types and the definition
of each entity’s attributes.
Bottom-up design first identifies the data elements (items) and then groups them together in data
sets. In other words, it first defines attributes, and then groups them to form entities.
See figure 536
5.3
Managerial role
The DBA is responsible for:
Coordinating, monitoring, and allocating database administration resources: people and data.
Defining goals and formulating strategic plans for the database administration function.
See page 558.
MAY JUNE 2016
SECTION A
Question 1
1.1 F (minimum amount of resources on the server side)
1.2 F (Convenience is not a property of transaction)
1.3 T
1.4 T
1.5 F (ODBC only connects to relational database)
1.6 T
1.7 T
1.8 T
1.9 F (That is the definition of data mart not data mining)
1.10 T
SECTION B
Question 2
2.1
DATABASE INITIAL STUDY
If a designer has been called in, chances are the current system has failed to perform functions
deemed vital by the company. (You don’t call the plumber unless the pipes leak.) So, in addition
to examining the current system’s operation within the company, the designer must determine
how and why the current system fails. That means spending a lot of time talking with (but mostly
listening to) end users. Although database design is a technical business, it is also people-oriented.
Database designers must be excellent communicators, and they must have finely tuned
interpersonal skills.
Depending on the complexity and scope of the database environment, the database designer
might be a lone operator or part of a systems development team composed of a project leader,
one or more senior systems analysts, and one or more junior systems analysts. The word designer
is used generically here to cover a wide range of design team compositions.
The overall purpose of the database initial study is to:
 Analyze the company situation.
 Define problems and constraints.
 Define objectives.
 Define scope and boundaries.
Figure 9-4 depicts the interactive and iterative processes required to complete the first phase of
the DBLC successfully.
As you examine Figure 9.4, note that the database initial study phase leads to the development of
the database system objectives. Using Figure 9.4 as a discussion template, let’s examine each of
its components in greater detail.
2.2)
The technical aspects of the DBA’s job are rooted in the following areas of operation:
 Evaluating, selecting, and installing the DBMS and related utilities.
 Designing and implementing databases and applications.
 Testing and evaluating databases and applications.
 Operating the DBMS, utilities, and applications.
 Training and supporting users.
 Maintaining the DBMS, utilities, and applications.
The following sections will explore the details of those operational areas.
See page 558.
2.3)
Desired DBA Skills
TECHNICAL:
Broad data-processing background
Systems Development Life Cycle knowledge
Structured methodologies:
Data flow diagrams
Structure charts
Programming languages
Database Life Cycle knowledge
Database modeling and design skills
Conceptual
Logical
Physical
Operational skills: Database implementation, data
dictionary management, security, and so on
Question 3
3.1)
Conceptual design is the first stage in the database design process. The goal at this stage is to
design a database that is independent of database software and physical details. The output of
this process is a conceptual data model that describes the main data entities, attributes,
relationships, and constraints of a given problem domain. This design is descriptive and narrative
in form. That is, it is generally composed of a graphical representation as well as textual
descriptions of the main data elements, relationships, and constraints.
Conceptual design steps
1 Data analysis and requirements
2 Entity relationship modeling and normalization
3 Data model verification
4 Distributed database design
3.2)
1 Identify, analyze, and refine the business rules.
2 Identify the main entities, using the results of Step 1.
3 Define the relationships among the entities, using the results of Steps 1 and 2.
4 Define the attributes, primary keys, and foreign keys for each of the entities.
5 Normalize the entities. (Remember that entities are implemented as tables in an RDBMS.)
6 Complete the initial ER diagram.
7 Validate the ER model against the end users’ information and processing requirements.
8 Modify the ER model, using the results of Step 7.
Refer to section 9.3 of the textbook
3.3)
What happens at the DBMS server end when the client’s SQL statement is received? In simple
terms, the DBMS
processes a query in three phases:
1. Parsing. The DBMS parses the SQL query and chooses the most efficient access/execution
plan.
2. Execution. The DBMS executes the SQL query using the chosen execution plan.
3. Fetching. The DBMS fetches the data and sends the result set back to the client.
The processing of SQL DDL statements (such as CREATE TABLE) is different from the
processing required by DML statements. The difference is that a DDL statement actually updates
the data dictionary tables or system catalog, while a DML statement (SELECT, INSERT,
UPDATE, and DELETE) mostly manipulates end-user data.
Figure 11.2 shows the general steps required for query processing. Each of the steps will be
discussed in the following sections.
Refer to page 660 Query processing
Question 4
4.1)
Atomicity requires that all operations (SQL requests) of a transaction be completed; if not, the
transaction is aborted. If a transaction T1 has four SQL requests, all four requests must be
successfully completed; otherwise, the entire transaction is aborted. In other words, a transaction
is treated as a single, indivisible, logical unit of work.
Consistency indicates the permanence of the database’s consistent state. A transaction takes a
database from one consistent state to another consistent state. When a transaction is completed,
the database must be in a consistent state; if any of the transaction parts violates an integrity
constraint, the entire transaction is aborted.
Isolation means that the data used during the execution of a transaction cannot be used by a
second transaction until the first one is completed. In other words, if a transaction T1 is being
executed and is using the data item X, that data item cannot be accessed by any other transaction
(T2 ... Tn) until T1 ends.
Durability ensures that once transaction changes are done (committed), they cannot be undone
or lost, even in the event of a system failure.
Serializability ensures that the schedule for the concurrent execution of the transactions yields
consistent results. This property is important in multiuser and distributed databases, where
multiple transactions are likely to be executed concurrently. Naturally, if only a single
transaction is executed, serializability is not an issue.
4.2)
CONCURRENCY CONTROL
The coordination of the simultaneous execution of transactions in a multiuser database system is
known as
concurrency control. The objective of concurrency control is to ensure the serializability of
transactions in a multiuser database environment. Concurrency control is important because the
simultaneous execution of transactions over a shared database can create several data integrity
and consistency problems. The three main problems are lost updates, uncommitted data, and
inconsistent retrievals.
Question 5
5.1)
Whether the database is centralized or distributed, the design principles and concepts described
in Chapter 3, The Relational Database Model; Chapter 4, Entity Relationship (ER) Modeling;
and Chapter 6, Normalization of Database Tables, are still applicable. However, the design of a
distributed database introduces three new issues:
 How to partition the database into fragments.
 Which fragments to replicate.
 Where to locate those fragments and replicas.
Data fragmentation and data replication deal with the first two issues, and data allocation deals
with the third issue.
12.11.1 Data Fragmentation
Data fragmentation allows you to break a single object into two or more segments, or
fragments. The object might be a user’s database, a system database, or a table. Each fragment
can be stored at any site over a computer network. Information about data fragmentation is stored
in the distributed data catalog (DDC), from which it is accessed by the TP to process user
requests.
 Horizontal fragmentation refers to the division of a relation into subsets (fragments) of tuples
(rows). Each fragment is stored at a different node, and each fragment has unique rows.
However, the unique rows all have the same attributes (columns). In short, each fragment
represents the equivalent of a SELECT statement, with the WHERE clause on a single attribute.
 Vertical fragmentation refers to the division of a relation into attribute (column) subsets.
Each subset (fragment) is stored at a different node, and each fragment has unique columns—
with the exception of the key column, which is common to all fragments. This is the equivalent
of the PROJECT statement in SQL.
 Mixed fragmentation refers to a combination of horizontal and vertical strategies. In other
words, a table may be divided into several horizontal subsets (rows), each one having a subset of
the attributes (columns).
Data replication refers to the storage of data copies at multiple sites served by a computer
network. Fragment copies can be stored at several sites to serve specific information
requirements. Because the existence of fragment copies can enhance data availability and
response time, data copies can help to reduce communication and total query costs.
Suppose database A is divided into two fragments, A1 and A2. Within a replicated distributed
database, the scenario depicted in Figure 12.19 is possible: fragment A1 is stored at sites S1 and
S2, while fragment A2 is stored at sites S2 and S3.
Replicated data are subject to the mutual consistency rule. The mutual consistency rule requires
that all copies of data fragments be identical. Therefore, to maintain data consistency among the
replicas, the DDBMS must ensure that a database update is performed at all sites where replicas
exist.
Three replication scenarios exist: a database can be fully replicated, partially replicated, or
unreplicated.
 A fully replicated database stores multiple copies of each database fragment at multiple
sites. In this case, all database fragments are replicated. A fully replicated database can
be impractical due to the amount of overhead it imposes on the system.
 A partially replicated database stores multiple copies of some database fragments at
multiple sites. Most DDBMSs are able to handle the partially replicated database well.
 An unreplicated database stores each database fragment at a single site. Therefore, there
are no duplicate database fragments.
Data allocation describes the process of deciding where to locate data. Data allocation strategies
are as follows:
 With centralized data allocation, the entire database is stored at one site.
 With partitioned data allocation, the database is divided into two or more disjointed
parts (fragments) and stored at two or more sites.
 With replicated data allocation, copies of one or more database fragments are stored at
several sites. Data distribution over a computer network is achieved through data
partitioning, through data replication, or through a combination of both. Data allocation is
closely related to the way a database is divided or fragmented. Most data allocation
studies focus on one issue: which data to locate where.
Data allocation algorithms take into consideration a variety of factors, including:
 Performance and data availability goals.
 Size, number of rows, and number of relations that an entity maintains with other entities.
 Types of transactions to be applied to the database and the attributes accessed by each of
those transactions.
 Disconnected operation for mobile users. In some cases, the design might consider the
use of loosely disconnected fragments for mobile users, particularly for read-only data
that does not require frequent updates and for which the replica update windows (the
amount of time available to perform a certain data processing task that cannot be
executed concurrently with other tasks) may be longer.
OCTOBER NOVEMBER 2016 / MAY JUNE 2017
SECTION A
Question 1
1.1 T
1.2 T
1.3 F Partition processing and transaction processing are not part of the twelve commandments.
1.4 F Introduce changes is under maintenance and evolution
1.5 T
1.6 T
1.7 F A data mart is a small, single-subject data warehouse subset that provides decision
support to a small group of people.
1.8 T
1.9 F The first stage is to map the conceptual model to logical model components.
1.10 T
SECTION B
Question 2
2.1)
 Implement file system security.
 Implement share access security.
 Use access permission.
 Encrypt data at the file system or database level.
 Install firewalls.
 Virtual Private Networks (VPN).
 Intrusion Detection Systems (IDS).
 Network Access Control (NAC).
 Network activity monitoring.
 Test application programs extensively.
 Build safeguards into code.
 Do extensive vulnerability testing in applications.
 Install spam filter/antivirus for e-mail system.
 Use secure coding techniques
 Educate users about social engineering attacks.
 Apply OS security patches and updates.
 Apply application server patches.
 Install antivirus and antispyware software.
 Enforce audit trails on the computers.
 Perform periodic system backups.
 Install only authorized applications.
 Use group policies to prevent unauthorized installs.
Refer to section 15.6.2 of the textbook
2.2)
The success of the overall information systems strategy, and therefore, of the data administration
strategy depends on several critical success factors. Understanding the critical success factors
helps the DBA develop a successful corporate
data administration strategy. Critical success factors include managerial, technological, and
corporate culture issues,
such as:
 Management commitment. Top-level management commitment is necessary to enforce
the use of standards,
procedures, planning, and controls. The example must be set at the top.
 Thorough company situation analysis. The current situation of the corporate data
administration must be
analyzed to understand the company’s position and to have a clear vision of what must be done.
For example,
how are database analysis, design, documentation, implementation, standards, codification, and
other issues
handled? Needs and problems should be identified first, then prioritized.
 End-user involvement. End-user involvement is another aspect critical to the success of
the data administration
strategy. What is the degree of organizational change involved? Successful organizational
change requires that
people be able to adapt to the change. Users should be given an open communication channel to
upper-level
management to ensure success of the implementation. Good communication is key to the overall
process.
 Defined standards. Analysts and programmers must be familiar with appropriate
methodologies, procedures,
and standards. If analysts and programmers lack familiarity, they might need to be trained in the
use of the
procedures and standards.
 Training. The vendor must train the DBA personnel in the use of the DBMS and other
tools. End users must
be trained to use the tools, standards, and procedures to obtain and demonstrate the maximum
benefit, thereby
increasing end-user confidence. Key personnel should be trained first so that they can train
others.
 A small pilot project. A small project is recommended to ensure that the DBMS will
work in the company,
that the output is what was expected, and that the personnel have been trained properly.
Refer to section 15.8 of the textbook
2.3)
There are two classical approaches to database design:
 Top-down design starts by identifying the data sets and then defines the data elements
for each of those sets.
This process involves the identification of different entity types and the definition of each
entity’s attributes.
 Bottom-up design first identifies the data elements (items) and then groups them
together in data sets. In
other words, it first defines attributes, and then groups them to form entities.
Refer to section 9.8 of the textbook
Question 3
3.1)
 Cloud computing is a computing model for enabling ubiquitous, convenient, on-demand
network access to a shared pool of configurable computing resources (e.g., networks,
servers, storage, applications, and services) that can be rapidly provisioned and released
with minimal management effort or service provider interaction.
 The term cloud services is used in this book to refer to the services provided by cloud
computing.
3.2)
 Public cloud: This type of cloud infrastructure is built by a third-party organization to sell cloud
services to the general public.
 Private cloud: This type of internal cloud is built by an organization for the sole purpose
of servicing its own needs.
 Community cloud: This type of cloud is built by and for a specific group of organizations
that share a common trade, such as agencies of the federal government, the military, or
higher education.
3.3)
Developing the Conceptual Model Using ER Diagrams
STEP ACTIVITY
1 Identify, analyze, and refine the business rules.
2 Identify the main entities, using the results of Step 1.
3 Define the relationships among the entities, using the results of Steps 1 and 2.
4 Define the attributes, primary keys, and foreign keys for each of the entities.
5 Normalize the entities. (Remember that entities are implemented as tables in an RDBMS.)
6 Complete the initial ER diagram.
7 Validate the ER model against the end users’ information and processing requirements.
8 Modify the ER model, using the results of Step 7.
Refer to section 9.3 of the textbook
Question 4
4.1)
Database transaction recovery uses data in the transaction log to recover a database from an
inconsistent state to a
consistent state.
Before continuing, let’s examine four important concepts that affect the recovery process:
 The write-ahead-log protocol ensures that transaction logs are always written before any
database data are
actually updated. This protocol ensures that, in case of a failure, the database can later be
recovered to a
consistent state, using the data in the transaction log.
 Redundant transaction logs (several copies of the transaction log) ensure that a physical
disk failure will not
impair the DBMS’s ability to recover data.
 Database buffers are temporary storage areas in primary memory used to speed up disk
operations. To
improve processing time, the DBMS software reads the data from the physical disk and stores a
copy of it on
a “buffer” in primary memory. When a transaction updates data, it actually updates the copy of
the data in the
buffer because that process is much faster than accessing the physical disk every time. Later on,
all buffers that
contain updated data are written to a physical disk during a single operation, thereby saving
significant
processing time.
 Database checkpoints are operations in which the DBMS writes all of its updated buffers
to disk. While this is happening, the DBMS does not execute any other requests. A
checkpoint operation is also registered in the transaction log. As a result of this operation,
the physical database and the transaction log will be in sync. This
synchronization is required because update operations update the copy of the data in the buffers
and not in the physical database. Checkpoints are automatically scheduled by the DBMS several
times per hour. As you will see next, checkpoints also play an important role in transaction
recovery.
The database recovery process involves bringing the database to a consistent state after a failure.
Transaction recovery procedures generally make use of deferred-write and write-through
techniques.
Refer to section 10.6.1 of the textbook
4.2)
Shared/Exclusive Locks
The labels “shared” and “exclusive” indicate the nature of the lock. An exclusive lock exists
when access is reserved specifically for the transaction that locked the object. The exclusive lock
must be used when the potential for conflict exists. (See Table 10.11, Read vs. Write.) A shared
lock exists when concurrent transactions are granted read access on the basis of a common lock.
A shared lock produces no conflict as long as all the concurrent transactions are read-only.
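A brief Oracle-style illustration of the two lock modes (syntax varies by DBMS; the PRODUCT table is
hypothetical). Several read-only transactions can hold the shared lock at the same time, but the
exclusive lock can be held by only one transaction:

LOCK TABLE PRODUCT IN SHARE MODE;       -- shared lock: concurrent readers are allowed
LOCK TABLE PRODUCT IN EXCLUSIVE MODE;   -- exclusive lock: reserved for a single transaction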
Question 5
5.1
a) During query optimization, the DBMS must choose what indexes to use, how to perform join
operations, which table to use first, and so on. Each DBMS has its own algorithms for
determining the most efficient way to access the data. The two most common approaches are
rule-based and cost-based optimization.
 A rule-based optimizer uses preset rules and points to determine the best approach to
execute a query. The rules assign a “fixed cost” to each SQL operation; the costs are then
added to yield the cost of the execution plan.
 A cost-based optimizer uses sophisticated algorithms based on the statistics about the
objects being accessed to determine the best approach to execute a query. In this case, the
optimizer process adds up the processing cost, the I/O costs, and the resource costs (RAM
and temporary space) to come up with the total cost of a given execution plan.
b) SQL performance tuning deals with writing queries that make good use of the statistics. In
particular, queries should
make good use of indexes. Indexes are very useful when you want to select a small subset of
rows from a large table based on a condition. When an index exists for the column used in the
selection, the DBMS may choose to use it. The objective is to create indexes with high
selectivity. Index selectivity is a measure of how likely an index will be used in query
processing. It is also important to write conditional statements using some common principles.
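A minimal sketch, assuming a hypothetical CUSTOMER table: an index on a highly selective column
such as the customer number is likely to be used by the optimizer, whereas an index on a
low-selectivity column such as gender usually is not worthwhile:

CREATE INDEX IDX_CUS_NUMBER ON CUSTOMER (CUS_NUMBER);  -- high selectivity: good index candidate
-- An index on CUSTOMER(CUS_GENDER) would have low selectivity and would rarely be used.

SELECT * FROM CUSTOMER WHERE CUS_NUMBER = 10011;        -- can be resolved through IDX_CUS_NUMBER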
DBMS performance tuning includes tasks such as managing the DBMS processes in primary
memory (allocating memory for caching purposes) and managing the structures in physical
storage (allocating space for the data files).
c)
The data cache or buffer cache is a shared, reserved memory area that stores the most recently
accessed data blocks in RAM. The data cache is where the data read from the database data files
are stored after the data have been read or before the data are written to the database data files.
The data cache also caches system catalog data and the contents of the indexes.
 The SQL cache, or procedure cache, is a shared, reserved memory area that stores the
most recently executed SQL statements or PL/SQL procedures, including triggers and
functions. (To learn more about PL/SQL procedures, triggers, and SQL functions, study
Chapter 8, Advanced SQL.) The SQL cache does not store the end-user-written SQL.
Rather, the SQL cache stores a “processed” version of the SQL that is ready for execution
by the DBMS.
5.2)
A fully distributed database management system must perform all of the functions of a
centralized DBMS, as follows:
1. Receive an application’s (or an end user’s) request.
2. Validate, analyze, and decompose the request. The request might include mathematical and/or
logical operations such as the following: Select all customers with a balance greater than $1,000.
The request might require data from only a single table, or it might require access to several
tables.
3. Map the request’s logical-to-physical data components.
4. Decompose the request into several disk I/O operations.
5. Search for, locate, read, and validate the data.
6. Ensure database consistency, security, and integrity.
7. Validate the data for the conditions, if any, specified by the request.
8. Present the selected data in the required format.
5.3
The creation of a database that provides fast and accurate answers to data analysis queries is the
data warehouse design’s prime objective. Therefore, performance-enhancement actions might
target query speed through the facilitation of SQL code as well as through better semantic
representation of business dimensions. These four techniques are often used to optimize data
warehouse design:
• Normalizing dimensional tables.
• Maintaining multiple fact tables to represent different aggregation levels.
• Denormalizing fact tables.
• Partitioning and replicating tables.
Refer to section 13.7.6 of the textbook
MAY JUNE 2017
MAY JUNE inf3703
SECTION A
Question 1
1.1 T
1.2. T
1.3. T
1.4.T
1.5. T
1.6. T
1.7.T
1.8.T
1.9.F
1.10.T
SECTION B
Question 2 [22]
2.1 ABCM is a new start-up company with its head office based in Johannesburg, South
Africa (SA). The company also has other branches in different SA provinces and abroad.
ABCM sells hand-crafted traditional materials from different African countries. The
company has outsourced the development of its database to you and your team. The
database was well developed and ABCM liked the end product, particularly how your team
incorporated the internet (web-to-database middleware) in the database to enable their
end-users (ABCM staff at the head office, across provinces, and countries) to
interact with the company's database over the web.
Draw a figure and use it to explain, in a step-by-step process, to your client ABCM how the
Web-to-database middleware enables/allows end-users (ABCM employees) to interact with the
database over the web. Ensure you explain the process in words. (16)
A database server-side extension program is also known as Web-to-database middleware.
The figure shows the interaction between the browser, the Web server, and the
Web-to-database middleware.
1. The client browser sends a page request to the Web server.
2. The Web server receives and validates the request. In this case, the server will pass the request
to the Web-to-database middleware for processing. Generally, the requested page contains some
type of scripting language to enable the database interaction.
3. The Web-to-database middleware reads, validates, and executes the script. In this case, it
connects to the database and passes the query using the database connectivity layer.
4. The database server executes the query and passes the result back to the Web-to-database
middleware.
5. The Web-to-database middleware compiles the result set, dynamically generates an HTML-formatted page that includes the data retrieved from the database, and sends it to the Web server.
6. The Web server returns the just-created HTML page, which now includes the query result, to
the client browser.
7. The client browser displays the page on the local computer.
2.2 Object Linking and Embedding for Database (OLE-DB) is a database middleware that
adds object-oriented functionality for access to relational and non-relational data. Identify
the main types of objects in the OLE-DB model. Explain each of them. (6)
Consumers are objects (applications or processes) that request and use data. Consumers
request data by invoking the methods exposed by the data provider objects (public
interface) and passing the required parameters.
Providers are objects that manage the connection with a data source and provide data to the
consumers.
Providers are divided into two categories: data providers and service providers.
Data providers provide data to other processes. Database vendors create data provider objects
that expose the functionality of the underlying data source (relational, object-oriented, text, and
so on).
Service providers provide functionality to consumers. The service provider is located between
the data provider and the consumer.
Question 3 [14]
3.1 Explain in detail what a data mart is and provide reasons why some organizations
prefer to create a data mart instead of a data warehouse. (6)
A data mart is a simple form of a data warehouse that is focused on a single subject (or
functional area), such as Sales or Finance or Marketing.
Data marts are often built and controlled by a single department within an organization. Given
their single-subject focus, data marts usually draw data from only a few sources.
The sources could be internal operational systems, a central data warehouse, or external data.
Nevertheless, data marts are typically smaller and less complex than data warehouses; hence,
they are typically easier to build and maintain.
Comparison of a data warehouse and a data mart:

                        Data warehouse        Data mart
Scope                   Corporate             Line of business
Subjects                Multiple              Single subject
Data sources            Many                  Few
Size                    100 GB to TB+         Less than 100 GB
Implementation time     Months to years       Less than a month
3.2 Elvis Velekhaya is a manager of a small multimedia distribution company. Due to the
fact that the business of the company is growing fast, Elvis identifies that it is time to
manage the massive information pool to help assist and guide the accelerating growth.
As a newly employed database designer in the company, Elvis has asked you to develop a data
warehouse application that will enable the company employees, particularly himself, to
study sales figures by year, region, salesperson and product.
Draw a star schema diagram for the multimedia distribution company data warehouse. The
diagram should have at least two attributes, primary and foreign keys. (8)
Each dimension in a star schema is represented with only one dimension table. This dimension
table contains the set of attributes. The following diagram shows the sales data of a company
with respect to the four dimensions, namely time, item, branch, and location.
There is a fact table at the center. It contains the keys to each of the four dimensions. The fact table
also contains the attributes, namely dollars sold and units sold. Note: Each dimension has only
one dimension table and each table holds a set of attributes. For example, the location dimension
table contains the attribute set {location_key, street, city, province_or_state, country}. This
constraint may cause data redundancy. For example, "Vancouver" and "Victoria" are both cities
in the Canadian province of British Columbia, so the entries for such cities may cause data
redundancy along the attributes province_or_state and country.
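A minimal SQL sketch of such a star schema is given below. All table and column names are hypothetical and are only meant to show the structure: each dimension table has its own primary key, and the fact table references every dimension through a foreign key.

-- Dimension tables (one table per dimension)
CREATE TABLE DIM_TIME (
  time_key     INTEGER PRIMARY KEY,
  sales_year   INTEGER,
  sales_month  INTEGER
);

CREATE TABLE DIM_REGION (
  region_key   INTEGER PRIMARY KEY,
  region_name  VARCHAR(50),
  country      VARCHAR(50)
);

CREATE TABLE DIM_SALESPERSON (
  salesperson_key  INTEGER PRIMARY KEY,
  salesperson_name VARCHAR(50),
  branch           VARCHAR(50)
);

CREATE TABLE DIM_PRODUCT (
  product_key   INTEGER PRIMARY KEY,
  product_name  VARCHAR(50),
  product_line  VARCHAR(50)
);

-- Fact table at the centre of the star
CREATE TABLE FACT_SALES (
  time_key         INTEGER REFERENCES DIM_TIME (time_key),
  region_key       INTEGER REFERENCES DIM_REGION (region_key),
  salesperson_key  INTEGER REFERENCES DIM_SALESPERSON (salesperson_key),
  product_key      INTEGER REFERENCES DIM_PRODUCT (product_key),
  units_sold       INTEGER,
  sales_amount     DECIMAL(12,2),
  PRIMARY KEY (time_key, region_key, salesperson_key, product_key)
);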
Question 4 [18]
4.1 Distributed transparency allows a physically dispersed database to be managed as
though it were a centralized database. The level of transparency supported by the DDBMS
varies from system to system. Briefly explain the three (3) levels of distribution
transparency. (6)
Location Transparency
Location transparency ensures that the user can query any table(s) or fragment(s) of a table
as if they were stored locally at the user's site. The end user should be completely oblivious to the
fact that the table or its fragments are stored at a remote site in the distributed database system.
The address of the remote site(s) and the access mechanisms are completely hidden.
In order to incorporate location transparency, DDBMS should have access to updated and
accurate data dictionary and DDBMS directory which contains the details of locations of data.
Fragmentation Transparency
Fragmentation transparency enables users to query upon any table as if it were unfragmented.
Thus, it hides the fact that the table the user is querying on is actually a fragment or union of
some fragments. It also conceals the fact that the fragments are located at diverse sites.
This is somewhat similar to users of SQL views, where the user may not know that they are
using a view of a table instead of the table itself.
Replication Transparency
Replication transparency ensures that the replication of databases is hidden from the users. It
enables users to query upon a table as if only a single copy of the table exists.
Replication transparency is associated with concurrency transparency and failure transparency.
Whenever a user updates a data item, the update is reflected in all the copies of the table.
However, this operation should not be known to the user. This is concurrency transparency.
Also, in case of failure of a site, the user can still proceed with his queries using replicated
copies without any knowledge of failure. This is failure transparency.
4.2 Discuss the three properties of the CAP theorem. (6)
The three letters of CAP denote consistency, availability and partition tolerance, which are also
the three main characteristics of a distributed system with replicated data.
• Availability in CAP refers to the successful handling of a request to perform a read
or write operation, or the display of a message that the operation cannot be performed.
• Partition tolerance implies that even in the case of a network fault the
distributed system can keep functioning. Such network faults result in nodes being
partitioned.
• Consistency implies that the multiple identical copies of replicated data are made
available to the nodes.
4.3 The DBMS processes a query in three phases. Identify those phases, and outline what is
accomplished in each phase. (6)
Parsing – the DBMS parses the SQL query and chooses the most efficient access/execution
plan.
Execution – the DBMS executes the SQL query using the chosen execution plan.
Fetching – the DBMS fetches the data and sends the result set back to the client.
Question 5 [26]
5.1 Suppose the database system of your organization has failed. Do the following:
a. Explain to the management of your organization what database recovery process is. (2)
• Database recovery is the process of restoring data that has been lost, accidentally
deleted, corrupted or made inaccessible, returning the database to a previously consistent state.
b. Describe to them the database recovery process you will follow to recover the database.
(6)
Using Transaction Log Backups
With the combination of full database and differential backups, it is possible to take snapshots of
the data and recover them. But in some situations, it is also desirable to have backups of all
events that have occurred in a database, like a record of every single statement executed. With
such functionality, it would be possible to recover the database to any state required. Transaction
log backups provide this functionality. As its name suggests, the transaction log backup is a
backup of transaction log entries and contains all transactions that have happened to the
database. The main advantages of transaction log backups are as follows:
• Transaction log backups allow you to recover the database to a specific point in time.
• Because transaction log backups are backups of log entries, it is even possible to perform
a backup from a transaction log if the data files are destroyed. With this backup, it is
possible to recover the database up to the last transaction that took place before the
failure occurred. Thus, in the event of a failure, not a single committed transaction need
be lost (see the sketch below).
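A minimal sketch of this recovery process, using SQL Server syntax (the database name and file paths are hypothetical):

-- Restore the most recent full backup, leaving the database ready for log restores
RESTORE DATABASE SalesDB
  FROM DISK = 'D:\backup\SalesDB_full.bak'
  WITH NORECOVERY;

-- Roll forward using the transaction log backup, stopping at a specific point in time
RESTORE LOG SalesDB
  FROM DISK = 'D:\backup\SalesDB_log.trn'
  WITH STOPAT = '2019-05-31 14:30:00', RECOVERY;

The STOPAT option is what makes point-in-time recovery possible: committed transactions up to the given time are replayed, and everything after it is discarded.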
c. Also describe to them the use of deferred-write and write-through techniques.
(6)
Recovery restores a database from a given state, usually inconsistent, to a previously consistent
state.
Depending on the type and the extent of the failure, the recovery process ranges from a minor
short-term inconvenience to a major long-term rebuild action.
Regardless of the extent of the required recovery process, recovery is not possible without
backup.
The database recovery process generally follows a predictable scenario:
1. Determine the type and the extent of the required recovery.
2. If the entire database needs to be recovered to a consistent state, the recovery uses the most
recent backup copy of the database in a known consistent state.
3. The backup copy is then rolled forward to restore all subsequent transactions by using the
transaction log information.
4. If the database needs to be recovered, but the committed portion of the database is usable, the
recovery process uses the transaction log to "undo" all the transactions that were not committed.
Recovery procedures generally make use of deferred-write and write-through techniques.
In the case of the deferred-write or deferred-update, the transaction operations do not
immediately update the database. Instead:
All changes (previous and new data values) are first written to the transaction log
The database is updated only after the transaction reaches its commit point.
If the transaction fails before it reaches its commit point, no changes (no roll-back or undo)
need to be made to the database because the database was never updated. In contrast, if the
write-through or immediate-update technique is used:
The database is immediately updated by transaction operations during the transaction's
execution, even before the transaction reaches its commit point. The transaction log is also
updated, so if a transaction fails, the database uses the log information to roll back ("undo") the
database to its previous state.
OCTOBER NOVEMBER 2017
OCTOBER NOVEMBER inf 3703 2017
Question 1
SECTION A
1.1 T
1.2 T
1.3 F ONLY IF IT’S WRITTEN CONSUMERS
1.4 F MULTIPLE FACT TABLES NOT SINGLE
1.5 F ITS IN THE SCOPE OF BI
1.6 F – it is the property of distributed databases by virtue of which the internal details of
the distribution are hidden from the users.
1.7 T
1.8 T
1.9 F ISOLATION PROPERTY
1.10 T
Section B
2.1 Three levels of backup that may be used for database recovery:
A full backup of the database creates a backup copy of all database objects
in their entirety. This is a complete backup of the entire database. It is normally
performed overnight or on weekends and usually requires that no users be
connected to the database. This type of backup takes the longest time and
requires ample storage.
A differential backup of the database creates a backup of only those
database objects that have changed since the last full backup. The differential
backup will back up only the data that has changed since the last full backup and
that has not been backed up before. This type of backup can be done overnight
(depending on the amount of data).
A transaction log backup does not create a backup of database objects, but makes a backup
of the log of changes that have been applied to the database objects since the last backup.
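As a minimal sketch of the three levels, using SQL Server syntax (the database name and file paths are hypothetical):

-- Full backup: copies the entire database
BACKUP DATABASE SalesDB TO DISK = 'D:\backup\SalesDB_full.bak';

-- Differential backup: copies only what changed since the last full backup
BACKUP DATABASE SalesDB TO DISK = 'D:\backup\SalesDB_diff.bak' WITH DIFFERENTIAL;

-- Transaction log backup: copies the log of changes since the last backup
BACKUP LOG SalesDB TO DISK = 'D:\backup\SalesDB_log.trn';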
2.2 Using the optimistic approach, each transaction moves through three phases:
• During the read phase, the transaction reads the database, executes the needed
computations, and makes the updates to a private copy of the database values. All
update operations of the transaction are recorded in a temporary update file,
which is not accessed by the remaining transactions.
• During the validation phase, the transaction is validated to ensure that the changes
made will not affect the integrity and consistency of the database. If the validation
test is positive, the transaction goes to the write phase. If the validation test is
negative, the transaction is restarted and the changes are discarded.
• During the write phase, the changes are permanently applied to the database.
2.3 What is a consistent database state and how is it achieved?
A consistent database state is one in which all data integrity constraints are satisfied.
To achieve a consistent database state, a transaction must take the database from one consistent
state to another.
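A minimal sketch of a transaction that takes the database from one consistent state to another (SQL Server-style syntax; the ACCOUNT table and its columns are hypothetical): either both updates are committed together, or neither takes effect.

-- Transfer 100 from account 1001 to account 2002 as a single logical unit of work
BEGIN TRANSACTION;

UPDATE ACCOUNT SET ACC_BALANCE = ACC_BALANCE - 100 WHERE ACC_NUM = '1001';
UPDATE ACCOUNT SET ACC_BALANCE = ACC_BALANCE + 100 WHERE ACC_NUM = '2002';

COMMIT;

If either UPDATE violates an integrity constraint (for example, a balance that may not go negative), the transaction is rolled back and the database remains in its previous consistent state.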
2.4 Discuss the three main problems resulting from the concurrent execution of transactions in a
multiuser database and give examples.
Lost Updates
This problem occurs when two transactions, accessing the same data items, have their operations
interleaved in such a way, that one transaction will access the data before the other has applied
any updates.
In the simplified example sketched below, the operations are interleaved in such a way that
Transaction 2 had access to the data before Transaction 1 could reduce the seats by 1.
Transaction 2's operation of reducing the seats by 2 is overwritten, resulting in an
incorrect value of available seats.
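A minimal sketch of such a schedule (the FLIGHT table, flight number and seat counts are hypothetical; the time order is shown in the comments):

-- Initial state: flight F106 has SEATS_AVAILABLE = 10
-- t1: Transaction 1 reads SEATS_AVAILABLE (10)
-- t2: Transaction 2 reads SEATS_AVAILABLE (10)
-- t3: Transaction 2 books 2 seats and writes 8
UPDATE FLIGHT SET SEATS_AVAILABLE = 8 WHERE FLIGHT_NO = 'F106';
-- t4: Transaction 1 books 1 seat based on its earlier read of 10 and writes 9,
--     overwriting Transaction 2's update
UPDATE FLIGHT SET SEATS_AVAILABLE = 9 WHERE FLIGHT_NO = 'F106';
-- Final value is 9 instead of the correct 7: Transaction 2's update is lost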
Uncommitted Data
This problem occurs when one transaction updates a data item, but has not yet committed the
data permanently to the database. Because of failure, the transaction is rolled back and the data
item is returned to its previous value. A second transaction accesses the updated data item
before it is returned to its original value.
In this example, you can see that Transaction 2 has access to the updated data of Transaction 1
before the changes were committed permanently to the database.
This violates the Isolation property of allowing a transaction to complete without interference
from another
Inconsistent Retrievals
This problem occurs when one transaction is calculating summary functions on particular data
items while other transactions are updating those data items. The transaction performing the
calculations may read some data items before they are updated and others after.
Assume Flight F106 has 12 seats available and Flight F113 has 4 seats available giving a total of
16 available seats. If Transaction 1 were to update Flight F106 by booking 1 seat and
Transaction 2 were to update Flight F113 by booking 2 seats, a calculation of the total seats
available would yield a result of 13 seats.
The following example illustrates what could happen when other transactions are permitted to
update while this calculation is occurring.
Question 3
3.1 In your Assignment 2, explain the sequence of interaction between their end users and the
DBMS through the use of queries to generate information.
Parse Phase – During the parse phase, Oracle opens the statement handle, checks whether the
statement is valid (both syntactically and whether the involved objects exist and are accessible)
and creates an execution plan for executing this statement. The parse call does not return an error if
the statement is not syntactically correct.
Execute Phase - During the execute phase, Oracle executes the statement, reports any possible
errors, and if everything is as it should be, forms the result set. Unless the SQL statement being
executed is a query, this is the last step of the execution.
3.2 Differentiate between database performance tuning and SQL performance tuning.
Database tuning describes a group of activities used to optimize and homogenize the
performance of a database . Database tuning aims to maximize use of system resources to
perform work as efficiently and rapidly as possible. Most systems are designed to manage their
use of system resources, but there is still much room to improve their efficiency by customizing
their settings and configuration for the database and the DBMS.
SQL tuning is the process of ensuring that the SQL statements that an application issues
will run in the fastest possible time.
3.3 List and describe some typical DBMS processes that the figure illustrates.
Data files – all of the data in a database are stored in data files.
Listener – the listener process listens for clients' requests and hands the processing of the SQL
requests over to the other DBMS processes.
User – the DBMS creates a user process to manage each client session. Therefore, when you log
on to the DBMS, you are assigned a user process.
Scheduler – the scheduler process organizes the concurrent execution of SQL requests.
Lock manager – this process manages all locks placed on database objects, including disk pages.
SQL cache – a shared, reserved memory area that stores the most recently executed SQL statements
or SQL procedures, including triggers and functions.
Data cache – a shared, reserved memory area that stores the most recently accessed data blocks in RAM.
Optimizer – the optimizer process analyzes SQL queries and finds the most efficient way to
access the data.
Input/output request – a low-level (read or write) data access operation to/from computer devices.
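In Oracle, for example, the user processes (sessions) that the DBMS is currently managing can be inspected through the data dictionary; a minimal sketch:

-- Each connected client session appears as a row in V$SESSION
SELECT SID, SERIAL#, USERNAME, STATUS
FROM V$SESSION;

Comparable system views exist in other DBMSs for listing active sessions and the locks they hold.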
Question 4
4.1 What is transaction transparency?
Transaction transparency is a DDBMS property that ensures that database transactions will
maintain the distributed database's integrity and consistency. Transaction transparency allows a
transaction to update data at several network sites. It also ensures that the transaction will be
either entirely completed or aborted.
4.2 The following data structure and constraints exist for a magazine publishing company:
• The company publishes one regional magazine each in Pretoria (PTA), Florida (FL),
Durban (DB), and Cape Town (CT).
• The company has 300,000 customers (subscribers) distributed throughout the four cities
listed in Part a.
PTA_properties, PTA_server, PRETORIA
PROP_ID  PROP_NAME     PROP_ADDRESS    PROP_LOCATION  PROP_PROVINCE  RENTAL_PRICE
1001     SEQUIOR       10 CEILIERS ST  PRETORIA       GAUTENG        6 000.00
1002     JABAVU        14 MESSY ST     PRETORIA       GAUTENG        5 600.00

CT_properties, CT_server, CAPE TOWN
PROP_ID  PROP_NAME     PROP_ADDRESS    PROP_LOCATION  PROP_PROVINCE  RENTAL_PRICE
1003     MIKE'S        20 RHODES ST    CAPE TOWN      WESTERN CAPE   8 000.00
1004     ANN'S         40 SARAH ST     CAPE TOWN      WESTERN CAPE   7 600.00

PE_properties, PE_server, PORT ELIZABETH
PROP_ID  PROP_NAME     PROP_ADDRESS    PROP_LOCATION  PROP_PROVINCE  RENTAL_PRICE
1005     ELIBETH       201 MITCELL ST  P.E            EASTERN CAPE   6 000.00
1006     GINA'S        14 MESSY        P.E            EASTERN CAPE   5 600.00

DBN_properties, DBN_server, DURBAN
PROP_ID  PROP_NAME     PROP_ADDRESS    PROP_LOCATION  PROP_PROVINCE  RENTAL_PRICE
1007     PRESIDENTIAL  151 TINA ST     DURBAN         KZN            6 700.00
1008     EMBASSY       141 MESSY ST    DURBAN         KZN            5 675.00
Question 5
5.1 Mr E Smith is a manager of a small distribution company. Due to the fact that the
business of the company is fast expanding to other places, Elvis identifies that it is time to
manage the massive information pool to help assist and guide the accelerating growth. Draw
a star schema diagram for the multimedia distribution company data warehouse. Each table
should have at least two attributes, primary and foreign keys.
There is a fact table at the centre. It contains the keys to each of four dimensions. The fact table
also contains the attributes, namely dollars sold and units sold. Note: Each dimension has only
one dimension table and each table holds a set of attributes. For example, the location dimension
table contains the attribute set {location_key, street, city, province_or_state, country}. This
constraint may cause data redundancy. For example, "Vancouver" and "Victoria" are both cities
in the Canadian province of British Columbia. The entries for such cities may cause data
redundancy along the attributes province_or_state and country.
5.2 Discuss the main types of objects in the OLE-DB model.
OLE-DB is composed of a series of COM objects that provide low-level database connectivity
for applications. Because OLE-DB is based on the COM object model, the objects contain data
and methods (also known as the interface). The OLE-DB model is better understood when you
divide its functionality into two types of objects:
Consumers are all those objects (applications or processes) that request and use data. The data
consumers request data by invoking the methods exposed by the data provider objects (public
interface) and passing the required parameters.
Providers are the objects that manage the connection with a data source and provide data to the
consumers.
Providers are divided into two categories: data providers and service providers.
Data providers provide data to other processes. Database vendors create data provider objects
that expose the functionality of the underlying data source (relational, object-oriented, text, and
so on.)
Service providers provide additional functionality to consumers.
The service provider is located between the data provider and the consumer: The service
provider requests data from the data provider; transforms the data and provides the transformed
data to the data consumer. In other words, the service provider acts like a data consumer of the
data provider and as a data provider for the data consumer (end-user application). For example, a
service provider could offer cursor management services, transaction management services,
query processing services, indexing services, and so on.
5.3 What are the three main components in the basic ODBC architecture and what are their
functions?
The three main components of the ODBC architecture are:
A high-level ODBC API through which application programs access ODBC functionality.
A driver manager that is in charge of managing all database connections.
An ODBC driver that communicates directly to the DBMS.
MAY JUNE 2018
INF 3703 MEMO 2018 MAY JUNE
Question 1
1.1. F Horizontal Fragmentation is splitting of tables horizontally that is into tuples or rows.
1.2. T
1.3. T
1.4. T
1.5.T
1.6.T
1.7.T
1.8.F testing is the role of the DB
1.9. T
1.10. F – durability definition
SECTION B
QUESTION 2
a. Name and discuss the three levels of distributed transparency (6)
• Fragmentation transparency is the highest level of transparency. The end user or
programmer does not need to know that a database is partitioned. Therefore, neither fragment
names nor fragment locations are specified prior to data access.
• Location transparency exists when the end user or programmer must specify the database
fragment name but does not need to specify where those fragments are located.
• Local mapping transparency exists when the end user or programmer must specify both the
fragment names and their locations.
b. Use the request above to illustrate the use of the various transparency levels. Do this by
writing three different cases of SQL query, with each case conforming to a level of
distribution transparency. (14)
Case 1: The Database Supports Fragmentation Transparency
The query conforms to a nondistributed database query format; that is, it does not specify
fragment names or locations.
The query reads:
SELECT *
FROM EMPLOYEE
WHERE EMP_DOB < '01-JAN-1960';
Case 2: The Database Supports Location Transparency
Fragment names must be specified in the query, but the fragment location is not specified. The query
reads:
SELECT *
FROM E1
WHERE EMP_DOB < '01-JAN-1960'
UNION
SELECT *
FROM E2
WHERE EMP_DOB < '01-JAN-1960'
UNION
SELECT *
FROM E3
WHERE EMP_DOB < '01-JAN-1960';
Case 3: The Database Supports Local Mapping Transparency
Both the fragment name and location must be specified in the query. Using pseudo-SQL:
SELECT *
FROM E1 NODE SA
WHERE EMP_DOB < '01-JAN-1960'
UNION
SELECT *
FROM E2 NODE NIGERIA
WHERE EMP_DOB < '01-JAN-1960'
UNION
SELECT *
FROM E3 NODE ANGOLA
WHERE EMP_DOB < '01-JAN-1960';
Question 3
a. Identify the appropriate fact table components. (2)
b. Identify the appropriate dimension tables. (3)
c. Identify the attributes for the dimension tables. (2)
d. Draw a star schema diagram for this data warehouse. (3)
3.2 Techniques used to improve star schema performance
The following four techniques are commonly used to optimize data warehouse design:
• Normalization of dimensional tables is done to achieve semantic simplicity and to
facilitate end-user navigation through the dimensions. For example, if the location
dimension table contains transitive dependencies between region, state, and city, we can
revise these relationships to the third normal form (3NF). By normalizing the dimension
tables, we simplify the data filtering operations related to the dimensions.
• We can also speed up query operations by creating and maintaining multiple fact
tables related to each level of aggregation. For example, we may use region, state, and
city in the location dimension. These aggregate tables are pre-computed at the data
loading phase, rather than at run time (see the sketch after this list). The purpose of this
technique is to save processor cycles at run time, thereby speeding up data analysis. An
end-user query tool optimized for decision analysis will then properly access the
summarized fact tables, instead of computing the values by accessing a "lower level of
detail" fact table.
• Denormalizing fact tables improves data access performance and saves data storage
space. The latter objective, however, is becoming less of an issue. Data storage costs
decrease almost daily, and DBMS limitations that restrict database and table sizes,
record sizes, and the maximum number of records in a single table have far more
negative effects than raw storage space costs.
• Partitioning and replicating tables. Because table partitioning and replication were
covered in the discussion of distributed database management systems, those techniques
are discussed here only as they specifically relate to the data warehouse.
Table partitioning and replication are particularly important when a BI system is implemented in
dispersed geographic areas.
Partitioning splits a table into subsets of rows or columns and places the subsets close to the
client computer to improve data access time.
Replication makes a copy of a table and places it in a different location, also to improve
access time.
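A minimal sketch of the aggregate fact table technique mentioned above (Oracle-style CREATE TABLE AS SELECT; all table and column names are hypothetical): a summarized fact table is pre-computed at load time so that queries at the region level do not have to scan the detailed fact table.

-- Pre-compute a fact table aggregated to the region level at data loading time
CREATE TABLE FACT_SALES_BY_REGION AS
SELECT   region_key,
         time_key,
         SUM(units_sold)   AS units_sold,
         SUM(sales_amount) AS sales_amount
FROM     FACT_SALES
GROUP BY region_key, time_key;

A query tool that only needs regional totals can then read FACT_SALES_BY_REGION directly instead of aggregating the detailed FACT_SALES rows at run time.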
Question 4
4.1 With the aid of an illustrative diagram, show and discuss the feedback loop as one of the
categories of velocity of processing. Use any scenario of your choice for the illustrative
diagram. You may also use the scenario used in your textbook. (11)
4.2 What does serializability of a transaction mean? Where is it considered important? (4)
It is a transaction property that ensures that the schedule for the concurrent execution of the
transactions yields consistent results.
This property is important in multiuser and distributed databases, where multiple transactions are
likely to be executed concurrently. Naturally, if only a single transaction is executed,
serializability is not an issue.
• Multiuser databases are typically subject to multiple concurrent transactions. Therefore,
the multiuser DBMS must implement controls to ensure serializability.
4.3 How would you describe database statistics? Why are they important? (4)
The term database statistics refers to a number of measurements about database objects and the
available resources, such as the number of processors used, processor speed, and temporary space
available. Such statistics give a snapshot of database characteristics.
Database statistics are stored in the system catalog in specially designated tables.
Database statistics are important because the query optimizer uses them to determine the most
efficient execution plan, and they help the DBA analyze and improve database performance.
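Database statistics can usually be generated on demand. A minimal sketch (the CUSTOMER table is a hypothetical name; the first statement uses Oracle syntax, the second SQL Server syntax):

-- Oracle: gather statistics for a table so the cost-based optimizer can use them
ANALYZE TABLE CUSTOMER COMPUTE STATISTICS;

-- SQL Server: refresh the statistics kept for a table
UPDATE STATISTICS CUSTOMER;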
Question 5
a. Describe to the IT team you work with the activities that are typically associated with the
design and implementation services of the DBA technical function. (8)
Many of the DBA's technical activities are a logical extension of the DBA's managerial
activities. For example, the DBA deals with database security and integrity, backup and
recovery, and training and support. Thus, the DBA's dual role might be conceptualized as a
capsule whose technical core is covered by a clear managerial shell.
The technical aspects of the DBA's job are rooted in the following areas of operation:
• Evaluating, selecting, and installing the DBMS and related utilities.
One of the DBA's first and most important technical responsibilities is selecting the database
management system, utility software, and supporting hardware to be used in the organization.
Therefore, the DBA must develop and execute a plan for evaluating and selecting the DBMS,
utilities, and hardware. That plan must be based primarily on the organization's needs rather than
on specific software and hardware features. The DBA must recognize that the search is for
solutions to problems rather than for a computer or DBMS software. Put simply, a DBMS is a
management tool and not a technological toy.
• Designing and implementing databases and applications.
The DBA function also provides data modeling and design services to end users. Such services
are often coordinated with an application development group within the data-processing
department. Therefore, one of the primary activities of a DBA is to determine and enforce
standards and procedures to be used. Once the appropriate standards and procedures framework
are in place, the DBA must ensure that the database modeling and design activities are performed
within the framework.
• Testing and evaluating databases and applications.
The DBA must also provide testing and evaluation services for all of the database and end-user
applications. Those services are the logical extension of the design, development, and
implementation services described in the preceding section. Clearly, testing procedures and
standards must already be in place before any application program can be approved for use in the
company.
• Operating the DBMS, utilities, and applications.
DBMS operations can be divided into four main areas:
System support.
Performance monitoring and tuning.
Backup and recovery.
Security auditing and monitoring.
• Training and supporting users.
Training people to use the DBMS and its tools is included in the DBA's technical activities. In
addition, the DBA provides or secures technical training in the use of the DBMS and its utilities
for the applications programmers. Applications programmer training covers the use of the
DBMS tools as well as the procedures and standards required for database programming.
• Maintaining the DBMS, utilities, and applications.
The maintenance activities of the DBA are an extension of the operational activities.
Maintenance activities are dedicated to the preservation of the DBMS environment
b. What technical skills are desirable in the DBA's personnel to perform the design and
implementation function? (2)
The designer must have Systems Development Life Cycle knowledge, Database Life Cycle
knowledge, and database modeling and design skills (conceptual, logical, and physical).
5.2 Discuss in your own words what database connectivity is. Although there are many ways
to achieve database connectivity, name the five interfaces discussed in your textbook. (7)
Database connectivity refers to the mechanisms through which application programs connect and
communicate with data repositories. Database connectivity software is also known as database
middleware because it provides an interface between the application program and the database.
The data repository, also known as the data source, represents the data management application,
such as Oracle RDBMS, SQL Server DBMS, or IBM DBMS, that will be used to store the data
generated by the application program. Ideally, a data source or data repository could be located
anywhere and hold any type of data. For example, the data source could be a relational database,
a hierarchical database, a spreadsheet, or a text data file.
• Native SQL connectivity (vendor provided).
Most DBMS vendors provide their own methods for connecting to their databases. Native SQL
connectivity refers to the connection interface that is provided by the database vendor and that is
unique to that vendor. The best example of that type of native interface is the Oracle RDBMS.
To connect a client application to an Oracle database, you must install and configure
Oracle's SQL*Net interface on the client computer. Figure 14.1 shows the configuration of the
Oracle SQL*Net interface on the client computer.
• Microsoft's Open Database Connectivity (ODBC), Data Access Objects (DAO), and
Remote Data Objects (RDO).
Open Database Connectivity (ODBC) is Microsoft's implementation of a superset of the SQL
Access Group Call Level Interface (CLI) standard for database access. ODBC is probably the
most widely supported database connectivity interface. ODBC allows any Windows application
to access relational data sources, using SQL via a standard application programming interface
(API). The Webopedia online dictionary (www.webopedia.com) defines an API as "a set of
routines, protocols, and tools for building software applications."
• Microsoft's Object Linking and Embedding for Database (OLE-DB).
OLE-DB is composed of a series of COM objects that provide low-level database connectivity
for applications. Because OLE-DB is based on COM, the objects contain data and methods, also
known as the interface. The OLE-DB model is better understood when you divide its
functionality into two types of objects:
Consumers are objects (applications or processes) that request and use data. The data consumers
request data by invoking the methods exposed by the data provider objects (public interface) and
passing the required parameters.
Providers are objects that manage the connection with a data source and provide data to the
consumers.
Providers are divided into two categories: data providers and service providers.
- Data providers provide data to other processes. Database vendors create data provider objects
that expose the functionality of the underlying data source (relational, object-oriented, text, and
so on).
Service providers provide additional functionality to consumers. The service provider is located
between the data provider and the consumer. The service provider requests data from the data
provider, transforms the data, and then provides the transformed data to the data consumer. In
other words, the service provider acts like a data consumer of the data provider and as a data
provider for the data consumer (end-user application). For example, a service provider could
offer cursor management services, transaction management services, query processing services,
and indexing services.
• Microsoft's ActiveX Data Objects (ADO.NET) and Sun's Java Database Connectivity (JDBC).
Based on ADO, ADO.NET is the data access component of Microsoft's .NET application
development framework. The Microsoft .NET framework is a component-based platform for
developing distributed, heterogeneous, interoperable applications aimed at manipulating any type
of data over any network under any operating system and any programming language.
Comprehensive coverage of the .NET framework is beyond the scope of this book. Therefore,
this section will only introduce the basic data access component of the .NET architecture,
ADO.NET.
5.3 How would you describe what cloud computing is? Explain why it is considered a
game changer. (4)
Cloud computing is the practice of using a network of remote servers hosted on the Internet to
store, manage, and process data, rather than a local server or a personal computer. It is considered
a game changer because it allows organizations to obtain computing resources on demand and
pay only for what they use, removing the need for large upfront investments in hardware and
making advanced data management services accessible to organizations of any size.
5.4 Name and contrast the types of cloud computing implementation (6)
There are basically three cloud computing implementation types (based on who the target
customers are):
• Public cloud. This type of cloud infrastructure is built by a third-party organization to
sell cloud services to the general public. The public cloud is the most common type of
cloud implementation; examples include Amazon Web Services (AWS), Google
Application Engine, and Microsoft Azure. In this model, cloud consumers share
resources with other consumers transparently. The public cloud infrastructure is managed
exclusively by the third-party provider.
• Private cloud. This type of internal cloud is built by an organization for the sole purpose
of servicing its own needs. Private clouds are often used by large, geographically
dispersed organizations to add agility and flexibility to internal IT services. The cloud
infrastructure could be managed by internal IT staff or an external third party.
• Community cloud. This type of cloud is built by and for a specific group of
organizations that share a common trade, such as agencies of the federal government, the
military, or higher education. The cloud infrastructure could be managed by internal IT
staff or an external third party.
Question 6
6.1 Name and explain all the transaction properties and where necessary use examples in
your explanation
Atomicity requires that all operations (SQL requests) of a transaction be completed; if not, the
transaction is aborted. If a transaction T1 has four SQL requests, all four requests must be
successfully completed; otherwise, the entire transaction is aborted. In other words, a transaction
is treated as a single, indivisible, logical unit of work.
• Consistency indicates the permanence of the database's consistent state. A transaction
takes a database from one consistent state to another consistent state. When a transaction
is completed, the database must be in a consistent state; if any of the transaction parts
violates an integrity constraint, the entire transaction is aborted.
• Isolation means that the data used during the execution of a transaction cannot be used
by a second transaction until the first one is completed. In other words, if a transaction T1
is being executed and is using the data item X, that data item cannot be accessed by any
other transaction (T2 ... Tn) until T1 ends. This property is particularly useful in
multiuser database environments because several users can access and update the
database at the same time.
• Durability ensures that once transaction changes are done (committed), they cannot be
undone or lost, even in the event of a system failure.
• Serializability ensures that the schedule for the concurrent execution of the transactions
yields consistent results. This property is important in multiuser and distributed
databases, where multiple transactions are likely to be executed concurrently. Naturally,
if only a single transaction is executed, serializability is not an issue.
6.2 What is concurrency control in a database? Explain why it is important and what its objective is.
The coordination of the simultaneous execution of transactions in a multiuser database system.
The objective of concurrency control is to ensure the serializability of transactions in a multiuser
database environment. Concurrency control is important because the simultaneous execution of
transactions over a shared database can create several data integrity and consistency problems
6.3 With the help of a diagram, demonstrate the flow of the phases in query processing and
the steps required in each phase. (10)
What happens at the DBMS server end when the client's SQL statement is received? In simple
terms, the DBMS processes a query in three phases:
1. Parsing. The DBMS parses the SQL query and chooses the most efficient
access/execution plan.
The optimization process includes breaking down—parsing—the query into smaller units and
transforming the original SQL query into a slightly different version of the original SQL code,
but one that is fully equivalent and more efficient. Fully equivalent means that the optimized
query results are always the same as the original query. More efficient means that the optimized
query will almost always execute faster than the original query
The SQL parsing activities are performed by the query optimizer, which analyzes the SQL query
and finds the most efficient way to access the data. This process is the most time-consuming
phase in query processing. Parsing a SQL query requires several steps, in which the SQL query
is:
• Validated for syntax compliance.
• Validated against the data dictionary to ensure that tables and column names are correct.
• Validated against the data dictionary to ensure that the user has proper access rights.
• Analyzed and decomposed into more atomic components.
• Optimized through transformation into a fully equivalent but more efficient SQL query.
• Prepared for execution by determining the most efficient execution or access plan.
2. Execution. The DBMS executes the SQL query using the chosen execution plan.
In this phase, all I/O operations indicated in the access plan are executed. When the execution
plan is run, the proper locks, if needed, are acquired for the data to be accessed, and the data are
retrieved from the data files and placed in the DBMS's data cache. All transaction management
commands are processed during the parsing and execution phases of query processing.
3. Fetching. The DBMS fetches the data and sends the result set back to the client.
After the parsing and execution phases are completed, all rows that match the specified
condition(s) are retrieved, sorted, grouped, and/or aggregated (if required).
During the fetching phase, the rows of the resulting query result set are returned to the client.
The DBMS might use temporary table space to store temporary data. In this stage, the database
server coordinates the movement of the result set rows from the server cache to the client cache.
For example, a given query result set might contain 9,000 rows; the server would send the first
100 rows to the client and then wait for the client to request the next set of rows, until the entire
result set is sent to the client.
The processing of SQL DDL statements (such as CREATE TABLE) is different from the
processing required by DML statements. The difference is that a DDL statement actually updates
the data dictionary tables or system catalog, while a DML statement (SELECT, INSERT,
UPDATE, and DELETE) mostly manipulates end-user data.
Question 7
7.1 Briefly explain the three (3) levels of distribution transparency. (6)
Fragmentation transparency is the highest level of transparency. The end user or programmer
does not need to know that a database is partitioned. Therefore, neither fragment names nor
fragment locations are specified prior to data access.
Location transparency exists when the end user or programmer must specify the database
fragment names but does not need to specify where those fragments are located.
Local mapping transparency exists when the end user or programmer must specify both the
fragment names and their locations.
7.2 Big data is one of the emerging concepts in database management systems. What do you
understand by big data and what are its basic components? (10)
Big data is an evolving term that describes any voluminous amount of structured, semi-structured
and unstructured data that has the potential to be mined for information.
Big data is often characterized by 3Vs: the extreme volume of data, the wide variety of data types
and the velocity at which the data must be processed. Although big data doesn't equate to any
specific volume of data, the term is often used to describe terabytes, petabytes and even exabytes
of data captured over time.
Breaking down the 3Vs of big data
Such voluminous data can come from myriad different sources, such as business sales records,
the collected results of scientific experiments or real-time sensors used in the internet of things.
Data may be raw or preprocessed using separate software tools before analytics are applied
Data may also exist in a wide variety of file types, including structured data, such as
SQL database stores; unstructured data, such as document files; or streaming data from sensors.
Further, big data may involve multiple, simultaneous data sources, which may not otherwise be
integrated. For example, a big data analytics project may attempt to gauge a product's success
and future sales by correlating past sales data, return data and online buyer review data for that
product.
Finally, velocity refers to the speed at which big data must be analyzed. Every big data analytics
project will ingest, correlate and analyze the data sources, and then render an answer or result
based on an overarching query. This means human analysts must have a detailed understanding
of the available data and possess some sense of what answer they're looking for
Velocity is also meaningful, as big data analysis expands into fields like machine learning and
artificial intelligence, where analytical processes mimic perception by finding and using patterns
in the collected data.
Big data infrastructure demands
The need for big data velocity imposes unique demands on the underlying compute
infrastructure. The computing power required to quickly process huge volumes and varieties of
data can overwhelm a single server or server cluster. Organizations must apply adequate
compute power to big data tasks to achieve the desired velocity. This can potentially demand
hundreds or thousands of servers that can distribute the work and operate collaboratively.
Achieving such velocity in a cost-effective manner is also a headache. Many enterprise leaders
are reticent to invest in an extensive server and storage infrastructure that might only be used
occasionally to complete big data tasks. As a result, public cloud computing has emerged as a
primary vehicle for hosting big data analytics projects. A public cloud provider can store
petabytes of data and scale up thousands of servers just long enough to accomplish the big data
project. The business only pays for the storage and compute time actually used, and the cloud
instances can be turned off until they're needed again.
To improve service levels even further, some public cloud providers offer big data capabilities,
such as highly distributed Hadoop compute instances, data warehouses, databases and other
related cloud services. Amazon Web Services Elastic Map Reduce is one example of big data
services in a public cloud.
7.3 Discuss in your own words what database connectivity is and name the five interfaces
discussed in your textbook. (7)
Refer to the answer to Question 5.2 (May/June 2018) above.
7.4 The DBMS processes a query in three phases. Identify the first two phases and outline what is
accomplished in each phase.
1. Parsing. The DBMS parses the SQL query and chooses the most efficient
access/execution plan.
The optimization process includes breaking down—parsing—the query into smaller units and
transforming the original SQL query into a slightly different version of the original SQL code,
but one that is fully equivalent and more efficient. Fully equivalent means that the optimized
query results are always the same as the original query. More efficient means that the optimized
query will almost always execute faster than the original query
The SQL parsing activities are performed by the query optimizer, which analyzes the SQL query
and finds the most efficient way to access the data. This process is the most time-consuming
phase in query processing. Parsing a SQL query requires several steps, in which the SQL query
is:
• Validated for syntax compliance.
• Validated against the data dictionary to ensure that tables and column names are correct.
• Validated against the data dictionary to ensure that the user has proper access rights.
• Analyzed and decomposed into more atomic components.
• Optimized through transformation into a fully equivalent but more efficient SQL query.
• Prepared for execution by determining the most efficient execution or access plan.
2. Execution. The DBMS executes the SQL query using the chosen execution plan.
In this phase, all I/O operations indicated in the access plan are executed. When the execution
plan is run, the proper locks—if needed—are acquired for the data to be accessed, and the data
are retrieved from the data files and placed in the DBMS's data cache. All transaction
management commands are processed during the parsing and execution phases of query
processing
INF3703 EXAM PACK MAY JUNE 2019
1.1.T PAGE 557
1.2.T PAGE 502
1.3. F PAGE 507 reason: stage number 1
1.4. T PAGE 558
1.5. T PAGE 510
1.6. T PAGE 510
1.7. F PAGE difference in transactions (online transactions)
1.8. T PAGE 710
1.9. T PAGE 611
1.10. T PAGE
SECTION B
QUESTION 2
2.2
1. The client browser sends a page request to the web server.
2. The web server receives and passes the request to the web-to-database middleware
for processing.
3. Generally, the requested page contains some type of scripting language to enable
the database interaction. The web server passes the script to the web-to-database
middleware.
4. The web-to-database middleware reads, validates, and executes the script. In this case,
it connects to the database and passes the query using the database connectivity layer.
5. The database server executes the query and passes the result back to the web-to-database
middleware.
6. The web-to-database middleware compiles the result set, dynamically generates an
HTML-formatted page that includes the data retrieved from the database, and sends
it to the web server.
7. The web server returns the just-created HTML page, which now includes the query
result, to the client browser.
8. The client browser displays the page on the local computer.
The interaction between the web server and the web-to-database middleware
is crucial to the development of a successful Internet database implementation.
Therefore, the middleware must integrate closely via a well-defined web server
interface.
• Software as a Service (SaaS).
The cloud service provider offers turnkey applications that run in the cloud. Consumers can run
the provider's applications internally in their organizations via the web or any mobile device.
The consumer can customize certain aspects of
the application but cannot make changes to the application itself. The application is actually
shared among users from multiple organizations. Examples of SaaS include MS Office 365,
Google Docs, Intuit's TurboTax Online, and SCALA digital signage.
• Platform as a Service (PaaS).
The cloud service provider offers the capability to build and deploy consumer-created
applications using the provider's cloud infrastructure. In this scenario, the consumer can build,
deploy, and manage applications using the provider's cloud tools, languages, and interfaces.
However, the consumer does not manage the underlying cloud infrastructure. Examples of PaaS
include the Microsoft Azure platform with .NET and the Java development environment, and
Google Application Engine with Python or Java.
• Infrastructure as a Service (IaaS).
In this case, the cloud service provider offers consumers the ability to provision their own
resources on demand; these resources include storage, servers, databases, processing units, and
even a complete virtualized desktop. The consumer then can add or remove the resources as
needed. For example, a consumer can use Amazon Web Services (AWS) and provision a server
computer that runs Linux and Apache Web server using 16 GB of RAM and 160 GB of storage.
QUESTION 3
• Provided accurate, timely budgets in less time
• Provided analysts with access to data for decision-making purposes
• Received an in-depth view of product performance by store to reduce waste and increase
profits
• Ability to monitor performance using dashboard technology
• Quick and easy access to real-time performance data
• Managers have closer and better control over costs
Facts are numeric measurements (values) that represent a specific business aspect or activity.
For example, sales figures are numeric measurements that represent product and service sales.
Facts commonly used in business data analysis are units, costs, prices, and revenues.
Facts are normally stored in a fact table that is the center of the star schema. The fact table
contains facts that are linked through their dimensions, which are explained in the next section.
Facts can also be computed or derived at run time. Such computed or derived facts
are sometimes called metrics to differentiate them from stored facts. The fact table is
updated periodically with data from operational databases.
Dimensions
Dimensions are qualifying characteristics that provide additional perspectives to a given
fact. Recall that dimensions are of interest because decision support data is almost always
viewed in relation to other data. For instance, sales might be compared by product from
region to region and from one time period to the next. The kind of problem typically
addressed by a BI system might be to compare the sales of unit X by region for the first
quarters of 2006 through 2016. In that example, sales have product, location, and time
dimensions. In effect, dimensions are the magnifying glass through which you study the
facts. Such dimensions are normally stored in dimension tables. Figure 13.6 depicts
a star schema for sales with product, location, and time dimensions.
Attributes
Each dimension table contains attributes. Attributes are often used to search, filter, or
classify facts. Dimensions provide descriptive characteristics about the facts through their
attributes. Therefore, the data warehouse designer must define common business attributes
that will be used by the data analyst to narrow a search, group information, or
describe dimensions. Using a sales example, some possible attributes are a product ID and
description for the product dimension; a region, state, and city for the location dimension;
and a year, quarter, and month for the time dimension. A star schema built along these
lines is sketched in SQL below.
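A minimal SQL sketch of such a star schema follows. The table, column, and key names
(DIM_PRODUCT, DIM_LOCATION, DIM_TIME, FACT_SALES) are illustrative assumptions, not part of
the original example.

-- Dimension tables: qualifying characteristics used to filter and group facts
CREATE TABLE DIM_PRODUCT (
    PRODUCT_ID   INT PRIMARY KEY,
    DESCRIPTION  VARCHAR(50)
);
CREATE TABLE DIM_LOCATION (
    LOCATION_ID  INT PRIMARY KEY,
    REGION       VARCHAR(20),
    STATE        VARCHAR(20),
    CITY         VARCHAR(30)
);
CREATE TABLE DIM_TIME (
    TIME_ID      INT PRIMARY KEY,
    YEAR_NO      INT,
    QUARTER_NO   INT,
    MONTH_NO     INT
);

-- Fact table at the centre of the star: numeric measurements linked to the dimensions
CREATE TABLE FACT_SALES (
    PRODUCT_ID   INT REFERENCES DIM_PRODUCT (PRODUCT_ID),
    LOCATION_ID  INT REFERENCES DIM_LOCATION (LOCATION_ID),
    TIME_ID      INT REFERENCES DIM_TIME (TIME_ID),
    UNITS_SOLD   INT,
    SALES_AMOUNT DECIMAL(12,2),
    PRIMARY KEY (PRODUCT_ID, LOCATION_ID, TIME_ID)
);

-- Typical decision-support query: compare sales by region and quarter
SELECT   l.REGION, t.YEAR_NO, t.QUARTER_NO, SUM(f.SALES_AMOUNT) AS TOTAL_SALES
FROM     FACT_SALES f
         JOIN DIM_LOCATION l ON f.LOCATION_ID = l.LOCATION_ID
         JOIN DIM_TIME t     ON f.TIME_ID = t.TIME_ID
GROUP BY l.REGION, t.YEAR_NO, t.QUARTER_NO;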
QUESTION 4
Distribution transparency allows a distributed database to be treated as a single
logical database. If a DDBMS exhibits distribution transparency, the user does not
need to know the following (a brief SQL illustration appears after the list):
• The data is partitioned, meaning the table’s rows and columns are split vertically
or horizontally and stored among multiple sites.
• The data is geographically dispersed among multiple sites.
• The data is replicated among multiple sites.
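As a rough illustration of distribution transparency, the query below is written against a
single logical CUSTOMER table; the table name and the fragment placement are hypothetical.
With full distribution transparency the user writes only the first statement, and the DDBMS
resolves it against the fragments, wherever they are stored.

-- What the user writes: one logical table, no knowledge of fragments or sites
SELECT CUS_LNAME, CUS_BALANCE
FROM   CUSTOMER
WHERE  CUS_BALANCE > 1000;

-- What the DDBMS might actually execute behind the scenes
-- (horizontal fragments stored at two different sites):
SELECT CUS_LNAME, CUS_BALANCE FROM CUSTOMER_EAST WHERE CUS_BALANCE > 1000   -- at site 1
UNION ALL
SELECT CUS_LNAME, CUS_BALANCE FROM CUSTOMER_WEST WHERE CUS_BALANCE > 1000;  -- at site 2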
Transaction transparency
allows a transaction to update data at more than one
network site. Transaction transparency ensures that the transaction will be either
entirely completed or aborted, thus maintaining database integrity.
Failure transparency
ensures that the system will continue to operate in the event of a node or network failure.
Functions that were lost because of the failure will be picked up by another network node. This is
a very important feature, particularly in organizations that depend on web presence as the
backbone for maintaining trust in their business.
Performance transparency
allows the system to perform as if it were a centralized DBMS. The system will not suffer any
performance degradation due to its use on a network or because of the network’s platform
differences. Performance transparency also ensures that the system will find the most
cost-effective path to access remote data. The system should be able to “scale out” in a
transparent manner, increasing performance capacity by adding more transaction or
data-processing nodes without affecting the overall performance of the system.
Heterogeneity transparency
allows the integration of several different local DBMSs (relational, network, and hierarchical)
under a common, or global, schema. The DDBMS is responsible for translating the data requests
from the global schema to the local DBMS schema.
Lost Updates
This problem occurs when two transactions accessing the same data items have their
operations interleaved in such a way that one transaction accesses the data before the
other has applied its update.
In the simplified schedule sketched below, the operations are interleaved so that
Transaction 2 reads the data before Transaction 1 can reduce the seats by 1. Transaction 1's
later write then overwrites Transaction 2's reduction of the seats by 2, resulting in an
incorrect number of available seats.
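A minimal illustrative schedule follows, shown as the SQL statements the two transactions
would issue in interleaved order. The FLIGHT table, its columns, and the starting value of
10 seats are assumptions made for the example, not taken from the original figure.

-- Assume FLIGHT('F106') starts with SEATS = 10.
-- t1: Transaction 1 reads the current seat count (sees 10)
SELECT SEATS FROM FLIGHT WHERE FLIGHT_NO = 'F106';
-- t2: Transaction 2 reads the same row (also sees 10)
SELECT SEATS FROM FLIGHT WHERE FLIGHT_NO = 'F106';
-- t3: Transaction 2 books 2 seats based on its read of 10
UPDATE FLIGHT SET SEATS = 8 WHERE FLIGHT_NO = 'F106';
-- t4: Transaction 1 books 1 seat based on its stale read of 10, overwriting t3
UPDATE FLIGHT SET SEATS = 9 WHERE FLIGHT_NO = 'F106';
-- Stored result: 9 seats. Correct result after both bookings: 7 seats. T2's update is lost.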
Uncommitted Data
This problem occurs when one transaction updates a data item but has not yet committed the
data permanently to the database. Because of a failure, the transaction is rolled back and
the data item is returned to its previous value. A second transaction accesses the updated
data item before it is returned to its original value.
In the schedule sketched below, Transaction 2 has access to the updated data of Transaction 1
before the changes are committed permanently to the database. This violates the isolation
property, which requires that a transaction complete without interference from another
transaction.
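A minimal sketch of this dirty read follows, reusing the hypothetical FLIGHT table from the
previous example (again starting with 10 seats); the interleaving and values are assumptions.

-- t1: Transaction 1 books 2 seats but has not committed yet
UPDATE FLIGHT SET SEATS = 8 WHERE FLIGHT_NO = 'F106';
-- t2: Transaction 2 reads the uncommitted value (sees 8) and acts on it
SELECT SEATS FROM FLIGHT WHERE FLIGHT_NO = 'F106';
-- t3: Transaction 1 fails and is rolled back; SEATS returns to 10
ROLLBACK;
-- Transaction 2 has based its work on data (8) that never became part of the database.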
Inconsistent Retrievals
This problem occurs when one transaction is calculating
summary functions on particular data items while other transactions are updating those data
items. The transaction performing the calculations may read some data items before they are
updated and others after. Assume Flight F106 has 12 seats available and Flight F113 has 4 seats
available, giving a total of 16 available seats. If Transaction 1 updates Flight F106 by
booking 1 seat and Transaction 2 updates Flight F113 by booking 2 seats, a correct
calculation of the total seats available afterwards would yield 13 seats. The sketch below
illustrates what could happen when other transactions are permitted to update the data while
this calculation is occurring.
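A minimal sketch follows, again using the hypothetical FLIGHT table; Transaction 3 computes
the total while Transactions 1 and 2 are updating, and the ordering shown is an assumption.

-- F106 starts with 12 seats, F113 with 4 seats (total 16).
-- t1: Transaction 3 starts its summary and reads F106 before any update
SELECT SEATS FROM FLIGHT WHERE FLIGHT_NO = 'F106';   -- sees 12
-- t2: Transaction 1 books 1 seat on F106 and commits (F106 is now 11)
UPDATE FLIGHT SET SEATS = 11 WHERE FLIGHT_NO = 'F106';
-- t3: Transaction 2 books 2 seats on F113 and commits (F113 is now 2)
UPDATE FLIGHT SET SEATS = 2 WHERE FLIGHT_NO = 'F113';
-- t4: Transaction 3 now reads F113 after the update
SELECT SEATS FROM FLIGHT WHERE FLIGHT_NO = 'F113';   -- sees 2
-- Transaction 3 reports 12 + 2 = 14 available seats, while the correct total after
-- both bookings is 11 + 2 = 13.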
1. It can become unresponsive to the needs of the business.
A centralized database imposes heavy workload requirements. Individuals and teams may find
the time constraints placed on them unreasonable for what is expected of them. If these
constraints are not addressed, a centralized database creates unresponsive teams that focus
on specific tasks instead of collaborating, and teams can essentially rebel against the
system by creating their own silos for self-protection.
2. There are lower levels of location-based adaptability.
Using a centralized database means trading away flexibility at the local level for the
efficiencies of central control. If changes occur locally that affect the business, this data
must be sent to the centralized database; there is no option to control the data locally.
That means response times to local customers or the community may not be as efficient as they
could be, because the information needed to deal with a necessary response may not be in place.
3. It can have a negative impact on local morale.
When there is a centralized database being used, the responsibilities of local managers are often
reduced. That occurs because the structure of the company may forbid local managers from
hiring their own employees. It may force them to use data from the centralized system which
they feel has a negative impact at the local level. Instead of being able to make decisions
immediately, they are forced to wait for data to come from the centralized database. For those
who wish to experience higher levels of responsibility with their management role, the
centralized process can be highly demoralizing.
4. Succession planning can be limited with a centralized database.
Because the information is managed by a centralized database, there is little need to work on the
development of new or upcoming managers. The only form of succession planning that is
necessary with this setup involves bringing in an individual to replace a core analyst at some
point. Should top-level managers experience a family issue, health event, or some other issue
which interferes with their job, there may be no one with the necessary experience to take over
the position, which reduces the effectiveness of the centralized database.
5. It reduces the amount of legitimate feedback received.
A centralized database may provide transparency and lead to greater levels of communication,
but those are not always positive benefits. When anyone can offer an opinion or feedback on
information they have received, they often feel a responsibility to send a response. Many
employees may have general knowledge about certain policies or procedures without having
access to the full picture. They then spend time creating feedback which isn’t needed, which
in turn wastes time for everyone who reads that feedback. Over time, this can lead to lower
levels of productivity and higher levels of frustration.
QUESTION 5
• Physical security allows only authorized personnel physical access to specific areas.
Depending on the type of database implementation, however, establishing physical
security might not always be practical. For example, a university student research
database is not a likely candidate for physical security.
• Password security allows the assignment of access rights to specific authorized users.
Password security is usually enforced at login time at the operating system level.
• Access rights can be established through the use of database software. The assignment
of access rights may restrict operations (CREATE, UPDATE, DELETE, and so on) on
predetermined objects such as databases, tables, views, queries, and reports (a small
SQL sketch follows this list).
• Audit trails are usually provided by the DBMS to check for access violations. Although
the audit trail is an after-the-fact device, its mere existence can discourage unauthorized
use.
• Data encryption can render data useless to unauthorized users who might have violated
some of the database security layers.
• Diskless workstations allow end users to access the database without being able to
download the information from their workstations.
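As an illustration of access rights enforced through the database software, the SQL below
grants and then revokes privileges; the user name, password, and table name are hypothetical,
and the account-creation syntax varies by DBMS.

-- Create a database account and give it limited rights on a specific table
CREATE USER clerk01 IDENTIFIED BY 'S3curePwd';
GRANT SELECT, UPDATE ON CUSTOMER TO clerk01;   -- no DELETE or CREATE rights
-- Later, withdraw the UPDATE privilege if it is no longer needed
REVOKE UPDATE ON CUSTOMER FROM clerk01;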
a. A rule-based optimizer
uses preset rules and points to determine the best approach
to execute a query. The rules assign a “fixed cost” to each SQL operation; the costs
are then added to yield the cost of the execution plan. For example, a full table scan
has a set cost of 10, while a table access by row ID has a set cost of 3.
A cost-based optimizer
uses sophisticated algorithms based on statistics about the
objects being accessed to determine the best approach to execute a query. In this case,
the optimizer process adds up the processing cost, the I/O costs, and the resource
costs (RAM and temporary space) to determine the total cost of a given
execution plan.
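As a rough illustration of the rule-based approach, a hypothetical plan that performs a full
table scan (fixed cost 10) followed by a table access by row ID (fixed cost 3) would be
assigned a total cost of 10 + 3 = 13. A cost-based optimizer instead estimates I/O, CPU, and
memory costs from statistics. In Oracle, for example, the plan chosen by the optimizer can be
inspected as shown below; the table and column names are hypothetical.

-- Ask the optimizer to generate an execution plan without running the query
EXPLAIN PLAN FOR
SELECT c.CUS_LNAME, i.INV_TOTAL
FROM   CUSTOMER c JOIN INVOICE i ON c.CUS_CODE = i.CUS_CODE
WHERE  i.INV_TOTAL > 500;

-- Display the plan, including the optimizer's estimated costs
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);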
b. SQL performance tuning
SQL performance tuning is evaluated from the client perspective. Therefore, the goal is to
apply common practices for writing efficient SQL code. A word of caution is appropriate:
most SQL performance tuning techniques are DBMS-specific and are therefore rarely portable.
DBMS performance tuning
DBMS performance tuning includes managing the DBMS processes in primary memory (allocating
memory for caching purposes) and managing the structures in physical storage (allocating
space for the data files). Fine-tuning the performance of the DBMS also includes applying
several practices examined in the previous section. For example, the DBA must work with
developers to ensure that queries perform as expected—creating the indexes needed to speed
up query response time and generating the database statistics required by cost-based
optimizers.
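For instance, creating an index and refreshing optimizer statistics might look like the
sketch below. The table, column, and index names are hypothetical, and the statistics
command differs by DBMS (UPDATE STATISTICS in SQL Server, DBMS_STATS in Oracle, and so on).

-- Index to speed up lookups and joins on the customer code
CREATE INDEX IDX_INVOICE_CUS_CODE ON INVOICE (CUS_CODE);

-- Refresh the statistics the cost-based optimizer relies on (SQL Server syntax)
UPDATE STATISTICS INVOICE;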
c. Data cache
The data cache size must be set large enough to permit as many data
requests as possible to be serviced from the cache. Each DBMS has settings that
control the size of the data cache; some DBMSs might require a restart. This cache is
shared among all database users. The majority of primary memory resources will be
allocated to the data cache.
SQL cache
The SQL cache stores the most recently executed SQL statements (after the
SQL statements have been parsed by the optimizer). Generally, if you have an
application with multiple users accessing a database, the same query will likely be
submitted by many different users. In those cases, the DBMS will parse the query
only once and execute it many times, using the same access plan. In that way, the
second and subsequent SQL requests for the same query are served from the SQL
cache, skipping the parsing phase.
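As an illustration, cache sizes are typically adjusted through DBMS configuration
parameters. The Oracle-style command below is one example of resizing the data cache; the
size value is arbitrary, and other DBMSs use different parameters.

-- Increase the buffer (data) cache to 2 GB; depending on the DBMS and its settings,
-- a change like this may require an instance restart to take effect
ALTER SYSTEM SET DB_CACHE_SIZE = 2G SCOPE = SPFILE;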
Database transaction recovery uses data in the transaction log to recover a database from an
inconsistent state to a consistent state. Before continuing, let’s examine four important concepts
that affect the recovery process:
The write-ahead-log protocol ensures that transaction logs are always written before any
database data are actually updated. This protocol ensures that, in case of a failure, the database
can later be recovered to a consistent state, using the data in the transaction log.
Redundant transaction logs (several copies of the transaction log) ensure that a physical disk
failure will not impair the DBMS’s ability to recover data.
Database buffers are temporary storage areas in primary memory used to speed up disk
operations. To improve processing time, the DBMS software reads the data from the physical
disk and stores a copy of it in a “buffer” in primary memory.
When a transaction updates data, it actually updates the copy of the data in the buffer because
that process is much faster than accessing the physical disk every time.
Later on, all buffers that contain updated data are written to a physical disk during a single
operation, thereby saving significant processing time.
Database checkpoints are operations in which the DBMS writes all of its updated buffers to
disk. While this is happening, the DBMS does not execute any other requests. A checkpoint
operation is also registered in the transaction log.
As a result of this operation, the physical database and the transaction log will be in sync.
This synchronization is required because update operations update the copy of the data in the
buffers and not in the physical database.
Checkpoints are automatically scheduled by the DBMS several times per hour. As you will see
next, checkpoints also play an important role in transaction recovery.
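For illustration, some DBMSs also let the DBA force a checkpoint manually; in SQL Server's
T-SQL, for example:

-- Force all dirty buffers for the current database to be written to disk
CHECKPOINT;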
The database recovery process involves bringing the database to a consistent state after a
failure. Transaction recovery procedures generally make use of deferred-write and
write-through techniques: with deferred write, transaction operations update only the
transaction log, and the database is physically updated after the transaction reaches its
commit point; with write-through, the database is updated immediately by transaction
operations during the transaction's execution, even before the transaction commits.
Using Transaction Log Backups
With the combination of full database and differential backups, it is possible to take snapshots of
the data and recover them.
But in some situations, it is also desirable to have backups of all events that have occurred in a
database, like a record of every single statement executed. With such functionality, it would be
possible to recover the database to any state required.
Transaction log backups provide this functionality. As its name suggests, the transaction log
backup is a backup of transaction log entries and contains all transactions that have happened to
the database.
The main advantages of transaction log backups are as follows:
• Transaction log backups allow you to recover the database to a specific point in time.
• Because transaction log backups are backups of log entries, it is even possible to perform
a backup from the transaction log if the data files are destroyed.
• With this backup, it is possible to recover the database up to the last transaction that
took place before the failure occurred. Thus, in the event of a failure, not a single
committed transaction need be lost.
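A minimal sketch of this workflow in SQL Server's T-SQL follows; the database name, file
paths, and timestamp are hypothetical.

-- Periodic full backup plus frequent transaction log backups
BACKUP DATABASE SalesDB TO DISK = 'D:\backups\SalesDB_full.bak';
BACKUP LOG SalesDB TO DISK = 'D:\backups\SalesDB_log1.trn';

-- Point-in-time recovery: restore the full backup without recovering,
-- then roll the log forward and stop just before the failure
RESTORE DATABASE SalesDB FROM DISK = 'D:\backups\SalesDB_full.bak' WITH NORECOVERY;
RESTORE LOG SalesDB FROM DISK = 'D:\backups\SalesDB_log1.trn'
    WITH STOPAT = '2019-10-25 14:30:00', RECOVERY;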