Desirable features in an information system
• Integrity
• Referential integrity
• Data independence
• Controlled redundancy
• Security
• Privacy
File systems
• Sequential or serial
• Indexed sequential
• Relative
Database definition
• a computerised record-keeping system
• used by a range of users who have different
requirements
– minimal enquiries
– in-depth updating
– restructuring
• A well-implemented database will have data integrity,
data independence, controlled redundancy, security and
privacy, for all users.
Uses of a Database
• Generally used for on-line transaction
processing (OLTP)
• Data warehouses are a hybrid form of
database, used for on-line analytical
processing (OLAP)
Structure of a database
• External Schema
• Conceptual Schema
• Internal Schema
• Physical Schema
External level
• Level visible to user
• Multiple views of the system
– e.g. View an order - see limited product and
customer information
• Only the database Administrator may
access the whole database at this level
EXTERNAL SCHEMA
• Each external view is defined by means of an
external schema
• Provides definitions of each external view.
• Written in a Data Definition Language
• individual to the user
• accessed through a 3GL, a query language or a
special purpose forms or menu-based language
Conceptual level
• CONCEPTUAL - represents the entire
information content of the database
• Consists of multiple types of conceptual record.
This level preserves the data independence of
the database.
• CONCEPTUAL SCHEMA - defines each of the
various types of conceptual record, in a
conceptual Data Definition Language.
Internal level
• INTERNAL - a low-level representation of the
entire database; it consists of multiple
occurrences of multiple types of internal record.
It is the stored record, inasmuch as it contains all
but the device-specific information on the
storage of the database.
• PHYSICAL - the physical device and block
addresses for each of the records.
Mappings
• Each level maps onto adjoining levels
• conceptual / internal mapping specifies how
conceptual records and fields are represented at
the internal level
• Changes can be made in the internal level
without affecting the conceptual level
• external / conceptual mapping defines the
correspondence between an external view and
the conceptual view
DBMS - Database Management
System
• software handling access to the database
• gives the database administrator and all
users the access to the database to which
they are entitled
How requests are processed
• User issues request (e.g. through SQL)
• DBMS intercepts and analyses request
• DBMS inspects user's external schema, external
to conceptual mapping, conceptual schema,
conceptual to internal mapping and the storage
structure definition.
• DBMS executes operations on stored database.
DATABASE ADMINISTRATOR
(DBA)
• Decide on the storage structure and
access strategy
• Liaise with the users
• Define security and integrity checks
• Define a backup and recovery strategy
• Monitor and respond to performance
Utilities used by the DBA
• Load routines
• Dump/Restore routines
• Reorganisation routines
• Statistics routines
• Analysis routines
• Data dictionary (containing METADATA,
which gives data descriptions and
mappings)
Relational database
• Data is independent from programs and from
other data
• Data is represented in TABLES rather than files.
(one entity corresponds to 1 table)
• Column headings are described as ATTRIBUTES.
(each attribute takes its values from a DOMAIN)
• Items of information as TUPLES or ROWS
rather than records (i.e. occurrences of the
entity)
Definitions
• A RELATION is a collection of semantically related information,
usually containing a unique key. A RELATION = a Table
• FOREIGN key - a key to a different relation that is used as non-key
data in this relation. (i.e. the enforcing field in the relationship)
• SIMPLE key - uses one item from the row
• COMPOUND key - uses more than one item / attribute
• Unnormalized data - contains headings, footings, differing number of
occurrences for different fields.
Properties of a relation
• Third Normal form (TNF) test.
– All row entries are non-divisible (atomic) - i.e.
no such thing as arrays
– All entries in a particular column are drawn
from the same set (i.e. no such thing as
redefines)
Normalisation of data
• Collect all documents to be entered/produced
• Represent documents in unnormalized form
• Choose and identify key items, giving unnormalized data + keys
• Separate out repeating groups -> 1st Normal Form (1NF)
• Separate out part key dependencies -> 2nd Normal Form (2NF)
• Separate out inter-data and inter-key dependencies -> 3rd Normal
Form (TNF)
• Apply TNF tests
• Optimise by combining relations with identical keys
• Apply TNF tests again
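The normalisation steps above can be sketched on a small example. The order documents, field names and the `normalise` function below are hypothetical illustrations, not taken from the slides; the sketch assumes that a product description depends only on the product code and that a customer name depends only on the customer number.

```python
# Hypothetical unnormalized documents: each order carries a repeating group of lines.
DOCUMENTS = [
    {"order_no": 10, "customer_no": 7, "customer_name": "Murphy",
     "lines": [("P1", "Bolt", 4), ("P2", "Nut", 9)]},
    {"order_no": 11, "customer_no": 7, "customer_name": "Murphy",
     "lines": [("P1", "Bolt", 1)]},
]


def normalise(documents):
    """Split unnormalized order documents into four relations in third normal form."""
    orders, customers, products, order_lines = {}, {}, {}, {}
    for doc in documents:
        # 3NF: customer_name depends on customer_no (non-key data), not on order_no.
        customers[doc["customer_no"]] = doc["customer_name"]
        orders[doc["order_no"]] = doc["customer_no"]
        for product_code, description, quantity in doc["lines"]:
            # 2NF: description depends on only part of the compound key.
            products[product_code] = description
            # 1NF: each entry of the repeating group becomes its own row,
            # identified by the compound key (order_no, product_code).
            order_lines[(doc["order_no"], product_code)] = quantity
    return orders, customers, products, order_lines


orders, customers, products, order_lines = normalise(DOCUMENTS)
```

Each dictionary stands for one relation, with the dictionary key playing the part of the relation's key. The customer name "Murphy" is now stored once rather than once per order.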
Relational database
• This is a database that is perceived by its users as a
collection of tables. Each table can define an ENTITY
• Entities can be related through RELATIONSHIPS
• Relationships are implemented by use of foreign keys in
tables
• Each column has a unique name within the table
• All rows are distinct (no two are the same)
• Row or column order is not significant
• Every relation must have a key
Operations in SQL
• Tables are created by the CREATE TABLE statement:
CREATE TABLE DRIVERS
(DRIVER_NUMBER SMALLINT NOT NULL,
DRIVER_NAME CHAR(20),
HOME_DEPOT CHAR(6),
VEHICLE_TYPE etc...
• Tables can be changed:
ALTER TABLE DRIVERS
ADD OTHER_ALLOWANCES CHAR(6);
• and deleted:
DROP TABLE DRIVERS;
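A minimal way to try these three statements is Python's built-in `sqlite3` module with an in-memory database. This is a sketch under assumptions: the slide elides the columns after VEHICLE_TYPE, so the table below stops there, and the `CHAR(2)` width for VEHICLE_TYPE is a guess for illustration only. The function name `table_lifecycle` is hypothetical.

```python
import sqlite3


def table_lifecycle():
    """Create, alter and drop the DRIVERS table, reporting what happened."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """CREATE TABLE DRIVERS
           (DRIVER_NUMBER SMALLINT NOT NULL,
            DRIVER_NAME   CHAR(20),
            HOME_DEPOT    CHAR(6),
            VEHICLE_TYPE  CHAR(2))"""  # width assumed; the source elides this column
    )
    conn.execute("ALTER TABLE DRIVERS ADD OTHER_ALLOWANCES CHAR(6)")
    # PRAGMA table_info returns one row per column; index 1 is the column name.
    columns = [row[1] for row in conn.execute("PRAGMA table_info(DRIVERS)")]
    conn.execute("DROP TABLE DRIVERS")
    remaining = conn.execute(
        "SELECT COUNT(*) FROM sqlite_master WHERE name = 'DRIVERS'"
    ).fetchone()[0]
    conn.close()
    return columns, remaining


columns, remaining = table_lifecycle()
```

After the ALTER, `columns` includes OTHER_ALLOWANCES as the last column; after the DROP, `remaining` is 0 because the catalogue no longer lists the table.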
Operations in SQL
• Tables can be joined together on columns
which hold the same attribute:
SELECT DRIVER.*, VEHICLE.*
FROM DRIVER, VEHICLE
WHERE DRIVER.VEHICLE_TYPE =
VEHICLE.VEHICLE_TYPE;
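The join can be tried with Python's built-in `sqlite3` module. The column lists and sample rows below are hypothetical (the slides do not define the VEHICLE table), and an ORDER BY is added only to make the result order predictable.

```python
import sqlite3


def join_drivers_to_vehicles():
    """Run the slide's join against a small in-memory sample."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        -- Table layouts and rows are assumptions for illustration.
        CREATE TABLE DRIVER (DRIVER_NUMBER SMALLINT NOT NULL,
                             DRIVER_NAME CHAR(20),
                             VEHICLE_TYPE CHAR(2));
        CREATE TABLE VEHICLE (VEHICLE_TYPE CHAR(2) NOT NULL,
                              DESCRIPTION CHAR(20));
        INSERT INTO DRIVER VALUES (1, 'Byrne', 'HG'), (2, 'Walsh', 'VN'), (3, 'Doyle', 'XX');
        INSERT INTO VEHICLE VALUES ('HG', 'Heavy goods'), ('VN', 'Van');
    """)
    rows = conn.execute("""
        SELECT DRIVER.*, VEHICLE.*
        FROM DRIVER, VEHICLE
        WHERE DRIVER.VEHICLE_TYPE = VEHICLE.VEHICLE_TYPE
        ORDER BY DRIVER.DRIVER_NUMBER
    """).fetchall()
    conn.close()
    return rows


joined = join_drivers_to_vehicles()
```

Driver 3 has vehicle type 'XX', which matches no VEHICLE row, so that driver does not appear in the result. Only rows that satisfy the WHERE condition on both tables are returned.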
Implementation of desirable
features
• Integrity
– A field’s validation can be declared when the field is
declared. If this validation is used, then the integrity
of the field remains intact.
– Entity integrity - No attribute participating in the
primary key of a base relation is allowed to accept
null values.
– Domain constraints - what are the possible valid
values that can be used?
• Referential integrity
– Through the propagation and use of foreign
keys, no detail can be created where a master
is needed, nor can a master be deleted
without consent to the deletion of the details
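The integrity rules above can be declared in the table definitions themselves. The sketch below uses Python's built-in `sqlite3` module; the DEPOT table, the 0 to 50 range for YEARS_SERVICE and the function name are assumptions made for illustration. Note that SQLite only enforces foreign keys once `PRAGMA foreign_keys = ON` has been issued.

```python
import sqlite3


def integrity_demo():
    """Return the labels of the statements that the database rejected."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # off by default in SQLite
    conn.executescript("""
        CREATE TABLE DEPOT (DEPOT_CODE CHAR(6) NOT NULL PRIMARY KEY);
        CREATE TABLE DRIVERS (
            DRIVER_NUMBER SMALLINT NOT NULL PRIMARY KEY,                     -- entity integrity
            YEARS_SERVICE SMALLINT CHECK (YEARS_SERVICE BETWEEN 0 AND 50),   -- domain constraint (range assumed)
            HOME_DEPOT    CHAR(6) NOT NULL REFERENCES DEPOT (DEPOT_CODE)     -- referential integrity
        );
        INSERT INTO DEPOT VALUES ('CORK');
        INSERT INTO DRIVERS VALUES (1, 5, 'CORK');
    """)
    attempts = {
        "null primary key": "INSERT INTO DRIVERS VALUES (NULL, 5, 'CORK')",
        "out-of-domain value": "INSERT INTO DRIVERS VALUES (2, 99, 'CORK')",
        "detail without master": "INSERT INTO DRIVERS VALUES (3, 5, 'DUBLIN')",
        "master deleted with details": "DELETE FROM DEPOT WHERE DEPOT_CODE = 'CORK'",
    }
    rejected = []
    for label, statement in attempts.items():
        try:
            conn.execute(statement)
        except sqlite3.IntegrityError:
            rejected.append(label)
    conn.close()
    return rejected


rejected = integrity_demo()
```

All four attempts are rejected. In this sketch the master row simply cannot be deleted while details exist; a real schema could instead give consent to the deletion of the details with an `ON DELETE CASCADE` clause.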
Implementation of desirable
features
• Data independence
– The implementation of relational databases
causes the external and conceptual schema
to be data independent. The internal schema
and the physical level are data dependent.
• Controlled redundancy
– The relational model reduces redundancy at
the conceptual level
SECURITY
• Legal, social and ethical considerations (e.g.
Data protection act)
• Physical controls - locking of computer rooms
• Company policy
• Operational - e.g. password access rulings
• Hardware controls - e.g. privileged operating
mode
• Limits on fields that users can see
Security and SQL
• SQL allows views to be created that only allow the view users
access to a range or selection of values for particular fields; e.g.
CREATE VIEW CORK_DRIVERS
AS SELECT DRIVER_NUMBER, DRIVER_NAME,
YEARS_SERVICE
FROM DRIVERS
WHERE HOME_DEPOT = 'CORK';
• This is a value-dependent constraint.
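The CORK_DRIVERS view can be exercised with Python's built-in `sqlite3` module. The sample rows are hypothetical, and the DRIVERS columns are limited to those the view needs.

```python
import sqlite3


def cork_view_rows():
    """Create the CORK_DRIVERS view and return what a view user would see."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE DRIVERS (DRIVER_NUMBER SMALLINT NOT NULL,
                              DRIVER_NAME CHAR(20),
                              YEARS_SERVICE SMALLINT,
                              HOME_DEPOT CHAR(6));
        -- Sample rows are assumptions for illustration.
        INSERT INTO DRIVERS VALUES (1, 'Byrne', 5, 'CORK'),
                                   (2, 'Walsh', 9, 'DUBLIN'),
                                   (3, 'Doyle', 2, 'CORK');
        CREATE VIEW CORK_DRIVERS AS
            SELECT DRIVER_NUMBER, DRIVER_NAME, YEARS_SERVICE
            FROM DRIVERS
            WHERE HOME_DEPOT = 'CORK';
    """)
    rows = conn.execute(
        "SELECT * FROM CORK_DRIVERS ORDER BY DRIVER_NUMBER"
    ).fetchall()
    conn.close()
    return rows


cork_rows = cork_view_rows()
```

The Dublin driver is invisible through the view, and so is the HOME_DEPOT column itself, because the view does not select it.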
Security and Privacy in SQL
• Different users can be granted different access rights:
GRANT SELECT, UPDATE (CREDIT_LIMIT, AMOUNT_OWING)
ON TABLE CUSTOMER TO GRP_ACCNTS;
GRANT SELECT ON VIEW CUSTOMER_TOTAL TO
DEPOT_CONTROLLERS;
• The access types that can be granted are SELECT,
UPDATE, DELETE and INSERT.
• Access rights can also be REVOKEd.
Security and SQL
• Field-dependent constraints can be imposed by omitting
the field from the view. Views can also be presented so
that they give totals only - not individual items:
CREATE VIEW CUSTOMER_TOTAL
AS SELECT CREDIT_LIMIT, SUM(AMOUNT_OWING) AS TOTAL_OWING
FROM CUSTOMERS
GROUP BY CREDIT_LIMIT;
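A totals-only view can be tried with Python's built-in `sqlite3` module. This sketch assumes the view is meant to sum AMOUNT_OWING for each credit limit; the TOTAL_OWING column name, the CUSTOMER_NO column and the sample rows are hypothetical.

```python
import sqlite3


def customer_totals():
    """Return the grouped totals exposed by the CUSTOMER_TOTAL view."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE CUSTOMERS (CUSTOMER_NO SMALLINT NOT NULL,
                                CREDIT_LIMIT INTEGER,
                                AMOUNT_OWING INTEGER);
        -- Sample rows are assumptions for illustration.
        INSERT INTO CUSTOMERS VALUES (1, 1000, 200), (2, 1000, 300), (3, 5000, 1500);
        CREATE VIEW CUSTOMER_TOTAL AS
            SELECT CREDIT_LIMIT, SUM(AMOUNT_OWING) AS TOTAL_OWING
            FROM CUSTOMERS
            GROUP BY CREDIT_LIMIT;
    """)
    rows = conn.execute(
        "SELECT * FROM CUSTOMER_TOTAL ORDER BY CREDIT_LIMIT"
    ).fetchall()
    conn.close()
    return rows


totals = customer_totals()
```

A user with access to the view alone sees one row per credit limit and cannot recover what an individual customer owes, except where a group happens to contain a single customer.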
JOURNALLING
• An audit trail can be set up to follow operations on the
database. This involves journalling of each, or a specific
type of operation on the database or some part of it.
• The audit trail should specify the operation, the terminal
from which it was invoked, the user, the date-time, the
database, table, record and field affected, the old and
new value of the field.
• The advantages of this are that it gives the auditors a
way of tracing any discrepancies. However, it slows
down the operation of the system considerably.
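One common way to journal a specific type of operation is a trigger that writes to an audit table. The sketch below uses Python's built-in `sqlite3` module; the JOURNAL table layout and the trigger name are hypothetical. SQLite has no notion of users or terminals, so those two fields of the audit trail are left out here; a multi-user DBMS would record them as well.

```python
import sqlite3


def audited_update():
    """Update one credit limit and return the journal entry it produced."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE CUSTOMERS (CUSTOMER_NO SMALLINT NOT NULL PRIMARY KEY,
                                CREDIT_LIMIT INTEGER);
        -- Hypothetical journal layout: operation, table, record, field, old and new value.
        CREATE TABLE JOURNAL (OPERATION TEXT, TABLE_NAME TEXT, RECORD_KEY INTEGER,
                              FIELD_NAME TEXT, OLD_VALUE TEXT, NEW_VALUE TEXT,
                              LOGGED_AT TEXT DEFAULT CURRENT_TIMESTAMP);
        CREATE TRIGGER JOURNAL_CREDIT_LIMIT
        AFTER UPDATE OF CREDIT_LIMIT ON CUSTOMERS
        BEGIN
            INSERT INTO JOURNAL (OPERATION, TABLE_NAME, RECORD_KEY,
                                 FIELD_NAME, OLD_VALUE, NEW_VALUE)
            VALUES ('UPDATE', 'CUSTOMERS', OLD.CUSTOMER_NO,
                    'CREDIT_LIMIT', OLD.CREDIT_LIMIT, NEW.CREDIT_LIMIT);
        END;
        INSERT INTO CUSTOMERS VALUES (1, 1000);
        UPDATE CUSTOMERS SET CREDIT_LIMIT = 2500 WHERE CUSTOMER_NO = 1;
    """)
    rows = conn.execute("""
        SELECT OPERATION, TABLE_NAME, RECORD_KEY, FIELD_NAME, OLD_VALUE, NEW_VALUE
        FROM JOURNAL
    """).fetchall()
    conn.close()
    return rows


journal = audited_update()
```

The old and new values are stored as text so that one journal table can serve fields of any type. Every audited update now costs an extra insert, which is the slowdown the slide warns about.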
BACKUP SECURITY
• As well as the database administrator ensuring that the
full database is backed up in a logical way, most
databases have a COMMIT/ROLLBACK facility.
• Whenever a program updates the database, the update
remains tentative only, until a COMMIT causes it to
become permanent, or a ROLLBACK cancels it.
ROLLBACK is only issued if an exception occurs.
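The tentative-update behaviour can be seen with Python's built-in `sqlite3` module. In this sketch the CUSTOMERS layout, the rule that AMOUNT_OWING may not exceed CREDIT_LIMIT, and the function names are all assumptions for illustration.

```python
import sqlite3


def make_db():
    """Build a small in-memory CUSTOMERS table with a credit-limit check."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE CUSTOMERS (
            CUSTOMER_NO  SMALLINT NOT NULL PRIMARY KEY,
            CREDIT_LIMIT INTEGER,
            AMOUNT_OWING INTEGER,
            CHECK (AMOUNT_OWING <= CREDIT_LIMIT)   -- assumed business rule
        );
        INSERT INTO CUSTOMERS VALUES (1, 1000, 200), (2, 1000, 900);
    """)
    return conn


def charge_all(conn, amount):
    """Add `amount` to every customer's balance, or to none of them."""
    customer_numbers = [row[0] for row in conn.execute(
        "SELECT CUSTOMER_NO FROM CUSTOMERS ORDER BY CUSTOMER_NO")]
    try:
        for customer_no in customer_numbers:
            conn.execute(
                "UPDATE CUSTOMERS SET AMOUNT_OWING = AMOUNT_OWING + ? "
                "WHERE CUSTOMER_NO = ?", (amount, customer_no))
        conn.commit()      # all updates become permanent together
        return True
    except sqlite3.IntegrityError:
        conn.rollback()    # the exception cancels every tentative update
        return False


def balances(conn):
    return conn.execute(
        "SELECT AMOUNT_OWING FROM CUSTOMERS ORDER BY CUSTOMER_NO").fetchall()


conn = make_db()
first = charge_all(conn, 50)     # both customers stay within their limit
second = charge_all(conn, 100)   # customer 2 would exceed the limit
final = balances(conn)
```

The second call updates customer 1 tentatively, fails on customer 2, and the ROLLBACK undoes the change to customer 1 as well, so the balances are left as they were after the first call.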
Internal level (relational)
• Internal schema (some Data Definition Language).
Stored_Driver         Length 41.
Driver_number         BYTE(6), Offset 0, INDEX.
Driver_Name           Byte(20), Offset 6.
Driver_Home_depot     Byte(1), Offset 26.
Driver_vehicle_type   Byte(2), Offset 28. (**)
Driver_empl_date      Byte(8), Offset 30.
Driver_TFA            Byte(2), Offset 38.
Driver_Tax_Table      Byte(1), Offset 40.
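The stored-record layout above can be read with Python's built-in `struct` module, which is one way to picture what the conceptual/internal mapping has to do. This is a sketch under one assumption: the layout leaves byte 27 unused (Driver_Home_depot ends at offset 26 and Driver_vehicle_type starts at offset 28), and the code treats that byte as padding. The sample record and the function name are hypothetical.

```python
import struct

# "=" disables native alignment; "x" skips the byte at offset 27 (assumed padding).
STORED_DRIVER = struct.Struct("=6s20s1sx2s8s2s1s")

FIELDS = (
    "Driver_number", "Driver_Name", "Driver_Home_depot", "Driver_vehicle_type",
    "Driver_empl_date", "Driver_TFA", "Driver_Tax_Table",
)


def decode_stored_driver(record: bytes) -> dict:
    """Map one 41-byte stored record onto named fields (values stay raw bytes)."""
    return dict(zip(FIELDS, STORED_DRIVER.unpack(record)))


# Hypothetical stored record, built field by field for illustration.
sample = (b"D00001" + b"Byrne".ljust(20) + b"3" + b"\x00"
          + b"HG" + b"19990104" + b"\x01\x02" + b"A")
driver = decode_stored_driver(sample)
```

`STORED_DRIVER.size` is 41, matching the declared record length. If the offsets changed, only the format string would need to change; code that uses the field names would be unaffected, which is the point of keeping the internal level separate.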
Conceptual schema (some Data
Definition Language)
Driver.
Driver_number            Character (6).
Driver_Name              Character (20).
Driver_Home_depot        Numeric (1).
Driver_vehicle_type      Character (2).
Driver_employment_date   Date
Driver_TFA               Numeric 7 digits 2 decimal
Driver_Tax_Table         Character 1.
Subschema or External schema
(COBOL)
01 Driver-pay-table.
02 Driver_no             pic x(6).
02 Driver_name           pic x(20).
02 Driver_Vehicle_type   pic xx.
02 Driver_TFA            pic 9(5)v99.
02 Driver_Tax_Table      pic A.
02 Driver_Employ_date    pic 99/99/9999.
External schema
01 Driver_location_table.
02 Driver_no             pic x(6).
02 Driver_name           pic x(20).
02 Driver_Vehicle_type   pic xx.
02 Driver_Home_depot     pic 9.
Data Warehouse
• Definition - a collection of current and historical
operational data stored for use in executive support
systems (a.k.a. executive information systems EIS) and
decision support systems DSS.
• Purposes
– Growing demand that executives and management have rapid,
easy access to operational data for planning and decision
making
– Diversity of format and location of historical data
Storage of non-standard data
types
• Pictures, Video clips, Sound clips
• Can be done on a relational database. These data types
are seen conceptually as just another data type. The
database holds only the data itself - i.e. a video clip can
be held on a relational database, but separate
functionality must be provided to play it - this also
applies to sound and still pictures.
• Oracle and Informix call these databases “universal”
databases. IBM call them “extenders” to DB2.
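Storing a clip as "just another data type" can be sketched with a BLOB column, using Python's built-in `sqlite3` module. The MEDIA table and the function name are hypothetical, and SQLite stands in here for the products named above.

```python
import sqlite3


def store_and_fetch_clip(clip: bytes) -> bytes:
    """Store raw media bytes in a BLOB column and read them back unchanged."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE MEDIA (MEDIA_ID INTEGER PRIMARY KEY, KIND TEXT, CONTENT BLOB)"
    )
    conn.execute("INSERT INTO MEDIA (KIND, CONTENT) VALUES (?, ?)", ("sound", clip))
    stored = conn.execute(
        "SELECT CONTENT FROM MEDIA WHERE KIND = 'sound'"
    ).fetchone()[0]
    conn.close()
    return stored


fetched = store_and_fetch_clip(b"\x00\x01RIFF")
```

The database returns exactly the bytes it was given and knows nothing about their format. Playing the clip is left to separate software, as the slide notes.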
Distributed databases
• Databases can now be distributed over different
computers and operating systems by the use of
middleware
– Open DataBase Connectivity (ODBC)
• In order for database requests to be passed from one
computer to the other, special software is supplied
that will translate the client computer’s request into a
format understood by the target server computer.
The reply is then converted back. This layer of
software is called middleware.
ODBC
• This middleware provides only database connectivity.
There is a generally accepted ODBC (Open DataBase
Connectivity) standard. This increases scalability.
• ODBC connects to relational database management
systems, but not to flat files, thereby excluding a lot of
legacy systems.
• All the major RDBMS vendors are offering software to
link their databases to the Web. Primary examples are
Oracle’s Network Computing Architecture and Informix’s
Universal Web Architecture