
Chapter Seven
Basic Data Storage
Importance of this chapter
With the rapid growth of diverse applications across the globe, managing the huge
volume of user information they generate is of utmost importance. The main focus of this
chapter is how data are stored, organized and managed in the data stores behind
software, websites, automated systems and other tools.
Expected outcome of this chapter
• Describe database and DBMS concepts, terminology, and architecture.
• Describe the basic concepts necessary for a good understanding of database design
and implementation.
• Describe the conceptual modeling techniques used in database systems.
• Describe the relational data model, its integrity constraints and update
operations, and the operations of the relational algebra.
7.1. An introduction to database
Database: A database is a structured and self-describing collection of data that is used
to store data and data definitions (metadata) and to manage data consistency and
integrity.
A database contains data and metadata stored in a table-like format with columns
(attributes) and rows (records). Metadata describes the data definitions. The tool
that is used to manage a database is called a Database Management System (DBMS). It
provides common functionalities and interfaces for managing and controlling database
activities.
Database functionalities:
• Data Storage Management
• Data Transformation and Presentation
• Security
• Multi-user Access Control
• Backup and Recovery
• Data Integrity
• Database Access Language
• Database Communication Interface
Applications of databases:
• Banking: transactions
• Airlines: reservations, schedules
• Universities: registration, grades
• Sales: customers, products, purchases
• Online retailers: order tracking, customized recommendations
• Manufacturing: production, inventory, orders, supply chain
• Human resources: employee records, salaries, tax deductions
Storing data in a database has advantages over storing it in a file system (for example
in .doc or Excel files), because data in a database can be easily manipulated (added,
deleted, updated). This is difficult in a file processing system, where there are many
different file types.
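The add, update and delete operations mentioned above can be sketched with Python's built-in sqlite3 module. This is only an illustration: the customers table, its columns and the sample values are hypothetical, and SQLite stands in for whichever DBMS is actually used.

```python
import sqlite3

# In-memory database so the sketch leaves no file behind (an assumption for this example).
conn = sqlite3.connect(":memory:")

# Hypothetical table used only for illustration.
conn.execute("CREATE TABLE customers (id INTEGER, name TEXT, city TEXT)")

# Add two records.
conn.execute("INSERT INTO customers VALUES (1, 'Asha', 'Dhaka')")
conn.execute("INSERT INTO customers VALUES (2, 'Rahim', 'Khulna')")

# Update one record.
conn.execute("UPDATE customers SET city = 'Sylhet' WHERE id = 2")

# Delete one record.
conn.execute("DELETE FROM customers WHERE id = 1")

# Read back what is left in the table.
remaining = conn.execute("SELECT id, name, city FROM customers").fetchall()
print(remaining)
```

Each change is a single statement that works the same way regardless of what the data describe, which is the kind of uniform manipulation a collection of differently formatted files does not offer.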
Advantages:
• High data quality, integrity, and consistency
• Reduced data redundancy and application maintenance
• Easy access and sharing
• Scalable
• Improved security
• Specialized and productive management tool
Major disadvantages:
• Increased complexity
• Greater impact of failure
Problems related to topics
Problem 1: What is a database? Why do we use a database?
Solution: A database is a structured and self-describing collection of data that is used to
store data and data definitions (metadata) and to manage data consistency and integrity.
A database is used for:
• Data Storage Management
• Data Transformation and Presentation
• Security
• Multi-user Access Control
• Backup and Recovery
• Data Integrity
• Database Access Language
• Database Communication Interface
Problem 2: What are the advantages and disadvantages of a database?
Solution: The advantages and disadvantages of a database are given below.
Advantages:
• High data quality, integrity, and consistency
• Reduced data redundancy and application maintenance
• Easy access and sharing
• Scalable
• Improved security
• Specialized and productive management tool
Major disadvantages:
• Increased complexity
• Greater impact of failure
Exercises:
1. What are the contents of a database? What is metadata?
2. Name some applications where databases are used.
3. Why is storing data in a file system more difficult than storing it in a database?
7.2. Database architecture: Table, Fields and Records
Relational Database: A relational database (RDB) is a collection of multiple data sets
organized into tables, records and columns. RDBs also establish well-defined
relationships between database tables. In the simplest terms, a relational database is
one that presents information in tables with rows and columns and shows the relations
between them.
All database design and data manipulation tasks are carried out by a Database Management
System (DBMS). It is computer software designed for the purpose of managing databases
based on a variety of data models.
Regardless of the data model, the logical structure of the database is called the Schema
and the actual content of the database at a particular point in time is called an Instance.
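The difference between schema and instance can be seen in a small sketch, assuming SQLite through Python's sqlite3 module. The course table reuses the attribute names from this chapter, while the column types and the inserted values are assumptions made only for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Column types are assumed; the chapter only names the attributes.
conn.execute("CREATE TABLE course (course_id TEXT, title TEXT, credits INTEGER)")


def schema(connection):
    """Return the logical structure: the stored table definition."""
    return connection.execute(
        "SELECT sql FROM sqlite_master WHERE type = 'table' AND name = 'course'"
    ).fetchone()[0]


def instance(connection):
    """Return the actual content of the table at this moment."""
    return connection.execute("SELECT * FROM course").fetchall()


schema_before = schema(conn)
instance_before = instance(conn)

# Illustrative row, not taken from the chapter.
conn.execute("INSERT INTO course VALUES ('CS-101', 'Intro to Computer Science', 4)")

schema_after = schema(conn)
instance_after = instance(conn)

print(schema_before == schema_after)         # the schema did not change
print(instance_before, "->", instance_after)  # the instance did
```

Adding a row changes the instance while the schema stays the same; the schema changes only when the table definition itself is altered.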
In a database, data are organized in relations (tables), which may be linked by
constraints. Columns define what information is stored, and each row holds one
record. For example, in a university database, there can be tables to store information
about courses, instructors, students, sections, etc. Figure (1) is a sample database design
for a university.
Columns are called fields, and each row contains a record that stores a value for each field.
For example, figure (2) shows a database table for storing data about instructors. Here,
ID, name, dept_name and salary are the fields, and {2222, Einstein, Physics, 95000}
is a record.
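As a rough sketch, the instructor table described above could be created as follows with Python's sqlite3 module. The field names and the record come from the text; the column types (INTEGER, TEXT) are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Fields as named in the text; the types are assumed for this sketch.
conn.execute(
    "CREATE TABLE instructor (ID INTEGER, name TEXT, dept_name TEXT, salary INTEGER)"
)

# The record given in the text.
conn.execute("INSERT INTO instructor VALUES (2222, 'Einstein', 'Physics', 95000)")

cursor = conn.execute("SELECT * FROM instructor")
fields = [column[0] for column in cursor.description]  # column (field) names
records = cursor.fetchall()                            # rows (records)

print(fields)
print(records)
```

Here `fields` holds the four column names and `records` holds one tuple per row, with one value for each field.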
Database table properties:
• The table has a unique name.
• All values in a row describe one instance.
• All values in a column are of the same kind.
• Each row is distinct.
• A cell of the table holds a single value.
• Each column has a unique name.
• Rows have no inherent order.
• NULL values can also be stored in tables.
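Some of these properties are enforced directly by a DBMS. The sketch below, assuming SQLite through Python's sqlite3 module, checks three of them: unique table names, unique column names, and storage of NULL. The section table and its columns are hypothetical. Row distinctness is not shown because it is typically enforced only once a key is declared.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table for illustration.
conn.execute("CREATE TABLE section (sec_id TEXT, room TEXT)")


def fails(sql):
    """Return True if the DBMS rejects the statement."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.OperationalError:
        return True


# A second table with the same name is rejected.
duplicate_table_rejected = fails("CREATE TABLE section (sec_id TEXT)")

# Two columns with the same name in one table are rejected.
duplicate_column_rejected = fails("CREATE TABLE room (name TEXT, name TEXT)")

# A NULL value can be stored where no value is known.
conn.execute("INSERT INTO section VALUES ('S1', NULL)")
stored = conn.execute("SELECT sec_id, room FROM section").fetchall()

print(duplicate_table_rejected, duplicate_column_rejected, stored)
```

In Python, the stored NULL comes back as `None`.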
Problems related to topics
Problem 1: What is a DBMS?
Solution: Database design and data manipulation tasks are carried out by a Database
Management System (DBMS). It is computer software designed for the purpose of
managing databases based on a variety of data models.
Problem 2: What are schema and instance?
Solution: The logical structure of the database is called the Schema and the actual content
of the database at a particular point in time is called an Instance.
Problem 3: What are the properties of a database table?
Solution: Database table properties:
• The table has a unique name.
• All values in a row describe one instance.
• All values in a column are of the same kind.
• Each row is distinct.
• A cell of the table holds a single value.
• Each column has a unique name.
• Rows have no inherent order.
• NULL values can also be stored in tables.
Exercises:
1. What is a relational database? How are data stored in a relational database?
What is a field?
2. Find the tables necessary for a Library Management System or a Hospital
Management System.
3. Identify the fields of each table for the above-mentioned systems.
7.3. Designing a database: Entity, Attributes and Relationships
Data Model: A detailed model that captures the overall structure of organizational data
while being independent of any implementation considerations.
Data modelling involves examining the data objects in a system and identifying the
relationships between these objects. There are different ways to model a database. They
are:
• Relational model
• Entity-Relationship data model (mainly for database design)
• Object-based data models (Object-oriented and Object-relational)
• Semi-structured data model (XML)
An Entity Relationship Diagram (ERD) is a form of data modelling that is widely used in
designing a database. This model uses a graphical representation of entities and their
relationships to each other, based on which tables are created in a database. The primary
purpose of an ERD is to document the logical structure of a database.
Entity: An entity is an object that exists and is distinguishable from other objects.
Example: a specific person, company, event or plant.
Attribute: An entity is represented by a set of attributes, that is, descriptive properties
possessed by the entity. Example: instructor = (ID, name, street, city, salary), course =
(course_id, title, credits)
Relationship: A relationship is an association among several entities. Example:
students are enrolled in courses; an instructor teaches courses.
Determining all the entities and their attributes is the primary task when modelling a
database. Entities are denoted by rectangles. For example, in a university database the
following entities may be present:
Attributes are denoted by ellipses. Relationships between entities are denoted by
rhombuses (diamonds). For example, for the Students entity the attributes would be sid,
name, age and gpa, and for Courses the attributes would be cid and title. The two entities
are related through the relationship enrol as follows:
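As a complement to the diagram, the same two entities and the enrol relationship can be sketched in code with Python dataclasses. The attribute names follow the text; the attribute types and the sample student and course are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Student:
    """Entity with the attributes sid, name, age and gpa (types assumed)."""
    sid: int
    name: str
    age: int
    gpa: float


@dataclass(frozen=True)
class Course:
    """Entity with the attributes cid and title (types assumed)."""
    cid: str
    title: str


@dataclass(frozen=True)
class Enrol:
    """Relationship: associates one student with one course."""
    student: Student
    course: Course


# Hypothetical sample data.
alice = Student(sid=1, name="Alice", age=20, gpa=3.7)
databases = Course(cid="C1", title="Databases")
enrolments = [Enrol(student=alice, course=databases)]

# Follow the relationship from a student to the titles of their courses.
titles_for_alice = [e.course.title for e in enrolments if e.student == alice]
print(titles_for_alice)
```

Each class corresponds to a rectangle or rhombus in the diagram and each field to an ellipse.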
Problems related to topics:
Problem 1: What is a data model? Write the names of different data models.
Solution: Data Model: A detailed model that captures the overall structure of organizational
data while being independent of any implementation considerations.
Different kinds of data models are as follows:
• Relational model
• Entity-Relationship data model (mainly for database design)
• Object-based data models (Object-oriented and Object-relational)
• Semi-structured data model (XML)
Problem 2: What are entities and attributes? Discuss with examples.
Solution: Entity: An entity is an object that exists and is distinguishable from other objects.
Example: a specific person, company, event or plant.
Attribute: An entity is represented by a set of attributes, that is, descriptive properties
possessed by the entity. Example: instructor = (ID, name, street, city, salary), course =
(course_id, title, credits).
For example, in a university management system, there can be entities like faculty,
departments, classrooms, students and courses. Each of these entities will have several
attributes.
Suppose that the attributes of the Students entity are SID, name, age and GPA. Another
entity, Courses, has attributes like CID and title. This is demonstrated in the diagram below:
Exercises:
1. Find all the entities of an online bookshop management database. Find out all the
attributes for each entity.
7.4. Designing a database: Keys
It is important that every entity in an entity set be uniquely identifiable. In practice, we
use the values of certain attributes to uniquely identify an entity. For example, in a bank
database, a customer’s full information can be retrieved using the customer’s SSN.
In a database table, keys are defined to identify each record distinctly. A key can be a
single attribute or a set of attributes. For example, in a Person table a person
can be uniquely identified by the SSN or by a combination of First Name, Last Name and SSN.
In practice, these combinations of attributes are classified into four types of keys:
• Super key
• Candidate key
• Primary key
• Foreign key
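The idea that a set of attributes identifies each record distinctly can be checked on sample data. The function below is a minimal sketch with a hypothetical name, is_super_key, and invented Person rows. A set of attributes that passes this check is a super key for the sample, and a candidate key is a super key from which no attribute can be removed without losing uniqueness.

```python
def is_super_key(rows, attributes):
    """Return True if no two rows share the same values for `attributes`."""
    seen = set()
    for row in rows:
        value = tuple(row[a] for a in attributes)
        if value in seen:
            return False  # two records would be indistinguishable
        seen.add(value)
    return True


# Hypothetical Person records; the SSN values are made up.
people = [
    {"SSN": "111", "FirstName": "Ann", "LastName": "Lee"},
    {"SSN": "222", "FirstName": "Ann", "LastName": "Roy"},
    {"SSN": "333", "FirstName": "Bob", "LastName": "Lee"},
]

ssn_only = is_super_key(people, ["SSN"])
name_and_ssn = is_super_key(people, ["FirstName", "LastName", "SSN"])
first_name_only = is_super_key(people, ["FirstName"])

print(ssn_only, name_and_ssn, first_name_only)
```

A check on one data sample can only rule a key out. Whether a set of attributes really is a key depends on what the design allows for all possible data, not only the rows present at one moment.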
Primary key: A primary key is the candidate key that is most appropriate to be the main
reference key for the table. It is often a single field.
The primary key must contain unique values, must never be null, and must uniquely identify
each record in the table.
For example, in a Students table, using only {StudentID} it is possible to identify each
record distinctly. So, this is the primary key for the table.
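The uniqueness requirement can be seen in a short sketch, assuming SQLite through Python's sqlite3 module. The StudentID field comes from the text; the name column and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# StudentID is declared as the primary key; the name column is assumed.
conn.execute("CREATE TABLE Students (StudentID INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO Students VALUES (1, 'Alice')")

# A second record with the same StudentID violates the primary key.
try:
    conn.execute("INSERT INTO Students VALUES (1, 'Bob')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# The key value alone is enough to retrieve exactly one record.
row = conn.execute("SELECT name FROM Students WHERE StudentID = 1").fetchone()
print(duplicate_rejected, row)
```

The DBMS refuses the duplicate, so a lookup by StudentID can never return more than one record.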
Foreign Key: A foreign key is generally a primary key from one table that appears as a field
in another table to establish a relation between the first and second table.
For example, consider the relationship between Students and Courses. Student information
is stored in the Students table and course information is stored in the Courses table. Given
these two tables, how can we show which student takes which course? This is done using a
foreign key field {courseId} in the Students table, which contains values of the
{courseId} field of the Courses table.
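A minimal sketch of this arrangement, assuming SQLite through Python's sqlite3 module, is given below. The courseId and StudentID fields follow the text; the other columns and the sample rows are hypothetical. SQLite enforces foreign keys only after `PRAGMA foreign_keys = ON` is issued, so the sketch enables it first.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce foreign keys by default

conn.execute("CREATE TABLE Courses (courseId TEXT PRIMARY KEY, title TEXT)")
conn.execute(
    "CREATE TABLE Students ("
    " StudentID INTEGER PRIMARY KEY,"
    " name TEXT,"
    " courseId TEXT REFERENCES Courses(courseId))"  # the foreign key field
)

# Hypothetical sample rows.
conn.execute("INSERT INTO Courses VALUES ('C1', 'Databases')")
conn.execute("INSERT INTO Students VALUES (1, 'Alice', 'C1')")

# A student pointing at a course that does not exist is rejected.
try:
    conn.execute("INSERT INTO Students VALUES (2, 'Bob', 'C9')")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

# The foreign key lets the two tables be joined to show who takes what.
pairs = conn.execute(
    "SELECT s.name, c.title FROM Students s JOIN Courses c ON s.courseId = c.courseId"
).fetchall()
print(orphan_rejected, pairs)
```

With a single courseId field in Students, each student can be linked to one course only, which matches the example in the text.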
Problems related to topics
Problem 1: Why do we need to use keys in a database? Write the names of the keys used in
databases.
Solution: In a database table, keys are defined to identify each record distinctly. It is
important that every entity in an entity set be uniquely identifiable. In practice, we use the
values of certain attributes, called keys, to uniquely identify an entity. A key can be a
single attribute or a set of attributes.
Different kinds of keys are used in designing a database. They are:
• Super key
• Candidate key
• Primary key
• Foreign key
Problem 2: What is a primary key? Describe with an example.
Solution: Primary key: A primary key is the candidate key that is most appropriate to be
the main reference key for the table. It is often a single field.
The primary key must contain unique values, must never be null, and must uniquely identify
each record in the table.
For example, in a Students table, using only {StudentID} it is possible to identify each
record distinctly. So, this is the primary key for the table.
Problem 3: For a movie database, identify the Primary Key and Foreign Key in the Directors
and Movies tables.
Solution: In the Directors table, DirectorID is the primary key. Similarly, in the Movies table,
MovieID is the primary key.
To establish a relationship between these two tables, the DirectorID field would be used as a
foreign key in the Movies table, referencing the primary key field DirectorID of the
Directors table.
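This solution can be sketched as follows, again assuming SQLite through Python's sqlite3 module. DirectorID and MovieID come from the solution; the name and title columns and all sample values are hypothetical placeholders.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # needed for SQLite to enforce the reference

conn.execute("CREATE TABLE Directors (DirectorID INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE Movies ("
    " MovieID INTEGER PRIMARY KEY,"
    " title TEXT,"
    " DirectorID INTEGER REFERENCES Directors(DirectorID))"  # foreign key
)

# Placeholder data: one director with two movies.
conn.execute("INSERT INTO Directors VALUES (1, 'Director A')")
conn.executemany(
    "INSERT INTO Movies VALUES (?, ?, ?)",
    [(10, "Movie X", 1), (11, "Movie Y", 1)],
)

# Join on the foreign key to list each movie with its director.
movies_by_director = conn.execute(
    "SELECT d.name, m.title FROM Movies m"
    " JOIN Directors d ON m.DirectorID = d.DirectorID"
    " ORDER BY m.MovieID"
).fetchall()
print(movies_by_director)
```

Because the foreign key sits in Movies, one director can be referenced by many movies while each movie refers to a single director.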
Exercises:
1. What is a Foreign Key? Why do we use a Foreign Key in a database? Explain with
an example.
2. For an online book shop database, identify the Primary Key and Foreign Key in the
Customers and Orders tables.
 Points to Remember
1. A database is a structured and self-describing collection of data.
2. The logical structure of the database is called the Schema and the actual content of the
database at a particular point in time is called an Instance.
3. A database contains data and metadata stored in a table-like format with columns
(attributes) and rows (records). Columns are called fields and each row contains a
record which stores a value for each field.
4. An entity is an object that exists and is distinguishable from other objects. It is
represented by a set of attributes, that is, descriptive properties possessed by the
entity. Entities are connected by different relationships.
5. Keys are defined to identify each record distinctly. Several types of keys are used in
designing a database.
Vocabulary
DBMS – Database Management System. A tool to manage and manipulate the data in a database.
RDB – Relational Database. It is a collection of multiple data sets organized into tables,
records and columns.
ERD – Entity Relationship Diagram. This is a form of data modeling which uses a graphical
representation of entities and their relationships to each other, based on which tables are
created in a database.
Schema – The design of the database. The logical structure of the database is called the
Schema.
Instance – The actual content of the database at a particular point in time is called an
Instance.
Key – Keys are defined to identify each record distinctly. A key can be a single attribute or
a set of attributes.