Uploaded by Kumaka Labukab

Information-Management-reviewer

I. Introduction
Why database?
Data is abundant, prevalent, and
persistent; it is everywhere.
Data
- Representation of facts or concepts.
- Raw facts.
Information
- Knowledge derived from processing the
data.
Management
- Refers to the organized and strategic
handling of information and data within
an organization.
- Discipline that focuses on the proper
generation, storage, and retrieval of
information.
Information management
- Process by which relevant information is
provided to decision-makers in a timely
manner (Davis 1997).
- Aims to provide the right information to
the right person at the right time.
Database
- Shared, integrated computer structure
that stores a collection of the
following:
• Metadata – data about data.
• End-user data – raw facts of interest
to the end user.
File system
- Composed of a collection of file folders
with proper tags.
- Was useful for data management but has
become obsolete.
Drawbacks of file system
1. Data redundancy and inconsistency.
2. Difficulty accessing data.
3. Data isolation.
4. Integrity problems.
5. Atomicity problems.
6. Concurrent-access anomalies.
7. Security problems.
Database system
- Collection of interrelated data and a set
of programs that allow users to access
and modify these data.
Components of database system
1. Data
2. Hardware
3. Software
a. Operating system
b. Application programs and utility
software
c. DBMS
4. Procedure
5. People
a. Administrator
b. Designer
c. End-user
• Casual
• Naïve
• Sophisticated
• Standalone users
Database management system (DBMS)
- Collection of programs that manages the
database structure and controls access
to the data stored in the database.
DBMS functions:
- Data dictionary management
- Data storage management
- Data transformation and presentation
- Security management
- Multi-user access control
- Backup and recovery management
- Data integrity management
- Database access languages and
application programming interfaces
- Database communication interfaces
Advantages
1. Improved data sharing.
2. Improved data security.
3. Better data integration.
4. Minimized data inconsistency.
5. Improved data access.
6. Improved decision-making.
7. Increased end-user productivity.
Disadvantages
1. Complexity
2. Skilled resources
3. Performance tuning
4. Database failure
5. Cost
6. Additional hardware cost
7. Frequent upgrades
II. Introduction Part 2
Data model
- Collection of concepts that can be used
to describe the structure of a database.
Categories of data model
High-level or conceptual data models
- Close to the way many users perceive
data. Use concepts such as entities,
attributes, and relationships.
Low-level or physical data models
- Describe the details of how data is
stored on the computer.
Schema and instances
Schema
- Organization of data as a blueprint of
how the database is constructed.
- A displayed schema is called a schema diagram.
Instances / Database state / current set of
occurrences
- Data in the database at a particular
moment in time.
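The difference between a schema and an instance can be sketched with Python's built-in sqlite3 module. SQLite is used here only as a convenient stand-in for any relational DBMS, and the `student` table is a hypothetical example. The schema stays the same while the database state changes between two moments in time.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")

# Schema: the stored blueprint of the table, independent of its contents.
schema = conn.execute(
    "SELECT sql FROM sqlite_master WHERE type = 'table' AND name = 'student'"
).fetchone()[0]

# Instance (database state) at two different moments in time.
state_before = conn.execute("SELECT * FROM student").fetchall()
conn.execute("INSERT INTO student (name) VALUES ('Ana'), ('Ben')")
state_after = conn.execute("SELECT * FROM student ORDER BY id").fetchall()

print(schema)
print(state_before)
print(state_after)
```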
Level of schema
Data independence
1. Logical data independence
Capacity to change the conceptual
schema without having to change the
external schemas or application programs.
2. Physical data independence
Capacity to change the internal
schema without having to change the
conceptual schema.
Due to physical data independence, the
following changes will not affect the
conceptual layer:
- Using a new storage device
- Modifying the file organization
technique
- Switching to different data structures
- Changing the access method
- Modifying indexes
- Changing compression techniques
- Changing the location of the database
Due to logical data independence, the
following changes will not affect the
external layer:
- Adding, modifying, or deleting an
attribute, entity, or relationship without
rewriting existing application programs
- Merging two records into one
- Breaking an existing record into two or
more records
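Logical data independence can be illustrated with a view acting as the external schema. This is a minimal sketch using Python's sqlite3 module as a stand-in DBMS; the `employee` table, the `employee_directory` view, and the `app_query` function are hypothetical names chosen for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT, salary REAL);
    INSERT INTO employee (name, salary) VALUES ('Ana', 5000), ('Ben', 4000);
    -- External schema: the application only ever reads this view.
    CREATE VIEW employee_directory AS SELECT id, name FROM employee;
""")

def app_query(connection):
    """Hypothetical application code written against the external view."""
    return connection.execute(
        "SELECT name FROM employee_directory ORDER BY id"
    ).fetchall()

before = app_query(conn)

# Conceptual-schema change: add a new attribute to the base table.
conn.execute("ALTER TABLE employee ADD COLUMN department TEXT")

# The application query is unchanged and returns the same result.
after = app_query(conn)
print(before == after)
```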
DBMS architecture
Tier-I (single tier architecture)
- Where the client, server, and database
all reside on the same machine.
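Tier-II (two tier architecture)
- The client application communicates
directly with the DBMS on the server
through an application interface such as
ODBC (Open Database Connectivity), an API
which allows the client-side program to
call the DBMS.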
Tier-III (three tier architecture)
- An extension of the 2-tier architecture
with 3 layers: presentation layer,
application layer, database server.
DBMS languages
- Data Definition Language (DDL)
for specifying the database schema.
• CREATE – create a database or its
objects (tables, indexes, views).
• ALTER – alter the structure of the database.
• DROP – delete databases and objects.
• TRUNCATE – remove all rows from a
table while keeping its structure.
• RENAME – rename an object.
• COMMENT – add comments to the data
dictionary.
- Data Manipulation Language (DML)
for accessing and manipulating data
in a database.
• SELECT – read records from tables.
• INSERT – insert records into tables.
• UPDATE – update data.
• DELETE – delete records.
- Data Control Language (DCL)
for granting and revoking access.
• GRANT – grant access to a user.
• REVOKE – revoke access from a user.
- Transaction Control Language (TCL)
for committing or rolling back changes.
• COMMIT – persist the changes
made by DML commands.
• ROLLBACK – roll back the changes made.
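The language categories above can be tried out in a short sketch. It assumes Python's sqlite3 module as a stand-in DBMS and a hypothetical `account` table. DCL is left out because SQLite has no GRANT or REVOKE statements.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: define and alter the schema.
conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, owner TEXT NOT NULL)")
conn.execute("ALTER TABLE account ADD COLUMN balance INTEGER NOT NULL DEFAULT 0")

# DML + TCL: insert rows, then COMMIT to persist them.
conn.execute("INSERT INTO account (owner, balance) VALUES ('Ana', 100)")
conn.execute("INSERT INTO account (owner, balance) VALUES ('Ben', 50)")
conn.commit()

# DML + TCL: an UPDATE and a DELETE that are undone by ROLLBACK.
conn.execute("UPDATE account SET balance = balance - 500 WHERE owner = 'Ana'")
conn.execute("DELETE FROM account WHERE owner = 'Ben'")
conn.rollback()

# DML: read the data back; only the committed rows remain, unchanged.
rows = conn.execute("SELECT owner, balance FROM account ORDER BY id").fetchall()
print(rows)
```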
III. Data Manipulation
Data modeling
- First step in designing a database.
- Creating a specific data model for a
determined problem domain.
Data model
- Representation, usually graphical, of
more complex real-world data structures.
- Represents data structures and their
constructs with the purpose of
supporting a specific problem domain.
Types of data models
1. Flat file model
Consists of a single, two-dimensional
array of like elements.
2. Hierarchical model
Data is organized into a tree-like
structure in which each record has one
parent record and may have many children.
3. Network model
Expands upon the hierarchical
structure, allowing many-to-many
relationships in a tree-like structure that
allows multiple parents.
4. Object-oriented database models
Aim to avoid the object-relational
impedance mismatch.
5. Entity-relationship model
Describes the structure of a
database with the help of a diagram,
the Entity Relationship Diagram.
Entity Relationship Diagram
- Shows the complete logical structure of
a database.
- Best used for the conceptual design of a
database.
- Based on entities, their attributes,
and the relationships among entities.
Entity
- Real-world object having properties
called attributes. Every attribute is
defined by its set of permitted values,
called its domain.
Relationships
- Logical association among entities.
- Mapped with entities in various ways.
Mapping cardinalities
- Define the number of associations
between two entities.
• One to one
• One to many
• Many to one
• Many to many
6. Relational model
- Ordering of columns is immaterial in a
table, there can’t be duplicate tuples or
rows in a table, and each tuple contains a
single value for each of its attributes.
- Contains multiple tables, each like the
one in the “flat” database model.
Degree of abstraction (Data hiding)
- DBMS tries to hide details of how the
data is stored and maintained,
implementation details of the database
and complexity of the database.
Degrees of abstraction
External model/schema
- End-user’s view of the data environment.
The ER representation of this view is
called an external schema.
Conceptual model/schema
- Represents a global view of the entire
database as seen by the organization.
- Basis for the identification and
high-level description of the main data
objects.
Internal model/schema
- Representation of the database as
“seen” by the DBMS.
- The internal schema depicts a specific
representation of the internal model,
using the database constructs supported
by the chosen DBMS.
Physical model/schema
- Lowest level of abstraction, describing
the way data is saved on storage media.
IV. Data Model XML
Types of data structure
Extensible Markup Language (XML)
- Way to structure and store data in a
format readable by human and machine
- Allows you to structure and organize
data in a hierarchical manner.
Elements
- Fundamental building blocks of an XML
document.
- Enclosed in angle brackets (<>).
Attributes
- Provide additional information about
an element.
- Specified like:
<book title="ABC" author="None"/>
Text
- Provides the actual data.
<name>john doe</name>
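Elements, attributes, and text can be inspected with Python's standard xml.etree.ElementTree module. The sketch below combines the two snippets above into one small document; the `<library>` wrapper element is an assumption added so that the document has a single root.

```python
import xml.etree.ElementTree as ET

document = """
<library>
    <book title="ABC" author="None">
        <name>john doe</name>
    </book>
</library>
"""

root = ET.fromstring(document)
book = root.find("book")
name = book.find("name")

print(root.tag)     # element name of the root
print(book.attrib)  # attributes of <book> as a dictionary
print(name.text)    # text content of <name>
```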
What is a document schema in XML?
- A document schema is like a blueprint or
set of rules that defines the structure,
elements, and data types of an XML
document.
- Acts as a guide to ensure XML
documents conform to a specific format
or structure.
Key components of a document schema
1. Elements – building blocks of an XML
document. Represent different pieces of
data.
2. Attributes – provide additional
information about elements. Properties
or characteristics of an element.
3. Data type – a schema can specify the data
type that an element or attribute can
contain. Includes text, numbers, and dates.
4. Hierarchical structure – defines how
elements can be nested within each
other, creating a tree-like structure.
Determines the order of and the
relationships between elements.
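The four components can be seen together in a small schema written in XML Schema (XSD), one common schema language for XML. The sketch below is only illustrative: the `book` schema is hypothetical, and ElementTree is used just to read the schema's own structure, since Python's standard library parses XML but does not validate documents against a schema.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"  # XSD namespace in ElementTree notation

schema_text = """
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="book">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="name" type="xs:string"/>
        <xs:element name="published" type="xs:date" minOccurs="0"/>
      </xs:sequence>
      <xs:attribute name="title" type="xs:string" use="required"/>
    </xs:complexType>
  </xs:element>
</xs:schema>
"""

schema = ET.fromstring(schema_text)
book = schema.find(f"{XS}element")
sequence = book.find(f"{XS}complexType/{XS}sequence")

# Child elements in the order the schema requires, with their data types.
children = [(e.get("name"), e.get("type")) for e in sequence.findall(f"{XS}element")]
# Attributes declared for the <book> element.
attributes = [a.get("name") for a in book.iter(f"{XS}attribute")]

print(children)
print(attributes)
```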
Document schema
<complexType>
- Element that defines a complex type.
- A complex type describes an XML element
that contains other elements and/or
attributes.
Sequence
- Specifies that the child elements must
appear in a sequence.
- Each child element can occur from 0 to
any number of times.
V. Relational database model
Logical view
Relational model
- Represents the database as a collection
of relations.
- Relation is nothing but a table of values.
What are DBMS keys?
- An attribute or set of attributes which
helps you uniquely identify a record or
row of data in a relation (table).
Super key
- A set of one or more attributes which
uniquely identifies rows in a table.
Candidate
- Minimal set of attributes that uniquely
identifies tuples in a table.
- A super key with no redundant attributes.
Primary
- Column or group of columns in a table
that uniquely identifies every row in that
table.
- Two rows can’t have the same primary
key value; the value cannot be null and
should not be modified or updated.
Foreign
- Column (or group of columns) that refers
to the primary key of another table,
creating a relationship between the two
tables.
- The purpose is to maintain data
integrity and allow navigation between
two different instances of an entity.
Composite
- Combination of two or more columns
that uniquely identify a record.
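A rough sketch of how these keys behave, using Python's sqlite3 module as a stand-in DBMS. The `student` and `enrollment` tables and the `try_insert` helper are hypothetical names for this example: `student_id` is the primary key, `email` is a second candidate key, and `enrollment` uses a composite key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE student (
        student_id INTEGER PRIMARY KEY,       -- primary key
        email      TEXT NOT NULL UNIQUE       -- another candidate key
    );
    CREATE TABLE enrollment (
        student_id  INTEGER NOT NULL,
        course_code TEXT NOT NULL,
        PRIMARY KEY (student_id, course_code) -- composite key
    );
    INSERT INTO student VALUES (1, 'ana@example.com');
    INSERT INTO enrollment VALUES (1, 'DB101'), (1, 'CS102');
""")

def try_insert(sql):
    """Return True if the row is accepted, False if a key constraint rejects it."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False

duplicate_pk = try_insert("INSERT INTO student VALUES (1, 'ben@example.com')")
duplicate_candidate = try_insert("INSERT INTO student VALUES (2, 'ana@example.com')")
duplicate_composite = try_insert("INSERT INTO enrollment VALUES (1, 'DB101')")
new_combination = try_insert("INSERT INTO enrollment VALUES (2, 'DB101')")

print(duplicate_pk, duplicate_candidate, duplicate_composite, new_combination)
```

Only the last insert succeeds: each column of the composite key may repeat on its own, but the combination must be unique.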
Integrity rules
- Overall completeness, accuracy, and
consistency of data.
Integrity constraints
Entity integrity
- Primary key value cannot be null.
- The primary key is used to identify
individual rows in a relation.
Domain integrity
- Definition of a valid set of values for an
attribute.
- The value of the attribute must be
available in the corresponding domain.
Referential integrity
- Specified between two tables. It
ensures that the values of a set of
attributes (the foreign key) in one
relation also appear as values of the
referenced attributes in the other
relation.
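The three constraints can be sketched as follows, assuming Python's sqlite3 module as a stand-in DBMS and hypothetical `department` and `employee` tables. Two SQLite-specific details are assumptions of this sketch rather than part of the notes: foreign keys are only enforced after `PRAGMA foreign_keys = ON`, and `NOT NULL` is written explicitly on the text primary key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
    CREATE TABLE department (dept_id INTEGER PRIMARY KEY);
    CREATE TABLE employee (
        emp_id  TEXT NOT NULL PRIMARY KEY,               -- entity integrity
        age     INTEGER CHECK (age BETWEEN 18 AND 65),   -- domain integrity
        dept_id INTEGER REFERENCES department (dept_id)  -- referential integrity
    );
    INSERT INTO department VALUES (10);
""")

def violates(sql):
    """Return True if the statement is rejected by an integrity constraint."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True

entity = violates("INSERT INTO employee VALUES (NULL, 30, 10)")       # null primary key
domain = violates("INSERT INTO employee VALUES ('E1', 12, 10)")       # age outside the domain
referential = violates("INSERT INTO employee VALUES ('E2', 30, 99)")  # no department 99
valid_row = violates("INSERT INTO employee VALUES ('E3', 30, 10)")    # satisfies all three

print(entity, domain, referential, valid_row)
```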
Relational set operators
- Data in relational tables are of limited
value unless the data can be
manipulated to generate useful
information.
- Property of closure – the use of
relational algebra operators on existing
relations (tables) produces new
relations.
1. Select
Also known as RESTRICT, yields values
for all rows found in a table that satisfy
a given condition.
2. Project
Yields all values for selected attributes.
Yields a vertical subset of a table.
3. Union
Combines all rows from two tables,
excluding duplicate rows. Columns and
Domains must be compatible to be used
in the union.
4. Intersect
Yields only the rows that appear in both
tables.
5. Difference
Yields all the rows in one table that are
not found in the other table; it subtracts
one table from the other.
6. Product
Yields all possible pairs of rows from
two tables. Also known as Cartesian
product.
7. Join
Allows the information to be combined
from two or more tables.
a. Inner join
Includes only those tuples with
matching attributes and the rest
are discarded in the resulting
relation.
b. Outer join
Includes all the tuples from at least
one of the participating relations in
the resulting relation, even those
without a match.
i. Left outer join
All tuples from the LEFT relation (R)
are included in the resulting
relation. Tuples in R without a
matching tuple in S get NULL for
the S-attributes.
ii. Right outer join
All tuples from the RIGHT relation (S)
are included in the resulting
relation. Tuples in S without a
matching tuple in R get NULL for
the R-attributes.
iii. Full outer join
All tuples from both relations (R and
S) are included in the resulting
relation, with NULL for the
attributes of the side that has no
matching tuple.
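The operators above map onto SQL fairly directly. The following sketch assumes Python's sqlite3 module as a stand-in DBMS and two small hypothetical relations `r` and `s` with compatible columns. Right and full outer joins are left out because older SQLite versions do not support them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE r (id INTEGER, name TEXT);
    CREATE TABLE s (id INTEGER, name TEXT);
    INSERT INTO r VALUES (1, 'Ana'), (2, 'Ben'), (3, 'Cy');
    INSERT INTO s VALUES (2, 'Ben'), (3, 'Cy'), (4, 'Di');
""")

def q(sql):
    """Run a query and return all result rows."""
    return conn.execute(sql).fetchall()

select_ = q("SELECT * FROM r WHERE id > 1 ORDER BY id")             # horizontal subset
project = q("SELECT name FROM r ORDER BY name")                     # vertical subset
union = q("SELECT * FROM r UNION SELECT * FROM s ORDER BY id")      # duplicates removed
intersect = q("SELECT * FROM r INTERSECT SELECT * FROM s ORDER BY id")
difference = q("SELECT * FROM r EXCEPT SELECT * FROM s ORDER BY id")
product = q("SELECT * FROM r CROSS JOIN s")                         # all pairs of rows
inner_join = q("SELECT r.id, s.name FROM r JOIN s ON r.id = s.id ORDER BY r.id")
left_join = q("SELECT r.id, s.name FROM r LEFT JOIN s ON r.id = s.id ORDER BY r.id")

print(difference)
print(len(product))
print(left_join)
```

Every result is itself a table of rows, which is the closure property: the output of one operator can be fed into another.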
Data dictionary
- Provides detailed description of all
tables found within the
user/designer-created database.
- Contains metadata – data about data.
Relationships within relational database
• One-to-one (1:1)
• One-to-many (1:M)
• Many-to-many (M:M)
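One common way to implement these relationships in tables is sketched below, assuming Python's sqlite3 module as a stand-in DBMS and hypothetical table names. A 1:M relationship uses a foreign key on the "many" side, and an M:M relationship is resolved through a bridge table. A 1:1 relationship can be modelled like 1:M with a UNIQUE constraint on the foreign key, which is not shown here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- 1:M: one department has many employees.
    CREATE TABLE department (dept_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE employee (
        emp_id  INTEGER PRIMARY KEY,
        name    TEXT,
        dept_id INTEGER REFERENCES department (dept_id)
    );
    -- M:M: students and courses, resolved through a bridge table.
    CREATE TABLE student (student_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE course (course_id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE enrollment (
        student_id INTEGER REFERENCES student (student_id),
        course_id  INTEGER REFERENCES course (course_id),
        PRIMARY KEY (student_id, course_id)
    );
    INSERT INTO department VALUES (1, 'IT');
    INSERT INTO employee VALUES (1, 'Ana', 1), (2, 'Ben', 1);
    INSERT INTO student VALUES (1, 'Cy'), (2, 'Di');
    INSERT INTO course VALUES (1, 'Databases'), (2, 'Networks');
    INSERT INTO enrollment VALUES (1, 1), (1, 2), (2, 1);
""")

def count(sql):
    """Return the single number produced by a COUNT(*) query."""
    return conn.execute(sql).fetchone()[0]

employees_in_it = count("SELECT COUNT(*) FROM employee WHERE dept_id = 1")
courses_of_cy = count("SELECT COUNT(*) FROM enrollment WHERE student_id = 1")
students_in_databases = count("SELECT COUNT(*) FROM enrollment WHERE course_id = 1")

print(employees_in_it, courses_of_cy, students_in_databases)
```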
VI. Functional dependency
- Relationship that exists between two
attributes. Typically exists between the
primary key and a non-key attribute
within a table.
Terms:
• Decomposition – rule that suggests that
if a table appears to contain two
entities that are determined by the
same PK, consider breaking them up
into two tables.
• Dependent – right side of the functional
dependency diagram.
• Determinant – left side of the functional
dependency diagram.
• Functional dependency –
relationship between two
attributes, typically between the PK
and other non-key attributes.
• Non-normalized table – a table that
has data redundancy in it.
• Union – rule that suggests that if
two tables are separate and the PK is
the same, consider putting them
together.
Rules of functional dependencies
Multivalued dependency
- Occurs in the situation where there are
multiple independent multivalued
attributes in a single table.
- Complete constraint between two sets
of attributes in a relation.
Trivial functional dependency
- A dependency A -> B is called trivial
if the dependent set of attributes B is
included in the determinant A.
- A -> B where B is a subset of A.
Non-trivial functional dependency
- Occurs when A -> B holds true where B
is not a subset of A.
Transitive dependency
- A type of functional dependency which
is indirectly formed by two functional
dependencies: if A -> B and B -> C
hold, then A -> C is transitive.
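Whether a given set of rows violates a functional dependency can be checked mechanically. The sketch below uses a hypothetical helper `holds` and made-up sample rows. Note that a functional dependency is a rule about every valid state of a table, so a check like this can only show that a sample does not contradict it.

```python
def holds(rows, determinant, dependent):
    """Return True if determinant -> dependent holds in the given rows."""
    seen = {}
    for row in rows:
        left = tuple(row[a] for a in determinant)
        right = tuple(row[a] for a in dependent)
        # The same determinant value must always map to the same dependent value.
        if seen.setdefault(left, right) != right:
            return False
    return True

rows = [
    {"student_id": 1, "name": "Ana", "course": "DB101"},
    {"student_id": 1, "name": "Ana", "course": "CS102"},
    {"student_id": 2, "name": "Ben", "course": "DB101"},
]

id_determines_name = holds(rows, ["student_id"], ["name"])
id_determines_course = holds(rows, ["student_id"], ["course"])
# Trivial dependency: the dependent is a subset of the determinant.
trivial = holds(rows, ["student_id", "name"], ["name"])

print(id_determines_name, id_determines_course, trivial)
```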
Advantages of functional dependency
- Avoids data redundancy.
- Helps maintain the quality of data.
- Helps define meanings and constraints.
- Helps identify bad designs.
- Helps find the facts regarding design.
VII. Normalization
- Process for evaluating and correcting
table structures to minimize data
redundancies, thereby reducing data
anomalies.
Anomalies in DBMS
1. Insertion anomalies
Occur when inserting new data requires
repeating data that is already stored.
2. Deletion anomalies
Occur when deleting a row also removes
other needed data from the table.
3. Update anomalies
Occur when a repeated value is updated
in some rows but not in every row that
stores it.
First normal form (1NF)
- An attribute (column) of a table cannot
hold multiple values.
Rules:
1. Each column should contain atomic values.
2. A column should contain values that are of
the same type.
3. Each column should have a unique name.
4. The order in which data is saved doesn’t
matter.
Second normal form (2NF)
Two rules for 2NF
1. The table must be in 1NF.
2. The table must not have partial
dependency.
Partial dependency
- Occurs when a non-prime attribute is
functionally dependent on part of a
composite key.
Foreign key
- Ensures rows in one table correspond
to rows in another.
Third normal form (3NF)
Two rules for 3NF
1. The table must be in 2NF.
2. The table must not have transitive
dependency.
Transitive dependency
- A non-key attribute is dependent on
another attribute that is not part of the
primary key.
Transitive functional dependency
- When changing a non-key column
might cause another non-key column to
change.
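The effect of removing a transitive dependency can be sketched as follows, assuming Python's sqlite3 module as a stand-in DBMS and hypothetical tables. In `employee_flat`, `dept_name` depends on `dept_id` rather than on the key `emp_id`, which allows an update anomaly. The 3NF decomposition stores each department name once.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Not in 3NF: dept_name depends on dept_id, which is not the key (emp_id).
    CREATE TABLE employee_flat (
        emp_id INTEGER PRIMARY KEY, name TEXT, dept_id INTEGER, dept_name TEXT
    );
    INSERT INTO employee_flat VALUES
        (1, 'Ana', 10, 'IT'), (2, 'Ben', 10, 'IT'), (3, 'Cy', 20, 'HR');

    -- 3NF decomposition: each department name is stored exactly once.
    CREATE TABLE department (dept_id INTEGER PRIMARY KEY, dept_name TEXT);
    CREATE TABLE employee (
        emp_id INTEGER PRIMARY KEY, name TEXT,
        dept_id INTEGER REFERENCES department (dept_id)
    );
    INSERT INTO department SELECT DISTINCT dept_id, dept_name FROM employee_flat;
    INSERT INTO employee SELECT emp_id, name, dept_id FROM employee_flat;
""")

# Update anomaly in the flat table: only one of the two 'IT' rows is changed.
conn.execute("UPDATE employee_flat SET dept_name = 'Tech' WHERE emp_id = 1")
flat_names = conn.execute(
    "SELECT DISTINCT dept_name FROM employee_flat "
    "WHERE dept_id = 10 ORDER BY dept_name"
).fetchall()

# In the normalized design a single-row update renames the department everywhere.
conn.execute("UPDATE department SET dept_name = 'Tech' WHERE dept_id = 10")
normalized_names = conn.execute("""
    SELECT DISTINCT d.dept_name FROM employee e
    JOIN department d ON d.dept_id = e.dept_id WHERE e.dept_id = 10
""").fetchall()

print(flat_names)        # department 10 now has two conflicting names
print(normalized_names)  # department 10 has exactly one name
```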
CODE CHEAT SHEET
CREATE DB
create database databaseName;
DROP DB
drop database databaseName;
USE DB
use databaseName;
DROP tables inside DB
drop table tableName;
CREATE table
create table tableName(
id int unsigned not null auto_increment,
first_col varchar(255) not null,
second_col varchar(255) not null,
third_col varchar(255) not null,
primary key (id));
CREATE table with foreign key
create table tableName(
id int(11) primary key,
foreignId int(11),
first_col varchar(255),
foreign key (foreignId) references
tableWhereForeign(foreignId));
SHOW table
show tables;
INSPECT the table schema
desc tableName;
ALTER contents in the table
alter table tableName
modify first_col varchar(255) not null;
RENAME table name
rename table tableName to newTableName;
SHOW table contents
select * from tableName;
ADD column in table
alter table tableName
add new_col varchar(255) not null
[first | after col_name];
DROP column in table
alter table tableName
drop column col_name;
RENAME column in table
alter table tableName
change column old_name new_name varchar(255)
not null
[first | after col_name];
DELETE ROW
delete from tableName where
col_name = 'value';
CREATE a primary key using alter table
alter table tableName
add constraint tableName_pk
primary key (id);
DROP primary key
alter table tableName
drop primary key;
ADD foreign key
alter table tableName
add constraint fk_foreign_id
foreign key (foreign_id_on_this_table)
references tableWhereForeign(foreign_id);
CASCADING
STEP 1:
show create table tableName;
-- before
STEP 2:
alter table tableName
add constraint fk_foreign_id
foreign key (foreign_id_on_this_table)
references tableWhereForeign(foreign_id)
on delete cascade
on update restrict;
STEP 3:
show create table tableName;
-- after
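What the cascading options do can be tried out in a small sketch. It assumes Python's sqlite3 module as a stand-in for MySQL and hypothetical `parent` and `child` tables. The constraint is declared at table creation because SQLite cannot add a foreign key with ALTER TABLE, and foreign keys must first be enabled with a PRAGMA.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # required for SQLite to enforce foreign keys
conn.executescript("""
    CREATE TABLE parent (id INTEGER PRIMARY KEY);
    CREATE TABLE child (
        id        INTEGER PRIMARY KEY,
        parent_id INTEGER,
        CONSTRAINT fk_parent_id FOREIGN KEY (parent_id)
            REFERENCES parent (id)
            ON DELETE CASCADE
            ON UPDATE RESTRICT
    );
    INSERT INTO parent VALUES (1), (2);
    INSERT INTO child VALUES (10, 1), (11, 1), (12, 2);
""")

# ON UPDATE RESTRICT: changing a referenced parent key is refused.
try:
    conn.execute("UPDATE parent SET id = 99 WHERE id = 1")
    update_blocked = False
except sqlite3.IntegrityError:
    update_blocked = True

# ON DELETE CASCADE: deleting a parent row also deletes its child rows.
conn.execute("DELETE FROM parent WHERE id = 1")
remaining = conn.execute("SELECT id FROM child ORDER BY id").fetchall()

print(update_blocked)
print(remaining)
```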
BACKUP
STEP 1:
exit
STEP 2:
mysqldump -u root -p databaseName > E:\folderDestination\databaseBackup.sql
USE BACKUP
mysql -u root -p databaseName < E:\folderDestination\databaseBackup.sql