DATABASE ESSENTIALS I:
RDBMS
BCIS 4660
Obi Ogbanufe, Ph.D
“If we have data, let’s look at data.
If all we have is opinions,
let’s go with mine.”
Jim Barksdale
Former Netscape CEO
OBJECTIVES
▪ Understand databases
▪ Understand relational database management systems (RDBMS)
▪ Understand databases and database objects (tables, views, indexes etc.)
▪ Understand database guidelines and standards
COURSE OVERVIEW
Data Warehouse Overview
▪ Operational Data
▪ Data Warehouse
▪ Data Extraction and Integration
▪ Business Intelligence Applications
COURSE PLAN
▪ Intro to Data Warehousing
▪ DW and BI
▪ Database Fundamentals
▪ DW Design
▪ ETL
▪ Introduction to DW
▪ Database Essentials
▪ Dimensional Modeling
▪ Design ETL
▪ Business Intelligence
▪ T-SQL
▪ DW Design Process
▪ Manage ETL
▪ Business Intelligence w/DW
▪ BI Application
▪ Advanced DW
▪ Cloud DW
▪ ER Modeling
WHAT IS A DATABASE?
▪ A database is an electronic collection of data that is organized
or structured in a specific format.
DATABASE MANAGEMENT SYSTEM
A database system is an information or computer system that manages the
collection, storage, and retrieval of data
▪ Organizations’ data must be stored and managed
▪ The data should be available to users that need it
▪ The management of the data needs to be automated
Class discussions and activities
Can you think of other instances that
require the collection, management, and
automation of data for a business?
DATABASE MANAGEMENT SYSTEM
▪ Simply put, a database management system
(DBMS) is a software system that manages data
▪ It is a program that manages the storage,
update, and retrieval of data
▪ It manages how users interact to add, update, and
delete data
▪ It provides and manages the interface between
the stored data and the users
▪ It ensures that the data is consistent, available, and
accessible to users or other programs
RELATIONAL DATABASE MANAGEMENT SYSTEM
▪ A relational database is a database structure that
allows database objects to have relationships with
other objects in the database
▪ A DBMS that manages relational objects is called a
relational database management system (RDBMS)
▪ Data in a relational database is stored in tables
(for example, Students and Courses)
▪ Data manipulation in the RDBMS uses Structured
Query Language (SQL)
RELATIONAL DATABASE: TABLE ELEMENTS
▪ A table is a two-dimensional structure made up of
rows and columns
o Row: record, tuple
o Column: attribute, field, variable
▪ Each row in a table should have the same number of
columns
▪ A relationship is made between the tables (entities)
when there is a common column (attribute) in both
tables
Key terms: Table, Row, Column, Relationship
A RELATIONAL DATABASE
Students table (Columns = Fields, Rows = Records; Primary Key: StudentID)
StudentID  FirstName  LastName    DOB
889900     LaTonya    Baker       10/12/2000
997766     Michael    Caine       06/06/2001
334455     Quyhn      Tran        05/20/1999
772255     Terry      Ostermeier  02/08/1998
009277     Chike      Ogumike     01/29/2004
115566     August     Rush        12/25/2002

Courses table (Primary Key label on CourseID; Relationships: StudentID links to the Students table)
CourseID   CourseName  StudentID
00012345   BCIS4600    889900
00012345   BCIS4600    997766
00012345   BCIS4600    334455
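The relationship between the two tables can be exercised with a join on the shared StudentID column. The following is a minimal sketch, assuming Python's built-in sqlite3 module and an in-memory SQLite database as a stand-in for SQL Server. Table and column names follow the slide, only four of the student rows are loaded, and IDs are stored as text (an assumption) so leading zeros survive.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for SQL Server (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Students ("
    "StudentID TEXT PRIMARY KEY, FirstName TEXT, LastName TEXT, DOB TEXT)"
)
conn.execute(
    "CREATE TABLE Courses (CourseID TEXT, CourseName TEXT, StudentID TEXT)"
)

conn.executemany(
    "INSERT INTO Students VALUES (?, ?, ?, ?)",
    [
        ("889900", "LaTonya", "Baker", "10/12/2000"),
        ("997766", "Michael", "Caine", "06/06/2001"),
        ("334455", "Quyhn", "Tran", "05/20/1999"),
        ("772255", "Terry", "Ostermeier", "02/08/1998"),
    ],
)
conn.executemany(
    "INSERT INTO Courses VALUES (?, ?, ?)",
    [
        ("00012345", "BCIS4600", "889900"),
        ("00012345", "BCIS4600", "997766"),
        ("00012345", "BCIS4600", "334455"),
    ],
)

# The common column StudentID is what relates a course row to a student row.
enrolled = conn.execute(
    """
    SELECT s.FirstName, s.LastName, c.CourseName
    FROM Students AS s
    JOIN Courses AS c ON c.StudentID = s.StudentID
    ORDER BY s.LastName
    """
).fetchall()

for row in enrolled:
    print(row)
```

Terry Ostermeier has no matching row in Courses, so the inner join leaves that student out of the result.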
RDBMS & TRANSACTIONS
▪ An RDBMS must handle transactions in the database
▪ An RDBMS in the organization must ensure that multiple
users can work concurrently without overwriting each other's
work or corrupting the data (multi-user database)
▪ MS SQL Server is a multi-user RDBMS
▪ MS Access is not a multi-user RDBMS
RDBMS & TRANSACTIONS
▪ A transaction is an atomic unit of work that contains one or more
SQL statements
▪ An atomic unit of work must be successfully completed (committed)
as a unit or undone (rolled back) as a unit
▪ A $100 funds withdrawal from an ATM could involve a transaction
with 3 operations
o Decrease the savings balance
o Provide the funds
o Record the transaction in the database
▪ An RDBMS ensures that all three operations complete successfully.
Otherwise, all 3 operations are rolled back
▪ An RDBMS transaction should be “All or Nothing”. We all succeed
or we all fail (Atomicity, Consistency, Isolation, Durability)
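The "all or nothing" behavior of the ATM withdrawal can be sketched as follows. This is an illustration only, assuming Python's built-in sqlite3 module with an in-memory SQLite database instead of SQL Server. The Account and TransactionLog tables, the withdraw function, and the starting balance of 150 are hypothetical, and the physical step of providing the funds is not modeled.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stand-in for SQL Server (assumption)
conn.execute(
    "CREATE TABLE Account ("
    "AccountID INTEGER PRIMARY KEY, "
    "Balance INTEGER NOT NULL CHECK (Balance >= 0))"
)
conn.execute(
    "CREATE TABLE TransactionLog ("
    "LogID INTEGER PRIMARY KEY, AccountID INTEGER, Amount INTEGER)"
)
conn.execute("INSERT INTO Account VALUES (1, 150)")
conn.commit()


def withdraw(conn, account_id, amount):
    """Record the withdrawal and decrease the balance as one unit of work."""
    try:
        # 'with conn' commits if the block succeeds and rolls back
        # every statement in the block if an exception is raised.
        with conn:
            conn.execute(
                "INSERT INTO TransactionLog (AccountID, Amount) VALUES (?, ?)",
                (account_id, amount),
            )
            conn.execute(
                "UPDATE Account SET Balance = Balance - ? WHERE AccountID = ?",
                (amount, account_id),
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint rejected a negative balance.
        return False


first = withdraw(conn, 1, 100)   # succeeds: balance goes from 150 to 50
second = withdraw(conn, 1, 100)  # fails: balance would become negative

balance = conn.execute(
    "SELECT Balance FROM Account WHERE AccountID = 1"
).fetchone()[0]
log_count = conn.execute("SELECT COUNT(*) FROM TransactionLog").fetchone()[0]
print(first, second, balance, log_count)
```

The log row is written before the balance update on purpose: when the second withdrawal fails, the rollback also removes the log row that had already been inserted, so only the committed withdrawal is recorded.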
WHAT IS ACID?
An RDBMS processes transactions according to the ACID properties. ACID
is the set of RDBMS properties that ensures the integrity of transactions
▪ Atomicity: All operations in a transaction are performed or none is
performed. There is no partial transaction
▪ Consistency: The transaction should always keep the database in
a consistent state
▪ Isolation: The effect of a transaction should not be visible to other
transactions until the transaction is complete and committed
▪ Durability: Changes made by committed transactions are
permanent
RELATIONAL DATABASE MANAGEMENT SYSTEM
A relational database consists of the following:
▪ Structure: Defined database objects used for
storing and accessing the database
▪ Operations: Defined actions that allow users to
manipulate the data and the data structures
▪ Rules: Rules that govern the operations performed
on the data and data structure
Database Objects
Tables, Views, Indexes
DATABASE STRUCTURES (OBJECTS)
▪ Database objects are “objects” in the database that are used
to store, view, and retrieve data
▪ There are many database objects. The most frequently used
are: Tables, Views, Indexes, and stored procedures
DATABASE OBJECTS (TABLES)
▪ Tables are the most important objects in an RDBMS
▪ Tables store database data (in rows and columns)
▪ Tables are also called entities
▪ An entity can be a person, place, object or event.
o Each entity (e.g., students, grades, courses) requires
data related to that entity to be stored and
managed.
o Tables: Student, Grades, Courses
DATABASE OBJECTS (VIEWS)
▪ Views are virtual tables (a.k.a. stored queries)
▪ Views do not store data
▪ Views create a layer of abstraction between the table
and the user
▪ Views allow users to access the data without fear of
changing the underlying tables
▪ Views can be used as a security measure. Users can
access the data in tables through views without being
granted permission to the table structures
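A view's role as a stored query over a table can be sketched as follows. This is a minimal illustration, assuming Python's built-in sqlite3 module with an in-memory SQLite database rather than SQL Server. The EmployeeDirectory view name and the third employee row are hypothetical. SQLite views are read-only, which is stricter than SQL Server, where some views can be updated.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stand-in for SQL Server (assumption)
conn.execute(
    "CREATE TABLE Employee ("
    "EmployeeID INTEGER PRIMARY KEY, FirstName TEXT, LastName TEXT, "
    "EmailAddress TEXT, JobID INTEGER)"
)
conn.executemany(
    "INSERT INTO Employee (FirstName, LastName, EmailAddress, JobID) "
    "VALUES (?, ?, ?, ?)",
    [
        ("Ben", "Aller", "Ben.Aller@nocompany.com", 1115),
        ("Kenneth", "Onye", "Ken.Onye@nocompany.com", 1123),
    ],
)

# The view exposes only the name columns; e-mail and job data stay hidden.
conn.execute(
    "CREATE VIEW EmployeeDirectory AS "
    "SELECT FirstName, LastName FROM Employee"
)
directory = conn.execute(
    "SELECT * FROM EmployeeDirectory ORDER BY LastName"
).fetchall()

# The view stores no data: a new table row shows up without recreating it.
conn.execute(
    "INSERT INTO Employee (FirstName, LastName, EmailAddress, JobID) "
    "VALUES ('Millie', 'Shreya', 'Millie.Shreya@nocompany.com', 1234)"
)
directory_size = conn.execute(
    "SELECT COUNT(*) FROM EmployeeDirectory"
).fetchone()[0]

# SQLite-specific behavior: writing through the view is refused.
try:
    conn.execute("INSERT INTO EmployeeDirectory VALUES ('X', 'Y')")
    view_write = "accepted"
except sqlite3.Error:
    view_write = "rejected"

print(directory, directory_size, view_write)
```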
DATABASE OBJECTS (INDEXES)
▪ An index is a database structure that helps improve
performance and speed during data retrieval
▪ An index improves database performance by
allowing the database engine to access and
retrieve data quickly (think of a book index)
▪ Indexes are typically added to columns that are
used frequently in the WHERE and ORDER BY
clauses
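The effect of indexing a column used in a WHERE clause can be observed by asking the database for its query plan before and after the index is created. This is a rough sketch, assuming Python's built-in sqlite3 module and SQLite's EXPLAIN QUERY PLAN statement in place of SQL Server's tooling. The index name IX_Employee_LastName, the plan helper, and the generated sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stand-in for SQL Server (assumption)
conn.execute(
    "CREATE TABLE Employee ("
    "EmployeeID INTEGER PRIMARY KEY, LastName TEXT, JobID INTEGER)"
)
# Generated sample data: 1000 rows with distinct last names.
conn.executemany(
    "INSERT INTO Employee (LastName, JobID) VALUES (?, ?)",
    [(f"Name{i:04d}", 1000 + i % 50) for i in range(1000)],
)

query = "SELECT EmployeeID, LastName, JobID FROM Employee WHERE LastName = ?"


def plan(conn, sql, params):
    """Return SQLite's description of how it would execute the query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The last column of each plan row holds the human-readable detail.
    return " | ".join(row[3] for row in rows)


plan_before = plan(conn, query, ("Name0500",))  # no index: full table scan

conn.execute("CREATE INDEX IX_Employee_LastName ON Employee (LastName)")

plan_after = plan(conn, query, ("Name0500",))   # index lookup on LastName

print("before:", plan_before)
print("after: ", plan_after)
```

The exact wording of the plan text varies between SQLite versions, but the first plan reports a scan of the table and the second names the index.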
INDEXES
▪ Poorly designed indexes and/or lack of indexes could
cause database performance problems
▪ An index is stored on-disk or in-memory and associated
with a table for speeding up the retrieval of rows from the
table
▪ The design of indexes depends on the database type: OLTP
(write-heavy) or OLAP (read-heavy)
▪ There are 3 main types of indexes: Clustered,
Nonclustered, and Unique Indexes
COMMON INDEX TYPES
▪ A clustered index sorts and stores the data rows of the
table in order based on the clustered index key
▪ A nonclustered index is stored separately from the data
rows and does not change the order in which the rows are
stored
▪ Uniqueness can be a property of both clustered and
nonclustered indexes
DATABASE OBJECTS (INDEXES)
Think about an ordered table of BusinessEntityID and row
positions. If the objective is to quickly retrieve a number of
rows, an index can help minimize the number of rows that the
database has to examine in order to retrieve the specified rows.
Database Operations
DDL, DML and DCL
DATABASE OPERATIONS
▪ Almost all operations performed on the RDBMS are done using SQL statements
▪ SQL stands for Structured Query Language
▪ A SQL (pronounced sequel) statement is a program instruction that allows users
and programs to access data in the database. SQL consists of identifiers,
parameters, variables, names, data types etc.
▪ Three main types of SQL statements:
o Data Definition Language (DDL): Commands that define a database, including
creating, altering, and dropping tables and establishing constraints
o Data Manipulation Language (DML): Commands that maintain and query a database
o Data Control Language (DCL): Commands that control a database, including
administering privileges and committing data
DATA DEFINITION LANGUAGE (DDL)
▪ DDL statements allow users to create, alter, and drop
objects and other database structures, including the
database itself
▪ Most DDL statements start with keywords: CREATE,
ALTER, DROP
o CREATE TABLE: creates a new table structure/definition
o DROP TABLE: drops the table and deletes all data
o ALTER TABLE: edits the structure/definition of a table
DDL statements include: ALTER TABLE, CREATE INDEX, DROP INDEX, CREATE VIEW, DROP VIEW, CREATE SCHEMA
DATA DEFINITION LANGUAGE (DDL)
CREATE TABLE
-- Remove any earlier copy of the table so the script can be rerun
DROP TABLE IF EXISTS Employee;

CREATE TABLE Employee (
    EmployeeID   int IDENTITY(1,1) NOT NULL PRIMARY KEY,  -- auto-numbered key
    FirstName    char(30),
    LastName     char(30),
    EmailAddress char(50),
    JobID        int,
    HireDate     date
);

-- EmployeeID is omitted because IDENTITY generates it
INSERT INTO Employee VALUES
    ('Ben', 'Aller', 'Ben.Aller@nocompany.com', 1115, '09/02/2020'),
    ('Kenneth', 'Onye', 'Ken.Onye@nocompany.com', 1123, '07/12/2020');
DATA MANIPULATION LANGUAGE (DML)
▪ DML statements query or manipulate data (content) in
existing database objects
▪ Most DML statements start with the keywords SELECT,
INSERT, UPDATE, or DELETE
▪ DML statements are the most commonly used SQL
statements
o Retrieve (SELECT) data from tables or views
o Add (INSERT) and remove (DELETE) rows of data in tables or
views
o Change (UPDATE) column values in existing records in tables or
views
DATA MANIPULATION LANGUAGE (DML)
SELECT * FROM Employee;

INSERT INTO Employee (LastName, FirstName, EmailAddress, JobID, HireDate)
VALUES ('Shreya', 'Mackenzie', 'Mackenzie.Shreya@bcis.edu', 1234, '14-FEB-2008');

UPDATE Employee
SET FirstName = 'Millie'
WHERE JobID = 1234;

DELETE FROM Employee
WHERE JobID = 1234;
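The four DML statements can be run end to end to see what each one does to the table. This is a minimal sketch, assuming Python's built-in sqlite3 module with an in-memory SQLite database instead of SQL Server. The column types are SQLite approximations of the slide's definitions, and the hire date is written in ISO format as an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stand-in for SQL Server (assumption)
conn.execute(
    "CREATE TABLE Employee ("
    "EmployeeID INTEGER PRIMARY KEY AUTOINCREMENT, FirstName TEXT, "
    "LastName TEXT, EmailAddress TEXT, JobID INTEGER, HireDate TEXT)"
)

# INSERT: add one row; values are bound as parameters instead of pasted in.
conn.execute(
    "INSERT INTO Employee (LastName, FirstName, EmailAddress, JobID, HireDate) "
    "VALUES (?, ?, ?, ?, ?)",
    ("Shreya", "Mackenzie", "Mackenzie.Shreya@bcis.edu", 1234, "2008-02-14"),
)

# SELECT: read the row back.
after_insert = conn.execute(
    "SELECT FirstName, LastName FROM Employee"
).fetchall()

# UPDATE: change a column value in the matching rows.
updated_rows = conn.execute(
    "UPDATE Employee SET FirstName = ? WHERE JobID = ?", ("Millie", 1234)
).rowcount
after_update = conn.execute(
    "SELECT FirstName FROM Employee WHERE JobID = 1234"
).fetchone()[0]

# DELETE: remove the matching rows.
deleted_rows = conn.execute(
    "DELETE FROM Employee WHERE JobID = ?", (1234,)
).rowcount
remaining = conn.execute("SELECT COUNT(*) FROM Employee").fetchone()[0]

print(after_insert, updated_rows, after_update, deleted_rows, remaining)
```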
DATA CONTROL LANGUAGE (DCL)
▪ DCL statements allow the user or program to control
the database system by granting or revoking permissions
and administering privileges on the database system
▪ DCL is sometimes used interchangeably with
Transaction Control Language (TCL)
▪ DCL/TCL statements manage changes made by DML statements
▪ DCL/TCL statements are used for grouping DML statements together
as a unit of transaction
DATA CONTROL LANGUAGE (DCL)
▪ COMMIT: Make a transaction change permanent
▪ ROLLBACK: Undo a transaction change
▪ GRANT: Grants permissions on a table, view, stored
procedure, etc.
▪ REVOKE: Removes a previously granted or denied
permission.
Rules
Constraints
CONSTRAINTS
A constraint is a rule placed on a table or column
that restricts operations and data values allowed
▪ Enforces rules at the table level
▪ Enforces integrity
▪ Prevents deletion of tables, if dependencies exist
▪ Can be defined during or after table creation
TABLE CONSTRAINTS
There are 5 major constraints that can be created:
1. Not Null (NN)
2. Unique (U)
3. Primary Key (PK)
4. Foreign Key (FK)
5. Check (CHK)
CONSTRAINTS (COMMONLY USED)
▪ Primary Key (PK) – This constraint ensures that the column is the PK for the table and will
not have duplicate values
▪ Foreign Key (FK) – This defines a column as a foreign key (reference). It references the
primary key column of another table and ensures referential integrity
o Prevents orphaned records: records whose foreign key points to (or references) a nonexistent primary key value
▪ Unique (U) – Enforces a unique constraint on a column. This means that there can be no
duplicate values for this column of data
▪ Not Null (NN) – NULL is an unknown value. This constraint prevents a column from holding an
unknown value. Please note that a NULL value is not the same as blank, zero, or empty.
Difference between Primary Key and Unique Key
• PK does not allow NULL | Unique key can allow a NULL
• Table has only 1 PK | Table can include multiple Unique Keys
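How these constraints reject bad rows can be sketched with a small table that carries a primary key, a NOT NULL column, a UNIQUE column, and a CHECK rule. This is an illustration only, assuming Python's built-in sqlite3 module with an in-memory SQLite database instead of SQL Server. The rule JobID > 0 and the try_insert helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stand-in for SQL Server (assumption)
conn.execute(
    """
    CREATE TABLE Employee (
        EmployeeID   INTEGER PRIMARY KEY,        -- PK: no duplicates
        LastName     TEXT NOT NULL,              -- NN: value required
        EmailAddress TEXT UNIQUE,                -- U: no duplicates
        JobID        INTEGER CHECK (JobID > 0)   -- CHK: hypothetical rule
    )
    """
)
conn.execute(
    "INSERT INTO Employee VALUES (1, 'Aller', 'Ben.Aller@nocompany.com', 1115)"
)


def try_insert(conn, row):
    """Attempt an insert and report whether a constraint blocked it."""
    try:
        conn.execute("INSERT INTO Employee VALUES (?, ?, ?, ?)", row)
        return "accepted"
    except sqlite3.IntegrityError as exc:
        return f"rejected: {exc}"


results = {
    "duplicate primary key": try_insert(
        conn, (1, "Onye", "Ken.Onye@nocompany.com", 1123)
    ),
    "null last name": try_insert(
        conn, (2, None, "Ken.Onye@nocompany.com", 1123)
    ),
    "duplicate email": try_insert(
        conn, (3, "Onye", "Ben.Aller@nocompany.com", 1123)
    ),
    "failed check": try_insert(
        conn, (4, "Onye", "Ken.Onye@nocompany.com", -5)
    ),
    "valid row": try_insert(
        conn, (5, "Onye", "Ken.Onye@nocompany.com", 1123)
    ),
}

for case, outcome in results.items():
    print(f"{case}: {outcome}")
```

Each rejected row violates exactly one constraint, so the error message printed for it names the rule that fired.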
REFERENTIAL INTEGRITY CONSTRAINT
▪ Referential integrity constraint - Regulates the relationship between a table with a
foreign key and the table whose primary key it references
▪ Ensures that the value of the foreign key matches one of the values in the primary key
column of the other table

Employee table                               Department table
EmployeeID  LastName  DepartmentID (FK)      DepartmentID  DepartmentName
1110        Ken       101                    101           Technology
1128        Lanre     101                    101           Technology
1139        Abigail   214                    214           Accounting
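Referential integrity between the Employee and Department tables can be sketched as follows. This is a minimal illustration, assuming Python's built-in sqlite3 module with an in-memory SQLite database instead of SQL Server. SQLite enforces foreign keys only after PRAGMA foreign_keys = ON, and the employee 1150 and department 999 are hypothetical values chosen to break the rule.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stand-in for SQL Server (assumption)
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks FKs only when enabled

conn.execute(
    "CREATE TABLE Department ("
    "DepartmentID INTEGER PRIMARY KEY, DepartmentName TEXT NOT NULL)"
)
conn.execute(
    "CREATE TABLE Employee ("
    "EmployeeID INTEGER PRIMARY KEY, LastName TEXT NOT NULL, "
    "DepartmentID INTEGER REFERENCES Department (DepartmentID))"
)
conn.executemany(
    "INSERT INTO Department VALUES (?, ?)",
    [(101, "Technology"), (214, "Accounting")],
)
conn.executemany(
    "INSERT INTO Employee VALUES (?, ?, ?)",
    [(1110, "Ken", 101), (1128, "Lanre", 101), (1139, "Abigail", 214)],
)
conn.commit()

# An employee pointing at a nonexistent department would be an orphaned record.
try:
    conn.execute("INSERT INTO Employee VALUES (1150, 'Orphan', 999)")
    orphan_insert = "accepted"
except sqlite3.IntegrityError:
    orphan_insert = "rejected"

# Deleting a department that employees still reference is also blocked.
try:
    conn.execute("DELETE FROM Department WHERE DepartmentID = 101")
    parent_delete = "accepted"
except sqlite3.IntegrityError:
    parent_delete = "rejected"

department_count = conn.execute(
    "SELECT COUNT(*) FROM Department"
).fetchone()[0]
print(orphan_insert, parent_delete, department_count)
```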
Database Guidelines and Standards
Naming Conventions
DATABASE GUIDELINES AND STANDARDS
Naming conventions
Consistency in database objects naming conventions and abbreviations
▪ Allow users to easily identify objects
▪ Allow ease of database administration
▪ Database naming standards are often developed in conjunction with all users
(DB Admin, DB Engineers, Data Analysts, Business Analysts etc.)
▪ Database naming standards apply to all users of the database (database
administrator, database developer etc.)
DATABASE GUIDELINES AND STANDARDS- EXAMPLES
▪ No spaces in table/field names
▪ Table names should match PrimaryKey name
▪ Use camelCase or PascalCase for table names
and field names
o camelCase: First word starts with a lowercase letter and each later word
starts with an uppercase letter, with no spaces
o PascalCase: Each word starts with an uppercase letter,
with no spaces
CREATE DATABASE camelOrigin
CREATE DATABASE PascalOrigin
GENERAL RULES FOR IDENTIFIERS (DATABASE OBJECTS)
The name of each database object (e.g., database, table,
view) is referred to as its identifier
Rules for identifiers in RDBMS (SQL Server)
o The first character must be a letter, underscore (_), the at sign (@),
or number sign (#)
o Subsequent characters can be letters, decimal numbers, the at sign (@),
dollar sign ($), number sign (#), or underscore (_)
o The identifier must not be a T-SQL reserved word (in either
upper or lowercase)
o Embedded spaces or special characters are not allowed
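The identifier rules listed above can be turned into a small checker. This is a sketch under stated assumptions, not SQL Server's own validation: letters are restricted to ASCII for simplicity, the function name is_regular_identifier is hypothetical, and SAMPLE_RESERVED_WORDS is a deliberately tiny sample rather than the full T-SQL reserved word list.

```python
import re

# First character: letter, underscore, @ or #.
# Later characters: letters, digits, @, $, # or underscore.
# Assumption: ASCII letters only, to keep the sketch short.
IDENTIFIER_PATTERN = re.compile(r"[A-Za-z_@#][A-Za-z0-9_@#$]*")

# Deliberately small sample, NOT the complete T-SQL reserved word list.
SAMPLE_RESERVED_WORDS = {
    "SELECT", "TABLE", "INDEX", "VIEW", "ORDER", "GROUP", "USER",
}


def is_regular_identifier(name: str) -> bool:
    """Check a name against the slide's identifier rules."""
    if not IDENTIFIER_PATTERN.fullmatch(name):
        return False  # bad first character, space, or special character
    # Reserved words are rejected regardless of upper or lowercase.
    return name.upper() not in SAMPLE_RESERVED_WORDS


candidates = ["PascalOrigin", "camelOrigin", "#temp", "Total$",
              "Order Details", "1stTable", "select"]
for candidate in candidates:
    print(f"{candidate!r}: {is_regular_identifier(candidate)}")
```

The first four candidates follow the rules; the last three fail because of an embedded space, a leading digit, and a reserved word respectively.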
SUMMARY
Learned about databases
Learned about relational database management systems (RDBMS)
Learned about database objects (tables, views, indexes etc.)
Learned some database guidelines, standards, and naming conventions