powerpoint lecture

advertisement
Connecting with Computer
Science, 2e
Chapter 6
Database Fundamentals
Objectives
• In this chapter you will:
–
–
–
–
Consider the widespread use of databases
Take a brief tour of database development history
Learn basic database concepts
Be introduced to popular database management
software
– See how normalization makes your data more
organized
Connecting with Computer Science, 2e
2
Objectives (cont’d.)
• In this chapter you will (cont’d.):
– Explore the database design process
– Understand data relationships
– Gain an understanding of Structured Query Language
(SQL)
– Learn some common SQL commands
Connecting with Computer Science, 2e
3
Why You Need to Know About...
Databases
• Data must be organized
• Effective computer professionals know correct
database design
– Normalization
• Ensures an accurate and reliable database
– Structured Query Language (SQL)
• Describes how information is retrieved from relational
database
Connecting with Computer Science, 2e
4
Database Applications
• Database
– Data logically related and organized into a file or set
of files to allow access and use
• Database applications
–
–
–
–
Student grading and library inventory
Genealogy studies and Social Security payments
Real estate sales, video store rentals, and retail sales
Space shuttle missions
• Database development
– Essential part of computer professional’s daily life
Connecting with Computer Science, 2e
5
Brief History of Database Management
Systems
• 1970 to 1975
– Work of IBM employees E. F. Codd and C. J. Date
• Created theoretical model for designing data structures
• Model became foundation for database design
– Software for organizing and sorting data
•
•
•
•
System R by IBM and Ingres by UC-Berkeley
Structured Query Language (SQL)
SQL: database standard
Database management system (DBMS) for PCs
• 1978
– C. Wayne Ratliff of Martin Marietta develops Vulcan
Connecting with Computer Science, 2e
6
Brief History of Database Management
Systems (cont’d.)
• 1980 to the present
– Vulcan renamed dBASE II (no dBase I)
– Popularity of dBASE II inspires other companies
• Paradox, Microsoft Access, and FoxPro
– Databases become essential for business
• Corporate decision making
• Systems: inventory management to customer support
Connecting with Computer Science, 2e
7
Brief History of Database Management
Systems (cont’d.)
Table 6-1, Popular database management systems
Connecting with Computer Science, 2e
8
Database Management System
Fundamentals
• Six main DBMS functions
–
–
–
–
–
–
Manage database security
Manage multiple users’ access to the database
Manage database backup and recovery
Ensure data integrity
Provide an end-user interface to the database
Provide a query language allowing users to modify
and view database information easily
Connecting with Computer Science, 2e
9
Database Concepts
• Basic database elements
–
–
–
–
Database: collection of one or more tables (entities)
Table or entity: divided into rows and columns
Row (record or tuple): collection of columns
Column (field or attribute): represents specific
information
– Domain: set of possible column values
Connecting with Computer Science, 2e
10
Database Concepts (cont’d.)
Figure 6-1, A database table consists of rows and columns
Connecting with Computer Science, 2e
11
Indexes
• Special files occupying their own space
– Specify columns determining how information stored
in a table can be accessed more efficiently
• Examples: music database and the telephone book
• Advantages
– Flexibility: many different columns to sort against
– Searching and retrieval speeds up
• Disadvantages
– Extra storage space
– Updating takes longer
Connecting with Computer Science, 2e
12
An Example of Indexing
Figure 6-2, You use database concepts in your everyday life
Connecting with Computer Science, 2e
13
An Example of Indexing (cont’d.)
• Each database row has similar attributes
• Sort key: one or more columns used to determine
the data’s sort order
– One key or a combination of keys determines sort
order
• Database information is stored in natural or
sequential order
– Order of records displayed equals the order records
are entered
Connecting with Computer Science, 2e
14
An Example of Indexing (cont’d.)
Figure 6-3, Database records sorted by using the UPC column as a key
Connecting with Computer Science, 2e
15
An Example of Indexing (cont’d.)
Figure 6-4, Database records sorted by Brand_Name and Description
Connecting with Computer Science, 2e
16
Normalization
• Set of rules dictating database design
– Eliminates duplication and inconsistencies
– Process: sequence of stages called normal forms
• Five normal forms
• Third normal form provides sufficient structure
• Three database design problems solved
– Representation of certain real-world items
– Redundancies (repetitions) in data
– Excluded and inconsistent information
Connecting with Computer Science, 2e
17
Preparing for Normalization: Gathering
Columns
• Make a list of all pertinent fields (columns or
attributes)
– Source of fields: end-user reports or song inventory
– Write fields on the column list
• Review user-specified input forms
– Convert each field from the report to column in table
Connecting with Computer Science, 2e
18
Preparing for Normalization: Gathering
Columns (cont’d.)
Figure 6-5, End-user report with table columns highlighted
Connecting with Computer Science, 2e
19
Preparing for Normalization: Gathering
Columns (cont’d.)
Figure 6-6, Additional table columns can be gleaned
from input forms
Connecting with Computer Science, 2e
20
Preparing for Normalization: Gathering
Columns (cont’d.)
• Reconcile fields in report to column list
• Create tables of columns by combining associated
fields
– Logically group related information
• Example: information on artist and song files
• Gather data to create physical music database
Connecting with Computer Science, 2e
21
First Normal Form
• Unnormalized table
– Row-column intersection with two or more values
• First normal form (1NF) eliminates redundancies
– Create new record for duplicated column
– Fill in blanks so all columns in record have a value
– Columns with duplications: Album_Num,
Album_Name, Artist_Code, Artist_Name,
Media_Type, and Genre_Code
• Remaining redundancies are addressed later
Connecting with Computer Science, 2e
22
Second Normal Form
• Next steps:
– Assign a primary key to the table
– Identify functional dependencies within the table
• Primary key (PK)
– Column or combination of columns (composite)
uniquely identifying a row within a table
• Examples: Student ID or Artist_Code
Connecting with Computer Science, 2e
23
Second Normal Form (cont’d.)
• Determinant: column(s) used to determine value
assigned to another column in the same row
– Example: Artist_Code determinant for Artist_Name
• Functional dependency: column’s value dependent
on another column’s value
– Each value of first column is matched to single value
in second
• Example: Artist_Name functionally dependent on
Artist_Code
• Composite key: primary key made up of more than
one column
Connecting with Computer Science, 2e
24
Second Normal Form (cont’d.)
• Second normal form (2NF)
– First normal form and non-PK columns functionally
dependent on PK
• Creating 2NF
– Determine columns not dependent on PK
• Remove columns and place in new table
– Default 2NF: table without composite PK
• Primary 2NF benefit: save disk space
Connecting with Computer Science, 2e
25
Second Normal Form (cont’d.)
Figure 6-10, 2NF: Remove any columns that aren’t dependent
on the composite primary key and create a new table
Connecting with Computer Science, 2e
26
Third Normal Form
• Third normal form (3NF)
– Eliminate transitive dependencies
• Column dependent on another column not part of PK
• Example: Genre_Desc depends on Genre_Code
– Each nonkey field should be a fact about the PK
Connecting with Computer Science, 2e
27
Third Normal Form (cont’d.)
Figure 6-11, Songs table with the Genre_Desc column added
Connecting with Computer Science, 2e
28
Third Normal Form (cont’d.)
• Creating 3NF
– Remove transitive dependencies
– Place removed columns in new table
• Primary 3NF benefit: eliminates repetition and saves
disk space
Connecting with Computer Science, 2e
29
Third Normal Form (cont’d.)
Figure 6-12, Songs and Genre tables in 3NF
Connecting with Computer Science, 2e
30
Third Normal Form (cont’d.)
Figure 6-13, Eliminating repetition saves storage space
Connecting with Computer Science, 2e
31
The Database Design Process
• Six steps to designing a normalized database
• Example:
– Creation of student-grading system
Connecting with Computer Science, 2e
32
Step 1: Investigate and Define
• Investigate and research modeled information
• Define purposes and uses of the database
• Use any documents end users work with to
complete tasks
• Involve end users in design process
– Example: student-grading system based on a course
syllabus
Connecting with Computer Science, 2e
33
Step 2: Make a Master Column List
• Create list of fields for information
• Possible field properties:
–
–
–
–
Field name
Data type (char, varchar, number, date, etc.)
Length
Number of decimal places (if any)
• Review users’ documents for fields
– Forms and reports good source for fields
• Example fields: Student ID, First Name, Last Name, EMail, Grade Level, Grade Level Description, Homework
Average, Quiz Average, Test Average
Connecting with Computer Science, 2e
34
Step 3: Create the Tables
• Logically group defined columns into tables
– Heart of design process
– Relies heavily on normalization rules
• Main rules in database design: 1NF – 3NF
– Table in 3NF: well defined
• Normalizing databases
– Like cleaning a closet
Connecting with Computer Science, 2e
35
Step 3: Create the Tables (cont’d.)
Figure 6-14, Tables created for the student-grading system
Connecting with Computer Science, 2e
36
Step 4: Work on Relationships
• Relationship: defines table relations
– Two types of relationships discussed in this chapter
• One-to-many (1:M)
• One-to-one (1:1)
• Primary and foreign keys are defined in each of the
tables
– Primary key (PK): determinant discussed earlier
– Foreign key (FK): column in one table is PK in
another
Connecting with Computer Science, 2e
37
Copy editor:
Chg. “multiple” to
“zero to many”?
Step 4: Work on Relationships (cont’d.)
• One-to-many (1:M)
– Most common relationship
– Each record in Table A relates to multiple records in
Table B
– Requires that FK column(s) in “many” table refer to
PK column in “one” table
• Example:
– Grades Table to Student Table
Connecting with Computer Science, 2e
38
Step 4: Work on Relationships (cont’d.)
Figure 6-15, The relationship of Student
to Course Grade is one-to-many (1:M)
Connecting with Computer Science, 2e
39
Step 4: Work on Relationships (cont’d.)
• One-to-one (1:1)
– For every record in Table A, there can be one and
only one matching record in Table B
– Consider combining tables in 1:1 relationship
– Normally the two tables can be combined into one
table
Connecting with Computer Science, 2e
40
Step 4: Work on Relationships (cont’d.)
Figure 6-16, The relationship of Student to Grade Level
is one-to-one (1:1)
Connecting with Computer Science, 2e
41
Step 5: Analyze the Design
• Analyze work completed
– Search for design errors and refine tables as needed
– Follow normalization forms (ideally to 3NF)
– Correct any violations
• ER models
– Visual diagram composed of entities and relationships
• Entities represent the database tables
• Relationships show how tables relate to each other
• Cardinality
– Shows numeric relations between entities
Connecting with Computer Science, 2e
42
Step 5: Analyze the Design (cont’d.)
• Types of cardinality (and their notation) include:
–
–
–
–
–
–
0..1, 0:1 (zero to one)
0..M, 0:N, 0..*, 0..n (zero to many)
1..1, 1:1 (one to one)
1..M, 1:M, 1:N, 1..*, 1..n (one to many)
M..1, M:1, N:1, *..1, n..1 (many to one)
M..M, M:M, N:N, *..*, n..n (many to many)
• Example:
– ER model for the student-grading system
Connecting with Computer Science, 2e
43
Step 5: Analyze the Design (cont’d.)
Figure 6-17, The student-grading system ER model in Visio
Connecting with Computer Science, 2e
44
Step 6: Reevaluate
• Reevaluate database performance
–
–
–
–
–
Ensure database meets all reporting and form needs
Include the end users
Explain each table and field being used
Make sure fields are defined to users’ requirements
Manipulate data structure with SQL commands
Connecting with Computer Science, 2e
45
Structured Query Language (SQL)
• Structured Query Language (SQL) functions
– Manipulate data
– Define data
– Administer data
• Many different “dialects” of SQL
Connecting with Computer Science, 2e
46
Structured Query Language (SQL)
(cont’d.)
• SQL advantages
–
–
–
–
Reduces training time (syntax based in English)
Makes applications portable (SQL is standardized)
Reduces the amount of data being transferred
Increases application speed
• Following sections show basic SQL commands
– Creating tables
– Adding (inserting) rows of data
– Querying table to select certain information
Connecting with Computer Science, 2e
47
Structured Query Language (SQL)
(cont’d.)
Figure 6-18, A sample SQL statement and results
Connecting with Computer Science, 2e
48
CREATE TABLE Statement
• CREATE TABLE statement: make a new table
– NULL/NOT NULL
• Optional property indicates whether data required
• Syntax:
Connecting with Computer Science, 2e
49
CREATE TABLE Statement (cont’d.)
• SQL statements to create table called Songs:
Connecting with Computer Science, 2e
50
INSERT INTO Statement
• INSERT INTO statement: add new rows of data
– Requires a table name
– Square brackets ([…]) specify optional columns
– Columns are on separate lines for readability
• Syntax:
INSERT INTO table_name [(column1,column2, . . . )]
VALUES (constant1, constant2, . . . )
Connecting with Computer Science, 2e
51
INSERT INTO Statement (cont’d.)
Figure 6-19, SQL INSERT INTO statement to add a record to the Songs
table and its result
Connecting with Computer Science, 2e
52
SELECT Statement
• Retrieves data from one or more tables
• Syntax:
• Specified order determines order of retrieval/display
Connecting with Computer Science, 2e
53
SELECT Statement (cont’d.)
Figure 6-20, SQL SELECT statement to return the
name, media type, and track number for songs
Connecting with Computer Science, 2e
54
WHERE Clause
• Specifies additional criteria for retrieving data
– Fields should be included in fields selected
• AND and OR keywords
– Allow specification of multiple search criteria
– AND indicates that all criteria must be met
– OR indicates only one criterion needs to be met
Connecting with Computer Science, 2e
55
WHERE Clause (cont’d.)
Figure 6-21, SQL SELECT statement with a WHERE clause
and the results
Connecting with Computer Science, 2e
56
WHERE Clause (cont’d.)
Figure 6-22, More descriptive SQL SELECT statement with
a WHERE clause
Connecting with Computer Science, 2e
57
WHERE Clause (cont’d.)
Figure 6-23, SQL SELECT statement with a WHERE clause and
AND versus OR
Connecting with Computer Science, 2e
58
ORDER BY Clause
• Permits user to change how the data is returned
– Makes for a more meaningful presentation
• Default: data returned in sequential order
• User can specify the ORDER BY column name(s)
• Also returns data in ascending (default) or
descending order
Connecting with Computer Science, 2e
59
ORDER BY Clause (cont’d.)
Figure 6-24, SQL SELECT statement with an ORDER BY
clause and the results
Connecting with Computer Science, 2e
60
ORDER BY Clause (cont’d.)
Figure 6-25, SQL SELECT statement, using an
ORDER BY clause with the default ascending option
Connecting with Computer Science, 2e
61
ORDER BY Clause (cont’d.)
Figure 6-26, SQL SELECT statement, using an ORDER BY
clause with the DESC option
Connecting with Computer Science, 2e
62
ORDER BY Clause (cont’d.)
• Many more options
– Specified on SELECT statement
• Many more SQL commands used
– Maintaining, defining, and administering data found
within a database
Connecting with Computer Science, 2e
63
One Last Thought
• Poorly organized database
– More of a hindrance than a benefit
• Careful planning required
– How data will be structured and stored
– Applying normalization rules
– Using the right SQL statements to extract the kind of
information desired
• Database
– Powerful tool used in many areas
• Including business and computing
Connecting with Computer Science, 2e
64
Summary
• Database
– Collection of logically related records
• DBMS
– Software used to design, manage, and interface with
databases
• Indexes
– Files that revise default sequential order of data
• Normalization
– Process of removing data redundancies
Connecting with Computer Science, 2e
65
Summary (cont’d.)
• Data is normalized with five normal forms
– First three normal forms most important
• Primary key
– Uniquely identifies table entries
• Foreign key
– Primary keys in other tables
• Entity relationship model
– Visual diagram of tables and relationships
Connecting with Computer Science, 2e
66
Summary (cont’d.)
• 1:M and 1:1 notations indicate cardinality
• Six-step database design process
• Structured Query Language (SQL)
– Manipulates, defines, and administers data
• Basic SQL statements
– CREATE TABLE, INSERT INTO, SELECT
Connecting with Computer Science, 2e
67
Download