Chapter 3

advertisement

311: Management Information

Systems

Database Systems

Chapter 3

Database

• Organized collection of related data

– Databases: A list of phone numbers and names. List of book titles and authors. List of customers and sales figures.

– Not databases: List of book titles and phone numbers. Pile of papers on a desk.

• Major Problems

– Data are usually not collected in a very organized fashion

– Too much data – not enough information

© Jakob Iversen, 2002

2

The Data Hierarchy

© Jakob Iversen, 2002

3

Traditional File Environment

© Jakob Iversen, 2002

4

Problems with the File Approach

• data redundancy - the same piece of information could be duplicated in several places data inconsistency - the various copies of the data no longer agree data isolation - difficulty in accessing data from different applications data integrity - data values don’t adhere to integrity constraints

© Jakob Iversen, 2002

5

Database : The Modern Approach

© Jakob Iversen, 2002

• One database – several programs

• Data changes only in one place

• Examples: MS Access, Oracle, DB2

6

Advantages of Database Approach

Reduced data redundancy

Shared data and information resources

Improved data integrity

• Easier modification and updating

• Data and program independence

• Better access to data and information

• Standardization of data access

• Framework for program development

• Better overall protection of the data

• Improved strategic use of corporate data

© Jakob Iversen, 2002

7

Disadvantages of Database Approach

© Jakob Iversen, 2002

8

DBMS Components

• Data model

– defines the way data are conceptually structured

• Data definition language (DDL)

– defines what types of information are in the database and how they will be structured

– functions of the DDL

• provide a means for associating related data

• indicate the unique identifiers (or keys) of the records

• set up security access and change restrictions

• Data manipulation language (DML)

– query the contents of the database, store or update information in the database, and develop database applications

– Structured query language (SQL) - most popular relational database language, combining both DML and DDL features

• Data Dictionary (metadata)

– stores definitions of data elements and data characteristics

© Jakob Iversen, 2002

9

DBMS: Logical versus Physical View

• Physical view

– Actual, physical arrangement and location of data

– Described in a schema (describes entire database)

• Logical view

– represents data in a format that is meaningful to a user and to the software programs that process that data

– Can be different for different users as described in subschemas

– Underlying structure may change but subschema (user view) remains the same

© Jakob Iversen, 2002

10

Use of Schemas and Subschemas

© Jakob Iversen, 2002

11

Database : Centralized database

• all related files in one location

• single mainframe computer

• Users can work on a database as a whole at one location

• files only accessible via the host computer

• disaster recovery can be more easily accomplished at a central location

• vulnerable to a single point of failure

• speed problem

© Jakob Iversen, 2002

12

Database : Distributed database

• complete copies (or portions) of a database, in more than one location

• replicated database complete copies of entire database available at many locations: No singlepoint-of-failure and increased responsiveness partitioned database

- a portion of the entire database in each location

• This is planned redundancy (p. 106)

-

© Jakob Iversen, 2002

13

Logical Data Models

• A manager’s ability to use a database is highly dependent on how the database is structured logically and physically.

• In logically structuring a database, businesses need to consider the characteristics of the data and how the data will be accessed.

• Three common data models: hierarchical, network, and relational

© Jakob Iversen, 2002

14

Hierarchical Model

• Fast access, large installed base

• Best with one-to-many relationship btwn data

• Cumbersome, redundant data

© Jakob Iversen, 2002

15

Network Model

• Related data ordered in sets

• Member/owner of set

• Can handle many-to-many relationships

• How do we notify customers who ordered a defective product?

© Jakob Iversen, 2002

16

Relational Model

• Tables

– Files

• Tuples

– Records

• Attributes

– Fields

• Structured Query

Language (SQL)

– Query language that simplifies access to data

– MS Access makes it even simpler!

SQL Example:

SELECT ( Customer_Name and Customer_Address )

FROM Customer_Table

WHERE Credit_Limit > 5000

© Jakob Iversen, 2002

17

The Entity Relationship Model

Relationship

© Jakob Iversen, 2002

Attribute

Entity

Primary Key

18

Important Relational DB Concepts - 1

• Primary Key

– Uniquely identifies records in a table

– Examples: Employee ID, SSN, Account #

– Can be more than one field

– Must be defined in every table

• Foreign Key

– Identifies a primary key in a different table

– This is how information is related

© Jakob Iversen, 2002

19

Important Relational DB Concepts - 2

• Redundant data

– The same information exists in more than one place in the database

– Should be avoided

– Only foreign keys should be duplicated

– Example

© Jakob Iversen, 2002

20

A Relational Database Model

• Identify for each table:

– Records

– Fields

– Field values

– Primary keys

– Foreign keys

© Jakob Iversen, 2002

21

Queries can combine data

© Jakob Iversen, 2002

22

Problems with redundant data

© Jakob Iversen, 2002

23

Comparing Data Models

Model Advantages Disadvantages

Hierarchical

Network

Relational

Speed and efficiency in search Access to data is predefined by exclusively hierarchical relationships, predetermined by administrator. Limited search/ query flexibility. Not all data is naturally hierarchical.

Many more relationships between data elements can be defined. Greater speed and efficiency than relational database models.

Conceptual simplicity; no predefined relationships among data. High flexibility in ad hoc querying. New data and records can be added easily

The most complicated model to design, implement, and maintain.

More flexibility than hierarchical model, but less than relational model.

Lower processing efficiency and speed. Data redundancy is common, requiring additional maintenance.

24

© Jakob Iversen, 2002

Worldwide Dabase Market Share, 2002

NCR

3%

Others

9%

Microsoft

18%

IBM

36%

Oracle

34%

© Jakob Iversen, 2002

Total revenue: $6.6 billion

Source: Gartner Dataquest

25

Selecting a DBMS

• Database Size

• Number of concurrent users

• Performance

• Integration

• Features

• The Vendor

• Cost

© Jakob Iversen, 2002

26

Data Warehousing

• Data extracted from production systems

• Historical data for decision making

• Concerns

– Data extraction (when, from where, what data)

– Data cleaning

– Timeliness

– Business mergers

– Analysis: Data mining and Online Analytical

Processing (OLAP)

© Jakob Iversen, 2002

27

© Jakob Iversen, 2002

Data Warehouse

Elements

28

Data Mart

• A data warehouse for single division or department

• Easier and cheaper to set up

• More detailed data

• But might create ’islands’ of unlinked information

© Jakob Iversen, 2002

29

OLAP and Data Mining

© Jakob Iversen, 2002

30

Now what?

• Thursday

– Lab, Access Tutorial 1

– Lecture: Intro to lab and Access

– Location: Halsey 101C

– Due: IT Problem 1

• Next Tuesday

– Catching up on Chapter 1 and 3

© Jakob Iversen, 2002

31

Download