Chapter 5

Foundations of Business Intelligence: Database and Information
Management
Student Objectives
1. Describe how a relational database organizes data and compare its approach to an
   object-oriented database.
2. Identify and describe the principles of a database management system.
3. Evaluate tools and technologies for providing information from databases to improve
   business performance and decision making.
4. Assess the role of information policy and data administration in the management of
   organizational data resources.
5. Assess the importance of data quality assurance for the business.
Chapter Outline
5.1 The Database Approach to Data Management
    Entities and Attributes
    Organizing Data in a Relational Database
    Establishing Relationships
5.2 Database Management System
    Operations of a Relational DBMS
    Capabilities of Database Management Systems
    Object-Oriented Databases
5.3 Using Databases to Improve Business Performance and Decision Making
    Data Warehouses
    What is a Data Warehouse?
    Data Marts
    Business Intelligence, Multidimensional Data Analysis, and Data Mining
    Data Mining
    Databases and the Web
5.4 Managing Data Resources
    Establishing an Information Policy
    Ensuring Data Quality
5.5 Hands-On MIS
Key Terms
The following alphabetical list identifies the key terms discussed in this chapter. The page
number for each key term is provided.
Attributes, 160
Business intelligence (BI), 171
Data administration, 179
Data cleansing, 180
Data definition language, 166
Data dictionary, 166
Data manipulation language, 168
Data mart, 170
Data mining, 173
Data quality audit, 180
Data warehouse, 170
Database, 159
Database administration, 179
Database management system (DBMS), 165
Database server, 176
Entity, 160
Entity-relationship diagram, 162
Field, 161
Foreign key, 162
Information policy, 178
Key field, 161
Normalization, 163
Object-oriented DBMS, 169
Object-relational DBMS, 169
Online analytical processing (OLAP), 172
Predictive analysis, 174
Primary key, 161
Records, 161
Referential integrity, 163
Relational database, 160
Structured Query Language (SQL), 168
Tuples, 161
Teaching Suggestions
The essential message of this chapter is the statement that “Organizations need to manage their
data assets very carefully to make sure that the data are easily accessed and used by managers
and employees across the organization.” Data have now become central and even vital to an
organization’s survival. You can illustrate these comments by referencing the opening case,
“NASCAR Races to Manage Its Data”, in order to stress the importance of data and database
systems for success in business.
What’s interesting and intriguing about the opening vignette is how it points out that every
organization, even something as non-traditional as NASCAR, needs to manage data and
information as an important resource. You could substitute almost any other company for
NASCAR and the story would be the same. How businesses store, organize, and manage their
data has a tremendous impact on organizational effectiveness. Companies need to manage their
data to help them reduce costs, improve operational efficiency and decision making, and most of
all, boost profitability.
Section 5.1, “The Database Approach to Data Management” This section introduces students
to file organization terms and concepts. The material has three components: important
database terminology, types of databases, and the elements of SQL. If
you have access to a relational DBMS during class time, you can demonstrate several of the
concepts presented in this section.
Section 5.2, “Database Management System” Database design and management requirements
for database systems are introduced. Help your students see how a logical design allows them to
analyze and understand the data from a business perspective, while physical design shows how
the database is arranged on direct access storage devices. At this point, you can use the
enrollment process at your university as an example. Have your students prepare a logical design
for the enrollment process. If you have time and as a class activity, ask your students to prepare
an entity-relationship diagram, as well as normalize the data. Your students will need guidance
from you to complete this activity, but it will help them see and understand the logical design
process.
Section 5.3, “Using Databases to Improve Business Performance and Decision Making” This
section focuses on how data technologies are actually used: data warehouses, data marts,
business intelligence, multidimensional data analysis, and data mining. Regardless of their
career choice, students will probably use some or all of these in their jobs. For example, data
warehouses and data marts are important to many people, partly because they are critical for
those who want to use data mining, which in turn has many uses in management analysis and
business decisions. Keep in mind as you teach this chapter that managing data resources can be
very technical, but many students will need and want to know the business uses and business
values. In the end, effectively managing data is the goal. Doing it in a way that will enable your
students to contribute to the success of their organization is the reason why most students are in
this course.
Interactive Session: Organizations: DNA Databases: Crime Fighting Weapon or Threat to
Privacy?
Case Study Questions:
1. What are the benefits of DNA Databases?
DNA databases provide a centralized, digitized collection of one-of-a-kind data to prove the guilt
or innocence of suspected criminals. The database provides a fast, economical method of
comparing evidence at crime scenes with DNA profiles to help apprehend those suspected of
committing crimes. Law enforcement agencies around the world can access the databases
created by the states and linked through the FBI’s CODIS system. The ability to share the data
saves money and time. Sharing the data ensures a wider availability of the collected data.
2. What problems do DNA databases pose?
The DNA databases pose privacy risks to the innocent if the databases contain data on people
who are not convicted criminals. People who collect and analyze the DNA samples can make
mistakes and improperly identify an innocent person as a criminal. Innocent people could be
wrongly convicted of crimes they didn’t commit.
3. Who should be included in a national DNA database? Should it be limited to convicted
felons? Explain your answer.
The answers to these questions will vary. Some students may say that only convicted criminals
should be included in the DNA database. However, that assumes that only convicted criminals
will commit future crimes. On the other hand, if you say that everyone’s DNA should be
included in the database on the assumption that anyone is capable of committing a crime, then
you run into serious privacy questions. Including everyone’s DNA assumes that everyone may
at some point commit a crime.
Including families of suspected or convicted criminals also invites privacy concerns and social
problems. Just because a family member commits a crime, are we to suppose that everyone in
that family is a criminal or capable of committing crimes? Our legal system is designed to
protect juveniles from some of the harsher rules found in the adult criminal legal system. Does
including them in the DNA databases violate some of those protections? Does that mark them as
a criminal for life?
4. Who should be able to use DNA databases?
Most people will say that all law enforcement agencies should have access to the DNA
databases. That poses privacy concerns if the data are misused. Some students may say that
Transportation Security Agencies that provide airport security should have access to the
databases to help track suspected criminals that may pose threats to airline passengers. If that’s
true, then why not allow access for security personnel at bus stations, train stations, and even
those who run cruise ships?
MIS In Action
Explore the Web site for the Combined DNA Index System (CODIS) and answer the following
questions. (Answers to the questions below are taken directly from the FBI’s Web site at the
following address: http://www.fbi.gov/hq/lab/html/codis1.htm )
1. How does CODIS work? How is it designed?
The FBI Laboratory’s Combined DNA Index System (CODIS) blends forensic science and
computer technology into an effective tool for solving crime. The FBI Laboratory’s CODIS
project began as a pilot software project in 1990 serving 14 state and local laboratories. The
DNA Identification Act of 1994 formalized the FBI’s authority to establish a national DNA
Index System (NDIS) for law enforcement purposes.
CODIS supports NDIS (National DNA Index System), SDIS (State DNA Index System), and
LDIS (Local DNA Index System). NDIS is the highest level in the CODIS hierarchy, and
enables the laboratories participating in the program to exchange and compare DNA profiles on
the national level. SDIS allows laboratories within states to exchange DNA profiles. All DNA
profiles originate at LDIS, and then flow to SDIS and NDIS.
2. What information does CODIS maintain?
Several indexes categorize the profiles entered into CODIS:
• Convicted Offender: Contains profiles of individuals convicted of crimes
• Forensic: Contains DNA profiles developed from crime scene evidence, such as semen
stains or blood
• Arrestees: Contains profiles of arrested persons (if state law permits the collection of
arrestee samples)
• Missing Persons: Contains DNA reference profiles from missing persons
• Unidentified Human Remains: Contains DNA profiles developed from unidentified
human remains
• Biological Relatives of Missing Persons: Contains DNA profiles voluntarily contributed
from relatives of missing persons
3. Who is allowed to use CODIS?
Today, over 170 public law enforcement laboratories participate in NDIS across the United
States. Internationally, more than 40 law enforcement laboratories in over 25 countries use the
CODIS software for their own database initiatives.
4. How does CODIS aid criminal investigations?
CODIS generates investigative leads in cases where biological evidence is recovered from the
crime scene. Matches made among profiles in the Forensic Index can link crime scenes together,
possibly identifying serial offenders. Based upon a match, police from multiple jurisdictions can
coordinate their respective investigations and share the leads they developed independently.
Matches made between the Forensic and Offender Indexes provide investigators with the identity
of the suspected perpetrator(s). Since names and other personally identifiable information are not
stored at NDIS, qualified DNA analysts in the laboratories sharing matching profiles contact
each other to confirm the candidate match.
Interactive Session: Technology: The Databases Behind MySpace
Case Study Questions
1. Describe how MySpace uses databases and database servers.
In its initial phases, MySpace operated with two Web servers communicating with one database
server and a Microsoft SQL Server database. The site continued adding Web servers to handle
increased user requests. After the number of accounts exceeded 500,000 the site added more
SQL Server databases: one served as a master database, the others focused on retrieving data for
user page requests. After two million accounts were activated, MySpace switched to a vertical
partitioning model in which separate databases supported distinct functions of the Web site.
After three million accounts, the site scaled out by adding many cheaper servers to share the
database workload.
It eventually switched to a virtualized storage architecture in which databases write data to any
available disk, thus eliminating the possibility of an application’s dedicated disk becoming
overloaded. MySpace later installed a layer of servers between the database servers and the Web
servers to store and serve copies of frequently accessed data objects so that the site’s Web
servers wouldn’t have to query the database servers with lookups as frequently.
2. Why is database technology so important for a business such as MySpace?
Almost everything MySpace receives from and serves to its users is a data object, such as a
picture, audio file, or video file. The objects are very individualized and attached to a certain
entity (person). Its databases must make the objects readily available to anyone requesting access
to that entity. Database technology is the only technology that can accomplish this mission.
3. How effectively does MySpace organize and store the data on its site?
In its infancy, MySpace used two Web servers communicating with one database server. That
was adequate when the site had a small number of users who were updating or accessing
database objects. Obviously that won’t work with tens of millions of users. Unfortunately,
MySpace still overloads more frequently than other major Web sites. With a log-in error rate of
20 to 40 percent on some days, the site is not effectively organizing or storing data at all.
4. What data management problems have arisen? How has MySpace solved, or attempted
to solve, these problems?
Some of the problems MySpace has encountered are inadequate storage space on its database
servers, slow access or no access through its log-in application, and users’ inability to access
data. Over the years, MySpace has attempted to fix these problems by adding more Web servers
and more database servers. Some were simply “added on” without restructuring the entire
system to use its hardware and software more efficiently. Workloads were not distributed evenly
among servers, which caused inefficient use of resources. MySpace developers continue to
redesign the Web site’s database, software, and storage systems to keep pace with its exploding
growth, but their job is never done.
MIS In Action
Explore MySpace.com, examining the features and tools that are not restricted to
registered members. Then answer the following questions:
1. Based on what you can view without registering, what are the entities in MySpace’s
database?
Obviously, individual users are the main entity in MySpace’s databases. Other entities are video
files, audio files, blogs, forums, groups, events, favorites, and email.
2. Which of these entities have some relationship to individual members?
Which of the entities have a relationship to individual members depends on what the individual
decides. For instance, it’s possible that Sarah would have a list of films (video files) attached to
her profile. She may also participate in forums or groups. It’s possible that all the entities have
some relationship to individual members.
3. Select one of these entities and describe the attributes for that entity.
Films included in MySpace’s databases likely have these attributes: name, date produced, date
released, actors, actresses, director, subject, place it was filmed, musical scores included in the
film, awards given to the film, comments of film goers, and critics’ ratings.
Section 5.4, “Managing Data Resources” This section introduces students to some of the
critical issues surrounding corporate data. Students should realize that setting up the database is
only the beginning of the process. Managing the data is the real challenge. In fact, the main
point is to show how data management has changed and the reason why data must be organized,
accessed easily by those who need access, and protected from the wrong people accessing,
modifying, or harming the data.
Developing a database environment requires much more than selecting database technology. It
requires a formal information policy governing the maintenance, distribution, and use of
information in the organization. The organization must also develop a data administration
function and a data-planning methodology. Data planning may need to be performed to make
sure that the organization’s data model delivers information efficiently for its business processes
and enhances organizational performance. There is political resistance in organizations to many
key database concepts, especially the sharing of information that has been controlled exclusively
by one organizational group. Creating a database environment is a long-term endeavor requiring
large up-front investments and organizational change.
Section 5.5, “Hands-On MIS”
Improving Decision Making: Redesigning the Customer Database: Dirt Bikes U.S.A.
Software skills: Database design; querying and reporting
Business skills: Customer profiling
Redesign Dirt Bikes’ customer database so that it can store and provide the information
needed for marketing. You will need to develop a design for the new customer database
and then implement that design using database software. Consider using multiple tables in
your new design. Populate each new table with ten records.
Develop several reports that would be of great interest to Dirt Bikes’ marketing and sales
department (for example, lists of repeat Dirt Bikes customers, Dirt Bikes customers who
attend racing events, or the average age and years of schooling of Dirt Bikes customers)
and print them.
The solution file represents one of many alternative database designs that would satisfy Dirt
Bikes’ requirements. The design shown here consists of four tables: Customer, Distributor,
Purchase, and Model. Dirt Bikes’ old customer database was modified by breaking it down into
these tables. Data on both Dirt Bikes customer purchases captured from distributors and
customer purchases of non-Dirt Bikes models are stored in the Purchase table. The Customer
table no longer contains purchase data, but it does contain data on e-mail addresses, customer
date of birth, years of education, additional sport of interest, and whether they attend dirt bike
racing events. This particular design tracks repeat Dirt Bikes customers through reports of
customer purchases showing which customers have purchased more than one Dirt Bike. Reports
for this solution were developed using Access query and report wizards.
An example solution file can be found in the Microsoft Access file named: Ess8ch05 running
case solution.mdb.
Improving Operational Excellence: Building a Relational Database for Inventory
Management
Software skills: Database design; querying and reporting
Business skills: Inventory management
This exercise requires that students know how to create queries and reports using information
from multiple tables. The solutions provided here were created using the query wizard and report
wizard capabilities of Access. Students can, of course, create more sophisticated reports if they
wish.
The database would need some modification to answer other important questions about the
business. The owners might want to know, for example, which are the fastest-selling bicycles.
The existing database shows products in inventory and their suppliers. The owners might want to
add an additional table (or tables) in the database to house information about product sales, such
as the product identification number, date placed in inventory, date of sale, purchase price, and
customer name, address, and telephone number. Management could use this enhanced database
to create reports on best selling bikes over a specific period, the number of bicycles sold during a
specific period, total volume of sales over a specific period, or best customers. Students should
be encouraged to think creatively about what other pieces of information should be captured on
the database that would help the owners manage the business.
The answers to the following questions can be found in the Microsoft Access File named:
Ess8ch05solutionfile.mdb.
1. Prepare a report that identifies the five most expensive bicycles. The report should list the
bicycles in descending order from most expensive to least expensive, the quantity on hand
for each, and the markup percentage for each.
2. Prepare a report that lists each supplier, its products, their quantities on hand, and associated
reorder levels. The report should be sorted alphabetically by supplier. Within each supplier
category, the products should be sorted alphabetically.
3. Prepare a report listing only the bicycles that are low in stock and need to be reordered. The
report should provide supplier information for the items identified.
4. Write a brief description of how the database could be enhanced to further improve
management of the business. What tables or fields should be added? What additional reports
would be useful?
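If you want to show students what reports 1 and 3 look like as queries rather than wizard output, the following is a minimal sketch using Python's built-in sqlite3 module. It is not the Access solution file. The table name, field names, and sample rows are hypothetical, markup is assumed to be calculated as a percentage of purchase cost, and "low in stock" is assumed to mean a quantity at or below the reorder level.

```python
import sqlite3

# Hypothetical inventory table; every name and value below is illustrative only.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE product (
        product_name TEXT, supplier_name TEXT, purchase_cost REAL,
        selling_price REAL, quantity_on_hand INTEGER, reorder_level INTEGER
    );
    INSERT INTO product VALUES
        ('Alpine 100', 'Acme Cycles',  400,  600,  8, 5),
        ('Breeze 20',  'Acme Cycles',  100,  150,  2, 4),
        ('Comet 300',  'Best Bikes',   800, 1000,  3, 3),
        ('Dune 50',    'Best Bikes',   200,  300, 10, 5),
        ('Echo 75',    'Cycle Source', 250,  400,  1, 2),
        ('Falcon 500', 'Cycle Source', 900, 1350,  6, 2);
    """
)

# Report 1: the five most expensive bicycles, most expensive first, with
# quantity on hand and markup percentage (assumed: markup on purchase cost).
most_expensive = conn.execute(
    """
    SELECT product_name, selling_price, quantity_on_hand,
           ROUND((selling_price - purchase_cost) * 100.0 / purchase_cost, 1)
    FROM product
    ORDER BY selling_price DESC
    LIMIT 5
    """
).fetchall()

# Report 3: bicycles at or below their reorder level, with supplier information.
low_stock = conn.execute(
    """
    SELECT product_name, supplier_name, quantity_on_hand, reorder_level
    FROM product
    WHERE quantity_on_hand <= reorder_level
    ORDER BY product_name
    """
).fetchall()

for row in most_expensive:
    print(row)
for row in low_stock:
    print(row)
```

Report 2 follows the same pattern with ORDER BY supplier_name, product_name.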
Improving Decision Making: Searching Online Databases for Overseas Business Resources
Software skills: Online databases
Business skills: Researching services for overseas operations
List the companies you would contact to interview on your trip to determine whether they
can help you with these and any other functions you think vital to establishing your office.
Student answers will vary based on the companies they choose to contact.
Rate the databases you used for accuracy of name, completeness, ease-of-use, and general
helpfulness.
The U.S. Department of Commerce Web site contains a fair amount of economic information.
However, it may be simpler to direct your students to go to http://www.aol.com. The Web site
for the Nationwide Business Directory of Australia is http://www.nationwide.com.au
What does this exercise tell you about the design of databases?
Students may not understand that the World Wide Web is one massive data warehouse, but in
non-technical terms that is exactly what it is. Remind them of this when they are completing this
assignment. This assignment may best be accomplished in groups, where they can consolidate
their findings into a written or oral presentation.
Review Questions
1. How does a relational database organize data and how does it differ from an
object-oriented database?
Define and explain the significance of entities, attributes, and key fields.
• Entity is a person, place, thing, or event on which information can be obtained.
• Attribute is a piece of information describing a particular entity.
• Key field is a field in a record that uniquely identifies that record so that it can be
retrieved, updated, or sorted. For example, a person’s name cannot be a key
because there can be another person with the same name, whereas a social security
number is unique. Also, a product name may not be unique, but a product number can be
designed to be unique.
Define a relational database and explain how it organizes and stores information.
The relational database is the primary method for organizing and maintaining data today in
information systems. It organizes data in two-dimensional tables, called relations, with rows
and columns. Each table contains data about an entity and its attributes. Each row
represents a record and each column represents an attribute or field. Each table also contains
a key field to uniquely identify each record for retrieval or manipulation.
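A short demonstration can make the table, record, field, and key field vocabulary concrete. The sketch below uses Python's built-in sqlite3 module. The supplier table and its contents are hypothetical and are not taken from the chapter's files.

```python
import sqlite3

# Hypothetical SUPPLIER table; names and values are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE supplier (
        supplier_number INTEGER PRIMARY KEY,  -- key field: unique for each record
        supplier_name   TEXT NOT NULL,        -- attribute (field)
        supplier_city   TEXT                  -- attribute (field)
    )
    """
)
conn.execute("INSERT INTO supplier VALUES (8259, 'CBM Inc.', 'Dayton')")
conn.execute("INSERT INTO supplier VALUES (8261, 'B. R. Molds', 'Cleveland')")

# Each row is one record; the key field retrieves exactly one of them.
row = conn.execute(
    "SELECT supplier_name FROM supplier WHERE supplier_number = ?", (8259,)
).fetchone()

# A second record with the same key value is rejected by the DBMS.
try:
    conn.execute("INSERT INTO supplier VALUES (8259, 'Duplicate Co.', 'Akron')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

print(row, duplicate_rejected)
```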
Explain the role of entity-relationship diagrams and normalization in database design.
An entity-relationship diagram graphically depicts the relationship between entities (tables)
in a relational database. A well-designed relational database will not have many-to-many
relationships, and all attributes for a specific entity will only apply to that entity. The process
of breaking down complex groupings of data and streamlining them to minimize redundancy
and awkward many-to-many relationships is called normalization.
Relational databases organize data into two-dimensional tables (called relations) with
columns and rows. Each table contains data on an entity and its attributes.
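To illustrate normalization in class, one option is to start from a flat list in which supplier details repeat for every part and then split it into two related tables. The sketch below assumes a hypothetical parts-and-suppliers data set and uses Python's built-in sqlite3 module. It also shows the DBMS enforcing referential integrity through a foreign key.

```python
import sqlite3

# Unnormalized rows: supplier details repeat for every part (hypothetical data).
flat_rows = [
    (137, "Door latch", 8259, "CBM Inc."),
    (145, "Side mirror", 8444, "Bryant Corp."),
    (150, "Door molding", 8259, "CBM Inc."),
]

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.execute(
    "CREATE TABLE supplier (supplier_number INTEGER PRIMARY KEY, supplier_name TEXT)"
)
conn.execute(
    """
    CREATE TABLE part (
        part_number     INTEGER PRIMARY KEY,
        part_name       TEXT,
        supplier_number INTEGER REFERENCES supplier(supplier_number)  -- foreign key
    )
    """
)

# Store each supplier once, then point every part at its supplier.
for part_number, part_name, supplier_number, supplier_name in flat_rows:
    conn.execute(
        "INSERT OR IGNORE INTO supplier VALUES (?, ?)", (supplier_number, supplier_name)
    )
    conn.execute(
        "INSERT INTO part VALUES (?, ?, ?)", (part_number, part_name, supplier_number)
    )

supplier_count = conn.execute("SELECT COUNT(*) FROM supplier").fetchone()[0]

# Referential integrity: a part cannot reference a supplier that does not exist.
try:
    conn.execute("INSERT INTO part VALUES (999, 'Orphan part', 1)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

print(supplier_count, orphan_rejected)
```

After the split, each supplier name is stored once, so a change to a supplier's details is made in a single row.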
Define an object-oriented database and explain how it differs from a relational
database.
An object-oriented DBMS stores the data and procedures that act on those data as objects
that can be automatically retrieved and shared. Object-oriented database management
systems (OODBMS) are becoming popular because they can be used to manage the various
multimedia components or Java applets used in Web applications, which typically integrate
pieces of information from a variety of sources.
Although object-oriented databases can store more complex types of information than
relational DBMS, they are relatively slow compared with relational DBMS for processing
large numbers of transactions.
2. What are the principles of a database management system?
Define a database management system (DBMS) and describe how it works and its
benefits to organizations.
A database management system (DBMS) is a specific type of software for creating, storing,
organizing, and accessing data from a database. A DBMS consists of software that permits
centralization of data and data management so that businesses have a single, consistent
source for all their data needs. A single database services multiple applications. The most
important feature of the DBMS is its ability to separate the logical and physical views of
data. The user works with a logical view of data. The DBMS retrieves information so that
the user does not have to be concerned with its physical location.
Define and compare the logical and physical views of data.
The DBMS relieves the end user or programmer from the task of understanding where and
how the data are actually stored by separating the logical and physical views of the data. The
logical view presents data as end users or business specialists would perceive them, whereas
the physical view shows how data are actually organized and structured on physical storage
media, such as a hard disk.
Define and describe the three operations of a relational database management system.
In a relational database, three basic operations are used to develop useful sets of data: select,
project, and join.
• Select operation creates a subset consisting of all records in the file that meet stated
criteria. In other words, select creates a subset of rows that meet certain criteria.
• Join operation combines relational tables to provide the user with more information than
is available in individual tables.
• Project operation creates a subset consisting of columns in a table, permitting the user to
create new tables that contain only the information required.
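The three operations map directly onto SQL, which makes them easy to demonstrate on a relational DBMS during class. The following is a minimal sketch using Python's built-in sqlite3 module. The part and supplier tables and their contents are hypothetical.

```python
import sqlite3

# Hypothetical PART and SUPPLIER tables; names and values are illustrative only.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE supplier (supplier_number INTEGER PRIMARY KEY, supplier_name TEXT);
    CREATE TABLE part (part_number INTEGER PRIMARY KEY, part_name TEXT,
                       unit_price REAL, supplier_number INTEGER);
    INSERT INTO supplier VALUES (8259, 'CBM Inc.'), (8444, 'Bryant Corp.');
    INSERT INTO part VALUES (137, 'Door latch', 22.00, 8259),
                            (145, 'Side mirror', 12.00, 8444),
                            (150, 'Door molding', 6.00, 8259);
    """
)

# Select: a subset of rows that meet a stated criterion.
selected = conn.execute(
    "SELECT * FROM part WHERE unit_price > 10 ORDER BY part_number"
).fetchall()

# Project: a subset of columns.
projected = conn.execute(
    "SELECT part_name, unit_price FROM part ORDER BY part_number"
).fetchall()

# Join: combine the two tables on the shared supplier_number field.
joined = conn.execute(
    """
    SELECT part.part_name, supplier.supplier_name
    FROM part JOIN supplier ON part.supplier_number = supplier.supplier_number
    ORDER BY part.part_number
    """
).fetchall()

print(selected)
print(projected)
print(joined)
```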
Name and describe the three major capabilities of a DBMS.
A DBMS includes capabilities and tools for organizing, managing, and accessing the data in
the database. The principal capabilities of a DBMS include data definition language, data
dictionary, and data manipulation language.
• The data definition language specifies the structure and content of the database.
• The data dictionary is an automated or manual file that stores information about the data
in the database, including names, definitions, formats, and descriptions of data elements.
• The data manipulation language, such as SQL, is a specialized language for accessing
and manipulating the data in the database.
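The three capabilities can be shown side by side in a few lines. The sketch below uses Python's built-in sqlite3 module and a hypothetical customer table. SQLite's PRAGMA table_info catalog query stands in here for a data dictionary, which is a simplification: a full data dictionary also holds definitions and descriptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Data definition language: specify the structure of the database.
conn.execute(
    "CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT NOT NULL, email TEXT)"
)

# Data dictionary (simplified): the catalog reports each field's name and declared type.
# PRAGMA table_info rows are (cid, name, type, notnull, default, pk).
dictionary = [(col[1], col[2]) for col in conn.execute("PRAGMA table_info(customer)")]

# Data manipulation language: add, change, and retrieve data.
conn.execute("INSERT INTO customer VALUES (1, 'Sarah', 'sarah@example.com')")
conn.execute("UPDATE customer SET email = 'sarah@example.org' WHERE customer_id = 1")
emails = conn.execute("SELECT email FROM customer").fetchall()

print(dictionary)
print(emails)
```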
3. What are the principal tools and technologies for accessing information from databases
to improve business performance and decision making?
Define a data warehouse and describe how it works.
A data warehouse is a database with archival, querying, and data exploration tools (i.e.,
statistical tools) and is used for storing historical and current data of potential interest to
managers throughout the organization and from external sources (e.g., competitor sales or
market share). The data originate in many of the operational areas and are copied into the
data warehouse as often as needed. The data in the warehouse are organized according to
company-wide standards so that they can be used for management analysis and decision
making. Data warehouses support looking at the data of the organization through many
views or directions. The data warehouse makes the data available to anyone to access as
needed, but the data cannot be altered. A data warehouse system also provides a range of ad hoc
and standardized query tools, analytical tools, and graphical reporting facilities. The data
warehouse system allows managers to look at products by customer, by year, by salesperson,
essentially different slices of the data. Normal operational databases do not permit such
different views.
Define business intelligence and explain how it is related to database technology.
Powerful tools are available to analyze and access information that has been captured and
organized in data warehouses and data marts. These tools enable users to analyze the data to
see new patterns, relationships, and insights that are useful for guiding decision making.
These tools for consolidating, analyzing, and providing access to vast amounts of data to help
users make better business decisions are often referred to as business intelligence. Principal
tools for business intelligence include software for database query and reporting, tools for
multidimensional data analysis, and data mining.
Describe the capabilities of online analytical processing (OLAP).
Data warehouses support multidimensional data analysis, also known as online analytical
processing (OLAP), which enables users to view the same data in different ways using
multiple dimensions. Each aspect of information represents a different dimension.
OLAP represents relationships among data as a multidimensional structure, which can be
visualized as cubes of data and cubes within cubes of data, enabling more sophisticated data
analysis. OLAP enables users to obtain online answers to ad hoc questions fairly rapidly,
even when the data are stored in very large databases. Online analytical
processing and data mining enable the manipulation and analysis of large volumes of data
from many perspectives (for example, sales by item, by department, by store, or by region) in
order to find patterns in the data. Such patterns are difficult to find with normal database
methods, which is why a data warehouse and data mining are usually parts of OLAP.
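The idea of viewing the same data along different dimensions can be approximated with ordinary grouped queries. The sketch below is only an illustration of that idea, not an OLAP product. It uses Python's built-in sqlite3 module, and the sales table, its rows, and the units_by helper are hypothetical.

```python
import sqlite3

# Hypothetical sales facts with three dimensions: item, region, and year.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE sales (item TEXT, region TEXT, year INTEGER, units INTEGER);
    INSERT INTO sales VALUES
        ('nuts',  'East', 2007, 50), ('nuts',  'West', 2007, 30),
        ('bolts', 'East', 2007, 20), ('bolts', 'West', 2008, 40),
        ('nuts',  'East', 2008, 60);
    """
)

def units_by(*dimensions):
    """Total units grouped by the chosen dimensions.

    Column names are inserted into the SQL text, so they must come from
    trusted code, never from user input.
    """
    cols = ", ".join(dimensions)
    query = f"SELECT {cols}, SUM(units) FROM sales GROUP BY {cols} ORDER BY {cols}"
    return conn.execute(query).fetchall()

# The same facts viewed along different dimensions.
by_item = units_by("item")
by_region_year = units_by("region", "year")

print(by_item)
print(by_region_year)
```

Each call corresponds to one slice of the cube: by item alone, or by region and year together.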
Define data mining, describe what types of information can be obtained from it, and
explain how it differs from OLAP.
Data mining provides insights into corporate data that cannot be obtained with OLAP by
finding hidden patterns and relationships in large databases and inferring rules from them to
predict future behavior. The patterns and rules are used to guide decision making and
forecast the effect of those decisions. The types of information obtained from data mining
include associations, sequences, classifications, clusters, and forecasts.
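Of the five types of information, associations are the easiest to show by hand. The following is a minimal sketch in plain Python that counts how often items occur together and how often one purchase accompanies another. The baskets are made-up data, and real data mining tools work on far larger databases with more sophisticated methods.

```python
from collections import Counter
from itertools import combinations

# Hypothetical purchase records: one set of items per transaction.
baskets = [
    {"corn chips", "cola", "salsa"},
    {"corn chips", "cola"},
    {"bread", "milk"},
    {"corn chips", "salsa"},
    {"cola", "milk"},
]

# Count how many baskets contain each pair of items.
pair_counts = Counter()
for basket in baskets:
    for pair in combinations(sorted(basket), 2):
        pair_counts[pair] += 1

def confidence(antecedent, consequent):
    """Share of baskets containing `antecedent` that also contain `consequent`."""
    with_antecedent = [b for b in baskets if antecedent in b]
    both = [b for b in with_antecedent if consequent in b]
    return len(both) / len(with_antecedent)

# How often is cola bought when corn chips are bought?
chips_then_cola = confidence("corn chips", "cola")

print(pair_counts.most_common(3))
print(round(chips_then_cola, 2))
```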
Explain how users can access information from a company’s internal databases
through the Web.
Conventional databases can be linked via middleware to the Web or a Web interface to
facilitate user access to an organization’s internal data. A user works with Web browser
software on a client PC to access a corporate Web site over the Internet. The Web browser software
requests data from the organization’s database, using HTML commands to communicate
with the Web server. Because many back-end databases cannot interpret commands written
in HTML, the Web server passes these requests for data to special middleware software that
then translates HTML commands into SQL so that they can be processed by the DBMS
working with the database. The DBMS receives the SQL requests and provides the required
data. The middleware transfers information from the organization’s internal database back to
the Web server for delivery in the form of a Web page to the user. The software working
between the Web server and the DBMS can be an application server, a custom program, or a
series of software scripts.
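The role of the middleware can be sketched as a single function that receives a Web request, issues SQL to the DBMS, and returns a Web page. This is an illustration only, assuming a hypothetical product table, a hypothetical max_price request parameter, and an example URL. A real application server handles far more than this.

```python
import sqlite3
from html import escape
from urllib.parse import parse_qs, urlparse

# Hypothetical internal database; names and values are illustrative only.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE product (product_id INTEGER PRIMARY KEY, name TEXT, price REAL);
    INSERT INTO product VALUES (1, 'Trail bike', 499.0), (2, 'Road bike', 899.0);
    """
)

def handle_request(url):
    """Middleware sketch: turn a Web request into SQL, then SQL results into HTML."""
    params = parse_qs(urlparse(url).query)
    max_price = float(params.get("max_price", ["1e9"])[0])
    # Parameterized query: the request value is never pasted into the SQL text.
    rows = conn.execute(
        "SELECT name, price FROM product WHERE price <= ? ORDER BY name", (max_price,)
    ).fetchall()
    items = "".join(f"<li>{escape(name)}: {price:.2f}</li>" for name, price in rows)
    return f"<html><body><ul>{items}</ul></body></html>"

page = handle_request("http://example.com/products?max_price=500")
print(page)
```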
4. What is the role of information policy and data administration in the management of
organizational data resources?
Define information policy and data administration and explain how they help
organizations manage their data.
An information policy specifies the organization’s rules for sharing, disseminating,
acquiring, standardizing, classifying, and inventorying information. Information policy lays
out specific procedures and accountabilities, identifying which users and organizational units
can share information, where information can be distributed, and who is responsible for
updating and maintaining the information.
Data administration is responsible for the specific policies and procedures through which
data can be managed as an organizational resource. These responsibilities include
developing information policy, planning for data, overseeing logical database design and data
dictionary development, and monitoring how information systems specialists and end-user
groups use data.
In large corporations, a formal data administration function is responsible for information
policy, as well as for data planning, data dictionary development, and monitoring data usage
in the firm.
5. Why is data quality assurance so important for a business?
List and describe the most common data quality problems.
Data that are inaccurate, incomplete, or inconsistent create serious operational and financial
problems for businesses because they may create inaccuracies in product pricing, customer
accounts, and inventory data, and lead to inaccurate decisions about the actions that should
be taken by the firm. Firms must take special steps to make sure they have a high level of
data quality. These include using enterprise-wide data standards, databases designed to
minimize inconsistent and redundant data, data quality audits, and data cleansing software.
List and describe the most important tools and techniques for assuring data quality.
A data quality audit is a structured survey of the accuracy and level of completeness of the
data in an information system. Data quality audits can be performed by surveying entire data
files, surveying samples from data files, or surveying end users for their perceptions of data
quality.
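A small sketch of an audit that surveys a sample (or all) of a data file for completeness follows. The customer records, the list of required fields, and the audit_completeness function are hypothetical and chosen only for illustration; a real audit would also check accuracy against trusted sources.

```python
import random

# Hypothetical fields that every customer record is expected to contain
REQUIRED_FIELDS = ("customer_id", "name", "zip_code")

CUSTOMERS = [
    {"customer_id": 1, "name": "Ann Lee", "zip_code": "10001"},
    {"customer_id": 2, "name": "Bob Ray", "zip_code": ""},
    {"customer_id": 3, "name": "Cy Dunn", "zip_code": "60601"},
    {"customer_id": 4, "name": "Di Fox", "zip_code": "94105"},
]


def is_complete(record):
    """A record is complete when no required field is missing or blank."""
    return all(record.get(field) not in (None, "") for field in REQUIRED_FIELDS)


def audit_completeness(records, sample_size, seed=0):
    """Return the fraction of sampled records that are complete."""
    if not records or sample_size <= 0:
        raise ValueError("need at least one record to audit")
    rng = random.Random(seed)  # fixed seed so the audit can be repeated
    sample = rng.sample(records, min(sample_size, len(records)))
    return sum(is_complete(r) for r in sample) / len(sample)


# Surveying the entire file: sample size equals the number of records
full_rate = audit_completeness(CUSTOMERS, sample_size=len(CUSTOMERS))
```

Passing a smaller sample_size corresponds to surveying a sample from the data file rather than the entire file.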
Data cleansing consists of activities for detecting and correcting data in a database that are
incorrect, incomplete, improperly formatted, or redundant. Data cleansing not only corrects
data but also enforces consistency among different sets of data that originated in separate
information systems.
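The following sketch shows the flavor of data cleansing on records merged from two systems: it standardizes formatting and then removes the redundant copies that the standardization exposes. The field names, the formatting rules, and the cleanse function are assumptions made for this example, not a description of any particular cleansing product.

```python
# Hypothetical customer records that originated in separate systems
RAW_RECORDS = [
    {"name": "  ann  lee ", "phone": "(555) 123-4567"},
    {"name": "Ann Lee", "phone": "555.123.4567"},
    {"name": "bob ray", "phone": "555-000-1111"},
]


def cleanse(records):
    """Standardize name and phone formats, then drop duplicate records."""
    cleaned = {}
    for rec in records:
        # Collapse extra spaces and use consistent capitalization
        name = " ".join(rec["name"].split()).title()
        # Keep digits only so differently punctuated numbers compare equal
        phone = "".join(ch for ch in rec["phone"] if ch.isdigit())
        # The first record seen for each (name, phone) pair is kept
        cleaned.setdefault((name, phone), {"name": name, "phone": phone})
    return list(cleaned.values())


clean_records = cleanse(RAW_RECORDS)
```

The first two raw records describe the same customer; once both are put in a consistent format, the duplicate can be detected and removed.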
Discussion Questions
1. It has been said that you do not need database management software to create a
database environment. Discuss.
A database is a collection of data organized to service many applications at the same time by
storing and managing data so that they appear to be in one location. It is not mandated that a
database have a DBMS. What is most important is the concept of a database — a model for
organizing information so that it can be stored and accessed flexibly and efficiently. Without
the right vision of a database and data model, a DBMS is not effective. A DBMS is special
software to create and maintain a database. It enables individual business applications to
extract the data they need without having to create separate files or data definitions in their
computer programs. However, the use of a DBMS can reduce program-data dependence
along with program development and maintenance costs. Access and availability of
information can be increased because users and programmers can perform ad-hoc queries of
data in the database. The DBMS allows the organization to centrally manage data, its use,
and security.
2. To what extent should end users be involved in the selection of a database management
system and database design?
End users should be involved in the selection of a database management system and the
database design. Developing a database environment requires much more than just selecting
the technology. It requires a change in the corporation’s attitude toward information. The
organization must develop a data administration function and a data planning methodology.
The end-user involvement can be instrumental in mitigating the political resistance
organizations may have to many key database concepts, especially to sharing information
that has been controlled exclusively by one organizational group.
Video Case Questions
You will find a video case illustrating some of the concepts in this chapter on the Laudon Web
site at www.prenhall.com/laudon along with questions to help you analyze the case.
Teamwork: Identifying Entities and Attributes in an Online Database
With a group of two or three of your fellow students, select an online database to explore,
such as AOL Music or the Internet Movie Database. Explore these Web sites to see what
information they provide. Then list the entities and attributes that they must keep track of
in their databases. If possible, diagram the relationship between the entities you have
identified. If possible, use electronic presentation software to present your findings to the
class.
Direct your students to these Web sites. In their analysis, students should quickly articulate that
many of these sites use the same entities and attributes to keep track of their database.
There are hundreds of Internet Movie Databases so students will have to select the one that
interests them. The Web sites for AOL Music and Gracenote.com are listed below.
http://music.aol.com/
http://gracenote.com/
Business Problem-Solving Case: Can HP Mine Success from an Enterprise Data
Warehouse?
1. Identify the problem described in this case. What people, organization, and technology
factors were responsible for creating this problem?
At one time HP had:
• 5,000 information system applications
• 85 computer centers
• Between 19,000 and 22,000 servers
• 17 different database technologies
• 14,000 different databases in use
With all of that computing capacity, the organization had these data-related problems:
• It couldn’t collect and analyze “consistent, timely data spanning different parts of the business”
• Systems tracked sales data differently
• Commonly used financial information was calculated differently in different business units
• Compiling information from various systems could take up to a week
• Seemingly simple questions were difficult to answer
Without a consistent view of the enterprise, senior executives struggled with decisions on
matters such as the size of sales and service teams assigned to particular systems.
Factors that were responsible for creating this problem include:
People: As with most companies, HP experienced political turf issues. Not all departments
want to depend on a central data warehouse supported by a centralized information systems
staff for their data-analysis needs. HP’s departmental users initially resisted the idea of a
central data warehouse. Many of them preferred smaller data marts configured to their
particular needs.
Organization: HP had too many different information system applications in too many
computer centers. It had too many different database technologies and way too many
different databases. As with most organizations, departments were allowed to create,
manage and use their own databases without regard towards sharing the data with other
departments—islands of information at their finest. Even though HP wanted its data
warehouse to give its workforce access to data in real time with no departmental or
geographic boundaries, its old system fell far short of that goal.
Technology: All-inclusive data warehouses require enormous work to organize and
integrate all the data. Knowledge of database technology and design principles is a talent
that is hard to find, even in a large pool of potential employees, technical or not. HP
lacked the hardware and software that would allow it to build such a large, consolidated
database that is easily and quickly available to over 50,000 users.
2. What solution has HP chosen to fix this problem? Did management select the best
solution alternative?
HP CIO Randy Mott began consolidating hundreds of data marts into a single data
warehouse. He created a 300-person team that had experience in running data marts and
charged them with modeling the enterprise-wide database. He had three goals for the
database: it had to always be up-to-date, consistent for the entire enterprise, and complete.
The new database uses proprietary software developed by internal employees. At its
implementation the warehouse contains 180 terabytes of raw data and 75 terabytes of
functional data. Since the company anticipates the database will double in size at its
completion, it’s assumed the team built scalability into the new hardware and software.
Whether management selected the best solution alternative is based on individual perceptions
and experiences. Those students who’ve had good success working with very large,
consolidated data warehouses will probably agree with HP’s solution. Others who’ve not
had good success working with data warehouses probably will not agree with HP’s solution.
The fact remains that the company had to do something about its data problems, especially
the inability to serve timely, complete, and consistent information to managers. Apparently
HP has had good success since it has been able to market the home-grown system to other
companies.
3. How much will HP’s database experience and technology help HP and its clients build
all-inclusive data warehouses?
The fact that HP built its own data warehouse and had to experience the pain first-hand lends
credence to the Neoview system as a potential product and service it can sell to other
organizations. It will understand the people, organization, and technology problems that
other companies will have to work through. It can offer real-world advice and expertise
based on its own experiences.
4. How much will Neoview help HP and its clients create enterprise-wide data
warehouses? Explain your answer.
HP promotes Neoview by differentiating it from typical data warehouses, which are costly,
use proprietary technology (although so does Neoview), and tend to focus on one area of a
business rather than an entire enterprise. The Neoview system was designed from the ground
up to be an all-inclusive data warehouse that provides dexterity with table joins and gives the
system the ability to perform analysis functions at the same time that it’s managing new
incoming data. It includes all of the data used by a company, not just partial segments of
the data or of the company. Most warehouses don’t have that feature.
5. If you were in charge of developing an enterprise-wide data warehouse for your
company, describe the steps you would have to take to complete this project. List and
describe all of the people, organization, and technology issues that must be addressed to
build an enterprise-wide data warehouse successfully.
The first step is to identify the real problem. In HP’s case the real problem was that data was
inconsistent across the organization and the current system was slow to provide information
to users. It simply did not give the organization a clear, concise, and consistent view of the
entire enterprise.
The second step is to assemble the right people, both technical staff and business unit users, who can
develop an acceptable solution for the entire enterprise. The third step is to implement the
solution and the fourth step is to maintain the new system and processes.
Issues that must be addressed to build an enterprise-wide data warehouse successfully
include:
People: Perhaps the most important issue is to convince employees, managers, and
executives that the new system will be better than the old one. The organization’s change
agent is responsible for ensuring all the people in the organization accept the new system.
Assemble the right people—techies and non-techies—that have the business knowledge and
technical knowledge to build the database. Train, train, and train some more so users have a
complete knowledge of the new system.
Organization: Solve, or at least reduce, the political turf battles inherent in the old system.
Show how the organization will benefit from a better system by having consistent, complete,
and up-to-date information across all organizational boundaries.
Technology: HP’s new system has familiar components that will create a larger pool of
people with the knowledge to run most data warehouses. The new system will emphasize
cost and flexibility. Neoview’s hardware can be used to run other applications aside from
those connected to the data warehouse. Most other warehouses in use do not incorporate 100
percent of a company’s data as HP contends Neoview will. Neoview’s system uses servers
with Itanium processors from Intel so they meet industry standards and are more versatile
than servers with proprietary technology. The system is highly scalable and promises
availability 99.999 percent of the time.
Chapter Summary
Section 5.1: The Database Approach to Data Management
The relational database is the primary method for organizing and maintaining data today in
information systems. It organizes data in two-dimensional tables with rows and columns called
relations. Each table contains data about an entity and its attributes. Each row represents a
record and each column represents an attribute or field. Each table also contains a key field to
uniquely identify each record for retrieval or manipulation. An entity-relationship diagram
graphically depicts the relationship between entities (tables) in a relational database. A well-designed relational database will not have many-to-many relationships, and all attributes for a
specific entity will only apply to that entity. The process of breaking down complex groupings
of data and streamlining them to minimize redundancy and awkward many-to-many
relationships is called normalization.
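A minimal sketch of these ideas, using Python's built-in sqlite3 module, appears below. The supplier and part tables, their columns, and the sample rows are hypothetical; the point is only to show a key field in each table and a foreign key linking the two so that supplier details are stored once rather than repeated for every part.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Ask SQLite to enforce foreign key constraints on this connection
conn.execute("PRAGMA foreign_keys = ON")

# Each table describes one entity; the primary key identifies each record
conn.execute(
    "CREATE TABLE supplier (supplier_id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
)
conn.execute(
    """CREATE TABLE part (
           part_id INTEGER PRIMARY KEY,
           description TEXT NOT NULL,
           supplier_id INTEGER NOT NULL REFERENCES supplier(supplier_id)
       )"""
)

# The supplier's name is stored once, not repeated on every part row
conn.execute("INSERT INTO supplier VALUES (1, 'Acme Supply')")
conn.executemany(
    "INSERT INTO part VALUES (?, ?, ?)",
    [(10, "Door latch", 1), (11, "Hinge", 1)],
)

# Joining on the foreign key recombines the two relations when needed
rows = conn.execute(
    """SELECT part.description, supplier.name
       FROM part JOIN supplier ON part.supplier_id = supplier.supplier_id
       ORDER BY part.part_id"""
).fetchall()
```

Keeping supplier data in its own table is the kind of streamlining normalization aims for: a change to the supplier's name is made in one row instead of many.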
An object-oriented DBMS stores data and procedures that act on the data as objects, and it can
handle multimedia as well as characters and numbers.
Section 5.2: Database Management Systems
A database management system (DBMS) consists of software that permits centralization of data
and data management so that businesses have a single consistent source for all their data needs.
A single database services multiple applications. The most important feature of the DBMS is its
ability to separate the logical and physical views of data. The user works with a logical view of
data. The DBMS retrieves information so that the user does not have to be concerned with its
physical location.
The principal capabilities of a DBMS include a data definition capability, a data dictionary
capability, and a data manipulation language. The data definition language specifies the
structure and content of the database. The data dictionary is an automated or manual file that
stores information about the data in the database, including names, definitions, formats, and
descriptions of data elements. The data manipulation language, such as SQL, is a specialized
language for accessing and manipulating the data in the database.
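The three capabilities can be illustrated with Python's built-in sqlite3 module. This is a sketch under assumptions: the employee table and its values are hypothetical, and SQLite's table_info pragma is used here only as a stand-in for a data dictionary, since it reports the names and types of the columns the DBMS has recorded.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Data definition: specify the structure of the database
conn.execute(
    "CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, name TEXT NOT NULL, salary REAL)"
)

# Data dictionary stand-in: ask the DBMS to describe the stored data elements.
# Each row is (cid, name, type, notnull, dflt_value, pk); keep name and type.
columns = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(employee)")]

# Data manipulation: add, change, and retrieve data with SQL
conn.execute("INSERT INTO employee (emp_id, name, salary) VALUES (1, 'Lee', 52000.0)")
conn.execute("UPDATE employee SET salary = salary + 1000 WHERE emp_id = 1")
salary = conn.execute("SELECT salary FROM employee WHERE emp_id = 1").fetchone()[0]
```

The program never says where or how the rows are physically stored; it works with the logical view and leaves the physical details to the DBMS.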
Section 5.3: Using Databases to Improve Business Performance and Decision Making
Powerful tools are available to analyze and access the information in databases. A data
warehouse consolidates current and historical data from many different operational systems in a
central database for reporting and analysis. Data warehouses support multidimensional data
analysis, also known as online analytical processing (OLAP). OLAP represents relationships
among data as a multidimensional structure, which can be visualized as cubes of data and cubes
within cubes of data, enabling more sophisticated data analysis. Data mining analyzes large
pools of data, including the contents of data warehouses, to find patterns and rules that can be
used to predict future behavior and guide decision making. Conventional databases can be
linked via middleware to the Web or a Web interface to facilitate user access to an organization’s
internal data.
Section 5.4: Managing Data Resources
Developing a database environment requires policies and procedures for managing
organizational data as well as a good data model and database technology. A formal information
policy governs the maintenance, distribution, and use of information in the organization. In large
corporations, a formal data administration function is responsible for information policy, as well
as for data planning, data dictionary development, and monitoring data usage in the firm.
Data that are inaccurate, incomplete, or inconsistent create serious operational and financial
problems for businesses because they may create inaccuracies in product pricing, customer
accounts, and inventory data, and lead to inaccurate decisions about the actions that should be
taken by the firm. Firms must take special steps to make sure they have a high level of data
quality. These include using enterprise-wide data standards, databases designed to minimize
inconsistent and redundant data, data quality audits, and data cleansing software.