CHAPTER 3
Data and Knowledge Management
3.1 Managing Data
3.2 The Database Approach
3.3 Database Management Systems
3.4 Data Warehouses and Data Marts
3.5 Knowledge Management
Copyright John Wiley & Sons Canada
LEARNING OBJECTIVES
1. Identify three common challenges in managing data,
and describe one way organizations can address each
challenge using data governance.
2. Name six problems that can be minimized by using the
database approach.
3. Demonstrate how to interpret relationships depicted in
an entity-relationship diagram.
4. Discuss at least one main advantage and one main
disadvantage of relational databases.
LEARNING OBJECTIVES
(continued)
5. Identify the six basic characteristics of data warehouses
and data marts.
6. Demonstrate the use of a multidimensional model to
store and analyze data.
7. List two main advantages of using knowledge
management, and describe the steps in the knowledge
management system cycle.
OPENING CASE 3.1 BIG DATA
The Problem
• The amount of digital data increases tenfold every five
years. Scientists say that we are undergoing a new revolution, the
“Industrial Revolution of Data,” and they have coined the term
“Big Data” to describe the superabundance of data available
today. This abundance creates challenges in storage space,
processing speed, time, structure, quantity, and quality of data.
THE SOLUTION
For many organizations, the first step in managing Big Data
was to deal with the problem of information silos. Silos are
information that is stored and isolated in separate
functional areas. Organizations began to integrate this
information into a database environment and then to
develop data warehouses to serve as decision-making
tools. Next, they turned their attention to the business of
data and information management; that is, making sense of
their proliferating data. Seeing a market need for data
management, Oracle, IBM, Microsoft, and SAP together
have spent more than $15 billion in recent years to
purchase software firms specializing in data management
and business intelligence.
THE RESULTS
• The way information is managed touches all areas of
life.
• Today, the availability of abundant yet small-scale data
enables companies to cater to niche markets, and even
individual customers, anywhere in the world.
• Some industries have led the way in gathering and
exploiting data. For example, credit card companies
monitor every purchase and can accurately identify
fraudulent ones, using rules derived by analyzing
billions of transactions.
DISCUSSION
• Which market do you believe will experience the most
growth in “Big Data”? Smartphones? Tablets?
• What types of “Big Data” are used at a university?
3.1 MANAGING DATA
• The Difficulties of Managing Data
• Data Governance
DIFFICULTIES IN MANAGING DATA
• The amount of data increases exponentially over time.
• Data are scattered throughout organizations.
• Data are obtained from multiple internal and external sources.
• Data degrade over time.
• Data are subject to data rot.
• Data security, quality, and integrity are critical, yet easily
jeopardized.
• Information systems that do not communicate with each
other can result in inconsistent data.
• Federal regulations impose additional requirements on how
data are managed.
DATA GOVERNANCE
• Data Governance
• Master Data Management
• Master Data
MASTER DATA MANAGEMENT
• John Stevens registers for Introduction to Management
Information Systems (ISMN 3140) from 10 AM until 11
AM on Mondays and Wednesdays in Room 41 Smith
Hall, taught by Professor Rainer.

Transaction Data                          Master Data
John Stevens                              Student
Intro to Management Information Systems   Course
ISMN 3140                                 Course No.
10 AM to 11 AM                            Time
Mondays and Wednesdays                    Weekday
Room 41 Smith Hall                        Location
Professor Rainer                          Instructor
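
One way to picture the split between master data and transaction data is the minimal Python sketch below. It stores each master entity once and lets the registration (the transaction) refer to those entities by key. The class names, field names, and the student ID are illustrative assumptions, not part of the chapter.

```python
from dataclasses import dataclass

# Master data: core entities shared by many transactions.
# All class names, field names, and IDs here are illustrative assumptions.
@dataclass(frozen=True)
class Student:
    student_id: int
    name: str

@dataclass(frozen=True)
class Course:
    course_no: str
    title: str
    instructor: str
    location: str
    weekdays: tuple
    time: str

# Transaction data: one record per business event (here, a registration).
@dataclass(frozen=True)
class Registration:
    student_id: int
    course_no: str

students = {1: Student(1, "John Stevens")}
courses = {
    "ISMN 3140": Course(
        course_no="ISMN 3140",
        title="Introduction to Management Information Systems",
        instructor="Professor Rainer",
        location="Room 41 Smith Hall",
        weekdays=("Monday", "Wednesday"),
        time="10 AM to 11 AM",
    )
}

registration = Registration(student_id=1, course_no="ISMN 3140")

def describe(reg):
    """Resolve a transaction against the master data it references."""
    student = students[reg.student_id]
    course = courses[reg.course_no]
    return f"{student.name} registers for {course.title} ({course.course_no})"

summary = describe(registration)
print(summary)
```

Because the course details live in one place, a room change would be made once in the master data rather than in every registration record.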
3.2 THE DATABASE APPROACH
• Databases minimize the following problems:
– Data redundancy: The same data are stored in many places.
– Data isolation: Applications cannot access data associated with
other applications.
– Data inconsistency: Various copies of the data do not agree.
DATABASE APPROACH
(CONTINUED)
• Database management systems (DBMSs) maximize the
following:
– Data security: Databases have extremely high security
measures in place to deter mistakes and attacks.
– Data integrity: Data meet certain constraints, such as no
alphabetic characters in a Social Insurance Number field.
– Data independence: Applications and data are not linked to each
other, so that all applications are able to access the same data.
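
The data integrity example above (no alphabetic characters in a Social Insurance Number field) can be sketched with Python's built-in sqlite3 module. The table name, column names, and sample values are hypothetical, and the digits-only rule is only one possible way to write such a constraint.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE employee (
        employee_id INTEGER PRIMARY KEY,
        name        TEXT NOT NULL,
        -- Integrity constraint: reject empty values and any non-digit character.
        sin         TEXT NOT NULL CHECK (sin <> '' AND sin NOT GLOB '*[^0-9]*')
    )
    """
)

def try_insert(name, sin):
    """Return True if the DBMS accepts the row, False if a constraint rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO employee (name, sin) VALUES (?, ?)", (name, sin)
            )
        return True
    except sqlite3.IntegrityError:
        return False

accepted = try_insert("Valid Row", "123456789")   # digits only
rejected = try_insert("Invalid Row", "12345A789")  # contains a letter
row_count = conn.execute("SELECT COUNT(*) FROM employee").fetchone()[0]
```

The rule lives in the database rather than in each application, so every program that writes to the table is held to the same constraint.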
DATABASE MANAGEMENT
SYSTEMS
Figure 3.1 University Database Management System
DATA HIERARCHY
• Bit (binary digit): the smallest unit of data a
computer can process.
• Byte: a group of eight bits that represents a single character.
• Field: a logical grouping of related characters.
• Record: a logical grouping of related fields.
• File (or table): a logical grouping of related records.
• Database: a logical grouping of related files.
HIERARCHY OF DATA FOR A
COMPUTER-BASED FILE
Figure 3.2 Hierarchy of data in University database
DATA HIERARCHY (CONTINUED)
• Bits (binary digits): 1 0 0 1
• Byte (eight bits): 01101010
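
As a rough illustration of the hierarchy, the Python sketch below starts from the byte 01101010 shown above and builds upward to a field, a record, a table, and a database. The field names and the table name are hypothetical, and ASCII is assumed as the character encoding.

```python
# Bit -> byte -> field -> record -> file (table) -> database.
bits = "01101010"                     # eight bits make one byte
byte_value = int(bits, 2)             # the byte read as a number
character = chr(byte_value)           # the character that byte encodes in ASCII

field = "John Stevens"                # a field groups related characters
field_bytes = field.encode("ascii")   # one byte per character

# A record groups related fields (field names are hypothetical).
record = {"student_name": field, "course_no": "ISMN 3140"}

# A file (table) groups related records; a database groups related files.
table = [record]
database = {"registrations": table}

print(byte_value, character, len(field_bytes))  # 106 j 12
```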
DATA HIERARCHY (CONTINUED)
Example of Field and Record
DESIGNING THE DATABASE
• Data model
– Entity: a person, place, thing, or event about which an
organization maintains information.
– Instance: a specific, unique representation of the entity.
– Attribute: a characteristic or quality of a particular entity.
– Primary key: a field that uniquely identifies a record.
– Secondary keys: other fields that have some identifying
information but typically do not identify the record with complete
accuracy.
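
The difference between a primary key and a secondary key can be sketched with Python's built-in sqlite3 module. The table, its columns, and the sample students are hypothetical; the choice of "major" as a secondary key is an assumption made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE student (
        student_id INTEGER PRIMARY KEY,   -- primary key: unique per record
        name       TEXT NOT NULL,
        major      TEXT NOT NULL          -- secondary key: narrows the search
    )
    """
)
# An index speeds up lookups by the secondary key; it does not make it unique.
conn.execute("CREATE INDEX idx_student_major ON student (major)")

rows = [
    (1, "Student A", "Accounting"),
    (2, "Student B", "Accounting"),
    (3, "Student C", "Marketing"),
]
conn.executemany("INSERT INTO student VALUES (?, ?, ?)", rows)

# The primary key identifies exactly one record.
by_primary = conn.execute(
    "SELECT name FROM student WHERE student_id = ?", (2,)
).fetchall()

# The secondary key can match several records.
by_secondary = conn.execute(
    "SELECT name FROM student WHERE major = ? ORDER BY student_id",
    ("Accounting",),
).fetchall()

# A second record with an existing primary key value is rejected.
try:
    conn.execute("INSERT INTO student VALUES (1, 'Student D', 'Finance')")
    duplicate_allowed = True
except sqlite3.IntegrityError:
    duplicate_allowed = False
```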
ENTITY-RELATIONSHIP MODELING
• Database designers plan the database design in a
process called entity-relationship (ER) modeling.
• ER diagrams consist of entities, attributes, and
relationships.
– Entity classes
– Instance
– Identifiers
RELATIONSHIPS BETWEEN
ENTITIES
Figure 3.3 Cardinality and Modality Symbols
ENTITY-RELATIONSHIP DIAGRAM
MODEL
3.3 DATABASE MANAGEMENT
SYSTEMS
• Database management system (DBMS)
• Relational database model
– Structured Query Language (SQL)
– Query by Example (QBE)
• Data dictionary
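
To give a feel for SQL in a relational database, the sketch below runs a small query through Python's built-in sqlite3 module. The two tables, their columns, and all rows other than John Stevens and ISMN 3140 are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE student (
        student_id INTEGER PRIMARY KEY,
        name       TEXT NOT NULL
    );
    CREATE TABLE registration (
        student_id INTEGER NOT NULL REFERENCES student (student_id),
        course_no  TEXT NOT NULL
    );
    INSERT INTO student VALUES (1, 'John Stevens'), (2, 'Student B');
    INSERT INTO registration VALUES
        (1, 'ISMN 3140'), (2, 'ISMN 3140'), (2, 'COURSE 200');
    """
)

# SQL states what is wanted (SELECT ... FROM ... WHERE);
# the DBMS works out how to retrieve it from the related tables.
query = """
    SELECT s.name
    FROM student AS s
    JOIN registration AS r ON r.student_id = s.student_id
    WHERE r.course_no = ?
    ORDER BY s.name
"""
names = [row[0] for row in conn.execute(query, ("ISMN 3140",))]
print(names)  # ['John Stevens', 'Student B']
```

The JOIN follows the shared student_id column, which is how the relational model links one table to another.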
STUDENT DATABASE EXAMPLE
Figure 3.5 Example of Student Database
NORMALIZATION
• Normalization:
– Minimizes redundancy
– Maximizes data integrity
– Optimizes processing performance
• Data are normalized when the attributes in a table
depend only on the primary key.
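
A small Python sketch can make the idea concrete. It starts with a non-normalized relation in which customer details repeat on every order row, then moves the customer attributes into their own table. All names and values are hypothetical.

```python
# A non-normalized relation: customer details repeat on every order row.
# All names and values are hypothetical.
flat_orders = [
    {"order_id": 1, "customer_id": 10, "customer_name": "Customer X", "total": 25.0},
    {"order_id": 2, "customer_id": 10, "customer_name": "Customer X", "total": 40.0},
    {"order_id": 3, "customer_id": 11, "customer_name": "Customer Y", "total": 15.0},
]

# Normalize: customer_name depends on customer_id, not on order_id,
# so it moves to its own table keyed by customer_id.
customers = {}
orders = []
for row in flat_orders:
    customers[row["customer_id"]] = {"customer_name": row["customer_name"]}
    orders.append(
        {
            "order_id": row["order_id"],
            "customer_id": row["customer_id"],
            "total": row["total"],
        }
    )

# Each customer name is now stored once, so a rename touches one record.
customers[10]["customer_name"] = "Customer X Ltd."

def order_view(order):
    """Rejoin an order with its customer, as a query would."""
    return {**order, **customers[order["customer_id"]]}

rejoined = [order_view(order) for order in orders]
```

In the flat version the same rename would have to be applied to every matching order row, which is where inconsistent copies tend to appear.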
NON-NORMALIZED RELATION
NORMALIZING THE DATABASE
(PART A)
NORMALIZING THE DATABASE
(PART B)
NORMALIZATION PRODUCES
ORDER
3.4 DATA WAREHOUSING AND
DATA MARTS
• Data warehouses and data marts
– Organized by business dimension or subject
– Use online analytical processing (OLAP)
– Integrated
– Time variant
– Nonvolatile
– Multidimensional
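
The multidimensional characteristic can be sketched as a small data cube in Python, where each cell is addressed by a combination of dimension values. The three dimensions (product, region, year), the sample values, and the sales figures are all made up for illustration.

```python
from collections import defaultdict

# Each cell of the cube is addressed by (product, region, year).
# Dimension values and sales figures are made up for illustration.
cube = {
    ("Nuts", "East", 2012): 50,
    ("Nuts", "West", 2012): 60,
    ("Bolts", "East", 2012): 40,
    ("Nuts", "East", 2013): 70,
    ("Bolts", "West", 2013): 90,
}

DIMENSIONS = ("product", "region", "year")

def roll_up(cube, dimension):
    """Total the measure for each value of one dimension, collapsing the others."""
    axis = DIMENSIONS.index(dimension)
    totals = defaultdict(int)
    for key, sales in cube.items():
        totals[key[axis]] += sales
    return dict(totals)

def slice_cube(cube, dimension, value):
    """Keep only the cells where one dimension has the given value."""
    axis = DIMENSIONS.index(dimension)
    return {key: sales for key, sales in cube.items() if key[axis] == value}

sales_by_region = roll_up(cube, "region")
sales_by_product = roll_up(cube, "product")
sales_2013 = slice_cube(cube, "year", 2013)
```

The same cells answer questions by region, by product, or by year without restructuring the data, which is the appeal of organizing a warehouse by business dimension.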
THE ENVIRONMENT FOR DATA
WAREHOUSING AND DATA MARTS
• Source systems that provide data to the data
warehouse or data mart
• Data integration technology and processes that are
needed to prepare the data for use
• Different architectures for storing data in an
organization’s data warehouse or data marts
• Different BI tools and applications for the variety of
users
• The need for metadata, data quality, and governance
processes to be in place to ensure that the data
warehouse or data mart meets its purposes
DATA WAREHOUSE FRAMEWORK
RELATIONAL DATABASES
MULTIDIMENSIONAL DATABASE
EQUIVALENCE BETWEEN RELATIONAL
AND MULTIDIMENSIONAL DATABASES
DATA INTEGRATION (ETL)
• ETL: extract data from source systems, transform them,
and load them into a data mart or warehouse.
• ETL can be performed by hand-written code (e.g., SQL
queries) or by commercial data-integration software.
• During transformation, the data are changed to make them
more useful.
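
A hand-written ETL step might look roughly like the Python sketch below. The two source systems, their field names, the sample rows, and the warehouse table are all hypothetical; an in-memory SQLite database stands in for the data warehouse.

```python
import sqlite3

# Extract: rows from two hypothetical source systems with inconsistent formats.
marketing_rows = [{"cust": " alice ", "amount": "100.50"}]
finance_rows = [{"customer_name": "BOB", "amount_cents": 2500}]

def transform(marketing, finance):
    """Standardize names and units so rows from both sources agree."""
    cleaned = []
    for row in marketing:
        cleaned.append((row["cust"].strip().title(), float(row["amount"])))
    for row in finance:
        cleaned.append(
            (row["customer_name"].strip().title(), row["amount_cents"] / 100)
        )
    return cleaned

def load(conn, rows):
    """Load the transformed rows into the warehouse table."""
    conn.execute("CREATE TABLE IF NOT EXISTS sales (customer TEXT, amount REAL)")
    conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)
    conn.commit()

warehouse = sqlite3.connect(":memory:")
load(warehouse, transform(marketing_rows, finance_rows))
loaded = warehouse.execute(
    "SELECT customer, amount FROM sales ORDER BY customer"
).fetchall()
print(loaded)  # [('Alice', 100.5), ('Bob', 25.0)]
```

The transform step is where the sources are reconciled: here the names get one capitalization style and the amounts one unit.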
STORING THE DATA
• The most common architecture is one central
enterprise data warehouse, without data marts.
• Independent data marts store data for a single application
or a few applications, such as in marketing or finance.
• Hub and spoke stores data in a central data
warehouse while simultaneously maintaining dependent
data marts that obtain their data from the central
repository.
STORING DATA (CONTINUED)
• Metadata: data about data.
• Data quality: The quality of the data in the warehouse
must be adequate to satisfy users’ needs.
• Governance: People, committees, and processes must
be in place.
• Users: There are a large number of potential BI users,
including IT developers; front-line workers; analysts;
information workers; managers and executives; and
suppliers, customers, and regulators.
3.5 KNOWLEDGE MANAGEMENT
• Knowledge management (KM)
• Knowledge
• Intellectual capital (or intellectual assets)
KNOWLEDGE MANAGEMENT
(CONTINUED)
• Explicit knowledge: objective, rational, technical
knowledge that has been documented.
– Examples: policies, procedural guides, reports, products,
strategies, goals, core competencies
• Tacit knowledge: cumulative store of subjective or
experiential learning.
– Examples: experiences, insights, expertise, know-how, trade
secrets, understanding, skill sets, and learning
KNOWLEDGE MANAGEMENT
(CONTINUED)
• Knowledge management systems (KMSs)
• Best practices
KNOWLEDGE MANAGEMENT
SYSTEM CYCLE
• Create knowledge
• Capture knowledge
• Refine knowledge
• Store knowledge
• Manage knowledge
• Disseminate knowledge
KNOWLEDGE MANAGEMENT
SYSTEM CYCLE
CHAPTER CLOSING
• Organizations can use knowledge management to
develop best practices, the most effective and efficient
ways of doing things, and to make these practices
readily available to a wide range of employees.
• The database approach minimizes data redundancy,
data isolation, and data inconsistency, and it maximizes
data security, data integrity, and data independence.
• Master data management provides companies with the
ability to store, maintain, exchange, and synchronize a
consistent, accurate, and timely “single version of the
truth” for the company’s core master data.
Copyright
Copyright © 2014 John Wiley & Sons Canada, Ltd. All rights
reserved. Reproduction or translation of this work beyond
that permitted by Access Copyright (the Canadian copyright
licensing agency) is unlawful. Requests for further
information should be addressed to the Permissions
Department, John Wiley & Sons Canada, Ltd. The purchaser
may make back-up copies for his or her own use only and
not for distribution or resale. The author and the publisher
assume no responsibility for errors, omissions, or damages
caused by the use of these files or programs or from the use
of the information contained herein.