DATA WAREHOUSING IN COMPUTER INTEGRATED MANUFACTURING (CIM)
Steve S. Daino
Master of Science Graduate Student
Submitted in Partial Completion of the Requirements of
IEM 5303
Computer Integrated Manufacturing
This paper was developed to assist students in partial fulfillment of course requirements. No warranty of
any kind is expressed or implied. Readers of this document bear sole responsibility for verification of its
contents and assume any/all liability for any/all damage or loss resulting from its use.
Table of Contents
ABSTRACT
INTRODUCTION TO DATA WAREHOUSING
Definitions and Concepts
Considerations
FUNDAMENTALS OF DATA WAREHOUSING
Types of Data Warehouses
Technological Building Blocks
Data Warehouse Framework and Architecture
Data Warehouse Logical Model
BENEFITS OF DATA WAREHOUSING IN CIM
CASE STUDIES
Toyota Motor Company's Toyota Logistics Services (TLS) [4]
General Motors [5]
MCI WorldCom and Industry Data Exchange (IDE) [6]
REFERENCES
List of Figures
Figure 1. Reasons for moving data outside the operations systems
Figure 2. Data Warehouse Architecture
Figure 3. Data warehouse entities align with the business structure
Figure 4. Data De-normalization and Transformation
Figure 5. Data Scrubbing and Staging
ABSTRACT
The incorporation of data warehousing in computer-integrated manufacturing
provides a means by which organizations can become more successful and globally
competitive in today's fierce economic market. "Business as usual" is no longer enough.
Since the onset of the information superhighway, a new avenue has opened enabling
organizations to include state-of-the-art technologies for the collection, processing, and
analysis of information. Together, the technologies that collect data and the tools used to analyze
it and reach decisions make up the concept of data warehousing.
This technology is becoming an elemental tool by which corporations are able to gain a
vital competitive decision-making edge throughout the product development lifecycle.
Data warehousing is driven by the concept that data stored for business analysis is
most effectively accessed by separating it from the data in operational systems. “It is
unacceptable to have business analysis interfere with and degrade performance of
operational systems.” [8] Analytical tools and procedures complement data warehouses
by allowing users to actively manipulate data retrieved from computer servers in order to
get the necessary information used for decision-making. This paper will discuss the
scope of data warehousing in computer integrated manufacturing (CIM), explain the
advantages of data warehousing in CIM, and illustrate the use of data warehousing in
manufacturing enterprises through selected case studies.
INTRODUCTION TO DATA WAREHOUSING
Computers began to play a vital role in the manufacturing process
with the automation era. However, these early systems were stand-alone and could not
transmit information to, or store it on, other computer systems. Users' need for access
to this data eventually led to the creation of distributed database management technology.
This software allowed specific information to be pulled from databases located throughout
the organization, collected and stored on a computer in a central location, and then
consolidated and analyzed by the users.
Although the concept of the distributed database management system was good in
theory, it did not resolve the inability to share information among the
various relational databases or incompatible computer systems. “Despite all of the
changes in platforms, architectures, tools, and technologies over the past decades, a
remarkably large number of business applications continue to run in the mainframe
environment of the 1970’s.” [8] Additionally, the entire distributed process
was fairly slow. The solution to these problems was the advent of data warehousing.
Through client/server technology, data could be copied, regardless of
computer type or platform, and placed onto a common server.
Definitions and Concepts
By definition, a data warehouse is a centralized, integrated repository of information that
must support complex decision support queries without performance degradation. A data
warehouse simply provides a means to manage data outside of the operational systems
where the data originates. Reasons for moving data outside of operational systems are
illustrated below in Figure 1. This separation is achieved through an integrated system of
hardware, software, and network technologies designed to convert operational data into
accessible business information. The main purpose of the data warehouse is to provide
historical data for analyzing past performance in order to aid in future decisions. It is
also important to remember that once operational data is brought into the data warehouse, it
is no longer dynamic data and should not be further modified.
[Figure 1: three operational systems feed one data warehouse]
Order processing (feeds the warehouse with daily closed orders)
• 2 second response time
• Last 6 months orders
Product price/inventory (feeds the warehouse with weekly product price/inventory)
• 10 second response time
• Last 10 price changes
• Last 20 inventory transactions
Marketing (feeds the warehouse with weekly marketing programs)
• 30 second response time
• Last 2 years programs
Data warehouse
• Last 5 years data
• Response time 2 seconds to 60 minutes
• Data is not modified
Reasons:
• Different performance requirements
• Combine data from multiple applications
• Data is mostly non-volatile
• Data saved for a long time period
Figure 1. Reasons for moving data outside the operations systems [8]
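The separation illustrated in Figure 1 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the method of [8]: the Order record, the load_closed_orders function, the status values, and the sample data are all hypothetical names chosen for this example. Closed orders are copied out of the operational system once a day, and rows already in the warehouse are never modified.

```python
from dataclasses import dataclass
from datetime import date
from typing import Dict, List, Optional


@dataclass(frozen=True)  # frozen: a warehouse row cannot be changed after loading
class Order:
    order_id: int
    status: str                 # "open" or "closed" (assumed status values)
    closed_on: Optional[date]   # None while the order is still open


def load_closed_orders(warehouse: Dict[int, Order],
                       operational: List[Order],
                       day: date) -> int:
    """Copy the orders closed on `day` into the warehouse; return how many were added."""
    added = 0
    for order in operational:
        if order.status != "closed" or order.closed_on != day:
            continue  # open orders and other days' orders are not part of this load
        if order.order_id in warehouse:
            continue  # already loaded; existing history is not overwritten
        warehouse[order.order_id] = order
        added += 1
    return added


# Hypothetical operational data for one day.
operational_orders = [
    Order(1, "closed", date(2000, 10, 2)),
    Order(2, "open", None),
    Order(3, "closed", date(2000, 10, 2)),
]
warehouse: Dict[int, Order] = {}
first_run = load_closed_orders(warehouse, operational_orders, date(2000, 10, 2))
second_run = load_closed_orders(warehouse, operational_orders, date(2000, 10, 2))
print(first_run, second_run, sorted(warehouse))
```

Running the load twice for the same day adds nothing the second time, which reflects the point that warehouse data is historical and should not be modified once it has been brought over.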
The concept of data warehousing differs greatly from that of
production databases, also known as online transaction-processing
systems or OLTP systems. As displayed in Table 1, the differences between these two
concepts affect both purpose and design. [1] The type of data warehouse an organization
decides to incorporate should be based on two factors: the data needed to make
informed decisions and the methods used to conduct its business. Since organizations
are finding an increasing need to analyze historical data from all aspects of their
manufacturing processes in order to improve their efficiency and increase their business
performance, data warehousing solutions are becoming the foundation for future
opportunities.
Data Warehouses | OLTP Systems
Optimized for data retrieval and reporting | Optimized for data entry and updates
Read-only system | Contains data needed for running the day-to-day operations of a business
Contains data used for analyzing the business | Contains current and highly volatile data, with some elements being unknown or incomplete at data entry time
Incorporates redundant data | Does not incorporate redundancy in data storage
Table 1. Data Warehouses versus OLTP Systems
Considerations
One of the greatest advantages of data warehousing technology is in the area of
accessibility to information. Organizations are finding it advantageous to not only open
up the data to the users within the company, but also to customers, suppliers, and
business partners. The benefits of information sharing include improved operations and
products, more satisfied customers, and increased revenue. However, sharing information
also requires a considerable amount of planning and coordination, a reliable process to
ensure data integrity, increased security, and scalability of the system based on the
number of users accessing the server computer system.
The decision for an organization to incorporate data warehousing into its
business depends on several factors. The practical considerations listed in Table 2
determine whether the data warehousing technology in an
organization becomes a valuable tool or just another expensive technology fad. [1]
Time and Money
Space
Consolidation
Security
User-Friendliness
Project Planning
Table 2. Practical Considerations for Data Warehousing
Data warehouses are expensive in both time and money. In a 1996 study
published by IDC, the average cost of building a data warehouse was $2.2 million, with
an average time of 2.3 years to break even. Ninety percent of the companies in the study
achieved greater than 40 percent return on investment, and 50 percent achieved over 160
percent return on investment. The average return on investment over three years,
cumulative, was about 400 percent, with a higher return on investment for data marts. [1]
Building a data warehouse can be profitable; however, a return on investment will not be
realized for quite some time. The organization needs to be fully aware of the amount of
investment and time required before any payback can be expected.
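To make the percentages above concrete, the sketch below applies the conventional return-on-investment formula, ROI = (benefit - cost) / cost, to the study's average figures. The formula is an assumption here, since the source does not state how the study computed its figures, and the function names are illustrative only.

```python
def roi_percent(total_benefit: float, cost: float) -> float:
    """Return on investment as a percentage of the original cost."""
    return (total_benefit - cost) / cost * 100.0


def benefit_for_roi(cost: float, roi_pct: float) -> float:
    """Total benefit that would have to be realized to reach a given ROI."""
    return cost * (1.0 + roi_pct / 100.0)


average_cost = 2_200_000          # average build cost reported in the study [1]
three_year_roi = 400.0            # average cumulative three-year ROI, in percent [1]

implied_benefit = benefit_for_roi(average_cost, three_year_roi)
print(f"Implied three-year benefit: ${implied_benefit:,.0f}")
print(f"Check: ROI = {roi_percent(implied_benefit, average_cost):.0f}%")
```

Under this definition, a 400 percent cumulative return on a $2.2 million warehouse corresponds to roughly $11 million in total benefit over three years, that is, the original cost recovered plus four times that cost.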
Data warehouses also require an enormous amount of disk space. It is
common to store a year's worth of data at a minimum, or several years' worth if
trend analysis is used. Data warehouses measured in terabytes or petabytes are not
uncommon. Disk space requirements will also affect the cost of the project in both
implementation and maintenance.
Consolidation of data should be considered in the design of the warehouse,
especially because combining data from multiple sources may uncover incompatibility issues.
Consistency in the data is the key to being able to extract useful data for decision-making
applications.
Security affects the organization both physically and philosophically. Since one
of the great advantages of a data warehouse is the ability to share information among
multiple users, security measures that allow this type of access are a necessity. This
contradicts the old security standard of limiting access to only those who absolutely
need it. However, the evolution of technology is forcing organizations to change from a
'need to know' mindset to a 'right to know' philosophy. When users are permitted access to the
data, they are able to maximize their effectiveness and contributions to the organization.
Data warehouses and the tools used for analysis should be user-friendly and not
cumbersome to the point of frustration. Encouraging users to search for answers to
their own questions will promote creativity and reduce the cost of
servicing users' requests.
Finally, proper planning is a necessity if the project is to be successful. An
organization needs to have a full understanding of its business objectives, the potential
cost versus the benefits, the resources required, and the organizational commitment
needed to implement a successful data warehouse project. If these areas are not fully
researched before the implementation of the project, the organization may have wasted
time and money on a project destined to fail. According to Hal Lorin of the Manticore
Consultancy in New York, most of the failed data warehousing projects he has had the
chance to examine closely have demonstrated a set of common patterns:
• The database product was driving the project, not being driven by it.
• No close and effective link existed between business process analysis and data
normalization.
• No multi-dimensional views were available with the 'warehouse' context.
• No serious data normalization had been done and there was no architecture for an
integrated repository of browsable metadata.
• There was a totally inadequate study of scale and capacity issues.
• There was no co-ordination with legacy application portfolio management.
• The 'decision tools' fronting the warehouse were trivial or not at all integrated,
providing two-dimensional representations of fixed queries based on simple
extraction to a desktop PC. [10]
FUNDAMENTALS OF DATA WAREHOUSING
Types of Data Warehouses
Organizations need to consider the different variations of data warehousing and
the benefits of each before deciding which would be best for their project. The
operational data store, the data mart, and the enterprise data warehouse are all examples
of the different types of data warehouses.
The operational data store, one of the simplest types of data warehouses, is
basically a replicated production database that has been adjusted for errors. Its primary
purpose is to generate standard operations reports and to provide transaction detail
information for summary level analysis.
The data mart is different from a data store in that it usually contains limited
information regarding a specific department or business process. An example for the use
of data marts would be analyzing sales information for a specific region or product line.
Since data marts contain only summary information, they can be linked to data stores for
more detailed transactional data.
The enterprise data warehouse holds information taken from throughout an
organization. This type of warehouse is the most complex to both establish
and maintain, since information must be collected from multiple systems into a
common database. The most common problem with the enterprise data warehouse
is incompatible or inconsistent data. The majority of the time spent on building
such a data warehouse is devoted to extracting, cleaning, and loading data.
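The relationship between a data mart and an operational data store can be sketched as follows. This is an illustrative example only: the Sale record, the region and product line values, and both function names are hypothetical. The mart holds summary totals for one region, and a summary figure can be traced back to the transaction detail held in the data store.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple


class Sale(NamedTuple):
    sale_id: int
    region: str
    product_line: str
    amount: float


def build_regional_mart(detail: List[Sale], region: str) -> Dict[str, float]:
    """Summarize one region's sales by product line (the data mart)."""
    totals: Dict[str, float] = defaultdict(float)
    for sale in detail:
        if sale.region == region:
            totals[sale.product_line] += sale.amount
    return dict(totals)


def drill_down(detail: List[Sale], region: str, product_line: str) -> List[Sale]:
    """Follow a summary figure back to the transactions in the data store."""
    return [s for s in detail
            if s.region == region and s.product_line == product_line]


# Hypothetical transaction detail held in the operational data store.
data_store = [
    Sale(1, "West", "sedans", 100.0),
    Sale(2, "West", "trucks", 250.0),
    Sale(3, "East", "sedans", 80.0),
    Sale(4, "West", "sedans", 50.0),
]

west_mart = build_regional_mart(data_store, "West")
west_sedan_ids = [s.sale_id for s in drill_down(data_store, "West", "sedans")]
print(west_mart)
print(west_sedan_ids)
```

Keeping only totals in the mart keeps it small and fast for one department's questions, while the link back to the data store preserves access to the underlying transactions when a total needs to be explained.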
Technological Building Blocks
The technological building blocks of any data warehouse include the relational
database management system (RDBMS), Structured Query Language (SQL), and on-line
analytical processing (OLAP) software. [2] The RDBMS stores the information in a
database. This element is critical to the efficiency of the warehouse because it governs
the integrity of the stored data and the relationships among the data within the tables.
SQL is the language used to create, maintain, and view data. SQL is critical to the
warehouse because it is the point of entry through which the users access the information.
OLAP software allows the users to view the summarized data. This tool empowers
users by letting them carry out their own searches and
analysis of the data.
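How the three building blocks fit together can be shown with a small sketch. It assumes Python's built-in sqlite3 module as a stand-in for the RDBMS; the production table, its columns, and the sample rows are hypothetical, and the roll_up function only imitates, in a very small way, the kind of summarizing view an OLAP tool provides.

```python
import sqlite3

# The RDBMS: an in-memory SQLite database stands in for the warehouse server.
conn = sqlite3.connect(":memory:")

# SQL creates and maintains the data.
conn.execute(
    "CREATE TABLE production (plant TEXT, product TEXT, month TEXT, units INTEGER)"
)
conn.executemany(
    "INSERT INTO production VALUES (?, ?, ?, ?)",
    [
        ("Plant A", "widget", "2000-01", 120),
        ("Plant A", "widget", "2000-02", 130),
        ("Plant A", "gear", "2000-01", 40),
        ("Plant B", "widget", "2000-01", 90),
    ],
)

ALLOWED_DIMENSIONS = {"plant", "product", "month"}


def roll_up(dimension: str):
    """Summarize units produced along one dimension, as an OLAP tool might."""
    if dimension not in ALLOWED_DIMENSIONS:
        raise ValueError(f"unknown dimension: {dimension}")
    # The dimension name is checked against a fixed set before being placed in the query.
    query = (
        f"SELECT {dimension}, SUM(units) FROM production "
        f"GROUP BY {dimension} ORDER BY {dimension}"
    )
    return conn.execute(query).fetchall()


by_plant = roll_up("plant")
by_product = roll_up("product")
print(by_plant)
print(by_product)
```

The same stored rows answer both questions; only the grouping changes. That is the sense in which the user, rather than a programmer, controls the search and analysis.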
Data Warehouse Framework and Architecture
The structure of data, communication, processing, and presentation that allows
users to access enterprise data consists of several interconnected parts. These parts,
referred to as layers, are shown in Figure 2 below. [3]
Figure 2. Data Warehouse Architecture
The operational and external database layers consist of the "front-line" operational
systems. These systems correspond to the different computerized manufacturing processes
located throughout the organization. Since these systems directly control the
manufacturing process, their business transactions are limited in focus. Within an
enterprise, there may be hundreds of "front-line" systems, each performing different
functions as well as collecting process information to be stored.
The information access layer provides the user of a data warehouse with both
hardware and software tools that enable them to process and display data in a desired
format for analysis. These tools allow users to take control of their searches and give
them the data needed to make informed decisions.
The data access layer is the communication network by which the information
access layer communicates with the operational database. An important function of the
data access layer is to act as the front-end for providing users with universal and
transparent access to the data, regardless of the different DBMSs, file systems,
manufacturers and network protocols. Structured Query Language (SQL) has become the
standard for querying information at this level. Data access filters allow SQL to access
both relational and non-relational DBMSs, in addition to information from legacy DBMSs
within an enterprise.
The data directory layer is the repository of metadata and can be thought of as the
"Yellow Pages" for information about the data. Some of the aspects described in
metadata include the data's source, how it is mapped, and its reliability ranking. "This
information allows end-users to access data from the data warehouse or operational
database without having to know where the data resides or the form in which it is
stored." [3]
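A data directory of the kind described above can be sketched as a small lookup structure. The class names, field names, and the 1-to-5 reliability scale below are assumptions made for illustration; the source does not prescribe a particular layout. Each entry records where an item comes from, how it maps into the warehouse, and how reliable it is considered to be.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class MetadataEntry:
    business_name: str      # the term an end user would search for
    source_system: str      # where the data originates
    source_field: str       # the field name used in that source system
    warehouse_column: str   # where the data lives in the warehouse
    reliability: int        # 1 (low) to 5 (high); this scale is an assumption


class DataDirectory:
    """A minimal 'Yellow Pages' for the data held in the warehouse."""

    def __init__(self) -> None:
        self._entries: Dict[str, MetadataEntry] = {}

    def register(self, entry: MetadataEntry) -> None:
        # Keys are lower-cased so that lookups do not depend on capitalization.
        self._entries[entry.business_name.lower()] = entry

    def lookup(self, business_name: str) -> MetadataEntry:
        return self._entries[business_name.lower()]


directory = DataDirectory()
directory.register(
    MetadataEntry(
        business_name="Customer ID",
        source_system="order processing",
        source_field="cust_id",
        warehouse_column="customers.customer_id",
        reliability=4,
    )
)

entry = directory.lookup("customer id")
print(entry.source_system, entry.source_field, entry.warehouse_column)
```

The end user asks for "Customer ID" by its business name and never needs to know that the source system calls the field cust_id or where the warehouse stores it.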
The process management layer serves as the process "scheduler" which updates
the data warehouse. Since data integrity is an important aspect, the organization needs to
consider the optimum time to schedule jobs that pull data from the production databases
to be stored in the warehouse database. Failures due to bad or unknown data are
unacceptable and procedures should be in place to handle these situations.
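One way to handle bad or unknown data during a scheduled load is to set the offending rows aside rather than let them stop the job. The sketch below assumes hypothetical field names (part_no, quantity) and validation rules; the scheduling itself is left out, and only the body of the job is shown.

```python
from typing import Dict, List, Tuple

Row = Dict[str, object]


def validate(row: Row) -> List[str]:
    """Return a list of problems found in one extracted row (empty if clean)."""
    problems: List[str] = []
    if not row.get("part_no"):
        problems.append("missing part_no")
    quantity = row.get("quantity")
    if not isinstance(quantity, int) or quantity < 0:
        problems.append("bad quantity")
    return problems


def run_load_job(extracted: List[Row]) -> Tuple[List[Row], List[Tuple[Row, List[str]]]]:
    """Split extracted rows into those loaded and those rejected for review."""
    loaded: List[Row] = []
    rejected: List[Tuple[Row, List[str]]] = []
    for row in extracted:
        problems = validate(row)
        if problems:
            rejected.append((row, problems))  # kept for follow-up, not silently dropped
        else:
            loaded.append(row)
    return loaded, rejected


# Hypothetical rows pulled from a production database.
extracted_rows = [
    {"part_no": "A-100", "quantity": 25},
    {"part_no": "", "quantity": 10},
    {"part_no": "B-200", "quantity": -3},
]

loaded, rejected = run_load_job(extracted_rows)
print(len(loaded), "loaded;", len(rejected), "rejected")
for row, problems in rejected:
    print(row, problems)
```

The job finishes even when some rows are bad, and each rejected row carries the reason for its rejection so that the source system can be corrected.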
The application-messaging layer is also referred to as "middleware." Its function
is to transport information throughout the enterprise computing network, and it includes
tools for filtering and merging data, managing the metadata, and handling
changes to the data warehouse.
The data warehouse layer is the core of the data warehouse. In a physical data
warehouse, data is extracted from operational and/or external systems and stored in a
database on a centrally located server. However, with the onset of client/server
technology, not all computers have the capacity to store data. For these systems, the data
warehouse may provide a virtual view. In a virtual or logical system, the data warehouse
does not actually involve storing data but displays a copy of the database stored on the
server.
The data-staging layer is also referred to as replication management. "Data
staging involves data quality analysis and filters to identify patterns and data structures
within existing operational data." [3]
Data Warehouse Logical Model
In order to bring data from operational systems into a data warehouse, it must be
logically transformed. Operational systems generally contain overlapping reference data
or data in varying forms that is not entirely necessary for analysis in a data warehouse.
The data entities incorporated into a data warehouse should align with the business
structure and are built by collecting data from multiple source applications. An example
of this is illustrated below in Figure 3.
[Figure 3: data from three source applications (order processing, product price/inventory, and marketing) is collected into data warehouse entities such as customers, products, orders, product inventory, and product price]
• No data model restrictions of the source application
• Data warehouse model has business entities
Figure 3. Data warehouse entities align with the business structure [8]
Before data can be incorporated into a data warehouse, it must first be “de-normalized”
and transformed (Figure 4), and then scrubbed or staged (Figure 5).
“Normalization of data occurs when relations or tables are progressively decomposed into
smaller relations to a point where all attributes in a relation are tightly coupled.” [8]
De-normalization reduces the number of table joins that a normalized design requires at
query time. This improves performance, because the time required to
perform a join increases with the size of the tables involved. Additionally,
because data in a warehouse is static, the joins used in an operational system may no
longer be necessary.
[Figure 4: data from the order processing, product price/inventory, and marketing applications is de-normalized, and its state information transformed, as it is loaded into an extensible data warehouse of customers, products, orders, product inventory, and product price]
• Structured extensible data model
• Data warehouse model aligns with the business structure
• Transformation of the state information
• Data is de-normalized because the relationships are static
Figure 4. Data De-normalization and Transformation [8]
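De-normalization can be illustrated with a small sketch that again assumes Python's built-in sqlite3 module as a stand-in RDBMS. The table names, column names, and sample rows are hypothetical. Three normalized operational tables are joined once, at load time, into a single flat table; later analysis queries read that table without performing any joins.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Normalized tables, as an operational system might hold them.
    CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE products  (product_id INTEGER PRIMARY KEY, description TEXT, price REAL);
    CREATE TABLE orders    (order_id INTEGER PRIMARY KEY, customer_id INTEGER,
                            product_id INTEGER, quantity INTEGER);

    INSERT INTO customers VALUES (1, 'Acme'), (2, 'Globex');
    INSERT INTO products  VALUES (10, 'bracket', 2.50), (11, 'housing', 40.00);
    INSERT INTO orders    VALUES (100, 1, 10, 4), (101, 2, 11, 1), (102, 1, 11, 2);

    -- De-normalized warehouse table: the joins are paid for once, at load time.
    CREATE TABLE order_facts AS
    SELECT o.order_id,
           c.name        AS customer,
           p.description AS product,
           o.quantity,
           o.quantity * p.price AS amount
    FROM orders o
    JOIN customers c ON c.customer_id = o.customer_id
    JOIN products  p ON p.product_id  = o.product_id;
    """
)

# An analysis query against the warehouse table needs no joins.
totals = conn.execute(
    "SELECT customer, SUM(amount) FROM order_facts GROUP BY customer ORDER BY customer"
).fetchall()
print(totals)
```

The flat table repeats the customer name and product description on every row, which is the redundancy noted in Table 1. Because warehouse data is not updated after loading, that redundancy does not create the update problems that normalization is meant to prevent.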
“Different source applications invariably use different attribute values to represent the
same meaning. These different values need to be converted into a single value as data is
loaded into the data warehouse.” [8]
[Figure 5: Operational System A and Operational System B pass through a transformation step into the Data Warehouse System, which holds summarized and detailed data. Example transformations: cust, cust_id, borrower >> customer ID; “1” >> “M”; “2” >> “F”; missing data >> “……..”]
• Uniform business terms
• Single physical definition of an attribute
• Consistent use of entity attributes
• Default and missing values
Figure 5. Data Scrubbing and Staging [8]
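The conversions named in Figure 5 can be sketched as a single scrubbing function. The field aliases and the code mapping come from the figure; the field name gender and the default value "UNKNOWN" are assumptions, since the figure does not name the coded attribute and elides the actual default.

```python
from typing import Dict, Optional

# Different source systems use different names for the same attribute (Figure 5).
CUSTOMER_ID_ALIASES = ("cust", "cust_id", "borrower")

# Different source systems use different codes for the same meaning (Figure 5).
GENDER_CODES = {"1": "M", "2": "F"}

# Assumed placeholder for missing values; the figure does not give the actual default.
MISSING = "UNKNOWN"


def scrub(record: Dict[str, Optional[str]]) -> Dict[str, str]:
    """Convert one source record to the warehouse's uniform terms and values."""
    clean: Dict[str, str] = {}

    # Uniform business terms: whichever alias is present becomes customer_id.
    for alias in CUSTOMER_ID_ALIASES:
        if record.get(alias):
            clean["customer_id"] = str(record[alias])
            break
    else:
        clean["customer_id"] = MISSING

    # Single physical definition of an attribute: codes become "M" or "F".
    raw_gender = record.get("gender")  # "gender" is an assumed field name
    if raw_gender in GENDER_CODES:
        clean["gender"] = GENDER_CODES[raw_gender]
    elif raw_gender in ("M", "F"):
        clean["gender"] = raw_gender
    else:
        clean["gender"] = MISSING  # default for missing or unrecognized values

    return clean


from_system_a = scrub({"cust": "C-17", "gender": "1"})
from_system_b = scrub({"borrower": "C-42", "gender": None})
print(from_system_a)
print(from_system_b)
```

Records from both systems come out with the same field names and the same value set, so queries against the warehouse do not need to know which source system a row came from.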
BENEFITS OF DATA WAREHOUSING IN CIM
Computer-Integrated Manufacturing (CIM) is defined as, “Systems which enable the
integrated, rationalized design, development, implementation, operation and
improvement of production facilities and their output over the life cycle of the product.
These systems identify and use appropriate technology to achieve their goals at minimum
cost and effort.” [9] The ability to establish and understand the correlation between
activities of different organizational groups within a company is often cited as the most
significant advanced feature of data warehousing systems. [8]
“Concurrent Engineering is a systematic approach to the integrated, concurrent
design of products and their related processes, including manufacture and
support. This approach is intended to cause the developers, from the outset, to
consider all elements of the product life cycle from conception to disposal,
including quality, cost, schedule, and user requirements. (Pennell and Winner,
1989)” Data warehousing can provide information for analysis that can aid in
reducing design and production costs, ensuring product quality, and reducing the
time required to go from product concept to production.
The advantages of implementing data warehouse technology in today's organizations are
sometimes hard to quantify since some elements are of intangible value. However, the
positive impacts of data warehousing investments include more cost-effective decision
making, enhanced customer service, better business intelligence, enhanced asset and
liability management, support of business process re-engineering, and return on
investment. [11]
Organizations are beginning to understand the value of information. By
implementing data warehouse technology, they are better able to achieve strategic
advantages by obtaining better and more timely access to information about their
business, products, customers, and competition.
CASE STUDIES
Toyota Motor Company's Toyota Logistics Services (TLS) [4]
Toyota Logistics Services is responsible for the shipment and maintenance
information for all imported and domestically produced Toyota automobiles. Their
system for tracking vehicles imported into the various U.S. ports and those manufactured
in the U.S. factories proved tedious and overwhelming. TLS wanted a system that would
allow them to reduce the factory-to-dealer shipment time to seven days through the use of
inland transportation.
In order to do this, TLS decided to implement a data warehouse. They
incorporated Red Brick Systems' data warehousing software along with Brio
Technologies' BrioQuery SQL software to run on a Unix-based IBM RS6000 computer.
This system had to be able to manage the fast-paced changes in the logistics areas in
addition to providing the users with quick results for their queries.
Before installing the new system, annual reports on the cost of freight
transportation took up to three months to compile. In addition, TLS had to hire
temporary personnel to assist in the preparation of this data. After implementing this new
technology, the same report now takes only three weeks without the use of additional
personnel.
Toyota, realizing the success of this project, is committed to expanding this
project to include another ten subject areas. This will allow the users to further analyze
lead times and costs for getting vehicles accessorized and delivered to the dealerships by
either truck or rail.
General Motors [5]
General Motors, recognizing that it has a wealth of information stored in
many different places, has decided to develop a massive data warehouse that will include
detailed consumer information to be used for marketing purposes. This project will
connect the stand-alone customer databases from its car and truck divisions and its car
leasing, home mortgage, and credit card units to a central repository with a link to the
company's international operations. General Motors intends to promote its
BuyPower online car shopping service and cross-sell customers on General Motors'
financial services. In order to accomplish this task successfully, the company needs to better
understand who its customers are and what their needs are.
General Motors has not yet decided which hardware and software
technologies to use or which marketing strategy to adopt. However, the company is
currently researching and testing various systems around the world in an attempt to
determine which type of system is best for it.
MCI WorldCom and Industry Data Exchange (IDE) [6]
MCI WorldCom and the Industry Data Exchange Association (IDE) decided to join forces
and create several industry-specific, community-shared data warehouses that would serve
as a place for manufacturers to distribute product information using electronic data
interchange. Over 200 manufacturers are expected to participate in this project. In this
system, distributors who are authorized by the manufacturers would pay a monthly fee to
access the data warehouse and pull information into their own operational systems.
MCI WorldCom and IDE decided that their project would use a Sun Microsystems
ES5500 computer running Oracle's Oracle8 database. The disk capacity of their data
warehouse would be 160 GB and would hold information on over three million products.
The main objective of this project would be to serve as a central repository for suppliers
and distributors to maintain product specs, part numbers, pricing, and packaging
quantities. By incorporating a standard format, confusion and misinformation are
minimized while efficiency and productivity increase for all users.
REFERENCES
[1] Gagnon, G. (1999), “Data Warehousing, an Overview,” PC Magazine, March 9, pp. 245.
[2] Ewing, J. and Lais, S. (1999), “Data Warehousing for Information Retrieval,” Government Computer News, 18, 4, 47-50.
[3] Orr, K. (1996), “Data Warehousing Technology,” White Paper, The Ken Orr Institute.
[4] Shein, E. (1997), “Toyota Test Drives Warehouse,” PC Week, 14, 9, 58-60.
[5] Wallace, B. (1999), “Data Warehouse to Drive Online Marketing at GM,” Computerworld, July 5, pp. 6.
[6] Davis, B. (1999), “Data Warehouses Open Up,” Information Week, June 28, pp. 42.
[7] Schroeck, M. (1998), “Data Warehousing is Worth the Investment,” June 8, pp. 47.
[8] Gupta, V. (1997), “An Introduction to Data Warehousing,” White Paper, System Services Corporation, Chicago, Illinois.
[9] Nazemetz, W. John, “Lecture Slides,” Lecture No. 1, Fall 2000.
[10] “Data Warehouse Economics: ROI doubts?,” Data Warehousing Tools Bulletin, 11/01/96, pp. 2329.
[11] Wentz, Dave, “Data Warehousing, Better Access=Better Decisions=Better Business,” White Paper, Showcase Corporation.