Database Migration

White Paper: DB2 to Oracle Migration
MacDB2O 2006: Version – February 1, 2005
1. Overview
This document reviews Macrosoft’s process methodology for migrating IBM Mainframe DB2 databases to
Oracle. Much of our work in this area has been done in conjunction with work projects intended to migrate
mainframe applications to server environments (Unix, Linux, Windows). We have our own migration tools
(MMK – Mainframe Migration Toolkit), as well as expertise in using industry-standard tools such as Oracle
Migration Workbench, Micro Focus Revolve, etc.
We are development partners of IBM and Oracle, and are also partners of Micro Focus, the Migration
Transformation Consortium (MTC) for legacy migrations, and the Mainframe Migration Alliance (MMA).
The methodology we use requires minimal interaction with the client’s mainframe production
environment, thereby reducing costs and minimizing interruptions of the production system. This is
achieved through a phased approach in which most of the analysis, tool creation, mock conversion, and
testing work is done on a PC (back office). Our global delivery model (offshore, near-shore, and onsite)
further minimizes costs for the client.
2. Process Phases
The migration process generally includes the following steps:
1. Creation of overall Project Plan
2. Creation of overall Migration Plan
3. Database Schema Migration
4. Data Migration
These are explained below:
2.1 Creation of Overall Project Plan
This involves the following activities:
• Determine migration scenario
• Identify all migration tasks
• Develop infrastructure sustaining processes
• Determine the training requirements
• Determine resources that will be required (HW, SW, and people)
• Determine QA & Support processes and teams
• Identify customer responsibilities
• Determine MVS change management process
• Determine the environments: Test, Development, Production
• Determine the network connectivity (VPN) to both source and target systems
• Determine the amount of data to migrate
• Determine the amount of customization for online and batch
• Archiving of historical data to reduce the amount of data to migrate
• Determine the parallel testing strategy
• Determine details of integration with other systems
• Determine naming conventions and process standards
  [e.g., Oracle database object names can be 30 characters long, whereas in DB2 the limit is 18 characters]
• Prepare a proto target environment
• Test the migration scenario
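The naming-convention item above lends itself to an automated check. The following is a minimal sketch, assuming the 18- and 30-character limits quoted in the example; the table names and the prefix are hypothetical and only illustrate how a new naming standard can push a valid DB2 name past the target limit.

```python
# Limits taken from the example in the activity list; confirm them for the
# actual DB2 and Oracle versions in use.
DB2_MAX_NAME_LEN = 18
ORACLE_MAX_NAME_LEN = 30


def check_names(names, prefix="", max_len=ORACLE_MAX_NAME_LEN):
    """Return the prefixed names that would exceed max_len characters."""
    return [prefix + name for name in names if len(prefix + name) > max_len]


# Hypothetical source object names, each valid in DB2 (18 characters or fewer).
source_names = ["CUSTOMER_MASTER", "ORDER_LINE_HISTORY"]
assert all(len(name) <= DB2_MAX_NAME_LEN for name in source_names)

# A hypothetical target naming standard that adds a 14-character prefix.
too_long = check_names(source_names, prefix="MIGRATED_PROD_")
print(too_long)
```

Running such a check while the naming standard is still being drafted is cheaper than discovering over-long names during schema conversion.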
2.2 Creation of Overall Migration Plan
This involves the following activities:
• Analyze customization and platform differences
• Develop an infrastructure plan
• Design the layout of the databases, tablespaces, and tables
• Estimate CPU and storage (disk & memory) sizes
• Analyze and choose migration tools
• Analyze the fixes for present and future environments
• Develop plans for:
    - Print migration
    - Batch migration
    - Testing (Test Criteria & Strategy)
• Determine the customization required for source applications
• Determine the conversion required for JCL etc. to shell scripts
• Determine the conversion/customization required for third-party tools
• Determine possible EBCDIC-to-ASCII conversion issues
• Develop plans for:
    - Security
    - Backup and disaster recovery
    - Administration
    - Database monitoring
    - Performance monitoring
    - Change management
    - Stress test
    - Incident tracking
    - Vulnerability assessment
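Two of the EBCDIC-to-ASCII issues mentioned in the list above can be demonstrated with Python's built-in codecs. This is only a sketch: it assumes code page 037 (`cp037`), and the code page actually used on a given mainframe has to be confirmed per site. It shows that raw bytes must be decoded explicitly, and that the collating sequence differs, so sorted output can change after migration.

```python
# cp037 is one common EBCDIC code page; treating it as the source encoding
# is an assumption made for this illustration.
EBCDIC_CODEC = "cp037"


def ebcdic_to_text(raw: bytes) -> str:
    """Decode EBCDIC bytes into a Python (Unicode) string."""
    return raw.decode(EBCDIC_CODEC)


def ebcdic_sort(values):
    """Sort strings by their EBCDIC byte values, as the source system would."""
    return sorted(values, key=lambda s: s.encode(EBCDIC_CODEC))


print(ebcdic_to_text(b"\xC8\xC5\xD3\xD3\xD6"))  # EBCDIC bytes for "HELLO"
print(sorted(["a", "A", "1"]))                  # ASCII order: digits, upper, lower
print(ebcdic_sort(["a", "A", "1"]))             # EBCDIC order: lower, upper, digits
```

Any application logic or report that relies on ORDER BY results, key ranges, or byte-wise comparisons is a candidate for review because of this ordering difference.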
2.3 Database Schema Migration
The phases involved in DB schema migration are shown in Fig. 1 and explained below:
2.3.1 Extract Database Metadata
This phase involves extracting and analyzing the source database structure from the source DDL
statements. Modeling tools and reverse engineering can help in capturing all details of the schema.
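As an illustration of extracting structure from DDL, the sketch below pulls the table name and column definitions out of a single CREATE TABLE statement. It is a simplified stand-in for a real modeling or reverse-engineering tool: the DDL text is a made-up example, and the parser handles plain column definitions only (no constraints, defaults with commas, or comments).

```python
import re

COLUMN_RE = re.compile(r"(\w+)\s+(\w+(?:\s*\([^)]*\))?)\s*(.*)", re.S)


def split_top_level(text):
    """Split on commas that are not nested inside parentheses."""
    parts, depth, current = [], 0, []
    for ch in text:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        if ch == "," and depth == 0:
            parts.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
    if current:
        parts.append("".join(current).strip())
    return parts


def parse_create_table(ddl):
    """Return (table_name, columns) for a simple CREATE TABLE statement."""
    match = re.search(r"CREATE\s+TABLE\s+([\w.]+)\s*\((.*)\)", ddl, re.I | re.S)
    if not match:
        raise ValueError("not a CREATE TABLE statement")
    columns = []
    for item in split_top_level(match.group(2)):
        col = COLUMN_RE.match(item)
        if not col:
            raise ValueError(f"cannot parse column definition: {item!r}")
        name, data_type, rest = col.groups()
        columns.append({
            "name": name,
            "type": data_type,
            "nullable": "NOT NULL" not in rest.upper(),
        })
    return match.group(1), columns


# Hypothetical source DDL, for illustration only.
ddl = """CREATE TABLE PAYROLL.EMP
  (ID INTEGER NOT NULL,
   NAME VARCHAR(30),
   SALARY DECIMAL(9,2))
  IN DB1.TS1;"""

table, columns = parse_create_table(ddl)
print(table)
for column in columns:
    print(column)
```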
2.3.2 Convert Database Objects
This is the major step of the schema migration process. All database objects in the source database need
to be converted to the equivalent objects in the target system. Typically, objects such as data types,
tables, columns, views, indexes, stored procedures, triggers, packages, sequences, authorities, and
functions need to be converted. Factors such as data type, scale, precision, length, default values, and
null handling for table columns, functions, and stored procedures can cause issues. Refer to Section 4 for
examples of terminology differences between DB2 and Oracle.
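A data type mapping is usually the first conversion rule to be written down. The sketch below shows one possible shape for such a rule table. The specific mappings are illustrative assumptions (Section 4 only states that Smallint and Decimal correspond to NUMBER); the right precision and target type must be decided per project, and unknown types are returned as `None` so they can be reviewed manually.

```python
import re

# Illustrative mapping only, not a complete or authoritative list.
TYPE_MAP = {
    "SMALLINT": "NUMBER(5)",    # SMALLINT holds at most 5 decimal digits
    "INTEGER": "NUMBER(10)",    # INTEGER holds at most 10 decimal digits
    "DECIMAL": "NUMBER",        # precision and scale are carried over
    "VARCHAR": "VARCHAR2",      # length is carried over
}


def map_type(db2_type):
    """Map a DB2 column type to an Oracle type, or None if no rule exists."""
    match = re.fullmatch(r"\s*(\w+)\s*(\(.*\))?\s*", db2_type)
    if not match:
        raise ValueError(f"unrecognized type syntax: {db2_type!r}")
    base = match.group(1).upper()
    args = (match.group(2) or "").replace(" ", "")
    target = TYPE_MAP.get(base)
    if target is None:
        return None  # no rule: flag for manual review
    if "(" in target:
        return target  # the rule already fixes the precision
    return target + args


for source_type in ["smallint", "DECIMAL(9, 2)", "VARCHAR(30)", "GRAPHIC(10)"]:
    print(source_type, "->", map_type(source_type))
```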
2.3.3 Convert Queries
This is the next major phase in the database schema migration process. Even though the basic SQL
commands are the same, SQL dialects differ from engine to engine (refer to Section 4 for examples of
differences between Oracle PL/SQL and DB2 SQL). SQL translation requires good expertise and
knowledge of both the source and target systems in order to avoid performance issues.
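To give a flavor of dialect differences, the sketch below rewrites two well-known DB2 constructs into their Oracle counterparts: the `CURRENT TIMESTAMP` special register and the `SYSIBM.SYSDUMMY1` dummy table. It is a deliberately naive, pattern-based illustration and not a SQL translator; it does not parse SQL, so it would also rewrite matching text inside string literals, and it does nothing about performance-related rewrites.

```python
import re

# Two simple, well-known substitutions; a real conversion needs many more
# rules plus manual review of each converted statement.
REWRITES = [
    (re.compile(r"\bCURRENT\s+TIMESTAMP\b", re.I), "SYSTIMESTAMP"),
    (re.compile(r"\bSYSIBM\.SYSDUMMY1\b", re.I), "DUAL"),
]


def rewrite(sql):
    """Apply each pattern-based rewrite rule to the SQL text in order."""
    for pattern, replacement in REWRITES:
        sql = pattern.sub(replacement, sql)
    return sql


db2_sql = "SELECT CURRENT TIMESTAMP FROM SYSIBM.SYSDUMMY1"
print(rewrite(db2_sql))
```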
2.3.4 Implement Converted Objects
This phase involves building the database structure on the target platform through scripts or the facilities
provided in the target system. Enhancements related to the schema or performance can also be
considered in this phase, utilizing the special features of the target system.
2.4 Data Migration
The phases involved in data migration are shown in Fig. 2 and explained below:
2.4.1 Data Analysis
This phase involves a walkthrough of the data presently in the database (or in use). Some data that is
well accommodated in the source system may not be accommodated in the target system. Usually the
volume of data is large and a full walkthrough may not be possible. In such cases, random samples are
taken to identify data items that can cause problems during movement.
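The sampling approach can be sketched as follows. The rows, column names, and length limits are hypothetical. The two checks are examples of values that may fit the source but not the target: strings longer than the target column, and empty strings, which Oracle stores as NULL in VARCHAR2 columns.

```python
import random


def sample_rows(rows, sample_size, seed=0):
    """Return a reproducible random sample (all rows if the table is small)."""
    if sample_size >= len(rows):
        return list(rows)
    return random.Random(seed).sample(rows, sample_size)


def find_problem_values(rows, max_lengths):
    """Flag values that may not survive the move to the target system.

    rows: list of dicts, each with an "id" key (hypothetical layout).
    max_lengths: column name -> maximum length allowed in the target.
    """
    problems = []
    for row in rows:
        for column, limit in max_lengths.items():
            value = row.get(column)
            if value == "":
                problems.append((row["id"], column, "empty string"))
            elif isinstance(value, str) and len(value) > limit:
                problems.append((row["id"], column, "too long"))
    return problems


# Hypothetical data, for illustration only.
rows = [
    {"id": 1, "name": "SMITH", "city": ""},
    {"id": 2, "name": "A" * 40, "city": "PRINCETON"},
    {"id": 3, "name": "JONES", "city": "NEWARK"},
]
limits = {"name": 30, "city": 20}

issues = find_problem_values(sample_rows(rows, 100), limits)
for issue in issues:
    print(issue)
```

A fixed seed keeps the sample reproducible, so that a finding can be traced back to the same rows in a later review.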
2.4.2 Data Cleanup / Enrichment
A data cleanup / enrichment prior to migration can help in effective movement of data. There could be
obsolete or unused items, as well as items that will not affect the source or target system if modified. If
this step is performed well in advance, the subsequent phases in this process will benefit significantly.
2.4.3 Conversion Study
This phase involves assimilation of the outputs from the above two phases, and detailed study for
finalization of a conversion strategy. This phase can be categorized into the following steps:
• Fitment / Conversion Study – Output of this phase is a study report detailing the changes
  required in the data items for the movement.
• Formation of Migration Strategy – Output of this phase is the “Migration Strategy Document”
  detailing the planned process of migration, tools planned to be used etc.
• Finalization of Scope of Migration – In this phase, the scope of migration is defined. Items such
  as scope, limitations, performance and maintenance issues etc. need to be well defined.
• Finalization of Acceptance Criteria – This phase will define the acceptance / test criteria, test
  process and test procedures to ensure that the data movement is fault free.
2.4.3.1 Conversion Strategy Signoff
In this phase, the user (client) approves all the documents mentioned above. This sign-off is crucial when
handling critical data.
2.4.4 Conversion Tool Preparation
In this phase, the tools required for the data movement are developed (or customized). In production
systems, these tools are crucial because the final data movement is done in one shot (usually in 1 or 2 days
during off hours or holidays). Tool preparation is a full project activity of its own, involving all phases
of the SDLC.
2.4.5 Mock Conversion
In this phase, a mock conversion is performed using the existing data in the source system. This may
involve several rounds, as below:
• Mock Conversion Round 1
• Fixing of mismatches observed in round 1
• Mock Conversion Round 2
• Fixing of mismatches observed in round 2
…
It is very important to document the change records during this phase.
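Detecting the mismatches after each round is typically automated. The sketch below compares row counts and an order-independent content fingerprint per table. The table data is made up, and the approach assumes that values from both systems have already been normalized to the same Python types (for example, after EBCDIC decoding and type conversion); otherwise equal data could be reported as different.

```python
import hashlib


def table_fingerprint(rows):
    """Return (row_count, digest) for row tuples, independent of row order."""
    digests = sorted(
        hashlib.sha256(repr(tuple(row)).encode("utf-8")).hexdigest()
        for row in rows
    )
    combined = hashlib.sha256("".join(digests).encode("ascii")).hexdigest()
    return len(digests), combined


def compare_tables(source, target):
    """Compare dicts of table name -> list of row tuples; return mismatches."""
    mismatches = []
    for name in sorted(source):
        if name not in target:
            mismatches.append((name, "missing in target"))
            continue
        src_count, src_digest = table_fingerprint(source[name])
        tgt_count, tgt_digest = table_fingerprint(target[name])
        if src_count != tgt_count:
            mismatches.append((name, f"row count {src_count} != {tgt_count}"))
        elif src_digest != tgt_digest:
            mismatches.append((name, "content differs"))
    return mismatches


# Hypothetical extracts from the source and target after a mock round.
source = {"EMP": [(1, "SMITH"), (2, "JONES")], "DEPT": [(10, "SALES")]}
target = {"EMP": [(2, "JONES"), (1, "SMITH")], "DEPT": [(10, "SALE")]}

report = compare_tables(source, target)
print(report)
```

The list returned for each round can be kept alongside the change records, so that each fix can be tied to the mismatch it resolved.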
2.4.6 Conversion for Parallel Run
Usually a pre-production system is set up for the parallel run, to which the data migration can be
performed to ensure that the migration is problem-free. In this phase, a one-shot data migration from the
source system to the pre-production system is performed. Detailed testing is carried out to ensure that the
data migration is fault free. Detailed performance testing and monitoring are also done in this phase.
2.4.7 Conversion for Live System
This is the final step of actual data movement from source system to target system. In production
systems, this should be done in one shot when the system is not active (usually off hours or holidays). In
24x7 systems, the system may have to be brought down to off-line mode for the data movement.
3. General Milestones and Deliverables
Generally we envisage documents that include the following:
Planning
• Overall Project Plan Document
• Overall Migration Plan Document
• System Overview document
• Plan – Sign Off
Analysis & Design
• Gap Analysis document
• SRS - Migration Specification document
• Data Migration Strategy document
• Migration Tools (Design/Test/Usage) documents
• User Acceptance Test Plan (UAT) document
• Test Criteria, Plan & Procedures
• Migration Strategy & Tools – Sign Off
• Design – Sign Off
Schema Migration
• Schema Migration Reports
• Unit Test Reports (Schema Validation)
• Schema Migration – Sign Off
Data Migration
• Mockup Data Migration Reports (including cleanup details)
• Unit Test Reports
• Data Migration – Sign Off
User Acceptance Test
• Test Reports
• UAT – Sign Off
Installation/Live Run
• Live Migration Test Reports
• Live Cutover – Sign Off
Delivery & Post implementation Support
• Parallel Run Reports
• Tools & Application Software developed
• All other documents
• Project – Sign Off
4. Examples of Differences between DB2 and Oracle
Terminology, with its meaning in DB2 and in Oracle:

Database
    DB2: A subsystem can have more than one database. Databases are used to logically group
    application data. All databases share the same system catalogs, system parameters, and
    processes in the subsystem. DBADM authority is granted on the database level. SYSADM
    authority is granted at the subsystem level.
    Oracle: Each instance has one database and one set of system catalog tables.

Tablespace
    DB2: A database is logically divided into tablespaces. There are several tablespace types:
    simple, segmented, partitioned and large partitioned (for 16 TB tables). A non-partitioned
    tablespace points to one physical VSAM file on DASD. A partitioned tablespace points to one
    VSAM file per partition on DASD. A segmented or simple tablespace can contain one or more
    tables.
    Oracle: A database is logically divided into tablespaces. A tablespace can point to one or
    more physical database files on disk. One or more tables can reside in a tablespace.

Blocks
    DB2: Equivalent to pages; 4 K, 8 K, 16 K, 32 K.
    Oracle: The smallest unit of database storage. Database files are formatted into blocks,
    which can be from 2 K to 16 K.

Extents
    DB2: The unit by which storage is allocated for a VSAM file. The size of the primary and
    secondary extents is specified in the CREATE TABLESPACE statement. A VSAM file can grow up
    to a maximum of 119 secondary extents. Extents are made up of contiguous pages.
    Oracle: The unit by which storage is allocated in a database file. The size of the primary
    and secondary extents is specified in the STORAGE clause of the CREATE TABLE or CREATE INDEX
    statements or defaults to the sizes specified in the CREATE TABLESPACE statement. Extents
    are allocated until there is no more free space in the files that make up the tablespace, or
    the maximum number of extents has been reached. The size of the file is specified in the
    CREATE TABLESPACE statement. Extents are made up of contiguous blocks of storage.

Stogroups
    DB2: A series of DASD volumes assigned a unique name and used to allocate VSAM datasets for
    DB2 tablespaces and indexes.
    Oracle: No equivalent.

Stored Procedures
    DB2: Stored procedures are written in C, C++, COBOL, Assembler, PL/1 or the new DB2 SQL
    Stored Procedure language. The compiled host language is stored on the DB2 server and the
    compiled SQL is stored on the database.
    Oracle: Written in PL/SQL, Java etc. Stored procedures are stored in an Oracle table and
    executed from within the database.

Plan
    DB2: A plan is an executable module of SQL that is composed of one or more packages and was
    created from a DBRM. A DBRM is a module of uncompiled SQL statements that were extracted
    from the source program by pre-compilation. A DBRM is bound into a plan or a package.
    Oracle: No equivalent.

Clusters
    DB2: No equivalent.
    Oracle: Clusters are an optional method of storing data. This approach creates an indexed
    cluster for groups of tables frequently joined. Each value for the cluster index is stored
    only once. The rows of a table that contain the clustered key value are physically stored
    together on disk.

Clustering Index
    DB2: An index created on a column of a table where the data values are stored in the same
    physical sequence as the index. Allows for fast sequential access.
    Oracle: No equivalent.

Secondary Authid
    DB2: Secondary Authid or RACF Group. Privileges can be granted to a secondary authid.
    Primary authids are assigned to the secondary authid Group. Primary authids inherit all
    privileges granted to the secondary authid (group) they are in.
    Oracle: No direct equivalent in Oracle. Groups of privileges known as roles can be granted
    to a user ID.

Package
    DB2: A package consists of a single program of executable SQL and the access paths to that
    SQL. The package is stored on the database and invoked by the host language executable. A
    package is created by doing a BIND. A package may be part of a PLAN.
    Oracle: No equivalent as known in Oracle. A “package” in Oracle has another meaning. Package
    is written in PL/SQL and allows you to group all related programming such as stored
    procedures, functions, and variables in one database object that can be shared by
    applications.

Other examples of differences (listed in corresponding order)
    DB2: PRIQTY, SECQTY, Smallint\Decimal, FREEPAGE, etc.
    Oracle: INITIAL, NEXT, NUMBER, FREELIST, etc.
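The PRIQTY/SECQTY and INITIAL/NEXT correspondence in the last row can be turned into a small conversion helper. This is a sketch under stated assumptions: it treats the DB2 quantities as kilobytes and carries them over unchanged, whereas in practice extent sizing is usually revisited for the target platform. The function name and the sample values are hypothetical.

```python
def oracle_storage_clause(priqty_kb, secqty_kb):
    """Build an Oracle STORAGE clause from DB2 PRIQTY/SECQTY values.

    Assumes both quantities are expressed in kilobytes and are simply
    carried over; real projects normally re-size extents for the target.
    """
    if priqty_kb <= 0 or secqty_kb < 0:
        raise ValueError("PRIQTY must be positive and SECQTY non-negative")
    return f"STORAGE (INITIAL {priqty_kb}K NEXT {secqty_kb}K)"


# Hypothetical values taken from a DB2 CREATE TABLESPACE statement.
clause = oracle_storage_clause(720, 72)
print(clause)
```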
5. Database Migration – General Questionnaire (Example)
A. Customer Data
Customer Name:___________________________________ Phone no.:____________________
Contact Person: ____________________________________Fax no.: ______________________
B. Technical Data
B.1 Source System
Hardware Model ___________________
Operating System Name _____________________
Operating System Version ____________
Database DB2
Database Version ___________
Size of Production Database ____________ No. of Concurrent Users in Production ____________
Avg. No. of On-line Transactions in Production per hour __________________________________
No. of Batch Processes _____________
(a) Production No. of Databases ________ No. of Tablespaces ________ Size _______
(b) Test No. of Databases ________ No. of Tablespaces ________ Size _______
(c) Development No. of Databases ________ No. of Tablespaces ________ Size _______
(d) Other ___________ No. of Databases ________ No. of Tablespaces ________ Size _______
List Names and Number of rows for the 10 largest tables in Production:
__________________ ____________
__________________ ____________
__________________ ____________
__________________ ____________
__________________ ____________
__________________ ____________
Have any stored procedures been written to access the database? ( ) Yes ( ) No How many? _________
What is the average number of SQL calls per stored procedure? ____________
How long are the stored procedures (total number of statements)? ____________
Have any triggers been written to access the database? ( ) Yes ( ) No How many? ________
What is the average number of SQL statements per trigger? _____________
How long are the triggers (total number of statements)? _____________
Is an archival process in place? ( ) Yes ( ) No
Brief description of Hardware Configuration _____________________________________________
Brief description of application, third party tools & Host languages __________________________
__________________________________________________________________
B.2 Target System
Hardware __________________________
Operating System Name ____________________
Operating System Version ____________
Database: Oracle
Database Version _________
Do migration tools such as “Oracle Migration Workbench” exist? ( ) Yes ( ) No
Brief description of Hardware Configuration _____________________________________________
Brief description of application, third party tools & Host languages __________________________
________________________________________________________________________