Assessment Results
IU Insight Discovery Workshop
Version 1.5
October 2006
IU and Oracle Confidential
Insight Document Title: IU Insight Deliverable v1.5.doc
Revision date:
3/8/2016
July 2006
Oracle USA, Inc.
World Headquarters
500 Oracle Parkway
Redwood Shores, CA 94065
U.S.A.
Worldwide Inquiries:
Phone: +1 650.506.7000
Fax: +1 650.506.7200
www.oracle.com
Copyright © 2006, Oracle, All rights reserved
This document is provided for information purposes only and the contents hereof are subject to change without notice. This document is not
warranted error-free, nor subject to any other warranties or conditions, whether expressed orally or implied in law, including implied warranties
and conditions of merchantability or fitness for a particular purpose. We specifically disclaim any liability with respect to this document and no
contractual obligations are formed either directly or indirectly by this document. This document may not be reproduced or transmitted in any
form or by any means, electronic or mechanical, for any purpose, without our prior written permission. Oracle is a registered trademark of
Oracle Corporation and/or its affiliates. Other names may be trademarks of their respective owners.
Contents
Executive Commitment and Partnership
Executive Overview
  Insight Approach and Scope
  About Indiana University
  Summary of Findings
  Summary of Recommendations
Recommendations
  High Availability, and Grid Based Architectures
    The DBA Infrastructure Instance
    The Oncourse CL Instance
  Improve planned outage windows
  Best Practices
    Disaster Recovery
    Monitoring and Maintenance
    Service Level Agreements
  Decision Support System
Business Analysis
List of Appendices
Executive Commitment and Partnership
Your success is how we define our success. We look forward to continually working
with you to move from your current state to your desired state and value realization.
Through the Insight initiative Oracle Executive Management is prepared to work in
close partnership with IU to support this strategic effort by jointly defining the
architecture, sharing best practices and reviewing ongoing project efforts.
On behalf of the entire Oracle Insight team, we thank you for the opportunity to work
with you on your technology roadmap. We look forward to continued success.
Respectfully,
Jim Zemaitis
Regional Vice President
Oracle Corporation
Executive Overview
Insight Approach and Scope
The Oracle North American Strategic Accounts organization appreciates Indiana
University’s (IU) participation in Oracle’s Insight program. The Insight program is
designed to help Oracle’s most important clients realize increased value for their
investment in Oracle.
At IU’s request, a team of Oracle solution architects conducted a discovery workshop
focused on infrastructure optimization. This included a review of the current
infrastructure, examination of the potential to support flexible grid architecture,
recommendations relating to administration and manageability best practices, and a
supporting business case/ROI for infrastructure consolidation.
The discovery sessions were a dialogue during which both sides learned. At the end
of the discovery sessions, the Oracle team provided and validated their initial
analysis and recommendations. The first draft of this document will be reviewed by
both teams together. The University’s comments will be incorporated into the final
deliverable. As a result, this solution document is a collaborative effort between
Oracle and IU, and we thank you for the opportunity. Our goal is to give you access
to the best thinking and experience from Oracle and our vast customer base to assist
you in meeting your business goals.
As part of the deliverable, we provide a CD with all documents, white papers, and
information referenced during this Insight.
About Indiana University
Indiana University has eight campuses: the original campus in Bloomington, which is
a residential campus; an urban campus in Indianapolis, which also includes the IU
Medical Center; and six regional campuses in the Indiana cities of Gary, South Bend,
Fort Wayne, Kokomo, Richmond, and New Albany. IU has:
• More than 92,000 students on its eight campuses
• 922 degree programs
• Almost 475,000 living alumni, including 230,000 working in Indiana
• An annual operating budget of $2.2 billion
• 16,000 employees, including faculty and professional and support staff
• More than 150 research centers and institutes
• An endowment of more than $1 billion
Indiana University is internationally known for the quality of its academic programs
and attracts students from all over the world. At the same time, IU plays a key role in
the economic and social well-being of Indiana residents, offering educational,
cultural, and economic benefits to the state.
Indiana University is a leader in fostering the multidisciplinary research essential to
solving challenges of life and health. It is also a leader in forming the partnerships
with business, industry, government, and other academic institutions that lead to
important research and development and economic growth.
Summary of Findings
Indiana University has done an excellent job “doing more with less”, in an
environment where state funding is declining and the demands of a 24 x 7 user
community are increasing.
Indiana University is a “university in transition”. The trustees have begun a process
of re-thinking the organization structure that has been in place for 30 years. The
President of the University will be leaving in less than two years. The university is reassessing its priorities to face the changing demands of the educational landscape.
Several key factors in this re-assessment, and initiatives communicated to the Insight
team include:
• An increase in on-line learning
• Growth in research
• Participation in the open-source community
• Reduction in state funding
• All IT activities at IU must create significant value.
• Improving physical facilities
• Finding better ways to keep up with technology
The Insight team found that the U.I.T.S. infrastructure and U.I.T.S. support teams at
IU displayed the following characteristics:
• Maximizing resources - doing more with less
• Good systems administration skills and knowledge, from both a hardware
  and operating system perspective
• Leveraging of the IBM relationship for attractive pricing and support
• Good marketing of IU as a world-class research institution (9th largest
  supercomputer in the world dedicated to research)
• U.I.T.S. demonstrated they have a clear prioritized list of projects.
• Departments are coordinated and work well together.
• U.I.T.S. is respected and consulted by other groups.
• Good DBA practices and skills. For example: maintaining the discipline to
  support only two versions of the Oracle database for production systems.
The Insight team also learned that there are several challenges facing the university:
• Do more with less
• Challenges relating to support of a 24 x 7 Environment
    • Reducing planned downtime
    • Collaboration in Open-Source Community
        - Site failover issues
        - Staff Support
        - Open Source development teams are located across time-zones
        - No acceptable down-time
        - In addition to production, there is a requirement for both the
          development environment and test environment to be available 24 x 7
    • Need rolling upgrades
    • Removing single points of failure
• Defining and scoping of Service Level Agreements (SLA’s)
• Develop a better working relationship with Oracle
• Data growth and data movement
• Challenges relating to identity management and security
It is with these initiatives and challenges in mind that the Insight team developed the
recommendations described in the next section.
Summary of Recommendations
Implement high availability (HA). The Insight team found that there is a need for
some database systems at IU to be highly available and to leverage clustering
technology. This HA technology will provide:
• 24 x 7 availability
• Shortened maintenance windows
• More productive management
• Better, more accurate system monitoring
• Better utilization of hardware
• Capacity on demand
The Insight team recommends that U.I.T.S. first implement HA in the internal-facing
DBA Infrastructure Instance of the Oracle database. This is a good system to learn the
clustering technology. After the DBA Infrastructure Instance is made ready for HA,
the Insight team recommends deployment of Oncourse CL on a HA database
infrastructure. There is a business requirement for Oncourse CL to be available 24 x 7,
and therefore this is a good candidate for clustering.
In this document, we have also included several recommendations and best-practices
relating to reducing planned maintenance windows, as well as several management
and monitoring best practices.
The Insight team recommends an investigation to improve the decision support
system (DSS). The large volume of data that is rebuilt every day and the large
amounts of data that flow from the OLTP systems to staging areas to the DSS present
an opportunity for process and architectural improvement. The Insight team
acknowledges that the IU team is aware of this challenge, and we look forward to
working together to craft a solution that brings significant value to IU. Technical
details of this recommendation, and all the recommendations made by the Insight
team, can be found in the recommendations section of this document.
After presenting the detailed recommendations in this document, we provide a
business analysis section to highlight the costs and benefits associated with the
detailed recommendations.
Because the scope of this engagement was limited to infrastructure, we did not go
into detail during our discovery around the DSS processes and reporting systems that
rely on DSS. Perhaps another Insight, focused on Business Intelligence, would
provide additional value. During a Business Intelligence Insight, we would assemble
a different Insight team, with deep knowledge of data warehousing and business
intelligence. Additionally, an Insight focused on the Roadmap to Fusion, or Security
may be of interest to IU going forward.
Recommendations
High Availability, and Grid Based Architectures
Oracle introduced Real Application Clusters (RAC) with the release of Oracle 9i.
Oracle RAC allows multiple database instances, each running on its own node, to
open a single shared database and transact simultaneously. This technology
eliminates the database as a single point of failure. RAC is an important part of an
overall strategy to reduce unplanned outages.
Higher service levels
Grid computing reduces the time and the effort necessary to fail over due to its
active / active architecture. If a database node fails for any reason (CPU board
failure, etc.), the remaining nodes continue to serve the database requests. The
system is still available. Those users who were connected through the failed node
are reconnected to another server within a few seconds, and users who were connected
to other servers continue their work without any interruption, as shown in the
illustration. This approach allows for continuous processing without interrupting
users and eliminates downtime due to manual node failover.
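The failover behavior described above can be illustrated with a small simulation. This is a toy model only, not Oracle code: the `Cluster` class, the node and user names, and the least-loaded placement rule are all assumptions made for illustration. In a real RAC deployment, reconnection is handled by Oracle's client and network layers.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Cluster:
    """Toy model of an active/active cluster (hypothetical, not an Oracle API)."""
    nodes: List[str]
    sessions: Dict[str, str] = field(default_factory=dict)  # user -> node

    def connect(self, user: str) -> str:
        # Place each session on the least-loaded surviving node.
        load = {n: 0 for n in self.nodes}
        for node in self.sessions.values():
            if node in load:  # ignore sessions still pointing at a failed node
                load[node] += 1
        target = min(self.nodes, key=lambda n: load[n])
        self.sessions[user] = target
        return target

    def fail_node(self, failed: str) -> List[str]:
        """Remove a node and reconnect only the sessions it was serving."""
        self.nodes.remove(failed)
        if not self.nodes:
            raise RuntimeError("no surviving nodes: the database is unavailable")
        moved = [u for u, n in self.sessions.items() if n == failed]
        for user in moved:
            del self.sessions[user]
            self.connect(user)
        return moved


cluster = Cluster(nodes=["node1", "node2", "node3"])
for name in ["ann", "bob", "cho", "dee", "eli", "fay"]:
    cluster.connect(name)

before = dict(cluster.sessions)
moved = cluster.fail_node("node2")  # only node2's sessions are reconnected
```

In this sketch, sessions on the surviving nodes are never touched, which mirrors the claim that only users connected through the failed node see a brief reconnect.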
In addition to providing high availability, Oracle RAC typically provides:
• Better scalability – Oracle RAC allows the database to scale horizontally.
  As the environment reaches its capacity limits, additional nodes can be
  added on demand to increase capacity.
• Better utilization of hardware - Nodes can be shared amongst clusters in
  the overall environment, allowing unused resource cycles to be better
  utilized. For example, a system that requires nightly batch cycles can
  borrow them from a typical OLTP system that requires them during the
  day.
• Reduced cost – Because of its scalability features, Oracle RAC can
  leverage less expensive commodity hardware, typically reducing the TCO
  of the environment. For more information regarding costs, please see the
  business analysis section of this document.
For more information on GRID computing and RAC please see:
http://www.oracle.com/technologies/grid/index.html
or the appendices on the companion CD.
The DBA Infrastructure Instance
The DBA Infrastructure instance, also known as the OEM Instance (Oracle Enterprise
Manager), is the first system the Insight team recommends moving toward a
high-availability architecture. The reason for this choice is that the DBA
Infrastructure Instance is an internal-facing application, without a user community,
and will provide an excellent environment to learn the nuances of Oracle cluster
technology.
Key Findings
In a later section, we discuss the potential for improving the DSS Architecture at
U.I.T.S. Improving the architecture of the DSS at U.I.T.S. would allow the DBA staff
to better leverage the features of Oracle RMAN, such as integral backups, central
view of a backup catalog, flash backups and recoveries, improved backup
performance, etc. Currently, backups of the DSS environment take too long to
leverage the features of RMAN.
Definition - RMAN (Recovery Manager): An Oracle tool that allows you to perform
physical database backups in a more controlled manner. With RMAN, the backups
and recoveries are managed for you through the RMAN toolset as well as with a GUI
interface using Enterprise Manager. Syntax is simplified and the scripting is
powerful and consistent across platforms. This is the toolset that Oracle has been
investing in and moving toward since Oracle 8i.
Currently, the backup and recovery is connected to an IBM Tivoli Storage Manager
(TSM) environment.
When the IU staff begins to use RMAN as their backup standard, this RMAN
environment will require a higher level of availability. Currently, in this environment,
the database is a single point of failure.
The production OEM environment is currently on the 4-CPU, 8 GB LPAR esdb13 with
an rperf rating of 8.69.
Issues:
• If RMAN were used as the U.I.T.S. backup and recovery standard, the RMAN
  database would need to be architected for high availability. This is because
  RMAN contains the information necessary to recover from an unplanned
  outage.
• There may also be licensing issues for the media manager integration.
Recommendation: Use Oracle RMAN for Backups
The Insight team recommends that U.I.T.S. use Oracle RMAN as its mechanism for
backup and recovery. Oracle RMAN maintains an online catalog of all backups,
allowing a view of the University’s backup history. New features in Oracle RMAN,
most notably incremental backups, can significantly improve the backup and recovery
time.
Benefits:
• Allows a catalog view of backup and recovery history
• Significantly improves backup and recovery time by leveraging incremental
  backups
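As a rough illustration of an incremental backup schedule, the sketch below builds RMAN command text: a level 0 backup once a week and level 1 backups on the other days. The function name and the weekly schedule (level 0 on Sunday) are assumptions for illustration, not a U.I.T.S. policy. The sketch only generates text and does not run RMAN; connection details and media manager (TSM) channel configuration are site-specific and are omitted.

```python
import datetime


def rman_backup_script(day: datetime.date, full_weekday: int = 6) -> str:
    """Return RMAN command text for the given day.

    full_weekday follows Python's convention (Monday=0 ... Sunday=6).
    The Sunday default is an assumption made for this sketch.
    """
    # Level 0 is the baseline copy of all used blocks; a level 1
    # (differential) backup copies only the blocks changed since the
    # most recent level 0 or level 1 backup.
    level = 0 if day.weekday() == full_weekday else 1
    return "\n".join([
        "RUN {",
        f"  BACKUP INCREMENTAL LEVEL {level} DATABASE;",
        "}",
    ])


sunday_script = rman_backup_script(datetime.date(2006, 10, 1))  # a Sunday
monday_script = rman_backup_script(datetime.date(2006, 10, 2))  # a Monday
```

Because the weekday backups copy only changed blocks, they are typically much smaller than the weekly baseline, which is where the improvement in backup time comes from.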
Recommendation: Architect the DBA Infrastructure Instance for HA
Clustering the DBA Infrastructure instance would allow the RMAN database to be
highly available. With the release of Oracle 9i, Oracle introduced the concept of Real
Application Clusters. Oracle RAC allows multiple database nodes to simultaneously
access a single shared storage and act as a single database. Therefore, when one of
the nodes fails, the database continues to process requests.
Besides high availability, Oracle RAC provides other benefits. Environments
clustered with RAC can scale horizontally; as the environment nears its capacity,
nodes can be added to the cluster. Because of the horizontal scalability, you can
leverage commodity hardware for the architecture, often at a significant cost
savings. Oracle RAC allows for better utilization of hardware as well; nodes can be
shared between clusters as determined by the workload.
With the inception of Oracle 10g RAC, Oracle introduced two key new features:
• Oracle Cluster Ready Services (CRS)
• Oracle Automatic Storage Management (ASM)
Prior to Oracle 10g, third-party vendor clusterware was required to run Oracle RAC.
With CRS and ASM, this is no longer a necessity. CRS maintains a repository of all
clusters within the University. This allows for ease in adding and removing nodes,
sharing nodes between clusters, etc. ASM is a purpose-built volume manager for
Oracle database files.
For more information on CRS, please see Oracle Tech Net: Clustering at
http://www.oracle.com/technology/products/database/clustering/index.html or
see the appendices on the companion CD.
For more information on ASM, please see Oracle Tech Net: ASM at
http://www.oracle.com/technology/products/database/asm/index.html or see the
appendices on the companion CD.
For the U.I.T.S. case, the esdb13 environment could be broken into two separate
2-CPU, 8 GB nodes. These nodes would serve the same OEM production database,
clustered together using Oracle RAC.
Figure 1 Two-Node Cluster
The nodes either could be LPARs as they are today, or could be implemented using
commodity-type hardware. Further details of the economic impact behind this
decision are included in the Business Analysis section of this document.
Benefits:
• Database is no longer a single point of failure
• Provides for horizontal scalability of the database
• Better utilization of hardware
• Potential home for other critical administration repositories such as those
  required for Oracle’s Automatic Storage Management (ASM) and grid
  control.
The Oncourse CL Instance
Key Findings
Using the DBA Infrastructure instance as the foundation for highly available systems
at IU, this reference architecture can be applied to all systems requiring high
availability.
During the Insight discussions the Oncourse CL system was targeted as one of the
next systems that would require the highest levels of availability. The Oncourse CL
system will likely have to support a larger number of users than currently supported,
so scalability will be a requirement. Additionally, due to the 24 x 7 nature of
Oncourse CL, a highly available system will also be a requirement.
The current Oncourse CL production system resides on the 6-CPU, 8 GB LPAR
esdb06. This LPAR is on an IBM p570 with 16 processors, 64 GB of memory and an
rperf rating of 68.40.
Recommendation - Architect Oncourse CL for HA
To configure Oncourse CL for high availability, a three-node configuration with
2 CPUs and 8 GB per node could be utilized, as shown below:
Figure 2 Three-Node Cluster
Besides high availability, the Oncourse CL system is the perfect case study for
horizontal scalability. As this application is new, its performance characteristics are
not yet well defined. Oracle RAC would allow the architecture to grow in lock-step
with the Oncourse CL user adoption.
Benefits:
• Provide the Oncourse CL application on a highly available database
• Allow the Oncourse CL database resources to scale horizontally
• Allow the dynamic allocation of new system resources while the application
  is still accessible to users
• Clustering the Oncourse CL database allows for online maintenance and
  rolling upgrades
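To make the availability trade-off of the proposed node layouts concrete, the following sketch computes how much CPU capacity survives the loss of one node. It counts CPUs only. Ignoring rperf ratings, memory, and cluster overhead is a simplifying assumption, and the function name is hypothetical.

```python
from typing import Tuple


def surviving_cpus(nodes: int, cpus_per_node: int, failed: int = 1) -> Tuple[int, float]:
    """Return (CPUs still available, fraction of total) after `failed` nodes are lost."""
    if nodes < 1 or not 0 <= failed <= nodes:
        raise ValueError("need nodes >= 1 and 0 <= failed <= nodes")
    total = nodes * cpus_per_node
    remaining = (nodes - failed) * cpus_per_node
    return remaining, remaining / total


# Layouts taken from the text: two 2-CPU nodes for the DBA Infrastructure
# instance and three 2-CPU nodes for Oncourse CL. The single 6-CPU LPAR
# stands for the current, unclustered Oncourse CL configuration.
dba_infra = surviving_cpus(nodes=2, cpus_per_node=2)
oncourse = surviving_cpus(nodes=3, cpus_per_node=2)
single_lpar = surviving_cpus(nodes=1, cpus_per_node=6)
```

Under these assumptions, the three-node layout keeps four of six CPUs in service after a single node failure, whereas the single LPAR keeps none.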
Improve planned outage windows
Key Findings
Reducing the duration of planned outage windows was identified as a key area of
contention for the IU project teams. Outages for the PeopleSoft SIS can currently
last hours. Many of these outages occur during the day in the middle of the week.
IU works on many collaborative research projects where users may be located
worldwide, further complicating planning for downtime.
The DBA teams understandably prefer to perform maintenance during the working day
whenever possible, to avoid late-night and weekend work.
Issues:
• Outages are occurring during the day and impacting development schedules
• Planning for downtime in off hours means IT staff must work on nights and
  weekends.
Recommendation – Perform Rolling Upgrades
One of the prominent causes of system downtime has traditionally been hardware
and software upgrades. As customers rely more and more on the IT infrastructure to
power their “always on” business, the concept of the “offline maintenance window”
is gradually becoming a thing of the past. Today, customers need IT vendors to
provide capabilities to perform such routine maintenance tasks without causing any
business interruption. As the technology leader, Oracle Database 10g contains the
most comprehensive set of features that enable customers to achieve this objective.
We will now present an overview of those features that allow customers to perform
software upgrades with little or no downtime.
Let’s examine the various types of software management operations typically
performed in an Oracle environment:
• One-off patches
• Critical Patch Updates (CPUs)
• Patchsets
• Release and Version Upgrades
• Operating System Updates
• Application Updates
Each of these scenarios is described in detail in the following sections.
One-off patches
One-off patches are generally issued in response to critical bugs encountered by
customers that need to be fixed immediately and cannot wait until the release of the
next patchset. The decision regarding when and whether to release a one-off patch is
made by Oracle Worldwide Support and the Defects and Diagnostic Resolution (DDR)
group within Server Technologies, based on a set of pre-defined criteria, such as:
• The bug must have a significant impact on the customer’s ability to conduct
  normal business
• The bug leads to issues such as database hangs, crashes, or data corruption
  and must be fixed immediately
Each one-off patch is identified by a patch number (e.g. 3574504). Installing one-off
patches does not change the version of the installed Oracle software. For example,
even after applying patch number 3574504 on top of version 10.1.0.3 of the Oracle
database software, the version of the updated software will continue to be 10.1.0.3.
However, a list of the installed one-off fixes is maintained in the Oracle install
inventory and can be queried if desired.
Installing one-off patches has traditionally required the database to be shut down.
However, starting with the version 9.2.0.2, certain one-off patches can be applied
across Real Application Clusters (RAC) instances in a “rolling” fashion, provided
each instance uses a local installation of the Oracle software (i.e. the instances do not
share Oracle Home). The word “rolling” signifies that the patch can be installed on
each instance, one at a time, without requiring other instances to go down. This
allows the database to be accessible during the patch application process. However, it
is important to note that the instance on which the patch is being applied must be
brought down.
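The rolling procedure described above can be sketched as a simple loop: take one instance down, patch it, bring it back, and move on, while the other instances keep the database available. This is a toy model with hypothetical names and data structures. The line that records the patch stands in for the real patch step, and the sketch does not call any Oracle tool.

```python
from typing import Dict, List


def rolling_patch(instances: Dict[str, dict], patch_id: str) -> List[int]:
    """Patch one instance at a time and return how many instances were up
    while each one was being patched.

    `instances` maps an instance name to {"up": bool, "patches": set}.
    """
    up_during_patch = []
    for name, inst in instances.items():
        others_up = [n for n, i in instances.items() if n != name and i["up"]]
        if not others_up:
            raise RuntimeError(f"cannot patch {name}: no other instance is up")
        inst["up"] = False              # only the instance being patched goes down
        up_during_patch.append(sum(i["up"] for i in instances.values()))
        inst["patches"].add(patch_id)   # stands in for the real patch step
        inst["up"] = True               # restart before moving to the next one
    return up_during_patch


rac = {
    "inst1": {"up": True, "patches": set()},
    "inst2": {"up": True, "patches": set()},
    "inst3": {"up": True, "patches": set()},
}
up_counts = rolling_patch(rac, "3574504")
```

The guard at the top of the loop reflects the key property of a rolling update: at least one other instance must be serving the database whenever an instance is brought down.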
Another important point to remember is that not all one-off patches today are rolling
updateable. The patches that modify common data structures, inter-instance
messages, or database metadata (e.g. views, stored procedures, or any other on-disk
structure) cannot be applied in a rolling fashion. Consequently, patches that can be
applied in a rolling fashion are clearly labeled that way (on MetaLink and in the
readme.txt file). Customers can also use the OPATCH utility to determine whether a
given one-off patch is rolling updateable.
All Oracle Clusterware patches are rolling updateable.
For further details on rolling patch updates, please refer to Database Rolling Updates
with Real Application Clusters at
http://www.oracle.com/technology/deploy/availability/pdf/Rolling_Patch_Update_Data_Sheet.pdf
or see the appendices on the companion CD.
Or peruse the Oracle Documentation set: Oracle Database High Availability
Architecture and Best Practices at
http://st-doc.us.oracle.com/10/101/server.101/b10726/recover.htm#i1006430 or see
the appendices on the companion CD.
Critical Patch Updates
In order to provide customers scheduled, periodic software updates to address
security and other critical issues, Oracle has started releasing quarterly Critical Patch
Updates (CPU) beginning January 2005. CPUs contain cumulative fixes for a number
of critical issues and, just like the one-off patches, they do not alter the version of the
installed Oracle software. For more information on the Critical Patch Updates
program, please refer to the MetaLink note 290738.1 at
https://metalink.oracle.com/metalink/plsql/showdoc?db=Not&id=290738.1 or see
the appendices on the companion CD.
Critical Patch Updates may or may not be rolling updateable, and the
release/pre-installation note for each CPU will indicate this. For example, both
Alert 68 and the CPU released in April this year were rolling updateable.
In addition, the rolling update tests are often still in progress when a CPU is first
released. As such, an updated version is later released once the tests are completed.
The July 2005 CPU, which was initially released on July 12th, will soon be updated
with the results of rolling update tests.
Patchsets
Patchsets and release upgrades are considered relatively major software upgrades.
This type of software update typically requires changes in the database metadata and
it does cause the version of the installed Oracle software to change. For example,
applying the 10.1.0.4 patchset on top of the version 10.1.0.2 will update the version
number of the updated software to 10.1.0.4.
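The difference between a one-off patch and a patchset, as described in this document, can be summarized in a toy model: a one-off patch is recorded in the inventory but leaves the version unchanged, while a patchset changes the version. The `OracleHome` class is hypothetical and does not reflect the real inventory format.

```python
from dataclasses import dataclass, field
from typing import Set


@dataclass
class OracleHome:
    """Toy record of an installation (hypothetical; not the real inventory)."""
    version: str
    one_off_patches: Set[str] = field(default_factory=set)

    def apply_one_off(self, patch_number: str) -> None:
        # A one-off patch is recorded in the inventory only; the version stays.
        self.one_off_patches.add(patch_number)

    def apply_patchset(self, new_version: str) -> None:
        # A patchset changes the version of the installed software.
        self.version = new_version


# Examples from the text.
home = OracleHome(version="10.1.0.3")
home.apply_one_off("3574504")
version_after_one_off = home.version

home2 = OracleHome(version="10.1.0.2")
home2.apply_patchset("10.1.0.4")
```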
Release and Version Upgrades
Generally speaking, release and patchset updates to the database software cannot be
applied across RAC instances in a rolling fashion. The only exception to this rule is
Oracle Clusterware, which can be upgraded to a new patchset or release in a rolling
manner (e.g. Oracle Clusterware version 10.1 can be upgraded to 10.2 one machine at
a time). However, Oracle Database 10g provides another technology – Data Guard –
that can minimize the downtime required to perform patchset and release upgrades
to a few minutes, even when a rolling upgrade across RAC instances is not possible.
Using Data Guard SQL Apply, customers can create a logical standby database that
can subsequently be upgraded to the new version without impacting the current
production or primary database. After the logical standby upgrade is completed, a
Data Guard switchover operation may be executed to make this upgraded standby
database the new primary database, and applications and users can be rerouted to
this database, making the old primary database available for upgrade without
causing any application outage. This feature is only available in Oracle Database 10g
Patchset 1 onwards – i.e. it can only be used to upgrade from 10.1.0.3 or higher
versions. For more information on performing rolling upgrades using a logical
standby database, please refer to:
Technical White Paper: Oracle Database 10g Release 2 High Availability at
http://www.oracle.com/technology/deploy/availability/pdf/TWP_HA_10gR2_HA_Overview.pdf
or see the appendices on the companion CD.
Documentation: Oracle Database High Availability Architecture and Best Practices at
http://st-doc.us.oracle.com/10/101/server.101/b10726/recover.htm#i1006387 or see
the appendices on the companion CD.
MetaLink note 300479.1: Rolling Upgrades with Logical Standby at
http://metalink.oracle.com/metalink/plsql/docs/rollup_10_1_0_4.pdf or see the
appendices on the companion CD.
Note that this mode of upgrade requires another database (i.e. the logical standby
database) to be created, which can be either RAC or non-RAC. Also, while performing
such upgrades on a RAC database, all instances must be taken offline. In other
words, rolling patchset and release upgrades cannot be performed across RAC
instances in the manner rolling patch updates are done, i.e. one instance at a time.
But as stated above, they can be done using the Data Guard SQL Apply feature, which
requires creation of a logical standby database. This is an important distinction to
understand and explain to customers in order to set the right expectations.
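The logical standby upgrade sequence and its version prerequisite can be sketched as a small planning helper. This is an illustration only: the function names are hypothetical, the steps paraphrase the description above, and nothing here invokes Data Guard. Versions are compared as tuples of integers rather than as strings, so that, for example, 10.1.0.10 would sort after 10.1.0.3.

```python
from typing import List, Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn '10.1.0.3' into (10, 1, 0, 3) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def can_use_logical_standby_upgrade(current: str) -> bool:
    # Per the text, the database being upgraded must be at 10.1.0.3 or higher.
    return parse_version(current) >= (10, 1, 0, 3)


def rolling_upgrade_plan(current: str, target: str) -> List[str]:
    """Return the ordered steps described above, or raise if not eligible."""
    if not can_use_logical_standby_upgrade(current):
        raise ValueError(f"{current} is below 10.1.0.3; this method does not apply")
    if parse_version(target) <= parse_version(current):
        raise ValueError("target version must be higher than the current version")
    return [
        f"create a logical standby of the {current} primary (Data Guard SQL Apply)",
        f"upgrade the logical standby to {target}",
        "switch over so the upgraded standby becomes the new primary",
        "reroute applications and users to the new primary",
        f"upgrade the old primary to {target}",
    ]


plan = rolling_upgrade_plan("10.1.0.3", "10.2.0.1")
eligible_old = can_use_logical_standby_upgrade("10.1.0.2")
```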
Operating System Updates
Operating System updates may or may not require the machine to be taken out of
service. Generally speaking, most major OS upgrades, such as going from Red Hat AS
2.1 to 3.0, do require the machine to be taken down, while some patches may be
applied online. In either case, OS vendors will indicate whether the update
can be applied online or not.
Oracle Clusterware and Oracle Real Application Clusters support rolling upgrades of
the OS when the version of the Oracle Database is certified on both releases of the OS.
Alternatively Data Guard standby databases (physical standby database or logical
standby database) can always be used to perform rolling OS upgrades, using a
similar strategy as in the case of rolling database patchset and release upgrades.
Best Practices
During our discovery sessions, the Insight team heard from IU that they would like
information regarding best practices relating to infrastructure management.
Specifically, IU asked about best practices relating to Disaster Recovery, Service
Level Agreements (SLA’s), Maintenance and Monitoring. What follows is a discussion
of those areas. When necessary, we reference appendices and white papers.
Disaster Recovery
During the Indiana University Insight discovery sessions, the team discussed the
Disaster Recovery (DR) initiative underway and Oracle’s approach to protecting
business critical databases and applications from system failures, user errors,
administration errors and data corruptions that might bring a production database
down.
Oracle Data Guard optimizes the primary-to-secondary replication and is the
recommended approach for IU to achieve its disaster recovery goals. Changes to
the primary database are immediately replicated to the failover database, minimizing
data loss. Oracle Data Guard is also a supported product with user interfaces,
toolkits and monitors that prove invaluable to administrators.
For more information on Data Guard, please see:
http://www.oracle.com/technology/deploy/availability/htdocs/DataGuardOverview.html
or see the appendices on the companion CD.
Monitoring and Maintenance
Oracle Enterprise Manager 10g Grid Control is uniquely positioned to streamline the
monitoring and administration of the IU infrastructure. All of the requirements
highlighted during the Insight are included in the Grid Control product, as well as
several other features that will also prove valuable. IU already holds licenses for this
product and some of its packs (Diagnostics and Tuning). Some of the features
described may require additional pack licenses.
Oracle Enterprise Manager 10g Grid Control provides:
 System monitoring and tuning
 Service Level Management
 Inventory Monitoring
 Patch Management and Deployment (Database and O/S)
 Provisioning and Deployment (Database and O/S)
 Policy compliance
Organizing and automating these tasks with a tool can greatly ease the workload
of the IU staff and allow them to spend their time on more productive projects.
For more information on Oracle Enterprise Manager 10g Grid Control, please see
http://www.oracle.com/technology/products/oem/index.html or see the
appendices on the companion CD.
Service Level Agreements
For more information on SLA best practices please see the appendices.
Decision Support System
Key Findings
Indiana’s Decision Support System (DSS) is rebuilt on a daily basis. Historical data
for the rebuild is maintained in the feed systems, i.e., the OLTP systems from which
the source data is generated. For this reason, there is little opportunity with the
current architecture to archive information in the DSS system. The Insight team does
realize that in certain circumstances maintaining historical data in the OLTP systems
may be desirable for reasons such as reporting, data retention policies and data
availability. However, maintaining this volume of detail historical data should be a
choice, not a requirement.
As the total data volume continues to grow without approaching a steady state, it
produces a cascading effect: backup windows can no longer be sustained, and the
performance of the OLTP systems subsequently suffers.
Copies of data are being generated to regenerate the DSS system, which creates a
multi-step process. This process continues to produce a growing amount of overhead
and potential performance issues as the source data continues to grow. Additionally,
it represents a physical cost as systems and disk allocations are required to
support the infrastructure.
With today’s technologies, it is possible to isolate and capture change data from
source systems and apply it in a single step to the DSS target without introducing any
other levels of complexity.
The Insight team recommends that Indiana University consider alternative techniques
in the construction and maintenance of their DSS systems, thereby avoiding
exceeding the capacity of their systems and environment.
Since Indiana already has a working DSS system, migrating to an improved technical
position will take significantly less effort. First, Indiana already has a
well-defined set of source data. Second, Indiana has a well-defined operational DSS
system currently in use. In essence, it is known where the data is loaded from, where
it is going, how frequently it is loaded and which transformations are necessary to
achieve a usable state.
It should be possible to adopt newer technologies for the ETL process and to
implement new internal data structures within the DSS without having to modify the
existing reporting and analysis facilities.
Issues:
 DSS is rebuilt daily
 Source OLTP systems maintain large amounts of historical data.
 The bulk of data affects backup windows and has performance implications.
 Multiple copies of the same data are being maintained.
Additionally, U.I.T.S. would like to investigate the potential of adopting a more
holistic view of their business intelligence environment, as the U.I.T.S. team feels
there are several areas where reporting can be improved.
Recommendation – Investigate Improving the DSS System
Implementation of three concepts will go a long way towards the re-architecture of
the DSS. They are:
 Altering the change-data capture on the source systems
 Streamlining the ETL process
 Implementing partitioning in the DSS schemas
These concepts are discussed in further detail in the sections below.
The proposed solution eliminates the need for using ‘flash copy’ and reduces the
amount of data necessary to keep the DSS target updated on a daily basis. This
would free up major portions of disk space that could be provisioned elsewhere.
The new technologies would raise the OLTP and DSS systems to a new level of
efficiency, accuracy and stability. IT would gain the ability to archive older
information from the OLTP systems, making them significantly more efficient.
In addition, the IT staff would no longer have to do huge rebuilds of the DSS system.
Because of the major reduction in data overhead, backups of the systems would become
significantly less stressful and less constrained by schedule. The amount of data
manipulation is significantly reduced, backup and recovery performance is
significantly improved and the maintenance of all data is more granular. As an
added benefit, the partitioning scheme provides a component level of high
availability as a by-product of the partitioning implementation.
Resumption of a failed process at the point of failure is a built-in feature of the
components of the proposed solution. This gives the IT department the ability to fix
the underlying problem and resume processing where it left off.
Every part of the proposed process includes components focused on reducing the
management overhead of the existing system and significantly improving the
performance and reliability of the process.
Alter the Change-Data Capture on Source Systems
The first component is to implement change-data capture on the source systems.
Rather than sending copies of the OLTP data to the DSS, we only really need to send
data that has changed.
Benefits:
 Eliminate the data copies, significantly reducing the amount of data being
transferred
 Transfer each piece of data to the DSS system only once
 Allow archival of the source OLTP data, significantly reducing the backup
processing and improving the overall performance of the OLTP systems
 Conserve large amounts of disk space
 Allow migration of data to DSS in a single step
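As an illustration of the idea only, the following Python sketch contrasts shipping changes with shipping full copies. It does not use Oracle change-data capture; `SourceTable`, `drain_changes`, `archive` and `apply_changes` are hypothetical names, and an in-memory dict stands in for the DSS target.

```python
from dataclasses import dataclass, field


@dataclass
class SourceTable:
    """Hypothetical OLTP table that records every change it receives."""
    rows: dict = field(default_factory=dict)
    change_log: list = field(default_factory=list)

    def upsert(self, key, value):
        self.rows[key] = value
        self.change_log.append((key, value))

    def drain_changes(self):
        """Hand over captured changes and clear the log, so each is shipped once."""
        changes, self.change_log = self.change_log, []
        return changes

    def archive(self, key):
        """Remove an old row from the OLTP side without touching the DSS copy."""
        self.rows.pop(key, None)


def apply_changes(target: dict, changes: list) -> int:
    """Apply captured changes to the DSS copy in one step; return the count."""
    for key, value in changes:
        target[key] = value
    return len(changes)
```

In this sketch a second load after one further update ships a single change rather than the whole table, and archiving a row from the source leaves the DSS copy intact, which mirrors the benefits listed above.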
Streamline ETL Operations
Next, employ Oracle Streams technology to provide the ETL (Extract, Transform,
Load) operations necessary to load the data into the DSS target. This technology
provides a schedulable system that can pick up the change data from the source
systems, transform the data in transit at either the source system or the target
system, and queue updates to the target DSS system so that the load can be restarted
from the point of failure should a problem arise.
An optional additional component is Oracle Warehouse Builder (OWB), which would
provide a sustainable, graphically oriented data mapping system for defining how the
Oracle Streams process is invoked. The core OWB functionality is now a feature of
the Oracle Database and is available to IU with no new licensing costs.
The Oracle Streams process would use Advanced Queuing on the target DSS system
to stage data to be loaded into the DSS tables.
Benefits:
 Schedulable system that can pick up the change data from the source systems.
This allows IU to determine the refresh rate applied from the source OLTP
systems to the target DSS systems.
 Provides transformations in the process of movement. This would allow IU to
merge and match data from disparate systems and provide data consistency in
the resulting DSS system. Data transformations are easily modified in a
central facility, thus streamlining the data movement process.
 Can be restarted from point of failure. As the number of data sources
increases and the volume of data in ETL grows, restart at point of failure
becomes increasingly important to provide data feeds in a timely and accurate
fashion.
 Allows for the retirement of the ODS environments, allowing IU to free up disk
and CPU resources for reallocation elsewhere.
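The restart-at-point-of-failure behavior can be pictured with a small staged queue. This is a minimal Python sketch of the general technique, not Oracle Streams or Advanced Queuing; `StagedLoader` and its methods are hypothetical, and a plain list stands in for the DSS tables.

```python
from collections import deque


class StagedLoader:
    """Hypothetical stand-in for a staged, restartable load into the DSS."""

    def __init__(self, transform):
        self.queue = deque()        # staged changes waiting to be applied
        self.transform = transform  # transformation applied in transit
        self.target = []            # rows loaded into the DSS so far

    def enqueue(self, records):
        self.queue.extend(records)

    def run(self):
        """Apply queued records in order.

        A record leaves the queue only after it has loaded successfully, so a
        failure leaves the queue positioned at the record that failed.
        """
        while self.queue:
            record = self.queue[0]
            self.target.append(self.transform(record))  # may raise
            self.queue.popleft()
```

If `transform` raises on the second of three records, the first record stays loaded and the queue still holds the second and third. Once the underlying problem is fixed, calling `run()` again continues from the second record rather than reloading everything.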
Partition the DSS
In order to provide performance, maintenance and high availability benefits, Indiana
should be using the partitioning capabilities of the database in the target DSS. IU is
already licensed for partitioning, so this would present no new licensing costs.
Partitioning can effectively be used by the Oracle Cost Based Optimizer (CBO) to
significantly improve update and query performance. Because maintenance can be
much more granular and isolated to smaller portions of the total data, maintenance
time can be significantly reduced. An added advantage of partitioning is that, when
a partition becomes inaccessible for any reason, all the rest of the surviving data
is still available to users while the inaccessible partition is restored.
Lastly, the usage of partitioning allows the creation of local indexes that work in
conjunction with the Cost Based Optimizer to perform partition elimination during
system usage, significantly improving performance by reducing the data result set
sizes and system memory utilization.
Benefits:
 Improved performance – the CBO can leverage partitioning to significantly
improve update and query performance.
 Easier maintenance – maintenance is much more granular and isolated to
smaller portions of the total data.
 Higher levels of availability – when a partition becomes inaccessible for any
reason, all the rest of the surviving data is still available to users.
 Allow use of local indexes – can perform partition elimination to significantly
improve performance by reducing the data result set sizes and system
memory utilization.
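The following Python sketch illustrates, in miniature, why partitioning gives both partition elimination and partition-level availability. It is a conceptual model only and says nothing about how the Oracle optimizer is implemented; `PartitionedTable`, the (year, month) partition key and the `offline` set are assumptions made for the example.

```python
from collections import defaultdict


class PartitionedTable:
    """Hypothetical table range-partitioned by a (year, month) key."""

    def __init__(self):
        self.partitions = defaultdict(list)  # (year, month) -> rows
        self.offline = set()                 # partitions currently being restored

    def insert(self, year, month, row):
        self.partitions[(year, month)].append(row)

    def query(self, year, month):
        """Partition elimination: only the matching partition is read."""
        key = (year, month)
        if key in self.offline:
            raise RuntimeError(f"partition {key} is unavailable")
        return list(self.partitions.get(key, []))

    def drop_partition(self, year, month):
        """Maintenance on one partition leaves the others untouched."""
        self.partitions.pop((year, month), None)
```

A query for one month never touches the rows of another month, and marking one partition offline affects only queries that need that partition.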
Business Analysis
Like many institutions of higher education, Indiana University is learning how to
cope and grow in an era of increasingly tight public funding, where funding for new
information technology initiatives requires reducing the cost of existing products and
services, and where demonstrating real value is the best currency with which to
compete for increasingly scarce dollars. At the same time, IU is a strong contributor
to the Open Source community, with support for projects like Kuali and Oncourse
CL.
The boundaries of the institution are expanding; IU is becoming a global supplier of
educational services. Along with that expansion of boundaries come the
requirements of a global services organization: the need to support 24x7 operations,
with their ever-expanding requirements for storage and computing power, and
ever-shrinking windows of downtime within which to perform system maintenance.
In an effort to help Indiana University grapple with the growing importance of its IT
infrastructure, the Oracle Insight program reviewed the Decision Support System
(DSS), the RMAN system and Oncourse CL. Unlike many other organizations
assisted by the Insight Program, Indiana University has negotiated extraordinarily
aggressive pricing for their AIX server infrastructure, resulting in a core
infrastructure cost that is highly optimized. Oracle confirmed this by evaluating
multiple technology options, including several UNIX options as well as a commodity
server infrastructure. However, as noted below and elsewhere in this document,
while the University’s technology choices have resulted in a low-cost infrastructure,
they do not necessarily result in a highly available database environment. In
response, Oracle has focused in part on recommendations that optimize high
availability, without significant focus on optimizing infrastructure cost.
High availability for Oracle databases is achieved through Real Application Clusters
(RAC), which allows multiple database nodes or servers to connect simultaneously to
a single shared storage and act as a single database. In a RAC environment, the
failure of a node does not affect the ability of the database to continue
uninterrupted processing. Environments clustered with RAC are also more readily
scalable, as additional nodes can be added to the cluster. RAC can also lead to
better utilization of the server infrastructure by combining multiple individual
servers with low utilization into a cluster that shares server resources.
The value proposition of Oracle RAC begins with the elimination of single points of
failure. In a 24x7 global environment with interdependent systems, the outage of any
critical system can ripple through the infrastructure. An outage in one system can
impact the performance of other systems, creating transaction backlogs that threaten
windows of application availability. Ultimately, these ripples can increase the risk to
the integrity of the system. Manageability is also an important feature of RAC. In
traditional infrastructures, applications and databases must be brought down when
the underlying server infrastructure needs to be patched and maintained. With RAC,
individual servers in a RAC cluster can be taken out of production for maintenance
and repair without impacting the database or application. In some instances, it is even
possible to do rolling patching and upgrades of the database itself, further improving
uptime and reducing dependence on increasingly scarce maintenance windows.
While implementation of RAC usually results in a significant reduction in the overall
cost of computing, the University has so successfully optimized its server costs that
there is no meaningful difference in the cost of an AIX infrastructure and a
commodity server infrastructure utilizing RAC. This means that deploying RAC
across AIX servers in order to disaggregate the application layer from the server
infrastructure and achieve true high availability and manageability comes at a cost.
The issue, therefore, is the value of high availability.
Oracle Recovery Manager (RMAN) High Availability
Oracle Recovery Manager (RMAN) is the Oracle-preferred method for efficiently
backing up and recovering the Oracle database. Ideally, the RMAN environment
should be engineered for high availability; unavailability of the backup system is an
unnecessary risk to data integrity and to the ability of IT to manage backup and
recovery windows.
While backups can be managed from single-server infrastructures, the failure of those
servers – or, in the case of AIX LPARs, the failure of that LPAR – can mean failure of
the scheduled backups and an increase in risk to the viability of the University’s
applications infrastructure. Some applications are of sufficiently low value that
restoring from days-old backups is not a significant issue. For other applications, the
loss of data may require the restoration of lost transactions or recovery of changed
data through a lengthy and painstaking process which may take days, during which
the application is unavailable.
Oncourse CL High Availability
Oncourse CL is a highly visible, important system that requires nearly ubiquitous
availability. While this system can be managed within an AIX framework, it would
truly profit from the availability, resiliency, manageability and horizontal
scalability of a RAC solution. While a RAC solution represents an increase in costs
over a straight AIX solution using LPARs, the reality is that LPARs alone cannot
provide database high availability across multiple servers, meaning that a database
installed without RAC is hostage to the server on which it is installed. By utilizing
RAC across LPARs on multiple servers, Indiana University can achieve true high
availability and gain the ability to manage the server infrastructure without impact
to the database layer.
The issue, of course, is one of cost and value. Given that deploying RAC will
represent a cost above an AIX-only solution, the University must determine whether
the high availability and manageability features of RAC justify the expense. The
question turns on the risk of an outage, the impact of an outage, and the degree to
which the University wants Oncourse CL to be an always-available system.
Decision Support & Data Management
Indiana University’s Decision Support System (DSS) is rebuilt on a daily basis.
Re-architecting the DSS system and re-evaluating how data is managed and stored in
other key systems represents a significant opportunity to reduce the amount of data
transferred between and stored by various systems by:
 Reducing copies of data
 Transferring each piece of data to the DSS system only once
 Significantly reducing the backup requirement
 Reducing the amount of data stored across systems
The value of these changes can be understood both from a process management and
financial perspective. Eliminating the daily rebuilds of the DSS, for example, would
reduce the amount of time each day during which the system was unavailable,
meaning that queries and other processes that currently wait for the daily rebuilds
would be free to run on a less restricted schedule.
The volume of data being replicated and rebuilt daily also drives cost. The data
warehouse contains approximately 2 TB of data which is rebuilt daily. The PeopleSoft
system, with 1 TB of allocated storage, is replicated at least 6 times to support backup,
development and operational data stores. Reducing the amount of duplicative data
stored by half would free up 4 TB of storage, worth $80,000 (given a cost of
$20,000 per TB for protected, usable storage). Further, the reduction in data stored
and manipulated on a daily basis will significantly improve backup and recovery
performance, eliminating the risk that the DSS, for example, cannot be backed up in
the available time window, while ensuring that the DSS is positioned to support a
global, 24x7 educational institution.
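The storage figure above can be reproduced with a few lines of arithmetic. The sketch below is in Python and uses only the numbers quoted in the text; how the duplicative total is composed (the warehouse plus six PeopleSoft copies) is an assumption, since the text does not spell it out.

```python
DATA_WAREHOUSE_TB = 2      # rebuilt daily, from the text
PEOPLESOFT_TB = 1          # allocated storage, from the text
PEOPLESOFT_COPIES = 6      # "at least 6" replicas, from the text
COST_PER_TB_USD = 20_000   # protected, usable storage, from the text

# Assumption: the duplicative total is the warehouse plus the PeopleSoft copies.
duplicative_tb = DATA_WAREHOUSE_TB + PEOPLESOFT_TB * PEOPLESOFT_COPIES

# Halving the duplicative data frees half of that total.
freed_tb = duplicative_tb / 2
freed_value_usd = freed_tb * COST_PER_TB_USD
```

Under that assumption the duplicative total is 8 TB, half of it is 4 TB, and at $20,000 per TB the freed storage is worth $80,000, matching the figures in the text.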
List of Appendices
The following documents can be found on the companion CD
AppendixA-Grid.doc
2DayDBA.pdf
10.9.RACRollingUpgrade.doc
10.10.UpgradeWithLogicalStandby.doc
asm r2 new features.pdf
asm_10gr2_bptwp_sept05.pdf
asmov.pdf
asmwp.pdf
DataGuard.pdf
db_storage_consolidation_wp 12-05.pdf
ds_rac.pdf
EMConcepts.pdf
Generic_SLA.doc
MAA_WP_10gASMMigration.pdf
MetalinkNote290738.1.doc
Oracle Data Guard.doc
oracle_real_application_clusters_10g-the_foundation_for_grid_computing.pdf
Rolling_Patch_Update_Data_Sheet.pdf
rollup_10_1_0_4.pdf
take the guesswork out of db tuning 01-06.pdf
TWP_HA_10gR2_HA_Overview.pdf
twp_rac10gr2.pdf