Managing Oilfield Data Management

Steve Balough
Vastar Resources
Houston, Texas, USA

Peter Betts
Shell Internationale Petroleum Maatschappij BV
The Hague, The Netherlands

Jack Breig
Bruno Karcher
Petrotechnical Open Software Corporation
Houston, Texas

André Erlich
Corte Madera, California, USA

Jim Green
Sugar Land, Texas

Paul Haines
Ken Landgren
Rich Marsden
Dwight Smith
Houston, Texas

John Pohlman
Pohlman & Associates, Inc.
Houston, Texas

Wade Shields
Mobil Exploration and Producing Technical Center
Dallas, Texas

Laramie Winczewski
Fourth Wave Group, Inc.
Houston, Texas

As oil companies push for greater efficiency, many are revising how they manage exploration and production (E&P) data. On the way out is proprietary software that ran the corporate data base on a mainframe, and on the rise is third-party software operating on a networked computing system. Computing standards are making practical the sharing of data by diverse platforms. Here is a look at how these changes affect the geoscientist’s work.

Today’s exploration and production geoscientist works in a world of untidy data.
Maps, logs, seismic sections and well test
reports may be scattered on isolated computers. They may exist only as tapes or
paper copies filed in a library in Dubai or
stuffed in cardboard boxes in Anchorage
(next page). Often, most of the battle is simply locating pertinent data.1 Once data are
found, the user may need to be multilingual,
fluent in UNIX, VMS, DB2, SQL, DLIS and
several dialects of SEG-Y.2 Then, even when
data are loaded onto workstations, two
interpretation packages may be unable to
share information if they understand data
differently—one may define depth as true
vertical depth, and another as measured
depth. The main concern for the geoscientist is not only: Where and in what form are
the data? but also: Once I find what I need,
how much time must I spend working on
the data before I can work with the data?
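The depth mismatch described above can be made concrete with a short sketch. It is illustrative only: the `Depth` class, the `md_to_tvd` function and the single constant inclination are assumptions made for this example, not part of any package named in this article. A real conversion needs a full deviation survey.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Depth:
    """A depth value that carries its own reference convention."""
    value_m: float
    kind: str  # "MD" (measured along the borehole) or "TVD" (true vertical)


def md_to_tvd(depth: Depth, inclination_deg: float) -> Depth:
    """Convert measured depth to true vertical depth.

    Assumes a straight borehole with one constant inclination from
    vertical, which is a simplification chosen for this sketch.
    """
    if depth.kind != "MD":
        raise ValueError(f"expected MD, got {depth.kind}")
    tvd = depth.value_m * math.cos(math.radians(inclination_deg))
    return Depth(tvd, "TVD")


def thickness(top: Depth, base: Depth) -> float:
    """Refuse to mix conventions instead of silently subtracting them."""
    if top.kind != base.kind:
        raise ValueError("cannot combine MD and TVD depths")
    return base.value_m - top.value_m


# Two picks recorded as measured depth in a well inclined 60 degrees.
top = md_to_tvd(Depth(3000.0, "MD"), inclination_deg=60.0)
base = md_to_tvd(Depth(4000.0, "MD"), inclination_deg=60.0)
print(round(top.value_m, 1), round(thickness(top, base), 1))
```

Tagging each value with its convention means a package that expects true vertical depth fails loudly when handed measured depth, rather than producing a plausible but wrong interval.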
Despite this picture of disarray, E&P data
are easier to work on now than they were
just a few years ago, freeing more time to
work with them. Better tools are in place to
organize, access and share data. Data organization is improving through more flexible
database management software—a kind of
librarian that finds and sometimes retrieves
relevant data, regardless of their form and
location. Access to data is widening through
linkage of data bases with networks that
unite offices, regions and opposite sides of
the globe. And sharing of data by software
of different disciplines and vendors, while
not effortless, has become practical. It will
become easier as the geoscience community inches closer to standards for how data
are defined, formatted, stored and viewed.
Data management—the tools and organization for orderly control of data, from
acquisition and input through validation,
processing, interpretation and storage—is
going through changes far-reaching and
painful. This article highlights two kinds of
changes affecting data management: physical, the tools used for managing data; and
conceptual, ideas about how data, data
users and tools should be organized.
To understand where E&P data management is going, look at where it has come
from. Today as much as 75% of geoscience
data are still stored as paper.3 Yet, the direction of data management is determined by
the 25% of data that are computerized.
Oilfield Review
The incredible shrinking data store.
Log data from a single well have been
transferred from tapes and paper
copies in a cardboard box to a 4-mm
digital audio tape (DAT), being passed
from Karel Grubb, seated, to Mike Wille,
both of GeoQuest. This cardboard data
store was among 2000 for a project in
Alaska, USA. Logs from all 2000 wells
were loaded onto 18 DATs for off-site
storage provided by the LogSAFE data
archive service (page 40). One DAT can
hold 1.8 gigabytes, equivalent to conventional logs from about 100 wells.
In this article Charisma, Finder, GeoFrame, IES (Integrated
Exploration System), LogDB, LOGNET, LogSAFE and
SmartMap are marks of Schlumberger; IBM and AIX are
marks of International Business Machines Corp.; Lotus 1-2-3 is a mark of Lotus Development Corp.; Macintosh is a
mark of Apple Computer Inc.; VAX and VMS are marks of
Digital Equipment Corp.; MOTIF is a mark of the Open
Software Foundation, Inc.; PetroWorks, SeisWorks, StratWorks, SyntheSeis, and Z-MAP Plus are marks of Landmark Graphics Corp.; GeoGraphix is a mark of
GeoGraphix, Inc.; ORACLE is a mark of Oracle Corporation; POSC is a mark of the Petrotechnical Open Software
Corporation; Stratlog is a mark of Sierra Geophysics, Inc.;
Sun and Sun/OS are marks of Sun Microsystems, Inc.;
UNIX is a mark of AT&T; and X Window System is a mark
of the Massachusetts Institute of Technology.
July 1994
For help in preparation of this article, thanks to Najib
Abusalbi, Jim Barnett, Scott Guthery and Bill Quinlivan,
Schlumberger Austin Systems Center, Austin, Texas, USA;
Bill Allaway, GeoQuest, Corte Madera, California, USA;
Richard Ameen, Bobbie Ireland, Jan-Erik Johansson,
Frank Marrone, Jon Miller, Howard Neal, Mario Rosso,
Terry Sheehy, Mike Wille and Beth Wilson, GeoQuest,
Houston, Texas; Bob Born, Unocal, Sugar Land, Texas,
USA; Ian Bryant, Susan Herron, Karyn Muller and Raghu
Ramamoorthy, Schlumberger-Doll Research, Ridgefield,
Connecticut, USA; Alain Citerne and Jean Marc Soler,
Schlumberger Wireline & Testing, Montrouge, France;
Peter Day, Unocal Energy Resources Division, Brea, California; Mel Huszti, Public Petroleum Data Model Association, Calgary, Alberta, Canada; Craig Hall, Steve
Koinm, Kola Olayiwola and Keith Strandtman, Vastar
Resources, Inc., Houston, Texas; Stewart McAdoo,
Geoshare User Group, Houston, Texas; Bill Prins,
Schlumberger Wireline & Testing, The Hague, The
Netherlands; Mike Rosenmayer, GeoQuest, Sedalia,
Colorado, USA.
1. Taylor BW: “Cataloging: The First Step to Data Management,” paper SPE 24440, presented at the Seventh
SPE Petroleum Computer Conference, Houston, Texas,
USA, July 19-22, 1992.
2. UNIX and VMS are operating systems; DB2 is a
database management system for IBM mainframes;
SQL is a language for querying relational data bases;
and DLIS and SEG-Y are data formats.
3. Landgren K: “Data Management Eases Integrated
E&P,” Euroil (June 1993): 31-32.
“Data management” can increasingly be
called “computer” or “digital” data management, and its development therefore parallels that of information technology (right). In
the last 30 years, information technology
has passed through four stages of
evolution.4 Viewed through a geoscience
lens, they are:
First generation: Back-room computing.
Mainframes in specially built rooms ran
batch jobs overseen by computer specialists.
Response time was a few hours or overnight.
Data processing was highly bureaucratic,
overseen by computer scientists using data
supplied by geoscientists. The highly secure
central repository of data, called the corporate data base, resided here. Evolution concentrated on hardware. Data were often
organized hierarchically, like branches of a
tree that must be ascended and descended
each time an item was retrieved.
Second generation: Shared interactive
computing. Minicomputers were accessed
by geoscientists or technicians working at
remote keyboards. Response time was in
minutes or seconds. Members of a user
group could communicate with each other,
provided they had the same hardware and
software. Bureaucracy declined; democracy
increased. Control was still centralized and
often exercised through in-house database
management software.
Third generation: Isolated one-on-one computing. This was the short reign of the rugged individualist. Minicomputers and personal computers (PCs) freed the user from the need to share resources or work through an organization controlling a centralized system. Computing power was limited, but response time rapid. There were many copies of data, multiple formats and no formally approved data. The pace of software evolution accelerated and data began to be organized in relational data bases, which function like a series of file drawers from which the user can assemble pieces of information.5

Evolution in petroleum computing parallels evolution of computing in general.
[Figure labels, earlier practice followed by current practice:
Earlier: Proprietary mainframes and ...
Current: Networked computing using client/server architecture based on standards.
Earlier: In-house proprietary, leading to islands of computing; some proprietary multidisciplinary packages.
Current: Common third-party software packages; emphasis on sharing of data and interpretations between disciplines and between products from different vendors.
Earlier: Centralized data control by team unlinked or loosely linked with regions.
Current: Centralized master data base linked with regional master and project data bases, giving geoscientists access to any data from any region.]

Fourth generation: Networked computing. Also called distributed computing, this
too is one-on-one, but allows memory, programs and data to be distributed wherever
they are needed, typically via a client/server
architecture.6 Data are no longer just
alphanumeric but also images and will
eventually include audio and video formats.
Interaction is generally rapid and is graphic
as well as alphanumeric, with parts of the
data base viewed through a map. Linkage is
now possible between heterogeneous systems, such as a VAX computer and an IBM
PC. Control of data is at the populist level,
with server and database support from computing specialists who tend to have geoscience
backgrounds. Interpretation and
data management software evolves rapidly
and tends to be from vendors rather than
home grown. Sometimes the network connects to the mainframes, which are reserved
for data archiving or for a master data base
(more about this later). Mainframes are also
reemerging in the new role of “super
servers,” controlling increasingly complex
network communication.
To one degree or another, all four generations often still coexist in oil companies, but
most companies are moving rapidly toward
networked computing.7 This shift to new
technology—and, recently, to technology
that is bought rather than built in-house—is
changing the physical and conceptual
shapes of data management. Since about
1990 two key trends have emerged:
• The monolithic data store is being replaced, or augmented, with three levels of flexible data storage. A single corporate data store8 used to feed isolated islands of computing, each limited to a single discipline. For example, a reservoir engineer in Jakarta might occasionally dip into the corporate data store, but would have no access to reservoir engineering knowledge, or any knowledge, from Aberdeen. Oil companies today are moving toward sharing of data across the organization and across disciplines. Nonelectronic data, such as cores, films and scout tickets,9 may also be cataloged on line. A new way of managing data uses storage at three levels: the master, the project and the application.
• E&P computing standards are approaching reality. The three data base levels cannot freely communicate if one speaks Spanish, one Arabic and the other Russian. Efforts in computing standards address this problem on two fronts. One is the invention of a kind of common language, a so-called data model, which is a way of defining and organizing data that allows diverse software packages to share different understandings of data (for example, is a “well” a wellhead or a borehole?) and therefore exchange them (see “Three Elements of a Successful E&P Data Model,” below). Data models have been used in other industries for at least 20 years, but only in the last few has a broadly embraced E&P data model begun to emerge. On the other front is development of an interim means for data exchange, a kind of Esperanto that allows different platforms to communicate. In this category is the Geoshare standard, now in its sixth version.10
We look first at the data storage troika, and then at what is happening with standards.

Structure and properties of the three-level scheme for data management. Data shown at the bottom may be kept outside the database system for better performance.
[Figure labels: master data base (approved data); project data base (shared data); application data store; data approval; create projects; external database references; data files; inventory catalog. Properties shown: data volume, areal extent, data volatility, access frequency, data versions, data quality.]

Three Elements of a Successful E&P Data Model
Jack Breig

Comprehensive information coverage
A successful data model captures all kinds of information used in the E&P business, as well as sufficient semantics about the meaning of data, to ensure that it can be used correctly by diverse applications and users, in both planned and unplanned ways. In other words, a semantically rich data model optimizes the sharing of data across different disciplines.

Useful abstractions
The complexity and volume of information managed by the E&P industry are overwhelming. This is a serious impediment to information exchange, especially where information is generated by professionals in dissimilar business functions. The successful data model reveals to the user the places of similarity in information semantics. This capability provides business opportunities both for users, in accessing data from unexpected sources, and for application developers to target new, but semantically similar, business markets.

Implementable with today’s and tomorrow’s technology
A data model is successful today if it has a structure, rules and semantics that enable it to be efficiently implemented using today’s database management tools. A data model will be successful for the long term if it can accommodate newer database management products, such as object-oriented, hybrid and extended relational database management systems.

4. This evolutionary scheme was described by Bernard Lacroute in “Computing: The Fourth Generation,” Sun Technology (Spring 1988): 9. Also see Barry JA: Technobabble. Cambridge, Massachusetts, USA: MIT Press, 1992.
For a general review of information technology: Haeckel SH and Nolan RL: “Managing by Wire,” Harvard Business Review 71, no. 5 (September-October 1993): 122-132.
5. A relational data base appears to users to store information in tables, with rows and columns, that allow construction of “relations” between different kinds of information. One table may store well names, another formation names and another porosity values. A query can then draw information from all three tables to select wells in which a formation has a porosity greater than 20%, for example.

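The three-table query described in footnote 5 can be sketched with Python’s built-in sqlite3 module. SQLite stands in here for whatever relational system a company actually runs, and the table names, column names and sample values are all assumptions invented for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE wells      (well_id INTEGER PRIMARY KEY, well_name TEXT);
    CREATE TABLE formations (formation_id INTEGER PRIMARY KEY, formation_name TEXT);
    CREATE TABLE porosity   (well_id INTEGER, formation_id INTEGER, porosity_pct REAL);
""")
# Hypothetical sample data.
conn.executemany("INSERT INTO wells VALUES (?, ?)",
                 [(1, "A-1"), (2, "A-2"), (3, "B-7")])
conn.executemany("INSERT INTO formations VALUES (?, ?)",
                 [(10, "Upper Sand"), (11, "Lower Sand")])
conn.executemany("INSERT INTO porosity VALUES (?, ?, ?)",
                 [(1, 10, 24.0), (2, 10, 18.5), (3, 10, 21.0), (3, 11, 12.0)])


def wells_above(conn, formation_name, min_porosity_pct):
    """Return names of wells where the formation exceeds the porosity cutoff."""
    rows = conn.execute("""
        SELECT w.well_name
        FROM porosity AS p
        JOIN wells AS w ON w.well_id = p.well_id
        JOIN formations AS f ON f.formation_id = p.formation_id
        WHERE f.formation_name = ? AND p.porosity_pct > ?
        ORDER BY w.well_name
    """, (formation_name, min_porosity_pct)).fetchall()
    return [name for (name,) in rows]


result = wells_above(conn, "Upper Sand", 20.0)
print(result)
```

The joins are the “relations” of the footnote: each table holds one kind of information, and the query assembles the answer from all three.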
A New Data Storage Troika
In the days of punch cards, corporate data
were typically kept on a couple of computerized data bases, operated by mainframes.
Flash forward 30 years to networked computing, and the story gets complicated.
Instead of a couple, there are now tens or
hundreds of machines of several types,
linked by one or more networks. And
instead of one or two centralized data
bases, there tend to be many instances of
data storage: master data base, project data
base and application data stores (above).
Sometimes the old corporate data base has
been given a new job and a new name.
Instead of holding all the data for the corporation, it now holds only the definitive
instances, and so is renamed the “master.”
There may be one centralized master, but
typically there is one master per operating
region or business unit.
6. Client/server architecture, the heart of distributed
computing, provides decentralized processing with
some level of central control. The client is the workstation in the geoscientist’s office. When the geoscientist needs a service—data or a program—the workstation software requests it from another machine called
a server. Server software fulfills the request and returns
control to the client.
7. For an older paper describing a diversity of computer
systems in Mobil: Howard ST, Mangham HD and
Skruch BR: “Cooperative Processing: A Team
Approach,” paper SPE 20339, presented at the Fifth
SPE Petroleum Computer Conference, Denver, Colorado, USA, June 25-28, 1990.
8. This article distinguishes between a data store and a
data base. A data store is a collection of data that may
not be organized for browsing and retrieval and is
probably not checked for quality. A data base is a data
store that has been organized for browsing and
retrieval and has undergone some validation and quality checking. A log data store, for example, would be
a collection of tapes, whereas a log data base would
have well names checked and validated to ensure that
all header information is correct and consistent.
9. A scout ticket is usually a report in a ring binder that
includes all drilling-related data about a well, from
the time it is permitted through to completion.
10. For a description of the evolution of the Geoshare
system: Darden S, Gillespie J, Geist L, King G, Guthery S, Landgren K, Pohlman J, Pool S, Simonson D,
Tarantolo P and Turner D: “Taming the Geoscience
Data Dragon,” Oilfield Review 4, no. 1 (January
1992): 48-49.
In the master data base, data quality is
high and the rate of change—what information scientists call volatility—is low. There
are not multiple copies of data. Although
the master data base can be accessed or
browsed by anyone who needs data,
changes in content are controlled by a data
administrator—one or more persons who
decide which data are worthy of residing in
the master data base. The master data base
also contains catalogs of other data bases
that may or may not be on-line. The master
is often the largest data base, in the gigabyte
range (1 billion bytes), although it may be
overshadowed by a project data base that
has accumulated multiple versions of interpretations. The master may be distributed
over several workstations or microcomputers, or still reside on a mainframe.
From the master data base, the user withdraws relevant information, such as “all
wells in block 12 with sonic logs between
3000 and 4000 meters,” and transfers it to
one or more project data bases. Key characteristics of a project data base are that it
handles some aspects of team interpretation,
is accessed using software programs from
different vendors and contains multidisciplinary data. The project may contain data
from almost any number of wells, from 15
to 150,000. Regardless of size, data at the
project level are volatile and the project
data base may contain multiple versions of
the same data. Updates are made only by
the team working on the project. Interpretations are stored here, and when the team
agrees on an interpretation, it may be authorized for transfer to the master data base.
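The flow between master and project levels described above can be sketched as follows. Every class, field and method name is hypothetical, the boolean `administrator` flag is a stand-in for real access control, and treating “between 3000 and 4000 meters” as an interval overlap is an assumption made for the example.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class SonicLog:
    well: str
    block: int
    top_m: float
    base_m: float


@dataclass
class MasterDataBase:
    """Approved data only, with a single copy of each item."""
    logs: list = field(default_factory=list)
    approved_interpretations: dict = field(default_factory=dict)

    def select(self, block, top_m, base_m):
        # Anyone may browse: return logs in the block whose logged
        # interval overlaps the requested depth window.
        return [log for log in self.logs
                if log.block == block
                and log.top_m < base_m and log.base_m > top_m]

    def approve(self, name, interpretation, administrator):
        # Only the data administrator changes master content.
        if not administrator:
            raise PermissionError("master updates need a data administrator")
        self.approved_interpretations[name] = interpretation


@dataclass
class ProjectDataBase:
    """Volatile working copy that keeps every version of an interpretation."""
    logs: list
    versions: dict = field(default_factory=dict)

    def save(self, name, interpretation):
        self.versions.setdefault(name, []).append(interpretation)

    def latest(self, name):
        return self.versions[name][-1]


master = MasterDataBase(logs=[
    SonicLog("W-1", 12, 2500.0, 3500.0),
    SonicLog("W-2", 12, 4200.0, 4800.0),
    SonicLog("W-3", 7, 3000.0, 4000.0),
])
# Withdraw "all wells in block 12 with sonic logs between 3000 and 4000 meters".
project = ProjectDataBase(logs=master.select(block=12, top_m=3000.0, base_m=4000.0))
project.save("top_reservoir", {"W-1": 3120.0})
project.save("top_reservoir", {"W-1": 3135.5})
# Once the team agrees, the latest version is authorized for the master.
master.approve("top_reservoir", project.latest("top_reservoir"), administrator=True)
```

The project level accumulates versions freely, while the master accepts only one approved result through a controlled entry point, which mirrors the difference in volatility between the two levels.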
A third way of storing data is within the
application itself. The only data stored this
way are those used by a specific vendor’s
application program or class of applications,
such as programs for log analysis. Application data stores are often not data base management systems; they may simply be a file
for applications to read from and write to.
Applications contain data from the project
data base, as well as interpretations performed using the application. Consequently,
the data may change after each work session, making them the most volatile. Because
applications were historically developed by
vendors to optimize processing by those
specific applications, applications from one
vendor cannot share data with applications
from other vendors, unless special software
is written. Vendors have begun to produce
multidisciplinary applications, and many
applications support the management of
interpretation projects. In addition, the proprietary nature of application data stores
will change as vendors adopt industry standards (see “Razing the Tower of Babel—
Toward Computing Standards,” page 45). As
a result of these changes, data management
systems of the future will have just two levels, master and project.
The Master Level
The master data base is often divided into
several data bases, one for each major data
type (next page). A map-type program is
usually used to give a common geographic
catalog for each data base. An example of
this kind of program is the MobilView geographical interface (see “Case Studies,”
below).
Because the master data base is sometimes cobbled together from separate, independent data bases, the same data may be
stored in more than one place in the master.
BP, for instance, today stores well header
information in three data bases, although it
is merging those into a single instance. An
ideal master data base, however, has a catalog that tells the user what the master contains and what is stored outside the master.
An example is the inventory and location of
physical objects, such as floppy disks, films,
cores or fluid samples. This is sometimes
called virtual storage. The master data base,
therefore, can be thought of as not a single,
comprehensive library, but as a large digital filing cabinet and a catalog directing the user to other filing cabinets that may be adjacent, across the hall or across the ocean, and may contain data in any form: digital, paper, cores and so on.

BP’s master data base: present, near future and long term. Currently, the master is divided into discrete data bases and associated applications that have been built to address the needs of specific disciplines. The advantage of this approach is that data ownership is clearly defined. Disadvantages include duplicate data sets, difficulty ensuring integrity of data shared by more than one discipline, and increased data management workload. The near-term solution increases the permeability of walls separating disciplines, and the ultimate solution dissolves these walls, making use of standards established by the Petrotechnical Open Software Corporation (POSC).
[Figure panels: Data Stores: Present (Technical Data, Commercial Data, 3D Models); Data Stores: Short-term Future (Data Bus, Common Well Data); Data Stores: Ultimate Solution (Data Bus, DB #1, DB #2, DB #3, DB #4).]

Case Studies

Mobil Exploration & Producing Technical Center

A sign on the wall in one of Mobil’s Dallas offices sums up the most significant aspect of how Mobil is thinking about data management: “Any Mobil geoscientist should have unrestricted and ready worldwide access to quality data.” The sign describes not where the company is today, but where it is working to be in the near future.

The emphasis on data management is driven by Mobil’s reengineering of its business. Data management is affected by fundamental changes in the current business environment and in how data are used within this environment.

Traditionally, exploration funds were allocated among the exploration affiliates, largely in proportion to the affiliate size. Within a changing business environment requiring a global perspective, funds should be allocated to take advantage of the best opportunities regardless of location. The organization as a whole is also leaner, placing more demands on a smaller pool of talent. A petrophysicist in Calgary may be called to advise on a project in Indonesia. All geoscientists, therefore, need to understand data the same way. When an interpretation and analysis are performed with the same rules, right down to naming conventions, then not only can data be shared globally, but any geoscientist can work on any project anywhere.

A second motivation for restructuring data management is a change in who is using data. Mobil was traditionally structured in three businesses: Exploration, Production, and Marketing and Refining. Exploration found oil for its client, Production, which produced oil for its client, Marketing and Refining. Now the picture is not so simple. Exploration’s client may also be a government or national production company. Production may also work as a consultant, advising a partner or a national company on the development of a field. Now, with a broader definition of who is using data, there is a greater need for data that can be widely understood and easily shared.

For the past three years, the company has been moving to meet this need. In 1991, Mobil established 11 data principles as part of a new “enterprise architecture.” The company is using the principles as the basis for policies, procedures and standards. Mobil is now drafting its standards, which include support of POSC efforts, down to detail as fine as “in level five of Intergraph design, a file will have geopolitical data.”

Mobil is reengineering data management on two fronts, technology and organization. Technology changes affect the tools and techniques for doing things with data: how they are validated, loaded, stored, retrieved, interpreted and updated. Organizational changes concern attitudes about who uses what tools for what ends. Both sides carry equal weight. This involves a realization that technology alone doesn’t solve problems. It also involves the “people side,” most importantly, the attitude of geoscientists toward data management.

An example of a new way to manage the people side is the formation of a series of groups coordinated by the Information Technology (IT) department called Data Access Teams, which expedite data management for the Mobil affiliates. A Data Access Team comprises Dallas staff and contractors who validate, input, search and deliver data to an affiliate site. Each site, in turn, collaborates with the team by providing a coordinator familiar with the local business needs and computer systems. Another means of addressing the people side of data management is making data management part of a geoscientist’s performance appraisal. This becomes increasingly important as more data go on-line.

In the past, geoscientists would rifle closets and ask colleagues for data. Any bad data would get tossed or, worse, ignored. Now, if a geoscientist finds bad data, it is that individual’s responsibility to alert the data administrator and have the problem corrected. The new thinking is that data quality isn’t someone else’s job; it has to become part of the culture.

A fundamental component of Mobil’s IT strategy is a common storage and access model. Typically, each of the affiliates maintains its own, stand-alone E&P database system, along with links into the central data bases in Dallas. The local systems have evolved independently because local needs can vary. Frontier areas, for example, work only with paper data, whereas Nigeria is nearly all electronic. Despite these differences, a common data storage and access model was endorsed by all affiliates in the early 1990s.

This model incorporates access to existing data sources as well as the concept of master and project data stores (below). The master data base maintains the definitive version of the data while the project data base is maintained as a more ephemeral working environment interfacing with the applications. The project builder serves as a geographical search, retrieval and archival mechanism. The master and project data stores are based on the POSC Epicentre model and implemented in the Open System/UNIX environment. The model calls for gradual replacement of most of the legacy data bases, based on business drivers. The company will maintain selected legacy systems for the foreseeable future, but in a way that minimizes resources and does not involve formal development.

Being competitive in the current business environment requires a change in business processes. This change from a local focus to a global perspective is particularly important in IT. Within Mobil’s E&P division, the major component of this initiative is a paradigmatic shift in the way data are handled at all levels and the development of a POSC-based data storage and access model. In the context of the corporate IT mission, this strategy was developed from the ground up, with input from all Mobil affiliates.

Overview of Mobil’s data management model. This model, developed in the early 1990s, is part of a strategy for migration to POSC compliance.
[Figure labels: Legacy Data Management Systems; Data Store; Project Builder (searches master and downloads data); Project Data Store; Migration toward POSC compliance.]

Unocal Energy Resources Division

Unocal is an integrated energy resources company that in 1993 produced an equivalent of 578,000 barrels of oil per day and held proven reserves equivalent to 2.2 billion barrels of oil.1 By assets, Unocal ranked thirteenth worldwide in 1993, compared to Mobil in second place.2 Exploration and production is carried out in about ten countries, including operations in Southeast Asia, North America, the former Soviet Union and the Middle East.

Unocal began a reorganization in 1992 that will eventually converge almost all exploration activities at a facility in Sugar Land, Texas, USA. Concurrent with this restructuring is a change, started in 1990, in the physical database system: from IBM mainframes running in-house database, mapping and interpretation software, to third-party software on a client/server system. The group responsible for overseeing this shift is Technical Services, based in Sugar Land (next page). Here, 13 staff scientists and five on-site contractors manage 200 gigabytes of data on about two million wells. This is the largest portion of the company’s one terabyte of E&P data, and is accessed by about 275 users in both exploration and business units.

The Technical Services group performs two main tasks: it supports client/server-based geoscience applications, and it maintains and distributes application and database systems. There are 23 software systems under its wing: 11 geophysical applications, five mapping and database applications, five geologic applications and two hard-copy management systems, which are catalogs of physical inventories such as seismic sections and scout tickets. Altogether, the 23 systems represent products from at least 20 vendors,
(continued on page 40)

1. Unocal 1993 Annual Report: 8.
2. National Petroleum News. “Oil Company Rankings by Marketers with Assets of $1 Billion or More,” (June 1993): 16.

Activity screen in PAGODA. The control panel at left allows the user to obtain details about selected fields in the main screen and navigate the system. The central screen shows curve summary information, and the mud data screen gives mud details.
Courtesy of Shell Internationale Petroleum Maatschappij BV.

One component of storage at the master
level is the archive. The name implies long-term storage with infrequent retrieval, but
the meaning of archive is shifting with new
ways of organizing and accessing data. An
archive may be an off-site vault with seismic
tapes in original format, or an on-site optical
disk server with seismic data already formatted for the interpretation workstation. There
are countless variations, with differing
degrees of “liveness” of the data. Two examples of archives are the PAGODA/LogDB
and LogSAFE systems.
The PAGODA/LogDB system is a joint
development of Shell Internationale
Petroleum Maatschappij (SIPM) and
Schlumberger. PAGODA designates the
product in Shell Operating Companies and
Affiliates; the LogDB name is used for the
Schlumberger release. The objective of the
joint development was to produce a safe
storage and retrieval system for information
gathered during logging and subsequent
data evaluation.
The PAGODA/LogDB system comprises
software and hardware for loading, storing
and retrieving data acquired at wells, such
as wireline logs, data obtained while
drilling, borehole geophysical surveys and
mud logs. Original format data are scanned
during loading to ascertain header information and the integrity of the data format. All
popular data formats are supported. Once
scanning is complete, well identifiers are
validated against a master list of well
names. Extracted header information is written to ORACLE tables, a type of relational
database table, and bulk data are transferred to a storage device, such as an
on-line jukebox of optical disks, magnetic disks
or other archival media. Once data are validated and loaded, they can be quickly
viewed, searched and retrieved for export to
other applications ( previous page ).
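The loading sequence just described (scan the header, validate the well identifier against a master list, write header fields to a relational table and send bulk data to separate storage) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the KEY = VALUE header layout, the `MASTER_WELLS` list, the `log_header` table, the storage path and the use of SQLite in place of ORACLE are all hypothetical and are not taken from the PAGODA/LogDB system.

```python
import sqlite3

# Hypothetical master list of validated well names (assumption for illustration).
MASTER_WELLS = {"WELL A-1", "WELL B-2"}


def scan_header(raw_text):
    """Extract KEY = VALUE header fields and check that required ones exist."""
    header = {}
    for line in raw_text.splitlines():
        if "=" in line:
            key, value = line.split("=", 1)
            header[key.strip().upper()] = value.strip()
    for required in ("WELL", "FORMAT"):
        if required not in header:
            raise ValueError(f"header is missing {required}")
    return header


def load_log(conn, raw_text, bulk_location):
    """Validate one log, then record its header and a pointer to the bulk data."""
    header = scan_header(raw_text)
    if header["WELL"] not in MASTER_WELLS:
        raise ValueError(f"unknown well: {header['WELL']}")
    conn.execute(
        "INSERT INTO log_header (well, format, bulk_location) VALUES (?, ?, ?)",
        (header["WELL"], header["FORMAT"], bulk_location),
    )
    return header


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE log_header (well TEXT, format TEXT, bulk_location TEXT)")

# The bulk curve data stay outside the relational table; only a reference is stored.
load_log(conn, "WELL = WELL A-1\nFORMAT = LIS", "jukebox/disk07/log0001")
rows = conn.execute("SELECT well, format, bulk_location FROM log_header").fetchall()
```

Keeping only header fields in the table is what makes browsing and searching fast, while the large original-format files can sit on slower archival media.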
The PAGODA/LogDB system fulfills two
roles. As an off-line archival data base, the
system is capable of storing terabytes of data
that are safely held on a durable medium
and can be retrieved and written to disk or
tape. Exported data can be provided in
either original format or via a Geoshare linkage (page 46 ). As an on-line data base, the
system makes all header information available for browsing and searching. The systems are supported on VAX/VMS, Sun/OS
and IBM/AIX operating systems.
The PAGODA and LogDB systems are
essentially the same. To date, PAGODA has
been installed in most of Shell’s larger Operating Companies, where log data held on
[Figure: A data management organization chart for Unocal’s Energy Resources division. The company’s Technical Services group, based in Sugar Land, Texas, contributes to data management for the four exploration groups, which are also based in Sugar Land, and large and small business units. The business units are moving their regional data models toward a single standard. Chart labels: Energy Resources; Exploration Groups (Gulf/North America, Far East, Middle East/Latin); Large Business and Small Business units (US Gulf Coast, Louisiana, Thailand, Indonesia, Alaska, Central US, Calgary, The Netherlands, Aberdeen, Syria); IS support; Local information systems (IS); Asset Teams (field development).]
July 1994
Jill Orr photo
earlier systems or in tape storage are being
migrated to PAGODA. The PAGODA system also caters to the storage of Shell proprietary data formats and calculated results.
The LogDB system has been installed in several oil companies and government agencies and is being made available as a loading and archive service at Schlumberger
computing centers worldwide. The LogDB
system can be coupled to the Finder data
management system, in which well surface
coordinates can be stored together with limited log sections for use during studies.
The LogSAFE service performs some of
the same functions, but not as a tightly coupled data base. It is an off-line log archive
that costs about half as much as an internal
database system, yet can deliver logs over
phone or network lines in a few hours. For
oil companies that prefer a vendor to manage the master log data base, the LogSAFE
system functions as an off-site data base,
accessed a few times a week. For large
companies—those with more than about 15
users of the data base—it functions as an
archive, accessed once or twice a month.
Bob Lewallen, senior log data technician
with Schlumberger, returning 90 gigabytes
of log data to the LogSAFE vault in Sedalia,
Colorado, USA. Original logs are copied to
two sets of DATs stored in the vault. A third
backup is kept in a fireproof safe off-site.
To ensure data integrity, data are validated and rearchived every five years.
The LogSAFE system places logs from any
vendor in a relational data base, validating
data and storing them in their original format. Any tape format can be read and paper
copies digitized. Before it is entered into the
data base, the entire file is inspected to
make sure it is readable, and the data owner
is informed of any problems in data quality.
Validated data are then archived on DATs
(left ). By the end of 1994, data will also be
stored for faster, automated retrieval on a
jukebox of optical disks. Clients have access
only to their own data, and receive a catalog of their data and quarterly summaries of
their holdings. The LogSAFE system is based
in Sedalia, Colorado, USA, at the same
facility that houses the hub of the Schlumberger Information Network. This permits
automatic archiving of Schlumberger logs
that are transmitted from a North American
wellsite via the LOGNET service, which
passes all logs through the Sedalia hub.
LogSAFE data can be accessed several
ways, giving it flexibility and a “live” feel
similar to those of an in-house data base. Data
are either pulled down by the user, or
making Unocal advanced in its move from in-house software (right).

Jim Green, who manages Unocal’s exploration computing services, lists two essential ingredients for successful management of E&P data: training for users of new data handling systems and automated tools for moving data. By 1995, Unocal will move all computerized E&P data from flat-file, hierarchical systems to ORACLE-based relational systems.3 A company-wide program is bringing geotechnicians up to speed on data loading and quality checking.

The diversity of Unocal’s software underscores the importance of managing multiple versions of data. The company is encouraging vendors to develop a family of automated software tools: data movers, editors, loaders, unloaders and comparators. Of particular importance are comparators, which scan two project data bases or a project and a master data base and produce a list of differences between them. Quick identification of differences between versions of data is essential in deciding how to update projects with vendor data or how to update the master data base with geoscientists’ interpretations from a project or regional data base. Today this comparison is done manually. “The client/server environment cannot be considered mature until we have robust, automated data management tools,” Green says.

[Figure: Architecture of Unocal’s interpretation applications and data stores. Pair-wise links connect all platforms. Long-term evolution is toward the UNIX platform. Diagram labels: UNIX platform; PC platform; data base (three instances). Legend: 1 Catalog of digital and hard-copy data; 2 Best Unocal data from past projects and commercial sources; 3 Managed and converted by Geoscience; 4 SeisWorks, StratWorks, Z-MAP Plus, SyntheSeis, PetroWorks, etc.; 5 Application for interpretations.]

3. A flat-file data base contains information in isolated tables, such as a table for porosity data and another for paleontology data. The tables cannot be linked. A relational data base allows linkage of information in many tables to construct queries that may be useful, such as “show all wells with porosity above 20% that have coccolithus.” Flat-file data bases are simple and inexpensive. Relational data bases are complex, expensive and can be used to do much more work.

A debate within Unocal concerns finding the best way to organize the work of geotechnicians, who load data and contribute to quality assurance (next page). The main issue is whether geotechnicians are most effective if they work by geographic area, as they do now, or by discipline. The advantage of focusing geotechnicians by area is that they learn the nuances of a region, and develop
pushed from Sedalia at client request. Data
are transferred either over commercial or
dedicated phone lines, or more commonly,
through the Internet. Internet connections
are generally faster than phone lines, and
allow transmission of 500 feet of a borehole
imaging log in 5 to 10 minutes.
In operation since 1991, the LogSAFE service now serves 70 clients and holds data
from 13 countries. As oil companies continue to look to vendors for data management, the LogSAFE service is expected to
expand from its core of North American
clients. Today it handles mostly logs and a
handful of seismic files, but is expected to
expand to archive many data types.
The Project and Application levels
To the geoscientist, the project data base is
like a desktop where much of the action
takes place. Relevant resources from the
master data base are placed in the project
data base. To work on project data, users
move it from the project level into the application software. The process of interpretation, therefore, involves frequent exchange
cohesion with other team members and a sense of ownership of the data. The advantage of organizing geotechnicians by discipline (seismic, paleontology, logs and so on) is a deeper understanding of each data type and better consistency in naming conventions and data structure. The optimal choice is not so obvious. Complicating the picture is the changing profile of the typical geotechnician. Ten or 15 years ago, a geotechnician held only a high school diploma and often preferred to develop an expertise in one data type. Today, many have master’s degrees and prefer a job with greater diversity that may move them toward a professional geoscience position. Geotechnicians today are increasingly involved in broader matters of data management, including the operation of networks and interapplication links.

Whichever way Unocal moves, Green says, the company is addressing a change in data culture. “We need both technicians and geoscientists who understand what to do with a relational data base,” Green says, “who understand the potential of interleaving different kinds of data. The future belongs to relational thinkers and that’s what we need to cultivate.”

[Figure: Unocal’s data flow and division of labor. Vendor data are initially kept separate from Unocal data, mainly for quality. Diagram labels: E&P Data Acquisition; Data gathered by Unocal (well info, geopolitical); Data Purchased (Tobin, Petroleum Information, Petroconsultants, seismic surveys); Data Management; Local Information Systems Group (load, periodic updates, quality check, load); Load project; Regional Data (exploration, business units); Vendor Data; data base; Application data bases; Archive Data Base.]
between the project and application levels,
and when a team of geoscientists arrives at
an interpretation and wants to hold that
thought, it is held in the project data base.
The distinction between the levels is not
rigid. For example, software that works at
the project level may also overlap into the
master and application levels. What sets
apart project database systems is where they
lie on the continuum of master-project-application. Two examples are the Finder
system, which manages data at the project
and master levels, and the GeoFrame system, which manages project data for interpretation applications.
The Finder system wears two hats. It functions as a data base, holding the main types
of E&P data, and as a data access system,
providing a set of programs to access and
view data (below ).11 The Finder system consists of three main components: the data,
interfaces to the data, and the underlying
data model, which organizes the data.
The data bases reside in two places.
Inside the relational data base management
system are parametric data, which are single
pieces of data, such as those in a log header.
Usually, relational data bases like the one
Finder uses (ORACLE), cannot efficiently
handle vector data, also called bulk data.
These are large chunks of data composed of
smaller pieces retrieved in a specific order,
such as a sequence of values on a porosity
curve or a seismic trace. For efficiency and
resource minimization, vector data are
stored outside the relational structure, and
references to them are contained in the data
base management system. Data bases are
[Figure legend, items 1 and 2:
1 Well data are an encyclopedic description of the well. The 50-plus general parameters include well name, location, owner, casing and perforation points, well test results and production history.
2 Logs are represented either as curves or in tabular form. Multiple windows can be displayed for crosswell correlations. Log data are used to select tops of formations and are then combined with seismic “picks” to map horizons, one of the main goals in mapping.
Diagram labels: Finder Tool Kit; Database Access (Imbedded SQL); Bulk Data holding 1 (Well), 2 (Log), 3 (Seismic), 4 (Lease), 5 (Cultural).]
typically set up for the most common types
of data—well, log, seismic, lease and cultural, such as political boundaries and
coastlines—but other data types and applications can be accessed. The Finder system
also includes tools for database administrators to create new projects, new data types,
new data base instances and determine
security. Users can move data between the
master and project levels.
The component most visible to the user is
the interface with the data base and its content. There are several types of interfaces.
The most common is an interactive map or
cross section (next page ). Another common
interface is the ORACLE form. A typical form
looks like a screen version of the paper
form used to order data from a conventional library (page 44, top ). In the future,
[Figure: Structure of the Finder system. The system lets the geoscientist work with five kinds of bread-and-butter data: well, log, seismic, lease and cultural. Diagram labels: User Interface (MOTIF); Graphics (Graphical Kernel System); Operating System (Sun/OS); Transmission Control Protocol/Internet Protocol; Relational Database Management Server holding 1 (Well), 2 (Log), 3 (Seismic), 4 (Lease), 5 (Cultural).
Figure legend, items 3 to 5:
3 Seismic includes the names of survey lines, who shot them, who processed the data, how it was shot (influencing the “fold”) and seismic navigation data. Also listed are “picks,” the time in milliseconds to a horizon, used to draw a subsurface contour map through picks of equal value.
4 Lease data include lease expiration, lessee and lessor and boundaries.
5 Cultural data include coastlines, rivers, lakes and political boundaries.]
forms will also be multimedia, merging
text, graphics and maps. The user also
interacts with the data base through
unloaders—utilities that convert data from
an internal representation to external, standard formats—and through external applications, like a Lotus 1-2-3 spreadsheet.
A more powerful interface is Standard
Query Language (SQL—often spoken
“sequel”), which is easier to use than a programming language but more difficult than
spreadsheet or word processing software.12
The Finder system includes at least 40 common SQL queries, but others that will be
used often can be written and added to a
menu for quick access, much like a macro in
PC programming. Often, a database administrator is the SQL expert and works with the
geoscientist to write frequently used SQL
queries. A point-and-click interactive tool for
building SQL queries is also available for
users with little knowledge of SQL.
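To make the idea of a menu of stored queries concrete, here is a minimal sketch that uses Python’s built-in sqlite3 module in place of ORACLE. The table layout, the sample rows, the `QUERY_MENU` dictionary and the `run_menu_query` helper are assumptions invented for illustration; they are not part of the Finder system. The stored query mirrors the example quoted in the footnote on relational data bases: wells with porosity above 20% that also contain coccolithus.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE well (well_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE porosity (well_id INTEGER, percent REAL);
    CREATE TABLE fossil (well_id INTEGER, genus TEXT);
    INSERT INTO well VALUES (1, 'A-1'), (2, 'B-2'), (3, 'C-3');
    INSERT INTO porosity VALUES (1, 24.0), (2, 12.5), (3, 22.0);
    INSERT INTO fossil VALUES (1, 'Coccolithus'), (2, 'Coccolithus');
""")

# A "menu" of frequently used queries, written once by the SQL expert.
# Parameters are left as placeholders so the geoscientist supplies values only.
QUERY_MENU = {
    "porous wells with fossil": """
        SELECT DISTINCT w.name
        FROM well w
        JOIN porosity p ON p.well_id = w.well_id
        JOIN fossil f ON f.well_id = w.well_id
        WHERE p.percent > ? AND f.genus = ?
        ORDER BY w.name
    """,
}


def run_menu_query(label, *params):
    """Look up a stored query by its menu label and run it with the given values."""
    return [row[0] for row in conn.execute(QUERY_MENU[label], params)]


result = run_menu_query("porous wells with fossil", 20.0, "Coccolithus")
```

The join across three tables is the step a flat-file system cannot perform; storing the statement under a label spares the user from retyping it.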
The less visible part of the Finder system,
but its very foundation, is the data model.
Today, the Finder data model is based on
the second and third versions of a model
called the Public Petroleum Data Model
(PPDM), which was developed by a nonprofit association of the same name.13 To
satisfy new functionalities, the Finder model
has evolved beyond the PPDM in some
areas, notably in handling seismic and
stratigraphic data. The evolutionary path is
moving the Finder model toward the Epicentre/PPDM model, which is a merger of
the PPDM model with Epicentre, a model
developed by another industry consortium,
the Petrotechnical Open Software Corporation (POSC).
In the last two years, the Finder system
has gone through several cycles of reengineering, making it more robust, more flexible and easier to use. Updates are released
twice a year. Since 1993, for example,
about 50 new functions have been added.
Four changes most notable to the user are:
• easier user interface. Full compliance
with MOTIF, a windowing system, speeds
access to data, and makes cutting and
pasting easier.
11. For an accessible review of data bases and database
management systems: Korth HF and Silberschatz A:
Database System Concepts. New York, New York,
USA: McGraw-Hill, Inc., 1986.
12. This language, developed by IBM in the 1970s, is for
searching and managing data in a relational data
base. It is the database programming language that
is the most like English. It was formerly called
“Structured Query Language,” but the name change
reflects its status as a de facto standard.
13. PPDM Association: “PPDM A Basis for Integration,”
paper SPE 24423, presented at the Seventh SPE
Petroleum Computer Conference, Houston, Texas,
USA, July 19-22, 1992.
Two interfaces of the Finder system. The most popular interfaces are graphical.
Two examples are the SmartMap software (top) and the Cross Section module (bottom). SmartMap software positions information from the data base on a map, and
allows selection of information from the data base by location or other attributes.
The user can tap into the data base by point-and-click selection of one or more
mapped objects, such as a seismic line or a well. The Cross Section module is used
to post in a cross-section format depth-related data, such as lithology type, perforations and, shown here, formation tops. Wells are selected graphically or by SQL
query, and can be picked from the base map.
[Figure: screen capture of an ORACLE scout-ticket form for the well HUSKY ET AL CECIL 16-20-84-8. Visible fields include status (FLOW OIL), field (CECIL), latitude and longitude, operator, KB and ground elevations, TD, and counts of logs, cores, casings, completions, DSTs, formations and producing zones.]
An ORACLE form for a scout ticket. The 80 types of forms preprogrammed into the Finder system do more than
display data. They include small programs that perform various functions, such as query a data base and
check that the query is recognizable—that the query is for something contained in the data base. The programs
also check that the query is plausible, for instance, that a formation top isn’t at 90,000 feet [27,400 m]. Forms
may perform some calculation. The well form, for example, contains four commonly used algorithms for the
automated calculation of measured and true vertical depth. Forms can be accessed by typing a query or clicking on a screen icon. A form showing all parameters available for a well, for example, can be accessed through
the base map by clicking on a symbol for a well.
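The two kinds of small program mentioned in this caption, a plausibility check and a depth calculation, can be illustrated with a short Python sketch. The article does not name the four depth algorithms built into the well form, so the average-angle method below is an assumption chosen only because it is a widely known survey calculation; the depth limit and the function names are likewise hypothetical.

```python
import math

# Hypothetical upper bound for a believable formation top, in feet.
MAX_PLAUSIBLE_DEPTH_FT = 40000.0


def check_formation_top(depth_ft):
    """Reject depths that are negative or deeper than the assumed limit."""
    if not 0.0 <= depth_ft <= MAX_PLAUSIBLE_DEPTH_FT:
        raise ValueError(f"implausible formation top: {depth_ft} ft")
    return depth_ft


def true_vertical_depth(stations):
    """Average-angle estimate of true vertical depth below the first station.

    stations is a list of (measured_depth, inclination_degrees) pairs sorted
    by measured depth. Each interval contributes its length times the cosine
    of the mean inclination of its two end stations.
    """
    tvd = 0.0
    for (md1, inc1), (md2, inc2) in zip(stations, stations[1:]):
        mean_inclination = math.radians((inc1 + inc2) / 2.0)
        tvd += (md2 - md1) * math.cos(mean_inclination)
    return tvd


vertical = true_vertical_depth([(0.0, 0.0), (1000.0, 0.0)])
deviated = true_vertical_depth([(0.0, 60.0), (100.0, 60.0)])
```

In a vertical hole measured depth and true vertical depth coincide; in a hole held at 60 degrees from vertical, each 100 units of measured depth adds only about 50 units of vertical depth.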
[Figure: The GeoFrame reservoir characterization system. Diagram labels: Log Editing & Marker Interpretation; Model Building; Data Management; Geophysical Interpretation; 3D Visualization and Interpretation; status notes “Available year end” and “In next major release.”]
The GeoFrame system is a suite of modular applications that are tightly linked via a common data base and package of software that handles intertask communication. This software automatically makes any change in the data base visible to each application. It also allows the user to communicate directly between applications. For example, a cursor in a cross-section view will highlight the well in the map.
• data model expansion. The data model
now accommodates stratigraphic and production data and improves handling of
seismic data.
• mapping and cross-section enhancements. As the data model expands, graphical capabilities expand to view new
kinds of data. Enhancements have been
made to cross-section viewing, and additions include 3D seismic overlay and
bubble mapping. A bubble is a circle on a
map that can graphically display any data
in the data base. A circle over a well, for
example, can be a pie chart representing
recoveries from a drillstem test.
• enhanced data entry. All data loaders
have the same look and feel. Loaders
have been added that handle data in Log
Information Standard (LIS), generalized
American Standard Code for Information
Interchange (ASCII) and others.
Unlike the Finder system, which works
mainly on the master or project level, the
GeoFrame reservoir characterization system
works partly on the project level but mainly
on the application level. The GeoFrame system comprises control, interpretation and
analysis software for a range of E&P disciplines. The system is based on the public
domain Schlumberger data model, which is
similar to the first version of Epicentre and
will evolve to become compliant with the
second version. Applications today cover
single-well analysis for a range of topics:
petrophysics, log editing, data management,
well engineering and geologic interpretation
(previous page, bottom ). Additions expected
by the end of the year include mapping with
grids and contours, model building and 3D
visualization of geologic data.
GeoFrame applications are tightly integrated with a single data base, meaning,
most importantly, that there is only one
instance of data. From the user’s perspective, tight integration means an update
made in one application can become automatically available in all applications.
(Loose integration means data may reside in
several places. The user must therefore
manually ensure that each instance is
changed during an update). Project data
management is performed within GeoFrame
applications, but it may also be performed
by the Finder system, which is linked today
to the GeoFrame system via the Geoshare
standard (see “Petrophysical Interpretation,”
page 13 ).
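Tight integration, as described above, amounts to one instance of the data plus a mechanism that tells every application when that instance changes. The following Python sketch shows the idea with a simple publish-and-subscribe store. The class names, the key format and the sample value are hypothetical and do not describe how GeoFrame software is actually built.

```python
class ProjectDataStore:
    """Single shared store; applications subscribe to be told of changes."""

    def __init__(self):
        self._data = {}
        self._subscribers = []

    def subscribe(self, callback):
        self._subscribers.append(callback)

    def update(self, key, value):
        # There is only one copy of the value, so one write is enough.
        self._data[key] = value
        for callback in self._subscribers:
            callback(key, value)

    def get(self, key):
        return self._data[key]


class Application:
    """Stand-in for an interpretation application attached to the store."""

    def __init__(self, name, store):
        self.name = name
        self.store = store
        self.seen = []  # changes this application has been notified of
        store.subscribe(self.on_change)

    def on_change(self, key, value):
        self.seen.append((key, value))


store = ProjectDataStore()
mapping = Application("mapping", store)
petrophysics = Application("petrophysics", store)

# An edit made through the store is visible to every attached application.
store.update("well_7/formation_top", 1130.0)
```

With loose integration each application would hold its own copy, and the loop over subscribers would be replaced by a person remembering to repeat the edit in every copy.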
[Figure: Pair-wise versus data-bus linkage of applications.
Pair-wise Links: Six applications, 15 links; an additional application means six new links. If vendor A program is modified, five links may need modification; who maintains the link: Vendor A? Vendor B?
Geoshare Links (Geoshare Data Bus): Six applications, six links; an additional application requires only one new link. If vendor A program is modified, only A half-links need modification. Each vendor maintains its own half-links.]
Razing the Tower of Babel—Toward
Computing Standards
Moving data around the master, project and
application levels is complicated by proprietary software. Proprietary systems often
provide tight integration and therefore fast,
efficient data handling on a single vendor’s
platform. But they impede sharing of data
by applications from different vendors, and
movement of data between applications
and data bases. Nevertheless, many proprietary systems are entrenched for the near
term and users need to prolong the usefulness of these investments by finding a way
to move data between them.
A common solution is called pair-wise
integration—the writing of a software link
between two platforms that share data. This
solution is satisfactory as long as platform
software remains relatively unchanged and
the number of linked platforms remains
small. Unocal, for example, maintains five
such links at its Energy Resources Division
in Sugar Land, Texas (see “Case Studies,”
page 38 ). A revision to one of the platforms,
however, often requires a revision in the
linking software. With more software
updates and more platforms, maintenance
of linking programs becomes unwieldy and
A solution that sidesteps this problem is
called a data bus, which allows connection
of disparate systems without reformatting of
data, or writing and maintaining many linking programs (above ). A data bus that has
received widespread attention is the
Geoshare standard, which was developed
jointly by GeoQuest and Schlumberger and
is now owned by an independent industry
consortium of more than 50 organizations,
including oil companies, software vendors
and service companies. The Geoshare system consists of standards for data content
and encoding—that is, the kind of information it can hold, and the names used for that
information. It also includes programs called
half-links for moving data to and from the
bus (right ). Geoshare uses a standard for
data exchange—which defines what data
are called and the sequence in which they
are stored—that is an industry standard
called API RP66 (American Petroleum Institute Recommended Practice 66). The
Geoshare data model is evolving to become
compatible with the Epicentre data model
and its data exchange format, which is a
“muscled” version of the Geoshare standard.
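The maintenance argument for a data bus over pair-wise links comes down to simple counting: every pair of applications needs its own link, whereas a bus needs one connection per application. A short Python check, with function names chosen here for illustration, reproduces the figures quoted for six applications.

```python
def pairwise_links(n_apps):
    """Number of links when every pair of applications is connected directly."""
    return n_apps * (n_apps - 1) // 2


def bus_links(n_apps):
    """Number of connections when each application attaches once to a data bus."""
    return n_apps


six_pairwise = pairwise_links(6)                         # links among six applications
added_pairwise = pairwise_links(7) - pairwise_links(6)   # cost of a seventh application
six_bus = bus_links(6)
added_bus = bus_links(7) - bus_links(6)
```

The pair-wise count grows with the square of the number of applications, while the bus count grows in step with it, which is why the gap widens quickly as platforms are added.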
Half-links are programs that translate data
into and out of the Geoshare exchange format. There are sending and receiving half-links. The sending half-link translates data
from the application format into a format
defined by the Geoshare data model, and
the receiving half-link translates data from
the Geoshare format into a format readable
by the receiving application. Schlumberger
has written half-links for all its products, and
many half-links are now being developed
by other vendors and users. As of July this
[Figure: How Geoshare linkage works. Loosely integrated applications (right) are linked with each other, with tightly linked applications (left), and with a project or master data base. GeoFrame applications are tightly integrated, like the system on the top left. Diagram labels: Loose Integration of Interpretation Applications; Tight Integration of Applications; Data Store (three instances); Application Data Interface; Data Base; Geoshare Data Bus (Data Exchange Standard and Data Model); Project or Master Data Base.]
year, 20 half-links were commercially available. This number may double by year end.
There are significant differences between
sending and receiving half-links, and in the
effort required to write them. The sending
half-link usually has to just map the application data to the Geoshare data model. This
is because the Geoshare data model is
Major Geoshare Data Model
• Seismic survey
• Wavelet
• Velocity data
• Fault trace set
• Lithostratigraphy code list
• Map
• Land net list
• Zone list
• Field
• Surface set
• Drilling facility
broad enough to accept many ways of
defining data—like a multilingual translator,
it understands almost anything it is told. The
receiving half-link, however, needs to translate the many ways that data can be put in
the Geoshare model into a format understandable by the receiving application. Look
at a simple example.
What is Object-Oriented, Really?

The term “object-oriented” has become increasingly misunderstood as its usage expands to cover a broad range of computer capabilities. The meaning of object-oriented changes with the noun it precedes: object-oriented interface, graphics and programming. What makes them all object-oriented is that they deal with “objects,” usually a cluster of data and code, treated as a discrete entity. The lumping together of data is called encapsulation.

An object-oriented interface is the simplest example. It is a user interface in which elements of the system are represented by screen icons, such as on the Macintosh desktop. Applications, documents and disk drives are represented by and manipulated through visual metaphors for the thing itself. Files are contained in folders that look like paper folders with tabs. Files to be deleted are put in the trash, which is itself an object, a trash can or dust bin. Object-oriented displays do not necessarily mean there is object-oriented programming behind them.

The object-oriented interface to data, at a higher level, involves the concept of retrieving information according to one or more classes of things to which it belongs. For example, information about a drill bit could be accessed by querying “drill bit” or “well equipment” or “parts inventory.” The user doesn’t need to know where “drill bit” information is stored to retrieve it. This is a breakthrough over relational database queries, where one must know where data are stored to access them.

When software developers talk to geoscientists about the power of an “object-oriented program,” they are often referring to object-oriented graphics, which concerns the construction and manipulation of graphic elements commonly used in mapping, such as points, lines, curves and surfaces. Object-oriented graphics describes an image component by a mathematical formula. This contrasts with bit-mapped graphics, in which image components are mapped to the screen dot by dot, not as a single unit. In object-oriented graphics, every graphic element can be defined and stored as a separate object, defined by a series of end points. Because a formula defines the limits of the object,
it can be manipulated (moved, reduced, enlarged, rotated) without distortion or loss of resolution. These are powerful capabilities in analyzing representations of the earth and mapped interpretations.

take only six weeks when done in OOP. It is analogous to modular construction in housing, in which stairs, dormers and roofs are prefabricated, cutting construction time in half.

Further reading:
Tryhall S, Greenlee JN and Martin D: “Object-Oriented Data Management,” paper SPE 24282, presented at the SPE European Petroleum Computer Conference, Stavanger, Norway, May 25-27, 1992.
Suppose you want to send a seismic interpretation from a Charisma workstation to
the Finder system. Both the Finder and
Charisma systems recognize an entity called
“well location.” In the Charisma data
model, well location is defined by X and Y
coordinates and by depth in one of several
units, such as decimeters. The Finder data
model also defines well location by X and Y
coordinates, but it does not yet recognize
decimeters. The Finder model understands
depth as time (seconds or milliseconds),
inches, feet or meters (above). The Geoshare
data model can understand all units of
depth, so the Charisma sending half-link
therefore has only to send decimeters to the
Geoshare bus. The Finder receiving half-link,
however, must recognize that depth can be
quantified six ways—feet, inches, meters,
decimeters, seconds and milliseconds. In this
case, the Finder receiving half-link provides
the translation of decimeters into a unit
understandable by the Finder system.
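The division of labor in this example can be sketched in Python. The sending side simply labels the value with its unit; the receiving side carries the burden of converting anything the target system does not understand. The record layout, the unit abbreviations and the function names are assumptions made for illustration and are not drawn from the Geoshare specification or from either product.

```python
# Length conversions to meters. The foot and inch values are exact by definition.
TO_METERS = {"m": 1.0, "dm": 0.1, "ft": 0.3048, "in": 0.0254}

# Depth units the text says the Finder model understands.
FINDER_UNITS = {"m", "ft", "in", "s", "ms"}


def send_well_location(x, y, depth, unit):
    """Sending half-link: pass the value through with its unit attached."""
    return {"entity": "well_location", "x": x, "y": y, "depth": depth, "unit": unit}


def receive_well_location(record):
    """Receiving half-link: convert unsupported length units to meters."""
    depth, unit = record["depth"], record["unit"]
    if unit not in FINDER_UNITS:
        if unit not in TO_METERS:
            raise ValueError(f"unrecognized depth unit: {unit}")
        depth, unit = depth * TO_METERS[unit], "m"
    return {**record, "depth": depth, "unit": unit}


# A depth of 15,000 decimeters leaves the sender unchanged and arrives in meters.
sent = send_well_location(5000.0, 7000.0, 15000.0, "dm")
received = receive_well_location(sent)

# A depth expressed as two-way time is already understood and passes through.
in_time = receive_well_location(send_well_location(5000.0, 7000.0, 850.0, "ms"))
```

Note that all the conditional logic lives in the receiver, which is one reason a receiving half-link takes longer to write.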
Writing a receiving half-link typically
takes six to nine months, which is an order
of magnitude longer than writing a sending
half-link. The main difficulty in writing a
receiving half-link is that it attempts to act
like a large funnel. It tries to translate data
from Geoshare’s broad data model into the
application’s focused data model, so it has
to consider many possible ways that data
can be represented in the Geoshare model.
The ideal receiving half-link would look at
every way a fact could be represented, and
translate each possibility into the variant
used by the receiving application. For
example, many interpretation systems use a
variant of a grid structure to represent surfaces. Such surfaces may represent seismic
reflectors or the thickness of zones of interest. Geoshare provides three major representations for such surfaces: grids, contours
and X, Y and Z coordinates. If the receiver
looks only at incoming grids, the other two
kinds of surface information may be lost.
The policy taken by a receiver should be to
look for any of these possibilities, and translate into the internal representation used by
the applications.
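The policy recommended here, accept every representation and translate each into the application’s own form, can be expressed as a small dispatcher. In the Python sketch below the internal form is a plain list of X, Y, Z points, chosen only to keep the example short; a real application would more likely regrid the points. The message layout and all names are hypothetical and are not taken from the Geoshare data model.

```python
def grid_to_points(origin_x, origin_y, spacing, rows):
    """Expand a regular grid (a list of rows of Z values) into XYZ points."""
    return [
        (origin_x + col * spacing, origin_y + row * spacing, z)
        for row, values in enumerate(rows)
        for col, z in enumerate(values)
    ]


def contours_to_points(contours):
    """Each contour is (z_value, [(x, y), ...]); attach Z to every vertex."""
    return [(x, y, z) for z, line in contours for x, y in line]


def receive_surface(message):
    """Accept any of the three representations rather than grids alone."""
    kind = message["kind"]
    if kind == "grid":
        return grid_to_points(
            message["origin_x"], message["origin_y"],
            message["spacing"], message["rows"],
        )
    if kind == "contours":
        return contours_to_points(message["contours"])
    if kind == "xyz":
        return list(message["points"])
    raise ValueError(f"unsupported surface representation: {kind}")


from_grid = receive_surface({
    "kind": "grid", "origin_x": 0, "origin_y": 0, "spacing": 10,
    "rows": [[1, 2], [3, 4]],
})
from_contours = receive_surface({
    "kind": "contours", "contours": [(100, [(0, 0), (1, 1)])],
})
from_xyz = receive_surface({"kind": "xyz", "points": [(5, 5, 42)]})
```

A receiver that handled only the first branch would silently discard surfaces sent as contours or as scattered points, which is the loss the text warns against.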
[Figure: Sending and receiving Geoshare half-links. Conversion of data (here, units of depth) to a form understandable by the receiving application is the job of the receiving half-link between Geoshare and the Finder system. A well location can be a point with reference to one of several features, such as the kelly bushing, a formation top or a seismic surface. Panel titles: The Problem; The Solution. Diagram labels: Well location; to meters.]
“Object-oriented” to a programmer or software developer tends to mean object-oriented programming (OOP). It does not necessarily imply new functionalities or a magic bullet. Its most significant contribution is faster software development, since large chunks of existing code can often be reused with little modification. Conventional sets of instructions that take eight weeks to develop may ...

The next wave in data bases themselves is a shift toward object-oriented programming, in which the user can define classes of related elements to be accessed as units. This saves having to assemble the pieces from separate relational tables each time, which can cut access time. POSC’s Epicentre model uses essentially relational technology but adds three object-oriented concepts:
• unique identifiers for real-world objects
• complex data types to reflect business usage
• class hierarchy for the drill bit example above.
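The three concepts listed above can be sketched in a few lines of code. This is an illustration only and is not drawn from the Epicentre specification: the class names `Quantity`, `Equipment` and `DrillBit`, their fields, and the counter used to issue identifiers are all hypothetical.

```python
from dataclasses import dataclass, field
from itertools import count

# Hypothetical source of unique identifiers for real-world objects.
_next_id = count(1)


@dataclass(frozen=True)
class Quantity:
    """A complex data type: a value that always travels with its unit."""
    value: float
    unit: str


@dataclass
class Equipment:
    """Base class; every instance receives its own identifier."""
    name: str
    object_id: int = field(default_factory=lambda: next(_next_id), init=False)


@dataclass
class DrillBit(Equipment):
    """A subclass in the hierarchy, adding an attribute specific to bits."""
    diameter: Quantity
```

Two bits created with identical names and diameters, for example `DrillBit("bit", Quantity(8.5, "inches"))` called twice, still carry different `object_id` values, which is the point of the first concept.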
July 1994
Zdonik SB and Maier D (eds): “Fundamentals of Object-Oriented Databases,” in Readings in Object-Oriented Database
Systems. San Mateo, California, USA: Morgan Kaufmann
Publishers, Inc., 1990.
Nierstrasz O: “A Survey of Object-Oriented Concepts,” in
Kim W and Lachovsky FH (eds): Object-Oriented Concepts,
Databases, and Applications. New York, New York, USA:
ACM Press, 1989.
Williams R and Cummings S: Jargon: An Informal Dictionary of Computer Terms. Berkeley, California, USA: Peachpit
Press, 1993.
Aronstam PS, Middleton JP and Theriot JC: “Use of an
Active Object-Oriented Process to Isolate Workstation Applications from Data Models and Underlying Databases,” paper
SPE 24428, presented at the Seventh SPE Petroleum Computer Conference, Houston, Texas, USA, July 19-22, 1992.
Figure: Milestones in POSC efforts.
• POSC established as consortium funded by 35 companies
• Request for technology (RFT) for data model
• RFT for data access protocol
• Request for comments (RFC) for data access
• Published series of snapshots*: Epicentre, Data Access & Exchange, Exchange Format
• Publication of Base Computer Standards (BCS) version 1.0
• RFC for User Interface Style Guide
• Published complete Epicentre “root model” snapshot
Milestones in POSC efforts (continued):
• Published complete series of final specifications version 1.0
• Member organizations number 85
• Published BCS version 2.0 snapshot
• Helped start more than 20 migrations to BCS
• Obtained PPDM agreement to [...] with merger
• Started Internet information server, made project proposals and reviews more democratic
• Published Epicentre, Data Access & Exchange, Exchange Format, User Interface Style Guide (Prentice-Hall)
• Will complete PPDM merger, publish Epicentre version 2.0
• Will publish Data Access & Exchange version 1.1
• Will publish BCS version 2.0
• ShareFest: An exhibition of products sharing data from oil companies
• Published Conformance Statement Template snapshot
* A snapshot is an interim version, like an alpha or beta test in engineering.

The Geoshare standard can link two programs if they can both talk about the same data, even if not in the same vocabulary, such as the Finder and Charisma systems talking about “well location.” But two programs have a harder time communicating if they don’t understand the same concept. By analogy, a Bantu of equatorial and southern Africa could not talk with a Lapp about snow since the Bantu have no word for snow. In the E&P world, this problem of a missing concept happens about a third of the time when linking applications through the Geoshare standard. But there is a way around it. Look again at an example.

The Finder system and IES Integrated Exploration System, which is an interpretation system, both have conventions for designating wells. For the IES system, a well has a number, which can be determined in various ways, and a nonunique name. For the Finder system, a well is known by a specific kind of number, its API number. The Geoshare data model accepts all conventions. When a formation top is picked in the IES system, the top is labeled by well name and number. If it were shipped that way to the Finder system through Geoshare, the Finder system would not recognize the well because it is looking for an API number. The solution is to add a slot in the IES data model for the API number. The IES system itself does not use this slot, but when it sends data to the Finder system, the IES sending half-link adds the API number to the message, enabling the Finder system to recognize the well.
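The extra slot for the API number can be pictured with a small sketch. None of it comes from the actual IES, Finder or Geoshare software: the message layout, the field names and the two functions are assumptions made for illustration, and the API number used in any example call would be fictitious.

```python
def ies_sending_half_link(top_pick, well_record):
    """Build an outgoing formation-top message, adding the API number if stored."""
    message = {
        "well_name": well_record["name"],
        "well_number": well_record["number"],
        "top_name": top_pick["name"],
        "depth": top_pick["depth"],
    }
    # The sending application never reads this slot itself; it exists only
    # so that a receiver keyed on API numbers can identify the well.
    api_number = well_record.get("api_number")
    if api_number is not None:
        message["api_number"] = api_number
    return message


def finder_receiving_half_link(message, known_api_numbers):
    """Return True only if the well in the message matches a known API number."""
    return message.get("api_number") in known_api_numbers
```

With the slot filled, the receiver can match the well. Without it, the same formation top arrives labeled only by name and number and goes unrecognized.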
While the Geoshare standard is designed
to function as a universal translator, the E&P
community is working toward a common
data model that will eventually eliminate
the translator, or at least reduce the need for
one. Recognition of the need for a common
data model and related standards arose in
the 1980s as technical and economic forces
changed geoscience computing.14 The rise
of distributed computing made interdisciplinary integration possible, and flat oil
prices shrank the talent pool within oil companies, resulting in smaller, interdisciplinary
work groups focused on E&P issues rather
than software development.
When work on information standards
accelerated in the late 1980s, much of the
industry support went to a nonprofit consortium, the Houston-based Petrotechnical
Open Software Corp (POSC). An independent effort in Calgary, Alberta, Canada had
been established by another nonprofit consortium, the Public Petroleum Data Model
Association. Although the PPDM effort was
on a smaller scale, in 1990 it was first to
release a usable E&P database schema,
which has since been widely adopted in the
E&P community.15
At its inception, the PPDM was generally
regarded outside Canada as a local, Calgary
solution for migrating from a mainframe to a
client/server architecture. The PPDM is a
physical implementation, meaning it defines
exactly the structure of tables in a relational
data base. The effort at POSC is to develop
a higher level logical data model, concentrating less on specifics of table structure
and more on defining rules for standard
interfacing of applications.16 By analogy, the
PPDM is used as blueprints for a specific
house, whereas POSC attempts to provide a
set of general building codes for erecting
any kind of building.
Since 1991, widespread use and maturation of the PPDM—the data model is now in
version 3—has increased its acceptance.
This has changed POSC’s strategy. Today,
POSC is preparing two expressions of its
Epicentre high-level logical data model, one
that is technology-independent, implementable in relational or object-oriented
databases (see “What is Object-Oriented—
Really?” page 46), and another one to
address immediate usage in today’s relational database technology. The logical
model is called the Epicentre hybrid
because it adds some object-oriented concepts to a relational structure, although it
leans heavily to the relational side. The first
version of the hybrid was released in July
1993. The relational solution, to be released
this summer, is POSC’s merger of the hybrid
and PPDM models, and is called informally
the Epicentre/PPDM model. This model
allows the few object-oriented features of
the hybrid model to be used by relational
software (right). The merged model is being
tested by a task force comprising PPDM,
POSC and several oil companies.
Where is POSC Going?
After two years of developing standards, last
summer POSC released version 1.0 of its
Epicentre data model. Since then, Epicentre
has undergone review, validation, convergence with the PPDM, and use by developers in pilot studies and in commercial products. Release of Epicentre 2.0, which POSC
considers an industrial-strength data model,
is due this autumn (previous page). Today
the Epicentre hybrid model as a whole
resides in a handful of database and commercial application products. Vendors have
written parts of the specifications into prototype software, which is increasing in number, ambition and stability.
POSC has learned hard lessons about
building a data model. “There are a lot of
very good local solutions,” says Bruno
Karcher, director of operations for POSC,
“that when screwed together make one big,
bad general solution. What we have learned
is how to take those local solutions, which
are flavored by local needs and specific
business disciplines, and find a middle
ground. We know that the Epicentre logical
model is not a one-shot magic bullet. It is
guaranteed to evolve.”
A key issue POSC is grappling with is the definition of “POSC-compliance,” which today remains unspecified. From the user’s perspective, “POSC-compliant” equipment should have interoperability: seamless interaction between software from different vendors and data stores. The simplest definition of compliance means “applications can access data from POSC data stores,” but the E&P community is still a long way from consensus on how this can be achieved practically. POSC has started building a migration path with grades of compliance. In these early stages, compliance itself may mean commitment to move toward POSC specifications over a given period.

Epicentre Logical Model and Data Access and Exchange Format
An information model based on data requirements and data flow characteristics of the E&P industry

Data Model Expressions of the Logical Model

Epicentre Hybrid
• Implemented with a relational or partially or fully object-oriented database management system (DBMS)
• Accessed via a POSC-specified application programming interface
• User accesses a physical data base that is not visibly different from the logical model
• A single data model for applications that have not yet been able to effectively use a relational DBMS, such as reservoir simulators and geographic information systems
• Opens a market for vendors of object-oriented DBMS and an evolutionary path for relational DBMS

Epicentre/PPDM
• Implemented with a relational DBMS and operating system or a DBMS bulk storage mechanism for large data sets
• Accessed via a vendor-supported interface and a POSC-specified application program interface for complex data types
• User accesses a physical data base that is equivalent to the logical model or visibly different from it
• Provides a set of highly similar relational data models that can be implemented with current database management system software
• Provides a market for vendor products that meet the specifications of the logical model

14. For a later paper summarizing the industry needs: Schwager RE: “Petroleum Computing in the 1990’s: The Case for Industry Standards,” paper SPE 20357, presented at the Fifth SPE Petroleum Computer Conference, Denver, Colorado, USA, June 25-28, 1990.
15. A schema is a description of a data base used by the database management system. It provides information about the form and location of attributes, which are components of a record. For example, sonic, resistivity and dipmeter might be attributes of each record in a log data base.
A consequence of having all data fit the
POSC model is disappearance of the application data store as it exists today—a proprietary data store, accessible only by the host
application. For this to happen means
rethinking the three levels—master, project
and application—and blurring or removing
the boundaries between them. Ultimately,
there may be a new kind of master data
base that will allow the user to zoom in and
out of the data, between what is now the
master and application levels. There will be
fewer sets of data, therefore less redundancy
and less physical isolation. At the conceptual level, applications that manipulate data
will no longer affect the data structure.
Applications and data will be decoupled.
“The problems of multiple data versions
and updating may disappear,” says Karcher,
“but we will be looking forward to new
problems, and as-yet unknown issues of
data administration, organization and security. I think even when standards are implemented in most commercial products,
POSC will continue to play a role in bringing together a body of experience to solve
those new problems.”
16. Chalon P, Karcher B and Allen CN: “An Innovative
Data Modeling Methodology Applied to Petroleum
Engineering Problems,” paper SPE 26236, presented
at the Eighth SPE Petroleum Computer Conference,
New Orleans, Louisiana, USA, July 11-14, 1993.
Karcher B and Aydelotte SR: “A Standard Integrated
Data Model for the Petroleum Industry,” paper SPE
24421, presented at the Seventh SPE Petroleum
Computer Conference, Houston, Texas, USA, July
19-22, 1992.
Karcher B: “POSC The 1993 Milestone,” Houston,
Texas, USA: POSC Document TR-2 (September
Karcher B: “Effort to Develop Standards for E&P Software Advances,” American Oil & Gas Reporter 36,
no. 7 (July 15, 1993): 42-46.