Chemical Manufacturing Ontology

CHEMICAL MANUFACTURING
ONTOLOGY
Chad Stahl
NOVEMBER 26, 2013
UNIVERSITY AT BUFFALO
IE 460/500 Special Topics
Table of Contents
Introduction
Purpose & Resources
Industry Class
Molecular Entity Class
Product Class
Product Class Example
Quality Standard Class
Object Properties
Issues and Solutions
Introduction
The chemical manufacturing industry is one of the largest industries in the modern world,
accounting for nearly $3 trillion in sales annually. The United States and European Union are the largest
producers of chemicals, which are used for a myriad of purposes, including but not limited to
consumer goods, agriculture, manufacturing, construction and other applications in the service
industries. This industry converts many natural raw materials, such as oil, natural gas, air, water and
minerals, into over 70,000 different products and applications. With such wide variation in both the
inputs and the potential outputs of this industry, an ontological classification of its products is
essential for quickly identifying important or critical elements of the manufacturing process or
supply chain. As such, the scope of this project is the development of a basic ontological classification of
the major sections of the chemical industry, based on the value and/or volume of products produced or
consumed.
Purpose & Resources
The purpose of this ontology project was to create what is, to my knowledge, the first ontology
based on the chemical manufacturing industry, as a means of providing an organized and universal format
for the classification of products. Because no Chemical Manufacturing Ontology previously existed,
the majority of the classes introduced are drawn from a general source (the Wikipedia
entry on the Chemical Industry and its sub-industries). However, some class elements were
derived from the Chemical Entities of Biological Interest Ontology (also denoted as the ChEBI Ontology)
in order to provide acceptable definitions, synonyms and chemical/biological abbreviations. ChEBI is a
freely available dictionary of natural or synthetic ‘small’ chemical and biological molecular entities,
which are used to intervene in the processes of living organisms. The ChEBI Ontology provided
insight into the chemical structuring of certain elements which other sources were unable to provide,
allowing for far more accuracy in the created ontology. Without the support provided by the ChEBI
Ontology, much of the work done on the Chemical Manufacturing Ontology would not have been
possible.
Another source which was critical to the development of the Chemical Manufacturing Ontology
is the Basic Formal Ontology (BFO), developed by Dr. Barry Smith and Pierre Grenon. BFO is an
upper-level ontology used for supporting information retrieval, analysis and integration. The Basic Formal
Ontology was extremely valuable in the creation of the Chemical Manufacturing Ontology, both as an aid in
cataloguing the chemical manufacturing classes and for the pre-existing structure which it
provides. The creation of the Chemical Manufacturing Ontology relied heavily on BFO for these
reasons, and I am grateful to Dr. Smith for providing information on BFO. The Chemical
Manufacturing Ontology used the ChEBI Ontology and BFO to classify and provide
better information about certain chemical and/or biological entities. However, it also drew
on information from other sources, such as webpages, in addition to these
existing ontologies, which keeps this ontology project distinct from them.
Industry Class
As shown in Figure 1 below, all of the major classes (Industry, Product, Role, Quality Standard and
Molecular Entity) sit inside the existing upper-level BFO classes. The Industry class contains
sub-classes for all chemical industries that produce, or are involved in the production of, an item in
the Product class. Several Object Properties (such as manufacturer_of, manufactures, and uses)
link these Industry classes to the Product classes which they produce or
utilize. The Chemical Industries can also be linked to the Quality Standards, listed under the general
Quality class, which they use for select Products. An example of this is the Quality Standard class
‘Rx_360Standard’, which is ‘used_by’ (Object Property) the PharmacologicalIndustry class, a sub-class of
‘HealthcareIndustry’. The same general approach can be taken for the ‘manufacturer_of’ and
‘manufactures’ Object Properties, allowing for an in-depth analysis of the chemical manufacturing
industry and the products relevant to it.
Figure 1 – Different Chemical Industries listed in the Ontology.
Some industries require product inputs in order to produce their desired outputs. An
example of this is the Industrial Gas products which are used to produce products for
the Petroleum Industry. This is an instance in which the ‘uses’ Object Property and its inverse ‘usedIn’
come into play, indicating which products are applied to which industries. Some Industries
may need to be added to this existing framework if the need arises, but this should be a relatively easy
process.
Molecular Entity Class
The next major class in the Chemical Manufacturing Ontology is the Molecular Entity class,
which deals with the molecular traits which a particular Product may have. The only sub-class of
Molecular Entity that currently exists is Ion, which covers the ionic traits of an entity: Anion (a species
carrying one or more elementary charges of the electron), Cation (a species carrying one or more
elementary charges of the proton) and Zwitterion (a neutral molecule carrying both a positive and a
negative electrical charge). Further items may be added to this class in the future, but for the current
product selection the current listing is
adequate. Shown below in Figure 2 is the expanded form of the Molecular Entity class:
Figure 2 – Showing the expanded class hierarchy of the Molecular Entity class.
The Molecular Entity class was based on the class of the same name in the ChEBI Ontology,
including definitions and other annotation and description information. While only basic Ion classes
were used, it is this paper's recommendation to incorporate other Molecular Entity sub-classes
from the ChEBI Ontology if possible. There is a wide range of possible classes which may be needed
once more complex chemical entities are added to the ontology, or once further depth in existing or new
classes is needed. However, as stated before, only those classes which were needed for the basic
completion of this ontology project have been added thus far.
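As an informal illustration of the three Ion sub-classes described above, the following Python sketch assigns a species to Anion, Cation or Zwitterion from a list of formal charges. The function name, the charge-list representation and the example species are assumptions made for this illustration; they are not part of the ontology or of ChEBI.

```python
from typing import List, Optional


def classify_ion(formal_charges: List[int]) -> Optional[str]:
    """Return the Ion sub-class name for a species, or None if it is not an ion.

    formal_charges lists the non-zero formal charges found on the species
    (a hypothetical representation chosen only for this sketch).
    """
    net_charge = sum(formal_charges)
    has_positive = any(charge > 0 for charge in formal_charges)
    has_negative = any(charge < 0 for charge in formal_charges)

    if net_charge < 0:
        return "Anion"        # net charge of one or more electrons
    if net_charge > 0:
        return "Cation"       # net charge of one or more protons
    if has_positive and has_negative:
        return "Zwitterion"   # neutral overall, but carries both charges
    return None               # neutral species with no charged sites


print(classify_ion([-1]))      # e.g. chloride
print(classify_ion([+1]))      # e.g. sodium ion
print(classify_ion([+1, -1]))  # e.g. glycine in its zwitterionic form
print(classify_ion([]))        # e.g. ethanol
```

A real classification would rely on the definitions imported from ChEBI rather than on a charge count, so this is only a reading aid for the three definitions.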
Product Class
Perhaps the largest and most important portion of the Chemical Manufacturing Ontology is
the Product class, with several dozen products currently listed in the ontology but room for
potentially every chemical product a user may wish to add. As the name denotes, this class contains the
multiple types and variations of chemical products which are created by the potential manufacturers
listed in the Industry class. The Product class contains definitions, synonyms, references and other
information about a myriad of chemical products, ranging from basic acids and soaps to specialty
agricultural and cleaning chemicals. An example of a small section of the Product hierarchy is shown in
Figure 3, detailing the complete class-path for a particular branch (this branch ending with the
sub-classes of FertilizerProduct), with the Annotation information detailing the specifics of the
OrganicFertilizerProduct class on the right.
The basic distinction between the four major types of chemical products lies in the use of
each product group (such as who will be using the end product), as well as the inputs that a particular
product requires for its creation. For example, any product which will usually only be provided to the general
public as a standard consumer good (such as things found at a supermarket: soaps, detergents,
vitamins, etc.) would be classified under the ‘ConsumerProduct’ class. These consumer products may
have chemical inputs from another class, such as the ‘BasicChemicalProduct’ class or the
‘SpecialtyChemicalProduct’ class, in order to form the final products which can be purchased by
any consumer. Alternatively, products which are found in the ‘SpecialtyChemicalProduct’ class are (as the name
suggests) highly specialized for use in a very specific industry. Examples of this include products of the
surfactant industry, which are found under the sub-class ‘SurfactantProduct’ and are generally only
produced by this specialized industry, though the end product could be used as an input for a wider
range of product classes. This format applies to the remainder of the Product class of the
ontology, with products divided among the four main groups and the end products of each group being
utilized in the creation of others.
The last group of products falls under the LifeScienceProduct class, which classifies products
provided to major industries dealing essentially in the creation or extension of life (whether it be
plant life, pet life or human life). These products can range from simple pesticides and other agricultural
chemicals to very complex products used in veterinary or healthcare industry applications. It
can be difficult to place certain products in this class accurately, due to a relaxed definition of what exactly
qualifies, and there is therefore some overlap with other industrial sectors (mainly the Specialty and Basic
Industrial sectors).
Figure 3 – The complete hierarchy for a sub-section of the Product class. Annotation information (label,
definition, synonym, etc.) is listed on the right.
Figure 4 – Annotation and Description information for the Organic Fertilizer Product class.
Product Class Example
As the Product class is by far the most important class in the ontology (it reflects the
outputs and inputs of the Chemical Manufacturing Industry and is therefore its most important aspect),
it is important to understand exactly how these products can be reflected in the ontology. How
a given product class can be used effectively, together with elements from other classes
(different products, industries and object properties), can be shown with the Product sub-class
‘Ethanethiol’, and the approach is easy to apply to a myriad of other Product sub-classes.
Ethanethiol is a chemical additive to certain petroleum industry products, particularly Liquefied Petroleum
Gas (commonly known as propane or butane, and reflected in its own Product sub-class
‘LiquefiedPetroleumGasProduct’). Its purpose is that of an aroma compound, allowing leaks of LPG
to be detected before any damage due to toxicity occurs. Here, therefore, we have one product
which serves entirely as an input to another product, used in a specific industry and application,
and which can easily be reflected using the existing resources created in the Chemical Manufacturing
Ontology.
When reflecting this in the ontology, many of the different classes and object
properties come into play. To start, Ethanethiol can be treated as an aroma compound and
would fall under that class, as its sole purpose is as an additive to some chemical compound in
order to serve as a warning. Next, we use the Object Property ‘usedIn’ to create the expression
“usedIn some LiquefiedPetroleumGasProduct”, reflecting the use of Ethanethiol in LPG as its aroma.
Viewing the ‘Ethanethiol’ class in the Protégé editor also shows that it uses two other
chemicals as its own inputs (ethylene and hydrogen sulfide, which are reacted together
to create ethanethiol); this is reflected by the ‘uses’ expression. From here, the
anonymous ancestor assertions show how the Product class ‘Ethanethiol’ matters further down the
line as an additive to LPG: it is an important aspect of some cooking role and it is used in some food
service industry, both of which reflect the use of LPG as a fuel in that industry. A role could also be
created for products similar to ethanethiol, such as a role for aroma compounds used solely as
a warning for dangerous chemical gases; this ontology did not create one, as ethanethiol was the only
product listed with that application. The same basic format can be applied to many of the chemical
products listed in this ontology, as the majority of products created in non-consumer roles typically
serve as inputs to manufacturers whose products cater to some consumer market further
downstream in the production process (creating a complicated system of inputs and outputs).
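The input and output chain described for Ethanethiol can be sketched informally in Python. This is a minimal illustration rather than the ontology's actual encoding: ‘Ethanethiol’ and ‘LiquefiedPetroleumGasProduct’ are class names from the ontology, while ‘Ethylene’, ‘HydrogenSulfide’ and ‘FoodServiceIndustry’ are hypothetical names assumed here, and the traversal assumes that ‘usedIn’ may be followed transitively.

```python
from collections import deque

# 'uses' assertions: product -> the inputs it is made from.
# Ethylene and HydrogenSulfide are hypothetical class names for this sketch.
USES = {
    "Ethanethiol": {"Ethylene", "HydrogenSulfide"},
}

# 'usedIn' assertions: product -> products or industries it feeds into.
# FoodServiceIndustry is a hypothetical class name for this sketch.
USED_IN = {
    "Ethanethiol": {"LiquefiedPetroleumGasProduct"},
    "LiquefiedPetroleumGasProduct": {"FoodServiceIndustry"},
}


def downstream(entity, used_in=USED_IN):
    """List everything reachable from entity by following 'usedIn' links."""
    reached = []
    queue = deque([entity])
    while queue:
        current = queue.popleft()
        # sorted() only makes the traversal order deterministic
        for target in sorted(used_in.get(current, ())):
            if target not in reached:
                reached.append(target)
                queue.append(target)
    return reached


print(sorted(USES["Ethanethiol"]))  # direct inputs of Ethanethiol
print(downstream("Ethanethiol"))    # where Ethanethiol ends up downstream
```

A reasoner would only draw the same conclusion if ‘usedIn’ were declared transitive or a suitable property chain were defined; the traversal above simply makes that assumption explicit.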
Quality Standard Class
The final critical section of the Chemical Manufacturing Ontology revolves around object and
molecular qualities, covered under the broad-ranging and fittingly titled ‘Quality’ main class. This class
contains the general qualities and quality standards used for defining or measuring the condition of
some particular thing. These qualities can then be tied to potentially every product listed in the ontology
through a pre-defined Object Property (set in the Object Property Hierarchy tab, which is reviewed
later in this paper). The expanded class hierarchy of the Quality class is shown below, in Figure 4, as
well as the Annotation information of a member of the Quality Standard class (the Rx-360 Standard used
in the Pharmaceutical Industry) in Figure 5. The Chemical Manufacturing Ontology is somewhat lacking in
sub-classes of the different Quality classes, and it is likely that anyone more knowledgeable in
their chemical field would add further quality standards. However, the basic structure could remain
unchanged and be used as the outline for any user wishing to add more detail to this class:
Figure 4 – The hierarchy of the Quality class, fully expanded.
Figure 5 – The Annotation and Description information for the Rx-360 Standard class, displaying the use
case of this particular quality standard.
Object Properties
The next step in the creation of the Chemical Manufacturing Ontology was to create a set of
Object Properties which can be applied to Products which have either a known manufacturer or
a known quality/quality standard. To cover each Product adequately, several of these properties were
taken from the ChEBI Ontology, while a few which were not covered in ChEBI were
added as needed. Figure 5 below shows the complete class hierarchy
for all of the Object Properties, with any Annotation information for the selected Object Property shown
on the right:
Figure 5 – Showing the complete class hierarchy of the Object Properties, with Annotation information
for the selected class (has_manufacturer) shown on the right. Description information (such as inverse
properties) is shown on the bottom-right.
In most cases, the ‘uses’ and ‘usedIn’ properties were applied to products which either use, or are used in,
some particular industry or other product, rather than introducing other Object Properties to denote this.
The hasRole and hasQualityStandard properties are rather self-explanatory, as are their inverses
(such as qualityStandardOf). No annotation or description information was added to these Object
Properties, as they were taken mainly from either BFO or ChEBI, neither of which
lists any further information in this regard. It should be noted that in the description of each Object
Property, if the property had an obvious inverse (as shown above with the ‘hasQualityStandard’ and
‘qualityStandardOf’ properties) then it was included in the description under “inverse Of”.
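The effect of declaring inverse properties can be illustrated with a small Python sketch. It is an informal approximation of what a reasoner does with an “inverse Of” declaration, not the ontology's own implementation; the triple representation, the function name and the class ‘ExampleDrugProduct’ are assumptions introduced for this illustration.

```python
# Inverse pairs named in this paper: uses/usedIn and
# hasQualityStandard/qualityStandardOf.
INVERSE_OF = {
    "uses": "usedIn",
    "hasQualityStandard": "qualityStandardOf",
}


def with_inverses(assertions, inverse_of=INVERSE_OF):
    """Return the asserted (subject, property, object) triples plus the
    triples implied by the declared inverse properties."""
    # Make the lookup symmetric so either direction can be asserted.
    both_ways = dict(inverse_of)
    both_ways.update({inv: prop for prop, inv in inverse_of.items()})

    asserted = set(assertions)
    result = set(asserted)
    for subject, prop, obj in asserted:
        if prop in both_ways:
            result.add((obj, both_ways[prop], subject))
    return result


triples = [
    ("Ethanethiol", "usedIn", "LiquefiedPetroleumGasProduct"),
    # ExampleDrugProduct is a hypothetical class used only for illustration.
    ("ExampleDrugProduct", "hasQualityStandard", "Rx_360Standard"),
]

for triple in sorted(with_inverses(triples)):
    print(triple)
```

Asserting only one direction of each pair keeps the ontology smaller, since the opposite direction can always be derived in this way.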
The Chemical Manufacturing Ontology is a thorough analysis of the chemical manufacturing
industry, the products which are created through its processes, the quality standards which are
applied to these products and the molecular qualities which certain products have. Major product branches
are explored throughout the Chemical Manufacturing Ontology, allowing potential future users (such
as companies involved in the industry) to fully annotate their product line using the basic structured
format provided. The limitations of the Chemical Manufacturing Ontology mainly concern the class
hierarchy and structure: some smaller but more specialized chemical industries may not be directly
included in the current ontology, but could be added as sub-classes of existing chemical
industries or manufacturers. Fortunately, the majority of the more general chemical industries and
product categories already exist in this project (although the classes may be lacking in annotation
information such as definitions and synonyms), so the addition of specialized sub-classes according to
users’ needs should not be difficult.
The coverage of the Chemical Manufacturing Ontology was measured by the accuracy and
inclusiveness of the Product class, as the largest and (arguably) most important piece of the project. As
most of the product branches were taken from Wikipedia, it was best to compare this
product selection against an independent, pre-existing source. In this case, chemistry books were used to
cross-reference the selection of products in the Chemical Manufacturing Ontology against the index of
each book. If a product was found that could not be classified either through an existing class or through the
creation of a sub-class of an existing class, then the ontology was known to be incomplete. The
definition of each product and class could also be cross-referenced in this way, to ensure that the
Wikipedia and dictionary sources were accurate in their information. Not every product could
be included at this stage of the project, but there should be at least one way of adding any chemical
encountered, either through a sub-class or a superclass. If a user of the ontology then wanted to add their
own product or chemical to the listing, this should be a relatively easy and painless process.
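The cross-referencing procedure described above can be sketched as a simple lookup. The following Python example is only an illustration under stated assumptions: the label table, the function name and the sample index terms are hypothetical, and the original check was carried out by hand against book indexes rather than with code.

```python
# Hypothetical table of labels and synonyms mapped to ontology class names.
# Only a few entries are shown; the class names come from this paper.
LABELS = {
    "ethanethiol": "Ethanethiol",
    "ethyl mercaptan": "Ethanethiol",  # synonym of ethanethiol
    "organic fertilizer": "OrganicFertilizerProduct",
    "liquefied petroleum gas": "LiquefiedPetroleumGasProduct",
}


def coverage(index_terms, labels=LABELS):
    """Return (fraction of index terms found in the ontology, missing terms)."""
    terms = [term.strip().lower() for term in index_terms if term.strip()]
    missing = [term for term in terms if term not in labels]
    found = len(terms) - len(missing)
    ratio = found / len(terms) if terms else 0.0
    return ratio, missing


# Sample index terms invented for this sketch, not taken from a real book.
book_index = [
    "Ethanethiol",
    "Ethyl mercaptan",
    "Liquefied petroleum gas",
    "Titanium dioxide",
]

ratio, missing = coverage(book_index)
print(ratio)    # share of index terms that map to an existing class
print(missing)  # terms that would need a new class or sub-class
```

Each term in the missing list marks a place where the ontology is incomplete and where a new class or sub-class would have to be considered.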
Issues and Solutions
There were some issues with the approach taken in classifying the four branches of chemical
industry products, mainly issues dealing with multiple inheritance in certain instances. Some chemical
industry products are eligible for several different groups in the ontology at once, creating confusion as
to which group most accurately encompasses the product. As an example, products like surfactants
(soaps) and detergents have both Industrial and Consumer applications, as these products are
both used by downstream manufacturers after production and sent directly to the mass
consumption market (although likely in different grades). There were two possible ways to resolve this
issue, both of which would require major work on the ontology and were not included at this point.
The first potential solution was to re-order every product into either an Industrial grade or a
Consumer grade, reflecting the differences in product concentration which are prevalent in the chemical
manufacturing industry. This could be the easier solution to implement if only products
with multiple inheritance were broken down in this way (although it would create an inconsistency unless every
product was labeled this way, multiple inheritance or not). The second solution would follow the
ChEBI Ontology approach more closely, removing very generic terms (e.g. Surfactant, Detergent) and
replacing these with instances of each product. These instances would then be assigned to some
Role (such as ‘has_role some Soap’ or ‘has_role some Detergent’), in order to remove any cases of
multiple inheritance. This route would require a nearly complete overhaul of the existing ontology, but
is still an effective option if multiple inheritance becomes problematic while running the Reasoner.
In the end, neither option was implemented, but it is my hope that anyone who takes up
this ontology project may implement one of these suggestions.
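The second, role-based solution can be illustrated with a short Python sketch. It is a minimal illustration and not an implementation of the ontology: the product names ‘ExampleCleanerA’ and ‘ExampleCleanerB’ are hypothetical, while ‘Soap’ and ‘Detergent’ are the roles mentioned above. Each product keeps a single parent class and any number of roles, so no product needs more than one superclass.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List


@dataclass(frozen=True)
class Product:
    name: str
    parent: str                        # exactly one superclass
    roles: FrozenSet[str] = frozenset()  # any number of has_role values


# Hypothetical products used only for illustration.
CATALOG = [
    Product("ExampleCleanerA", "ConsumerProduct", frozenset({"Soap", "Detergent"})),
    Product("ExampleCleanerB", "SpecialtyChemicalProduct", frozenset({"Detergent"})),
]


def products_with_role(products: Iterable[Product], role: str) -> List[str]:
    """Return the names of all products asserted to have the given role."""
    return sorted(product.name for product in products if role in product.roles)


print(products_with_role(CATALOG, "Detergent"))
print(products_with_role(CATALOG, "Soap"))
```

Querying by role recovers the groupings that the generic Surfactant and Detergent classes provided, while the class hierarchy itself stays a simple tree.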