TopoCalcIII - Department of Mathematics and Computer Science

Topological Index Calculator III
[1]
Computation of the Hosoya Index using SMILES representation
Steven D. Granz, Departments of Mathematics & Computer Science, Gordon College, Wenham, MA 01984, sgranz@peace.gordon.edu
Irvin J. Levy, Departments of Chemistry & Computer Science, Gordon College, ijl@gordon.edu
Abstract:
Previously we introduced Topological Index Calculator as a freely available JavaScript application for the computation of QSPR descriptor indices for alkane molecules. The previous program supported the computation of the following indices: Balaban Index, Wiener Index, Randic Index, Odd-Even Index, Polarity Index, Vertex Degree Distance Index and Harary Index. We now report the addition of the Hosoya Index for acyclic alkanes. Further, we discuss the algorithm we have developed to compute the Hosoya Index directly from the SMILES representation of the molecule and how to compute the Hosoya Index for cyclic alkanes.

http://www.math-cs.gordon.edu/courses/organic/topo/

Hosoya Index:
The Hosoya Index [2], Z = Z(G), was introduced by Hosoya in 1971 as the Z index. This index is defined below:

$$Z = \sum_{i=0}^{\lfloor N/2 \rfloor} p(G; i)$$

where p(G;i) is the number of selections of i mutually non-adjacent edges in G, a chemical graph with N vertices. By definition, p(G;0) = 1, and p(G;1) is the number of edges in G.

Example:
The Hosoya Index of 2,3-dimethylpentane
p(G;0) = 1
p(G;1) = 6
p(G;2) = 8
p(G;3) = 2
Z(G) = p(G;0) + p(G;1) + p(G;2) + p(G;3) = 1 + 6 + 8 + 2 = 17

Hosoya Index Algorithm Using SMILES Representation:
Computing the value of the Hosoya Index by hand can be tedious. As the number of bonds in the molecule increases, the probability of human error becomes greater. Without a computational tool such as Topological Index Calculator, human error is very likely to occur. In the process of extending Topological Index Calculator, we developed an algorithm that uses SMILES representation to compute the Hosoya Index.

SMILES Representation Summary:
SMILES [3] (Simplified Molecular Input Line Entry Specification) is a simple yet comprehensive chemical nomenclature. SMILES is used for expressing the molecular graph of "normal" organic molecules.

Two Basic Rules of SMILES:
Carbons are represented by the atomic symbol: C
Branching is indicated by parentheses

Hosoya Index Algorithm For Acyclic Alkanes:
Given G is the SMILES representation of an acyclic alkane, Z(G) is computed as follows:
Find the number of carbons in G
If the number of carbons = 1
    Z(G) = 1
Else if the number of carbons = 2
    Z(G) = 2
Else if the number of carbons = 3
    Z(G) = 3
Else
    Find the number of parent carbons in G
    Let N = (the number of parent carbons) / 2
    If the number of parent carbons is odd, truncate N to an integer
    Add one to N
    Find the Nth parent carbon in G's SMILES representation and let this atom be B
    Z(G) = Z(subgraph, left of B) * Z(subgraph, from B to end) + PI Z(fragments)
    where PI Z(fragments) is equal to the product of Z for each fragment created by removing B and the parent carbon to the left of B

Acyclic Alkane Example:
2,3-dimethylpentane
SMILES Representation: G = CC(C)C(C)CC
Number of parent carbons: 5
N = 5 / 2 = 2.5
N = 2
N = 2 + 1 = 3
CC(C) | C(C)CC
Z(G) = Z(CC(C)) * Z(C(C)CC) + PI Z(fragments)
PI Z(fragments) = Z((C)) * Z((C)) * Z(C) * Z(CC)
Z(G) = Z(CC(C)) * Z(C(C)CC) + Z((C)) * Z((C)) * Z(C) * Z(CC)
     = 3 * 5 + 1 * 1 * 1 * 2 = 17

The two factors are computed in the same way:

Z(CC(C))
Number of parent carbons: 2
N = 2 / 2 = 1
N = 1 + 1 = 2
C | C(C)
Z(CC(C)) = Z(C) * Z(C(C)) + PI Z(fragments)
PI Z(fragments) = Z((C))
Z(CC(C)) = Z(C) * Z(C(C)) + Z((C)) = 1 * 2 + 1 = 3

Z(C(C)CC)
Number of parent carbons: 3
N = 3 / 2 = 1.5
N = 1
N = 1 + 1 = 2
C(C) | CC
Z(C(C)CC) = Z(C(C)) * Z(CC) + PI Z(fragments)
PI Z(fragments) = Z((C)) * Z(C)
Z(C(C)CC) = Z(C(C)) * Z(CC) + Z((C)) * Z(C) = 2 * 2 + 1 * 1 = 5

Hosoya Index Algorithm for Cyclic Alkanes:
Given G is the graph representation of a cyclic alkane, Z(G) can be computed by finding the Hosoya Index of two other alkanes for which the sum of their individual Z values will equal Z(G). Preferably, we wish to find two alkanes that are acyclic, but if that is not the case then we can again find the Hosoya Index of two other alkanes, recursively continuing until we have a sum of the Hosoya Index of all acyclic alkanes, yielding the Hosoya Index for the original cyclic structure.
An algorithm to do this is the following:
1. Let H be a subgraph defined by removal of one arbitrarily chosen edge, E, from a ring in G
2. Let I be a subgraph defined by removal of E from G as well as by the removal of all edges adjacent to E in G
3. If H or I is a disconnected graph then calculate Z(H) and/or Z(I) as the product of Z for each of the connected components of the disconnected graph(s)
4. Z(G) = Z(H) + Z(I)

Cyclic Alkane Example:
1,2-dimethylcyclopropane
Z(G) = Z(H) + Z(I)
Z(G) = Z(CC(C)CC) + Z(CC) * Z(C) * Z(C) * Z(C)
Z(CC(C)CC) = Z(CC(C)) * Z(CC) + PI Z(fragments)
           = Z(CC(C)) * Z(CC) + Z(C) * Z((C)) * Z(C)
           = 3 * 2 + 1 * 1 * 1 = 7
Z(G) = Z(CC(C)CC) + Z(CC) * Z(C) * Z(C) * Z(C)
     = 7 + 2 * 1 * 1 * 1 = 9

Hosoya Index Regression:
[Figure: Boiling Point vs Hosoya Index. x-axis: Hosoya Index; y-axis: Boiling Point (K). Fitted curve: y = 63.432 ln(x) + 169.01]
[Figure: Experimental vs Computed Boiling Point with Hosoya Index Equation. x-axis: Calculated BP (K); y-axis: Expected BP (K). Average Percent Error = 2.48%]

Future Directions:
• Students use the tool to verify values found in the literature since at least two journal articles [4] [5] have been found to have errors.
• Incorporate existing indices into Topological Index Calculator
• Develop new indices with better predictions of the boiling point

References:
[1] Topological Index Calculator II: Applications for the classroom and research, Steven D. Granz and I. J. Levy, 228th ACS national meeting, Philadelphia, PA 2004.
[2] Mihalic, Z. "A Graph-Theoretical Approach to Structure-Property Relationships." J. Chem. Educ. 1992, 69, 701-712.
[3] Weininger, D. "SMILES, a Chemical Language and Information System." J. Chem. Inf. Comput. Sci. 1988, 28, 31-36.
[4] Cao, C. "Topological Indices Based on Vertex, Distance and Ring: On Boiling Points of Paraffins and Cycloalkanes." J. Chem. Inf. and Comp. Sci., 2001, 41, 867-877.
[5] Mihalic, Z., et al. "The Detour Matrix And The Detour Index." Topological Indices and Related Descriptors in QSAR and QSPR. Ed. J. Devillers and A. T. Balaban. Gordon and Breach Science Publishers, 1999, pp. 297-299.
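
The acyclic-alkane algorithm described in this poster can be sketched in Python as follows. This is a minimal sketch and not the calculator's JavaScript source. It assumes well-formed input containing only `C` and parentheses, and it assumes that a "parent carbon" is a carbon written outside any parentheses, which matches the counts in the worked examples. The names `parent_positions`, `branches_at` and `hosoya_smiles` are hypothetical. The handling of strings with a single parent carbon (unwrapping the last branch) is an added assumption that the poster does not cover.

```python
def parent_positions(s):
    """Indices of carbons outside any parentheses (assumed 'parent carbons')."""
    depth, positions = 0, []
    for i, ch in enumerate(s):
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        elif ch == "C" and depth == 0:
            positions.append(i)
    return positions


def branches_at(s, start):
    """Collect consecutive '(...)' groups beginning at index start.

    Returns (group contents without outer parentheses, index past the last group).
    """
    groups, i = [], start
    while i < len(s) and s[i] == "(":
        depth, j = 0, i
        while True:
            if s[j] == "(":
                depth += 1
            elif s[j] == ")":
                depth -= 1
                if depth == 0:
                    break
            j += 1
        groups.append(s[i + 1:j])
        i = j + 1
    return groups, i


def hosoya_smiles(s):
    """Hosoya Index of an acyclic alkane given as a C-and-parentheses SMILES string."""
    n = s.count("C")
    if n <= 1:
        return 1          # single carbon, or an empty fragment
    if n <= 3:
        return n          # base cases: 2 carbons -> 2, 3 carbons -> 3
    parents = parent_positions(s)
    if len(parents) == 1:
        # Assumption (not in the poster): promote the last branch into the
        # parent chain, which describes the same molecule.
        groups, _ = branches_at(s, parents[0] + 1)
        head = "C" + "".join("(" + g + ")" for g in groups[:-1])
        return hosoya_smiles(head + groups[-1])
    n_th = len(parents) // 2 + 1                  # truncate N, then add one
    a, b = parents[n_th - 2], parents[n_th - 1]   # B and the parent carbon to its left
    left, right = s[:b], s[b:]                    # split "left | from B to end"
    a_branches, _ = branches_at(s, a + 1)
    b_branches, rest = branches_at(s, b + 1)
    # Fragments left after removing B and the parent carbon to its left.
    fragments = [s[:a]] + a_branches + b_branches + [s[rest:]]
    product = 1
    for fragment in fragments:
        product *= hosoya_smiles(fragment)
    return hosoya_smiles(left) * hosoya_smiles(right) + product


print(hosoya_smiles("CC(C)C(C)CC"))  # 2,3-dimethylpentane
```

For `CC(C)C(C)CC` the split is `CC(C) | C(C)CC` with fragments `C`, `C`, `C` and `CC`, so the sketch reproduces the value 3 * 5 + 1 * 1 * 1 * 2 = 17 from the acyclic alkane example.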
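
The cyclic-alkane rule Z(G) = Z(H) + Z(I) can be illustrated with a short Python sketch that works on an edge list instead of a SMILES string. It simplifies the procedure in two labeled ways: it applies the same identity to any edge (the identity holds for ring and non-ring edges alike) and it recurses until no edges remain, instead of stopping at acyclic pieces. Isolated atoms contribute a factor of 1, so only the edges need to be tracked. The function name `hosoya_edges` and the atom numbering are hypothetical.

```python
def hosoya_edges(edges):
    """Hosoya Index of a graph given as an iterable of 2-element edges."""
    edges = frozenset(frozenset(e) for e in edges)
    if not edges:
        return 1                      # only isolated atoms remain: Z = 1
    e = next(iter(edges))             # an arbitrary edge E
    h = edges - {e}                   # H: G with E removed
    # I: G with E and every edge sharing an atom with E removed
    i = frozenset(f for f in h if not (f & e))
    return hosoya_edges(h) + hosoya_edges(i)


# 1,2-dimethylcyclopropane: ring atoms 1, 2, 3; methyl carbons 4 (on 1) and 5 (on 2)
ring_example = [(1, 2), (2, 3), (1, 3), (1, 4), (2, 5)]
print(hosoya_edges(ring_example))
```

The running time grows exponentially with the number of edges, which is acceptable for molecules of the size discussed here but would need memoization for larger graphs.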
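
The fitted curve reported in the regression plot, y = 63.432 ln(x) + 169.01, can be evaluated directly. The sketch below assumes that x is the Hosoya Index and y is the boiling point in kelvin, as the axis labels suggest. The function name `estimated_boiling_point` is hypothetical.

```python
import math


def estimated_boiling_point(z):
    """Boiling point estimate in K from the fitted curve y = 63.432 ln(x) + 169.01."""
    if z < 1:
        raise ValueError("the Hosoya Index is at least 1")
    return 63.432 * math.log(z) + 169.01


# Z = 17 is the Hosoya Index of 2,3-dimethylpentane
print(round(estimated_boiling_point(17), 1))
```

For Z = 17 the curve gives roughly 348.7 K.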