Report on activities carried out at CHPC
University of Utah
Lic. Ofelia Oña
Department of Physics, Universidad de Buenos Aires
Period: 17-12-05 / 28-02-06
Return to duties: 01-03-06
Genetic Algorithms for Atomic Clusters Using Density Functional Theory
Introduction
The goal of MGAC is to find the arrangement of atoms that minimizes the energy of a cluster
of a given composition. An interface between MGAC and CPMD was implemented to predict
cluster geometries using Density Functional Theory (the CPMD code is a plane
wave/pseudopotential implementation of Density Functional Theory, particularly designed
for ab-initio molecular dynamics). Operators analogous to crossover, mutation and natural
selection are employed to explore the multidimensional parameter space and determine which
regions of that space provide good solutions to the problem. It is therefore very important
to analyze which type of operators should be used.
We studied the convergence of the operators and, because computing resources are limited,
we also compared the performance of several clusters of computers.
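The roles of selection, crossover and mutation can be illustrated with a toy genetic algorithm. This is only a minimal sketch and not the MGAC implementation: the energy function `toy_energy`, the one-point crossover, the mutation settings and the "keep the better half" selection rule are all assumptions made for illustration. The population size and number of generations are taken from the Si10 runs described later.

```python
import random

def toy_energy(genome):
    """Hypothetical stand-in for an energy calculation (minimum 0 at the origin)."""
    return sum(x * x for x in genome)

def crossover(a, b, rng):
    """One-point crossover acting directly on the flat genome."""
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:]

def mutate(genome, rng, rate=0.1, step=0.3):
    """Perturb each gene with probability `rate` (assumed values)."""
    return [x + rng.uniform(-step, step) if rng.random() < rate else x
            for x in genome]

def run_ga(n_genes=6, pop_size=25, generations=40, seed=0):
    """Return the lowest energy found in each generation."""
    rng = random.Random(seed)
    pop = [[rng.uniform(-2, 2) for _ in range(n_genes)] for _ in range(pop_size)]
    best_per_generation = []
    for _ in range(generations):
        pop.sort(key=toy_energy)
        best_per_generation.append(toy_energy(pop[0]))
        parents = pop[: pop_size // 2]          # selection: keep the lower-energy half
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            children.append(mutate(crossover(a, b, rng), rng))
        pop = parents + children
    return best_per_generation

history = run_ga()
```

Because the surviving parents are carried over unchanged, the best energy in `history` can never increase from one generation to the next.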
Operators
Generic operators (G)
These operators act directly on the genome that represents a cluster. They were the first
operators proposed for optimizing atomic clusters [1,2].
Phenotypic operators (P)
These operators act on the 3D structure of a cluster rather than on its genome. They were
developed as an improvement of the generic operators. Unfortunately, none of them guarantees
that the generated offspring are compact clusters without disconnected and/or overlapping atoms.
Overlapping atoms may create numerical instabilities in the calculation of the cluster energies [3].
Constrained operators (C)
These are similar to phenotypic operators, but they were defined to produce connected
clusters [4]. This means that the atoms of the offspring generated by these operators are
connected to one another within a given bond distance.
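One possible reading of "connected" and "overlapping" can be written as two geometric tests on the atomic coordinates. The sketch below is an assumption made for illustration, not the definition used in [4]: it treats a cluster as connected when every atom can be reached from the first one through hops no longer than `bond_distance`, and as overlapping when any pair is closer than a hypothetical `min_distance`.

```python
import math
from collections import deque

def is_connected(coords, bond_distance):
    """True if every atom is reachable from atom 0 through hops
    no longer than `bond_distance` (same length unit as coords)."""
    n = len(coords)
    if n == 0:
        return True
    seen = {0}
    queue = deque([0])
    while queue:                                # breadth-first search
        i = queue.popleft()
        for j in range(n):
            if j not in seen and math.dist(coords[i], coords[j]) <= bond_distance:
                seen.add(j)
                queue.append(j)
    return len(seen) == n

def has_overlap(coords, min_distance):
    """True if any pair of atoms is closer than `min_distance`."""
    return any(
        math.dist(coords[i], coords[j]) < min_distance
        for i in range(len(coords))
        for j in range(i + 1, len(coords))
    )

# Three atoms on a line, 2.0 apart (made-up coordinates in angstroms).
chain = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (4.0, 0.0, 0.0)]
```

With these made-up coordinates the chain is connected for a bond distance of 2.34 Å but not for 1.9 Å, which shows how a shorter bond distance restricts the offspring to more compact structures.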
Testing of the operators on silicon clusters
Using MGAC we were able to study the behavior of the operators described above. For
this purpose, we ran MGAC ten times per operator type for Si10, employing 40
generations with 25 individuals. We then averaged, over the runs, the minimal energy reached
in each generation. This allows us to examine how the convergence of MGAC to the minimal
energy structures depends on the operator type. All energy calculations were done using a
semiempirical approximation (MSINDO [5-7]).
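The averaging step can be sketched as follows. The layout of the input (`runs[r][g]` holding the minimal energy of run `r` at generation `g`) is an assumption, and the numbers in the example are made up; the standard error of the mean is the sample standard deviation divided by the square root of the number of runs.

```python
import math
import statistics

def average_convergence(runs):
    """runs[r][g] is the minimal energy of run r at generation g.
    Returns one (mean, standard_error) pair per generation."""
    n_runs = len(runs)
    curve = []
    for energies in zip(*runs):                 # one tuple per generation
        mean = statistics.fmean(energies)
        sem = statistics.stdev(energies) / math.sqrt(n_runs)
        curve.append((mean, sem))
    return curve

# Two made-up runs of two generations each.
curve = average_convergence([[1.0, 4.0], [3.0, 4.0]])
```

At least two runs are needed, since the sample standard deviation is undefined for a single value.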
For this comparison we chose, for the C operators, a bond distance of 2.34 Å, which is
the average covalent distance between silicon atoms. We observed that convergence for the C
operators was slower than for the G and P operators: the C operators needed more than 25
generations to reach the same energies as the other two operators. However,
choosing a shorter bond distance of 2.0 Å, i.e. creating compact connected clusters, improved the
convergence of the C operators, making it similar to the other two cases (see Fig. 1).
[Figure 1: two panels plotting energy against generation (0 to 50) for the P, C and G operators.]
Fig. 1: Average of the minimal energy reached in each generation. P: Phenotypic operators; C: Constrained
operators; G: Generic operators. The standard error of the mean is shown.
When the bond distance was decreased to 1.4 Å, the initial population started at lower energies
but the convergence was worse. Convergence was slower than in the previous cases: all the
operators needed more than 20 generations to reach the same energies.
The P, C and G operators performed similarly for bond distances between 1.0 Å and 1.4 Å
(see Fig. 2).
[Figure 2: panels (a) and (b) plotting energy against generation (0 to 50) for the P, C and G operators.]
Fig. 2: Average of the minimal energy reached in each generation. P: Phenotypic operators; C:
Constrained operators; G: Generic operators. The standard error of the mean is shown. (a): bond
distance of 1.4 Å; (b): bond distance of 1.0 Å.
The process described above was repeated for Si20, employing 100
generations with 40 individuals. As with Si10, the initial population started
at lower energies when the bond distance was reduced. We observed that the P operators
performed better than the G and C operators, but none of them converged. In this
system, more compact clusters led to similar behavior for all operators.
Conclusion
Compact clusters improved the convergence of the C operators. Furthermore, they
made it possible to generate an initial population with lower energy than non-compact
clusters. However, even with compact clusters the C operators never achieved the
performance of the P operators.
For this reason we will use P operators to explore the multidimensional
parameter space and determine which regions of that space provide good solutions to the
problem.
[Figure 3: panels (a), (b) and (c) plotting energy against generation (0 to 120) for the P, C and G operators.]
Fig. 3: Average of the minimal energy reached in each generation. P: Phenotypic operators; C:
Constrained operators; G: Generic operators. The standard error of the mean is shown. (a): bond
distance of 2.34 Å; (b): bond distance of 2.0 Å; (c): bond distance of 1.0 Å.
Performance of clusters of computers
To benchmark the computer clusters we optimized the geometry
of three systems: Si10Cu, Si8Cu and Si30. We calculated the efficiency for each
system and then took the average. Furthermore, we determined the speedup,
taking the MARCHINGMEN cluster as the reference.
The goal is to install and compile MGAC/CPMD on the best-performing cluster and to
run it there.
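The report does not state the formulas behind its speedup and efficiency figures, so the sketch below uses common textbook definitions as an assumption: speedup as the ratio of the reference machine's run time to the run time on the cluster being tested, and parallel efficiency as T_1 / (p * T_p). The function names and the example timings are hypothetical.

```python
def speedup(reference_time, cluster_time):
    """How many times faster the same job runs than on the reference cluster."""
    return reference_time / cluster_time

def parallel_efficiency(serial_time, parallel_time, n_procs):
    """Textbook parallel efficiency: T_1 / (p * T_p)."""
    return serial_time / (n_procs * parallel_time)

def mean(values):
    """Arithmetic mean, used to average the efficiency over the test systems."""
    values = list(values)
    return sum(values) / len(values)

# Made-up timings in hours, for illustration only.
example_speedup = speedup(reference_time=10.0, cluster_time=4.0)
example_efficiency = parallel_efficiency(serial_time=100.0, parallel_time=30.0, n_procs=4)
```

Under this definition the reference cluster has a speedup of exactly 1 by construction.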
Table 1: Resources of the clusters analyzed.

CLUSTER       ARCHITECTURE                                NETWORK INTERCONNECT
MARCHINGMEN   AMD Opterons (1.4 GHz)                      Gigabit Ethernet
NCSA          IA-64 Linux Cluster (1.3 GHz, 1.5 GHz)      Myrinet 2000, Gigabit Ethernet, Fiber Channel
SDSC          IA-64 Linux Cluster (1.5 GHz)               Myrinet 2000, Gigabit Ethernet, Fiber Channel
PSC LEMIEUX   Compaq Alphaserver ES45 cluster (1.0 GHz)   Management network: Ethernet; compute nodes: Quadrics
Table 2: Performance of the clusters analyzed.

CLUSTER       SPEEDUP   EFFICIENCY   PROCS.   WALLTIME [hours]   QTIME [hours]
MARCHINGMEN   1         0.62         164      72
NCSA          2.4       0.95         1200     24
PSC-LEMIEUX   1.4       0.99         1020     18
SDSC          2.6       0.96         524      18                 8
We observed that the NCSA and SDSC clusters performed better than the others. The drawback
of SDSC is its walltime; for this reason we selected the NCSA cluster.
[1] Y. Zeiri, Phys. Rev. E 51, R2790 (1995).
[2] J. A. Niesse and H. R. Mayne, J. Chem. Phys. 105, 4700 (1996).
[3] R. L. Johnston and C. Roberts, in Soft Computing Approaches in Chemistry, edited by
H. M. Cartwright and L. M. Sztandera (Springer-Verlag, Heidelberg, 2003), Vol. 120, p. 161.
[4] O. Oña, V. E. Bazterra, M. C. Caputo, M. B. Ferraro, J. C. Facelli, Phys. Rev. A 72,
053205 (2005).
[5] B. Ahlswede and K. Jug, J. Comput. Chem. 20, 563 (1999).
[6] B. Ahlswede and K. Jug, J. Comput. Chem. 20, 572 (1999).
[7] T. Bredow, G. Geudtner, and K. Jug, J. Comput. Chem. 22, 861 (2001).