hypre intro. - University of Rochester

Intro to
Hypre and Useful Linear Solvers
Shule Li
Computational Astrophysics Group
University of Rochester
Part I. Intro to HYPRE (high performance preconditioners)
What is hypre?
Hypre is a software library of high performance preconditioners for solutions of large and sparse linear systems on
massively parallel computers.
What are the features of hypre?
Scalability.
Provides a suite of common iterative methods.
Intuitive grid-centric interfaces (structured grid, semi-structured grid).
Wide choice of options.
Language flexibility
What can hypre solve, particularly in computational astrophysics?
Any problem that requires the additional step of solving a sparse linear system can be handled with hypre.
How to run with hypre in Astrobear?
1. Turn on the elliptic solver flags in domain.data.
2. Turn on the corresponding linear problem flag in physics.data. When working on a problem with
diffusion turned on, assign values to kappa and ndif. kappa = 0.0 means no diffusion, ndif = 0.0 means
linear diffusion.
3. When compiling, turn on the hypre flag in Make.inc.
4. Compile and run!
Work on fixed grid in Hypre
How do I build a new solver in hypre?
The major hypre steps are contained in the subroutine Hypre in the hypreBEAR.f90 file. The solver is
built within this subroutine. Hypre calls subroutine StructHypre or SStructHypre, depending on which
interface is chosen. For fixed grid, work in StructInterface.
How do I pass parameters and variables into hypre?
A struct “hypreinfo” is used to define parameters when calling hypre. It has the following members:
Interface: choose which interface is used in hypre: struct or semi-struct.
Solver: choose what kind of solver to use. Currently PCG is used.
qvarIn: which variable in q is passed in.
qvarOut: which variable in q is passed out.
InterpOrder: order of interpolation between AMR levels
Tolerance: tolerance of iteration steps.
PrintLevel: how much information is printed on screen.
maxIters: maximum number of iterations
hVerbosity: how verbose hypre should be.
These parameters can be assigned in the bearez.f90 file. The rest of the physical constants should be
defined in globaldeclarations.f90 or physics.data.
What are the main steps in StructHypre?
StructHypre works on a fixed grid. It solves the linear system AS=B, where A is the matrix, S is the solution vector and B is the right (variable) vector. The main steps are:
1. Set up the grid.
2. Set up the stencil.
3. Set up the matrix.
4. Set up the right vector.
5. Call the solver, get the result.
When building a new solver in hypre, just follow these steps in the StructHypre subroutine.
How to set up the grid?
The following functions are called (usually not necessary to change):
1. StructGridCreate
2,3. StructGridSetExtents
Apply BC: for a periodic boundary, call StructGridSetPeriodic.
4. StructGridAssemble
What is a stencil?
A stencil defines the adjacent points around a center. In 3D, the labels used in hypre are:
iEntry   label     offsets3D (3 x 7 array)
0        center     0,  0,  0
1        left      -1,  0,  0
2        right      1,  0,  0
3        bottom     0, -1,  0
4        top        0,  1,  0
5        back       0,  0, -1
6        front      0,  0,  1

For instance, when solving the equation
$$u_{i-1,j} + u_{i+1,j} - 2 u_{i,j} + u_{i,j-1} + u_{i,j+1} - 2 u_{i,j} = 0$$
We assign stencil values: u(0) = -4, u(1) = 1, u(2) = 1, u(3) = 1, u(4) = 1.
How do I assign stencil entries?
In the code, it is realized by the following lines. Usually we do not need to change anything here
when solving a new problem. The actual stencil values are assigned when setting up the matrix.
npts = 2*ndim + 1
do ientry = 0, npts-1
   ! call StructStencilSetElement
enddo
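The loop above only registers one offset per stencil entry. As a rough illustration of what those offsets look like, here is a small Python sketch (not AstroBEAR or hypre code; the function name `stencil_offsets` is made up for this example) that generates the offsets in the same order as the 3D table: center, left, right, bottom, top, back, front.

```python
def stencil_offsets(ndim):
    """Return the 2*ndim+1 stencil offsets: the center first, then the
    -1 and +1 neighbours along x, then y, then z."""
    offsets = [tuple([0] * ndim)]          # entry 0: center
    for axis in range(ndim):
        for step in (-1, 1):               # left before right, bottom before top, ...
            off = [0] * ndim
            off[axis] = step
            offsets.append(tuple(off))
    return offsets


for ientry, offset in enumerate(stencil_offsets(3)):
    print(ientry, offset)
```

With `ndim = 2` the same function gives the five entries (center, left, right, bottom, top) used in the 2D example.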
How do I set the matrix values?
The matrix is set up by making the following calls:
StructMatrixCreate (no need to change)
StructMatrixInitialize (no need to change)
StructMatrixSetBoxValues (assign matrix coefficients here)
StructMatrixAssemble (no need to change)
As noted above, for each new solver, a new StructMatrixSetBoxValues should be written.
How do I write StructMatrixSetBoxValues?
Just initialize a 1D array called matrixValues, assign the coefficients and then call the subroutine
StructMatrixSetBoxValues.
Why do I assign coefficients to an array and not a matrix? What is the rule for assigning its values?
The 1D array holds all the information for grid points and their stencils. It works as (if 3D):
matrixValues(0) ⇒ central point A
matrixValues(1) ⇒ stencil 1 of A
matrixValues(2) ⇒ stencil 2 of A
matrixValues(3) ⇒ stencil 3 of A
matrixValues(4) ⇒ stencil 4 of A
matrixValues(5) ⇒ stencil 5 of A
matrixValues(6) ⇒ stencil 6 of A
matrixValues(7) ⇒ central point B
matrixValues(8) ⇒ stencil 1 of B
matrixValues(9) ⇒ stencil 2 of B
matrixValues(10) ⇒ stencil 3 of B
matrixValues(11) ⇒ stencil 4 of B
matrixValues(12) ⇒ stencil 5 of B
matrixValues(13) ⇒ stencil 6 of B
.
.
.
.
So if the stencil has npoints points on it, where npoints = 2*ndim+1,
the dimension of this array should be npoints*ncells, where ncells = mx*my*mz.
In the 2D stencil example above (center -4, neighbors 1), npoints = 5 and the values should be: m0 = -4, m5 = -4, m10 = -4, m15 = -4 … all the remaining values are 1.
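To make the layout concrete, the following Python sketch (an illustration only; `constant_stencil_values` is a hypothetical helper, not part of AstroBEAR) builds the flat array for the constant 2D stencil and lists where the central values end up.

```python
def constant_stencil_values(mx, my, center=-4.0, neighbor=1.0):
    """Flat coefficient array for a 2D 5-point stencil: each cell contributes
    its central value followed by its four neighbour values."""
    npoints = 5                            # 2*ndim + 1 with ndim = 2
    values = []
    for j in range(my):                    # y is the outer loop
        for i in range(mx):                # x varies fastest
            values.append(center)
            values.extend([neighbor] * (npoints - 1))
    return values


values = constant_stencil_values(2, 2)
centers = [m for m, v in enumerate(values) if v == -4.0]
print(len(values))   # 20 = npoints * ncells
print(centers)       # [0, 5, 10, 15]
```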
What if the matrix central values and stencil values are not the same, say, dependent on
the location?
You do the same thing as before. The following lines can be used:
allocate(matrixValues(0:npoints*ncells-1))
m = 0
do k = 1, mz
  do j = 1, my
    do i = 1, mx
      matrixValues(m) = f(i,j,k)      ! central value
      do n = m+1, m+npoints-1
        nn = n - m                    ! label of stencil
        select case(nn)
        case(1)                       ! stencil 1
          matrixValues(n) = f(i,j,k)
.
.
.
The function f(i,j,k) can be defined arbitrarily.
The above case assigns coefficients in x-y-z order, with x varying fastest.
It is better to start the m index from 0 (diffusion was also tried with m starting from 1, and it also works).
How do I set up the vectors?
The following subroutines should be called. Usually you only modify the subroutine StructVectorSetBoxValues.
It works similarly to StructMatrixSetBoxValues, but here you need to set up two vectors: the right (variable) vector and the solution
vector.
StructVectorCreate
StructVectorInitialize
StructVectorSetBoxValues (assign vector values here)
StructVectorAssemble
How do I assign right vector values in
StructVectorSetBoxValues?
Allocate an array called vectorValues and assign values in the same order
in which the matrix is assigned.
For instance, in a 2D solver, if matrixValues(0:4) is assigned to point (1,1) in the
grid, then vectorValues(0) should also correspond to point (1,1). The
dimension of vectorValues should equal ncells, where ncells =
mx*my*mz. Then set it to the variable vector:
call C_StructVectorSetBoxValues(SvariableVector, …)
How do I assign solution vector values in StructVectorSetBoxValues?
Allocate an array called vectorValues and assign values in the same order in which the
matrix is assigned.
In a 2D solver, if matrixValues(0:4) is assigned to point (1,1) in the grid, then
vectorValues(0) should also correspond to point (1,1). The dimension of
vectorValues should equal ncells, where ncells = mx*my*mz. Here, since the
solution vector is unknown, we can simply initialize it with the solution from the
previous time step. For instance, in solving the density diffusion, we can set:
vectorValues(m) = Info%q(i,j,k,qvarOut). Then set it to the solution vector:
call C_StructVectorSetBoxValues(SsolutionVector, …)
How do I call the solver?
Use the following steps to call the solver and return the solution vector.
StructPCGCreate
tol=HypreInfo%tolerance // set parameters
StructPCGSetup
StructPCGSolve
How do I get the solution vector?
After calling the solver, call StructVectorGetBoxValues. Allocate vectorValues(m) with m from 0 to ncells-1, read the solution
vector into vectorValues(m), and then pass the values of vectorValues(m) out. Example:
allocate(vectorValues(0:ncells-1))
call C_StructVectorGetBoxValues(SsolutionVector, …)
m=0
do k = 1, mz
do j = 1, my
do i = 1, mx
Info%q(i,j,k,1,qvarOut) = vectorValues(m)
.
.
.
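The packing and unpacking loops rely on one convention: the flat index m follows the x-y-z loop order with x varying fastest. A minimal Python sketch of that bookkeeping is given below. It is an illustration only: `cell_index`, `pack` and `unpack` are hypothetical names, and the grid is stored in a plain dictionary rather than in Info%q.

```python
def cell_index(i, j, k, mx, my):
    """0-based flat index of cell (i, j, k) for 1-based i, j, k,
    with x varying fastest, then y, then z."""
    return (i - 1) + mx * ((j - 1) + my * (k - 1))


def pack(q, mx, my, mz):
    """Flatten q[(i, j, k)] into a list, as done before SetBoxValues."""
    values = [0.0] * (mx * my * mz)
    for k in range(1, mz + 1):
        for j in range(1, my + 1):
            for i in range(1, mx + 1):
                values[cell_index(i, j, k, mx, my)] = q[(i, j, k)]
    return values


def unpack(values, mx, my, mz):
    """Inverse of pack, as done after GetBoxValues."""
    q = {}
    for k in range(1, mz + 1):
        for j in range(1, my + 1):
            for i in range(1, mx + 1):
                q[(i, j, k)] = values[cell_index(i, j, k, mx, my)]
    return q
```

With this convention the matrix coefficients of cell (i,j,k) start at npoints*cell_index(i,j,k,mx,my) in matrixValues, and the vector entry sits at cell_index(i,j,k,mx,my) in vectorValues.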
What should I do after getting the solution vector?
Once the solution is passed out, simply destroy the grid, stencil, matrix, variable and solution vector.
(#・∀・)ノ finished!
Semi-Struct Interface
The Semi-Struct Interface is used when the grids are not entirely structured, e.g. block-structured grids, AMR applications.
to be continued …
Part II. Useful Linear Solvers - Diffusion Problem
An Implicit Scheme
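The 1D diffusion equation reads:
$$\frac{\partial T}{\partial t} = \frac{\partial}{\partial x}\left(D\,\frac{\partial T}{\partial x}\right)$$
where D is the thermal conductivity:
$$D = T^{n}$$
To solve this equation, we first define:
$$\frac{\partial z}{\partial T} = T^{n} = D$$
or equivalently,
$$z = \frac{1}{n+1}\,T^{n+1}$$
Substitute D into the diffusion equation we have:
$$\frac{\partial T}{\partial t} = \frac{\partial}{\partial x}\left(\frac{\partial z}{\partial T}\,\frac{\partial T}{\partial x}\right)$$
It is just:
$$\frac{\partial T}{\partial t} = \frac{\partial^{2} z}{\partial x^{2}}$$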
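Now write the differentials as finite differences on both sides (the superscript n below labels the time step), we have:
$$\frac{T_i^{n+1} - T_i^{n}}{\Delta t} = \frac{z_{i+1}^{n} - 2 z_i^{n} + z_{i-1}^{n}}{\Delta x^{2}}$$
which is equivalent to:
$$h\,(T_i^{n+1} - T_i^{n}) = z_{i+1}^{n} - 2 z_i^{n} + z_{i-1}^{n}$$
where
$$h = \frac{\Delta x^{2}}{\Delta t}$$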
This simple explicit method has a stability problem. Let's look at the simplest case: linear diffusion. In this case
$$z = T$$
Thus we have the differencing scheme to be:
$$h\,(T_i^{n+1} - T_i^{n}) = T_{i+1}^{n} - 2 T_i^{n} + T_{i-1}^{n}$$
Consider the method suggested by Von Neumann: we assume the variables are single-mode plane waves on a
periodic grid with spacing a:
$$T_m^{n} = T^{n} e^{i m k a}$$
Then the differencing scheme becomes:
$$h\,(T^{n+1} - T^{n}) = T^{n} e^{i k a} - 2 T^{n} + T^{n} e^{-i k a}$$
Regrouping terms we get:
$$\frac{T^{n+1}}{T^{n}} = 1 - \frac{4}{h}\sin^{2}\left(\frac{k a}{2}\right)$$
The stability requires:
$$\left|\frac{T^{n+1}}{T^{n}}\right| \le 1$$
Thus for all values of ka we require:
$$h \ge 2\sin^{2}\left(\frac{k a}{2}\right)$$
We thus require h ≥ 2:
$$\Delta t \le \frac{\Delta x^{2}}{2}$$
In other words, dt should be smaller than the thermal conduction time across one grid cell. For applications with finer grid
spacing, this requirement makes the calculation devastatingly slow and inefficient. We are thus in need of a scheme with a looser stability
requirement. This kind of method is usually an implicit method.
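The stability limit can be checked numerically. The Python sketch below (an illustration only, with made-up function names) evaluates the explicit growth factor 1 - (4/h) sin^2(ka/2) over a range of ka between 0 and pi and reports the largest magnitude for a given h = dx^2/dt.

```python
import math


def explicit_growth(h, ka):
    """Growth factor T^{n+1}/T^n of the explicit scheme for one Fourier mode."""
    return 1.0 - (4.0 / h) * math.sin(ka / 2.0) ** 2


def max_abs_growth(h, samples=181):
    """Largest |growth factor| over ka sampled uniformly in [0, pi]."""
    return max(abs(explicit_growth(h, math.pi * s / (samples - 1)))
               for s in range(samples))


for h in (4.0, 2.0, 1.0):
    status = "stable" if max_abs_growth(h) <= 1.0 + 1e-12 else "unstable"
    print(h, max_abs_growth(h), status)
```

For h below 2 the shortest wavelength (ka = pi) is amplified at every step, which is the instability described above.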
Now look back at the differencing scheme for linear diffusion above. We can make it implicit by replacing the
variables at step n on the right hand side with variables at step "n+1/2", that is, the average of steps n and n+1:
$$h\,(T_i^{n+1} - T_i^{n}) = \frac{1}{2}\,(T_{i+1}^{n+1} - 2 T_i^{n+1} + T_{i-1}^{n+1}) + \frac{1}{2}\,(T_{i+1}^{n} - 2 T_i^{n} + T_{i-1}^{n})$$
Now we can check the stability. After some math we have:
$$\frac{T^{n+1}}{T^{n}} = \frac{1 - 2\sin^{2}(k a/2)/h}{1 + 2\sin^{2}(k a/2)/h}$$
Obviously this is always stable if h > 0.
This scheme is called the Crank-Nicolson scheme.
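As a quick check of the statement above, this Python sketch (illustrative only, with made-up function names) evaluates the implicit growth factor for several values of h = dx^2/dt, including very large time steps.

```python
import math


def crank_nicolson_growth(h, ka):
    """Growth factor T^{n+1}/T^n of the implicit scheme for one Fourier mode."""
    s2 = math.sin(ka / 2.0) ** 2
    return (1.0 - 2.0 * s2 / h) / (1.0 + 2.0 * s2 / h)


def worst_growth(h, samples=181):
    """Largest |growth factor| over ka sampled uniformly in [0, pi]."""
    return max(abs(crank_nicolson_growth(h, math.pi * s / (samples - 1)))
               for s in range(samples))


for h in (0.01, 0.5, 2.0, 100.0):
    print(h, worst_growth(h), crank_nicolson_growth(h, math.pi))
```

The magnitude never exceeds 1. Note that for very small h (very large dt) the factor at ka = pi approaches -1, so the shortest wavelengths are damped only slowly even though the scheme stays stable.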
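For nonlinear diffusion, we use z instead of T on the right hand side. From here on the time step is labeled m, so that n can denote the exponent in D:
$$h\,(T_i^{m+1} - T_i^{m}) = (z_{i+1}^{m+1} - 2 z_i^{m+1} + z_{i-1}^{m+1}) + (z_{i+1}^{m} - 2 z_i^{m} + z_{i-1}^{m})$$
Here
$$h = \frac{2\,\Delta x^{2}}{\Delta t}$$
so the factors of 1/2 from the time averaging are absorbed into h.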
To avoid having to solve a set of nonlinear equations we do the following expansion:
$$z_i^{m+1} = z_i^{m} + \left.\frac{\partial z}{\partial T}\right|_{T_i^{m}} (T_i^{m+1} - T_i^{m}) = z_i^{m} + (T_i^{m})^{n}\,(T_i^{m+1} - T_i^{m})$$
Substitute the above expression for z into the differencing scheme, use z = T^{n+1}/(n+1) for the terms at step m, and regroup terms. We finally have:
$$(T_{i+1}^{m})^{n}\,T_{i+1}^{m+1} - [h + 2 (T_i^{m})^{n}]\,T_i^{m+1} + (T_{i-1}^{m})^{n}\,T_{i-1}^{m+1} = -h\,T_i^{m} - \frac{1-n}{n+1}\,[(T_{i+1}^{m})^{n+1} - 2 (T_i^{m})^{n+1} + (T_{i-1}^{m})^{n+1}]$$
This is a tri-diagonal system, which is easy to solve by elimination and back substitution. We can assign values:
$$a = (T_{i-1}^{m})^{n}$$
$$b = -h - 2 (T_i^{m})^{n}$$
$$c = (T_{i+1}^{m})^{n}$$
$$d = -h\,T_i^{m} - \frac{1-n}{n+1}\,[(T_{i+1}^{m})^{n+1} - 2 (T_i^{m})^{n+1} + (T_{i-1}^{m})^{n+1}]$$
Then the scheme becomes:
$$a\,T_{i-1}^{m+1} + b\,T_i^{m+1} + c\,T_{i+1}^{m+1} = d$$
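In AstroBEAR the system is handed to hypre, but for a 1D test it can help to have a stand-alone reference solver. The Python sketch below uses the Thomas algorithm (forward elimination, then back substitution) as a stand-in. It is not the hypre PCG solver, and the function name is made up for this example.

```python
def solve_tridiagonal(a, b, c, d):
    """Solve a[i]*x[i-1] + b[i]*x[i] + c[i]*x[i+1] = d[i] for i = 0..N-1.
    a[0] and c[N-1] are not used. No pivoting is done, so the matrix is
    assumed to be diagonally dominant."""
    n = len(d)
    cp = [0.0] * n                      # modified upper diagonal
    dp = [0.0] * n                      # modified right-hand side
    cp[0] = c[0] / b[0]
    dp[0] = d[0] / b[0]
    for i in range(1, n):               # forward elimination
        denom = b[i] - a[i] * cp[i - 1]
        cp[i] = c[i] / denom
        dp[i] = (d[i] - a[i] * dp[i - 1]) / denom
    x = [0.0] * n
    x[-1] = dp[-1]
    for i in range(n - 2, -1, -1):      # back substitution
        x[i] = dp[i] - cp[i] * x[i + 1]
    return x


# Linear diffusion (n = 0) with h = 2 gives a = c = 1 and b = -4 in every row.
x = solve_tridiagonal([0.0, 1.0, 1.0], [-4.0, -4.0, -4.0], [1.0, 1.0, 0.0],
                      [-2.0, -4.0, -10.0])
print(x)   # close to [1.0, 2.0, 3.0]
```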
Boundary Condition in the Crank-Nicolson scheme.
When the boundaries are transmissive, the equations are not closed. Thus when i=1, we should include the ghost region
values to close the equations:
$$a\,T_0^{m+1} + b\,T_1^{m+1} + c\,T_2^{m+1} = d$$
$$T_0^{m+1} = T_1^{m+1}$$
We thus have to extend the computation domain and compute ghost region values explicitly during the calculation, which
is not desired. But obviously, this can be avoided by combining the above two equations:
$$(a + b)\,T_1^{m+1} + c\,T_2^{m+1} = d$$
Similarly, at the other end we have:
$$a\,T_{mx-1}^{m+1} + (b + c)\,T_{mx}^{m+1} = d$$
In hypre, we can simply set the matrix with values a, b, c and the right vector with value d, and then pass them into
the linear solver.
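Putting the pieces together, here is a stand-alone Python sketch of one time step of the linearized scheme with transmissive boundaries folded into the matrix. It is only an illustration under stated assumptions: the tridiagonal solve replaces the hypre call, the function names are made up, T is assumed positive, and h = 2*dx^2/dt with D = T^n as in the nonlinear scheme.

```python
def solve_tridiagonal(a, b, c, d):
    """Thomas algorithm for a[i]*x[i-1] + b[i]*x[i] + c[i]*x[i+1] = d[i]."""
    n = len(d)
    cp, dp = [0.0] * n, [0.0] * n
    cp[0], dp[0] = c[0] / b[0], d[0] / b[0]
    for i in range(1, n):
        denom = b[i] - a[i] * cp[i - 1]
        cp[i] = c[i] / denom
        dp[i] = (d[i] - a[i] * dp[i - 1]) / denom
    x = [0.0] * n
    x[-1] = dp[-1]
    for i in range(n - 2, -1, -1):
        x[i] = dp[i] - cp[i] * x[i + 1]
    return x


def crank_nicolson_step(T, h, n=0.0):
    """Advance T by one step; h = 2*dx**2/dt, conductivity D = T**n."""
    mx = len(T)
    Tg = [T[0]] + list(T) + [T[-1]]     # ghost cells copy the edge cells
    k = (1.0 - n) / (n + 1.0)
    a, b, c, d = [], [], [], []
    for i in range(1, mx + 1):
        ai = Tg[i - 1] ** n
        bi = -h - 2.0 * Tg[i] ** n
        ci = Tg[i + 1] ** n
        di = -h * Tg[i] - k * (Tg[i + 1] ** (n + 1) - 2.0 * Tg[i] ** (n + 1)
                               + Tg[i - 1] ** (n + 1))
        if i == 1:                      # (a + b) T_1 + c T_2 = d
            bi, ai = bi + ai, 0.0
        if i == mx:                     # a T_{mx-1} + (b + c) T_mx = d
            bi, ci = bi + ci, 0.0
        a.append(ai)
        b.append(bi)
        c.append(ci)
        d.append(di)
    return solve_tridiagonal(a, b, c, d)


T_old = [1.0, 1.0, 5.0, 1.0, 1.0]
T_new = crank_nicolson_step(T_old, h=2.0)
print(T_new, sum(T_new))
```

With these boundaries no heat leaves the domain, so the sum of T is unchanged by the step (up to round-off), which makes a convenient sanity check.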
2D Crank-Nicolson Scheme
The above 1D scheme can easily be extended to higher dimensions with a sweep method. We first do the
calculation in the x direction with conductivity Dx for time step dt, pretending there is no diffusion in y, and then do the calculation
in the y direction with conductivity Dy for time step dt, pretending there is no diffusion in x. To do this, define a
public variable, for example hflag, and pass it around in the "SetBoxValues" subroutines. For instance, in the
Hypre call, define:
integer hflag
common /sweep/ hflag
select case(HypreInfo%Interface)
case(StructInterface)
   hflag = 0                      ! first sweep
   call StructHypre(HypreInfo)
   hflag = 1                      ! second sweep
   call StructHypre(HypreInfo)
.
In the diffusion solver, the matrix and vector values should change according to which direction you are solving.
But obviously, we do not need a sweep method for a multi-dimensional problem, since the matrix is solved by hypre.
We can simply discretize the 2D equation and solve it at once. For example, if the equation is:
$$\frac{\partial T}{\partial t} = \frac{\partial}{\partial x}\left(D\,\frac{\partial T}{\partial x}\right) + \frac{\partial}{\partial y}\left(D\,\frac{\partial T}{\partial y}\right)$$
We can directly write it into difference form with the Crank-Nicolson method (same grid spacing in x and y):
$$\begin{aligned} h\,(T_{i,j}^{m+1} - T_{i,j}^{m}) ={}& (z_{i+1,j}^{m+1} - 2 z_{i,j}^{m+1} + z_{i-1,j}^{m+1}) + (z_{i+1,j}^{m} - 2 z_{i,j}^{m} + z_{i-1,j}^{m}) \\ &+ (z_{i,j+1}^{m+1} - 2 z_{i,j}^{m+1} + z_{i,j-1}^{m+1}) + (z_{i,j+1}^{m} - 2 z_{i,j}^{m} + z_{i,j-1}^{m}) \end{aligned}$$
Grouping terms we finally get:
$$\begin{aligned} &(T_{i+1,j}^{m})^{n}\,T_{i+1,j}^{m+1} + (T_{i-1,j}^{m})^{n}\,T_{i-1,j}^{m+1} + (T_{i,j+1}^{m})^{n}\,T_{i,j+1}^{m+1} + (T_{i,j-1}^{m})^{n}\,T_{i,j-1}^{m+1} - [h + 4 (T_{i,j}^{m})^{n}]\,T_{i,j}^{m+1} \\ &= -h\,T_{i,j}^{m} - \frac{1-n}{n+1}\,[(T_{i+1,j}^{m})^{n+1} - 2 (T_{i,j}^{m})^{n+1} + (T_{i-1,j}^{m})^{n+1} + (T_{i,j+1}^{m})^{n+1} - 2 (T_{i,j}^{m})^{n+1} + (T_{i,j-1}^{m})^{n+1}] \end{aligned}$$
In terms of stencil, we have (the m in parentheses is the array offset of cell (i,j), the superscript m is the time step, and entries 1 to 4 follow the stencil labels left, right, bottom, top):
$$\mathrm{matrixValues}(m) = -h - 4\,(T_{i,j}^{m})^{n}$$
$$\mathrm{matrixValues}(m+1) = (T_{i-1,j}^{m})^{n}$$
$$\mathrm{matrixValues}(m+2) = (T_{i+1,j}^{m})^{n}$$
$$\mathrm{matrixValues}(m+3) = (T_{i,j-1}^{m})^{n}$$
$$\mathrm{matrixValues}(m+4) = (T_{i,j+1}^{m})^{n}$$
$$\mathrm{vectorValues}(m) = -h\,T_{i,j}^{m} - \frac{1-n}{n+1}\,[(T_{i+1,j}^{m})^{n+1} - 2 (T_{i,j}^{m})^{n+1} + (T_{i-1,j}^{m})^{n+1} + (T_{i,j+1}^{m})^{n+1} - 2 (T_{i,j}^{m})^{n+1} + (T_{i,j-1}^{m})^{n+1}]$$
except for the boundaries.
At the boundaries, we use the same method as before. For instance, if i=1, we set:
$$\mathrm{matrixValues}(m) = -h - 4\,(T_{i,j}^{m})^{n} + (T_{i-1,j}^{m})^{n}$$
$$\mathrm{matrixValues}(m+1) = 0$$
The rest of the elements remain the same.
To verify the code, we can take an arbitrary function, FFT it,
and let each frequency component decay exponentially at its own rate.
But remember: it's NOT a Gaussian! A Gaussian is the solution for linear diffusion only when the
Dirichlet boundaries are placed at -inf and +inf!
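A reference solution of this kind can be sketched in a few lines of Python. The example below is an illustration under explicit assumptions: the grid is periodic, the diffusion is linear with constant D, and a naive DFT is used in place of a real FFT library so that the code has no dependencies. The function name `spectral_diffuse` is made up for this example. Each Fourier mode with wavenumber k is multiplied by exp(-D k^2 t).

```python
import cmath
import math


def spectral_diffuse(T, dx, t, D=1.0):
    """Exact solution at time t of dT/dt = D d2T/dx2 on a periodic grid,
    starting from the samples T, using a naive O(N^2) DFT."""
    N = len(T)
    L = N * dx
    # Forward transform: one complex amplitude per mode number q.
    That = [sum(T[j] * cmath.exp(-2j * math.pi * q * j / N) for j in range(N))
            for q in range(N)]
    out = []
    for j in range(N):
        acc = 0.0
        for q in range(N):
            qs = q if q <= N // 2 else q - N       # signed mode number
            k = 2.0 * math.pi * qs / L             # physical wavenumber
            decay = math.exp(-D * k * k * t)       # exponential decay of this mode
            acc += That[q] * decay * cmath.exp(2j * math.pi * q * j / N)
        out.append((acc / N).real)
    return out


N = 16
dx = 2.0 * math.pi / N
T0 = [2.0 + math.cos(3.0 * j * dx) for j in range(N)]
T1 = spectral_diffuse(T0, dx, t=0.1)
print(T1[0], 2.0 + math.exp(-0.9))   # the k = 3 mode has decayed by exp(-9 t)
```

The output of the diffusion solver on the same periodic problem can then be compared point by point against this reference; the two should agree up to the truncation error of the differencing scheme.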
More Complicated Boundary Conditions
need fft method. coming soon.