State of Play of Dynamical Fermions
Bálint Joó
Jefferson Lab
Newport News, VA, U.S.A
January 23, 2006
Contents I: Technology
• Rational Hybrid Monte Carlo
• Multiple Time Scale Integrators
• Scale Splitting Schemes
• Differentiable Smearing
Contents II: State of Play
• Staggered Like Simulations
• Wilson Like Simulations
• DWF like simulations
• Overlap like simulations
Contents III: Possibilities and Constraints
• Algorithmic Ideas
• JLab programme and resource constraints
• JLab future plans
Rational Hybrid Monte Carlo
• Essentially same as Hybrid Monte Carlo BUT
• Fermionic Functions in action replaced by Rational Approximations
• good for non-local actions: eg $(M^\dagger M)^{1/n}$
• Inversion is a natural operation for rational functions: applying $f(M)$ is as hard as applying $f^{-1}(M)$
• Write rational function in partial fraction form:
$f(M) \approx R(M) = A \sum_i p_i (M + q_i)^{-1}$
apply with multi-mass solver
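A minimal sketch of the partial fraction form in action, assuming a small dense matrix as a stand-in for $M^\dagger M$. The residues and shifts here come from a simple quadrature formula for $x^{-1/2}$, not from the Remez algorithm, and the overall constant $A$ is set to 1; the names `inv_sqrt_partial_fractions` and `apply_rational` are illustrative only. Each pole costs one shifted solve, which is what a multi-mass solver would deliver in a single pass.

```python
import numpy as np

def inv_sqrt_partial_fractions(n_terms):
    """Residues p_i and shifts q_i with x**-0.5 ~= sum_i p_i / (x + q_i).

    From the midpoint rule applied to
    x**-0.5 = (2/pi) * int_0^{pi/2} dtheta / (x cos^2(theta) + sin^2(theta)).
    This is NOT the Remez-optimal choice, just one that is easy to write down.
    """
    theta = (np.arange(n_terms) + 0.5) * np.pi / (2 * n_terms)
    p = 1.0 / (n_terms * np.cos(theta) ** 2)
    q = np.tan(theta) ** 2
    return p, q

def apply_rational(A, v, p, q):
    """Apply R(A) v = sum_i p_i (A + q_i)^{-1} v, one shifted solve per pole."""
    identity = np.eye(A.shape[0])
    return sum(p_i * np.linalg.solve(A + q_i * identity, v) for p_i, q_i in zip(p, q))

rng = np.random.default_rng(0)
# Hypothetical stand-in for M^dagger M: symmetric matrix with spectrum in [0.1, 10].
eigs = np.linspace(0.1, 10.0, 12)
V, _ = np.linalg.qr(rng.standard_normal((12, 12)))
A = (V * eigs) @ V.T
v = rng.standard_normal(12)

p, q = inv_sqrt_partial_fractions(20)
approx = apply_rational(A, v, p, q)
exact = (V * eigs ** -0.5) @ (V.T @ v)      # A^{-1/2} v via the known eigensystem
rel_err = np.linalg.norm(approx - exact) / np.linalg.norm(exact)
print(f"relative error of R(A) v against A^(-1/2) v: {rel_err:.2e}")
```

All residues and shifts are positive here, so every shifted system is at least as well conditioned as the unshifted one.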
(R)HMC Algorithm
• Start with configuration U
• Refresh momenta (Gaussian Noise), Pseudofermions
• Reversible and Area Preserving Molecular Dynamics to generate $U'$ in fixed pseudofermion background
– details of MD later
• Accept/Reject with $P_{\rm acc} = \min\left(1, e^{-[H(U') - H(U)]}\right)$
• Kennedy-Kuti noisy accept/reject to correct for Rational Approximation
• If accepted, new configuration is $U'$, otherwise it is $U$.
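The steps above can be sketched on a toy problem. This is only an illustration under stated assumptions: the "configuration" is a real vector `q`, the action is a free Gaussian $S(q) = q^2/2$ standing in for the gauge plus pseudofermion action, there is no pseudofermion refresh, and the function names are made up for this example.

```python
import numpy as np

def action(q):
    return 0.5 * np.dot(q, q)            # toy S(q); stands in for the full action

def force(q):
    return -q                            # -dS/dq

def leapfrog(q, p, step, n_steps):
    """Reversible, area-preserving MD: half q-step, full p-steps, half q-step."""
    q, p = q.copy(), p.copy()
    q += 0.5 * step * p
    for i in range(n_steps):
        p += step * force(q)
        q += (step if i < n_steps - 1 else 0.5 * step) * p
    return q, p

def hmc(n_traj, dim=10, step=0.2, n_steps=5, seed=1):
    rng = np.random.default_rng(seed)
    q = np.zeros(dim)
    samples, accepted = [], 0
    for _ in range(n_traj):
        p = rng.standard_normal(dim)                    # refresh momenta (Gaussian noise)
        h_old = 0.5 * np.dot(p, p) + action(q)
        q_new, p_new = leapfrog(q, p, step, n_steps)    # molecular dynamics
        h_new = 0.5 * np.dot(p_new, p_new) + action(q_new)
        if rng.random() < min(1.0, np.exp(-(h_new - h_old))):   # accept/reject
            q, accepted = q_new, accepted + 1
        samples.append(q.copy())                        # on rejection the old q is kept
    return np.array(samples), accepted / n_traj

samples, acceptance = hmc(2000)
mean_q2 = np.mean(samples ** 2)      # exact value for this toy action is 1
print(f"acceptance = {acceptance:.2f}, <q^2> = {mean_q2:.3f}")
```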
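• Write Hamiltonian
$H = \frac{1}{2}\pi^2 + \phi^\dagger R\left[(M^\dagger M)^{-1/n}\right] \phi$
• eg n = 2 for 2 flavour staggered, 1 flavour Wilson, DWF
• Refresh Fermions:
$\phi = R'\left[(M^\dagger M)^{1/(2n)}\right] \chi$ with Gaussian $\chi$
• $R\left[f(M^\dagger M)\right]$ is a rational approximation to $f(M^\dagger M)$:
$R\left[M^\dagger M\right] = \frac{P_m(M^\dagger M)}{Q_n(M^\dagger M)} = A \sum_i \frac{p_i}{M^\dagger M + q_i}$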
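A small numerical check of why the refresh uses the power $1/(2n)$: with exact matrix powers in place of the rational approximations $R$ and $R'$, the fermionic term of the Hamiltonian reduces to $\chi^\dagger \chi$, so Gaussian $\chi$ gives correctly distributed $\phi$. The matrix `M` below is a hypothetical well-conditioned stand-in, and `matrix_power_hpd` is a helper written for this sketch.

```python
import numpy as np

rng = np.random.default_rng(2)
dim, n = 8, 2
# Hypothetical stand-in for the fermion matrix M (complex, well conditioned).
M = np.eye(dim) + 0.1 * (rng.standard_normal((dim, dim))
                         + 1j * rng.standard_normal((dim, dim)))
A = M.conj().T @ M                       # M^dagger M, Hermitian positive definite

def matrix_power_hpd(A, power):
    """A**power for Hermitian positive definite A, via eigendecomposition."""
    w, V = np.linalg.eigh(A)
    return (V * w ** power) @ V.conj().T

chi = (rng.standard_normal(dim) + 1j * rng.standard_normal(dim)) / np.sqrt(2)
phi = matrix_power_hpd(A, 1.0 / (2 * n)) @ chi      # phi = (M^dag M)^{1/(2n)} chi

s_phi = np.vdot(phi, matrix_power_hpd(A, -1.0 / n) @ phi).real   # phi^dag (M^dag M)^{-1/n} phi
s_chi = np.vdot(chi, chi).real                                   # chi^dag chi
print(s_phi, s_chi)
```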
• It is easy to make rational approximation good to machine (or solver)
accuracy using Remez Algorithm
• Eliminates need for noisy accept reject step, to correct for badness of
approximation
• Successfully used in 2+1 flavour ASqTAD Staggered simulations and
2+1 flavour DWF simulations
• Details (read all about it)
– Horváth, Kennedy, Sint, hep-lat/9809092
– Clark, Kennedy, hep-lat/0309084
– Clark, deForcrand, Kennedy, hep-lat/0510004
– Mike Clark’s PhD thesis (available on request)
Introduction to Scale Splitting Ideas
Most advances in algorithms in 2005 came from some kind of
scale splitting schemes in HMC like algorithms.
• Lüscher’s Domain Decomposition - split space
• Hasenbusch Mass Preconditioning – add auxiliary mass scale
• Clark, Kennedy, deForcrand – approximation coefficients split scale
The separated scales are then simulated on different timescales.
Crucial Ingredient: Multiple Timescale Integrator
Multiple Timescale Integrators
• Consider Hamiltonian: $H = \frac{1}{2}p^2 + S(q)$ over phase space states $(p, q)$.
• Time Evolution Operator is: $U(\delta\tau) = e^{\delta\tau \frac{d}{dt}} = e^{\delta\tau (P + Q)} = e^{\delta\tau H}$
• Consider symplectic coordinate and momentum update operators:
$U_Q(\delta\tau) = e^{\delta\tau Q} : (p, q) \to (p, q \oplus p\,\delta\tau)$
$U_P(\delta\tau) = e^{\delta\tau P} : (p, q) \to (p \oplus \dot{S}\,\delta\tau, q)$
where $U_Q$ and $U_P$ are reversible and have unit determinant
• Use Baker-Campbell-Hausdorff formula:
$\exp(A)\exp(B) = \exp\left(A + B + \frac{1}{2}[A, B] + \frac{1}{12}[A, [A, B]] + \frac{1}{12}[B, [B, A]] + \ldots\right)$
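• Single Time Scale (Leapfrog):
$U^{(3)}(\delta\tau) = U_Q\left(\frac{\delta\tau}{2}\right) U_P(\delta\tau)\, U_Q\left(\frac{\delta\tau}{2}\right) = \exp\left(\delta\tau H + O(\delta\tau^3)\right)$
• Now consider: $H = \frac{1}{2}p^2 + S_1(U) + S_2(U)$ and operators
$U_{P_1}(\delta\tau) = e^{\delta\tau P_1} : (p, q) \to (p \oplus \dot{S}_1\,\delta\tau, q)$
$U_{P_2}(\delta\tau) = e^{\delta\tau P_2} : (p, q) \to (p \oplus \dot{S}_2\,\delta\tau, q)$
• Now construct operators:
$U_1(\delta\tau) = U_{P_1}\left(\frac{\delta\tau}{2}\right) U_Q(\delta\tau)\, U_{P_1}\left(\frac{\delta\tau}{2}\right)$
$U_2(\delta\tau) = U_{P_2}(\delta\tau)$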
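The single time scale leapfrog can be checked numerically. The sketch below assumes a one-dimensional toy action $S(q) = q^2/2 + q^4/4$ (not a lattice action). A local error of $O(\delta\tau^3)$ per step should give an energy error of $O(\delta\tau^2)$ over a fixed trajectory length, so halving the step should cut the error by roughly 4, and flipping the momentum should retrace the trajectory.

```python
def force(q):
    return -(q + q ** 3)              # -dS/dq for the toy action S(q) = q^2/2 + q^4/4

def hamiltonian(q, p):
    return 0.5 * p ** 2 + 0.5 * q ** 2 + 0.25 * q ** 4

def leapfrog(q, p, step, n_steps):
    """Apply U_Q(step/2) U_P(step) U_Q(step/2) n_steps times."""
    for _ in range(n_steps):
        q = q + 0.5 * step * p
        p = p + step * force(q)
        q = q + 0.5 * step * p
    return q, p

q0, p0 = 1.0, 0.5
h0 = hamiltonian(q0, p0)

def energy_error(n_steps, length=1.0):
    q, p = leapfrog(q0, p0, length / n_steps, n_steps)
    return abs(hamiltonian(q, p) - h0)

err_coarse, err_fine = energy_error(20), energy_error(40)
print("error ratio when halving the step:", err_coarse / err_fine)

# Reversibility: integrate forward, flip the momentum, integrate again.
q1, p1 = leapfrog(q0, p0, 0.05, 20)
q2, p2 = leapfrog(q1, -p1, 0.05, 20)
print("return to start:", q2 - q0, p2 + p0)
```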
Sexton Weingarten Integrator:
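• Consider:
$U^{SW}(\delta\tau) = U_2\left(\frac{\delta\tau}{2}\right) \left[U_1\left(\frac{\delta\tau}{n}\right)\right]^n U_2\left(\frac{\delta\tau}{2}\right)$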
• Split Integration onto 2 time scales: δτ and δτ /n.
• Can recursively introduce more scales eg: $\delta\tau$, $\delta\tau/n_1$, $\delta\tau/(n_1 n_2)$
• Choose $n_1$ and $n_2$ so as to optimize (equalize?) the $\dot{S}_i\,\delta\tau_i$ force terms.
• Quite an old idea: (J. C. Sexton, D. H. Weingarten, Nucl. Phys. B380,
665, 1992, M. J. Peardon, J. C. Sexton, Nucl. Phys. Proc. Suppl.
119:985-987, 2003)
• New advances: How to split the fermionic part of the action
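A sketch of the two-scale Sexton-Weingarten update on a toy problem. The split is an assumption made for illustration: `force_1` is a stiff but cheap term (playing the role of the gauge force) and `force_2` is a soft term (playing the role of an expensive fermion force). Both runs below use the same number of `force_2` evaluations; only the number of inner steps differs.

```python
def force_1(q):
    return -25.0 * q            # stiff, cheap term: S_1 = 25 q^2 / 2 (hypothetical)

def force_2(q):
    return -q ** 3              # soft, "expensive" term: S_2 = q^4 / 4 (hypothetical)

def hamiltonian(q, p):
    return 0.5 * p ** 2 + 12.5 * q ** 2 + 0.25 * q ** 4

def u1(q, p, step):
    """U_1(step) = U_P1(step/2) U_Q(step) U_P1(step/2)."""
    p = p + 0.5 * step * force_1(q)
    q = q + step * p
    p = p + 0.5 * step * force_1(q)
    return q, p

def sexton_weingarten(q, p, step, n_inner):
    """One application of U_SW(step) = U_2(step/2) [U_1(step/n)]^n U_2(step/2)."""
    p = p + 0.5 * step * force_2(q)
    for _ in range(n_inner):
        q, p = u1(q, p, step / n_inner)
    p = p + 0.5 * step * force_2(q)
    return q, p

def max_energy_error(n_inner, step=0.1, n_steps=20):
    q, p = 1.0, 0.0
    h0 = hamiltonian(q, p)
    worst = 0.0
    for _ in range(n_steps):
        q, p = sexton_weingarten(q, p, step, n_inner)
        worst = max(worst, abs(hamiltonian(q, p) - h0))
    return worst

err_single = max_energy_error(1)    # one time scale
err_multi = max_energy_error(5)     # stiff term on a 5x finer time scale
print(err_single, err_multi)
```

Further scales would follow by replacing the body of `u1` with another call of the same pattern, as in the recursive construction described above.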
Omelyan Integrator
• Alternative to leapfrog integrator. Originally by Omelyan, introduced to lattice QCD by deForcrand and Takaishi (hep-lat/0505020)
• Instead of combining $U_P$ and $U_Q$ as in the leapfrog case, combine as
$U(\delta\tau) = U_Q(\lambda\delta\tau)\, U_P\left(\frac{\delta\tau}{2}\right) U_Q\left((1 - 2\lambda)\delta\tau\right) U_P\left(\frac{\delta\tau}{2}\right) U_Q(\lambda\delta\tau)$
• tune coefficient λ to minimise error on 3rd order term.
• Roughly a 50% improvement is gained (more solves, but increased
step-size)
• Can be multi-scaled with a small amount of care
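A sketch comparing the two updates at equal step size on a toy action $S(q) = q^2/2 + q^4/4$. The value of `LAMBDA` is the one commonly quoted for this second-order minimum-norm scheme; it is taken here as an assumption rather than derived, and in practice it would be tuned as the slide says. Note that the Omelyan step uses two force evaluations where leapfrog uses one.

```python
LAMBDA = 0.1931833275037836   # commonly quoted value; an assumption in this sketch

def force(q):
    return -(q + q ** 3)              # -dS/dq for the toy action

def hamiltonian(q, p):
    return 0.5 * p ** 2 + 0.5 * q ** 2 + 0.25 * q ** 4

def leapfrog_step(q, p, h):
    q = q + 0.5 * h * p
    p = p + h * force(q)
    q = q + 0.5 * h * p
    return q, p

def omelyan_step(q, p, h, lam=LAMBDA):
    """U_Q(lam h) U_P(h/2) U_Q((1 - 2 lam) h) U_P(h/2) U_Q(lam h)."""
    q = q + lam * h * p
    p = p + 0.5 * h * force(q)
    q = q + (1.0 - 2.0 * lam) * h * p
    p = p + 0.5 * h * force(q)
    q = q + lam * h * p
    return q, p

def max_energy_error(step_fn, h, n_steps, q=1.0, p=0.5):
    h0 = hamiltonian(q, p)
    worst = 0.0
    for _ in range(n_steps):
        q, p = step_fn(q, p, h)
        worst = max(worst, abs(hamiltonian(q, p) - h0))
    return worst

err_leapfrog = max_energy_error(leapfrog_step, 0.05, 40)
err_omelyan = max_energy_error(omelyan_step, 0.05, 40)
print(err_leapfrog, err_omelyan)
```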
Hasenbusch Preconditioning
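• Write the desired determinant as:
$\det(M^\dagger M) = \det(M_2^\dagger M_2) \times \frac{\det(M^\dagger M)}{\det(M_2^\dagger M_2)}$
• Corresponding Fermion Action:
$S = \psi^\dagger (M_2^\dagger M_2)^{-1} \psi + \phi^\dagger \left[M_2 (M^\dagger M)^{-1} M_2^\dagger\right] \phi = S_1(\psi^\dagger, \psi) + S_2(\phi^\dagger, \phi)$
• Choose $M_2$ similar to $M$ ⇒ $M^{-1} M_2 \approx 1$, $\dot{S}_2 \approx 0$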
• Now put S1 and S2 on different time scales (S2 on long timescale)
• Forces in S1 and S2 have different sizes, both smaller than before split
• For Wilson fermions: Add small imaginary mass term to M2 eg:
M2 = M + iρ
– Spectrum of $M_2^\dagger M_2$ bounded from below
• Used in
– Wilson-Clover Simulations: Hasenbusch,Jansen: hep-lat/0211042
– Wilson Simulations: Urbach et al. hep-lat/0506011, hep-lat/0510064
– Overlap Simulations: DeGrand and Schaefer hep-lat/0412005, hep-lat/0506021, hep-lat/0508025
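The determinant split can be checked on a small dense example. Assumptions in this sketch: `M` is a Hermitian matrix with a prescribed spectrum (a stand-in for a $\gamma_5$-Hermitian Wilson operator), and the shift `rho` is picked by hand. The operator in $S_1$ is $M_2^\dagger M_2$ and the operator in $S_2$ is $X = M_2 (M^\dagger M)^{-1} M_2^\dagger$, whose pseudofermion integral contributes $1/\det X$.

```python
import numpy as np

rng = np.random.default_rng(3)
dim, rho = 16, 0.5
# Hypothetical Hermitian stand-in for M, with some eigenvalues close to zero.
lam = np.linspace(0.05, 2.0, dim) * (-1.0) ** np.arange(dim)
V, _ = np.linalg.qr(rng.standard_normal((dim, dim)))
M = (V * lam) @ V.T
M2 = M + 1j * rho * np.eye(dim)                 # M_2 = M + i rho

A = M.conj().T @ M                              # M^dagger M
A2 = M2.conj().T @ M2                           # M_2^dagger M_2, appears in S_1
X = M2 @ np.linalg.solve(A, M2.conj().T)        # M_2 (M^dagger M)^{-1} M_2^dagger, in S_2

logdet_A = np.linalg.slogdet(A)[1]
logdet_A2 = np.linalg.slogdet(A2)[1]
logdet_X = np.linalg.slogdet(X)[1]
print("log det check:", logdet_A, logdet_A2 - logdet_X)

cond_A, cond_A2, cond_X = np.linalg.cond(A), np.linalg.cond(A2), np.linalg.cond(X)
print("condition numbers:", cond_A, cond_A2, cond_X)
```

Both factors come out better conditioned than the original operator, which is the sense in which the forces in $S_1$ and $S_2$ are easier to handle than the unsplit force.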
Nroots Preconditioning
• Write
$\det(M^\dagger M) = \prod_{i=1}^{n} \det\left[(M^\dagger M)^{1/n}\right]$
• with corresponding actions:
$S = \sum_{i=1}^{n} \psi_i^\dagger (M^\dagger M)^{-1/n} \psi_i$
• This kind of determinant splitting is an old idea (Joó, Horváth, Liu,
hep-lat/0112033) but most fruitful within context of RHMC algorithm
(Clark, Kennedy, hep-lat/0409134, Clark, Kennedy, deForcrand hep-lat/0510004)
• Similar in spirit to Hasenbusch preconditioning, each pseudofermion term is now better conditioned: $\kappa(M^{1/n}) = \kappa(M)^{1/n}$
• $(M^\dagger M)^{1/n}$ approximated with Rational Approximation, Coefficients from Remez algorithm (fits extremely well with RHMC, where you do this anyway to do single flavour simulations)
– Amounts to doing 2 flavour simulation as 1+1 flavour RHMC
• No mass tuning needed. Taking root divides condition number equally
between terms. (Optimally according to Tony Kennedy)
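A quick numerical illustration of the conditioning claim, assuming a symmetric positive definite stand-in for $M^\dagger M$ with a hand-picked spectrum. The exact root is taken by eigendecomposition here, whereas in RHMC it would be a rational approximation.

```python
import numpy as np

rng = np.random.default_rng(4)
dim, n = 12, 2
eigs = np.geomspace(1e-3, 1.0, dim)          # hypothetical spectrum of M^dagger M
V, _ = np.linalg.qr(rng.standard_normal((dim, dim)))
A = (V * eigs) @ V.T

def nth_root(A, n):
    """A**(1/n) for symmetric positive definite A."""
    w, U = np.linalg.eigh(A)
    return (U * w ** (1.0 / n)) @ U.T

A_root = nth_root(A, n)
kappa = np.linalg.cond(A)
kappa_root = np.linalg.cond(A_root)
print("kappa(A) =", kappa, " kappa(A^(1/n)) =", kappa_root, " kappa(A)^(1/n) =", kappa ** (1.0 / n))

# The n identical factors multiply back to the original determinant.
logdet_A = np.linalg.slogdet(A)[1]
logdet_root = np.linalg.slogdet(A_root)[1]
```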
Rational Multiple Timescaling
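• Rational Action:
$S = A \sum_i \phi^\dagger \left( \frac{p_i}{M^\dagger M + q_i} \right) \phi$
• Rational Force:
$F = -A \phi^\dagger \sum_i p_i \left( \frac{1}{M^\dagger M + q_i} \left[ \dot{M}^\dagger M + M^\dagger \dot{M} \right] \frac{1}{M^\dagger M + q_i} \right) \phi$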
• Use multi shift CG – cost dominated by smallest shift
• deForcrand & Clark: small shifts have high cost BUT small force
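A sketch of a multi-shift conjugate gradient solver, written from the standard shifted-residual recurrences rather than taken from any production code. Assumptions: `A` is a dense Hermitian positive definite stand-in for $M^\dagger M$, all shifts are non-negative, and the iteration is driven by the unshifted system, which is the slowest to converge. One matrix-vector product per iteration serves every shift.

```python
import numpy as np

def multishift_cg(A, b, shifts, tol=1e-10, max_iter=500):
    """Solve (A + s) x_s = b for every s in shifts using one Krylov space."""
    shifts = np.asarray(shifts, dtype=float)
    x = np.zeros((len(shifts), len(b)), dtype=b.dtype)
    p_s = np.tile(b, (len(shifts), 1))        # search direction for each shift
    r, p = b.copy(), b.copy()                 # residual and direction, unshifted system
    zeta_old = np.ones(len(shifts))           # r_s = zeta_s * r for each shift
    zeta = np.ones(len(shifts))
    alpha_old, beta_old = 1.0, 0.0
    rr = np.vdot(r, r).real
    target = tol ** 2 * rr
    for iteration in range(1, max_iter + 1):
        Ap = A @ p
        alpha = rr / np.vdot(p, Ap).real
        zeta_new = (zeta * zeta_old * alpha_old) / (
            alpha * beta_old * (zeta_old - zeta)
            + zeta_old * alpha_old * (1.0 + shifts * alpha))
        x += (alpha * zeta_new / zeta)[:, None] * p_s
        r = r - alpha * Ap
        rr_new = np.vdot(r, r).real
        if rr_new < target:
            return x, iteration
        beta = rr_new / rr
        p = r + beta * p
        p_s = zeta_new[:, None] * r + (beta * (zeta_new / zeta) ** 2)[:, None] * p_s
        zeta_old, zeta = zeta, zeta_new
        alpha_old, beta_old, rr = alpha, beta, rr_new
    return x, max_iter

rng = np.random.default_rng(6)
dim = 30
eigs = np.linspace(0.05, 5.0, dim)            # hypothetical spectrum
V, _ = np.linalg.qr(rng.standard_normal((dim, dim)))
A = (V * eigs) @ V.T
b = rng.standard_normal(dim)
shifts = [0.0, 0.1, 1.0, 10.0]

x, iterations = multishift_cg(A, b, shifts)
errors = []
for k, s in enumerate(shifts):
    ref = np.linalg.solve(A + s * np.eye(dim), b)
    errors.append(np.linalg.norm(x[k] - ref) / np.linalg.norm(ref))
print(iterations, max(errors))
```

A production solver would normally stop updating shifts that have already converged; that refinement is left out here for brevity.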
[Figure: relative force and CG iterations (vertical axis, 0 to 1000) for each partial fraction term (horizontal axis, 0 to 9). Thanks to M. Clark for figure]
• Run different poles on different timescales
• Ratio of scales may be guessed using the $p_i$ and $q_i$
• Reduces utility of multi-mass solver
• But can then use chronological solver in principle
• BUT – smallest shift, highest cost, longest stepsize
– chronological solver least useful
• Easy to combine with Nroots preconditioning
• See deForcrand, Clark, Kennedy (hep-lat/0510004)
Splitting Space: Domain Decomposition
• Discussed Extensively by Luigi
• Application of Schwarz Domain Decomposition by Lüscher (hep-lat/0409106,
hep-lat/0509152)
• Draw momenta and pseudofermion fields, then split lattice into blocks
• Identify active links in the blocks so that the blocks decouple
• Integrate each block in MD time in the usual way, and put the boundary
fields on a slower time scale.
• Accept Reject as per normal HMC
• Perform a random translation on the links (to ensure ergodicity)
• The blocks provide a natural IR cutoff
• This reduces the occurrence of low (near singular) modes in block Dslash-es
• Block size has to be small enough
• but want block big enough to contain lots of active links
• Updating inter-block boundaries deals with UV effects.
• Suits ’fat blocks’ - most efficient on clusters...
• Successful in: Wilson simulations, Wilson Clover simulations
Stout Link Smearing
• Smearings used often in improved gauge actions
• But most smearings involve a non-differentiable projection into SU (3)
• This makes molecular dynamics difficult
• Stout Links (Morningstar and Peardon, Phys. Rev. D69 (2004) 054501, hep-lat/0311018)
• Stout links are differentiable through a recursive procedure
• Have been shown to be useful in overlap simulations (DeGrand &
Schaefer)
Basic Idea
• Take APE smearing staple sum, and close to form a loop
• Project into the Lie algebra su(3)
• exponentiate back into SU (3):
– Cayley-Hamilton Theorem: $e^{iQ} = f_0 + f_1 Q + f_2 Q^2$
– $Q$ is traceless and Hermitian, so $iQ$ is traceless, antihermitian (we use this in HMC a lot)
• Need time derivatives of $Q$, $Q^2$ and $f_0$, $f_1$, $f_2$.
• Well behaved if $\frac{dQ}{dt}$ is well behaved
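A sketch of the Cayley-Hamilton exponentiation for a $3 \times 3$ matrix. Assumptions: `Q` is a random traceless Hermitian matrix rather than one built from a staple sum, and the coefficients $f_0$, $f_1$, $f_2$ are obtained by solving a small Vandermonde system in the eigenvalues, which only works when the eigenvalues are distinct. The published stout-link construction uses closed-form expressions that also cover the degenerate case.

```python
import numpy as np

rng = np.random.default_rng(5)
# Hypothetical traceless Hermitian 3x3 matrix Q.
G = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
Q = 0.5 * (G + G.conj().T)
Q -= np.trace(Q).real / 3.0 * np.eye(3)

# Cayley-Hamilton: exp(iQ) = f0 + f1 Q + f2 Q^2. On each eigenvalue w this reads
# f0 + f1 w + f2 w^2 = exp(i w), a 3x3 Vandermonde system for (f0, f1, f2).
w, V = np.linalg.eigh(Q)
vandermonde = np.vander(w, 3, increasing=True).astype(complex)
f0, f1, f2 = np.linalg.solve(vandermonde, np.exp(1j * w))
U_ch = f0 * np.eye(3) + f1 * Q + f2 * (Q @ Q)

# Reference: exponentiate through the eigendecomposition.
U_eig = (V * np.exp(1j * w)) @ V.conj().T

unitarity_defect = np.linalg.norm(U_ch.conj().T @ U_ch - np.eye(3))
det_U = np.linalg.det(U_ch)
print(np.linalg.norm(U_ch - U_eig), unitarity_defect, det_U)
```

The result is unitary with unit determinant, so the exponentiated link stays in SU(3) without a projection step.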
as the adverts used to say...
DWF Tricks
• Combine DWF and PV matrices in the one flavour term (in 2+1 flavour
RHMC)
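• Previously had:
$S = \psi^\dagger \left(M^\dagger(m) M(m)\right)^{-1/2} \psi + \phi^\dagger \left(M(1)^\dagger M(1)\right)^{1/2} \phi$
• Now consider
$S = \psi^\dagger \left(M^\dagger(1) M(1)\right)^{1/4} \left(M^\dagger(m) M(m)\right)^{-1/2} \left(M^\dagger(1) M(1)\right)^{1/4} \psi$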
• Reduces Noise
Improved gauge actions
• Improved gauge actions smooth gauge fields (even on coarser lattices)
• Smooth fields improve the spectrum of auxiliary Dirac Operator H
• Recent study by UKQCD/RBC Collaboration (Peter’s Lattice Talk)
Improved 5D Operators
• Various new 5D operators have been suggested for Chiral Fermion
Physics
– Optimal DWF fermions - T.W.Chiu, hep-lat/0209153
– Alternative to DWF fermions - Neuberger, hep-lat/0005004
– Continued Fraction operator - Neuberger, hep-lat/9901003, Borici et al. hep-lat/0110070, Wenger hep-lat/0403003
– Möbius DWF fermions - Brower, Orginos, Neff, hep-lat/0409118,
hep-lat/0511031
• Realisation: (Edwards et al, hep-lat/0510086) that all improved 5D operators are different facets of a rational approximation to the Chiral
Fermion operator distinguished through:
– Representation - the structure of the matrix
– Approximation - choice of coefficients
– Kernel - scaling behaviour
• Partially Quenched Performance study found best contenders:
– Continued Fraction with Zolotarev Coefficients
– Partial Fraction with Zolotarev Coefficients
• Standard (Shamir) DWF form was least effective
Cost vs Mres
[Figure: Dyn. DWF (Ls=12), mf=0.020. Vertical axis: cost normalised by unscaled Shamir DWF (0 to 5). Horizontal axis: |mres| (mShamir/m), log scale from 1e-06 to 1. Series: Shamir (α=1, HT, tanh); Scaled Shamir (α=1.7, HT, tanh); Chiu DWF (b5=1, c5=1, Zolotarev), off graph; Borici (α=1, Hw, tanh); Continued Fraction (Hw, Zolotarev); Continued Fraction (HT, Zolotarev). Labelled points: DWF at Ls=12, 24, 32; CFZ at Ls=6, 8.]
Self Criticism
• Negative mres still a little troubling - could get rid of it by cooking sign
function (as done in the KY papers)
• Can’t please everyone: Zolotarev coefficients give small mres , compared to tanh approximation, but not small enough for purists
• Need explicit handling of low modes of H to make mres = 0.
• Only preliminary DF experience - seems similar in cost to usual DWF.
• More dynamical fermion studies needed - expensive (QCDOC rack
months)
4D Overlap Advances
• As mass becomes small, changing topology becomes difficult
– eigenvalue of Hw changes sign
– Step function in action → delta function in fermion force
– Acceptance goes to 0 even on $6^4$ lattice (Szabó, Lat 2004)
Reflection/Refraction Integrator
• Phase space has surfaces where eigenmodes of Hw change sign.
• Track eigenvalues of H through MD
• At level crossing, compute “angle” between momenta and Normal to
surface
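• Reflect/refract accordingly
if $\langle N, P \rangle^2 < 2\Delta S$, then reflect: $P \leftarrow P - 2N \langle N, P \rangle$
otherwise refract: $P \leftarrow P - N \langle N, P \rangle + N \langle N, P \rangle \sqrt{1 - 2\Delta S / \langle N, P \rangle^2}$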
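The momentum update at a level crossing can be sketched as below. Assumptions: momenta and normal are plain real vectors with the usual dot product (on the lattice they live in the Lie algebra), `N` is a unit normal, and `delta_S` is the jump in the action across the surface. The function name is illustrative. Refraction lowers the kinetic energy by exactly `delta_S`, and reflection leaves it unchanged, so the total energy is conserved either way.

```python
import numpy as np

def reflect_or_refract(P, N, delta_S):
    """Return the updated momentum and whether the surface was crossed.

    P: momentum, N: unit normal to the crossing surface,
    delta_S: action on the far side minus action on the near side.
    """
    n_dot_p = np.dot(N, P)
    if n_dot_p ** 2 < 2.0 * delta_S:
        # Not enough normal momentum to pay for the jump: bounce back.
        return P - 2.0 * N * n_dot_p, False
    # Cross the surface with a rescaled normal component.
    scale = np.sqrt(1.0 - 2.0 * delta_S / n_dot_p ** 2)
    return P - N * n_dot_p + N * n_dot_p * scale, True

P = np.array([3.0, 1.0, 0.0])
N = np.array([1.0, 0.0, 0.0])

P_reflected, crossed_big_jump = reflect_or_refract(P, N, 10.0)   # 9 < 20: reflect
P_refracted, crossed_small_jump = reflect_or_refract(P, N, 2.0)  # 9 >= 4: refract

kinetic_change_reflect = 0.5 * np.dot(P_reflected, P_reflected) - 0.5 * np.dot(P, P)
kinetic_change_refract = 0.5 * np.dot(P_refracted, P_refracted) - 0.5 * np.dot(P, P)
print(P_reflected, kinetic_change_reflect)
print(P_refracted, kinetic_change_refract)
```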
• Original approach had step-size errors of $O(\tau_1)$, where $\tau_1$ was the MD time needed to reach the zero-eigenvalue surface
• Approach by Cundy et al. (hep-lat/0502007) remedies this and restores errors to $O(\tau^2)$.
• Leaders in Overlap game: DeGrand and Schaefer, Cundy et al. (aka the Wuppertal Gang), and of course the original team: Fodor, Katz, Szabó et al. (aka the Hungarians)
State of the Art Wilson Simulation Algorithms
• Lüscher style Domain Decomposition
• Jansen et al: Hasenbusch mass preconditioning and multiple timescales
• Lattice Sizes: $24^3 \times 48$ with $a = 0.06 - 0.08$ fm and $m_\pi \approx 294$ MeV.
• PACS-CS in Japan: to focus 14.3Tflops of PACS-CS onto Wilson Clover
simulations (eventually using Domain Decomposition)
State of the Art Staggered Simulations
• R-algorithm with 2+1 flavours currently on NERSC archive
• Large lattices $40^3 \times 96$ at $a = 0.09$ fm, $m_l/m_s = 0.2, 0.4$ available
through NERSC Gauge Connection (MILC and UKQCD Coordination)
• Future running: 2+1 flavour RHMC, Nroots acceleration, Omelyan Multitimescale integrator on QCDOC (UKQCD and MILC using Mike Clark’s
code in CPS)
• Humongous lattices planned: $48^3 \times 144$ at $a = 0.06$ fm, with $m_l/m_s = 0.2, 0.4$
State of the art Twisted Mass Simulations
• European Twisted Mass Collaboration (ETMC) (Germany, UK, France,
Italy)
• State of the art code similar to Wilson: Hasenbusch mass preconditioning and multiple time scales
• Plans presented at ILDG 7: $a = 0.075 - 0.12$ fm, $L \approx 2.5$ fm, 250 MeV $\le m_\pi \le$ 500 MeV
State of the art Domain Wall Simulations
• UKQCD-RBC QCDOC Collaboration
• O(10) QCDOC Rack Years of concerted and coordinated effort
• Normal Shamir formulation, 2+1 flavours, Nroots acceleration, Omelyan
integrator with Multiple Timescales, and Hardware Optimized Multimass solver (all the tricks?) in CPS
• Runs planned at ILDG7: $16^3 \times 32$, $L_s = 8$, $a^{-1} = 1.5 - 2.2$ GeV.
• According to Lattice 2005 contribution: mres is still about 30% of the
lightest quark mass
State of The Art Overlap Simulations
• DeGrand and Schaefer
– in Boulder, Hasenbusch acceleration, Stout-smearing, improved
calculation of tunnelling probability,
– but smallish lattices so far.
• The Wuppertal Gang (Cundy, Krieg, Lippert, Frommer etc)
– Thin links (?)
– Own version of reflecting/refracting integrator accurate to $O(\tau^2)$.
– Planned for $16^3 \times 32$ lattices according to their Nicosia write up.
• JLQCD - plans for large scale Dynamical Overlap using KEK BG/L
Apologies to the unmentioned
There are other people doing other things: eg Approximate Overlap Operators (Bietenholz et al), Fodor’s group – the first to publish dynamical
overlap simulations.
Algorithmic Games - Low Hanging Fruit
• Stout Links in Dynamical Fermion Evolution
– Structure and preliminary code ready in Chroma
– But Needs debugging
– Needs usefulness tests - ie running simulations eg Stout Link Wilson, Stout Link Clover – would suit Graduate Student, PostDoc
• More work on Continued Fraction/Partial Fraction 5D operators
– Can investigate tuning and most importantly, need (R)HMC runs
– Need exact treatment of low e-values/modes
– Resource intensive (need Racks of QCDOC and Human Drivers)
• Using Mixed 4D and 5D techniques
– Operator Application in 4D, inversion in 5D
– 5D inversion solves $M\phi = \chi$
– so may need two 5D inversions to get $(M^\dagger M)^{-1}\chi$
• Hasenbusch Preconditioning
– Opens up way for new determinant splittings, combined with...
• Multi Timescale Integrators
– In Chroma, we have a 2 timescale Sexton Weingarten integrator
– Generalize and implement for more timescales
Somewhat Higher Hanging Fruit
• Actually proper 4D Overlap Simulations
– Needs tidying of 4D overlap code (in Chroma)
– Needs reflecting/refracting integrator, robust eigensolver techniques
– Would probably involve structural changes in Chroma
– However, this is playing catchup with Wuppertal, DeGrand etc.
Jefferson Lab Physics
• Spectroscopy, Nucleon Excited States, GPDAs, Decays
• Chiral operators don’t have +ve definite 4D transfer matrix
– Wiggles in Correlation Functions
– Excited states difficult.
• Consider Wilson-Clover fermions in sea, on large fine lattices
• Lüscher: Wilson-Clover simulations should improve (hep-lat/0512021)
– eg: $a = 0.08$ fm and $V = 24^3 \times 48$ with $m_\pi \approx 300$ MeV
– or $a = 0.06$ fm and $V = 32^3 \times 64$
[Figure: effective mass $a\,m_{\rm eff}$ (vertical axis, 0.45 to 0.625) versus $t$ (horizontal axis, 4 to 26), with curves labelled γ5, γk, γk γ4 and γk γ5. From hep-lat/0601137 (thanks to J. Dudek, R. Edwards, D. Richards)]
• JLab clusters oversubscribed (current SciDAC allocations)
• Dynamical overlap development needs lots of resources,
– Lots of algorithmic expertise at JLab
– Little justification to focus limited human effort away from JLab physics
– Need good incentives (to satisfy our masters) such as:
∗ shared computing resource (actual computer time)
∗ shared human resource (especially developers, runners)
Chroma Development Plan
• Fermion Sector rework to cope with Dynamical Clover (in progress)
• MD interfaces need to change (extend) to support multi timescaling
• Need to write a general multi-timescale integrator
• Inversion structure needs rework
– Make applying an inverse like applying matrix.
– will clean up MD stuff, and propagators, easier to choose inverters
• Above changes are beneficial to all DF simulations
• Will allow us to try mixed 4D-5D approaches
Current JLab Plans
• SciDAC ends this June. Currently applying for SciDAC 2.
– Infrastructure application sometime in February
– Applications for projects in late Feb, March
• Considering applying for time for DF Wilson/Clover simulations.
– Use Wilson-Clover valence for excited states (no wiggles)
– Use Chiral operator for light quarks (possibly overlap valence).
• Coordination with UKQCD could be beneficial
Need to sort out issues:
• No of flavours? (2, 1+1, 2+1, 1+1+1?)
• Anisotropy or lack thereof?
• Politics politics politics
– Coordinate production parameters
– Share data
– Share or Compete on Analysis?