Efficient solution algorithm of non-equilibrium Green’s functions in atomistic tight binding representation
Yu He, Lang Zeng, Tillmann Kubis, Michael Povolotskyi, Gerhard Klimeck
Network for Computational Nanotechnology
Purdue University, West Lafayette, Indiana
May 2012

Why atomistic tight binding?
• Single atom transistor: Nature Nanotechnology 7, 242 (2012); Ryu et al., Wednesday, 9.40am
• Band-to-band tunneling: IEEE Elec. Dev. Lett. 30, 602 (2009); Jiang et al., Wednesday, Poster P82
• Topological insulators: Nature Physics 6, 584 (2010); Sengupta et al., Thursday, 12.40pm
Countable device atoms suggest atomistic descriptions.
Modern device concepts, e.g.
• Band-to-band tunneling
• Topological insulators (gapless materials)
• Band/valley mixing, etc.
require multi-band representations.

Why non-equilibrium Green’s functions?
[Figure: device dimensions. Source: http://newsroom.intel.com/docs/DOC-2035]
State-of-the-art semiconductor devices
• utilize or suffer from quantum effects (tunneling, confinement, interference, …)
• are run in real-world conditions (finite temperatures, varying device quality, …)
This requires a consistent description of coherent quantum effects (tunneling, confinement, interference, …) and incoherent scattering (phonons, impurities, rough interfaces, …).

Numerical load of atomic NEGF
NEGF requires the solution of four coupled differential equations:
$G^R = (E - H_0 - \Sigma^R)^{-1}$
$\Sigma^R = G^R D^R + G^R D^< + G^< D^R$
$G^< = G^R \Sigma^< G^A$
$\Sigma^< = G^< D^<$
• The G’s and Σ’s are matrices in discretized propagation space (RAM ~ $N^2$, time ~ $N^3$)
• Atomic device resolutions can yield very large N (e.g. $N = 10^7$)

Motivation: transport in reality
[Figure: sp3d5s* TB band structure of a 3 nm (111) GaAs quantum well]
[Figure: $G(z, z', k_\parallel = 2/\mathrm{nm}, E = 0)$]
Full TB NEGF: atomistic tight binding represents electrons by $N \cdot N_o$ states (N atoms, $N_o$ orbitals per atom).
Physics:
• Electrons with given $(k_\parallel, E)$ do not couple with all $N \cdot N_o$ states
• A few states should suffice to describe the physics at $(k_\parallel, E)$
This work:
1. Find the n relevant states and form an n-dimensional basis
2. Transform the NEGF equations into the n-dimensional basis and solve therein (low rank approximation)

Incomplete basis transformation: LRA
Wikipedia.org: “In mathematics, low-rank approximation is a minimization problem, in which the cost function measures the fit between a given matrix (the data) and an approximating matrix (the optimization variable), subject to a constraint that the approximating matrix has reduced rank.”
In NEGF: “low rank approximation”
Given
• an NEGF equation in an N-dimensional basis, i.e. a matrix $M_{N \times N}$
• n < N orthonormal functions $\{\Psi_i\}$ in the N-dimensional basis
the approximate NEGF equation P is given by
$P = T^\dagger M T$, with $T = (\Psi_1, \Psi_2, \ldots, \Psi_n)$
• T is an N × n matrix
• P is an n × n matrix
Key: how to find “good” $\Psi_i$?

Method: find propagating states
One method: for electrons at energy E and momentum k …
1. Solve the eigenproblem of the non-Hermitian Hamiltonian of the open system: $(H(k) + \Sigma^R(k,E))\,\Phi_i = \varepsilon_i \Phi_i$
2. Choose the n states $\{\Phi_i\}$ with $\mathrm{Re}(\varepsilon_i)$ closest to the considered particle energy E
3. Orthonormalize the n states: $\{\Phi_i\} \rightarrow \{\Psi_i\}$
4. LRA: $P = T^\dagger M T$, with $T = (\Psi_1, \Psi_2, \ldots, \Psi_n)$
5. Solve the approximate NEGF equations P in the reduced basis $\{\Psi_i\}$
6. Transform the results back into the original basis representation: $G_{N \times N} = T_{N \times n}\, g_{n \times n}\, (T_{N \times n})^\dagger$

Benchmark: density in Si nanowire
5 × 5 nm² Si nanowire in the sp3d5s* empirical tight binding model
[Figure: electron density in the nanowire; scale bar 5 nm]
Result:
• The original matrix rank was reduced down to 10%
• No loss in accuracy of the electron density
• A preliminary implementation already shows a speed-up of 8 times (effectively limited by the solution of the basis functions)

Benchmark: transmission in Si nanowire
5 × 5 nm² Si nanowire in the sp3d5s* empirical tight binding model
[Figure: transmission for holes and electrons; scale bar 5 nm]
• The original matrix rank can be reduced down to 10%
• Reducing the matrix rank too strongly results in increasing deviations
• LRA works for electrons and holes

Benchmark: periodicity & band-to-band tunneling
L-shape GaSb-InAs tunneling FET
• Broken gap band structure: mixture of electrons and holes
• Periodic direction: momentum-dependent basis functions
• 2D transport (nonlinear geometry)
[Figure: TFET concept (taken from MIND); scale bars 10 nm]
Low rank approximation is
• applicable to arbitrary geometries and periodicities
• applicable to band-to-band tunneling

LRA in NEGF: feasibility of large devices
Atomistic NEGF without LRA on supercomputers:
• Typical maximum Si wire diameter ~ 8 nm (1000s of CPUs)
NEGF with LRA:
• 12 nm diameter Si nanowire in sp3d5s* TB
• With 10% of the original matrix rank
• Calculation done on ~100 CPUs
With LRA, NEGF is easily applicable to larger device dimensions than ever.
See also Lang et al., Poster Thursday (LRA + inelastic scattering in eff. mass)

Approximate basis functions
[Figure: transmission in a resonant tunneling diode (layer labels: GaAs, AlAs); legend: exact; 9% matrix rank, eig(H+Σ(-1.2 eV))]
Challenge for further speed improvement: solving a set of basis functions for every energy takes too much time.
Solution: reuse the basis functions obtained at (E, k) for different (E’, k’).
[Figure: legend: exact; 9% matrix rank, eig(H+Σ(-0.6 eV)), eig(H+Σ(2.5 eV)), eig(H+Σ(1.9 eV))]
Approximate solutions of basis functions are feasible.

Conclusion & Outlook
Conclusion
• Developed a systematic efficiency improvement of NEGF in atomistic tight binding (for LRA in effective mass + inelastic scattering, see Lang et al., Wed., Poster P11)
• Freely tunable approximation of NEGF
• Applicable to electrons, holes and mixed particles, arbitrary device geometries, …
• Enables efficient NEGF solutions in large devices
• Implemented in the free software tool NEMO5
Ongoing work
• LRA for NEGF with phonons
• LRA for NEGF with inelastic scattering in tight binding
• LRA for multiscale transport problems

Thank you!
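
Backup: sketch of the six-step LRA procedure
The six steps from the method slide can be illustrated with a minimal NumPy sketch. This is not the NEMO5 implementation. The test system is a hypothetical one: a 1D nearest-neighbor chain with an energy-independent contact self-energy on the two end sites. The names `lra_retarded_green` and `toy_chain` and the parameters `t` and `gamma` are assumptions made only for illustration, and QR is just one possible choice for the orthonormalization in step 3.

```python
import numpy as np


def lra_retarded_green(H, sigma_r, energy, n):
    """Approximate G^R = (E - H - Sigma^R)^{-1} with n basis states."""
    N = H.shape[0]
    # Step 1: eigenproblem of the non-Hermitian open-system Hamiltonian.
    eps, phi = np.linalg.eig(H + sigma_r)
    # Step 2: keep the n states with Re(eps) closest to the energy E.
    keep = np.argsort(np.abs(eps.real - energy))[:n]
    # Step 3: orthonormalize {Phi_i} -> {Psi_i} (QR is an assumed choice).
    T, _ = np.linalg.qr(phi[:, keep])  # T has shape (N, n)
    # Step 4: project M = E - H - Sigma^R from N x N down to n x n.
    M = energy * np.eye(N) - H - sigma_r
    P = T.conj().T @ M @ T
    # Step 5: solve in the reduced basis.
    g = np.linalg.inv(P)
    # Step 6: transform back into the original basis.
    return T @ g @ T.conj().T, T


def toy_chain(N, t=1.0, gamma=0.5):
    """Hypothetical test system: 1D chain with constant end-site broadening."""
    H = -t * (np.eye(N, k=1) + np.eye(N, k=-1))
    sigma_r = np.zeros((N, N), dtype=complex)
    sigma_r[0, 0] = sigma_r[-1, -1] = -0.5j * gamma
    return H, sigma_r


N, E = 40, 0.3
H, sigma_r = toy_chain(N)
G_exact = np.linalg.inv(E * np.eye(N) - H - sigma_r)

# Without rank reduction (n = N) the procedure reproduces the exact result.
G_full, T_full = lra_retarded_green(H, sigma_r, E, n=N)

# With 25% of the original rank the result is an approximation.
G_lra, T_lra = lra_retarded_green(H, sigma_r, E, n=N // 4)
err = np.linalg.norm(G_lra - G_exact) / np.linalg.norm(G_exact)
print(f"relative deviation at 25% rank: {err:.3e}")
```

For n = N the matrix T is unitary, so the back-transformed result equals the directly inverted one up to round-off. For n < N the deviation depends on the system and on how the states are chosen, so the printed number for this toy chain says nothing about the accuracy reported in the benchmarks.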