Faster G0W0 calculations using Lanczos algorithm and Sternheimer equation Jonathan Laflamme Janssen, Gabriel Antonius, Nicolas Bérubé and Michel Côté Université de Montréal Regroupement Québécois pour les Matériaux de Pointe (RQMP) Abstract G0W0 corrections to DFT band structures are a popular way to go beyond the accuracy DFT is able to provide. However, the calculation of such corrections with the ABINIT code is currently prohibitive for systems with more than a few hundreds of electrons. What limits the calculations to this system size is the need in the current implementation to invert the dielectric matrix and to carry out some summations over conduction bands. This poster presents a strategy to avoid both of these limitations for the screened-exchange contribution to the self-energy (ΣSEX). In ABINIT’s implementation, the dielectric matrix is expressed in a plane wave basis, which needs to be relatively big to properly describe the matrix. This poster explains how a Lanczos basis can be generated to substantially reduce the size of the matrix. Also, the number of conduction bands needed to reach convergence in the summation is usually an order of magnitude bigger than the number of valence bands. Here, the calculation of all the conduction states is avoided by reformulating the summation into a linear equation problem (Sternheimer equation), which also substantially reduces the computation time. Introduction GW : real quasiparticles G0W0 : First order corrections to εn We approximate quasiparticles states φn and energies εn by Kohn-Sham states and energies : To obtain an estimate of the Self-Energy for a specific state : To obtain a first order estimate of quasiparticle energies : C Équations for Abinit Workshop 2011 P (1, 2) = −iG(1, 2)G(2, 1+) (64) W (1, 2) = v(1, 2)− i � v(1, 3)P (3, 4)W (4, 2)d(3, 4) (65) Σ(1, 2) = iG(1, 2)W (2, 1+) (66) (T̂ + V̂ext + V̂H) |φn�+ Σ̂(εn) |φn� = εn |φn� Ĝ(ω) = � n |φn� �φn| ω − εn + iη sgn(εn − µ) P̂ (ω) = � v,c |φ∗ cφv� � 1 ω − (εc − εv − iη) − 1 ω + (εc − εv − iη) � �φvφ ∗ c | Ŵ (ω) = v̂ + v̂P̂ (ω)Ŵ (ω) Σ(r, r�, εn) = i 2π � ∞ −∞ dω eiηωG(r, r�,ω + εn)W (r, r�,ω) (T̂ + V̂ext + V̂H) |φn�+ V̂xc |φn� = εn |φn� P̂ (ω) = � v,c |φ∗ cφv� � 1 ω − (εc − εv − iη) − 1 ω + (εc − εv − iη) � �φvφ ∗ c | Ŵ (ω) = v̂ + v̂P̂ (ω)Ŵ (ω) εQP n ≈ εn + �φn| Σ̂(εQP n )− V̂xc |φn� Σ(εe) ≡ �φe| Σ̂(εe) |φe� = i 2π � n � +∞ −∞ dωeiηω �φeφ∗ n| Ŵ (ω) |φ∗ nφe� ω − ωne + iηsgn(�n − µ) = ΣSEX(εe) + ΣCOH(εe) ΣSEX(�e) = − � v �ev|W (ωve) |ve� (67) 13 C Équations for Abinit Workshop 2011 P (1, 2) = −iG(1, 2)G(2, 1+) (64) W (1, 2) = v(1, 2)− i � v(1, 3)P (3, 4)W (4, 2)d(3, 4) (65) Σ(1, 2) = iG(1, 2)W (2, 1+) (66) (T̂ + V̂ext + V̂H) |φn�+ Σ̂(εn) |φn� = εn |φn� Ĝ(ω) = � n |φn� �φn| ω − εn + iη sgn(εn − µ) P̂ (ω) = � v,c |φ∗ cφv� � 1 ω − (εc − εv − iη) − 1 ω + (εc − εv − iη) � �φvφ ∗ c | Ŵ (ω) = v̂ + v̂P̂ (ω)Ŵ (ω) Σ(r, r�, εn) = i 2π � ∞ −∞ dω eiηωG(r, r�,ω + εn)W (r, r�,ω) (T̂ + V̂ext + V̂H) |φn�+ V̂xc |φn� = εn |φn� P̂ (ω) = � v,c |φ∗ cφv� � 1 ω − (εc − εv − iη) − 1 ω + (εc − εv − iη) � �φvφ ∗ c | Ŵ (ω) = v̂ + v̂P̂ (ω)Ŵ (ω) εQP n ≈ εn + �φn| Σ̂(εQP n )− V̂xc |φn� Σ(εe) ≡ �φe| Σ̂(εe) |φe� = i 2π � n � +∞ −∞ dωeiηω �φeφ∗ n| Ŵ (ω) |φ∗ nφe� ω − ωne + iηsgn(�n − µ) = ΣSEX(εe) + ΣCOH(εe) ΣSEX(�e) = − � v �ev|W (ωve) |ve� (67) 13 C Équations for Abinit Workshop 2011 P (1, 2) = −iG(1, 2)G(2, 1+) (64) W (1, 2) = v(1, 2)− i � v(1, 3)P (3, 4)W (4, 2)d(3, 4) (65) Σ(1, 2) = iG(1, 2)W (2, 1+) (66) (T̂ + V̂ext + V̂H) |φn�+ Σ̂(εn) |φn� = εn |φn� Ĝ(ω) = � n |φn� �φn| ω − εn + iη sgn(εn − µ) P̂ (ω) = � v,c |φ∗ cφv� � 1 ω − (εc − εv − iη) − 1 ω + (εc − εv − iη) � �φvφ ∗ c | Ŵ (ω) = v̂ + v̂P̂ (ω)Ŵ (ω) Σ(r, r�, εn) = i 2π � ∞ −∞ dω eiηωG(r, r�,ω + εn)W (r, r�,ω) (T̂ + V̂ext + V̂H) |φn�+ V̂xc |φn� = εn |φn� P̂ (ω) = � v,c |φ∗ cφv� � 1 ω − (εc − εv − iη) − 1 ω + (εc − εv − iη) � �φvφ ∗ c | Ŵ (ω) = v̂ + v̂P̂ (ω)Ŵ (ω) εQP n ≈ εn + �φn| Σ̂(εQP n )− V̂xc |φn� Σ(εe) ≡ �φe| Σ̂(εe) |φe� = i 2π � n � +∞ −∞ dωeiηω �φeφ∗ n| Ŵ (ω) |φ∗ nφe� ω − ωne + iηsgn(�n − µ) = ΣSEX(εe) + ΣCOH(εe) ΣSEX(�e) = − � v �ev|W (ωve) |ve� (67) 13 C Équations for Abinit Workshop 2011 P (1, 2) = −iG(1, 2)G(2, 1+) (64) W (1, 2) = v(1, 2)− i � v(1, 3)P (3, 4)W (4, 2)d(3, 4) (65) Σ(1, 2) = iG(1, 2)W (2, 1+) (66) (T̂ + V̂ext + V̂H) |φn�+ Σ̂(εn) |φn� = εn |φn� Ĝ(ω) = � n |φn� �φn| ω − εn + iη sgn(εn − µ) P̂ (ω) = � v,c |φ∗ cφv� � 1 ω − (εc − εv − iη) − 1 ω + (εc − εv − iη) � �φvφ ∗ c | Ŵ (ω) = v̂ + v̂P̂ (ω)Ŵ (ω) Σ(r, r�, εn) = i 2π � ∞ −∞ dω eiηωG(r, r�,ω + εn)W (r, r�,ω) (T̂ + V̂ext + V̂H) |φn�+ V̂xc |φn� = εn |φn� P̂ (ω) = � v,c |φ∗ cφv� � 1 ω − (εc − εv − iη) − 1 ω + (εc − εv − iη) � �φvφ ∗ c | Ŵ (ω) = v̂ + v̂P̂ (ω)Ŵ (ω) εQP n ≈ εn + �φn| Σ̂(εQP n )− V̂xc |φn� Σ(εe) ≡ �φe| Σ̂(εe) |φe� = i 2π � n � +∞ −∞ dωeiηω �φeφ∗ n| Ŵ (ω) |φ∗ nφe� ω − ωne + iηsgn(�n − µ) = ΣSEX(εe) + ΣCOH(εe) ΣSEX(�e) = − � v �ev|W (ωve) |ve� (67) 13 C Équations for Abinit Workshop 2011 P (1, 2) = −iG(1, 2)G(2, 1+) (64) W (1, 2) = v(1, 2)− i � v(1, 3)P (3, 4)W (4, 2)d(3, 4) (65) Σ(1, 2) = iG(1, 2)W (2, 1+) (66) (T̂ + V̂ext + V̂H) |φn�+ Σ̂(εn) |φn� = εn |φn� Ĝ(ω) = � n |φn� �φn| ω − εn + iη sgn(εn − µ) P̂ (ω) = � v,c |φ∗ cφv� � 1 ω − (εc − εv − iη) − 1 ω + (εc − εv − iη) � �φvφ ∗ c | Ŵ (ω) = v̂ + v̂P̂ (ω)Ŵ (ω) Σ(r, r�, εn) = i 2π � ∞ −∞ dω eiηωG(r, r�,ω + εn)W (r, r�,ω) (T̂ + V̂ext + V̂H) |φn�+ V̂xc |φn� = εn |φn� εQP n ≈ εn + �φn| Σ̂(εQP n )− V̂xc |φn� P̂ (ω) = � v,c |φ∗ cφv� � 1 ω − (εc − εv − iη) − 1 ω + (εc − εv − iη) � �φvφ ∗ c | Ŵ (ω) = v̂ + v̂P̂ (ω)Ŵ (ω) Σ(εe) ≡ �φe| Σ̂(εe) |φe� = i 2π � n � +∞ −∞ dωeiηω �φeφ∗ n| Ŵ (ω) |φ∗ nφe� ω − (εn − εe − iη sgn(εn − µ)) = ΣSEX(εe) + ΣCOH(εe) ΣSEX(εe) = − � v �φeφ ∗ v| Ŵ (εv − εe) |φ∗ vφe� = − � v �φeφ ∗ v| (1− v̂P̂ (ωve)) −1v̂ |φ∗ vφe� (67) 13 Pôles of G(ω) Pôles of W(ω) Screened exchange In our work, only screened exchange is implemented (for now) : C Équations for Abinit Workshop 2011 P (1, 2) = −iG(1, 2)G(2, 1+) (64) W (1, 2) = v(1, 2)− i � v(1, 3)P (3, 4)W (4, 2)d(3, 4) (65) Σ(1, 2) = iG(1, 2)W (2, 1+) (66) (T̂ + V̂ext + V̂H) |φn�+ Σ̂(εn) |φn� = εn |φn� Ĝ(ω) = � n |φn� �φn| ω − εn + iη sgn(εn − µ) P̂ (ω) = � v,c |φ∗ cφv� � 1 ω − (εc − εv − iη) − 1 ω + (εc − εv − iη) � �φvφ ∗ c | Ŵ (ω) = v̂ + v̂P̂ (ω)Ŵ (ω) Σ(r, r�, εn) = i 2π � ∞ −∞ dω eiηωG(r, r�,ω + εn)W (r, r�,ω) (T̂ + V̂ext + V̂H) |φn�+ V̂xc |φn� = εn |φn� εQP n ≈ εn + �φn| Σ̂(εQP n )− V̂xc |φn� P̂ (ω) = ... Ŵ (ω) = ... Σ(εe) ≡ �φe| Σ̂(εe) |φe� = i 2π � n � +∞ −∞ dωeiηω �φeφ∗ n| Ŵ (ω) |φ∗ nφe� ω − (εn − εe − iη sgn(εn − µ)) = ΣSEX(εe) + ΣCOH(εe) ΣSEX(εe) = − � v �φeφ ∗ v| Ŵ (εv − εe) |φ∗ vφe� = − � v �φeφ ∗ v| (1− v̂P̂ (ωve)) −1v̂ |φ∗ vφe� (67) DFT → {φn, εn} → P̂ (ω) → Ŵ (ω) → ΣSEX(εe) (68) 13 Method, part 1 : Sternheimer equation Bottleneck 1 : sum over conduction states To obtain ΣSEX(εe), we need P(εv-εe). This requires a sum over conduction states : Problematic example : C60 Nc ≈ 10Nv = 1200 & ecut = 30 Ha ⇒ mpw = 300 000 {φc, εc} : 9 Gb RAM & days of CPU time C Équations for Abinit Workshop 2011 P (1, 2) = −iG(1, 2)G(2, 1+) (64) W (1, 2) = v(1, 2)− i � v(1, 3)P (3, 4)W (4, 2)d(3, 4) (65) Σ(1, 2) = iG(1, 2)W (2, 1+) (66) (T̂ + V̂ext + V̂H) |φn�+ Σ̂(εn) |φn� = εn |φn� Ĝ(ω) = � n |φn� �φn| ω − εn + iη sgn(εn − µ) P̂ (ω) = � v,c |φ∗ cφv� � 1 ω − (εc − εv − iη) − 1 ω + (εc − εv − iη) � �φvφ ∗ c | Ŵ (ω) = v̂ + v̂P̂ (ω)Ŵ (ω) Σ(r, r�, εn) = i 2π � ∞ −∞ dω eiηωG(r, r�,ω + εn)W (r, r�,ω) (T̂ + V̂ext + V̂H) |φn�+ V̂xc |φn� = εn |φn� εQP n ≈ εn + �φn| Σ̂(εQP n )− V̂xc |φn� P̂ (ω) = ... Ŵ (ω) = ... Σ(εe) ≡ �φe| Σ̂(εe) |φe� = i 2π � n � +∞ −∞ dωeiηω �φeφ∗ n| Ŵ (ω) |φ∗ nφe� ω − (εn − εe − iη sgn(εn − µ)) = ΣSEX(εe) + ΣCOH(εe) ΣSEX(εe) = − � v �φeφ ∗ v| Ŵ (εv − εe) |φ∗ vφe� = − � v �φeφ ∗ v| (1− v̂P̂ (ωve)) −1v̂ |φ∗ vφe� (67) DFT → {φn, εn} → P̂ (ω) → Ŵ (ω) → ΣSEX(εe) (68) 13 Solution to bottleneck 1 Idea : transform ∑c into a linear equation problem... Implementation of solution 1) H is sparse ⇒ iterative method 2) H - εv ± ω is indefinite ⇒ SYMMLQ instead of CG 3) Convergence slow ⇒ preconditionning (6x faster for benzene, antracene, pentacene, heptacene and C60) DFT → {φn, εn} → P̂ (ω) → Ŵ (ω) → ΣSEX(εe) (68) P̂ (ω) |ψ� = − � v,c |φ∗ cφv� � 1 εc − εv − ω − 1 εc − εv − ω � �φvφ ∗ c |ψ� ≡ � v |f+∗ v v�+ |f−∗ v v� (69) ⇒ |f± v � = − � c |φc� �φc| εc − εv ± ω |ψ∗φv� = − P̂c Ĥ − εv ± ω |ψ∗φv� (70) ⇒ (Ĥ − εv ± ω) |f± v � = −P̂c |ψ∗φv� (71) References [1] P. Umari, G. Stenuit, S. Baroni, Phys. Rev. B 81 115104 (2010) [2] P. Umari, G. Stenuit, S. Baroni, Phys. Rev. B 79 20 (2009) [3] P. Giustino, M.L. Cohen, S.G. Louie, Phys. Rev. B 81 11 (2010) [4] A. Frommer Computing 70 pp. 87-109 (2003) 14 DFT → {φn, εn} → P̂ (ω) → Ŵ (ω) → ΣSEX(εe) (68) P̂ (ω) |ψ� = − � v,c |φ∗ cφv� � 1 εc − εv − ω − 1 εc − εv − ω � �φvφ ∗ c |ψ� ≡ � v |f+∗ v v�+ |f−∗ v v� (69) ⇒ |f± v � = − � c |φc� �φc| εc − εv ± ω |ψ∗φv� = − � c |φc� �φc| Ĥ − εv ± ω |ψ∗φv� = − P̂c Ĥ − εv ± ω |ψ∗φv� (70) ⇒ (Ĥ − εv ± ω) |f± v � = −P̂c |ψ∗φv� (71) References [1] P. Umari, G. Stenuit, S. Baroni, Phys. Rev. B 81 115104 (2010) [2] P. Umari, G. Stenuit, S. Baroni, Phys. Rev. B 79 20 (2009) [3] P. Giustino, M.L. Cohen, S.G. Louie, Phys. Rev. B 81 11 (2010) [4] A. Frommer Computing 70 pp. 87-109 (2003) 14 Eliminating DFT → {φn, εn} → P̂ (ω) → Ŵ (ω) → ΣSEX(εe) (68) P̂ (ω) |ψ� = − � v,c |φ∗ cφv� � 1 εc − εv − ω − 1 εc − εv − ω � �φvφ ∗ c |ψ� ≡ � v |f+∗ v v�+ |f−∗ v v� (69) ⇒ |f± v � = − � c |φc� �φc| εc − εv ± ω |ψ∗φv� = − � c |φc� �φc| Ĥ − εv ± ω |ψ∗φv� = − P̂c Ĥ − εv ± ω |ψ∗φv� (70) ⇒ (Ĥ − εv ± ω) |f± v � = −P̂c |ψ∗φv� (71) References [1] P. Umari, G. Stenuit, S. Baroni, Phys. Rev. B 81 115104 (2010) [2] P. Umari, G. Stenuit, S. Baroni, Phys. Rev. B 79 20 (2009) [3] P. Giustino, M.L. Cohen, S.G. Louie, Phys. Rev. B 81 11 (2010) [4] A. Frommer Computing 70 pp. 87-109 (2003) 14 (Sternheimer’s equation) Method, part 2 : Lanczos algorithm Bottleneck 2 : dielectric matrix inversion To obtain W(ω), one usually invert the dielectric matrix : Problematic example : C60 again ecuteps = 3 Ha ⇒ 10 000 x 10 000 dielectric matrix ε-1 ⇒ 1,5 Gb RAM & days CPU time C Équations for Abinit Workshop 2011 P (1, 2) = −iG(1, 2)G(2, 1+) (64) W (1, 2) = v(1, 2)− i � v(1, 3)P (3, 4)W (4, 2)d(3, 4) (65) Σ(1, 2) = iG(1, 2)W (2, 1+) (66) (T̂ + V̂ext + V̂H) |φn�+ Σ̂(εn) |φn� = εn |φn� Ĝ(ω) = � n |φn� �φn| ω − εn + iη sgn(εn − µ) P̂ (ω) = � v,c |φ∗ cφv� � 1 ω − (εc − εv − iη) − 1 ω + (εc − εv − iη) � �φvφ ∗ c | Ŵ (ω) = v̂ + v̂P̂ (ω)Ŵ (ω) Σ(r, r�, εn) = i 2π � ∞ −∞ dω eiηωG(r, r�,ω + εn)W (r, r�,ω) (T̂ + V̂ext + V̂H) |φn�+ V̂xc |φn� = εn |φn� εQP n ≈ εn + �φn| Σ̂(εQP n )− V̂xc |φn� P̂ (ω) = ... Ŵ (ω) = ... Σ(εe) ≡ �φe| Σ̂(εe) |φe� = i 2π � n � +∞ −∞ dωeiηω �φeφ∗ n| Ŵ (ω) |φ∗ nφe� ω − (εn − εe − iη sgn(εn − µ)) = ΣSEX(εe) + ΣCOH(εe) ΣSEX(εe) = − � v �φeφ ∗ v| Ŵ (εv − εe) |φ∗ vφe� = − � v �φeφ ∗ v| (1− v̂P̂ (ωve)) −1v̂ |φ∗ vφe� (67) 13 DFT → {φn, εn} → P̂ (ω) → Ŵ (ω) → ΣSEX(εe) (68) P̂ (ω) |ψ� = − � v,c |φ∗ cφv� � 1 εc − εv − ω − 1 εc − εv − ω � �φvφ ∗ c |ψ� ≡ � v |f+∗ v v�+ |f−∗ v v� (69) ⇒ |f± v � = − � c |φc� �φc| εc − εv ± ω |ψ∗φv� = − � c |φc� �φc| Ĥ − εv ± ω |ψ∗φv� = − P̂c Ĥ − εv ± ω |ψ∗φv� (70) ⇒ (Ĥ − εv ± ω) |f± v � = −P̂c |ψ∗φv� (71) ⇒ (1− v̂P̂ (ω))Ŵ (ω) ≡ �̂(ω)Ŵ (ω) = v̂ (72) ⇒ Ŵ (ω) = �̂−1(ω) v̂ (73) References [1] P. Umari, G. Stenuit, S. Baroni, Phys. Rev. B 81 115104 (2010) [2] P. Umari, G. Stenuit, S. Baroni, Phys. Rev. B 79 20 (2009) [3] P. Giustino, M.L. Cohen, S.G. Louie, Phys. Rev. B 81 11 (2010) [4] A. Frommer Computing 70 pp. 87-109 (2003) 14 DFT → {φn, εn} → P̂ (ω) → Ŵ (ω) → ΣSEX(εe) (68) P̂ (ω) |ψ� = − � v,c |φ∗ cφv� � 1 εc − εv − ω − 1 εc − εv − ω � �φvφ ∗ c |ψ� ≡ � v |f+∗ v v�+ |f−∗ v v� (69) ⇒ |f± v � = − � c |φc� �φc| εc − εv ± ω |ψ∗φv� = − � c |φc� �φc| Ĥ − εv ± ω |ψ∗φv� = − P̂c Ĥ − εv ± ω |ψ∗φv� (70) ⇒ (Ĥ − εv ± ω) |f± v � = −P̂c |ψ∗φv� (71) ⇒ (1− v̂P̂ (ω))Ŵ (ω) ≡ �̂(ω)Ŵ (ω) = v̂ (72) ⇒ Ŵ (ω) = �̂−1(ω) v̂ (73) References [1] P. Umari, G. Stenuit, S. Baroni, Phys. Rev. B 81 115104 (2010) [2] P. Umari, G. Stenuit, S. Baroni, Phys. Rev. B 79 20 (2009) [3] P. Giustino, M.L. Cohen, S.G. Louie, Phys. Rev. B 81 11 (2010) [4] A. Frommer Computing 70 pp. 87-109 (2003) 14 Solution to bottleneck 2 Idea : Use geometrical serie to express matrix inversion... Problem : convergence slow... DFT → {φn, εn} → P̂ (ω) → Ŵ (ω) → ΣSEX(εe) (68) P̂ (ω) |ψ� = − � v,c |φ∗ cφv� � 1 εc − εv − ω − 1 εc − εv − ω � �φvφ ∗ c |ψ� ≡ � v |f+∗ v v�+ |f−∗ v v� (69) ⇒ |f± v � = − � c |φc� �φc| εc − εv ± ω |ψ∗φv� = − � c |φc� �φc| Ĥ − εv ± ω |ψ∗φv� = − P̂c Ĥ − εv ± ω |ψ∗φv� (70) ⇒ (Ĥ − εv ± ω) |f± v � = −P̂c |ψ∗φv� (71) ⇒ (1− v̂P̂ (ω))Ŵ (ω) ≡ �̂(ω)Ŵ (ω) = v̂ (72) ⇒ Ŵ (ω) = �̂−1(ω) v̂ (73) Ŵ (ω) = (1− v̂P̂ (ω))−1 v̂ = ∞� m=0 (v̂P̂ (ω))mv̂ (74) ΣSEX(εe) = − � v ∞� m=0 �φeφ ∗ v| (v̂P̂ (εv − εe)) m v̂ |φ∗ vφe� (75) References [1] P. Umari, G. Stenuit, S. Baroni, Phys. Rev. B 81 115104 (2010) [2] P. Umari, G. Stenuit, S. Baroni, Phys. Rev. B 79 20 (2009) [3] P. Giustino, M.L. Cohen, S.G. Louie, Phys. Rev. B 81 11 (2010) [4] A. Frommer Computing 70 pp. 87-109 (2003) 14 DFT → {φn, εn} → P̂ (ω) → Ŵ (ω) → ΣSEX(εe) (68) P̂ (ω) |ψ� = − � v,c |φ∗ cφv� � 1 εc − εv − ω − 1 εc − εv − ω � �φvφ ∗ c |ψ� ≡ � v |f+∗ v v�+ |f−∗ v v� (69) ⇒ |f± v � = − � c |φc� �φc| εc − εv ± ω |ψ∗φv� = − � c |φc� �φc| Ĥ − εv ± ω |ψ∗φv� = − P̂c Ĥ − εv ± ω |ψ∗φv� (70) ⇒ (Ĥ − εv ± ω) |f± v � = −P̂c |ψ∗φv� (71) ⇒ (1− v̂P̂ (ω))Ŵ (ω) ≡ �̂(ω)Ŵ (ω) = v̂ (72) ⇒ Ŵ (ω) = �̂−1(ω) v̂ (73) Ŵ (ω) = (1− v̂P̂ (ω))−1 v̂ = ∞� m=0 (v̂P̂ (ω))mv̂ (74) ⇒ ΣSEX(εe) = − � v ∞� m=0 �φeφ ∗ v| (v̂P̂ (εv − εe)) m v̂ |φ∗ vφe� (75) References [1] P. Umari, G. Stenuit, S. Baroni, Phys. Rev. B 81 115104 (2010) [2] P. Umari, G. Stenuit, S. Baroni, Phys. Rev. B 79 20 (2009) [3] P. Giustino, M.L. Cohen, S.G. Louie, Phys. Rev. B 81 11 (2010) [4] A. Frommer Computing 70 pp. 87-109 (2003) 14 0 5 10 15 20 10−16 10−14 10−12 10−10 10−8 10−6 10−4 10−2 100 Convergence of screened exchange contribution to HOMO energy of silane Number of iterations C on ve rg en ce (H a) Power series method precision to 10-5 Method, part 2 : Lanczos algorithm Better solution to bottleneck 2 Power series version of screened exchange : Idea : Tridiagonalize the operator in (), taking the vector it acts on as the first vector of tridiagonal basis {qi}. Applying tridiagonal operator m times gives schematically : Constructing a kmax x kmax tridiagonal matrix costs the same as iterating mmax = kmax - 1 times the power series. But a T matrix of kmax dimensions contains a more precise estimate of ΣSEX than a power series with mmax = kmax - 1. Keeping kmax finite but letting mmax → ∞, we have : Which converges a lot faster : DFT → {φn, εn} → P̂ (ω) → Ŵ (ω) → ΣSEX(εe) (68) P̂ (ω) |ψ� = − � v,c |φ∗ cφv� � 1 εc − εv − ω − 1 εc − εv − ω � �φvφ ∗ c |ψ� ≡ � v |f+∗ v v�+ |f−∗ v v� (69) ⇒ |f± v � = − � c |φc� �φc| εc − εv ± ω |ψ∗φv� = − � c |φc� �φc| Ĥ − εv ± ω |ψ∗φv� = − P̂c Ĥ − εv ± ω |ψ∗φv� (70) ⇒ (Ĥ − εv ± ω) |f± v � = −P̂c |ψ∗φv� (71) ⇒ (1− v̂P̂ (ω))Ŵ (ω) ≡ �̂(ω)Ŵ (ω) = v̂ (72) ⇒ Ŵ (ω) = �̂−1(ω) v̂ (73) Ŵ (ω) = (1− v̂P̂ (ω))−1 v̂ = ∞� m=0 (v̂P̂ (ω))mv̂ (74) ⇒ ΣSEX(εe) = − � v ∞� m=0 �φeφ ∗ v| (v̂P̂ (εv − εe)) m v̂ |φ∗ vφe� (75) ΣSEX(εe) = − ∞� m=0 � v �φeφ ∗ v| √ v̂( √ v̂P̂ (ωve) √ v̂)m √ v̂ |φ∗ vφe� (76) 14 |q1� |q2� |q3� |q4� α1 α2 α3 β1 β2 β3 kmax ... ... ... ... α1 α1 α2 β1 β1 β2 mmax ΣSEX(εe) = − ∞� m=0 � v � ij �φeφ ∗ v| √ v̂|qi� T̂m kmax �qj| √ v̂|φ∗ vφe� = − ∞� m=0 � v � 1 0 · · · 0 �   α1 β1 · · · 0 β1 α2 . . . ... . . . . . . . . . ... . . . . . . βkmax−1 0 · · · βkmax−1 αkmax   m   1 0 ... 0   References [1] P. Umari, G. Stenuit, S. Baroni, Phys. Rev. B 81 115104 (2010) [2] P. Umari, G. Stenuit, S. Baroni, Phys. Rev. B 79 20 (2009) [3] P. Giustino, M.L. Cohen, S.G. Louie, Phys. Rev. B 81 11 (2010) [4] A. Frommer Computing 70 pp. 87-109 (2003) 15 ΣSEX(εe) = − ∞� m=0 � v � ij �φeφ ∗ v| √ v̂|qi� T̂m kmax �qj| √ v̂|φ∗ vφe� = − ∞� m=0 � v � 1 0 · · · 0 �   α1 β1 · · · 0 β1 α2 . . . ... . . . . . . . . . ... . . . . . . βkmax−1 0 · · · βkmax−1 αkmax   m   1 0 ... 0   ∞� m=0 ( √ v̂P̂ (ωve) √ v̂)m ≈ ∞� m=0 T̂m kmax = (1− T̂kmax) −1 (77) ΣSEX(εe) ≈ − � v � ij �φeφ ∗ v| √ v̂|qi� (1− T̂kmax) −1 ij �qj| √ v̂|φ∗ vφe� References [1] P. Umari, G. Stenuit, S. Baroni, Phys. Rev. B 81 115104 (2010) [2] P. Umari, G. Stenuit, S. Baroni, Phys. Rev. B 79 20 (2009) [3] P. Giustino, M.L. Cohen, S.G. Louie, Phys. Rev. B 81 11 (2010) [4] A. Frommer Computing 70 pp. 87-109 (2003) 15 ΣSEX(εe) = − ∞� m=0 � v � ij �φeφ ∗ v| √ v̂|qi� T̂m kmax �qj| √ v̂|φ∗ vφe� = − ∞� m=0 � v � 1 0 · · · 0 �   α1 β1 · · · 0 β1 α2 . . . ... . . . . . . . . . ... . . . . . . βkmax−1 0 · · · βkmax−1 αkmax   m   1 0 ... 0   ∞� m=0 ( √ v̂P̂ (ωve) √ v̂)m ≈ ∞� m=0 T̂m kmax = (1− T̂kmax) −1 (77) ΣSEX(εe) ≈ − � v � ij �φeφ ∗ v| √ v̂|qi� (1− T̂kmax) −1 ij �qj| √ v̂|φ∗ vφe� References [1] P. Umari, G. Stenuit, S. Baroni, Phys. Rev. B 81 115104 (2010) [2] P. Umari, G. Stenuit, S. Baroni, Phys. Rev. B 79 20 (2009) [3] P. Giustino, M.L. Cohen, S.G. Louie, Phys. Rev. B 81 11 (2010) [4] A. Frommer Computing 70 pp. 87-109 (2003) 15 0 5 10 15 20 10−16 10−14 10−12 10−10 10−8 10−6 10−4 10−2 100 Convergence of screened exchange contribution to HOMO energy of silane Number of iterations C on ve rg en ce (H a) Power series method Lanczos method precision to 10-5 Conclusion - ∑SEX(εe) implemented without {φc, εc} and without substantial time spent on matrix inversion. - Preconditionning in SYMMLQ causes 6x increase of speed in organic systems. - Lanczos method is dramatically faster that power method. DFT → {φn, εn} → P̂ (ω) → Ŵ (ω) → ΣSEX(εe) (68) P̂ (ω) |ψ� = − � v,c |φ∗ cφv� � 1 εc − εv − ω + 1 εc − εv + ω � �φvφ ∗ c |ψ� ≡ � v |f+∗ v v�+ |f−∗ v v� (69) ⇒ |f± v � = − � c |φc� �φc| εc − εv ± ω |ψ∗φv� = − � c |φc� �φc| Ĥ − εv ± ω |ψ∗φv� = − P̂c Ĥ − εv ± ω |ψ∗φv� (70) ⇒ (Ĥ − εv ± ω) |f± v � = −P̂c |ψ∗φv� (71) ⇒ (1− v̂P̂ (ω))Ŵ (ω) ≡ �̂(ω)Ŵ (ω) = v̂ (72) ⇒ Ŵ (ω) = �̂−1(ω) v̂ (73) Ŵ (ω) = (1− v̂P̂ (ω))−1 v̂ = ∞� m=0 (v̂P̂ (ω))mv̂ (74) ⇒ ΣSEX(εe) = − � v ∞� m=0 �φeφ ∗ v| (v̂P̂ (εv − εe)) m v̂ |φ∗ vφe� (75) ΣSEX(εe) = − ∞� m=0 � v �φeφ ∗ v| √ v̂( √ v̂P̂ (ωve) √ v̂)m √ v̂ |φ∗ vφe� (76) 14