GPU-based LU decomposition for large method of moments problems

Electronics Letters

In the method of moments (MOM) analysis of electromagnetic phenomena, the LU decomposition is often an important and costly step in the solution process. In this reported work, the acceleration of the LU decomposition using graphics processing units (GPUs) is considered. Although existing GPU methods, such as those supplied by MAGMA, provide significant speedup over CPU-only implementations, the amount of available device memory limits them to smaller problems. The method presented takes a left-looking LU decomposition as a starting point and uses an out-of-core-like approach to significantly increase the size of the problems that can be solved. In addition, a hybrid implementation that utilises MAGMA as part of the solution process is presented, further improving the performance of the method. For the double precision complex variant of the LU decomposition, the number of MOM degrees of freedom that can be solved using a solver based on MAGMA and a GPU device with 1 GB of memory is limited to 7936. The presented panel-based and hybrid approaches have already permitted problems more than four times larger to be solved with significant speedup.
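
As an illustration of the left-looking, panel-based idea described above, the following is a minimal sketch in plain Python. It is not the authors' implementation: it omits pivoting, runs entirely on the CPU, and the names `left_looking_lu`, `multiply_lu` and the panel width `nb` are hypothetical. What it aims to show is the access pattern that makes an out-of-core-like scheme possible: each panel is first updated by all previously factored panels to its left and only then factored, so only one panel is written to at any time.

```python
def left_looking_lu(a, nb):
    """Factor the square matrix `a` (list of lists) in place as a = L * U.

    Assumptions for this sketch: no pivoting (so every pivot must be
    non-zero, e.g. a diagonally dominant matrix) and a hypothetical panel
    width `nb`. L is unit lower triangular and is stored below the
    diagonal; U is stored on and above the diagonal.
    """
    n = len(a)
    for j0 in range(0, n, nb):
        j1 = min(j0 + nb, n)  # current panel holds columns j0 .. j1-1

        # Step 1: apply every previously factored panel to the current one.
        # Only the current panel is modified; earlier panels are read-only.
        for k0 in range(0, j0, nb):
            k1 = min(k0 + nb, j0)
            # Forward substitution with the unit lower triangular block
            # of panel k, giving the rows k0 .. k1-1 of U in this panel.
            for k in range(k0, k1):
                for i in range(k + 1, k1):
                    lik = a[i][k]
                    for j in range(j0, j1):
                        a[i][j] -= lik * a[k][j]
            # Update of all rows below that block using the new rows of U.
            for i in range(k1, n):
                for k in range(k0, k1):
                    lik = a[i][k]
                    for j in range(j0, j1):
                        a[i][j] -= lik * a[k][j]

        # Step 2: factor the updated panel (rows j0 .. n-1) column by column.
        for k in range(j0, j1):
            pivot = a[k][k]
            for i in range(k + 1, n):
                a[i][k] /= pivot
                lik = a[i][k]
                for j in range(k + 1, j1):
                    a[i][j] -= lik * a[k][j]
    return a


def multiply_lu(lu):
    """Rebuild L * U from the packed factors, for checking the result."""
    n = len(lu)
    out = [[0j] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            total = 0j
            for k in range(min(i, j) + 1):
                l_ik = 1.0 if i == k else lu[i][k]  # unit diagonal of L
                total += l_ik * lu[k][j]
            out[i][j] = total
    return out
```

Calling `left_looking_lu` on a copy of a small diagonally dominant complex matrix and passing the result to `multiply_lu` should reproduce the original matrix to rounding error, whatever panel width is chosen. For scale, a dense double precision complex matrix uses 16 bytes per entry, so 7936 unknowns require 7936² × 16 bytes, which is 961 MiB and consistent with the 1 GB device limit quoted above.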


