Performance prediction and procurement in practice: assessing the suitability of commodity cluster components for wavefront codes

IET Software

The cost of state-of-the-art supercomputing resources makes each individual purchase a lengthy and expensive process. Each candidate architecture often needs to be benchmarked with a variety of tools to assess its likely performance. However, benchmarking alone provides only limited insight into the suitability of each architecture for key codes, and can give misleading results when assessing scalability. In this study the authors present a case study applying recently developed performance models of Chimaera, a benchmarking code written by the United Kingdom Atomic Weapons Establishment (AWE), to analyse how the code will perform and scale on a medium-sized, commodity-based InfiniBand cluster. The models are validated against an existing InfiniBand machine, where they demonstrate greater than 90% accuracy. They are then used to predict code performance on a variety of alternative hardware configurations, including changes to the underlying network, faster processors and a higher core density per processor. The results demonstrate the compute-bound nature of Chimaera and its sensitivity to network latency at increased processor counts. Using these insights, the authors discuss potential strategies that may be employed during the procurement of future mid-range clusters for wavefront-rich workloads.

http://iet.metastore.ingenta.com/content/journals/10.1049/iet-sen.2009.0007