
Impact of spintronic memory on multicore cache hierarchy design

IET Computers & Digital Techniques

Spintronic memory, in the form of spin-transfer torque magnetic random access memory (STT-MRAM), is an attractive alternative to CMOS memory because it offers higher density and virtually no leakage current. However, it still requires higher write energy, which poses a challenge for memory hierarchy design when energy consumption is a concern. This study motivates the use of STT-MRAM for the first-level caches of a multicore processor to reduce energy consumption without significantly degrading performance. A large STT-MRAM first-level cache saves leakage power, and a small level-0 cache recovers the performance lost to the long write latency of STT-MRAM. Together, they reduce the energy-delay product by 65% on average compared with a CMOS baseline. The proposed STT-MRAM hierarchy also scales well relative to the CMOS design, and a few benchmarks scale significantly better. The PARSEC and Splash2 benchmark suites are analysed on a modern multicore platform, comparing the performance, energy consumption and scalability of the spintronic cache system with those of a CMOS design.
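
The energy-delay product (EDP) quoted in the abstract is conventionally the product of the energy a run consumes and the time it takes, so a design can lower it even when it runs somewhat slower, provided the energy saving is larger. The sketch below is only an illustration of that arithmetic: the function names and the energy and delay figures are hypothetical and are not taken from the article.

```python
def energy_delay_product(energy_joules: float, delay_seconds: float) -> float:
    """Return the energy-delay product (lower is better)."""
    return energy_joules * delay_seconds


def edp_reduction(baseline_edp: float, proposed_edp: float) -> float:
    """Return the fractional EDP reduction of a proposed design over a baseline."""
    return 1.0 - proposed_edp / baseline_edp


# Hypothetical figures, chosen only to illustrate the calculation.
cmos_edp = energy_delay_product(energy_joules=4.0, delay_seconds=1.0)
# The assumed STT-MRAM design is slower (1.4 s) but uses far less energy (1.0 J).
stt_edp = energy_delay_product(energy_joules=1.0, delay_seconds=1.4)

reduction = edp_reduction(cmos_edp, stt_edp)
print(f"EDP reduction: {reduction:.0%}")
```

With these assumed inputs, a 40% longer run time is outweighed by a fourfold energy saving, which is the kind of trade-off the abstract describes for the STT-MRAM hierarchy with a level-0 cache.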

http://iet.metastore.ingenta.com/content/journals/10.1049/iet-cdt.2015.0190