Impact of spintronic memory on multicore cache hierarchy design

IET Computers & Digital Techniques

Spintronic memory, specifically spin-transfer torque magnetic random access memory (STT-MRAM), is an attractive alternative to CMOS memory because it offers higher density and virtually no leakage current. However, STT-MRAM still requires higher write energy, which presents a challenge to memory hierarchy design when energy consumption is a concern. This study motivates the use of STT-MRAM for the first-level caches of a multicore processor to reduce energy consumption without significantly degrading performance. A large STT-MRAM first-level cache saves leakage power, and a small level-0 cache recovers the performance lost to the long write latency of STT-MRAM. Together, the two reduce the energy-delay product by 65% on average compared with a CMOS baseline. The proposed STT-MRAM hierarchy also scales well relative to the CMOS design, and a few benchmarks scale significantly better. The PARSEC and Splash2 benchmark suites are analysed running on a modern multicore platform, comparing the performance, energy consumption and scalability of the spintronic cache system with those of a CMOS design.
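
The energy-delay product (EDP) cited above is conventionally the product of the energy consumed and the execution time, so a design that runs slightly slower can still achieve a lower EDP if it saves enough energy. The following is a minimal sketch of that arithmetic. The function names and all numeric values are illustrative assumptions and are not taken from the study.

```python
def energy_delay_product(energy_joules: float, delay_seconds: float) -> float:
    """Return the energy-delay product (EDP = energy x delay)."""
    return energy_joules * delay_seconds


def edp_reduction(baseline_edp: float, candidate_edp: float) -> float:
    """Return the fractional EDP reduction of a candidate relative to a baseline."""
    return 1.0 - candidate_edp / baseline_edp


# Hypothetical inputs (assumptions for illustration only, not measured data):
# a CMOS baseline, and an STT-MRAM hierarchy that uses less energy
# but takes slightly longer to run the same workload.
cmos_edp = energy_delay_product(energy_joules=10.0, delay_seconds=1.00)
stt_edp = energy_delay_product(energy_joules=3.5, delay_seconds=1.05)

reduction = edp_reduction(cmos_edp, stt_edp)
print(f"EDP reduction relative to baseline: {reduction:.1%}")
```

Because EDP weights energy and delay equally, it penalises designs that save energy only by running much slower, which is why it suits a comparison where write latency and leakage savings pull in opposite directions.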

http://iet.metastore.ingenta.com/content/journals/10.1049/iet-cdt.2015.0190