© The Institution of Engineering and Technology
Spintronic memory, in the form of spin-transfer torque magnetic random access memory (STT-MRAM), is an attractive alternative technology to CMOS since it offers higher density and virtually no leakage current. However, spintronic memory still requires higher write energy, which presents a challenge to memory hierarchy design when energy consumption is a concern. This study motivates the use of STT-MRAM for the first-level caches of a multicore processor to reduce energy consumption without significantly degrading performance. A large STT-MRAM first-level cache reduces leakage power, and a small level-0 cache recovers the performance lost to the long write latencies of STT-MRAM. Together, they reduce the energy-delay product by 65% on average compared with the CMOS baseline. The proposed STT hierarchy also scales well relative to the CMOS design, with a few benchmarks scaling significantly better. The PARSEC and Splash2 benchmark suites are analysed on a modern multicore platform to compare the performance, energy consumption and scalability of the spintronic cache system with those of a CMOS design.
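As a rough illustration of the energy-delay product (EDP) metric quoted above, the sketch below computes the relative EDP reduction of one design against a baseline. The function names and the input figures are hypothetical and are not taken from the study; they only show how a lower energy can outweigh a slightly longer run time.

```python
def energy_delay_product(energy_joules: float, delay_seconds: float) -> float:
    """Return the energy-delay product (energy * delay); lower is better."""
    if energy_joules < 0 or delay_seconds < 0:
        raise ValueError("energy and delay must be non-negative")
    return energy_joules * delay_seconds


def edp_reduction(base_energy: float, base_delay: float,
                  new_energy: float, new_delay: float) -> float:
    """Return the fractional EDP reduction of the new design versus the baseline."""
    base_edp = energy_delay_product(base_energy, base_delay)
    if base_edp == 0:
        raise ValueError("baseline EDP must be non-zero")
    new_edp = energy_delay_product(new_energy, new_delay)
    return 1.0 - new_edp / base_edp


if __name__ == "__main__":
    # Hypothetical figures for illustration only (not results from the study):
    # the new design halves the energy but runs 10% longer.
    reduction = edp_reduction(base_energy=2.0, base_delay=1.0,
                              new_energy=1.0, new_delay=1.1)
    print(f"EDP reduction: {reduction:.1%}")
```

Because EDP multiplies energy by delay, a design that saves energy but runs slower is only credited with the net effect, which is why the metric suits a trade-off such as lower leakage against longer write latency.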