
Automatic management of Software Programmable Memories in Many-core Architectures

IET Computers & Digital Techniques


Software Programmable Memories (SPMs) are raw on-chip memories that are managed explicitly by software rather than implicitly by the processor hardware. For example, whereas caches fetch data from memory automatically and maintain coherence with other caches, data movement between an SPM and main memory or other SPMs must be performed explicitly through software instructions. SPMs make the design of on-chip memories simpler, more scalable, and more power efficient, but they also place an additional burden on the programmers of SPM-based processors. Traditionally, SPMs have been utilised in embedded systems, especially multimedia and gaming systems, but research on SPM-based systems has recently attracted increased interest as a means of solving the memory scaling challenges of many-core architectures. This study presents an overview of the state of the art in SPM management techniques for many-core processors, summarises some recent research on SPM-based systems, and outlines future research directions in this field.

http://iet.metastore.ingenta.com/content/journals/10.1049/iet-cdt.2016.0024