Exploiting parallelism in configurable architectures through custom array mapping

N. Baradaran; P.C. Diniz

Author(s): N. Baradaran ¹ and P.C. Diniz ²
Affiliations:
  1: Information Sciences Institute, University of Southern California, USA
  2: INESC-ID, Technical University of Lisbon, Lisbon, Portugal
Source: IET Computers & Digital Techniques, Volume 1, Issue 4, July 2007, pp. 303–311
DOI: 10.1049/iet-cdt:20060181, Print ISSN 1751-8601, Online ISSN 1751-861X

Configurable architectures offer the unique opportunity to customise storage allocation to meet the needs of specific applications. This paper describes a compiler approach that maps the arrays of a loop-based computation to the internal memories of a configurable architecture, with the objective of minimising the overall execution time. The mapping algorithm considers the data access patterns of the arrays along the critical path of the computation, as well as the available storage and memory bandwidth. Experimental results demonstrate the approach on a set of kernel codes targeting a field-programmable gate array (FPGA). The results show that the proposed algorithm outperforms the naive and custom data layout techniques by an average of 33% and 15%, respectively, in terms of execution time, while taking the available hardware resources into account.
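
To make the idea of capacity-aware array mapping concrete, the following is a minimal sketch in Python. It is not the authors' algorithm, which the abstract does not specify in detail. It assumes a simple greedy heuristic: arrays with the most critical-path accesses per word of storage are placed first, each into the internal memory bank with the most free space, and arrays that do not fit fall back to external memory. The names `ArrayInfo`, `critical_accesses`, `map_arrays`, `num_banks` and `bank_words` are hypothetical and introduced only for illustration.

```python
from dataclasses import dataclass


@dataclass
class ArrayInfo:
    name: str
    size_words: int          # storage the array needs (assumed > 0)
    critical_accesses: int   # accesses on the critical path (hypothetical metric)


def map_arrays(arrays, num_banks, bank_words):
    """Greedy illustration: hottest arrays per word get internal banks first."""
    free = [bank_words] * num_banks   # remaining capacity of each internal bank
    mapping = {}
    # Rank arrays by critical-path accesses per word of storage, highest first.
    ranked = sorted(arrays,
                    key=lambda a: a.critical_accesses / a.size_words,
                    reverse=True)
    for a in ranked:
        # Pick the bank with the most free space, so that frequently accessed
        # arrays tend to land in different banks and can be read in parallel.
        bank = max(range(num_banks), key=lambda b: free[b])
        if a.size_words <= free[bank]:
            free[bank] -= a.size_words
            mapping[a.name] = f"bank{bank}"
        else:
            # Not enough internal storage left: fall back to external memory.
            mapping[a.name] = "external"
    return mapping


if __name__ == "__main__":
    kernel_arrays = [
        ArrayInfo("A", 100, 400),
        ArrayInfo("B", 100, 200),
        ArrayInfo("C", 300, 30),
    ]
    print(map_arrays(kernel_arrays, num_banks=2, bank_words=128))
```

In this made-up example, the two heavily accessed arrays are assigned to separate internal banks, while the large, rarely accessed array stays in external memory. A real compiler pass of the kind the abstract describes would also need to model memory bandwidth (ports per bank) and the schedule of the loop body, which this sketch leaves out.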
