Decoupling the programming model from resource management in throughput processors

Book: Many-Core Computing: Hardware and Software


This chapter introduces Zorua, a new resource virtualization framework that decouples the graphics processing unit (GPU) programming model from the management of key on-chip resources in hardware to enhance programming ease, portability, and performance. The application resource specification, a static specification of several parameters such as the number of threads and the scratchpad memory usage per thread block, forms a critical component of existing GPU programming models. This specification determines the parallelism, and hence the performance, of the application during execution because the corresponding on-chip hardware resources are allocated and managed purely based on this specification. This tight coupling between the software-provided resource specification and resource management in hardware leads to significant challenges in programming ease, portability, and performance, as we demonstrate in this chapter using real data obtained on state-of-the-art GPU systems. Our goal in this work is to reduce the dependence of performance on the software-provided static resource specification and thereby alleviate all three challenges simultaneously. To this end, Zorua decouples the programmer-specified resource usage of a GPU application from the actual allocation of the on-chip hardware resources. Zorua enables this decoupling by virtualizing each resource transparently to the programmer. The virtualization provided by Zorua builds on two key concepts: dynamic allocation of the on-chip resources and their oversubscription using a swap space in memory.
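To illustrate how a static resource specification can determine parallelism, the following minimal sketch computes how many thread blocks fit on one GPU core under a simplified model. The capacities, class names, and the `resident_blocks` helper are illustrative assumptions, not values or interfaces taken from the chapter, and real hardware applies additional allocation granularity rules that are ignored here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CoreLimits:
    """Per-core capacities (assumed example values, not from the chapter)."""
    thread_slots: int = 2048
    registers: int = 65536
    scratchpad_bytes: int = 49152


@dataclass(frozen=True)
class ResourceSpec:
    """Static, programmer-provided specification for one kernel."""
    threads_per_block: int
    registers_per_thread: int
    scratchpad_per_block: int


def resident_blocks(spec: ResourceSpec, core: CoreLimits) -> int:
    """Blocks that fit on one core: the tightest of the three resource limits."""
    by_threads = core.thread_slots // spec.threads_per_block
    by_registers = core.registers // (
        spec.registers_per_thread * spec.threads_per_block
    )
    if spec.scratchpad_per_block == 0:
        by_scratchpad = by_threads  # scratchpad is not a constraint
    else:
        by_scratchpad = core.scratchpad_bytes // spec.scratchpad_per_block
    return min(by_threads, by_registers, by_scratchpad)


core = CoreLimits()
# Two specifications that differ by a single register per thread.
spec_a = ResourceSpec(threads_per_block=256, registers_per_thread=32,
                      scratchpad_per_block=4096)
spec_b = ResourceSpec(threads_per_block=256, registers_per_thread=33,
                      scratchpad_per_block=4096)
print(resident_blocks(spec_a, core), resident_blocks(spec_b, core))
```

Under these assumed capacities, one extra register per thread removes a whole thread block (256 threads) from the core. This is the kind of abrupt dependence on the specification that the chapter aims to reduce.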
Zorua provides a holistic GPU resource virtualization strategy designed to (i) adaptively control the extent of oversubscription and (ii) coordinate the dynamic management of multiple on-chip resources to maximize the effectiveness of virtualization. We demonstrate that by providing the illusion of more resources than physically available via controlled and coordinated virtualization, Zorua offers several important benefits: (i) Programming ease. It eases the burden on the programmer to provide code that is tuned to efficiently utilize the physically available on-chip resources. (ii) Portability. It alleviates the necessity of retuning an application's resource usage when porting the application across GPU generations. (iii) Performance. By dynamically allocating resources and carefully oversubscribing them when necessary, Zorua improves or retains the performance of applications that are already highly tuned to best utilize the resources. The holistic virtualization provided by Zorua has many other potential uses, e.g., fine-grained resource sharing among multiple kernels, low-latency preemption of GPU programs, and support for dynamic parallelism, which we describe in this chapter.
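The two concepts named above, dynamic allocation and controlled oversubscription backed by a swap space, can be sketched as a toy software model. This is not Zorua's hardware mechanism: the `VirtualResourcePool` class, its method names, and the fixed `max_oversubscription` cap are hypothetical, and the fixed cap stands in for the adaptive control that the chapter describes.

```python
class VirtualResourcePool:
    """Toy model of one on-chip resource backed by a swap space in memory.

    All names are hypothetical; a fixed cap replaces the adaptive policy.
    """

    def __init__(self, physical_units: int, max_oversubscription: int):
        self.physical_units = physical_units
        self.max_oversubscription = max_oversubscription  # cap on swapped units
        self.in_physical = 0  # units currently held on chip
        self.in_swap = 0      # units currently spilled to the swap space

    def allocate(self, units: int) -> str:
        """Grant a request on chip, spill the excess to swap, or stall it."""
        free = self.physical_units - self.in_physical
        if units <= free:
            self.in_physical += units
            return "physical"
        spill = units - free
        if self.in_swap + spill <= self.max_oversubscription:
            self.in_physical += free
            self.in_swap += spill
            return "oversubscribed"
        return "stalled"  # request waits; no state is changed

    def release(self, units: int) -> None:
        """Free units, then move swapped units into the freed on-chip space."""
        self.in_physical -= min(units, self.in_physical)
        refill = min(self.in_swap, self.physical_units - self.in_physical)
        self.in_swap -= refill
        self.in_physical += refill


pool = VirtualResourcePool(physical_units=100, max_oversubscription=20)
first = pool.allocate(80)    # fits on chip
second = pool.allocate(30)   # 20 on chip, 10 spilled to swap
third = pool.allocate(15)    # would exceed the oversubscription cap
pool.release(80)             # frees space; swapped units move back on chip
fourth = pool.allocate(15)   # now fits on chip
print(first, second, third, fourth)
```

The cap keeps spilling bounded: requests beyond it stall rather than pushing more state into memory, which loosely mirrors the idea of controlling the extent of oversubscription.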

Chapter Contents:

  • 4.1 Introduction
  • 4.2 Background
  • 4.3 Motivation
  • 4.3.1 Performance variation and cliffs
  • 4.3.2 Portability
  • 4.3.3 Dynamic resource underutilization
  • 4.3.4 Our goal
  • 4.4 Zorua: our approach
  • 4.4.1 Challenges in virtualization
  • 4.4.2 Key ideas of our design
  • 4.4.2.1 Leveraging software annotations of phase characteristics
  • 4.4.2.2 Control with an adaptive runtime system
  • 4.4.3 Overview of Zorua
  • 4.5 Zorua: detailed mechanism
  • 4.5.1 Key components in hardware
  • 4.5.2 Detailed walkthrough
  • 4.5.3 Benefits of our design
  • 4.5.4 Oversubscription decisions
  • 4.5.5 Virtualizing on-chip resources
  • 4.5.5.1 Virtualizing registers and scratchpad memory
  • 4.5.5.2 Virtualizing thread slots
  • 4.5.6 Handling resource spills
  • 4.5.7 Supporting phases and phase specifiers
  • 4.5.8 Role of the compiler and programmer
  • 4.5.9 Implications to the programming model and software optimization
  • 4.5.9.1 Flexible programming models for GPUs and heterogeneous systems
  • 4.5.9.2 Virtualization-aware compilation and auto-tuning
  • 4.5.9.3 Reduced optimization space
  • 4.6 Methodology
  • 4.6.1 System modeling and configuration
  • 4.6.2 Evaluated applications and metrics
  • 4.7 Evaluation
  • 4.7.1 Effect on performance variation and cliffs
  • 4.7.2 Effect on performance
  • 4.7.3 Effect on portability
  • 4.7.4 A deeper look: benefits and overheads
  • 4.8 Other applications
  • 4.8.1 Resource sharing in multi-kernel or multi-programmed environments
  • 4.8.2 Preemptive multitasking
  • 4.8.3 Support for other parallel programming paradigms
  • 4.8.4 Energy efficiency and scalability
  • 4.8.5 Error tolerance and reliability
  • 4.8.6 Support for system-level tasks on GPUs
  • 4.8.7 Applicability to general resource management in accelerators
  • 4.9 Related work
  • 4.10 Conclusion and future directions
  • Acknowledgments
  • References

Inspec keywords: resource allocation; programming; computer graphic equipment; graphics processing units; virtualisation

Other keywords: programmer-specified resource usage; resource virtualization framework; multiple on-chip resources; resource management; Zorua; holistic GPU resource virtualization strategy; dynamic management; on-chip hardware resources; coordinated virtualization; key on-chip resources; software-provided static resource specification; GPU programming; GPU application; fine-grained resource; GPU programs; software-provided resource specification; application resource specification; holistic virtualization; scratchpad memory usage; controlled virtualization; GPU generations; dynamic allocation; state-of-the-art GPU systems; graphics processing unit programming model; throughput processors

Subjects: Microprocessor chips; Microprocessors and microcomputers; File organisation

content/books/10.1049/pbpc022e_ch4