XY-Type GPU Cache: Exploiting Spatial Localities in both X and Y Directions to Avoid Conflict Miss

Chinese Journal of Electronics


Caches have been introduced into many graphics processing units (GPUs) to reduce the frequency of data transfers between high-performance computing units and slow, long-latency external memory. The traditional index mapping scheme, originally designed for CPU caches, exploits only spatial locality in the address space. Accesses to graphics data, however, exhibit region locality on the frame buffer: spatial locality is high in both the X and Y directions. Under the traditional mapping, such accesses may concentrate conflict misses on a limited number of cache lines, which eventually results in a high cache miss ratio and a performance drop. A traditional CPU cache therefore cannot be used directly in a GPU. We propose a new conflict-avoiding GPU cache, called the XY-type cache, with a new index mapping scheme in which cache line indices are computed from both the X and Y coordinates of pixels, so that the cache index distribution is consistent with the region locality on the frame buffer. Our evaluation results show that the proposed XY-type GPU cache can reduce the cache miss ratio by up to 88% by scattering cache accesses evenly across all lines, and can completely avoid the adverse effect of frame resolution. Because the cache miss ratio of a direct-mapped or 2-way set-associative structure is close to, or even lower than, that of a fully-associative structure (the best case in terms of reducing cache line conflicts), the XY-type GPU cache can be designed with lower complexity and lower power consumption.
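
The contrast between the two index mappings can be illustrated with a small simulation. The sketch below is not the scheme from the paper, whose exact bit layout is not given in this abstract. It assumes a direct-mapped cache with 64 lines of 64 bytes, 32-bit pixels, and a hypothetical XY mapping that combines 3 low bits of Y with 3 low bits of the X block number. All names and parameters (`linear_index`, `xy_index`, `count_misses`, the tile size) are illustrative assumptions.

```python
# Illustrative sketch only: the parameters and the XY bit layout are assumptions,
# not the mapping defined in the paper.
LINE_BYTES = 64                                   # bytes per cache line (assumed)
BYTES_PER_PIXEL = 4                               # 32-bit pixels (assumed)
PIXELS_PER_LINE = LINE_BYTES // BYTES_PER_PIXEL   # 16 pixels along X per line
NUM_LINES = 64                                    # direct-mapped, 64 lines (assumed)


def linear_index(x, y, width):
    """Traditional mapping: index taken from the linear frame-buffer address."""
    addr = (y * width + x) * BYTES_PER_PIXEL
    return (addr // LINE_BYTES) % NUM_LINES


def xy_index(x, y, width):
    """Hypothetical XY mapping: 3 low bits of Y and 3 low bits of the X block.

    The frame width is deliberately unused, so the index distribution
    does not depend on the frame resolution.
    """
    block_x = x // PIXELS_PER_LINE
    return (y % 8) * 8 + (block_x % 8)


def count_misses(index_fn, width, tile_w=32, tile_h=8, passes=2):
    """Sweep a tile_w x tile_h pixel tile `passes` times and count misses."""
    cache = [None] * NUM_LINES                    # one stored tag per cache line
    misses = 0
    for _ in range(passes):
        for y in range(tile_h):
            for x in range(tile_w):
                # Tag identifies the memory line; valid when width is a
                # multiple of PIXELS_PER_LINE, as in the widths used here.
                tag = (y, x // PIXELS_PER_LINE)
                idx = index_fn(x, y, width)
                if cache[idx] != tag:
                    cache[idx] = tag              # evict and refill
                    misses += 1
    return misses


misses_linear = count_misses(linear_index, width=1024)
misses_xy = count_misses(xy_index, width=1024)
print(misses_linear, misses_xy)
```

With a 1024-pixel-wide frame, one row occupies exactly 64 cache lines, so under the address-based mapping every row of the tile lands on the same two indices and the second pass misses again. The assumed XY mapping gives each of the 16 lines in the tile its own index, so only the first pass misses. Changing `width` to 800 makes the address-based conflicts disappear in this toy setup, which illustrates how sensitive the traditional mapping is to frame resolution.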
