Area-efficient special function unit for mobile vertex processors

Electronics Letters

An area-efficient special function unit (SFU) for evaluating transcendental functions in mobile vertex processors is presented. Although it is used infrequently, a previous SFU implementation occupied a significant portion of the shader datapath unit. The proposed SFU reduces the area by 54% by performing the quadratic interpolation required for function evaluation on a shared 4D dot product unit, while the setup circuitry and the lookup table are implemented as dedicated SFU hardware. Benchmarks of shader programs show that the performance per area of the shader datapath unit improves by 69%.


