Scheduling-scheme and parallel structure for multi-level lifting two-dimensional discrete wavelet transform without using frame-buffer
In this paper, we have proposed a novel scheduling scheme for generating continuous input-blocks for the succeeding processing units of parallel structure to achieve 100% hardware utilisation efficiency (HUE) without block folding. Based on the proposed scheme, we have derived a parallel and pipeline structure for multilevel lifting two-dimensional discrete wavelet transform (DWT). The proposed structure involves regular data-flow and does not require frame-buffer, and calculates DWT levels concurrently. A theoretical comparison shows that the proposed structure for J = 2 involves 1.25 times more multipliers and adders, 2N more registers than those of existing folded block-based structure and offers 1.25 times higher throughput, where N is the input-image width. Compared with similar existing parallel structure, the proposed structure requires the same number of multipliers and adders, 2.125N less registers and offers the same throughput rate. Application specific integrated circuit synthesis result shows that the core of the proposed structure for 2-level DWT and image size (512 × 512) involves 41% less area-delay-product and 36% less energy-per-image than those of similar existing parallel structure.