VLSI Architectures for Future Video Coding
This book addresses future video coding from the perspective of hardware implementation and architecture design, with particular focus on approximate computing and the energy-quality scalability paradigm. Challenges in deploying VLSI architectures for video coding are identified, and potential solutions are postulated with reference to recent research in the field. The book offers systematic coverage of the designs, techniques and paradigms that are most likely to be exploited in the design of VLSI architectures for future video coding systems. Written by a team of expert authors from around the world, and brought together by an editor who is a recognised authority in the field, this book is a useful resource for academics and industry professionals working on VLSI implementation of video codecs.
Inspec keywords: video coding; transforms; VLSI; video codecs
Other keywords: coding modules; high-resolution coding; joint algorithm-architecture design; real-time architectures; low-power circuit design techniques; VLSI architectures; hardwired oriented algorithms; high-throughput architectures; system architecture; scalable transform architectures; 3D video coding; future video coding
Subjects: Codecs, coders and decoders; General electrical engineering topics; Image and video coding; Semiconductor integrated circuits; Television and video equipment, systems and applications
- Book DOI: 10.1049/PBCS053E
- Chapter DOI: 10.1049/PBCS053E
- ISBN: 9781785617102
- e-ISBN: 9781785617119
- Page count: 384
- Format: PDF
Front Matter
p. (1)
1 Scalable transform architectures for video coding
pp. 1–39 (39)
Despite recent advances in telecommunication standards, communication networks still have limited bandwidth and storage capacity. Video compression has therefore grown in importance as high-resolution video content has become increasingly common across many fields. These requirements call for high-performance video-compression technologies able to reduce the amount of data to be transmitted or stored by compressing the input video signal into a bitstream. Improving coding efficiency has always been a crucial goal of compression standards, which aim for the most compact representation of the reconstructed video at a high subjective quality. High-efficiency video coding (HEVC) was developed to meet these requirements. However, the growing consumption of high-quality multimedia content has pushed international communication companies to invest considerable effort in further enhancing video-coding techniques. In this context, an upcoming standard known as versatile video coding (VVC) has emerged, aiming to improve on the coding efficiency of the current HEVC codec. The rate-distortion (RD) improvements brought by both HEVC and VVC come with increased complexity in most of the coding modules, which makes real-time encoding difficult to implement in hardware. This chapter focuses on the transform coding stage, one of the most computationally demanding modules. For HEVC, efficient approximation algorithms, together with a reconfigurable and scalable architecture, have been developed to decrease the computational complexity of the transform module. The main objectives are to meet low-power and real-time processing constraints while maintaining compression gain and satisfactory video quality.
Field-programmable gate array (FPGA) implementation results and comparisons with existing works confirm the efficiency of the proposed approximations: they reduce processing time and power consumption, optimize hardware resource usage and even improve peak signal-to-noise ratio (PSNR). Similarly, for the adaptive multiple transform (AMT) introduced in the transform module of VVC, approximations were developed for the discrete cosine transform (DCT)-II and discrete sine transform (DST)-VII, since these are statistically the most used among the five predefined transform types. Bitrate (BR) reduction, with only slight degradation of video quality and reduced use of hardware resources, are the main contributions of the proposed approximations.
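As a concrete illustration of integer transform approximation (a generic sketch, not the chapter's specific algorithms), the snippet below compares the exact orthonormal 4-point DCT-II with the HEVC 4-point integer core transform, whose entries approximate a scaled version of the exact matrix using only small integers:

```python
import numpy as np

# Exact orthonormal 4-point DCT-II matrix.
N = 4
C = np.array([[np.sqrt((1 if k == 0 else 2) / N) *
               np.cos((2 * n + 1) * k * np.pi / (2 * N))
               for n in range(N)] for k in range(N)])

# HEVC 4-point integer core transform: an integer approximation of 128*C,
# with entries tuned by the standard for orthogonality and near-equal norms.
T = np.array([[64, 64, 64, 64],
              [83, 36, -36, -83],
              [64, -64, -64, 64],
              [36, -83, 83, -36]], dtype=np.int64)

x = np.array([10, 20, 30, 40], dtype=np.int64)
exact = C @ x                # floating-point DCT of a sample vector
approx = (T @ x) / 128.0     # integer transform, rescaled for comparison

print(np.max(np.abs(exact - approx)))  # small approximation error
```

The integer matrix needs only shifts and additions/multiplications on small constants, which is what makes such approximations attractive for low-power hardware.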
2 Joint algorithm-architecture design of video coding modules
pp. 41–77 (37)
This chapter overviews the joint effort of ITU-T and ISO/IEC to develop the upcoming VVC standard, which aims to outperform the HEVC standard by 50% in compression efficiency at similar video quality. Our analysis of the current video encoder in the VVC reference software (VTM 1) shows that a one-order-of-magnitude increase in encoding time is expected for the VVC encoder compared to the HEVC encoder. These complexity figures are expected to grow, since the standardization process is still at an early stage and the reference software includes only a small set of coding tools, while others are under test. The BD-rate reduction of VTM 1 compared to HEVC is still far from the 50% goal, but more sophisticated (and complex) coding tools are being tested to reach it. Regular meetings define and test the coding tools to be included in each VTM version, and the final standard is planned for 2020. This chapter also presents some state-of-the-art solutions for joint algorithm-architecture design of video coding modules, mainly targeting HEVC, and discusses the challenges of adapting or designing new solutions for the VVC standard. The new block partitioning is a key tool that affects many modules, such as RDO, inter- and intra-frame prediction, and transforms. It has also increased performance requirements because of the growth in possible combinations of block partitionings and coding modes. We foresee that future video coding systems will need joint algorithm-architecture optimizations, combining fast block-partitioning algorithms with high-throughput hardware architectures that employ intelligent reuse schemes and fast, low-power arithmetic operators implemented in newer technology nodes. These challenges in implementing high-throughput hardware modules must be tackled by industry and academia to make the future VVC video-coding standard usable for a wide range of applications.
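To give a feel for the partitioning search space mentioned above, the sketch below counts the possible recursive quadtree partitionings of a coding tree unit (quadtree-only, a deliberate simplification; VVC's multi-type tree adds binary and ternary splits, which enlarge this space much further):

```python
# Count recursive quadtree partitionings of a square block:
# either keep the block whole, or split it into 4 sub-blocks,
# each of which is partitioned independently.
def quadtree_partitions(size, min_size=8):
    if size <= min_size:
        return 1                      # leaf block: no further split allowed
    return 1 + quadtree_partitions(size // 2, min_size) ** 4

for s in (16, 32, 64):
    print(s, quadtree_partitions(s))  # grows explosively with block size
```

Even this simplified count reaches tens of thousands of candidate partitionings for a 64x64 block, which is why fast partitioning decision algorithms matter so much for encoder hardware.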
3 High-throughput architectures for high-resolution video coding: system architecture analysis
pp. 79–110 (32)
This chapter discusses how designing for high-resolution video involves several challenges, mainly in encoder architectures. First, for a given algorithm, the amount of computation increases proportionally to the number of processed pixels, or even to its square. Second, algorithms with higher compression efficiency require more complex architectures. Third, mode and data dependencies between neighboring blocks introduce timing limitations in processing paths, especially in the reconstruction loop. Fourth, architectures operating at higher clock frequencies become indispensable. Fifth, simplifications introduced to the algorithms implemented in reference models usually decrease compression efficiency. Since the encoder algorithm has many options and parameters, the space of possible simplifications is huge. Designers should therefore decide the algorithm specification subject to the target design constraints.
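A rough back-of-envelope sketch of the first and fourth points (the figures are illustrative, not taken from this chapter): given a resolution, frame rate and clock frequency, the number of pixels that must be processed per clock cycle follows directly, and it quickly exceeds one pixel per cycle at high resolutions:

```python
# How many pixels must be processed per clock cycle to sustain real time?
def pixels_per_cycle(width, height, fps, clock_hz):
    return width * height * fps / clock_hz

# Hypothetical example: 4K (3840x2160) at 60 fps on a 500 MHz design
# already requires roughly one pixel per cycle from the processing path.
print(pixels_per_cycle(3840, 2160, 60, 500e6))
```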
4 High-throughput architectures for high-resolution video coding: hardware-oriented algorithms and VLSI architectures
pp. 111–148 (38)
High video resolutions, along with more complex coding algorithms, drive the need for new hardware accelerators featuring high throughput. At the same time, this goal should be achieved at the smallest possible cost in resources and power consumption. In the case of encoders, another important constraint is the allowable loss in compression efficiency, which should stay close to that of the reference software implementations developed to demonstrate the compression capabilities of a coding standard. These opposing requirements must be satisfied within a hardware framework where design units usually support fixed throughputs and delays. It is therefore important to develop throughput-balanced processing paths with high utilization of computation resources. For the same performance, encoders are usually much more complex than decoders, so this chapter focuses on hardware-oriented algorithms and architectures dedicated to encoders; however, some design techniques for particular modules can also be applied in decoders. As discussed in Chapter 3, there are bottlenecks in the encoder dataflow. They can be mitigated or removed by modifications to the coding algorithms, although such modifications usually involve losses in compression efficiency. The following sections review solutions to particular encoder tasks presented in the literature.
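As a toy illustration of throughput balancing (stage names and cycle counts here are hypothetical, not from this chapter): in a pipeline of fixed-throughput units, the slowest stage sets the achievable block rate, and the utilization of every other stage follows from it, so balancing the stages is what keeps resources busy:

```python
# Cycles per block for each pipeline stage (illustrative values).
stages = {"prediction": 64, "transform": 48, "entropy": 70}

bottleneck = max(stages.values())          # slowest stage sets block rate
utilization = {name: cycles / bottleneck   # fraction of time each stage works
               for name, cycles in stages.items()}

print(bottleneck, utilization)
```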
5 Low-power circuit design techniques for high-resolution video coding
pp. 149–190 (42)
After an introduction, the chapter presents the sources of power dissipation in CMOS circuits, as well as a methodology for accurate power-dissipation estimation using real video data. State-of-the-art low-power techniques used in dedicated hardware accelerator designs are discussed in the next section. One of the methods introduced there is the hybrid encoding of arithmetic operators, and a new hybrid-encoded adder operator is presented. The power-efficient hybrid-encoding representation groups m bits and uses Gray encoding to potentially reduce circuit switching activity, both internally and at the inputs of the arithmetic operators. This section also explores different adder compressor structures in the SAD operation and in the interpolation filter hardware architecture; an adder compressor performs the simultaneous addition of N operands, and combinations of 3-2, 4-2, 5-2 and 7-2 adder compressors are discussed. An approximate computing technique based on coefficient pruning for the SATD hardware architecture is then presented, and finally the application of the low-power techniques to SAD, SATD and interpolation filters is detailed.
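The switching-activity argument behind Gray encoding can be sketched as follows (a minimal illustration, not the chapter's hybrid operator): consecutive Gray-coded values differ in exactly one bit, so a bus carrying an incrementing value toggles fewer wires per step than one carrying plain binary, and each toggle costs dynamic power:

```python
# Standard binary-to-Gray conversion.
def to_gray(b):
    return b ^ (b >> 1)

# Number of bits that flip between two bus values.
def toggles(a, b):
    return bin(a ^ b).count("1")

# Total toggles while counting 0..15 on a bus, binary vs Gray encoding.
binary_toggles = sum(toggles(i, i + 1) for i in range(15))
gray_toggles = sum(toggles(to_gray(i), to_gray(i + 1)) for i in range(15))
print(binary_toggles, gray_toggles)  # Gray code toggles exactly once per step
```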
6 Real-time architectures for 3D video coding
pp. 191–226 (36)
The previous generation of 3D videos was not appealing enough to justify support in handheld devices or to maintain that support in other electronic devices, such as TVs. The huge amount of data to be transmitted and the limited view-synthesis performance resulted in visual discomfort and poor user experience, which culminated in a temporary halt in the manufacture of such devices. In recent years, by contrast, emerging 3D-video-related technologies covering 3D movies, virtual reality, augmented reality, mixed reality and others have regained a place in the consumer market, especially in handheld devices. These technologies are driven by devices like Intel RealSense 3D, Structure Sensor and Stereolabs ZED. Most of these devices focus on a texture-plus-depth approach, which is currently the most efficient way to represent and encode 3D videos. In this context, 3D-HEVC was published in 2015, adopting the MVD format and novel encoding tools to deal efficiently with multiview videos captured by multiple cameras and to improve view-synthesis performance. Some works focusing on VLSI designs and the real-time processing of 3D videos can be found in the literature, as discussed in this chapter. Although 3D-HEVC is the most efficient way available to encode 3D videos, video coding with multiple views remains challenging even with 3D-HEVC and dedicated VLSI designs, and only a few works in the literature report hardware designs targeting the 3D-HEVC tools specifically. Three works were selected for detailed discussion in this chapter: two focusing on the new 3D-HEVC intra-frame prediction tools (DMMs and DIS) and one on the 3D-HEVC inter-frame and inter-view predictions. These three architectures can process at least HD 1080p@30fps video in real time. To that end, they adopt different approaches to achieve high throughput with minimal drawbacks in area usage and power dissipation.
7 Frame memory compression for high-resolution video coding
pp. 227–255 (29)
This chapter presents frame memory compression methods used in video coding. Frame memory compression compresses the data to be stored in the frame memory in order to reduce the external memory bandwidth and the related power consumption, as shown in Figure 7.1. When the pixels of a motion-compensated frame have to be written to external DRAM, the frame memory compression engine compresses those pixels; during motion estimation (ME), it decompresses the compressed pixels of previous frames and passes them to the video codec. As shown in Figure 7.2, most frame memory compression algorithms are composed of three stages: prediction, entropy coding and memory organization [1], which are introduced in Sections 7.1, 7.2 and 7.3, respectively.
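A minimal sketch of the first two of those stages, assuming a simple previous-pixel (DPCM) predictor and order-0 Exp-Golomb-style residual coding (both chosen here for illustration; they are generic techniques, not the chapter's specific algorithm):

```python
# Bit length of an order-0 Exp-Golomb code for a signed residual:
# map signed value to a non-negative index, then count code bits.
def exp_golomb_bits(v):
    u = 2 * v - 1 if v > 0 else -2 * v
    return 2 * (u + 1).bit_length() - 1

# DPCM prediction: each pixel is predicted by its left neighbor, and the
# residual (usually small for natural content) is entropy coded.
def compressed_bits(pixels):
    bits, pred = 0, 128            # predictor initialized to mid-gray
    for p in pixels:
        bits += exp_golomb_bits(p - pred)
        pred = p
    return bits

row = [128, 129, 131, 130, 130, 127, 126, 126]
print(compressed_bits(row), "vs", 8 * len(row), "bits uncompressed")
```

Smooth pixel rows compress well under this scheme because the residuals cluster near zero, which is exactly the property a frame memory compression engine exploits to cut DRAM traffic.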
8 Discrete transform approximations for video coding
pp. 257–297 (41)
This chapter discusses several relevant low-complexity transforms for image and video compression. We present metrics for evaluating approximate DCT transforms and search for transforms that are optimal with respect to those metrics. The optimal transforms are then implemented in hardware and compared to the standard approximations for high-efficiency video coding (HEVC).
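One common style of evaluation metric can be sketched as follows (the candidate matrix below is a hypothetical {0, ±1} transform requiring only additions, not one of the chapter's optimal designs): orthonormalize the approximation by its row norms and measure its mean squared distance from the exact DCT matrix:

```python
import numpy as np

# Exact orthonormal 4-point DCT-II matrix.
N = 4
C = np.array([[np.sqrt((1 if k == 0 else 2) / N) *
               np.cos((2 * n + 1) * k * np.pi / (2 * N))
               for n in range(N)] for k in range(N)])

# Hypothetical low-complexity candidate: entries in {0, +1, -1},
# so the transform needs additions/subtractions only.
T = np.array([[1, 1, 1, 1],
              [1, 0, 0, -1],
              [1, -1, -1, 1],
              [0, -1, 1, 0]], dtype=float)

# Orthonormalize by row norms, then measure proximity to the exact DCT.
S = np.diag(1.0 / np.linalg.norm(T, axis=1))
C_hat = S @ T
mse = np.mean((C - C_hat) ** 2)
print(mse)
```

A smaller MSE means the cheap transform behaves more like the exact DCT; searching over such candidate matrices subject to a complexity budget is the kind of optimization the evaluation metrics enable.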
9 Reconfigurable and approximate computing for video coding
pp. 299–330 (32)
The chapter begins with a discussion of the constraints and needs of video coding systems. The lack of flexibility in traditional monolithic codec specifications, which are not suited to modeling commonalities among codecs or to fostering reusability across successive codec generations and updates, was the main trigger for a new standard initiative within the ISO/IEC MPEG committee, called reconfigurable video coding (RVC). The MPEG-RVC framework exploits the dataflow nature of video coding to foster flexible and reconfigurable codec design, as well as to support dynamic reconfiguration. The chapter goes on to observe that the inherent resiliency of various functional blocks (such as motion estimation in high-efficiency video coding, HEVC) and the varying levels of user perception make video coding well suited to approximate computing techniques. Approximate computing, if properly supported at design time, enables run-time trade-offs, representing a new direction in hardware-software codesign research. The main assumption behind approximate computing, as exploited in video coding, is that the same degree of accuracy (here, during codec execution) is not required at all times. The final part of the chapter draws these concepts together and remarks on what are, in the authors' opinion, some interesting research directions.
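One simple approximate-computing knob of the kind alluded to here (values and names are illustrative; this is not a technique attributed to the chapter) is subsampling the pixels that enter a SAD-style distortion measure used in motion estimation, trading metric accuracy for roughly half the work:

```python
# SAD over two blocks, optionally skipping samples with a stride.
def sad(block_a, block_b, step=1):
    return sum(abs(a - b) for a, b in zip(block_a[::step], block_b[::step]))

a = [10, 12, 15, 11, 9, 14, 13, 10]
b = [11, 10, 15, 13, 9, 12, 14, 11]

exact = sad(a, b)           # all 8 samples
approx = 2 * sad(a, b, 2)   # every other sample, rescaled: ~half the work
print(exact, approx)
```

Motion estimation tolerates such errors because SAD is only used to rank candidate blocks; the run-time choice of the subsampling stride is exactly the kind of accuracy knob an energy-quality-scalable design exposes.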
10 Future video coding: new tools and algorithms
pp. 331–356 (26)
In recent years there has been a real revolution in the world of film and television, thanks to the advent of new digital formats that have involved the entire chain of production of multimedia products. This revolution has impacted the multimedia industry, consumer electronics and communication networks, opening new opportunities for convergence. Video quality has grown dramatically, aiming to emulate the chromatic richness, dynamics and detail rendering typical of human vision, and posing major challenges for transmission bandwidth and media storage; new and better-performing video-compression standards that leave quality unchanged have become necessary. Moreover, video content itself has undergone important changes, both in the quality delivered to users and in the way they consume it. On the one hand, HD and beyond-HD resolutions have become increasingly popular; on the other, video-on-demand, mobile television services, and stereo and multiview capture and display are examples of how video content is evolving. All these services demand efficient solutions for storing huge amounts of data and for delivering the same video content at different resolutions. Although communication networks have also evolved to provide higher capacities, these new requirements mean the video signal must be compressed very efficiently in order to store and stream it reliably.
Back Matter
p. (1)