IET Computers & Digital Techniques
Volume 9, Issue 1, January 2015
- Author(s): Jose Nunez-Yanez ; Juan Manuel Moreno ; Dimitrios S. Nikolopoulos
- Source: IET Computers & Digital Techniques, Volume 9, Issue 1, p. 1–2
- DOI: 10.1049/iet-cdt.2014.0215
- Type: Article
- Author(s): Rong Ren ; Eduardo Juarez ; Cesar Sanz ; Mickael Raulet ; Fernando Pescador
- Source: IET Computers & Digital Techniques, Volume 9, Issue 1, p. 3–15
- DOI: 10.1049/iet-cdt.2014.0087
- Type: Article
In this study, a platform-independent methodology is proposed to estimate the energy consumption of reconfigurable video coding (RVC)-CAL video codec specifications. The methodology is based on the performance monitoring counters (PMCs) of embedded platforms and demonstrates portability, simplicity and accuracy for on-line estimation. It has two off-line stages: the first automatically identifies the most appropriate PMCs without detailed knowledge of the employed platform, and the second trains the model using either a linear regression or a multivariable adaptive regression splines (MARS) method. In experiments on an RVC-CAL decoder, the proposed PMC-driven model achieves an average estimation error <10%. In addition, the maximal model computation overhead is 4.04%. The results show that the training video sequence has a significant influence on the model accuracy. An experimental metric is introduced to achieve more stable and accurate models based on a combination of training sequences. Furthermore, a comparison demonstrates the better predictive ability of MARS techniques in scenarios with multi-core platforms. Finally, the experimental results show good potential for energy efficiency improvement when the estimation model is integrated into the RVC framework. In two different scenarios, the battery lifetime is increased by 5.16% and 20.9%, respectively.
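As a rough illustration of the linear-regression option mentioned in this abstract, the sketch below fits energy as a linear function of two counter values by ordinary least squares. It is not the authors' implementation: the counter names (`instructions`, `cache_misses`), the sample values and the helper functions are all hypothetical, and the training data is synthetic so that the fit can be checked by hand.

```python
# Illustrative sketch only: fit energy ~ b0 + b1*pmc1 + b2*pmc2 by ordinary
# least squares. Counter names and sample values are hypothetical and are
# not taken from the paper.

def solve(a, b):
    """Solve the square system a*x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        # Move the row with the largest entry in this column into pivot position.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        # Eliminate this column from every other row.
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def fit_linear(samples, energies):
    """Return [b0, b1, ...] minimising squared error via the normal equations."""
    rows = [[1.0] + list(s) for s in samples]  # leading 1.0 is the intercept term
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * e for r, e in zip(rows, energies)) for i in range(k)]
    return solve(xtx, xty)

def estimate(coeffs, sample):
    """Apply the fitted model to one vector of counter values."""
    return coeffs[0] + sum(c * x for c, x in zip(coeffs[1:], sample))

# Hypothetical training set: (instructions, cache_misses) per interval, in millions.
train_pmcs = [(10.0, 1.0), (20.0, 1.5), (15.0, 3.0), (30.0, 2.0), (25.0, 4.0)]
# Synthetic energy values in joules, generated from a known linear law.
train_energy = [0.5 + 0.02 * i + 0.1 * c for i, c in train_pmcs]

coeffs = fit_linear(train_pmcs, train_energy)
```

Because the synthetic data is exactly linear, the fitted coefficients recover the generating values (0.5, 0.02, 0.1) up to rounding. Real counter data would be noisy, and the MARS alternative named in the abstract is not covered by this sketch.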
- Author(s): Hamed Tabkhi ; Majid Sabbagh ; Gunar Schirner
- Source: IET Computers & Digital Techniques, Volume 9, Issue 1, p. 16–26
- DOI: 10.1049/iet-cdt.2014.0075
- Type: Article
This study focuses on the embedded realisation of adaptive vision algorithms, and illustrates the challenges using mixture of Gaussian (MoG) background subtraction. MoG is a frequently used adaptive vision kernel, for example, in surveillance applications. It involves massive computation and communication demands, which render a software approach infeasible within a 1 W power budget. To address these challenges, the authors employ a systematic system-level design approach: they first analyse the demands at a high level, explore opportunities for bandwidth reduction, and derive a customised system-level specification. Based on the system-level exploration, this study then proposes a communication-centric architecture template that simplifies implementing embedded adaptive vision algorithms. To achieve high efficiency, they propose to separate streaming and algorithm-intrinsic traffic. This allows customising the traffic handling based on the role of the data, as well as simplifying the interconnection of multiple heterogeneous nodes. The authors demonstrate the benefits of traffic separation and the communication-centric architecture template based on MoG. They realise MoG on the Zynq-7000 SoC, processing a 1080p 30 Hz stream in real time. The MoG processing kernel consists of 77 pipeline stages operating at 148.5 MHz. The authors' solution is more than 600× faster than an ARM Cortex-A9 running at 666 MHz. It consumes only 151 mW of on-chip power while operating in real time.
- Author(s): Edson Luiz Padoin ; Laércio Lima Pilla ; Márcio Castro ; Francieli Z. Boito ; Philippe Olivier Alexandre Navaux ; Jean-François Méhaut
- Source: IET Computers & Digital Techniques, Volume 9, Issue 1, p. 27–35
- DOI: 10.1049/iet-cdt.2014.0074
- Type: Article
Power consumption is one of the main challenges in achieving Exascale performance. Current research trends aim at overcoming power consumption constraints using low-power processors. Although new processors feature sensors that enable precise power measurements, they provide different interfaces to collect data, making it difficult to correlate performance with energy consumption. To overcome this issue, the authors developed a platform-independent tool that collects power and energy data from homogeneous and heterogeneous systems. Using this tool, they provide a detailed comparison between a low-power processor (ARM big.LITTLE) and a high-performance processor (Intel Sandy Bridge-EP) using all applications from the NAS parallel benchmarks and a real-world soil irrigation simulator. The results show that the average power demand of Intel Sandy Bridge-EP is between 12.6× and 152.4× higher than that of ARM big.LITTLE, whereas its average energy consumption is between 1.6× and 7.1× higher. Overall, ARM big.LITTLE presented a better performance/energy trade-off when it takes <9.2× the execution time of Intel Sandy Bridge-EP to solve the same problem.
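The relation underlying any such comparison is that energy equals average power multiplied by execution time, so a slower low-power processor uses less energy only while its slowdown stays below the ratio of the two power demands. The sketch below works through that arithmetic with made-up figures. The function names and numbers are assumptions for illustration; it does not reproduce the authors' trade-off metric or their 9.2× threshold.

```python
# Illustrative arithmetic only: energy = average power * execution time.
# All figures below are hypothetical and are not measurements from the paper.

def energy_joules(avg_power_watts, exec_time_s):
    """Energy consumed by a run with the given average power and duration."""
    return avg_power_watts * exec_time_s

def low_power_uses_less_energy(power_ratio, slowdown):
    """True if the low-power machine consumes less energy for the same job.

    power_ratio: fast machine's average power / low-power machine's average power
    slowdown:    low-power machine's run time / fast machine's run time
    """
    return slowdown < power_ratio

# Hypothetical fast machine: 100 W for 10 s.
fast_energy = energy_joules(100.0, 10.0)   # 1000 J
# Hypothetical low-power machine: 5 W, but 12x slower (120 s).
slow_energy = energy_joules(5.0, 120.0)    # 600 J

# Power ratio is 20x and slowdown is 12x, so the low-power machine wins on energy.
wins = low_power_uses_less_energy(100.0 / 5.0, 120.0 / 10.0)
```

With these figures the low-power run costs 600 J against 1000 J. At a slowdown above 20× the comparison would flip.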
- Author(s): Francesca Palumbo ; Carlo Sau ; Luigi Raffo
- Source: IET Computers & Digital Techniques, Volume 9, Issue 1, p. 36–48
- DOI: 10.1049/iet-cdt.2014.0089
- Type: Article
Power reduction in modern embedded systems design is a challenging issue exacerbated by the complexity and heterogeneity of their architecture. In the field of Reconfigurable Video Coding (RVC), dataflow-based techniques have been adopted to address these issues and cut down time to market. In particular, to master the management and composability of dynamically reconfigurable systems, the authors have developed the multi-dataflow composer. Nevertheless, although RVC offers several different tools, power management is still an open issue in its reference design framework. As a step towards filling this gap, in this study, they address power management for coarse-grained reconfigurable systems by combining structural and dynamic strategies, both applied at the dataflow level.
- Author(s): Nuno Neves ; Henrique Mendes ; Ricardo Jorge Chaves ; Pedro Tomás ; Nuno Roma
- Source: IET Computers & Digital Techniques, Volume 9, Issue 1, p. 49–62
- DOI: 10.1049/iet-cdt.2014.0078
- Type: Article
Given the increased demand for high-performance and energy-aware computational platforms, an adaptive heterogeneous computing platform composed of 100+ cores is herein proposed. The platform is based on an aggregate of multiple processing clusters, each containing multiple processing cores, whose architectures are adapted, at execution time, to the instantaneous energy and performance constraints of the software application under execution. This adaptation is ensured by a sophisticated hypervisor engine, implemented as a software layer in the host computer, which keeps a permanent record of a broad set of performance counters, gathered from the execution of each core in the field-programmable gate array (FPGA), in order to dynamically determine the optimal heterogeneous mix of processor architectures that satisfies the considered constraints. By issuing the appropriate reconfiguration commands to the reconfiguration engine, implemented in a static portion of the FPGA, partial dynamic reconfiguration mechanisms ensure a runtime adaptation of the cores that make up each cluster. When compared with static instantiations of the considered many-core processor architectures, the experimental results show that significant gains can be obtained with the proposed adaptive computing platform, with performance speedups of up to 9.5× and reductions in consumed energy as high as 10×.
- Author(s): Mateus Beck Rutzig ; Antonio Carlos Schneider Beck ; Luigi Carro
- Source: IET Computers & Digital Techniques, Volume 9, Issue 1, p. 63–72
- DOI: 10.1049/iet-cdt.2014.0072
- Type: Article
Nowadays, multiprocessor system-on-chips (MPSoCs) are employed in a heterogeneous fashion, being composed of application-specific integrated circuits (ASICs) and processors that implement different instruction set architectures (ISAs). This raises two main issues: first, a lack of adaptability, since ASICs are designed for a specific purpose and cannot be changed after deployment; second, the need to code for different ISAs, which involves different tool chains and increases design time. In this scenario, the authors propose custom-reconfigurable arrays for multiprocessor systems (CReAMS), which is composed of multiple processors that implement a single ISA, each coupled to an adaptive reconfigurable system, so that instruction-level and thread-level parallelism can be exploited simultaneously. Unlike most reconfigurable architectures, there is no need to change the binary/source code, the software development process or the environment, which guarantees software compatibility; and in contrast to current MPSoCs used in embedded systems, it is capable of adapting to accelerate applications that were not considered at design time. Besides the obvious advantages in software productivity, CReAMS outperforms a multiprocessor with single-issue processors by 19% and reduces energy consumption by 70%. In addition, CReAMS outperforms a four-issue out-of-order superscalar processor by 18% in a power budget scenario.
- Author(s): Bruno de Abreu Silva ; Lucas A. Cuminato ; Alexandre C.B. Delbem ; Pedro C. Diniz ; Vanderlei Bonato
- Source: IET Computers & Digital Techniques, Volume 9, Issue 1, p. 73–81
- DOI: 10.1049/iet-cdt.2014.0091
- Type: Article
This study describes and evaluates an automated technique that exploits the potential of heterogeneous multi-core processor (HMP) systems when customised with respect to the number of cores and L1 cache memory sizes, using a field programmable gate array fitted with LEON3 cores as its base. The authors evaluated the real energy consumption of the HMP system tuned for a set of 50 application codes, using a data-mining tool to find code similarities and select HMP configurations. The selected HMP system configuration requires a small cache configuration and consumes less energy than a homogeneous system with the same number of cores, with only a very modest increase in execution time.
- Author(s): Ian Gray ; Gary Plumbridge ; Neil C. Audsley
- Source: IET Computers & Digital Techniques, Volume 9, Issue 1, p. 82–92
- DOI: 10.1049/iet-cdt.2014.0070
- Type: Article
Manufacturing variability is an increasingly significant problem. Silicon devices that are designed to be identical will display widely ranging characteristics after manufacture. Power use, supported clock frequencies and lifespan may all vary considerably. This is of particular concern for embedded systems because of their extensive use of complex system-on-chip (SoC)-based architectures. If this variability is not tolerated by the software, then manufacturing yields are reduced and devices are not used efficiently. This study discusses a novel approach to the integration of variability-mitigation techniques that uses model-driven engineering to explicitly consider variability as part of the development process. Developers can build systems that are much more resilient to variability effects, allowing systems to have higher yields, lower costs and greater reliability. The approach uses code generation and code transformation to simplify design-space exploration and reduce time-to-market. The approach is illustrated with an example of audio processing on a complex multiprocessor SoC with simulated variability, and it is shown to be increasingly effective as system variability becomes more significant.
Guest Editorial
Energy estimation models for video decoders: reconfigurable video coding-CAL case-study
Power-efficient real-time solution for adaptive vision algorithms
Performance/energy trade-off in scientific computing: the case of ARM big.LITTLE and Intel Sandy Bridge
Coarse-grained reconfiguration: dataflow-based power management
Morphable hundred-core heterogeneous architecture for energy-aware computation
Adaptive and dynamic reconfigurable multiprocessor system to improve software productivity
Application-oriented cache memory configuration for energy efficiency in multi-cores
Toolchain-based approach to handling variability in embedded multiprocessor system on chips