IET Computers & Digital Techniques
Volume 13, Issue 1, January 2019
- Author(s): Kollaparampil Somasekharan Sreekala and Sukumarapillai Krishnakumar
- Source: IET Computers & Digital Techniques, Volume 13, Issue 1, p. 1–10
- DOI: 10.1049/iet-cdt.2018.0009
- Type: Article
With the advent of nanoscale devices, the problem of leakage power has grown enormously. Reducing leakage power is one of the main challenges in the design of low-power circuits. This study presents a delay-efficient, circuit-level leakage reduction technique for deep submicron (DSM) technology that uses dual-Vth, named ‘Feedback Sleeper-Stack (FS-S)’. FS-S is proposed in order to reduce leakage power dramatically while retaining the exact logic state. An analytical RC delay model of the FS-S is derived. Comparisons with the available leakage reduction techniques are then carried out in terms of leakage power, total power, delay, area, and power–delay product. 45 nm BSIM4 Predictive Technology Model parameters are used to estimate the changes in power and delay. FS-S is applied to three generic logic circuits to show that the proposed technique is suitable for general logic circuits. Results show that a chain of four inverters, a NAND3 gate, and the C17 circuit with dual-Vth FS-S give 15, 62, and 90% performance levels, respectively, over the base case circuit under iso-area conditions.
- Author(s): Ebadollah Taheri ; Karim Mohammadi ; Ahmad Patooghy
- Source: IET Computers & Digital Techniques, Volume 13, Issue 1, p. 11–19
- DOI: 10.1049/iet-cdt.2017.0139
- Type: Article
Dynamic thermal management (DTM) techniques for three-dimensional (3D) Networks-on-Chip (NoCs) are employed to rescue the chip from thermal difficulties. Reactive routing algorithms, which utilise router throttling as a popular DTM technique, disregard the distribution of heat generation among routers, resulting in more throttled routers as well as long packet delays in throttled processing elements. This study proposes a reactive routing algorithm for 3D NoCs to (i) dynamically detour packets from hot zones containing throttled routers and (ii) minimise the amount of router throttling required in the network. The proposed routing algorithm defines two virtual networks to enhance the path diversity for packets in each layer of 3D NoCs. The selection of diverse paths distributes heat generation to alleviate the thermal variance. The proposed routing algorithm is analysed using the turn model to establish deadlock freedom. The Access Noxim simulator is also used to evaluate the performance and the thermal behaviour of the proposed routing algorithm under a variety of conditions. Results show that the proposed routing algorithm improves temperature variance by 9–39% and reduces the number of throttled routers by 16–86%, at the cost of one extra virtual channel per physical channel in the XY-plane.
- Author(s): Fatma Elzahra Sayadi ; Marwa Chouchene ; Haithem Bahri ; Randa Khemiri ; Mohamed Atri
- Source: IET Computers & Digital Techniques, Volume 13, Issue 1, p. 20–27
- DOI: 10.1049/iet-cdt.2017.0149
- Type: Article
As video processing technologies continue to grow in complexity and image resolution faster than central processing unit (CPU) performance, data-parallel computing methods will become even more important. In fact, the high-performance, data-parallel architecture of modern graphics processing units (GPUs) can reduce execution times by orders of magnitude. However, creating an optimal GPU implementation requires not only converting sequential implementations of algorithms into parallel ones but, more importantly, careful balancing of the GPU resources. It also requires an understanding of the bottlenecks and deficiencies caused by memory latency and computation. The challenge is even greater when an implementation exceeds the GPU resources. In this study, the authors discuss the parallelisation and memory optimisation strategies of a computer vision application for motion estimation using the NVIDIA compute unified device architecture (CUDA). The study addresses optimisation techniques for algorithms that exceed the GPU's computation or memory resources on the CUDA architecture. The proposed implementation reveals a substantial improvement in both speed-up (SU) and peak signal-to-noise ratio (PSNR). Indeed, the implementation is up to 50 times faster than the CPU counterpart. It also provides an increase in PSNR of the coded test sequence of up to 8 dB.
- Author(s): Mohammad Gh. Alfailakawi ; Mohammed El-Shafei ; Imtiaz Ahmad ; Ayed Salman
- Source: IET Computers & Digital Techniques, Volume 13, Issue 1, p. 28–37
- DOI: 10.1049/iet-cdt.2017.0164
- Type: Article
Cuckoo search (CS) is a recent swarm intelligence-based meta-heuristic optimisation algorithm that has shown excellent results for a broad class of optimisation problems in diverse fields. However, CS is generally compute-intensive and slow when implemented in software, requiring a large number of fitness function evaluations to obtain acceptable solutions. In this study, the authors present a problem-specific, parallel, pipelined field-programmable gate array (FPGA)-based accelerator to reduce execution time when solving complex optimisation problems. Experiments conducted on a large number of well-known benchmark functions revealed that the hardware approach offers a promising average speedup of 75× and 53× over software and GPU implementations, respectively.
- Author(s): Chandan Bandyopadhyay ; Rakesh Das ; Anupam Chattopadhyay ; Hafizur Rahaman
- Source: IET Computers & Digital Techniques, Volume 13, Issue 1, p. 38–48
- DOI: 10.1049/iet-cdt.2017.0097
- Type: Article
Reversible logic synthesis is one of the best-suited intermediate steps for synthesising Boolean functions on quantum technologies. For a given Boolean function, there are multiple possible intermediate representations (IRs), based on functional abstraction (e.g. truth table, decision diagrams) or circuit abstraction (e.g. binary decision diagram (BDD), and-inverter graph (AIG) and majority inverter graph (MIG)). These IRs play an important role in building circuits, as the choice of an IR directly affects the cost parameters of the design. In this work, the authors analyse the effects of different graph-based IRs (BDD, AIG and MIG) and their usability in making efficient circuit realisations. Although applications of BDDs as an IR to represent large functions have already been studied, here the authors demonstrate a synthesis scheme that takes AIG and MIG as IRs and make a comprehensive comparative analysis of all three graph-based IRs. In the experimental evaluation, it is observed that for small functions BDD gives more compact circuits than the other two IRs, but as the input size increases, MIG as the IR yields substantial improvements in cost parameters compared with BDD, reducing quantum cost by 39% on average. Along with the experimental results, a detailed analysis of the different IRs is also included to assess their ease of use in designing circuits.
- Author(s): Somesh Kumar and Rohit Sharma
- Source: IET Computers & Digital Techniques, Volume 13, Issue 1, p. 49–56
- DOI: 10.1049/iet-cdt.2018.5067
- Type: Article
High-speed metal interconnects play a significant role in the on-chip network system, as the network performance largely depends on the behaviour of these interconnects. Variability in wire properties due to surface roughness directly impacts the overall system performance. In this study, the authors evaluate the effects of interconnect surface roughness on deeply scaled on-chip interconnects (i.e. 22, 13, and 7 nm) in the context of the network on chip (NoC). The critical roughness parameters of the interconnects in the NoC are extracted by atomic force microscopy analysis of fabricated thin sheets of copper. Their analysis shows that in a 5 × 5 NoC with 25 cores on a 2.5 mm × 2.5 mm die, rough interconnects can lead to a significant penalty on energy budget, bandwidth density, bit error rate, figure of merit and total system throughput. It also shows that this penalty increases further for interconnection lines at advanced technology nodes. They simulate the bodytrack workload of the PARSEC benchmark using the Tejas Simulator to show the penalty on latency and energy of the architecture due to the rough interconnects. Their study attempts to highlight, qualitatively and quantitatively, the impact of interconnect surface roughness on the design of power-aware NoCs.
Article titles in this issue, in page order:
- State retained dual-Vth feedback sleeper-stack for leakage reduction (p. 1–10)
- ON–OFF: a reactive routing algorithm for dynamic thermal management in 3D NoCs (p. 11–19)
- CUDA memory optimisation strategies for motion estimation (p. 20–27)
- FPGA-based implementation of cuckoo search (p. 28–37)
- Design and synthesis of improved reversible circuits using AIG- and MIG-based graph data structures (p. 38–48)
- Investigating the role of interconnect surface roughness towards the design of power-aware network on chip (p. 49–56)
Most cited content for this Journal
-
High-performance elliptic curve cryptography processor over NIST prime fields
- Author(s): Md Selim Hossain ; Yinan Kong ; Ehsan Saeedi ; Niras C. Vayalil
- Type: Article
-
Majority-based evolution state assignment algorithm for area and power optimisation of sequential circuits
- Author(s): Aiman H. El-Maleh
- Type: Article
-
Scalable GF(p) Montgomery multiplier based on a digit–digit computation approach
- Author(s): M. Morales-Sandoval and A. Diaz-Perez
- Type: Article
-
Fabrication and characterisation of Al gate n-metal–oxide–semiconductor field-effect transistor, on-chip fabricated with silicon nitride ion-sensitive field-effect transistor
- Author(s): Rekha Chaudhary ; Amit Sharma ; Soumendu Sinha ; Jyoti Yadav ; Rishi Sharma ; Ravindra Mukhiya ; Vinod K. Khanna
- Type: Article
-
Adaptively weighted round-robin arbitration for equality of service in a many-core network-on-chip
- Author(s): Hanmin Park and Kiyoung Choi
- Type: Article