Online ISSN: 1751-861X
Print ISSN: 1751-8601
IET Computers & Digital Techniques
Volume 6, Issue 6, November 2012
- Author(s): A. Mitra and S. Chattopadhyay
- Source: IET Computers & Digital Techniques, Volume 6, Issue 6, p. 353–361
- DOI: 10.1049/iet-cdt.2011.0051
- Type: Article
This study presents a particle swarm optimisation (PSO)-based approach to optimise the node count and path length of the binary decision diagram (BDD) representation of a Boolean function. The optimisation is achieved by identifying a good ordering of the input variables of the function, which affects the structure of the resulting BDD. Both node count and longest path length of the shared BDDs using the identified input ordering are found to be superior to existing results. The improvements are more prominent for larger benchmarks. The PSO parameters have been tuned suitably to explore a large search space within a reasonable computation time.
- Author(s): A.D. Brown ; D.J.D. Milton ; A.J. Rushton ; P.R. Wilson
- Source: IET Computers & Digital Techniques, Volume 6, Issue 6, p. 362–369
- DOI: 10.1049/iet-cdt.2012.0006
- Type: Article
Behavioural synthesis is the process of automatically translating an abstract specification to a physical realisation – silicon. The endpoints of this process are accelerating apart (behavioural descriptions become more abstract, DSM silicon becomes less willing to behave as Boolean circuits) but there is still work outstanding in the middle ground. Recursion allows the elegant expression of complicated systems, and is supported by many languages (software and hardware). The electronic design automation (EDA) tool designers’ task is to support the semantics of a language (both simulation and synthesis). Although recursive descriptions can always be re-cast into non-recursive iterative forms, if a language supports a construct, a user should be able to utilise it (the authors are not offering any opinion on the relative wisdom of using recursion or iteration). The authors describe the problems and solutions of supporting the semantics of recursion (single/multiple, direct/arbitrarily indirect) in synthesis. The hardware synthesised can be smaller and faster than that obtained by reformulating the description. It is dangerous to conclude too much from this: recursion requires a stack and a heap (plus managers). In software, these are taken for granted (‘free’ resources that do not feature in footprint metrics); in hardware, every resource needed must be explicitly created.
- Author(s): T. Lang and A. Nannarelli
- Source: IET Computers & Digital Techniques, Volume 6, Issue 6, p. 370–371
- DOI: 10.1049/iet-cdt.2012.0090
- Type: Article
- Author(s): S. Gao ; D. Al-Khalili ; N. Chabini ; P. Langlois
- Source: IET Computers & Digital Techniques, Volume 6, Issue 6, p. 372–383
- DOI: 10.1049/iet-cdt.2011.0146
- Type: Article
In this study, asymmetric non-pipelined large size unsigned and signed multipliers are implemented using symmetric and asymmetric embedded multipliers, look-up tables and dedicated adders in field programmable gate arrays (FPGAs). Decompositions of the operands are performed for the efficient use of the embedded blocks. Partial products are organised in various configurations, and the additions of the products are realised in an optimised manner. The additions used in the implementation of the multiplication include compressor-based, Delay-Table and Ternary-adder-based approaches. These approaches have led to the minimisation of the total critical path delay with reduced utilisation of FPGA resources. The asymmetric multipliers were implemented in Xilinx FPGAs using 18×18-bit and 25×18-bit embedded signed multipliers. Implementation results demonstrate an improvement of up to 32% in delay and up to 37% in the number of embedded blocks compared with the performance of designs generated by commercial synthesis tools.
- Author(s): M. Hosseinabady and J.L. Nunez-Yanez
- Source: IET Computers & Digital Techniques, Volume 6, Issue 6, p. 384–395
- DOI: 10.1049/iet-cdt.2012.0001
- Type: Article
Early system modelling is an essential tool to accelerate software development, architectural analysis and hardware verification in complex many-core system-on-chips (SoCs). Transaction level modelling (TLM) offers a higher level of abstraction than register transfer level (RTL) and can be used for early system modelling. Maintaining simulation speed with the right accuracy is a major challenge, and this paper proposes SystemC-based architectural modelling techniques that extend TLM to deliver faster simulation models for many-core systems. The proposed approach considers a micro-scheduler for large modules (in the sense of SystemC modules) to locally manage all events in the module. Exploiting this micro-scheduler along with function object and coroutine concepts, the authors propose a lightweight thread process that significantly reduces the context switching overhead among the different processes. Additionally, the micro-scheduler allows some processes to be run ahead of simulation time. The proposed techniques are applied to the model of a very large network-on-chip (NoC) formed by thousands of cores, stressing the simulation capabilities of the host computer and operating system. The experimental results demonstrate that the model can run successfully and exhibits up to 93% improvement in simulation speed compared to traditional SystemC-based modelling.
- Author(s): J.Y. Hur ; K. Goossens ; L. Mhamdi ; M.A. Wahlah
- Source: IET Computers & Digital Techniques, Volume 6, Issue 6, p. 396–405
- DOI: 10.1049/iet-cdt.2011.0169
- Type: Article
It is well known that any logical functionality can be implemented using the reconfigurability in field-programmable gate arrays (FPGAs). However, this reconfigurability comes at the price of reduced functional performance, increased cost and increased configuration overheads. Hardwiring the interconnect fabric is gaining notice as an alternative solution to tackle these problems. In this article, first, the authors show that hardwired built-in crossbars can improve the performance of inter-processor communication. The authors conduct an analysis of functional performance, cost and configuration cost for soft and hard crossbar (SBAR and HBAR) interconnects. A queuing model is applied to compare soft and hard interconnects. A motion JPEG (MJPEG) case study suggests that HBAR achieves significantly better throughput at lower cost than SBAR. Second, the authors present the effectiveness of the hardwired network-on-chip (NoC) in FPGAs. Considering the Æthereal NoC, an analysis is conducted to compare hard and soft NoCs. The analysis, implementation and simulation indicate that the hardwired networks perform significantly better than soft networks.
- Author(s): A. Valaee and A.J. Al-Khalili
- Source: IET Computers & Digital Techniques, Volume 6, Issue 6, p. 406–413
- DOI: 10.1049/iet-cdt.2012.0038
- Type: Article
SRAMs in nanoscale CMOS technology suffer from a plethora of design challenges such as increased process variation, increased leakage current and variation in the cell current that threaten the reliability of the sensing scheme. These issues, coupled with the continuous increase in SRAM size, require additional techniques and treatments such as read-assist techniques to ensure fast and reliable read operation. In this study, the authors address these concerns and propose a novel read-assist sensing scheme. The circuit is simulated using Spectre in 65 nm CMOS technology. Simulation results showed an increased sensing speed, lower power dissipation and enhanced SRAM dynamic cell stability. A complete comparison is made between the proposed scheme, the conventional circuit and another state-of-the-art design, which shows speed improvements of 55.34% and 66.01% and power reductions of 21.33% and 89.09% with respect to the conventional sense amplifier and the referenced scheme, respectively. These enhancements come at the expense of negligible area overhead. Also, the proposed scheme enables one to reduce the cell's VDD by 227 and 345 mV for the same operating frequency with respect to the conventional and referenced circuits, respectively. This results in leakage power reductions of 19.7% and 30%; leakage constitutes a considerable portion of overall power dissipation in nanoscale SRAMs.
- Author(s): C. Desmouliers ; E. Oruklu ; S. Aslan ; J. Saniie ; F.M. Vallina
- Source: IET Computers & Digital Techniques, Volume 6, Issue 6, p. 414–425
- DOI: 10.1049/iet-cdt.2011.0156
- Type: Article
In this study, an image and video processing platform (IVPP) based on field programmable gate arrays (FPGAs) is presented. This hardware/software co-design platform has been implemented on a Xilinx Virtex-5 FPGA using high-level synthesis and can be used to realise and test complex algorithms for real-time image and video processing applications. The video interface blocks are written in register transfer languages and can be configured using the MicroBlaze processor, allowing the support of multiple video resolutions. The IVPP provides the required logic to easily plug in the generated processing blocks without modifying the front-end (capturing video data) and the back-end (displaying processed output data). The IVPP can be a complete hardware solution for a broad range of real-time image/video processing applications including video encoding/decoding, surveillance, detection and recognition.
Variable ordering for shared binary decision diagrams targeting node count and path length optimisation using particle swarm technique
Behavioural synthesis utilising recursive definitions
Comments on ‘Improving the speed of decimal division’
Asymmetric large size multipliers with optimised FPGA resource utilisation
Fast and low overhead architectural transaction level modelling for large-scale network-on-chip simulation
Comparative analysis of soft and hard on-chip interconnects for field-programmable gate arrays
High-performance low-power sensing scheme for nanoscale SRAMs
Image and video processing platform for field programmable gate arrays using a high-level synthesis
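The first abstract in this issue rests on the fact that the input variable ordering changes the structure, and hence the size, of a BDD. The following is a minimal Python sketch of that effect only, not the authors' PSO method: it counts the internal nodes of a reduced ordered BDD by counting distinct cofactors level by level. The function name `robdd_node_count` and the textbook test function `example_function` are assumptions introduced for illustration, and the brute-force truth table limits it to small functions.

```python
from itertools import product


def robdd_node_count(f, order):
    """Count internal nodes of the reduced ordered BDD of f under `order`.

    f     : callable taking a list of 0/1 values indexed by variable number
    order : list of variable numbers, top of the BDD first (hypothetical API)
    """
    n = len(order)
    # Build the truth table so that order[0] is the most significant index bit.
    table = []
    for bits in product((0, 1), repeat=n):
        assignment = [0] * n
        for position, var in enumerate(order):
            assignment[var] = bits[position]
        table.append(int(bool(f(assignment))))

    # Each level holds the distinct cofactors obtained by fixing the
    # variables above it; a cofactor needs a node only if it depends on
    # the variable tested at that level (its two halves differ).
    level = {tuple(table)}
    nodes = 0
    for _ in range(n):
        next_level = set()
        for sub in level:
            half = len(sub) // 2
            low, high = sub[:half], sub[half:]
            if low != high:
                nodes += 1
            next_level.update((low, high))
        level = next_level
    return nodes


def example_function(a):
    # Illustrative function: x0*x1 + x2*x3 + x4*x5
    return (a[0] and a[1]) or (a[2] and a[3]) or (a[4] and a[5])


if __name__ == "__main__":
    print(robdd_node_count(example_function, [0, 1, 2, 3, 4, 5]))  # 6
    print(robdd_node_count(example_function, [0, 2, 4, 1, 3, 5]))  # 14
```

Keeping the paired variables adjacent gives 6 nodes, while interleaving them gives 14 for the same function. A search heuristic could use a count of this kind as part of its cost function; how the article combines node count and path length for shared BDDs is not reproduced here.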
Most cited content for this Journal
-
High-performance elliptic curve cryptography processor over NIST prime fields
- Author(s): Md Selim Hossain ; Yinan Kong ; Ehsan Saeedi ; Niras C. Vayalil
- Type: Article
-
Majority-based evolution state assignment algorithm for area and power optimisation of sequential circuits
- Author(s): Aiman H. El-Maleh
- Type: Article
-
Scalable GF(p) Montgomery multiplier based on a digit–digit computation approach
- Author(s): M. Morales-Sandoval and A. Diaz-Perez
- Type: Article
-
Fabrication and characterisation of Al gate n-metal–oxide–semiconductor field-effect transistor, on-chip fabricated with silicon nitride ion-sensitive field-effect transistor
- Author(s): Rekha Chaudhary ; Amit Sharma ; Soumendu Sinha ; Jyoti Yadav ; Rishi Sharma ; Ravindra Mukhiya ; Vinod K. Khanna
- Type: Article
-
Adaptively weighted round-robin arbitration for equality of service in a many-core network-on-chip
- Author(s): Hanmin Park and Kiyoung Choi
- Type: Article