Online ISSN
1751-861X
Print ISSN
1751-8601
IET Computers & Digital Techniques
Volume 6, Issue 2, March 2012
-
- Author(s): S. Mohamed Saeed and O. Sinanoglu
- Source: IET Computers & Digital Techniques, Volume 6, Issue 2, p. 69–77
- DOI: 10.1049/iet-cdt.2011.0104
- Type: Article
Scan architectures with compression support have remedied the test time and data volume problems of today's sizable designs. On-chip compression of responses enables the transmission of a reduced volume of signature information to the ATE, delivering test data volume savings, while it engenders the challenge of retaining test quality. In particular, unknown bits (x's) in responses corrupt other response bits upon being compacted altogether, masking their observation, and hence preventing the manifestation of the fault effects they possess. In this work, we propose the design and utilisation of a response compactor that can adapt to the varying density of x's in responses. In the proposed design, fan-out of scan chains to XOR trees within the compactor can be adjusted per pattern/slice so as to minimise the corruption impact of x's. A theoretical framework is developed to guide the cost-effective synthesis of a multi-modal compactor that can deliver x-mitigation capabilities in every mode it operates. Adaptiveness of the proposed response compactor enhances the observability of scan cells cost-effectively, where observability enhancements can be tailored in a fault model-dependent or -independent manner, either way improving test quality and/or test costs.
- Author(s): I. Pomeranz
- Source: IET Computers & Digital Techniques, Volume 6, Issue 2, p. 78–85
- DOI: 10.1049/iet-cdt.2011.0097
- Type: Article
In a broadside test, a scan-in operation is followed by two functional clock cycles where primary input vectors, denoted by v0 and v1, are applied. Because of tester limitations that prevent the primary input vectors from being changed at speed, broadside tests are computed under the constraint where v0=v1. This results in a loss of delay fault coverage. This study develops a fast procedure for identifying transition faults that are undetectable by broadside tests under the constraint v0=v1. Faults that are undetectable because of this constraint are undetectable because of tester limitations and not because of the logic in the circuit. These faults may be able to affect the circuit during functional operation, when the primary input vectors are unconstrained. In this case the faults are important to detect. A fast procedure for identifying undetectable transition faults under the constraint v0=v1 provides a quantitative measure of the effect of this constraint on the achievable fault coverage without performing test generation. If it turns out that the effect on fault coverage is unacceptable, other solutions may be used without first performing test generation.
- Author(s): W.-H. Hu ; C.-Y. Chen ; J.H. Bahn ; N. Bagherzadeh
- Source: IET Computers & Digital Techniques, Volume 6, Issue 2, p. 86–94
- DOI: 10.1049/iet-cdt.2010.0177
- Type: Article
Low-density parity check (LDPC) codes can achieve performance close to the Shannon limit and they have been widely adopted for various communication standards. However, the irregular message exchange pattern of LDPC codes is a major challenge for decoder design. Additionally, there is a great demand for integrating diverse applications onto a single system where a flexible, scalable and efficient implementation of LDPC decoding is highly preferable. With the enormous computing power provided by integrating many processors on a single chip in advanced process technology, a multiprocessor platform is regarded as a promising solution to tackle these design challenges. In this work, we devised a parallelisation scheme to implement LDPC decoding on a multiprocessor platform. By performing LDPC decoding in a distributed and cooperative way, the memory bottleneck commonly seen in LDPC decoder design is eliminated. Moreover, we used a graph spectra-based mapping algorithm to reduce heavy message exchanges among processors during the decoding process. Compared to the sequential mapping strategy, our approach decreased the amount of inter-processor communication by up to 48%/45%/40% for 16/32/64-processor platforms, respectively. Cycle-accurate simulation results from various LDPC codes demonstrate that desirable scalability and speedups are obtained by our approach.
- Author(s): J. Vasiljevic and A.G. Ye
- Source: IET Computers & Digital Techniques, Volume 6, Issue 2, p. 95–104
- DOI: 10.1049/iet-cdt.2010.0167
- Type: Article
Fractional motion estimation (FME) is an important part of the H.264/AVC video encoding standard. The algorithm can significantly increase the compression ratio of video encoders while preserving high video quality. The full-search FME algorithm, however, is computationally expensive and can account for over 45% of the total motion estimation process. To maximise the performance and efficiency of FME implementations on field-programmable gate arrays (FPGAs), one needs to efficiently exploit the inherent parallelism in the algorithm. The authors investigate the scalability of the full-search FME algorithm on FPGAs and implement six scaled versions of the algorithm on Xilinx Virtex-5 FPGAs. The authors found that scaling the algorithm vertically within a 4×4 sub-block is more efficient than scaling horizontally across several sub-blocks. It is shown that, with four reference frames, the best vertically scaled design can achieve 96 frames-per-second (fps) performance while encoding full 1920×1088 progressive HDTV video, and the design only consumes 25.5 K LUTs and 28.7 K registers.
- Author(s): F. Duhem ; F. Muller ; P. Lorenzini
- Source: IET Computers & Digital Techniques, Volume 6, Issue 2, p. 105–113
- DOI: 10.1049/iet-cdt.2011.0033
- Type: Article
Partial reconfiguration suffers from low performance and thus its use is limited when the reconfiguration overhead is too high compared to the task execution time. To overcome this issue, the authors present a fast internal configuration access port (ICAP) controller, FaRM, providing high-speed configuration and easy-to-use readback capabilities, reducing configuration overhead as much as possible. In order to enhance performance, FaRM uses techniques such as master accesses, ICAP overclocking, bitstream pre-load into the controller and a bitstream compression technique, Offset-run length encoding (RLE), which is an improvement on the RLE algorithm. Combining these approaches allows us to achieve an ICAP theoretical throughput of 800 MB/s at 200 MHz. In order to complete our approach, we provide a system-level cost model for the reconfiguration overhead that can be used during the early stages of development. The authors tested their approach on an Advanced Encryption Standard (AES) encryption/decryption architecture.
- Author(s): S. Tuuna ; J. Isoaho ; H. Tenhunen
- Source: IET Computers & Digital Techniques, Volume 6, Issue 2, p. 114–124
- DOI: 10.1049/iet-cdt.2010.0060
- Type: Article
In this study, the authors present an optimisation method based on analytical resistance, inductance and capacitance (RLC) models for simultaneous reduction of both functional crosstalk noise and power supply noise caused by on-chip buses. This is achieved by intentional skewing of the relative timing of adjacent wires. The method is applicable to any number of bus wires and it takes into account both capacitive and inductive coupling between wires. The authors model the effect of skewing on both functional crosstalk in a distributed RLC bus and the power noise in the surrounding RLC power distribution network. The model is verified by comparing it with HSPICE in 65 nm technology, with the average error being 1.4%. The capability of the method in reducing problematic long-range inductive crosstalk noise is demonstrated in a case study where the maximum crosstalk noise is reduced from 0.20 to 0.05 V. Implementation and the use of the method in combination with other crosstalk reduction methods and power supply noise reduction methods are presented. The influence of the number of different skewing times is analysed.
- Author(s): F. Burns ; A. Bystrov ; A. Koelmans ; A. Yakovlev
- Source: IET Computers & Digital Techniques, Volume 6, Issue 2, p. 125–135
- DOI: 10.1049/iet-cdt.2010.0042
- Type: Article
A new design flow for security is presented. Cryptographic circuit specifications are first refined and then mapped to a secure power-balanced library consisting of novel mixed 1-of-2 and 1-of-4 components based on N-nary logic. Logic optimisation tools are then applied to generate secure synchronous circuits for layout generation. The circuits generated are more efficient than balanced circuits generated by alternative techniques. A new method is presented for evaluating the security of such circuits. A security metric is introduced, which is based on the common selection function that is widely used in differential power analysis (DPA) attacks and a correlation measure similar to the one used in correlation power analysis (CPA) attacks. The metric enables the construction of a library of robust cryptographic components including S-boxes that are more resistant to attack.
- Author(s): S.K. Srinivasan ; Y. Cai ; K. Sarker
- Source: IET Computers & Digital Techniques, Volume 6, Issue 2, p. 136–152
- DOI: 10.1049/iet-cdt.2010.0023
- Type: Article
A formal verification procedure to check the correctness of synchronous elastic pipelined systems against their synchronous specification systems was developed. The procedure can deal with elastic systems that incorporate early evaluation. Note that the goal of the verification procedure is not to establish the correctness of the algorithm for synthesising elastic circuits, but instead to find bugs and formally prove the correctness of elasticised designs. Dataflow through elastic architectures is complicated by the insertion of any number of elastic buffers in any place in the design. The authors introduce elastic token-flow diagrams, which are used to track the flow of data in elastic architectures, and provide a method to construct such diagrams. The authors also develop a highly automated and systematic procedure based on elastic token-flow diagrams that computes functions that map states of elastic systems to states of the synchronous parent systems. Such functions, known as refinement maps, are used to compare behaviours of elastic and synchronous systems and hence prove their equivalence. The effectiveness of the methods is demonstrated by verifying 14 elastic pipelined processor models, eight of which incorporate early evaluation.
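To make the idea of a refinement map in the abstract above more concrete, here is a minimal sketch in Python. Everything in it is a hypothetical toy and not the authors' procedure: the specification is assumed to be a modulo-4 counter, the "elastic" implementation is assumed to have a one-entry buffer whose input may be a bubble, and the names `spec_step`, `impl_step`, `refinement_map` and `check_refinement` are invented for illustration. The check only covers the safety side (each implementation step maps to a specification step or to a stutter) and says nothing about liveness.

```python
from itertools import product

MOD = 4  # assumed size of the toy counter state space


def spec_step(s):
    """Synchronous specification: the counter advances every cycle."""
    return (s + 1) % MOD


def impl_step(state, valid):
    """Toy elastic implementation (hypothetical).

    state = (count, pending): 'pending' is 1 when the one-entry elastic
    buffer holds a token. A buffered token is committed on this cycle,
    and a new token enters the buffer only if the input is valid
    (otherwise the cycle carries a bubble).
    """
    count, pending = state
    if pending:
        count = (count + 1) % MOD
    return (count, 1 if valid else 0)


def refinement_map(state):
    """Map an implementation state to a specification state by
    counting the in-flight token as already committed."""
    count, pending = state
    return (count + pending) % MOD


def check_refinement():
    """Exhaustively check every implementation transition: its image
    under the map must be a specification step or a stutter."""
    for count, pending, valid in product(range(MOD), (0, 1), (False, True)):
        before = (count, pending)
        after = impl_step(before, valid)
        mapped_before = refinement_map(before)
        mapped_after = refinement_map(after)
        if mapped_after not in (mapped_before, spec_step(mapped_before)):
            return False
    return True
```

In this toy, a cycle with a valid input corresponds to one specification step and a bubble cycle corresponds to a stutter, which is why the map has to account for the token still sitting in the buffer.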
Multi-modal response compaction adaptive to x-density variation
Undetectable transition faults under broadside tests with constant primary input vectors
Parallel low-density parity check decoding on a network-on-chip-based multiprocessor platform
Effect of scaling on the area and performance of the H.264/AVC full-search fractional motion estimation algorithm on field-programmable gate arrays
Reconfiguration time overhead on field programmable gate arrays: reduction and cost model
Skewing-based method for reduction of functional crosstalk and power supply noise caused by on-chip buses
Design and security evaluation of balanced 1-of-n circuits
Refinement-based verification of elastic pipelined systems
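The first abstract in this issue (multi-modal response compaction) notes that unknown bits (x's) corrupt other response bits when they are compacted together, and that adjusting how scan chains fan out to XOR trees can limit the damage. The sketch below illustrates only that general effect with three-valued XOR. The slice values, the tree groupings and the function names (`xor3`, `compact`, `observable_chains`) are assumptions made for illustration and are not taken from the paper.

```python
X = "x"  # marker for an unknown response bit


def xor3(a, b):
    """Three-valued XOR: any unknown input makes the output unknown."""
    if a == X or b == X:
        return X
    return a ^ b


def compact(slice_bits, trees):
    """Compact one scan slice.

    slice_bits: one value per scan chain (0, 1 or X).
    trees: a list of XOR trees, each given as a list of chain indices.
    Returns one output value per tree.
    """
    outputs = []
    for tree in trees:
        acc = 0
        for chain in tree:
            acc = xor3(acc, slice_bits[chain])
        outputs.append(acc)
    return outputs


def observable_chains(slice_bits, trees):
    """Chains that feed at least one XOR tree whose output is known."""
    seen = set()
    for tree, out in zip(trees, compact(slice_bits, trees)):
        if out != X:
            seen.update(tree)
    return seen


# Hypothetical slice: chain 2 captures an unknown value.
example_slice = [1, 0, X, 1]
single_tree = [[0, 1, 2, 3]]   # one tree: the x masks every chain
two_trees = [[0, 1], [2, 3]]   # split trees: the x is confined to one output
```

With the single tree, `compact(example_slice, single_tree)` yields only an unknown output, so no chain is observed. With the split configuration, chains 0 and 1 stay observable, at the cost of one more compactor output.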
Most cited content for this Journal
-
High-performance elliptic curve cryptography processor over NIST prime fields
- Author(s): Md Selim Hossain ; Yinan Kong ; Ehsan Saeedi ; Niras C. Vayalil
- Type: Article
-
Majority-based evolution state assignment algorithm for area and power optimisation of sequential circuits
- Author(s): Aiman H. El-Maleh
- Type: Article
-
Scalable GF(p) Montgomery multiplier based on a digit–digit computation approach
- Author(s): M. Morales-Sandoval and A. Diaz-Perez
- Type: Article
-
Fabrication and characterisation of Al gate n-metal–oxide–semiconductor field-effect transistor, on-chip fabricated with silicon nitride ion-sensitive field-effect transistor
- Author(s): Rekha Chaudhary ; Amit Sharma ; Soumendu Sinha ; Jyoti Yadav ; Rishi Sharma ; Ravindra Mukhiya ; Vinod K. Khanna
- Type: Article
-
Adaptively weighted round-robin arbitration for equality of service in a many-core network-on-chip
- Author(s): Hanmin Park and Kiyoung Choi
- Type: Article