IEE Proceedings - Computers and Digital Techniques
Volume 151, Issue 6, November 2004
- Author(s): D.H. Green
- Source: IEE Proceedings - Computers and Digital Techniques, Volume 151, Issue 6, p. 385–390
- DOI: 10.1049/ip-cdt:20041115
- Type: Article
The linear complexity of m-phase power residue sequences is investigated for the case when m is composite. For each factor of m, the linear complexity and the characteristic polynomial of the shortest linear feedback shift register that generates this version of the sequence can be deduced, and these results can then be combined using the Chinese remainder theorem to derive the m-phase values. These values are shown to depend on the categories of the length of the sequence computed modulo each factor of m, rather than on the category of the length modulo m itself. For a given length, the highest values of linear complexity result from constructing the sequences using those values of the primitive element which lead to non-zero categories for each factor of m.
- Author(s): V. Androutsopoulos ; D.M. Brookes ; T.J.W. Clarke
- Source: IEE Proceedings - Computers and Digital Techniques, Volume 151, Issue 6, p. 391–401
- DOI: 10.1049/ip-cdt:20041100
- Type: Article
A system-on-a-chip is an interconnection of different pre-verified IP hardware blocks, which communicate using complex protocols. The integration of IP blocks requires some glue logic to interface otherwise incompatible datapaths. This glue logic is called a protocol converter, and its manual design proves to be a tedious and time-consuming task. Automatic synthesis is therefore important, but for optimal system-level design it is necessary to consider not just the correctness, but also the quality (in terms of bandwidth and latency of data transfer) of the converter. A good solution to this problem will allow greater use of protocol-level abstraction as a design tool in system design and synthesis. Results are presented on automatic synthesis of a converter between two protocols. It is shown how converter logic which is bandwidth-optimal can be synthesised for datapaths with an arbitrary number of data ports, each of which has arbitrary-size first-in first-out (FIFO) storage. An extension of the product FSM converter synthesis algorithm to include FIFO datapaths is presented. In addition, the converter bandwidth is identified as a mean cycle graph problem, which is solved using maximum mean cycle graph algorithms.
- Author(s): C. McIvor ; M. McLoone ; J.V. McCanny
- Source: IEE Proceedings - Computers and Digital Techniques, Volume 151, Issue 6, p. 402–408
- DOI: 10.1049/ip-cdt:20040791
- Type: Article
Modified Montgomery multiplication and associated RSA modular exponentiation algorithms and circuit architectures are presented. These modified multipliers use carry save adders (CSAs) to perform large word length additions. These have the attraction that, when repeatedly used to perform RSA modular exponentiation, the (carry save) format of the output words is compatible with that required by the multiplier inputs. This avoids the repeated interim output/input format conversion needed when previously reported Montgomery multipliers are used for RSA modular exponentiation. Thus, the lengthy and costly conventional additions required at each stage are avoided. As a consequence, the critical path delay and, hence, the data throughput rate of the resulting Montgomery multiplier architectures are also word length independent. The approach presented is based on a reformulation of the solution to modular multiplication within the context of RSA exponentiation. Two algorithmic variants are presented, one based on a five-to-two CSA and the other on a four-to-two CSA plus multiplexer. The practical application of the approach has been demonstrated by using it to design special purpose RSA processing units with 512-bit and 1024-bit key sizes. The resulting RSA units exhibit the highest data rates reported in the literature to date, reflecting the very low and word length independent critical path delay achieved.
- Author(s): J.J. Rooney ; J.G. Delgado-Frias ; D.H. Summerville
- Source: IEE Proceedings - Computers and Digital Techniques, Volume 151, Issue 6, p. 409–416
- DOI: 10.1049/ip-cdt:20041014
- Type: Article
A study of a prefix routing cache for Internet IP routing is presented. An output port assignment requires one cache memory access when the assignment is found in cache. The cache array is divided into sets that are of variable size; all entries within a set have the same prefix size. The cache is based on a ternary content addressable memory that matches ones, zeroes and don't care values. Our study shows that an associative ternary cache provides an output port at the speed of one memory access with a very high hit rate. For an 8K entry cache the hit rate ranges from 97.62 to 99.67% on traces of 0.2 to 3.5 million addresses. A port error occurs when the port selected by the cache differs from the port that would have been selected from the routing table. A sampling technique is introduced that reduces the worst port error rate by an order of magnitude (from 0.52 to 0.05%).
- Author(s): P.-A. Hsiung ; T.-Y. Lee ; J.-M. Fu ; W.-B. See
- Source: IEE Proceedings - Computers and Digital Techniques, Volume 151, Issue 6, p. 417–434
- DOI: 10.1049/ip-cdt:20041102
- Type: Article
With the rapid escalation in design complexity of real-time embedded software, application frameworks have become an almost indispensable tool because they greatly ease the work of a designer by performing tedious tasks on the designer's behalf and by reusing semi-complete application code. To ensure code quality and reliability, computer-aided analysis is also performed for the generated application software in some frameworks. However, when the target is real-time embedded systems, the correctness of the software in terms of satisfying all user-given real-time and embedded constraints becomes a primary objective for such frameworks. To guarantee correctness, formal verification in the form of model checking is a viable solution due to its full automation capability. Nevertheless, little is known from either the existing literature or industrial experience on how formal verification can be integrated into an object-oriented application framework, whose primary purpose was previously only to design and generate application software. This work contributes to the state of the art by showing how a design framework and a verification framework can be integrated. Three main issues are tackled: (i) what to verify; (ii) when to verify; and (iii) how to verify. As a solution to these three issues the authors propose a mapping from the object-oriented model to a formal model, a schedule-verify-map strategy and a compositional verification methodology, respectively. These have been implemented in a component-based framework and experiments performed to illustrate their feasibility. Due to the incorporation of industry de-facto standards such as real-time unified modelling language and real-time Java in the proposed techniques, it should now be possible for an engineer to gain access to theoretically proven formal verification technologies that would otherwise be considered inaccessible to an engineer unskilled in verification techniques.
- Author(s): A. Schmid ; Y. Leblebici
- Source: IEE Proceedings - Computers and Digital Techniques, Volume 151, Issue 6, p. 435–447
- DOI: 10.1049/ip-cdt:20041099
- Type: Article
The circuit-level hardware realisation of several multiple-valued logic functions using the capacitive threshold logic design style is presented. The generic design approach for multiple-input, multiple-output and multiple-level transfer functions is shown. SPICE simulations of complex operators demonstrate correct operation, which qualifies the proposed circuits for integration into larger multiple-valued logic systems. An analysis of noise margin figures and comparisons with previously published circuit examples are provided.
- Author(s): K. Maharatna ; A. Troya ; S. Banerjee ; E. Grass
- Source: IEE Proceedings - Computers and Digital Techniques, Volume 151, Issue 6, p. 448–456
- DOI: 10.1049/ip-cdt:20041107
- Type: Article
The authors propose a coordinate rotation digital computer (CORDIC) rotator algorithm that eliminates the problems of scale factor compensation and limited range of convergence associated with the classical CORDIC algorithm. In the proposed scheme, depending on the target angle or the initial coordinate of the vector, a scaling by 1 or 1/√2 is needed that can be realised with minimal hardware. The proposed CORDIC rotator adaptively selects the appropriate iteration steps and converges to the final result by executing on average only 50% of the number of iterations required by the classical CORDIC. Unlike in the classical CORDIC, the value of the scale factor is completely independent of the number of executed iterations. Based on the proposed algorithm, a 16-bit pipelined CORDIC rotator was implemented. The silicon area of the fabricated pipelined CORDIC rotator core is 2.73 mm². This is equivalent to 38 000 inverter gates in the 0.25 μm BiCMOS technology used. The average dynamic power consumption of the fabricated CORDIC rotator is 17 mW at a 2.5 V supply voltage and a 20 Ms/s throughput. Currently, this CORDIC rotator is used as a part of the baseband processor for a project that aims to design a single-chip wireless modem compliant with the IEEE 802.11a standard.
- Author(s): A. Janapsatya ; S. Parameswaran ; J. Henkel
- Source: IEE Proceedings - Computers and Digital Techniques, Volume 151, Issue 6, p. 457–465
- DOI: 10.1049/ip-cdt:20040942
- Type: Article
The memory hierarchy subsystem has a significant impact on performance and energy consumption of an embedded system. Methods which increase the hit ratio of the cache hierarchy will typically enhance the performance and reduce the embedded system's total energy consumption. This is mainly due to reduced cache-to-memory bus transactions, fewer main memory accesses and fewer processor waiting cycles. A heuristic approach is presented to reduce the total number of cache misses by carefully relocating selected sections of the application's software code within the main memory, thus reducing conflict misses resulting from the cache hierarchy. The method requires no hardware modifications, i.e. it is a software-only approach. For the first time such a method is applied to large program traces, and the miss rates and corresponding energy savings are observed while varying cache size, line size and associativity. Relocating the code consistently produces superior performance on direct-mapped caches. Since direct-mapped caches are smaller in silicon area than caches with higher associativity (for the same size), cost less energy per access and are faster to access, using a direct-mapped instruction cache with code relocation is recommended for performance-oriented embedded systems. A maximum cache miss rate reduction from 71% down to less than 1% is achieved, with energy reductions of up to 63% with only a small increase in main memory size.
- Author(s): I. Voyiatzis ; N. Kranitis ; D. Gizopoulos ; A. Paschalis ; C. Halatsis
- Source: IEE Proceedings - Computers and Digital Techniques, Volume 151, Issue 6, p. 466–472
- DOI: 10.1049/ip-cdt:20040850
- Type: Article
In this paper an algorithm for the generation of single input change (SIC) pairs is presented, termed the accumulator-based SIC pair generation (ASG) algorithm; SIC pairs have been effectively utilised for testing robustly detectable sequential faults. ASG is implemented in hardware utilising an accumulator whose inputs are driven by a barrel shifter. Since such structures (accumulators whose inputs are driven by barrel shifters) are commonly found in current, high-speed signal processing VLSI circuits, the presented scheme provides a practical solution for the built-in testing of such circuits for delay and stuck-open faults. The use of ASG to apply SIC pairs to adjacent pairs of inputs of the circuit under test (CUT), resulting in pseudoexhaustive schemes, is also addressed.
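As an illustration of the term only, a single input change (SIC) pair can be read as two consecutive test vectors that differ in exactly one bit. The sketch below enumerates such pairs exhaustively in software. It is not the ASG algorithm from the abstract, whose accumulator and barrel-shifter details are not given here; the function names `is_sic_pair` and `sic_pairs` are hypothetical.

```python
from typing import Iterator, Tuple


def is_sic_pair(v1: int, v2: int) -> bool:
    """Return True if the two vectors differ in exactly one bit position."""
    diff = v1 ^ v2
    # A non-zero value with a single set bit satisfies diff & (diff - 1) == 0.
    return diff != 0 and (diff & (diff - 1)) == 0


def sic_pairs(width: int) -> Iterator[Tuple[int, int]]:
    """Yield every ordered SIC pair of `width`-bit vectors (exhaustive enumeration)."""
    for vector in range(1 << width):
        for bit in range(width):
            # Flipping one bit of the first vector gives the second vector.
            yield vector, vector ^ (1 << bit)


if __name__ == "__main__":
    for first, second in sic_pairs(2):
        print(f"{first:02b} -> {second:02b}")
```

For an n-bit input there are n·2^n ordered pairs of this kind, which is why hardware generators aim to produce them without storing them.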
Linear complexity of modulo-m power residue sequences
Protocol converter synthesis
Modified Montgomery modular multiplication and RSA exponentiation techniques
Associative ternary cache for IP routing
Formal verification of real-time embedded software in an object-oriented application framework
Realisation of multiple-valued functions using the capacitive threshold logic gate
Virtually scaling-free adaptive CORDIC rotator
REMcode: relocating embedded code for improving system efficiency
Accumulator-based built-in self-test generator for robustly detectable sequential fault testing
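For readers unfamiliar with the operation behind the Montgomery multiplication and RSA exponentiation entry listed above, the following is a minimal software sketch of the classical bit-serial (radix-2) Montgomery multiplication and a square-and-multiply exponentiation built on it. It is the textbook algorithm, not the carry-save architecture of the paper; the function names and the parameter `k` (the bit length, with modulus n < 2^k) are assumptions made for this example.

```python
def montgomery_multiply(a: int, b: int, n: int, k: int) -> int:
    """Return a * b * 2^(-k) mod n, for odd n < 2^k and 0 <= a, b < n."""
    if n % 2 == 0:
        raise ValueError("modulus must be odd")
    s = 0
    for i in range(k):
        # Add b when bit i of a is set.
        if (a >> i) & 1:
            s += b
        # Make the running sum even by adding n (n is odd), then halve it.
        if s & 1:
            s += n
        s >>= 1
    # The loop keeps s below 2n, so one conditional subtraction suffices.
    if s >= n:
        s -= n
    return s


def montgomery_pow(base: int, exponent: int, n: int, k: int) -> int:
    """Return base^exponent mod n using Montgomery multiplications throughout."""
    r2 = pow(2, 2 * k, n)                        # R^2 mod n, with R = 2^k
    x = montgomery_multiply(base % n, r2, n, k)  # base in Montgomery form
    acc = montgomery_multiply(1, r2, n, k)       # 1 in Montgomery form
    for bit in bin(exponent)[2:]:
        acc = montgomery_multiply(acc, acc, n, k)    # square
        if bit == "1":
            acc = montgomery_multiply(acc, x, n, k)  # multiply
    return montgomery_multiply(acc, 1, n, k)     # leave Montgomery form


if __name__ == "__main__":
    n, k = 3233, 12  # small illustrative modulus, 3233 < 2^12
    print(montgomery_pow(7, 65537, n, k) == pow(7, 65537, n))
```

In this sketch each multiplication ends in an ordinary integer result. The abstract's point is that keeping intermediate values in carry-save form between successive multiplications avoids a full-width carry-propagating addition at each step.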