Home
>
Journals & magazines
>
IEE Proceedings - Computers and Digital Technique...
>
Volume 143
Issue 6
IEE Proceedings - Computers and Digital Techniques
Volume 143, Issue 6, November 1996
Volumes & issues:
Volume 143, Issue 6
November 1996
-
- Author(s): B. Becker and R. Drechsler
- Source: IEE Proceedings - Computers and Digital Techniques, Volume 143, Issue 6, p. 349 –354
- DOI: 10.1049/ip-cdt:19960787
- Type: Article
- + Show details - Hide details
-
p.
349
–354
(6)
In the paper, an algorithm for the exact minimisation of Kronecker expressions (KROs) for totally symmetric functions is presented. KROs are a class of AND/EXOR forms closely related to ordered Kronecker functional decision diagrams (OKFDDs). This close relation is used to obtain a polynomial time minimisation algorithm. A generalisation to partially symmetric functions is investigated. Experimental results in comparison to previously published methods are given to show the efficiency of the approach. - Author(s): G.R. Redinbo
- Source: IEE Proceedings - Computers and Digital Techniques, Volume 143, Issue 6, p. 355 –363
- DOI: 10.1049/ip-cdt:19960769
- Type: Article
- + Show details - Hide details
-
p.
355
–363
(9)
Fault-tolerant linear processing systems using real number codes, as in algorithm-based fault tolerance methodologies, can be extended to include error correction for correcting intermittent errors in the processed output data. The reliability function for a protected system containing correction is considered under simple but realistic assumptions on the arrival of failures in both the normal processing and corrector subassemblies. An error-correcting procedure employing a real, convolutional code is described briefly, and a system diagram of a corrector subsystem containing self-checking comparators for detecting its internal failures is presented. Failures are modelled as arriving according to Poisson processes in both parts of the fault-tolerant system. Formulas for the reliability of the protected system are given and an efficient lower bound is developed. This bound depends only on broad details such as relative areas used in a VLSI design of the processing and corrector parts. Computational methods employing computer algebra packages are discussed, and some typical reliability curves for two configurations demonstrate the dramatic improvement which error correction introduces. - Author(s): R. Drechsler ; B. Becker ; N. Göckel
- Source: IEE Proceedings - Computers and Digital Techniques, Volume 143, Issue 6, p. 364 –368
- DOI: 10.1049/ip-cdt:19960789
- Type: Article
- + Show details - Hide details
-
p.
364
–368
(5)
A genetic algorithm (GA) is applied to find a variable ordering that minimises the size of ordered binary decision diagrams (OBDDs). OBDDs are a data structure for representation and manipulation of Boolean functions often applied in CAD. The choice of the variable ordering largely influences the size of the OBDD (i.e. its size may vary from polynomial to exponential in the number of variables). Dynamic variable ordering is the state-of-the-art method for finding good variable orderings. In the paper it is shown by experimental results that better sizes can be obtained using GAs. The authors' GA approach is a practical alternative to the exact algorithm for variable ordering. It produced optimal results for most considered benchmarks, but it is also applicable to functions with more than 20 variables due to its short runtimes. - Author(s): C.-M. Chen ; C.-T. King ; Y.-Y. Chen
- Source: IEE Proceedings - Computers and Digital Techniques, Volume 143, Issue 6, p. 369 –375
- DOI: 10.1049/ip-cdt:19960822
- Type: Article
- + Show details - Hide details
-
p.
369
–375
(7)
Branches are a major limiting factor to instruction-level parallelism. One solution is to execute several branches simultaneously using multiway branching architectures. Such architectures are especially important when the instruction issue width becomes large. The authors we study the problem of compile-time scheduling of branch operations on such architectures: an optimisation called branch merging. The scheduling attempts to bring profitable branches together for concurrent execution. It is shown that finding the optimal solution to the branch merging problem is NP-hard. A heuristic is then proposed, which relies on a cost model to direct the merging of branches and their associated basic blocks. Merged branches are then scheduled together for concurrent execution. The authors used simulation to evaluate the effectiveness of the proposed algorithm. Experiments on selected benchmark programs show that the heuristic achieves roughly a 10% performance improvement on multiway branching architectures. - Author(s): D. Debnath and T. Sasao
- Source: IEE Proceedings - Computers and Digital Techniques, Volume 143, Issue 6, p. 376 –384
- DOI: 10.1049/ip-cdt:19960826
- Type: Article
- + Show details - Hide details
-
p.
376
–384
(9)
A generalised Reed–Muller expression (GRM) is a class of AND-EXOR expression. In a GRM, each variable may appear both complemented and uncomplemented. Networks realised using GRMs are easily tested. The paper presents GRMIN2, a heuristic simplification algorithm for GRMs of multiple-output functions. GRMIN2 uses eight rules, and as the primary objective, it reduces the number of products, and as the secondary objective, it reduces the number of literals. GRMIN2 obtains a lower bound on the number of products in GRMs and often proves the minimality of the solutions. Experimental results show that in most cases GRMs require fewer products than conventional sum-of-products expressions. GRMIN2 outperforms existing algorithms and for many functions it proved the minimalities of the solutions. - Author(s): E.C. Tan and C.Y. Chia
- Source: IEE Proceedings - Computers and Digital Techniques, Volume 143, Issue 6, p. 385 –390
- DOI: 10.1049/ip-cdt:19960770
- Type: Article
- + Show details - Hide details
-
p.
385
–390
(6)
Reed–Muller universal logic modules (RM-ULMs) can be used as individual modules or as networks to implement RM functions of any number of variables. For a large RM expansion, it is generally more efficient to realise it using cascades of low-order RM-ULMs. A programmed algorithm for the optimisation of the number of modules at the sub-system level has already been published. Here an alternative algorithm is presented which performs similar optimisation of fixed-polarity generalised RM expansions but without the need to maintain a saved-branch counter and an input record. Also, whenever the number of piterms having the specific control-tuple-state exceeds two, it is not necessary for the algorithm to perform separate variable examinations. Consequently, the saving in computation time will be especially significant in RM expansions with large number of variables. The resulting algorithm is simple in structure and can be easily implemented using high-level languages such as C or Fortran. - Author(s): Ph.W. Besslich and E.A. Trachtenberg
- Source: IEE Proceedings - Computers and Digital Techniques, Volume 143, Issue 6, p. 391 –400
- DOI: 10.1049/ip-cdt:19960466
- Type: Article
- + Show details - Hide details
-
p.
391
–400
(10)
A nonlinear transformation, called the sign transform, is defined and studied. It maps all the ternary functions onto themselves, is uniquely invertible and is inherently related to Walsh and Walsh–Galois transforms. Decomposable Boolean functions and ternary product terms are mapped analytically into the sign domain. It is shown that sign coefficients describe the structure of a ternary output logic network implemented by a complete basis: EXOR of variables and sign function. Methods are discussed for direct transformation of switching functions without having to use the butterfly algorithm: Shannon's decomposition in the sign domain and, in general, merging sign spectra of subfunctions.As applications, spectra of arbitrary logic circuits is computed analytically by merging spectra of their gate components. Also, it is shown that testable realisations of incompletely specified Boolean functions can be designed by appropriate assignments of don't cares in the sign domain. - Author(s): J.J. Kim and H.M. Choi
- Source: IEE Proceedings - Computers and Digital Techniques, Volume 143, Issue 6, p. 401 –406
- DOI: 10.1049/ip-cdt:19960565
- Type: Article
- + Show details - Hide details
-
p.
401
–406
(6)
An XMESH is proposed as a suitable interconnection network for massively parallel computers, and the performance of the proposed interconnection network is analysed. The XMESH has the same horizontal links as those of the toroidal mesh (TMESH); however, it has diagonally crossed links instead of vertical links. The proposed XMESH shows desirable characteristics as an interconnection network for massively parallel computers as the number of nodes increases, while retaining the structural advantages of the TMESH such as the symmetric structure and constant degree of K = 4. Analytical performance evaluations show that the XMESH has a shorter diameter, a shorter mean internode distance, and a higher message-completion rate than the TMESH or the diagonal mesh (DMESH). To confirm these results, an optimal self-routing algorithm for the proposed topology is developed and is used to simulate the maximum delay, the average delay and the throughput in the presence of contention. In all cases, the XMESH is shown to outperform the TMESH and the DMESH, and can provide an attractive alternative to those networks in implementing massively parallel computers. - Author(s): K.N.B. Murthy and C.S.R. Murthy
- Source: IEE Proceedings - Computers and Digital Techniques, Volume 143, Issue 6, p. 407 –412
- DOI: 10.1049/ip-cdt:19960562
- Type: Article
- + Show details - Hide details
-
p.
407
–412
(6)
The problem of solving a system of N linear equations on a mesh-connected multiprocessor structure is considered. The solution to the problem is obtained by using a Gaussian-elimination-based algorithm called ‘successive Gaussian elimination’. The new algorithm does not contain a separate backsubstitution phase. A two-dimensional array of N × (N + 1) processors is employed to obtain the solution in (5N – log N – 4) time steps. This scheme eliminates the use of two processor structures in conjunction, one for triangulation and the other for backsubstitution, for producing the complete solution using the existing Gaussian elimination algorithm. Most importantly, the new algorithm supports pairwise pivoting to assure numerical stability. The proposed processor-array structure is amenable for VLSI implementation as identical processors with only simple and regular interconnections required. - Author(s): K.-J. Lin and C.-S. Lin
- Source: IEE Proceedings - Computers and Digital Techniques, Volume 143, Issue 6, p. 413 –420
- DOI: 10.1049/ip-cdt:19960634
- Type: Article
- + Show details - Hide details
-
p.
413
–420
(8)
A novel alternative for removing CSC (complete state coding) violations in asynchronous circuit synthesis for STGs (signal transition graphs) is presented. The main feature of the work is to exploit delays in the physical circuit to remove CSC violations. Its main advantages are that it (i) does not need to obey the noninput constraint and (ii) saves area overhead when a CSC violation in the state graph does not actually appear in the physical circuit. The delay constraint for removing each CSC violation is formulated. Then an algorithm is proposed to derive a consistent set of constraints to ensure that all violations are removed. If a consistent set exists, it is shown that those constraints can always be satisfied by padding delays during hazard analysis, and therefore hazard-free circuits without any CSC violation can be derived. Based on this approach, the marked-graph benchmarks, hitherto unsolvable due to the noninput constraint in existing methods, are now resolved. - Author(s): K.-H. Lin ; C.-S. Chen ; T.-T. Hwang
- Source: IEE Proceedings - Computers and Digital Techniques, Volume 143, Issue 6, p. 421 –425
- DOI: 10.1049/ip-cdt:19960638
- Type: Article
- + Show details - Hide details
-
p.
421
–425
(5)
In an era of submicron technology, routing is becoming a dominant factor in area, timing and power consumption. The problem of scan flip-flops chaining with the objective of achieving minimum routing area overhead is studied. The first attempt is to chain the flip-flops in logic level. To make more accurate decisions on chaining flip-flops, the second attempt is to perform the chaining of scan flip-flops taking layout information into consideration. Specifically, the authors show that the chaining problem is a travelling salesman problem (TSP). Then, two heuristics, greedy and matching-based algorithms, are proposed to solve the TSP problem. Various cost functions will be defined which take layout information into account. Benchmarking results show that the cost function achieves the best results when it considers placement and routing information and is dynamically updated. - Author(s): S. Vaidehi ; D.J. Ram ; A. Shukla
- Source: IEE Proceedings - Computers and Digital Techniques, Volume 143, Issue 6, p. 426 –430
- DOI: 10.1049/ip-cdt:19960468
- Type: Article
- + Show details - Hide details
-
p.
426
–430
(5)
Logical clocks and vector clocks were proposed in the past to capture causality between events of different processes of a distributed computation. However, these clocks could not capture intraprocess concurrency. Later, bit-matrix clocks and hierarchical clocks were developed to capture interprocess concurrency as well as intraprocess concurrency. The major disadvantages of these clocks are the associated storage and communication overheads. To overcome these disadvantages, the authors introduce the concept of difference clocks. The difference clocks maintain minimal information about the differences among various local clocks. It is shown that they can be used to reconstruct the bit-matrix clocks and result in substantial reduction in storage space and communication overhead. - Author(s): G.-J. Lai and C. Chen
- Source: IEE Proceedings - Computers and Digital Techniques, Volume 143, Issue 6, p. 431 –435
- DOI: 10.1049/ip-cdt:19960637
- Type: Article
- + Show details - Hide details
-
p.
431
–435
(5)
A new program model is presented to accurately represent parallel programs for partitioning and scheduling problems. This model extends the graphic representation of the macrodataflow by considering the complex communication options supported by NUMA systems. The proposed model shows not only task precedence relations but also data sharing status. Moreover, a new partitioning method based on the proposed model is also developed. Experimental results show that many conventional partitioning algorithms operate more efficiently using the proposed model, and that the proposed algorithm surpasses existing algorithms. - Author(s): B.K. Mohanty and P.K. Meher
- Source: IEE Proceedings - Computers and Digital Techniques, Volume 143, Issue 6, p. 436 –439
- DOI: 10.1049/ip-cdt:19960423
- Type: Article
- + Show details - Hide details
-
p.
436
–439
(4)
Recurrence relations and fully pipelined novel cell-level systolic architectures are suggested for massively parallel implementation of two-dimensional FIR and linear phase FIR filters. Owing to the higher level of parallelism, the proposed structures would yield more throughput over the existing structures. Besides, it can be flexibly configured according to the throughput requirement of the application for the chosen processor technology for cost-effective implementation. - Author(s): V.H. Nguyen
- Source: IEE Proceedings - Computers and Digital Techniques, Volume 143, Issue 6, p. 440 –442
- DOI: 10.1049/ip-cdt:19960640
- Type: Article
- + Show details - Hide details
-
p.
440
–442
(3)
Given a set S of nonoverlapping axis-parallel rectangles placed inside a rectangular region B, a partition of the space B\S is called a ‘free space partition’. A simple proof of a formula on the number of rectangles in a minimum free space partition is presented. Based on this formula, it is shown that a minimum free space partition can be computed using a well known geometric graph search algorithm in O(n3/2 log n) time.
Exact minimisation of Kronecker expressions for symmetric functions
Reliability levels for fault-tolerant linear processing using real number error correction
Genetic algorithm for variable ordering of OBDDs
Branch merging for scheduling concurrent executions of branch operations
GRMIN2: A heuristic simplification algorithm for generalised Reed–Muller expressions
Alternative algorithm for optimisation of Reed–Muller universal logic module networks
Three-valued quasi-linear transformation for logic synthesis
XMESH interconnection network for massively parallel computers
Gaussian-elimination-based algorithm for solving linear equations on mesh-connected processors
Removing CSC violations in asynchronous circuits by delay padding
Layout-driven chaining of scan flip-flops
Difference clocks: A new scheme for logical time in distributed systems
New program model for program partitioning on NUMA multiprocessor systems
Cost-effective novel flexible cell-level systolic architecture for high throughput implementation of 2-D FIR filters
Optimum partitioning of rectilinear layouts
Most viewed content for this Journal
Article
content/journals/ip-cdt
Journal
5
Most cited content for this Journal
We currently have no most cited data available for this content.