## Parallel programming and algorithm theory

### More general concepts than this:

- Sort by:
- Newest first
- Titles A to Z

### Filter by subject:

- Computer and control engineering [19]
- Numerical analysis and theoretical computer topics [19]
- Computer theory [19]
- Programming and algorithm theory [19]
- Parallel programming and algorithm theory [19]
- Electrical and electronic engineering [12]
- Systems and control theory [10]
- General topics, engineering mathematics and materials science [8]
- Mathematical techniques [8]
- Engineering mathematics and mathematical techniques [7]
- [7]
- http://iet.metastore.ingenta.com/content/subject/c5200,http://iet.metastore.ingenta.com/content/subject/c6000,http://iet.metastore.ingenta.com/content/subject/c6100,http://iet.metastore.ingenta.com/content/subject/b6000,http://iet.metastore.ingenta.com/content/subject/b6100,http://iet.metastore.ingenta.com/content/subject/b6140,http://iet.metastore.ingenta.com/content/subject/c1200,http://iet.metastore.ingenta.com/content/subject/c4100,http://iet.metastore.ingenta.com/content/subject/c5260,http://iet.metastore.ingenta.com/content/subject/c6110,http://iet.metastore.ingenta.com/content/subject/c6110p,http://iet.metastore.ingenta.com/content/subject/c7000,http://iet.metastore.ingenta.com/content/subject/b1000,http://iet.metastore.ingenta.com/content/subject/b6140c,http://iet.metastore.ingenta.com/content/subject/c1180,http://iet.metastore.ingenta.com/content/subject/c4240c,http://iet.metastore.ingenta.com/content/subject/c5220,http://iet.metastore.ingenta.com/content/subject/c5220p,http://iet.metastore.ingenta.com/content/subject/c5260b,http://iet.metastore.ingenta.com/content/subject/c7400,http://iet.metastore.ingenta.com/content/subject/c7410,http://iet.metastore.ingenta.com/content/subject/b0290,http://iet.metastore.ingenta.com/content/subject/b1200,http://iet.metastore.ingenta.com/content/subject/b8000,http://iet.metastore.ingenta.com/content/subject/b8100,http://iet.metastore.ingenta.com/content/subject/b8110,http://iet.metastore.ingenta.com/content/subject/c7410b,http://iet.metastore.ingenta.com/content/subject/e,http://iet.metastore.ingenta.com/content/subject/e0000,http://iet.metastore.ingenta.com/content/subject/e0200,http://iet.metastore.ingenta.com/content/subject/b0230,http://iet.metastore.ingenta.com/content/subject/b0260,http://iet.metastore.ingenta.com/content/subject/b0290z,http://iet.metastore.ingenta.com/content/subject/b1265,http://iet.metastore.ingenta.com/content/subject/b1265f,http://iet.metastore.ingenta.com/content/subject/c1130,http://iet.metastore.ingenta.com/content/subject/c1160,http://iet.metastore.ingenta.com/content/subject/c1250,http://iet.metastore.ingenta.com/content/subject/c1260,http://iet.metastore.ingenta.com/content/subject/c4140,http://iet.metastore.ingenta.com/content/subject/c4190,http://iet.metastore.ingenta.com/content/subject/c5100,http://iet.metastore.ingenta.com/content/subject/c5135,http://iet.metastore.ingenta.com/content/subject/c6150,http://iet.metastore.ingenta.com/content/subject/e0210,http://iet.metastore.ingenta.com/content/subject/b0100,http://iet.metastore.ingenta.com/content/subject/b0120,http://iet.metastore.ingenta.com/content/subject/b0210,http://iet.metastore.ingenta.com/content/subject/b0250,http://iet.metastore.ingenta.com/content/subject/b0290h,http://iet.metastore.ingenta.com/content/subject/b1100,http://iet.metastore.ingenta.com/content/subject/b1110,http://iet.metastore.ingenta.com/content/subject/b1130,http://iet.metastore.ingenta.com/content/subject/b1130b,http://iet.metastore.ingenta.com/content/subject/b1270,http://iet.metastore.ingenta.com/content/subject/b1270f,http://iet.metastore.ingenta.com/content/subject/b8110b,http://iet.metastore.ingenta.com/content/subject/b8120,http://iet.metastore.ingenta.com/content/subject/b8120c,http://iet.metastore.ingenta.com/content/subject/b8200,http://iet.metastore.ingenta.com/content/subject/b8230,http://iet.metastore.ingenta.com/content/subject/c0000,http://iet.metastore.ingenta.com/content/subject/c0200,http://iet.metastore.ingenta.com/content/subject/c0220,http://iet.metastore.ingenta.com/content/subject/c1110,http://iet.metastore.ingenta.com/content/subject/c1230,http://iet.metastore.ingenta.com/content/subject/c4170,http://iet.metastore.ingenta.com/content/subject/c4230,http://iet.metastore.ingenta.com/content/subject/c4230m,http://iet.metastore.ingenta.com/content/subject/c5240,http://iet.metastore.ingenta.com/content/subject/c5270,http://iet.metastore.ingenta.com/content/subject/c5600,http://iet.metastore.ingenta.com/content/subject/c5610,http://iet.metastore.ingenta.com/content/subject/c5610s,http://iet.metastore.ingenta.com/content/subject/c6140,http://iet.metastore.ingenta.com/content/subject/c6140d,http://iet.metastore.ingenta.com/content/subject/c6150c,http://iet.metastore.ingenta.com/content/subject/c6150n,http://iet.metastore.ingenta.com/content/subject/c6170,http://iet.metastore.ingenta.com/content/subject/c7300,http://iet.metastore.ingenta.com/content/subject/c7310,http://iet.metastore.ingenta.com/content/subject/c7410d,http://iet.metastore.ingenta.com/content/subject/e0210a,http://iet.metastore.ingenta.com/content/subject/e0210g,http://iet.metastore.ingenta.com/content/subject/e0250
- c5200,c6000,c6100,b6000,b6100,b6140,c1200,c4100,c5260,c6110,c6110p,c7000,b1000,b6140c,c1180,c4240c,c5220,c5220p,c5260b,c7400,c7410,b0290,b1200,b8000,b8100,b8110,c7410b,e,e0000,e0200,b0230,b0260,b0290z,b1265,b1265f,c1130,c1160,c1250,c1260,c4140,c4190,c5100,c5135,c6150,e0210,b0100,b0120,b0210,b0250,b0290h,b1100,b1110,b1130,b1130b,b1270,b1270f,b8110b,b8120,b8120c,b8200,b8230,c0000,c0200,c0220,c1110,c1230,c4170,c4230,c4230m,c5240,c5270,c5600,c5610,c5610s,c6140,c6140d,c6150c,c6150n,c6170,c7300,c7310,c7410d,e0210a,e0210g,e0250
- [7],[7],[7],[6],[6],[6],[5],[5],[5],[5],[5],[5],[4],[4],[4],[4],[4],[4],[4],[4],[4],[3],[3],[3],[3],[3],[3],[3],[3],[3],[2],[2],[2],[2],[2],[2],[2],[2],[2],[2],[2],[2],[2],[2],[2],[1],[1],[1],[1],[1],[1],[1],[1],[1],[1],[1],[1],[1],[1],[1],[1],[1],[1],[1],[1],[1],[1],[1],[1],[1],[1],[1],[1],[1],[1],[1],[1],[1],[1],[1],[1],[1],[1],[1],[1]
- /search/morefacet;jsessionid=383qlm0e2k42o.x-iet-live-01
- /content/searchconcept;jsessionid=383qlm0e2k42o.x-iet-live-01?option1=pub_concept&facetOptions=2+3&option2=pub_year_facet&sortField=prism_publicationDate&pageSize=20&sortDescending=true&facetNames=pub_year_facet+pub_concept_facet&value1=c4240p&operator2=AND&value2=1995&operator3=AND&option3=pub_concept_facet&value3=
- See more See less

### Filter by content type:

### Filter by publication date:

- 1995 [19]

### Filter by author:

- A. Cuhadar [1]
- A.A. Beex [1]
- A.C. Downton [1]
- A.I.T. Rowstron [1]
- A.J. Jones [1]
- A.M. Wood [1]
- A.R. Daniels [1]
- B. Baran [1]
- C.L. Valenzuela [1]
- Chein-WeiJen [1]
- Chin-LiangWang [1]
- ChingsonChen [1]
- D.M. Falcao [1]
- D.M. Wilkes [1]
- E. Kaszkurewicz [1]
- F. Lai [1]
- I. East [1]
- I. Simonik [1]
- J. Roupec [1]
- J. Rowe [1]
- J.G. McWhirter [1]
- J.J. Soraghan [1]
- J.S. Kanevski [1]
- Jiun-InGuo [1]
- K.W. Chan [1]
- L. Piech [1]
- M. Hamdi [1]
- M.P. Fargues [1]
- N.C. Pahalawaththa [1]
- N.R. Harvey [1]
- P. Kraft [1]
- P. Mazumder [1]
- P. Osmera [1]
- P. Ramanathan [1]
- P. Siy [1]
- R. Venkateswaran [1]
- R. Wyrzykowski [1]
- R.-Y. Hwang [1]
- R.W. Dunn [1]
- S. Chalasani [1]
- S. Marshall [1]
- S.R. Wang [1]
- T. Numnonda [1]
- U.D. Annakkage [1]
- Y. Pan [1]
- Yu-TaiChang [1]
- See more See less

An efficient heuristic algorithm for automatic partitioning of a power system network is presented. The algorithm was designed for use in conjunction with a diakoptics-based coarse grain parallel method for solving large, sparse linear systems arising from network analysis. It exploits the techniques of factorisation path graph partitioning and equivalent post-ordering. Test results for systems up to 811 busbars and 1476 transmission lines are included for comparison purposes. This indicates that the method is able to divide a power system network model into a number of equal sized subnetworks in order to optimise the use of parallel computer systems for network analysis while the resultant number of fill-in elements is kept to a minimum.

This paper reports the use of a general technique to combine several different methods to solve complex systems of algebraic equations in the context of load flow calculations of electrical power networks. Such a combinations of methods, referred to as 'team algorithms', seem specially well suited to be used with distributed memory computer systems, in an asynchronous environment. Experimental results solving example problems in a commercially available parallel computer system show that a 'synergetic effect' with considerable speedup can be obtained using these 'team algorithms'.

The paper investigates the application of parallel simulated annealing for unit commitment problems. Two parallel simulated annealing concepts, speculative computation and serial subset, are applied to a unit commitment problem of ten thermal generators. The authors also propose a combined scheme where speculative computation is used in the initial phase and the serial subset is used in the final phase. The parallel simulated annealing schemes are tested with an example problem and the results show that the parallel schemes can considerably speed up the computation of simulated annealing.

The authors present an algorithm and its systolic implementation for computing the 1-D *N*-point inverse discrete cosine transform (IDCT), where *N* is a power of two. The architecture requires (*N*^{2} – 1)/3 multipliers and can evaluate one *N*-point IDCT every clock cycle. Owing to the features of regularity and modularity, the architecture is well suited to VLSI implementation. Compared with existing related arrays, the proposed algorithm has a better area-time performance. In addition, it can be extended to implementing the 2-D IDCT based on the row-column decomposition method.

A unified array design for the discrete cosine transform, discrete sine transform and their inverses based on a new general formulation is presented. The presented design provides the flexibility to adaptively select different types of transforms in transform coding according to the characteristics of images for minimising the bit rate as much as possible.

An order-recursive algorithm is presented for the generalised eigendecomposition of definite Hermitian pencils. An important feature is that it consists of a number (equal to the current order) of independent iterative eigenvalue searches followed by direct computation of the corresponding eigenvectors. The independence aspect produces the potential for parallel hardware implementations. The order-recursive nature of the algorithm results in the potential for solving problems of an a priori unknown, adequate minimal size rather than a single maximum size only. Such algorithms may be useful for sensor-array signal processing, for example in early direction-of-arrival estimation. The order-recursive nature is not detrimental to numerical performance, as it is shown that the algorithm is competitive with standard algorithms solving the single maximum-size problem only.

A new, highly parallel model for concurrent multilayer routing, called CHiRPS (Configurable Highly Routable Parallel System), is presented. The nucleus of CHiRPS is a very flexible pathfinder that can be easily configured, even in the presence of obstacles, to generate various commonly used pattern-based routes, such as Steiner trees with single trunk, comb trees, contour-based routes, etc., that span multiple layers simultaneously. The authors employ the concept of a total grid-graph to capture the state of the routing region. The main steps of the pathfinder are based on new parallel algorithms for cycle detection, cycle elimination and tree reduction. The proposed algorithms scale well with increased problem sizes since they require only O(log N) time when given a grid-graph with up to N^{2} nodes. As such, they are good candidates for massively data-parallel machines.

Solutions to partial differential equations are required in many engineering applications. The multigrid method is an iterative technique for speeding up the solution of these equations. The authors describe a parallel implementation of the multigrid method on the Connection Machine CM-5 architecture. An analytic model is presented for estimating the computation and communication times of the multigrid algorithm. The times predicted by the analytic model are within 5% of the results obtained from CM-5. Results demonstrate that the communication overhead incurred by the parallel multigrid algorithm is relatively small compared to the computation time. Consequently, implementations of the multigrid algorithm on the CM-5 easily achieve processor efficiencies near 100%.

Arrays of processors with pipelined optical buses are introduced for the efficient implementation of computationally intensive applications. Techniques for the concurrent transmission of messages over the optical bus to avoid collision of messages is shown. Convenient parallel data movement operations are derived for this architecture, which are then used in the design of parallel algorithms for the solution of some important numerical problems. The parallel algorithms implemented in the paper are for solving systems of linear equations and finding the roots of nonlinear equations. Even though this array of processors can function in the MIMD mode of operation, it is more suitable for the SIMD mode of operation, because it can be easily synchronised and scaled to a massive number of processors. Hence, the above parallel algorithms have been designed with the SIMD mode in mind. Their time complexities have been analysed, and are shown to compare favourably with those implemented on processors connected with electronic buses or point to point links such as the hypercube. Moreover, whereas a processing element of a hypercube of size N has log N ports, a processing element of an array with optical buses has a constant number of ports. Thus, it seems that an array of processors with optical buses is a promising, and could be a better, alternative for future supercomputing systems.

A compiler technique for migrating synchronisation operations is proposed. The traditional code motion technique is only used for migrating loop invariant statements before the loop; it cannot migrate those synchronisation operations inside the loop. On the other hand, all current statement level synchronisation schemes cannot handle the code migration. However, the performance enhancement is significant after code migration of synchronisation operation. This new migration technique reorders the sequence of synchronisation instructions; Send Signal(S) is moved up and Wait Signal(S, i-d) is moved down, to improve the system performance. Evidence shows that the migration of some synchronisation operations is not helpful for performance enhancement. Therefore, to migrate efficiently, an intelligent code migration algorithm is proposed. In this algorithm, only a few synchronisation migration operations are needed to speed up performance enormously.

The authors present a parallel-decomposition algorithm for discrete computational problems. An efficient convolution algorithm is developed under the proposed decomposition algorithm. The decomposition operation is based on integer modular arithmetic. Congruent sets are generated from integer modular operation over the index of the problem and constitute a partition. The partition is used to decompose the problem into several small subproblems. The partition under the integer modular operation is unique. Complete reconstruction of the problem is guaranteed by the uniqueness of the partition. Since the algorithm is established on the foundation of all algebraic systems, i.e. number and set theories, it is highly problem-independent, and provides an uniform approach to parallel decomposition of general problems on the discrete domain. Because the subproblems are highly regular, it is suitable for VLSI hardware implementation. The computational complexity of the developed convolution algorithm is reduced by a factor of p^{2}, and (p^{2})^{n} for n-dimensional problems (p can be any common factor of the input sequences). Compared to the popular FFT methods, where round-off error is introduced, the convolution result from the algorithm is exact. In addition, it does not have restrictions on the length of the input sequences, as in the well known block convolution algorithm. As a result, the algorithm is highly efficient for convolution of large input sequences or correlation operations.

Some problems are very difficult to solve by mathematical programming approaches. A genetic algorithm (GA) is an extremely powerful optimization technique that could be used to solve such problems, but its efficiency is dependent on its ability to do a large number of evaluations in a reasonable amount of time. A classical GA contains three basic operators-reproduction, crossover and mutation. To increase the efficiency of a genetic algorithm the influence of migration in a multilevel distributed GA (MDGA) was tested. Several different structures of PC computers connected in a local area network (LAN) were used for the MDGAs. MDGAs use the power of the computers better than one-level distributed GAs. The problem of communication between the computers in the MDGAs was dealt with in two different ways, with files on a server or by sending packets.

Several design methods for parallel genetic algorithms (PGAs) optimizing morphological filters are discussed and an optimal scale and machine independent PGA for a loosely coupled, homogeneous or inhomogeneous multiprocessor computer is developed. The optimization is made in terms of computation speed, parallelization efficiency and quality. Quality means, in this context, the fitness of the best chromosome, which is an objective measure for the performance of the corresponding morphological filter in a particular environment.

Evolutionary divide and conquer (EDAC) applied to the geometric travelling salesman problem uses a genetic algorithm to explore the space of problem subdivisions. The underlying techniques for subdivision and patching are based on the cellular dissection algorithms of Richard Karp (1977), but the addition of new repair heuristics as well as the genetic algorithm, appears to have succeeded in lifting the quality of solution way above Karp's algorithms, whilst maintaining almost linear scaling of the algorithm with increased problem size. In this paper we outline a parallel implementation of EDAC and present some preliminary results of this work.

Kauffman has proposed a theory of cell differentiation using random boolean nets as a model of genetic action. This scheme is used as the basis of a developmental representation in the context of genetic algorithms. A number of simple linear morphologies are evolved using this system.

The concept of algorithmic engineering was introduced by McWhirter (see IEE Proc.-E, vol.139, no.3, June 1992) in the context of adaptive signal processing. In effect, it constitutes a simple but rigorous diagrammatic approach to the design of parallel algorithms and architectures for digital signal processing. For the purposes of this design methodology, a parallel algorithm is represented in terms of its basic signal flow graph (SFG). In the context of teaching DSP, this could be of particular advantage. In the article, the potential use of algorithmic engineering as a design and teaching aid is illustrated by means of two particularly relevant examples. The first relates to the FFT. The second concerns the least squares lattice algorithm for linear prediction and adaptive filtering. It is shown how each can be derived, and hence taught, by starting with the SIG for a more basic algorithm (the DFT and recursive least squares algorithms respectively) and then applying a sequence of simple diagrammatic transformations. The least squares lattice derivation starts with the SFG for a multichannel least squares adaptive filter based on QR decomposition. (3 pages)

Presents the description of a parallel implementation of mathematical morphology using the LINDA model of parallel communication embedded within ISETL. The morphological operation of dilation of binary images is considered in detail, with an ISETL implementation given initially, then a description of how this was converted to run in a parallel version of ISETL, ISETL-Linda.

Presents a case study showing how a top-down structured generic parallel design method can be successfully applied to a typical multi-level computer vision algorithm. The case study involves a written postal address recognition system. The address verification is discussed in particular. Incrementally controllable speedups of up to 15 for the complete application were obtained using up to 29 processors. Although this speedup is insufficient to achieve the required real-time performance of processing 10 envelope images/second, it would be straightforward to apply the same parallel design model to current generation parallel processors such as the TMS320C40 or T9000 which have at least 10 times the processing power and communications bandwidth of the T800 transputers used as processing elements in the present case study. The range of algorithms utilised in the case study, and the flexibility of the parallel solutions generated, lead the authors to believe that the design method is applicable to a wide range of embedded vision applications.

Using isomorphic embedding of the original dependence graphs in another graph before its space-time mapping onto array architectures, two linear processor arrays are designed for the Gauss-Jordan algorithm with partial pivoting and Cholesky decomposition. Each of these arrays comprises only (n+1)/2 processing elements (PEs), where n is the number of columns in the input matrices. The block pipelining period is n(n-1) cycles for the first array and n(n+1)/2 cycles for the second. If the matrices are processed sequentially, systems of linear equations are solved by the Gauss-Jordan algorithm with almost full processor utilisation, whereas for the Cholesky decomposition the utilisation of PEs is approximately two-thirds.