New Publications are available for Digital circuit design, modelling and testing
http://dl-live.theiet.org
Implementation and testing of multipliers using reversible logic
http://dl-live.theiet.org/content/conferences/10.1049/ic.2011.0073
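As a rough illustration for this entry, the sketch below checks what makes a gate reversible and shows how two well-known reversible gates yield the AND and half-adder functions an array multiplier needs. The choice of Toffoli and Peres gates and all function names are assumptions made for illustration; the abstract does not say which gates the authors used.

```python
from itertools import product


def toffoli(a, b, c):
    """Toffoli (CCNOT): flips c only when a and b are both 1."""
    return a, b, c ^ (a & b)


def peres(a, b, c):
    """Peres gate: outputs (a, a XOR b, (a AND b) XOR c)."""
    return a, a ^ b, (a & b) ^ c


def is_reversible(gate, width=3):
    """A gate is reversible if distinct inputs always give distinct outputs."""
    inputs = list(product((0, 1), repeat=width))
    return len({gate(*bits) for bits in inputs}) == len(inputs)


def partial_product(a, b):
    """AND of two bits: a Toffoli gate with its target line preset to 0."""
    return toffoli(a, b, 0)[2]


def half_adder(a, b):
    """(sum, carry) from one Peres gate with its third line preset to 0."""
    _, s, carry = peres(a, b, 0)
    return s, carry
```

Presetting a line to 0 is what lets a reversible gate compute an irreversible function such as AND; the extra outputs it leaves behind are the usual garbage lines of reversible designs.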
Reversible logic gates are in demand for future computing technologies because they are known to produce zero power dissipation under ideal conditions. Reversible logic circuits are of interest for power minimization, with applications in low power CMOS design, optical information processing, DNA computing, bioinformatics, quantum computing and nanotechnology. Both reversible logic synthesis and the testing of reversible logic circuits are important issues in this area. Multipliers are essential for the construction of various computational units of a quantum computer. The multiplier is an important hardware unit that decides the speed of any processor. In this work, an unsigned four bit array multiplier and a signed Baugh-Wooley multiplier circuit are implemented using reversible gates. A reversible Built In Logic Block Observer (BILBO) is also designed, with which the proposed reversible multiplier circuits are tested for stuck-at faults (SAF) and missing gate faults (MGF) based on signature analysis.

A 16nm SRAM design for low power and high read stability
http://dl-live.theiet.org/content/conferences/10.1049/ic.2011.0045
SRAM design in the nanoscale regime has become increasingly challenging due to shrinking noise margins and increased sensitivity to threshold voltage variations. To overcome these challenges, different memory cells have been proposed for SRAMs with different transistor structures. These designs improve the cell stability in the subthreshold regime but suffer from bitline leakage noise, placing constraints on the number of cells shared by each bitline. In this paper, we propose a novel 11T SRAM cell topology which achieves cell stability and also prevents bitline leakage. In addition, the proposed cell shows appreciable improvement in dynamic power consumption. HSPICE simulation and analysis at a 16nm feature size in a CMOS process show that the bitline leakage power consumption of the proposed 11T SRAM cell is reduced by 38% and the dynamic power consumption is reduced by 54% compared to the existing 10T SRAM cell, while maintaining a read static noise margin nearly twice that of the conventional 6T SRAM circuit.

VHDL implementation of BIST controller
http://dl-live.theiet.org/content/conferences/10.1049/ic.2011.0077
Built-in self-test (BIST) is a design technique that allows a circuit to test itself. It is a set of structured-test techniques for combinational and sequential logic, memories, multipliers and other embedded logic blocks. The principle is to generate test vectors, apply them to the circuit under test or device under test, and then verify the response. Being automated, BIST enables testing at high speed and with high fault coverage. The BIST controller coordinates the operations of the different blocks of the BIST. Based on the test mode (TM) input to the controller, the system operates either in the normal mode or in the test mode. In this paper we explain an implementation of a restartable logic BIST controller for a combinational logic circuit using VHDL. It allows us to suspend the signature generation at any desired point in the test sequence. In this case, the BIST circuit is considered to comprise hold logic and a signature generation element. The hold logic will be implemented such that an external signal (HOED) can temporarily suspend signature generation in the signature generation element at specified times during the BIST session.

Secure scan design with isomorphic registers
http://dl-live.theiet.org/content/conferences/10.1049/ic.2011.0052
In this paper, we first introduce the concept of Isomorphic Redundancy. Two functionally equivalent shift registers can be isomorphic to each other: they can be made equivalent by a simple permutation of states in their state tables. Two isomorphically redundant circuits can be used to prevent a two bit change insertion attack. In addition, we propose a new model built from functionally equivalent shift registers. It is a highly non-linear and scan-secure model.

Reversible masking: a novel fault-diagnosed
http://dl-live.theiet.org/content/conferences/10.1049/ic.2011.0053
This paper suggests a novel design of a reversible masking circuit for quantum cryptography. Quantum computation uses quantum properties to represent data and perform reversible operations on data. In this paper, we propose a reversible masking logic and implement it using basic quantum gates. The masking expression is transformed into a Positive Polarity Reed Muller Expression to calculate its nonlinearity. We also propose a novel quantum gate design, namely the M-gate, based on the basic quantum gates. To analyze the design, its quantum cost is calculated and compared with that of another masking circuit. The proposed circuit achieves high nonlinearity, much better than that of the other masking circuit. We also test the circuit using the Single Missing Gate Fault Model and the Multiple Missing Gate Fault Model.

A spatial hierarchy FPGA implementation with DPR square root partitions
http://dl-live.theiet.org/content/conferences/10.1049/cp.2011.0904
The customized hardware platform for spatial hierarchy construction with DPR square root partitions is validly obtained. Specialized process structures and storage structures are constructed to ensure the reutilization of presorting results on each level. The objective functions for division evaluation oriented to specific applications are scheduled. DPR partitions are built to meet intensive square root finding requests. After optimizations for algorithm and architecture, such achievement provides the starting points for future development.

The design of RS (255,239) encoder based on ADSL
http://dl-live.theiet.org/content/conferences/10.1049/cp.2011.0726
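To illustrate the constant-multiplier idea in this entry: multiplication by a fixed element of GF(2^8) is linear over GF(2), so it reduces to a fixed XOR network, which is why dedicated constant multipliers are cheaper than general ones. The sketch below is not the authors' design. It assumes the field polynomial x^8 + x^4 + x^3 + x^2 + 1 (0x11D) and a generator polynomial with roots alpha^0 to alpha^15, and every function name is hypothetical.

```python
PRIM_POLY = 0x11D  # x^8 + x^4 + x^3 + x^2 + 1 (assumed field polynomial)


def gf_mul(a, b):
    """General GF(2^8) multiply: shift-and-add with reduction by PRIM_POLY."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= PRIM_POLY
        b >>= 1
    return result


def make_const_multiplier(c):
    """Precompute c * x^i for i = 0..7; multiplying by c is then XOR only."""
    columns = [gf_mul(c, 1 << i) for i in range(8)]

    def mul_by_c(v):
        out = 0
        for i in range(8):
            if (v >> i) & 1:
                out ^= columns[i]
        return out

    return mul_by_c


def rs_generator(nroots=16):
    """Coefficients, lowest degree first, of prod (x + alpha^i), alpha = 2."""
    g = [1]
    root = 1
    for _ in range(nroots):
        nxt = [0] * (len(g) + 1)
        for k, coef in enumerate(g):
            nxt[k] ^= gf_mul(coef, root)  # coef * root contributes to x^k
            nxt[k + 1] ^= coef            # coef * x contributes to x^(k+1)
        g = nxt
        root = gf_mul(root, 2)
    return g
```

Under these assumptions, the 16 non-leading coefficients of `rs_generator()` are the constants that 16 dedicated multipliers built with `make_const_multiplier` would implement in an LFSR-style encoder.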
The design of an RS (255,239) encoder over GF(2^8) for ADSL systems is studied, and the core multiplier unit of the encoder and a hardware realization of finite field constant multiplication are presented. The coding process adopts 16 dedicated constant multiplier units in place of the variable multiplier units used previously, which greatly simplifies the hardware structure of the multiplier unit, saves hardware area, and improves the speed of the multiplier unit. In addition, the performance of RS (255,239) coding is validated using the MATLAB programming language.

Design and implementation of a hybrid SET-CMOS based Hi-speed and power efficient pulse divider circuit
http://dl-live.theiet.org/content/conferences/10.1049/cp.2011.0433
Hybrid SET-CMOS circuits, which combine the merits of both SET and CMOS, promise to be a practical implementation for future low power ultra-dense VLSI/ULSI circuit design. In this work, an SET-CMOS hybrid pulse divider circuit is proposed. The MIB model for SET and the BSIM4 model for CMOS are used. The operation of the proposed circuit is verified in the Tanner environment. The performances of CMOS and hybrid SET-CMOS based pulse divider circuits are compared. The hybrid SET-CMOS circuit is found to consume less power than the CMOS based circuit. Further, it is established that the hybrid SET-CMOS based circuit is much faster than the CMOS based circuit.

A FPGA ray tracing scheme with memory optimization facility
http://dl-live.theiet.org/content/conferences/10.1049/cp.2011.0903
This article presents a ray tracing acceleration system realized on an FPGA platform. Additionally, we improve the memory access performance and achieve a favorable ratio of performance to resources. Such work will broaden the range of applications for ray tracing. The global scheme includes a spatial indexing hierarchy module, a ray generation module, a pre-masking module, a packet traversal module, an arithmetic module and a memory optimization facility, which is described in particular detail in this article.

A practical method of clock synchronization in 2-out-of-3 system
http://dl-live.theiet.org/content/conferences/10.1049/cp.2011.0897
We present a practical method, derived from the convergence-nonaveraging algorithm, to solve the problem of clock synchronization in 2-out-of-3 systems in engineering applications. Because of the high reliability and safety requirements, the drift rate of the synchronized logic clock must be as small as possible. We use the minimum offset to compute the adjusted value. In both software and hardware simulation, the maximum drift rate is less than 0.1% and the maximum synchronous cycle is less than 6 cycles.

A low power multiplier architecture based on bypassing technique for digital filter
http://dl-live.theiet.org/content/conferences/10.1049/cp.2011.0432
This paper presents a low power 4×4 digital multiplier design that reduces power consumption using a 2-dimensional bypassing method. The design of portable battery operated multimedia devices requires energy-efficient multiplication circuits. The proposed bypass cells that constitute the multiplier skip redundant signal transitions when the horizontal partial product or the vertical operand is zero. Hence, it is a 2-dimensional bypassing architecture, with which we designed a digital filter for low power dissipation in signal processing applications. Thorough post-layout simulations show that the power dissipation of the proposed 2D multiplier, and of the FIR filter design based on it, is reduced by more than 75% compared to the prior design, at a minor cost in delay and area.

Designed and implemented of graphics rasterization algorithm with FPGA
http://dl-live.theiet.org/content/conferences/10.1049/cp.2011.0902
The rasterization stage, which is an important part of a graphics processing unit, requires a very large number of operations and is the performance bottleneck, especially for mobile devices. In this paper, the authors study the rasterization algorithm and optimize parts of it. Finally, the authors implement a simple rasterization engine using few FPGA hardware resources.

Service control systems - managing change and diversity
http://dl-live.theiet.org/content/conferences/10.1049/ic.2010.0096
Presents a collection of slides covering the following topics: Service control systems; Rail system development; Design drivers; and Market drivers.

Impact of simultanous switching noise on packaged mixed SoC [simultanous read simultaneous]
http://dl-live.theiet.org/content/conferences/10.1049/cp.2010.0653
Increasing numbers of high-density and high-speed mixed circuits have been integrated on a single chip, while the devices are becoming more sensitive to simultaneous switching noise. In this paper, a chip-package co-design method is presented. Based on the analysis of simulation results for the package and chip, a special on-chip circuit composed of resistors and a capacitor is designed to decrease the effect of simultaneous switching noise. The simulation shows excellent agreement with measurement, within a 1% margin.

Design and implementation of a SoC-based security coprocessor and program protection mechanism for WSN
http://dl-live.theiet.org/content/conferences/10.1049/cp.2010.1044
Practical applications of wireless sensor networks in vulnerable areas require confidentiality, integrity and freshness of the communication data of sensor devices. Furthermore, the program data of sensor devices need to be protected. In this paper, we present the design, implementation and simulation of an effective hardware security coprocessor, namely RC5-FKM, and a program protection mechanism based on system on chip (SoC) technology for wireless sensor networks (WSN). Compared with existing works, the unique features of our design include: (1) a fingerprint based key management (FKM) design is implemented in SoC, which is used to build secret keys for the cryptographic coprocessor; (2) a program protection mechanism is proposed to prevent the program data from being read out by system intruders, so as to improve the security of program data in the sensor device; (3) a reusable optimized logic cell (ROTL) including some adders and registers is implemented in RC5-FKM, which results in the elimination or minimization of the additional hardware overhead. The design is mapped on FPGA and ASIC. Results show that the hardware overhead of our design is 9.6% less than that of previous designs, and the execution time of our hardware design is only 0.2% of that of general processors and shorter than that of other AES coprocessors.

VHDL guidance for safe and certifiable FPGA design
http://dl-live.theiet.org/content/conferences/10.1049/cp.2010.0832
Field Programmable Gate Arrays (FPGAs) are becoming increasingly popular for use within high integrity and safety critical systems. One commonly used coding language for their configuration is the VHSIC Hardware Description Language (VHDL). Whilst VHDL is used for hardware description, it is developed in a similar way to traditional software, and many safety critical software certification standards require the use of coding subsets and style guidance in order to ensure known language vulnerabilities are avoided. At present there is no recognized, public domain guidance for VHDL. This paper draws together many different sources to provide a starting discussion for a VHDL subset. (6 pages)

Synchronization technology for OFDM system and its FPGA design
http://dl-live.theiet.org/content/conferences/10.1049/cp.2010.0660
For multicarrier modulation OFDM (Orthogonal Frequency Division Multiplexing), one of the communication technologies currently in focus, this paper builds on a previous study of the basic modem module and investigates its key synchronization technology. The key points include adding a cyclic prefix and suffix to reduce the accuracy required of symbol synchronization, FFT window synchronization timing, and adding a training sequence to extract the bit synchronization signal. The algorithm is studied using MATLAB. Several considerations for the FPGA hardware implementation are also introduced.

Formal foundations for MARTE-SystemC interoperability
http://dl-live.theiet.org/content/conferences/10.1049/ic.2010.0152
Model Driven Architecture (MDA) and Electronic System Level (ESL) design are key approaches for succeeding in the specification and design of current embedded systems, which are increasingly complex and heterogeneous. MARTE is the most advanced UML profile for abstract specification of real-time embedded systems in the MDA context, while SystemC is the language most widely adopted by the ESL design community. This paper provides formal foundations for a consistent and synergistic link between MARTE and SystemC. These foundations are based on the ForSyDe formalism, used to reflect the abstract execution semantics of both the MARTE model and its corresponding SystemC executable specification. The concepts introduced are shown through the specification of an essential part of a video decoder.

SyReC: a programming language for synthesis of reversible circuits
http://dl-live.theiet.org/content/conferences/10.1049/ic.2010.0150
Reversible logic serves as a basis for emerging technologies like quantum computing and additionally has applications in low-power design. In particular, since traditional technologies like CMOS are going to reach their limits in the near future, reversible logic has been established as a promising alternative. Thus, in recent years this area has become intensely studied by researchers. In particular, how to efficiently synthesize complex reversible circuits is an important question. So far, the only synthesis approaches available rely on Boolean function representations such as truth tables or decision diagrams. In this paper, we propose the programming language SyReC, which allows reversible circuits to be specified and then automatically synthesized. Using an existing programming language for reversible software design as a basis, we introduce new concepts, operations, and restrictions allowing the specification of reversible hardware. Furthermore, a hierarchical approach is presented that automatically transforms the respective statements and operations of the new programming language into a reversible circuit. Experiments show that with the proposed method, complex circuits can be easily specified and synthesized, while with previous approaches this is often not possible due to the limits imposed by truth tables or decision diagrams.

A tripartite system level design approach for design space exploration
http://dl-live.theiet.org/content/conferences/10.1049/ic.2010.0128
In this paper a system level design approach is presented, which reduces the effort of integrating low level tools for the evaluation of different solutions during design space exploration. Thereby, low level estimation tools can be utilized for a fast and accurate estimation of the power consumption of different HW/SW architectures. The proposed design flow extends the known separation of communication and computation to a tripartite design approach. By separately modeling complex data structures, it is possible to design parts that specify computation directly synthesizable and compilable without major changes. Communication parts and complex data structures are taken from a library or refined manually. Using this approach, the way from a system level model to an actual HW/SW implementation is accelerated and the application of low level power estimation tools becomes possible. The benefits of this new design approach are demonstrated by the generation of different solutions of a test system of an audio resampler for VoIP systems. Seven different HW/SW solutions are compared concerning their power consumption, latency, and area.

Genetic-based high-level synthesis of ΣΔ modulator in SystemC-A
http://dl-live.theiet.org/content/conferences/10.1049/ic.2010.0147
This paper proposes a novel genetic-based high-level synthesis methodology for ΣΔ modulators. The approach is based on simulation-based optimisation, where the optimal topology of the ΣΔ modulator is automatically explored using a genetic algorithm (GA) under various design constraints, such as SNR (Signal-to-Noise Ratio) and hardware complexity. The proposed synthesis technique has been implemented in SystemC-A due to its advantages in terms of high simulation speed, flexibility and data manipulation. Experimental results validate the effectiveness of the synthesis approach.

Early robustness evaluation of digital integrated systems
http://dl-live.theiet.org/content/conferences/10.1049/ic.2010.0131
Evaluating the sensitivity of digital integrated systems with respect to soft errors has become an important part of the design flow for many applications. This presentation briefly discusses the most typical approaches used today to analyze the robustness from the application viewpoint.

Mixed signal simulation with SystemC and Saber
http://dl-live.theiet.org/content/conferences/10.1049/ic.2010.0138
Increasing complexity and heterogeneity lead to systems that combine the aspects of both digital hardware/software and mixed-signal embedded systems. A major difficulty is the fact that the components for mixed-signal systems are designed bottom-up, while a digital hardware/software system is designed top-down. Often this requires co-simulation, in practice involving multiple simulators from different vendors and on different platforms. Unfortunately, setting up co-simulations is a time-consuming task which is therefore done only a few times for verification purposes. In this paper we show how a plain SystemC simulation can be connected to Saber. A proxy module interfaces to the SystemC simulation and relays signals to Saber. A special signal synchronisation and update scheme ensures the availability of current analogue values to SystemC starting from the very beginning of each time step. Furthermore we introduce a mechanism for automatically connecting SystemC modules and show how it can be used to implement a graphical SystemC editor. A design example which compares a SystemC to Saber co-simulation to a functionally identical SystemC-AMS simulation is also included.

Modeling technique for simulation time speed-up of performance computation in transaction level models
http://dl-live.theiet.org/content/conferences/10.1049/ic.2010.0142
Modeling embedded systems at transaction level facilitates the architecting of hardware and software resources according to non-functional requirements. Raising the level of abstraction, Transaction Level Modeling (TLM) represents a good compromise between modeling accuracy and simulation speed. However, in complex pipelined architectures, the efficiency of exploration and performance evaluation is limited by the number of involved transactions and by the various non-functional properties to assess. In this paper we propose a technique to improve the creation of transaction level models and the description of properties related to resources of the system architecture. This technique is based on the separation of concerns between the evolution of a system model at transaction level and the computation of non-functional properties. The considered case study is a wireless communication receiver based on the Long Term Evolution (LTE) protocol. The proposed technique is used to evaluate the related computing complexity according to various system configurations.

Bounded fault tolerance checking
http://dl-live.theiet.org/content/conferences/10.1049/ic.2010.0120
Summary form only given. Continuously shrinking feature sizes result in an increasing susceptibility of circuits to transient faults. Approaches to implement fault tolerance are known e.g. on architecture level, algorithmic level, or layout level. But assessing the fault tolerance of a given circuit is a hard verification problem. Verification of the fault tolerance based on simulation is fast, but cannot cover the complete input space in combination with all potential faults in reasonable time. In contrast, formal methods are complete by proving the fault tolerance with respect to the whole input space, but may suffer from run time limitations. Here, we propose a formal model to assess the robustness of a digital circuit with respect to transient faults. Our formal model uses a fixed bound in time to cope with the complexity of the underlying Sequential Equivalence Check (SEC). Exact bounds for the robustness are retrieved while restricting the formal analysis to an observation window of t^d time steps.

Formal verification of timed VHDL programs
http://dl-live.theiet.org/content/conferences/10.1049/ic.2010.0133
The verification of timed digital circuits is an important issue. These circuits are composed of logic gates, each of which is associated with propagation delays. The analysis of such circuits is necessary to identify the critical path and adjust the clock period of the circuit, or to determine the stability period of input/output signals. These circuits are represented by a functional model described in VHDL and a timing model associating propagation delays with each functional block. This model is translated into the timed automata formalism, upon which classical simulation or model checking verification can be performed. This method raises two problems: 1) Propagation delays associated with a gate depend on the transistor assembly and the manufacturer's technology. How do we associate propagation delays with a logic gate? 2) How can a VHDL functional description, combined with propagation delays, be translated automatically into timed automata? This paper addresses these two problems. It presents a method automating the verification of VHDL descriptions, augmented with interval bounded propagation delays, obtained by electrical simulation of the transistor model of the gates.

Formal support for untimed SystemC specifications. Application to high-level synthesis
http://dl-live.theiet.org/content/conferences/10.1049/ic.2010.0132
SystemC lacks a well defined formal semantics for abstract specification, specifically for untimed models. This paper tackles this problem by providing the fundamentals of a framework which enables the analysis of any untimed SystemC specification under a formal meta-model. Then, the conditions for the SystemC specification to correspond with its formal meta-model are defined. As an application example, the use of the framework for high-level synthesis verification is shown.

Reversing deterministic finite state machines
http://dl-live.theiet.org/content/conferences/10.1049/cp.2009.1696
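The definition of a reverse FSM given in this entry can be sketched for the simplest case, where every (next state, output) pair has a unique predecessor. This sketch is not the paper's technique: the Mealy-table representation, the function names and the example machine are all hypothetical, and the ambiguous case that the paper resolves with extra output bits is only detected here, not handled.

```python
def reverse_fsm(transitions):
    """Invert a Mealy table {(state, inp): (next_state, out)}.

    Returns {(next_state, out): (state, inp)}. Raises ValueError when two
    transitions share the same (next_state, out); this sketch does not
    resolve that ambiguity.
    """
    reverse = {}
    for (state, inp), (nxt, out) in transitions.items():
        if (nxt, out) in reverse:
            raise ValueError(f"ambiguous predecessor for {(nxt, out)}")
        reverse[(nxt, out)] = (state, inp)
    return reverse


def run(transitions, state, inputs):
    """Run the forward machine; returns (final state, list of outputs)."""
    outputs = []
    for inp in inputs:
        state, out = transitions[(state, inp)]
        outputs.append(out)
    return state, outputs


def run_reverse(reverse, final_state, outputs):
    """Consume outputs last-to-first; returns (initial state, reversed inputs)."""
    state = final_state
    rev_inputs = []
    for out in reversed(outputs):
        state, inp = reverse[(state, out)]
        rev_inputs.append(inp)
    return state, rev_inputs


# Hypothetical example machine: next state = s XOR x, output = old state s.
PARITY = {(s, x): (s ^ x, s) for s in (0, 1) for x in (0, 1)}
```

Running `run` on an input sequence and then `run_reverse` on the resulting final state and outputs should recover the initial state and the inputs in reverse order.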
Finite State Machines (FSM) are an important category of digital circuits. Simply put, an FSM starts from a certain state, receives a sequence of inputs, changes its internal states, and produces a sequence of outputs. We define the reverse of a given FSM as an FSM that, given the original final state and the reversed sequence of original outputs, can produce the reversed sequence of original inputs. Implementing such an FSM has uses in testing, fault tolerance and debugging digital circuits including processors. We present techniques that can produce a deterministic reverse FSM from a given deterministic FSM. The overhead is at most one extra state, plus ⌈log2(NP)⌉ extra output bits in case in the original FSM at most N states share the same next state and output value. (6 pages)

Design of high-performance session management module in network behavior inspection system
http://dl-live.theiet.org/content/conferences/10.1049/cp.2009.2001
Network management systems are required to have high performance as well as good functionality, since Internet traffic is increasing rapidly. This paper presents a design of a network behavior inspection system based on a PowerPC network processor and embedded Linux. To ensure the high performance of the system, the implementation of the session management module is based on the table lookup unit (TLU) of the network processor. Experimental results show a substantial increase in the efficiency of session management.

Scheduling of balancing WSC for minimum IP testing time
http://dl-live.theiet.org/content/conferences/10.1049/cp.2009.2000
Research on design for testability is becoming very important in the field of SoC. However, traditional research is limited to the top level of the SoC. In theory, it does not really guarantee improved utilization of SoC test resources or an effective reduction of test time. This paper proposes extending the focus of study to the IP core level. We can reduce the percentage of idle test time and minimize the test time of various IP cores by establishing a scheduling strategy of balancing WSC (Wrapper Scan Chain) and the exchange ISC (Inter Scan Chain) heuristic algorithm. In this paper, we verify the algorithm and reusability of the scheduling strategy of balancing WSC on the ITC'02 benchmarks. The results show that, in the vast majority of cases, the percentage of idle test time is much less than 1% in the relationship set between the number m of balanced WSC and the test time Γιρ within each IP core. The results also confirm the importance of carrying out this research at the IP core level for improving the utilization of test resources and reducing the test time effectively in SoC.

The SoPC based design for real-time radar seeker signal processing
http://dl-live.theiet.org/content/conferences/10.1049/cp.2009.0416
A SoPC system fabric suitable for radar seeker signal processing applications is introduced. The SW/HW function partition method, based on the idea of SoPC and on the arithmetic characteristics, is specified, and all the function modules for the overall processing can be implemented in one FPGA chip. The detailed engineering implementation architecture in one Virtex4 FX60 chip is given, and the design idea has been validated by its successful use in the miniaturization and intelligentization of the radar seeker processing system. (4 pages)

Implementing complex and multiple DSP systems on chip: developing a "tops-down" approach to multicore processor architectures
http://dl-live.theiet.org/content/conferences/10.1049/ic_20080611
TI has over a decade of successful history in multi-core processor and system-on-chip (SoC) design. A generic bottom-up SoC architecture approach will be compared to various application-domain focussed multi-core architectures. Heterogeneous designs will be compared to homogeneous solutions. Various performance indicators will be discussed, such as application scope, design challenges, power consumption and development tool chains. The talk will close by addressing some current-day virtualization challenges in multi-core processor design. (16 pages)

New design methodologies & synthesis techniques for complex FPGA designs
http://dl-live.theiet.org/content/conferences/10.1049/ic_20080615
With diverse applications for FPGAs in the automotive, consumer, military/aerospace, networking, medical or wireless markets - designers are faced with an equally diverse set of implementation challenges. FPGA designs may have aggressive performance and area objectives or stringent operational requirements for safety-critical applications. Logic synthesis is among the most critical steps in ensuring these design goals are met within the required market window. In this session you will learn how advances in FPGA synthesis have kept pace with the latest device architectures. Among the topics to be discussed are the latest innovations in physical synthesis, design analysis, synthesis for operational safety, and incremental design flows. (19 pages)

Electronic and software design reliability
http://dl-live.theiet.org/content/conferences/10.1049/ic_20080513
To summarise this paper: for both hardware and software, a good design is essential for good reliability, but even with a perfect design, hardware will fail. Since it was concluded that software cannot fail, all software failures are design failures. All experience, from any area, says that it is impossible to design error-free software if the system has some complexity.

Architecture of efficient RNS-based digital signal processor with very low-level pipelining
http://dl-live.theiet.org/content/conferences/10.1049/cp_20080650
A generalized architecture of an efficient digital signal processor using the residue number system (RNS) is proposed. It is based on using our new residue multipliers-accumulators (MACs) as the main building blocks. This architecture offers potentially higher throughput thanks to the possibility of implementing very low-level pipelining. The maximal applicable clock frequency could be determined by the delay of only a few stages of full-adders.

Structural division procedure for efficient IC analysis
http://dl-live.theiet.org/content/conferences/10.1049/cp_20080632
The efficient and structured analysis of unknown CMOS integrated circuits (ICs) has become a topic of great relevance in recent years. Up until now, different invasive [1], [2] and non-invasive [3], [4] analysis strategies have been developed. However, invasive procedures always lead to the destruction of the system under investigation. The non-invasive approaches published so far have the disadvantage that ICs are analysed using complex algorithms. No subdivision exists to avoid extensive analysis times in the case that only simple structures are investigated. Moreover, traditional procedures cannot automatically distinguish between input and output pin types, which is usually required in the investigation of real unknown integrated circuits. This paper presents an efficient non-invasive procedure to determine binary multi-input multi-output (MIMO) ICs by their input-output behaviour. It was implemented in the analysis environment described in [5] and classifies unknown ICs by means of automata theory. A novel separation procedure is proposed in this paper to further reduce the IC analysis effort. All sections of the classification procedure are simulated and fully tested on ISCAS-85, ISCAS-89 and ISCAS-99 benchmark models of real ICs [6], [7] and the results are presented in this paper.

Logic equivalence check in SOC design: solution and issues
http://dl-live.theiet.org/content/conferences/10.1049/cp_20080882
As designs become complex and costly in sub-micron SOC chips, formal verification, and logic equivalence checking in particular, becomes increasingly important for meeting design challenges and schedules. This paper shows the method and its application in the physical design and full-chip integration phases of a reconfigurable system-on-chip. We describe the key lessons learned from this project.
Test and Diagnosis of Analogue, Mixed-Signal and RF Integrated Circuits: the system on chip approach
http://dl-live.theiet.org/content/books/cs/pbcs019e
This book provides a comprehensive discussion of automatic testing, diagnosis and tuning of analogue, mixed-signal and RF integrated circuits and systems in a single source. As well as fundamental concepts and techniques, the book systematically reports the state of the art and future research directions in those areas. A complete range of circuit components is covered, along with test issues from the SoC perspective. An essential reference for researchers and engineers in mixed-signal testing, and for postgraduate and senior undergraduate students.
New methods and techniques used by design software and IP suppliers to increase design productivity
http://dl-live.theiet.org/content/conferences/10.1049/ic.2008.0761
This paper presents the following: semiconductor devices, a need for programmable platforms; changing need of design tools and IPs for programmable platforms; and SoftJin's offerings to programmable platform developers. Reconfiguration granularity: coarse and fine grain, and application driven reconfigurable devices are also discussed. (30 pages)
Design of digital FIR filter using dynamic distributed arithmetic algorithm with improved table look up scheme for residue number system
http://dl-live.theiet.org/content/conferences/10.1049/ic_20070683
The use of the improved table look-up residue number system (RNS) and the dynamic distributed arithmetic algorithm (DDAA) in modern telecommunication and multimedia applications is becoming increasingly important because it offers advantages in area, power consumption and speed. This paper presents a general conversion procedure based on a {2^n - 1, 2^n, 2^n + 1} moduli set. Based on the improved table look-up RNS and the DDAA, an architecture which efficiently implements the digital FIR filter is synthesized using Xilinx VirtexE. It is observed that reductions of up to 82.85% in the number of slices, up to 100% in the number of flip-flops and up to 87.21% in the number of look-up tables (LUTs) are achieved. The speed of the filter is improved by 30.98%.
Optimized S-box design for AES core
http://dl-live.theiet.org/content/conferences/10.1049/ic_20070729
This paper proposes an efficient solution that combines Rijndael encryption and decryption with an on-the-fly key scheduler in one FPGA design, with a strong focus on low area constraints and high throughput. We investigate a new compact digital hardware implementation of the AES structure with integrated SubBytes and inverse SubBytes transformations, which minimizes the computation cost of the relevant arithmetic in the finite field GF(2^8), including the cost of the mapping. This approach has advantages over a straightforward implementation using read-only memories for table lookups. The hardware implementation is compared with previous work in this area. The resulting S-box design with subfield operations in GF(((2^2)^2)^2) reduces the reconfigurable logic gate count by 81% compared with a look-up table, and is 23% better in area and 3% faster than a design using GF((2^4)^2). Derived architectures are evaluated using popular low-cost field-programmable gate arrays. Using the proposed architecture, a fully sub-pipelined AES core with both inner and outer round pipelining and 7 sub-stages in each round unit, implemented using XC2V3000 Virtex-E devices, can achieve a throughput of 33.47 Gbps at 261.68 MHz with 12542 CLB slices in non-feedback modes, making it suitable for high-speed applications. The result is compared with the fastest previous FPGA implementation known to date.
A novel CMOS string D/A converter for system-on-chip applications
http://dl-live.theiet.org/content/conferences/10.1049/cp_20070740
The once emerging trend of silicon-on-chip is becoming a reality. The challenge of integrating a digital-to-analog converter on a single chip is one of the bottlenecks in SoC solutions. A threshold inverter quantization (TIQ) architecture is proposed in this paper for the design of a novel string D/A converter. TIQ is based on the principle that the threshold voltage of an inverter can be manipulated by altering the layout geometry. The new D/A converter design technique presented in this paper could achieve a significant improvement in both speed and layout area. A 4-bit TIQ-based string demonstrator DAC has been designed using a 0.7 micron standard CMOS technology. This demonstrator DAC can achieve a maximum speed of 425 MSPS, which is two orders of magnitude higher than that of a typical commercial DAC.
IBPTB-based test scheduling
http://dl-live.theiet.org/content/conferences/10.1049/cp_20070303
During wrapper scan chain balancing, a certain number of idle bits inevitably appear because the wrapper scan chains differ in length. In this paper, a new wrapper scan chain balance algorithm called best exchange optimization (BEO) is presented first. This algorithm aims at minimizing the number of idle bits between wrapper scan chains. Then a new test scheduling method called idle bit percentage on test bus (IBPTB)-based test scheduling (ITS) is proposed. ITS generates an optimal scheduling solution according to the IBPTBs of the test rectangles of each IP core. The ITC'02 benchmark circuits are used as test objects to demonstrate the practicability and validity of both algorithms. An experiment on core6 of the ITC'02 SOC p93791 shows that BEO can generate more balanced wrapper scan chains than the classic best fit decreasing (BFD) heuristic algorithm. The best improvement is a 2.629% reduction in the length of the longest wrapper scan chain. Experiments on both d695 and p93791 in ITC'02 show that ITS is suitable for generating better scheduling solutions than classic ILP for large SOCs. For d695, only one scheduling solution achieves a lower total test time than ILP, by 9.62%. For p93791, several such scheduling solutions exist. The best improvement in total test time reaches 26.269%.
Hardware/software co-design of AC3 decoding
http://dl-live.theiet.org/content/conferences/10.1049/cp_20070719
In an MPEG2 MP@HL decoding chip, a RISC core is used for both audio decoding and system decoding. In this paper, AC3 decoding is used to study the hardware/software co-design method, and a new hardware/software co-design method is proposed. After running the AC3 program on the RISC core and obtaining the profile information of each function, the key operations are extracted and models are set up to extract special instructions. Finally, the implementation of some special instructions is given. Results show that the method achieves a performance increase and a reduction in memory space.
Reconfigurable full-search video motion estimation architecture
http://dl-live.theiet.org/content/conferences/10.1049/cp_20070708
In this paper, a new reconfigurable multi-standard architecture is introduced for integer-pixel motion estimation and a standard-cell based chip design study is presented. This has been designed to cover most of the common block-based video compression standards, including MPEG-2, MPEG-4, H.263, H.264, AVS and WMV-9. The architecture exhibits simpler control, high throughput and relatively low hardware cost, and is highly competitive when compared with existing designs for specific video standards. It can also, through the use of control signals, be dynamically reconfigured at run-time to accommodate different system constraints, such as the trade-off between power dissipation and video quality. The computational rates achieved make the circuit suitable for high-end video processing applications. Silicon design studies indicate that circuits based on this approach incur only a relatively small penalty in terms of power dissipation and silicon area when compared with implementations for specific standards.
Design of high speed 32-bit parallel prefix adder for CMOS technology
http://dl-live.theiet.org/content/conferences/10.1049/cp_20070723
This paper describes the design requirement specification of a high-speed 32-bit parallel-prefix adder. The parallel-prefix adder uses a parallel-prefix topology to reduce the critical path in the adder. The critical path, which is the carry generation path, has a logarithmic dependence on the bit-width, compared with the linear dependence in the ripple carry adder. The author used the Kogge-Stone adder because it has one of the shortest critical paths of all adders. A set of tools for architectural-level synthesis of parallel adders is developed, comprising a module generator and a library of parameterized VHDL models. Reasonable levels of power consumption were investigated; for a simulated power of 9.29 mW, the Kogge-Stone adder has a total delay of 2.829 ns and a maximum combinational path delay of 8.586 ns.
ASIP: developments, challenges and trends
http://dl-live.theiet.org/content/conferences/10.1049/cp_20070790
The Application Specific Instruction Set Processor (ASIP) is becoming an attractive substitute for the ASIC as transistor density, logic complexity and market competition increase. Similar to an ASIC, an ASIP is based on customized and tailored architectures. In this way, an ASIP delivers high performance with low overheads in cost and power while retaining the high flexibility and fast time-to-market of a processor-based solution. To demonstrate this effective solution for embedded applications, this paper gives an overall investigation of ASIP developments, challenges and trends in terms of architectures and design methodologies.
A multiple scan chain BIST scheme based on constraint input reduction
http://dl-live.theiet.org/content/conferences/10.1049/cp_20070739
In this paper, a new BIST scheme for multiple scan chain circuits is presented, in which constraint input reduction, LFSR coding and a folding counter are applied to compress and generate a deterministic test set. Its highlight is that it effectively combines several previous test methods to take full advantage of them. The proposed scheme not only achieves a high test data compression rate, but is also compatible with the traditional scan-based design flow. Experimental results show that the proposed technique needs less storage and significantly less testing time than previously published approaches. Its overall test performance is improved.
TAM optimization and test scheduling for SoC based on zigzag design flow
http://dl-live.theiet.org/content/conferences/10.1049/cp_20070302
The traditional SoC design-for-test (DFT) flow follows the sequence of determining a detailed test architecture, choosing the test scheduling approach and implementing the core test. Such a procedure may raise issues of physical realizability and workability. To overcome the inefficiency of the past test flow, a new test flow shaped like the letter Z is presented. The Z design flow consists of proposing a conceptual test architecture with uncertain test access mechanism (TAM) width, deciding the test scheduling to satisfy the required weighted test cost (WTC), then defining the deterministic test architecture and executing the core test. To make better trade-offs between test time and hardware overhead during the Z design flow, a new test scheduling approach is explored. The verification result demonstrates the efficiency and usefulness of the proposed technique. The optimal WTC for the benchmark circuit using the proposed algorithm is 55% of the average WTC.
A survey of network processor workloads
http://dl-live.theiet.org/content/conferences/10.1049/cp_20070720
Over the past few years, fiber bandwidth has increased faster than both processor clock frequency and memory access speed. To solve this bottleneck, network processor design has relied on parallelization of tasks via heterogeneous processing elements and on dedicated high-speed hardware blocks for acceleration. However, these solutions either reduce the flexibility of network processors or become progressively more difficult to implement due to issues such as memory bandwidth and processor interconnection. To this end, we analyze the fundamental components of a network processor, the processing engines (PEs). By examining the workload undertaken by a modern network processor, we determine the processing complexity of network applications along with a detailed instruction mix analysis. Through simulation, our analysis finds that although the average processing cost per packet allows us to estimate the workload of a particular application, the varying nature of network processor tasks requires two methods of determining whether a particular function can be sustained. For tasks that operate independently of the packet length, using the packet header only, we find that the maximum processing cost encountered by any packet is the important metric. Similarly, when analyzing tasks which require access to the packet payload, we find that the average packet cost does not present an efficient method of estimating instruction budgets, since the vast majority of packets deviate from the mean packet length. At an architectural level, our analysis of instruction mix and traces finds that, as well as increasing parallelization, improvements in processing engine performance can be found in a number of areas. We find that floating point and multiply units are underutilized within network applications, with more cost-effective solutions such as shared resources better suited to the network processor design space.
Secondly, the byte-wise nature and high number of programming variables hint at the need for a large register base. Finally, conditional operations within a network processor are found to be relatively simple nested loops and if/else bit tests; however, the high proportion of conditional branches found through simulation highlights a possible future bottleneck within network processor research.
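
The point above about average versus maximum per-packet cost can be illustrated with a small sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the linear cost model (a fixed header cost plus a per-byte payload cost), the constants, the packet-length mix and the function names.

```python
# Hypothetical cost model (not from the paper):
# instructions per packet = HEADER_COST + PER_BYTE_COST * packet length in bytes.
HEADER_COST = 200     # assumed instructions for header-only work
PER_BYTE_COST = 4     # assumed instructions per payload byte


def payload_task_cost(length_bytes):
    """Instruction cost of a payload-touching task for one packet."""
    return HEADER_COST + PER_BYTE_COST * length_bytes


def budget_summary(lengths):
    """Compare the mean-based and worst-case instruction budgets for a trace."""
    costs = [payload_task_cost(n) for n in lengths]
    mean_len = sum(lengths) / len(lengths)
    # Fraction of packets whose length differs from the mean by more than 50%.
    far = sum(1 for n in lengths if abs(n - mean_len) > 0.5 * mean_len)
    return {
        "mean_length": mean_len,
        "mean_cost": sum(costs) / len(costs),
        "worst_cost": max(costs),
        "fraction_far_from_mean": far / len(lengths),
    }


if __name__ == "__main__":
    # Assumed bimodal mix: many small packets and a smaller share of large ones.
    trace = [40] * 60 + [1500] * 40
    for key, value in budget_summary(trace).items():
        print(key, value)
```

With a bimodal length mix like the assumed one, no packet sits near the mean length, so a budget sized from the mean cost is too generous for small packets and too small for large ones. For a header-only task the cost does not depend on length, and the worst-case cost is the figure that decides whether the task can be sustained.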