IET Computers & Digital Techniques
Volume 7, Issue 2, March 2013
- Author(s): Giorgos Dimitrakopoulos and Cyriel Minkenberg
- Source: IET Computers & Digital Techniques, Volume 7, Issue 2, p. 57
- DOI: 10.1049/iet-cdt.2013.0018
- Type: Article
- Author(s): Alessandro Strano; Nicola Caselli; Simone Terenzi; Davide Bertozzi
- Source: IET Computers & Digital Techniques, Volume 7, Issue 2, p. 58–68
- DOI: 10.1049/iet-cdt.2012.0064
- Type: Article
Most built-in self-test architectures use pseudo-random test pattern generators. However, whenever this technique has been applied to on-chip interconnection networks, overly large testing latencies have been reported. Alternative approaches, on the other hand, suffer either from large area penalties (as with scan-based testing or deterministic test patterns) or from poor fault coverage in the control path (functional testing). Moreover, the recent proliferation of clock domains on a chip makes testing even more challenging. This manuscript presents the optimisation of a built-in self-testing framework, based on pseudo-random test patterns, for the microarchitecture of network-on-chip switches. As a result, fault coverage and testing latency approach those achievable with deterministic test patterns, while yielding significant area savings and enhanced flexibility. Finally, the authors implement an extension of the proposed testing methodology to multisynchronous systems, thus making it compliant with the relaxation of synchronisation assumptions in nanoscale designs.
- Author(s): Mario Lodde; Toni Roca; José Flich
- Source: IET Computers & Digital Techniques, Volume 7, Issue 2, p. 69–80
- DOI: 10.1049/iet-cdt.2012.0056
- Type: Article
Future chip multiprocessors will include hundreds of cores organised in a tile-based design pattern. These systems commonly employ a shared-memory programming model and therefore need a coherence protocol to keep data consistent across the levels of the cache hierarchy. Usually, an invalidation-based protocol is used, in which shared copies are invalidated before a write operation. In this study, the authors propose a NoC re-organisation in which a small, fast, dedicated control network transmits the acknowledgement messages related to the invalidation process, thus relieving the NoC of a considerable percentage of its traffic. The dedicated control network is evaluated both with full-map directories and with a broadcast-based protocol (Hammer). Experimental evaluation shows significant performance gains. With a low area overhead (<2.5%), the control network reduces NoC traffic and miss latency, thus reducing execution time by up to 16%. Simulation results show a reduction in network traffic of up to 80% and reductions in store and load miss latency of up to 70% and 40%, respectively.
- Author(s): Ana Jokanovic; Jose Carlos Sancho; German Rodriguez; Cyriel Minkenberg; Ramon Beivide; Jesus Labarta
- Source: IET Computers & Digital Techniques, Volume 7, Issue 2, p. 81–92
- DOI: 10.1049/iet-cdt.2012.0059
- Type: Article
Network contention is seen as a major hurdle to achieving higher throughput in today's large-scale high-performance computing systems, even more so with the current trend of employing blocking networks, driven by the need to reduce cost. Additionally, the effect is aggravated by current system schedulers that allocate jobs as soon as nodes become available, thus producing job fragmentation; that is, the tasks of one job might be spread throughout the system instead of being allocated contiguously. This fragmentation increases the probability of sharing network resources with other applications, which produces higher inter-application network contention. In this study, the authors perform a broad analysis of the performance variability of diverse applications caused by topology connectivity and fragmentation, and classify the applications by their sensitivity to these two factors. Having characterised the inherent behaviour of the applications, the authors then analyse their performance in a shared environment, that is, when they are mixed with other applications. They show that inter-application contention can be a significant source of degradation even in networks with high connectivity. Their results suggest different strategies for task allocation policies: grouping sensitive and insensitive applications, reducing the number of applications sharing the first-level switch, or isolating sensitive applications.
- Author(s): Arseniy Vitkovskiy; Vassos Soteriou; Chrysostomos Nicopoulos
- Source: IET Computers & Digital Techniques, Volume 7, Issue 2, p. 93–103
- DOI: 10.1049/iet-cdt.2012.0054
- Type: Article
Downscaled complementary metal-oxide-semiconductor (CMOS) technology feature sizes have enabled massive transistor integration densities. Multi-core chips with billions of transistors are now a reality. However, this rapid increase in on-chip resources has come at the expense of higher susceptibility to defects and wear-out. The inter-router communication links of networks-on-chip (NoCs) are composed of metal wires that are especially vulnerable to catastrophic physical effects such as electro-migration, which can even cause link disconnects. To address this hazard, fault-tolerant (FT) routing algorithms sustain on-chip communication by re-routing messages around faulty links or regions. This work presents a new FT routing scheme that employs a localised re-routing approach. Packets are detoured around faulty links/regions based on purely local and distributed decisions, without any global link-state knowledge. The algorithm, which is proven to be deadlock- and livelock-free, also handles dynamically occurring faults. Detailed evaluation with synthetic traffic patterns and real applications within a full-system simulation environment demonstrates the efficacy of the new scheme with up to 12% of NoC links being faulty. Synthesis results also prove the feasibility of the proposed protocol, at modest hardware and power consumption overheads of just over 5% and 2.5%, respectively.
Editorial
Optimising pseudo-random built-in self-testing of fully synchronous as well as multisynchronous networks-on-chip
Built-in fast gather control network for efficient support of coherence protocols
On the trade-off of mixing scientific applications on capacity high-performance computing systems
Dynamic fault-tolerant routing algorithm for networks-on-chip based on localised detouring paths
Most cited content for this Journal
- High-performance elliptic curve cryptography processor over NIST prime fields
  - Author(s): Md Selim Hossain; Yinan Kong; Ehsan Saeedi; Niras C. Vayalil
  - Type: Article
- Majority-based evolution state assignment algorithm for area and power optimisation of sequential circuits
  - Author(s): Aiman H. El-Maleh
  - Type: Article
- Scalable GF(p) Montgomery multiplier based on a digit–digit computation approach
  - Author(s): M. Morales-Sandoval and A. Diaz-Perez
  - Type: Article
- Fabrication and characterisation of Al gate n-metal–oxide–semiconductor field-effect transistor, on-chip fabricated with silicon nitride ion-sensitive field-effect transistor
  - Author(s): Rekha Chaudhary; Amit Sharma; Soumendu Sinha; Jyoti Yadav; Rishi Sharma; Ravindra Mukhiya; Vinod K. Khanna
  - Type: Article
- Adaptively weighted round-robin arbitration for equality of service in a many-core network-on-chip
  - Author(s): Hanmin Park and Kiyoung Choi
  - Type: Article