© The Institution of Engineering and Technology
As computing systems continue their relentless rise towards and beyond million-core architectures, two considerations that were once secondary become increasingly dominant: power consumption (whether measured in FLOPS/W or W/mm²) and reliability. This study is concerned with the latter. In a system of a million cores, it is unrealistic to expect 100% functionality on power-up; equally, operational availability degrades with time. Monitoring and maintaining the health of such a system using traditional techniques is costly, and most such techniques rely on some form of central overseer or monitor to make the final judgement about system availability, which introduces a single point of failure. Large systems of the future will consist of hardware and software that work together to cope with isolated points of failure, allowing the gross behaviour of the system to degrade gracefully and in a meaningful way in the face of faults. This study describes one such system: the spiking neural network architecture (SpiNNaker), a million-core machine with fault tolerance built in at many levels. The authors show how the system may be used to solve the canonical heat diffusion equation in a distributed manner, and how the quality of the solution is affected by partial system failure.
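As a rough illustration of the idea in the final sentence, the sketch below relaxes a steady-state heat diffusion problem on a small grid and compares the result with and without failed cells. It is a serial toy model, not the authors' SpiNNaker implementation: the grid size, the boundary conditions, the function names `relax` and `mean_abs_error`, and the fault model (a failed cell never updates and is ignored by its neighbours) are all assumptions made for illustration.

```python
def relax(n, dead=frozenset(), sweeps=500, hot=1.0):
    """Jacobi relaxation on an n x n grid (hypothetical toy model).

    The left edge is held at `hot` and the other edges at 0. Interior
    cells listed in `dead` are treated as failed: they never update and
    their neighbours average over the remaining live neighbours only.
    """
    t = [[0.0] * n for _ in range(n)]
    for r in range(n):
        t[r][0] = hot  # fixed-temperature boundary on the left edge

    for _ in range(sweeps):
        new = [row[:] for row in t]
        for r in range(1, n - 1):
            for c in range(1, n - 1):
                if (r, c) in dead:
                    continue  # a failed cell does no work
                neighbours = ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1))
                live = [t[rr][cc] for rr, cc in neighbours
                        if (rr, cc) not in dead]
                if live:
                    # each cell needs only local data, as in a distributed solver
                    new[r][c] = sum(live) / len(live)
        t = new
    return t


def mean_abs_error(a, b, dead):
    """Mean absolute difference between two grids, skipping failed cells."""
    n = len(a)
    diffs = [abs(a[r][c] - b[r][c])
             for r in range(n) for c in range(n)
             if (r, c) not in dead]
    return sum(diffs) / len(diffs)
```

Calling `relax(8)` gives a fault-free reference solution; calling `relax(8, dead={(3, 3), (4, 5)})` and passing both grids to `mean_abs_error` gives one crude measure of how far the solution drifts when cells fail. Because each update uses only nearest-neighbour values, the loss of a cell perturbs the solution locally rather than halting the computation, which is the kind of graceful degradation the abstract describes.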