© The Institution of Engineering and Technology
In deep learning architectures, rectified linear unit based functions are widely used as activation functions in hidden layers, and the softmax is used in the output layers. Two critical problems of the softmax are identified, and an improved softmax method that resolves them is proposed. The proposed method minimises the instability of the softmax while reducing its losses. Moreover, the method is straightforward, so its computational complexity is low, yet it remains sound and operates robustly. Therefore, the proposed method can replace the softmax function.
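The letter's specific modification is not reproduced in this abstract, but the instability it refers to is well known: naively exponentiating large logits overflows floating-point arithmetic. As context only, a minimal sketch of the standard stabilisation (shifting by the maximum logit, which leaves the result mathematically unchanged) is shown below; this is a textbook technique, not necessarily the method the letter proposes.

```python
import math

def softmax_naive(z):
    # Naive softmax: math.exp overflows for large logits.
    e = [math.exp(v) for v in z]
    s = sum(e)
    return [v / s for v in e]

def softmax_stable(z):
    # Standard stabilisation: shift every logit by max(z).
    # The result is identical because
    # exp(v - c) / sum_j exp(z_j - c) == exp(v) / sum_j exp(z_j).
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

z = [1000.0, 1000.5, 999.0]
# softmax_naive(z) raises OverflowError; the stable form evaluates normally:
print(softmax_stable(z))
```

With the shift applied, the largest exponent argument is exactly zero, so overflow cannot occur regardless of the logit scale.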