Stable improved softmax using constant normalisation

In deep learning architectures, rectified linear unit based functions are widely used as activation functions in the hidden layers, and the softmax is used in the output layer. Two critical problems of the softmax are identified, and an improved softmax method that resolves them is proposed. The proposed method minimises the instability of the softmax while reducing its losses. Moreover, the method is straightforward, so its computational complexity is low, yet it is sound and operates robustly. Therefore, the proposed method can replace the softmax function.
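
The letter itself gives no code, and its constant-normalisation scheme is not reproduced here; the sketch below only illustrates the instability the abstract refers to and the widely used max-subtraction normalisation, written in NumPy. The function names and the example logits are illustrative assumptions, not material from the letter.

import numpy as np

def softmax_naive(z):
    # Direct definition: exp(z_i) / sum_j exp(z_j).
    # Overflows to inf/nan once any logit exceeds roughly 709 in float64.
    e = np.exp(z)
    return e / e.sum()

def softmax_stable(z):
    # Subtracting a per-input constant (here the maximum logit) leaves the
    # result unchanged, because the factor exp(-c) cancels in the ratio,
    # but keeps every exponent <= 0 and so avoids overflow.
    e = np.exp(z - np.max(z))
    return e / e.sum()

z = np.array([1000.0, 1001.0, 1002.0])  # large logits, illustrative values
print(softmax_naive(z))   # [nan nan nan] -- exp overflows
print(softmax_stable(z))  # [0.09003057 0.24472847 0.66524096]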

Inspec keywords: learning (artificial intelligence); computational complexity; transfer functions

Other keywords: deep learning architectures; hidden layers; improved softmax method; constant normalisation; rectified linear unit based functions; computation complexity; activation functions

Subjects: Computational complexity; Learning in AI (theory)
