© The Institution of Engineering and Technology
Local feature extraction is one of the key characteristics of convolutional neural networks (CNNs). This study proposes an adaptive local feature enhancement (ALFE) model that combines a low-frequency general appearance enhancement operator with a high-frequency local detail enhancement operator to improve the local features of CNNs. Through supervised training, the model adaptively adjusts its enhancement parameters and achieves a global-local enhancement of the training images and the CNNs. ALFE was first evaluated with a self-built CNN on the CIFAR-10 data set under different conditions of image augmentation and feature pooling. CNNs with ALFE achieved higher top-1 accuracy than CNNs without it under the same conditions, and reached nearly the same level of performance across the different pooling approaches. With only two extra adjustable parameters, the model effectively avoids overfitting without affecting the convergence speed of the CNN, and the added network complexity is negligible. Further experiments with three popular existing CNNs (AlexNet, VGGNet and ResNet) equipped with ALFE were carried out on the dogs versus cats, Tiny ImageNet and SVHN data sets, respectively. The results show that ALFE is applicable to these popular CNN models, improving their top-1 accuracies without changing their convergence speeds.
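The abstract does not give the form of the two operators, so the following is only an illustrative sketch of the general idea of a two-parameter low/high-frequency enhancement, not the authors' ALFE formulation. It assumes that the low-frequency component is a simple box blur and the high-frequency component is the residual, and that each is scaled by one gain. The names `box_blur`, `enhance`, `alpha`, `beta` and `radius` are hypothetical; in a trained model the two gains would be learned by backpropagation, whereas here they are plain numbers.

```python
import numpy as np


def box_blur(image: np.ndarray, radius: int = 1) -> np.ndarray:
    """Low-pass filter (assumed): mean over a (2*radius+1)^2 window, edge-padded."""
    size = 2 * radius + 1
    padded = np.pad(image, radius, mode="edge")
    out = np.zeros_like(image, dtype=float)
    # Sum the shifted copies of the padded image, then normalise by window area.
    for dy in range(size):
        for dx in range(size):
            out += padded[dy:dy + image.shape[0], dx:dx + image.shape[1]]
    return out / (size * size)


def enhance(image: np.ndarray, alpha: float, beta: float, radius: int = 1) -> np.ndarray:
    """Scale the low- and high-frequency parts of a 2D image by two gains.

    alpha (hypothetical) weights the general appearance (low frequency);
    beta (hypothetical) weights the local detail (high frequency).
    """
    low = box_blur(image, radius)   # general appearance
    high = image - low              # local detail (residual)
    return alpha * low + beta * high


if __name__ == "__main__":
    img = np.arange(16, dtype=float).reshape(4, 4)
    # With both gains at 1 the decomposition recombines to the input.
    print(np.allclose(enhance(img, 1.0, 1.0), img))
```

Setting `beta` above 1 sharpens local detail while leaving the overall appearance unchanged, and setting `alpha` away from 1 rescales the smooth component. This two-gain structure is one plausible reading of "only two extra adjustable parameters", stated here as an assumption.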
http://iet.metastore.ingenta.com/content/journals/10.1049/iet-ipr.2020.0591