OrthoMaps: an efficient convolutional neural network with orthogonal feature maps for tiny image classification

In the image-processing domain of deep learning, the large size and complexity of visual data demand a large number of learnable parameters, so the training process consumes enormous computation and memory resources. Based on residual modules, the authors developed a new model architecture with a minimal number of parameters and layers, enabling tiny images to be classified at much lower computation and memory cost. In addition, the sum of correlations between pairs of feature maps is used as an additive penalty in the objective function; this encourages the kernels to be learned so that they elicit uncorrelated representations from the input images. Employing fractional pooling permits deeper networks and, consequently, more informative representations. Moreover, by following periodic learning-rate curves, multiple models are trained at a lower total cost. In the training phase, random augmentation is applied to the input data to prevent the model from overfitting. On the MNIST and CIFAR-10 datasets, the proposed model achieved classification accuracies of 99.72% and 93.98%, respectively.
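The abstract describes the orthogonality penalty only informally (the summation of correlations between pairs of feature maps) and does not give the exact formula. The following is a minimal NumPy sketch of one plausible formulation, assuming the correlation is computed between flattened, zero-mean feature maps; the function name, array shapes, and the factor of one half are illustrative choices, not taken from the paper.

import numpy as np

def orthogonality_penalty(feature_maps, eps=1e-8):
    """Sum of absolute pairwise correlations between feature maps.

    feature_maps: array of shape (C, H, W), the C maps produced by one
    convolutional layer for a single input. Adding the returned scalar
    to the training loss pushes the kernels toward producing mutually
    uncorrelated (near-orthogonal) responses.
    """
    C = feature_maps.shape[0]
    flat = feature_maps.reshape(C, -1)                # one row per map
    flat = flat - flat.mean(axis=1, keepdims=True)    # zero-mean each map
    norms = np.linalg.norm(flat, axis=1, keepdims=True) + eps
    corr = (flat / norms) @ (flat / norms).T          # C x C correlation matrix
    off_diag = corr - np.diag(np.diag(corr))          # discard self-correlation
    return np.abs(off_diag).sum() / 2.0               # count each pair once

# Toy check: 8 random 16x16 maps are already nearly uncorrelated, so the
# penalty should be small relative to the C*(C-1)/2 possible pairs.
maps = np.random.randn(8, 16, 16)
print(orthogonality_penalty(maps))

In training, such a term would be scaled by a regularisation weight and added to the classification loss; driving the off-diagonal correlations toward zero is what makes the learned feature maps approximately orthogonal.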
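Similarly, the abstract mentions periodic learning-rate curves used to obtain multiple trained models at a lower total cost, without specifying the schedule. The sketch below shows one common periodic schedule of this kind (cosine annealing with restarts); cycle_len, lr_max, and lr_min are illustrative defaults, not the paper's settings.

import math

def periodic_lr(step, cycle_len=1000, lr_max=0.1, lr_min=1e-4):
    """Cosine learning-rate curve that restarts every cycle_len steps.

    Within each cycle the rate anneals from lr_max down to lr_min; taking
    a snapshot of the weights at the end of every cycle yields several
    trained models for roughly the cost of one long run.
    """
    t = (step % cycle_len) / cycle_len        # position within current cycle
    return lr_min + 0.5 * (lr_max - lr_min) * (1.0 + math.cos(math.pi * t))

# Example: rate at the start, middle, and end of the first cycle.
for s in (0, 500, 999):
    print(s, periodic_lr(s))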

Inspec keywords: visual databases; image classification; convolutional neural nets; learning (artificial intelligence); feature extraction

Other keywords: periodic learning rate curves; convolutional neural network; fractional pooling; input data; orthogonal feature maps; input images; deep learning; training process; image processing domain; model architecture; residual modules; tiny image classification; learnable variables; informative representation; additive penalty; elicit uncorrelated representations; total cost; memory costs; memory resources; deeper networks; objective function; OrthoMaps; visual data

Subjects: Spatial and pictorial databases; Image recognition; Neural computing techniques; Computer vision and image processing techniques; Knowledge engineering techniques
