Recurrent neural networks for remote sensing image classification

Automatically classifying an image has been a central problem in computer vision for decades. A plethora of models has been proposed, from handcrafted feature solutions to more sophisticated approaches such as deep learning. The authors address the problem of remote sensing image classification, which is important for many real-world applications. They introduce a novel deep recurrent architecture that incorporates high-level feature descriptors to tackle this challenging problem. Their solution is based on the general encoder–decoder framework. To the best of the authors' knowledge, this is the first study to use a recurrent network structure for this task. The experimental results show that the proposed framework outperforms previous work on three datasets widely used in the literature, achieving a state-of-the-art accuracy of 97.29% on the UC Merced dataset.
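
The excerpt only describes the approach at a high level (high-level CNN feature descriptors fed to a recurrent, encoder–decoder-style network), so the sketch below is a minimal illustration of that general idea rather than the authors' implementation. The class name RecurrentSceneClassifier, the small stand-in convolutional stack, the choice of a GRU over a 7x7 feature grid, and all layer sizes are assumptions made for illustration; the paper's actual model may use pretrained descriptors, an LSTM, or a full decoder/attention stage.

import torch
import torch.nn as nn

class RecurrentSceneClassifier(nn.Module):
    """Hypothetical sketch: CNN feature maps read as a spatial sequence by a GRU."""
    def __init__(self, num_classes=21, hidden_size=256):
        super().__init__()
        # Small stand-in CNN; the paper relies on high-level descriptors from a
        # deep network, which is not reproduced here.
        self.cnn = nn.Sequential(
            nn.Conv2d(3, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d((7, 7)),
        )
        # Recurrent encoder: consumes the 7x7 grid of 128-dim descriptors
        # as a length-49 sequence, one step per spatial location.
        self.rnn = nn.GRU(input_size=128, hidden_size=hidden_size, batch_first=True)
        self.fc = nn.Linear(hidden_size, num_classes)

    def forward(self, x):
        feats = self.cnn(x)                     # (B, 128, 7, 7)
        seq = feats.flatten(2).transpose(1, 2)  # (B, 49, 128)
        _, h_n = self.rnn(seq)                  # final hidden state: (1, B, hidden)
        return self.fc(h_n.squeeze(0))          # class logits: (B, num_classes)

if __name__ == "__main__":
    model = RecurrentSceneClassifier(num_classes=21)   # 21 = UC Merced land-use classes
    logits = model(torch.randn(2, 3, 256, 256))        # UC Merced tiles are 256x256 RGB
    print(logits.shape)                                # torch.Size([2, 21])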

Inspec keywords: learning (artificial intelligence); recurrent neural nets; geophysical image processing; computer vision; remote sensing; feature extraction; image classification

Other keywords: recurrent neural networks; general encoder-decoder framework; high-level feature descriptors; deep recurrent architecture; deep learning; UC Merced dataset; RS-19 dataset; computer vision; automatic image classification; remote sensing image classification; recurrent network structure; Brazilian Coffee Scenes dataset

Subjects: Neural computing techniques; Image recognition; Knowledge engineering techniques; Geography and cartography computing; Instrumentation and techniques for geophysical, hydrospheric and lower atmosphere research; Geophysical techniques and equipment; Data and information; acquisition, processing, storage and dissemination in geophysics; Computer vision and image processing techniques
