BoW image retrieval method based on SSD target detection

In image retrieval, the query image is usually a simple, single object, while the reference images in the database typically contain many distractions. The precision of image retrieval can be greatly improved if the target regions of the database images are extracted during retrieval. This paper therefore proposes a BoW (bag-of-words) image retrieval method based on SSD (single-shot MultiBox detector) target detection. First, the training gallery is manually annotated to record the location and size of each target. Second, the SSD target detection model is trained on the labelled training gallery to obtain a target-object SSD model. Third, the SSD model is used to locate the matching target regions in the reference images and the query image. Finally, the target region information is mapped onto the convolutional features, and the resulting feature vectors are used for image similarity matching. The performance of the proposed method is evaluated on the Paris6k, Oxford5k, Paris106k and Oxford105k databases. The experimental results show that adding the optimisation methods to the proposed retrieval framework greatly improves retrieval accuracy, and that the accuracy of this method is higher than that of comparable methods from recent years.
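The final two steps described above — mapping a detected target region onto the convolutional feature map and matching the pooled vectors — can be sketched roughly as follows. This is a minimal illustration under assumed shapes (a 7×7×512 VGG-style feature map from a 224×224 input) using MAC-style max-pooling; the paper's exact mapping and aggregation may differ, and the function names here are hypothetical.

```python
import numpy as np

def bbox_to_feature_region(bbox, img_size, fmap_size):
    """Project an SSD-detected bounding box from image coordinates
    onto the spatial grid of a convolutional feature map."""
    x1, y1, x2, y2 = bbox
    W, H = img_size
    fw, fh = fmap_size
    # Scale corners by the downsampling ratio; round outward and
    # clamp so the region covers at least one feature-map cell.
    fx1 = int(np.floor(x1 / W * fw))
    fy1 = int(np.floor(y1 / H * fh))
    fx2 = max(fx1 + 1, int(np.ceil(x2 / W * fw)))
    fy2 = max(fy1 + 1, int(np.ceil(y2 / H * fh)))
    return fx1, fy1, fx2, fy2

def region_descriptor(fmap, region):
    """Max-pool the activations inside the region (MAC-style) and
    L2-normalise, giving one descriptor per detected object."""
    fx1, fy1, fx2, fy2 = region
    v = fmap[fy1:fy2, fx1:fx2, :].max(axis=(0, 1))
    return v / (np.linalg.norm(v) + 1e-12)

# Toy example: a random 7x7x512 feature map standing in for the
# conv output of a 224x224 image.
rng = np.random.default_rng(0)
fmap = rng.random((7, 7, 512)).astype(np.float32)
region = bbox_to_feature_region((32, 32, 160, 192), (224, 224), (7, 7))
q = region_descriptor(fmap, region)
# With L2-normalised descriptors, cosine similarity between a query
# and a reference descriptor is just their dot product.
sim = float(q @ q)  # self-similarity of a unit vector is 1
```

Restricting the pooling window to the detected region, rather than pooling over the whole feature map, is what lets the descriptor ignore background distractions in the reference image.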

Inspec keywords: image representation; image classification; convolutional neural nets; feature extraction; image matching; image retrieval; vectors; object detection

Other keywords: convolutional features; 256-dimensional features; image similarity; query image; reference image; target object SSD model; 512-dimensional vector characterisation images; Paris5k; target region information; labelled training gallery; bow image retrieval method; query expansion rearrangement method; Paris106k; image retrieval accuracy; similar target regions; retrieval mean average precision; recent optimal CroW method; convolution features; single-shot MultiBox detector target detection; image retrieval framework; SSD target detection model; Paris6k; database image

Subjects: Neural nets; Computer vision and image processing techniques; Information retrieval techniques; Image recognition

http://iet.metastore.ingenta.com/content/journals/10.1049/iet-ipr.2020.0478