Global and local exploitation for saliency using bag-of-words
- Author(s): Zhenzhu Zheng 1 ; Yun Zhang 1 ; Luxin Yan 2
- Affiliations:
1: High Performance Computing Center, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, People's Republic of China
2: Science and Technology on Multi-spectral Information Processing Laboratory, School of Automation, Huazhong University of Science & Technology, Wuhan, People's Republic of China
- Source: IET Computer Vision, Volume 8, Issue 4, August 2014, pp. 299–304
DOI: 10.1049/iet-cvi.2013.0132; Print ISSN 1751-9632; Online ISSN 1751-9640
The guidance of attention helps the human visual system detect objects rapidly. In this study, the authors present a new saliency detection algorithm based on a bag-of-words (BOW) representation. Salient regions are taken to arise from features that are globally rare and from regions that differ locally from their surroundings. The approach consists of three stages. First, the global rarity of visual words is calculated: a vocabulary (a set of visual words) is generated from the given image, and a rarity factor is assigned to each visual word according to its frequency of occurrence. Second, local contrast is calculated: each local patch is represented by a histogram of visual words, and local contrast is computed as the difference between the BOW histograms of a patch and its surroundings. Finally, saliency is measured by combining global rarity and local patch contrast. The model is compared with previous methods on natural images, and experimental results demonstrate good performance and fair consistency with human eye fixations.
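The three-stage pipeline described in the abstract can be sketched in code. The following is a minimal illustration, not the authors' implementation: the k-means vocabulary, the negative-log rarity factor, the L1 histogram distance, and the multiplicative combination are all assumptions chosen to make the idea concrete.

```python
import numpy as np

rng = np.random.default_rng(0)

def build_vocabulary(features, k=8, iters=10):
    # Stage 1a (assumed): quantise patch feature vectors into k visual
    # words with a plain k-means clustering pass.
    centers = features[rng.choice(len(features), k, replace=False)].astype(float).copy()
    for _ in range(iters):
        d = np.linalg.norm(features[:, None] - centers[None], axis=2)
        labels = d.argmin(axis=1)
        for j in range(k):
            pts = features[labels == j]
            if len(pts):
                centers[j] = pts.mean(axis=0)
    return centers

def assign_words(features, centers):
    # Map each feature vector to its nearest visual word.
    d = np.linalg.norm(features[:, None] - centers[None], axis=2)
    return d.argmin(axis=1)

def word_rarity(labels, k):
    # Stage 1b (assumed form): words that occur rarely in the image
    # receive a high rarity factor, here -log of their frequency.
    counts = np.bincount(labels, minlength=k).astype(float)
    probs = counts / counts.sum()
    return -np.log(probs + 1e-12)

def bow_histogram(labels, k):
    # Normalised BOW histogram of the words inside a region.
    h = np.bincount(labels, minlength=k).astype(float)
    return h / max(h.sum(), 1.0)

def local_contrast(patch_labels, surround_labels, k):
    # Stage 2: difference between the BOW histograms of a patch and its
    # surround, measured here with an L1 distance (an assumption).
    return np.abs(bow_histogram(patch_labels, k)
                  - bow_histogram(surround_labels, k)).sum()

def patch_saliency(patch_labels, surround_labels, rarity, k):
    # Stage 3 (assumed combination): mean rarity of the patch's words
    # weighted by the patch's local contrast.
    return rarity[patch_labels].mean() * local_contrast(patch_labels, surround_labels, k)
```

A patch drawn from a rare cluster that also differs from its surround scores high on both factors, which matches the intuition in the abstract; the true combination rule is given in the paper itself.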
Inspec keywords: object detection; image representation
Other keywords: saliency detection algorithm; objects detection; BOW representation; human vision system; salient regions; local patch representation; bag-of-words representation; natural images
Subjects: Optical, image and video signal processing; Computer vision and image processing techniques