Image region annotation based on segmentation and semantic correlation analysis

The authors propose an image region annotation framework that explores syntactic and semantic correlations among segmented regions in an image. A texture-enhanced JSEG segmentation algorithm is first used to improve pixel consistency within each segmented region. Next, each region is represented by a set of image codewords, also known as visual alphabets, each characterising certain low-level image features. A visual lexicon, whose vocabulary items are defined as either a single codeword or a co-occurrence of multiple alphabets, is then formed and used to model middle-level semantic concepts. The concept classification models are trained by a maximal figure-of-merit (MFoM) algorithm on a collection of training images with multiple correlations, including spatial, syntactic and semantic relationships, between regions and their corresponding concepts. In addition, a region–semantic correlation model constructed with latent semantic analysis (LSA) is used to correct potentially wrong annotations by analysing the relationship between image region positions and labels. When evaluated on the Corel 5K dataset, the proposed framework achieves accurate results on both image region concept tagging and whole-image annotation.
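The codeword-plus-LSA portion of the pipeline can be illustrated with a minimal sketch. The following is not the authors' implementation: the feature vectors, labels, the number of codewords and the SVD rank are all toy assumptions, and codeword assignment is reduced to a single nearest-centroid step in place of a full clustering of region features. It only shows the general shape of the idea: quantise region features into a visual alphabet, accumulate a codeword-by-concept co-occurrence matrix, and use a low-rank (LSA-style) reconstruction of that matrix as a smoothed region–concept correlation score.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical low-level features for 200 segmented regions
# (stand-ins for colour/texture descriptors).
region_features = rng.normal(size=(200, 8))

# Step 1: quantise features into a 16-entry visual alphabet.
# Here centroids are just sampled data points (a single nearest-centroid
# assignment standing in for a full k-means codebook).
centroids = region_features[rng.choice(200, size=16, replace=False)]
dists = np.linalg.norm(region_features[:, None, :] - centroids[None, :, :], axis=2)
codewords = dists.argmin(axis=1)          # one codeword id per region

# Step 2: build a codeword-by-concept co-occurrence matrix from toy labels.
n_concepts = 5
labels = rng.integers(0, n_concepts, size=200)
cooc = np.zeros((16, n_concepts))
for cw, lab in zip(codewords, labels):
    cooc[cw, lab] += 1

# Step 3: LSA via truncated SVD; the rank-k reconstruction smooths the
# raw counts into correlation scores that can flag implausible annotations.
U, S, Vt = np.linalg.svd(cooc, full_matrices=False)
k = 3
low_rank = (U[:, :k] * S[:k]) @ Vt[:k]    # smoothed codeword/concept scores

def concept_score(codeword, concept):
    """Smoothed plausibility of `concept` for a region with this codeword."""
    return low_rank[codeword, concept]
```

In the paper's setting the scores would be combined with the MFoM classifier outputs and with region-position information; this sketch covers only the latent-semantic smoothing step.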

http://iet.metastore.ingenta.com/content/journals/10.1049/iet-ipr.2017.0917