Method for unconstrained text detection in natural scene image

Text detection in natural scene images is an important prerequisite for many content-based multimedia understanding applications. The authors present a simple and effective method for detecting text in natural scene images. Firstly, MSERs are extracted as component candidates by the V-MSER algorithm from the G, H, S, O1, and O2 channels. Since text is composed of character candidates, the authors design an MRF model to exploit the relationship between characters. Secondly, to filter out non-text components, they design a two-layer filtering scheme: the first layer removes most of the non-text components, and the second layer is an AdaBoost classifier trained on the features of compactness, horizontal variance, vertical variance, and aspect ratio. Then, only four simple features are used to generate component pairs. Finally, component pairs that have roughly the same orientation are merged into text lines according to their orientation similarity. The proposed method is evaluated on two public datasets, ICDAR 2011 and MSRA-TD500, where it achieves F-measures of 82.94% and 75%, respectively. In particular, experimental results on the authors' URMQ_LHASA-TD220 dataset, which contains 220 images for evaluating multi-orientation and multi-language text lines, show that the proposed method generalizes to detecting scene text lines in different languages.
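The final merging step can be illustrated with a minimal sketch. The abstract does not say how orientation is measured or what tolerance is used, so everything below is an assumption made for illustration: each component is reduced to a centroid, the orientation of a pair is the angle of the line joining the two centroids, and two pairs are merged when they share a component and their orientations differ by at most `tol_deg` (a hypothetical threshold). The function names are also hypothetical and are not taken from the paper.

```python
import math
from collections import defaultdict


def pair_orientation(c1, c2):
    """Undirected angle in degrees, in [0, 180), of the line through two centroids."""
    dx, dy = c2[0] - c1[0], c2[1] - c1[1]
    return math.degrees(math.atan2(dy, dx)) % 180.0


def angle_diff(a, b):
    """Smallest difference between two undirected orientations, in degrees."""
    d = abs(a - b) % 180.0
    return min(d, 180.0 - d)


def merge_pairs(centroids, pairs, tol_deg=10.0):
    """Group component pairs into candidate text lines.

    centroids: list of (x, y) tuples, one per component (assumed representation).
    pairs: list of (i, j) index tuples into centroids.
    tol_deg: hypothetical orientation tolerance; not a value from the paper.
    Returns a sorted list of sorted component-index lists, one per line.
    """
    parent = list(range(len(pairs)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    angles = [pair_orientation(centroids[a], centroids[b]) for a, b in pairs]
    for i in range(len(pairs)):
        for j in range(i + 1, len(pairs)):
            shares_component = bool(set(pairs[i]) & set(pairs[j]))
            if shares_component and angle_diff(angles[i], angles[j]) <= tol_deg:
                parent[find(i)] = find(j)

    lines = defaultdict(set)
    for i, pair in enumerate(pairs):
        lines[find(i)].update(pair)
    return sorted(sorted(members) for members in lines.values())


if __name__ == "__main__":
    centroids = [(0, 0), (10, 0), (20, 1), (0, 50), (1, 60)]
    pairs = [(0, 1), (1, 2), (3, 4)]
    print(merge_pairs(centroids, pairs))
```

In this toy input, the first two pairs share component 1 and are nearly horizontal, so they fall into one line, while the third pair is nearly vertical and shares no component with the others. Union-find is used here only because it keeps the grouping transitive; the paper may well merge pairs differently.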

Inspec keywords: text detection; image classification; multimedia systems; natural language processing; image filtering

Other keywords: second layer filtering scheme; nontext component filtering; character candidates; first layer filtering scheme; ICDAR 2011; natural scene image; multilanguage text lines evaluation; V-MSER algorithm; URMQ_LHASA-TD220 dataset; aspect ratio; compactness; multiorientation text lines evaluation; vertical variance; MSRA-TD500; unconstrained text detection; component pair orientation similarity; horizontal variance; content-based multimedia understanding applications; two-layers filtering scheme; MRF model; AdaBoost classifier

Subjects: Computer vision and image processing techniques; Image recognition; Natural language interfaces; Filtering methods in signal processing; Multimedia

http://iet.metastore.ingenta.com/content/journals/10.1049/iet-cvi.2016.0452