Method for unconstrained text detection in natural scene image

Text detection in natural scene images is an important prerequisite for many content-based multimedia understanding applications. The authors present a simple and effective text detection method for natural scene images. First, MSERs are extracted by the V-MSER algorithm from the G, H, S, O1, and O2 channels as component candidates. Since text is composed of character candidates, the authors design an MRF model to exploit the relationship between characters. Second, to filter out non-text components, they design a two-layer filtering scheme: most non-text components are removed by the first layer, while the second layer is an AdaBoost classifier trained on compactness, horizontal and vertical variance, and aspect ratio features. Then, only four simple features are used to generate component pairs. Finally, component pairs with roughly the same orientation are merged into text lines. The proposed method is evaluated on two public datasets, ICDAR 2011 and MSRA-TD500, achieving F-measures of 82.94% and 75%, respectively. In particular, experimental results on the authors' URMQ_LHASA-TD220 dataset, which contains 220 images for multi-orientation and multi-language text-line evaluation, show that the proposed method generalises to scene text lines in different languages.
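The first stage of the pipeline described above can be illustrated with a short sketch. The snippet below is a minimal, illustrative approximation only: it builds the G, H, S, O1, and O2 channels (using the standard opponent-colour definitions) and extracts component candidates per channel with OpenCV's stock MSER as a stand-in for the authors' V-MSER, followed by a rough geometric filter in place of the paper's two-layer scheme. The file name, thresholds, and the compactness definition are assumptions for illustration, not the paper's actual parameters.

```python
# Minimal sketch of per-channel component extraction, assuming OpenCV's
# standard MSER as a stand-in for the authors' V-MSER variant.
# Thresholds below are illustrative, not taken from the paper.
import cv2
import numpy as np

def candidate_channels(bgr):
    """Return the G, H, S, O1 and O2 channels as 8-bit images."""
    b, g, r = cv2.split(bgr.astype(np.float32))
    hsv = cv2.cvtColor(bgr, cv2.COLOR_BGR2HSV)
    h, s = hsv[:, :, 0], hsv[:, :, 1]
    o1 = (r - g) / np.sqrt(2.0)              # opponent channel O1
    o2 = (r + g - 2.0 * b) / np.sqrt(6.0)    # opponent channel O2
    norm = lambda x: cv2.normalize(x, None, 0, 255, cv2.NORM_MINMAX).astype(np.uint8)
    return {'G': g.astype(np.uint8), 'H': h, 'S': s, 'O1': norm(o1), 'O2': norm(o2)}

def extract_components(bgr):
    """Run MSER on each channel and keep geometrically plausible regions
    (a rough stand-in for the first filtering layer of the paper)."""
    mser = cv2.MSER_create()
    components = []
    for name, chan in candidate_channels(bgr).items():
        regions, boxes = mser.detectRegions(chan)
        for pts, (x, y, w, h) in zip(regions, boxes):
            aspect = w / float(h)
            compactness = len(pts) / float(w * h)            # filled-area ratio
            if 0.1 < aspect < 10.0 and compactness > 0.1:    # illustrative thresholds
                components.append({'channel': name, 'bbox': (x, y, w, h)})
    return components

if __name__ == '__main__':
    img = cv2.imread('scene.jpg')            # hypothetical input image
    print(len(extract_components(img)), 'candidate components')
```

In this sketch the surviving components would then be scored by the trained classifier, paired, and merged into text lines by orientation similarity, as the abstract describes.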
