Text segmentation using superpixel clustering

Yuanping Zhu; Kuang Zhang

Author(s): Yuanping Zhu¹ and Kuang Zhang¹
Affiliations: ¹ Department of Computer Science, Tianjin Normal University, No. 393 Binshuixi Road, Xiqing District, Tianjin, People's Republic of China
Source: Volume 11, Issue 7, July 2017, pp. 455–464
DOI: 10.1049/iet-ipr.2016.0914, Print ISSN 1751-9659, Online ISSN 1751-9667

Received 01/11/2016, Revised 28/03/2017, Accepted 08/04/2017, Published 13/04/2017

Text segmentation is important for text image analysis and recognition, but it is challenging because of noise and complex backgrounds in natural scenes. Superpixel-based image representation can enhance robustness to noise and local disturbances; however, conventional superpixel algorithms have difficulty capturing complete stroke regions and accurate boundaries in text images. This study proposes a text segmentation method based on superpixel clustering. First, to generate accurate superpixels for text images, an adaptive text superpixel generation algorithm based on simple linear iterative clustering (SLIC) is proposed, in which the superpixel size and compactness are calculated adaptively to enhance boundary adherence. Second, to improve the coverage of complete strokes by superpixels, a superpixel clustering step merges homogeneous superpixels into larger regions for both strokes and the background, using a modified density-based spatial clustering of applications with noise (DBSCAN) algorithm. Finally, stroke superpixel verification assigns each region to either a stroke or the background, yielding the text segmentation result. The proposed method shows promising robustness to noise and complex background textures. Experimental results on the Korea Advanced Institute of Science and Technology (KAIST) scene text dataset, the International Conference on Document Analysis and Recognition (ICDAR) 2003 natural scene text image dataset and the Street View Text dataset verify that the method is effective and significantly outperforms existing methods.
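
The superpixel clustering step can be illustrated with a minimal sketch of density-based clustering applied to superpixels. The abstract does not state how the authors modify DBSCAN, so the code below is not their algorithm. It assumes, for illustration only, that each superpixel is summarised by a mean colour and that its neighbourhood is limited to spatially adjacent superpixels within a colour distance `eps`. The function name `cluster_superpixels`, its parameters and the toy data are hypothetical.

```python
from collections import deque
from math import dist

NOISE = -1  # label for superpixels that join no cluster


def cluster_superpixels(colors, adjacency, eps, min_pts):
    """DBSCAN-style clustering over superpixels (illustrative sketch).

    colors:    list of mean-colour tuples, one per superpixel
    adjacency: list of sets; adjacency[i] holds the superpixels touching i
    eps:       maximum colour distance between neighbouring superpixels
    min_pts:   minimum neighbourhood size (self included) for a core superpixel
    """
    def neighbours(i):
        # Assumption: only spatially adjacent superpixels can be neighbours.
        return [j for j in adjacency[i] if dist(colors[i], colors[j]) <= eps]

    labels = [None] * len(colors)  # None means "not visited yet"
    cluster = 0
    for i in range(len(colors)):
        if labels[i] is not None:
            continue
        seeds = neighbours(i)
        if len(seeds) + 1 < min_pts:
            labels[i] = NOISE  # may still be claimed later as a border member
            continue
        labels[i] = cluster
        queue = deque(seeds)
        while queue:
            j = queue.popleft()
            if labels[j] == NOISE:
                labels[j] = cluster  # border superpixel: joins, does not expand
            if labels[j] is not None:
                continue
            labels[j] = cluster
            reachable = neighbours(j)
            if len(reachable) + 1 >= min_pts:
                queue.extend(reachable)  # core superpixel: keep growing
        cluster += 1
    return labels


# Toy data: six superpixels in a row (0-1-2-3-4-5).
colors = [
    (10, 10, 10), (12, 11, 10), (11, 12, 9),  # dark, stroke-like
    (200, 200, 200), (198, 201, 202),         # bright, background-like
    (90, 20, 150),                            # isolated outlier
]
adjacency = [{1}, {0, 2}, {1, 3}, {2, 4}, {3, 5}, {4}]
labels = cluster_superpixels(colors, adjacency, eps=10, min_pts=2)
print(labels)  # [0, 0, 0, 1, 1, -1]
```

In this toy run the three dark superpixels merge into one region, the two bright ones into another, and the outlier is left as noise. A later verification step, which the abstract mentions but does not detail, would then decide which merged regions are strokes.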
