© The Institution of Engineering and Technology
Text plays an important role in daily life because of the rich information it carries, so automatic text detection in natural scenes has many attractive applications. However, detecting and recognising such text remains a challenging problem. In this study, the authors propose a method that extends the widely used stroke width transform with two steps of edge analysis, namely candidate edge recombination and edge classification, and a recognition method built on the recombined candidate edges. In the candidate edge recombination step, the authors adopt the idea of over-segmentation and region merging: to separate text edges from the background, the edges of the input image are first divided into small segments, and neighbouring edge segments are then merged if they have similar stroke width and colour. After this step, each character is described by a single candidate boundary. In the edge classification step, candidate boundaries are aggregated into text chains, and the chains are classified using character-based and chain-based features. To recognise text, a grey-scale image patch is extracted at the location of each candidate edge produced by the recombination step; histogram of oriented gradients features and a classifier are then used to recognise each character. To evaluate the effectiveness of the method, the algorithm is run on the ICDAR competition dataset and the Street View Text database. The experimental results show that the proposed method achieves promising performance in comparison with existing methods.
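The merging criterion in the candidate edge recombination step can be sketched as follows. This is a minimal illustration, not the authors' implementation: the `EdgeSegment` representation, the similarity thresholds, and the union-find grouping are all assumptions made for the sake of the example.

```python
from dataclasses import dataclass

@dataclass
class EdgeSegment:
    """Hypothetical edge-segment descriptor: the mean stroke width and
    mean RGB colour of the pixels along one small edge segment."""
    stroke_width: float
    colour: tuple  # (r, g, b)

def similar(a, b, sw_ratio=1.5, colour_dist=40.0):
    """Merge test: comparable stroke widths and close colours.
    The threshold values are illustrative assumptions."""
    lo, hi = sorted((a.stroke_width, b.stroke_width))
    if hi / max(lo, 1e-6) > sw_ratio:
        return False
    dist = sum((x - y) ** 2 for x, y in zip(a.colour, b.colour)) ** 0.5
    return dist <= colour_dist

def merge_segments(segments, neighbours):
    """Union-find over pairs of neighbouring segments that pass the
    similarity test; each resulting group is one candidate boundary."""
    parent = list(range(len(segments)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i, j in neighbours:
        if similar(segments[i], segments[j]):
            parent[find(i)] = find(j)

    groups = {}
    for i in range(len(segments)):
        groups.setdefault(find(i), []).append(i)
    return list(groups.values())

segs = [
    EdgeSegment(3.0, (10, 10, 10)),    # thin dark stroke
    EdgeSegment(3.2, (12, 11, 10)),    # similar neighbour -> merged
    EdgeSegment(9.0, (200, 200, 200)), # thick bright stroke -> kept apart
]
groups = merge_segments(segs, neighbours=[(0, 1), (1, 2)])
# segments 0 and 1 merge into one candidate boundary; segment 2 stays alone
```

In this sketch, only segment pairs that are both spatial neighbours and similar in stroke width and colour are merged, which matches the paper's intent of recombining over-segmented edges so that each character ends up with one candidate boundary.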