© The Institution of Engineering and Technology
Scene text detection is an important step in text-based information extraction systems and has become a popular research subject in recent years. In this study, the authors present a novel approach for robustly detecting text that varies in scale, colour, font, language and orientation in scene images. To segment candidate text connected components (CCs) from images, both local contrast and colour consistency are considered at the superpixel level. To filter out the non-text CCs, a hierarchical model is designed. This hierarchical model processes the CCs in three cascaded stages, each equipped with its own dedicated classifier. Experimental results on the public ICDAR 2005 dataset and the MSRA-TD500 dataset show that the approach outperforms other state-of-the-art methods.
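The cascaded filtering idea described above can be illustrated with a minimal sketch. The abstract does not specify the features or classifiers used in each stage, so everything named here (the `Component` fields, the threshold rules and the `cascade_filter` helper) is a hypothetical stand-in rather than the authors' model; the sketch only shows how three stages applied in sequence reject non-text CCs early.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass
class Component:
    """A candidate connected component (CC). All fields are illustrative assumptions."""
    aspect_ratio: float  # bounding-box width divided by height
    contrast: float      # local contrast against the background, in [0, 1]
    neighbours: int      # number of similar CCs found nearby


# A stage is any callable that returns True when a CC should be kept.
Stage = Callable[[Component], bool]


def cascade_filter(ccs: Sequence[Component], stages: Sequence[Stage]) -> List[Component]:
    """Apply the stages in order; a CC rejected by one stage never reaches the next."""
    survivors = list(ccs)
    for stage in stages:
        survivors = [cc for cc in survivors if stage(cc)]
    return survivors


# Hypothetical threshold rules standing in for the trained per-stage classifiers.
stages: List[Stage] = [
    lambda cc: 0.1 <= cc.aspect_ratio <= 10.0,  # stage 1: plausible character shape
    lambda cc: cc.contrast >= 0.3,              # stage 2: sufficient local contrast
    lambda cc: cc.neighbours >= 1,              # stage 3: belongs to a group of CCs
]

candidates = [
    Component(aspect_ratio=0.8, contrast=0.7, neighbours=2),   # passes all stages
    Component(aspect_ratio=25.0, contrast=0.9, neighbours=3),  # rejected at stage 1
    Component(aspect_ratio=1.2, contrast=0.1, neighbours=4),   # rejected at stage 2
    Component(aspect_ratio=1.0, contrast=0.6, neighbours=0),   # rejected at stage 3
]

text_ccs = cascade_filter(candidates, stages)
```

In a cascade of this kind, cheap tests are usually placed first so that most non-text candidates are discarded before the costlier stages run; whether the authors order their stages this way is not stated in the abstract.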
http://iet.metastore.ingenta.com/content/journals/10.1049/iet-cvi.2014.0297