Recognition of cursive video text using a deep learning framework

This study focuses on the recognition of cursive text appearing in videos, using a complete deep neural network framework. While mature video optical character recognition (V-OCR) systems are available for text in non-cursive scripts, recognition of cursive scripts is marked by numerous challenges, including complex and overlapping ligatures, context-dependent shape variations and the presence of a large number of dots and diacritics. The authors present an analytical technique for recognition of cursive caption text that relies on a combination of convolutional and recurrent neural networks trained in an end-to-end framework. Text lines extracted from video frames are preprocessed to segment the background and are fed to a convolutional neural network for feature extraction. The extracted feature sequences are passed, along with the ground-truth transcriptions, to different variants of bi-directional recurrent neural networks to learn a sequence-to-sequence mapping. Finally, a connectionist temporal classification (CTC) layer produces the final transcription. Experiments on a data set of more than 40,000 text lines from 11,192 video frames of various news channel videos yielded an overall character recognition rate of 97.63%. The proposed work employs Urdu text as a case study, but the findings can be generalised to other cursive scripts as well.
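
To make the described pipeline concrete, the following is a minimal sketch, in PyTorch, of a convolutional feature extractor followed by a bi-directional LSTM and a CTC output layer of the kind the abstract outlines. The layer sizes, the class count (150 characters plus a CTC blank) and all names are illustrative assumptions, not the authors' exact configuration.

import torch
import torch.nn as nn

class CursiveTextRecognizer(nn.Module):
    """Illustrative CNN + bidirectional LSTM + CTC text-line recogniser (not the paper's exact network)."""
    def __init__(self, num_classes=150, rnn_hidden=256):
        super().__init__()
        # Convolutional feature extractor: collapses the image height, keeps width as the time axis.
        self.cnn = nn.Sequential(
            nn.Conv2d(1, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2, 2),
            nn.Conv2d(64, 128, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2, 2),
            nn.Conv2d(128, 256, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d((1, None)),          # height -> 1, width preserved
        )
        # Bidirectional LSTM learns the sequence-to-sequence mapping over the width axis.
        self.rnn = nn.LSTM(256, rnn_hidden, num_layers=2,
                           bidirectional=True, batch_first=True)
        # Per-time-step class scores: characters plus one extra index for the CTC blank.
        self.fc = nn.Linear(2 * rnn_hidden, num_classes + 1)

    def forward(self, x):                             # x: (batch, 1, height, width)
        f = self.cnn(x)                               # (batch, 256, 1, width')
        f = f.squeeze(2).permute(0, 2, 1)             # (batch, width', 256)
        seq, _ = self.rnn(f)                          # (batch, width', 2 * hidden)
        return self.fc(seq).log_softmax(2)            # per-column log-probabilities for CTC

# CTC aligns the per-column predictions with the ground-truth transcription
# without requiring explicit character segmentation. Dummy data for illustration:
model = CursiveTextRecognizer()
images = torch.randn(4, 1, 48, 320)                  # a batch of preprocessed text-line images
log_probs = model(images)                            # (4, T, 151)
targets = torch.randint(0, 150, (4, 20))             # dummy label sequences (class indices 0..149)
input_lengths = torch.full((4,), log_probs.size(1), dtype=torch.long)
target_lengths = torch.full((4,), 20, dtype=torch.long)
loss = nn.CTCLoss(blank=150)(log_probs.permute(1, 0, 2), targets,
                             input_lengths, target_lengths)
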

Inspec keywords: video signal processing; feature extraction; information retrieval; optical character recognition; text analysis; content-based retrieval; learning (artificial intelligence); video retrieval; recurrent neural nets; image segmentation

Other keywords: video optical character recognition systems; deep learning; end-to-end framework; character recognition rate; text regions; mature V-OCRs; noncursive scripts; cursive caption text; bi-directional recurrent neural networks; complex ligatures; cursive video text; cursive scripts; News channel videos; textual content-based retrieval system; convolutional networks; sequence-to-sequence mapping; context-dependent shape variations; video text recognition; convolutional neural network; overlapping ligatures; Urdu text; video frames; feature sequence extraction; text lines extraction; background segmentation

Subjects: Information retrieval techniques; Computer vision and image processing techniques; Video signal processing; Document processing and analysis techniques; Image recognition; Neural computing techniques
