Multi-dimensional long short-term memory networks for artificial Arabic text recognition in news video


IET Computer Vision

This study presents a novel approach for Arabic video text recognition based on recurrent neural networks. Embedded text in videos is a rich source of information for indexing and automatically annotating multimedia documents. However, video text recognition is a non-trivial task owing to many challenges, such as the variability of text patterns and the complexity of backgrounds. In the case of Arabic, the presence of diacritic marks, the cursive nature of the script and the non-uniform intra-/inter-word distances introduce further challenges. The proposed system is a segmentation-free method that relies on a multi-dimensional long short-term memory network coupled with a connectionist temporal classification (CTC) layer. It is shown that an efficient pre-processing step and a compact representation of Arabic character models bring robust performance and yield a lower error rate than other recently published methods. The authors' system is trained and evaluated on the public AcTiV-R dataset under different evaluation protocols, and it also outperforms current state-of-the-art approaches on the public ALIF dataset in terms of recognition rates at both character and line levels.

http://iet.metastore.ingenta.com/content/journals/10.1049/iet-cvi.2017.0468