This study presents a new method for handwritten keyword spotting. Its first contribution is a neural network model with a margin-based output: the network decides whether an input test word is spotted or rejected. The network is a multi-layer perceptron with one input layer, two hidden layers, and one output layer. The second contribution is to optimise the network weights with particle swarm optimisation, which trains the network so that its output has an adequate margin for classification. The new components of the proposed classifier are a particle coding and a fitness function. Each particle is coded in two layers: one activates or deactivates neural network nodes, and the other holds the weight values. Experiments with a variety of parameter settings were designed for the multi-layer perceptron. On three datasets (the AMA Arabic dataset, the IAM English dataset, and the IFN/Farsi dataset), the best configurations yielded 83%, 77%, and 69%, respectively. These results indicate that the proposed method outperforms previous methods.
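
The two-layer particle coding and margin-driven training described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The layer sizes, the rule that a mask gene above zero switches a hidden node on, the hinge-style fitness, the swarm coefficients, and the toy data are all assumptions made for illustration.

```python
import math
import random

# Hypothetical layer sizes: input, two hidden layers, one output node.
SIZES = [2, 4, 3, 1]
N_HIDDEN = SIZES[1] + SIZES[2]
# Each node has one weight per incoming activation plus one bias.
N_WEIGHTS = sum((a + 1) * b for a, b in zip(SIZES, SIZES[1:]))
MARGIN = 0.5  # assumed margin required on the tanh output


def decode(particle):
    """Split a flat particle into its mask layer and its weight layer."""
    # Assumption: a mask gene above zero activates the hidden node.
    mask = [1.0 if g > 0.0 else 0.0 for g in particle[:N_HIDDEN]]
    return mask, particle[N_HIDDEN:]


def forward(particle, x):
    """Run the masked perceptron on one sample; output is in [-1, 1]."""
    mask, weights = decode(particle)
    pos, node = 0, 0
    act = list(x)
    for layer, (n_in, n_out) in enumerate(zip(SIZES, SIZES[1:])):
        nxt = []
        for _ in range(n_out):
            row = weights[pos:pos + n_in + 1]  # last entry is the bias
            pos += n_in + 1
            out = math.tanh(row[-1] + sum(w * a for w, a in zip(row, act)))
            if layer < 2:  # hidden node: apply its on/off gene
                out *= mask[node]
                node += 1
            nxt.append(out)
        act = nxt
    return act[0]


def fitness(particle, data):
    """Mean hinge-style penalty; zero once every sample clears the margin."""
    return sum(max(0.0, MARGIN - y * forward(particle, x))
               for x, y in data) / len(data)


def pso(data, n_particles=20, iters=100, w=0.7, c1=1.5, c2=1.5, seed=0):
    """Standard global-best particle swarm minimising the fitness."""
    rng = random.Random(seed)
    dim = N_HIDDEN + N_WEIGHTS
    pos = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best = [p[:] for p in pos]
    best_fit = [fitness(p, data) for p in pos]
    g = min(range(n_particles), key=best_fit.__getitem__)
    g_pos, g_fit = best[g][:], best_fit[g]
    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                vel[i][d] = (w * vel[i][d]
                             + c1 * rng.random() * (best[i][d] - pos[i][d])
                             + c2 * rng.random() * (g_pos[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            f = fitness(pos[i], data)
            if f < best_fit[i]:
                best[i], best_fit[i] = pos[i][:], f
                if f < g_fit:
                    g_pos, g_fit = pos[i][:], f
    return g_pos, g_fit


# Toy samples: label +1 means "spotted", -1 means "rejected".
toy = [((0.9, 0.8), 1), ((0.7, 1.0), 1), ((-0.8, -0.9), -1), ((-1.0, -0.6), -1)]
best, best_fit = pso(toy)
```

Here `best` holds both layers of the winning particle, so `decode(best)` returns the hidden-node mask and the weight values. Thresholding the output of `forward(best, x)` at zero would then give a spot or reject decision for a new sample `x`.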

A method for handwritten word spotting based on particle swarm optimisation and multi-layer perceptron
