Semi-supervised framework for writer identification using structural learning

Writer identification is a complex task because an individual's handwriting encapsulates a lot of information pertaining to both the text and the personality of the writer. To learn a model that distinguishes one writer from another, it is important to capture every nuance of an individual's handwriting. Learning such a model poses two challenges. First, the discriminatory variables may be numerous and potentially related, leading to a complex discriminatory function. Second, a large amount of training data is required to learn a complex and possibly high-dimensional function. In this study, the authors propose a semi-supervised framework for writer identification in offline handwritten documents that leverages the information hidden in unlabelled samples. The proposed framework handles the complexity of approximating the optimal hypothesis by breaking the main task into several subtasks and learning a separate hypothesis for each subtask. The hypotheses for all subtasks are then used for best model selection by retrieving a common substructure that has high correspondence with every candidate hypothesis. The obtained substructure acts as a knowledge base holding contextual information that is otherwise difficult to retrieve. This extra information can be used to improve the performance of the identification model.
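The idea of learning one hypothesis per subtask and then retrieving a substructure shared by all of them can be illustrated with a small sketch. The abstract does not specify the subtasks, the features, or the optimisation used, so everything below is an assumption made for illustration: the subtasks are hypothetical masked-feature prediction problems on synthetic unlabelled data, each hypothesis is a ridge-regression weight vector, and the common substructure is taken as the top left singular vectors of the stacked weight matrix. This is a simplified SVD-based stand-in, not the authors' implementation.

```python
import numpy as np


def fit_ridge(X, y, lam=1.0):
    """Closed-form ridge regression: returns one weight vector of shape (d,)."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)


def shared_structure(X_unlabelled, aux_targets, h, lam=1.0):
    """Learn one linear hypothesis per subtask and extract a shared subspace.

    aux_targets has shape (n, m): one column of targets per (hypothetical) subtask.
    Returns theta of shape (h, d) whose rows are orthonormal.
    """
    m = aux_targets.shape[1]
    # Stack the per-subtask weight vectors as columns of W (d x m).
    W = np.column_stack(
        [fit_ridge(X_unlabelled, aux_targets[:, j], lam) for j in range(m)]
    )
    # The leading left singular vectors span the directions that the
    # subtask hypotheses have most in common.
    U, _, _ = np.linalg.svd(W, full_matrices=False)
    return U[:, :h].T


def augment(X, theta):
    """Append the shared-structure projection to the original features."""
    return np.hstack([X, X @ theta.T])


# Synthetic stand-in for features of unlabelled handwriting samples (assumption).
rng = np.random.default_rng(0)
X_unlabelled = rng.normal(size=(200, 10))

# Hypothetical subtasks: predict the sign of each of the first four features
# from the remaining ones, so the targets come for free from unlabelled data.
aux_cols = [0, 1, 2, 3]
aux_targets = np.sign(X_unlabelled[:, aux_cols])
X_masked = X_unlabelled.copy()
X_masked[:, aux_cols] = 0.0  # hide the features being predicted

theta = shared_structure(X_masked, aux_targets, h=2)
X_aug = augment(X_unlabelled, theta)
```

In a setting like the one described, a writer classifier would then be trained on the augmented features of the labelled samples, so that the structure learned from unlabelled data supplies the additional contextual information.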

Inspec keywords: learning (artificial intelligence); handwriting recognition; document handling; information retrieval

Other keywords: contextual information; substructure retrieval; optimal hypothesis; discriminatory variables; writer identification model; structural learning; best model selection; complex discriminatory function; IAM data set; handwritten text; offline handwritten documents; high-dimensional function; semisupervised framework

Subjects: Knowledge engineering techniques; Information retrieval techniques; Document processing and analysis techniques

http://iet.metastore.ingenta.com/content/journals/10.1049/iet-bmt.2013.0018