PixTextGAN: structure aware text image synthesis for license plate recognition

Rapid progress in text image recognition has been achieved with the development of deep-learning techniques. However, comprehensive license plate recognition in real scenes remains a great challenge, since no large, diverse, publicly available datasets exist for training deep learning models. This paper aims at synthesising license plate images with generative adversarial networks (GANs), avoiding the need to collect a vast amount of labelled data. The authors thus propose a novel PixTextGAN with a controllable architecture that generates specific character structures for different text regions, producing synthetic license plate images with plausible text details. Specifically, a comprehensive structure-aware loss function is presented to preserve the key characteristics of each character region and thereby achieve appearance adaptation for better recognition. Qualitative and quantitative experiments demonstrate the superiority of the authors' proposed method in text image synthesis over state-of-the-art GANs. Further license plate recognition experiments on the ReId and CCPD datasets demonstrate that using the images synthesised by PixTextGAN can greatly improve recognition accuracy.

Inspec keywords: feature extraction; image segmentation; object recognition; learning (artificial intelligence); traffic engineering computing; text analysis

Other keywords: comprehensive license plate recognition; text details; specific character structures; deep learning models; synthesised images; PixTextGAN; text image synthetisation; structure aware text image synthesis; text image recognition; text regions; synthetic license plate images; labelled data; generative adversarial networks; recognition accuracy; comprehensive structure-aware loss function

Subjects: Knowledge engineering techniques; Image recognition; Document processing and analysis techniques; Computer vision and image processing techniques; Traffic engineering computing
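The structure-aware loss described in the abstract combines an adversarial realism term with per-character-region constraints. The sketch below illustrates the general shape of such an objective; the non-saturating adversarial term, the per-region L1 structure term, the weight `lam`, and all function names are illustrative assumptions, not the authors' exact formulation.

```python
import math

def adversarial_loss(d_fake):
    # Non-saturating generator loss: -log D(G(z)), averaged over a batch
    # of discriminator scores for generated plates (assumed form).
    return -sum(math.log(p) for p in d_fake) / len(d_fake)

def region_structure_loss(fake_regions, real_regions):
    # Hypothetical per-character-region L1 term: penalise pixel-wise
    # deviation inside each text region so character strokes are preserved.
    total = 0.0
    for fake, real in zip(fake_regions, real_regions):
        total += sum(abs(f - r) for f, r in zip(fake, real)) / len(fake)
    return total / len(fake_regions)

def structure_aware_loss(d_fake, fake_regions, real_regions, lam=10.0):
    # Weighted combination: lam trades off global realism against
    # per-region character structure (weight chosen for illustration).
    return adversarial_loss(d_fake) + lam * region_structure_loss(
        fake_regions, real_regions)
```

In this sketch, identical fake and real character regions drive the structure term to zero, leaving only the adversarial term; corrupting the character strokes raises the combined loss in proportion to `lam`.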

http://iet.metastore.ingenta.com/content/journals/10.1049/iet-ipr.2018.6588