F2PNet: font-to-painting translation by adversarial learning

When every stroke of a Chinese font image is replaced by pattern elements such as flowers and birds, the character becomes a flower–bird character painting, a traditional Chinese art treasure. Producing such paintings demands great effort from professional painters, which raises the question of how to generate them automatically from font images. The gap between the font domain and the painting domain is large, and although many image-to-image translation frameworks have been proposed, they cannot handle this task effectively. In this study, a novel method called the font-to-painting network (F2PNet) is proposed for font-to-painting translation. Specifically, an encoder equipped with dilated convolutions extracts features of the font image, which are then fed into a domain translation module that maps the font feature space to the painting feature space. The translated features are further adjusted by a refinement module and utilised by the decoder to produce the target painting. The authors apply an adversarial loss and a cycle-consistency loss to F2PNet, and further propose a loss term, called the recognisability loss, which makes the generated painting recognisable as its source character. Experiments demonstrate that F2PNet is effective and can serve as a general unsupervised image-to-image translation framework for further image translation tasks.
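
To make the described pipeline concrete, the following PyTorch sketch wires a generator in the order the abstract gives: a dilated-convolution encoder, a domain translation module, a refinement module, and a decoder, with the combined training objective noted in comments. All layer counts, channel widths (ch), module names (ResBlock, F2PGenerator), and the loss weights lam_cyc and lam_rec are illustrative assumptions, not the authors' exact configuration.

    import torch
    import torch.nn as nn

    class ResBlock(nn.Module):
        # Residual block reused for feature translation and refinement.
        def __init__(self, ch):
            super().__init__()
            self.body = nn.Sequential(
                nn.Conv2d(ch, ch, 3, padding=1), nn.InstanceNorm2d(ch),
                nn.ReLU(inplace=True),
                nn.Conv2d(ch, ch, 3, padding=1), nn.InstanceNorm2d(ch))

        def forward(self, x):
            return x + self.body(x)

    class F2PGenerator(nn.Module):
        # Encoder (dilated convs) -> domain translation -> refinement -> decoder.
        def __init__(self, ch=64):
            super().__init__()
            # Dilated convolutions widen the receptive field without
            # downsampling, so long font strokes stay spatially aligned.
            self.encoder = nn.Sequential(
                nn.Conv2d(1, ch, 3, padding=1), nn.ReLU(inplace=True),
                nn.Conv2d(ch, ch, 3, padding=2, dilation=2), nn.ReLU(inplace=True),
                nn.Conv2d(ch, ch, 3, padding=4, dilation=4), nn.ReLU(inplace=True))
            # Domain translation: map font features into the painting
            # feature space (block count is an assumption).
            self.translate = nn.Sequential(*[ResBlock(ch) for _ in range(4)])
            # Refinement: adjust the translated features before decoding.
            self.refine = ResBlock(ch)
            # Decoder: render a 3-channel painting from the refined features.
            self.decoder = nn.Sequential(
                nn.Conv2d(ch, ch, 3, padding=1), nn.ReLU(inplace=True),
                nn.Conv2d(ch, 3, 3, padding=1), nn.Tanh())

        def forward(self, font_img):  # font_img: (N, 1, H, W) in [-1, 1]
            return self.decoder(self.refine(self.translate(self.encoder(font_img))))

    # Illustrative generator objective combining the three losses named in
    # the abstract; lam_cyc and lam_rec are assumed weighting hyperparameters:
    # loss_G = loss_adv + lam_cyc * loss_cyc + lam_rec * loss_recog

The dilated-convolution encoder is the abstract's stated design choice; the residual translation and refinement blocks follow common practice in unpaired translation networks such as CycleGAN and are only one plausible realisation.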

Inspec keywords: art; feature extraction; unsupervised learning; image coding; painting

Other keywords: domain translation module; font feature space; F2PNet; generated painting; image translation tasks; font domain; flower–bird character paintings; Chinese font images; font-level recognisability; font-to-painting translation; unsupervised image-to-image translation framework; font image; painting feature space; adversarial loss; font-to-painting network; target painting; image-to-image translation frameworks; cycle-consistency loss; painting domain

Subjects: Knowledge engineering techniques; Image and video coding; Computer vision and image processing techniques; Humanities computing; Image recognition
