Representations generated by the Fisher vector (FV) have shown decent performance on many facial image datasets. However, discriminative information can be masked by noise if all local responses are summed directly with respect to the learned dictionary. Furthermore, the high dimension of FV prohibits its practical use. To mitigate these problems, the authors propose a new framework called joint compressed Fisher vector (JCFV), which generates task-specific data representations by jointly encoding multiscale deep convolutional activations. First, they feed the deep network with facial images that are cropped with cascaded sub-windows and resized to various scales. Next, they select discriminative convolutional features to form a dictionary. Then, they aggregate the multiscale features with respect to the dictionary by calculating re-weighted first-order statistics. JCFV halves the dimension of FV, and the dimension can be compressed further with several combinations of subspace methods. The authors demonstrate the effectiveness of the JCFV descriptor with comprehensive experiments on FERET, AR, LFW and FRGC 2.0 Experiment 4.
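
To make the dimension claim concrete, the following is a minimal sketch of a first-order Fisher vector encoding over a diagonal-covariance GMM dictionary. It is not the authors' implementation: the abstract does not specify the re-weighting scheme, so the per-descriptor `weights` argument is a hypothetical stand-in, and the function names, the random "multiscale" descriptors, and the GMM parameters are all assumptions chosen for illustration. The point it shows is that keeping only first-order (mean) statistics gives K * D values, half of the 2 * K * D values of an FV that also keeps second-order terms.

```python
import numpy as np

def gmm_posteriors(X, pi, mu, var):
    """Soft assignments gamma[n, k] of descriptors X (N, D) to K diagonal Gaussians."""
    diff = X[:, None, :] - mu[None, :, :]                      # (N, K, D)
    log_prob = -0.5 * (np.sum(diff ** 2 / var[None], axis=2)
                       + np.sum(np.log(2 * np.pi * var), axis=1)[None, :])
    log_w = np.log(pi)[None, :] + log_prob                     # (N, K)
    log_w -= log_w.max(axis=1, keepdims=True)                  # numerical stability
    gamma = np.exp(log_w)
    return gamma / gamma.sum(axis=1, keepdims=True)

def first_order_fv(X, pi, mu, var, weights=None):
    """First-order Fisher vector of X with optional per-descriptor weights.

    `weights` is a hypothetical re-weighting hook; uniform weights recover
    the usual mean-gradient part of the Fisher vector.
    """
    N, _ = X.shape
    if weights is None:
        weights = np.ones(N)
    gamma = gmm_posteriors(X, pi, mu, var) * weights[:, None]  # (N, K)
    diff = (X[:, None, :] - mu[None]) / np.sqrt(var)[None]     # (N, K, D)
    G = np.einsum('nk,nkd->kd', gamma, diff)                   # (K, D)
    G /= weights.sum() * np.sqrt(pi)[:, None]
    return G.ravel()

# Toy data: random descriptors standing in for features from two image scales.
rng = np.random.default_rng(0)
K, D = 4, 8
pi = np.full(K, 1.0 / K)
mu = rng.normal(size=(K, D))
var = np.ones((K, D))
X = np.vstack([rng.normal(size=(50, D)), rng.normal(size=(30, D))])

fv = first_order_fv(X, pi, mu, var)
print(fv.shape)  # (32,), i.e. K * D, versus 2 * K * D = 64 with second-order terms
```

Descriptors from all scales are stacked and encoded against one shared dictionary here, which is one plausible reading of "jointly encoding"; any subsequent subspace compression would operate on the resulting K * D vector.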

Face recognition with compressed Fisher vector on multiscale convolutional features
