Age interval and gender prediction using PARAFAC2 and SVMs based on visual and aural features

Parallel factor analysis 2 (PARAFAC2) is employed to reduce the dimensionality of visual and aural features and to provide ranking vectors. Score-level fusion is then performed by applying a support vector machine (SVM) classifier to the ranking vectors derived by PARAFAC2, yielding gender and age interval predictions. This procedure is applied to the Trinity College Dublin Speaker Ageing database, supplemented with face images of the speakers, as well as to two single-modality benchmark datasets. Experimental results demonstrate the advantage of combining aural and visual features for both prediction tasks.
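The fusion step described above can be sketched in a few lines. The snippet below is a minimal illustration, not the authors' implementation: the per-modality PARAFAC2 ranking vectors are stood in by synthetic feature matrices (`X_aural`, `X_visual` are hypothetical), each modality is scored by its own SVM, and a second SVM performs score-level fusion on the stacked decision values.

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
n, d = 200, 10
y = rng.integers(0, 2, n)  # binary labels, e.g. gender (0/1)

# Stand-ins for the PARAFAC2 ranking vectors of each modality
# (synthetic data for illustration; the paper derives these from
# aural and visual features of the speakers)
X_aural = rng.normal(size=(n, d)) + y[:, None]
X_visual = rng.normal(size=(n, d)) + 0.5 * y[:, None]

train, test = np.arange(150), np.arange(150, 200)

# One SVM per modality produces decision scores
svm_a = SVC(kernel="linear").fit(X_aural[train], y[train])
svm_v = SVC(kernel="linear").fit(X_visual[train], y[train])
scores = np.column_stack([svm_a.decision_function(X_aural),
                          svm_v.decision_function(X_visual)])

# Score-level fusion: a second SVM classifies the stacked scores
fusion = SVC(kernel="linear").fit(scores[train], y[train])
acc = fusion.score(scores[test], y[test])
print(f"fused accuracy: {acc:.2f}")
```

For age interval prediction the same pattern applies with multi-class labels; `SVC.decision_function` then returns one score per class (or class pair), which are stacked across modalities in the same way.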

Inspec keywords: vectors; parallel processing; speaker recognition; image classification; image fusion; support vector machines; face recognition; feature extraction

Other keywords: SVM classifier; PARAFAC2; gender prediction; speaker face images; ranking vectors; parallel factor analysis 2; support vector machine classifier; age interval prediction; aural features; visual features; single-modality benchmark datasets; SVM; Trinity College Dublin Speaker Ageing database; score level fusion

Subjects: Image recognition; Parallel software; Speech recognition and synthesis; Knowledge engineering techniques; Algebra; Sensor fusion; Speech processing techniques; Computer vision and image processing techniques
