Voice biometrics using linear Gaussian model

IET Biometrics

This study introduces a linear Gaussian model-based framework for voice biometrics. The model works with discrete-time linear dynamical systems. The motivation is to apply linear Gaussian modelling to voice biometrics and to show that its accuracy is comparable with that of other state-of-the-art methods, such as probabilistic linear discriminant analysis (PLDA) and the two-covariance model. An expectation–maximisation algorithm is derived to train the model, and a Bayesian solution is used to calculate the log-likelihood ratio score for every speaker trial. The approach performed well on the core-extended conditions of the NIST 2010 Speaker Recognition Evaluation and is competitive with Gaussian PLDA in terms of the normalised decision cost function.
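
To make the idea of a log-likelihood ratio score concrete, the following is a minimal sketch of trial scoring under a scalar two-covariance model, one of the comparison methods named above. It is not the study's own model or algorithm. It assumes each observation is x = y + e, with a speaker variable y ~ N(mu, b) and independent within-speaker noise e ~ N(0, w). The names `mu`, `b`, `w`, `log_gauss2` and `llr_score` are hypothetical, and the parameters are taken as given rather than trained by expectation–maximisation.

```python
import math


def log_gauss2(x1, x2, mean, var, cov):
    """Log-density of a bivariate Gaussian whose two components share
    the same mean and variance and have covariance `cov`."""
    det = var * var - cov * cov
    d1, d2 = x1 - mean, x2 - mean
    # Quadratic form d^T Sigma^{-1} d via the closed-form 2x2 inverse.
    quad = (var * d1 * d1 - 2.0 * cov * d1 * d2 + var * d2 * d2) / det
    return -0.5 * (2.0 * math.log(2.0 * math.pi) + math.log(det) + quad)


def llr_score(x1, x2, mu, b, w):
    """Log-likelihood ratio for a trial (x1, x2): same speaker vs different.

    Assumed toy model: between-speaker variance b, within-speaker variance w.
    Same speaker: x1 and x2 share y, so their covariance is b.
    Different speakers: x1 and x2 are independent, so their covariance is 0.
    """
    total = b + w
    same = log_gauss2(x1, x2, mu, total, b)
    diff = log_gauss2(x1, x2, mu, total, 0.0)
    return same - diff


if __name__ == "__main__":
    # Two similar observations far from the global mean favour "same speaker".
    print(llr_score(1.0, 1.0, mu=0.0, b=1.0, w=0.1))
    # Two observations on opposite sides of the mean favour "different".
    print(llr_score(1.0, -1.0, mu=0.0, b=1.0, w=0.1))
```

A positive score supports the same-speaker hypothesis and a negative score supports the different-speaker hypothesis. In a realistic system the observations would be vectors and `b` and `w` would be covariance matrices, but the structure of the ratio is the same.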


