© The Institution of Engineering and Technology
This study investigates the effectiveness of state-of-the-art speaker verification techniques (i.e. GMM–UBM and GMM–SVM) in mismatched noise conditions. Experiments using white and real-world noise show that the verification performance of these methods is severely affected when the level of degradation in the test material differs from that in the training utterances. To address this problem, a modified realisation of the parallel model combination (PMC) method is introduced and a new form of test normalisation (T-norm), termed condition adjusted T-norm, is proposed. It is experimentally demonstrated that using these techniques with GMM–UBM can significantly enhance accuracy in mismatched noise conditions: under the most severe mismatch condition considered, the relative improvement achieved for GMM–UBM is in excess of 70%. This improvement in verification accuracy is also higher than that obtainable with the direct use of PMC with GMM–UBM. Moreover, while the accuracy of GMM–SVM can also benefit considerably from these techniques, the extensive computational cost involved severely limits the use of such a combined approach in practice.
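For readers unfamiliar with test normalisation, the following is a minimal sketch of the standard T-norm only, not the condition adjusted variant proposed in the study, whose details are not given in this abstract. It assumes that a raw verification score and a set of scores from cohort (impostor) models for the same test utterance are already available. The function name `t_norm`, its parameters, and the use of the population standard deviation are illustrative assumptions.

```python
from statistics import mean, pstdev


def t_norm(raw_score, cohort_scores):
    """Standard T-norm (illustrative sketch, not the paper's condition adjusted variant).

    raw_score     -- score of the test utterance against the claimed speaker model
    cohort_scores -- scores of the same test utterance against cohort (impostor) models
    """
    if len(cohort_scores) < 2:
        raise ValueError("T-norm needs at least two cohort scores")

    mu = mean(cohort_scores)       # mean of the impostor score distribution
    sigma = pstdev(cohort_scores)  # population standard deviation (an assumption)
    if sigma == 0:
        raise ValueError("cohort scores have zero variance")

    # Shift and scale so that impostor scores for this utterance have
    # zero mean and unit variance.
    return (raw_score - mu) / sigma


if __name__ == "__main__":
    # Hypothetical scores, chosen only to show the call.
    print(t_norm(2.0, [-1.0, 1.0]))
```

Because the cohort statistics are computed from the test utterance itself, the normalisation adapts to the conditions of that utterance, which is why T-norm is a natural starting point when test and training noise levels differ.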
http://iet.metastore.ingenta.com/content/journals/10.1049/iet-spr.2008.0175