This study describes a special-case application of speaker recognition in open-set speaker-identification mode that nonetheless has wide applicability. Watch-list-based speaker spotting in telephone banking can provide valuable protection against ‘known’ fraudsters who have access to stolen customer details. The study describes the detection of known fraudsters in a telephone banking service using commercial off-the-shelf verification engines. A new ‘delta scoring’ method for watch-list detection is proposed, which uses the genuine customer model as a reference. The approach combines, for the first time, speaker recognition in both verification and identification modes. Experimental results show a significant gain in performance using the new method.
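
The abstract does not give the delta-scoring formula, so the following is only a minimal sketch of one plausible reading: the score of the call against each watch-list (fraudster) model is compared with the score of the same call against the claimed customer's model, and the difference is thresholded. The function names, the dictionary of scores, and the threshold value are all hypothetical; in practice the scores would come from a verification engine.

```python
from typing import Dict, Optional, Tuple


def delta_scores(watchlist_scores: Dict[str, float],
                 customer_score: float) -> Dict[str, float]:
    """Assumed definition: watch-list score minus the genuine customer score."""
    return {fid: score - customer_score
            for fid, score in watchlist_scores.items()}


def spot_fraudster(watchlist_scores: Dict[str, float],
                   customer_score: float,
                   threshold: float) -> Optional[Tuple[str, float]]:
    """Return (fraudster_id, delta) for the best watch-list match, or None.

    Open-set decision: the best-matching watch-list entry is reported
    only if its delta score exceeds the (hypothetical) threshold.
    """
    if not watchlist_scores:
        return None
    deltas = delta_scores(watchlist_scores, customer_score)
    best_id = max(deltas, key=deltas.get)
    best_delta = deltas[best_id]
    return (best_id, best_delta) if best_delta > threshold else None


# Hypothetical scores for one call that claims a given customer identity.
alert = spot_fraudster({"fraudster_a": 2.0, "fraudster_b": 0.5},
                       customer_score=-1.0, threshold=1.0)
no_alert = spot_fraudster({"fraudster_a": 0.5},
                          customer_score=1.5, threshold=0.0)
```

Under this reading, a caller who matches the claimed customer's model well yields a low delta and raises no alert, whereas a caller who matches a watch-list model much better than the customer model is flagged.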

Effective speaker spotting for watch-list detection of fraudsters in telephone banking
