Automatic speaker verification on narrowband and wideband lossy coded clean speech

IET Biometrics

Substantial progress has been made in voice-based biometrics in recent years, but a variety of challenges remain for the speech research community. One such obstacle is reliable speaker authentication from speech signals degraded by lossy compression. Compression is commonplace in modern telecommunications, such as mobile telephony, VoIP services, teleconferencing, voice messaging and gaming. In this study, the authors investigate the effect of lossy speech compression on text-independent speaker verification. Voice biometrics performance is evaluated on clean speech signals distorted by state-of-the-art narrowband (NB) and wideband (WB) speech codecs. The tests are performed in both channel-matched and channel-mismatched scenarios. The results show that coded WB speech improves voice authentication performance by 1–3% in terms of equal error rate over coded NB speech, even at the lowest investigated bitrates. It is also shown that the Enhanced Voice Services (EVS) codec does not provide better results than the other codecs involved in this study.
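
The abstract reports its results as equal error rate (EER), the operating point at which the false acceptance rate and the false rejection rate are equal. The following is a minimal sketch of how EER can be estimated from verification scores. It is not the authors' evaluation code: the function name `equal_error_rate` and the example scores are hypothetical and chosen only for illustration.

```python
def equal_error_rate(target_scores, impostor_scores):
    """Estimate the EER and the threshold at which it occurs.

    A trial is accepted when its score is >= threshold.
    FRR: fraction of target (genuine) trials that are rejected.
    FAR: fraction of impostor trials that are accepted.
    Assumption: higher scores mean "more likely the same speaker".
    """
    if not target_scores or not impostor_scores:
        raise ValueError("both score lists must be non-empty")

    # Candidate thresholds: every observed score value.
    thresholds = sorted(set(target_scores) | set(impostor_scores))

    best = None  # (|FAR - FRR|, EER estimate, threshold)
    for t in thresholds:
        frr = sum(s < t for s in target_scores) / len(target_scores)
        far = sum(s >= t for s in impostor_scores) / len(impostor_scores)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            # Where FAR and FRR do not cross exactly, take their mean.
            best = (gap, (far + frr) / 2, t)

    return best[1], best[2]


# Hypothetical scores, for illustration only (not data from the study).
targets = [0.9, 0.8, 0.7, 0.4]
impostors = [0.6, 0.3, 0.2, 0.1]
eer, threshold = equal_error_rate(targets, impostors)
print(f"EER = {eer:.2%} at threshold {threshold}")
```

With only a handful of trials the estimate is coarse. Evaluation toolkits typically interpolate between neighbouring thresholds, so figures from a sketch like this are indicative rather than directly comparable to published results.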

http://iet.metastore.ingenta.com/content/journals/10.1049/iet-bmt.2016.0119