Voice disguise by mimicry: deriving statistical articulometric evidence to evaluate claimed impersonation

IET Biometrics


Voice disguise by impersonation is often used in voice-based crimes by perpetrators who try to evade identification while sounding genuine. Voice evidence from these crimes is analysed both to detect impersonation and to match the impersonated voice to the natural voice of the speaker, thereby establishing who actually produced it. There are situations, however, where a speaker is confronted with voice evidence that perceptually sounds like their natural voice but denies ownership of it, claiming instead that it was produced by an expert impersonator. The claim may seem bizarre, but it is plausible because the human voice has a great degree of natural variation. It poses a difficult forensic problem: instead of detecting impersonation, one must now prove its absence, and instead of matching the evidence with the natural voice of the person, one must show that the two cannot have different originators. The authors address the problem of disproving the denial of voice ownership from an articulatory-phonetic perspective and propose a hypothesis-testing framework that may be used to solve it. They demonstrate their approach on data comprising the voices of prominent political figures in the USA and of their expert impersonators.
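As a rough illustration of what a hypothesis test over articulatory measurements can look like, the sketch below runs a generic two-sample permutation test on the difference of means. It is not the authors' framework: the choice of test, the function name `permutation_p_value`, and the formant-like values in Hz are all assumptions made purely for illustration, and the numbers are invented rather than measured.

```python
import random
from statistics import mean


def permutation_p_value(sample_a, sample_b, n_perm=2000, seed=0):
    """Two-sided permutation test on the difference of sample means.

    Null hypothesis: both samples come from the same distribution.
    Hypothetical helper for illustration, not the method of the paper.
    """
    rng = random.Random(seed)
    observed = abs(mean(sample_a) - mean(sample_b))
    pooled = list(sample_a) + list(sample_b)
    n_a = len(sample_a)
    hits = 0
    for _ in range(n_perm):
        # Randomly reassign the pooled measurements to the two groups.
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            hits += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return (hits + 1) / (n_perm + 1)


# Invented formant-like values (Hz) for one vowel; illustrative only.
natural_voice = [510, 495, 530, 502, 488, 515]
disputed_recording = [505, 512, 498, 521, 490, 509]

p = permutation_p_value(natural_voice, disputed_recording)
print(f"estimated p-value: {p:.3f}")
```

A small p-value would count as evidence against a common originator. A large one only means the test found no detectable difference, which by itself does not prove that the two recordings share a speaker; that asymmetry is part of what makes disproving a denial harder than detecting impersonation.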


