Using multi-scale product spectrum for single and multi-pitch estimation

IET Signal Processing


The authors present an algorithm for pitch estimation, including the voiced/unvoiced decision, for noisy speech and for two speakers talking simultaneously. The approach is based on spectral multi-scale product (SMP) analysis of the sound mixture. The SMP is the spectrum of the product of the speech signal's wavelet transform coefficients at three successive scales. The wavelet used for SMP analysis is the quadratic spline function. The proposed method is compared with other state-of-the-art algorithms. It is robust in the presence of noise and estimates the pitch of both the dominant speech and the concurrent speech from the sound mixture with high accuracy.
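
The SMP idea summarised in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`spline_wavelet`, `multiscale_product`, `smp_pitch`), the dyadic scales 2, 4 and 8, the 50 to 500 Hz search band, the Hann window and the "lowest strong peak" selection rule are all assumptions chosen for illustration, and the wavelet is only a discrete approximation of a quadratic spline. The voiced/unvoiced decision and the multi-pitch stage are not covered.

```python
import numpy as np

def spline_wavelet(scale):
    """Approximate quadratic spline wavelet (assumed construction):
    the first difference of a sampled cubic B-spline smoother."""
    box = np.ones(scale) / scale
    smooth = box
    for _ in range(3):                       # four boxes convolved -> cubic B-spline
        smooth = np.convolve(smooth, box)
    return np.convolve(smooth, [1.0, -1.0])  # discrete derivative of the smoother

def multiscale_product(frame, scales=(2, 4, 8)):
    """Pointwise product of wavelet coefficients at three successive scales."""
    product = np.ones(len(frame))
    for s in scales:
        product *= np.convolve(frame, spline_wavelet(s), mode="same")
    return product

def smp_pitch(frame, fs, f_min=50.0, f_max=500.0, rel_threshold=0.5):
    """Return the lowest strong spectral peak of the multi-scale product
    inside [f_min, f_max] in Hz, or None if no peak qualifies."""
    p = multiscale_product(np.asarray(frame, dtype=float))
    p = (p - p.mean()) * np.hanning(len(p))   # remove DC, taper frame edges
    spectrum = np.abs(np.fft.rfft(p))         # this is the SMP of the frame
    freqs = np.fft.rfftfreq(len(p), d=1.0 / fs)
    band = np.flatnonzero((freqs >= f_min) & (freqs <= f_max))
    floor = rel_threshold * spectrum[band].max()
    for k in band:                            # scan upwards in frequency
        is_peak = spectrum[k] > spectrum[k - 1] and spectrum[k] >= spectrum[k + 1]
        if is_peak and spectrum[k] >= floor:
            return freqs[k]
    return None
```

Calling `smp_pitch(frame, fs)` on a frame of a few pitch periods returns a frequency on the FFT grid, so the resolution is limited to `fs / len(frame)`. Picking the lowest peak above a relative threshold, rather than the global maximum, is a design choice made here because the harmonics of the product signal can have nearly equal magnitude.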

Inspec keywords: spectral analysis; splines (mathematics); wavelet transforms; speech processing; estimation theory

Other keywords: noisy speech; SMP analysis; wavelet transform coefficient; single pitch estimation; voiced-unvoiced decision; multipitch estimation; quadratic spline function; spectral multiscale product analysis

Subjects: Other topics in statistics; Interpolation and function approximation (numerical analysis); Integral transforms in numerical analysis; Speech processing techniques; Speech and audio signal processing

http://iet.metastore.ingenta.com/content/journals/10.1049/iet-spr.2010.0030