Using multi-scale product spectrum for single and multi-pitch estimation

Author(s): M.A.B. Messaoud ; A. Bouzid ; N. Ellouze
DOI: 10.1049/iet-spr.2010.0030

Affiliations (all authors): Electrical Engineering Department, National School of Engineers of Tunis, University of Tunis El Manar, Tunis, Tunisia
Source: IET Signal Processing, Volume 5, Issue 3, June 2011, p. 344–355
Print ISSN 1751-9675, Online ISSN 1751-9683

The authors present an algorithm for pitch estimation, including the voiced/unvoiced decision, for noisy speech and for the case where two speakers talk simultaneously. The approach is based on spectral multi-scale product (SMP) analysis of the sound mixture. The SMP is the spectrum of the product of the wavelet transform coefficients of the speech at three successive scales. The wavelet used for SMP analysis is the quadratic spline function. The proposed method is compared with other state-of-the-art algorithms. It is robust in the presence of noise and estimates the pitch of both the dominant and the concurrent speech from the sound mixture with high accuracy.
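The SMP definition in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and several choices are assumptions made for illustration: the filter taps `H` and `G` are the ones commonly quoted for the quadratic spline wavelet, the three scales are the first three dyadic levels of an undecimated ("à trous") filter bank, and the pitch is taken as the largest spectral peak inside a search band. All function names (`multiscale_product`, `smp_pitch`) are hypothetical.

```python
import numpy as np

# Filter taps commonly quoted for the quadratic spline wavelet.
# ASSUMPTION: these values are not given in the abstract.
H = np.array([0.125, 0.375, 0.375, 0.125])  # smoothing (low-pass) filter
G = np.array([-2.0, 2.0])                   # wavelet (high-pass) filter


def _upsample(taps, level):
    """Insert 2**level - 1 zeros between taps (undecimated filter bank)."""
    step = 2 ** level
    out = np.zeros((len(taps) - 1) * step + 1)
    out[::step] = taps
    return out


def _filter_same(x, taps):
    """Convolve, then crop to len(x) so the filter stays roughly centred."""
    start = (len(taps) - 1) // 2
    return np.convolve(x, taps)[start:start + len(x)]


def multiscale_product(x, n_scales=3):
    """Product of the wavelet coefficients at n_scales successive dyadic scales."""
    approx = np.asarray(x, dtype=float)
    product = np.ones_like(approx)
    for level in range(n_scales):
        product *= _filter_same(approx, _upsample(G, level))  # detail at this scale
        approx = _filter_same(approx, _upsample(H, level))    # input to next scale
    return product


def smp_pitch(frame, fs, fmin=60.0, fmax=400.0):
    """Return (f0_hz, freqs, spectrum) for one frame.

    ASSUMPTION: f0 is picked as the largest SMP peak in [fmin, fmax];
    the abstract does not describe the peak-picking rule.
    """
    p = multiscale_product(frame)
    spectrum = np.abs(np.fft.rfft(p * np.hanning(len(p))))
    freqs = np.fft.rfftfreq(len(p), d=1.0 / fs)
    band = (freqs >= fmin) & (freqs <= fmax)
    f0 = freqs[band][np.argmax(spectrum[band])]
    return f0, freqs, spectrum


# Usage: a synthetic sawtooth with a 64-sample period at fs = 8000 Hz (125 Hz).
fs = 8000
n = np.arange(1024)
frame = (n % 64) / 64.0 - 0.5
f0, _, _ = smp_pitch(frame, fs, fmin=80.0, fmax=200.0)
print(f"estimated pitch: {f0:.1f} Hz")
```

The multi-scale product of this test signal is close to an impulse train, so its spectrum has harmonics of nearly equal height. A plain maximum over a wide band can therefore land on a harmonic rather than the fundamental, which is why the usage example narrows the search band. A practical estimator would need a more careful peak-selection rule.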

