© The Institution of Engineering and Technology
This paper presents an algorithm for estimating the instantaneous fundamental frequency of a noisy, non-stationary periodic signal whose components are harmonically related. To this end, the authors propose a harmonic state-space model for the input signal and use it to derive an extended Kalman filter (EKF), an unscented Kalman filter (UKF) and a particle filter (PF). In this model, the input signal is characterised by a time-varying fundamental frequency and amplitude, which is a practical assumption for real-world periodic signals. In contrast to most existing methods, such as the short-time Fourier transform, the proposed algorithm does not use any windowing technique. The trade-off between time and frequency resolution is therefore less of a concern, and the algorithm can be used for real-time frequency tracking. It also reveals fine, continuous variations in signal pitch such as vibrato and glissando. Simulation results show that the proposed algorithm performs well even when most of the signal energy is contained in the higher-order harmonics. The performance of the proposed algorithm using the EKF, UKF and PF is also evaluated, and the results are compared under diverse conditions.
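To make the idea of a harmonic state-space model concrete, the following Python sketch tracks the fundamental frequency sample by sample with an EKF. It is only an illustration under stated assumptions, not the authors' formulation: the state layout (phase, angular frequency and one amplitude per harmonic), the sine-only observation model, the random-walk dynamics, the noise variances and the function name `ekf_track_f0` are all hypothetical choices made for this example.

```python
import math
import random


def ekf_track_f0(y, fs, f0_init, amps_init, r=0.01, q_omega=1e-9, q_amp=1e-6):
    """Track f0 (Hz) sample by sample with an EKF (illustrative sketch).

    Assumed state: x = [theta, omega, a_1, ..., a_K], with
      theta[n+1] = theta[n] + omega[n]      (phase, rad)
      omega[n+1] = omega[n] + noise         (rad/sample, random walk)
      a_k[n+1]   = a_k[n] + noise           (harmonic amplitudes)
    Assumed observation: y[n] = sum_k a_k * sin(k * theta[n]) + noise.
    """
    K = len(amps_init)
    n = K + 2
    x = [0.0, 2.0 * math.pi * f0_init / fs] + list(amps_init)
    # Initial covariance and process noise: illustrative values only.
    P = [[0.0] * n for _ in range(n)]
    P[0][0] = 0.1
    P[1][1] = 1e-4
    for k in range(K):
        P[k + 2][k + 2] = 0.1
    q = [0.0, q_omega] + [q_amp] * K

    f0_est = []
    for yn in y:
        theta = x[0]
        # Predicted observation and its Jacobian h = dy/dx.
        y_hat = sum(x[k + 2] * math.sin((k + 1) * theta) for k in range(K))
        h = [0.0] * n
        h[0] = sum((k + 1) * x[k + 2] * math.cos((k + 1) * theta)
                   for k in range(K))
        for k in range(K):
            h[k + 2] = math.sin((k + 1) * theta)

        # Measurement update; the observation is scalar, so no matrix inverse.
        Ph = [sum(P[i][j] * h[j] for j in range(n)) for i in range(n)]
        s = sum(h[i] * Ph[i] for i in range(n)) + r
        gain = [v / s for v in Ph]
        innov = yn - y_hat
        x = [x[i] + gain[i] * innov for i in range(n)]
        for i in range(n):
            for j in range(n):
                P[i][j] -= gain[i] * Ph[j]

        f0_est.append(x[1] * fs / (2.0 * math.pi))

        # Time update: F is the identity except F[0][1] = 1.
        x[0] = (x[0] + x[1]) % (2.0 * math.pi)
        for j in range(n):      # P <- F P    (row 0 += row 1)
            P[0][j] += P[1][j]
        for i in range(n):      # P <- P F^T  (column 0 += column 1)
            P[i][0] += P[i][1]
        for i in range(n):      # add diagonal process noise
            P[i][i] += q[i]
    return f0_est
```

Because the filter updates on every sample, no analysis window is involved; how quickly the estimate responds is governed by the assumed noise variances `r`, `q_omega` and `q_amp` rather than by a window length. A UKF or PF built on the same assumed model would replace only the linearised update step.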
http://iet.metastore.ingenta.com/content/journals/10.1049/iet-spr.2014.0120