© The Institution of Engineering and Technology
The Kalman filter is commonly used to enhance speech quality in noisy environments, where the speech signal is usually modelled as an autoregressive (AR) process and represented in the state-space domain. Identifying the changing AR coefficients at every time step, however, requires extensive computation. In this paper, the authors develop a bidirectional Kalman filter and apply it in a speech processing system. The proposed filter uses a system dynamics model that utilises both past and future measurements to form an estimate of the system's current state. It provides an efficient recursive means of estimating the state of a process in a way that minimises the mean squared error. Compared with the conventional Kalman filter, the proposed filter reduces the computation time in two ways: (i) by avoiding the computation of AR parameters at each time step, and (ii) by reducing the matrices involved in the difference equations and the measurement equations to constant (1 × 1) matrices. Speech recognition results show that the recognition system becomes more robust when the proposed filter is applied, and the filter's low computational cost makes it suitable for practical hidden Markov model-based speech recognition systems.
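The idea of combining past and future measurements with scalar (1 × 1) quantities can be illustrated with a minimal sketch. The abstract does not give the authors' equations, so everything below is an assumption made for illustration: a random-walk state model, the noise variances `q` and `r`, a diffuse initial variance `p0`, and an inverse-variance combination of a forward pass and a time-reversed pass (a textbook two-filter smoother). It is not the filter proposed in the paper. The helper `noisy_sine` is a hypothetical test signal, not speech data.

```python
import math
import random


def scalar_kalman(measurements, q, r, x0=0.0, p0=1e6):
    """Scalar Kalman filter under an ASSUMED random-walk model.

    Returns two lists of (mean, variance) pairs: the predicted estimates
    (before seeing sample k) and the filtered estimates (after seeing it).
    """
    x, p = x0, p0
    predicted, filtered = [], []
    for z in measurements:
        # Predict: x_k = x_{k-1} + w_k with Var(w_k) = q (assumed model).
        p = p + q
        predicted.append((x, p))
        # Update with measurement z_k = x_k + v_k, Var(v_k) = r.
        gain = p / (p + r)
        x = x + gain * (z - x)
        p = (1.0 - gain) * p
        filtered.append((x, p))
    return predicted, filtered


def bidirectional_estimate(measurements, q, r):
    """Combine a forward pass with a time-reversed pass (illustrative only)."""
    _, forward = scalar_kalman(measurements, q, r)
    backward_pred, _ = scalar_kalman(measurements[::-1], q, r)
    backward_pred = backward_pred[::-1]
    estimates = []
    for (xf, pf), (xb, pb) in zip(forward, backward_pred):
        # The backward *prediction* excludes sample k, so the current
        # measurement is not counted twice. Weights are inverse variances.
        w = pb / (pf + pb)
        estimates.append(w * xf + (1.0 - w) * xb)
    return estimates


def noisy_sine(n, sigma, seed):
    """Hypothetical test signal: a slow sine plus white Gaussian noise."""
    rng = random.Random(seed)
    clean = [math.sin(2.0 * math.pi * k / 100.0) for k in range(n)]
    noisy = [c + rng.gauss(0.0, sigma) for c in clean]
    return clean, noisy


def mse(a, b):
    """Mean squared error between two equal-length sequences."""
    return sum((u - v) ** 2 for u, v in zip(a, b)) / len(a)
```

Calling `bidirectional_estimate(noisy, q=0.01, r=0.25)` on the output of `noisy_sine(400, 0.5, seed=0)` returns one estimate per sample. Every quantity in the recursion is a scalar, which is the kind of cost reduction the abstract attributes to 1 × 1 matrices, although the paper's own model may differ.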
http://iet.metastore.ingenta.com/content/journals/10.1049/iet-spr.2014.0109