Robust speech processing using local adaptive non-linear filtering

A locally adaptive non-linear algorithm for robust speech processing is proposed. The algorithm is based on calculating the rank-order statistics of an input speech signal over a sliding window. It is locally adaptive because it can vary the size and contents of the sliding window, as well as the estimation function used to recover a clean speech signal from a noisy one. The algorithm improves the quality of a speech signal while preserving its intelligibility and introducing only imperceptible musical noise. The performance of the adaptive algorithm in suppressing additive, impulsive and mixed noise in an input test speech signal is compared with that of existing speech enhancement algorithms in terms of several objective metrics.
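
The abstract does not give the algorithm itself, so the following is only a minimal sketch of the general idea of a rank-order filter whose window contents and estimation function change from sample to sample. It is not the authors' method. The function name `adaptive_rank_filter`, the parameters `half_width`, `tolerance` and `min_neighbors`, and the rule for switching between a local mean and the window median are all assumptions made for illustration.

```python
from statistics import median


def adaptive_rank_filter(signal, half_width=2, tolerance=0.5, min_neighbors=2):
    """Illustrative locally adaptive rank-order filter (hypothetical sketch)."""
    n = len(signal)
    output = []
    for k in range(n):
        # Sliding window centred on sample k, truncated at the signal edges.
        lo = max(0, k - half_width)
        hi = min(n, k + half_width + 1)
        window = signal[lo:hi]
        center = signal[k]
        # Adapt the window contents: keep only samples whose value lies
        # within `tolerance` of the central sample (assumed rule).
        neighbors = [v for v in window if abs(v - center) <= tolerance]
        if len(neighbors) >= min_neighbors:
            # The central sample agrees with its surroundings:
            # estimate by the mean of the selected neighbours.
            output.append(sum(neighbors) / len(neighbors))
        else:
            # The central sample stands alone, so treat it as an impulse
            # and switch the estimator to the median of the full window.
            output.append(median(window))
    return output


# Toy input: a low-level signal with a single impulse at index 3.
noisy = [0.0, 0.1, 0.0, 5.0, 0.1, 0.0, 0.1]
cleaned = adaptive_rank_filter(noisy)
```

In this toy input the impulse has no neighbours of similar value inside its window, so the sketch replaces it with the window median, while the remaining samples are smoothed by a local mean.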

Inspec keywords: adaptive filters; speech intelligibility; speech enhancement; signal denoising; estimation theory; statistical analysis; impulse noise; nonlinear filters

Other keywords: noisy signal; sliding window signal; robust speech processing; speech signal intelligibility; speech enhancement algorithms; impulsive noise suppression; imperceptible musical noise; input test speech signal; local adaptive nonlinear filtering algorithm; rank-order statistics; estimation function; additive noise suppression

Subjects: Filtering methods in signal processing; Other topics in statistics; Speech processing techniques; Speech and audio signal processing

http://iet.metastore.ingenta.com/content/journals/10.1049/iet-spr.2011.0206