Bandwidth extension of narrowband speech using integer wavelet transform

IET Signal Processing


Public telephone systems transmit speech over a limited frequency range of about 300–3400 Hz, called narrowband (NB), which significantly reduces the quality and intelligibility of speech. This study proposes a novel, fully backward-compatible method for bandwidth extension (BWE) of NB speech. The method uses the integer wavelet transform to provide a perceptually better wideband (WB) speech signal. Spectral envelope parameters are extracted from a downsampled, frequency-shifted version of the high-frequency components of the speech signal that lie above the NB range. These parameters are spread using pseudo-noise codes and embedded in the integer wavelet coefficients of the NB speech signal. The hearing threshold is calculated in the integer wavelet domain and used as the embedding threshold. At the receiving end, the embedded information is extracted to reconstruct the WB speech signal. Theoretical and simulation analyses show that the proposed method is robust to quantisation and channel noise. Comparison category rating listening tests and log spectral distortion tests show that the reconstructed WB signal gives much better speech quality than some existing speech BWE methods that employ data hiding.
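
The embedding idea summarised in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a one-level integer Haar (S-transform) lifting step as the integer wavelet transform, a seeded pseudo-random ±1 sequence as a stand-in for the pseudo-noise code, and a fixed integer `strength` in place of the hearing threshold that the paper computes in the wavelet domain. All function names and parameters are hypothetical, and the payload is a plain bit list rather than coded spectral envelope parameters.

```python
import numpy as np

def int_haar_forward(x):
    """One-level integer Haar (S-transform) via lifting; len(x) must be even."""
    x = np.asarray(x, dtype=np.int64)
    even, odd = x[0::2], x[1::2]
    detail = odd - even
    approx = even + (detail >> 1)  # arithmetic shift = floor(detail / 2)
    return approx, detail

def int_haar_inverse(approx, detail):
    """Exact inverse of int_haar_forward (integer-to-integer, lossless)."""
    even = approx - (detail >> 1)
    odd = detail + even
    x = np.empty(2 * len(approx), dtype=np.int64)
    x[0::2], x[1::2] = even, odd
    return x

def pn_sequence(length, seed=1):
    """Seeded pseudo-random +/-1 chips (assumed stand-in for a PN code)."""
    rng = np.random.default_rng(seed)
    return 2 * rng.integers(0, 2, size=length) - 1

def embed_bits(nb, bits, chips, strength):
    """Spread each bit over len(chips) detail coefficients and add it in."""
    approx, detail = int_haar_forward(nb)
    symbols = 2 * np.asarray(bits, dtype=np.int64) - 1  # 0/1 -> -1/+1
    spread = strength * np.kron(symbols, chips)         # one chip block per bit
    if len(spread) > len(detail):
        raise ValueError("not enough wavelet coefficients for the payload")
    marked = detail.copy()
    marked[:len(spread)] += spread
    return int_haar_inverse(approx, marked)

def extract_bits(nb_marked, n_bits, chips):
    """Despread: correlate each coefficient block with the chips, take the sign."""
    _, detail = int_haar_forward(nb_marked)
    blocks = detail[:n_bits * len(chips)].reshape(n_bits, len(chips))
    return (blocks @ chips > 0).astype(int)
```

Because the lifting steps map integers to integers and are exactly invertible, the receiver recovers the modified detail coefficients without rounding error. In this sketch, correct despreading is only guaranteed when `strength` exceeds the largest host detail coefficient in magnitude; longer chip sequences typically relax that requirement, at the cost of payload capacity.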

http://iet.metastore.ingenta.com/content/journals/10.1049/iet-spr.2016.0453