
Finite state CELP for variable rate speech coding

IEE Proceedings I (Communications, Speech and Vision)

The performance of a variable rate code excited linear predictor system is investigated. The coding system is based on a finite state CELP (FSCELP) framework. Each individual state is primarily identified with an LPC model order, LPC coefficient bit allocation, excitation codebook population density and state encoding rate. Successive input speech vectors are encoded at a rate that depends on the current state of the FSCELP system and the input vector characteristics. The use of a finite state system involves implicit clustering of speech signals. The lower rate states are selected during highly correlated steady state speech segments, when relatively few bits are required to obtain adequate fidelity. For speech signals with a strong glottal excitation, unvoiced signals and transient speech segments, relatively greater quantisation accuracy is needed to obtain good fidelity, and therefore the higher rate states of the system are used. Further improvement is obtained by using gamma-populated excitation codebooks for those states that are mainly used to encode speech signals with strong underlying glottal excitation pulses. Experiments focus on the varying encoding requirements of the excitation signal for low-pass, voiced, unvoiced and transient speech signals. The parameters of the finite state CELP system are designed to match the encoding requirements of typical speech signals. The greater part of the coding gain is obtained from variable rate encoding of the excitation signal. Using a six-state FSCELP, good quality speech is obtained at average, maximum and minimum bit rates of 4 kbit/s, 10 kbit/s and 2 kbit/s, respectively.
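
The relation between the per-state rates and the reported average can be illustrated with a minimal sketch. It is not the authors' coder: only the six-state count and the 2 kbit/s and 10 kbit/s extremes come from the abstract. The intermediate rates, the frame class labels, the class-to-state mapping and the example frame sequence are all hypothetical, and the mapping ignores the dependence on the current state that a real finite state coder would have. Frames are assumed to be of equal duration, so the average rate is the occupancy-weighted mean of the state rates.

```python
# Hypothetical per-state encoding rates in kbit/s. Only the number of states
# (six) and the extremes (2 and 10 kbit/s) are taken from the abstract.
STATE_RATES_KBITS = [2.0, 3.0, 4.0, 5.0, 8.0, 10.0]

# Assumed mapping from a frame class to a state index (not from the paper).
# A real finite state coder would also condition on the current state.
CLASS_TO_STATE = {
    "steady": 0,      # highly correlated steady state speech: lowest rate
    "low_pass": 1,
    "voiced": 3,
    "unvoiced": 4,
    "transient": 5,   # transient segments: highest rate
}


def average_rate(state_sequence, rates=STATE_RATES_KBITS):
    """Mean bit rate over per-frame state indices, assuming equal-length frames."""
    if not state_sequence:
        raise ValueError("state_sequence must not be empty")
    return sum(rates[s] for s in state_sequence) / len(state_sequence)


# Made-up frame sequence dominated by steady segments.
frames = ["steady"] * 6 + ["voiced"] * 2 + ["unvoiced"] + ["transient"]
states = [CLASS_TO_STATE[f] for f in frames]
avg = average_rate(states)
```

With this made-up sequence the mean is 4 kbit/s, which shows how an average near the minimum rate can arise when most frames fall in the low rate states, even though some frames are coded at 10 kbit/s.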

http://iet.metastore.ingenta.com/content/journals/10.1049/ip-i-2.1991.0078