Linear prediction estimates the future values of a discrete-time signal as a linear function of previous samples. When it is applied to predictive coding of waveforms such as speech and audio, compression performance commonly suffers from the non-stationary behaviour of the prediction residuals near the start of random access frames. The cause is that the dependency of the prediction residuals on the historical waveform is deliberately cut so that each such frame can be decoded independently. As a result, the dynamic range of the residuals fluctuates sharply within these frames, which substantially degrades the performance of the subsequent entropy coder. In this study, the authors address this long-standing issue by establishing a theoretical relationship between the prediction coefficients and the energy envelope of the linear prediction residuals in random access frames. Using this relationship, an adaptive normalisation method is formulated as a preprocessor to the entropy coder to mitigate the loss of coding performance in random access frames. Simulation results confirm that the proposed method achieves higher coding efficiency than existing solutions.
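
The effect described above can be illustrated with a minimal sketch. It is not the authors' method: it only shows how residual energy rises at the start of a frame once the predictor loses access to earlier samples. The test signal (a sinusoid with small bounded noise), the fixed second-order predictor, the frame boundaries, and the choice to treat unavailable history as zero are all assumptions made for illustration.

```python
import math
import random


def residuals(x, coeffs, start, stop, use_history):
    """Residual e[n] = x[n] - sum_k coeffs[k] * x[n-1-k] for n in [start, stop)."""
    out = []
    for n in range(start, stop):
        pred = 0.0
        for k, a in enumerate(coeffs):
            idx = n - 1 - k
            # In a random access frame, samples before `start` are unavailable;
            # this sketch treats them as zero (an assumption, not the paper's rule).
            if idx >= 0 and (use_history or idx >= start):
                pred += a * x[idx]
        out.append(x[n] - pred)
    return out


def energy(values):
    return sum(v * v for v in values)


# Hypothetical test signal: a sinusoid plus small bounded noise.
rng = random.Random(0)
w = 0.3
x = [math.sin(w * n) + rng.uniform(-0.01, 0.01) for n in range(64)]

# A noiseless sinusoid satisfies x[n] = 2*cos(w)*x[n-1] - x[n-2] exactly,
# so these coefficients predict the test signal well.
coeffs = [2.0 * math.cos(w), -1.0]
order = len(coeffs)
frame_start, frame_stop = 10, 42

with_history = residuals(x, coeffs, frame_start, frame_stop, use_history=True)
random_access = residuals(x, coeffs, frame_start, frame_stop, use_history=False)

# Only the first `order` residuals of the frame differ between the two cases.
print("start-of-frame energy, with history :", energy(with_history[:order]))
print("start-of-frame energy, random access:", energy(random_access[:order]))
```

In this sketch the two residual sequences coincide once `order` samples of the frame are available, so the extra energy is confined to the first few residuals of the frame. That short burst is the jump in dynamic range that the entropy coder has to absorb.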

Optimal normalisation of prediction residual for predictive coding with random access
