Popular dictionary learning algorithms for sparse signal representation include K-means Singular Value Decomposition (K-SVD) and K-SVD-extended. Both use only a rank-1 approximation to update one atom at a time, and neither copes efficiently with large dictionaries. To tackle these two problems, this study proposes M-Principal Component Analysis-N (M-PCA-N), an algorithm for dictionary learning and sparse representation. First, M-Principal Component Analysis (M-PCA) utilises information from the top M ranks of the SVD to update M atoms at a time. Then, to further utilise the information in the remaining ranks, M-PCA-N is built on M-PCA by transforming information from the following N non-principal ranks onto the top M principal ranks. The mathematical formulation indicates that M-PCA may be seen as a generalisation of K-SVD. Experimental results on the BBC Sound Effects Library show that M-PCA-N not only lowers the mean squared error (MSE) between the original signal and its approximation in audio signal sparse representation, but also achieves higher audio signal classification precision than K-SVD.
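
The abstract does not state the M-PCA update equations, so the following is only a minimal sketch of the standard K-SVD rank-1 atom update that M-PCA is said to generalise. The function names `ksvd_atom_update` and `rank_m_error` are hypothetical, and the second helper merely illustrates, under the usual truncated-SVD argument, how much residual energy is left when the top M ranks are kept instead of one; it is not the authors' M-PCA or M-PCA-N procedure.

```python
import numpy as np


def ksvd_atom_update(Y, D, X, k):
    """Standard K-SVD update of atom k (illustrative sketch, not M-PCA).

    Y: (n, T) signals, D: (n, K) dictionary, X: (K, T) sparse codes.
    Returns updated copies of D and X.
    """
    users = np.flatnonzero(X[k, :])  # signals whose code uses atom k
    if users.size == 0:
        return D, X  # unused atom: nothing to update
    # Residual with atom k's own contribution added back, restricted to users
    E = Y[:, users] - D @ X[:, users] + np.outer(D[:, k], X[k, users])
    U, s, Vt = np.linalg.svd(E, full_matrices=False)
    D, X = D.copy(), X.copy()
    D[:, k] = U[:, 0]               # new atom: first left singular vector
    X[k, users] = s[0] * Vt[0, :]   # new coefficients: rank-1 term only
    return D, X


def rank_m_error(E, m):
    """Frobenius error left after keeping the top m ranks of the SVD of E."""
    s = np.linalg.svd(E, compute_uv=False)
    return float(np.sqrt(np.sum(s[m:] ** 2)))
```

Only the first singular triplet is used in `ksvd_atom_update`; the singular values from rank 2 onwards, which `rank_m_error` measures, are the information that the abstract says M-PCA and M-PCA-N set out to exploit.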

Dictionary learning based on M-PCA-N for audio signal sparse representation
