Dictionary learning based on M-PCA-N for audio signal sparse representation

The currently popular dictionary learning algorithms for sparse representation of signals are K-means Singular Value Decomposition (K-SVD) and K-SVD-extended. Both use only a rank-1 approximation to update one atom at a time, and they cannot cope efficiently with large dictionaries. To tackle these two problems, this study proposes M-Principal Component Analysis-N (M-PCA-N), an algorithm for dictionary learning and sparse representation. First, M-Principal Component Analysis (M-PCA) utilises information from the top M ranks of the SVD to update M atoms at a time. Then, to further utilise the information in the remaining ranks, M-PCA-N is built on M-PCA by transforming information from the following N non-principal ranks onto the top M principal ranks. The mathematical formulation indicates that M-PCA may be seen as a generalisation of K-SVD. Experimental results on the BBC Sound Effects Library show that M-PCA-N not only lowers the mean squared error (MSE) between the original signal and its approximation in audio signal sparse representation, but also obtains higher audio signal classification precision than K-SVD.
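To make the rank-1 limitation concrete, the following is a minimal NumPy sketch of the standard K-SVD atom update, together with a helper that measures how much of a residual matrix a top-M truncated SVD captures. It is not the M-PCA or M-PCA-N procedure itself, which the abstract does not spell out. The function names `ksvd_atom_update` and `truncated_svd_error`, the matrix shapes, and the variable names `Y`, `D`, `X` are illustrative assumptions.

```python
import numpy as np

def ksvd_atom_update(Y, D, X, k):
    """Standard K-SVD update of atom k (illustrative, not M-PCA).

    Y: signals (n x N), D: dictionary (n x K), X: sparse codes (K x N).
    Returns updated copies of D and X.
    """
    users = np.flatnonzero(X[k, :])      # signals that currently use atom k
    if users.size == 0:
        return D, X                      # unused atom: leave as is
    D, X = D.copy(), X.copy()
    X[k, users] = 0.0                    # remove atom k's contribution
    E = Y[:, users] - D @ X[:, users]    # residual restricted to those signals
    U, s, Vt = np.linalg.svd(E, full_matrices=False)
    D[:, k] = U[:, 0]                    # rank-1: only the first singular vector
    X[k, users] = s[0] * Vt[0, :]        # matching coefficients
    return D, X

def truncated_svd_error(E, rank):
    """Frobenius error of the best rank-`rank` approximation of E."""
    U, s, Vt = np.linalg.svd(E, full_matrices=False)
    approx = (U[:, :rank] * s[:rank]) @ Vt[:rank, :]
    return np.linalg.norm(E - approx)
```

The rank-1 update discards every singular component of the residual except the first. Calling `truncated_svd_error(E, M)` for increasing `M` shows how much additional residual energy the top M ranks would account for, which is the information an approach that updates M atoms at a time can draw on.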

Inspec keywords: audio signal processing; singular value decomposition; signal representation; signal classification

Other keywords: K-SVD generalisation; rank-1 approximation; dictionary learning; audio signal classification precision; K-SVD-extended; M-PCA-N; audio signal sparse representation; SVD decomposition; nonprincipal ranks; principal ranks

Subjects: Speech and audio signal processing; Algebra
