Dictionary learning based on M-PCA-N for audio signal sparse representation

IET Signal Processing

The most popular dictionary learning algorithms for sparse representation of signals are currently K-means Singular Value Decomposition (K-SVD) and K-SVD-extended. Both use only a rank-1 approximation to update one atom at a time, and neither copes efficiently with large dictionaries. To tackle these two problems, this study proposes M-Principal Component Analysis-N (M-PCA-N), an algorithm for dictionary learning and sparse representation. First, M-Principal Component Analysis (M-PCA) utilises information from the top M ranks of the SVD to update M atoms at a time. Then, to exploit the information in the remaining ranks, M-PCA-N is built on M-PCA by transforming information from the following N non-principal ranks onto the top M principal ranks. The mathematical formulation indicates that M-PCA may be seen as a generalisation of K-SVD. Experimental results on the BBC Sound Effects Library show that M-PCA-N not only lowers the mean squared error (MSE) between the original signal and its approximation in audio signal sparse representation, but also achieves higher audio signal classification precision than K-SVD.
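
To make the contrast between a rank-1 update and a rank-M update concrete, the following is a minimal NumPy sketch. The first function is the standard K-SVD atom update. The second, `rank_m_block_update`, is a hypothetical illustration of refreshing M atoms from the top M singular triplets of a single SVD; it is an assumption made for illustration and not the update rule of the paper, which the abstract does not spell out. The step that folds the following N non-principal ranks into the top M is not shown for the same reason. All function and variable names are illustrative.

```python
import numpy as np

def ksvd_atom_update(Y, D, X, k):
    """Standard K-SVD update of atom k via a rank-1 approximation."""
    D, X = D.copy(), X.copy()
    omega = np.flatnonzero(X[k])  # signals that currently use atom k
    if omega.size == 0:
        return D, X
    # Residual with atom k's contribution removed, restricted to omega
    E = Y[:, omega] - D @ X[:, omega] + np.outer(D[:, k], X[k, omega])
    U, s, Vt = np.linalg.svd(E, full_matrices=False)
    D[:, k] = U[:, 0]            # new unit-norm atom
    X[k, omega] = s[0] * Vt[0]   # new coefficients on the same support
    return D, X

def rank_m_block_update(Y, D, X, atoms):
    """Hypothetical rank-M analogue (an assumption, not the paper's rule):
    refresh all atoms in `atoms` from the top singular triplets of one SVD."""
    D, X = D.copy(), X.copy()
    atoms = np.asarray(atoms)
    # Signals that use at least one atom of the block
    omega = np.flatnonzero(np.any(X[atoms] != 0, axis=0))
    if omega.size == 0:
        return D, X
    # Residual with the whole block's contribution removed
    E = (Y[:, omega] - D @ X[:, omega]
         + D[:, atoms] @ X[np.ix_(atoms, omega)])
    U, s, Vt = np.linalg.svd(E, full_matrices=False)
    m = min(len(atoms), s.size)
    D[:, atoms[:m]] = U[:, :m]
    X[np.ix_(atoms[:m], omega)] = s[:m, None] * Vt[:m]
    X[np.ix_(atoms[m:], omega)] = 0.0  # block atoms left without a singular triplet
    return D, X
```

With a single-atom block, the second function reduces to the first, which mirrors the abstract's remark that M-PCA may be seen as a generalisation of K-SVD. Note that this block variant writes coefficients for every atom in the block over the union of their supports, so it can make the coefficient matrix denser than K-SVD would; a practical method would need to handle that.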

http://iet.metastore.ingenta.com/content/journals/10.1049/iet-spr.2015.0277