© The Institution of Engineering and Technology
A novel single-channel blind source separation method based on probabilistic matrix factorisation (PMF) is proposed. Whereas conventional non-negative matrix factorisation (NMF) minimises the Euclidean distance or the Kullback–Leibler divergence, PMF uses the log posterior probability as the cost function for optimising the spectrum and activation matrices. This cost function has the advantage that the hyperparameters are optimised numerically without cross-validation. To apply PMF to audio source separation, both Gaussian and Laplacian priors are considered. An exponential substitution for the target matrices is also proposed to guarantee the non-negativity of the separated spectrogram. In source separation experiments, the proposed PMF-based approach performed significantly better than conventional NMF.
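The idea of minimising a negative log posterior under an exponential substitution can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a Gaussian likelihood with unit variance, zero-mean Gaussian priors on the substituted matrices, a fixed prior weight `lam` (the abstract states that hyperparameters are optimised numerically, which this sketch does not do), and plain gradient descent with backtracking. The names `pmf_exp`, `neg_log_posterior`, `A`, `B` and `lam` are hypothetical.

```python
import numpy as np


def neg_log_posterior(V, A, B, lam):
    """Negative log posterior up to a constant (assumed model).

    Gaussian likelihood on V - exp(A) @ exp(B), Gaussian priors on A and B.
    """
    R = V - np.exp(A) @ np.exp(B)
    return 0.5 * np.sum(R ** 2) + 0.5 * lam * (np.sum(A ** 2) + np.sum(B ** 2))


def pmf_exp(V, rank, n_iter=300, lam=0.01, seed=0):
    """Factorise V ~= W @ H with W = exp(A), H = exp(B), so W and H stay positive."""
    rng = np.random.default_rng(seed)
    A = 0.1 * rng.standard_normal((V.shape[0], rank))
    B = 0.1 * rng.standard_normal((rank, V.shape[1]))
    step = 1e-2
    losses = [neg_log_posterior(V, A, B, lam)]
    for _ in range(n_iter):
        W, H = np.exp(A), np.exp(B)
        R = W @ H - V
        # Chain rule through the exponential substitution: dW/dA = W, dH/dB = H.
        grad_A = (R @ H.T) * W + lam * A
        grad_B = (W.T @ R) * H + lam * B
        # Backtracking: halve the step until the cost does not increase.
        with np.errstate(over="ignore", invalid="ignore"):
            while True:
                A_new, B_new = A - step * grad_A, B - step * grad_B
                loss = neg_log_posterior(V, A_new, B_new, lam)
                if loss <= losses[-1] or step < 1e-12:
                    break
                step *= 0.5
        if not loss <= losses[-1]:
            break  # no acceptable step found; stop early
        A, B = A_new, B_new
        losses.append(loss)
        step *= 1.5  # cautiously grow the step again
    return np.exp(A), np.exp(B), losses


# Toy non-negative "spectrogram" built from two components (synthetic data).
rng = np.random.default_rng(1)
V_demo = rng.random((6, 2)) @ rng.random((2, 8))
W_demo, H_demo, losses_demo = pmf_exp(V_demo, rank=2)
```

Because `W` and `H` are obtained by exponentiating unconstrained matrices, the reconstructed spectrogram `W @ H` is positive by construction, with no projection or multiplicative update needed. A Laplacian prior would replace the squared penalty with an absolute-value penalty in this sketch.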