Overlapping functional modules detection in PPI network with pair-wise constrained non-negative matrix tri-factorisation
- Author(s): Guangming Liu 1; Bianfang Chai 2; Kuo Yang 1; Jian Yu 1; Xuezhong Zhou 1
Affiliations:
1: Beijing Key Lab of Traffic Data Analysis and Mining, Beijing Jiaotong University, No. 3 Shangyuancun Haidian District, Beijing, People's Republic of China;
2: Department of Information Engineering, Hebei GEO University, Shijiazhuang, People's Republic of China
- Source: Volume 12, Issue 2, April 2018, p. 45–54
DOI: 10.1049/iet-syb.2017.0084, Print ISSN 1751-8849, Online ISSN 1751-8857
High-throughput experimental techniques have generated a large amount of protein–protein interaction (PPI) data. Uncovering functional modules from PPI networks helps to clarify the underlying mechanisms of cellular functions. Numerous computational algorithms have been designed over the past decades to identify functional modules automatically. However, most community detection methods (whether non-overlapping or overlapping) are unsupervised models, which cannot incorporate well-known protein complexes as prior knowledge. The authors propose a novel semi-supervised model named pair-wise constrained non-negative matrix tri-factorisation (PCNMTF), which takes full advantage of well-known protein complexes to find overlapping functional modules from PPI networks by learning a protein module indicator matrix and a module correlation matrix simultaneously. Based on non-negative matrix tri-factorisation, PCNMTF deterministically models and learns the mixed module memberships of each protein while accounting for the correlation among modules. Experimental results on both synthetic and real-world biological networks demonstrate that PCNMTF identifies more precise functional modules than state-of-the-art methods.
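The general idea of a pair-wise constrained tri-factorisation can be illustrated with a minimal sketch. The abstract does not state the objective function or update rules, so everything below is an assumption made for illustration: the adjacency matrix A is approximated by H S H^T (H as the protein module indicator matrix, S as the module correlation matrix), known complexes enter as must-link pairs through a graph Laplacian penalty, and the factors are refined with heuristic damped multiplicative updates. The function names, the penalty weight lam and the toy network are hypothetical and are not taken from the paper.

```python
import random


def matmul(X, Y):
    """Multiply two dense matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)]
            for row in X]


def transpose(X):
    """Return the transpose of a dense matrix."""
    return [list(col) for col in zip(*X)]


def constrained_nmtf_sketch(A, must_link, k, lam=1.0, iters=200, seed=0):
    """Approximate a symmetric A by H S H^T with non-negative H and S.

    Assumed objective (illustrative only, not the paper's):
        ||A - H S H^T||_F^2 + lam * tr(H^T (D - M) H)
    where M is the 0/1 must-link matrix and D its diagonal degree matrix.
    """
    eps = 1e-9  # keeps denominators strictly positive
    rng = random.Random(seed)
    n = len(A)
    M = [[0.0] * n for _ in range(n)]
    for i, j in must_link:
        M[i][j] = M[j][i] = 1.0
    deg = [sum(row) for row in M]
    H = [[rng.random() + 0.1 for _ in range(k)] for _ in range(n)]
    S = [[1.0 if a == b else 0.1 for b in range(k)] for a in range(k)]
    for _ in range(iters):
        # Heuristic damped multiplicative update for H.
        HS = matmul(H, S)
        num = matmul(A, HS)                               # A H S
        den = matmul(matmul(HS, matmul(transpose(H), H)), S)  # H S H^T H S
        MH = matmul(M, H)
        for i in range(n):
            for c in range(k):
                pull = num[i][c] + 0.5 * lam * MH[i][c]
                push = den[i][c] + 0.5 * lam * deg[i] * H[i][c] + eps
                H[i][c] *= (pull / push) ** 0.5
        # Multiplicative update for S with H held fixed.
        Ht = transpose(H)
        HtH = matmul(Ht, H)
        num_s = matmul(matmul(Ht, A), H)                  # H^T A H
        den_s = matmul(matmul(HtH, S), HtH)               # H^T H S H^T H
        for a in range(k):
            for b in range(k):
                S[a][b] *= num_s[a][b] / (den_s[a][b] + eps)
    return H, S


def memberships(H):
    """Normalise each row of H so a protein's module weights sum to 1."""
    return [[v / (sum(row) or 1.0) for v in row] for row in H]


# Toy network: two triangles {0, 1, 2} and {2, 3, 4} sharing protein 2.
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (2, 4), (3, 4)]
A = [[0.0] * 5 for _ in range(5)]
for i, j in edges:
    A[i][j] = A[j][i] = 1.0

# Hypothetical prior knowledge: (0, 1) and (3, 4) share a known complex.
H, S = constrained_nmtf_sketch(A, must_link=[(0, 1), (3, 4)], k=2)
for node, row in enumerate(memberships(H)):
    print(node, [round(v, 2) for v in row])
```

In this sketch, overlap is read from the normalised rows of H: a protein with substantial weight in more than one column would be assigned to several modules, and the off-diagonal entries of S indicate how strongly two modules are connected. The square root in the H update is a common damping choice for symmetric factorisations; no convergence guarantee is claimed here.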
Inspec keywords: proteins; cellular biophysics; molecular biophysics; matrix algebra
Other keywords: pair-wise constrained nonnegative matrix trifactorisation; cellular functions; overlapping functional module detection; protein–protein interaction data; PPI network; real-world biological networks; protein complexes; synthetic biological networks
Subjects: Cellular biophysics; Algebra, set theory, and graph theory; Biomolecular interactions, charge transfer complexes

