Identification of essential proteins based on a new combination of topological and biological features in weighted protein–protein interaction networks

The identification of essential proteins in protein–protein interaction (PPI) networks is important not only for understanding the processes of cellular life but also for diagnosis and drug design. Network topology-based centrality measures are sensitive to noise in the network, and they cannot detect low-connectivity essential proteins. The authors propose a new method that combines topological centrality measures with biological features, based on statistical analyses of essential proteins and protein complexes. Because PPI networks are incomplete, they pose the challenge of false-positive interactions. To remove these interactions, the PPI networks are weighted using gene ontology. Furthermore, the authors use a combination of classifiers, including the newly proposed measures and traditional weighted centrality measures, to improve the precision of identification. This combination is evaluated with a logistic regression model in terms of significance levels. The proposed method has been implemented and compared with both earlier and more recent efficient computational methods using six statistical measures. The results show that the proposed method identifies essential proteins more precisely than the previous methods. This level of precision was obtained on four different data sets: YHQ-W, YMBD-W, YDIP-W and YMIPS-W.
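
The weighting step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the gene ontology weight is computed, so the Jaccard overlap of GO term sets used here is an assumption, as are the function names (`go_similarity`, `weight_network`, `weighted_degree`, `rank_proteins`) and the toy proteins and GO identifiers. The sketch weights each interaction, drops interactions whose weight is zero as likely false positives, and ranks proteins by a weighted degree centrality.

```python
from typing import Dict, FrozenSet, List, Tuple

Edge = Tuple[str, str]


def go_similarity(a: FrozenSet[str], b: FrozenSet[str]) -> float:
    """Jaccard overlap of two GO term sets (an assumed stand-in for the paper's weight)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def weight_network(edges: List[Edge],
                   go_terms: Dict[str, FrozenSet[str]]) -> Dict[Edge, float]:
    """Weight each interaction by GO similarity and drop zero-weight interactions."""
    weighted: Dict[Edge, float] = {}
    for u, v in edges:
        w = go_similarity(go_terms.get(u, frozenset()), go_terms.get(v, frozenset()))
        if w > 0.0:  # interactions with no shared annotation are treated as false positives
            weighted[(u, v)] = w
    return weighted


def weighted_degree(weighted: Dict[Edge, float]) -> Dict[str, float]:
    """Weighted degree centrality: the sum of the weights of a protein's interactions."""
    score: Dict[str, float] = {}
    for (u, v), w in weighted.items():
        score[u] = score.get(u, 0.0) + w
        score[v] = score.get(v, 0.0) + w
    return score


def rank_proteins(score: Dict[str, float]) -> List[str]:
    """Order proteins by descending score, breaking ties by name."""
    return sorted(score, key=lambda p: (-score[p], p))


# Hypothetical toy network; protein names and GO identifiers are illustrative only.
toy_go = {
    "A": frozenset({"GO:1", "GO:2"}),
    "B": frozenset({"GO:1", "GO:2"}),
    "C": frozenset({"GO:2", "GO:3"}),
    "D": frozenset({"GO:9"}),
}
toy_edges = [("A", "B"), ("A", "C"), ("B", "C"), ("A", "D")]

toy_weighted = weight_network(toy_edges, toy_go)
toy_ranking = rank_proteins(weighted_degree(toy_weighted))
```

In this toy network the A–D interaction shares no GO term and is removed, so D drops out of the ranking. A full pipeline in the spirit of the abstract would feed several such scores into a logistic regression model, which this sketch leaves out.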

Inspec keywords: drugs; topology; biology computing; proteins; ontologies (artificial intelligence)

Other keywords: weighted centrality measures; YMBD-W dataset; protein complexes; biological features; low-connectivity essential proteins; incomplete PPI networks; YMIPS-W dataset; YHQ-W dataset; network topology-based centrality measures; YDIP-W dataset; weighted protein–protein interaction networks; topological centrality measures

Subjects: Knowledge engineering techniques; Combinatorial mathematics; Biology and medical computing

http://iet.metastore.ingenta.com/content/journals/10.1049/iet-syb.2018.5024