Performance improvement of a non-intrusive voice quality metric in lossy networks

Voice quality assessment of phone calls is a relevant task for mobile service providers. In this context, the main objective of this research is to provide a model that improves the performance of ITU-T Rec. P.563. To accomplish this objective, the proposed model addresses two aspects: a better response in lossy networks and adequate treatment of silence segments in the audio signal. Thus, a function is determined to suppress silences in the speech signal according to the packet loss rate value. Furthermore, the proposed model is implemented on both a voice quality server and a mobile device. Experimental results show that the proposed model improves the performance of the P.563 algorithm, bringing its results closer to those given by the P.862 algorithm and reaching a Pearson correlation coefficient of 0.9957 and a root mean square error of 0.2983. Moreover, subjective test results demonstrated that the proposed model outperforms the P.563 algorithm.
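The two figures of merit quoted above, the Pearson correlation coefficient and the root mean square error, compare one set of quality scores against a reference set. The following is a minimal sketch of how they are computed from their standard definitions; the function names and the sample scores are hypothetical and are not taken from the paper or its data.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length score lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of products of deviations from the means (unnormalised covariance).
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Square roots of the sums of squared deviations.
    dev_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    dev_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (dev_x * dev_y)


def rmse(x, y):
    """Root mean square error between two equal-length score lists."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)) / len(x))


if __name__ == "__main__":
    # Hypothetical scores on a 1-5 quality scale, for illustration only.
    reference_scores = [1.8, 2.4, 3.1, 3.9, 4.3]
    estimated_scores = [2.0, 2.3, 3.4, 3.7, 4.4]
    print("Pearson:", pearson(reference_scores, estimated_scores))
    print("RMSE:", rmse(reference_scores, estimated_scores))
```

A correlation close to 1 indicates that the estimated scores rank and scale conditions much as the reference does, while the RMSE measures the typical absolute gap between the two on the score scale, so the two values are usually reported together.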

Inspec keywords: speech processing; mobile handsets; mean square error methods; correlation theory

Other keywords: mobile device; root mean square error; Pearson correlation coefficient; lossy network; audio signal; speech signal; mobile service providers; P.563 algorithm performance; ITU-T Rec. P.563 algorithm; nonintrusive voice quality metric assessment; phone calls; voice quality server; packet loss rate value

Subjects: Speech and audio signal processing; Speech processing techniques; Interpolation and function approximation (numerical analysis); Mobile radio systems

http://iet.metastore.ingenta.com/content/journals/10.1049/iet-com.2018.5165