SamEn-SVR: using sample entropy and support vector regression for bug number prediction

Monitoring and predicting the trend of a software system's bug number time series is crucial for both software project managers and software end-users. For project managers, accurate prediction of the number of bugs in a system supports timely decisions on matters such as effort investment and resource allocation. For end-users, knowing the likely number of bugs in their systems in advance enables them to act in time to limit the losses caused by possible system failures. This study proposes an approach called SamEn-SVR, which combines sample entropy and support vector regression (SVR) to predict the number of software bugs using time series analysis. The basic idea is to use the template vectors with the smallest complexity as input vectors for the SVR models, in order to ensure the predictability of the time series. Using Mozilla Firefox bug data, we conduct extensive experiments comparing the proposed approach with state-of-the-art techniques for predicting bug number time series, including auto-regressive integrated moving average (ARIMA), X12 enhanced ARIMA and polynomial regression. The experimental results demonstrate that the proposed SamEn-SVR approach outperforms these techniques in bug number prediction.
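The idea of choosing low-complexity template vectors can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that "smallest complexity" means the template length m with the lowest sample entropy, it uses the standard sample entropy definition with Chebyshev distance, and the tolerance r = 0.2 times the standard deviation is only a common convention. The function names and the toy bug counts are hypothetical.

```python
import math
import statistics


def sample_entropy(series, m, r):
    """Sample entropy of `series` for template length m and tolerance r."""
    n = len(series)

    def count_matches(length):
        # Use the first n - m starting positions for both template lengths
        # so the two counts are taken over the same set of positions.
        templates = [series[i:i + length] for i in range(n - m)]
        matches = 0
        for i in range(len(templates)):
            for j in range(i + 1, len(templates)):  # skip self-matches
                dist = max(abs(a - b) for a, b in zip(templates[i], templates[j]))
                if dist <= r:
                    matches += 1
        return matches

    b = count_matches(m)      # matching pairs of length m
    a = count_matches(m + 1)  # matching pairs of length m + 1
    if a == 0 or b == 0:
        return math.inf  # undefined; treated here as maximally irregular
    return -math.log(a / b)


def pick_template_length(series, candidates, r):
    """Assumed selection rule: the candidate m with the lowest sample entropy."""
    return min(candidates, key=lambda m: sample_entropy(series, m, r))


def make_training_pairs(series, m):
    """Each input is m consecutive values; the target is the value that follows."""
    inputs = [series[i:i + m] for i in range(len(series) - m)]
    targets = [series[i + m] for i in range(len(series) - m)]
    return inputs, targets


# Made-up bug counts per period, for illustration only (not Firefox data).
toy_counts = [12, 15, 11, 14, 16, 12, 15, 11, 14, 16, 12, 15, 11, 14, 16, 13]
r = 0.2 * statistics.pstdev(toy_counts)
best_m = pick_template_length(toy_counts, [1, 2, 3, 4], r)
X, y = make_training_pairs(toy_counts, best_m)
```

The resulting `X` and `y` could then be passed to a regressor such as scikit-learn's `SVR` via `fit(X, y)`; kernel and hyperparameter choices are left open here because the abstract does not state them.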


