© The Institution of Engineering and Technology
Monitoring and predicting the trend of a software system's bug number time series is crucial for both software project managers and end-users. For managers, accurate prediction of the number of bugs supports timely decisions on effort investment and resource allocation. For end-users, knowing the likely number of bugs in advance enables timely action to limit the losses caused by possible system failures. This study proposes an approach called SamEn-SVR, which combines sample entropy and support vector regression (SVR) to predict software bug numbers using time series analysis. The basic idea is to use the template vectors with the smallest complexity as input vectors for the SVR models, so as to ensure the predictability of the time series. Using Mozilla Firefox bug data, we conduct extensive experiments comparing the proposed approach with state-of-the-art techniques, including auto-regressive integrated moving average (ARIMA), X12-enhanced ARIMA and polynomial regression, in predicting bug number time series. Experimental results demonstrate that the proposed SamEn-SVR approach outperforms these techniques in bug number prediction.
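The idea of choosing low-complexity template vectors can be illustrated with a minimal sketch. The abstract does not give the procedure, so everything here is an assumption: sample entropy is computed in its common form, SampEn(m, r) = -ln(A/B), with Chebyshev distance and self-matches excluded; "smallest complexity" is read as picking the template length m with the lowest sample entropy; and the function names `sample_entropy`, `pick_dimension` and `make_training_pairs` are hypothetical, not taken from the paper.

```python
import math


def sample_entropy(series, m, r):
    """Return SampEn(m, r) = -ln(A / B) for a sequence of numbers.

    B counts pairs of length-m templates within tolerance r (Chebyshev
    distance), A counts the same for length m + 1. Self-matches are excluded.
    """
    n = len(series)

    def count_matches(length):
        # Use the first n - m start positions for both lengths so that
        # A and B are counted over the same set of templates.
        templates = [series[i:i + length] for i in range(n - m)]
        count = 0
        for i in range(len(templates)):
            for j in range(i + 1, len(templates)):
                dist = max(abs(a - b) for a, b in zip(templates[i], templates[j]))
                if dist <= r:
                    count += 1
        return count

    b = count_matches(m)
    a = count_matches(m + 1)
    if a == 0 or b == 0:
        # No matches: the entropy is undefined, treat it as maximal complexity.
        return float("inf")
    return -math.log(a / b)


def pick_dimension(series, candidates, r):
    """Assumed selection rule: the template length with the lowest SampEn."""
    return min(candidates, key=lambda m: sample_entropy(series, m, r))


def make_training_pairs(series, m):
    """Build (template vector, next value) pairs for a one-step-ahead regressor."""
    inputs = [list(series[i:i + m]) for i in range(len(series) - m)]
    targets = [series[i + m] for i in range(len(series) - m)]
    return inputs, targets
```

Under these assumptions, the pairs returned by `make_training_pairs` could be passed to any SVR implementation (for example the `fit` method of scikit-learn's `sklearn.svm.SVR`) to produce one-step-ahead forecasts. The tolerance r and the candidate template lengths are free parameters in this sketch; the values used in the study are not stated in the abstract.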
http://iet.metastore.ingenta.com/content/journals/10.1049/iet-sen.2017.0168