On Some Fault Prediction Issues
CAMARGO CRUZ Ana Erika and IIDA Hajimu
Nara Institute of Science and Technology, Graduate School of Information Science, Software Design Laboratory

1 Introduction
At the NATO Software Engineering Conference (1968), Mr. Nash of the IBM UK Laboratories suggested Test Planning and Status Control as valuable tools for managing the later part of a software development cycle. Fault Prediction is one of the various research areas within Test Planning.

2 Fault Prediction
Goal: more effective criteria for the elaboration of test cases, achieved by predicting faulty components.
State of the art: a large number of prediction models have been proposed in recent decades, but they have hardly been put into practice.
- Projects: mostly open source software.
- Predictors: mainly product metrics.

3 Fault Prediction
- Akiyama et al. [1971]: LOC.
- Halstead et al. [1975]: complexity metrics.
- Basili et al. [1996]: design complexity metrics (coupling, inheritance, etc.).
- E. Arisholm et al. [2007], Moser et al. [2008], S. Shivaji et al. [2009], Y. Kamei et al. [2010]: process, history, and repository metrics (number of commits, LOC modified, past faults, number of times a file is refactored, etc.).
Logically, the components with the greatest number of LOC are also the most complex and the most frequently changed. The latest literature review [1] concludes that, overall, LOC is useful for fault prediction. Are we reinventing the wheel?
[1] Hall, T., Beecham, S., Bowes, D., Gray, D. and Counsell, S.: A Systematic Literature Review on Fault Prediction Performance in Software Engineering, IEEE Trans. on Softw. Eng., Vol. 38, No. 6, pp. 1276–1304 (2012)

4 Research Questions
RQ1: Does multicollinearity exist? If our supposition is right, these metrics would be highly inter-correlated.
RQ2: How much more accurate is a model using inter-correlated metrics than a single-metric model? (Prediction Model A takes Metric1, Metric2, ..., MetricN as input to predict faulty / not faulty; Prediction Model B takes Metric1 alone.)
RQ3: How much is their usage worth? If Model A is more accurate than Model B, is its prediction accuracy worth the effort invested to construct it?
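RQ1 and RQ2 hinge on how much a second, inter-correlated predictor actually improves a regression fit, and this can be checked numerically: for two predictors, R² follows in closed form from the pairwise Pearson correlations alone. A minimal sketch in pure Python, using the correlations of the example matrix discussed on the following slides (LOC–Faults 0.9, NCommits–Faults 0.7, NCommits–LOC 0.9); the function name is ours:

```python
# Multiple correlation coefficient R^2 for a regression with two
# predictors, computed from pairwise Pearson correlations alone:
#   R^2 = (r1y^2 + r2y^2 - 2*r1y*r2y*r12) / (1 - r12^2)
# where r1y, r2y correlate each predictor with the outcome and r12
# correlates the two predictors with each other.

def r2_two_predictors(r1y, r2y, r12):
    return (r1y ** 2 + r2y ** 2 - 2 * r1y * r2y * r12) / (1 - r12 ** 2)

# Correlations from the example matrix on the slides:
r_loc_faults = 0.9       # shared variance r^2 = 0.81
r_ncommits_faults = 0.7  # shared variance r^2 = 0.49
r_ncommits_loc = 0.9     # the two predictors are highly inter-correlated

r2_both = r2_two_predictors(r_loc_faults, r_ncommits_faults, r_ncommits_loc)
r2_loc_only = r_loc_faults ** 2

print(f"R^2 with LOC and NCommits: {r2_both:.3f}")
print(f"r^2 with LOC alone:        {r2_loc_only:.3f}")
# Adding the collinear NCommits lifts R^2 only from 0.81 to about 0.87:
# most of its predictive information is already carried by LOC.
```

With a second predictor that shares 81% of its variance with the first, R² rises only from 0.81 to about 0.87, which quantifies the "poor significance" point made on the next slides.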
5 The Multicollinearity Problem
The most common prediction techniques used for fault prediction (Naive Bayes and Logistic Regression) assume the predictor variables to be independent of each other. For example, suppose the following correlation matrix:

                       Number of Commits   LOC   Complexity   Number of Faults
    Number of Commits        1
    LOC                     0.9             1
    Complexity              0.8            0.7        1
    Number of Faults        0.7            0.9       0.5              1

6 The Multicollinearity Problem
Shared variance (r²): the amount of variation that two variables tend to vary together.
- r_LOC,Faults = 0.9, so r²_LOC,Faults = 0.81
- r_NCommits,Faults = 0.7, so r²_NCommits,Faults = 0.49
- r_NCommits,LOC = 0.9, so r²_NCommits,LOC = 0.81
Ideally, a multiple linear regression model Faults = a1*LOC + a2*NCommits + b would explain more variance than either predictor alone. The multiple correlation coefficient R², a measure of the fit of a multiple linear regression model in [0, 1], shows otherwise: because NCommits shares most of its variance with LOC, it adds poor significance, and R²_Faults(NCommits, LOC) ≈ r²_LOC,Faults = 0.81.

7 A Rapid Literature Review [1]
Of 208 research papers, only 36 passed their review, i.e., reported sufficient contextual and methodological information. We reviewed 13 of the 36.
(RQ1) Finding 1. The studies report multicollinearity among the predictors they used, but no details are provided: the correlation matrix among predictors is missing, and where Principal Component Analysis is applied to alleviate the problem, no major details are given.

8 A Rapid Literature Review
(RQ2) Finding 2. The studies do not report prediction results on single metrics, only on sets of metrics, so we cannot know how much more accurate a model using multiple metrics is than one using a single metric. Exceptions:
- LOC is a useful predictor with stable performance [2,3].
- A LOC-only model was suggested as a viable alternative to more complex models [4].
- One study exploring design complexity metrics (Chidamber and Kemerer) found that a single-predictor model yielded better prediction accuracy than using multiple metrics [5].
[2] D'Ambros, M., Lanza, M. and Robbes, R.: An Extensive Comparison of Bug Prediction Approaches, MSR 2010, 7th IEEE Working Conference on, pp. 31–41 (2010)
[3] Hongyu, Z.: An Investigation of the Relationships between Lines of Code and Defects, ICSM 2009, IEEE International Conference on, pp. 274–283 (2009)
[4] Bell, R., Ostrand, T. and Weyuker, E.: Looking for Bugs in All the Right Places, Procs. of the 2006 International Symposium on Software Testing and Analysis, ACM (2006)
[5] Gyimothy, T., Ferenc, R. and Siket, I.: Empirical Validation of Object-Oriented Metrics on Open Source Software for Fault Prediction, IEEE Trans. on Softw. Eng., Vol. 31, No. 10, pp. 897–910 (2005)

9 A Rapid Literature Review
(RQ3) Finding 3. The gain of using multiple metrics is not assessed either. An exception is the work of E. Arisholm et al. [7,8], which proposes a measure of cost-effectiveness relative to size: if the only thing a prediction model does is to model the fact that the number of faults of a class is proportional to its size, there would likely be little gain from such a model. However, their results are again on sets of metrics.
[7] Arisholm, E., Briand, L. C. and Fuglerud, M.: Data Mining Techniques for Building Fault-Proneness Models in Telecom Java Software, The 18th IEEE International Symposium on Software Reliability, pp. 215–224 (2007)
[8] Arisholm, E., Briand, L. C. and Johannessen, E. B.: A Systematic and Comprehensive Investigation of Methods to Build and Evaluate Fault Prediction Models, Journal of Systems and Software, Vol. 83, No. 1, pp. 2–17 (2010)

10 For Discussion: Course of Action
Study other metrics. Which ones? Where from?
- Publicly available data comes only from open source software projects.
- Metrics mined from these projects (product and repository metrics) may be telling us the same thing.
- Metrics on experience and team communication: where would they come from?
Include results and analysis:
- on single-predictor models as opposed to multiple-predictor models;
- on cost-effectiveness measures.

11 Conclusions
Although multicollinearity among predictors of faulty code is reported, little is known about the gain of using multiple predictors as opposed to single-predictor models. We think that researchers have exhausted the exploration of metrics from open source software. Other factors which may be related to faulty code are difficult to study due to the lack of publicly available data.
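The gain question raised in Finding 3 and in the conclusions can be made concrete with a cost-effectiveness sketch in the spirit of Arisholm et al. [7,8]: rank modules by a model's score, "inspect" them in that order, and compare the area under the faults-found-versus-LOC-inspected curve against a largest-module-first baseline. The module data and scoring below are invented for illustration, not taken from any study:

```python
# Sketch of a size-aware cost-effectiveness (CE) comparison: area under
# the "% faults found vs % LOC inspected" curve for a model's ranking,
# minus the same area for a largest-modules-first baseline.
# All module data below are hypothetical.

def fault_curve(modules, score):
    """Points (fraction of LOC inspected, fraction of faults found)
    when modules are inspected in decreasing order of `score`."""
    ordered = sorted(modules, key=score, reverse=True)
    total_loc = sum(loc for loc, _ in ordered)
    total_faults = sum(faults for _, faults in ordered)
    points, cum_loc, cum_faults = [(0.0, 0.0)], 0, 0
    for loc, faults in ordered:
        cum_loc += loc
        cum_faults += faults
        points.append((cum_loc / total_loc, cum_faults / total_faults))
    return points

def area(points):
    """Trapezoidal area under a piecewise-linear curve."""
    return sum((x2 - x1) * (y1 + y2) / 2
               for (x1, y1), (x2, y2) in zip(points, points[1:]))

# Hypothetical modules as (LOC, number of faults): a small fault-dense
# module and a large module with the same fault count.
modules = [(100, 5), (1000, 5)]

model_auc = area(fault_curve(modules, score=lambda m: m[1] / m[0]))  # fault density
size_auc = area(fault_curve(modules, score=lambda m: m[0]))          # largest first

ce = model_auc - size_auc
print(f"CE gain over the size baseline: {ce:.3f}")
```

A positive CE means the ranking finds faults earlier per LOC inspected than module size alone would, i.e., the model adds value beyond restating the fact that bigger components contain more faults.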