Quality prediction model for object oriented software using UML metrics
Ana Erika Camargo, Koichiro Ochimizu
Japan Advanced Institute of Science and Technology
4th World Congress for Software Quality – Bethesda, Maryland, USA – September 2008

Quality Prediction Model using UML metrics [1] of [42]

Outline
• Objective
• Scope
• Our Approach
• Related Work
• Design complexity metrics and UML
• Prediction Technique
• Case Study
• Conclusions and Future work

Objective
To create a model that is able:
• To predict fault-prone* code in the early phases of the software life cycle
• To detect possible defects in the software
(*) Fault-prone code: code that is likely to contain bugs.

Scope
[Figure: causal-loop diagram. Variables: Fault-prone code, Complex Specifications, Complex Design, Complex Implementation, Wrong Implementation, Wrong Design, Designers' experience, Developers' experience, Misunderstanding of Requirements. Each link is labelled S (change in the same direction) or O (change in the opposite direction); the scope of this study is marked on the diagram.]

Our Approach
• Related existing works: design complexity metrics measured from code are used to predict fault-prone code.
• Approximation: the code metrics are approximated from UML artifacts, to obtain good candidates for fault-proneness prediction.
• Our approach: design complexity metrics measured from UML artifacts are used to predict fault-prone code before coding.

Related work: Fault prediction
Prediction models of fault-proneness:

Study                      Input: design complexity metrics   Output                Prediction technique
Basili et al. [1996]       CK metrics, among others           Fault-prone classes   Multivariate logistic regression
Briand et al. [2000]       CK metrics, among others           Fault-prone classes   Multivariate logistic regression
Kanmani et al. [2004]      CK metrics, among others           Fault ratio           General regression neural network
Nachiappan et al. [2005]   CK metrics, among others           Fault ratio           Multiple linear regression
Olague et al. [2007]       CK, QMOOD                          Fault-prone classes   Multivariate logistic regression

CK: Chidamber & Kemerer. QMOOD: Quality Model for Object-Oriented Design.

Related work: Fault prediction
From these studies, we identified useful metrics to predict the fault-proneness of code:
• Chidamber and Kemerer – CK
1. Depth of inheritance tree (DIT)
2. Number of children (NOC)
3. Weighted Methods per Class (WMC)
4. Coupling Between objects (CBO)
5. Response for class (RFC)
6. Lack of Cohesion of methods (LCOM)
• Bansiya and Davis's quality metrics – QMOOD
7. Average of DIT for all classes in the system (ANA)
8. Class Interface Size (CIS)
9. Data Access Metric (DAM)
10. Direct Class Coupling (DCC)
11. Measure of aggregation (MOA)
12. Measure of functionality abstraction (MFA)
13. Number of methods (NOM, same as WMC)

Related work: UML & Design Complexity Metrics
• Tang et al. [2002]: measures CK metrics from data structures that are created from Rational Rose class, collaboration and activity diagrams.
Issue: to obtain accurate measures, assumptions are made about the level of detail in the diagrams. For example, one activity diagram per operation in the system is required.

Related work: UML & Design Complexity Metrics
• Baroni [2002]: formal definition of CK and QMOOD metrics, among others. This work uses UML class diagrams.
Issues:
• The RFC and LCOM calculations are code dependent.
• The CBO calculation does not clearly account for the methods used, or the variables instantiated, of other classes within each method of a class.

UML & Design Complexity Metrics
• Related existing works: design complexity metrics measured from code are used to predict fault-prone code.
• Approximation: the code metrics are approximated from UML artifacts, to obtain good candidates for fault-proneness prediction.
• Our approach: design complexity metrics measured from UML artifacts are used to predict fault-prone code before coding.

UML & Design Complexity Metrics
Design complexity metrics that can be approximated using UML class diagrams. Each metric on the slide is marked with one of two categories:
• Can be obtained straightforwardly from class diagrams
• Cannot be calculated precisely from class diagrams; the implementation of the class bodies is needed

Chidamber and Kemerer – CK
• Weighted Methods per Class (WMC)
• Depth of inheritance tree (DIT)
• Number of children (NOC)
• Coupling Between objects (CBO)
• Response for class (RFC)
• Lack of Cohesion of methods (LCOM)

Bansiya and Davis's quality metrics – QMOOD
• Average of DIT for all classes in the system (ANA)
• Class Interface Size (CIS)
• Data Access Metric (DAM)
• Direct Class Coupling (DCC)
• Measure of aggregation (MOA)
• Measure of functionality abstraction (MFA)
• Number of methods (NOM, same as WMC)

UML & Design Complexity Metrics: CBO Approximation
• CBO-code: number of classes coupled to a given class (*)
• CBO-UML Approach 1 (UML collaboration diagram): a count of all messages sent to different objects
• CBO-UML Approach 2 (UML collaboration diagram): the same as Approach 1, but excluding the messages that return a value.
(*) If a method within a class uses a method or an instance variable of a different class, the pair of classes is said to be coupled.

UML & Design Complexity Metrics: CBO Approximation
[Figure: fragment of a collaboration diagram with the object aCustomer and a message that returns a value, R7: fundsStatus := CommtiFunds().]

UML & Design Complexity Metrics: CBO Evaluation
Using an e-commerce system (*).
[Figure: CBO (y-axis, -0.2 to 1.2) per class number (x-axis, 0 to 14) for CBO-code, CBO-UML(2) and CBO-UML(1).]
(*) Described in: Hassan Gomaa, Designing Concurrent, Distributed, and Real-Time Applications with UML, Addison-Wesley Object Technology Series, July 2000.

UML & Design Complexity Metrics: CBO Evaluation
• CBO-code and CBO-UML Approach 1: correlation coefficient = 0.81
• CBO-code and CBO-UML Approach 2: correlation coefficient = 0.89
CBO-UML Approach 2 has a slightly stronger linear relationship with CBO-code.

UML & Design Complexity Metrics: RFC Approximation
• RFC-code: number of methods of a given class + number of methods of other classes directly called by any of the methods of the given class.
• RFC-UML Approach 1 (UML collaboration diagrams): messages received + messages sent
• RFC-UML Approach 2 (UML collaboration and class diagrams): (messages received + 2 × number of attributes) + messages sent, where (messages received + 2 × number of attributes) ≈ number of methods of the given class. This assumes two public methods per attribute, one to get and one to set its value.

UML & Design Complexity Metrics: RFC Approximation

    class C {
        A a;

        // m() calls dosth() on an object of another class, D
        void m() {
            D d;
            d.dosth();
            ……..
        }

        // setter and getter for the attribute a
        void setA(A a) { this.a = a; }

        A getA() { return a; }
    }

[Figure: collaboration diagram in which object x sends m() to c, and c sends dosth() to d.]
RFC(C) = 3 + 1 = 4: three methods of C (m, setA, getA) plus one method of another class called from C (dosth).

UML & Design Complexity Metrics: RFC Evaluation
Using the same e-commerce system.
[Figure: RFC (y-axis, 0 to 1.2) per class number (x-axis, 0 to 14) for RFC-code, RFC-UML(2) and RFC-UML(1).]

UML & Design Complexity Metrics: RFC Evaluation
• RFC-code and RFC-UML Approach 1: correlation coefficient = -0.07
• RFC-code and RFC-UML Approach 2: correlation coefficient = 0.67
RFC-UML Approach 2 has a stronger linear relationship with RFC-code.

UML & Design Complexity Metrics: Remark
Although the assumption behind our second approach might not always hold, the approach still performed acceptably. This might be explained by the fact that the number of private attributes in a class is moderately correlated with its number of methods, according to Olague's research [2007].

UML & Design Complexity Metrics
Design complexity metrics that can be approximated using UML diagrams. Each metric on the slide is marked with one of three categories:
• Can be obtained straightforwardly from class diagrams
• Can be approximated by using collaboration diagrams
• Cannot be approximated precisely using UML diagrams

Chidamber and Kemerer – CK
• Weighted Methods per Class (WMC)
• Depth of inheritance tree (DIT)
• Number of children (NOC)
• Coupling Between objects (CBO)
• Response for class (RFC)
• Lack of Cohesion of methods (LCOM)

Bansiya and Davis's quality metrics – QMOOD
• Average of DIT for all classes in the system (ANA)
• Class Interface Size (CIS)
• Data Access Metric (DAM)
• Direct Class Coupling (DCC)
• Measure of aggregation (MOA)
• Measure of functionality abstraction (MFA)
• Number of methods (NOM, same as WMC)

Prediction Technique
• Related existing works: design complexity metrics (13) measured from code are used to predict fault-prone code.
• Approximation: design complexity metrics (12) measured from UML artifacts.
• Predict: how?

Prediction Technique: Logistic Regression
• Use. When we have one variable (y) with two values (e.g. faulty / not faulty, 1/0) and one or more measurement variables (xs).
• Goal. To predict the probability of getting a particular value of y, given the x variables, through a logit model.
• Key point. No assumptions are made about the distribution of the variables.

Prediction Technique
Example. We want to estimate the probability that a class is highly FAULTY, in terms of a design complexity metric, Mx.

Prediction Technique
FAULTY: Most Faulty (MF) = 1, Least Faulty (LF) = 2. Mx: design complexity metric.

CLASS  FAULTY  Mx        CLASS  FAULTY  Mx
  1      1      1          13     2      1
  2      1      1          14     2      0
  3      1      1          15     2      0
  4      1      1          16     2      0
  5      1      1          17     2      0
  6      1      1          18     2      0
  7      1      1          19     2      0
  8      1      1          20     2      0
  9      1      1          21     2      0
 10      1      1          22     2      0
 11      1      0          23     2      0
 12      1      0          24     2      0

CLASS     Mx=1   Mx=0   Total
MF=1       10      2     12
LF=2        1     11     12
Total      11     13     24

Prediction Technique: Probabilities
• The probability that any given class is MF: P(MF) = 12/24 = 0.50
• The probability that any given class is MF given that Mx=1: P(MF|Mx=1) = 10/11 = 0.909
• The probability that any given class is MF given that Mx=0: P(MF|Mx=0) = 2/13 = 0.154

Prediction Technique: Odds
• The odds of a class being MF: Odds(MF) = 12/12 = 1
• The odds of a class being MF given that Mx=1: Odds(MF|Mx=1) = 10/1 = 10 ... (1)
• The odds of a class being MF given that Mx=0: Odds(MF|Mx=0) = 2/11 = 0.182 ... (2)

Prediction Technique
• Odds and probabilities provide the same information, but in different ways.
• It is easy to convert odds to probabilities and vice versa, e.g.:

$P(MF \mid Mx=1) = \frac{Odds(MF \mid Mx=1)}{1 + Odds(MF \mid Mx=1)} = \frac{10}{1 + 10} = 0.909$

$Odds(MF \mid Mx=1) = \frac{P(MF \mid Mx=1)}{1 - P(MF \mid Mx=1)} = \frac{0.909}{1 - 0.909} \approx 10$

Prediction Technique
• Taking the natural log of (1) and (2):
ln[Odds(MF|Mx=1)] = ln(10) = 2.303 ... (3)
ln[Odds(MF|Mx=0)] = ln(0.182) = -1.704 ... (4)
• We can generalize (3) and (4) as follows:
ln[Odds(MF|Mx)] = A + B*Mx ... (5)
• From (3) and (5), when Mx=1: ln[Odds(MF|Mx)] = A + B = 2.303 ... (6)
• From (4) and (5), when Mx=0: ln[Odds(MF|Mx)] = A = -1.704 ... (7)
• From (6) and (7): A = -1.704, B = 4.007
• Finally, we can rewrite (5) as follows:
ln[Odds(MF|Mx)] = -1.704 + 4.007*Mx

Prediction Technique
ln[Odds(MF|Mx)] = -1.704 + 4.007*Mx
• If we write the odds in terms of the probability p = P(MF|Mx):

$Odds(MF \mid Mx) = \frac{p}{1 - p}$

• We can rewrite our final equations as:

$\ln\frac{p}{1 - p} = -1.704 + 4.007\,Mx$

$p = P(MF \mid Mx) = \frac{1}{1 + e^{-(-1.704 + 4.007\,Mx)}}$

Case study
• Related existing works: design complexity metrics (13) measured from code are used to predict fault-prone code.
• Approximation: design complexity metrics (12) measured from UML artifacts; prediction using logistic regression.
• Are the candidate UML metrics good enough to predict fault-proneness?

Case study
Objective: estimate the probability of having a faulty class during the testing phase, using logistic regression.

Case study: Description.
Using the design and implementation of the e-commerce system described in Gomaa's book, this case study was carried out as follows:
• Collection of UML and code metrics (Xs)
• Collection of data on the faults of the e-commerce system from the logs of the CVS repository used (Y)
• Evaluation of the relationship between each metric and fault-proneness, using univariate logistic models

Case study: Metrics to evaluate.
Because the e-commerce system was designed and implemented without inheritance, the inheritance metrics are excluded from the evaluation:

Suite   Code metric                                                   Level    Inheritance metric
QMOOD   Average Number of Ancestors (ANA)                             System   Yes
QMOOD   Measure of Aggregation (MOA)                                  Class    No
QMOOD   Class Interface Size (CIS)*                                   Class    No
QMOOD   Data Access Metric (DAM)                                      Class    No
QMOOD   Direct Class Coupling (DCC)                                   Class    No
QMOOD   Measure of Functional Abstraction (MFA)                       Class    Yes
QMOOD   Number of Methods (NOM) = Weighted Methods per Class (WMC)*   Class    No
CK      Depth of Inheritance (DIT)                                    Class    Yes
CK      Number of Children (NOC)                                      Class    Yes
CK      Response For Class (RFC)*                                     Class    No
CK      Coupling Between Objects (CBO)                                Class    No

(*) Found to be good predictors of fault-prone code in Olague's work [2007].

Case study
Estimation of the probability of a class being faulty, using CBO-code.

Class number   Actual (y)   Predicted probability using CBO-code   Predicted (ŷ)
 1             Not faulty   0.2                                    Not faulty
 2             Not faulty   0.2                                    Not faulty
 3             Faulty       0.99903                                Faulty
 4             Faulty       1                                      Faulty
 5             Faulty       1                                      Faulty
 6             Faulty       1                                      Faulty
 7             Faulty       0.2                                    Not faulty
 8             Faulty       1                                      Faulty
 9             Faulty       1                                      Faulty
10             Faulty       1                                      Faulty
11             Not faulty   0.2                                    Not faulty
12             Faulty       1                                      Faulty
13             Not faulty   0.2                                    Not faulty

$p = \frac{1}{1 + e^{-(-1.3863 + 8.3282\,CBO)}}$

• Correctness: 12/13 classes, 92.3% of classes correctly classified
• Sensitivity: 8/9 faulty classes, 88.9% of faulty classes correctly classified
• Specificity: 4/4 non-faulty classes, 100% of non-faulty classes correctly classified

Case study: Results.
From the univariate models using each of the proposed metrics:

Metric       Correctness [classes]   Sensitivity [faulty classes]   Specificity [non-faulty classes]
CBO-code     92.3%                   88.88%                         100%
CBO-UML(1)   69.2%                   66.66%                         75%
CBO-UML(2)   69.2%                   55.55%                         100%
RFC-code     84.61%                  88.88%                         75%
RFC-UML(1)   76.92%                  77.77%                         75%
RFC-UML(2)   84.61%                  88.88%                         75%
WMC-code     90.9%                   85.7%                          100%
WMC-UML      72.7%                   71.42%                         75%
CIS-code     90.9%                   85.7%                          100%
CIS-UML      90.9%                   100%                           75%
DAM-code     36.3%                   57.14%                         0%
DAM-UML      72.7%                   85.7%                          50%

CIS: public methods in a class.
DAM: ratio of the number of private and protected attributes to the total number of attributes.
DCC measures were not significant for this study.

Case study: Results
• Our second approach to approximating RFC with UML diagrams performed as well as the RFC metric measured from code.
• The UML CIS approximation performed similarly to the CIS metric measured from code.
• The performance of the remaining UML metrics was somewhat acceptable.

Case study
Can we apply the obtained models to other case studies?

Metric       System       Correctness [classes]   Sensitivity [faulty classes]   Specificity [non-faulty classes]
CBO-UML(1)   E-commerce   69.2%                   66.66%                         75%
CBO-UML(1)   Banking      72.7%                   100%                           50%
RFC-UML(1)   E-commerce   76.92%                  77.77%                         75%
RFC-UML(1)   Banking      72.7%                   100%                           50%
CBO-UML(2)   E-commerce   69.2%                   55.55%                         100%
CBO-UML(2)   Banking      63.6%                   80%                            50%
RFC-UML(2)   E-commerce   84.61%                  88.88%                         75%
RFC-UML(2)   Banking      72.7%                   80%                            66.6%

Conclusions and Future work
• UML metrics can be acceptable predictors of fault-prone code.
• The UML CIS and UML RFC metrics showed a strong relationship to the fault-proneness of code.
• We might be able to create a more robust model to predict fault-prone code before its implementation.

Conclusions and Future work
• Further study and evaluation of other metrics using other UML artifacts (e.g. sequence diagrams, state diagrams and use case descriptions) is needed.
• Construction of a more robust model using multivariate logistic regression
• Evaluation of the final model obtained, using different case studies
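
The worked example in the Prediction Technique slides (the 2×2 table of MF/LF classes against Mx) and the CBO-code classification rates in the case study can be reproduced with a short script. The following is a minimal sketch rather than the authors' tooling: the function names are hypothetical, the counts, labels and probabilities are copied from the slides, and the 0.5 cut-off used to turn a probability into a faulty / not faulty label is an assumption, since the slides do not state the threshold.

```python
import math


def logit_from_counts(mf_mx1, lf_mx1, mf_mx0, lf_mx0):
    """Fit ln(Odds(MF | Mx)) = A + B * Mx from a 2x2 table of class counts."""
    a = math.log(mf_mx0 / lf_mx0)       # log-odds of MF when Mx = 0
    b = math.log(mf_mx1 / lf_mx1) - a   # increase in log-odds when Mx = 1
    return a, b


def probability(a, b, x):
    """P(MF | Mx = x) for the model ln(Odds) = a + b * x."""
    return 1.0 / (1.0 + math.exp(-(a + b * x)))


def classification_rates(actual, probs, threshold=0.5):
    """Return (correctness, sensitivity, specificity); 1 = faulty, 0 = not faulty.

    The default threshold of 0.5 is an assumption, not taken from the slides.
    """
    predicted = [1 if p >= threshold else 0 for p in probs]
    pairs = list(zip(actual, predicted))
    true_pos = sum(1 for y, y_hat in pairs if y == 1 and y_hat == 1)
    true_neg = sum(1 for y, y_hat in pairs if y == 0 and y_hat == 0)
    faulty = sum(actual)
    not_faulty = len(actual) - faulty
    correctness = (true_pos + true_neg) / len(actual)
    return correctness, true_pos / faulty, true_neg / not_faulty


# Counts from the worked example: MF/LF classes split by the value of Mx.
A, B = logit_from_counts(mf_mx1=10, lf_mx1=1, mf_mx0=2, lf_mx0=11)

# Actual labels and predicted probabilities for the 13 classes (CBO-code table).
ACTUAL = [0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 0]
PROBS = [0.2, 0.2, 0.99903, 1, 1, 1, 0.2, 1, 1, 1, 0.2, 1, 0.2]

if __name__ == "__main__":
    print(f"A = {A:.3f}, B = {B:.3f}")
    print(f"P(MF | Mx=1) = {probability(A, B, 1):.3f}")
    print(f"P(MF | Mx=0) = {probability(A, B, 0):.3f}")
    correctness, sensitivity, specificity = classification_rates(ACTUAL, PROBS)
    print(f"correctness = {correctness:.3f}")
    print(f"sensitivity = {sensitivity:.3f}")
    print(f"specificity = {specificity:.3f}")
```

With exact counts the intercept comes out near -1.705 rather than the -1.704 on the slide, because the slide rounds 2/11 to 0.182 before taking the logarithm; B and the two conditional probabilities (10/11 and 2/13) agree with the slides. Under the assumed 0.5 cut-off the rates are 12/13, 8/9 and 4/4, which match the correctness, sensitivity and specificity reported for CBO-code.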