--- tags: Ideas title: Metrics Document --- # Metrics State of Art In this note we are going to write down the state of art related to software engineering metrics that might be implemented in rust-code-analysis. # Metrics A list of metrics found online with a brief explanation for each of them ## Academic overview ### Common metrics A list of common software metrics from Wikipedia.[^cmw] :heavy_check_mark: : already implemented in rca :x: : non implemented in rca | Name | Description | Details | | | ---- | ---- | ---- | ---- | | *ABC score* | It's as a triplet of values that represent the size of a set of source code statements. It's calculated by counting the number of assignments (A), number of branches (B), and number of conditionals \(C\) in a program | It can be applied to individual methods, functions, classes, modules or files. It can be represented by a 3-D vector <A,B,C> or as a scalar value (the magnitude of the vector). https://wiki.c2.com/?AbcMetric | :x: | | *Bugs per line of code* | | | :x: | | *Cohesion* | It's the degree to which the elements inside a module belong together | In one sense, it is a measure of the strength of relationship between the methods and data of a class and some unifying purpose or concept served by that class. In another sense, it is a measure of the strength of relationship between the class's methods and data themselves | :x: | | *Comment density* | It's the percentage of comment lines in a given source code base | Comment lines divided by total lines of code | :x: | | *Connascence* | Two components are connascent if a change in one would require the other to be modified in order to maintain the overall correctness of the system. It allows reasoning about the complexity caused by dependency relationships in object-oriented design much like coupling did for structured design | In addition to allowing categorization of dependency relationships, it also provides a system for comparing different types of dependency | :x: | | *Coupling*| It's the degree of interdependence between software modules, a measure of how closely connected two routines or modules are, the strength of the relationships between modules | Low coupling often correlates with high cohesion, and vice versa. Low coupling is often thought to be a sign of a well-structured computer system and a good design, and when combined with high cohesion, supports the general goals of high readability and maintainability | :x: | | *Cyclomatic complexity* | It's used to indicate the complexity of a program | It is a quantitative measure of the number of linearly independent paths through a program's source code | :heavy_check_mark: | | *Cyclomatic complexity density* | Cyclomatic complexity / Lines of code | | :x: | | *Defect density* | Total number of defects / Size | | :x: | | *Defect potential* | Expected number of defects in a particular component | | :x: | | *Defect removal rate* | It has different definitions. In a general way, it's the number of defects found at the end of an event divided by the defects found before that event | | :x: | | *DSQI (design structure quality index)* | It's an architectural design metric used to evaluate a computer program's design structure and the efficiency of its modules | The result of DSQI calculations is a number between 0 and 1. The closer to 1, the higher the quality. 
It is best used on a comparison basis | :x: | | *Function Points* | It's a unit of measurement to express the amount of business functionality a software provides to a user | Function points are used to compute a functional size measurement (FSM) of software | :x: | | *Instruction path length* | It's the number of machine code instructions required to execute a section of a computer program | The total path length for the entire program could be deemed a measure of the algorithm's performance on a particular computer hardware | :x: | | *Maintainability index* | It's calculated with certain formulae from lines-of-code measures, McCabe measures and Halstead complexity measures | It helps reduce or reverse a system's tendency toward "code entropy" or degraded integrity, and to indicate when it becomes cheaper and/or less risky to rewrite the code than it is to change it | :heavy_check_mark: | | *Program execution time* | | | :x: | | *Program load time* | | | :x: | | *Program size (binary)* | | | :x: | | *Weighted Micro Function Points* | It's a modern software sizing algorithm which is a successor to solid ancestor scientific methods as COCOMO, COSYSMO, maintainability index, cyclomatic complexity, function points, and Halstead complexity | It produces more accurate results than traditional software sizing methodologies, while requiring less configuration and knowledge from the end user, as most of the estimation is based on automatic measurements of an existing source code | :x: | | *CISQ automated quality characteristics measures* | A set of measurement standards for Reliability, Security, Performance Efficiency, and Maintainability chosen by the Consortium for Information & Software Quality | | :x: | | *Cycle time* | It's the time it takes to bring a task or process from start to finish | | :x: | | *First pass yield* | It's the number of units coming out of a process divided by the number of units going into that process over a specified period of time | | :x: | | *Corrective Commit Probability* | It's the probability that a commit reflects corrective maintenance | | :x: | ### Object-oriented metrics A list of common object-oriented software metrics.[^oom] :heavy_check_mark: : already implemented in rca :x: : non implemented in rca | Name | Description | Meaning | What evaluates | | | ---- | ---- | ---- | ---- | ---- | | *Weighted methods per class* (**WMC**) | It's a count of the methods implemented within a class or the sum of the cyclomatic complexities of the methods | It's a predictor of how much time and effort is required to develop and mantain the class. A class with a lot of methods has more impact on children classes but, since it's very specific, its reuse is limited | Understandability, Maintainability, Reusability | :x: | | *Response for a Class* (**RFC**) | It's the cardinality of the set of all methods that can be invoked in response to a message to an object of the class or by some method in the class. This includes all methods accessible within the class hierarchy | High number of methods that can be invoked from a class through messages means greater complexity of the class, less understandability of the class and tests more complicated | Understandability, Maintainability, Testability | :x: | | *Lack of Cohesion of Methods* (**LCOM**) | It measures the degree of similarity of methods by data input variables or attributes (structural properties of classes). There are at least two different ways of measuring cohesion:[^lcom]<br/> 1. 
Calculate for each data field in the class what percentage of the methods use that data field. Average the percentages then subtract from 100%. Lower percentage means greater cohesion of data and methods in the class.<br/> 2. Methods are more similar if they operate on the same attributes. Count the number of disjoint sets produced from the intersection of the sets of attributes used by the methods | High cohesion indicates good class subdivision. Lack of cohesion or low cohesion increases complexity. Classes with low cohesion could probably be subdivided into two or more subclasses with increased cohesion | Efficiency, Reusability | :x: | | *Coupling Between Object Classes* (**CBO**) | It's a count of the number of other classes to which a class is coupled. It's measured by counting the number of distinct non-inheritance related class hierarchies on which a class depends | Excessive coupling prevents reuse, makes maintanance more difficult and makes the system more complex to understand | Efficiency, Reusability | :x: | | *Depth of Inheritance Tree* (**DIT**) | It's the maximum length from the class node to the root of the inheritance hierarchy tree | A deeper class inherits more methods, making the system more complex. On the other hand inherited methods make the system more reusable. A support metric is the *number of methods inherited* (**NMI**) | Efficiency, Reusability, Understandability, Testability | :x: | | *Number of Children* (**NOC**) | It's the number of immediate subclasses subordinate to a class in the hierarchy | It's an indicator of the potential influence a class can have. If it's high it means possible misuse of subclassing, great reusability and longer tests | Efficiency, Reusability, Testability | :x: | Additional less common OO metrics: [^lcoom] :heavy_check_mark: : already implemented in rca :x: : non implemented in rca | Name | Description | | | --- | --- | --- | | *Tight Class Cohesion* (**TCC**) | An alternative metric for measuring a class cohesion | :x: | | *Loose Class Cohesion* (**LCC**) | An alternative metric for measuring a class cohesion | :x: | | *Average Method Complexity* (**AMC**) | It measures the average method size for each class | :x: | | *Afferent Coupling* (**Ca**) | A measure of how many other classes use the specific class | :x: | | *Efferent Coupling* (**Ce**) | A measure of how many other classes are used by the specific class | :x: | | *Number of Public Methods* (**NPM**) | It counts all the methods which have the public specifier | :x: | | *Data Access Metric* (**DAM**) | The number of attributes declared as private or protected divided by the total number of attributes declared in the class | :x: | | *Measure Of Aggregation* (**MOA**) | It measures the the has-a relationship between attributes at run time | :x: | | *Measure of Functional Abstraction* (**MFA**) | The number of inherited methods of a class divided by the total number of methods which are accessible by member methods of the class | :x: | | *Cohesion Among Methods of Class* (**CAM**) | The relevance between the class methods based on the list of specifications of the methods | :x: | | *Inheritance Coupling* (**IC**) | The total number of super classes to which a given class is coupled | :x: | | *Coupling Between Methods* (**CBM**) | The total count of new or redefined methods to which all the inherited methods are coupled | :x: | | *Average Method Complexity* (**AMC**) | It measures the average method size for each class, where the size of a method is equivalent to the number 
of java binary codes in the method | :x: | | *Number of Assertions per KLOC* (**NAK**) | It counts the number of declarations per KLOC at run time | :x: | | *Number Of Fields* (**NOF**) | It counts the number of data declaration used at run time | :x: | | *Number Of Methods* (**NOM**) | It counts the number of object interacted at run time | :x: | | *Number of Static Fields* (**NOSF**) | It counts the number of static attributes in the selected scope | :x: | | *Number of Static Methods* (**NOSM**) | It counts the number of static methods at run time | :x: | | *Number of Test Methods* (**NTM**) | It counts the number of test methods at run time | :x: | ## Market overview ### SonarQube metrics The list of metrics implemented by SonarQube.[^sqm] Metric names beginning with "new" are applied just on new code. :heavy_check_mark: : already implemented in rca :x: : non implemented in rca | Category | Metric | Details | | | ---- | ---- | ---- | ---- | | **Complexity** | *complexity* | It is the Cyclomatic Complexity calculated based on the number of paths through the code. Whenever the control flow of a function splits, the complexity counter gets incremented by one. Each function has a minimum complexity of 1. This calculation varies slightly by language because keywords and functionalities do | :heavy_check_mark: | | | *cognitive_complexity* | How hard it is to understand the code's control flow | :heavy_check_mark: | | **Duplications** | *duplicated_blocks* | Number of duplicated blocks of lines | :x: | | | *duplicated_files* | Number of files involved in duplications | :x: | | | *duplicated_lines* | Number of lines involved in duplications | :x: | | | *duplicated_lines_density* | duplicated_lines / lines * 100 | :x: | | **Issues** | *violations, new_violations* | Total count of issues in all states | :x: | | | *xxx_violations, new_xxx_violations* | Total count of issues of the specified severity, where xxx is one of: blocker, critical, major, minor, info | :x: | | | *false_positive_issues* | Total count of issues marked False Positive | :x: | | | *open_issues* | Total count of issues in Open state | :x: | | | *confirmed_issues* | Total count of issues in Confirmed state | :x: | | | *reopened_issues* | Total count of issues in Reopened state | :x: | | **Maintainability** | *code_smells, new_code_smells* | Total count of Code Smell issues | :x: | | | *Maintainability rating* (formerly the *squale_rating*) | Rating given to your project related to the value of your Technical Debt Ratio. The default maintainability rating grid is: A=0-0.05, B=0.06-0.1, C=0.11-0.20, D=0.21-0.5, E=0.51-1 | :heavy_check_mark: | | | *Technical debt* (*sqale_index, new_technical_debt*) | Effort to fix all Code Smells. The measure is stored in minutes in the database. An 8-hour day is assumed when values are shown in days | :x: | | | *Technical Debt Ratio* (*sqale_debt_ratio, new_sqale_debt_ratio*) | Ratio between the cost to develop the software and the cost to fix it. Remediation cost / Development cost or Remediation cost / (Cost to develop 1 line of code * Number of lines of code) The value of the cost to develop a line of code is 0.06 days | :x: | | **Quality Gates** | *Quality Gate Status* (*alert_status*) | State of the Quality Gate associated to your Project. 
Possible values are : ERROR, OK | :x: | | | *Quality Gate Details* (*quality_gate_details*) | For all the conditions of your Quality Gate, you know which condition is failing and which is not | :x: | | **Reliability** | *bugs, new_bugs* | Number of bug issues | | | *reliability_rating* | A = 0 Bugs<br/>B = at least 1 Minor Bug<br/>C = at least 1 Major Bug<br/>D = at least 1 Critical Bug<br/>E = at least 1 Blocker Bug | :x: | | | *reliability_remediation_effort, new_reliability_remediation_effort* | Effort to fix all bug issues. The measure is stored in minutes in the DB. An 8-hour day is assumed when values are shown in days | :x: | | **Security** | *vulnerabilities, new_vulnerabilities* | Number of vulnerability issues | :x: | | | *security_rating* | A = 0 Vulnerabilities<br/>B = at least 1 Minor Vulnerability<br/>C = at least 1 Major Vulnerability<br/>D = at least 1 Critical Vulnerability<br/>E = at least 1 Blocker Vulnerability | :x: | | | *security_remediation_effort, new_security_remediation_effort* | Effort to fix all vulnerability issues. The measure is stored in minutes in the DB. An 8-hour day is assumed when values are shown in days | :x: | | | *security_hotspots, new_security_hotspots* | Number of Security Hotspots | :x: | | | *security_review_rating, new_security_review_rating* | The Security Review Rating is a letter grade based on the percentage of Reviewed (Fixed or Safe) Security Hotspots.<br/>A = >= 80%<br/>B = >= 70% and <80%<br/>C = >= 50% and <70%<br/>D = >= 30% and <50%<br/>E = < 30% | :x: | | | *security_hotspots_reviewed* | Percentage of Reviewed (Fixed or Safe) Security Hotspots. Number of Reviewed (Fixed or Safe) Hotspots x 100 / (To_Review Hotspots + Reviewed Hotspots) | :x: | | **Size** | *classes* | Number of classes (including nested classes, interfaces, enums and annotations) | :x: | | | *comment_lines* | Number of lines containing either comment or commented-out code.<br/>Non-significant comment lines (empty comment lines, comment lines containing only special characters, etc.) do not increase the number of comment lines | :heavy_check_mark: | | | *comment_lines_density* | Comment lines / (Lines of code + Comment lines) * 100<br/>50%: the number of lines of code equals the number of comment lines<br/>100%: the file only contains comment lines | :x: | | | *directories* | Number of directories | :x: | | | *files* | Number of files | :x: | | | *lines* | Number of physical lines (number of carriage returns) | :heavy_check_mark: | | | *ncloc* | Number of physical lines that contain at least one character which is neither a whitespace nor a tabulation nor part of a comment | :x: | | | *ncloc_language_distribution* | Non commenting lines of code distributed by language | :x: | | | *functions* | Number of functions. 
Depending on the language, a function is either a function or a method or a paragraph | :heavy_check_mark: | | | *projects* | Number of projects in a Portfolio | :x: | | | *statements* | Number of statements | :heavy_check_mark: | | **Tests** | *branch_coverage, new_branch_coverage* | (CT + CF) / (2 * B)<br/>CT = conditions that have been evaluated to 'true' at least once<br/>CF = conditions that have been evaluated to 'false' at least once<br/>B = total number of conditions | :x: | | | *branch_coverage_hits_data* | List of covered conditions | :x: | | | *conditions_by_line* | Number of conditions by line | :x: | | | *covered_conditions_by_line* | Number of covered conditions by line | :x: | | | *coverage, new_coverage* | (CT + CF + LC)/(2 * B + EL)<br/>CT = conditions that have been evaluated to 'true' at least once<br/>CF = conditions that have been evaluated to 'false' at least once<br/>LC = covered lines = linestocover - uncovered_lines<br/>B = total number of conditions<br/>EL = total number of executable lines (lines_to_cover) | :x: | | | *line_coverage, new_line_coverage* | (lines_to_cover - uncovered_lines) / lines_to_cover | :x: | | | *coverage_line_hits_data* | List of covered lines | :x: | | | *lines_to_cover, new_lines_to_cover* | Number of lines of code which could be covered by unit tests (blank lines or full comments lines are not considered) | :x: | | | *skipped_tests* | Number of skipped unit tests | :x: | | | *uncovered_conditions, new_uncovered_conditions* | Number of conditions which are not covered by unit tests | :x: | | | *uncovered_lines, new_uncovered_lines* | Number of lines of code which are not covered by unit tests | :x: | | | *tests* | Number of unit tests | :x: | | | *test_execution_time* | Time required to execute all the unit tests | :x: | | | *test_errors* | Number of unit tests that have failed | :x: | | | *test_failures* | Number of unit tests that have failed with an unexpected exception | :x: | | | *test_success_density* | (Unit tests - (Unit test errors + Unit test failures)) / Unit tests * 100 | :x: | ### Visual Studio metrics The list of metrics implemented by Visual Studio.[^vsm] :heavy_check_mark: : already implemented in rca :x: : non implemented in rca | Metric | Description | Details | | | --- | --- | --- | --- | | *Maintainability Index*[^vsmi] | Calculates an index value between 0 and 100 that represents the relative ease of maintaining the code. A high value means better maintainability | MAX(0,(171 - 5.2 * ln(Halstead Volume) - 0.23 * (Cyclomatic Complexity) - 16.2 * ln(Lines of Code)) * 100 / 171)<br/>0-9 = Red<br/>10-19 = Yellow<br/>20-100 = Green | :heavy_check_mark: | | *Cyclomatic Complexity* | Measures the structural complexity of the code. A program that has complex control flow requires more tests to achieve good code coverage and is less maintainable | It is created by calculating the number of different code paths in the flow of the program | :heavy_check_mark: | | *Depth of Inheritance* | Indicates the number of different classes that inherit from one another, all the way back to the base class. A low value is good and a high value is bad | Depth of Inheritance is similar to class coupling in that a change in a base class can affect any of its inherited classes. 
The higher this number, the higher the potential for base class modifications to result in a breaking change | :x: |
| *Class Coupling*[^vscc] | Measures the coupling to unique classes through parameters, local variables, return types, method calls, generic or template instantiations, base classes, interface implementations, fields defined on external types, and attribute decoration | Types and methods should have high cohesion and low coupling. High coupling indicates a design that is difficult to reuse and maintain because of its many interdependencies on other types | :x: |
| *Lines of Source code* | Indicates the exact number of source code lines that are present in your source file, including blank lines | | :heavy_check_mark: |
| *Lines of Executable code* | Indicates the approximate number of executable code lines or operations. This is a count of the number of operations in executable code | | :heavy_check_mark: |

### Ndepend

https://www.ndepend.com/docs/code-metrics

# Code Coverage and Code Complexity

An explanation of the mechanisms that relate code coverage to code complexity.

How to realize this method:
1. Take a small C/Rust repository composed of three files
2. Run grcov on it and get a JSON file for code coverage (@Luni-4: look at grcov public APIs)
3. Create an rca API that takes the path to this JSON file and deserializes it
4. Implement one of the algorithms below

## Sifis-Home Mechanism

The method proposed by Luca Ardito and others for Sifis-Home[^sifis]. The method works as follows: the new coverage value of a given block of code is obtained by counting the number of covered lines, then multiplying this number by 2 if the code block complexity exceeds a given threshold, or by 1 if it doesn't. Finally, the global coverage value is obtained by summing all the new coverage values and dividing the sum by the PLOC of the source file.

> There are two versions to try:
> - **plain version**: you give each line the complexity value, then you compute the code coverage by summing the values for the lines covered and the total lines examined
> - **quantized version**: see the paragraph above
> [name=Luca Barbato]

## C.R.A.P

CRAP[^crap] is a metric created by Alberto Savoia and Bob Evans that relates cyclomatic complexity to code path coverage. The basic formula is the following:

$CRAP1(m) = comp(m)^2 * (1 - cov(m)/100)^3 + comp(m)$

where $comp(m)$ is the cyclomatic complexity and $cov(m)$ is the code coverage. It could be possible to implement this metric also for cognitive complexity. Usually, if the CRAP score is bigger than 30 the software is considered "crappy" and more testing or reduced complexity is needed, but this value can vary depending on the language or the size of the code. CRAP was originally implemented for Java[^crap4j], but other implementations already exist, for example for .NET code[^Ndepend].

### Pseudocode

```
// Let's suppose we have already obtained the coverage and the complexity
// cov is expressed as a fraction in [0,1], pow is the power function
CRAP(cov, comp):
    if comp == 0
        return 0
    return comp * comp * pow(1 - cov, 3) + comp
```

## Residual Complexity

Residual Complexity[^residual] is a combination of cyclomatic complexity and code coverage that indicates how well the software is exercised by its tests. The Residual Complexity for a single method can be calculated as follows:

$rc(m) = comp(m) * (1 - cov(m))$

where $comp(m)$ is the code complexity and $cov(m)$ is the code coverage. For classes or similar structures the residual complexity can be calculated by summing up the $rc$ of each single method.
In the paper it is suggested to use these metrics combined:
* Branch Coverage with Cyclomatic Complexity
* Lines of Code with Line Coverage

### Pseudocode

```
// Let's suppose we have already obtained the coverage and the complexity
RC(cov, comp):
    if comp == 0
        return 0
    return comp * (1 - cov)
```

## Skunkscore

SkunkScore[^skunk] is a metric that combines code smells, code coverage and code complexity in order to identify the most complex modules with the least coverage. SkunkScore was designed and implemented for Ruby and uses different Ruby libraries, but it can be implemented for other languages as well with some modifications to the design.

The SkunkScore for a single method is $cost(m)$ when the coverage is 100%, otherwise it can be calculated as follows:

$skunk(m) = cost(m) * (100 - cov(m))$

where:

$cost(m) = sum(smells(m)) + comp(m)/COMPLEXITYFACTOR$

$sum(smells(m))$ is the sum of the costs of all code smells. The cost of each code smell can be set in the following ways:
- Always 1.
- Depending on the time needed to resolve the smell; this is the most difficult approach because a DB with all the times is needed.
- Depending on the code smell severity, for example from 5 to 1: the higher the severity, the higher the value.

$comp(m)$ is the code complexity; the complexity used and suggested in Skunk is the ABC metric.

$COMPLEXITYFACTOR$ is a reduction factor set to an acceptable threshold for the complexity metric. For the ABC metric and the implementation in the Flog module the threshold is set to 25; the value could change depending on the language.

$cov(m)$ is the code coverage, with values in [0,100]. In the implementation of Skunk the type of coverage used is Line Coverage; grcov can be used to obtain such a value.

In case we are not interested in the code smells, a possible way to calculate the cost can be: $cost(m) = comp(m)/COMPLEXITYFACTOR$.

The implementation suggests using the ABC score as the complexity metric; the Ruby module SimpleCov[^simplecov] is used for the coverage.

### Pseudocode

```
// Let's suppose we have already obtained the coverage, the complexity and the code smells
COMPLEXITYFACTOR = 25.0

SKUNK(cov, comp, smells[]):
    if comp == 0
        return 0
    smells_sum = sum(smells)  // sum all smell costs
    cost = smells_sum + comp / COMPLEXITYFACTOR
    if cov == 100
        return cost
    else
        return cost * (100 - cov)
```

# Metric Implementation Feasibility Analysis

A possible algorithm to implement an object-oriented metric for rca could be the following one:
1. Let A and B be two files
2. Construct the concrete syntax tree for A and look for the information necessary for the object-oriented metric. Save that information in a specific structure in some way. Possible info could be:
   - Code lines
   - Code string
   - Space
3. Do the same procedure as 2 for file B
4. Compare the two structures in order to satisfy the metric

The algorithm above can be adapted to N files.

# Common metrics feasibility analysis

## ABC metric :heavy_check_mark:

The metric defines an ABC score as a triplet of values that represent the size of a set of source code statements. It's calculated by counting the number of assignments (A), number of branches (B), and number of conditionals (C) in a program. It can be applied to methods, functions, classes, modules or files. The ABC score is represented by a 3-D vector $ABCvector = <A,B,C>$ or a scalar value $|ABCvector| = sqrt(A^2+B^2+C^2)$. By convention the ABC magnitude value is rounded to the nearest tenth.
ABC scalar scores should not be presented without the accompanying ABC vectors, since the scalar values are not a complete representation of the size.[^abc]

### Pseudocode

Pseudocode for analyzing a C language source file.

```
define A, B, C and M
scan source file
    if found an assignment operator (=, *=, /=, %=, +=, -=, <<=, >>=, &=, |=, ^=)
       or found an increment or a decrement operator (++, --)
        A++
    if found a function call or a goto statement which has a target
       at a deeper level of nesting than the level of the goto
        B++
    if found a conditional operator (<, >, <=, >=, ==, !=)
       or keywords ('else', 'case', 'default', '?')
       or a unary conditional operator
        C++
M = sqrt(A^2 + B^2 + C^2)
```

The implementation for C++ is similar but with a few differences:
- When computing A we should exclude constant declarations and default parameter assignments, and include initializations of a variable or a non-constant class member.
- When computing B we should include class method calls and occurrences of 'new' or 'delete' operators.
- When computing C we should include the keywords 'try' and 'catch'.

References:
- https://en.wikipedia.org/wiki/ABC_Software_Metric

## Bugs per line of code :heavy_check_mark:

As the name suggests, it's the number of bugs divided by the lines of code.

$BugsPerLineOfCode = NumberOfBugs / LinesOfCode$

Bugs can be identified in rca using SonarSource C++ and JavaScript rules:

**C++ bug rules**: https://rules.sonarsource.com/cpp/type/Bug (107 rules)

**JavaScript bug rules**: https://rules.sonarsource.com/javascript/type/Bug (62 rules)

At the moment SonarSource does not support Rust. Finally, the SLOC, PLOC and LLOC metrics could be used as the LinesOfCode field.

## Cohesion :heavy_check_mark:

It indicates how much the elements inside a module belong together. It can be a measure of the strength of the relationship between the methods and data of a class and some unifying purpose or concept served by that class, or a measure of the strength of the relationship between the class's methods and data themselves. High cohesion is better than low cohesion. There are 8 different types of cohesion; from the worst to the best, they are:

***Coincidental cohesion (worst)*** It's when parts of a module are grouped arbitrarily (for example a "Utilities" class).

***Logical cohesion*** It's when parts of a module are grouped because they are logically categorized to do the same thing even though they are different by nature (for example the models, views and controllers in an MVC project).

***Temporal cohesion*** It's when parts of a module are grouped by when they are processed - the parts are processed at a particular time in program execution (for example the functions called after catching an exception that close open files, create an error log, and notify the user).

***Procedural cohesion*** It's when parts of a module are grouped because they always follow a certain sequence of execution (for example a function which checks file permissions and then opens the file).

***Communicational/informational cohesion*** It's when parts of a module are grouped because they operate on the same data (for example a module which operates on the same record of information).

***Sequential cohesion*** It's when parts of a module are grouped because the output from one part is the input to another part, like an assembly line (for example a function which reads data from a file and processes the data).
***Functional cohesion (best)*** It's when parts of a module are grouped because they all contribute to a single well-defined task of the module (for example the lexical analysis of an XML string).

***Perfect cohesion (atomic)*** It's when a module can't be reduced any further.

Cohesion is a property of modules and several metrics try to measure it. The most famous cohesion metric for object-oriented software is LCOM. Its feasibility for rca is discussed below.

References:
- https://en.wikipedia.org/wiki/Cohesion_(computer_science)
- https://dl.acm.org/doi/abs/10.1145/3178461.3178479

## Comment density :heavy_check_mark:

Comment density is assumed to be a good predictor of maintainability. It's the percentage of comment lines in a given source code base:

$CommentDensity(\%) = CommentLines / TotalLines * 100$

In rca it can be implemented using the CLOC and SLOC metrics:

$CommentDensity(\%) = CLOC / SLOC * 100$

References:
- https://www.researchgate.net/publication/221555173_The_Comment_Density_of_Open_Source_Software_Code

## Connascence :heavy_check_mark:

It's a metric that attempts to measure coupling between entities in object-oriented design. It also offers a taxonomy for different types of coupling. Two components are connascent if a change in one would require the other to be modified in order to maintain the overall correctness of the system. Each instance of connascence in a codebase must be considered on three separate axes:
- ***Strength***: Stronger connascences are harder to discover, or harder to refactor.
- ***Degree***: An entity that is connascent with thousands of other entities is likely to be a larger issue than one that is connascent with only a few.
- ***Locality***: Connascent elements that are close together in a codebase are better than ones that are far apart.

Connascences are said to be "static" if they can be found by visually examining the code and "dynamic" if they can only be discovered at runtime.

| Strength | Static connascence type | Description |
| -------- | ----------------------- | --- |
| 1 | *Connascence of name (CoN)* | It's when multiple components must agree on the name of an entity. Method names are an example: if the name of a method changes, callers of that method must be changed. |
| 2 | *Connascence of type (CoT)* | It's when multiple components must agree on the type of an entity. In statically typed languages, the type of method arguments is an example of this form of connascence: if a method changes the type of its argument, callers of that method must be changed. |
| 3 | *Connascence of meaning (CoM) or connascence of convention (CoC)* | It's when multiple components must agree on the meaning of particular values. For example returning integers 0 and 1 to represent false and true. |
| 4 | *Connascence of position (CoP)* | It's when multiple components must agree on the order of values. Positional parameters in method calls are an example. Both caller and callee must agree on the semantics of the first, second, etc. parameters. |
| 5 | *Connascence of algorithm (CoA)* | It's when multiple components must agree on a particular algorithm. Message authentication codes are an example. Both sides of the exchange must implement exactly the same hashing algorithm or the authentication will fail. |

A possible rca implementation should be able to find all the different types of static connascence in a project and compute, for each of them, the three properties: strength, degree and locality, as sketched below.
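To make the three axes concrete, here is a minimal data-model sketch; the enum, struct and field names are hypothetical and not part of any existing rca API.

```rust
// Hypothetical sketch of how a detected instance of static connascence
// could be recorded, together with its three axes.

#[derive(Debug, Clone, Copy)]
enum StaticConnascence {
    Name = 1,      // CoN
    Type = 2,      // CoT
    Meaning = 3,   // CoM / CoC
    Position = 4,  // CoP
    Algorithm = 5, // CoA
}

#[derive(Debug)]
struct ConnascenceInstance {
    kind: StaticConnascence,
    /// Degree: the entities (files, functions, ...) that share the dependency.
    entities: Vec<String>,
    /// Locality proxy: number of distinct files involved (farther apart is worse).
    files: usize,
}

impl ConnascenceInstance {
    /// Strength rank taken from the table above (1 = weakest, 5 = strongest).
    fn strength(&self) -> u8 {
        self.kind as u8
    }

    fn degree(&self) -> usize {
        self.entities.len()
    }
}

fn main() {
    // Example: three callers must agree on a method name (connascence of name).
    let inst = ConnascenceInstance {
        kind: StaticConnascence::Name,
        entities: vec!["a.rs::foo".into(), "b.rs::bar".into(), "c.rs::baz".into()],
        files: 3,
    };
    println!(
        "strength={} degree={} files={}",
        inst.strength(),
        inst.degree(),
        inst.files
    );
}
```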
The three properties could be better visualized on a graph.

References:
- https://en.wikipedia.org/wiki/Connascence
- https://connascence.io/
- https://dzone.com/articles/about-connascence
- https://www.maibornwolff.de/en/blog/connascence-rules-good-software-design

## Coupling :heavy_check_mark:

It's the degree of interdependence between software modules, a measure of how closely connected two routines or modules are and of the strength of the relationships between modules. Low coupling often correlates with high cohesion, and vice versa. Low coupling is often thought to be a sign of a well-structured computer system and a good design. It can be implemented in rca using the CBO metric.

References:
- https://en.wikipedia.org/wiki/Coupling_(computer_programming)#Object-oriented_programming
- https://ieeexplore.ieee.org/abstract/document/731240

## Cyclomatic complexity density :heavy_check_mark:

It's the Cyclomatic Complexity of a module divided by its length in non-comment source lines of code (NCSLOC). A line of code is any line of program text that is not a comment or blank line. This ratio is meant to represent the normalized complexity of a module and hence its likely level of maintenance task difficulty. It could be implemented in rca using the already implemented CC, PLOC (number of instructions; blank lines and comment lines are counted too if less than 25%) and CLOC (comment lines) metrics:

$CyclomaticComplexityDensity = CC / (PLOC - CLOC)$

References:
- https://www.researchgate.net/publication/3187433_Cyclomatic_complexity_density_and_software_maintenance_productivity
- https://softwareengineering.stackexchange.com/questions/290405/is-cyclomatic-complexity-density-a-good-software-quality-metric

## Defect density :heavy_check_mark:

Defect density is the number of confirmed bugs in a software application or module during the period of development, divided by the size of the software. It's usually counted per thousand lines of code, also known as KLOC.

$1 KLOC = 1000 SLOC$

$DefectDensity = NumberOfDefects / KLOC$

$DefectDensity = NumberOfDefects / NumberOfFunctionalAreas$

A functional area can be a function, a module, a class or a component.

SonarSource rules can be used to identify defects for C++ and JavaScript: https://rules.sonarsource.com/

The SLOC, PLOC and LLOC metrics can be used to calculate the KLOC field used in the formula.

References:
- https://www.softwaretestinghelp.com/defect-density/#What_is_Defect_Density
- https://economictimes.indiatimes.com/d/defect-density/profileshow/51442887.cms?from=mdr

## Defect potential :x:

It's the total quantity of bugs or defects that will be found during the development of a software application in five software artifacts: requirements, design, code, documents, and "bad fixes" or secondary defects. It's measured in defects per function point. It can be measured in rca only if we define a defect. The defect count should be incremented every time we discover a defect in a software application. Function point measures are needed too. It seems a metric related to the whole software development process, from the requirements definition to the maintenance of the software product.

References:
- https://rbcs-us.com/resources/articles/basic-library-articles-measuring-defect-potentials-and-defect-removal-efficiency/

## Defect removal rate/efficiency :x:

The percentage of total defects found and removed before software applications are delivered to the customer.
This metric refers to the whole software development process, so it would be hard to implement in a static code analyzer tool like rca. Unless we can find a way to keep track of the same project over time, an implementation of it seems of little use.

References:
- https://rbcs-us.com/resources/articles/basic-library-articles-measuring-defect-potentials-and-defect-removal-efficiency/
- https://www.lawinsider.com/dictionary/defect-removal-rate

## DSQI (design structure quality index) :x:

It's an architectural design metric used to evaluate a computer program's design structure and the efficiency of its modules. It was developed by the United States Air Force Systems Command. The result of DSQI calculations is a number between 0 and 1. The closer to 1, the higher the quality. It is best used on a comparison basis, i.e., with previous successful projects.

First, the following factors must be provided:

| SX | Definition |
| --- | --- |
| S1 | The total number of modules defined in the program architecture |
| S2 | The number of modules whose correct function depends on the source of data input or that produce data to be used elsewhere |
| S3 | The number of modules whose correct function depends on prior processing |
| S4 | The number of database items (includes data objects and all attributes that define objects) |
| S5 | The total number of unique database items |
| S6 | The number of database segments (different records or individual objects) |
| S7 | The number of modules with a single entry and exit (exception processing is not considered to be a multiple exit) |

Once the values are determined for a computer program, the following intermediate values can be computed:

| DX | Name | Definition |
| --- | --- | --- |
| D1 | Program structure | If the architectural design was developed using a distinct method (e.g., data flow-oriented design or object-oriented design), then D1 = 1, otherwise D1 = 0 |
| D2 | Module independence | D2 = 1 - (S2/S1) |
| D3 | Modules not dependent on prior processing | D3 = 1 - (S3/S1) |
| D4 | Database size | D4 = 1 - (S5/S4) |
| D5 | Database compartmentalization | D5 = 1 - (S6/S4) |
| D6 | Module entrance/exit characteristic | D6 = 1 - (S7/S1) |

With these intermediate values determined, the DSQI is computed in the following manner:

$DSQI = sum(w_i * D_i)$

where i = 1 to 6, $w_i$ is the relative weighting of the importance of each of the intermediate values, and $sum(w_i) = 1$. If all $D_i$ are weighted equally, then $w_i = 0.167$.

The value of DSQI for past designs can be determined and compared to a design that is currently under development. If the DSQI is significantly lower than average, further design work and review are indicated. If major changes are to be made to an existing design, the effect of those changes on DSQI can be calculated.

Since this metric focuses more on software design choices than on static code, an implementation in rca may be hard or of little use.

References:
- http://logicalprogram.blogspot.com/p/dsqi.html
- http://groups.umd.umich.edu/cis/tinytools/cis375/f00/dsqi/main.html
- https://en.wikipedia.org/wiki/DSQI
- https://studylib.net/doc/10298989/software-metrics-alex-boughton
- https://home.cs.colorado.edu/~kena/classes/5828/s12/presentation-materials/boughtonalexandra.pdf

## Function Points :x:

It's a "unit of measurement" to express the amount of business functionality an information system (as a product) provides to a user. Defined in 1979 by Allan J. Albrecht, FPs are used to compute a functional size measurement (FSM) of software.
There is currently no ISO-recognized FSM method that includes algorithmic complexity in the sizing result. Some improved variants of FPs are implemented in commercial products:
- Early and easy function points
- Engineering function points
- Bang measure
- Feature points
- Weighted Micro Function Points
- Fuzzy Function Points

### Basic algorithm

1) Unadjusted Function Points (UFP): we count the five parameters below, multiply each count by its weight and then sum the results

$UFP = sum(param_i * W_{param_i})$

| Parameter | Definition | Examples | Low | Avg | High |
| --- | --- | --- | --- | --- | --- |
| External Inputs (EIs) | Number of elementary processes of data or control information that come from outside the application's boundary | Input screens and tables | 3 | 4 | 6 |
| External Outputs (EOs) | Number of elementary processes that generate data or control information sent outside the application's boundary | Output screens and reports | 4 | 5 | 7 |
| External Inquiries (EQs) | Number of elementary processes made up of an input-output combination that results in data retrieval | Prompts and interrupts | 3 | 4 | 6 |
| Internal Logical Files (ILFs) | Number of user-identifiable groups of logically related data or control information maintained within the boundary of the application | Databases and directories | 7 | 10 | 15 |
| External Interface Files (EIFs) | Number of user-recognizable groups of logically related data referenced by the software but maintained within the boundary of another software | Shared databases and shared routines | 5 | 7 | 10 |

2) Complexity Adjustment Factor (CAF): we answer 14 questions with a value between 0 and 5, so we can then compute

$CAF = 0.65 + 0.01 * sum(f_i)$

| Num | Question |
| --- | --- |
| 1 | Reliable backup and recovery required? |
| 2 | Data communication required? |
| 3 | Are there distributed processing functions? |
| 4 | Is performance critical? |
| 5 | Will the system run in an existing heavily utilized operational environment? |
| 6 | Is on-line data entry required? |
| 7 | Does the on-line data entry require the input transaction to be built over multiple screens or operations? |
| 8 | Are the master files updated on line? |
| 9 | Are the inputs, outputs, files or inquiries complex? |
| 10 | Is the internal processing complex? |
| 11 | Is the code designed to be reusable? |
| 12 | Are conversion and installation included in the design? |
| 13 | Is the system designed for multiple installations in different organizations? |
| 14 | Is the application designed to facilitate change and ease of use by the user? |

3) $FP = UFP * CAF$

### Feasibility

The basic algorithm from 1979 is quite general and relies on subjective parameters, so it can't be implemented in rca without allowing the user to input some subjective project parameters. A rough sketch of the arithmetic follows.
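This sketch only illustrates the formulas above; the parameter counts, weights and the 14 answers are hypothetical user-supplied inputs, since they cannot be derived from the source code alone.

```rust
// Basic Function Points computation: UFP = sum(param_i * W_i),
// CAF = 0.65 + 0.01 * sum(f_i), FP = UFP * CAF.

fn function_points(counts: [f64; 5], weights: [f64; 5], answers: [u8; 14]) -> f64 {
    // 1) Unadjusted Function Points
    let ufp: f64 = counts.iter().zip(weights.iter()).map(|(c, w)| c * w).sum();
    // 2) Complexity adjustment factor, each answer f_i is in 0..=5
    let caf = 0.65 + 0.01 * answers.iter().map(|&f| f as f64).sum::<f64>();
    // 3) Adjusted Function Points
    ufp * caf
}

fn main() {
    // Hypothetical EI, EO, EQ, ILF, EIF counts with the "Avg" weights from the table.
    let counts = [10.0, 7.0, 5.0, 3.0, 2.0];
    let weights = [4.0, 5.0, 4.0, 10.0, 7.0];
    let answers = [3u8; 14]; // every question answered with a 3
    println!("FP = {:.2}", function_points(counts, weights, answers));
}
```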
References:
- Wikipedia: https://en.wikipedia.org/wiki/Function_point
- Working calculator: https://w3.cs.jmu.edu/bernstdh/web/common/webapps/oop/fpcalculator/FunctionPointCalculator.html
- Working code: https://www.geeksforgeeks.org/software-engineering-calculation-of-function-point-fp/
- Method explanation:
  - https://www.javatpoint.com/software-engineering-functional-point-fp-analysis
  - https://www.fingent.com/blog/function-point-analysis-introduction-and-fundamentals/
  - https://www.geeksforgeeks.org/software-engineering-functional-point-fp-analysis/
  - https://www.tutorialspoint.com/software_quality_management/software_quality_management_albrechts_function_point_method.htm
  - https://www.tutorialspoint.com/estimation_techniques/estimation_techniques_function_points.htm

## Instruction path length :x:

It's the number of machine code instructions required to execute a section of a computer program. It's used to measure performance on particular computer hardware. It's frequently taken as the number of assembly instructions required to perform a function or a particular section of code. The instruction path length of an assembly language program includes only the code in the executed control flow for the given input and does not include code that is not relevant for that particular input, or unreachable code. Since one statement written in a high-level language can produce a variable number of machine instructions, it is not always possible to determine the instruction path length without, for example, an ***instruction set simulator*** that can count the number of 'executed' instructions during a simulation. If the high-level language supports it and optionally produces an 'assembly list', it is sometimes possible to estimate the instruction path length by examining this list. It's an example of a runtime metric that can't be implemented in rca.

References:
- https://en.wikipedia.org/wiki/Instruction_path_length

## Weighted Micro Function Points :x:

It's a modern software sizing algorithm which is a successor to solid ancestor scientific methods such as COCOMO, COSYSMO, the maintainability index, cyclomatic complexity, function points, and Halstead complexity. It produces more accurate results than traditional software sizing methodologies, while requiring less configuration and knowledge from the end user, as most of the estimation is based on automatic measurements of existing source code. While many ancestor measurement methods use source lines of code (SLOC) to measure software size, WMFP uses a parser to understand the source code, breaking it down into micro functions and deriving several code complexity and volume metrics, which are then dynamically interpolated into a final effort score. The WMFP measured elements are several different software metrics deduced from the source code by the WMFP algorithm analysis. They are represented as a percentage of the whole unit (project or file) effort, and are translated into time.
| Element | Definition | | --- | --- | | Flow complexity (FC) | Measures the complexity of a programs' flow control path in a similar way to the traditional cyclomatic complexity, with higher accuracy by using weights and relations calculation | | Object vocabulary (OV) | Measures the quantity of unique information contained by the programs' source code, similar to the traditional Halstead vocabulary with dynamic language compensation | | Object conjuration (OC) | Measures the quantity of usage done by information contained by the programs' source code | | Arithmetic intricacy (AI) | Measures the complexity of arithmetic calculations across the program | | Data transfer (DT) | Measures the manipulation of data structures inside the program | | Code structure (CS) | Measures the amount of effort spent on the program structure such as separating code into classes and functions | | Inline data (ID) | Measures the amount of effort spent on the embedding hard coded data | | Comments (CM) | Measures the amount of effort spent on writing program comments | The WMFP algorithm uses a 3-stage process: function analysis, APPW transform, and result translation. A dynamic algorithm balances and sums the measured elements and produces a total effort score. The basic formula is: $Σ(WiMi)ΠDq$ M = the source metrics value measured by the WMFP analysis stage W = the adjusted weight assigned to metric M by the APPW model N = the count of metric types i = the current metric type index (iteration) D = the cost drivers factor supplied by the user input q = the current cost driver index (iteration) K = the count of cost drivers This score is then transformed into time by applying a statistical model called average programmer profile weights (APPW) which is a proprietary successor to COCOMO II 2000 and COSYSMO. The resulting time in programmer work hours is then multiplied by a user defined cost per hour of an average programmer, to produce an average project cost, translated to the user currency. I couldn't find a lot of information online about this metric. It seems complex, I think it would be hard to implement it in rca. References: - https://en.wikipedia.org/wiki/Weighted_Micro_Function_Points - http://digilib.stmik-banjarbaru.ac.id/data.bc/21.%20Software%20Engineering/2010%20Capers_Jones%20Software_Engineering_Best_Practices.pdf ## <mark>CISQ automated quality characteristics measures :heavy_check_mark:</mark> CISQ (Consortium for Information & Software Quality) and OMG (Object Management Group) recently defined an Automated Source Code Quality Measures standard for 4 areas: Reliability, Security, Performance Efficiency, and Maintainability. This standard is called ***ISO/IEC 5055:2021*** and it's a list of both IT and embedded specific software weaknesses (CWEs). The CWEs for each characteristic were selected by a team of renowned software engineering experts because of their criticality and measured impact on quality and security. The standard CWEs are a selection of a bigger and very useful MITRES's rules database available online. In rca we could apply some of the rules defined by the ISO/IEC 5055 or even from the bigger MITRE's official database of CWE rules. If applied, those rules could be very useful for counting and highlighting software weaknesses. In rca we could provide a count of weaknesses for each of the four areas and then turn them into comparative measures such as the density of weaknesses. 
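A minimal sketch of the kind of comparative measure meant here; the per-area counts and the KLOC value are hypothetical placeholders, since rca does not currently produce CWE violation counts.

```rust
use std::collections::HashMap;

// Turn per-area weakness counts into a density (weaknesses per KLOC).
fn weakness_density(counts: &HashMap<&str, u32>, kloc: f64) -> HashMap<String, f64> {
    counts
        .iter()
        .map(|(area, n)| (area.to_string(), *n as f64 / kloc))
        .collect()
}

fn main() {
    // Hypothetical counts for the four ISO/IEC 5055 areas.
    let mut counts = HashMap::new();
    counts.insert("Reliability", 12);
    counts.insert("Security", 3);
    counts.insert("Performance Efficiency", 7);
    counts.insert("Maintainability", 25);

    for (area, density) in weakness_density(&counts, 48.5) {
        println!("{area}: {density:.2} weaknesses/KLOC");
    }
}
```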
References:
- CISQ
  - Standard information: https://www.it-cisq.org/standards/code-quality-standards/
  - Standard set of CWEs: https://www.it-cisq.org/coding-rules/index.htm
- CWE
  - MITRE's complete set of CWEs: https://cwe.mitre.org/index.html
- ISO/IEC 5055:2021
  - Official standard document: https://bsol-bsigroup-com.ezproxy.biblio.polito.it/Bibliographic/BibliographicInfoData/000000000030412917

## <mark>Cycle time :x:</mark>

It's a measure of the speed of a software development process. It's a measure of the performance of a team and can be used to improve efficiency. A cycle can be defined in different ways, but a starting and an ending point must be defined. A starting point can coincide with the start of the design or development phase; an ending point is usually the release or delivery of a software product. This metric could be implemented in rca only if we can define a starting and ending point to measure the time between them. A starting point could be a file creation date and the ending point could be its last modified time. However, this metric seems really meant for evaluating a team's efficiency and I believe it should be defined and measured outside rca. Another reason to discard this metric is that it can't be directly computed from a static analysis of the code.

References:
- https://en.wikipedia.org/wiki/Cycle_time_(software)
- https://itsadeliverything.com/lead-time-versus-cycle-time-untangling-the-confusion
- http://theleanthinker.com/2010/04/28/takt-time-cycle-time/
- https://linearb.io/cycle-time/
- https://www.klipfolio.com/blog/cycle-time-software-development
- https://codeclimate.com/blog/software-engineering-cycle-time/

## <mark>First pass yield :x:</mark>

It's the number of units coming out of a process divided by the number of units going into that process over a specified period of time. It's primarily used in manufacturing activities. Its definition for software is not very clear. It could for example be the number of test cases coming out of a software testing process divided by the number of test cases going into the software testing process over a specified period of time. I think this metric won't be very useful for rca for the following reasons:
- It does not have a clear and standard definition
- It is more about work process efficiency than static code analysis

References:
- https://en.wikipedia.org/wiki/First_pass_yield
- https://www.linkedin.com/pulse/quality-indicators-software-testing-factory-ivan-luizio-magalh%C3%A3es

## Corrective Commit Probability :x:

It measures the probability that a commit reflects corrective maintenance. Corrective commits are identified by applying a linguistic model to the commit messages. Lower CCP (higher quality) is associated with smaller files, lower coupling, use of languages like JavaScript and C# as opposed to PHP and C++, fewer developers, lower developer churn, better onboarding, and better productivity.

The language model uses regular expressions to identify the presence of different indicator terms in commit messages (like 'bug', 'fix', 'error' or 'fail'). Let $hr$ be the hit rate (the probability that the model identifies a commit as corrective) and $pr$ be the positive rate, i.e. the true corrective rate in the commits (this is what CCP estimates):

$pr = (hr - Fpr)/(recall - Fpr) = 1.253*hr - 0.053$

$Fpr = 0.042$

$recall = 0.84$

This metric can be computed without statically analyzing any code, so its implementation in rca is discouraged.
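For illustration only, a toy sketch of this estimate; the indicator-term check below is a simplified stand-in for the real regular-expression model in corrective_model.py, and the commit messages are made up.

```rust
// Constants of the linguistic model reported in the paper.
const FPR: f64 = 0.042; // false positive rate
const RECALL: f64 = 0.84; // recall

// Very rough proxy for the linguistic model: does the message contain an
// indicator term?
fn looks_corrective(message: &str) -> bool {
    let msg = message.to_lowercase();
    ["bug", "fix", "error", "fail"].iter().any(|t| msg.contains(t))
}

/// hr = hit rate of the model; the result is the estimated CCP (pr).
fn ccp_from_hit_rate(hr: f64) -> f64 {
    (hr - FPR) / (RECALL - FPR) // roughly 1.253 * hr - 0.053
}

fn main() {
    let commits = [
        "Fix crash when parsing empty files",
        "Add new metric module",
        "Update docs",
        "Bug: off-by-one in line counter",
    ];
    let hits = commits.iter().filter(|m| looks_corrective(m)).count();
    let hr = hits as f64 / commits.len() as f64;
    println!("hit rate = {hr:.2}, estimated CCP = {:.2}", ccp_from_hit_rate(hr));
}
```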
References:
- Paper (21 July 2020): https://arxiv.org/abs/2007.10912
- Language model (file corrective_model.py): https://github.com/evidencebp/commit-classification

## More metrics

***Automated Function Points*** :x:

It's a recent method to automate Function Points counting for transaction-oriented software applications, in particular those with data persistency. The standard ISO/IEC 19515:2019 defines an algorithm for this and provides a list of all the input required for it to work: source code, files that don't belong to the application, libraries that don't belong to the application, data definition files, naming conventions, application boundaries, etc. Since most of the input required for the algorithm to work must come from a user, for now an rca implementation seems very complicated.

References:
- https://bsol-bsigroup-com.ezproxy.biblio.polito.it/Bibliographic/BibliographicInfoData/000000000030378053
- https://www.it-cisq.org/standards/automated-function-points/
- https://www.omg.org/spec/AFP

***Automated Enhancement Points*** :x:

It's a standard that improves Automated Function Points and it's used to measure maintenance and enhancement work performed between two revisions of the software. This improved standard also requires a lot of user input. Although both standards seem very useful, especially for large projects, the amount of user input they require seems too much for a static code analyzer like rca.

References:
- https://www.it-cisq.org/standards/automated-enhancement-points/
- https://www.omg.org/spec/AEP/

# Object-oriented metrics feasibility analysis

## Static code analyzers that produce object-oriented metrics

- JaSoMe: Java Source Metrics https://github.com/rodhilton/jasome
- CK https://github.com/mauricioaniche/ck
- Remote Code Maintainability Analyzer https://github.com/abanza/RemoteCodeAnalyzer
- CSharpMetricsCollectorJava https://github.com/tl182/CSharpMetricsCollectorJava (introduces MOOD metrics)
- jPeek https://github.com/cqfn/jpeek
- LCOM https://github.com/potfur/lcom
- Classes and Metrics (CaM) https://github.com/yegor256/cam

## Weighted methods per class (WMC) :heavy_check_mark:

It's a count of the methods defined in a class, with an optional weight. Usually the weight assigned to each method in the sum is the method's cyclomatic complexity. It represents the complexity of a class as a whole and this measure can be used to indicate the development and maintenance effort for the class.

$WMC = sum(X_i)$

where $X_i$ could be just $1$ or the cyclomatic complexity $CC$ of a class method. rust-code-analysis already has a way to find the cyclomatic complexity of a function. It may be useful to find a way to associate a method to its class so we could sum the cyclomatic complexities of all of a class's methods.

References:
- https://maisqual.squoring.com/wiki/index.php/Weighted_Methods_per_Class
- https://www.aivosto.com/project/help/pm-oo-ck.html

## Response for a Class (RFC) :heavy_check_mark:

It's the number of different methods and constructors invoked by a class. It is calculated by adding the number of methods in the class (not including inherited methods) plus the number of distinct method calls made by the methods in the class (each method call is counted only once even if it is called from different methods). In rca we should keep track of the methods of each class, and for each method we should save all the other methods it calls, as in the sketch below.
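A rough sketch of that bookkeeping, with hypothetical input types that are not part of the current rca API.

```rust
use std::collections::HashSet;

// A class method and the methods it invokes, as collected during parsing.
struct Method {
    name: String,
    calls: Vec<String>, // names of the methods it invokes
}

// RFC = methods declared in the class + distinct methods called by them.
fn rfc(methods: &[Method]) -> usize {
    // Each distinct call is counted once, even if made from several methods.
    let distinct_calls: HashSet<&str> = methods
        .iter()
        .flat_map(|m| m.calls.iter().map(String::as_str))
        .collect();
    methods.len() + distinct_calls.len()
}

fn main() {
    let methods = vec![
        Method { name: "new".into(), calls: vec!["Vec::new".into()] },
        Method { name: "push".into(), calls: vec!["Vec::push".into(), "Vec::new".into()] },
    ];
    let names: Vec<&str> = methods.iter().map(|m| m.name.as_str()).collect();
    println!("RFC of class with methods {:?} = {}", names, rfc(&methods));
}
```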
References:
- https://objectscriptquality.com/docs/metrics/response-for-class

## Lack of Cohesion of Methods (LCOM) :heavy_check_mark:

It's a measure of the cohesion of a class. There are various different versions of the LCOM. The LCOM of a class should be low.

***LCOM4*** measures the number of connected components in a class. Methods A and B are part of the same connected component if they both access the same class-level variable, or A calls B, or B calls A. Inherited methods and empty methods should not be considered.

Various versions of the LCOM metric can be implemented in rca. Modern versions of this metric are preferable because they are more reliable. To compute the LCOM we need to know, for each method of a class, all the attributes of that class accessed by that method and all the other methods called inside it.

References:
- https://objectscriptquality.com/docs/metrics/lack-cohesion-methods-lcom4
- https://dl.acm.org/doi/abs/10.1145/3178461.3178479
- https://blog.ndepend.com/lack-of-cohesion-methods/
- https://www.ijeat.org/wp-content/uploads/papers/v2i3/C1085022313.pdf

## Coupling Between Object Classes (CBO) :heavy_check_mark:

It represents the number of classes coupled to a given class. This coupling can happen through: method calls, class extends, properties or parameters, method arguments or return types, variables in methods. Coupling between classes is required for a system to do useful work, but excessive coupling is bad. At project or package level, this metric provides the average number of classes used per class. In rca we should implement a way to recognize objects of a specific class inside another class. In that way we can analyze the context of an object occurrence and decide if we should increase the CBO metric.

References:
- https://objectscriptquality.com/docs/metrics/coupling-between-object-classes-cbo

## Depth of Inheritance Tree (DIT) :heavy_check_mark:

It measures the maximum length between a node and its root node in a class hierarchy. It's a measure of class complexity: if a class has a high DIT it means it's inheriting a lot, so it's more complex than a low-DIT class. A class with no parents has DIT = 0.

### Pseudocode

We could recursively go up the hierarchy of a class, just like the tool **CSharpMetricsCollectorJava** on GitHub does. A way to associate a class to its parents and to keep track of parent-child relationships is needed.

```
calculateDIT(ClassStruct cs) {
    let parent = cs.getParent();
    if (parent == null) {
        return 0;
    } else {
        return 1 + calculateDIT(parent);
    }
}
```

References:
- **CSharpMetricsCollectorJava**: https://github.com/tl182/CSharpMetricsCollectorJava/blob/dc3bf052723562b3b350b56f73e25fb6a8fb9a22/src/csmc/metrics/CKMetric.java#L61-L63

## Number of Children (NOC) :heavy_check_mark:

It's the number of classes that directly extend a specific class. In rca it could be useful to save in a collection all the classes found by the parser. Each structure in the collection should represent a parsed class and its position in the class hierarchy: name, package/module to which it belongs, parent and children classes.

## <mark>Tight Class Cohesion (TCC) and Loose Class Cohesion (LCC) :heavy_check_mark:</mark>

TCC and LCC are a way to measure class cohesion. For TCC and LCC we only consider visible methods. A method is visible unless it is Private.
A method is also visible if it implements an interface or handles an event. The higher TCC and LCC, the more cohesive and better the class is.

Methods A and B are ***directly connected*** if:
- they both access the same class-level variable, or
- the call trees starting at A and B access the same class-level variable.

For the call trees we consider all procedures inside the class, including private procedures. If a call goes outside the class, we stop following that call branch. When two methods are not directly connected, but they are connected via other methods, we call them ***indirectly connected***. Example: A - B and B - C are direct connections; A is indirectly connected to C (via B).

- $NP$ = maximum number of possible connections = $N(N - 1) / 2$, where $N$ is the number of methods
- $NDC$ = number of direct connections (number of edges in the connection graph)
- $NIC$ = number of indirect connections

$TCC = NDC / NP$

$LCC = (NDC + NIC) / NP$

References:
- https://www.aivosto.com/project/help/pm-oo-cohesion.html#TCC_LCC

## <mark>MOOD and MOOD2 metrics :heavy_check_mark:</mark>

MOOD (Metrics for Object Oriented Design) metrics are designed to provide a summary of the overall quality of an object-oriented project. The original MOOD suite consists of 6 metrics; the MOOD2 metrics were added later. These metrics are expressed as percentages, ranging from 0% (no use) to 100% (maximum use). Grouped by what they check:

- encapsulation:
  - (MHF) Method Hiding Factor = 1 − MethodsVisible
  - (AHF) Attribute Hiding Factor = 1 − AttributesVisible
  - MethodsVisible = sum(MV) / (C − 1) / number of methods, where MV = number of other classes where the method is visible
  - AttributesVisible = sum(AV) / (C − 1) / number of attributes, where AV = number of other classes where the attribute is visible
  - C = number of classes
- inheritance:
  - (MIF) Method Inheritance Factor = inherited methods / total methods in classes
  - (AIF) Attribute Inheritance Factor = inherited attributes / total attributes in classes
- polymorphism:
  - (POF) Polymorphism Factor = overrides / sum for each class(new methods × descendants)
- coupling:
  - (COF) Coupling Factor = actual couplings / maximum possible couplings

References:
- https://www.aivosto.com/project/help/pm-oo-mood.html
- https://www.ercim.eu/publication/Ercim_News/enw23/abreu.html
- https://www.researchgate.net/publication/2611610_The_Design_of_Eiffel_Programs_Quantitative_Evaluation_Using_the_MOOD_Metrics

## Afferent Coupling (Ca) :heavy_check_mark:

A class's afferent coupling is a measure of how many other classes use that class.

### PseudoCode

```
fn afferent_coupling(classes, className) {
    let ca = empty set;
    for class in classes {
        for line in class.getCode() {
            if line contains an instance of className {
                ca.insert(class.className);
            }
        }
    }
    return ca.length;
}
```

## Efferent Coupling (Ce) :heavy_check_mark:

A class's efferent coupling is a measure of how many different classes are used by that class.

### PseudoCode

```
fn efferent_coupling(class, userDefinedClassList) {
    let ce = empty set;
    for line in class.getCode() {
        if line contains an instance of a class defined in userDefinedClassList {
            ce.insert(that class);
        }
    }
    return ce.length;
}
```

## Instability (I) :heavy_check_mark:

The ratio of efferent coupling (Ce) to total coupling (Ce + Ca):

$I = Ce / (Ce + Ca)$

This metric is an indicator of the package's resilience to change. The range for this metric is 0 to 1, with $I=0$ indicating a completely stable package and $I=1$ indicating a completely unstable package.
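As a quick sanity check with made-up numbers: a class that uses three other classes ($Ce = 3$) and is used by one other class ($Ca = 1$) has $I = 3 / (3 + 1) = 0.75$, so it sits toward the unstable end of the range.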
### PseudoCode

```
fn Instability(class, userDefinedClassList) {
    let ce = efferent_coupling(class, userDefinedClassList);
    let ca = afferent_coupling(userDefinedClassList, class.className);
    return ce / (ce + ca);
}
```

## Number of Public Methods (NPM) :heavy_check_mark:

Number of methods that are public.

### PseudoCode

```
fn NPM(classCode) {
    let npm = 0;
    for line in classCode {
        if (line is a method declaration && method is public) {
            npm++;
        }
    }
    return npm;
}
```

## Data Access Metric (DAM) :heavy_check_mark:

The ratio of the number of private (or protected) attributes to the total number of attributes declared in the class. A high value for DAM is desired (range 0 to 1).

$DAM = numberOfPrivateAttributes / numberOfAttributes$

## Measure Of Aggregation (MOA) :heavy_check_mark:

The Measure of Aggregation metric is a count of the number of data declarations whose types are user-defined classes.

## Measure of Functional Abstraction (MFA)

The ratio of the number of methods inherited by a class to the total number of methods accessible by member methods of the class. Constructors and java.lang.Object (as parent) are ignored (range 0 to 1).

$MFA = numberOfInheritedMethods / numberOfMethodsAccessibleByMemberMethods$

## Cohesion Among Methods of Class (CAM) :heavy_check_mark:

The metric is computed as the sum of the number of different parameter types of every method, divided by the product of the number of different parameter types in the whole class and the number of methods (range 0 to 1).

$CAM = sum(differentParameterTypesPerMethod) / (numberOfDifferentParameterTypesInClass \times numberOfMethods)$

## Inheritance Coupling (IC)

Provides the number of parent classes to which a given class is coupled. A class is coupled to its parent class if one of its inherited methods is functionally dependent on the new or redefined methods of the class, i.e. if one of the following conditions is satisfied:
- One of its inherited methods uses a variable (or data member) that is defined in a new/redefined method.
- One of its inherited methods calls a redefined method.
- One of its inherited methods is called by a redefined method and uses a parameter that is defined in the redefined method.

## Coupling Between Methods (CBM)

The total number of new/redefined methods to which all the inherited methods are coupled. There is a coupling when one of the conditions given in the IC metric definition holds.

## Average Method Complexity (AMC) :heavy_check_mark:

Measures the average method size for each class. The size of a method is the number of Java binary codes (bytecode instructions) in the method; the pseudocode below approximates it with lines of code.

### PseudoCode

```
fn AMC(classCode) {
    let methods = get methods code from classCode;
    let methodLocs = 0;
    for methodCode in methods {
        methodLocs += getLocMetric(methodCode);
    }
    return methodLocs / methods.length;
}
```

## Depth of Inheritance :heavy_check_mark:

Depth of Inheritance indicates the number of different classes that inherit from one another, all the way back to the base class. Depth of Inheritance is similar to class coupling in that a change in a base class can affect any of its inherited classes. The higher this number, the deeper the inheritance chain and the more potential for breaking changes in your code when modifying a base class. For Depth of Inheritance, a low value is good and a high value is bad.
### PseudoCode

```
fn DI(class) {
    let DI = 1;
    let parent = class.parent;
    while (parent.className != "Object") {
        parent = parent.parent;
        DI++;
    }
    return DI;
}
```

## Class Coupling :heavy_check_mark:

Class Coupling is a measure of how many classes a single class uses (efferent and afferent couplings). This coupling can occur through method calls, field accesses, inheritance, arguments, return types, and exceptions.

# SonarQube Metrics

@giovannitangredi Same as above, but for SonarQube metrics; also better understand the Skunk score.

## Duplicated Blocks :heavy_check_mark:

SonarQube implements Index-Based Code Clone Detection[^ibccd] for finding duplicated blocks of code.

### Normalization

First, all the statements are normalized to remove unhelpful differences such as comments or different variable names. Normalization example from the paper: the following code

```
if(a==null)
    a.test();
return 0;
```

can be normalized like this

```
if(id0==null)
    id0.id1();
return int;
```

The approach used by SonarQube for normalization is a little different and works as follows:
- Identifiers, nulls, Boolean literals, and keywords are not changed.
- Integer, hexadecimal, float, decimal, and binary literals are normalized to the string "$NUMBER".
- String and char literals are normalized to the string "$CHARS".
- Import and package segments, ";", "{" and "}" are ignored.
- Anything else is not changed.

Example:

```
if(a==null) {
    a.test();
    a = "Hello";
}
return 0;
```

Normalized:

```
if(a==null)
a.test()
a = $CHARS
return $NUMBER
```

### Algorithm

The algorithm uses the Clone Index structure, which is a list of tuples $(file, statementIndex, sequenceHash, info)$.
- $file$ is the name of the file
- $statementIndex$ is the position in the list of normalized statements of a file
- $sequenceHash$ is a hash code for *N* consecutive normalized statements. *N* is a constant usually set to the minimum clone length; it can vary depending on the language and is usually set to 5 or 7.
- $info$ contains additional data not needed by the algorithm but possibly useful, such as the start and end lines of the statement sequence, the position in the file, etc.

For the $sequenceHash$ the authors suggest using the MD5 hash algorithm; Sonar uses the Rabin-Karp rolling hash[^SQhash]; other hash functions such as SHA-256 could also be used.

After the clone index is created, clones can be found by running the following algorithm (⊆~ means "is subsumed by"):

```
function reportClones(filename)
    let f be the list of tuples corresponding to filename,
        sorted by statement index, either read from the index or calculated on the fly
    let c be a list with c(0) = ∅
    for i := 1 to length(f) do
        retrieve tuples with same sequence hash as f(i)
        store this set as c(i)
    for i := 1 to length(c) do
        if |c(i)| < 2 or c(i) ⊆~ c(i − 1) then
            continue with next loop iteration
        let a := c(i)
        for j := i + 1 to length(c) do
            let a' := a ∩ c(j)
            if |a'| < |a| then
                report clones from c(i) to a
                a := a'
            if |a| < 2 or a ⊆~ c(i − 1) then
                break inner loop
```

Sonar's implementation of this algorithm can be found on GitHub[^SQgit]: https://github.com/SonarSource/sonarqube/blob/master/sonar-duplications/src/main/java/org/sonar/duplications/detector/original/OriginalCloneDetectionAlgorithm.java.
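To make the data structures concrete, here is a small hypothetical Rust sketch of building the clone index over already-normalized statements. It is not SonarQube's code: the type and field names are invented and Rust's `DefaultHasher` stands in for the MD5 / Rabin-Karp hashes mentioned above.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// One clone-index entry: (file, statementIndex, sequenceHash) from the paper;
/// the `info` field is omitted for brevity.
struct IndexEntry {
    file: String,
    statement_index: usize,
    sequence_hash: u64,
}

/// Hash every window of `n` (n >= 1) consecutive normalized statements of a
/// file and group the entries by hash.
fn build_clone_index(
    file: &str,
    statements: &[String],
    n: usize,
) -> HashMap<u64, Vec<IndexEntry>> {
    let mut index: HashMap<u64, Vec<IndexEntry>> = HashMap::new();
    for (i, window) in statements.windows(n).enumerate() {
        let mut hasher = DefaultHasher::new();
        window.hash(&mut hasher);
        let h = hasher.finish();
        index.entry(h).or_default().push(IndexEntry {
            file: file.to_string(),
            statement_index: i,
            sequence_hash: h,
        });
    }
    index
}

fn main() {
    // Hypothetical normalized statements, as produced by the step described above.
    let stmts: Vec<String> = [
        "if(a==null)", "a.test()", "a = $CHARS", "if(a==null)", "a.test()", "a = $CHARS",
    ]
    .iter()
    .map(|s| s.to_string())
    .collect();

    let index = build_clone_index("Foo.java", &stmts, 2);
    // Every hash bucket with two or more entries is a clone candidate.
    for entries in index.values().filter(|e| e.len() > 1) {
        let positions: Vec<usize> = entries.iter().map(|e| e.statement_index).collect();
        println!(
            "duplicated window (hash {}) at statement positions {:?} in {}",
            entries[0].sequence_hash, positions, entries[0].file
        );
    }
}
```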
SonarQube's implementation follows the pseudocode closely; the important implementation details are `intersect` and `subsumedBy`:

```
function subsumedBy(blocks1, blocks2, indexCorrection) {
    i = 0;
    j = 0;
    while (i < blocks1.size and j < blocks2.size) {
        // take the i-th and j-th blocks from each blocks list
        block1 = blocks1[i]
        block2 = blocks2[j]
        // compare the resourceIDs of the two blocks and save the result in c
        c = compare(block1.resourceID, block2.resourceID)
        if (c != 0) {
            j++;
            continue;
        }
        c = block1.index - indexCorrection - block2.index
        if (c < 0) { // blocks1[i] < blocks2[j]
            break;
        }
        if (c != 0) { // blocks1[i] != blocks2[j]
            j++;
        }
        if (c == 0) { // blocks1[i] == blocks2[j]
            i++;
            j++;
        }
    }
    return i == blocks1.size;
}
```

`indexCorrection` depends on the distance between the groups being compared.

```
function intersect(blocks1, blocks2) {
    // start with an empty intersection list
    intersection = empty
    i = 0;
    j = 0;
    while (i < blocks1.size and j < blocks2.size) {
        // take the i-th and j-th blocks from each blocks list
        block1 = blocks1[i]
        block2 = blocks2[j]
        // compare the resourceIDs of the two blocks and save the result in c
        c = compare(block1.resourceID, block2.resourceID)
        if (c > 0) {
            j++;
            continue;
        }
        if (c < 0) {
            i++;
            continue;
        }
        if (c == 0) {
            c = block1.index + 1 - block2.index
        }
        if (c == 0) { // blocks1[i] == blocks2[j]
            i++;
            j++;
            // add the block to the intersection list
            intersection.blocks.add(block2);
        }
        if (c > 0) { // blocks1[i] > blocks2[j]
            j++;
        }
        if (c < 0) { // blocks1[i] < blocks2[j]
            i++;
        }
    }
    return intersection;
}
```

`resourceID` is a unique identifier given to a block or clone part. It is a string value and can easily be generated to be unique and sequential.

## Duplicated Files :heavy_check_mark:

Number of files with duplications. It can easily be calculated once all duplicated blocks have been found, or during the process.

## Duplicated Lines :heavy_check_mark:

Number of duplicated lines. It can be obtained from the duplicated blocks if fields such as startLine and endLine are added to the $info$ part of the clone index.

## Duplicated lines density :heavy_check_mark:

Percentage of duplicated lines over all the lines of code: $DLD = NumberOfDuplicatedLines / NumberOfLinesOfCode \times 100$. NumberOfLinesOfCode can be obtained from the LOC metric.

## Code smells :heavy_check_mark:

The number of code smells can be found by using the SonarSource rules for each implemented language. An example for C++: https://rules.sonarsource.com/cpp/type/Code%20Smell. SonarSource still doesn't support Rust. Because there is a large number of code smells for each language, we could implement rules only for the most severe code smells.

## Technical Debt :x:

It is the working time needed to resolve all the Code Smells in the file/project. SonarQube expresses it in minutes, with values such as 1min, 2min, 5min, etc.

## Technical Debt Ratio :x:

Ratio between the cost to fix the software and the cost to develop it: $Technical\ Debt\ Effort / (Cost\ to\ develop\ one\ line\ of\ code \times NumberOfLinesOfCode)$. The cost to develop one line of code is valued at 0.06 days. NumberOfLinesOfCode can be obtained from the LOC metric.

## Quality Gates :x:

A set of conditions based on metrics that define whether the project is ready for release. Each Quality Gate condition is a combination of:
- a measure
- a comparison operator
- an error value

For instance, a condition might be:
- measure: Blocker Bugs
- comparison operator: >
- error value: 0

Which can be stated as: no blocker bugs.

## Bugs :heavy_check_mark:

Like Code Smells, Bugs can be found using the SonarSource rules. An example for C++: https://rules.sonarsource.com/cpp/type/Bug. As with Code Smells, if there are too many rules to implement, only the most severe bugs could be considered.
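If rca were to implement only the most severe rules first, the bookkeeping could be as simple as the following sketch (all type and rule names are invented; the severity levels mirror SonarQube's Info/Minor/Major/Critical/Blocker scale):

```rust
/// Severity levels, ordered from least to most severe.
#[derive(Clone, Copy, PartialEq, Eq, PartialOrd, Ord, Debug)]
enum Severity {
    Info,
    Minor,
    Major,
    Critical,
    Blocker,
}

/// A hypothetical finding produced by a rule check.
struct Finding {
    rule_id: &'static str,
    severity: Severity,
}

/// Count only the findings at or above a chosen severity, as suggested above
/// for keeping the initial rule set small.
fn count_severe(findings: &[Finding], threshold: Severity) -> usize {
    findings.iter().filter(|f| f.severity >= threshold).count()
}

fn main() {
    let findings = vec![
        Finding { rule_id: "rule-a", severity: Severity::Minor },
        Finding { rule_id: "rule-b", severity: Severity::Blocker },
        Finding { rule_id: "rule-c", severity: Severity::Major },
    ];
    // Only Major, Critical and Blocker findings are counted here.
    assert_eq!(count_severe(&findings, Severity::Major), 2);
    for f in findings.iter().filter(|f| f.severity >= Severity::Major) {
        println!("{} ({:?})", f.rule_id, f.severity);
    }
}
```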
## Reliability Rating :heavy_check_mark:

It is a rating given to the code based on the amount and type of bugs found. The SonarQube rating is decided with the following rules:
- A = 0 Bugs
- B = at least 1 Minor Bug
- C = at least 1 Major Bug
- D = at least 1 Critical Bug
- E = at least 1 Blocker Bug

## Reliability Remediation Effort :x:

It is the working time needed to resolve all the Bugs in the file/project. SonarQube expresses it in minutes, with values such as 1min, 2min, 5min, etc.

## Vulnerabilities and Security Hotspots :heavy_check_mark:

Like Bugs and Code Smells, Vulnerabilities and Security Hotspots can be found by applying the SonarSource rules. C++ examples: https://rules.sonarsource.com/cpp/type/Vulnerability

## Number of Classes :heavy_check_mark:

Number of classes (including nested classes, interfaces, structs(?), enums and impl blocks).

## Comments Lines Density :heavy_check_mark:

$CLD = CLOC / LOC \times 100$

## Number of Directories :heavy_check_mark:

Number of directories in the project. It can easily be found by traversing all the directories of the project.

## Number of Files :heavy_check_mark:

Number of files in the project. It can easily be found by traversing all the directories of the project.

## NCLOC :heavy_check_mark:

Non-comment lines of code. $NCLOC = LOC - CLOC$

## Conditions By Line :heavy_check_mark:

Number of conditions by line.

## Number of Tests :heavy_check_mark:

Number of tests in the project.

# Grcov metrics

## Covered Conditions By Line :heavy_check_mark:

Conditions covered by tests, per line. Can be obtained using grcov.

## Branch Coverage :heavy_check_mark:

Branches covered in testing. Can be obtained using grcov.

## Line Coverage :heavy_check_mark:

Lines covered in testing. Can be obtained using grcov.

## Lines To Coverage :heavy_check_mark:

Number of lines of code not yet covered by unit tests (blank lines or comment-only lines are not considered lines to cover). $LTC = LLOC - LinesCovered$

## Test Exec Time :heavy_check_mark:

Time required to execute all tests.

## Test Errors/Failures :heavy_check_mark:

Number of unit tests that have failed.

## Test Success Density :heavy_check_mark:

$TSD = (NumberUnitTests - (UnitTestErrors + UnitTestFailures)) / NumberUnitTests \times 100$

Implementation feasibility analysis for the metrics found in the previous sections.

# Student Notes

## Advice

- For each paper found online, write down the link where you found it.
- It would be better to write the document in English, so anyone can understand it. If that takes you quite some time, use Italian and we will translate it later.
- Useful papers and links:
  * Rust Code Analysis[^rca]
  * Static Metrics
    * https://peerj.com/articles/cs-406/
  * Code Complexity
    * https://www.guru99.com/cyclomatic-complexity.html
    * https://www.sonarsource.com/docs/CognitiveComplexity.pdf
- Useful tools:
  * Tokei: https://github.com/XAMPPRocky/tokei (we could use this tool to implement the tabular output in rca)
  * SonarSource contains a list of metrics that could be implemented in rca: https://github.com/SonarSource
- Pre-tasks to verify whether you know how to implement something in rust-code-analysis. Fix one of the following issues. Add your tag next to the issue you are working on.
  * https://github.com/mozilla/rust-code-analysis/issues/389 @marcoballario
  * https://github.com/mozilla/rust-code-analysis/issues/409
  * https://github.com/mozilla/rust-code-analysis/issues/410 @giovannitangredi

[^rca]: Rust code analysis: https://www.sciencedirect.com/science/article/pii/S2352711020303484
[^cmw]: Wikipedia, Software metric: https://en.wikipedia.org/wiki/Software_metric
[^oom]: Dr. Linda H. Rosenberg, Lawrence E. Hyatt, *Software Quality Metrics for Object Oriented System Environments, A report of SATC's research on OO metrics*: http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.146.4058
[^lcoom]: Ramesh Ponnala, Dr. C. R. K. Reddy, *Object Oriented Dynamic Metrics in Software Development: A Literature Review*: https://www.researchgate.net/publication/346977020_Object_Oriented_Dynamic_Metrics_in_Software_Development_A_Literature_Review
[^lcom]: Lack of Cohesion of Methods: What Is This And Why Should You Care? https://blog.ndepend.com/lack-of-cohesion-methods/
[^sqm]: SonarQube Metric Definitions: https://docs.sonarqube.org/latest/user-guide/metric-definitions/
[^vsm]: Visual Studio, Code metrics values: https://docs.microsoft.com/en-us/visualstudio/code-quality/code-metrics-values?view=vs-2022
[^vsmi]: Visual Studio, Code metrics - Maintainability index range and meaning: https://docs.microsoft.com/en-us/visualstudio/code-quality/code-metrics-maintainability-index-range-and-meaning?view=vs-2022
[^vscc]: Visual Studio, Code metrics - Class coupling: https://docs.microsoft.com/en-us/visualstudio/code-quality/code-metrics-class-coupling?view=vs-2022
[^crap]: https://testing.googleblog.com/2011/02/this-code-is-crap.html
[^crap4j]: http://www.crap4j.org/
[^Ndepend]: https://blog.ndepend.com/crap-metric-thing-tells-risk-code/, another implementation on GitHub: https://github.com/MorganPersson/crap4n
[^residual]: https://www.researchgate.net/publication/319061974_A_Metric_for_Evaluating_Residual_Complexity_in_Software
[^skunk]: https://www.fastruby.io/blog/code-quality/intruducing-skunk-stink-score-calculator.html, implementation for Ruby on GitHub: https://github.com/fastruby/skunk
[^simplecov]: https://github.com/simplecov-ruby/simplecov
[^sifis]: https://www.sifis-home.eu/wp-content/uploads/2021/10/D2.2_SIFIS-Home_v1.0-to-submit.pdf (section 2.4.1)
[^ibccd]: https://www.cqse.eu/fileadmin/content/news/publications/2010-index-based-code-clone-detection-incremental-distributed-scalable.pdf
[^SQhash]: https://en.wikipedia.org/wiki/Rolling_hash#Rabin-Karp_rolling_hash
[^SQgit]: https://github.com/SonarSource/sonarqube