Static code analysis of Empire Classic

Ajmal Khan, Blekinge Institute of Technology, 371 79 Karlskrona, Sweden, [email protected]
Islam Saeed Elkhalifa, Blekinge Institute of Technology, 371 79 Karlskrona, Sweden, [email protected]
Eleftherios Alveras, Blekinge Institute of Technology, 371 79 Karlskrona, Sweden, [email protected]
Bilal Ilyas, Blekinge Institute of Technology, 371 79 Karlskrona, Sweden, [email protected]
Mubashir Iqbal, Blekinge Institute of Technology, 371 79 Karlskrona, Sweden, [email protected]

ABSTRACT

This report presents the findings of a controlled experiment carried out in an academic setting. The purpose of the experiment was to compare the effectiveness of checklist-based code inspection (CBI) versus automatic tool-based code inspection (TBI). The experiment was required to run on the computer game Empire Classic version 4.2, written in C++, with over 34,000 lines of code. The students, however, were given the freedom to choose which part and how much of the code would be inspected. The checklist and the tool, for CBI and TBI respectively, were also left up to the students. The experiment was carried out by 5 students, divided into two teams, Team A of 2 students and Team B of 3 students. Both teams carried out both CBI and TBI. No defect seeding was done and the number of defects in the game was unknown at the start of the experiment. No significant difference in the number of check violations between the two methods was found. The distributions of the discovered violations did not differ significantly regarding either the severity or the type of the checks. Overall, the students could not come to a clear conclusion that one technique is more effective than the other. This indicates the need for further study.

Key Words: Code inspection, controlled experiment, static testing, tool support, checklist

1. INTRODUCTION

The importance of code inspection is not debatable, but whether these code inspections should be carried out by the scarce resource of human inspectors or by the available automatic tools certainly is. Despite the availability of a large number of automatic tools for inspection, companies perform very little automatic testing [2]. Since inspection is a costly process, it is considered good practice to assign mundane, repetitive tasks to automatic tools instead of human testers. This provides more time for manual testing, which involves creativity [2][3][5].

Many studies [4][6][7][11] have been carried out to compare automatic against manual code inspection. Our study differs in three ways. First, this study does not consider a small piece of code or program for the experiment, since the code base of the target program is larger than 34,000 lines of code. Second, the selection of the tool was also part of the experiment. Third, no checklist was provided to the students; they were asked to devise a suitable checklist for the purpose.

2. SELECTION OF AN AUTOMATIC TOOL FOR CODE INSPECTION

There are many automated tools (both open-source and commercial) available that support static code analysis. Many IDEs (Integrated Development Environments), e.g., Eclipse and Microsoft Visual Studio, also provide basic automated code review functionality [14].

We went through a list of static code analysis tools available on Wikipedia [16]. Initially, we selected some of the tools based on our basic requirements of supported source-code language and supported environments (platforms, IDEs, and compilers). The list of tools we considered is the following:

- Coverity Static Analysis: Identifies security vulnerabilities and code defects in C, C++, C# and Java code.
- CppCheck: An open-source tool that checks for several types of errors, including use of the STL (Standard Template Library).
- CppLint: An open-source tool that checks for compliance with Google's style guide for C++ coding.
- GrammaTech CodeSonar: A source code analysis tool that performs whole-program analysis on C/C++ and identifies programming bugs and security vulnerabilities.
- Klocwork Insight: Provides security vulnerability, defect detection, architectural and build-over-build trend analysis for C, C++, C# and Java.
- McCabe IQ Developers Edition: An interactive, visual environment for managing software quality through advanced static analysis.
- Parasoft C/C++test: A C/C++ tool that provides static analysis, unit testing, code review, and runtime error detection.
- PVS-Studio: A software analysis tool for C/C++/C++0x.
- QA-C/C++: Deep static analysis of C/C++ for quality assurance and guideline enforcement.
- Red Lizard's Goanna: Static analysis of C/C++ for the command line, Eclipse and Visual Studio.

All of these tools are commercial, except CppCheck and CppLint, which are open-source. Of the latter two, we were only able to successfully download and run CppCheck. The main features of CppCheck [1][19] are listed below:

- Array bounds checking for overruns
- Checking for unused functions, variable initialization and memory duplication
- Exception safety checking, e.g., checking the usage of memory allocation and destructors
- Checking for memory and resource leaks

- Checking for invalid usage of STL (Standard Template Library) idioms and functions
- Checking for miscellaneous stylistic and performance errors
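To illustrate the kinds of issues these feature categories target, the following minimal C++ sketch is our own illustration (the identifiers are invented; this is not code from Empire Classic). It contains an array overrun, an uninitialized member variable and a memory leak, all of which fall under the categories listed above.

    #include <cstring>

    class Buffer {
    public:
        Buffer() {}                    // 'size' is never initialized (uninitialized member)
        int size;
    };

    void demo() {
        int values[4];
        for (int i = 0; i <= 4; ++i)   // off-by-one: writes values[4], past the end (array overrun)
            values[i] = i;

        char* text = new char[16];
        std::strcpy(text, "leak");     // 'text' is never deleted (memory leak)
    }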

All of the above-mentioned commercial tools are well renowned and contain more or less the same kind of features. Since we expected to find more defects with commercial tools, we decided to include at least one commercial tool in our TBI activities. Unfortunately, the only commercial tool we were able to acquire and install successfully was Parasoft C/C++test. The features of this tool are described in detail on the tool website [18].

3. SELECTION OF CHECKLIST

The main objective of our experiment was to compare the effectiveness of the two inspection methods, i.e. CBI and TBI. Towards that end, we decided to use the same checks for the checklist as well as for the tool. Doing so gave us a clear measure for comparing the two methods, namely the number of reported violations. The selected checks are listed in Appendix A.

It is worth mentioning that it was not possible to use the number of actual defects as a means of comparison. While some checks prove the existence of a defect, like MRM #8 (see Appendix A), most others provide just an indication. Due to factors such as the absence of sufficient documentation, the lack of feedback from the development team and our limited programming experience, it was in many cases impossible to decide whether a check violation was indeed a defect. Therefore, the comparison was based on the number of violations of the selected checks and not the number of actual defects.
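As an illustration of why a violation of MRM #8 (always provide empty brackets for delete when deallocating arrays) is a defect rather than merely an indication of one, consider the following minimal sketch (our own example, not code from the game): deleting an array with plain delete is undefined behavior in C++, so every violation of this check is a real fault.

    void broken() {
        int* scores = new int[100];   // array allocation with new[]
        delete scores;                // MRM #8 violation: plain delete on an array is undefined behavior
    }

    void fixed() {
        int* scores = new int[100];
        delete[] scores;              // correct: the array form of delete matches new[]
    }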

The tool of our choice implements hundreds of checks. We could only select a few, because of the time constraints imposed on us. An additional constraint was our limited experience with the target programming language (C++), which restricted our ability to apply some of the checks. Therefore, only 29 checks were included in the checklist and, as a result, only those were executed with the tool.

4. SELECTION OF CODE FOR INSPECTION

Since the available documentation did not include details about the program's structure, and information about the most critical parts of the code was nowhere to be found, the decision could only be made based on the code itself.

The team considered only two options, because of time constraints. These options were reverse engineering [17] and code metrics [13]. Due to our lack of proficiency in reverse engineering, we decided on the latter. We picked McCabe complexity [12] (a.k.a. cyclomatic complexity, CC) as the code metric of our choice. This metric enabled us to determine which parts of the code are more complex and, as such, potentially contain more defects. In addition, CC was chosen because it is widely used and because we were able to acquire a tool to automate its computation.
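As a brief reminder of how the metric works (a generic illustration of our own, not tied to the game's code): the cyclomatic complexity of a function is the number of decision points plus one, so the sketch below has CC = 4.

    // CC = number of decision points + 1. This function has three
    // decision points (the loop condition and two 'if's), so its CC is 4.
    int countCriticalShips(const int* hitPoints, int n) {
        int critical = 0;
        for (int i = 0; i < n; ++i) {  // decision point 1 (loop condition)
            if (hitPoints[i] <= 0)     // decision point 2
                continue;              // destroyed ships are ignored
            if (hitPoints[i] < 10)     // decision point 3
                ++critical;
        }
        return critical;
    }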

We used the tool Imagix 4D [15] to compute the complexity of the code base. Even though the tool supports a variety of different metrics, our focus was on the complexity of each function or class method and on the total file complexity, i.e. the sum of the complexities of all the functions in each file.

Our strategy for selecting which parts to inspect was as follows. First, we executed a check with the tool or the checklist. If the number of items¹ to inspect was low (e.g. fewer than 20), the inspection would include all of them. However, if the number of items was too large, we would select those items that exist in either the 10 most complex files (i.e. the 10 files with the highest value of total complexity) or the 15 most complex functions (i.e. the 15 functions with the highest value of CC). In some checks both of the above selections were applied, in order to ensure that the number of items to inspect was appropriate. A list of files and functions ranked according to CC is found in Appendix D.

¹ Items refer to pieces of code, which differ from check to check. In one check, items might be class methods, while in another check they could be function parameters.
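The sampling step itself is simple. The following sketch is purely illustrative (in the experiment the ranking came from the metrics exported by Imagix 4D, not from code like this): given per-function complexity values, the candidate set is just the top-N entries after sorting by CC.

    #include <algorithm>
    #include <cstddef>
    #include <string>
    #include <vector>

    struct FunctionMetric {
        std::string name;   // fully qualified function name
        int cc;             // cyclomatic complexity reported by the metrics tool
    };

    // Return the N most complex functions, i.e. the sampling pool used when a
    // check produced too many items to inspect exhaustively.
    std::vector<FunctionMetric> topByComplexity(std::vector<FunctionMetric> metrics, std::size_t n) {
        std::sort(metrics.begin(), metrics.end(),
                  [](const FunctionMetric& a, const FunctionMetric& b) { return a.cc > b.cc; });
        if (metrics.size() > n)
            metrics.resize(n);
        return metrics;
    }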

5. METHODOLOGY

The research problem, the research questions and the design of the experiment are given in this section of the report.

5.1 Research Problem and Questions

The research problem is: "Is tool-based automatic code inspection as effective as checklist-based manual code inspection?" We could hypothesize in two ways. First, that the results are better when the code inspection is done manually by human beings, without using any automation tool. Second, given all the hype about automation and tools, and the concerns raised about the resources consumed by manual code inspection, we could hypothesize that code inspection is more effective when an automatic tool is used. The research questions and the hypotheses for this experiment are given below.

Research Question 1: How does the use of an automatic tool for code inspection affect the number of reported violations?

Hypothesis H1-0: There is no difference in the number of reported violations between manual and automatic code inspection.

Hypothesis H1-1: More violations are reported in automatic code inspection than in manual code inspection.

Research Question 2: How does the use of an automatic tool for code inspection affect the type of reported violations?

Hypothesis H2-0: There is no difference in the type of the reported violations between manual and automatic code inspection.

5.2 Experimental Design

The researchers used a "one-factor block design with a single blocking variable" [8][9] and the research guidelines of Kitchenham et al. [10] for this experiment.

5.2.1 Phases of the Experiment

The experiment was executed in the following five phases.

Phase 1: This phase comprised three team meetings.

In the first meeting, the students met to download the code of the game and to form the two teams. Additionally, the decision was taken to perform a literature review on checklists and automation tools for code inspections. The tasks were divided and the next meeting was arranged.

In the second meeting, the team discussed the checklists and reached the conclusion that the checklists they had found did not fulfill their needs, and that team A would make a customized checklist and discuss it with team B in the next meeting. Also, the tools for automatic code inspection [16] that were found by team B were discussed, and it was decided that CppCheck and Parasoft C/C++test [16][19][18] would be used.

In the third meeting, the final checklist made by team A was discussed with team B and was modified according to their suggestions. The result of this discussion is included in Appendix A.

Phase 2: In this phase, discussions were held on the code selection. Towards that end, different options were discussed, as mentioned in section 4. The final decision regarding the strategy was taken in our next meeting, after some searching for the code metric of our choice and for a tool that would automate its computation.

Phase 3: During this phase, team A used the checklist and team B used the tool to carry out the inspection. The time allotted for this phase was 7 days.

Phase 4: In the fourth phase, team B used the checklist and team A used the tool to carry out the inspection. The time allotted for this phase was also 7 days.

Phase 5: In the final phase, the two teams regrouped and the data analysis was performed.

5.2.2 Experimental Units

The experimental units were either the whole code base, the 15 most complex functions, or the 10 most complex files. Details are explained in section 4.

5.2.3 Factors

The factor in this experiment is the inspection methodology, which has two alternatives, i.e., checklist-based code inspection (CBI) and tool-based code inspection (TBI).

5.2.4 Blocking Variables

Undesirable variations that could not be eliminated or held constant are called blocking variables [2]. The major blocking variable of this experiment was the number of defects in the code.

5.2.5 Response Variables

The response variable was the effectiveness of each method, measured by the number of violations reported, their severity, as well as their type.

5.2.6 Subjects and Training

Five students of the Computer Science (CS) and Software Engineering (SE) departments of Blekinge Institute of Technology (BTH), Karlskrona, took part in this research as part of a course examination during the months of February and March, 2012. The 5 students were randomly divided into two teams, Team A (Eleftherios Alveras, Mubashir Iqbal, Ajmal Khan) and Team B (Islam Saeed and Bilal Ilyas), in order to perform two code inspection sessions (70 hours each) using both the CBI and TBI techniques, one technique at a time per session. All of the students had previously taken courses in structured and object-oriented programming. To refresh the knowledge of C++ required for this experiment, one of the students, who had good skills in C++ coding and static testing, held two 4-hour sessions with the rest of the students.

Team roles in the inspection process using checklists: Based on the Fagan process for conducting code inspections, different roles must be divided among the inspection team. Here we have two teams, one conducting the inspection process and the other running the tool; after that, they switch roles. The tables below reflect the different roles [20].

5.2.7 Parameters

The important parameters in this experiment are the manual testing skills of the students, the code selection strategy, the tool for the automated code inspection, the checklist, the testing environment, the students' skills in C++ programming, and the available time for the execution of the experiment.

5.2.8 Undesirable Variations

The static testing skills of the students and their coding skills in C++ were the properties of the subjects that might have been a major source of unwanted variation in this experiment. To cope with this, the inspections were carried out by both team A and team B, as mentioned above in phases 3 and 4. Another step taken was the random assignment of the students to the two teams.

6. DATA COLLECTION AND ANALYSIS

During the execution of the inspections, detailed logs were kept of the violations that were found. During phase 5, these logs were analyzed and summarized in tables. The latter are included in this report, in their relevant sections.

7. RESULTS

The statistical analysis of the data collected during this experiment gives the following results.

7.1 Number of reported violations

A total of 128 violations were reported using CBI and a total of 121 violations were reported using TBI. The following table gives a summary of the number of violations found during phase 3 and phase 4 of the experiment.

Table 1: Summary of number of defects found by each team

The graph given below shows a graphical description of the number of violations found using each technique, along with the performance of each team.

Figure 1: Summary of defects found

7.2 Types of reported violations

The following table summarizes the results of the experiment based on the types of the reported violations. The names of the types were derived from the manual of the tool. It is worth noting that the tool's manual is not a public document and cannot be referenced; we were able to access it only after installing the tool.

Table 2: Defects found in each violation type

We can see in the graph given below that MRM dominates all the other violation types. This suggests that the run-time behavior of Empire Classic is unpredictable and probably deviates from the intentions of the game creators.
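Most of these MRM violations come from checks #1 and #2 (see Appendices B and C): classes that declare neither an assignment operator nor a copy constructor, nor a comment explaining their absence. The following sketch, our own example rather than code from the game, shows the shape of class the checks flag and why it is risky.

    // A resource-owning class with no copy constructor and no assignment
    // operator (MRM #1 and #2). The compiler-generated versions copy the raw
    // pointer, so two objects end up deleting the same buffer.
    class Report {
    public:
        Report() : text(new char[256]) {}
        ~Report() { delete[] text; }
        // Missing: Report(const Report&) and Report& operator=(const Report&),
        // or a comment explicitly stating why copying is disallowed.
    private:
        char* text;
    };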

It is worth noting, however, that 0 violations were reported for the PB type. A possible explanation is that, since the game is open-source, many bugs may have been reported by the players of the game and fixed before version 4.2, which is the version used in this experiment.

No object-oriented defects were found in the game. Partly, this is because the object-oriented features used in the code are very scarce. We could observe only a limited use of classes and data encapsulation, while almost no inheritance or polymorphism was used. The team easily came to the conclusion that the code is much more procedural than object-oriented.

Figure 2: Technical types of defects found

The details of the violation types are given in Appendix A.

7.3 Severity of reported violations

As mentioned in previous sections, we also summarized the results from the inspection in terms of severity. The level of severity for each check was derived from the manual of the tool. The results are shown in the following table.

In the above table, it is clear that, since no checks of low severity were selected, no violations of that level were reported. On the other hand, even though four checks were of high severity, no violations were reported for them either. These checks are of multiple types and no sampling was used for them, so the absence of reported violations for the high severity checks was actually a surprise.

By far, most of the checks were assigned medium severity, so it was expected that the number of medium severity violations would be high.

The above table indicates that, even though there were a lot of violations in the inspected parts of the code, only a few of them were of high or highest severity. This suggests that, even though the code can be improved considerably, it is not likely to have disastrous consequences for the target audience, the potential gamers. Even so, it is apparent that code improvement is required, as the highest severity violations can be harmful.

8. DISCUSSION

In this section we summarize the results, answer the research questions, give the limitations of this research and discuss our future research.

Research Question 1: How does the use of an automatic tool for code inspection affect the number of reported violations?

The table given below shows that checklist-based inspection proved better at discovering initialization defects, MISRA defects and violations of coding conventions. Overall, 2.81% more violations were detected using the checklist-based inspection technique than with the tool-based inspection technique.

This result supports the alternative hypothesis that more violations are detected in code inspection done by human code inspectors without the help of an automatic tool; in this experiment, CBI performed somewhat better than TBI.
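For context, the MISRA checks in our list flag constructs such as direct floating-point equality tests (MISRA #4 in Appendix A). A minimal illustration of such a violation and a safer alternative, using identifiers of our own invention, is shown below.

    #include <cmath>

    bool atTarget(double x, double targetX) {
        // Violation of MISRA #4: direct equality test on floating-point values.
        // return x == targetX;

        // Safer form: compare against a tolerance instead.
        return std::fabs(x - targetX) < 1e-9;
    }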

Research Question 2: How does the use of an automatic tool for code inspection affect the type of reported violations?

Hypothesis H2-0 was supported. There is virtually no difference in the type of violations found by the two methods, CBI and TBI. From the table in section 7.2, it is clear that both methods were about equally effective across all the violation types.

9. VALIDITY THREATS

1. Only one out of the 5 students that performed the experiment was competent in C++ before the start of the experiment.

2. Time constraints limited the effectiveness of the training in C++, as mentioned in section 5.2.6. As a result, the technical skills of the team in C++ might have had a negative effect on the quality of the results.

3. No seeding of defects was done, which would have provided a defined target to judge the performance against.

4. The program given had over 34,000 lines of code, which was not suitable for this kind of experiment. Earlier experiments considered smaller code sizes. For example, [10] used two programs, with 58 lines of code and 8 seeded defects and 128 lines of code with 11 seeded defects, respectively.

5. The recommended length of two hours for an inspection meeting could not be followed due to the size of the code.

6. Maturation (learning) effects concern improvement in the performance of the subjects during the experiment.

7. Students as the subjects of the experiment cannot be compared with software engineering professionals. As a result, the total inspection time, the inspected code size and the code selection for inspection do not reflect the real practice of inspection in the industry, as the students' choices were limited by the available expertise of the inspectors.

8. The author of the code was not available and, thus, the code selection may not have been as sophisticated as it could have been. The authors could not do much to mitigate the external threats in particular, and hence the generalizability of this experiment is limited.

10. CONCLUSION AND FURTHER RESEARCH

After the experiment, the students observed that the CBI and the TBI techniques proved roughly equally effective. With respect to the number of reported violations, CBI proved to be slightly superior to TBI. However, the students are not confident that the difference is statistically significant enough to draw solid conclusions. Therefore, the team suggests that further research is needed in order to verify whether CBI can be more effective than TBI. In another experiment, parameters such as the total inspection time could be changed, in order to understand whether the inspection time affects the effectiveness of CBI. For example, given enough time, CBI may be more effective than TBI, but TBI might be more suitable when the inspection time is severely restricted.

Other factors that may have a significant influence are the team's competence in the programming language, as well as with the checklist and the tool. For example, in this experiment, one member of the team had prior experience with automated code analysis, while no member had any experience in applying checklists. Having a member experienced in checklist-based inspection perform CBI might give stronger indications that CBI can be more effective than TBI.

11. REFERENCES

[1] A Survey of Software Tools for Computational Science: http://www.docstoc.com/docs/81102992/A-Survey-of-Software-Tools-for-Computational-Science. Accessed: 2012-05-18.

[2] Andersson, C. and Runeson, P. 2002. Verification and Validation in Industry - A Qualitative Survey on the State of Practice. Proceedings of the 2002 International Symposium on Empirical Software Engineering (Washington, DC, USA, 2002), 37–.

[3] Berner, S. et al. 2005. Observations and lessons learned from automated testing. Proceedings of the 27th International Conference on Software Engineering (New York, NY, USA, 2005), 571–579.

[4] Brothers, L.R. et al. 1992. Knowledge-based code inspection with ICICLE. Proceedings of the Fourth Conference on Innovative Applications of Artificial Intelligence (1992), 295–314.

[5] Fewster, M. and Graham, D. 1999. Software Test Automation. Addison-Wesley Professional.

[6] Gintell, J. et al. 1993. Scrutiny: A Collaborative Inspection and Review System. Proceedings of the 4th European Software Engineering Conference on Software Engineering (London, UK, 1993), 344–360.

[7] Gintell, J.W. et al. 1995. Lessons learned by building and using Scrutiny, a collaborative software inspection system. Computer-Aided Software Engineering, 1995. Proceedings of the Seventh International Workshop on (Jul. 1995), 350–357.

[8] Itkonen, J. et al. 2007. Defect Detection Efficiency: Test Case Based vs. Exploratory Testing. Empirical Software Engineering and Measurement, 2007. ESEM 2007. First International Symposium on (Sep. 2007), 61–70.

[9] Juristo, N. and Moreno, A.M. 2001. Basics of Software Engineering Experimentation. Springer.

[10] Kitchenham, B.A. et al. 2002. Preliminary guidelines for empirical research in software engineering. Software Engineering, IEEE Transactions on. 28, 8 (Aug. 2002), 721–734.

[11] MacDonald, F. and Miller, J. 1998. A Comparison of Tool-Based and Paper-Based Software Inspection. Empirical Softw. Engg. 3, 3 (Sep. 1998), 233–253.

[12] McCabe, T.J. 1976. A Complexity Measure. Software Engineering, IEEE Transactions on. SE-2, 4 (Dec. 1976), 308–320.

[13] Munson, J.C. and Khoshgoftaar, T.M. 1992. The Detection of Fault-Prone Programs. IEEE Trans. Softw. Eng. 18, 5 (May 1992), 423–433.

[14] Wikipedia contributors 2012. Automated code review. Wikipedia, the free encyclopedia. Wikimedia Foundation, Inc.

[15] Wikipedia contributors 2011. Imagix 4D. Wikipedia, the free encyclopedia. Wikimedia Foundation, Inc.

[16] Wikipedia contributors 2012. List of tools for static code analysis. Wikipedia, the free encyclopedia. Wikimedia Foundation, Inc.

[17] Yu, Y. et al. 2005. RETR: Reverse Engineering to Requirements. Reverse Engineering, 12th Working Conference on (Nov. 2005), 234.

[18] Parasoft C++test data sheet: http://www.parasoft.com/jsp/printables/C++TestDataSheet.pdf?path=/jsp/products/cpptest.jsp&product=CppTest

[19] Cppcheck manual: http://cppcheck.sourceforge.net/manual.pdf

[20] Fagan, M.E. 1986. Advances in Software Inspections. Software Engineering, IEEE Transactions on. SE-12, 7 (Jul. 1986), 744–751.

Appendix A - List of checks

The following table includes the checks that were used in TBI as well as in CBI.

Check No. | Check Description | Severity Level

Coding Conventions [CODSTA]
1 | Do not declare the size of an array when the array is passed into a function as a parameter. | 2
2 | Always provide a default branch for switch statements. | 3
3 | Avoid returning handles to class data from member functions. | 3
4 | Bitwise operators, comparison operators, logical operators and the comma operator should be const. | 3
5 | Constructors allowing for conversion should be made explicit. | 1
6 | Conversion operator, operator->, operator() and operator[] should be const. | 3
7 | The condition of an if-statement and the condition of an iteration-statement shall have type bool. | 3
8 | Each operand of the ! operator, the logical && or the logical || operators shall have type bool. | 3

Initialization [INIT]
1 | All member variables should be initialized in the constructor. | 1
2 | Assign to all data members in operator=. | 2
3 | List members in an initialization list in the order in which they are declared. | 3

Memory and Resource Management [MRM]
1 | All classes should contain the assignment operator or an appropriate comment. | 3
2 | All classes should contain the copy constructor or an appropriate comment. | 3
3 | Don't memcpy or memcmp non-PODs. | 3
4 | Always assign a new value to an expression that points to deallocated memory. | 3
5 | Call delete on pointer members in destructors. | 2
6 | Check the return value of new. | 3
7 | Never provide brackets ([]) for delete when deallocating non-arrays. | 3
8 | Always provide empty brackets ([]) for delete when deallocating arrays. | 3

Motor Industry Software Reliability Association [MISRA]
1 | Provisions should be made for appropriate run-time checking. | 5
2 | Assignment operators shall not be used in expressions that yield a Boolean value. | 3
3 | Do not apply arithmetic to pointers that don't address an array or array element. | 3
4 | Floating-point expressions shall not be directly or indirectly tested for equality or inequality. | 3

Object Oriented [OOP]
1 | Do not directly access global data from a constructor. | 1

Possible Bugs [PB]
1 | The definition of a constructor shall not contain default arguments that produce a signature identical to that of the implicitly-declared copy constructor. | 2
2 | Do not call 'sizeof' on constants. | 3
3 | Do not call 'sizeof' on pointers. | 3
4 | A function shall not return a reference. | 3
5 | A function shall not return a pointer. | 3

It is worth noting that both the types and the checks originate from the tool, i.e. they do not necessarily conform to a known standard, like the Common Weakness Enumeration (CWE). However, it is also worth mentioning that, for each check, the tool does provide references. For example, for the check CODSTA #2, the tool includes two references:

1. Ellemtel Coding Standards, http://www.chris-lott.org/resources/cstyle/Ellemtel-rules-mm.html, 14 Flow Control Structures - Rule 48
2. JOINT STRIKE FIGHTER, AIR VEHICLE, C++ CODING STANDARDS, Chapter 4.24 Flow Control Structures, AV Rule 194
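To make CODSTA #2 concrete, the following sketch (our own illustration, not game code) shows the kind of switch statement the check and the referenced rules target: every switch should handle unexpected values explicitly.

    // CODSTA #2: always provide a default branch for switch statements.
    enum Command { kMove, kAttack, kScan };

    int cost(Command c) {
        switch (c) {
        case kMove:   return 1;
        case kAttack: return 3;
        case kScan:   return 2;
        default:      return 0;   // without this branch the check reports a violation
        }
    }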

Appendix B - Checklist final results

Check No. | Severity Level | Violations Reported | File Name (Line No.)

CODSTA - total reported violations: 23
1 | 2 | 0 | -
2 | 3 | 0 | -
3 | 3 | 0 | -
4 | 3 | 6 | types.hpp (116, 162, 172, 238, 679, 723)
5 | 1 | 8 | criteria.hpp (50); gather.cpp (24); island.hpp (59, 60); types.hpp (119, 174, 175, 176)
6 | 3 | 2 | types.hpp (634, 720)
7 | 3 | 5* | fly.cpp (664); services.cpp (369, 384, 649, 725)
8 | 3 | 2* | fly.cpp (664); gather.cpp (844)

INIT - total reported violations: 3
1 | 1 | 3 | report.cpp (18); criteria.hpp (17); types.hpp (356)
2 | 2 | 0 | -
3 | 3 | 0 | -

MRM - total reported violations: 72
1 | 3 | 30 | bomb.cpp, criteria.hpp, gather.cpp, island.hpp, newsman.cpp, radar.cpp, report.cpp, set.hpp, ship.hpp, tdda.hpp, types.hpp (all the classes in these files failed this check)
2 | 3 | 28 | bomb.cpp, criteria.hpp, gather.cpp, island.hpp, newsman.cpp, radar.cpp, report.cpp, set.hpp, ship.hpp, tdda.hpp, types.hpp (all the classes in these files, except class Island and class Region, failed this check)
3 | 3 | 2 | tdda.hpp (38); tdda.cpp (27)
4 | 3 | 3 | fly.cpp (629, 886); newsman.cpp (416)
5 | 2 | 0 | -
6 | 3 | 8 | bomb.cpp (352); criteria.cpp (298); fly.cpp (230); island.cpp (177); newsman.cpp (389, 393, 414); tdda.cpp (13)
7 | 3 | 0 | -
8 | 3 | 1 | tdda.cpp (21)

MISRA - total reported violations: 30
1 | 5 | 12* | bomb.cpp (269, 540, 1124, 1124, 1128, 1129); course.cpp (378); gather.cpp (297, 299); genecis.cpp (320, 325)
2 | 3 | 0* | -
3 | 3 | 0* | -
4 | 3 | 18* | bomb.cpp (748, 961, 1000); course.cpp (58, 63, 402, 404, 406); fly.cpp (94, 131); genecis.cpp (197, 412, 417, 690, 779); ship.cpp (280, 311, 667)

OOP - total reported violations: 0
1 | 1 | 0 | -

PB - total reported violations: 0
1 | 2 | 0 | -
2 | 3 | 0 | -
3 | 3 | 0 | -
4 | 3 | 0*** | -
5 | 3 | 0*** | -

TOTAL: 128

* Sampling used: 10 most complex files.
** Sampling used: 15 most complex functions.
*** Sampling used: 10 most complex files and 15 most complex functions.

Appendix C - Tool final results

Check No. | Severity Level | Violations Reported | File Name (Line No.)

CODSTA - total reported violations: 20
1 | 2 | 0 | -
2 | 3 | 0 | -
3 | 3 | 0 | -
4 | 3 | 2 | types.hpp (679, 723)
5 | 1 | 3 | island.hpp (59, 60); types.hpp (50)
6 | 3 | 4 | types.hpp (238, 634, 720); types.cpp (113)
7 | 3 | 6* | fly.cpp (628); radar.cpp (796); services.cpp (369, 384, 649, 725)
8 | 3 | 5* | bomb.cpp (1335); fly.cpp (664, 885); gather.cpp (845); services.cpp (646)

INIT - total reported violations: 4
1 | 1 | 4 | criteria.cpp (277); report.cpp (18); criteria.hpp (17); types.hpp (382)
2 | 2 | 0 | -
3 | 3 | 0 | -

MRM - total reported violations: 72
1 | 3 | 30 | bomb.cpp, criteria.hpp, gather.cpp, island.hpp, newsman.cpp, radar.cpp, report.cpp, set.hpp, ship.hpp, tdda.hpp, types.hpp (all the classes in these files failed this check)
2 | 3 | 28 | bomb.cpp, criteria.hpp, gather.cpp, island.hpp, newsman.cpp, radar.cpp, report.cpp, set.hpp, ship.hpp, tdda.hpp, types.hpp (all the classes in these files, except class Island and class Region, failed this check)
3 | 3 | 0 | -
4 | 3 | 5 | fly.cpp (629, 886); island.cpp (194, 381); newsman.cpp (416)
5 | 2 | 0 | -
6 | 3 | 8 | bomb.cpp (352); criteria.cpp (298); fly.cpp (230); island.cpp (177); newsman.cpp (389, 393, 414); tdda.cpp (13)
7 | 3 | 0 | -
8 | 3 | 1 | tdda.cpp (21)

MISRA - total reported violations: 25
1 | 5 | 12* | course.cpp (401); genecis.cpp (410, 411); move.cpp (106, 150); radar.cpp (280, 370, 377); ship.cpp (226, 280, 291, 330)
2 | 3 | 0* | -
3 | 3 | 0* | -
4 | 3 | 13* | course.cpp (177, 333, 340, 470); genecis.cpp (279, 616, 618, 631, 643, 654); radar.cpp (554); services.cpp (335); ship.cpp (312)

OOP - total reported violations: 0
1 | 1 | 0 | -

PB - total reported violations: 0
1 | 2 | 0 | -
2 | 3 | 0 | -
3 | 3 | 0 | -
4 | 3 | 0*** | -
5 | 3 | 0*** | -

TOTAL: 121

* Sampling used: 10 most complex files.
** Sampling used: 15 most complex functions.
*** Sampling used: 10 most complex files and 15 most complex functions.


Appendix D - File and function metrics

File metrics

Function metrics

Appendix E - Summarized results of both techniques

Check Type | Check No. | Severity Level | Sampling used | 10 most complex files | 15 most complex functions | Violations reported by manual inspection (checklist) | Violations reported by automated inspection (tool)
CODSTA | 1 | 2 | x | N/A | N/A | 0 | 0
CODSTA | 2 | 3 | x | N/A | N/A | 0 | 0
CODSTA | 3 | 3 | x | N/A | N/A | 0 | 0
CODSTA | 4 | 3 | x | N/A | N/A | 6 | 2
CODSTA | 5 | 1 | x | N/A | N/A | 8 | 3
CODSTA | 6 | 3 | x | N/A | N/A | 2 | 4
CODSTA | 7 | 3 | √ | √ | x | 5 | 6
CODSTA | 8 | 3 | √ | √ | x | 2 | 5
INIT | 1 | 1 | x | N/A | N/A | 3 | 4
INIT | 2 | 2 | x | N/A | N/A | 0 | 0
INIT | 3 | 3 | x | N/A | N/A | 0 | 0
MRM | 1 | 3 | x | N/A | N/A | 30 | 30
MRM | 2 | 3 | x | N/A | N/A | 28 | 28
MRM | 3 | 3 | x | N/A | N/A | 2 | 0
MRM | 4 | 3 | x | N/A | N/A | 3 | 5
MRM | 5 | 2 | x | N/A | N/A | 0 | 0
MRM | 6 | 3 | x | N/A | N/A | 8 | 8
MRM | 7 | 3 | x | N/A | N/A | 0 | 0
MRM | 8 | 3 | x | N/A | N/A | 1 | 1
MISRA | 1 | 5 | √ | √ | x | 12 | 12
MISRA | 2 | 3 | √ | √ | x | 0 | 0
MISRA | 3 | 3 | √ | √ | x | 0 | 0
MISRA | 4 | 3 | √ | √ | x | 18 | 13
OOP | 1 | 1 | x | N/A | N/A | 0 | 0
PB | 1 | 2 | x | N/A | N/A | 0 | 0
PB | 2 | 3 | x | N/A | N/A | 0 | 0
PB | 3 | 3 | x | N/A | N/A | 0 | 0
PB | 4 | 3 | √ | √ | √ | 0 | 0
PB | 5 | 3 | √ | √ | √ | 0 | 0
TOTAL | | | | | | 128 | 121