* iComment : Bugs or Bad Comments? *

/* iComment: Bugs or Bad Comments? */

Lin Tan, Ding Yuan, Gopal Krishna, Yuanyuan Zhou

Published in SOSP 2007

Presented by Kevin Boos

In a Nutshell
• iComment: static analysis + NLP
• Detects code-comment mismatches
• Uses both source code and comments

Roadmap
• iComment Paper
  o Motivation
  o Challenges
  o Contributions
  o Approach & Methodology
  o Results
  o Related Work
• Complexity
• Authors’ other works

Motivation
• Software bugs affect reliability.
  o Mismatches between code and developer assumptions

// Caller must acquire lock.
static int reset_hardware(...) {
    // access shared data.
}

static int in2000_bus_reset(...) {
    // No lock is acquired before this call.
    reset_hardware(...);
}

Prevalence of Comments
• Comments = developer assumptions
  o Must hold locks, interrupts must be disabled, etc.
• Other tools do not utilize comments!
  o Ignore valuable information (dev. intentions)

Software   Lines of Code   Lines of Comments
Linux      5 million       1 million
Mozilla    3.3 million     0.5 million

Code vs. Comments

Code                   Comment                Implication
Precise                Imprecise              Comments are harder to analyze.
Can be tested          Can NOT be tested      Software evolution makes comments less reliable.
Harder to understand   Easier to understand   Developers read comments before code. Wrong comments mislead programmers.

• Developer assumptions can’t always be inferred from source code
• Comments and code are redundant
  o or should be…

Inconsistencies
• What’s wrong: comments or code?
  o Developer mistake
  o Out of date
  o Copy and paste error (clone detection)
• Bad code might be bugs
• Bad comments cause future bugs

Challenges
• Parsing and understanding comments
  o Natural language is ambiguous and varying

/* We need to acquire the IRQ lock before calling … */
/* Lock must be acquired on entry to this function. */
/* Caller must hold instance lock! */

• NLP only captures sentence structure
  o No concept of understanding
  o Decent accuracy
  o Comments may be grammar disasters…

Contributions
• First step towards automatically analyzing comments
  o Combines NLP, machine learning, static analysis
• Identifies inconsistent code & comments
• Real-world applicability
  o Discovered 60 new bugs or bad comments
• Only two topics: locks & calls

Approach
• Two types of comments
  o Explanatory: /* set the access flags */
  o Assumptions/Rules: /* don’t call with lock held */
• Check comment rules topic-by-topic
  o General framework
  o Users choose the hot topics

Rule Templates
• <Lock L> must be held before entering <Function F>.
• <Lock L> must NOT be held before entering <Function F>.
• <Lock L> must be held in <Function F>.
• <Lock L> must NOT be held in <Function F>.
• <Function A> must be called from <Function B>.
• <Function A> must NOT be called from <Function B>.

• Other templates exist (see paper)
• User can add more templates
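
The six templates above can be pictured as a small data structure: a template kind plus two parameters. The sketch below is only an illustration of that idea, not the representation used by iComment; the names TemplateKind, Rule, and rule_to_string are hypothetical.

```c
#include <stdio.h>

/* Hypothetical encoding of the six templates listed on the slide. */
typedef enum {
    LOCK_HELD_BEFORE_ENTERING,
    LOCK_NOT_HELD_BEFORE_ENTERING,
    LOCK_HELD_IN,
    LOCK_NOT_HELD_IN,
    CALLED_FROM,
    NOT_CALLED_FROM
} TemplateKind;

/* A rule is a template with both parameters filled in. */
typedef struct {
    TemplateKind kind;
    const char *first;   /* Lock L, or Function A */
    const char *second;  /* Function F, or Function B */
} Rule;

/* Render a rule as the sentence its template stands for.
   Returns the value of snprintf (the length the full sentence needs). */
int rule_to_string(const Rule *rule, char *buf, size_t len) {
    /* One phrase per TemplateKind, in declaration order. */
    static const char *const phrases[] = {
        "must be held before entering",
        "must NOT be held before entering",
        "must be held in",
        "must NOT be held in",
        "must be called from",
        "must NOT be called from"
    };
    return snprintf(buf, len, "%s %s %s.",
                    rule->first, phrases[rule->kind], rule->second);
}
```

Adding a template in this sketch means adding one enum value and one phrase, which mirrors the slide's point that users can extend the template set.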

Handling Comments
• Extract comments
  o NLP, keyword filters, correlated word filters
• Classify comments (rule generation)
  o Manually label small subset
  o Create decision tree with machine learning
  o Decision tree matches comments to templates
  o Fill template parameters with actual variables
• Training is optional for users
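
To make the extract-then-classify pipeline concrete, here is a rough sketch in which a keyword filter is followed by a few hand-written tests. The hand-written tests stand in for the learned decision tree, which in iComment is trained from labeled comments; the cue words and the names CommentClass and classify_lock_comment are assumptions made for illustration only.

```c
#include <ctype.h>
#include <string.h>

typedef enum {
    NOT_A_RULE,
    LOCK_MUST_BE_HELD,
    LOCK_MUST_NOT_BE_HELD
} CommentClass;

static int contains(const char *text, const char *word) {
    return strstr(text, word) != NULL;
}

/* Classify one comment for the "lock" topic.
   Assumption: simple substring cues replace the learned decision tree. */
CommentClass classify_lock_comment(const char *comment) {
    char text[256];
    size_t i = 0;

    /* Lower-case a bounded copy so matching ignores capitalization. */
    for (; comment[i] != '\0' && i + 1 < sizeof text; i++)
        text[i] = (char)tolower((unsigned char)comment[i]);
    text[i] = '\0';

    /* Keyword filter: drop comments that never mention the topic.
       A substring test also matches words such as "block". */
    if (!contains(text, "lock"))
        return NOT_A_RULE;

    /* Tree-like tests: negative cues are checked before positive ones. */
    if (contains(text, "must not") || contains(text, "don't") ||
        contains(text, "do not"))
        return LOCK_MUST_NOT_BE_HELD;
    if (contains(text, "acquire") || contains(text, "hold") ||
        contains(text, "held"))
        return LOCK_MUST_BE_HELD;
    return NOT_A_RULE;
}
```

With these cues, the differently worded lock comments from the Challenges slide land in the same class, while an explanatory comment such as "set the access flags" is filtered out. Filling in the template parameters (which lock, which function) is not attempted here.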

Rule Checker
• Static analysis
  o Flow sensitive and context sensitive
  o Scope of comments
• Display the inconsistencies
  o Sorted by ranking (support probability)
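
As a much-simplified picture of what checking a lock rule involves, the sketch below walks a single, already linearized path of events and counts calls made while the lock is not held. It is not flow sensitive or context sensitive as the real checker is; the Event type and count_unlocked_calls are hypothetical names.

```c
#include <string.h>

typedef enum { EV_ACQUIRE, EV_RELEASE, EV_CALL } EventKind;

typedef struct {
    EventKind kind;
    const char *name;  /* lock name for acquire/release, callee for call */
} Event;

/* Walk one execution path in order and count calls to `callee`
   that happen while `lock` is not held. */
int count_unlocked_calls(const Event *path, size_t n,
                         const char *lock, const char *callee) {
    int held = 0;
    int violations = 0;

    for (size_t i = 0; i < n; i++) {
        const Event *e = &path[i];
        switch (e->kind) {
        case EV_ACQUIRE:
            if (strcmp(e->name, lock) == 0) held = 1;
            break;
        case EV_RELEASE:
            if (strcmp(e->name, lock) == 0) held = 0;
            break;
        case EV_CALL:
            if (strcmp(e->name, callee) == 0 && !held) violations++;
            break;
        }
    }
    return violations;
}
```

A path that contains only a call to reset_hardware would yield one violation of a "lock must be held before entering reset_hardware" rule, while a path that acquires the lock first would yield none.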

Evaluation

Software   SLOC     #Cmts.   Language   Description
Linux      5.0 M    1.0 M    C          OS
Mozilla    3.3 M    0.51 M   C, C++     Browser Suite
Wine       1.5 M    0.22 M   C          Runs Windows Apps in Linux
Apache     0.27 M   0.06 M   C          Web Server

• Four large software projects
• Two topics: locks and function calls
• Average training data: 18%

Results

Software   Mismatches   Bugs      Bad Cmts.   FP   Rules
Linux      51 (14)      30 (11)   21 (3)      32   1209
Mozilla    6 (5)        2 (1)     4 (4)       3    410
Wine       2            1         1           3    149
Apache     1            0         1           0    64
Total      60 (19)      33 (12)   27 (7)      38   1832

(Numbers in parentheses: confirmed by developers.)

• Automatically detected 60 new bugs and bad comments
  o 19 new bugs and bad comments already confirmed by developers
• False positives exist (38%)
  o Incorrectly generated rules
  o Inaccuracy of checking rule
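
The false-positive percentage can be related to the table totals with a line of arithmetic. The sketch assumes the rate is false positives divided by all reports, and that the 60 true mismatches plus the 38 false positives make up every report; the slide does not state the denominator, so treat this as one plausible reading. The function name is hypothetical.

```c
/* Share of all reports that are false positives.
   Assumption: reports = false positives + true mismatches. */
double false_positive_rate(int false_positives, int true_mismatches) {
    int reports = false_positives + true_mismatches;
    if (reports == 0)
        return 0.0;  /* avoid dividing by zero when nothing was reported */
    return (double)false_positives / reports;
}
```

Under that assumption, false_positive_rate(38, 60) is 38/98, a little under 0.39, which is in line with the figure quoted on the slide.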

Training Accuracy
• Accuracy: % of correct mismatches

Software-specific training:

Linux   Mozilla   Wine    Apache
90.8%   91.3%     96.4%   100%

Cross-software training:

Training SW     Mozilla   Wine    Apache
Linux           81.5%     78.6%   83.3%
Linux+Mozilla   -         89.3%   88.9%

Related Work
• Extracting rules from source code
  o iComment employs static analysis but not dynamic traces
• Annotations
  o Poor adoption rates
  o Requires manual effort per comment
• Documentation generation
  o No usage of NLP
  o iComment also analyzes unstructured comments

Complexity
• Detecting inconsistencies
  o NLP
    • Abstracted away by tools
  o Machine learning
    • Simple manual training rules
• Code maintenance
  o Developers may forget to be thorough
• Automatic bug detection
  o Locking errors are extremely complex

Author Bio
• Primary author: Lin Tan
• Improving software reliability
  o Comments
  o Source code
  o Execution traces
  o Manual input
• HotComments: prior ideas paper

Author Bio
• Secondary author: Ding Yuan
• Reliability of large software systems
• Better logging
  o Enhanced output

Author Bio
• Professor: Yuanyuan Zhou
• Better debuggers, software reliability
• Founded PatternInsight

PatternInsight Startup

• http://patterninsight.com/




Conclusion
• Comment-code inconsistencies are bad
  o Poorer software quality and reliability
• First work to automatically analyze comments
  o Uses NLP and static code analysis
• Detected real bugs in Linux/Mozilla
• Manages complexity of code consistency and maintenance
