Alattin: Mining Alternative Patterns for Detecting Neglected Conditions
Suresh Thummalapenta and Tao Xie
Department of Computer Science
North Carolina State University
Raleigh, USA
ASE 2009
This work is supported in part by NSF grant CCF-0725190, ARO grant W911NF-08-1-0443, and ARO grant W911NF-08-1-0105 managed by NCSU Secure Open Source Systems Initiative (SOSI)
Alattin: Motivation
Problem: Programming rules are often not well documented
General solution:
  Mine common patterns across a large number of data points (e.g., code samples)
  Use common patterns as programming rules to detect defects
Challenges addressed by Alattin
Limited data points: existing approaches mine specifications from a few code bases and miss specifications due to a lack of sufficient data points
False positives: existing approaches produce a large number of false positives
Limited Data Points
[Figure: Existing approaches mine patterns from a few code repositories (e.g., Eclipse, Linux, …), which often lack sufficient relevant data points (e.g., API call sites). The Alattin approach uses a code search engine to search code repositories 1, 2, …, N (e.g., open source code on the web) and then mines patterns.]
Large Number of False Positives
Existing approaches produce a large number of false positives
One major observation:
  Programmers often write code in different ways for achieving the same task
  Some ways are more frequent than others
[Figure: Patterns are mined from the frequent ways; when the mined patterns are used to detect violations, the infrequent ways are reported as violations, which are false positives.]
Example: java.util.Iterator.next()
Code Sample 1
void PrintEntries1(ArrayList<String> entries) {
    …
    Iterator it = entries.iterator();
    if (it.hasNext()) {
        String last = (String) it.next();
    }
    …
}
Code Sample 2
void PrintEntries2(ArrayList<String> entries) {
    …
    if (entries.size() > 0) {
        Iterator it = entries.iterator();
        String last = (String) it.next();
    }
    …
}
java.util.Iterator.next() throws NoSuchElementException when invoked on a list without any elements
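The behavior described above can be checked with a small, self-contained sketch. The class and method names here (IteratorGuards, firstWithHasNext, firstWithSize, unguardedThrows) are hypothetical and not from the slides; the two guarded methods only mirror the style of Code Sample 1 and Code Sample 2.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical illustration; names are not from the original slides.
class IteratorGuards {
    // Style of Code Sample 1: check hasNext() before calling next().
    static String firstWithHasNext(List<String> entries) {
        Iterator<String> it = entries.iterator();
        if (it.hasNext()) {
            return it.next();
        }
        return null; // nothing to return for an empty list
    }

    // Style of Code Sample 2: check size() before calling next().
    static String firstWithSize(List<String> entries) {
        if (entries.size() > 0) {
            Iterator<String> it = entries.iterator();
            return it.next();
        }
        return null;
    }

    // Neglected condition: next() is called with no preceding check.
    // Returns true if the call threw NoSuchElementException.
    static boolean unguardedThrows(List<String> entries) {
        try {
            entries.iterator().next();
            return false;
        } catch (NoSuchElementException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        List<String> empty = new ArrayList<>();
        List<String> filled = new ArrayList<>(Arrays.asList("a", "b"));
        System.out.println(firstWithHasNext(empty));
        System.out.println(firstWithSize(empty));
        System.out.println(unguardedThrows(empty));
        System.out.println(firstWithHasNext(filled));
    }
}
```

Both guarded variants are safe on an empty list, which is why a checker that knows only one of them would flag the other incorrectly.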
Example: java.util.Iterator.next()
Among 1243 code examples:
  1218 of 1243 follow Code Sample 1
  6 of 1243 follow Code Sample 2
Mined pattern from existing approaches: “boolean check on return of Iterator.hasNext before Iterator.next”
Example: java.util.Iterator.next()
Require more general patterns (alternative patterns): P1 or P2
  P1: boolean check on return of Iterator.hasNext before Iterator.next
  P2: boolean check on return of ArrayList.size before Iterator.next
Existing approaches cannot mine such patterns, since alternative P2 is infrequent
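To make the difference between a single pattern and an alternative pattern concrete, here is a minimal sketch of violation detection. It assumes each call site is summarized as the set of condition checks seen before Iterator.next; that representation, the class name AlternativePatternCheck, and the violates method are all hypothetical and are not Alattin's actual implementation.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch; not the tool's actual data model.
class AlternativePatternCheck {
    static final String P1 = "boolean check on return of Iterator.hasNext";
    static final String P2 = "boolean check on return of ArrayList.size";

    // A call site violates "A1 or A2 or ..." only if none of the
    // alternatives appears among the checks observed before Iterator.next.
    static boolean violates(Set<String> checksBeforeNext, List<String> alternatives) {
        for (String alternative : alternatives) {
            if (checksBeforeNext.contains(alternative)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        Set<String> sample1 = new HashSet<>(Collections.singleton(P1));
        Set<String> sample2 = new HashSet<>(Collections.singleton(P2));
        Set<String> unguarded = new HashSet<>();

        List<String> onlyP1 = Arrays.asList(P1);
        List<String> p1OrP2 = Arrays.asList(P1, P2);

        // With only P1, the Sample 2 style is reported (a false positive).
        System.out.println(violates(sample2, onlyP1));
        // With "P1 or P2", the Sample 2 style is accepted.
        System.out.println(violates(sample2, p1OrP2));
        // A call site with no check is reported under either pattern.
        System.out.println(violates(unguarded, p1OrP2));
        System.out.println(violates(sample1, onlyP1));
    }
}
```

Treating the pattern as a disjunction keeps real neglected conditions reported while no longer flagging the less common but valid guard.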
Our Solution: ImMiner Algorithm
Mines alternative patterns of the form P1 or P2
Based on the observation that infrequent alternatives such as P2 are frequent among code examples that do not support P1
Among 1243 code examples, 1218 follow Code Sample 1 and 6 follow Code Sample 2:
  P2 is infrequent among the entire set of 1243 code examples
  P2 is frequent among the code examples not supporting P1
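The observation can be illustrated by computing the support of P2 twice: once over all code examples and once over only those that do not support P1. The sketch below is not the ImMiner algorithm itself; it only reproduces the arithmetic behind the slide's counts. It assumes that the 6 examples following Code Sample 2 contain no hasNext check, and that the remaining 19 examples (1243 - 1218 - 6) contain neither check, which the slides do not state. All names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Set;

// Hypothetical sketch of the support computation; not the ImMiner algorithm.
class ImbalancedSupportSketch {
    static final String P1 = "Iterator.hasNext check";
    static final String P2 = "ArrayList.size check";

    // Fraction of examples that contain the given pattern.
    static double support(List<Set<String>> examples, String pattern) {
        if (examples.isEmpty()) {
            return 0.0;
        }
        int count = 0;
        for (Set<String> example : examples) {
            if (example.contains(pattern)) {
                count++;
            }
        }
        return (double) count / examples.size();
    }

    // Examples that do NOT contain the given pattern.
    static List<Set<String>> notSupporting(List<Set<String>> examples, String pattern) {
        List<Set<String>> rest = new ArrayList<>();
        for (Set<String> example : examples) {
            if (!example.contains(pattern)) {
                rest.add(example);
            }
        }
        return rest;
    }

    // Synthetic data with the slide's counts: 1218 with P1, 6 with P2,
    // and (assumption) 19 with neither check.
    static List<Set<String>> slideCounts() {
        List<Set<String>> examples = new ArrayList<>();
        examples.addAll(Collections.nCopies(1218, Collections.singleton(P1)));
        examples.addAll(Collections.nCopies(6, Collections.singleton(P2)));
        examples.addAll(Collections.nCopies(19, Collections.<String>emptySet()));
        return examples;
    }

    public static void main(String[] args) {
        List<Set<String>> all = slideCounts();
        List<Set<String>> withoutP1 = notSupporting(all, P1);
        System.out.println("P2 support overall: " + support(all, P2));
        System.out.println("Examples without P1: " + withoutP1.size());
        System.out.println("P2 support without P1: " + support(withoutP1, P2));
    }
}
```

Under these assumptions, P2 is supported by 6 of 1243 examples overall (under 1%) but by 6 of the 25 examples that lack P1 (24%). Whether that counts as frequent depends on the support threshold used, which the slides do not give.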
Alternative Patterns
ImMiner mines three kinds of alternative patterns of the general form “P1 or P2”
Balanced: all alternatives (both P1 and P2) are frequent
Imbalanced: some alternatives (P1) are frequent and others are infrequent (P2). Represented as “P1 or P^