0-knowledge fuzzing white paper

May 20, 2015

Vincenzo Iozzo

White paper for the 0-knowledge fuzzing presentation given at Black Hat DC 2010 by Vincenzo Iozzo.

0-knowledge fuzzing

Vincenzo Iozzo
[email protected]

February 9, 2010

Abstract

Nowadays fuzzing is a fairly common technique used both by attackers and software developers. Currently known techniques usually involve knowing the protocol/format that needs to be fuzzed and having a basic understanding of how the user input is processed inside the binary. In the past, since fuzzing was little used, obtaining good results with a small amount of effort was possible. Today finding bugs requires digging deep inside the code and the user input, as common vulnerabilities have already been identified and fixed by developers. This paper presents an idea on how to effectively fuzz with no knowledge of the user input or the binary. Specifically, the paper demonstrates how techniques like code coverage, data tainting and in-memory fuzzing make it possible to build a smart fuzzer with no need to instrument it.

1 Introduction

Fuzzing, or fuzz testing, is a software testing methodology whose aim is to provide invalid, unexpected or random inputs to a program. Although the idea behind this technique is conceptually very simple, it is a well known

and widely established methodology employed in the COTS software vulnerability discovery process. The first appearance of fuzzing in software testing dates back to 1988 with Professor Barton Miller[1]; since then the technique has evolved considerably, and it is not only used by attackers to discover vulnerabilities but also internally by many companies to find bugs in their software. Over the course of time many different implementations of fuzz testing have been researched; nonetheless it is commonly believed that there are two predominant approaches to fuzzing: mutation-based and generation-based. The former is based on random mutations of known well-formed data, whereas the latter creates testing samples using templates describing the format of the software input. Both approaches have their advantages and pitfalls. The former requires little effort to implement and is reusable across different software. Nonetheless, given the rising interest companies have shown in properly testing and developing products, this approach will generally yield worse results than generation-based fuzzers. The second approach has the advantage of obtaining better results in terms of bugs found, although it requires knowledge of the input format the binary expects, and its reusability is bounded to binaries that deal with the same input format.
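As a rough illustration of the mutation-based approach, the following sketch derives test cases by flipping random bits in a single known well-formed sample; the sample, flip count and seeding scheme are arbitrary choices for illustration, not taken from the paper:

```python
import random

def mutate(sample, n_flips=4, seed=None):
    """Flip n_flips random bits in a known well-formed sample."""
    rng = random.Random(seed)
    data = bytearray(sample)
    for _ in range(n_flips):
        pos = rng.randrange(len(data))
        data[pos] ^= 1 << rng.randrange(8)  # flip one bit of one byte
    return bytes(data)

# Derive a batch of fuzzing inputs from a single well-formed GIF header.
good = b"GIF89a\x01\x00\x01\x00"
cases = [mutate(good, seed=i) for i in range(100)]
```

Because mutation never consults a format description, the same loop is reusable across targets, which is exactly the reusability advantage noted above.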


The difficulty of creating input models can range from low for public data formats to almost infeasible for proprietary formats. In order to ease the process of creating input templates, various approaches have been studied, most notably evolutionary fuzzers and in-memory fuzzers. Both are derived from mutation-based fuzzers, but for different purposes. The first type of fuzzer, in fact, employs genetic algorithms in an attempt to generate sets of data which resemble the input format as precisely as possible. The latter, instead, first requires a human to manually identify specific functions inside the binary, then mutates the input in memory in order to bypass data validation which could lead to different code paths, thus resulting in crucial pieces of an application not being fuzzed. Evolutionary-based fuzzers suffer from the difficulty of identifying proper scoring and mutation functions, and for this approach to be effective it usually requires more time than the generation-based one. In-memory fuzzing, on the other hand, has a high rate of false positives and negatives, and it requires an expert reverse engineer in order to identify proper test cases.

In this paper the author presents an approach to fuzz testing based on in-memory fuzzing, aiming at limiting human intervention and minimizing the number of false positives and negatives that currently affect this technique. The proposed methodology employs a range of known metrics from both static and dynamic program analysis, together with a new technique for in-memory fuzzing. Specifically, we will use data tainting for tracking user input, thus being able to identify locations in memory suitable for testing; we will also employ static analysis metrics in order to identify functions in the binary that can be interesting from a security testing point of view.

To the best of the author's knowledge there are no public attempts at combining these techniques for fuzz-testing purposes. A notable exception is Flayer, which nonetheless only focuses on dynamic analysis and program path manipulation in order to discover software defects.

The rest of this paper is organized as follows. In Section 2 we provide basic background information on the metrics used. Section 3 discusses related work. Section 4 presents our approach and implementation. Finally, we conclude and discuss future work directions in Section 5.

2 Background

In this section, we present background information on static analysis metrics, data tainting, and in-memory fuzzing. In our implementation we use primarily two static analysis techniques: cyclomatic complexity and loop detection. Cyclomatic complexity is a software metric used to determine how complex a function is in terms of code paths. The computation is done on the number of edges and nodes a function contains. Intuitively, the more complicated the structure of the function, the more complex the function is. In [2] the connection between function complexity and the presence of bugs has been discussed. Although there is not always a correlation between the two, it is reasonable to assume that more complex functions are prone to contain bugs given the amount of code they contain. Another metric employed is loop detection. This algorithm takes advantage of some properties of a function's flowgraph and its dominator tree in order to detect loops present in compiled code.


This technique is widely used in compilers for optimization purposes, and it has some interesting aspects from a security perspective as well. It is commonly known, in fact, that memory writes often happen inside loops and that most compilers usually inline functions like memcpy, so that the function will effectively result in a loop.

Another crucial piece of infrastructure for the proposed fuzzer is the data tainting engine. The goal of data tainting is to gather information on how user input is propagated through a binary. The concept of data tainting is intuitively very simple: one or more markings are associated with some data supposedly representing the user input, and those markings are propagated following the program flow. Although it is possible to perform data tainting using static analysis, the complexity of the task and the possibly incomplete set of information led the author to choose a dynamic analysis approach to the problem, by taking advantage of an existing dynamic data tainting framework called Dytan[6]. Using dynamic data tainting has the benefit of obtaining more precise and richer information on data propagation, although it will not be able to explore program paths that are not executed at run-time. Given the nature of the fuzzer, obtaining information on non-executed code paths is of no interest, as in-memory fuzzing relies on the ability to reach code paths by mutating a set of known good data.

Finally, in order to monitor the effectiveness of our fuzzer, we employ a software testing measure known as code coverage. This technique verifies the degree to which the code of a program has been tested by tracing the execution of the binary. Although there are many different implementations of code coverage, all using different criteria in terms of the kind of information to record, the author decided to implement the technique so that basic block execution is traced. This implementation does not take into account code paths and therefore might be imprecise in some circumstances; nonetheless we consider this trade-off to be acceptable, as it avoids overly complicating the implementation and improves the fuzzer's performance.

3 Related work

In this section we will briefly describe existing approaches to data tainting and in-memory fuzzing, together with a brief description of Flayer[3], the closest work to the one described in this paper.

3.1 Existing in-memory fuzzing implementations

(a) Mutation loop insertion

(b) Snapshot restoration mutation

Figure 1: Known implementations of in-memory fuzzing

To the best of the author's knowledge, in-memory fuzzing was first introduced to the public by Greg Hoglund of HBGary in [4] and later further developed by Amini et al.[5]. Currently there are two public methods: mutation loop insertion and snapshot restoration mutation. The first method works by inserting an unconditional jump from the function being tested to a function responsible for mutating the data residing in the process address space of the fuzzed binary. At the end of the mutation function another unconditional jump to the beginning of the currently tested function is inserted. The control flow graph of this approach is shown in Figure 1(a). This approach suffers from a number of drawbacks, with a high rate of false negatives and stack consumption being the two major ones. Another disadvantage of this method is the general instability of the memory after a few fuzzing iterations. The second approach works by inserting an unconditional jump from the beginning of the function being tested to a function responsible for taking a memory snapshot. This function will later call the tested function again. At the end of the analyzed function another unconditional jump is inserted. The jump points to a function responsible for restoring the memory, fuzzing data and executing the fuzzed function again. A control flow diagram employing this approach is shown in Figure 1(b). Although this method has some advantages with respect to the first one described, it still suffers from a high false positive rate, and it is also slower given the need to continuously restore process memory.

3.2 Existing data tainting implementations

Dynamic data tainting has gained momentum in the last few years given the increasing complexity of software. Many implementations of data tainting frameworks exist; for this reason the author decided to use a framework previously created by James Clause and Alessandro Orso of Gatech called Dytan[6]. The decision was made based on a number of requirements. First and foremost, the ability to instrument binaries without any recompilation or access to the source code. Another very important requirement was portability: most of the existing implementations are based on Valgrind[7], which does not support the Windows platform. The two most appealing candidates were Temu[9] and Dytan. The first one is built on top of a modified version of Qemu[8]. Although this would have respected both of the initial requirements, we think that a data tainting framework based on a virtual machine emulator is overkill for our goals. Besides, the implementation in the author's opinion is not yet robust enough. Dytan is implemented as a pintool[16]. It is a flexible framework and can run on both Linux and Windows.

3.3 Additional related work

As already mentioned in the previous section, Flayer[3] is the work most similar to the approach discussed in this paper. The software combines data tainting and the ability to force code paths. Unlike many other data tainting tools, Flayer has bit-precision markings. Although this grants a higher degree of precision in obtaining information on data propagation, for the purpose of our work byte-precision markings are detailed enough. Another limitation is the software the tool is based on; as already mentioned, Valgrind does not support Windows, which severely impairs the usefulness of the tool. Finally, even if the main aim of the tool is not fuzzing, it has the ability to force code paths and therefore it can be used to test various code paths. This method has three main drawbacks: the first one is a high number of false positives, the second one is the absence of a sample which can later be used by the attacker to reproduce the bug, and finally a problem known as code path explosion. This problem arises because the number of code paths to force increases exponentially with the complexity of the software.

4 Proposed approach and implementation

Figure 2: Fuzzer components

In this section we will present the idea and implementation of our work. As shown in Figure 2, our fuzzer can be divided into four parts.

4.1 Static analysis metrics

Static analysis algorithms are used to determine which functions could potentially be of interest for our fuzzer. We assign a higher score to functions that have a high cyclomatic complexity score and at least one loop in them; we then consider all the functions that have loops but a low cyclomatic complexity score, and finally we take into account the remaining functions. Ideally we will add more metrics to the implementation; therefore this rather trivial scoring system should be replaced by a more sophisticated approach which takes into account scores coming from various metrics and weights them with respect to their relevance from a security perspective.
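The three-tier ranking just described can be sketched as follows; the function names, metric values and complexity threshold are invented for illustration, not taken from the paper:

```python
# Hypothetical per-function metrics as produced by the static analysis pass.
functions = [
    {"name": "parse_header", "cyclomatic": 18, "loops": 2},
    {"name": "log_error",    "cyclomatic": 3,  "loops": 0},
    {"name": "copy_fields",  "cyclomatic": 4,  "loops": 1},
]

COMPLEXITY_THRESHOLD = 10  # assumed cutoff for "high" cyclomatic complexity

def score(fn):
    # Tier 2: high complexity AND loops; tier 1: loops only; tier 0: the rest.
    if fn["loops"] > 0 and fn["cyclomatic"] >= COMPLEXITY_THRESHOLD:
        return 2
    if fn["loops"] > 0:
        return 1
    return 0

ranked = sorted(functions, key=score, reverse=True)
```

A weighted multi-metric score, as suggested above, would replace the three discrete tiers with a weighted sum over all metrics.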

Figure 3: The edge in red is missed by the approximate cyclomatic complexity formula.

Cyclomatic complexity Cyclomatic complexity was first described by Thomas McCabe in [10]. The purpose of this metric is to calculate the number of independent paths in a code section. Many formulations of this metric have been given; we briefly explain the ones that are relevant to our fuzzer.

Definition Let G be a flowgraph, E the number of edges in G, N the number of nodes in G, and P the number of connected components in G. Cyclomatic complexity is defined as:

M = E − N + 2P  (1)

A connected component is a subgraph in which any two vertices are connected to each other by paths. This formula originates from the cyclomatic number:

Definition Let G be a strongly connected graph, E the number of edges in G, N the number of nodes in G, and P the number of connected components in G.


The cyclomatic number is defined as:

V(G) = E − N + P  (2)

It should be noticed that the cyclomatic number can be calculated only on strongly connected graphs, that is, graphs in which for every pair of vertices there is a direct path connecting them in both directions. McCabe proved that the flowgraph of a function with a single entry point and a single exit point can be considered a strongly connected graph, so the cyclomatic number theorem applies with P = 1, and the resulting simplified formula is:

M = E − N + 2  (3)

Intuitively, when a flowgraph has multiple exit points the aforementioned formula doesn't hold true anymore. Another formulation should therefore be used:

Definition Let G be a flowgraph, π the numberof decision points in G and s the number of exitpoints in G. Cyclomatic complexity is definedas:

M = π − s + 2  (4)

Applying (3) to functions with multiple exit points will in fact yield lower cyclomatic complexity values, by a minimum factor of 2. Figure 3 shows typical edges and connected components missed by using (3). Nonetheless the author believes that the less precise measurement can be used without impairing the results. We implemented the cyclomatic complexity calculation for each function in a module by using the BinNavi API. A detailed explanation of the implementation can be found in [11].
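As a small worked example of formula (1), the flowgraph below is a made-up if/else construct with a single connected component, not one of the paper's figures:

```python
def cyclomatic_complexity(edges, nodes, components=1):
    """McCabe's formula (1): M = E - N + 2P."""
    return len(edges) - len(nodes) + 2 * components

# Toy flowgraph of an if/else: entry branches to a or b, both reach exit.
nodes = ["entry", "a", "b", "exit"]
edges = [("entry", "a"), ("entry", "b"), ("a", "exit"), ("b", "exit")]
M = cyclomatic_complexity(edges, nodes)  # 4 - 4 + 2 = 2 independent paths
```

The two independent paths (through `a` and through `b`) match the intuition that one extra decision point adds one to the complexity.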

(a) A function flowgraph; nodes in blue belong to a loop

(b) Dominator tree of the previous function; nodes in green correspond to the blue ones highlighted in picture (a)

(c) The nodes in green dominate the node in red in the dominator tree

Figure 4: Graphs used in the loop detection algorithm

Loop detection algorithm As previously mentioned, another metric, loop detection, is used to select functions. The first required step is to extract the dominator tree of a function. Formally:

Definition A dominator tree is a tree where each node's children are the nodes it immediately dominates. A node d is said to dominate a node k if every path from the start node s to node k must go through node d.

For a visual example of the dominator tree of a function, please refer to Figure 4. The nodes in blue in Figure 4(a) are highlighted in green in the dominator tree in Figure 4(b). There are two known algorithms used to calculate the dominator tree of a flowgraph. It is out of the scope of this paper to discuss them. It should be noticed, though, that the tool upon which we built our loop detection algorithm, BinNavi[12], implements the Lengauer-Tarjan[13] dominator tree algorithm, which is almost linear, thus granting us a higher computational speed. The second step is to calculate the dominators of each node. In Figure 4(c) the dominators of the node in red are the ones in green. The last step is to search for edges from a node to one of its dominators. Recalling the definition of domination, it is trivial to show that if there is an edge from a node to one of its dominators, a loop is present. Most complex assembly instruction sets have what are called implicit loop instructions, for instance rep movs in the x86 ISA. Applying this algorithm to a flowgraph will therefore miss this type of loop. In order to overcome this problem we translate the function to an intermediate language called REIL[14], implemented in BinNavi. This intermediate language provides a very small set of instructions, which helps in the process of unfolding implicit loops. A detailed implementation of this algorithm can be found in [15].
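The dominator-based loop test described above can be sketched as follows. A simple iterative fixed-point computation stands in for the Lengauer-Tarjan algorithm, and the toy flowgraph (with one loop, B -> C -> B) is invented for illustration:

```python
def dominators(succ, entry):
    """Iteratively compute the dominator set of every node in a flowgraph.

    succ maps each node to its list of successors."""
    nodes = set(succ)
    preds = {n: set() for n in nodes}
    for n, targets in succ.items():
        for t in targets:
            preds[t].add(n)
    dom = {n: set(nodes) for n in nodes}
    dom[entry] = {entry}
    changed = True
    while changed:
        changed = False
        for n in nodes - {entry}:
            if not preds[n]:
                continue  # unreachable node; leave its set untouched
            # A node is dominated by itself plus whatever dominates all its
            # predecessors.
            new = {n} | set.intersection(*(dom[p] for p in preds[n]))
            if new != dom[n]:
                dom[n] = new
                changed = True
    return dom

def find_back_edges(succ, entry):
    """A loop exists wherever an edge targets one of its source's dominators."""
    dom = dominators(succ, entry)
    return [(n, t) for n, targets in succ.items() for t in targets if t in dom[n]]

# Toy flowgraph: A -> B -> C, with C looping back to B and exiting to D.
cfg = {"A": ["B"], "B": ["C"], "C": ["B", "D"], "D": []}
back_edges = find_back_edges(cfg, "A")
```

The edge (C, B) is reported because B dominates C; in a binary, the basic block at the head of such a back edge is the loop header.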

4.2 Data tainting

As stated before, the author did not implement the data tainting framework employed by the fuzzer; nonetheless, given the critical importance of data tainting for this project, the author thinks it is important to briefly describe how Dytan works and how we use this framework for our purposes. We previously mentioned that data tainting is a technique to track user input inside a binary. Tracking is usually performed by assigning markings to data while executing the binary.

Each data tainting implementation can choose the type of markings to use; more precisely, it is possible to determine the granularity of those markings. Dytan is able to either assign a single marking to each piece of input or use byte-level markings. We chose the second type of markings, as it is more precise but at the same time does not cause excessive overhead during the execution. In order to make data tainting work it is important to define what data needs to be tracked. In Dytan it is possible to track user input coming from network operations, file accesses and command line arguments passed to the main() function. That is, the system calls and functions responsible for the aforementioned input sources are monitored and their output is tracked through the binary. Another important factor to take into account while implementing a data tainting tool is the propagation policy. A propagation policy is a set of rules followed while taint markings are assigned during program execution. Dytan is currently able to perform control- and data-flow analysis or data-flow only analysis. The former tracks direct or transitive data assignments as well as indirect propagation due to control flow dependencies on user input. The latter instead can only track direct and transitive data assignments. In our fuzzer we use the second approach, as control flow analysis does not add any useful information on data locations to be tested. Another problem to tackle while creating a propagation policy is how to deal with multiple markings assigned to the same input. Dytan currently assigns to the resulting taint marking the union of all the taint markings related to it. Although a different approach might grant better results for our fuzzer, we currently use the default Dytan policy. Finally, we make Dytan provide information on every instruction that assigns taint markings. That is, for each of those instructions we obtain the state of taint markings on machine registers and on memory locations that are tainted at that specific program point.
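The byte-level markings and union-on-combination policy described above can be modeled in a few lines. This is a toy model of the policy, not Dytan's actual implementation, and the values are invented:

```python
class Tainted:
    """A value carrying the set of input byte offsets it depends on."""

    def __init__(self, value, marks=frozenset()):
        self.value = value
        self.marks = frozenset(marks)

    def __add__(self, other):
        # Data-flow propagation: the result of an operation is tainted by
        # the union of both operands' markings (the default policy).
        return Tainted(self.value + other.value, self.marks | other.marks)

a = Tainted(0x41, {0})   # derived from byte 0 of the user input
b = Tainted(0x42, {1})   # derived from byte 1 of the user input
c = Tainted(10)          # untainted program constant
r = a + b + c            # tainted by input bytes 0 and 1
```

An in-memory fuzzer consuming this information knows that mutating input bytes 0 and 1 influences `r`, while no amount of input mutation affects `c`.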

4.3 In-memory fuzzing

We presented in Section 3 the two known approaches to in-memory fuzzing. In this section we are going to present two slightly different approaches which we believe obtain better results, given the amount of information we can gather from data tainting analysis. We implemented our in-memory fuzzer on top of PIN[16]. PIN has the ability to add instrumentation functions before and after a binary is loaded in memory, as well as around functions and instructions. Recall that for each instruction that assigns taint markings, we retrieve from data tainting analysis the markings associated with machine registers and memory locations; we are thus able to precisely identify program points during binary execution that are suitable for fuzzing. For both approaches we perform a number of steps:

1. Install an analysis function on image loading.

2. Install an analysis function before the function we are interested in fuzzing is executed.

3. Install an analysis function before each instruction that assigns taint markings.

At point 1 we search for the address of the function we are interested in fuzzing and install the analysis function for that function. At point 2 we iterate through the function's instructions, locating the ones that are of interest, in order to install an analysis function as described in 3. The first approach consists of mutating memory locations and registers in place. That is, instead of allocating new memory and pointing instruction operands to it, we modify the content of both memory locations and registers within their length boundaries. We then continue the program execution until the program quits or new data is obtained from a tainted source. This approach is more conservative than all the others, as it does not change the memory layout; thus the number of false positives is reduced, but at the expense of an increased number of false negatives. The second approach works very similarly to SRM, Figure 1(b). In addition to the first three steps we also add an instrumentation function at the end of the tested function. This function will be responsible for restoring memory after fuzzing was performed. With the second approach the memory layout is changed, as the fuzzer will allocate chunks of memory to be used during the fuzzing phase. As with the first approach, the program execution is continued until the application quits or new data is obtained from a tainted source. Although our second approach is similar to SRM, there are a few notable differences that have to be considered. First, we do not take a full snapshot of the process memory; we only track modifications that occurred due to fuzzing during the execution of the tested function. The second difference is that memory is not totally restored after the function was fuzzed; this can allow us to reduce the number of false negatives, since possible bugs caused by a faulty execution of the function are not missed by restoring the full process memory. It has to be noted that both approaches described here, although more effective, cannot be used without a proper amount of information gathered by means of data tainting analysis or similar techniques.
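The first, in-place approach can be illustrated with a toy model; the buffer contents and tainted offsets below are invented, whereas a real implementation would obtain the offsets from the taint analysis and mutate live process memory through PIN:

```python
import random

def mutate_in_place(buf, tainted_offsets, seed=None):
    """Mutate only the tainted bytes, within their existing length
    boundaries, so the memory layout of the process is left untouched."""
    rng = random.Random(seed)
    for off in tainted_offsets:
        buf[off] = rng.randrange(256)

# A region of process memory in which bytes 9-11 are tainted by user input.
region = bytearray(b"HTTP/1.1 200 OK")
mutate_in_place(region, tainted_offsets=[9, 10, 11], seed=7)
```

Because nothing is reallocated and untainted bytes are never touched, a crash triggered by these mutations is less likely to be a false positive caused by the fuzzer's own changes to the memory layout.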

4.4 Code coverage

The combination of code coverage with fuzz testing has long been used to measure the effectiveness of fuzzing. We implemented code coverage on top of the BinNavi debugging API. The choice of the BinNavi debugger serves a double purpose: not only are we able to implement code coverage using lightweight breakpoints, which greatly reduce execution overhead, but we are also able to monitor the execution for possible faults. We decided to implement code coverage at the basic block level; that is, a breakpoint is set at the beginning of each basic block in the tested binary. We perform code coverage first when the binary is executed with a known good sample; later it is calculated again every time the program is fuzzed. We require the fuzzing sample to perform at least as well as the known good sample; we also set a threshold defining the upper bound after which the sample reaches the "halting point". The "halting point" is the point where the fuzzing process is re-initialized with a new known good sample, as shown in Figure 2. Formally:

Definition Let C be the code coverage score of a known good sample, C1 the code coverage score of a fuzzing sample, and t a user-supplied delta.

The following must hold true:

C1 ≤ C + t (5)

The halting point is defined as:

C1 = C + t (6)

The code coverage score is calculated as follows:

Definition Let BBt be the total number of basic blocks in a binary and BBf the number of basic blocks executed in a single run. The code coverage score is defined as:

C = BBf / BBt  (7)

A detailed implementation of code coverageusing BinNavi API can be found in [17].
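Conditions (5)-(7) can be sketched directly; the basic block counts below are made up for illustration:

```python
def coverage_score(blocks_hit, blocks_total):
    """Equation (7): C = BB_f / BB_t."""
    return len(blocks_hit) / blocks_total

def evaluate_sample(c_fuzz, c_good, t):
    """Apply (5) and (6): a fuzzing sample must cover at least as much as
    the known good one; once coverage reaches C + t the halting point is
    hit and the fuzzer re-initializes with a new known good sample."""
    if c_fuzz >= c_good + t:
        return "halt"
    return c_fuzz >= c_good

total_blocks = 200
c_good = coverage_score(set(range(120)), total_blocks)   # 0.60
c_fuzz = coverage_score(set(range(130)), total_blocks)   # 0.65
verdict = evaluate_sample(c_fuzz, c_good, t=0.1)         # keep fuzzing
```

Samples that regress below the known good coverage are rejected, while samples that exceed it by more than the delta trigger re-initialization, matching the feedback loop of Figure 2.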

5 Results and future work

In this paper we have described a new approach to fuzz testing which greatly reduces instrumentation costs, thus proving very useful when dealing with large proprietary applications. We have also shown how it is possible to combine static and dynamic analysis techniques to triage interesting functions from a security testing point of view. Finally, we have proposed a new approach to in-memory fuzzing which is more precise and less prone to false negatives than previously known techniques. We do not have enough data to determine whether this approach obtains better results compared to other fuzzing techniques. The author believes that, compared to other mutation-based and evolutionary-based methodologies, the one proposed in this paper will have better results. In comparison to generation-based fuzzers, our technique will have better results when dealing with complex software but worse results when the software input is simple. The main direction of future work will be focused on reducing false positives by employing constraint reasoners to determine whether a given bug is reproducible with valid but unexpected input. Another important challenge is to implement more static analysis metrics to triage functions with a higher degree of precision.

Acknowledgments

The author would like to thank Thomas Dullien, Dino Dai Zovi and Shauvik Roy Choudhary for their suggestions and help while researching the topic. The author would also like to thank James Clause and Alessandro Orso for having provided access to the Dytan source code and for their help while testing and improving the original code base. Finally, we want to thank all the people who have reviewed the paper.

References

[1] B. P. Miller, L. Fredriksen, and B. So: "An Empirical Study of the Reliability of UNIX Utilities", Communications of the ACM 33, 12 (December 1990)

[2] Kan: Metrics and Models in Software Quality Engineering. Addison-Wesley. pp. 316-317.

[3] W. Drewry, T. Ormandy: Flayer: Exposing Application Internals, Proceedings of the First USENIX Workshop on Offensive Technologies.

[4] G. Hoglund: Runtime Decompilation: The GreyBox Process for Exploiting Software, Black Hat DC 2003

[5] M. Sutton, A. Greene, P. Amini: Fuzzing: Brute Force Vulnerability Discovery. Addison-Wesley.

[6] J. Clause, W. Li, A. Orso: Dytan: A Generic Dynamic Taint Analysis Framework, Proceedings of the 2007 International Symposium on Software Testing and Analysis

[7] Valgrind: http://www.valgrind.org

[8] Qemu: http://www.qemu.org

[9] Temu: http://bitblaze.cs.berkeley.edu/temu.html

[10] T. J. McCabe: A Complexity Measure, IEEE Transactions on Software Engineering, vol. SE-2, no. 4, December 1976

[11] V. Iozzo: Scripting with BinNavi - Cyclo-matic Complexity

[12] BinNavi: http://www.zynamics.com/binnavi.html

[13] T. Lengauer and R. E. Tarjan: A Fast Algorithm for Finding Dominators in a Flowgraph, ACM Transactions on Programming Languages and Systems

[14] T. Dullien, S. Porst: REIL: A platform-independent intermediate representation ofdisassembled code for static code analysis,CanSecWest 2009

[15] V. Iozzo: Finding Interesting Loops Using (Mono)REIL

[16] PIN: http://www.pintool.org


[17] V. Iozzo: Code coverage and BinNavi
