Thursday, July 21, 2016

Fuzzing with AFL is an Art

Using one of the test cases from the previous post, I examine what affects AFL's ability to find a bug placed by LAVA in a program. Along the way, I found what's probably a harmless bug in AFL, and some interesting factors that affect its performance. Although its interface is admirably simple, AFL can still require some tuning, and unexpected things can determine its success or failure on a bug.

American Fuzzy Lop, or AFL for short, is a powerful coverage-guided fuzzer developed by Michal Zalewski (lcamtuf) at Google. Since its release in 2013, it has racked up an impressive set of trophies in the form of security vulnerabilities in high-profile software. Given its phenomenal success on real world programs, I was curious to explore in detail how it worked on an automatically generated bug.

I started off with the toy program we looked at in the previous post, with a single bug added. The bug added by LAVA will trigger whenever the first four bytes of a float-type file_entry are set to 0x6c6175de or 0xde75616c, and will cause printf to be called with an invalid format string, crashing the program.
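As a rough sketch of what such a trigger amounts to (the function and macro names below are hypothetical, not LAVA's generated code), the check accepts the magic value in either byte order:

```c
#include <stdint.h>

/* Hypothetical stand-in for the LAVA-inserted check. The names here are
   illustrative only; LAVA's real generated code looks different. */
#define LAVA_TRIGGER 0x6c6175deu

/* Reverse the byte order of a 32-bit value. */
static uint32_t bswap32(uint32_t v) {
    return (v >> 24) | ((v >> 8) & 0x0000ff00u) |
           ((v << 8) & 0x00ff0000u) | (v << 24);
}

/* Returns 1 when the value matches the trigger in either byte order. */
static int lava_triggered(uint32_t v) {
    return v == LAVA_TRIGGER || v == bswap32(LAVA_TRIGGER);
}
```

The second constant in the text, 0xde75616c, is simply the byte-swapped form of the first, which is why the sketch derives it rather than hard-coding it.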

After verifying that the bug could be triggered reliably, I compiled it with afl-gcc and started a fuzzing run. To get things started, I used a well-formed input file for the program that contained both int and float file_entry types.

Because I'm lucky enough to have a 24-core server sitting around, I gave it 24 cores (one using -M and the rest using -S) and let it run for about four and a half days, fully expecting that it would find the input in that time.

This did not turn out so well.

Around 20 billion executions later, AFL had found zilch.

At this point, I turned to Twitter, where John Regehr suggested that I look into what coverage AFL was achieving. I realized that I actually had no idea how AFL's instrumentation worked, and that this would be a great opportunity to find out.

Diving Into AFL's Instrumentation

The basic afl-gcc and afl-clang tools are actually very simple. They wrap gcc and clang, respectively, and modify the compile process to emit an intermediate assembly code file (using the -S option). Finally they do some simple string matching (in C, ew) to find out where to add in calls to AFL's coverage logging functions. You can get AFL to save the assembly code it generates using the AFL_KEEP_ASSEMBLY environment variable, and see exactly what it's doing. (There's actually also a newer way of getting instrumentation that was added recently using an LLVM pass; more on this later.)

Left, the original assembly code. Right, the same code after AFL's instrumentation has been added.
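For intuition about what the injected logging does, here is a minimal C model of the edge-coverage update described in AFL's technical notes. The real version is emitted as assembly and writes to a shared memory region; the names trace_map, prev_loc, and log_coverage are assumptions made for this sketch.

```c
#include <stdint.h>

#define MAP_SIZE (1u << 16)           /* AFL's default map is 64 kB */

static uint8_t  trace_map[MAP_SIZE];  /* stand-in for the shared memory map */
static uint16_t prev_loc;             /* ID of the previously visited site */

/* Called at each instrumented site with that site's random
   compile-time ID. The XOR of the current and previous IDs names
   the edge that was just taken. */
static void log_coverage(uint16_t cur_loc) {
    trace_map[cur_loc ^ prev_loc]++;
    /* Shifting keeps A->B distinct from B->A and A->A from B->B. */
    prev_loc = cur_loc >> 1;
}
```

A block that never receives one of these calls is invisible to the fuzzer: inputs that reach it produce the same map as inputs that do not.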

After looking at the generated assembly, I noticed that the code corresponding to the buggy branch of the if statement wasn't getting instrumented. This seemed like it could be a problem, since AFL can't try to use coverage to reach a part of the program if there's no logging to tell it that an input has caused it to reach that point.

Looking into the source code of afl-as, the program that instruments the assembly code, I noticed a curious bit of code:

AFL skips labels following p2align directives in the assembly code.

According to the comment, this should only affect programs compiled under OpenBSD. However, the branch I wanted instrumented was being affected by this even though I was running under Linux, not OpenBSD, and there were no jump tables present in the program.

The .L18 block should be instrumented by AFL, but won't be because it's right after an alignment statement.
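The effect can be reproduced with a deliberately simplified model of the label scan. This is not afl-as's actual code: it only counts local labels, it treats any .p2align directive as arming the skip, and the function name and the honor_p2align_skip parameter are invented for illustration.

```c
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

/* Simplified model, NOT afl-as itself: count the local labels
   (".L<digit>...:") that would receive instrumentation. When
   honor_p2align_skip is true, the first label after a ".p2align"
   directive is skipped, mimicking the behavior described above. */
static int count_instrumented_labels(const char *const *lines, int n,
                                     bool honor_p2align_skip) {
    bool skip_next_label = false;
    int instrumented = 0;

    for (int i = 0; i < n; i++) {
        const char *line = lines[i];

        /* An alignment directive arms the skip flag. */
        if (honor_p2align_skip && strncmp(line, "\t.p2align ", 10) == 0) {
            skip_next_label = true;
            continue;
        }

        /* Local jump targets look like ".L18:". */
        if (line[0] == '.' && line[1] == 'L' &&
            isdigit((unsigned char)line[2]) && strchr(line, ':')) {
            if (skip_next_label) {
                skip_next_label = false;   /* label silently skipped */
            } else {
                instrumented++;
            }
        }
    }
    return instrumented;
}
```

Fed a listing where .L18 directly follows an alignment directive, the model instruments one label fewer with the skip enabled than with it disabled, which is the gap that hid the buggy branch.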

Since I'm not on OpenBSD, I just commented out this if statement. As an alternate workaround, you can also add "-fno-align-labels -fno-align-loops -fno-align-jumps" to the compile command (at the cost of potentially slower binaries). After making the change I restarted, once again confident AFL would soon find my bug.

Alas, it was not to be. Another 17 hours of fuzzing on 24 cores yielded nothing, and so I went back to the drawing board. I am still fairly sure I found a real bug in AFL, but fixing it didn't help find the bug I was interested in. (Note: it's possible that if I had waited four days again it would have found my bug. On the other hand, AFL's cycle counter had turned green, indicating that it thought there was little benefit in continuing to fuzz.)

5.2 billion executions, no crashes :(

“Unrolling” Constants

Thinking about what AFL would need in order to find the bug, I realized that its chances of hitting our failing test case were pretty low. AFL will only prioritize a test case if it has seen that it leads to new coverage. In the case of our toy program, it would have to guess one of the two exact 32-bit trigger values at exactly the right place in the file, and the odds of this happening are pretty slim.

At this point I remembered a post by lcamtuf that described how AFL managed to figure out that an XML file could contain CDATA tags even though its original test cases didn't contain any examples that used CDATA. He also calls out our bug as exactly the kind of thing AFL is not designed to find:

What seemed perfectly clear, though, is that the algorithm wouldn't be able to get past "atomic", large-search-space checks such as:
if (strcmp(header.magic_password, "h4ck3d by p1gZ")) goto terminate_now;
if (header.magic_value == 0x12345678) goto terminate_now;

So how was AFL able to generate a CDATA tag out of thin air? It turns out that libxml2 has a set of macros that expand out some string comparisons into character-by-character comparisons that use simple if statements. This allows AFL to discover valid strings character by character, since each correct character will add new coverage, and cause further fuzzing to be done with that input.

We can also apply this to our test program. Rather than checking for the fixed constant 0x6c6175de, we can compare each byte individually. This should allow AFL to identify the trigger value one byte at a time. The new code looks like this:

The monolithic if statement has been replaced by 4 individual branches.
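A minimal sketch of the two styles, assuming the value has already been read into a uint32_t and covering only one of the two byte orders (the function names are hypothetical, not taken from the toy program):

```c
#include <stdint.h>

/* Original style: one atomic 32-bit comparison. A fuzzer gets no
   feedback until all four bytes are right at once. */
static int check_monolithic(uint32_t v) {
    return v == 0x6c6175deu;
}

/* Unrolled style: each byte gets its own branch, so every correct
   byte reaches a new block and shows up as new coverage. */
static int check_unrolled(uint32_t v) {
    if ((v & 0xffu) == 0xdeu) {
        if (((v >> 8) & 0xffu) == 0x75u) {
            if (((v >> 16) & 0xffu) == 0x61u) {
                if (((v >> 24) & 0xffu) == 0x6cu) {
                    return 1;
                }
            }
        }
    }
    return 0;
}
```

Both functions accept exactly the same inputs; the only difference is how many distinct branches the compiler (at low optimization levels) and therefore the instrumentation can see.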

Once we make this change and compile with afl-gcc, AFL finds a crash in just 3 minutes on a single CPU!

AFL has found the bug!

This also makes me wonder if it might be worthwhile to implement a compiler pass that breaks down large integer comparisons into byte-sized chunks that AFL can deal with more easily. For string comparisons, one can already substitute in an inline implementation of strcmp/memcmp; an example is available in the AFL source.

A Hidden Coverage Pitfall

While investigating the coverage issues, I noticed that AFL has a new compiler: afl-clang-fast. This module, contributed by László Szekeres, performs instrumentation as an LLVM pass rather than by modifying the generated assembly code. As a result, it should be less brittle and allow for more instrumentation options; from what I can tell it's slated to become the default compiler for AFL at some point.

However, I discovered that its instrumentation is not identical to the instrumentation done by afl-as. Whereas afl-as instruments each x86 assembly conditional branch (that is, any of the instructions starting with "j" aside from "jmp"), afl-clang-fast works at the level of LLVM basic blocks, which are closer to the blocks of code found in the original source. And since by default AFL adds -O3 to the compile command, multiple conditional checks may end up getting merged into a single basic block.

As a result, even though we have added multiple if statements to our source, the generated LLVM looks more like our original statement – the AFL instrumentation is only placed in the innermost if body, and so AFL is forced to try and guess the entire 32-bit trigger at once again.

Using the LLVM instrumentation mode, AFL is no longer able to find our bug.

We can tell AFL not to enable the compiler optimizations, however, by setting the AFL_DONT_OPTIMIZE environment variable. If we do that and recompile with afl-clang-fast, the if statements do not get merged, and AFL is able to find the trigger for the bug in about 7 minutes.

So this is something to keep in mind when using afl-clang-fast: the instrumentation does not work in quite the same way as the traditional afl-gcc mode, and in some special cases you may need to use AFL_DONT_OPTIMIZE in order to get the coverage instrumentation that you want.

Making AFL Smarter with a Dictionary

Although it's great that we were able to get AFL to generate the triggering input that reveals the bug by tweaking the program, it would be nice if we could somehow get it to find the bugs in our original programs.

AFL is having trouble with our bugs because they require it to guess a 32-bit input all at once. The search space for this is pretty large: even supposing that it starts systematically flipping bits in the right part of the file, it's going to take an average of 2 billion executions to find the right value. And of course, unless it has some reason to believe that working on that part of the file will get improved coverage, it won't be focusing on the right file position, making it even less likely it will find the right input.
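To put rough numbers on this, assume the fuzzer already knows which four bytes to mutate and tries each candidate value once, in uniformly random order. The expected number of tries for a single 32-bit target is then about half the space, whereas a byte-at-a-time search with coverage feedback after each correct byte needs at most 256 tries per byte:

```latex
\[
  \mathbb{E}[\text{tries}_{\text{32-bit}}] = \frac{2^{32} + 1}{2} \approx 2.1 \times 10^{9},
  \qquad
  \text{tries}_{\text{byte-wise}} \le 4 \times 2^{8} = 1024 .
\]
```

Both figures are idealizations: in practice AFL also has to pick the right file offset, which pushes the first number far higher.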

However, we can give AFL a leg up by allowing it to pick inputs that aren't completely random. One of AFL's features is that it supports using a dictionary of values when fuzzing. This is basically just a set of tokens that it can use when mutating a file instead of picking values at random. So one classic trick is to take all of the constants and strings found in the program binary and add them to the dictionary. Here's a quick and dirty script that extracts the constants and strings from a binary for use with AFL:
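The sketch below is not that script. It only illustrates what a dictionary entry for a 32-bit constant might look like once extracted, using AFL's name="value" dictionary syntax with \xNN escapes; the helper name format_dict_entry and the entry names are assumptions.

```c
#include <stdint.h>
#include <stdio.h>

/* Format one AFL dictionary line for a 32-bit constant, for example
   lava_le="\xde\x75\x61\x6c". big_endian selects the byte order in
   which the constant would appear in the input file. Returns the
   snprintf result (number of characters the full line needs). */
static int format_dict_entry(char *out, size_t out_len, const char *name,
                             uint32_t value, int big_endian) {
    unsigned b[4];

    for (int i = 0; i < 4; i++) {
        int shift = big_endian ? (24 - 8 * i) : (8 * i);
        b[i] = (value >> shift) & 0xffu;
    }
    return snprintf(out, out_len, "%s=\"\\x%02x\\x%02x\\x%02x\\x%02x\"",
                    name, b[0], b[1], b[2], b[3]);
}
```

Emitting each constant in both byte orders costs little and avoids having to know how the target program reads the value; the resulting file is passed to afl-fuzz with the -x option.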

Once we give AFL a dictionary, it finds 94% of our bugs (149/159) within 15 minutes!

Now, does this mean that LAVA's bugs are too easy to find? At the moment, probably yes. In the real world, the triggering conditions will not always be something you can just extract with objdump and strings. The key improvement needed in LAVA is a wider variety of triggering mechanisms, which is something we're working on.


By looking in detail at a bug we already knew was there, we found out some very interesting facts about AFL:

  • Its ability to find bugs is strongly related to the quality of its coverage instrumentation, and that instrumentation can vary due both to bugs in AFL and inherent differences in the various compile-time passes AFL supports.
  • The structure of the code also heavily influences AFL's behavior: seemingly small differences (making 4 one-byte comparisons vs one 4-byte comparison) can have a huge effect.
  • Seeding AFL with even a naïve dictionary can be devastatingly effective.

In the end, this is precisely what we hoped to accomplish with LAVA. By carefully examining cases where current bug-finding tools have trouble on our synthetic bugs, we can better understand how they work and figure out how to make them better at finding real bugs as well.


Thanks to Josh Hofing, Kevin Chung, and Ryan Stortz for helpful feedback and comments on this post, and of course Michal Zalewski for making AFL.

