Tuesday, September 6, 2011
What I Did on My Summer Vacation
Over the summer I worked at Microsoft Research, which has a fantastically smart bunch of people working on really cool and interesting problems. I just noticed that they've posted the video of my end-of-internship talk, Monitoring Untrusted Modern Applications with Collective Record and Replay. Please take a look if you're curious about what it might look like to try to monitor mobile apps in the wild with low overhead!
Labels:
collective,
microsoft,
mobile,
msr,
record and replay,
research,
security,
summer
Saturday, May 28, 2011
Paper and Slides Available for "Virtuoso: Narrowing the Semantic Gap in Virtual Machine Introspection"
I've recently returned from Oakland, CA, where the 2011 IEEE Symposium on Security and Privacy was held. There were a lot of excellent talks, and it was great to catch up with others in the security community. Now that the conference is over, I'm happy to release the paper and slides of our work, "Virtuoso: Narrowing the Semantic Gap in Virtual Machine Introspection", which I have described in an earlier post.
The slides contain some animations, and so I've made them available in three formats:
You can also get a copy of the full paper here. I'm also hoping to have the source ready for release soon; when it is available, you'll be able to find it on Google Code under the name Virtuoso.
Once again, thanks to my most excellent co-authors at MIT Lincoln Labs and Georgia Tech for helping me see this project through!
Wednesday, April 6, 2011
Applying Forensic Tools to Virtual Machine Introspection
I've just released a technical report summarizing some work I did a couple years ago that explores how forensic memory analysis and virtual machine introspection are closely linked.
Abstract: Virtual machine introspection (VMI) has formed the basis of a number of novel approaches to security in recent years. Although the isolation provided by a virtualized environment provides improved security, software that makes use of VMI must overcome the semantic gap, reconstructing high-level state information from low-level data sources such as physical memory. The digital forensics community has likewise grappled with semantic gap problems in the field of forensic memory analysis (FMA), which seeks to extract forensically relevant information from dumps of physical memory. In this paper, we will show that work done by the forensic community is directly applicable to the VMI problem, and that by providing an interface between the two worlds, the difficulty of developing new virtualization security solutions can be significantly reduced.
You can read the full paper on SMARTech. Hopefully this will encourage others to start using great memory analysis tools like Volatility for live analysis of virtual machines!
Tuesday, March 15, 2011
Automatically Generating Memory Forensic Tools
Now that the IEEE Symposium on Security and Privacy program has finally been posted, I can describe some research I've been working on for the past year and a half related to virtual machine introspection (VMI) and memory forensics.
A well-known problem with VMI and memory forensics is the semantic gap -- basically, the kind of information you want out of a memory image or a running VM is high level information (what processes are running, what files are open, and so on) but what you get is a big bunch of uninterpreted bytes (i.e., a view of physical memory). Bridging this gap is what tools like Volatility were built to do, and they do it well.
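To make the semantic gap concrete, here is a minimal Python sketch of what "bridging" it involves: turning raw bytes into a process ID and a name. The record layout (`RECORD`) is entirely hypothetical and far simpler than any real kernel structure; in practice the hard part is knowing the real layout and where to find it in memory.

```python
import struct
from typing import Tuple

# Hypothetical record layout (NOT a real OS structure): a 4-byte
# little-endian PID followed by a 16-byte, NUL-padded process name.
RECORD = struct.Struct("<I16s")


def parse_record(memory: bytes, offset: int) -> Tuple[int, str]:
    """Interpret the bytes at `offset` as one toy process record."""
    pid, raw_name = RECORD.unpack_from(memory, offset)
    # Keep only the bytes before the first NUL terminator.
    name = raw_name.split(b"\x00", 1)[0].decode("ascii", errors="replace")
    return pid, name


# A tiny fake "memory image": filler bytes, one record, more filler.
memory = b"\xcc" * 8 + RECORD.pack(1234, b"notepad.exe") + b"\xcc" * 8
print(parse_record(memory, 8))
```

Without the layout and the offset, the same bytes are just an uninterpreted blob, which is exactly the problem described above.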
However, building a tool like Volatility takes a lot of work and a lot of knowledge about the internals of the operating system you're trying to examine. With operating systems like Windows, which are closed source, this kind of knowledge comes from things like the Windows Internals book, blog posts, and good old fashioned reverse engineering. This takes a lot of time, and the process has to be repeated every time there's a new version of Windows or a new operating system you want to support. Volatility's next release will support Vista and Windows 7, but it hasn't been easy – the networking code, for example, was rewritten for Vista, which required some reverse engineering by MHL and a new plugin.
Is there an easier way? What we want, in an ideal world, is some way that we can generate some of these tools automatically, for any OS or version. That's the problem that we set out to solve, and it's one that I think we made some good progress on -- though as with any academic work, there's still lots of room for improvement :)
The basic idea is that many of the tools we want to run on a memory image could be easily coded if we had access to the native APIs on the system – for example, we could easily write something similar to pslist if we had access to the Windows API by doing something like:

Our system, which we call Virtuoso, takes advantage of this fact. We take small programs like the one shown above and run them inside a virtual machine that logs every instruction they execute, both in user-mode and in the kernel. From these logs, we can then automatically generate Volatility plugins that do the same thing. Of course, I'm omitting a lot of technical detail here – there's a lot of work that needs to be done to clean up the logs, cut out irrelevant parts of the computation, and reconstitute the logs back into something that resembles a program – but that's the core idea.
In our paper, we show off our technique by automatically generating 6 different programs on Linux, Windows, and Haiku. These programs do things like list the PIDs of currently running processes, enumerate loaded kernel modules, and retrieve the executable name for a given PID, and didn't require any special knowledge to create: we just looked up the API functions that did what we wanted and wrote small programs like the one shown above, then let Virtuoso do the hard work of creating a Volatility plugin.
In future posts, I'll go deeper into the technical methods used to achieve this. I'll also post the paper itself once the conference happens (after all, I have to give people some reason to come and see the talk ;) ). And finally, I'm hoping to release the code itself, once I get approval from the people that funded the research. For now, I'm going to employ a tactic known as "proof by screenshot", showing the steps involved in creating a plugin to list the PIDs of running processes under Haiku. (Click any of the screenshots to see a larger version.)
First we write a program that uses the Haiku API to get a list of running processes. We annotate the program with some markers that tell our logging engine where to start and stop the trace, and what the inputs and outputs are (the calls to vm_mark_buf_{in,out}):

We now compile and run that program inside a virtual machine running Haiku, and log what computation it does:
Next, we run our analyzer on it, which does its magic and produces a plugin for Volatility:
Finally, we can run that plugin within Volatility to analyze a Haiku memory image:
To wrap things up, I want to thank my co-authors Tim Leek, Michael Zhivich, Jonathon Giffin, and Wenke Lee. It's been a long road, but I'm hoping this research will make it a lot easier to build exciting new security tools for VMI and memory forensics!
Labels:
forensics,
haiku,
ieee security and privacy,
memory analysis,
oakland,
security,
virtualization,
virtuoso,
vmi
Thursday, July 15, 2010
GDI Utilities: Taking Screenshots of Memory Dumps
I've posted about this before (twice!), but somehow never gotten around to releasing functioning code. Here (click), for your downloading pleasure, is a set of plugins designed to extract information about on-screen (graphical) windows from Windows XP SP2/3 memory images. This includes:
- window_list - give a text listing of the window hierarchy, with each window's on-screen coordinates, current style, and its class (Button, Window, etc.). Here's some example output to whet your appetite.
- screenshot - save a wireframe "screenshot" of the on-screen windows in a memory image. See later in this post for some examples. Requires PIL.
- wndmon - continuously monitor a memory image and provide an updating view of the on-screen windows. Works best in a live environment, e.g. with XenAccess and PyXa. Requires PyGame. (This is what I used for the video demo).
- Get distorm3 from its Google Code site.
- Go into build/linux and type "make".
- Copy the resulting libdistorm3.so into the Python directory.
- Rename the Python directory to "distorm" and move it somewhere in your Python path (I use Debian, and found that /usr/local/lib/python2.6/dist-packages/ worked well).
Have fun with the code. If you go exploring in the source, you may find some interesting things -- there's more functionality there than is exposed through the plugins, including some functions and data structures that can extract HTML content from IE in memory... ;)
Anyway, to wrap things up, here's an example of the output from the screenshot plugin, running on the two NIST memory images:
From the 6/25 image:
From the 7/4 image:
And that, my friends, is the power of memory analysis.
Labels:
distorm,
GDI,
PIL,
plugins,
pygame,
python,
reverse engineering,
screenshots,
video,
Volatility,
win32k
Tuesday, July 6, 2010
Plugin Post: Robust Process Scanner
It's pretty well known, in memory forensics circles, that there are two common ways of finding processes in memory images: list-walking, which traverses the kernel's linked list of process data structures, and scanning, which does a sweep over memory, looking for byte patterns that match the data found in a process data structure.
Having two different ways of finding processes can be very handy, especially when we suspect that someone may be trying to hide processes. One common way of hiding processes in Windows is called DKOM (Direct Kernel Object Manipulation); this technique works by just unlinking the process you want to hide from the kernel's list, like so:

This makes it invisible to programs such as the task manager, as well as to memory forensic tools that use list-walking (including Volatility's pslist). However, such hidden processes can still be found by scanning memory using a signature for the process data structure; this is what psscan2 does.
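The difference between the two approaches can be sketched with a toy model in Python. Everything here is an assumption made for illustration: the 16-byte record layout, the `PROC` tag, and the use of plain offsets as "pointers" are invented, and real Windows list entries point into the middle of the neighbouring structure rather than at its start. The sketch builds a circular doubly linked list in a byte buffer, unlinks one record DKOM-style, and compares what list-walking and scanning report.

```python
import struct

MAGIC = b"PROC"                # hypothetical tag, not a real Windows value
REC = struct.Struct("<4sIII")  # toy layout: magic, flink, blink, pid


def build_memory(pids, size=256, stride=32):
    """Lay out a circular doubly linked list of toy process records."""
    mem = bytearray(size)
    offsets = [stride * (i + 1) for i in range(len(pids))]
    for i, (off, pid) in enumerate(zip(offsets, pids)):
        flink = offsets[(i + 1) % len(offsets)]  # next record
        blink = offsets[i - 1]                   # previous record (wraps)
        REC.pack_into(mem, off, MAGIC, flink, blink, pid)
    return mem, offsets[0]


def walk_list(mem, head):
    """List-walking: follow forward links until we are back at the head."""
    pids, off = [], head
    while True:
        _, flink, _, pid = REC.unpack_from(mem, off)
        pids.append(pid)
        off = flink
        if off == head:
            return pids


def scan(mem):
    """Scanning: check every 4-byte-aligned offset for the tag."""
    pids = []
    for off in range(0, len(mem) - REC.size + 1, 4):
        magic, _, _, pid = REC.unpack_from(mem, off)
        if magic == MAGIC:
            pids.append(pid)
    return pids


def dkom_unlink(mem, off):
    """Unlink the record at `off`; its bytes stay in memory untouched."""
    _, flink, blink, _ = REC.unpack_from(mem, off)
    struct.pack_into("<I", mem, blink + 4, flink)  # prev.flink = next
    struct.pack_into("<I", mem, flink + 8, blink)  # next.blink = prev


mem, head = build_memory([4, 400, 666, 1200])
dkom_unlink(mem, 96)          # hide the record holding PID 666
print(walk_list(mem, head))   # list-walking view
print(scan(mem))              # scanning view
```

In this toy model the unlinked record disappears from the list-walking view but is still found by the scan, because unlinking changes only the neighbours' pointers.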
Unfortunately, it's been known since at least 2007 (as mentioned in AAron Walters and Nick Petroni's Blackhat DC talk, and more recently in a presentation by Jesse Kornblum) that even signature scans can be evaded by crafty attackers. Signatures typically rely on "magic" values found in the process data structure. For example, in Windows XP, process data structures always begin with "\x03\x00\x1b\x00", which makes it pretty easy to find them in memory images.
But is that magic value really essential to the correct functioning of a process in Windows? What if an attacker just overwrites those four bytes with zeroes? As it turns out, Windows will be perfectly happy to keep running the process! At the same time, it will be completely hidden from existing forensic tools. What's more, as I demonstrated in my paper for CCS 2009 (Robust Signatures for Kernel Data Structures), around 51 fields in the process data structure can be manipulated by attackers in this way – including nearly all of the fields currently used to find processes.
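A short Python sketch shows how fragile a magic-value signature is. The four-byte value is the one quoted above for Windows XP; the 64-byte buffer, the filler bytes, and the offsets are made up for the demonstration, and `find_magic` is a hypothetical helper rather than code from any existing scanner.

```python
XP_PROCESS_MAGIC = b"\x03\x00\x1b\x00"  # value quoted in the post for XP


def find_magic(mem, magic=XP_PROCESS_MAGIC):
    """Return every offset at which `magic` occurs in `mem`."""
    hits, start = [], 0
    while True:
        idx = mem.find(magic, start)
        if idx == -1:
            return hits
        hits.append(idx)
        start = idx + 1


# Fake memory with two "process structures" at made-up offsets.
mem = bytearray(b"\xaa" * 64)
mem[16:20] = XP_PROCESS_MAGIC
mem[40:44] = XP_PROCESS_MAGIC

before = find_magic(mem)
mem[40:44] = b"\x00" * 4   # the attacker zeroes the four magic bytes
after = find_magic(mem)
print(before, after)
```

Once the four bytes are zeroed, the scan no longer reports the second structure, even though nothing else about it has changed.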
So what's a forensic analyst to do? Luckily, there are some parts of the process data structure that are hard for an attacker to mess with without causing one of these:

So if we can build a signature based on these fields, we can find processes that existing signature scanners might miss.
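As a rough illustration of the idea (not the actual psscan3 signature), a scanner can test constraints on fields the operating system itself depends on instead of matching a constant. In the Python sketch below, the field offsets in `PTR_FIELDS`, the 8-byte alignment rule, and the 16-byte record size are all hypothetical; the only real-world fact assumed is that kernel addresses on 32-bit Windows with the default address split start at 0x80000000.

```python
import struct

KERNEL_BASE = 0x80000000  # kernel space start, 32-bit Windows default split

# Hypothetical offsets (within a candidate structure) of two pointer
# fields that, by assumption, cannot be altered without a crash.
PTR_FIELDS = (4, 8)


def looks_like_process(mem, off):
    """Accept a candidate only if every pointer field is plausible."""
    for field in PTR_FIELDS:
        (ptr,) = struct.unpack_from("<I", mem, off + field)
        # Illustrative constraints: kernel-space address, 8-byte aligned.
        if ptr < KERNEL_BASE or ptr % 8 != 0:
            return False
    return True


def robust_scan(mem, size=16):
    """Check every 8-byte-aligned offset against the constraints."""
    return [off for off in range(0, len(mem) - size + 1, 8)
            if looks_like_process(mem, off)]


# One toy structure at offset 16 whose magic bytes were zeroed out,
# but whose pointer fields are still valid kernel addresses.
mem = bytearray(64)
struct.pack_into("<4sIII", mem, 16, b"\x00" * 4, 0x81234568, 0x8A000010, 666)
print(robust_scan(mem))
```

The zeroed magic value makes no difference here, since the check never looks at it. The trade-off is that every candidate offset needs several field checks, so a scan of this kind does more work than a simple pattern match.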
And that's just what I've done. Here, for your consideration and consumption, is the creatively-named psscan3 (just drop it into the memory_plugins directory of Volatility 1.3.2). It uses only fields that have been identified as "robust" to locate processes in Windows memory. It's a bit slower than the existing scanners right now, because it's checking for more things.
If you want to try it out, you might also want to download this sample memory image, which has a hidden process at offset 0x01a4bc20. In Volatility, pslist, psscan, and psscan2 all miss the process, but psscan3 detects it, as shown in this exciting screenshot (click to enlarge; the windows show, from left to right, psscan, psscan2, and psscan3) [EDIT: Blogger is for some reason refusing to link to the larger size; click here to view it]:
If you'd like a copy of the rootkit that hid this process (which is based on the FU Rootkit), send me an e-mail (but be warned that I probably won't be able to dig up the source until this fall).
Labels:
CCS2009,
DKOM,
forensics,
memory analysis,
plugins,
process,
rootkits,
scanning,
signature,
Volatility
Saturday, July 11, 2009
SANS Forensic Summit: Thoughts and Slides
This past Tuesday I attended the 2009 SANS Forensic Summit. In part, I was there to give a talk on combining volatile memory analysis with forensic analysis (see below for the slides from that), but I was also pretty excited about getting to hang out with the bright lights of the forensics community like Harlan Carvey, Chris Pogue, Richard Bejtlich, and many more.
Unfortunately, I was only able to attend the first day, which consisted primarily of technical talks on various aspects of forensics, incident response, and live forensics. All the talks were really excellent; Rob Lee and the folks at SANS should be commended for their great work in putting everything together. In this post I'm going to just describe the talks, rather than the panels; unfortunately I forgot to take notes during the panels and so I don't have as much to say about them, other than that they were fun and highly informative.
On to the talks! The first talk of the morning was Richard Bejtlich's keynote, which gave a really great analysis of the current state of the industry and the challenges faced by investigators today. He drew heavily from the Verizon Data Breach Investigations Report, which gave his assertions a nice feel of solidity; for example, when he says that we're in bad shape (getting compromised left and right), he can back that up with statistics showing that most intrusions are discovered only through third party notifications. If you're not already reading Richard's posts over at TaoSecurity, I highly encourage it.
After the keynote, Kris Harms got up to talk about live response. He gave a lot of cool tips on how to use some standard tools that most people should be familiar with (pslist, handles, etc.) to quickly triage a system and make a determination on whether it needs deeper analysis. I have to admit that I don't usually think a lot about live analysis--from a standpoint of simply collecting volatile data, I think that memory forensics offers a much better solution. However, from a triage perspective, live analysis makes a lot of sense; you can get a lot of leads very quickly by just knowing how to poke around on the live system.
Nevertheless, I did have one quibble with this talk. It seemed like a lot of the techniques presented, while cool, were a little haphazard. That is, "poking around" isn't necessarily repeatable, which means that as an investigator you could end up missing data by performing a different set of actions on different cases. After all, we're only human, and sometimes we forget things. I personally prefer to make sure that anything I'm going to do more than once is scripted. This allows one to codify an investigative procedure so that it's consistent and repeatable -- think of it as an executable checklist.
For example, in the presentation, Kris Harms described finding hidden processes by using handle.exe and pulling out the PIDs of each handle table it finds (Harlan Carvey now has a nice perl script that automates this). However, there are several rootkit detectors (such as IceSword) that will do this handle table vs. process list cross-view for you. I think we should definitely learn about these techniques and how they work, but I don't see the point in trying to keep them all in your head and do them by hand each time -- put it in a script and let the computer do the work.
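The cross-view comparison itself is simple enough to script. Here is a minimal Python sketch, assuming the two PID lists have already been collected (for instance, parsed from handle.exe and pslist output); the function name and the sample PIDs are made up for illustration.

```python
def hidden_pids(handle_table_pids, process_list_pids):
    """PIDs that own a handle table but are missing from the process list."""
    return sorted(set(handle_table_pids) - set(process_list_pids))


# Made-up sample data: PID 1337 shows up only in the handle-table view.
from_handles = [4, 368, 592, 1337, 1840]
from_pslist = [4, 368, 592, 1840]
print(hidden_pids(from_handles, from_pslist))
```

Any PID it prints is a lead worth a closer look, and because the check is scripted, it runs the same way on every case.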
After lunch, Harlan Carvey got up to talk about Windows registry analysis, a field that he did a lot of pioneering work in and essentially dominates. Things got a little hectic at the end, as he raced through some information-dense slides on specific kinds of forensic information you could get out of the registry, but overall I really found the talk engaging and illuminating. It also served as a really great motivator for my own talk: he spent a while near the beginning talking about volatile registry data and some of the reasons it's important. This set me up very nicely, since my own presentation was all about extracting registry data from memory. And I didn't even have to bribe him (much)!
Rounding out the day (for presentations, at least) was a combined, hour-and-a-half-long session on memory analysis with Jamie Butler, Peter Silberman, and me. Peter and Jamie gave a great talk on Memoryze, which is Mandiant's free (as in beer) tool for analyzing volatile memory. Although most of the stuff presented was nothing new if you've been following memory analysis research, it was nice to see their software in action. They also announced the release of a new version of Memoryze, which supports Vista more fully, including the reworked networking code. Peter and Jamie are both very smart, and while I personally prefer Volatility for my own work, I'm glad that people have great options like Memoryze and Volatility to choose from.
Finally, after Jamie and Peter, I gave my own talk on combining registry analysis with memory forensics. There wasn't much new research presented in the talk, but I think it serves as a nice introduction to the toolset for people that haven't seen it before. The slides are available at the bottom of this post (assuming I can get this embedding thing to work), and I'll let them speak for themselves. :)
Once again, a huge thanks to Rob Lee and everyone who organized and attended the SANS Forensics Summit 2009! If you missed it this year, I hope this post has given you a taste of some of the great stuff that goes on there, and will encourage you to go next time!
Labels:
forensics,
presentation,
registry,
SANS,
slides,
Volatility,
volreg,
volrip