Tuesday, February 7, 2012

Virtuoso – Initial Code Release

I've just gotten word that the Virtuoso source code has been approved by the sponsor for public release, so I've uploaded version 1.0 to the Virtuoso Google Code site! Thanks to Tim Leek at MIT Lincoln Laboratory for seeing this project through the lengthy release review process!

Also on Google Code, you can find an installation guide and a walkthrough to get you started.

Check it out, and generate some memory analysis tools! If you run into trouble, you can shoot me an email and I'll do my best to help out, but keep in mind that this is a research project, and so there are still lots of rough edges. Enjoy!

Tuesday, September 6, 2011

What I Did on My Summer Vacation

Over the summer I worked at Microsoft Research, which has a fantastically smart bunch of people working on really cool and interesting problems. I just noticed that they've posted the video of my end-of-internship talk, Monitoring Untrusted Modern Applications with Collective Record and Replay. Please take a look if you're curious about what it might look like to try and monitor mobile apps in the wild with low overhead!

Saturday, May 28, 2011

Paper and Slides Available for "Virtuoso: Narrowing the Semantic Gap in Virtual Machine Introspection"

I've recently returned from Oakland, CA, where the 32nd IEEE Symposium on Security and Privacy was held. There were a lot of excellent talks, and it was great to catch up with others in the security community. Now that the conference is over, I'm happy to release the paper and slides of our work, "Virtuoso: Narrowing the Semantic Gap in Virtual Machine Introspection", which I have described in an earlier post.

The slides contain some animations, and so I've made them available in three formats.

You can also get a copy of the full paper here. I'm also hoping to have the source ready for release soon; when it is available, you'll be able to find it on Google Code under the name Virtuoso.

Once again, thanks to my most excellent co-authors at MIT Lincoln Labs and Georgia Tech for helping me see this project through!

Wednesday, April 6, 2011

Applying Forensic Tools to Virtual Machine Introspection

I've just released a technical report summarizing some work I did a couple of years ago that explores how forensic memory analysis and virtual machine introspection are closely linked.

Abstract: Virtual machine introspection (VMI) has formed the basis of a number of novel approaches to security in recent years. Although the isolation provided by a virtualized environment provides improved security, software that makes use of VMI must overcome the semantic gap, reconstructing high-level state information from low-level data sources such as physical memory. The digital forensics community has likewise grappled with semantic gap problems in the field of forensic memory analysis (FMA), which seeks to extract forensically relevant information from dumps of physical memory. In this paper, we will show that work done by the forensic community is directly applicable to the VMI problem, and that by providing an interface between the two worlds, the difficulty of developing new virtualization security solutions can be significantly reduced.

You can read the full paper on SMARTech. Hopefully this will encourage others to start using great memory analysis tools like Volatility for live analysis of virtual machines!

Tuesday, March 15, 2011

Automatically Generating Memory Forensic Tools

Now that the IEEE Symposium on Security and Privacy program has finally been posted, I can describe some research I've been working on for the past year and a half related to virtual machine introspection (VMI) and memory forensics.

A well-known problem with VMI and memory forensics is the semantic gap -- basically, the kind of information you want out of a memory image or a running VM is high level information (what processes are running, what files are open, and so on) but what you get is a big bunch of uninterpreted bytes (i.e., a view of physical memory). Bridging this gap is what tools like Volatility were built to do, and they do it well.
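To make the gap concrete, here is a minimal sketch in Python. The record layout (`PROC_RECORD`) and the helper `parse_process` are made up for illustration and do not match any real operating system's structures; the point is only that the same bytes are useless until you know how to interpret them.

```python
import struct

# Hypothetical record layout (NOT a real OS structure):
#   4-byte little-endian PID, 4-byte parent PID, 16-byte NUL-padded name.
PROC_RECORD = struct.Struct("<II16s")


def parse_process(raw, offset=0):
    """Turn the raw bytes at `offset` into a high-level view of a process."""
    pid, ppid, name = PROC_RECORD.unpack_from(raw, offset)
    return {"pid": pid, "ppid": ppid, "name": name.rstrip(b"\x00").decode("ascii")}


# What a memory image gives us: uninterpreted bytes.
raw = PROC_RECORD.pack(1234, 4, b"notepad.exe")
print(raw.hex())

# What we actually want to know.
info = parse_process(raw)
print(info)
```

In a real tool the hard part is knowing the layout and where in memory to start looking, which is exactly the OS-specific knowledge discussed next.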

However, building a tool like Volatility takes a lot of work and a lot of knowledge about the internals of the operating system you're trying to examine. With operating systems like Windows, which are closed source, this kind of knowledge comes from things like the Windows Internals book, blog posts, and good old-fashioned reverse engineering. This takes a lot of time, and the process has to be repeated every time there's a new version of Windows or a new operating system you want to support. Volatility's next release will support Vista and Windows 7, but it hasn't been easy – the networking code, for example, was rewritten for Vista, which required some reverse engineering by MHL and a new plugin.

Is there an easier way? What we want, in an ideal world, is some way that we can generate some of these tools automatically, for any OS or version. That's the problem that we set out to solve, and it's one that I think we made some good progress on -- though as with any academic work, there's still lots of room for improvement :)

The basic idea is that many of the tools we want to run on a memory image could be easily coded if we had access to the native APIs on the system – for example, we could easily write something similar to pslist if we had access to the Windows API by doing something like:
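A minimal sketch of such a program, written here in Python rather than C. It assumes the Win32 `EnumProcesses` call exported by psapi.dll when run on Windows; on Linux it uses the numeric directory names under /proc as a stand-in. The function name `list_pids` and the `max_pids` limit are made up for illustration.

```python
import ctypes
import os
import sys


def list_pids(max_pids=4096):
    """Return the PIDs of running processes by asking the OS itself."""
    if sys.platform == "win32":
        from ctypes import wintypes
        # EnumProcesses fills an array of DWORD PIDs and reports how many
        # bytes of that array it actually used.
        pids = (wintypes.DWORD * max_pids)()
        needed = wintypes.DWORD()
        ok = ctypes.windll.psapi.EnumProcesses(
            pids, ctypes.sizeof(pids), ctypes.byref(needed))
        if not ok:
            raise ctypes.WinError()
        count = needed.value // ctypes.sizeof(wintypes.DWORD)
        return sorted(pids[i] for i in range(count))
    # Linux stand-in: each running process has a numeric directory in /proc.
    return sorted(int(name) for name in os.listdir("/proc") if name.isdigit())


if __name__ == "__main__":
    for pid in list_pids():
        print(pid)
```

Note how little OS knowledge this takes: the program never touches a kernel data structure directly, it just calls the documented interface.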


Our system, which we call Virtuoso, takes advantage of this fact. We take small programs like the one shown above and run them inside a virtual machine that logs every instruction they execute, both in user-mode and in the kernel. From these logs, we can then automatically generate Volatility plugins that do the same thing. Of course, I'm omitting a lot of technical detail here – there's a lot of work that needs to be done to clean up the logs, cut out irrelevant parts of the computation, and reconstitute the logs back into something that resembles a program – but that's the core idea.

In our paper, we show off our technique by automatically generating 6 different programs on Linux, Windows, and Haiku. These programs do things like list the PIDs of currently running processes, enumerate loaded kernel modules, and retrieve the executable name for a given PID, and didn't require any special knowledge to create: we just looked up the API functions that did what we wanted and wrote small programs like the one shown above, then let Virtuoso do the hard work of creating a Volatility plugin.

In future posts, I'll go deeper into the technical methods used to achieve this. I'll also post the paper itself once the conference happens (after all, I have to give people some reason to come and see the talk ;) ). And finally, I'm hoping to release the code itself, once I get approval from the people that funded the research. For now, I'm going to employ a tactic known as "proof by screenshot", showing the steps involved in creating a plugin to list the PIDs of running processes under Haiku. (Click any of the screenshots to see a larger version.)

First we write a program that uses the Haiku API to get a list of running processes. We annotate the program with some markers that tell our logging engine where to start and stop the trace, and what the inputs and outputs are (the calls to vm_mark_buf_{in,out}):

We now compile and run that program inside a virtual machine running Haiku, and log what computation it does:


Next, we run our analyzer on it, which does its magic and produces a plugin for Volatility:


Finally, we can run that plugin within Volatility to analyze a Haiku memory image:


To wrap things up, I want to thank my co-authors Tim Leek, Michael Zhivich, Jonathon Giffin, and Wenke Lee. It's been a long road, but I'm hoping this research will make it a lot easier to build exciting new security tools for VMI and memory forensics!

Thursday, July 15, 2010

GDI Utilities: Taking Screenshots of Memory Dumps

I've posted about this before (twice!), but somehow never gotten around to releasing functioning code. Here (click), for your downloading pleasure, is a set of plugins designed to extract information about on-screen (graphical) windows from Windows XP SP2/3 memory images. This includes:
  • window_list - give a text listing of the window hierarchy, with each window's on-screen coordinates, current style, and its class (Button, Window, etc.). Here's some example output to whet your appetite.
  • screenshot - save a wireframe "screenshot" of the on-screen windows in a memory image. See later in this post for some examples. Requires PIL.
  • wndmon - continuously monitor a memory image and provide an updating view of the on-screen windows. Works best in a live environment, e.g. with XenAccess and PyXa. Requires PyGame. (This is what I used for the video demo).
All three plugins require the distorm disassembly library to work. I had a bit of trouble getting it to work under Linux, so here are the steps required so you don't have to go through the pain:
  1. Get distorm3 from its Google Code site.
  2. Go into build/linux and type "make".
  3. Copy the resulting libdistorm3.so into the Python directory.
  4. Rename the Python directory to "distorm" and move it somewhere in your Python path (I use Debian, and found that /usr/local/lib/python2.6/dist-packages/ worked well).
Hopefully this is a bit simpler under Windows, but I don't have a Windows box handy so I can't test that at the moment.

Have fun with the code. If you go exploring in the source, you may find some interesting things -- there's more functionality there than is exposed through the plugins, including some functions and data structures that can extract HTML content from IE in memory... ;)

Anyway, to wrap things up, here's an example of the output from the screenshot plugin, running on the two NIST memory images:

From the 6/25 image:



From the 7/4 image:



And that, my friends, is the power of memory analysis.

Tuesday, July 6, 2010

Plugin Post: Robust Process Scanner

It's pretty well known, in memory forensics circles, that there are two common ways of finding processes in memory images: list-walking, which traverses the kernel's linked list of process data structures, and scanning, which does a sweep over memory, looking for byte patterns that match the data found in a process data structure.

Having two different ways of finding processes can be very handy, especially when we suspect that someone may be trying to hide processes. One common way of hiding processes in Windows is called DKOM (Direct Kernel Object Manipulation); this technique works by just unlinking the process you want to hide from the kernel's list, like so:
This makes it invisible to programs such as the task manager, as well as memory forensic tools that use list-walking (including Volatility's pslist). However, such hidden processes can still be found by scanning memory using a signature for the process data structure; this is what psscan2 does.
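Here is a toy illustration of the difference, in Python. It is a sketch under loud assumptions: the "memory image" is a small bytearray, the record layout (`REC`), the tag bytes, and the use of plain offsets instead of pointers are all invented for the example and are far simpler than the real Windows structures.

```python
import struct

TAG = b"\x03\x00\x1b\x00"          # tag bytes marking a toy process record
REC = struct.Struct("<4sIII")       # tag, pid, flink, blink (links are offsets)
HEAD = 0                            # offset of the list head record


def build_image(pids, size=256, start=32, stride=32):
    """Build a toy image holding a circular doubly linked process list."""
    mem = bytearray(size)
    ring = [HEAD] + [start + i * stride for i in range(len(pids))]
    for i, off in enumerate(ring):
        tag = TAG if i else b"\x00" * 4       # the head is not a process
        pid = pids[i - 1] if i else 0
        REC.pack_into(mem, off, tag, pid, ring[(i + 1) % len(ring)], ring[i - 1])
    return mem


def walk_list(mem):
    """List-walking: follow flink from the head until we come back to it."""
    pids = []
    off = REC.unpack_from(mem, HEAD)[2]
    while off != HEAD:
        _, pid, off, _ = REC.unpack_from(mem, off)
        pids.append(pid)
    return pids


def scan(mem):
    """Scanning: sweep all of memory for the tag, ignoring the links."""
    return [REC.unpack_from(mem, off)[1]
            for off in range(0, len(mem) - REC.size + 1, 4)
            if mem[off:off + 4] == TAG]


def unlink(mem, off):
    """DKOM-style hiding: make both neighbours skip the record at `off`."""
    _, _, flink, blink = REC.unpack_from(mem, off)
    struct.pack_into("<I", mem, blink + 8, flink)    # prev.flink = next
    struct.pack_into("<I", mem, flink + 12, blink)   # next.blink = prev


mem = build_image([4, 1234, 666])
unlink(mem, 96)                 # hide the third record (pid 666)
print(walk_list(mem))           # list-walking no longer sees pid 666
print(scan(mem))                # scanning still finds it
```

The hidden record is untouched in memory; only its neighbours' links changed, which is why a sweep over the raw bytes still turns it up.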

Unfortunately, it's been known since at least 2007 (as mentioned in AAron Walters and Nick Petroni's Blackhat DC talk, and more recently in a presentation by Jesse Kornblum) that even signature scans can be evaded by crafty attackers. Signatures typically rely on "magic" values found in the process data structure. For example, in Windows XP, process data structures always begin with "\x03\x00\x1b\x00", which makes it pretty easy to find them in memory images.

But is that magic value really essential to the correct functioning of a process in Windows? What if an attacker just overwrites those four bytes with zeroes? As it turns out, Windows will be perfectly happy to keep running the process! At the same time, it will be completely hidden from existing forensic tools. What's more, as I demonstrated in my paper for CCS 2009 (Robust Signatures for Kernel Data Structures), around 51 fields in the process data structure can be manipulated by attackers in this way – including nearly all of the fields currently used to find processes.

So what's a forensic analyst to do? Luckily, there are some parts of the process data structure that are hard for an attacker to mess with without causing one of these:

So if we can build a signature based on these fields, we can find processes that existing signature scanners might miss.
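As a rough sketch of the idea in Python: the layout and the two constraints below (a page-aligned, non-zero directory base and an aligned pointer into kernel address space) are hypothetical stand-ins chosen for illustration, not the actual fields or checks from the paper or from psscan3.

```python
import struct

MAGIC = b"\x03\x00\x1b\x00"
# Hypothetical layout: magic, page-directory base, thread-list pointer.
REC = struct.Struct("<4sII")
KERNEL_BASE = 0x80000000


def magic_scan(mem):
    """Signature scan that trusts the magic bytes."""
    return [off for off in range(0, len(mem) - REC.size + 1, 4)
            if mem[off:off + 4] == MAGIC]


def robust_scan(mem):
    """Signature scan that ignores the magic and checks hard-to-fake fields."""
    hits = []
    for off in range(0, len(mem) - REC.size + 1, 4):
        _, dir_base, threads = REC.unpack_from(mem, off)
        page_aligned = dir_base != 0 and dir_base % 0x1000 == 0
        kernel_ptr = threads >= KERNEL_BASE and threads % 8 == 0
        if page_aligned and kernel_ptr:
            hits.append(off)
    return hits


mem = bytearray(128)
REC.pack_into(mem, 16, MAGIC, 0x00039000, 0x81F0A2C8)
REC.pack_into(mem, 64, MAGIC, 0x0A4C1000, 0x82003D50)
mem[64:68] = b"\x00" * 4        # attacker zeroes the magic of the second record

print(magic_scan(mem))          # only the untouched record matches
print(robust_scan(mem))         # both records match
```

The attacker can zero the magic for free, but changing the fields the robust scan relies on would (by assumption) break the process, so the signature survives.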

And that's just what I've done. Here, for your consideration and consumption, is the creatively-named psscan3 (just drop it into the memory_plugins directory of Volatility 1.3.2). It uses only fields that have been identified as "robust" to locate processes in Windows memory. It's a bit slower than the existing scanners right now, because it's checking for more things.

If you want to try it out, you might also want to download this sample memory image, which has a hidden process at offset 0x01a4bc20. In Volatility, pslist, psscan, and psscan2 all miss the process, but psscan3 detects it, as shown in this exciting screenshot (click to enlarge; the windows show, from left to right, psscan, psscan2, and psscan3) [EDIT: Blogger is for some reason refusing to link to the larger size; click here to view it]:

If you'd like a copy of the rootkit that hid this process (which is based on the FU Rootkit), send me an e-mail (but be warned that I probably won't be able to dig up the source until this fall).

So that's it! If you want to find out more about what went into this plugin, you're encouraged to check out my paper, or browse the slides from the talk at CCS 2009.