Sunday, September 21, 2008

Window Messages as a Forensic Resource

In which window messages are explored, a new plugin is created, and we learn the importance of reading messages sent to you regularly.

In Windows, the GUI system is event-driven: actions occur in response to various events generated by the system. When you press a key on the keyboard, Windows generates a window message (specifically, WM_KEYDOWN) and sends it to the thread that owns the window that's in focus. That thread then calls the window's event handling procedure (the so-called WindowProc) to process the message. There are many such messages, covering input events such as keyboard and mouse actions, system-level events such as notification of a time change or a change in the system's power state, and events related to the windowing system, such as resizing or moving a window.

How can these be forensically relevant? Well, as it turns out, some threads in buggy applications aren't always good at processing their messages, and messages they receive pile up in the queue. This means that by looking at the message queues on a system, we can get some information about its past state. To make this more concrete, let's look at the message queue for a thread belonging to a certain full-disk encryption vendor on one of the images in my collection:
(I've omitted many of the details here to save space. Window messages also include things like the handle of the window the message is for and cursor position at the time the message was sent.)

If we look up WM_WTSSESSION_CHANGE on MSDN, we find that it's a message related to fast user switching; one of these is sent whenever someone logs in, locks the screen, or connects remotely (i.e. via remote desktop). The message is sent to all applications that have called WTSRegisterSessionNotification. However, in this case, despite the fact that the application asked to be notified of such changes, it didn't bother to process the messages it received! This means that we can now look through its queue and find out when the user logged in and out, when the screen was locked, and so on. (The times given are relative to the time the system was booted, and are only good up to 49.7 days–this is because the timestamp comes from the GetTickCount function). It should be clear why such information might be useful in a forensic investigation.
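To make the timestamp conversion concrete, here's a minimal Python sketch of turning a queued message's GetTickCount-style timestamp back into a wall-clock time. The boot time is an assumption here; in practice it would have to be recovered from another source (e.g. the registry or the kernel's boot time variable):

```python
from datetime import datetime, timedelta

def msg_time_to_wallclock(msg_time_ms, boot_time):
    """Convert a window message timestamp (milliseconds since boot,
    wrapping at ~49.7 days) to a wall-clock time, given the boot time."""
    return boot_time + timedelta(milliseconds=msg_time_ms)

# Example: a WM_WTSSESSION_CHANGE queued 8145250 ms after a (made-up) boot time
boot = datetime(2008, 9, 21, 8, 0, 0)
print(msg_time_to_wallclock(8145250, boot))  # 2008-09-21 10:15:45.250000
```

Note that if the system has been up longer than 49.7 days, the tick count will have wrapped, and the recovered times will be ambiguous.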

I want to stress that we simply got lucky in this case by finding such a wealth of information in the message queue. It would be unwise to rely on every system having a misbehaving application like the one in this example; on the NIST images, for example, and on my own (clean) test VMs, there were no messages found on the system at all–the applications involved had processed them, and so they were no longer queued. Still, on real-world systems, buggy applications may be more common, so this trick could come in handy.

To look at the message queues in a memory dump, you can use a small plugin for Volatility that I've created. It enumerates all threads on the system and lists any messages it finds. It has an internal table mapping numeric IDs to names for a large number of standard window messages; however, it is very likely that the list is not complete. In addition, applications can define their own message types; in these cases, interpreting the message is impossible without analyzing the program. Caveats aside, I'm happy to offer the plugin for download. As usual, drop it into your "memory_plugins/" directory, and then run it with:
$ python volatility thread_queues -f [IMAGE]
As with the SSDT plugin, thread_queues requires my list-walking library, which should be placed in the "forensics/win32/" directory.

If you want to know how the plugin works internally, read on!

Each thread in Windows (represented by the _ETHREAD structure) has a field in its Thread Control Block (or Tcb, which is a _KTHREAD) called Win32Thread. This field points to a data structure, _W32THREAD, which is defined in the kernel-mode portion of the Windows graphical subsystem, win32k.sys. You can actually examine the _W32THREAD structure by issuing "dt win32k!_W32THREAD" in WinDbg; however, if you start reverse engineering win32k.sys, you'll quickly find that the information given there is far from complete. In fact, _W32THREAD is a much larger data structure, which includes information about the current desktop, keyboard layout, installed window hooks, and, most importantly for us, the input message queue. You can read more about the internals of win32k in Alex Ionescu's BlackHat talk from this year; however, most of the details are still unknown, and the best way to find out about them is to reverse engineer win32k.sys.

In my case, I found the answers I wanted in the PostMessage() function. To make a (very) long story short, PostMessage is defined as:

PostMessage(PWND window, ulong msg, ulong wParam, ulong lParam)

When called, it eventually allocates a new entry in the thread queue using AllocQEntry, places the message information into it with StoreQMessage (which also adds the timestamp and current cursor position), and finally tells the thread that it should wake up because there's a message pending (using SetWakeBit). In Windows XP SP2, the message queue is found at offset 0xD0 of _W32THREAD, and looks like:
typedef struct _MSG_QUEUE {
    unsigned long NumberOfMessages;
    struct _MSG_QUEUE_ENTRY *pHead;   // first queued message (name mine)
    struct _MSG_QUEUE_ENTRY *pTail;   // last queued message (name mine)
} MSG_QUEUE;

Each queue entry, in turn, looks like:

typedef struct _MSG_QUEUE_ENTRY {
    struct _MSG_QUEUE_ENTRY *pNext;
    struct _MSG_QUEUE_ENTRY *pPrev;   // name mine
    struct _MSG {
        unsigned long hWnd;
        unsigned long ulMsg;
        unsigned long wParam;
        unsigned long lParam;
        unsigned long time;           // In milliseconds
        POINT pt;                     // Mouse cursor position
        unsigned long dwUnknown;
        unsigned long dwExtraInfo;
    } Msg;
} MSG_QUEUE_ENTRY;

(These structures are reverse engineered, so the field names are my own inventions.)
So for each thread, we can just start at the head of the queue, and keep following pNext until we get to the end, at which point pNext will be NULL.
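As a sketch of that walk, here's a toy Python version operating over a fake flat memory buffer. The field layout follows the structures above; the addresses and message contents are invented for illustration (0x2B1 is WM_WTSSESSION_CHANGE):

```python
import struct

# Each queue entry: pNext, pPrev, then the MSG payload
# (hWnd, ulMsg, wParam, lParam, time, pt.x, pt.y, dwUnknown, dwExtraInfo)
ENTRY_FMT = "<11I"

def walk_queue(mem, head_addr):
    """Follow pNext from the head of the queue until it is NULL,
    yielding (ulMsg, time) for each queued message."""
    addr = head_addr
    while addr:
        fields = struct.unpack_from(ENTRY_FMT, mem, addr)
        pnext, ulmsg, msgtime = fields[0], fields[3], fields[6]
        yield ulmsg, msgtime
        addr = pnext

# Build a toy "memory image" with two chained entries at 0x40 and 0x80
mem = bytearray(0x100)
struct.pack_into(ENTRY_FMT, mem, 0x40, 0x80, 0, 0x1234, 0x2B1, 0, 0, 1000, 5, 5, 0, 0)
struct.pack_into(ENTRY_FMT, mem, 0x80, 0, 0x40, 0x1234, 0x2B1, 0, 0, 2000, 5, 5, 0, 0)
print(list(walk_queue(mem, 0x40)))  # [(689, 1000), (689, 2000)]
```

The real plugin does essentially this, except that addresses are resolved through the process's address space rather than a flat buffer.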

One quick note of warning, though: when you're exploring these structures, you may initially think that all of the information is paged out, because none of the memory addresses seem valid. As it turns out, although Win32Thread and its friends all live in kernel space, this portion of kernel space is not visible from all threads. This flies in the face of a very common assumption in Windows memory analysis–that the kernel portion of memory looks the same to every process. In fact, the portions related to the GUI subsystem are only visible from processes that have at least one GUI thread! So, in particular, the System process, which is what's most commonly used in Volatility to access kernel memory, can't see any of the structures we're interested in. In my plugin, I make sure to use the address space of the process that owns the threads we're examining, so that the GUI structures will be accessible if the thread is a GUI thread.

There's a lot more cool stuff you can do once you start digging into the kernel-mode portion of the Windows graphics system. In a future post, I'll describe how we can enumerate all the windows on the system and list their titles and positions, and (if I can figure out how) possibly even extract a screenshot of the desktop at the time the memory was captured. Though this last bit doesn't have much forensic value, it would be pretty slick to be able to see exactly what was on the desktop for public images like the DFRWS challenge images or the NIST reference images.

Wednesday, August 20, 2008

Auditing the System Call Table

When malicious, kernel-level code is installed on the system, one action it may take is to hook various system services. What this means is that it takes some standard piece of operating system functionality and replaces it with its own code, allowing it to alter the way all other programs use the OS. For example, it may hook functions involved in opening registry keys, and modify their output so as to hide registry keys the rootkit uses. As system calls are the primary interface between user and kernel mode, the system call table is a popular place to do such hooking.

It's worth noting that many security products also make heavy use of hooking. One common example is antivirus software; among the many functions it hooks is NtCreateProcess (used, as the name suggests, to start a new process) so that it can do its on-demand scanning of any newly launched programs. For this reason, it's not safe to assume that any hooking of system calls is malicious; in fact, some of the most suspicious-looking things initially often turn out to be security software.

Still, it may be quite useful to be able to examine the system call table of a memory image during an investigation, in order to detect any hooks that shouldn't be there. To do this, we'll first look at how system calls work in Windows and lay out the data structures that are involved. I'll then describe a Volatility plugin that examines each entry in the system call table, gives its symbolic name, and then tells what kernel module owns the function it points to. If you want to skip the learning experience and get straight to the plugin, you can download it here and place it in your memory_plugins directory. You'll also need to get my library for list walking and place it in "forensics/win32".

If you look at any of the native API functions, like ZwCreateFile, you'll notice that they all look almost identical:
lkd> u nt!ZwCreateFile
804fd724 b825000000 mov eax,25h
804fd729 8d542404 lea edx,[esp+4]
804fd72d 9c pushfd
804fd72e 6a08 push 8
804fd730 e83cf10300 call nt!KiSystemService (8053c871)
804fd735 c22c00 ret 2Ch
We see that the function just places the value 0x25 into eax, points edx at the stack, and calls nt!KiSystemService. It turns out that this value, 0x25, is the system call number that corresponds to the CreateFile function.

Without going into too much detail about how KiSystemService works, the function essentially takes the value in the eax register, and then looks up that entry in a global system call table. The table contains function pointers to the actual kernel-land functions that implement that system call.
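In Python terms, the dispatch amounts to nothing more than an indexed lookup. The sketch below models the table as a dictionary of callables; the syscall numbers shown are the XP SP2 values, and the handler bodies are obviously stand-ins:

```python
# A miniature model of what KiSystemService does: the value in eax
# indexes a table of function pointers (here, plain Python callables).
service_table = {
    0x25: lambda: "NtCreateFile called",   # 0x25 = NtCreateFile on XP SP2
    0x29: lambda: "NtCreateKey called",    # number is also XP SP2-specific
}

def ki_system_service(eax):
    """Look up the handler for syscall number eax and invoke it."""
    return service_table[eax]()

print(ki_system_service(0x25))  # NtCreateFile called
```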

But, of course, the situation isn't quite as simple as that. In fact, Windows is designed to allow third party developers to add their own system calls. To support this, each _KTHREAD contains a member named ServiceTable which is a pointer to a data structure that looks like this:

typedef struct _SERVICE_DESCRIPTOR_ENTRY {
    PVOID KiServiceTable;
    PULONG CounterBaseTable;
    LONG ServiceLimit;
    PUCHAR ArgumentTable;
} SERVICE_DESCRIPTOR_ENTRY;

typedef struct _SERVICE_DESCRIPTOR_TABLE {
    SERVICE_DESCRIPTOR_ENTRY Descriptors[4];  // one entry per service table
} SERVICE_DESCRIPTOR_TABLE;

(These structures are undocumented; the wrapper names here follow the ones commonly used in the reverse engineering literature.)
As you can see, we can actually have up to four separate system service tables per thread! In practice, however, we only see the first two entries in this array filled in: the first one points to nt!KiServiceTable, which contains the functions that deal with standard OS functionality, and the second points to win32k!W32pServiceTable, which contains the functions for the GDI subsystem (managing windows, basic graphics functions, and so on). For system call numbers up to 0x1000, the first table is used, while for the range 0x1000-0x2000 the second table is consulted (this may generalize for 0x2000-0x3000 and 0x3000-0x4000, but I haven't tested it).
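A small sketch of how a syscall number gets split, assuming (as described above) that the bits above the low 12 select the table; the generalization past index 1 is untested, as noted:

```python
def split_syscall_number(num):
    """Split a system call number into (table index, entry index).
    Bits 12-13 pick the service table; the low 12 bits index into it."""
    return (num >> 12) & 0x3, num & 0xFFF

print(split_syscall_number(0x25))    # (0, 37)  -> nt!KiServiceTable
print(split_syscall_number(0x1000))  # (1, 0)   -> win32k!W32pServiceTable
```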

To take a look at the contents of these two tables, we can use the dps command in WinDbg, which takes a memory address and then attempts to look up the symbolic name of each DWORD starting at that address. To examine the full table, you should pass dps the number of DWORDS you want to examine -- the exact number will be the value found in the ServiceLimit member for the table you're interested in. For example:
lkd> dps nt!KiServiceTable L11c
805011fc 80598746 nt!NtAcceptConnectPort
80501200 805e5914 nt!NtAccessCheck
80501204 805e915a nt!NtAccessCheckAndAuditAlarm
80501208 805e5946 nt!NtAccessCheckByType
...
8050128c 8060be48 nt!NtCreateEventPair
80501290 8056d3ca nt!NtCreateFile
80501294 8056bc5c nt!NtCreateIoCompletion
Note that NtCreateFile is the 0x25th entry in the table, as we expected. On a system with no hooks installed, all functions in nt!KiServiceTable will point into the kernel (ntoskrnl.exe), and all functions in win32k!W32pServiceTable will be inside win32k.sys. If they don't, it means the function has been hooked.
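The ownership check itself is simple range arithmetic. Here's a hedged Python sketch with made-up module bases and sizes; in the real plugin these come from the loaded module list in the memory image:

```python
# Toy module ranges (name, base, size); real ones would be gathered
# from the loaded module list (PsLoadedModuleList) in the image.
modules = [
    ("ntoskrnl.exe", 0x804D7000, 0x1F8000),
    ("win32k.sys",   0xBF800000, 0x1C0000),
]

def find_owner(ptr):
    """Return the name of the module whose range contains ptr."""
    for name, base, size in modules:
        if base <= ptr < base + size:
            return name
    return "UNKNOWN"  # points outside known modules: likely a hook

print(find_owner(0x8056D3CA))  # ntoskrnl.exe
print(find_owner(0xF87436F0))  # UNKNOWN
```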

The plugin for Volatility, then, works as follows. First, we go over each thread in each process, and gather up all distinct pointers to service tables. We examine all of them in case one thread has had its ServiceTable changed while the others remain untouched. Then we display each entry in each (unique) table, along with the name it usually has (in an unhooked installation), and what driver the function belongs to. Here's some sample output:
$ python volatility ssdt -f xp-laptop-2005-07-04-1430.img
Gathering all referenced SSDTs from KTHREADs...
Finding appropriate address space for tables...
SSDT[0] at 804e26a8 with 284 entries
Entry 0x0000: 0x805862de (NtAcceptConnectPort) owned by ntoskrnl.exe
Entry 0x0001: 0x8056fded (NtAccessCheck) owned by ntoskrnl.exe
Entry 0x0002: 0x8058945b (NtAccessCheckAndAuditAlarm) owned by ntoskrnl.exe
...
Entry 0x0035: 0xf87436f0 (NtCreateThread) owned by wpsdrvnt.sys
...
SSDT[1] at bf997780 with 667 entries
Entry 0x1000: 0xbf93517d (NtGdiAbortDoc) owned by win32k.sys
Entry 0x1001: 0xbf946c1f (NtGdiAbortPath) owned by win32k.sys
Here we can see that the NtCreateThread function has been hooked by wpsdrvnt.sys. A little Googling shows that this driver is a part of Sygate Personal Firewall -- as mentioned before, security products are the most common non-malicious software that hooks kernel functions.

In closing, I should mention one caveat to using this tool: at the moment, the names of the system calls are hardcoded with the values derived from WinDbg on Windows XP SP2. As demonstrated by the Metasploit System Call Table page, the order and number of entries in the system call table change between different versions of Windows, so make sure that you only analyze SP2 images with this plugin! As always, patches are welcome if you want to adapt this to deal with other versions of Windows.

Now go forth, and catch those rootkits!

Sunday, August 17, 2008

Introducing Volshell

This one's for all the command line lovers out there: I'm happy to release volshell, an interactive shell built on Python and designed with memory analysis research in mind. I gave a demo of this at my OMFW talk, "Interactive Memory Exploration with Volatility"; since it was more of a live demo, I don't have slides from that, but you can find my notes here. You should be able to follow the notes as a sort of walkthrough that will get you up and running with volshell, and introduce some of the more advanced features.

Briefly, here are some of the features of volshell:
  • Shell is a full Python interpreter, so all the power of Python can be leveraged.
  • Uses Volatility 1.3 object model for easy access to data structures in memory.
  • Can use iPython for the underlying shell if available, which enables some nice features.
  • Commands modelled after WinDbg.
  • Works with any memory image format that Volatility supports (dd, crash, vmem, hibernation file)
To use it, just download and drop it in your memory_plugins directory in Volatility 1.3. Then start the shell with:

$ python volatility volshell -f $IMAGE


Saturday, August 16, 2008

Linking Processes to Users

In the course of an investigation, it may be critical to be able to link up a process that's running to a particular user account. Particularly in a multi-user environment such as Windows Terminal Server, this isn't always as easy as checking who was logged in at the time.

Luckily, each process in Windows has an associated token, a chunk of metadata that describes what Security Identifier (SID) owns the process and what privileges have been granted to it. As Larry Osterman explains, a SID is essentially a unique ID that is assigned to a user or group, and is broken into several parts: the revision (currently always set to 1), the identifier authority (describing what authority created the SID, and hence how to interpret the subauthorities), and finally a list of subauthorities.

In general, when users see SIDs (which they rarely do), they are in what's called the Security Descriptor Definition Language (SDDL) form. This is a string that looks like:

S-1-5-21-1957994488-484763869-854245398-1006

Here, "1" is the revision, "5" is the identifier authority, and the remaining portions are the subauthorities. The exact data structure for a SID is:
typedef struct _SID {
    BYTE Revision;
    BYTE SubAuthorityCount;
    SID_IDENTIFIER_AUTHORITY IdentifierAuthority;
    DWORD SubAuthority[ANYSIZE_ARRAY];
} SID;

The SID_IDENTIFIER_AUTHORITY here is actually an array of 6 characters. However, at the moment only the final character is used. Osterman's article enumerates all the possible identifier authorities; for our purposes we will be focusing on the NT authority, which is {0,0,0,0,0,5}. This is the authority which describes accounts managed by the NT security subsystem.

So how can we get this information from a memory image? We start, as usual, by looking at the _EPROCESS structure. Inside it, we find the Token member at offset 0xc8. However, the member doesn't directly point to an object of type _TOKEN, as we might expect. Instead, it is described as an _EX_FAST_REF. These types of objects are basically an optimization used by Windows to store both a pointer to an object and the object's reference count in a single DWORD. In an _EX_FAST_REF, the last 3 bits are co-opted to encode the reference count of the object. To get the actual pointer, you can mask off the last 3 bits, like so:
token_address = proc.Token.Value & ~0x7
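A quick sketch of unpacking an _EX_FAST_REF value into both halves (the sample value is invented):

```python
def ex_fast_ref(value):
    """Split an _EX_FAST_REF DWORD into (object pointer, embedded count).
    The object is 8-byte aligned, so the low 3 bits are free to hold
    reference count state; masking them off recovers the pointer."""
    return value & ~0x7, value & 0x7

ptr, cnt = ex_fast_ref(0xE1A1B2C5)
print(hex(ptr), cnt)  # 0xe1a1b2c0 5
```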

Now, on to the token itself. Each token contains a list of user and group SIDs. The relevant members of the _TOKEN structure (for our immediate purpose) are UserAndGroupCount (unsigned long) and UserAndGroups (pointer to an array of _SID_AND_ATTRIBUTES), at offsets 0x4c and 0x68, respectively. _SID_AND_ATTRIBUTES, in turn, contains a pointer to the SID itself and a DWORD of flags giving the SID's attributes (the meaning of which is dependent on the type of SID; for group SIDs, the flags can be found in winnt.h).
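Putting the structure together, here's a small Python sketch that parses a raw in-memory SID into its SDDL string form, using the well-known Administrators SID (S-1-5-32-544) as a test value:

```python
import struct

def sid_to_sddl(raw):
    """Parse a binary SID and render it in SDDL ('S-1-5-...') form."""
    revision, count = raw[0], raw[1]
    # Identifier authority: 6 bytes, stored big-endian
    authority = int.from_bytes(raw[2:8], "big")
    subs = struct.unpack_from("<%dI" % count, raw, 8)
    return "S-%d-%d" % (revision, authority) + "".join("-%d" % s for s in subs)

# The Administrators group, S-1-5-32-544, as it would appear in memory:
# revision 1, 2 subauthorities, NT authority {0,0,0,0,0,5}, then 32, 544
raw = bytes([1, 2, 0, 0, 0, 0, 0, 5]) + struct.pack("<2I", 32, 544)
print(sid_to_sddl(raw))  # S-1-5-32-544
```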

Unfortunately, just having the SIDs by themselves may not be so meaningful to you. Actual account names would be better; luckily, these can be found by looking in the registry. The key HKLM\SOFTWARE\Microsoft\Windows NT\CurrentVersion\ProfileList contains all the local user account SIDs on the machine, along with the location of their profile on disk. The username can usually be inferred from this; e.g. a profile directory of %SystemDrive%\Documents and Settings\Bob would imply that the username is Bob.

Aside from the individual account SIDs, there are also a number of well-known group SIDs. These are SIDs that Microsoft has set aside for specific purposes, and will be the same on any Windows machine. A full list is available in KB243330, "Well-known security identifiers in Windows operating systems".

As a reward for sitting through all that dry description, here's a nice juicy tool that you can use to get started looking at the SIDs associated with a process in a memory image, written as a plugin for the just-released Volatility 1.3. You can download it here. To use it, just drop it in your plugins directory (memory_plugins) and run:
$ python volatility getsids -f xp-laptop-2005-07-04-1430.img
dd.exe (3300): S-1-5-21-1957994488-484763869-854245398-1006
dd.exe (3300): S-1-5-21-1957994488-484763869-854245398-513 (Domain Users)
dd.exe (3300): S-1-1-0 (Everyone)
dd.exe (3300): S-1-5-32-544 (Administrators)
dd.exe (3300): S-1-5-32-545 (Users)
dd.exe (3300): S-1-5-4 (Interactive)
dd.exe (3300): S-1-5-11 (Authenticated Users)
dd.exe (3300): S-1-5-5-0-49673 (Logon Session)
dd.exe (3300): S-1-2-0 (Users with the ability to log in locally)

Volatility 1.3 is out!

After tons of hard work by a lot of people (including me), Volatility 1.3 has been released to the world at large. AAron has a blog post up with all the juicy details, including a list of new features. For my part, I want to take this opportunity to focus on a couple of new things that I think are really cool. They're mostly developer-focused, so if you have no interest in adding new capabilities to Volatility, you can skip the rest of this entry and just head over to AAron's post to see all the new modules and functionality.

The first feature I want to point out is the new plugin system. Basically, rather than creating a new module and then editing the Volatility source to register new commands, you can now just create a class extending forensics.commands.command with the code you want to run, drop it into a file, and put that file into the memory_plugins directory; Volatility will pick it up and see it as a new command that can be run. This means that a developer can hand out a single file that anyone can use in Volatility with minimal effort.

In addition, it removes the need to manually integrate new modules from contributors into the source tree; instead, plugins can be developed and distributed independently, without relying on the Volatility devs at all. I'm hoping that this capability will allow a cool little cottage industry of plugin development to form around Volatility, in much the same way that users of EnCase currently trade EnScripts.

Second, I want to describe the new object model. Volatility 1.3 contains a new way of working with data structures in memory dumps. Each data structure defined in the profile can now be instantiated at a given memory address as a full-fledged Python object, and the data inside it can be accessed using standard Python syntax. No need to use read_obj again! For example, to print the size of a process located at address 0x823c87c0, we can do:

eprocess = Object('_EPROCESS', 0x823c87c0,
                  addr_space, None, profile=Profile())
print eprocess.VirtualSize

In addition, each structure can be given object-specific behaviors by subclassing the main Object class. For example, the _UNICODE_STRING type's Buffer member is a pointer to an unsigned short. By creating a specific _UNICODE_STRING class (as is done in memory_objects/Windows/), we can cause the Buffer member to be returned to the user as a Python string with the correct string data, automatically translated from Unicode.

Hopefully, with these new features, developing cool stuff for Volatility will be easier than ever. I know that for myself, I've found that it's now orders of magnitude faster to go from "Wouldn't it be neat if ..." to a full, working plugin. In the coming weeks, I'll be writing some posts introducing the new development features in more depth, so that as many people as possible can get involved. Remember, the power of Volatility is in its community!

Friday, August 15, 2008

Sorry for the Hiatus!

It's been quite a while since I wrote any new blog posts. This isn't entirely because I've been lazy; rather, I've picked up and relocated to sunny (and often hot and humid) Atlanta, Georgia to start the PhD program at Georgia Tech. I'm going to be working on lots of cool stuff with the Georgia Tech Information Security Center.

Now that I'm here and starting to get settled in, you can expect the blog posts to start up again. Particularly with the forthcoming release of Volatility 1.3, I'm going to have a lot of new plugins and functionality to blog about.

As a teaser, here are some of the things I've got in the works:

  • -- get the SID (kind of like a user ID in unix) that owns each process

  • -- extract loaded kernel modules from memory

  • -- list recently unloaded kernel modules

  • -- show the System Service Descriptor Table, along with the kernel module that owns the memory. This can be used to detect hooking, legitimate and otherwise.

  • -- an interactive shell designed for exploration of memory images (presented at OMFW; note that this is aimed mainly at memory forensics researchers)

  • -- extracts a list of window handles and titles by using some reverse-engineered GDI structures in the kernel

I'm planning on accompanying these with posts describing the technical details of how they work. Also, as soon as I get the code ported to 1.3, I'll be releasing the code I wrote to extract registry information (as presented in my DFRWS paper).

I also spoke with Michael Cohen, the creator of PyFlag, at DFRWS, and it sounds like he's interested in integrating in-memory registry support into PyFlag through Volatility. This will let users access the registry data in a memory dump through the PyFlag VFS, and perform queries and correlation on the registry data. This will be, I believe, "wicked awesome" (technical term).

Hopefully this has given you a taste of things to come, and gotten you good and excited about 1.3 (the amazing features of which I'll also be writing about soon). Stay tuned!

Wednesday, May 7, 2008

Parsing Windows Minidumps

When a user-mode application crashes in Windows, a built-in debugger known as "Dr. Watson" steps in and captures some basic information that can be sent back to developers to help debug the crash. As part of this process, it creates what's called a minidump that contains portions of the process's memory and a great deal of extra information about the state and attributes of the process. Among the information available is:

  • CPU state for each thread.

  • A list of loaded modules, including their timestamps.

  • The Process Environment Block (PEB) for the process.

  • Basic system information, such as the build number and service pack level of the operating system.

  • The process creation time, and how long it has spent executing in kernel and user space.

  • Detailed information on the exception that was raised.

Using the userdump.exe utility provided by Microsoft, it is also possible to take a complete snapshot of the memory of any running process. This tool also, as it turns out, stores its output using the minidump format. Minidumps made with this tool, in addition to all the information available in a standard minidump, include the full process memory, and (with the -w option), a list of window handles as well.

Unlike many Microsoft formats, the minidump container format is actually fully documented by Microsoft. The relevant data structures and constants can all be found in dbghelp.h, and explanations of each field can be found on MSDN. The basic structure of the file is simple: it starts with the MINIDUMP_HEADER, which gives the offset of the stream directory (a list of MINIDUMP_DIRECTORY structures). Each directory entry has a type code (indicating what the stream is for), the size of the stream, and the offset in the file where the stream begins. Don't be scared by the use of the term "relative virtual address" (RVA) in the Microsoft documentation; in this context, it just means "offset from the beginning of the file".
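As a sketch, walking the header and directory takes only a few struct calls. The example below builds a tiny synthetic dump in memory rather than reading a real file, so the stream sizes and offsets are invented:

```python
import struct

def parse_directory(data):
    """Read the MINIDUMP_HEADER to find the stream directory, then
    return a list of (StreamType, DataSize, Rva) directory entries."""
    sig, version, nstreams, dir_rva = struct.unpack_from("<4sIII", data, 0)
    assert sig == b"MDMP"
    return [struct.unpack_from("<III", data, dir_rva + 12 * i)
            for i in range(nstreams)]

# Synthetic one-stream dump: directory at offset 32, describing a single
# Memory64ListStream (type code 9) of 16 bytes at offset 44
dump = struct.pack("<4sIII", b"MDMP", 42899, 1, 32)
dump += b"\x00" * (32 - len(dump))          # rest of the header
dump += struct.pack("<III", 9, 16, 44)      # one MINIDUMP_DIRECTORY entry
print(parse_directory(dump))  # [(9, 16, 44)]
```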

The format is not only openly documented, it is also extensible: any application can add a new stream type (using the type codes above the reserved range 0x0000-0xFFFF) and thereby include any sort of extra data in the minidump. The open-source, cross-platform crash reporter, Google Breakpad, actually uses the minidump container as its native crash dump format on all platforms. The project's source includes a set of C++ classes that can parse and work with minidump files, which can be instructive in clearing up any ambiguities in the MS-provided documentation. One final (and somewhat unexpected) source of information is the United States patent on generating minidumps. Putting aside the fact that patenting the process of saving some context to a container format after a crash seems pretty silly, the patent description is full of interesting technical details.

For memory analysis purposes, it is useful to understand the minidump format, as it is the format used by the userdump utility to save the full address space of a process. For minidumps written by userdump.exe, the actual memory ranges are described in the Memory64ListStream stream (type code 9). The stream gives the base offset in the file where the process's memory can be found, and then has a list of structures that give the size and virtual address of each memory region. (It is not necessary to give the file offset for each memory range, since they are all contiguous; the second memory range described appears in the file directly after the end of the first.) Additional information on each memory range is found in the MemoryInfoListStream, which lists the protection attributes (read-only, writable, executable), state (free, reserved, or committed) and type (image, mapped file, or private allocation) for each range addressable by the process.
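The contiguity rule makes recovering each range's file offset a simple running sum. A sketch, with invented base offset and ranges:

```python
def range_file_offsets(base_rva, ranges):
    """Given the stream's base file offset and a list of
    (start_va, size) pairs, compute where each range lives in the
    file: ranges are stored back-to-back, so each one starts where
    the previous one ended."""
    out, offset = [], base_rva
    for start_va, size in ranges:
        out.append((start_va, size, offset))
        offset += size
    return out

ranges = [(0x00010000, 0x1000), (0x00400000, 0x2000), (0x7FFD0000, 0x1000)]
for start_va, size, offset in range_file_offsets(0x2000, ranges):
    print("VA %08x size %05x at file offset %x" % (start_va, size, offset))
```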

From this information we can reconstruct the entire memory space for a given process, and then examine its virtual address space to find interesting artifacts, such as its list of loaded modules (accessible through the Process Environment Block, or PEB) or any application-specific data it was working with (a notable example would be passwords or encryption key data, as demonstrated at CanSecWest this year). It should be fairly easy to create an address space class within Volatility that can read minidumps, at which point any of the Volatility modules that work with user-mode data (currently just dlllist, but more are expected in the future) will be usable on minidumps generated by userdump.exe.

Rather than go into the gory details of the data structures involved in parsing each stream, I have decided to simply release a library written using Python and Construct. The library can be downloaded here; currently every stream type listed in Microsoft's documentation is fully parsed. The library also supports the "Window Handle" stream created by userdump.exe (stream type 0x10000), although some fields are still unknown as they are undocumented (specifically, there are four unknown DWORDS that I have been unable to decipher -- if anyone has any suggestions as to the structure, I would love to hear them!).

You can also run the library as a command-line program, and it will print out the entire parsed structure of the minidump, including thread context, open handles, system information, and loaded modules. Enjoy!

Monday, May 5, 2008

DFRWS 2008 - Registry Forensics in Memory

I'm ecstatic to announce that my paper, Forensic Analysis of the Windows Registry in Memory, has been accepted into the 8th annual Digital Forensics Research Workshop! The full program is available, and it looks like there are a lot of really cool presentations scheduled. Memory analysis is heavily featured this year, and has been given a whole session.

As usual, all the papers will be posted on the DFRWS website once the conference begins. Until then, here's the abstract of my paper:

This paper describes the structure of the Windows registry as it is stored in physical memory. We present tools and techniques that can be used to extract this data directly from memory dumps. We also provide guidelines to aid investigators and experimentally demonstrate the value of our techniques. Finally, we describe a compelling attack that modifies the cached version of the registry without altering the on-disk version. While this attack would be undetectable with conventional on-disk registry analysis techniques, we demonstrate that such malicious modifications are easily detectable by examining memory.

I also want to give my ongoing thanks to AAron Walters, who helped me out a ton by providing comments and suggestions on drafts of the paper. He continues to do great work that enriches the entire memory analysis community.

If you're going to be at DFRWS '08 and want to meet up, drop a note in the comments or send me an e-mail! See you in Baltimore!

Wednesday, April 16, 2008

Finding Kernel Global Variables in Windows

When performing memory analysis of Windows systems, there are a number of kernel variables that are extremely helpful in determining the state of the operating system. For example, the global variable PsActiveProcessHead is a pointer to the start of the kernel's list of _EPROCESS structures, and PsLoadedModuleList points to the list of currently loaded kernel modules (i.e., drivers). Unfortunately, these variables are not among those exported by the kernel, so finding them isn't as simple as looking them up in the export table of ntoskrnl.exe.

How, then, do tools like Volatility find these addresses so that they can produce process listings from memory dumps? The simplest technique is just to hard-code the address of each variable. The "basic" version of Volatools does exactly this; if you open up its source, you will see the hardcoded values for PsLoadedModuleList, PsActiveProcessHead, PsIdleProcess, and HandleTableListHead. The downside of this is that the addresses of these variables are only constant for a particular version of Windows at a given patch level, and even a single hotfix can change them and render your tools useless.

Much more reliable is what's known as the KPCR trick. First discovered by Edgar Barbosa (Opc0de) in his paper "Finding some non-exported kernel variables in Windows XP", and expanded upon by Alex Ionescu in a post entitled "Getting Kernel Variables from KdVersionBlock, Part 2", the trick takes advantage of a data structure in Windows known as the Kernel Processor Control Region, or KPCR. The KPCR is a data structure the Windows kernel uses to store information about each processor, and it is located at virtual address 0xffdff000 in XP, 2003, and Vista. What Opc0de noticed is that between Windows 2000 and Windows XP, one of its fields, Reserved2, was renamed to KdVersionBlock. After some investigation, it turned out that this field pointed to a _DBGKD_GET_VERSION64 structure, which in turn contained a linked list of _KDDEBUGGER_DATA64 structures.

The _KDDEBUGGER_DATA64 structure is used by the kernel debugger to easily find out the state of the operating system -- in order to provide meaningful information about the cause of a crash, the debugger needs many of the same pieces of information that we do when performing forensics: what processes were active, which handles to objects were held, and so on. In addition, Microsoft would prefer to only support one version of WinDbg, rather than having to deal with many different builds for different versions and service packs of the operating system, so the data in the structure is relatively stable. And so this data structure contains the memory addresses of a large number of kernel variables (around 70 in Windows Vista).

The fact that this structure can be found at a constant offset and contains such a wealth of data makes the KPCR method a very attractive option on XP and above. Also, fields inside the structure do not move around between versions of the operating system (there is even a stern warning in WDbgExts.h not to add or remove fields in the middle of the structure), though fields may be added at the end. Volatility uses this technique to find many kernel global variables while supporting different versions of Windows. For a full list of the variables that can be found using this method, consult WDbgExts.h in the WinDbg SDK.
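As a quick illustration, the pointer-chasing involved can be sketched in Python. This is a hedged sketch, not Volatility's actual code: the offsets (KdVersionBlock at +0x34 in the x86 XP _KPCR, DebuggerDataList at +0x20 in _DBGKD_GET_VERSION64) and the read_dword helper are my own assumptions for illustration.

```python
KPCR_VA = 0xFFDFF000        # fixed _KPCR virtual address on x86 XP/2003/Vista
KD_VERSION_BLOCK = 0x34     # assumed offset of _KPCR.KdVersionBlock (x86 XP)
DEBUGGER_DATA_LIST = 0x20   # assumed offset of _DBGKD_GET_VERSION64.DebuggerDataList

def find_kdbg(read_dword):
    """Follow KPCR -> KdVersionBlock -> DebuggerDataList to the first
    _KDDEBUGGER_DATA64. read_dword is any callable that reads a 32-bit
    value at a virtual address (on 32-bit Windows the 64-bit pointer
    fields hold ordinary 32-bit pointers in their low dwords)."""
    version_block = read_dword(KPCR_VA + KD_VERSION_BLOCK)
    data_list = read_dword(version_block + DEBUGGER_DATA_LIST)
    return read_dword(data_list)   # first entry in the debugger data list
```

In practice read_dword would be backed by a memory image and the virtual-to-physical translation of the address space being analyzed.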

Unfortunately, as mentioned above, Windows 2000 does not have a pointer to the _KDDEBUGGER_DATA64 stored in the KPCR. However, I have found that this structure still exists in Windows 2000 and contains the same useful data, but we must do a little bit more work to find it. To locate the data we need in Windows 2000, we will develop a signature that reliably locates the structure, and then simply do a linear scan of physical memory for it.

The debugger data block starts with a standard header (reproduced here from WDbgExts.h, minus comments):

typedef struct _DBGKD_DEBUG_DATA_HEADER64 {
    LIST_ENTRY64 List;
    ULONG OwnerTag;
    ULONG Size;
} DBGKD_DEBUG_DATA_HEADER64, *PDBGKD_DEBUG_DATA_HEADER64;

Since the LIST_ENTRY is 64-bit, on 32-bit systems the last 8 bytes of the list structure will be 0. Perhaps more importantly, the OwnerTag is set to the constant value "KDBG". Finally, the Size is 0x290 bytes on XP and above, and 0x208 bytes on Windows 2000 (thus on Windows 2000 only the variables up through MmLoadedUserImageList in the structure are available).

To search for the debugger data block, then, we can just look for the constant string:

"\x00\x00\x00\x00\x00\x00\x00\x00KDBG"

If we wish to restrict our search to a specific operating system, we can append "\x08\x02" or "\x90\x02" to specify the size for Windows 2000 or XP, respectively.

In theory, it should even be possible to extend this technique to Windows NT4, though I do not have any NT4 memory images to test with. According to WDbgExts.h, however, the structures used would be 32-bit (_KDDEBUGGER_DATA32 and _DBGKD_DEBUG_DATA_HEADER32), so the signature would need some small modifications.
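The scan just described is easy to prototype. Below is a hedged sketch (the helper name and return format are mine): it looks for eight zero bytes, the high halves of the two 64-bit list pointers, followed immediately by the "KDBG" OwnerTag, then reads the Size field that follows the tag.

```python
import re
import struct

# eight zero bytes (upper halves of the 64-bit List pointers) + OwnerTag
KDBG_SIG = re.compile(rb"\x00{8}KDBG")

def scan_for_kdbg(mem):
    """Scan a raw memory buffer for candidate _DBGKD_DEBUG_DATA_HEADER64s.

    Returns (offset, size) pairs, where offset is the start of the
    16-byte List member and size is the header's Size field (0x208 for
    Windows 2000, 0x290 for XP and above)."""
    hits = []
    for m in KDBG_SIG.finditer(mem):
        hdr = m.start() - 8          # back up to the start of List
        if hdr < 0 or hdr + 24 > len(mem):
            continue
        # Size is the ULONG following the 16-byte List and 4-byte OwnerTag
        size = struct.unpack_from("<I", mem, hdr + 20)[0]
        hits.append((hdr, size))
    return hits
```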

So there you have it! Using either the KPCR trick or a signature-based scan, we can find the kernel debugger data block, and from it obtain the addresses of crucial kernel variables. It is worth noting, though, that not all of the kernel's global variables are available through this technique. For example, one variable which we used recently, CmpCacheTable, is nowhere to be found in the debugger data block. In this case, either a heuristic or the public debug symbols will be needed to find the address of the variable.

Tuesday, February 26, 2008

Cell Index Translation

Throughout our previous discussions of registry keys, hives, and so on, we have run into the concept of a cell index several times. They appear in kernel memory in places such as the KeyCell member of the _CM_KEY_CONTROL_BLOCK structure; likewise, all of the structures representing portions of the hive itself (_CM_KEY_NODE, _CM_KEY_VALUE, and so on) have members that point to other structures by cell index rather than by virtual address.

The reason for this is that registry hives live two lives: one as on-disk files stored in %WINDIR%\system32\config\, and another as an in-memory structure used by the Configuration Manager. In the former, cell indices can be thought of essentially as simple offsets from the start of the file; in the latter, however, the situation is more complicated. When registry hives are represented in memory, they must account for the fact that new keys and values may be added at any time. Since space for the hives is allocated out of paged pool, there is no way of guaranteeing that the memory allocated for the hive will remain contiguous.

To solve this problem, the Configuration Manager makes use of what Solomon and Russinovich (Windows Internals pp. 203-207) call cell maps. This is done in a very similar way to virtual to physical address translation on x86. Just as different portions of a virtual address on x86 are actually indices into different tables starting with the page directory, each portion of a cell index is an offset into several tables starting at the Storage.Map member of the appropriate _HHIVE.

Cell Index Translation in Memory

As mentioned before, there are several parts to the cell index: the type (1 bit, stable or volatile), the table index (10 bits), the entry index (9 bits), and finally the offset into the block (12 bits). The following diagram illustrates this layout:

 1 bit   10 bits    9 bits    12 bits
[  V  ][  Table  ][  Entry  ][ Offset ]

Or, in terms of masks and shifts:

CI_TYPE_MASK  = 0x80000000
CI_TABLE_MASK = 0x7FE00000
CI_ENTRY_MASK = 0x001FF000
CI_OFF_MASK   = 0x00000FFF

To translate a cell index to a virtual address, first start by selecting the appropriate storage map for the hive using the first bit of the cell index. Each cell index can refer to either the stable or volatile map for the hive, stored in the first and second elements of the Storage array of the _HHIVE, respectively. These elements are of type _DUAL and contain several fields:

lkd> dt nt!_DUAL
+0x000 Length : Uint4B
+0x004 Map : Ptr32 _HMAP_DIRECTORY
+0x008 SmallDir : Ptr32 _HMAP_TABLE
+0x00c Guard : Uint4B
+0x010 FreeDisplay : [24] _RTL_BITMAP
+0x0d0 FreeSummary : Uint4B
+0x0d4 FreeBins : _LIST_ENTRY

The one we care about here is the Map member. This plays a role analogous to the page directory in virtual to physical address mapping; it is the first table we will consult when translating an address, and we will always need to be able to locate it if we are to translate any cell index.

The next 10 bits of the cell index encode the table index; this refers to an entry in the _HMAP_DIRECTORY. The _HMAP_DIRECTORY is an array of 1024 (2^10) pointers to _HMAP_TABLEs; the table index lets us know which of these we want.

Next we have 9 bits that give the entry index, a value that selects a particular entry from the _HMAP_TABLE, which is an array of 512 (2^9) _HMAP_ENTRY structures. Finally, each _HMAP_ENTRY looks like:

lkd> dt nt!_HMAP_ENTRY
+0x000 BlockAddress : Uint4B
+0x004 BinAddress : Uint4B
+0x008 CmView : Ptr32 _CM_VIEW_OF_FILE
+0x00c MemAlloc : Uint4B

The BlockAddress member will give us the virtual address in memory of the hbin (remember that hives are structured as a series of bins, which in turn contain cells). At this point, we need only add to it the offset (the final 12 bits) to get the address of the cell corresponding to the data we're looking for.

The only thing left to do before we can make use of the address we've computed is add 4. Why? As it turns out, the cell index actually points to a DWORD at the beginning of the cell that gives the length of the data. So, for example, the cell index of a key will point to the size of the key, rather than the start of the key itself. While this may be quite helpful in C (since you can just read that many bytes directly into the appropriate C struct), in higher level languages it's not necessary. So as the final step of the translation, we add 4 to the virtual address to get the actual start of the data.
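Putting the whole walk together, here is a minimal Python sketch of in-memory cell index translation. It is illustrative only: the read_dword callback, the 16-byte _HMAP_ENTRY stride, and the mask names are my own assumptions based on the structures shown above.

```python
CI_TYPE_MASK  = 0x80000000
CI_TABLE_MASK = 0x7FE00000
CI_ENTRY_MASK = 0x001FF000
CI_OFF_MASK   = 0x00000FFF

def cell_index_to_va(ci, stable_map, volatile_map, read_dword):
    """Translate a cell index to a virtual address.

    stable_map/volatile_map: _HMAP_DIRECTORY addresses taken from
    Storage[0].Map and Storage[1].Map of the _HHIVE.
    read_dword: callable that reads a 32-bit value at a virtual address.
    """
    # bit 31 selects the stable (0) or volatile (1) storage map
    map_va = volatile_map if ci & CI_TYPE_MASK else stable_map
    table  = (ci & CI_TABLE_MASK) >> 21
    entry  = (ci & CI_ENTRY_MASK) >> 12
    offset = ci & CI_OFF_MASK
    # _HMAP_DIRECTORY: array of 1024 pointers to _HMAP_TABLEs
    table_va = read_dword(map_va + 4 * table)
    # _HMAP_TABLE: array of 512 _HMAP_ENTRYs (16 bytes each);
    # BlockAddress is the entry's first member
    block_address = read_dword(table_va + 16 * entry)
    # skip the 4-byte cell size field at the start of the cell
    return block_address + offset + 4
```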

On-Disk Cell Index Translation

When dealing with hive files on disk, things are much simpler. To translate a cell index into a file offset, we just use the following formula:

FileAddress = CellIndex + 0x1000 + 4

Adding 0x1000 is necessary because cell indexes are relative to the first hbin in the file; before the hbins there is an additional 0x1000 byte block known as the base block that contains metadata about the hive as a whole (the layout of this block can be seen by issuing dt nt!_HBASE_BLOCK from within WinDbg). In addition, we add 4 to the address for precisely the same reasons we did when translating cell indexes in memory.

This simpler translation scheme is also used in memory if the Flat flag is set in the _HHIVE structure.
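The on-disk rule is simple enough to state as a one-line function (a sketch; the name is mine):

```python
def cell_index_to_file_offset(ci):
    # 0x1000 skips the base block; 4 skips the cell's size DWORD
    return ci + 0x1000 + 4
```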

Notes on Implementation

In Volatility, the super-cool memory analysis framework from Volatile Systems, we have the concept of stackable address spaces. The "address space" part means that we can have objects that represent various types of memory spaces (such as standard x86 memory space, PAE memory space, and so on); these objects have, at a minimum, read(), vtop(), and is_valid_address() methods that handle reading from the memory space, translating virtual to physical addresses (whatever virtual and physical mean in that space's context), and checking to see if a given address is valid in that space.

The "stackable" part means that address spaces can be layered, with one address space calling on another to actually read the data. One example in Volatility is the IA32PagedMemory class, which deals with virtual addresses, but relies on having some base address space (in most cases, a FileAddressSpace, which just treats any address as an offset in the underlying file) available to read data from.

In the same way, we can see each hive as its own space, with cell indexes playing the same role as virtual addresses in memory. I have implemented two address spaces within Volatility that demonstrate this: HiveAddressSpace, which layers on a virtual memory space like IA32PagedMemory and performs cell map translation, and HiveFileAddressSpace, which uses the simpler file-based translation method and layers on spaces like the FileAddressSpace.

There are several benefits to this approach. First, all of the interfaces used in Volatility to deal with structures (read_obj, read_value, and so on) will work with the new address spaces we have defined. So, for example, assuming we have defined _CM_KEY_NODE, _CHILD_LIST, and _CM_KEY_VALUE in our type definitions, the following code will get us the name of the first value of a given key:

val_count = read_obj(hive_space, types,
    ["_CM_KEY_NODE", "ValueList", "Count"], key_addr)
val_list_ptr = read_obj(hive_space, types,
    ["_CM_KEY_NODE", "ValueList", "List"], key_addr)
val_ptr = read_value(hive_space, "pointer", val_list_ptr)
name_len = read_obj(hive_space, types,
    ["_CM_KEY_VALUE", "NameLength"], val_ptr)
name = read_string(hive_space, types,
    ["_CM_KEY_VALUE", "Name"], val_ptr, name_len)

The other benefit to implementing the translation as an address space is that the code above will work equally well with a HiveAddressSpace or a HiveFileAddressSpace -- meaning we can deal with hives in memory exactly as we do on-disk hives. To cite a relevant example, all of the code in CredDump works perfectly with a HiveAddressSpace object in place of the HiveFileAddressSpace. This means we can seamlessly extract password hashes and other credentials directly from memory! Cool!

What Can be Done Now

At this point, we can locate hives in memory, find open keys, and make sense of the cell indexes that point into the raw hive data in memory. But how can we actually now traverse the keys and values in the registry? It turns out this is actually not a big problem; many people have already done work on parsing raw registry hives, and we can apply these techniques to memory as well. Samba's regfio library and Tim Morgan's RegLookup are good places to start.

Aside from looking at open source projects, I have also discovered that a large amount of information on internal registry structures already exists in the public symbols distributed by Microsoft. Here are the types I know of that correspond to hive data structures:

  • _CM_KEY_NODE

  • _CM_KEY_VALUE

  • _CM_KEY_INDEX

  • _CM_KEY_SECURITY

  • _CM_BIG_DATA

  • _CHILD_LIST

Using these types and Volatility's built-in data access functions, it is pretty simple to write code that can traverse the registry hive in memory or on disk. A very simple implementation that can traverse key nodes and values can be found in framework/win32/ in the CredDump source.

One word of caution, though. As with all tools that work with memory, we must be able to deal with the fact that the data we're looking for may not be there. This is a very real danger with hive files in memory; not only can portions of memory be paged out as usual, but the part of the hive we're looking for may not have ever been loaded into memory at all! Windows XP and above only map in small views of each hive that correspond to the pieces of data they need to access. So while, in general, every key that has a Key Control Block will be mapped in, for any other keys there is no guarantee that the data will be there. Still, with a little bit of careful programming it's not too hard to avoid getting into trouble.

The End

With what we know now about the registry and how it is stored in Windows memory, we should have no trouble extracting all the registry data we can get our hands on from a memory dump. Of course, this is not really the end -- there are still many, many tools that can be developed that examine specific parts of the registry in memory; CredDump is just one small example. However, I believe that all the significant technical challenges involved in accessing registry data in Windows XP memory dumps have been resolved, and so I will leave it to others to make the most of this new capability.

Friday, February 22, 2008

Cached Domain Credentials

When a Windows computer is joined to a domain, authentication of users is performed not against the local SAM database, but by querying the domain controller. From this description, we might be tempted to conclude that there won't be any useful credentials stored in the registry on a machine that is part of a domain; the users and their hashes don't actually exist on the local machine but rather on the domain controller.

As it turns out, however, by default Windows does store domain credentials on client machines. The reason for this is simple: if the domain controller is unavailable for some reason, users would still like to be able to log into their machines using their domain credentials; for this reason Windows caches the domain credentials of the last 10 (by default) users to log on to the machine. The exact number of cached logons is controlled by the value "CachedLogonsCount" under HKLM\Software\Microsoft\Windows NT\CurrentVersion\Winlogon.

Cached Credentials in the Registry

The cached credentials are stored in the SECURITY hive, as with LSA secrets; specifically, they can be found in the values of HKLM\Security\Cache. This key has a number of values, named NL$1 for the first cached account, NL$2 for the second, and so on. However, as we have come to expect in these matters, the data there is not immediately usable.

The first public tool capable of dumping these credentials was CacheDump, by Arnaud Pilon. He was also kind enough to provide a pretty decent explanation of how the whole process worked, in addition to the code, which makes our job at this point much easier.

As Pilon explains, the data in the NL$?? values is formatted as follows:

[ metadata ][ CH ][ T ][ Encrypted Data ]
  64 bytes    16    16     > 100 bytes

The metadata section is cleartext, and contains such information as:

  • Length of the username

  • Length of the domain

  • Length of the full domain name

The actual username, domain name, and hash data are all stored in the encrypted data block. The CH portion of the data is an apparently random 16 byte key that is used to generate the decryption key for the encrypted data.

Decrypting the Cached Data

The decryption process is actually quite simple. First, we obtain the value of NL$KM from the LSA secrets database and decrypt it, as described in an earlier article. Then we generate an RC4 key by taking the HMAC-MD5 of CH, using NL$KM as the key for the hash function. In Python this looks like:

import hmac, hashlib

hmac_md5 =, ch, hashlib.md5)
rc4key = hmac_md5.digest()

The key is then used to decrypt the data in the NL$?? value. (As an aside, this may be the simplest of the obfuscation techniques we have looked at so far.)

Once we have the data decrypted, we can find the cached domain password hash in the first 16 bytes of the decrypted data. The username, domain, and domain name then appear in a contiguous block starting at offset 72 in the decrypted data. We can use the lengths of these fields from the metadata block to extract the strings; note however that they are aligned to 4 bytes, so if a length is not a multiple of 4, the field will be padded with nulls.
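A hedged sketch of that extraction step (the function names are mine; the offsets and alignment rule are the ones given above):

```python
def pad4(n):
    """Round a byte length up to the next multiple of 4."""
    return (n + 3) & ~3

def parse_cache_entry(dec, user_len, dom_len, dns_dom_len):
    """Pull the hash and names out of a decrypted NL$ data block.

    dec: the decrypted data; the three lengths come from the cleartext
    metadata section and count bytes of UTF-16-LE text."""
    dcc_hash = dec[:16]                # cached domain password hash
    off = 72                           # strings start at offset 72
    username = dec[off:off + user_len].decode("utf-16-le")
    off += pad4(user_len)
    domain = dec[off:off + dom_len].decode("utf-16-le")
    off += pad4(dom_len)
    dns_domain = dec[off:off + dns_dom_len].decode("utf-16-le")
    return dcc_hash, username, domain, dns_domain
```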

Wasn't that easy?

And that's really all that there is to it. For the record, the hash algorithm used on the plaintext password is:

MD4(MD4(password) + username)

where both password and username are little-endian UTF-16 strings.

This concludes our brief tour of the various mechanisms used to obscure credentials in Windows. As we have seen, all of the algorithms, once reverse engineered, can be implemented offline; this is precisely what CredDump does. If you have any lingering questions about how all this works, feel free to check out the code, or shoot me an e-mail. Next time we'll be diving right back into memory analysis, and I'll describe how these same techniques we've explored in the past few articles can be used on what portions of the hive remain in memory on Windows systems.

Decrypting LSA Secrets

The LSA secrets store is a protected storage area used by the Local Security Authority (LSA) subsystem in Windows to keep important pieces of information safe from prying eyes. As with Syskey, however, we will see that these secrets are only obfuscated, and once the mechanism is known, we can extract them from the registry with ease. Even without knowing the obfuscation algorithm, it is possible to read the values of this data using the LsarQuerySecret function exported by lsasrv.dll, but only if one has the LocalSystem privilege (for example, if the process is running under the SYSTEM account).

What kind of data is stored in these secrets? Here are a few that I've come across:

  • $MACHINE.ACC: has to do with domain authentication, see KB175468

  • DefaultPassword: password used to logon to Windows if auto-logon is enabled

  • NL$KM: secret key used to encrypt cached domain passwords

  • L$RTMTIMEBOMB_[...]: FILETIME giving the date when an unactivated copy of Windows will stop working

So it could clearly be of great use in some cases to be able to read these secrets. To take one example, if one wanted to know when the system was installed, one could look at the value of L$RTMTIMEBOMB, and then subtract 30 days (Windows gives you 30 days from the time you installed to activate the product before it stops working). And, of course, dumping the cached domain logon credentials requires us to be able to get the value of NL$KM.

LSA Secrets in the Registry

LSA secrets are stored in the SECURITY hive of the registry, in the key SECURITY\Policy\Secrets. Under that key, each secret has its own key, so NL$KM can be found at SECURITY\Policy\Secrets\NL$KM. Each secret key then has several subkeys associated with it: CurrVal, CupdTime, OldVal, OupdTime, and SecDesc. CurrVal and OldVal are the current and previous values of the secret, and CupdTime and OupdTime are the respective timestamps when the values were updated, stored as 64-bit FILETIMEs. SecDesc is presumably a security descriptor for the secret, but I have not verified this empirically.

The values of the CurrVal and OldVal keys are unnamed, and displayed as "(Default)" in regedit. Even once you obtain the values, however, you won't be able to make much sense of them. The reason for this is that as with the hashes in the SAM, LSA secrets are encrypted with a key that is specific to each machine (in fact, as we will see, it is based on the boot key used in SysKey encryption).

Decrypting the Secrets

The encrypted data of each secret has the following form:

  12 bytes     Variable len
[ Metadata ][ Encrypted Data ]

The initial metadata portion does not contain much of interest to us (among other things, it has the size of the block of data, which we already know, since it is specified in the key value data structure). In my code, I ignore it and only deal with the data from offset 0xC (decimal 12) onwards.

So to decrypt the secret's information, we will need two things: an encryption key, and knowledge of which algorithm was used. Luckily, a Chinese reverse engineer known to me only as Eyas[at] has done much of this work for us, and written about the process in an article on XFocus. Unfortunately, the article is in Chinese, but Google Translate does a reasonable job on it, and the included x_dialupass2.cpp is happily quite universal.

Perusing this code shows that the author obtained the secret decryption key by searching the memory space of lsass.exe for the string "\x10\x00\x00\x00\x10\x00\x00\x00"; this is because the key was known to be 16 bytes, and it is contained in a data structure whose first two fields are little-endian integers giving the current and maximum size of the contained data. Crude as this method is, it has worked in every case I have tried it, and it works well enough that the author of CacheDump uses it in his utility to decrypt the value of NL$KM.

The encryption algorithm used, however, is still somewhat mysterious. Both x_dialupass2.cpp and CacheDump rely on a function named SystemFunction005 in advapi32.dll to perform the actual decryption, and this function is (naturally) entirely undocumented by Microsoft. So we will have to do a bit of work if we want to really understand how this works and produce our own implementation of it.


Some Googling and poking through the Wine source code gives us some initial information: SystemFunction005 apparently decrypts a variable-length block of data using DES in ECB mode (the relevant code can be found in dlls/advapi32/crypt_lmhash.c).

This description initially seems promising, and, indeed, if we decrypt the first 8 bytes of the secret data using the first 7 bytes of the LSA key, we get sensible-looking data that appears to be two little-endian integers that describe the length of the encrypted data. Unfortunately, past that, using that 7-byte key produces gibberish.

What went wrong? Well, notice that in our first attempt, we only used the first 7 bytes of a 16 byte key. As it turns out, although the Wine implementation does not support this, SystemFunction005 can use keys longer than 7 bytes. The first 7 bytes of the key decrypt the first 8-byte block of data, the next 7 bytes (key[7:14]) decrypt the second 8-byte block, and so on.

When there are fewer than 7 bytes left in the key data, we start again at the beginning of the key, offset by the number of bytes of key data we had left over. So, for example, with the 16-byte LSA key, we use the following keys: key[0:7], key[7:14], key[2:9], key[9:16], key[0:7], and so on. The function that moves through the key data in advapi32 is called AdvanceKeyData, if you are inclined to verify that I've gotten everything correct.
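The key-advancement rule is easy to get wrong, so here is a pure-Python sketch of just that part (DES itself omitted; the function name is mine, reconstructing AdvanceKeyData's behavior as described above):

```python
def des_key_slices(key, nblocks):
    """Return the 7-byte key slice used for each successive 8-byte
    ciphertext block, wrapping around as AdvanceKeyData does."""
    slices = []
    pos = 0
    for _ in range(nblocks):
        if len(key) - pos < 7:
            # wrap: restart at an offset equal to the leftover byte count
            pos = len(key) - pos
        slices.append(key[pos:pos + 7])
        pos += 7
    return slices
```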

The only other small wrinkle is that the decrypted data is actually an LSA_BLOB data structure:

typedef struct _LSA_BLOB {
    DWORD cbData;
    DWORD cbMaxData;
    BYTE *szData;
} LSA_BLOB, *PLSA_BLOB;

Thus, the data returned to the user does not include the first 8 bytes, and will be cbData bytes long.

Obtaining the LSA key

Although searching for the LSA key in memory works, it's a bit crude, and it leaves us with an incomplete picture of how the obfuscation algorithm works. Moreover, it won't work if we don't have access to the original system the secrets were stored on (e.g., if we only have access to the on-disk hives from the system). With the help of some information from Massimiliano Montoro (mao), author of Cain & Abel, we can do better.

As mao describes in a post to the Cain & Abel forums, the LSA key is derived from the registry key SECURITY\Policy\PolSecretEncryptionKey and the boot key used in SysKey. To calculate it, we first obtain an RC4 key from the MD5 hash of the boot key followed by 1000 repetitions of bytes 60 to 76 of the data in PolSecretEncryptionKey. In Python this looks like:

import hashlib

md5 = hashlib.md5()
for i in range(1000):
rc4key = md5.digest()

We then use this key to decrypt 48 bytes of data from PolSecretEncryptionKey starting at offset 12 (that is, RC4(rc4key, pol_sec_key[12:60])). Bytes 0x10 through 0x20 of the resulting string are precisely the value of the LSA key we need! Using this key, we can decrypt all the LSA secrets as described above.
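To make the whole derivation concrete, here is a self-contained sketch. Python's standard library has no RC4, so a textbook implementation is included; decrypt_lsa_key simply strings together the steps described above (the function names are mine):

```python
import hashlib

def rc4(key, data):
    """Textbook RC4; encryption and decryption are the same operation."""
    S = list(range(256))
    j = 0
    for i in range(256):
        j = (j + S[i] + key[i % len(key)]) % 256
        S[i], S[j] = S[j], S[i]
    out = bytearray()
    i = j = 0
    for b in data:
        i = (i + 1) % 256
        j = (j + S[i]) % 256
        S[i], S[j] = S[j], S[i]
        out.append(b ^ S[(S[i] + S[j]) % 256])
    return bytes(out)

def decrypt_lsa_key(bootkey, pol_sec_key):
    """Derive the LSA key from the boot key and the raw value data of
    PolSecretEncryptionKey, following the steps described above."""
    md5 = hashlib.md5()
    md5.update(bootkey)
    for _ in range(1000):
        md5.update(pol_sec_key[60:76])
    rc4key = md5.digest()
    plaintext = rc4(rc4key, pol_sec_key[12:60])
    return plaintext[0x10:0x20]
```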

Final Words

The code that implements the algorithms described here is part of CredDump, and can be found in framework/win32/ I hope you've enjoyed this look at another obfuscation algorithm. In the next article, we'll be making use of the knowledge we gained here to tackle the problem of dumping the domain password hashes from a local client.

Thursday, February 21, 2008

SysKey and the SAM

The Security Accounts Manager

The Security Accounts Manager, or SAM, has been used by Windows since the days of NT to store information on local user accounts (or, in the case of a domain controller, the accounts for all users on the domain). It takes the form of a registry hive, and is stored in %WINDIR%\system32\config. Generally, two types of hash are stored in the SAM: the LanMan hash and the NT hash.

The LanMan hash has many flaws:

  1. It is not salted, and is thus vulnerable to precomputed dictionary attacks such as rainbow tables.

  2. The hash is split into two 7-byte pieces, which allows attacks to be performed against each piece at the same time. This also means that if the password is shorter than 7 characters, the last half of the hash will be a constant value.

  3. The password is converted to uppercase before hashing, which reduces the keyspace.

The LM hash is computed by padding or truncating the password to 14 characters, splitting it into two halves, and then using each half as a 56-bit DES key to encrypt the fixed string "KGS!@#$%".

The NT hash, by contrast, is simply the MD4 hash of the password (encoded as UTF-16 little endian). Although it is still unsalted and therefore vulnerable to precomputed dictionary attacks, it is much more secure than the LM hash, as it allows mixed case passwords up to 128 characters.

The SAM before Windows 2000

In the registry, the hashes for each user are stored under SAM\SAM\Domains\Account\Users\[RID], where RID is the numeric user ID of the user as an 8 digit hex string. Inside this key, the V value is a binary data structure that stores the account information, including the password hashes. The various pieces of information can be found from the following calculations (Python syntax):

hash_offset = unpack("<L", V[0x9c:0xA0])[0] + 0xCC
name_offset = unpack("<L", V[0x0c:0x10])[0] + 0xCC
name_length = unpack("<L", V[0x10:0x14])[0]

Once the raw hashes are obtained, they still need one last step of de-obfuscation before they can be fed to a password-cracking program like Ophcrack. Each hash must be decrypted using a key based on the user ID, using the following algorithm:

def sid_to_key(sid):
    s1 = ""
    s1 += chr(sid & 0xFF)
    s1 += chr((sid >> 8) & 0xFF)
    s1 += chr((sid >> 16) & 0xFF)
    s1 += chr((sid >> 24) & 0xFF)
    s1 += s1[0]
    s1 += s1[1]
    s1 += s1[2]
    s2 = s1[3] + s1[0] + s1[1] + s1[2]
    s2 += s2[0] + s2[1] + s2[2]

    return str_to_key(s1), str_to_key(s2)

The str_to_key function just converts a 7 byte string to an 8 byte DES key with odd parity.
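For completeness, here is a sketch of str_to_key in Python 3 bytes form (the parity helper is mine; each output byte carries 7 key bits in its high bits, with the low bit adjusted for odd parity):

```python
def set_odd_parity(b):
    """Set the low (parity) bit so the byte has an odd number of 1 bits."""
    b &= 0xFE
    if bin(b).count("1") % 2 == 0:
        b |= 0x01
    return b

def str_to_key(s):
    """Expand a 7-byte string into an 8-byte odd-parity DES key."""
    chunks = [
        s[0] >> 1,
        ((s[0] & 0x01) << 6) | (s[1] >> 2),
        ((s[1] & 0x03) << 5) | (s[2] >> 3),
        ((s[2] & 0x07) << 4) | (s[3] >> 4),
        ((s[3] & 0x0F) << 3) | (s[4] >> 5),
        ((s[4] & 0x1F) << 2) | (s[5] >> 6),
        ((s[5] & 0x3F) << 1) | (s[6] >> 7),
        s[6] & 0x7F,
    ]
    return bytes(set_odd_parity(c << 1) for c in chunks)
```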

The two keys are used to decrypt the two halves of the password hashes, so:

k1, k2 = sid_to_key(sid)
lmhash = DES(k1, enc_lmhash[:8]) + DES(k2, enc_lmhash[8:])
nthash = DES(k1, enc_nthash[:8]) + DES(k2, enc_nthash[8:])

And in Windows NT, this is all we need to do to get the hashes. Note that only the SAM hive was necessary to fully decrypt the hashes.


To make the hashes harder to decrypt, Microsoft introduced SysKey, an additional layer of obfuscation. SysKey is on by default in Windows 2000 and above, and can be enabled in Windows NT 4.0 using the SysKey utility. In this scheme, a key stored in the system hive is used to further encrypt the hashes in the SAM.

The key, known as the boot key, is derived from four separate registry keys: SYSTEM\CurrentControlSet\Control\Lsa\{JD,Skew1,GBG,Data}. However, the actual data needed is stored in a hidden field of each key that cannot be seen using tools like regedit. Specifically, each part of the key is stored in the key's Class attribute, as a Unicode string giving the hex value of that piece of the key.

Once we have obtained the 16-byte boot key, it must be descrambled. It is permuted according to:

p = [ 0x8, 0x5, 0x4, 0x2,
      0xb, 0x9, 0xd, 0x3,
      0x0, 0x6, 0x1, 0xc,
      0xe, 0xa, 0xf, 0x7 ]

key = [scrambled_key[p[i]] for i in range(len(p))]

This gives us the final value of the boot key, and is all the information we need from the system hive. This boot key is used for several other things aside from just decrypting the SAM -- it is also used to decrypt LSA secrets and cached domain passwords, as we will see.

Turning now to the SAM, our first task is to generate the hashed boot key, which we will then use to derive the encryption key for the individual hashes. To get the hashed boot key, we first go to SAM\SAM\Domains\Account and read the value of F there. Next, we generate an RC4 key as:

rc4_key = MD5(F[0x70:0x80] + aqwerty + bootkey + anum)

where aqwerty and anum are the constant strings:

aqwerty = "!@#$%^&*()qwertyUIOPAzxcvbnmQQQQQQQQQQQQ)(*@&%\0"
anum = "0123456789012345678901234567890123456789\0"

Finally, the key is used to decrypt the 32 bytes at F[0x80:0xA0]. The resulting value is the hashed boot key.

At this point we're almost done. To decrypt the actual hashes for each user, we follow essentially the same procedure as in Windows NT: we go to SAM\SAM\Domains\Account\Users\[RID], and read the encrypted hashes from the V value of that key.

However, we must now apply one additional stage of decryption to the hashes. Once again we must generate an RC4 key to decrypt the hashes; as before, it will be created from the MD5 of several strings. Specifically, the RC4 key is the MD5 of the first 16 bytes of the hashed boot key, the user ID (as a 32-bit little-endian integer), and the string "LMPASSWORD\0" or "NTPASSWORD\0" (depending on whether the key will be used to decrypt a LanMan or NT hash).

If you find code easier to read than English, here's the specific process:

antpassword = "NTPASSWORD\0"
almpassword = "LMPASSWORD\0"
rc4_key_lm = MD5(hbootkey[:0x10] +
                 pack("<L", rid) +
                 almpassword)
rc4_key_nt = MD5(hbootkey[:0x10] +
                 pack("<L", rid) +
                 antpassword)

And, at last, we can decrypt the LM and NT hashes with RC4 using their respective keys. This will give us the same kind of hashes we found in Windows NT -- that is, they still need to be decrypted using DES and the sid_to_key function. This will give us the hashes in a form that we can attempt to break.
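For reference, here is a sketch of the per-user key derivation together with the sid_to_key step (the DES decryption itself is left to a crypto library; note that real implementations also set the odd-parity bit in each DES key byte, which DES itself ignores, so it is omitted here):

```python
from hashlib import md5
from struct import pack

def str_to_key(s7):
    # Spread 7 bytes across the 8 seven-bit chunks of a DES key
    key = [s7[0] >> 1,
           ((s7[0] & 0x01) << 6) | (s7[1] >> 2),
           ((s7[1] & 0x03) << 5) | (s7[2] >> 3),
           ((s7[2] & 0x07) << 4) | (s7[3] >> 4),
           ((s7[3] & 0x0f) << 3) | (s7[4] >> 5),
           ((s7[4] & 0x1f) << 2) | (s7[5] >> 6),
           ((s7[5] & 0x3f) << 1) | (s7[6] >> 7),
           s7[6] & 0x7f]
    return bytes(k << 1 for k in key)  # parity bit left as 0 here

def sid_to_keys(rid):
    # Derive the two DES keys from the RID, as in Windows NT
    s = pack("<L", rid)
    k1 = str_to_key(bytes([s[0], s[1], s[2], s[3], s[0], s[1], s[2]]))
    k2 = str_to_key(bytes([s[3], s[0], s[1], s[2], s[3], s[0], s[1]]))
    return k1, k2

def user_rc4_key(hbootkey, rid, lm=False):
    # The extra RC4 layer added in Windows 2000 and later
    const = b"LMPASSWORD\x00" if lm else b"NTPASSWORD\x00"
    return md5(hbootkey[:0x10] + pack("<L", rid) + const).digest()
```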


Although this process certainly is complicated, in the end it is no more than an obfuscation technique. An attacker can still easily extract the hashes if he can steal the system and SAM hives, or even just the SAM hive if he has some other means of obtaining the boot key. Moreover, the obfuscation mechanism only has to be reverse engineered once; from then on, the entire protection is useless until the algorithm is changed.

Up next, we'll give LSA secrets the same treatment we gave SysKey. LSA secrets are a protected data store that can store several interesting pieces of information, such as the default password for systems that have auto-logon enabled, the timestamp used by Windows to decide when to stop working if it has not been activated, and an encryption key used to encrypt cached domain credentials.

As a final note, if you'd like to just look at some code implementing this, have a peek at framework/win32/ in the CredDump distribution.

Wednesday, February 20, 2008

CredDump: Extract Credentials from Windows Registry Hives

This is just a short post to talk about a new tool I've developed, called CredDump. CredDump is a pure-Python implementation of the bkhive/samdump2, lsadump2, and cachedump utilities commonly used to audit Windows password security.

Why re-implement these tools? Well, for one thing, I like Python, and wanted to learn the details of how they all worked. Much more importantly though, this tool is completely cross-platform and requires only the registry hives recovered from the system to operate. Most other tools currently available rely on some aspect of Windows to provide their functionality. For example:

  • lsadump2 calls advapi32.LsarQuerySecret to obtain the unencrypted LSA secrets.

  • cachedump searches through the address space of lsass.exe to obtain the unencrypted LSA key, and uses advapi32.SystemFunction005 to decrypt the value of NL$KM with it.

  • Cain & Abel is the closest to operating entirely offline, but still uses advapi32.SystemFunction005 to do the decryption. Also, it's closed source and Windows-only.

So the hope was to build something in the spirit of Nicola Cuomo's bkhive and samdump2, that could work without requiring Windows at all.

It turned out that almost all of the pieces were already publicly available in various forms. The only truly original work I ended up having to do was reverse engineering SystemFunction005 to re-implement it in Python (and even there, the Wine folks had already done most of the work; they just didn't account for keys longer than 7 bytes).

You can find the project's home page here, or download it directly here.

But wait, I hear you cry, how do I get these hives in the first place? You extract them from a Windows system. Unfortunately, it's not as easy as just copying them out of c:\windows\system32\config, because the files are always open. To get around this, you can either boot into Knoppix, or use the Volume Shadow Copy trick.

In the next few articles, I'll be discussing in detail how these programs work. The first article will cover Syskey encryption and how to dump the SAM database, the next will explore the extraction of LSA secrets, and the final one will talk about cached domain passwords. Of course, if you don't want to wait, there's always the code, which should be pretty readable Python. After that, it's back to memory analysis; now that we have a sense of some of the things in the registry, I'll show you how we can access large portions of the registry directly from Windows memory dumps.

Wednesday, February 13, 2008

Keys Open by Hive

Quick addendum: I thought it might also be fun to post how many keys were open in each hive (this is for xp-laptop-2005-07-04-1430.img -- and sorry for the huge amount of space, Blogger's auto-formatting seems to get confused around tables):

Hive                                        Key Count
[...]ts and Settings\Sarah\NTUSER.DAT       147

Reading Open Keys

Last time, we found out how to find information about loaded registry hives in Windows memory dumps. However, just knowing what hives are loaded may not be particularly useful. To fill out our view of the state of the system's registry, we will also want to get information about registry keys currently open in the Configuration Manager. To start, though, we will need to know a little bit more about how the Configuration Manager represents keys in memory.

Each open key is represented by a Key Control Block. Key Control Blocks, or KCBs, store a number of pieces of metadata about the key, such as its name, its last write time, and pointers to the cached values contained in the key (recall that the registry is essentially a filesystem, where keys act as folders and values are the files). To see precisely what information we can get about a key from its KCB, we can look at the _CM_KEY_CONTROL_BLOCK structure in Windbg:

lkd> dt nt!_CM_KEY_CONTROL_BLOCK
+0x000 RefCount : Uint2B
+0x002 Flags : Uint2B
+0x004 ExtFlags : Pos 0, 8 Bits
+0x004 PrivateAlloc : Pos 8, 1 Bit
+0x004 Delete : Pos 9, 1 Bit
+0x004 DelayedCloseIndex : Pos 10, 12 Bits
+0x004 TotalLevels : Pos 22, 10 Bits
+0x008 KeyHash : _CM_KEY_HASH
+0x008 ConvKey : Uint4B
+0x00c NextHash : Ptr32 _CM_KEY_HASH
+0x010 KeyHive : Ptr32 _HHIVE
+0x014 KeyCell : Uint4B
+0x018 ParentKcb : Ptr32 _CM_KEY_CONTROL_BLOCK
+0x01c NameBlock : Ptr32 _CM_NAME_CONTROL_BLOCK
+0x020 CachedSecurity : Ptr32 _CM_KEY_SECURITY_CACHE
+0x024 ValueCache : _CACHED_CHILD_LIST
+0x02c IndexHint : Ptr32 _CM_INDEX_HINT_BLOCK
+0x02c HashKey : Uint4B
+0x02c SubKeyCount : Uint4B
+0x030 KeyBodyListHead : _LIST_ENTRY
+0x030 FreeListEntry : _LIST_ENTRY
+0x038 KcbLastWriteTime : _LARGE_INTEGER
+0x040 KcbMaxNameLen : Uint2B
+0x042 KcbMaxValueNameLen : Uint2B
+0x044 KcbMaxValueDataLen : Uint4B

So to find out what keys are open, we will be looking for a list of key control blocks in memory. To figure out where to look, I turned to the kernel debugging extensions. After all, Windbg's !reg openkeys command is able to get a list of all open keys and their KCBs, so by reverse engineering that command we can figure out how to extract the same information from a memory dump.

Luckily, the kernel debugging extensions disassemble easily, and debug information for them is available from the Microsoft symbol server, so we get information on function names and so on for free (for anyone wishing to try this at home, the appropriate file is in winxp\kdexts.dll in the Debugging Tools for Windows program directory). In general, this can actually be an excellent technique for finding out about Windows internals -- any information you get from a Windbg extension can be extracted from live memory with a little effort, and since the intention is to display the information to the user, there are a lot of useful markers in the code like print statements that help you understand what particular data structures and fields mean.

By reverse engineering the regopenkeys() function, we find that it uses the unexported kernel global variables CmpCacheTable and CmpHashTableSize. CmpCacheTable points to a hash table that is used by the Configuration Manager to provide fast lookups for open keys. Each entry in the table points to a _CM_KEY_HASH structure, shown below:

lkd> dt nt!_CM_KEY_HASH
+0x000 ConvKey : Uint4B
+0x004 NextHash : Ptr32 _CM_KEY_HASH
+0x008 KeyHive : Ptr32 _HHIVE
+0x00c KeyCell : Uint4B

The NextHash member points to the next entry in the given hash bucket. To read the entire table, we just read CmpHashTableSize pointers at CmpCacheTable, and then for each of those, read in all the entries in that bucket using the NextHash field (the last entry in a given bucket will have its NextHash set to 0).
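In Python, the walk looks something like this (read_dword is a stand-in for your address-space translation layer, returning the 32-bit little-endian value at a kernel virtual address):

```python
def walk_key_hash_table(read_dword, cmp_cache_table, cmp_hash_table_size):
    # Yield the address of every _CM_KEY_HASH reachable from the hash table
    for i in range(cmp_hash_table_size):
        entry = read_dword(cmp_cache_table + 4 * i)  # head of bucket i
        while entry:  # buckets are NULL-terminated singly-linked lists
            yield entry
            entry = read_dword(entry + 4)  # NextHash is at offset 0x4
```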

However, the !reg openkeys command gives much more information about each key than we find in a _CM_KEY_HASH structure -- it also lists the KCB, name, and cell. Digging a little deeper, we discover the secret to how it does this: each _CM_KEY_HASH is in fact embedded in its corresponding _CM_KEY_CONTROL_BLOCK (see the KeyHash member at offset 0x8 up there?). By subtracting 8 from the address of the _CM_KEY_HASH, we get the address of the KCB, which in turn gives us all the information we could want: last write time, name, cell index, and pointers to parent KCBs.

Throughout this discussion, you may have noticed that, unlike the debugger, we don't know what value CmpCacheTable has. Unfortunately, the location of this variable changes with each version of Windows and even each service pack or hotfix. Was all our hard work figuring this out for naught, then, since we can't implement it in an offline analysis framework like Volatility?

Well, not really. Just as we came up with a signature for the registry hive data structure in the last post, we can come up with a heuristic for determining the address of CmpCacheTable in memory as well. Since the hash table is just a big list of pointers, however, a static signature probably won't work. So let's look at the things we know about CmpCacheTable:

  1. It's a kernel global variable, so it should reside inside the mapped executable image for ntoskrnl.exe.

  2. It points to a list of pointers, and each of those points to a _CM_KEY_HASH structure.

  3. The _CM_KEY_HASH structure has a pointer to the hive the key belongs to at offset 0x8.

Point (3) is the critical one, because we already know how to identify valid _HHIVE structures! We can leverage this to find the table of cached registry keys.

Our algorithm for finding this table is going to look something like this, then:

  1. Find the base address of the kernel in memory. This can be done using the KPCR trick.

  2. Parse the PE header to get the offset (RVA) of each section, and add that to the kernel base address to get its virtual address. (I find pefile quite useful for this.)

  3. Loop over each section, and treat each dword in that section as a potential pointer.

  4. At the pointed-to memory location, treat each dword as a pointer to a _CM_KEY_HASH.

  5. Follow the KeyHive pointer to the _HHIVE candidate, and check that it matches the signature we developed earlier.

  6. As a bonus, once we've found CmpCacheTable this way, CmpHashTableSize is the next dword after it, at least on XP SP2 (the variables are most likely declared next to each other in the Windows kernel source.)

If you want to cheat a bit, the variable is in the PAGEDATA section, which should simplify your search. It is up to you how many entries in the candidate hash table you want to check; checking more will take longer, but should reduce the chance of false positives. Also, some entries will be 0 (if that hash bucket is not filled), so don't throw out a candidate just because some of its entries are null.
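Steps 3 through 5 might be sketched like this. Here looks_like_hive is assumed to be the _HHIVE signature check from the previous post, read_dword the usual address-space accessor, and n_check the number of table entries you choose to verify:

```python
def find_cmp_cache_table(read_dword, sections, looks_like_hive, n_check=16):
    # sections: list of (start_va, end_va) for each mapped kernel section
    for start, end in sections:
        for va in range(start, end, 4):
            table = read_dword(va)  # candidate value of CmpCacheTable
            hits = 0
            for i in range(n_check):
                entry = read_dword(table + 4 * i)
                if entry == 0:
                    continue  # empty bucket -- not disqualifying
                if not looks_like_hive(read_dword(entry + 8)):
                    break  # KeyHive at offset 0x8 must point to a valid hive
                hits += 1
            else:
                if hits:  # require at least one verified entry
                    return va
    return None
```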

Armed with this information, you should now be able to read the full list of registry keys opened by the configuration manager. So how many keys can you expect to find using this method? Here are some quick stats from some images I have (the first two are publicly available test images, the third is just an image I took with dd from my own XP SP2 machine):

Image                # keys

Although we have now looked pretty deeply into the internals of the Configuration Manager and figured out how to extract several important pieces of information about the registry from memory, we aren't quite done with the topic yet. For one thing, we've so far only really dealt with metadata, rather than the registry data itself. In addition to things like key names and timestamps, we may want to pull out as much information as possible about each key and its values. To do this, we will have to start looking at the pieces of the raw hive itself that are mapped into memory, which will involve finally figuring out how to make use of cell indexes and where they fit into virtual address space. Stay tuned!

Friday, February 8, 2008

Enumerating Registry Hives

The Windows registry can be an important forensic resource. Harlan Carvey has written extensively on various aspects of registry analysis, and is even considering writing a book on the topic. I have already written about attempting to extract hives from memory; in this post, we will again look at registry hives in Windows memory, but this time in a more top-down fashion, by examining the data structures used to represent hives by the Configuration Manager.

A hive is represented by the Windows Configuration Manager using the _CMHIVE data structure. You can examine this structure in detail using the dt command in Windbg:

lkd> dt nt!_CMHIVE
+0x000 Hive : _HHIVE
+0x210 FileHandles : [3] Ptr32 Void
+0x21c NotifyList : _LIST_ENTRY
+0x224 HiveList : _LIST_ENTRY
+0x22c HiveLock : Ptr32 _FAST_MUTEX
+0x230 ViewLock : Ptr32 _FAST_MUTEX
+0x234 LRUViewListHead : _LIST_ENTRY
+0x23c PinViewListHead : _LIST_ENTRY
+0x244 FileObject : Ptr32 _FILE_OBJECT
+0x248 FileFullPath : _UNICODE_STRING
+0x250 FileUserName : _UNICODE_STRING
+0x258 MappedViews : Uint2B
+0x25a PinnedViews : Uint2B
+0x25c UseCount : Uint4B
+0x260 SecurityCount : Uint4B
+0x264 SecurityCacheSize : Uint4B
+0x268 SecurityHitHint : Int4B
+0x26c SecurityCache : Ptr32 _CM_KEY_SECURITY_CACHE_ENTRY
+0x270 SecurityHash : [64] _LIST_ENTRY
+0x470 UnloadEvent : Ptr32 _KEVENT
+0x474 RootKcb : Ptr32 _CM_KEY_CONTROL_BLOCK
+0x478 Frozen : UChar
+0x47c UnloadWorkItem : Ptr32 _WORK_QUEUE_ITEM
+0x480 GrowOnlyMode : UChar
+0x484 GrowOffset : Uint4B
+0x488 KcbConvertListHead : _LIST_ENTRY
+0x490 KnodeConvertListHead : _LIST_ENTRY
+0x498 CellRemapArray : Ptr32 _CM_CELL_REMAP_BLOCK

Some interesting things jump out to start with. First, there are two _UNICODE_STRINGs; these are always good, as they provide us with a human-readable name for the hive. We also notice that the very first field, Hive, is 0x210 bytes long! So let's look at its structure, _HHIVE:

lkd> dt -r1 nt!_HHIVE
+0x000 Signature : Uint4B
+0x004 GetCellRoutine : Ptr32 _CELL_DATA*
+0x008 ReleaseCellRoutine : Ptr32 void
+0x00c Allocate : Ptr32 void*
+0x010 Free : Ptr32 void
+0x014 FileSetSize : Ptr32 unsigned char
+0x018 FileWrite : Ptr32 unsigned char
+0x01c FileRead : Ptr32 unsigned char
+0x020 FileFlush : Ptr32 unsigned char
+0x024 BaseBlock : Ptr32 _HBASE_BLOCK
+0x000 Signature : Uint4B
+0x004 Sequence1 : Uint4B
+0x008 Sequence2 : Uint4B
+0x00c TimeStamp : _LARGE_INTEGER
+0x014 Major : Uint4B
+0x018 Minor : Uint4B
+0x01c Type : Uint4B
+0x020 Format : Uint4B
+0x024 RootCell : Uint4B
+0x028 Length : Uint4B
+0x02c Cluster : Uint4B
+0x030 FileName : [64] UChar
+0x070 Reserved1 : [99] Uint4B
+0x1fc CheckSum : Uint4B
+0x200 Reserved2 : [894] Uint4B
+0xff8 BootType : Uint4B
+0xffc BootRecover : Uint4B
+0x028 DirtyVector : _RTL_BITMAP
+0x000 SizeOfBitMap : Uint4B
+0x004 Buffer : Ptr32 Uint4B
+0x030 DirtyCount : Uint4B
+0x034 DirtyAlloc : Uint4B
+0x038 RealWrites : UChar
+0x03c Cluster : Uint4B
+0x040 Flat : UChar
+0x041 ReadOnly : UChar
+0x042 Log : UChar
+0x044 HiveFlags : Uint4B
+0x048 LogSize : Uint4B
+0x04c RefreshCount : Uint4B
+0x050 StorageTypeCount : Uint4B
+0x054 Version : Uint4B
+0x058 Storage : [2] _DUAL
+0x000 Length : Uint4B
+0x004 Map : Ptr32 _HMAP_DIRECTORY
+0x008 SmallDir : Ptr32 _HMAP_TABLE
+0x00c Guard : Uint4B
+0x010 FreeDisplay : [24] _RTL_BITMAP
+0x0d0 FreeSummary : Uint4B
+0x0d4 FreeBins : _LIST_ENTRY

Once again we notice a few fields that might be of interest to us. The Storage member contains information that the configuration manager uses to translate cell indices (see my earlier article for more details on cell indices) into virtual addresses in memory. The BaseBlock member gives us a direct pointer to the base block structure of the registry file, as it exists in memory. Reading memory there shows us exactly what we expect: the usual registry file signature "regf", followed by the timestamp, filename, etc.

Knowing what the structure looks like is all well and good, but how do we actually find these structures in memory? It turns out that we can create a highly effective signature by noting two facts. First, the _CMHIVE data structure is allocated in paged pool, and thus has a pool tag: "CM10" (listed in pooltag.txt as "Internal Configuration manager allocations"). Second, we know from the structure listing above that it is 0x49c bytes long; by including the pool header (8 bytes) and padding it out to an 8 byte alignment, we get a total size of 0x4a8 for the whole pool block. This will appear in the pool header as 0x95 (since the size is listed as the number of 8-byte blocks in the allocation, and 0x95 * 0x8 = 0x4a8). Finally, the pool type (paged pool) is encoded as 0x0c.

So we can already create a signature for the pool header (regular expression notation): "..\x95\x0cCM10" that should be fairly accurate in finding the pool allocations we want. But we can do even better! Notice the first field of the _HHIVE structure -- Signature. Each _HHIVE, and therefore each _CMHIVE, begins with a constant signature, 0xbee0bee0 as a little-endian integer. So armed with this knowledge, we can build the full signature for use with something like FTimes:

"..\x95\x0cCM10\xe0\xbe\xe0\xbe"
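Assembling the two facts above -- the size/type bytes and tag of the pool header, followed immediately by 0xbee0bee0 in little-endian byte order -- a quick-and-dirty Python scanner over a raw physical memory dump might look like:

```python
import re

# pool header: 2 bytes prev-size/index, 0x95 (block size), 0x0c (paged pool),
# "CM10" tag, then the _HHIVE signature 0xbee0bee0, little-endian
CMHIVE_SIG = re.compile(rb"..\x95\x0cCM10\xe0\xbe\xe0\xbe", re.DOTALL)

def scan_for_cmhives(data):
    # Return the physical offset of each candidate _CMHIVE
    # (8 bytes past the start of the pool header)
    return [m.start() + 8 for m in CMHIVE_SIG.finditer(data)]
```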
Once we have found just a single hive, we can use it to find all of the others in virtual memory. Each _CMHIVE has a member HiveList. This _LIST_ENTRY links together all the hives in the kernel address space. We can traverse this linked list to find all the hives currently loaded by the configuration manager (remember that each _LIST_ENTRY will point to the HiveList member of the _CMHIVE, not the beginning of the structure -- you will have to subtract 0x224 from the value of the Flink or Blink to obtain the correct value). Note that the addresses given by the HiveList member are virtual addresses; you will need a page directory to translate them. Luckily, all Configuration Manager structures live in kernel space, so the page directory of any process will do -- I generally use the Idle process.

As you traverse this list, you may notice a couple of oddities. First, there will be one entry in the list that doesn't seem to match the others. Its address will be somewhere around 0x8067b3a8, and its signature won't be 0xbee0bee0. That's because it isn't a hive at all -- it's the list head. It exists as a global variable within the NT kernel; you can get its address by issuing ? CmpHiveListHead from within Windbg. If all you want is a list of the hives, though, it's safe to just skip over it as you traverse the list.
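Given one _CMHIVE, then, the traversal (including the list-head skip) might be sketched as follows, with read_dword once again standing in for virtual-address reads through some process's page directory, and the 0x224 offset taken from the XP SP2 structure layout above:

```python
HIVE_LIST_OFFSET = 0x224   # offset of HiveList within _CMHIVE (XP SP2)
HHIVE_SIGNATURE = 0xbee0bee0

def walk_hive_list(read_dword, first_hivelist_va):
    # Follow Flink pointers until we come back around to where we started
    entry = read_dword(first_hivelist_va)
    while entry != first_hivelist_va:
        hive = entry - HIVE_LIST_OFFSET
        if read_dword(hive) == HHIVE_SIGNATURE:
            yield hive  # a real hive
        # otherwise this is CmpHiveListHead itself -- just skip it
        entry = read_dword(entry)  # next Flink
```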

The other slightly strange thing you will encounter is that there are many more hive structures found in the physical memory dump than can be found by traversing the linked list. Upon closer inspection, this is because there are actually a number of duplicate structures in physical memory -- they have exactly the same data as other entries that can be found in the linked list. A similar phenomenon can be seen when searching for process structures in physical memory; ptfinder calculates the MD5 of the structures to avoid such duplicates. It is probably safe to ignore such duplicates and focus only on the entries that are in the in-memory list.

Hopefully this article has given you a taste for what kinds of information about the registry can be extracted from Windows memory dumps. But perhaps you feel let down -- after all, a simple list of the hives loaded into memory doesn't tell you much that's forensically useful. Sure, you might be able to say when the hive was last modified by looking at the timestamp in the base block, or find out who was logged in by looking at what ntuser.dat hives were open, but that's not that exciting. What you'd really like to do is get information about actual registry keys that are currently open. This can also be accomplished using memory analysis, and we'll see how in an upcoming article.