SysKey and the SAM
The Security Accounts Manager
The Security Accounts Manager, or SAM, has been used by Windows since the days of NT to store information on local user accounts (or, in the case of a domain controller, the accounts for all users on the domain). It takes the form of a registry hive, and is stored in %WINDIR%\system32\config. Generally, two types of hash are stored in the SAM: the LanMan hash and the NT hash.
The LanMan hash has many flaws:
- It is not salted, and is thus vulnerable to precomputed dictionary attacks such as rainbow tables.
- The password is split into two 7-character halves that are hashed independently, which allows attacks to be performed against each 8-byte half of the hash at the same time. This also means that if the password is 7 characters or shorter, the second half of the hash will be a constant value.
- The password is converted to uppercase before hashing, which reduces the keyspace.
The LM hash is computed by converting the password to uppercase, padding or truncating it to 14 characters, splitting it into two 7-byte halves, and then using each half as a 56-bit DES key to encrypt the fixed string "KGS!@#$%".
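The preprocessing behind these weaknesses can be sketched in a few lines of Python 3. This covers only the key-preparation step, not a full LM hash: the DES encryption of "KGS!@#$%" is left out, the function name lm_key_halves is made up for illustration, and the password is assumed to be plain ASCII.

```python
def lm_key_halves(password):
    """Return the two 7-byte strings that LM uses as DES key material.

    Sketch only: assumes an ASCII password and stops before the DES step.
    """
    data = password.upper().encode("ascii")   # case information is discarded
    data = data[:14].ljust(14, b"\x00")       # truncate or null-pad to 14 bytes
    return data[:7], data[7:]                 # each half is processed on its own


if __name__ == "__main__":
    # With 7 characters or fewer, the second half is all zero bytes, so the
    # second half of the resulting hash is always the same value.
    print(lm_key_halves("Secret"))
    print(lm_key_halves("LongerPassword123"))
```

Because each half depends on only 7 uppercase characters, an attacker can work on the two halves independently.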
The NT hash, by contrast, is simply the MD4 hash of the password (encoded as UTF-16 little endian). Although it is still unsalted and therefore vulnerable to precomputed dictionary attacks, it is much more secure than the LM hash, as it allows mixed case passwords up to 128 characters.
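As a rough illustration, here is a self-contained Python 3 sketch of the NT hash. MD4 is often missing from modern hashlib builds, so this version carries its own small MD4 written from the RFC 1320 description; the helper names md4 and nt_hash are chosen for this example only, and the code is meant for experimentation rather than production use.

```python
import struct

# Message word order used in MD4's third round.
_ROUND3_ORDER = (0, 8, 4, 12, 2, 10, 6, 14, 1, 9, 5, 13, 3, 11, 7, 15)


def _rol(x, n):
    """Rotate a 32-bit value left by n bits."""
    return ((x << n) | (x >> (32 - n))) & 0xFFFFFFFF


def md4(message):
    """Pure-Python MD4 (illustrative sketch); returns the 16-byte digest."""
    bit_len = (8 * len(message)) & 0xFFFFFFFFFFFFFFFF
    message += b"\x80"
    message += b"\x00" * ((56 - len(message) % 64) % 64)
    message += struct.pack("<Q", bit_len)

    state = [0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476]
    for off in range(0, len(message), 64):
        x = struct.unpack("<16L", message[off:off + 64])
        a, b, c, d = state
        for i in range(48):
            if i < 16:                       # round 1
                f = (b & c) | (~b & d)
                k, s, const = i, (3, 7, 11, 19)[i % 4], 0
            elif i < 32:                     # round 2
                f = (b & c) | (b & d) | (c & d)
                k = (i % 4) * 4 + (i - 16) // 4
                s, const = (3, 5, 9, 13)[i % 4], 0x5A827999
            else:                            # round 3
                f = b ^ c ^ d
                k = _ROUND3_ORDER[i - 32]
                s, const = (3, 9, 11, 15)[i % 4], 0x6ED9EBA1
            a = _rol((a + f + x[k] + const) & 0xFFFFFFFF, s)
            a, b, c, d = d, a, b, c          # rotate register roles
        state = [(old + new) & 0xFFFFFFFF
                 for old, new in zip(state, (a, b, c, d))]
    return struct.pack("<4L", *state)


def nt_hash(password):
    """NT hash: MD4 over the UTF-16 little-endian encoding of the password."""
    return md4(password.encode("utf-16-le")).hex()


if __name__ == "__main__":
    print(nt_hash("password"))
```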
The SAM before Windows 2000
In the registry, the hashes for each user are stored under SAM\SAM\Domains\Account\Users\[RID], where RID is the numeric user ID of the user as an 8 digit hex string. Inside this key, the V value is a binary data structure that stores the account information, including the password hashes. The various pieces of information can be found from the following calculations (Python syntax):
hash_offset = unpack("<L", V[0x9c:0xA0])[0] + 0xCC
name_offset = unpack("<L", V[0x0c:0x10])[0] + 0xCC
name_length = unpack("<L", V[0x10:0x14])[0]
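Put together, these calculations might be wrapped up as follows in Python 3. This is a sketch under assumptions: the function name parse_v is invented here, v is assumed to be the raw bytes of the V value, and the user name is assumed to be stored as UTF-16 little endian.

```python
from struct import unpack


def parse_v(v):
    """Extract the user name and the absolute hash offset from a V value.

    Offsets stored in the V header are relative to 0xCC, where the
    variable-length data area begins.
    """
    name_offset = unpack("<L", v[0x0C:0x10])[0] + 0xCC
    name_length = unpack("<L", v[0x10:0x14])[0]
    hash_offset = unpack("<L", v[0x9C:0xA0])[0] + 0xCC

    # Assumption: the name is a UTF-16-LE string of name_length bytes.
    name = v[name_offset:name_offset + name_length].decode("utf-16-le")
    return name, hash_offset
```

The returned hash_offset marks where the hash data for that account begins within V.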
Once the raw hashes are obtained, they still need one last step of de-obfuscation before they can be fed to a password-cracking program like Ophcrack. Each hash must be decrypted using a key based on the user ID, using the following algorithm:
def sid_to_key(sid):
    s1 = ""
    s1 += chr(sid & 0xFF)
    s1 += chr((sid>>8) & 0xFF)
    s1 += chr((sid>>16) & 0xFF)
    s1 += chr((sid>>24) & 0xFF)
    s1 += s1[0]
    s1 += s1[1]
    s1 += s1[2]
    s2 = s1[3] + s1[0] + s1[1] + s1[2]
    s2 += s2[0] + s2[1] + s2[2]
    return str_to_key(s1), str_to_key(s2)
The str_to_key function just converts a 7 byte string to an 8 byte DES key with odd parity.
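For readers who want to experiment, here is one possible Python 3 rendering of both functions that works on bytes rather than strings. It is a sketch of the technique, not the code any particular tool uses: the 56 input bits are spread over eight bytes, seven per byte, and the low bit of each byte is set so that the byte has an odd number of 1 bits.

```python
import struct


def str_to_key(s):
    """Expand 7 bytes (56 bits) into an 8-byte DES key with odd parity."""
    if len(s) != 7:
        raise ValueError("expected exactly 7 bytes")
    bits = int.from_bytes(s, "big")
    key = bytearray()
    for shift in range(49, -1, -7):          # eight 7-bit groups, high to low
        b = ((bits >> shift) & 0x7F) << 1    # group occupies the top 7 bits
        if bin(b).count("1") % 2 == 0:       # low bit makes the parity odd
            b |= 1
        key.append(b)
    return bytes(key)


def sid_to_key(rid):
    """Derive the two DES keys from the numeric user ID."""
    s = struct.pack("<L", rid)               # 4 bytes, little endian
    s1 = s + s[:3]                           # b0 b1 b2 b3 b0 b1 b2
    s2 = s[3:] + s[:3]                       # b3 b0 b1 b2
    s2 += s2[:3]                             # b3 b0 b1 b2 b3 b0 b1
    return str_to_key(s1), str_to_key(s2)


if __name__ == "__main__":
    k1, k2 = sid_to_key(500)                 # 500 is the Administrator RID
    print(k1.hex(), k2.hex())
```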
The two keys are used to decrypt the two halves of the password hashes, so:
k1,k2 = sid_to_key(sid)
lmhash = DES(k1,enc_lmhash[:8])+DES(k2,enc_lmhash[8:])
nthash = DES(k1,enc_nthash[:8])+DES(k2,enc_nthash[8:])
And in Windows NT, this is all we need to do to get the hashes. Note that only the SAM hive was necessary to fully decrypt the hashes.
SysKey
To make the hashes harder to decrypt, Microsoft introduced SysKey, an additional layer of obfuscation. SysKey is on by default in Windows 2000 and above, and can be enabled in Windows NT 4.0 using the SysKey utility. In this scheme, a key stored in the system hive is used to further encrypt the hashes in the SAM.
The key, known as the boot key, is taken from four separate registry keys: SYSTEM\CurrentControlSet\Control\Lsa\{JD,Skew1,GBG,Data}. However, the actual data needed is stored in a hidden field of each key that cannot be seen using tools like regedit. Specifically, each part of the boot key is stored in the registry key's Class attribute, as a Unicode string giving the hex value of that piece.
Once we have obtained the 16-byte boot key, it must be descrambled. It is permuted according to:
p = [ 0x8, 0x5, 0x4, 0x2,
      0xb, 0x9, 0xd, 0x3,
      0x0, 0x6, 0x1, 0xc,
      0xe, 0xa, 0xf, 0x7 ]
for i in range(len(key)):
    key[i] = scrambled_key[p[i]]
This gives us the final value of the boot key, and is all the information we need from the system hive. This boot key is used for several other things aside from just decrypting the SAM -- it is also used to decrypt LSA secrets and cached domain passwords, as we will see.
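The two steps, joining the hex strings from the four Class attributes and then applying the permutation, could be combined like this in Python 3. The function name boot_key_from_classes is hypothetical, and the four arguments are assumed to be the Class values already decoded into ordinary Python strings, passed in the order JD, Skew1, GBG, Data.

```python
from binascii import unhexlify

# Permutation from the text: output byte i comes from scrambled byte P[i].
P = [0x8, 0x5, 0x4, 0x2, 0xb, 0x9, 0xd, 0x3,
     0x0, 0x6, 0x1, 0xc, 0xe, 0xa, 0xf, 0x7]


def boot_key_from_classes(jd, skew1, gbg, data):
    """Build the 16-byte boot key from the four Class hex strings."""
    scrambled = unhexlify(jd + skew1 + gbg + data)
    if len(scrambled) != 16:
        raise ValueError("expected 16 bytes of key material")
    return bytes(scrambled[i] for i in P)


if __name__ == "__main__":
    # Made-up Class values, chosen so the permutation is easy to see.
    key = boot_key_from_classes("00010203", "04050607", "08090a0b", "0c0d0e0f")
    print(key.hex())
```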
Turning now to the SAM, our first task is to generate the hashed boot key, which we will then use to derive the encryption key for the individual hashes. To get the hashed boot key, we first go to SAM\SAM\Domains\Account and read the value of F there. Next, we generate an RC4 key as:
rc4_key = MD5(F[0x70:0x80] + aqwerty + bootkey + anum)
where aqwerty and anum are the constant strings:
aqwerty = "!@#$%^&*()qwertyUIOPAzxcvbnmQQQQQQQQQQQQ)(*@&%\0"
anum = "0123456789012345678901234567890123456789\0"
Finally, the key is used to decrypt the 32 bytes at F[0x80:0xA0]. The resulting value is the hashed boot key.
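A minimal Python 3 sketch of this step follows. It uses hashlib for MD5 and a small hand-written RC4 so that it has no third-party dependencies; the function names rc4 and hashed_boot_key are illustrative, and f_value is assumed to be the raw bytes of the F value.

```python
from hashlib import md5

AQWERTY = b"!@#$%^&*()qwertyUIOPAzxcvbnmQQQQQQQQQQQQ)(*@&%\0"
ANUM = b"0123456789012345678901234567890123456789\0"


def rc4(key, data):
    """Plain RC4; encryption and decryption are the same operation."""
    s = list(range(256))
    j = 0
    for i in range(256):                      # key scheduling
        j = (j + s[i] + key[i % len(key)]) & 0xFF
        s[i], s[j] = s[j], s[i]
    out = bytearray()
    i = j = 0
    for byte in data:                         # keystream generation
        i = (i + 1) & 0xFF
        j = (j + s[i]) & 0xFF
        s[i], s[j] = s[j], s[i]
        out.append(byte ^ s[(s[i] + s[j]) & 0xFF])
    return bytes(out)


def hashed_boot_key(f_value, bootkey):
    """Decrypt F[0x80:0xA0] with an RC4 key derived from F and the boot key."""
    rc4_key = md5(f_value[0x70:0x80] + AQWERTY + bootkey + ANUM).digest()
    return rc4(rc4_key, f_value[0x80:0xA0])
```

Only the first 16 bytes of the 32-byte result are needed when deriving the per-user keys.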
At this point we're almost done. To decrypt the actual hashes for each user, we follow essentially the same procedure as in Windows NT: we go to SAM\SAM\Domains\Account\Users\[RID], and read the encrypted hashes from the V value of that key.
However, we must now apply one additional stage of decryption to the hashes. Once again we must generate an RC4 key to decrypt the hashes; as before, it will be created from the MD5 of several strings. Specifically, the RC4 key is the MD5 of the first 16 bytes of the hashed boot key, the user ID (as a 32-bit little-endian integer), and the string "LMPASSWORD\0" or "NTPASSWORD\0" (depending on whether the key will be used to decrypt a LanMan or NT hash).
If you find code easier to read than English, here's the specific process:
antpassword = "NTPASSWORD\0"
almpassword = "LMPASSWORD\0"
rc4_key_lm = MD5(hbootkey[:0x10] +
                 pack("<L",rid) +
                 almpassword)
rc4_key_nt = MD5(hbootkey[:0x10] +
                 pack("<L",rid) +
                 antpassword)
And, at last, we can decrypt the LM and NT hashes with RC4 using their respective keys. This will give us the same kind of hashes we found in Windows NT -- that is, they still need to be decrypted using DES and the sid_to_key function. This will give us the hashes in a form that we can attempt to break.
Conclusion
Although this process certainly is complicated, in the end, it is no more than an obfuscation technique. An attacker can still easily extract the hashes if he can steal the system and SAM hives, or even just the SAM hive if he has some other means of obtaining the boot key. Moreover, the obfuscation mechanism only has to be reverse engineered once, but the entire protection mechanism will then be useless until the algorithm is changed.
Up next, we'll give LSA secrets the same treatment we gave SysKey. LSA secrets are a protected data store that can store several interesting pieces of information, such as the default password for systems that have auto-logon enabled, the timestamp used by Windows to decide when to stop working if it has not been activated, and an encryption key used to encrypt cached domain credentials.
As a final note, if you'd like to just look at some code implementing this, have a peek at framework/win32/hashdump.py in the CredDump distribution.
Comments
I also wrote a small snippet in Python which takes a password (of an already existing local user, so that I can validate the output), converts it to little-endian Unicode, and then computes the MD4 hash of it.
What I noticed was quite strange!
The output of pwdump7 and creddump match each other. The output of my snippet and SAM Inside Pro match. But the output of SAM Inside Pro and creddump (or pwdump7) do NOT match.
Why is this so?
from Crypto.Hash import MD4
print MD4.new("password".encode("utf-16-le")).hexdigest()
Maybe you're mixing up the LM and NT passwords? The NT hash is the one that is simply MD4(unicode(password))...
administrator:500:cadd0d0a53b98d108eabc435517252b3:3ec9b744f3399c1b97fac549489981bc:::
HelpAssistant:1000:c6fc3e341ade2815513e700d40846c29:f219b7d89d91fdcdcb6ad6d5d59e7190:::
ASPNET:1004:481a3954f4522061e36b77c5fb103670:c470b7be48e031635f7790df898c4873:::
asifm:1006:46f50fd08d46a1a54a4f89c73f8a4a24:716c741a93216e4ed46e56c29a9f7e08:::
fgdump output.
asifm:1006:NO PASSWORD*********************:BC5B20F2A68FFD985171DDA2E2BE111A:::
ASPNET:1004:481A3954F4522061E36B77C5FB103670:C470B7BE48E031635F7790DF898C4873:::
HelpAssistant:1000:C6FC3E341ADE2815513E700D40846C29:F219B7D89D91FDCDCB6AD6D5D59E7190:::
SUPPORT_388945a0:1002:NO PASSWORD*********************:3D050133DF4E81C9F92CC11153426B25:::
administrator:500:NO PASSWORD*********************:80DCB98DE59D38D913F232C3929CA6A5:::
My Code Snippet
uni_le_str = password_guess.encode('utf-16-le')
md4_le = MD4.new()
md4_le.update(uni_le_str)
print "lil endian md4", md4_le.digest().encode('hex')
My code output
lil endian md4 80dcb98de59d38d913f232c3929ca6a5
--------------------------------
My code output matches with fgdump but not with creddump.
I did a small comparison exercise and came up with an interesting result.
sam_inside_pro = fgdump = my_code = lcp
samdump2 = creddump = pwdump7
Any idea what is going wrong and what am I missing?
creddump output
administrator:500:cadd0d0a53b98d108eabc435517252b3:3ec9b744f3399c1b97fac549489981bc:::
HelpAssistant:1000:c6fc3e341ade2815513e700d40846c29:f219b7d89d91fdcdcb6ad6d5d59e7190:::
SUPPORT_388945a0:1002:aad3b435b51404eeaad3b435b51404ee:3d050133df4e81c9f92cc11153426b25:::
ASPNET:1004:481a3954f4522061e36b77c5fb103670:c470b7be48e031635f7790df898c4873:::
fgdump output
ASPNET:1004:481A3954F4522061E36B77C5FB103670:C470B7BE48E031635F7790DF898C4873:::
HelpAssistant:1000:C6FC3E341ADE2815513E700D40846C29:F219B7D89D91FDCDCB6AD6D5D59E7190:::
SUPPORT_388945a0:1002:NO PASSWORD*********************:3D050133DF4E81C9F92CC11153426B25:::
administrator:500:NO PASSWORD*********************:80DCB98DE59D38D913F232C3929CA6A5:::
I ask because there seems to be an extra account found by fgdump (SUPPORT_388945a0) that is not present in the creddump output. How are you obtaining the hives?
I used a live Linux CD (Knoppix) to copy out the hives.
There is only one copy of the hive.
Are you suggesting that pwdump7 and creddump are picking up the files from different locations?
Windows/system32/config/SYSTEM
Windows/system32/config/SAM
?
I'm at a bit of a loss to explain why this would be the case. If you change your password, do the two tools still report different values?
What about the consistency between pwdump7, samdump2, etc.? I would expect one program to go wrong, not three of them all together.
I am setting up a VMWare where I can play more freely.
Let me get back with the results.
Do you think the method of password collection matters? For example, pwdump7 and samdump2 actually read the hashes from the hives, whereas fgdump retrieves them using the DLL injection method.
Thanks for being so persistent in chasing this down!
I set up a VMware machine, renamed the administrator account to susanna, and assigned the same password as on the physical machine.
All the pwdump programs report the same password hash, which is consistent with the MD4 of the password.
So the question is why samdump2 = creddump = pwdump7 misbehave on the physical machine.
So far so good. And now the fun starts.
I copied the registry hives for the physical machine to the vmware and ran all the programs and here are the results.
sam_inside_pro = fgdump = my_code as before.
samdump2 -- garbage output
creddump -- same erroneous hash as on the target machine
pwdump7 -- crashes out.
Any idea...?
The hive from the physical machine is corrupted?
Or the environment on the physical machine is corrupted.
I ran the tests on the physical machine with the dump from a third machine ( not the vmware ).
Again the same results
sam_inside_pro = fgdump = lcp
!=
samdump2 = creddump = pwdump7
There is definitely something here....
-Brendan
But I've never seen any tools that will let you retrieve the hashes as is usual, but then re-encrypt them with a different machine's syskey so they can be essentially cloned and re-inserted into the appropriate registry keys of this other machine.
This would be useful in seeing if login credentials could be cloned or "grafted" successfully on a different system, for example, copying cached creds from an XP 32-bit system to a Win7 64-bit system. Or being able to copy the domain credentials of an existing machine onto a new machine so as to not need a domain admin to authorize the new machine to join the domain - it would simply appear to be the old machine to the DC.
The input can be any syskey obfuscated reg key, including cached credentials, machine passwords, etc.
Does such a tool exist anywhere? Is it as simple as reversing the algorithm that, say, creddump uses?