Use of Distributed or Parallel Computing to Crack
Windows Password Hashes
Shishir Jha (Author)
Department of Computer Science
Hood College
Frederick, Maryland, USA
sj4@hood.edu
Abstract: Cracking Windows password hashes has traditionally been
a single-process task that demands extensive computing resources
and time. Because parallel or distributed computing can meet that
demand by delegating the processing to multiple nodes, this paper
examines whether password cracking can be modified to take
advantage of multiple nodes and what speedup can be expected.
Keywords: LM hash, parallel, distributed, dictionary attack,
Windows password
I. INTRODUCTION
Passwords have been used for centuries to challenge the
credentials of anyone attempting to enter a secured compound or
join a private gathering. Computers are no different and require
some form of password so that only authorized personnel can
access the information they hold. Passwords came into use on
computers in the sixties and seventies, when the computing world
needed a more structured method of access control.[1]
The first password schemes relied simply on a flat file, located
on disk or in memory, which contained the user names and
passwords. These password files were typically locked by the
operating system; however, some systems allowed any user with
the appropriate privileges to access the file.
When Windows 95 was released, user account information was
placed in a .pwl (password list) file, which was encrypted with
the RC4 algorithm using the password as the key. When a user
entered his or her credentials, the password was encrypted and
the checksum of the .pwl file was compared against the checksum
derived from the user credentials. By the time Windows XP was
released, the passwords had been moved to a SAM (Security
Account Manager) file, which was encrypted with the SYSKEY. The
passwords were still not placed in the file directly; rather, a
hashed version of each password was saved. The hashing was based
on DES for the LM hash and on MD4 for the NT hash. To gain
access to the computer, the user entered a password, which was
hashed and compared with the contents of the SAM file.
Hashes in the Windows SAM file are computed using either the LM
hash method or the NTLM hash method. Although it is based on DES
encryption, the LM hash is not a true one-way function. The way
the LM hash function is implemented introduces several
weaknesses, which allow careful programmers[2] to use different
algorithms to recover the password from the LM hash.
II. BACKGROUND
A. LM Hash
Although NTLM has largely replaced the LM hash in the
application protocols used to authenticate remote users and to
provide session security when requested by the application, the
LM hash is still used on the vast majority of Windows machines
that are not part of a remote domain or an Active Directory
based network.[3] Moreover, Windows versions released before
Vista still compute and store the LM hash by default, for
compatibility with previous-generation clients whose 16-bit
applications require LM hash based authentication. This paper
attempts to exploit this security hole in the Windows
architecture to gain access to users' authentication
credentials.
To understand the problem that this paper attempts to solve, it
is first necessary to understand how the LM hash encrypts the
user credentials. The general steps in computing an LM hash are
as follows: [2]
1. The ASCII password entered by the user is converted to
UPPERCASE.
2. The password is then null-padded to 14 bytes and split into
two 7-byte halves.
3. Each half is used to create one DES key, giving two 64-bit
DES keys (56 key bits plus 8 parity bits each).
4. Each of these keys is used to DES-encrypt the constant ASCII
string "KGS!@#$%", resulting in two 8-byte cipher text values.
5. The two 8-byte cipher text values are concatenated to produce
the 16-byte LM hash.
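The first three steps can be sketched in Python as follows. This is a minimal illustration, not a complete LM hash: the function names are hypothetical, truncation of longer input to 14 bytes is an assumption of the sketch, and the DES encryption of steps 4 and 5 is omitted because the Python standard library has no DES implementation.

```python
from typing import Tuple


def lm_halves(password: str) -> Tuple[bytes, bytes]:
    """Steps 1-2: uppercase, null-pad to 14 bytes, split into 7-byte halves."""
    # Truncating to 14 bytes is an assumption made for this sketch.
    raw = password.upper().encode("ascii")[:14]
    padded = raw.ljust(14, b"\x00")
    return padded[:7], padded[7:]


def expand_des_key(half: bytes) -> bytes:
    """Step 3: spread 56 key bits over 8 bytes, 7 bits per byte plus parity."""
    if len(half) != 7:
        raise ValueError("expected a 7-byte half")
    bits = int.from_bytes(half, "big")            # the 56 key bits as one integer
    key = bytearray()
    for i in range(8):
        seven = (bits >> (49 - 7 * i)) & 0x7F     # next group of 7 bits
        byte = seven << 1                         # leave the low bit for parity
        if bin(byte).count("1") % 2 == 0:         # DES keys use odd parity
            byte |= 1
        key.append(byte)
    return bytes(key)


first, second = lm_halves("password")
print(first, second)   # b'PASSWOR' b'D\x00\x00\x00\x00\x00\x00'
print(expand_des_key(first).hex(), expand_des_key(second).hex())
```

Because the two halves never influence each other, each 8-byte half of the final hash can be attacked separately.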
The hashes produced by both NTLM and LM are stored in a SAM file
under the System32 directory of the Windows installation. The
SAM file is itself encrypted using a key kept in a different
file.
B. Different ways of extracting the password
Though the LM hash uses DES encryption, the way the LM hash
applies it means that the encryption is not a true one-way
function: with some widely known hash attack algorithms and a
fair amount of time and computing resources, the hash can be
reversed to the user credentials. Simple math shows that the
total number of passwords to be searched is ideally about 2^92
(95^14, for 14 characters drawn from the 95 printable ASCII
characters), but because of steps such as converting the
characters to uppercase and splitting the password into two
halves, the real problem domain for any hash attack algorithm is
reduced to about 2^43 (69^7 for each 7-character half).
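The arithmetic behind these two figures can be checked with a few lines of Python. The sketch assumes 95 printable ASCII characters, of which the 26 lowercase letters disappear after uppercasing, and it counts only full-length candidates (shorter passwords add comparatively little).

```python
import math

PRINTABLE_ASCII = 95   # space through tilde
LOWERCASE = 26         # folded into uppercase by the first LM step

# One full 14-character password, case preserved.
full_space = PRINTABLE_ASCII ** 14
# One uppercased 7-character half, attacked independently of the other half.
half_space = (PRINTABLE_ASCII - LOWERCASE) ** 7

full_bits = math.log2(full_space)
half_bits = math.log2(half_space)
print(f"full password: about 2^{full_bits:.1f} candidates")
print(f"one LM half:   about 2^{half_bits:.1f} candidates")
```

The exponents round to 92 and 43, so splitting and uppercasing shrink the search by roughly 49 bits.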
The most common cracking method is the dictionary attack. In
essence, it does nothing more than compare the hash obtained
from the host computer with the hashes of the entries in a
pre-existing dictionary of commonly used passwords. This method
is a long shot and usually works only for weak and commonly used
passwords. The ability of the algorithm to crack a password
depends directly on how large and diverse the dictionary is.
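A serial dictionary attack reduces to a single loop, sketched below. SHA-256 stands in for the LM hash here only because it ships with the Python standard library; the names stand_in_hash and dictionary_attack are hypothetical.

```python
import hashlib
from typing import Callable, Iterable, Optional


def stand_in_hash(word: str) -> str:
    """Placeholder for the LM hash (assumption: SHA-256 used for illustration)."""
    return hashlib.sha256(word.encode("utf-8")).hexdigest()


def dictionary_attack(
    target_hash: str,
    words: Iterable[str],
    hash_fn: Callable[[str], str] = stand_in_hash,
) -> Optional[str]:
    """Hash each dictionary word in turn and return the first one that matches."""
    for word in words:
        if hash_fn(word) == target_hash:
            return word
    return None   # the password is not in the dictionary


wordlist = ["password", "letmein", "qwerty"]
print(dictionary_attack(stand_in_hash("letmein"), wordlist))   # letmein
print(dictionary_attack(stand_in_hash("zebra"), wordlist))     # None
```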
The next technique used to crack hashes is the brute-force
method. Essentially, a password generator produces every
possible combination of characters until a match is found.
Though bound to work eventually, its time complexity grows with
the password length and with the set of characters and symbols
used to generate the passwords.
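A brute-force generator can be sketched with itertools.product. As before, SHA-256 is only a stand-in for the LM hash, and the function names are hypothetical; keyspace_size shows how quickly the number of candidates grows with alphabet size and length.

```python
import hashlib
import itertools
from typing import Iterator, Optional


def stand_in_hash(word: str) -> str:
    """Placeholder for the LM hash (assumption: SHA-256 used for illustration)."""
    return hashlib.sha256(word.encode("utf-8")).hexdigest()


def candidates(alphabet: str, max_len: int) -> Iterator[str]:
    """Yield every string over `alphabet` of length 1 through max_len."""
    for length in range(1, max_len + 1):
        for combo in itertools.product(alphabet, repeat=length):
            yield "".join(combo)


def keyspace_size(alphabet_size: int, max_len: int) -> int:
    """Number of candidates the generator above produces."""
    return sum(alphabet_size ** n for n in range(1, max_len + 1))


def brute_force(target_hash: str, alphabet: str, max_len: int) -> Optional[str]:
    """Try every candidate until one hashes to the target."""
    for candidate in candidates(alphabet, max_len):
        if stand_in_hash(candidate) == target_hash:
            return candidate
    return None


print(brute_force(stand_in_hash("CAB"), "ABC", 3))   # CAB
print(keyspace_size(3, 3))                           # 39
```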
The third and now widely used method is the rainbow table
method, implemented in a cracking utility by Zhu Shuanglei. His
tool is based on Philippe Oechslin's faster time-memory
trade-off technique, which proposes a new way of pre-calculating
the data that reduces by a factor of two the number of
calculations necessary for cryptanalysis [4]. For cracking
passwords, this method is almost equivalent to a brute-force
attack, except that a rainbow table uses pre-calculated chains
of words stored in the table. Although cracking time decreases
by a factor of hundreds, if not thousands, with this method, its
main pitfall is the time investment required to build the
tables. However, thanks to the internet and the wide
availability of tools and resources under open licenses,
numerous libraries of pre-built tables are available online for
free.
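The idea of pre-calculated chains can be illustrated with a toy rainbow table. This sketch is not the cited tool's implementation: the tiny alphabet, the chain length, the SHA-256 stand-in hash, and the reduction function are all assumptions chosen to keep the example short.

```python
import hashlib
from typing import Dict, Iterable, Optional

ALPHABET = "ABCD"   # toy alphabet (assumption)
LENGTH = 3          # fixed candidate length (assumption)
CHAIN_LEN = 20      # hash/reduce steps per chain (assumption)


def h(pw: str) -> bytes:
    """Stand-in hash function; a real table would use the LM hash."""
    return hashlib.sha256(pw.encode("ascii")).digest()


def reduce_hash(digest: bytes, column: int) -> str:
    """Map a digest back into the password space; varies with the column."""
    n = int.from_bytes(digest, "big") + column
    chars = []
    for _ in range(LENGTH):
        n, r = divmod(n, len(ALPHABET))
        chars.append(ALPHABET[r])
    return "".join(chars)


def build_chain(start: str) -> str:
    """Walk one chain and return its end point; only start and end are stored."""
    pw = start
    for col in range(CHAIN_LEN):
        pw = reduce_hash(h(pw), col)
    return pw


def build_table(starts: Iterable[str]) -> Dict[str, str]:
    """Map each chain's end point to its starting password."""
    return {build_chain(s): s for s in starts}


def lookup(table: Dict[str, str], target: bytes) -> Optional[str]:
    """Assume the target sits in each column in turn and walk to the chain end."""
    for col in range(CHAIN_LEN - 1, -1, -1):
        pw = reduce_hash(target, col)
        for c in range(col + 1, CHAIN_LEN):
            pw = reduce_hash(h(pw), c)
        start = table.get(pw)
        if start is None:
            continue
        # End point matched: replay the chain from its start to find the password.
        cand = start
        for c in range(CHAIN_LEN):
            if h(cand) == target:
                return cand
            cand = reduce_hash(h(cand), c)
    return None   # false alarms only, or the password is in no stored chain
```

Building the table is the expensive step; a lookup needs only a bounded number of hash and reduce operations per column.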
III. PRESENT WORK
There has been widespread work in this field, as recovering
passwords for both legitimate and illegitimate purposes has been
a necessity ever since passwords came into use. Though there are
freely available tools that attempt to crack a password using
one of the methods mentioned above, there is no guarantee of
results. This is primarily because of the time complexity
involved and the failure to properly utilize the computing
resources available in today's multicore computers with
high-speed interfaces.
To fill the gap left by these free tools, there are commercially
available tools that use specialized algorithms, pre-hashed
tables, and rainbow tables to crack the password. Furthermore,
there are web sites that offer to crack a password for you,
provided you can give them the hash from the SAM file. These
sites use massive collections of rainbow tables, ranging
anywhere from 60 to 160 GB, to crack the hashes their users
submit.
Almost all the downloadable programs available on the internet
are inherently single-processor programs and are not optimized
to take advantage of a multiprocessor environment. Though the
general public may not have access to a massively parallel
system or a distributed computing environment for daily use, the
fact that these programs cannot even use the processing power of
the multicore processors in common use today puts them at a
disadvantage. For this and other reasons outside the scope of
this paper, acceptance of such programs has not grown over the
years. The presence of various commercial tools and web sites on
the internet makes it evident that there is a need for a tool
that runs faster and utilizes the computing resources available
in a general user's computer.
It is worth mentioning, however, that although there are
projects that have implemented crack tools in various parallel
programming paradigms to utilize multiple cores and other recent
advances in computing resources, little material about them is
available for review.
Furthermore, there are numerous implementations of crack tools
that use the massively parallel capability of today's graphics
cards to do the calculations for cracking hashes generated by
different programs. Programmed primarily with CUDA and similar
interfaces, tools of this kind claim huge improvements in
cracking time over traditional single-processor and SMP based
computers. But because such programs require a compatible
graphics card in the user's computer, using them is not always
feasible.
Since password cracking belongs to the many families of problems
that can use parallel computing techniques to improve their
performance, this paper and the final project attempt to measure
how much difference parallel and distributed computing can make
over traditional methods of cracking a password.
IV. APPLYING PARALLEL COMPUTING TECHNIQUES
Applying parallel computing techniques to a problem that is
repeated over and over and that can be broken into smaller
pieces is an intuitive thing to do. All crack tools essentially
take an approach in which the algorithm serially compares the
hash in hand with hashes that are either present in a table or
dictionary or generated according to constraints set by the
user. Because the data sets the algorithm works on are usually
mutually exclusive, or can be made so, there is a distinct
possibility that parallel or distributed computing, in which the
data set is divided among processors, will speed up the process.
Although some of the methods explained above place numerous
constraints on how the algorithm works, the brute-force attack
and the dictionary attack operate on data and constraints that
are pre-defined and do not depend on pre-formulated chains, as
the rainbow table attack does. Hence, by dividing and
distributing either the dictionary in a dictionary attack or the
password generator constraints in a brute-force attack, a good
speedup can theoretically be achieved with parallel or
distributed computing. Since the hash calculation and matching
are done entirely within each cluster node or individual
processor, communication and setup overhead is minimal and,
compared to the processing time required, should have an
insignificant effect on overall computing time, which in turn
favors a better speedup.
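One way to divide a dictionary into mutually exclusive pieces is a contiguous block partition, sketched below. The function names and the choice of block partitioning (rather than, say, round-robin assignment) are assumptions of this example.

```python
from typing import List, Sequence, Tuple


def block_bounds(n_items: int, n_ranks: int, rank: int) -> Tuple[int, int]:
    """Return the [start, stop) slice of the dictionary owned by `rank`."""
    base, extra = divmod(n_items, n_ranks)
    # The first `extra` ranks take one additional word each.
    start = rank * base + min(rank, extra)
    stop = start + base + (1 if rank < extra else 0)
    return start, stop


def partition(words: Sequence[str], n_ranks: int) -> List[Sequence[str]]:
    """Split the dictionary into n_ranks disjoint, contiguous blocks."""
    blocks = []
    for rank in range(n_ranks):
        start, stop = block_bounds(len(words), n_ranks, rank)
        blocks.append(words[start:stop])
    return blocks


print(partition(list("abcdefghij"), 3))
# [['a', 'b', 'c', 'd'], ['e', 'f', 'g'], ['h', 'i', 'j']]
```

Each node needs only its own rank and the dictionary size to compute its block, so no communication is required to agree on the split.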
The attempt here will be to parallelize or distribute the
dictionary attack algorithm for cracking the hash file with an
MPI or OpenMP based parallel computing approach, by distributing
the hash file to multiple nodes, each with a pre-assigned
dictionary on which to run the algorithm. On success, a node
will reply to the root node to indicate that a matching hash has
been found, and the elapsed time will be recorded.
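The root and worker roles described here can be simulated in a few lines. This sketch does not use MPI or OpenMP: Python threads stand in for the nodes purely to show the data flow, so it will not demonstrate a real speedup, and SHA-256 again stands in for the LM hash. All function names are hypothetical.

```python
import hashlib
import time
from concurrent.futures import ThreadPoolExecutor
from typing import List, Optional, Sequence, Tuple


def stand_in_hash(word: str) -> str:
    """Placeholder for the LM hash (assumption: SHA-256 used for illustration)."""
    return hashlib.sha256(word.encode("utf-8")).hexdigest()


def worker(rank: int, target_hash: str,
           words: Sequence[str]) -> Tuple[int, Optional[str]]:
    """One node: search only its pre-assigned dictionary slice."""
    for word in words:
        if stand_in_hash(word) == target_hash:
            return rank, word
    return rank, None


def root(target_hash: str, slices: List[Sequence[str]]
         ) -> Tuple[Optional[int], Optional[str], float]:
    """Send the hash to every worker, collect replies, and time the search."""
    started = time.perf_counter()
    with ThreadPoolExecutor(max_workers=len(slices)) as pool:
        futures = [pool.submit(worker, rank, target_hash, words)
                   for rank, words in enumerate(slices)]
        replies = [f.result() for f in futures]
    elapsed = time.perf_counter() - started
    for rank, word in replies:
        if word is not None:
            return rank, word, elapsed   # the rank that reported the match
    return None, None, elapsed


slices = [["alpha", "bravo"], ["charlie", "delta"], ["echo"]]
rank, word, seconds = root(stand_in_hash("delta"), slices)
print(rank, word)   # 1 delta
```

This version waits for every worker to finish; a real implementation would probably let the root tell the remaining nodes to stop once one of them reports a match.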
V. EXPECTED RESULT AND CONCLUSION
Ideally, the expected speedup of the algorithm equals the number
of processes spawned by the MPI program, but this holds only in
the worst case, in which every dictionary entry must be tested.
Even in the general case, however, this exercise in parallel
computing should yield substantial speedup data that can be used
to measure the viability of using parallel processes to speed up
programs that were designed for single-processor computers. Any
parallel computation needs a sizeable amount of input data to
produce meaningful results. Although the data handled in the
test may be somewhat limited, because the processing is
distributed among more than one node, the overall result should
produce data from which meaningful conclusions can be drawn
about the work being done.
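How the expected speedup depends on where the password sits in the dictionary can be explored with a simplified model. The model assumes a contiguous block partition, nodes of equal speed, and no communication cost; it counts hash comparisons rather than measuring time, and the function name is hypothetical.

```python
def speedup_for_match(n_words: int, n_ranks: int, match_index: int) -> float:
    """Serial comparisons divided by the comparisons made by the owning rank."""
    if not 0 <= match_index < n_words:
        raise ValueError("match_index must point inside the dictionary")
    base, extra = divmod(n_words, n_ranks)
    start = 0
    for rank in range(n_ranks):
        size = base + (1 if rank < extra else 0)   # block owned by this rank
        if match_index < start + size:
            break                                  # this rank finds the match
        start += size
    serial = match_index + 1             # one process scans from the top
    parallel = match_index - start + 1   # owning rank scans from its block start
    return serial / parallel


print(speedup_for_match(1000, 4, 999))   # 4.0   (match is the last entry)
print(speedup_for_match(1000, 4, 0))     # 1.0   (match is the first entry)
print(speedup_for_match(1000, 4, 250))   # 251.0 (match opens the second block)
```

Under these assumptions the speedup equals the number of processes only when the whole dictionary has to be scanned; for other positions it can fall below or exceed that figure, which is why measured results are needed.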
REFERENCES
[1] M Naylor, S Jha, S K Mehta, “Hacking Windows Password,”
unpublished.
[2] Microsoft Knowledgebase, www.microsoft.com
[3] Brian Wilson, “LM & MD5 Hash Security & Cracking,”
[4] Philippe Oechslin, “Making a Faster Cryptanalytic Time-Memory Trade-Off,”