Uploaded by Vlad Costin Cismaru


Hello, my name is Vlad Cismaru, and today I will be presenting my dissertation paper titled "General
Analysis of Password Format", an incursion into the world of password security and the issues of using
this technology.
Motivation: I want to illustrate the common problems with using passwords as a security measure for
any device, service or account. I will present an Android app that generates secure pseudorandom
passwords using a simple rating algorithm, followed by an example of using the hashcat tool and the
results of a cracking run on MD5 hashes.
Given the long history of passwords, the current paper aims to provide an overview of their modern
usage and explores methods to protect a user from creating self‐defeating weak passwords which an
attacker might exploit to compromise the security of a
It takes into account the human factor of decision in password creation and the methods of
exploitation based on predictability. It also explains the issue with pseudo‐randomness and low
entropy vulnerability.
Various strategies for improving overall password strength are outlined; they are meant to convey the
crucial tradeoff between usability and security.
All of the theories presented are analyzed using a dataset of passwords from various sources, which is
used to identify predictability and entropy and to present a case for a general understanding of password
The last section of this paper presents a solution to one of the most common problems encountered
by users who keep different passwords for different accounts: a piece of software called
"PasswordGen", which essentially generates all the passwords in a centralized manner. This solution
provides security and convenience for the average user who has to deal with a large amount of
password data in everyday use. A use‐case scenario also illustrates the vulnerability of weak
passwords using specific tools.
Passwords – Usability versus Security
Password authentication is one of the most widespread mechanisms for authenticating a user in a
computer system. It is a very convenient method due to its high usability. The concept of a shared
secret in the form of a password is currently the best option a human has to identify themselves,
because it is also very deployable and does not need any extra hardware. This holds when we compare
passwords to other ways of performing authentication, such as biometric measurements, which
require specialized hardware, or methods like smart cards or OTP tokens.
Any password that is created, usually by a user, needs to adhere to a set of rules in order to ensure
usability and strength. These two qualities of a password are needed to protect the user from a
myriad of attacks that are aimed at compromising an account. Unfortunately, they are also limited by
human cognition: user‐created passwords are very predictable, and there is a constant struggle
between security and usability. [4]
It is known that a person can recall on the fly a string of alphanumeric characters that is about seven
characters long. This is widely accepted and has been extensively studied. Regrettably, this length is
not sufficient to ensure a proper security level for any given account. Most of the time, when a user is
asked to enter a password, a “Password Strength Meter” is used to measure and evaluate its
complexity and “strength”, which is usually determined by calculating the entropy of the string. The
entropy of the string depends on how random its characters are. Other factors, such as length and
the use of special characters, are also taken into account when estimating how long it would take to
guess that particular string using techniques such as brute‐force cracking or wordlist‐based
attacks. [5]
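As a rough illustration of the entropy calculation mentioned above, here is a minimal sketch in Python. It assumes each character is chosen independently and uniformly from the character classes that appear in the password, so the estimate is length times log2 of the pool size. The function name and the pool sizes are my own assumptions for illustration; real strength meters also penalize dictionary words and common patterns, so this is only an upper bound.

```python
import math
import string

def estimate_entropy_bits(password: str) -> float:
    """Naive upper-bound estimate: length * log2(size of the character pool).

    Assumption: characters are independent and uniformly random, which is
    rarely true for human-chosen passwords.
    """
    pool = 0
    if any(c in string.ascii_lowercase for c in password):
        pool += 26
    if any(c in string.ascii_uppercase for c in password):
        pool += 26
    if any(c in string.digits for c in password):
        pool += 10
    if any(c in string.punctuation for c in password):
        pool += len(string.punctuation)  # the 32 ASCII symbols
    if pool == 0:
        return 0.0  # empty input, or only characters outside these classes
    return len(password) * math.log2(pool)
```

Under these assumptions a seven‐character lowercase password comes out at roughly 33 bits, which is why the memorable length alone is not enough.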
What are hashes?
A hash function is any function that can be used to map data of arbitrary size to fixed‐size values. The
values returned by a hash function are called hash values, hash codes, digests, or simply hashes. The
values are used to index a fixed‐size table called a hash table. Use of a hash function to index a hash
table is called hashing or scatter storage addressing.
Hash functions and their associated hash tables are used in data storage and retrieval applications to
access data in a small and nearly constant time per retrieval, and storage space only fractionally
greater than the total space required for the data or records themselves. Hashing is a computationally
and storage space efficient form of data access which avoids the non‐linear access time of ordered
and unordered lists and structured trees, and the often exponential storage requirements of direct
access of state spaces of large or variable‐length keys.
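To make the hash table idea concrete, here is a small Python sketch. The polynomial hash, the class name and the table size are assumptions chosen for illustration only; this is a non‐cryptographic hash and has nothing to do with password hashing.

```python
def bucket_index(key: str, table_size: int) -> int:
    """Map a key of arbitrary length to one of `table_size` slots."""
    h = 0
    for byte in key.encode("utf-8"):
        h = (h * 31 + byte) % table_size  # simple polynomial hash
    return h

class HashTable:
    """Fixed-size table; keys that collide share a bucket (chaining)."""

    def __init__(self, size: int = 16):
        self.buckets = [[] for _ in range(size)]

    def put(self, key: str, value) -> None:
        bucket = self.buckets[bucket_index(key, len(self.buckets))]
        for i, (k, _) in enumerate(bucket):
            if k == key:
                bucket[i] = (key, value)  # overwrite an existing key
                return
        bucket.append((key, value))

    def get(self, key: str, default=None):
        bucket = self.buckets[bucket_index(key, len(self.buckets))]
        for k, v in bucket:
            if k == key:
                return v
        return default
```

A lookup only scans one bucket, which is where the nearly constant access time comes from as long as the keys spread evenly over the table.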
SLIDE 3 and 4
The basic strategy when creating a strong password is to give it enough complexity and randomness
that predicting or guessing it becomes infeasible in a reasonable amount of time. The human factor is
probably one of the most consequential when creating a random password, because most humans
have a poor concept of randomness, which is by nature difficult to define. Randomness by definition
should be unpredictable, but humans are hardwired to detect patterns and think in terms of
predictable behavior, and thus a potential target can be studied and exploited from this
perspective. Human beings perceive randomness in a sequence like (78, 65, 43, 21, 95) through an
apparent lack of order, which makes it hard to predict the next element. But a lack of order does not
guarantee that the sequence is random. A more accurate definition would be that a sequence of
characters is truly random if there is no way it can be replicated given any circumstances or
information. For a sequence to be labeled as actually random, it must have several properties:
Uniqueness – the sequence very rarely repeats a pattern of data more than once, so the longer the
sequence, the more unique it becomes
Unpredictability – the information in a given sequence does not abide by a repeatable pattern,
and accordingly there is no clear distinctive relationship between the elements of that
Even Distribution – a balanced probability of distribution in the entirety of the data set
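One way to get these properties in practice is to let a cryptographically secure generator pick each character instead of a human. The sketch below uses Python's standard `secrets` module; the alphabet and the default length are assumptions for illustration, and this is not the algorithm used in the Android app.

```python
import secrets
import string

# Assumed alphabet: letters, digits and ASCII punctuation (94 symbols).
ALPHABET = string.ascii_letters + string.digits + string.punctuation

def generate_password(length: int = 16) -> str:
    """Pick each character independently and uniformly from ALPHABET."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))
```

Because every position is drawn uniformly and independently, no character or pattern is favored, which is the even distribution and unpredictability described above.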
Example of permutations vs combinations of the digits 0‐9 in the context of complexity
Permutations: P(n,r) = n! / (n ‐ r)!
Combinations: C(n,r) = n! / (r! (n ‐ r)!)
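A quick worked example of the two formulas for the digits 0‐9, sketched in Python with the standard `math` module. The choice of r = 4 (a four‐digit code) is my own assumption for illustration.

```python
import math

n, r = 10, 4  # ten digits, four positions (assumed example)

ordered_no_repeat = math.perm(n, r)    # P(10,4) = 10! / 6! = 5040
unordered_no_repeat = math.comb(n, r)  # C(10,4) = 10! / (4! * 6!) = 210
ordered_with_repeat = n ** r           # 10^4 = 10000, e.g. a 4-digit PIN

print(ordered_no_repeat, unordered_no_repeat, ordered_with_repeat)
```

For passwords, order matters and characters may repeat, so the search space an attacker faces is the last figure, n to the power r, and it grows with both the alphabet size and the length.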
Short presentation of the Android app
Brute‐force attack
Combinator attack
Dictionary attack
Fingerprint attack
Hybrid attack
Mask attack
Permutation attack
Rule‐based attack
Table‐Lookup attack (CPU only)
Toggle‐Case attack
PRINCE attack [8]
Hybrid attack ‐‐> combinator attack + brute force
Toggle‐Case attack ‐‐> all upper‐ and lower‐case combinations of a word in a dictionary
PRINCE attack ‐‐> specialized to better use the GPU performance
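To show what the toggle‐case and hybrid notes mean in terms of candidate generation, here is a simplified Python sketch. It is not how hashcat implements these modes; the function names are hypothetical, and the hybrid variant shown is only the "dictionary word plus brute‐forced suffix" case.

```python
from itertools import product

def toggle_case(word: str):
    """Yield every upper/lower-case variant of `word` (ASCII letters assumed)."""
    choices = [(c.lower(), c.upper()) if c.isalpha() else (c,) for c in word]
    for combo in product(*choices):
        yield "".join(combo)

def hybrid(words, suffix_charset: str, suffix_len: int):
    """Yield each dictionary word followed by every possible suffix."""
    for word in words:
        for suffix in product(suffix_charset, repeat=suffix_len):
            yield word + "".join(suffix)
```

A word with k letters produces 2^k toggle‐case candidates, and a hybrid run multiplies the wordlist size by the number of possible suffixes, so both stay far smaller than a full brute‐force search.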
SLIDE 8 (demo)
Showing the cracking results and the original myhashes.txt
Questions and Thank yous.
A regular expression (shortened as regex or regexp;[1] also referred to as rational expression)[2][3] is
a sequence of characters that defines a search pattern. Usually such patterns are used by string
searching algorithms for "find" or "find and replace" operations on strings, or for input validation. It is
a technique developed in theoretical computer science and formal language theory.
The MD5 message‐digest algorithm is a widely used hash function producing a 128‐bit hash value.
Although MD5 was initially designed to be used as a cryptographic hash function, it has been found to
suffer from extensive vulnerabilities. It can still be used as a checksum to verify data integrity, but
only against unintentional corruption. It remains suitable for other non‐cryptographic purposes, for
example for determining the partition for a particular key in a partitioned database.
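As a small illustration of why unsalted MD5 hashes of weak passwords fall quickly, here is a minimal dictionary attack sketched in Python with the standard `hashlib` module. It is a toy stand‐in for the idea, not for hashcat itself; the function names and the sample wordlist are assumptions for illustration.

```python
import hashlib

def md5_hex(text: str) -> str:
    """Return the 128-bit MD5 digest of `text` as 32 hex characters."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()

def dictionary_attack(target_hashes, wordlist):
    """Hash every candidate word and report those matching a target hash."""
    targets = {h.lower() for h in target_hashes}
    cracked = {}
    for word in wordlist:
        digest = md5_hex(word)
        if digest in targets:
            cracked[digest] = word
    return cracked

# Hypothetical usage: one leaked hash, a tiny wordlist.
leaked = ["5f4dcc3b5aa765d61d8327deb882cf99"]  # MD5 of "password"
print(dictionary_attack(leaked, ["123456", "qwerty", "password"]))
```

Each guess costs only one fast hash computation, so any password that appears in a wordlist is recovered almost immediately.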