Information Theory and Security

Authentication Methods: From
Digital Signatures to Hashes
Lecture Motivation


We have looked at confidentiality services, and also examined
the information theoretic framework for security.
Confidentiality between Alice and Bob only guarantees that Eve
cannot read the message. It does not address:
– Is Alice really talking to Bob?
– Is Bob really talking to Alice?

In this lecture, we will look at the following problems:
– Entity Authentication: Proof of the identity of an individual
– Message Authentication: (Data origin authentication) Proof that
the source of information really is what it claims to be
– Message Signing: Binding information to a particular entity
– Data Integrity: Ensuring that information has not been altered by
unknown entities
Lecture Outline

Discrete Logarithms and ElGamal
– Primitive elements and some more number theory (quickly)
– DLOG
– ElGamal, another Public Key Algorithm…

Digital Signatures:
– The basic idea
– RSA Signatures and ElGamal Signatures
– Inefficiencies: Hashing and Signing

Hash Functions:
– Definitions and terminology
– CHP Hash
– SHA-1

Message Authentication Codes
Note: Some attacks will be discussed. More attacks and cryptanalysis will come later in the semester.
Primitive Roots

Consider the following powers of 3 (mod 7):
$3^1 \equiv 3,\ 3^2 \equiv 2,\ 3^3 \equiv 6,\ 3^4 \equiv 4,\ 3^5 \equiv 5,\ 3^6 \equiv 1 \pmod 7$
Note that we obtain all non-zero numbers mod 7.
When this happens, we call 3 a primitive root (or generator) mod 7.

Is a number always a primitive root? No.

If p is prime there are $\phi(p-1)$ primitive roots mod p, where $\phi$ is Euler's phi function.

How to find them? Good homework problem… 

Proposition: Let g be a primitive root for the prime p.
1. If n is an integer, then $g^n \equiv 1 \pmod p$ if and only if $n \equiv 0 \pmod{p-1}$.
2. If j and k are integers, then $g^j \equiv g^k \pmod p$ if and only if $j \equiv k \pmod{p-1}$.
Proof: We sketch (1) on the board.
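The powers-of-3 example and the count of primitive roots can be checked by brute force. The sketch below is only an illustration for tiny primes; the helper names `is_primitive_root` and `euler_phi` are made up for this example.

```python
from math import gcd

def is_primitive_root(g, p):
    """True if the powers g^1 .. g^(p-1) hit every nonzero residue mod p."""
    return {pow(g, k, p) for k in range(1, p)} == set(range(1, p))

def euler_phi(n):
    """Euler's phi: how many k in 1..n are coprime to n (brute force)."""
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)

p = 7
powers_of_3 = [pow(3, k, p) for k in range(1, p)]
roots = [g for g in range(1, p) if is_primitive_root(g, p)]

print(powers_of_3)                      # [3, 2, 6, 4, 5, 1]
print(roots)                            # [3, 5]
print(len(roots) == euler_phi(p - 1))   # True
```

Brute force like this is hopeless for cryptographic sizes, which is part of why finding primitive roots makes a good homework problem.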
Discrete Logarithms

Let p be a prime, and a and b nonzero integers (mod p) with
$b \equiv a^x \pmod p$

The problem of finding x is called the discrete logarithm
problem, and is written:
$x = L_a(b)$

Often a will be a primitive root mod p.

The discrete log behaves like the normal log in many ways:
$L_a(b_1 b_2) \equiv L_a(b_1) + L_a(b_2) \pmod{p-1}$

Generally, finding the discrete log is a hard problem.

$f(x) = a^x \pmod p$ is an example of a one-way function.
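For a toy prime the discrete log can be found by exhaustive search, which also lets us check the logarithm rule numerically. This is a minimal sketch: `discrete_log` is a hypothetical helper, and the values p = 7, a = 3 are borrowed from the primitive root example.

```python
def discrete_log(a, b, p):
    """Smallest x >= 0 with a^x = b (mod p), by trying every exponent."""
    value = 1
    for x in range(p - 1):
        if value == b % p:
            return x
        value = (value * a) % p
    return None  # b is not a power of a

p, a = 7, 3
b1, b2 = 2, 6
x1 = discrete_log(a, b1, p)
x2 = discrete_log(a, b2, p)
x12 = discrete_log(a, (b1 * b2) % p, p)

# The log of a product is the sum of the logs, modulo p - 1.
print(x1, x2, x12)                  # 2 3 5
print(x12 == (x1 + x2) % (p - 1))   # True
```

The loop takes up to p - 1 steps, so it is exponential in the bit length of p. Computing a^x mod p, by contrast, is fast, which is the one-way asymmetry.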
ElGamal Public Key Cryptosystem

One way functions are often used to construct public key
cryptosystems. We saw one in RSA, we now show an example
using the DLOG problem.

Alice wants to send m to Bob. Bob chooses a large prime p and a
primitive root α. We assume 0 < m < p. Bob also chooses a
secret integer a and computes $\beta \equiv \alpha^a \pmod p$.
Bob’s public key is: (p, α, β)
Alice does:
1. Chooses a secret random integer k and computes $r \equiv \alpha^k \pmod p$.
2. Computes $t \equiv \beta^k m \pmod p$.
3. Sends (r,t) to Bob.
Bob decrypts by:
$t r^{-a} \equiv m \pmod p$
This works because $r^a \equiv \alpha^{ak} \equiv \beta^k \pmod p$.
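The key generation, encryption and decryption steps can be sketched as follows. The function names and the toy values p = 2579 and base 2 are assumptions for illustration only; real parameters are far larger. Decryption is correct for any base, while security relies on a good choice of prime and primitive root.

```python
import secrets

def elgamal_keygen(p, alpha):
    """Bob: pick a secret a and publish (p, alpha, beta = alpha^a mod p)."""
    a = secrets.randbelow(p - 2) + 1          # secret exponent in [1, p-2]
    return a, (p, alpha, pow(alpha, a, p))

def elgamal_encrypt(m, public_key):
    """Alice: draw a fresh random k for every message, output (r, t)."""
    p, alpha, beta = public_key
    k = secrets.randbelow(p - 2) + 1
    r = pow(alpha, k, p)
    t = (pow(beta, k, p) * m) % p
    return r, t

def elgamal_decrypt(r, t, a, p):
    """Bob: m = t * r^(-a) mod p (negative exponents need Python 3.8+)."""
    return (t * pow(r, -a, p)) % p

p, alpha = 2579, 2                            # toy parameters only
a, public_key = elgamal_keygen(p, alpha)
m = 1299
r, t = elgamal_encrypt(m, public_key)
print(elgamal_decrypt(r, t, a, p) == m)       # True
```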
ElGamal Public Key Cryptosystem, pg. 2

Important issues…
– a must be kept secret, else Eve can decrypt
– Eve sees (r,t): t is the product of two random numbers and is
hence random. Knowing r does not really help as Eve would
need to be able to solve DLOG in order to get k.

Very important: A different random k must be used for each
message!
– If we have m1 and m2, and use the same k, then the
ciphertexts will be (r,t1) and (r,t2)
– If Eve ever finds m1 then she has m2 also!!!
$t_1 / m_1 \equiv \beta^k \equiv t_2 / m_2 \pmod p \;\Rightarrow\; m_2 \equiv t_2 m_1 / t_1 \pmod p$
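The danger of reusing k is easy to demonstrate. In this sketch all numbers (the prime, the secret exponent, k and both messages) are arbitrary toy values, and Eve uses only the two ciphertexts plus the known plaintext m1.

```python
p, alpha = 2579, 2          # toy public parameters
a = 765                     # Bob's secret; Eve never uses it below
beta = pow(alpha, a, p)

def encrypt_with_k(m, k):
    """ElGamal encryption with an explicitly chosen (here: reused) k."""
    return pow(alpha, k, p), (pow(beta, k, p) * m) % p

k = 853                     # the mistake: the same k for two messages
m1, m2 = 1299, 2001
r1, t1 = encrypt_with_k(m1, k)
r2, t2 = encrypt_with_k(m2, k)

print(r1 == r2)             # True: equal r values reveal the reuse

# Eve knows m1 and both ciphertexts: m2 = t2 * m1 / t1 (mod p).
recovered = (t2 * m1 * pow(t1, -1, p)) % p
print(recovered == m2)      # True
```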
Overview of Digital Signatures

Suppose you have an electronic document (e.g. a Word file).
How do you sign the document to prove to someone that it
belongs to you?

You can’t use a scanned signature at the end– this is easy to
forge and use elsewhere.

Conventional signing can’t work in the digital world.

We require a digital signature to satisfy:
1. Digital signatures can’t be separated from the message and
attached to another message.
2. The signature can be verified by others.
An Application for Digital Signatures

Suppose we have two countries, A and B, that have agreed not
to test any nuclear bombs (which produce seismic waves when
detonated). How can A monitor B by using seismic sensors?
1. The sensors need to be in country B, but A needs to access
them. There is a conflict here.
2. Country B wants to make sure that the message sent by the
seismic sensor does not contain “other” data (espionage).
3. Country A, however, wants to make sure that the data has not
been altered by country B. (Assumption: the sensor itself is
tamper proof).
How can we solve this problem?
Treaty Verification Example

RSA provides a solution:
1. Country A makes an RSA public/private key. (n,e) are given to
B but (p,q,d) are kept private in the tamper-proof sensor.
2. Sensor collects data x and uses d to encrypt: $y \equiv x^d \pmod n$, and
sends x and y to country B.
3. Country B takes x and y and calculates $z \equiv y^e \pmod n$.
4. If z=x, then B can be sure that the encrypted message
corresponds to x. B then forwards (x,y) to A.
5. Country A checks that $y^e \equiv x \pmod n$. If so, then A is sure that x
has not been modified, and A can trust x as being authentic.

In this example, it is hard for B to forge (x,y) and hence if (x,y)
verifies A can be sure that data came unaltered from the sensor.
RSA Signatures

The treaty example is an example of RSA signatures. We now
formalize it with Alice and Bob.

Alice publishes $(n, e_A)$ and keeps private $(p, q, d_A)$

Alice signs m by calculating $y \equiv m^{d_A} \pmod n$. The pair (m,y) is
the signed document.

Bob can check that Alice signed m by:
1. Downloading Alice’s $(n, e_A)$ from a trusted third party.
Guaranteeing that he gets the right $(n, e_A)$ is another problem
(we’ll talk about this in a later lecture).
2. Calculating $z \equiv y^{e_A} \pmod n$. If z=m then Bob (or anyone else) can
be guaranteed that Alice signed m.
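Signing and verification can be sketched with textbook-sized numbers. The primes 61 and 53, the exponent 17 and the function names are illustrative assumptions. This is raw RSA with no hashing or padding, exactly as on the slide.

```python
# Alice's toy key: far too small for real use.
p, q = 61, 53
n = p * q                       # 3233, public
phi = (p - 1) * (q - 1)
e_A = 17                        # public exponent
d_A = pow(e_A, -1, phi)         # private exponent (Python 3.8+)

def rsa_sign(m):
    """Alice: y = m^dA mod n."""
    return pow(m, d_A, n)

def rsa_verify(m, y):
    """Anyone: accept if y^eA mod n equals m."""
    return pow(y, e_A, n) == m % n

m = 65
y = rsa_sign(m)
print(rsa_verify(m, y))         # True
print(rsa_verify(m + 1, y))     # False: y does not fit another message
```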
RSA Signatures, pg. 2

Suppose Eve wants to attach Alice’s signature to another message m1. She
cannot simply use (m1, y) since
$y^{e_A} \not\equiv m_1 \pmod n$

Therefore, she needs y1 with $y_1^{e_A} \equiv m_1 \pmod n$.

m1 looks like a ciphertext and y1 like a plaintext. In order for Eve to make a
fake y1 she needs to be able to decrypt m1 to get y1!!! She can’t due to
hardness of RSA.

Existential Forgery: Eve could choose y1 first and then calculate an m1 using
$(n, e_A)$ via $m_1 \equiv y_1^{e_A} \pmod n$. Now (m1, y1) will look like a valid message and
signature that Alice created since $m_1 \equiv y_1^{e_A} \pmod n$.

Problem with existential forgery: Eve has made an m1 that has a signature, but
m1 might be gibberish!

Usefulness of existential forgery depends on whether there is an underlying
“language” structure.
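Existential forgery needs nothing but the public key. A minimal sketch, assuming a toy public key (n, e_A) = (3233, 17) and an arbitrary y1 picked by Eve:

```python
n, e_A = 3233, 17               # Alice's public key (toy values)

def rsa_verify(m, y):
    """Standard check: accept if y^eA mod n equals m."""
    return pow(y, e_A, n) == m % n

y1 = 1000                       # Eve picks the "signature" first
m1 = pow(y1, e_A, n)            # ...and derives the message that fits it

print(rsa_verify(m1, y1))       # True, with no use of the private key
```

Eve has no control over m1, so the forgery only matters when random-looking messages are acceptable. Signing a hash of the message instead of the message itself, as discussed later, makes this approach impractical.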
Blind RSA Signatures
Sometimes we might want Alice to sign a document without knowing its
contents (e.g. privacy concerns: purchaser does not want Bank to know what
is being purchased, but wants Bank to authorize purchase).

We can accomplish this with RSA signatures (Bob wants Alice to sign a
document m):
1. Alice generates an RSA public and private key pair.
2. Bob generates a random k mod n with gcd(k,n)=1.
3. Bob computes $t \equiv k^{e_A} m \pmod n$, and sends t to Alice.
4. Alice signs t following the normal RSA signature procedure by calculating
$s \equiv t^{d_A} \pmod n$. Alice sends Bob s.
5. Bob computes $k^{-1} s \pmod n$. This is the signed message $m^{d_A} \pmod n$.
Verification:
$k^{-1} s \equiv k^{-1} t^{d_A} \equiv k^{-1} (k^{e_A} m)^{d_A} \equiv k^{-1} k^{e_A d_A} m^{d_A} \equiv m^{d_A} \pmod n$
Does Alice learn anything about m from t?
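The blinding protocol can be traced end to end with a toy key. The key values and the function names `blind`, `sign_blinded` and `unblind` are assumptions made for this sketch.

```python
import secrets
from math import gcd

# Alice's toy RSA key.
p, q = 61, 53
n = p * q
e_A = 17
d_A = pow(e_A, -1, (p - 1) * (q - 1))

def blind(m):
    """Bob: pick random k coprime to n and hide m as t = k^eA * m mod n."""
    while True:
        k = secrets.randbelow(n - 2) + 2
        if gcd(k, n) == 1:
            return k, (pow(k, e_A, n) * m) % n

def sign_blinded(t):
    """Alice: ordinary RSA signature on t, without ever seeing m."""
    return pow(t, d_A, n)

def unblind(k, s):
    """Bob: strip the blinding factor, k^(-1) * s mod n."""
    return (pow(k, -1, n) * s) % n

m = 1234
k, t = blind(m)
signature = unblind(k, sign_blinded(t))

print(signature == pow(m, d_A, n))   # True: same as a direct signature
print(pow(signature, e_A, n) == m)   # True: it verifies normally
```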
ElGamal Signatures

We may modify the ElGamal public key procedure to become a
signature scheme.

Alice wants to sign m. Alice chooses a large prime p and a
primitive root α. Alice also chooses a secret integer a and
computes $\beta \equiv \alpha^a \pmod p$.

Alice’s public key is: (p, α, β). Security of the signature depends
on the fact that a is private.

Alice does:
1. Chooses a secret random integer k with gcd(k,p-1)=1, and
computes $r \equiv \alpha^k \pmod p$.
2. Computes $s \equiv k^{-1}(m - ar) \pmod{p-1}$.
3. The signed message is the triple (m,r,s).
ElGamal Signatures, pg. 2

Bob can verify by:
1. Downloading Alice’s public key (p, α, β).
2. Computing $v_1 \equiv \beta^r r^s \pmod p$ and $v_2 \equiv \alpha^m \pmod p$.
3. The signature is valid if and only if $v_1 \equiv v_2 \pmod p$.
Verification: We have
$sk \equiv m - ar \pmod{p-1} \;\Rightarrow\; m \equiv sk + ar \pmod{p-1}$
Therefore
$v_2 \equiv \alpha^m \equiv \alpha^{sk + ar} \equiv (\alpha^a)^r (\alpha^k)^s \equiv \beta^r r^s \equiv v_1 \pmod p$
This scheme is believed to be secure, as long as DLOG is hard to
solve.
Don’t: Choose a p with (p-1) the product of small primes and
don’t reuse k.
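The signing and verification steps can be sketched as follows. The toy values p = 467, base 2, a = 127 and the function names are illustrative assumptions. Note that s is computed modulo p - 1, while r, v1 and v2 are computed modulo p.

```python
import secrets
from math import gcd

def elgamal_sign(m, a, p, alpha):
    """Alice: returns (r, s) with s = k^(-1) * (m - a*r) mod (p - 1)."""
    while True:
        k = secrets.randbelow(p - 2) + 1      # fresh k for every signature
        if gcd(k, p - 1) == 1:
            break
    r = pow(alpha, k, p)
    s = (pow(k, -1, p - 1) * (m - a * r)) % (p - 1)
    return r, s

def elgamal_verify(m, r, s, public_key):
    """Bob: accept if beta^r * r^s = alpha^m (mod p)."""
    p, alpha, beta = public_key
    if not 0 < r < p:
        return False
    v1 = (pow(beta, r, p) * pow(r, s, p)) % p
    v2 = pow(alpha, m, p)
    return v1 == v2

p, alpha, a = 467, 2, 127                     # toy values; a is secret
public_key = (p, alpha, pow(alpha, a, p))
m = 100
r, s = elgamal_sign(m, a, p, alpha)

print(elgamal_verify(m, r, s, public_key))       # True
print(elgamal_verify(m + 1, r, s, public_key))   # False
```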
Wastefulness of plain signatures

In signature schemes with appendix, where we attach the
signature to the end of the document, we increase the
communication overhead.

If we have a long message m=[m1,m2,…,mN], then our signed
document is {[m1,m2,…,mN],[sigA(m1),…,sigA(mN)]}.

This doubles the overhead!

We don’t want to do this when communication resources are
precious (which is always!).

Solution: We need to shrink the message into a smaller
representation and sign that.

Enter: Hash functions
Hash Functions

Straight-forward application of digital signatures can be
expensive when the message is large

In general, many security protocols benefit from using a
“digested” or “compressed” representative of a message
– We typically need additional cryptographic properties in order for
the compression operation to be useful

This “compression function” is a hash function:
(Figure: h maps a message m in a large domain to a digest h(m) in a smaller range.)
Hash Functions, pg. 2

Formally, a cryptographic hash function h takes an input
message of arbitrary length and produces a message digest of
fixed length, and satisfies:
1. Given a message m, h(m) is quick to calculate.
2. One-Way (preimage resistance): Given a digest y, it is
computationally infeasible to find an m with h(m)=y.
3. Strongly Collision Free: It is computationally infeasible to
find messages $m_1 \ne m_2$ with $h(m_1) = h(m_2)$.

Can we ever have h(m1)=h(m2)? Yes. Why?

We will look at a couple examples.
Chaum, van Heijst, Pfitzmann Hash

We may use the DLOG problem to construct a hash function

Choose a prime p such that q=(p-1)/2 is also prime. (There’s an
algorithm for doing this, but that’s not our goal today). Choose
two primitive roots α and β.

The hash function h(m) will take integers (mod $q^2$) to integers
(mod p), hence producing roughly half the bits.

Write $m = x_0 + x_1 q$ with $0 \le x_0, x_1 \le q-1$.

Define the hash by:
$h(m) \equiv \alpha^{x_0} \beta^{x_1} \pmod p$
CHP Hash is strongly collision-free


Proposition: If we know $m \ne m'$ with $h(m) = h(m')$, then
we can solve the discrete logarithm $a = L_\alpha(\beta)$.
Proof: Will be given on the board after we cover all of the slides.
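The proposition can be tried out on a toy instance. The sketch below assumes p = 23, q = 11 and the primitive roots 5 and 7. It finds a collision by exhaustive search and then solves $a(x_1 - x_1') \equiv x_0' - x_0 \pmod{p-1}$ for a. The function names are made up for this example, and this is one possible route to the result, not necessarily the proof given in lecture.

```python
from math import gcd

def chp_hash(m, p, q, alpha, beta):
    """h(m) = alpha^x0 * beta^x1 mod p, where m = x0 + x1*q."""
    x0, x1 = m % q, m // q
    return (pow(alpha, x0, p) * pow(beta, x1, p)) % p

def find_collision(p, q, alpha, beta):
    """Exhaustive search; only feasible because the toy domain is tiny."""
    seen = {}
    for m in range(q * q):
        digest = chp_hash(m, p, q, alpha, beta)
        if digest in seen:
            return seen[digest], m
        seen[digest] = m
    return None

def dlog_from_collision(m, m_prime, p, q, alpha, beta):
    """Recover a with alpha^a = beta (mod p) from a collision m != m'."""
    x0, x1 = m % q, m // q
    y0, y1 = m_prime % q, m_prime // q
    u = (x1 - y1) % (p - 1)       # a * u = v (mod p - 1)
    v = (y0 - x0) % (p - 1)
    d = gcd(u, p - 1)             # 1 or 2, because p - 1 = 2q
    n = (p - 1) // d
    a0 = ((v // d) * pow(u // d, -1, n)) % n
    for a in range(a0, p - 1, n): # at most d candidates to test
        if pow(alpha, a, p) == beta:
            return a
    return None

p, q, alpha, beta = 23, 11, 5, 7
m, m_prime = find_collision(p, q, alpha, beta)
a = dlog_from_collision(m, m_prime, p, q, alpha, beta)
print(pow(alpha, a, p) == beta)   # True
```

Read in reverse, this is the security argument: if discrete logs are hard for the chosen p, collisions must be hard to find.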
SHA-1

In order to get fast hash functions, we need to operate at the bit level. SHA-1 is one such algorithm.

Many of the popular hash functions (e.g. MD5, SHA-1) use an
iterative design:
– Start with a message m of arbitrary length and break it into n-bit
blocks, $m = [m_1, m_2, …, m_l]$. The last block is padded to fill
out a full block size.
– Message blocks are processed via a sequence of rounds using
a compression function h’ which combines the current block and
the result of the previous round:
$X_j = h'(X_{j-1}, m_j)$
– $X_0$ is an initial value, and $X_l$ is the message digest.
SHA-1, pg. 2

In SHA-1, we pad according to the rule:
– Start with a message m of arbitrary length and break it into n-bit blocks.
– The last block is padded with a 1 followed by enough 0 bits
to make the new message 64 bits short of a multiple of 512
bits in length.
– Into the 64 unfilled bits of the last block, we append the 64-bit representation of the length T of the message.
– Overall, we have about T/512 + 1 blocks of 512 bits (exactly $L = \lceil (T + 65)/512 \rceil$).
– The appended message becomes $m = [m_1, m_2, …, m_L]$.
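The padding rule can be sketched for byte-aligned messages, so the single 1 bit followed by 0 bits becomes the byte 0x80 followed by zero bytes. `sha1_pad` is a name chosen for this example.

```python
def sha1_pad(message: bytes) -> bytes:
    """Append a 1 bit, then 0 bits, then the 64-bit big-endian bit length T."""
    bit_length = 8 * len(message)
    padded = message + b"\x80"                       # 1 bit plus seven 0 bits
    padded += b"\x00" * ((56 - len(padded) % 64) % 64)
    padded += bit_length.to_bytes(8, "big")          # the 64-bit length field
    return padded

for size in (0, 3, 55, 56, 64):
    padded = sha1_pad(b"a" * size)
    print(size, len(padded) // 64)   # message bytes, number of 512-bit blocks
```

A 55-byte message still fits in one block, while a 56-byte message spills into a second block because no room is left for the 0x80 byte and the length field.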
SHA-1, pg. 3 (Basic Operations)

We will need the following bit operations:
SHA-1, pg. 4 (Basic Algorithm)
SHA-1, pg. 5 (Inside the Alg.)
Initial 160-bit register
X0=[H0,H1,H2,H3,H4]
SHA-1, pg. 6 (Subregister Operations)
• The operations done by $f_t(B,C,D)$ depend on the round number t
• The word $W_t$ depends on the round number t
• The constant $K_t$ depends on the round number t
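Since the slide figures are not reproduced here, the following compact sketch shows how $f_t$, $W_t$ and $K_t$ enter the 80 rounds. The constants and round structure follow the published SHA-1 standard (FIPS 180), not the missing figures, and the result is compared with Python's `hashlib` as a sanity check. Names such as `compress` and `rotl` are this example's own.

```python
import hashlib

MASK = 0xFFFFFFFF
K = (0x5A827999, 0x6ED9EBA1, 0x8F1BBCDC, 0xCA62C1D6)      # K_t, one per 20 rounds
X0 = (0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476, 0xC3D2E1F0)  # H0..H4

def rotl(x, n):
    """Rotate a 32-bit word left by n bits."""
    return ((x << n) | (x >> (32 - n))) & MASK

def f_t(t, B, C, D):
    """Round-dependent Boolean function."""
    if t < 20:
        return (B & C) | (~B & D)
    if 40 <= t < 60:
        return (B & C) | (B & D) | (C & D)
    return B ^ C ^ D                                      # rounds 20-39, 60-79

def compress(state, block):
    """The compression function h': one 512-bit block through 80 rounds."""
    W = [int.from_bytes(block[i:i + 4], "big") for i in range(0, 64, 4)]
    for t in range(16, 80):                               # message schedule W_t
        W.append(rotl(W[t - 3] ^ W[t - 8] ^ W[t - 14] ^ W[t - 16], 1))
    A, B, C, D, E = state
    for t in range(80):
        T = (rotl(A, 5) + f_t(t, B, C, D) + E + W[t] + K[t // 20]) & MASK
        A, B, C, D, E = T, A, rotl(B, 30), C, D
    return tuple((s + v) & MASK for s, v in zip(state, (A, B, C, D, E)))

def sha1(message: bytes) -> bytes:
    padded = message + b"\x80"
    padded += b"\x00" * ((56 - len(padded) % 64) % 64)
    padded += (8 * len(message)).to_bytes(8, "big")
    state = X0
    for i in range(0, len(padded), 64):                   # X_j = h'(X_{j-1}, m_j)
        state = compress(state, padded[i:i + 64])
    return b"".join(word.to_bytes(4, "big") for word in state)

print(sha1(b"abc") == hashlib.sha1(b"abc").digest())      # True
```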
Message Authentication Codes

A message authentication code (MAC) is a function that is used
to prevent alteration of messages:
– MACs use a shared key K between Alice and Bob
– Alice will send not only the message m, but also $MAC_K(m)$.
– Bob checks whether the attached MAC matches what he calculates
– Eve cannot alter the message undetected because she does not have K.
The MAC takes two inputs: the key K and an arbitrary size m.
Ideally, a MAC should be a random mapping from all possible
inputs to n-bits of output.
The uncertainty (and security) of the MAC is directly associated
with the size of the key K
– Remember: to Eve, the message is known, so it’s the key that
contains the security
CBC-MAC

CBC-MAC is a method for turning a block cipher into a MAC:
– Idea: encrypt m using CBC mode and throw away all but last
block of ciphertext.
– For message blocks P1, P2, …, Pk, the MAC is calculated by
$H_0 = IV$
$H_i = E_K(P_i \oplus H_{i-1})$
$MAC = H_k$

Do not use the same key for encryption (confidentiality) and
authentication!
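The chaining recursion can be sketched without a crypto library by standing in a toy keyed permutation for $E_K$. The "cipher" below is a shuffled table over one-byte blocks, an assumption made purely to keep the example self-contained. It is not secure, and a real implementation would use a block cipher such as AES.

```python
import random

def toy_block_cipher(key: int):
    """Toy stand-in for E_K: a key-dependent permutation of 0..255 (NOT secure)."""
    table = list(range(256))
    random.Random(key).shuffle(table)
    return lambda block: table[block]

def cbc_mac(key: int, blocks, iv: int = 0) -> int:
    """H_0 = IV, H_i = E_K(P_i xor H_{i-1}); the MAC is the final H_k."""
    encrypt = toy_block_cipher(key)
    h = iv
    for p_i in blocks:
        h = encrypt(p_i ^ h)
    return h

key = 42
message = list(b"transfer 100")       # each byte is one toy "block"
tag = cbc_mac(key, message)

# Bob recomputes the tag with the shared key and compares.
print(cbc_mac(key, message) == tag)   # True
```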
CBC-MAC, pg. 2

Be careful when using CBC-MAC. Here’s a possible protocol
failure:

Observe: Fix K. If MAC(a)=MAC(b), then MAC(a||c)
=MAC(b||c), where c is a single block length in size.
$MAC(a \,||\, c) = E_K(c \oplus MAC(a)) = E_K(c \oplus MAC(b)) = MAC(b \,||\, c)$
1. Now, suppose attacker collects many MAC values and finds a
collision. This gives a and b for which MAC(a)=MAC(b).
2. If attacker can get the sender to authenticate (a||c) (How is
another matter…) then the attacker can replace the message
being sent to the receiver with (b||c).
Comment: It’s not an easy attack to do, but it is a possible
weakness!
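With a deliberately tiny block size, collisions are easy to find and the extension property can be observed directly. As before, the 8-bit shuffled-table "cipher" is a toy assumption, not a real block cipher. With real block sizes, finding a collision takes far more collected MAC values.

```python
import random
from itertools import product

def make_cipher(key: int):
    """Toy E_K: key-dependent permutation table on one-byte blocks (NOT secure)."""
    table = list(range(256))
    random.Random(key).shuffle(table)
    return table

def cbc_mac(table, blocks, iv: int = 0) -> int:
    h = iv
    for block in blocks:
        h = table[block ^ h]
    return h

table = make_cipher(2024)

# Step 1: collect tags of two-block messages until two of them collide.
seen = {}
for a in product(range(256), repeat=2):
    tag = cbc_mac(table, a)
    if tag in seen:
        b = seen[tag]
        break
    seen[tag] = a

print(a != b, cbc_mac(table, a) == cbc_mac(table, b))   # True True

# Step 2: any one-block suffix c keeps the two tags equal.
c = 0x5A
print(cbc_mac(table, a + (c,)) == cbc_mac(table, b + (c,)))   # True
```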
CBC-MAC, pg. 3

Practical Implementation Details:
1. Generally, if your message is m, do not just calculate MAC(m),
rather you should make an intermediate message s=(l||m),
where l is the length of m in a fixed-length format.
2. Pad s to be a multiple of block size
3. Apply CBC-MAC to the padded string s
4. Output the last ciphertext block. Do not output any
intermediate block values!

CBC-MAC can reuse same code as confidentiality (encryption)
functions

CBC-MAC is generally tough to use correctly, though.
HMAC


We may also use hash functions to build MACs.
We cannot simply use $MAC_K(m) = h(K \,||\, m)$ or $h(m \,||\, K)$:
– Having the key at the front allows for length extension
attacks
– Having the key at the end allows for key-recovery attacks

Designers of HMAC considered these issues

HMAC computes
$MAC_K(m) = h((K \oplus a) \,||\, h((K \oplus b) \,||\, m))$
where a and b are constants that are specified.

HMAC has been around for a while and has been cryptanalyzed.
It’s the preferred MAC to use.
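The nested construction is short enough to write out by hand. In the standard specification (RFC 2104), the outer constant a is the byte 0x5C repeated and the inner constant b is the byte 0x36 repeated, with the key first padded with zeros to the hash block size. The sketch below uses SHA-1 and checks itself against Python's `hmac` module. In real code, call the library rather than rolling your own.

```python
import hashlib
import hmac

def hmac_sha1(key: bytes, message: bytes) -> bytes:
    """h((K xor a) || h((K xor b) || m)) with a = 0x5C.., b = 0x36.. (RFC 2104)."""
    block_size = 64                               # SHA-1 block size in bytes
    if len(key) > block_size:
        key = hashlib.sha1(key).digest()          # long keys are hashed first
    key = key.ljust(block_size, b"\x00")
    outer_key = bytes(x ^ 0x5C for x in key)      # K xor a
    inner_key = bytes(x ^ 0x36 for x in key)      # K xor b
    inner = hashlib.sha1(inner_key + message).digest()
    return hashlib.sha1(outer_key + inner).digest()

key, message = b"shared secret", b"attack at dawn"
tag = hmac_sha1(key, message)
print(tag == hmac.new(key, message, hashlib.sha1).digest())   # True
```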
Using MACs

We must be careful using MACs.

If Alice sends Bob $[m \,||\, MAC_K(m)]$ and Eve records this, she may
send it again at a later time (the replay attack!)

Generally, you want to authenticate not just the message, but the
context. That is, you want to authenticate m and additional data
d (such as message number, source, destination, protocol
identifier, sizes for different fields, etc.)

Why all these possibilities? If you tie the message to the specific
context, then it is harder for an adversary to manipulate context
fields to forge.

Make certain, though, that you have clear rules on how to split
concatenations (d||m) back into d and m.
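One way to follow this advice is to authenticate a message counter together with m, using a length prefix so that (d||m) splits unambiguously. The encoding, the choice of HMAC-SHA256, and the `Receiver` class are all illustrative assumptions, not a prescribed protocol.

```python
import hashlib
import hmac
import struct

def encode(d: bytes, m: bytes) -> bytes:
    """Unambiguous (d || m): 4-byte length of d, then d, then m."""
    return struct.pack(">I", len(d)) + d + m

def make_tag(key: bytes, counter: int, m: bytes) -> bytes:
    d = struct.pack(">Q", counter)            # context: the message number
    return hmac.new(key, encode(d, m), hashlib.sha256).digest()

class Receiver:
    """Accepts a message only if its tag is valid and its counter is new."""
    def __init__(self, key: bytes):
        self.key = key
        self.last_counter = -1

    def accept(self, counter: int, m: bytes, tag: bytes) -> bool:
        if counter <= self.last_counter:
            return False                      # replayed or out of order
        if not hmac.compare_digest(tag, make_tag(self.key, counter, m)):
            return False
        self.last_counter = counter
        return True

key = b"shared key K"
bob = Receiver(key)
m = b"pay Eve 10"
tag = make_tag(key, 0, m)

print(bob.accept(0, m, tag))   # True: first delivery
print(bob.accept(0, m, tag))   # False: Eve's replay is rejected
print(encode(b"ab", b"c") != encode(b"a", b"bc"))   # True: split is unambiguous
```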
Problems with Hashes

We must be careful when using hash functions: they are subject to some
“attacks”

Length Extension Attack: Consider a block-based hash like SHA-1, with
input blocks $m = (m_1, m_2, …, m_k)$, and hash h(m).
A new message $m' = (m_1, m_2, …, m_k, m_{k+1})$ will have hash $h(m') = h'(h(m), m_{k+1})$,
where h’ is the compression sub-function.
In systems, such as authentication applications, where we calculate h(X||m), Eve
can append extra text to m and also update the hash without knowing X.

Partial Message Collision Attack: Suppose we are able to find m and m’
such that h(m)=h(m’). If a system uses h(m||X) as an authentication parameter,
then due to the iterative nature h(m||X)=h(m’||X).
An adversary can replace m with m’ during authentication.

To guard against these attacks, in practice one can use f(m)= h(h(m)||m) or f(m)=h(h(m))
as the hash.
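Length extension is easiest to see on a toy iterated hash with no padding, where the digest is exactly the chaining value. Everything below is an assumption for illustration: the 8-byte block size, the compression function built from truncated SHA-256, and the names. Against real SHA-1 the attacker must also account for the padding block, but the idea is the same.

```python
import hashlib

BLOCK = 8                                     # toy block size in bytes
IV = bytes(8)

def compress(state: bytes, block: bytes) -> bytes:
    """Toy compression function h' (truncated SHA-256 of state || block)."""
    return hashlib.sha256(state + block).digest()[:8]

def toy_hash(message: bytes) -> bytes:
    """Iterated hash X_j = h'(X_{j-1}, m_j), with no padding, for clarity."""
    assert len(message) % BLOCK == 0
    state = IV
    for i in range(0, len(message), BLOCK):
        state = compress(state, message[i:i + BLOCK])
    return state

secret = b"KKKKKKKK"                          # X: known to Alice and Bob only
m = b"pay 0010"
tag = toy_hash(secret + m)                    # Eve sees m and this tag

# Eve extends the message and updates the tag without knowing the secret.
extra = b" to Eve."
forged_tag = compress(tag, extra)

print(forged_tag == toy_hash(secret + m + extra))   # True

def double_hash(message: bytes) -> bytes:
    """f(m) = h(h(m)): the digest is no longer a usable chaining value."""
    return toy_hash(toy_hash(message))

print(compress(double_hash(secret + m), extra)
      == double_hash(secret + m + extra))           # False
```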