CS 3343/3341
Analysis of Algorithms
Spring 2012 Weird Topic
Merkle's
Puzzles
Public Key Distribution
[Note: In the course, we only
covered the first section below.]
The first ideas of public key cryptography can best be explained
using a clever method for two people to exchange a common
secret key using only public communications.
The following example is not intended to be practical, but it
illustrates the ideas. In fact, it would work in practice, but there
are better methods.
Merkle's Puzzles.
Imagine that person A ("Alice") wants
to establish secure communication
with a distant individual B ("Bob").
Alice and Bob have made no
previous arrangements and can only communicate over a public
channel that anyone can listen in on. Another person ("Boris")
listens to all their communications and
has more computing power than they have.
Also Alice and Bob have
no special secret methods not known to Boris.
However, Boris can only listen in and cannot change messages.
In spite of all this, it is still possible for the two of them
to establish secure communication that Boris cannot
understand. This surprising technique was developed by R.C. Merkle
before the introduction of true public key cryptography.
It helps prepare students of cryptography
for even more surprising ideas.
Alice and Bob must be able to create "puzzles" that are
hard but not impossible to solve. For example, they could agree
that all but the low 50 bits of a
128-bit AES key would be zero bits.
Then they could encrypt two blocks of information with this key.
A brute-force breaking of the "puzzle" would mean trying
2^{49} keys on the average.
The encrypted information would also start with a block of zero bits,
so that they can tell when the puzzle is broken.
Suppose it takes Alice and Bob an hour of computer time
to break one of these puzzles. It is important to emphasize that
Boris also knows exactly how the puzzles are constructed.
Suppose Boris has a much more
powerful computer and can break a puzzle in just one minute.
Bob creates 14400 of these puzzles, each containing
a 128-bit random key in one block
and a sequence number identifying the puzzle (a number from
1 to 14400 in binary with leading
zeros to make it 128 bits long).
Bob transmits all of these puzzles to Alice, in random order.
Alice chooses one puzzle at random and breaks it in one hour.
She then keeps the random key and sends the sequence number
back to Bob. (Boris listens in to this sequence number, but
it does him no good because he doesn't know which number goes with which
puzzle.) Bob has saved the content of the puzzles (or has
generated them pseudo-randomly) so that he can look up the sequence number
and find the same random key that Alice has.
Boris, their opponent,
must break half of the puzzles on the average to recover the
common key that Alice and Bob are using.
Even with his much faster
machines, it still takes Boris14400/2 minutes
or 5 days on the average to break Alice and Bob's
system. Thus Alice and Bob get 5 days of secret
communications. If they want more time, say 50 days,
then Bob just needs to send ten times as many messages initially,
that is, 144000. If the disparity
between their computing power and that of the listeners is even
greater, then again Bob must simply send more puzzles initially.
If Boris can actually alter messages (change or inject or delete
them) as well as listening in, then he
might pretend to be Alice and communicate with Bob, or
vice versa. In this case Alice and Bob must be able
to describe shared experiences or to communicate shared information
to one another to be sure they are not
communicating with a stranger (Boris).
If Boris can intercept and change all messages
between Alice and Bob,
then he could answer both Alice's and
Bob's requests to communicate,
and pretend to be each to the other. Once Boris has secure
communication established with both Alice and Bob, he could
relay messages back and forth between them, translating between
the two cryptosystems. Then even if they authenticated each other,
he would still be listening in. This is called a
man-in-the-middle attack.
In this extreme case, the method does not work. This shows
the great care that must be taken with cryptographic protocols.
There are more complicated methods relying on a trusted
server that will foil this and other attacks
Commuting Ciphers.Alice and Bob still want to communicate securely,
without having set it up
ahead of time, but they would like a simpler and more mathematical
system. The ground rules are the same as above: Boris
listens in, knows everything they know, and has more computing power.
If they each had their own secret commuting cipher,
say Alice had EA and Bob had
EB, then, using a common
public integer a, Alice could send
EA(a) to Bob, and Bob could send
EB(a) to Alice. Then Alice
computes EA(EB(a))
and Bob computes EB(EA(a)).
Because the two ciphers commute, these quantities are
the same. The two people can use this common value or a portion
of it to make up a secret key for conventional cryptography.
If Boris can't break the cipher, he has no way of knowing this
secret value.
Actually, in this system, all that is needed is secret commuting
one-way functions. Such functions are discussed
in a later section, though not commuting ones.
It is a fact that cryptosystems almost never commute. The
next section describes the only exception I know of.
Exponentiation and the Discrete Logarithm.
One can imagine a cryptosystem in which a fixed number a is
raised to a power modulo a fixed prime p,
and the power is the plaintext m.
The result is the ciphertext c.
(There is a cryptosystem called the Pohlig-Hellman System
similar to this but a bit more complicated.)
So one fixes a (large) prime p and an
integer a that is a generator for multiplication in the
field Zp. (For this generator, see
the section on Fermat's Theorem in the "Favorites" chapter.)
From a message m, calculate a ciphertext
c as described above:
c = am mod p
If the quantities above were ordinary real numbers, and the
equation were c = am, then solving
for m would give m = logac,
the "logarithm base a of c".
Because of this notation from real numbers, one also refers
to m above as the "discrete logarithm
of c to base a modulo p".
It turns out that there is no known efficient way to calculate
discrete logarithms, even knowing c, a, and p.
Using a brute force approach, one could just try all possible
values of m, but there are ways to do better than this, including a
"meet-in-the-middle" approach similar to the attack of the same
name on double encryption with block ciphers.
There are algorithms that are efficient if p-1 has no large
prime factors.
With the best known approaches, if the prime
p is large and random, and if p-1 has a large prime factor,
then there is no efficient algorithm to calculate discrete
logarithms.
The Diffie-Hellman Key Distribution System.
With this system, Alice and Bob work with a large public prime
p, and a public generator a.
Alice chooses a secret integer A,
while Bob chooses a secret B.
Alice sends aA mod p to Bob, and
Bob sends aB mod p to Alice.
Then in secret, Alice computes
(aB)A mod p = a(B*A) mod p.
Similarly in secret, Bob computes
(aA)B mod p = a(A*B) mod p,
and these two quantities
are the same, a common secret value that they can use for
further communication. Boris listening in cannot compute
either A or B, and cannot discover this common secret.
Just as with the Merkle puzzles, if Boris can do more than listen
in on the communications, he could pretend to be Alice to Bob or
vice versa, and he could even use a man-in-the-middle attack
to make Alice and Bob think they are communicating in secret
when they are not. There are more elaborate methods involving
authenticated public keys that both Alice and Bob have, and these
methods allow them to establish a common secret key even in the
presence of an active opponent like Boris.
In order to set up a practical system as above, first choose
a "large enough" random integer n. The
size needed depends on how fast the computers have become and
on whether there has been progress in computing discrete logs;
at present n should be at least 1024 bits long.
Then test the numbers n, n+1, n+2, etc.,
until the next prime q
is found. It is not enough just to have a random prime,
but the prime minus one must have a large prime factor.
So just find a larger prime p with q as a factor.
For this purpose, test 2*q + 1 to see if it is prime.
If it is not a prime, start over again with another random n.
Finally, one will have a prime p with the
property that p - 1 = 2*q, for a prime q
of about half the size of p.
In order to get a generator, choose an a at random,
or start with a = 2.
Check this a to see if a2 mod p = 1
or if aq mod p = 1.
When an a doesn't satisfy either of these, one knows the value is
a generator, and can be used for key distribution.
(A random a would also probably work, but not with certainty.)
Revision date:2012-03-01.
(Please use ISO
8601, the International Standard.)