F k (m) - Stefan Dziembowski
Download
Report
Transcript F k (m) - Stefan Dziembowski
Cryptography on non-trusted
machines
Stefan Dziembowski
University of Rome
La Sapienza
General plan
1.
December 2008:
Introduction to provably-secure symmetric
cryptography
2.
January 2009:
Cryptography on non-trusted machines
Introduction to provably-secure
symmetric cryptography
Stefan Dziembowski
University of Rome
La Sapienza
Cryptography
In the past:
the art of encrypting messages (mostly for the
military applications).
Now:
the science of securing digital communication
and transactions (encryption, authentication,
digital signatures, e-cash, auctions, etc..)
Plan
1.
Encryption schemes
a) information-theoretically secure
b) computationally-secure;
basic tools:
i.
ii.
iii.
2.
pseudorandom generators
one-way functions
pseudorandom functions and permutations
Message authentication schemes
a) information-theoretically secure
b) computationally-secure
3.
Hash functions (collision-resistance and the random
oracle model).
(we’ll talk only about the symmetric-key cryptography)
Encryption schemes
(a very general picture)
Encryption scheme (cipher) = encryption & decryption
plaintext m
encryption
In the past:
a text in natural language.
Now:
a string of bits.
ciphertext c
decryption
should not learn m
m
Art vs. science
In the past:
lack of precise definitions, ad-hoc design,
usually insecure.
Nowadays:
formal definitions, systematic design, very
secure constructions.
Provable security
We want to construct schemes that are
provably secure.
But...
• why
do we want to do it?
• how to define it?
• and is it possible to achieve it?
Provable security – the motivation
In many areas of computer science formal proofs are not essential.
For example, instead of proving that an algorithm is efficient,
we can just simulate it on a “typical input”.
In cryptography it’s not true, because
there cannot exist an experimental proof that a scheme is secure.
Why?
Because a notion of a
“typical adversary”
does not make sense.
Provable security – a tradeoff
strong proofs
and
not efficient
no proofs
and
very efficient
good news
There exist schemes that
• are very efficient
• have some provable properties
A typical strategy for a theoretician
take a scheme that is used in real life
2. then:
1.
•
•
prove that it is secure, or
propose an equally efficient scheme that is provably secure
This works especially if the real-life schemes have
some security weaknesses.
Examples of success stories:
HMAC, OAEP...
Kerckhoffs' principle
Auguste Kerckhoffs (1883):
The enemy knows the system
The cipher should remain secure even
if the adversary knows the
specification of the cipher.
The only thing that is secret is a
short key k
that is usually chosen uniformly at random
12
A more refined picture
plaintext m
encryption
ciphertext c
key k
decryption
m
key k
(Of course Bob can use the same method to send messages to Alice.)
(That’s why it’s called the symmetric setting)
Let us assume that k is unifromly random
13
Kerckhoffs' principle – the
motivation
1.
2.
3.
In commercial products it is unrealistic to assume that
the design details remain secret (reverseengineering!)
Short keys are easier to protect, generate and
replaced.
The design details can be discussed and analyzed in
public.
Not respecting this principle
=
``security by obscurity”.
A mathematical view
K – key space
M – plaintext space
C - ciphertext space
An encryption scheme is a pair (Enc,Dec), where
Enc : K × M → C is an encryption algorithm,
Dec : K × C → M is an decryption algorithm.
We will sometimes write Enck(m) and Deck(m) instead of
Enc(k,m) and Dec(k,m).
Correctness
for every k we should have Deck(Enck(m)) = m.
Historical ciphers
shift cipher,
substitution cipher,
Vigenère cipher,
Enigma,
...
all broken...
Defining “security of an encryption scheme” is not
trivial.
consider the following experiment
(m – a message)
1.
the key K is chosen randomly
2.
C := EncK(m) is given to the
adversary
how to define
security
?
Doesn’t need to be
uniform,
but at least has to be
“fresh”
(i.e. sampled
independently of m).
Idea 1
(m – a message)
1.
the key K is chosen randomly
2.
C := EncK(m) is given to the adversary
An idea
“The adversary should not be able to compute K.”
A problem
the encryption scheme that “doesn’t encrypt”:
EncK(m) = m
satisfies this definition!
(m – a message)
Idea 2
1.
the key K is chosen randomly
2.
C := EncK(m) is given to the adversary
An idea
“The adversary should not be able to compute m.”
A problem
What if the adversary can compute, e.g., the first half of m?
m1
...
m|m|/2
?
...
?
Idea 3
(m – a message)
1.
the key K is chosen randomly
2.
c := Enck(m) is given to the adversary
An idea
“The adversary should not learn any information about m.”
A problem
But he may already have some a priori information about m!
For example he may know that m is a sentence in English...
Idea 4
(m – a message)
1.
the key K is chosen randomly
2.
C := EncK(m) is given to the adversary
An idea
“The adversary should not learn any additional information
about m.”
This makes much more sense.
But how to formalize it?
We will use the language of the probability theory.
Notation
A - a random variable X – an event
then
PA denotes the distribution of A:
PA(a) = P(A = a)
and
PA|X denotes the distribution of A conditioned on X:
PA|X (a) = P (A = a | X).
More notation
Two (discrete) random variables A and B
are independent if
for every a and b:
P(A = a and B = b) = P(A = a) P(B = b).
How to formalize the “Idea 4”?
“The adversary should not learn any additional information
about m.”
also called: information-theoretically secret
An encryption scheme is perfectly secret if
for every random variable M
and every c Є C
PM = PM | (Enc(K,M))= c
equivalently: M and Enc(K,M) are independent
such that
P(C = c) > 0
Equivalently:
for every M we have that: M and Enc(K,M) are
independent
for every M and m we have:
P Enc(K,M) = P (Enc(K,M)) | M = m
for every M and m we have:
P Enc(K,M) = P Enc(K,m)
for every m0 and m1 we have P Enc(K,m0) = P Enc(K,m1)
intuitive...
A perfectly secret scheme: one-time pad
t – a parameter
K = M = {0,1}t
component-wise xor
Vernam’s cipher:
Enck(m) = k xor m
Deck(c) = k xor c
Gilbert
Vernam
(1890 –1960)
Correctness is trivial:
Deck(Enck(m)) = k xor (k xor m)
m
26
Perfect secrecy of the one-time pad
Perfect secrecy of the one time pad is also
trivial.
This is because for every m
the distribution PEnc(K,m) is uniform
(and hence does not depend on m).
Observation
One time pad can be generalized as follows.
Let (G,+) be a group. Let K = M = C = G.
The following is a perfectly secret encryption
scheme:
Enc(k,m) = m + k
Dec(k,m) = m – k
Why the one-time pad is not practical?
1.
The key has to be as long as the message.
2.
The key cannot be reused
This is because:
Enck(m0) xor Enck(m1) = (k xor m0) xor (k xor m1)
= m0 xor m1
29
One time-pad is optimal in the class of
perfectly secret schemes
Theorem (Shannon 1949)
In every perfectly secret encryption scheme
Enc : K × M → C , Dec : K × C → M
we have |K| ≥ |M|.
Proof
Perfect secrecy implies that the distribution of Enc(K,m) does not depend on m
Wlog suppose that C contains only the ciphertexts that
have a non-zero probability in Enc(K,m).
Hence: |K| ≥ |C|.
Fact: we always have that |C| ≥ |M|.
This is because for every k we have that
Enck : M → C is an injection
(otherwise we wouldn’t be able to decrypt).
|K| ≥ |M|
30
Outlook
We constructed a perfectly secret
encryption scheme
Our scheme has certain drawbacks
(|K| ≥ |M|).
But by Shannon’s theorem this is
unavoidable.
Can we go home and relax?
31
What to do?
Idea
limit the power of the adversary.
How?
Classical (computationally-secure) cryptography:
bound his computational power.
Alternative options: quantum cryptography, bounded-storage model,...
(not too practical)
How to bound the computational
power of the adversary?
We required that
M and EncK(M)
are independent,
It is enough to require that
M and EncK(M)
are independent
“from the point of view of a computationally-limited adversary’’.
How can this be formalized?
We will use the complexity theory!
33
Real cryptography starts here:
Eve is computationally-bounded
But what does it mean?
Ideas:
1. “She has can use at most 1000
Intel Core 2 Extreme X6800 Dual Core Processors
for at most 100 years...”
2. “She can buy equipment worth 1 million euro and use it for 30 years..”.
it’s hard to reason
formally about it
A better idea
”The adversary has access to a Turing Machine that can make at most 1030
steps.”
More generally, we could have definitions of a
type:
“a system X is (t,ε)-secure if every Turing Machine
that operates in time t
can break it with probability at most ε.”
This would be quite precise, but...
We would need to specify exactly what we mean by a “Turing Machine”:
how many tapes does it have?
how does it access these tapes (maybe a “random access memory” is a
more realistic model..)
...
Moreover, this approach often leads to ugly formulas...
What to do?
Idea:
t steps of a Turing Machine → “efficient computation”
ε → a value “very close to zero”.
How to formalize it?
Use the asymptotics!
Efficiently computable?
“efficiently computable”
=
“polynomial-time computable
on a Turing Machine”
that is: running in time
O(nc) (for some c)
Here we assume that the Turing Machines are the right model for
the real-life computation.
Not true if a quantum computer is built...
Very small?
“very small”
=
“negligible”
=
approaches 0 faster than the inverse of any polynomial
Formally
A function µ : N → R is negligible if
1
| ( n) | c
c
n0 n > n0
n
Nice properties of these notions
A sum of two polynomials is a polynomial:
poly + poly = poly
A product of two polynomials is a polynomial:
poly * poly = poly
A sum of two negligible functions is a negligible function:
negl + negl = negl
Moreover:
A negligible function multiplied by a polynomial is negligible
negl * poly = negl
Security parameter
Typically, we will say that a scheme X is secure if
P (M breaks the scheme X) is negligible
A
polynomial-time
Turing Machine M
The terms “negligible” and “polynomial” make sense only if X (and the adversary) take
an additional input 1n called
a security parameter.
In other words: we consider an infinite sequence
X(1),X(2),...
of schemes.
Convention (in this lecture)
security parameter n = the length of the secret key k
in other words: k is always a random element of {0,1}
Observation
The adversary can always guess k with probability 2-n.
This probability is negligible.
Is this the right approach?
Advantages
1.
All types of Turing Machines are “equivalent” up to a
“polynomial reduction”.
Therefore we do need to specify the details of the model.
2.
The formulas get much simpler.
Disadvantage
Asymptotic results don’t tell us anything about security of the concrete
systems.
However
Usually one can prove formally an asymptotic result and then argue
informally that “the constants are reasonable”
(and can be calculated if one really wants).
How to change the security definition?
we will require that m0,m1 are chosen by a poly-time adversary
An encryption scheme is perfectly secret if for every m0,m1 Є M
PEnc(K, m0) = PEnc(K, m1)
we will require that no poly-time adversary can distinguish
Enc(K, m0) from Enc(K, m1)
A game
(Enc,Dec) – an encryption scheme
security parameter
1n
adversary
(polynomial-time Turing machine)
chooses m0,m1 such that
|m0|=|m1|
has to guess b
oracle
m0,m1
c
1. selects k randomly from
{0,1}n
2. chooses a random b = 0,1
3. calculates
c := Enc(k,mb)
Alternative name: has indistinguishable encryptions
(sometimes we will say: “is computationally-secure”, if the context is clear)
Security definition:
We say that (Enc,Dec) is semantially-secure if any polynomial time adversary guesses
b correctly with probability at most 0.5 + ε(n), where ε is negligible.
Testing the definition
Suppose the adversary can compute k from Enc(k,m).
Can he win the game?
YES!
Suppose the adversary can compute some bit of m
from Enc(k,m). Can he win the game?
YES!
Multiple messages
In real-life applications we need to encrypt
multiple messages with one key.
The adversary may learn something about the
key by looking at
ciphertexts c1,...,ct of
some messages m1,...,mt.
How are these messages chosen?
let’s say: the adversary can choose them!
A chosen-plaintext attack (CPA)
security parameter
1n
chooses m’1
1. selects random k Є {0,1}n
2. chooses a random b = 0,1
m’1
c1 = Enc(k,m’1)
...
chooses m’t
challenge phase:
chooses m0,m1
m’t
ct = Enc(m’t)
m0,m1
c = Enc(k,mb)
the interaction continues . . .
has to guess b
oracle
CPA-security
Alternative name: CPA-secure
Security definition
We say that (Enc,Dec) has indistinguishable encryptions
under a chosen-plaintext attack (CPA) if
every randomized polynomial time adversary
guesses b correctly
with probability at most 0.5 + ε(n), where ε is negligible.
Observation
Every CPA-secure encryption has to be
• randomized, or
• “have a state”.
CPA in real-life
Q: Aren’t we too pessimistic?
A: No! CPA can be implemented in practice.
Example: routing
k
m
Enck(m)
k
Is it possible to prove security?
(Enc,Dec) -- an encryption scheme.
For simplicity suppose that:
1. for a security parameter n the key is of length n.
2. Enc is deterministic
Consider the following language:
{(Enc(k , m) : k {0,1}n , m {0,1}m1}
L=
n
Q:What if L is polynomial-time decidable?
A: Then the scheme is broken (exercise)
On the other hand: L is in NP.
(k is the NP-witness)
So, if P = NP, then any semantically-secure encryption is broken.
Is it really true?
“If P=NP, then the semantically-secure encryption is
broken”
Is it 100% true?
Not really...
This is because even if P=NP we do not know what
are the constants.
Maybe P=NP in a very “inefficient way”...
To prove security of a cryptographic scheme we need to show
a lower bound on the computational complexity of some
problem.
In the “asymptotic setting” that would mean that
at least
we show that P ≠ NP.
Does the implication in the other direction hold?
(that is: does P ≠ NP imply anything for cryptography?)
No! (at least as far as we know)
Therefore
proving that an encryption scheme is secure is probably much
harder than proving that P ≠ NP.
What can we prove?
We can prove conditional results.
That is, we can show theorems of a type:
Suppose that some
“computational
assumption A”
holds
then scheme X is
secure.
Suppose that some
scheme Y is secure
then scheme X is
secure.
Research program in cryptography
Base the security of cryptographic schemes on a small number
of well-specified “computational assumptions”.
Examples of A:
“decisional Diffie-Hellman assumption”
“strong RSA assumption”
Some “computational
assumption A”
holds
in this we
have to
“believe”
interesting only
if this is “far
from trivial”
then scheme X is
secure.
the rest is
provable
Example
Suppose that G is a
“cryptographic
pseudorandom generator”
we can construct a secure
encryption scheme
Pseudorandom generators
n
s
G(s)
“expansion
factor”
Definition
l – polynomial such that always l(n) > n
An algorithm G : {0,1}* → {0,1}* is called a pseudorandom generator
(PRG) if
for every n
and for every s such that |s| = n
we have
|G(s)| = l(n).
?
and for a random s the value G(S) “looks random”.
l(n)
Idea
Use PRGs to “shorten” the key in the one time pad
for a moment just
consider a single
message case
Key: random string of length n
Plaintexts: strings of length l(n)
xor
Enc(s,m)
Dec(s,m)
s
s
G(s)
G(s)
m
m
xor
G(s)
c
c
xor
G(s)
If we use a “normal PRG” – this idea doesn’t work
We have to use the cryptographic PRGs.
What about the multi-message case?
One possible solution: “synchronized mode”.
G : {0,1}n → {0,1}very large – a PRG.
s
G is computed “on fly”
...
G(s)
m0
m1
m2
m3
c0
c1
c2
c3
xor
“Looks random”
What does it mean?
Non-cryptographic applications:
should pass some statistical tests.
Cryptography:
should pass all polynomial-time tests.
Cryptographic PRG
outputs:
a random string R
0 if he thinks it’s R
or
G(S) (where S random)
a polynomial-time
distinguisher D
1 if he thinks it’s G(S)
Should not be able to
distinguish...
Definition
n – a parameter
S – a variable distributed uniformly over {0,1}n
R – a variable distributed uniformly over {0,1} l(n)
G is a cryptographic PRG if
for every polynomial-time Turing Machine D
we have that
|P(D(R) = 1) – P(D(G(S)) = 1)|
is negligible in n.
Constructions
There exists constructions of cryptographic
pseudorandom-generators, that are
conjectured to be secure.
We will discuss them later...
Theorem
If G is a cryptographic PRG then the encryption
scheme constructed before is semantically-secure
(i.e. it has indistinguishable encryptions).
cryptographic PRGs
exist
computationally-secure encryption
exists
Proof (sketch)
Suppose that it is not secure.
Therefore there exists an poly-time adversary that wins the “guessing
game” with probability 0.5 + δ(n), where δ(n) is not negligible.
simulates
X
chooses m0,m1
tries to guess b
If the adversary
guessed b correctly
output 1:
“x is pseudorandom”.
m0,m1
c
1.
2.
b = 0,1 random
c := x xor mb
otherwise
output 0:
“x is random”.
Hence
x is a random string R
the adversary guesses b correctly
with probability 0.5
prob.
0.5
outputs:
1
prob.
0.5
0
x = G(S)
the adversary guesses b correctly
with probability 0.5 + δ(n)
prob.
0.5 + δ(n)
prob.
0.5 - δ(n)
1
| P(D(R) = 1) – P(D(G(S)) = 1) | = | 0.5 – (0.5 + δ(n)) | = δ(n)
Since δ is not negligible G cannot be a cryptographic PRG
0
The complexity
The distinguisher
simply simulated
one execution of the adversary
against the oracle
.
Hence he works in polynomial time.
QED
Moral
cryptographic PRGs
exist
computationally-secure encryption
exists
To construct secure encryption it suffices to
construct a secure PRG.
Moreover, we can also state the following:
Informal remark. The reduction is tight.
A question
What if the distinguisher
needed to simulate
1000 executions of the adversary
?
An (informal) answer
Then, the encryption scheme would be “1000 times less secure”
than the pseudorandom generator.
Why?
To achieve the same result
more than
.
needs to work 1000 times
General rule
Take a secure system that uses some long secret string X.
secret string X
Then, you can construct a system that uses a shorter string S,
and expands it using a PRG:
X = G(S)
secret string S
G
“pseudorandom” string X
Constructions of PRGs
A theoretical result
a PRG can be constructed from any one-way function
(very elegant, impractical, inefficient)
Based on hardness of some
particular computational problems
For example
[Blum, Blum, Shub. A Simple Unpredictable Pseudo-Random Number Generator]
(elegant, more efficient, still rather impractical)
“Stream ciphers”
ugly, very efficient, widely used in practice
Examples: RC4, Trivium, SOSEMANUK,...
One-way functions
A function
f : {0,1}* → {0,1} *
is one-way if it is “hard to invert it”.
random
x Є {0,1}n
f
f(x)
x’
probability that any poly-time adversary
outputs x’ such that
f(x) = f(x’)
is negligible in n
More formally...
experiment (machine M, function f)
1.
2.
3.
4.
pick a random element x from {0,1}n
let y := f(x),
let x’ be the output of M on y
we say that M won if f(x’) = y
We will say that f : {0,1}* → {0,1}* is one-way if
P (M wins) is negligible
A
polynomial-time
Turing Machine M
Example of a (candidate for) a one-way
function
If P=NP then one-way functions don’t exist.
So no function can be proven to be one-way.
But there exist candidates. Example:
f(p,q) = pq
this function is defined on
primes × primes,
not on
{0,1}*
but it’s just a technicality
One way functions do not “hide all the
input”
Example:
x1
one-way
function
xn
xn+1
f
f(x)
xn+1
f ’(x1,...,xn+1) := f(x1,...,xn) || xn+1 is also a one-way function
One of the most fundamental results in
the symmetric cryptography
[Håstad, Impagliazzo, Levin, Luby A Pseudorandom Generator from any One-way
Function]:
“a PRG can be constructed from any one-way function”
one-way functions
exist
cryptographic PRGs
exist
computationally-secure encryption
exists
The implication also holds in the other
direction
computationally-secure encryption
exists
plaintext
M
one-way functions
exist
Enc
ciphertext
C(K,M)
key
K
f(K) = Enc(K,(0,...,0)) is a one-way function
“Minicrypt”
P ≠ NP
?
big open problem
one-way functions
exist
computationally-secure
encryption exists
cryptographic PRGs
exist
The “world” where the one-way functions exist
is called “minicrypt”.
Practical encrypion
Stream ciphers
Block ciphers
we will discuss it now
Idea
1.
2.
Desing ciphers that work on small blocks
(e.g. of length n = 128 bits)
Then build the real encryption schemes out of them.
plaintext m
key k
encryption F
this will be called:
a block cipher
ciphertext c
key k
decryption F
plaintext m
Block ciphers – an bit more formaly
For a partial function F : {0,1}* × {0,1}*→ {0,1}*
let Fk(m) denote F(k,m).
A block cipher is a function F such that
1. It is a keyed-permutation, i.e.:
2.
•
for every k function Fk is a permutation on some {0,1}n
(for simplicity assume: n = |k|).
•
for every k functions Fk and Fk-1 are efficiently computable.
for a random k and any m1,...,mt the values
Fk(m1),...,Fk(mt) “look random”
Block ciphers – an intuition
key k
F
2n blocks
block 1
block 2
block 3
length n
length n
length n
block
cipher
≈
...
block 2n
length n
a pseudorandom generator with an output of exponential length,
with a random access to the output:
the ith block can be accessed as Fk(i)
additional properties:
the function Fk is a bijection,
and Fk-1 is efficiently computable
How to formalize it?
Remember:
stream ciphers ≈ pseudorandom generators
We will have
block ciphers ≈ pseudorandom permutations
Intuition:
a pseudorandom permutation
should not be distinguishable from
a “completely random permutation”.
Scenario 1
security parameter
1n
distinguisher D
m1 є {0,1}n
Fk(m1)
m2 є {0,1}n
Fk(m2)
...
mt є {0,1}n
outputs b є {0,1}
Fk(mt)
oracle
chooses a random k є {0,1}n.
Scenario 2
security parameter
1n
distinguisher D
m1 є {0,1}n
oracle
chooses a random function
F : {0,1}n→ {0,1}n
Fk(m1)
m2 є {0,1}n
Fk(m2)
...
mt є {0,1}n
outputs b є {0,1}
Fk(mt)
This of course cannot be
done efficiently, but it doesn’t
matter
Pseudorandom permutations – the definition
We say that a keyed-permutation
F : {0,1}* × {0,1}*→ {0,1}*
is a pseudorandom permutation if
any polynomial-time randomized distinguisher D
cannot distinguish scenario 1 from scenario 2 with
a non-negligible advantage.
That is:
|P(D outputs “1” in scenario 1) - P(D outputs “1” in scenario 2)|
is negligible in n
PRFs vs PRP
If we drop the assumption that
Fk has to be a permutation
we obtain an object called
a “pseudorandom function”.
The security definition doesn’t change.
In fact those two objects are indistinguishable
for a polynomial-time adversary.
Pseudorandom permutations in real-life
block ciphers
key length
block length
56
64
128
64
128, 192 or 256
128
DES (1976)
(Data Encryption Standard)
IDEA (1991)
(International Data Encryption
Algorithm)
AES (1998)
(Advanced Encryption Standard)
Other: Blowfish,Twofish, Serpent,...
Block cipher modes of operation
Block ciphers cannot be used directly for encryption.
They are always used in some “modes of operation”:
1.
2.
3.
4.
Electronic Codebook (ECB) mode ← not secure,
Cipher-Block Chaining (CBC) mode,
Output Feedback (OFB) mode,
Counter (CTR) mode,
...
Electronic Codebook mode
plaintext
encryption:
...
block 1
block 2
block 2
block t
Fk
Fk
Fk
Fk
block 1
block 2
block 2
block t
ciphertext
decryption:
block 1
block 2
block 2
F-1k
F-1k
F-1k
block 1
block 2
block 2
plaintext
block t
...
F-1k
block t
Electronic Codebook mode should not be
used!
This mode was used in the past.
It is not secure, and should not be used.
Example:
ECB
Cipher-Block Chaining (CBC)
encryption:
random
initial
value
plaintext
...
block 1
block 2
block 3
xor
xor
xor
xor
Fk
Fk
Fk
Fk
block 1
block 2
block 3
block t
ciphertext
This mode is self-synchronizing
Error in block ci affects only ci and ci+1.
So, errors don’t propagate
block t
Cipher-Block Chaining (CBC)
decryption:
ciphertext
random
initial
value
...
block 1
block 2
block 3
F-1k
F-1k
F-1k
F-1k
xor
xor
xor
xor
block 1
block 2
block 3
block t
plaintext
block t
CBC mode is secure
Theorem. If F is a PRP then F-CBC is secure.
[M. Bellare, A. Desai, E. Jokipii and P. Rogaway 1997]
In the proof one can assume that Fk is a completely random
function.
(If CBC behaves differently on a pseudorandom function, then
one could construct a distiguisher.)
plaintext
plaintext
CBC
CBC
Fk
...
ciphertext
Fk
random
...
ciphertext
random
How to convert a pseudorandom permutation
into a pseudorandom generator?
a seed
k
0000001
0000002
0000003
0000004
Fk
Fk
Fk
Fk
block 1
block 1
block 1
block 1
a pseudorandom stream
Essentially, this is called a “counter mode”.
...
One more member of minicrypt!
pseudorandom
functions/permutations
exist
using “modes of
operation”
secure encryption
exist
this we already
knew
one-way functions
exist
this can also be
proven
Plan
1.
Encryption schemes
a) information-theoretically secure
b) computationally-secure;
basic tools:
i. pseudorandom generators
ii. one-way functions
iii. pseudorandom functions and permutations
2.
Message authentication schemes
a) information-theoretically secure
b) computationally-secure
3.
Hash functions (collision-resistance and the
random oracle model).
Message Authentication
Integrity:
M
Alice
Bob
interferes with the transmission
How can Bob be sure that
M really comes from Alice?
96
Sometimes: more important than
secrecy!
transfer 1000 $ to Bob
Alice
transfer 1000 $ to Eve
Bank
Of course: usually we want both secrecy and integrity.
97
Does encryption guarantee message integrity?
Idea:
1.
2.
Alice encrypts m and sends c=Enc(k,m) to Bob.
Bob computes Dec(k,m), and if it “makes sense” accepts it.
Intuiton: only Alice knows k, so nobody else can produce a valid
ciphertext.
It does not work!
Example: one-time pad.
plaintext
transfer 1000 $ to Bob
“Eve” xor “Bob”
transfer 1000 $ to Eve
key K
xor
ciphertext C
98
Message authentication
verifies if
t=Tagk(m)
(m, t=Tagk(m))
m
Alice
Bob
k
k
Eve can see (m, t=Tagk(m))
She should not be able to
compute a valid tag t’ on any
other message m’.
99
Message authentication – multiple
messages
m1
(m1, t=Tagk(m1))
m2
(m2, t=Tagk(m2))
mt
...
...
Alice
Bob
(mw, t=Tagk(mw))
k
k
Eve should not be able to
compute a valid tag t’ on any
other message m’.
100
Message Authentication Codes – the
idea
m є {0,1}*
(m, t=Tagk(m))
Vrfyk(m) є {yes,no}
Alice
Bob
k
k
k is chosen randomly
from some set T
101
A mathematical view
K – key space
M – plaintext space
T - set of tags
A MAC scheme is a pair (Tag, Vrfy), where
Tag : K × M → T is an tagging algorithm,
Ver: K × M × T → {yes, no} is an decryption algorithm.
We will sometimes write Tagk(m) and Vrfyk(m,t) instead of
Tag(k,m) and Vrfy(k,m,t).
Correctness
it should always holds that:
Vrfyk(m,Tagk(m)) = yes.
Conventions
If Vrfyk(m,t) = yes then we say that t is a
valid tag on the message m.
If Tag is deterministic, then Vrfy just computes
Tag and compares the result.
In this case we do not need to define Vrfy
explicitly.
How to define security?
We need to specify:
1.
how the messages m1,...,mt are chosen,
2.
what is the goal of the adversary.
Good tradition: be as pessimistic as possible!
Therefore we assume that
1.
The adversary is allowed to chose m1,...,mw.
2.
The goal of the adversary is to produce a valid tag on
some m’ such that m’ ≠ m1,...,mw.
104
security parameter
1n
adversary
selects random a k Є {0,1}n
m1
(m1, t=Tagk(m1))
oracle
...
mw
(mw, t=Tagk(mw))
We say that the adversary breaks the MAC scheme at the end
she outputs (m’,t’) such that
Vrfy(m’,t’) = yes
and
m’ ≠ m1,...,mw
105
The security definition
We say that (Tag,Vrfy) is secure if
P(A breaks it) is negligible (in n)
A
polynomial-time
adversary A
106
Aren’t we too paranoid?
Maybe it would be enough to require that:
the adversary succeds only if he forges a message
that “makes sense”.
(e.g.: forging a message that consists of random noise
should not count)
Bad idea:
hard to define,
is application-dependent.
107
Warning: MACs do not offer protection against the
“replay attacks”.
(m, t)
Alice
Bob
Since Vrfy has no state (or
“memory”) there is no way to
detect that (m,t) is not fresh!
This problem has to be solved by the higher-level application
(methods: time-stamping, sequence numbers...).
108
Constructing a MAC
1.
There exist MACs that are secure even if the adversary is
infinitely-powerful.
These constructions are not practical.
2.
MACs can be constructed from the block-ciphers.
We will now discuss to constructions:
◦
◦
1.
simple (and not practical),
a little bit more complicated (and practical) – a CBC-MAC
MACs can also be constructed from the hash functions
(NMAC, HMAC).
109
Information-theoretically secure MACs
We now show a construction of informationtheoretically secure MACs, i.e.:
MACs that are secure against an infinitelypowerful adversary
Our construction will be secure only if the key
is never reused.
like in the one-time pad encryption...
Idea
Consider a scheme where, for a random k the tagging function
TagK is a completely random function from the set of all
functions M → T .
(problem: the key needs to describe a function from M → T, and
hence it has an exponential length...)
Cleary, in this case on the variables {TagK(m)}m Є M are
independent. Hence, seeing the pairs:
(m1, t=Tagk(m1))
(m2, t=Tagk(m2))
...
(mw, t=Tagk(mw))
gives to the adversary no advantage in guessing a tag on some
other message m’.
Thus his chances of success are exactly 1/| T|.
How does it look in case of the “onetime MACs”
(m, t=Tagk(m))
m
Alice
Bob
Eve can see (m, t=Tagk(m))
She should not be able to compute a valid tag t’ on any other message m’.
It is enough that any pair of variables in the set {TagK(m)}m Є M is
independent.
This is called a set of pair-wise independent variables.
We are now going to construct such a set...
Idea: Linear function over Zp (where p is a large prime)
M = Zp
K = Zp × Zp
T = Zp
for example
p = 2107- 1
Tag((a,b), m) = am + b mod p
Intuition:
...
?
a
...
b
m0
m1
m0
m1
113
Lemma. Let (A,B) be distributed uniformly over Zp × Zp. Then for every distinct m0
and m1 the following variables are independent
(A· m0 + B) and (A· m1 + B) .
Clearly, each of those variables is distributed uniformly over Zp, and hence of every (x,y)
we have
P (A · m0 + B = x)· P(A· m1 + B = y) = 1/p · 1/p = 1/p2
Therefore it suffices to show that
P (A· m0 + B = x and A· m1 + B = y) = 1/p2
This is equivalent to the fact that the following system of linear equations (over Zp) has
exactly one solution (where a and b are the unknowns):
{
Clearly if m0 ≠ m1 then
det
a· m0 + b = x
a· m1 + b = y
[
m0
1
m1
1
]
≠0
Thus we are done
114
Can we reuse the same key many times?
After seeing two values:
Tag(k,m0) = A· m0 + B
Tag(k,m1) = A· m0 + B
(for m0 ≠ m1) the adversary can compute
(A,B) by solving a system of linear equations.
It can be shown that in general the length of the key
has to be proportional to the total length of
authenticated messages.
115
How to encrypt more messages with one
short key k?
Simple idea:
For every new message mi generate pseudorandomly
a new key ki for the one-time MAC.
k
PRG G
k1
k2
k3
...
Tag(k1,m1)
Tag(k2,m2)
Tag(k3,m3)
This can be proven secure!
A new member of “Minicrypt”
one-way functions
exist
this can be proven
this we already knew
computationally-secure
MACs exist
cryptographic PRGs
exist
this we have just proven
A simple construction from a block
cipher
Let
F : {0,1}n × {0,1}n → {0,1}n
be a block cipher.
We can now define a MAC scheme that works only for
messages m ε {0,1}n as follows:
F(k,m)
Mac(k,m) = F(k,m)
It can be proven that it is a secure MAC.
k
Fk
How to generalize it to longer messages?
m
118
Idea 1
• divide the message in blocks m1,...,md
• and authenticate each block separately
F(k,m1)
F(k,md)
...
Fk
Fk
m1
md
This doesn’t work!
119
What goes wrong?
m:
t = Tagk(m):
perm
m’ =
perm(m):
t’ = perm(t):
Then t’ is a valid tag on m’.
120
Idea 2
Add a counter to each block.
F(k,x1)
F(k,xd)
...
Fk
1
m1
x1
Fk
d
md
xd
This doesn’t work either!
121
i
mi
xi
m:
t = Tagk(m):
m’ = a prefix of m:
t’ = a prefix of t:
Then t’ is a valid tag on m’.
122
Idea 3
Add l := |m| to each block
F(k,x1)
F(k,xd)
...
Fk
l
1
Fk
m1
x1
l
d
md
xd
This doesn’t work either!
123
l
1
m1
xi
What goes wrong?
m:
m’:
t = Tagk(m):
t’ = Tagk(m’):
m’’ = first half from m || second half from m’
t’’ = first half from t || second half from t’
Then t’’ is a valid tag on m’’.
124
Idea 4
Add a fresh random value to each block!
F(k,x1)
F(k,xd)
...
Fk
r
Fk
d
l
x1
md
r
d
l
md
xd
This works!
125
tagk(m)
r
r
F(k,x1)
F(k,x2)
Fk
Fk
1
l
m1
r
n – block length
|mi| = n/4
m1
m2
Fk
...
m2
x2
x1
r is chosen randomly
...
2
l
F(k,xd)
r
d
l
md
xd
...
m
md
000
l
pad with zeroes if needed
126
This construction can be proven secure
Theorem
Assuming that
F : {0,1}n × {0,1}n → {0,1}n is a pseudorandom
permutation
the construction from the previous slide is a secure MAC.
Proof idea:
Suppose it is not a secure MAC.
Let A be an adversary that breaks it with a non-negligible
probability.
We construct a distinguisher D that distinguishes F from a
random permutation.
127
This construction is not practical
Problem:
The tag is 4 times longer than the message...
We can do much better!
128
CBC-MAC
F : {0,1}n × {0,1}n → {0,1}n - a block cipher
tagk(m)
Fk
Fk
Fk
Fk
Fk
|m|
m1
m2
m3
m
...
md
0000
pad with zeroes if needed
Other variants exist!
129
tagk(m)
Fk
Fk
Fk
Fk
|m|
m1
m2
m3
Fk
...
md
Why is this
needed?
Suppose we don’t prepend |m|...
130
t1=tagk(m1)
t2=tagk(m2)
Fk
Fk
m1
m2
the adversary
chooses:
t’= tagk(m’)
t1
now she can
compute:
Fk
Fk
m1
m2 xor t1
m’
t’ = t2
131
Plan
1.
Encryption schemes
a) information-theoretically secure
b) computationally-secure;
basic tools:
i. pseudorandom generators
ii. one-way functions
iii. pseudorandom functions and permutations
2.
Message authentication schemes
a) information-theoretically secure
b) computationally-secure
3.
Hash functions (collision-resistance and the
random oracle model).
Another idea for authenticating long messages
Fk(h(m))
k
a block cipher
Fk
h(m)
a “hash function” h
long m
By the way: a similar method is used in the public-key cryptography (it is called
“hash-and-sign”).
133
How to formalize it?
We need to define what is a “hash function”.
The basic property that we require is:
“collision resistance”
Collision-resistant hash functions
short H(m)
a hash function
H : {0,1}* → {0,1}L
long m
collision-resistance
a “collision”
Requirement: it should be hard to find a pair (m,m’) such that
H(m) =H(m’)
135
Collisions always exist
m
domain
range
m’
Since the domain is
larger than the range the
collisions have to exist.
136
Hash functions are a bit simillar to the
error-correcting codes
Difference between the hash functions and the error
correcting codes:
error-correcting codes are secure against the random
errors.
collision-resistant hash functions are secure against the
intentional errors.
A bit like:
pseudorandom generators
vs.
cryptographic pseudorandom generators.
137
“Practical definition”
H is a collision-resistant hash function if it is “practically
impossible to find collisions in H”.
Popular hash funcitons:
MD5 (now cosidered broken)
• SHA1
• ...
•
138
How to formally define “collision resitance”?
Idea
Say something like: H is a collision-resistant hash
function if
P(A finds a collision in H) is small
A
efficient
adversary A
Problem
For a fixed H there always exist a constant-time algorithm that
“finds a collision in H” in constant time.
It may be hard to find such an algorithm, but it always exists!
139
Solution
When we prove theorems we will always
consider
families of hash functions
indexed by a key s.
s
{H }
s є keys
140
informal description:
“knows H”
a protocol
H
H
H
s is chosen
randomly
formal model:
a protocol
s
Hs
Hs
Hs
141
informal description:
“knows H”
a protocol
H
H
H
real-life implementation (example):
“knows SHA1”
a protocol
SHA1
SHA1
SHA1
142
Hash functions – the functional
definition
A hash function is a probabilistic polynomial-time
algorithm H such that:
H takes as input a key s є {0,1}n and a message
x є {0,1}* and outputs a string
Hs(x) є {0,1}L(n),
where L(n) is some fixed function.
143
Hash functions – the security definition
[1/2]
1n
s
selects a
random
s є {0,1}n
outputs (m,m’)
We say that adversary A breaks the function H if
Hs(m) = Hs(m’).
144
Hash functions – the security definition
[2/2]
H is a collision-resistant hash function if
P(A breaks H) is negligible
A
polynomial-time
adversary A
145
Authentication scheme - formally
A key for the MAC is a pair:
a key for the hash function H
(s,k)
a key for the PRF F
Tag((k,s),m) = Fk(Hs(m))
Theorem. If H and F are secure then Tag is secure.
This is proven as follows.
Suppose we have an adversary
that breaks Tag. Then we can construct:
a distinguisher for F
simulates
an adversary for H
or
simulates
Do collision-resilient hash functions
belong to minicrypt?
collision-resilient hash
functions exist
open problem
easy exercise
one-way functions
exist
[D. Simon: Finding Collisions on a One-Way
Street: Can Secure Hash Functions Be
Based on General Assumptions? 1998]:
there is no “black-box reduction”.
Other uses of “hash functions”
Hash functions are used by practicioners to convert
“non-uniform randomness” into a uniform one.
Example:
shorter “uniformly random” H(m)
a hash function
H : {0,1}* → {0,1}L
user generated randomness X (key strokes, mouse movements, etc.)
How to formalize it?
Random oracle model
[Bellare, Rogaway, Random Oracles are Practical:
A Paradigm for Designing Efficient Protocols,
1993]
Idea: model the hash function as a random oracle.
x
H(x)
H : {0,1}* → {0,1}L
a completely random
function
informal description:
“knows H”
a protocol
H
formal model:
H : {0,1}* → {0,1}L
a protocol
Every call to H
is replaced with
a query to the
oracle.
also the
adversary is
allowed to
query the
oracle.
150
How would we use it in the proof?
shorter “uniformly random” H(X)
a hash function
H : {0,1}* → {0,1}L
user generated randomness X
As long as the adversary never queried the oracle on X
the value H(X) “looks completely random to him”.
Criticism of the Random Oracle Model
[Canetti, Goldreich, Halevi: The random oracle
methodology, revisited. 1998]
There exists a signature scheme that is
secure in ROM
but
is not secure if the random oracle is replaced with any
real hash function.
This example is very artificial. No “realistic” example of
this type is know.
Terminology
Model without the random oracles:
•“plain model”
•“cryptographic model”
Random Oracle Model is also called:
the “Random Oracle Heuristic”.
Common view: a ROM proof is better than nothing.
Plan
1.
Encryption schemes
a) information-theoretically secure
b) computationally-secure;
basic tools:
i. pseudorandom generators
ii. one-way functions
iii. pseudorandom functions and permutations
2.
Message authentication schemes
a) information-theoretically secure
b) computationally-secure
3.
Hash functions (collision-resistance and the
random oracle model).
Outlook
cryptography
“information-theoretic”,
“unconditional”
• one time pad,
• quantum cryptography,
• ...
“computational”
based on 2 simultanious
assumptions:
1. some problems are
computationally difficult
2. our understanding of what
“computational difficulty”
means is correct.
Symmetric cryptography
symmetric
cryptography
encryption
authentication
Basic information-theoretic tools
xor (one-time pad)
two-wise independent functions
Basic tools from the computational
cryptography
one-way functions
pseudorandom generators
pseudorandom functions/permutations
hash functions
A method for proving security:
reductions
P ≠ NP
hash functions
one-way functions
pseudorandom generators
pseudorandom functions/permutations
computationally-secure authentication
computationally-secure encryption
in general the picture
is much more
complicated!
What will we be talking about in January?
“Cryptography on Non-Trusted Machines”
(more precisely: “Leakage-Resilient
Cryptography”)
How to construct secure cryptographic
devices?
cryptographic device
very secure
Security based on well-defined
mathematical problems.
CRYPTO
not secure!
The problem
cryptographic device
CRYPTO
Information leakage
Side channel information:
• power consumption,
• electromagnetic leaks,
• timing information,
etc.
cryptographic device
The standard view
cryptographic device
cryptographic device
practitioners
Implementation is
not our business!
CRYPTO
CRYPTO
theoreticians
A recent idea
Design cryptographic
protocols that are secure
even
on the machines that leak
information.
The model
(standard) black-box access
cryptographic
scheme
additional access
to the internal data