Slides - Winlab
Information Theory and Security
Lecture Motivation
Up to this point we have seen:
– Classical Crypto
– Symmetric Crypto
– Asymmetric Crypto
These systems have focused on issues of confidentiality:
Ensuring that an adversary cannot infer the original plaintext
message, or cannot learn any information about the original
plaintext from the ciphertext.
But what does “information” mean?
In this lecture and the next we will put a more formal framework
around the notion of what information is, and use this to provide
a definition of security from an information-theoretic point of
view.
Lecture Outline
Probability Review: Conditional Probability and Bayes
Entropy:
– Desired properties and definition
– Chain Rule and conditioning
Coding and Information Theory
– Huffman codes
– General source coding results
Secrecy and Information Theory
– Probabilistic definitions of a cryptosystem
– Perfect Secrecy
The Basic Idea
Suppose we roll a 6-sided die.
– Let A be the event that the number of dots is odd.
– Let B be the event that the number of dots is at least 3.
A = {1, 3, 5}
B = {3, 4, 5, 6}
If I tell you that the roll belongs to both A and B, then you know there
are only two possibilities: A ∩ B = {3, 5}.
In this sense A ∩ B tells you more than just A or just B.
That is, there is less uncertainty in A ∩ B than in A alone or B alone.
Information is closely linked with this idea of uncertainty:
Information increases when uncertainty decreases.
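The die example can be sketched in a few lines of Python. This is only an illustration, assuming a fair die so that every face is equally likely; the helper name `uncertainty_bits` is introduced here and is not part of the lecture. It measures uncertainty as log2 of the number of outcomes that are still possible, which anticipates the entropy definition given later.

```python
from math import log2

outcomes = {1, 2, 3, 4, 5, 6}             # faces of a fair die (assumption)
A = {x for x in outcomes if x % 2 == 1}   # odd number of dots
B = {x for x in outcomes if x >= 3}       # at least 3 dots
A_and_B = A & B                           # the roll belongs to both events

def uncertainty_bits(event):
    """log2 of the number of equally likely outcomes still possible."""
    return log2(len(event))

for name, event in [("A", A), ("B", B), ("A and B", A_and_B)]:
    print(name, sorted(event), round(uncertainty_bits(event), 3), "bits")
```

Knowing that the roll is in both events leaves one bit of uncertainty, less than knowing either event alone.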
Probability Review, pg. 1
A random variable (event) is an experiment whose outcomes
are mapped to real numbers.
For our discussion we will deal with discrete-valued random
variables.
Probability: We denote pX(x) = Pr(X = x).
For a subset A,
$p(A) = \sum_{x \in A} p_X(x)$
Joint Probability: Sometimes we want to consider two or more
events at the same time, in which case we lump them
together into a joint random variable, e.g. Z = (X,Y).
$p_{X,Y}(x, y) = \Pr(X = x, Y = y)$
Independence: We say that two events are independent if
$p_{X,Y}(x, y) = p_X(x)\, p_Y(y)$
Probability Review, pg. 2
Conditional Probability: We will often ask questions about
the probability of events Y given that we have observed X=x.
In particular, we define the conditional probability of Y=y
given X=x by
$p_Y(y \mid x) = \dfrac{p_{XY}(x, y)}{p_X(x)}$
Independence: If X and Y are independent, we immediately get $p_Y(y \mid x) = p_Y(y)$.
Bayes’s Theorem: If pX(x)>0 and pY(y)>0 then
$p_X(x \mid y) = \dfrac{p_X(x)\, p_Y(y \mid x)}{p_Y(y)}$
Example
Example: Suppose we draw a card from a standard deck. Let
X be the random variable describing the suit (e.g. clubs,
diamonds, hearts, spades). Let Y be the value of the card (e.g.
two, three, …, ace). Then Z=(X,Y) gives the 52 possibilities
for the card.
P( (X,Y) = (x,y) ) = P(X=x, Y=y) = 1/52
P(X=“clubs”) = 13/52 = ¼
P(Y=“3”) = 4/52 = 1/13
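A small Python sketch of the card example, using exact fractions. It builds the joint distribution, recovers the marginals by summation, and evaluates the conditional probability and Bayes's theorem from the definitions above. The function names `p_y_given_x` and `p_x_given_y` are chosen here for illustration only.

```python
from fractions import Fraction
from itertools import product

suits = ["clubs", "diamonds", "hearts", "spades"]
values = ["2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K", "A"]

# Joint distribution of Z = (X, Y): each of the 52 cards is equally likely.
p_xy = {(s, v): Fraction(1, 52) for s, v in product(suits, values)}

# Marginals are obtained by summing the joint distribution.
p_x = {s: sum(p_xy[(s, v)] for v in values) for s in suits}
p_y = {v: sum(p_xy[(s, v)] for s in suits) for v in values}

def p_y_given_x(v, s):
    """Conditional probability p_Y(v | s) = p_XY(s, v) / p_X(s)."""
    return p_xy[(s, v)] / p_x[s]

def p_x_given_y(s, v):
    """Bayes's theorem: p_X(s | v) = p_X(s) p_Y(v | s) / p_Y(v)."""
    return p_x[s] * p_y_given_x(v, s) / p_y[v]

# Suit and value are independent if the joint factors into the marginals.
independent = all(p_xy[(s, v)] == p_x[s] * p_y[v] for s, v in p_xy)
print(p_x["clubs"], p_y["3"], p_x_given_y("clubs", "3"), independent)
```

Here suit and value turn out to be independent, so learning the value of the card does not change the probability of its suit.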
Entropy and Uncertainty
We are concerned with how much uncertainty a random event
has, but how do we define or measure uncertainty?
We want our measure to have the following properties:
1. To each set of nonnegative numbers $p = (p_1, p_2, \ldots, p_n)$ with
$p_1 + p_2 + \cdots + p_n = 1$, we define the uncertainty by $H(p)$.
2. $H(p)$ should be a continuous function: a slight change in p
should not drastically change $H(p)$.
3. $H\left(\tfrac{1}{n}, \ldots, \tfrac{1}{n}\right) \le H\left(\tfrac{1}{n+1}, \ldots, \tfrac{1}{n+1}\right)$ for all n>0. Uncertainty increases
when there are more outcomes.
4. If 0<q<1, then
$H(p_1, \ldots, q p_j, (1-q) p_j, \ldots, p_n) = H(p_1, \ldots, p_n) + p_j H(q, 1-q)$
Entropy, pg. 2
We define the entropy of a random variable by
$H(X) = -\sum_x p(x) \log_2 p(x)$
Example: Consider a fair coin toss. There are two outcomes,
with probability ½ each. The entropy is
$-\left(\tfrac{1}{2} \log_2 \tfrac{1}{2} + \tfrac{1}{2} \log_2 \tfrac{1}{2}\right) = 1$ bit
Example: Consider a non-fair coin toss X with probability p of
getting heads and 1-p of getting tails. The entropy is
$H(X) = -p \log_2 p - (1 - p) \log_2 (1 - p)$
The entropy is maximum when p= ½.
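The claim that the entropy of a coin is largest at p = ½ can be checked numerically. The sketch below evaluates the binary entropy on a grid of values of p; the function name `binary_entropy` and the grid resolution are choices made here, and the usual convention that 0 log 0 counts as 0 is assumed.

```python
from math import log2

def binary_entropy(p):
    """H(X) = -p log2 p - (1 - p) log2 (1 - p), taking 0 log2 0 as 0."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * log2(p) - (1 - p) * log2(1 - p)

# Evaluate on a grid p = 0.00, 0.01, ..., 1.00 and find the maximizer.
grid = [i / 100 for i in range(101)]
best_p = max(grid, key=binary_entropy)
print(best_p, binary_entropy(best_p))
```

A grid search is not a proof, but it agrees with the slide: the fair coin is the most uncertain, at exactly 1 bit.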
Entropy, pg. 3
Entropy may be thought of as the number of yes-no questions
needed to accurately determine the outcome of a random
event.
Example: Flip two coins, and let X be the number of heads.
The possibilities are {0,1,2} and the probabilities are {1/4,
1/2, 1/4}. The entropy is
$-\left(\tfrac{1}{4} \log_2 \tfrac{1}{4} + \tfrac{1}{2} \log_2 \tfrac{1}{2} + \tfrac{1}{4} \log_2 \tfrac{1}{4}\right) = \tfrac{3}{2}$ bits
So how can we relate this to questions?
First, ask “Is there exactly one head?” Half the time the answer is yes,
and you are done.
Next, ask “Are there two heads?”
Half the time you needed one question, half the time you needed two,
for an average of 3/2 questions.
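The link between entropy and the number of questions can be checked directly for the two-coin example. The sketch below assumes fair coins; the dictionary `questions_needed` encodes the questioning strategy described above and is a name introduced only for this illustration.

```python
from math import log2

def entropy(probs):
    """Shannon entropy in bits; zero-probability outcomes contribute nothing."""
    return -sum(p * log2(p) for p in probs if p > 0)

# X = number of heads in two fair coin flips.
p = {0: 0.25, 1: 0.5, 2: 0.25}

# Strategy: first ask "exactly one head?"; only if the answer is no,
# ask "two heads?". One head needs 1 question, the other outcomes need 2.
questions_needed = {0: 2, 1: 1, 2: 2}
expected_questions = sum(p[x] * questions_needed[x] for x in p)

print(entropy(p.values()), expected_questions)
```

For this distribution the expected number of questions equals the entropy; in general the entropy is a lower bound on it.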
Entropy, pg. 4
Suppose we have two random variables X and Y. The joint
entropy H(X,Y) is given by
$H(X, Y) = -\sum_x \sum_y p_{XY}(x, y) \log_2 p_{XY}(x, y)$
Conditional Entropy: In security, we ask questions of whether
an observation reduces the uncertainty in something else. In
particular, we want a notion of conditional entropy. Given that
we observe event X, how much uncertainty is left in Y?
$H(Y \mid X) = \sum_x p_X(x)\, H(Y \mid X = x)$
$= -\sum_x p_X(x) \sum_y p_Y(y \mid x) \log_2 p_Y(y \mid x)$
$= -\sum_x \sum_y p_{XY}(x, y) \log_2 p_Y(y \mid x)$
Entropy, pg. 5
Chain Rule: The Chain Rule allows us to relate joint entropy to
conditional entropy via H(X,Y) = H(Y|X)+H(X).
$H(X, Y) = -\sum_x \sum_y p_{XY}(x, y) \log_2 p_{XY}(x, y)$
$= -\sum_x \sum_y p_{XY}(x, y) \log_2 \left[ p_X(x)\, p_Y(y \mid x) \right]$
$= -\sum_x p_X(x) \log_2 p_X(x) + H(Y \mid X)$
$= H(X) + H(Y \mid X)$
(Remaining details will be provided on the white board)
Meaning: Uncertainty in (X,Y) is the uncertainty of X plus
whatever uncertainty remains in Y given we observe X.
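A numerical check of the Chain Rule on a small example. The joint distribution below is hypothetical and chosen only for illustration; the conditional entropy is computed from its definition rather than from the Chain Rule, so the comparison at the end is a genuine check.

```python
from math import log2

# Hypothetical joint distribution p_XY(x, y), chosen only for illustration.
p_xy = {
    (0, 0): 0.5,   (0, 1): 0.25,
    (1, 0): 0.125, (1, 1): 0.125,
}

def marginal_x(p_xy):
    """p_X(x) = sum over y of p_XY(x, y)."""
    p_x = {}
    for (x, _), p in p_xy.items():
        p_x[x] = p_x.get(x, 0.0) + p
    return p_x

def entropy(dist):
    """H = -sum p log2 p over the outcomes of a distribution given as a dict."""
    return -sum(p * log2(p) for p in dist.values() if p > 0)

def conditional_entropy(p_xy):
    """H(Y|X) = -sum_x sum_y p_XY(x, y) log2 p_Y(y|x), with p_Y(y|x) = p_XY / p_X."""
    p_x = marginal_x(p_xy)
    return -sum(p * log2(p / p_x[x]) for (x, _), p in p_xy.items() if p > 0)

h_joint = entropy(p_xy)
h_x = entropy(marginal_x(p_xy))
h_y_given_x = conditional_entropy(p_xy)
print(h_joint, h_x + h_y_given_x)
```

The two printed values agree up to floating-point rounding, as the Chain Rule predicts.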
Entropy, pg. 6
Main Theorem:
1. Entropy is non-negative: $H(X) \ge 0$.
2. $H(X) \le \log_2 |\mathcal{X}|$, where $|\mathcal{X}|$ denotes the number of elements
in the sample space of X.
3. $H(X, Y) \le H(X) + H(Y)$
4. (Conditioning reduces entropy) $H(Y \mid X) \le H(Y)$,
with equality if and only if X and Y are independent.
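The inequalities in the theorem can be illustrated on two small joint distributions, one where Y depends on X and one where it does not. Both distributions are hypothetical examples, and the helper `summary` is a name introduced here. It obtains H(Y|X) as H(X,Y) − H(X), i.e. it assumes the Chain Rule.

```python
from math import log2

def H(probs):
    """Entropy in bits of an iterable of probabilities."""
    return -sum(p * log2(p) for p in probs if p > 0)

def summary(p_xy):
    """Return H(X), H(Y), H(X,Y), H(Y|X) for a joint distribution given as a dict."""
    p_x, p_y = {}, {}
    for (x, y), p in p_xy.items():
        p_x[x] = p_x.get(x, 0.0) + p
        p_y[y] = p_y.get(y, 0.0) + p
    h_x, h_y, h_xy = H(p_x.values()), H(p_y.values()), H(p_xy.values())
    return h_x, h_y, h_xy, h_xy - h_x   # last entry uses the Chain Rule

# Hypothetical examples: Y depends on X in the first, not in the second.
dependent = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}
independent = {(x, y): px * py
               for x, px in [(0, 0.5), (1, 0.5)]
               for y, py in [(0, 0.25), (1, 0.75)]}

for name, joint in [("dependent", dependent), ("independent", independent)]:
    h_x, h_y, h_xy, h_y_given_x = summary(joint)
    print(name, round(h_x, 4), round(h_y, 4), round(h_xy, 4), round(h_y_given_x, 4))
```

In the dependent case H(Y|X) is strictly smaller than H(Y); in the independent case the two coincide and H(X,Y) = H(X) + H(Y).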
Entropy and Source Coding Theory
There is a close relationship between entropy and representing
information.
Entropy captures the notion of how many “Yes-No” questions
are needed to accurately identify a piece of information… that
is, how many bits are needed!
One of the main focus areas in the field of information theory is
source coding:
– How to efficiently represent (“compress”) information in as few bits as
possible.
We will talk about one such technique, Huffman Coding.
Huffman coding is for a simple scenario, where the source is a
stationary stochastic process with independence between
successive source symbols
Huffman Coding, pg. 1
Suppose we have an alphabet with four letters A, B, C, D with
frequencies:
Letter   Frequency
A        0.5
B        0.3
C        0.1
D        0.1
We could represent this with A=00, B=01, C=10, D=11. This
would mean we use an average of 2 bits per letter.
On the other hand, we could use the following representation:
A=1, B=01, C=001, D=000. Then the average number of bits
per letter becomes
(0.5)*1+(0.3)*2+(0.1)*3+(0.1)*3 = 1.7
Hence, this representation, on average, is more efficient.
Huffman Coding, pg. 2
Huffman Coding is an algorithm that produces a representation for
a source.
The Algorithm:
– List all outputs and their probabilities
– Assign a 1 and 0 to smallest two, and combine to form an output
with probability equal to the sum
– Sort list according to probabilities and repeat the process
– The binary strings are then obtained by reading backwards
through the procedure
[Figure: Huffman tree for A 0.5, B 0.3, C 0.1, D 0.1. C (branch 1) and D (branch 0) combine into a node of weight 0.2; B (branch 1) and that node (branch 0) combine into a node of weight 0.5; A (branch 1) and that node (branch 0) combine into the root of weight 1.0.]
Symbol Representations
A: 1
B: 01
C: 001
D: 000
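The algorithm above can be sketched in Python using a priority queue from the standard `heapq` module. This is one possible implementation, not the one used in the lecture. Tie-breaking and the choice of which branch receives 1 or 0 are arbitrary here, so the bit patterns may differ from the slide, while the codeword lengths for this example come out the same.

```python
import heapq
from itertools import count

def huffman_code(weights):
    """Return {symbol: bitstring} for a dict mapping symbols to weights."""
    tiebreak = count()  # unique counter so equal weights never compare dicts
    heap = [(w, next(tiebreak), {s: ""}) for s, w in weights.items()]
    heapq.heapify(heap)
    if len(heap) == 1:                      # degenerate single-symbol source
        return {s: "0" for s in weights}
    while len(heap) > 1:
        w0, _, low = heapq.heappop(heap)    # smallest weight: prepend bit 0
        w1, _, high = heapq.heappop(heap)   # next smallest: prepend bit 1
        merged = {s: "0" + c for s, c in low.items()}
        merged.update({s: "1" + c for s, c in high.items()})
        heapq.heappush(heap, (w0 + w1, next(tiebreak), merged))
    return heap[0][2]

code = huffman_code({"A": 0.5, "B": 0.3, "C": 0.1, "D": 0.1})
print(code)
print({s: len(c) for s, c in code.items()})
```

Prepending a bit at each merge is the "reading backwards" step: the last merge contributes the first bit of every codeword.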
Huffman Coding, pg. 3
In the previous example, we used probabilities. We may directly
use event counts.
Example: Consider 8 symbols, and suppose we have counted
how many times they have occurred in an output sample.
Symbol   Count
S1       28
S2       25
S3       20
S4       16
S5       15
S6       8
S7       7
S8       5
We may derive the Huffman tree from these counts.
The corresponding length vector is (2,2,3,3,3,4,5,5).
The average codelength is 351/124 ≈ 2.83 bits. If we had used a fully balanced
tree representation (i.e. the straightforward representation) we
would have had an average codelength of 3 bits.
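The average codelength for the 8-symbol example can be recomputed from the counts and the length vector. The sketch uses exact fractions. The Kraft sum at the end is an extra sanity check added here, not something stated on the slides: for a full binary tree such as a Huffman tree, the sum of 2^(-length) over all codewords equals 1.

```python
from fractions import Fraction
from math import ceil, log2

counts = {"S1": 28, "S2": 25, "S3": 20, "S4": 16,
          "S5": 15, "S6": 8, "S7": 7, "S8": 5}
# Length vector as given in the text, in the order S1..S8.
lengths = dict(zip(counts, (2, 2, 3, 3, 3, 4, 5, 5)))

total = sum(counts.values())
avg_huffman = Fraction(sum(counts[s] * lengths[s] for s in counts), total)
fixed_length = ceil(log2(len(counts)))   # bits per symbol in a balanced tree

# Kraft sum (added check): 1 for a full binary code tree.
kraft = sum(Fraction(1, 2 ** l) for l in lengths.values())

print(avg_huffman, float(avg_huffman), fixed_length, kraft)
```

Weighting by counts and dividing by the total is the same as weighting by relative frequencies, which is why event counts can be used directly.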
Huffman Coding, pg. 4
We would like to quantify the average number of bits needed in
terms of entropy.
Theorem: Let L be the average number of bits per output for
Huffman encoding of a random variable X, then
$H(X) \le L < H(X) + 1$, where $L = \sum_x p(x)\, l_x$.
Here, $l_x$ = length of the codeword assigned to symbol x.
Example: Let’s look back at the 4 symbol example
$H(X) = -\left(0.5 \log_2 0.5 + 0.3 \log_2 0.3 + 0.1 \log_2 0.1 + 0.1 \log_2 0.1\right) \approx 1.685$
Our average codelength was 1.7 bits.
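The bound in the theorem can be checked for the 4-symbol example in a few lines. The codeword lengths are those of the representation A=1, B=01, C=001, D=000 from the earlier slide; the variable names are chosen here for illustration.

```python
from math import log2

p = {"A": 0.5, "B": 0.3, "C": 0.1, "D": 0.1}
length = {"A": 1, "B": 2, "C": 3, "D": 3}   # lengths of 1, 01, 001, 000

H = -sum(px * log2(px) for px in p.values())   # entropy of the source
L = sum(p[x] * length[x] for x in p)           # average codelength

print(round(H, 3), round(L, 3), H <= L < H + 1)
```

The average codelength sits just above the entropy, so for this source the Huffman code wastes only about 0.015 bits per letter.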
Next Time
We will look at how entropy is related to security
– Generalized definition of encryption
– Perfect Secrecy
– Manipulating entropy relationships
The next computer project will also be handed out