Entropy
Optimal Value
Problem
Assume you are designing a digital display.
Both the base and the number of display digits can vary.
Display cost is the product of the base and the number of digits.
What is the optimal base to use for the display?
Example
The decimal number 563 costs 10 × 3 = 30 units.
The binary number 1000110011 costs 2 × 10 = 20 units.
It represents the same value as decimal 563 at a lower cost.
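A minimal sketch of this cost model in Python (the function name and structure are my own, not from the slides):

import math

def display_cost(value: int, base: int) -> int:
    # Cost = base times the number of digits needed to show value.
    digits = 1 if value == 0 else math.floor(math.log(value, base)) + 1
    return base * digits

print(display_cost(563, 10))  # 10 x 3 digits = 30 units
print(display_cost(563, 2))   # 2 x 10 digits = 20 units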
Natural Base
The number of digits d is the logarithm of the number i in a given base n:

    d = \log_n i = \frac{\ln i}{\ln n}

The cost C(i) is the product of n and d:

    C(i) = n \frac{\ln i}{\ln n}

Differentiation can find the optimal base:

    \frac{dC(i)}{dn} = \ln i \, \frac{\ln n - 1}{(\ln n)^2} = 0

so \ln n = 1 and n = e: the natural log gives the optimal base. This is an example of information theory.
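A quick numeric scan (my own illustration) showing that the continuous cost model is minimized at base e:

import math

i = 563
for base in [2.0, math.e, 3.0, 5.0, 10.0]:
    cost = base * math.log(i) / math.log(base)   # C(i) = n ln i / ln n
    print(f"base {base:6.3f}: cost {cost:6.2f}")
# Minimum at base e ~ 2.718; base 3 is the best integer base.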
Better Information
Problem
Suppose the display only includes values from 0-4. How does the cost of a pental (base 5) display compare to a binary display?
The pental numbers only need a single digit.
Cost: 5 × 1 = 5 units
The binary numbers need at least three digits, since $100_2 = 4_{10}$.
Cost: 2 × 3 = 6 units
There is better information in this problem than in the previous problem.
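The same cost model, now using the extra knowledge that values stop at 4 (again a sketch of mine):

import math

def cost_for_range(max_value: int, base: int) -> int:
    # Digits needed to display every value from 0 to max_value.
    digits = 1 if max_value == 0 else math.floor(math.log(max_value, base)) + 1
    return base * digits

print(cost_for_range(4, 5))  # 5 x 1 digit = 5 units
print(cost_for_range(4, 2))  # 2 x 3 digits = 6 units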
Shannon’s Information
In 1948 Claude Shannon identified a way to measure the amount of uncertainty in a set of probabilities.
Set of probabilities {p_1, p_2, ..., p_n}
Measure of uncertainty H(p_i)
There are three conditions:
1. H is a continuous function of the p_i.
2. If all the p_i are equal (p_i = 1/n), then H is monotonically increasing in n.
3. If the choice splits into independent subsets with uncertainties H_j, then H is the sum of the H_j.
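A numeric check of condition 3 for independent uniform choices (my own sketch, using the logarithmic form of H derived below, with K = 1):

import math

def H(probs, K=1.0):
    # Shannon uncertainty: H = -K * sum(p ln p)
    return -K * sum(p * math.log(p) for p in probs if p > 0)

r, s = 4, 6
uniform = lambda n: [1.0 / n] * n
print(H(uniform(r * s)))              # ln 24 ~ 3.178
print(H(uniform(r)) + H(uniform(s)))  # ln 4 + ln 6 = ln 24, the same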
Increasing Uncertainty
The simplest choice is the sum of an unknown continuous function f(p_i):

    H(p_i) = \sum_{i=1}^{n} f(p_i)

Solve for equal probabilities, p = 1/n (no loss of generality):

    H = n f\!\left(\frac{1}{n}\right)

Differentiation sets up condition 2, monotonically increasing:

    \frac{dH}{dn} = \frac{d}{dn}\left[n f\!\left(\frac{1}{n}\right)\right] > 0
Carter, chap 20
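A symbolic check (my sketch, using sympy and the form f(p) = -K p ln p found below) that this H is indeed monotonically increasing:

import sympy as sp

n, K = sp.symbols('n K', positive=True)
f = lambda p: -K * p * sp.log(p)   # logarithmic form derived in later slides

H = sp.simplify(n * f(1 / n))      # H = n f(1/n)
print(H)                           # K*log(n)
print(sp.diff(H, n))               # K/n, positive, so H increases with n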
Subset Composition
Let H be composed of two independent subsets:
r outcomes, each with probability 1/r: H = r f(1/r)
s outcomes, each with probability 1/s: H = s f(1/s)
Independent, so n = rs:

    rs \, f\!\left(\frac{1}{rs}\right) = r f\!\left(\frac{1}{r}\right) + s f\!\left(\frac{1}{s}\right)

Redefine the variables: with R = 1/r, S = 1/s, and g(x) \equiv f(x)/x, this becomes

    g(RS) = g(R) + g(S)

This can be differentiated by both R and S:

    S g'(RS) = g'(R)
    R g'(RS) = g'(S)

Multiplying by R and S respectively gives R g'(R) = S g'(S); since R and S are independent, this leads to the constants A, C:

    R g'(R) = S g'(S) = A

Integrate and substitute:

    g(R) = A \ln R + C
    f\!\left(\frac{1}{r}\right) = \frac{A}{r} \ln\frac{1}{r} + \frac{C}{r}
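A quick symbolic verification (my own sketch) that g(R) = A ln R + C satisfies these relations, using g'(x) = A/x:

import sympy as sp

R, S, A = sp.symbols('R S A', positive=True)
gprime = lambda x: A / x   # derivative of g(x) = A ln x + C

print(sp.simplify(S * gprime(R * S) - gprime(R)))   # 0
print(sp.simplify(R * gprime(R * S) - gprime(S)))   # 0
print(sp.simplify(R * gprime(R) - S * gprime(S)))   # 0: both sides equal the constant A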
Logarithmic Information
The boundary condition gives f(1) = 0:
There is no uncertainty when p = 1, so C = 0.
With p = 1/n,

    H = n f\!\left(\frac{1}{n}\right) = -A \ln n

Condition 2 requires A < 0. Redefine it as a positive constant K = -A.
The uncertainty can now be expressed in terms of probabilities:

    f(p) = -K p \ln p

    H(p_i) = -K \sum_{i=1}^{n} p_i \ln p_i
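A small demo of the final formula (my own sketch with K = 1), showing maximum uncertainty for equal probabilities and zero when one outcome is certain:

import math

def uncertainty(probs, K=1.0):
    # H = -K * sum(p ln p); terms with p = 0 contribute nothing.
    return -K * sum(p * math.log(p) for p in probs if p > 0)

print(uncertainty([0.25, 0.25, 0.25, 0.25]))  # ln 4 ~ 1.386, the maximum for n = 4
print(uncertainty([0.7, 0.1, 0.1, 0.1]))      # ~ 0.94, less uncertain
print(uncertainty([1.0, 0.0, 0.0, 0.0]))      # 0.0, no uncertainty when p = 1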
Maximum Uncertainty
The uncertainty is constrained by the total probability and the mean:

    1 = \sum_{i=1}^{n} p_i \qquad \bar{x} = \sum_{i=1}^{n} x_i p_i

Maximize H with Lagrange multipliers a and \beta:

    \frac{H}{K} + a\left(1 - \sum_{i=1}^{n} p_i\right) + \beta\left(\bar{x} - \sum_{i=1}^{n} x_i p_i\right)

Assume the dp_i are independent, so each term of the variation vanishes (the constant from differentiating p_i \ln p_i is absorbed into a):

    \sum_{i=1}^{n} dp_i \left(\ln p_i + a + \beta x_i\right) = 0

    p_i = e^{-a - \beta x_i}

Apply the probability constraint to find the multiplier a:

    1 = \sum_{i=1}^{n} e^{-a - \beta x_i} \qquad e^{a} = \sum_{i=1}^{n} e^{-\beta x_i}
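A numeric sketch (my construction: three outcomes x = 0, 1, 2 with target mean 0.6) solving the constraints for \beta and the p_i by bisection:

import math

xs = [0.0, 1.0, 2.0]
target_mean = 0.6

def mean_at(beta):
    Z = sum(math.exp(-beta * x) for x in xs)            # Z = e^a
    return sum(x * math.exp(-beta * x) for x in xs) / Z

# The mean decreases monotonically with beta, so bisect on it.
lo, hi = -10.0, 10.0
for _ in range(60):
    mid = (lo + hi) / 2
    lo, hi = (mid, hi) if mean_at(mid) > target_mean else (lo, mid)
beta = (lo + hi) / 2

Z = sum(math.exp(-beta * x) for x in xs)
probs = [math.exp(-beta * x) / Z for x in xs]
print(beta, probs)
print(sum(probs), sum(x * p for x, p in zip(xs, probs)))  # 1.0 and 0.6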
Partition Function
The sum with the multiplier is the canonical partition function:

    Z \equiv \sum_{i=1}^{n} e^{-\beta x_i} = e^{a}

The mean constraint completes the solution for \beta (usually not analytic):

    \bar{x} = \sum_{i=1}^{n} x_i e^{-a - \beta x_i} = \frac{\sum_{i=1}^{n} x_i e^{-\beta x_i}}{\sum_{i=1}^{n} e^{-\beta x_i}} = -\frac{\partial \ln Z}{\partial \beta}

The second derivative gives the variance:

    \bar{x^2} - \bar{x}^2 = \frac{\partial^2 \ln Z}{\partial \beta^2}
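A finite-difference check of both derivative identities (my own sketch, reusing the three-outcome example):

import math

xs = [0.0, 1.0, 2.0]
beta, h = 0.8, 1e-4

def lnZ(b):
    return math.log(sum(math.exp(-b * x) for x in xs))

Z = math.exp(lnZ(beta))
probs = [math.exp(-beta * x) / Z for x in xs]
mean = sum(x * p for x, p in zip(xs, probs))
var = sum(x * x * p for x, p in zip(xs, probs)) - mean ** 2

print(-(lnZ(beta + h) - lnZ(beta - h)) / (2 * h), mean)               # first derivative
print((lnZ(beta + h) - 2 * lnZ(beta) + lnZ(beta - h)) / h ** 2, var)  # second derivative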
Information Entropy
The partition function includes information about the microscopic behavior of the system.
The number of choices available at the microscopic level gives rise to the macroscopic behavior.
The quantity H is the information entropy.
It is equivalent to the thermodynamic entropy S, with K becoming Boltzmann's constant k:

    S = -k \sum_{i=1}^{n} p_i \ln p_i
Extra Constraints
Some systems have additional information in the form of functions f_k(x_i):
The means of the functions are known.
Let f_1(x_i) = x_i.
The uncertainty must be reduced or unchanged.
Each constraint brings a multiplier \lambda_k:

    p_i = e^{-a - \sum_k \lambda_k f_k(x_i)}

    Z = \sum_i e^{-\sum_k \lambda_k f_k(x_i)} = e^{a}

The constraint means are related to the partition function:

    \bar{f_j} = -\frac{\partial \ln Z}{\partial \lambda_j} = \frac{1}{Z} \sum_i f_j(x_i) \, e^{-\sum_k \lambda_k f_k(x_i)}
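A sketch with two constraint functions (my example: f_1(x) = x and f_2(x) = x^2, with arbitrarily chosen multipliers) checking the mean-from-lnZ identity numerically:

import math

xs = [0.0, 1.0, 2.0, 3.0]
fs = [lambda x: x, lambda x: x * x]   # f_1, f_2
lams = [0.5, 0.1]                     # multipliers lambda_1, lambda_2

def lnZ(ls):
    return math.log(sum(math.exp(-sum(l * f(x) for l, f in zip(ls, fs))) for x in xs))

Z = math.exp(lnZ(lams))
probs = [math.exp(-sum(l * f(x) for l, f in zip(lams, fs))) / Z for x in xs]

h = 1e-6
for j, f in enumerate(fs):
    direct = sum(f(x) * p for x, p in zip(xs, probs))
    bumped = [l + (h if k == j else 0.0) for k, l in enumerate(lams)]
    from_lnZ = -(lnZ(bumped) - lnZ(lams)) / h
    print(direct, from_lnZ)   # <f_j> two ways; they agree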
Entropy and Partition
The entropy is based on the probabilities of the microstates:

    S = -K \sum_{i=1}^{n} p_i \ln p_i

Substituting p_i = e^{-a - \sum_k \lambda_k f_k(x_i)}:

    S = -K \sum_{i=1}^{n} e^{-a - \sum_k \lambda_k f_k(x_i)} \ln e^{-a - \sum_k \lambda_k f_k(x_i)}

The probabilities relate to the partition function, p_i = e^{-\sum_k \lambda_k f_k(x_i)} / Z:

    S = -K \sum_{i=1}^{n} \frac{e^{-\sum_k \lambda_k f_k(x_i)}}{Z} \ln \frac{e^{-\sum_k \lambda_k f_k(x_i)}}{Z}

Expanding the logarithm and using \sum_i p_i = 1 and the constraint means \bar{f_j}:

    S = \frac{K}{Z} \sum_{i=1}^{n} \left(\sum_k \lambda_k f_k(x_i)\right) e^{-\sum_k \lambda_k f_k(x_i)} + K \ln Z = K \sum_j \lambda_j \bar{f_j} + K \ln Z

Entropy is a Legendre transformation of the partition function:

    S = K \ln Z + K \sum_j \lambda_j \bar{f_j}
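A closing numeric check (my sketch, continuing the two-constraint example) that the direct entropy matches the Legendre form K ln Z + K sum_j lambda_j <f_j>:

import math

K = 1.0
xs = [0.0, 1.0, 2.0, 3.0]
fs = [lambda x: x, lambda x: x * x]
lams = [0.5, 0.1]

weights = [math.exp(-sum(l * f(x) for l, f in zip(lams, fs))) for x in xs]
Z = sum(weights)
probs = [w / Z for w in weights]

S_direct = -K * sum(p * math.log(p) for p in probs)
means = [sum(f(x) * p for x, p in zip(xs, probs)) for f in fs]
S_legendre = K * math.log(Z) + K * sum(l * m for l, m in zip(lams, means))
print(S_direct, S_legendre)   # the two forms agree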