Entropy

Optimal Value
Problem
 Assume you are designing a digital display.
 Variable number base
 Variable number of display digits
 Display cost is the product of the base and the number of digits.
 What is the optimal base to use for the display?
Example
 The decimal number 563 costs 10 × 3 = 30 units.
 The binary number 1000110011 costs 2 × 10 = 20 units.
 Same value as decimal 563
 Lower cost
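A minimal sketch of this cost comparison, using the slide's cost model (cost = base × digits); the function names are only illustrative:

```python
# Display cost model from the slides: cost = base * number of digits.
import math

def digit_count(value, base):
    """Digits needed to show a positive integer in the given base."""
    return math.floor(math.log(value, base)) + 1

def display_cost(value, base):
    return base * digit_count(value, base)

print(display_cost(563, 10))  # 10 * 3 = 30 units
print(display_cost(563, 2))   # 2 * 10 = 20 units (1000110011 has 10 digits)
```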
Natural Base
 The number of digits d is the logarithm of the number i in a given base n.
$$d = \log_n i$$
 The cost C(i) is the product of n and d.
$$C(i) = n \log_n i = \frac{n \ln i}{\ln n}$$
 Differentiation can find the optimal base.
$$\frac{dC(i)}{dn} = \frac{\ln i \,(\ln n - 1)}{(\ln n)^2} = 0$$
$$\ln n - 1 = 0 \qquad n = e$$
 The natural log gives the base.
 This is an example of information theory.
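A quick numerical check of this result (a sketch; the value 563 and the set of bases are arbitrary choices):

```python
# Treating the base n as continuous, C(i) = n * ln(i) / ln(n) is minimized at n = e.
import math

i = 563  # any value > 1 gives the same optimal base
for n in (2, math.e, 3, 4, 5, 10):
    cost = n * math.log(i) / math.log(n)
    print(f"base {n:6.3f}: cost {cost:6.2f}")
# The minimum falls at n = e ~ 2.718, so base 3 is the best integer choice.
```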
Better Information
Problem
 Suppose the display only includes values from 0-4.
 How does the cost of a pental (base 5) display compare to a binary display?
 There is better information in this problem than in the previous problem.
 The pental numbers only need a single digit.
 Cost 5 × 1 = 5 units
 The binary numbers need at least three digits.
 100₂ = 4₁₀
 Cost 2 × 3 = 6 units
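A short sketch of this comparison, assuming the display must cover the five values 0-4 (the helper name is illustrative):

```python
# With only the values 0-4 to show, the digit count is ceil(log_base(5)).
import math

def display_cost(base, distinct_values=5):
    digits = math.ceil(math.log(distinct_values, base))
    return base * digits

print(display_cost(5))  # 5 * 1 = 5 units (pental)
print(display_cost(2))  # 2 * 3 = 6 units (binary, since 100_2 = 4_10)
```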
Shannon’s Information
 In 1948 Claude Shannon identified a way to measure the amount of uncertainty in a set of probabilities.
 Set of probabilities {p1, p2, … pn}
 Measure of uncertainty H(pi)
 There are three conditions.
1. H is a continuous function of the pi.
2. If all pi are equal, then H is monotonically increasing in n.
3. If H can be split into independent subsets with uncertainties Hj, then H is the sum of the Hj.
Increasing Uncertainty
 The simplest choice is the sum of an unknown continuous function f(pi).
$$H(p_i) = \sum_{i=1}^{n} f(p_i)$$
 Solve for equal probabilities
 No loss of generality
$$p = \frac{1}{n} \qquad H = n f\!\left(\frac{1}{n}\right)$$
 Differentiation sets up condition 2.
 Monotonically increasing
$$\frac{dH}{dn} = \frac{d}{dn}\left[ n f\!\left(\frac{1}{n}\right) \right] > 0$$
Carter, chap 20
Subset Composition
 Let H be composed of two independent subsets.
 r outcomes, each with probability 1/r
 s outcomes, each with probability 1/s
 Independent, so n = rs
$$H = rs\, f\!\left(\frac{1}{rs}\right) = r f\!\left(\frac{1}{r}\right) + s f\!\left(\frac{1}{s}\right)$$
 Redefine the variables: R = 1/r, S = 1/s, and g(x) = f(x)/x
$$g(RS) = g(R) + g(S)$$
 This can be differentiated by both R and S.
$$S g'(RS) = g'(R) \qquad R g'(RS) = g'(S)$$
 Leads to constants A, C: a function of R alone equals a function of S alone, so both must be constant.
$$R g'(R) = S g'(S) = A$$
 Integrate and substitute
$$g(R) = A \ln R + C \qquad f\!\left(\frac{1}{r}\right) = -\frac{A}{r}\ln r + \frac{C}{r}$$
Logarithmic Information
 The boundary condition gives f(1) = 0.
 No uncertainty when p = 1
 C = 0
$$H = n f\!\left(\frac{1}{n}\right) = -A \ln n$$
 Condition 2 requires A < 0.
 Redefine as positive: K = -A
 The uncertainty can now be expressed in terms of probabilities.
$$p = \frac{1}{n} \qquad f(p) = -K p \ln p$$
$$H(p_i) = -K \sum_{i=1}^{n} p_i \ln p_i$$
Maximum Uncertainty
 The uncertainty is constrained by the probability and the mean.
$$1 = \sum_{i=1}^{n} p_i \qquad \bar{x} = \sum_{i=1}^{n} x_i p_i$$
 Maximize H with Lagrange multipliers a and β.
$$\frac{H}{K} + a\left(1 - \sum_{i=1}^{n} p_i\right) + \beta\left(\bar{x} - \sum_{i=1}^{n} x_i p_i\right)$$
 Assume the dpi are independent
 Each term vanishes
$$\sum_{i=1}^{n} dp_i \left( \ln p_i + a + \beta x_i \right) = 0 \qquad p_i = e^{-(a + \beta x_i)}$$
 Apply the probability constraint to find the multiplier a.
$$1 = \sum_{i=1}^{n} e^{-(a + \beta x_i)} \qquad e^{a} = \sum_{i=1}^{n} e^{-\beta x_i}$$
Partition Function
 The sum with the multiplier is the canonical partition function.
$$Z = e^{a} = \sum_{i=1}^{n} e^{-\beta x_i}$$
 The mean constraint completes the solution for β.
 Usually not analytic
$$\bar{x} = \sum_{i=1}^{n} x_i e^{-(a + \beta x_i)} = \frac{\sum_{i=1}^{n} x_i e^{-\beta x_i}}{\sum_{i=1}^{n} e^{-\beta x_i}} = -\frac{\partial \ln Z}{\partial \beta}$$
 The second derivative gives the variance.
$$\overline{x^2} - \bar{x}^2 = \frac{\partial^2 \ln Z}{\partial \beta^2}$$
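A sketch that checks these derivative relations numerically for a small, illustrative set of outcomes; central finite differences stand in for the analytic derivatives:

```python
# Z(beta) for a discrete set of x_i; the mean is -d(ln Z)/d(beta) and the
# variance is d^2(ln Z)/d(beta)^2, checked here with central differences.
import math

xs = [0.0, 1.0, 2.0, 3.0]   # illustrative outcomes

def ln_Z(beta):
    return math.log(sum(math.exp(-beta * x) for x in xs))

def direct_moments(beta):
    Z = math.exp(ln_Z(beta))
    ps = [math.exp(-beta * x) / Z for x in xs]
    mean = sum(x * p for x, p in zip(xs, ps))
    var = sum((x - mean) ** 2 * p for x, p in zip(xs, ps))
    return mean, var

beta, h = 0.7, 1e-4
mean_fd = -(ln_Z(beta + h) - ln_Z(beta - h)) / (2 * h)
var_fd = (ln_Z(beta + h) - 2 * ln_Z(beta) + ln_Z(beta - h)) / h ** 2
print(direct_moments(beta))   # mean and variance from the probabilities
print(mean_fd, var_fd)        # derivatives of ln Z give the same values
```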
Information Entropy
 The partition function includes information about the microscopic behavior of the system.
 The number of choices available at the microscopic level gives rise to the macroscopic behavior.
 The quantity H is the information entropy.
 Equivalent to thermodynamic entropy S
 Boltzmann’s constant k
$$S = -k \sum_{i=1}^{n} p_i \ln p_i$$
Extra Constraints
 Some systems have additional information fk(xi).
 Means of the functions are known
 Let f1(xi) = xi
 Uncertainty must be reduced or unchanged
$$p_i = e^{-\left(a + \sum_k \lambda_k f_k(x_i)\right)} \qquad Z = \sum_i e^{-\sum_k \lambda_k f_k(x_i)} = e^{a}$$
 The constraint mean is related to the partition function.
$$\bar{f}_j = -\frac{\partial \ln Z}{\partial \lambda_j} = \frac{1}{Z} \sum_i f_j(x_i)\, e^{-\sum_k \lambda_k f_k(x_i)}$$
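A sketch of the multi-constraint form, assuming two illustrative constraint functions f1(x) = x and f2(x) = x² and arbitrarily chosen multipliers (written as lam to stand for the λ_k):

```python
# With several constraint functions f_k, p_i ~ exp(-sum_k lam_k * f_k(x_i)),
# and each constraint mean <f_j> equals -d(ln Z)/d(lam_j).
import math

xs = [0.0, 1.0, 2.0, 3.0]                 # illustrative outcomes
fks = [lambda x: x, lambda x: x * x]      # f1(x) = x, f2(x) = x^2 (illustrative)
lams = [0.4, 0.1]                         # chosen multipliers

def weight(x):
    return math.exp(-sum(l * f(x) for l, f in zip(lams, fks)))

Z = sum(weight(x) for x in xs)
means = [sum(f(x) * weight(x) for x in xs) / Z for f in fks]
print(Z, means)   # partition function and the constraint means <f_1>, <f_2>
```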
Entropy and Partition
 The entropy is based on the probabilities of microstates.
$$S = -K \sum_{i=1}^{n} p_i \ln p_i$$
 The probabilities relate to the partition function.
$$S = -K \sum_{i=1}^{n} e^{-\left(a + \sum_k \lambda_k f_k(x_i)\right)} \ln e^{-\left(a + \sum_k \lambda_k f_k(x_i)\right)} = -K \sum_{i=1}^{n} \frac{e^{-\sum_k \lambda_k f_k(x_i)}}{Z} \ln \frac{e^{-\sum_k \lambda_k f_k(x_i)}}{Z}$$
 Expanding the logarithm separates a ln Z term from the constraint terms.
$$S = \frac{K}{Z} \sum_{i=1}^{n} e^{-\sum_k \lambda_k f_k(x_i)} \ln Z + \frac{K}{Z} \sum_j \lambda_j \sum_{i=1}^{n} f_j(x_i)\, e^{-\sum_k \lambda_k f_k(x_i)}$$
 Entropy is a Legendre transformation of the partition function.
$$S = K \ln Z + K \sum_j \lambda_j \bar{f}_j$$
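A numerical sketch of this Legendre relation, reusing the illustrative constraint functions and multipliers from the previous sketch and taking K = 1:

```python
# Check that S = -K * sum(p_i ln p_i) equals K*ln Z + K*sum_j lam_j * <f_j>.
import math

xs = [0.0, 1.0, 2.0, 3.0]
fks = [lambda x: x, lambda x: x * x]
lams = [0.4, 0.1]
K = 1.0

weights = [math.exp(-sum(l * f(x) for l, f in zip(lams, fks))) for x in xs]
Z = sum(weights)
ps = [w / Z for w in weights]

S_direct = -K * sum(p * math.log(p) for p in ps)
means = [sum(f(x) * p for x, p in zip(xs, ps)) for f in fks]
S_legendre = K * math.log(Z) + K * sum(l * m for l, m in zip(lams, means))
print(S_direct, S_legendre)   # the two expressions agree
```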