Statistics
Large Systems

Macroscopic systems involve large numbers of particles.
Microscopic determinism underlies macroscopic phenomena.
The basis is in the mechanics, classical and quantum, of the individual molecules.
Consider 1 g of He as an ideal gas.
N = 1.5 × 10²³ atoms
Use only position and momentum: 3 + 3 = 6 coordinates per atom
Total of 9 × 10²³ variables, requiring about 4 × 10⁹ PB of storage

Find the total kinetic energy:
$K = \sum_i (p_{x,i}^2 + p_{y,i}^2 + p_{z,i}^2)/2m$
At about 100 operations per collision and 100 GFLOPS, one set of collisions takes 9 × 10¹⁴ s, about 3 × 10⁷ yr.

Statistical thermodynamics provides the bridge between the two levels.
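These figures can be checked with a quick back-of-envelope script (a Python sketch; the 4 bytes per variable and 100 operations per variable are assumptions chosen to match the quoted numbers):

```python
# Back-of-envelope cost of simulating 1 g of He atom-by-atom.
N_A = 6.022e23            # Avogadro's number
atoms = N_A / 4.0         # 1 g of He-4 is 1/4 mol -> ~1.5e23 atoms
variables = 6 * atoms     # 3 position + 3 momentum components each

storage_PB = 4 * variables / 1e15    # 4-byte floats, in petabytes
seconds = 100 * variables / 100e9    # ~100 ops per variable at 100 GFLOPS
print(f"{atoms:.1e} atoms, {variables:.1e} variables")
print(f"~{storage_PB:.0e} PB, ~{seconds:.0e} s ~ {seconds/3.15e7:.0e} yr")
```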
Ensemble

Computing time averages for large systems is infeasible.
Instead, imagine a large number of similar systems:
Prepared identically
Independent
This ensemble of systems can be used to derive theoretical properties of a single system.
Probability

Probability is often stated before the fact.
A priori assertion - theoretical
50% probability for heads on a coin toss
Probability can also reflect the statistics of many events.
25% probability that 10 coins show exactly 5 heads
Fluctuations occur in which the fraction of heads is not 50%
Probability can also be used after the fact to describe a measurement.
A posteriori assertion - experimental
Fraction of coins that were heads in a series of samples
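A minimal Python sketch contrasting the two views, assuming fair coins and a seeded random generator:

```python
import random
from math import comb

# A priori: exact chance of 5 heads in 10 fair coin tosses.
p_theory = comb(10, 5) / 2**10          # 252/1024 ~ 0.246

# A posteriori: observed fraction over many simulated trials.
random.seed(1)
trials = 100_000
hits = sum(sum(random.randint(0, 1) for _ in range(10)) == 5
           for _ in range(trials))
print(f"a priori {p_theory:.3f}, a posteriori {hits / trials:.3f}")
```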
Head Count

trial  #heads   trial  #heads
  1      5       11      5
  2      8       12      1
  3      6       13      5
  4      5       14      5
  5      6       15      6
  6      6       16      6
  7      1       17      2
  8      5       18      4
  9      7       19      6
 10      4       20      6
Take a set of experimental trials:
N = number of trials
n = number of values (bins)
i = a specific trial (1 … N)
j = a specific value (1 … n)
Use 10 coins and 20 trials.
Distribution

Sorting trials by value forms a distribution.
The distribution function f counts occurrences in a bin.
[Figure: histogram f(x) vs. x for the coin data, x = 0 … 10]
The mean is a measure of the center of the distribution.
Mathematical average:
$f(x) = \sum_{i=1}^{N} \delta_{x_i,\,x}$
$\langle x \rangle = \frac{1}{N} \sum_{i=1}^{N} x_i$
Coin distribution <x> = 4.95
Median - midway value
Coin median = 5
Mode - most frequent value
Coin mode = 6
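These values can be checked directly with the standard library (a minimal sketch; the `heads` list transcribes the Head Count table):

```python
from statistics import mean, median, multimode

# The 20 trials from the Head Count table.
heads = [5, 8, 6, 5, 6, 6, 1, 5, 7, 4,
         5, 1, 5, 5, 6, 6, 2, 4, 6, 6]

print(mean(heads))       # 4.95
print(median(heads))     # 5.0
print(multimode(heads))  # [6]
```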
Probability Distribution

The distribution function has a sum equal to the number of trials N.
[Figure: normalized histogram P(x) vs. x for the coin data, x = 0 … 10]

A probability distribution P normalizes the distribution function by N:
$P_j = f(x_j)/N$, with sum $\sum_{j=1}^{n} P_j = 1$

The mean can be expressed in terms of the probability:
$\langle x \rangle = \frac{1}{N}\sum_{i=1}^{N} x_i = \sum_{j=1}^{n} P_j\, x_j$
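A short sketch of the normalization, building the probabilities from the same data:

```python
from collections import Counter

heads = [5, 8, 6, 5, 6, 6, 1, 5, 7, 4,
         5, 1, 5, 5, 6, 6, 2, 4, 6, 6]
N = len(heads)

f = Counter(heads)                    # distribution function: counts per bin
P = {x: f[x] / N for x in range(11)}  # P_j = f(x_j) / N

print(sum(P.values()))                # 1.0
print(sum(x * P[x] for x in P))       # 4.95, the same mean as before
```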
Subsample

(Same coin-toss data as in the Head Count table above.)
Subsamples of the data may differ in their central value.
First five trials: mean 6.0, median 6, mode 5 and 6 (not unique)
Experimental probability depends on the sample.
Theoretical probability predicts the result for an infinitely large sample.
Deviation

Individual trials differ from the mean.
The deviation is the difference of a trial from the mean:
$\Delta x_i = x_i - \langle x \rangle$

The mean deviation is zero:
$\langle \Delta x \rangle = \frac{1}{N}\sum_{i=1}^{N} (x_i - \langle x \rangle) = \langle x \rangle - \langle x \rangle = 0$

The fluctuation is the mean of the squared deviations:
$\langle \Delta x^2 \rangle = \frac{1}{N}\sum_{i=1}^{N} (x_i - \langle x \rangle)^2 = \langle x^2 \rangle - \langle x \rangle^2$
Fluctuation is the variance - the standard deviation squared.
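A quick numeric check on the coin data: the mean deviation vanishes and both forms of the variance agree:

```python
heads = [5, 8, 6, 5, 6, 6, 1, 5, 7, 4,
         5, 1, 5, 5, 6, 6, 2, 4, 6, 6]
N = len(heads)
xbar = sum(heads) / N

dev = [x - xbar for x in heads]
print(sum(dev) / N)                             # ~0 (rounding error only)
print(sum(d * d for d in dev) / N)              # 3.1475, the fluctuation
print(sum(x * x for x in heads) / N - xbar**2)  # same, via <x^2> - <x>^2
```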
Correlation

Events may not be random, but related to other events.
Time measured by trial number
The correlation function measures the mean of the product of related deviations:
$C_k = \frac{1}{N-k} \sum_{i=1}^{N-k} (x_{i+k} - \langle x \rangle)(x_i - \langle x \rangle) = \langle \Delta x_i\, \Delta x_{i+k} \rangle$
$C_k = \frac{1}{N-k} \sum_{i=1}^{N-k} x_i\, x_{i+k} - \langle x \rangle^2$
Autocorrelation: C0

Different variables can be correlated:
$C_{xy} = \frac{1}{N} \sum_{i=1}^{N} (x_i - \langle x \rangle)(y_i - \langle y \rangle) = \langle \Delta x\, \Delta y \rangle$
Independent Trials

(Same coin-toss data as in the Head Count table above.)
Autocorrelation within a sample is the variance.
Coin experiment C0 = 3.147
Nearest-neighbor correlation tests for randomness.
Coin experiment C1 = -0.345, much less than C0
Ratio C1 / C0 = -0.11
Periodic systems have a peak in Ct at some period t.
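A minimal check of these values on the coin data, using the lag form $C_k = \frac{1}{N-k}\sum_i x_i x_{i+k} - \langle x \rangle^2$ from the Correlation slide:

```python
heads = [5, 8, 6, 5, 6, 6, 1, 5, 7, 4,
         5, 1, 5, 5, 6, 6, 2, 4, 6, 6]
N = len(heads)
xbar = sum(heads) / N

def C(k):
    """Correlation at lag k: (1/(N-k)) sum of x_i x_{i+k}, minus <x>^2."""
    return sum(heads[i] * heads[i + k] for i in range(N - k)) / (N - k) - xbar**2

print(C(0))         # 3.1475, the variance
print(C(1))         # -0.345
print(C(1) / C(0))  # ~ -0.11
```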
Correlation Measure

Independent trials should have a correlation that peaks strongly at k = 0.
No connection to subsequent events
No periodic behavior
"This sample autocorrelation plot shows that the time series is not random, but rather has a high degree of autocorrelation between adjacent and near-adjacent observations." - nist.gov
Continuous Distribution

Data that is continuously distributed is treated with an integral:
$N = \int dx\, f(x)$, with $P(x) = f(x)/N$
Probability still normalized to 1

The mean and variance are given as the moments:
First moment - mean: $\langle x \rangle = \int dx\, P(x)\, x$
Second moment - variance: $C_0 = \int dx\, P(x)\, x^2 - \langle x \rangle^2$

Correlation uses a time integral:
$C(t) = \int dt'\, \Delta x(t')\, \Delta x(t' + t)$
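As an illustration of the moment integrals, a sketch using an assumed example distribution $P(x) = e^{-x}$ on $x \ge 0$ (not from the slides), whose mean and variance are both 1:

```python
from math import exp

# Moment integrals on a midpoint grid for P(x) = exp(-x), x >= 0.
dx = 1e-3
xs = [(i + 0.5) * dx for i in range(int(40 / dx))]

norm = sum(exp(-x) * dx for x in xs)                     # normalization
mean = sum(x * exp(-x) * dx for x in xs)                 # first moment
var = sum(x * x * exp(-x) * dx for x in xs) - mean**2    # second moment - mean^2
print(norm, mean, var)   # ~1, ~1, ~1
```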
Joint Probability

The probabilities of two systems may be related.
[Figure: Venn diagram of sets A and B, with C = A ∩ B]

The intersection A ∩ B indicates that both conditions are true.
Independent probability → P(A ∩ B) = P(A) P(B)

The union A ∪ B indicates that either condition is true.
P(A ∪ B) = P(A) + P(B) - P(A ∩ B)
P(A) + P(B), if exclusive
Joint Tosses

 x   P(x)
 0   0
 1   0.10
 2   0.05
 3   0
 4   0.10
 5   0.30
 6   0.35
 7   0.05
 8   0.05
 9   0
10   0

Define two classes from the coin toss experiment:
A = {x < 5}
B = {2 < x < 8}
Individual probabilities are a union of discrete bins:
P(A) = 0.25, P(B) = 0.80
P(A ∪ B) = 0.95
Dependent sets don't follow the product rule:
P(A ∩ B) = 0.10 ≠ P(A) P(B) = 0.20
Conditional Probability

The probability of an occurrence on a subset is a conditional probability.
[Figure: Venn diagram with C = A | B, the part of A within B]
Probability with respect to the subset:
P(A | B) = P(A ∩ B) / P(B)

Use the same subsets for the coin toss example:
P(A | B) = 0.10 / 0.80 ≈ 0.13
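The joint and conditional values above can be reproduced from the P(x) table (a minimal sketch):

```python
P = {0: 0, 1: 0.10, 2: 0.05, 3: 0, 4: 0.10, 5: 0.30,
     6: 0.35, 7: 0.05, 8: 0.05, 9: 0, 10: 0}

A = {x for x in P if x < 5}       # A = {x < 5}
B = {x for x in P if 2 < x < 8}   # B = {2 < x < 8}

PA = sum(P[x] for x in A)         # 0.25
PB = sum(P[x] for x in B)         # 0.80
PAB = sum(P[x] for x in A & B)    # 0.10, not PA*PB = 0.20: dependent
PAorB = sum(P[x] for x in A | B)  # 0.95 = PA + PB - PAB
print(PA, PB, PAB, PAorB, PAB / PB)  # conditional P(A|B) = 0.125
```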
Combinatorics

The probability that n specific occurrences happen is the product of the individual occurrences.
Other events don't matter.
Exactly n specific events happen, each with probability p:
$P = p^n$
Separate probability for negative events: no events happen except the specific ones, each failure with probability q:
$P = q^{N-n}$

An arbitrary choice of events requires permutations.
Select n arbitrary events from a pool of N identical types:
$\binom{N}{n} = \frac{N!}{n!\,(N-n)!}$
Binomial Distribution

Treat events as a Bernoulli process with discrete trials:
N separate trials
Trials independent
Binary outcome for each trial
Probability the same for all trials
(mathworld.wolfram.com)

The general form is the binomial distribution:
$P_n = \binom{N}{n} p^n q^{N-n}$
Terms are the same as in the binomial expansion.
Probabilities are normalized:
$\sum_{n=0}^{N} P_n = (p+q)^N = 1$
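A short check of the binomial form for the 10-coin experiment (N = 10, p = 0.5):

```python
from math import comb

N, p = 10, 0.5
q = 1 - p

# Binomial probabilities for n heads in N = 10 fair tosses.
Pn = [comb(N, n) * p**n * q**(N - n) for n in range(N + 1)]

print(sum(Pn))   # 1.0: normalized
print(max(range(N + 1), key=lambda n: Pn[n]), round(Pn[5], 3))  # peak n=5, 0.246
```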
Mean and Standard Deviation

The mean m of the binomial distribution:
$m = \sum_{n=0}^{N} n P_n = \sum_{n=0}^{N} n \binom{N}{n} p^n q^{N-n}$

Consider an arbitrary x, differentiate, and set x = 1:
$(px + q)^N = \sum_{n=0}^{N} \binom{N}{n} p^n x^n q^{N-n}$
$Np(px + q)^{N-1} = \sum_{n=0}^{N} n x^{n-1} P_n$
At x = 1 the left side is Np, so:
$Np = \sum_{n=0}^{N} n P_n = m$

The standard deviation s of the binomial distribution:
$s^2 = \sum_{n=0}^{N} (n - m)^2 P_n$
$s^2 = \sum_n (n^2 P_n - 2mn P_n + m^2 P_n) = \sum_n n^2 P_n - 2m \sum_n n P_n + m^2 \sum_n P_n$
Differentiating twice gives $\sum_n n(n-1) P_n = N(N-1)p^2$, so:
$s^2 = [N(N-1)p^2 + m] - 2m^2 + m^2$
$s^2 = N^2 p^2 - Np^2 + Np - N^2 p^2$
$s = \sqrt{Np(1-p)} = \sqrt{Npq}$
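A numeric confirmation of m = Np and s² = Npq for the same N = 10, p = 0.5 case:

```python
from math import comb, sqrt

N, p = 10, 0.5
q = 1 - p
Pn = [comb(N, n) * p**n * q**(N - n) for n in range(N + 1)]

m = sum(n * P for n, P in enumerate(Pn))
s2 = sum((n - m)**2 * P for n, P in enumerate(Pn))
print(m, N * p)       # 5.0  5.0  -> m = Np
print(s2, N * p * q)  # 2.5  2.5  -> s^2 = Npq
print(sqrt(s2))       # ~1.58
```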
Poisson Distribution

Many processes are marked by rare occurrences.
Large N, small n, small p:
$\binom{N}{n} = \frac{N!}{n!\,(N-n)!} \approx \frac{N^n}{n!}$
$q^{N-n} \approx q^N = (1-p)^N = 1 - Np + \frac{N(N-1)}{2!} p^2 - \cdots \approx 1 - Np + \frac{(Np)^2}{2!} - \cdots = e^{-Np}$

This is the Poisson distribution:
$P_n = \binom{N}{n} p^n q^{N-n} \approx \frac{(Np)^n}{n!} e^{-Np}$
Probability depends on only one parameter, Np.
Normalized when summed from n = 0 to ∞.
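A sketch comparing the exact binomial with the Poisson limit (the values N = 1000, p = 0.005 are an arbitrary illustration with Np = 5):

```python
from math import comb, exp, factorial

N, p = 1000, 0.005    # large N, small p; Np = 5
mu = N * p

for n in range(10):
    binom = comb(N, n) * p**n * (1 - p)**(N - n)
    poisson = mu**n * exp(-mu) / factorial(n)
    print(n, round(binom, 4), round(poisson, 4))  # nearly identical columns
```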
Poisson Properties

The mean and standard deviation are simply related:
Mean m = Np; variance s² = m, so the standard deviation is s = √m.
Unlike the binomial distribution, the Poisson function has values for n > N.
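A quick numeric check that the Poisson mean and variance are both Np (using an assumed Np = 5 and truncating the infinite sum where the tail is negligible):

```python
from math import exp, factorial

mu = 5.0
P = [mu**n * exp(-mu) / factorial(n) for n in range(60)]

m = sum(n * p for n, p in enumerate(P))
s2 = sum((n - m)**2 * p for n, p in enumerate(P))
print(sum(P), m, s2)   # ~1, ~5, ~5: mean = variance = Np
```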
Poisson Away From Zero

The Poisson distribution is based on the mean m = Np.
Assumed N >> 1, N >> n.
Now assume that n >> 1, with m large and Pn >> 0 only over a narrow range.
This generates a normal or Gaussian distribution.

Let x = n - m:
$P_x = \frac{m^{m+x} e^{-m}}{(m+x)!} = \frac{m^m\, m^x\, e^{-m}}{m!\,[(m+x)!/m!]}$

Use Stirling's formula, $m! \approx \sqrt{2\pi m}\; m^m e^{-m}$:
$P_x \approx \frac{m^x}{\sqrt{2\pi m}\,[(m+1) \cdots (m+x)]} = \frac{1}{\sqrt{2\pi m}\,[(1 + 1/m) \cdots (1 + x/m)]}$
$P_x \approx \frac{1}{\sqrt{2\pi m}\,[e^{1/m} \cdots e^{x/m}]}$
Since $1 + 2 + \cdots + x = x(x+1)/2 \approx x^2/2$:
$P_x \approx \frac{e^{-x^2/2m}}{\sqrt{2\pi m}}$
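A sketch comparing the Poisson distribution with its Gaussian limit (m = 100 is an arbitrary large mean):

```python
from math import exp, factorial, pi, sqrt

m = 100   # a large mean
for x in (-20, -10, 0, 10, 20):
    n = m + x
    poisson = m**n * exp(-m) / factorial(n)
    gauss = exp(-x * x / (2 * m)) / sqrt(2 * pi * m)
    print(x, f"{poisson:.5f}", f"{gauss:.5f}")   # close agreement
```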
Normal Distribution

The full normal distribution separates the mean m and standard deviation s parameters:
$f(x) = \frac{1}{\sqrt{2\pi}\, s}\, e^{-(x-m)^2/2s^2}$
[Figure: bell curve P(x) vs. x, centered at x = m]

Tables provide the integral of the distribution function.
Useful benchmarks:
P(|x - m| < 1s) = 0.683
P(|x - m| < 2s) = 0.954
P(|x - m| < 3s) = 0.997
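These benchmarks follow from $P(|x - m| < ks) = \mathrm{erf}(k/\sqrt{2})$, which the standard library can evaluate directly:

```python
from math import erf, sqrt

# For a normal distribution, P(|x - m| < k s) = erf(k / sqrt(2)).
for k in (1, 2, 3):
    print(k, round(erf(k / sqrt(2)), 3))   # 0.683, 0.954, 0.997
```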