Probabilistic Reasoning
in Data Analysis
Lawrence Sirovich
Mt. Sinai School of Med.; Rockefeller U.; Courant Inst., NYU
1
Experiments with probability
Coin Tossing
where Nh = number of heads in N trials.
More generally …
Urn Model: Urn filled with Nb black and Nw white marbles, chosen at random with replacement.
Generalize to k colors so that
Probability of q heads
followed by p tails
Table for 3 tosses
and for all permutations we obtain the binomial probability distribution (slide 3)
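The three-toss table can be reproduced by brute force. The following is a minimal sketch, assuming a head probability called `p_head` (a name introduced here for illustration, not taken from the slides): it lists every toss sequence, adds up the probabilities by number of heads, and compares the totals with the binomial formula C(n, k) p^k (1 − p)^(n − k).

```python
from itertools import product
from math import comb


def enumerate_tosses(n, p_head):
    """Return {k: probability of k heads} by listing all 2**n toss sequences."""
    p_tail = 1.0 - p_head
    totals = {k: 0.0 for k in range(n + 1)}
    for seq in product("HT", repeat=n):
        heads = seq.count("H")
        # Each specific sequence has probability p_head**heads * p_tail**tails.
        totals[heads] += p_head**heads * p_tail**(n - heads)
    return totals


def binomial_pmf(n, k, p_head):
    """Binomial probability of k heads in n independent tosses."""
    return comb(n, k) * p_head**k * (1.0 - p_head)**(n - k)


table = enumerate_tosses(3, 0.5)
for k, prob in table.items():
    print(k, prob, binomial_pmf(3, k, 0.5))
```

The enumeration and the closed form agree because every sequence with k heads has the same probability, and there are C(n, k) such sequences.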
2
Observations on probabilities
, for x discrete or continuous
From
and a little thought it is seen that Bn(k) is the probability of k heads in n trials.
Factorial for noninteger n defined by
Therefore
is a probability in x, since
This is the gamma distribution of rate λ, since <tGs> = λ^{-1}
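The factorial for noninteger n can be checked numerically. The sketch below assumes the standard definition n! = Γ(n + 1) = ∫₀^∞ yⁿ e^(−y) dy; the function name `factorial_integral`, the truncation point `upper`, and the step count `steps` are illustrative choices, not part of the lecture.

```python
import math


def factorial_integral(n, upper=60.0, steps=200_000):
    """Midpoint-rule estimate of the integral of y**n * exp(-y) over [0, upper]."""
    h = upper / steps
    total = 0.0
    for i in range(steps):
        y = (i + 0.5) * h
        # Same integrand written as exp(-y + n log y).
        total += math.exp(-y + n * math.log(y))
    return total * h


for n in (0.5, 3, 4.2):
    print(n, factorial_integral(n), math.gamma(n + 1))
```

For integer n the estimate should be close to the ordinary factorial, and for noninteger n close to `math.gamma(n + 1)` from the standard library.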
3
Miscellanea
Hint: Expand log for 1/n small in
$(1 + x/n)^{n} = e^{n \log(1 + x/n)}$

Hint: Integrand has a max at y = n, and area is concentrated
therein
$n! = \int_{0}^{\infty} \exp(-y + n \log y)\,dy$
Which yields Stirling’s formula:
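In its standard form, Stirling's formula reads n! ≈ √(2πn) (n/e)ⁿ. A short sketch of how good the approximation is follows; the helper name `stirling` is introduced here for illustration.

```python
import math


def stirling(n):
    """Stirling approximation sqrt(2*pi*n) * (n/e)**n to n!."""
    return math.sqrt(2.0 * math.pi * n) * (n / math.e) ** n


for n in (1, 5, 10, 50):
    ratio = stirling(n) / math.factorial(n)
    print(n, ratio)  # the ratio approaches 1 from below as n grows
```

The relative error shrinks roughly like 1/(12n), so the formula is already within one percent at n = 10.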
Gaussian probability:
Probability distribution functions, pdfs.
4
Binomial distribution: A special case
Binomial probability distribution
Assume that:
Thus, if
and Stirling’s formula are substituted
This is called the Poisson distribution
As with all probability distributions
, a pdf in k for t fixed. Hint: $e^{x} = \sum_{k=0}^{\infty} x^{k}/k!$.
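The passage from the binomial to the Poisson distribution can be illustrated numerically. This sketch assumes the limit is taken with the success probability per trial equal to `mean / n`, where `mean` stands for the fixed product λt; the names `max_gap` and `kmax` are hypothetical helpers for this example.

```python
from math import comb, exp, factorial


def binomial_pmf(n, k, p):
    """Probability of k successes in n trials with success probability p."""
    return comb(n, k) * p**k * (1.0 - p)**(n - k)


def poisson_pmf(k, mean):
    """Poisson probability of k events when the expected count is `mean`."""
    return mean**k * exp(-mean) / factorial(k)


def max_gap(n, mean, kmax=10):
    """Largest |binomial - Poisson| over k = 0..kmax when p = mean / n."""
    p = mean / n
    return max(
        abs(binomial_pmf(n, k, p) - poisson_pmf(k, mean)) for k in range(kmax + 1)
    )


for n in (10, 100, 10_000):
    print(n, max_gap(n, 2.0))
```

The gap shrinks as n grows, and summing `poisson_pmf` over k gives 1, which is the normalization stated in the hint.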
5
Expected Outcomes
For pdf P(x) and some function f(x) define expectation
Thus, the average or mean is
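For a discrete pdf the expectation is a weighted sum. A minimal sketch, assuming the pdf is stored as a dictionary `{x: P(x)}`; the fair die is an illustrative pdf chosen here, not an example from the slides.

```python
def expectation(pmf, f=lambda x: x):
    """Sum f(x) * P(x) over a discrete pdf given as {x: P(x)}."""
    return sum(f(x) * p for x, p in pmf.items())


# Illustrative pdf: a fair six-sided die.
die = {face: 1.0 / 6.0 for face in range(1, 7)}

mean = expectation(die)                                   # f(x) = x
variance = expectation(die, lambda x: (x - mean) ** 2)    # f(x) = (x - mean)^2
print(mean, variance)
```

Choosing f(x) = x gives the mean, and f(x) = (x − mean)² gives the variance.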
6
Random Arrivals:
Events that do not depend on prior history
Biological examples:
Photon arrivals activating retinal photoreceptors
Spontaneous neurotransmitter release events at a synapse
Suppose N(t) is the number of events (arrivals) in the time t;
then the arrival rate is estimated by
The approximate number of events for any t is
Since the process is memoryless, the pdf in τ satisfies
This functional relation can only be satisfied by an exponential
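The equation itself is not preserved in this transcript, so the sketch below assumes the usual memoryless relation S(s + t) = S(s) S(t) for the probability S(t) of no event up to time t. It shows that an exponential satisfies the relation while a power-law alternative (chosen here only as a counterexample) does not.

```python
import math


def survival_exponential(t, rate=2.0):
    """Probability of no event up to time t for an exponential law."""
    return math.exp(-rate * t)


def survival_power_law(t):
    """A non-exponential alternative, used only as a counterexample."""
    return 1.0 / (1.0 + t)


def memoryless_gap(survival, s, t):
    """|S(s + t) - S(s) * S(t)|; zero when the process has no memory."""
    return abs(survival(s + t) - survival(s) * survival(t))


print(memoryless_gap(survival_exponential, 0.3, 1.1))
print(memoryless_gap(survival_power_law, 0.3, 1.1))
```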
7
Poisson Process
This form guarantees
Since <t>=1/λ
This is the probability of waiting times for a Poisson process
(not the same as a Poisson Distribution)
Average or expected waiting time defined by:
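A simulation makes the waiting-time picture concrete. This is a sketch under stated assumptions: waiting times are drawn with the standard library's `random.expovariate`, the rate of 4 events per unit time and the unit counting window are arbitrary illustrative values, and the function names are hypothetical.

```python
import random


def simulate_waiting_times(rate, count, seed=0):
    """Draw `count` independent exponential waiting times with the given rate."""
    rng = random.Random(seed)
    return [rng.expovariate(rate) for _ in range(count)]


def count_events(waits, window):
    """Number of arrivals in each consecutive, completed window of given length."""
    counts = []
    clock, boundary, current = 0.0, window, 0
    for w in waits:
        clock += w                      # arrival time of this event
        while clock > boundary:         # close every window that has ended
            counts.append(current)
            current = 0
            boundary += window
        current += 1
    return counts                       # the last, incomplete window is dropped


rate = 4.0
waits = simulate_waiting_times(rate, 100_000)
mean_wait = sum(waits) / len(waits)
counts = count_events(waits, 1.0)
mean_count = sum(counts) / len(counts)
print(mean_wait, 1.0 / rate)   # sample mean waiting time vs 1/rate
print(mean_count, rate * 1.0)  # sample mean count per window vs rate * window
```

The waiting times follow the exponential law, while the counts per window follow the Poisson distribution, which is the distinction the slide draws.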
8
Consequences of the Poisson process
For any time t = t1, the probability of an event in the interval t is
A consequence of this is that the probability of no event is
Since λ dt is the probability of an event, and 1 − λ dt a nonevent, in an increment dt, in general
$P_{k+1}(t + dt) = (\lambda\,dt)\,P_{k}(t) + (1 - \lambda\,dt)\,P_{k+1}(t)$
which, in the limit dt → 0, becomes
$\frac{d}{dt} P_{k+1}(t) = \lambda P_{k}(t) - \lambda P_{k+1}(t)$
The previously defined Poisson pdf satisfies this differential equation,
which justifies the notation. Recall
The variance is given by
so that mean & variance are equal.
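The claim that the Poisson pdf solves the rate equation dP_k/dt = λ(P_{k−1} − P_k), and that its mean and variance coincide, can be checked numerically. The sketch below is a forward-Euler integration under assumptions made here: the k = 0 equation is taken as dP_0/dt = −λP_0 (no event so far), all probability starts at k = 0, and the truncation `kmax` and step count `steps` are illustrative.

```python
from math import exp, factorial


def integrate_master_equation(rate, t_end, kmax=30, steps=20_000):
    """Forward-Euler solution of dP_k/dt = rate * (P_{k-1} - P_k), P_0(0) = 1."""
    dt = t_end / steps
    p = [1.0] + [0.0] * kmax
    for _ in range(steps):
        new = [p[0] - rate * dt * p[0]]          # k = 0: only loss
        for k in range(1, kmax + 1):
            new.append(p[k] + rate * dt * (p[k - 1] - p[k]))
        p = new
    return p


def poisson_pmf(k, mean):
    """Poisson probability of k events when the expected count is `mean`."""
    return mean**k * exp(-mean) / factorial(k)


rate, t_end = 1.5, 2.0
p = integrate_master_equation(rate, t_end)
mean = sum(k * pk for k, pk in enumerate(p))
variance = sum((k - mean) ** 2 * pk for k, pk in enumerate(p))
print(mean, variance, rate * t_end)
```

Both the mean and the variance of the integrated solution come out close to λt, up to the small discretization error of the Euler step.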
9
Poisson Processes in Biology
A classic, Nobel-worthy, paper
Hecht, Shlaer, & Pirenne, J. Gen. Physiol. 25:819-840, 1942.
addresses the question: How many photons must be captured by
the retina for the subject to correctly perceive an event?
In the psychophysical experiment, subjects are exposed to brief
1-ms flashes of light to determine the probability dependence of
perception on the brightness of the flash.
Quantal content of 1-ms flash
It is a reasonable hypothesis that absorption of quanta by the retina obeys a
Poisson distribution
10
The Cumulative Poisson pdf
The probability that n or more photons are detected is described by the cumulative pdf:
Note: The curves can be distinguished from one another
by the steepness, which depends on n.
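The dependence of steepness on n can be made quantitative. In this sketch the cumulative probability is computed as 1 minus the sum of Poisson terms below n, and steepness is summarized by a measure chosen here for illustration: the width, in log10 units of the mean photon count, between the 10% and 90% points of the curve. The function names and the bisection bounds are assumptions of this example.

```python
from math import exp, factorial, log10


def prob_at_least(n, mean):
    """P(k >= n) for a Poisson variable with the given mean."""
    return 1.0 - sum(mean**k * exp(-mean) / factorial(k) for k in range(n))


def mean_for_probability(n, target, lo=1e-9, hi=100.0):
    """Bisection for the mean at which P(k >= n) equals `target`."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if prob_at_least(n, mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def log_width(n):
    """Width of the 10%-90% rise on a log10(mean) axis; smaller means steeper."""
    return log10(mean_for_probability(n, 0.9) / mean_for_probability(n, 0.1))


for n in (1, 2, 5, 7):
    print(n, log_width(n))
```

The width falls as n increases, so a measured curve of detection probability against log brightness constrains the threshold number of photons.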
11
Original data from Hecht, Shlaer, & Pirenne
Hecht, Shlaer, & Pirenne, J. Gen. Physiol. 25:819-840, 1942.
These data are to be analyzed in the Problem Set
12
Poisson Processes in Biology
Related studies
Next, we consider the work of Bernard Katz (del Castillo & Katz, 1954) who recorded
nervous activity at the neuromuscular junction and noted low-level persistent voltage
activity, later called mini end-plate potentials. He pursued the origin of this noise and
conjectured that it was due to the synaptic release of vesicles of neurotransmitter of
uniform size and further hypothesized that the number of arriving vesicles followed a
Poisson distribution. The ensuing experiments confirmed his speculations and
contributed to his Nobel prize in 1970.
The next two slides are based on subsequent verification (Boyd & Martin, 1956), and
summarize the experimental and theoretical deliberations that went into this brilliant
scientific effort.
In a complementary vein, Luria & Delbrück (1943) demonstrated that bacterial
mutations were of random origin. In effect, they did this by refuting the hypothesis that
the mutations were governed by Poisson statistics; in the process, they established
the genetic basis of bacterial reproduction and were awarded the Nobel prize for this
work in 1969.
References
del Castillo & Katz, J. Physiol. 124:560-573, 1954.
Boyd & Martin, J. Physiol. 132:74-91, 1956.
Luria & Delbrück, Genetics 28:491-511, 1943.
13
Neuromuscular Vesicle Release
Fluctuations in post-synaptic end plate potential (e.p.p.) as reinvestigated by Boyd & Martin (1956)
This publication reports on No = 198 trials measuring postsynaptic epps in response to a
single presynaptic neural impulse. The inset to the figure below is the histogram of
spontaneous activity (no upstream impulse).
Under Katz’s hypotheses, this implies that a single vesicle produces a 0.4-mV fluctuation.
Over all trials, the mean fluctuation was 0.933 mV. Therefore, the mean arrival is
m = 0.933/0.4 = 2.33.
14
The Gaussian Fit
This implies that the expected number of trials with k vesicles is given by:
$N_0 p_k = N_0 m^{k} e^{-m} / k!$
The continuous curve in the previous slide was created by a Gaussian fit, as
explained in the figure legend below, that allows for the inclusion of the side bars
that are seen in the above histogram.
Boyd & Martin, J. Physiol. 132:74-91, 1956
15
Theory vs. Experiment
Comparison between theory and experiment is summarized in the following
table
k            0   1   2   3   4   5   6   7   8
Poisson     19  44  52  40  24  11   5   2   1
Experiment  18  44  55  36  25  12   5   2   1
Read this paper to see how beautifully all the data agree with the hypothesis that
evoked release follows a Poisson distribution
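The Poisson prediction can be recomputed from the two numbers quoted on the slides, the 198 trials and the mean m = 2.33. This is a sketch, not the procedure of the original paper: a direct evaluation of N0 · m^k e^(−m) / k! gives unrounded values that need not match every tabulated entry exactly, since the published row may have been rounded or adjusted differently.

```python
from math import exp, factorial

N0 = 198    # number of trials quoted on the slide
m = 2.33    # mean number of vesicles per trial quoted on the slide

# Observed counts for k = 0..8, copied from the "Experiment" row of the table.
observed = [18, 44, 55, 36, 25, 12, 5, 2, 1]

# Expected counts N0 * p_k from the Poisson formula.
expected = [N0 * m**k * exp(-m) / factorial(k) for k in range(len(observed))]

for k, (e, o) in enumerate(zip(expected, observed)):
    print(f"k={k}  Poisson={e:6.2f}  experiment={o}")
```

The observed counts sum to 198, matching the number of trials, and the largest expected count falls at k = 2, as in the experiment.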
Slides from a lecture in the course Systems Biology—Biomedical
Modeling
Citation: L. Sirovich, Probabilistic reasoning in data analysis. Sci. Signal. 4, tr14
(2011).
www.sciencesignaling.org