Poisson Distribution


Application
Confidence Intervals for Mean
Suppose that the random variables
Y_1, Y_2, ..., Y_n model independent
observations from a distribution with mean
µ and variance σ².
Then
$$\bar{Y} = \frac{1}{n} \sum_{i=1}^{n} Y_i$$
is the sample mean.
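Now by the CLT, approximately,
$$\bar{Y} \sim N\left(\mu, \frac{\sigma^2}{n}\right)$$
This is because the sum of the Y_i is
approximately N(nµ, nσ²), and dividing by n
replaces nµ by nµ/n = µ and nσ² by
nσ²/n² = σ²/n (for means).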
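The CLT statement can be checked by simulation. The sketch below is illustrative only: the exponential parent distribution, its mean of 2, the sample size of 50 and the 10000 replications are all assumed values, not taken from the slides.

```r
# Simulation sketch: mean and variance of the sample mean for a non-normal parent.
# Assumed for illustration: exponential parent with mean 2, samples of size n = 50.
set.seed(1)
n <- 50
mu <- 2
sigma2 <- mu^2                      # variance of an exponential with mean mu
ybar <- replicate(10000, mean(rexp(n, rate = 1 / mu)))

# Empirical mean and variance of the simulated sample means ...
c(mean = mean(ybar), variance = var(ybar))
# ... to be compared with the CLT values mu and sigma^2 / n
c(mu = mu, sigma2_over_n = sigma2 / n)
```

A histogram of ybar (hist(ybar)) should also look roughly normal, even though the parent distribution is strongly skewed.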
Recall from Statistics 2 that, if σ² is
estimated by the sample variance, s², an
approximate confidence interval for µ is
given by:
$$\left(\bar{y} - z\,\frac{s}{\sqrt{n}},\ \bar{y} + z\,\frac{s}{\sqrt{n}}\right)$$
Here ȳ is the observed sample mean, and
z is the standard normal quantile determined
by the level of confidence required.
So for 95% confidence an approximate
interval for µ is given by:
$$\left(\bar{y} - 2\,\frac{s}{\sqrt{n}},\ \bar{y} + 2\,\frac{s}{\sqrt{n}}\right)$$
The value 2 is approximate; a more accurate
value can be obtained from tables or by using
the qnorm function in R.
> qnorm(0.975)
[1] 1.959964
> qnorm(0.995)
[1] 2.575829
> qnorm(0.025)
[1] -1.959964
Thus in R, an approximate 95% confidence
interval for the mean µ is given by
> mean(y)+c(-1,1)*qnorm(0.975)*sqrt(var(y)/length(y))
where y is the vector of observations.
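The same calculation can be wrapped up for any confidence level. This is a minimal sketch: z_interval is a hypothetical helper name (it is not a base R function), and the data are simulated purely for illustration.

```r
# Hypothetical helper (not part of base R): normal-approximation
# confidence interval for a mean at a chosen confidence level.
z_interval <- function(y, level = 0.95) {
  z  <- qnorm(1 - (1 - level) / 2)   # e.g. qnorm(0.975) for level = 0.95
  se <- sqrt(var(y) / length(y))     # estimated standard error, s / sqrt(n)
  mean(y) + c(-1, 1) * z * se
}

set.seed(1)
y <- rnorm(40, mean = 10, sd = 3)    # simulated observations, illustration only
z_interval(y)                        # 95% interval
z_interval(y, level = 0.99)          # 99% interval, necessarily wider
```

With level = 0.95 the helper reproduces the one-line command given above.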
A more accurate confidence interval,
allowing for the fact that s² is only an
estimate of σ², is given by use of the
function t.test.
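As a sketch of how t.test can be used here, the following compares its interval with the normal-approximation interval. The data are simulated for illustration (sample size 15 is an assumed value), so the numbers themselves carry no meaning.

```r
set.seed(1)
y <- rnorm(15, mean = 10, sd = 3)    # simulated observations, illustration only

# 95% interval from t.test (uses the t distribution with n - 1 degrees of freedom)
t_ci <- as.numeric(t.test(y)$conf.int)

# Normal-approximation interval, as in the command above
z_ci <- mean(y) + c(-1, 1) * qnorm(0.975) * sqrt(var(y) / length(y))

# The t interval computed by hand, replacing qnorm by qt
hand_ci <- mean(y) + c(-1, 1) * qt(0.975, df = length(y) - 1) * sqrt(var(y) / length(y))

t_ci
z_ci
hand_ci
```

The t interval is wider than the normal one, and the difference shrinks as the sample size grows. A different level can be requested with the conf.level argument, for example t.test(y, conf.level = 0.99).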
Example
The R vector abbey in the package MASS
gives 31 determinations of nickel content
(μg g⁻¹) in a Canadian syenite rock.
We check whether the data are
reasonably modelled by an exponential
distribution. There is no predefined
function in R to construct an
exponential Q-Q plot, so we have to
work from first principles.
qexp(ppoints(31)) gives the theoretical
quantile values at 31 probability points.
sort(abbey) gives the sorted experimental values.
The following command produces a Q-Q plot
with axes labelled accordingly.
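The command itself is not reproduced in this transcript. One way to build such a plot from the two pieces just described is sketched below; the axis label text is an assumption, not the original wording.

```r
library(MASS)                        # provides the abbey vector

theo <- qexp(ppoints(31))            # theoretical unit-exponential quantiles
obs  <- sort(abbey)                  # ordered nickel determinations

# Exponential Q-Q plot: sorted data against theoretical quantiles.
# The axis labels below are assumed for illustration.
plot(theo, obs,
     xlab = "Theoretical exponential quantiles",
     ylab = "Sorted nickel content")
```

If the data were exponential, the points would lie close to a straight line through the origin whose slope is the mean of the distribution.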
If we ignore the highest observation, the
data appear to be reasonably compatible
with an exponential distribution with
mean around 12.5, that is, rate 0.08
(since 1/0.08 = 12.5).
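A rough numerical check of this reading is to drop the largest value and look at the mean of what remains. This is only a sketch of one possible check; the slides do not say how the figure of 12.5 was obtained.

```r
library(MASS)                        # provides the abbey vector

obs     <- sort(abbey)
trimmed <- obs[-length(obs)]         # drop the single highest observation

mean(trimmed)                        # compare with the suggested mean of 12.5
1 / mean(trimmed)                    # implied exponential rate, compare with 0.08
```

For an exponential distribution the mean is the reciprocal of the rate, which is why a mean near 12.5 corresponds to a rate near 0.08.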