Estimating – with confidence!
Case study…
The amount of potassium in the blood varies slightly from
day to day. This fact (along with routine measurement
errors) means that successive readings of a patient’s
potassium level will show a small variation, with a standard
deviation of s = 0.2 mmol/L. A range of 3.5 – 5.0
mmol/L is considered normal. A significantly different
value could indicate possible renal failure.
A patient has one potassium (K) test performed and presents a
reading of 3.4 mmol/L. How reliable is this single measurement?
Can we quantify our confidence in this reading?
A Confidence Interval
If we assume that the errors in measuring K
are normally distributed then we can use z-scores to help quantify our confidence in this
reading:
The reading of X = 3.4 mmol/L is our estimate
of the true value of the parameter “K blood-level concentration”
A 90% confidence interval would represent a
range of readings that we would expect to get
90% of the time.
Solution…
Use the correct z-value for 90%
[Figure: Normal curve marking two points, one with 5% of the area to its left and one with 95% of the area to its left]
The correct z values are -1.645 and +1.645 and are usually
denoted z* to indicate that these are special ones chosen with
a particular confidence level “C” in mind. In this example C =
90%.
Using the z-score formula we get:
$z^* = \frac{X - \mu}{s}$
$\mu - z^* s \le X \le \mu + z^* s, \quad z^* = 1.645$
$3.4 - 1.645 \times 0.2 \le X \le 3.4 + 1.645 \times 0.2$
90% of the readings will be expected to fall in the range
(3.1,3.7) mmol/L
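The arithmetic for the single potassium reading can be checked with a short Python sketch. It assumes Normally distributed errors, as the slide does; the helper name `critical_z` is made up for this illustration, and `NormalDist` comes from Python's standard `statistics` module.

```python
from statistics import NormalDist

def critical_z(confidence):
    """Return z* so that the central `confidence` area of N(0, 1) lies in (-z*, +z*)."""
    tail = (1 - confidence) / 2          # area left in each tail
    return NormalDist().inv_cdf(1 - tail)

z_star = critical_z(0.90)   # about 1.645
reading = 3.4               # mmol/L, the single K reading from the case study
s = 0.2                     # mmol/L, standard deviation quoted in the case study

margin = z_star * s
low, high = reading - margin, reading + margin
print(f"z* = {z_star:.3f}")
print(f"90% interval: ({low:.1f}, {high:.1f}) mmol/L")
```

Changing the argument of `critical_z` (for example to 0.95) shows how a higher confidence level widens the interval.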
Suppose 5 tests were performed over a number of days and each time
gave a result of 3.4 mmol/L. How would that change the range of
numbers in the confidence interval?
Using Confidence Intervals when Determining the True Value of a Population Mean
We rarely know the population mean. Instead
we can take simple random samples (SRSs) and measure sample
means.
A confidence interval gives us a measure of how
precisely we know the underlying population mean
We assume 3 things:
We can construct “n” SRS’s
The underlying population of sample means is
Normal
We know the standard deviation
This gives …
Confidence interval for a population mean:
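$\bar{X} - z^* \frac{s}{\sqrt{n}} \le \mu \le \bar{X} + z^* \frac{s}{\sqrt{n}}$
where n is the number of samples or tests, $\bar{X}$ is what we measure, and $\mu$ is what we infer.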
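As a rough illustration, the interval (sample mean plus or minus z* times s over the square root of n) can be wrapped in a small Python function. The function name `mean_confidence_interval` is hypothetical, and the sketch assumes s is known and the sample means are Normal, as listed above. It is applied here to the earlier potassium question about 1 test versus 5 tests.

```python
from math import sqrt
from statistics import NormalDist

def mean_confidence_interval(xbar, s, n, confidence=0.90):
    """Return (low, high) for the population mean, given a known s."""
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z_star * s / sqrt(n)
    return xbar - margin, xbar + margin

# Potassium case study: same mean reading, 1 test versus 5 tests
one_test = mean_confidence_interval(3.4, 0.2, 1)
five_tests = mean_confidence_interval(3.4, 0.2, 5)

print(f"n = 1: ({one_test[0]:.2f}, {one_test[1]:.2f}) mmol/L")
print(f"n = 5: ({five_tests[0]:.2f}, {five_tests[1]:.2f}) mmol/L")
```

With five readings the interval is narrower by a factor of the square root of 5, which is why repeated tests give more confidence in the same mean value.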
Example: Fish or Cut Bait?
A biologist is trying to determine how many rainbow trout
are in an interior BC lake. To do this he uses a large net
that filters 6000 m³ of lake water in each trial. He drops
the net in a specific area and records the mean number of
fish caught in 10 trials. This represents one SRS. From
this he is able to determine a mean and standard
deviation for the number of fish in 100 SRSs. Each SRS
has the same s = 9.3 fish with a sample mean of 17.5
fish. How precisely does he know the true mean of
fish/6000 m³? Use C = 90%.
If the volume of the lake is
60 million m³, how many
trout are in the lake?
Solution:
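Since C = 0.90, z* = 1.645
$\bar{X} - \frac{z^* s}{\sqrt{n}} \le \mu \le \bar{X} + \frac{z^* s}{\sqrt{n}}$
With n = 100 SRSs, $\sqrt{n} = 10$:
$17.5 - 1.645\left(\frac{9.3}{10}\right) \le \mu \le 17.5 + 1.645\left(\frac{9.3}{10}\right)$
He is 90% confident that the true mean number of fish/6000 m³ lies
in the range (16.0, 19.0).
Total number of fish: the lake holds 60 million / 6000 = 10 000 net
volumes, so he is 90% confident that there are between 160 000 and
190 000 fish in the lake.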
Why should you be skeptical of this result?
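The trout numbers can be reproduced with a few lines of Python. This is only a sketch of the slide's calculation: it takes n = 100 SRSs, s = 9.3 and a sample mean of 17.5 from the example, and it assumes (as the example implicitly does) that fish density is the same throughout the lake when scaling up.

```python
from math import sqrt
from statistics import NormalDist

z_star = NormalDist().inv_cdf(0.95)   # C = 90% leaves 5% in each tail
xbar, s, n = 17.5, 9.3, 100           # values quoted in the example

margin = z_star * s / sqrt(n)
low, high = xbar - margin, xbar + margin

net_volume = 6_000                    # m^3 filtered per trial
lake_volume = 60_000_000              # m^3
scale = lake_volume / net_volume      # number of net volumes in the lake

print(f"Mean fish per net: ({low:.1f}, {high:.1f})")
print(f"Fish in lake: {low * scale:,.0f} to {high * scale:,.0f}")
```

The scaling step is where most of the doubt enters: the interval only describes the area that was sampled.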
Margin of Error
When testing confidence limits you are
saying that your statistical measure of the
mean is:
estimate ± the margin of error
e.g. X = 3.2 cm ± 1.1 cm with 90%
confidence
Math view…
Mathematically the margin of error is:
$m = \frac{z^* s}{\sqrt{n}}$
You can reduce the margin of error by
• increasing the number of samples you test
• making more precise measurements (makes s
smaller)
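Both ways of shrinking the margin of error can be seen numerically. The sketch below uses a hypothetical helper `margin_of_error` and borrows s = 0.2 from the potassium case study purely as sample input.

```python
from math import sqrt
from statistics import NormalDist

def margin_of_error(s, n, confidence=0.90):
    """Margin of error z* * s / sqrt(n) for a known standard deviation s."""
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return z_star * s / sqrt(n)

# More samples: each fourfold increase in n halves the margin
for n in (1, 4, 16, 64):
    print(f"s = 0.2, n = {n:2d}: m = {margin_of_error(0.2, n):.3f}")

# More precise measurements: halving s also halves the margin
print(f"s = 0.1, n =  1: m = {margin_of_error(0.1, 1):.3f}")
```

Because n sits under a square root, extra samples give diminishing returns, while a smaller s reduces the margin in direct proportion.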
Matching Sample Size to Margin of Error
An IT department in a large company is testing
the failure rate of a new high-end graphics card
in 200 of its workstations. Five cards were chosen
at random, with the following lifetimes before failure
(measured in 1000s of hours) and s = 0.5:
Card:                  1    2    3    4    5
Lifetime (1000s h):   1.4  1.7  1.5  1.9  1.8
Provide a 90% confidence interval for the mean lifetime of these boards.
$\bar{X} = \frac{1.4 + 1.7 + 1.5 + 1.9 + 1.8}{5} = 1.66$
$\bar{X} \pm \frac{z^* s}{\sqrt{n}} = 1.66 \pm 1.645\left(\frac{0.5}{\sqrt{5}}\right) = 1.66 \pm 0.37$
IT is 90% confident that the mean lifetime of these boards is between 1290
and 2030 hours.
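The same result can be obtained directly from the five lifetimes. This is a minimal sketch that treats s = 0.5 as known, as the example does, rather than estimating it from the data.

```python
from math import sqrt
from statistics import NormalDist, mean

lifetimes = [1.4, 1.7, 1.5, 1.9, 1.8]   # in 1000s of hours, from the example
s = 0.5                                  # given standard deviation
z_star = NormalDist().inv_cdf(0.95)      # C = 90%

xbar = mean(lifetimes)
margin = z_star * s / sqrt(len(lifetimes))
low, high = xbar - margin, xbar + margin

print(f"mean = {xbar:.2f}, margin = {margin:.2f}")
print(f"90% interval: {low * 1000:.0f} to {high * 1000:.0f} hours")
```

The printed bounds differ slightly from 1290 and 2030 because the slide rounds the margin to 0.37 before adding and subtracting.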
However, these are expensive boards and accounting wants to have the
margin of error reduced to 0.10 (that is, 100 hours) with a 90% confidence level. What should
IT do?
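$m = z^* \frac{s}{\sqrt{n}} \quad\Rightarrow\quad n = \left(\frac{z^* s}{m}\right)^2$
$n = \left(\frac{1.645 \times 0.5}{0.10}\right)^2 \approx 67.7$, which rounds up to 68.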
IT needs to test 68 machines!
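Solving the margin-of-error formula for n is easy to script. The function name `required_sample_size` is an illustrative choice; the result is rounded up because a fractional machine cannot be tested and rounding down would miss the target margin.

```python
from math import ceil
from statistics import NormalDist

def required_sample_size(s, margin, confidence=0.90):
    """Smallest n with z* * s / sqrt(n) <= margin, for a known s."""
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return ceil((z_star * s / margin) ** 2)

n_needed = required_sample_size(s=0.5, margin=0.10)
print(n_needed)  # 68
```

Cutting the margin from 0.37 to 0.10 takes the sample from 5 to 68 machines, since n grows with the square of the improvement.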
Important Caveats…
Read page 426 carefully!
Data must be an SRS
Outliers can wreak havoc!
We “fudged” our knowledge of s; in general we
don’t know this
Poorly collected data or bad experiment design
cannot be overcome by fancy formulas!
Examples…
6.13
6.18
6.19
6.30
In conclusion…
This whole discussion rests on your
understanding of z-scores. If you are OK
with this then just review the new terms and
try the previous examples.
If you are still “rusty” or unsure about z-scores, come and see me!