Estimating – with confidence!

Case study…

The amount of potassium (K) in the blood varies slightly from
day to day. This, along with routine measurement errors, means
that successive readings of a patient’s potassium level show a
small variation, with a standard deviation of s = 0.2 mmol/L.
A range of 3.5 – 5.0 mmol/L is considered normal. A significantly
different value could indicate possible renal failure.
• A patient has 1 K-test performed and presents a
reading of 3.4 mmol/L. How reliable is this 1 measure?
Can we quantify our confidence in this reading?
A Confidence Interval
If we assume that the errors in measuring K
are normally distributed, then we can use z-scores
to help quantify our confidence in this reading:
• The reading of X = 3.4 mmol/L is our estimate
of the true value of the parameter “K blood-level concentration”.
• A 90% confidence interval would represent a
range of readings that we would expect to get
90% of the time.

Solution…

Use the correct z-value for 90%:
[Figure: standard Normal curve. 5% of the area lies to the left of
the lower z-value; 95% of the area lies to the left of the upper z-value.]
The correct z-values are -1.645 and +1.645. They are usually
denoted z* to indicate that these are special values chosen with
a particular confidence level “C” in mind. In this example C = 90%.
Using the z-score formula we get:
$$z^* = \frac{X - \mu}{s}$$
$$\mu - z^* s \le X \le \mu + z^* s, \qquad z^* = 1.645$$
$$3.4 - 1.645 \times 0.2 \le X \le 3.4 + 1.645 \times 0.2$$
90% of the readings will be expected to fall in the range
(3.1,3.7) mmol/L
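The lookup of z* and the range above can be checked numerically. This is only a sketch: it assumes Python's standard-library `statistics.NormalDist`, and the helper name `z_star` is illustrative rather than anything from the slides.

```python
from statistics import NormalDist

def z_star(confidence):
    """Two-sided critical value z* for a confidence level C (0 < C < 1)."""
    # A central area of C leaves (1 - C)/2 in each tail, so z* sits at
    # the (1 + C)/2 quantile of the standard Normal distribution.
    return NormalDist().inv_cdf((1 + confidence) / 2)

reading = 3.4   # mmol/L, the single K-test result
s = 0.2         # mmol/L, standard deviation of readings (taken as known)

z = z_star(0.90)
low, high = reading - z * s, reading + z * s
print(f"z* = {z:.3f}")
print(f"90% range: ({low:.1f}, {high:.1f}) mmol/L")
```

For C = 90% the quantile used is 0.95, which matches the "95% of area left of this point" label on the curve.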
Suppose 5 tests were performed over a number of days and each time
gave a result of 3.4 mmol/L. How would that change the range of
numbers in the confidence interval?
Using Confidence Intervals when Determining the True Value of a
Population Mean



We rarely ever know the population mean. Instead
we can construct simple random samples (SRS’s) and
measure sample means.
A confidence interval gives us a measure of how
precisely we know the underlying population mean.
We assume 3 things:
• We can construct “n” SRS’s
• The underlying population of sample means is Normal
• We know the standard deviation
This gives …
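Confidence interval for a population mean:
$$\bar{X} - z^* \frac{s}{\sqrt{n}} \le \mu \le \bar{X} + z^* \frac{s}{\sqrt{n}}$$
where
• X̄ is the sample mean (we measure this)
• μ is the population mean (we infer this)
• n is the number of samples or tests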
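As a rough sketch of the formula in code (the function name and the default of C = 90% are assumptions made for illustration), the interval can be computed for any n. The usage lines revisit the potassium case study, comparing 1 test with 5 tests that all read 3.4 mmol/L.

```python
from math import sqrt
from statistics import NormalDist

def mean_confidence_interval(xbar, s, n, confidence=0.90):
    """Return (low, high) for the population mean, assuming s is known."""
    z = NormalDist().inv_cdf((1 + confidence) / 2)  # z* for level C
    margin = z * s / sqrt(n)
    return xbar - margin, xbar + margin

# Potassium case study: one test versus five tests, mean 3.4 mmol/L.
one = mean_confidence_interval(3.4, 0.2, n=1)
five = mean_confidence_interval(3.4, 0.2, n=5)
print(f"n = 1: ({one[0]:.2f}, {one[1]:.2f}) mmol/L")
print(f"n = 5: ({five[0]:.2f}, {five[1]:.2f}) mmol/L")
```

With 5 tests the interval narrows by a factor of √5, to roughly (3.25, 3.55) mmol/L.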
Example: Fish or Cut Bait?
A biologist is trying to determine how many rainbow trout
are in an interior BC lake. To do this he uses a large net
that filters 6000 m³ of lake water in each trial. He drops
the net in a specific area and records the mean number of
fish caught in 10 trials. This represents one SRS. From
this he is able to determine a mean and standard
deviation for the number of fish in 100 SRS’s. Each SRS
has the same s = 9.3 fish, with a sample mean of 17.5
fish. How precisely does he know the true mean number of
fish per 6000 m³? Use C = 90%.
If the volume of the lake is 60 million m³, how many
trout are in the lake?
Solution:

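Since C = 0.90, z* = 1.645. With n = 100 SRS’s, √n = 10:
$$\bar{X} - z^* \frac{s}{\sqrt{n}} \le \mu \le \bar{X} + z^* \frac{s}{\sqrt{n}}$$
$$17.5 - 1.645\left(\frac{9.3}{10}\right) \le \mu \le 17.5 + 1.645\left(\frac{9.3}{10}\right)$$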
We are 90% confident that the true mean number of fish per 6000 m³ lies
in the range (16.0, 19.0).
Total number of fish: the lake holds 60 000 000 / 6000 = 10 000 net
volumes, so he is 90% confident that there are between 160 000 and
190 000 fish in the lake.
Why should you be skeptical of this result?
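The fish numbers can be reproduced with a short script. This is a sketch only: the variable names are illustrative, and the scaling step assumes that the density measured by the net applies to the whole lake, which is an assumption of the calculation and not an established fact.

```python
from math import sqrt
from statistics import NormalDist

xbar, s, n = 17.5, 9.3, 100      # fish per 6000 m^3, from the example
net_volume = 6000                # m^3 filtered per trial
lake_volume = 60_000_000         # m^3

z = NormalDist().inv_cdf(0.95)   # z* for C = 90%
margin = z * s / sqrt(n)
low, high = xbar - margin, xbar + margin

# Assumption: the density seen in the net holds for the entire lake.
scale = lake_volume / net_volume  # number of net volumes in the lake
print(f"per net: ({low:.1f}, {high:.1f}) fish")
print(f"lake: about {low * scale:,.0f} to {high * scale:,.0f} fish")
```

The unrounded lake totals come out slightly below 160 000 and slightly above 190 000; the slide rounds the per-net interval to (16.0, 19.0) before scaling.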
Margin of Error

When reporting confidence limits you are
saying that your statistical measure of the
mean is:
estimate ± the margin of error

i.e. X = 3.2 cm ± 1.1 cm with 90%
confidence
Math view…

Mathematically the margin of error is:
$$m = z^* \frac{s}{\sqrt{n}}$$
You can reduce the margin of error by
• increasing the number of samples you test
• making more precise measurements (makes s
smaller)
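Both effects can be seen with a few numbers. The sketch below uses a hypothetical helper, `margin_of_error`, with the potassium value s = 0.2 chosen only for illustration.

```python
from math import sqrt
from statistics import NormalDist

def margin_of_error(s, n, confidence=0.90):
    """Margin of error m = z* s / sqrt(n), assuming s is known."""
    z = NormalDist().inv_cdf((1 + confidence) / 2)
    return z * s / sqrt(n)

# Quadrupling n halves the margin of error.
for n in (1, 4, 16, 64):
    print(f"s = 0.2, n = {n:2d}: m = {margin_of_error(0.2, n):.3f}")

# Halving s has the same effect as quadrupling n.
print(f"s = 0.1, n =  1: m = {margin_of_error(0.1, 1):.3f}")
```

Because n sits under a square root, each halving of the margin costs four times as many samples.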
Matching Sample Size to Margin of Error

An IT department in a large company is testing
the failure rate of a new high-end graphics card
in 200 of its workstations. 5 cards were chosen
at random, with the following lifetimes before failure
(measured in 1000’s of hours) and s = 0.5:

Card       1    2    3    4    5
Lifetime   1.4  1.7  1.5  1.9  1.8
Provide a 90% confidence interval for the mean lifetime of these boards.
$$\bar{X} = \frac{1.4 + 1.7 + 1.5 + 1.9 + 1.8}{5} = 1.66$$
$$\bar{X} \pm z^* \frac{s}{\sqrt{n}} = 1.66 \pm 1.645\left(\frac{0.5}{\sqrt{5}}\right) = 1.66 \pm 0.37$$
IT is 90% confident that the mean lifetime of these boards is between 1290
and 2030 hours.
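A quick check of this arithmetic, as a sketch using only the Python standard library (variable names are illustrative, and s = 0.5 is taken as given, not estimated from the data):

```python
from math import sqrt
from statistics import NormalDist, fmean

lifetimes = [1.4, 1.7, 1.5, 1.9, 1.8]   # thousands of hours
s = 0.5                                  # given in the problem, treated as known

xbar = fmean(lifetimes)                  # sample mean
z = NormalDist().inv_cdf(0.95)           # z* for C = 90%
margin = z * s / sqrt(len(lifetimes))

low_hours = (xbar - margin) * 1000
high_hours = (xbar + margin) * 1000
print(f"mean = {xbar:.2f}, margin = {margin:.2f} (thousand hours)")
print(f"interval: about {low_hours:.0f} to {high_hours:.0f} hours")
```

The slide's 1290 and 2030 hours come from rounding the margin to 0.37 first; the unrounded endpoints differ by a few hours.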
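However, these are expensive boards and accounting wants to have the
margin of error reduced to 0.10 (that is, 100 hours) with a 90%
confidence level. What should IT do?
$$m = z^* \frac{s}{\sqrt{n}} \;\Rightarrow\; n = \left(\frac{z^* s}{m}\right)^2 = \left(\frac{1.645 \times 0.5}{0.10}\right)^2 \approx 67.6$$
Rounding up, IT needs to test 68 machines!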
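The sample-size formula is easy to wrap in a small function. This is a minimal sketch with a hypothetical name, `required_sample_size`; it rounds up because n must be a whole number and rounding down would miss the target margin.

```python
from math import ceil
from statistics import NormalDist

def required_sample_size(s, m, confidence=0.90):
    """Smallest n with z* s / sqrt(n) <= m, assuming s is known."""
    z = NormalDist().inv_cdf((1 + confidence) / 2)
    return ceil((z * s / m) ** 2)

n_needed = required_sample_size(s=0.5, m=0.10)
print(f"cards to test for m = 0.10: {n_needed}")

# Sanity check: the original margin of 0.37 should need about 5 cards.
print(f"cards to test for m = 0.37: {required_sample_size(0.5, 0.37)}")
```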
Important Caveats…

• Read page 426 carefully!
• Data must be an SRS
• Outliers can wreak havoc!
• We “fudged” our knowledge of s; in general we
don’t know this
• Poorly collected data or bad experiment design
cannot be overcome by fancy formulas!

Examples…
• 6.13
• 6.18
• 6.19
• 6.30

In conclusion…
• This whole discussion rests on your
understanding of z-scores. If you are OK
with this then just review the new terms and
try the previous examples.
• If you are still “rusty” or unsure about z-scores, come and see me!
