Transcript - CSUS

Chem. 31 – 9/21 Lecture
Guest Lecture
Dr. Roy Dixon
Announcements I
• Due on Wednesday
– Pipet/Buret Calibration Lab Report
– Format: Pipet Report Form, Buret Calibration Plot and data
for these measurements (use Lab Manual pages or
photocopy your lab notebook pages if neat/organized)
• Last Week’s Additional Problem
– returned in labs
– remember to put your LAB SECTION NUMBER on all
assignments turned in (grading is by lab section)
Announcements II
• Today’s Lecture
– Error and Uncertainty
• Finish up Gaussian Distribution Problems
• t and Z based Confidence Intervals
• Statistical Tests
– Lecture slides will be posted under my faculty web page (but I
will also send them to Dr. Miller-Schulze for his posting
method)
Example Problems
Text Problems
4-2 (a), (d), (e)
4-4 (a), (b)
Done already
Chapter 4 –
A Little More on Distributions
[Figure: mass spectrum of repeated mass measurements of an ion (% Intensity vs. m/z); peaks near m/z 1343.99 and 1344.98; spread of the measured masses: 2s ~ 0.2 amu]
• Measurements can be a
naturally varying quantity (e.g.
student heights, student test
scores, Hg levels in fish in a
lake)
• Additionally, a single quantity
measured multiple times
typically will give different
values each time (example: real
distribution of measurements of
mass of an ion)
Note: to be considered “accurate mass”, an ion’s mass error must be less
than 5 ppm (0.007 amu in above spectrum). This is only possible by
averaging measurements so that the average mass meets the requirement.
Chapter 4 –
A Little More on Distributions
• Reasons for making multiple measurements:
– So one has information on the variability of the
measurement (e.g. can calculate the standard
deviation and uncertainty)
– Average values show less deviation than single
measurements
– Mass spectrometer example: standard deviation
in single measurement ~0.1 amu, but standard
deviation in 4 s averages ~0.005 amu
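As a rough illustration of why averages scatter less than single measurements, the sketch below compares the σ/√n prediction with a quick simulation. The number of scans per average (400) is an assumption chosen only so the numbers match the ~0.1 amu and ~0.005 amu values quoted above; it is not stated on the slide.

```python
import numpy as np

rng = np.random.default_rng(0)
sigma_single = 0.1      # ~std dev of a single mass measurement (amu), from the slide
n_per_average = 400     # assumed number of scans per average (not given on the slide)

# Standard deviation of an n-measurement average predicted by sigma/sqrt(n)
print(sigma_single / np.sqrt(n_per_average))   # 0.005 amu

# Quick simulation: std dev of many n-measurement averages
averages = rng.normal(1343.99, sigma_single, size=(10_000, n_per_average)).mean(axis=1)
print(averages.std())                          # ~0.005 amu
```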
Chapter 4 – Calculation of Confidence
Interval
1. Confidence Interval = x̄ ± uncertainty
2. Calculation of uncertainty depends on whether σ is "well known"
3. If σ is not well known (covered later)
4. When σ is well known (not in text):

   Value ± uncertainty = x̄ ± Zσ/√n

Z depends on the area (the desired probability):
– At Area = 0.45 (90%, both sides), Z = 1.65
– At Area = 0.475 (95%, both sides), Z = 1.96 => larger confidence interval

[Figure: standard normal distribution; x axis = Z value (-3 to +3), y axis = frequency]
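For reference, the Z values quoted above can be looked up programmatically. A minimal sketch using scipy (note that scipy returns about 1.645 for 90%, which the slide rounds to 1.65):

```python
from scipy.stats import norm

# Z such that the central area (both sides together) equals the stated probability
for prob in (0.90, 0.95):
    z = norm.ppf(0.5 + prob / 2)   # e.g. an area of 0.45 or 0.475 on each side of the mean
    print(f"{prob:.0%} confidence: Z = {z:.2f}")
# 90% confidence: Z = 1.64  (often tabulated as 1.645; the slide uses 1.65)
# 95% confidence: Z = 1.96
```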
Chapter 4 – Calculation of
Uncertainty
Example:
The concentration of NO3- in a sample is measured many
times. If the mean value and standard deviation
(assume as population standard deviation) are 14.81
and 0.62 ppm, respectively, what would be the expected
95% confidence interval for a 4-measurement average
value? (Z for 95% CI = 1.96)
What is the probability that a new measurement would
exceed the upper 95% confidence limit?
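A worked sketch of this example in Python, assuming the usual reading that the second question refers to a single new measurement (so its spread is σ, not σ/√n):

```python
from math import sqrt
from scipy.stats import norm

mean, sigma, n = 14.81, 0.62, 4   # ppm; sigma treated as the population value
z_95 = 1.96

# 95% confidence interval for the 4-measurement average
uncertainty = z_95 * sigma / sqrt(n)
print(f"CI: {mean:.2f} +/- {uncertainty:.2f} ppm")        # 14.81 +/- 0.61 ppm

# Probability that a single new measurement exceeds the upper confidence limit
upper_limit = mean + uncertainty
p_exceed = norm.sf(upper_limit, loc=mean, scale=sigma)    # single measurement: spread = sigma
print(f"P(new measurement > {upper_limit:.2f}) = {p_exceed:.2f}")   # ~0.16
```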
Chapter 4 –
Calculation of Confidence Interval with s Not Known
Value ± uncertainty = x̄ ± tS/√n
t = Student’s t value
t depends on:
- the number of samples (more samples => smaller t)
- the probability of including the true value (larger
probability => larger t)
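To illustrate the two dependencies listed above, a short sketch of 95% (two-sided) Student's t values versus number of samples, using scipy:

```python
from scipy.stats import t

# t shrinks as n grows (df = n - 1); a larger confidence level would give larger t
for n in (3, 4, 6, 10, 21):
    print(n, round(t.ppf(0.975, df=n - 1), 3))   # 95% two-sided t values
# 3 4.303, 4 3.182, 6 2.571, 10 2.262, 21 2.086
```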
Chapter 4 –
Calculation of Uncertainties Example
• Measurement of lead in drinking water
sample:
– values = 12.3, 9.8, 11.4, and 13.0 ppb
• What is the 95% confidence interval?
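One possible worked sketch of this confidence-interval calculation (the data and the 95% level are from the slide; the results in the comments are approximate):

```python
from math import sqrt
from statistics import mean, stdev
from scipy.stats import t

values = [12.3, 9.8, 11.4, 13.0]          # ppb Pb in the drinking water sample
n = len(values)
x_bar, s = mean(values), stdev(values)     # ~11.6 ppb and ~1.38 ppb
t_95 = t.ppf(0.975, df=n - 1)              # 3.182 for 3 degrees of freedom

uncertainty = t_95 * s / sqrt(n)
print(f"{x_bar:.1f} +/- {uncertainty:.1f} ppb")   # about 11.6 +/- 2.2 ppb
```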
Chapter 4 –
Ways to Reduce Uncertainty
1. Decrease standard deviation in
measurements (usually requires more
skill in analysis or better equipment)
2. Analyze each sample more times (this
increases n and decreases t)
Overview of Statistical Tests
• t-Tests: Determine if a systematic error
exists in a method or between methods or
if a difference exists in sample sets
• F-Test: Determine if there is a significant
difference in standard deviations in two
methods (which method is more precise)
• Grubbs Test: Determine if a data point
can be excluded on a statistical basis
Statistical Tests
Possible Outcomes
• Outcome #1 – There is a statistically significant
result (e.g. a systematic error)
– this is at some probability (e.g. 95%)
– can occasionally be wrong (5% of time possible if test
barely valid at 95% confidence)
• Outcome #2 – No significant result can be
detected
– this doesn’t mean there is no systematic error or
difference in averages
– it does mean that the systematic error, if it exists, is
not detectable (e.g. not observable due to larger
random errors)
– It is not possible to prove a null hypothesis beyond
any doubt
Statistical Tests
t Tests
• Case 1
– used to determine if there is a significant bias by measuring a
test standard and determining if there is a significant difference
between the known and measured concentration
• Case 2
– used to determine if there is a significant difference between
two methods (or samples) by measuring one sample multiple
times by each method (or each sample multiple times)
• Case 3
– used to determine if there is a significant difference between
two methods (or sample sets) by measuring multiple samples
once by each method (or each sample in each set once)
Case 1 t test Example
• A new method for determining sulfur
content in kerosene was tested on a
sample known to contain 0.123% S.
• The measured %S were:
0.112%, 0.118%, 0.115%, and 0.119%
Do the data show a significant bias at a
95% confidence level?
Clearly lower, but is it significant?
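A sketch of the Case 1 calculation for these data, using the usual comparison of tCalc = |x̄ - known|·√n / S against the tabulated t:

```python
from math import sqrt
from statistics import mean, stdev
from scipy.stats import t

known = 0.123                                    # % S in the test standard
values = [0.112, 0.118, 0.115, 0.119]            # measured % S
n = len(values)
x_bar, s = mean(values), stdev(values)           # 0.116 and ~0.0032

t_calc = abs(x_bar - known) * sqrt(n) / s        # ~4.4
t_table = t.ppf(0.975, df=n - 1)                 # 3.182 (95%, 3 degrees of freedom)
print(t_calc > t_table)                          # True -> the low bias is significant
```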
Case 2 t test Example
• A winemaker found a barrel of wine that was labeled as a merlot
but was suspected of actually being part of a chardonnay wine
batch (i.e., of being mislabeled). To see whether it was part of the
chardonnay batch, the mislabeled barrel wine and the chardonnay
batch were analyzed for alcohol content. The results were as
follows:
– Mislabeled wine: n = 6, mean = 12.61%, S = 0.52%
– Chardonnay wine: n = 4, mean = 12.53%, S = 0.48%
• Determine if there is a statistically significant difference
in the ethanol content.
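A sketch of the Case 2 comparison using a pooled standard deviation (a standard approach when the two standard deviations are similar, as they are here):

```python
from math import sqrt
from scipy.stats import t

# Summary statistics from the slide
n1, x1, s1 = 6, 12.61, 0.52    # mislabeled barrel
n2, x2, s2 = 4, 12.53, 0.48    # chardonnay batch

# Pooled standard deviation, then the Case 2 t value
s_pooled = sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))   # ~0.51
t_calc = abs(x1 - x2) / s_pooled * sqrt(n1 * n2 / (n1 + n2))             # ~0.25
t_table = t.ppf(0.975, df=n1 + n2 - 2)                                   # 2.306 (95%, 8 df)
print(t_calc > t_table)    # False -> no significant difference in ethanol content
```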
Case 3 t Test Example
• Case 3 t Test used when multiple
samples are analyzed by two different
methods (only once each method)
• Useful for establishing if there is a
constant systematic error
• Example: Cl- in Ohio rainwater measured
by Dixon and PNL (14 samples)
Case 3 t Test Example –
Data Set and Calculations
Calculations
Conc. of Cl- in Rainwater (Units = uM)

Step 1 – Calculate the difference for each sample

Sample # | Dixon Cl- | PNL Cl- | Difference
1        | 9.9       | 17.0    | 7.1
2        | 2.3       | 11.0    | 8.7
3        | 23.8      | 28.0    | 4.2
4        | 8.0       | 13.0    | 5.0
5        | 1.7       | 7.9     | 6.2
6        | 2.3       | 11.0    | 8.7
7        | 1.9       | 9.9     | 8.0
8        | 4.2       | 11.0    | 6.8
9        | 3.2       | 13.0    | 9.8
10       | 3.9       | 10.0    | 6.1
11       | 2.7       | 9.7     | 7.0
12       | 3.8       | 8.2     | 4.4
13       | 2.4       | 10.0    | 7.6
14       | 2.2       | 11.0    | 8.8
Step 2 - Calculate
mean and standard
deviation in differences
ave d = (7.1 + 8.7 + ...)/14
ave d = 7.49
Sd = 2.44
Step 3 – Calculate t value:
tCalc = (ave d / Sd) × √n = 11.5
Case 3 t Test Example –
Rest of Calculations
• Step 4 – look up tTable
– (t(95%, 13 degrees of freedom) = 2.17)
• Step 5 – Compare tCalc with tTable, draw
conclusion
– tCalc >> tTable so difference is significant
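A minimal sketch of Steps 3 to 5 using the summary values quoted on the slide; scipy's critical value (~2.16) differs slightly from the 2.17 quoted from the table, but the conclusion is the same:

```python
from math import sqrt
from scipy.stats import t

# Summary values from the slide (paired differences, PNL minus Dixon)
d_bar, s_d, n = 7.49, 2.44, 14

t_calc = d_bar / s_d * sqrt(n)          # ~11.5
t_table = t.ppf(0.975, df=n - 1)        # ~2.16 (the slide's table quotes 2.17)
print(t_calc > t_table)                 # True -> the difference is significant
```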
t- Tests
• Note: These (case 2 and 3) can be applied to
two different scenarios:
– samples (e.g. sample A and sample B, do they have
the same % Ca?)
– methods (analysis method A vs. analysis method B)
F - Test
• Similar methodology as t tests but to compare
standard deviations between two methods to
determine if there is a statistical difference in
precision between the two methods (or
variability between two sample sets)
FCalc = S1² / S2²   (where S1 > S2)
As with t tests, if FCalc > FTable,
difference is statistically significant
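A brief sketch of the F test, reusing the wine-example standard deviations purely for illustration; note that published F tables differ in their one- vs. two-tailed conventions, so the 0.95 quantile below is just one common choice and should be checked against the course table:

```python
from scipy.stats import f

# Standard deviations from the wine example, used here only to illustrate the F test
s1, n1 = 0.52, 6     # the larger standard deviation goes on top
s2, n2 = 0.48, 4

F_calc = s1**2 / s2**2                           # ~1.17
F_table = f.ppf(0.95, dfn=n1 - 1, dfd=n2 - 1)    # ~9.0 (one common 95% convention)
print(F_calc > F_table)   # False -> no significant difference in precision
```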
Grubbs Test Example
• Purpose: To determine if an “outlier” data point
can be removed from a data set
• Data points can be removed if observations
suggest systematic errors
• Example:
– Cl lab – 4 trials with values of 30.98%, 30.87%, 31.05%, and 31.00%.
– Student would like less variability (to get full points for precision)
– Data point farthest from the others is most suspicious (so 30.87%)
– Demonstrate calculations (see the worked sketch below)
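A possible worked sketch of the Grubbs calculation for these four values; the critical value of about 1.46 for n = 4 at 95% confidence is taken from commonly published Grubbs tables and should be checked against the course table:

```python
from statistics import mean, stdev

values = [30.98, 30.87, 31.05, 31.00]     # % Cl from the four trials
suspect = 30.87                           # the point farthest from the others

G_calc = abs(suspect - mean(values)) / stdev(values)   # ~1.38
G_table = 1.463   # assumed critical value for n = 4 at 95% confidence (check the table)
print(G_calc > G_table)   # False -> the point cannot be rejected
```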
Dealing with Poor Quality Data
• If Grubbs test fails, what can be done to
improve precision?
– design study to reduce standard deviations
(e.g. use more precise tools)
– make more measurements (this may make an
outlier more extreme and should decrease
confidence interval)