Transcript Document
CHEM2017
ANALYTICAL CHEMISTRY
Mrs Billing
Gate House 8th floor, GH840
[email protected]
011 717-6768
ANALYTICAL CHEMISTS IN INDUSTRY - INTERFACES
The analytical chemist (centre of the diagram) interfaces with:
• Other chemists
• Colleges
• Universities
• Lawyers
• Health & Safety
• Peers, Supervisors
• Production plants
• Technical reps in field
• Life scientists
• Contract labs
• Sales & Marketing
• Management
• Suppliers
• Professional organizations
• Engineers
• Statisticians
• Government agencies
STATISTICAL TESTS AND ERROR ANALYSIS
PRECISION AND ACCURACY
PRECISION – Reproducibility of the result
ACCURACY – Nearness to the “true” value
TESTING ACCURACY
TESTING PRECISION
SYSTEMATIC / DETERMINATE ERROR
• Reproducible under the same conditions in the same
experiment
• Can be detected and corrected for
• It is always positive or always negative
To detect a systematic error:
• Use Standard Reference Materials
• Run a blank sample
• Use different analytical methods
• Participate in “round robin” experiments
(different labs and people running the same
analysis)
RANDOM / INDETERMINATE ERROR
• Uncontrolled variables in the measurement
• Can be positive or negative
• Cannot be corrected for
• Random errors are independent of each other
Random errors can be reduced by:
• Better experiments (equipment, methodology,
training of analyst)
• Large number of replicate samples
Random errors show Gaussian distribution for a
large number of replicates
Can be described using statistical parameters
For a large number of experimental replicates the
results approach an ideal smooth curve called the
GAUSSIAN or NORMAL DISTRIBUTION CURVE
Characterised by:
The mean value, $\bar{x}$, gives the center of the distribution.
The standard deviation, $s$, measures the width of the distribution.
The mean or average, $\bar{x}$:
the sum of the measured values ($x_i$) divided by the number of measurements ($n$)
$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$$
The standard deviation, $s$:
measures how closely the data are clustered about the mean (i.e. the precision of the data)
$$s = \sqrt{\frac{\sum_{i} (x_i - \bar{x})^2}{n-1}}$$
NOTE: The quantity “n-1” = degrees of freedom
Other ways of expressing the precision of the data:
• Variance: $\text{Variance} = s^2$
• Relative standard deviation: $\text{RSD} = \dfrac{s}{\bar{x}}$
• Percent RSD / coefficient of variation: $\%\text{RSD} = \dfrac{s}{\bar{x}} \times 100$
POPULATION DATA
For an infinite set of data, $n \to \infty$:
$\bar{x} \to \mu$ (population mean) and $s \to \sigma$ (population std. dev.)
The experiment that produces the smaller
standard deviation is more precise.
Remember, greater precision does not
imply greater accuracy.
Experimental results are commonly expressed in the form:
mean ± standard deviation
$$\bar{x} \pm s$$
The more times you measure, the more confident you
are that your average value is approaching the “true”
value.
The uncertainty decreases in proportion to $1/\sqrt{n}$
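The 1/√n scaling can be illustrated with a short sketch. The function name `standard_error` and the value `s_single = 4.0` are assumptions made for illustration only; they do not come from the notes.

```python
import math


def standard_error(s: float, n: int) -> float:
    """Return s / sqrt(n), the uncertainty of a mean of n measurements."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return s / math.sqrt(n)


# Hypothetical standard deviation of a single measurement (not from the notes).
s_single = 4.0

for n in (1, 4, 16, 64):
    # Each fourfold increase in n halves the uncertainty of the mean.
    print(f"n = {n:2d}: uncertainty of the mean = {standard_error(s_single, n):.2f}")
```

Halving the uncertainty therefore costs four times as many replicates, which is why more replicates alone quickly becomes an expensive way to improve a result.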
EXAMPLE
Replicate results were obtained for the analysis of
lead in blood. Calculate the mean and the standard
deviation of this set of data.
Replicate   [Pb] / ppb
1           752
2           756
3           752
4           751
5           760
$$\bar{x} = \frac{\sum x_i}{n} \qquad s = \sqrt{\frac{\sum (x_i - \bar{x})^2}{n-1}}$$
NB: DON’T round a std dev. calculation until the very end.
$\bar{x} = 754$
$s = 3.77$
The position of the first significant digit of the standard deviation fixes the last significant figure of the average or mean.
Result: $754 \pm 4$ ppb Pb
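Also:
$$\text{RSD} = \frac{s}{\bar{x}} = \frac{3.77}{754} = 0.00500$$
$$\%\text{RSD} = \frac{s}{\bar{x}} \times 100 = \frac{3.77}{754} \times 100 = 0.500\%$$
$$\text{Variance} = s^2 = (3.77)^2 = 14.2$$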
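The lead-in-blood calculation can be checked with Python's standard `statistics` module. This is only a sketch; the variable names are chosen here for illustration. `statistics.stdev` and `statistics.variance` use the n − 1 denominator, matching the definition of s in these notes.

```python
import statistics

# Replicate results for lead in blood, in ppb (from the example above).
pb_ppb = [752.0, 756.0, 752.0, 751.0, 760.0]

mean = statistics.mean(pb_ppb)          # sum of x_i divided by n
s = statistics.stdev(pb_ppb)            # sample standard deviation (n - 1)
variance = statistics.variance(pb_ppb)  # s squared
rsd = s / mean                          # relative standard deviation
percent_rsd = 100 * rsd                 # coefficient of variation

# Rounding is applied only when reporting, never to intermediates.
print(f"mean     = {mean:.1f} ppb")
print(f"s        = {s:.2f} ppb")
print(f"variance = {variance:.1f}")
print(f"RSD      = {rsd:.5f}")
print(f"%RSD     = {percent_rsd:.3f} %")
print(f"result   = {mean:.0f} ± {s:.0f} ppb Pb")
```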
Lead is readily absorbed through the gastrointestinal tract. In blood,
95% of the lead is in the red blood cells and 5% in the plasma. About
70-90% of the lead assimilated goes into the bones, then liver and
kidneys. Lead readily replaces calcium in bones.
The symptoms of lead poisoning depend upon many factors, including
the magnitude and duration of lead exposure (dose), chemical form
(organic is more toxic than inorganic), the age of the individual
(children and the unborn are more susceptible) and the overall state of
health (Ca, Fe or Zn deficiency enhances the uptake of lead).
European Community Environmental
Quality Directive – 50 µg/L in drinking water
Pb – where from?
• Motor vehicle emissions
• Lead plumbing
• Pewter
• Lead-based paints
• Weathering of Pb minerals
World Health Organisation – recommended
tolerable intake of Pb per day for an adult –
430 µg
Food stuffs < 2 mg/kg Pb
Next to highways 20-950 mg/kg Pb
Near battery works 34-600 mg/kg Pb
Metal processing sites 45-2714 mg/kg Pb
CONFIDENCE INTERVALS
The confidence interval is an expression stating that
the true mean, µ, is likely to lie within a certain
distance from the measured mean, $\bar{x}$.
– Student’s t test
The confidence interval is given by:
$$\mu = \bar{x} \pm \frac{ts}{\sqrt{n}}$$
where t is the value of Student’s t taken from the table.
A ‘t’ test is used to compare sets of measurements.
Usually 95% probability is good enough.
Example:
The mercury content in fish samples was determined
as follows: 1.80, 1.58, 1.64, 1.49 ppm Hg. Calculate the
50% and 90% confidence intervals for the mercury
content.
Find $\bar{x} = 1.63$ and $s = 0.131$
$$\mu = \bar{x} \pm \frac{ts}{\sqrt{n}}$$
50% confidence:
t = 0.765 for n − 1 = 3
$$\mu = 1.63 \pm \frac{(0.765)(0.131)}{\sqrt{4}}$$
$$\mu = 1.63 \pm 0.05$$
There is a 50% chance that the true
mean lies between 1.58 and 1.68
ppm
90% confidence:
t = 2.353 for n − 1 = 3
$$\mu = 1.63 \pm \frac{(2.353)(0.131)}{\sqrt{4}}$$
$$\mu = 1.63 \pm 0.15$$
There is a 90% chance that the true
mean lies between 1.48 and 1.78 ppm
[Figure: the 50% interval (1.58 to 1.68) and the 90% interval (1.48 to 1.78) shown about the mean, 1.63]
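The mercury example can be reproduced with a small sketch. The function name `confidence_interval` and the dictionary `T_TABLE_DF3` are illustrative assumptions; the t values themselves are the ones quoted above for n − 1 = 3. The code carries unrounded intermediates, so a bound can differ in the last digit from the slide values, which were computed from the rounded 1.63 and 0.131.

```python
import math
import statistics


def confidence_interval(values, t):
    """Return (low, high) for mu = mean ± t*s/sqrt(n)."""
    n = len(values)
    mean = statistics.mean(values)
    s = statistics.stdev(values)  # sample standard deviation (n - 1)
    half_width = t * s / math.sqrt(n)
    return mean - half_width, mean + half_width


# Mercury content of the fish samples, in ppm Hg (from the example above).
hg_ppm = [1.80, 1.58, 1.64, 1.49]

# Student's t for 3 degrees of freedom, as quoted in the example.
T_TABLE_DF3 = {50: 0.765, 90: 2.353}

intervals = {level: confidence_interval(hg_ppm, t) for level, t in T_TABLE_DF3.items()}

for level, (low, high) in intervals.items():
    print(f"{level}% confidence: {low:.2f} to {high:.2f} ppm Hg")
```

The higher the confidence level demanded, the wider the interval becomes for the same data.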
Confidence intervals - experimental uncertainty
APPLYING STUDENT’S T:
1) COMPARISON OF MEANS
Comparison of a measured result with a ‘known’
(standard) value
$$t_{calc} = \frac{|\text{known value} - \bar{x}|}{s} \sqrt{n}$$
If $t_{calc} > t_{table}$ at the 95% confidence level,
the results are considered to be different:
the difference is significant!
Statistical tests give only probabilities.
They do not relieve us of the responsibility of interpreting
our results!
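A minimal sketch of the comparison with a known value follows. The data set `measured` and the value `known` are hypothetical, invented only to exercise the formula; 3.182 is the standard t-table value for 3 degrees of freedom at 95% confidence and is not quoted in these notes.

```python
import math
import statistics


def t_calc_known(values, known_value):
    """t_calc = |known value - mean| * sqrt(n) / s."""
    n = len(values)
    mean = statistics.mean(values)
    s = statistics.stdev(values)  # sample standard deviation (n - 1)
    return abs(known_value - mean) * math.sqrt(n) / s


# Hypothetical replicate results and certified value (illustration only).
measured = [10.1, 10.3, 10.2, 10.4]
known = 10.0

# Standard t-table value for n - 1 = 3 at 95% confidence (assumed, not from the notes).
T_TABLE_95_DF3 = 3.182

t_calc = t_calc_known(measured, known)
significant = t_calc > T_TABLE_95_DF3
print(f"t_calc = {t_calc:.3f}, significant difference: {significant}")
```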
2) COMPARISON OF REPLICATE MEASUREMENTS
Compare two sets of data when one sample has been
measured many times in each data set.
For 2 sets of data with number of measurements $n_1$, $n_2$ and means $\bar{x}_1$, $\bar{x}_2$:
$$t_{calc} = \frac{|\bar{x}_1 - \bar{x}_2|}{s_{pooled}} \sqrt{\frac{n_1 n_2}{n_1 + n_2}}$$
where $s_{pooled}$ = pooled std dev. from both sets of data:
$$s_{pooled} = \sqrt{\frac{s_1^2 (n_1 - 1) + s_2^2 (n_2 - 1)}{n_1 + n_2 - 2}}$$
Degrees of freedom = $(n_1 + n_2 - 2)$
If $t_{calc} > t_{table}$ at the 95% confidence level, the
difference between results is significant.
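The pooled comparison can be sketched as below. All numerical inputs are hypothetical and serve only to show the arithmetic; 2.306 is the standard t-table value for 8 degrees of freedom at 95% confidence and is not quoted in these notes.

```python
import math


def pooled_std(s1, n1, s2, n2):
    """Pooled standard deviation of two data sets."""
    return math.sqrt((s1**2 * (n1 - 1) + s2**2 * (n2 - 1)) / (n1 + n2 - 2))


def t_calc_two_means(mean1, s1, n1, mean2, s2, n2):
    """t_calc = |mean1 - mean2| / s_pooled * sqrt(n1*n2 / (n1 + n2))."""
    sp = pooled_std(s1, n1, s2, n2)
    return abs(mean1 - mean2) / sp * math.sqrt(n1 * n2 / (n1 + n2))


# Hypothetical summary statistics for two data sets (illustration only).
mean_a, s_a, n_a = 10.0, 0.2, 5
mean_b, s_b, n_b = 10.4, 0.2, 5

degrees_of_freedom = n_a + n_b - 2
T_TABLE_95_DF8 = 2.306  # standard table value, assumed here

t_calc = t_calc_two_means(mean_a, s_a, n_a, mean_b, s_b, n_b)
print(f"df = {degrees_of_freedom}, t_calc = {t_calc:.3f}, "
      f"significant: {t_calc > T_TABLE_95_DF8}")
```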
3) COMPARISON OF INDIVIDUAL DIFFERENCES
Compare two sets of data when many samples have
been measured only once in each data set.
e.g. use two different analytical methods, A and B, to make single
measurements on several different samples.
Perform t test on individual differences between results:
$$t_{calc} = \frac{|\bar{d}|}{s_d} \sqrt{n}$$
where
$\bar{d}$ = the average difference between methods A and B
n = number of pairs of data
$$s_d = \sqrt{\frac{\sum (d_i - \bar{d})^2}{n-1}}$$
If $t_{calc} > t_{table}$ at the 95% confidence level, the
difference between results is significant.
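Example:
Are the two methods used comparable?
[Table: paired results for methods A and B with the individual differences ($d_i$)]
$$s_d = \sqrt{\frac{\sum (d_i - \bar{d})^2}{n-1}} = \sqrt{\frac{(0.02)^2 + (0.22)^2 + (0.11)^2 + (0.11)^2 + (0.02)^2 + (0.04)^2}{6-1}}$$
$$s_d = 0.12$$
$$t_{calc} = \frac{|\bar{d}|}{s_d} \sqrt{n} = \frac{0.06}{0.12} \sqrt{6}$$
$$t_{calc} = 1.2$$
$t_{table}$ = 2.571 for 95% confidence
$t_{calc} < t_{table}$: the difference between results
is NOT significant.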
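The paired test can be sketched as follows. The original table of paired results is not available here, so the list `differences` is an illustrative assumption: six values chosen only so that they reproduce the summary figures used above (average difference 0.06 and deviations of magnitude 0.02, 0.22, 0.11, 0.11, 0.02, 0.04). They are not the measured data.

```python
import math
import statistics


def t_calc_paired(differences):
    """t_calc = |mean difference| / s_d * sqrt(n) for paired results."""
    n = len(differences)
    d_bar = statistics.mean(differences)
    s_d = statistics.stdev(differences)  # uses n - 1
    return abs(d_bar) / s_d * math.sqrt(n)


# Illustrative differences (method A minus method B); NOT the original data.
differences = [0.08, 0.28, -0.05, -0.05, 0.08, 0.02]

T_TABLE_95_DF5 = 2.571  # value quoted in the example for 95% confidence

t_calc = t_calc_paired(differences)
print(f"mean difference = {statistics.mean(differences):.2f}")
print(f"s_d = {statistics.stdev(differences):.2f}")
print(f"t_calc = {t_calc:.1f}, significant: {t_calc > T_TABLE_95_DF5}")
```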
F TEST
COMPARISON OF TWO STANDARD DEVIATIONS
$$F_{calc} = \frac{s_1^2}{s_2^2}$$
If $F_{calc} > F_{table}$ at the 95% confidence level,
the std dev.’s are considered to be different:
the difference is significant.
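A short sketch of the F calculation is given below. Placing the larger standard deviation in the numerator, so that F is at least 1, is a common convention assumed here; the notes do not state it. The two standard deviations are hypothetical, and the critical value must be looked up in an F table for the relevant degrees of freedom, so it is passed in rather than hard-coded.

```python
def f_calc(s1: float, s2: float) -> float:
    """Ratio of variances with the larger variance on top (assumed convention)."""
    larger, smaller = max(s1, s2), min(s1, s2)
    if smaller <= 0:
        raise ValueError("standard deviations must be positive")
    return larger**2 / smaller**2


def std_devs_differ(s1: float, s2: float, f_table: float) -> bool:
    """True when F_calc exceeds the tabulated F supplied by the caller."""
    return f_calc(s1, s2) > f_table


# Hypothetical standard deviations from two methods (illustration only).
s_method_1 = 0.30
s_method_2 = 0.20

print(f"F_calc = {f_calc(s_method_1, s_method_2):.2f}")
```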
Q TEST FOR BAD DATA
$$Q_{calc} = \frac{\text{gap}}{\text{range}}$$
The range is the total spread
of the data.
The gap is the difference
between the “bad” point and
the nearest value.
Example:
12.2 12.4 12.5 12.6 12.9
Range = 12.9 − 12.2 = 0.7
Gap = 12.9 − 12.6 = 0.3
If $Q_{calc} > Q_{table}$, discard the
questionable point
EXAMPLE:
The following replicate analyses were obtained when
standardising a solution: 0.1067M, 0.1071M, 0.1066M and 0.1050M.
One value appears suspect. Determine if it can be ascribed to
accidental error at the 90% confidence level.
Arrange in increasing order:
0.1050 M, 0.1066 M, 0.1067 M, 0.1071 M
$$Q = \frac{\text{gap}}{\text{range}}$$
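The gap and range arithmetic can be sketched in a few lines. The function name `q_calc` is an assumption for illustration. It treats whichever end point sits furthest from its neighbour as the suspect value. The critical Q for four observations at 90% confidence must be read from a Q table, so it is left as a caller-supplied argument.

```python
def q_calc(values):
    """Return (suspect value, Q) where Q = gap / range."""
    if len(values) < 3:
        raise ValueError("need at least three values")
    data = sorted(values)
    data_range = data[-1] - data[0]  # total spread of the data
    if data_range == 0:
        raise ValueError("all values are identical")
    gap_low = data[1] - data[0]      # gap if the lowest point is suspect
    gap_high = data[-1] - data[-2]   # gap if the highest point is suspect
    if gap_low >= gap_high:
        return data[0], gap_low / data_range
    return data[-1], gap_high / data_range


def should_discard(values, q_table):
    """True when Q_calc exceeds the tabulated Q supplied by the caller."""
    _, q = q_calc(values)
    return q > q_table


# Replicate standardisation results, in M (from the example above).
molarities = [0.1067, 0.1071, 0.1066, 0.1050]

suspect, q = q_calc(molarities)
print(f"suspect value = {suspect:.4f} M, Q_calc = {q:.3f}")
```

The same function applied to 12.2, 12.4, 12.5, 12.6, 12.9 picks 12.9 as the suspect point, with a gap of 0.3 over a range of 0.7.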