Estimating variances - Crop and Soil Science


Sub-Sampling
• It may be necessary or convenient to measure a
treatment response on subsamples of a plot
– several soil cores within a plot
– duplicate laboratory analyses to estimate grain protein
• Subsampling introduces a complication into the analysis
that can be handled in one of two ways:
– compute the average for each plot and analyze normally
– subject the subsamples themselves to an analysis
• The second choice gives an additional source of
variation in the ANOVA, often called the sampling error
Use Sampling to Gain Precision
• When making lab measurements, you will have
better results if you analyze several samples to
get a truer estimate of the mean.
• It is often useful to determine the number of
samples that would be required for your chosen
level of precision.
• Sampling will reduce the variability within a
treatment across replications.
Stein’s Sample Estimate
$$n = \frac{t_1^2 s^2}{d^2}$$
Where
t1 is the tabular t value for the desired confidence
level and the degrees of freedom of the initial
sample
d is the half-width of the desired confidence
interval
s is the standard deviation of the initial sample
For Example
Suppose we were measuring grain protein content and we
wanted to increase the precision with which we were measuring
each replicate of a treatment.
If we collected and ran five samples from the same block and same
treatment, we might obtain data like the subsample values listed below.
We decide that an alpha level of 5% is acceptable and we
would like to be able to get within 0.5 units of the true mean.
The formula indicates that to gain that type of precision, we would
need to run 14 samples per block per treatment.
Subsample values: 6.2, 7.4, 5.8, 7.0, 6.1

Statistic         Value
mean              6.50
variance          0.45
t (0.05, 4 df)    2.78
d                 0.50
n                 13.88
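$$n = \frac{t_1^2 s^2}{d^2} = \frac{2.78^2 \times 0.45}{0.5^2} \approx 13.88$$
(The value 13.88 comes from carrying the tabular t at full precision, 2.7764;
with t rounded to 2.78 the result is 13.91. Either way, n rounds up to 14.)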
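The calculation above can be checked with a short script. This is a minimal sketch, not part of the original slides: the function name `stein_sample_size` is hypothetical, and the tabular t value is passed in by hand (2.776 for a two-tailed alpha of 0.05 with 4 df) rather than looked up by the code.

```python
import math
import statistics


def stein_sample_size(samples, t_value, half_width):
    """Stein's estimate n = t1^2 * s^2 / d^2 from an initial sample.

    t_value is the tabular t for the chosen confidence level and the
    degrees of freedom of the initial sample (len(samples) - 1).
    """
    s2 = statistics.variance(samples)  # sample variance, n - 1 divisor
    return t_value ** 2 * s2 / half_width ** 2


# The five grain protein subsamples from the example.
protein = [6.2, 7.4, 5.8, 7.0, 6.1]
T_TABLE = 2.776  # assumption: read from a t table (alpha = 0.05, 4 df)

n_required = stein_sample_size(protein, T_TABLE, half_width=0.5)
print("mean:", round(statistics.mean(protein), 2))
print("variance:", round(statistics.variance(protein), 2))
print("samples needed (rounded up):", math.ceil(n_required))
```

Rounding up with `math.ceil` reflects that a fractional sample cannot be run, which is why the example settles on 14 samples.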
Linear model with sub-sampling
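• For a CRD
$Y_{ijk} = \mu + \tau_i + \varepsilon_{ij} + \delta_{ijk}$
$\mu$ = mean effect
$\tau_i$ = ith treatment effect
$\varepsilon_{ij}$ = random error
$\delta_{ijk}$ = sampling error
• For an RBD
$Y_{ijk} = \mu + \beta_i + \tau_j + \varepsilon_{ij} + \delta_{ijk}$
$\mu$ = mean effect
$\beta_i$ = ith block effect
$\tau_j$ = jth treatment effect
$\varepsilon_{ij}$ = treatment x block interaction, treated as error
$\delta_{ijk}$ = sampling error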
Expected Mean Squares – RBD with subsampling
Source           df            Expected Mean Square
Block            r-1           $\sigma_s^2 + n\sigma_e^2 + tn\sigma_b^2$
Treatment        t-1           $\sigma_s^2 + n\sigma_e^2 + rn\sigma_t^2$
Error            (r-1)(t-1)    $\sigma_s^2 + n\sigma_e^2$
Sampling Error   rt(n-1)       $\sigma_s^2$

• In this example, treatments are fixed and blocks are random effects
• This is a mixed model because it includes both fixed and random effects
• Appropriate F tests can be determined from the Expected Mean Squares
The RBD ANOVA with Subsampling
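Source        df            SS      MS                    F
Total         rtn-1         SSTot
Block         r-1           SSB     SSB/(r-1)
Trtmt         t-1           SST     SST/(t-1)             FT = MST/MSE
Error         (r-1)(t-1)    SSE     SSE/[(r-1)(t-1)]      FE = MSE/MSS
Sample Err.   rt(n-1)       SSS     SSS/[rt(n-1)]

where
$$SSTot = \sum_i \sum_j \sum_k (Y_{ijk} - \bar{Y}_{...})^2$$
$$SSB = tn \sum_i (\bar{Y}_{i..} - \bar{Y}_{...})^2$$
$$SST = rn \sum_j (\bar{Y}_{.j.} - \bar{Y}_{...})^2$$
$$SSE = n \sum_i \sum_j (\bar{Y}_{ij.} - \bar{Y}_{...})^2 - SSB - SST$$
$$SSS = SSTot - SSB - SST - SSE$$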
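The sums of squares in this ANOVA can be computed directly from the plot-by-sample data. The following is a rough sketch for a balanced design, not code from the original slides: the function name `rbd_subsampling_anova`, the nested-list layout `y[i][j][k]` (block i, treatment j, subsample k), and the toy data are all assumptions made for illustration.

```python
def rbd_subsampling_anova(y):
    """ANOVA for a balanced RBD with subsampling.

    y[i][j][k] is the kth subsample of treatment j in block i.
    Returns sums of squares, mean squares, and the two F ratios.
    """
    r, t, n = len(y), len(y[0]), len(y[0][0])
    values = [v for block in y for plot in block for v in plot]
    grand = sum(values) / (r * t * n)

    block_means = [sum(sum(plot) for plot in block) / (t * n) for block in y]
    trt_means = [sum(sum(y[i][j]) for i in range(r)) / (r * n) for j in range(t)]
    plot_means = [sum(plot) / n for block in y for plot in block]

    ss_tot = sum((v - grand) ** 2 for v in values)
    ssb = t * n * sum((m - grand) ** 2 for m in block_means)
    sst = r * n * sum((m - grand) ** 2 for m in trt_means)
    # Among-plot variation with block and treatment effects removed.
    sse = n * sum((m - grand) ** 2 for m in plot_means) - ssb - sst
    sss = ss_tot - ssb - sst - sse

    mst = sst / (t - 1)
    mse = sse / ((r - 1) * (t - 1))
    mss = sss / (r * t * (n - 1))
    return {
        "SSTot": ss_tot, "SSB": ssb, "SST": sst, "SSE": sse, "SSS": sss,
        "MST": mst, "MSE": mse, "MSS": mss,
        "FT": mst / mse,  # tests treatment differences
        "FE": mse / mss,  # tests variation among plots treated alike
    }


# Toy data (hypothetical): 2 blocks x 2 treatments x 2 subsamples.
toy = [
    [[1, 3], [5, 7]],
    [[2, 4], [8, 10]],
]
result = rbd_subsampling_anova(toy)
for name in ("SSB", "SST", "SSE", "SSS", "FT", "FE"):
    print(name, result[name])
```

For the toy data the partition is SSB = 8, SST = 50, SSE = 2, SSS = 8, giving FT = 25 and FE = 1. The data set is far too small for a meaningful test and only serves to make the arithmetic easy to follow by hand.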
Means and Standard Errors
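Standard Error of a treatment mean
$$s_{\bar{Y}} = \sqrt{MSE / (rn)}$$
Confidence interval estimate
$$L(\mu_i) = \bar{Y}_i \pm t_\alpha \sqrt{MSE / (rn)}$$
Standard Error of a difference
$$s_{\bar{Y}_1 - \bar{Y}_2} = \sqrt{2\,MSE / (rn)}$$
Confidence interval estimate
$$L(\mu_1 - \mu_2) = (\bar{Y}_1 - \bar{Y}_2) \pm t_\alpha \sqrt{2\,MSE / (rn)}$$
t to test difference between two means
$$t = \frac{\bar{Y}_1 - \bar{Y}_2}{\sqrt{2\,MSE / (rn)}}$$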
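These formulas translate into a few one-line helpers. The sketch below is illustrative only: the function names and all input numbers (MSE, r, n, the means, and the tabular t) are hypothetical, and the t value is supplied by hand. Because MSE is the error term, the tabular t would normally be taken at the error degrees of freedom, (r-1)(t-1).

```python
import math


def se_mean(mse, r, n):
    """Standard error of a treatment mean: sqrt(MSE / (r * n))."""
    return math.sqrt(mse / (r * n))


def se_difference(mse, r, n):
    """Standard error of a difference of two means: sqrt(2 * MSE / (r * n))."""
    return math.sqrt(2 * mse / (r * n))


def ci_mean(mean, mse, r, n, t_value):
    """Confidence limits for one treatment mean."""
    half = t_value * se_mean(mse, r, n)
    return mean - half, mean + half


def t_statistic(mean1, mean2, mse, r, n):
    """t statistic for the difference between two treatment means."""
    return (mean1 - mean2) / se_difference(mse, r, n)


# Hypothetical inputs: MSE = 2.0, r = 4 blocks, n = 2 subsamples per plot.
# T_TABLE = 2.179 assumes 12 error df (e.g. 5 treatments) at alpha = 0.05.
MSE, R, N, T_TABLE = 2.0, 4, 2, 2.179
print("SE of a mean:", se_mean(MSE, R, N))
print("SE of a difference:", round(se_difference(MSE, R, N), 4))
print("CI for a mean of 10:", ci_mean(10.0, MSE, R, N, T_TABLE))
print("t for means 12 vs 10:", round(t_statistic(12.0, 10.0, MSE, R, N), 3))
```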
Significance Tests
• MSS estimates
– the variation among samples
• MSE estimates
– the variation among samples plus
– the variation among plots treated alike
• MST estimates
– the variation among samples plus
– the variation among plots treated alike plus
– the variation among treatment means
Therefore:
• FE tests the significance of the variation among plots treated alike
• FT tests the significance of the differences among the treatment means
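Because MSS estimates the sampling variance alone and MSE estimates the sampling variance plus n times the plot-to-plot variance, the two variance components can be separated by subtraction. A minimal sketch follows; the function name `variance_components` and the example mean squares are hypothetical, and truncating a negative estimate at zero is a common convention assumed here, not something stated in the slides.

```python
def variance_components(mse, mss, n):
    """Estimate the sampling and experimental-error variance components.

    From the expected mean squares:
      E[MSS] = sigma_s^2
      E[MSE] = sigma_s^2 + n * sigma_e^2
    so sigma_e^2 is estimated by (MSE - MSS) / n.
    """
    sigma_s2 = mss
    # Assumption: report zero when MSE < MSS makes the estimate negative.
    sigma_e2 = max((mse - mss) / n, 0.0)
    return sigma_s2, sigma_e2


# Hypothetical mean squares with n = 3 subsamples per plot.
s2_sampling, s2_error = variance_components(mse=10.0, mss=4.0, n=3)
print("sampling variance:", s2_sampling)
print("experimental error variance:", s2_error)
```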
Allocating resources – reps vs samples
• Cost function
C = c1r + c2rn
– c1 = cost of an experimental unit
– c2 = cost of a sampling unit
• Estimate of the number of samples per plot
$$n = \sqrt{\frac{c_1 \sigma_s^2}{c_2 \sigma_e^2}}$$
• If your goal is to minimize variance for a fixed cost,
use the estimate of n to solve for r in the cost function
• If your goal is to minimize cost for a fixed variance,
use the estimate of n to solve for r using the formula
for a variance of a treatment mean
$$\sigma_{\bar{y}}^2 = \frac{\sigma_s^2}{rn} + \frac{\sigma_e^2}{r}$$
See Kuehl pg 163 for an example
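The two allocation strategies can be sketched as follows. This is an illustration under stated assumptions, not the example from Kuehl: the function names and the costs, variance components, budget, and target variance are all hypothetical, and in practice n and r would be rounded to whole numbers.

```python
import math


def samples_per_plot(c1, c2, sigma_s2, sigma_e2):
    """n = sqrt(c1 * sigma_s^2 / (c2 * sigma_e^2))."""
    return math.sqrt(c1 * sigma_s2 / (c2 * sigma_e2))


def reps_for_fixed_cost(total_cost, c1, c2, n):
    """Solve C = c1*r + c2*r*n for r."""
    return total_cost / (c1 + c2 * n)


def reps_for_fixed_variance(target_var, sigma_s2, sigma_e2, n):
    """Solve var = sigma_s^2/(r*n) + sigma_e^2/r for r."""
    return (sigma_s2 / n + sigma_e2) / target_var


def variance_of_mean(sigma_s2, sigma_e2, r, n):
    """Variance of a treatment mean."""
    return sigma_s2 / (r * n) + sigma_e2 / r


# Hypothetical inputs: a plot costs 16 units, a sample costs 1 unit,
# sampling variance 4.0, experimental error variance 1.0.
C1, C2, S2_S, S2_E = 16.0, 1.0, 4.0, 1.0
n_opt = samples_per_plot(C1, C2, S2_S, S2_E)

r_cost = reps_for_fixed_cost(240.0, C1, C2, n_opt)         # budget of 240
r_var = reps_for_fixed_variance(0.15, S2_S, S2_E, n_opt)   # target variance 0.15

print("samples per plot:", n_opt)
print("reps for fixed cost:", r_cost)
print("reps for fixed variance:", round(r_var, 2))
print("variance at fixed-cost design:",
      round(variance_of_mean(S2_S, S2_E, r_cost, n_opt), 3))
```

With these made-up inputs both routes land on about 8 samples per plot and 10 replications, which is a convenient check that the cost function and the variance formula are being solved consistently.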