Random Samples


What is a Random Sample
(and what if it’s not)
©Dr. B. C. Paul 2005
Some Commentary on Random
We are using mathematical models as
surrogates for a reality we either don’t have data
for or can’t afford to get the data for
We’ve already discussed that we assumed a normal
distribution (the t distribution is just an adaptation
with uncertainty in the stdev.)
What does it mean to say our sample was random?
1- No one cherry picked the data set (can be a
problem when visual appearance is different –
humans are born cherry pickers)
2- Value of one sample has no bearing on what the
next sample value will be
When is That Not True
When taking the sample alters the nature
of the remaining population
Example – Playing Black Jack
When a card is drawn and played the number of
that particular card in the deck is changed
 Casinos may play with several decks to more
closely approximate a random chance draw
(because the house has an advantage in a random
Casinos also tend to get upset if they find that someone
is trying to recalculate the odds based on what has
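A minimal sketch of the Black Jack point, using aces as the example card. The function name and the choice of 1 versus 6 decks are illustrative assumptions, not something from the slides. It shows how much one played card shifts the odds of the next draw, and why more decks get closer to a random chance draw.

```python
from fractions import Fraction

def ace_probability(decks, aces_seen=0, cards_seen=0):
    """Chance the next card is an ace, given what has already been played."""
    aces_left = 4 * decks - aces_seen      # 4 aces per 52-card deck
    cards_left = 52 * decks - cards_seen
    return Fraction(aces_left, cards_left)

for decks in (1, 6):
    before = ace_probability(decks)
    after = ace_probability(decks, aces_seen=1, cards_seen=1)
    # The shift is the amount by which one draw altered the remaining population
    print(decks, "deck(s):", float(before), "->", float(after),
          "shift", float(before - after))
```

With one deck the chance drops from 1/13 to 1/17 after a single ace is played. With six decks it only drops from 1/13 to 23/311, so each draw behaves much more like an independent one.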
Another Time it is not True
In the presence of spatial correlation
Spatial correlation is commonly seen in Mining Ore
Grade problems and Environmental Engineering
If I take a soil sample and find it loaded with
dioxin what are the chances that a soil sample
taken two inches away will show no dioxin?
With the random formula for variance of the
mean there are so many little samples in a truck
load of ore that every truck load of ore should
have the average grade of the deposit – IF
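For reference, the random formula for the standard deviation of a mean is σ/sqrt(n), which only holds when the samples are independent. The sketch below plugs in made-up numbers (a sample stdev of 2.0 grade units and 10,000 little samples per truck load, both hypothetical) to show how small the truck-to-truck variation would be under that assumption.

```python
import math

def std_of_mean(sample_std, n):
    """Standard deviation of the mean of n independent (random) samples."""
    return sample_std / math.sqrt(n)

# Hypothetical values, chosen only for illustration
sample_std = 2.0          # stdev of individual small samples
samples_per_truck = 10_000

print(std_of_mean(sample_std, samples_per_truck))
```

Under independence the truck load stdev is 100 times smaller than the sample stdev, which is why every truck load would look like the deposit average.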
Variance of Means with Spatial
First Thing one must define how
correlation is influenced by distance and
Take the samples and create a
Plot the average half squared difference for all
samples a distance X apart
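The averaging step described above can be sketched as follows. This is only an illustration under simplifying assumptions: samples lie along a single line (1-D positions), distances are grouped into bins of a chosen width, and the data values are made up. The function name and parameters are hypothetical.

```python
from collections import defaultdict

def empirical_semivariogram(positions, values, lag_width):
    """Average half squared difference for sample pairs, grouped by distance apart."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    n = len(values)
    for i in range(n):
        for j in range(i + 1, n):
            distance = abs(positions[j] - positions[i])
            lag_bin = round(distance / lag_width)          # nearest distance class
            sums[lag_bin] += 0.5 * (values[j] - values[i]) ** 2
            counts[lag_bin] += 1
    # Map each distance class to its average half squared difference
    return {b * lag_width: sums[b] / counts[b] for b in sorted(sums)}

# Made-up samples spaced one unit apart along a line
positions = [0.0, 1.0, 2.0, 3.0]
grades = [1.0, 2.0, 4.0, 7.0]
gamma = empirical_semivariogram(positions, grades, lag_width=1.0)
for distance, value in gamma.items():
    print(distance, value)
```

Plotting the returned values against distance gives the points that a model line is then fitted to.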
The Semivariogram
[Figure: ½ squared difference plotted against distance X between samples, with a model line fit to the data points from samples]
½ squared difference has the same units as sample variance and levels out at the sample variance
Measures correlation using ½ the squared difference between samples a distance X apart
½ squared difference is named gamma (symbol – γ)
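The slides do not say which model line is fitted. As an assumption, the sketch below uses the spherical model, a common choice that rises from zero and levels out at a sill (here taken to be the sample variance) once samples are farther apart than a range distance. The parameter names are hypothetical.

```python
def spherical_semivariogram(h, sill, range_):
    """Spherical model: gamma rises from 0 at h = 0 and levels out at the sill."""
    if h <= 0:
        return 0.0
    if h >= range_:
        return sill                      # beyond the range, samples are uncorrelated
    r = h / range_
    return sill * (1.5 * r - 0.5 * r ** 3)

# Hypothetical sill (sample variance) of 4.0 and range of 10 distance units
for h in (0, 5, 10, 25):
    print(h, spherical_semivariogram(h, sill=4.0, range_=10.0))
```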
Variance of Means
Itsy bitsy sample used to plot
Big Block of Ore Loaded in a Truck
We know the big block of ore has a lower stdev than the samples – but how much lower?
It’s not σ/sqrt(n)
Using Numerical Methods and
Computer creates a grid of points – about 25 is usually
Computer then exhaustively measures all combinations of distances between points (all 625 of them, 25 × 25)
For each distance it uses the semivariogram model to calculate the expected variability of the points
It keeps a running total and then calculates the average value of gamma.
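The numerical procedure above can be sketched as below. Several things here are assumptions for illustration: the block is a 2-D rectangle, the 25 points are a 5 × 5 grid of cell centres, every ordered pair is counted (including each point with itself, which gives 25 × 25 = 625 distances), and the semivariogram is a spherical model with made-up sill and range.

```python
import math
from itertools import product

def spherical(h, sill, range_):
    """Assumed semivariogram model (spherical); not specified in the slides."""
    if h <= 0:
        return 0.0
    if h >= range_:
        return sill
    r = h / range_
    return sill * (1.5 * r - 0.5 * r ** 3)

def block_gamma_bar(model, width, height, n_side=5):
    """Average gamma over all pairs of grid points inside a width x height block."""
    xs = [(i + 0.5) * width / n_side for i in range(n_side)]
    ys = [(j + 0.5) * height / n_side for j in range(n_side)]
    points = list(product(xs, ys))            # 25 points when n_side = 5
    total = 0.0
    count = 0
    for (x1, y1), (x2, y2) in product(points, repeat=2):
        total += model(math.hypot(x2 - x1, y2 - y1))   # running total of gamma
        count += 1
    return total / count                      # gamma bar

# Hypothetical block 10 x 10 units, sill 4.0, range 50 units
gamma_bar = block_gamma_bar(lambda h: spherical(h, 4.0, 50.0), 10.0, 10.0)
print(gamma_bar)
```

Passing the model in as a function keeps the averaging loop independent of whichever model line was actually fitted.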
The Variance of Big Blocks is
σ²(big block) = σ²(samples) - γ̄ (gamma bar)
Remember variance is just standard deviation squared
We know the variance of samples because we have the sample set and have calculated it
That gamma bar thing up there is the number our computer just chugged out for us
Hey I can subtract even on a bad day!!
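The subtraction (variance of samples minus gamma bar) is small enough to write out. The numbers below are made up for illustration, and the function name is hypothetical.

```python
import math

def block_std(sample_variance, gamma_bar):
    """Stdev of big blocks: sqrt(variance of samples - gamma bar)."""
    block_variance = sample_variance - gamma_bar
    return math.sqrt(max(block_variance, 0.0))   # guard against tiny negative round-off

# Hypothetical values: sample variance 4.0, gamma bar 1.75
print(block_std(4.0, 1.75))
```

With these numbers the block stdev is 1.5 against a sample stdev of 2.0: lower than the samples, but nowhere near as low as σ/sqrt(n) would claim for a truck load.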
We’ll look more at Spatial Statistics Later
Randomness and Using Normal
Distribution Statistics
Use ordinary normal distribution (or T) statistics if you
are using random samples
Don’t cherry pick your samples
Don’t determine what the test is after you collect your test
Watch Out for Conditions that make a random sample
impossible to take
Cases where taking your sample actually changed the remaining
population in a noticeable way (the Black Jack example)
Cases where your samples are in fact related to each other by
virtue of how close and in what direction they came from (i.e. Spatial Correlation)
We can handle these non-random sampling events but it does
take a different mathematical model (don’t use the wrong