#### Transcript Random Samples

What Is a Random Sample (and What If It's Not)

©Dr. B. C. Paul 2005

Some Commentary on Random Samples

We are using mathematical models as surrogates for a reality we either don't have data for or can't afford to collect the data for. We've already discussed that we assumed a normal distribution (the t distribution is just an adaptation that accounts for uncertainty in the stdev).

What does it mean to say our sample was random?

1- No one cherry-picked the data set (this can be a problem when visual appearance is different; humans are born cherry pickers).

2- The value of one sample has no bearing on what the next sample value will be.

When Is That Not True

When taking the sample alters the nature of the remaining population.

Example: Playing Black Jack

When a card is drawn and played, the number of that particular card left in the deck changes. Casinos may play with several decks to more closely approximate a random chance draw (because the house has an advantage in a random game). Casinos also tend to get upset if they find that someone is trying to recalculate the odds based on what has been played.

Another Time It Is Not True

In the presence of spatial correlation. Spatial correlation is commonly seen in mining ore grade problems and environmental engineering. If I take a soil sample and find it loaded with dioxin, what are the chances that a soil sample taken two inches away will show no dioxin?

With the random formula for variance of the mean, there are so many little samples in a truck load of ore that every truck load of ore should have the average grade of the deposit. IF THINGS WERE RANDOM.

Variance of Means with Spatial Correlation

First, one must define how correlation is influenced by distance and direction.
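One common way to write that dependence down is the experimental semivariogram. The notation here is an assumption for illustration and does not come from the slides: z(x_i) is the value measured at location x_i, h is the separation between two samples (a distance, or a distance along a chosen direction), and N(h) is the number of sample pairs that far apart.

```latex
\gamma(h) = \frac{1}{2\,N(h)} \sum_{i=1}^{N(h)} \bigl[ z(x_i) - z(x_i + h) \bigr]^2
```

Small values of γ(h) mean samples that far apart tend to look alike; larger values mean they are less related.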
Take the samples and create a "semivariogram": plot the average half squared difference for all samples a distance X apart.

The Semivariogram

[Figure: semivariogram plot. Horizontal axis: distance. Vertical axis: ½ squared difference. A model line is fit to the data points from the samples.]

The ½ squared difference has the same units as the sample variance and levels out at the sample variance. It measures correlation using ½ the squared difference between samples a distance X apart. The ½ squared difference is named gamma (symbol γ).

Variance of Means

[Figure: an itsy bitsy sample, used to plot the semivariogram, next to a big block of ore loaded in a truck.]

We know the big block of ore has a lower stdev than the samples. But how much lower? It's not σ/sqrt(n).

Using Numerical Methods and Computers

The computer creates a grid of points inside the block (about 25 is usually enough). It then exhaustively measures all combinations of distances between points (all 625 of them, pairing each of the 25 points with every point, itself included). For each distance it uses the semivariogram model to calculate the expected variability of the points. It keeps a running total and then calculates the average value of gamma.

The variance of big blocks is

$$\sigma^2_{\text{BigBlocks}} = \sigma^2_{\text{samples}} - \bar{\gamma}(W:W)$$

Remember, variance is just standard deviation squared. We know the variance of the samples because we have the sample set and have calculated it. That gamma bar thing up there is the number our computer just chugged out for us. Hey, I can subtract even on a bad day!!
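The grid-and-average procedure can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the program used in the lecture: the spherical semivariogram model, the square block, the 5 x 5 grid layout, and every number (`sample_variance`, `block_size`, `range_`) are hypothetical choices made only to show the arithmetic. Counting every ordered pair, including each point with itself, gives 25 x 25 = 625 terms in this sketch.

```python
import math
from itertools import product


def spherical_gamma(h, sill, range_):
    """Assumed spherical semivariogram model with no nugget.

    Rises from 0 at h = 0 and levels out at `sill` once h reaches `range_`.
    """
    if h >= range_:
        return sill
    r = h / range_
    return sill * (1.5 * r - 0.5 * r ** 3)


def gamma_bar(block_size, sill, range_, n_side=5):
    """Average gamma over all ordered point pairs in a square block."""
    step = block_size / n_side
    # Place one point at the centre of each grid cell (n_side * n_side points).
    points = [((i + 0.5) * step, (j + 0.5) * step)
              for i in range(n_side) for j in range(n_side)]
    total = 0.0  # running total of gamma over every pair
    for (x1, y1), (x2, y2) in product(points, repeat=2):
        distance = math.hypot(x2 - x1, y2 - y1)
        total += spherical_gamma(distance, sill, range_)
    return total / len(points) ** 2


# Hypothetical inputs: the sill is set to the sample variance because the
# semivariogram levels out at the sample variance.
sample_variance = 4.0
g_bar = gamma_bar(block_size=10.0, sill=sample_variance, range_=50.0)
block_variance = sample_variance - g_bar

print(f"gamma bar      = {g_bar:.3f}")
print(f"block variance = {block_variance:.3f}")
print(f"block stdev    = {math.sqrt(block_variance):.3f}")
```

When the block is small compared with the range of correlation, gamma bar is small and the block keeps most of the sample variance. When the range is tiny compared with the grid spacing, gamma bar climbs toward the sample variance and the block variance shrinks toward what the random formula would suggest.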
We'll look more at spatial statistics later.

Randomness and Using Normal Distribution Statistics

Use ordinary normal distribution (or t) statistics if you are using random samples. Don't cherry-pick your samples. Don't determine what the test is after you collect your test statistics.

Watch out for conditions that make a random sample impossible to take:

Cases where taking your sample changed the remaining population in a noticeable way (the Black Jack example).

Cases where your samples are in fact related to each other by virtue of how close together they were taken and in what direction (i.e., spatial correlation).

We can handle these non-random sampling events, but it does take a different mathematical model (don't use the wrong model).
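To put numbers on the Black Jack case, here is a small sketch that computes how much one played ace shifts the chance that the next card is an ace. The function name `ace_probability` and the choice of one deck versus six decks are assumptions made for illustration; the only facts used are that a standard deck has 52 cards and 4 aces.

```python
from fractions import Fraction


def ace_probability(decks, aces_already_drawn=0):
    """Chance the next card is an ace, given only aces have been played so far."""
    cards_left = 52 * decks - aces_already_drawn
    aces_left = 4 * decks - aces_already_drawn
    return Fraction(aces_left, cards_left)


for decks in (1, 6):
    before = ace_probability(decks)
    after = ace_probability(decks, aces_already_drawn=1)
    shift = float(before - after)
    print(f"{decks} deck(s): {before} before, {after} after, shift = {shift:.4f}")
```

The shift is several times smaller with six decks than with one, which is the sense in which playing with several decks more closely approximates a random draw.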