Lec Notes on Sampling
Introduction to
Biostatistics
(PUBHLTH 540)
Sampling
Sampling Distributions
Sampling is a fundamental idea underlying
much of statistics. Statistical inference
commonly involves making statements about
population parameters based on sample
estimates.
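[Figure: a sample of size $n$, with sample mean $\bar{x}$ and variance $s^2$, is drawn from a population of size $N$ with mean $\mu$ and variance $\sigma^2$; inference runs from the sample estimates to the population parameters.]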
Suppose we take all possible samples
of size n from a population (e.g.
samples of size n = 10)
- For each sample, compute the sample
mean, $\bar{x}$, and variance, $s^2$
- We then have a population of
sample means.
By examining the distribution of possible sample means, we can
study their properties, such as what we would expect the sample
mean to be and how spread out the sample means are.
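As a rough sketch of the idea, the Python snippet below lists every possible sample of size n = 2 from a small four-value population and computes each sample mean. The population values, the choice n = 2, and sampling without replacement are assumptions made only for illustration.

```python
from itertools import combinations
from statistics import fmean

# Illustrative population (assumed values, used only for this sketch).
population = [11, 16, 12, 17]
n = 2  # assumed sample size

# Every possible sample of size n, drawn without replacement, order ignored.
samples = list(combinations(population, n))

# One sample mean per sample: together they form the sampling distribution.
sample_means = [fmean(s) for s in samples]

for s, m in zip(samples, sample_means):
    print(s, m)

print("number of possible samples:", len(samples))
print("mean of the sample means:", fmean(sample_means))
print("population mean:", fmean(population))
```

Comparing the last two printed values shows what we would expect the sample mean to be, and the spread of the listed means shows how much the sample mean varies from sample to sample.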
Simplest Example:
Simple random sample of
size n=1
Example
Suppose a population consists of 4 people
with AIDS. We know the response for only a
single randomly selected subject, but want
to guess the average in the population. The
number of days each person was hospitalized
last year was:
ID      1    2    3    4
Days   11   16   12   17
First, what is the population mean and
variance?
$$\mu = \frac{1}{N}\sum_{i=1}^{4} x_i = 14$$

$$\sigma^2 = \frac{1}{N}\sum_{i=1}^{4} (x_i - \mu)^2 = 6.5$$
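The population mean and variance can be checked with a few lines of Python. This is a minimal sketch: the variable names are arbitrary, and the variance uses the divisor N (not N - 1) because all four people make up the whole population.

```python
# Hospitalized days for the four people in the population (from the example).
days = [11, 16, 12, 17]
N = len(days)

# Population mean: sum of the values divided by N.
mu = sum(days) / N

# Population variance: average squared deviation from mu (divisor N, not N - 1).
sigma2 = sum((x - mu) ** 2 for x in days) / N

print("mu =", mu)
print("sigma^2 =", sigma2)
```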
How many possible different
samples are there?
Number of possible samples: 4
Random Variables
How do we represent a single random selection from the
population? We need a notation.
Define a random variable: X represents the value that we
could see (realize) upon selection.
A random variable is typically represented by a capital letter.
Definition of a Random Variable
Random Variable: X
Event        Realized Value (x)    Probability
Pick ID=1    11                    ¼
Pick ID=2    16                    ¼
Pick ID=3    12                    ¼
Pick ID=4    17                    ¼
• Ingredients:
– List of possible events (mutually exclusive and exhaustive)
– Value and probability for each event
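One way to hold these ingredients in code is a mapping from each event to its value and probability. The sketch below is only an illustration; the dictionary layout and variable names are assumptions, not part of the notes.

```python
from fractions import Fraction

# Each event maps to (realized value x, probability of the event).
X = {
    "Pick ID=1": (11, Fraction(1, 4)),
    "Pick ID=2": (16, Fraction(1, 4)),
    "Pick ID=3": (12, Fraction(1, 4)),
    "Pick ID=4": (17, Fraction(1, 4)),
}

# Every probability should lie between 0 and 1.
each_in_range = all(0 <= p <= 1 for _, p in X.values())

# The events are mutually exclusive and exhaustive, so the probabilities sum to 1.
total_probability = sum(p for _, p in X.values())

print("all probabilities in [0, 1]:", each_in_range)
print("total probability:", total_probability)
```

Using exact fractions rather than floating-point numbers keeps the check that the probabilities sum to 1 free of rounding error.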
Properties of Probabilities
• A probability is the long-run relative
frequency of an event occurring.
– the probability of an event is
between 0 and 1
– the sum of the probabilities of all mutually
exclusive and exhaustive events is 1.
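The long-run relative frequency idea can be illustrated by simulation. The sketch below repeatedly picks one of four equally likely IDs and reports how often each one was picked; the seed and the number of draws are arbitrary assumptions chosen so the run is repeatable.

```python
import random
from collections import Counter

random.seed(540)   # arbitrary seed so the run is repeatable
ids = [1, 2, 3, 4]
draws = 100_000    # assumed number of repeated selections

# Select one ID at random, many times, and count how often each ID appears.
counts = Counter(random.choice(ids) for _ in range(draws))

# Relative frequency of each event; each should be close to 1/4.
rel_freq = {i: counts[i] / draws for i in ids}

for i in ids:
    print(f"ID={i}: relative frequency {rel_freq[i]:.4f}")
print("sum of relative frequencies:", sum(rel_freq.values()))
```

With more draws, each relative frequency tends to settle closer to ¼, and the frequencies always add up to 1 because exactly one ID is picked on every draw.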
Definition of a Random Variable
• Common Terminology: $X = x$ means that
the realized value of X is x
Example: Suppose that the selection
of a subject is ID=3 (where x=12).
Then the realized value of X is 12.
Note: This doesn’t mean the random
variable, X, is 12. The realized value
of X is 12.
Expected Value: Mean
• What do we expect X to be?
– i.e. What value do you expect X to
have?
– E(X)=?
$$E(X) = \sum_{\text{all possibilities}} P(X = x)\, x$$

$$E(X) = \tfrac{1}{4}(11) + \tfrac{1}{4}(16) + \tfrac{1}{4}(12) + \tfrac{1}{4}(17) = 14 = \mu_X$$
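The expected value is a probability-weighted sum, which takes one line to compute. A minimal sketch, assuming each of the four values is selected with probability ¼ as in the example (the variable names are arbitrary):

```python
from fractions import Fraction

values = [11, 16, 12, 17]   # possible realized values of X
p = Fraction(1, 4)          # each subject is equally likely to be selected

# E(X): sum over all possibilities of P(X = x) * x.
expected = sum(p * x for x in values)

print("E(X) =", expected)
```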
Expected Value: Variance
• What is the variance of X?
• i.e. What value do you expect
$(X - E(X))^2$ to have?

$$\sigma_X^2 = E\left[(X - E(X))^2\right] = \sum_{\text{all possibilities}} P(X = x)\,(x - E(X))^2$$
Example of Variance of X
Suppose a population consists of 4 people
with AIDS.
The number of hospitalized days for each
person last year was:
ID      1    2    3    4
Days   11   16   12   17
Suppose we take a simple random
sample (SRS) of n=1. What is the
expected value of X? Var(X)?
Computing Expected Values
$$E(X) = \sum_{\text{all possibilities}} P(X = x)\, x$$

$$\sigma_X^2 = E\left[(X - E(X))^2\right] = \sum_{\text{all possibilities}} P(X = x)\,(x - E(X))^2$$
$$E(X) = \tfrac{1}{4}(11) + \tfrac{1}{4}(16) + \tfrac{1}{4}(12) + \tfrac{1}{4}(17) = 14$$
Variance of X
$$\sigma_X^2 = \tfrac{1}{4}(11 - 14)^2 + \tfrac{1}{4}(16 - 14)^2 + \tfrac{1}{4}(12 - 14)^2 + \tfrac{1}{4}(17 - 14)^2 = 6.5$$
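The same calculation can be written as two small helper functions that work for any discrete distribution given as (value, probability) pairs. The function names `expected_value` and `variance` are hypothetical, chosen for this sketch.

```python
from fractions import Fraction

# Distribution of X for an SRS of n = 1: (realized value, probability) pairs.
dist = [
    (11, Fraction(1, 4)),
    (16, Fraction(1, 4)),
    (12, Fraction(1, 4)),
    (17, Fraction(1, 4)),
]

def expected_value(dist):
    """E(X): probability-weighted sum of the possible values."""
    return sum(p * x for x, p in dist)

def variance(dist):
    """Var(X): probability-weighted sum of squared deviations from E(X)."""
    mean = expected_value(dist)
    return sum(p * (x - mean) ** 2 for x, p in dist)

var_x = variance(dist)
print("Var(X) =", var_x, "=", float(var_x))
```

Because every value has probability ¼, this matches the population variance computed with divisor N.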
Stochastic Model
A stochastic model is an equation that
includes random variables. There is a
deterministic equation for each realization
of the random variables.
Example: $X = \mu + E$
Event        Realized Value (x)    Deterministic Equation
Pick ID=1    11                    11 = 14 - 3
Pick ID=2    16                    16 = 14 + 2
Pick ID=3    12                    12 = 14 - 2
Pick ID=4    17                    17 = 14 + 3
• Note that E is also a random variable. We
can define it by
Random Variable: E
Event        Realized Value (e)    Probability
Pick ID=1    -3                    ¼
Pick ID=2    2                     ¼
Pick ID=3    -2                    ¼
Pick ID=4    3                     ¼
Stochastic Model (additive)
$$X = \mu + E$$
where X and E are random variables, $\mu$ is a constant, and $E(X) = \mu$.
• This is called an additive model since the
additional term, E, is added to the
expected value
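To make the additive model concrete, the sketch below recovers the realized value of E for each possible selection as the deviation x - μ, and then averages those deviations. The variable names are arbitrary, and the final check (that the deviations average to zero) is a consequence of the arithmetic rather than something stated in the notes.

```python
# Hospitalized days by subject ID (from the example).
days = {1: 11, 2: 16, 3: 12, 4: 17}

# Constant term of the model: the expected value of X.
mu = sum(days.values()) / len(days)

# Realized value of E for each possible selection: e = x - mu.
errors = {i: x - mu for i, x in days.items()}

# One deterministic equation per realization: x = mu + e.
for i, x in days.items():
    print(f"Pick ID={i}: {x} = {mu:g} + ({errors[i]:g})")

# Each selection has probability 1/4, so the expected value of E is the plain average.
mean_error = sum(errors.values()) / len(errors)
print("average of the deviations:", mean_error)
```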