Lecture 7 - Statistics
Survey
How would you judge the pace of the lectures?
Do you find the notes meaningful?
Can you offer any suggestions for improving the slides or lectures?
Have you found the labs meaningful for the intended objectives (e.g., graphical presentation and probability)?
Can you offer any suggestions for improvement for the labs?
Survey
If you could have me go over one topic again what would it be?
In the integers from 1 to 10, what is your favorite number?
Statistics 400 - Lecture 7
The distribution of scores on a standardized test can be approximated
by a normal distribution with a mean of 500 and a standard deviation
of 100. Find the probability that a randomly selected student scores:
Over 650
Between 325 and 675
What proportion of students score better than 680?
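The three questions above can be checked numerically. This is a minimal sketch, assuming Python's standard-library `statistics.NormalDist` in place of a normal table; the variable names are illustrative only.

```python
from statistics import NormalDist

# Test scores modelled as normal with mean 500 and standard deviation 100.
scores = NormalDist(mu=500, sigma=100)

# P(X > 650): one minus the cumulative probability at 650.
p_over_650 = 1 - scores.cdf(650)

# P(325 < X < 675): difference of two cumulative probabilities.
p_between = scores.cdf(675) - scores.cdf(325)

# Proportion scoring better than 680.
p_over_680 = 1 - scores.cdf(680)

print(f"P(X > 650)       = {p_over_650:.4f}")
print(f"P(325 < X < 675) = {p_between:.4f}")
print(f"P(X > 680)       = {p_over_680:.4f}")
```

The same answers come from standardizing by hand, for example z = (650 - 500)/100 = 1.5 for the first question, and looking up the standard normal table.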
Checking Normality
Does a normal distribution reasonably approximate the distribution of
the data?
Can use a normal probability plot (or normal scores plot) to assess
normality
Plots sorted data versus percentiles of the standard normal distribution
If the data are normally distributed, the plot should display an approximately straight line
Example
It is felt that the distribution of scores on a standardized test can be
approximated by a normal distribution
To see if this is true, a random sample of 15 students’ scores is
taken
403 633 315 630 505
711 546 561 574 420
428 474 473 489 565
[Figure: normal probability plot of the 15 test scores. Axes: Test Score versus Normal Quantiles.]
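The coordinates of a normal probability plot for the 15 scores can be sketched as follows. The plotting positions (i - 0.375)/(n + 0.25) are an assumption (Blom's convention); the lecture does not say which convention it uses, and software packages differ. The correlation between the sorted data and the normal scores is added here as a rough numerical summary of how straight the plot is.

```python
from math import sqrt
from statistics import NormalDist, mean

scores = [403, 633, 315, 630, 505,
          711, 546, 561, 574, 420,
          428, 474, 473, 489, 565]

def normal_scores(n):
    """Standard normal quantiles at Blom plotting positions (assumed convention)."""
    std_normal = NormalDist()
    return [std_normal.inv_cdf((i - 0.375) / (n + 0.25)) for i in range(1, n + 1)]

def pearson_r(xs, ys):
    """Sample correlation coefficient between two equal-length lists."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

sorted_scores = sorted(scores)
z = normal_scores(len(scores))
r = pearson_r(z, sorted_scores)

# Each printed pair is one point on the plot: (normal quantile, test score).
for zi, xi in zip(z, sorted_scores):
    print(f"{zi:6.2f}  {xi}")
print(f"correlation = {r:.3f}")
```

A correlation close to 1 corresponds to points lying close to a straight line, which is what the plot should show if the normal approximation is reasonable.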
Sampling Distributions
A parameter is a numerical feature of a distribution or population
A statistic is a function of the sample data
Suppose you draw a sample and compute the value of a statistic
Suppose you draw another sample of the same size and compute
the value of the statistic
Would the 2 statistics be equal?
Use statistics to estimate parameters
Will the statistics be exactly equal to the parameter?
The observed value of the statistic depends on the sample
There will be variability in the values of the statistic over repeated
sampling
Probability distribution of a statistic is called the sampling
distribution (or distribution of the statistic)
Based on repeated random samples of the same size from the
population
In a random sample, the observations are independent and
identically distributed
Example
Large population is described by the probability distribution
x          0     3     12
P(X = x)   0.2   0.3   0.5
If a sample of size 2 is drawn, what is the sampling distribution
of the sample mean?
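One way to answer this is to list every ordered sample of size 2, compute its mean, and add up the probabilities. The sketch below does that with exact fractions; it assumes independent draws (the population is large), and the function name is illustrative only.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product

# Population distribution from the example: value -> probability.
population = {0: Fraction(2, 10), 3: Fraction(3, 10), 12: Fraction(5, 10)}

def sampling_distribution_of_mean(pmf, n):
    """Exact distribution of the sample mean for n independent draws from pmf."""
    dist = defaultdict(Fraction)
    for draws in product(pmf, repeat=n):
        prob = Fraction(1)
        for x in draws:
            prob *= pmf[x]          # independence: multiply the probabilities
        dist[Fraction(sum(draws), n)] += prob
    return dict(sorted(dist.items()))

dist = sampling_distribution_of_mean(population, 2)
for xbar, prob in dist.items():
    print(f"mean = {float(xbar):4.1f}   probability = {float(prob):.2f}")
```

The possible sample means are 0, 1.5, 3, 6, 7.5 and 12, with probabilities 0.04, 0.12, 0.09, 0.20, 0.30 and 0.25.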
Sampling Distribution of the
Sample Mean
Have a random sample of size n
The sample mean is
$\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$
What is it estimating?
Properties of the Sample Mean
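Expected value: $E(\bar{x}) = \mu$
Variance: $\mathrm{Var}(\bar{x}) = \sigma^2 / n$
Standard Deviation: $\mathrm{SD}(\bar{x}) = \sigma / \sqrt{n}$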
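The standard results for a random sample, E(x̄) = μ and Var(x̄) = σ²/n, can be checked by brute force on the small population from the earlier example (values 0, 3, 12 with probabilities 0.2, 0.3, 0.5). This is only a sketch for n = 2, using exact fractions so the comparison is not blurred by rounding.

```python
from fractions import Fraction
from itertools import product

pmf = {0: Fraction(1, 5), 3: Fraction(3, 10), 12: Fraction(1, 2)}
n = 2  # sample size used in the earlier example

# Population mean and variance.
mu = sum(x * p for x, p in pmf.items())
var = sum((x - mu) ** 2 * p for x, p in pmf.items())

# Moments of the sample mean, summing over every ordered sample of size n.
mean_of_xbar = Fraction(0)
second_moment = Fraction(0)
for draws in product(pmf, repeat=n):
    prob = Fraction(1)
    for x in draws:
        prob *= pmf[x]
    xbar = Fraction(sum(draws), n)
    mean_of_xbar += xbar * prob
    second_moment += xbar ** 2 * prob
var_of_xbar = second_moment - mean_of_xbar ** 2

print("mu =", mu, "  E(xbar) =", mean_of_xbar)
print("sigma^2 / n =", var / n, "  Var(xbar) =", var_of_xbar)
```

Here μ = 6.9 and σ² = 27.09, so the sample mean for n = 2 has mean 6.9 and variance 13.545.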
Sampling from a Normal Distribution
Suppose we have a random sample of size n from a
$N(\mu, \sigma)$ distribution
What is the distribution of the sample mean?
Example
Distribution of moisture content per pound of a dehydrated protein
concentrate is normally distributed with mean 3.5 and standard
deviation of 0.6.
A random sample of 36 specimens of this concentrate is taken
What is the distribution of the sample mean?
What is the probability that the sample mean is less than 3.5?
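When the population itself is normal, the sample mean is exactly normal with the same mean and standard deviation σ/√n. A short sketch for the moisture example, assuming `statistics.NormalDist` from the Python standard library:

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 3.5, 0.6, 36

# Distribution of the sample mean: normal, mean 3.5, sd 0.6 / sqrt(36) = 0.1.
xbar = NormalDist(mu, sigma / sqrt(n))

# 3.5 is the centre of this distribution, so half the probability lies below it.
p_below = xbar.cdf(3.5)

print(f"sd of sample mean = {xbar.stdev:.3f}")
print(f"P(xbar < 3.5)     = {p_below:.3f}")
```

No table lookup is really needed for this question: by symmetry of the normal distribution about its mean, the probability is 0.5.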
Central Limit Theorem
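In a random sample (iid sample) from any population with mean $\mu$
and standard deviation $\sigma$, when n is large, the distribution of the
sample mean $\bar{x}$ is approximately normal.
That is, $\bar{x}$ is approximately normal with mean $\mu$ and standard deviation $\sigma / \sqrt{n}$
Thus, $Z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}$ is approximately standard normal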
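A small simulation can make the theorem concrete. The sketch below is an illustration, not part of the lecture: it assumes an exponential population with mean 1 and standard deviation 1 (clearly not normal), a sample size of 50, and 2000 repeated samples, all chosen arbitrarily.

```python
import random
from math import sqrt
from statistics import mean, stdev

random.seed(7)  # fixed seed so the run is repeatable

mu, sigma = 1.0, 1.0   # mean and sd of the assumed exponential(1) population
n, reps = 50, 2000     # sample size and number of repeated samples

# Draw many samples of size n and record each sample mean.
sample_means = []
for _ in range(reps):
    sample = [random.expovariate(1.0) for _ in range(n)]
    sample_means.append(mean(sample))

# Standardize each sample mean: Z = (xbar - mu) / (sigma / sqrt(n)).
se = sigma / sqrt(n)
z_values = [(m - mu) / se for m in sample_means]

# If Z is roughly standard normal, about 95% of values fall within +/- 1.96.
share_within = sum(abs(z) < 1.96 for z in z_values) / reps

print(f"average of sample means = {mean(sample_means):.3f} (mu = {mu})")
print(f"sd of sample means      = {stdev(sample_means):.3f} (sigma/sqrt(n) = {se:.3f})")
print(f"share with |Z| < 1.96   = {share_within:.3f}")
```

The sample means should centre near μ, spread out by roughly σ/√n, and fall inside ±1.96 standard errors about 95% of the time, even though individual observations are strongly skewed.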
Implications
So, for random samples, if we have enough data, the sample mean is
approximately normally distributed, even if the data are not normally
distributed
If we have enough data, we can use the normal distribution to make
probability statements about $\bar{x}$
Example
A busy intersection has an average of 2.2 accidents per week with a
standard deviation of 1.4 accidents
Suppose you monitor this intersection for a given year, recording the
number of accidents per week.
The data take on integer values (0, 1, 2, ...), so the distribution of the
number of accidents is not normal.
What is the distribution of the mean number of accidents per week
based on a sample of 52 weeks of data?
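By the Central Limit Theorem, the mean of 52 weekly counts is approximately normal with mean 2.2 and standard deviation 1.4/√52. A minimal sketch of that calculation, treating the 52 weeks as a random sample as the example does:

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 2.2, 1.4, 52

# Standard deviation of the sample mean (the standard error).
se = sigma / sqrt(n)

# Approximate sampling distribution of the weekly mean over 52 weeks.
xbar = NormalDist(mu, se)

print(f"mean of xbar = {xbar.mean:.2f}")
print(f"sd of xbar   = {xbar.stdev:.4f}")
```

The standard deviation of the mean works out to about 0.194 accidents per week, much smaller than the 1.4 for a single week.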
Example
What is the approximate probability that $\bar{x}$ is less than 2?
What is the approximate probability that there are fewer than 100
accidents in a given year?
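Both questions reduce to statements about the weekly mean, since fewer than 100 accidents in 52 weeks is the same as a weekly mean below 100/52. The sketch below uses the normal approximation with mean 2.2 and standard deviation 1.4/√52; it applies no continuity correction, which is an assumption about how the lecture intends the approximation to be used.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 2.2, 1.4, 52

# Approximate distribution of the weekly mean over 52 weeks (CLT).
xbar = NormalDist(mu, sigma / sqrt(n))

# P(xbar < 2)
p_mean_below_2 = xbar.cdf(2)

# P(total < 100) = P(xbar < 100 / 52), with no continuity correction.
p_total_below_100 = xbar.cdf(100 / n)

print(f"P(xbar < 2)    = {p_mean_below_2:.4f}")
print(f"P(total < 100) = {p_total_below_100:.4f}")
```

The corresponding z-values are about -1.03 and -1.43, giving probabilities of roughly 0.15 and 0.08.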