Psyc 235: Introduction to Statistics

http://www.psych.uiuc.edu/~jrfinley/p235/
DON’T FORGET TO SIGN IN FOR CREDIT!
About the Graded Assessment…
• Number one predictor of performance on the assessment:
  - How much of the content you’ve covered.
• Importance of time on ALEKS:
  - Helps provide a measure to pace yourself
  - Keeps you on track for the option of an extra credit final
• However! Your grade is based on how much of the content you’ve learned.
• You need to keep up with the content goals!
Trouble meeting content goals?
• All content goals are listed on the syllabus.
(Available on course webpage)
• Please attend office hours and lab.
• We are here to help!
• Special Invited Lectures:
  - Mandatory for invited students
  - Will cover topics that are giving folks trouble
  - Expect notices in the next couple of weeks
Concerned about assessment grade?
• Catch up on content as soon as possible
• Remember the final extra credit option
• Feel free to contact us for more specific
advice.
Moving Forward:
• Mid-course evaluation forms soon.
• Suggestions for course, lecture, lab
format.
Data World vs. Theory World
• Theory World: idealization of reality (idealization of what you might expect from a simple experiment)
  - POPULATION
  - parameter: a number that describes the population; fixed but usually unknown
• Data World: data that results from an actual simple experiment
  - SAMPLE
  - statistic: a number that describes the sample (ex: mean, standard deviation, sum, ...)
Last Week…
• Binomial:
  - n: # of independent trials
  - p: probability of “success”
  - q: probability of “failure” (q = 1 - p)
  - X = # of the n trials that are “successes”
  - Mean: $\mu_x = np$
  - Standard deviation: $\sigma_x = \sqrt{np(1-p)}$

Binomial Probability Formula
$$P(X = k) = \binom{n}{k}\, p^{k} (1-p)^{n-k}$$
  - $P(X = k)$: P(exactly k successes), where X is the Binomial Random Variable
  - k: the specific # of successes you could get
  - p: probability of success
  - n - k: the specific # of failures
  - 1 - p: probability of failure
  - $\binom{n}{k}$: the combination called the Binomial Coefficient:
$$\binom{n}{k} = \frac{n!}{k!\,(n-k)!}$$
Note for P(X ≥ k):
Sum P(X = j) for each j in the range k, k+1, …, n.
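A minimal sketch of the binomial probability formula and the "sum over the range" note, assuming Python 3.8+ for `math.comb`. The function names `binomial_pmf` and `binomial_at_least` are made up for illustration and are not from the lecture or ALEKS.

```python
from math import comb


def binomial_pmf(k: int, n: int, p: float) -> float:
    """P(X = k): probability of exactly k successes in n independent trials."""
    # comb(n, k) is the binomial coefficient n! / (k! (n - k)!)
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def binomial_at_least(k: int, n: int, p: float) -> float:
    """P(X >= k): add up P(X = j) for every j from k through n."""
    return sum(binomial_pmf(j, n, p) for j in range(k, n + 1))


if __name__ == "__main__":
    n, p = 10, 0.5  # illustrative values: ten tosses of a fair coin
    print(f"P(X = 5)  = {binomial_pmf(5, n, p):.4f}")
    print(f"P(X >= 8) = {binomial_at_least(8, n, p):.4f}")
```

For ten fair tosses, exactly 5 heads has probability 252/1024, and 8 or more heads has probability 56/1024.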
Jason’s Coin Toss Demo:
Bernoulli Trial: one coin toss
  - Success = Heads
  - p = .5
Population: Outcomes of all possible coin tosses (for a fair coin)
Sample: 10 tosses, n = 10 (sample size)
X = # of successes in the sample
[Figure: Sampling Distribution. Probability (y-axis, 0 to 0.3) of each # of successes (x-axis, 0 to 10).]
Jason’s Coin Toss Demo:
And, we can use the formulas we’ve learned to calculate the population parameters for the sampling distribution:
  - $\mu_x = np = 10 \times .5 = 5$
  - $\sigma_x = \sqrt{np(1-p)} \approx 1.58$
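One way to check those two parameters is to repeat the 10-toss experiment many times on a computer and compare the simulated results to np and √(np(1-p)). This is only a sketch: the function name, the number of repetitions, and the random seed are arbitrary choices, not part of the in-class demo.

```python
import random
from math import sqrt
from statistics import mean, pstdev


def simulate_success_counts(n: int, p: float, repetitions: int, seed: int = 0) -> list:
    """Repeat the n-toss experiment; return the # of successes from each repetition."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    return [sum(rng.random() < p for _ in range(n)) for _ in range(repetitions)]


n, p = 10, 0.5
theory_mean = n * p                # np
theory_sd = sqrt(n * p * (1 - p))  # sqrt(np(1-p))

if __name__ == "__main__":
    counts = simulate_success_counts(n, p, repetitions=20_000)
    print(f"theory:     mean = {theory_mean:.2f}, sd = {theory_sd:.2f}")
    print(f"simulation: mean = {mean(counts):.2f}, sd = {pstdev(counts):.2f}")
```

The simulated values should land close to the theoretical ones, and closer still as the number of repetitions grows.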
With different sample sizes, you all discovered something interesting…
[Figures: Binomial Distribution with p=.5 for n=5, n=10, n=20, n=50, and n=100. Each panel plots probability (y-axis) against # of successes (x-axis).]
With large n, the binomial distribution starts to look like a normal distribution!
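The "starts to look like a normal distribution" observation can also be checked numerically. The sketch below compares each binomial probability to the height of a normal curve with the same mean and standard deviation, and reports the largest gap. The helper names and the choice of "largest gap" as the measure of closeness are assumptions made for illustration.

```python
from math import comb, exp, pi, sqrt


def binomial_pmf(k: int, n: int, p: float) -> float:
    """P(X = k) for a binomial random variable."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def normal_pdf(x: float, mu: float, sigma: float) -> float:
    """Height of the normal curve with mean mu and standard deviation sigma at x."""
    return exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * sqrt(2 * pi))


def max_gap(n: int, p: float = 0.5) -> float:
    """Largest |binomial probability - normal curve height| over k = 0..n."""
    mu, sigma = n * p, sqrt(n * p * (1 - p))
    return max(abs(binomial_pmf(k, n, p) - normal_pdf(k, mu, sigma)) for k in range(n + 1))


if __name__ == "__main__":
    for n in (5, 10, 20, 50, 100):
        print(f"n = {n:3d}: largest gap = {max_gap(n):.5f}")
```

The gap shrinks as n goes up, which is the same pattern the plots show.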
What is a Normal Distribution?
• Class of distributions with the same overall shape
• Continuous probability distribution
• Defined by two parameters:
  - mean: μ
  - stdev: σ
Special: Standard Normal Distribution
Standard Normal Distribution
• A distribution of z-scores (standardized scores).
• Scores derived by: $z = \frac{x - \mu}{\sigma}$
• Allows comparisons of scores from different normal distributions
Note: μ = 0, σ = 1
Note: Link between area and p(x)
Note also: +1 unit equals +1σ
Probability & Standardizing Scores
• The standard normal distribution allows us to easily
calculate probabilities for any normal distribution:
• Example: Say we know that checking account balances for UIUC students are normally distributed with a mean of $150 and a standard deviation of $125.
• What is the probability of a randomly selected student having a balance of…
  - more than $250?
  - less than $0?
  - between $100 and $200?
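A sketch of how the three questions could be worked out by standardizing each balance and looking up the area under the standard normal curve. It uses `statistics.NormalDist` from the Python standard library (3.8+) in place of the ALEKS calculator; the variable names are illustrative.

```python
from statistics import NormalDist

MU, SIGMA = 150.0, 125.0     # mean and standard deviation from the example
standard_normal = NormalDist()  # mean 0, standard deviation 1


def z_score(x: float) -> float:
    """Standardize a balance: z = (x - mu) / sigma."""
    return (x - MU) / SIGMA


# Area to the right of $250 (z = 0.8)
p_more_than_250 = 1 - standard_normal.cdf(z_score(250))
# Area to the left of $0 (z = -1.2)
p_less_than_0 = standard_normal.cdf(z_score(0))
# Area between $100 and $200 (z = -0.4 to z = 0.4)
p_between_100_and_200 = standard_normal.cdf(z_score(200)) - standard_normal.cdf(z_score(100))

if __name__ == "__main__":
    print(f"P(balance > 250)       = {p_more_than_250:.4f}")
    print(f"P(balance < 0)         = {p_less_than_0:.4f}")
    print(f"P(100 < balance < 200) = {p_between_100_and_200:.4f}")
```

The results come out to roughly 0.21, 0.12, and 0.31.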
Why do we care so much about
Normal Distributions?
• What happened to the binomial distribution as n
increased?
Central Limit Theorem
As the sample size n increases, the distribution of the sample average approaches the normal distribution with mean µ and variance σ²/n, irrespective of the shape of the original distribution (provided its variance σ² is finite).
Wait. What?
• Example: Rolling one die, multiple dice…
• http://www.stat.sc.edu/~west/javahtml/CLT.html
• So, just like flipping the coin, the distribution of the sum of n observations, taken over many samples, approaches the normal.
• Since the mean of a sample is the sum of all observations divided by n (where n is constant for all samples), this same principle applies to the sample mean.
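The dice example can be sketched as a small simulation: roll n dice, record the average, and repeat many times. A single fair die has mean 3.5 and variance 35/12, so the central limit theorem suggests the averages should center on 3.5 with variance near (35/12)/n. The function name, repetition count, and seed are arbitrary illustrative choices.

```python
import random
from statistics import mean, pvariance


def sample_means(n_dice: int, repetitions: int, seed: int = 0) -> list:
    """Roll n_dice fair dice, take the average, and repeat."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    return [
        sum(rng.randint(1, 6) for _ in range(n_dice)) / n_dice
        for _ in range(repetitions)
    ]


DIE_MEAN = 3.5
DIE_VARIANCE = 35 / 12  # variance of a single fair six-sided die

if __name__ == "__main__":
    for n in (1, 2, 5, 30):
        means = sample_means(n, repetitions=5_000)
        print(
            f"n = {n:2d}: mean of averages = {mean(means):.3f}, "
            f"variance = {pvariance(means):.4f} (theory {DIE_VARIANCE / n:.4f})"
        )
```

A histogram of the averages would look flat for n = 1 and increasingly bell-shaped as n grows.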
Hmm. Ok…
• But, does the underlying distribution really not matter?
  - http://intuitor.com/statistics/CentralLim.html
• Note that the size of n slightly changed the shape of the normal distribution.
• Also, note that the central limit theorem stated the mean was µ and the variance σ²/n (so stdev = σ/√n)
• The variance is a little different than before, isn’t it?
T distributions
• The normal distribution is a better approximation for a sampling distribution as n increases. To adjust for smaller n, we have the t distribution…
So, the t distribution varies depending on the number of degrees of freedom (n-1).
With lower n, the t distribution is more spread out. This means that getting more extreme values is more probable with low n.
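The "more spread out with lower n" point can be illustrated by simulation. The sketch below draws many samples from a standard normal population, computes the t statistic for each using the sample standard deviation, and counts how often it lands beyond ±1.96 (the cutoff that leaves about 5% in the tails of a normal curve). The function name, cutoff, repetition count, and seed are illustrative assumptions.

```python
import random
from math import sqrt


def extreme_t_fraction(n: int, repetitions: int = 10_000,
                       cutoff: float = 1.96, seed: int = 0) -> float:
    """Fraction of samples of size n whose t statistic falls beyond +/- cutoff."""
    rng = random.Random(seed)
    extreme = 0
    for _ in range(repetitions):
        sample = [rng.gauss(0, 1) for _ in range(n)]  # true mean is 0
        sample_mean = sum(sample) / n
        # Sample standard deviation uses n - 1 (the degrees of freedom)
        sample_sd = sqrt(sum((x - sample_mean) ** 2 for x in sample) / (n - 1))
        t = sample_mean / (sample_sd / sqrt(n))
        if abs(t) > cutoff:
            extreme += 1
    return extreme / repetitions


if __name__ == "__main__":
    for n in (5, 50):
        print(f"n = {n:2d}: fraction beyond +/-1.96 = {extreme_t_fraction(n):.3f}")
```

With n = 5 the extreme values show up noticeably more often than the 5% a normal curve would give; with n = 50 the fraction is much closer to 5%.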
So what good does that do us, anyway?
• Because we can assume that a sampling
distribution will be approximately normal
with a large n, we can use this distribution
to estimate the probability of obtaining a
given sample.
Example:
(aka excuse to show pictures of my
dog)
A large dog shelter in Chicago wants to
increase awareness of the adorable
pups they have for adoption by bringing
some dogs to a local festival. They
have 50 people who have volunteered
to walk the dogs around the festival. In
the shelter there are several hundred
dogs. The shelter knows that on
average their dogs have a 14 point
adoptability score (combination of things
like behavior, training, breeding,
cuteness, etc.), and the scores tend to
vary by about 3. The shelter would
prefer to show dogs that have an
average of at least a 16 adoptability
score. Should they go through all the
dogs and select 50 by hand, or are they
likely to get a group with this average by
chance?
Notice that we don’t know what the
underlying distribution of adoptability
scores looks like at this shelter, but
because of CLT we can still come up
with an answer.
What information is important here?
µ = 14 (population mean)
σ = 3 (population standard deviation)
X̄ = 16 (target sample mean)
n = 50 (sample size)
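A sketch of how those four numbers could be turned into an answer, assuming the 50 dogs are a simple random sample and that n = 50 is large enough for the sampling distribution of the mean to be approximately normal. It also ignores the fact that the shelter is a finite population sampled without replacement. Variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu = 14.0           # average adoptability score in the shelter
sigma = 3.0         # how much the scores tend to vary
n = 50              # number of dogs brought to the festival
target_mean = 16.0  # average score the shelter would like the group to have

# Standard deviation of the sampling distribution of the mean: sigma / sqrt(n)
standard_error = sigma / sqrt(n)

# How many standard errors the target sits above the population mean
z = (target_mean - mu) / standard_error

# Probability that a random group of 50 averages at least 16
p_at_least_target = 1 - NormalDist().cdf(z)

if __name__ == "__main__":
    print(f"standard error = {standard_error:.3f}")
    print(f"z = {z:.2f}")
    print(f"P(sample mean >= 16) = {p_at_least_target:.2e}")
```

The target is more than 4.7 standard errors above the mean, so under these assumptions a random group almost never averages 16 or higher; the shelter would need to select the dogs by hand.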
A couple more distributions
• There are 2 more distributions that we will need
later.
• ALEKS is familiarizing you with them now so that you know how to use the calculators etc. when they come up.
• Generally, you should know:
  - Shape of the distribution
  - How to use the distribution practically (at this point this means using the ALEKS calculator to find the probability of a given value in a distribution), so don’t worry
  - Vague concept of what the distribution means
Chi Square (χ²) Distribution
Distribution of the sum of k squared independent standard normal variables,
where k is the degrees of freedom (the number of squared terms in the sum)
This is useful because later
when we’re comparing multiple
distributions, we will want to
determine whether two
distributions are the same thing
added together or are actually
two separate distributions.
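To get a feel for the shape, the definition can be simulated directly: draw k standard normal values, square them, and add them up. A chi square distribution with k degrees of freedom has mean k and variance 2k, which gives something to check the simulation against. The function name, repetition count, and seed are illustrative assumptions.

```python
import random
from statistics import mean, pvariance


def chi_square_draws(k: int, repetitions: int = 20_000, seed: int = 0) -> list:
    """Each draw is the sum of k squared standard normal values."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    return [sum(rng.gauss(0, 1) ** 2 for _ in range(k)) for _ in range(repetitions)]


if __name__ == "__main__":
    for k in (1, 3, 10):
        draws = chi_square_draws(k)
        print(
            f"k = {k:2d}: mean = {mean(draws):.2f} (theory {k}), "
            f"variance = {pvariance(draws):.2f} (theory {2 * k})"
        )
```

Because every term is squared, the draws are never negative, and the distribution is skewed to the right, especially for small k.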
F distribution
Distribution of the variance of one sample from a normally distributed
population divided by the variance of another.
This will be useful later when we want to test if there is
more variance within a group than across groups
(ANOVA)… if there is greater within-group variance,
then it’s unlikely that the groupings are meaningful.
d1 is degrees of freedom of the top (numerator) distribution
d2 is degrees of freedom for the bottom (denominator) distribution
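The ratio-of-variances idea can be sketched the same way: draw two samples from the same normal population, compute each sample variance, and divide. With sample sizes n1 and n2 the degrees of freedom are d1 = n1 - 1 and d2 = n2 - 1, and for d2 > 2 the mean of the F distribution is d2/(d2 - 2). The function names, sample sizes, repetition count, and seed are illustrative assumptions.

```python
import random
from statistics import mean


def sample_variance(values: list) -> float:
    """Sample variance with n - 1 in the denominator."""
    m = sum(values) / len(values)
    return sum((x - m) ** 2 for x in values) / (len(values) - 1)


def f_ratio_draws(n1: int, n2: int, repetitions: int = 20_000, seed: int = 0) -> list:
    """Each draw is (variance of sample 1) / (variance of sample 2)."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    draws = []
    for _ in range(repetitions):
        top = sample_variance([rng.gauss(0, 1) for _ in range(n1)])
        bottom = sample_variance([rng.gauss(0, 1) for _ in range(n2)])
        draws.append(top / bottom)
    return draws


if __name__ == "__main__":
    n1, n2 = 5, 10           # so d1 = 4 and d2 = 9
    d2 = n2 - 1
    draws = f_ratio_draws(n1, n2)
    print(f"simulated mean of F = {mean(draws):.3f}, theory = {d2 / (d2 - 2):.3f}")
```

When both samples really come from the same population, the ratio tends to hover around 1; ratios far above 1 are what a later test would treat as evidence of a real difference.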
Next Week
• Keep up with the content goals
• Watch for an email about course
evaluations/suggestions
• Please let us know if you want or need
help
• If you’ve fallen behind, expect to be
contacted by email.
• Have a good week everyone!