Lecture 3 slides


Introduction to summary statistics:
Sample mean & sample variance
Fred Boehm
Statistics 224
January 27, 2014
224 logistics

Website updates:

Revised office hours info
Alyssa & Huikun: 12-2pm today in MSC 1217c
Fred: 6:30-8pm today in Wendt Library (room 129)
Electronic survey

Respond by 6pm tonight (Jan 27)

Completion time: ~ 3 minutes

Email me if you can't find the email with the hyperlink
Homework 1 due Wednesday at 11am in class
Lecture overview


Key terms in statistics

Statistic & Parameter

Random variable
Measures of central tendency

Sample mean as a statistic related to central tendency
Measures of spread

Sample variance as a statistic related to spread of data
Coin flip examples
Statistic vs. Parameter

Statistic – an observed value, or a function of observed values

Coin Flip Example:

For ten coin flips, what is the number of heads?

Parameter – an unknown, underlying value that affects the observed outcomes

Coin Flip Example:

Is the coin fair? In other words, is the probability of observing heads equal to 0.5?
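
A minimal Python sketch of the distinction, assuming a simulated coin. The name `count_heads`, the seed, and the value of `P_HEADS` are illustrative choices, not part of the slides. In a simulation we have to pick the parameter ourselves; with a real coin it stays unknown and only the statistic is observed.

```python
import random

def count_heads(n_flips, p_heads, rng):
    """Simulate n_flips coin flips and return the observed number of heads."""
    return sum(1 for _ in range(n_flips) if rng.random() < p_heads)

# Parameter: unknown for a real coin. It is fixed here only because a
# simulation needs some value to generate flips (an assumption).
P_HEADS = 0.5

rng = random.Random(224)               # arbitrary seed so the run is repeatable
heads = count_heads(10, P_HEADS, rng)  # statistic: computed from the observed flips

print("heads in 10 flips:", heads)
print("observed proportion of heads:", heads / 10)
```

Changing the seed changes the statistic from run to run, while the parameter stays the same.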
Random variable


Technical definitions use notions from probability theory
For our purposes, we may think of a random variable as an outcome that has more than one possible value

Random variable example: a coin flip

Two possible outcomes (heads or tails)
What is a “sample mean”?

A statistic (function of observed data)
Intuitively, the 'center' point of your observations
Mathematically, the “average” of your observed values (their sum divided by the number of values)

Written as X with a bar above it

Pronounced “X bar”
Batting Average in Baseball

Baseball batting average (AVG) = hits / at-bats

What is the maximum possible value of AVG?

What is the minimum possible value of AVG?
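
One way to explore the two questions above is to treat a batting average as a sample mean of 0/1 outcomes, using the standard definition of hits divided by at-bats. The sketch below is in Python; the function name `batting_average` and the at-bat records are made up for illustration.

```python
def batting_average(outcomes):
    """Return hits / at-bats, where each at-bat is coded 1 (hit) or 0 (no hit)."""
    if not outcomes:
        raise ValueError("need at least one at-bat")
    return sum(outcomes) / len(outcomes)

# Hypothetical at-bat records, invented for illustration.
typical = [1, 0, 0, 1, 0, 0, 0, 1, 0, 0]

print(batting_average(typical))        # 3 hits in 10 at-bats
print(batting_average([1, 1, 1, 1]))   # a hit in every at-bat
print(batting_average([0, 0, 0, 0]))   # no hits at all
```

Because every outcome is 0 or 1, the average cannot fall outside the range from 0 to 1.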
Sample mean, continued

Coin flips example

Repeat coin flips and record outcomes
Coin Flips Activity




Each student flips the penny 5 times
Record the number of heads (between zero and five)
Show of hands for each value of number of heads
Plot the data (as a histogram) in R
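
The slides plot the class data in R. As a rough stand-in, the Python sketch below simulates the activity and prints a text histogram. The class size, the seed, and the assumption that every penny is fair are all hypothetical.

```python
import random
from collections import Counter

rng = random.Random(27)   # arbitrary seed so the run is repeatable
N_STUDENTS = 40           # hypothetical class size

# Each simulated student flips a fair coin 5 times and reports the number of heads.
heads_per_student = [
    sum(rng.random() < 0.5 for _ in range(5)) for _ in range(N_STUDENTS)
]

# The "show of hands": how many students saw 0, 1, ..., 5 heads.
tally = Counter(heads_per_student)

for k in range(6):
    print(f"{k} heads | {'#' * tally[k]} ({tally[k]})")
```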
Coin Flips Activity, continued

Do you think that your coin is fair?

Why?


What might you do to better assess the fairness of your coin?
Turn to your neighbor to discuss these three questions
Sample variance



Tells you about the 'spread' of the data
Larger sample variance corresponds to data being more spread out
Mathematically, one definition is:
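
The slide's formula is not reproduced in this transcript. One common definition, assumed here, divides the sum of squared deviations from the sample mean by n - 1. A minimal Python sketch with made-up data:

```python
def sample_mean(xs):
    """Sum of the observations divided by the number of observations."""
    return sum(xs) / len(xs)

def sample_variance(xs):
    """Sum of squared deviations from the sample mean, divided by n - 1."""
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two observations")
    xbar = sample_mean(xs)
    return sum((x - xbar) ** 2 for x in xs) / (n - 1)

tight = [9, 10, 10, 11]   # made-up data clustered near 10
wide = [0, 5, 15, 20]     # made-up data with the same mean, more spread out

print(sample_mean(tight), sample_variance(tight))
print(sample_mean(wide), sample_variance(wide))
```

Both lists have the same sample mean, but the more spread-out list has the larger sample variance. Some texts divide by n instead of n - 1, which gives a slightly smaller number.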
Sample variance & coin flips

You've already flipped your penny 5 times

You recorded the number of heads that you saw

Calculate, from your five flips:

Sample mean = Xbar = (number of heads)/(number of flips)
Sample variance & coin flips, continued

Now, calculate the sample variance from your five flips
Compare your sample variance with those of your neighbors
Should you have the same sample variance as your neighbors?
Should you be surprised if you and your neighbor have the same sample variance?
Why?
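
To think about these questions, it may help to list the sample variance for every possible result of five flips. The Python sketch below codes heads as 1 and tails as 0 and assumes the n - 1 definition of sample variance; the names are illustrative.

```python
from fractions import Fraction

def sample_variance(xs):
    """Sample variance with the n - 1 denominator (an assumed definition)."""
    n = len(xs)
    xbar = Fraction(sum(xs), n)
    return sum((x - xbar) ** 2 for x in xs) / (n - 1)

N_FLIPS = 5
variance_by_heads = {}
for heads in range(N_FLIPS + 1):
    flips = [1] * heads + [0] * (N_FLIPS - heads)   # 1 = heads, 0 = tails
    variance_by_heads[heads] = sample_variance(flips)
    print(heads, "heads: sample mean =", Fraction(heads, N_FLIPS),
          " sample variance =", variance_by_heads[heads])
```

Under this coding the sample variance depends only on the number of heads, so the six possible results give only a few distinct values.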
Histograms of three random samples
Black: Variance=100; Sample variance=89
Red: Variance=16; Sample variance=17
Green: Variance=1; Sample variance=0.95
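
A sketch of how samples like these might be generated, in Python. The slides do not say which distribution or sample size was used, so the normal distribution, the mean of 0, the sample size, and the seed below are all assumptions, and the printed values will not match the figure.

```python
import random
import statistics

rng = random.Random(2014)  # arbitrary seed so the run is repeatable
N = 200                    # assumed sample size

results = {}
for true_variance in (100, 16, 1):
    sd = true_variance ** 0.5                      # standard deviation = sqrt(variance)
    sample = [rng.gauss(0, sd) for _ in range(N)]  # assumed: normal with mean 0
    results[true_variance] = statistics.variance(sample)  # n - 1 denominator
    print(f"variance = {true_variance:>3}   sample variance = {results[true_variance]:.2f}")
```

The sample variance is a statistic, so it lands near the underlying variance without matching it exactly, and it changes from sample to sample.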

Sibling count histogram
How do we get the sample mean from a histogram?

What is (approximately) the sample mean here?

 1.8

Data from the Stockholm Birth Cohort Study: http://www.stockholmbirthcohort.su.se/
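
One way to get a sample mean from a histogram of counts is a weighted average: multiply each value by the height of its bar, add these up, and divide by the total count. The Python sketch below uses hypothetical bar heights, not the Stockholm Birth Cohort data; `mean_from_counts` is an illustrative name.

```python
def mean_from_counts(counts):
    """Weighted average: sum(value * count) / sum(count)."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("histogram is empty")
    return sum(value * count for value, count in counts.items()) / total

# Hypothetical bar heights (number of siblings -> number of people).
# These are NOT the Stockholm Birth Cohort counts.
sibling_counts = {0: 10, 1: 30, 2: 25, 3: 10, 4: 5}

print(mean_from_counts(sibling_counts))
```

Reading bar heights off a plot by eye gives only approximate counts, so the result is an approximate sample mean.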
Lecture overview


Key terms in statistics

Statistic & Parameter

Random variable
Measures of central tendency


Sample mean as a statistic related to central tendency
Measures of spread

Sample variance as a statistic related to spread of data

Coin flip examples

Guessing a sample mean from a histogram