Transcript Slide 1
Lesson 1-1
Introduction to
Statistics
You will learn to…
* define statistics
* define vocabulary associated with
statistics
The word statistics is
derived from the Latin
word status, meaning
“state.”
3 reasons for studying
statistics:
1) to understand results of
studies done
2) to be able to conduct
our own research
3) to become better
consumers and citizens
Buying a car
Medicine
Source:
Viagra.com
Car insurance
$842.30: 18-year-old male
$661.70: 18-year-old female
Stats in USC Colleges
• Required
  • College of Arts and Sciences – Science Departments
  • Moore School of Business
  • College of Education – any BS and Early Childhood
  • College of Engineering and Information Technology
  • College of Pharmacy
  • Arnold School of Public Health
• Optional
  • College of Arts and Sciences – Arts Departments
  • College of Education – any BA and Elementary
  • School of Music
Sample Schedule
USC School of Business
Third Semester
• ECON 221 Principles of Microeconomics
• MGSC 290 Computers in Business or
MGSC 291 Probability and Statistics
• MGMT 250/ENGL 463 Professional Communication or
ENGL 282-286 Fiction, drama, poetry, or American or British literature
• ACCT 225 Introduction to Financial Accounting
• Liberal Arts: Philosophy, history, political
science, geography, foreign language, etc.
USC Nursing:
General Education Requirements
• General education course selections must meet
University general education requirements.
• English: ENGL 101-102 or higher
• Social Sciences: Two courses from one of
these: sociology or psychology. One course
must cover life-span content.
• Analytical Reasoning: To be satisfied in one of
the following ways: 1) STAT 110 and MATH 122
or 2) STAT 110 and STAT 201
USC: Hotel, Restaurant, and
Tourism Management
a. MATH 122 or 141, plus an additional course
from PHIL 110 or 111, mathematics (at the next
higher level), computer science (above CSCE
101), or statistics
b. Two courses from one of the following fields: philosophy (110 and 111 only), computer
science (above CSCE 101), or statistics
Majors of USC Students
in one STAT 110
Nursing
HRTM
Early childhood
Nursing
French/Pub Rels
Broadcast Jour
Public Relations
Fashion Merch.
HRTM
Entertain. Mgt
Crim Just
Poli Sci & Cr J
Print Journalism
Early childhood
Early childhood
Broadcast Jour
Crim Just
Early childhood
Public Relations
Broadcast Jour
Nursing
HRTM
Poli Sci
Nursing
Jour/Mass Comm
Nursing
psych/premed
Business
Nursing
What is Statistics?
Collect, Organize, Analyze,
and Interpret Data
in order to
Make Decisions
Statistics can be Hocus-Pocus!
What is data?
Data consists of information
from observations, counts,
measurements, or responses.
examples: 5 ft, 98˚, 2 hrs, 165 lb,
male, 50 years old, 4 fat grams,
200 times at bat, 100,000 sold
Population
The collection of all things being
studied.
Sample
A subset of
the population.
[Diagram: X marks representing the members of a population and of a sample]
Heights of Ridge
View students
Heights of Ridge View
students taking
probability & statistics
1. Population? All RV students
2. Sample? RV prob & stats students
Population
Sample
RV students
RV seniors
math courses
all courses
students in this class
students in this class
Population: We get every measurement or count that is of interest.
Sample: We may get only partial information, but that might be the
most economical way to get info.
Time & Money
How many were surveyed?
1746
Who was surveyed?
university students
Explain “At home: 82%.”
82% of those surveyed like to snack at home
Why do the percents add up to more than 100%?
multiple answers were allowed
Explain “Cashiers; 3,262,120; $5.75.”
# of cashiers surveyed in 1996; median hourly pay rate
Who was surveyed?
employed American adults
Who did the surveying?
US Labor Department
The U.S. Department of Energy
conducts a survey of 800
gasoline stations to determine
the average price per gallon.
3. Identify the population.
all gas stations
4. Identify the sample.
800 gas stations
5. What does the data set consist of?
price per gallon ($)
A study of 33,043 infants in Italy was
conducted to find a link between a
heart rhythm abnormality and
sudden infant death syndrome.
6. Identify the population.
all infants
7. Identify the sample.
33,043 infants
8. What does the data set consist of?
heart rate in beats per minute
A survey of 546 women found that
more than 56% have the job of
paying bills in their household.
9. Identify the population.
all women
10. Identify the sample.
546 women
11. What does the data set consist of?
Yes or No – Are you the
primary person in your
household who pays the bills?
Parameter:
P
A numerical description of a population.
* data from the population
Statistic:
S
A numerical description of a sample.
* data from a sample
Is the fact a parameter or statistic?
12. The average income of all people
in the U.S. in 2002.
13. The average income of people from
three U.S. states in 2002.
14. A survey of a sample of workers
reported their starting salary
Statistic
Parameter
Is the fact a parameter or statistic?
15. Starting salaries for the 2005
graduates from USC
16. The number of students with
Cingular cell phone service in a
random check of classrooms
Parameter
Statistic
Parameters are fixed in value,
while statistics vary in value.
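A quick simulation is one way to see this. The Python sketch below is only an illustration: the population of 2,000 heights is made up, and the names population, parameter, and sample_means are not from the lesson. It computes one parameter from the whole population, then one statistic from each of five random samples.

```python
import random
from statistics import mean

rng = random.Random(1)  # fixed seed so the run is repeatable

# Hypothetical population: heights (in inches) of 2,000 students.
population = [rng.gauss(66, 4) for _ in range(2000)]

# Parameter: computed from every member of the population,
# so it is a single fixed number.
parameter = mean(population)

# Statistics: each one is computed from a different random
# sample of 50 students, so the values change from sample to sample.
sample_means = [mean(rng.sample(population, 50)) for _ in range(5)]

print(f"parameter (population mean): {parameter:.2f}")
for i, m in enumerate(sample_means, start=1):
    print(f"statistic (mean of sample {i}): {m:.2f}")
```

The parameter prints as one value, while the five sample means land near it but differ from one another.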
Two Branches of Statistics
1) Descriptive Statistics
* report the facts discovered
in the survey
2) Inferential Statistics
* use sample data to make
conclusions about an
entire population
* estimation, prediction, probability
Whole Population Available
Find the average height of women 18 - 24
POPULATION of women, N = 130,000,000
Collect Data
Describe Population
Descriptive Statistics
Whole Population NOT Available
Find the average height of women 18 - 24
POPULATION of women, N = 130,000,000
Take Sample
SAMPLE of women, n = 1000
Collect Data
Use sample to estimate description of population
Inferential Statistics
17. 1000 U.S. teens were surveyed.
72% of the girls and 58% of the
boys had after school jobs.
Descriptive statistics:
72% of the girls and 58% of the
boys had after school jobs.
Inferential statistics:
We predict that a higher percentage
of teen girls have after school jobs.
18. In a recent survey of 1000 adults,
47% said using a cell phone while
driving should be illegal.
Descriptive statistics:
47% of 1000 U.S. adults believed that using
a cell phone while driving should be illegal.
Inferential statistics:
Based on a recent survey, about half of the
population believe that using a cell phone
while driving should be illegal.
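The step from "47% of 1000" to "about half of the population" can be sketched with a little arithmetic. The Python below is a rough illustration, not part of the lesson: it assumes the 1000 adults were a simple random sample (the slide does not say) and uses the common normal approximation with z = 1.96 for an approximate 95% interval.

```python
import math

n = 1000       # adults surveyed (from the slide)
p_hat = 0.47   # sample proportion who said it should be illegal

# Descriptive: restate what was observed in the sample.
count_yes = round(n * p_hat)

# Inferential: approximate 95% interval for the population proportion.
# Assumption (not stated in the slide): a simple random sample.
z = 1.96
margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
low, high = p_hat - margin, p_hat + margin

print(f"Descriptive: {count_yes} of {n} adults ({p_hat:.0%}) said yes.")
print(f"Inferential: roughly {low:.1%} to {high:.1%} of all adults.")
```

Under those assumptions the margin is about 3 percentage points, so the interval reaches 50%, which is consistent with the slide's "about half."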
It’s Time to Practice!
Assignment 1.1
Lesson 1-2
Types of Data
You will learn to…
* classify data
* identify types of measurements
Qualitative data
Data that cannot be
measured or counted
characteristic or
categorical
Examples:
gender, favorite class,
religious preference, eye color, hair color,
geographical location, zip code
Quantitative data
Data that can be
measured or counted
numerical data
Examples:
age, heights, weights,
temperatures, grades,
time, money
Qualitative or Quantitative data?
1. ID numbers of the students
in this class
2. temperature each day
this week
3. jersey numbers of the players
on a team
4. vehicle models
5. price of vehicles
qualitative
quantitative
Nominal Data
> list of categories, names,
labels, or qualities
> order (rank) cannot be
assigned to the categories
Examples:
type of car you drive, your jersey number,
college you want to attend, eye color,
hair color, gender, zip code
Ordinal Data
> data that is ordered or
ranked
Examples:
race outcomes (1st,2nd,3rd),
grade (A,B,C,D),
top 5 sports teams,
rating (good, better, best)
Decide whether the data is
nominal or ordinal. Why?
1. highest level of education
2. marital status
3. zip code
4. rating for first impression of store
ordinal
nominal
Discrete Data
> countable
> usually integers only – no
decimals or fractions
Examples:
number of courses you are taking,
number of pairs of shoes you own,
number of CDs you own,
score at figure-skating competition (9.9, 9.5, 8.8, 10.0, 9.3),
cost of concert tickets
Continuous Data
> not countable
> weight or measurement
> time is continuous
Examples:
weight of a bookbag,
minutes it takes for you to get to school,
inches of rainfall
Decide whether the data is
continuous or discrete. Why?
1. students wearing blue jeans
2. height of students
3. money each student has
4. weight of each bag of M&Ms
continuous
discrete
variable
  qualitative: nominal, ordinal
  quantitative: discrete, continuous
Lesson 1-3
Statistical
Design
You will learn to…
* identify ways to collect data
* identify ways to get a sample of the
population for a study
The goal of every study is to
collect data and use it to
make a decision.
If the data collection
process is flawed, then
the results are not valid.
Designing a Statistical Study
1) identify data of interest
& identify population
2) develop a plan for collecting data
3) collect data
4) report descriptive statistics
5) report inferential statistics
6) identify any possible errors
Data Collection
1) Take a Census
(entire population)
2) Use a Sampling
(part of a population)
3) Create a Simulation
(reproduce conditions - crash dummies)
4) Conduct an Experiment
(study group & control group)
Which method of data collection?
1. the effect of changing flight patterns
on the number of airplane accidents: simulation
2. the effect of aspirin on preventing
heart attacks: experiment
3. the weights of all linemen in the
National Football League: census
4. U.S. residents’ approval rating of the
president: sample
Experiment
Everyone in class will look at a
picture. Without saying anything,
you will write down what you see
in the picture.
If you are sitting in seat 1-13,
close your eyes, cover your eyes,
or put your head down.
Do not say
anything.
Do not write
anything.
Just look at the
picture.
Without saying
anything,
write down
what you see
in the picture.
Watch this video.
In this video, 3 kids have white
shirts and 3 kids have black
shirts. Focus on the kids in
white and count the number of
times they pass the ball to a
different person.
When time and money
prevent you from
collecting data from the
entire population…
5 Sampling Techniques
(ways to choose a sample)
Random Sample:
> Each member of the population
has an equal chance of being
selected.
(heights of students)
Using the Calculator
>MATH > PRB > randInt(
randInt (begin, end, # in sample)
randInt (1, # in population, # in sample)
How can we all get the same data?
1 → rand (store 1 in rand to seed the random number generator)
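The same idea can be tried off the calculator. This Python sketch is an assumption-laden stand-in, not the calculator's method: the class size of 30 and sample size of 5 are made up, and Python's generator will not produce the same numbers as the calculator screen. It only shows that a shared seed gives everyone the same picks.

```python
import random

population_size = 30   # hypothetical class size
sample_size = 5        # hypothetical sample size

# Seeding plays the role of storing 1 in rand on the calculator:
# everyone who uses the same seed gets the same "random" picks.
rng = random.Random(1)

# Like randInt(1, population_size, sample_size): repeats are possible.
with_repeats = [rng.randint(1, population_size) for _ in range(sample_size)]

# If each student may be picked only once, sample without replacement.
no_repeats = rng.sample(range(1, population_size + 1), sample_size)

print(with_repeats)
print(no_repeats)
```

Running the script again prints the same two lists, because the seed is fixed.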
Stratified Sampling
> divide the population into
groups using some characteristic
> select a few members from
each group
Stratified Sampling
[Diagram: selected members (X) in each of three groups: Low Income, Middle Income, High Income]
Stratified Sampling
[Diagram: Freshmen, Sophomores, Juniors, and Seniors, shown with A Hall, B Hall, C Hall, and D Hall]
Proportional Stratified Sampling
[Diagram: three income groups. Low Income: 500 families; Middle Income and High Income: 2,000 families and 500 families]
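Proportional stratified sampling can be sketched in a few lines of Python. This is an illustration under assumptions: the pairing of 2,000 families with the middle-income group is a reading of the diagram, and the family IDs and the overall sample size of 60 are made up.

```python
import random

rng = random.Random(1)

# Strata sizes read from the slide's diagram; pairing the 2,000 figure
# with the middle-income group is an assumption.
strata_sizes = {"low income": 500, "middle income": 2000, "high income": 500}

# Hypothetical family IDs for each stratum.
strata = {
    name: [f"{name} #{i}" for i in range(1, size + 1)]
    for name, size in strata_sizes.items()
}

total = sum(strata_sizes.values())
sample_size = 60  # hypothetical overall sample size

sample = {}
for name, families in strata.items():
    # Each stratum contributes in proportion to its share of the population.
    k = round(sample_size * len(families) / total)
    sample[name] = rng.sample(families, k)

for name, picked in sample.items():
    print(name, len(picked))
```

With these numbers the strata contribute 10, 40, and 10 families. For other sizes the rounded counts may not add up exactly to the target, so a real study would need a rule for the leftovers.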
Cluster Sampling
> population is divided into groups
> use one group for the sample
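As a small sketch of the two steps above, the Python below picks one group at random and surveys everyone in it. The class periods and student names are hypothetical, chosen only to make the example run.

```python
import random

rng = random.Random(1)

# Hypothetical population already divided into groups (clusters).
clusters = {
    "1st period": ["Ana", "Ben", "Cal"],
    "2nd period": ["Dee", "Eli", "Fay"],
    "3rd period": ["Gus", "Hal", "Ivy"],
}

# Choose one whole group at random...
chosen = rng.choice(sorted(clusters))

# ...and use every member of that group as the sample.
sample = clusters[chosen]

print(chosen, sample)
```

Unlike stratified sampling, nobody outside the chosen group is included.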
Cluster Sampling
[Diagram: population divided into Low Income, Middle Income, and High Income groups]
Cluster Sampling
[Diagram: population divided into Freshmen, Sophomores, Juniors, and Seniors]
Systematic Sampling
> every nth number from the
population is included in
the sample
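A minimal Python sketch of this rule, assuming a made-up roster of 100 students and taking every 5th one from a random starting point:

```python
import random

rng = random.Random(1)

# Hypothetical roster of 100 students, in some fixed order.
students = [f"student {i}" for i in range(1, 101)]

n = 5  # take every 5th student

# Random starting point among the first n students.
start = rng.randrange(n)

# Slice with a step of n: start, start + n, start + 2n, ...
sample = students[start::n]

print(len(sample), sample[:3])
```

The random start keeps the first student on the list from always being chosen; after that, the selection is fully determined by the step.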
Systematic Sampling
Choose every 3rd household
[Diagram: selected households marked X]
Systematic Sampling
Choose every 5th student
Convenience Sampling
> subjects used because they
are convenient and available
> volunteer sampling
* telephone survey
* survey at a shopping center
Identify the sampling technique used
for each study.
5. select a class at random and question
each student in the class
6. divide the students by grade level and
question some students in each
grade level
cluster
stratified
Identify the sampling technique used
for each study.
7. assign each student a number,
generate random numbers, and
question each student whose number
is selected
8. choose a starting point and question
every 25th student
systematic
random
The commonly used sampling
methods that often result in
biased samples are:
• Volunteer sampling
• Convenience sampling
The Statistical Process:
1) identify population
2) plan investigation
3) collect & analyze data
4) describe sample
5) make inferences about population
6) identify possible errors
It’s Practice Time!
Assignment 1.3
Ch 1 Review
Assignment