Chapter 5: Regression
Review: Categorical Variables place individuals into one of several
groups or categories.
The values of a categorical variable are labels for the different
categories.
The distribution of a categorical variable lists the count or
percent of individuals who fall into each category.
When a dataset involves two categorical variables, we begin by
examining the counts or percents in various categories for one of the
variables.
Two-way Table – describes two categorical
variables, organizing counts according to a row
variable and a column variable.
Young adults by gender and chance of getting rich
                                 Female   Male   Total
Almost no chance                     96     98     194
Some chance, but probably not       426    286     712
A 50-50 chance                      696    720    1416
A good chance                       663    758    1421
Almost certain                      486    597    1083
Total                              2367   2459    4826
What are the variables described by this two-way table?
How many young adults were surveyed?
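As a minimal sketch of how the totals in this two-way table arise from the cell counts, the Python below stores the ten cells and adds them up by row and by column. The variable names are illustrative assumptions, not part of the original slides.

```python
# Cell counts from the two-way table above: response -> (female, male).
counts = {
    "Almost no chance": (96, 98),
    "Some chance, but probably not": (426, 286),
    "A 50-50 chance": (696, 720),
    "A good chance": (663, 758),
    "Almost certain": (486, 597),
}

# Row totals: one total per response category.
row_totals = {response: f + m for response, (f, m) in counts.items()}

# Column totals: one total per gender.
female_total = sum(f for f, _ in counts.values())
male_total = sum(m for _, m in counts.values())

# The table total is the number of young adults surveyed.
table_total = female_total + male_total

print(row_totals)
print(female_total, male_total, table_total)
```

The row variable is the response (chance of getting rich) and the column variable is gender; the table total is the sum of either set of marginal totals.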
Practice 1: Complete the following tables
1) Students are asked if they prefer to go swimming or to the gym.
Swimming
Boys
Gym
25
64
2) Some people are asked about their favorite outdoor sport.
Under 15’s
15 – 30
Over 30
Total
30
45
Girls
Total
Hiking
42
Total
100
Canoeing Climbing Total
150
30
47
100
15
75
300
Eat breakfast on a regular basis, by sex
                                        Sex
Eat breakfast on a regular basis   Male   Female   TOTALS
Yes                                 190      110      300
No                                  130      165      295
TOTALS                              320      275      595
The Marginal Distribution of one of the categorical
variables in a two-way table of counts is the distribution of
values of that variable among all individuals described by
the table.
Note: Percents are often more informative than counts,
especially when comparing groups of different sizes.
To examine a marginal distribution:
1. Use the data in the table to calculate the marginal
distribution (in percents) of the row or column totals.
2. Make a graph to display the marginal distribution.
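Step 1 can be sketched in Python using the counts from the breakfast-by-sex table earlier in this section. This is one possible way to do the arithmetic, not a prescribed method; the function name marginal_distribution is a hypothetical helper introduced for illustration.

```python
# Counts from the breakfast table: answer -> {sex: count}.
table = {
    "Yes": {"Male": 190, "Female": 110},
    "No": {"Male": 130, "Female": 165},
}

def marginal_distribution(totals):
    """Convert marginal totals to percents of the table total."""
    table_total = sum(totals.values())
    return {label: 100 * count / table_total for label, count in totals.items()}

# Row totals (breakfast answer) and column totals (sex).
row_totals = {answer: sum(by_sex.values()) for answer, by_sex in table.items()}
col_totals = {}
for by_sex in table.values():
    for sex, count in by_sex.items():
        col_totals[sex] = col_totals.get(sex, 0) + count

breakfast_marginal = marginal_distribution(row_totals)
sex_marginal = marginal_distribution(col_totals)

for label, pct in breakfast_marginal.items():
    print(f"{label}: {pct:.1f}%")
for label, pct in sex_marginal.items():
    print(f"{label}: {pct:.1f}%")
```

Each marginal distribution uses only the totals in the margins of the table, so it says nothing about how the two variables are related.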
“Age groups” is the categorical explanatory variable
“Education level” is the categorical response variable
[Figure: two-way table of education level by age group, with the variables and the marginal distributions labeled]
BPS
Chapter 6
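Marginal totals
Education level: 27,858 (not completed HS), 58,077 (graduated HS), 44,465 (1-3 yrs college), 44,828 (≥4 yrs college)
Age group: 37,786 (25–34), 81,435 (35–54), 56,008 (55 and over)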
marginal percent = marginal total / table total × 100%
Marginal distributions are used as background information only.
They do not address association.
% not completed HS = 27,859 / 175,230 × 100% = 15.9%
% graduated HS = 58,077 / 175,230 × 100% = 33.1%
% finished 1-3 yrs col. = 44,465 / 175,230 × 100% = 25.4%
% finished ≥4 yrs col. = 44,828 / 175,230 × 100% = 25.6%
% age 25–34 = 37,786 / 175,230 × 100% = 21.6%
% age 35–54 = 81,435 / 175,230 × 100% = 46.5%
% 55 and over = 56,008 / 175,230 × 100% = 32.0%
Young adults by gender and chance of getting rich
                                 Female   Male   Total
Almost no chance                     96     98     194
Some chance, but probably not       426    286     712
A 50-50 chance                      696    720    1416
A good chance                       663    758    1421
Almost certain                      486    597    1083
Total                              2367   2459    4826
Examine the marginal
distribution of chance of
getting rich.
Chance of being wealthy by age 30
Response            Percent
Almost no chance    194/4826 = 4.0%
Some chance         712/4826 = 14.8%
A 50-50 chance      1416/4826 = 29.3%
A good chance       1421/4826 = 29.4%
Almost certain      1083/4826 = 22.4%
[Bar graph: Percent (0 to 35) by Survey Response: Almost none, Some chance, 50-50 chance, Good chance, Almost certain]
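Making a graph of a marginal distribution (step 2 of the procedure) can be sketched without any plotting library by printing a text bar chart. The row totals are taken from the young adults table; the function text_bar_chart and its one-character-per-percent scale are assumptions made for this illustration.

```python
# Row totals (marginal counts) for "chance of getting rich".
marginal_counts = {
    "Almost no chance": 194,
    "Some chance": 712,
    "A 50-50 chance": 1416,
    "A good chance": 1421,
    "Almost certain": 1083,
}
table_total = sum(marginal_counts.values())

def text_bar_chart(counts, total):
    """Return one line per category: label, a bar of '#', and the percent."""
    lines = []
    for label, count in counts.items():
        pct = 100 * count / total
        bar = "#" * round(pct)  # one '#' per percentage point
        lines.append(f"{label:<18}{bar} {pct:.1f}%")
    return lines

chart = text_bar_chart(marginal_counts, table_total)
print("\n".join(chart))
```

The bar lengths make the pattern in the percents easy to see: the two middle-to-high categories are almost equal, and "Almost no chance" is by far the smallest.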
Practice 2:
Copy and complete the following two-way table
about ways of eating potato and then answer the questions.
            New   Chips   Mashed   Total
Boys                          34     100
Girls        25      37
Teachers     12               22      50
Total               104              250
1) How many boys liked mashed?
2) How many teachers preferred new potatoes?
3) How many girls were asked?
4) Out of the people who liked chips, how many were boys?
Practice 3: For the following two-way table answer the
questions on probability. A person is picked at random
from the sample.
            New   Chips   Mash   Total
Boys         15      51     34     100
Girls        25      37     38     100
Teachers     12      16     22      50
Total        52     104     94     250
1) What is the probability the person picked is a boy?
2) What is the probability the person liked mash?
3) What is the probability the person was a teacher who preferred new
potatoes?
4) What is the probability that, out of the girls, the person liked chips?
5) Out of the people who liked chips, what is the probability the person
was a boy?
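The three kinds of probability asked for here (from a margin, from a single cell, and within one row or column) can be sketched with exact fractions. This is an illustrative sketch only; the helper function names are hypothetical, and only questions 1, 3 and 4 are worked.

```python
from fractions import Fraction

# Cell counts from the table above: group -> {potato type: count}.
counts = {
    "Boys": {"New": 15, "Chips": 51, "Mash": 34},
    "Girls": {"New": 25, "Chips": 37, "Mash": 38},
    "Teachers": {"New": 12, "Chips": 16, "Mash": 22},
}
total = sum(sum(row.values()) for row in counts.values())

def p_group(group):
    """Probability a randomly picked person belongs to the group."""
    return Fraction(sum(counts[group].values()), total)

def p_group_and_choice(group, choice):
    """Probability the person is in the group AND made the choice."""
    return Fraction(counts[group][choice], total)

def p_choice_given_group(choice, group):
    """Probability of the choice, restricted to people in the group."""
    return Fraction(counts[group][choice], sum(counts[group].values()))

q1 = p_group("Boys")                        # question 1
q3 = p_group_and_choice("Teachers", "New")  # question 3
q4 = p_choice_given_group("Chips", "Girls") # question 4

print(q1, q3, q4)
```

The key difference is the denominator: questions about the whole sample divide by the table total, while "out of the ..." questions divide by the relevant row or column total.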
For the following table, copy and complete it, then answer the
questions about types of coffee bought.
Coffee is sold in three types and in three weights.
            100g   200g   300g   Total
Ground               50            120
Powder        80     35     26
Granules      40     45
Total        135           135     400
a) How many people bought 100g of powdered coffee?
b) How many people bought ground coffee?
c) Out of the packets weighing 200g, what is the probability the packet
bought contained granules?
Are you ready for the answers?
            100g   200g   300g   Total
Ground        15     50     55     120
Powder        80     35     26     141
Granules      40     45     54     139
Total        135    130    135     400
(a) 80
(b) 120
(c) 45/130
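Completing a two-way table relies on one rule: in any row or column, the total equals the sum of the other entries, so a single missing entry can always be found. The sketch below applies that rule repeatedly to the coffee table. The positions of the blank cells (None) are an assumption inferred from the completed table, and fill_line and complete are hypothetical helper names.

```python
from fractions import Fraction

# Rows: Ground, Powder, Granules, Total. Columns: 100g, 200g, 300g, Total.
# None marks a blank cell (blank positions are assumed, see lead-in).
table = [
    [None, 50, None, 120],
    [80, 35, 26, None],
    [40, 45, None, None],
    [135, None, 135, 400],
]

def fill_line(cells):
    """Fill a row or column in place if exactly one entry is missing.

    The last entry of `cells` is the total of the others.
    Returns True if a value was filled in.
    """
    missing = [i for i, v in enumerate(cells) if v is None]
    if len(missing) != 1:
        return False
    i = missing[0]
    last = len(cells) - 1
    if i == last:
        cells[i] = sum(cells[:last])
    else:
        known = sum(v for j, v in enumerate(cells[:last]) if j != i)
        cells[i] = cells[last] - known
    return True

def complete(table):
    """Repeat row and column passes until nothing more can be filled."""
    changed = True
    while changed:
        changed = False
        for row in table:
            if fill_line(row):
                changed = True
        for c in range(len(table[0])):
            column = [row[c] for row in table]
            if fill_line(column):
                for r, value in enumerate(column):
                    table[r][c] = value
                changed = True
    return table

complete(table)
answer_a = table[1][0]                        # 100g of powdered coffee
answer_b = table[0][3]                        # all ground coffee
answer_c = Fraction(table[2][1], table[3][1]) # granules among 200g packets
print(table)
print(answer_a, answer_b, answer_c)
```

Note that Fraction reduces 45/130 to its lowest terms, 9/26, which is the same probability as answer (c).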
                                 Has an iPod
Has been to Myrtle Beach    Yes    No    TOTALS
Yes
No
TOTALS
• A student has an iPod?
• A student has been to Myrtle Beach?
• A student does not have an iPod?
• A student has not been to Myrtle Beach?
• A student has an iPod and has been to Myrtle Beach?
• A student has an iPod or has been to Myrtle Beach?
• A student does not have an iPod or has not been to Myrtle Beach?
                                     Brown Hair
Hours watching TV last night    Yes    No    TOTALS
Less than 2 hours
2 or more hours
TOTALS
• A student has brown hair?
• A student spent less than 2 hours watching TV last night?
• A brunette student spent less than 2 hours watching TV last night?
• A student who was watching TV for 2 or more hours last night is brunette?
                                       Did Homework
Has correct supplies for class    Male    Female    TOTALS
Yes
No
TOTALS
• Write your own set of at least 5 questions you could
answer using the previous two-way table.