Bivariate Analysis with Qualitative Data

Business Statistics
QM 2113 - Spring 2002
Bivariate Analyses
for Qualitative Data
Student Objectives
 Summarize regression analysis
– Interpret regression statistics
– Incorporate into report
– Address questions concerning homework
 Discuss why regression won’t work with
qualitative data
 Use crosstab approach for joint
frequency distributions
 Use PivotTable feature of Excel for
creating crosstabs
Let’s Wrap Up
Regression
 Complete example from previous class
 Review interpretations of regression
statistics
– Describe the relationship
– Assess the validity
 Summary of notation & terminology
 Address questions concerning the
homework
– Expectations
– Mechanics (e.g., copy/paste)
– Other . . . ?
Results of Analysis of
TV Time versus Age
 Note: using complete data set
 Results
b0 = 5.581 hours/week
b1 = 0.522 hours per year of age
R2 = 56%
Syx = 6.924 hours/week
 Correlation (r): a single, multipurpose
measure
– Square root of R2
– Same sign as b1
– r = +0.75
– Summarizes the estimated strength of the
relationship
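As a quick check on the figures above, r can be recovered from R2 and the sign of b1. This is a minimal sketch in Python rather than Excel (the course tool); the function name is illustrative, and the values are the ones reported on this slide.

```python
import math

def correlation_from_r_squared(r_squared, slope):
    """Return r: the square root of R2, carrying the same sign as the slope b1."""
    return math.copysign(math.sqrt(r_squared), slope)

# Values reported on the slide for TV Time versus Age
b1 = 0.522        # hours per year of age
r_squared = 0.56  # R2 = 56%

r = correlation_from_r_squared(r_squared, b1)
print(round(r, 2))  # 0.75
```

A negative slope with the same R2 would give r = -0.75, which is why the sign has to come from b1.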
Interpreting Regression
Analyses (a)
 Describing the relationship
– Intercept (b0):
• Base value for Y
• If it were possible for X to be 0, this is
what Y would be
– Slope (b1):
• How much Y changes when X changes 1 unit
• The sensitivity of Y to changes in X
(sometimes, the marginal value of X)
Interpreting Regression
Analyses (b)
 Validity
– R-Square (R2): we know Y varies, but
how much (i.e., what percentage) is
attributable to the variation in X?
– Standard error (Syx): if we used the
regression equation to predict Y, how
much, on the average, should we expect
to be wrong?
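The four basic statistics interpreted above can be computed directly from the data. The sketch below uses the standard least-squares formulas and assumes Syx is the standard error of the estimate with n - 2 degrees of freedom; the ages and TV hours are made-up illustration values, not the KIVZ data.

```python
import math

def simple_regression(x, y):
    """Return (b0, b1, r_squared, syx) for a least-squares line y = b0 + b1*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b1 = sxy / sxx                 # slope: change in Y per 1-unit change in X
    b0 = mean_y - b1 * mean_x      # intercept: value of Y when X is 0
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    sst = sum((yi - mean_y) ** 2 for yi in y)
    r_squared = 1 - sse / sst      # share of the variation in Y attributable to X
    syx = math.sqrt(sse / (n - 2)) # typical size of a prediction error
    return b0, b1, r_squared, syx

# Hypothetical data for illustration only
ages = [20, 30, 40, 50, 60]
tv_hours = [15, 22, 25, 33, 36]

b0, b1, r_squared, syx = simple_regression(ages, tv_hours)
print(f"b0={b0:.3f}  b1={b1:.3f}  R2={r_squared:.1%}  Syx={syx:.3f}")
```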
Questions About the
Homework?
 Which data:
– kivzdata.xls
– All households, not just Ch.7
 What analyses
– Univariate
• Include: histogram and descriptive stats
• Variables: TV Time, Income
– Bivariate
• Scatterplot (properly labeled)
• Regression statistics (the basic 4)
 The report
– Integrate charts with text
– Nontechnical language
 Other questions . . . ?
Regression, What Not
to Do
 Typical modeling errors
– Reverse Y and X
– Treat qualitative variables as
quantitative
 Using Excel shortcuts that create
inflexible worksheets
– Data analysis tool
– Plot trend line
Now, Recall Analysis
Depends on Data Type
 Univariate:
– Quantitative data: histograms, averages, etc.
– Qualitative data: bar charts, proportions
 Bivariate:
– Both variables quantitative
• Scatterplots
• Regression analysis
– Either or both variables qualitative
• Contingency tables, aka:
– PivotTables (Excel)
– Crosstabulations
• Chi-square analysis (beyond our scope)
Let’s Look at the
Website Analytics Case
 Pilot sample of major eCommerce sites
 Note Internet business models
– Virtual storefront (e.g., Amazon)
– Content provider (e.g., WSJ)
– Auction (e.g., eBay)
– Several others, but these are the top three
 Major decision common in business
– Make vs buy
– Apply to site development
 What’s the research question here?
Examining the Question
 Does “make vs buy” depend upon
type of business model?
 Start with simple frequency tables
 Frequency tables alone don’t tell us how
these variables are related
 Need to go further: crosstab
Crosstabs:
Many Flavors
 Joint frequency: basis for developing
the other three
 Joint relative frequency (% of total)
– Joint percentages
– Margin percentages (same as univariate %)
 Analyzing relationships
– Row percentage
– Column percentage
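The joint frequency table and its % of total version can be sketched in a few lines. This example uses only Python's standard library; the ten (development, model) observations are hypothetical and are not the Website Analytics data, and the function names are illustrative.

```python
from collections import Counter

# Hypothetical observations: (development, model) for each site
sites = [
    ("Make", "Storefront"), ("Make", "Storefront"), ("Buy", "Storefront"),
    ("Buy", "Content"), ("Buy", "Content"), ("Make", "Content"),
    ("Make", "Auction"), ("Make", "Auction"), ("Make", "Auction"),
    ("Buy", "Auction"),
]

def joint_frequency(pairs):
    """Count how many observations fall in each (row, column) cell."""
    return Counter(pairs)

def percent_of_total(joint):
    """Joint relative frequency: each cell as a percentage of all observations."""
    total = sum(joint.values())
    return {cell: 100 * count / total for cell, count in joint.items()}

def margin_percent(joint, axis):
    """Margin percentages for one variable (axis 0 = rows, axis 1 = columns)."""
    total = sum(joint.values())
    margins = Counter()
    for cell, count in joint.items():
        margins[cell[axis]] += count
    return {category: 100 * count / total for category, count in margins.items()}

joint = joint_frequency(sites)
joint_pct = percent_of_total(joint)
row_margins = margin_percent(joint, axis=0)

print(joint[("Make", "Auction")], joint_pct[("Make", "Auction")], row_margins["Make"])
```

The margin percentages match what a univariate frequency table of each variable would show.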
Crosstabs:
Relationships
 Relationship?
– If so, % of observations in given category
of primary variable should differ
substantially across categories of
explanatory variable
– That is, depending upon type of table,
• Row % values differ down a given column, or
• Column % values differ across a given row
 Easier to analyze
– With practice
– Using basic probability concepts
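To see what "differ across categories" looks like, the sketch below computes row or column percentages from raw observations. It assumes Development is the primary variable (rows) and Model the explanatory variable (columns); the data are hypothetical and the function name is illustrative.

```python
from collections import Counter

# Hypothetical (development, model) observations; not the Website Analytics data
sites = [
    ("Make", "Storefront"), ("Make", "Storefront"), ("Buy", "Storefront"),
    ("Buy", "Content"), ("Buy", "Content"), ("Make", "Content"),
    ("Make", "Auction"), ("Make", "Auction"), ("Make", "Auction"),
    ("Buy", "Auction"),
]

def conditional_percent(pairs, given_axis):
    """Percent of each cell within its category on given_axis.

    given_axis=0 conditions on the first variable (row percentages);
    given_axis=1 conditions on the second variable (column percentages).
    """
    joint = Counter(pairs)
    totals = Counter()
    for cell, count in joint.items():
        totals[cell[given_axis]] += count
    return {
        cell: 100 * count / totals[cell[given_axis]]
        for cell, count in joint.items()
    }

# Column percentages: share of Make vs Buy within each business model
column_pct = conditional_percent(sites, given_axis=1)
for model in ("Storefront", "Content", "Auction"):
    print(model, round(column_pct[("Make", model)], 1))
```

If the "Make" percentages printed for the three models are close to one another, there is little evidence of a relationship; if they differ substantially, the development decision appears to depend on the business model.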
Using Excel’s PivotTable
Feature for Crosstabs




 Select the data, including headings
 Click on Data | PivotTable
 Click twice on Next
 Click on Layout
– Drag Development to row
– Drag Model to column
– Drag either to data
– Double click on data button
• Select Count, then click on Options
• In Show Data As, select % of Total
• Click on OK
– Click on OK
 Click on Finish
Homework
 Complete the KIVZ analysis/report
 Development vs Model for WA case
– Try to create crosstabulation
– Think about whether a relationship
exists