Transcript Sample

Business Statistics:
A Decision-Making Approach
7th Edition
Chapter 1
The Where, Why, and How of
Data Collection
What is Statistics?

Statistics is the development and application of methods
to collect, analyze and interpret data.

Statistics is a discipline which is concerned with:




designing experiments and other data collection
summarizing information to aid understanding
drawing conclusions from data
estimating the present or predicting the future.
Populations and Samples

A Population is the set of all items or individuals
of interest


Examples:
All likely voters in the next election
All parts produced today
All sales receipts for November
A Sample is a subset of the population

Examples:
1000 voters selected at random for interview
A few parts selected for destructive testing
Every 100th receipt selected for audit
Population vs. Sample
[Figure: a population of items labeled a through z, and a sample drawn from it containing b, c, g, i, n, o, r, u, y]
Why Sample?

Less time consuming than a census

Less costly to administer than a census

It is possible to obtain statistical results of a
sufficiently high precision based on samples.
Sampling Techniques
Nonstatistical Sampling: Convenience, Judgment
Not interested in……
Statistical Sampling: (Simple) Random, Systematic, Stratified, Cluster
Statistical Sampling

Items of the sample are chosen based on
known or calculable probabilities
Statistical Sampling (Probability Sampling)
(Simple) Random
Stratified
Systematic
Cluster
Random Sampling

Every possible sample of a given size has an equal
chance of being selected

Selection may be with replacement or without
replacement

The sample can be obtained using a table of random
numbers or computer random number generator
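The two selection modes above can be sketched with a computer random number generator. This is a minimal illustration, not a prescribed method: the function name `simple_random_sample`, the `seed` parameter, and the frame of receipt IDs are assumptions made for the example.

```python
import random

def simple_random_sample(frame, n, replace=False, seed=None):
    """Draw n items from frame, each item being equally likely on every draw."""
    rng = random.Random(seed)  # seeded generator so the draw can be repeated
    if replace:
        # With replacement: the same item may be selected more than once.
        return [rng.choice(frame) for _ in range(n)]
    # Without replacement: each item can appear at most once.
    return rng.sample(frame, n)

receipts = list(range(1, 101))  # hypothetical frame: receipt IDs 1..100
without = simple_random_sample(receipts, 10, replace=False, seed=1)
with_repl = simple_random_sample(receipts, 10, replace=True, seed=1)
```

Sampling without replacement is the usual choice when each item should be examined only once, as in an audit of receipts.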
Stratified Random Sampling



Watch Video Clip: Samples and Surveys (#14)
More often used than “Systematic” and “Cluster”
Stratified random sampling could be used to divide
the employees into groups with similar characteristics
that might affect preferences, such as marital status or
age; simple random samples are then taken
from each group.
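A rough sketch of the idea: group the items by a characteristic, then take a simple random sample within each group. The employee records, the `marital_status` field, and the function `stratified_sample` are all hypothetical and exist only for this illustration.

```python
import random
from collections import defaultdict

def stratified_sample(items, stratum_of, per_stratum, seed=None):
    """Take a simple random sample of up to per_stratum items from each stratum."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for item in items:
        strata[stratum_of(item)].append(item)  # group items by stratum label
    sample = []
    for label in sorted(strata):               # fixed order for repeatability
        group = strata[label]
        k = min(per_stratum, len(group))       # never ask for more than exist
        sample.extend(rng.sample(group, k))
    return sample

# Hypothetical frame of 30 employees: 10 married, 20 single.
employees = [
    {"id": i, "marital_status": "married" if i % 3 == 0 else "single"}
    for i in range(1, 31)
]
chosen = stratified_sample(employees, lambda e: e["marital_status"], 4, seed=7)
```

Taking the same number from each stratum is only one possible allocation; sampling in proportion to stratum size is another common choice.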
Systematic Random Sampling

Decide on sample size: n (sample)

Divide frame of N (population) individuals into groups of
k individuals: k=N/n

Randomly select one individual from the 1st group

Select every kth individual thereafter
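Example: N = 64, n = 8, k = N/n = 8 (one individual is randomly selected from the first group of 8, then every 8th individual thereafter)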
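The steps above can be sketched as follows, using the slide's values N = 64 and n = 8. The function name `systematic_sample` is an assumption, and the sketch assumes N is an exact multiple of n.

```python
import random

def systematic_sample(frame, n, seed=None):
    """Pick a random start in the first group, then every k-th item after it."""
    rng = random.Random(seed)
    k = len(frame) // n        # k = N / n (assumes N is a multiple of n)
    start = rng.randrange(k)   # random position within the first group of k
    return [frame[start + j * k] for j in range(n)]

frame = list(range(1, 65))     # N = 64 individuals, numbered 1..64
sample = systematic_sample(frame, 8, seed=3)
```

Only the starting position is random; once it is chosen, the rest of the sample is fully determined.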
Cluster Sampling


Divide population into several “clusters,” each
representative of the population
Select a simple random sample of clusters

All items in the selected clusters can be used, or items can be
chosen from a cluster using another probability sampling
technique
[Figure: population divided into 16 clusters, with randomly selected clusters forming the sample]
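A minimal sketch of the simplest variant, in which all items in the randomly selected clusters are used. The cluster names, item labels, and the function `cluster_sample` are hypothetical; the count of 16 clusters follows the slide.

```python
import random

def cluster_sample(clusters, num_clusters, seed=None):
    """Randomly select whole clusters and return every item inside them."""
    rng = random.Random(seed)
    # Simple random sample of cluster names (sorted first for repeatability).
    chosen = rng.sample(sorted(clusters), num_clusters)
    sample = [item for name in chosen for item in clusters[name]]
    return chosen, sample

# Hypothetical population: 16 clusters with 4 items each.
clusters = {
    f"cluster_{c}": [f"item_{c}_{i}" for i in range(4)]
    for c in range(16)
}
chosen, sample = cluster_sample(clusters, 3, seed=5)
```

To subsample within the chosen clusters instead, a second probability sampling step could be applied to each cluster's item list.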
Two Basic Divisions of Statistics

Descriptive statistics

Descriptive statistics are numbers that are used to summarize
and describe data.

Examples:

Average salary of various occupations

Median house price in Bakersfield, CA

Descriptive statistics do not infer the properties of the population
from which the sample was drawn; they do not involve generalization.
Descriptive Statistics

Collect data

e.g., Survey, Observation,
Experiments

Present data


e.g., Charts and graphs
Characterize data

e.g., Sample mean = $\frac{\sum x_i}{n}$
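As a small illustration of characterizing data, the sample mean can be computed directly from its definition (the sum of the values divided by n). The salary figures below are invented for the example.

```python
from statistics import mean, median

# Hypothetical sample of annual salaries, in $1000s.
salaries = [42, 48, 51, 55, 61, 75, 120]

def sample_mean(values):
    """Sum of the observations divided by the number of observations."""
    return sum(values) / len(values)

x_bar = sample_mean(salaries)     # same result as statistics.mean
middle = median(salaries)         # middle value of the sorted data: 55
```

The one large salary pulls the mean above the median, which is why both summaries are often reported together.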
Inferential Statistics

You have been hired by the National Election
Commission to examine how the American people feel
about the fairness of the voting procedures in the U.S.
Who will you ask?


Ask every single American
Ask a small, randomly selected group (sample) of Americans
and then draw inferences about the entire country from their
responses.
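The second option can be sketched with simulated data. Everything here is an assumption for illustration: the population is artificial (60% answering "fair"), and the margin of error uses the common normal approximation for a proportion, which is one estimation method among several.

```python
import math
import random

rng = random.Random(42)

# Hypothetical population: 1 = "procedures are fair", 0 = "not fair".
population = [1] * 60_000 + [0] * 40_000

# Ask a randomly selected sample instead of every individual.
sample = rng.sample(population, 1000)
p_hat = sum(sample) / len(sample)   # sample proportion answering "fair"

# Approximate 95% margin of error (normal approximation, an assumption).
margin = 1.96 * math.sqrt(p_hat * (1 - p_hat) / len(sample))
interval = (p_hat - margin, p_hat + margin)
```

The sample proportion is used as the estimate for the whole population, and the interval expresses how far off that estimate might plausibly be.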
Inferential Statistics


Inferential statistics are used to draw inferences about a population from
a sample (two main methods: estimation and
hypothesis testing).
Sample selection is a critical matter:
The sample should not come from just one particular state
The sample should not come from just one particular party
Tools for Collecting Data
Data Collection Methods
Domino Pizza example from What is Statistics? (#1)
Also Watch Video Clip: Samples and Surveys (#14)
Experiments
Telephone surveys
Written questionnaires
Direct observation and personal interview
Survey Design Steps

Define the issue

What are the purpose and objectives of the survey?

How will the survey be administered?

(e.g. phone, email, face to face)

Define the population of interest

Develop survey questions

Make questions clear and unambiguous

Use universally-accepted definitions

Limit the number of questions
Survey Design Steps
(continued)

Pre-test the survey

Pilot test with a small group of participants

Assess clarity and length

Determine the sample size and sampling method

Select sample and administer the survey
Types of Questions

Closed-end Questions


Select from a short list of defined choices
Example: Major: __business __liberal arts __science __other
Open-end Questions

Respondents are free to respond with any value, words, or
statement
Example: What did you like best about this course?

Demographic Questions

Questions about the respondents’ personal characteristics
Example: Gender: __Female __Male
Observations and Interviews

Observations




Data collected is observed and recorded based on
what takes place
Very subjective
Example: Observe reactions of customers to a new
store layout
Interviews



Can be structured – fixed set of questions
Can use a variety of questions
Requires more time from the researcher
Data Collection Pitfalls






Interviewer bias
Nonresponse
Selection bias
Observer bias
Measurement error
Internal/External validity
The objective is to collect accurate and reliable data!
Data (variable) Types
Data
Qualitative (Categorical)
Examples: Marital Status, Political Party, Eye Color (Defined categories)
Quantitative (Numerical)
Discrete
Examples: Number of Children, Defects per hour (Counted items)
Continuous
Examples: Weight, Voltage (Measured characteristics)
Qualitative vs. Quantitative Variables (Data)


Qualitative variables (data) take on values that are
names or labels.
Quantitative variables are numerical. They represent a
measurable quantity.


Quantitative variables can be further classified
as discrete or continuous: If a variable can take on any value
between its minimum value and its maximum value, it is called a
continuous variable; otherwise, it is called a discrete variable.
Video clip (easy and simple): Introduction to Variables
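The classification above can be illustrated with a rough heuristic based on how values are recorded. This is only a sketch: the names `VariableType` and `guess_variable_type` are hypothetical, and the heuristic can mislead, since numeric codes such as ID numbers are qualitative and a measured quantity recorded in whole numbers is still continuous.

```python
from enum import Enum

class VariableType(Enum):
    QUALITATIVE = "qualitative (categorical)"
    DISCRETE = "quantitative, discrete"
    CONTINUOUS = "quantitative, continuous"

def guess_variable_type(values):
    """Heuristic guess from Python types; not a substitute for judgment."""
    # Names, labels, and booleans are treated as categories.
    if any(isinstance(v, bool) or not isinstance(v, (int, float)) for v in values):
        return VariableType.QUALITATIVE
    # Whole-number counts are treated as discrete.
    if all(isinstance(v, int) for v in values):
        return VariableType.DISCRETE
    # Anything with fractional values is treated as continuous.
    return VariableType.CONTINUOUS

marital_status = ["single", "married", "married"]   # names or labels
children = [0, 2, 3, 1]                             # counted items
weight_kg = [61.5, 72.0, 80.25]                     # measured characteristic
```

Here `guess_variable_type(marital_status)` gives the qualitative type, `children` the discrete type, and `weight_kg` the continuous type.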
Data Measurement Levels
(Please see the Video Clip: Scales of Measurement)
Ratio/Interval Data: Measurements (Highest Level, Complete Analysis)
Ordinal Data: Rankings, Ordered Categories (Higher Level, Mid-level Analysis)
Nominal Data: Categorical Codes, ID Numbers, Category Names (Lowest Level, Basic Analysis)
Data Types

Time Series Data


Ordered data values observed over time
Cross Section Data

Data values observed at a fixed point in time
Data Types
Sales (in $1000's)

             2003   2004   2005   2006
Atlanta       435    460    475    490
Boston        320    345    375    395
Cleveland     405    390    410    395
Denver        260    270    285    280

Cross Section Data: one column (all cities in a single year)
Time Series Data: one row (a single city across the years)
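The distinction can be sketched in code using the sales figures from the table. The helper names `time_series` and `cross_section` are assumptions made for this illustration.

```python
years = [2003, 2004, 2005, 2006]

# Sales in $1000s, one list per city, ordered by year.
sales = {
    "Atlanta":   [435, 460, 475, 490],
    "Boston":    [320, 345, 375, 395],
    "Cleveland": [405, 390, 410, 395],
    "Denver":    [260, 270, 285, 280],
}

def time_series(city):
    """One city's values ordered over time (a row of the table)."""
    return dict(zip(years, sales[city]))

def cross_section(year):
    """All cities' values at one fixed point in time (a column of the table)."""
    idx = years.index(year)
    return {city: values[idx] for city, values in sales.items()}

boston_over_time = time_series("Boston")
all_cities_2005 = cross_section(2005)
```

Reading along a row answers "how did one city change?", while reading down a column answers "how do the cities compare in a given year?".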