Overview of Data Analysis Concepts


Statistics for Decision Making
QM 2113 - Spring 2002
Descriptive Statistics
Review
 What is statistics?
– Description (Data analysis) ---> Stage 1
– Inference (Applying results) ---> Stage 2
 Data types
– Quantitative (numeric)
– Qualitative (categorical)
 Introduction to descriptive analysis
– Informal (tables & charts)
– Summary measures
Schematic View
 Statistics
– Quantitative Data
• Informal
• Summary Measures
• Inferential Analyses
– Qualitative Data
• Informal
• Summary Measures
• Inferential Analyses
Probability is what allows the linkage between descriptive and inferential analyses
Sampling
 Population: summarized by a parameter
 Sample: summarized by a statistic
Very Important
 Type of analysis depends upon data:
– Quantitative
• Ratio
• Interval
• Ordinal
– Qualitative
• Ordinal
• Nominal
 Examples?
Descriptive Analysis
 Three general forms
– Informal
• Tables
• Charts
– Formal: Numeric (i.e., statistics)
 Forms the basis for performing inferential analyses
Descriptive Statistics
 Qualitative data
– Percentages
– Analysis of proportions
 Quantitative data
– Single numbers that summarize
• Location (i.e., general tendencies)
• Variation (i.e., how different the values are)
– Primary importance
• Mean
• Standard deviation
Primary Measures
 Mean: just a simple average
– Add the values and divide by the number of observations
 Standard deviation
– Roughly the average distance of the values from the mean
– Process:
• Subtract the average from each value
• Square each result
• “Average” the squared results
• Take the square root of that result
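The two primary measures can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides: the function names and the `scores` data are made up, and the `sample` flag is an assumption about what the quoted “Average” means (dividing by n - 1 for a sample rather than by n).

```python
import math

def mean(values):
    """Add the values and divide by the number of observations."""
    return sum(values) / len(values)

def standard_deviation(values, sample=True):
    """Follow the four-step process listed above.

    Assumption: sample=True divides by n - 1 (the usual sample formula);
    sample=False divides by n (the population formula).
    """
    avg = mean(values)
    deviations = [x - avg for x in values]   # 1. subtract the average
    squared = [d ** 2 for d in deviations]   # 2. square each result
    divisor = len(values) - 1 if sample else len(values)
    variance = sum(squared) / divisor        # 3. "average" the squared results
    return math.sqrt(variance)               # 4. take the square root

scores = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up data for illustration
print(mean(scores))
print(standard_deviation(scores, sample=False))
print(standard_deviation(scores))
```

With these values the squared deviations sum to 32, so dividing by n = 8 gives a standard deviation of exactly 2, while dividing by n - 1 = 7 gives a slightly larger figure.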
Miscellaneous Statistics
 Less important, but you need to be familiar with:
– Location
• Median
• Mode
• Quantiles
– Variation
• Range
• Min and Max
– Both (?)
• Z-score
• Empirical Rule
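As a rough sketch, most of these can be computed with Python's standard `statistics` module. The data are made up, and `z_score` is a hypothetical helper name. The z-score expresses a value as a number of standard deviations from the mean; the Empirical Rule says that for roughly bell-shaped data about 68%, 95%, and 99.7% of values fall within 1, 2, and 3 standard deviations of the mean.

```python
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up data for illustration

# Location
median = statistics.median(data)             # middle of the ordered values
mode = statistics.mode(data)                 # most frequent value
quartiles = statistics.quantiles(data, n=4)  # three cut points (quartiles)

# Variation
data_range = max(data) - min(data)

def z_score(x, mean, sd):
    """Number of standard deviations that x lies from the mean."""
    return (x - mean) / sd

mu = statistics.mean(data)
sigma = statistics.pstdev(data)  # assumption: population formula (divide by n)
z = z_score(9, mu, sigma)

print(median, mode, data_range, z)
```

Here the value 9 sits two standard deviations above the mean, which the Empirical Rule would flag as fairly unusual for bell-shaped data.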
Numeric Data: Charts & Tables
 Getting organized:
– Ordered array
– Frequency distribution
• Absolute frequencies
• Relative frequencies (%)
• Cumulative frequencies
• Cumulative relative frequencies
 Histogram (frequencies)
 Other
– Stem-leaf display
– Ogive (cumulative frequencies)
Frequency Distributions
Determining Frequency Groups
 Start by breaking the data range into k equal-width intervals
– Let n represent the number of observations
– Choose the number of intervals k such that 2^k > n
 Interval width
– Start with: (Max - Min) / k
– Use convenient breakpoints for intervals
• 91.0 through 97.4 (OK)
• 90.0 through 95.0 (Better)
 Intervals: no overlap; no gaps
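The grouping rule can be sketched as follows. The function names are hypothetical, and the code assumes the rule is read as "the smallest k for which 2 to the power k exceeds n". The width it returns is only a starting point, to be rounded to convenient breakpoints by hand.

```python
def number_of_intervals(n):
    """Smallest k such that 2**k > n, where n is the number of observations."""
    k = 1
    while 2 ** k <= n:
        k += 1
    return k

def starting_width(values, k):
    """Initial interval width, (Max - Min) / k, before rounding."""
    return (max(values) - min(values)) / k

# Example: 50 observations call for 6 intervals, since 2**5 = 32 is not
# greater than 50 but 2**6 = 64 is.
k = number_of_intervals(50)
print(k)
```

For instance, if the smallest value is 12 and the largest is 47 with k = 5, the starting width is 7, which might reasonably be rounded up to 10 so that intervals begin at 10, 20, 30, and so on.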
Frequency Distributions
Determining Frequencies
 “Absolute” frequencies
– Count the number of observations in each interval
 Relative frequencies
– Divide the absolute frequency by the total number of observations
 Cumulative frequencies
– Add frequencies for all previous intervals (note the difference from the manner done in the text)
 Cumulative relative frequencies
– Add relative frequencies for all previous intervals
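A minimal sketch of the four kinds of frequencies, using made-up data. Two assumptions are not from the slides: each interval includes its lower breakpoint and excludes its upper one, and a cumulative figure includes the current interval along with all previous ones. `frequency_table` is a hypothetical helper name.

```python
from itertools import accumulate

def frequency_table(values, breakpoints):
    """Return intervals with absolute, relative, and cumulative frequencies.

    Assumption: intervals are [lower, upper), so they neither overlap
    nor leave gaps.
    """
    intervals = list(zip(breakpoints[:-1], breakpoints[1:]))
    absolute = [sum(1 for v in values if lo <= v < hi) for lo, hi in intervals]
    total = len(values)
    relative = [f / total for f in absolute]
    cumulative = list(accumulate(absolute))  # running total of the counts
    cumulative_relative = [c / total for c in cumulative]
    return intervals, absolute, relative, cumulative, cumulative_relative

ages = [3, 7, 12, 15, 18, 21, 24, 25, 33, 38]  # made-up data
table = frequency_table(ages, [0, 10, 20, 30, 40])
intervals, absolute, relative, cumulative, cumulative_relative = table

for row in zip(intervals, absolute, relative, cumulative, cumulative_relative):
    print(row)
```

The last cumulative frequency should equal the number of observations, and the last cumulative relative frequency should equal 1, which makes a quick check on the counts.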
Histograms
 What are they?
– Just graphical displays of frequency distributions
• Absolute frequencies
• Relative frequencies
• Cumulative frequencies
– Provide a “picture” of the variation in the data
 Basics
– Horizontal axis: values for the variable of concern
– Vertical axis: the corresponding frequencies
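To see the idea without a plotting library, a frequency distribution can be drawn as rows of symbols. This is only a rough sketch with made-up labels and counts, and it turns the picture on its side: the intervals run down the page and the bar lengths show the frequencies. `text_histogram` is a hypothetical helper name.

```python
def text_histogram(labels, frequencies, symbol="*"):
    """Build one text row per interval; bar length equals the frequency."""
    lines = []
    for label, freq in zip(labels, frequencies):
        lines.append(f"{label:>8} | {symbol * freq} ({freq})")
    return "\n".join(lines)

labels = ["0-10", "10-20", "20-30", "30-40"]  # made-up intervals
frequencies = [2, 3, 3, 2]                    # made-up absolute frequencies

chart = text_histogram(labels, frequencies)
print(chart)
```

Passing relative or cumulative frequencies instead would need a scaling step, since the bar length here is simply the count.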
Qualitative Data: Charts & Tables
 Frequency table is the basis for the chart
– Same as with numerical data, except the data are already broken into frequency groups (categories)
 Bar chart
 Pie chart
 Pareto chart
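A frequency table for categorical data might be sketched as below. The complaint data and the name `category_table` are invented for illustration. Rows are sorted from most to least frequent and carry a cumulative percentage, which is the ordering a Pareto chart uses.

```python
from collections import Counter

def category_table(observations):
    """Rows of (category, count, percent, cumulative percent), largest first."""
    counts = Counter(observations)
    total = len(observations)
    rows = []
    cumulative = 0
    for category, count in counts.most_common():  # descending by count
        cumulative += count
        rows.append((category, count,
                     100 * count / total,
                     100 * cumulative / total))
    return rows

# Made-up customer complaints
complaints = ["late"] * 5 + ["damaged"] * 3 + ["wrong item"] * 2
rows = category_table(complaints)

for row in rows:
    print(row)
```

The count and percent columns are what a bar chart or pie chart would display; the cumulative percent column is the extra line usually drawn on a Pareto chart.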
Bar Charts and Pie Charts
 Bar chart
– Two formats
• Vertical (preferred)
• Horizontal
– Analogous to histograms, but
• Bars don’t touch each other
• Ordering of bars doesn’t matter
 Pie chart
– Often preferable to bar charts
– Must identify slices
Summary
 We’ve reviewed the basic informal means of describing data
– Tables
– Charts
 Type of exhibit depends on data type
– Quantitative
– Qualitative
 What’s next: numerical summary measures for numeric data