Chapter 1 - What is Statistics?

Copyright ©2009 Cengage Learning
Example 1. Statistics Marks
A list of last year's final marks (out of 100):
65 71 66 79 65 82 80 86 67 64
62 74 67 72 68 81 53 70 76 73
73 85 83 80 67 78 68 67 62 83
72 85 72 77 64 77 89 87 78 79
59 63 84 74 74 59 66 71 68 72
75 74 77 69 60 92 69 69 73 65
Example 1. Histogram
[Figure: histogram of the marks. Horizontal axis: Marks (50 to 100). Vertical axis: Frequency (0 to 30).]
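The bar heights of a histogram like this one are simply counts of marks per class interval. A minimal sketch in Python, assuming intervals of width 10 that include their upper limit, i.e. (50, 60], (60, 70], ..., (90, 100]. The slide does not say which convention the figure uses, and the names `upper_limits` and `bin_counts` are illustrative only.

```python
from bisect import bisect_left

# Final marks from Example 1 (60 students, out of 100).
marks = [
    65, 71, 66, 79, 65, 82, 80, 86, 67, 64, 62, 74, 67, 72, 68,
    81, 53, 70, 76, 73, 73, 85, 83, 80, 67, 78, 68, 67, 62, 83,
    72, 85, 72, 77, 64, 77, 89, 87, 78, 79, 59, 63, 84, 74, 74,
    59, 66, 71, 68, 72, 75, 74, 77, 69, 60, 92, 69, 69, 73, 65,
]

# Assumed class intervals: (50, 60], (60, 70], (70, 80], (80, 90], (90, 100].
upper_limits = [60, 70, 80, 90, 100]


def bin_counts(values, uppers):
    """Count the values falling in each interval that ends at the given upper limit.

    Assumes every value is greater than 50 and at most the last upper limit.
    """
    counts = [0] * len(uppers)
    for v in values:
        # bisect_left returns the index of the first upper limit >= v,
        # so a mark of exactly 60 is counted in (50, 60].
        counts[bisect_left(uppers, v)] += 1
    return counts


counts = bin_counts(marks, upper_limits)
for upper, count in zip(upper_limits, counts):
    # Text version of the histogram: one star per student.
    print(f"({upper - 10:3d}, {upper:3d}]  {count:2d}  {'*' * count}")
```

Choosing intervals that include the lower limit instead would shift marks that sit exactly on a boundary (60, 70, 80) into the next bar, so the counts depend on that choice.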
Example 1. Central Location
“Typical mark”: the same idea as the “average” man or woman
Mean (the average mark)
Median (the mark such that 50% of the class is above
it and 50% is below)
Mean = 72.67
Median = 72
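Both values can be checked directly from the list of marks. A short sketch using Python's standard `statistics` module; the variable names are illustrative.

```python
import statistics

# Final marks from Example 1 (60 students, out of 100).
marks = [
    65, 71, 66, 79, 65, 82, 80, 86, 67, 64, 62, 74, 67, 72, 68,
    81, 53, 70, 76, 73, 73, 85, 83, 80, 67, 78, 68, 67, 62, 83,
    72, 85, 72, 77, 64, 77, 89, 87, 78, 79, 59, 63, 84, 74, 74,
    59, 66, 71, 68, 72, 75, 74, 77, 69, 60, 92, 69, 69, 73, 65,
]

# Mean: the sum of the marks divided by the number of marks.
mean_mark = statistics.mean(marks)

# Median: with an even number of marks, the average of the two middle
# values after sorting (here the 30th and 31st of 60).
median_mark = statistics.median(marks)

print(f"Mean   = {mean_mark:.2f}")   # prints: Mean   = 72.67
print(f"Median = {median_mark:g}")   # prints: Median = 72
```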
Example 1. Variability
Variability: Are most of the marks clustered around
the mean or are they more spread out?
Range = maximum - minimum = 92 - 53 = 39
Variance
Standard deviation
A graphical technique, the histogram, can provide us
with this and other information.
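The three measures of variability can be computed from the marks as follows. This is a sketch that assumes the sample versions of the variance and standard deviation (dividing by n - 1); the slide does not say whether the sample or the population formula is intended.

```python
import math
import statistics

# Final marks from Example 1 (60 students, out of 100).
marks = [
    65, 71, 66, 79, 65, 82, 80, 86, 67, 64, 62, 74, 67, 72, 68,
    81, 53, 70, 76, 73, 73, 85, 83, 80, 67, 78, 68, 67, 62, 83,
    72, 85, 72, 77, 64, 77, 89, 87, 78, 79, 59, 63, 84, 74, 74,
    59, 66, 71, 68, 72, 75, 74, 77, 69, 60, 92, 69, 69, 73, 65,
]

n = len(marks)
mean_mark = sum(marks) / n

# Range: distance between the largest and the smallest mark.
mark_range = max(marks) - min(marks)

# Sample variance (assumed): sum of squared deviations from the mean,
# divided by n - 1.
sample_variance = sum((x - mean_mark) ** 2 for x in marks) / (n - 1)

# Standard deviation: square root of the variance, in the same units as the marks.
sample_sd = math.sqrt(sample_variance)

print(f"Range              = {mark_range}")
print(f"Sample variance    = {sample_variance:.2f}")
print(f"Standard deviation = {sample_sd:.2f}")

# Cross-check against the standard library, which also divides by n - 1.
print(f"statistics.variance = {statistics.variance(marks):.2f}")
print(f"statistics.stdev    = {statistics.stdev(marks):.2f}")
```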
Descriptive Statistics
Descriptive statistics deals with methods of organizing,
summarizing, and presenting data.
Graphical techniques: histogram, bar and pie charts.
Numerical techniques:
The mean and median describe the location of the data.
The range, variance, and standard deviation measure the
variability of the data.
Excel: How to
• Tools -> Data Analysis -> Histogram or Descriptive
Statistics
• If this is your first time using ‘Data Analysis’, do the
following before you use these tools:
• Tools -> Add-Ins -> check ‘Analysis ToolPak’.
Example 2. Exit Poll
The exit poll results from the state of Florida during the 2000
election were recorded (the Republican candidate George
W. Bush vs. the Democrat Albert Gore).
Suppose that the results (765 people who voted for either Bush
or Gore) were stored as codes: 1 = Gore and 2 = Bush. The network
analysts would like to know whether more than 50% of the
electorate voted for Bush.
Approximately 5 million Floridians voted for Bush or Gore for
president. The sample consisted of the 765 people randomly
selected by the polling company.
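With the results coded as in this example (1 = Gore, 2 = Bush), the quantity the analysts care about in the sample is the share of 2s. A minimal sketch in Python: the function name `bush_share` and the short `toy_codes` list are made up for illustration and are not the actual exit poll data, which the slides do not reproduce.

```python
def bush_share(codes):
    """Return the fraction of responses coded 2 (Bush) among codes 1 (Gore) and 2 (Bush)."""
    if not codes:
        raise ValueError("no responses")
    if any(c not in (1, 2) for c in codes):
        raise ValueError("codes must be 1 (Gore) or 2 (Bush)")
    return sum(1 for c in codes if c == 2) / len(codes)


# Hypothetical toy input for illustration only, NOT the real 765 responses.
toy_codes = [2, 1, 2, 2, 1, 1, 2, 1]

share = bush_share(toy_codes)
print(f"Sample share for Bush: {share:.3f}")
if share > 0.5:
    print("More than 50% of this sample voted for Bush.")
else:
    print("Not more than 50% of this sample voted for Bush.")
```

The result describes only the sample. Whether it supports a conclusion about all 5 million voters is the question that inference addresses.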
Key Statistical Concepts
Population
— a population is the group of all items of interest to
a statistics practitioner.
— frequently very large; sometimes infinite.
E.g. All 5 million Florida voters
Sample
— A sample is a set of data drawn from the
population.
E.g. a sample of 765 voters exit polled on election day.
Key Statistical Concepts
Parameter
— A descriptive measure of a population.
Statistic
— A descriptive measure of a sample.
Key Statistical Concepts
[Diagram: the sample is a subset of the population. A parameter describes the population; a statistic describes the sample.]
Populations have Parameters,
Samples have Statistics.
Example 3. Lake Michigan
• A researcher at the Shedd Aquarium in Chicago wanted to
know the average size of fish in Lake Michigan.
• She collected a sample of 500 fish from the lake, measured
their lengths, and calculated the average (mean) length of the sample.
• Population?
• Sample?
• Parameter?
• Statistic?
Idea of statistics.
• Because we will not ask every one of the 5 million actual
voters for whom they voted, we cannot predict the
outcome with 100% certainty.
• A sample that is only a small fraction of the size of the
population can still lead to an inference.
• But inferences will be correct only a certain percentage of
the time.
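The last point can be illustrated by simulation. The sketch below assumes a hypothetical population in which 51% voted for Bush (a made-up figure chosen for illustration, not the actual Florida result), repeatedly draws samples of 765 voters, and reports how often the sample majority agrees with the population majority. Each voter is drawn independently, which treats the population as effectively infinite; that is a reasonable approximation when 765 is tiny relative to 5 million.

```python
import random

TRUE_BUSH_SHARE = 0.51  # hypothetical population parameter, for illustration only
SAMPLE_SIZE = 765       # sample size from the exit poll example
REPETITIONS = 1000      # number of simulated polls (arbitrary choice)


def simulate_poll(rng, true_share, n):
    """Return the sample proportion of Bush voters in one simulated poll of n voters."""
    bush_votes = sum(1 for _ in range(n) if rng.random() < true_share)
    return bush_votes / n


rng = random.Random(2000)  # fixed seed so the run is repeatable
proportions = [
    simulate_poll(rng, TRUE_BUSH_SHARE, SAMPLE_SIZE) for _ in range(REPETITIONS)
]

# A poll is "correct" here if it shows a Bush majority, matching the assumed population.
correct = sum(1 for p in proportions if p > 0.5)
fraction_correct = correct / REPETITIONS

print(f"Smallest sample proportion: {min(proportions):.3f}")
print(f"Largest sample proportion:  {max(proportions):.3f}")
print(f"Polls showing the correct majority: {fraction_correct:.1%}")
```

The closer the assumed population share is to 50%, the more often a sample of this size points to the wrong winner. Changing `TRUE_BUSH_SHARE` or `SAMPLE_SIZE` shows how the fraction of correct polls responds.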
Inferential statistics
Inferential statistics is a body of methods used to draw conclusions or
inferences about characteristics of populations based on sample data.
The population is the 5 million voters in Florida in 2000.
The sample is the 765 people randomly selected at polling stations.
One sure way to get a definite answer to the question (who will win in
Florida) is to interview all 5 million voters.
Statistical techniques make such endeavors unnecessary. Instead, we can
randomly draw a much smaller number of voters and infer from the data.
Statistical Inference
Statistical inference is the process of making an estimate,
prediction, or decision about a population based on a sample.
[Diagram: a sample is drawn from the population; the sample statistic is used to make an inference about the population parameter.]
What can we infer about a Population’s Parameters
based on a Sample’s Statistics?
Statistical Inference
We use statistics to make inferences about parameters.
Parameter: the actual proportion of voters in Florida who
voted for Bush in 2000.
Statistic: the sample proportion of Bush voters among the people who were
selected at the exit poll.
Therefore, we can make an estimate, prediction, or decision
about a population based on sample data.