Statistics and Data Analysis -- Class #2


Statistics & Data Analysis
Course Number: B01.1305
Course Section: 31
Meeting Time: Wednesday 6:00-8:50 pm
CLASS #2
Class #2 Outline
• Brief review of last class
• Class introduction with Birthday Problem
• Questions on homework
• Chapter 3: A First Look at Probability
Professor S. D. Balkin -- February 5, 2003
Class Introduction and The Birthday Problem
• Everyone introduce yourselves, giving your name, job/industry, and birthday
• Question: How likely is it that two people in your class have the same birthday?
• Let’s make a bet: I bet that at least two people in this class share the same birthday.
  • What should we bet?
  • Should I be so certain?
Review of Last Class
• Distinguish between quantitative and qualitative variables
• Graphical representations of single variables
• Numeric measures of center and variation
Chapter 3
A First Look At Probability
Chapter Goals
• Be able to interpret probabilities
• Understand the differences between statistics and probability
• Understand basic principles of probability
  • Addition, Complements, Multiplication
• Understand statistical independence and conditional probability
• Be able to construct probability trees
• Understand managerial implications of probability
Probability in Everyday Life
• There is a 90% chance the Yankees will win the game tomorrow
• There is a 60% chance of thunderstorms this afternoon
• That bill has a 35% chance of being passed
• There is a 20% chance of rain today
• There is a 37% chance my hand will beat the dealer’s
Probability in Everyday Life (cont.)
• Your company is deciding whether to launch a new product in the consumer market. Success depends on the reaction from competitors, the ability of suppliers to meet demand, unknown adverse events or issues, economic and regulatory conditions, etc.
• An airplane has multiple engines and can make a journey safely as long as at least one is operating. Despite designers’ best efforts, what is the chance of a disaster occurring? Which parts of the plane should receive the most attention?
• You’re still waiting for Ed McMahon to knock on your door?
What is Probability?
• Quantification of uncertainty and variability
• Basis for statistical inference and business decision making
• Probability theory is a branch of mathematics and is beyond the scope of this class
Illustrative Questions…
• If you toss a coin, what is the probability of getting a head?
• If you toss a coin twice, what is the probability of getting exactly one head?
  • How can you verify your answer?
• If you toss a coin 10 times and count the total number of heads, do you think the probability of 0 heads equals the probability of 5 heads?
• Do you think the probability of 4 heads equals the probability of 6 heads?
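One way to check answers to the coin-toss questions above is to count equally likely sequences. The sketch below assumes a fair coin and independent tosses; the helper name `prob_heads` is made up for illustration.

```python
from fractions import Fraction
from math import comb

def prob_heads(k: int, n: int) -> Fraction:
    """Exact probability of exactly k heads in n tosses of a fair coin."""
    # comb(n, k) of the 2**n equally likely toss sequences contain k heads.
    return Fraction(comb(n, k), 2 ** n)

print("one head in two tosses:", prob_heads(1, 2))
for k in (0, 4, 5, 6):
    print(f"{k} heads in ten tosses:", prob_heads(k, 10))
```

Under these assumptions, 0 heads is far less likely than 5 heads, while 4 heads and 6 heads are equally likely by symmetry.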
History of Probability
• Originated from the study of games of chance
  • Tossing a die
  • Spinning a roulette wheel
• Probability theory as a quantitative discipline arose in the seventeenth century when French gamblers asked prominent mathematicians for help in their gambling
• In the eighteenth and nineteenth centuries, careful measurements in astronomy and surveying led to further advances in probability.
• In the twentieth century probability is used to control the flow of traffic through a highway system, a telephone interchange, or a computer processor; find the genetic makeup of individuals or populations; figure out the energy states of subatomic particles; estimate the spread of rumors; and predict the rate of return in risky investments.
Adapted from Probability Central
Example: New York Times Online
Cellphones Not Killing Real Ones
(May 26, 2002)
Despite their growing affection for cellphones, most Americans are not ready to pull the plug on traditional phones, according to a survey by Maritz Research. The results were released this month.
When asked about the probability that they would use only cellphones for their calls in the next year, only 8 percent said that they were very likely or certain to do so; 79 percent answered "very unlikely" or "absolutely not." Maritz surveyed 803 adults nationwide this spring. Each respondent, or someone in the household, subscribed to a wireless phone service. Forty-two percent, however, said their wireless phones had led them to use their existing long-distance companies less than they did previously.
"Just five years ago, cellphones were viewed as a luxury; now they've become ingrained in everyday life for all members of a family," said Paul Pacholski, a vice president at Maritz.
Example: Wall Street Journal Online
European Markets Close Little Changed
(May 21, 2002)
…Retail-price data published Tuesday showed that inflation in the United
Kingdom was steady in April at an annual rate of 2.3%, lower than the
expected 2.4%. However, Lehman Brothers economist Michael Hume
said the numbers are no obstacle to an interest-rate hike. "We
continue to look for a rate hike in June, but would put the probability
of a move at no more than 60%," he said….
Interesting Probability Quotes
• Aristotle: The probable is what usually happens
• Sir Arthur Conan Doyle, The Sign of Four: When you have eliminated the impossible, whatever remains, however improbable, must be the truth.
• Blaise Pascal: The excitement that a gambler feels when making a bet is equal to the amount he might win times the probability of winning it.
Types of Occurrences
• Predictable Occurrence: Occurrence whose value can be accurately determined using science:
  • Position of a meteor in 25 years
• Unpredictable Occurrence: Occurrence whose value is based on a random process:
  • Toss of a coin
  • Gender of a baby
• Random Process: An event or phenomenon is called random if individual outcomes are uncertain but there is a regular distribution of relative frequencies in a large number of repetitions.
Probability and Statistics
• Statistics: Start from observed data, and then generalize about how the world works
• Probability: Start from an assumption about how the world works, and then figure out what kinds of data you are likely to see
Probability is the only scientific basis for decision making in the face of uncertainty
Terminology
• Random Experiment: A process or course of action that results in one of a number of possible outcomes
  • The outcome that occurs cannot be predicted with certainty
• Outcome: A single possible result of a random experiment
• Sample Space: The set of all possible outcomes of the experiment
• Event: Any subset of the sample space
• Simple Event: Event consisting of just one outcome
Example
• If we toss a nickel and a dime:
  • What are the possible outcomes?
  • Which outcome is the event “no heads”?
  • Which outcomes are in the event “one head and one tail”?
  • Which outcomes are in the event “one or more heads”?
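The nickel-and-dime questions above can be explored by listing the sample space and picking out events as subsets. This is only an illustrative sketch: the (nickel, dime) ordering and the helper `event` are assumptions made here, not notation from the slides.

```python
from itertools import product

# Each outcome is a pair (nickel, dime); "H" = heads, "T" = tails.
sample_space = set(product("HT", repeat=2))

def event(predicate):
    """Return the subset of the sample space whose outcomes satisfy predicate."""
    return {outcome for outcome in sample_space if predicate(outcome)}

no_heads = event(lambda o: o.count("H") == 0)
one_head_one_tail = event(lambda o: o.count("H") == 1)
one_or_more_heads = event(lambda o: o.count("H") >= 1)

print("Sample space:     ", sorted(sample_space))
print("No heads:         ", sorted(no_heads))
print("One head, one tail:", sorted(one_head_one_tail))
print("One or more heads:", sorted(one_or_more_heads))
```

"No heads" is a simple event (one outcome), while the other two events contain two and three outcomes respectively.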
Defining Probabilities
• Probability has no precise definition!!
• All attempts to define probability must ultimately rely on circular reasoning
• Roughly speaking, the probability of an event is the chance or likelihood that the event will occur
• To each event A, we want to attach a number P(A), called the probability of A, which represents the likelihood that A will occur
Defining Probabilities (cont.)
• There are various ways to define P(A), but in order to make sense, any definition must satisfy
  • P(A) is between 0 and 1
  • P(E1) + P(E2) + ··· = 1, where E1, E2, ··· are the simple events in the sample space
• The three most useful approaches to obtaining a definition of probability are:
  • classical
  • relative frequency
  • subjective
Classical Approach
Assume that all simple events are equally likely.
Define the classical probability that an event A will occur as:
$$P(A) = \frac{\#\,\text{Simple Events in } A}{\#\,\text{Simple Events in } S}$$
So P(A) is the number of ways in which A can occur, divided by the number of possible individual outcomes, assuming all are equally likely.
Example: Classical Approach
• In tossing a coin twice, if we take S = {HH, HT, TH, TT}, then the classical approach assigns probability 1/4 to each simple event.
• If A = {Exactly One Head} = {HT, TH}, then P(A) = 2/4 = 1/2.
Question: Does this tell you how often A would occur if we repeated the experiment (“toss a coin twice”) many times?
Relative Frequency Approach
• The probability of an event is the long-run frequency of occurrence.
• To estimate P(A) using the frequency approach, repeat the experiment n times (with n large) and compute x/n, where x = # Times A occurred in the n trials.
• The larger we make n, the closer x/n gets to P(A).
$$\frac{x}{n} \to P(A)$$
Coin Flipping Example
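The relative frequency idea can be tried with a small simulation. The sketch below repeats the experiment "toss a coin twice" n times and reports x/n for the event "exactly one head". The fixed seed and the function name `relative_frequency` are assumptions made for reproducibility and illustration.

```python
import random

def relative_frequency(n: int, seed: int = 0) -> float:
    """Estimate P(exactly one head in two tosses) as x/n over n repetitions."""
    rng = random.Random(seed)  # fixed seed (an assumption) so runs are repeatable
    x = 0                      # number of repetitions in which the event occurred
    for _ in range(n):
        heads = rng.randint(0, 1) + rng.randint(0, 1)  # 1 = head, 0 = tail
        if heads == 1:
            x += 1
    return x / n

for n in (10, 100, 1_000, 100_000):
    print(n, relative_frequency(n))
```

For small n the estimate can be well away from 1/2; for large n it should settle near the classical value.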
Classical and Frequency Approaches
• If we can find a sample space in which the simple events really are equally likely, then the Law of Large Numbers asserts that the classical and frequency approaches will produce the same results.
• For the experiment “Toss a coin once”, the sample space is S = {H, T} and the classical probability of Heads is 1/2.
• According to the Law of Large Numbers (LLN), if we toss a fair coin repeatedly, then the proportion of Heads will get closer and closer to the classical probability of 1/2.
Subjective Approach
• This approach is useful in betting situations and scenarios where one-time decision-making is necessary. In cases such as these, we wouldn’t be able to assume all outcomes are equally likely and we may not have any prior data to use in our choice.
• The subjective probability of an event reflects our personal opinion about the likelihood of occurrence. Subjective probability may be based on a variety of factors including intuition, educated guesswork, and empirical data.
• E.g.: In my opinion, there is an 85% probability that Stern will move up in the rankings in the next Business Week survey of the top business schools.
Example: Not Equally Likely Events
• A market research survey asks newly married couples how many children they plan to have, giving the following data:

Number of Children   0      1      2      3      4      5+
Probability          0.347  0.209  0.191  0.125  0.067  0.061

• What are the probabilities of a couple planning:
  • 1 or 2 children?
  • 3 or 4 children?
  • 4 or more children?
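Because the categories in the table above are mutually exclusive, the requested probabilities can be found by adding the relevant entries. A minimal sketch, with the dictionary and the helper `prob_of` introduced here only for illustration:

```python
# Probabilities from the survey table (planned number of children).
planned_children = {
    "0": 0.347, "1": 0.209, "2": 0.191,
    "3": 0.125, "4": 0.067, "5+": 0.061,
}

def prob_of(*categories: str) -> float:
    """Add the probabilities of mutually exclusive categories."""
    return sum(planned_children[c] for c in categories)

print("1 or 2 children:   ", round(prob_of("1", "2"), 3))
print("3 or 4 children:   ", round(prob_of("3", "4"), 3))
print("4 or more children:", round(prob_of("4", "5+"), 3))
```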
Complement Rule
• The probability of the complement of an event is equal to 1 minus the probability of the event itself
$$P(\bar{A}) = 1 - P(A)$$
Example: Complement Rule
• A market research survey asks newly married couples how many children they plan to have, giving the following data:

Number of Children   0      1      2      3      4      5+
Probability          0.347  0.209  0.191  0.125  0.067  0.061

  • Use the complement rule to find the probability of a couple planning to have any children at all
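As a worked check of the complement-rule exercise above, "any children at all" is the complement of "no children", so only the first table entry is needed:

```latex
P(\text{at least one child}) = 1 - P(\text{no children}) = 1 - 0.347 = 0.653
```

Adding the five remaining table entries directly (0.209 + 0.191 + 0.125 + 0.067 + 0.061) gives the same 0.653.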
Odds
• Odds are often used to describe the payoff for a bet.
• If the odds against a horse are a:b, then the bettor must risk b dollars to make a profit of a dollars.
• If the true probability of the horse winning is b/(a+b), then this is a fair bet.
• In the 1999 Belmont Stakes, the odds against Lemon Drop Kid were 29.75 to 1, so a $2 ticket paid $61.50.
• The ticket returns two times the odds, plus the $2 ticket price.
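The odds arithmetic above can be sketched in a few lines. The function names are hypothetical, and exact fractions are used only to avoid rounding noise.

```python
from fractions import Fraction

def fair_win_probability(a, b) -> Fraction:
    """Win probability that makes odds against of a:b a fair bet: b / (a + b)."""
    a, b = Fraction(str(a)), Fraction(str(b))
    return b / (a + b)

def ticket_return(odds_against, stake) -> Fraction:
    """Total returned on a winning ticket: profit (odds x stake) plus the stake."""
    odds_against, stake = Fraction(str(odds_against)), Fraction(str(stake))
    return odds_against * stake + stake

# Lemon Drop Kid, 1999 Belmont Stakes: odds against of 29.75 to 1, $2 ticket.
print(float(ticket_return(29.75, 2)))          # 61.5
print(float(fair_win_probability(29.75, 1)))   # win probability implied by the odds
```

The second value is the win probability at which the 29.75-to-1 bet would be exactly fair, which is not necessarily the horse's true chance of winning.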
Example: Odds
• If a fair coin is tossed once, the odds against Heads are 1 to 1.
• If a fair die is tossed once, the odds against a six are 5 to 1.
• In the game of Craps, the odds against getting a 6 before a 7 are 6 to 5. (We will show this later.)
Combining Events
• The union $A \cup B$ is the event consisting of all outcomes in A or in B or in both.
• The intersection $A \cap B$ is the event consisting of all outcomes in both A and B.
• If $A \cap B$ contains no outcomes then A, B are said to be mutually exclusive.
• The complement $\bar{A}$ of the event A consists of all outcomes in the sample space S which are not in A.
Combining Events (cont.)
$P(A \cup B)$ = Probability A or B or both occur
$P(A \cap B)$ = Probability A and B both occur
$P(\bar{A})$ = Probability A does NOT occur
Rules for Combining Events
Complement Rule: $P(\bar{A}) = 1 - P(A)$
Addition Rule: $P(A \cup B) = P(A) + P(B) - P(A \cap B)$
If A and B are mutually exclusive, then: $P(A \cap B) = 0$
Example 1: Combining Events
• Based on the past experience in your copier repair shop suppose…
  • Probability of a blown fuse is 6%
  • Probability of a broken wire is 4%
  • 1% of copiers to be repaired come in with both a blown fuse AND a broken wire
• What is the probability of a copier coming in with a blown fuse OR a broken wire?
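A worked version of the copier example, applying the addition rule with the three percentages given above (the event labels "fuse" and "wire" are shorthand introduced here):

```latex
P(\text{fuse} \cup \text{wire})
  = P(\text{fuse}) + P(\text{wire}) - P(\text{fuse} \cap \text{wire})
  = 0.06 + 0.04 - 0.01
  = 0.09
```

The 1% is subtracted because copiers with both faults would otherwise be counted twice.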
Example 2: Combining Events
• Market research firm tests a potential new product
• 200 male respondents, selected at random, gave their opinions of the product and their marital status, giving the following data:

                           Opinion
Marital Status   Poor  Fair  Good  Excellent  Total
Never Married       5     9    26         10     50
Divorced            1     4    16          9     30
Married            12    23    37         32    104
Widowed             2     8     5          1     16
Total              20    44    84         52    200
Conditional Probability
• Calculating probabilities given some restrictive condition
• Example: Absenteeism Last Year for 400 Employees.

Days Absent   Smoker   Non-Smoker   Total
<10               34          260     294
10+               78           28     106
Total            112          288     400

• Compute the probability that a randomly selected employee is a smoker.
• If we are told that the employee was absent less than 10 days, does this partial knowledge change the probability that the employee is a smoker?
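Conditional Probability (cont.)
Conditional Probability $P(B \mid A)$:
$$P(B \mid A) = \frac{P(A \cap B)}{P(A)}$$
$$P(\text{Smoker} \mid \text{Absent} < 10 \text{ days}) = \frac{0.085}{0.735} \approx 0.116$$
Here 0.085 = 34/400 is the probability of being a smoker and absent less than 10 days, and 0.735 = 294/400 is the probability of being absent less than 10 days. Compare with the unconditional P(Smoker) = 112/400 = 0.28.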
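The conditional probabilities for the absenteeism table can be computed directly from the counts. This is a minimal sketch; the helpers `prob` and `conditional` and the key labels are assumptions made for illustration. It prints both directions of conditioning, since P(Smoker | Absent < 10) and P(Absent < 10 | Smoker) are easy to confuse.

```python
# Counts from the absenteeism table: (days absent, smoking status) -> employees
counts = {
    ("<10", "smoker"): 34, ("<10", "non-smoker"): 260,
    ("10+", "smoker"): 78, ("10+", "non-smoker"): 28,
}
total = sum(counts.values())  # 400 employees

def prob(days=None, status=None) -> float:
    """Probability that a random employee matches the filters (None = any)."""
    matching = sum(n for (d, s), n in counts.items()
                   if days in (None, d) and status in (None, s))
    return matching / total

def conditional(b: dict, a: dict) -> float:
    """P(B | A) = P(A and B) / P(A); a and b are keyword filters for prob()."""
    return prob(**{**a, **b}) / prob(**a)

print("P(Smoker)               =", prob(status="smoker"))
print("P(Smoker | Absent < 10) =",
      round(conditional({"status": "smoker"}, {"days": "<10"}), 3))
print("P(Absent < 10 | Smoker) =",
      round(conditional({"days": "<10"}, {"status": "smoker"}), 3))
```

Knowing that the employee was absent less than 10 days lowers the probability of being a smoker from 0.28 to about 0.116 (34/294), whereas 34/112, about 0.304, is the probability of low absenteeism given that the employee smokes.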
Multiplication Law
For any events A and B,
$$P(A \cap B) = P(A)\,P(B \mid A) = P(B)\,P(A \mid B)$$
Statistical Independence
Events A and B are statistically independent if and only if $P(B \mid A) = P(B)$. Otherwise, they are dependent.
If events A and B are independent, then $P(A \cap B) = P(A)\,P(B)$
Example: Independence
• Seattle corporations with 500 or more employees
  • 468 executives, 30 of whom are women
  • Conditional probability of a person being a woman given that the person is an executive is 30/468 = 0.064
• In the population, 51.2% are women
• Since the probability of randomly choosing a woman changes when conditioning on “being an executive”, being a woman and being an executive are dependent events
Another Independence Example
• You are responsible for scheduling a construction project
  • In order to avoid trouble, it will be necessary for the foundation to be completed by July 27th and for the electricity to be installed before August 6th
  • Based on your experience, you assign probabilities of 0.83 and 0.91 to these events
  • Assume you have a 96% chance of meeting one deadline or the other (or both)
• What is the probability of missing both deadlines?
• Are these events mutually exclusive? How can you tell?
• Are these events independent? How can you tell?
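One way to work through the construction-project questions is to apply the complement and addition rules to the three given numbers. The variable names below are illustrative, and the tolerance used for the floating-point comparisons is an assumption.

```python
p_foundation = 0.83  # P(A): foundation completed by July 27th
p_electric = 0.91    # P(B): electricity installed before August 6th
p_either = 0.96      # P(A or B): at least one deadline is met

# Complement rule: missing both deadlines is the complement of meeting at least one.
p_miss_both = 1 - p_either

# Addition rule rearranged: P(A and B) = P(A) + P(B) - P(A or B).
p_both = p_foundation + p_electric - p_either

# Mutually exclusive events would have P(A and B) = 0.
mutually_exclusive = abs(p_both) < 1e-9
# Independent events would have P(A and B) = P(A) * P(B).
independent = abs(p_both - p_foundation * p_electric) < 1e-9

print("P(miss both)  =", round(p_miss_both, 4))
print("P(meet both)  =", round(p_both, 4))
print("P(A) * P(B)   =", round(p_foundation * p_electric, 4))
print("mutually exclusive:", mutually_exclusive, "| independent:", independent)
```

With these inputs, P(A and B) is 0.78, which is neither 0 nor equal to the product 0.7553, so under the stated numbers the events are neither mutually exclusive nor independent.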
Revisiting the Birthday Problem
• What is the probability that at least two people in this class share the same birthday?
• Can be formulated as: find the probability that no two people in this class share a birthday, and take the complement
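The complement formulation of the birthday problem can be sketched as follows. The code assumes 365 equally likely birthdays, independence between people, and no leap years. The class size is not given in the slides, so the values of n below are only examples.

```python
def prob_shared_birthday(n: int, days: int = 365) -> float:
    """P(at least two of n people share a birthday), via the complement."""
    p_all_different = 1.0
    for i in range(n):
        # Person i + 1 must avoid the i birthdays already taken.
        p_all_different *= (days - i) / days
    return 1.0 - p_all_different

for n in (10, 23, 30, 50):
    print(n, round(prob_shared_birthday(n), 4))
```

Under these assumptions the probability passes 1/2 at only 23 people, which is why the bet at the start of class is less risky than it sounds.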
Probability Tables and Trees
• Human resources found that 46% of its junior executives have two-career marriages, 37% have single-career marriages, and 17% are unmarried.
• HR estimates that 40% of the two-career-marriage executives would refuse to transfer, as would 15% of the single-career-marriage executives, and 10% of the unmarried executives.
• If a transfer offer is made to a randomly selected executive, what is the probability it will be refused?
Probability Tables
• Fill in this probability table:

            Two-Career   Single-Career   Unmarried
Refused
Accepted
Total       0.46         0.37            0.17
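A possible way to fill in the probability table is to multiply each marginal probability by the corresponding conditional refusal rate (the multiplication law), then add across the "Refused" row. The dictionary names are illustrative only.

```python
# Marginal probabilities of each marital category (first branches of the tree).
marital = {"two-career": 0.46, "single-career": 0.37, "unmarried": 0.17}
# Conditional probabilities of refusing a transfer, given the category.
refuse_given = {"two-career": 0.40, "single-career": 0.15, "unmarried": 0.10}

# Joint probabilities: P(category and refused) = P(category) * P(refused | category).
joint_refused = {k: marital[k] * refuse_given[k] for k in marital}
joint_accepted = {k: marital[k] - joint_refused[k] for k in marital}

p_refused = sum(joint_refused.values())

for k in marital:
    print(f"{k:14s} refused={joint_refused[k]:.4f} accepted={joint_accepted[k]:.4f}")
print("P(refused) =", round(p_refused, 4))
```

The same three products appear at the tips of the "refused" branches of a probability tree, so the table and the tree give the same overall refusal probability.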
Constructing Probability Trees
1. Events forming the first set of branches must have known
marginal probabilities, must be mutually exclusive, and
should exhaust all possibilities
2. Events forming the second set of branches must be entered
at the tip of each of the sets of first branches. Conditional
probabilities, given the relevant first branch, must be entered,
unless assumed independence allows the use of
unconditional probabilities
3. Branches must always be mutually exclusive and exhaustive
Probability Tree
• Construct a probability tree
Let’s Make a Deal
• In the show Let’s Make a Deal, a prize is hidden behind one of three doors. The contestant picks one of the doors.
• Before the chosen door is opened, one of the other two doors is opened and it is shown that the prize isn’t behind that door.
• The contestant is offered the chance to switch to the remaining door.
• Should the contestant switch?
• Solve by making a tree…
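The tree for the Let’s Make a Deal problem can be walked in code. This sketch rests on assumptions the slide does not state: the prize is equally likely to be behind each door, the host knows where it is and always opens a non-chosen door without the prize, and the host picks at random when two such doors are available.

```python
from fractions import Fraction

DOORS = (1, 2, 3)

def win_probabilities(pick: int = 1):
    """Walk the tree (prize location, then host's door); return (P(win | stay), P(win | switch))."""
    stay = switch = Fraction(0)
    for prize in DOORS:                       # first branches: where the prize is
        p_prize = Fraction(1, 3)
        host_options = [d for d in DOORS if d != pick and d != prize]
        for opened in host_options:           # second branches: door the host opens
            p_path = p_prize * Fraction(1, len(host_options))
            remaining = next(d for d in DOORS if d not in (pick, opened))
            if pick == prize:
                stay += p_path                # staying wins on this path
            if remaining == prize:
                switch += p_path              # switching wins on this path
    return stay, switch

stay, switch = win_probabilities()
print("P(win if stay)   =", stay)
print("P(win if switch) =", switch)
```

Under those assumptions switching wins with probability 2/3 and staying with 1/3; with a different host rule the tree, and possibly the answer, would change.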
Employee Drug Testing
• A firm has a mandatory, random drug testing policy
• The testing procedure is not perfect.
  • If an employee uses drugs, the test will be positive with probability 0.90.
  • If an employee does not use drugs, the test will be negative 95% of the time.
  • Confidential sources say that 8% of the employees are drug users
• 8% is an unconditional probability; 90% and 95% are conditional probabilities
Employee Drug Testing (cont.)
• Create a probability tree and verify the following probabilities:
  • Probability of randomly selecting a drug user who tests positive = 0.072
  • Probability of randomly selecting a non-user who tests positive = 0.046
  • Probability of randomly selecting someone who tests positive = 0.118
  • Conditional probability of testing positive given a non-drug user = 0.05
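The drug-testing figures can be checked by multiplying along the branches of the tree. The sketch below uses only the three numbers given (8%, 0.90, 95%). The final quantity, the probability of being a user given a positive test, is not one of the listed values and is included here as an extra illustration.

```python
p_user = 0.08               # unconditional probability of being a drug user
p_pos_given_user = 0.90     # P(positive | user)
p_neg_given_nonuser = 0.95  # P(negative | non-user)

p_nonuser = 1 - p_user
p_pos_given_nonuser = 1 - p_neg_given_nonuser   # complement rule

# Multiply along each branch of the tree (multiplication law).
p_user_and_pos = p_user * p_pos_given_user
p_nonuser_and_pos = p_nonuser * p_pos_given_nonuser

# The two "positive" branch tips are mutually exclusive, so add them.
p_pos = p_user_and_pos + p_nonuser_and_pos

# Extra illustration: reverse the conditioning.
p_user_given_pos = p_user_and_pos / p_pos

print("P(user and positive)     =", round(p_user_and_pos, 3))
print("P(non-user and positive) =", round(p_nonuser_and_pos, 3))
print("P(positive)              =", round(p_pos, 3))
print("P(positive | non-user)   =", round(p_pos_given_nonuser, 3))
print("P(user | positive)       =", round(p_user_given_pos, 3))
```

Even with a fairly accurate test, a sizable share of positive results come from non-users, because non-users greatly outnumber users.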
Next Time…
• Random variables and probability distributions
Homework #2
• Hildebrand/Ott
  • 3.3
  • 3.4
  • 3.5
  • 3.8
  • 3.10, 3.11, and 3.12 on pages 76-77. These all draw on the same data, so it’s easy to deal with them together. Note that those who recalled the commercial correctly are in the “favorable” and “unfavorable” columns.
  • 3.14
  • 3.24, pages 90-91. Observe that the rows of the given table sum to 1. These are thus conditional probabilities for the retest, given the results of the first test. For example, P(Retest = minor | First = major) = 0.5. Part (c) asks you to supply two numbers.
  • 3.28
  • 3.29
• Verzani
  • NONE