Class #4 Notes - NYU Stern School of Business


Statistics & Data Analysis
Course Number: B01.1305
Course Section: 31
Meeting Time: Wednesday 6-8:50 pm
CLASS #4
Class #4 Outline
■ Brief review of last class
■ Questions on homework
■ Chapter 5 – Special Distributions
Professor S. D. Balkin -- Feb. 19, 2003
Review of Last Class
■ Probability trees
■ Probability distribution functions
• Expected value
• Standard deviation
Chapter 5
Some Special Probability Distributions
Chapter Goals
■ Introduce some special, often used distributions
■ Understand methods for counting the number of sequences
■ Understand situations consisting of a specified number of
distinct success/failure trials
■ Understand random variables that follow a bell-shaped
distribution
Counting Possible Outcomes
■ In order to calculate probabilities, we often need to count how
many different ways there are to do some activity
■ For example, how many different outcomes are there from
tossing a coin three times?
■ To help us to count accurately, we need to learn some
counting rules
■ Multiplication Rule: If there are m ways of doing one thing
and n ways of doing another thing, there are m times n ways
of doing both
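As a quick check of the multiplication rule, the coin-toss question can be answered by listing every outcome. This is a minimal sketch in R (the language used later in these notes); the variable names are illustrative and not from the slides.

```r
# Each toss has 2 possible results, so the multiplication rule
# predicts 2 * 2 * 2 outcomes for three tosses.
sides <- c("H", "T")

# expand.grid() lists every combination of the three tosses, one per row.
outcomes <- expand.grid(toss1 = sides, toss2 = sides, toss3 = sides,
                        stringsAsFactors = FALSE)

n_listed  <- nrow(outcomes)      # count by enumeration
n_by_rule <- length(sides)^3     # count by the multiplication rule

print(outcomes)
cat("Listed:", n_listed, " By rule:", n_by_rule, "\n")
```

Listing works for small problems; the rule gives the same count without writing out the sample space.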
Example
■ An auto dealer wants to advertise that for $20,000 you can buy
either a convertible or a 4-door car with your choice of either
wire or solid wheel covers.
■ How many different arrangements of models and wheel
covers can the dealer offer?
Counting Rules
■ Recall the classical interpretation of probability:
P(event) = number of outcomes favoring event / total number of outcomes
■ Need methods for counting possible outcomes without the labor of listing
the entire sample space
■ Counting methods arise as answers to:
• How many sequences of k symbols can be formed from a set of r distinct
symbols using each symbol no more than once?
• How many subsets of k symbols can be formed from a set of r distinct symbols
using each symbol no more than once?
■ Difference between a sequence and a subset is that order matters for a
sequence, but not for a subset
Counting Rules (cont)
■ Create all k=3 letter subsets and sequences of the r=5 letters:
A, B, C, D and E
■ How many sequences are there?
■ How many subsets are there?
Counting Rules (cont)
■ Choose a sequence of k = 3 letters from r = 5 letters
■ Number of sequences is called the number of permutations of
r symbols taken k at a time:
$_rP_k = \frac{r!}{(r-k)!}$
■ Choose a subset of k = 3 letters from r = 5 letters
■ Number of subsets is called the number of combinations of
r symbols taken k at a time:
$_rC_k = \frac{r!}{k!\,(r-k)!}$
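The two counting formulas can be checked for the A to E example with base R. This is a sketch only; `n_perm`, `n_comb` and `subsets` are illustrative names, and `combn()` is used just to cross-check the count by listing.

```r
r <- 5  # number of distinct letters: A, B, C, D, E
k <- 3  # letters chosen each time

# Permutations (sequences, order matters): r! / (r - k)!
n_perm <- factorial(r) / factorial(r - k)

# Combinations (subsets, order ignored): r! / (k! (r - k)!)
n_comb <- factorial(r) / (factorial(k) * factorial(r - k))

# Cross-check by listing: combn() returns one subset per column.
subsets <- combn(LETTERS[1:r], k)

cat("Sequences:", n_perm, " Subsets:", n_comb,
    " Subsets listed:", ncol(subsets), "\n")
```

Each subset of 3 letters can be ordered in 3! = 6 ways, which is why there are six times as many sequences as subsets.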
Review: Sequence and Subset
■ For a sequence, the order of the objects for each possible
outcome is different
■ For a subset, order of the objects is not important
Example
■ A group of three electronic parts is to be assembled into a plug-in unit for a
TV set
• The parts can be assembled in any order
• How many different ways can they be assembled?
■ There are eight machines but only three spaces on the machine shop
floor.
• How many different ways can eight machines be arranged in the three
available spaces?
■ The paint department needs to assign color codes for 42 different parts.
Three colors are to be used for each part. How many colors, taken three
at a time, would be adequate to color-code the 42 parts?
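One possible way to work these three questions in R is sketched below. The helper `n_permutations()` is a hypothetical name introduced here, and the color-code answer assumes that the order of the three colors on a part does not matter (a combination); if order did matter, a permutation count would be used instead.

```r
# Hypothetical helper: permutations of r objects taken k at a time.
n_permutations <- function(r, k) factorial(r) / factorial(r - k)

# 1. Three parts assembled in any order: all 3 are used and order matters.
ways_parts <- n_permutations(3, 3)

# 2. Eight machines, three floor spaces: which machine goes where matters.
ways_machines <- n_permutations(8, 3)

# 3. Smallest number of colors whose 3-color subsets cover 42 parts
#    (assumption: the order of the colors on a part is irrelevant).
n_colors <- 3:12
colors_needed <- min(n_colors[choose(n_colors, 3) >= 42])

cat("Parts:", ways_parts, " Machines:", ways_machines,
    " Colors needed:", colors_needed, "\n")
```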
Binomial Distribution
■ Percentages play a major role in business
■ When a percentage is determined by counting the number of times
something happens out of the total possibilities, the occurrences might
follow a binomial distribution
■ Examples:
• Number of defective products out of 10 items
• Of 100 people interviewed, the number who expressed intention to buy
• Number of female employees in a group of 75 people
• Of all the stocks traded on the NYSE, the number that went up yesterday
Binomial Distribution (cont)

■ Each time the random experiment is run, either the event happens or it
doesn't
■ The random variable X, defined as the number of occurrences of a
particular event out of n trials, has a binomial distribution if:
1. For each of the n trials, the event always has the same probability π of
happening
2. The trials are independent of one another
Binomial Proportion: $p = \frac{X}{n} = \frac{\text{Number of occurrences}}{\text{Number of trials}}$
Example: Binomial Distribution
■ You are interested in the next n=3 calls to a catalog order desk and
know from experience that 60% of calls will result in an order
■ What can we say about the number of calls that will result in an
order?
■ Questions:
• Create a probability tree
• Create a probability distribution table
• What is the expected number of calls resulting in an order?
• What is the standard deviation?
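The distribution table, expected value and standard deviation for this example can be sketched in R as follows. The names `p_order` and `dist_table` are illustrative; the mean and standard deviation are computed directly from the table using the expected-value definitions reviewed from last class.

```r
n       <- 3      # calls
p_order <- 0.60   # probability that a call results in an order

# Probability distribution table for X = number of orders (0, 1, 2, 3).
x    <- 0:n
prob <- dbinom(x, size = n, prob = p_order)
dist_table <- data.frame(orders = x, probability = prob)

# Expected value and standard deviation from the table.
mean_x <- sum(x * prob)
sd_x   <- sqrt(sum((x - mean_x)^2 * prob))

print(dist_table)
cat("Expected orders:", mean_x, " Standard deviation:", sd_x, "\n")
```

Each row of the table corresponds to the branches of the probability tree with the same number of orders, added together.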
Binomial Distribution the Easy Way
                     Number of Occurrences, X           Proportion or Percentage, p
Mean                 $E(X) = n\pi$                      $E(p) = \pi$
Standard Deviation   $\sigma_X = \sqrt{n\pi(1-\pi)}$    $\sigma_p = \sqrt{\pi(1-\pi)/n}$
Finding Binomial Probabilities
$P(X = a) = \binom{n}{a}\,\pi^{a}\,(1-\pi)^{n-a}$
Example: Binomial Probabilities
■ How many of your n=6 major customers will call tomorrow?
■ There is a 25% chance that each will call
■ Questions:
• How many do you expect to call?
• What is the standard deviation?
• What is the probability that exactly 2 call?
• What is the probability that more than 4 call?
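These four questions can be worked with the binomial mean and standard deviation formulas plus `dbinom()` and `pbinom()`. A sketch, with illustrative variable names; the exact-2 probability is also computed by hand from the binomial probability formula as a cross-check.

```r
n      <- 6      # major customers
p_call <- 0.25   # chance that each one calls

# Mean and standard deviation of the number of calls.
expected_calls <- n * p_call
sd_calls       <- sqrt(n * p_call * (1 - p_call))

# P(X = 2): built-in function and the formula choose(n, a) * p^a * (1 - p)^(n - a).
p_exactly_2         <- dbinom(2, size = n, prob = p_call)
p_exactly_2_formula <- choose(n, 2) * p_call^2 * (1 - p_call)^(n - 2)

# P(X > 4) = 1 - P(X <= 4), using the cumulative distribution function.
p_more_than_4 <- 1 - pbinom(4, size = n, prob = p_call)

cat("Expected:", expected_calls, " SD:", sd_calls, "\n")
cat("P(X = 2):", p_exactly_2, " P(X > 4):", p_more_than_4, "\n")
```

Note that "more than 4" means 5 or 6 calls, so the cumulative probability is taken at 4, not 5.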
Example
■ It's been a terrible day for the capital markets with losers
beating winners 4 to 1
■ You are evaluating a mutual fund composed of 15 randomly
selected stocks and will assume a binomial distribution for the
number of securities that lost value
■ Questions:
• What assumptions are being made?
• What is the random variable?
• How many securities do you expect to lose value?
• What is the standard deviation of the random variable?
• Find the probability that 8 securities lose value
• What is the probability that 12 or more lose value?
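A sketch of the numerical parts of this example in R. It assumes that "losers beating winners 4 to 1" translates to a probability of 4/5 = 0.8 that any one stock lost value, and that every stock was either a winner or a loser; the variable names are illustrative.

```r
n      <- 15       # stocks in the fund
p_lose <- 4 / 5    # assumption: 4 losers for every 1 winner

# Mean and standard deviation of X = number of securities that lost value.
expected_losers <- n * p_lose
sd_losers       <- sqrt(n * p_lose * (1 - p_lose))

# P(X = 8)
p_exactly_8 <- dbinom(8, size = n, prob = p_lose)

# P(X >= 12) = 1 - P(X <= 11)
p_12_or_more <- 1 - pbinom(11, size = n, prob = p_lose)

cat("Expected:", expected_losers, " SD:", sd_losers, "\n")
cat("P(X = 8):", p_exactly_8, " P(X >= 12):", p_12_or_more, "\n")
```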
The Normal Distribution
Normal Distribution
■ The normal distribution is sometimes called a Gaussian
Distribution, after its inventor, C. F. Gauss (1777-1855).
■ Well-known "bell-shaped" distribution
■ Mean and standard deviation determine center and spread of
the distribution curve
■ The mathematical formula for the normal density f(y) is given in HO,
p. 157. We won't be needing this formula; just tables of areas
under the curve.
■ The empirical rule holds for all normal distributions
■ Probability of an event corresponds to area under the
distribution curve
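The empirical rule can be checked as areas under the normal curve with `pnorm()`. This is a sketch; `within_k_sd()` is a hypothetical helper, and the mean of 20 and standard deviation of 3 in the second check are arbitrary values chosen only to show that the areas do not depend on the particular normal distribution.

```r
# Area within k standard deviations of the mean, standard normal curve.
within_k_sd <- function(k) pnorm(k) - pnorm(-k)
coverage <- sapply(1:3, within_k_sd)

# Same areas for a normal curve with an arbitrary mean and standard deviation.
mu    <- 20
sigma <- 3
coverage_other <- sapply(1:3, function(k) {
  pnorm(mu + k * sigma, mean = mu, sd = sigma) -
    pnorm(mu - k * sigma, mean = mu, sd = sigma)
})

# Roughly 68%, 95% and 99.7% within 1, 2 and 3 standard deviations.
print(round(coverage, 4))
print(round(coverage_other, 4))
```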
Standard Normal Distribution
■ Normal Distribution with μ = 0 and σ = 1
■ Letter Z is used to denote a random variable that follows a
Standard Normal Distribution
Visualization
[Figure: symmetrical bell curve with a tail on each side; the mean, median and mode coincide at the center]
Characteristics
■ Bell-shaped with a single peak at the exact center of the
distribution
■ Mean, median and mode are equal and located at the peak
■ Symmetrical about the mean
■ Falls off smoothly in both directions, but the curve never
actually touches the X-axis
Why It's Important
■ Many psychological and educational variables are distributed
approximately normally
• Measures of reading ability, introversion, job satisfaction, and memory are
among the many psychological variables approximately normally distributed
• Although the distributions are only approximately normal, they are usually quite
close.
■ It is easy for mathematical statisticians to work with
• This means that many kinds of statistical tests can be derived for normal
distributions
• Almost all statistical tests discussed in this text assume normal distributions
• These tests work very well even if the distribution is only approximately
normal.
More Visualizations
[Figure: normal density curves of Length of Service (x-axis, 5 to 35 years) for three plants: σ = 3.1 years (Plant A), σ = 3.9 years (Plant B), σ = 5 years (Plant C); y-axis labeled dnorm(x, 20, 3.1)]
Z-score
■ Compute probabilities using tables or computer
■ Convert to z-score:
$z = \frac{\text{number} - \text{mean}}{\text{standard deviation}}$
■ Look up CUMULATIVE PROBABILITY ON TABLE:
$P(Z \le z)$
Determining Probabilities
[Figure: three shaded areas under the standard normal curve: $P(Z < a)$, $P(Z > b)$, and $P(a < Z < b)$]
LOOKUP Table
Standard Normal Lookup Table
Example
■ Sales forecasts are assumed to follow a normal distribution
■ Target, or expected value, is $20M with a $3M standard
deviation
• What is the probability of sales lower than $15M?
• What is the probability sales exceed $25M?
• What is the probability sales are between $15M and $25M?
• What is the value of k such that the probability that sales exceed k is 60%?
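The sales questions can be sketched in R with `pnorm()` and `qnorm()` in place of the lookup table. Variable names are illustrative. The first probability is computed twice, once directly and once through the z-score, to show that the two routes agree.

```r
mu    <- 20   # expected sales, $M
sigma <- 3    # standard deviation, $M

# P(sales < 15): directly, and via the z-score (15 - mean) / sd.
z_15          <- (15 - mu) / sigma
p_below_15    <- pnorm(15, mean = mu, sd = sigma)
p_below_15_z  <- pnorm(z_15)

# P(sales > 25) is an upper tail: 1 minus the cumulative probability.
p_above_25 <- 1 - pnorm(25, mean = mu, sd = sigma)

# P(15 < sales < 25) is the difference of two cumulative probabilities.
p_between <- pnorm(25, mean = mu, sd = sigma) - pnorm(15, mean = mu, sd = sigma)

# P(sales > k) = 0.60 is the same as P(sales <= k) = 0.40.
k <- qnorm(0.40, mean = mu, sd = sigma)

cat("z for 15:", z_15, "\n")
cat("P(<15):", p_below_15, " P(>25):", p_above_25, " P(15..25):", p_between, "\n")
cat("k:", k, "\n")
```

Because 15 and 25 are the same distance from the mean, the two tail probabilities are equal by symmetry, and k falls below the mean because more than half of the area must lie above it.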
Example
■ Benefits compensation costs for employees with a certain
financial services firm are approximately normally distributed
with a mean of $18,600 and standard deviation of $2,700.
• Find the probability that an employee chosen at random has a
benefits package that costs less than $15,000
• Find the probability that an employee chosen at random has a
benefits package that costs more than $21,000
• What is the value of k such that the probability that benefits
compensation exceeds k is 95%?
Example
■ A telephone-sales firm is considering purchasing a machine that randomly
selects and automatically dials telephone numbers
■ The firm would be using the machine to call residences during the
evening; calls to business phones would be wasted.
■ The manufacturer of the machine claims that its programming reduces the
business-phone rate to 15%
■ As a test, 100 phone numbers are to be selected at random from a very
large set of possible numbers
• Are the binomial assumptions satisfied?
• Find the probability that at least 24 of the numbers belong to business phones
• If in fact 24 of the 100 numbers turn out to be business phones, does that cast
serious doubt on the manufacturer's claim?
• Find the expected value and standard deviation of the number of business
phone numbers in the sample
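A sketch of the calculations for this example, taking the manufacturer's 15% claim as the binomial success probability. Variable names are illustrative, and the z-score at the end is just one informal way to judge how unusual 24 business numbers would be.

```r
n          <- 100    # phone numbers sampled
p_business <- 0.15   # claimed business-phone rate

# Expected value and standard deviation of the number of business phones.
expected_business <- n * p_business
sd_business       <- sqrt(n * p_business * (1 - p_business))

# P(X >= 24) = 1 - P(X <= 23)
p_at_least_24 <- 1 - pbinom(23, size = n, prob = p_business)

# How many standard deviations above the expected value is 24?
z_24 <- (24 - expected_business) / sd_business

cat("Expected:", expected_business, " SD:", sd_business, "\n")
cat("P(X >= 24):", p_at_least_24, " z for 24:", z_24, "\n")
```

If the claim were true, a count this far above the expected value would be rare, so observing 24 business phones would be evidence against the 15% figure.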
Example
■ Assume the stock market closed at 8,000 yesterday.
■ Today you expect the market to rise a mean of 1 point, with a
standard deviation of 34 points. Assume a normal
distribution.
• What is the probability the market goes down tomorrow?
• What is the probability the market goes up more than 10 points tomorrow?
• What is the probability the market goes up more than 40 points tomorrow?
• What is the probability the market goes up more than 60 points tomorrow?
• Find the probability that the market changes by more than 20 points in either
direction.
• What is the value of k such that the probability that the market close exceeds k is 75%?
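One way to set up these questions in R is to model the point change as normal with mean 1 and standard deviation 34, then add the change to the previous close for the last question. This is a sketch with illustrative variable names.

```r
mu              <- 1      # expected change in points
sigma           <- 34     # standard deviation of the change
close_yesterday <- 8000

# P(market goes down) = P(change < 0)
p_down <- pnorm(0, mean = mu, sd = sigma)

# P(change > 10), P(change > 40), P(change > 60): upper tails.
thresholds <- c(10, 40, 60)
p_up_more  <- 1 - pnorm(thresholds, mean = mu, sd = sigma)

# P(|change| > 20): lower tail below -20 plus upper tail above +20.
p_move_20 <- pnorm(-20, mean = mu, sd = sigma) +
  (1 - pnorm(20, mean = mu, sd = sigma))

# P(close > k) = 0.75 means the change exceeds its 25th percentile.
k <- close_yesterday + qnorm(0.25, mean = mu, sd = sigma)

cat("P(down):", p_down, "\n")
cat("P(up more than 10, 40, 60):", p_up_more, "\n")
cat("P(move > 20 either way):", p_move_20, "\n")
cat("k:", k, "\n")
```

Because the mean change is 1 rather than 0, the two tails in the "either direction" question are not quite equal, so they are added separately instead of doubling one of them.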
Using R
• factorial(n) – n!
• dbinom(x, n, p) – binomial probability distribution function
• pbinom(x, n, p) – binomial cumulative distribution function
• pnorm(q, mean, sd) – normal cumulative distribution function
• qnorm(p, mean, sd) – inverse CDF
Homework #4
■ Hildebrand/Ott
• 5.2, page 141
• 5.3, page 141
• 5.9, page 150
• 5.14, page 150
• 5.32, page 163
• 5.33, page 163
• 5.34, page 163
■ Verzani
• 6.5
• Reading: Chapter 6 (all) and 7 (all).