Survey Methods - University of California, Santa Barbara
Discrete Choice Models
for Modal Split
Overview
Outline
General procedure for model application
Basic assumptions in Random Utility Model
Uncertainty in choice
Utility & Logit model
Numerical example
Application issues in four step model
Summary
Individual & Travel Data
Choice Model Formulation
Estimate Disaggregate Choice Model(s)
Predict Exogenous Explanatory & Policy Variables
Apply Prediction Procedure
Aggregate (TAZ) Travel Prediction
Insert Predicted Proportions for Each Mode in the Four Step Sequence
Theory from microeconomics
We will skip the more theoretical description of
principles, theorems, lemmas
Emphasize practical aspects
Look at examples
Note: Dan McFadden is Professor of Economics
and Nobel Laureate in Economics
http://emlab.berkeley.edu/users/mcfadden/
A site that contains a very good bibliography on
Random Utility Models
Basic Assumptions (1)
Suppose a trip maker i faces J options (choices
or alternatives) with index j=1,2,3…J.
Assume that each trip maker associates with
each choice j=1,2,...,J a function called
UTILITY representing the "convenience" of
choosing mode j.
j=1,2,..., J is called the choice set. This is the set
from which a decision maker chooses an option.
Note: Let’s assume that choice and consideration sets are the same.
Basic Assumptions Example
– A person, i, needs to go to work from home to the
downtown area.
– Suppose this person has three possible modes to
choose from: Car (j=1), Bus (j=2), and her Bike
(j=3). Total number of options (J=3).
– One possible form of the person’s convenience
function (called utility) is:
Ucar=F (car attributes, person characteristics, trip attributes)
Ubus=F (bus attributes, person characteristics, trip attributes)
Ubike=F (bike attributes, person characteristics, trip attributes)
Utility components
Variables describing the individual --> this is an attempt to represent
"taste variation" from person to person. In our example if young
persons have systematically differing preferences from the older
individuals, then, age would be one of the variables.
Variables describing the choice characteristics (called choice
attributes) in the choice set. For example, some travel modes are less
expensive than others. Cost of the trip for each available mode would
be another variable in the utility. Travel time is another key variable.
Variables describing the context such as the trip type, time of day,
budget constraints.
Key Assumption (maximum utility)
Travelers (decision makers) formulate for each
option a utility and they calculate its value.
Then, they choose the option with the most
advantageous utility (maximum utility).
Example: U(car,bus,bike)=-0.5*cost-2*waiting time
Cost by bus=$1, Waiting time=5 minutes
Cost by car=$2.5, Waiting time=1 minute
Cost by bike=$0.2, Waiting time=0 minutes
Which mode is the most desirable, which is the second most desirable, etc.?
Utility is actually an Indirect Conditional Utility
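The ranking asked for in the example can be checked with a few lines of Python. This is only a minimal sketch: the function name `utility` and the dictionary layout are illustrative assumptions, while the coefficients (-0.5 for cost, -2 for waiting time) and the mode attributes are taken from the example.

```python
# Deterministic utility from the example: U = -0.5*cost - 2*waiting time
def utility(cost, waiting_time):
    return -0.5 * cost - 2.0 * waiting_time

# Mode attributes from the example (cost in dollars, waiting time in minutes)
modes = {
    "bus": {"cost": 1.0, "waiting_time": 5.0},
    "car": {"cost": 2.5, "waiting_time": 1.0},
    "bike": {"cost": 0.2, "waiting_time": 0.0},
}

utilities = {name: utility(**attrs) for name, attrs in modes.items()}

# Sort modes from the highest (most desirable) to the lowest utility
ranking = sorted(utilities, key=utilities.get, reverse=True)

for name in ranking:
    print(f"{name}: U = {utilities[name]:.2f}")
```

With these numbers the bike has the highest utility (-0.1), followed by the car (-3.25) and then the bus (-10.5).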
Uncertainty in utility (1)
We (analysts) do not know all the factors that influence
choice behavior
Travelers (decision makers) do not always make choices
consistently
We are not interested in including all possible variables
that affect behavior in our models
We are interested in policy variables (taxes, fares,
gasoline costs, waiting times) that we can “manipulate”
to find travelers reaction
We are also interested in social, demographic, and
economic traveler characteristics because these variables
allow us to link models to TAZs
Incorporating uncertainty and
traveler/trip characteristics
The example becomes: U(bus,car,bike)=-0.5*cost-2*waiting time + SOMETHING ELSE
The “something else” is an indicator of “general
mode attractiveness” AND a random component
Let’s look at the details:
Utility elements
Uij = aj - 0.5*costj - 2*waiting timej + bj * agei + eij
Uij: utility of person i for mode j
Utility elements
Uij = aj - 0.5*costj - 2*waiting timej + bj * agei + eij
aj: a constant for each mode j. Captures desirability of j for unknown reasons
Utility elements
Uij = aj - 0.5*costj - 2*waiting timej + bj * agei + eij
Cost is different for each mode j
Waiting time is different for each mode j
Utility elements
Uij = aj - 0.5*costj - 2*waiting timej + bj * agei + eij
The effect of the age variable is different for each alternate mode
(Class: Let’s talk about behavioral meaning - bikes?)
Utility elements
Uij = aj - 0.5*costj - 2*waiting timej + bj * agei + eij
eij: the key indicator of uncertainty = our ignorance & traveler variability for unknown reasons
Utility elements
Uij = aj - 0.5*costj - 2*waiting timej + bj * agei + eij
In a similar way as for age we can include other traveler and trip characteristics (explanatory)
In applications: the coefficients are parameters we estimate from data using regression methods
Utility elements
Uij = aj - 0.5*costj - 2*waiting timej + bj * agei + eij
Can write as: Uij = Vij + eij
Vij: systematic & measurable part
eij: random part
Numerical example
(trip from home to work/school)
Uicar = 6 - 0.5*cost - 2*waiting time + 0.15 * agei + eicar
Uibus = 5 - 0.5*cost - 2*waiting time + 0.25 * agei + eibus
Uibike = 12 - 0.5*cost - 2*waiting time - 0.3 * agei + eibike
Note: Different age coefficients - why?
Compare systematic part (V)
Compute for each person the systematic
part of utility for each mode
Plot all V (syst. utilities) for all persons
Horizontal = age
Vertical = V the systematic part of utility of
each mode
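A sketch of this computation in Python follows. The constants and age coefficients come from the numerical example; the cost and waiting-time values are an assumption, borrowed from the earlier maximum-utility example because the numerical example does not list them. The names `PARAMS`, `ATTRS` and `systematic_utilities` are illustrative.

```python
# Constant (aj) and age coefficient (bj) per mode, from the numerical example
PARAMS = {
    "car": {"constant": 6.0, "age_coef": 0.15},
    "bus": {"constant": 5.0, "age_coef": 0.25},
    "bike": {"constant": 12.0, "age_coef": -0.30},
}

# ASSUMPTION: cost ($) and waiting time (minutes) reused from the earlier
# maximum-utility example; the numerical example itself does not state them.
ATTRS = {
    "car": {"cost": 2.5, "waiting_time": 1.0},
    "bus": {"cost": 1.0, "waiting_time": 5.0},
    "bike": {"cost": 0.2, "waiting_time": 0.0},
}

def systematic_utilities(age):
    """Return the systematic utility V of each mode for a traveler of this age."""
    v = {}
    for mode, p in PARAMS.items():
        a = ATTRS[mode]
        v[mode] = (p["constant"] - 0.5 * a["cost"] - 2.0 * a["waiting_time"]
                   + p["age_coef"] * age)
    return v

# Tabulate V over an age range (the horizontal axis of the plot)
for age in range(0, 101, 20):
    v = systematic_utilities(age)
    print(age, {mode: round(value, 2) for mode, value in v.items()})
```

Because the bike has a negative age coefficient and the car and bus have positive ones, Vbike falls with age while Vcar and Vbus rise.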
[Figure: Modal Utilities. Vcar, Vbus and Vbike plotted against Age (horizontal axis, 0 to 100); vertical axis is Utility Value]
Probability of Choice
We need to convert utilities to an estimate
of the chance to choose a mode
The specific equation to use depends on the
probability distribution of the random
component (e) in the utility function
(U=V+e)
Ease of calculations should be considered in
selecting a probability function
LOGIT Model
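Assume the random components (eij) of the
utility are independent, identically Gumbel
distributed random variables. Then:
$$P_i(\text{car}) = \frac{\exp(V_{i,\text{car}})}{\sum_{j \in \{\text{car},\,\text{bus},\,\text{bike}\}} \exp(V_{ij})}$$
$$P_i(\text{bus}) = \frac{\exp(V_{i,\text{bus}})}{\sum_{j \in \{\text{car},\,\text{bus},\,\text{bike}\}} \exp(V_{ij})}$$
$$P_i(\text{bike}) = \frac{\exp(V_{i,\text{bike}})}{\sum_{j \in \{\text{car},\,\text{bus},\,\text{bike}\}} \exp(V_{ij})}$$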
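The logit formula can be sketched as a short Python function. The function name `logit_probabilities` and the utility values in `v_example` are hypothetical, chosen only to illustrate the calculation; subtracting the largest utility before exponentiating is a common numerical safeguard and does not change the resulting probabilities.

```python
import math

def logit_probabilities(v):
    """Multinomial logit: P(j) = exp(V_j) / sum over k of exp(V_k)."""
    v_max = max(v.values())  # shift by the max to avoid overflow in exp()
    exp_v = {mode: math.exp(value - v_max) for mode, value in v.items()}
    total = sum(exp_v.values())
    return {mode: e / total for mode, e in exp_v.items()}

# HYPOTHETICAL systematic utilities for one traveler (illustrative only)
v_example = {"car": 1.0, "bus": 0.0, "bike": -1.0}
p_example = logit_probabilities(v_example)

for mode, p in p_example.items():
    print(f"P({mode}) = {p:.3f}")
```

The probabilities always sum to one, and the mode with the highest systematic utility receives the highest probability, but never a probability of exactly one.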
[Figure: Probability to choose a mode. Pcar, Pbus and Pbike plotted against Age of Traveler (horizontal axis, 0 to 100); vertical axis is Probability]
Applications
Modal split (type of mode)
Route choice (link by link or entire path)
Car ownership (type of car)
Destination choice (shopping place)
Activity types (type of activity)
Residential unit (size and type of home)
Practical issues
Choice set - consideration set
Variables to include in utility
Measurement of mode attributes (e.g., in-vehicle travel time)
Need survey data and mode by mode
attributes!
Next: TAZ application and “complete”
enumeration
Individual & Travel Data
Choice Model Formulation
Estimate Disaggregate Choice Model(s)
Predict Exogenous Explanatory & Policy Variables
Apply Prediction Procedure
Aggregate (TAZ) Travel Prediction
Insert in the Four Step Sequence
For the four step modal split
We need aggregate TAZ proportions by
each mode (% of trips by car, % trips by
bus, % trips by bike)
We have a disaggregate (individual) model
which tells us the likelihood (chance) of a
person to choose each mode
We need a procedure to go from
disaggregate predictions of chance to
aggregate predictions of proportions
Taking Average TAZ
Characteristics Does Not Work
(Pa+Pb)/2 is not the same as P ([Va+Vb]/2)
- a and b are value points for V
When the two are equated we have the
Naïve method of aggregation
Bias depends on how close the probability
function is to a linear function
Following is an example from Probability to
choose bus as an option
Consider a TAZ with two persons with V=2 & V=12
[Figure: Probability of Choice (Pbus) versus Systematic Utility of Bus (Vbus), with P(V=2)=0.034 and P(V=12)=0.679 marked on the curve]
What is the correct TAZ
Proportion of Choosing the Bus?
(P(V=2)+P(V=12))/2
or
P((2+12)/2)=P(V=7)
The correct value is: [P(V=2)+P(V=12)]/2=0.357
[Figure: Probability of Choice (Pbus) versus Systematic Utility of Bus (Vbus), with P(V=2)=0.034, P(V=7)=0.223 and P(V=12)=0.679 marked on the curve]
[Figure: Probability of Choice (Pbus) versus Systematic Utility of Bus (Vbus), showing the gap between [P(V=2)+P(V=12)]/2=0.357 and P(V=7)=0.223]
Bias (see page 310 OW)
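The size of the bias in this two-person example can be worked out directly from the three probabilities marked on the curve. The sketch below only reuses those values; the variable names are illustrative.

```python
# Probabilities read off the bus-choice curve in the example
p_low = 0.034      # P(V=2)
p_high = 0.679     # P(V=12)
p_at_mean = 0.223  # P(V=7), where 7 = (2 + 12) / 2

# Correct TAZ share: average of the individual probabilities
correct_share = (p_low + p_high) / 2

# Naive TAZ share: probability evaluated at the average utility
naive_share = p_at_mean

bias = naive_share - correct_share

print("correct TAZ share:", round(correct_share, 4))
print("naive TAZ share:  ", round(naive_share, 4))
print("bias:             ", round(bias, 4))
```

Here the naïve method understates the bus share by roughly 0.13, because the probability curve is far from linear between V=2 and V=12.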
Naïve Aggregation
For each TAZ take the average value of
explanatory variables
Compute average value for each utility
function for each mode
Compute the corresponding probability and
use it as the TAZ proportion choosing each
mode
Market Segmentation
Divide the residents in each TAZ into
relatively homogeneous segments
Apply Naïve aggregation to each segment
and get proportions for each mode
Compute the TAZ proportion either as
average segment-specific proportion or
weighted segment-specific proportion
Complete Enumeration
Compute for each person and for each mode
the probability to choose a mode
Compute the proportion for each mode as
an average of the individual probabilities
Stochastic microsimulation is a method
derived from this - see also Chapter 12 of
Goulias, 2003 (red book)
Example
(TAZ with four persons)
             Age     Vcar       Vbus       Vbike
Segment 1    45      7.500      5.750      -1.600
Segment 2    21      3.900      -0.250     5.600
Segment 2    20      3.750      -0.500     5.900
Segment 3    79      12.600     14.250     -11.800
Average      41.25   6.938      4.813      -0.475
Exp (U)              1030.192   123.039    0.622
Naïve Prob           0.893      0.107      0.001
Vicar = 6 - 0.5*cost - 2*waiting time + 0.15 * agei
Vibus = 5 - 0.5*cost - 2*waiting time + 0.25 * agei
Vibike = 12 - 0.5*cost - 2*waiting time - 0.3 * agei
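The Average, Exp (U) and Naïve Prob rows of the four-person example can be reproduced with the sketch below. It starts from the four persons' systematic utilities as given in the example; variable names are illustrative. Because V is linear in age, averaging the individual V values gives the same result as evaluating V at the average age of 41.25.

```python
import math

# Systematic utilities of the four persons, taken from the example
persons = [
    {"car": 7.500, "bus": 5.750, "bike": -1.600},     # age 45
    {"car": 3.900, "bus": -0.250, "bike": 5.600},     # age 21
    {"car": 3.750, "bus": -0.500, "bike": 5.900},     # age 20
    {"car": 12.600, "bus": 14.250, "bike": -11.800},  # age 79
]
modes = ["car", "bus", "bike"]

# Step 1: average systematic utility of each mode over the TAZ
avg_v = {m: sum(p[m] for p in persons) / len(persons) for m in modes}

# Step 2: exponentiate the average utilities
exp_v = {m: math.exp(avg_v[m]) for m in modes}

# Step 3: logit probability at the average utilities = naive TAZ proportion
total = sum(exp_v.values())
naive_prob = {m: exp_v[m] / total for m in modes}

for m in modes:
    print(m, round(avg_v[m], 3), round(exp_v[m], 3), round(naive_prob[m], 3))
```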
Compare values of the three methods
                               Pcar    Pbus    Pbike
Average of Segments            0.372   0.329   0.298
Weighted Average of Segments   0.305   0.247   0.447
Naïve Aggregation              0.893   0.107   0.001
Complete Enumeration           0.318   0.248   0.434
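The segment-based and complete-enumeration values can be approximately reproduced from the four persons' systematic utilities with the sketch below. One point is an assumption rather than something the slides state: the reported segment figures match when each segment is represented by a single member and Segment 2 is represented by the age-20 traveler. Function and variable names are illustrative.

```python
import math

def logit(v):
    """Logit probabilities for a dict of systematic utilities."""
    total = sum(math.exp(x) for x in v.values())
    return {m: math.exp(x) / total for m, x in v.items()}

modes = ["car", "bus", "bike"]

# Systematic utilities of the four persons, taken from the example
persons = [
    {"car": 7.500, "bus": 5.750, "bike": -1.600},     # Segment 1, age 45
    {"car": 3.900, "bus": -0.250, "bike": 5.600},     # Segment 2, age 21
    {"car": 3.750, "bus": -0.500, "bike": 5.900},     # Segment 2, age 20
    {"car": 12.600, "bus": 14.250, "bike": -11.800},  # Segment 3, age 79
]

# Complete enumeration: average the individual probabilities
probs = [logit(v) for v in persons]
enumeration = {m: sum(p[m] for p in probs) / len(probs) for m in modes}

# Market segmentation.
# ASSUMPTION (inferred from the reported numbers, not stated in the source):
# one representative per segment, with the age-20 traveler for Segment 2.
representatives = {1: persons[0], 2: persons[2], 3: persons[3]}
seg_sizes = {1: 1, 2: 2, 3: 1}  # number of persons in each segment
seg_probs = {s: logit(v) for s, v in representatives.items()}

# Simple average of the segment-specific proportions
simple_avg = {m: sum(seg_probs[s][m] for s in seg_probs) / len(seg_probs)
              for m in modes}

# Average weighted by segment size
n = sum(seg_sizes.values())
weighted_avg = {m: sum(seg_sizes[s] * seg_probs[s][m] for s in seg_probs) / n
                for m in modes}

for name, result in [("Average of Segments", simple_avg),
                     ("Weighted Average of Segments", weighted_avg),
                     ("Complete Enumeration", enumeration)]:
    print(name, {m: round(result[m], 3) for m in modes})
```

The weighted segment average lands close to complete enumeration because the two Segment 2 travelers are nearly identical, while the unweighted average gives the single-person segments too much influence.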
Theoretical issues
Gumbel IID convenient but is it realistic?
IID components imply unrelated options in
the unobserved components - new models
account for relations
Trips are related - different formulations
See CE 523
Additional sources
http://www.bts.gov/ntl/DOCS/SICM.html
(Spear’s report on how to apply models)
http://www.bts.gov/ntl/DOCS/UT.html
(self-instructional overview with examples)
http://www.tfhrc.gov///////safety/pedbike/vol2/sec2.5.htm
(simple description of most of the key issues)
All sites accessed September 22, 2003
Summary
Rational economic behavior
Utility linear in systematic and random
components
Choice probability is function of utilities –
non linear function!
Application by enumeration is best; weighted average by market segments may
be good - depends on application!
Aggregate models are also available –
approximate!
Surveys must be used for this step
Additional reading suggestions
(for future reference)
Ortuzar Willumsen - Chapter 8 (8.1, 8.2,
8.3)
Ortuzar Willumsen - Chapter 9 (9.1, 9.2,
9.3)