forecasting presidential elections

FORECASTING
PRESIDENTIAL ELECTIONS
N. R. Miller
POLI 423
Forecasting Election Outcomes
• Let us consider how — and how accurately and how far
in advance — the outcome of Presidential elections can
be predicted.
• We need to distinguish between
– short-term forecasts (made shortly before the Presidential
election), and
– long-term forecasts (made at the outset of the general election
campaign or even earlier).
• We will consider three types of forecasts:
– pre-election polls and surveys;
– a more or less qualitative scoring method; and
– predictive models based on long-term patterns in aggregate
data.
Pre-Election Polls
• In the relatively distant past, pre-election polls taken even shortly
before election day have sometimes been disastrously and famously
wrong — most notably
– a Literary Digest poll predicted the defeat of President Roosevelt in
1936, and
– the Gallup Poll (and almost all other polls) predicted the defeat of
President Truman in 1948.
The Literary Digest Polls
• The Literary Digest collected from telephone companies and state
motor vehicle departments lists with the names and addresses of
telephone subscribers and registered automobile owners.
• The magazine then mailed out many millions of “sample ballots” to
these names and addresses.
• Many people did not respond, of course, but the Digest still received
millions of ballots, on the basis of which it forecast the outcome of
the election.
• This procedure was quite successful in predicting the Republican
presidential victories in the 1920s and also Franklin Roosevelt’s
victory in 1932.
• But in 1936 it predicted that President Roosevelt would be badly
defeated for re-election by Alf Landon, when in fact he won in one of
the greatest Presidential landslides of all time.
• Can you account for why the Literary Digest forecasts that had
previously been successful failed so badly in 1936?
Gallup and Other Polls in 1948
• Gallup (and other) polls still used quota sampling (rather
than random sampling).
• There were relatively few polling organizations, each of
which took relatively few polls.
• Gallup averaged together the results of their several
most recent polls.
• Gallup stopped polling altogether about two weeks
before the election.
• There were two third-party candidacies, support for
which was hard to predict.
• In the election itself, Truman ran relatively weakly in the
Northeast but much more strongly in the farm belt and
the West.
– Hence the famous Chicago Tribune photo.
Polling and Survey Research Today
• Today most commercial and media polls use well-tested
methodologies,
– based on some variant of random sampling, and with
– considerable care going into the wording of the questions and
their assembly into a questionnaire.
• Most such polls are very accurate, giving very similar
estimates.
• However, all such polls are subject to sampling error,
stated in terms of a margin of error of ± X% that depends
largely on sample size.
• They are also subject to “house effects” pertaining to
mode of interview, call-back procedures, various
adjustments, and (especially) their “likely voter” screen.
• Pre-election polls ask something like “If the election were
held today, who would you vote for?”
• Such polls taken many months in advance of an election
bear no reliable relation to the actual outcome.
• Hence the question: why are American Presidential
election campaign polls so variable [over time] when
voters [in the aggregate] are so predictable?
[Charts for 2000, 2004, and 2008]
Lichtman’s Keys to the White House
• Lichtman developed this system in 1981, in collaboration
with Volodia Keilis-Borok, a world-renowned [Soviet]
authority on the mathematics of prediction models.
• It presumes that presidential elections are primarily
referenda on how well the party holding the White House
has governed during its term, but taking account of more
than economic performance.
• The Keys system further presumes that “by default” the
incumbent party (the party that controls the White House) wins
unless six or more keys are false [turned against the
incumbent].
– The keys give specificity to this idea of how presidential
elections work, assessing the performance, strength, and unity of
the party holding the White House to determine whether or not it
has crossed the threshold that separates victory from defeat.
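
The decision rule just described can be sketched in a few lines of Python. This is only an illustration of the threshold logic, not Lichtman's own implementation. The count of 13 keys comes from the published system rather than from this slide, and the function and constant names are hypothetical.

```python
from typing import Sequence

NUM_KEYS = 13        # assumption: the published system uses 13 true/false keys
LOSS_THRESHOLD = 6   # from the slide: six or more false keys defeat the incumbent party


def incumbent_party_wins(keys: Sequence[bool]) -> bool:
    """Return True if the incumbent party is predicted to win.

    Each entry is True when the key favors the incumbent party and
    False when it has turned against it.
    """
    if len(keys) != NUM_KEYS:
        raise ValueError(f"expected {NUM_KEYS} keys, got {len(keys)}")
    false_keys = sum(1 for key in keys if not key)
    # By default the incumbent party wins; it loses only at the threshold.
    return false_keys < LOSS_THRESHOLD


# Hypothetical key settings, for illustration only.
five_false = [False] * 5 + [True] * 8
six_false = [False] * 6 + [True] * 7
print(incumbent_party_wins(five_false))
print(incumbent_party_wins(six_false))
```

The rule yields only a win/lose call; it does not estimate a vote share.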
Polling and Survey Research Today (cont.)
• All such polls are subject to sampling error,
stated in terms of a margin of error of ± X% that depends
largely on sample size.
– In general, the margin of error is inversely proportional to the
square root of sample size.
– Margin of error = X%: if we took a great many random samples of
this size from the same population, 95% of them would come
within ±X percentage points of the “true value” (population parameter).
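
The square-root relationship can be made concrete with a short sketch. It assumes a simple random sample, the conventional 95% multiplier of 1.96, and the worst case of a 50/50 split. None of these specifics appear on the slide, and the function name is hypothetical.

```python
import math


def margin_of_error(sample_size: int, share: float = 0.5, z: float = 1.96) -> float:
    """Approximate 95% margin of error for a proportion, as a fraction.

    Assumes simple random sampling; share=0.5 gives the largest margin.
    """
    if sample_size <= 0:
        raise ValueError("sample_size must be positive")
    return z * math.sqrt(share * (1.0 - share) / sample_size)


# Quadrupling the sample size halves the margin of error.
for n in (250, 1000, 4000):
    print(f"n = {n}: +/- {100 * margin_of_error(n):.1f} points")
# n = 250: +/- 6.2 points
# n = 1000: +/- 3.1 points
# n = 4000: +/- 1.5 points
```

Real polls use weighting and other adjustments, so their reported margins can differ from this textbook figure.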
Polling and Survey Research Today (cont.)
• Different polls are also subject to “house effects”
pertaining to mode of interview, call-back procedures,
various adjustments, and (especially) their “likely voter”
screen.
• Poll averages/aggregates are generally more accurate than
individual polls (e.g., the Real Clear Politics Poll Average).
• But all polls ask (literally or in effect): “If the election were
held today, how would you vote?”
– And voters may change their voting intentions.
• The claim of predictive models (e.g., Lichtman) is that
they can predict how voters will vote (in the aggregate)
better than voters can.
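
One reason poll averages tend to beat individual polls is that pooling increases the effective sample size. The sketch below weights each poll by its sample size and treats the polls as independent simple random samples of the same population. That is a strong assumption, since it ignores house effects and changes in voting intentions over time. The poll figures are made up for illustration and the function name is hypothetical.

```python
import math


def pooled_estimate(polls):
    """Combine polls given as (candidate_share, sample_size) pairs.

    Returns the sample-size-weighted share and an approximate 95% margin
    of error for the pooled sample, both as fractions.
    """
    total_n = sum(n for _, n in polls)
    if total_n <= 0:
        raise ValueError("polls must contain at least one respondent")
    share = sum(s * n for s, n in polls) / total_n
    margin = 1.96 * math.sqrt(share * (1.0 - share) / total_n)
    return share, margin


# Hypothetical polls: (share for a candidate, number of respondents).
example_polls = [(0.52, 800), (0.49, 1000), (0.51, 1200)]
share, margin = pooled_estimate(example_polls)
print(f"pooled share = {100 * share:.1f}%, margin = +/- {100 * margin:.1f} points")
```

The pooled margin is smaller than that of any single poll in the list, but it still describes voting intentions at the time of the polls, not on election day.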
Predictive Models
• Forecasting models constructed by social scientists seek
– to predict the percent of the popular vote received by
the Presidential candidate of the party that controls the
White House, and
– to do this using information that is available well
before the election (e.g., in mid-summer).
• All such models use information pertaining to
– the recent performance of the economy, and
– the recent popularity of the incumbent President.
• The models differ with respect to the exact measures of
economic performance and Presidential popularity that
are used and with respect to what other variables [if any]
are also used.
PS #1: Problem 2
The predictive models constructed by Abramowitz, Fair, Lewis-Beck, and
others all use information concerning (i) the performance of the economy
and (ii) the popularity of the incumbent President (and perhaps other
information) that is available long before the election in order to
predict the percent of the popular vote received by the Presidential
candidate of the party that controls the White House. (The models differ
with respect to the exact measures of economic performance and
Presidential popularity that are used and with respect to what other
variables [if any] are also used.) Attached you will find data for
economic performance and Presidential popularity for each Presidential
election from 1948 through 2008, as well as the percent of the popular
vote received by the incumbent party’s Presidential candidate for
1948-2008. Also find and fill in the data for “Streak” and “Electoral
Votes.” Can you devise some kind of rule or formula based on this data to
predict the percent of the popular vote won by the incumbent party
candidate? According to your rule or formula, is President Obama likely
to be re-elected?
YEAR is the Presidential election year.
INC is 1 if the incumbent President is running for re-election; 0 if
“open-seat” election.
GDP is real annualized GDP growth over the Fall, Winter, and Spring
quarters preceding the election (e.g., from October 1, 2011 through
June 30, 2012) (from U.S. Department of Commerce, Bureau of Economic
Analysis, http://www.bea.gov/national/index.htm#gdp).
UNEMP is the Unemployment Rate in July preceding the election (from U.S.
Department of Labor, Bureau of Labor Statistics,
http://data.bls.gov/timeseries/LNS14000000).
ΔUN is the change in the Unemployment Rate from July of the year
preceding the election to July of the election year.
STRK is “streak,” i.e., the number of consecutive elections won by the
incumbent party.
PRES is the incumbent President’s approval rating in the first Gallup
Poll taken after June 30 of the election year (Gallup Reports and
http://www.gallup.com/).
PV is the percent of the two-party (i.e., excluding Perot, Nader, etc.)
popular vote won by the incumbent party candidate.
EV is the number of electoral votes won by the incumbent party candidate
(including “faithless electors”). The total number of electoral votes
was 531 prior to 1960, 537 in 1960, and 538 since 1964.
* 39 electoral votes were cast for States Rights Democrat candidate
J. Strom Thurmond.
** 14 “unpledged” electoral votes were cast for Harry F. Byrd.
*** 45 electoral votes were cast for American Independent Party
candidate George C. Wallace.
Summary Statistics: All Elections (excluding 2012)
Summary Statistics: Incumbent vs. Open-Seat Elections (excluding 2012)
Scattergram of PV by PRES
Scattergram of PV by PRES (Labelled by YEAR)
Scattergram of PV by PRES (with Regression Line)
Definition of Regression Line and R Squared
The Regression Equation: Predicting the PV in 2012 on the Basis of PRES Only
PV by PRES: Incumbent vs. Open-Seat Elections
Scattergram of PV by GDP
PRES by GDP (including 2012)
PV by GDP: Incumbent vs. Open-Seat Elections
Unemployment and Presidential Reelection
PV by Unemployment
PV by Unemployment: Incumbent vs. Open-Seat Elections
PV by UNEMPL: Democratic vs. Republican Incumbent Status
PV by DELTA (Change in Unemployment)
DELTA by UNEMPL
PV by DELTA: Incumbent vs. Open-Seat Elections
PV by STREAK
PV by INC
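
The regression line and R squared named in the slide titles above can be computed by ordinary least squares. The sketch below is a generic one-predictor fit in plain Python, not the course's own calculation. The function name is hypothetical and the toy numbers are made up (they are not the election data).

```python
def fit_line(xs, ys):
    """Least-squares line y = intercept + slope * x, plus R squared.

    R squared is the share of the variation in y (around its mean)
    that the fitted line accounts for.
    """
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("x and y must each vary")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    r_squared = 1.0 - ss_res / syy
    return slope, intercept, r_squared


# Made-up toy data lying exactly on y = 1 + 2x.
slope, intercept, r_squared = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
print(slope, intercept, r_squared)  # 2.0 1.0 1.0
```

With real data the points scatter around the line, so R squared falls below 1. The same function could be applied to the PRES and PV columns once the data table is filled in.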
PV by GDP and PRES (and INC)
• Multiple Regression Equations:
PV = 36.13 + 0.467 x GDP + 0.298 x PRES
R² = 0.736; Adj. R² = 0.695
2012: PV = 36.13 + 0.467 x 2.6 + 0.298 x 47 = 51.35
PV = 35.48 + 0.439 x GDP + 0.278 x PRES + 2.74 x INC
R² = 0.792; Adj. R² = 0.740
2012: PV = 35.48 + 0.439 x 2.6 + 0.278 x 47 + 2.74 x 1 = 52.42
• For comparison:
PV = 36.12 + 0.337 x PRES
2012: PV = 36.12 + 0.337 x 47 = 51.96
PV = 47.45 + 1.18 x GDP
2012: PV = 47.45 + 1.18 x 2.6 = 50.52
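
The 2012 substitutions can be checked with a short script. It plugs GDP = 2.6 and PRES = 47 (the values used in the worked lines) and INC = 1 into each equation. The INC coefficient is taken as 2.74, the value that the worked 2012 line adds; treat that choice as an assumption. The helper name and dictionary layout are hypothetical.

```python
def predict_pv(intercept, coefficients, values):
    """Evaluate a linear equation: intercept + sum of coefficient * value."""
    return intercept + sum(coef * values[name] for name, coef in coefficients.items())


# 2012 inputs as used in the worked lines (INC = 1: the incumbent is running).
inputs_2012 = {"GDP": 2.6, "PRES": 47, "INC": 1}

# (intercept, coefficients) for each equation on the slide.
models = {
    "GDP + PRES": (36.13, {"GDP": 0.467, "PRES": 0.298}),
    "GDP + PRES + INC": (35.48, {"GDP": 0.439, "PRES": 0.278, "INC": 2.74}),
    "PRES only": (36.12, {"PRES": 0.337}),
    "GDP only": (47.45, {"GDP": 1.18}),
}

for name, (intercept, coefficients) in models.items():
    pv = predict_pv(intercept, coefficients, inputs_2012)
    print(f"{name}: predicted PV = {pv:.4f}")
```

Printing four decimals makes it easy to see how each slide figure was rounded.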
A Summary of Predictions
Predictive Models: 2004
Predictive Models Presented at APSA 2008
Political Scientist      Predicted Obama Vote
Brad Lockerbie           58%
Thomas Holbrook          55.5%
Alan Abramowitz          54.3%
Christopher Wlezien      52.2%
Alfred Cuzan             51.9%
Helmut Norpoth           50.1%
Michael Lewis-Beck       50.07%
James Campbell           Wait till Labor Day
Department of Commerce has revised second-quarter growth from
1.9% to 3.3%
Predictive Models: 2012
Alan Abramowitz 2012
PV = 47.3 + (.107*NETAPP) + (.541*Q2GDP) + (4.4*TERM1INC)
• PV stands for the predicted share of the major party vote for
the party of the incumbent president; NETAPP stands for the
incumbent president’s net approval rating (approval -
disapproval) in the final Gallup Poll in June; Q2GDP stands for
the annualized growth rate of real GDP in the second quarter
of the election year; and TERM1INC stands for the presence or
absence of a first-term incumbent in the race.
• In order to incorporate this polarization effect in the Time for
Change Model, I added a new predictor (POLARIZATION) for
elections since 1996.** The estimates for the revised model
are as follows:
PV = 46.9 + (.105*NETAPP) + (.635*Q2GDP) + (5.22*TERM1INC) - (2.76*POLARIZATION)
**For elections since 1996, the polarization variable takes on the value 1 when
there is a first-term incumbent running or when the incumbent president has a
net approval rating of greater than zero; it takes on the value -1 when there is
not a first-term incumbent running and the incumbent president has a net
approval rating of less than zero.
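
Both versions of the Time for Change equation can be written as small functions. This is a sketch built from the coefficients quoted above, not Abramowitz's own code. Two points are assumptions, because the quoted text does not spell them out: POLARIZATION is taken to be 0 before 1996, and a net approval of exactly zero without a first-term incumbent is treated as undefined. Function names and the example inputs are hypothetical.

```python
def time_for_change(netapp, q2gdp, term1inc):
    """Original equation: predicted incumbent-party share of the major-party vote."""
    return 47.3 + 0.107 * netapp + 0.541 * q2gdp + 4.4 * (1 if term1inc else 0)


def polarization(year, netapp, term1inc):
    """POLARIZATION as described in the footnote."""
    if year < 1996:
        return 0  # assumption: the variable is switched off before 1996
    if term1inc or netapp > 0:
        return 1
    if netapp < 0:
        return -1
    # The footnote does not cover net approval of exactly zero in this case.
    raise ValueError("POLARIZATION is undefined for this combination")


def time_for_change_revised(year, netapp, q2gdp, term1inc):
    """Revised equation with the POLARIZATION predictor."""
    return (
        46.9
        + 0.105 * netapp
        + 0.635 * q2gdp
        + 5.22 * (1 if term1inc else 0)
        - 2.76 * polarization(year, netapp, term1inc)
    )


# Hypothetical inputs for illustration only (not actual 2012 figures).
print(time_for_change(netapp=0, q2gdp=1.5, term1inc=True))
print(time_for_change_revised(year=2012, netapp=0, q2gdp=1.5, term1inc=True))
```

Under these definitions the revised equation pulls a first-term incumbent's predicted share down by 2.76 points relative to the same equation without the polarization term.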
EV by PV
EV by PV: Incumbents vs. Open Seat
DEVPC by DPV
1924 Electoral Map (“Solid South”)
Electoral Maps: 1956 vs. 1984