Slides. - Rasmusen, Eric
G492
Eric Rasmusen,
29 September 2009
Regressions
1
How would you explain how
regressions work?
Here is a fun way.
2
What does it mean for a test
to reject at the 5% significance
level (the 95% confidence level)?
3
Consider testing whether a coin is
fair by flipping it once and rejecting
fairness if Heads comes up. If the coin
is in fact fair, that test will falsely
reject 50% of the time. You DON’T say
there is a 50% probability the coin is
unfair. The 50% says something about
the reliability of the test, not just
about the estimate.
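A short simulation can make the one-flip test concrete. This is only a sketch: it assumes the coin really is fair, and the function name `false_rejection_rate`, the number of trials, and the seed are illustrative choices rather than anything from the course materials.

```python
import random


def false_rejection_rate(n_trials, p_heads=0.5, seed=0):
    """Share of trials in which the one-flip test rejects fairness.

    The test flips the coin once and rejects "the coin is fair"
    whenever Heads comes up.
    """
    rng = random.Random(seed)
    # Each trial is one flip; Heads (probability p_heads) means "reject".
    rejections = sum(1 for _ in range(n_trials) if rng.random() < p_heads)
    return rejections / n_trials


# Assumption: the coin is truly fair, so every rejection is a false one.
rate = false_rejection_rate(100_000)
print(f"Share of fair coins rejected by the test: {rate:.3f}")
```

The printed share describes how often the test errs when the coin is fair. It is a property of the test, not a probability that any particular coin is unfair.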
Rice Applet: Confidence Levels
4
KNOW THY DATA!
In being persuasive, data management is
important. Anyone can understand a
data entry mistake, and a single one
destroys credibility.
Consider assigning one member of the team
solely to the data.
5
ETHICS
“… the problem is not that one is always
being asked to step across a well-defined
line by unscrupulous lawyers.
Rather, it is that one becomes caught
up in the adversary proceeding itself
and acquires the desire to win. …”
“Continuing to regard oneself as objective,
one can slip little by little from true
objectivity.”
6
The Data (abridged)
7
An Excel file with the data is at
http://www.rasmusen.org/g492/data08.xls.
8
Principles of Tables
1. Keep the data-to-ink ratio high.
2. Leave out dividing lines and boxes unless you have a
good reason for them.
3. Leave off repetitive, useless numbers.
4. Don't use just capital letters.
5. Circle or otherwise mark important numbers, in
particular, ones you mention in the text or talk.
6. Make the table self-contained. Don't require the reader to
refer to the text or a previous table. Include the source
and the units of measurement.
7. Number and title every table.
9
Uniform Crime Reports: Crime in the United States, Section IV
10
Back to the Election Data
11
In absolute value, the correlation coefficient is the square root of the R² from a
regression of one variable on a constant and the other variable.
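The relationship between the correlation coefficient and R² can be checked numerically. The sketch below uses made-up numbers (not the election data), and the helper names `correlation` and `r_squared` are hypothetical. The comparison uses the absolute value of the correlation, since a square root cannot recover a negative sign.

```python
import math


def mean(values):
    return sum(values) / len(values)


def correlation(x, y):
    """Pearson correlation coefficient between x and y."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def r_squared(x, y):
    """R² from an OLS regression of y on a constant and x."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    intercept = my - slope * mx
    ss_res = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    ss_tot = sum((b - my) ** 2 for b in y)
    return 1 - ss_res / ss_tot


# Made-up numbers, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

print("|correlation|:", abs(correlation(x, y)))
print("sqrt(R²):     ", math.sqrt(r_squared(x, y)))
```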
12
If income
rises by $100, how
much does the
Republican vote
rise?
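One way to read the question: the slope coefficient gives the change in the vote per one-unit change in income, so the answer is 100 times the slope when income is measured in dollars. The sketch below uses hypothetical numbers (not the election data); the variable names and the units (dollars, percentage points) are assumptions made for illustration.

```python
def ols_slope(x, y):
    """Slope from an OLS regression of y on a constant and x."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


# Hypothetical data: income in dollars, Republican vote share in percent.
income = [30000, 35000, 40000, 45000, 50000]
vote_share = [45.0, 47.0, 48.0, 51.0, 52.0]

slope = ols_slope(income, vote_share)  # percentage points per dollar
change_per_100 = 100 * slope           # percentage points per $100

print(f"A $100 rise in income goes with a {change_per_100:.3f} point rise in the vote share")
```

If income were recorded in thousands of dollars instead, the coefficient would be 1,000 times larger and a $100 rise would correspond to 0.1 times the slope, which is one reason to state units in every table.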
13
Principles of Regression
Presentation
1. Only present relevant and meaningful numbers.
2. Do not write 1.23423 when rounding to 1.23 will do just
as well. Fewer digits yield greater clarity.
3. Use correlation matrices to show the simple correlations
between important variables.
4. Give summary statistics. Think about which are most
useful. Think about presenting the mean, median, mode,
minimum, maximum, standard deviations, and number of
observations. Do not present all of these--think.
14
5. Use words for variable names, not computer codes.
6. Present the coefficients, standard errors or t-statistics
(not both), R², and number of observations. Do not
present other statistics (e.g. an F-test for all coefficients
equalling zero) unless you have a reason to.
7. If the left-hand variable (y-variable, dependent variable,
endogenous variable) takes only a few values (e.g., 0
and 1) then use a special technique such as logit or tobit.
If a right-hand variable (x-variable, independent
variable, exogenous variable) takes only a few values,
that does not create a need to use anything besides OLS.
15
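To illustrate the last point in principle 7: with a single 0/1 right-hand variable, OLS reduces to a comparison of group means, so no special technique is needed. The sketch below uses hypothetical numbers, and `ols_fit` is an illustrative helper, not a function from any particular package.

```python
def ols_fit(x, y):
    """Return (intercept, slope) from OLS of y on a constant and x."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    return my - slope * mx, slope


# Hypothetical data: x is a 0/1 indicator, y is a continuous outcome.
x = [0, 0, 0, 1, 1, 1]
y = [10.0, 12.0, 14.0, 20.0, 22.0, 24.0]

intercept, slope = ols_fit(x, y)

mean_when_0 = sum(b for a, b in zip(x, y) if a == 0) / x.count(0)
mean_when_1 = sum(b for a, b in zip(x, y) if a == 1) / x.count(1)

# The intercept is the mean of y when x = 0; the slope is the gap in means.
print("intercept:", intercept, "mean of y when x = 0:", mean_when_0)
print("slope:", slope, "difference in means:", mean_when_1 - mean_when_0)
```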
Consider using nonlinear specifications,
as illustrated in the Java applet at
http://www.ruf.rice.edu/~lane/stat_sim/transformations/index.html
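As a rough illustration of why a nonlinear specification can help, the sketch below fits the same hypothetical data twice: once in levels and once with the left-hand variable in logs. The data are constructed to grow exponentially, which is an assumption made purely for the example; the log transformation is one common choice among many and is not necessarily the one the applet demonstrates.

```python
import math


def ols_fit(x, y):
    """Return (intercept, slope) from OLS of y on a constant and x."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    return my - slope * mx, slope


def r_squared(x, y):
    """R² from an OLS regression of y on a constant and x."""
    intercept, slope = ols_fit(x, y)
    my = sum(y) / len(y)
    ss_res = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    ss_tot = sum((b - my) ** 2 for b in y)
    return 1 - ss_res / ss_tot


# Hypothetical data: y grows exponentially in x, by construction.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [math.exp(0.5 * v) for v in x]
log_y = [math.log(v) for v in y]

r2_levels = r_squared(x, y)    # linear specification: y on x
r2_logs = r_squared(x, log_y)  # log-linear specification: log(y) on x
_, log_slope = ols_fit(x, log_y)

print(f"R² in levels: {r2_levels:.4f}")
print(f"R² in logs:   {r2_logs:.4f}")
print(f"Slope in the log specification: {log_slope:.3f}")
```

In the log specification the slope is read as an approximate proportional change in y per unit of x, so the wording of any table or talk should change along with the specification.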
16