PANEL DATA - Uniwersytet Warszawski
PANEL DATA
Development Workshop
What are we going to do today?
1. Panels: introduction and data properties
2. How to measure distance
3. What comes first: trade or GDP?
4. What else affects trade?
5. Role of currency?
Why panel data?
What is the point of panel data?
– pooled data in econometrics
– panels in econometrics
– long or wide?
– fixed or random effects?
Gravity model
All that theory is fine, but transport costs matter and market size matters: => push and pull
– Isard (1954), logs by Tinbergen (1962) [what if there were no barriers? "missing trade"], Linnemann (1966) [standard macro approach]
– Anderson (1979) [first theoretical model, expenditure-based]
– Helpman-Krugman (1985) [intra-industry trade]
– Bergstrand (1985) [general equilibrium, one country/one factor]
– Bergstrand (1989) [H-O model with the Linder hypothesis]
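For reference, the basic gravity equation this literature builds on is usually written as below, together with the log form that is actually estimated. The symbols are notation chosen here rather than taken from the slides: T_ij is bilateral trade between countries i and j, Y_i and Y_j are their GDPs, D_ij is distance, G is a constant, and the betas are elasticities.

```latex
T_{ij} = G \, \frac{Y_i^{\beta_1} \, Y_j^{\beta_2}}{D_{ij}^{\beta_3}}
\qquad\Longrightarrow\qquad
\ln T_{ij} = \ln G + \beta_1 \ln Y_i + \beta_2 \ln Y_j - \beta_3 \ln D_{ij} + \varepsilon_{ij}
```

Market size enters with a positive sign (pull) and distance, as a proxy for transport costs, with a negative sign (push).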
Simplest model
Variables:
– Explained: bilateral trade
– Explanatory: GDP, populations, distance
reg trade gdp pop dist
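      Source |       SS       df       MS              Number of obs =    1074
-------------+------------------------------           F(  3,  1070) =  543.02
       Model |  196764.006     3  65588.0021           Prob > F      =  0.0000
    Residual |  129238.275  1070  120.783434           R-squared     =  0.6036
-------------+------------------------------           Adj R-squared =  0.6025
       Total |  326002.281  1073  303.823188           Root MSE      =   10.99

------------------------------------------------------------------------------
 tradevolume |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      gdpsum |   .0141613   .0011921    11.88   0.000     .0118221    .0165004
population~m |   .0528096   .0228549     2.31   0.021     .0079642     .097655
    distance |  -.0073704   .0005152   -14.31   0.000    -.0083813   -.0063594
       _cons |   5.762674   1.067794     5.40   0.000     3.667467    7.857882
------------------------------------------------------------------------------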
Panel data
Same data, same question, but now the observations are groups followed over time
Stata learns this in one of three ways:
1. A pair of commands:
   iis grouping_var
   tis time_var
2. xtset grouping_var time_var
3. tsset grouping_var time_var
(they are all equivalent)
Once the data are declared as a panel: compare xtsum with sum
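A minimal sketch of the declaration step and the xtsum vs sum comparison. The group variable id and the variables tradevolume, gdpsum and distance are taken from the output shown in these slides; the time variable name year is an assumption.

```stata
* Declare the panel: id is the group variable from the output,
* year is an assumed name for the time variable.
xtset id year

* Describe the panel pattern (number of groups, periods, gaps).
xtdescribe

* sum reports one overall mean and standard deviation per variable ...
summarize tradevolume gdpsum distance

* ... while xtsum splits the standard deviation into
* between-group and within-group components.
xtsum tradevolume gdpsum distance
```

A variable whose within standard deviation is zero does not change over time inside a group, which matters later when choosing between fixed and random effects.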
Panel regression
Do not forget the menus in Stata
To find out how to run panel regressions in Stata: Statistics => Longitudinal/panel data
– Many options already covered: xtset, sum, des, tab (check them out)
– Also: linear models
Simplest code: xtreg trade pop gdp dist
Panel results
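Random-effects GLS regression                   Number of obs      =      1074
Group variable: id                              Number of groups   =        91

R-sq:  within  = 0.4879                         Obs per group: min =         6
       between = 0.6091                                        avg =      11.8
       overall = 0.5995                                        max =        12

Random effects u_i ~ Gaussian                   Wald chi2(3)       =   1070.28
corr(u_i, X)       = 0 (assumed)                Prob > chi2        =    0.0000

------------------------------------------------------------------------------
 tradevolume |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      gdpsum |   .0187795   .0006722    27.94   0.000      .017462    .0200969
population~m |  -.0098166   .0375135    -0.26   0.794    -.0833418    .0637085
    distance |  -.0068902   .0017132    -4.02   0.000     -.010248   -.0035324
       _cons |   4.429218    3.53079     1.25   0.210    -2.491003    11.34944
-------------+----------------------------------------------------------------
     sigma_u |  10.536556
     sigma_e |  3.3908988
         rho |  .90615037   (fraction of variance due to u_i)
------------------------------------------------------------------------------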
How do we know if it makes sense?
Is it different from the pooled estimator?
What if we add country effects to the pooled estimation? Let's try
areg trade pop gdp dist, absorb(grouping_var)
Some things we know from the literature and some from experience
– Linear or in logs? Maybe also non-linear terms and interactions, trade or export share, etc.
– Should we use fixed or random effects?
– Are we interested in differences across time or across countries? The between and within R2 tell different stories, no?
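One common way to approach the fixed-or-random question is a Hausman test. The sketch below assumes the panel is already declared and uses only variable names visible in the output above; the population variable is left out because its full name is abbreviated in the slides.

```stata
* Assumes the panel is already declared, e.g. xtset id year
* (year is an assumed name for the time variable).

* Fixed effects: uses only within-group variation.
* Any regressor that does not vary over time within a group
* (distance, if it is constant per group) is omitted by Stata.
xtreg tradevolume gdpsum distance, fe
estimates store fe

* Random effects: relies on the assumption corr(u_i, X) = 0.
xtreg tradevolume gdpsum distance, re
estimates store re

* Hausman test: a small p-value speaks against random effects.
hausman fe re
```

The fixed-effects slope estimates should match those from areg with absorb() on the same group variable; the two commands differ in reported standard errors and R2.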
What do our models say?
xttest0
        tradevolume[id,t] = Xb + u[id] + e[id,t]

        Estimated results:
                         |       Var     sd = sqrt(Var)
                ---------+-----------------------------
               tradevo~e |   303.8232       17.43052
                       e |   11.49819       3.390899
                       u |    111.019       10.53656

        Test:   Var(u) = 0
                             chi2(1) =  4793.89
                         Prob > chi2 =   0.0000
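xttest0 runs the Breusch-Pagan Lagrangian multiplier test for random effects. As a quick consistency check, the variance components it reports reproduce the rho shown in the random-effects output:

```latex
\rho = \frac{\sigma_u^2}{\sigma_u^2 + \sigma_e^2}
     = \frac{111.019}{111.019 + 11.49819}
     \approx 0.906
```

Roughly 91% of the unexplained variance is attributed to the group effect u, and the test rejects Var(u) = 0, so a pooled regression that ignores group effects is hard to defend here.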
Huge problem: endogeneity
What comes first:
– do rich countries trade more, or are they rich because they trade more?
– how to get around this problem?
What is it that we want?
– Cross-country differences?
– Evolution over time within one country?
– To test theory?
What do you find in the do-file?
1. Declare the panel, run the simplest models, draw graphs, etc.
2. Run diagnostics
3. Learn more