PANEL DATA - Uniwersytet Warszawski

PANEL DATA
Development
Workshop
What are we going to do today?
1. Panels – introduction and data properties
2. How to measure distance
3. What comes first: trade or GDP?
4. What else affects trade?
5. Role of currency?
Why panel data?

What is the point of panel data?
– pooled data in econometrics
– panels in econometrics
– long or wide?
– fixed or random effects?
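The "long or wide?" question can be illustrated with a minimal sketch. The toy data and the names pair, trade2000 and trade2001 are made up for illustration and are not from the workshop dataset.

```stata
* Toy data in wide form: one row per country pair, one trade column per year.
* All names and numbers here are hypothetical.
clear
input pair trade2000 trade2001
1 10 12
2 20 19
end

* wide -> long: one row per pair-year, the layout xtset and xtreg expect
reshape long trade, i(pair) j(year)
list, sepby(pair)

* long -> wide: back to one row per pair
reshape wide trade, i(pair) j(year)
list
```

Panel commands in STATA work on the long layout, so data delivered in wide form usually need the first reshape before anything else.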
Gravity model

All that theory is cool, but transport costs matter and market size matters: => push and pull
– Isard (1954); logs by Tinbergen (1962) [what if there were no barriers? "missing trade"]; Linnemann (1966) [standard macro approach]
– Anderson (1979) [first theoretical model, expenditure-based]
– Helpman-Krugman (1985) [intra-industry trade]
– Bergstrand (1985) [general equilibrium, one country/one factor]
– Bergstrand (1989) [H-O model with the Linder hypothesis]
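For orientation, the textbook form of the gravity equation can be sketched as follows. The symbols are assumptions chosen for this note, not notation taken from the slides: T_ij is bilateral trade between countries i and j, Y is GDP, D_ij is distance, and A is a constant.

```latex
% Gravity equation in levels (textbook form; symbols are illustrative)
T_{ij} = A \, \frac{Y_i^{\beta_1} \, Y_j^{\beta_2}}{D_{ij}^{\beta_3}}

% Taking logs gives a specification that is linear in the parameters
\ln T_{ij} = \ln A + \beta_1 \ln Y_i + \beta_2 \ln Y_j - \beta_3 \ln D_{ij} + \varepsilon_{ij}
```

Market size pulls trade up through the GDP terms, while transport costs, proxied by distance, push it down.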
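Simplest model

Variables:
– Explained: bilateral trade
– Explanatory: GDP, populations, distance

reg trade gdp pop dist

      Source |       SS       df       MS              Number of obs =    1074
-------------+------------------------------           F(  3,  1070) =  543.02
       Model |  196764.006     3  65588.0021           Prob > F      =  0.0000
    Residual |  129238.275  1070  120.783434           R-squared     =  0.6036
-------------+------------------------------           Adj R-squared =  0.6025
       Total |  326002.281  1073  303.823188           Root MSE      =   10.99

------------------------------------------------------------------------------
 tradevolume |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      gdpsum |   .0141613   .0011921    11.88   0.000     .0118221    .0165004
population~m |   .0528096   .0228549     2.31   0.021     .0079642     .097655
    distance |  -.0073704   .0005152   -14.31   0.000    -.0083813   -.0063594
       _cons |   5.762674   1.067794     5.40   0.000     3.667467    7.857882
------------------------------------------------------------------------------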
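As a reading aid, the summary statistics in the output can be recomputed from the ANOVA block and the coefficient table. This is only worked arithmetic on the numbers reported above.

```latex
% R-squared: share of total variation explained by the model
R^2 = \frac{SS_{\text{Model}}}{SS_{\text{Total}}} = \frac{196764.006}{326002.281} \approx 0.6036

% F statistic: ratio of the mean squares
F(3, 1070) = \frac{MS_{\text{Model}}}{MS_{\text{Residual}}} = \frac{65588.0021}{120.783434} \approx 543.02

% Root MSE: square root of the residual mean square
\text{Root MSE} = \sqrt{120.783434} \approx 10.99

% t statistic for distance: coefficient divided by its standard error
t_{\text{distance}} = \frac{-0.0073704}{0.0005152} \approx -14.31
```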
Panel data

Same data, same question, but the sample consists of groups observed over time
STATA learns that by
1. Set of commands:
   iis grouping_var
   tis time_var
2. xtset grouping_var time_var
3. tsset grouping_var time_var
(they are all equivalent)

Once the data are declared as a panel: compare xtsum with sum
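The difference between sum and xtsum is easiest to see on a small example. The sketch below simulates a toy panel; the variable names id, year and gdpsum and all numbers are assumptions made for illustration, not the workshop data.

```stata
* Toy panel: 5 groups observed for 4 years (hypothetical names and numbers)
clear
set seed 12345
set obs 5
gen id = _n
expand 4
bysort id: gen year = 2000 + _n
gen gdpsum = 100 + 10*id + 5*rnormal()

* Declare the panel structure: group variable first, then time variable
xtset id year

* sum reports a single overall standard deviation
sum gdpsum

* xtsum splits the variation into between-group and within-group parts
xtsum gdpsum
```

If most of the variation is between groups, as it is by construction here, estimators that use only within-group variation have little to work with.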
Panel regression

Do not forget the menus in STATA
To find out how to do panel regressions in STATA: Statistics => Longitudinal/panel data
– Many options already covered: xtset, sum, des, tab (check them out)
– Also: linear models
Simplest code: xtreg trade pop gdp dist
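A minimal sketch of the main xtreg variants, run on simulated data. The data-generating numbers and the names id, year, gdp, dist and trade are assumptions for illustration only. Without an option, xtreg fits the random-effects model, which is why the output on the next slide is labelled "Random-effects GLS regression".

```stata
* Simulated panel: 30 pairs, 8 years (hypothetical names and numbers)
clear
set seed 2024
set obs 30
gen id = _n
gen dist = 500 + 100*runiform()    // time-invariant, like distance
gen u = rnormal()                  // unobserved pair effect
expand 8
bysort id: gen year = 2000 + _n
gen gdp = 50 + 2*(year - 2000) + rnormal()
gen trade = 5 + 0.02*gdp - 0.007*dist + 3*u + rnormal()
xtset id year

* Random effects (the default when no option is given)
xtreg trade gdp dist, re

* Fixed effects: dist is omitted because it does not vary within a pair
xtreg trade gdp dist, fe

* Between estimator: regression on group means
xtreg trade gdp dist, be
```

The fixed-effects run shows a practical limitation for gravity models: time-invariant regressors such as distance cannot be estimated once pair effects are swept out.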
Panel results

Random-effects GLS regression                   Number of obs      =      1074
Group variable: id                              Number of groups   =        91

R-sq:  within  = 0.4879                         Obs per group: min =         6
       between = 0.6091                                        avg =      11.8
       overall = 0.5995                                        max =        12

Random effects u_i ~ Gaussian                   Wald chi2(3)       =   1070.28
corr(u_i, X)       = 0 (assumed)                Prob > chi2        =    0.0000

------------------------------------------------------------------------------
 tradevolume |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      gdpsum |   .0187795   .0006722    27.94   0.000      .017462    .0200969
population~m |  -.0098166   .0375135    -0.26   0.794    -.0833418    .0637085
    distance |  -.0068902   .0017132    -4.02   0.000     -.010248   -.0035324
       _cons |   4.429218    3.53079     1.25   0.210    -2.491003    11.34944
-------------+----------------------------------------------------------------
     sigma_u |  10.536556
     sigma_e |  3.3908988
         rho |  .90615037   (fraction of variance due to u_i)
------------------------------------------------------------------------------
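The reported rho follows directly from sigma_u and sigma_e. The arithmetic below uses only the numbers in the output above.

```latex
% Share of the composite error variance due to the group effect u_i
\rho = \frac{\sigma_u^2}{\sigma_u^2 + \sigma_e^2}
     = \frac{10.536556^2}{10.536556^2 + 3.3908988^2}
     \approx \frac{111.019}{111.019 + 11.498}
     \approx 0.906
```

About 91% of the unexplained variation is attributed to persistent differences between groups rather than to year-to-year noise.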
How do we know if it makes sense?

– Different from pooled estimator?
– What if we add country effects to the pooled estimation? Let's try
  areg trade pop gdp dist, absorb(grouping_var)
– Some things we know from the literature and some from experience
  – Linear or in logs? Maybe also non-linear terms and interactions, trade or export share, etc.
  – Should we do fixed or random effects?
  – Are we interested in differences across time or across countries? Between and within R-squared tell a different story, no?
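One common way to approach the fixed-or-random question is a Hausman test. The slides do not name it, so treat this as a suggested sketch rather than the workshop's method. The simulated data, the names fe_est and re_est, and all numbers are assumptions. The group effect u is built to be correlated with gdp, which is the situation where random effects is inconsistent.

```stata
* Simulated panel where the group effect is correlated with the regressor
* (hypothetical names and numbers)
clear
set seed 2024
set obs 40
gen id = _n
gen u = rnormal()
expand 10
bysort id: gen year = 2000 + _n
gen gdp = 50 + 5*u + rnormal()
gen trade = 5 + 0.5*gdp + 3*u + rnormal()
xtset id year

* Pooled OLS with group dummies absorbed: same slope as xtreg, fe
areg trade gdp, absorb(id)

* Fixed effects, stored for the test
xtreg trade gdp, fe
estimates store fe_est

* Random effects, stored for the test
xtreg trade gdp, re
estimates store re_est

* Hausman test: a small p-value speaks against random effects
hausman fe_est re_est
```

The slope from areg and from xtreg with the fe option should coincide; the two commands differ mainly in how they report the constant and the R-squared.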
What do our models say?

xttest0

        tradevolume[id,t] = Xb + u[id] + e[id,t]

        Estimated results:
                         |       Var     sd = sqrt(Var)
                ---------+-----------------------------
               tradevo~e |   303.8232       17.43052
                       e |   11.49819       3.390899
                       u |    111.019       10.53656

        Test:   Var(u) = 0
                              chi2(1) =  4793.89
                          Prob > chi2 =   0.0000
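xttest0 is the Breusch-Pagan Lagrange multiplier test for random effects, and it has to be run right after a random-effects fit. A minimal sketch on simulated data follows; all names and numbers are assumptions for illustration.

```stata
* Simulated panel with a sizeable group effect (hypothetical names and numbers)
clear
set seed 77
set obs 30
gen id = _n
gen u = 3*rnormal()
expand 8
bysort id: gen year = 2000 + _n
gen gdp = 50 + rnormal()
gen trade = 5 + 0.5*gdp + u + rnormal()
xtset id year

* The test uses the results of the preceding random-effects regression
xtreg trade gdp, re

* H0: Var(u) = 0, meaning pooled OLS would be adequate
xttest0
```

In the workshop output the test statistic is 4793.89 with a p-value of 0.0000, so Var(u) = 0 is rejected and the pooled model is not adequate.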
Huge problem - endogeneity

What comes first:
– do rich countries trade more, or are they rich because they trade more?
– how to get around this problem?
What is it that we want?
– Cross-country differences?
– Evolution over time within one country?
– Test theory?
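The slides leave the workaround open. One simple and imperfect option, offered here as an assumption rather than as the workshop's answer, is to replace current GDP with its lag, so that this year's trade cannot feed back into the regressor. It does not remove endogeneity if shocks are persistent. The simulated data and all names are hypothetical.

```stata
* Simulated panel (hypothetical names and numbers)
clear
set seed 99
set obs 30
gen id = _n
gen u = rnormal()
expand 10
bysort id: gen year = 2000 + _n
gen gdp = 50 + 2*(year - 2000) + u + rnormal()
gen trade = 5 + 0.5*gdp + 3*u + rnormal()

* Time-series operators such as L. need the panel to be declared first
xtset id year

* Fixed effects with lagged GDP; the first year of each group is lost
xtreg trade L.gdp, fe
```

Instrumental-variable estimators are the more thorough route, but they require a credible instrument, which is a modelling decision rather than a coding one.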
What do you find in the do-file?
1. Declare panel, run simplest models, do graphs, etc.
2. Run diagnostics
3. Learn more