Matchless Statistics - American Statistical Association


Combining Administrative Records
and Business Registers to Estimate
Quarterly Employment in Nonprofit
Organizations in the USA
Martin H. David
Professor, Emeritus, Univ. of Wisconsin – Madison
Associate Scholar, Urban Institute
ICES3, 21 June 2007
Views expressed here do not reflect policies and estimates of the Department of Labor or
the Bureau of Labor Statistics.
1
Acknowledgements
• Bureau of Labor Statistics
– provided access to the QCEW via its research
enclave.
– Rick Clayton, David Talan, Amy Knaup, and Merissa
Piazza advised and commented on earlier reports.
• Center For Nonprofits, Urban Institute
– provided its IRS databases and funding.
– Tom Pollak and Linda Lampkin provided technical
information about IRS information returns.
– Jen Auer and Kendall Golladay assisted and
provided useful checking on inconsistencies in the
IRS-QCEW match.
2
“Public Sector Failures” (Niskanen)
Employment in charitable organizations
What failures?
• Estimates
– badly understated
– not timely
– not published
• Flawed information returns
• Censor employment
Which agencies?
Statistical agencies
• Census
• IRS/SOI
• BLS
IRS/TEGO
OMB/OIRA
4
Defining private nonprofit organizations
• Constitutionally exempt
– Religious congregations
• Exempt under income tax law §501(c)
– Charitable Organizations (501(c)(3))
–e.g. Education, Hospitals, Social
Services, Research, Arts, Advocacy
– Other exempt (not 501(c)(3))
–Governments
–Membership orgs & Assoc.
–Credit Unions, Coops, Mutual Ins. etc.
6
Why study Nonprofits?
• SIZE: Nonprofit employment 8-13 million 2003q1
• GROWTH: larger than private sector (Salamon 2005)
• NEED: Timely national and area data used to
understand availability of charitable services
• Example: disaster response to Katrina
• How many employees could respond?
• INCENTIVES: For-profit / nonprofit entities differ.
– Performance of nonprofits
• a public issue because of subsidy from tax system.
– Productivity of nonprofit workers
• needed to understand outcomes of nonprofit activity
• Commingling nonprofit with for-profit obfuscates
understanding of both sectors.
7
Published employment for nonprofits
• Economic Census
– Limited coverage of NAICS sectors
• 61 Education
– Excludes many educational institutions
• 62 Health & Social services
• 71 Arts, sports & entertainment
• 813 Religious, grantmaking, civic, prof. orgs.
– 5-year estimates,
• publication in 2005 for 2002 data
– Not classified into 501(c)(3) and others
– Small enterprises imputed from IRS data
• employment data censored
8
Table 1. Nonprofit employment reported in Economic Census 2002
All estimates in 1,000’s

                   Employment                      Establishments
NAICS              A        C        D             E        G        H
classification     Exempt   All      Ratio: A/C    Exempt   All      Ratio: E/G
61                    120      431   0.28              12       50   0.24
62                  7,980   15,048   0.53             136      703   0.19
71                  1,363    1,847   0.74              49      109   0.45
813                   936      936   1.00              11       11   1.00
Total exc. 813      9,463   17,326                    197      862
Total inc. 813     10,400   18,262                    208      874
9
Unpublished employment for nonprofits
Elicited on IRS Form 990
• Strengths
• filed annually by larger nonprofits
• comprehensive data
• Classifications
– Exemption: differentiate 501(c)(3), others
– Industry classified by NTEE
• Weaknesses
– Incomplete
• No employment for smaller organizations
• 20% non-response for large organizations
– Unavailable
– Not timely (due to fiscal years, filing calendar)
10
Employment reports on Form 941
• Limitations
– No indicator for nonprofits
– Applies to employers with more than $2500 of
liability for payroll and income tax withholding
• Value
– captures most employers that withhold
income taxes or FICA
– Quarterly report
• Q1 reference period identical to Form 990
• captures employment missing from Form 990
– 20% of 990 filers fail to report employment
– late filers provide no timely information
11
Proposed estimates from matched data
Exempt identified by:
• IRS Form 990/990-EZ
and
• IRS Registry of exempt entities
– includes defunct orgs.
Employment from:
• IRS/Form 941
– quarterly since 2005
– larger payroll employers
• Form W-2 (low payroll employers)
13
Can we match nonprofits to Forms 941?
• Feasible procedure established for match to
BLS/QCEW
– A similar procedure applies to Form 941
– Match entails two merges
• A. Link Form 990/990-EZ to Form 941 by EIN
• B. For EIN’s that are identified as exempt
(Registry) and have no Form 990
– Link Registry to Form 941 by EIN
• Matching errors require
– deleting mismatched EIN’s (3.5% in QCEW)
– weighting for matches that fail because of
error in recording EIN
(1.2% in QCEW)
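The two merges can be sketched as follows. This is a minimal illustration, not the procedure actually run in the research enclave: the function name `link_by_ein`, the EINs, and the employment counts are all hypothetical, and real files would also need the mismatch deletion and weighting steps listed above.

```python
def link_by_ein(form990_eins, registry_eins, payroll_by_ein):
    """Two-stage link of exempt organizations to payroll records by EIN.

    Stage A: EINs that filed Form 990/990-EZ.
    Stage B: EINs identified as exempt in the Registry with no Form 990.
    """
    stage_a = {ein: payroll_by_ein[ein]
               for ein in form990_eins if ein in payroll_by_ein}
    # Only Registry EINs that did not file a Form 990 go to the second merge.
    registry_only = set(registry_eins) - set(form990_eins)
    stage_b = {ein: payroll_by_ein[ein]
               for ein in registry_only if ein in payroll_by_ein}
    return stage_a, stage_b


# Toy data: all EINs and employment counts are made up.
form990 = {"11-1111111", "22-2222222"}
registry = {"11-1111111", "22-2222222", "33-3333333", "44-4444444"}
payroll = {"11-1111111": 40, "33-3333333": 12, "99-9999999": 500}

stage_a, stage_b = link_by_ein(form990, registry, payroll)
print(stage_a)  # Form 990 filers found in the payroll file
print(stage_b)  # Registry-only entities found in the payroll file
```

Keeping the two stages separate preserves the source of each link, which matters later because Registry-only links carry no Form 990 data to check against.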
14
Outline
Focus is on operating charities, 501(c)(3)
1 Matching records to improve estimates
2 Details of QCEW matching process
3 “Enhanced” employment estimates
4 Implications, conclusions
15
Estimates from matched data
Exempt identified by:
• IRS Form 990/990-EZ
and
• IRS Registry of exempt entities
– includes defunct orgs.
Employment from:
• BLS/QCEW
– UC liable employers
– filed quarterly
16
Gain/Loss from matching
• QCEW-Form 990 matched by EIN
– increases measured nonprofit employment
• allows imputation for nonreporting Form 990 filers
• some reported 990 employees not eligible for UC
– classified by both NAICS and NTEE
– measures extent of multi-establishment
employers
• Unmatched Form 990 include
– Employers not liable for UC
– Employers who are censored in QCEW
– Employees who are truncated
– Nonemployers
17
Nonprofit employment & organizations by class of exemption, 2003 (000’s)

IRC subsection 501(c)( )   Employment (wtd.)   Organizations
All                        13,300              343
(c)(3) plus*               11,700              277
Remainder                   1,600               66
* Includes some organizations with subsection NA
18
Coverage of 501(c)(3) match
• For US, est. 43% match rate (98% employers)
• For US, est. 34% no wage and no employees
– Match covers 57% of all filing employers
• For all matches,
– NTEE, NAICS industry classifications
– Allocation of employment to worksites
– Comparable legal names
21
Matching errors (ME)
• Can ME be ignored?
• Only if no selectivity in observed matches
• Only if no interest in multiple variable regression.
• Match failures (false negatives) lead to
incomplete coverage and bias.
– Example: Distribution of employment by size
is flawed in enterprise level statistics
• 2 smaller employers are tallied when an interstate
employer cannot be matched across states (Okolie 2004).
– Problem exists even though aggregate
employment is correct.
• Regression inconsistent (Scheuren-Winkler
1997)
23
Outline
1 Matching records to improve estimates
2 Details of QCEW matching
3 “Enhanced” employment estimates
4 Implications, conclusions
24
IRS public information on nonprofits
(to be matched to QCEW)
• Form 990/990-EZ
• Coverage: Most entities with revenues > $25,000
• NCCS Census available from filings for 1999-2003
– Over 95% match the Registry (by EIN)
• Extract of Registry
• Coverage: All entities operating under IRS approval
• Used to match QCEW in this study, when no Form
990 is available (<10% of 501c3)
• Registry a “gold standard” for EIN
• Fewer than 4 in 10,000 Registry EINs are invalid.
25
Nonprofit liability for UC
(induces QCEW records)
• 1/3 nonprofits employ no one.
• Exclusions from UC
– 30 states exclude employers of 1-3
employees.
• Not available for matching
– Many employees are excluded:
• Part-time workers (35 states)
• Students and interns (most states).
QCEW employment understates actual
employment in nonprofit organizations
26
Early warnings of match errors
• 2% QCEW records lack ein’s in 2000
• Substantial numbers of QCEW match to IRS
Registry and not to IRS information returns
(990/990-EZ)
• Some matches link tiny organizations to
behemoths whose payroll is far larger than
organization expenses.
28
Removing False Positive Matches
• Test relationship between organization expenses
(Form 990) and payroll (QCEW)
• Reject match where 0.2 expense < payroll
• Delinked 3% of matched organizations
• Registry matches: Scan Legal name and
industry class
• Reject where NAICS sector = 52, 8x
• Reclassify where exempt entity is linked to
business association, labor union
• Delinked 5% of Registry matches
• 3.3% organizations delinked
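The expense-payroll test can be written as a one-line screen. The sketch below applies the inequality exactly as stated on the slide (reject where 0.2 × expenses < payroll); the function name, the field names, and the dollar amounts are hypothetical.

```python
def is_false_positive(expenses, payroll, ratio=0.2):
    """Flag a link when ratio * Form 990 expenses is less than QCEW payroll.

    This is the rejection rule as stated on the slide; ratio=0.2 is the
    slide's threshold.
    """
    return ratio * expenses < payroll


# Toy links: amounts are made up for illustration.
links = [
    {"ein": "A", "expenses": 1_000_000, "payroll": 150_000},
    {"ein": "B", "expenses": 50_000, "payroll": 4_000_000},  # tiny org, huge payroll
]

kept = [l for l in links if not is_false_positive(l["expenses"], l["payroll"])]
delinked = [l for l in links if is_false_positive(l["expenses"], l["payroll"])]
print([l["ein"] for l in kept], [l["ein"] for l in delinked])
```

The screen only works for links that carry Form 990 expenses, which is why Registry matches are checked by legal name and industry class instead.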
29
Weighting to offset false negative matches
• A few QCEW EIN’s are invalid
• 1.9% in 2000q1, 1.4% in 2003q1
• States range from 0.4% to 5.8% in 2001q1
• Rates decline with increasing number of
employees
• Fit probit to level of employees by state, s
• Calculated nonresponse weight, controlled to
state totals by year
30
Outline
1 Matching records to improve estimates
2 Details of QCEW matching process
3 “Enhanced” employment estimates
4 Implications, conclusions
31
T2 QCEW matches to Forms 990, 501(c)(3) organizations, 2003q1*

                                Included states      Excluded
                                Matched?             states      Total
                                Yes        No
A  Form 990-EZ                     4        42           7          53
   proportion of Forms 990      0.08      0.79        0.13        1.00
B  Form 990                       79        95          27         201
   proportion of Forms 990      0.39      0.47        0.13        1.00
C  Subtotal NCCS Census           83       137          34         254
   proportion of Forms 990      0.33      0.54        0.13        1.00
D  Registry-QCEW matches          23        --          --          23
E  TOTAL                         106       137          34         277
   proportion of total          0.38      0.50        0.12        1.00
F  Estimated matches**            13        21           0
G  UNIVERSE                      119       158           0         277
   proportion of universe       0.43      0.57                    1.00
32
Circles show where QCEW finds employment not reported on Form 990.
T3. QCEW and Form 990 Employment: 2003, 501(c)(3), in 000's

                                    QCEW emp.          IRS employment
Source            Match?   Orgs.    Raw      Wtd.      Raw      Imputed   Augmented
990, inc.         Yes         79    6,780    6,930     6,831    7,463      7,463
990, inc.         No          95                       1,005    1,005      1,005
990-EZ, inc.      Yes          4       10       10                 10         10
990-EZ, inc.      No          42
Subtotal                     220    6,790    6,940     7,836    8,478      8,478
Registry          Yes         23    1,724    1,777     NA                  1,777
Available states             243    8,514    8,717                        10,255
990, exc.                     27    NA       NA        1,485    1,485      1,485
990-EZ, exc.                   7                       NA       NA         NA
Total                        277                       9,321    9,963     11,740
33
Circles show where Form 990 reveals employment not in QCEW (lower bound estimates).
T3. QCEW and Form 990 Employment: 2003, 501(c)(3), in 000's
34
Nonprofits by class of exemption, 2003 (000’s)

IRC subsection 501(c)( )   Employment (wtd.)   Organizations
All                        13,300              343
(c)(3) plus                11,700              277
Remainder                   1,600               66
35
Outline
1 Matching records to improve estimates
2 Details of QCEW matching process
3 “Enhanced” employment estimates
4 Implications, conclusions
36
What have we learned?
• Nonprofit employment is large
• 13.3m in all
• 11.7m in 501(c)(3)
• Prior estimates need to be refined
– Matching errors can not be ignored
– Form 990 estimates need imputation
– Registry matches improve coverage
• Part of nonprofit employment is not measured
anywhere
37
Where do we go from here? IRS/TEGO
• IRS/TEGO can and should
– resolve false positive matches
– require wage-paying organizations to report
employees
– revise Form 990-EZ to include counts of
employees
38
Where do we go from here? IRS/SOI
A. Publish nonprofit employment numbers.
– IRS measurement is more complete than BLS
• 1.0+ million employees not in QCEW
– Imputing employment for nonreporting organizations is possible
• More than 600,000 employees can be found
B. Estimate nonprofit employment with Form 941
match for best coverage.
– captures some Form 990-EZ employment
– imputation with low MSE.
C. Employment on match to W-2 needs study.
39
Where do we go from here? BLS
• Can make publication of nonprofit employment
part of the OEUS program
– This would be efficient
• past estimates match selected states at haphazard
intervals using different methods.
– Yearly estimates, with a 9-month lag
• indicator that can be benchmarked to more
universal coverage in IRS Form 990 & 941
• Should support continuing research on the IRS-QCEW match to produce BED estimates
40
Comments?
Contact: [email protected]
42
Matches of QCEW that use EIN
• Nonprofit employment: matched to IRS Registry
• Salamon 2005, Gronbjerg 2005, Michigan
nonprofit research 2004
• Enterprise statistics on employment dynamics
• Okolie 2004
• Quality of business registers: match to Census
records.
• Spletzer and Elvery 2005, 2006
• Match errors a problem in all of these studies
43
Matching errors (continued)
• Cross-linked records (False positive
matches) vitiate classical regression
estimators.
• Scheuren and Winkler 1997
• Absurd relationships are suggested by
inclusion of false positives in linked data.
• 2 donor files: A. correct ID; B. false ID
– Match produces links to 2 distinct
records in donee file – Error!
44
IRS regulates nonprofit sector
• Exemption from corporate income tax
– automatic for religious congregations
– otherwise, approved by IRS under IRC 501(c)
– contributions to c(3) tax deductible
– c(4) membership organizations
– other subsections include many entities
• Rules
– Governance; no distribution of excess revenue
– Annual information reports to public required.
• IRS information about exempt entities disclosable!
45
Findings: Probability of invalid ein
• State-by-state
– Dominant pattern
• probability decreases as employees
increase to 10, then nearly constant
• negative time trend
– A few states
• probability not monotonic decreasing with
less than 11 employees
–May have positive trend
46
Weighting for failed matches
• Use probability of invalid EIN, $\theta_{is}$, to
calculate weights for matched data
– Sum $\theta_{is}$ by state, giving $\theta_{\cdot s}$.
• $k_s = (\text{actual sum of invalid EINs}) / \theta_{\cdot s}$
– $w_{is} = 1 / (1 - \theta_{is} k_s)$, assuming $\eta \approx 0$
• Calculate weighted sum, by state
– Establishments
– Employment
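The weight calculation above can be sketched as follows. This is an illustration under assumptions: the record layout, the function name `match_weights`, and the toy probabilities (deliberately exaggerated so the effect is visible) are hypothetical, and the fitted probabilities would come from the state-level probit described earlier.

```python
from collections import defaultdict


def match_weights(records, actual_invalid_by_state):
    """Return one weight per record: w = 1 / (1 - theta * k_s).

    theta is the fitted probability of an invalid EIN for the record;
    k_s scales the fitted probabilities so that they sum to the actual
    number of invalid EINs in state s.
    """
    theta_sum = defaultdict(float)
    for r in records:
        theta_sum[r["state"]] += r["theta"]
    k = {s: actual_invalid_by_state[s] / total for s, total in theta_sum.items()}
    return [1.0 / (1.0 - r["theta"] * k[r["state"]]) for r in records]


# Toy records for one state; all values are made up.
records = [
    {"state": "WI", "theta": 0.2, "employment": 3},
    {"state": "WI", "theta": 0.1, "employment": 20},
    {"state": "WI", "theta": 0.1, "employment": 80},
    {"state": "WI", "theta": 0.1, "employment": 400},
]
weights = match_weights(records, {"WI": 1})

# Weighted sums by state: establishments and employment.
weighted_establishments = sum(weights)
weighted_employment = sum(w * r["employment"] for w, r in zip(weights, records))
print(weights, weighted_establishments, weighted_employment)
```

Because the probability of an invalid EIN falls with employer size, small employers receive the largest weights, so weighting moves establishment counts more than employment totals.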
47
Borrowing strength from QCEW matches
• Previous slide demonstrated how we can
extrapolate to excluded states using match
rates in included states.
• Now we investigate consistency of QCEW
and Form 990 reports.
– Consistency provides a basis for
estimating the universe of employers.
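A worked version of the extrapolation, using the organization counts (in 000's) from table T2. The assumption here, which is not stated on the slides, is that the match rate among NCCS Census filers in included states is applied to filers in excluded states; under that reading the arithmetic reproduces rows F and G of T2.

```python
# Counts in thousands of organizations, taken from table T2.
matched_included = 83       # row C, included states, matched
unmatched_included = 137    # row C, included states, not matched
excluded = 34               # row C, excluded states
total_matched_included = 106    # row E, matched (includes Registry matches)
total_unmatched_included = 137  # row E, not matched

# Assumed rule: apply the included-state match rate to excluded states.
match_rate = matched_included / (matched_included + unmatched_included)
est_matched = round(excluded * match_rate)
est_unmatched = excluded - est_matched

universe_matched = total_matched_included + est_matched
universe_unmatched = total_unmatched_included + est_unmatched
universe = universe_matched + universe_unmatched
print(est_matched, est_unmatched, universe_matched, universe_unmatched)
print(round(universe_matched / universe, 2))
```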
48
Consistency of employment, wage reports
• Internal to Form 990
– Compensation > 0 ↔ employees > 0
• FN no employees, positive wages
• FP employees, no wages
• Comparing Form 990 to QCEW
– Employees on March 12 should agree
• FN QCEW employees, Form 990 none
• FP QCEW none, Form 990 employees
• Consistency allows us to estimate proportion of
organizations with no employees (not in
universe of study)
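Both consistency checks follow the same pattern: compare whether Form 990 reports employees with a second piece of evidence (compensation on the same form, or QCEW employment). A minimal sketch, with a hypothetical function name and made-up inputs:

```python
def classify(form990_employees, evidence):
    """Classify a Form 990 employee report against a second indicator.

    evidence is Form 990 compensation (internal check) or QCEW
    employment (external check). Labels follow the slide:
      FN: no employees reported, but evidence of employment
      FP: employees reported, but no supporting evidence
    """
    reported = form990_employees > 0
    supported = evidence > 0
    if reported and supported:
        return "TP"
    if not reported and not supported:
        return "TN"
    return "FP" if reported else "FN"


# Internal check: employees vs. compensation on Form 990 (toy values).
print(classify(0, 250_000))   # wages paid, no employees reported
# External check: Form 990 employees vs. QCEW employment (toy values).
print(classify(12, 0))        # Form 990 employees, none in QCEW
```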
49
Consistency of compensation &
employees, Form 990
Consistency of employee reporting: Form 990 - QCEW
FN
FN
19
FP
TN
TN
TP
1
1
1
*
2
20
2
2
Total
20
*
TP
Total
FP
1
1
76
77
77
100
Note: 95% of Form 990 FP are QCEW TP
Only 2/5 of Form 990 TN are QCEW TN.
* less than 0.5%
50
Fig. 4 Concentration of employment in NAICS sectors
[Figure: axes NTEE 15 and NAICS sectors (0 to 25); legend: Low, 90+ % of emp.]
51
Employees, counted by employer
• BLS
– Current employment statistics (CES)
• monthly for 160,000 organizations
• focus: private, nonfarm, non-household
– QCEW
• quarterly “census” of UC employers
–available 9 months after quarter
–analyzed for employment dynamics
–no reliable exempt classifier
52
T5 Employment by NTEE major sectors, 2003 q1, 501c3

                       QCEW      Forms 990
NTEE 15      Orgs      Wtd.      Imputed     Augmented
A              29        188        261         290
B not B4       48      1,158        644       1,381
B4              1        757      1,184       1,271
C,D            10         67         78          83
E not E2       33      1,289      1,439       1,713
E2              3      2,971      3,636       3,876
I               5         48         63          65
J,K,L          21        274        400         429
M               4          7          8           9
N,O            26        142        141         185
P              39      1,272      1,693       1,761
Q               5         21         41          42
R-W            33        282        298         359
X              17        128         70         161
Y               1         13          5          15
Z               2         99          1         100
Total         277      8,717      9,963      11,740
53
Imputation & augmentation, 2003q1, 501(c)(3)
Matched Form 990's

NTEE 15      Imputation rate*      NTEE 15      Augmentation rate**
E2           0.043                 Q             0.026
Q            0.055                 P             0.040
B4           0.086                 I             0.044
A            0.107                 E2            0.066
E not E2     0.117                 C,D           0.067
X            0.118                 J,K,L         0.072
P            0.127                 B4            0.073
C,D          0.132                 A             0.111
J,K,L        0.139                 M             0.126
R-W          0.154                 E not E2      0.190
Z            0.164                 R-W           0.206
B not B4     0.173                 N,O           0.309
N,O          0.177                 B not B4      1.144
I            0.178                 X             1.295
M            0.197                 Y             1.745
Y            0.906                 Z            78.110
All          0.093                 All           0.210
*Observed for Form 990 matched to QCEW.
**Increase over imputed total.
Bold: consistency of rank in 1/3 or 3/3 of NTEE.
Italics: no industry class is assigned to Z.
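The augmentation rate is defined in the footnote as the increase over the imputed total. A short sketch, using the rounded T5 figures (in 000's) as inputs; the function name is hypothetical, and because T5 is rounded to thousands the results can differ slightly from the published rates, especially for small sectors.

```python
def augmentation_rate(imputed, augmented):
    """Increase of the augmented total over the imputed total."""
    return (augmented - imputed) / imputed


# (imputed, augmented) pairs from table T5, in thousands of employees.
t5 = {
    "B not B4": (644, 1381),
    "P": (1693, 1761),
    "E not E2": (1439, 1713),
}

for sector, (imputed, augmented) in t5.items():
    print(sector, round(augmentation_rate(imputed, augmented), 3))
```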
54