Source Data Layer

RESEARCH TRIANGLE PARK, NORTH CAROLINA
Where Are The Farms?
A Synthetic Database of Poultry and Livestock Operations
in Support of Infectious Disease Control Strategies
Presented by Jamie Cajka
ESRI Federal Users Conference, Washington, DC
Feb. 21, 2008
Acknowledgements
• This research is in support of the Models of Infectious Disease Agent Study (MIDAS) project, which is funded by the National Institute of General Medical Sciences (US Department of Health and Human Services).
• This work was also performed by:
  • Mark Bruhn (RTI)
  • Dr. Gary Smith (University of Pennsylvania)
  • Ross Curry (RTI)
  • Seth Dunipace (University of Pennsylvania)

Presentation Overview
• Framing the problem
• Desired output
• Data sources
• Data manipulations
• Creation and attribution process
• Results
• Conclusions
• Future work

Framing the Problem
• Animal-borne disease modelers need to know:
  • Animal operation locations.
  • Proximity to other animal operations.
  • Composition of animal operation.
    – Types of animals.
    – Number of head.
• This is necessary to model the spread and mitigate the effects of outbreaks such as avian influenza and foot-and-mouth disease.
• Avian influenza is a serious human health threat.
• Modelers desire to create and test strategies.

Framing the Problem
• Actual farm locations and animal counts by type are NOT available nationally.
  • Grower privacy concerns
  • National security concerns
• National Animal Identification System (NAIS) will not be the answer.
  • Currently voluntary with about 20% participation.
• RTI created synthetic farm locations that can be used as inputs into animal-borne disease models.
• This presentation will focus on poultry operations, as that is the animal type that is currently complete.

Desired Output
• A geographically referenced set of farms within an area, characterized by:
  • Type of animals.
  • Number of animals.
  • Mix of animals.
• Format could be one of:
  • A spatial data layer such as a shapefile.
  • A text file with x and y coordinates.

Data Sources
Data Layer                                  Source
Slope                                       Derived from National Elevation Dataset (NED)
Land Cover (incl. forests & crop lands)     National Land Cover Database (NLCD 2001)
Wetlands                                    National Wetlands Inventory & NLCD 2001
Federal (public) Lands                      ESRI data disks version 9.2
State & Local Parks                         ESRI data disks version 9.2
National & State Roads                      ESRI Business Analyst street map (TeleAtlas 2006)
Residential Roads                           Estimated from ESRI Business Analyst street map (TeleAtlas 2006)
Water bodies                                National Hydrography Dataset (medium resolution)
Airports & Railroads                        ESRI data disks version 9.2
Poultry Support Businesses                  ESRI Business Analyst
Non-Agriculture Businesses                  ESRI Business Analyst
Municipalities & Urbanized Areas            ESRI data disks version 9.2 & US Census Bureau
Sensitive areas (churches, schools, etc.)   ESRI data disks version 9.2 (including Geographic Names Information System – GNIS names)

Data Sources (cont’d)

Tabular Data
• Census of Agriculture
  • Aggregation and cross-tabulation to create a single record for each county in the U.S.

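As a rough illustration of the aggregation and cross-tabulation step, the following Python sketch pivots several rows per county into one record per county. The field names (`county_fips`, `poultry_type`, `size_class`, `farms`) and all values are hypothetical, invented for this example; they are not actual Census of Agriculture fields or figures, and the original work did not necessarily proceed this way.

```python
from collections import defaultdict

# Hypothetical long-format rows: one row per county / poultry type / size class.
# Field names and counts are invented for illustration only.
rows = [
    {"county_fips": "37001", "poultry_type": "broilers", "size_class": "small", "farms": 4},
    {"county_fips": "37001", "poultry_type": "broilers", "size_class": "large", "farms": 2},
    {"county_fips": "37001", "poultry_type": "layers", "size_class": "small", "farms": 3},
    {"county_fips": "37001", "poultry_type": "broilers", "size_class": "small", "farms": 1},
    {"county_fips": "37003", "poultry_type": "pullets", "size_class": "small", "farms": 1},
]


def one_record_per_county(rows):
    """Cross-tabulate long rows into a single wide record per county."""
    table = defaultdict(lambda: defaultdict(int))
    for r in rows:
        # One output column per (type, size class) combination.
        column = f"{r['poultry_type']}_{r['size_class']}"
        table[r["county_fips"]][column] += r["farms"]
        table[r["county_fips"]]["total_farms"] += r["farms"]
    return {fips: dict(cols) for fips, cols in table.items()}


county_records = one_record_per_county(rows)
```

Each key of `county_records` is a county, and its value holds one column per type and size combination plus a running total, which is the shape a per-county generation loop would need.
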
Rasterization
• All vector data were projected into Albers (NAD 83, meters)
• Buffers were created as needed
• Polygons were attributed for rasterization
• Vector data were rasterized to a 30 meter resolution (to match NLCD)

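The buffer-then-rasterize idea can be sketched in plain Python. This is only a stand-in for the GIS tools used in the project: the function name, the grid origin, and the 45 m buffer are assumptions made for the example, and a cell is marked when its center falls inside the buffer, which is one common rasterization rule among several.

```python
import math

CELL_SIZE = 30.0  # meters, the resolution named above (chosen to match NLCD)


def rasterize_point_buffer(x, y, radius, origin_x, origin_y, n_rows, n_cols):
    """Return a grid with 1 where the cell center lies within `radius` of (x, y).

    Coordinates are assumed to be projected meters; (origin_x, origin_y) is the
    upper-left corner of the grid, so y decreases as the row index increases.
    """
    grid = [[0] * n_cols for _ in range(n_rows)]
    for row in range(n_rows):
        for col in range(n_cols):
            cx = origin_x + (col + 0.5) * CELL_SIZE
            cy = origin_y - (row + 0.5) * CELL_SIZE
            if math.hypot(cx - x, cy - y) <= radius:
                grid[row][col] = 1
    return grid


# Hypothetical feature: a point buffered by 45 m on a 5 x 5 grid of 30 m cells.
mask = rasterize_point_buffer(
    x=75.0, y=-75.0, radius=45.0,
    origin_x=0.0, origin_y=0.0, n_rows=5, n_cols=5,
)
```

Here the point sits at the center of the middle cell, so the buffer covers that cell and its eight neighbors. Working in a meter-based projection such as Albers is what makes the 30 m cell size and the buffer distance directly comparable.
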
Assigning of Probabilities
• Focused on farm building location rather than land parcel location.
• Based on:
  • Research team’s experience
  • Literature review
  • Examination of “truth” data for selected counties
• The idea was to multiply the probabilities together, so that a probability of 0 on any one layer made the cell impossible for farm location.

Combining Raster Surfaces
Land Cover             Slope                  Distance from Roads    Probabilities
0     0.20  1.00       1.00  0.50  0.50       0.20  0.20  0.50       0     0.02  0.25
0.20  0.20  1.00   X   0.50  0     0.20   X   0.20  0.50  1.00   =   0.02  0     0.20
1.00  1.00  1.00       0.50  0.50  1.00       0.50  1.00  1.00       0.25  0.50  1.00

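The cell-by-cell multiplication on this slide can be reproduced with a short Python sketch. The 3 x 3 values are the ones shown on the slide; which of the two unlabeled grids is Slope and which is Distance from Roads is assumed from the order of the labels, and the product is the same either way. The function name is hypothetical.

```python
def combine_surfaces(*surfaces):
    """Multiply probability grids cell by cell.

    Because the layers are multiplied, a 0 in any one layer forces the
    combined cell to 0, which rules that cell out as a farm location.
    """
    n_rows, n_cols = len(surfaces[0]), len(surfaces[0][0])
    combined = [[1.0] * n_cols for _ in range(n_rows)]
    for surface in surfaces:
        for r in range(n_rows):
            for c in range(n_cols):
                combined[r][c] *= surface[r][c]
    return combined


# Example grids from the slide (label assignment of the last two is assumed).
land_cover = [[0.0, 0.2, 1.0], [0.2, 0.2, 1.0], [1.0, 1.0, 1.0]]
slope = [[1.0, 0.5, 0.5], [0.5, 0.0, 0.2], [0.5, 0.5, 1.0]]
road_distance = [[0.2, 0.2, 0.5], [0.2, 0.5, 1.0], [0.5, 1.0, 1.0]]

combined = combine_surfaces(land_cover, slope, road_distance)

# Round away floating-point noise before comparing with the slide.
rounded = [[round(v, 2) for v in row] for row in combined]
```

The rounded result matches the fourth grid on the slide: the upper-left cell is 0 because land cover excludes it, and the center cell is 0 because the second layer excludes it.
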
Combining Raster Surfaces
• Individual probability surfaces were combined on a state-by-state basis

Creation and Attribution
• The production process was a combination of VB and ArcGIS ModelBuilder
• VB
  • GUI front end
  • Opened up a cursor into the Census of Agriculture summary
  • Attribution of type of farm

Creation and Attribution (cont’d)
• ArcGIS ModelBuilder

Results
• RTI generated a synthetic poultry operation shapefile for every county in the United States.
• The number of farms was correct.
• The locations corresponded to the probability surface.
• The size and type were randomized.

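One way to obtain locations that follow a probability surface is weighted random sampling of raster cells. The Python sketch below is an illustration under stated assumptions, not the VB and ModelBuilder process used in the project: the function name, the rule of at most one farm per cell, the grid origin, and the example surface are all hypothetical choices.

```python
import random


def place_farms(prob_surface, n_farms, cell_size=30.0,
                origin_x=0.0, origin_y=0.0, seed=None):
    """Draw n_farms distinct cells, weighted by probability.

    Returns cell-center (x, y) pairs. Assumes at most one farm per cell and
    an upper-left grid origin in projected meters (both are assumptions).
    """
    rng = random.Random(seed)
    cells, weights = [], []
    for r, row in enumerate(prob_surface):
        for c, p in enumerate(row):
            if p > 0:  # zero-probability cells can never receive a farm
                cells.append((r, c))
                weights.append(p)
    if n_farms > len(cells):
        raise ValueError("more farms requested than eligible cells")

    chosen = []
    for _ in range(n_farms):
        # Weighted draw, then remove the cell so it cannot be drawn again.
        i = rng.choices(range(len(cells)), weights=weights, k=1)[0]
        chosen.append(cells.pop(i))
        weights.pop(i)

    return [
        (origin_x + (c + 0.5) * cell_size, origin_y - (r + 0.5) * cell_size)
        for r, c in chosen
    ]


# Hypothetical 3 x 3 surface and farm count for one small area.
surface = [[0.0, 0.02, 0.25], [0.02, 0.0, 0.2], [0.25, 0.5, 1.0]]
farms = place_farms(surface, n_farms=4, seed=42)
```

In this sketch the farm count is an input, as it would be if taken from a county table, so the count is exact by construction while the positions vary with the random seed. The resulting pairs could be written out as a text file of x and y coordinates.
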
Results (cont’d)
• RTI sent synthetic poultry operation locations to researchers at the University of Pennsylvania to compare against the complete set of truth data.
[Figure: two map panels, “Actual Locations” and “Synthetic Locations”]

Conclusions
• Synthetic locations matched up very well to actual locations.
• The data are still being tested in the models to see how sensitive the models are to the various parameters:
  • Inter-farm distance
  • Animal type
  • Number of animals

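Inter-farm distance, the first parameter listed above, can be summarized in several ways. The Python sketch below uses nearest-neighbor distance as one possible measure; that choice, the function name, and the coordinates are assumptions made for illustration, not the measure the modelers are known to have used.

```python
import math


def nearest_neighbor_distances(points):
    """For each (x, y) point, return the distance to its closest other point.

    Expects at least two points in a projected, meter-based coordinate system.
    """
    distances = []
    for i, (xi, yi) in enumerate(points):
        nearest = min(
            math.hypot(xi - xj, yi - yj)
            for j, (xj, yj) in enumerate(points)
            if j != i
        )
        distances.append(nearest)
    return distances


# Hypothetical farm coordinates in meters.
farm_points = [(0.0, 0.0), (300.0, 400.0), (300.0, 1000.0)]
nn = nearest_neighbor_distances(farm_points)
```

Comparing the distribution of these distances for synthetic and actual locations would be one simple way to check whether the two sets have a similar spacing.
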
Future Work
• Creation of different locations for broilers, layers, and pullets using surfaces created specifically for each.
• Creation of all farms with animal operations nationwide.
  • Cattle (currently underway)
  • Sheep
  • Goats
  • Hogs
• Creation of synthetic cattle operation locations for the UK (currently underway)
• Creation of SE Asian synthetic poultry operation locations.

Contact Info
Jamie Cajka
RTI International
(919) 541-6470
[email protected]

ModelBuilder Model
[Figure: ModelBuilder model, with labels “Input Probability Surface” and “Output Synthetic Locations”]