RESEARCH TRIANGLE PARK, NORTH CAROLINA
Where Are The Farms?
A Synthetic Database of Poultry and Livestock Operations
in Support of Infectious Disease Control Strategies
Presented by Jamie Cajka
ESRI Federal Users Conference, Washington, DC
Feb. 21, 2008
Acknowledgements
This research supports the Models of Infectious Disease Agent
Study (MIDAS) project, which is funded by the National Institute
of General Medical Sciences (US Department of Health and Human
Services).
This work was also performed by:
Mark Bruhn (RTI)
Dr. Gary Smith (University of Pennsylvania)
Ross Curry (RTI)
Seth Dunipace (University of Pennsylvania)
Presentation Overview
Framing the problem
Desired output
Data sources
Data manipulations
Creation and attribution process
Results
Conclusions
Future work
Framing the Problem
Animal-borne disease modelers need to know:
Animal operation locations.
Proximity to other animal operations.
Composition of animal operation.
– Types of animals.
– Number of head.
This is necessary to model the spread and mitigate the effects of
outbreaks such as avian influenza and foot-and-mouth disease.
Avian influenza is a serious human health threat.
Modelers want to create and test control strategies.
Framing the Problem
Actual farm locations and animal counts by type are NOT
available nationally.
Grower privacy concerns
National security concerns
National Animal Identification System (NAIS) will not be the
answer.
Currently voluntary with about 20% participation.
RTI created synthetic farm locations that can be used as inputs
into animal-borne disease models.
This presentation will focus on poultry operations, as that is the
animal type that is currently complete.
Desired Output
A geographically referenced set of farms within an area,
characterized by:
Type of animals.
Number of animals.
Mix of animals.
Format could be one of:
A spatial data layer such as a shapefile.
A text file with x and y coordinates.
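The text-file option can be illustrated with a minimal sketch. The field names (farm_id, x, y, animal_type, head) and the sample records are hypothetical and are not taken from the RTI database; coordinates are assumed to be in projected meters.

```python
import csv
import io

# Hypothetical farm records; field names and values are illustrative only.
farms = [
    {"farm_id": 1, "x": 1523400.0, "y": 1402130.0, "animal_type": "broiler", "head": 24000},
    {"farm_id": 2, "x": 1525010.0, "y": 1399870.0, "animal_type": "layer", "head": 8000},
]

FIELDS = ["farm_id", "x", "y", "animal_type", "head"]


def farms_to_text(records):
    """Serialize farm records as comma-delimited text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


text = farms_to_text(farms)
print(text)
```

A file in this shape can be loaded into most GIS packages as an x, y event table or read directly by a disease model.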
Data Sources
Data Layer                                  Source
Slope                                       Derived from National Elevation Dataset (NED)
Land Cover (incl. forests & crop lands)     National Land Cover Dataset (NLCD 2001)
Wetlands                                    National Wetlands Inventory & NLCD 2001
Federal (public) Lands                      ESRI data disks version 9.2
State & Local Parks                         ESRI data disks version 9.2
National & State Roads                      ESRI Business Analyst street map (TeleAtlas 2006)
Residential Roads                           Estimated from ESRI Business Analyst street map (TeleAtlas 2006)
Water bodies                                National Hydrography Dataset (medium resolution)
Airports & Railroads                        ESRI data disks version 9.2
Poultry Support Businesses                  ESRI Business Analyst
Non-Agriculture Businesses                  ESRI Business Analyst
Municipalities & Urbanized Areas            ESRI data disks version 9.2 & US Census Bureau
Sensitive areas (churches, schools, etc.)   ESRI data disks version 9.2 (including Geographic Names Information System (GNIS) names)
Tabular Data
Census of Agriculture
Aggregation and cross-tabulation to create a single
record for each county in the U.S.
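As a rough illustration of collapsing tabular counts into one record per county, the sketch below cross-tabulates poultry type by size class. The input layout, the size classes, and the counts are hypothetical; the actual Census of Agriculture tables and the fields RTI derived from them are not described in the slides.

```python
from collections import defaultdict

# Hypothetical input rows: (county FIPS code, poultry type, size class, farm count).
rows = [
    ("37001", "broiler", "small", 12),
    ("37001", "broiler", "large", 3),
    ("37001", "layer", "small", 5),
    ("37003", "broiler", "small", 7),
]


def one_record_per_county(rows):
    """Cross-tabulate type by size class and flatten to one dict per county."""
    counties = defaultdict(dict)
    for fips, animal_type, size_class, count in rows:
        column = f"{animal_type}_{size_class}"
        counties[fips][column] = counties[fips].get(column, 0) + count
    # Add a total so each record carries the number of farms to create.
    for record in counties.values():
        record["total_farms"] = sum(record.values())
    return dict(counties)


summary = one_record_per_county(rows)
print(summary["37001"])
```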
Rasterization
All vector data were projected into Albers (NAD 83,
meters)
Buffers were created as needed
Polygons were attributed for rasterization
Vector data were rasterized to a 30 meter resolution
(to match NLCD)
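The buffer-then-rasterize step can be sketched in plain Python. This is a simplified stand-in for the GIS tools used in the project, not their implementation: it marks every 30 meter cell whose center lies within a buffer distance of a point feature. The function name, the grid origin, and the 45 meter buffer are assumptions chosen for illustration.

```python
import math

CELL_SIZE = 30.0  # meters, matching the NLCD resolution


def rasterize_buffer(points, buffer_m, origin_x, origin_y, n_rows, n_cols):
    """Return a grid with 1 where a cell center lies within buffer_m of any point.

    (origin_x, origin_y) is the upper-left corner of the grid in projected meters.
    """
    grid = [[0] * n_cols for _ in range(n_rows)]
    for row in range(n_rows):
        for col in range(n_cols):
            # Coordinates of the cell center.
            cx = origin_x + (col + 0.5) * CELL_SIZE
            cy = origin_y - (row + 0.5) * CELL_SIZE
            if any(math.hypot(cx - px, cy - py) <= buffer_m for px, py in points):
                grid[row][col] = 1
    return grid


# Hypothetical feature at the center of a 5 x 5 grid, buffered by 45 m.
grid = rasterize_buffer([(75.0, 75.0)], 45.0, 0.0, 150.0, 5, 5)
for row in grid:
    print(row)
```

With these inputs the buffer covers the 3 x 3 block of cells around the feature.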
Assigning Probabilities
Focused on farm building location rather than land
parcel location.
Based on:
Research team’s experience
Literature review
Examination of “truth” data for selected counties
The idea was to multiply the probabilities together, so that
a probability of 0 on any layer made a cell impossible
as a farm location.
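One way to picture the assignment step is a lookup that converts each cell's class into a probability. The sketch below uses NLCD 2001 class codes, but the probability attached to each class is hypothetical; the values the research team actually assigned are not listed in the slides.

```python
# Hypothetical lookup from NLCD 2001 class code to farm-building probability.
LAND_COVER_PROBABILITY = {
    11: 0.0,  # open water: impossible
    41: 0.2,  # deciduous forest: unlikely
    81: 1.0,  # pasture/hay: fully suitable
    82: 1.0,  # cultivated crops: fully suitable
}


def reclassify(class_grid, lookup, default=0.0):
    """Replace each class code with its probability; unknown codes get `default`."""
    return [[lookup.get(code, default) for code in row] for row in class_grid]


# Hypothetical 3 x 3 land cover grid.
land_cover = [
    [11, 41, 81],
    [41, 41, 82],
    [81, 82, 82],
]
probabilities = reclassify(land_cover, LAND_COVER_PROBABILITY)
for row in probabilities:
    print(row)
```

Defaulting unknown classes to 0 is a conservative design choice in this sketch: a cell is only eligible if its class was explicitly given a probability.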
Combining Raster Surfaces
Land Cover x Slope x Distance from Roads = Probabilities (cell-by-cell product; grids listed in slide order)
Land Cover
0     0.20  1.00
0.20  0.20  1.00
1.00  1.00  1.00
Slope
1.00  0.50  0.50
0.50  0     0.20
0.50  0.50  1.00
Distance from Roads
0.20  0.20  0.50
0.20  0.50  1.00
0.50  1.00  1.00
Probabilities
0     0.02  0.25
0.02  0     0.20
0.25  0.50  1.00
Combining Raster Surfaces
Individual probability surfaces were combined on a
state-by-state basis.
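The cell-by-cell multiplication can be reproduced with the 3 x 3 example grids from the slide. This is a plain-Python sketch rather than the raster tooling used in production, and the pairing of the second and third grids with the Slope and Distance from Roads labels is an assumption based on slide order; the product is the same either way.

```python
# Example grids from the slide. Which of the last two grids is Slope and
# which is Distance from Roads is assumed from the order of the slide labels.
land_cover = [[0.0, 0.2, 1.0], [0.2, 0.2, 1.0], [1.0, 1.0, 1.0]]
slope = [[1.0, 0.5, 0.5], [0.5, 0.0, 0.2], [0.5, 0.5, 1.0]]
roads = [[0.2, 0.2, 0.5], [0.2, 0.5, 1.0], [0.5, 1.0, 1.0]]


def combine(*surfaces):
    """Multiply equally sized probability grids cell by cell."""
    n_rows, n_cols = len(surfaces[0]), len(surfaces[0][0])
    combined = [[1.0] * n_cols for _ in range(n_rows)]
    for surface in surfaces:
        for r in range(n_rows):
            for c in range(n_cols):
                combined[r][c] *= surface[r][c]
    return combined


# Round to two decimals to hide floating-point noise.
combined = [[round(v, 2) for v in row] for row in combine(land_cover, slope, roads)]
for row in combined:
    print(row)
```

Any cell with a 0 on at least one input surface ends up with a combined probability of 0, which is what makes it ineligible as a farm location.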
Creation and Attribution
The production process was a combination of
VB and ArcGIS ModelBuilder.
VB
GUI front end
Opened a cursor into the Census of Agriculture summary
Attribution of type of farm
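A rough picture of the attribution step: read one county record, expand its per-type counts into labels, and hand the labels out in random order to the generated points. This Python sketch is not the VB code used in production, and the record layout and counts are hypothetical.

```python
import random


def attribute_farms(county_record, seed=None):
    """Expand per-type farm counts into a randomly ordered list of type labels.

    The i-th label is meant to be attached to the i-th generated farm point.
    """
    rng = random.Random(seed)
    labels = []
    for farm_type, count in county_record.items():
        labels.extend([farm_type] * count)
    rng.shuffle(labels)
    return labels


# Hypothetical county record: number of farms of each type.
county = {"broiler": 3, "layer": 2, "pullet": 1}
labels = attribute_farms(county, seed=1)
print(labels)
```

Shuffling a fixed list of labels, instead of drawing each label independently, keeps the count of each farm type exactly equal to the county record.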
Creation and Attribution (cont'd)
ArcGIS ModelBuilder
Results
RTI generated a synthetic poultry operation
shapefile for every county in the United States.
The number of farms was correct.
The locations corresponded to the probability
surface.
The size and type were randomized.
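One plausible way to place a fixed number of farms so that their locations follow a probability surface is weighted random sampling of cells. The slides do not state the exact sampling method, so the sketch below is an assumption: cells are drawn with replacement in proportion to their probability, and each farm gets a random position inside its cell. The function name, grid origin, and seed are illustrative.

```python
import random

CELL_SIZE = 30.0  # meters


def sample_farm_locations(prob_grid, n_farms, origin_x, origin_y, seed=None):
    """Draw n_farms (x, y) points, choosing cells in proportion to probability.

    (origin_x, origin_y) is the upper-left corner of the grid in projected meters.
    Cells with probability 0 are never chosen.
    """
    rng = random.Random(seed)
    cells, weights = [], []
    for r, row in enumerate(prob_grid):
        for c, p in enumerate(row):
            if p > 0:
                cells.append((r, c))
                weights.append(p)
    # Sampling is with replacement, so two farms may share a cell.
    chosen = rng.choices(cells, weights=weights, k=n_farms)
    locations = []
    for r, c in chosen:
        # Random offset so the point falls somewhere inside the chosen cell.
        x = origin_x + (c + rng.random()) * CELL_SIZE
        y = origin_y - (r + rng.random()) * CELL_SIZE
        locations.append((x, y))
    return locations


surface = [[0.0, 0.02, 0.25], [0.02, 0.0, 0.2], [0.25, 0.5, 1.0]]
locations = sample_farm_locations(surface, n_farms=5, origin_x=0.0, origin_y=90.0, seed=42)
print(locations)
```

Passing the county's farm count as n_farms keeps the number of synthetic farms fixed while letting their positions vary from run to run.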
Results (cont'd)
RTI sent synthetic poultry operation locations to researchers at
the University of Pennsylvania for comparison against the complete
set of truth data.
[Figure: side-by-side maps of Actual Locations and Synthetic Locations]
Conclusions
Synthetic locations matched actual locations very well.
The data are still being tested in the models to see how
sensitive the models are to the various parameters:
Inter-farm distance
Animal type
Number of animals
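Inter-farm distance, the first parameter listed, can be summarized as each farm's distance to its nearest neighbor. The sketch below is a brute-force calculation on hypothetical coordinates in projected meters; it is not the comparison method used by the research team.

```python
import math


def nearest_neighbor_distances(points):
    """Return, for each point, the distance to its closest other point.

    Requires at least two points; runs in O(n^2) time.
    """
    distances = []
    for i, (x1, y1) in enumerate(points):
        nearest = min(
            math.hypot(x1 - x2, y1 - y2)
            for j, (x2, y2) in enumerate(points)
            if j != i
        )
        distances.append(nearest)
    return distances


# Hypothetical farm coordinates in meters.
farms = [(0.0, 0.0), (300.0, 400.0), (300.0, 1000.0)]
distances = nearest_neighbor_distances(farms)
print(distances)
```

Computing the same statistic for synthetic and actual locations gives one simple basis for comparing the two sets.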
Future Work
Creation of different locations for broilers, layers, and pullets
using surfaces created specifically for each.
Creation of synthetic locations for all farms with animal operations nationwide:
Cattle (currently underway)
Sheep
Goats
Hogs
Creation of synthetic cattle operation locations for the UK
(currently underway)
Creation of SE Asian synthetic poultry operation locations.
Contact Info
Jamie Cajka
RTI International
(919) 541-6470
ModelBuilder Model
Input Probability Surface
Output Synthetic Locations