Sampling and Occupancy Estimation
Krishna Pacifici
Department of Applied Ecology
NCSU
January 10, 2014
Designing studies
Why, what, and how?
Why collect the data?
What type of data to collect?
How should the data be collected in the field and then
analyzed?
Clear objectives help relate all three components.
Why?
Clear objectives
How will the data be used to discriminate between
scientific hypotheses about a system?
How will the data be used to make management
decisions?
For example:
Determine the overall level of occupancy for a species in
a particular region.
Compare the level of occupancy in two different habitat
types within that region.
What?
Many kinds of data
Population-level
Population size/density
Survival
Immigration & emigration
Presence/absence
Community-level
Persistence
Colonization & extinction
Species richness/diversity
How?
Sampling and Modeling
Interest lies in making inference from a sample to a
population
Statistics!
Want it to be repeatable and accurate
Others should understand what you have done and be
able to replicate
Many different modeling/analysis approaches
Distance sampling, multiple observer, capture-recapture, occupancy modeling…
PURPOSES OF SAMPLING
ESTIMATE ATTRIBUTES (PARAMETERS)
Abundance/ density
Survival
Occurrence probability
ALLOW LEGITIMATE EXTRAPOLATION FROM
DATA TO POPULATIONS
PROVIDE MEASURES OF STATISTICAL
RELIABILITY
SAMPLING NEEDS TO BE
ACCURATE: LEADING TO UNBIASED ESTIMATES
REPEATABLE: ESTIMATES LEAD TO SIMILAR
ANSWERS
EFFICIENT: DO NOT WASTE RESOURCES
BIAS
HOW GOOD “ON AVERAGE” AN ESTIMATE IS
CANNOT TELL FROM A SINGLE SAMPLE
DEPENDS ON SAMPLING DESIGN, ESTIMATOR,
AND ASSUMPTIONS
UNBIASED
[Figure: sample estimates (*) scattered around the true value; the average estimate falls on the true value]
BIASED
[Figure: sample estimates (*), true value, and average estimate; BIAS is the gap between the average estimate and the true value]
REPEATABLE (PRECISE)
[Figure: sample estimates (*) clustered tightly together]
NOT REPEATABLE (IMPRECISE)
[Figure: sample estimates (*) widely scattered]
CAN BE IMPRECISE BUT UNBIASED, OR
[Figure: sample estimates (*) widely scattered, with the average estimate on the true value]
PRECISELY BIASED, OR
[Figure: sample estimates (*) tightly clustered, with the average estimate away from the true value]
IMPRECISE AND BIASED!
[Figure: sample estimates (*) widely scattered, with the average estimate away from the true value]
ACCURATE = UNBIASED & PRECISE
[Figure: sample estimates (*) tightly clustered, with the average estimate on the true value]
HOW DO WE MAKE ESTIMATES
ACCURATE?
KEEP BIAS LOW
SAMPLE TO ADEQUATELY REPRESENT
POPULATION
ACCOUNT FOR DETECTION
KEEP VARIANCE LOW
REPLICATION (ADEQUATE SAMPLE SIZE)
STRATIFICATION, RECORDING OF COVARIATES,
BLOCKING
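The difference between bias and precision can be illustrated with a small simulation. This is only a sketch: the population size, the detection probability, and the function names are hypothetical, and detection probability is assumed known, which is rarely the case in the field.

```python
import random
import statistics

TRUE_N = 100     # hypothetical true number of animals in the sampled area
P_DETECT = 0.6   # hypothetical detection probability, assumed known here


def simulate_counts(true_n, p_detect, n_reps, seed=1):
    """Repeat a survey n_reps times; each animal is counted with probability p_detect."""
    rng = random.Random(seed)
    return [
        sum(rng.random() < p_detect for _ in range(true_n))
        for _ in range(n_reps)
    ]


def bias(estimates, truth):
    """Average estimate minus the true value."""
    return statistics.mean(estimates) - truth


counts = simulate_counts(TRUE_N, P_DETECT, n_reps=2000)
adjusted = [c / P_DETECT for c in counts]  # counts corrected for detection

naive_bias = bias(counts, TRUE_N)        # raw counts ignore detection
adjusted_bias = bias(adjusted, TRUE_N)
naive_sd = statistics.stdev(counts)      # spread across repeated samples
adjusted_sd = statistics.stdev(adjusted)

print(f"raw count: bias {naive_bias:7.2f}, SD {naive_sd:5.2f}")
print(f"count / p: bias {adjusted_bias:7.2f}, SD {adjusted_sd:5.2f}")
```

In this setup the raw count is "precisely biased" (it falls short of the true value by a similar amount every time), while the detection-corrected count is close to unbiased at the cost of a somewhat larger spread.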
Key Issues
Spatial sampling
Proper consideration and incorporation of
detectability
Sampling principles
What is the objective?
What is the target population?
What are the appropriate sampling units?
Size, shape, placement
Quantities measured
Remember
Field sampling must be representative of the
population of inference
Incomplete detection MUST be accounted for in
sampling and estimation
What is the objective?
Unbiased estimate of population density of snakes
(e.g., cobras) in Corbett National Park
Coefficient of variation of estimate < 20%
As cost efficient as possible
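A precision target such as "CV < 20%" can be checked directly from replicate sampling units. The sketch below assumes the estimate is the mean density across units and that the CV is the standard error divided by that mean; the density values and the function name are made up for illustration.

```python
import math
import statistics


def coefficient_of_variation(unit_estimates):
    """CV of the mean estimate: standard error of the mean divided by the mean."""
    n = len(unit_estimates)
    mean = statistics.mean(unit_estimates)
    standard_error = statistics.stdev(unit_estimates) / math.sqrt(n)
    return standard_error / mean


# Hypothetical snake densities (animals per unit area) from 8 sampling units
densities = [2.0, 3.0, 1.0, 4.0, 2.0, 3.0, 2.0, 3.0]

cv = coefficient_of_variation(densities)
meets_target = cv < 0.20  # objective stated above: CV of estimate < 20%

print(f"CV = {cv:.1%}, meets target: {meets_target}")
```

If the target is not met, the usual remedy is more sampling units, since the standard error shrinks with the square root of the number of units.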
What is the target population?
Population in the NP
What are the appropriate sampling
units?
Quadrats?
Point samples?
Line transects?
Sampling units- nonrandom placement
[Figure: sampling units placed along a road]
Nonrandom placement
Advantages
Easy to lay out
More convenient to sample
Disadvantages
Do not represent other (off road) habitats
Road may attract (or repel) snakes
OR- redefine the target:
Road
Sampling units- random placement
Random placement
Advantages
Valid statistical design
Represents study area
Replication allows variance estimation
Disadvantages
May be logistically difficult
Harder to lay out
May not work well in heterogeneous study areas
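One simple way to place units at random is to overlay a grid on the study area and draw cells without replacement. This is a minimal sketch; the grid dimensions, the number of units, and the function name are hypothetical.

```python
import random


def random_units(n_rows, n_cols, n_units, seed=None):
    """Draw n_units distinct grid cells, each cell equally likely to be chosen."""
    rng = random.Random(seed)
    cells = [(row, col) for row in range(n_rows) for col in range(n_cols)]
    return rng.sample(cells, n_units)


# Hypothetical 10 x 10 grid over the study area; select 15 cells to survey
selected = random_units(n_rows=10, n_cols=10, n_units=15, seed=42)
print(selected)
```

Fixing the seed makes the draw reproducible, so others can see exactly which units were selected.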
Stratified sampling
Stratified sampling
Advantages
Controls for heterogeneous study area
Allows estimation of density by strata
More precise estimate of overall density
Disadvantages
More complex design
May require larger total sample
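A stratified estimate of overall density weights each stratum's mean by its share of the study area. The sketch below uses the standard stratified mean and its standard error, ignoring any finite population correction; the strata, area weights, and densities are hypothetical.

```python
import statistics


def stratified_estimate(strata):
    """strata: list of (area_weight, [unit densities]). Returns (mean, standard error)."""
    total_weight = sum(weight for weight, _ in strata)
    mean = 0.0
    variance = 0.0
    for weight, densities in strata:
        w = weight / total_weight  # stratum's share of the study area
        mean += w * statistics.mean(densities)
        variance += w ** 2 * statistics.variance(densities) / len(densities)
    return mean, variance ** 0.5


# Hypothetical strata: (share of area, densities on sampled units)
strata = [
    (0.7, [1.0, 2.0, 3.0]),     # e.g., forest
    (0.3, [8.0, 10.0, 12.0]),   # e.g., grassland
]

overall_mean, overall_se = stratified_estimate(strata)
print(f"overall density = {overall_mean:.2f} (SE {overall_se:.2f})")
```

The per-stratum means are also available from the same data, which is the "estimation of density by strata" advantage listed above.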
Single, unreplicated line
Are these hard “rules”? NO!
Some violations of assumptions can be OK – and even
necessary (idea of “robustness”)
These are ideals to strive toward
Good if you can achieve them
If you can’t, you can’t, but study results may need
a different interpretation
Estimation: from Count Data to
Population (I)
Geographic variation (can’t look everywhere)
Frequently counts/observations cannot be conducted
over entire area of interest
Proper inference requires a spatial sampling design that
permits inference about entire area, based on a sample
A valid sampling design
Allows valid probability inference about the
population
Statistical model
Allows estimates of precision
Replication, independence
Other Spatial Sampling Designs
Systematic sampling
Can approximate random sampling in some cases
Cluster sampling
When the biological units come in clusters
Double sampling
Very useful for detection calibration
Adaptive sampling
More efficient when populations are distributed
“clumpily”
Dual-frame sampling
Estimation: from Count to
Population (II)
Detectability (can’t see everything in places where you
do look)
Counts represent some unknown fraction of animals in
sampled area
Proper inference requires information on detection
probability
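The link between a count and the population can be written compactly. As a sketch of the standard canonical estimator, with C the count, N the true number of animals in the sampled area, and p the detection probability (symbols chosen here for illustration, not taken from the slides):

```latex
E(C) = N\,p
\qquad\Longrightarrow\qquad
\hat{N} = \frac{C}{\hat{p}}
```

The count alone estimates N only when p = 1; otherwise an estimate of p is needed.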
Sampling Take Home Messages
Field sampling must be designed to meet study or
conservation objectives
Field sampling must be representative of the
population of inference
Incomplete detection MUST be accounted for in
sampling and estimation
Occupancy Estimation
Species status = present or absent
Coarse measure of population status
Proportion of occupied patches
Data can be collected efficiently over large spatial and
temporal extents
Species and community-level dynamics
Occupancy Estimation: Uses
Surveys of geographic range
Habitat relationships
Metapopulation dynamics
Observed colonization and extinction
Extensive monitoring programs: 'trends' or changes in
occupancy over time
Species Occurrence
Conduct “presence-absence” (detection-nondetection)
surveys.
Estimate what fraction of sites (or area) is occupied by
a species when the species is not always detected,
even when present (p < 1).
‘Site’: Arbitrarily defined spatial unit (forest patch of a
specified size) or discrete naturally occurring sampling
units (ponds).
Site occupancy: A solution
MacKenzie et al. 2002 (Ecology)
Key design issues: Replication
Temporal replication: repeat visits to sample units
Replicate visits occur within a relatively short period of time
(e.g., a breeding season)
Spatial replication: randomly selected ‘sites’ or sample
units within area of interest
Basic Sampling Scheme:
Single Season
s sites are surveyed, each at k distinct sampling
occasions.
Species is detected/not detected at each occasion.
Necessary information:
Data summary → Detection histories
Detection history: Record for each visited site or
sample unit
1 denotes detection
0 denotes nondetection
Example detection history: h_i = 1 0 0 1 0
Denotes 5 visits to the site
Target species detected during visits 1 and 4
0 does not necessarily mean the species was absent
Not detected, but could be there!
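The point that an all-zero history does not prove absence can be made concrete. The sketch below computes the "naive" occupancy (share of sites with at least one detection) and the chance that an occupied site produces only zeros; the histories, the detection probability of 0.3, and the function names are hypothetical.

```python
def naive_occupancy(histories):
    """Fraction of sites with at least one detection."""
    return sum(1 for h in histories if any(h)) / len(histories)


def prob_all_zero(p, k):
    """Probability that an occupied site yields k nondetections, for constant p."""
    return (1 - p) ** k


# Hypothetical detection histories: 4 sites, 5 visits each
histories = [
    [1, 0, 0, 1, 0],
    [0, 0, 0, 0, 0],
    [0, 1, 0, 0, 0],
    [0, 0, 0, 0, 0],
]

naive = naive_occupancy(histories)
missed = prob_all_zero(p=0.3, k=5)

print(f"naive occupancy = {naive:.2f}")
print(f"chance an occupied site is never detected = {missed:.3f}")
```

Because some all-zero sites may in fact be occupied, the naive proportion tends to underestimate true occupancy when p < 1.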
Model Parameters: Single-Season
Models
ψ_i: probability that site i is occupied.
p_ij: probability of detecting the species at
site i at time j, given that the species is
present.
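With these two parameters, the likelihood of a detection history can be written down and maximized. The sketch below assumes the simplest case, constant ψ and p across sites and occasions, following the single-season model of MacKenzie et al. 2002. The coarse grid search is an illustrative stand-in, not how PRESENCE fits models, and the data are hypothetical.

```python
import math


def log_likelihood(psi, p, histories):
    """Log-likelihood of constant psi and p given 0/1 detection histories."""
    total = 0.0
    for h in histories:
        detections = sum(h)
        visits = len(h)
        # Site occupied, with this particular pattern of detections
        prob = psi * p ** detections * (1 - p) ** (visits - detections)
        if detections == 0:
            # An all-zero history can also come from an unoccupied site
            prob += 1 - psi
        total += math.log(prob)
    return total


def grid_mle(histories, steps=99):
    """Coarse grid search over psi and p in (0, 1); returns (psi_hat, p_hat)."""
    grid = [(i + 1) / (steps + 1) for i in range(steps)]
    return max(
        ((psi, p) for psi in grid for p in grid),
        key=lambda pair: log_likelihood(pair[0], pair[1], histories),
    )


# Hypothetical data: 10 sites, 4 visits each; 6 sites with detections
histories = [
    [1, 0, 1, 0], [0, 1, 1, 0], [1, 1, 0, 0],
    [0, 0, 1, 1], [1, 0, 0, 1], [0, 1, 0, 0],
    [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0],
]

psi_hat, p_hat = grid_mle(histories)
print(f"psi_hat = {psi_hat:.2f}, p_hat = {p_hat:.2f}")
```

The estimated ψ comes out above the naive proportion of sites with detections (0.6 here), because the model attributes some all-zero histories to occupied sites where the species was missed.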
Model assumptions
• Sites are closed to changes in occupancy
state between sampling occasions
• No heterogeneity that cannot be explained
by covariates
• The detection process is independent at
each site
• > 500 meters apart
Timing of repeated surveys
Usually conducted as multiple discrete visits (e.g., on
different days)
Can also use multiple surveys within a single visit
Multiple independent observers
Potentially introduce heterogeneity into data
Single visit to each site vs. multiple visits to each site
Rotate observers amongst sites on each day
Rotate order each site is sampled within a day
Designing occupancy surveys
Several important issues to consider:
1. Clear objectives that are explicitly linked to science or
management
2. Selection of sampling units
Probabilistic sampling design
Size of unit relative to species of interest
3. Timing of repeat surveys
“closed”
Relaxed for lab project
4. Allocation of survey effort
Survey all of the sites an equal number of times?
Getting To Know
PRESENCE
PRESENCE is software that has been developed to
apply these models to collected data.
Within PRESENCE you can fit multiple models to your
data.
PRESENCE stores the results from each model and
presents a summary of the results in a model selection
table using AIC.
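The ranking behind an AIC model selection table can be reproduced by hand. The sketch below uses the standard formulas (AIC = -2 log L + 2K, with differences and Akaike weights). The model names, log-likelihoods, and parameter counts are hypothetical, and the exact columns PRESENCE reports are not assumed here.

```python
import math


def aic_table(models):
    """models: dict of name -> (log_likelihood, n_parameters).

    Returns rows of (name, AIC, delta AIC, Akaike weight), best model first.
    """
    aic = {name: -2 * ll + 2 * k for name, (ll, k) in models.items()}
    best = min(aic.values())
    relative = {name: math.exp(-0.5 * (a - best)) for name, a in aic.items()}
    total = sum(relative.values())
    rows = [
        (name, aic[name], aic[name] - best, relative[name] / total)
        for name in aic
    ]
    return sorted(rows, key=lambda row: row[1])


# Hypothetical fitted models: (maximized log-likelihood, number of parameters)
models = {
    "psi(.)p(.)": (-50.0, 2),
    "psi(habitat)p(.)": (-47.5, 3),
    "psi(.)p(t)": (-49.0, 5),
}

table = aic_table(models)
for name, aic_value, delta, weight in table:
    print(f"{name:18s} AIC={aic_value:7.2f} dAIC={delta:5.2f} w={weight:.3f}")
```

Lower AIC indicates a better trade-off between fit and number of parameters; the weights sum to 1 and give each model's relative support within the set.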
PRESENCE
The analysis is stored in a project file (created from the
File menu).
A project consists of 3 files, *.pao, *.pa2 and *.pa2.out
*.pao is the data file
*.pa2 stores a summary of the models fit to the data
*.pa2.out stores the full results for all the models
PRESENCE consists of 2 main windows:
Number crunching window
Point and click window
When you create a new project, you must specify the data file (if previously created), or input the data to be analysed.
Once the data file has been defined and selected, the filename for the project file will be the same as the data file.
To enter data, specify the number of sites, survey occasions, and site-specific and survey-specific (sampling) covariates. Then select the Input Data Form.
The No. Occasions/season box is used for multi-season data. You must list the number of surveys per season, separated with a comma.
Data can be copied and pasted (via the menus only) from a spreadsheet into each respective tab.
You can also enter data directly, or insert from a comma delimited text (.csv) file.
Note the number of PRESENCE-related windows now open.
Once data has been entered, you must save the data before closing the window!
After saving your data and closing the data window, check that the correct data filename appears here. If not, you will have to select the file manually.
Make sure you click OK before proceeding.
The type of analysis to perform is selected from the Run menu.
After setting up your project, an empty Results Browser window should appear. Make sure you see this before attempting to run any models!