Detection of thermal anomaly of land surface
temperature dataset of an EDC geothermal
prospect in Peru using statistical methods
Winston P.C. Pioquinto
13th National Convention on Statistics
EDSA-Shangri-La, Mandaluyong City
October 3-4, 2016
OUTLINE
• Overview
• Objectives and Limitations
• Materials and Methods
• Results of study
• Conclusions and further study
WHAT IS AN ANOMALY?
http://www.datalgorithm.co.uk/portfolio_page/anomaly-detection/
An anomaly is an observation that deviates so much from other observations as to
arouse suspicion that it was generated by a different mechanism. - Hawkins, 1980
THERMAL ANOMALY REQUIRES THE PRESENCE OF AN
OBJECT THAT IS SIGNIFICANTLY BRIGHTER OR HOTTER
THAN OTHERS
An object or feature that is different in its recorded temperature when compared to the
background temperature of a certain area (spatially) or within a specific time
(temporally) (Harris, 2013)
http://www.ulb.ac.be/sciences/cvl/multispectral/multispectral2.htm
ANOMALY DETECTION - IDENTIFICATION OF “EXTREME”
OR “OUT OF PLACE” PIXELS IN THE DIGITAL IMAGERY
• Pixel anomalies are classified into three categories:
(1) point anomalies – an individual pixel or region of
pixels is spectrally anomalous with respect to
the background (image/scene);
(2) contextual anomalies – the context of the pixel
(its classification label) is anomalous with respect
to the rest of the pixels of certain classes;
(3) collective anomalies – a collection of
measures is anomalous with respect to the entire
scene.
STUDY AREA IS ONE OF EDC’S GEOTHERMAL
PROSPECTS IN LATIN AMERICA
Study area: Prospect A, located in the high mountains of southern Peru
OBJECTIVES AND LIMITATIONS
• Pinpoint specific regions or pixels with
anomalous temperature values
• Discretize subtle temperature anomalies that
could possibly be related to geothermal
features
• Data points are dependent on resolution of
image used (~60m)
MAIN MATERIAL USED IS A LAND SURFACE
TEMPERATURE DATASET DERIVED FROM A LANDSAT
SATELLITE IMAGE
Land surface temperature dataset of Prospect A, Peru
- LST map derived from thermal infra-red Band 6 (low gain) of Landsat 7
ETM+, acquired 11-06-2000, with pixel resolution of ~60m
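The slides do not say how the LST map was computed from Band 6. As a rough illustration only, the sketch below (in Python, not the R software used in the study) converts a Band 6 low-gain digital number to at-sensor brightness temperature using the standard two-step Landsat conversion. The calibration constants are the commonly published Landsat 7 ETM+ values and should be checked against the scene metadata; the function names are hypothetical, and no emissivity or atmospheric correction is applied, so the result is not a true land surface temperature.

```python
import math

# Assumed constants: commonly published Landsat 7 ETM+ Band 6 values.
# Verify against the scene's own metadata before any real use.
K1 = 666.09      # W / (m^2 sr um)
K2 = 1282.71     # kelvin
LMIN_LOW, LMAX_LOW = 0.0, 17.04   # low-gain spectral radiance range
QCALMIN, QCALMAX = 1, 255         # digital number range

def dn_to_radiance(dn):
    """Linear rescaling of a digital number to spectral radiance."""
    gain = (LMAX_LOW - LMIN_LOW) / (QCALMAX - QCALMIN)
    return gain * (dn - QCALMIN) + LMIN_LOW

def radiance_to_celsius(radiance):
    """Inverse Planck relation: radiance to brightness temperature."""
    if radiance <= 0:
        raise ValueError("radiance must be positive")
    kelvin = K2 / math.log(K1 / radiance + 1.0)
    return kelvin - 273.15

def dn_to_celsius(dn):
    """Hypothetical helper chaining the two steps above."""
    return radiance_to_celsius(dn_to_radiance(dn))

for dn in (100, 150, 200):
    print(dn, round(dn_to_celsius(dn), 1))
```

Brighter pixels (larger digital numbers) map to higher temperatures, which is the relationship the LST map relies on.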
IN THE LST MAP, THE BRIGHTER THE PIXEL THE
HIGHER THE TEMPERATURE IT INDICATES
DIGITAL ELEVATION MODEL OF PROSPECT A BLOCK
USED TO EXTRACT ELEVATION VALUES
Digital elevation model of Prospect A block
- Elevation values obtained from a DEM derived from the ALOS satellite
stereographic optical sensor
METHODOLOGY USED IS CLUSTER ANALYSIS,
HOTSPOT/OUTLIER ANALYSIS, VISUAL EXAMINATION OF
HISTOGRAM AND DENSITY PLOTS USING R SOFTWARE
PLOT SHOWS APPARENT CLUSTERING OF TEMPERATURE
VALUES AT CERTAIN TEMPERATURE RANGES
Figure: 2D histogram plot of temperature (°C) versus elevation (m). Brightness of pixels denotes relative abundance.
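A 2D histogram of this kind simply counts how many pixels fall into each (elevation, temperature) cell. The sketch below is a minimal Python illustration, not the R code used in the study; the bin widths, the function name and the sample values are all made up for the example.

```python
from collections import Counter

def histogram_2d(elevations, temperatures, elev_bin=500.0, temp_bin=5.0):
    """Count pixels per (elevation bin, temperature bin) cell."""
    counts = Counter()
    for elev, temp in zip(elevations, temperatures):
        cell = (int(elev // elev_bin), int(temp // temp_bin))
        counts[cell] += 1
    return counts

# Hypothetical pixel values, for illustration only
elevation = [4100, 4200, 4350, 4400, 5200, 5300, 6100]
temperature = [31.0, 33.5, 34.0, 48.2, 22.0, 21.5, 3.0]

counts = histogram_2d(elevation, temperature)
for (elev_idx, temp_idx), n in sorted(counts.items()):
    print(f"{elev_idx * 500:.0f} m, {temp_idx * 5:.0f} °C: {n} pixel(s)")
```

Cells with high counts correspond to the bright areas of the plot; sparsely populated cells far from them are candidates for closer inspection.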
THREE, FOUR OR FIVE CLUSTERS ARE
SUBJECTIVELY DETERMINED BASED ON THE ELBOW OR
INFLECTION IN THE PLOT
Cluster analysis is a multivariate analysis that attempts to form groups or "clusters"
of objects that are "similar" to each other within a cluster but differ between clusters.
Plot showing how to determine the number of clusters that can be formed in the
dataset. In this case, 3, 4 and 5 clusters are subjectively determined.
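The elbow approach can be sketched as follows: run a clustering for several values of k, record the within-cluster sum of squares, and look for the k after which the decrease levels off. The slides do not name the clustering algorithm, so k-means on a single variable is an assumption here; the code is a Python illustration with made-up values, not the R workflow of the study.

```python
def kmeans_1d(values, k, iterations=50):
    """Lloyd's k-means on one variable; returns (centers, within-cluster SS)."""
    data = sorted(values)
    n = len(data)
    # Deterministic start: centers spread evenly across the sorted data
    centers = [data[(2 * i + 1) * n // (2 * k)] for i in range(k)]
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for v in data:
            nearest = min(range(k), key=lambda j: (v - centers[j]) ** 2)
            groups[nearest].append(v)
        # Keep the old center if a group ends up empty
        updated = [sum(g) / len(g) if g else centers[j]
                   for j, g in enumerate(groups)]
        if updated == centers:
            break
        centers = updated
    wss = sum(min((v - c) ** 2 for c in centers) for v in data)
    return centers, wss

# Hypothetical temperatures forming three obvious groups
temperatures = [5, 6, 7, 20, 21, 22, 40, 41, 42]
wss_by_k = {k: kmeans_1d(temperatures, k)[1] for k in range(1, 6)}
for k, wss in wss_by_k.items():
    print(k, round(wss, 2))
```

In this toy data the sum of squares drops sharply up to k = 3 and barely changes afterwards, which is the "elbow". On real data the bend is usually less distinct, which is why the choice of 3, 4 or 5 clusters is described as subjective.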
THE LARGE DATASET MAKES CLUSTERS OR PARTITIONS
DIFFICULT TO DELINEATE, MAKING IT IMPRACTICAL TO DETECT
ANOMALIES
Plots of components at the specified number of
clusters (3, 4 and 5 clusters respectively)
ANOMALIES ARE DETECTED USING CUT-OFF VALUES
OF OUTLIERS AT VARIOUS ELEVATION RANGES
Outlier analysis identifies values that are
disproportionately high based on both
the deviance of any given value from a
statistical distribution and its similarity to
other values.
cut-off value = 53.9708
cut-off value = 48.84044
Plot of outlier temperature values per elevation range (indicated by red bar/points)
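The slides report cut-off values per elevation range but do not state the rule that produced them. One common choice, assumed here purely for illustration, is the boxplot upper fence (third quartile plus 1.5 times the interquartile range) computed separately for each elevation band. The Python sketch below uses hypothetical pixel values and function names; it is not the R code from the study and will not reproduce the cut-off values shown.

```python
import statistics

def upper_fence(temps):
    """Boxplot-style cut-off: Q3 + 1.5 * IQR (an assumed rule)."""
    q1, _, q3 = statistics.quantiles(temps, n=4, method="inclusive")
    return q3 + 1.5 * (q3 - q1)

def outliers_by_band(pixels, band_width=1000):
    """Group (elevation, temperature) pairs into bands and flag high outliers."""
    bands = {}
    for elev, temp in pixels:
        start = int(elev // band_width) * band_width
        bands.setdefault(start, []).append(temp)
    result = {}
    for start, temps in sorted(bands.items()):
        cutoff = upper_fence(temps)
        result[start] = (cutoff, [t for t in temps if t > cutoff])
    return result

# Hypothetical pixels: (elevation in m, temperature in °C)
pixels = [(4100, 30), (4200, 31), (4300, 32), (4400, 33),
          (4500, 34), (4600, 35), (4700, 36), (4800, 60),
          (5100, 20), (5200, 21), (5300, 22), (5400, 23)]

result = outliers_by_band(pixels)
for start, (cutoff, flagged) in result.items():
    print(f"{start}-{start + 1000} m: cut-off {cutoff:.2f}, outliers {flagged}")
```

Computing the cut-off per band matters because temperature falls with elevation; a single scene-wide threshold would flag warm lowland pixels and miss anomalies at altitude.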
ANOMALIES ARE DETECTED USING CUT-OFF VALUES
OF OUTLIERS AT VARIOUS ELEVATION RANGES
cut-off value = 56.05709
cut-off value = 53.57436
Plot of outlier temperature values per elevation range (indicated by red bar/points)
cut-off value = 6.898726
SUBTLE ANOMALIES DETECTED BY VISUAL
EXAMINATION OF HISTOGRAMS AND DENSITY PLOTS
Histogram and kernel density plots of
temperature values per elevation range
CURVES ARE STEEP OR PROGRESSIVELY GOING DOWN SO A SUBTLE
ANOMALY IS DIFFICULT TO RECOGNIZE
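A kernel density plot smooths the histogram by placing a small bell curve on every observation and summing them. The sketch below is a minimal Python version with a Gaussian kernel and a rule-of-thumb bandwidth similar in spirit to the default of R's density function; it is not the code used in the study, and the temperatures are made up.

```python
import math
import statistics

def bandwidth(values):
    """Rule-of-thumb bandwidth: 0.9 * min(sd, IQR / 1.34) * n^(-1/5)."""
    sd = statistics.stdev(values)
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    spread = min(sd, (q3 - q1) / 1.34) or sd
    return 0.9 * spread * len(values) ** -0.2

def kde(values, grid, h=None):
    """Gaussian kernel density estimate evaluated at each grid point."""
    h = h or bandwidth(values)
    norm = 1.0 / (len(values) * h * math.sqrt(2 * math.pi))
    return [norm * sum(math.exp(-0.5 * ((x - v) / h) ** 2) for v in values)
            for x in grid]

# Hypothetical temperatures (°C) for one elevation range
temps = [30, 31, 31.5, 32, 33, 34, 35, 48]
grid = [i * 0.1 for i in range(801)]   # 0 to 80 °C in 0.1 °C steps
density = kde(temps, grid)

# The area under a density curve should be close to 1
area = sum(density) * 0.1
print(round(area, 3))
```

An isolated warm value such as the 48 °C sample shows up as a small separate bump, whereas a subtle anomaly appears only as a shoulder or a change of slope on the main curve, which is why it is hard to see when the curve is steep.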
SUBTLE TEMPERATURE HIGHS DETECTED AT ELEVATION
4000-5000M BECAUSE OF GRADUAL SLOPING OF CURVE
Histogram and kernel density plots of temperature values per elevation range
- note at elevation 4000-5000m, a gradual slope in the curve possibly
indicating subtle temperature highs
Plot annotation: gradual slope in the curve at 48.12693, marked as an anomaly
ANOMALOUS TEMPERATURE LOWS AT ELEVATION 5000-6000M
INTERPRETED TO BE DUE TO SNOW OR ICE PORTIONS OF THE REGION
SUBTLE TEMPERATURE HIGHS DIFFICULT TO DETECT DUE
TO PREDOMINANTLY LOW TEMPERATURE VALUES AT
ELEVATION 6000-7000M
Histogram and kernel density plots of temperature values at 6000-7000m
elevation range - note predominantly lower temperature values wherein subtle
temperature highs are difficult to recognize
RESULTS OF STUDY
Extraction and display of data points interpreted as anomalies
Plot of spatial location of outliers and
subtle temperature highs per elevation
range
DETECTED THERMAL ANOMALIES ARE MORE PREVALENT
IN THE EASTERN SECTOR BUT SOME POINTS ARE IN
THE VICINITY OF THERMAL SPRINGS (TRIANGLES)
Plot of statistical point anomalies in the prospect block superimposed with
the actual locations of thermal springs (triangles)
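The visual overlay can be complemented by a simple distance check: for each detected anomaly, find the nearest known thermal spring and keep the points within a chosen radius. The Python sketch below assumes projected coordinates in metres; the coordinates, the radius and the function names are hypothetical and are not taken from the study.

```python
import math

def nearest_spring_distance(point, springs):
    """Straight-line distance from a point to the closest spring."""
    return min(math.hypot(point[0] - sx, point[1] - sy) for sx, sy in springs)

def anomalies_near_springs(anomalies, springs, radius_m):
    """Keep anomalies lying within radius_m of any spring."""
    return [p for p in anomalies
            if nearest_spring_distance(p, springs) <= radius_m]

# Hypothetical projected coordinates (metres)
anomalies = [(0, 0), (1000, 0), (5000, 5000)]
springs = [(300, 400)]

near = anomalies_near_springs(anomalies, springs, radius_m=600)
print(near)
```

With a ~60m pixel size, a radius of a few pixels is a reasonable starting point, but the appropriate value depends on how accurately the spring locations were surveyed.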
RELATIVELY HIGHER TEMPERATURES LOCATED MOSTLY ALONG MOUNTAIN
FLANKS IN THE EASTERN SECTOR AND SOME IN THE LOWLANDS IN THE NORTH
WHILE PEAKS EXHIBIT LOWER TEMPERATURE VALUES
CONCLUSIONS AND FURTHER STUDY
• Statistical methods used in the thermal anomaly
detection include cluster/outlier analysis and
examination of histogram and density plots
• Subtle temperature anomalies are subjectively
determined by examining histogram and density
plots of temperature values at specific elevation
ranges
• A gradual sloping of the curve in the density plot of
the elevation range from 4000 to 5000m was
interpreted to be a subtle anomaly
CONCLUSIONS AND FURTHER STUDY
• Anomalously low temperature values at 6000-7000m
elevation could be snow-laden spots in the region
• Limitations of R software in handling a large dataset
make cluster analysis challenging, and the statistical
method used is constrained by image resolution
• Since we are dealing primarily with point anomalies
with spatial attributes, the use of other R packages
such as spatstat and gstat would be another option
to study spatial point patterns and conduct spatial
statistics analysis and modeling
CONCLUSIONS AND FURTHER STUDY
• The study, in conjunction with other remote sensing
techniques and higher resolution images, could
support EDC’s exploration efforts to find previously
unmapped thermal areas in the vicinity of the
prospect area
• Ground truthing is needed to validate whether the indicated
thermal anomalies can be attributed
to geothermal features (e.g. hot grounds, fumaroles,
etc.) or are just due to highly reflective surfaces
(e.g. bare rocks, desert sand, roofs, etc.)
Thank you…
“Through every rift of discovery some seeming anomaly
drops out of the darkness, and falls, as a golden link, into
the great chain of order.”
- Edwin Hubbell Chapin