Materials and methods

Seminar
Instructor: Prof. 謝平城
Advisor: Assoc. Prof. 詹勳全
Student: 温祐霆
Student ID: 7102042012
2016/4/10
Analysis of topographic and vegetative factors with data mining for landslide verification
Ecological Engineering, 2013
Fuan Tsai (a, b, *), Jhe-Syuan Lai (b), Walter W. Chen (c), Tang-Huang Lin (a)
(a) Center for Space and Remote Sensing Research, National Central University, Zhongli, Taoyuan 320, Taiwan
(b) Department of Civil Engineering, National Central University, Zhongli, Taoyuan 320, Taiwan
(c) Department of Civil Engineering, National Taipei University of Technology, Taipei 10608, Taiwan
Contents
1 Introduction
2 Landslide analysis and data mining
3 Materials and methods
4 Results
5 Discussion
6 Conclusions
Introduction
Taiwan is located in East Asia, where the Eurasian and Philippine Sea plates collide.
Taiwan also lies on the path of Western Pacific tropical cyclones (typhoons).
These geological and climatic conditions, together with the dense population, make Taiwan one of the countries most vulnerable to natural disasters, as listed by the World Bank (Dilley et al., 2005).
Among the natural disasters in Taiwan, landslides are commonly triggered by earthquakes and heavy rainfall, especially in the mountainous regions.
For example, the Chi–Chi earthquake in 1999 caused numerous landslides in central Taiwan (Lin et al., 2006; Lo et al., 2010), and Typhoon Morakot in 2009 induced catastrophic landslides and debris flows in southern Taiwan (Tsai et al., 2010).
These types of natural hazards often result not only in serious damage to property and infrastructure but also in human casualties.
Therefore, landslide analysis and assessment have become an important issue in hazard mitigation and prevention in Taiwan.
To better understand the relationship between landslides and various topographic and vegetative factors, this study used data mining techniques to analyze these factors against landslide events collected in the Shimen reservoir watershed in northern Taiwan.
Landslide analysis and data mining
Quantitative analysis
Deterministic method
1. Based on physical laws
2. Suitable for small and relatively homogeneous regions
Heuristic method
1. Ranks and weights the causative factors of landslides
2. The processing is usually subjective
Statistical method
1. Assumes that landslides will occur under conditions similar to those of past and present instability
Spatial technologies and data have been used intensively to investigate and monitor natural hazards, including landslides.
Data mining (DM) is an important and effective technique in the field of knowledge discovery that can extract knowledge from complicated data, databases, or data warehouses.
A few landslide-related analysis methods have integrated DM algorithms in different forms, including Decision Tree, Bayesian Network, artificial neural network, and object-oriented methods.
The Decision Tree (DT) algorithm is a classical, universal, and comprehensible method.
The Bayesian Network (BN) has also proven to be an effective data mining approach for landslide-related assessment.
This study integrates data mining techniques and spatial analysis to analyze topographic and vegetative factors of landslides from collected spatial data sets and landslide inventories in order to construct landslide factor models.
Materials and methods
Study area
• Area: 763.4 km²
• Precipitation: 2500 mm/year, falling between May and October every year
• Elevation: 250 to 3500 m
• Slope > 55%: 60% of the area
• 30% < slope < 55%: 29% of the area
Materials and data preprocessing
According to the long-term monitoring project of the study site (Tsai and Chen, 2007), the few earthquakes that occurred did not cause significant landslides in the Shimen reservoir watershed.
In the data-driven landslide analysis system proposed in this paper, all landslides in the study site are therefore assumed to be triggered by heavy rainfall.
In addition, this study does not distinguish between different types of landslides.
Original data       Factor
DEM                 Elevation, slope, aspect, curvature
Satellite images    NDVI
Stream map          Distance to river
Road map            Distance to road
Fault map           Distance to fault
Geology map         Geology
Soil map            Soil
Land-cover map      Land use
Resolution/scale: DEM 40 m × 40 m, resampled to 10 m × 10 m; satellite images 10 m × 10 m; maps at scales of 1/50,000 (two maps), 1/25,000 (one map), and 1/5,000 (three maps).
The landslide inventory consists of landslide extents identified with satellite remote sensing and spatial analysis.
Most of the landslides are small to medium in size.
Typhoon Aere triggered a few large-scale landslides in the southwest region of the watershed in 2004.
Data mining analysis
An error matrix is used to calculate four indexes:
1. Overall Accuracy (OA)
2. Producer's Accuracy (PA)
3. User's Accuracy (UA)
4. Kappa coefficient
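The four indexes can be derived from an error matrix roughly as follows. This is a minimal sketch, not the procedure used in the paper: the function name `accuracy_indexes`, the matrix layout (rows are reference classes, columns are classified classes), and the sample counts are all assumptions made for illustration.

```python
def accuracy_indexes(matrix):
    """Compute OA, PA, UA and Kappa from a square error (confusion) matrix.

    Assumed layout: matrix[i][j] = number of samples whose reference
    (ground truth) class is i and whose classified class is j.
    """
    k = len(matrix)
    total = sum(sum(row) for row in matrix)
    diagonal = sum(matrix[i][i] for i in range(k))
    row_sums = [sum(matrix[i]) for i in range(k)]                       # reference totals
    col_sums = [sum(matrix[i][j] for i in range(k)) for j in range(k)]  # classified totals

    overall = diagonal / total
    producers = [matrix[i][i] / row_sums[i] for i in range(k)]
    users = [matrix[j][j] / col_sums[j] for j in range(k)]
    # Chance agreement expected from the marginal totals.
    expected = sum(row_sums[i] * col_sums[i] for i in range(k)) / total ** 2
    kappa = (overall - expected) / (1 - expected)
    return overall, producers, users, kappa


# Hypothetical 2-class example (landslide, non-landslide); the counts are made up.
example = [[40, 10],
           [5, 45]]
oa, pa, ua, kappa = accuracy_indexes(example)
```

Kappa corrects the overall accuracy for the agreement expected by chance, which is why it is reported alongside OA rather than in place of it.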
Data mining kernel computing
This study employs two algorithms for the kernel computation of data mining: Decision Tree (DT) and Bayesian Network (BN).
Decision Tree (DT)
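◎ Suppose there are 16 customer records: 4 customers have bought a notebook computer (NB) and 12 have not.
◎ Split the 16 customers into 2 groups:
1. Age under 30: 1 bought an NB, 5 did not.
2. Age 30 or over: 3 bought an NB, 7 did not.
I(p,n) = I(4,12) = 0.8113
E(age) = (6/16) I(1,5) + (10/16) I(3,7) = 0.7946
Gain(age) = 0.8113 - 0.7946 = 0.0167
Compute the information gain for each of the three attributes (age, marital status, income); the attribute with the largest information gain becomes the first splitting criterion.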
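The arithmetic of the age split can be reproduced with a few lines of Python. This is only a sketch of the entropy and information-gain formulas used in the example; the function names `info` and `gain` are chosen here for illustration.

```python
from math import log2


def info(p, n):
    """Entropy I(p, n) of a set with p positive and n negative records."""
    total = p + n
    result = 0.0
    for count in (p, n):
        if count:  # treat 0 * log2(0) as 0
            result -= (count / total) * log2(count / total)
    return result


def gain(p, n, groups):
    """Information gain of a split; groups is a list of (p_i, n_i) pairs."""
    total = p + n
    expected = sum((pi + ni) / total * info(pi, ni) for pi, ni in groups)
    return info(p, n) - expected


# Age split from the example: under 30 -> (1, 5), 30 or over -> (3, 7).
age_gain = gain(4, 12, [(1, 5), (3, 7)])
print(round(info(4, 12), 4), round(age_gain, 4))
```

Running the same `gain` call with the marital status and income groups would give the other two values needed to pick the first split.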
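◎ There are seven attribute-value pairs: age under 30, age 30 or over, marital status single, marital status married, income low, income medium, and income high.
1. Compute the information gain of these seven attribute-value pairs:
PRISM_Gain(marital status = single) = log2(3/7) = -1.222
2. Compute the information gain of the other attribute-value pairs in the same way:
PRISM_Gain(age >= 30) = log2(3/6) = -1
PRISM_Gain(income = high) = log2(1/1) = 0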
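The PRISM measure in this example is the base-2 logarithm of a ratio. The sketch below assumes the ratio is p/t, where p is the number of records covered by the attribute-value pair that belong to the target class (bought an NB) and t is the total number of records the pair covers; that reading, and the function name `prism_gain`, are assumptions made for illustration.

```python
from math import log2


def prism_gain(positives, covered):
    """log2(p / t): p = covered records in the target class, t = all covered records."""
    return log2(positives / covered)


# Ratios taken from the example; their interpretation as buyers / covered
# records is an assumption.
single = prism_gain(3, 7)   # marital status = single
high = prism_gain(1, 1)     # income = high

# The pair with the largest value is the preferred rule condition.
best = max([("marital status = single", single), ("income = high", high)],
           key=lambda pair: pair[1])
```

A value of 0 means every covered record belongs to the target class, so no value can exceed it.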
Bayesian Network (BN)
A Bayesian Network is a directed acyclic graph (DAG) consisting of nodes and connectors.
Nodes represent the independent variables or conditional attributes.
End-nodes are the dependent variables or decision attributes.
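The structural part of this definition can be illustrated with a small graph check. The sketch below is not the network built in the study: the node names and edges are hypothetical, and it only verifies that the graph is acyclic and finds the end-nodes. It relies on `graphlib` from the Python standard library (Python 3.9 or later).

```python
from graphlib import TopologicalSorter, CycleError

# Hypothetical structure: each key is a node, each value the set of its parents.
parents = {
    "slope": set(),
    "ndvi": set(),
    "distance_to_river": set(),
    "landslide": {"slope", "ndvi", "distance_to_river"},
}


def is_dag(parent_map):
    """Return True when the directed graph has no cycle."""
    try:
        list(TopologicalSorter(parent_map).static_order())
        return True
    except CycleError:
        return False


def end_nodes(parent_map):
    """Nodes that are not a parent of any other node (the decision attributes)."""
    all_parents = set().union(*parent_map.values())
    return sorted(node for node in parent_map if node not in all_parents)
```

In this toy graph the only end-node is `landslide`, which plays the role of the decision attribute; a full Bayesian Network would also attach a conditional probability table to every node.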
Factor analysis and uncertainty filtering
It is necessary to analyze the significance of different landslide factors in the detection, check, and prediction phases in order to better understand their impacts.
Every condition attribute with continuous data has a standard deviation, σ.
The mean and standard deviation of each attribute are calculated.
Rules are kept when they fall within plus or minus n standard deviations of the mean.
After an empirical analysis, 5σ was selected as the threshold to filter out data uncertainties in this study.
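The filtering rule can be sketched as a mean ± nσ window applied to one continuous attribute. This is an illustration only: the function name `filter_within_sigma`, the use of the population standard deviation, and the sample values are assumptions, and the study's own mechanism may apply the window differently.

```python
from statistics import mean, pstdev


def filter_within_sigma(values, n=5):
    """Keep the values lying within mean ± n * standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation (an assumption)
    return [v for v in values if abs(v - mu) <= n * sigma]


# Hypothetical attribute values with one extreme record.
values = [1, 2, 3, 4, 100]
kept_1sigma = filter_within_sigma(values, n=1)  # narrow window drops the extreme value
kept_5sigma = filter_within_sigma(values)       # default 5-sigma window keeps everything here
```

A wide window such as 5σ removes only very extreme records, so it has a visible effect only on data sets much larger than this toy list.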
Susceptibility assessment
The landslide factor models constructed from data mining analysis can be used as the basis for landslide susceptibility assessment.
The constructed landslide factor models are applied to the prediction dataset to obtain probability values.
The resultant susceptibility regions are categorized into three different levels: very high (>85%), high (70–85%), and medium to high (50–70%).
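The three levels amount to a threshold lookup on the probability value. In the sketch below, the function name `susceptibility_level`, the handling of the exact boundary values (70% and 85% fall into "high"), and the `None` result below 50% are assumptions, since only the ranges themselves are given.

```python
def susceptibility_level(probability):
    """Map a model probability in [0, 1] to a susceptibility level, or None below 50%."""
    if probability > 0.85:
        return "very high"
    if probability >= 0.70:
        return "high"
    if probability >= 0.50:
        return "medium to high"
    return None


# Hypothetical probability values for five map cells.
levels = [susceptibility_level(p) for p in (0.92, 0.85, 0.70, 0.55, 0.30)]
```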
Results
Both the training and check data were generated
from landslide inventories from 2004 to 2007.
The constructed models were then applied to
analyze the 2008 data set for potential landslide
assessment (prediction).
[Figure legend: red = prediction; yellow = ground truth]
[Figure: results before and after filtering out data uncertainties, with annotated increases of 29% and 20%]
Discussion
[Figures: factor significance analysis, and factor significance analysis after uncertainty filtering]
[Figure legend: red = prediction; yellow = ground truth]
Conclusions
1. Data mining and spatial analysis were used to analyze topographic and vegetative factors of landslides in the Shimen reservoir watershed.
2. Decision Tree and Bayesian Network data mining algorithms were used for landslide detection to verify the effectiveness of the constructed models.
3. To reduce data uncertainties, a statistics-based mechanism was developed to filter them out.
4. After filtering out data uncertainties, the accuracy increased, and the Kappa coefficients for the DT and BN analyses increased by 29% and 20%, respectively.
5. NDVI, land use, distance to fault, and distance to river are the most significant latent factors of landslides in the study site.
6. The Bayesian Network data mining approach produced better results in landslide detection and prediction in this study.