
Short Course on Risk Management in Water Resources
Held at DCE, BUET on 13-14 March, 2010
Risk based Information and
dissemination system
Dr. A.K.M. Saiful Islam
Institute of Water and Flood Management (IWFM)
Bangladesh University of Engineering and Technology (BUET)
Outline

What is an Information System, and what are the benefits
of using one in Risk Management

Database Management System

Geographic Information System

Data Mining and Knowledge discovery

Web based Information System
What is an Information System?

An information system can be defined technically as
a set of interrelated components that collect (or
retrieve), process, store, and distribute information to
support decision making and control in an
organization.

Information technology is a contemporary term that
describes the combination of computer technology
(hardware and software) with telecommunications
technology (data, image, and voice networks).
Information System Diagram
Environment
Storage
Input
Processing
Feedback
Output
Importance of information
system in Risk management

Information Systems are used in almost
every aspect of Business today.

To understand Business we must understand
its Information Systems.

They are rapidly becoming part of our
everyday life.
Legacy of Information Systems used for
Decision Making






Transaction Processing Systems (TPS)
Management Information Systems (MIS)
Decision Support Systems (DSS)
Expert Systems (ES)
Executive Information Systems (EIS)
Geographic Information System (GIS)
Geographic Information System
(GIS)
An Information System that is used to input,
store, retrieve, manipulate, analyze and output
geographically referenced data or geospatial
data, in order to support decision making for
planning and management of land use, natural
resources, environment, transportation, urban
facilities, and other administrative records.
GIS overlay of Spatial data

Two different object layers can be overlaid
to produce a new layer.
An Application of GIS and Remote
Sensing for Estimation of Potato
Yield using remote sensing data
Dr. Sujit Kumar Bala
Dr. Saiful Islam
Study Area and
location of Field Control points
Vegetation indices

Normalized Difference Vegetation Index (NDVI):
Healthy plants have high NDVI values because of their high reflectance of
infrared light and relatively low reflectance of red light.
$$\mathrm{NDVI} = \frac{\mathrm{NIR} - \mathrm{Red}}{\mathrm{NIR} + \mathrm{Red}}$$
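As a minimal sketch of the index, the function below computes NDVI for a single pixel from its near-infrared and red reflectances. The reflectance values are hypothetical illustrations, not data from the study, and returning 0 for a pixel with no signal is an assumption.

```python
def ndvi(nir, red):
    """Return NDVI = (NIR - Red) / (NIR + Red) for one pixel's reflectances."""
    total = nir + red
    if total == 0:
        return 0.0  # assumption: treat a pixel with no signal as NDVI 0
    return (nir - red) / total


# Hypothetical reflectances (illustrative values, not from the study)
healthy = ndvi(nir=0.50, red=0.08)   # strong infrared, weak red reflectance
sparse = ndvi(nir=0.30, red=0.20)    # weaker contrast between the two bands
print(round(healthy, 2), round(sparse, 2))
```

The larger the gap between infrared and red reflectance, the closer the index gets to 1.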
MODIS Satellite Image
processing
MODIS 10 x 10 degree tile grid covering the whole world
Bangladesh tile is at h=26, v=06
Raw image of LAI as an HDF-5 file
MODIS Image of LAI
LAI over Bangladesh
LAI over Munshigonj district
LAI Plot using 50 field data points
Criteria of classification

Agricultural Area:
Δ LAI > 0.5

Non-agricultural Area:
Δ LAI < 0.5

Potato:
0.5 < NDVI < 0.9
0.5 < LAI < 2.5
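The criteria can be sketched as a small decision function, reading the potato thresholds as the ranges 0.5 to 0.9 for NDVI and 0.5 to 2.5 for LAI. Two points are assumptions made only for this illustration: that the potato test is applied to agricultural pixels only, and that a ΔLAI of exactly 0.5 counts as non-agricultural (the slide does not say). The function name and class labels are hypothetical.

```python
def classify_pixel(delta_lai, ndvi, lai):
    """Classify one pixel using the thresholds listed on the slide."""
    # Assumption: a change in LAI of exactly 0.5 is treated as non-agricultural
    if delta_lai <= 0.5:
        return "non-agricultural"
    # Assumption: the potato ranges are checked only for agricultural pixels
    if 0.5 < ndvi < 0.9 and 0.5 < lai < 2.5:
        return "potato"
    return "other agricultural"


print(classify_pixel(delta_lai=0.3, ndvi=0.7, lai=1.5))
print(classify_pixel(delta_lai=0.8, ndvi=0.7, lai=1.5))
print(classify_pixel(delta_lai=0.8, ndvi=0.95, lai=1.5))
```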
Phenological Curve based on NDVI
Life Matrix of Potato
Index | NDVI | LAI | Definition | Metric
OnT | 16 days | 17 days | Intersection of forward lag and smooth curve | Starting date of VI high period
OnV | 0.45 | 0.91 | Value of VI at forwards intersection | VI at start of high period
EndT | 88 days | 92 days | Intersection of backwards lag and smooth curve | End date of VI high period
EndV | 0.52 | 1.02 | Value of VI at backwards intersection | VI at end of high period
MaxT | 63 days | 66 days | Time of maximum raw corrected VI | Date of maximum VI
MaxV | 0.85 | 2.15 | Maximum value of corrected raw VI | Maximum VI
DurT | 72 days | 75 days | Time from forwards to backwards intersections | Length of VI high period
RanV | 0.40 | 1.24 | Difference between minimum and maximum value of smooth curve | Amplitude of season
RIN | 0.01 | 0.025 | Slope of line from forwards intersection to raw maximum | Rate of VI increase
RDN | 0.02 | 0.048 | Slope of line from raw maximum value to backwards intersection | Rate of VI decrease
TINDVI | 20 days | 63 days | Integrated area under smooth VI curve | "Magnitude" of season
DurNT | 25 days | 26 days | Time from backwards to forwards intersection | Length of VI low period
RRINDN | 0.50 | 0.52 | Rate of increase/rate of decrease | "Quality" of season
Life Matrix (Contd..)
Index | NDVI | LAI | Definition | Metric
RRINDN | 0.50 | 0.52 | Rate of increase/rate of decrease | "Quality" of season
HRanTO | 30 days | 29 days | Time of half range value at onset: equals OnV+(RanV/2) when rising | Start of active growing season
HRanVO | 0.65 | 1.52 | Half range value at onset: OnV+(RanV/2) | VI at start of active growing season
HRanTE | 80 days | 86 days | Time of half range value at end: equals EndV+(RanV/2) when falling | End of active growing season
HRanVE | 0.65 | 1.33 | Half range value at end: EndV+(RanV/2) | VI at end of active growing season
HDurT | 44 days | 56 days | Duration of period from HRanTO to HRanTE | Duration of active growing season
SMMaxT | 56 days | 56 days | Time of maximum smooth VI curve | Date of peak of season
SMMaxV | 0.79 | 1.93 | Maximum value of smooth VI curve | Value at peak of season
SMMinT | 96 days | 94 days | Time of minimum smooth VI curve | Date of season minimum
SMMinV | 0.33 | 0.74 | Minimum value of smooth VI curve | Value of season minimum
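Several Life Matrix entries follow arithmetically from others. The sketch below recomputes three of them for the NDVI column, assuming DurT = EndT - OnT, RIN = (MaxV - OnV) / (MaxT - OnT) and RRINDN = RIN / RDN, which is how the definitions read; the variable names are illustrative only.

```python
# NDVI values taken from the Life Matrix above
on_t, end_t, max_t = 16, 88, 63        # days
on_v, max_v = 0.45, 0.85               # VI at onset and raw maximum
rin_listed, rdn_listed = 0.01, 0.02    # rates as listed in the matrix

dur_t = end_t - on_t                       # length of the VI high period
rin = (max_v - on_v) / (max_t - on_t)      # slope from onset to raw maximum
rrindn = rin_listed / rdn_listed           # rate of increase / rate of decrease

print(dur_t, round(rin, 2), round(rrindn, 2))
```

Under these assumptions the results agree with the listed DurT, RIN and RRINDN to the precision shown in the matrix.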
Changes of mean values of indices in the study area of
Munshigonj District with days after plantation of potato
Days | 3 | 11 | 19 | 27 | 32 | 40 | 48 | 56 | 64 | 72 | 80 | 88 | 96
NDVI | 0.50 | 0.50 | 0.53 | 0.58 | 0.61 | 0.70 | 0.70 | - | 0.73 | 0.72 | 0.69 | 0.63 | 0.53
FPAR | 0.51 | 0.49 | 0.54 | 0.57 | 0.60 | 0.68 | 0.67 | 0.69 | 0.70 | 0.69 | 0.69 | 0.66 | 0.57
LAI | 0.88 | 0.79 | 0.98 | 1.09 | 1.17 | 1.48 | 1.42 | 1.53 | 1.58 | 1.56 | 1.50 | 1.41 | 1.08
[Figure: Chronological plot of Vegetation Indices (LAI, NDVI, fPAR) against days after plantation]
Spatial Distribution of NDVI
Yield and NDVImax Correlation
Based on Upazilla data: y = 38.373x + 4.2526, R2 = 0.793 (y: Yield (Ton/ha); x: NDVI)
Based on field data: y = 8.7507x - 4.041, R2 = 0.659 (y: Yield (t/ha)*10, as labelled on the plot; x: NDVI)
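A minimal sketch of how the Upazilla-level regression could be applied: the function evaluates y = 38.373x + 4.2526 for a given maximum NDVI. The function name and the sample NDVI values are hypothetical, and the fit should only be trusted within the NDVI range covered by the original data.

```python
def yield_from_ndvi_upazilla(ndvi_max):
    """Estimate yield (ton/ha) from the Upazilla-level fit y = 38.373x + 4.2526."""
    return 38.373 * ndvi_max + 4.2526


# Hypothetical NDVImax values used only to illustrate the calculation
for ndvi_max in (0.5, 0.6, 0.7):
    print(ndvi_max, round(yield_from_ndvi_upazilla(ndvi_max), 2))
```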
Effect of Temperature on NDVI
Maximum rate of growth occurred at lowest temperature
Temperature distribution map
using MODIS Thermal data
Using Satellite Data and GIS to
Investigate Drought
Mr. Hasan Murad
Dr. Saiful Islam
Drought

According to McMahon and Diaz Arenas (1982), “Drought
is a period of abnormally dry weather sufficiently
prolonged for the lack of precipitation to cause
a serious hydrological imbalance and carries
connotations of a moisture deficiency with respect to
man’s usage of water.”
Study Area
Map of the
North-West region
Study
Area
SPI Calculation

Mathematically, SPI is calculated based on the equation:
$$\mathrm{SPI} = \frac{X_i - X_m}{\sigma}$$
where $X_i$ is the monthly rainfall record of the station, $X_m$ is
the rainfall mean, and $\sigma$ is the standard deviation.
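With Xi, Xm and σ as defined above, the standardization (Xi - Xm) / σ can be sketched as follows. The rainfall values are hypothetical, the use of the population standard deviation is an assumption, and operational SPI software may fit a probability distribution to the rainfall record rather than use this direct z-score.

```python
from statistics import mean, pstdev


def spi_simple(rainfall):
    """Standardize each rainfall value as (Xi - Xm) / sigma."""
    xm = mean(rainfall)
    sigma = pstdev(rainfall)  # assumption: population standard deviation
    if sigma == 0:
        raise ValueError("rainfall series has zero variance")
    return [(xi - xm) / sigma for xi in rainfall]


# Hypothetical August rainfall totals (mm) for one station over nine years
august = [310, 280, 350, 295, 400, 330, 150, 360, 315]
scores = spi_simple(august)
driest = august[scores.index(min(scores))]
print([round(s, 2) for s in scores], driest)
```

Strongly negative values flag unusually dry months relative to the station's own record.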

Monthly rainfall data from 2000 to 2008 from 6 rainfall
stations are used as input to the SPI program.
http://drought.unl.edu/monitor/spi/program/spi_program.htm#program
Meteorological Drought: 3-month interpolated SPI
for 2006 (dry year): August, September, October
Drought Using NDVI
Anomaly: Rainfall and NDVI for the
year 2006
Final Task needed: Classification
of Drought Risks
Database
Management System
WFM 6103: hydrologic Information System © Dr. Akm Saiful Islam
Benefits of Using A Database

Data Integrity


Consistent entries
Data validation rules

Ease of data entry with forms

Minimize duplicate data entry

Easy Reporting
Database Management System
• Information Systems process and manage data.

Data Management involves “Capturing”,
“Retrieval,” and “Storage” of data.

Today’s DBMSs are based on sophisticated
software and powerful computer hardware.

Well-known DBMS software includes Oracle,
Microsoft SQL Server, Sybase and MySQL (free
download), among others.
Data Models
1. Hierarchical
2. Network
3. Relational
4. Object oriented
Relational Model
Based on two important concepts:

Key of relation: one to one, one to many, or many to many

Primary attribute: a value that cannot be duplicated

Student Table * ---- * Course Table (many-to-many relationship)
Student table
ID | Name | CourseID
1 | Mr. X | 001
2 | Mr. X | 002
3 | Mr. Y | 003
Course table
CourseID | Title | Credit
001 | RS & GIS in WM | 3
002 | Watershed Hydrology | 3
003 | Risk Management | 3
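The two tables above can be reproduced in a relational database and linked through CourseID. The following is a minimal sketch using Python's built-in sqlite3 module with an in-memory database; the lowercase table names and column types are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE course (CourseID TEXT PRIMARY KEY, Title TEXT, Credit INTEGER);
CREATE TABLE student (ID INTEGER PRIMARY KEY, Name TEXT,
                      CourseID TEXT REFERENCES course(CourseID));
INSERT INTO course VALUES ('001', 'RS & GIS in WM', 3),
                          ('002', 'Watershed Hydrology', 3),
                          ('003', 'Risk Management', 3);
INSERT INTO student VALUES (1, 'Mr. X', '001'),
                           (2, 'Mr. X', '002'),
                           (3, 'Mr. Y', '003');
""")

# Join each student row to the course it refers to
rows = conn.execute("""
    SELECT s.Name, c.Title
    FROM student s
    JOIN course c ON c.CourseID = s.CourseID
    ORDER BY s.ID
""").fetchall()
print(rows)
```

The join returns one row per student record, showing that Mr. X is linked to two courses.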
Relational DM: Terminology




A collection of data entities is typically known as a file.
An individual data entity is typically known as a record.
Different attributes of a record are typically known as fields.
A key is a field or a set of fields that uniquely identifies a record.
[Figure: example table annotated with File (table), Fields, Records and Key; the key is Product Category + Product Type + Year]
A Word on Keys




A key may be:
A field or set of fields that are used to identify the record.
A primary key is a minimal set of fields that uniquely identifies the
record.
A foreign key is a field that is a primary key in another relation.
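To illustrate how a database can enforce these definitions, the sketch below uses Python's sqlite3 module. The `enrolment` table and the helper `rejected` are hypothetical names introduced for this example; note that SQLite checks foreign keys only after `PRAGMA foreign_keys = ON`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.execute("CREATE TABLE course (CourseID TEXT PRIMARY KEY, Title TEXT)")
conn.execute("""CREATE TABLE enrolment (
    StudentID INTEGER,
    CourseID TEXT REFERENCES course(CourseID),   -- foreign key
    PRIMARY KEY (StudentID, CourseID))           -- composite primary key
""")
conn.execute("INSERT INTO course VALUES ('003', 'Risk Management')")
conn.execute("INSERT INTO enrolment VALUES (1, '003')")


def rejected(sql):
    """Return True if the database refuses the statement on integrity grounds."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True


duplicate_key = rejected("INSERT INTO enrolment VALUES (1, '003')")  # repeats the primary key
dangling_ref = rejected("INSERT INTO enrolment VALUES (2, '999')")   # no such course
print(duplicate_key, dangling_ref)
```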
Primary Key
Foreign Keys
Multidimensional Database
In the multidimensional view,
two dimensions are viewed at
one time and the others are
available to page through.
Finding and analysing large
numbers of records is possible.
Car colour can easily be placed, as a page, along with years. Then
page, row and column dimensions can be exchanged.
Types of Databases
Operational Databases
Because of their efficient storage and speed of small additions or
updates, E-R / Relational DBs are used (mostly) for transaction
processing and we refer to these systems as Operational Systems,
or Operational Databases. Sometimes the terms “Transaction” and
“Production” are used as well.
Analytical Databases
Data that is used for decision making purposes is typically
stored in a different form than operational data. Analytical
Databases, or Analytical Systems, store data in a way that
allows for long and complicated interactions with relatively few
users.
What is a Data Warehouse?

A Data Warehouse (DW) is an IS designed to
support analytical tasks.

Integrates information from a variety of sources,
and/or applications.

Supports (relatively) few users with long
interactions

Data in a data warehouse cannot be changed!
A Data Warehouse consists of:




A Large Physical Database: This is the actual “warehouse.” It
includes the data, as well as metadata (data about the organisation
of the data in the Data Warehouse), and the processing logic used
to process the data.
The Logical Data Warehouse: This contains all the metadata,
business rules, and processing logic, as well as the information
required to access the actual data. (same as the Data Warehouse
model)
Data Marts: These are subsets of the data warehouse, used for
functional, departmental, or regional purposes. Data Marts are built
gradually, and are connected via the logical data warehouse.
DSS and EIS: These are NOT part of the data warehouse, but they
are applications that use the data warehouse.
A Company's Systems
[Figure: Operational systems (Transactions) and Analytical systems (Decisions), with Distribution, Human Resources and Sales Data Marts]
Example
Corporate Data Model:
CUSTOMER_INVOICE (Invoice_ID, Invoice_Date, Customer_ID, Customer_Address, Description, Message, Status)
Data Warehouse Data Model:
CUSTOMER_INVOICE (Invoice_ID, Invoice_Date, Customer_ID, Customer_Address)
Description, Message and Status are left out: data unlikely to be used for DSS.
Adding an Element of Time
• Data may change over time, but Operational Systems
do not always “record” such change.
• The time element is added only if it does not exist.
Corporate Data Model:
CUSTOMER (Customer_ID, Name, Birth_Date, Marital_Status, Credit_Rating)
Data Warehouse Data Model:
CUSTOMER (Customer_ID, Snapshot_Date, Name, Birth_Date, Marital_Status, Credit_Rating)
Example of Derived Data
Corporate Data Model:
INVOICE (Invoice_ID, Product_ID, Product_Code, Quantity, Unit_Price)
Data Warehouse Data Model:
INVOICE (Invoice_ID, Product_ID, Product_Code, Quantity, Unit_Price, Total_Amount, Product_Cost)
Derived data: Total_Amount, Product_Cost
Data Marts




A Data Mart is also a “Data Warehouse,” but usually for a
single “subject” area.
It is common to model this single subject area using a star-schema
design. (A data mart may have more than one star schema.)
A star schema usually consists of a “central” table, called the
fact table, and a set of satellite tables, known as dimensions,
or dimension tables.
The fact table has multiple joins which connect it to the
dimensions. Dimension tables have a single join which
connects them to the fact table.
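A star schema as described above can be sketched with one fact table and two dimension tables. All table names, column names and values below are hypothetical, chosen only to show how the fact table joins to each dimension.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Dimension tables: each joins to the fact table through a single key
CREATE TABLE dim_station (station_id INTEGER PRIMARY KEY, district TEXT);
CREATE TABLE dim_time (time_id INTEGER PRIMARY KEY, year INTEGER, month INTEGER);
-- Fact table: one join per dimension, plus the measured value
CREATE TABLE fact_rainfall (
    station_id INTEGER REFERENCES dim_station(station_id),
    time_id INTEGER REFERENCES dim_time(time_id),
    rainfall_mm REAL);
INSERT INTO dim_station VALUES (1, 'District A'), (2, 'District B');
INSERT INTO dim_time VALUES (1, 2006, 8), (2, 2006, 9);
INSERT INTO fact_rainfall VALUES (1, 1, 150.0), (1, 2, 120.0),
                                 (2, 1, 200.0), (2, 2, 180.0);
""")

# Analytical query: total rainfall per district and year
totals = conn.execute("""
    SELECT s.district, t.year, SUM(f.rainfall_mm)
    FROM fact_rainfall f
    JOIN dim_station s ON s.station_id = f.station_id
    JOIN dim_time t ON t.time_id = f.time_id
    GROUP BY s.district, t.year
    ORDER BY s.district
""").fetchall()
print(totals)
```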
On-Line Analytical Processing (OLAP)

Codd’s OLAP was an ambitious idea: the OLAP concept
tried to introduce a new range of analytical systems that
embrace the enterprise.

Nowadays OLAP is used mainly by end-users, and usually
sits on top of a Data Mart or a Data Warehouse.

OLAP technology allows end-users to interact and perform
(at least) basic analysis of the data.

OLAP is most popular as a (data) visualisation tool.
Multidimensional Tables
Multidimensional
Tables are also known
as hypercubes, or
datacubes.
The datacube to the
right was generated
by TM/1 Perspectives,
in Excel.
The dialog box shows
the dimensions that
define this table.
Drill-Down
Quarter 1 has been drilled down to the individual months.
Roll-up
ROLL-UP: The quarters have been rolled up from the individual
quarters to the year.
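Drill-down and roll-up are aggregation at different levels of the time dimension. The sketch below uses hypothetical monthly figures to roll months up to quarters and quarters up to the year, then drills Quarter 1 back down to its months.

```python
from collections import defaultdict

# Hypothetical monthly figures for one year (month number -> value)
monthly = {1: 10, 2: 12, 3: 8, 4: 15, 5: 11, 6: 9,
           7: 14, 8: 13, 9: 7, 10: 16, 11: 12, 12: 10}


def quarter_of(month):
    """Map a month number (1-12) to its quarter (1-4)."""
    return (month - 1) // 3 + 1


# Roll-up: months -> quarters -> year
quarterly = defaultdict(int)
for month, value in monthly.items():
    quarterly[quarter_of(month)] += value
yearly = sum(quarterly.values())

# Drill-down: Quarter 1 back to its individual months
q1_drilldown = {m: v for m, v in monthly.items() if quarter_of(m) == 1}
print(dict(quarterly), yearly, q1_drilldown)
```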
Example of Downloading Historical Climate Data: Canadian
Climate Data
www.cccma.bc.ec.gc.ca/hccd/index.shtml
• Limited Number of Stations
• Requires Registration
3. Selecting a Site
By province
By Province or Station Name
Or by Proximity
By City (limited)
Location and Elevation
Similar
Data Mining or KDD
Data Mining

To fully complete the picture, at least in terms of methods
and techniques used for extracting knowledge from data,
there should be some discussion of Data Mining, or
Knowledge Discovery in Databases (KDD).
“Knowledge Discovery in Databases is the
non-trivial process of identifying valid, novel,
potentially useful, and ultimately
understandable patterns in data.”
(Fayyad et al. 1996).
Mining and KDD

Data Mining and Knowledge Discovery in Databases are
often used synonymously.

With respect to the KDD process:
• Data Analysts, Statisticians and MIS people tend to use the
term Data Mining.
• Artificial Intelligence and Machine Learning researchers
tend to use the term Knowledge Discovery in Databases.

In the Artificial Intelligence and Machine Learning fields, the
term data mining refers to the step(s) of the KDD process,
when a particular algorithm or method is applied in order to
discover knowledge.
Disciplines behind knowledge
discovery

Machine Learning and Artificial Intelligence, with methods such
as Rule Induction, Case-based Reasoning, Neural Networks, and
Genetic Algorithms.

Uncertainty Methods, and in particular methods originating from
Statistical Science. These include Decision Trees, Bayesian
Methods, Fuzzy Logic, Clustering, as well as Classical Statistics
and Probability Theory.

Database Techniques: Namely Association Rules. Technologies
such as Data Warehousing, Data Marts, and OLAP are, in general
referred to as “enabling technologies” for data mining. That is, their
use is not of a primary role (KDD can be applied without these
technologies), but is of primary significance, since the use of such
technologies is beneficial for the application of KDD.
Neural Nets

Artificial Neural Networks, or simply Neural Networks are
used for classification and prediction. A mathematical
network is modelled with inputs and outputs to theoretical
neurons.

This structure resembles the human brain network of
neurons, and is used “to create a system that could solve
difficult problems and display behaviour that was much more
complex than the simple pieces that made it up” (Berson &
Smith 1997)
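A minimal sketch of the idea of simple units combined into a network: each theoretical neuron forms a weighted sum of its inputs and passes it through an activation function. The sigmoid activation, the layer sizes and all weights are assumptions for illustration; the network is untrained.

```python
import math


def sigmoid(x):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def neuron(inputs, weights, bias):
    """Weighted sum of the inputs plus a bias, passed through the activation."""
    return sigmoid(sum(i * w for i, w in zip(inputs, weights)) + bias)


def forward(inputs, hidden_layer, output_unit):
    """Feed the inputs through one hidden layer and a single output neuron."""
    hidden = [neuron(inputs, weights, bias) for weights, bias in hidden_layer]
    out_weights, out_bias = output_unit
    return neuron(hidden, out_weights, out_bias)


# Hypothetical, untrained weights: two inputs, two hidden neurons, one output
hidden_layer = [([0.5, -0.4], 0.1), ([-0.3, 0.8], 0.0)]
output_unit = ([1.2, -0.7], 0.05)
prediction = forward([0.6, 0.9], hidden_layer, output_unit)
print(prediction)
```

Training would adjust the weights so that predictions match observed outputs; that step is not shown here.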
ANN Layers
Example of using ANN for Flood
Forecasting of Bangladesh
Data and Methodology

Water Level Data
BWDB daily water level data of 10 major
stations on the three major rivers: Brahmaputra,
Ganges and Meghna

Remote sensing data
TRMM 3B42 data, with a spatial resolution of 0.25
degree (50°S to 50°N) and a temporal resolution of
3 hours
Figure-2: GBM Basin
Figure-3: GBM Basin grid of 0.5 deg resolution. The grid
consists of 1056 cells (22 rows x 48 columns)
Figure-4: Flow network of GBM Basin
Figure-5: Flow network and GBM Basin grid
Figure-6: Neural network
Figure-7: Neural network and GBM basin grid
Figure-8: Neural network and Sub-basins
Figure-9: Three hourly TRMM 3B42 Rainfall data with
0.25 degree spatial resolution
Measured Vs Predicted WL
Predicted Vs Observed
Web based
Information System
XML

An Extensible Markup Language (XML) document
describes the structure of data.

XML and HyperText Markup Language (HTML) have a
similar syntax … both derived from Standard
Generalized Markup Language (SGML).

HTML is a small SGML application used on the web (a DTD
and a set of processing conventions).

XML has no mechanism to specify the format for
presenting data to the user. An XML document resides in
its own file with an ‘.xml’ extension.
Converting Relational Database to
XML
Example: Export the following data into XML and group books by
store.
Relational Database:
Store (sid, name, phone)
Book (bid, title, authors)
StoreBook (sid, bid, price, stock)
[Figure: E-R diagram of Store (sid, name, phone) and Book (bid, title, authors) linked through StoreBook (price, stock)]
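One possible way to carry out this export, sketched with Python's standard xml.etree.ElementTree module. The rows are hypothetical, and the `stores` wrapper element is an assumption added so that the document has a single root.

```python
import xml.etree.ElementTree as ET

# Hypothetical rows for the three relations
stores = [(1, "City Books", "555-0100")]                       # (sid, name, phone)
books = {10: ("Hydrology Basics", "A. Author"),                # bid -> (title, authors)
         11: ("GIS Primer", "B. Writer")}
store_books = [(1, 10, 25.0, 4), (1, 11, 30.0, 2)]             # (sid, bid, price, stock)

root = ET.Element("stores")  # wrapper element (an assumption)
for sid, name, phone in stores:
    store = ET.SubElement(root, "store")
    ET.SubElement(store, "name").text = name
    ET.SubElement(store, "phone").text = phone
    # Group under this store every book that StoreBook links to it
    for link_sid, bid, price, stock in store_books:
        if link_sid != sid:
            continue
        title, authors = books[bid]
        book = ET.SubElement(store, "book")
        ET.SubElement(book, "title").text = title
        ET.SubElement(book, "authors").text = authors
        ET.SubElement(book, "price").text = str(price)

xml_text = ET.tostring(root, encoding="unicode")
print(xml_text)
```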
Converting Relational Database to
XML (Cont’d)

XML:
<store> <name> … </name>
<phone> … </phone>
<book> <title>… </title>
<authors> … </authors>
<price> … </price>
</book>
<book>…</book>
…
</store>
XML Example
<BOOKS>
  <book id="123" loc="library">
    <author>Hull</author>
    <title>California</title>
    <year> 1995 </year>
  </book>
  <article id="555" ref="123">
    <author>Su</author>
    <title> Purdue</title>
  </article>
</BOOKS>
[Figure: tree view of the BOOKS document. The book element (id 123, loc="library") has author Hull, title California and year 1995; the article element (id 555, ref 123) has author Su and title Purdue]
Some companion W3C recommendations

XML Schema: an XML-based alternative to DTD that is more
powerful and supports namespaces and data types

XPath: language for addressing parts of an XML
document, used by XSLT

Extensible Stylesheet Language (XSL): an XML vocabulary
for specifying formatting semantics

XSLT: language for transforming XML documents into
other XML documents
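As a small illustration of addressing parts of an XML document, the sketch below queries the BOOKS example with Python's xml.etree.ElementTree. That module supports only a limited subset of XPath, so this is not a full XPath or XSLT processor.

```python
import xml.etree.ElementTree as ET

doc = """<BOOKS>
  <book id="123" loc="library">
    <author>Hull</author><title>California</title><year> 1995 </year>
  </book>
  <article id="555" ref="123">
    <author>Su</author><title> Purdue</title>
  </article>
</BOOKS>"""

root = ET.fromstring(doc)

# All author elements anywhere in the document, in document order
authors = [a.text for a in root.findall(".//author")]

# Select elements by attribute value
book = root.find("./book[@id='123']")
article = root.find("./article[@ref='123']")

print(authors, book.find("title").text, article.find("title").text.strip())
```

The `ref` attribute of the article points at the `id` of the book, which is how the example links the two elements.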
Thank You